Sampling and Surveys
Section 4.1
The WHO of our data….
The population of interest is the entire group of
people/things that we wish to study.
Since we want to know
things about the population,
we need to figure out how
to gather that data!
I know! Let’s Sample All of them!
A census is a “sample” of the entire population.
Problems with a Census
So why don’t we always do a census?
- difficult or impractical to complete
- too complex in terms of time and budget
Taking a Survey
Often, we ask questions of a small group
(called a sample) in the hope of learning
something about the entire population…
These are called opinion polls or surveys
Population
Collect data from a
representative Sample...
Sample
Make an Inference
about the Population.
The Iowa Poll
About the poll
The Iowa Poll, conducted Dec. 8-11 for The Des
Moines Register by Selzer & Co. of Des Moines, is
based on interviews with 650 Iowans ages 18
or older. Interviewers contacted households with
randomly selected landline and cellphone
numbers. Responses were adjusted by age, sex
and congressional district to reflect the general
population based on recent census data.
Questions based on the sample of 650 Iowa adults
have a maximum margin of error of plus or
minus 3.8 percentage points. This means that if
this survey were repeated using the same questions
and the same methodology, 19 times out of 20, the
findings would not vary from the percentages
shown here by more than plus or minus 3.8
percentage points. Results based on smaller
samples of respondents — such as by gender or age
— have a larger margin of error.
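Where could the 3.8 come from? A minimal sketch, assuming the usual 95% ("19 times out of 20") formula for a sample proportion with the worst case p = 0.5. The pollster's actual calculation may differ (for example, because of the weighting adjustments), and the subgroup size of 325 is a made-up number for illustration.

```python
import math

def max_margin_of_error(n, z=1.96):
    """Largest 95% margin of error for a sample proportion (worst case p = 0.5)."""
    p = 0.5
    return z * math.sqrt(p * (1 - p) / n)

moe_all = max_margin_of_error(650)        # the full sample of 650 adults
moe_subgroup = max_margin_of_error(325)   # hypothetical subgroup, half the sample

print(f"n = 650: plus or minus {moe_all * 100:.1f} points")
print(f"n = 325: plus or minus {moe_subgroup * 100:.1f} points")
```

The smaller the group, the larger the margin of error, which is why results by gender or age are less precise.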
So…
Who was the sample?
Who was the population?
The ultimate question….
Does the sample used represent
the population accurately?
The Literary Digest Poll of 1936…
Alf Landon (R) versus Franklin Roosevelt (D)
Prediction: Alf Landon
in a Landslide!!
President Dewey…??
The Moral of the Story…
If your sample was chosen in a poor manner,
it doesn’t matter how many people you
surveyed, bad data will still produce bad
results.
Garbage in, Garbage out!
In order to draw valid conclusions, you need a
sample (no matter the size) that well represents
the population!
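To see "Garbage in, Garbage out" in action, here is a small simulation. Everything in it is made up for illustration (it is not the 1936 data): a population of 10,000 voters where 60% support candidate A, but only 35% of the people who are easy to reach do.

```python
import random

# Hypothetical population (made-up numbers). 1 = supports A, 0 = does not.
easy_to_reach = [1] * 1050 + [0] * 1950     # 3,000 people, 35% support A
everyone_else = [1] * 4950 + [0] * 2050     # 7,000 people, about 71% support A
population = easy_to_reach + everyone_else  # overall, 60% support A

true_proportion = sum(population) / len(population)

# Huge but biased sample: only the people who are easy to reach.
biased_estimate = sum(easy_to_reach) / len(easy_to_reach)

# Small simple random sample of 500 from the whole population.
rng = random.Random(1936)  # fixed seed only so the run is repeatable
srs = rng.sample(population, 500)
srs_estimate = sum(srs) / len(srs)

print(f"true proportion:          {true_proportion:.3f}")
print(f"biased sample (n = 3000): {biased_estimate:.3f}")
print(f"SRS (n = 500):            {srs_estimate:.3f}")
```

The sample of 3,000 misses badly, while the much smaller random sample lands close to the truth.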
Getting a Representative Sample…
• Making sure that, on average, the sample looks like
the rest of the population allows us to draw
conclusions based on our data.
A small sample, IF it
is chosen correctly,
can represent the
entire population!
In fact, this is the
basis for almost
all of statistics!
Parameters
vs.
Statistics
A Population Parameter is a value that describes the
entire population.
– This value is rarely known and typically unknowable due to
constant change and the difficulty in surveying an entire
population.
– Our goal is usually to estimate the parameter.
A Sample Statistic is a value that is found from the
sample data.
– We use the sample statistic to estimate the population
parameter.
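A quick sketch of the difference, using a made-up population of 100 lengths in millimeters (the values and the seed are assumptions for illustration only). In real life we rarely get to compute the parameter; here we can, because we invented the whole population.

```python
import random
from statistics import mean

rng = random.Random(4)  # fixed seed, chosen arbitrarily

# Hypothetical population: lengths (mm) of 100 objects.
population = [rng.randint(5, 45) for _ in range(100)]

# Parameter: describes the entire population (usually unknown in practice).
population_mean = mean(population)

# Statistic: computed from a sample and used to estimate the parameter.
sample = rng.sample(population, 5)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean}")
print(f"statistic (sample mean):     {sample_mean}")
```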
Now you try…
This summer, when I went to the grocery store, I kept track
of my receipts to help with budgeting. I wrote down how
much I spent at Hy-Vee during June and July. On average, I
spent $75.98.
• What is the population parameter I'm trying to estimate?
• If Hy-Vee took a sample of customers and checked their
receipts, what population parameter is Hy-Vee trying to
estimate?
So how do we gather Sample Statistics?
•Picking a sample at random protects us from
the influences of all the features of our
population, even ones that we may not have
thought of.
•Statistical sampling uses random chance, not
human choice!!
Jelly Blubbers!
Materials needed…
- Jelly Blubber colony
- Ruler
- Calculator (TAKE IT OUT!)
- Data Sheet
Jelly Blubbers…
What is our population of interest?
What is the Population Parameter?
Judgmental Sample
Select 5 Jelly Blubbers that, in your judgment, are
representative of the population of Jelly Blubbers.
On your data sheet, record the lengths of your five
Jelly Blubbers in millimeters and then find the
average. (The average of a sample is x̄, the average
of a population is μ.)
Simple Random Sample (SRS)
A Simple Random Sample (SRS) has
two requirements…
– every ‘person’ has an equal chance of being selected
and
– every combination of ‘people’ has an equal
chance of being selected.
The Hat Method!
How can I select an SRS?
1. Random Number Table
Assign a number to each person or object to be sampled, then use a
random number table to select a certain number of them. Use TABLE D in
the back of your book!
A row in Table D might look like this:
05007 16605 81194 14873 04197 85576 ……
How could you use this to randomly select 5 Jelly Blubbers?
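One possible approach, sketched in code. It assumes the 100 Jelly Blubbers are labeled 00 through 99 (the labeling scheme is an assumption, not something the slide specifies), reads the row two digits at a time, and skips any label that has already been picked.

```python
def pick_from_digit_row(digits, how_many, label_width=2):
    """Read fixed-width labels left to right, skipping repeats."""
    digits = digits.replace(" ", "")
    chosen = []
    for i in range(0, len(digits) - label_width + 1, label_width):
        label = digits[i:i + label_width]
        if label not in chosen:      # a repeated label is skipped
            chosen.append(label)
        if len(chosen) == how_many:
            break
    return chosen

row = "05007 16605 81194 14873 04197 85576"
print(pick_from_digit_row(row, 5))
```

With this row the labels picked are 05, 00, 71, 66, and 81. The second 05 is skipped because that Jelly Blubber is already in the sample.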
How can I select an SRS?
2. Random Numbers on the Calculator
Assign a number to each person or object to be sampled,
then use a random number generator on the calculator to
select a number of them.
randInt(min, max, # to select)
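The same idea off the calculator, as a rough sketch in Python. The function name and the seed are made up for this example. One difference to notice: a calculator's random integer command can give the same number twice, so repeats have to be re-drawn, while `random.sample` never repeats a label.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick `sample_size` distinct labels from 1..population_size."""
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), sample_size)

# Choose 5 of 100 numbered objects (seed fixed only for repeatability).
chosen = simple_random_sample(100, 5, seed=2024)
print(sorted(chosen))
```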
Simple Random Sample
Use your calculator to randomly choose 5 Jelly
Blubbers and measure their lengths in mm. Find
the average length (x̄) and record it.
•Advantages?
•Disadvantages?
Systematic Sample
•A systematic sample involves selecting every nth object.
•This is useful when you believe that the order of the list will
not affect the results of your survey.
•To get a systematic sample:
1) Randomly determine a starting place in the list.
2) From your starting place, sample every nth
object on the list.
Example: I randomly choose the 17th item in the list, and then
will choose every 10th item after that… #27, #37, #47, etc…
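The example above can be sketched in a few lines. The list length of 100 is an assumption made for this sketch, and the function name is hypothetical.

```python
import random

def systematic_sample(list_length, start, step):
    """Return 1-based positions: start, start + step, ... within the list."""
    return list(range(start, list_length + 1, step))

# The slide's example: start at item 17, then take every 10th item.
positions = systematic_sample(list_length=100, start=17, step=10)
print(positions)

# In practice the starting place is picked at random, for example
# somewhere in the first `step` items.
rng = random.Random(7)  # fixed seed only so the run is repeatable
random_start = rng.randint(1, 10)
print(systematic_sample(list_length=100, start=random_start, step=10))
```

Only the starting place is random. Once it is chosen, every other position in the sample is determined.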
Systematic Random Sample
Since we have 100 Jelly Blubbers and we want a
sample of 5, we need to count every _____th Jelly
Blubber. Use your calculator to randomly choose a
starting place. Record their measurements and
calculate the sample average (x̄).
•Advantages?
•Disadvantages?
Cluster Sample
A cluster sample involves splitting the population into
subgroups (called Clusters).
This is useful when you think all subgroups are pretty similar
and each group will adequately represent the population.
To get a cluster sample:
1. Split your population into heterogeneous groups, called
clusters.
2. Use an SRS to determine which cluster(s) to sample. Then,
sample everyone in those clusters.
ALL from SOME!
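A minimal sketch of "ALL from SOME". It assumes a hypothetical population of 100 members labeled 1 to 100 that already sits in 20 clusters of 5. The function name and seed are made up for illustration.

```python
import random

def cluster_sample(clusters, how_many_clusters, seed=None):
    """Pick whole clusters by SRS, then keep everyone in the chosen clusters."""
    rng = random.Random(seed)
    picked = rng.sample(range(len(clusters)), how_many_clusters)
    sample = []
    for index in picked:
        sample.extend(clusters[index])   # ALL members of a chosen cluster
    return sample

# Hypothetical clusters: [1..5], [6..10], ..., [96..100]
clusters = [list(range(start, start + 5)) for start in range(1, 101, 5)]

sample = cluster_sample(clusters, how_many_clusters=1, seed=3)
print(sample)
```

The randomness happens at the cluster level: individuals are never picked one at a time.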
Example of Cluster Sampling
Suppose that I want to find out what proportion
of ACHS seniors plan on leaving Iowa after
graduation.
– What would be wrong with just sampling the
seniors in the AP Statistics classes?
– What existing structure in our school could be
used as clusters?
Cluster Random Sample
Using your calculator, pick a random number
between 1 and 20, then multiply that by 5. Your
sample will be that Jelly Blubber and the four
Blubbers preceding it. Calculate and record your
sample average (x̄).
•Advantages?
•Disadvantages?
Stratified Random Sample
A stratified sample is a bit more complicated than the others. It involves
first splitting the population into subgroups that are all different in one
way.
This is useful when you think a certain characteristic (age, gender,
address, etc.) may be an influence on the parameter you are trying to
estimate.
To get a stratified sample:
1. Split your population into homogeneous groups, called strata.
2. Within each stratum, use an SRS to determine who is sampled.
3. Combine the samples from each stratum into one overall sample.
SOME from ALL!
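A minimal sketch of "SOME from ALL". The strata here (student ID numbers grouped by grade level) and their sizes are made up for illustration, as are the function name and seed.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take an SRS of `per_stratum` members from every stratum."""
    rng = random.Random(seed)
    picks = {}
    for name, members in strata.items():
        picks[name] = rng.sample(members, per_stratum)  # SOME from each
    return picks

# Hypothetical strata: student ID numbers by grade level.
strata = {
    "freshmen": list(range(1, 301)),
    "sophomores": list(range(301, 581)),
    "juniors": list(range(581, 841)),
    "seniors": list(range(841, 1081)),
}

by_stratum = stratified_sample(strata, per_stratum=3, seed=11)

# Combine the per-stratum samples into one overall sample.
overall_sample = [s for picks in by_stratum.values() for s in picks]
print(by_stratum)
```

Every stratum is guaranteed a seat in the sample, which a plain SRS cannot promise.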
Example of Stratified Sampling
I wonder what percentage of ACHS students are
in favor of the new proposed rules at school
dances.
Is it possible that certain
segments of the school
population might feel differently
about this issue?
If so… better stratify!
Stratified Random Sample
The Jelly Blubbers have already been separated
into 5 different strata. Using your calculator,
pick a single random Jelly Blubber to measure
from each stratum. Calculate and record your
sample average.
•Advantages?
•Disadvantages?
Now… Analyze the Outcomes!
•Did we all get the same results each time?
•Does each graph look alike?
•Which one does the best job of predicting Jelly
Blubber length? Why?
Identify the Sampling Method Used
a) We want to know what percentage of local doctors accept Medicare
patients. We call the offices of 50 doctors randomly selected from the
local Yellow Pages.
b) We want to know what percentage of Iowa shopping mall businesses
anticipate hiring additional employees in the upcoming month. We
randomly select 3 shopping malls from a list of Iowa malls and then
survey every business in those malls.
c) We want to know if students at our school are satisfied with the food
available at ACHS. We go to the cafeteria and interview every 10th person
in line.
d) We want to know the average gas mileage for cars. We randomly select
20 Toyotas, 15 Hondas, 15 Fords, and 12 Chevrolets.
Two More ways to sample….
• Voluntary Response Sampling
• Convenience Sampling
Watch out for Bias
Bias means that something about the sample’s
design has systematically distorted the result
so that the sample would consistently
underestimate (or overestimate) the value you are
trying to measure (the population parameter).
There is usually no way to fix a biased sample
and no way to salvage useful information from
it.
AP EXAM TIP
When identifying bias in a sample, be sure
to also state the direction of the bias.
Does the bias tend to overestimate or underestimate the parameter being investigated?
Explain why this direction makes sense for
the situation.
Problems to watch for…
Sometimes the sampling frame (the list from
which we sample) is difficult to obtain or
even to define. This creates a problem because
the people who are left out of the list may differ
from the people on the list.
Remember President Landon?
Me neither.
Problems to watch for…
Many samples suffer from a bias called undercoverage, in
which some portion of the population is not sampled at
all or has a smaller representation in the sample than it
has in the actual population.
Problems to watch for…
A major issue in sampling is nonresponse bias, where
someone who is chosen for the sample cannot be
contacted or refuses to cooperate. The problem is that
those who don’t respond may differ from those who
did respond.
Non-Response Bias
A Survey about Surveys!
Problems to watch for…
Another major issue for surveys is known as response bias (not
to be confused with non-response!). Response bias refers to
anything in the survey that influences the responses, such as
wanting to please the interviewer, not wanting to answer
personal or legal questions, the wording of the questions, etc.
Response Bias!
Problems to watch for…
Watch out for the wording of the question in a
survey, as it can also influence the responses.
Asking a question with a leading statement is a
good way to bias the responses, which you don’t
want!
Response Bias!
How to combat bias…
Look for bias in any survey you encounter.
- If you are developing your own survey,
critique your survey before gathering data.
- Spend your resources and time trying to
reduce bias
- Pretest your survey so that you can make
changes before it is too late.
- Report your sampling method in detail!
AP test problem: 1997 #27
For the other options, determine the sampling
method.
Can you detect any possible bias in these
different sampling methods?