Probability sampling

Sampling Procedures
January 24 & 26, 2011
Objectives
By the end of this meeting, participants
should be able to:
1) Distinguish the techniques behind and
advantages of various kinds of sampling
procedures.
2) Evaluate a survey sample based on
common problems that should be avoided.
3) Calculate and interpret the margin of error
for survey findings.
Sampling
Suppose that you wanted to know how the
American people felt about the war in
Afghanistan. What would you do?
a) Obviously, asking the entire public the question
would be prohibitively expensive.
b) Fortunately, you would not have to. A carefully
created and implemented survey of a few
thousand people would give you a good
reflection of the views of the entire country.
Sampling Method
a) The first thing a researcher needs to do is
decide what the population of interest is.
b) From there, a researcher needs to decide
what the sampling frame is (the list or procedure
from which the sample is actually drawn).
Types of Sample
There are two broad types of sample:
a) Non-probability sample
b) Probability sample
Types of Non-probability Samples
a) Typical sample- choosing people that seem
typical based on various demographic
factors
b) Purposive sample- choosing people
deliberately, possibly based on the advice
of others
c) Volunteer subjects- surveying people that
volunteer (American Idol, various instapolls)
Types of Non-probability Samples
d) Haphazard sampling- choosing those people that
are easiest to contact (Literary Digest 1936 poll,
class polls, street corner polls)
e) Quota sampling- choosing people so that the sample
matches the population of study on selected
characteristics (minorities, women, homosexuals, etc.)
f) Snowball sampling- interviewing a random
sample and asking them to identify people
they know who fit certain criteria. Generally used
for rare populations
Probability Sampling
a) Despite their widespread usage, non-probability
samples are almost universally inferior to
probability samples because they introduce bias
b) Probability sampling means that each
person in the population has a known
probability of being chosen (although not
necessarily equal)
Types of Probability Samples
a) Simple random sample- using a list of the
entire population, a random group is
chosen
b) Systematic selection procedure- using a
list of the entire population, a random
number is selected and that many units are
skipped between interviews
• Need to avoid periodicity
• Hard to generate lists
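The two list-based methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame is a made-up list of ID numbers, the function names are hypothetical, and the systematic version uses the common textbook variant (a fixed interval with a random starting point) rather than a randomly chosen skip number.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th unit from the list."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start within the first interval
    return [population[start + i * k] for i in range(n)]

# Hypothetical sampling frame: ID numbers for 1,000 people.
frame = list(range(1, 1001))
srs = simple_random_sample(frame, 50, seed=1)
sys_sample = systematic_sample(frame, 50, seed=1)
```

If the list itself has a repeating pattern that lines up with the interval k (periodicity), the systematic sample can over-represent one kind of unit, which is why the order of the list matters for that method.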
Types of Probability Samples
c) Stratified sample- divide the population into
pieces (strata) and randomly sample within each
piece (regions, counties, dorms, etc.; this
is the method used for election exit polls)
d) Cluster sample- divide the population into
clusters (think neighborhoods) and
conduct several interviews in each cluster.
This introduces bias but can significantly
reduce cost.
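The difference between the two designs can be seen in a short Python sketch. Everything here is hypothetical: the dorm names, their sizes, the 10% sampling fraction, and the function names are assumptions made for illustration only.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Randomly sample the same fraction within every stratum."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n))
    return sample

def cluster_sample(clusters, n_clusters, per_cluster, seed=None):
    """Randomly pick a few clusters, then interview several people in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        sample.extend(rng.sample(members, min(per_cluster, len(members))))
    return chosen, sample

# Hypothetical frame: four dorms, used once as strata and once as clusters.
dorms = {
    "North": [f"N{i}" for i in range(200)],
    "South": [f"S{i}" for i in range(300)],
    "East": [f"E{i}" for i in range(100)],
    "West": [f"W{i}" for i in range(400)],
}

strat = stratified_sample(dorms, fraction=0.10, seed=2)
chosen, clus = cluster_sample(dorms, n_clusters=2, per_cluster=5, seed=2)
```

The stratified sample includes people from every dorm, while the cluster sample only visits the dorms that were drawn, which is where the cost saving comes from.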
Types of Probability Samples
e) Multistage area sample- a portion of the
geographic area is sampled, followed by a
sampling of areas within the selected areas. The
areas will be weighted based on their
population.
f) Hybrid sampling- a combination of any of the
previous sampling methods. Such types include
Multiple Frame Designs (repeated attempts to
sample the same population) and Parallel
Samples (comparing a baseline sample to
another sample).
Telephone Samples
a) Telephone surveys are one of the most
common types of surveys
b) They can be potentially difficult because
not all people are listed and telephone
books are frequently out of date.
c) Some researchers use a method called
add-a-digit dialing. In this method, numbers
are chosen at random from the directory
and a digit is added to that number
Telephone Samples
d) Another method is random digit dialing. A
computer will generate a telephone number at
random (perhaps an area code, prefix or just a
suffix). This method needs roughly 5 numbers
for each number in the sample due to the high
rate of failure
e) Telephone samples suffer from multiple phone
lines in each home, refusal, difficulty reaching
anyone, etc.
• Bias?
• Alternative methods?
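As a rough illustration of random digit dialing, the sketch below fixes an area code and prefix and generates only the four-digit suffix at random. The area code, prefix, and function names are made up for the example; the factor of 5 is the rule of thumb quoted above.

```python
import random

def random_digit_numbers(area_code, prefix, count, seed=None):
    """Generate `count` distinct numbers with a fixed area code and prefix
    and a randomly generated four-digit suffix."""
    rng = random.Random(seed)
    suffixes = rng.sample(range(10000), count)
    return [f"({area_code}) {prefix}-{s:04d}" for s in suffixes]

def numbers_to_generate(target_sample_size, numbers_per_interview=5):
    """Roughly 5 generated numbers are needed per completed interview."""
    return target_sample_size * numbers_per_interview

# Hypothetical target: 100 completed interviews.
needed = numbers_to_generate(100)
numbers = random_digit_numbers("555", "010", needed, seed=3)
```

Because the suffix is generated rather than looked up, unlisted numbers have the same chance of selection as listed ones, but many generated numbers will be out of service or belong to businesses.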
Telephone Samples
f) One issue that faces telephone surveys is
which person to speak to within the
household. One of the most common
methods used is the next-birthday
method.
Problems in Sampling
Generally random sampling will give a sample that
reflects the broader population. There are still
potential problems that need to be considered.
a) Noncoverage error- parts of the population may
not be covered in the sample (for example,
people without phones, the infirm, etc.)
b) Sampling the wrong population- the precise
population needs to be sampled, not just a part of
it
Problems in Sampling
c) Response rate- lower response rates are
not necessarily a problem unless those who
refuse share demographic characteristics, so that
they differ systematically from those who respond.
d) Sampling error- this is the error inherent
in using a sample to generalize to a
broader population
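Noncoverage error can be illustrated with a small simulation. All of the figures below are invented for the example: a population of 10,000 in which people without phones hold different views from people with phones, and a telephone-only sampling frame.

```python
import random

# Hypothetical population of 10,000 people (all figures are made up):
# 8,000 have a phone and 60% of them approve;
# 2,000 have no phone and 30% of them approve.
population = (
    [{"phone": True, "approve": True}] * 4800
    + [{"phone": True, "approve": False}] * 3200
    + [{"phone": False, "approve": True}] * 600
    + [{"phone": False, "approve": False}] * 1400
)

def share_approving(people):
    """Proportion of the given people who approve."""
    return sum(p["approve"] for p in people) / len(people)

true_share = share_approving(population)        # whole population
phone_frame = [p for p in population if p["phone"]]
frame_share = share_approving(phone_frame)      # only people the frame covers

# A perfectly random sample of the phone frame still misses non-owners.
rng = random.Random(0)
phone_sample = rng.sample(phone_frame, 1000)
estimate = share_approving(phone_sample)
```

In this made-up population the true share is 54%, but the telephone frame can only ever estimate the 60% share among phone owners. A larger sample does not fix this, because the error comes from the frame rather than from sampling.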
Sampling Error
We need to consider sampling error:
a) The error that arises from trying to represent a
population with a sample.
b) Sampling error does not include other sorts of error that
can result from surveys.
We often speak of a 95% confidence interval.
a) If repeated samples were taken, 95% of the samples
would produce results within the margin of error of
the true population value.
b) “A statistician would say that we are taking a 5% chance
of drawing a faulty conclusion…” (WKB, 68).
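The "repeated samples" idea can be checked with a simulation. The sketch below is an illustration, not part of the lecture: it assumes a very large population with a true proportion of 55%, a sample size of 500, and a margin of error of 1.96 times the square root of p(1-p)/(n-1). All names and figures are assumptions.

```python
import math
import random

def margin_of_error(p, n, t=1.96):
    """Margin of error for a proportion, ignoring the sampling fraction."""
    return t * math.sqrt(p * (1 - p) / (n - 1))

def coverage(true_p, n, trials, seed=0):
    """Share of repeated samples whose result lands within the
    margin of error of the true population proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        yes = sum(rng.random() < true_p for _ in range(n))
        p_hat = yes / n
        if abs(p_hat - true_p) <= margin_of_error(p_hat, n):
            hits += 1
    return hits / trials

# Hypothetical setup: true approval of 55%, samples of 500, 2,000 repetitions.
share = coverage(true_p=0.55, n=500, trials=2000)
```

The value of `share` should come out close to 0.95. The remaining samples, roughly 5%, are the ones where we would draw a faulty conclusion.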
Margin of Error
Margin of error (formula on board)
a) where t = 1.96 for large samples
b) The value 1.96 comes from our understanding of
the distribution of possible values.
c) f is the sampling fraction (the fraction of the
population that is being sampled); $(1-f)^{1/2}$ is
ignored
   a) when sampling with replacement
   b) or when the population is very large
Margin of Error
d) p is the sample proportion (for example, the
proportion approving of President Obama).
• p and (1-p) may be written in either proportion (0 to 1)
or percentage (0 to 100) terms
• Decide which based on what unit you want the
margin of error to be expressed in.
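The formula itself was written on the board and is not reproduced in these slides. Judging from the worked example later in the deck, it appears to take the following form. This is a reconstruction, and the symbol N for the population size is an assumption added here so that f can be written out.

```latex
\[
\text{Margin of error} \;=\; \pm\, t \,\sqrt{\frac{p\,(1-p)}{n-1}}\;\sqrt{1-f},
\qquad f = \frac{n}{N}
\]
```

Here n is the sample size, and the second square root drops out when sampling with replacement or when the population is very large.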
Margin of Error: Example
a) Suppose there are 1,872 political science majors on campus. We
randomly select 250 and ask them whether they watched TV last
night. 66 percent of the sample respond “yes.”
b) How confident can we be that this percentage reflects our
underlying population of interest (political science majors)?
c) Recall: margin of error formula
d) p = .66 and (1-p) = .34; for percentages: p = 66 and (1-p) = 34
e) n = 250 and f = 250/1872
f) Margin of error = $\pm 1.96 \times [(66 \times 34)/249]^{1/2} \times [1 - (250/1872)]^{1/2} = \pm 5.5\%$
g) We are 95 percent confident that 66% ± 5.5% of political science
majors watched TV last night.
h) Or, we can say, we are 95% confident that the true population
percentage is between 60.5% and 71.5%.
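The arithmetic in this example can be reproduced with a short Python function. This is a sketch under the same assumptions as the example: n - 1 in the denominator and the (1 - f) adjustment for a finite population. The function and variable names are made up for illustration.

```python
import math

def margin_of_error(p, n, population_size=None, t=1.96):
    """Margin of error for a sample proportion p (on a 0 to 1 scale).

    If population_size is given, apply the (1 - f)^(1/2) adjustment,
    where f = n / population_size is the sampling fraction.
    """
    moe = t * math.sqrt(p * (1 - p) / (n - 1))
    if population_size is not None:
        f = n / population_size
        moe *= math.sqrt(1 - f)
    return moe

# The example above: 250 of 1,872 majors sampled, 66% say "yes".
p = 0.66
moe_pct = 100 * margin_of_error(p, n=250, population_size=1872)
low, high = 100 * p - moe_pct, 100 * p + moe_pct

print(round(moe_pct, 1))              # 5.5
print(round(low, 1), round(high, 1))  # 60.5 71.5
```

Leaving out `population_size` gives the large-population version of the formula, which for this sample is a somewhat wider margin of about 5.9 points.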
For January 31
a) Read WKB chapter 4.
b) Answer the following questions:
   a) Imagine we conducted a survey of 350 UGA
   students and found that 93% thought the Bulldogs
   would win the SEC championship. What is the
   margin of error (at 95%)? Based on that margin of
   error, we can be 95% sure that championship
   predictions lie in what range?
   b) Imagine we conducted a survey of 750 GA voters
   and found that Obama had approval of 55%. What is
   the margin of error (at 95%)? Based on that margin
   of error, we can be 95% sure that Obama’s support
   in GA lies in what range?