Sampling Issues
1/11/2013
Key issues involved in Sampling
Non-probability sampling
One-stage (also called “single-stage”) sampling
Multi-stage sampling
Key Aspects of Sample Selection
 Sampling Frame – the set of people that has a chance of being
selected
 Probability sampling procedures – each respondent must
have a “known chance” of being selected.
 Efficiency – how feasible is it to draw a sample from the list?
 Sample Size – related to efficiency, cost, and desired sampling
error
 Response Rate – also related to efficiency and cost
Considerations: Sampling Frame
 Is the Sampling Frame comprehensive?
 What is the problem with selecting a sample from a telephone
book?
 How comprehensive would a business’ customer list be?
 KEY: Are those selected significantly different from those
to whom you will be generalizing results?
 Unit of analysis? Household? Individual? Visit?
 QUESTION: What is a PERFECT sampling frame?
Non-Probability Sampling
 What is it?
 Nonprobability sampling does not involve random selection and
probability sampling does
 We may or may not represent the population well, and it will often be
hard for us to know how well we've done so
 Types:
 Convenience (also called “accidental” or “haphazard”)
 Just interview whoever comes along
 Snowball - The first respondent refers a friend. The friend also
refers a friend, and so on.
 Purposive (also called “judgmental”): similar to convenience, but
you select respondents based upon a specific purpose
 Example: looking specifically for high school seniors, you go to a high school,
ask students what grade they are in, and interview only the seniors.
Probability Sampling
 Also known as “random sampling”
 The probability of getting any particular sample may be
calculated
 One-stage
 Simple random sample
 Systematic sample
 Stratified sample
 Multi-stage
Simple Random Sample
 Approximates drawing numbers
out of a hat
 Identify the total number of
“subjects” in the sampling frame
 Assign each subject a number
 Use a random-number generator (or a table of random
numbers) to select as many subjects as your sample requires
 Sample free random-number generator:
http://stattrek.com/statistics/random-numbergenerator.aspx
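The steps above can be sketched in a few lines of Python. This is only an illustration: the frame of 100 numbered subjects, the sample size of 10, and the function name `simple_random_sample` are assumptions, not part of the original slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n subjects from the sampling frame, without replacement.

    Every subject in the frame has the same chance of being selected.
    """
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, n)

# Hypothetical sampling frame: subjects numbered 1 through 100.
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=42)
print(sorted(sample))
```

Passing a seed is optional. It is useful when you need to document or reproduce exactly which subjects were drawn.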
Systematic Sample
 Identify the total number of “subjects” in the sampling frame (N)
and the number of respondents you want in the sample pool (n)
 Select a random starting point
 N / n gives you the sampling interval: select every (N / n)th subject
 Example:
 N = 12; you want to select 4 from the population (n = 4)
 12/4 = 3; you want to
sample every 3rd person
 You choose a random
start point of 2 (any number from 1 to 3)
 Starting with the second person,
select every 3rd person (persons 2, 5, 8, and 11)
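The slide's example (12 people in the frame, 4 to be selected, every 3rd person, random start of 2) can be reproduced with a short Python sketch. The function name `systematic_sample` and the choice to compute the interval with integer division are assumptions made for illustration.

```python
import random

def systematic_sample(frame, n, start=None, seed=None):
    """Select every k-th subject, where k = N // n is the sampling interval.

    `start` is a 1-based position between 1 and k. If it is not given,
    it is chosen at random.
    """
    N = len(frame)
    k = N // n  # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)
    # Convert the 1-based start to a 0-based index, then step by k.
    return [frame[i] for i in range(start - 1, N, k)][:n]

frame = list(range(1, 13))                    # 12 people, numbered 1..12
print(systematic_sample(frame, 4, start=2))   # [2, 5, 8, 11]
```

When N is not an exact multiple of n, integer division leaves a few subjects at the end of the list that can never be selected. Other conventions exist for that case and are not covered here.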
Stratified Sample
 Stratification is the process of dividing members of the population
into homogeneous subgroups before sampling.
 Identify strata (a respondent can be assigned to one and only
one stratum)
 Then select (either systematically or randomly) the desired
number of respondents from each stratum
 Usually called (regardless of systematic or random) a stratified
random sample
 Example:
 Create four strata based on class level
in college (freshman, sophomore,
junior, senior)
 Select your sample out of each stratum
so that you have the desired number of
freshmen, sophomores, juniors, and seniors
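As a rough sketch of the class-level example in Python: the student records, the per-group sample sizes, and the function name `stratified_sample` below are all hypothetical, chosen only to show the mechanics of grouping first and sampling second.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Group subjects into strata, then draw a random sample within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in frame:
        # Each subject lands in exactly one stratum.
        strata[stratum_of(subject)].append(subject)
    return {
        name: rng.sample(members, n_per_stratum[name])
        for name, members in strata.items()
    }

# Hypothetical frame: 200 students as (id, class level), 50 per level.
levels = ["freshman", "sophomore", "junior", "senior"]
frame = [(i, levels[i % 4]) for i in range(200)]

# Hypothetical desired number of respondents from each stratum.
wanted = {"freshman": 8, "sophomore": 5, "junior": 4, "senior": 3}

picked = stratified_sample(frame, lambda s: s[1], wanted, seed=1)
for level in levels:
    print(level, len(picked[level]))
```

The sketch samples randomly within each stratum. A systematic selection within each stratum would fit the slide's description equally well.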
One-Stage versus
Multi-Stage Sampling
 In one-stage, you sample only “once”
 In multi-stage, you sample two or more times, delving down
deeper each time
 Example:
 You want to survey visitors leaving Disney properties. Understanding
that those visiting Disneyland may have different impressions than
those visiting other Disney resorts, you first sample resorts (you can’t
do all of them, but you can do some of them) – 1st stage
 You also understand that day of week may make a difference so,
within each of the selected Disney resorts, you sample based upon
day of week – 2nd stage
 Then, knowing that time of day is also important, you sample (within
Disney resorts and within Day of Week) day parts – 3rd Stage
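The three stages can be sketched as nested random draws. In this Python illustration the resort labels, the day-part labels, the number of units drawn at each stage, and the function name `multi_stage_sample` are assumptions, not details from the slides.

```python
import random

def multi_stage_sample(resorts, days, day_parts,
                       n_resorts, n_days, n_parts, seed=None):
    """Return a list of (resort, day, day part) interviewing slots."""
    rng = random.Random(seed)
    plan = []
    for resort in rng.sample(resorts, n_resorts):        # 1st stage
        for day in rng.sample(days, n_days):             # 2nd stage
            for part in rng.sample(day_parts, n_parts):  # 3rd stage
                plan.append((resort, day, part))
    return plan

# Hypothetical labels for illustration only.
resorts = ["Resort A", "Resort B", "Resort C", "Resort D", "Resort E"]
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
day_parts = ["morning", "afternoon", "evening"]

plan = multi_stage_sample(resorts, days, day_parts, 2, 3, 2, seed=5)
for slot in plan:
    print(slot)
```

Days are drawn separately within each selected resort, and day parts separately within each selected day, which is what "delving down deeper each time" amounts to.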
Other Sampling Considerations
• Estimating Parameters or Identifying Differences
• Quota
• Weighting
• Oversample
• Split Sample
Estimating Parameters? Or Identifying Whether
Differences Exist Between Groups?
 When determining “how many surveys you need and from
what population,” you need to understand the difference
between estimating parameters for a population and
observing statistically significant differences amongst
subgroups.
 Estimating Parameters (generalizing to the population):
If you want to say, “54% of men hold positive perceptions of
Disneyland,” you will need a sufficient sample size to
achieve an acceptable sampling error (usually at least 100 respondents)
 Establishing Differences:
If you want to say something like, “Men tend to hold more
positive perceptions of Disneyland than women,” you can do
this with a crosstab (if the “cell size” is large enough or the
observed differences are large enough)
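To make the two goals concrete, here is a small Python sketch. It assumes a simple random sample, a 95% confidence level (z = 1.96), and the usual chi-square test for a 2x2 crosstab. The counts in the crosstab are hypothetical, and neither formula is stated in the original slides.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate sampling error for a proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

def chi_square_2x2(table):
    """Chi-square statistic for a 2x2 crosstab [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Estimating a parameter: 54% positive among 100 men.
print(round(margin_of_error(0.54, 100), 3))   # 0.098, about +/- 10 points
print(round(margin_of_error(0.54, 400), 3))   # 0.049, about +/- 5 points

# Establishing a difference (hypothetical counts):
#            positive  not positive
# men           60          40
# women         45          55
chi2 = chi_square_2x2([[60, 40], [45, 55]])
print(round(chi2, 2))   # 4.51, above 3.841 (5% critical value, 1 df)
```

With 100 respondents per group, a 15-point gap is just large enough to clear the conventional 5% threshold. A smaller gap or smaller cells would not, which is the trade-off the slide describes.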
Establishing “Quotas”
 Say your population of college students is as follows:
 40% freshmen
 25% sophomores
 20% juniors
 15% seniors
 Even if you correctly select your survey pool from this
population and start interviewing, you could end up with a
completely different breakdown
 Then you may wish to establish a quota for specific groups
(e.g., once you get 40% freshmen, you stop interviewing
freshmen) OR you may want to randomly/systematically
exclude the “extra” freshmen from the sample after the fact
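The first option (stop interviewing a group once its quota is full) might look like the following Python sketch. The target of 100 completed interviews, the stream of respondents, and the function name `apply_quotas` are hypothetical.

```python
def apply_quotas(respondents, quotas):
    """Accept respondents in arrival order until each group's quota is full."""
    counts = {group: 0 for group in quotas}
    accepted = []
    for person_id, group in respondents:
        # Groups without a quota are never interviewed in this sketch.
        if counts.get(group, 0) < quotas.get(group, 0):
            counts[group] += 1
            accepted.append((person_id, group))
    return accepted, counts

# Quotas for 100 interviews, matching the population breakdown above.
quotas = {"freshman": 40, "sophomore": 25, "junior": 20, "senior": 15}

# Hypothetical stream: respondents arrive in equal numbers from each class,
# so without quotas the sample would be 25% from each class.
classes = ["freshman", "sophomore", "junior", "senior"]
stream = [(i, classes[i % 4]) for i in range(400)]

accepted, counts = apply_quotas(stream, quotas)
print(counts)
```

Filling quotas in arrival order keeps the breakdown on target, but who ends up inside each group still depends on who happened to be reached first.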
Weighting
 Done “after the fact” using SPSS or other statistical packages
 Figure out what the proportions of the population are
 Identify how many surveys you want out of each group
(regardless of proportions)
 Conduct the survey
 Weight down the groups that are disproportionately large in the
sample so that the weighted sample matches the population proportions
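The slides mention SPSS. The same arithmetic can be sketched in plain Python, assuming the common approach of dividing each group's population share by its sample share. The 60%/40% population split comes from the DL/DCA example in these slides. The 250/250 sample and the counts of positive answers are hypothetical.

```python
def poststratification_weights(population_props, sample_counts):
    """Weight = population share / sample share, for each group."""
    total = sum(sample_counts.values())
    return {
        g: population_props[g] / (sample_counts[g] / total)
        for g in sample_counts
    }

def weighted_proportion(positive_counts, sample_counts, weights):
    """Share of positive answers after applying the group weights."""
    num = sum(weights[g] * positive_counts[g] for g in sample_counts)
    den = sum(weights[g] * sample_counts[g] for g in sample_counts)
    return num / den

population_props = {"DL": 0.60, "DCA": 0.40}
sample_counts = {"DL": 250, "DCA": 250}      # equal groups by design
positive_counts = {"DL": 175, "DCA": 125}    # hypothetical results

weights = poststratification_weights(population_props, sample_counts)
print(weights)  # DL is weighted up (about 1.2), DCA down (about 0.8)

unweighted = sum(positive_counts.values()) / sum(sample_counts.values())
weighted = weighted_proportion(positive_counts, sample_counts, weights)
print(round(unweighted, 2), round(weighted, 2))   # 0.6 0.62
```

Here DCA visitors are over-represented in the sample (50% versus 40% in the population), so their answers count for less in the overall estimate.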
Oversampling
 Conduct the basic survey (using the Disneyland (DL) / Disney California
Adventure (DCA) example, conduct a survey that meets the appropriate
proportions). Call this the Base Sample:
 DCA: 200 interviews (40%)
 DL: 300 interviews (60%)
 Set this sample “aside.”
 Interview an additional 100 DCA visitors.
 Combine this “oversample” with the 200 you got from the base
sample for a total of 300 interviews with DCA visitors
 Tabulate results as follows:
 Total population: from the Base Sample to estimate parameters for
Disney visitors
 DL visitors: from the 300 interviews gathered in the Base Sample
 DCA visitors: from the combined sample of the 200 interviews
gathered in the Base Sample and the 100 additional interviews from
the oversample
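The tabulation rule above (totals from the Base Sample only, the oversampled group from the combined interviews) can be sketched as follows. The record layout and the function name `split_for_tabulation` are assumptions for illustration.

```python
def split_for_tabulation(base_sample, oversample, oversampled_group):
    """Return (records for total estimates, records per group).

    Total-population estimates use the base sample only, so the
    oversample does not distort the overall proportions.
    """
    total = list(base_sample)
    by_group = {}
    for record in base_sample:
        by_group.setdefault(record["park"], []).append(record)
    # Only the oversampled group gets the additional interviews.
    by_group.setdefault(oversampled_group, []).extend(oversample)
    return total, by_group

# Interview counts from the example: 200 DCA + 300 DL in the base sample,
# plus 100 additional DCA interviews.
base = ([{"park": "DCA", "id": i} for i in range(200)]
        + [{"park": "DL", "id": 200 + i} for i in range(300)])
extra = [{"park": "DCA", "id": 500 + i} for i in range(100)]

total, by_group = split_for_tabulation(base, extra, "DCA")
print(len(total), len(by_group["DL"]), len(by_group["DCA"]))  # 500 300 300
```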
Split Sample
 Used when the questionnaire is too long or you want to test
alternative concepts
 Select a large sample
 Separate into two or more samples
(selected at random OR during
data collection with a random variable)
 Design the questionnaire with “base” questions and:
 “Version 1” and “version 2” questions OR
 Alternative order questions
[Figure: questionnaire layout. Intro & General Questions; then Version 1, Version 2, or Version 3 Questions; then Demographic Questions]
 Gather data for all versions simultaneously
 Tabulate and report results accordingly
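Random assignment to questionnaire versions might be sketched like this in Python. The respondent IDs, the three version labels, and the function name `assign_versions` are hypothetical. Shuffling first and then dealing versions in turn is one way to keep the split random while keeping group sizes equal.

```python
import random
from collections import Counter

def assign_versions(respondent_ids, versions, seed=None):
    """Randomly assign each respondent to one questionnaire version."""
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)  # random order, so assignment is not tied to list order
    return {rid: versions[i % len(versions)] for i, rid in enumerate(ids)}

versions = ["Version 1", "Version 2", "Version 3"]
assignment = assign_versions(range(300), versions, seed=11)
print(Counter(assignment.values()))
```

All respondents still answer the base questions. Only the version-specific block differs, so results for that block are tabulated within each version's subsample.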
(Attempted) Census
 Attempt to survey EVERYONE in the population
 Works if the population is small enough
 EXAMPLES:
 End of semester evaluations of instructors (N = 25?)
 Workshop attendees
 Can apply a finite population correction factor when
calculating sampling error
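As an illustration of the correction, the Python sketch below uses the standard factor sqrt((N - n) / (N - 1)) and a 95% confidence level. The slides do not give a formula, so treat this as one common form. The class of 25 with 20 completed evaluations is a hypothetical case.

```python
import math

def finite_population_correction(N, n):
    """Correction factor for sampling n out of a population of N."""
    return math.sqrt((N - n) / (N - 1))

def margin_of_error_fpc(p, n, N, z=1.96):
    """Sampling error for a proportion, adjusted for a small population."""
    uncorrected = z * math.sqrt(p * (1 - p) / n)
    return uncorrected * finite_population_correction(N, n)

# Hypothetical: 20 of 25 students return the evaluation, p = 0.5.
uncorrected = 1.96 * math.sqrt(0.5 * 0.5 / 20)
corrected = margin_of_error_fpc(0.5, 20, 25)
print(round(uncorrected, 3), round(corrected, 3))

# A complete census (n = N) leaves no sampling error at all.
print(margin_of_error_fpc(0.5, 25, 25))  # 0.0
```

The closer the number of completed surveys gets to the full population, the more the correction shrinks the sampling error.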