1/11/2013

Sampling Issues

Key issues involved in Sampling
• Non-probability sampling
• One-stage (also called “single-stage”) sampling
• Multi-stage sampling

Key Aspects of Sample Selection
• Sampling Frame – the set of people who have a chance of being selected
• Probability sampling procedures – each respondent must have a “known chance” of being selected
• Efficiency – how feasible is it to draw a sample from the list?
• Sample Size – related to efficiency, cost, and desired sampling error
• Response Rate – also related to efficiency and cost

Considerations: Sampling Frame
• Is the Sampling Frame comprehensive?
  • What is the problem with selecting a sample from a telephone book?
  • How comprehensive would a business’s customer list be?
• KEY: Are those selected significantly different from those to whom you will be generalizing results?
• Unit of analysis? Household? Individual? Visit?
• QUESTION: What is a PERFECT sampling frame?

Non-Probability Sampling
• What is it?
  • Nonprobability sampling does not involve random selection; probability sampling does
  • We may or may not represent the population well, and it will often be hard for us to know how well we've done so
• Types:
  • Convenience (also called “accidental” or “haphazard”): just interview whoever comes along
  • Snowball: the first respondent refers a friend. The friend also refers a friend, and so on.
  • Purposive (also called “judgmental”): similar to convenience, but you select based upon a specific purpose
    • Example: Looking specifically for high school seniors, the interviewer went to a high school, asked students what grade they were in, and interviewed only the seniors.

Probability Sampling
• Also known as “random sampling”
• The probability of getting any particular sample may be calculated
• One-stage
  • Simple random sample
  • Systematic sample
  • Stratified sample
• Multi-stage

Simple Random Sample
• Approximates drawing numbers out of a hat
• Identify the total number of “subjects” in the sampling frame
• Assign each subject a number
• Use a random-number generator (or a table of random numbers) to select the number of subjects you need for the sample
• Free random-number generator: http://stattrek.com/statistics/random-numbergenerator.aspx

Systematic Sample
• Identify the total number of “subjects” in the sampling frame (N) and the number of respondents you want in the sample pool (n)
• N / n gives the sampling interval: you select every (N / n)th person
• Select a random starting point within the first interval
• Example:
  • N = 12; you want to select 4 from the population (n = 4)
  • 12 / 4 = 3; you want to sample every 3rd person
  • You choose a random start point of 2
  • Starting with the second person, select every 3rd person (persons 2, 5, 8, and 11)

Stratified Sample
• Stratification is the process of dividing members of the population into homogeneous subgroups before sampling
• Identify strata (each respondent is assigned to one and only one stratum)
• Then select (either systematically or randomly) the desired number of respondents from each stratum
• Usually called a stratified random sample, regardless of whether selection within strata is systematic or random
• Example:
  • Create four strata based on class level in college (freshman, sophomore, junior, senior)
  • Select your sample out of each stratum so that you have the desired number of freshmen, sophomores, juniors, and seniors

One-Stage versus Multi-Stage Sampling
• In one-stage, you sample only “once”
• In multi-stage, you sample two or more times, delving down deeper each time
• Example: You want to survey visitors leaving Disney properties.
  • Understanding that those visiting Disneyland may have different impressions than those visiting other Disney resorts, you first sample resorts (you can’t do all of them, but you can do some of them) – 1st stage
  • You also understand that day of week may make a difference, so within each of the selected Disney resorts you sample based upon day of week – 2nd stage
  • Then, knowing that time of day is also important, you sample day parts (within Disney resorts and within day of week) – 3rd stage

Other Sampling Considerations
• Estimating Parameters or Identifying Differences
• Quota
• Weighting
• Oversample
• Split Sample

Estimate Parameters? Or Identify Whether Differences Exist Between Groups?
• When determining “how many surveys you need and from what population,” you need to understand the difference between estimating parameters for a population and observing statistically significant differences among subgroups.
• Estimating Parameters (generalizing to the population):
  • If you want to say, “54% of men hold positive perceptions of Disneyland,” you will need a sample size sufficient to yield an acceptable sampling error (usually at least 100)
• Establishing Differences:
  • If you want to say something like, “Men tend to hold more positive perceptions of Disneyland than women,” you can do this with a crosstab (if the “cell size” is large enough or the observed differences are large enough)

Establishing “Quotas”
• Say your population of college students is as follows:
  • 40% freshmen
  • 25% sophomores
  • 20% juniors
  • 15% seniors
• Even if you correctly select your survey pool from this population and start interviewing, you could end up with a completely different breakdown
• In that case, you may wish to establish a quota for specific groups (e.g., once freshmen make up 40% of your target sample, you stop interviewing freshmen)
• OR you may want to randomly/systematically exclude the “extra” freshmen from the sample after the fact

Weighting
• Done “after the fact” using SPSS or other
  statistical packages
• Figure out what the proportions of the population are
• Identify how many surveys you want out of each group (regardless of proportions)
• Conduct the survey
• Weight down the disproportionately large group

Oversampling
• Conduct the basic survey (using the DL/DCA example, conduct a survey that meets the appropriate proportions). Call this the Base Sample:
  • DCA: 200 interviews (40%)
  • DL: 300 interviews (60%)
• Set this sample “aside.”
• Interview an additional 100 DCA visitors. Combine this “oversample” with the 200 you got from the Base Sample for a total of 300 interviews with DCA visitors
• Tabulate results as follows:
  • Total population: from the Base Sample, to estimate parameters for Disney visitors
  • DL visitors: from the 300 interviews gathered in the Base Sample
  • DCA visitors: from the combined sample of the 200 interviews gathered in the Base Sample and the 100 additional interviews from the oversample

Split Sample
• Used when the questionnaire is too long or you want to test alternative concepts
• Select a large sample
• Separate into two or more samples (selected at random OR during data collection with a random variable)
• Design the questionnaire with “base” questions and:
  • “Version 1” and “Version 2” questions, OR
  • Alternative order questions
• Gather data for all versions simultaneously
• Tabulate and report results accordingly
[Diagram: Intro & General Qstns, followed by Version 1 Qstns / Version 2 Qstns / Version 3 Qstns, followed by Demo Qstns]

(Attempted) Census
• Attempt to survey EVERYONE in the population
• Works if the population is small enough
• EXAMPLES:
  • End-of-semester evaluations of instructors (N = 25?)
  • Workshop attendees
• Can apply a finite population correction factor when calculating sampling error
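
To illustrate the finite population correction mentioned for an attempted census, here is a minimal Python sketch. It assumes the quantity of interest is a proportion, uses the usual normal-approximation margin of error at 95% confidence (z = 1.96), and applies the standard correction factor sqrt((N - n) / (N - 1)). The function names and the sample numbers (20 completed evaluations out of a class of 25) are hypothetical, not taken from the slides.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion p
    based on n completed surveys (no finite population correction)."""
    return z * math.sqrt(p * (1 - p) / n)


def finite_population_correction(n, N):
    """Standard FPC factor sqrt((N - n) / (N - 1)) for sampling
    without replacement from a population of size N."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    if N == 1:
        return 0.0  # the single member was surveyed, so no sampling error
    return math.sqrt((N - n) / (N - 1))


def corrected_margin_of_error(p, n, N, z=1.96):
    """Margin of error after applying the finite population correction."""
    return margin_of_error(p, n, z) * finite_population_correction(n, N)


if __name__ == "__main__":
    # Hypothetical attempted census: 20 of 25 students return the evaluation.
    p, n, N = 0.5, 20, 25
    print("uncorrected:", round(margin_of_error(p, n), 3))
    print("corrected:  ", round(corrected_margin_of_error(p, n, N), 3))
```

The correction shrinks the sampling error as the number of completed surveys approaches the population size, and it reaches zero when everyone responds, which is why it matters mainly for small populations such as a single class or a workshop.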