STAT 110 - Section 5
Lecture 8
Professor Hao Wang
University of South Carolina
Spring 2012

Last time: The Beauty of Sampling
With proper sampling methods, a sample of about 1000 adults lets us estimate, almost certainly to within 3% (i.e., MOE = 3%), the percentage of the entire population who have a certain trait or opinion. This result does not depend on how large the population is, provided the population is much larger than the sample.

Chapter 4 – Sample Surveys in the Real World
Types of errors:
1. Sampling errors
   a. Random sampling error
   b. Bad sampling methods
2. Nonsampling errors
   a. Processing errors
   b. Poorly worded questions
   c. Response errors
   d. Nonresponse

sampling errors – errors caused by the act of taking a sample. They cause sample results to differ from the results of a census.
sampling frame – a list of individuals from which we will draw our sample. It should list every individual in the population.

Errors in Sampling
random sampling error – results from chance selection in the simple random sample
• MOE lets us calculate how serious the error is.
• The error is due to chance, so it is always present. A large sample helps control it.
• MOE includes only random sampling error.
• Most sample surveys are afflicted with errors other than random sampling errors.

Errors in Sampling
Bad sampling method – using a convenience sample or a voluntary response sample is also a form of sampling error.
• Voluntary response sample
• Convenience sample
undercoverage – occurs when some groups in the population are left out of the process of choosing the sample

Undercoverage
• Using a telephone directory to survey the general population.
• Problem: excludes those who move often, those with unlisted home numbers, and those without a phone.
• Solution: use random digit dialing.
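
As a rough check on the recap figure (about 1000 adults, MOE = 3%) and on the point that a larger sample helps control random sampling error, here is a minimal sketch. It assumes the common quick approximation MOE ≈ 1/√n for 95% confidence, which these slides do not state explicitly; the function name is hypothetical. Population size does not appear anywhere in the calculation.

```python
import math


def quick_margin_of_error(n: int) -> float:
    """Approximate 95% margin of error for a sample proportion.

    Assumption (not stated in the slides): the quick method MOE ~ 1/sqrt(n),
    valid for a simple random sample from a much larger population.
    The result is a fraction, e.g. 0.03 means 3 percentage points.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1 / math.sqrt(n)


# Larger samples shrink the random sampling error; population size is not used.
for n in (100, 1000, 4000):
    print(f"n = {n:>4}: MOE is about {quick_margin_of_error(n):.1%}")
```

For n = 1000 this gives roughly 3%, matching the recap. It covers random sampling error only; it says nothing about undercoverage or the nonsampling errors discussed in this lecture.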
nonsampling errors – errors not related to the act of selecting a sample from the population. They can even be present in a census.
• nonresponse (missing data)
• response errors
• processing errors

Example
The subject lies about past drug use.
A. Sampling Error: Bad Sampling Method
B. Nonsampling Error: Response Error
C. Nonsampling Error: Nonresponse
D. Nonsampling Error: Processing Error

Example
The subject cannot be contacted after five calls.
A. Sampling Error: Bad Sampling Method
B. Nonsampling Error: Response Error
C. Nonsampling Error: Nonresponse
D. Nonsampling Error: Processing Error

Example
Interviewers choose people on the street to interview.
A. Sampling Error: Bad Sampling Method
B. Nonsampling Error: Response Error
C. Nonsampling Error: Nonresponse
D. Nonsampling Error: Processing Error

Consider Wording
Be aware that the wording of a question influences the answers.
Examples:
Is our government providing too much money for welfare programs? – 44% said “yes”
Is our government providing too much money for assistance to the poor? – 13% said “yes”

More Complex Sample Designs
• Sometimes a strict simple random sample is difficult to obtain.
- Multistage Sampling Design
- Cluster Sampling
- Systematic Sampling
- Stratified Random Sampling

Stratified Random Sample
• Step 1: Divide the sampling frame into distinct groups of individuals, called strata.
  – Choose strata because you have an interest in the groups or because the individuals within each group are similar.
  – Example: graduate/undergraduate students
• Step 2: Take a separate SRS in each stratum and combine these to make up the complete sample.

Stratified Random Sample
A club has 25 student members and 10 faculty members. The club can send 4 students and 2 faculty members to a convention.
Students
01 Barrett   06 Frazier     11 Hu            16 Liu        21 Ren
02 Brady     07 Gibellato   12 Jimenez       17 Marin      22 Santos
03 Chen      08 Gulati      13 Katsaounis    18 Nemeth     23 Sroka
04 Draper    09 Han         14 Kim           19 O’Rourke   24 Tordoff
05 Duncan    10 Hostetler   15 Kohlschmidt   20 Paul       25 Wang

Faculty
0 Berliner    2 Dean      4 Goel   6 Moore   8 Stasney
1 Craigmile   3 Fligner   5 Lee    7 Pearl   9 Wolfe

Line 116: 14459 26056 31424 80371 65103 62253 50490 61181
Choose a stratified random sample of 4 students, then of 2 faculty.

Cluster Sampling
• To reduce sampling costs, researchers gain efficiency by sampling from clusters.
• Clusters are often formed by geographic location, resulting in decreased travel costs for the research company.
• Randomly sample clusters, then survey everyone in each selected cluster.

Cluster Sample
Divide the population into clusters. Select one or more clusters and include everyone in those clusters in the sample.
• Example: SC has 46 counties. Select 5 counties at random and use all households in each selected county as the sample.
• Example: USC has 30 dorms and each dorm has 6 floors; the 180 floors form the clusters. Take a random sample of floors and measure everyone on those floors.

• Want to find the opinions of US adults, but want to save on time and money by randomly selecting residences. All adults residing in a sampled residence will be interviewed.
A. Stratified
B. Cluster
C. Both

• Want to find the opinions of US adults and need to make sure that 3 specific religious groups are represented. You sample 100 Christians, 100 Jews, and 100 Muslims.
A. Stratified
B. Cluster
C. Both

• Want to find the opinions of city-dwelling US adults and need to make sure that the east and west coasts are represented. You send 5 interviewers to the east coast and 5 to the west coast. On the west coast, 5 city blocks are chosen at random and everyone living in a chosen city block is interviewed (similarly for the east coast).
A. Stratified
B. Cluster
C. Both

Questions to Ask Before You Believe a Poll
• Who carried out the survey?
• What was the population?
• How was the sample selected?
• How large was the sample?
• What was the margin of error?
• What was the response rate?
• How were the subjects contacted?
• When was the survey conducted?
• What questions were asked?

USC has 20,065 undergraduates and 7,423 graduate students. In an effort to gauge the opinions of all students on campus parking issues, a simple random sample of 201 undergraduates and a simple random sample of 74 graduate students are taken. This is an example of:
A – a cluster sample
B – a systematic sample
C – a stratified random sample
D – undercoverage

USC has 20,065 undergraduates and 7,423 graduate students. In an effort to gauge the opinions of all students on campus parking issues, a simple random sample of 201 undergraduates is taken. This is an example of:
A – a cluster sample
B – a systematic sample
C – a stratified random sample
D – undercoverage
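
Returning to the club example from the stratified random sample slides, the two-step procedure (a separate SRS in each stratum, then combine) can be worked mechanically from line 116 of the random digit table. The sketch below is one way to do it, under assumptions the slides do not spell out: students are read as two-digit labels 01 to 25, faculty as one-digit labels 0 to 9, out-of-range and repeated labels are skipped, and the faculty draw continues in the line where the student draw stopped. The helper `draw_labels` is a hypothetical name.

```python
# Rosters in label order, copied from the slide (student labels 01-25, faculty 0-9).
STUDENTS = [
    "Barrett", "Brady", "Chen", "Draper", "Duncan",
    "Frazier", "Gibellato", "Gulati", "Han", "Hostetler",
    "Hu", "Jimenez", "Katsaounis", "Kim", "Kohlschmidt",
    "Liu", "Marin", "Nemeth", "O’Rourke", "Paul",
    "Ren", "Santos", "Sroka", "Tordoff", "Wang",
]
FACULTY = [
    "Berliner", "Craigmile", "Dean", "Fligner", "Goel",
    "Lee", "Moore", "Pearl", "Stasney", "Wolfe",
]

# Line 116 of the random digit table, with the spacing removed.
LINE_116 = "14459 26056 31424 80371 65103 62253 50490 61181".replace(" ", "")


def draw_labels(digits, start, width, valid, k):
    """Read fixed-width groups from `digits` starting at `start`.

    Keeps the first k distinct labels that fall in `valid`, skipping the rest.
    Returns the chosen labels and the position of the next unread digit.
    """
    chosen = []
    pos = start
    while len(chosen) < k:
        if pos + width > len(digits):
            raise ValueError("ran out of random digits")
        label = int(digits[pos:pos + width])
        pos += width
        if label in valid and label not in chosen:
            chosen.append(label)
    return chosen, pos


# Stratum 1: SRS of 4 students using two-digit groups.
student_labels, next_pos = draw_labels(LINE_116, 0, 2, range(1, 26), 4)
# Stratum 2: SRS of 2 faculty using single digits, continuing where we stopped
# (an assumed convention; restarting elsewhere in the table is equally valid).
faculty_labels, _ = draw_labels(LINE_116, next_pos, 1, range(0, 10), 2)

student_names = [STUDENTS[label - 1] for label in student_labels]
faculty_names = [FACULTY[label] for label in faculty_labels]

print("Students:", student_names)
print("Faculty: ", faculty_names)
```

Under these assumptions the sketch selects student labels 14, 03, 10, 22 and faculty labels 5 and 3. A different labeling or a different starting point in the table would give a different, equally valid, stratified sample.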