Sampling and Surveys
Section 4.1

The WHO of our data…
The population of interest is the entire group of people/things that we wish to study. Since we want to know things about the population, we need to figure out how to gather that data!

I know! Let’s Sample All of them!
A census is a “sample” of the entire population.

Problems with a Census
So why don’t we always do a census?
- difficult or impractical to complete
- too complex in terms of time and budget

Taking a Survey
Often, we ask questions of a small group (called a sample) in the hope of learning something about the entire population… These are called opinion polls or surveys.
Population → Collect data from a representative sample... → Sample → Make an inference about the population.

The Iowa Poll
About the poll
The Iowa Poll, conducted Dec. 8-11 for The Des Moines Register by Selzer & Co. of Des Moines, is based on interviews with 650 Iowans ages 18 or older. Interviewers contacted households with randomly selected landline and cellphone numbers. Responses were adjusted by age, sex and congressional district to reflect the general population based on recent census data.
Questions based on the sample of 650 Iowa adults have a maximum margin of error of plus or minus 3.8 percentage points. This means that if this survey were repeated using the same questions and the same methodology, 19 times out of 20, the findings would not vary from the percentages shown here by more than plus or minus 3.8 percentage points. Results based on smaller samples of respondents, such as by gender or age, have a larger margin of error.

So… Who was the sample? Who was the population?

The ultimate question…
Does the sample used represent the population accurately?

The Literary Digest Poll of 1936…
Alf Landon (R) versus Franklin Roosevelt (D)
Prediction: Alf Landon in a Landslide!!

President Dewey…??
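
Checking the Iowa Poll's margin of error
The 3.8 percentage points quoted by the Iowa Poll can be roughly reproduced with the usual worst-case formula for a sample proportion. This is only a sketch: it assumes a simple random sample and a 95% confidence level (the "19 times out of 20"), and it ignores the weighting adjustments the pollsters describe. The function name max_margin_of_error is made up for this example.

```python
import math

def max_margin_of_error(n, z=1.96):
    """Worst-case margin of error for a sample proportion, assuming an SRS of size n.

    z = 1.96 is the critical value for 95% confidence ("19 times out of 20").
    """
    # p = 0.5 makes p * (1 - p) as large as possible, so it gives the maximum margin
    return z * math.sqrt(0.5 * 0.5 / n)

moe = max_margin_of_error(650)
print(f"{100 * moe:.1f} percentage points")  # prints "3.8 percentage points"

# A smaller subgroup (the size 325 is hypothetical) gives a larger margin of error
print(max_margin_of_error(325) > max_margin_of_error(650))  # prints "True"
```

Because n sits in the denominator, cutting the sample down to a subgroup always widens the margin, which is what the poll's last sentence is warning about.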
The Moral of the Story…
If your sample was chosen in a poor manner, it doesn’t matter how many people you surveyed: bad data will still produce bad results. Garbage in, Garbage out!
In order to draw valid conclusions, you need a sample (no matter the size) that represents the population well!

Getting a Representative Sample…
• Making sure that, on average, the sample looks like the rest of the population allows us to draw conclusions based on our data.
A small sample, IF it is chosen correctly, can represent the entire population! In fact, this is the basis for almost all of statistics!

Parameters vs. Statistics
A Population Parameter is a value that describes the entire population.
– This value is rarely known and typically unknowable due to constant change and the difficulty in surveying an entire population.
– Our goal is usually to estimate the parameter.
A Sample Statistic is a value that is found from the sample data.
– We use the sample statistic to estimate the population parameter.
Sample Statistic → Population Parameter

Now you try…
This summer, when I went to the grocery store, I kept track of my receipts to help with budgeting. I wrote down how much I spent at Hy-Vee during June and July. On average, I spent $75.98.
• What is the population parameter I'm trying to estimate?
• If Hy-Vee took a sample of customers and checked their receipts, what population parameter is Hy-Vee trying to estimate?

So how do we gather Sample Statistics?
• Picking a sample at random protects us from the influences of all the features of our population, even ones that we may not have thought of.
• Statistical sampling uses random chance, not human choice!!

Jelly Blubbers!
Materials needed…
- Jelly Blubber colony
- Ruler
- Calculator (TAKE IT OUT!)
- Data Sheet

Jelly Blubbers…
What is our population of interest? What is the Population Parameter?

Judgmental Sample
Select 5 Jelly Blubbers that, in your judgment, are representative of the population of Jelly Blubbers.
On your data sheet, record the lengths of your five Jelly Blubbers in millimeters and then find the average. (The average of a sample is x̄, the average of a population is μ.)

Simple Random Sample (SRS)
A Simple Random Sample (SRS) has two requirements…
– every ‘person’ has an equal chance of being selected, and
– every combination of ‘people’ has an equal chance of being selected.
The Hat Method!

How can I select an SRS?
1. Random Number Table
Assign a number to each person or object to be sampled, then use a random number table to select a certain number of them. Use TABLE D in the back of your book!
A row in Table D might look like this:
05007 16605 81194 14873 04197 85576 ……..
How could you use this to randomly select 5 Jelly Blubbers?

How can I select an SRS?
2. Random Numbers on the Calculator
Assign a number to each person or object to be sampled, then use a random number generator on the calculator to select a number of them.
randInt(min, max, # to select)

Simple Random Sample
Use your calculator to randomly choose 5 Jelly Blubbers and measure their lengths in mm. Find the average length (x̄) and record it.
• Advantages?
• Disadvantages?

Systematic Sample
• A systematic sample involves selecting every nth object.
• This is useful when you believe that the order of the list will not affect the results of your survey.
• To get a systematic sample:
1) Randomly determine a starting place in the list.
2) From your starting place, sample every nth object on the list.
Example: I randomly choose the 17th item in the list, and then will choose every 10th item after that… #27, #37, #47, etc…

Systematic Random Sample
Since we have 100 Jelly Blubbers and we want a sample of 5, we need to count every _____th Jelly Blubber. Use your calculator to randomly choose a starting place. Record their measurements and calculate the sample average (x̄).
• Advantages?
• Disadvantages?

Cluster Sample
A cluster sample involves splitting the population into subgroups (called clusters).
This is useful when you think all subgroups are pretty similar and each group will adequately represent the population.
To get a cluster sample:
1. Split your population into heterogeneous groups, called clusters.
2. Use an SRS to determine which cluster(s) to sample. Then, sample everyone in those clusters.
ALL from SOME!

Example of Cluster Sampling
Suppose that I want to find out what proportion of ACHS seniors plan on leaving Iowa after graduation.
– What would be wrong with just sampling the seniors in the AP Statistics classes?
– What existing structure in our school could be used as clusters?

Cluster Random Sample
Using your calculator, pick a random number between 1 and 20, then multiply that by 5. Your sample will be that Jelly Blubber and the four Blubbers preceding it. Calculate and record your sample average (x̄).
• Advantages?
• Disadvantages?

Stratified Random Sample
A stratified sample is a bit more complicated than the others. It involves first splitting the population into subgroups that differ from each other in one way. This is useful when you think a certain characteristic (age, gender, address, etc.) may be an influence on the parameter you are trying to estimate.
To get a stratified sample:
1. Split your population into homogeneous groups, called strata.
2. Within each stratum, use an SRS to determine who is sampled.
3. Combine the samples from each stratum into one overall sample.
SOME from ALL!

Example of Stratified Sampling
I wonder what percentage of ACHS students are in favor of the new proposed rules at school dances. Is it possible that certain segments of the school population might feel differently about this issue? If so… better stratify!

Stratified Random Sample
The Jelly Blubbers have already been separated into 5 different strata. Using your calculator, pick a single random Jelly Blubber to measure from each stratum. Calculate and record your sample average.
• Advantages?
• Disadvantages?

Now… Analyze the Outcomes!
• Did we all get the same results each time?
• Does each graph look alike?
• Which one does the best job of predicting Jelly Blubber length? Why?

Identify the Sampling Method Used
a) We want to know what percentage of local doctors accept Medicare patients. We call the offices of 50 doctors randomly selected from the local Yellow Pages.
b) We want to know what percentage of Iowa shopping mall businesses anticipate hiring additional employees in the upcoming month. We randomly select 3 shopping malls from a list of Iowa malls and then survey every business in those malls.
c) We want to know if students at our school are satisfied with the food available at ACHS. We go to the cafeteria and interview every 10th person in line.
d) We want to know the average gas mileage for cars. We randomly select 20 Toyotas, 15 Hondas, 15 Fords, and 12 Chevrolets.

Two More ways to sample…
• Voluntary Response Sampling
• Convenience Sampling

Watch out for Bias
Bias means that something about the sample’s design has systematically distorted the result so that the sample would consistently under (or over) estimate the value you are trying to measure (the population parameter).
There is usually no way to fix a biased sample and no way to salvage useful information from it.

AP EXAM TIP
When identifying bias in a sample, be sure to also state the direction of the bias. Does the bias tend to over or underestimate the parameter being investigated? Explain why this direction makes sense for the situation.

Problems to watch for…
Sometimes the sampling frame (the list from which we sample) is difficult to obtain or even to define. This creates a problem because the people who are left out of the list may differ from the people on the list.
Remember President Landon? Me neither.
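
A bad sampling frame, simulated
The sampling frame problem can be seen in a small simulation. Everything here is made up for illustration: the population size, the 40% who are "on the list", and the support rates of 60% and 35% are assumptions, not data from any real poll. The sketch compares a huge sample drawn only from the list with a small SRS drawn from everyone.

```python
import random

random.seed(1936)  # arbitrary seed so the run is repeatable

# Hypothetical population of 100,000 voters (all rates below are assumptions).
# 40% are on the list (the sampling frame) and support candidate A at a 60% rate;
# the 60% left off the list support candidate A at only a 35% rate.
population = []
for _ in range(100_000):
    on_list = random.random() < 0.40
    p_support = 0.60 if on_list else 0.35
    vote = 1 if random.random() < p_support else 0  # 1 = supports A
    population.append((on_list, vote))

# Population parameter: the true proportion supporting candidate A
true_support = sum(vote for _, vote in population) / len(population)

# Huge sample, but drawn only from people on the list
frame = [vote for on_list, vote in population if on_list]
big_biased_sample = random.sample(frame, 20_000)
biased_estimate = sum(big_biased_sample) / len(big_biased_sample)

# Small simple random sample drawn from the whole population
small_srs = random.sample([vote for _, vote in population], 500)
srs_estimate = sum(small_srs) / len(small_srs)

print(f"true support:          {true_support:.3f}")
print(f"biased, n = 20,000:    {biased_estimate:.3f}")
print(f"SRS, n = 500:          {srs_estimate:.3f}")
```

With these assumptions the list-only sample should land near 0.60 while the truth is near 0.45, so the bias is an overestimate, and the extra 19,500 respondents do nothing to fix it. The SRS of 500 should land much closer to the truth.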
Problems to watch for…
Many samples suffer from a bias called undercoverage, in which some portion of the population is not sampled at all or has a smaller representation in the sample than it has in the actual population.

Problems to watch for…
A major issue in sampling is nonresponse bias, where someone who is chosen for the sample cannot be contacted or refuses to cooperate. The problem is that those who don’t respond may differ from those who did respond.

Non-Response Bias
A Survey about Surveys!

Problems to watch for…
Another major issue for surveys is known as response bias (not to be confused with non-response!). Response bias refers to anything in the survey that influences the responses, such as wanting to please the interviewer, not wanting to answer personal or legal questions, the wording of the questions, etc.

Response Bias!

Problems to watch for…
Watch out for the wording of the question in a survey, as it can also influence the responses. Asking a question with a leading statement is a good way to bias the responses, which you don’t want!

Response Bias!

How to combat bias…
Look for bias in any survey you encounter.
- If you are developing your own survey, critique your survey before gathering data.
- Spend your resources and time trying to reduce bias.
- Pretest your survey so that you can make changes before it is too late.
- Report your sampling method in detail!

AP test problem 1997 #27
For the other options, determine the sampling method. Can you detect any possible bias in these different sampling methods?
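
Recap: the four random sampling methods in code
As a review, here is one way the four random sampling methods from the Jelly Blubber activity might be sketched. The lengths are randomly generated stand-ins, not the real measurements, and the sketch assumes the five strata are consecutive blocks of 20 ID numbers, which may not match the actual data sheet. All function names are invented for this example.

```python
import random

random.seed(4)  # arbitrary seed so the run is repeatable

N = 100                               # Jelly Blubbers numbered 1 to 100
ids = list(range(1, N + 1))
# Hypothetical lengths in mm (stand-ins for the real colony)
lengths = {i: random.randint(5, 45) for i in ids}

def srs(n=5):
    """Simple random sample: every group of n IDs is equally likely."""
    return random.sample(ids, n)

def systematic(n=5):
    """Random starting place, then every (N // n)th Blubber after it."""
    step = N // n                     # population size divided by sample size
    start = random.randint(1, step)   # randint includes both endpoints
    return [start + k * step for k in range(n)]

def cluster(size=5):
    """Pick one block at random and take ALL of it (ALL from SOME)."""
    last = random.randint(1, N // size) * size   # random 1..20, times 5
    return list(range(last - size + 1, last + 1))  # that one and the four before it

def stratified(n_strata=5):
    """Pick one Blubber at random from EACH stratum (SOME from ALL).

    Assumption: strata are consecutive blocks of N // n_strata ID numbers.
    """
    width = N // n_strata
    return [random.randint(s * width + 1, (s + 1) * width) for s in range(n_strata)]

def mean_length(sample_ids):
    """Sample mean (x-bar) of the lengths for the chosen IDs."""
    return sum(lengths[i] for i in sample_ids) / len(sample_ids)

for method in (srs, systematic, cluster, stratified):
    chosen = method()
    print(method.__name__, chosen, mean_length(chosen))
```

Running the loop several times without the seed line gives a different x̄ each time for every method, which is the sampling variability the "Analyze the Outcomes" questions ask about.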