Section 5.1 Designing Samples
Malboeuf 2009
AP Statistics, Section 5.1, Part 1

Observational Study vs. Experiment
An observational study observes individuals and measures variables of interest but does not attempt to influence the responses. An experiment, on the other hand, deliberately imposes some treatment on individuals in order to observe their responses.

Population and Sample
The entire group of individuals that we want information about is called the population. A sample is a part of the population that we actually examine in order to gather information.

Sampling vs. a Census
Sampling involves studying a part in order to gain information about the whole. A census attempts to contact every individual in the entire population.

How to capture a “Sample”
Getting a portion of the population is not difficult. Getting a good sample is difficult. The plan for choosing the sample is called the “sample design”.

How not to sample
Voluntary response sample (example: call-in opinion polls)
Read “Call-in opinion polls” (p. 272).
The problem with call-in opinion polls is that the people who answer them tend to have strong opinions, especially strong negative opinions. This sample is biased; it is not representative of the population.

How not to sample, cont.
Convenience sample (example: mall intercept interviews)
Read “Interviewing at the mall” (p. 272).
Convenience sampling may not give you access to all the people in the population. Interviewers often avoid people who may make them feel uncomfortable. This sample is biased; it is not representative of the population.

Bias
The design of a study is biased if it systematically favors certain outcomes.
Both voluntary response samples and convenience samples choose a sample that is almost guaranteed not to represent the entire population. When choosing your sample, it is very important to try to avoid bias.

Two additional types of sampling bias are:
* Non-response bias: when an individual chosen for the sample does not participate. For example, the individual does not return a mailed survey.
* Under-coverage bias: when some groups of the population are left out of the process of choosing the sample. For example, it may be impossible to get a list of all the adults in the USA who are on a specific type of medication.

Data can also be biased by factors that are not related to the method by which the sample was chosen. Common sources of non-sampling bias include:
* The wording of a question. “Almost two thirds of the people in the USA would like to see English as the only language used in official documents. Are you in favor of this?”

How to sample
The best way to sample is to use a “simple random sample”.
A simple random sample (SRS) of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance of being the sample actually selected.

How to create an SRS
Choose an SRS in two steps:
Step 1: Label. Assign a numerical label to every individual in the population.
Step 2: Random selection. Select labels at random using one of the following:
* Random number table (Table B)
* Random number generator (randInt on the TI-83)

Stratified Random Sample
To select a stratified random sample, first divide the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the full sample.
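
The two-step SRS recipe and the stratified design above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function names, the student roster, and the stratum sizes are all hypothetical, and Python's random module stands in for Table B or the calculator.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Choose an SRS of size n using the label-then-select recipe."""
    rng = random.Random(seed)
    # Step 1: Label. Give every individual a numerical label 1..N.
    labeled = {label: person for label, person in enumerate(population, start=1)}
    # Step 2: Select n distinct labels at random, so that every set of
    # n labels has the same chance of being chosen.
    chosen_labels = rng.sample(sorted(labeled), n)
    return [labeled[label] for label in chosen_labels]


def stratified_random_sample(strata, sizes, seed=None):
    """Choose a separate SRS inside each stratum and combine them."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        # sizes[name] is the (hypothetical) number to draw from this stratum.
        sample.extend(rng.sample(members, sizes[name]))
    return sample


# Hypothetical roster: 30 students, split into two strata by grade.
students = [f"student_{i}" for i in range(1, 31)]
strata = {"juniors": students[:18], "seniors": students[18:]}

srs = simple_random_sample(students, 5, seed=1)
stratified = stratified_random_sample(strata, {"juniors": 3, "seniors": 2}, seed=1)
print(srs)
print(stratified)
```

Drawing without replacement (random.sample) matters here: it guarantees no individual is chosen twice, which repeated calls to a random integer generator would not.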
Multistage Sampling Design
Randomly choose stage 1 groups (for example, states).
Randomly choose stage 2 groups (for example, cities within the chosen states),
and so on, until you get down to the sample size.

“Random” is the key
Good sampling technique uses random selection to reduce the possibility of bias.

Cautions about sample surveys
* Undercoverage occurs when some groups in the population are left out of the process of choosing the sample.
* Nonresponse occurs when an individual chosen for the sample can’t be contacted or does not cooperate.
* Response bias. Respondents may lie if they feel uncomfortable telling the truth.
* Wording of questions. “It is estimated that disposable diapers account for less than 2% of the trash in today’s landfills. In contrast, beverage containers, third-class mail and yard wastes are estimated to account for about 21% of the trash in landfills. Given this, in your opinion, would it be fair to ban disposable diapers?”

Why Sample?
We want to make inferences about the population as a whole.
We can’t afford to talk to everyone.
Even though two samples that follow the same design will most likely give different results, both results are reasonable estimates for the population as a whole.

How to get the best estimates?
Larger random samples give more precise results than smaller samples.

Assignment
Exercises: 5.1-9 all, 11-15 odd
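
As an aside on the claim that larger random samples give more precise results, here is a small simulation sketch. It is not from the original slides. It assumes a very large population in which 60% of individuals would answer “yes” (a made-up figure), so each sampled response can be treated as an independent draw. The function name and the sample sizes 25 and 400 are hypothetical choices for illustration.

```python
import random
import statistics


def spread_of_sample_proportions(true_p, n, trials=1000, seed=0):
    """Draw many random samples of size n and measure how much the
    sample proportion of "yes" answers varies from sample to sample."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(trials):
        # Each response is "yes" with probability true_p (assumed value).
        yes_count = sum(1 for _ in range(n) if rng.random() < true_p)
        proportions.append(yes_count / n)
    # Standard deviation of the sample proportions: smaller means more precise.
    return statistics.pstdev(proportions)


small = spread_of_sample_proportions(0.6, 25)
large = spread_of_sample_proportions(0.6, 400)
print("spread with n = 25: ", round(small, 3))
print("spread with n = 400:", round(large, 3))
```

Both sample sizes give estimates centered near the true proportion; the difference is that the results from the larger samples cluster much more tightly around it.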