MAT 1000 Mathematics in Today's World

Last Time
1. Two types of observational study
2. Three methods for choosing a sample

Population: the collection of all individuals being studied
Census: an observational study that observes the entire population
Sample survey: an observational study that only observes some of the population
Sample: the group of individuals chosen in a sample survey

Methods of choosing a sample
Voluntary response sample: the individuals in the sample volunteer to participate
Convenience sample: the individuals in the sample are chosen because they are the easiest to reach
Simple random sample (SRS): every group of a given size in the population has the same chance of being the sample

Today
1. What does a sample tell us about the population?
2. Practical problems in sample surveys

What do samples tell us?

Example
Suppose we are studying the ages of Wayne State students.
Structure of the data:
Individuals: WSU students
Variable: Age
Population: all WSU students

We could try to determine the age of every individual in the population, but this is time-consuming and expensive. Better to use a sample.

We hope the mix of ages of students in our sample gives a good reflection of the mix of ages of all WSU students. With a good sampling method (like SRS), we can be reasonably sure this is the case.

What happens after we collect data on our sample?

That data will be a big list of numbers: one age for each student in the sample. A big list of numbers is not very informative. Better to try to summarize.
One way to summarize a list of numbers is to take the average (mean).
If our sample is good, the average age of the students in our sample ought to be close to the average age of all WSU students.

The goal in a sample survey is to use information about a sample to “infer” information about the entire population.
The word “infer” means “derive” or “estimate.” Here “information” refers to some kind of number (“average age” in the example).
So we can restate the goal of a sample survey slightly:

The goal in a sample survey is to use numbers that describe a sample to estimate the value of a number that describes the entire population.
Numbers that describe a sample are called statistics.
Numbers that describe a population are called parameters.
In the example, the parameter is the average age of all WSU students, and the statistic is the average age of the students in our sample.

One common type of sample survey is the opinion poll.

Example
On November 4th there will be an election for the Governor of Michigan. A recent poll of 750 Michigan voters found that 45% would vote for Rick Snyder (R) and 42% for Mark Schauer (D).

Population: all Michigan voters
Variable: which candidate a person will vote for
Unlike “age,” the variable here is not numeric, so we can’t summarize the survey results with an average. Instead we use a proportion (percent).
Statistic: proportion of the sample who support Rick Snyder
Parameter: proportion of the population who support Rick Snyder (unknown)

Goal: make a good estimate of the parameter
The sampling method we use matters.
Convenience sample: ask WSU professors who they plan to vote for
Voluntary response sample: Rush Limbaugh asks listeners to call in
Will these samples give a good representation of the population of all Michigan voters?

We say that convenience sampling and voluntary response sampling are biased sampling methods. The samples we get using these methods are probably not representative of the population.
Bias: in repeated samples, the sample statistic consistently misses the population parameter in the same direction
To avoid bias, use a simple random sample.
If we use a simple random sample of 750 Michigan voters, every group of 750 people has the same chance of being chosen.
Could a simple random sample turn out, just by chance, to consist of 750 WSU professors? Sure.
We can never know for sure that our sample represents the whole population. If we use an SRS, the best we can say is that we are confident that our sample represents the whole population.

A good way to understand this: think of taking an SRS many, many times. Each time we get a different sample.
In each sample the proportion of people who plan on voting for Rick Snyder will be slightly different, so we get different statistics.
But with an SRS, all of these different statistics should be close to each other and to the parameter.

For any sampling method we use, there will be some variability in the statistics we get.
Variability: different samples from the same population may yield different values of the sample statistic
Variability and bias are two different ways a sample can fail to represent the population.

Bias and Variability
Consider shooting arrows (statistics) at a target (the parameter):
Bias means the archer systematically misses in the same direction.
Variability means that the arrows are scattered.

To reduce bias, use random sampling.
To reduce variability, use larger samples.

In a sample survey, the discrepancy between the statistic and the parameter is called “error” (this does not mean a mistake).
What are some sources of error?
• Bias
• Variability
These are called “sampling errors.”

There are other types of errors in a sample survey, called “nonsampling errors.”
Processing errors: mistakes in data entry or calculations
Response errors: individuals answer incorrectly (they lie or misremember)
Even the wording of questions can be a source of error!
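
Illustration: bias and variability by simulation
The ideas of bias and variability above can be seen in a small simulation. This is only a sketch under made-up assumptions: the population of 100,000 voters, the 45% level of support, and the "easy-to-reach" subgroup with 20% support are all hypothetical and are not taken from the poll. In a real survey the parameter is unknown; we fix it here so we can see how far the statistics land from it.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = supports the candidate, 0 = does not.
# Assumption: a small easy-to-reach subgroup has much lower support.
easy_group = [1] * 2_000 + [0] * 8_000        # 20% support
everyone_else = [1] * 43_000 + [0] * 47_000   # about 48% support
population = easy_group + everyone_else

# The parameter: proportion of the whole population who support the candidate.
parameter = sum(population) / len(population)  # 45,000 / 100,000

def sample_proportion(group, n):
    """Draw n individuals at random from group and return the statistic."""
    sample = random.sample(group, n)  # each set of n is equally likely
    return sum(sample) / n

def repeated_statistics(group, n, repeats=1000):
    """Repeat the sampling many times and collect the statistics."""
    return [sample_proportion(group, n) for _ in range(repeats)]

srs_small = repeated_statistics(population, 100)    # SRS, small sample
srs_large = repeated_statistics(population, 750)    # SRS, larger sample
convenience = repeated_statistics(easy_group, 750)  # only the easy-to-reach group

print(f"parameter: {parameter:.3f}")
for label, stats in [("SRS, n=100", srs_small),
                     ("SRS, n=750", srs_large),
                     ("convenience, n=750", convenience)]:
    center = statistics.mean(stats)   # where the statistics land on average
    spread = statistics.stdev(stats)  # how scattered they are
    print(f"{label}: average statistic {center:.3f}, spread {spread:.3f}")
```

In the archery picture, both SRS runs are centered on the target, and the larger sample groups the arrows more tightly. The convenience sample gives a tight group too, but it is centered away from the target. That is bias, and a larger sample does not fix it.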

Concerns when Asking Survey Questions
• Deliberate bias
• Unintentional bias
• Desire to please
• Asking the uninformed
• Unnecessary complexity
• Ordering of questions
• Confidentiality and anonymity

Deliberate Bias
• “If you found a wallet with $20 in it, would you return the money?”
• “If you found a wallet with $20 in it, would you do the right thing and return the money?”

Unintentional Bias
• “I have taught several students over the past few years.”
  o How many students do you think I have taught?
  o How many years am I referring to?
• “Over the past few days, how many servings of fruit have you eaten?”
  o How many days are you considering?
  o What constitutes a serving?

Desire to Please
• “Is your instructor doing a good job presenting the course material in a clear and interesting way?” Yes / No

Asking the Uninformed: Case Study
Washington Post National Weekly Edition (April 10-16, 1995, p. 36)
• A 1978 poll done in Cincinnati asked people whether they “favored or opposed repealing the 1975 Public Affairs Act.”
  o There was no such act!
  o About one third of those asked expressed an opinion about it.

Unnecessary Complexity
• “Do you sometimes find that you have arguments with your family members and coworkers?”
  o Arguments with family members
  o Arguments with coworkers
  o “sometimes find” (vague or unclear)

Ordering of Questions
• “How often do you normally go out on a date? About ___ times a month.”
• “How happy are you with life in general?”
  o Strong association between the answers to these questions.
  o If the ordering is reversed, there is no strong association between them.

Confidentiality and Anonymity
• Confidential answer
  o the respondent is known, but the information is kept secret
  o facilitates follow-up studies
• Anonymous answer
  o the respondent is not known, or cannot be linked to his/her response
  o usually yields more truthful answers