Chapter 8 – Producing Data: Sampling

Observational Study: observes individuals and measures variables of interest but does not attempt to influence the responses.
Purpose: Describe some group or situation.

Experiment: deliberately imposes some treatment on individuals in order to observe their responses.
Purpose: Study whether the treatment causes a change in the response.

“You don’t have to eat the whole ox to know that the meat is tough.”

The major use of inferential statistics is to use information from a sample to infer something about a population.

The idea of sampling:
We want to say something about a Population: the entire group of individuals about which we want information.
To get at this we take a Sample: the part of the population from which we actually collect information.

Why take a sample?

Ex. Joe D. Politician is running for President. He calls you on the phone and asks you to find out what percentage of the registered voters in the country will vote for him. There are a few things you could try.

Option I: Call all registered voters on the phone and ask them whom they will vote for.
- Provides a very accurate result
- Very tedious and time-consuming project

Option II: Call 4 registered voters, 1 in each time zone, and ask them whom they will vote for.
- Very easy task
- Results would not be very reliable

To use a sample to make inferences about a population, the sample should be representative of the population. How likely is it that these 4 registered voters would represent the population of all registered voters? Not very!

Option III: Somewhere between Options I and II: randomly select 2000 registered voters and poll them.
- Easier than Option I
- More reliable than Option II

Since we draw conclusions about a population based on the information obtained from a sample, it is important that the sample be representative of the population.

Sampling design, the method chosen to select the sample from the overall population, has important consequences. Poor sampling design can yield misleading conclusions.

A sampling method is biased if it systematically favors certain outcomes.

Some commonly used but bad sampling designs:

1. Voluntary Response Sample: A voluntary response sample consists of people who choose themselves by responding to a general appeal. Be aware: such samples often overrepresent people with strong opinions, most often negative opinions.

Example: A magazine for health foods and organic healing wants to establish that large doses of vitamins will improve health. The editors ask readers who have regularly taken vitamins in large doses to write in, describing their experiences. Of the 2754 readers who reply, 93% report some benefit from taking vitamins.

2. Convenience Sampling: Grab individuals or a group that is handy (easy to reach) and take measurements.

Examples:
   1. To inspect a delivery of oranges, you pick 10 oranges from the top of the crate.
   2. Surveying shopping habits at malls.

Problem with the above two sampling methods: the sample is determined by who chooses to respond or who is easiest to reach, so it is unlikely to be representative of the population.

Enter statisticians…

In order to minimize the possibility of bias, such as favoritism (by a sampler) and self-selection (by respondents), statisticians use chance to select samples. The idea is to avoid bias by letting impersonal chance, rather than the sampler or the respondents, decide who is in the sample.

One such example is a simple random sample (SRS): a sample of size n chosen in such a way that every set of n individuals in the population has an equal chance of being the sample actually selected. An SRS is like placing names (the population) in a hat and drawing out a handful (the sample).

Other Sampling Designs

A general framework for statistical sampling is a probability sample. A probability sample is a sample chosen by chance, for which we know what samples are possible and what probability each possible sample has. An SRS is one kind of probability sample. The use of chance to select the sample is the essential principle of statistical inference.

Sampling designs for sampling from large populations spread out over a wide area are usually more complex than an SRS. It is common to sample important groups within the population separately, then combine these samples.

Example: A population of election districts might be divided into urban, suburban, and rural strata.

Stratified Random Sample: To select a stratified random sample, first classify the population into groups of similar individuals, called strata.
Then choose a separate SRS in each stratum and combine these SRSs to form the full sample. A stratified design can produce more precise information than an SRS of the same size by taking advantage of the fact that individuals in the same stratum are similar to one another.

Cautions About Sample Surveys

Undercoverage:
- occurs when some groups in the population are left out of the process of choosing the sample.
- Example: an opinion poll conducted by telephone will miss the 5% of American households without residential phones.

Nonresponse:
- occurs when an individual chosen for the sample cannot be contacted or refuses to participate.
- Even with careful planning and several callbacks, nonresponse to sample surveys often reaches 50% or more.
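
The two chance-based designs discussed in this chapter, the SRS and the stratified random sample, can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original notes: the function names, the list of 300 election districts, and the per-stratum sample sizes are all made up for the example. It uses the standard library's `random.sample`, which draws without replacement so that every subset of the requested size is equally likely.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, seed=None):
    """Draw an SRS: every set of n individuals is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


def stratified_random_sample(population, stratum_of, n_per_stratum, seed=None):
    """Classify individuals into strata, take a separate SRS in each, and combine."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for individual in population:
        strata[stratum_of(individual)].append(individual)
    sample = []
    for name in sorted(strata):
        # Separate SRS within this stratum (sizes are supplied by the caller).
        sample.extend(rng.sample(strata[name], n_per_stratum[name]))
    return sample


# Hypothetical population: 300 election districts labeled by type.
kinds = ["urban"] * 100 + ["suburban"] * 120 + ["rural"] * 80
districts = [(f"district-{i}", kind) for i, kind in enumerate(kinds)]

# SRS of 10 districts, ignoring district type.
srs = simple_random_sample(districts, 10, seed=1)

# Stratified sample: assumed sizes of 5 urban, 6 suburban, and 4 rural districts.
sizes = {"urban": 5, "suburban": 6, "rural": 4}
strat = stratified_random_sample(districts, lambda d: d[1], sizes, seed=1)
```

The SRS may by chance contain few or no rural districts, whereas the stratified sample always contains the requested number from each stratum. That guarantee is one reason a stratified design can be more precise than an SRS of the same size.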