Designing Samples
Student Notes

I Collecting Data:
   sample survey
   experiment
   observational study

Population: The entire group of individuals that you want information about.
Sample: A part of the population that we actually examine in order to gather information.
Census: An attempt to contact every individual in the entire population.

II Sampling
   Voluntary Response Sampling
   Convenience Sampling
   Simple Random Sample
   Stratified Random Sample
   Cluster Sampling
   Probability Sample

Sampling involves studying a part in order to gain information about the whole.

Bias: The design of a study is biased if it systematically favors certain outcomes.
Example: A poll where a student surveys only his friends is clearly biased. Why?

Convenience Sampling: Choose the individuals easiest to reach. Mall surveys and cafeteria surveys are two examples of convenience sampling. What is wrong with that?

Voluntary Response Sample: A voluntary response sample consists of people who choose themselves by responding to a general appeal. Voluntary response samples are biased because people with strong opinions, especially negative opinions, are most likely to respond. For example, suppose pre-election coverage on TV shows a number that viewers can call to "vote" for or against the candidate being interviewed. The people watching choose whether or not to take part in the survey, so this is a voluntary response sample.

Before doing the numbers, we should point out that the quality of the sample is more important than its size. The selection process itself is crucial. For example, a voter survey that systematically excluded females would be worthless, and there are a host of other ways to ruin or bias a sample.

III Random Samples
   Simple Random Sample
   Probability Sample
   Stratified Random Sample
   Cluster Sampling
   Multistage Sampling

1. Simple Random Sample (SRS): An SRS of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected.

Suppose we have 20 students and I want to choose an SRS of 4 students. If I randomly select 2 of the 10 girls and 2 of the 10 boys, is that an SRS?

A simple random sample has two properties that make it the standard against which we measure all other methods:
   1. Unbiased: Each unit has the same chance of being chosen.
   2. Independence: Selection of one unit has no influence on the selection of the other units.

Steps to a Simple Random Sample
   1. Start with a random number table. There is one in your text: Table B, inside the back cover. A piece of it is reproduced below.
   2. Number every member of your population.
   3. For a sample of size n, use the random number table to choose the first n numbers from your population.

For example, if we were to take a random sample of three students from the room:
   1. Number all of the students in the room.
   2. Starting with Line 101, find the numbers of the first three students.

Line
101   19223 95034 05756 28713 96409 12531 42544 82853
102   73676 47150 99400 01927 27754 42648 82425 36290
103   45467 71709 77558 00095 32863 29485 82226 90056
104   52711 38889 93074 60227 40011 85848 48767 52573
105   95592 94007 69971 91481 60779 53791 17297 59335

Unfortunately, in the real world, completely unbiased, independent samples are hard to find. For instance, surveying voters by randomly dialing telephone numbers is biased: it ignores voters without a telephone and oversamples people with more than one number!

2. Stratified Random Sample: To select a stratified random sample, first divide the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the full sample.

3. Probability Sample: A probability sample is any sample chosen by chance.
We must know what samples are possible and what chance, or probability, each possible sample has. Not all probability samples give each member of the population the same chance of being selected. Does a Stratified Random Sample assure us that each member of the population has the same chance of being selected? Why or why not?

4. Multistage Sampling: Selects successively smaller groups within the population in stages, resulting in a sample consisting of clusters of individuals. Each stage may employ an SRS, a stratified sample, or another type of sample.

A national multistage sample proceeds somewhat as follows:
   1. Take a sample from the 3,000 counties in the United States.
   2. Select a sample of townships within each of the counties chosen.
   3. Select a sample of blocks within each chosen township.
   4. Take a sample of households within each chosen block.

Example:
Population: Residents of California.
   From the collection of counties, randomly pick 20 counties.
   Then, from each selected county, randomly pick 10 cities/towns.
   Then, randomly pick 100 individuals from each of the selected cities/towns.
Note: This design does not yield an SRS of California residents. Why not?

Cautions about sample surveys
Undercoverage: Do we have an accurate and complete list of the population of interest?
Nonresponse bias: Has everyone we surveyed responded?
Response bias:
   Lying or embarrassment: e.g., suppose we ask whether a person has ever shoplifted.
   Interviewer bias: the interviewer's gender or race could alter responses.
Wording of the question: Confusing or leading questions. An example follows:
   Version 1: Do you think there should be an amendment to the Constitution prohibiting abortions?
   Version 2: Do you think there should be an amendment to the Constitution protecting the life of the unborn child?

Last Word of Warning: Without a randomized design, there can be no dependable statistical analysis, no matter how the analysis is modified.
The beauty of random sampling is that it removes bias from the selection and lets us state, statistically, how accurate the results are likely to be.
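
The random number table steps in section III can also be sketched in code. The following is only a minimal illustration in Python, not part of the original notes. It assumes a hypothetical class of 30 students labelled 01 to 30 (the notes do not give a class size), and the function name srs_from_digits is made up for this sketch. It reads Line 101 of the table two digits at a time, skips labels that are out of range or already chosen, and stops once three students have been picked.

```python
# Line 101 of the random number table reproduced in these notes.
LINE_101 = "19223 95034 05756 28713 96409 12531 42544 82853"


def srs_from_digits(digits, population_size, n):
    """Pick n distinct labels in 1..population_size by reading a digit string.

    Hypothetical helper for illustration only.
    """
    stream = digits.replace(" ", "")
    width = len(str(population_size))  # digits per label, e.g. 2 for 30 students
    chosen = []
    for start in range(0, len(stream) - width + 1, width):
        label = int(stream[start:start + width])
        # Skip labels outside the population and labels already chosen.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            return chosen
    raise ValueError("ran out of digits; continue on the next line of the table")


if __name__ == "__main__":
    # Assumption: a class of 30 students, labelled 01 to 30.
    print(srs_from_digits(LINE_101, population_size=30, n=3))  # [19, 22, 5]
```

Under that assumption the sample is students 19, 22, and 05. The pairs 39, 50, and 34 are read along the way but skipped, because no student carries those labels.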