STA 200 Spring 2011
CHAPTER 2

Objective
We want to be able to extrapolate results from a sample to the population at large. In order to do this (and reach meaningful conclusions), the sample should be representative of the population.

Bad Sampling
Convenience Sampling: Select the individuals who are the easiest to reach.
Voluntary Response Sampling: The sample selects itself via response to a general appeal (call-in polls, write-in polls).

Example (Convenience)
Suppose you want to find out if UK faculty members think there should be more math and statistics as part of the USP requirements. To obtain the sample, suppose you visit faculty members on the 7th, 8th, and 9th floors of the Patterson Office Tower (where the math and statistics departments are located). What’s wrong with this?

Example (Voluntary Response)
Consider a write-in poll concerning a maximum salary for athletes/actors. Some people are going to be more motivated than others to participate in the poll. What kind of opinion might they have?

Bias
When using a bad sampling method, you get biased results. (With regard to percentages, this means you’ll get a percentage that is systematically higher or lower than it should be.) Bias occurs when certain outcomes are statistically favored because the population is incorrectly represented by the sample.

Good Sampling
Simple Random Sample (SRS): Consists of n individuals chosen in such a way that every set of n individuals has the same chance of being selected.
Choosing a sample randomly significantly reduces bias. In other words, the sample will reflect the population much better.

Choosing an SRS
Nowadays, an SRS is usually chosen using a computer. However, we can also use a table of random digits (like the one in the back of the textbook). The process:
1. Assign a numerical label to each individual in the population.
2. Make sure all of the labels are the same length.
3. Use software or a table of random digits to select labels.
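The three steps above can be sketched in software. The following is a minimal illustration, not a prescribed method: the function name choose_srs, the choice to start labels at 1, and the seed parameter are all assumptions made for this example. It relies on Python's standard random.sample, which draws without replacement so that every set of n labels is equally likely.

```python
import random


def choose_srs(population_size, n, seed=None):
    """Pick an SRS of n labels from a population of the given size.

    Hypothetical helper for illustration; labels are assumed to run
    from 1 to population_size.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible

    # Steps 1 and 2: give every individual a label, zero-padded so
    # that all labels have the same length (e.g. 001, 002, ..., 500).
    width = len(str(population_size))
    labels = [str(i).zfill(width) for i in range(1, population_size + 1)]

    # Step 3: let the software select n distinct labels at random.
    return rng.sample(labels, n)


# Example: five labels out of a population of 500.
print(choose_srs(500, 5, seed=2011))
```

Passing a seed is only a convenience for repeating the same draw later; leaving it as None gives a fresh sample on each run.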
Example (Using a Table of Random Digits)
A food distributor wants to know if the boxes of cereal in a particular shipment contain the correct amount of cereal. The distributor intends to randomly select five boxes out of a shipment of 500 and weigh them. What labels should we use?

Example (cont.)
Use the following line from the table to pick the SRS:
19223 95034 05756 28713 96409 12531 …
Now, use another line to pick the SRS:
05007 16632 81194 14873 04197 85576 …

Trusting a Sample
If an SRS (or more complicated good sampling method) is used, the sample should be quite representative of the population. If a poor sampling method is used, this will not be the case. Thus, if we try to extrapolate results from a poorly obtained sample to the entire population, the conclusions we reach will be rubbish.
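Returning to the cereal-box example, the table-of-random-digits procedure can also be traced in code. This is a sketch under stated assumptions: the boxes are assumed to be labeled 001 through 500 (000 through 499 would work equally well), the function name srs_from_digits is hypothetical, and only the digits printed in the two table lines above are used. The digits are read left to right in groups of three; groups that are not valid labels, or that repeat an earlier choice, are skipped.

```python
def srs_from_digits(digit_line, population_size, n):
    """Read labels off a line of random digits (illustrative helper).

    Assumes labels run from 1 to population_size, zero-padded to a
    common length.
    """
    # Ignore the spaces that split the table into blocks of five.
    digits = "".join(ch for ch in digit_line if ch.isdigit())
    width = len(str(population_size))  # 3 digits for 500 boxes

    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = digits[start:start + width]
        # Keep a group only if it is a valid label not already chosen.
        if 1 <= int(label) <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                break
    # If the line runs out of digits first, fewer than n labels are
    # returned; in practice you would continue on the next table line.
    return chosen


first_line = "19223 95034 05756 28713 96409 12531"
second_line = "05007 16632 81194 14873 04197 85576"

print(srs_from_digits(first_line, 500, 5))   # ['192', '239', '405', '287', '139']
print(srs_from_digits(second_line, 500, 5))  # ['050', '071', '281', '194', '148']
```

In the first line the groups 503 and 756 are skipped because no box carries those labels; in the second line 663 is skipped for the same reason. The two lines yield different samples, which is expected: each line of the table gives a different random draw.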