Chapter 2 Notes

STA 200 Spring 2011
CHAPTER 2
Objective
• We want to be able to extrapolate results from a sample to the population at large.
• In order to do this (and reach meaningful conclusions), the sample should be representative of the population.
Bad Sampling
• Convenience Sampling
  ◦ Select the individuals who are the easiest to reach
• Voluntary Response Sampling
  ◦ The sample selects itself via response to a general appeal (call-in polls, write-in polls)
Example (Convenience)
• Suppose you want to find out if UK faculty members think there should be more math and statistics as part of the USP requirements.
• To obtain the sample, suppose you visit faculty members on the 7th, 8th, and 9th floors of the Patterson Office Tower (where the math and statistics departments are located).
• What’s wrong with this?
Example (Voluntary Response)
• Consider a write-in poll concerning a maximum salary for athletes/actors.
• Some people are going to be more motivated than others to participate in the poll. What kind of opinion might they have?
Bias
• A bad sampling method gives biased results. (With regard to percentages, this means the sample percentage will tend to be systematically higher or lower than the true population percentage.)
• Bias occurs when certain outcomes are systematically favored because the sample does not represent the population correctly.
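To make the idea concrete, here is a minimal Python sketch using a made-up population. The group sizes and support rates below are assumptions invented purely for illustration; they are not data from the course. The sketch compares the percentage obtained from a convenience sample (one easy-to-reach group) with the percentage for the whole population.

```python
# Hypothetical population of 1,000 faculty members (all numbers are
# invented for illustration). 1 = wants more math/statistics, 0 = does not.
math_stat = [1] * 90 + [0] * 10        # 100 math/stat faculty, 90% in favor
other = [1] * 270 + [0] * 630          # 900 other faculty, 30% in favor
population = math_stat + other


def percent_in_favor(group):
    """Return the percentage of 1s in a list of 0/1 responses."""
    return 100 * sum(group) / len(group)


# Convenience sample: only the people who are easiest to reach.
convenience_pct = percent_in_favor(math_stat)

# Value for the whole (hypothetical) population.
population_pct = percent_in_favor(population)

print(f"Convenience sample: {convenience_pct:.0f}% in favor")  # 90%
print(f"Whole population:   {population_pct:.0f}% in favor")   # 36%
```

In this invented setting the convenience sample reports 90% when the population value is 36%, which is the "higher than you should" kind of bias described above.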
Good Sampling
• Simple Random Sample (SRS)
  ◦ Consists of n individuals chosen in such a way that every set of n individuals has the same chance of being selected
• Choosing the sample at random greatly reduces bias. In other words, the sample will tend to reflect the population much better.
Choosing an SRS
• Nowadays, an SRS is usually chosen using a computer. However, we can also use a table of random digits (like the one in the back of the textbook).
• The process:
  ◦ Assign a numerical label to each individual in the population. Make sure all of the labels are the same length.
  ◦ Use software or a table of random digits to select labels.
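The software route can be sketched in a few lines of Python. This is only one possible way to do it, using the standard library's `random.sample`; the function name `choose_srs`, the seed, and the example sizes (a population of 500, a sample of 5) are illustrative assumptions.

```python
import random


def choose_srs(population_size, n, seed=None):
    """Pick an SRS of n labels from a population labeled 1..population_size."""
    # Step 1: assign labels, zero-padded so they all have the same length.
    width = len(str(population_size))
    labels = [str(i).zfill(width) for i in range(1, population_size + 1)]

    # Step 2: let software select the labels. random.sample draws without
    # replacement, so no individual can be chosen twice.
    rng = random.Random(seed)   # the seed only makes the run repeatable
    return rng.sample(labels, n)


sample = choose_srs(population_size=500, n=5, seed=200)
print(sample)
```

Zero-padding the labels is not needed by the software itself; it is done here so that the labels match the ones you would read off a table of random digits.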
Example (Using a Table of Random Digits)
• A food distributor wants to know if the boxes of cereal in a particular shipment contain the correct amount of cereal. The distributor intends to randomly select five boxes out of a shipment of 500 and weigh them.
• What labels should we use?
Example (cont.)
• Use the following line from the table to pick the SRS:
  ◦ 19223 95034 05756 28713 96409 12531 …
• Now, use another line to pick the SRS:
  ◦ 05007 16632 81194 14873 04197 85576 …
Trusting a Sample
• If an SRS (or a more complicated good sampling method) is used, the sample should be quite representative of the population.
• If a poor sampling method is used, this will not be the case.
• Thus, if we try to extrapolate results from a poorly obtained sample to the entire population, the conclusions we reach will be rubbish.