Section 1

Chapter 8 – Producing Data: Sampling
Observational Study: observes individuals and measures variables of interest
but does not attempt to influence the responses.
Purpose: Describe some group or situation
Experiment: deliberately imposes some treatment on individuals in order to
observe their responses.
Purpose: Study whether the treatment causes a change in the response
“You don’t have to eat the whole ox to know that the meat is tough.”
The major use of inferential statistics is to use information from a sample to
infer something about a population.
The idea of sampling:
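We want to say something about a
Population: the entire group of individuals about which we want information.
To get at this we take a
Sample: the part of the population from which we actually collect information.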
Why take a sample?
Ex. Joe D. Politician is running for President. He calls you on the phone and
asks you to find out what percentage of the registered voters in the country
will vote for him. There are a few things you could try.
Option I: Call all registered voters on the phone and ask them whom they will
vote for.
- Provides a very accurate result
- Very tedious and time-consuming project
Option II: Call 4 registered voters, 1 in each time zone, and ask them whom
they will vote for.
- Very easy task
- Results would not be very reliable
To use a sample to make inferences about a population, the sample should be
representative of the population.
How likely is it that these 4 registered voters would represent the population
of all registered voters? Not very!
Option III: Somewhere between Option I and II -- randomly select 2000
registered voters and poll them.
- Easier than Option I
- More reliable than Option II
Since we make conclusions about a population based on the information
obtained from a sample, it is important that the sample should be
representative of the population.
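The trade-off between Option II and Option III can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true share of support (0.52) is a made-up value, the function names are hypothetical, and each voter is modeled as an independent draw, which is a reasonable approximation only when the population is very large compared with the sample.

```python
import random

def poll_estimates(true_share, sample_size, n_polls, seed=0):
    """Simulate n_polls random polls and return each poll's estimated share.

    Assumption: every sampled voter supports the candidate independently
    with probability true_share (a very large population).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_polls):
        votes = sum(rng.random() < true_share for _ in range(sample_size))
        estimates.append(votes / sample_size)
    return estimates

def spread(values):
    """Distance between the largest and smallest estimate."""
    return max(values) - min(values)

TRUE_SHARE = 0.52  # hypothetical value, chosen only for illustration

small = poll_estimates(TRUE_SHARE, sample_size=4, n_polls=200)     # like Option II
large = poll_estimates(TRUE_SHARE, sample_size=2000, n_polls=200)  # like Option III

print("spread of estimates, n = 4:   ", spread(small))
print("spread of estimates, n = 2000:", spread(large))
```

With only 4 voters, a poll can report nothing but 0%, 25%, 50%, 75%, or 100%, so repeated polls disagree wildly. With 2000 randomly chosen voters, repeated polls cluster closely around the true share.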
Sampling design, the method chosen to select the sample from the overall
population, has important consequences.
Poor sampling design can yield misleading conclusions.
A sampling method is biased if it systematically favors certain outcomes.
Some commonly used but bad sampling designs:
1. Voluntary Response Sample:
A voluntary response sample consists of people who choose themselves by
responding to a general appeal. Be aware: such samples often overrepresent
people with strong opinions, most often negative opinions.
Example: A magazine for health foods and organic healing wants to
establish that large doses of vitamins will improve health. The editors
ask readers who have regularly taken vitamins in large doses to write
in, describing their experiences. Of the 2754 readers who reply,
93% report some benefit from taking vitamins.
2. Convenience Sampling: Grab individuals or a group that is handy (easier to
reach), and take measurements.
Examples:
1. To inspect a delivery of oranges, you pick 10 oranges from the
top of the crate.
2. To survey shopping habits, you interview shoppers at malls.
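Problem with the above two sampling methods: in both, personal choice (by the
respondents or by the sampler) decides who is in the sample, so the sample is
unlikely to be representative of the population.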
Enter statisticians… 
In order to minimize the possibility of bias, such as favoritism (by a sampler)
and self-selection (by respondents), statisticians use random sampling to select
samples.
The idea is to avoid bias by letting impersonal chance, rather than human
choice, decide who is in the sample.
One such example is a simple random sample (SRS):
An SRS is like placing names (the population) in a hat and drawing out a handful
(the sample).
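The names-in-a-hat idea can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the voter labels and the helper name simple_random_sample are hypothetical, and the seed is fixed only so the draw can be repeated.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals so that every group of n is equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, like pulling names from a hat.
    return rng.sample(population, n)

# Hypothetical population of 20 labeled voters (the names in the hat).
names = [f"voter_{i}" for i in range(1, 21)]

# Draw a handful of 5 (the sample).
handful = simple_random_sample(names, 5, seed=1)
print(handful)
```

No individual can be drawn twice, and neither the sampler nor the respondents have any say in who is chosen.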
Other Sampling Designs
A general framework for statistical sampling is a probability sample.
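A probability sample is a sample chosen by chance. We must know what samples
are possible and what chance, or probability, each possible sample has.

An SRS is one kind of probability sample.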

The use of chance to select the sample is the essential principle of statistical
inference.
Sampling designs for sampling from large populations spread out over a wide area are
usually more complex than an SRS.
It is common to sample important groups within the population separately, then combine
these samples.
Example: A population of election districts might be divided into urban, suburban, and rural
strata.
Stratified Random Sample: To select a stratified random sample, first classify the
population into groups of similar individuals, called strata. Then choose a separate SRS in
each stratum and combine these SRSs to form the full sample.
A stratified design can produce more precise information than an SRS of the same size by
taking advantage of the fact that individuals in the same stratum are similar to one another.
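The two steps of a stratified random sample (classify into strata, then take a separate SRS in each stratum) can be sketched as follows. The district counts and the per-stratum sample sizes are invented for illustration, and stratified_random_sample is a hypothetical helper name.

```python
import random
from collections import defaultdict

def stratified_random_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Take a separate SRS inside each stratum and combine the results.

    stratum_of    -- function mapping an individual to its stratum label
    n_per_stratum -- dict giving how many individuals to draw from each stratum
    """
    rng = random.Random(seed)

    # Step 1: classify the population into strata.
    strata = defaultdict(list)
    for individual in individuals:
        strata[stratum_of(individual)].append(individual)

    # Step 2: choose an SRS within each stratum, then combine them.
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], n_per_stratum[label]))
    return sample

# Hypothetical population: 30 urban, 20 suburban, and 10 rural districts.
districts = (
    [("urban", i) for i in range(30)]
    + [("suburban", i) for i in range(20)]
    + [("rural", i) for i in range(10)]
)

chosen = stratified_random_sample(
    districts,
    stratum_of=lambda district: district[0],
    n_per_stratum={"urban": 6, "suburban": 4, "rural": 2},  # illustrative sizes
    seed=42,
)
print(chosen)
```

Unlike a single SRS of 12 districts, this design guarantees that every stratum appears in the sample in the chosen numbers.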
Cautions About Sample Surveys
Undercoverage:
- occurs when some groups in the population are left out of the process of choosing
the sample.
- Example – an opinion poll conducted by telephone will miss the 5% of American
households without residential phones.
Nonresponse:
- occurs when an individual chosen for the sample cannot be contacted or refuses to
participate
- Even with careful planning and several callbacks, nonresponse to sample surveys
often reaches 50% or more.