Designing Sample Student Notes

I Collecting Data:
sample survey
experiment
observational study
Population: the entire group of individuals that we want information about.
Sample: a part of the population that we actually examine in order to gather information.
Census: an attempt to contact every individual in the entire population.
II Sampling
Voluntary Response Sampling
Convenience Sampling
Simple Random Sample
Stratified Random Sample
Cluster Sampling
Probability Sample
Sampling involves studying a part in order to gain information about the whole.
Bias: The design of the study is biased if it systematically favors certain outcomes.
Example
A poll where a student surveys only his friends is clearly biased. Why?
Convenience Sampling: Choose the individuals who are easiest to reach. Mall surveys and
cafeteria surveys are two examples of convenience sampling. What is wrong with that?
A voluntary response sample consists of people who choose themselves by responding to
a general appeal. Voluntary response samples are biased because people with strong
opinions, especially negative opinions, are most likely to respond.
For example, if there was pre-election coverage taking place on TV and the screen
provided a number a person could call to "vote" for or against a certain candidate being
interviewed, the people watching would be able to choose whether or not they wanted to
take part in the survey, and thus this would be a voluntary response sample.
Before doing the numbers, we should point out that the quality of the sample is more
important than its size.
The selection process itself is crucial. For example, a voter survey that systematically
excluded females would be worthless, and there are a host of other ways to ruin or bias a
sample.
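The point about quality versus size can be illustrated with a small simulation. This is only a sketch: the population, its size, and the 60% and 40% support rates are made-up numbers chosen for illustration, not data from any real survey.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 50,000 women and 50,000 men.
# Assumed (made-up) support rates: 60% among women, 40% among men.
population = (
    [("F", rng.random() < 0.60) for _ in range(50_000)]
    + [("M", rng.random() < 0.40) for _ in range(50_000)]
)

def support_rate(people):
    """Fraction of the given people who support the candidate."""
    return sum(supports for _, supports in people) / len(people)

true_rate = support_rate(population)

# Large but biased sample: 20,000 voters, with women systematically excluded.
men_only = [person for person in population if person[0] == "M"]
biased_sample = rng.sample(men_only, 20_000)

# Small simple random sample: 500 voters drawn from the whole population.
small_srs = rng.sample(population, 500)

print(f"true rate:        {true_rate:.3f}")
print(f"biased, n=20000:  {support_rate(biased_sample):.3f}")
print(f"SRS, n=500:       {support_rate(small_srs):.3f}")
```

Under these assumptions the small random sample should land much closer to the true rate than the large biased one, because taking more observations does nothing to correct a systematic exclusion.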
III Random Samples
Simple Random Sample
Probability Sample
Stratified Random Sample
Cluster sampling
Multistage sampling
1. Simple Random Sample (SRS) of size n consists of n individuals from the population
chosen in such a way that every set of n individuals has an equal chance to be the
sample actually selected.
Suppose we have 20 students and I want to choose an SRS of 4 students. If I randomly
select 2 of the 10 girls and 2 of the 10 boys, is that an SRS?
A simple random sample has two properties that make it the standard against which we
measure all other methods:
1. Unbiased: Each unit has the same chance of being chosen.
2. Independence: Selection of one unit has no influence on the selection of the other
units.
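One way to explore the girls-and-boys question above is to count samples. The sketch below compares the number of possible groups of 4 out of 20 students with the number of groups the 2-girls-plus-2-boys design can produce, using the class sizes given in the question.

```python
from math import comb

# Every possible group of 4 students out of 20.
total_sets = comb(20, 4)

# Groups the "2 of 10 girls plus 2 of 10 boys" design can produce.
two_and_two = comb(10, 2) * comb(10, 2)

# Chance that any one student is chosen under each design.
chance_srs = 4 / 20
chance_two_and_two = 2 / 10

print(total_sets)                        # 4845
print(two_and_two)                       # 2025
print(chance_srs == chance_two_and_two)  # True
```

Each student has the same 1-in-5 chance either way, but the 2-and-2 design can never produce, say, a group of four girls. Since the definition requires every set of n individuals to be equally likely, the counts suggest that equal chances for individuals alone are not enough to make an SRS.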
Steps to a Simple Random Sample
1. Start with a random number table. Your textbook has one, Table B, inside the back
cover. Part of it is reproduced below.
2. Number every member of your population.
3. For a sample of size n, use the random number table to choose the first n numbers
that match labels in your population.
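When software is available, the same steps can be sketched without a printed table. The example below is a minimal sketch that swaps the random number table for Python's standard random module; the student labels and the seed are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every set of n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)

# Step 2 of the notes: number every member of the population.
students = [f"student_{i:02d}" for i in range(1, 21)]

# Step 3: choose n = 4 of them at random.
chosen = simple_random_sample(students, 4, seed=1)
print(chosen)
```

Here random.sample draws without replacement, so no student can be picked twice.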
For example, to take a random sample of three students from the room:
1. Number all of the students in the room.
2. Starting with Line 101, find the numbers of the first three students.
Line
101   19223 95034 05756 28713 96409 12531 42544 82853
102   73676 47150 99400 01927 27754 42648 82425 36290
103   45467 71709 77558 00095 32863 29485 82226 90056
104   52711 38889 93074 60227 40011 85848 48767 52573
105   95592 94007 69971 91481 60779 53791 17297 59335
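The table-reading procedure can be sketched in code. The notes do not say how many students are in the room, so this sketch assumes a class of 30 labeled 01 to 30; the function name read_labels is also made up for illustration. It reads the Line 101 digits two at a time, skipping any pair that is not a valid label or that has already been chosen.

```python
# Digits from Line 101 of the table excerpt.
LINE_101 = "19223 95034 05756 28713 96409 12531 42544 82853"

def read_labels(digits, n, population_size, width=2):
    """Read fixed-width groups of digits and keep the first n valid labels."""
    digits = digits.replace(" ", "")  # the spacing in the table has no meaning
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Skip groups outside 01..population_size and skip repeats.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

# Assumed class size of 30 (not stated in the notes).
print(read_labels(LINE_101, n=3, population_size=30))  # [19, 22, 5]
```

With 30 students, the pairs read 19, 22, 39, 50, 34, 05, so students 19, 22, and 05 are selected; 39, 50, and 34 are skipped because no student has those labels.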
Unfortunately, in the real world, completely unbiased, independent samples are hard to
find. For instance, surveying voters by randomly dialing telephone numbers is biased: it
ignores voters without a telephone and oversamples people with more than one number!
2. Stratified Random Sample: To select a stratified random sample, first divide the
population into groups of similar individuals, called strata. Then choose a separate
SRS in each stratum and combine these SRSs to form the full sample.
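A stratified random sample can be sketched as follows. This is a minimal illustration, assuming two strata of 10 girls and 10 boys with made-up labels; the function name and the per-stratum sample sizes are hypothetical.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a separate SRS in each stratum and combine the results."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        # One SRS per stratum, drawn without replacement.
        sample.extend(rng.sample(members, sizes[name]))
    return sample

# Hypothetical strata: 10 girls and 10 boys.
strata = {
    "girls": [f"G{i}" for i in range(1, 11)],
    "boys": [f"B{i}" for i in range(1, 11)],
}

sample = stratified_sample(strata, {"girls": 2, "boys": 2}, seed=3)
print(sample)
```

Every run returns exactly two girls and two boys, which is the guarantee stratification buys. If the strata had different sizes, or the per-stratum sample sizes differed, members of different strata would generally have different chances of selection.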
3. Probability Sample: A probability sample is any sample chosen by chance.
We must know what samples are possible and what chance or probability each sample
has.
Not all probability samples have the characteristic where each member of the group
has the same chance of being selected.
Does a stratified random sample ensure that each member of the group has the same
chance of being selected? Why or why not?
4. Multistage sampling: Selects successively smaller groups within the population in
stages, resulting in a sample consisting of clusters of individuals. Each stage may
employ an SRS, a stratified sample, or another type of sample.
A national multistage sample proceeds somewhat as follows:
1. Take a sample from the 3,000 counties in the United States.
2. Select a sample of townships within each of the counties chosen.
3. Select a sample of blocks within each chosen township.
4. Take a sample of households within each block.
Example:
Population: Residents of California.
From the collection of counties, randomly pick 20 counties
Then, from each selected county, randomly pick 10 cities/towns
Then, randomly pick 100 individuals from each of the selected cities/towns.
Note: This design does not yield an SRS of California residents...Why not?
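The California design can be sketched on a toy population. Everything in this example is hypothetical: the nested dictionary stands in for counties, towns, and residents, and the counts are scaled down (2 counties, 3 towns per county, 5 residents per town) so the sketch runs quickly.

```python
import random

def multistage_sample(state, n_counties, n_towns, n_people, seed=None):
    """Sample counties, then towns within them, then residents within towns."""
    rng = random.Random(seed)
    sample = []
    for county in rng.sample(sorted(state), n_counties):       # stage 1
        towns = state[county]
        for town in rng.sample(sorted(towns), n_towns):        # stage 2
            sample.extend(rng.sample(towns[town], n_people))   # stage 3
    return sample

# Toy population: 5 counties, 4 towns per county, 20 residents per town.
state = {
    f"county_{c}": {
        f"town_{c}_{t}": [f"resident_{c}_{t}_{p}" for p in range(20)]
        for t in range(4)
    }
    for c in range(5)
}

sample = multistage_sample(state, n_counties=2, n_towns=3, n_people=5, seed=7)
counties_hit = {name.split("_")[1] for name in sample}
print(len(sample))        # 30
print(len(counties_hit))  # 2
```

Every sample this design can produce is confined to the chosen counties, so a group of residents spread across more counties than were selected can never be drawn. That is one way to see why the result is not an SRS.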
Cautions about sample surveys
Undercoverage: Do we have an accurate and complete list of the population of
interest?
Nonresponse bias: Has everyone we surveyed responded?
Response Bias
Lying...embarrassment...e.g., suppose we ask whether a person has ever shoplifted.
Interviewer bias...gender or race could alter response
Wording of the question. An example follows:
Version 1: Do you think there should be an amendment to the Constitution prohibiting
abortions?
Version 2: Do you think there should be an amendment to the Constitution protecting
the life of the unborn child?
Confusing or leading questions
Last Word of Warning:
Without a randomized design, there can be no dependable statistical analysis, no matter
how the data are adjusted afterward. The beauty of random sampling is that it removes
selection bias and lets the laws of probability tell us how accurate the results are
likely to be.