VI. Sampling

VI. Sampling: (Nov. 2, 4)

Frankfort-Nachmias & Nachmias (Chapter 8 – Sampling and Sample
Designs)

King, Keohane and Verba (Chapter 4)

Barbara Geddes. 1990. “How the Cases You Choose Affect the Answers
You Get: Selection Bias in Comparative Politics.” Political Analysis, 2:1,
131-150.
Applications

William Reed. 2000. “A Unified Statistical Model of Conflict Onset and Escalation.”
American Journal of Political Science, Vol. 44 (1): 84-93.

Richard Timpone. 1998. “Structure, Behavior and Voter Turnout in the
United States.” American Political Science Review, Vol. 92 (1): 145-158.
Sampling
• Population – any well-defined set of units of analysis; the group to which our theories apply
• Sample – any subset of units collected in some manner from the population; the data we use to test our theories
• Parameter vs. Statistic
Types of Samples
• Probability sample – each element of the population has a known probability of being included in the sample
• Nonprobability sample – each element of the population has an unknown probability of being included in the sample
Types of Nonprobability Samples
• Convenience sample
• Purposive sample
• Problem – may not be representative of the population to which we want to generalize
Famous Example of Convenience Sampling
• Literary Digest – used automobile registration lists and telephone directories as sampling frame for presidential polls
    • 1928: 18 million postcards to accurately predict outcome of 1928 election (Hoover-R)
    • 1932: 20 million postcards to accurately predict 1932 election (Roosevelt-D)
    • 1936: predicted Landon (R) 57%

What happened?
Famous Example of Convenience Sampling
• Before 1936
    • Upper class/Working class – more or less representative partisan distribution
• 1936 and beyond
    • Upper class disproportionately Republican
    • Working class disproportionately Democratic
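
The mechanism on this slide can be illustrated with a small simulation. This is only a sketch under invented assumptions: the class shares, the party splits, and the chances of appearing on a car-registration or telephone list are hypothetical numbers chosen to show the direction of the bias, not historical estimates. The function name simulate_frame_bias is likewise made up for this example.

```python
import random

def simulate_frame_bias(n_population=100_000, n_sample=5_000, seed=0):
    """Compare a sample drawn from a class-skewed frame with a simple random sample.

    Every rate below is a hypothetical value used for illustration only.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n_population):
        upper_class = rng.random() < 0.30               # assumed share of upper-class voters
        p_republican = 0.65 if upper_class else 0.35    # assumed class-based partisanship
        votes_republican = rng.random() < p_republican
        p_in_frame = 0.80 if upper_class else 0.20      # assumed chance of being on the lists
        in_frame = rng.random() < p_in_frame
        population.append((votes_republican, in_frame))

    all_votes = [vote for vote, _ in population]
    frame_votes = [vote for vote, in_frame in population if in_frame]

    return {
        "true": sum(all_votes) / n_population,
        "frame": sum(rng.sample(frame_votes, n_sample)) / n_sample,
        "random": sum(rng.sample(all_votes, n_sample)) / n_sample,
    }

result = simulate_frame_bias()
print(result)
```

Under these assumptions the frame-based estimate of the Republican share sits well above the true share, while the much smaller simple random sample lands close to it. A large sample does not repair a frame that over-represents one class once class and partisanship are linked.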
Types of Nonprobability Samples
• Quota samples – elements are chosen based on selected characteristics and the representation of these characteristics in the population
    • Ensures accurate representation of selected characteristics
    • Elements with selected characteristics chosen in convenience fashion
Famous Examples of Quota
Samples
• 1936 – George Gallup used quota sampling to accurately predict:
    • The (inaccurate) Literary Digest prediction
    • The winner of the 1936 election
Famous Examples of Quota
Samples

• 1948 – quota sampling incorrectly predicts Dewey to defeat Truman
Types of Probability Samples
• Simple random sample – each element of the population has an equal chance of being selected
• Systematic sample – elements selected from a list at predetermined intervals
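
A minimal sketch of the two designs, assuming the population is available as a Python list (the roster of 1,000 ID numbers is hypothetical). The systematic version uses a random start and a fixed interval k = N // n, which is one common way to set the "predetermined interval"; the slide does not specify one.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(units, n)

def systematic_sample(units, n, seed=None):
    """Take every k-th unit from the list after a random start, with k = len(units) // n."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and len(units)")
    rng = random.Random(seed)
    k = len(units) // n            # sampling interval
    start = rng.randrange(k)       # random starting position within the first interval
    return [units[start + i * k] for i in range(n)]

roster = list(range(1, 1001))      # hypothetical list of 1,000 unit IDs
srs = simple_random_sample(roster, 50, seed=1)
sys_sample = systematic_sample(roster, 50, seed=1)
print(sorted(srs)[:5], sys_sample[:5])
```

With a list of 1,000 and n = 50, the systematic sample takes every 20th unit. If the list is ordered in a way that repeats every 20 entries, the systematic sample can be badly unrepresentative, which is the usual caution attached to this design.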
Types of Probability Samples
• Stratified sample – elements in the population are grouped into strata, and each stratum is randomly sampled
Example of Stratified Sampling

• Population: 75% white, 10% black, 10% Hispanic, 5% Asian
• Simple random sample of 1000: approximately 750 white, 100 black, 100 Hispanic, 50 Asian
    • Samples too small for group comparisons
    • Solution: use stratified sampling to over-sample minority groups (disproportionate stratified sampling)
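
The over-sampling idea can be sketched as follows, using the group shares from the example. The population size of 100,000, the choice of 250 cases per stratum, and the function name are assumptions made for illustration. Because groups are sampled at different rates, each case carries a weight (stratum size divided by stratum sample size) so that whole-population estimates are not distorted.

```python
import random
from collections import Counter

def disproportionate_stratified_sample(population, n_per_stratum, seed=None):
    """population: list of (unit_id, stratum) pairs.

    Returns (unit_id, stratum, weight) triples, where weight is the number of
    population units that each sampled unit represents.
    """
    rng = random.Random(seed)
    strata = {}
    for unit_id, stratum in population:
        strata.setdefault(stratum, []).append(unit_id)
    sample = []
    for stratum, members in strata.items():
        n_h = n_per_stratum[stratum]
        weight = len(members) / n_h
        for unit_id in rng.sample(members, n_h):   # simple random sample within the stratum
            sample.append((unit_id, stratum, weight))
    return sample

shares = {"white": 0.75, "black": 0.10, "Hispanic": 0.10, "Asian": 0.05}
population_size = 100_000   # hypothetical population size
population = []
for stratum, share in shares.items():
    first_id = len(population)
    population.extend((first_id + i, stratum) for i in range(round(population_size * share)))

sample = disproportionate_stratified_sample(population, {s: 250 for s in shares}, seed=42)
counts = Counter(stratum for _, stratum, _ in sample)
total_weight = sum(weight for _, _, weight in sample)
weighted_shares = {
    s: sum(weight for _, stratum, weight in sample if stratum == s) / total_weight
    for s in shares
}
print(counts)
print(weighted_shares)
```

Each group now contributes 250 cases, enough for group comparisons, and the weighted shares recover the population composition.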
Types of Probability Samples
• Cluster sample – elements are grouped into “clusters,” and sampling proceeds in two stages:
    • (1) A random sample of clusters is chosen
    • (2) Elements within selected clusters are then randomly selected and aggregated to form final sample
    • This is the sampling method used in many national surveys (e.g. clusters=metropolitan areas, zip codes, area codes)
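
The two stages can be sketched as below. The 40 equal-sized clusters, the naming scheme, and the function name are hypothetical; real surveys typically deal with clusters of unequal size and adjust selection probabilities or weights accordingly, which this sketch does not attempt.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """clusters: dict mapping cluster name -> list of elements.

    Stage 1 draws clusters at random; stage 2 draws elements at random
    within each selected cluster. Returns (cluster_name, element) pairs.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)            # stage 1
    sample = []
    for name in chosen:
        members = clusters[name]
        take = min(n_per_cluster, len(members))
        sample.extend((name, unit) for unit in rng.sample(members, take))  # stage 2
    return sample

# Hypothetical population: 40 areas with 200 people each
clusters = {
    f"area_{i:02d}": [f"area_{i:02d}_person_{j}" for j in range(200)]
    for i in range(40)
}
cluster_sample = two_stage_cluster_sample(clusters, n_clusters=8, n_per_cluster=25, seed=3)
print(len(cluster_sample), sorted({name for name, _ in cluster_sample}))
```

Only the 8 selected areas need to be visited and listed, which is the practical appeal of the design for national surveys.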
Sampling Distribution (of sample means)
Population
Draw Random Sample of Size n
Calculate sample mean
Repeat until all possible random samples of size n are exhausted
The resulting collection of sample means is the sampling distribution of sample means
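
For a very small population the procedure above can be carried out exactly. The sketch below assumes a made-up population of five values and samples of size 2 drawn without replacement, and simply lists every possible sample. One thing worth checking in the output is that the average of all the sample means matches the population mean.

```python
from itertools import combinations
from statistics import mean

def sampling_distribution_of_mean(population, n):
    """Return the mean of every possible size-n sample drawn without replacement."""
    return [mean(sample) for sample in combinations(population, n)]

population = [2, 4, 6, 8, 10]      # hypothetical population values
sample_means = sampling_distribution_of_mean(population, 2)

print(sorted(sample_means))        # the 10 possible sample means
print(mean(sample_means), mean(population))   # both equal 6
```

With realistic populations the number of possible samples is far too large to enumerate, so the sampling distribution is worked out mathematically or approximated by simulation.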
Sampling Distribution of Sample Means
• Def: A frequency distribution of all possible sample means taken from the same population for a given sample size (n)
    • The mean of the sampling distribution will be equal to the population mean.
    • The sampling distribution will be approximately normal, regardless of the shape of the population distribution, when n is large (rule of thumb: n > 30).
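
These two properties can be checked by simulation. The sketch below assumes a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen only for illustration) and samples of size n = 40. If the sample means are roughly normal, about 95% of them should fall within 1.96 standard errors of the population mean.

```python
import random
from statistics import mean

def simulate_sample_means(n, n_samples=2000, seed=0):
    """Draw n_samples samples of size n from an Exponential(1) population; return their means."""
    rng = random.Random(seed)
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(n_samples)]

n = 40
means = simulate_sample_means(n)

population_mean = 1.0              # mean of Exponential(1)
se = 1.0 / n ** 0.5                # population sd is 1, so the standard error is 1 / sqrt(n)
share_within = sum(abs(m - population_mean) <= 1.96 * se for m in means) / len(means)

print(round(mean(means), 3))       # should be close to 1.0
print(round(share_within, 3))      # should be close to 0.95
```

The approximation is a rule of thumb, not a guarantee: the more skewed the population, the larger n needs to be before the sample means look normal.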
Standard Error
• How the sample means vary from sample to sample (i.e., within the sampling distribution) is expressed statistically by the value of the standard deviation of the sampling distribution. This value is the standard error.
Standard Error, cont.
• The standard error for a sample mean is calculated as: s / √n
    • Where s = sample standard deviation
    • n = sample size
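
As a worked example of the formula, the sketch below computes s / √n for a small made-up data set. The values are hypothetical; statistics.stdev is used because it returns the sample standard deviation (n - 1 in the denominator).

```python
from math import sqrt
from statistics import stdev

def standard_error(values):
    """Standard error of the sample mean: sample standard deviation divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample, n = 8
print(round(standard_error(scores), 3))
```

Because n appears under a square root, cutting the standard error in half requires roughly four times as many cases.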
Simulating a Sampling Distribution (For a Sample Proportion)
• Dichotomous variable for which the true population value is set at .25
• Randomly draw 1,000 samples of size n
• Repeat for different n’s and compare
[Figure: Simulation of a Sampling Distribution (n=10)]
[Figure: Simulation of a Sampling Distribution (n=100)]
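
A simulation along these lines can be sketched as follows. The true proportion of .25 and the 1,000 samples per run come from the slide; the seed, the function name, and the choice to summarize each run by its mean and standard deviation (instead of a histogram) are assumptions of this sketch.

```python
import random
from statistics import mean, pstdev

def simulate_sample_proportions(p=0.25, n=10, n_samples=1000, seed=0):
    """Draw n_samples samples of size n from a 0/1 population with P(1) = p.

    Returns the proportion of 1s in each sample.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(n_samples)]

results = {}
for n in (10, 100):
    props = simulate_sample_proportions(p=0.25, n=n)
    theoretical_se = (0.25 * 0.75 / n) ** 0.5    # sqrt(p(1 - p) / n)
    results[n] = (mean(props), pstdev(props))
    print(n, round(mean(props), 3), round(pstdev(props), 3), round(theoretical_se, 3))
```

Both runs center near .25, but the sample proportions are much more tightly clustered when n = 100 than when n = 10, in line with the square-root relationship between sample size and sampling error.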
Sample Size and Sampling Error
Sample Selection Bias
• What is it?
• What are the consequences of selecting on:
    • The dependent variable?
    • The independent variable?
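
One way to explore these questions is a small simulation. The sketch below is an illustration under assumed conditions, not a general proof: data are generated from a hypothetical linear model y = 1 + 2x + error, and the regression slope is then estimated on the full data, on cases selected by their x values, and on cases selected by their y values. The model, the cutoffs, and the function name ols_slope are all invented for this example.

```python
import random

def ols_slope(pairs):
    """Bivariate least-squares slope of y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

rng = random.Random(0)
data = []
for _ in range(20_000):
    x = rng.gauss(0, 1)
    y = 1.0 + 2.0 * x + rng.gauss(0, 2)    # assumed true slope is 2
    data.append((x, y))

slope_full = ols_slope(data)
slope_select_x = ols_slope([(x, y) for x, y in data if x > 0])     # select on the independent variable
slope_select_y = ols_slope([(x, y) for x, y in data if y > 1.0])   # select on the dependent variable

print(round(slope_full, 2), round(slope_select_x, 2), round(slope_select_y, 2))
```

In this setup, selecting on x leaves the slope estimate close to 2 (with less precision, since fewer cases remain), while selecting on y pulls the estimate well below 2. This is the pattern discussed in the King, Keohane and Verba and Geddes readings: choosing cases by their outcomes tends to understate the relationship.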