PPT Sampling error and bias

Sampling: Error and bias
Sampling definitions








Sampling universe
Sampling frame
Sampling unit
Basic sampling unit or elementary unit
Sampling fraction
Respondent
Survey subject
Unit of analysis
Sampling types
Two basic categories of sampling
Probability sampling
• Also called formal sampling or random sampling
Non-probability sampling
• Also called informal sampling
Probability sampling
What is probability sampling?
A selection of elements in a
population, such that every element
has a known, non-zero probability of
being selected.
Types of probability sampling





Simple random sampling (SRS)
Systematic random sampling
Stratified sampling
Cluster sampling
Multi-stage sampling
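The first two types can be sketched in a few lines of Python. This is a minimal illustration, not a survey tool: the frame of 100 household IDs, the function names, and the seeds are all made up for the example, and the systematic version assumes the frame size is an exact multiple of the sample size.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """SRS: draw n units without replacement; each unit has probability n/N."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, n, seed=None):
    """Systematic random sampling: random start, then every k-th unit.

    Assumes len(frame) is a multiple of n, so that every unit has the
    same known, non-zero probability (1/k) of being selected.
    """
    rng = random.Random(seed)
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical sampling frame: household IDs 1..100
frame = list(range(1, 101))
srs = simple_random_sample(frame, 10, seed=1)
sys_sample = systematic_sample(frame, 10, seed=1)
```

In both cases the probability of selection is known in advance (10/100), which is what makes these probability samples.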
Questions for sampling design

Presampling choices
• What is the nature of the study: exploratory,
descriptive, analytical?
• What are the outcomes of interest?
• What are the target populations?
• Do you want estimates for subpopulations or just
for the entire population?
• How will the data be collected?
• Is sampling necessary and appropriate?
Questions for sampling design

Sampling choices
• What listing will be used as the sampling frame?
• What is the desired precision?
• What type of sampling will be done?
• Will the probability of selection be equal or unequal?
• What is the sample size?
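Desired precision and sample size are linked. One common textbook calculation for estimating a proportion is sketched below; it is an assumption of this example, not a method stated in the slides. It assumes simple random sampling from a large population, a 95% confidence level (z = 1.96), and an optional design effect `deff` for more complex designs. The function name and parameters are hypothetical.

```python
import math

def sample_size_for_proportion(p, d, z=1.96, deff=1.0):
    """Sample size needed to estimate a proportion p within +/- d.

    p    : expected proportion (0.5 is the most conservative guess)
    d    : desired half-width of the confidence interval (precision)
    z    : normal quantile for the confidence level (1.96 for 95%)
    deff : design effect (1.0 for simple random sampling)
    """
    n_srs = (z ** 2) * p * (1 - p) / d ** 2
    return math.ceil(n_srs * deff)

# Expected prevalence 50%, desired precision +/- 5 percentage points
n_needed = sample_size_for_proportion(p=0.5, d=0.05)                 # 385
n_needed_cluster = sample_size_for_proportion(0.5, 0.05, deff=2.0)   # 769
```

Halving the desired half-width `d` roughly quadruples the required sample size, which is why the precision question has to be settled before the sample size question.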
Questions for sampling design

Postsampling choices
• How can the effect of nonresponse be assessed?
• Is weighted analysis necessary?
• What are the confidence limits for the major
estimates?
But…
The result from a survey is never exactly the same as the actual value in the population.
WHY?
Components of total error
[Figure: prevalence scale from 0% to 100%, showing the point estimate from the survey (40%) and the true population value (50%). The total error between them is made up of nonsampling bias, sampling bias, and sampling error.]
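As a rough reading of the figure, the total error is the survey estimate minus the true value, and the three components are assumed here to add up to it. The symbols $\hat{p}$ (survey estimate) and $p$ (true population value) are notation introduced for this sketch; the figure gives no sizes for the individual components.

```latex
\[
\begin{aligned}
\text{total error} &= \hat{p} - p = 40\% - 50\% = -10 \text{ percentage points} \\
\text{total error} &= \text{nonsampling bias} + \text{sampling bias} + \text{sampling error}
\end{aligned}
\]
```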
Nonsampling bias
Is present even if sampling and analysis are done correctly
Would still be present if the survey measured the outcome in the ENTIRE sampling frame
In sum, you have either sampled the wrong
people or screwed up your measurements!
Nonsampling bias

Types:
• Sampling frame is not equal to population to which you want to generalize (sampling universe)
• Sampling frame out of date
• Non-response among sampling units in sampling frame
• Measurement error
   • Tape incorrectly fixed to height board
   • Scale consistently reads low by 0.5 kg
   • Failure to remove heavy clothing before weighing
   • Misleading questions
   • Recall bias
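The scale example shows why nonsampling bias survives even a complete census of the sampling frame. The sketch below uses invented data: 10,000 simulated child weights with a mean of about 12 kg are assumptions for illustration only. Only the 0.5 kg offset comes from the list above.

```python
import random

rng = random.Random(0)

# Hypothetical sampling frame: true weights (kg) of 10,000 children
true_weights = [rng.gauss(12.0, 2.0) for _ in range(10_000)]

SCALE_OFFSET_KG = -0.5  # the scale consistently reads low by 0.5 kg

# Weigh EVERY child in the frame (no sampling at all) with the faulty scale
measured_weights = [w + SCALE_OFFSET_KG for w in true_weights]

true_mean = sum(true_weights) / len(true_weights)
census_mean = sum(measured_weights) / len(measured_weights)

# The error is still about -0.5 kg, although sampling error is zero
measurement_bias = census_mean - true_mean
```

No increase in sample size can remove this error; only fixing the instrument can.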
Nonsampling bias
Source of bias: Sampling frame out of date
Prevention or cure:
• Use current sampling frame
• Limit generalizations

Source of bias: Non-response
Prevention or cure:
• Minimize non-response
• Use various statistical methods to weight data

Source of bias: Measurement error
Prevention or cure:
• Standardize instruments
• Write clear & simple questions
• Train survey workers
• Supervise survey workers
Sampling bias

Selection of nonrepresentative sample, i.e., the likelihood of selection is not equal for each sampling unit
Failure to weight analysis of unequal probability sample
In sum, you have not sampled people with equal probability and you have not accounted for this in your analysis!
Sampling bias

Examples
• Nonrepresentative sample
   • Selecting youngest child in household
   • Choosing households close to the road
   • Using a different sampling fraction in different provinces
• Failure to do statistical weighting
Sampling bias
Source of bias: Nonrepresentative sampling
Prevention or cure:
• ALWAYS ask yourself "Will this choice enhance representativeness or reduce it?"
• Calculate the probabilities of selection

Source of bias: Failure to do weighting
Prevention or cure:
• Apply appropriate statistical weights if selection probabilities unequal
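A small worked example of weighting, under invented numbers: two provinces with very different populations are each sampled with 300 people, so the sampling fractions differ. The province names, population sizes, and case counts are hypothetical. Each respondent's weight is the inverse of the probability of selection (N/n for that province).

```python
# Hypothetical survey: same sample size in two provinces of unequal size
strata = {
    "Province A": {"N": 90_000, "n": 300, "cases": 60},   # 20% in sample
    "Province B": {"N": 10_000, "n": 300, "cases": 180},  # 60% in sample
}

# Unweighted analysis treats all 600 respondents alike
total_cases = sum(s["cases"] for s in strata.values())
total_n = sum(s["n"] for s in strata.values())
unweighted = total_cases / total_n  # 0.40

# Weighted analysis: weight = 1 / probability of selection = N / n
weighted_cases = sum(s["cases"] * s["N"] / s["n"] for s in strata.values())
weighted_total = sum(s["n"] * s["N"] / s["n"] for s in strata.values())
weighted = weighted_cases / weighted_total  # 0.24
```

The unweighted figure overstates prevalence because Province B, which is small but oversampled, counts as much as Province A. The weights restore each province to its real share of the population.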
Sampling error

Difference between survey result and population value due to random selection of sample
Influenced by:
• Sample size
• Sampling scheme
Unlike nonsampling bias and sampling bias, it can be predicted, calculated, and accounted for.
Sampling error

Measures of sampling error:
• Confidence limits
• Standard error
• Coefficient of variation
• P values
• Others
Use these measures to:
• Calculate sample size prior to sampling
• Determine how sure we are of result after analysis
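Three of these measures can be computed directly for a proportion. The sketch below assumes a simple random sample and uses the normal-approximation (Wald) confidence interval, which is one common choice rather than the only one. The function name and the example counts (160 cases among 400 respondents) are hypothetical.

```python
import math

def proportion_summary(cases, n, z=1.96):
    """Estimate, standard error, 95% confidence limits and CV for a proportion.

    Assumes simple random sampling and a normal approximation.
    """
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)
    return {
        "estimate": p,
        "standard_error": se,
        "ci_lower": p - z * se,
        "ci_upper": p + z * se,
        # coefficient of variation: standard error relative to the estimate
        "cv": se / p if p > 0 else float("nan"),
    }

# Hypothetical survey: 160 of 400 respondents have the outcome
result = proportion_summary(160, 400)
# estimate 0.40, standard error about 0.0245,
# 95% confidence limits about 0.352 to 0.448, CV about 0.061
```

These numbers describe sampling error only. They say nothing about nonsampling bias or sampling bias.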
Bias and sampling error
[Figure: total error divided into bias (nonsampling bias and sampling bias) and sampling error]
In sum…
Bias
• Includes nonsampling bias and sampling bias
• Is due to mistakes which can be avoided
• Cannot be precisely measured
• Control and prevention require careful attention
Sampling error
• Is unavoidable if sampling < 100% of population
• Can be controlled by selecting appropriate sample size and sampling method
• Can be precisely calculated after the fact
Essential concepts
Bias & Accuracy
Sampling error & Precision
Accuracy
What is accuracy?
The degree to which a measurement, or an
estimate based on measurements, represents
the true value of the attribute that is being
measured.
Last. A Dictionary of Epidemiology. 1988
In short, obtaining results close to the TRUTH.
Accuracy
Associated terms:
• Validity
Precision
What is precision?
Precision in epidemiologic measurements
corresponds to the reduction of random error.
Rothman. Modern Epidemiology. 1986.
In short, obtaining similar results with
repeated measurement
Precision
Associated terms:
• Reliability
• Reproducibility
Accuracy vs. precision
Accuracy: obtaining results close to truth
[Figure: results of Survey 1, Survey 2, and Survey 3 plotted against the real population value]
Accuracy vs. precision
Precision: obtaining similar results with repeated
measurement (may or may not be accurate)
Accuracy vs. precision
Poor precision (from small sample size) with
reasonable accuracy (without bias):
Accuracy vs. precision
Good precision (from large sample size) with
reasonable accuracy (without bias):
Accuracy vs. precision
Good precision (from large sample size), but with
poor accuracy (with bias):
In sum…

Sampling error
• Difference between survey result and population value due to random selection of sample
• Greater with smaller sample sizes
• Induces lack of precision
Bias
• Difference between survey result and population value due to error in measurement, selection of non-representative sample or other factors
• Due to factors other than sample size
• Therefore, a large sample size cannot guarantee absence of bias
• Induces lack of accuracy, even with good precision
Usual situation after a survey
[Figure: result of a single survey with its 95% confidence limits]
How can you tell which situation you have?
[Figure: two possible situations, each showing the result of a single survey with its 95% confidence limits]
Precision, bias, and sample size
Precision vs. bias
Larger sample size increases precision
• It does NOT guarantee absence of bias
• Bias may result in a very incorrect estimate
• If there is little sampling error, you may have confidence in this wrong estimate
Quality control is more difficult the larger the sample size
Therefore, you may be better off with a smaller sample size, less precision, but much less bias.
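The trade-off can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the true prevalence of 50% and the biased frame prevalence of 40% echo the figure earlier in the deck, the sample sizes of 100 and 10,000 are invented, and drawing independent random numbers stands in for simple random sampling from a very large frame.

```python
import math
import random

def simulate_survey(prevalence_in_frame, n, rng, z=1.96):
    """Return (estimate, lower, upper) for a simulated survey of n people.

    prevalence_in_frame is the prevalence among the people actually
    reachable by the survey, which differs from the truth if the
    selection is biased.
    """
    cases = sum(rng.random() < prevalence_in_frame for _ in range(n))
    p = cases / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width

rng = random.Random(42)
TRUE_PREVALENCE = 0.50

# Small survey, no bias: wide confidence limits centred near the truth
small_unbiased = simulate_survey(0.50, 100, rng)

# Large survey, biased selection: narrow limits centred on the wrong value
large_biased = simulate_survey(0.40, 10_000, rng)

small_width = small_unbiased[2] - small_unbiased[1]
large_width = large_biased[2] - large_biased[1]
```

The large survey's confidence limits are roughly ten times narrower, yet they sit around 40% and exclude the true value of 50%. The confidence limits alone give no warning of this.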