Practical Tools for
Nonresponse Bias Studies
Kristen M. Olson
University of Nebraska-Lincoln
Short Course – JOS 30th Anniversary
June 10-11, 2015
Materials for this short course were developed
originally by
Robert M. Groves and J. Michael Brick
Initial Conversations for this course came from the
Nonresponse Bias Summit Meeting, Ann Arbor,
January 29-30, 2005
Paul Biemer, Martin Frankel, Brian Harris-Kojetin, Steve
Heeringa, Paul Lavrakas, Kristen Olson, Colm
O’Muircheartaigh, Beth-Ellen Pennell, Emilia Peytcheva,
Andy Peytchev, Eleanor Singer, Clyde Tucker, Mandi Yu,
Sonja Ziniel
Coverage of Short Course
• Included
– Nonresponse bias approaches for different
modes
– Mostly household, some establishment
surveys
– Dominance of U.S. examples
• De-emphasized
– Panel attrition issues
– Item nonresponse issues
3
Schedule
June 10:
13:00 – 14:00   Introduction
14:00 – 14:30   Benchmarking
14:30 – 14:45   Break
14:45 – 16:30   Study designs using external data
June 11:
13:00 – 14:45   Study designs involving internal survey data
14:45 – 15:00   Break
15:00 – 16:00   Postsurvey adjustment analyses
4
INTRODUCTION
5
What is Nonresponse?
• Unit nonresponse is the failure to obtain survey
measures on a sample unit
• It occurs after the sampling step of survey
(don’t confuse with failure to cover target
population by sampling frame)
• It reflects total failure to obtain survey data
(don’t confuse with item nonresponse, the
failure to obtain an answer to a given item)
6
[Figure: survey lifecycle from a total survey error perspective (Groves, et al. 2004, Survey Methodology, Figure 2.5). Measurement side: construct → measurement → response → edited response, with validity, measurement error, and processing error arising between steps. Representation side: target population → sampling frame → sample → respondents → postsurvey adjustments, with coverage error, sampling error, nonresponse error, and adjustment error arising between steps. Both sides combine in the survey statistic.]
7
Response Rates
• AAPOR standards for calculations
• Weighted response rates may be appropriate
• Often used as data quality and field
performance indicator
• Low response rates can be an indicator of
potential problems such as
– Nonresponse bias
– Underestimated variances
8
Nonresponse Error for Sample Mean
In simplest terms
$$\bar{y}_r = \bar{y}_n + \frac{m}{n}\,(\bar{y}_r - \bar{y}_m)$$
OR
Respondent Mean = Full Sample Mean +
(Nonresponse Rate)*(Respondent Mean – Nonrespondent
Mean)
OR
Survey Results = Desired Results + Error
OR
Nonresponse Error = f(Rate, Difference between
Respondents and Nonrespondents)
9
Examples
I. Response Rate – 75%
Respondent Mean – 10
Nonrespondent Mean – 14
Nonresponse Error = .25*(10 – 14) = –1
II. Response Rate – 90%
Respondent Mean – 10
Nonrespondent Mean – 40
Nonresponse Error = .10*(10 – 40) = –3
10
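This arithmetic is simple to script; a minimal sketch in Python, using the values from Examples I and II:

```python
def nonresponse_error(nonresp_rate, resp_mean, nonresp_mean):
    # Deterministic view: error = (m/n) * (respondent mean - nonrespondent mean)
    return nonresp_rate * (resp_mean - nonresp_mean)

# Example I: 75% response rate, respondent/nonrespondent gap of -4
print(nonresponse_error(0.25, 10, 14))  # -1.0
# Example II: 90% response rate, but a gap of -30
print(nonresponse_error(0.10, 10, 40))  # -3.0
```

Note the second example: the higher response rate still yields the larger error because the respondent-nonrespondent difference is larger.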
Low Nonresponse Rate, Small Difference
between Respondents and Nonrespondents
[Bar chart: the sample divided into respondents and nonrespondents; the nonrespondent share is small and the respondent mean $\bar{y}_r$ is close to the nonrespondent mean $\bar{y}_m$.]
11
High Nonresponse Rate, Small Difference
between Respondents and Nonrespondents
[Bar chart: the nonrespondent share is large, but $\bar{y}_r$ and $\bar{y}_m$ remain close together.]
12
Low Nonresponse Rate, Large Difference
between Respondents and Nonrespondents
[Bar chart: the nonrespondent share is small, but $\bar{y}_r$ and $\bar{y}_m$ are far apart.]
13
High Nonresponse Rate, Large Difference
between Respondents and Nonrespondents
[Bar chart: the nonrespondent share is large and $\bar{y}_r$ and $\bar{y}_m$ are far apart.]
14
A Stochastic View of Response Propensities

$$\mathrm{Bias}(\bar{y}_r) = \frac{\sigma_{yp}}{\bar{p}} = \frac{\rho_{yp}\,\sigma_y\,\sigma_p}{\bar{p}}$$

where
$\sigma_{yp}$ = covariance between y and response propensity, p
$\bar{p}$ = mean propensity over the sample
$\rho_{yp}$ = correlation between y and p
$\sigma_y$ = standard deviation of y
$\sigma_p$ = standard deviation of p
15
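A short simulation sketch (Python with NumPy; the population and propensity model are invented for illustration) showing that the covariance between y and p, not the response rate alone, drives the bias:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Survey variable y and response propensities p that covary with y
y = rng.normal(50.0, 10.0, n)
p = np.clip(0.5 + 0.02 * (y - 50.0) + rng.normal(0.0, 0.05, n), 0.01, 0.99)

# Stochastic view: Bias(ybar_r) is approximately sigma_yp / pbar
analytic = np.cov(y, p)[0, 1] / p.mean()

# One realization: each unit responds with probability p
respond = rng.random(n) < p
realized = y[respond].mean() - y.mean()

print(f"analytic: {analytic:.2f}, realized: {realized:.2f}")
```

Setting the 0.02 coefficient to zero makes the propensities independent of y, and the bias vanishes even though the response rate is unchanged.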
What does the Stochastic View Imply?
• Key issue is whether the influences on
survey participation are shared with the
influences on the survey variables
• Increased nonresponse rates do not
necessarily imply increased nonresponse
error
• Hence, investigations are necessary to
discover whether the estimates of interest
might be subject to nonresponse errors
16
Meta-Analysis of Nonresponse Error
Studies
• 59 studies, some with multiple estimates
• Each has data available to compute a
relative bias due to nonresponse. The
absolute value of the relative bias is
$$\left|\frac{\bar{y}_r - \bar{y}_n}{\bar{y}_n}\right|$$
17
Percentage Absolute Relative Nonresponse Bias of 959 Respondent Means by Nonresponse
Rate of the 59 Surveys in Which They Were Estimated.
Robert M. Groves, and Emilia Peytcheva Public Opin Q 2008;72:167-189
18
Conclusions
• Nonresponse error does exist
• Nonresponse rate by itself is not a
good predictor of nonresponse errors
• The analysis does NOT address
whether changing the nonresponse
rate within a study would have affected
the error
19
Some Comments From a Meta-Analysis
of Nonresponse Bias Studies
• What do we know about properties of
different nonresponse bias study
designs?
• Tendency for studies of internal
variation to have higher bias estimates
– be careful of multiple interpretations
• survey variables linked to mechanisms
producing nonresponse OR
• confounds among variables and study
techniques
Groves, R., and Peytcheva, E. (2008). "The Impact of Nonresponse Rates on Nonresponse Bias: A Meta-Analysis." Public Opinion Quarterly.
20
Bias as Proportion of Standard
Deviation
[Bar chart: bias as a proportion of the standard deviation, by method used to estimate nonresponse bias – Frame 0.08, Supplement 0.10, Screener 0.19, Followup 0.14.]
21
Does Nonresponse Bias on Demographic
Variables Predict Nonresponse Bias on
Substantive Variables?
Peytcheva, Emilia and Robert M. Groves. (2009) “Using Variation in Response Rates of Demographic Subgroups
as Evidence of Nonresponse Bias in Survey Estimates.” Journal of Official Statistics. 25(2): 193-201.
22
So When is Nonresponse Error a
Problem for a Given Survey?
• Difficult to know without assessing errors
through auxiliary studies
• Response rates often used as an indicator of
“risk” of nonresponse error
• Various indicators for risk of nonresponse
error have been proposed (Groves, et al.,
Survey Practice, 2008; Wagner, 2012;
Nishimura, Wagner and Elliott, 2015)
23
Wagner’s (2012) Typology
• Indicators involving the response indicator
• Indicators involving the response indicator
and frame data or paradata
– Nonresponse bias for estimates based on
variables available on sample
• Indicators involving the response indicator,
frame data or paradata, and the survey data
– Studying variation within the respondent set
24
[Diagram: data availability by response status. Sampling frame and paradata variables Z1 and Z2, and the response indicator R, are observed for every sampled unit; survey variables Y1 and Y2 are observed only for respondents (R = 1) and are missing for nonrespondents (R = 0).]
25
Goals of Course
• Course should provide you with tools
to examine nonresponse bias
• You can use the tools regardless of
response rates to obtain insights into
potential errors
26
Weights and Response Rates
• A base or selection weight is the inverse of the
probability of selection of the unit. The sum of all the
sampled units’ base weights estimates the population
total.
• When units are sampled using a complex sample design,
the AAPOR guidelines suggest using (base) weights to
compute response rates that reflect the percentage of the
sampled population that respond. Unweighted rates are
useful for other purposes, such as describing the
effectiveness of the effort.
• Weighted response rates are computed by summing the
units’ base weights by disposition code rather than
summing the unweighted counts of units.
• In establishment surveys, it is useful to include a measure of size (e.g., number of employees or students) to account for the unit's relative importance. The weight for computing response rates is the base weight times the measure of size.
27
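A minimal sketch in Python (disposition codes simplified; real AAPOR rates also handle unknown-eligibility and ineligible cases) contrasting unweighted and base-weighted response rates:

```python
# Each eligible sampled unit: (base_weight, disposition)
sample = [
    (1.0, "interview"), (1.0, "interview"), (1.0, "interview"),
    (10.0, "refusal"), (10.0, "interview"), (1.0, "refusal"),
]

resp = [(w, d) for w, d in sample if d == "interview"]

unweighted_rr = len(resp) / len(sample)                            # 4/6 = 66.7%
weighted_rr = sum(w for w, _ in resp) / sum(w for w, _ in sample)  # 13/24 = 54.2%
print(f"unweighted = {unweighted_rr:.1%}, base-weighted = {weighted_rr:.1%}")
```

The two rates diverge whenever high-weight units respond at different rates than low-weight units.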
Weights and Nonresponse Analysis
• A general rule is that weights should be used in
nonresponse analysis studies so that
relationships at the population level can be
examined. Guides for choosing the specific
weights to use are:
– Use base weights for nonresponse bias studies that
compare all sampled respondents and
nonrespondents. Weights adjusted for nonresponse
may be misleading in this situation.
– Use fully adjusted weights for nonresponse bias
studies that compare survey estimates with data from
external sources. One important exception is when the
survey weights are poststratified. In this case, weights
prior to poststratification are generally more appropriate.
28
Weighting Example
• Suppose a sample of schools is selected
from two strata:
– Stratum 1: 50% sampled and 60% respond
– Stratum 2: 20% sampled and 80% respond
• All the sampled schools can be
classified as urban or rural
• A nonresponse bias study examines the
estimate of percentage of schools that
are urban, by using the frame data on
urbanicity
29
Weighting Example (Cont.)
        Stratum     n    wt     r
Urban      1      150     2    90
           2       80     5    64
         Total    230         154
Rural      1      100     2    60
           2      420     5   336
         Total    520         396
• Estimates of percentage urban
Full response base weighted estimate:
(150*2+80*5)/(150*2+80*5+100*2+420*5)
= 700/3000=23.3%
30
Weighting Example (Cont.)
• Estimates of percentage urban
Respondent base weighted estimate:
(90*2+64*5)/(90*2+64*5+60*2+336*5)
= 500/2300=21.7%
bias = 21.7%-23.3%=-1.6%
Respondent unweighted estimate:
(90+64)/(90+64+60+336)
= 154/550=28.0%
bias = 28.0%-23.3%=4.7%
31
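The example above is easy to reproduce; a sketch in Python:

```python
# (domain, stratum, sampled n, base weight, respondents r)
rows = [
    ("urban", 1, 150, 2, 90), ("urban", 2, 80, 5, 64),
    ("rural", 1, 100, 2, 60), ("rural", 2, 420, 5, 336),
]

def pct_urban(idx):
    # Base-weighted percentage urban using column idx (2 = sampled n, 4 = respondents)
    urban = sum(row[idx] * row[3] for row in rows if row[0] == "urban")
    return urban / sum(row[idx] * row[3] for row in rows)

full = pct_urban(2)  # 700/3000 = 23.3%
resp = pct_urban(4)  # 500/2300 = 21.7%
unwt = sum(r[4] for r in rows if r[0] == "urban") / sum(r[4] for r in rows)  # 28.0%

print(f"base-weighted bias: {resp - full:+.1%}; unweighted bias: {unwt - full:+.1%}")
```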
Five Things You Should Remember
from the Short Course
1. The three principal types of nonresponse bias studies are:
   - Comparing surveys to external data
   - Studying internal variation within the data collection, and
   - Contrasting alternative postsurvey adjusted estimates
2. All three have strengths and weaknesses; using multiple approaches simultaneously provides greater understanding
3. Nonresponse bias is specific to a statistic, so separate assessments may be needed for different estimates
4. Auxiliary variables correlated with both the likelihood of responding and key survey variables are important for evaluation
5. Thinking about nonresponse before the survey is important because different modes, frames, and survey designs permit different types of studies
32
Nonresponse Bias Study Techniques
1. Comparison to other estimates (benchmarking)
2. Nonresponse bias for estimates based on variables available on sample
   2.1 Sampling frame
   2.2 External data matched to sample
   2.3 Observations taken during data collection
   2.4 Seeded sample
   2.5 Comparing response rates on subgroups
   2.6 Calculate R-indicators
3. Studying variation within the respondent set
   3.1 Use of screening and prior wave data collection
   3.2 Following up nonrespondents
   3.3 Two phase (double) sampling of nonrespondents
   3.4 Analyzing estimates by level of effort
   3.5 Analyzing estimates by predicted response propensity
   3.6 Mounting randomized nonresponse experiments
4. Altering the weighting or imputation adjustments
   4.1 Prepare estimates under different assumptions
   4.2 Adjust using alternative weights
   4.3 Calculate the Fraction of Missing Information
   4.4 Adjust using selection (Heckman) models
33
Data Used in Nonresponse Bias Studies
Types of data available (or that could be
produced in the study)
• Individual data for each population unit
• Individual data for each sampled unit
• Aggregate data for each sampled unit
• Individual data from the data collection process
• Individual data collected from a follow-up
34
1. Benchmarking
35
1. Comparison to Other Estimates: Benchmarking
• Data or estimates from another source
that are closely related to respondent
estimates may be used to evaluate bias
due to nonresponse in the survey
estimates
– Benchmarking
36
1. Benchmarking Survey Estimates
to those from Another Data Source
• Another survey or administrative record
system may contain estimates of variables
similar to those being produced from the
survey
• Example: Demographic characteristics from
the American Community Survey (Census),
number of persons graduating from high
school from the Common Core of Data (NCES)
• Difference between estimates from survey and
other data source is an indicator of bias (both
nonresponse and other)
37
1. How to Conduct a Nonresponse
Bias Benchmark Study
1. Identify comparison surveys with very high
response rates or administrative systems
that can produce estimates of variables
similar to key survey estimates
2. Assess major reasons why the survey
estimates and the estimates from the
comparison sources may differ
3. Compute estimates from the survey (using
final weights) and from the comparison
source to be as comparable as possible
(often requires estimates for domains)
4. The difference is an estimate of overall bias
38
1. Statistical Tests using
Benchmarking
• Using the respondent data, calculate the fully
adjusted respondent mean.
– Can also calculate the base weighted mean for
comparison.
• Calculate the appropriate (design-based) two
sample t-tests or chi-square test
– May need to account for sampling error in the
benchmark estimate
– Test the null hypothesis that the respondent mean
and benchmark mean are identical
39
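A minimal sketch in Python (the estimates and standard errors are hypothetical) of a test that allows for sampling error in both the survey and the benchmark:

```python
import math

survey_mean, survey_se = 0.62, 0.015  # fully adjusted respondent estimate
bench_mean, bench_se = 0.59, 0.008    # benchmark estimate and its SE

# z-test of H0: respondent mean equals benchmark mean, pooling the
# sampling variances of the two (assumed independent) sources
z = (survey_mean - bench_mean) / math.sqrt(survey_se**2 + bench_se**2)
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

If the benchmark comes from a census or administrative total, its SE can be set to zero and the test collapses to a one-sample test.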
Pros and Cons of Benchmark
Comparison to Estimate NR Bias
• Pros
– Relatively simple to do and often inexpensive
– Estimates from survey use final weights and are thus relevant
– Gives an estimate of bias that may be important to analysts
• Cons
– Estimated bias contains errors from the comparison source
as well as from the survey; this is why it is very important that
the comparison source be highly accurate
– Measurement properties are generally not consistent for
survey and comparison source; often is largest source of
error
– Item nonresponse in both data sets reduces comparability
40
Li, G., Schoeni, R.F., Danziger, S., and Charles, K.K. (2010). New
expenditure data in the PSID: Comparisons with the CE. Monthly
Labor Review, 133(2), 29-39.
• Purpose: Estimate dynamic aspects of economic and
demographic behavior
• Target population: U.S. individuals and family units
• Sample design: Longitudinal household survey started
in 1968 with oversample of low income households
• Mode of data collection: f-t-f originally, mainly phone
since 1972
• Response rate: 75% initially, attrition loss of 36% by
’80, 2-3% nonresponse each subsequent year
• Target estimate: Various expenditure measures
• Nonresponse error measure: Expansion of expenditure
questions in 1999 allows for comparison to Consumer
Expenditure (CE—Interview Survey; RR=80% in 2000)
population estimates overall and by age.
41
PSID Estimates Compared to CE
Interview Survey Estimates
Ratios of mean PSID expenditure to mean CE expenditure:
Expenditure category      1999    2001    2003
Total                     0.96    1.02    1.01
Total food                1.03    1.08    1.10
Total housing             0.94    1.00    0.97
Mortgage                  1.10    1.27    1.17
Rent                      0.96    0.96    0.96
Health care: Insurance    0.97    1.11    1.09
42
Conclusions
• In general, for the expenditures measured PSID
estimates are comparable to CE estimates.
• Broad categories align very well; some minor
differences at subcategory level.
• Cross-sectional “lifecycle” estimates (by age
category) are generally similar; exception is
difference in early 50s, primarily due to difference in
education expenditures.
• Limitation:
– Nonresponse is not the only reason for
differences in the estimates, and may not even be
the main source of the difference
43
Yeager, D.S., Krosnick, J.A., Chang, L., Javitz, H.S., Levendusky, M.S.,
Simpser, A., and Wang, R. (2011). “Comparing the Accuracy of RDD
Telephone Surveys and Internet Surveys Conducted with Probability and
Non-Probability Samples.” Public Opinion Quarterly. 75(4): 709-747.
• Purpose: Evaluate quality of two types of probability
samples and seven non-probability samples.
• Target population: U.S. adults
• Sample design: Varies over the surveys. Focus on
the RDD and probability internet panel samples
• Mode of data collection: phone and web
• Response rate: 35.6% (phone); 15.3% (probability
web)
• Target estimate: Demographic and non-demographic
estimates
• Nonresponse error measure: Difference between
benchmark and survey estimate before and after
post-stratification
44
Non-demographic comparisons,
probability samples only
[Four bar charts comparing benchmark, telephone, and internet estimates, without and with post-stratification:
- Nonsmoker (relative bias = -2%, -3%, -4%, -5%)
- Had 12 drinks in lifetime (relative bias = 9%, 9%, 12%, 11%)
- Has a driver's license (relative bias = 5%, 4%, 1%, -1%)
- Did not have a passport (relative bias = -15%, -11%, -4%, -3%)]
Source: Yeager, et al. (2011)
45
Conclusions
• Before poststratification, the telephone and
probability internet survey performed approximately
equivalently.
• After poststratification, the average difference
between the benchmark and the sample estimate is
slightly (but not significantly) smaller for the
telephone than the internet survey.
• Limitation:
– Nonresponse is not the only reason for
differences in the estimates, and may not even be
the main source of the difference.
– Data collected in 2004; may be different today.
46
2. Using Variables Available
on the Sample
47
Two types of approaches
• Compare estimates for respondents and nonrespondents – or respondents and full sample – on variables available for the full sample
• Evaluate variation in response rates /
response propensities over subgroups
defined by frame/paradata
48
2. Nonresponse bias for estimates based
on variables available on sample
2.1 Sampling frame
2.2 External data matched to sample
2.3 Observations taken during data collection
2.4 Seeded sample
2.5 Compare response rates for subgroups
2.6 Calculate R-Indicators
49
2.1 Using Sampling Frame Variables
• Sampling frame may contain variables on
target population that are correlated to those
being estimated in the survey
• Example: Length of membership on
organization list in study of member
attitudes, listed phone status in telephone
survey of residential mobility
• Difference between statistics from the full
sample and statistics from respondents-only
is an indicator of nonresponse bias
50
2.1 How to Conduct a Nonresponse Bias
Study Using Frame Data
1. Examine sampling frame to identify variables correlated to key survey estimates, keeping any that are potentially correlated
2. Only data for the full sample are needed, so efforts to process frame data may be restricted to the full sample if necessary
3. Compute statistics for the full sample and for respondents-only (using base weights); the difference is the estimated nonresponse bias
4. Classify nonrespondents by reasons (e.g., noncontact/refusal) and compute statistics for these groups to identify sources of bias
51
2.1 Statistical Tests using Sampling
Frame Variables
• Using the frame variables as the outcome (Y)
variable, calculate the respondent mean and
the nonrespondent mean.
– Can also calculate refusal and noncontact mean.
• Calculate the appropriate (design-based) t-tests or chi-square test of differences in
means or proportions between the
respondents and nonrespondents
– Test the null hypothesis that the respondent mean
and nonrespondent mean are identical
52
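A sketch (Python with SciPy; the frame variable and response mechanism are simulated) of the respondent/nonrespondent comparison; a production analysis would use a design-based test with the base weights:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 2000

# Frame variable known for the full sample (e.g., years on a membership list)
frame_y = rng.gamma(shape=4.0, scale=2.0, size=n)
# Response indicator weakly related to the frame variable
respond = rng.random(n) < np.clip(0.4 + 0.02 * (frame_y - 8.0), 0.05, 0.95)

t, p = stats.ttest_ind(frame_y[respond], frame_y[~respond], equal_var=False)
print(f"respondent mean = {frame_y[respond].mean():.2f}, "
      f"nonrespondent mean = {frame_y[~respond].mean():.2f}, p = {p:.4f}")
```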
2.1 Pros and Cons of Using Frame Data to
Estimate Nonresponse Bias
• Pros
– Measurement properties for the variables are consistent for
respondents and nonrespondents
– Bias is strictly due to nonresponse
– Provides data on correlation between propensity to respond
and the variables
• Cons
– Bias estimates are for the frame variables; only frame
variables highly correlated with the key survey statistics are
relevant
– The method assumes no nonresponse adjustments are
made in producing the survey estimates; if frame variables
are highly correlated, then they usually are used in
adjustment
– Item nonresponse on the frame reduces utility of variables
53
Groves, R. M., S. Presser, R. Tourangeau, B. T. West, M. P.
Couper, E. Singer and C. Toppe (2012). "Support for the
Survey Sponsor and Nonresponse Bias." Public Opinion
Quarterly 76(3): 512-524.
• Purpose: Estimate support for the March of Dimes
• Target population: Households in list maintained by
March of Dimes
• Sample design: Stratified random sample from rich frame
• Mode of data collection: Mail
• Response rate: 23% U of Michigan sponsor; 24% labor
force survey; 12% March of Dimes sponsor
• Target estimate: Lifetime total times volunteered (others
in article)
• Nonresponse error measure: Variables on list sampling
frame
54
Mean Lifetime total times volunteered
(Relative bias = 110%, 39%, 29%)
[Bar chart: mean number of times volunteered – full sample mean, respondent mean, and nonrespondent mean – by sponsor: March of Dimes, University of Michigan, Michigan Labor Force.]
55
Conclusions
• People who volunteer are more likely to
participate in a survey
– And bias in estimates greater when the
sponsor is related to volunteering
• Nonresponse error impact varies
across estimates and across sponsors
for the same estimate
• Limitation:
– Limited variables available on records
56
Tourangeau, R., Groves, R.M., and Redline, C.D. (2010).
Sensitive topics and reluctant respondents: Demonstrating a
link between nonresponse bias and measurement error.
Public Opinion Quarterly, 74(3), 413-432.
• Purpose: Voting status (Methodological experiment)
• Target population: Maryland residents who are
registered voters
• Sample design: Stratified samples of voters and
nonvoters (n=1,346 voters and n=1,343 nonvoters)
• Mode of data collection: Telephone or mail
(methodological experiment)
• Response rate: 34.3% telephone; 33.2% mail
• Target estimate: Voting status
• Nonresponse error measure: Comparison to voting
records
57
Voting Status
(Relative Bias = 22%, 30%)
[Bar chart: percentage voting in 2004 and in 2006 – truth, respondents, and nonrespondents.]
58
Conclusions
• Positive association between past
voting and survey participation; for
telephone cases, contact rates for
voters and nonvoters not significantly
different but cooperation rates highly
significantly different
• Limitation:
– Bias estimates limited to voting
information available from voting records
(not true current voting information)
59
2.2 Matching External Data to the Sample
• Sometimes administrative or other data
sets on the entire sample exist
• Example: employee records, pension
records, voting records
• Difference between record-based
statistics from the full sample and
statistics from respondents-only is an
indicator of nonresponse bias
60
2.2 How to Conduct a Nonresponse Bias
Study Using Matched Data
1. Locate an external data file covering the
entire sample
2. Match individual records between sampling
frame data and external data
3. Compute statistics for the full sample and
for respondents-only (using base weights);
the difference is the estimated
nonresponse bias
4. Classify nonrespondents by reasons (e.g.,
noncontact/refusal) and compute statistics
for these groups to identify sources of bias
61
2.2 Statistical Tests using Matched
Variables
• Using the matched variables as the outcome
(Y) variable, calculate the respondent mean
and the nonrespondent mean.
– Can also calculate refusal and noncontact mean.
• Calculate the appropriate (design-based) t-tests or chi-square test of differences in
means or proportions between the
respondents and nonrespondents
– Test the null hypothesis that the respondent mean
and nonrespondent mean are identical
62
2.2 Pros and Cons of Using Matched Data
to Estimate Nonresponse Bias
• Pros
– Measurement properties for the variables are consistent for
respondents and nonrespondents
– Bias is strictly due to nonresponse
– Provides data on correlation between propensity to respond
and the variables
• Cons
– Matching problems diminish the value of the estimates
– Bias estimates are for the external data variables; only
external data variables highly correlated with the key survey
statistics are relevant
– The method assumes no nonresponse adjustments are
made in producing the survey estimates; if external data
variables are highly correlated, then they usually are used in
adjustment
– Item nonresponse on the external data reduces utility of
variables
63
Lee et al. (2009). “Exploring Nonresponse Bias in
a Health Survey Using Neighborhood
Characteristics.” American Journal of Public
Health. 99:1811-1817
• Purpose: Estimate health characteristics of California
residents
• Target population: Adults in households in California
in 2005
• Sample design: RDD list-assisted sample
• Mode of data collection: CATI telephone
• Response rate: Screener RR4: 49.8%; Sampled Adult:
54.0%; Overall RR4 26.9%
• Target estimate: Various estimates of health
outcomes
• Nonresponse error measure: Merged information from telephone exchanges defined by census measures
64
% in census tract speaking only English at home and % below 100% of federal poverty level (Relative Bias = 2.3%; -2.1%)
[Bar charts: % speak only English at home; % below 100% of federal poverty level.]
65
Conclusions
• Differences between respondents and
nonrespondents for telephone exchange
level data based on census 2000 were
modest for 90 characteristics
• Much larger differences between noncontacts and ‘other’ nonrespondents and
respondents than between refusals and
respondents
• The exchange level data were used to impute
values for the actual survey variables, and
very minor differences were found
• Limitation:
– Limited predictive power at level of exchange
66
Parsons, N. L., & Manierre, M. J. (2014). Investigating the
Relationship among Prepaid Token Incentives, Response
Rates, and Nonresponse Bias in a Web Survey. Field
Methods, 26(2), 191-204.
• Purpose: Campus housing amenities and student
characteristics on academic outcomes
• Target population: Undergraduate freshman at
Eastern Connecticut State University
• Sample design: Sample of students who live on
campus from registrar
• Mode of data collection: Web questionnaire
• Response rate: 37.6% no incentive group; 49.4% incentive group
• Target estimate: Credits; GPA
• Nonresponse error measure: Comparison to student
academic records
67
Academic Experiences for Two Incentive Groups
(Relative bias = -1%, 13%, 0.4%, 14%)
[Two bar charts – No Incentive and $2 Prepaid Incentive – comparing truth, respondents, and nonrespondents on credits and GPA.]
68
Source: Parsons and Manierre (2014)
Conclusions
• Positive affect toward request from
school is a function of number of
credits enrolled and GPA
• Limitation:
– Bias estimates limited to variables on
records
69
Assael, H., and Keon, J. (1982). Nonsampling
vs. sampling errors in survey research. Journal
of Marketing, 46, 114-123.
• Purpose: Methodological study of telephone
customers
• Target population: Small businesses in 4 large cities
• Sample design: Subscriber sample of 1,579 business
customers
• Mode of data collection: Phone, mail, face to face,
and drop off
• Response rate: 57% average across several
recruitment methods; average 60% mail, 57% phone,
59% face to face, and 52% drop off, including item
missing data
• Target estimate: Phone bill, number of instruments
• Nonresponse error measure: Comparison to
company records
70
Monthly Billings and Number of Telephone Instruments
(Relative Bias = 12%, 27%, 35%, 4%; 6%, 21%, 17%, 12%)
[Two bar charts – monthly billings and number of instruments – comparing "truth", respondents, and nonrespondents by mode: mail, phone, face to face, drop off.]
71
Conclusion
• Nonrespondents tend to have smaller
use of telephone services
• No way of knowing whether common
cause is the size of the business
• Limitation:
– Bias estimates limited to variables on
records
72
2.3 Observations Taken on Respondents
and Nonrespondents During Data Collection
• Interviewers are sometimes asked to
make observations about sample units,
both respondents and nonrespondents
• When observations are correlated with
key survey variables they may be
informative of potential nonresponse
bias in the survey variables
73
2.3 How to Conduct a Nonresponse Bias
Study Using Observation Data
1. Identify attributes of sample cases that
a. Can be observed
b. Are related to response propensity or the survey
variables
2. Develop a measurement approach to
making the observations on both
respondents and nonrespondents
3. Compute statistics on those observations
for the full sample and for respondents-only (using base weights); the difference is
the estimated nonresponse bias
74
2.3 Statistical Tests using
Observation Variables
• Using the observation variables as the outcome (Y)
variable, calculate the respondent mean and the
nonrespondent mean.
– May not be able to calculate the mean separately for
refusals and noncontacts, depending on the observation.
• Calculate the appropriate (design-based) t-tests or
chi-square test of differences in means or
proportions between the respondents and
nonrespondents
– Test the null hypothesis that the respondent mean and
nonrespondent mean are identical
75
2.3 Pros and Cons of Using Observation
Data to Estimate Nonresponse Bias
• Pros
– Bias is strictly due to nonresponse
– Provides data on correlation between propensity to respond
and the variables
• Cons
– It is sometimes difficult to assure measurement properties
for the variables are consistent for respondents and
nonrespondents
– Bias estimates are for the observation variables; only
observation data variables highly correlated with the key
survey statistics are relevant
– The method assumes no nonresponse adjustments are
made in producing the survey estimates; if observation data
variables are highly correlated, then they usually are used in
adjustment
76
Lynn, P. (2003). PEDASKI: Methodology for
collecting data about survey nonrespondents.
Quality and Quantity, 37, 239-261.
• Purpose: Measurement of crime victimization
• Target population: UK household population
• Sample design: Multi-stage sample
• Mode of data collection: Face to face
• Response rate: 83.5%
• Target estimate: Interviewer observation variables on sample unit
• Nonresponse error measure: Difference between
respondents and nonrespondent sample units
on interviewer observations
77
Detached House, Entryphone at Door of
Sample Unit (Relative Bias = 5%, -10%)
[Two bar charts: percentage detached structures and percentage with entryphone at entrance – "truth", respondents, and nonrespondents.]
78
Conclusion
• Respondent households overestimate
prevalence of detached houses,
underestimate prevalence of units with
entryphones
• Limitation:
– These estimates are not key estimates of
the survey
79
Sastry, N. and Pebley, A.R. (2003). Nonresponse
in the Los Angeles Family and Neighborhood
Survey, RAND Working Paper Series 03-01
• Purpose: Measurement of neighborhood effects on
social and economic outcomes of adults and children
• Target population: Los Angeles County residents
• Sample design: Multi-stage sample
• Mode of data collection: Face to face
• Response rate: 85% for randomly selected adult
• Target estimate: Interviewer observation variables on
sample unit
• Nonresponse error measure: Difference between
respondents and nonrespondent sample units on
interviewer observations
80
Apartments, Rent <$500
(Relative Bias = 2%, -0.1%)
81
Conclusion
• Respondent households overestimate
prevalence of apartments, little bias on
estimated rent
• Limitation:
– These estimates are not key estimates of
the survey
82
West, B. T. (2013). An examination of the quality and utility of interviewer
observations in the National Survey of Family Growth. Journal of the
Royal Statistical Society: Series A (Statistics in Society), 176(1), 211-225.
• Purpose: Measurement of fertility experiences
• Target population: U.S. household population 15-44 years old
• Sample design: Multi-stage area probability
sample
• Mode of data collection: Face to face
• Response rate: 79%
• Target estimate: Lister observation variables on
sample unit
• Nonresponse error measure: Difference between
respondents and nonrespondent sample units on
lister observations
83
Odds ratios predicting main
interview response propensity
Interviewer observes…                         Adjusted Odds Ratio
Physical impediments to household             0.986
High main interview probability               1.559
Medium main interview probability             0.295
Low main interview probability                0.093
Does not report main interview probability    --
All HU's in segment residential               0.999
Safety concerns in segment                    1.018
Respondent sexually active                    1.923
Children under 15 years in HH                 1.184
84
Conclusions
• Interviewer observations significantly
improve the fit of the response
propensity model
– Also significantly predict important survey
variables
– But measurement error in the observations
increases RMSE of adjusted estimates
• Limitations
– Measurement error in the observations can
be studied only for respondents
85
2.4 Using a Seeded Sample
• Seed the sample by including units with
known characteristics to see if the units
respond at different rates (similar to use of
sampling frame data but for a subset of the
sample)
• Example: Seed the sample with persons who
are and are not members of an organization
• Difference in seeded sample response rates
by the characteristics used to estimate bias
86
2.4 How to Conduct a Seeded
Sample Nonresponse Bias Study
1. Find source with highly reliable characteristics related to key survey estimates, and include some units with and without the characteristic in the sample
2. Include the seeded sample in the survey operations, making sure the seeded units are handled just like all other units (e.g., interviewers are blinded to seeding)
3. Compute response rates by the characteristic. If the rates are similar then estimates correlated with the characteristic have small biases; if the rates are very different, then the estimates are subject to large biases depending on the response rate. Generally this analysis is done without survey weights and bias estimates are not produced
4. Examine source of nonresponse by tabulating response rates by reasons (e.g., noncontact/refusal) for the seeded sample by the characteristics
87
2.4 Pros and Cons of Using Seeded
Sample to Examine Bias
• Pros
– Measurement properties for the variables are consistent for
seeded sample
– Bias is strictly due to nonresponse
– Provides estimates of correlation between response
propensity and seeded sample characteristic
• Cons
– Estimates can only be produced for variables known from
the seeded sample and seeded sample is often small
– Bias estimates are usually not produced
– The method assumes no nonresponse adjustments are
made in producing the survey estimates
88
Groves, R., Presser, S., and Dipko, S. (2004). The role
of topic interest in survey participation decisions.
Public Opinion Quarterly, 68, 2-31.
• Purpose: Support of schools and education
• Target population: Adults in U.S. telephone
households; seeded sample of teachers
• Sample design: List-assisted RDD; stratified
sample of teacher list
• Mode of data collection: Telephone
• Response rate: Variable
• Target estimate: Sensitivity of teachers to
topic
• Nonresponse error measure: Comparison of response rates between teachers and RDD
89
Response Rates for Survey on Education and
the Schools for Teachers and Full RDD Sample
[Bar chart: response rates for the teacher sample vs. the full RDD sample.]
90
Conclusion
• Teachers respond at much higher rates
to survey on topic of potential interest
to them
• Limitation:
– No direct nonresponse bias estimate on key statistics; this is possible only with a frame containing such information
91
Montaquila, J.M., Williams, D., and Han, D. (2011). An
application of a two-phase address based approach to
sampling for subpopulations. Washington Statistical
Society Methodology Seminar.
• Purpose: Characteristics of children’s care and
education
• Target population: Infants, preschoolers, and
school-age children; seeded sample of addresses
identified to correspond to households with children
• Sample design: ABS; SRS from seeded sample
frame
• Mode of data collection: Primarily mail
• Response rate: (Next slide)
• Target estimate: Characteristics of care/education
• Nonresponse error measure: Comparison of
response rates between national ABS and seeded
sample
92
Response Rates for Seeded Sample of
Addresses Identified as Having Children and
National ABS Sample
(Screenout Version of Screener)
93
Conclusion
• Households with children respond at higher
rate to survey on topic of potential interest to
them
• Limitations:
– No direct nonresponse bias estimate on key statistics; possible only with a frame containing such information (but NHES used other approaches as well)
– Seeding is not clean; only 81% of “addresses with
children” enumerated eligible children
94
2.5 Comparing Response Rates on
Subgroups
• Since nonresponse bias results only when
subgroups with different characteristics have
different response rates, this approach
examines the response rate component
• Response rates are computed for subgroups,
often using frame data
• If the response rates are not substantially
different, there should not be a large
nonresponse bias in statistics for the groups
95
2.5 How to Conduct a Nonresponse Bias
Study Comparing Response Rates
1. Identify attributes of the sample that are
knowable for both respondents and
nonrespondents
2. Compute response rates for categories of
these attributes
3. These response rate differences provide
insights into possible nonresponse bias to
the extent the attribute variables are
correlated with the survey variables
96
2.5 Statistical Tests for Response
Rates on Subgroups
• Identify subgroups of interest using frame,
matched, or observation variables
• Calculate the response rate for each of these
groups
• Use t-tests or chi-square tests to evaluate
differences in the response rates across the
groups
• If there is no difference in response rates across the groups, then conclude no nonresponse bias related to that particular subgroup
97
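A sketch in Python (the counts are hypothetical) of the chi-square test of response-rate differences across frame subgroups:

```python
from scipy.stats import chi2_contingency

# Rows: frame subgroups; columns: [respondents, nonrespondents]
counts = [
    [90, 60],   # subgroup A, response rate 60%
    [336, 84],  # subgroup B, response rate 80%
]
chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.1f}, df = {dof}, p = {p:.4f}")
# A small p-value indicates response rates differ across subgroups,
# signaling potential bias on variables related to the grouping
```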
2.5 Pros and Cons of Response Rate
Analysis to Estimate Bias
• Pros
– Simple and inexpensive
– Provides some evidence about potential for bias
• Cons
– Does not provide direct estimate of nonresponse
bias
– Only limited variables (frame) examined
– Assumes no nonresponse adjustments are made
in producing the survey estimates
– Item nonresponse on the frame reduces utility of
variables
98
van Goor, H., and Stuiver, B. (1998). Can weighting compensate for
nonresponse bias in a dependent variable? An evaluation of
weighting methods to correct for substantive bias in a mail survey
among Dutch municipalities. Social Science Research, 27, 481-499.
• Purpose: Study implementation of policy on
caravan sites
• Target population: All Dutch municipalities
• Sample design: Two-stage sample of
municipalities from provinces
• Mode of data collection: Mail
• Response rate: 74%
• Target estimate: Characteristics related to policy
performance of municipalities
• Nonresponse error measure: Comparison of
response rates (other measures in study)
99
Response Rates by Municipality Size and
Associated Nonresponse Bias Estimate
(Relative bias = -16%)
[Two charts: response rate by municipality size in thousands (<5, 5-10, 10-20, 20-50, >50), and percentage of municipalities with less than 5,000 population – "truth", respondents, and nonrespondents.]
Assumes self-weighting sample of municipalities
100
Conclusion
• Response rates differ by municipality
size, which is correlated with policy
performance
• Limitation:
– Other measures available on survey used
to make judgments about nonresponse
bias
101
McFarlane, E., Murphy, J., Olmsted, M.G., Severance, J. (2010). The
effects of a mixed-mode experiment on response rates and
nonresponse bias in a survey of physicians. Presented at the Joint
Statistical Meetings
• Purpose: Physician survey is one component
used to arrive at U.S. News and World Report’s
“America’s Best Hospitals” rankings
• Target population: Board certified physicians
• Sample design: Stratified single-stage sample of
physicians (n=3,112)
• Mode of data collection: Mail only; mail with web
follow-up; mail and web
• Response rate: 48%
• Target estimate: Hospital rankings
• Nonresponse error measure: Comparison of response rates
102
Response Rates by Age (Mail-only Cases
Only) and Associated Nonresponse Bias
Estimate (Relative Bias = -8.8%)
103
Conclusion
• Response rates differ by age of
physician
• Limitation:
– Key outcomes of interest are the hospital
rankings (not age). However, to the extent
that hospital rankings vary by age of
physician, the finding reported above is
indicative of potential bias.
104
2.6 Calculate R-indicators
• Since nonresponse bias results only
when subgroups with different
characteristics have different response
rates, this approach also examines the
response rate component
• Examines variability in response
propensities across sampled persons
105
2.6 How to Calculate R-indicators
1. Obtain frame data with indicator for
respondent or nonrespondent
2. Using logistic regression, estimate a model
predicting the probability of being a
respondent as a function of frame variables
3. Obtain predicted probabilities from the model.
These are person-level estimates of response
propensity.
4. Calculate the base-weighted standard
deviation of the response propensities, S(p)
5. Calculate the R-indicator, R(p)=1-2S(p), where
R(p)=1 indicates strong representativeness
and R(p)=0 indicates weak representativeness
6. Can also calculate partial R-indicators to look at the influence of individual variables
106
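A sketch (Python with scikit-learn; the frame variables and response mechanism are simulated, and equal base weights stand in for the weighting in step 4) of the R-indicator calculation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 5000

# Steps 1-2: frame variables and the observed response indicator
X = np.column_stack([rng.integers(0, 5, n), rng.integers(0, 2, n)])
respond = rng.random(n) < 0.3 + 0.05 * X[:, 0]

# Steps 2-3: logistic response propensity model, predicted propensities
p_hat = LogisticRegression().fit(X, respond).predict_proba(X)[:, 1]

# Steps 4-5: R(p) = 1 - 2*S(p)
S_p = p_hat.std(ddof=1)
print(f"S(p) = {S_p:.3f}, R-indicator = {1 - 2 * S_p:.3f}")
```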
2.6 Pros and Cons of Using R-Indicators
• Pros
– Single number that is an indicator of how
representative the survey is on a variety of
characteristics
• Cons
– Value of R-indicator depends on which
variables are included in the propensity
model and whether interaction terms are
included
– Does not provide measure of nonresponse bias for individual survey estimates
107
Schouten, B., Cobben, F. and Bethlehem, J. 2009.
“Indicators for the representativeness of survey
response.” Survey Methodology. 35(1): 101-113.
• Purpose: Estimate unemployment rate in the
Netherlands
• Target population: All adults in the Netherlands
• Sample design: Area sample with subsample of
nonrespondents
• Mode of data collection: Face to face,
telephone, mail and web
• Response rate: 62.2% for main study, 76.9%
main with call back to nonrespondents, 75.6%
for main with shortened questionnaire for
nonrespondents
• Target estimate: Varies
• Nonresponse error measure: R-indicator
108
R-indicators calculated for two
stages of recruitment protocol
Labor Force Survey                     R-indicator    95% Confidence Interval
Main only                              80.1%          (77.5, 82.7)
Main + call back to nonrespondents     85.1%          (82.4, 87.8)
Main + shortened questionnaire         78.0%          (75.6, 80.4)
Schouten, Cobben and Bethlehem, Table 4
109
Conclusions
• Using the callback approach made the
sample more representative, but
shortening the questionnaire did not
• Values of R-indicators depend on the
variables included in the propensity
model
110
Lynn, P. (2013). Alternative Sequential Mixed-Mode Designs:
Effects on Attrition Rates, Attrition Bias, and Costs. Journal
of Survey Statistics and Methodology, 1(2), 183-205.
• Purpose: Estimate effect of mode change in UK
Household Longitudinal Study
• Target population: All households in the United Kingdom
• Sample design: Area sample
• Mode of data collection: Face to face, telephone
• Response rate: 73.9%, 65.2%, 57.1% Wave 2, 3, 4 face-to-face; 65.6%, 59.8%, 54.0% Wave 2, 3, 4 mixed mode
– Telephone light = start with phone; switch to F2F when the interviewer indicated it was needed; telephone heavy = switch to F2F only at last possible moment
• Target estimate: Varies
• Nonresponse error measure: R-indicator
111
R indicators – Wave 4 response
using Wave 1 data
                                 7 covariates    19 covariates
Face to face                     0.727           0.662
Mixed mode – telephone light     0.731           0.684
Mixed mode – telephone heavy     0.668           0.651
112
Conclusions
• Although response rates differ, there is little difference in composition across single-mode vs. mixed-mode designs
• Limitations
– R-indicators limited to variables included
– Not clear what would happen in first wave
of study
113
3. STUDYING VARIATION WITHIN
THE RESPONDENTS
114
3. Studying Variation within the
Respondent Set
• Data collected as part of the survey or as an addition to it may be used to evaluate bias due to nonresponse in the survey estimates
3.1 Use of screener and prior wave data
3.2 Following up nonrespondents
3.3 Two phase (double) sampling of
nonrespondents
3.4 Analyzing estimates by level of effort
3.5 Analyzing estimates by predicted response
propensity
3.6 Mounting randomized nonresponse experiments
115
3.1 Screening or Prior Wave Data as
a Nonresponse Bias Study
• Some surveys screen the sample, and then conduct more detailed interviews with all or a subsample of the screened units (e.g., one adult reports for all adults in the household and one adult is then sampled for detailed data)
• Longitudinal surveys have data available on some sample cases from prior waves of data collection
• Often the final weights are for the second stage respondents and use screener data to adjust weights for nonresponse. To estimate bias this adjustment must be eliminated.
116
3.1 How to Conduct a Nonresponse Bias
Study Using Screener or Prior Wave Data
1. Try to maximize the screening or prior wave
response rate
2. Collect data on variables correlated with
the survey variables during the screening
or prior wave
3. Compute statistics for the full sample and
for main or current wave survey
respondents-only (using base weights); the
difference is the estimated nonresponse
bias on the screener or prior wave
variables
117
3.1 Statistical Tests using Screener
or Prior Wave Data
• Using the screener or prior wave variables as
the outcome (Y) variable, calculate the
respondent mean and the nonrespondent
mean.
– Can also calculate refusal and noncontact mean.
• Calculate the appropriate (design-based) t-tests or chi-square test of differences in
means or proportions between the
respondents and nonrespondents
– Test the null hypothesis that the respondent mean
and nonrespondent mean are identical
118
3.1 Pros and Cons of Screener Data
to Estimate Nonresponse Bias
• Pros
– Measurement properties for the variables are
consistent for respondents and nonrespondents
– Bias is strictly due to nonresponse
– Provides data on correlation between propensity
to respond and the variables
• Cons
– Nonresponse in the screening step reduces the
scope of the sample that the bias estimates
describe
– Bias estimates are for the screener data variables;
only screener data variables highly correlated
with the key survey statistics are relevant
119
Zabel, J. (1998). An analysis of attrition in the Panel Study of
Income Dynamics and the Survey of Income and Program
Participation with an application to a model of labor market
behavior. Journal of Human Resources, 33, 479-506.
• Purpose: Measure labor market and income change
over time
• Target population: U.S. household population
• Sample design: SIPP is 8-wave (every 4 months)
panel from area probability design
• Mode of data collection: Phone, face to face
• Response rate: 92.4% in wave 1; 73.4% for all 8
waves
• Target estimate: Household income
• Nonresponse error measure: Comparison of those
completing 8 waves with those who attrited
sometime before wave 8
120
Mean Household Income in Wave 1 for SIPP
1990 Respondents by Attrition Status
(Relative bias = 4%)
[Bar chart: mean wave 1 household income for wave 1 respondents, non-attritors, and attritors.]
*Assuming weighting of Nonattritors and Attritors proportionate to
unweighted case counts
121
Conclusions
• Full panel respondents have higher
household incomes at wave 1 than
those dropping out of the panel after
wave 1
• Limitations:
– Estimate of nonresponse bias limited to
wave 1 data
– Estimate of nonresponse bias does not
include component due to wave 1
nonresponse
122
Abraham, K.G., Maitland, A., and Bianchi, S.M. (2006).
Nonresponse in the American Time Use Survey: Who is missing
from the data and how much does it matter? Public Opinion
Quarterly, 70, 676-703.
• Purpose: Measure how Americans use their time
• Target population: U.S. civilian noninstitutionalized
population ages 15+
• Sample design: Random selection from households
completing 8th wave (“month-in-sample”) of the
Current Population Survey
• Mode of data collection: Phone
• Response rate: About 94% for CPS 8th month-in-sample interview; 56% response rate for 2004 ATUS
• Target estimate: Time spent on various activities
• Nonresponse error measure: Comparison of ATUS respondents to ATUS sampled cases (8th MIS CPS respondents)
123
Percentage Who Work 45+ Hours/Week,
Percentage Who Rent
(Relative Bias = 9%, -20%)
• Estimates computed by applying unweighted sample sizes to tabulated response rates.
124
Conclusions
• Results “offer little support for hypothesis that busy
people are less likely to respond.”
• “Consistent and significant differences in response
rates across groups…seem to [support] the ‘social
integration’ hypothesis.”
• Limitation:
– Estimate of nonresponse bias limited to CPS 8th
MIS respondents
125
3.2 Followup of Nonrespondents
• This is one of the most common
techniques
• Use of respondent data obtained
through extra-ordinary efforts as
comparison to respondent data
obtained with traditional efforts
• “Effort” may include callbacks,
incentives, change of mode, use of elite
corps of interviewers
126
3.2 How to Do a Nonrespondent
Followup Study
1. Define a set of recruitment techniques
judged to be superior to those in the
ongoing effort
2. Determine whether budget permits use of
those techniques on all remaining active
cases
• If not, implement 2nd phase sample (described later)
3. Implement enhanced recruitment protocol
4. Compare respondents obtained in
enhanced protocol with those in the initial
protocol
127
3.2 Statistical Tests using
Nonresponse Follow-up Studies
• Using the nonresponse follow-up variables as the
outcome (Y) variable, calculate the main study
respondent mean and the nonresponse follow-up
respondent (proxy nonrespondent) mean.
• Calculate the appropriate (design-based) t-tests
or chi-square test of differences in means or
proportions between the main study respondents
and nonresponse follow-up respondents
– Test the null hypothesis that the main study
respondent mean and nonresponse follow-up mean are
identical
128
3.2 Pros and Cons of Nonresponse
Followup Study
• Pros
– Direct measures are obtained from
previously nonrespondent cases
– Same measurements are used
– Nonresponse bias on all variables can be
estimated
• Cons
– Rarely are followup response rates 100%
– Requires extended data collection period
129
Criqui, M., Barrett-Connor, E., and Austin, M. (1978).
Differences between respondents and nonrespondents in a population-based cardiovascular
disease study. American Journal of Epidemiology, 108,
367-372.
• Purpose: Various health related attributes
• Target population: 30-79 years old residents of
planned community in California
• Sample design: Telephone directory, roster of
community center membership, census maps
• Mode of data collection: Telephone recruitment, face
to face interview
• Response rate: 82.1%
• Target estimate: Heart failure in relative and self
• Nonresponse error measure: Followup of
nonrespondents using shortened questionnaire,
yielding about 60% answering question
130
h"
R
es
po
nd
N
en
on
ts
re
sp
on
de
nt
s
3.5
3
2.5
2
1.5
1
0.5
0
"T
ru
t
% Hospitalization for Heart Failure
h"
R
es
po
nd
N
en
on
ts
re
sp
on
de
nt
s
45
40
35
30
25
20
15
10
5
0
"T
ru
t
% Heart Attack in Relative
Percentage Reporting Heart-Related Events in Survey
of Cardiovascular Disease, Females
(Relative Bias = 7%, -33%)
Criqui, M.H., Barrett-Connor, E. and Austin, M. (1978).
Differences between respondents and non-respondents in a population-based cardiovascular
disease study. American Journal of Epidemiology, 108(5), 367-372.
131
Conclusions
• Respondents overestimate extent of
family history with heart failure but
underestimate prevalence of heart-related hospitalization
• Limitation:
– Followup efforts do not measure all
nonrespondents
132
Matsuo, et al. (2010). ‘Measurement and adjustment of
nonresponse bias based on nonresponse surveys: the case of
Belgium and Norway in the European Social Survey Round 3’.
Survey Research Methods. 4: 165-178.
• Purpose: Measure attitudes and beliefs throughout
Europe
• Target population: Adults in Belgium and Norway
• Sample design: Area probability sample
• Mode of data collection: Face to Face
• Response rate: Belgium: 61% main survey; 44.7%
nonresponse follow-up; Norway: 64% main survey;
30% nonresponse follow-up
• Target estimate: Social participation and
neighborhood security
• Nonresponse error measure: Use short questionnaire
on nonrespondents
133
Percentage who feel very safe, participate in
social activities much less than most, and are
not at all interested in politics
(Relative Bias = 3.4%, -23.4%, -30.7%)
Table 3 in Matsuo, et al. (2010)
134
Conclusions
• Short questionnaire to nonresponding
schools and districts was successful in
obtaining characteristics of schools useful
for nonresponse adjustments
• The estimates before nonresponse
adjustment had significant but small biases
• Only one school statistic had a significant
bias after the nonresponse adjustment
• Limitation:
– Bias estimates do not reflect nonresponse on the
followup questionnaire
135
3.3 Two Phase (Double) Sampling
for Nonrespondents
• This is a form of nonrespondent followup,
limited to a subset of remaining
nonrespondents
• The first “phase” is the original selection; the
second “phase” is the subsampling of cases
not measured in the first recruitment efforts
• Attractive option when first phase is
relatively cheap (but low response rate) and
second phase is expensive (with very high
response rates)
• Data from the two phases are combined using weights that are reciprocals of the products of first phase and second phase selection probabilities
136
3.3 How to Conduct a Two Phase
Sample of Nonrespondents
1. Define second phase recruitment protocol
to be attractive to nonrespondents of first
phase
2. Determine sampling fraction fitting budget
available
3. Implement probability sample of remaining
nonrespondents
4. Combine first phase and second phase
respondent data reflecting both selection
probabilities
137
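A sketch in Python (selection fractions and responses are hypothetical) of step 4, combining the two phases with weights that are reciprocals of the products of the phase selection probabilities:

```python
p1 = 0.10  # phase 1 (original) selection probability
p2 = 0.25  # phase 2 subsampling fraction among remaining nonrespondents

phase1_y = [12.0, 15.0, 11.0]  # y values from initial-protocol respondents
phase2_y = [20.0, 22.0]        # y values from the follow-up subsample

# Weighted mean: phase 1 weight = 1/p1, phase 2 weight = 1/(p1 * p2)
num = sum(y / p1 for y in phase1_y) + sum(y / (p1 * p2) for y in phase2_y)
den = len(phase1_y) / p1 + len(phase2_y) / (p1 * p2)
print(f"combined two-phase estimate: {num / den:.2f}")  # vs. 12.67 from phase 1 alone
```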
3.3 Statistical Tests using Two Phase
Sample Design
• Using the survey variables as the outcome (Y)
variable, calculate the 1st phase respondent
mean and the 2nd phase (nonrespondent) mean.
• Calculate the appropriate (weighted design-based) t-tests or chi-square test of differences
in means or proportions between the 1st phase
respondents and 2nd phase respondents
– Test the null hypothesis that the 1st phase respondent
mean and 2nd phase respondent mean are identical
138
3.3 Pros and Cons of a Two Phase
Sample of Nonrespondents
• Pros
– With 100% response rate on second phase,
design removes nonresponse error
– Explicitly deals with costs of nonrespondent
followup by permitting application of powerful
recruitment protocol to a subset
• Cons
– Rarely are 100% response rates obtained
– Standard errors of estimates are generally inflated
due to differential weighting due to second phase
selections
139
Peytchev, A., Carley-Baxter, L. R., & Black, M. C. (2011). Multiple
Sources of Nonobservation Error in Telephone Surveys: Coverage and
Nonresponse. Sociological Methods & Research, 40(1), 138-168.
• Purpose: Estimate intimate partner violence
• Target population: Civilian noninstitutionalized U.S.
population
• Sample design: Landline telephone numbers
• Mode of data collection: Face to face
• Response rate: 28.5%, followed by double sample,
raising weighted response rate to about 35.5%
• Target estimate: Percentage of women reporting that
they had one or more sexual experiences
• Nonresponse error measure: Respondents to double
sample nonresponse follow-up
140
Probability Weighted Estimates for
National Intimate Partner and Sexual Violence
Survey Pilot
[Bar chart: probability-weighted estimates for respondents and nonrespondents – stalking, sexual violence, and physical aggression, separately for women and men.]
141
Conclusions
• Double sample introduced prepaid and promised
incentives and a shorter questionnaire
• Double sample brought into the respondent pool
older sample persons, with less aggression
• Limitation:
– Bias associated with nonresponse on 2nd phase sample not
reflected
142
Dallosso, H., Matthew, R., McGrother, C., Clarke, M., Perry, S.,
Shaw, C., and Jagger, C. (2003). An investigation into
nonresponse bias in a postal survey on urinary symptoms. British
Journal of Urology, 91, 631-636.
• Purpose: Estimate symptoms of urinary
incontinence
• Target population: Adults in Leicestershire
• Sample design: 55,527 Adults, age 40 or more, on
Leicestershire Health Authority list
• Mode of data collection: Mail questionnaire
• Response rate: 63.3% on mail questionnaire (49% in
followup)
• Target estimate: Various urinary incontinence
symptoms
• Nonresponse error measure: Comparison of early to
late responders; followup of double sample of 1,050
nonresponders by face to face interviews
143
Variable                   Prevalence statistic    Odds ratio (NR to R)    Odds ratio (late to early)
                                                   (Face to Face to Mail)  (Mail Only)
Urinary leakage            Several times a month   1.11                    0.93
Stress leakage             Several times a month   1.33                    1.08
Urge leakage               Several times a month   1.19                    1.05
Freq. of micturition       Hourly or more          1.44                    0.94
Nocturia                   3 times/night or more   1.14                    0.87
Freq. of strong urge       Several times a month   1.09                    0.84
Mean                                               1.22                    0.95
144
Conclusions
• Late respondents to mail questionnaire
slightly less likely to have urinary
incontinence symptoms; respondents to
double sample face to face interview more
likely
• Limitations:
– Two modes complicate estimates (however,
underreporting bias generally higher with
interviewer)
– Nonresponse rate in double sample measurement
complicates estimates (49% of eligibles)
145
3.4 Analyzing Estimates by Level of
Effort
• Some nonresponse models assume that
those units that require more effort to
respond (more callbacks, incentives, refusal
conversion) are similar to the units that do
not respond
• Characteristics are estimated for respondents by level of effort
• Models are fitted to see whether the relationship holds and can be used to estimate characteristics of nonrespondents
146
3.4 How to Analyze Level of Effort
1. Associate level of effort data to
respondents (e.g., number of callbacks,
ever refused, early or late responder)
2. Compute statistics for each level of effort
separately (usually unweighted or base
weights only)
3. If there is a (linear) relationship between level of effort and the statistic, one may decide to extrapolate to estimate the statistic for those that did not respond
4. Often more appropriate to do the analysis
separately for major reasons for
nonresponse
147
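A short sketch of steps 2-3 (the data frame, trend, and effort variable are invented for illustration; a real analysis would apply base weights and check linearity before extrapolating):

import numpy as np
import pandas as pd

rng = np.random.default_rng(2024)
df = pd.DataFrame({"calls": rng.integers(1, 7, 600)})
df["y"] = 20 + 0.8 * df["calls"] + rng.normal(0, 4, 600)   # fake trend

by_effort = df.groupby("calls")["y"].mean()   # statistic per level of effort
print(by_effort)

# If the trend looks linear, extrapolate one step past the maximum
# observed effort as a rough stand-in for nonrespondents
slope, intercept = np.polyfit(by_effort.index, by_effort.values, 1)
print("Extrapolated nonrespondent mean:", intercept + slope * 7)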
3.4 Statistical Tests using Level of
Effort Data
• Using the variables reported in the survey as the
outcome (Y) variable, calculate the mean for each
successive level of effort.
• Calculate the appropriate (design-based) t-tests or chi-square tests of differences in means or proportions between the levels of effort
– Test the null hypothesis that the respondent means across levels of effort are identical
• Estimate a linear regression predicting the survey
variable with the number of follow up attempts
– Use logistic regression as appropriate for dichotomous
outcomes
– Test the hypothesis of a linear trend in estimates across
levels of effort
148
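The trend test is a short regression; a sketch on simulated data (statsmodels is assumed available, and the outcome and effort measure are illustrative):

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(8)
df = pd.DataFrame({"calls": rng.integers(1, 9, 500)})
df["y"] = 30 + 0.5 * df["calls"] + rng.normal(0, 5, 500)

# Linear trend in the survey variable across levels of effort
ols = sm.OLS(df["y"], sm.add_constant(df["calls"])).fit()
print(ols.params["calls"], ols.pvalues["calls"])

# Logistic regression for a dichotomous outcome
df["y01"] = (df["y"] > df["y"].median()).astype(int)
logit = sm.Logit(df["y01"], sm.add_constant(df["calls"])).fit(disp=0)
print(logit.params["calls"], logit.pvalues["calls"])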
3.4 Pros and Cons of Using Level of
Effort Analysis to Estimate Bias
• Pros
– Simple to do, provided data collection systems
capture the pertinent information
– In some surveys may provide a reasonable
indicator of the magnitude and direction of
nonresponse bias
• Cons
– Highly dependent on model assumptions that
have not been validated in many applications
– Difficult to extrapolate to produce estimates of
nonresponse bias without other data
149
Lin, I.-F., and Schaeffer, N. (1995). Using survey
participants to estimate the impact of nonparticipation.
Public Opinion Quarterly, 59, 236-258.
• Purpose: Child support awards and payments
• Target population: Divorce cases with child-support
eligible children in 20 Wisconsin counties
• Sample design: Element sample of court cases
• Mode of data collection: Telephone
• Response rate: 69.7% mothers, 57% fathers
• Target estimate: Child support dollars owed
• Nonresponse error measure: Comparison to court
records for respondents and nonrespondents
150
[Figure: Mean dollars owed in child support (nonresident mothers), by number of calls to interview, with refusals and noncontacts shown separately.]
Lin, I.-F., and Schaeffer, N. (1995). Using survey participants to estimate the impact of
nonparticipation. Public Opinion Quarterly, 59, 236-258.
151
Conclusion
• Mean amount owed in child support for
refusals and noncontacts differ from each
other and from those completing the
interview
• The relationship between the number of calls
and the amount owed is not clear, so little
evidence that nonrespondents are like those
who respond with more effort
• Limitation:
– Limited variables on records; only officially
documented payments on records
152
Curtin, R., Presser, S., and Singer, E. (2000). The effects
of response rate changes on the index of consumer
sentiment. Public Opinion Quarterly, 64, 413-428.
• Purpose: Estimate consumer confidence
index
• Target population: Adults in U.S. telephone
households
• Sample design: List-assisted RDD
• Mode of data collection: Telephone
• Response rate: Variable between 68-71%
• Target estimate: Consumer confidence index
• Nonresponse error measure: Comparison of
early to late responders
153
Correlations of ICS Quarterly Estimates
between “All-call Design” and Restricted Call
Design
Correlations between statistics based on quarterly estimates:

                    All calls to          All calls to   Number of
                    initial cooperators   1-5 calls      quarters
Level estimates     .979                  .980           70
Change estimates    .772                  .781           69
Curtin, Presser, and Singer (2000)
154
Conclusions
• Little effect of additional efforts on level
of index
• Somewhat greater effects on change
estimates of index
• Limitation:
– No reflection of bias arising from
nonresponse after full effort
155
Kreuter, F., G. Müller and M. Trappmann (2010). "Nonresponse and
Measurement Error in Employment Research: Making Use of
Administrative Data." Public Opinion Quarterly 74(5): 880-906.
• Purpose: Estimate labor market participation
• Target population: Adults in German households
• Sample design: Dual frame – residential
addresses + register of benefit recipients
• Mode of data collection: Telephone and in person
• Response rate: 28.7% register; 24.7% population
• Target estimate: Benefit receipt
• Nonresponse error measure: Estimates over
number of call attempts
156
Nonresponse Bias for Welfare Benefit Receipt, Employment Status, and
Foreign Status, Over Contact-level Strata, Cumulative

Stratum (cumulative)    Welfare benefits   Employed   Foreign
1-2 contact attempts    89.4%              20.8%      4.6%
3-5                     88.5%              23.3%      5.2%
6-8                     87.9%              24.2%      5.4%
9-15                    86.0%              25.2%      5.7%
>15 contact attempts    84.4%              26.3%      5.8%
Transfer to CAPI        84.8%              26.2%      6.2%
Refusal conversion      84.7%              26.3%      6.1%
Truth                   81.3%              26.2%      10.1%

Source: Kreuter, et al. (2010) Table 2.
157
Conclusions
• Variation of estimates over calls to first
contact is sensitive to at-home patterns
of persons having different values on
estimates
• Limitation:
– No reflection of bias remaining in the total
data set
158
3.5 Analyzing Estimates by
Predicted Response Propensity
• The stochastic model for nonresponse
shows that nonresponse bias of a mean is a
function of the correlation between response
propensity and the survey variable of interest
• Response propensity models are estimated
and predicted propensities are obtained
• Examine the correlation between the
predicted propensity and Y or changes in the
sample estimate over groupings of response
propensity
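In symbols, a standard approximation from the stochastic nonresponse literature, where p is the unit-level response propensity and p-bar its mean:

Bias(y-bar_r) ≈ Cov(p, y) / p-bar

so the bias of the respondent mean vanishes only when propensity and the survey variable are uncorrelated.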
159
3.5 How to Analyze Estimates by
Predicted Response Propensity
1. Obtain frame data with an indicator for respondent or nonrespondent
2. Using logistic regression, estimate a model predicting the probability of being a respondent (1) versus a nonrespondent (0) as a function of frame variables
3. Obtain predicted probabilities from the model. These are person-level estimates of response propensity.
4. Estimate the association between the predicted propensity and your survey variable of interest using quintiles of the propensity distribution or correlations between the predicted propensity and the survey variable (see the sketch below).
5. Often more appropriate to do the analysis separately for major reasons for nonresponse
160
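Steps 2-4 in a compact sketch (the frame variables, coefficients, and sample size are invented; any logistic regression routine would serve):

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 2000
frame = pd.DataFrame({"urban": rng.integers(0, 2, n),
                      "age": rng.integers(18, 90, n)})
xb = -0.5 + 0.4 * frame["urban"] + 0.01 * frame["age"]
frame["resp"] = (rng.random(n) < 1 / (1 + np.exp(-xb))).astype(int)

# Step 2: model response as a function of frame variables
X = sm.add_constant(frame[["urban", "age"]])
model = sm.Logit(frame["resp"], X).fit(disp=0)

# Step 3: person-level predicted response propensities
frame["phat"] = model.predict(X)

# Step 4: quintiles of the propensity distribution
frame["quintile"] = pd.qcut(frame["phat"], 5, labels=False) + 1
print(frame.groupby("quintile")["phat"].mean())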
3.5 Statistical Tests using Response
Propensity Groups
• Using logistic regression, obtain predicted
probabilities of participation for all respondents and
nonrespondents
• Divide the predicted probabilities into discrete groups (usually quintiles)
• Using the variables reported in the survey as the
outcome (Y) variable, calculate the mean for each
quintile.
• Calculate the appropriate (design-based) t-tests or chi-square tests of differences in means or proportions between the propensity quintiles
– Test the null hypothesis that the respondent means across quintiles are identical
161
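A sketch of the quintile comparison itself, with simulated respondent data and a simple F-test standing in for the design-based test:

import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
q = rng.integers(1, 6, 900)                  # propensity quintile, 1..5
y = 50 - 1.5 * q + rng.normal(0, 8, 900)     # survey variable (illustrative)

groups = [y[q == k] for k in range(1, 6)]
print([round(float(g.mean()), 1) for g in groups])   # mean by quintile
print(stats.f_oneway(*groups))                       # test of equal means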
3.5 Pros and Cons of Using Predicted
Response Propensities to Estimate Bias
• Pros
– May use this procedure already to create
nonresponse adjustment weights
– In some surveys may provide a reasonable
indicator of the magnitude and direction of
nonresponse bias
• Cons
– Highly dependent on model assumptions and
parameterization
– Assumes the correlation between the survey
variables and propensity are the same among
respondents and nonrespondents
162
Olson, K. (2006) “Survey Participation, Nonresponse
Bias, Measurement Error Bias, and Total Bias.” Public
Opinion Quarterly. 70(5): 737-758.
• Purpose: Estimate characteristics of divorce and
remarriage
• Target Population: Divorced couples
• Sample Design: Simple random sample of divorce
records in four counties in Wisconsin
• Mode of Data Collection: Telephone and Mail
• Response Rate: 71%
• Target Estimate: Mean length of marriage
• Nonresponse error Measure: Change in estimated mean
length of marriage over quintiles of estimated
cooperation propensity
163
Mean length of marriage in months by quintile
of estimated cooperation propensity
Olson, 2006, Figure 2
164
Conclusions
• Respondent mean is moving closer to
the target (‘truth’) as low propensity
respondents are brought into the
sample pool
• No clear difference for this statistic in
the pattern for record values or
reported values
165
Dahlhamer, J.M. & Simile, C.M. (2009). “Subunit Nonresponse
in the National Health Interview Survey (NHIS): An
Exploration Using Paradata.” Proceedings of the Government
Statistics Section of the American Statistical Association.
• Purpose: Estimate health characteristics of U.S.
population
• Target Population: Adults in the U.S. age 18+
• Sample Design: Multi-stage area probability sample
• Mode of Data Collection: Face-to-face
• Response Rate: 87% for household and family, 68%
for sample adult
• Target Estimate: Variety of estimates
• Nonresponse error Measure: Change in the estimate
of vaccines over five response propensity quintiles
166
Estimates by propensity quintile
[Figure: % who received the influenza vaccine during the past 12 months, and % who engage in regular leisure-time physical activity, by response propensity quintile.]
Dahlhamer and Simile, 2009, Figures 2 and 4
167
Conclusion
• Unadjusted estimates may
overestimate prevalence estimates of
some diagnoses and access or use of
health care, but not for health
behaviors
• Paradata improved propensity model fit
• Explore expanding NHIS adjustments
using propensity models
168
3.6 Randomized Experiments
Affecting Nonresponse Rates
• Some survey features have been shown to
affect response rates
– incentives
– sponsorship
– main topic
• The full sample can be divided at random
into subsamples, each assigned a different
such feature
• Estimates of key statistics can be measured
on subsamples
169
3.6 How to Mount Randomized
Nonresponse Experiments
1. Choose a design feature expected to affect
response rates and hypothesized to affect
nonresponse bias
2. Mount randomized experimental design,
assigning different design features to
different random subsamples
3. Compare response rates and key survey
estimates among experimental treatment
groups
170
3.6 Statistical Tests using Survey
Experiments
• Using the variables reported in the survey as the
outcome (Y) variable, calculate the mean for each
experimental condition.
• Calculate the appropriate (design-based) t-tests, ANOVAs, or chi-square tests of differences in means or proportions between the experimental conditions
– Test the null hypothesis that the respondent means across experimental conditions are identical
171
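A sketch of both comparisons (all counts and outcome values are hypothetical, and a real analysis would use survey-weighted procedures):

import numpy as np
from scipy import stats

# Response-rate comparison across two experimental conditions
counts = np.array([[540, 460],    # incentive: respondents, nonrespondents
                   [480, 520]])   # control:   respondents, nonrespondents
chi2, p, dof, _ = stats.chi2_contingency(counts)
print(chi2, p)

# Key-estimate comparison among respondents in each condition
rng = np.random.default_rng(3)
y_incentive = rng.normal(52, 10, 540)
y_control = rng.normal(50, 10, 480)
print(stats.ttest_ind(y_incentive, y_control, equal_var=False))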
3.6 Pros and Cons of Nonresponse-Related Experiments
• Pros
– Pure effect of design feature on response rates is
obtained
• Cons
– Expensive to mount multiple protocols
– Without some external benchmark, which
treatment is preferred is questionable
– Method offers potentially lower nonresponse bias
in one treatment but higher biases in other
treatments, losing an opportunity to reduce bias
in full sample
172
Merkle, D.M. and Edelman, M. (2009). An Experiment on
Improving Response Rates and Its Unintended Impact
on Survey Error. Survey Practice. 2(3).
• Purpose: Evaluate voting behavior in exit poll
• Target population: New Jersey and New York City general
election voters
• Sample design: Voting precincts with systematic sample
of voters
• Mode of data collection: Paper questionnaire handed out
by interviewer
• Response rate: Eye-catching interviewer folder: 54.2%; Folder + Voter News Service pen for respondent: 55.4%; Traditional: 49.9%
• Target estimate: Vote signed and absolute error [(Exit
poll Dem % - Rep %) – (Official Dem % - Rep %)]
• Nonresponse error measure: Comparison of signed and absolute error by treatment group
173
Signed Error by Treatment Condition
[Figure: Signed and absolute error for each treatment condition: Folder/Pen, Folder, and Traditional.]
Merkle and Edelman (2009)
174
Conclusions
• Folder increased response rates, but
also increased error
– Folder significantly overrepresented
Democratic candidate voters; traditional
slightly overrepresented Republican
voters
• Limitation:
– Implemented only in New Jersey and New
York City
175
Groves, R.M., Couper, M.P., et al. (2006) “Experiments
in Producing Nonresponse Bias,” Public Opinion
Quarterly, 70(5): 720-736.
• Purpose: Attitudes toward birding or mall design
• Target population: members of the American Birding
Association or general population adults
• Sample design: random sample from the ABA
members or purchased from commercial vendor
• Mode of data collection: Mail
• Response rate: variable by topic of questionnaire
• Target estimate: Participating in bird watching
activities
• Nonresponse error measure: Comparison of
response rates by survey topic treatment groups
176
Response Rate and Percentage of Respondents Interested
in Birding for Survey About Birding Versus Mall Design
[Figure: Response rates, and % of respondents interested in birding, by survey topic (birding vs. mall design) for ABA members and general population adults.]
Groves, R.M., Couper, M.P., et al. (2006) “Experiments in Producing Nonresponse Bias,” Public Opinion Quarterly, 70(5): 720-736, Figures 7 and 8
177
Conclusions
• Response rates for birders highest
when topic is of interest to them
• Estimates of birding also affected by
survey topic
• Limitations
– Not possible to determine whether the
differences among the general population
adults in birding estimates due to
nonresponse or measurement errors
178
4. COMPARING ESTIMATES USING
ALTERNATIVE WEIGHTING OR
IMPUTATION SCHEMES
179
4. Methods of Weight Adjustment
• Alter estimation weights and compare the
estimates using the various weights to
evaluate nonresponse bias. Weighting
methods may include poststratification,
raking, calibration, logistic regression, or
even imputation.
4.1 Prepare estimates under different assumptions
4.2 Adjust using models of characteristics
4.3 Adjust using models of response propensity
4.4 Adjust using selection (Heckman) models
180
4.1 Alternative Estimates Based on
Different Assumptions
• A simple approach is to conduct a “what if” analysis that may not require new weights. The “what if” estimates assume a distribution for the nonrespondents that differs from the observed data.
• If the difference between the “what if” estimates and the regular survey estimates is large, then the estimates have the potential for nonresponse bias.
181
4.1 How to Conduct a Nonresponse Bias
Analysis Based on Different Assumptions
1. Identify key assumptions in weighting and
determine “what if” or alternative
assumptions (e.g., very fine-grained
nonresponse adjustment cells).
2. Assume that all or a large fraction of the
nonrespondents have a specific
characteristic. Compute an estimate using
this assumption (or re-do with fine-grained
adjustment cells).
3. The difference between the “what if”
estimates and the regular survey estimates
is a bound on nonresponse bias.
182
4.1 Pros and Cons of Comparing Estimates
Based on Different Assumptions
• Pros
– The “what if” option is usually inexpensive and
the fine-grained revision may not be difficult
– If information from previous or similar studies
with follow-ups are available, then reasonably
precise assumptions about the characteristics of
nonrespondents can be postulated
• Cons
– Alternative estimates are often highly variable and
may not be very informative about the actual bias
– Without previous studies, the alternative
assumptions may not have much support
183
Kauff, Jacqueline, Robert Olsen and Thomas Fraker. (2002)
Nonrespondents and Nonresponse Bias: Evidence from a
Survey of Former Welfare Recipients in Iowa. Mathematica
Policy Research.
• Purpose: Understand well-being of former TANF
recipients
• Target population: families that left Iowa’s TANF
program in 1999
• Sample design: random sample from list of former
TANF recipients
• Mode of data collection: Telephone
• Response rate: 76%
• Target estimate: % Employment and Mean Earnings
• Nonresponse error measure: ‘What if’ estimates for
labor market and health outcomes
184
Estimates of % Employed and $ Monthly Earnings
assuming ‘best case’ (all improved) and ‘worst case’ (all declined)
outcomes for nonrespondents
(Bias = 3.6%, -8.7%; 8.9%, -8.6%)
[Figure: Two panels. % Employed: actual (76% RR), best case, worst case. $ Monthly Earnings: actual (76%), best case (nonrespondents assumed to have earned at the 90th percentile), worst case.]
Kauff, Olsen and Fraker (2002), Nonrespondents and Nonresponse Bias: Evidence from a Survey of Former Welfare Recipients in Iowa, Exhibit 4.2
185
Conclusion
• Estimates based on ‘best’ and ‘worst’
case scenarios are similar to each
other, with similar policy implications
• Survey has limited nonresponse bias in
wave 1 study
186
Volken, T. (2013) “Second-stage non-response in the
Swiss health survey: determinants and bias in
outcomes.” BMC Public Health 13:167.
• Purpose: National survey on health status, utilization and
behavior
• Target population: Noninstitutional Swiss population fluent
in German, French or Italian
• Sample design: Multi-stage probability sample
• Mode of data collection: CATI or CAPI in screener; mail in
additional questionnaire
• Response rate: 80.3% of screener respondents yielded a
completed mail survey
• Target estimate: Self-rated health from screener; imputed
values for nonparticipants
• Nonresponse error measure: Comparison to screener
answers or imputed answers for respondents and
nonrespondents of mail interview
187
Respondent and imputed estimates
for four health outcomes
[Figure: Estimates for respondents vs. imputed nonrespondents: influenza vaccination, arthrosis, osteoporosis, and high blood pressure.]
188
Conclusion
• Magnitude of bias on most outcomes
estimated to be moderate
– Direction of differences varies by gender
• Limitations
– Bias estimates are based on imputed values
– Imputation models may be incorrect
189
Artificial Example
• Purpose: Estimate characteristics of low income
households for those in a federal assistance program
• Target population: U.S. households in program
• Sample design: Equal probability sample from list
• Mode of data collection: Telephone
• Response rate: 80%
• Target estimate: Number in poverty and percent
Hispanic
• Nonresponse error measure: Alternative
assumptions about the proportion of
nonrespondents
190
Artificial “What If” Scheme
Bias of estimated proportion making alternative assumptions
regarding the percent of nonrespondents in the groups.
Characteristic    Percent of      Bias if assumed percent for nonrespondents is
                  respondents     1.5 times respondent %    2.5 times respondent %
Below poverty     25.0            2.5                       7.5
Hispanic          13.0            1.3                       3.9
191
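These entries follow from the bias identity used throughout the course (nonresponse rate times the difference between the respondent mean and the assumed nonrespondent mean); a few lines of Python confirm the magnitudes:

# Response rate 80%, so the nonresponse rate is 0.20
for name, p_resp in [("Below poverty", 25.0), ("Hispanic", 13.0)]:
    for mult in (1.5, 2.5):
        bias = 0.20 * (p_resp - mult * p_resp)
        print(name, mult, round(abs(bias), 1))
# prints 2.5 and 7.5 for poverty, 1.3 and 3.9 for Hispanic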
Conclusions
• Bias in estimates of the percentage in
the program who are in poverty and
Hispanic may be substantial if the
respondents and nonrespondents are
very different
• Limitation:
– Difficult to evaluate assumptions about the
percentage of nonrespondents having a
characteristic
192
4.2 Using alternative weighting
schemes
• Weighting can reduce nonresponse bias if the weights are correlated with the estimate. Auxiliary data in weighting that are good predictors of the characteristic may give alternative weights that have less bias.
• If the estimates using the alternative weights do not differ from the original estimates, then either the nonresponse is not resulting in bias or the auxiliary data do not reduce the bias.
• If the estimates vary by the weighting scheme, then the weighting approach should be carefully examined and the one most likely to have lower nonresponse bias should be used.
193
4.2 How to Conduct Nonresponse
Bias Analysis Using Alternative
Weights
1. Conduct analysis to identify any auxiliary
data (often from external source) available
that are good predictors of the key survey
estimates and/or response propensity.
2. Use a weighting method such as weighting class adjustments, response propensity, or calibration estimation with these variables to produce alternative weights (see the sketch below).
3. Compute the difference between the
estimates using the alternative weights and
the estimates from the regular weights as a
measure of nonresponse bias for the
estimate.
194
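A weighting-class sketch of steps 2-3 with one auxiliary variable and fabricated data; real applications would use richer models and proper replication-based variance estimation:

import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 1500
df = pd.DataFrame({"region": rng.integers(0, 4, n),
                   "base_w": np.full(n, 100.0)})
df["resp"] = rng.random(n) < 0.55 + 0.1 * df["region"]
df["y"] = 40 + 3 * df["region"] + rng.normal(0, 5, n)

# Step 2: inflate respondent weights by the inverse of the weighted
# response rate within each weighting class (here, region)
resp_w = df.loc[df["resp"]].groupby("region")["base_w"].sum()
tot_w = df.groupby("region")["base_w"].sum()
df["adj_w"] = df["base_w"] / df["region"].map(resp_w / tot_w)

# Step 3: compare estimates under the two weighting schemes
r = df[df["resp"]]
print(np.average(r["y"], weights=r["base_w"]))   # unadjusted
print(np.average(r["y"], weights=r["adj_w"]))    # class-adjusted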
4.2 How to Conduct Nonresponse Bias
Analysis Using Alternative Weights
A few methods of analysis for identifying auxiliary variables that are good predictors of response propensity include:
• CHAID analysis (classification trees)
• Logistic regression modeling
Variables that are used in a separate weighting
class or response propensity-based
adjustment must be available for all
respondents and nonrespondents.
195
4.2 Pros and Cons of Comparing Estimates
Based on Alternative Weights
• Pros
– If good predictors are available, then it is likely
that the use of these in the weighting will reduce
the bias in the statistics being evaluated
– If the differences in the estimates are small, it is
evidence that nonresponse bias may not be large
• Cons
– Recomputing weights may be expensive
– If good correlates are not available then lack of
differences may be indicator of poor relationships
rather than the absence of bias
– The approach is limited to statistics that have
high correlation with auxiliary data
196
Kreuter, F., Olson, K., Wagner, J., Yan, T., Ezzati-Rice, T. M., et al. (2010). Using
Proxy Measures and Other Correlates of Survey Outcomes to Adjust for
Nonresponse: Examples from Multiple Surveys. Journal of the Royal Statistical
Society Series A-Statistics in Society, 173(2), 389-407.
• Purpose: Five surveys – UM Transportation Research
Institute; MEPS; ESS; ANES; NSFG.
• Target population: Varies
• Sample design: Varies; generally area probability
• Mode of data collection: Face-to-Face and telephone
(varies)
• Response rate: Varies
• Target estimate: Multiple estimates across five surveys
• Nonresponse bias measure: Use alternative weighting
scheme (add auxiliary variables to standard adjustment
scheme) to reduce bias
197
[Figure: ESS estimates under alternative weighting; + = litter, o = multiunit housing as auxiliary variables.]
198
Conclusions
• Correlation between auxiliary variables
and survey variables very weak across
all five surveys
• Weighted estimates see largest change
when correlation between new auxiliary
variables and survey variables is
largest
199
Tivesten, E., Jonsson, S., Jakobsson, L., & Norin, H. (2012). Nonresponse analysis and adjustment in a mail survey on car accidents. Accident Analysis & Prevention, 48(0), 401-415.
• Purpose: Evaluate causes of automobile accidents
• Target population: Owners of Volvo cars that had vehicle repair costs above 45,000 SEK following an accident, and were insured by Volvia insurance company
• Sample design: All eligible persons in target population
• Mode of data collection: Mail
• Response rate: 35.5%
• Target estimate: Non-driving level distractions
• Nonresponse bias measure: Compare weighted (response propensity model including information on accident type, town size, car owner, driver age, gender) and unweighted estimates
200
Adjust Weights Using Response
Propensity Model
201
Conclusions
• Driver’s age, sex, accident type, and town size were effective nonresponse adjustment variables
• Weights increased the number of low
vigilance and non-driving related distracted
drivers
• Limitation:
– Only looked at new car (Volvo) owners
202
Ekholm and Laaksonen (1991). Weighting via
response modeling in the Finnish household
budget survey. Journal of Official Statistics, 3,
325-337.
• Purpose: Finnish Household Budget survey
estimates and monitors consumption in Finland
• Target population: Households in Finland
• Sample design: Sample persons from register and
collect data from households (1985)
• Mode of data collection: 2 f-t-f interviews 2 weeks
apart
• Response rate: 70%
• Target estimate: Number of households
• Nonresponse error measure: Use data from registers
to estimate logistic regression for probability to
respond (factors include urbanicity, region, property
income, and household structure).
203
Estimated Number of Households (in
1,000s) Using Two Adjustment Schemes
                                 Adjustment scheme
Domain          Benchmark     128-cell        35 regional
                survey        poststrata      propensity model
All Finland     2,010         1,960           2,042
Uusimaa         na            504             540
North Carelia   na            70              74
204
Conclusions
• Evidence suggested that revised
scheme reduced bias
• The scheme was considered to be
highly practical since it could be used
with a variety of statistics
• Limitation:
– Evaluation using external estimates limited
to a few variables
205
4.3 Calculate the Fraction of Missing
Information (FMI)
• Measure from the multiple imputation literature of how uncertain we are about imputed values. Smaller values indicate more certainty about missing values.
– Bounded above by the nonresponse rate
• Auxiliary variables more strongly correlated with the survey variables will reduce the FMI
• Provides insight into the strength of adjustment variables; can also be used during data collection
206
4.3 How to conduct a Fraction of
Missing Information analysis
• Obtain as much information as possible on
both respondents and nonrespondents
• Obtain important survey variables from
respondent pool
• Use multiple imputation methods (e.g., SAS
PROC MI, Stata ICE, IVEWare) to impute
multiple datasets
• Use standard multiple imputation analysis
methods to estimate the FMI for a variety of
survey estimates
207
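Once the m completed datasets exist, the FMI comes from Rubin’s combining rules; a sketch with the standard formulas and hypothetical inputs:

import numpy as np

def fmi(estimates, variances):
    # estimates/variances: point estimate and its variance from each of
    # the m completed (imputed) datasets
    m = len(estimates)
    w = np.mean(variances)                  # within-imputation variance
    b = np.var(estimates, ddof=1)           # between-imputation variance
    r = (1 + 1 / m) * b / w                 # relative variance increase
    nu = (m - 1) * (1 + 1 / r) ** 2         # Rubin's degrees of freedom
    return (r + 2 / (nu + 3)) / (r + 1)     # fraction of missing information

# e.g., 100 imputations of a proportion
rng = np.random.default_rng(1)
est = 0.30 + rng.normal(0, 0.01, 100)
print(fmi(est, np.full(100, 0.0004)))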
4.3 Pros and cons of conducting a
Fraction of Missing Information analysis
• Pros
– Results in multiply imputed dataset, which may
need to do anyway
– Provides information about the usefulness of
auxiliary variables for adjustment
– Can use multiply imputed datasets for an estimate
of nonresponse bias
• Cons
– Multiple imputation is labor intensive
– Requires good predictors of a wide range of
survey variables
208
Wagner, J. (2010). The Fraction of Missing Information as a
Tool for Monitoring the Quality of Survey Data. Public
Opinion Quarterly, 74(2), 223-243.
• Purpose: National Survey of Family Growth and
Survey of Consumer Attitudes
• Target population: NSFG: US adult
noninstitutionalized population 15-44; SCA: All adults
• Sample design: NSFG: Area probability design; SCA:
RDD
• Mode of data collection: NSFG: Face-to-face; SCA:
telephone
• Response rate: Changes over course of data
collection
• Target estimate: Proportion never married (NSFG);
Consumer confidence indices (SCA)
• Nonresponse error measure: Fraction of Missing Information calculation relative to the nonresponse rate
209
[Figures: Fraction of missing information and nonresponse rate for the NSFG by day and quarter (proportion never married), and for the SCA by call number (ICS, ICC, and ICE).]
Wagner (2010)
210
Conclusion
• Frame and paradata are good predictors
of the survey data in the NSFG, but not in
the SCA
• Information is useful for guiding data
collection decisions
• Limitations
– Some modes may have less informative
information
– Required 100 imputations conducted every
day
211
4.4 Adjust Weights Using Heckman
Model of Selection Bias
• Similar in philosophy to response propensity modeling.
• Use auxiliary data to predict the probability of responding to the survey and then use a second-stage regression to estimate the characteristic, including the selection variable.
• If the estimates vary, then nonresponse bias may be evident and the Heckman model may give lower nonresponse bias for the statistic.
212
4.4 How to Conduct Nonresponse
Bias Analysis Using Heckman Model
1. Conduct analysis to identify any auxiliary
data available that are good predictors of
units responding (self-selecting).
2. Use regression-based estimation to model
self-selection bias.
3. Include the estimated selection bias terms
in the regression for estimating the
characteristic.
4. Compute the difference between the
estimates using the Heckman model and
the estimates from the regular weights as a
measure of nonresponse bias.
213
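A two-step sketch on simulated data (the predictors and coefficients are invented; a serious application needs an exclusion restriction and checks of the bivariate-normality assumption):

import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(9)
n = 3000
z = rng.normal(size=(n, 2))                         # auxiliary predictors
u, e = rng.multivariate_normal([0, 0], [[1, .5], [.5, 1]], n).T
resp = (0.3 + z @ np.array([0.8, -0.4]) + u) > 0    # who responds
y = 2 + 1.5 * z[:, 0] + e                           # outcome, seen if resp

# Steps 1-2: probit response model and inverse Mills ratio
Zc = sm.add_constant(z)
probit = sm.Probit(resp.astype(int), Zc).fit(disp=0)
xb = Zc @ probit.params
mills = norm.pdf(xb) / norm.cdf(xb)

# Step 3: outcome regression on respondents plus the selection term
X = sm.add_constant(np.column_stack([z[resp, 0], mills[resp]]))
ols = sm.OLS(y[resp], X).fit()
print(ols.params)      # last coefficient indexes the selection effect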
4.4 Pros and Cons of Comparing Alternative
Estimates Based on Heckman Model
• Pros
– If good predictors are available, then it is likely
that the use of these in the weighting will reduce
the bias in the statistics being evaluated
• Cons
– Recomputing weights may be expensive
– Method has assumptions that often do not hold
– Separate regressions needed for each statistic
214
Messonier, M., Bergstrom, J., Cornwell, C., Teasley, R., and
Cordell, H. (2000). Survey response-related biases in contingent
valuation: concepts, remedies, and empirical application to
valuing aquatic plant management. American Journal of
Agricultural Economics, 83, 438-450.
• Purpose: Estimate willingness to pay for aquatic
plant management in Lake Guntersville, AL
• Target population: Recreational users of lake
• Sample design: Stratified random sample of visitors
during 9AM-5PM time period and sample of lakeside
residents
• Mode of data collection: Mail questionnaire among
those interviewed as visitors and residents
• Response rate: 50%
• Target estimate: Mean amount willing to pay for plant
management
• Nonresponse bias estimate: Heckman two-stage correction for unit nonresponse, using variables measured during the initial interview
215
Two-stage “Heckman” Adjustment
• Probability of being a respondent to the mail questionnaire (nonfishers) modeled as: .09 × (education of sample person) + .03 × (number of persons in household)
• Use this equation as a ‘selection bias’ equation to adjust estimates of another regression model to estimate willingness to pay
216
Estimated Mean Dollar Willingness
to Pay
[Figure: Estimated mean dollar willingness to pay, without selection bias adjustment vs. with unit nonresponse adjustment.]
217
Conclusions
• Without selection bias adjustment,
there appears to be an underestimate
of the amount nonfishers are willing to
pay for aquatic management
• Limitation:
– No assessment of assumptions underlying
use of selection bias equation
218
Five Things You Should Remember
from the Short Course
1. The three principal types of nonresponse bias studies are:
- Comparing surveys to external data
- Studying internal variation within the data collection, and
- Contrasting alternative postsurvey adjusted estimates
2. All three have strengths and weaknesses; using multiple approaches simultaneously provides greater understanding
3. Nonresponse bias is specific to a statistic, so separate assessments may be needed for different estimates
4. Auxiliary variables correlated with both the likelihood of responding and key survey variables are important for evaluation
5. Thinking about nonresponse before the survey is important because different modes, frames, and survey designs permit different types of studies
219