Practical Tools for Nonresponse Bias Studies

Kristen M. Olson
University of Nebraska-Lincoln
Short Course – JOS 30th Anniversary
June 10-11, 2015

Materials for this short course were developed originally by Robert M. Groves and J. Michael Brick. Initial conversations for this course came from the Nonresponse Bias Summit Meeting, Ann Arbor, January 29-30, 2005: Paul Biemer, Martin Frankel, Brian Harris-Kojetin, Steve Heeringa, Paul Lavrakas, Kristen Olson, Colm O'Muircheartaigh, Beth-Ellen Pennell, Emilia Peytcheva, Andy Peytchev, Eleanor Singer, Clyde Tucker, Mandi Yu, Sonja Ziniel

Coverage of Short Course
• Included
– Nonresponse bias approaches for different modes
– Mostly household, some establishment surveys
– Dominance of U.S. examples
• De-emphasized
– Panel attrition issues
– Item nonresponse issues

Schedule
June 10:
13:00 – 14:00  Introduction
14:00 – 14:30  Benchmarking
14:30 – 14:45  Break
14:45 – 16:30  Study designs using external data
June 11:
13:00 – 14:45  Study designs involving internal survey data
14:45 – 15:00  Break
15:00 – 16:00  Postsurvey adjustment analyses

INTRODUCTION

What is Nonresponse?
• Unit nonresponse is the failure to obtain survey measures on a sample unit
• It occurs after the sampling step of the survey (don't confuse it with failure to cover the target population by the sampling frame)
• It reflects total failure to obtain survey data (don't confuse it with item nonresponse, the failure to obtain an answer to a given item)

[Figure: The survey lifecycle from an error perspective. The measurement side runs from construct (validity) through measurement (measurement error) and response (processing error) to edited response; the representation side runs from target population (coverage error) through sampling frame (sampling error), sample (nonresponse error), and respondents (adjustment error, via postsurvey adjustments) to the survey statistic. Source: Groves, et al. 2004, Survey Methodology, Figure 2.5]

Response Rates
• AAPOR standards for calculations
• Weighted response rates may be appropriate
• Often used as a data quality and field performance indicator
• Low response rates can be an indicator of potential problems such as
– Nonresponse bias
– Underestimated variances

Nonresponse Error for Sample Mean
In simplest terms,

ȳ_r = ȳ_n + (m/n)(ȳ_r − ȳ_m)

where ȳ_r is the respondent mean, ȳ_n the full sample mean, ȳ_m the nonrespondent mean, m the number of nonrespondents, and n the full sample size.
OR
Respondent Mean = Full Sample Mean + (Nonresponse Rate)*(Respondent Mean – Nonrespondent Mean)
OR
Survey Results = Desired Results + Error
OR
Nonresponse Error = f(Rate, Difference between Respondents and Nonrespondents)

Examples
I. Response Rate – 75%; Respondent Mean – 10; Nonrespondent Mean – 14
Nonresponse Error = .25*(10 – 14) = –1
II. Response Rate – 90%; Respondent Mean – 10; Nonrespondent Mean – 40
Nonresponse Error = .10*(10 – 40) = –3

[Figures: four illustrative bar charts of the respondent mean (ȳ_r) versus the nonrespondent mean (ȳ_m), crossing low/high nonresponse rates with small/large differences between respondents and nonrespondents]

A Stochastic View of Response Propensities

Bias(ȳ_r) ≈ σ_yp / p̄ = ρ_yp σ_y σ_p / p̄

where
σ_yp = covariance between y and response propensity p
p̄ = mean propensity over the sample
ρ_yp = correlation between y and p
σ_y = standard deviation of y
σ_p = standard deviation of p
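A minimal Python sketch of both views of nonresponse error. The first half reproduces the two worked examples above; the second half computes the stochastic approximation Bias(ȳ_r) ≈ σ_yp / p̄ on simulated data. All numbers, propensities, and y values here are hypothetical, chosen only for illustration.

```python
import numpy as np

def nonresponse_error(nonresponse_rate, resp_mean, nonresp_mean):
    # Deterministic view: error of the respondent mean is
    # (nonresponse rate) * (respondent mean - nonrespondent mean).
    return nonresponse_rate * (resp_mean - nonresp_mean)

print(nonresponse_error(0.25, 10, 14))   # Example I:  -1.0
print(nonresponse_error(0.10, 10, 40))   # Example II: -3.0

# Stochastic view: Bias(ybar_r) ~ cov(y, p) / pbar, illustrated with
# simulated response propensities p and a y correlated with them.
rng = np.random.default_rng(42)
p = np.clip(rng.normal(0.6, 0.1, 10_000), 0.01, 0.99)  # hypothetical propensities
y = 50 + 30 * p + rng.normal(0, 5, 10_000)             # y correlated with p
print(np.cov(y, p, ddof=1)[0, 1] / p.mean())           # approximate bias
```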
What does the Stochastic View Imply?
• The key issue is whether the influences on survey participation are shared with the influences on the survey variables
• Increased nonresponse rates do not necessarily imply increased nonresponse error
• Hence, investigations are necessary to discover whether the estimates of interest might be subject to nonresponse errors

Meta-Analysis of Nonresponse Error Studies
• 59 studies, some with multiple estimates
• Each has data available to compute a relative bias due to nonresponse. The absolute value of the relative bias is

|ȳ_r − ȳ_n| / ȳ_n

[Figure: Percentage absolute relative nonresponse bias of 959 respondent means by nonresponse rate of the 59 surveys in which they were estimated. Source: Groves and Peytcheva, Public Opinion Quarterly, 2008;72:167-189]

Conclusions
• Nonresponse error does exist
• Nonresponse rate by itself is not a good predictor of nonresponse errors
• The analysis does NOT address whether changing the nonresponse rate within a study would have affected the error

Some Comments From a Meta-Analysis of Nonresponse Bias Studies
• What do we know about properties of different nonresponse bias study designs?
• Tendency for studies of internal variation to have higher bias estimates – be careful of multiple interpretations:
– survey variables linked to mechanisms producing nonresponse, OR
– confounds among variables and study techniques
Groves, R., and Peytcheva, E. (2008). "The Impact of Nonresponse Rates on Nonresponse Bias: A Meta-Analysis." Public Opinion Quarterly, 72(2), 167-189.

[Figure: Bias as a proportion of the standard deviation, by method used to estimate nonresponse bias: Frame 0.08; Supplement 0.10; Screener 0.19; Followup 0.14]

Does Nonresponse Bias on Demographic Variables Predict Nonresponse Bias on Substantive Variables?
Peytcheva, Emilia and Robert M. Groves. (2009). "Using Variation in Response Rates of Demographic Subgroups as Evidence of Nonresponse Bias in Survey Estimates." Journal of Official Statistics, 25(2), 193-201.

So When is Nonresponse Error a Problem for a Given Survey?
• Difficult to know without assessing errors through auxiliary studies
• Response rates often used as an indicator of "risk" of nonresponse error
• Various indicators for risk of nonresponse error have been proposed (Groves, et al., Survey Practice, 2008; Wagner, 2012; Nishimura, Wagner and Elliott, 2015)

Wagner's (2012) Typology
• Indicators involving the response indicator
• Indicators involving the response indicator and frame data or paradata
– Nonresponse bias for estimates based on variables available on the sample
• Indicators involving the response indicator, frame data or paradata, and the survey data
– Studying variation within the respondent set

Missing data pattern:

Z1  Z2  R  Y1  Y2
 1   1  1   1   1   (respondents)
 1   1  0   0   0   (nonrespondents)

Z1 and Z2 (sampling frame and paradata) are observed for every sampled case; Y1 and Y2 (survey data) are observed only when the response indicator R = 1.

Goals of Course
• The course should provide you with tools to examine nonresponse bias
• You can use the tools regardless of response rates to obtain insights into potential errors

Weights and Response Rates
• A base or selection weight is the inverse of the probability of selection of the unit. The sum of all the sampled units' base weights estimates the population total.
• When units are sampled using a complex sample design, the AAPOR guidelines suggest using (base) weights to compute response rates that reflect the percentage of the sampled population that responds. Unweighted rates are useful for other purposes, such as describing the effectiveness of the effort.
• Weighted response rates are computed by summing the units' base weights by disposition code rather than summing the unweighted counts of units.
• In establishment surveys, it is useful to include a measure of size (e.g., number of employees or students) to account for the units' relative importance. The weight for computing response rates is the base weight times the measure of size.
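A minimal sketch of the weighted response rate computation just described, on a hypothetical case-level file with base weights and disposition codes (the data and column names are made up for illustration):

```python
import pandas as pd

# Hypothetical sample file: one row per sampled case, with the base weight
# (inverse selection probability) and a final disposition code.
sample = pd.DataFrame({
    "base_wt":     [2, 2, 2, 2, 5, 5, 5, 5],
    "disposition": ["R", "R", "NR", "R", "NR", "R", "NR", "NR"],  # R = respondent
})

unweighted_rr = (sample["disposition"] == "R").mean()

# Weighted rate: sum base weights by disposition rather than counting cases.
wt_sums = sample.groupby("disposition")["base_wt"].sum()
weighted_rr = wt_sums["R"] / wt_sums.sum()

# For an establishment survey, base_wt would be multiplied by a measure of
# size (e.g., number of employees) before summing.
print(f"unweighted RR = {unweighted_rr:.1%}; weighted RR = {weighted_rr:.1%}")
```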
Weights and Nonresponse Analysis
• A general rule is that weights should be used in nonresponse analysis studies so that relationships at the population level can be examined. Guides for choosing the specific weights to use are:
– Use base weights for nonresponse bias studies that compare all sampled respondents and nonrespondents. Weights adjusted for nonresponse may be misleading in this situation.
– Use fully adjusted weights for nonresponse bias studies that compare survey estimates with data from external sources. One important exception is when the survey weights are poststratified. In this case, weights prior to poststratification are generally more appropriate.

Weighting Example
• Suppose a sample of schools is selected from two strata:
– Stratum 1: 50% sampled and 60% respond
– Stratum 2: 20% sampled and 80% respond
• All the sampled schools can be classified as urban or rural
• A nonresponse bias study examines the estimate of the percentage of schools that are urban, using the frame data on urbanicity

Weighting Example (Cont.)

          Stratum     n    wt     r
Urban        1      150     2    90
             2       80     5    64
           Total    230         154
Rural        1      100     2    60
             2      420     5   336
           Total    520         396

• Estimates of percentage urban
Full-response base-weighted estimate:
(150*2 + 80*5) / (150*2 + 80*5 + 100*2 + 420*5) = 700/3000 = 23.3%

Weighting Example (Cont.)
• Estimates of percentage urban
Respondent base-weighted estimate:
(90*2 + 64*5) / (90*2 + 64*5 + 60*2 + 336*5) = 500/2300 = 21.7%
bias = 21.7% − 23.3% = −1.6%
Respondent unweighted estimate:
(90 + 64) / (90 + 64 + 60 + 336) = 154/550 = 28.0%
bias = 28.0% − 23.3% = +4.7%
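The school example above, reproduced as a short Python sketch; the data frame simply encodes the table, so the printed values should match the hand calculations (23.3%, 21.7% with bias −1.6%, 28.0% with bias +4.7%):

```python
import pandas as pd

# The two-stratum school example: sampled counts (n), respondents (r),
# base weights (wt), split by urban/rural.
d = pd.DataFrame({
    "stratum": [1, 2, 1, 2],
    "urban":   [1, 1, 0, 0],
    "n":       [150, 80, 100, 420],
    "r":       [90, 64, 60, 336],
    "wt":      [2, 5, 2, 5],
})

full      = (d["wt"] * d["n"] * d["urban"]).sum() / (d["wt"] * d["n"]).sum()
resp_wt   = (d["wt"] * d["r"] * d["urban"]).sum() / (d["wt"] * d["r"]).sum()
resp_unwt = (d["r"] * d["urban"]).sum() / d["r"].sum()

print(f"full sample (base wt): {full:.1%}")
print(f"respondents (base wt): {resp_wt:.1%}, bias {resp_wt - full:+.1%}")
print(f"respondents (unwt):    {resp_unwt:.1%}, bias {resp_unwt - full:+.1%}")
```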
Five Things You Should Remember from the Short Course
1. The three principal types of nonresponse bias studies are: comparing surveys to external data; studying internal variation within the data collection; and contrasting alternative postsurvey adjusted estimates
2. All three have strengths and weaknesses; using multiple approaches simultaneously provides greater understanding
3. Nonresponse bias is specific to a statistic, so separate assessments may be needed for different estimates
4. Auxiliary variables correlated with both the likelihood of responding and key survey variables are important for evaluation
5. Thinking about nonresponse before the survey is important because different modes, frames, and survey designs permit different types of studies

Nonresponse Bias Study Techniques
1. Comparison to other estimates (benchmarking)
2. Nonresponse bias for estimates based on variables available on the sample
2.1 Sampling frame
2.2 External data matched to sample
2.3 Observations taken during data collection
2.4 Seeded sample
2.5 Comparing response rates on subgroups
2.6 Calculating R-indicators
3. Studying variation within the respondent set
3.1 Use of screening and prior wave data collection
3.2 Following up nonrespondents
3.3 Two-phase (double) sampling of nonrespondents
3.4 Analyzing estimates by level of effort
3.5 Analyzing estimates by predicted response propensity
3.6 Mounting randomized nonresponse experiments
4. Altering the weighting or imputation adjustments
4.1 Prepare estimates under different assumptions
4.2 Adjust using alternative weights
4.3 Calculate the fraction of missing information
4.4 Adjust using selection (Heckman) models

Data Used in Nonresponse Bias Studies
Types of data available (or that could be produced in the study):
• Individual data for each population unit
• Individual data for each sampled unit
• Aggregate data for each sampled unit
• Individual data from the data collection process
• Individual data collected from a follow-up

1. BENCHMARKING

1. Comparison to Other Estimates (Benchmarking)
• Data or estimates from another source that are closely related to the respondent estimates may be used to evaluate bias due to nonresponse in the survey estimates – benchmarking

1. Benchmarking Survey Estimates to those from Another Data Source
• Another survey or administrative record system may contain estimates of variables similar to those being produced from the survey
• Example: demographic characteristics from the American Community Survey (Census); number of persons graduating from high school from the Common Core of Data (NCES)
• The difference between estimates from the survey and the other data source is an indicator of bias (both nonresponse and other)

1. How to Conduct a Nonresponse Bias Benchmark Study
1. Identify comparison surveys with very high response rates, or administrative systems, that can produce estimates of variables similar to key survey estimates
2. Assess major reasons why the survey estimates and the estimates from the comparison sources may differ
3. Compute estimates from the survey (using final weights) and from the comparison source to be as comparable as possible (often requires estimates for domains)
4. The difference is an estimate of overall bias

1. Statistical Tests using Benchmarking
• Using the respondent data, calculate the fully adjusted respondent mean.
– Can also calculate the base-weighted mean for comparison.
• Calculate the appropriate (design-based) two-sample t-test or chi-square test
– May need to account for sampling error in the benchmark estimate
– Test the null hypothesis that the respondent mean and benchmark mean are identical

Pros and Cons of Benchmark Comparison to Estimate NR Bias
• Pros
– Relatively simple to do and often inexpensive
– Estimates from the survey use final weights and are thus relevant
– Gives an estimate of bias that may be important to analysts
• Cons
– The estimated bias contains errors from the comparison source as well as from the survey; this is why it is very important that the comparison source be highly accurate
– Measurement properties are generally not consistent for the survey and the comparison source; this is often the largest source of error
– Item nonresponse in both data sets reduces comparability
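A sketch of the benchmark test described above, under simplifying assumptions: the respondent values, final weights, and benchmark figures are hypothetical, and the standard error uses a with-replacement linearization approximation rather than the full design-based variance (replicate weights or survey software would be used in practice):

```python
import numpy as np
from scipy import stats

def benchmark_test(y, final_wt, bench_mean, bench_se=0.0):
    """Compare a weighted respondent mean to an external benchmark,
    allowing for sampling error in the benchmark estimate."""
    y, w = np.asarray(y, float), np.asarray(final_wt, float)
    mean = np.sum(w * y) / np.sum(w)
    # Linearized SE of the weighted (ratio) mean; note sum(w*(y-mean)) = 0.
    n = len(y)
    resid = w * (y - mean)
    se = np.sqrt(n / (n - 1) * np.sum(resid ** 2)) / np.sum(w)
    z = (mean - bench_mean) / np.sqrt(se ** 2 + bench_se ** 2)
    return mean, mean - bench_mean, z, 2 * stats.norm.sf(abs(z))

# Hypothetical respondent proportion vs. a benchmark of 0.52 (SE 0.004).
rng = np.random.default_rng(1)
y = rng.binomial(1, 0.55, 400)     # respondent values
w = rng.uniform(1, 3, 400)         # final weights
print(benchmark_test(y, w, bench_mean=0.52, bench_se=0.004))
```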
Li, G., Schoeni, R.F., Danziger, S., and Charles, K.K. (2010). New expenditure data in the PSID: Comparisons with the CE. Monthly Labor Review, 133(2), 29-39.
• Purpose: Estimate dynamic aspects of economic and demographic behavior
• Target population: U.S. individuals and family units
• Sample design: Longitudinal household survey started in 1968 with an oversample of low-income households
• Mode of data collection: Face-to-face originally, mainly phone since 1972
• Response rate: 75% initially, attrition loss of 36% by 1980, 2-3% nonresponse each subsequent year
• Target estimate: Various expenditure measures
• Nonresponse error measure: Expansion of expenditure questions in 1999 allows comparison to Consumer Expenditure Survey (CE—Interview Survey; RR = 80% in 2000) population estimates overall and by age

PSID Estimates Compared to CE Interview Survey Estimates
Ratios of mean PSID expenditure to mean CE expenditure:

Expenditure category      1999   2001   2003
Total                     0.96   1.02   1.01
Total food                1.03   1.08   1.10
Total housing             0.94   1.00   0.97
  Mortgage                1.10   1.27   1.17
  Rent                    0.96   0.96   0.96
Health care: Insurance    0.97   1.11   1.09

Conclusions
• In general, for the expenditures measured, PSID estimates are comparable to CE estimates
• Broad categories align very well; some minor differences at the subcategory level
• Cross-sectional "lifecycle" estimates (by age category) are generally similar; the exception is a difference in the early 50s, primarily due to a difference in education expenditures
• Limitation:
– Nonresponse is not the only reason for differences in the estimates, and may not even be the main source of the difference

Yeager, D.S., Krosnick, J.A., Chang, L., Javitz, H.S., Levendusky, M.S., Simpser, A., and Wang, R. (2011). "Comparing the Accuracy of RDD Telephone Surveys and Internet Surveys Conducted with Probability and Non-Probability Samples." Public Opinion Quarterly, 75(4), 709-747.
• Purpose: Evaluate quality of two types of probability samples and seven non-probability samples
• Target population: U.S. adults
• Sample design: Varies over the surveys; focus here on the RDD and probability internet panel samples
• Mode of data collection: Phone and web
• Response rate: 35.6% (phone); 15.3% (probability web)
• Target estimate: Demographic and non-demographic estimates
• Nonresponse error measure: Difference between benchmark and survey estimate before and after poststratification

[Figure: Non-demographic comparisons, probability samples only. Four panels compare the benchmark with telephone and internet estimates, with and without poststratification: nonsmoker (relative bias = −2%, −3%, −4%, −5%); had 12 drinks in lifetime (relative bias = 9%, 9%, 12%, 11%); has a driver's license (relative bias = 5%, 4%, 1%, −1%); did not have a passport (relative bias = −15%, −11%, −4%, −3%). Source: Yeager, et al. (2011)]

Conclusions
• Before poststratification, the telephone and probability internet surveys performed approximately equivalently
• After poststratification, the average difference between the benchmark and the sample estimate is slightly (but not significantly) smaller for the telephone survey than for the internet survey
• Limitations:
– Nonresponse is not the only reason for differences in the estimates, and may not even be the main source of the difference
– Data were collected in 2004; results may be different today
2. Using Variables Available on the Sample

Two Types of Approaches
• Compare estimates for respondents and nonrespondents – or respondents and the full sample – on variables available for the full sample
• Evaluate variation in response rates / response propensities over subgroups defined by frame data or paradata

2. Nonresponse Bias for Estimates Based on Variables Available on the Sample
2.1 Sampling frame
2.2 External data matched to sample
2.3 Observations taken during data collection
2.4 Seeded sample
2.5 Compare response rates for subgroups
2.6 Calculate R-indicators

2.1 Using Sampling Frame Variables
• The sampling frame may contain variables on the target population that are correlated with those being estimated in the survey
• Example: length of membership on an organization list in a study of member attitudes; listed phone status in a telephone survey of residential mobility
• The difference between statistics from the full sample and statistics from respondents only is an indicator of nonresponse bias

2.1 How to Conduct a Nonresponse Bias Study Using Frame Data
1. Examine the sampling frame to identify variables correlated with key survey estimates, keeping any that are potentially correlated
2. Only data for the full sample are needed, so efforts to process frame data may be restricted to the full sample if necessary
3. Compute statistics for the full sample and for respondents only (using base weights); the difference is the estimated nonresponse bias
4. Classify nonrespondents by reasons (e.g., noncontact/refusal) and compute statistics for these groups to identify sources of bias

2.1 Statistical Tests using Sampling Frame Variables
• Using the frame variables as the outcome (Y) variable, calculate the respondent mean and the nonrespondent mean.
– Can also calculate refusal and noncontact means.
• Calculate the appropriate (design-based) t-tests or chi-square tests of differences in means or proportions between the respondents and nonrespondents
– Test the null hypothesis that the respondent mean and nonrespondent mean are identical

2.1 Pros and Cons of Using Frame Data to Estimate Nonresponse Bias
• Pros
– Measurement properties for the variables are consistent for respondents and nonrespondents
– Bias is strictly due to nonresponse
– Provides data on the correlation between the propensity to respond and the variables
• Cons
– Bias estimates are for the frame variables; only frame variables highly correlated with the key survey statistics are relevant
– The method assumes no nonresponse adjustments are made in producing the survey estimates; if frame variables are highly correlated, then they usually are used in adjustment
– Item nonresponse on the frame reduces the utility of the variables
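A minimal sketch of the frame-variable comparison in 2.1, on simulated data (the frame variable, response mechanism, and sample size are all hypothetical). A simple unweighted t-test stands in for the design-based test a complex sample would require:

```python
import numpy as np
from scipy import stats

# Hypothetical frame file: a frame variable y (e.g., years of membership)
# known for every sampled case, plus a response indicator.
rng = np.random.default_rng(7)
frame_y = rng.gamma(4, 2, 1000)
p_resp = np.clip(0.5 + 0.02 * (frame_y - 8), 0.05, 0.95)  # response depends on y
resp = rng.binomial(1, p_resp).astype(bool)

# Estimated nonresponse bias: respondent mean minus full-sample mean.
bias = frame_y[resp].mean() - frame_y.mean()

# Respondents vs. nonrespondents on the frame variable.
t, p = stats.ttest_ind(frame_y[resp], frame_y[~resp], equal_var=False)
print(f"bias = {bias:+.2f}, t = {t:.2f}, p = {p:.4f}")
```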
Groves, R. M., S. Presser, R. Tourangeau, B. T. West, M. P. Couper, E. Singer and C. Toppe (2012). "Support for the Survey Sponsor and Nonresponse Bias." Public Opinion Quarterly, 76(3), 512-524.
• Purpose: Estimate support for the March of Dimes
• Target population: Households in a list maintained by the March of Dimes
• Sample design: Stratified random sample from a rich frame
• Mode of data collection: Mail
• Response rate: 23% University of Michigan sponsor; 24% labor force survey; 12% March of Dimes sponsor
• Target estimate: Lifetime total times volunteered (others in article)
• Nonresponse error measure: Variables on the list sampling frame

[Figure: Mean lifetime total times volunteered, by sponsor (relative bias = 110%, 39%, 29%). For each sponsor (March of Dimes, University of Michigan, Michigan labor force), the full sample mean, respondent mean, and nonrespondent mean are compared.]

Conclusions
• People who volunteer are more likely to participate in a survey
– And the bias in estimates is greater when the sponsor is related to volunteering
• Nonresponse error impact varies across estimates, and across sponsors for the same estimate
• Limitation:
– Limited variables available on records

Tourangeau, R., Groves, R.M., and Redline, C.D. (2010). Sensitive topics and reluctant respondents: Demonstrating a link between nonresponse bias and measurement error. Public Opinion Quarterly, 74(3), 413-432.
• Purpose: Voting status (methodological experiment)
• Target population: Maryland residents who are registered voters
• Sample design: Stratified samples of voters and nonvoters (n = 1,346 voters and n = 1,343 nonvoters)
• Mode of data collection: Telephone or mail (methodological experiment)
• Response rate: 34.3% telephone; 33.2% mail
• Target estimate: Voting status
• Nonresponse error measure: Comparison to voting records

[Figure: Voting status (relative bias = 22%, 30%). Percentage who voted in 2004 and in 2006: truth versus respondents versus nonrespondents.]

Conclusions
• Positive association between past voting and survey participation; for telephone cases, contact rates for voters and nonvoters were not significantly different, but cooperation rates differed highly significantly
• Limitation:
– Bias estimates limited to voting information available from voting records (not true current voting information)

2.2 Matching External Data to the Sample
• Sometimes administrative or other data sets covering the entire sample exist
• Example: employee records, pension records, voting records
• The difference between record-based statistics from the full sample and statistics from respondents only is an indicator of nonresponse bias

2.2 How to Conduct a Nonresponse Bias Study Using Matched Data
1. Locate an external data file covering the entire sample
2. Match individual records between the sampling frame data and the external data
3. Compute statistics for the full sample and for respondents only (using base weights); the difference is the estimated nonresponse bias
4. Classify nonrespondents by reasons (e.g., noncontact/refusal) and compute statistics for these groups to identify sources of bias
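A minimal sketch of the matched-data approach just listed; the case IDs, "pension_amt" variable, and values are hypothetical, and base weights (omitted here for brevity) would be applied in the same way as in the frame-data example:

```python
import pandas as pd

# Hypothetical files: the sample (with response status) and external
# administrative records, matched on a shared identifier.
sample = pd.DataFrame({"case_id": [1, 2, 3, 4, 5],
                       "respondent": [1, 1, 0, 1, 0]})
records = pd.DataFrame({"case_id": [1, 2, 3, 4, 5],
                        "pension_amt": [1200, 800, 2500, 950, 1900]})

merged = sample.merge(records, on="case_id", how="left", validate="1:1")

full_mean = merged["pension_amt"].mean()
resp_mean = merged.loc[merged["respondent"] == 1, "pension_amt"].mean()
print(f"estimated nonresponse bias: {resp_mean - full_mean:+.1f}")
```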
2.2 Statistical Tests using Matched Variables
• Using the matched variables as the outcome (Y) variable, calculate the respondent mean and the nonrespondent mean.
– Can also calculate refusal and noncontact means.
• Calculate the appropriate (design-based) t-tests or chi-square tests of differences in means or proportions between the respondents and nonrespondents
– Test the null hypothesis that the respondent mean and nonrespondent mean are identical

2.2 Pros and Cons of Using Matched Data to Estimate Nonresponse Bias
• Pros
– Measurement properties for the variables are consistent for respondents and nonrespondents
– Bias is strictly due to nonresponse
– Provides data on the correlation between the propensity to respond and the variables
• Cons
– Matching problems diminish the value of the estimates
– Bias estimates are for the external data variables; only external data variables highly correlated with the key survey statistics are relevant
– The method assumes no nonresponse adjustments are made in producing the survey estimates; if external data variables are highly correlated, then they usually are used in adjustment
– Item nonresponse on the external data reduces the utility of the variables

Lee et al. (2009). "Exploring Nonresponse Bias in a Health Survey Using Neighborhood Characteristics." American Journal of Public Health, 99, 1811-1817.
• Purpose: Estimate health characteristics of California residents
• Target population: Adults in households in California in 2005
• Sample design: RDD list-assisted sample
• Mode of data collection: CATI telephone
• Response rate: Screener RR4: 49.8%; sampled adult: 54.0%; overall RR4: 26.9%
• Target estimate: Various estimates of health outcomes
• Nonresponse error measure: Merged information from telephone exchanges defined by census measures

[Figure: Percentage in census tract who speak only English at home, and percentage below 100% of the federal poverty level (relative bias = 2.3%, −2.1%)]

Conclusions
• Differences between respondents and nonrespondents on telephone exchange-level data based on Census 2000 were modest across 90 characteristics
• Much larger differences between noncontacts and 'other' nonrespondents versus respondents than between refusals and respondents
• The exchange-level data were used to impute values for the actual survey variables, and very minor differences were found
• Limitation:
– Limited predictive power at the level of the exchange

Parsons, N. L., & Manierre, M. J. (2014). Investigating the Relationship among Prepaid Token Incentives, Response Rates, and Nonresponse Bias in a Web Survey. Field Methods, 26(2), 191-204.
• Purpose: Campus housing amenities and student characteristics on academic outcomes
• Target population: Undergraduate freshmen at Eastern Connecticut State University
• Sample design: Sample of students who live on campus, drawn from the registrar
• Mode of data collection: Web questionnaire
• Response rate: 37.6% no-incentive group; 49.4% incentive group
• Target estimate: Credits; GPA
• Nonresponse error measure: Comparison to student academic records

[Figure: Academic experiences for the two incentive groups (relative bias = −1%, 13%, 0.4%, 14%). Credits and GPA for the no-incentive and $2 prepaid incentive groups: truth versus respondents versus nonrespondents. Source: Parsons and Manierre (2014)]

Conclusions
• Positive affect toward a request from the school is a function of number of credits enrolled and GPA
• Limitation:
– Bias estimates limited to variables on records
Assael, H., and Keon, J. (1982). Nonsampling vs. sampling errors in survey research. Journal of Marketing, 46, 114-123.
• Purpose: Methodological study of telephone customers
• Target population: Small businesses in 4 large cities
• Sample design: Subscriber sample of 1,579 business customers
• Mode of data collection: Phone, mail, face-to-face, and drop-off
• Response rate: 57% average across several recruitment methods; on average 60% mail, 57% phone, 59% face-to-face, and 52% drop-off, including item missing data
• Target estimate: Phone bill, number of instruments
• Nonresponse error measure: Comparison to company records

[Figure: Monthly billings and number of telephone instruments, by mode (relative bias = 12%, 27%, 35%, 4% for billings; 6%, 21%, 17%, 12% for instruments). For each mode (mail, phone, face-to-face, drop-off), "truth," respondents, and nonrespondents are compared.]

Conclusions
• Nonrespondents tend to make smaller use of telephone services
• No way of knowing whether the common cause is the size of the business
• Limitation:
– Bias estimates limited to variables on records

2.3 Observations Taken on Respondents and Nonrespondents During Data Collection
• Interviewers are sometimes asked to make observations about sample units, both respondents and nonrespondents
• When observations are correlated with key survey variables, they may be informative about potential nonresponse bias in the survey variables

2.3 How to Conduct a Nonresponse Bias Study Using Observation Data
1. Identify attributes of sample cases that
a. Can be observed
b. Are related to response propensity or the survey variables
2. Develop a measurement approach to making the observations on both respondents and nonrespondents
3. Compute statistics on those observations for the full sample and for respondents only (using base weights); the difference is the estimated nonresponse bias

2.3 Statistical Tests using Observation Variables
• Using the observation variables as the outcome (Y) variable, calculate the respondent mean and the nonrespondent mean.
– May not be able to calculate the mean separately for refusals and noncontacts, depending on the observation.
• Calculate the appropriate (design-based) t-tests or chi-square tests of differences in means or proportions between the respondents and nonrespondents
– Test the null hypothesis that the respondent mean and nonrespondent mean are identical

2.3 Pros and Cons of Using Observation Data to Estimate Nonresponse Bias
• Pros
– Bias is strictly due to nonresponse
– Provides data on the correlation between the propensity to respond and the variables
• Cons
– It is sometimes difficult to ensure that measurement properties for the variables are consistent for respondents and nonrespondents
– Bias estimates are for the observation variables; only observation variables highly correlated with the key survey statistics are relevant
– The method assumes no nonresponse adjustments are made in producing the survey estimates; if observation variables are highly correlated, then they usually are used in adjustment
Lynn, P. (2003). PEDASKI: Methodology for collecting data about survey nonrespondents. Quality and Quantity, 37, 239-261.
• Purpose: Measurement of crime victimization
• Target population: UK household population
• Sample design: Multi-stage sample
• Mode of data collection: Face-to-face
• Response rate: 83.5%
• Target estimate: Interviewer observation variables on the sample unit
• Nonresponse error measure: Difference between respondent and nonrespondent sample units on interviewer observations

[Figure: Percentage of detached structures and percentage with an entryphone at the entrance of the sample unit (relative bias = 5%, −10%): "truth" versus respondents versus nonrespondents]

Conclusions
• Respondent households overestimate the prevalence of detached houses and underestimate the prevalence of units with entryphones
• Limitation:
– These estimates are not key estimates of the survey

Sastry, N. and Pebley, A.R. (2003). Nonresponse in the Los Angeles Family and Neighborhood Survey. RAND Working Paper Series 03-01.
• Purpose: Measurement of neighborhood effects on social and economic outcomes of adults and children
• Target population: Los Angeles County residents
• Sample design: Multi-stage sample
• Mode of data collection: Face-to-face
• Response rate: 85% for the randomly selected adult
• Target estimate: Interviewer observation variables on the sample unit
• Nonresponse error measure: Difference between respondent and nonrespondent sample units on interviewer observations

[Figure: Percentage in apartments and percentage with rent under $500 (relative bias = 2%, −0.1%)]

Conclusions
• Respondent households overestimate the prevalence of apartments; little bias on estimated rent
• Limitation:
– These estimates are not key estimates of the survey

West, B. T. (2013). An examination of the quality and utility of interviewer observations in the National Survey of Family Growth. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(1), 211-225.
• Purpose: Measurement of fertility experiences
• Target population: U.S. household population 15-44 years old
• Sample design: Multi-stage area probability sample
• Mode of data collection: Face-to-face
• Response rate: 79%
• Target estimate: Lister observation variables on the sample unit
• Nonresponse error measure: Difference between respondent and nonrespondent sample units on lister observations

Odds ratios predicting main interview response propensity:

Interviewer observes…                       Adjusted odds ratio
Physical impediments to household           0.986
High main interview probability             1.559
Medium main interview probability           0.295
Low main interview probability              0.093
Does not report main interview probability  --
All HUs in segment residential              0.999
Safety concerns in segment                  1.018
Respondent sexually active                  1.923
Children under 15 years in HH               1.184

Conclusions
• Interviewer observations significantly improve the fit of the response propensity model
– They also significantly predict important survey variables
– But measurement error in the observations increases the RMSE of adjusted estimates
• Limitations:
– Measurement error in the observations can be studied only for respondents
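A minimal sketch of the kind of propensity model behind the odds-ratio table above: a logistic regression of the response indicator on interviewer observations, with exponentiated coefficients as adjusted odds ratios. All observations, effect sizes, and the sample size are simulated for illustration, not taken from the NSFG:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: observations recorded for all sampled cases,
# plus the response indicator.
rng = np.random.default_rng(3)
n = 2000
obs = np.column_stack([
    rng.binomial(1, 0.3, n),   # physical impediments observed
    rng.binomial(1, 0.4, n),   # safety concerns in segment
    rng.binomial(1, 0.5, n),   # children under 15 in household
])
logit_p = -0.2 - 0.5 * obs[:, 0] + 0.3 * obs[:, 2]
resp = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(obs)
fit = sm.Logit(resp, X).fit(disp=False)
print(np.exp(fit.params[1:]))   # adjusted odds ratios for the 3 observations
```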
2.4 Using a Seeded Sample
• Seed the sample by including units with known characteristics to see if the units respond at different rates (similar to the use of sampling frame data, but for a subset of the sample)
• Example: seed the sample with persons who are and are not members of an organization
• The difference in seeded-sample response rates by the characteristic is used to assess bias

2.4 How to Conduct a Seeded Sample Nonresponse Bias Study
1. Find a source with highly reliable characteristics related to key survey estimates, and include some units with and without the characteristic in the sample
2. Include the seeded sample in the survey operations, making sure the seeded units are handled just like all other units (e.g., interviewers are blinded to the seeding)
3. Compute response rates by the characteristic. If the rates are similar, then estimates correlated with the characteristic have small biases; if the rates are very different, then the estimates are subject to large biases, depending on the response rate. Generally this analysis is done without survey weights, and bias estimates are not produced
4. Examine the source of nonresponse by tabulating response rates by reasons (e.g., noncontact/refusal) for the seeded sample by the characteristics

2.4 Pros and Cons of Using a Seeded Sample to Examine Bias
• Pros
– Measurement properties for the variables are consistent for the seeded sample
– Bias is strictly due to nonresponse
– Provides estimates of the correlation between response propensity and the seeded-sample characteristic
• Cons
– Estimates can only be produced for variables known from the seeded sample, and the seeded sample is often small
– Bias estimates are usually not produced
– The method assumes no nonresponse adjustments are made in producing the survey estimates

Groves, R., Presser, S., and Dipko, S. (2004). The role of topic interest in survey participation decisions. Public Opinion Quarterly, 68, 2-31.
• Purpose: Support of schools and education
• Target population: Adults in U.S. telephone households; seeded sample of teachers
• Sample design: List-assisted RDD; stratified sample of a teacher list
• Mode of data collection: Telephone
• Response rate: Variable
• Target estimate: Sensitivity of teachers to the topic
• Nonresponse error measure: Comparison of response rates between teachers and the RDD sample

[Figure: Response rates for a survey on education and the schools, teachers versus the full RDD sample]

Conclusions
• Teachers respond at much higher rates to a survey on a topic of potential interest to them
• Limitation:
– No direct nonresponse bias estimate on key statistics; this is possible only with a frame containing such information
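A minimal sketch of the response-rate comparison a seeded-sample study produces, here as a two-proportion z-test; the counts for the "with characteristic" (e.g., teacher) and "without characteristic" groups are hypothetical:

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: respondents among seeded cases with and without
# the characteristic of interest.
resp_counts = np.array([320, 190])   # respondents in each group
n_cases = np.array([500, 500])       # seeded cases fielded in each group

rates = resp_counts / n_cases
z, p = proportions_ztest(resp_counts, n_cases)
print(f"rates {rates[0]:.1%} vs {rates[1]:.1%}, z = {z:.2f}, p = {p:.4f}")
```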
Montaquila, J.M., Williams, D., and Han, D. (2011). An application of a two-phase address-based approach to sampling for subpopulations. Washington Statistical Society Methodology Seminar.
• Purpose: Characteristics of children's care and education
• Target population: Infants, preschoolers, and school-age children; seeded sample of addresses identified as corresponding to households with children
• Sample design: ABS; SRS from the seeded sample frame
• Mode of data collection: Primarily mail
• Response rate: (See following figure)
• Target estimate: Characteristics of care/education
• Nonresponse error measure: Comparison of response rates between the national ABS sample and the seeded sample

[Figure: Response rates for the seeded sample of addresses identified as having children and the national ABS sample (screenout version of the screener)]

Conclusions
• Households with children respond at a higher rate to a survey on a topic of potential interest to them
• Limitations:
– No direct nonresponse bias estimate on key statistics; possible only with a frame containing such information (but NHES used other approaches as well)
– Seeding is not clean; only 81% of "addresses with children" enumerated eligible children

2.5 Comparing Response Rates on Subgroups
• Since nonresponse bias results only when subgroups with different characteristics have different response rates, this approach examines the response rate component
• Response rates are computed for subgroups, often using frame data
• If the response rates are not substantially different, there should not be a large nonresponse bias in statistics for the groups

2.5 How to Conduct a Nonresponse Bias Study Comparing Response Rates
1. Identify attributes of the sample that are knowable for both respondents and nonrespondents
2. Compute response rates for categories of these attributes
3. These response rate differences provide insights into possible nonresponse bias, to the extent that the attribute variables are correlated with the survey variables

2.5 Statistical Tests for Response Rates on Subgroups
• Identify subgroups of interest using frame, matched, or observation variables
• Calculate the response rate for each of these groups
• Use t-tests or chi-square tests to evaluate differences in the response rates across the groups
• If there is no difference in response rates across the groups, then conclude there is no nonresponse bias related to that particular subgroup

2.5 Pros and Cons of Response Rate Analysis to Estimate Bias
• Pros
– Simple and inexpensive
– Provides some evidence about the potential for bias
• Cons
– Does not provide a direct estimate of nonresponse bias
– Only limited (frame) variables are examined
– Assumes no nonresponse adjustments are made in producing the survey estimates
– Item nonresponse on the frame reduces the utility of the variables
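A minimal sketch of the subgroup response-rate test in 2.5: a chi-square test of independence between subgroup and response status. The respondent/nonrespondent counts by subgroup are hypothetical:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical respondent / nonrespondent counts by a frame subgroup
# (e.g., municipality size class).
table = np.array([
    [45, 55],    # smallest units: 45 respondents, 55 nonrespondents
    [120, 60],
    [210, 70],
    [95, 20],    # largest units
])

rates = table[:, 0] / table.sum(axis=1)
chi2, p, dof, _ = chi2_contingency(table)
print(f"response rates by subgroup: {np.round(rates, 3)}")
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```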
van Goor, H., and Stuiver, B. (1998). Can weighting compensate for nonresponse bias in a dependent variable? An evaluation of weighting methods to correct for substantive bias in a mail survey among Dutch municipalities. Social Science Research, 27, 481-499.
• Purpose: Study implementation of policy on caravan sites
• Target population: All Dutch municipalities
• Sample design: Two-stage sample of municipalities from provinces
• Mode of data collection: Mail
• Response rate: 74%
• Target estimate: Characteristics related to the policy performance of municipalities
• Nonresponse error measure: Comparison of response rates (other measures in the study)

[Figure: Response rates by municipality size (<5, 5-10, 10-20, 20-50, >50 thousand population) and the associated nonresponse bias estimate for the percentage of municipalities with less than 5,000 population (relative bias = −16%): "truth" versus respondents versus nonrespondents, assuming a self-weighting sample of municipalities]

Conclusions
• Response rates differ by municipality size, which is correlated with policy performance
• Limitation:
– Other measures available in the survey were used to make judgments about nonresponse bias

McFarlane, E., Murphy, J., Olmsted, M.G., Severance, J. (2010). The effects of a mixed-mode experiment on response rates and nonresponse bias in a survey of physicians. Presented at the Joint Statistical Meetings.
• Purpose: The physician survey is one component used to arrive at U.S. News and World Report's "America's Best Hospitals" rankings
• Target population: Board-certified physicians
• Sample design: Stratified single-stage sample of physicians (n = 3,112)
• Mode of data collection: Mail only; mail with web follow-up; mail and web
• Response rate: 48%
• Target estimate: Hospital rankings
• Nonresponse error measure: Comparison of response rates

[Figure: Response rates by age (mail-only cases) and the associated nonresponse bias estimate (relative bias = −8.8%)]

Conclusions
• Response rates differ by age of physician
• Limitation:
– Key outcomes of interest are the hospital rankings (not age). However, to the extent that hospital rankings vary by age of physician, the finding reported above is indicative of potential bias.

2.6 Calculate R-indicators
• Since nonresponse bias results only when subgroups with different characteristics have different response rates, this approach also examines the response rate component
• Examines variability in response propensities across sampled persons

2.6 How to Calculate R-indicators
1. Obtain frame data with an indicator for respondent or nonrespondent
2. Using logistic regression, estimate a model predicting the probability of being a respondent as a function of frame variables
3. Obtain predicted probabilities from the model. These are person-level estimates of response propensity.
4. Calculate the base-weighted standard deviation of the response propensities, S(p)
5. Calculate the R-indicator, R(p) = 1 − 2S(p), where R(p) = 1 indicates strong representativeness and R(p) = 0 indicates weak representativeness
6. Can also calculate partial R-indicators to look at the influence of individual variables

2.6 Pros and Cons of Using R-indicators
• Pros
– A single number that is an indicator of how representative the survey is on a variety of characteristics
• Cons
– The value of the R-indicator depends on which variables are included in the propensity model and whether interaction terms are included
– Does not provide a measure of nonresponse bias for individual survey estimates
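A minimal sketch of the R-indicator steps just listed, using R(p) = 1 − 2S(p) with a base-weighted standard deviation of fitted propensities. The frame variables, response mechanism, and weights are all simulated for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def r_indicator(frame_df, resp_col, x_cols, wt_col):
    """R-indicator R(p) = 1 - 2*S(p), with S(p) the base-weighted
    standard deviation of estimated response propensities."""
    X = sm.add_constant(frame_df[x_cols])
    phat = sm.Logit(frame_df[resp_col], X).fit(disp=False).predict(X)
    w = frame_df[wt_col]
    pbar = np.average(phat, weights=w)
    s_p = np.sqrt(np.average((phat - pbar) ** 2, weights=w))
    return 1 - 2 * s_p

# Hypothetical frame data with two auxiliary variables and base weights.
rng = np.random.default_rng(11)
df = pd.DataFrame({"age": rng.integers(18, 80, 1500),
                   "urban": rng.binomial(1, 0.6, 1500),
                   "base_wt": rng.uniform(1, 4, 1500)})
df["resp"] = rng.binomial(1, 0.3 + 0.005 * df["age"])  # response depends on age
print(f"R-indicator: {r_indicator(df, 'resp', ['age', 'urban'], 'base_wt'):.3f}")
```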
Schouten, B., Cobben, F. and Bethlehem, J. (2009). "Indicators for the representativeness of survey response." Survey Methodology, 35(1), 101-113.
• Purpose: Estimate the unemployment rate in the Netherlands
• Target population: All adults in the Netherlands
• Sample design: Area sample with a subsample of nonrespondents
• Mode of data collection: Face-to-face, telephone, mail, and web
• Response rate: 62.2% for the main study; 76.9% for main with call-back to nonrespondents; 75.6% for main with shortened questionnaire for nonrespondents
• Target estimate: Varies
• Nonresponse error measure: R-indicator

R-indicators calculated for two stages of the recruitment protocol (Labor Force Survey):

Protocol                              R-indicator   95% confidence interval
Main only                             80.1%         (77.5, 82.7)
Main + call-back to nonrespondents    85.1%         (82.4, 87.8)
Main + shortened questionnaire        78.0%         (75.6, 80.4)

Source: Schouten, Cobben and Bethlehem, Table 4

Conclusions
• Using the call-back approach made the sample more representative, but shortening the questionnaire did not
• Values of R-indicators depend on the variables included in the propensity model

Lynn, P. (2013). Alternative Sequential Mixed-Mode Designs: Effects on Attrition Rates, Attrition Bias, and Costs. Journal of Survey Statistics and Methodology, 1(2), 183-205.
• Purpose: Estimate the effect of mode change in the UK Household Longitudinal Study
• Target population: All households in the United Kingdom
• Sample design: Area sample
• Mode of data collection: Face-to-face, telephone
• Response rate: 73.9%, 65.2%, 57.1% at Waves 2, 3, 4 face-to-face; 65.6%, 59.8%, 54.0% at Waves 2, 3, 4 mixed mode
– Telephone light = start with phone, switching to face-to-face when the interviewer indicated it was needed; telephone heavy = switch to face-to-face only at the last possible moment
• Target estimate: Varies
• Nonresponse error measure: R-indicator

R-indicators – Wave 4 response using Wave 1 data:

Design                         7 covariates   19 covariates
Face-to-face                   0.727          0.662
Mixed mode – telephone light   0.731          0.684
Mixed mode – telephone heavy   0.668          0.651

Conclusions
• Despite differences in response rates, there is little difference in composition across the single-mode and mixed-mode designs
• Limitations:
– R-indicators are limited to the variables included
– Not clear what would happen in the first wave of a study

3. STUDYING VARIATION WITHIN THE RESPONDENTS

3. Studying Variation within the Respondent Set
• Data collected as part of the survey, or as an addition to the survey, may be used to evaluate bias due to nonresponse in the survey estimates
3.1 Use of screener and prior wave data
3.2 Following up nonrespondents
3.3 Two-phase (double) sampling of nonrespondents
3.4 Analyzing estimates by level of effort
3.5 Analyzing estimates by predicted response propensity
3.6 Mounting randomized nonresponse experiments

3.1 Screening or Prior Wave Data as a Nonresponse Bias Study
• Some surveys screen the sample and then conduct more detailed interviews with all or a subsample of the screened units (e.g., one adult reports for all adults in the household, and one adult is then sampled for detailed data)
• Longitudinal surveys have data available on some sample cases from prior waves of data collection
• Often the final weights are for the second-stage respondents and use screener data to adjust weights for nonresponse. To estimate bias, this adjustment must be eliminated.
3.1 How to Conduct a Nonresponse Bias Study Using Screener or Prior Wave Data
1. Try to maximize the screening or prior wave response rate
2. Collect data on variables correlated with the survey variables during the screening or prior wave
3. Compute statistics for the full sample and for main or current-wave survey respondents only (using base weights); the difference is the estimated nonresponse bias on the screener or prior wave variables

3.1 Statistical Tests using Screener or Prior Wave Data
• Using the screener or prior wave variables as the outcome (Y) variable, calculate the respondent mean and the nonrespondent mean.
– Can also calculate refusal and noncontact means.
• Calculate the appropriate (design-based) t-tests or chi-square tests of differences in means or proportions between the respondents and nonrespondents
– Test the null hypothesis that the respondent mean and nonrespondent mean are identical

3.1 Pros and Cons of Screener Data to Estimate Nonresponse Bias
• Pros
– Measurement properties for the variables are consistent for respondents and nonrespondents
– Bias is strictly due to nonresponse
– Provides data on the correlation between the propensity to respond and the variables
• Cons
– Nonresponse in the screening step reduces the scope of the sample that the bias estimates describe
– Bias estimates are for the screener data variables; only screener data variables highly correlated with the key survey statistics are relevant

Zabel, J. (1998). An analysis of attrition in the Panel Study of Income Dynamics and the Survey of Income and Program Participation with an application to a model of labor market behavior. Journal of Human Resources, 33, 479-506.
• Purpose: Measure labor market and income change over time
• Target population: U.S. household population
• Sample design: SIPP is an 8-wave (every 4 months) panel from an area probability design
• Mode of data collection: Phone, face-to-face
• Response rate: 92.4% in wave 1; 73.4% for all 8 waves
• Target estimate: Household income
• Nonresponse error measure: Comparison of those completing 8 waves with those who attrited sometime before wave 8

[Figure: Mean household income in wave 1 for SIPP 1990 respondents by attrition status (relative bias = 4%): wave 1 respondents versus nonattritors versus attritors, assuming weighting of nonattritors and attritors proportionate to unweighted case counts]

Conclusions
• Full-panel respondents have higher household incomes at wave 1 than those dropping out of the panel after wave 1
• Limitations:
– Estimate of nonresponse bias limited to wave 1 data
– Estimate of nonresponse bias does not include the component due to wave 1 nonresponse

Abraham, K.G., Maitland, A., and Bianchi, S.M. (2006). Nonresponse in the American Time Use Survey: Who is missing from the data and how much does it matter? Public Opinion Quarterly, 70, 676-703.
• Purpose: Measure how Americans use their time
• Target population: U.S. civilian noninstitutionalized population ages 15+
• Sample design: Random selection from households completing the 8th wave ("month-in-sample") of the Current Population Survey
• Mode of data collection: Phone
• Response rate: About 94% for the CPS 8th month-in-sample interview; 56% response rate for the 2004 ATUS
• Target estimate: Time spent on various activities
• Nonresponse error measure: Comparison of ATUS respondents to ATUS sampled cases (8th MIS CPS respondents)

[Figure: Percentage who work 45+ hours/week and percentage who rent (relative bias = 9%, −20%). Estimates computed by applying unweighted sample sizes to tabulated response rates.]
Conclusions
• Results "offer little support for [the] hypothesis that busy people are less likely to respond"
• "Consistent and significant differences in response rates across groups…seem to [support] the 'social integration' hypothesis"
• Limitation:
– Estimate of nonresponse bias limited to CPS 8th MIS respondents

3.2 Followup of Nonrespondents
• This is one of the most common techniques
• Uses respondent data obtained through extraordinary efforts as a comparison to respondent data obtained with traditional efforts
• "Effort" may include callbacks, incentives, change of mode, or use of an elite corps of interviewers

3.2 How to Do a Nonrespondent Followup Study
1. Define a set of recruitment techniques judged to be superior to those in the ongoing effort
2. Determine whether the budget permits use of those techniques on all remaining active cases
• If not, implement a 2nd-phase sample (described later)
3. Implement the enhanced recruitment protocol
4. Compare respondents obtained in the enhanced protocol with those from the initial protocol

3.2 Statistical Tests using Nonresponse Follow-up Studies
• Using the nonresponse follow-up variables as the outcome (Y) variable, calculate the main-study respondent mean and the nonresponse follow-up respondent (proxy nonrespondent) mean.
• Calculate the appropriate (design-based) t-tests or chi-square tests of differences in means or proportions between the main-study respondents and the nonresponse follow-up respondents
– Test the null hypothesis that the main-study respondent mean and nonresponse follow-up mean are identical

3.2 Pros and Cons of a Nonresponse Followup Study
• Pros
– Direct measures are obtained from previously nonrespondent cases
– The same measurements are used
– Nonresponse bias on all variables can be estimated
• Cons
– Rarely are followup response rates 100%
– Requires an extended data collection period

Criqui, M., Barrett-Connor, E., and Austin, M. (1978). Differences between respondents and nonrespondents in a population-based cardiovascular disease study. American Journal of Epidemiology, 108(5), 367-372.
• Purpose: Various health-related attributes
• Target population: 30-79 year old residents of a planned community in California
• Sample design: Telephone directory, roster of community center membership, census maps
• Mode of data collection: Telephone recruitment, face-to-face interview
• Response rate: 82.1%
• Target estimate: Heart failure in relative and self
• Nonresponse error measure: Followup of nonrespondents using a shortened questionnaire, yielding about 60% answering the question

[Figure: Percentage reporting heart-related events in a survey of cardiovascular disease, females (relative bias = 7%, −33%): percentage with heart attack in a relative and percentage hospitalized for heart failure, "truth" versus respondents versus nonrespondents]

Conclusions
• Respondents overestimate the extent of family history of heart failure but underestimate the prevalence of heart-related hospitalization
• Limitation:
– Followup efforts do not measure all nonrespondents
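A minimal sketch of the follow-up bias estimate: the follow-up respondents are treated as proxies for all nonrespondents, and the full-sample mean is rebuilt from the two groups. The prevalence figures and response rate below are hypothetical, not the Criqui et al. estimates:

```python
def followup_bias(resp_mean, followup_mean, response_rate):
    """Estimated nonresponse bias of the main-study respondent mean,
    treating follow-up respondents as proxies for all nonrespondents."""
    full_est = response_rate * resp_mean + (1 - response_rate) * followup_mean
    return resp_mean - full_est

# Hypothetical: 40% of main respondents report an event, 37% among
# follow-up respondents, main-study response rate 82.1%.
print(f"estimated bias: {followup_bias(0.40, 0.37, 0.821):+.4f}")
```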
Matsuo, et al. (2010). "Measurement and adjustment of nonresponse bias based on nonresponse surveys: the case of Belgium and Norway in the European Social Survey Round 3." Survey Research Methods, 4, 165-178.
• Purpose: Measure attitudes and beliefs throughout Europe
• Target population: Adults in Belgium and Norway
• Sample design: Area probability sample
• Mode of data collection: Face-to-face
• Response rate: Belgium: 61% main survey, 44.7% nonresponse follow-up; Norway: 64% main survey, 30% nonresponse follow-up
• Target estimate: Social participation and neighborhood security
• Nonresponse error measure: Short questionnaire administered to nonrespondents

[Figure: Percentage who feel very safe, participate in social activities much less than most, and are not at all interested in politics (relative bias = 3.4%, −23.4%, −30.7%). Source: Table 3 in Matsuo, et al. (2010)]

Conclusions
• The short questionnaire to nonrespondents was successful in obtaining characteristics useful for nonresponse adjustments
• The estimates before nonresponse adjustment had significant but small biases
• Only one statistic had a significant bias after the nonresponse adjustment
• Limitation:
– Bias estimates do not reflect nonresponse on the followup questionnaire

3.3 Two-Phase (Double) Sampling for Nonrespondents
• This is a form of nonrespondent followup, limited to a subset of the remaining nonrespondents
• The first "phase" is the original selection; the second "phase" is the subsampling of cases not measured in the first recruitment efforts
• An attractive option when the first phase is relatively cheap (but has a low response rate) and the second phase is expensive (with very high response rates)
• Data from the two phases are combined using weights that are reciprocals of the products of the first-phase and second-phase selection probabilities

3.3 How to Conduct a Two-Phase Sample of Nonrespondents
1. Define a second-phase recruitment protocol that is attractive to nonrespondents from the first phase
2. Determine the sampling fraction fitting the available budget
3. Implement a probability sample of the remaining nonrespondents
4. Combine first-phase and second-phase respondent data, reflecting both selection probabilities

3.3 Statistical Tests using a Two-Phase Sample Design
• Using the survey variables as the outcome (Y) variable, calculate the 1st-phase respondent mean and the 2nd-phase (nonrespondent) mean.
• Calculate the appropriate (weighted design-based) t-tests or chi-square tests of differences in means or proportions between the 1st-phase respondents and 2nd-phase respondents
– Test the null hypothesis that the 1st-phase respondent mean and 2nd-phase respondent mean are identical

3.3 Pros and Cons of a Two-Phase Sample of Nonrespondents
• Pros
– With a 100% response rate in the second phase, the design removes nonresponse error
– Explicitly deals with the costs of nonrespondent followup by permitting application of a powerful recruitment protocol to a subset
• Cons
– Rarely are 100% response rates obtained
– Standard errors of estimates are generally inflated due to the differential weighting arising from the second-phase selection
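A minimal sketch of combining the two phases with weights equal to the reciprocal of the product of the selection probabilities. The selection probabilities, response mechanism, and y values are hypothetical, and for simplicity every subsampled phase-2 case is assumed to respond:

```python
import numpy as np
import pandas as pd

# Hypothetical design: phase-1 selection probability p1 for all cases;
# phase-2 subsampling of remaining nonrespondents at rate p2.
p1, p2 = 0.01, 0.25

rng = np.random.default_rng(5)
df = pd.DataFrame({"y": rng.normal(50, 10, 1000),
                   "phase1_resp": rng.binomial(1, 0.3, 1000)})

nonresp = df.index[df["phase1_resp"] == 0]
phase2 = rng.choice(nonresp, size=int(p2 * len(nonresp)), replace=False)
df["phase2_resp"] = 0
df.loc[phase2, "phase2_resp"] = 1   # assume all subsampled cases respond

# Combined weights: reciprocal of the product of selection probabilities.
df["wt"] = np.where(df["phase1_resp"] == 1, 1 / p1,
                    np.where(df["phase2_resp"] == 1, 1 / (p1 * p2), 0))

keep = df["wt"] > 0
print(np.average(df.loc[keep, "y"], weights=df.loc[keep, "wt"]))
```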
Peytchev, A., Carley-Baxter, L. R., & Black, M. C. (2011). Multiple Sources of Nonobservation Error in Telephone Surveys: Coverage and Nonresponse. Sociological Methods & Research, 40(1), 138-168.
• Purpose: Estimate intimate partner violence
• Target population: Civilian noninstitutionalized U.S. population
• Sample design: Landline telephone numbers
• Mode of data collection: Telephone
• Response rate: 28.5%, followed by a double sample, raising the weighted response rate to about 35.5%
• Target estimate: Percentage of women reporting that they had one or more sexual experiences
• Nonresponse error measure: Respondents to the double-sample nonresponse follow-up

[Figure: Probability-weighted estimates for the National Intimate Partner and Sexual Violence Survey pilot: stalking, sexual violence, and physical aggression for women and men, respondents versus nonrespondents]

Conclusions
• The double sample introduced prepaid and promised incentives and a shorter questionnaire
• The double sample brought into the respondent pool older sample persons, with less aggression
• Limitation:
– Bias associated with nonresponse in the 2nd-phase sample is not reflected

Dallosso, H., Matthew, R., McGrother, C., Clarke, M., Perry, S., Shaw, C., and Jagger, C. (2003). An investigation into nonresponse bias in a postal survey on urinary symptoms. British Journal of Urology, 91, 631-636.
• Purpose: Estimate symptoms of urinary incontinence
• Target population: Adults in Leicestershire
• Sample design: 55,527 adults, age 40 or more, on the Leicestershire Health Authority list
• Mode of data collection: Mail questionnaire
• Response rate: 63.3% on the mail questionnaire (49% in the followup)
• Target estimate: Various urinary incontinence symptoms
• Nonresponse error measure: Comparison of early to late responders; followup of a double sample of 1,050 nonresponders by face-to-face interviews

Odds ratios comparing nonrespondents to respondents (face-to-face followup of the mail survey) and late to early responders (mail only):

Variable                   Prevalence statistic       OR (NR to R)   OR (late to early)
Urinary leakage            Several times a month      1.11           0.93
Stress leakage             Several times a month      1.33           1.08
Urge leakage               Several times a month      1.19           1.05
Freq. of micturition       Hourly or more             1.44           0.94
Nocturia                   3 times/night or more      1.14           0.87
Frequency of strong urge   Several times a month      1.09           1.22
Mean                                                  0.84           0.95

Conclusions
• Late respondents to the mail questionnaire were slightly less likely to have urinary incontinence symptoms; respondents to the double-sample face-to-face interview were more likely
• Limitations:
– Two modes complicate the estimates (however, underreporting bias is generally higher with an interviewer)
– The nonresponse rate in the double-sample measurement complicates the estimates (49% of eligibles)

3.4 Analyzing Estimates by Level of Effort
• Some nonresponse models assume that units that require more effort to respond (more callbacks, incentives, refusal conversion) are similar to the units that do not respond
• Characteristics are estimated for respondents by level of effort
• Models are fitted to see whether they fit and can be used to estimate characteristics of nonrespondents

3.4 How to Analyze Level of Effort
1. Associate level-of-effort data with respondents (e.g., number of callbacks, ever refused, early or late responder)
2. Compute statistics for each level of effort separately (usually unweighted or with base weights only)
3. If there is a (linear) relationship between level of effort and the statistic, then one may decide to extrapolate to estimate the statistic for those that did not respond
4. Often more appropriate to do the analysis separately for major reasons for nonresponse
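A minimal sketch of the level-of-effort steps just listed: means by call attempt, plus a linear regression of the survey variable on the number of attempts. The data are simulated with a built-in downward trend; extrapolating such a trend to nonrespondents rests on the strong, untestable assumption that the trend continues:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical respondent file: survey outcome y and number of call
# attempts needed to complete the interview.
rng = np.random.default_rng(9)
calls = rng.integers(1, 9, 800)
y = 100 - 3 * calls + rng.normal(0, 10, 800)
df = pd.DataFrame({"calls": calls, "y": y})

# Mean by successive level of effort.
print(df.groupby("calls")["y"].mean().round(1))

# Linear trend in the estimate across levels of effort.
fit = sm.OLS(df["y"], sm.add_constant(df["calls"])).fit()
print(fit.params)
```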
3.4 Pros and Cons of Using Level-of-Effort Analysis to Estimate Bias
• Pros
– Simple to do, provided data collection systems capture the pertinent information
– In some surveys, may provide a reasonable indicator of the magnitude and direction of nonresponse bias
• Cons
– Highly dependent on model assumptions that have not been validated in many applications
– Difficult to extrapolate to produce estimates of nonresponse bias without other data
149

Lin, I.-F., and Schaeffer, N. (1995). Using survey participants to estimate the impact of nonparticipation. Public Opinion Quarterly, 59, 236-258.
• Purpose: Child support awards and payments
• Target population: Divorce cases with child-support-eligible children in 20 Wisconsin counties
• Sample design: Element sample of court cases
• Mode of data collection: Telephone
• Response rate: 69.7% mothers, 57% fathers
• Target estimate: Child support dollars owed
• Nonresponse error measure: Comparison to court records for respondents and nonrespondents
150

[Figure: Mean dollars owed in child support (nonresident mothers), for interviews by call number, refusals, and noncontacts. Lin and Schaeffer (1995)]
151

Conclusion
• The mean amount owed in child support differs between refusals and noncontacts, and both differ from those completing the interview
• The relationship between the number of calls and the amount owed is not clear, so there is little evidence that nonrespondents resemble those who respond only with more effort
• Limitation:
– Limited variables on the records; only officially documented payments appear on the records
152

Curtin, R., Presser, S., and Singer, E. (2000). The effects of response rate changes on the index of consumer sentiment. Public Opinion Quarterly, 64, 413-428.
• Purpose: Estimate the index of consumer sentiment
• Target population: Adults in U.S. telephone households
• Sample design: List-assisted RDD
• Mode of data collection: Telephone
• Response rate: Varied between 68-71%
• Target estimate: Index of consumer sentiment
• Nonresponse error measure: Comparison of early to late responders
153

Correlations of ICS quarterly estimates between the "all-call" design and restricted call designs (Curtin, Presser, and Singer 2000):

Correlation type | Level estimates | Change estimates
All calls vs. initial cooperators | .979 | .772
All calls vs. 1-5 calls | .980 | .781
Number of quarters | 70 | 69
154

Conclusions
• Little effect of additional effort on the level of the index
• Somewhat greater effects on change estimates of the index
• Limitation:
– No reflection of bias arising from nonresponse remaining after the full effort
155
Kreuter, F., Müller, G., and Trappmann, M. (2010). Nonresponse and Measurement Error in Employment Research: Making Use of Administrative Data. Public Opinion Quarterly, 74(5): 880-906.
• Purpose: Estimate labor market participation
• Target population: Adults in German households
• Sample design: Dual frame, residential addresses plus a register of benefit recipients
• Mode of data collection: Telephone and in person
• Response rate: 28.7% register frame; 24.7% population frame
• Target estimate: Benefit receipt
• Nonresponse error measure: Estimates over number of call attempts
156

[Figure: Nonresponse bias for welfare benefit receipt, employment status, and foreign status over contact-level strata (1-2, 3-5, 6-8, 9-15, and >15 contact attempts; transfer to CAPI; refusal conversion), cumulative, compared with administrative "truth." Source: Kreuter, et al. (2010), Table 2]
157

Conclusions
• Variation in estimates over calls to first contact is sensitive to the at-home patterns of persons having different values on the estimates
• Limitation:
– No reflection of the bias remaining in the total data set
158

3.5 Analyzing Estimates by Predicted Response Propensity
• The stochastic model for nonresponse shows that the nonresponse bias of a mean is a function of the correlation between response propensity and the survey variable of interest
• Response propensity models are estimated and predicted propensities are obtained
• Examine the correlation between the predicted propensity and Y, or changes in the sample estimate over groupings of response propensity
159

3.5 How to Analyze Estimates by Predicted Response Propensity
1. Obtain frame data with an indicator for respondent or nonrespondent
2. Using logistic regression, estimate a model predicting the probability of being a respondent (1) versus a nonrespondent (0) as a function of frame variables
3. Obtain predicted probabilities from the model; these are person-level estimates of response propensity
4. Estimate the association between the predicted propensity and the survey variable of interest, using quintiles of the propensity distribution or correlations between the predicted propensity and the survey variable (sketched below)
5. It is often more appropriate to do the analysis separately for the major reasons for nonresponse
160

3.5 Statistical Tests Using Response Propensity Groups
• Using logistic regression, obtain predicted probabilities of participation for all respondents and nonrespondents
• Divide the predicted probabilities into discrete groups (usually five quintiles)
• Using the variables reported in the survey as the outcome (Y) variable, calculate the mean for each quintile
• Calculate the appropriate (design-based) t-tests or chi-square tests of differences in means or proportions between the quintiles
– Test the null hypothesis that the respondent means across quintiles are identical
161

3.5 Pros and Cons of Using Predicted Response Propensities to Estimate Bias
• Pros
– This procedure may already be in use to create nonresponse adjustment weights
– In some surveys, may provide a reasonable indicator of the magnitude and direction of nonresponse bias
• Cons
– Highly dependent on model assumptions and parameterization
– Assumes the correlation between the survey variables and propensity is the same among respondents and nonrespondents
162
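A minimal Python sketch of steps 1-4 on simulated frame data; the variable names and the data-generating model are illustrative assumptions, not part of the course materials.

```python
# Minimal sketch of a response propensity analysis on simulated frame data:
# logistic model, predicted propensities, quintiles, respondent means.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(seed=3)
n = 5000
frame = pd.DataFrame({
    "urban": rng.integers(0, 2, n),           # frame variables known for all cases
    "age": rng.integers(18, 90, n),
})

# Simulated response indicator related to the frame variables.
xb = -0.5 + 0.02 * (frame["age"] - 50) - 0.4 * frame["urban"]
frame["resp"] = (rng.random(n) < 1 / (1 + np.exp(-xb))).astype(int)

# Steps 1-3: logistic regression fit on ALL cases; predicted propensities.
X = sm.add_constant(frame[["urban", "age"]])
phat = sm.Logit(frame["resp"], X).fit(disp=0).predict(X)
frame["quintile"] = pd.qcut(phat, 5, labels=False) + 1

# Step 4: survey Y exists for respondents only; examine its drift across
# propensity quintiles (systematic movement flags nonresponse bias risk).
resp = frame[frame["resp"] == 1].copy()
resp["y"] = 30 + 5 * resp["urban"] + rng.normal(0, 10, len(resp))
print(resp.groupby("quintile")["y"].mean())
```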
Olson, K. (2006). Survey Participation, Nonresponse Bias, Measurement Error Bias, and Total Bias. Public Opinion Quarterly, 70(5): 737-758.
• Purpose: Estimate characteristics of divorce and remarriage
• Target population: Divorced couples
• Sample design: Simple random sample of divorce records in four Wisconsin counties
• Mode of data collection: Telephone and mail
• Response rate: 71%
• Target estimate: Mean length of marriage
• Nonresponse error measure: Change in the estimated mean length of marriage over quintiles of estimated cooperation propensity
163

[Figure: Mean length of marriage in months by quintile of estimated cooperation propensity. Olson, 2006, Figure 2]
164

Conclusions
• The respondent mean moves closer to the target ("truth") as low-propensity respondents are brought into the respondent pool
• No clear difference for this statistic in the pattern for record values versus reported values
165

Dahlhamer, J.M., and Simile, C.M. (2009). Subunit Nonresponse in the National Health Interview Survey (NHIS): An Exploration Using Paradata. Proceedings of the Government Statistics Section of the American Statistical Association.
• Purpose: Estimate health characteristics of the U.S. population
• Target population: Adults in the U.S. age 18+
• Sample design: Multi-stage area probability sample
• Mode of data collection: Face-to-face
• Response rate: 87% for household and family, 68% for sample adult
• Target estimate: Variety of estimates
• Nonresponse error measure: Change in the estimate of vaccinations over five response propensity quintiles
166

[Figure: Estimates by propensity quintile; percentage who received the influenza vaccine during the past 12 months, and percentage who engage in regular leisure-time physical activity. Dahlhamer and Simile, 2009, Figures 2 and 4]
167

Conclusion
• Unadjusted estimates may overestimate the prevalence of some diagnoses and of access to or use of health care, but not of health behaviors
• Paradata improved propensity model fit
• Explore expanding NHIS adjustments using propensity models
168

3.6 Randomized Experiments Affecting Nonresponse Rates
• Some survey features have been shown to affect response rates:
– incentives
– sponsorship
– main topic
• The full sample can be divided at random into subsamples, each assigned a different such feature
• Estimates of key statistics can be measured on the subsamples
169

3.6 How to Mount Randomized Nonresponse Experiments
1. Choose a design feature expected to affect response rates and hypothesized to affect nonresponse bias
2. Mount a randomized experimental design, assigning different design features to different random subsamples
3. Compare response rates and key survey estimates among the experimental treatment groups
170

3.6 Statistical Tests Using Survey Experiments
• Using the variables reported in the survey as the outcome (Y) variable, calculate the mean for each experimental condition
• Calculate the appropriate (design-based) t-tests, ANOVAs, or chi-square tests of differences in means or proportions between the experimental conditions (a sketch follows below)
– Test the null hypothesis that the respondent means across experimental conditions are identical
171

3.6 Pros and Cons of Nonresponse-Related Experiments
• Pros
– The pure effect of the design feature on response rates is obtained
• Cons
– Expensive to mount multiple protocols
– Without some external benchmark, which treatment is preferred is questionable
– The method offers potentially lower nonresponse bias in one treatment but higher biases in the other treatments, losing an opportunity to reduce bias in the full sample
172
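A minimal sketch of the analysis step in Python. The arm sizes, response counts, and outcome values are invented, and the simple chi-square and t-tests below stand in for the design-based tests a complex sample would require.

```python
# Minimal sketch of analyzing a randomized nonresponse experiment:
# response rates by condition, then respondent estimates by condition.
import numpy as np
from scipy import stats

n_arm = np.array([1000, 1000])     # cases randomized to the two conditions
resp = np.array([542, 499])        # respondents in each condition

# Do response rates differ between conditions?
chi2, p, _, _ = stats.chi2_contingency(np.array([resp, n_arm - resp]))
print("response rates:", resp / n_arm, "p =", round(p, 3))

# Do key respondent estimates differ between conditions?
rng = np.random.default_rng(seed=4)
y1 = rng.normal(52, 10, resp[0])   # simulated outcome, condition 1 respondents
y2 = rng.normal(50, 10, resp[1])   # simulated outcome, condition 2 respondents
t, p_y = stats.ttest_ind(y1, y2)
print("means:", y1.mean().round(1), y2.mean().round(1), "p =", round(p_y, 3))
```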
Merkle, D.M., and Edelman, M. (2009). An Experiment on Improving Response Rates and Its Unintended Impact on Survey Error. Survey Practice, 2(3).
• Purpose: Evaluate voting behavior in an exit poll
• Target population: New Jersey and New York City general election voters
• Sample design: Voting precincts, with a systematic sample of voters
• Mode of data collection: Paper questionnaire handed out by an interviewer
• Response rate: Eye-catching interviewer folder: 54.2%; folder plus a Voter News Service pen for the respondent: 55.4%; traditional approach: 49.9%
• Target estimate: Signed and absolute vote error [(Exit poll Dem % - Rep %) - (Official Dem % - Rep %)]
• Nonresponse error measure: Comparison of signed and absolute error by treatment group
173

[Figure: Signed and absolute error by treatment condition (folder/pen, folder, traditional). Merkle and Edelman (2009)]
174

Conclusions
• The folder increased response rates, but also increased error
– The folder significantly overrepresented Democratic candidate voters; the traditional approach slightly overrepresented Republican voters
• Limitation:
– Implemented only in New Jersey and New York City
175

Groves, R.M., Couper, M.P., et al. (2006). Experiments in Producing Nonresponse Bias. Public Opinion Quarterly, 70(5): 720-736.
• Purpose: Attitudes toward birding or mall design
• Target population: Members of the American Birding Association or general-population adults
• Sample design: Random sample from ABA members or from a list purchased from a commercial vendor
• Mode of data collection: Mail
• Response rate: Varied by topic of questionnaire
• Target estimate: Participation in bird-watching activities
• Nonresponse error measure: Comparison of response rates by survey topic treatment group
176

[Figure: Response rate and percentage of respondents interested in birding, for a survey about birding versus mall design. Groves, Couper, et al. (2006), Figures 7 and 8]
177

Conclusions
• Response rates for birders were highest when the topic was of interest to them
• Estimates of birding were also affected by survey topic
• Limitation:
– Not possible to determine whether the differences among general-population adults in the birding estimates were due to nonresponse or to measurement errors
178

4. COMPARING ESTIMATES USING ALTERNATIVE WEIGHTING OR IMPUTATION SCHEMES
179

4. Methods of Weight Adjustment
• Alter the estimation weights and compare the estimates under the various weights to evaluate nonresponse bias. Weighting methods may include poststratification, raking, calibration, logistic regression, or even imputation.
4.1 Prepare estimates under different assumptions
4.2 Adjust using models of characteristics
4.3 Adjust using models of response propensity
4.4 Adjust using selection (Heckman) models
180

4.1 Alternative Estimates Based on Different Assumptions
• A simple approach is to conduct a "what if" analysis that may not require new weights. The "what if" estimate assumes a distribution for the nonrespondents that differs from the observed data.
• If the difference between the "what if" estimates and the regular survey estimates is large, then the estimates have the potential for nonresponse bias.
181

4.1 How to Conduct a Nonresponse Bias Analysis Based on Different Assumptions
1. Identify key assumptions in the weighting and determine "what if" or alternative assumptions (e.g., very fine-grained nonresponse adjustment cells).
2. Assume that all or a large fraction of the nonrespondents have a specific characteristic. Compute an estimate using this assumption (or redo the estimate with fine-grained adjustment cells). The artificial example later in this section, and the sketch that follows it, work through this arithmetic.
3. The difference between the "what if" estimates and the regular survey estimates is a bound on the nonresponse bias.
182

4.1 Pros and Cons of Comparing Estimates Based on Different Assumptions
• Pros
– The "what if" option is usually inexpensive, and the fine-grained revision may not be difficult
– If information from previous or similar studies with follow-ups is available, then reasonably precise assumptions about the characteristics of nonrespondents can be postulated
• Cons
– Alternative estimates are often highly variable and may not be very informative about the actual bias
– Without previous studies, the alternative assumptions may not have much support
183
Kauff, J., Olsen, R., and Fraker, T. (2002). Nonrespondents and Nonresponse Bias: Evidence from a Survey of Former Welfare Recipients in Iowa. Mathematica Policy Research.
• Purpose: Understand the well-being of former TANF recipients
• Target population: Families that left Iowa's TANF program in 1999
• Sample design: Random sample from a list of former TANF recipients
• Mode of data collection: Telephone
• Response rate: 76%
• Target estimate: Percent employed and mean earnings
• Nonresponse error measure: "What if" estimates for labor market and health outcomes
184

[Figure: Estimates of percent employed and monthly earnings, assuming "best case" (all improved) and "worst case" (all declined; earnings at the 90th percentile) outcomes for nonrespondents (Bias = 3.6%, -8.7%; 8.9%, -8.6%). Kauff, Olsen, and Fraker (2002), Exhibit 4.2]
185

Conclusion
• Estimates based on "best" and "worst" case scenarios are similar to each other, with similar policy implications
• The survey has limited nonresponse bias in the wave 1 study
186

Volken, T. (2013). Second-stage non-response in the Swiss health survey: determinants and bias in outcomes. BMC Public Health, 13:167.
• Purpose: National survey on health status, utilization, and behavior
• Target population: Noninstitutionalized Swiss population fluent in German, French, or Italian
• Sample design: Multi-stage probability sample
• Mode of data collection: CATI or CAPI for the screener; mail for the additional questionnaire
• Response rate: 80.3% of screener respondents yielded a completed mail survey
• Target estimate: Self-rated health from the screener; imputed values for nonparticipants
• Nonresponse error measure: Comparison of screener answers or imputed answers for respondents and nonrespondents to the mail interview
187

[Figure: Respondent and imputed estimates for four health outcomes: influenza vaccination, arthrosis, osteoporosis, and high blood pressure]
188

Conclusion
• The magnitude of bias on most outcomes is estimated to be moderate
– The direction of the differences varies by gender
• Limitations
– Bias estimates are created from imputed values
– The imputation models may be incorrect
189

Artificial Example
• Purpose: Estimate characteristics of low-income households in a federal assistance program
• Target population: U.S. households in the program
• Sample design: Equal-probability sample from a list
• Mode of data collection: Telephone
• Response rate: 80%
• Target estimate: Number in poverty and percent Hispanic
• Nonresponse error measure: Alternative assumptions about the percent of nonrespondents with each characteristic
190

Artificial "What If" Scheme
Bias of the estimated proportion under alternative assumptions about the percent of nonrespondents in each group:

Characteristic | Percent of respondents | Bias if nonrespondent % is 1.5 times respondent % | Bias if 2.5 times respondent %
Below poverty | 25.0 | 2.5 | 7.5
Hispanic | 13.0 | 1.3 | 3.9
191

Conclusions
• Bias in the estimates of the percentage in the program who are in poverty or Hispanic may be substantial if the respondents and nonrespondents are very different
• Limitation:
– It is difficult to evaluate assumptions about the percentage of nonrespondents having a characteristic
192
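The table above follows directly from the bias identity introduced at the start of the course: bias = nonresponse rate x (respondent mean - nonrespondent mean). A short sketch reproducing the table's magnitudes (the table reports absolute values; the signs below are negative because, under these assumptions, the respondent mean understates each characteristic):

```python
# Reproducing the artificial "what if" table: with a response rate of 0.80,
# bias of the respondent proportion = (1 - RR) * (respondent % - assumed
# nonrespondent %).
rr = 0.80
for name, p_resp in [("Below poverty", 25.0), ("Hispanic", 13.0)]:
    for mult in (1.5, 2.5):
        p_nonresp = mult * p_resp                # assumed nonrespondent percent
        bias = (1 - rr) * (p_resp - p_nonresp)   # e.g., 0.2 * (25.0 - 37.5) = -2.5
        print(f"{name}, {mult}x: bias = {bias:+.1f} percentage points")
```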
4.2 Using Alternative Weighting Schemes
• Weighting can reduce nonresponse bias if the weights are correlated with the estimate. Auxiliary data used in the weighting that are good predictors of the characteristic may give alternative weights that have less bias.
• If the estimates using the alternative weights do not differ from the original estimates, then either the nonresponse is not producing bias or the auxiliary data do not reduce the bias.
• If the estimates vary by weighting scheme, then the weighting approaches should be carefully examined and the one most likely to have lower nonresponse bias should be used.
193

4.2 How to Conduct a Nonresponse Bias Analysis Using Alternative Weights
1. Conduct analysis to identify any auxiliary data (often from an external source) that are good predictors of the key survey estimates and/or response propensity.
2. Using a weighting method such as weighting class adjustments, response propensity adjustment, or calibration estimation with these variables, produce alternative weights (see the sketch after this section).
3. Compute the difference between the estimates using the alternative weights and the estimates using the regular weights as a measure of nonresponse bias for the estimate.
194

4.2 How to Conduct a Nonresponse Bias Analysis Using Alternative Weights
Note that methods of analysis to identify auxiliary variables that are good predictors of response propensity include:
• CHAID analysis (classification trees)
• Logistic regression modeling
Variables used in a separate weighting class or response propensity-based adjustment must be available for all respondents and nonrespondents.
195

4.2 Pros and Cons of Comparing Estimates Under Alternative Weights
• Pros
– If good predictors are available, then their use in the weighting is likely to reduce the bias in the statistics being evaluated
– If the differences in the estimates are small, it is evidence that the nonresponse bias may not be large
• Cons
– Recomputing weights may be expensive
– If good correlates are not available, then a lack of differences may indicate poor relationships rather than the absence of bias
– The approach is limited to statistics that have a high correlation with the auxiliary data
196
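A minimal Python sketch of step 2 using a weighting-class adjustment on simulated data; the auxiliary variable, class definition, and response model are illustrative assumptions.

```python
# Minimal sketch of comparing estimates under alternative weights:
# a base-weighted respondent mean versus a weighting-class nonresponse
# adjustment built from an auxiliary variable known for all cases.
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=5)
n = 4000
df = pd.DataFrame({
    "base_w": np.full(n, 100.0),        # equal base weights for simplicity
    "region": rng.integers(0, 4, n),    # auxiliary variable on the frame
})
df["resp"] = rng.random(n) < (0.5 + 0.1 * df["region"])  # response varies by region
df.loc[df["resp"], "y"] = (20 + 3 * df.loc[df["resp"], "region"]
                           + rng.normal(0, 5, df["resp"].sum()))

# Weighting-class adjustment: divide respondent base weights by the
# weighted response rate within each class (here, region).
rr_class = (df[df["resp"]].groupby("region")["base_w"].sum()
            / df.groupby("region")["base_w"].sum())
df["adj_w"] = df["base_w"] / df["region"].map(rr_class)

r = df[df["resp"]]
base_est = np.average(r["y"], weights=r["base_w"])
adj_est = np.average(r["y"], weights=r["adj_w"])
print(f"base {base_est:.2f}  adjusted {adj_est:.2f}  shift {adj_est - base_est:+.2f}")
```

The size of the shift between the two columns of output is the method's rough measure of nonresponse bias for this estimate.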
Kreuter, F., Olson, K., Wagner, J., Yan, T., Ezzati-Rice, T.M., et al. (2010). Using Proxy Measures and Other Correlates of Survey Outcomes to Adjust for Nonresponse: Examples from Multiple Surveys. Journal of the Royal Statistical Society, Series A (Statistics in Society), 173(2), 389-407.
• Purpose: Five surveys: UM Transportation Research Institute; MEPS; ESS; ANES; NSFG
• Target population: Varies
• Sample design: Varies; generally area probability
• Mode of data collection: Face-to-face and telephone (varies)
• Response rate: Varies
• Target estimate: Multiple estimates across the five surveys
• Nonresponse bias measure: Use an alternative weighting scheme (adding auxiliary variables to the standard adjustment scheme) to reduce bias
197

[Figure: ESS; + = litter; o = multiunit housing]
198

Conclusions
• Correlations between the auxiliary variables and the survey variables were very weak across all five surveys
• Weighted estimates show the largest change when the correlation between the new auxiliary variables and the survey variables is largest
199

Tivesten, E., Jonsson, S., Jakobsson, L., and Norin, H. (2012). Nonresponse analysis and adjustment in a mail survey on car accidents. Accident Analysis & Prevention, 48, 401-415.
• Purpose: Evaluate causes of automobile accidents
• Target population: Owners of Volvo cars that had vehicle repair costs above 45,000 SEK following an accident and were insured by the Volvia insurance company
• Sample design: All eligible persons in the target population
• Mode of data collection: Mail
• Response rate: 35.5%
• Target estimate: Non-driving-related distractions
• Nonresponse bias measure: Compare weighted (response propensity model including information on accident type, town size, car owner, driver age, and gender) and unweighted estimates
200

[Figure: Adjusting weights using a response propensity model]
201

Conclusions
• Driver's age, sex, accident type, and town size were effective nonresponse adjustment variables
• Weighting increased the estimated number of low-vigilance and non-driving-related distracted drivers
• Limitation:
– Only looked at new-car (Volvo) owners
202

Ekholm, A., and Laaksonen, S. (1991). Weighting via response modeling in the Finnish household budget survey. Journal of Official Statistics, 7(3), 325-337.
• Purpose: The Finnish Household Budget Survey estimates and monitors consumption in Finland
• Target population: Households in Finland
• Sample design: Sample persons from a register and collect data from their households (1985)
• Mode of data collection: Two face-to-face interviews, two weeks apart
• Response rate: 70%
• Target estimate: Number of households
• Nonresponse error measure: Use register data to estimate a logistic regression for the probability of responding (factors include urbanicity, region, property income, and household structure)
203

Estimated Number of Households (in 1,000s) Using Two Adjustment Schemes:

Domain | 128-cell poststrata | 35-region propensity model | Benchmark survey
All Finland | 2,010 | 1,960 | 2,042
Uusimaa | na | 504 | 540
North Carelia | na | 70 | 74
204

Conclusions
• Evidence suggested that the revised scheme reduced bias
• The scheme was considered highly practical, since it could be used with a variety of statistics
• Limitation:
– Evaluation against external estimates was limited to a few variables
205

4.3 Calculate the Fraction of Missing Information (FMI)
• A measure from the multiple imputation literature of how uncertain we are about the imputed values; smaller values indicate more certainty about the missing values
– Bounded above by the nonresponse rate
• Auxiliary variables more strongly correlated with the survey variables will reduce the FMI
• Provides insight into the strength of the adjustment variables; can also be used during data collection
206

4.3 How to Conduct a Fraction of Missing Information Analysis
• Obtain as much information as possible on both respondents and nonrespondents
• Obtain the important survey variables from the respondent pool
• Use multiple imputation methods (e.g., SAS PROC MI, Stata ICE, IVEware) to impute multiple datasets
• Use standard multiple imputation analysis methods to estimate the FMI for a variety of survey estimates (Rubin's combining rules are sketched after this section)
207

4.3 Pros and Cons of Conducting a Fraction of Missing Information Analysis
• Pros
– Results in a multiply imputed dataset, which may be needed anyway
– Provides information about the usefulness of auxiliary variables for adjustment
– The multiply imputed datasets can be used for an estimate of nonresponse bias
• Cons
– Multiple imputation is labor intensive
– Requires good predictors of a wide range of survey variables
208
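A minimal sketch of the FMI computation via Rubin's combining rules, assuming the multiply imputed datasets already exist; the "imputation" loop below simply fabricates completed datasets for illustration, and the FMI formula is the common large-sample approximation.

```python
# Minimal sketch of estimating the fraction of missing information (FMI)
# with Rubin's combining rules over M imputed datasets. In practice the
# completed datasets come from a real MI routine (e.g., IVEware, PROC MI).
import numpy as np

rng = np.random.default_rng(seed=6)
M = 20
q_m, u_m = [], []   # per-imputation point estimates and their variances

for m in range(M):
    # Stand-in for analyzing the m-th completed (imputed) dataset:
    y_completed = rng.normal(10, 3, size=500)
    q_m.append(y_completed.mean())
    u_m.append(y_completed.var(ddof=1) / y_completed.size)

q_m, u_m = np.array(q_m), np.array(u_m)
u_bar = u_m.mean()                   # within-imputation variance
b = q_m.var(ddof=1)                  # between-imputation variance
t = u_bar + (1 + 1 / M) * b          # total variance
fmi = (1 + 1 / M) * b / t            # large-sample FMI approximation
print(f"combined estimate {q_m.mean():.3f}, FMI = {fmi:.3f}")
```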
Wagner, J. (2010). The Fraction of Missing Information as a Tool for Monitoring the Quality of Survey Data. Public Opinion Quarterly, 74(2), 223-243.
• Purpose: National Survey of Family Growth and Survey of Consumer Attitudes
• Target population: NSFG: U.S. noninstitutionalized population age 15-44; SCA: all adults
• Sample design: NSFG: area probability design; SCA: RDD
• Mode of data collection: NSFG: face-to-face; SCA: telephone
• Response rate: Changes over the course of data collection
• Target estimate: Proportion never married (NSFG); consumer confidence indices (SCA)
• Nonresponse error measure: Fraction of missing information calculation relative to the nonresponse rate
209

[Figures: Fraction of missing information and nonresponse rate for the NSFG by day and quarter (proportion never married), and for the SCA by call number (ICS, ICC, and ICE). James Wagner, Public Opin Q 2010;74:223-243]
210

Conclusion
• Frame data and paradata are good predictors of the survey data in the NSFG, but not in the SCA
• The information is useful for guiding data collection decisions
• Limitations
– Some modes may yield less informative auxiliary information
– Required 100 imputations conducted every day
211

4.4 Adjust Weights Using a Heckman Model of Selection Bias
• Similar in philosophy to response propensity modeling
• Use auxiliary data to predict the probability of responding to the survey, and then use a second-stage regression to estimate the characteristic, including the selection variable
• If the estimates vary, then nonresponse bias may be evident, and the Heckman model may give lower nonresponse bias for the statistic
212

4.4 How to Conduct a Nonresponse Bias Analysis Using a Heckman Model
1. Conduct analysis to identify any auxiliary data available that are good predictors of units responding (self-selecting)
2. Use regression-based estimation to model the self-selection bias (a two-step sketch follows this section)
3. Include the estimated selection bias terms in the regression for estimating the characteristic
4. Compute the difference between the estimates using the Heckman model and the estimates using the regular weights as a measure of nonresponse bias
213

4.4 Pros and Cons of Comparing Alternative Estimates Based on a Heckman Model
• Pros
– If good predictors are available, then it is likely that their use will reduce the bias in the statistics being evaluated
• Cons
– Recomputing weights may be expensive
– The method has assumptions that often do not hold
– Separate regressions are needed for each statistic
214
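A minimal sketch of the classic two-step version of this procedure on simulated data. The selection equation loosely mirrors the form reported in the Messonier et al. example below, but every number, name, and coefficient here is invented for illustration.

```python
# Minimal sketch of a Heckman two-step selection correction, written out
# manually: probit selection model, inverse Mills ratio, corrected OLS.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(seed=7)
n = 3000
educ = rng.normal(12, 3, n)
hhsize = rng.integers(1, 7, n).astype(float)

# Selection and outcome errors are correlated: the source of the bias.
u = rng.normal(0, 1, n)
resp = (0.09 * educ + 0.03 * hhsize - 1.0 + u) > 0       # who responds
y = 40 + 2.0 * educ + 5.0 * u + rng.normal(0, 4, n)      # observed if resp

# Step 1: probit of response on the selection covariates; Mills ratio.
Z = sm.add_constant(pd.DataFrame({"educ": educ, "hhsize": hhsize}))
probit = sm.Probit(resp.astype(int), Z).fit(disp=0)
xb = Z @ probit.params                   # linear predictor z'gamma
imr = norm.pdf(xb) / norm.cdf(xb)        # inverse Mills ratio

# Step 2: outcome regression on respondents, with the Mills ratio added.
X = sm.add_constant(pd.DataFrame({"educ": educ}))
naive = sm.OLS(y[resp], X[resp]).fit()
corrected = sm.OLS(y[resp], X.assign(imr=imr)[resp]).fit()
print(f"educ coef: naive {naive.params['educ']:.2f}, "
      f"Heckman-corrected {corrected.params['educ']:.2f} (true value 2.0)")
```

The gap between the naive and corrected coefficients plays the role of the nonresponse bias measure described in step 4 above.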
Messonier, M., Bergstrom, J., Cornwell, C., Teasley, R., and Cordell, H. (2000). Survey response-related biases in contingent valuation: concepts, remedies, and empirical application to valuing aquatic plant management. American Journal of Agricultural Economics, 83, 438-450.
• Purpose: Estimate willingness to pay for aquatic plant management in Lake Guntersville, AL
• Target population: Recreational users of the lake
• Sample design: Stratified random sample of visitors during the 9 AM-5 PM period, plus a sample of lakeside residents
• Mode of data collection: Mail questionnaire among those interviewed as visitors and residents
• Response rate: 50%
• Target estimate: Mean amount willing to pay for plant management
• Nonresponse bias estimate: Heckman two-stage correction for unit nonresponse, using variables measured during the initial interview
215

Two-Stage "Heckman" Adjustment
• Probability of being a respondent to the mail questionnaire (nonfishers):
.09(education of sample person) + .03(number of persons in household)
• This equation is used as a "selection bias" equation to adjust the estimates of another regression model that estimates willingness to pay
216

[Figure: Estimated mean dollar willingness to pay, without the selection bias adjustment versus with the unit nonresponse adjustment]
217

Conclusions
• Without the selection bias adjustment, the amount nonfishers are willing to pay for aquatic plant management appears to be underestimated
• Limitation:
– No assessment of the assumptions underlying the use of the selection bias equation
218

Five Things You Should Remember from the Short Course
1. The three principal types of nonresponse bias studies are:
– comparing surveys to external data,
– studying internal variation within the data collection, and
– contrasting alternative postsurvey adjusted estimates
2. All three have strengths and weaknesses; using multiple approaches simultaneously provides greater understanding
3. Nonresponse bias is specific to a statistic, so separate assessments may be needed for different estimates
4. Auxiliary variables correlated with both the likelihood of responding and the key survey variables are important for evaluation
5. Thinking about nonresponse before the survey is important, because different modes, frames, and survey designs permit different types of studies
219