Two-phase sampling approach for propensity score estimation in voluntary sampling

Jae-Kwang Kim (joint work with Sixia Chen)
Iowa State University
November 23, 2012

1 Introduction
2 Motivating Example
3 Proposed method
4 Nonnested two-phase extension
5 Simulation study
6 Application to 2012 Iowa Caucus Survey
7 Conclusion

Kim (ISU) Propensity score estimation in voluntary sampling November 23, 2012

Introduction: Voluntary samples

- Voluntary samples:
  - Self-selected
  - Sample inclusion probability unknown
  - Selection bias
- Weighting for voluntary samples:
  - Propensity score weighting using calibration on demographic variables (e.g., Valliant and Dever, 2011; Lee and Valliant, 2009; Lee, 2006).
  - Assumes an ignorable sampling mechanism.

Introduction: Voluntary samples (cont'd)

- Leverage-saliency theory (Groves et al., 2000): What makes people participate in a survey?
  - Incentive
  - Interest (in the subject matter of the study)
  - Other demographic factors (denoted by "x")
- Thus, the classical approach of calibration weighting based on demographic variables only may lead to biased estimation.
- "Interest" in the survey is not directly observable, but some "y" variable can provide useful information about the interest.
  - Ex: Hours of watching TV news in a week may provide information about one's interest in "politics".
- The "y" observation is available only for the survey participants.

Introduction: Voluntary samples (cont'd)

- Propensity score weighting for a nonignorable missing mechanism: use "y" (as well as x) when modeling the propensity scores.
  - Greenlees, Reece, and Zieschang (1982): Use joint model assumptions.
  - Chang and Kott (2008), Kott and Chang (2010), Wang, Shao, Kim (2012): Use instrumental variable calibration.
- Challenges
  - Very difficult to estimate the (propensity) model parameters
  - May not have a good instrumental variable
  - Use strong model assumptions
  - No tool for model diagnostics

Introduction: Voluntary samples (cont'd)

- Question: How to improve the modeling and parameter estimation for nonignorable nonresponse?
- Follow-up experiment on the nonrespondents
  - Can build a propensity model
  - Parameter estimation is easy
- Follow-up may not be practically feasible for voluntary sampling.
  - No sampling frame
  - Cost increases
- Proposed approach: Consider a follow-up experiment on the participants.

Motivating Example: Motivation (2012 Iowa Caucus Survey)

- Telephone survey for the 2012 Iowa Caucus.
- First attempt (Nov. 2011): 15% response rate.
- Second attempt (Dec. 2011): 75% response rate.

Motivating Example: Motivation (2012 Iowa Caucus Survey) (cont'd)

- Three-phase sampling structure (U → A → R1 → R2)
  1. Phase One: Probability sampling from a list frame
  2. Phase Two: Voluntary sampling with about 15% participation rate
  3. Phase Three: Voluntary sampling with about 75% participation rate
- Similar questions were asked in the two surveys, so we may assume that the two voluntary sampling mechanisms are essentially the same (except for the overall response rate).
- In this case, we can treat R1 as a finite population and fit a response model for selecting R2 from R1.
- Use the estimated response probability to compute the propensity weights for R1.
Motivating Example: Motivation (2012 Iowa Caucus Survey) (cont'd)

- $\delta_1 = 1$ if participated in the first survey
- $\delta_2 = 1$ if participated in the second survey
- Propensity score models: $\pi_1 = \Pr(\delta_1 = 1 \mid X, Y)$ and $\pi_2 = \Pr(\delta_2 = 1 \mid X, Y, \delta_1 = 1)$
- The propensity scores may depend on $Y$ ($Y$: candidate preference).
- May assume that

$$
\frac{\pi_1/(1-\pi_1)}{\pi_2/(1-\pi_2)}
$$

  does not depend on $(x, y)$.

Proposed method: Basic Setup

- $U$: target finite population
- $A_1$: a volunteer sample obtained from an unknown sampling design
- $A_2$: a second volunteer sample obtained from an unknown sampling design
- $x$: covariates observed in $A_1$ and $A_2$
- $y_1$: study variable observed in $A_1$
- $y_2$: study variable observed in $A_2$
- $\delta_{1i}, \delta_{2i}$: sampling indicators for the first and second sampling designs
- May have extra information about $x$, e.g. $\bar{X}_N$ is known
- Parameters of interest: $\theta_1 = E(Y_1)$ and $\theta_2 = E(Y_2)$.

Proposed method: Two-phase approach

- Nested structure: $A_2 \subset A_1 \subset U$ (i.e. $y_{1i}$ is also observed in $A_2$)
- The first-phase sampling mechanism is assumed to be

$$
\pi_{1i}(\phi) = \Pr(\delta_{1i} = 1 \mid x_i, y_{1i}) = \frac{\exp(\phi_0 + \phi_1 x_i + \phi_2 y_{1i})}{1 + \exp(\phi_0 + \phi_1 x_i + \phi_2 y_{1i})}
$$

- The second-phase sampling mechanism is assumed to be

$$
\pi_{2i}(\phi^*) = \Pr(\delta_{2i} = 1 \mid x_i, y_{2i}, \delta_{1i} = 1) = \frac{\exp(\phi_0^* + \phi_1 x_i + \phi_2 y_{2i})}{1 + \exp(\phi_0^* + \phi_1 x_i + \phi_2 y_{2i})}
$$

- Note that we assume $(\phi_1, \phi_2)$ to be the same in the two propensity models.
- $N$ is assumed to be known.
- This is the setup for the 2012 Iowa Caucus example.
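To make the shared-slope assumption concrete, the following minimal Python sketch checks that two logistic propensities with common $(\phi_1, \phi_2)$ but different intercepts have an odds ratio equal to $\exp(\phi_0 - \phi_0^*)$ when both are evaluated at the same $(x, y)$. The function names and the numeric parameter values are illustrative assumptions, not taken from the survey.

```python
import math

def logistic_propensity(intercept, phi1, phi2, x, y):
    """Logistic propensity with linear predictor intercept + phi1*x + phi2*y."""
    eta = intercept + phi1 * x + phi2 * y
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Odds p / (1 - p) of a probability p."""
    return p / (1.0 - p)

# Hypothetical parameter values, chosen only for the demonstration.
phi0, phi0_star, phi1, phi2 = -4.0, -0.2, 0.1, 0.3

def odds_ratio(x, y):
    """Odds of pi1 divided by odds of pi2, both evaluated at the same (x, y)."""
    pi1 = logistic_propensity(phi0, phi1, phi2, x, y)
    pi2 = logistic_propensity(phi0_star, phi1, phi2, x, y)
    return odds(pi1) / odds(pi2)

# The ratio is the same at every (x, y) because the slopes cancel.
ratios = [odds_ratio(x, y) for x, y in [(0.0, 0.0), (2.0, 1.5), (-1.0, 4.0)]]
expected = math.exp(phi0 - phi0_star)
```

Only the intercepts differ between the two phases in this sketch, which is what allows the slopes to be learned from the second phase and reused in the first.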
Proposed method: Proposed estimator

- Proposed estimator for $\theta_1$:

$$
\hat\theta_1 = \frac{1}{N} \sum_{i \in A_1} \frac{y_{1i}}{\pi_{1i}(\hat\Phi)}, \qquad (1)
$$

  where $\hat\Phi$ is a consistent estimator of the true parameter $\Phi = (\phi_0^*, \phi_0, \phi_1, \phi_2)$.
- Proposed estimator for $\theta_2$:

$$
\hat\theta_2 = \frac{1}{N} \sum_{i \in A_2} \frac{y_{2i}}{\pi_{1i}(\hat\Phi)\,\pi_{2i}(\hat\Phi)}. \qquad (2)
$$

- How to estimate $\Phi$?

Proposed method: Estimating $\Phi$

- Idea: Calibration using $(1, x_i, y_{1i})$ (instrumental variables).

$$
U_1(\Phi) := \sum_{i \in A_1} \left\{ \frac{\delta_{2i}}{\pi_{2i}(\phi_0^*, \phi_1, \phi_2)} - 1 \right\} (1, x_i, y_{1i})' = (0, 0, 0)'
$$

  and

$$
U_2(\Phi) := \sum_{i \in A_1} \frac{1}{\pi_{1i}(\phi_0, \phi_1, \phi_2)} - N = 0,
$$

  where $\Phi = (\phi_0^*, \phi_1, \phi_2, \phi_0)$.
- Note: $(x_i, y_{1i})$ is observed in both $A_1$ and $A_2$ and is used in computing $\hat\phi_2$ in

$$
\pi_{2i} = \frac{\exp(\phi_0^* + \phi_1 x_i + \phi_2 y_{2i})}{1 + \exp(\phi_0^* + \phi_1 x_i + \phi_2 y_{2i})}.
$$

Proposed method: Estimating $(\theta_1, \theta_2, \Phi)$

- Asymptotic properties:
  - Consistency (of $\hat\Phi$ and of $\hat\theta_1$, $\hat\theta_2$)
  - Asymptotic variance
  - Variance estimation
  - Asymptotic normality (skipped)

Proposed method: Estimating $(\theta_1, \Phi)$

- $(\hat\theta_1, \hat\Phi)$ can be obtained by solving

$$
U_p(\theta_1, \Phi) = 0, \qquad U_c(\Phi) = 0,
$$

  where

$$
U_p(\theta_1, \Phi) = N^{-1} \sum_{i \in A_1} \{\pi_{1i}(\phi_0, \phi_1, \phi_2)\}^{-1} y_{1i} - \theta_1, \qquad U_c(\Phi)' = [U_1(\Phi)', U_2(\Phi)'],
$$

  and $U_1(\Phi)$ and $U_2(\Phi)$ are the calibration equations for computing $\Phi$.
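One consequence of the logistic form is that, once $(\phi_1, \phi_2)$ are available from $U_1$, the equation $U_2(\Phi) = 0$ can be solved for $\phi_0$ in closed form, since $1/\pi_{1i} = 1 + \exp\{-(\phi_0 + \phi_1 x_i + \phi_2 y_{1i})\}$. The Python sketch below illustrates this step and the resulting $\hat\theta_1$ from (1). It assumes the slopes have already been estimated; the toy data, the slope values, and the helper names are hypothetical.

```python
import math

def solve_phi0(N, x, y1, phi1, phi2):
    """Solve sum_{i in A1} 1/pi_1i - N = 0 for phi0, given (phi1, phi2).

    With 1/pi_1i = 1 + exp(-phi0) * exp(-(phi1*x_i + phi2*y_1i)), the
    equation reads n1 + exp(-phi0) * S = N, so phi0 = log(S / (N - n1)).
    """
    n1 = len(x)
    s = sum(math.exp(-(phi1 * xi + phi2 * yi)) for xi, yi in zip(x, y1))
    return math.log(s / (N - n1))

def inverse_pi1(phi0, phi1, phi2, xi, yi):
    """Propensity weight 1/pi_1i under the logistic first-phase model."""
    return 1.0 + math.exp(-(phi0 + phi1 * xi + phi2 * yi))

def theta1_hat(N, x, y1, phi0, phi1, phi2):
    """Propensity-weighted estimator (1): N^{-1} * sum over A1 of y_1i / pi_1i."""
    total = sum(yi * inverse_pi1(phi0, phi1, phi2, xi, yi)
                for xi, yi in zip(x, y1))
    return total / N

# Hypothetical toy sample A1 from a population of known size N.
N = 100
x = [1.0, 2.0, 3.0, 2.5]
y1 = [0.5, 1.5, 2.0, 1.0]
phi1, phi2 = 0.1, 0.3  # assumed to come from solving U1 beforehand

phi0 = solve_phi0(N, x, y1, phi1, phi2)
weights_total = sum(inverse_pi1(phi0, phi1, phi2, xi, yi)
                    for xi, yi in zip(x, y1))
theta1 = theta1_hat(N, x, y1, phi0, phi1, phi2)
```

Because the weights sum to $N$ by construction, $\hat\theta_1$ in this sketch is a weighted average of the observed $y_{1i}$.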
Proposed method: Asymptotic variance for $(\hat\theta_1, \hat\Phi)$

Because $E\{U_p(\theta_1^*, \Phi^*)\} = 0$ and $E\{U_c(\Phi^*)\} = 0$, where $(\theta_1^*, \Phi^*)$ are the true parameter values, the solution $(\hat\theta_1, \hat\Phi)$ is consistent and has asymptotic variance

$$
V\begin{pmatrix} \hat\theta_1 \\ \hat\Phi \end{pmatrix} \cong
\begin{pmatrix} -1 & E(\partial U_p/\partial \Phi) \\ 0 & E(\partial U_c/\partial \Phi) \end{pmatrix}^{-1}
\begin{pmatrix} V(U_p) & C(U_p, U_c) \\ C(U_c, U_p) & V(U_c) \end{pmatrix}
\left\{ \begin{pmatrix} -1 & E(\partial U_p/\partial \Phi) \\ 0 & E(\partial U_c/\partial \Phi) \end{pmatrix}' \right\}^{-1}.
$$

Proposed method: Asymptotic variance for $\hat\theta_1$ (cont'd)

Thus, the asymptotic variance can be written as

$$
V(\hat\theta_1) \cong V\left\{ \frac{1}{N} \sum_{i=1}^{N} \frac{\delta_{1i}}{\pi_{1i}} (y_{1i} - B_{2,y}) \right\}
+ V\left\{ \frac{1}{N} \sum_{i=1}^{N} B_{1,y}\, \delta_{1i} \left( \frac{\delta_{2i}}{\pi_{2i}} - 1 \right) h_{1i} \right\},
$$

where

$$
(B_{1,y}, B_{2,y}) = \sum_{i=1}^{N} (1 - \pi_{1i})\, y_{1i}\, (0, x_i, y_{1i}, 1)
\begin{bmatrix} \left( \sum_{i=1}^{N} \pi_{1i}(1 - \pi_{2i})\, h_{1i} h_{2i}',\; 0 \right) \\ \sum_{i=1}^{N} (1 - \pi_{1i})\, (0, x_i, y_{1i}, 1) \end{bmatrix}^{-1},
$$

with $h_{1i} = (1, x_i, y_{1i})'$ and $h_{2i} = (1, x_i, y_{2i})'$.

Proposed method: Asymptotic variance estimator for $\hat\theta_1$

For variance estimation, we can apply the plug-in method to the linearized variance to obtain

$$
\hat V(\hat\theta_1) = \frac{1}{N^2} \sum_{i \in A_1} \frac{1 - \hat\pi_{1i}}{\hat\pi_{1i}^2} (y_{1i} - \hat B_{2,y})^2
+ \frac{1}{N^2} \sum_{i \in A_2} \frac{1 - \hat\pi_{2i}}{\hat\pi_{2i}^2} (\hat B_{1,y} h_{1i})^2,
$$

where

$$
(\hat B_{1,y}, \hat B_{2,y}) = \sum_{i \in A_1} \frac{1 - \hat\pi_{1i}}{\hat\pi_{1i}}\, y_{1i}\, (0, x_i, y_{1i}, 1)
\begin{bmatrix} \left( \sum_{i \in A_2} \hat\pi_{2i}^{-1}(1 - \hat\pi_{2i})\, h_{1i} h_{2i}',\; 0 \right) \\ \sum_{i \in A_1} \hat\pi_{1i}^{-1}(1 - \hat\pi_{1i})\, (0, x_i, y_{1i}, 1) \end{bmatrix}^{-1}.
$$

Proposed method: Asymptotic variance for $\hat\theta_2$

Recall

$$
\hat\theta_2 = \frac{1}{N} \sum_{i \in A_2} \frac{1}{\hat\pi_{1i} \hat\pi_{2i}}\, y_{2i}.
$$

Similarly,

$$
V(\hat\theta_2) \cong \frac{1}{N^2} V\left\{ \sum_{i=1}^{N} \frac{\delta_{1i}\delta_{2i}}{\pi_{1i}\pi_{2i}}\, y_{2i}
- D_{1,y} \sum_{i=1}^{N} \delta_{1i} \left( \frac{\delta_{2i}}{\pi_{2i}} - 1 \right) h_{1i}
- D_{2,y} \sum_{i=1}^{N} \left( \frac{\delta_{1i}}{\pi_{1i}} - 1 \right) \right\},
$$

where

$$
(D_{1,y}, D_{2,y}) = \sum_{i=1}^{N} y_{2i} \left\{ (1 - \pi_{1i})(0, x_i, y_{1i}, 1) + (1 - \pi_{2i})(1, x_i, y_{2i}, 0) \right\}
\times \begin{bmatrix} \left( \sum_{i=1}^{N} \pi_{1i}(1 - \pi_{2i})\, h_{1i} h_{2i}',\; 0 \right) \\ \sum_{i=1}^{N} (1 - \pi_{1i})\, (0, x_i, y_{1i}, 1) \end{bmatrix}^{-1}.
$$

Proposed method: Asymptotic variance estimator for $\hat\theta_2$

Thus, a consistent estimator for the variance of $\hat\theta_2$ is given by

$$
\hat V(\hat\theta_2) = \frac{1}{N^2} \sum_{i \in A_2} \frac{1 - \hat\pi_{1i}}{\hat\pi_{1i}^2 \hat\pi_{2i}} (y_{2i} - \hat D_{2,y})^2
+ \frac{1}{N^2} \sum_{i \in A_2} \frac{1 - \hat\pi_{2i}}{\hat\pi_{1i}^2 \hat\pi_{2i}^2} (y_{2i} - \hat D_{1,y} \hat\pi_{1i} h_{1i})^2,
$$

where

$$
(\hat D_{1,y}, \hat D_{2,y}) = \sum_{i \in A_2} \frac{y_{2i}}{\hat\pi_{1i} \hat\pi_{2i}} \left\{ (1 - \hat\pi_{1i})(0, x_i, y_{1i}, 1) + (1 - \hat\pi_{2i})(1, x_i, y_{2i}, 0) \right\}
\times \begin{bmatrix} \left( \sum_{i \in A_2} \hat\pi_{2i}^{-1}(1 - \hat\pi_{2i})\, h_{1i} h_{2i}',\; 0 \right) \\ \sum_{i \in A_1} \hat\pi_{1i}^{-1}(1 - \hat\pi_{1i})\, (0, x_i, y_{1i}, 1) \end{bmatrix}^{-1}.
$$

Proposed method: Regression estimation

- Instead of using the direct estimator $\hat\theta_2$ in (2), we can use a two-phase regression estimator to improve the efficiency.
- The proposed regression estimator is

$$
\hat\theta_{2,Reg} = \hat\theta_2 - \hat B_{h1} (\hat h_{2,1} - \hat h_{1,1}), \qquad (3)
$$

  where $\hat h_{1,1} = N^{-1} \sum_{i \in A_1} \hat\pi_{1i}^{-1} h_{1i}$, $\hat h_{2,1} = N^{-1} \sum_{i \in A_2} \hat\pi_{1i}^{-1} \hat\pi_{2i}^{-1} h_{1i}$, and $\hat B_{h1}$ is chosen to minimize the variance.
- Note that $A_2 \subset A_1$ implies that $h_{1i}$ is observed in $A_2$.

Proposed method: Regression estimation (cont'd)

- If the population information $\bar X_N$ is available, we can construct the following regression estimators:

$$
\hat\theta_{1,Reg} = \hat\theta_1 - \hat B_{Reg} (\hat\theta_{x,1} - \bar X_N) \qquad (4)
$$

  and

$$
\hat\theta_{2,Reg} = \hat\theta_2 - \hat B^*_{1,Reg} (\hat h_{2,1} - \hat h_{1,1}) - \hat B^*_{2,Reg} (\hat\theta_{x,2} - \bar X_N), \qquad (5)
$$

  where $\hat\theta_{x,1} = N^{-1} \sum_{i \in A_1} \hat\pi_{1i}^{-1} x_i$ and $\hat\theta_{x,2} = N^{-1} \sum_{i \in A_2} \hat\pi_{1i}^{-1} \hat\pi_{2i}^{-1} x_i$.
- Asymptotic properties can be derived similarly (skipped).
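As a rough illustration of the regression estimator in (4), the Python sketch below uses a scalar covariate and takes the regression coefficient to be the propensity-weighted least-squares slope without intercept, which is the form given for the REG estimator in the simulation section. The function names, the estimated propensities, and the population mean are hypothetical inputs chosen for the demonstration.

```python
def ps_mean(values, pi1_hat, N):
    """Propensity-weighted mean: N^{-1} * sum over A1 of value_i / pi_1i."""
    return sum(v / p for v, p in zip(values, pi1_hat)) / N

def regression_estimator(y1, x, pi1_hat, N, x_bar_N):
    """theta1_hat - B_reg * (theta_x1_hat - x_bar_N) for a scalar covariate x."""
    theta1 = ps_mean(y1, pi1_hat, N)
    theta_x1 = ps_mean(x, pi1_hat, N)
    # Weighted least-squares slope through the origin (assumed form of B_reg).
    b_reg = (sum(xi * yi / p for xi, yi, p in zip(x, y1, pi1_hat))
             / sum(xi * xi / p for xi, p in zip(x, pi1_hat)))
    return theta1 - b_reg * (theta_x1 - x_bar_N)

# Hypothetical sample in which y is exactly proportional to x (y = 2x).
x = [1.0, 2.0, 3.0, 4.0]
y1 = [2.0 * xi for xi in x]
pi1_hat = [0.05, 0.04, 0.08, 0.10]  # assumed estimated propensities
theta1_reg = regression_estimator(y1, x, pi1_hat, N=60, x_bar_N=2.4)
```

When $y$ is exactly proportional to $x$, as in this toy sample, the adjustment removes all the weighting error and the estimator returns twice the known population mean of $x$, whatever the estimated propensities are.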
Nonnested two-phase extension: Nonnested Two-Phase approach

Basic setup

- What if a follow-up is not used?
- Instead of obtaining a follow-up sample $A_2$ from $A_1$, suppose that there is another sample $A_2$, selected independently of $A_1$, from the same population.
- Assume that $(x_i, y_i)$ are observed in $A_1$ and $A_2$ (i.e. the two surveys have common survey items).
- Nonnested structure: $A_1 \subset U$ and $A_2 \subset U$.
- Assume independence of the two sample selections.
- The sampling mechanism depends on $(x, y)$.

Nonnested two-phase extension: Nonnested Two-Phase approach (cont'd)

Approach

- We may treat the sampling for $A_1$ and $A_2$ as capture-recapture sampling.
- Propensity models:

$$
\pi_{1i}(\phi) = \Pr(\delta_{1i} = 1 \mid x_i, y_i) = \frac{\exp(\phi_0 + \phi_1 x_i + \phi_2 y_i)}{1 + \exp(\phi_0 + \phi_1 x_i + \phi_2 y_i)}
$$

  and

$$
\pi_{2i}(\phi^*) = \Pr(\delta_{2i} = 1 \mid x_i, y_i) = \frac{\exp(\phi_0^* + \phi_1^* x_i + \phi_2^* y_i)}{1 + \exp(\phi_0^* + \phi_1^* x_i + \phi_2^* y_i)}.
$$

- Thus, we allow the two models to be different.

Nonnested two-phase extension: Nonnested Two-Phase approach (cont'd)

- Parameter estimation: maximize the conditional likelihood

$$
L_C(\Phi) = \prod_{i \in A_1 \setminus A_2} \frac{\pi_{1i}(\phi)\{1 - \pi_{2i}(\phi^*)\}}{p_i(\phi, \phi^*)}
\times \prod_{i \in A_2 \setminus A_1} \frac{\{1 - \pi_{1i}(\phi)\}\, \pi_{2i}(\phi^*)}{p_i(\phi, \phi^*)}
\times \prod_{i \in A_1 \cap A_2} \frac{\pi_{1i}(\phi)\, \pi_{2i}(\phi^*)}{p_i(\phi, \phi^*)},
$$

  where $p_i(\phi, \phi^*) = 1 - \{1 - \pi_{1i}(\phi)\}\{1 - \pi_{2i}(\phi^*)\}$ and $\Phi = (\phi, \phi^*)$.
- The conditional likelihood is obtained by considering the conditional distribution of $(\delta_{1i}, \delta_{2i})$ given that unit $i$ is selected in at least one of the two samples.
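The conditional likelihood can be evaluated unit by unit. The Python sketch below computes the three conditional capture probabilities (in $A_1$ only, in $A_2$ only, in both) and sums their logarithms over the observed units. The function names, the pattern labels, and the toy units and parameter values are hypothetical; maximizing the function over $(\phi, \phi^*)$ would require a numerical optimizer, which is not shown.

```python
import math

def expit(eta):
    """Logistic function exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def conditional_probs(pi1, pi2):
    """Probabilities of the three observable capture patterns, conditional
    on the unit being selected in at least one of the two samples."""
    p = 1.0 - (1.0 - pi1) * (1.0 - pi2)
    return {
        "A1_only": pi1 * (1.0 - pi2) / p,
        "A2_only": (1.0 - pi1) * pi2 / p,
        "both": pi1 * pi2 / p,
    }

def log_conditional_likelihood(units, phi, phi_star):
    """Log of L_C for units given as (x, y, pattern) triples.

    phi and phi_star are (intercept, slope on x, slope on y) for the
    first and second sample, respectively.
    """
    total = 0.0
    for x, y, pattern in units:
        pi1 = expit(phi[0] + phi[1] * x + phi[2] * y)
        pi2 = expit(phi_star[0] + phi_star[1] * x + phi_star[2] * y)
        total += math.log(conditional_probs(pi1, pi2)[pattern])
    return total

# Hypothetical units observed in the union of the two samples.
probs = conditional_probs(0.2, 0.5)
units = [(1.0, 0.5, "A1_only"), (2.0, 1.0, "both"), (0.5, 2.0, "A2_only")]
loglik = log_conditional_likelihood(units, (-1.0, 0.2, 0.3), (-0.5, 0.1, 0.4))
```

Units selected in neither sample never enter the computation, which is why the method needs no sampling frame for the nonparticipants.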
Nonnested two-phase extension: Nonnested Two-Phase approach (cont'd)

- Score functions derived from the conditional likelihood:

$$
S_{C1}(\Phi) := \sum_{i \in A_1} (1, x_i', y_i)' - \sum_{i \in A_1 \cup A_2} \frac{\pi_{1i}(\phi)}{p_i(\phi, \phi^*)} (1, x_i', y_i)'
$$

  and

$$
S_{C2}(\Phi) := \sum_{i \in A_2} (1, x_i', y_i)' - \sum_{i \in A_1 \cup A_2} \frac{\pi_{2i}(\phi^*)}{p_i(\phi, \phi^*)} (1, x_i', y_i)'.
$$

- The propensity score estimator of $\theta = E(Y)$ based on $A_1$:

$$
\hat\theta = \frac{\sum_{i \in A_1} \pi_{1i}^{-1}(\hat\phi)\, y_i}{\sum_{i \in A_1} \pi_{1i}^{-1}(\hat\phi)}.
$$

- Asymptotic properties of $\hat\theta$ can be obtained similarly (skipped).

Simulation study: Simulation Study

- A finite population of size $N = 10{,}000$ was generated from

$$
Y_{1i} = 1 + 0.5 (X_i - 2.5)^2 + (e_{1i} - 1)/\sqrt{2},
$$

$$
Y_{2i} = 1 + 0.5 (X_i - 2.5)^2 + (e_{2i} - 1.4)/\sqrt{2},
$$

  where $X_i \sim$ i.i.d. $N(2, 1)$, $e_{1i} \sim$ i.i.d. $\chi^2_1$, $e_{3i} \sim$ i.i.d. $\chi^2_1$, $e_{1i}$ is independent of $e_{3i}$, and $e_{2i} = 0.8 e_{1i} + 0.6 e_{3i}$.
- From the finite population, we repeatedly generated two-phase samples with approximate sample sizes $n_1 = 500$ and $n_2 = 300$ for the phase one and phase two samples, respectively.

Simulation study: Simulation Study (cont'd)

We consider the following response mechanisms for the first-phase and second-phase sampling indicators $\delta_{1i}$ and $\delta_{2i}$:

- (M1) Linear ignorable:

$$
\pi_{1i} = \frac{\exp(\phi_0 + \phi_1 X_i)}{1 + \exp(\phi_0 + \phi_1 X_i)}, \qquad
\pi_{2i} = \frac{\exp(\phi_0^* + \phi_1 X_i)}{1 + \exp(\phi_0^* + \phi_1 X_i)},
$$

  where $(\phi_0, \phi_1, \phi_0^*) = (-3.5, 0.3, 0.1)$ for model A and $(-3.5, 0.3, 0.2)$ for model B.
- (M2) Linear non-ignorable:

$$
\pi_{1i} = \frac{\exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})}{1 + \exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})}, \qquad
\pi_{2i} = \frac{\exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})}{1 + \exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})},
$$

  where $(\phi_0, \phi_1, \phi_2, \phi_0^*) = (-4, 0.1, 0.3, -0.2)$ for model A and $(-3.7, 0.1, 0.3, -0.2)$ for model B.
- (M3) Complementary log-log non-ignorable:

$$
\pi_{1i} = 1 - \exp\{-\exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})\}, \qquad
\pi_{2i} = 1 - \exp\{-\exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})\},
$$

  where $(\phi_0, \phi_1, \phi_2, \phi_0^*) = (-3.4, 0.1, 0.1, -0.3)$ for model A and $(-3.3, 0.1, 0.1, -0.1)$ for model B.
- (M4) Probit non-ignorable:

$$
\pi_{1i} = \Phi(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i}), \qquad
\pi_{2i} = \Phi(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i}),
$$

  where $\Phi(\cdot)$ here denotes the standard normal distribution function and $(\phi_0, \phi_1, \phi_2, \phi_0^*) = (-2.6, 0.2, 0.2, -0.5)$ for model A and $(-2.4, 0.2, 0.2, -0.5)$ for model B.

Simulation study: Estimators for $\theta_1$ (Nested Two-Phase)

- The "working" model for the propensity score estimation is the linear non-ignorable model (M2).
- From each sample, we computed the following four estimators for $\theta_1 = E(Y_1)$:
  1. Naive: Calibration estimator which assumes an ignorable missing mechanism;
  2. PS: Proposed propensity score estimator, computed by $N^{-1} \sum_{i \in A_1} \hat\pi_{1i}^{-1} y_{1i}$;
  3. REG: Proposed regression estimator, $\hat\theta_1 - \hat B_{reg} (\hat\theta_{1,x} - \bar X_N)$, where $\hat B_{reg} = \left\{ \sum_{i \in A_1} \hat\pi_{1i}^{-1} x_i x_i' \right\}^{-1} \sum_{i \in A_1} \hat\pi_{1i}^{-1} x_i y_{1i}$;
  4. OPT: Proposed optimal estimator, $\hat\theta_1 - \hat B_{opt} (\hat\theta_{1,x} - \bar X_N)$, where $\hat B_{opt} = \widehat{Cov}(\hat\theta_1, \hat\theta_{1,x})\, \hat V^{-1}(\hat\theta_{1,x})$.

Simulation study: Results (Nested Two-Phase)

Table: Simulation results of the point estimators and variance estimators under models (M1)-(M4).
Model  Parameter  Method   Bias      S.E.     RMSE     R.B.
(M1)   θ1         Naive     0.0075   0.0814   0.0817   N/A
(M1)   θ1         PS        0.0226   0.1783   0.1797    0.0174
(M1)   θ1         REG       0.0278   0.2177   0.2195    0.0107
(M1)   θ1         OPT      -0.0063   0.1409   0.1410    0.0589
(M1)   θ2         Naive     0.0074   0.0982   0.0985   N/A
(M1)   θ2         PS        0.0121   0.1575   0.1580    0.0453
(M1)   θ2         REG       0.0208   0.1716   0.1729   -0.1082
(M1)   θ2         OPT       0.0149   0.1667   0.1674   -0.0848
(M2)   θ1         Naive     0.5733   0.0872   0.5799   N/A
(M2)   θ1         PS        0.0011   0.1192   0.1192    0.0401
(M2)   θ1         REG       0.0015   0.1439   0.1439    0.0651
(M2)   θ1         OPT      -0.0078   0.1126   0.1129   -0.0036
(M2)   θ2         Naive     0.7129   0.1048   0.7206   N/A
(M2)   θ2         PS        0.0003   0.1156   0.1156    0.0368
(M2)   θ2         REG       0.0007   0.1160   0.1160    0.0410
(M2)   θ2         OPT       0.0015   0.1156   0.1156    0.0326
(M3)   θ1         Naive     0.1881   0.0836   0.2058   N/A
(M3)   θ1         PS       -0.0892   0.1335   0.1606   -0.0419
(M3)   θ1         REG      -0.0687   0.1658   0.1795   -0.0831
(M3)   θ1         OPT      -0.1213   0.1169   0.1685    0.0207
(M3)   θ2         Naive     0.2526   0.0987   0.2712   N/A
(M3)   θ2         PS       -0.0757   0.1297   0.1502   -0.0508
(M3)   θ2         REG      -0.0750   0.1298   0.1499   -0.0541
(M3)   θ2         OPT      -0.1077   0.1120   0.1554    0.0145
(M4)   θ1         Naive     0.8280   0.0975   0.8337   N/A
(M4)   θ1         PS        0.0920   0.1255   0.1556   -0.0086
(M4)   θ1         REG       0.0726   0.1552   0.1713   -0.0110
(M4)   θ1         OPT       0.1029   0.1129   0.1528    0.0090
(M4)   θ2         Naive     1.0536   0.1365   1.0624   N/A
(M4)   θ2         PS        0.0764   0.1230   0.1448   -0.0063
(M4)   θ2         REG       0.0774   0.1239   0.1461   -0.0171
(M4)   θ2         OPT       0.0871   0.1100   0.1403   -0.0058

Application to 2012 Iowa Caucus Survey: Propensity Model

$$
\pi_{1i}(\phi) = \frac{\exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})}{1 + \exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})}
$$

and

$$
\pi_{2i}(\phi^*) = \frac{\exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})}{1 + \exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})},
$$

where $X$ = (Party, Age) and $Y$ = "First choice" of the presidential candidate for the Republican Party.

Results for propensity model parameter estimation

Table: Point Estimation and Standard Error for Φ

Coefficient  Est     S.E.    t.value
Age          0.588   0.266   2.211
Party        0.782   0.251   3.116
Romney       0.991   0.454   2.183
Perry        0.454   0.663   0.685
Paul         0.866   0.841   1.030
Others       1.307   0.985   1.327

Application to 2012 Iowa Caucus Survey: Results for estimation of θ = proportion of first choice candidate

Table: Point Estimation (s.e.) for 2012 Iowa Caucus Survey Results

Survey  Method     Romney          Perry           Paul            Others
Nov.    Naive      0.340           0.108           0.130           0.422
Nov.    Ignorable  0.316           0.103           0.146           0.435
Nov.    Proposed   0.303 (0.062)   0.106 (0.039)   0.093 (0.067)   0.499 (0.046)
Dec.    Naive      0.281           0.140           0.131           0.448
Dec.    Ignorable  0.270           0.144           0.148           0.437
Dec.    Proposed   0.244 (0.043)   0.134 (0.026)   0.112 (0.046)   0.509 (0.036)

Actual outcome (Jan 3, 2012): θ0 = (24.5%, 10.3%, 21.4%, 43.7%)

Conclusion: Concluding Remarks

- We proposed a new approach to propensity score estimation for voluntary samples: a capture-recapture sampling approach.
- If a follow-up sample is obtained from the original participants, we can assume the same propensity model to estimate the model parameters from the follow-up sample.
- Instead of a follow-up sample, we can also use an independent sample with common survey items from the same population and apply the estimation method for capture-recapture sampling.
- Auxiliary information from the sample or population can be incorporated via regression estimation.
- Promising for web surveys.

Thank you!