Two-phase sampling approach for propensity score estimation in voluntary sampling
Jae-Kwang Kim (joint work with Sixia Chen)
Iowa State University
November 23, 2012
Outline
1. Introduction
2. Motivating Example
3. Proposed method
4. Nonnested two-phase extension
5. Simulation study
6. Application to 2012 Iowa Caucus Survey
7. Conclusion
Kim (ISU)
Propensity score estimation in voluntary sampling
November 23, 2012
2 / 45
Introduction
Voluntary samples
Voluntary samples:
  Self-selected
  Sample inclusion probability unknown
  Selection bias
Weighting for voluntary samples:
  Propensity score weighting using calibration on demographic variables (e.g., Valliant and Dever, 2011; Lee and Valliant, 2009; Lee, 2006).
  This assumes an ignorable sampling mechanism.
Voluntary samples
Leverage-saliency theory (Groves et al., 2000): What makes people participate in a survey?
  Incentive
  Interest (in the subject matter of the study)
  Other demographic factors (denoted by “x”)
Thus, the classical approach of calibration weighting based on demographic variables only may lead to biased estimation.
“Interest” in the survey is not directly observable, but some “y” variable can provide useful information about it.
Example: Hours spent watching TV news in a week may provide information about one’s interest in “politics”.
The “y” observation is available only for the survey participants.
Voluntary samples
Propensity score weighting for a nonignorable missing mechanism: Use “y” (as well as x) when modeling the propensity scores.
  Greenlees, Reece, and Zieschang (1982): Use joint model assumptions.
  Chang and Kott (2008), Kott and Chang (2010), Wang, Shao, and Kim (2012): Use instrumental variable calibration.
Challenges:
  The (propensity) model parameters are very difficult to estimate.
  A good instrumental variable may not be available.
  Strong model assumptions are required, with no tool for model diagnostics.
Voluntary samples
Question: How can we improve the modeling and parameter estimation for nonignorable nonresponse?
Follow-up experiment on the nonrespondents:
  Can build a propensity model
  Parameter estimation is easy
A follow-up of nonrespondents may not be practically feasible for voluntary sampling:
  No sampling frame
  Cost increases
Proposed approach: Consider a follow-up experiment on the participants.
Motivating Example
Motivation (2012 Iowa Caucus Survey)
Telephone survey for 2012 Iowa Caucus.
First attempt (Nov. 2011): 15% response rate;
Second attempt (Dec. 2011): 75% response rate.
Motivation (2012 Iowa Caucus Survey) (cont’d)
Three-phase sampling structure (U → A → R1 → R2):
1. Phase One: Probability sampling from a list frame
2. Phase Two: Voluntary sampling with about 15% participation rate
3. Phase Three: Voluntary sampling with about 75% participation rate
Similar questions were asked in the two surveys, so we may assume that the two voluntary sampling mechanisms are essentially the same (except for the overall response rate).
In this case, we can treat R1 as a finite population and fit a response model for selecting R2 from R1.
The estimated response probability is then used to compute the propensity weights for R1.
Motivation (2012 Iowa Caucus Survey) (cont’d)
δ1 = 1 if participated in the first survey
δ2 = 1 if participated in the second survey
Propensity score models: π1 = Pr(δ1 = 1|X , Y ) and
π2 = Pr(δ2 = 1|X , Y , δ1 = 1)
The propensity scores may depend on Y (Y : Candidate preference)
We may assume that the odds ratio
$$\frac{\pi_1/(1-\pi_1)}{\pi_2/(1-\pi_2)}$$
does not depend on $(x, y)$.
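As a worked illustration of this assumption, suppose (as an assumed working form, not something stated on this slide) that both propensities are logistic in $(x, y)$ with common slopes $(\phi_1, \phi_2)$ and separate intercepts $\phi_0$ and $\phi_0^*$. The odds ratio then reduces to a constant:

```latex
\begin{align*}
\frac{\pi_1}{1-\pi_1} &= \exp(\phi_0 + \phi_1 x + \phi_2 y), \qquad
\frac{\pi_2}{1-\pi_2} = \exp(\phi_0^* + \phi_1 x + \phi_2 y), \\
\frac{\pi_1/(1-\pi_1)}{\pi_2/(1-\pi_2)} &= \exp(\phi_0 - \phi_0^*).
\end{align*}
```

Under this form only the intercepts differ, so the two surveys can share the same dependence on $(x, y)$ while having very different overall participation rates.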
Proposed method
Basic Setup
U: target finite population
A1: a volunteer sample obtained from an unknown sampling design
A2: a second volunteer sample obtained from an unknown sampling design
x: covariates observed in A1 and A2
y1: study variable observed in A1
y2: study variable observed in A2
δ1i, δ2i: sampling indicators for the first and second sampling designs
Extra information about x may be available, e.g., X̄N is known.
Parameters of interest: θ1 = E(Y1) and θ2 = E(Y2).
Two-phase approach
Nested structure: A2 ⊂ A1 ⊂ U (i.e. y1i is also observed in A2 )
The first-phase sampling mechanism is assumed to be
$$\pi_{1i}(\phi) = \Pr(\delta_{1i} = 1 \mid x_i, y_{1i}) = \frac{\exp(\phi_0 + \phi_1 x_i + \phi_2 y_{1i})}{1 + \exp(\phi_0 + \phi_1 x_i + \phi_2 y_{1i})}.$$
The second-phase sampling mechanism is assumed to be
$$\pi_{2i}(\phi^*) = \Pr(\delta_{2i} = 1 \mid x_i, y_{2i}, \delta_{1i} = 1) = \frac{\exp(\phi_0^* + \phi_1 x_i + \phi_2 y_{2i})}{1 + \exp(\phi_0^* + \phi_1 x_i + \phi_2 y_{2i})}.$$
Note that the slope parameters (φ1, φ2) are assumed to be the same in the two propensity models; only the intercepts differ.
N is assumed to be known.
This is the setup for the 2012 Iowa Caucus example.
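The nested structure can be made concrete with a small sketch that draws the two sampling indicators from the logistic models above. The population, the parameter values, and the use of independent Bernoulli draws are illustrative assumptions, and the function name `draw_nested_samples` is hypothetical.

```python
import numpy as np

def expit(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def draw_nested_samples(x, y1, y2, phi0, phi0_star, phi1, phi2, rng):
    """Draw delta1 and delta2 by independent Bernoulli trials (an assumption)."""
    pi1 = expit(phi0 + phi1 * x + phi2 * y1)        # first-phase propensity
    pi2 = expit(phi0_star + phi1 * x + phi2 * y2)   # second-phase propensity given delta1 = 1
    delta1 = rng.random(x.size) < pi1
    # The second phase is attempted only among first-phase participants,
    # so A2 is nested in A1 by construction.
    delta2 = delta1 & (rng.random(x.size) < pi2)
    return delta1, delta2, pi1, pi2

# Illustrative population and parameter values (not taken from the slides).
rng = np.random.default_rng(0)
N = 10_000
x = rng.normal(size=N)
y1 = 1.0 + x + rng.normal(size=N)
y2 = y1 + 0.5 * rng.normal(size=N)
delta1, delta2, pi1, pi2 = draw_nested_samples(x, y1, y2, -2.0, 0.2, 0.3, 0.3, rng)
print("n1 =", delta1.sum(), "n2 =", delta2.sum())
```

Because the slopes are shared, changing only `phi0_star` changes the overall second-phase participation rate without changing how participation depends on x and y.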
Proposed estimator
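Proposed estimator for $\theta_1$:
$$\hat\theta_1 = \frac{1}{N} \sum_{i \in A_1} \frac{y_{1i}}{\pi_{1i}(\hat\Phi)}, \tag{1}$$
where $\hat\Phi$ is a consistent estimator of the true parameter $\Phi = (\phi_0^*, \phi_0, \phi_1, \phi_2)$.
Proposed estimator for $\theta_2$:
$$\hat\theta_2 = \frac{1}{N} \sum_{i \in A_2} \frac{y_{2i}}{\pi_{1i}(\hat\Phi)\,\pi_{2i}(\hat\Phi)}. \tag{2}$$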
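Estimators (1) and (2) are inverse-propensity-weighted sums divided by the known N. A minimal sketch, assuming the estimated propensities are already available as arrays (the function names and the toy numbers are illustrative):

```python
import numpy as np

def theta1_hat(y1_A1, pi1_A1, N):
    """Estimator (1): sum over A1 of y1i / pi1i, divided by N."""
    return np.sum(np.asarray(y1_A1) / np.asarray(pi1_A1)) / N

def theta2_hat(y2_A2, pi1_A2, pi2_A2, N):
    """Estimator (2): sum over A2 of y2i / (pi1i * pi2i), divided by N."""
    weights = 1.0 / (np.asarray(pi1_A2) * np.asarray(pi2_A2))
    return np.sum(weights * np.asarray(y2_A2)) / N

# Toy numbers: every unit has propensity 0.5 in each phase.
t1 = theta1_hat([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], N=6)   # (2 + 4 + 6) / 6
t2 = theta2_hat([2.0, 4.0], [0.5, 0.5], [0.5, 0.5], N=8)  # (8 + 16) / 8
print(t1, t2)
```

Each unit in A2 carries the product weight 1 / (π1i π2i) because it had to be selected in both phases.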
How to estimate Φ?
Estimating Φ
Idea: Calibration using $(1, x_i, y_{1i})$ (instrumental variables).
$$U_1(\Phi) := \sum_{i \in A_1} \left\{ \frac{\delta_{2i}}{\pi_{2i}(\phi_0^*, \phi_1, \phi_2)} - 1 \right\} (1, x_i, y_{1i})' = (0, 0, 0)'$$
and
$$U_2(\Phi) := \sum_{i \in A_1} \frac{1}{\pi_{1i}(\phi_0, \phi_1, \phi_2)} - N = 0,$$
where $\Phi = (\phi_0^*, \phi_1, \phi_2, \phi_0)$.
Note: $(x_i, y_{1i})$ is observed in both $A_1$ and $A_2$ and is used in computing $\hat\phi_2$ in
$$\pi_{2i} = \frac{\exp(\phi_0^* + \phi_1 x_i + \phi_2 y_{2i})}{1 + \exp(\phi_0^* + \phi_1 x_i + \phi_2 y_{2i})}.$$
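The calibration equations can be solved numerically. The sketch below is one possible approach, not the authors' algorithm: it solves $U_1 = 0$ for $(\phi_0^*, \phi_1, \phi_2)$ with damped Newton steps (a choice made here) and then solves $U_2 = 0$ for $\phi_0$ in closed form, which is possible because $1/\pi_{1i} = 1 + \exp(-\phi_0 - \phi_1 x_i - \phi_2 y_{1i})$ under the logistic model. A scalar x, $n_2 < n_1 < N$, and the simulated data are assumptions; `solve_calibration` is a hypothetical name.

```python
import numpy as np

def solve_calibration(x1, y1, in_A2, y2_A2, N, max_iter=100, tol=1e-8):
    """Solve U1(Phi) = 0 by damped Newton steps, then U2(Phi) = 0 in closed form.

    x1, y1 : covariate and first-phase study variable for the units in A1
    in_A2  : boolean mask over A1 marking second-phase participants
    y2_A2  : second-phase study variable for the units in A2
    """
    h1 = np.column_stack([np.ones_like(x1), x1, y1])                  # instruments on A1
    h1_A2 = h1[in_A2]
    h2_A2 = np.column_stack([np.ones_like(y2_A2), x1[in_A2], y2_A2])  # regressors of pi2
    total_h1 = h1.sum(axis=0)

    def u1(phi):
        inv_pi2 = 1.0 + np.exp(-h2_A2 @ phi)     # 1 / pi2 for the logistic model
        return h1_A2.T @ inv_pi2 - total_h1

    n1, n2 = x1.size, int(in_A2.sum())
    phi = np.array([np.log(n2 / (n1 - n2)), 0.0, 0.0])   # start at constant pi2 = n2 / n1
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(max_iter):
            u = u1(phi)
            if np.max(np.abs(u)) < tol:
                break
            w = np.exp(-h2_A2 @ phi)                      # (1 - pi2) / pi2
            jac = -(h1_A2 * w[:, None]).T @ h2_A2         # derivative of U1 w.r.t. phi
            step = np.linalg.solve(jac, -u)
            t = 1.0
            # Halve the step until the residual norm decreases.
            while t > 1e-10 and not (np.linalg.norm(u1(phi + t * step)) < np.linalg.norm(u)):
                t /= 2.0
            phi = phi + t * step
    phi0_star, phi1, phi2 = phi
    # U2 = 0: n1 + exp(-phi0) * sum(exp(-eta_i)) = N, solved explicitly for phi0.
    eta = phi1 * x1 + phi2 * y1
    phi0 = np.log(np.sum(np.exp(-eta))) - np.log(N - n1)
    return phi0_star, phi1, phi2, phi0, u1(phi)

# Simulated check (illustrative values; the same item is asked in both phases, so y2 = y1).
rng = np.random.default_rng(1)
N = 20_000
x = rng.normal(size=N)
y = 1.0 + x + rng.normal(size=N)
d1 = rng.random(N) < 1.0 / (1.0 + np.exp(-(-2.0 + 0.3 * x + 0.3 * y)))
d2 = d1 & (rng.random(N) < 1.0 / (1.0 + np.exp(-(0.2 + 0.3 * x + 0.3 * y))))
est = solve_calibration(x[d1], y[d1], d2[d1], y[d2], N)
print("phi0*, phi1, phi2, phi0 =", est[:4])
```

The returned residual `est[4]` is the value of $U_1$ at the solution, which gives a direct convergence check.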
Estimating (θ1 , θ2 , Φ)
Asymptotic Properties:
Consistency (of Φ̂ and of θ̂1 , θ̂2 );
Asymptotic Variance;
Variance Estimation;
Asymptotic Normality (Skip).
Estimating $(\theta_1, \Phi)$
$(\hat\theta_1, \hat\Phi)$ can be obtained by solving
$$U_p(\theta_1, \Phi) = 0, \qquad U_c(\Phi) = 0,$$
where $U_p(\theta_1, \Phi) = N^{-1} \sum_{i \in A_1} \{\pi_{1i}(\phi_0, \phi_1, \phi_2)\}^{-1} y_{1i} - \theta_1$ and $U_c(\Phi)' = [U_1(\Phi)', U_2(\Phi)']$, with $U_1(\Phi)$ and $U_2(\Phi)$ the calibration equations for computing $\Phi$.
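Asymptotic variance for $(\hat\theta_1, \hat\Phi)$
Because $E\{U_p(\theta_1^*, \Phi^*)\} = 0$ and $E\{U_c(\Phi^*)\} = 0$, where $(\theta_1^*, \Phi^*)$ denotes the true parameter values, the solution $(\hat\theta_1, \hat\Phi)$ is consistent and has asymptotic variance
$$V\begin{pmatrix} \hat\theta_1 \\ \hat\Phi \end{pmatrix} \cong
\begin{pmatrix} -1 & E(\partial U_p/\partial \Phi) \\ 0 & E(\partial U_c/\partial \Phi) \end{pmatrix}^{-1}
\begin{pmatrix} V(U_p) & C(U_p, U_c) \\ C(U_c, U_p) & V(U_c) \end{pmatrix}
\left\{ \begin{pmatrix} -1 & E(\partial U_p/\partial \Phi) \\ 0 & E(\partial U_c/\partial \Phi) \end{pmatrix}' \right\}^{-1}.$$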
Asymptotic variance for $\hat\theta_1$ (cont’d)
Thus, the asymptotic variance can be written as
$$V(\hat\theta_1) \cong V\left\{ \frac{1}{N} \sum_{i=1}^N \frac{\delta_{1i}}{\pi_{1i}} (y_{1i} - B_{2,y}) \right\} + V\left\{ \frac{1}{N} \sum_{i=1}^N B_{1,y}\, \delta_{1i} \left( \frac{\delta_{2i}}{\pi_{2i}} - 1 \right) h_{1i} \right\},$$
where
$$(B_{1,y}, B_{2,y}) = \sum_{i=1}^N (1 - \pi_{1i})\, y_{1i}\, (0, x_i, y_{1i}, 1)
\begin{pmatrix} \sum_{i=1}^N \pi_{1i} (1 - \pi_{2i})\, h_{1i} (h_{2i}', 0) \\ \sum_{i=1}^N (1 - \pi_{1i}) (0, x_i, y_{1i}, 1) \end{pmatrix}^{-1},$$
with $h_{1i} = (1, x_i, y_{1i})'$ and $h_{2i} = (1, x_i, y_{2i})'$.
Asymptotic variance estimator for $\hat\theta_1$
For variance estimation, we can apply the plug-in method to the linearized variance to obtain
$$\hat V(\hat\theta_1) = \frac{1}{N^2} \sum_{i \in A_1} \frac{1 - \hat\pi_{1i}}{\hat\pi_{1i}^2} (y_{1i} - \hat B_{2,y})^2 + \frac{1}{N^2} \sum_{i \in A_2} \frac{1 - \hat\pi_{2i}}{\hat\pi_{2i}^2} (\hat B_{1,y} h_{1i})^2,$$
where
$$(\hat B_{1,y}, \hat B_{2,y}) = \sum_{i \in A_1} \frac{1 - \hat\pi_{1i}}{\hat\pi_{1i}}\, y_{1i}\, (0, x_i, y_{1i}, 1)
\begin{pmatrix} \sum_{i \in A_2} \hat\pi_{2i}^{-1} (1 - \hat\pi_{2i})\, h_{1i} (h_{2i}', 0) \\ \sum_{i \in A_1} \hat\pi_{1i}^{-1} (1 - \hat\pi_{1i}) (0, x_i, y_{1i}, 1) \end{pmatrix}^{-1}.$$
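The plug-in variance estimator for $\hat\theta_1$ translates fairly directly into array operations. The sketch below assumes a scalar x and parameters ordered as $(\phi_0^*, \phi_1, \phi_2, \phi_0)$; the function name is hypothetical, and the demonstration plugs in the true propensities of a simulated population in place of estimated ones purely for illustration.

```python
import numpy as np

def var_theta1_hat(x1, y1, pi1, in_A2, y2_A2, pi2_A2, N):
    """Plug-in variance estimator for theta1_hat (scalar x assumed).

    x1, y1, pi1   : values for the units in A1 (pi1 = estimated first-phase propensity)
    in_A2         : boolean mask over A1 marking the units in A2
    y2_A2, pi2_A2 : second-phase variable and estimated propensity on A2
    """
    h1 = np.column_stack([np.ones_like(x1), x1, y1])
    h1_A2 = h1[in_A2]
    # Rows (h2', 0) on A2 and (0, x, y1, 1) on A1.
    h2_ext = np.column_stack([np.ones_like(y2_A2), x1[in_A2], y2_A2, np.zeros_like(y2_A2)])
    g1 = np.column_stack([np.zeros_like(x1), x1, y1, np.ones_like(x1)])
    w1 = (1.0 - pi1) / pi1
    w2 = (1.0 - pi2_A2) / pi2_A2
    M = np.vstack([(h1_A2 * w2[:, None]).T @ h2_ext,    # 3 x 4 block
                   (w1[:, None] * g1).sum(axis=0)])     # 1 x 4 block
    c = ((w1 * y1)[:, None] * g1).sum(axis=0)           # row vector multiplying M^{-1}
    B = np.linalg.solve(M.T, c)                         # B = c M^{-1}
    B1, B2 = B[:3], B[3]
    term1 = np.sum(w1 / pi1 * (y1 - B2) ** 2)           # sum over A1
    term2 = np.sum(w2 / pi2_A2 * (h1_A2 @ B1) ** 2)     # sum over A2
    return (term1 + term2) / N**2

# Illustrative data; true propensities stand in for the estimated ones.
rng = np.random.default_rng(2)
N = 5_000
x = rng.normal(size=N)
y = 1.0 + x + rng.normal(size=N)
p1 = 1.0 / (1.0 + np.exp(-(-1.5 + 0.3 * x + 0.3 * y)))
p2 = 1.0 / (1.0 + np.exp(-(0.2 + 0.3 * x + 0.3 * y)))
d1 = rng.random(N) < p1
d2 = d1 & (rng.random(N) < p2)
v = var_theta1_hat(x[d1], y[d1], p1[d1], d2[d1], y[d2], p2[d2], N)
print("estimated variance:", v, "standard error:", np.sqrt(v))
```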
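Asymptotic variance for $\hat\theta_2$
Recall
$$\hat\theta_2 = \frac{1}{N} \sum_{i \in A_2} \frac{1}{\hat\pi_{1i} \hat\pi_{2i}}\, y_{2i}.$$
Similarly,
$$V(\hat\theta_2) \cong \frac{1}{N^2} V\left\{ \sum_{i=1}^N \frac{\delta_{1i} \delta_{2i}}{\pi_{1i} \pi_{2i}}\, y_{2i} - D_{1,y} \sum_{i=1}^N \delta_{1i} \left( \frac{\delta_{2i}}{\pi_{2i}} - 1 \right) h_{1i} - D_{2,y} \sum_{i=1}^N \left( \frac{\delta_{1i}}{\pi_{1i}} - 1 \right) \right\},$$
where
$$(D_{1,y}, D_{2,y}) = \sum_{i=1}^N y_{2i} \left\{ (1 - \pi_{1i}) (0, x_i, y_{1i}, 1) + (1 - \pi_{2i}) (1, x_i, y_{2i}, 0) \right\}
\times \begin{pmatrix} \sum_{i=1}^N \pi_{1i} (1 - \pi_{2i})\, h_{1i} (h_{2i}', 0) \\ \sum_{i=1}^N (1 - \pi_{1i}) (0, x_i, y_{1i}, 1) \end{pmatrix}^{-1}.$$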
Asymptotic variance estimator for $\hat\theta_2$
Thus, a consistent estimator for the variance of $\hat\theta_2$ is given by
$$\hat V(\hat\theta_2) = \frac{1}{N^2} \sum_{i \in A_2} \frac{1 - \hat\pi_{1i}}{\hat\pi_{1i}^2 \hat\pi_{2i}} (y_{2i} - \hat D_{2,y})^2 + \frac{1}{N^2} \sum_{i \in A_2} \frac{1 - \hat\pi_{2i}}{\hat\pi_{1i}^2 \hat\pi_{2i}^2} (y_{2i} - \hat D_{1,y} \hat\pi_{1i} h_{1i})^2,$$
where
$$(\hat D_{1,y}, \hat D_{2,y}) = \sum_{i \in A_2} \frac{y_{2i}}{\hat\pi_{1i} \hat\pi_{2i}} \left\{ (1 - \hat\pi_{1i}) (0, x_i, y_{1i}, 1) + (1 - \hat\pi_{2i}) (1, x_i, y_{2i}, 0) \right\}
\times \begin{pmatrix} \sum_{i \in A_2} \hat\pi_{2i}^{-1} (1 - \hat\pi_{2i})\, h_{1i} (h_{2i}', 0) \\ \sum_{i \in A_1} \hat\pi_{1i}^{-1} (1 - \hat\pi_{1i}) (0, x_i, y_{1i}, 1) \end{pmatrix}^{-1}.$$
Regression estimation
Instead of using the direct estimator $\hat\theta_2$ in (2), we can use a two-phase regression estimator to improve the efficiency. The proposed regression estimator is
$$\hat\theta_{2,Reg} = \hat\theta_2 - \hat B_{h1} (\hat h_{2,1} - \hat h_{1,1}), \tag{3}$$
where $\hat h_{1,1} = N^{-1} \sum_{i \in A_1} \hat\pi_{1i}^{-1} h_{1i}$, $\hat h_{2,1} = N^{-1} \sum_{i \in A_2} \hat\pi_{1i}^{-1} \hat\pi_{2i}^{-1} h_{1i}$, and $\hat B_{h1}$ is chosen to minimize the variance.
Note that $A_2 \subset A_1$ implies that $h_{1i}$ is observed in $A_2$.
Regression estimation (Cont’d)
If the population information $\bar X_N$ is available, we can construct the following regression estimators:
$$\hat\theta_{1,Reg} = \hat\theta_1 - \hat B_{Reg} (\hat\theta_{x,1} - \bar X_N) \tag{4}$$
and
$$\hat\theta_{2,Reg} = \hat\theta_2 - \hat B_{1,Reg}^* (\hat h_{2,1} - \hat h_{1,1}) - \hat B_{2,Reg}^* (\hat\theta_{x,2} - \bar X_N), \tag{5}$$
where $\hat\theta_{x,1} = N^{-1} \sum_{i \in A_1} \hat\pi_{1i}^{-1} x_i$ and $\hat\theta_{x,2} = N^{-1} \sum_{i \in A_2} \hat\pi_{1i}^{-1} \hat\pi_{2i}^{-1} x_i$.
Asymptotic properties can be derived similarly (skipped).
Nonnested two-phase extension
Nonnested Two-Phase approach
Basic setup
What if a follow-up is not used?
Instead of obtaining a follow-up sample A2 from A1, suppose that there is another sample A2, selected independently of A1 from the same population.
Assume that (xi, yi) are observed in both A1 and A2 (i.e., the two surveys have common survey items).
Nonnested structure: A1 ⊂ U and A2 ⊂ U.
Assume that the two sample selections are independent.
The sampling mechanism depends on (x, y).
Nonnested Two-Phase approach (Cont’d)
Approach
We may treat the sampling for A1 and A2 as a capture-recapture
sampling
Propensity model:
$$\pi_{1i}(\phi) = \Pr(\delta_{1i} = 1 \mid x_i, y_i) = \frac{\exp(\phi_0 + \phi_1 x_i + \phi_2 y_i)}{1 + \exp(\phi_0 + \phi_1 x_i + \phi_2 y_i)}$$
and
$$\pi_{2i}(\phi^*) = \Pr(\delta_{2i} = 1 \mid x_i, y_i) = \frac{\exp(\phi_0^* + \phi_1^* x_i + \phi_2^* y_i)}{1 + \exp(\phi_0^* + \phi_1^* x_i + \phi_2^* y_i)}.$$
Thus, the two models are allowed to be different.
Nonnested Two-Phase approach (Cont’d)
Parameter estimation: Maximize the conditional likelihood
$$L_C(\Phi) = \prod_{i \in A_1 \setminus A_2} \frac{\pi_{1i}(\phi) \{1 - \pi_{2i}(\phi^*)\}}{p_i(\phi, \phi^*)}
\times \prod_{i \in A_2 \setminus A_1} \frac{\{1 - \pi_{1i}(\phi)\}\, \pi_{2i}(\phi^*)}{p_i(\phi, \phi^*)}
\times \prod_{i \in A_1 \cap A_2} \frac{\pi_{1i}(\phi)\, \pi_{2i}(\phi^*)}{p_i(\phi, \phi^*)},$$
where $p_i(\phi, \phi^*) = 1 - \{1 - \pi_{1i}(\phi)\} \{1 - \pi_{2i}(\phi^*)\}$ and $\Phi = (\phi, \phi^*)$.
The conditional likelihood is obtained by considering the conditional distribution of $(\delta_{1i}, \delta_{2i})$ given that unit $i$ is selected in at least one of the two samples.
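The conditional likelihood can be evaluated with a few lines of code. The sketch below computes the three conditional cell probabilities (in A1 only, in A2 only, in both) and the log of $L_C$; the function names, a scalar x, and the toy data are assumptions for illustration. In practice the log-likelihood would be passed to a numerical optimizer, which is not shown here.

```python
import numpy as np

def expit(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def cell_probabilities(phi, phi_star, x, y):
    """Conditional probabilities of (A1 only, A2 only, both) given capture in A1 or A2."""
    h = np.column_stack([np.ones_like(x), x, y])
    pi1 = expit(h @ phi)
    pi2 = expit(h @ phi_star)
    p = 1.0 - (1.0 - pi1) * (1.0 - pi2)      # probability of being in at least one sample
    return pi1 * (1.0 - pi2) / p, (1.0 - pi1) * pi2 / p, pi1 * pi2 / p

def conditional_loglik(phi, phi_star, x, y, d1, d2):
    """log L_C over the units observed in A1 or A2 (d1, d2 are boolean indicators)."""
    only1, only2, both = cell_probabilities(phi, phi_star, x, y)
    return (np.log(only1[d1 & ~d2]).sum()
            + np.log(only2[~d1 & d2]).sum()
            + np.log(both[d1 & d2]).sum())

# Toy data: four units, each captured by at least one of the two samples.
phi = np.array([-1.0, 0.2, 0.3])
phi_star = np.array([-0.5, 0.1, 0.4])
x = np.array([0.0, 1.0, -1.0, 0.5])
y = np.array([1.0, 2.0, 0.5, 1.5])
d1 = np.array([True, True, False, True])
d2 = np.array([False, True, True, True])
probs = cell_probabilities(phi, phi_star, x, y)
ll = conditional_loglik(phi, phi_star, x, y, d1, d2)
print("cell probabilities sum to", probs[0] + probs[1] + probs[2], "log-likelihood", ll)
```

The three cell probabilities sum to one for every unit, which is what makes $L_C$ a proper likelihood for the units captured at least once.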
Nonnested Two-Phase approach (Cont’d)
Score functions derived from the conditional likelihood:
$$S_{C1}(\Phi) := \sum_{i \in A_1} (1, x_i', y_i)' - \sum_{i \in A_1 \cup A_2} \frac{\pi_{1i}(\phi)}{p_i(\phi, \phi^*)} (1, x_i', y_i)'$$
and
$$S_{C2}(\Phi) := \sum_{i \in A_2} (1, x_i', y_i)' - \sum_{i \in A_1 \cup A_2} \frac{\pi_{2i}(\phi^*)}{p_i(\phi, \phi^*)} (1, x_i', y_i)'.$$
The propensity score estimator of $\theta = E(Y)$ based on $A_1$:
$$\hat\theta = \frac{\sum_{i \in A_1} \pi_{1i}^{-1}(\hat\phi)\, y_i}{\sum_{i \in A_1} \pi_{1i}^{-1}(\hat\phi)}.$$
Asymptotic properties of $\hat\theta$ can be obtained similarly (skipped).
Simulation study
Simulation Study
A finite population of size $N = 10{,}000$ was generated from
$$Y_{1i} = 1 + 0.5 (X_i - 2.5)^2 + (e_{1i} - 1)/\sqrt{2},$$
$$Y_{2i} = 1 + 0.5 (X_i - 2.5)^2 + (e_{2i} - 1.4)/\sqrt{2},$$
where $X_i \sim$ i.i.d. $N(2, 1)$, $e_{1i} \sim$ i.i.d. $\chi_1^2$, $e_{3i} \sim$ i.i.d. $\chi_1^2$, $e_{1i}$ is independent of $e_{3i}$, and $e_{2i} = 0.8 e_{1i} + 0.6 e_{3i}$.
From the finite population, we repeatedly generated two-phase samples with approximate sample sizes $n_1 = 500$ and $n_2 = 300$ for the phase one and phase two samples, respectively.
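The population model above can be reproduced in a few lines. In this sketch the seed is arbitrary and the variable names are illustrative. Since $E\{(X - 2.5)^2\} = \mathrm{Var}(X) + (2 - 2.5)^2 = 1.25$ and both error terms are centered ($E(e_{1i}) = 1$, $E(e_{2i}) = 0.8 + 0.6 = 1.4$), the model means are $E(Y_1) = E(Y_2) = 1 + 0.5 \times 1.25 = 1.625$.

```python
import numpy as np

rng = np.random.default_rng(2012)   # arbitrary seed
N = 10_000

X = rng.normal(loc=2.0, scale=1.0, size=N)
e1 = rng.chisquare(df=1, size=N)
e3 = rng.chisquare(df=1, size=N)    # independent of e1
e2 = 0.8 * e1 + 0.6 * e3            # correlated with e1, mean 1.4

Y1 = 1.0 + 0.5 * (X - 2.5) ** 2 + (e1 - 1.0) / np.sqrt(2.0)
Y2 = 1.0 + 0.5 * (X - 2.5) ** 2 + (e2 - 1.4) / np.sqrt(2.0)

# Model mean shared by Y1 and Y2: 1 + 0.5 * {Var(X) + (2 - 2.5)^2}
theta_true = 1.0 + 0.5 * (1.0 + 0.5 ** 2)
print(theta_true, Y1.mean(), Y2.mean())
```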
Simulation Study
We consider the following response mechanisms for the first-phase and second-phase sampling indicators $\delta_{1i}$ and $\delta_{2i}$:
(M1). (Linear Ignorable)
$$\pi_{1i} = \frac{\exp(\phi_0 + \phi_1 X_i)}{1 + \exp(\phi_0 + \phi_1 X_i)}, \qquad \pi_{2i} = \frac{\exp(\phi_0^* + \phi_1 X_i)}{1 + \exp(\phi_0^* + \phi_1 X_i)},$$
where $(\phi_0, \phi_1, \phi_0^*) = (-3.5, 0.3, 0.1)$ for model A and $(-3.5, 0.3, 0.2)$ for model B.
(M2). (Linear Non-ignorable)
$$\pi_{1i} = \frac{\exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})}{1 + \exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})}, \qquad \pi_{2i} = \frac{\exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})}{1 + \exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})},$$
where $(\phi_0, \phi_1, \phi_2, \phi_0^*) = (-4, 0.1, 0.3, -0.2)$ for model A and $(-3.7, 0.1, 0.3, -0.2)$ for model B.
Simulation Study
(M3). (Complementary log-log Non-ignorable)
$$\pi_{1i} = 1 - \exp\{-\exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})\}, \qquad \pi_{2i} = 1 - \exp\{-\exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})\},$$
where $(\phi_0, \phi_1, \phi_2, \phi_0^*) = (-3.4, 0.1, 0.1, -0.3)$ for model A and $(-3.3, 0.1, 0.1, -0.1)$ for model B.
(M4). (Probit Non-ignorable)
$$\pi_{1i} = \Phi(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i}), \qquad \pi_{2i} = \Phi(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i}),$$
where $\Phi(\cdot)$ here denotes the standard normal distribution function (not the parameter vector) and $(\phi_0, \phi_1, \phi_2, \phi_0^*) = (-2.6, 0.2, 0.2, -0.5)$ for model A and $(-2.4, 0.2, 0.2, -0.5)$ for model B.
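The three link functions used in (M1)-(M4) differ only in how the linear predictor is mapped to a probability. A minimal sketch (function names are illustrative; the grid of linear predictor values is arbitrary):

```python
import math
import numpy as np

def logistic(eta):
    """Link used in (M1) and (M2)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(eta, dtype=float)))

def cloglog(eta):
    """Complementary log-log link used in (M3)."""
    return 1.0 - np.exp(-np.exp(np.asarray(eta, dtype=float)))

def probit(eta):
    """Probit link used in (M4): standard normal CDF via the error function."""
    z = np.asarray(eta, dtype=float) / math.sqrt(2.0)
    return 0.5 * (1.0 + np.vectorize(math.erf)(z))

eta = np.linspace(-4.0, 1.0, 6)   # arbitrary grid of linear predictor values
for name, link in [("logistic", logistic), ("cloglog", cloglog), ("probit", probit)]:
    print(name, np.round(link(eta), 4))
```

Since the working model in the simulation is the logistic model (M2), data generated under the other two links represent a misspecified propensity model.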
Estimators for θ1 (Nested Two-Phase)
The “working” model for the propensity score estimation is the Linear Non-ignorable model (M2).
From each sample, we computed the following four estimators for $\theta_1 = E(Y_1)$:
1. Naive: Calibration estimator which assumes an ignorable missing mechanism;
2. PS: Proposed propensity score estimator, computed by $N^{-1} \sum_{i \in A_1} \hat\pi_{1i}^{-1} y_{1i}$;
3. REG: Proposed regression estimator, $\hat\theta_1 - \hat B_{reg} (\hat\theta_{x,1} - \bar X_N)$, where $\hat B_{reg} = \sum_{i \in A_1} \hat\pi_{1i}^{-1} y_{1i} x_i' \left\{ \sum_{i \in A_1} \hat\pi_{1i}^{-1} x_i x_i' \right\}^{-1}$;
4. OPT: Proposed optimal estimator, $\hat\theta_1 - \hat B_{opt} (\hat\theta_{x,1} - \bar X_N)$, where $\hat B_{opt} = \widehat{Cov}(\hat\theta_1, \hat\theta_{x,1})\, \hat V^{-1}(\hat\theta_{x,1})$.
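As an illustration of the REG estimator, the sketch below handles a scalar covariate, for which the matrix expression for the regression coefficient reduces to a ratio of weighted sums. The function name and the toy data are assumptions.

```python
import numpy as np

def regression_estimator(x, y1, pi1_hat, N, xbar_N):
    """REG estimator theta1_hat - B_reg * (theta_x1_hat - xbar_N) for scalar x on A1."""
    x, y1, pi1_hat = (np.asarray(a, dtype=float) for a in (x, y1, pi1_hat))
    w = 1.0 / pi1_hat                                  # propensity weights
    theta1 = np.sum(w * y1) / N                        # PS estimator of E(Y1)
    theta_x1 = np.sum(w * x) / N                       # PS estimator of the mean of x
    b_reg = np.sum(w * y1 * x) / np.sum(w * x * x)     # weighted slope through the origin
    return theta1 - b_reg * (theta_x1 - xbar_N)

# Toy data with y = 2x exactly, so the population mean of y is 2 * xbar_N = 5.
r = regression_estimator(x=[1.0, 2.0, 3.0], y1=[2.0, 4.0, 6.0],
                         pi1_hat=[0.5, 0.5, 0.5], N=6, xbar_N=2.5)
print(r)
```

In the toy data the PS estimate of the mean of x is 2.0 rather than the known 2.5, and the regression adjustment moves the estimate of the mean of y from 4.0 to 5.0.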
Results (Nested Two-Phase)
Table: Simulation results of the point estimators and variance estimators under models (M1)-(M4).
Model  Parameter  Method  Bias     S.E.    RMSE    R.B.
(M1)   θ1         Naive    0.0075  0.0814  0.0817  N/A
(M1)   θ1         PS       0.0226  0.1783  0.1797   0.0174
(M1)   θ1         REG      0.0278  0.2177  0.2195   0.0107
(M1)   θ1         OPT     -0.0063  0.1409  0.1410   0.0589
(M1)   θ2         Naive    0.0074  0.0982  0.0985  N/A
(M1)   θ2         PS       0.0121  0.1575  0.1580   0.0453
(M1)   θ2         REG      0.0208  0.1716  0.1729  -0.1082
(M1)   θ2         OPT      0.0149  0.1667  0.1674  -0.0848
Results (Nested Two-Phase)
Table: Simulation results of the point estimators and variance estimators under models (M1)-(M4).
Model  Parameter  Method  Bias     S.E.    RMSE    R.B.
(M2)   θ1         Naive    0.5733  0.0872  0.5799  N/A
(M2)   θ1         PS       0.0011  0.1192  0.1192   0.0401
(M2)   θ1         REG      0.0015  0.1439  0.1439   0.0651
(M2)   θ1         OPT     -0.0078  0.1126  0.1129  -0.0036
(M2)   θ2         Naive    0.7129  0.1048  0.7206  N/A
(M2)   θ2         PS       0.0003  0.1156  0.1156   0.0368
(M2)   θ2         REG      0.0007  0.1160  0.1160   0.0410
(M2)   θ2         OPT      0.0015  0.1156  0.1156   0.0326
Table: Simulation results of the point estimators and variance estimators under models (M1)-(M4).
Model  Parameter  Method  Bias     S.E.    RMSE    R.B.
(M3)   θ1         Naive    0.1881  0.0836  0.2058  N/A
(M3)   θ1         PS      -0.0892  0.1335  0.1606  -0.0419
(M3)   θ1         REG     -0.0687  0.1658  0.1795  -0.0831
(M3)   θ1         OPT     -0.1213  0.1169  0.1685   0.0207
(M3)   θ2         Naive    0.2526  0.0987  0.2712  N/A
(M3)   θ2         PS      -0.0757  0.1297  0.1502  -0.0508
(M3)   θ2         REG     -0.0750  0.1298  0.1499  -0.0541
(M3)   θ2         OPT     -0.1077  0.1120  0.1554   0.0145
Results (Nested Two-Phase)
Table: Simulation results of the point estimators and variance estimators under models (M1)-(M4).
Model  Parameter  Method  Bias     S.E.    RMSE    R.B.
(M4)   θ1         Naive    0.8280  0.0975  0.8337  N/A
(M4)   θ1         PS       0.0920  0.1255  0.1556  -0.0086
(M4)   θ1         REG      0.0726  0.1552  0.1713  -0.0110
(M4)   θ1         OPT      0.1029  0.1129  0.1528   0.0090
(M4)   θ2         Naive    1.0536  0.1365  1.0624  N/A
(M4)   θ2         PS       0.0764  0.1230  0.1448  -0.0063
(M4)   θ2         REG      0.0774  0.1239  0.1461  -0.0171
(M4)   θ2         OPT      0.0871  0.1100  0.1403  -0.0058
Application to 2012 Iowa Caucus Survey
Propensity Model
$$\pi_{1i}(\phi) = \frac{\exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})}{1 + \exp(\phi_0 + \phi_1 X_i + \phi_2 Y_{1i})}$$
and
$$\pi_{2i}(\phi^*) = \frac{\exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})}{1 + \exp(\phi_0^* + \phi_1 X_i + \phi_2 Y_{2i})},$$
where X = (Party, Age) and Y = “First choice” of the presidential candidate for the Republican Party.
Results for propensity model parameter estimation
Table: Point Estimation and Standard Error for Φ
Coefficient  Est    S.E.   t.value
Age          0.588  0.266  2.211
Party        0.782  0.251  3.116
Romney       0.991  0.454  2.183
Perry        0.454  0.663  0.685
Paul         0.866  0.841  1.030
Others       1.307  0.985  1.327
Results for estimation of θ = proportion of first-choice candidate
Table: Point Estimation (s.e.) for 2012 Iowa Caucus Survey Results
Survey  Method     Romney         Perry          Paul           Others
Nov.    Naive      0.340          0.108          0.130          0.422
Nov.    Ignorable  0.316          0.103          0.146          0.435
Nov.    Proposed   0.303 (0.062)  0.106 (0.039)  0.093 (0.067)  0.499 (0.046)
Dec.    Naive      0.281          0.140          0.131          0.448
Dec.    Ignorable  0.270          0.144          0.148          0.437
Dec.    Proposed   0.244 (0.043)  0.134 (0.026)  0.112 (0.046)  0.509 (0.036)
Actual outcome (Jan 3, 2012): θ0 = (24.5%, 10.3%, 21.4%, 43.7%)
Conclusion
Concluding Remarks
We proposed a new approach to propensity score estimation for voluntary samples: a capture-recapture sampling approach.
If a follow-up sample is obtained from the original participants, we can assume the same propensity model and estimate the model parameters from the follow-up sample.
Instead of a follow-up sample, we can also use an independent sample with common survey items from the same population and apply the estimation method for capture-recapture sampling.
Auxiliary information from the sample or the population can be incorporated via regression estimation.
The approach is promising for web surveys.
Thank you!