Bootstrap Confidence Intervals for Small Area Means
Andreea L. Erciulescu and Wayne A. Fuller
Iowa State University, Department of Statistics, Ames, IA 50011
Introduction
Most small area studies focus on constructing predictors for the area means and on estimating the variance of the prediction errors. However, agencies and
policymakers are often interested in confidence intervals for the small area means. We present two-sided confidence intervals for the small area means of a binary
response variable. We consider unit-level data and stochastic covariates. Estimating the prediction error variance and estimating the cutoff points
are the key components in constructing confidence intervals for the small area means. We consider a linear approximation of the model and present a Taylor
approximation for the prediction error variance. We compare different bootstrap estimation methods for the cutoff points using a simulation study.
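Unit Level Model and Small Area Mean Prediction
$$
\begin{aligned}
y_{ij} \mid (x_{ij}, b_i) &\sim \text{Bernoulli}(g(x_{ij}, \beta, b_i)) \\
g(x_{ij}, \beta, b_i) &= \frac{\exp((1, x_{ij})'\beta + b_i)}{1 + \exp((1, x_{ij})'\beta + b_i)} \\
x_{ij} &= \mu_x + \delta_i + \epsilon_{ij} =: \mu_{xi} + \epsilon_{ij} \\
\tilde\mu_{xi} &= \mu_{xi} + u_i
\end{aligned}
$$
• $i = 1, ..., m$, $j = 1, ..., n_i$
• $(b_i, \delta_i, \epsilon_{ij}, u_i)$ mutually independent
• $b_i \sim \text{ind } f_b(0, \sigma_b^2)$
• $\delta_i \sim \text{ind } f_\delta(0, \sigma_\delta^2)$
• $\epsilon_{ij} \sim \text{ind } f_\epsilon(0, \sigma_\epsilon^2)$
• $u_i \sim \text{ind } f_{u_i}(0, k_i^{-1}\sigma_\epsilon^2)$
• $(y_{ij}, x_{ij})$ observed in pairs
• $\tilde\mu = (\tilde\mu_{x1}, ..., \tilde\mu_{xm})$ observed
True small area mean of y
$$
\theta_i = \int g(x, \beta, b_i)\, dF_{x_i}(x)
$$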
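As an illustration of the unit-level model above, the following is a minimal sketch of a data generator, assuming normal distributions for $f_b$, $f_\delta$, $f_\epsilon$ and $f_{u_i}$. The function name `generate_data`, its argument names, and the returned dictionary layout are hypothetical choices made for this example, not part of the poster.

```python
import numpy as np

def g(x, beta, b):
    """Logistic mean function g(x, beta, b) = expit(beta0 + beta1 * x + b)."""
    eta = beta[0] + beta[1] * x + b
    return 1.0 / (1.0 + np.exp(-eta))

def generate_data(beta, sigma2_b, mu_x, sigma2_delta, sigma2_eps, n, k, seed):
    """Draw one sample from the unit-level model (normal distributions assumed).

    n : area sample sizes n_i, length m
    k : constants k_i, so that Var(u_i) = sigma2_eps / k_i
    """
    rng = np.random.default_rng(seed)
    n = np.asarray(n, dtype=int)
    k = np.asarray(k, dtype=float)
    m = n.size
    b = rng.normal(0.0, np.sqrt(sigma2_b), m)              # area effects b_i
    delta = rng.normal(0.0, np.sqrt(sigma2_delta), m)      # covariate area effects
    mu_xi = mu_x + delta                                   # area covariate means
    mu_tilde = mu_xi + rng.normal(0.0, np.sqrt(sigma2_eps / k))  # observed noisy means
    area = np.repeat(np.arange(m), n)                      # area index of each unit
    x = mu_xi[area] + rng.normal(0.0, np.sqrt(sigma2_eps), area.size)
    y = rng.binomial(1, g(x, beta, b[area]))               # binary responses
    return {"area": area, "x": x, "y": y, "mu_tilde": mu_tilde,
            "b": b, "mu_xi": mu_xi}

# Example call with m = 36 areas of size n_i = 10 and k_i = 10.
sample = generate_data(beta=(-0.8, 1.0), sigma2_b=0.25, mu_x=0.0,
                       sigma2_delta=0.16, sigma2_eps=0.36,
                       n=[10] * 36, k=[10] * 36, seed=1)
```

Passing the seed explicitly keeps each draw reproducible, which is convenient when the same generator is reused with different parameter vectors.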
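Small area mean prediction
$$
\hat\theta_i = \frac{\int_b \int_\delta \int_\epsilon g(\mu_x + \delta + \epsilon, \beta, b)\, dF_\epsilon(\epsilon) \left( \prod_{t=1}^{n_i} f(y_{it} \mid x_{it}, b)\, f(x_{it} \mid \delta) \right) f(\tilde\mu_{xi} \mid \delta)\, dF_\delta(\delta)\, dF_b(b)}{\int_b \int_\delta \left( \prod_{t=1}^{n_i} f(y_{it} \mid x_{it}, b)\, f(x_{it} \mid \delta) \right) f(\tilde\mu_{xi} \mid \delta)\, dF_\delta(\delta)\, dF_b(b)}
$$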
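The predictor above is a ratio of integrals over $(b, \delta)$, so one way to approximate it numerically is to draw $(b, \delta)$ from their distributions and weight each draw by the likelihood of the area data. The sketch below does this under assumptions that are not stated on the poster: all distributions are normal, the parameters are treated as known (in practice estimates would be plugged in), and the inner integral over $\epsilon$ uses Gauss-Hermite quadrature. The function name `predict_area_mean` and its arguments are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def expit(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def predict_area_mean(y, x, mu_tilde, k_i, beta, mu_x,
                      sigma2_b, sigma2_delta, sigma2_eps,
                      n_draws=5000, n_nodes=30, seed=0):
    """Monte Carlo approximation of the ratio-of-integrals predictor for one area."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Draws of (b, delta) from F_b and F_delta (normal assumed).
    b = rng.normal(0.0, np.sqrt(sigma2_b), n_draws)
    delta = rng.normal(0.0, np.sqrt(sigma2_delta), n_draws)
    # log prod_t f(y_it | x_it, b): Bernoulli log-likelihood with logistic mean.
    eta = beta[0] + beta[1] * x[None, :] + b[:, None]
    loglik_y = np.sum(y[None, :] * eta - np.logaddexp(0.0, eta), axis=1)
    # log prod_t f(x_it | delta) and log f(mu_tilde | delta), dropping constants
    # that do not depend on (b, delta) because they cancel in the ratio.
    loglik_x = -0.5 * np.sum((x[None, :] - mu_x - delta[:, None]) ** 2, axis=1) / sigma2_eps
    loglik_mu = -0.5 * k_i * (mu_tilde - mu_x - delta) ** 2 / sigma2_eps
    logw = loglik_y + loglik_x + loglik_mu
    w = np.exp(logw - logw.max())          # stabilized weights
    # Inner integral of g(mu_x + delta + eps, beta, b) over eps ~ N(0, sigma2_eps).
    z, wz = hermegauss(n_nodes)            # nodes and weights for exp(-z^2 / 2)
    wz = wz / wz.sum()                     # normalize to a probability measure
    eps = np.sqrt(sigma2_eps) * z
    eta_in = beta[0] + beta[1] * (mu_x + delta[:, None] + eps[None, :]) + b[:, None]
    h = expit(eta_in) @ wz
    # Weighted average: numerator over denominator of the predictor.
    return np.sum(w * h) / np.sum(w)
```

Sampling from the distributions of $b$ and $\delta$ is simple but can be inefficient when the likelihood is concentrated; quadrature over both variables is an alternative for this two-dimensional problem.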
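Prediction Mean Squared Error Estimators
Taylor approximation
$$
g(x_{ij}, \beta, b_i) \approx g(\hat\mu_{xi} + \epsilon_{ij}, \hat\beta, 0) + \bar h_{\mu_{xi}} (\mu_{xi} - \hat\mu_{xi}) + \bar h_\beta (\beta - \hat\beta) + \bar h_{b_i} b_i
$$
$$
\bar h_\psi = \int \frac{\partial g(\mu_{xi} + \epsilon, \beta, b_i)}{\partial \psi}\, dF_{\epsilon_i}(\epsilon), \qquad \psi \in \{\mu_{xi}, \beta, b_i\}
$$
Taylor Estimator of Prediction MSE
$$
\alpha_i^{Taylor} := \widehat{MSE}(\hat\theta_i - \theta_i) = \hat g_{1i}(\hat\sigma_b^2, \tilde\sigma_{ei}^2) + \hat g_{2i}(\hat\sigma_b^2, \tilde\sigma_{ei}^2) + 2 \hat g_{3i}(\hat\sigma_b^2, \tilde\sigma_{ei}^2),
$$
$$
\begin{aligned}
\hat g_{1i}(\hat\sigma_b^2, \tilde\sigma_{ei}^2) &= \tilde\gamma_i n_i^{-1} \tilde\sigma_{ei}^2, \\
\hat g_{2i}(\hat\sigma_b^2, \tilde\sigma_{ei}^2) &= (1 - \tilde\gamma_i)^2 \left[ \bar h_\beta' \hat V(\hat\beta) \bar h_\beta + \bar h_{\mu_{xi}}^2 \hat V(\hat\mu_{xi}) \right], \\
\hat g_{3i}(\hat\sigma_b^2, \tilde\sigma_{ei}^2) &= \bar u_{yi}^2 \hat V(\tilde\gamma_i), \\
\tilde\gamma_i &= \left( \bar h_{b_i}^2 \hat\sigma_b^2 + n_i^{-1} \tilde\sigma_{ei}^2 \right)^{-1} \bar h_{b_i}^2 \hat\sigma_b^2, \\
u_{y,ij} &= y_{ij} - g(x_{ij}, \hat\beta, 0), \qquad \bar u_{yi} = n_i^{-1} \sum_{j=1}^{n_i} u_{y,ij}.
\end{aligned}
$$
Fast Double Bootstrap (FDB) Algorithm
• Vector of parameters $\psi = (\beta, \sigma_b^2, \mu_x, \sigma_\delta^2, \sigma_\epsilon^2)$
• Data generator $DG(\psi, r)$, random number seed $r$
• $\alpha_i^* = (\hat\theta_i^* - \theta_i^*)^2$, $\alpha_i^{**} = (\hat\theta_i^{**} - \theta_i^{**})^2$
$$
\begin{array}{ll}
\text{LEVEL ONE} & \text{LEVEL TWO TELESCOPING} \\
DG(\hat\psi, r_{1,1}) \to (\psi_1^*, \theta_{i,1}^*, \alpha_{i,1}^*) & DG(\psi_1^*, r_{1,2}) \to (\theta_{i,1}^{**}, \alpha_{i,1}^{**}) \\
DG(\hat\psi, r_{1,2}) \to (\psi_2^*, \theta_{i,2}^*, \alpha_{i,2}^*) & DG(\psi_2^*, r_{1,3}) \to (\theta_{i,2}^{**}, \alpha_{i,2}^{**}) \\
\quad \vdots & \quad \vdots \\
DG(\hat\psi, r_{1,B_1}) \to (\psi_{B_1}^*, \theta_{i,B_1}^*, \alpha_{i,B_1}^*) & DG(\psi_{B_1}^*, r_{1,B_1+1}) \to (\theta_{i,B_1+1}^{**}, \alpha_{i,B_1+1}^{**})
\end{array}
$$
$$
\hat\alpha_i^* = B_1^{-1} \sum_{k=1}^{B_1} \alpha_{i,k}^*, \qquad \hat\alpha_i^{**} = B_1^{-1} \sum_{k=1}^{B_1} \left( \alpha_{i,k}^* + \alpha_{i,k+1}^* - \alpha_{i,k}^{**} \right)
$$
Confidence Intervals (CIs)
Pivot-type Statistics
$$
\hat T_i := \frac{\hat\theta_i - \theta_i}{\sqrt{\alpha_i^{Taylor}}}, \qquad \hat T_{i,k}^* = \frac{\hat\theta_{i,k}^* - \theta_{i,k}^*}{\sqrt{\alpha_{i,k}^{*,Taylor}}}, \qquad \hat T_{i,k}^{**} = \frac{\hat\theta_{i,k}^{**} - \theta_{i,k}^{**}}{\sqrt{\alpha_{i,k}^{**,Taylor}}}
$$
• $k = 1, ..., B_1$
• $\alpha_{i,k}^{*,Taylor} = \widehat{MSE}(\hat\theta_i^* - \theta_i^*)$, $\alpha_{i,k}^{**,Taylor} = \widehat{MSE}(\hat\theta_i^{**} - \theta_i^{**})$
Wald-type (1 − α)% CI
$$
I_z = \hat\theta_i \pm \hat\zeta_{i,\alpha,z}\, \widehat{se}_i
$$
• $\hat\zeta_{i,\alpha,z} = \Phi^{-1}([1 - \alpha/2])$
• $\widehat{se}_i \in \left\{ \sqrt{\alpha_i^{Taylor}}, \sqrt{\hat\alpha_i^*}, \sqrt{\hat\alpha_i^{**}} \right\}$
95% Bootstrap CI
• $q_{i,B} := |T^*|_{i,([(1-\alpha)B_1]+1)}$
• $q_{i,DB} := |T^*|_{i,([(1-\alpha_B)B_1]+1)}$
• $\alpha_B = 1 - B_1^{-1} \sum_{k=1}^{B_1} I(|T_{i,k}^{**}| < q_{i,B})$
Symmetric Two-sided General (1 − α)% CI
$$
I = \hat\theta_i \pm q_{t,\widehat{df}_i,1-\alpha/2}\, \widehat{se}_i
$$
• $(\hat\tau_i, \widehat{df}_i) = \operatorname{argmin} Q_i(q_i, \tau_i, df_i)$
• $\widehat{se}_i = \hat\tau_i \sqrt{\alpha_i^{Taylor}}$
• $Q_i(q_i, \tau_i, df_i) := (q_i - \tau_i q_{t,df_i})\, V_{q,i}^{-1}\, (q_i - \tau_i q_{t,df_i})'$
• $q_i \in \{q_{i,B}, q_{i,DB}\}$, $q_{t,df_i} := F_{t_{df_i}}^{-1}([1 - \alpha/2])$
• $V_{q,i}$ based on Bahadur representation
Simulation Results
• $\beta = (-0.8, 1)$, $\mu_x = 0$, $k_i = 10$
• $\sigma_b^2 = 0.25$, $\sigma_\delta^2 = 0.16$, $\sigma_\epsilon^2 = 0.36$
• Normal distributions $f_b$, $f_\delta$, $f_\epsilon$, $f_{u_i}$
• $m = 36$, $n_i \in \{2, 10, 40\}$
• 400 Monte Carlo samples, $B_1 = 400$, $B_2 = 1$
Empirical Coverages (%) for 95% Wald-type CIs
$n_i$    $I_{\alpha_i^{Taylor}}$    $I_{\hat\alpha_i^*}$    $I_{\hat\alpha_i^{**}}$
2        90.6                       89.7                    88.3
10       92.2                       90.3                    89.1
40       94.1                       91.4                    89.7
Empirical Coverages (%) for General Bootstrap CIs
         Level 1                    Level 2
$n_i$    90%     95%     99%        90%     95%     99%
2        90.0    94.8    97.8       90.4    94.5    97.4
10       89.6    94.4    97.9       90.5    94.5    97.5
40       89.0    94.1    98.4       89.9    94.5    98.2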
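To make the bootstrap cutoff definitions $q_{i,B}$, $\alpha_B$ and $q_{i,DB}$ concrete, here is a minimal sketch for a single area, given arrays of level-one and level-two pivot statistics. The function names are hypothetical, and capping the order-statistic rank at $B_1$ (needed when $[(1-\alpha_B)B_1]+1$ exceeds $B_1$) is an assumption of this example rather than something the poster specifies.

```python
import numpy as np

def order_stat_cutoff(abs_t_star, level):
    """Return the ([level * B1] + 1)-th order statistic of |T*|.

    The rank is capped at B1 (an assumption made for this sketch).
    """
    s = np.sort(np.asarray(abs_t_star, dtype=float))
    b1 = s.size
    # Integer part, with a small tolerance against floating-point error.
    rank = int(np.floor(level * b1 + 1e-9)) + 1
    rank = min(rank, b1)
    return s[rank - 1]

def bootstrap_cutoffs(t_star, t_2star, alpha=0.05):
    """Compute (q_B, alpha_B, q_DB) for one area.

    t_star  : level-one statistics T*_{i,k},  k = 1, ..., B1
    t_2star : level-two statistics T**_{i,k}, k = 1, ..., B1
    """
    abs1 = np.abs(np.asarray(t_star, dtype=float))
    abs2 = np.abs(np.asarray(t_2star, dtype=float))
    q_b = order_stat_cutoff(abs1, 1.0 - alpha)
    # Share of level-two statistics at or above the level-one cutoff.
    alpha_b = 1.0 - np.mean(abs2 < q_b)
    # Double-bootstrap cutoff: same order-statistic rule at the adjusted level.
    q_db = order_stat_cutoff(abs1, 1.0 - alpha_b)
    return q_b, alpha_b, q_db
```

With $B_1 = 400$ and $\alpha = 0.05$, the level-one cutoff is the 381st smallest value of $|T^*_{i,k}|$.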
Acknowledgement
I would like to acknowledge the U.S. National Science Foundation for a travel award to attend this meeting.
Summary
• Wald-type CIs undercover
• Bootstrap CIs perform well
• FDB does not improve the coverage accuracy