Benchmarked Small Area Prediction September 17, 2012 Iowa State University

Benchmarked Small Area Prediction
Emily J. Berg, Wayne A. Fuller, Andreea L. Erciulescu
Iowa State University
Center for Survey Statistics and Methodology
Statistics Department
September 17, 2012
Andreea L. Erciulescu (Iowa State University)
September 17, 2012
1 / 35
Introduction
Small area
• A population for which reliable statistics of interest cannot be
produced due to certain limitations of the available data
Examples
• Geographical regions: state, county, municipality
• Demographic groups: age, sex, race
• Demographic groups within a geographical region
What?
• Estimation related to population counts
• Disease mapping research, a long-standing application
• Regional planning: apportionment of congressional seats and allocation of
government funds
- U.S. $130 billion in federal funds per year
Introduction - WHO?
• Small Area Income and Poverty Estimates (SAIPE) by Census Bureau
- income and poverty measures for various population subgroups for
states, counties, and school districts
• Local Area Unemployment Statistics (LAUS) by Bureau of Labor
Statistics
- employment and unemployment for states, metropolitan areas,
counties and certain sub-county areas
• County Estimates by National Agricultural Statistics Service
- county estimates of crop yield
• Substance Abuse and Mental Health Services Administration
- substance abuse in states and metropolitan areas
Introduction: Small area estimation
• Reliable estimates of one or several variables of interest in areas where
the information available is not sufficient
• Collection: survey in some or all areas
Problems:
• Sampling designs aim to provide reliable data for large areas and pay
little or no attention to the small areas of interest
• Fixed budget or practical constraints
• Measurement errors in administrative data (present even when there is no
sampling error)
• Underreporting in small area statistics, e.g. law enforcement crime
records
• Poor-quality data due to nonresponse or hard-to-find populations, e.g.
census-compiled subgroup populations
Introduction: Small area estimation approaches
• Direct estimators - based on local data only; adequate when the area
sample size is large
• “Borrow strength” from other areas - to deal with small sample sizes
- data from ‘similar’ areas
- data from previous occasions
• Models
- share information between areas and account for correlations
- link survey outcome or response variables to a set of predictor
variables known for the small areas
Outline
1. Small area prediction models
2. The benchmarking restriction
3. Augmented models: linear and nonlinear
4. Simulation study
Part I: Linear small area model
\[ y_i = \theta_i + e_i \]
\[ \theta_i = x_i'\beta + u_i, \quad i = 1, 2, \ldots, m \]
• m is the number of small areas
• θi is the small area mean
• xi are known fixed vectors
• ui is the random small area effect
• ei is the sampling error
• assume
\[ \begin{pmatrix} e_i \\ u_i \end{pmatrix} \sim \mathrm{IND}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{ei}^2 & 0 \\ 0 & \sigma_{ui}^2 \end{pmatrix} \right). \]
Linear small area model: prediction
Assume we observe a = u + e and that
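\[ \begin{pmatrix} u \\ e \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} G & 0 \\ 0 & R \end{pmatrix} \right). \]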
Question: Given the data a, what is our best guess (predictor) for the
unobserved u?
Linear small area model: prediction
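Note that
\[ \begin{pmatrix} a \\ u \end{pmatrix} = \begin{pmatrix} I & I \\ I & 0 \end{pmatrix} \begin{pmatrix} u \\ e \end{pmatrix}. \]
Thus,
\[ \begin{pmatrix} a \\ u \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} I & I \\ I & 0 \end{pmatrix} \begin{pmatrix} G & 0 \\ 0 & R \end{pmatrix} \begin{pmatrix} I & I \\ I & 0 \end{pmatrix} \right) \overset{d}{=} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} G + R & G \\ G & G \end{pmatrix} \right). \]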
Linear small area model: prediction
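\( E(u \mid a) = G(G + R)^{-1} a \) is the BLUP of u.
• If \( G = \mathrm{diag}(\sigma_u^2) \) and \( R = \mathrm{diag}(\sigma_e^2) \), then the BLUP is
\[ \hat{u}_i = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}\, a_i \]
• Usually, the matrices G and R are unknown and we replace them by
estimates \( \hat{G} \) and \( \hat{R} \), or \( \hat\sigma_u^2 \) and \( \hat\sigma_e^2 \), respectively.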
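A minimal sketch of the diagonal case, assuming the variances are known and common to all areas. The function name `blup_diagonal` and the numbers are hypothetical and only illustrate the shrinkage factor σ_u²/(σ_u² + σ_e²).

```python
def blup_diagonal(a, sigma2_u, sigma2_e):
    """BLUP of u_i when G = diag(sigma2_u) and R = diag(sigma2_e)."""
    gamma = sigma2_u / (sigma2_u + sigma2_e)  # shrinkage factor in [0, 1]
    return [gamma * a_i for a_i in a]


# Hypothetical observations and variances, chosen only for illustration.
a = [2.0, -1.0, 0.5]
u_hat = blup_diagonal(a, sigma2_u=1.0, sigma2_e=3.0)
print(u_hat)  # [0.5, -0.25, 0.125]
```

The larger the sampling variance relative to the area-effect variance, the more each observation is pulled toward zero.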
Linear small area model: prediction
The BLUP û satisfies:
• û is a linear function of a
• û is unbiased for u so that E (û − u) = 0
• Var(û − u) is no “larger” than Var(v − u), where v is any other
linear unbiased predictor
Linear small area model: prediction
Small area information:
\( \eta(x_i, \beta) = x_i'\beta \), for linear fixed area effects
\( \eta(x_i, \beta) = g(x_i, \beta) \), for nonlinear fixed area effects
In this situation we often make predictions for quantities like η(xi , β) + u,
and the predictor of such quantity is η(xi , β̂) + û, the estimator of η(xi , β)
plus the predictor of u.
Linear small area model: prediction
The best linear unbiased predictor (BLUP) of θi is
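\[ \hat\theta_i = x_i'\hat\beta + \gamma_i (y_i - x_i'\hat\beta), \]
where
\[ \hat\beta = \Big[ \sum_{i=1}^{m} x_i (\sigma_{ui}^2 + \sigma_{ei}^2)^{-1} x_i' \Big]^{-1} \sum_{i=1}^{m} x_i (\sigma_{ui}^2 + \sigma_{ei}^2)^{-1} y_i = (X'\Sigma_{aa}^{-1}X)^{-1} X'\Sigma_{aa}^{-1} y \]
• \( \Sigma_{aa} \) is a diagonal matrix with \( \sigma_{ui}^2 + \sigma_{ei}^2 \) as the ith diagonal element
• X is the m × k matrix with \( x_i' \) as the ith row
• \( \gamma_i = (\sigma_{ui}^2 + \sigma_{ei}^2)^{-1} \sigma_{ui}^2 \)
The variance of the prediction error is
\[ V\{\hat\theta_i - \theta_i\} = \gamma_i \sigma_{ei}^2 + (1 - \gamma_i)^2\, x_i' V\{\hat\beta\} x_i. \]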
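The formulas above can be sketched for the special case x_i = (1, x_i)', so that the 2 × 2 system can be solved in closed form. This is an illustration under the assumption that σ_ui² and σ_ei² are known, in which case V{β̂} = (X'Σ_aa⁻¹X)⁻¹. The function name `blup_theta` and its arguments are hypothetical.

```python
def blup_theta(y, x, s2u, s2e):
    """BLUP of theta_i for x_i = (1, x_i)' with known variances.

    Returns (beta_hat, theta_hat, pred_var).
    """
    w = [1.0 / (su + se) for su, se in zip(s2u, s2e)]  # diagonal of Sigma_aa^{-1}
    # Elements of X' Sigma_aa^{-1} X and X' Sigma_aa^{-1} y for a 2-column X.
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * s2 - s1 * s1
    # V{beta_hat} = (X' Sigma_aa^{-1} X)^{-1}, written out for the 2x2 case.
    v00, v01, v11 = s2 / det, -s1 / det, s0 / det
    b0 = v00 * t0 + v01 * t1
    b1 = v01 * t0 + v11 * t1
    theta_hat, pred_var = [], []
    for yi, xi, su, se in zip(y, x, s2u, s2e):
        gamma = su / (su + se)
        fit = b0 + b1 * xi
        theta_hat.append(fit + gamma * (yi - fit))
        quad = v00 + 2.0 * xi * v01 + xi * xi * v11  # x_i' V{beta_hat} x_i
        pred_var.append(gamma * se + (1.0 - gamma) ** 2 * quad)
    return (b0, b1), theta_hat, pred_var
```

In practice the variances are replaced by estimates, and the prediction variance then needs an additional term for that estimation, which this sketch does not include.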
Benchmarking
Survey weights
• each weight is associated with a respondent
• the weight is the number of population units that the sampled unit
represents
• weights compensate for unequal probabilities of selection ⇒ reduce
selection bias
• calibration
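As a small illustration of how weights offset unequal selection probabilities, the sketch below assumes each weight is the inverse of a known inclusion probability. That choice, the function name `weighted_mean`, and the numbers are assumptions for illustration and are not taken from the slides.

```python
def weighted_mean(values, incl_prob):
    """Weighted mean with weights equal to 1 / inclusion probability."""
    weights = [1.0 / p for p in incl_prob]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical sample: the second unit was selected with low probability,
# so it stands for more population units and receives a larger weight.
values = [10.0, 20.0]
incl_prob = [0.5, 0.1]
estimate = weighted_mean(values, incl_prob)
```

Here the weights are 2 and 10, so the weighted mean is pulled toward the second unit, unlike the unweighted mean of 15.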
Linear small area model: benchmarking restriction
• benchmarked estimator - reproduces the large area weighted estimator
when aggregated:
\[ \sum_{i=1}^{m} \omega_i \hat\theta_i = \sum_{i=1}^{m} \omega_i y_i, \]
where \( \omega_i \) are vectors of fixed coefficients.
Linear small area model: benchmarking prediction
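• No uniformly best benchmarked predictor
• Introduce an objective function
Minimize
\[ Q(\hat\theta^{\omega}) = \sum_{i=1}^{m} \phi_i E(\hat\theta_i^{\omega} - y_i)^2 \]
subject to
\[ \sum_{i=1}^{m} \omega_i \hat\theta_i^{\omega} = \sum_{i=1}^{m} \omega_i y_i. \]
Choices for \( \phi_i^{-1} \):
\[ \phi_i^{-1} = (\sigma_{ui}^2 + \sigma_{ei}^2)^{-1} \sigma_{ui}^2 \sigma_{ei}^2 \;\leftrightarrow\; \text{BHF} \]
\[ \phi_i^{-1} = \omega_i\, \mathrm{cov}\Big(\hat\theta_i, \sum_{j=1}^{m} \omega_j \hat\theta_j\Big) \;\leftrightarrow\; \text{PB} \]
\[ \phi_i^{-1} = (\sigma_{ui}^2 + \sigma_{ei}^2) \;\leftrightarrow\; \text{ITF} \]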
Augmented linear small area model: benchmarked predictor
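The linear predictor that minimizes \( Q(\hat\theta^{\omega}) \) is
\[ \hat\theta_i^{\omega} = \hat\theta_i + \phi_i^{-1} \omega_i' \hat\beta_{aug}, \]
where
\[ \hat\beta_{aug} = \Big( \sum_{j=1}^{m} \phi_j^{-1} \omega_j \omega_j' \Big)^{-1} \Big[ \sum_{j=1}^{m} \omega_j (y_j - \hat\theta_j) \Big], \qquad y_j - \hat\theta_j = (1 - \gamma_j)(y_j - x_j'\hat\beta). \]
\( \hat\beta_{aug} \) is a GLS coefficient, where
• \( \phi_j^{-1} \omega_j \) ≈ explanatory variable
• \( \phi_j \) ≈ weight
• \( y_j - \hat\theta_j \) ≈ dependent variable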
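A short sketch of this adjustment for the case of a single restriction, so that each ω_i is a scalar. The scalar restriction, the function name `benchmark`, and the inputs are assumptions for illustration. The adjustment spreads the gap Σ ω_j (y_j − θ̂_j) across areas in proportion to φ_i⁻¹ ω_i.

```python
def benchmark(theta_hat, y, omega, phi_inv):
    """Return benchmarked predictions and beta_aug for scalar omega_i."""
    # Gap between the weighted direct estimates and the weighted predictions.
    gap = sum(w * (yi - ti) for w, yi, ti in zip(omega, y, theta_hat))
    # Scalar version of sum_j phi_j^{-1} omega_j omega_j'.
    scale = sum(p * w * w for p, w in zip(phi_inv, omega))
    beta_aug = gap / scale
    adjusted = [ti + p * w * beta_aug
                for ti, p, w in zip(theta_hat, phi_inv, omega)]
    return adjusted, beta_aug


# Hypothetical inputs, chosen only for illustration.
theta_hat = [0.22, 0.48, 0.71]
y = [0.25, 0.40, 0.80]
omega = [0.2, 0.3, 0.5]
phi_inv = [1.0, 2.0, 4.0]
theta_bench, beta_aug = benchmark(theta_hat, y, omega, phi_inv)
```

After the adjustment, Σ ω_i θ̂_i^ω equals Σ ω_i y_i up to rounding, whatever positive φ_i⁻¹ is supplied.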
Augmented linear small area model
Simple ratio adjustment predictor:
\[ \hat\theta_{ratio,i}^{\omega} = \hat\theta_i + \hat\theta_i \hat\beta_{aug}, \]
where \( \phi_i^{-1} = \hat\theta_i \omega_i^{-1} \) and
\[ \hat\beta_{aug} = \Big( \sum_{i=1}^{m} \hat\theta_i \omega_i \Big)^{-1} \sum_{i=1}^{m} \omega_i (y_i - \hat\theta_i). \]
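The ratio adjustment can be checked numerically: since 1 + β̂_aug = Σ ω_i y_i / Σ ω_i θ̂_i, every prediction is multiplied by the same factor. The sketch below assumes scalar ω_i and uses hypothetical inputs; `ratio_benchmark` is an illustrative name.

```python
def ratio_benchmark(theta_hat, y, omega):
    """Ratio-adjusted predictions theta_hat_i * (1 + beta_aug), scalar omega_i."""
    num = sum(w * (yi - ti) for w, yi, ti in zip(omega, y, theta_hat))
    den = sum(w * ti for w, ti in zip(omega, theta_hat))
    beta_aug = num / den
    return [ti + ti * beta_aug for ti in theta_hat]


# Hypothetical inputs, chosen only for illustration.
theta_hat = [0.22, 0.48, 0.71]
y = [0.25, 0.40, 0.80]
omega = [0.2, 0.3, 0.5]
theta_ratio = ratio_benchmark(theta_hat, y, omega)
```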
Augmented linear small area model: benchmarked predictor
Replace the original model with the augmented model
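\[ y_i = x_i'\beta + z_i'\beta_{aug} + u_i + e_i \]
The predictor can now be written, using \( (1 - \gamma_i) z_i = \phi_i^{-1} \omega_i \):
\[ \begin{aligned} \hat\theta_i^{\omega} &= x_i'\hat\beta + z_i'\hat\beta_{aug} + \gamma_i (y_i - x_i'\hat\beta - z_i'\hat\beta_{aug}) \\ &= x_i'\hat\beta + \phi_i^{-1} \omega_i' \hat\beta_{aug} + \gamma_i (y_i - x_i'\hat\beta) = \hat\theta_i + \phi_i^{-1} \omega_i' \hat\beta_{aug}. \end{aligned} \]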
Augmented linear small area model
The estimated GLS coefficient is
\[ \hat\beta_{aug} = \Big( \sum_{i=1}^{m} z_i \psi_i^{-1} z_i' \Big)^{-1} \sum_{i=1}^{m} z_i \psi_i^{-1} (y_i - x_i'\hat\beta) \]
• \( z_i = \psi_i (1 - \gamma_i) \omega_i \) ≈ new explanatory variable
• \( \psi_i = \phi_i^{-1} (1 - \gamma_i)^{-2} \) ≈ new weight
• \( y_i - x_i'\hat\beta \) ≈ dependent variable
• \( \hat\beta_{aug} \) is not the usual regression coefficient because \( z_i \) arises from a
constraint that is not part of the original model.
Augmented linear small area model
\[ \hat\beta_{aug} = (Z'\Psi^{-1}Z)^{-1} Z'\Psi^{-1} \hat{a} = (W'\Phi^{-1}W)^{-1} W'(I - \Gamma)\hat{a}, \]
where
• \( \Psi \) is a diagonal matrix with \( \psi_i \) as the ith diagonal element
• Z is the m × r matrix with \( z_i' \) as the ith row
• the ith element of \( \hat{a} \) is \( \hat{a}_i = y_i - x_i'\hat\beta \)
• W is the matrix with \( \omega_i' \) as the ith row
• \( \Phi \) is a diagonal matrix with \( \phi_i \) as the ith diagonal element
• \( \Gamma \) is a diagonal matrix with \( \gamma_i \) as the ith diagonal element.
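The equality of the two expressions can be verified numerically. The sketch below assumes a single restriction (r = 1, scalar ω_i) and hypothetical inputs. It builds z_i and ψ_i from φ_i⁻¹, γ_i and ω_i and evaluates both forms.

```python
def beta_aug_two_ways(a_hat, omega, gamma, phi_inv):
    """Evaluate both expressions for beta_aug with scalar omega_i."""
    psi = [p / (1.0 - g) ** 2 for p, g in zip(phi_inv, gamma)]
    z = [ps * (1.0 - g) * w for ps, g, w in zip(psi, gamma, omega)]
    # (Z' Psi^{-1} Z)^{-1} Z' Psi^{-1} a_hat
    form_z = (sum(zi * ai / ps for zi, ai, ps in zip(z, a_hat, psi))
              / sum(zi * zi / ps for zi, ps in zip(z, psi)))
    # (W' Phi^{-1} W)^{-1} W' (I - Gamma) a_hat
    form_w = (sum(w * (1.0 - g) * ai for w, g, ai in zip(omega, gamma, a_hat))
              / sum(p * w * w for p, w in zip(phi_inv, omega)))
    return form_z, form_w


# Hypothetical residuals, coefficients, and shrinkage factors.
form_z, form_w = beta_aug_two_ways(
    a_hat=[0.05, -0.10, 0.08],
    omega=[0.2, 0.3, 0.5],
    gamma=[0.3, 0.5, 0.8],
    phi_inv=[1.0, 2.0, 4.0],
)
```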
Augmented linear small area model: prediction error
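The prediction error is
\[ \hat\theta_i^{\omega} - \theta_i = \hat\theta_i - \theta_i + \phi_i^{-1} \omega_i' \hat\beta_{aug}, \]
where
\[ \hat\beta - \beta = (X'\Sigma_{aa}^{-1}X)^{-1} X'\Sigma_{aa}^{-1} a \]
\[ \hat\beta_{aug} = (W'\Phi^{-1}W)^{-1} W'(I - \Gamma)\big[ (I - X(X'\Sigma_{aa}^{-1}X)^{-1} X'\Sigma_{aa}^{-1})\, a \big] \]
• \( \hat\beta \) and \( \hat\beta_{aug} \) are uncorrelated
• \( V\{\hat\theta_i^{\omega} - \theta_i\} = V\{\hat\theta_i - \theta_i\} + \phi_i^{-2}\, \omega_i' V\{\hat\beta_{aug}\}\, \omega_i \)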
Part II: Nonlinear small area model
yi = θi + ei
θi = g (xi , β) + ui , for i = 1, 2, .., m
• m is the number of small areas
• θi is the small area mean
• xi are known fixed vectors
• g(xi, β) is continuous in β with two continuous derivatives
• ui is the random small area effect
• ei is the sampling error
• assume
\[ \begin{pmatrix} e_i \\ u_i \end{pmatrix} \sim \mathrm{IND}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{ei}^2 & 0 \\ 0 & \sigma_{ui}^2 \end{pmatrix} \right). \]
Nonlinear small area model
The GLS estimator of β satisfies the equation
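\[ \sum_{i=1}^{m} \frac{\partial g(x_i, \hat\beta)}{\partial \beta} (\sigma_{ui}^2 + \sigma_{ei}^2)^{-1} [y_i - g(x_i, \hat\beta)] = 0 \]
By standard approximation methods,
\[ \hat\beta - \beta = (H'\Sigma_{aa}^{-1}H)^{-1} H'\Sigma_{aa}^{-1} a + o_p(m^{-0.5}) := Ma + o_p(m^{-0.5}), \]
where
• H is the matrix with \( h_i' \) as the ith row, where \( h_i = \partial g(x_i, \beta)/\partial \beta \)
• \( \Sigma_{aa} \) is a diagonal matrix with \( \sigma_{ui}^2 + \sigma_{ei}^2 \) as the ith diagonal element
• a is a vector with \( y_i - g(x_i, \beta) \) as the ith element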
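One common way to solve the estimating equation above is Gauss-Newton iteration. The slides do not say which algorithm was used, so the sketch below is an assumption. It takes g to be the two-parameter logistic function that appears in the simulation section and treats the variances σ_ui² + σ_ei² (argument `var_a`) as known; the function names are hypothetical.

```python
import math


def g(x, beta):
    """Logistic mean function with beta = (beta0, beta1)."""
    return 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))


def gauss_newton_gls(y, x, var_a, beta=(0.0, 0.0), n_iter=50):
    """Solve sum_i h_i var_a_i^{-1} [y_i - g(x_i, beta)] = 0 by Gauss-Newton."""
    b0, b1 = beta
    for _ in range(n_iter):
        s00 = s01 = s11 = t0 = t1 = 0.0
        for yi, xi, vi in zip(y, x, var_a):
            gi = g(xi, (b0, b1))
            d = gi * (1.0 - gi)      # dg/d(beta0); dg/d(beta1) = d * x_i
            r = yi - gi              # current residual
            s00 += d * d / vi        # elements of H' Sigma_aa^{-1} H
            s01 += d * d * xi / vi
            s11 += d * d * xi * xi / vi
            t0 += d * r / vi         # elements of H' Sigma_aa^{-1} r
            t1 += d * xi * r / vi
        det = s00 * s11 - s01 * s01
        # Update beta by solving the 2x2 linearized GLS system.
        b0 += (s11 * t0 - s01 * t1) / det
        b1 += (s00 * t1 - s01 * t0) / det
    return b0, b1
```

Each step is a linear GLS fit of the current residuals on the derivative rows h_i', which is the same linearization that leads to the approximation β̂ − β ≈ Ma. A production version would add a convergence check and step control.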
Augmented nonlinear small area model: benchmarking
The original model can be replaced with the augmented model
\[ y_i = g((x_i', z_i'), (\beta, \beta_{aug})) + u_i + e_i \]
The benchmarking restriction
\[ \sum_{i=1}^{m} \omega_i \hat\theta_i^{\omega} = \sum_{i=1}^{m} \omega_i y_i \;\Leftrightarrow\; \sum_{i=1}^{m} \omega_i (1 - \gamma_i) [y_i - g((x_i', z_i'), (\hat\beta, \hat\beta_{aug}))] = 0 \]
The benchmarked predictor is
\[ \hat\theta_i^{\omega} = g((x_i', z_i'), (\hat\beta, \hat\beta_{aug})) + \gamma_i [y_i - g((x_i', z_i'), (\hat\beta, \hat\beta_{aug}))] \]
Augmented nonlinear small area model: benchmarking
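The estimated coefficient \( \hat\beta_{aug} \) satisfies
\[ \sum_{i=1}^{m} h_{aug,i}' [y_i - g((x_i', z_i'), (\hat\beta, \hat\beta_{aug}))] = 0, \]
where
\[ h_{aug,i}' = \frac{\partial g((x_i', z_i'), (\beta, \beta_{aug}))}{\partial \beta_{aug}'}, \]
and \( z_i \) is chosen such that
\[ \omega_i (1 - \gamma_i) \psi_i = \frac{\partial g((x_i', z_i'), (\hat\beta, \hat\beta_{aug}))}{\partial \beta_{aug}}. \]
The benchmarking restriction
\[ \sum_{i=1}^{m} \omega_i (1 - \gamma_i) [y_i - g((x_i', z_i'), (\hat\beta, \hat\beta_{aug}))] = 0 \]
is satisfied.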
Augmented nonlinear small area model: prediction error
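The prediction error is
\[ \hat\theta_i^{\omega} - \theta_i = \hat\theta_i - \theta_i + h_{aug,i}' \hat\beta_{aug} + o_p(m^{-0.5}), \]
where
\[ \hat\beta - \beta = Ma + o_p(m^{-0.5}) \]
\[ \hat\beta_{aug} = (H_{aug}' \Psi^{-1} H_{aug})^{-1} H_{aug}' \Psi^{-1} (I - HM)\, a + o_p(m^{-0.5}), \]
• the ith row of \( H_{aug} \) is \( h_{aug,i}' = \omega_i' (1 - \gamma_i) \psi_i \)
• \( \hat\beta_{aug} \) and \( \hat\beta \) are approximately uncorrelated
• \( V\{\hat\theta_i^{\omega} - \theta_i\} = V\{\hat\theta_i - \theta_i\} + h_{aug,i}' V\{\hat\beta_{aug}\} h_{aug,i} + o(m^{-1}) \)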
Part III: Simulation Model
Model (developed for Canadian Labour Force Survey)
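\[ \hat{p}_i = \theta_i + e_i, \qquad \theta_i = g(x_i, \beta) + u_i \]
\[ g(x_i, \beta) = \frac{\exp(\beta_0 + \beta_1 x_i)}{1 + \exp(\beta_0 + \beta_1 x_i)} \]
\[ x_i = \log[p_{C,i} (1 - p_{C,i})^{-1}], \qquad p_{C,i} = \text{Census proportion} \]
• \( n_i \) are the sample sizes of small areas
• \( \sigma_{ui}^2 = \alpha\, g(x_i, \beta)(1 - g(x_i, \beta)) \)
• \( \sigma_{ei}^2 \) is the variance of a 2-stage cluster sample of size \( n_i \)
• assume
\[ \begin{pmatrix} e_i \\ u_i \end{pmatrix} \sim \mathrm{IND}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} n_i^{-1} \sigma_{ei}^2 & 0 \\ 0 & \sigma_{ui}^2 \end{pmatrix} \right). \]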
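A small sketch of the mean and variance functions of this model. Because x_i is the logit of the Census proportion, the special case β0 = 0, β1 = 1 returns the Census proportion itself. The values of `alpha` and the proportions below are hypothetical, as are the function names.

```python
import math


def logit(p):
    """x_i = log[p / (1 - p)]."""
    return math.log(p / (1.0 - p))


def g(x, beta0, beta1):
    """Logistic mean function g(x_i, beta)."""
    eta = beta0 + beta1 * x
    return math.exp(eta) / (1.0 + math.exp(eta))


def sigma2_u(x, beta0, beta1, alpha):
    """Area-effect variance alpha * g * (1 - g)."""
    gi = g(x, beta0, beta1)
    return alpha * gi * (1.0 - gi)


# With beta = (0, 1), g maps the logit back to the Census proportion.
p_census = 0.2                     # hypothetical Census proportion
g_value = g(logit(p_census), 0.0, 1.0)
```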
Predictors for the Simulation
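\( \hat\beta \) and \( \hat\sigma_{ui}^2 \) are estimated iteratively
• \( \hat\beta \) minimizes
\[ Q(\beta) = \sum_{i=1}^{m} (\hat{p}_i - g(x_i, \beta))^2 (n_i^{-1} \hat\sigma_{ei}^2 + \hat\sigma_{ui}^2)^{-1} \]
• Estimated GLS for \( \hat\sigma_{ui}^2 \)
The predictor of \( \theta_i \) is
\[ \hat\theta_i = g(x_i, \hat\beta) + \hat\gamma_i [\hat{p}_i - g(x_i, \hat\beta)], \]
where
\[ \hat\gamma_i = (n_i^{-1} \hat\sigma_{ei}^2 + \hat\sigma_{ui}^2)^{-1} \hat\sigma_{ui}^2. \]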
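Given fitted values g(x_i, β̂) and variance estimates, the predictor is a per-area shrinkage of the direct estimate toward the model fit. The sketch below assumes those quantities are already available; the function name `predict_theta` and the numbers are illustrative only.

```python
def predict_theta(p_hat, g_hat, n, s2e, s2u):
    """Return (theta_hat, gamma_hat) for each area."""
    theta_hat, gamma_hat = [], []
    for pi, gi, ni, se, su in zip(p_hat, g_hat, n, s2e, s2u):
        gamma = su / (se / ni + su)   # weight on the direct estimate
        gamma_hat.append(gamma)
        theta_hat.append(gi + gamma * (pi - gi))
    return theta_hat, gamma_hat


# Hypothetical area: the sampling variance 0.16 / 16 equals the area-effect
# variance 0.01, so the predictor lies about halfway between p_hat and the fit.
theta_hat, gamma_hat = predict_theta(
    p_hat=[0.30], g_hat=[0.20], n=[16], s2e=[0.16], s2u=[0.01]
)
```

As n_i grows, γ̂_i approaches 1 and the predictor moves toward the direct estimate p̂_i.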
Benchmarked Predictors for the Simulation
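Restrictions: preserve direct estimators
\[ \sum_{i=1}^{10} \hat\theta_i^{\omega} \omega_i = \sum_{i=1}^{10} \hat{p}_i \omega_i \]
Benchmarking methods
• Ratio adjustment (raking): \( \psi_i = (1 - \hat\gamma_i)^{-2} \omega_i^{-1} \hat\theta_i \)
• Augmented model (BHF): \( \psi_i = (1 - \hat\gamma_i)^{-2} [n_i^{-1} \hat\sigma_{ei}^2 + \hat\sigma_{ui}^2]^{-1} n_i^{-1} \hat\sigma_{ei}^2 \hat\sigma_{ui}^2 \)
• Augmented model (\( \omega^{-1} \)): \( \psi_i = (1 - \hat\gamma_i)^{-2} \omega_i^{-1} \)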
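The three choices differ only in φ_i⁻¹, since ψ_i = φ_i⁻¹ (1 − γ̂_i)⁻². A sketch of that mapping follows; the method labels `"raking"`, `"bhf"`, `"omega_inv"` and the function name `psi` are hypothetical, and the inputs are illustrative.

```python
def psi(method, omega, theta_hat, n, s2e, s2u):
    """psi_i = phi_i^{-1} * (1 - gamma_i)^{-2} for the three methods."""
    out = []
    for w, th, ni, se, su in zip(omega, theta_hat, n, s2e, s2u):
        v_e = se / ni                  # sampling variance n_i^{-1} sigma_ei^2
        gamma = su / (v_e + su)
        if method == "raking":
            phi_inv = th / w
        elif method == "bhf":
            phi_inv = v_e * su / (v_e + su)
        elif method == "omega_inv":
            phi_inv = 1.0 / w
        else:
            raise ValueError(method)
        out.append(phi_inv / (1.0 - gamma) ** 2)
    return out


# One hypothetical area with gamma = 0.5.
args = dict(omega=[0.25], theta_hat=[0.2], n=[16], s2e=[0.16], s2u=[0.01])
psi_raking = psi("raking", **args)
psi_bhf = psi("bhf", **args)
psi_omega = psi("omega_inv", **args)
```

For the BHF choice, φ_i⁻¹ equals γ̂_i n_i⁻¹ σ̂_ei², the leading term of the prediction variance, so areas with larger prediction variance absorb more of the adjustment.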
Simulation Parameters
Parameters
Area   gi     Sample Size
1      0.20   16
2      0.75   16
3      0.50   16
4      0.20   30
5      0.75   30
6      0.75   60
7      0.50   60
8      0.20   204
9      0.75   204
10     0.50   204
• Two simulation sets
• Set 1: ωi increases as ni increases
• Set 2: ωi decreases as ni increases
Simulation Results: MC MSE
• Simulation set 1: ωi increases as ni increases
                                      Added MSE
ni    g(xi, β)   ωi     MSE(θ̂i)    Raking   Aug BHF   Aug ω^{-1}
16    0.20       0.01   11          0.0      0.0       0.0
16    0.50       0.01   21          0.2      0.0       0.1
204   0.20       0.25   5           0.2      0.2       0.3
204   0.50       0.25   12          0.2      0.5       0.1
Simulation Results: MC MSE
• Simulation set 2: ωi decreases as ni increases
                                      Added MSE
ni    g(xi, β)   ωi     MSE(θ̂i)    Raking   Aug BHF   Aug ω^{-1}
16    0.20       0.25   11          20       26        28
16    0.50       0.25   21          46       68        30
204   0.20       0.04   5           19       0         28
204   0.50       0.01   12          45       0         28
MC Coverages of Nominal 95% Prediction Intervals
• Nominal 95% prediction interval
\[ \hat\theta_i^{\omega} \pm t_{n_i}\, \widehat{MSE}_{i,aug}^{0.5} \]
• Average coverages across areas with the same sample size
Empirical Coverages (BHF)
         ni = 16   ni = 30   ni = 60   ni = 204
Set 1    0.94      0.93      0.94      0.95
Set 2    0.96      0.93      0.93      0.95
• Set 1: ωi increases as ni increases
• Set 2: ωi decreases as ni increases
Summary
• Augmented approach for nonlinear models
• Benchmarking effect on MSE
- small increase when ωi is inversely related to variance
- large increase when ωi is positively related to variance
• Alternative weights in the objective function
- Aug BHF: relatively small average amount added to MSE
- Aug ω^{-1}: nearly constant increase in MSE
End
Thank you!
References:
Battese, G.E., Harter, R.M. and Fuller, W.A. (1988), “An error components model for prediction of county crop
areas using survey and satellite data,” Journal of the American Statistical Association 83, 28-36.
Jiang, J. and Lahiri, P. (2006), “Mixed model prediction and small area estimation,” Test 15, 1-96.
Thomas, D.R. and Rao, J.N.K. (1987), “Small-sample comparisons of level and power for simple goodness-of-fit
statistics under cluster sampling,” Journal of the American Statistical Association 82, 630-636.
Wang, J., Fuller, W.A. and Qu, Y. (2008), “Small area estimation under a restriction,” Survey Methodology 34, 29-36.
You, Y. and Rao, J.N.K. (2002), “A pseudo-empirical best linear unbiased prediction approach to small area
estimation using survey weights,” Canadian Journal of Statistics 30, 431-439.