
Semiparametric Mixed Models in
Small Area Estimation
Mark Delorey
F. Jay Breidt
Colorado State University
September 22, 2002
FUNDING SOURCE
This presentation was developed under the STAR
Research Assistance Agreement CR-829095
awarded by the U.S. Environmental Protection
Agency (EPA) to Colorado State University. This
presentation has not been formally reviewed by
EPA. The views expressed here are solely those of
its authors and the STARMAP Program. EPA does
not endorse any products or commercial services
mentioned in this presentation.
Outline
Small area estimation
Standard parametric small area model
Semiparametric model/estimation
Future work
Small Area Estimation
Probability samples are not sufficiently dense in
small watersheds
Need to incorporate auxiliary information (remote
sensing, GIS) through a model
Problem: High bias if the model is misspecified
Standard Parametric Mixed Model with
Site-Specific Auxiliary Data
Battese, Harter, Fuller (1988)
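$$ y_{gi} = x_{gi}^\top \beta + \alpha_g + \varepsilon_{gi} $$
where
$i = 1, \dots, n_g$ index the sites within small area $g$
$\alpha_g \sim \text{iid } N(0, \sigma_\alpha^2)$ and $\varepsilon_{gi} \sim N(0, \sigma_\varepsilon^2)$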
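To make the structure of this nested-error model concrete, the following is a minimal simulation sketch. The symbol names (beta for the regression coefficients, alpha for the area effects, eps for the site errors), the function name `simulate_bhf`, and the standard-normal covariates are illustrative assumptions, not part of the presentation.

```python
import numpy as np

def simulate_bhf(n_per_area, beta, sigma_alpha, sigma_eps, seed=0):
    """Draw (area, X, y) from y_gi = x_gi' beta + alpha_g + eps_gi."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    n_areas = len(n_per_area)
    # Area label g for every sampled site.
    area = np.repeat(np.arange(n_areas), n_per_area)
    n = area.size
    # Intercept plus hypothetical standard-normal auxiliary covariates.
    X = np.column_stack([np.ones(n), rng.normal(size=(n, beta.size - 1))])
    alpha = rng.normal(0.0, sigma_alpha, size=n_areas)  # one random effect per area
    eps = rng.normal(0.0, sigma_eps, size=n)            # one error per site
    y = X @ beta + alpha[area] + eps
    return area, X, y

area, X, y = simulate_bhf([3, 5, 2], beta=[1.0, 2.0], sigma_alpha=0.5, sigma_eps=1.0)
```

Sites in the same area share one draw of the area effect, which is what induces within-area correlation.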
Battese, Harter, Fuller (cont.)
Small area mean is
$$ \mu_g = \frac{1}{N_g} \sum_{i=1}^{N_g} y_{gi} $$
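Estimate by convex combination of model-based
prediction and survey regression estimator:
$$ \hat{\mu}_g = (1 - \lambda_g)\, \bar{x}_g^\top \hat{\beta} + \lambda_g \left[ \bar{y}_{sg} + (\bar{x}_g - \bar{x}_{sg})^\top \hat{\beta} \right] $$
where $\bar{y}_{sg}$ and $\bar{x}_{sg}$ are sample means in area $g$, $\bar{x}_g$ is the area mean of the auxiliary data, and $0 \le \lambda_g \le 1$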
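The convex combination can be sketched numerically as below. The weight used here, sigma_alpha^2 / (sigma_alpha^2 + sigma_eps^2 / n_g), is the shrinkage factor of the Battese, Harter, Fuller predictor without finite-population corrections. The function name, argument names, and the symbol lam for the weight are assumptions made for illustration, and the variance components are treated as known.

```python
import numpy as np

def composite_estimate(y_s, X_s, xbar_pop, beta_hat, sigma_alpha2, sigma_eps2):
    """Convex combination of synthetic and survey regression estimators for one area."""
    y_s = np.asarray(y_s, dtype=float)
    X_s = np.asarray(X_s, dtype=float)
    xbar_pop = np.asarray(xbar_pop, dtype=float)
    beta_hat = np.asarray(beta_hat, dtype=float)
    n_g = y_s.size
    # Weight on the survey regression estimator; tends to 1 as n_g grows.
    lam = sigma_alpha2 / (sigma_alpha2 + sigma_eps2 / n_g)
    # Model-based (synthetic) prediction of the area mean.
    synthetic = xbar_pop @ beta_hat
    # Survey regression estimator: sample mean adjusted for covariate imbalance.
    survey_reg = y_s.mean() + (xbar_pop - X_s.mean(axis=0)) @ beta_hat
    return (1.0 - lam) * synthetic + lam * survey_reg, lam

est, lam = composite_estimate(
    y_s=[2.0, 4.0],
    X_s=[[1.0, 1.0], [1.0, 3.0]],
    xbar_pop=[1.0, 2.5],
    beta_hat=[0.5, 1.0],
    sigma_alpha2=1.0,
    sigma_eps2=2.0,
)
```

With few sampled sites the weight is small and the estimate leans on the model-based prediction, which is why model misspecification matters most in sparsely sampled areas.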

Semiparametric Model with Smooth Function
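Adapting the time series case from Zhang, Lin, Raz, and
Sowers (1998):
$$ y_{gi} = x_{gi}^\top \beta + f(t_{gi}) + \alpha_g + \varepsilon_{gi} $$
that is, the BHF model plus $f(t_{gi})$
$f(t)$ is a twice-differentiable smooth function of time
$\alpha_g \sim N(0, \sigma_\alpha^2)$
$\varepsilon_{gi} \sim N(0, \sigma_\varepsilon^2)$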
Smooth Function Representation
Impose the constraint $f = T\gamma + Ba$, where $a \sim N(0, \sigma_a^2 I)$
(Green, 1987)
T is a matrix consisting of the coordinates of observed
locations in time (space)
B is constructed based on relative positions of observed
locations
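The slide does not spell out how B is built. As a sketch only, one standard construction for natural cubic smoothing splines writes the roughness penalty matrix as K = Q R^{-1} Q' = L L' and takes B = L (L'L)^{-1}, so that the penalty on f reduces to a'a. The function name, the use of an intercept column in T, and the requirement of sorted, distinct one-dimensional locations are assumptions of this example.

```python
import numpy as np

def spline_mixed_basis(t):
    """Return T, B, L for sorted, distinct 1-D locations t (needs len(t) >= 3)."""
    t = np.asarray(t, dtype=float)
    n = t.size
    h = np.diff(t)                      # gaps between neighbouring locations
    Q = np.zeros((n, n - 2))            # banded second-difference operator
    R = np.zeros((n - 2, n - 2))        # tridiagonal, positive definite
    for j in range(n - 2):
        Q[j, j] = 1.0 / h[j]
        Q[j + 1, j] = -1.0 / h[j] - 1.0 / h[j + 1]
        Q[j + 2, j] = 1.0 / h[j + 1]
        R[j, j] = (h[j] + h[j + 1]) / 3.0
        if j + 1 < n - 2:
            R[j, j + 1] = R[j + 1, j] = h[j + 1] / 6.0
    C = np.linalg.cholesky(R)           # R = C C'
    L = np.linalg.solve(C, Q.T).T       # L = Q C^{-T}, hence L L' = Q R^{-1} Q' = K
    B = L @ np.linalg.inv(L.T @ L)      # random-effect design for the smooth part
    T = np.column_stack([np.ones(n), t])  # intercept and location: the unpenalized part
    return T, B, L

T, B, L = spline_mixed_basis([0.0, 1.0, 2.5, 3.0, 4.2])
```

Because L'T = 0, the linear part T and the smooth deviations B a are separated: T carries the unpenalized trend and B depends only on the spacing of the observed locations.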
Linear Mixed Model
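Model can then be written as a linear mixed model:
$$ y_{gi} = x_{gi}^\top \beta + t_{gi}^\top \gamma + b_{gi}^\top a + \alpha_g + \varepsilon_{gi} $$
$a \sim N(0, \sigma_a^2 I)$
$\alpha_g \sim N(0, \sigma_\alpha^2)$
$\varepsilon_{gi} \sim N(0, \sigma_\varepsilon^2)$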
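If the three random terms are additionally assumed to be mutually independent (an assumption not stated on the slide), the marginal covariance of the stacked responses is the sum of three pieces: the spline term, the area effects, and the site errors. The sketch below assembles it; the function and argument names are illustrative.

```python
import numpy as np

def marginal_cov(B, area, sigma_a2, sigma_alpha2, sigma_eps2):
    """Cov(Y) = sigma_a2 * B B' + sigma_alpha2 * Z Z' + sigma_eps2 * I."""
    B = np.asarray(B, dtype=float)
    area = np.asarray(area)
    # Z[i, g] = 1 when site i lies in small area g.
    Z = (area[:, None] == np.unique(area)[None, :]).astype(float)
    n = area.size
    return sigma_a2 * B @ B.T + sigma_alpha2 * Z @ Z.T + sigma_eps2 * np.eye(n)

# Three sites, the first two in the same area, with the spline part switched off.
V = marginal_cov(np.zeros((3, 1)), np.array([0, 0, 1]), 1.0, 2.0, 0.5)
```

In this toy case the only off-diagonal covariance is between the two sites that share an area.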
Prediction with Linear Mixed Model
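Jointly,
$$ \begin{pmatrix} Y \\ y_{gi} \end{pmatrix} \sim N\left( \begin{pmatrix} X\beta + T\gamma \\ x_{gi}^\top \beta + t_{gi}^\top \gamma \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right) $$
Then,
$$ E(y_{gi} \mid Y) = x_{gi}^\top \beta + t_{gi}^\top \gamma + \Sigma_{21} \Sigma_{11}^{-1} (Y - X\beta - T\gamma) $$
$\Sigma_{11}$, $\Sigma_{12}$, and $\Sigma_{22}$ are known up to some variance components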
Prediction with Linear Mixed Model
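Using the form of composite estimator from BHF,
we get
$$ \hat{\mu}_g = \bar{x}_g^\top \hat{\beta} + \bar{t}_g^\top \hat{\gamma} + \Sigma_{21} \Sigma_{11}^{-1} (Y - X\hat{\beta} - T\hat{\gamma}) $$
for small $n_g$, where $\hat{\beta}$ and $\hat{\gamma}$ are the GLS
estimates of $\beta$ and $\gamma$, respectively.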
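A minimal sketch of the GLS step, assuming the marginal covariance of Y (called V here) is available with variance components plugged in. Stacking the fixed-effect design as W = [X, T] is a convenience of this example, and the function name is hypothetical.

```python
import numpy as np

def gls(W, V, Y):
    """GLS coefficients (W' V^{-1} W)^{-1} W' V^{-1} Y for a symmetric V."""
    W = np.asarray(W, dtype=float)
    V = np.asarray(V, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Vinv_W = np.linalg.solve(V, W)          # V^{-1} W
    # Vinv_W.T @ Y equals W' V^{-1} Y because V is symmetric.
    return np.linalg.solve(W.T @ Vinv_W, Vinv_W.T @ Y)

# Toy check: noiseless data recover the coefficients exactly.
W = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
V = np.array([[2.0, 0.5, 0.0], [0.5, 2.0, 0.0], [0.0, 0.0, 1.0]])
coef = gls(W, V, W @ np.array([1.0, 2.0]))
```

The first columns of the result correspond to the coefficients on X and the remaining ones to the coefficients on T, in the order the columns were stacked.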
Future Work
Estimate variance components and smoothing
parameter
Compare results with
$$ y_{gi} = x_{gi}^\top \beta + \alpha_g + U(t_{gi}) + \varepsilon_{gi} $$
where U(t) is a correlated random process
(black-box kriging, Barry and Ver Hoef, 1996)
References
Barry and Ver Hoef (1996). Blackbox Kriging: Spatial prediction without specifying
variogram models. Journal of Agricultural, Biological, and Environmental Statistics,
1:297-322.
Battese, Harter, and Fuller (1988). An error-components model for prediction of county crop
areas using survey and satellite data. Journal of the American Statistical Association, 83:28-36.
Green (1987). Penalized likelihood for general semi-parametric regression models.
International Statistical Review 55:245-260.
Zhang, Lin, Raz, and Sowers (1998). Semiparametric stochastic mixed models for
longitudinal data. Journal of the American Statistical Association, 93:710-719.