Semiparametric Mixed Models in Small Area Estimation

Mark Delorey
F. Jay Breidt
Colorado State University
September 22, 2002

FUNDING SOURCE
This presentation was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of its authors and the STARMAP Program. EPA does not endorse any products or commercial services mentioned in this presentation.

Outline
- Small area estimation
- Standard parametric small area model
- Semiparametric model/estimation
- Future work

Small Area Estimation
- Probability samples are not sufficiently dense in small watersheds
- Need to incorporate auxiliary information (remote sensing, GIS) through model
- Problem: High bias if model is misspecified

Standard Parametric Mixed Model with Site Specific Auxiliary Data
- Battese, Harter, Fuller (1988):
  $y_{gi} = x_{gi}\beta + \alpha_g + \varepsilon_{gi}$
  where $i = 1, \ldots, n_g$ are the sites within small area $g$
- $\alpha_g \sim \text{iid } N(0, \sigma_\alpha^2)$ and $\varepsilon_{gi} \sim N(0, \sigma_\varepsilon^2)$

Battese, Harter, Fuller (cont.)
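- Small area mean is
  $\mu_g = \frac{1}{N_g} \sum_{i=1}^{N_g} y_{gi}$
- Estimate by convex combination of model-based prediction and survey regression estimator:
  $\hat{\mu}_g = (1 - \gamma_g)\, \bar{x}_g \hat{\beta} + \gamma_g \left[ \bar{y}_{sg} + (\bar{x}_g - \bar{x}_{sg}) \hat{\beta} \right]$
  where $\bar{y}_{sg}$ and $\bar{x}_{sg}$ are the sample means in area $g$, $\bar{x}_g$ is the area mean of the auxiliary data, and $\gamma_g \in [0, 1]$ is the weight

Semiparametric Model with Smooth Function
- Adapting time series case from Zhang, Lin, Raz, and Sowers (1998):
  $y_{gi} = x_{gi}\beta + f(t_{gi}) + \alpha_g + \varepsilon_{gi}$
  that is, the BHF model plus $f(t_{gi})$
- $f(t)$ is a twice differentiable smooth function of time
- $\alpha_g \sim N(0, \sigma_\alpha^2)$, $\varepsilon_{gi} \sim N(0, \sigma_\varepsilon^2)$

Smooth Function Representation
- Impose the constraint $f = T\delta + Ba$, where $a \sim N(0, \sigma_a^2 I)$ (Green, 1987)
- $T$ is a matrix consisting of the coordinates of observed locations in time (space)
- $B$ is constructed based on relative positions of observed locations

Linear Mixed Model
- Model can then be written as linear mixed model:
  $y_{gi} = x_{gi}\beta + t_{gi}\delta + b_{gi}a + \alpha_g + \varepsilon_{gi}$
  where $t_{gi}$ and $b_{gi}$ are the rows of $T$ and $B$ for site $i$ in area $g$
- $a \sim N(0, \sigma_a^2 I)$, $\alpha_g \sim N(0, \sigma_\alpha^2)$, $\varepsilon_{gi} \sim N(0, \sigma_\varepsilon^2)$

Prediction with Linear Mixed Model
- Joint distribution of the observed data $Y$ and a single response $y_{gi}$:
  $\begin{pmatrix} Y \\ y_{gi} \end{pmatrix} \sim N\left( \begin{pmatrix} X\beta + T\delta \\ x_{gi}\beta + t_{gi}\delta \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right)$
- Then,
  $E(y_{gi} \mid Y) = x_{gi}\beta + t_{gi}\delta + \Sigma_{21}\Sigma_{11}^{-1}(Y - X\beta - T\delta)$
- $\Sigma_{11}$, $\Sigma_{12}$, and $\Sigma_{22}$ are known up to some variance components

Prediction with Linear Mixed Model
- Using the form of composite estimator from BHF, we get
  $\hat{\mu}_g = \bar{x}_g\hat{\beta} + \bar{t}_g\hat{\delta} + \Sigma_{21}\Sigma_{11}^{-1}(Y - X\hat{\beta} - T\hat{\delta})$
  for small $n_g$, where $\hat{\beta}$ and $\hat{\delta}$ are the GLS estimates of $\beta$ and $\delta$, respectively.

Future Work
- Estimate variance components and smoothing parameter
- Compare results with
  $y_{gi} = x_{gi}\beta + \alpha_g + U(t_{gi}) + \varepsilon_{gi}$
  where $U(t)$ is a correlated random process (black-box kriging, Barry and Ver Hoef, 1996)

References
- Barry and Ver Hoef (1996). Blackbox kriging: Spatial prediction without specifying variogram models. Journal of Agricultural, Biological, and Environmental Statistics 1:297-322.
- Battese, Harter, Fuller (1988). An error-component model for prediction of county crop areas using survey and satellite data. JASA 83:28-36.
- Green (1987). Penalized likelihood for general semi-parametric regression models. International Statistical Review 55:245-260.
- Zhang, Lin, Raz, and Sowers (1998). Semiparametric stochastic mixed models for longitudinal data. JASA 93:710-719.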
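
Illustration: prediction with the linear mixed model

The prediction step described in the "Prediction with Linear Mixed Model" slides can be sketched numerically. The example below is a minimal sketch, not the authors' implementation. It assumes the variance components are known (the slides list their estimation as future work), writes the fixed-effect coefficients as `beta` (for X) and `delta` (for T), and uses a stand-in cubic radial basis for B rather than the construction in Green (1987). The function names `gls_fit` and `blup_predict`, the indicator matrix `Z` for small-area membership, and all simulated values are hypothetical.

```python
import numpy as np

def gls_fit(Y, W, Sigma11):
    """GLS estimate of the fixed-effect coefficients for design matrix W."""
    Si_W = np.linalg.solve(Sigma11, W)
    Si_Y = np.linalg.solve(Sigma11, Y)
    return np.linalg.solve(W.T @ Si_W, W.T @ Si_Y)

def blup_predict(Y, X, T, B, Z, x0, t0, b0, z0, s2_a, s2_alpha, s2_eps):
    """Plug-in version of E(y0 | Y) for an unobserved site.

    Z is the n x G indicator matrix of small-area membership (assumed name);
    x0, t0, b0, z0 are the corresponding rows for the new site.
    Variance components are treated as known (an assumption).
    """
    n = len(Y)
    # Covariance of the observed responses: spline part + area effect + noise
    Sigma11 = s2_a * B @ B.T + s2_alpha * Z @ Z.T + s2_eps * np.eye(n)
    # Covariance between the new response and the observed responses
    Sigma21 = s2_a * b0 @ B.T + s2_alpha * z0 @ Z.T
    W = np.hstack([X, T])
    coef = gls_fit(Y, W, Sigma11)          # stacked (beta, delta)
    resid = Y - W @ coef
    w0 = np.concatenate([x0, t0])
    return w0 @ coef + Sigma21 @ np.linalg.solve(Sigma11, resid)

# Simulated data: G small areas with n_g sites each (all values hypothetical)
rng = np.random.default_rng(0)
G, n_g = 4, 6
n = G * n_g
area = np.repeat(np.arange(G), n_g)
Z = np.eye(G)[area]                        # area membership indicators
x = rng.normal(size=n)
t = rng.uniform(0.0, 1.0, size=n)
X = x[:, None]                             # one auxiliary covariate
T = np.column_stack([np.ones(n), t])       # intercept and linear term in t
knots = np.linspace(0.1, 0.9, 5)
B = np.abs(t[:, None] - knots[None, :]) ** 3   # stand-in basis, not Green's B

s2_a, s2_alpha, s2_eps = 0.5, 0.3, 0.2
a = rng.normal(scale=np.sqrt(s2_a), size=len(knots))
alpha = rng.normal(scale=np.sqrt(s2_alpha), size=G)
eps = rng.normal(scale=np.sqrt(s2_eps), size=n)
Y = X @ np.array([2.0]) + T @ np.array([1.0, -3.0]) + B @ a + Z @ alpha + eps

# Predict the response at a new site in area 0 with x = 0.2, t = 0.5
x0 = np.array([0.2])
t0 = np.array([1.0, 0.5])
b0 = np.abs(0.5 - knots) ** 3
z0 = np.eye(G)[0]
pred = blup_predict(Y, X, T, B, Z, x0, t0, b0, z0, s2_a, s2_alpha, s2_eps)
print(pred)
```

Because the predictor is linear in `x0`, `t0`, and `b0`, passing the area means of those rows in place of a single site's rows gives an area-level prediction of the same form as the composite estimator on the slide, under the same assumptions.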