Parameter Estimation in the Spatio-Temporal Mixed Effects Model – Analysis of Massive Spatio-Temporal Data Sets

Matthias Katzfuß
Advisor: Dr. Noel Cressie
Department of Statistics, The Ohio State University
September 17, 2010

Matthias Katzfuß (OSU Statistics)   STME Parameter Estimation   September 17, 2010

Outline
1 Introduction: The STME Model
2 Parameter Estimation
   EM Estimation
   Bayesian Estimation
3 Application: Analysis of CO2 Data
4 Conclusions

1 Introduction: The STME Model

Notation
• Hidden spatio-temporal process $y_t(s)$ at time $t$ and location $s$
• Measurements: $z_t(s_{i,t}) = y_t(s_{i,t}) + \varepsilon_t(s_{i,t})$, $i = 1, \ldots, n_t$, $t = 1, \ldots, T$
• In vector notation: $z_{1:T} := [z_1', \ldots, z_T']'$, where $z_t := [z_t(s_{1,t}), \ldots, z_t(s_{n_t,t})]'$
• Goal: Predict $y_t(s_0)$; $t \in \{1, \ldots, T\}$

Motivating Example: Remote-Sensing Data
Example: Global satellite measurements of CO2
[Figure: maps of CO2 for Day 1, Day 2, and Day 3; color scale from 350 to 400]
Challenges of global remote-sensing data:
• Massiveness
  • Need dimension reduction
• Sparseness
  • Need to take advantage of spatial and temporal correlations
• Nonstationarity
  • Need a flexible model

Spatio-Temporal Mixed Effects Model (Cressie et al., 2010)
Process Model: $y_t(s) = x(s)'\beta_t + b(s)'\eta_t + \gamma_t(s)$
• $x(s)'\beta_t$: large-scale trend
• $b(s) := [b_1(s), \ldots, b_r(s)]'$: vector of known spatial basis functions
• $\eta_t = H\eta_{t-1} + \delta_t$; $t = 1, 2, \ldots$
• $\eta_0 \sim N_r(0, K_0)$
• $\delta_t \sim N_r(0, U)$
• $\gamma_t(s) \sim N(0, \sigma_\gamma^2 v_\gamma(s))$: fine-scale variation
Unknown parameters: $\theta := \{\beta_t\}, \sigma_\gamma^2, K_0, H, U$

Previous Approaches to Massive S-T Data Sets
• Many ad-hoc methods used outside the statistics literature (non-optimal, no measures of uncertainty)
• Other statistical spatio-temporal dimension-reduction models are less general (e.g., Nychka et al., 2002)
• STME model: Parameter estimation via binned-method-of-moments (Kang et al., 2010):
  • Many arbitrary choices have to be made
  • Estimates have to be modified to be valid
  • Does not fully exploit temporal dependence in the data

2 Parameter Estimation

EM Estimation

Maximum-Likelihood Estimation
• Goal: Find $\hat{\theta}_{ML} = \arg\max_{\theta} f(z_{1:T} \mid \theta)$, where recall $z_t = X_t\beta_t + B_t\eta_t + \gamma_t + \varepsilon_t$
• Problem: Likelihood $f(z_{1:T} \mid \theta)$ is quite complicated
• Solution: Expectation-maximization algorithm (Dempster et al., 1977)
  • Maximization: “Complete-data likelihood” $f(\eta_{1:T}, \gamma_{1:T} \mid \theta)$ is easy to maximize
  • Expectation: $E_{\theta}(f(\eta_{1:T}, \gamma_{1:T} \mid \theta) \mid z_{1:T})$ is obtained via FRS (Fixed Rank Smoothing), a rapid sequential updating technique based on the Kalman filter (Kalman, 1960)

EM Estimation (Katzfuss & Cressie, 2010)
The EM algorithm:
• Choose initial value $\theta^{[0]}$
• For $l = 0, 1, 2, \ldots$ (until convergence):
  1. E-Step: Run FRS with $\theta^{[l]}$ to obtain $E_{\theta^{[l]}}(f(\eta_{1:T}, \gamma_{1:T} \mid \theta) \mid z_{1:T})$
  2. M-Step: $\theta^{[l+1]} = \arg\max_{\theta} E_{\theta^{[l]}}(f(\eta_{1:T}, \gamma_{1:T} \mid \theta) \mid z_{1:T})$
  3. Go back to 1.
Properties of the resulting estimates:
• Parameter estimates guaranteed to be valid
• Here, convergence to a (possibly local) maximum of the likelihood function

Bayesian Estimation

Bayesian Inference
• Parameters $\theta$ have a prior distribution
• Obtain posterior distribution of unknowns $y_t(s_0)$ and $\theta$ given the data $z_{1:T}$ using Bayes’ Theorem
• In almost all cases, have to approximate posterior by sampling from it
• “Shrinkage”: Biased, but more efficient estimators

Priors and Posteriors
Prior distributions:
• “Standard” priors on $\{\beta_t\}$ and $\sigma_\gamma^2$
• Covariance matrices $K_0$ and $U$: Multiresolutional Givens-angle prior (Kang & Cressie, 2009)
  • Control extreme eigenvalues
  • Shrink off-diagonal elements toward zero
• Propagator matrix $H$: Shrink off-diagonal elements depending on how far apart the corresponding basis functions are
Posterior distribution:
• Samples of posterior distribution obtained using MCMC
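To make the EM iteration from the EM Estimation slides concrete, here is a minimal sketch on a scalar stand-in for the STME model. It rests on several assumptions that are not in the slides: r = 1, no trend and no fine-scale term, a known measurement-error variance `sigma2` and a known initial variance `k0`, so only the propagator `h` and innovation variance `u` are estimated. The E-step uses a textbook Kalman filter with a backward smoothing pass as a stand-in for FRS, and the M-step maximizes the expected complete-data log-likelihood, which is the usual form of EM. All function names are hypothetical.

```python
import math
import random


def kalman_smoother(z, h, u, sigma2, k0):
    """E-step helper for eta_t = h*eta_{t-1} + delta_t, z_t = eta_t + eps_t."""
    T = len(z)
    m_f, p_f = [0.0] * (T + 1), [k0] * (T + 1)   # filtered mean/variance, t = 0..T
    m_p, p_p = [0.0] * (T + 1), [0.0] * (T + 1)  # one-step-ahead predictions
    loglik = 0.0
    for t in range(1, T + 1):                    # forward (filtering) pass
        m_p[t] = h * m_f[t - 1]
        p_p[t] = h * h * p_f[t - 1] + u
        s = p_p[t] + sigma2                      # innovation variance
        gain = p_p[t] / s
        resid = z[t - 1] - m_p[t]
        m_f[t] = m_p[t] + gain * resid
        p_f[t] = (1.0 - gain) * p_p[t]
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + resid * resid / s)
    m_s, p_s = m_f[:], p_f[:]                    # smoothed mean/variance
    lag1 = [0.0] * (T + 1)                       # lag1[t] = Cov(eta_t, eta_{t-1} | z_{1:T})
    for t in range(T - 1, -1, -1):               # backward (smoothing) pass
        j = p_f[t] * h / p_p[t + 1]
        m_s[t] = m_f[t] + j * (m_s[t + 1] - m_p[t + 1])
        p_s[t] = p_f[t] + j * j * (p_s[t + 1] - p_p[t + 1])
        lag1[t + 1] = j * p_s[t + 1]
    return m_s, p_s, lag1, loglik


def em_step(z, h, u, sigma2, k0):
    """One EM iteration; returns updated (h, u) and the log-likelihood at the old values."""
    m_s, p_s, lag1, loglik = kalman_smoother(z, h, u, sigma2, k0)
    T = len(z)
    s11 = sum(p_s[t] + m_s[t] ** 2 for t in range(1, T + 1))
    s10 = sum(lag1[t] + m_s[t] * m_s[t - 1] for t in range(1, T + 1))
    s00 = sum(p_s[t - 1] + m_s[t - 1] ** 2 for t in range(1, T + 1))
    h_new = s10 / s00                            # closed-form M-step for h
    u_new = (s11 - h_new * s10) / T              # closed-form M-step for u
    return h_new, u_new, loglik


def simulate(T, h, u, sigma2, k0, seed=0):
    """Draw synthetic measurements from the scalar model (illustration only)."""
    rng = random.Random(seed)
    eta = rng.gauss(0.0, math.sqrt(k0))
    z = []
    for _ in range(T):
        eta = h * eta + rng.gauss(0.0, math.sqrt(u))
        z.append(eta + rng.gauss(0.0, math.sqrt(sigma2)))
    return z


def run_em(z, h0, u0, sigma2, k0, n_iter=100):
    h, u, logliks = h0, u0, []
    for _ in range(n_iter):
        h, u, ll = em_step(z, h, u, sigma2, k0)
        logliks.append(ll)
    return h, u, logliks


z = simulate(T=500, h=0.8, u=1.0, sigma2=0.25, k0=1.0)
h_hat, u_hat, logliks = run_em(z, h0=0.2, u0=3.0, sigma2=0.25, k0=1.0)
print(round(h_hat, 3), round(u_hat, 3))
```

In this sketch the variance update is an average of expected squared innovations, so it stays positive without any after-the-fact correction, and the recorded log-likelihoods should not decrease from one iteration to the next. Both points mirror the properties listed on the EM Estimation slide, though only for this toy setting.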
3 Application: Analysis of CO2 Data

The Data
Mid-tropospheric CO2 on May 1-4, 2003, as measured by AIRS ($n_t \approx 14$K)
[Figure: maps of the data for Day 1, Day 2, Day 3, and Day 4; color scale from 350 to 400]

Statistical Analysis
• Trend: $x(s) = [1, \mathrm{lat}(s)]'$
• Make predictions on a hexagonal grid of size 57,065 for each day
• Basis functions: $r = 380$ bisquare functions at 3 spatial resolutions
[Figure: Bisquare function in one dimension; $b(s)$ plotted against $s$ for Res 1, Res 2, and Res 3]

EM Results
[Figure: Predictions using EM; Standard errors using EM]
EM computation time: 16 iterations × one minute each = 16 min total

Bayesian Results
[Figure: Posterior means; Posterior standard deviations]
1,500 MCMC iterations × 15 seconds each = 6.25 hours total

Estimates of the Propagator Matrix
[Figure: heat maps of the estimates $H_{EM}$ and $H_B$; color scale from −1 to 1]

4 Conclusions
• STME Model
  • Scalable and flexible technique for analysis of massive, nonstationary spatio-temporal data sets
  • Provides uncertainty quantification
  • Here, successful use on CO2 satellite data
• Parameter estimation:
  • EM estimation: Fast, easy
  • Bayesian estimation: Better prediction (≈ 10% for AIRS data), more accurate uncertainty assessment

References
• Cressie, N., Shi, T., & Kang, E. L. (2010). Fixed rank filtering for spatio-temporal data. Journal of Computational and Graphical Statistics. Forthcoming.
• Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B, 39(1), 1–38.
• Kalman, R. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1), 35–45.
• Kang, E. L., & Cressie, N. (2009). Bayesian inference for the spatial random effects model. Department of Statistics Technical Report No. 830. The Ohio State University.
• Kang, E. L., Cressie, N., & Shi, T. (2010). Using temporal variability to improve spatial mapping with application to satellite data. Canadian Journal of Statistics. Forthcoming.
• Katzfuss, M., & Cressie, N. (2010). Spatio-Temporal Smoothing and EM Estimation for Massive Remote-Sensing Data Sets. Department of Statistics Technical Report No. 840. The Ohio State University.
• Nychka, D. W., Wikle, C., & Royle, J. (2002). Multiresolution models for nonstationary spatial covariance functions. Statistical Modelling, 2, 315–331.
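
Supplementary sketch: the Statistical Analysis slide names bisquare basis functions at 3 spatial resolutions but does not give their formula. The block below uses the common one-dimensional bisquare form, $b(s) = (1 - (|s - c|/w)^2)^2$ for $|s - c| \le w$ and 0 otherwise, with center $c$ and radius $w$. The layout of centers and radii in `make_centers` (number of centers per resolution, radius equal to 1.5 times the spacing) is a hypothetical choice for illustration, not the configuration used for the AIRS analysis.

```python
def bisquare(s, center, radius):
    """One-dimensional bisquare bump: 1 at the center, 0 at and beyond the radius."""
    d = abs(s - center) / radius
    return (1.0 - d * d) ** 2 if d <= 1.0 else 0.0


def make_centers(n_res=3, lo=-1.0, hi=1.0):
    """Hypothetical layout: resolution k has 2**k + 1 equally spaced centers
    on [lo, hi], each with radius 1.5 times the spacing at that resolution."""
    layout = []
    for k in range(1, n_res + 1):
        n = 2 ** k + 1
        spacing = (hi - lo) / (n - 1)
        layout += [(lo + i * spacing, 1.5 * spacing) for i in range(n)]
    return layout


def basis_vector(s, layout):
    """Evaluate b(s) = [b_1(s), ..., b_r(s)] at a single location s."""
    return [bisquare(s, c, w) for c, w in layout]


layout = make_centers()
b0 = basis_vector(0.0, layout)
print(len(layout), sum(1 for v in b0 if v > 0.0))
```

Because each function has compact support, most entries of $b(s)$ are zero at any given location, and finer resolutions use narrower bumps. This is the sense in which a fixed, modest number of basis functions can represent variation at several spatial scales.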