Nonstationary spatial covariance modeling via spatial deformation & extensions to covariates and models for dynamic atmospheric processes

Paul D. Sampson (w/ Doris Damian, Peter Guttorp, Tilmann Gneiting)
University of Washington and National Research Center for Statistics and the Environment
2-7 Feb 2003, SAMSI Multiscale Workshop

A common framework for spatio-temporal modeling:

Response(x,t) = Trend(x,t) + Residual(x,t)

Some (obvious) observations on issues of scale: the practical definition and estimation of Trend and Residual depend on:
• Understanding of the spatio-temporal processes of interest
• Scientific questions to be addressed by analysis and modeling
• Details of available spatio-temporal sampling data: inference at "small" spatial and/or temporal scales generally requires sampling and modeling at corresponding scales.

My experience:
1. Hourly ozone monitoring data at 100 sites in the San Joaquin Valley, CA, for assessment of photochemical model predictions:
   • Accounting for sub-grid scale spatial variability in point monitoring observations to assess grid average photochemical predictions
   • Decomposing spatial difference fields (empirical grid cell estimates vs. model predictions) into components at different spatial scales
   • Comparing empirical spatial covariance structure with model spatial covariance structure at different spatial scales.
2. 10-day aggregate precipitation in Languedoc-Roussillon, France
3. Daily ozone monitoring summaries (max 8-hr ave) from 100's of monitoring sites across the country for spatial prediction at target locations.

[Figure: Languedoc-Roussillon precipitation sites]

[Figure: SEASONAL (Apr-Oct) ozone sampling sites with 75% complete data each year, 1992-94, N = 257 (target counties highlighted)]

[Figure: ANNUAL ozone sampling sites with 75% complete data each year, 1992-94, N = 407 (target counties highlighted)]

Objectives for approaches to nonstationary spatial covariance modeling
• Characterizing spatially varying, locally (stationary) anisotropic structure.
• Scientific understanding/representation of covariance structure, not just a method of providing covariances for kriging.
• Capable of:
  o reflecting effects of known explanatory environmental processes such as transport/wind, topography, point sources
  o modeling effects of known explanatory environmental processes

Objectives (cont.)
• Application to purely spatial problems and/or problems with data sampled irregularly in space and time
• Application in context of dynamic models for space-time structure
• Application to "large" problems/data sets
• Diagnostics for local and large-scale correlation structure:
  o is the spatial structure "right"?
  o is the nature/degree of nonstationarity (smoothness) right?
• Evaluation of uncertainty in estimation (interpolation) of spatial covariance structure
• Incorporation in an approach to spatial estimation accounting for uncertainty in estimation of (parameters of) spatial covariance structure

Selected Methods and References:
1. Basis function methods (Nychka, Wikle, …)
2. Kernel/smoothing methods (Fuentes, …)
3. Process convolution models (Higdon, …)
4. Parametric models
5. Spatial deformation models
   • Sampson & Guttorp, 1989, 1992, 1994
   • Meiring, Monestiez, Sampson & Guttorp, 1997. "Developments..." Geostatistics Wollongong '96.
   • Mardia & Goodall, 1993.
     in Multivariate Environmental Statistics.
   • Smith, 1996. Estimating nonstationary spatial correlations. UNC preprint.

5. Spatial deformation models (cont.)
   • Perrin & Meiring, 1999. Identifiability. J Appl Prob.
   • Perrin & Senoussi, 1998. Reducing nonstationary random fields to stationarity or isotropy. Stat & Prob Letters.
   • Perrin & Monestiez, 1998. Parametric radial basis deformations. GeoENV-II.
   • Iovleff & Perrin, 2000. Simulated annealing. (JSM 2000)
   • Schmidt & O'Hagan, 2000. Bayesian inference. (JSM 2000)
   • Damian, Sampson & Guttorp, 2000. Bayesian estimation of semiparametric nonstationary covariance structures. Environmetrics.
   • Damian, Sampson & Guttorp, 2003. Variance modeling for nonstationary spatial processes with temporal replications. JGR (submitted)
   • Related recent developments in the atmospheric science literature:

Spatial Deformation Model
Damian et al., 2000 (Environmetrics), 2003 (JGR)

$$Z(x,t) = \mu(x,t) + \nu(x)^{1/2} H_t(x) + \varepsilon(x,t)$$

• $\mu(x,t)$: spatio-temporal trend, parametric in time; multivariate spatial process
• $\nu(x)$: temporal variance at $x$, log-normal spatial process
• $\varepsilon(x,t)$: measurement error and short-scale variation, $N(0, \sigma_\varepsilon^2)$, independent of $H_t(x)$
• $H_t(x)$: mean 0, var 1, 2nd-order continuous spatial process, with $\mathrm{Cov}(H_t(x), H_t(y)) \to 1$ as $x \to y$

Model (cont.)

$$\mathrm{Cor}(H_t(x), H_t(y)) = \rho_\theta(\lVert f(x) - f(y) \rVert)$$

• $f : G \to D$ smooth, bijective (Geographic plane to Deformed plane)
• $\rho_\theta(d)$: isotropic correlation function in a known parametric family (exponential, power exp, Matern)

i.e., the correlation structure of the spatial process is an (isotropic) function of Euclidean distances between site locations after a bijective transformation of the geographic coordinate system.
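
To make the deformation idea concrete, here is a minimal sketch in Python (NumPy) that evaluates $\rho_\theta(\lVert f(x_i) - f(x_j) \rVert)$ for a handful of sites. The deformation `f_example`, the exponential correlation `rho_exp`, and the site coordinates are all hypothetical choices made for illustration; none of them come from the talk.

```python
import numpy as np

def deformed_correlation(sites, f, rho):
    """Correlation matrix with entries rho(||f(x_i) - f(x_j)||).

    sites : (N, 2) array of G-plane (geographic) coordinates
    f     : callable mapping one G-plane point to its D-plane image
    rho   : isotropic correlation function of distance
    """
    d_sites = np.array([f(x) for x in sites])            # D-plane coordinates
    diff = d_sites[:, None, :] - d_sites[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))             # pairwise D-plane distances
    return rho(dist)

def f_example(x):
    """Hypothetical deformation: stretches the first axis more as x[0] grows.
    It is monotone (hence bijective) on the unit square."""
    return np.array([x[0] + 0.5 * x[0] ** 2, x[1]])

def rho_exp(d, range_=1.0):
    """Exponential correlation function (one of the families named above)."""
    return np.exp(-d / range_)

# Three equally spaced sites along a line in the G-plane
sites = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
R = deformed_correlation(sites, f_example, rho_exp)
print(np.round(R, 3))
```

Although the three sites are equally spaced geographically, the first pair ends up closer together in the D-plane than the second pair, so its correlation is higher. That is the sense in which the deformation encodes nonstationarity.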
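
[Figure: A deformation of the unit square: (b) true, (c) estimated]

[Figure: Biorthogonal grids for a deformation of the unit square: (a) true, (b) estimated]

Model (cont.)
The spatial deformation $f$ encodes the nonstationarity: spatially varying local anisotropy. We model this in terms of observation sites $x_1, x_2, \ldots, x_N$ as a pair of thin-plate splines:

$$f(x) = c + A x + W^T \sigma(x)$$

• $c + Ax$: linear part, global/large-scale anisotropy
• $W^T \sigma(x)$: non-linear part, decomposable into components of varying spatial scale, where

$$\sigma(x) = \bigl(\sigma(x - x_1), \ldots, \sigma(x - x_N)\bigr)^T, \qquad \sigma(h) = \begin{cases} |h|^2 \log |h|, & |h| > 0 \\ 0, & h = 0 \end{cases}$$

Dimensions: $c$ is $2 \times 1$, $A$ is $2 \times 2$, $W$ is $N \times 2$, $\sigma(x)$ is $N \times 1$.

Model parameters: the deformation $f$: $\{c, A, W\}$; the correlation parameter $\theta$; the error variance $\sigma_\varepsilon^2$; the variance field $\nu$ and the hyperparameters of its log-normal model.

Likelihood: given observation vectors $Z_1, \ldots, Z_N$ (one per site), each of length $T$, with $\bar{Z}$ the vector of site means over time and $S$ the $N \times N$ sample covariance matrix:

$$L(\mu, \nu, f, \theta, \sigma_\varepsilon^2; Z_1, \ldots, Z_N) = f(Z_1, \ldots, Z_N \mid \mu, \Sigma) \propto f(\bar{Z}, S \mid \mu, \Sigma) \propto |2\pi\Sigma|^{-T/2} \exp\Bigl\{ -\tfrac{T-1}{2}\, \mathrm{tr}\bigl(\Sigma^{-1} S\bigr) - \tfrac{T}{2}\, (\bar{Z} - \mu)^T \Sigma^{-1} (\bar{Z} - \mu) \Bigr\}$$

with covariance matrix $\Sigma$ having elements

$$\Sigma_{ij} = \begin{cases} \sqrt{\nu_i \nu_j}\; \rho_\theta\bigl(\lVert f(x_i) - f(x_j) \rVert\bigr), & i \neq j \\ \nu_i + \sigma_\varepsilon^2, & i = j \end{cases} \qquad 1 \le i, j \le N$$

Integrating out a flat prior on the (constant) mean:

$$f(S \mid \Sigma) = \int f(\bar{Z}, S \mid \mu, \Sigma)\, d\mu \propto |2\pi\Sigma|^{-(T-1)/2} \exp\Bigl\{ -\tfrac{T-1}{2}\, \mathrm{tr}\bigl(\Sigma^{-1} S\bigr) \Bigr\}$$

Posterior
Log-normal variance field, with $\Sigma_\nu$ the covariance matrix of $\log \nu$ at the sites:

$$\pi(\nu \mid \mu_\nu, \sigma_\nu^2, \theta_\nu, c, A, W) \propto \prod_{i=1}^{N} \frac{1}{\nu(x_i)}\; |\Sigma_\nu|^{-1/2} \exp\Bigl\{ -\tfrac{1}{2} (\log \nu - \mu_\nu \mathbf{1})^T \Sigma_\nu^{-1} (\log \nu - \mu_\nu \mathbf{1}) \Bigr\}$$

Full posterior is:

$$\pi(\theta, f{:}\{c, A, W\}, \sigma_\varepsilon^2, \nu, \mu_\nu, \sigma_\nu^2, \theta_\nu \mid S) \propto f(S \mid \Sigma)\; \pi(\nu \mid \mu_\nu, \sigma_\nu^2, \theta_\nu, c, A, W)\; \pi(c, A)\; \pi(W) \times \text{(priors on the remaining scalar parameters)}$$

• $\pi(c, A)$: diffuse normal prior on 2 free params (4 constr)
• $\pi(W) \propto \exp\bigl\{ -\tfrac{1}{2} \bigl( W_1^T S\, W_1 + W_2^T S\, W_2 \bigr) \bigr\}$, where $W_1, W_2$ are the columns of $W$ and here $S_{ij} = \sigma(x_i - x_j)$ (the thin-plate kernel matrix, not the sample covariance): a "bending energy" prior on the space orthogonal to the linear part

Computation
• Metropolis-Hastings algorithm for sampling from the highly multidimensional posterior.
• Given estimates of D-plane locations, $f(x_i)$, the transformation is extrapolated to the whole domain using thin-plate splines. (Visualization and diagnostics.)
• Predictive distributions for (a) temporal variance at unobserved sites, (b) the spatial covariance for pairs of observed and/or unobserved sites, (c) the observation process at unobserved sites.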
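
The thin-plate spline extrapolation step can be sketched as follows. This is a minimal illustration in Python (NumPy), assuming the standard interpolating thin-plate spline linear system. The function names (`tps_kernel`, `fit_tps`, `eval_tps`, `bending_energy`) and the five example sites are hypothetical, and the code is not taken from the authors' software.

```python
import numpy as np

def tps_kernel(h):
    """sigma(h) = |h|^2 log|h| for |h| > 0, and 0 at h = 0 (applied row-wise)."""
    r = np.linalg.norm(h, axis=-1)
    out = np.zeros_like(r)
    pos = r > 0
    out[pos] = r[pos] ** 2 * np.log(r[pos])
    return out

def fit_tps(x_sites, y_sites):
    """Solve for (c, A, W) so that f(x_i) = y_i, with W orthogonal to the linear part."""
    n = x_sites.shape[0]
    S = tps_kernel(x_sites[:, None, :] - x_sites[None, :, :])    # N x N kernel matrix
    P = np.hstack([np.ones((n, 1)), x_sites])                    # N x 3: [1, x]
    K = np.block([[S, P], [P.T, np.zeros((3, 3))]])
    rhs = np.vstack([y_sites, np.zeros((3, 2))])
    sol = np.linalg.solve(K, rhs)
    W = sol[:n]              # N x 2 non-linear coefficients
    c = sol[n]               # translation, length 2
    A = sol[n + 1:].T        # 2 x 2 linear part
    return c, A, W

def eval_tps(x_new, x_sites, c, A, W):
    """Evaluate f(x) = c + A x + W^T sigma(x) at each row of x_new."""
    sig = tps_kernel(x_new[:, None, :] - x_sites[None, :, :])    # M x N
    return c + x_new @ A.T + sig @ W

def bending_energy(x_sites, W):
    """W_1' S W_1 + W_2' S W_2 with S_ij = sigma(x_i - x_j)."""
    S = tps_kernel(x_sites[:, None, :] - x_sites[None, :, :])
    return np.trace(W.T @ S @ W)

# Hypothetical example: corners and center of the unit square, mildly displaced
x_sites = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
y_sites = x_sites + np.array([[0, 0], [0.1, 0], [0, 0.1], [-0.1, -0.1], [0.05, -0.05]])

c, A, W = fit_tps(x_sites, y_sites)
f_sites = eval_tps(x_sites, x_sites, c, A, W)       # reproduces y_sites
grid = np.array([[0.25, 0.25], [0.75, 0.5]])
f_grid = eval_tps(grid, x_sites, c, A, W)            # deformation at new locations
energy = bending_energy(x_sites, W)
```

The side conditions built into the linear system (the columns of `W` sum to zero and are orthogonal to the site coordinates) are what restrict `W` to the space orthogonal to the linear part, so the bending-energy quadratic form is non-negative there.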

Application to Languedoc-Roussillon Precipitation Data
• 108 altitude-adjusted, 10-day aggregate precip records at 39 sites (Nov-Dec, 1975-1992).
• Data log-transformed and site-specific means removed (for this analysis).
• Estimated deformation is non-linear: correlation stronger in the NE region, weaker in the SW.

[Figure: Languedoc-Roussillon precipitation sites]

[Figure: Estimated deformation of Languedoc-Roussillon region, panels (a) and (b), with numbered sites]

[Figure: Correlation vs distance in G-plane and D-plane]

[Figure: Equi-correlation (0.9) contours: D-plane (a) and G-plane (b)]

[Figure: Estimated (•) and predicted (◦) variances, $\nu(x_0)$, vs. observed temporal variances, with one predictive std dev bars]

[Figure: Assessment of (10-day aggregate) precipitation predictions at validation sites]

[Figure: Assessment of (10-day aggregate) precipitation predictions at validation sites (pred interval and obs)]

[Figure: SEASONAL (Apr-Oct) ozone sampling sites with 75% complete data each year, 1992-94, N = 257 (target counties highlighted)]

Spatio-temporal trend modeling: $\mu(x,t)$
• Standardize (center and scale) the data for each of N=750 sites over 3 years and assemble N x (3*364) data matrix ("tricks" to deal with missing data).
• Compute SVD and consider first 3-5 right singular vectors (EOFs); smooth these w.r.t. time.
• Use these temporal trend components as basis functions for estimation of site-specific trends by regression.
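
The three trend-modeling steps above can be sketched as follows in Python (NumPy). Everything specific here is an assumption for illustration: the data are synthetic, there are no missing values (so the "tricks" for missing data are not shown), the smoother is a simple moving average standing in for whatever smoother was actually used, and the number of sites is kept small.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites, n_days = 20, 3 * 364
t = np.arange(n_days)

# Synthetic stand-in data: a shared annual cycle with site-specific amplitude, plus noise
seasonal = np.sin(2 * np.pi * t / 364.0)
amp = rng.uniform(0.5, 2.0, size=n_sites)
Z = amp[:, None] * seasonal[None, :] + 0.3 * rng.standard_normal((n_sites, n_days))

# Step 1: standardize (center and scale) each site's series
Zs = (Z - Z.mean(axis=1, keepdims=True)) / Z.std(axis=1, keepdims=True)

# Step 2: SVD of the sites-by-time matrix; rows of Vt are right singular vectors (EOFs)
U, s, Vt = np.linalg.svd(Zs, full_matrices=False)
k = 3
eofs = Vt[:k]

def moving_average(v, window=31):
    """Centered moving average; a placeholder for the temporal smoother."""
    kernel = np.ones(window) / window
    return np.convolve(v, kernel, mode="same")

# Basis functions: an intercept plus the k smoothed EOFs
basis = np.column_stack([np.ones(n_days)] + [moving_average(v) for v in eofs])

# Step 3: site-specific trends by least-squares regression on the basis
coef, *_ = np.linalg.lstsq(basis, Z.T, rcond=None)    # (k + 1) x n_sites
trend = (basis @ coef).T                               # n_sites x n_days
residual = Z - trend
```

The per-site coefficient vectors in `coef` are the kind of quantity displayed in the starplots of temporal trend coefficients, and `residual` is the detrended series whose spatial covariance the deformation model then describes.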

[Figure: Annual trend components fitted to all sites: six panels of right singular vectors (Total.svd$svd$v[, j]) vs. dates, 1992 to 1994]

Region 6: Southern Calif (annual)

[Figure: Region 6, S Calif: starplot of temporal trend coefficients for sites 60379002, 60710012, 61113001, 61110007, 60731001, 60730001]

[Figure: sqrt(max 8hr O3) vs. date, 1992-1994, for sites 60730001, 60379002, 61110007, 60731001, 60710012, 61113001]

[Figure: Region 6, S Calif: circle radii proportional to (detrended) site variances]

[Figure: N=55 S Calif monitoring sites and their representation in a deformed coordinate system reflecting spatial covariance]

[Figure: N=55, S Calif: 4 samples from the posterior distribution of deformations reflecting spatial covariance]

[Figure: Region 6, S Calif: covariance vs. geographic distance (km) and vs. D-plane distance]

[Figure: Region 6, S Calif: correlation vs. geographic distance (km) and vs. D-plane distance]

Region 1: Northeast (seasonal)

[Figure: Region 1, Northeast: starplot of temporal trend coefficients for sites 360430005, 360530006, 250010002, 90110008, 510410004, 516500004]

[Figure: sqrt(max 8hr O3) vs. date, 1992-1994, for sites 510410004, 90110008, 360530006, 516500004, 250010002, 360430005]

[Figure: Region 1, Northeast: circle radii proportional to (detrended) site variances]

[Figure: 62 Northeast monitoring sites and their representation in a deformed coordinate system reflecting spatial covariance]

[Figure: N=62, Northeast: 4 samples from the posterior distribution of deformations reflecting spatial covariance]

[Figure: Region 1, Northeast: covariance vs. geographic distance (km) and vs. D-plane distance]

[Figure: Region 1, Northeast: correlation vs. geographic distance (km) and vs. D-plane distance]

Work in Progress
• Guidelines for MCMC estimation software, including better calibration of priors for spatial deformation and specification of MCMC proposal distributions.
• Incorporation of the spatio-temporal trend model in the software for spatial prediction.
• Incorporation of (simple) temporal autocorrelation structure.
• More flexible deformation modeling for nonstationary structure at multiple scales: "principal warp" decomposition of thin-plate splines and mappings $R^2 \to R^k$.
• Explicit modeling/incorporation of covariate effects, spatial (e.g. topography) and spatio-temporal (e.g. wind).

Some recent atmospheric science literature and proposals for spatio-temporal covariance models
• Desroziers, 1997. A coordinate change for data assimilation in spherical geometry of frontal structures. Monthly Weather Review.
  The main impact of this transformation in the framework of data assimilation is that it enables the use of anisotropic forecast correlations that are flow dependent.
• Riishojgaard, 1998. A direct way of specifying flow-dependent background correlations for meteorological analysis systems. Tellus.
• Weaver and Courtier, 2001. Correlation modelling on the sphere using a generalized diffusion equation. Quart. J. Royal Met Soc.
  Generalizations to account for anisotropic correlations are also possible by stretching and/or rotating the computational coordinates via a 'diffusion' tensor.
• Wu et al., 2002. 3-D variational analysis with spatially inhomogeneous covariances. Monthly Weather Review.

Incorporating covariates
• Carroll and Cressie, 1997. Geomorphic site attributes in correlation model for snow water equivalent in river basins. The correlation $c(s_1, s_2)$ combines an exponential function of $B\,\lVert s_1 - s_2 \rVert$ with covariate terms $C X_c$, $D X_d$, $E X_e$, $F X_f$, where the X's represent differences between the two sites in elevation, slope, tree cover, and aspect.
  Alternative: deform $R^2$ into a subspace of $R^6$.
• Riishojgaard, 1998. "Flow-dependent" correlation structures for meteorological analysis systems. For $z(s)$, a realization of a random field in $R^d$,

  $$c(s_1, s_2) = \rho_d\bigl(\lVert s_1 - s_2 \rVert\bigr)\, \rho_1\bigl(|z(s_1) - z(s_2)|\bigr),$$

  an embedding and deformation of the geographic coordinate space $R^d$ into $R^{d+1}$, with a separable, stationary correlation model fitted in the new coordinate space.

Covariance models for dynamic error structures in the context of data assimilation
• Cox and Isham, 1988: with $V$ a velocity vector in $R^2$, a physical model for rainfall leads to the space-time covariance function

  $$c(s_1, s_2; t_1, t_2) \propto E_V\, G\bigl(\lVert (s_2 - s_1) - V (t_2 - t_1) \rVert\bigr),$$

  where $G(r)$ denotes the area of intersection of two disks of unit radius with centers a distance $r$ apart.
• There are variants in the meteorological and hydrological literature: depending on tangent line in a barotropic model; using geostrophic or semigeostrophic coordinates; or working in a Lagrangian reference frame for convective rainstorms.
• These yield interesting anisotropic and nonstationary correlation models (cf. Desroziers, 1997). They suggest interesting space-time extensions of the current deformation approach and statistical model fitting questions.
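
As a small worked illustration of the Cox and Isham form, the sketch below (Python, NumPy) computes the overlap area $G(r)$ in closed form and approximates the expectation over $V$ by Monte Carlo. The Gaussian velocity distribution and the function names are assumptions made here for illustration; the result is unnormalized, consistent with the covariance being stated only up to proportionality.

```python
import numpy as np

def disk_overlap_area(r):
    """Area of intersection of two unit-radius disks whose centers are r apart.

    Closed form: 2*arccos(r/2) - (r/2)*sqrt(4 - r^2) for 0 <= r < 2, and 0 for r >= 2.
    """
    r = np.asarray(r, dtype=float)
    rc = np.clip(r, 0.0, 2.0)
    area = 2.0 * np.arccos(rc / 2.0) - 0.5 * rc * np.sqrt(4.0 - rc ** 2)
    return np.where(r >= 2.0, 0.0, area)

def cox_isham_cov(s1, s2, t1, t2, velocities):
    """Unnormalized covariance: average of G(||(s2 - s1) - V (t2 - t1)||) over velocity draws.

    velocities : (M, 2) array of draws of V (the distribution is an assumption here)
    """
    velocities = np.atleast_2d(np.asarray(velocities, dtype=float))
    lag = (np.asarray(s2, dtype=float) - np.asarray(s1, dtype=float)) - velocities * (t2 - t1)
    return float(disk_overlap_area(np.linalg.norm(lag, axis=1)).mean())

# Hypothetical velocity distribution: mean eastward drift with some spread
rng = np.random.default_rng(1)
V = rng.normal(loc=[1.0, 0.0], scale=0.2, size=(5000, 2))

same_place_later = cox_isham_cov([0, 0], [0, 0], 0.0, 1.0, V)   # fixed site, time lag 1
downstream_later = cox_isham_cov([0, 0], [1, 0], 0.0, 1.0, V)   # site displaced along the drift
```

With these assumed velocities, a site one unit downstream at a lag of one time unit has a larger covariance than the same site at the same lag. The covariance is therefore not separable and depends on direction relative to the flow.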