Satellite based PM concentrations and application to COPD in Cleveland
Supplementary Online Material on Local time-space Kriging

Theoretical Model - Local Time Space Kriging

Large environmental processes tend to exhibit non-stationary behavior (Liang and Kumar 2013). For example, air pollutants from static emission sources are transported by winds, leading to spatial, seasonal and diurnal variability (Kumar, Chu, Foster, Peters and Willis 2011). Similar interactions between emission sources and meteorological conditions generate processes that are non-stationary in both the mean and the covariance structure. The usual assumption of a stationary and separable spatiotemporal covariance is not flexible enough for large datasets, in terms of both computation and modeling. Local time-space Kriging (LTSK) using a neighborhood has been proposed to address both problems (Haas 1995, Gething, et al. 2007). We generalize the local Kriging approach with the nonseparable product-sum model (De Iaco, Myers and Posa 2001) and implement this method specifically for large data sets.

Let $Z(s, t)$ denote the observed Gaussian process defined over $\mathcal{S} \times T$, where $\mathcal{S}$ denotes the spatial domain and $t \in T$ indexes discrete timestamps. Let $(s_j, t_j)$, where $j = 1, \ldots, J$, denote the spatiotemporal locations where predictions are needed. We term $(s_j, t_j)$ a query point. The implementation of local Kriging requires a location-specific neighborhood, denoted by $N_j$. We assume that within the local neighborhood, data are second-order stationary with a non-separable spatiotemporal covariance function specified using the product-sum approach (De Iaco, et al. 2001) as

$$\gamma_{st}(h, u) = \gamma_s(h) + \gamma_t(u) - k\,\gamma_s(h)\,\gamma_t(u),$$

where $\gamma_s(h)$, for all spatial lags $h$, and $\gamma_t(u)$, for all temporal lags $u$, are isotropic variogram models with finite sills (De Iaco, et al. 2001), and $k$ is a strictly positive parameter. Let $\tau_s^2$, $\sigma_s^2$ and $a_s$ denote the nugget, partial sill and range parameter for the spatial variogram; the temporal parameters $\tau_t^2$, $\sigma_t^2$ and $a_t$ are defined similarly.

We incorporate substantive knowledge of the underlying process to specify the local neighborhood around each query point.
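As an aside on the product-sum model introduced above, the following is a minimal sketch of how two marginal variograms can be combined into a space-time variogram, assuming the generalized product-sum form of De Iaco et al. (2001). The function names, the exponential marginal model and all numeric values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def exponential_variogram(lag, nugget, partial_sill, range_):
    """Exponential variogram: 0 at lag 0, approaching nugget + partial_sill."""
    lag = np.asarray(lag, dtype=float)
    value = nugget + partial_sill * (1.0 - np.exp(-lag / range_))
    return np.where(lag > 0, value, 0.0)

def product_sum_k(sill_s, sill_t, sill_st):
    """Parameter k implied by the spatial, temporal and global sills."""
    return (sill_s + sill_t - sill_st) / (sill_s * sill_t)

def product_sum_variogram(gamma_s, gamma_t, k):
    """Space-time variogram from marginal values: gs + gt - k * gs * gt."""
    return gamma_s + gamma_t - k * gamma_s * gamma_t

# Hypothetical sills, chosen so that 0 < k <= 1 / max(sill_s, sill_t).
k = product_sum_k(sill_s=1.0, sill_t=0.5, sill_st=1.2)

# Marginal variogram values at a spatial lag of 10 and a temporal lag of 1
# (units and parameter values are made up for illustration).
gs = exponential_variogram(10.0, nugget=0.1, partial_sill=0.9, range_=25.0)
gt = exponential_variogram(1.0, nugget=0.05, partial_sill=0.45, range_=3.0)
gamma_st = product_sum_variogram(gs, gt, k)
```

When both marginal variograms reach their sills, the combined variogram reaches the global sill, which is why k is tied to the three sills in this sketch.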
Let $H$ and $U$ denote the distance and time thresholds that we take as upper limits of the local spatial and temporal ranges, respectively. In the case of aerosol optical depth (AOD) data, such knowledge can be based on the life cycle of the aerosol and the underlying spatial resolution of the satellite data. Accordingly, we define a cylinder around each query point $(s_j, t_j)$:

$$D_j = \{(s, t) : \|s - s_j\| \le H,\ |t - t_j| \le U\}.$$

The strength of the underlying spatiotemporal correlations is allowed to vary both spatially and temporally over query points. We thus estimate a location-specific variogram with the procedure described below. Given the estimated spatial and temporal range parameters, denoted by $\hat{a}_{s,j}$ and $\hat{a}_{t,j}$ respectively, we define the local neighborhood as a subset of $D_j$:

$$N_j = \{(s, t) \in D_j : \|s - s_j\| \le \hat{a}_{s,j},\ |t - t_j| \le \hat{a}_{t,j}\}.$$

We thus identify a local spatiotemporal domain where the process at the observed locations is highly correlated with the process realized at query point $j$, conditional on the estimated variogram.

In practice, the neighborhood specification faces two challenges: sparse data and the computational burden of large data. In the presence of missing data, narrow spatiotemporal lags can leave too few data points to estimate the time-space covariance function accurately (Kyriakidis and Journel 1999). This in turn produces discontinuities in the prediction across time and space. For example, satellite-based air pollution data contain systematic gaps (due to cloud cover and/or data contamination), and sufficient data points may not be available if small time-space lags are used to define the neighborhood. To ensure an adequate sample size, we divide the local neighborhood into non-overlapping spatiotemporal voxels, termed cubes hereafter. Let $n_s$ and $n_t$ denote the number of distinct cubes across space and time; we proceed with the following procedure only when $n_s \ge L_s$ and $n_t \ge L_t$ for given lower limits $L_s$ and $L_t$. For demonstration purposes we set [...].
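The cylinder selection and the cube-count check described above can be sketched as follows. This is a minimal illustration under stated assumptions: coordinates are planar, cubes are formed by a regular grid of width `cell` in space and `dt` in time, and the function names and thresholds are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def cylinder_mask(xy, t, query_xy, query_t, H, U):
    """True for observations within distance H and time lag U of the query point."""
    dist = np.hypot(xy[:, 0] - query_xy[0], xy[:, 1] - query_xy[1])
    return (dist <= H) & (np.abs(t - query_t) <= U)

def occupied_cube_counts(xy, t, cell, dt):
    """Number of distinct spatial cells and distinct time slices that hold data."""
    ix = np.floor(xy[:, 0] / cell).astype(int)
    iy = np.floor(xy[:, 1] / cell).astype(int)
    it = np.floor(t / dt).astype(int)
    n_space = len(set(zip(ix.tolist(), iy.tolist())))
    n_time = len(np.unique(it))
    return n_space, n_time

def has_enough_data(xy, t, cell, dt, min_space, min_time):
    """Proceed only if both counts reach their (assumed) lower limits."""
    n_space, n_time = occupied_cube_counts(xy, t, cell, dt)
    return n_space >= min_space and n_time >= min_time
```

A caller would first apply `cylinder_mask` to the full data set and then pass the retained observations to `has_enough_data` before fitting a local variogram.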
A wider specification of spatiotemporal lags can result in too many data points within the local neighborhood, especially for satellite data. The computation of the empirical variogram described below and the Kriging operation involve working with a dense $n_j \times n_j$ matrix, where $n_j$ denotes the number of neighbors of query point $j$. If $n_j$ is large, repeated inversion requires $O(n_j^3)$ computation and the implementation becomes computationally expensive for a large number of query points. Dimension reduction and subsampling have been used to make Kriging feasible for large data sets (Vecchia 1988, Rennen 2009, Cressie, Shi and Kang 2010). Using spatial sampling design methods, we reduce the number of neighbors by sub-sampling within the cylinder around the query point. This requires setting an upper limit on the number of data points ($L_n$). If $n_j$ exceeds $L_n$, we sample $L_n$ data points from the total possible neighbors in a spatiotemporally balanced way. Balanced sampling achieves good predictive performance for the underlying process (Stevens and Olsen 2004). Specifically, we divide the space and time domain into non-overlapping cubes and sample from each cube. Let $p_k$ denote the sampling probability of cube $k$; we assign $p_k$ based on the number of points in each local spatiotemporal neighborhood. Even though this approach discards some data points, the loss of information is relatively small because of the strong spatiotemporal autocorrelation within the neighborhood. The Kriging operation then requires at most $O(L_n^3)$ computation per query point, so the total cost depends only on the number of query points.

We estimate the empirical variogram using the classical method of moments. We suppress the dependency upon query point $j$ for brevity of notation. We define the empirical variogram as

$$\hat{\gamma}(h, u) = \frac{1}{2\,|N(h, u)|} \sum_{(i, i') \in N(h, u)} \left[ Z(s_i, t_i) - Z(s_{i'}, t_{i'}) \right]^2,$$

where $N(h, u) = \{(i, i') : \|s_i - s_{i'}\| \in h,\ |t_i - t_{i'}| \in u\}$, and $h$ and $u$ denote the distance and time bin. We consider distance bins up to [...] of the maximum distance and time lag. The number of bins must be sufficient to estimate the empirical variogram. We use [...] bins for distance and a single time point as the time bin.
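The method-of-moments estimator above can be sketched as follows. This is an illustrative version only: the half-open bins defined by `dist_edges` and `time_edges`, the planar distance, and the function name are assumptions, and the pair counts are returned because they can serve as weights in a later least-squares fit.

```python
import numpy as np

def empirical_st_variogram(xy, t, z, dist_edges, time_edges):
    """Half the mean squared difference of z over pairs in each (distance, time) bin."""
    i, j = np.triu_indices(len(z), k=1)          # every unordered pair once
    dist = np.hypot(xy[i, 0] - xy[j, 0], xy[i, 1] - xy[j, 1])
    lag_t = np.abs(t[i] - t[j])
    sq_diff = (z[i] - z[j]) ** 2

    # Bin index b means edges[b] <= value < edges[b + 1]; out-of-range pairs
    # get -1 or len(edges) - 1 and are never matched below.
    d_bin = np.digitize(dist, dist_edges) - 1
    t_bin = np.digitize(lag_t, time_edges) - 1

    n_d, n_t = len(dist_edges) - 1, len(time_edges) - 1
    gamma = np.full((n_d, n_t), np.nan)          # NaN marks empty bins
    n_pairs = np.zeros((n_d, n_t), dtype=int)
    for a in range(n_d):
        for b in range(n_t):
            in_bin = (d_bin == a) & (t_bin == b)
            n_pairs[a, b] = in_bin.sum()
            if n_pairs[a, b] > 0:
                gamma[a, b] = 0.5 * sq_diff[in_bin].mean()
    return gamma, n_pairs
```

Forming all pairs costs memory quadratic in the number of neighbors, which is one practical reason to cap the neighborhood size before this step.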
With thousands of query points, manual estimation of the variogram is challenging. We propose an automatic way to estimate the variogram. Let $h_0$ and $u_0$ denote the smallest distance and time lags. We estimate the spatial and temporal variograms $\gamma_s(h)$ and $\gamma_t(u)$ from $\hat{\gamma}(h, u_0)$ and $\hat{\gamma}(h_0, u)$ respectively (De Cesare, Myers and Posa 2002). We consider four candidate variogram models with finite sills: exponential, spherical, Gaussian and Matérn. The range parameter is estimated as the first lag at which the empirical variogram exceeds 80% of its maximum. Such rough estimates in general do not affect the predictive performance of Kriging (Zhang and Wang 2010). The nugget and sill parameters are estimated by weighted least squares, with weights proportional to the number of pairs of data in each lag. The variogram model with the smallest squared deviation between the empirical estimates and the fitted values is chosen for $\gamma_s$ and $\gamma_t$ respectively. Let $\hat{c}_s$ and $\hat{c}_t$ denote the estimated sill parameters. We estimate the global sill $\hat{c}_{st}$ from the sample variance of the data points, with a correction for the positive bias due to the correlation between these data points (Cressie 1988). These preliminary estimates are adjusted because they may produce an invalid variogram model. Specifically, we adjust the sill estimates so that $\max(\hat{c}_s, \hat{c}_t) \le \hat{c}_{st} < \hat{c}_s + \hat{c}_t$. The parameter $\hat{k} = (\hat{c}_s + \hat{c}_t - \hat{c}_{st}) / (\hat{c}_s \hat{c}_t)$. The resulting covariance function is

$$\hat{C}_{st}(h, u) = \hat{c}_{st} - \hat{\gamma}_s(h) - \hat{\gamma}_t(u) + \hat{k}\,\hat{\gamma}_s(h)\,\hat{\gamma}_t(u).$$

Prediction of the process at each query point then follows from ordinary Kriging, using all data from its local neighborhood with the estimated covariance function. If prediction of the observed process over a region is desired, we use the above covariance estimates with block Kriging to adjust for the point-to-areal misalignment (Cressie 1993).

References

Cressie, N. (1988), "Spatial Prediction and Ordinary Kriging," Mathematical Geology, 20, 405-421.
Cressie, N. (1993), Statistics for Spatial Data (Vol. 2), New York: John Wiley & Sons, Inc.
Cressie, N., Shi, T., and Kang, E. L. (2010), "Fixed Rank Filtering for Spatio-Temporal Data," Journal of Computational and Graphical Statistics, 19, 724-745.
De Cesare, L., Myers, D. E., and Posa, D. (2002), "Fortran Programs for Space-Time Modeling," Computers & Geosciences, 28, 205-212.
De Iaco, S., Myers, D. E., and Posa, D. (2001), "Space-Time Analysis Using a General Product-Sum Model," Statistics & Probability Letters, 52, 21-28.
Gething, P. W., et al. (2007), "A Local Space-Time Kriging Approach Applied to a National Outpatient Malaria Data Set," Computers & Geosciences, 33, 1337-1350.
Haas, T. C. (1995), "Local Prediction of a Spatio-Temporal Process with an Application to Wet Sulfate Deposition," Journal of the American Statistical Association, 90, 1189-1199.
Kumar, N., Chu, A. D., Foster, A. D., Peters, T., and Willis, R. (2011), "Satellite Remote Sensing for Developing Time and Space Resolved Estimates of Ambient Particulate in Cleveland, OH," Aerosol Science and Technology, 45, 1090-1108.
Kyriakidis, P. C., and Journel, A. G. (1999), "Geostatistical Space-Time Models: A Review," Mathematical Geology, 31, 651-684.
Liang, D., and Kumar, N. (2013), "Time-Space Kriging to Address the Problems of Misalignment, Mismatch and Missing Values in Spatiotemporal Datasets," Atmospheric Environment, 72, 60-69.
Rennen, G. (2009), "Subset Selection from Large Datasets for Kriging Modeling," Structural and Multidisciplinary Optimization, 38, 545-569.
Stevens, D. L., and Olsen, A. R. (2004), "Spatially Balanced Sampling of Natural Resources," Journal of the American Statistical Association, 99, 262-278.
Vecchia, A. V. (1988), "Estimation and Model Identification for Continuous Spatial Processes," Journal of the Royal Statistical Society Series B-Methodological, 50, 297-312.
Zhang, H., and Wang, Y. (2010), "Kriging and Cross-Validation for Massive Spatial Data," Environmetrics, 21, 290-304.