The Effect of Spatial Covariance Heterogeneity on Prediction Variance

J. Andrew Royle* and Doug Nychka

Abstract. Stationarity of a random field is an important assumption required for many spatial statistical analyses. A part of this assumption is homogeneity (translation invariance) of the correlation function. False assumption of homogeneity leads directly to misspecification of the spatial covariance, and hence the usual kriging predictor is no longer BLUE. In this paper we examine the effect of homogeneous misspecification of a heterogeneous correlation function on prediction standard errors, using two cases from a class of heterogeneous models that are spatially weighted mixtures of homogeneous random fields. It is found that misspecification leads to an average loss of efficiency in prediction of up to 20% for extreme cases, and that misspecification of the prediction variance generally leads to conservative prediction intervals.

1. INTRODUCTION

Kriging is a commonly used method for making spatial predictions. One of the primary reasons for the use of kriging over other methods of spatial prediction is that prediction standard errors are easily obtained for kriging and that the kriging predictor is BLUE. Unfortunately, the BLUE property holds only under the assumption that the spatial covariance is known, which is not likely to be the case in practice (i.e. there is generally a misspecification of the covariance function). Kriging predictions are known to be fairly robust to misspecification of the spatial covariance structure (see Cressie and Zimmerman, 1992). Several authors have addressed the problem of stability of the kriging variance. In particular, Stein (1988, 1989) examined the effect of misspecification of the covariance structure on prediction variance.
For a particular (fairly general) model, he derived a bound on the ratio of the optimal prediction variance to the prediction variance for the predictor based on a misspecified covariance. More generally, he has shown that the misspecified prediction variance is asymptotically equivalent to the minimum prediction variance, assuming the true and misspecified covariance functions are compatible (see Stein and Handcock (1989) for a discussion of this condition).

Here we are interested in the departure of a random field from the assumption of translation invariance of the spatial correlation. Failure to detect departure from this assumption leads directly to misspecification of the covariance function, and we quantify the effects of this misspecification on prediction variance under a class of heterogeneous random field models.

*Both authors: Department of Statistics, North Carolina State University, Raleigh NC

2. STATIONARITY AND HETEROGENEITY

Let $Z = \{Z(x) : x \in \mathbb{R}^d\}$ be a random field in $d$ dimensions, from which we observe $z = \{z(x_i) : 1 \le i \le p\}$, where $x$ represents a vector of coordinate locations in space (e.g. $x = (x_1, x_2)$ in $\mathbb{R}^2$). $Z$ is second-order stationary if

$$E(Z(x)) = \mu, \qquad \mathrm{Var}(Z(x)) = \sigma^2 < \infty,$$

and

$$\mathrm{Corr}(Z(x), Z(x')) = \rho(x - x') \quad \forall\, x, x' \in \mathbb{R}^d.$$

Thus under second-order stationarity the mean and covariance do not depend on the location of the points in space (i.e. translation invariance of the first and second moments of $Z$). Concern over the validity of the stationarity assumption generally focuses on constancy of the mean and variance; however, homogeneity (translation invariance) of the spatial correlation is also implied by stationarity. A heterogeneous random field, then, is one in which the correlation function is not invariant to translation.

2.1 A Class of Heterogeneous Random Field Models

To examine the effect of heterogeneity on prediction variance it is necessary to specify a random field model with heterogeneous covariance structure.
Here we consider special cases from the class of models that are spatially weighted mixtures of homogeneous random fields. The 2 component mixture is given by

$$Z(x) = a(x)U(x) + b(x)V(x). \qquad (1)$$

Here the sampled field $Z$ is a spatially weighted sum of the homogeneous random fields $U$ and $V$ with covariance functions $c_u$ and $c_v$ and cross-covariance $c_{uv}$. The nature of the spatial weights $a(x)$ and $b(x)$ determines the manner in which the covariance between two points depends on their relative locations. Note that the covariance between $Z$ at $x$ and $x'$ under (1) is

$$\mathrm{Cov}(Z(x), Z(x')) = a(x)a(x')c_u(x - x') + b(x)b(x')c_v(x - x') + \big(a(x)b(x') + a(x')b(x)\big)c_{uv}(x - x'). \qquad (2)$$

The covariance between two points depends explicitly on their locations $x$ and $x'$ via the weights $a(x)$ and $b(x)$, and hence $Z$ is a heterogeneous random field.

Simple cases of the mixture model are the heterogeneous variance model, $Z(x) = a(x)U(x)$, where $\mathrm{Var}(Z(x))$ is proportional to $a(x)^2$, and the heterogeneous measurement error model, $Z(x) = U(x) + b(x)V(x)$, where $V(x)$ is measurement error with constant variance and $Z(x)$ has measurement error (i.e. "nugget") variance proportional to $b(x)^2$.

When the spatial weights are disjoint indicator functions (i.e. either $U(x)$ or $V(x)$ is observed at each $x$) and $c_{uv} = 0$, then $Z$ is homogeneous within subregions, and $Z$ has been called pseudo-stationary, or locally stationary. This model is equivalent to that implied by partitioning a region into disjoint (homogeneous) subregions, a common solution to heterogeneity in classical geostatistics (Guttorp and Sampson, 1994). In this study, we consider the 2 subregion case further (Model 1, described below). A similar model is the relative variogram model of Cressie (1991), except that the latter assumes a proportional relationship between the covariance and mean within each subregion.
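As a worked special case of (2), offered as an illustration rather than as part of the original derivation, suppose the weights are disjoint indicators, $a(x) = I(x \in A)$ and $b(x) = 1 - a(x) = I(x \in B)$, and $c_{uv} = 0$. The labels $A$ and $B$ are placeholder names for the two subregions, introduced only for this sketch. Every product of weights in (2) is then either 0 or 1, and the covariance reduces to:

```latex
\operatorname{Cov}\bigl(Z(x), Z(x')\bigr) =
\begin{cases}
  c_u(x - x') & \text{if } x, x' \in A, \\
  c_v(x - x') & \text{if } x, x' \in B, \\
  0           & \text{if } x \text{ and } x' \text{ lie in different subregions.}
\end{cases}
```

Within each subregion the covariance depends only on the lag $x - x'$, which is the sense in which such a field is locally stationary, while observations in different subregions are uncorrelated.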
Taking the spatial weights to be overlapping (but not constant) produces a reasonable range of heterogeneous behavior, and this is the second situation that will be examined here (Model 2, described below). Moving-window kriging of Haas (1990) implies an underlying mixture model of this sort but with more than 2 components. Also related is the common (homogeneous) geostatistical "linear model of coregionalization" (Isaaks and Srivastava, 1989), which takes constant weight on each component of the mixture, and hence the covariance is just a (weighted) sum of 2 or more covariance functions.

Let $Z$ be defined over the region, denoted $X$, containing the 6 x 5 lattice of sample points shown in Figure 1(a). Define subregions of $X$ as $S = \{x : x_2 < 3.5\}$ and $N = \{x : x_2 \ge 3.5\}$, and hence $X = S \cup N$. The two subregions of $X$ are delineated by the broken line of Figure 1(a). The 2 heterogeneous models that we consider are defined as follows.

Model 1: Take $U$ and $V$ to be independent (i.e. $c_{uv} = 0$) and to have isotropic exponential covariance functions, but with (possibly) different exponential parameters. So,

$$c_u(\|x - x'\|) = e^{-\theta_u \|x - x'\|} \quad \text{and} \quad c_v(\|x - x'\|) = e^{-\theta_v \|x - x'\|}.$$

Note that $\theta_u = -\log(\rho_u(1))$ and $\theta_v = -\log(\rho_v(1))$, so that we may describe the covariance functions of $U$ and $V$ in terms of the "nearest-neighbor" correlations $\rho_u(1)$ and $\rho_v(1)$. These are the correlations between two points separated by distance 1 in Figure 1(a). The weighting scheme takes $a(x)$ to be the indicator function of one of the two subregions and $b(x) = 1 - a(x)$. This case is relatively uninteresting, but serves as a "worst case" scenario, under which we might expect misspecification of the covariance structure to have the largest effect. The heterogeneous correlogram of $Z$ on the 6 x 5 lattice (which is equivalent to the covariogram in this case) is shown as the solid lines in Figure 1(b). One could generalize this situation to allow cross-correlation between $U$ and $V$; an example of such a covariance function is shown as the solid lines in Figure 1(c).
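The construction of the Model 1 covariance matrix can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the lattice coordinates (x1 = 1..5, x2 = 1..6), the choice that U is the field observed on S, the values 0.8 and 0.3 for the nearest-neighbor correlations, and the function names `lattice` and `mixture_cov` are all assumptions made for the example.

```python
import numpy as np

def lattice(n_cols=5, n_rows=6):
    """Assumed 6 x 5 lattice: x1 = 1..5, x2 = 1..6 (30 points)."""
    x1, x2 = np.meshgrid(np.arange(1, n_cols + 1), np.arange(1, n_rows + 1))
    return np.column_stack([x1.ravel(), x2.ravel()]).astype(float)

def mixture_cov(pts, a, b, rho_u, rho_v):
    """Covariance (2) with c_uv = 0 and exponential c_u, c_v.

    rho_u, rho_v are the nearest-neighbor (lag 1) correlations, so the
    exponential parameters are theta = -log(rho).
    """
    theta_u, theta_v = -np.log(rho_u), -np.log(rho_v)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    c_u = np.exp(-theta_u * d)
    c_v = np.exp(-theta_v * d)
    return np.outer(a, a) * c_u + np.outer(b, b) * c_v

pts = lattice()
in_S = pts[:, 1] < 3.5        # subregion S = {x : x2 < 3.5}
a = in_S.astype(float)        # assumption: U is observed on S
b = 1.0 - a                   # V is observed on N
C1 = mixture_cov(pts, a, b, rho_u=0.8, rho_v=0.3)
```

Under these assumptions `C1` is block diagonal: pairs of points within S follow the exponential covariance of U, pairs within N follow that of V, and pairs that straddle the broken line have covariance zero.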
Alternatively, we may induce a similar heterogeneous structure by overlapping the spatial weights, which gives us our second case.

Model 2: Taking $U$ and $V$ as in Model 1, build dependence among all $Z(x)$ by superimposing the $V$ field onto the $U$ field (but only over part of the region, producing the heterogeneity). Here the weighting scheme takes $a(x) = 1,\ \forall x \in X$, and $b(x) = 1$ for $x \in S$, $b(x) = 0$ otherwise. Thus, the covariance among pairs of points that both lie in $S$ contains the additive term $c_v(\|x - x'\|)$. The covariance among all other points is simply given by $c_u(\|x - x'\|)$. The correlogram of $Z$ for this case is shown as the solid line in Figure 1(d).

The heterogeneous models shown in Figures 1(c) and (d) are functionally very similar. That is, they both build dependence among all $Z(x)$, but do so in a slightly different manner.

We will define the misspecified covariance function, $c_m$, to be the least-squares projection of the true covariance structure onto a homogeneous exponential covariance (i.e. the least-squares fit to the model $c_m(\|x - x'\|) = \beta_1 e^{-\beta_2 \|x - x'\|}$). For Model 1 and Model 2, the misspecified homogeneous covariance functions are shown as the broken lines in Figure 1(b) and (d), respectively.

3. STUDY DESIGN

Let $Z$ be a mixture of two homogeneous Gaussian random fields with $E(U(x)) = E(V(x)) = 0$. For this study, we choose a mixture of 2 homogeneous fields for convenience alone. The covariance between $Z(x)$ and $Z(x')$ is given by (2) and is denoted $c(x, x')$.

Figure 1: (a) Location of 30 sample points used in studying the effect of heterogeneous correlation. $X$ is the region containing these points and the subregions $S$ and $N$ are delineated by the broken line. (b) Heterogeneous correlation of 30 points under Model 1 with $c_{uv} = 0$. (c) Heterogeneous correlation of 30 points under Model 1 with $\rho_{uv}(1) = .55$. (d) Heterogeneous correlation of 30 points under Model 2. The horizontal axis in panels (b) to (d) is lag distance. Broken line in all panels is the misspecified homogeneous correlation function.
For all cases, $\rho_u(1) = .8$ and $\rho_v(1) = .3$.

For $Z(x)$ corresponding to the $p$ locations, $x$, at which the random field is measured we have the $p \times p$ matrix $\mathrm{Var}(Z(x)) = C$, where $C_{ij} = c(x_i, x_j)$. For any $x_0 \in X$, we have the $p$-vector $\mathrm{Cov}(Z(x_0), Z(x)) = c_0$. The $p$ locations where $Z$ is observed are given by the 6 x 5 lattice shown in Figure 1(a). For $x_0 \in X$, the kriging predictor of $Z(x_0)$ is

$$\hat{Z}(x_0) = \lambda' Z(x),$$

where the vector $\lambda$ contains the "kriging weights", which are computed using either the true covariance function defined by Model 1 or Model 2, or the misspecified covariance structure determined by $c_m$. These give the optimal or the misspecified predictor, respectively, with kriging weights $\lambda$ and $\lambda_m$. The prediction variance is given by

$$\mathrm{Var}(Z(x_0) - \hat{Z}(x_0)) = 1 - 2\lambda' c_0 + \lambda' C \lambda,$$

and the prediction standard error (PSE) is the square root of the prediction variance. The prediction variance for either the optimal or the misspecified predictor may be based on either the true heterogeneous covariance function, $c$, or the misspecified homogeneous covariance function, $c_m$. For the PSE associated with the prediction of $Z(x_0)$, 3 different estimates will be denoted $PSE_{opt}(x_0)$, $PSE_r(x_0)$, and $MPSE(x_0)$. The 3 estimators are defined as:

1. $PSE_{opt}$ - The optimal estimator, using the true covariance function $c$ in the predictor and prediction variance estimator.
2. $PSE_r$ - The correct estimator of PSE for the predictor based on the misspecified covariance, $c_m$.
3. $MPSE$ - The incorrect estimator of PSE for the predictor based on the misspecified covariance, $c_m$.

Note that the kriging weights and covariance are indexed by $m$ for the misspecified covariance function. The estimator $PSE_r$ (the subscript $r$ for reference) serves as the "best possible" estimator given the incorrect predictor (i.e. using the predictor which is not optimal, but using the correct variance for the prediction error). Since $PSE_{opt}$ is the optimal estimator of PSE, we know that $PSE_r(x) \ge PSE_{opt}(x),\ \forall x \in X$.
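The three estimators can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the authors' implementation: it assumes simple kriging with known zero mean, so that the weights are the solution of $C\lambda = c_0$ (the text does not state the kriging variant), and unit variance at the prediction location, as in the prediction variance formula above. The four sites, the prediction point, and the exponential parameters are hypothetical toy values, and the names `exp_cov` and `three_pse` are introduced only for this example.

```python
import numpy as np

def exp_cov(pts_a, pts_b, theta):
    """Isotropic exponential covariance exp(-theta * distance)."""
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    return np.exp(-theta * d)

def three_pse(C, c0, Cm, cm0):
    """Return (PSE_opt, PSE_r, MPSE) for one prediction location.

    C, c0   : true covariance matrix and true covariance vector
    Cm, cm0 : misspecified counterparts
    Assumes unit variance at the prediction location.
    """
    lam = np.linalg.solve(C, c0)       # weights from the true covariance
    lam_m = np.linalg.solve(Cm, cm0)   # weights from the misspecified covariance

    def pred_var(w, cov, cov0):
        return 1.0 - 2.0 * w @ cov0 + w @ cov @ w

    pse_opt = np.sqrt(pred_var(lam, C, c0))     # right weights, right covariance
    pse_r = np.sqrt(pred_var(lam_m, C, c0))     # wrong weights, right covariance
    mpse = np.sqrt(pred_var(lam_m, Cm, cm0))    # wrong weights, wrong covariance
    return pse_opt, pse_r, mpse

# Toy example (hypothetical values): four sites, one prediction point.
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x0 = np.array([[0.3, 0.8]])
C, c0 = exp_cov(sites, sites, 0.2), exp_cov(sites, x0, 0.2)[:, 0]
Cm, cm0 = exp_cov(sites, sites, 1.0), exp_cov(sites, x0, 1.0)[:, 0]
pse_opt, pse_r, mpse = three_pse(C, c0, Cm, cm0)
```

The only difference between `pse_r` and `mpse` is which covariance is plugged into the variance formula; both use the misspecified weights. `pse_r` can never fall below `pse_opt`, whereas `mpse` carries no such guarantee.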
Moreover, since $MPSE$ is not the correct PSE estimator for the predictor based on the misspecified covariance, it may be that $MPSE(x) < PSE_{opt}(x)$.

We use these 3 estimators of PSE to compute prediction standard errors on a 40 x 40 grid of points over the region containing the 30 sample points of Figure 1(a). The prediction standard errors are computed using only the 30 sample points, and we examine summaries of the 3 PSE estimators. In particular, define the average efficiency of $PSE_r$ to $PSE_{opt}$, denoted Eff(r, opt) [defining equation illegible in the scanned source]. This is simply a measure of the relative precision of the misspecified predictor to the optimal predictor. Similarly, to quantify the worst precision over $X$, define the efficiency of the worst prediction, denoted Effmax(r, opt) [defining equation illegible in the scanned source]. These summaries provide an assessment of the predictor based on the misspecified covariance function. That is, under misspecification and assuming the correct prediction variance is used, they provide insight into the variability of the predictions relative to the optimal.

However, supposing the heterogeneous covariance structure is misspecified, the likely estimator of prediction standard error is $MPSE$, and a useful measure of the cost of misspecification is the actual coverage probability of a 95% confidence interval for a prediction using $MPSE$, averaged over all predictions. For any $x \in X$, the ratio $MPSE(x)/PSE_r(x)$ is the relative inflation in the width of a confidence interval as a result of using the misspecified PSE estimator. The coverage probability of a 95% confidence interval for the prediction is

$$CP(x) = \Pr\!\left(Z(x) < \frac{MPSE(x)}{PSE_r(x)} \times 1.96\right) - \Pr\!\left(Z(x) < -\frac{MPSE(x)}{PSE_r(x)} \times 1.96\right),$$

where, in this expression only, $Z(x)$ denotes the standardized prediction error, which is standard normal under the Gaussian model. Hence, if $MPSE(x)/PSE_r(x) = 1$, the coverage probability is 0.95. The average coverage probability over $X$, ACP, is the average of $CP(x)$ over all prediction locations. Values of ACP greater than 0.95 indicate conservative confidence intervals (i.e. less precise predictions), and for ACP less than 0.95, predictions are too precise.
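The coverage calculation above can be sketched with the standard library alone. For a standard normal variate, $\Pr(|Z| < t) = \mathrm{erf}(t/\sqrt{2})$, so $CP(x)$ depends only on the ratio $MPSE(x)/PSE_r(x)$. The function names `coverage_probability` and `average_coverage` are illustrative, and the average is taken as a plain mean over the prediction locations, which is an assumption about how the averaging is carried out.

```python
import math

def coverage_probability(ratio, z=1.96):
    """Actual coverage of a nominal 95% interval of half-width z * MPSE,
    when the true standard error is PSE_r and ratio = MPSE / PSE_r."""
    return math.erf(z * ratio / math.sqrt(2.0))

def average_coverage(mpse, pse_r, z=1.96):
    """Plain mean of CP(x) over the prediction locations (assumed form of ACP)."""
    cps = [coverage_probability(m / r, z) for m, r in zip(mpse, pse_r)]
    return sum(cps) / len(cps)

cp_exact = coverage_probability(1.0)    # MPSE equals the true PSE
cp_wide = coverage_probability(1.2)     # MPSE overstates the PSE: conservative
cp_narrow = coverage_probability(0.8)   # MPSE understates the PSE: liberal
```

A ratio above 1 widens the interval and pushes the coverage above 0.95, while a ratio below 1 does the opposite, which is the sense in which ACP above or below 0.95 signals conservative or overly precise intervals.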
These summaries are computed for values of $\rho_u(1) = .1, .2, \ldots, .9$ and $\rho_v(1) = .1, .2, \ldots, .9$, for each of the 2 models defined in Section 2.1. Recall that $\rho_u(1)$ and $\rho_v(1)$ are the nearest-neighbor correlations of the homogeneous $U$ and $V$ fields which comprise the heterogeneous mixture model. Thus, for each heterogeneous case, there are 81 situations, ranging from very different dependence structure (e.g. $\rho_u(1) = .1$ and $\rho_v(1) = .9$) to identical (but independent) structure ($\rho_u(1) = \rho_v(1)$, $\rho_{uv}(1) = 0$). Efficiency and ACP will be presented as contour plots as a function of $\rho_u(1)$ and $\rho_v(1)$.

4. RESULTS AND CONCLUSIONS

Figure 2(a) indicates that the efficiency of prediction is affected very little by misspecification of the heterogeneous covariance structure under Model 1. That is, the mean prediction efficiency is near 1 except for high spatial dependence (nearest-neighbor correlations > 0.8), in which case the prediction standard errors are only 20% larger for predictions based on the misspecified predictor. For the worst prediction under misspecification (Effmax), Figure 2(e) indicates inflation of PSE by 20 to 120% for levels of spatial correlation > 0.50.

Since it is unlikely that one would use $PSE_r$ as the estimator for prediction standard error if the heterogeneous covariance were misspecified, perhaps the more meaningful result is that concerning the average coverage probability using $MPSE$. The ACP is given in Figure 2(b), where it is seen that, using $MPSE$ to estimate PSE, one achieves very conservative confidence intervals for most situations under Model 1. That is, the ACP for a 95% confidence interval is generally greater than 0.95, even approaching 1 for high spatial dependence.

For Model 2, Figure 2(c) indicates efficiency of the misspecified predictor near 1, and hence the misspecified spatial predictor is nearly optimal.
In contrast to Model 1, Figure 2(f) indicates that for the worst predictions made under misspecification of Model 2 the increase in maximum PSE is only 20 to 40% for very high levels of spatial correlation (nearest-neighbor correlations > .80) (compare with Figure 2(e)). Also in contrast to Model 1, the ACP using the misspecified estimator of PSE (MPSE) under Model 2 (Figure 2(d)) is generally less than 0.95 (i.e. liberal coverage). This occurs when the nearest-neighbor correlations of the overlapping $U$ and $V$ are very different (e.g. $\rho_u(1) = .6$ and $\rho_v(1) = .2$). For high and/or similar nearest-neighbor correlations, the ACP is $\ge 0.95$, indicating conservative coverage of the confidence intervals for prediction.

By concentrating on a particular class of heterogeneous models for spatial correlation, defined to be a spatially weighted mixture of homogeneous random fields with exponential covariance functions, we have been able to quantify the effect of departure from homogeneity on spatial prediction uncertainty. One problem with studying the effect of heterogeneity is that it is not clear how to describe the phenomenon in a manner that allows generalization of results, such as those presented here, to arbitrary covariance structure. Our choice of the mixture model is fortuitous since this is a heterogeneous analog to the class of models examined by Stein (1989), which are of the form

$$\mathrm{Cov}(Z(s), Z(s')) = a_1 c_u(s - s') + b_1 c_v(s - s'),$$

and were used by him to study the effect of misspecification of (stationary) covariance structure. Thus it is possible that results such as these may be extended to include heterogeneity of the form described here, but with an arbitrary covariance model (i.e. not restricted to the exponential class). This is an area of future work.

Figure 2: (a) Mean efficiency of the misspecified predictor under Model 1 [panel title: Model 1 Eff(r,opt)]. (b) ACP under Model 1 for 95% confidence intervals using the estimator of prediction standard error (MPSE) based on the misspecified (homogeneous) covariance [Model 1 ACP]. (c) Mean efficiency of the misspecified predictor under Model 2 [Model 2 Eff(r,opt)]. (d) ACP using MPSE under Model 2 [Model 2 ACP]. (e) Efficiency of the least-precise prediction for the misspecified predictor under Model 1 [Model 1 Effmax(r,opt)]. (f) Efficiency of the least-precise prediction under Model 2 [Model 2 Effmax(r,opt)]. Axes are nearest-neighbor correlations of the $U$ and $V$ fields.

REFERENCES

Cressie, N.A.C. (1991). Statistics for Spatial Data. Wiley, New York, NY.

Cressie, N.A.C. and D.L. Zimmerman (1992). On the stability of the geostatistical method. Math. Geol. 24(1), 45-60.

Guttorp, P. and P.D. Sampson (1994). Methods for estimating heterogeneous spatial covariance functions with environmental applications. In G.P. Patil and C.R. Rao, eds., Handbook of Statistics Volume 12, 661-689.

Haas, T.C. (1990). Lognormal and moving window methods of estimating acid deposition. J. Amer. Stat. Assoc. 85, 950-963.

Isaaks, E.H. and R.M. Srivastava (1989). An Introduction to Applied Geostatistics. Oxford University Press.

Stein, M.L. (1988). Asymptotically efficient prediction of a random field with misspecified covariance function. Ann. Stat. 16(1), 55-63.

Stein, M.L. (1989). The loss of efficiency in kriging prediction caused by misspecifications of the covariance structure. In M. Armstrong, ed., Geostatistics. Kluwer Academic Publishers, 273-282.

Stein, M.L. and M.S. Handcock (1989). Some asymptotic properties of kriging when the covariance function is misspecified. Math. Geol. 21, 171-190.