This file was created by scanning the printed publication.
Errors identified by the software have been corrected;
however, some errors may remain.
The Effect of Spatial Covariance
Heterogeneity on Prediction Variance
J. Andrew Royle* and Doug Nychka
Abstract.- Stationarity of a random field is an important assumption
required for many spatial statistical analyses. A part of this assumption is homogeneity (translation invariance) of the correlation function.
False assumption of homogeneity leads directly to misspecification of
the spatial covariance and hence the usual kriging predictor is no longer
BLUE. In this paper we examine the effect of homogeneous misspecification of a heterogeneous correlation function on prediction standard
errors using two cases from a class of heterogeneous models that are
spatially weighted mixtures of homogeneous random fields. It is found
that misspecification leads to an average loss of efficiency in prediction by up to 20% for extreme cases, and that misspecification of the
prediction variance generally leads to conservative prediction intervals.
1. INTRODUCTION
Kriging is a commonly used method for making spatial predictions. One of the
primary reasons for the use of kriging over other methods of spatial prediction
is that prediction standard errors are easily obtained for kriging and that the
kriging predictor is BLUE. Unfortunately, the BLUE property holds only under
the assumption that the spatial covariance is known, which is not likely to be
the case in practice (i.e. there is generally a misspecification of the covariance
function).
Kriging predictions are known to be fairly robust to misspecification of the
spatial covariance structure (see Cressie and Zimmerman, 1992). Several authors
have addressed the problem of stability of the kriging variance. In particular, Stein
(1988, 1989) examined the effect of misspecification of the covariance structure
on prediction variance. For a particular (fairly general) model, he derived a bound
on the ratio of the optimal prediction variance to the prediction variance for the
predictor based on a misspecified covariance. More generally, he has shown that
the misspecified prediction variance is asymptotically equivalent to the minimum
prediction variance, assuming the true and misspecified covariance functions are
compatible (see Stein and Handcock (1989) for a discussion of this condition).
*Both Authors: Department of Statistics, North Carolina State University, Raleigh NC
Here we are interested in the departure of a random field from the assumption
of translation invariance of the spatial correlation. Failure to detect departure from
this assumption leads directly to misspecification of the covariance function, and
we quantify the effects of this misspecification on prediction variance under a class
of heterogeneous random field models.
2. STATIONARITY AND HETEROGENEITY
Let $Z = \{Z(x) : x \in \Re^d\}$ be a random field in $d$ dimensions, from which we
observe $z = \{z(x_i) : 1 \le i \le p\}$, where $x$ represents a vector of coordinate
locations in space (e.g. $x = (x_1, x_2)$ in $\Re^2$). $Z$ is second-order stationary if
$$E(Z(x)) = \mu, \qquad \mathrm{Var}(Z(x)) = \sigma^2 < \infty \qquad \forall x \in \Re^d$$
and
$$\mathrm{Corr}(Z(x), Z(x')) = \rho(x - x') \qquad \forall x, x' \in \Re^d.$$
Thus under second-order stationarity we have that the mean and covariance do not
depend on the location of the points in space (i.e. translation invariance of the
first and second moments of $Z$). Concern over the validity of the stationarity
assumption generally focuses on constancy of the mean and variance; however,
homogeneity (translation invariance) of the spatial correlation is also implied by
stationarity. A heterogeneous random field, then, is one in which the correlation
function is not invariant to translation.
2.1 A Class of Heterogeneous Random Field Models
To examine the effect of heterogeneity on prediction variance it is necessary to
specify a random field model with heterogeneous covariance structure. Here we
consider special cases from the class of models that are spatially weighted mixtures
of homogeneous random fields. The 2 component mixture is given by
$$Z(x) = a(x)U(x) + b(x)V(x). \qquad (1)$$
Here the sampled field, $Z$, is a spatially weighted sum of the homogeneous random
fields $U$ and $V$ with covariance functions $c_u$ and $c_v$ and cross-covariance $c_{uv}$. The
nature of the spatial weights $a(x)$ and $b(x)$ determines the manner in which the
covariance between two points depends on their relative locations. Note that the
covariance between $Z$ at $x$ and $x'$ under (1) is
$$\mathrm{Cov}(Z(x), Z(x')) = a(x)a(x')c_u(x - x') + b(x)b(x')c_v(x - x') + (a(x)b(x') + a(x')b(x))c_{uv}(x - x'). \qquad (2)$$
The covariance between two points thus depends explicitly on their
locations $x$ and $x'$ via the weights $a(x)$ and $b(x)$, and hence $Z$ is a heterogeneous
random field.
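The dependence of (2) on location, and not only on lag, can be checked numerically. The following is a minimal sketch, assuming isotropic exponential covariances for U and V and a hypothetical linear ramp for the weight a(x); the function names and the particular weights are illustrative choices and are not taken from the paper.

```python
import math


def exp_cov(theta):
    """Isotropic exponential covariance c(h) = exp(-theta * h)."""
    return lambda h: math.exp(-theta * h)


def mixture_cov(x, xp, a, b, c_u, c_v, c_uv=lambda h: 0.0):
    """Cov(Z(x), Z(x')) for Z = a*U + b*V, term by term as in (2).

    a, b map a location to a weight; c_u, c_v, c_uv map a lag distance
    to a covariance (isotropy is an assumption of this sketch).
    """
    h = math.dist(x, xp)
    return (a(x) * a(xp) * c_u(h)
            + b(x) * b(xp) * c_v(h)
            + (a(x) * b(xp) + a(xp) * b(x)) * c_uv(h))


# Hypothetical weights: a(x) ramps linearly along the first coordinate.
a = lambda x: x[0] / 6.0
b = lambda x: 1.0 - a(x)
c_u = exp_cov(-math.log(0.8))  # correlation 0.8 at lag 1
c_v = exp_cov(-math.log(0.3))  # correlation 0.3 at lag 1

# Two pairs separated by the same lag (distance 1) at different locations.
west = mixture_cov((1, 1), (2, 1), a, b, c_u, c_v)
east = mixture_cov((4, 1), (5, 1), a, b, c_u, c_v)
print(west, east)  # unequal, so the covariance is not translation invariant
```

With constant weights the two values would coincide, which is the homogeneous case.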
Simple cases of the mixture model are the heterogeneous variance model,
$Z(x) = a(x)U(x)$, where $\mathrm{Var}(Z(x))$ is proportional to $a^2(x)$, and the heterogeneous measurement error model, $Z(x) = U(x) + b(x)V(x)$, where $V(x)$ is
measurement error with constant variance and $Z(x)$ has measurement error (i.e.
"nugget") variance proportional to $b^2(x)$.
When the spatial weights are disjoint indicator functions (i.e. either $U(x)$ or
$V(x)$ is observed at each $x$) and $c_{uv} = 0$, then $Z$ is homogeneous within subregions and has been called pseudo-stationary, or locally stationary. This model is
equivalent to that implied by partitioning a region into disjoint (homogeneous)
subregions, a common solution to heterogeneity in classical geostatistics (Guttorp
and Sampson, 1994). In this study, we consider the 2 subregion case further
(Model 1, described below). A similar model is the relative variogram model of
Cressie (1991), except that the latter assumes a proportional relationship between
the covariance and mean within each subregion.
Taking the spatial weights to be overlapping (but not constant) produces a
reasonable range of heterogeneous behavior, and this is the second situation that
will be examined here (Model 2, described below). Moving-window kriging
of Haas (1990) implies an underlying mixture model of this sort but with > 2
components. Also related is the common (homogeneous) geostatistical "linear
model of coregionalization" (Isaaks and Srivastava, 1989) which takes constant
weight on each component of the mixture and hence the covariance is just a
(weighted) sum of 2 or more covariance functions.
Let $Z$ be defined over the region, denoted as $X$, containing the 6 x 5 lattice of
sample points shown in Figure 1(a). Define subregions of $X$ as $S = \{x : x_2 < 3.5\}$
and $N = \{x : x_2 \ge 3.5\}$, and hence $X = S \cup N$. The two subregions of $X$ are
delineated by the broken line of Figure 1(a). The 2 heterogeneous models that we
consider are defined as follows:
Model 1:
Take $U$ and $V$ to be independent (i.e. $c_{uv} = 0$) and to have isotropic exponential covariance functions, but with (possibly) different exponential parameters. So, $c_u(\|x - x'\|) = e^{-\|x - x'\|\theta_u}$ and $c_v(\|x - x'\|) = e^{-\|x - x'\|\theta_v}$. Note that
$\theta_u = -\log(\rho_u(1))$ and $\theta_v = -\log(\rho_v(1))$, so that we may describe the covariance
functions of $U$ and $V$ in terms of the "nearest-neighbor" correlations $\rho_u(1)$ and
$\rho_v(1)$. These are the correlations between two points separated by distance 1 in
Figure 1(a). The weighting scheme takes $a(x)$ to be the indicator function of one of the two subregions
and $b(x) = 1 - a(x)$. This case is relatively uninteresting, but serves as a "worst
case" scenario, under which we might expect misspecification of the covariance
structure to have the largest effect. The heterogeneous correlogram of $Z$ on the
6 x 5 lattice (which is equivalent to the covariogram in this case) is shown as
the solid lines in Figure 1(b). One could generalize this situation to allow cross-correlation between $U$ and $V$, and an example of such a covariance function is
shown as the solid lines in Figure 1(c). Alternatively, we may induce a similar
heterogeneous structure by overlapping the spatial weights, which gives us our
second case.
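A minimal sketch of the Model 1 covariance on the lattice follows. Two things are assumptions of the sketch and not statements from the paper: the lattice coordinates (1..6 by 1..5 with unit spacing), and the choice that U is the field observed in S and V the field observed in N.

```python
import math


def theta_from_rho(rho1):
    """Exponential parameter implied by a nearest-neighbor correlation."""
    return -math.log(rho1)


# 6 x 5 lattice with unit spacing (coordinates are assumed).
LATTICE = [(i, j) for j in range(1, 6) for i in range(1, 7)]


def in_S(x):
    """Subregion S = {x : x_2 < 3.5}; N is its complement."""
    return x[1] < 3.5


def model1_cov(x, xp, rho_u, rho_v):
    """Model 1 covariance with disjoint indicator weights and c_uv = 0.

    Assumption: U is observed in S and V in N.
    """
    if in_S(x) != in_S(xp):
        return 0.0  # independent fields, so no covariance across the boundary
    theta = theta_from_rho(rho_u if in_S(x) else rho_v)
    return math.exp(-theta * math.dist(x, xp))


# Full 30 x 30 covariance matrix for rho_u(1) = .8 and rho_v(1) = .3.
C = [[model1_cov(x, xp, 0.8, 0.3) for xp in LATTICE] for x in LATTICE]
```

The matrix is block diagonal, which is why this case is the most extreme departure from a single homogeneous covariance.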
Model 2:
Taking $U$ and $V$ as in Model 1, build dependence among all $Z(x)$ by superimposing
the $V$ field onto the $U$ field (but only over part of the region, producing the
heterogeneity). Here the weighting scheme takes $a(x) = 1 \; \forall x \in X$ and
$$b(x) = \begin{cases} 1 & x \in S \\ 0 & x \in N. \end{cases}$$
Thus, the covariance between pairs of points that both lie in $S$ contains the additive
term $c_v(\|x - x'\|)$. The covariance among all other points is simply given by
$c_u(\|x - x'\|)$. The correlogram of $Z$ for this case is shown as the solid line
in Figure 1(d). The heterogeneous models shown in Figures 1(c) and (d) are
functionally very similar. That is, they both build dependence
among all $Z(x)$, but do this in a slightly different manner.
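The structure of Model 2 can be sketched in a few lines. The sketch assumes unit-variance exponential covariances for U and V and takes V to be present (with weight 1) only in S, as implied by the additive term described above. Under these assumptions the variance of Z is 2 in S and 1 in N, so correlations are obtained by standardizing the covariance.

```python
import math


def in_S(x):
    """Subregion S = {x : x_2 < 3.5}."""
    return x[1] < 3.5


def model2_cov(x, xp, rho_u, rho_v):
    """Covariance with a(x) = 1 everywhere and V superimposed only in S.

    rho ** h equals exp(-theta * h) with theta = -log(rho).
    """
    h = math.dist(x, xp)
    cov = rho_u ** h
    if in_S(x) and in_S(xp):
        cov += rho_v ** h  # additive c_v term for pairs inside S
    return cov


def model2_corr(x, xp, rho_u, rho_v):
    """Correlation obtained by standardizing the covariance."""
    sd_x = math.sqrt(model2_cov(x, x, rho_u, rho_v))
    sd_xp = math.sqrt(model2_cov(xp, xp, rho_u, rho_v))
    return model2_cov(x, xp, rho_u, rho_v) / (sd_x * sd_xp)


# Lag-1 correlations for rho_u(1) = .8 and rho_v(1) = .3.
within_S = model2_corr((1, 1), (2, 1), 0.8, 0.3)
within_N = model2_corr((1, 4), (2, 4), 0.8, 0.3)
across = model2_corr((1, 3), (1, 4), 0.8, 0.3)
```

All three pairs are separated by distance 1, yet the three correlations differ, and none of them is zero.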
We will define the misspecified covariance function, $c_m$, to be the least-squares
projection of the true covariance structure to a homogeneous exponential covariance (i.e. the least-squares fit to the model $c_m(\|x - x'\|) = \beta_1 e^{-\|x - x'\|\beta_2}$). For
Model 1 and Model 2, the misspecified homogeneous covariance functions are
shown as the broken lines in Figure 1(b) and (d), respectively.
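One way such a least-squares projection might be computed is sketched below. The sketch assumes a two-parameter exponential model with an amplitude and a decay parameter (called beta1 and beta2 here), uses a simple grid search over the decay parameter, and leaves open which pairs of points enter the fit, since the paper does not state this. For a fixed decay parameter the amplitude has a closed-form least-squares solution.

```python
import math


def fit_exponential(lags, covs, beta2_grid=None):
    """Least-squares fit of c_m(h) = beta1 * exp(-beta2 * h).

    lags, covs: paired lag distances and true covariances.
    beta2_grid: candidate decay parameters (the default grid is an assumption).
    """
    if beta2_grid is None:
        beta2_grid = [k / 1000 for k in range(1, 5001)]
    best = None
    for beta2 in beta2_grid:
        basis = [math.exp(-beta2 * h) for h in lags]
        # For fixed beta2 the problem is linear in beta1.
        beta1 = (sum(c * e for c, e in zip(covs, basis))
                 / sum(e * e for e in basis))
        sse = sum((c - beta1 * e) ** 2 for c, e in zip(covs, basis))
        if best is None or sse < best[0]:
            best = (sse, beta1, beta2)
    return best[1], best[2]


# Check on data that are exactly exponential.
lags = [0.0, 1.0, 2.0, 3.0]
covs = [0.9 * math.exp(-0.5 * h) for h in lags]
beta1, beta2 = fit_exponential(lags, covs)
```

In the heterogeneous case the same lag appears with several different covariance values, and the fitted curve passes between them.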
3. STUDY DESIGN
Let $Z$ be a mixture of two homogeneous Gaussian random fields with $E(U(x)) =
E(V(x)) = 0$. For this study, we choose a mixture of 2 homogeneous fields for
convenience alone. The covariance between $Z(x)$ and $Z(x')$, denoted $c(x, x')$, is given by (2).
Figure 1: (a) Location of 30 sample points used in studying the effect of
heterogeneous correlation. $X$ is the region containing these points and the
subregions $S$ and $N$ are delineated by the broken line. (b) Heterogeneous
correlation of 30 points under Model 1 with $c_{uv} = 0$. (c) Heterogeneous
correlation of 30 points under Model 1 with $\rho_{uv}(1) = .55$. (d) Heterogeneous
correlation of 30 points under Model 2. Broken line in all panels is the misspecified homogeneous correlation function. Panels (b) to (d) are plotted against lag distance. For all cases, $\rho_u(1) = .8$ and
$\rho_v(1) = .3$.
For $Z(x)$ corresponding to the $p$ locations, $x$, at which the random field is measured
we have the $p \times p$ matrix $\mathrm{Var}(Z(x)) = C$, where $C_{ij} = c(x_i, x_j)$. For any
$x_0 \in X$, we have the $p$-vector $\mathrm{Cov}(Z(x_0), Z(x)) = c_0$. The $p$ locations where $Z$
is observed are given by the 6 x 5 lattice shown in Figure 1(a).
For $x_0 \in X$, the kriging predictor of $Z(x_0)$ is
$$\hat{Z}(x_0) = \lambda' z,$$
where $z$ is the vector of the $p$ observations and the vector $\lambda$ contains the "kriging weights", which are computed using either the true
covariance function defined by Model 1 or Model 2 or the misspecified covariance
structure determined by $c_m$. These give the optimal or misspecified predictor,
respectively, with kriging weights $\lambda$ and $\lambda_m$.
The prediction variance is given by
$$\mathrm{Var}(Z(x_0) - \hat{Z}(x_0)) = 1 - 2\lambda' c_0 + \lambda' C \lambda$$
and the prediction standard error (PSE) is the square root of the prediction variance.
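The origin of this expression, and of the optimal weights, can be traced in a short derivation. It is a sketch under stated assumptions: the predictor is linear in the observations, $\hat{Z}(x_0) = \lambda' z$, with no unbiasedness constraint (simple kriging with known zero mean), $\mathrm{Var}(Z(x_0)) = 1$ as in the expression above, and $C$ is invertible.

```latex
\begin{align*}
\mathrm{Var}\bigl(Z(x_0) - \lambda' z\bigr)
  &= \mathrm{Var}(Z(x_0)) - 2\lambda'\,\mathrm{Cov}(z, Z(x_0))
     + \lambda'\,\mathrm{Var}(z)\,\lambda \\
  &= 1 - 2\lambda' c_0 + \lambda' C \lambda, \\
\frac{\partial}{\partial\lambda}
  \bigl(1 - 2\lambda' c_0 + \lambda' C \lambda\bigr)
  &= -2c_0 + 2C\lambda = 0
  \quad\Longrightarrow\quad \lambda = C^{-1} c_0, \\
\mathrm{Var}\bigl(Z(x_0) - \hat{Z}(x_0)\bigr)\Big|_{\lambda = C^{-1} c_0}
  &= 1 - c_0' C^{-1} c_0 .
\end{align*}
```

The second line holds for any weight vector, which is what allows the variance of a non-optimal predictor to be evaluated under the true covariance.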
The prediction variance for either the optimal or the misspecified predictor may be
based on either the true heterogeneous covariance function, $c$, or the misspecified
homogeneous covariance function, $c_m$. For the PSE associated with the prediction
of $Z(x_0)$, 3 different estimates will be denoted as $PSE_{opt}(x_0)$, $PSE_r(x_0)$, and
$MPSE(x_0)$. The 3 estimators are defined as:
1. $PSE_{opt}$ - The optimal estimator, using the true covariance function $c$ in the
predictor and prediction variance estimator.
2. $PSE_r$ - The correct estimator of PSE for the predictor based on the misspecified
covariance, $c_m$ (i.e. weights from $c_m$, prediction variance evaluated under $c$).
3. $MPSE$ - The incorrect estimator of PSE for the predictor based on the misspecified covariance, $c_m$ (i.e. weights and prediction variance both from $c_m$).
Note that the kriging weights and covariance are indexed by $m$ for the misspecified
covariance function.
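The three estimators differ only in which covariance supplies the weights and which supplies the variance. A minimal sketch follows, assuming simple kriging weights obtained by solving the covariance system, and unit variance at the prediction location under both covariance models; the helper names and the small linear solver are illustrative and not from the paper.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def pred_var(lam, C, c0, var0=1.0):
    """var0 - 2 lam'c0 + lam'C lam for an arbitrary weight vector lam."""
    n = len(lam)
    lc = sum(l * c for l, c in zip(lam, c0))
    lCl = sum(lam[i] * C[i][j] * lam[j] for i in range(n) for j in range(n))
    return var0 - 2.0 * lc + lCl


def three_pse(C, c0, Cm, cm0):
    """Return (PSE_opt, PSE_r, MPSE) at one prediction location."""
    lam = solve(C, c0)      # weights from the true covariance
    lam_m = solve(Cm, cm0)  # weights from the misspecified covariance
    pse_opt = math.sqrt(max(pred_var(lam, C, c0), 0.0))
    pse_r = math.sqrt(max(pred_var(lam_m, C, c0), 0.0))
    mpse = math.sqrt(max(pred_var(lam_m, Cm, cm0), 0.0))
    return pse_opt, pse_r, mpse


# Toy case: observations at 0 and 1 on a line, prediction at 0.5.
C = [[1.0, 0.8], [0.8, 1.0]]
c0 = [0.8 ** 0.5, 0.8 ** 0.5]
Cm = [[1.0, 0.3], [0.3, 1.0]]
cm0 = [0.3 ** 0.5, 0.3 ** 0.5]
pse_opt, pse_r, mpse = three_pse(C, c0, Cm, cm0)
```

In this toy case the misspecified model understates the correlation, so MPSE exceeds PSE_r, which in turn exceeds PSE_opt.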
The estimator $PSE_r$ (the subscript $r$ for reference) serves as the "best possible"
estimator given the incorrect predictor (i.e. using the predictor which is not
optimal, but using the correct variance for the prediction error). Since $PSE_{opt}$
is the optimal estimator of PSE, we know that $PSE_r(x) \ge PSE_{opt}(x) \; \forall x \in X$.
And, since $MPSE$ is not the correct PSE estimator for the predictor based on the
misspecified covariance, it may be that $MPSE(x) < PSE_{opt}(x)$.
We use these 3 estimators of PSE to compute prediction standard errors on a
40 x 40 grid of points over the region containing the 30 sample points of Figure 1(a).
The prediction standard errors are computed using only the 30 sample points, and
we examine summaries of the 3 PSE estimators. In particular, we define
the average efficiency of $PSE_r$ to $PSE_{opt}$, denoted Eff(r,opt).
This is simply a measure of the relative precision of the misspecified predictor to
the optimal predictor. Similarly, to quantify the worst precision over $X$, we define the
efficiency of the worst prediction, denoted Effmax(r,opt).
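The two efficiency summaries might be computed as in the sketch below. The exact definitions are assumptions here: the average efficiency is taken as the grid mean of the ratio PSE_r / PSE_opt, and the efficiency of the worst prediction as the ratio of the largest PSE_r to the largest PSE_opt over the grid. Both readings are consistent with efficiencies of at least 1, but neither formula is confirmed by the text.

```python
def average_efficiency(pse_r, pse_opt):
    """Grid mean of PSE_r / PSE_opt (assumed definition).

    Locations with PSE_opt == 0 (e.g. exact sample points) are skipped.
    """
    ratios = [r / o for r, o in zip(pse_r, pse_opt) if o > 0]
    return sum(ratios) / len(ratios)


def worst_case_efficiency(pse_r, pse_opt):
    """Largest PSE_r over largest PSE_opt on the grid (assumed definition)."""
    return max(pse_r) / max(pse_opt)


# Hypothetical PSE values at three grid locations.
pse_r_grid = [0.60, 0.50, 0.00]
pse_opt_grid = [0.50, 0.50, 0.00]
eff = average_efficiency(pse_r_grid, pse_opt_grid)
eff_max = worst_case_efficiency(pse_r_grid, pse_opt_grid)
```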
These summaries provide an assessment of the predictor based on the misspecified covariance function. That is, under misspecification and assuming the correct
prediction variance is used, they provide insight into the variability of the predictions relative to the optimal. However, if the heterogeneous covariance
structure is misspecified, then the likely estimator of prediction standard error is
$MPSE$, and a useful measure of the cost of misspecification is the actual coverage
probability of a 95% confidence interval for a prediction using $MPSE$, averaged
over all predictions. For any $x \in X$, the ratio $MPSE(x)/PSE_r(x)$ is the relative
inflation in the width of a confidence interval as a result of using the misspecified
PSE estimator. The coverage probability of a 95% confidence interval for the
prediction is
$$CP(x) = \Pr\left(Z(x) < \frac{MPSE(x)}{PSE_r(x)} \times 1.96\right) - \Pr\left(Z(x) < -\frac{MPSE(x)}{PSE_r(x)} \times 1.96\right),$$
where $Z(x)$ here denotes a standard normal variate. Hence, if $MPSE(x)/PSE_r(x) = 1$, the coverage probability is 0.95. The average coverage probability (ACP) is the average of $CP(x)$ over the prediction locations in $X$.
Values of ACP greater than 0.95 indicate conservative confidence intervals (i.e.
less precise predictions) and for ACP less than 0.95, predictions are too precise.
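The coverage calculation reduces to evaluating the standard normal distribution function at plus and minus 1.96 times the ratio of the two standard errors. A minimal sketch, assuming PSE_r is strictly positive at every prediction location and averaging with equal weight over the grid:

```python
import math


def std_normal_cdf(t):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def coverage_probability(mpse, pse_r, z=1.96):
    """Actual coverage of a nominal 95% interval built from MPSE."""
    ratio = mpse / pse_r
    return std_normal_cdf(ratio * z) - std_normal_cdf(-ratio * z)


def average_coverage(mpse_grid, pse_r_grid):
    """Equal-weight average of CP(x) over the prediction locations."""
    cps = [coverage_probability(m, r) for m, r in zip(mpse_grid, pse_r_grid)]
    return sum(cps) / len(cps)


exact = coverage_probability(1.0, 1.0)         # ratio 1: nominal coverage
conservative = coverage_probability(2.0, 1.0)  # MPSE too large
liberal = coverage_probability(0.5, 1.0)       # MPSE too small
```

A ratio above 1 widens the interval and pushes coverage above 0.95; a ratio below 1 does the opposite.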
These summaries are computed for values of $\rho_u(1) = .1, .2, \ldots, .9$ and $\rho_v(1) =
.1, .2, \ldots, .9$, for each of the 2 models defined in Section 2.1. Recall that $\rho_u(1)$ and
$\rho_v(1)$ are the nearest-neighbor correlations of the homogeneous $U$ and $V$ fields
which comprise the heterogeneous mixture model. Thus, for each heterogeneous
case, there are 81 situations, ranging from very different dependence structure
(e.g. $\rho_u(1) = .1$ and $\rho_v(1) = .9$), to identical (but independent) structure ($\rho_u(1) =
\rho_v(1)$, $\rho_{uv}(1) = 0$). Efficiency and ACP will be presented as contour plots as a
function of $\rho_u(1)$ and $\rho_v(1)$.
4. RESULTS AND CONCLUSIONS
Figure 2(a) indicates that the efficiency of prediction is affected very little by
misspecification of the heterogeneous covariance structure under Model 1. That
is, the mean prediction efficiency is near 1 except for high spatial dependence
(nearest-neighbor correlations > 0.8), in which case the prediction standard errors
are only 20% larger for predictions based on the misspecified predictor. For the
worst prediction under misspecification (Effmax), Figure 2(e) indicates inflation
of PSE by 20 to 120% for levels of spatial correlation > 0.50. Since it is unlikely
that one would use $PSE_r$ as the estimator for prediction standard error if the
heterogeneous covariance were misspecified, perhaps the more meaningful result
is that concerning the average coverage probability using $MPSE$. The ACP is
given in Figure 2(b), where it is seen that, using $MPSE$ to estimate PSE, one
achieves very conservative confidence intervals for most situations under Model 1.
That is, the ACP for a 95% confidence interval is generally greater than 0.95, even
approaching 1 for high spatial dependence.
For Model 2, Figure 2(c) indicates efficiency of the misspecified predictor
near 1 and hence the misspecified spatial predictor is nearly optimal. In contrast
to Model 1, Figure 2(f) indicates that for the worst predictions made under misspecification of Model 2 the increase in maximum PSE for this case is only 20
to 40% for very high levels of spatial correlation (nearest-neighbor correlations
> 30) (compare with Figure 2(e)). Also in contrast to Model 1, the ACP using the
misspecified estimator of PSE (MPSE) under Model 2 (Figure 2(d)) is generally
less than 0.95 (i.e. liberal coverage). This occurs when nearest-neighbor correlations of the overlapping $U$ and $V$ are very different (e.g. $\rho_u(1) = .6$ and
$\rho_v(1) = .2$). For high and/or similar nearest-neighbor correlations, the ACP is
$\ge 0.95$, indicating conservative coverage of the confidence intervals for prediction.
By concentrating on a particular class of heterogeneous models for spatial
correlation, defined to be a spatially weighted mixture of homogeneous random
fields with exponential covariance functions, we have been able to quantify the
effect of departure from homogeneity on spatial prediction uncertainty. One
problem with studying the effect of heterogeneity is that it is not clear how to
describe the phenomenon in a manner so as to allow generalization of results,
such as those presented here, to arbitrary covariance structure. Our choice of the
mixture model is fortuitous since this is a heterogeneous analog to the class of
models examined by Stein (1989) which are of the form
$$\mathrm{Cov}(Z(s), Z(s')) = a_1 c_u(s - s') + b_1 c_v(s - s')$$
and were used by him to study the effect of misspecification of (stationary) covariance structure. It is thus possible that results such as these may be extended to
include heterogeneity of the form described here, but with an arbitrary covariance
model (i.e. not restricted to the exponential class). This is an area of future work.
Figure 2 panel titles: (a) Model 1 Eff(r,opt); (b) Model 1 ACP; (c) Model 2 Eff(r,opt); (d) Model 2 ACP; (e) Model 1 Effmax(r,opt); (f) Model 2 Effmax(r,opt).
Figure 2: (a) Mean efficiency of the misspecified predictor under Model 1.
(b) ACP under Model 1 for 95% confidence intervals using the estimator of
prediction standard error (MPSE) based on the misspecified (homogeneous)
covariance. (c) Mean efficiency of the misspecified predictor under Model 2.
(d) ACP using MPSE under Model 2. (e) Efficiency of the least-precise prediction for the misspecified predictor under Model 1. (f) Efficiency of the
least-precise prediction under Model 2. Axes are nearest-neighbor correlations of the U and V fields.
REFERENCES
Cressie, N.A.C. (1991). Statistics for Spatial Data. Wiley, New York, NY.
Cressie, N.A.C. and D.L. Zimmerman (1992). On the stability of the geostatistical
method. Math. Geol. 24(1), 45-60.
Guttorp, P. and P.D. Sampson (1994). Methods for estimating heterogeneous
spatial covariance functions with environmental applications. In G.P. Patil and
C.R. Rao, eds., Handbook of Statistics Volume 12, 661-689.
Haas, T.C. (1990). Lognormal and moving window methods of estimating acid
deposition. J. Amer. Stat. Assoc. 85, 950-963.
Isaaks, E.H. and R.M. Srivastava (1989). An Introduction to Applied Geostatistics.
Oxford University Press.
Stein, M.L. (1988). Asymptotically efficient prediction of a random field with
misspecified covariance function. Ann. Stat. 16(1), 55-63.
Stein, M.L. (1989). The loss of efficiency in kriging prediction caused by misspecifications of the covariance structure. In M. Armstrong, ed., Geostatistics.
Kluwer Academic Publishers, 273-282.
Stein, M.L. and M.S. Handcock (1989). Some asymptotic properties of kriging
when the covariance function is misspecified. Math. Geol. 21, 171-190.