This file was created by scanning the printed publication. Errors identified by the software have been corrected; however, some errors may remain.

Choosing between Abrupt and Gradual Spatial Variation?

Gerard B.M. Heuvelink¹ and Johan A. Huisman²

¹Geostatistician, University of Amsterdam, The Netherlands
²Student, University of Amsterdam, The Netherlands

Abstract.-Two basic models of spatial variation are widely used in present-day soil survey practice. The discrete model of spatial variation (DMSV) forms the basis of the traditional soil map, in which homogeneous soil mapping units are separated by abrupt boundaries. The continuous model of spatial variation (CMSV) originates from geostatistics, where kriging is used to map gradual changes in soil properties. Neither of the two models is capable of handling situations in which abrupt and gradual spatial variation are both present in the same area. Therefore, a straightforward combination of the two models was recently introduced, known as the mixed model of spatial variation (MMSV). The MMSV contains the two basic models, suggesting that it may perform well over the whole range of spatial variation. In this paper we investigate the anticipated flexibility of the MMSV using a simulation study. As expected, the MMSV is superior to the DMSV and the CMSV when both types of spatial variation are present. But the MMSV is also as suitable as the DMSV in the case of a 'discrete' reality, and as suitable as the CMSV in the case of a 'continuous' reality. From this we conclude that the MMSV should be recommended for situations where an a priori choice between abrupt and gradual spatial variation cannot easily be made.

INTRODUCTION

Traditional soil survey is based upon the presumption that soil behaves uniformly within soil mapping units and changes fairly abruptly at the boundaries between them (Voltz and Webster 1990, Webster and Oliver 1990, Burrough 1993). However, the practical validity of this conventional representation of soil variability has repeatedly been questioned (e.g. Webster and Cuanalo 1975, Nortcliff 1978, Campbell et al. 1989, Nettleton et al. 1991). Major drawbacks of the conventional model of soil spatial variation are that it cannot represent gradual boundaries and that it ignores spatial autocorrelation within mapping units.

As an alternative to the conventional model, some fifteen years ago the use of geostatistical techniques for modelling spatial variation was introduced to soil science. Burgess and Webster (1980) were among the first to apply kriging, an exponent of this theory, to soil survey. Since then the geostatistical approach to modelling soil spatial variation has flourished and kriging is now routinely adopted for the mapping of soil properties (Webster 1994).

Recently, however, it has increasingly been realized that abandoning the conventional approach to soil spatial variation outright is perhaps too drastic. Kriging has disadvantages as well, such as its inability to deal with sharp boundaries. In order to bridge the gap between the conventional and geostatistical representations of soil spatial variation, several models have been developed that can handle discrete (abrupt) as well as continuous (gradual) spatial variation in the same area (Stein et al. 1988, Voltz and Webster 1990, Heuvelink and Bierkens 1992, Rogowski and Wolf 1994, Goovaerts and Journel 1995, Heuvelink 1996).

In this paper we examine one such model, known as the mixed model of spatial variation (MMSV). In Heuvelink (1996) it was anticipated that the MMSV should work well over the whole range of spatial variation, from purely discrete to purely continuous. We analyze the anticipated flexibility of the MMSV using nine simulated 'realities'. But before describing the exact procedure of the simulation exercise, we first briefly review the three models of spatial variation used in this study.
THREE MODELS OF SPATIAL VARIATION

The Discrete Model of Spatial Variation (DMSV) first divides the geographical domain $D$ into $K$ separate units $D_k$. It then makes the following assumptions on the behaviour of a spatially distributed attribute $Z(\cdot)$:

1. $Z(x) = \mu_k + \varepsilon(x)$ for all $x \in D_k$
2. $\varepsilon(\cdot)$ has zero mean and is spatially uncorrelated
3. $\mathrm{Var}(\varepsilon(x)) = C_\varepsilon$ for all $x \in D$

Thus the DMSV assumes that $Z(x)$ is the sum of a unit-dependent mean $\mu_k$ and a residual noise $\varepsilon(x)$. The DMSV will usually be adopted when the units $D_k$ are available in the form of a polygon map, such as a soil map, a land use map or a geological map, and when the within-unit variability is expected to be small in comparison with the between-unit variability. In other words, the DMSV represents the conventional model of soil spatial variation and is appropriate when major jumps in the attribute $Z(\cdot)$ take place at the boundaries of the mapping units.

In its simplest form, the Continuous Model of Spatial Variation (CMSV) makes the following assumptions:

1. $E[Z(x)] = \mu$ for all $x \in D$
2. $\mathrm{Cov}(Z(x), Z(x+h)) = C_Z(|h|)$ for all $x, x+h \in D$

Thus the CMSV assumes that $Z(\cdot)$ is second-order stationary, meaning that it has a constant mean and that its spatial autocovariance is a function only of the distance between the locations. The CMSV embodies the geostatistical model of spatial variation. In geostatistics it is customary to use the variogram $\gamma_Z(\cdot)$ to characterize the spatial autocorrelation of $Z(\cdot)$. It is related to the autocovariance by the identity $\gamma_Z(|h|) = C_Z(0) - C_Z(|h|)$.

The assumptions underlying the Mixed Model of Spatial Variation (MMSV) are a combination of those underlying the DMSV and CMSV:

1. $Z(x) = \mu_k + \varepsilon(x)$ for all $x \in D_k$
2. $\varepsilon(\cdot)$ has zero mean
3. $\mathrm{Cov}(\varepsilon(x), \varepsilon(x+h)) = C_\varepsilon(|h|)$ for all $x, x+h \in D$

Second-order stationarity is thus imposed on $\varepsilon(\cdot)$ instead of on $Z(\cdot)$. The MMSV is more general than the DMSV and CMSV, and in fact it contains both models.
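The containment can be made explicit by writing the two basic models as restrictions of the MMSV. The following is a sketch in the notation of the assumptions above, not a derivation taken from the paper; it assumes that the DMSV residual variance is the same in all units.

```latex
\begin{align*}
\text{DMSV:}\quad & C_\varepsilon(|h|) = 0 \ \text{for all } |h| > 0
  && \text{(the residual is pure nugget with variance } C_\varepsilon(0)\text{)} \\
\text{CMSV:}\quad & \mu_k = \mu \ \text{for all } k = 1, \dots, K
  && \Rightarrow\ \mathrm{Cov}(Z(x), Z(x+h)) = C_\varepsilon(|h|) = C_Z(|h|)
\end{align*}
```

In the first case the unit means carry all spatial structure; in the second case the mapping units carry none and the residual covariance is the covariance of the attribute itself.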
However, the DMSV and CMSV are included here as separate models because they are very often used in practice.

APPLICATION TO NINE SIMULATED REALITIES

In order to study the suitability of the three models of spatial variation (MSVs) for mapping under various circumstances, nine different 'realities' were created. This was done by adding maps generated using unconditional Gaussian simulation (Deutsch and Journel 1992) to an artificially constructed 'soil map'. The nine realities are given in figure 1. The A-maps in figure 1 (top row) are strongly dominated by the discrete soil map, the B-maps (middle row) to a much lesser extent, and the C-maps (bottom row) bear no influence from the discrete soil map. The degree of spatial autocorrelation in the added residual decreases from the 1-maps (left column) to the 3-maps (right column).

Figure 1.-Nine simulated realities. Letter indicates influence from the artificial soil map: A = large, B = moderate, C = none. Number indicates degree of spatial autocorrelation in the added residual: 1 = large, 2 = moderate, 3 = none.

Mapping the soil property from 200 observations

From each of the nine simulated maps a data set was created by collecting observations at 200 randomly selected locations. From these observations the soil property was mapped using the three MSVs. This resulted in 27 maps of predictions and 27 maps of prediction error standard deviations. Mapping using the DMSV is done simply by calculating the mean of the observations for each mapping unit separately, and using the unit mean as the prediction for all points lying in that unit. Mapping with the CMSV is done by ordinary kriging, using a variogram computed from the 200 observations. Mapping is somewhat more complicated in the case of the MMSV. First a variogram is computed from the 200 residuals obtained by subtracting the unit means from the observations (Kitanidis 1994). Next the attribute is mapped using universal kriging (Cressie 1991).
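The two simplest steps described above, the DMSV prediction by unit means and the residual variogram that the MMSV starts from, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, the data are synthetic (a two-unit 'soil map' with white-noise residuals rather than an unconditional Gaussian simulation), and the kriging step itself is not shown.

```python
import numpy as np

def unit_means(units, values):
    """DMSV step: mean of the observations in each mapping unit."""
    return {int(k): float(values[units == k].mean()) for k in np.unique(units)}

def dmsv_predict(units_target, means):
    """DMSV prediction: each target point receives the mean of its unit."""
    return np.array([means[int(k)] for k in units_target])

def empirical_variogram(coords, values, bin_edges):
    """Method-of-moments semivariance per distance bin (NaN for empty bins)."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    semi = 0.5 * (values[:, None] - values[None, :]) ** 2
    upper = np.triu_indices(len(values), k=1)  # count each pair once
    dist, semi = dist[upper], semi[upper]
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        in_bin = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = semi[in_bin].mean()
    return gamma

# Synthetic stand-in for one data set (assumption, not the paper's data):
# 200 random locations, two mapping units split at x = 50.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(200, 2))
units = (coords[:, 0] > 50.0).astype(int)
values = np.where(units == 0, 10.0, 20.0) + rng.normal(0.0, 1.0, size=200)

means = unit_means(units, values)
residuals = values - dmsv_predict(units, means)  # input to the MMSV variogram

edges = np.linspace(0.0, 50.0, 11)
gamma_z = empirical_variogram(coords, values, edges)       # original attribute
gamma_res = empirical_variogram(coords, residuals, edges)  # residual
```

In this synthetic case the residual variogram lies well below the variogram of the original attribute, because the jump between the two unit means has been removed before the semivariances are computed.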
In figure 2 the results of the mapping are given for a selection of three out of the nine simulated realities. Note that these are the prediction maps, and that the corresponding maps of prediction error standard deviations are not given here. The DMSV maps necessarily follow the delineations of the soil map, which is appropriate for map A3 but less so for map B2 and definitely inappropriate in the case of map C1. Conversely, the CMSV is suitable for mapping C1, but it is not appropriate for mapping B2 and even less so for mapping A3. The most important observation from figure 2 is that the MMSV is indeed capable of an adequate mapping in all three cases. It is interesting to observe that the MMSV mimics the DMSV in the case of a 'discrete' reality and the CMSV in the case of a 'continuous' reality.

Figure 2.-Mapping three simulated realities (maps A3, B2 and C1) from 200 observations using the DMSV, MMSV and CMSV. Panels: rows A3, B2, C1; columns DMSV, MMSV, CMSV.

Validation

In order to evaluate the three prediction methods the mean error (ME), root mean square error (RMSE) and standardized root mean square error (SRMSE) were computed for each of the nine realities. These statistics were computed from all remaining points in the map. The results are given in figure 3. The mean error is in all cases quite small. This is not surprising, because unbiasedness conditions are included in all three mapping procedures. Differences between the three mapping procedures are also negligible. The SRMSE values are on average somewhat larger than one, particularly when the CMSV is applied to situations in which the soil map influence is dominant. This may be caused by forcing the CMSV upon a non-stationary MMSV or DMSV reality, and perhaps also because the kriging variance does not include the uncertainty in estimating the variogram (Christensen 1991). Most interesting are the RMSE results.
In this case we do see meaningful differences between the three mapping procedures. The results demonstrate that the CMSV is inappropriate for the A-maps, whereas the DMSV is inappropriate for the C-maps. An exception is the pure nugget map C3, where all mapping procedures are equally good (or bad). The results also confirm that the MMSV is superior for the B-maps. Note that comparison of RMSE values between maps is difficult here because these are affected by differences in spatial autocorrelation.

Figure 3.-Validation results for the three prediction methods (DMSV, MMSV and CMSV).

DISCUSSION AND CONCLUSIONS

The application to simulated 'realities' shows that the MMSV interpolates well over the entire range of spatial variation. In all cases the MMSV is at least as good as the DMSV and the CMSV, and it is superior in situations where there is both abrupt and gradual spatial variation. The simulation exercise also shows that, depending on the situation, the DMSV may perform much worse than the CMSV, and vice versa. This means that if a choice between these two models is to be made (and these are the two models most often used in practice), then it must be made with care. This may seem obvious, but in practice the choice of model is often dominated by irrelevant factors, such as the background and experience of the user. And even when care is taken, it may not always be easy to decide beforehand whether abrupt or gradual spatial variation prevails. Therefore the flexibility of the MMSV demonstrated here is of clear importance, because it implies that by adopting the MMSV one can protect oneself against using the wrong model. It is as if one can leave the choice between abrupt and gradual spatial variation to the MMSV.

The MMSV is especially advantageous in situations where abrupt and gradual spatial variation are both present.
Some indication of whether this is the case can be obtained from the intra-class correlation (Webster and Oliver 1990), but more informative is the comparison of the variograms of the original attribute and its residual. A mixed form of spatial variation yields a residual variogram that is substantially lower than the original variogram. Since both variograms can easily be computed, one can thus quickly decide whether the MMSV is superior to the CMSV and DMSV for a given situation.

Another approach to handling abrupt and gradual spatial variation both present in the same area is to adopt the CMSV separately per mapping unit (Stein et al. 1988, Voltz and Webster 1990). The main difference with the MMSV is that this approach excludes the presence of spatial autocorrelation across mapping unit boundaries. Thus it is likely to create boundaries even when they are not really there. As mentioned, the MMSV does not suffer from this problem because it mimics the CMSV under such circumstances. We consider it a major advantage of the MMSV that, although it is meant for situations in which gradual and abrupt spatial variation are both present, it will also perform well when spatial variation is exclusively gradual or exclusively abrupt.

REFERENCES

Burgess, T.M. and R. Webster. 1980. Optimal interpolation and isarithmic mapping of soil properties. I. The semi-variogram and punctual kriging. J. Soil Science 31, 315-331.
Burrough, P.A. 1993. Soil variability: a late 20th century view. Soils and Fertilizers 56, 529-562.
Campbell, D.J., D.G. Kinniburgh and P.H.T. Beckett. 1989. The soil solution chemistry of some Oxfordshire soils: temporal and spatial variability. J. Soil Science 40, 321-339.
Christensen, R. 1991. Linear models for multivariate, time series, and spatial data. New York: Springer.
Cressie, N. 1991. Statistics for spatial data. New York: Wiley.
Deutsch, C.V. and A.G. Journel. 1992. GSLIB: geostatistical software library and user's guide. New York: Oxford University Press.
Goovaerts, P. and A.G. Journel. 1995. Integrating soil map information in modelling the spatial variation in continuous soil properties. European J. Soil Science 46, 397-414.
Heuvelink, G.B.M. 1996. Identification of field attribute error under different models of spatial variation. Int. J. GIS (in press).
Heuvelink, G.B.M. and M.F.P. Bierkens. 1992. Combining soil maps with interpolations from point observations to predict quantitative soil properties. Geoderma 55, 1-15.
Kitanidis, P.K. 1994. Generalized covariance functions in estimation. Mathematical Geology 25, 525-540.
Nettleton, W.D., B.R. Brasher and G. Borst. 1991. The taxadjunct problem. Soil Sci. Soc. Am. J. 55, 421-427.
Nortcliff, S. 1978. Soil variability and reconnaissance soil mapping: a statistical study in Norfolk. J. Soil Science 29, 403-418.
Rogowski, A.S. and J.K. Wolf. 1994. Incorporating variability into soil map unit delineations. Soil Sci. Soc. Am. J. 58, 403-418.
Stein, A., M. Hoogerwerf and J. Bouma. 1988. Use of soil-map delineations to improve (co-)kriging of point data on moisture deficits. Geoderma 43, 163-177.
Voltz, M. and R. Webster. 1990. A comparison of kriging, cubic splines and classification for predicting soil properties from sample information. J. Soil Science 31, 505-524.
Webster, R. 1994. The development of pedometrics. Geoderma 62, 1-15.
Webster, R. and H.E. De La Cuanalo. 1975. Soil transect correlograms of North Oxfordshire and their interpretation. J. Soil Science 26, 176-194.
Webster, R. and M.A. Oliver. 1990. Statistical methods in soil and land resource survey. Oxford: Oxford University Press.

BIOGRAPHICAL SKETCH

Gerard B.M. Heuvelink is a geostatistician with the Landscape and Environmental Research Group, Faculty of Environmental Sciences, University of Amsterdam. He holds an M.S. in Applied Mathematics from Twente Technical University and a Ph.D. in Geography from Utrecht University. Johan A. Huisman is a graduate student in Physical Geography at the Faculty of Environmental Sciences, University of Amsterdam.