Stratified Kriging Using Vague Transition Zones

G. BOUCNEAU¹, M. VAN MEIRVENNE¹, O. THAS² and G. HOFMAN¹

Abstract.- Stratification followed by within-stratum interpolation is a widespread spatial prediction procedure in soil inventories. Conventional stratification assumes that strata are delimited by accurate, crisp boundaries. However, map delineations are not equally accurate. Inaccuracy can result from the inherently gradual nature of the soil property or from errors made during the mapping process. This paper presents a model to describe the vagueness of stratum boundaries and proposes a modified within-stratum kriging algorithm to account for the uncertainty due to this vagueness. The procedure was used to map topsoil sand in Belgium.

INTRODUCTION

Spatial prediction of soil properties is unavoidable when creating a soil information system (SIS). However, every prediction procedure has its shortcomings. Choropleth soil maps assume abrupt changes at their boundaries, whereas interpolation techniques such as kriging assume that variation is gradual and stationary. In reality, heterogeneous areas often contain both gradual and discontinuous changes of soil properties. A combination of both procedures has therefore been suggested to retain the strengths of each. Stratification of a study area based on soil map delineations, followed by within-stratum interpolation, has been the most widespread combined approach. Stein et al. (1988) calculated a variogram for each soil type, Voltz & Webster (1990) computed a pooled variogram and Van Meirvenne et al. (1994) standardised the pooled variogram.
Other approaches have been the formulation of a mixed model of spatial variation (Heuvelink & Bierkens 1992) or, more recently, the introduction of a non-parametric indicator approach combined with qualitative information provided by soil or geological maps (Goovaerts & Journel 1995).

¹ Soil scientist, Dept. of Soil Management and Soil Care, University of Gent, Coupure 653, 9000 Gent, Belgium.
² Statistician, Dept. of Applied Mathematics, Biometrics and Process Control, University of Gent, Coupure 653, 9000 Gent, Belgium.

Conventional stratification assumes that strata are delimited by accurate, crisp boundaries. However, map delineations are not equally accurate. Inaccuracy can result from the inherently fuzzy nature of the soil property, poor field work, distortion or shrinkage of the original paper base map, or poor-quality digitisation. Attempts have been made to characterise the accuracy of soil map boundaries. Mark & Csillag (1989) conceptually modified the cartographic epsilon model to describe the uncertainty of class membership near a map boundary, but they did not elaborate on this mathematically. Burrough (1989) mapped the transition between soil strata using concepts from fuzzy set theory, but he did not link this to spatial interpolation. This paper aims to model the vagueness of stratum boundaries mathematically and to modify the within-stratum kriging algorithm to account for the uncertainty due to this vagueness.

THEORY

The Vagueness of Stratum Boundaries

The vagueness of a stratum boundary is described by examining point observations of the considered soil property perpendicular to that delineation (figure 1). The shortest distance between observation $i$ of property $y$ in stratum $j$ and the map delineation between strata A and B is $x_i$.
To avoid artefacts at complex combinations of boundaries, each observation should be used only once, i.e. with respect to the nearest delineation. Then $y_i$ is plotted against $x_i$. Distances to observations located in stratum A are given a negative sign, those within stratum B a positive one. The position of the stratum delineation is at $x = 0$.

Figure 1.- Examining the vagueness of a stratum boundary using point observations (left); the sigmoidal cosine function describing the vagueness of a stratum boundary (right).

The following sigmoid cosine function can be fitted to model the behaviour of $y$ on both sides of the boundary (figure 1, right):

for $x_i < x_A$ : $y_i = y_A$
for $x_A \le x_i \le x_B$ : [cosine expression lost in the scanned source]
for $x_i > x_B$ : $y_i = y_B$    [1]

with $y_A$ and $y_B$ the typical values of the soil property near the centre of stratum A and stratum B respectively, and $|x_A| + |x_B|$ the width of the transition zone between both strata. This function was chosen because it allows the behaviour of a soil property near a soil boundary to be modelled asymmetrically with respect to the position of the boundary.

Within-stratum Variograms

For each stratum the within-stratum variogram was calculated. Observations located within transition zones of inaccurately mapped boundaries were excluded from the variogram calculations. For inaccurately delimited strata this resulted in a drastic decrease of the semivariance.

Prediction With Vague Boundaries

In conventional stratification adjacent strata are delimited by an infinitely thin line and every location belongs to only one stratum. Within-stratum kriging then yields only one kriging estimate, and one associated variance, for every location of that stratum. In our procedure, boundaries represent a transition zone $|x_A| + |x_B|$ wide. Within this zone two kriging procedures, $K = a$ and $K = b$, based on the variograms and data of the adjacent strata A and B respectively, provide two prediction estimates $Y_a$ and $Y_b$ and two associated estimates of the variances $S_a^2$ and $S_b^2$. In the transition zone and close to stratum A, the structure of the variability will be more similar to the variogram of stratum A than to that of stratum B, and the validity of kriging procedure $K = a$ is expected to be higher than that of kriging procedure $K = b$. The reverse holds for a location within the transition zone and close to stratum B. The probabilities $P(K = a)$ and $P(K = b)$ express the validity of kriging procedures $K = a$ and $K = b$ respectively. These probabilities are related to the behaviour of the examined soil property in the transition zone, so Eq. 1 is modified as follows:

for $x < x_A$ : $P(K = a) = 1$ and $P(K = b) = 0$
for $x_A \le x \le x_B$ : $P(K = a)$ and $P(K = b)$ given by [expression lost in the scanned source]
for $x > x_B$ : $P(K = a) = 0$ and $P(K = b) = 1$    [2]

Outside the transition zone, $P(K = a) = 1$ inside stratum A and $P(K = a) = 0$ inside stratum B. Inside the transition zone $P(K = a)$ decreases from 1 close to stratum A to 0 close to stratum B. The reverse holds for $P(K = b)$. Estimators of the unconditional expected value $E(Y)$ and its variance $\sigma^2$ are given by (see appendix for the derivations):

$\hat{E}(Y) = Y_a P(K = a) + Y_b P(K = b)$    [3]

$S^2 = P(K = a) P(K = b) (Y_a - Y_b)^2 - P(K = a) P(K = b) (S_a - S_b)^2 + S_a^2 P(K = a) + S_b^2 P(K = b)$    [4]

To illustrate the influence of $P(K = a)$ and $P(K = b)$ on $S^2$, three situations were distinguished:

1. $Y_a$ is twice as large as $Y_b$, and $S_a^2$ and $S_b^2$ are equal. This results in a quadratic course of the variance along the transition zone (figure 2a) with a maximum at $P(K = a) = P(K = b) = 0.5$. Here $S^2$ is 3.25 times $S_b^2$.
2. $Y_a$ and $Y_b$ are equal and $S_a^2$ is twice as large as $S_b^2$. This results in a linearly increasing estimation variance between $S_b^2$ and $S_a^2$ (figure 2b).
3. In practice, both the prediction estimates and their variances will mostly differ.
In figure 2c, $Y_a$ is twice as large as $Y_b$ and $S_a^2$ is twice as large as $S_b^2$. The result is a combination of both preceding situations, with a maximum of $S^2$ being 3.71 times as large as $S_b^2$. It is clear that the difference in prediction estimates dominates the difference in variances.

Figure 2.- The prediction variance in the transition zone, plotted against $P(K = a)$ and $P(K = b)$: (a) $Y_a = 2 Y_b$ and $S_a^2 = S_b^2$; (b) $Y_a = Y_b$ and $S_a^2 = 2 S_b^2$; (c) $Y_a = 2 Y_b$ and $S_a^2 = 2 S_b^2$.

It makes sense that the prediction variance inflates at the position of a retained map boundary, since this boundary itself represents uncertain information. If one chooses to make use of this information, its uncertainty should be added to the prediction variance.

RESULTS

The procedure was tested on the topsoil sand fraction of West-Flanders, Belgium. More details can be found in Boucneau et al. (1996). Stratum boundaries were characterised using Eq. 1. Figure 3 presents an accurately mapped boundary between two strata B and C (strata are indicated in figure 4a), having a narrow transition zone of 500 m (left), and an inaccurately mapped boundary between two strata A and C, having a wide transition zone of 1500 m (right). A prediction estimate of the topsoil sand and its associated variance were produced based on Eqs. 3 and 4. Figure 4 gives a detail of the map results and allows the proposed procedure to be compared with the conventional stratification procedure (i.e. without a transition zone). Three strata can be distinguished: two sandy strata in the upper left corner (stratum A) and the lower right corner (stratum B), and a stratum with low sand content (stratum C) between them. Within the wide transition zone between strata A and C, the proposed procedure recognises that the prediction estimates are uncertain by attributing a high prediction variance to them (figure 4c).
Between strata B and C the boundary is accurately mapped. Here the transition zone is narrow and high prediction variances are absent.

Figure 3.- Narrow (left, strata B and C) and wide (right, strata A and C) transition zones between two strata, plotted against the distance to the boundary (m).

Figure 4.- Detail of the topsoil sand map: the prediction estimate (%) based on the proposed (a) and the conventional procedure (b); the prediction variance (%²) based on the proposed (c) and the conventional procedure (d).

Only observations outside the transition zone are used for computing the variogram (figure 5). As a result, the prediction variance in strata A and C, outside the transition zones, is considerably lower in the proposed procedure than in the conventional procedure (figures 4c and 4d).

Figure 5.- The within-stratum semivariogram of stratum C, computed using all observations and with observations in transition zones of inaccurately mapped boundaries omitted.

CONCLUSIONS

In conventional stratification the study area is sharply partitioned into several strata delimited by a thin line. As a result, the within-stratum variogram is based on all data within each stratum. However, soil attribute values at locations near the border of a stratum often differ from the typical value of that attribute. These observations can increase the semivariance inside a stratum considerably. In this way the variability inside the transition zone influences the estimate of the variance within the entire stratum. Our procedure overcomes this problem by characterising the transition zone between strata. Only observations outside the transition zone are used for computing the variogram. For each location in a transition zone the probabilities $P(K = a)$ and $P(K = b)$ are quantified and incorporated in the prediction estimate and prediction variance.
In this way, the variance due to the vagueness of stratum boundaries is concentrated near these boundaries.

REFERENCES

Boucneau, G., Van Meirvenne, M., Thas, O. & Hofman, G. 1996. Stratified kriging using selected soil map boundaries modelled as vague transition zones. Submitted to European Journal of Soil Science.
Burrough, P.A. 1989. Fuzzy mathematical methods for soil survey and land evaluation. Journal of Soil Science, 40: 477-492.
Goovaerts, P. & Journel, A.G. 1995. Integrating soil map information in modelling the spatial variation of continuous soil properties. European Journal of Soil Science, 46: 397-414.
Heuvelink, G.B.M. & Bierkens, M.F.P. 1992. Combining soil maps with interpolations from point observations to predict quantitative soil properties. Geoderma, 55: 445-468.
Mark, D.M. & Csillag, F. 1989. The nature of boundaries on "Area-Class" maps. Cartographica, 26: 65-78.
Stein, A., Hoogerwerf, M. & Bouma, J. 1988. Use of soil-map delineations to improve (co-)kriging of point data on moisture deficits. Geoderma, 43: 163-177.
Van Meirvenne, M., Scheldeman, K., Baert, G. & Hofman, G. 1994. Quantification of soil textural fractions of Bas-Zaïre using soil map polygons and/or point observations. Geoderma, 62: 69-82.
Voltz, M. & Webster, R. 1990. A comparison of kriging, cubic splines and classification for predicting soil properties from sample information. Journal of Soil Science, 41: 473-490.

APPENDIX

Prediction estimates are obtained from two kriging procedures $K = a$ and $K = b$, which are based on the variograms of strata A and B respectively. The expected value of the examined variable $Y$, if kriging procedure $K = k$ is valid, is $y_k = E(Y \mid K = k)$. The unconditional expected value $y$ is then obtained by summation over the set of procedures $\{a, b\}$:

$y = E(Y) = \sum_{k \in \{a, b\}} E(Y \mid K = k) P(K = k) = y_a P(K = a) + y_b P(K = b)$

The variance $\sigma^2$ is defined by:

$\sigma^2 = \mathrm{Var}(Y) = E(Y^2) - \{E(Y)\}^2$

Working out each term:

$E(Y^2) = \sum_{k \in \{a, b\}} E(Y^2 \mid K = k) P(K = k) = (y_a^2 + \sigma_a^2) P(K = a) + (y_b^2 + \sigma_b^2) P(K = b)$

$\{E(Y)\}^2 = y_a^2 P^2(K = a) + y_b^2 P^2(K = b) + 2 y_a y_b P(K = a) P(K = b)$

Since:

$E(Y^2 \mid K = k) = \mathrm{Var}(Y \mid K = k) + \{E(Y \mid K = k)\}^2 = \sigma_k^2 + y_k^2$

Substituting these terms and using $P(K = a) + P(K = b) = 1$, the variance $\sigma^2$ is given by:

$\sigma^2 = (y_a^2 + \sigma_a^2) P(K = a) + (y_b^2 + \sigma_b^2) P(K = b) - y_a^2 P^2(K = a) - y_b^2 P^2(K = b) - 2 y_a y_b P(K = a) P(K = b)$
$\quad = P(K = a) P(K = b) (y_a - y_b)^2 + \sigma_a^2 P(K = a) + \sigma_b^2 P(K = b)$

The unbiased estimator for the unconditional mean is:

$\hat{E}(Y) = Y_a P(K = a) + Y_b P(K = b)$

Since:

$y = y_a P(K = a) + y_b P(K = b)$

The unbiased estimator for the unconditional variance is:

$S^2 = P(K = a) P(K = b) (Y_a - Y_b)^2 - P(K = a) P(K = b) (S_a - S_b)^2 + S_a^2 P(K = a) + S_b^2 P(K = b)$

Since:

$\sigma^2 = P(K = a) P(K = b) (y_a - y_b)^2 + \sigma_a^2 P(K = a) + \sigma_b^2 P(K = b)$

BIOGRAPHICAL SKETCH

Geert Boucneau is preparing a Ph.D. in Soil Information Systems. Marc Van Meirvenne is professor of Soil Information Systems and Spatial Data Analysis. Olivier Thas is active in the domain of Environmental Statistics. Georges Hofman is professor of Soil Science and Soil Fertility.
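
SUPPLEMENTARY NUMERICAL SKETCH

The estimators for the unconditional mean and variance derived in the appendix can be checked numerically. The Python sketch below is an illustration only, not the authors' implementation. The function name `combine_kriging` and the inputs (a variance of 1 for procedure K = b and a prediction estimate of 3 for procedure K = b) are assumptions chosen for the example and do not come from the paper. With this assumed choice, evaluating the variance estimator at P(K = a) = 0.5 gives ratios of 3.25 and about 3.71 relative to the variance of procedure K = b, which agrees with the values quoted for figures 2a and 2c.

```python
import math


def combine_kriging(y_a, y_b, s2_a, s2_b, p_a):
    """Combine two within-stratum kriging results in a transition zone.

    y_a, y_b   : prediction estimates from procedures K = a and K = b
    s2_a, s2_b : the associated kriging variances
    p_a        : P(K = a); P(K = b) is taken as 1 - p_a
    Returns the combined estimate and the combined variance.
    """
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("p_a must lie in [0, 1]")
    p_b = 1.0 - p_a
    # Probability-weighted mean of the two kriging estimates.
    estimate = y_a * p_a + y_b * p_b
    s_a, s_b = math.sqrt(s2_a), math.sqrt(s2_b)
    # Between-procedure term, correction term and weighted kriging variances.
    variance = (p_a * p_b * (y_a - y_b) ** 2
                - p_a * p_b * (s_a - s_b) ** 2
                + s2_a * p_a + s2_b * p_b)
    return estimate, variance


# Hypothetical inputs (assumed for illustration, not taken from the paper).
S2_B, Y_B = 1.0, 3.0

# Situation 1: Y_a = 2 Y_b, equal variances, evaluated at P(K = a) = 0.5.
_, var_case1 = combine_kriging(2 * Y_B, Y_B, S2_B, S2_B, 0.5)
# Situation 3: Y_a = 2 Y_b and S_a^2 = 2 S_b^2, evaluated at P(K = a) = 0.5.
_, var_case3 = combine_kriging(2 * Y_B, Y_B, 2 * S2_B, S2_B, 0.5)

print(round(var_case1 / S2_B, 2))  # 3.25
print(round(var_case3 / S2_B, 2))  # 3.71
```

At P(K = a) = 1 or P(K = a) = 0 the function returns the estimate and variance of the single valid procedure, which matches conventional within-stratum kriging outside the transition zone.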