Computing the Area Affected by Phosphorus Runoff in an Everglades Wetland Using Bayesian and Universal Kriging

Song S. Qian

Abstract. Phosphorus-enriched agricultural runoff is believed to be the leading cause of ecosystem changes in an Everglades wetland. To study the effects of the added nutrients on the wetland ecosystem, it is necessary to estimate the acreage of the affected region. In this study, Bayesian and universal kriging are used to analyze the data collected by Reddy et al. (1991). The background level of the soil total phosphorus (STP) concentration is used, through an indicator function, to decide whether a region is affected. The expected value of the affected area is calculated using the results from Bayesian and universal kriging. The results show that universal kriging is sensitive to the specification of the covariance function. Universal and Bayesian kriging yield comparable results when the covariance functions are specified in a like manner.

INTRODUCTION

The Water Conservation Area-2A (WCA2A) is part of the Everglades wetlands in south Florida (Figure 1). It receives the agricultural runoff from the Everglades Agricultural Area. The phosphorus-enriched agricultural runoff has caused significant changes in this phosphorus-limited wetland ecosystem. The most obvious change is the shift of the dominant vegetation species from sawgrass to cattail near the inlet of the runoff. The significant increase of the phosphorus level in the water of WCA2A is considered a major threat to the Everglades National Park. To protect the Park, the South Florida Water Management District proposed to use constructed wetlands, man-made marsh-like buffer areas, to remove excess phosphorus before the water enters WCA2A. One important design parameter of a constructed wetland is the mass loading rate per unit area.
The unit area loading rate is usually calculated by measuring the influent mass loading rate (in mass of phosphorus per unit time) and dividing it by the area of the receiving wetland. Because WCA2A covers such a large region, not all parts of the region are effective in removing phosphorus; in other words, not all parts are affected. If the entire WCA2A were taken as affected and used to calculate the mass loading rate, the loading rate per unit area, and with it the phosphorus assimilating capacity of the wetland, would be underestimated.

The data used in this paper are from Reddy et al. (1991), who sampled soils at 74 stations on seven north-south transects, spaced at 2-mile intervals (Figure 2). Richardson (personal communication, 1993) believes that the STP content should be fairly stable if there were no agricultural runoff problem, and that the background level of the STP content is about 500 µg of phosphorus per gram of soil (µg/g) in the top 20 cm. Universal kriging (Cressie, 1991) and Bayesian kriging (Handcock and Stein, 1993) are used to model the phosphorus concentration over the region. Where the concentration is larger than the background level, the corresponding region is believed to be affected by the agricultural runoff.

Figure 1. Map of south Florida and location of the study region
Figure 2. Locations of sampling sites in WCA2A

METHODS

Data analysis

The mean STP content in the top 20 cm layer of the Reddy samples was used. The STP values are log-transformed to stabilize the variance.
The spatial coordinates were converted from latitude and longitude to the Universal Transverse Mercator (UTM) grid system.

Universal kriging (UK) can be interpreted as a Gaussian random field model:

$$Z = X\beta + \eta \qquad (1)$$

where $Z$ is the log-transformed STP content, $X$ is the known design matrix containing the longitude and latitude and the distance to the nearest pollution source, $\beta$ is a vector of unknown parameters, and $\eta$ is the error term, which is assumed to be normally distributed with mean 0 and covariance $C$. The covariance function is $\mathrm{cov}\{Z(s_i), Z(s_j)\} = \alpha K_\theta(s_i, s_j)$ for any pair of spatial coordinate points $(s_i, s_j)$ of interest, where $\alpha > 0$ is a scale parameter and $\theta$ is a vector of structural parameters that specify the shape of the covariance function. In kriging, we observe $\{Z(s_1), Z(s_2), \ldots, Z(s_n)\}' = Z$, and the objective is the prediction of $Z(s_0)$, where $s_0$ is a location with no observed data. The UK predictor is the best linear unbiased predictor (Cressie, 1991), of the form $\hat{Z}_\theta(s_0) = \lambda(\theta)'Z$. See Ripley (1981) for details.

The covariance matrix $\alpha K_\theta$ is usually unknown and is often estimated using the empirical semi-variogram. Once the semi-variogram model is chosen, the covariance function is taken as known, and the uncertainty in the selected model is ignored. To include this uncertainty in the analysis, Handcock and Stein (1993) introduced Bayesian kriging (BK) using the Matern class of covariance functions, with a parameter $\theta = (\theta_1, \theta_2)$. The posterior predictive distribution for $Z(s_0)$ is derived by adopting a noninformative prior:

$$\mathrm{pr}(\alpha, \beta, \theta) \propto \mathrm{pr}(\theta)/\alpha \qquad (2)$$

Handcock and Stein (1993) showed that the conditional posterior distribution of $Z(s_0)$ is a Student-t distribution with location $\hat{Z}_\theta(s_0)$ and squared scale $\frac{n}{n-q}\hat{\alpha}(\theta)V_\theta(s_0)$:

$$Z(s_0) \mid \theta, Z \sim t_{n-q}\left(\hat{Z}_\theta(s_0),\ \frac{n}{n-q}\,\hat{\alpha}(\theta)\,V_\theta(s_0)\right) \qquad (3)$$

where $q$ is the number of parameters in $\beta$.
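As an illustration of the UK point predictor and its prediction error variance, the following is a minimal Python sketch, not the code used in this study. It assumes an exponential covariance with known scale `alpha` and range `range_`, a design matrix with only an intercept and the two coordinates (the distance-to-source covariate is omitted), and synthetic data. The names `exp_cov` and `uk_predict` and all numerical values are hypothetical.

```python
import numpy as np

def exp_cov(d, alpha, range_):
    """Exponential covariance alpha * exp(-d / range_) (an assumed model)."""
    return alpha * np.exp(-np.asarray(d) / range_)

def uk_predict(S, Z, X, s0, x0, alpha, range_):
    """Universal kriging prediction of Z(s0) with a known covariance.

    S: (n, 2) coordinates, Z: (n,) observations, X: (n, q) design matrix,
    s0: (2,) prediction location, x0: (q,) design vector at s0.
    Returns the point prediction and the prediction error variance.
    """
    D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
    K = exp_cov(D, alpha, range_)                        # cov among data
    k0 = exp_cov(np.linalg.norm(S - s0, axis=1), alpha, range_)
    Ki_X = np.linalg.solve(K, X)
    Ki_k0 = np.linalg.solve(K, k0)
    A = X.T @ Ki_X                                       # X' K^-1 X
    beta = np.linalg.solve(A, X.T @ np.linalg.solve(K, Z))  # GLS estimate
    pred = x0 @ beta + Ki_k0 @ (Z - X @ beta)
    r = x0 - X.T @ Ki_k0                                 # trend mismatch term
    var = alpha - k0 @ Ki_k0 + r @ np.linalg.solve(A, r)
    return pred, var

# Synthetic example (hypothetical values, not the WCA2A data).
rng = np.random.default_rng(0)
S = rng.uniform(0.0, 10.0, size=(20, 2))
X = np.column_stack([np.ones(20), S])
Z = X @ np.array([2.5, 0.03, -0.02]) + 0.1 * rng.standard_normal(20)
s0 = np.array([5.0, 5.0])
x0 = np.array([1.0, 5.0, 5.0])
pred, var = uk_predict(S, Z, X, s0, x0, alpha=0.03, range_=3.0)
print(pred, var)
```

Because no measurement-error (nugget) term is included, this sketch reproduces the observed value exactly at a sampled location, with zero prediction variance there.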
The marginal posterior distribution of $\theta$ is:

$$\mathrm{pr}(\theta \mid Z) \propto \mathrm{pr}(\theta)\,|K_\theta|^{-1/2}\,|X'K_\theta^{-1}X|^{-1/2}\,\hat{\alpha}(\theta)^{-(n-q)/2} \qquad (4)$$

The Bayesian predictive distribution for $Z(s_0)$ is:

$$\mathrm{pr}(Z(s_0) \mid Z) = \int_\Theta \mathrm{pr}(Z(s_0) \mid \theta, Z)\,\mathrm{pr}(\theta \mid Z)\,d\theta \qquad (5)$$

where $\hat{Z}_\theta(s_0)$ is the UK point predictor, $\hat{\alpha}(\theta) = \frac{1}{n}(Z - X\hat{\beta})'K_\theta^{-1}(Z - X\hat{\beta})$ with $\hat{\beta}$ the generalized least squares estimate of $\beta$, $\alpha V_\theta(s_0)$ is the UK prediction error variance, and $t_{n-q}$ is a Student-t distribution with $(n - q)$ degrees of freedom. See Handcock and Stein (1993) for details.

The prior distribution of the parameters of the covariance function, $\mathrm{pr}(\theta)$, was chosen to represent the belief that large values of the parameters are less likely than small ones.

Area estimation

The affected area is $A_p = \int_{X \in R} I(Z(X) > t)\,dX$, and its expected value is:

$$E(A_p) = \int_{X \in R} \mathrm{pr}(Z(X) > t \mid Z)\,dX \qquad (7)$$

where $I(\cdot)$ is an indicator function, $X$ is the coordinate vector, $R$ is the study region, and $t$ is the background level. When UK is used, the integrand on the right-hand side of equation (7) can be evaluated using the CDF of a normal density. When BK is used:

$$E(A_p) = \int_{X \in R} \int_\Theta \mathrm{pr}(Z(X) > t \mid \theta, Z)\,\mathrm{pr}(\theta \mid Z)\,d\theta\,dX \qquad (8)$$

The uncertainty of the estimated area using BK is evaluated using a Monte Carlo simulation method. The procedure is based on the fact that, given $\theta$, the area is $A_\theta = \int_{X \in R} I(Z(X) > t \mid \theta)\,dX$. It can be approximated by sampling $Z(X)$ and evaluating the integrand; the mean of these values is approximately equal to the proportion of the affected area over the entire WCA2A. A Monte Carlo simulation algorithm was used to sample $\theta$ and then $Z(X)$.

RESULTS

The residuals of model (1) are found to be intrinsically stationary and isotropic through visual inspection of the semi-variograms in the north-south and east-west directions. For UK, four different covariance functions are used. One is the Matern covariance function using the maximum posterior estimate of $\theta$ (UK-M). The others are exponential (UK-exp), spherical (UK-sph), and Gaussian (UK-Gau). Figure 3 shows the log-transformed data.

Figure 3. Log-transformed data (axis: Easting, km)
Figure 4. Posterior of the covariance function parameters (axis: the range parameter)
Figure 5. Covariance functions (axis: Distance, km)
Figure 6. Predictive distributions, site 10 (axis: log-transformed STP)
Figure 7. Predictive distributions, site 25 (axis: log-transformed STP)

Covariance function

The posterior of the covariance function parameters used in BK is presented in Figure 4, which shows the log-transformed posterior as a function of the parameters ($\theta_1$ and $\theta_2$). The posterior mode is at (3.6515, -0.4697) on the log10 scale of the parameters; that is, the maximum posterior estimates (MPE) of the parameters are (4482, 0.3391). The estimated $\alpha$ given the MPE of $\theta$ is 0.032363. The covariance functions in UK are estimated by fitting the semi-variogram model. Figure 5 compares the covariance functions computed from the different semi-variogram models. In Figure 5, the line labeled "Matern" is the Matern covariance function using the MPE of $\theta$.

Predictive distributions

The predictive distributions of the log-transformed STP are estimated for the 10th sampling site, which has the largest STP content, and the 25th sampling site, which has a below-average STP content. The density functions are shown in Figures 6 and 7. Table 1 compares the results from BK, UK-M, and UK using the three semi-variogram models.

Table 1. Predictive distributions (predicted mean and measured STP on the log-transformed scale)

Model    Sampling Site   Predicted Mean   Prob. > 500 µg/g   Measured STP
BK       25              2.48754          0.076779           2.398536
UK-M     25              2.478177         0.052068           2.398536
UK-exp   25              2.462711         0.030408           2.398536
UK-Gau   25              2.179858         0.000001           2.398536
UK-sph   25              2.453698         0.022335           2.398536
BK       10              2.826391         0.786981           3.100039
UK-M     10              2.850576         0.848553           3.100039
UK-exp   10              2.83751          0.823649           3.100039
UK-Gau   10              2.814715         0.736275           3.100039

Comparing the five predictive distributions for the 10th sampling site, we note that the predictive distributions from UK-exp, UK-sph, and UK-M are nearly identical. BK yields a similar mean value but a slightly larger estimation variance. The predictive distribution from UK-Gau has the largest variance. For the 25th sampling site, we see comparable predictive distributions from UK-exp, UK-sph, and UK-M. BK yields a similar mean and a slightly larger variance.
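The exceedance probabilities reported for UK can be illustrated with a short Python sketch. It assumes that the log transform is base 10 (an inference from the reported means and measured values, not stated in the text), that the threshold is exactly log10(500), and that the UK predictive distribution is normal. Under those assumptions it backs out the predictive standard deviation implied by a reported mean and probability. The function names `exceedance_prob` and `implied_sd` are hypothetical.

```python
from math import log10
from statistics import NormalDist

# Assumed threshold: base-10 log of the 500 ug/g background level.
T = log10(500.0)

def exceedance_prob(mean, sd, threshold=T):
    """pr(Z > threshold) for a normal predictive distribution."""
    return 1.0 - NormalDist(mean, sd).cdf(threshold)

def implied_sd(mean, prob, threshold=T):
    """Standard deviation that makes pr(Z > threshold) equal to prob.

    Solves (mean - threshold) / sd = Phi^{-1}(prob); undefined for prob = 0.5.
    """
    z = NormalDist().inv_cdf(prob)
    return (mean - threshold) / z

# Reported UK-M values for sites 10 and 25 (predicted mean, Prob. > 500 ug/g).
sd_site10 = implied_sd(2.850576, 0.848553)
sd_site25 = implied_sd(2.478177, 0.052068)
print(sd_site10, sd_site25)

# Round trip: the implied sd reproduces the reported probability.
print(exceedance_prob(2.850576, sd_site10))
```

The same calculation does not apply directly to the BK rows, because the BK predictive distribution is a mixture of Student-t distributions rather than a single normal.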
UK-Gau produces a predictive distribution that differs markedly from the other four. Its mode is significantly lower than the modes of the other four predictive distributions. The Gaussian covariance function is equivalent to the Matern covariance function with the smoothness parameter $\theta_2 \to \infty$. This means that UK-Gau describes a function that is infinitely differentiable. This representation may not be realistic.

Computation of the affected area

Equation (7) was used for UK and equation (8) was used for BK. The results are listed in Table 2. In this study, we did not try to locate the boundary of the affected region. However, the shape of the affected region may be stable, and we plot the contour lines of the probability of STP larger than 500 µg/g to indicate the possible shape. Figure 8 is plotted using BK, and Figures 9 to 12 show the surfaces from UK. The surface from UK-Gau is mostly flat but has sudden jumps. The reason for this unnaturally shaped surface is that the Gaussian covariance function describes "super smooth" surfaces, in this case super smooth for the residuals. A model with a very smooth residual surface must compensate with a jumpy mean surface.

Table 2. Expected affected area (columns: Model; affected area, km²; % of the total area)

From Table 2, we note that the estimated area using BK is very close to the estimated areas using UK-M and UK-exp. The estimated area is sensitive to the specification of the covariance function when UK is used. It is reasonable to state that BK is more appropriate in this study, since the covariance function is unknown and BK accounts for the uncertainty in estimating it. The uncertainty of the estimated area is evaluated only for BK. Figure 13 presents the histogram of the percentages of the affected area from the Monte Carlo simulation.

DISCUSSION

One significant difference between the two methods is the increased computational intensity of BK. However, the results indicate that UK is sensitive to the specification of the covariance function.
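The expected-area computation for UK in equation (7) can be approximated on a prediction grid by summing the exceedance probability of each cell times the cell area. The Python sketch below shows this Riemann-sum approximation under the assumption of normal predictive distributions. The function name `expected_affected_area`, the grid values, and the cell area are hypothetical and are not results from this study.

```python
from math import log10
from statistics import NormalDist

def expected_affected_area(means, sds, cell_area_km2, threshold):
    """Riemann-sum approximation of E(A) = integral of pr(Z(X) > t) dX.

    means, sds: predictive mean and standard deviation for each grid cell
    (log scale); cell_area_km2: area of one grid cell; threshold: t.
    Returns the expected area and the per-cell exceedance probabilities.
    """
    probs = [1.0 - NormalDist(m, s).cdf(threshold) for m, s in zip(means, sds)]
    return cell_area_km2 * sum(probs), probs

# Hypothetical four-cell grid (illustrative values only).
t = log10(500.0)            # assumes a base-10 log transform
grid_means = [3.0, 2.8, 2.6, 2.4]
grid_sds = [0.15, 0.15, 0.15, 0.15]
area, probs = expected_affected_area(grid_means, grid_sds, 1.0, t)
print(area, probs)
```

For BK, the per-cell probability would instead be averaged over draws of θ from its posterior, as in equation (8). That step is not shown here.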
The difference between two covariance functions may be significant even though the corresponding semi-variogram models fit the data equally well. This sensitivity to the covariance function justifies the use of BK.

Several studies have estimated the phosphorus-affected area of WCA2A. Craft and Richardson (1993) used the same data set as presented in this paper. In their study, the boundary of the affected region was delineated using both the STP content and the phosphorus accumulation rate. The delineated boundary was plotted onto a USGS topographic map, and the area was then measured. Their estimate of the area is 115 km². Only the top 10 cm layer of the Reddy data was used, and 600 µg/g was taken as the background level. The STP contour lines were produced using the ordinary kriging algorithm of a commercial graphics package (Surfer®). The semi-variogram model used is not reported.

Using the STP content as an indicator of the influence of the agricultural runoff is a more appropriate way of estimating the affected area. A spatial statistical approach accounts for the uncertainty in the sample data. Considering the high level of uncertainty involved in the estimation process (Figure 13), the estimated area using BK should be considered comparable to the result in Craft and Richardson (1993). Setting aside the error introduced by measuring the area from a map, the discrepancy between the estimates may be caused by two factors: their use of ordinary kriging, possibly with a different covariance function, and the different background level. There is no reason to believe that the background level is a constant.

Figure 8. Bayesian (axis: Easting, km)
Figure 9. Matern (axis: Easting, km)
Figure 10. Exponential (axis: Easting, km)
Figure 11. Gaussian (axis: Easting, km)
Figure 12. Spherical (axis: Easting, km)
Figure 13. Estimated area (histogram from the Monte Carlo simulation)
The field samples from the region that is unaffected by the agricultural runoff show significant variation (Craft and Richardson, 1993). In a current study by the author, the background level is represented by a probability distribution.

ACKNOWLEDGMENT

Drs. Richard Smith and Peter Müller motivated the author to pursue this project. Dr. Marcia Gumpertz's comments and suggestions significantly improved this article. The author appreciates the encouragement and support from Drs. C.J. Richardson and K.H. Reckhow.

REFERENCES

Craft, C.B., and C.J. Richardson. 1993. Peat accretion and phosphorus accumulation along a eutrophication gradient in the northern Everglades. Biogeochemistry, 22, 133-156.

Cressie, N.A.C. 1991. Statistics for Spatial Data. John Wiley & Sons, Inc., New York.

Handcock, M.S., and M.L. Stein. 1993. A Bayesian analysis of kriging. Technometrics, 35(4), 403-410.

Howard-Williams, C. 1985. Cycling and retention of nitrogen and phosphorus in wetlands: a theoretical and applied perspective. Freshwater Biology, 15, 391-431.

Reddy, K.R., W.F. DeBusk, Y. Wang, R. DeLaune, and M. Koch. 1991. Physico-chemical properties of soils in the Water Conservation Area 2 of the Everglades. Final report submitted to South Florida Water Management District, West Palm Beach, FL 33416.

Ripley, B.D. 1981. Spatial Statistics. John Wiley & Sons, Inc., New York.

BIOGRAPHICAL SKETCH

Song S. Qian is a postdoctoral research associate at the Duke Wetland Center of the Nicholas School of the Environment of Duke University. He received his Ph.D. in environmental sciences and his MS degree in statistics from Duke University. He has a Master's degree in environmental systems engineering from Nanjing University, China.