Integrating Spatial Statistics With GIS and Remote Sensing in Designing Multiresource Inventories 1

Robin M. Reich 2
Vanessa A. Bravo 3

Abstract-In order to design an integrated multiresource inventory and monitoring system that evaluates the status and trends of natural resources (forest, rangeland, agriculture, wildlife, hydrology, soils, etc.), baseline data for comparison are needed. These systems are generally complex, and it may not be wise to select just one or two variables for monitoring purposes. Also, analyzing these variables independently of one another may lead to incorrect conclusions because of their interdependencies. One approach is to model the spatial relationships that exist between key variables. This information can then be used, for example, to identify forest habitats that are either conducive to, or a deterrent to, the presence of ecologically important plant and/or animal species. Techniques commonly used in describing spatial relationships between two or more variables include regression analysis and a variety of spatial and geostatistical procedures such as kriging and cokriging. Spatially explicit models can be used to monitor the efficiency of certain components of proposed management plans, as well as to provide a general prediction of how key indicator variables are changing in time and space. Such models also provide greater insight into changes in the landscape, on both the macro- and microscale, and, more importantly, into the consequential impact these changes have on selected species. Theoretical and technical aspects of this approach are briefly described in this paper.

Spatial Modeling

An important problem facing natural resource managers is the integration of several types of data when modeling the spatial dynamics of an individual population.
There are two aspects to the problem: first, the integration of data from different sources at a fine enough resolution, and second, modeling the spatial dynamics of an individual population. The first aspect, data integration, has been researched extensively during the last decade. The most widely accepted procedure for integrating spatial data is the use of geographic information systems (GIS). GIS allow for the collection, storage, and analysis of objects and phenomena for which geographic location is an important characteristic or is critical to the analysis (Aronoff 1991). GIS has been used for a variety of purposes, including the identification of suitable wildlife habitat, timber harvest scheduling, and modeling biodiversity and population dynamics (Lui et al. 1995). Integration of remotely sensed data and geographic information systems is becoming an extremely powerful tool for producing maps of ecosystem resources and has become vital to resource managers in making decisions and establishing policy (Aronoff 1991). The main obstacle in the development of a descriptive GIS model is the coarse-grained resolution of raster data.

1 Paper presented at the North American Science Symposium: Toward a Unified Framework for Inventorying and Monitoring Forest Ecosystem Resources, Guadalajara, Mexico, November 1-6, 1998.
2 Robin M. Reich is professor, Department of Forest Sciences, Colorado State University, Fort Collins, Colorado 80521 USA.
3 Vanessa A. Bravo is researcher, Quantitative Spatial Analysis Company, Fort Collins, Colorado 80525 USA.

Spatial Predictive Models

The ability to model the small scale variability in stand characteristics requires the generation of full-coverage maps depicting stand characteristics measured in the field. While remotely sensed data have been shown to provide reliable information for macro-scale ecological monitoring, they fall short of providing the precision required by more refined ecosystem resource models (Gown et al. 1994).
Spatial statistics and geostatistics provide a means of developing spatial models that can be used to correlate remotely sensed imagery with field measurements. If a satellite image is geographically referenced to a base map, one can overlay the locations of field plots on the image to obtain the pixel intensities associated with each of the field plots. Thus, for each sample plot we have field data describing stand characteristics and seven intensities representing the seven TM bands (Fig. 1A) (Aronoff 1991). If the field data are spatially correlated with the intensity of the remotely sensed image, it is possible to develop a model describing this spatial continuity (Cliff and Ord 1981). It is also possible to include geographical variables, such as elevation, slope, aspect, and precipitation, that are thought to influence the large scale spatial variability of the environmental property and are available as a complete coverage of the study area. The functional form of this model is defined as:

\[ \Phi_0 = \sum_{i+j \le p} \beta_{ij}\, x_{10}^{i}\, x_{20}^{j} + \sum_{k=1}^{q} \gamma_k\, y_{k0} + \eta_0 \tag{1} \]

where $\beta_{ij}$ are the regression coefficients associated with the trend surface component of the model, $\gamma_k$ are the regression coefficients associated with the $q$ auxiliary variables, $y_{k0}$, available as a coverage in the GIS database, and $\eta_0$ is the error term, which may or may not be spatially correlated with its neighbors (Kallas 1997; Metzger 1997).

Once a spatial or temporal dependency is established for a given variable, this information can be used to interpolate values for points not measured (Robertson 1987). In most sample surveys, supplemental information is collected in addition to the variable of interest (e.g., average stand diameter, percent crown cover, food availability, etc.). If these variables are spatially correlated with the variable of interest, this information can be used to improve estimates (Isaaks and Srivastava 1989). The use of auxiliary information in spatial prediction is referred to as cokriging. The usefulness of auxiliary information is enhanced by the fact that the variable of interest is generally undersampled (Isaaks and Srivastava 1989).

USDA Forest Service Proceedings RMRS-P-12. 1999

To account for this spatial autocorrelation in the residuals of the model developed to describe large scale spatial variability, we propose to model the small scale spatial variability (i.e., spatial noise) using the cokriging model:

\[ \eta_0 = \sum_{r=1}^{n} w_r\, \eta_r + \sum_{t=1}^{m} \sum_{r=1}^{n} v_{tr}\, u_{tr} + \varepsilon_0 \tag{2} \]

subject to the linear constraints:

\[ \sum_{r=1}^{n} w_r = 1, \qquad \sum_{t=1}^{m} \sum_{r=1}^{n} v_{tr} = 0 \tag{3} \]

where $w_r$ are the kriging weights associated with the $n$ nearest residuals, $v_{tr}$ are the cokriging weights associated with the $m$ auxiliary variables, $u_{tr}$, that are spatially correlated with the residuals, and $\varepsilon_0$ is the error term, which we will assume to be spatially independent and normally distributed with mean 0 and variance $\sigma^2$.

Figure 1.-Gray scale maps of an 890 ha experimental forest northeast of Gainesville, Florida depicting: (A) Digital numbers for band 5 of a Landsat image; (B) Estimated basal area (m2/ha) at a 10 m resolution for a selected portion of the forest (R2 = 0.77); (C) Standard error of prediction of basal area (m2/ha) for a selected portion of the forest. The centers of the black circles are the approximate locations of the sample plots used in developing the spatial model for basal area (Metzger 1997).

One of the appealing features of cokriging is that the auxiliary information does not have to be collected at the same data points as the variable of interest. This allows us to combine remote sensing and field data to provide a full coverage map with a higher resolution than would have been possible using either remote sensing or field data alone. In essence, remote sensing images provide information on large scale spatial variability, while field data provide information on small scale spatial variability.

Prior to fitting the cokriging model, the residuals of the model describing the large scale spatial variability are analyzed for anisotropy (spatial autocorrelation that changes with direction). The residuals are also evaluated for the presence of spatial cross-correlation (Bonham et al. 1995; Czaplewski and Reich 1993; Reich et al. 1994) with the independent variables included in the large scale model, or with variables for which only data associated with field plot locations are available. Complete coverage of the variables associated with the field data is not available in the GIS database. If no spatial cross-correlation is detected, the residuals can be modeled using ordinary kriging; otherwise the residuals are modeled using cokriging. Spatial dependency of the residuals can be modeled using the Gaussian semi-variogram

\[ \gamma(h; \theta) = \begin{cases} 0, & h = 0 \\ c_0 + c_1 \left( 1 - \exp\left( -3 \lVert h \rVert^2 / a^2 \right) \right), & h \ne 0 \end{cases} \tag{4} \]

or some other appropriate model (spherical, exponential, linear, etc.), where $\theta = (c_0, c_1, a)$ is a vector of parameters subject to the constraints $c_0 \ge 0$, $c_1 \ge 0$, and $a \ge 0$. In modeling the cross-correlations between the residuals and independent variables, the constraints for the model are relaxed to allow the parameters $c_0$ and $c_1$ to take on negative values. The parameters of the semi-variogram model are estimated by minimizing:

\[ \sum_{j=1}^{k} n_j \left\{ 2\gamma^{*}(h(j)) - 2\gamma(h(j); \theta) \right\}^2 \tag{5} \]

where $2\gamma^{*}(\cdot)$ is the sample variogram/cross-variogram obtained at $k$ lags $(h(1), \ldots, h(k))$, $n_j$ is the number of observations contributing to the estimator at each lag, and $2\gamma(h; \theta)$ is the variogram model with parameter $\theta = (c_0, c_1, a)$.
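The weighted least squares criterion for the semi-variogram parameters can be illustrated with a minimal sketch. It assumes the Gaussian model takes the common squared-distance form, and it replaces a proper nonlinear optimizer with a coarse grid search so that only the Python standard library is needed. The function names, the lag values, the pair counts, and the parameter grids are all hypothetical and are not taken from the paper.

```python
import math
from itertools import product


def gaussian_semivariogram(h, c0, c1, a):
    """Gaussian semi-variogram: 0 at h = 0, else c0 + c1 * (1 - exp(-3 h^2 / a^2)).

    The squared-distance exponent is an assumption of this sketch.
    """
    if h == 0:
        return 0.0
    return c0 + c1 * (1.0 - math.exp(-3.0 * (h / a) ** 2))


def wls_objective(lags, gamma_star, counts, theta):
    """Weighted least squares criterion: sum_j n_j * (2*gamma*(h_j) - 2*gamma(h_j; theta))^2."""
    c0, c1, a = theta
    return sum(
        n * (2.0 * g - 2.0 * gaussian_semivariogram(h, c0, c1, a)) ** 2
        for h, g, n in zip(lags, gamma_star, counts)
    )


def fit_by_grid_search(lags, gamma_star, counts, c0_grid, c1_grid, a_grid):
    """Return the (c0, c1, a) on the supplied grid that minimizes the criterion."""
    return min(
        product(c0_grid, c1_grid, a_grid),
        key=lambda theta: wls_objective(lags, gamma_star, counts, theta),
    )


# Hypothetical sample semi-variogram, generated here from known parameters
# (c0 = 0.1, c1 = 0.9, a = 300 m) purely so the fit can be checked.
lags = [50.0, 100.0, 150.0, 200.0, 300.0, 400.0]
counts = [40, 75, 90, 110, 95, 60]  # hypothetical number of pairs per lag
gamma_star = [gaussian_semivariogram(h, 0.1, 0.9, 300.0) for h in lags]

best = fit_by_grid_search(
    lags, gamma_star, counts,
    c0_grid=[0.0, 0.1, 0.2],
    c1_grid=[0.7, 0.8, 0.9, 1.0],
    a_grid=[200.0, 300.0, 400.0],
)
print(best)
```

In practice the grid search would be replaced by a constrained nonlinear optimizer, and for cross-variograms the non-negativity limits on the nugget and sill grids would be relaxed, as described in the text.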
Prior to fitting the variogram and cross-variogram models, the residuals and independent variables are rescaled by dividing the individual variables by their respective maximum values (Carr and McCallister 1985). The predicted surface of scaled residuals obtained using kriging/cokriging is then rescaled back to its original units by multiplying the surface by the maximum observed residual. The rescaled surface of predicted residuals is then added to the predicted surface describing large scale spatial variability to create the final surface with the desired scale (Fig. 1B):

\[ \hat{\Phi}_0 = \sum_{i+j \le p} \beta_{ij}\, x_{10}^{i}\, x_{20}^{j} + \sum_{k=1}^{q} \gamma_k\, y_{k0} + \sum_{r=1}^{n} w_r\, \eta_r + \sum_{t=1}^{m} \sum_{r=1}^{n} v_{tr}\, u_{tr} + \varepsilon_0 \tag{6} \]

The kriging/cokriging surfaces can be cross-validated to assess the amount of variability in the prediction error of the kriging/cokriging system. Cross-validation involves deleting one observation from the data set and predicting the deleted observation using the remaining observations in the data set. This process is repeated for all observations in the data set. Residuals are computed as the observed minus the predicted values and analyzed using standard techniques employed in regression analysis to evaluate the underlying assumptions of the model. Overall performance of the final model (large scale model + kriged/cokriged residuals) can be evaluated by computing an R2 value similar to that used in regression analysis (Kallas 1997):

\[ R^2 = 1 - \frac{\sum_{j=1}^{n} \left\{ \Phi(s_j) - \hat{\Phi}(s_j) \right\}^2}{\sum_{j=1}^{n} \left\{ \Phi(s_j) - \bar{\Phi} \right\}^2} \tag{7} \]

where $\Phi(s_j)$ is the observed value at sample location $s_j$, $\hat{\Phi}(s_j)$ is the corresponding predicted value, and $\bar{\Phi}$ is the mean of the observed values.

In addition, response surfaces of predicted standard errors (Fig. 1C) for the final model can be computed using the following variance formula (Isaaks and Srivastava 1989):

\[ \mathrm{Var}(\varepsilon_0) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j\, \mathrm{Cov}(\eta_i \eta_j) + \sum_{i=1}^{m} \sum_{j=1}^{m} v_i v_j\, \mathrm{Cov}(\mu_i \mu_j) + 2 \sum_{i=1}^{n} \sum_{j=1}^{m} w_i v_j\, \mathrm{Cov}(\eta_i \mu_j) - 2 \sum_{i=1}^{n} w_i\, \mathrm{Cov}(\eta_i \eta_0) - 2 \sum_{j=1}^{m} v_j\, \mathrm{Cov}(\mu_j \eta_0) + \mathrm{Cov}(\eta_0 \eta_0) \tag{8} \]

where $\mathrm{Cov}(\eta_i \eta_j)$ is the autocovariance between the estimated environmental property at locations $i$ and $j$, $\mathrm{Cov}(\mu_i \mu_j)$ is the autocovariance between the auxiliary variables at locations $i$ and $j$, and $\mathrm{Cov}(\eta_i \mu_j)$ is the cross-covariance between the estimated environmental property at location $i$ and the auxiliary variable at location $j$.

Spatial Integration

The ability to spatially model field data allows one to integrate the data over any specified geographical region (e.g., stand, management unit, watershed, region, etc.) to obtain a point estimate and an associated standard error of prediction. This is accomplished by integrating the three-dimensional response surface representing the variable of interest over the area of interest and dividing by the area. Since the spatially modeled response surfaces can be represented as a grid in ARC/INFO, any specified region will contain a finite number ($n$) of grid cells of uniform size (e.g., 10 m x 10 m). Our point estimate of a resource in some bounded region $A$ is obtained by summing the point estimates associated with each cell, $\hat{\Phi}_i$, and dividing by the number of cells in the bounded region:

\[ \hat{\Phi}_A = \frac{1}{n} \sum_{i=1}^{n} \hat{\Phi}_i, \qquad \hat{\Phi}_i \in A \tag{9} \]

The estimated variance is given by

\[ V(\hat{\Phi}_A) = \frac{1}{n^2} \sum_{i=1}^{n} V(\varepsilon_i) + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j \ne i} \rho_{ij}(h)\, V(\varepsilon_i)^{1/2}\, V(\varepsilon_j)^{1/2} \tag{10} \]

where $V(\varepsilon_i)$ is the estimated variance associated with cell $i$ (Eq. 8), and $\rho_{ij}(h)$ is the spatial correlation between cells $i$ and $j$, which are separated by distance $h$. The spatial correlation is estimated using the appropriate variogram function (Eq. 4) associated with the variable of interest. For example, if we apply Eqs. 9 and 10 to the small polygon (24.64 ha) located in the center of Figures 1B and 1C, we obtain a basal area estimate of 13.4 m2/ha with a bound on the error of estimation of 3.7 m2/ha at the 67% level of confidence.
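The regional point estimate and its variance can be sketched for a small block of grid cells as follows. This is an illustration under stated assumptions rather than the authors' implementation: the cell values, the cell variances, and the Gaussian-shaped correlation function with a 30 m range are hypothetical, and in an application the correlation would be derived from the fitted variogram of the variable of interest.

```python
import math


def regional_estimate(predictions, variances, coords, correlation):
    """Regional mean of the cell predictions and the variance of that mean.

    The variance is the sum of the cell variances plus, for every ordered pair
    of distinct cells, correlation(h) * sqrt(V_i) * sqrt(V_j), all divided by n^2.
    """
    n = len(predictions)
    mean = sum(predictions) / n
    total = sum(variances)
    for i in range(n):
        for j in range(n):
            if i != j:
                h = math.dist(coords[i], coords[j])  # distance between cell centers
                total += correlation(h) * math.sqrt(variances[i]) * math.sqrt(variances[j])
    return mean, total / n ** 2


def gaussian_correlation(h, a=30.0):
    """Hypothetical correlation function, exp(-3 h^2 / a^2), with range a in meters."""
    return math.exp(-3.0 * (h / a) ** 2)


# Hypothetical 2 x 2 block of 10 m cells: basal area (m2/ha) and prediction variances.
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
predictions = [12.0, 14.0, 13.0, 15.0]
variances = [4.0, 4.0, 4.0, 4.0]

mean, var = regional_estimate(predictions, variances, coords, gaussian_correlation)
print(mean, math.sqrt(var))
```

The double loop makes the role of the correlation term explicit: with no spatial correlation the variance of the mean shrinks as 1/n, while strongly correlated cells contribute almost no additional information.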
Point Process Models

The second aspect, modeling the spatial dynamics of an individual population, is a more recent development, aided by the increase in computing power, which makes it easier to perform the intricate computations needed to explore complex spatial patterns. One class of spatial models that has received considerable attention in recent years is the Gibbsian interaction model, which is often referred to as a Markov random field (Ripley 1990; Cressie 1991). These models encompass conditional spatial autoregression and a wide class of models for interacting point patterns. The term Gibbsian interaction comes from statistical mechanics, where such models have been used for nearly a century to describe the behavior of gases (Ripley 1990; Cressie 1991). In most applications, interactions between events are assumed to be pairwise. Examples of spatial stochastic models that take into consideration the interaction among events include work on sequential packing models of non-overlapping discs (Matern 1960; Bartlett 1974; Diggle et al. 1976), Poisson cluster models (Matern 1960; Diggle 1979), and Strauss-type and hard-core models (Strauss 1975; Kelly and Ripley 1976; Gates and Westcott 1980).

While most of this work has been theoretical, the increase in computing power has contributed to progress in estimating the parameters of these models using theoretical approximations to the likelihood function or computer simulations. Approximate maximum pseudo-likelihood procedures provide reasonable parameter estimates and are somewhat easier to apply than approximate maximum likelihood (Ripley 1990). Nonparametric estimators of pairwise-interaction point processes for similar problems have also been developed (Diggle et al. 1987). In developing these models it is assumed that we have very specific information on the location of every individual within the population.
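As a concrete illustration of a pairwise interaction model of the Strauss type, the sketch below evaluates the unnormalized log density of a mapped point pattern. The paper does not write this density out; the form used here, in which each point contributes a factor beta and each pair of points closer than a radius R contributes a factor gamma, is the standard textbook statement of the Strauss model and is an assumption of the example. The function names and coordinates are hypothetical.

```python
import math


def close_pairs(points, radius):
    """Count unordered pairs of points separated by less than `radius`."""
    n = len(points)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < radius
    )


def strauss_log_density(points, beta, gamma, radius):
    """Unnormalized log density n*log(beta) + s*log(gamma), s = number of close pairs.

    gamma = 1 gives no interaction, 0 < gamma < 1 gives inhibition, and
    gamma = 0 gives a hard-core model in which close pairs are impossible.
    """
    s = close_pairs(points, radius)
    if gamma == 0.0:
        return -math.inf if s > 0 else len(points) * math.log(beta)
    return len(points) * math.log(beta) + s * math.log(gamma)


# Hypothetical locations (km); two pairs lie within 1 km of each other.
points = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.0, 0.4)]
print(close_pairs(points, 1.0))
print(strauss_log_density(points, beta=1.0, gamma=0.5, radius=1.0))
```

Because the normalizing constant is not computed, these values are only useful for comparing configurations or as an ingredient in simulation and pseudo-likelihood methods of the kind cited above.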
This information may be obtained from intensive monitoring research sites aimed at studying very specific components of the environment. For example, one might be interested in studying the spatial relationship of the northern goshawk, or of selected plants, with their habitat. The plants and/or animals would be located in the field and georeferenced, and important variables thought to influence their presence would be measured. This information can then be used to model the spatial interaction of individual species (e.g., threatened and endangered plants and animals) with themselves, other species, and their environment using procedures developed by Reich et al. (1997).

Suppose one has a mapped spatial pattern of points in a finite planar region. In the case of the northern goshawk, it is easy to identify potential habitat using environmental variables such as elevation, slope, and aspect along with existing forest cover type maps. Even though suitable habitat may be identified, this does not mean that the species will be present at that location. In habitats where the goshawk is present, a pattern in which individuals are rather equally spaced from one another would be expected. Such a pattern is called "regular". One way to model this spatial interaction is to consider a function of the distances ($r_{ij}$) between individual sites of activity. In such instances, it is customary to assume that the equilibrium system is statistically characterized by a Gibbs distribution of total potential energy (Cressie 1991):

\[ U_N(X) = \sum_{i<j}^{N} \Psi(r_{ij}) \tag{11} \]

where the points $X$ can be regarded as being distributed according to a Gibbs canonical distribution:

\[ f(X) = \exp[-U_N(X)] / Z(\Psi; N) \tag{12} \]

where $Z(\cdot)$ is a normalizing constant. For a single species population, a positive potential energy represents a repulsion between individuals, while a negative potential energy represents an attraction between individuals (Fig. 2).

Figure 2.-Pairwise potential model (PF 3) describing the spatial interaction of northern goshawk territories on the Kaibab National Forest in northern Arizona. The northern goshawk is territorial with a minimum distance of 1 km between territories; territories are spatially independent at approximately 2.1 km (Reynolds and Joy 1995, personal communication). [Axes: pairwise potential versus distance (km).]

This model can be expanded to include more than one species:

\[ U_N(X) = \sum_{i<j}^{N_1} \Psi_1(r_{ij}) + \sum_{i<j}^{N_2} \Psi_2(r_{ij}) + \sum_{i}^{N_1} \sum_{j}^{N_2} \Psi_{12}(r_{ij}) \tag{13} \]

where $\Psi_1(r_{ij})$ and $\Psi_2(r_{ij})$ describe interactions between individuals of a given species and $\Psi_{12}(r_{ij})$ describes interactions between the two species. The approximate log likelihood of the pairwise potential (Eq. 11) is given by

\[ \log L(\theta \mid X) = -\sum_{i<j} \Psi_\theta(r_{ij}) - \frac{N(N-1)}{2} \log\left( 1 - \frac{a(\theta)}{V} \right) \tag{14} \]

where $a(\theta)$ is the second cluster integral and $V$ is the area of the study region. This expression is easily maximized using nonlinear optimization procedures. To use this relationship one needs to be able to mathematically describe the interaction potentials of a spatial point pattern. Three parameterized potential functions proposed by Ogata and Tanemura (1981, 1985) can be evaluated to describe the interactions observed in the distribution of active and inactive nest sites:

\[ \text{PF1:} \quad \Psi_\theta(r) = -\log\left[ 1 + (\alpha r - 1) e^{-\beta r^2} \right], \qquad \theta = (\alpha, \beta),\ \alpha \ge 0,\ \beta > 0 \tag{15} \]

\[ \text{PF2:} \quad \Psi_\theta(r) = -\log\left[ 1 + (\alpha - 1) e^{-\beta r^2} \right], \qquad \theta = (\alpha, \beta),\ \alpha \ge 0,\ \beta > 0 \tag{16} \]

\[ \text{PF3:} \quad \Psi_\theta(r) = \beta (\sigma / r)^{12} - \alpha (\sigma / r)^{6}, \qquad \theta = (\alpha, \beta, \sigma),\ \beta > 0 \tag{17} \]

The second cluster integrals, $a(\theta)$, for the three potential functions are given by:

\[ \text{PF1:} \quad a(\alpha, \beta) = (\pi / \beta) \left( 1 - \alpha \sqrt{\pi / \beta}\, / 2 \right) \tag{18} \]

\[ \text{PF2:} \quad a(\alpha, \beta) = \pi (1 - \alpha) / \beta \tag{19} \]

\[ \text{PF3:} \quad a(\alpha, \beta, \sigma) = -\frac{\pi \beta^{1/6} \sigma^2}{6} \sum_{k=0}^{\infty} \frac{1}{k!}\, \Gamma\left( \frac{6k - 2}{12} \right) \alpha^k \beta^{-k/2} \tag{20} \]

All three models are capable of modeling both repulsive and attractive forces. The pairwise potential models PF1-3 are fit to point data using a nonlinear least squares procedure to maximize the log likelihood (Eq. 14). The Akaike Information Criterion (AIC) (Akaike 1977) is used to select, among the three possible models, the one that minimizes the value of AIC. A model with a smaller AIC is considered to be a better fit. In the case of point patterns with two categories (e.g., active vs. inactive nests), AIC is computed for each of the three components in Eq. 13 ($AIC_{11}$, $AIC_{22}$, $AIC_{12}$), and the best model for each component is selected independently. This is because maximization of the approximate log likelihood with respect to the parameters is equivalent to the independent maximization of the individual components (Ogata and Tanemura 1985).

As mentioned previously, just because an area is deemed suitable for the presence of a particular species does not mean that the species will be present. Within a given habitat, the spatial distribution of active sites is influenced by small scale spatial variability, such as differences in the abundance of a food supply, plant competition, distances to openings in the canopy, stocking levels, species composition, etc. To include this small scale spatial variability in the model one can redefine the total potential energy as follows:

[Eq. 21 is not legible in the scanned source; only its summation indices (i<j, i<j, i, j) survive.] (21)

where $\Phi_1(r_j)$ and $\Phi_2(r_j)$ are measures of small scale spatial interaction. This model would allow us to describe the territoriality of the northern goshawk and its interaction with its immediate habitat. To model this component, the probability of habitat suitability can be defined at each of the grid points superimposed over the study area. This probability is computed as the ratio of the estimated density of nests, $\lambda(s)$, at spatial location $s$, to the maximum intensity observed in the study area.
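The suitability ratio just described can be illustrated with a short sketch. The grid locations and nest densities below are hypothetical, as are the function names. The negative logarithm of the ratio is also computed, because the micro-habitat potential energy introduced in the following text is expressed on that scale.

```python
import math


def suitability(density):
    """Ratio of the estimated nest density at each grid point to the maximum density."""
    peak = max(density.values())
    return {s: lam / peak for s, lam in density.items()}


def negative_log(prob):
    """Negative log of each suitability ratio; a ratio of 0 maps to +infinity."""
    return {s: (math.inf if p == 0 else -math.log(p)) for s, p in prob.items()}


# Hypothetical estimated nest densities (nests per km2) at four grid points.
density = {(0, 0): 0.8, (0, 1): 0.4, (1, 0): 0.2, (1, 1): 0.0}

prob = suitability(density)
energy = negative_log(prob)
for s in density:
    print(s, prob[s], energy[s])
```

Grid points with the maximum density receive a ratio of 1 and a value of 0 on the log scale, while points with low density receive large positive values, which matches the interpretation of unsuitable habitat given in the text.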
The potential energy associated with a given micro-habitat can be defined as

\[ \Phi(r) = -\log\left( \frac{\lambda(s)}{\max \lambda(s)} \right) = f(\text{stand characteristics}) \tag{22} \]

which can be regressed on individual stand characteristics available in the GIS database. Large positive values would indicate unsuitable habitats, while small values would indicate suitable habitats (Reich et al. 1997).

Concluding Remarks

Such a model can be updated yearly with current information that quantifies the progress of the species in question. The information can range from the very specific, such as how the spatial location of the species is changing over time, to more general questions relating to the effects of food supply availability, natural forest succession, and silvicultural treatment. Information derived from the model could also be used to facilitate the efforts of field investigators studying the ecology of the selected species. This model, when combined with information on population dynamics and demographic information and linked to a forest successional model, could provide land managers with valuable insight in developing management plans to guide the recovery efforts of a species. Such a model could be used to address species viability and minimum area requirements. This is a unique approach to modeling the spatial distribution of threatened and endangered species, such as the northern goshawk, whose existence is related to past land management activities. Spatially explicit models can be used to monitor the efficiency of certain components of the recovery plan, as well as to provide a general prediction of how the population is changing in time and space. The modeling approach suggested above is worthwhile in developing an ecosystem maintenance and preservation program because it provides greater insight into changes in the landscape, on both the macro- and micro-scale, and, more importantly, into the consequential impact these changes have on selected species.
References

H. Akaike, "On entropy maximization principle," In Applications of Statistics, P. R. Krishnaiah (ed.), pp. 27-41, North-Holland, Amsterdam, 1977.
S. Aronoff, Geographic Information Systems: A Management Perspective, WDL Publications, Ottawa, Ontario, 1991.
M. S. Bartlett, "The statistical analysis of spatial patterns," Advances Appl. Prob., Vol. 6, pp. 336-358, 1974.
C. D. Bonham, R. M. Reich, and K. K. Leader, "Spatial cross-correlation of Bouteloua gracilis with site factors," Grasslands Science, Vol. 41, pp. 196-201, 1995.
J. R. Carr and P. G. McCallister, "An application of cokriging for estimation of tripartite earthquake response spectra," Math. Geology, Vol. 17, pp. 527-545, 1985.
A. Cliff and J. K. Ord, Spatial processes, models and applications, Pion, Ltd., London, 1981.
N. Cressie, Statistics for spatial data, John Wiley & Sons, New York, 1991.
R. L. Czaplewski and R. M. Reich, Expected value and variance of Moran's bivariate spatial autocorrelation statistic under permutation, Research Paper RM-309, U.S. Department of Agriculture, Rocky Mountain Forest and Range Experiment Station, Fort Collins, CO, 1993.
P. J. Diggle, "On parametric estimation and goodness-of-fit testing for spatial point patterns," Biometrics, Vol. 35, pp. 87-101, 1979.
P. J. Diggle, J. Besag, and J. T. Gleaves, "Statistical analysis of spatial point patterns by means of distance methods," Biometrics, Vol. 32, pp. 659-667, 1976.
P. J. Diggle, D. J. Gates, and A. Stibbard, "A nonparametric estimator for pair-wise interaction point processes," Biometrika, Vol. 74, pp. 763-770, 1987.
D. J. Gates and W. Westcott, "Further bounds for the distribution of minimum interpoint distance on a sphere," Biometrika, Vol. 67, pp. 446-469, 1980.
S. N. Gown, R. H. Waring, D. G. Dye, and J. Yang, "Ecological remote sensing at OTTER: Satellite macroscale observation," Ecological Applications, Vol. 4, pp. 322-343, 1994.
E. H. Isaaks and R. M. Srivastava, An introduction to applied geostatistics, Oxford University Press, New York, 1989.
M. Kallas, Hazard rating of Armillaria root rot on the Black Hills National Forest, M.S. Thesis, Department of Forest Sciences, Colorado State University, Fort Collins, CO 80523, 1997.
F. P. Kelly and B. D. Ripley, "A note on Strauss's model for clustering," Biometrika, Vol. 63, pp. 357-360, 1976.
J. Lui, J. B. Dunning, Jr., and H. R. Pulliam, "Potential effects of a forest management plan on Bachman's Sparrows (Aimophila aestivalis): Linking a spatially explicit model with GIS," Conservation Biology, Vol. 9, pp. 62-75, 1995.
B. Matern, "Spatial variation," Meddelanden från Statens Skogsforskningsinstitut, Vol. 49, No. 5, pp. 1-144, 1960.
K. Metzger, Modeling small-scale spatial variability in stand structure using remote sensing and field data, M.S. Thesis, Department of Forest Sciences, Colorado State University, Fort Collins, CO 80523, 1997.
Y. Ogata and M. Tanemura, "Estimation of interactive potentials of spatial point patterns through the maximum likelihood procedure," Ann. Instit. Statist. Math. Part B, Vol. 33, pp. 315-338, 1981.
Y. Ogata and M. Tanemura, "Estimation of interactive potentials of marked spatial point patterns through the maximum likelihood method," Biometrics, Vol. 41, pp. 421-433, 1985.
R. M. Reich, C. D. Bonham, and K. Metzger, "Modeling small-scale spatial interaction of shortgrass prairie species," Ecological Modelling, Vol. 101, pp. 163-174, 1997.
R. M. Reich, R. L. Czaplewski, and W. A. Bechtold, "Spatial cross-correlation in growth of undisturbed natural shortleaf pine stands in northern Georgia," J. Environ. and Ecol. Stat., Vol. 1, pp. 201-217, 1994.
R. Reynolds and S. Joy, Personal Communication, USDA Forest Service, Rocky Mountain Forest and Range Experiment Station, 218 West Prospect, Fort Collins, CO 80523, 1997.
B. D. Ripley, "Gibbsian interaction models," In D. A. Griffith (editor), Spatial Statistics: Past, Present, and Future, Institute of Mathematical Geography, Syracuse University, New York, pp. 3-25, 1990.
G. P. Robertson, "Geostatistics in ecology: Interpolating with known variance," Ecology, Vol. 68, pp. 744-748, 1987.
D. J. Strauss, "A model for clustering," Biometrika, Vol. 62, pp. 467-475, 1975.