This file was created by scanning the printed publication. Errors identified by the software have been corrected; however, some errors may remain.

Genetics, geographics, and prairie dogs: error and accuracy in a validated spatially-explicit dispersal model.

Abstract.- Genetic theory predicts that populations will differentiate based on the geographic distance between them unless some environmental feature creates a barrier to dispersal. However, the barriers are often inferred from features, and rarely directly tested for their ability to predict genetic differentiation as expressed by genetic distances. There is a need to develop spatially explicit models validated by population characteristics, such as genetics, to predict population sub-division, an important step in the conservation of a species. Prairie dogs (Cynomys ludovicianus) are colonial fossorial rodents that live in discrete populations which have distinctive signatures in remotely-sensed images, providing a mechanism to combine spatial and genetic databases into a single model. However, this combination of spatial and genetic data complicates spatial error: genetic parameters, measured from protein differentiation, have error associated with the distance algorithms, while spatial databases, such as elevation, hydrology, and transportation, have error associated with placement. The implications of testing spatial models on organisms at the landscape level are discussed.

INTRODUCTION

Ecological modeling of animal populations with remotely-sensed data is a relatively new field for conservation biologists (Johnson 1990). How animals move among populations and how different environmental features affect population exchange and stability are fundamental questions for the conservation of patchy, fragmented habitats (Johnson and Naiman 1987, Jensen et al. 1990).
Given the current availability of remotely-sensed databases along with the statistical tools to analyze those databases, there is also an increasing need to address the accuracy and resolution of these landscape databases when used at the scale of an individual animal (Costanza and Maxwell 1994). Modeling population establishment and movement at the landscape level requires using some remotely-sensed data (such as vegetation or elevation) to infer where the animal populations may occur (Scott et al. 1993). However, the digital databases expressing the environmental parameters contain sources of error associated with the scale, resolution, and projection of the source map that can have important implications when applied to ecological databases (Meentemeyer and Box 1987, Lee et al. 1992, Kemp 1993, Bolstad and Stowe 1994, Hodgson 1995). In addition, the appropriate unit of landscape scale changes with the ecological process being examined, and the correct scale for expressing spatial movements varies among organisms (Hunsaker et al. 1993, Cale and Hobbs 1994). At a certain point, the error associated with the scale of the underlying database can overwhelm any potential to model populations at finer resolutions (Brown and Bara 1994, Berry 1995). This point of no return can be expressed as an error term or through error management to evaluate model performance (Hunter and Goodchild 1995).

Here I discuss the implications of scale and resolution for constructing a spatially explicit model of animal dispersal, and how the validation of this model is limited by discrepancies between the ecological scale (how far an individual animal can move on a landscape) and the database scale (how resolution and projection limit the accuracy of the database for modeling the movements of small mammals).

Author affiliation: Natural Resource Specialist, Rocky Mountain System Support Office, National Park Service, Denver, Colorado.

BACKGROUND

Dispersal among populations is essential to maintain genetic homogeneity within a population, thus preserving population health and longevity (Green 1994). As populations become isolated through environmental or behavioral barriers, genetic divergence occurs, leading eventually towards speciation. The progress towards speciation can be estimated on a shorter time scale by calculating the dispersal rates per generation and creating a genetic distance (based on exchange as mirrored by genetic differentiation; Nei 1972) that can validate dispersal distances (a rate of exchange based on physical separation). To examine dispersal distances across realistic landscapes as a measure of population conservation, each element of a landscape (elevation, vegetation, transportation, hydrology) can be viewed as an independent treatment (a single model run) to be tested against the predicted dispersal distances based on genetic differentiation (as a measure of population persistence). The effects of environmental elements (such as roads or developments) on dispersing animals can be tested directly by comparing population parameters (genetic differentiation) against an estimate of the exchange rate (a dispersal distance), which creates a spatially-explicit dispersal model applicable to population conservation efforts.

Model Parameters

Three aspects of an ecological model are directly affected by the scale and resolution of the spatial databases themselves: the size of the area covered by the model; the size of the species the model is designed to emulate; and the variability in the landscape.

How big is the study area?

The first step in model construction is to define the size of the patch addressed by the model. If the patch is a preserve or a politically defined boundary, there may be constraints defined by the boundary of that patch which should be included in any modeling efforts.
In addition, the size of the total area under consideration can determine the resolution of the databases that is feasible given hardware and software limitations. Highly mobile insects, for example, would require extremely high resolution data over large areas, which is beyond the storage capacity of many computers.

How large is the organism?

The size of the species will dictate the required precision of any underlying map layers (Brown and Bara 1994). Smaller species may be more sensitive to the resolution of spatial data than larger species, and thus a model based on coarser resolution data could miss or over-represent landscape features and their impact on animal movement. The physical size of the species will also dictate how rapidly dispersal declines with geographic distance, in which case modeling population placement may be more valid than modeling individual movement (even though this is fundamentally a different question), and the resolution of digital layers may be unrealistic for that species (Cale and Hobbs 1994).

How variable is the study area?

Habitat variability, especially as it affects small animals, can be under-represented by databases with coarse resolution. For example, small erosive features, such as long narrow gullies or streambeds, may not be visible in a database with a minimum pixel size of 30 meters. If those gullies are barriers to the movement of small animals, the database may not be able to detect those barriers and could misrepresent animal movement.

STUDY SITE

Black-tailed prairie dogs (Cynomys ludovicianus) are colonial fossorial rodents of the Great Plains of North America. The prairie dog is considered a keystone species for the North American plains because it forms large colonies and actively modifies the environment within those colonies, creating unique ecosystems. The colonies, with their distinct vegetation, stand out in aerial photography (Schenbeck and Myhre 1990) and thus are ideal for demonstrating spatially-explicit dispersal models.
Badlands National Park, South Dakota has a number of large prairie dog colonies scattered throughout the park. As the park has two distinct lobes with colonies on either end, the predicted dispersal pattern is that colonies relatively close together on one side of the park would be genetically more similar than those on opposite sides (figure 1). These predictions (that Euclidean distances describe population dispersal patterns) will serve to validate model predictions based on the dispersal and genetic distances.

Figure 1. Cartoon of the distribution of major prairie dog populations in Badlands National Park, South Dakota. Population abbreviations are: Baysinger (BA), Burns Basin (BU), Bigfoot (BI), Haybutte (HA), Kocher Flats (KO), Tyree Basin (TY), Tyree South (TS).

MODEL CONSTRUCTION

Environmental layers such as Digital Elevation Model (DEM, 1:24,000), Digital Line Graph (DLG, 1:250,000), aerial photography (black and white and color infra-red, 1:24,000), Soil Conservation Service (SCS DLG, 1:250,000), vegetation (National Park Service (NPS) database, 1:24,000), and land use boundaries (NPS database, 1:250,000) were entered into a GRASS 4.1 GIS database on a Sun SPARCstation. All data were rectified with GRASS program modules for geo-referencing (Clark66) and spheroid corrections (NAD27). Additional corrections for resolution were done using C- and Bourne-shell scripts and GRASS resampling. All model runs were generated by these scripts and were run in the UNIX environment, calling the GRASS modules as needed.

A model run consisted of an individual GRASS surface (such as roads) redefined based on user-assigned weights meant to symbolize the ease or difficulty for prairie dogs of crossing a particular environmental feature. Those weights are then expressed as a cost surface (the cumulative weights for moving from a particular point to any other point on the surface), which is then used to generate a dispersal path (figure 2).
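
The idea of a cost surface built from user-assigned weights can be illustrated with a small sketch. The Python below is not the GRASS and shell-script workflow used in this study; it is a minimal stand-in that assumes a hypothetical 4 x 4 weight grid, movement between the four edge-adjacent cells only, and a cost equal to the sum of the weights of the cells entered. The function names cost_surface and least_cost_path are illustrative and do not come from the original scripts.

```python
import heapq


def cost_surface(weights, start):
    """Return (cost, prev): the cheapest cumulative weight from start to
    every cell, and each cell's predecessor on its cheapest route."""
    rows, cols = len(weights), len(weights[0])
    cost = [[float("inf")] * cols for _ in range(rows)]
    prev = {}
    cost[start[0]][start[1]] = 0
    heap = [(0, start)]
    while heap:
        c, (r, k) = heapq.heappop(heap)
        if c > cost[r][k]:
            continue  # stale queue entry, a cheaper route was already found
        # Assumption: moves are allowed to the four edge-adjacent cells only
        for dr, dk in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nk = r + dr, k + dk
            if 0 <= nr < rows and 0 <= nk < cols:
                nc = c + weights[nr][nk]  # pay the weight of the cell entered
                if nc < cost[nr][nk]:
                    cost[nr][nk] = nc
                    prev[(nr, nk)] = (r, k)
                    heapq.heappush(heap, (nc, (nr, nk)))
    return cost, prev


def least_cost_path(prev, start, end):
    """Walk the predecessor links back from end to start."""
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical weights: 1 = open prairie, 9 = a road with a gap in the last row
weights = [
    [1, 1, 9, 1],
    [1, 1, 9, 1],
    [1, 1, 9, 1],
    [1, 1, 1, 1],
]
cost, prev = cost_surface(weights, (0, 0))
path = least_cost_path(prev, (0, 0), (0, 3))
print(cost[0][3], path)
```

In this toy grid the cheapest route from the upper-left cell to the upper-right cell detours through the gap in the road, for a cumulative cost of 9, instead of crossing the road directly at a cost of 11. The detour is the kind of lengthening relative to the straight-line distance that the dispersal paths are meant to capture.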
Dispersal paths are calculated using algorithms from hydrology, in which water seeks the lowest path down a drainage. The route water would follow then represents the least-cost path that an animal would follow over an ecologically-defined surface (figure 3).

Dispersal paths were validated by using the genetic differentiation of proteins (as determined by horizontal gel electrophoresis of polymorphic enzymes) to create a genetic distance analogous to the physical separation (see Bowser 1996 for methods). Genetic distance is a metric measure of the genetic differentiation of proteins among population samples, and can be used as an independent measure to test for linear associations with the paths generated by the dispersal models (Nei 1972, Slatkin 1985). Genetic distances were calculated using BIOSYS-1 (Swofford and Selander 1989) and Weir's (1990) jack-knifing procedure.

Figure 2. Conceptual diagram of the cost path generation using surfaces consisting of a single environmental feature to create composite surfaces (panels: cost surface, transforming surfaces). Populations are represented by 'i' and 'j', and paths are calculated from the centroids of i and j. Different polygons indicate ecological features on the transforming surface that pose a barrier to dispersal.

All statistical tests were done using S-PLUS 3.3 in the UNIX environment after corrections for normality and checks for autocorrelation. Each environmental surface was treated as an independent model consisting of a single parameter such as roads or streams. Composite layers were created within the GIS from combinations of single layers, which were then rescaled to account for changes in resolution and tested as a single independent surface.

RESULTS

Dispersal paths calculated for prairie dog populations in Badlands National Park (BADL) were significantly different from the Euclidean distances among the populations (table 1).
All of the environmental surfaces created longer paths than the Euclidean distances, with the exception of vegetation, which was not significantly different. Log-linear regression on splined distances showed weak predictive relationships (p = 0.05, r2 = 0.14) for single models between the genetic distances and the dispersal distances (figure 4). The best fitted model using step-wise regression was roads, streams, and park boundary (p = 0.05, r2 = 0.33). Composite models were generally better at predicting the genetic distances than single models using Principal Component Analysis; composite models contributed the most to the first PCA axis, which explained 86% of the variance in genetic distances.

Figure 3. Cost surface for moving from one location (the pit) to any other point on the surface. Cost refers to the cumulative sum of pixel weights from a designated starting point to any other point on the surface.

Table 1. ANOVA on single surface models as independent treatments against Euclidean distance among populations of prairie dogs in Badlands National Park.

Model        df    F       P      sign
roads        1     64.55   0.00   **
streams      1
slope        1
elevation    1
vegetation   1

Figure 4. Regression lines for each single surface tested using genetic distances as the dependent variable. Graph labels from the left: roads, streams, slope, vegetation, elevation, and park boundary.

DISCUSSION

The validated dispersal models demonstrated here showed an ability to predict another population parameter, genetic distance, which is an important measure of the longevity, exchange, and isolation of populations. As an exploratory model, the dispersal model was able to demonstrate which environmental layers appear to best explain the genetic patterns observed, as well as provide some insight into which environmental features may have little effect. Despite the presence of the badlands wall and other erosion features within BADL, elevation was not a strong predictor of genetic distance. In contrast, streams and roads together as a composite model explained the genetic patterns better than any single environmental feature.

The Influence of Error

There are two sources of error discussed here: error associated with the placement of populations, and error associated with treatment effects during the model validation.

Population placement

The location of populations on the landscape is critical for a dispersal model because it is the seed point for constructing paths on the cost surface. If a centroid position for a population polygon is used as the starting point, the error associated with that centroid only becomes important when referenced to ground measurements. Each centroid has a sphere of error around it that represents the corrections due to population placement (digitizing and interpretation error), the root-mean-square (RMS) error associated with the photograph rectification, along with any error associated with [...]

Treatment error

Treatment error builds on existing population placement error within the framework of a statistical test. Location error can be seen as analogous to the variance around each point within a treatment. When the variance among treatments is tested (assuming each treatment is a different spatial layer), the total error incorporated within the statistical model is a combination of the variance among and within treatments: the first due to population placement and the second due to the underlying precision and resolution of the original environmental database. The combined errors reduce the power of a variance test to explain the "background noise", resulting in low multiple r squared values.

CONCLUSIONS

Error, scale, and resolution can limit the ability of GIS to accurately model animal movements across complex landscapes.
When a model is validated by a population measure (such as genetic distances), the error incorporated into spatial models from digital databases, and the assumptions of accuracy underlying those databases, are brought to light. In the dispersal model, the regression models could not explain more than 33% of the variance yet still showed a demonstrable relationship (and reproduced patterns expected from the population parameters), which suggests that the error in placement may distort model results. The take-home message is that although the dispersal model did a modest job of re-creating the genetic patterns and sorting out which environmental features best explained those patterns, the accuracy, precision, and resolution of the underlying databases must be discussed. In this model, the minimum resolution (30 meters) was greater than most of the animals' home territories (and some colonies). Realistically, the model requires the animal to move in jumps of at least one meter and up to 30 meters depending on the surface, a difficult task for an animal of 2-10 pounds. The fact that the model still distinguished among surfaces and replicated genetic distance predictions is intriguing and will hopefully generate further exploration of the effects of the accuracy and error of spatial databases on ecological modeling.

ACKNOWLEDGEMENTS

Bette Loiselle and John Blake, my dissertation advisors from the University of Missouri-St. Louis, provided strong support in the development of this project as a part of my dissertation. Chip Harvey, Chris Theriault, and Michelle Gudorf of the National Park Service provided excellent discussions on spatial error and ecological modeling. I appreciate the efforts of Peter Strong, David Duran, and Bruce Powell of the National Biological Service (NBS) to keep my computer running. Ralph Root and Susan Stitt (NBS) provided excellent analytical support for the spatial analysis.

LITERATURE CITED

Bolstad, P. V. and T. Stowe. 1994. An evaluation of DEM accuracy: elevation, slope and aspect. Photogrammetric Engineering and Remote Sensing 60(11): 1327-1332.
Bowser, G. 1996. Integrating ecological tools with remote sensing: modeling animal dispersal on complex landscapes. Conference proceedings, Third Annual Conference on Environmental Modeling and GIS, Santa Fe, New Mexico, CD-ROM.
Brown, D. G. and T. J. Bara. 1994. Recognition and reduction of systematic error in elevation and derivative surfaces from 7.5 minute DEMs. Photogrammetric Engineering and Remote Sensing 60(2): 189-194.
Costanza, R. and T. Maxwell. 1994. Resolution and predictability: an approach to the scaling problem. Landscape Ecology 9(1): 47-57.
Goodchild, M. F. 1993. The state of GIS for environmental problem solving. In Environmental Modeling with GIS. M. F. Goodchild, B. O. Parks, and L. T. Steyaert, editors. Oxford University Press. 485 pages.
Green, D. G. 1994. Connectivity and complexity in landscapes and ecosystems. Pacific Conservation Biology 1: 194-200.
Hodgson, M. E. 1995. What cell size does the computed slope/aspect represent? Photogrammetric Engineering and Remote Sensing 61(5): 513-517.
Hunsaker, C. T., R. A. Nisbet, D. C. L. Lam, J. A. Browder, W. L. Baker, M. G. Turner, and D. B. Botkin. 1993. Spatial model of ecological systems and processes: the role of GIS. In Environmental Modeling with GIS. M. F. Goodchild, B. O. Parks, and L. T. Steyaert, editors. Oxford University Press. 485 pages.
Hunter, G. J. and M. F. Goodchild. 1995. Dealing with error in spatial databases: a simple case study. Photogrammetric Engineering and Remote Sensing 61(5): 529-537.
Jensen, J. R., S. Narumalani, O. Weatherbee, and K. M. 1992. Predictive modeling of cattails and waterlily distribution in a South Carolina reservoir using GIS. Photogrammetric Engineering and Remote Sensing 58: 1561-1568.
Johnson, C. A. and R. J. Naiman. 1990. The use of geographical information systems to analyze long-term landscape alteration by beaver. Landscape Ecology 4: 5-19.
Johnson, L. B. 1990. Analyzing spatial and temporal phenomena using geographic information systems: a review of ecological applications. Landscape Ecology 4: 31-43.
Kemp, K. K. 1993. Spatial databases: sources and issues. In Environmental Modeling with GIS. M. F. Goodchild, B. O. Parks, and L. T. Steyaert, editors. Oxford University Press. 485 pages.
Lee, L., P. K. Snyder, and P. F. Fisher. 1992. Modeling the effect of data errors on feature extraction from digital elevation models. Photogrammetric Engineering and Remote Sensing 58(10): 1461-1467.
Schenbeck, G. L. and R. J. Myhre. 1986. Aerial photography for assessment of black-tailed prairie dog management on the Buffalo Gap National Grassland, South Dakota. USDA Report no 86-7 3400.
Scott, J. M., F. Davis, B. Csuti, R. Noss, B. Butterfield, C. Groves, H. Anderson, S. Caicco, F. D'Erchia, T. C. Edwards Jr., J. Ulliman, and R. G. Wright. 1993. GAP analysis: a geographic approach to protection of biological diversity. Wildlife Monographs 123.
Swofford, D. L. and R. B. Selander. 1989. BIOSYS-1: A computer program for the analysis of allelic variation in population genetics and biochemical systematics. Release 1.7. Illinois Natural History Association, Illinois.
Weir, B. 1990. Genetic Data Analysis. Sinauer Associates, New York.
Wickham, J. D. and D. I. Norton. 1994. Mapping and analyzing landscape patterns. Landscape Ecology 9(1): 7-23.

BIOGRAPHICAL SKETCH

Gillian Bowser is a natural resource specialist with the National Park Service in Denver, Colorado. She holds an MS in Zoology from the University of Vermont and is currently a PhD candidate at the University of Missouri-St. Louis. Gillian provides technical assistance and develops proposals in the sciences for national parks in the Rocky Mountain area.