CHAPTER 3
PREDICTED ANIMAL SPECIES DISTRIBUTIONS AND SPECIES RICHNESS
3.1 Introduction
All species range maps are predictions about the occurrence of those species within a particular area (Csuti 1994). Traditionally, the predicted occurrences of most species begin with samples from collections made at individual point locations. Most species range maps are small-scale (e.g., >1:10,000,000) and derived primarily from point data to construct field guides. The purpose of the GAP vertebrate species maps is to provide more precise information about the current predicted distribution of individual native species within their general ranges. With this information, better estimates can be made about the actual amounts of habitat area and the nature of its configuration. GAP maps are produced at a nominal scale of 1:100,000 or better, and are intended for applications at the landscape or “gamma” scale (homogeneous areas generally covering 1,000 to 1,000,000 hectares and made up of more than one kind of natural community). Applications of these data to site- or stand-level analyses (site – a microhabitat, generally 10 to 100 square meters; stand – a single habitat type, generally 0.1 to 1,000 ha; Whittaker 1977, see also Stoms and Estes 1993) are likely to be compromised by the finer-grained patterns of environmental heterogeneity that are resolved at those levels.
Gap analysis uses the predicted distributions of animal species to evaluate their conservation status relative to existing land management (Scott et al. 1993). However, the maps of species distributions may be used to answer a wide variety of management, planning, and research questions relating to individual species or groups of species. In addition to the maps, great utility may be found in the consolidated specimen collection records and literature that are assembled into databases used to produce the maps.
Prior to this effort there were no maps available, digital or otherwise, showing the likely present-day distribution of species by habitat type across their ranges. Because of this, ordinary species (i.e., those not threatened with extinction or not managed as game animals) are generally not given sufficient consideration in land-use decisions in the context of large geographic regions or in relation to their actual habitats. Their decline because of incremental habitat loss can, and does, result in one threatened or endangered species “surprise” after another. Frequently, the records that do exist for an ordinary species are truncated by state boundaries. Simply creating a consistent spatial framework for storing, retrieving, manipulating, analyzing, and updating the totality of our knowledge about the status of each animal species is one of the most necessary and basic elements for preventing further erosion of biological resources.
3.2 Mapping Standards
Mapping of potential habitat (predicted distribution) was performed for all vertebrate species considered to breed consistently in Pennsylvania. Mapping was conducted on the basis of spatial units equivalent to 30-meter Landsat TM pixels for individual species of birds, mammals, amphibians, and reptiles. For individual fish species, mapping was conducted on the basis of small watersheds for named streams, of which there are 9,855 in the state. For purposes of comparative analysis among species, the mappings for all species were cross-tabulated into a database of 1 km2 (100 ha) cells. There are 118,218 such cells covering the state.
3.3 Methods
Habitat models were developed as matrices in the form of spreadsheets with columns representing habitat variables and rows representing species. Each species row includes the scientific name, common name, and the ‘element occurrence code’ (ELCODE) provided by The Nature Conservancy.
The model for each species was then implemented as a sequence of conditional GIS operations designed to identify habitat and eliminate non-habitat areas. Habitat variables in the matrix models for birds, mammals, amphibians, and reptiles are coded with numbers that range from 1 to 4 which rate the variable as to its relevance for the particular species. The code designations are: 1 = habitat type required by the species (primary use); 2 = habitat type may be used by the species (secondary use); 3 = habitat type avoided by the species; 4 = not relevant to the species. Habitat maps for these groups were produced as (raster) grids having 30-meter resolution, and then resampled to 90-meter resolution for placement on a set of CD-ROMs to be archived by the National GAP Office. The approach used for modeling of fishes was analogous, but differed in several respects. Fish habitat modeling was conducted in GIS (vector) polygon mode with the foundation layer comprised of 9,855 small watersheds for named streams in Pennsylvania. Variables for habitat models were attached directly to each watershed as tabular attributes. Habitat factors included physiographic units, major river basins, stream size class, median slope, and extent of disturbance. The models as spreadsheet profiles determine whether a watershed is primary habitat, secondary habitat, or non-habitat. 3.3.1 Mapping Range Extent: Many Pennsylvania vertebrate species have range restrictions that are not directly tied to local habitat factors, which may be due to climatic influence or historical circumstance. Early in the Pennsylvania GAP Project, The Nature Conservancy compiled a database of species ranges for Pennsylvania. Based on current and historic information, species presence was tabulated in each of 211 cells of a hexagonal lattice that had been configured as a sampling frame for USEPA’s EMAP program. Each hexagon encompasses an area of 635 km2. 
All hexagons that contained records for a particular species formed its preliminary range (Figure 3.1a). Single hexagons constituting holes in the preliminary range were then incorporated for purposes of the Pennsylvania Gap Project. The (augmented) hexagon range was coupled with a layer delimiting small watersheds in order to select all watersheds whose centers fell within the range. Boundaries among the selected watersheds were then dissolved to obtain a range modifier for the respective habitat model (Figure 3.1b). Any potential habitat from modeling that fell outside this range was suppressed (Figures 3.2a & 3.2b). Hexagon range restrictions are expressed to varying degrees in the final mappings, being fairly evident for species richness of snakes and lizards.
3.3.2 Wildlife Habitat Relationships: Modeling of wildlife habitat relationships was done in a similar manner for mammals, amphibians, reptiles, and birds. Our habitat models are based primarily on species affinity for seven available land cover categories that were identified with relative consistency from satellite imagery. These seven categories were supplemented with modifications for aquatic ecosystems (riverine, palustrine, and open water), landscape position regarding elevation (ridge, mid-slope, valley), urban density (high and low), and stream order (first through eighth).
Our initial approach to modeling terrestrial habitat associations for vertebrates in Pennsylvania was to examine existing sets of habitat models from the northeastern U.S. to determine which, if any, were suitable for use in the Pennsylvania GAP Project. In general, these models or their format were not appropriate for use in Pennsylvania. Therefore, we elected to use a matrix approach where habitat factors were characterized by simple categorical variables in a spreadsheet format.
These factors had to be compatible with either existing or derived statewide GIS databases (e.g., cover type, topographic orientation, proximity to water, spatial landscape pattern). Factors that were both positively associated and negatively associated with probable occurrence of a species were considered. This allowed us to highlight areas of suitable habitat and mask out unsuitable areas within the general range of a species. We used local and regional literature, best professional judgment, and peer reviewers (see Appendix 2) to develop and check the habitat models. The peer reviewers also provided suggestions for changes in nomenclature or range distribution.
The major sources of literature reviewed for mammal habitats were Merritt (1987) and DeGraaf & Rudis (1986), with Jones et al. (1997) being used for final decisions on nomenclature. For reptiles, the major literature reviewed included Shaffer (1995), DeGraaf & Rudis (1981), Conant & Collins (1991), and Ernst et al. (1994). For amphibians, the major literature reviewed included Shaffer (1995), DeGraaf & Rudis (1981), Conant & Collins (1991), and Green & Pauley (1987). Pertinent references for birds are American Ornithologists’ Union (1983, 1995, 1997); Andrle & Carroll (1988); Boone & Krohn (1996); Brauning (1992); Brooks & Croonquist (1990); Buckelew & Hall (1994); Clark & Wheeler (1987); Curson, Quinn, & Beadle (1994); DeGraaf & Rudis (1986); Dunn & Garrett (1997); Ehrlich, Dobkin & Wheye (1988); Freemark & Collins (1992); Harrison (1983); Isler & Isler (1987); Madge & Burn (1988); O’Connell (1999); and Rising & Beadle (1996).
Our primary concern for modeling fish species has been to ascribe habitat to sectors of landscapes that are large enough to be evident in regional mappings, but small enough to inform environmental and conservation analyses across landscapes.
In light of exploratory work in New York and Missouri, we considered stream reaches to be inappropriately fine scale with respect to both mapping and effort. Small watersheds constitute a next level of scale above stream reaches that can serve for purposes of landscape segmentation relevant to both hydrology and aquatic organisms. Small watersheds also have the advantage of mapping as area features rather than linear features, thus providing a tessellation. References pertinent to our watershed-based modeling of fishes are Allen & Johnson (1997); Argent, Carline & Stauffer (1997, 1998); Cooper (1983); Hocutt & Wiley (1986); Imhof, Fitzgibbon & Annable (1996); Jenkins & Burkehead (1994); Johnson & Gage (1997); Lee et al. (1980); Mayden et al. (1992); Meixler, Bain & Galbreath (1996); Richards, Johnson & Host (1996); Schlosser (1991); Smith (1985); Stauffer, Boltz & White (1995); and Trautman (1981). Geomorphology controls development of drainage networks and character of streams, with influence extending also to physical properties (e.g., turbidity) and chemical properties of water. Geomorphology is reflected in physiographic provinces for Pennsylvania. Drainage divides constitute zoogeographic barriers to movement of organisms that are wholly aquatic. Pennsylvania encompasses portions of several major river basins that engender such segregation of aquatic biota. Stream order can serve as a surrogate for stream size and discharge, which reflects macrohabitat for fish species. By viewing an overlay of streams on watersheds, they could be classified interpretively. Gradient serves to separate fish habitat along the longitudinal axis of a stream. Some fishes occupy streams of low gradient, whereas others prefer higher gradients. Low gradient streams typically have sand, silt, and clay substrates. High gradient streams typically have cobble, boulder, and rock substrates. Medium gradient streams often have a heterogeneous mix of substrate types. 
Land cover can be used as a surrogate for human disturbance of the landscape. This variable provides an indication of microhabitat diversity, as well as allowing for consideration of tolerance to human-induced landscape influences.
A large digital database of fish collection records for Pennsylvania was instrumental in developing and validating fish models. Records from over 20,000 collection events from 1950 to 1999 were used in the analysis.
Each class for a variable was cast as a separate field (column) in a spreadsheet for habitat modeling. Basin and physiographic fields were coded as either 1 or 0 for presence or absence, respectively. The size, gradient, and disturbance characteristics were designated in terms of primary habitat (1), secondary habitat (2), or unsuitable (0). Each fish species was profiled as to its highest frequency of occurrence for stream size, which was designated as primary habitat. If secondary habitat or stream sizes were determined, they were added to the profile as situations where the fish may occur but with lower frequency. The profile as represented in the row of the fish habitat matrix determines whether a watershed constitutes primary habitat, secondary habitat, or non-habitat for a species.
3.3.3 Distribution Modeling: Translation of habitat relations into distribution of potential habitat was performed in like manner for mammals, amphibians, reptiles, and birds. Habitat relations served to determine a series of conditional operations that identified specific categories in GIS thematic layers as to their habitat suitability for each species. All final mapping procedures and most preliminary procedures for these taxa were completed using the Spatial Analyst Extension of the ArcView geographic information system (GIS) software. This software is created and distributed by the Environmental Systems Research Institute (ESRI) of Redlands, CA.
A suite of compatible cellular (raster) GIS layers having 30-meter resolution was used to accomplish mapping of potential habitat for mammals, amphibians, reptiles, and birds. The codes used as column headings in the matrices of habitat relations appear with the ensuing synopses of these GIS layers.
Vegetative Land Cover is the result of our classification of Thematic Mapper (TM) satellite imagery for Pennsylvania. Eight types of vegetative land cover were identified:
1 = Water [OPEWAT]
2 = Evergreen forest [CONFOR]
3 = Mixed forest [MIXFOR]
4 = Deciduous forest [BLFFOR]
5 = Woody transitional [WOOSUC]
6 = Perennial herbaceous [PERHER]
7 = Annual herbaceous [ANNHER]
8 = Barren/hard-surface/rubble/gravel [TENOVE]
Urbanized Land was created by overlaying our compressed Thematic Mapper (TM) images with roads data and, thereby, interpreting the locations of urban and suburban areas. Originally digitized using a vector format, this layer was converted into a grid format using the Spatial Analyst Extension of ArcView. Three categories are distinguished:
1 = Rural
2 = Low intensity (suburban) development – [URBLO]
3 = High intensity (urban) development – [URBHI]
A Digital Elevation Model (DEM) prepared by the United States Geological Survey (USGS) has a 30-meter resolution (cell size). This grid layer records the elevation of each raster cell in meters above sea level. Several avian models identified specific elevations above or below which an animal occurred, with this being specified by the [ELEVAT] column in the bird habitat matrix. The DEM was also used to create two temporary layers, Aspect and Slope, that were used as special requirements in a few models. The bobcat (Lynx rufus) and the eastern hognose snake (Heterodon platirhinos) both favored certain aspects. The worm-eating warbler (Helmitheros vermivorus) and the white-throated sparrow (Zonotrichia albicollis) were sensitive to certain slopes.
Wetlands data were extracted from land-use/land-cover data classified by the MRLC. The MRLC used NWI maps as an ancillary data source to assist with the classification of TM imagery. Two wetland types along with open water were identified by the MRLC: palustrine herbaceous wetlands and palustrine woody wetlands. To facilitate processing, each of these wetland types was isolated into a separate layer. In addition to the wetlands themselves, most models requiring wetlands data also needed a buffer zone around the wetlands. Using the Spatial Analyst Extension of ArcView, a distance command was used to calculate distances away from each wetland. This preliminary layer was classified to delineate buffer zones of 30 and 100 meters. Animals that are sensitive to the presence of wetlands were modeled with the assistance of these data. Generally, for wetland-sensitive birds the 100-m buffers were used. The amphibian and reptile models used the 30-m buffers. For the mammals, some used the 100-m buffers while others used the 30-m buffers. The column headings [PALWOO] and [PALHER] represented these layers in the habitat matrices.
Pennsylvania Streams were originally digitized in a vector format by the Pennsylvania Department of Transportation, and later edited and verified by the Environmental Resources Research Institute (ERRI) at Penn State Univ. These data were converted into a raster format and, using the same procedure as described above for the wetlands layers, processed to create a layer that delineates both 30-m and 100-m riparian buffers. Stream sensitivity was listed as [RIVERI] in the habitat matrices.
A Disturbed Lands layer was created to simplify model processing. The layer was compared with other layers to isolate conditions that exist in disturbed areas versus conditions in minimally disturbed areas. The common use of this layer was to separate streams that passed through disturbed areas from those that passed through relatively undisturbed areas.
The layer is a result of a reclassification of the Vegetative Land Cover. The vegetative land cover classes for perennial herbaceous, annual herbaceous, and barren represented disturbed land; whereas water, evergreen forest, mixed forest, deciduous forest, and transitional were classed as undisturbed.
A Topographic Position layer was created to divide areas of Pennsylvania based on their topographic form. It was recognized during the course of the project that many animals, although not sensitive to elevation (distance above sea level), were sensitive to local physiographic conditions. This layer was created through several reclassifications of the DEM to isolate three general physiographic conditions. The first class was isolated by the SLOPE command of the Spatial Analyst Extension for ArcView GIS. All slopes greater than or equal to 15% were grouped into this class. A Physiographic Provinces layer from the Topographic and Geologic Survey, Pennsylvania DCNR, was also utilized to help identify the next two classes. Pennsylvania was first divided into five zones based on similar physiographic conditions among the provinces. Within each zone the DEM was used to help locate a natural break between ridge top and valley bottom conditions. Each zone could then be divided into these classes. The topographic position variable was identified in the mammal, amphibian, and reptile models by [ELEVAT]. The classification codes are:
1 = Valley bottom
2 = Mid-slope (greater than or equal to 15% slope)
3 = Top of ridge
A Shedorder (small watersheds) data layer was based on information originally digitized in vector format by the Water Resources Division of USGS and subsequently refined by the ERRI at Penn State University. As part of aquatic gap analysis for Pennsylvania, each watershed was interpretively assigned a classification according to stream order.
For modeling of wetland-associated animals, the Shedorder layer was usually paired with a streams layer to help identify stream size. For avian models, stream use was identified as either being larger or smaller than a specific stream order, and was listed in the matrix as [STMORD]. The mammal, amphibian and reptile models divided stream order among four size classes that were listed in the habitat matrices under [STRSIZ] as: 1 = Small (1st and 2nd order streams) 2 = Medium (3rd and 4th order streams) 3 = Large (5th and 6th order streams) 4 = Extra-large (7th order streams and above). The mapping process for mammals, amphibians, reptiles, and birds proceeded as a series of conditional GIS operations for each species formulated to identify habitat and eliminate non-habitat areas. The aforementioned data layers were manipulated with the Spatial Analyst Extension of ArcView GIS software to process the models within a raster GIS environment on the basis of 30-m cells. All of the mammal, amphibian, reptile, and bird models fit into two general modeling approaches depending upon the habitat preferences of the animal. The first approach dealt with all areas based first on vegetative land cover. As each additional layer was incorporated into the model, changes were made based on the matrix specifications. The final step(s) removed larger areas such as urban areas, often coded as avoided habitat, to complete the model. The second general approach was used for species associated with water and wetland conditions. Under this second approach, models were constrained by the 30-m or 100-m buffers from the wetland layers. The sequence of conditional statements proceeded like the first approach, but the last step used the appropriate buffer like a ‘cookie cutter’ to restrict the scope. The result was a map having habitat possibilities only within the buffer zone and all areas outside the buffer being coded as non-habitat. 
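The two general approaches described above can be illustrated with a minimal sketch in plain Python rather than ArcView. It assumes tiny hand-made grids and a hypothetical matrix row; the function names, the species row, and the grids are illustrative and are not taken from the project's actual processing scripts. The first function reclassifies land cover by the matrix codes, and the second applies a buffer as a 'cookie cutter'.

```python
# Habitat codes used in the matrix models:
# 1 = primary, 2 = secondary, 3 = avoided (non-habitat), 4 = not relevant.
NON_HABITAT = 3

def reclassify(land_cover, matrix_row):
    """Replace each land-cover class with the habitat code the matrix assigns to it."""
    return [[matrix_row[cover] for cover in row] for row in land_cover]

def cookie_cut(habitat, in_buffer):
    """Keep habitat codes inside the buffer; code every cell outside it as non-habitat."""
    return [
        [code if inside else NON_HABITAT for code, inside in zip(h_row, b_row)]
        for h_row, b_row in zip(habitat, in_buffer)
    ]

# Hypothetical matrix row for one species, keyed by land-cover class (1-8).
species_row = {1: 3, 2: 2, 3: 1, 4: 1, 5: 2, 6: 3, 7: 3, 8: 3}

# Hypothetical 2 x 3 grid of land-cover classes (30-m cells).
land_cover = [[4, 4, 6],
              [3, 2, 1]]

# Hypothetical mask: True where a cell lies within the wetland buffer.
in_buffer = [[True, False, True],
             [True, True, False]]

habitat = reclassify(land_cover, species_row)
restricted = cookie_cut(habitat, in_buffer)
```

Under the first approach the reclassified grid would be carried forward through the remaining layers; under the second, the buffer mask is applied last so that habitat survives only inside the buffer.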
With few exceptions, the modeling sequence and decision rules went according to the following scenario.
1 – The vegetative land cover was reclassified based on the matrix specifications. Non-habitat (3’s) for any model variables was noted immediately. Any area of non-habitat was excluded from subsequent alteration.
2 – Variables coded as “4 = not applicable” were noted in order to control interaction of variables. If an urban variable had a code of 4, for example, then the vegetative land cover took precedence over those areas that would otherwise have been treated as urban.
3 – Wetlands, including streams, were typically addressed next. The coincidence of a wetland coded 2 (secondary habitat) and vegetation coded 1 (primary habitat) would return a code of 2. Coincidence of a wetland coded 1 (primary habitat) and vegetation coded 2 (secondary habitat) would return a code of 1. A wetland coded 3 (non-habitat) would return a code of 3 regardless of vegetation.
4 – Stream modifying conditions were then addressed. This step either selected streams outside the proper size class for removal or degradation, or degraded the classification within the stream buffer according to degree of disturbance. This was always a degrading process. Streams initially classed as primary or secondary would be reduced to secondary or non-habitat, respectively.
5 – Due to their restrictive influence, urban areas were always treated as a degrading layer. If urban areas had been classed as secondary, then all coincident areas previously designated as primary habitat would be returned as secondary. Also, urban areas classed as non-habitat always received a value of 3.
6 – The minimum area and elevation variables were considered in the final stage. Whereas steps 3-5 can be considered as modifiers, minimum area and/or elevation are more extractive. Any area too small, too large, or not within specifications for elevation or topographic position would become non-habitat.
7 – For a few species it was necessary to consider exceptions and/or special cases.
Thereafter, the hexagon-based mask for range limits was applied unless the species is considered to be ubiquitous for Pennsylvania.
The foundation layer of small watersheds for vector-based modeling of fish habitat originated with the Water Resources Division of USGS, which undertook to digitize watersheds of all named streams within major river basins of the region. These data for the basins were integrated and harmonized by the Office for Remote Sensing of Earth Resources (ORSER) in ERRI at Penn State University with funding provided by the Pennsylvania Department of Environmental Protection. Some further editing was required for purposes of aquatic gap analysis, mostly to resolve issues along the borders of the state.
All factors pertaining to the various fish models were analytically incorporated directly into the polygon attribute table for watersheds. Each class of a factor was represented as a separate column in the attribute table. Translation of any given model into a map could then be accomplished simply by a compound query of the attribute table, or perhaps a sequence of set-reducing queries depending upon the complexity of the model.
In order to capture differentiation of streams due to geomorphology, a layer of physiographic provinces and sections was overlaid to assign each small watershed as being in one of 16 physiographic units. The physiographic province layer originated with the Pennsylvania Topographic and Geologic Survey in the Department of Conservation and Natural Resources (DCNR).
Watershed classification with respect to stream size was performed interactively by displaying a digital file of all blueline streams superimposed on the watershed outlines.
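The compound query of the watershed attribute table mentioned above can be sketched in plain Python as follows. The source does not state exactly how the individual factor codes combine into a final class, so the rule used here is an assumption: basin and physiographic unit act as presence/absence gates, any factor coded 0 yields non-habitat, all factors coded 1 yield primary habitat, and anything else yields secondary habitat. All watershed identifiers, basin names, unit names, and model codes are hypothetical.

```python
# Hypothetical model row for one fish species.
# Basin and physiographic fields: 1 = present, 0 = absent.
# Size, gradient, disturbance: 1 = primary, 2 = secondary, 0 = unsuitable.
MODEL = {
    "basin": {"basin_a": 1, "basin_b": 0},
    "physio": {"unit_03": 1, "unit_07": 1},
    "size": {"small": 2, "medium": 1, "large": 1, "extra_large": 0},
    "gradient": {"low": 1, "medium": 2, "high": 0},
    "disturbance": {"low": 1, "medium": 2, "high": 0},
}

# Hypothetical attribute-table rows, one per small watershed.
WATERSHEDS = [
    {"id": "ws_a", "basin": "basin_a", "physio": "unit_03",
     "size": "medium", "gradient": "low", "disturbance": "low"},
    {"id": "ws_b", "basin": "basin_a", "physio": "unit_07",
     "size": "small", "gradient": "low", "disturbance": "medium"},
    {"id": "ws_c", "basin": "basin_b", "physio": "unit_03",
     "size": "medium", "gradient": "low", "disturbance": "low"},
    {"id": "ws_d", "basin": "basin_a", "physio": "unit_03",
     "size": "large", "gradient": "high", "disturbance": "low"},
]

def classify_watershed(ws, model):
    """Return 1 (primary), 2 (secondary), or 0 (non-habitat) under the assumed rule."""
    # Gate on basin and physiographic unit (presence/absence fields).
    if not model["basin"].get(ws["basin"], 0):
        return 0
    if not model["physio"].get(ws["physio"], 0):
        return 0
    # Look up the model code for this watershed's class of each habitat factor.
    codes = [model[f].get(ws[f], 0) for f in ("size", "gradient", "disturbance")]
    if 0 in codes:
        return 0
    return 1 if all(code == 1 for code in codes) else 2

habitat_by_watershed = {ws["id"]: classify_watershed(ws, MODEL) for ws in WATERSHEDS}
```

A sequence of set-reducing queries would give the same result by filtering the table one factor at a time instead of evaluating all conditions per row.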
The stream file originated with digitizing by the Pennsylvania Department of Transportation, but extensive editing and topological adjustment had been conducted subsequently by ORSER in ERRI at Penn State University. First-order and second-order streams comprise a small stream class. Third-order and fourth-order streams comprise a medium size class. Fifth-order and sixth-order streams constitute a large size class. Seventh-order and eighth-order streams are combined with lakes as a fourth size class.
The digital elevation model as described earlier was used to calculate median slope for each small watershed, with coding in three classes: low (<1%), medium (1% to 3%), and high (>3%). The median slope classes reflect stream gradient.
The land cover layer was used to assign a human disturbance class for each small watershed. For this purpose, human disturbance was considered to be nonforest area due to agriculture and/or development. Percent of such area in a watershed determined its disturbance class as follows: low (<25%), medium (25% to 75%), and high (>75%).
3.4 Results
Including all vertebrate taxa, 470 habitat map layers were produced as described above. This is a large repertoire of map files, even for modern computerized geographic information systems. To facilitate both access and analysis, the state was partitioned into a (vector) network of 1-kilometer square cells. There are 118,218 such cells in Pennsylvania, with each cell encompassing (approximately) 100 hectares (3 acres shy of 250 acres). The origin of the cell network is the southwest corner of the Littleton, W.Va.-Pa. USGS 7.5-minute quadrangle map at 39°37’30” north latitude and 80°37’30” west longitude. The base layer of cell outlines is called PAKAGE, which is an acronym for PA Kilometer-Aggregated Gap Elements. Tabular databases having species as columns and cells as rows have been prepared for different taxa showing whether or not models indicate any potential habitat in the cell.
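As a minimal sketch of how such a cell-by-species table yields modeled species richness, the snippet below sums the presence flags in each row. The cell identifiers, species names, and values are hypothetical, and the dictionary layout is only an assumed stand-in for the actual PAKAGE tables.

```python
# Hypothetical presence table: rows are 1-km cells, columns are species,
# 1 = the model indicates some potential habitat in the cell, 0 = none.
presence = {
    "cell_0001": {"species_a": 1, "species_b": 0, "species_c": 1},
    "cell_0002": {"species_a": 1, "species_b": 1, "species_c": 1},
    "cell_0003": {"species_a": 0, "species_b": 0, "species_c": 0},
}

def richness(table):
    """Modeled species richness per cell: the count of species flagged present."""
    return {cell: sum(flags.values()) for cell, flags in table.items()}

cell_richness = richness(presence)
```

Because a flag of 1 means only that some part of the cell is potential habitat, richness computed this way is an upper bound on what any single location within the cell would support.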
These tables can be joined to the base layer of cell outlines for conservation analysis in a computerized geographic information system (GIS). Such cellular aggregation entails generalization of information, even when there are more than 100,000 cells. A habitat cell may have only part of its area being suitable for the species in question. This PAKAGE layer constitutes Pennsylvania’s hyperdistribution layer for range of potentially suitable habitat. PAKAGE cells are also coded with respect to USEPA 635-km2 hexagon for analysis at broader scales.
3.4.1 Mammals: Habitat mapping was conducted for 62 species of mammals that currently breed in Pennsylvania. The matrix for mammal habitat relations is given in Appendix 3. A quartile mapping of modeled species richness on the basis of 100-ha cells is shown in Figure 3.3. Ecoregion edges are etched into the portrayal of mammal species richness in Figure 3.3. Impoverished regions with respect to mammal species include the entirety of the Coastal Plain, Lake Plain, and Piedmont along with much of the Ridge & Valley, Pittsburgh Low Plateau, and Glaciated Pittsburgh Plateau. There is strong correspondence between high mammal species richness and areas having an intact landscape matrix of forest.
3.4.2 Birds: Habitat mapping was performed for 186 species of breeding birds in Pennsylvania. The matrix of bird habitat relations is given in Appendix 4. A quartile mapping of modeled species richness for birds on the basis of 100-ha cells is shown in Figure 3.4 with ecoregion edges etched into the portrayal. As for mammals, impoverished regions relative to bird species richness include the Coastal Plain, Lake Plain, Piedmont, and Pittsburgh Low Plateau. In contrast to the situation for mammals, the Glaciated Pittsburgh Plateau is a relatively rich area for bird species. The Ridge & Valley is variable for birds, whereas it is substantially impoverished for mammals.
The High Plateau region is considerably lower in richness for birds than for mammals.
3.4.3 Amphibians: Habitat mapping was accomplished for 35 species of amphibians that reproduce in Pennsylvania. Appendix 5 has the matrix containing habitat relations for amphibians, with reptiles also appearing in this same matrix. A quartile mapping of modeled species richness for amphibians on the basis of 100-ha cells is presented in Figure 3.5, with ecoregions etched into the portrayal. Amphibian species richness exhibits relatively little correspondence with that for either mammals or birds. With fewer species, a stronger imprint of hexagon range restrictions is also evident. The major river drainages have a stronger influence for amphibians, and areas having a preponderance of small, fast-flowing headwater streams are less conducive to amphibian richness.
3.4.4 Reptiles: The 34 species of reptiles modeled for Pennsylvania included 10 species of turtles. Habitat relations for both groups appear in Appendix 5, being contained in the same matrix with amphibians. Because of the differences in life histories, however, turtles are treated separately with respect to species richness. A quartile mapping of modeled species richness for turtles on the basis of 100-ha cells is given in Figure 3.6. A corresponding quartile mapping of modeled species richness for snakes and lizards is given in Figure 3.7. Much of the rugged and heavily forested terrain of northern Pennsylvania is largely inhospitable to turtles. The more favorable circumstances associated with valleys of higher-order drainages are evident. The situation is quite different for snakes and lizards, whereby the deleterious effects of landscape fragmentation are particularly apparent. Hexagon determinations of range are also quite strongly expressed for snakes and lizards.
3.4.5 Fishes: Potential habitat was determined and mapped on a small watershed basis for 152 species of fishes, with habitat also being mapped separately for rainbow and steelhead trout. The matrix of habitat relationships for fishes is given in Appendix 6. A quartile map of modeled species richness for fishes is presented in Figure 3.8, with quartiles being determined on an area basis by reference to 100-ha cells. Fish species richness is strongly influenced by stream size and river basin. The French Creek drainage system in northwestern Pennsylvania stands out strongly with respect to species richness, and the Ohio River system in western Pennsylvania is likewise important. It is particularly noteworthy that there is virtually an inverse relationship between the fishes and mammals with respect to concentration of species richness across much of Pennsylvania.
3.5 Accuracy Assessment
Assessing the accuracy of the predicted vertebrate distributions is subject to many of the same problems as assessing land cover maps, as well as a host of more serious challenges related to both the behavioral aspects of species and the logistics of detecting them. These are described further in the Background section of the GAP Handbook on the national GAP home page. It is, however, necessary to provide some measure of confidence in the results of the gap analysis for each species (comparison to stewardship and management status), and to allow users to judge the suitability of the distribution maps for their own uses. We therefore feel it is important to provide users with a statement about the accuracy of GAP predicted vertebrate distributions within the limitations of available resources and practicalities of such an endeavor. We acknowledge that distribution maps are never finished products, but are continually updated as new information is gathered.
However, we feel that assessing the accuracy of their current iteration provides useful information about their reliability to potential users. We especially encourage wildlife biologists and amateur naturalists to treat the predicted distributions as testable hypotheses and to engage in the process of validation and iterative modeling. Our goal was to produce maps that predict the distribution of terrestrial vertebrates and, from that, total species richness and species content with an accuracy of 80% or higher. Failure to achieve this accuracy indicates the need to refine the data sets and models used for predicting distribution. The methods for validating and assessing the accuracy of the vertebrate distribution maps are presented below along with the results.

3.5.1 Methods and Results: Potential habitat distributions predicted by models for mammals, birds, amphibians, and reptiles were assessed by comparison to long-term species checklists and to single-year survey records for research sites. The habitat predictions examined in this regard were mapped at the grain of 30-meter pixels. Results were quite satisfactory for locations where species checklists were compiled over several years, which were viewed as nearly comprehensive surveys (Valley Forge National Historic Site, Gettysburg National Battlefield, Hopewell Furnace National Historic Site, Powdermill Nature Reserve, Hawk Mountain Sanctuary, and the Allegheny National Forest). For locations involving thorough surveys of vertebrates, but only over one year (White Deer Creek, Little Fishing Creek, Poconos), the results suggested that species occurrences were overpredicted. For all wildlife combined, omission rates averaged 4.9% (range of 0-9.6%). A low omission rate is quite desirable, because it indicates that the gap analysis habitat models are not missing many known species occurrences. The omission rate for birds was lowest (2.5%), followed by mammals (3.2%), amphibians (12.8%), and reptiles (21.6%).
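The omission rate above, and the commission rate (species predicted but not detected) reported with Table 3.1, can be sketched as simple set comparisons between a predicted species list and an observed checklist. This sketch assumes both rates are expressed relative to the number of observed species, which is consistent with commission ratios in Table 3.1 that exceed 100%. The function name and the species lists are hypothetical illustrations, not project data.

```python
from typing import Set, Tuple


def error_rates(predicted: Set[str], observed: Set[str]) -> Tuple[float, float]:
    """Return (omission %, commission %) for one site and taxonomic group.

    Omission: observed species that the model failed to predict.
    Commission: predicted species that were not observed.
    Both are divided by the number of observed species (an assumption
    inferred from the ratios in Table 3.1, not a documented formula).
    """
    if not observed:
        raise ValueError("observed checklist is empty; rates are undefined")
    omitted = observed - predicted
    committed = predicted - observed
    n_observed = len(observed)
    return (100.0 * len(omitted) / n_observed,
            100.0 * len(committed) / n_observed)


# Hypothetical amphibian checklist and model output for a single site.
observed = {"wood frog", "spring peeper", "eastern newt", "green frog"}
predicted = {"wood frog", "spring peeper", "eastern newt",
             "pickerel frog", "american toad"}

omission, commission = error_rates(predicted, observed)
print(f"omission {omission:.1f}%, commission {commission:.1f}%")
```

Under this convention a commission rate above 100% simply means the model predicted more undetected species than the checklist contains.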
A more detailed breakdown by location and taxonomic group is given in Table 3.1. Information on errors by species and location is provided in Appendix 7.

Table 3.1. Habitat model error rates by location and taxonomic group (omit = omission, com = commission). Locations: ValFo = Valley Forge, Getty = Gettysburg, HopF = Hopewell Furnace, WdCr = White Deer Creek, LfCr = Little Fishing Creek, PowM = Powdermill, ANF = Allegheny National Forest, HkMt = Hawk Mountain, Poco = Poconos.

Location  Error  Birds    Mammals  Amphibians  Reptiles  Total     Rate
ValFo     omit   4/112    0/20     3/15        3/11      10/158    6.3%
ValFo     com    9/112    16/20    3/15        6/11      34/158    21.5%
Getty     omit   6/113    1/25     2/10        8/16      17/164    10.4%
Getty     com    12/113   16/25    6/10        4/16      38/164    23.2%
HopF      omit   4/113    2/18     1/13        2/8       9/152     5.9%
HopF      com    9/113    24/18    6/13        9/8       48/152    31.6%
WdCr      omit   0/78     0/24     1/17        1/3       2/122     1.6%
WdCr      com    67/78    26/24    4/17        12/3      109/122   89.3%
LfCr      omit   0/91     0/24     0/17        0/4       0/136     0.0%
LfCr      com    57/91    30/24    5/17        14/4      106/136   77.9%
PowM      omit   5/131    0/46     0/24        1/20      6/221     2.7%
PowM      com    9/131    6/46     2/24        3/20      20/221    9.0%
ANF       omit   2/137    2/48     2/25        5/24      11/234    4.7%
ANF       com    20/137   9/48     0/25        1/24      30/234    12.8%
HkMt      omit   4/118    2/33     6/18        6/18      18/187    9.6%
HkMt      com    27/118   16/33    0/18        0/18      43/134    32.1%
Poco      omit   1/79     0/20     2/10        0/3       3/112     2.7%
Poco      com    81/79    33/20    14/10       16/3      144/112   128.6%

The gap analysis models produced longer lists of species that were predicted to be present but were not detected. Considering only the six comprehensive locations, commission rates were 11.8% for birds, 59.6% for mammals, 22.4% for amphibians, and 35.2% for reptiles. Overall commission rates for sites surveyed only one year ranged from 72.6% to 128.6%. The validation results for mammals, birds, amphibians, and reptiles indicate that the predictive gap analysis habitat models omitted only about 5% of actual species occurrences, with both birds and mammals having rates of less than 5%. Commission rates tended to be much higher, suggesting that the species lists generated by gap analysis in Pennsylvania overestimate the actual number of species present. It is important to note, however, that rare and secretive species may often be missing from checklists, even those compiled over years, due to the difficulty in detecting them.
Birds, which are more detectable, had lower omission and commission rates throughout. A single example illustrates the issue. In the White Deer Creek study, 2,000 trap nights using pit traps and drift fences, Museum Special snap traps, and Sherman live traps produced only one specimen of the northern water shrew (Sorex palustris), a rare and secretive species (Brooks unpubl. data). The same species has not been seen or captured at Powdermill Nature Reserve, where extensive small mammal studies have been conducted for decades (J. Merritt, pers. comm.).

Potential habitat distributions predicted for fishes were assessed from sampling records in a large proprietary database maintained at Penn State University. The assessment encompassed 23,169 collection events in 2,880 watersheds, or approximately 30% of the small watersheds. Threatened or endangered species were represented in 1,215 of these collections. An accuracy figure over the sampled watersheds was determined for each species by using the number of watersheds where the species was collected as a base. The accuracy index is the percentage of these base watersheds for the species that were correctly predicted. The average of these percentages over all species was found to be 73%. Accuracy figures for individual fish species are given in Appendix 8.

3.6 Limitations and Discussion

The potential habitat models for the Pennsylvania GAP Project are of a generalized nature, with a consequent tendency to be liberal with regard to what may constitute habitat. These models can indicate which landscapes have the potential to support the species in question, but are not intended to predict occurrence of a particular species in a given year. We consider that such models can generate a defensible and usable list of wildlife species for targeted geographic areas at a landscape scale, regardless of the ecoregion in question.
It should be noted that the view in the PAKAGE database is still more liberal than that reported in the assessment, since a cell is included if it contains any amount of potential habitat. Appropriate use is, therefore, for planning and coordination of conservation efforts across landscapes in a region.
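The effect of including a cell whenever it contains any amount of potential habitat can be illustrated with a minimal sketch. It aggregates a fine binary habitat grid into coarser cells using an "any pixel" rule and compares the apparent habitat fraction before and after. The function names, the toy grid, and the block size are hypothetical; the project's actual cross-tabulation of 30-meter pixels into 100-ha cells is not reproduced here.

```python
from typing import List

Grid = List[List[int]]


def aggregate_any(pixels: Grid, block: int) -> Grid:
    """Flag a coarse cell as habitat (1) if any pixel inside it is habitat."""
    rows, cols = len(pixels), len(pixels[0])
    coarse = []
    for r0 in range(0, rows, block):
        coarse_row = []
        for c0 in range(0, cols, block):
            has_habitat = any(
                pixels[r][c]
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            )
            coarse_row.append(1 if has_habitat else 0)
        coarse.append(coarse_row)
    return coarse


def habitat_fraction(grid: Grid) -> float:
    """Share of grid cells flagged as habitat."""
    total = sum(len(row) for row in grid)
    return sum(sum(row) for row in grid) / total


# Toy 4 x 4 pixel grid with a single habitat pixel in the upper-left corner.
pixels = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
cells = aggregate_any(pixels, block=2)
print(habitat_fraction(pixels))  # fraction at pixel grain
print(habitat_fraction(cells))   # fraction after "any habitat" aggregation
```

In this toy case one habitat pixel is enough to flag a whole coarse cell, so the apparent habitat share rises after aggregation, which is the sense in which the cell-based view is more liberal than the pixel-based assessment.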