Land Cover Data: The Foundation for Conservation Planning

Vegetation is an integral component of our environment. It is a strong, although complex, indicator of the ecological function of natural systems (Grossman et al. 1998). Nearly all animal life is in some way dependent on vegetation for food and shelter, and the influence vegetation has on water cycles and albedo ultimately affects our global climate. Understanding the world we live in, assessing environmental issues, and conserving biodiversity are therefore dependent on our ability to accurately characterize, map, and monitor the vegetation around us.

The need and desire to characterize vegetative land cover has long been recognized. The ancient Greek scholar Theophrastus observed that relationships existed between plants and their environments, but it was not until ~1900 that the first known quantitative measure of plant species was proposed by Raunkiaer (Bonham 1989). A rapid increase in our understanding of the ecological world and the need to map the landscape around us occurred in the first half of the 20th century. Chase (1949) pointed out the need for accurately mapped vegetation types for resource management. The term “land cover” that we commonly use today to describe the physical aspects of the landscape was defined in 1961 as “the vegetational and artificial constructions covering the land surface” (Burley 1961). However, land cover data did not become commonly available until the first remote sensing satellite dedicated to providing information about the earth’s surface was launched in 1972, followed by the development of Global Positioning Systems (GPS) and the availability of advanced computers and software in the 1990s. The advent of these sophisticated tools, coupled with the needs of resource managers, environmental modelers, and policy makers, has led to increased development and use of land cover data.
We have now progressed to the point where digital land cover data are among the most popular data used for resource applications (Thogmartin et al. 2004). The increased activity in land cover mapping has led to a divergence of approaches resulting from various needs, types of remote sensing equipment, and methods for processing and interpreting remote sensing data. The end result is an excessively large number of land cover mapping schemes that tend to be distinct, incompatible with each other, and often applicable only to the application or area of interest for which they were designed. Adams (1999) likened the different land cover classifications to the mythical Tower of Babel, where everyone is working hard but all are speaking different languages, and went on to state that “The progress that has been made so far is despite the large number of schemes, not because of them.” The myth persists that land cover data is accurate and up-to-date (Estes and Mooneyhan 1994), and in our haste we often help perpetuate this myth by using whatever land cover data are “available” for a project without fully considering the accuracy of the data and what effects it will have on our results.

Land cover data commonly forms the foundation for conservation planning. Used alone, it offers the potential for identifying unique or rare plant communities to conserve, but it is more often used as a variable in defining broader habitats of importance across the landscape. Conservation biology has long relied on the idea that protecting habitat for surrogate wildlife species will protect habitat for other species with similar requirements (e.g. focal species (Lambeck 1997), umbrella species (Andleman and Fagan 2000), flagship species (Caro and O’Doherty 1999), and indicator species (Landres et al. 1988)). Similar to its use for defining habitat suitability for wildlife, land cover data is often a component for defining connectivity habitat.
Identification of these important habitats may subsequently be used in the planning and policy making process. It is therefore critical that accurate land cover data be used, not only to identify the most important habitats but also to reduce the cumulative effect of errors in the subsequent decision making process.

Remote sensing specialists and resource managers have often had difficulty reconciling the needs of resource management with the limitations and capabilities of remote sensing data (Hoffer 1994). Additionally, there is something of a disconnect between classification of vegetation in the field by ecologists and classification by remote sensing specialists. The intent of this chapter is to help bridge the information and communication gap between these disciplines. It is geared towards the conservation planner who desires to use digital land cover data for the delineation of wildlife habitat. However, it may also help remote sensing specialists better understand the needs and requirements of conservation planners and other resource managers. Since there is not one ideal classification of land use and land cover to meet all applications, and it is unlikely that one could ever be developed (Anderson et al. 1976, Franklin and Wulder 2002), the focus is on the factors that influence the accuracy of land cover data for the intended purpose and on how remote sensing data is tied to the vegetation we see on the ground. By better understanding these factors, conservation planners can make improved decisions about the selection, use, or development of new land cover data.

The Basics of Remote Sensing

Lillesand and Kiefer (1994) define remote sensing as “the science and art of obtaining information about an object, area, or phenomenon through the analysis of data acquired by a device that is not in contact with the object, area, or phenomenon under investigation”.
The earliest and still pertinent application of remote sensing meeting this definition is aerial photography. By the 1850s balloons and pigeons were being used to carry cameras for use in land surveys, and a passenger of Wilbur Wright provided the first known photograph from an airplane in 1908 (Belward and Eva 2004). In the mid-1940s, Francis J. Marschner began mapping major land use associations for the entire United States, using aerial photographs taken during the late 1930s and the early 1940s (Anderson et al. 1976).

Today, the most common and widely used types of sensors for land cover classification are multispectral scanners. All objects on the earth’s surface emit, reflect, or absorb energy. Multispectral scanners are termed “passive sensors” because they measure energy from the sun that is reflected by these objects, or energy emitted by them, rather than supplying their own energy source. These sensors typically provide measurements in the blue, green, red, and infrared bands of the electromagnetic spectrum, but the number of bands and the specific wavelengths they measure depend on the sensor. Because they are so dependent on illumination from the sun, clouds prevent the acquisition of data, and even water vapor, particulate matter, and variation in the sun’s angle due to time of day, seasonality, and topographic influence affect reflectance measurements. Vegetation indices such as the Normalized Difference Vegetation Index (NDVI, Rouse et al. 1973) were developed in part to reduce these effects, and numerous other methodologies have been developed to decrease variability of sensor data resulting from these factors. These are topics that everyone working with remote sensing data should be familiar with and have a working knowledge of.
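As a minimal sketch of why a ratio index such as NDVI dampens illumination differences, the Python snippet below computes NDVI from red and near-infrared reflectance using the standard definition, (NIR - Red) / (NIR + Red). The function name `ndvi` and the reflectance values are illustrative assumptions, not measurements from any sensor; for Landsat TM the inputs would correspond to bands 3 (red) and 4 (near-infrared).

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for one pixel.

    red, nir: reflectance in the red and near-infrared bands.
    Returns None when both bands are zero (index undefined).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Illustrative (assumed) reflectance values for a vegetated pixel.
sunlit = ndvi(red=0.05, nir=0.40)

# The same pixel under half the illumination: both bands scale by 0.5,
# so the ratio, and therefore NDVI, is essentially unchanged.
shaded = ndvi(red=0.025, nir=0.20)

print(sunlit, shaded)
```

The index only cancels effects that scale both bands by the same factor; wavelength-dependent effects such as atmospheric scattering still call for the correction methods described in the textbooks cited here.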
Further descriptions and methods for correcting these effects can be found in many remote sensing textbooks such as Campbell (1996), Lillesand and Kiefer (1994), and Jensen (1996), as well as in numerous journal articles and manuals for remote sensing software. Appendix A contains a brief bibliography of the many sources of remote sensing data that are available. This will be updated online on the Conservation Planning website that accompanies this book (URL).

Advances are continually being made in the types of available remote sensing equipment, including active sensors that emit their own energy and measure the return. Sensors such as Light Detection and Ranging (LiDAR) and Synthetic Aperture Radar (SAR) are active sensors that do not rely on energy from the sun and are therefore not prone to many of the issues of passive sensors. These sensors can directly measure physical parameters of vegetation, such as height, that must be inferred from passive sensors. Hyperspectral imagery comes from a very specialized type of passive sensor that slices the electromagnetic spectrum into many more discrete bands than multispectral sensors (often hundreds), which may allow the detection of specific plant species. These next generation sensors will undoubtedly improve our ability to classify land cover when they become fully operational. However, many are still in the research phase and beyond the capabilities of most researchers (Turner et al. 2003) or too expensive to be practical for large areas (Donoghue et al. 2004). Therefore, this chapter will focus on multispectral data, since they have been used for virtually all existing land cover classifications and are the most cost effective for new classification work over the large areas that are typical of conservation planning.

Differences amongst multispectral sensors and the data they provide are generally due to the spatial, spectral, and temporal resolutions of the sensors, as defined below.
I have also included brief discussions of several other terms that are pertinent to the following discussion of land cover classifications and methods.

Spatial Resolution – spatial resolution in remote sensing refers to the size of the area on the ground that a single pixel of imagery provides information about. Common examples are 1 km for the Advanced Very High Resolution Radiometer (AVHRR) and ~30 m for most bands of Landsat data from 1982 onward. The multispectral bands of IKONOS imagery have a spatial resolution of 3.2 m when looking straight down (at nadir). However, IKONOS and some other sensors often acquire data at oblique angles (off-nadir) to increase spatial coverage and temporal frequency, which can significantly increase the pixel size (that is, coarsen the spatial resolution) depending on the angle. Figure 1 provides a comparison of what satellite imagery of different spatial resolutions actually “sees” on the ground. (Note: In actuality, sensors sample a circular region rather than the square pixels that are portrayed to facilitate use of data. Sensors are therefore only sampling the center 78.54% of each pixel.)

Spectral Resolution – Figure 2, taken from Lillesand and Kiefer (1994), provides typical reflectance curves for soil, green vegetation, and water. These curves form the basis for the ability of optical remote sensing to identify differences in land cover. Landsat TM samples the electromagnetic spectrum in 7 discrete bands (0.45-0.52 µm, 0.52-0.60 µm, 0.63-0.69 µm, 0.76-0.90 µm, 1.55-1.75 µm, 10.4-12.5 µm, and 2.08-2.35 µm), which generally correspond to peaks and valleys in the reflectance curves. IKONOS provides 4 multispectral bands similar to the first 4 bands of Landsat TM. AVHRR also provides 4 multispectral bands, but since it was designed for meteorological purposes, only 2 of the bands (similar to bands 3 and 4 of Landsat) are typically used for land cover classification.
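The 78.54% figure in the spatial resolution note above is the area of a circle inscribed in a square pixel, assuming (as that note does) that the sampled footprint is a circle whose diameter equals the pixel side $s$:

```latex
\frac{A_{\text{circle}}}{A_{\text{pixel}}}
  = \frac{\pi \left( s/2 \right)^{2}}{s^{2}}
  = \frac{\pi}{4}
  \approx 0.7854
```

The ratio does not depend on $s$, so under this assumption the same fraction holds for a 1 km AVHRR pixel and a 30 m Landsat pixel.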
Temporal Resolution – temporal resolution refers to the amount of time between repeat coverage for a sensor. The AVHRR sensor samples everywhere on the earth daily, while Landsat takes 16 days between repeat coverage.

Selection of remote sensing data is often a tradeoff between spatial, temporal, and spectral resolution in addition to cost, data volume, and image footprint. The spatial resolution of IKONOS and QuickBird may be beneficial for defining fine-scale patterns in land cover over small areas, but the cost, the amount of data, and the difficulty of obtaining timely data over a large area due to their small footprint make them prohibitive in many situations. Similarly, AVHRR is ideal if the question of interest is to track vegetation phenology, owing to its frequency of re-visits, but it is of coarse spatial resolution and limited spectral resolution. Selection of a single data source or the proper combination of data sources must match the desired classification scheme.

Scale – scale is an often misused term. It states that one unit of distance on a map, aerial photograph, etc. represents a specific number of the same units on the ground. It is often presented as a fraction (1/25,000) or ratio (1:25,000), where 1 inch on the map represents 25,000 inches (about 2,083 feet) on the ground. Consequently, 1:25,000 is a smaller scale than 1:10,000 and provides less detail than 1:10,000 (the larger scale).

Classification Accuracy and Precision – accuracy defines “correctness”, the agreement between an assumed standard and the predicted class of data, while precision defines the “detail” of a land cover classification (Campbell 1996). The distinction and interaction between these 2 terms is important. As the precision increases along the gradient “forest, coniferous forest, lodgepole pine”, so does the potential for errors, which generally results in decreased accuracy (Figure 3).
Accuracy of land cover data is important to prevent confusion between classes, but the precision of data is what determines the usefulness for a specific application. There are 2 types of errors associated with each land cover class: errors of omission and errors of commission. Errors of omission for a “grassland” class are those that assign actual grasslands on the ground to another class; the known patch of grassland has been omitted from the resulting classification. Errors of commission for the grassland class refer to locations incorrectly classified as grassland; the classification has committed an error by classifying “forest” or other types as grassland. The distinction between errors is important because a classification could capture 100% of the actual grassland (no errors of omission) simply by delineating the entire area as grassland, at the cost of large errors of commission.

Classification accuracy is often reported from the standpoint of “producer’s accuracy” and “consumer’s accuracy” (also called user’s accuracy) for each class and overall. Producer’s accuracy refers to the percent of features classified correctly amongst those that are actually of that type on the ground; classification of 400 features as forest from 500 features that are known to contain forest results in a producer’s accuracy of 80.0%. In comparison, consumer’s accuracy describes the reliability of the classification as a predictive device. In this situation, the correct classification of the 400 features as forest amongst 800 features that were classified as forest in the predictive map results in a consumer’s accuracy of 50%.

Linking Remote Sensing with Vegetation on the Ground - What Exactly Are We Mapping?

In the classic work on vegetation mapping, Kuchler (1967) defined vegetation as “the mosaic of plant communities in the landscape” and further went on to state that “this definition implies that vegetation consists of more or less distinct mappable units”. It therefore seems logical that we should be able to map land cover with remote sensing data.
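Whether a given map lives up to that expectation is judged with the producer’s and consumer’s accuracy defined earlier. As a minimal Python sketch, the snippet below computes both from a small error matrix (reference class by mapped class). The forest counts reproduce the 400/500/800 example from the definitions; the count of 600 correctly mapped "other" features, and the function names, are assumptions added only to complete the illustration.

```python
# error_matrix[reference_class][mapped_class] = number of features
error_matrix = {
    "forest": {"forest": 400, "other": 100},   # 500 features are forest on the ground
    "other":  {"forest": 400, "other": 600},   # 600 is an assumed, illustrative count
}


def producers_accuracy(matrix, cls):
    """Percent of reference features of `cls` that were mapped as `cls`."""
    reference_total = sum(matrix[cls].values())
    return 100.0 * matrix[cls][cls] / reference_total


def consumers_accuracy(matrix, cls):
    """Percent of features mapped as `cls` that really are `cls`."""
    mapped_total = sum(row[cls] for row in matrix.values())
    return 100.0 * matrix[cls][cls] / mapped_total


def omission_errors(matrix, cls):
    """Reference features of `cls` assigned to some other class."""
    return sum(n for mapped, n in matrix[cls].items() if mapped != cls)


def commission_errors(matrix, cls):
    """Features of other classes wrongly mapped as `cls`."""
    return sum(row[cls] for ref, row in matrix.items() if ref != cls)


print(producers_accuracy(error_matrix, "forest"))  # 400 of 500 reference forest features
print(consumers_accuracy(error_matrix, "forest"))  # 400 of 800 features mapped as forest
print(omission_errors(error_matrix, "forest"), commission_errors(error_matrix, "forest"))
```

Reading the matrix by row gives the producer’s view (omission), and reading it by column gives the consumer’s view (commission), which is why a class can score well on one measure and poorly on the other.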
When using remote sensing data, it is of tremendous importance to obtain accurate information to validate what the remote sensing data products appear to be telling the user; remote sensing products should not be taken at face value (Turner et al. 2003). Unfortunately, there is a lack of coordination and standardization within the natural science community for defining plant communities on the ground, the very descriptors for what remote sensing is sampling. Botanists and field biologists often develop systems for classifying vegetation that are dissimilar, cannot be extrapolated across large areas using remote sensing techniques, and may not be applicable for certain uses within the wildlife and conservation fields.

Naturally occurring vegetation is dynamic and varies according to site-specific and environmental parameters. Plant species composition at any given point will vary throughout the growing season according to the growth cycle of existing plants. While any 2 locations may be close in space, plant species composition and quantities may be quite different. Classifying vegetation under these conditions into a clear, concise framework can be difficult.

Botanists and field biologists typically classify vegetation as either habitat types (Daubenmire 1952), delineating potential vegetation at climax conditions, or cover types, defining existing vegetative conditions. They are commonly named using one indicator or dominant species from the overstory (if present) and one from the understory. Examples are the big sagebrush/Idaho fescue and Idaho fescue/bluebunch wheatgrass types described by Mueggler and Stewart (1980). Depending on the seral stage, habitat types often do not indicate the actual vegetation on the ground, and 2 areas of very different plant species composition can be classified as the same habitat type.
Classification to habitat types using remote sensing data is generally not possible, and the use of field data that relies on habitat type descriptors will be problematic.

Recent trends have centered on using cover types (existing vegetation) rather than habitat types. Several standards for classifying existing vegetation have been proposed but have not been fully adopted. The Federal Geographic Data Committee (FGDC) National Vegetation Classification Standards established an initial hierarchical classification with 9 levels (FGDC 1997). The 7 upper levels of the FGDC standards are based primarily on physiognomy, and the 2 proposed lower levels, although not finalized, are based on floristic attributes. Recent floristic standards were drafted by the Ecological Society of America (ESA) Panel on Vegetation Classification (Jennings et al. 2004). Final adoption of classification standards will promote the consistent classification of existing vegetation by biologists in the field and facilitate communication describing land cover. However, it is unlikely that multispectral data will be able to accurately classify vegetation at the floristic level of the proposed standards. These sensors cannot “see” through dense tree canopies to classify vegetation of the understory. They also have difficulty differentiating amongst homogeneous grass types, much less between mixed types such as Idaho fescue/bluebunch wheatgrass and Idaho fescue/tufted hairgrass.

Defining land cover is essentially an exercise in detecting patterns across the landscape. Accurate classification requires that the scale of remote sensing data matches the scale of field data used for classification purposes and the desired classes within the land cover classification. In both the ecological and remote sensing fields, detecting landscape patterns is a function of the size of individual sample units and the size of the area under investigation.
Ecological studies refer to the size of sample units as “grain”, which is analogous to spatial resolution or pixel size in remote sensing. Both disciplines use the term “extent” or “area of interest” to refer to the area under investigation. Wiens (1989) describes the influence of extent, grain, and their interaction in ecology. Figure 4 indicates the relationship between heterogeneity of vegetation, extent, and grain, and the following text describes the relationship of grain and extent in a patchy landscape (from Wiens 1989). “As the extent of the study is increased (large squares), landscape elements that were not present in the original study area are encountered. As the grain of samples is correspondingly increased (small squares), small patches that initially could be differentiated are now included within samples and the differences among them are averaged out.” Wiens (1989) describes how spatial variance changes depending on grain and extent and discusses the relationship between spatial and temporal scaling.

Two very important points should be realized about the influence of extent and grain on field classification of vegetation: 1) as grain increases, the number of classes and the distinction between classes decrease because more variability is encountered within each sample; and 2) as the extent increases, the number of vegetation classes increases because more classes are encountered.

The 2 points enumerated above for field classification apply equally to classifying land cover with remote sensing. Band values for each pixel (grain) of remote sensing data are a single cumulative value from every tree, bush, blade of grass, rock, etc. that our eyes see within the pixel. Vegetation classes contain a range of types and amounts of plant species, and field biologists use their ability to discern these differences for defining group membership. In contrast, remote sensing “sees” only the cumulative values of each pixel to define group membership.
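The effect of grain described above can be illustrated with a toy example. The Python sketch below aggregates a small grid of class labels into larger blocks and reports, for each block, the dominant class and the fraction of the block it occupies. The grid, the class names, and the `coarsen` function are hypothetical, and taking the majority label is a simplifying assumption: a real sensor integrates reflected energy across the pixel rather than voting on classes.

```python
from collections import Counter


def coarsen(grid, factor):
    """Aggregate a grid of class labels into blocks of factor x factor cells.

    Returns a grid of (dominant_class, purity) tuples, where purity is the
    share of the block covered by the dominant class.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, n_rows, factor):
        coarse_row = []
        for c in range(0, n_cols, factor):
            cells = [
                grid[i][j]
                for i in range(r, min(r + factor, n_rows))
                for j in range(c, min(c + factor, n_cols))
            ]
            label, count = Counter(cells).most_common(1)[0]
            coarse_row.append((label, count / len(cells)))
        coarse.append(coarse_row)
    return coarse


# Hypothetical fine-grained landscape: one label per small sample unit.
fine = [
    ["forest", "forest", "grass", "grass"],
    ["forest", "forest", "grass", "shrub"],
    ["forest", "grass",  "shrub", "shrub"],
    ["grass",  "grass",  "shrub", "shrub"],
]

for factor in (1, 2, 4):
    print(factor, coarsen(fine, factor))
```

At the finest grain every unit is pure; at a factor of 2 some blocks already mix two classes; at a factor of 4 the whole area collapses into a single unit in which no class covers even half of the ground.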
All other things being equal, a smaller pixel size will encounter less mixing of vegetation types within each pixel and be able to detect smaller patches on the ground, similar to that portrayed in Figure 4. Smaller pixel size can therefore increase the precision of the classification, as indicated in Figure 5a. Spatial resolution corresponds to scale in the remote sensing context (Woodcock and Strahler 1987), and the x-axis of Figure 5a is also labeled as “Scale of Data”. Similarly, precision of any remote sensing classification is also expected to decline as the extent increases (Figure 5b). More classes are encountered, as noted in point 2 above, which increases the potential for confusion between classes. Classes must often be combined to maintain accuracy, which corresponds to a reduction in precision. Additionally, high costs and the large amount of data typically limit classification over large extents to remote sensing data of coarser spatial resolution, which also reduces precision.

Although results of many land cover classifications are quite impressive, the inherent process introduces the potential for errors. Field data used to “train” remote sensing data cannot provide every possible combination of plants, rocks, etc. within each vegetation class or provide the full range of values that remote sensing data samples. Land cover classes with distinct combinations of reflectance values will always be more accurate, as even the best remote sensing algorithms will confuse pixels with similar values.

Locations of data collected on the ground require an accurate match with corresponding pixels of remote sensing data. Current Global Positioning Systems (GPS) produce very accurate locations, often within 1 m depending on the type of receiver and satellite configuration at the time of each location. However, modern techniques are still not able to make remote sensing data conform exactly to actual locations on the earth. Loveland et al.
(2000) recommended image registration as an important area of future research. The best technique is to match known points on the imagery (referred to as Ground Control Points (GCPs)) with GPS locations taken on the ground and “warp” the imagery to conform. Even with this process, a general rule of thumb to account for inaccuracies in geographical locations is to sample an area equivalent to a 3x3 pixel area (~90 m x 90 m for 30 m pixels) and apply field data to the center pixel of the corresponding imagery. This method often makes the grain of field samples quite large and limits the precision of vegetation classification.

Although there are limitations to producing land cover classifications with the desired precision, we are coming closer to bridging this gap. Land cover mapping at any scale yields imperfect results (Loveland et al. 2000). However, the increased spatial resolution of recently available satellites is analogous to a smaller grain, and sensors with greater spectral resolution increase the ability to differentiate between vegetation types. Use of remote sensing for land cover classification is the only practical method for covering large areas, and both our use and abilities in this field are steadily improving.

Matching the Needs of Wildlife with Land Cover Data

Wildlife typically have specific habitat requirements, an idea that David Lack (1933) may have been the first to propose. It is therefore imperative that land cover mapping classes match the type and scale of habitat selection for the wildlife species of interest. Habitat selection can be considered floristic in nature, determined more by structural components, or a combination of the two. Numerous grazing studies have documented the selection of specific plant species by many wildlife species. Fishers generally avoid open areas and utilize a range of plant communities containing a high amount of vegetative structure (Jones and Garton 1994).
Sage grouse are strongly associated with sagebrush habitat, but they require specific structural components of sagebrush and amounts of herbaceous cover within sagebrush habitat depending on season (Connelly et al. 2000).

The problem of relating phenomena across scales is the central problem in biology and all of science (Levin 1992), and selection of these habitat components is scale dependent. Owen (1972) stated that “selection can be exercised at different scales”. Johnson (1980) suggested a natural ordering of selection processes: first-order selection defining the physical or geographical range of a species, second-order selection determining home range, third-order selection pertaining to usage of habitat within the home range, and fourth-order selection pertaining to the procurement of food items at a site. In the case of large herbivores, Senft et al. (1987) proposed regional, landscape, and plant community scales of habitat selection, where the plant community level is essentially the same as the fourth order described by Johnson (1980).

As the order of selection increases from a regional or geographical order to a feeding site level for any animal, the extent of the search area decreases (scale becomes larger) and the specificity of vegetation parameters increases. The exact level and rate at which these variables change is obviously dependent on the species in question. Grizzly bears are a wide-ranging species that vary their habitat use amongst seasons and across their range. Their home ranges are much larger, and their habitat requirements less specific, than those of many other species (e.g. boreal toads), and the rate of change from initial selection orders to the procurement of food items is overall much greater.

The level of precision required from land cover data to address the differences in specificity and scale of habitat selection for wildlife can be generalized in Figures 6a and 6b.
Amongst species, the required precision increases as the specificity of habitat requirements increases from habitat generalists to habitat obligate species (Fig. 6a). Similarly, the required precision also increases as the order of selection increases within individual species (Fig. 6b). However, the relationship between land cover precision and intraspecific order of use can be considered analogous to a generalization of interspecific scale of use. Third-order selection for grizzly bears will equate to a similar size area and specificity of habitat components as second-order selection for narrower ranging species that tend to be more habitat obligate. Therefore, interspecific scale of habitat selection is also indicated on the x-axis.

Many wildlife studies assume available land cover data are not accurate (Cunningham 2006) or are of insufficient precision for the species or question of interest, and develop their own classification schemes for both field studies and developing digital land cover data. As an example, there are at least 3 methods for field classification of vegetation in the central plateau of Yellowstone National Park: a classification of grasslands and shrublands (Mueggler and Stewart 1980), a vegetation classification specific to Yellowstone National Park (Despain 1990), and one specific for mapping grizzly bear habitat in the Yellowstone ecosystem (Mattson and Despain 1985). However, none of these were sufficient for classifying vegetation in relation to a grazing study for bison, and an additional classification was developed (Olenicki and Irby 2002). All are valid classifications with precision intended for their specific use.

Information from research using various classification schemes similar to those noted above is typically used for constructing habitat suitability models or identifying specific habitats for conservation applications.
Even if digital land cover is developed for a wildlife project, the extent frequently does not cover conservation areas of interest and the best available data must often be used. Specific parameters such as “early seral lodgepole forest” may need to be extrapolated into the broader classes of “lodgepole forest” or “coniferous forest”, which will in itself overestimate the amount of suitable habitat. Additionally, the amount and location of specific patches will depend on the accuracy of the land cover classification. Errors of omission will assign actual coniferous forest in the above example to a different class, while errors of commission will erroneously assign the “coniferous forest” class to areas that are not coniferous forest. Both types of errors will incorrectly identify patches of habitat.

In some situations, a conservation area of interest may cross jurisdictional or other boundaries, resulting in the need to combine sets of land cover data. The difficulty in extrapolating habitat requirements to a dissimilar classification is further compounded when 2 different classifications are used. Examples of many of the issues discussed in this section will be noted in the following section comparing several examples of land cover data.

Selection of Land Cover Data

There is not one ideal classification of land cover to meet all applications, and it is unlikely that one could ever be developed (Anderson et al. 1976). This is probably truer today than when Anderson first made this statement. Advanced sensors, modern techniques to tease out characteristics within pixels, and readily available remote sensing data have drastically increased our abilities to develop land cover classifications for a variety of applications. The numerous classifications that have been conducted often give the impression that land cover data is available for any application and location, but this is not the case.
While land cover data exists for virtually the entire earth, much of it does not meet the needs for delineating habitat suitability and other conservation uses in many situations.

Although the relationships presented in Figures 5 and 6 are not fully understood, the concepts they represent can be used to help match land cover data with the needs for delineating habitat suitability. Figures 6a and 6b can be used to help identify the required precision for the intended task. Habitat suitability models for species that tend to be habitat obligates or utilize relatively small areas, or models of upper selection orders, require data of high precision. High precision translates to the need for land cover data at a larger scale, that is, finer spatial resolution (Fig. 5a), and subsequently the ability to develop models only over smaller areas (Fig. 5b). Conversely, developing a conservation plan over a large area of interest (from Fig. 5b) generally translates into land cover data of lower precision (and usually at a smaller scale, as previously discussed). Habitat suitability models can then only be developed for situations appropriate for less precise data: habitat generalists, wide-ranging species, or lower selection orders within desired species. In addition to habitat suitability, these ideas generally apply to the requirements of land cover data for other ecological and conservation questions; more precise data is needed as more specific questions are asked.

The following examples of digital land cover data are a few of the many available. They were chosen to represent differences in spatial resolution (scale), extent, and precision of data. Their advantages and disadvantages will be discussed, as well as my own experience with their use for modeling habitat suitability in the conservation field.

Global Land Cover

Global land cover data treating all parts of the world equally and representing actual land cover of our planet have been available since the 1990s (Belward and Eva 2004).
Table 2 from Strand et al. (2007) provides 5 sources of readily available global land cover data. The major advantage of these data is their ability to cover any location on the earth with a consistent classification scheme. One of these classifications, the AVHRR Global Land Cover (Hansen et al. 2000), is available with a spatial resolution of 1 km, 8 km, and 1 degree and there are either 13 or 14 land cover classes depending on the spatial resolution. The 14 classes defined by the 1 km and 8 km data (Table 1) are very broad and limited to defining habitat suitability at a regional or geographical scale. However, periodic updates (e.g. MODIS Land Cover) allow for change detection and they all provide insight into areas for closer examination using other data sources.

National Landcover Database

The National Landcover Database (NLCD, available at) is the product of a long-term, multipartner project dedicated to land cover products. Originally released in 1992, a newer version using revised methodology was released in 2001 that includes percent tree canopy and percent urban imperviousness. Slight modifications were made in classes (Table 2). Data are based on Landsat imagery and are available for the conterminous United States, Alaska, and Puerto Rico (http://www.mrlc.gov/index.php). For delineating habitat, there are 2 main advantages of the NLCD data compared to the global land cover previously discussed: 1) the increase in spatial heterogeneity of cover classes that the 30 m pixel size of the Landsat sensor provides compared to 1 km for AVHRR, and 2) estimates of canopy coverage for the 2001 version. Actual classes of NLCD data are still fairly broad and best suited for delineating lower orders of habitat selection, but estimates of canopy structure increase habitat modeling capabilities for species such as fisher and lynx that rely on forest structure.
Classification accuracy was assessed by mapping region for the 1992 NLCD data but has not yet been assessed for the 2001 version. For region 8 (MT, WY, ND, SD, UT, CO), overall consumer’s accuracy for single pixels is listed as 60% (Table 3). A closer look at accuracy for individual classes indicates that 8 of the 20 classes have an estimated accuracy of 15% or less. The specific methods utilizing these data will influence results, but the variability in class accuracy would be expected to produce variable results depending on the species and their dependence on classes with high or low classification accuracy. Cunningham (2006) and Thogmartin et al. (2004) provide good discussions on the use of NLCD data for habitat studies and some of the reasons for classification errors.

GAP Analysis Landcover

The intent of the Gap Analysis Program (GAP) is to identify and maintain non-threatened animal species and plant communities that are not covered by other legislation and may not occur on existing conservation lands. Mapping land cover and predicting species distribution across the United States are 2 goals of this project. GAP is probably the best available source of information pertaining to land cover data and species distribution modeling. Many research projects, publications, and land cover classifications have been produced under this program. GAP land cover is based on the same Landsat imagery as NLCD but is generally of higher precision due to classification over smaller extents. Classifications have generally been conducted individually for each state and are available at: http://gapanalysis.nbii.gov/portal/community/GAP_Analysis_Program/Communities/GAP_Home/. However, there has been little standardization of methods amongst states, which causes problems when conducting conservation work across state boundaries. Figure 7 indicates a location along the MT and WY border where GAP land cover data from MT and WY overlap.
The location is within the Greater Yellowstone Ecosystem (Noss et al. 2001) and Yellowstone National Park. The red polygon indicated in the figure comprises ~8028 acres and is classified under WY GAP as 80% lodgepole pine and 20% subalpine meadow. Figure 8 indicates this same polygon as classified by MT GAP, where each color represents a different land cover type. The WY data provide useful information, but the MT data are clearly more precise. Crosswalking of vegetation types has been used in these situations to produce a consistent land cover classification, but it is often a best-guess exercise when actual plot data are not used. Detailed information on floristic composition and canopy cover is required to correctly crosswalk between vegetation types (Brohman and Bryant 2005, FGDC 1997, Jennings et al. 2004). The new regional classifications being produced by the GAP program will resolve differences across state boundaries but at the cost of more generalized classes than many of the individual state projects produced. I ran into a similar situation of differences across jurisdictional boundaries when developing habitat models for the Inland Temperate Rainforest that encompasses parts of MT, ID, WA, and British Columbia. Variation across boundaries did not allow even a close approximation for crosswalking. Replacing vegetation types with a rating system also did not solve the problem due to differences in precision and scale amongst the classifications. A more generalized classification system covering the entire area was used. It provided consistent results but at a much broader level of habitat selection than data from some of the individual jurisdictions could provide.

Classifications Using IKONOS or QuickBird Imagery

Launches of the IKONOS and QuickBird sensors in recent years have greatly increased the spatial resolution of available imagery. IKONOS imagery has shown the ability to identify tree canopy (Snyder et al. 2005), individual trees (Read et al.
2003), and to differentiate between vegetation types in dry shrub/grassland (Depew 2004). At this point, there are no widespread classifications available using these types of imagery. The extent of each scene is generally small, data are fairly costly, and the spatial resolution provided by these sensors results in very large data sets. The same could once be said for Landsat data, so this will likely be less of an issue in the future. Although use of these types of data offers increased capabilities, their current use for habitat modeling is limited to species occurring over small areas and to serving as an intermediate sampling tool between ground surveys and coarser-grained imagery.

Vegetation Resource Inventory of British Columbia

The Vegetation Resource Inventory (VRI) of British Columbia (MSRM 2002) is a hierarchical classification based on the physiognomy of vegetation. Figure 9 shows the various levels of classification for a polygon initially classified as vegetated. In my opinion, this is one of the best classification schemes for delineating habitat. The hierarchical structure of VRI reduces the potential and severity of classification errors. The confusion amongst all classes (errors of omission and commission) that is present in most classifications is reduced in the VRI scheme by successively classifying polygons into each class. The resulting confusion between “dense coniferous” and “dense broadleaf” habitat in the upland position of the VRI scheme does not affect results as much as confusion between “coniferous forest” and “low density residential”, which is more likely to be present in other schemes. Most habitat delineation relies on ancillary data, such as a digital elevation model (DEM), in addition to land cover data. Most land cover data provide the base vegetation, such as coniferous forest or the actual species of conifers as in the GAP data for MT, while the DEM provides topographic variables (e.g. slope, aspect, elevation, roughness).
Assumptions are often made as to the structural or productivity characteristics using the topographic variables; north slopes generally contain denser forests and more productive grasslands than south-facing slopes, but the actual amounts and variability are unknown. In contrast, it is easier to make assumptions about the composition of plant species and specific habitat using information from the VRI scheme. As an example, the VRI descriptors “open, coniferous, uplands” on west-facing intermediate slopes in southwest MT would be expected to contain mature Douglas fir mixed with grasslands and provide a fairly specific habitat description. The same slope using a “coniferous forest” descriptor could contain a range of structural classes for Douglas fir and lodgepole pine, and even the “Douglas fir” type could contain a range of classes from seedlings to mature trees. I used VRI as the main component for developing habitat models for 7 focal species across the 16.2 million hectare Muskwa-Kechika management area in northern British Columbia (Heinemeyer et al. 2004). Models generally conformed to BC standards (RIC 1999), consisting of separate feeding and living components for both winter and summer for each species. Results proved reliable when validated with telemetry and aerial survey data. One disadvantage of this classification is the size of the data set and the amount of computer time it took to run the models. Additionally, classification methods rely on air photo interpretation, which is variable amongst interpreters, quite time consuming to conduct over large areas, and therefore difficult to update. VRI is a very useful classification scheme and the use of newer high resolution imagery or other methods to streamline the classification process should help to expand its use.

Classification Using Digital Orthophotos

Interpretation of air photos has long been an accurate method for land cover classification, but even with digital orthophotos that allow on-screen classification it is time consuming and impractical over large areas. The ability for computers to discern the same objects photo interpreters see would increase the use of this readily available data. Miller et al. (2004) were able to use image processing software and digital air photos to identify tree canopies. Akbari et al. (2003) had limited success with machine processing in an urban area. Color air photos are similar to satellite imagery in that each pixel is composed of distinct values for red, green, and blue that produce the displayed color. The key difference is that color air photos cover the full range of these colors rather than the specific regions that satellite imagery targets. Nonetheless, color air photos can still be decomposed into their individual bands and treated like satellite imagery, and I used this potential in a hybrid between remote sensing classification and air photo interpretation. The conservation project I used it for called for habitat models for 2 focal species (grizzly bears and elk) over a relatively small area. I initially used MT GAP land cover data and general habitat models I developed for that data, but results were coarser than desired for the size of the area. In some situations, the more detailed the feature becomes, the greater the variation detected within a class rather than between classes (Grenzdorffer and Bill 1994). This was the situation encountered for coniferous forests, where the shadowing between tree canopies created more variation within rather than between classes of trees. Therefore I aggregated each of the 3 bands from a 1 m to a 5 m pixel size, which smoothed values within classes.
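The aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which resampling method was used, so a simple block mean is assumed here, and the function name `aggregate_band` and the list-of-rows band layout are hypothetical.

```python
def aggregate_band(band, factor):
    """Block-average a 2-D band (list of equal-length rows) by an integer factor.

    Assumption: aggregation is a plain block mean; e.g. factor=5 turns
    1 m pixels into 5 m pixels. Trailing rows or columns that do not
    fill a complete block are dropped.
    """
    rows, cols = len(band), len(band[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            # Collect every pixel inside the factor x factor block.
            block = [band[r + i][c + j]
                     for i in range(factor)
                     for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical 4 x 4 red band aggregated by a factor of 2.
red = [[1, 3, 10, 10],
       [5, 7, 10, 10],
       [0, 0, 2, 4],
       [0, 0, 6, 8]]
print(aggregate_band(red, 2))  # [[4.0, 10.0], [0.0, 5.0]]
```

Each of the 3 color bands would be passed through the same function before classification. Averaging trades spatial detail for lower within-class variance, which is the smoothing effect noted above.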
I then ran an unsupervised classification and used a combination of the MT GAP land cover data and photo interpretation of the original air photo to assign MT GAP descriptors to output classes from the unsupervised classification. The end result was land cover data that were more accurate and at a larger scale (Figure 10), but maintained the MT GAP classes so I could run the original habitat models. The method I used may work only within the extent of each photo used to create digital air photo mosaics. Air photos are often taken at different times of the day or even different months or years for large areas. The differences in actual color values due to changes in sun angle and plant phenology may require a separate classification for each frame. Stand-alone software and modules or extensions for many remote sensing packages (e.g. ENVI Feature Extraction Module) are now available to identify shapes within air photos and other high resolution data. These products base their classification on the combination of shape and reflectance of spectrally similar units rather than just spectral differences. Walker and Blaschke (2008) used this process to classify vegetation types, buildings, and impervious surfaces within an urban area. These products offer the potential to classify life forms of vegetation over large areas using air photos alone or in conjunction with satellite imagery and more common techniques.

Recommendations for Future Research

New technologies are advancing land cover classification to a point where data match our needs and desires. The emerging LiDAR technology offers the ability to estimate structural components of vegetation (Lefsky et al. 2002), the full potential of high resolution IKONOS and QuickBird imagery is only beginning to be realized, and advances are being made in hyperspectral sensors.
By combining active sensors with existing multispectral sensors, we can add a third dimension to the existing two-dimensional data we’ve been using. Mundt et al. (2006) fused hyperspectral and LiDAR data to improve classification and provide structural components of sagebrush in semi-arid rangeland. But we cannot wait for these technologies to advance and must strive to make improvements with the tools we have. The destruction and fragmentation of natural habitats is often considered to be a leading cause in the decline and loss of native species (Newmark 1985, Sinclair et al. 1995, Turner et al. 2003). In the struggle to maintain biodiversity and conserve habitat, the remote sensing tools are there; let’s hope the users soon follow (Turner et al. 2003). There are many topics for further research, but 2 initial ones come to mind. The ability to better match land cover data with the species, selection order, and area of interest would increase accuracy of individual projects and consistency amongst projects. Although determining the actual locations along the lines in Figures 5 and 6 is complex, a rating system could be developed for the use of these figures. Even the relative locations for a range of species and selection orders in Figure 6 (e.g. brood rearing habitat for sage grouse and home range for grizzly bears) and their corresponding locations in Figure 5 could serve as a guideline for matching land cover data with a specific application. The second topic is the development of a land cover classification system using remote sensing that is specific to the needs of delineating habitat. Although this seemingly adds to the confusion described by Adams (1996, 1999) and deviates from the call for standardization, we may not be at the point where standardization of methods is possible. Until technology provides the ability to determine all desired land cover variables, specialized land cover classification will continue to occur.
Striving toward standardization is desirable, but should not limit conservation work and the use of remote sensing data. The following conceptual classification is intended to provide structural information about vegetation and reduce the consequences of confusion amongst classes, two important aspects of land cover data for defining habitat. It is a hierarchical classification where the ecological variable at each level determines the methodology. It is similar to the multi-level approach for land cover classification described by Anderson et al. (1976) and considers all 10 criteria they identify, with particular attention to equal accuracy amongst classes, applicability over large areas, allowing aggregation of categories, and comparability with future data. The classification is geared toward the use of Landsat imagery, but should also be applicable for higher resolution imagery such as QuickBird or IKONOS in specific applications: Landsat imagery providing classification across large extents, with higher resolution imagery providing increased spatial precision (e.g. Figure 10) over small areas for use at higher selection orders and for animal species considered habitat specialists. Using one or several scenes of higher resolution imagery in conjunction with a Landsat scene for the following classification will allow a comparison of the utility of each type of imagery for this purpose. The inclusion of higher resolution imagery will also help address the scaling issues between on-the-ground data and Landsat imagery. The combination of field data with higher resolution imagery and digital air photos can be used to create large training sites for Landsat imagery and aid in accuracy assessment of the resulting classification. As previously stated, the following is a conceptual classification scheme. Many details need to be worked out should it be conducted and changes will undoubtedly occur.
Nevertheless, it provides a starting point for a land cover classification specific to the needs of habitat delineation for wildlife, provides options for other conservation needs, and offers points of discussion between resource managers and remote sensing specialists for linking vegetation we experience on the ground with digital land cover data. It also attempts to identify a few of the many ecological variables and processes occurring across the landscape and use them as part of the classification. These processes may or may not be useful for differentiating land cover classes, but should always be kept in mind when conducting a new classification or using existing land cover data.

Step 1: Inputs

There are a number of correction, enhancement, and pre-processing techniques for satellite imagery that are often project-specific and a matter of personal preference. These include geometric, atmospheric, and topographic correction as well as calculating vegetation indices and compressing the information from all available bands into fewer bands through the use of principal components analysis (PCA). A review of their benefits and applications can be found in most remote sensing books. Landsat imagery offers more options for pre-processing and enhancement than QuickBird or IKONOS due to the longer duration it has been available and the greater number of bands it possesses. Additionally, the increased spatial resolution and off-nadir angle at which QuickBird and IKONOS imagery are often acquired can complicate some pre-processing techniques. Wu et al. (2008) provide a discussion and methods for topographic correction of QuickBird data. However, their methods may not be practical in most conservation applications due to the need for detailed digital elevation models. The view angle of scenes should also be considered whenever using imagery from a taskable satellite, especially when multiple scenes are used together.
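Of the inputs mentioned in this step, a vegetation index is the simplest to illustrate. The sketch below computes the normalized difference vegetation index, NDVI = (NIR - Red) / (NIR + Red), pixel by pixel. The function name `ndvi`, the list-of-rows band layout, the sample reflectance values, and the choice to return 0.0 where both bands are zero are all assumptions made for this example rather than part of the proposed method.

```python
def ndvi(nir, red):
    """Per-pixel NDVI for two co-registered bands given as lists of rows.

    NDVI = (NIR - Red) / (NIR + Red). Pixels where both bands are zero
    are assigned 0.0 here (an arbitrary choice for this sketch).
    """
    result = []
    for nir_row, red_row in zip(nir, red):
        out_row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            out_row.append((n - r) / total if total != 0 else 0.0)
        result.append(out_row)
    return result


# Hypothetical reflectance values for a 2 x 2 window.
nir_band = [[0.50, 0.40], [0.10, 0.0]]
red_band = [[0.10, 0.40], [0.30, 0.0]]
for row in ndvi(nir_band, red_band):
    print([round(v, 3) for v in row])
```

Dense green vegetation reflects strongly in the near infrared and absorbs red light, so it yields values near the upper end of the -1 to 1 range, while bare soil and water fall near or below zero.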
Figure 11 provides a generalized flow chart of the proposed classification process. The black box on the left identifies the inputs from Landsat or higher resolution imagery into the process and subsequent boxes indicate the resulting classification at each level. Table 4 provides brief class descriptions at each level. Methods used for each level are indicated along the arrows connecting the boxes and red ovals indicate the use of ancillary data at each step. Anticipated inputs to the process are the individual bands of satellite imagery, principal components images, and a vegetation index such as NDVI. Images from 2 different dates, early and late in the growing season, are desirable to identify phenological changes. Within the conterminous U.S., separate classifications will occur for each ecological section (McNab et al. 2007) when the area of interest crosses boundaries between sections (similar delineation of ecological sections has occurred for many areas outside the U.S.). Sections are defined as large areas of relatively homogeneous physical and biological components that interact to form environments of similar productive capabilities, where each map unit defines a region of unique ecological characteristics that differs from its neighboring units (McNab et al. 2007). Limiting classification to within sections provides the ability to match imagery with the phenology of the vegetation; optimal dates for imagery can be identified based on anticipated phenological conditions. Each section should use imagery of the same approximate date to prevent phenological differences. Additionally, errors at the floristic level will be limited to confusion amongst classes that occur within the section rather than over a much larger extent, such as exists for many state GAP classifications.

Step 2: Classification to Level 1

A supervised classification using one of the hard classifier algorithms is proposed to assign membership into the 6 broad classes of level 1 (Table 4).
Confusion amongst classes at this level may have the greatest consequences on accurate habitat delineation and these classes were selected to represent biologically distinct and relatively easy to separate divisions in land cover. In some ways this can be considered a dichotomous selection that indicates potential habitat for most terrestrial species (vegetated) and non-habitat (other classes), and a hard classifier was selected to make this distinction. User’s accuracy greater than 90% for each class is desired to reduce the propagation of errors into subsequent levels. An accuracy assessment should be made and methods adjusted if necessary to increase accuracy of all classes. In particular, errors of commission for the vegetated class that incorrectly classify locations as vegetated should be reduced. Since this is a conceptual process at this point, all classes at each level are subject to change and some aspects of the classification process are glossed over. Similar to pre-processing techniques, the effect of cloud shadows is a topic that needs to be addressed and the “urban” class needs to be defined if this classification is implemented. The “burned forest” class was included to identify this habitat for those species dependent on it, for the potential uniqueness in reflectance characteristics, and for the ability to re-classify it as early successional forest in subsequent years. Burned grasslands and shrub/grassland classes may also be incorporated for areas burned a month or so prior to imagery acquisition. Areas classified as barren, water, urban, burned, and clouds have reached their endpoint in the classification. Only those assigned the vegetated class continue in the classification process.

Step 3: Classification to Lifeform

Step 3 is a physiognomic classification that classifies vegetated pixels as conifers, broadleaf trees, shrubs, herbaceous, or mixtures of these life forms.
The use of imagery from different dates and calculated vegetation indices may be especially helpful in this step due to differences in phenology amongst classes. Fuzzy classification, which provides a value for membership in each class rather than distinctly assigning a pixel to one class, was chosen for several reasons. From a remote sensing perspective, percent composition values other than those listed in Table 4 may provide a clearer division of categories, and examination of results from a fuzzy classification will indicate and allow adjustments of breakpoints. From a biological perspective, pixels that contain a mix of these types may be vegetatively more diverse and indicate better habitat for certain animal species. The “fuzzy” assignment amongst multiple classes may therefore provide useful information.

Step 4: Calculation of Topographical Position Index

In addition to the variability amongst ecological sections discussed in step 1, vegetation at any site often depends on the topographic position. For example, ridge tops are windier and drier, and their soils are often shallower, than a gully at the bottom of a hill, and vegetation therefore differs between them. Step 4 is a modeling exercise to calculate the 6 topographic positions indicated in Table 4 as ancillary data for subsequent classification. Classes and original methodology are from Weiss (2001) and the ability to easily calculate Topographical Position Index (TPI) was incorporated into an ArcView extension (Jenness 2006). Calculation of TPI is scale sensitive and the ability of topographic classes to match vegetation changes is likely influenced by local topography of the classification area. The same ground-truth data that are typically used for accuracy assessment of classifications may possibly be used to adjust TPI calculations to match vegetation change in the area of interest.
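To make the scale sensitivity of TPI concrete, the following minimal sketch computes the index as the elevation of a cell minus the mean elevation of its neighbors. It is not the Jenness (2006) extension: the function name `tpi`, the square moving window, and the `radius` parameter (in cells) are assumptions for illustration, and the assignment of the 6 slope position classes in Table 4 is not attempted.

```python
def tpi(dem, radius):
    """Topographic position index for a DEM given as a list of rows.

    TPI = cell elevation minus the mean elevation of the surrounding
    cells. A square window of the given radius (in cells) is assumed
    here; windows are truncated at the DEM edges.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = [dem[i][j]
                          for i in range(max(0, r - radius), min(rows, r + radius + 1))
                          for j in range(max(0, c - radius), min(cols, c + radius + 1))
                          if (i, j) != (r, c)]
            if neighbours:  # a 1 x 1 DEM has no neighbours; leave TPI at 0.0
                out[r][c] = dem[r][c] - sum(neighbours) / len(neighbours)
    return out


# Hypothetical 3 x 3 DEM with a single high cell in the centre.
dem = [[10, 10, 10],
       [10, 20, 10],
       [10, 10, 10]]
result = tpi(dem, 1)
print(result[1][1])  # 10.0: the centre cell sits above its neighbours
```

Positive values indicate cells higher than their surroundings (ridges, upper slopes) and negative values indicate cells lower than their surroundings (valleys). Rerunning the function with a larger `radius` changes which landforms stand out, which is the scale dependence noted above.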
Additionally, the use of TPI singly or in combination with aspect calculations may be beneficial in correcting some of the topographical effects on QuickBird imagery discussed by Wu et al. (2008).

Steps 5 & 6: Structural Classes and Dominant Floristics

The last steps of the process classify vegetation into structural classes and identify the dominant plant species to the extent possible. Of the 2, structural classes are the more important because good inferences can be made as to vegetative composition if the ecological section, life form, topographic position, aspect, elevation, and finally vegetative structure are known. However, classification to dominant species is often useful in some applications and may be helpful for determining structural aspects. I have suggested the use of unsupervised classification as the driving force in this process to allow differences in reflectance to be the determining factor for the extent to which floristic composition can be determined. The influence of the number of specified classes on accuracy will need to be examined, as will the use of unsupervised classification on selected subsets of Levels 2 and 3 or in conjunction with multiple inputs from these levels. Dominant species amongst herbaceous types may be the most difficult to determine, but specific NDVI values and differences in NDVI values between imagery acquisition dates should allow classes to be assigned such as irrigated cropland and low, medium, and high productivity grasslands. However, highly productive grasslands will generally also fall into the “dense herbaceous” structural class, and the dashed line connecting the floristic and structural boxes of Figure 11 is meant to symbolize interactions between these levels that may occur. Assignment of final classes may require a feedback loop between these 2 levels, which is the reason they are depicted as simultaneous steps.
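As a rough illustration of how an unsupervised classification groups pixels by reflectance alone, the sketch below uses a basic k-means clustering. The text does not name a clustering algorithm, so k-means is a stand-in chosen for brevity. The function name `kmeans`, the evenly spaced initial centers, and the sample pixel values are likewise assumptions.

```python
def kmeans(pixels, k, iterations=20):
    """Cluster pixels (tuples of band values) into k spectral classes.

    A stand-in for the unsupervised classifiers found in remote sensing
    packages. Initial centers are taken at evenly spaced positions in
    the pixel list so that results are repeatable.
    """
    if not 0 < k <= len(pixels):
        raise ValueError("k must be between 1 and the number of pixels")
    step = len(pixels) // k
    centers = [list(pixels[i * step]) for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assign each pixel to the nearest center (squared Euclidean distance).
        for n, p in enumerate(pixels):
            labels[n] = min(
                range(k),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(p, centers[m])),
            )
        # Move each center to the mean of its member pixels.
        for m in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == m]
            if members:
                centers[m] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centers


# Hypothetical (red, NIR) reflectance pairs for four pixels.
pixels = [(0.10, 0.20), (0.12, 0.22), (0.80, 0.70), (0.82, 0.72)]
labels, centers = kmeans(pixels, 2)
print(labels)  # [0, 0, 1, 1]
```

The output classes carry no ecological meaning on their own; as described above, class names still have to be assigned afterwards from field data, photo interpretation, or existing land cover. Rerunning with different values of `k` is one simple way to examine how the number of specified classes affects the result.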
Decision trees (classification and regression trees) have been used to classify vegetation at finer spatial resolution than the imagery used (Joy et al. 2003), including canopy coverage (Herold et al. 2003). Although decision trees require a large sample size (Pal and Mather 2003), they are non-parametric, with the advantages that they can use different types of response variables, are invariant to transformation of explanatory variables, and have a capacity for interactive exploration, description, and prediction that is useful in ecological studies (De’ath and Fabricius 2000). They were therefore chosen for estimating vegetative structure. Canopy coverage for each class (upper stratum of tree and shrub classes that also contain an understory) is presented in Table 4. Two size classes for the treed categories are also proposed with the idea that the floristic classification and group membership from the fuzzy classification of step 3 may especially help differentiate them. Similar to the use of NDVI within the herbaceous class, changes in NDVI between dates for forested areas may be helpful in estimating canopy cover. NDVI decreases most notably in relation to senescence of herbaceous vegetation. For coniferous areas of similar topographic position, aspect, and life form, smaller decreases may occur in locations where the shading effects of greater canopy coverage, and to a certain extent more acidic soils, reduce the amount of herbaceous ground cover. The same may hold true for many broadleaf and shrub communities. However, exceptions exist to these and most other situations. Sagebrush actively grows and flowers in the fall, potentially resulting in higher NDVI values during that time. Larches (genus Larix) are one of the few conifers that have deciduous leaves.
These 2 exceptions to the norm can be factored into a classification scheme, but are also examples showing that ecological processes must always be considered during the classification process. Aspect and elevation are obvious ancillary inputs for steps 5 and 6, but hydrological data may be useful for defining specific classes or situations. Depending on the scale for calculation of topographic position, “valleys” containing streams or rivers may define riparian corridors and the specific riparian vegetation that occurs along them. In certain ecological sections, highly productive herbaceous areas in valleys within close proximity to water are likely to contain sedges, shrubs will often be willows, and broadleaf trees will be cottonwoods. The use of feature extraction software such as the ENVI Feature Extraction Module or Feature Analyst by VLS Software offers a new approach to the remote sensing field when used in conjunction with high resolution imagery (e.g. Walker and Blaschke 2008, San Souci and Doyle 2006). These programs essentially attempt to provide the machine-learned equivalent of hand digitizing visible homogeneous polygons. Tree and shrub canopies can be defined in high resolution imagery for use alone or as data inputs to Landsat imagery for conventional classification. Feature extraction is included in Figure 11 for both purposes.

Final Comment on Classification Methods

The topic of scale has come up often throughout this chapter: scale of vegetation patterns, scale of remote sensing data, and scale of selection by animals. It is obvious that many of the ecological processes, classification variables, and methods discussed in the above classification are also scale dependent, yet the influence of using an arbitrary uniform sampling grid to classify scale-dependent variables was not addressed. Marceau et al.
(1994) came to the conclusion that remote sensing data are not independent of the sampling grid used for their acquisition, that neglecting the scale and aggregation level can produce haphazard results with little correspondence to geographical entities of the scene, and that there is not a unique spatial resolution appropriate for all situations. Marceau and Hay (1999) provide an excellent review of the issue in an article that should be read by anyone involved with remote sensing or remote sensing derived land cover classification. The advent and use of higher resolution imagery has provided the ability to better understand these relationships and correct for them in many situations. Niijland et al. (2009) found they could improve classification accuracy for the variable of interest within their study area by resampling the original 5 m imagery to 7 m. Ideally, the perfect land cover classification system would have the ability to adjust the spatial resolution of data to match the process or variable of interest on the landscape. Similar to my previous comments, conservation and biodiversity cannot wait for advances in technology and we must strive to make improvements with the tools we have.

Recommendations for Conservation Applications

Success in conservation planning relies on acceptance and implementation of any work that is done. Since land cover data often form the foundation of conservation planning, it is critical that limitations in accuracy and precision of the data are understood prior to their use, as well as how these limitations impact results. The following recommendations are intended to increase the credibility of any results utilizing digital land cover data.

1) Land cover data must match the question of interest; habitat modeling or any other work should be done within the limitations of land cover data being used. Delineating habitat for a grassland species requires high accuracy of grassland classes.
Most land cover data work better for wide-ranging habitat generalists than for habitat obligates, and for lower selection orders.

2) Even if an accuracy assessment is available for the data used, accuracy can vary across the classification area, so it is a good idea to conduct an accuracy assessment for the specific project area using field data or photointerpretation. Even the best model will not produce accurate results if there are errors in the input data.

3) No land cover classification is completely accurate, and a brief discussion of the confusion between classes (from an accuracy assessment) and of the limitations in the data may help explain discrepancies between habitat predictions and observations.

4) FGDC standards for collection of vegetation data in the field should be followed for accuracy assessments or collection of data for new classifications.

Figure 1. Comparison of 1km AVHRR imagery, 30m Landsat imagery, 4m IKONOS imagery, and a color orthophoto for a 1km area in Yellowstone National Park indicating what each data type “sees”. The AVHRR data is a single color as this is the extent of a single pixel for this type of data.
Figure 2. Typical reflectance curves for vegetation, soil, and water (from Lillesand and Kiefer 1994).
Figure 3. Generalized relationship between precision and accuracy of land cover data.
Figure 4. Relationship between grain and extent (from Wiens 1989).
Figure 5a. Relationship between precision and the pixel size or scale of data.
Figure 5b. Relationship between precision and area extent.
Figure 6a. Relationship between precision and habitat specificity.
Figure 6b. Relationship between precision and intraspecific order of selection and interspecific scale of selection.
Figure 7. Boundary between GAP land cover at the MT-WY border in Yellowstone National Park.
Figure 8. Close-up of red polygon in Figure 7.
Figure 9. Hierarchical classification steps for vegetated polygons using VRI classification.
Figure 10.
Original orthophoto (top), MT GAP classification of the area (middle), and hybrid classification using the orthophotos with MT GAP classes (bottom).
Figure 11. Flowchart of conceptual classification.

Table 1. Global land cover classifications.
Table 2. Comparison between land cover classes for 1992 and 2001 NLCD data.
Table 3. Producer’s accuracy (user’s accuracy) of NLCD land cover data for region 8.
Table 4. Proposed land cover classes (see attached spreadsheet).
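
The project-level accuracy assessment suggested in recommendation 2, and the producer's and user's accuracy reported in Table 3, can be illustrated with a minimal sketch. It assumes each field or photointerpreted plot has one reference class and one mapped class. The function name and the sample labels are hypothetical and are not taken from the NLCD or GAP assessments discussed in this chapter.

```python
from collections import Counter


def accuracy_assessment(reference, mapped):
    """Return overall, producer's, and user's accuracy from paired class labels.

    reference: class observed in the field or by photointerpretation, per plot
    mapped:    class assigned by the land cover data at the same plot
    """
    if len(reference) != len(mapped) or not reference:
        raise ValueError("reference and mapped must be non-empty and equal length")

    # Confusion matrix stored as counts of (reference class, mapped class) pairs.
    pairs = Counter(zip(reference, mapped))
    classes = sorted(set(reference) | set(mapped))

    correct = sum(pairs[(c, c)] for c in classes)
    overall = correct / len(reference)

    producers, users = {}, {}
    for c in classes:
        ref_total = sum(pairs[(c, m)] for m in classes)  # plots truly in class c
        map_total = sum(pairs[(r, c)] for r in classes)  # plots mapped as class c
        # Producer's accuracy: share of reference plots mapped correctly
        # (1 - omission error). None if the class has no reference plots.
        producers[c] = pairs[(c, c)] / ref_total if ref_total else None
        # User's accuracy: share of mapped plots that are correct
        # (1 - commission error). None if the class was never mapped.
        users[c] = pairs[(c, c)] / map_total if map_total else None
    return overall, producers, users


# Hypothetical plots: half of the grassland plots are mapped as shrubland.
reference = ["grassland"] * 4 + ["shrubland"] * 3 + ["forest"] * 3
mapped = ["grassland", "grassland", "shrubland", "shrubland",
          "shrubland", "shrubland", "shrubland",
          "forest", "forest", "shrubland"]

overall, producers, users = accuracy_assessment(reference, mapped)
for c in sorted(producers):
    print(c, producers[c], users[c])
```

In this made-up sample, overall accuracy looks acceptable while only half of the grassland plots are mapped correctly, which is the situation recommendation 1 warns about when delineating habitat for a grassland species.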