ACCURACY ASSESSMENT OF A VEGETATION MAP OF
NORTHEASTERN CALIFORNIA USING PERMANENT
PLOTS AND FUZZY SETS
1998
AUTHORS
Jeff Milliken
Remote Sensing/GIS Specialist, USDI Bureau of Reclamation
Sacramento, California
Debby Beardsley
Research Forester, PNW Research, US Forest Service
Portland, Oregon
Samantha Gill
Assistant Professor, California Polytechnic University
San Luis Obispo, California
ABSTRACT
The accuracy of a northeastern California vegetation map was assessed
using the data from a grid of permanent Forest Inventory and Analysis (FIA)
plots collected by Region 5 and the Pacific Northwest Research Station (PNW).
The map was assessed in three parts: the Modoc National Forest, the Lassen
National Forest and the lands outside National Forest boundaries. Accuracy was
assessed hierarchically, resulting in separate assessments for vegetation growth
form (lifeform) and species association (CALVEG) within lifeform. A fuzzy
logic approach was employed. Fuzzy sets allowed for the recognition that plots
did not always fit unambiguously into a single map class. For each plot, all
possible map classes were given a rating between absolutely wrong (1) and
absolutely right (5). The accuracy of the map for lifeform was high. On average
82% of the sites had the best possible label, and 89% of the sites had labels that
would be considered ‘right’. For many of the map classes, the accuracy was
greater than 75%. The grid plot design undersampled some of the mapped
classes but was a cost effective way to generate an accuracy assessment based
on a probability sample.
www.fs.fed.us/r5/rsl/publications/
INTRODUCTION
The Lassen-Modoc project was a USDA Forest Service Region 5 and
California Department of Forestry and Fire Protection cooperative vegetation
mapping program covering 9 million acres of the northeastern portion of
California (fig. 1). Vegetation maps were produced using remote sensing image
processing and GIS modeling techniques (Miller et al., 1994). Image data for this
project were 1991 Landsat Thematic Mapper imagery. The classification was completed
in the fall of 1995. For each polygon (minimum mapping unit of 1 hectare), a
lifeform type and CALVEG (Classification and Assessment with Landsat of
Visible Ecological Groupings) (USDA, 1981) type were mapped. In addition,
size and density were mapped for forest CALVEG types. The map was assessed
in three parts: the Modoc National Forest, the Lassen National Forest and the
lands outside National Forest boundaries. The purpose of this paper is to report
the accuracy assessment results for lifeform and CALVEG classes. Readers
may contact Ralph Warbington at the Region 5, Remote Sensing Laboratory in
Sacramento, California, for the detailed accuracy assessment of all the mapped
classes.
Figure 1. Lassen-Modoc Project Area.
The most common approach to collecting ground truth information for the
purpose of assessing the accuracy of a map is to visit a site in the field
corresponding to a polygon on the map and to classify it based on the
classification scheme used in the mapping project. This method is often very
time consuming and expensive due to the number of samples needed to perform
a valid accuracy assessment and the field costs associated with visiting each site.
Therefore, many maps have no accuracy assessment information. The approach
in this study was to use permanent USFS Forest Inventory and Analysis (FIA)
(USDA, 1992; USDA, 1995) field plots as ground truth sites. These plots were
already established. Thus, the costs associated with collecting accuracy
assessment data were significantly reduced.
A modified fuzzy logic accuracy assessment approach based on Gopal and
Woodcock (1994) was used in this project. The concept of a fuzzy set was
introduced by Zadeh (1965, 1973) to describe imprecision that is characteristic
of much of human reasoning. With fuzzy sets, there are different grades of
membership within a class. In the case of a vegetation map, one label may be
absolutely correct, but other labels may be considered good or acceptable. For
example, for a given site (in this case an inventory plot within a map polygon) a
map label of red fir may be considered absolutely correct, but a map label of
subalpine conifer might still be considered acceptable. Using the traditional
error matrix, only one possible answer (considered to be the best answer by an
'expert' in the field) is compared to the map label. Fuzzy set theory allows the
user and producer to look at ranges of acceptable answers.
METHODS
Accuracy site data collection
The accuracy sites were permanent USFS
FIA field plots installed on a 3.4 mile grid across California. These plots had
been measured independently from the vegetation mapping project in order to
provide current estimates of forest land area, timber volume, net annual growth
and mortality and harvest in California. Plot installation on the Modoc and
Lassen National Forests was administered by the USFS Region 5 inventory staff
between 1993 and 1994. Plots on lands outside National Forest boundaries were
measured by the Pacific Northwest Research station of the USFS (PNW) in
1992. The inventory grid provided 312 accuracy sites on the Modoc National
Forest, 291 sites on the Lassen National Forest and 701 sites on lands outside
National Forests.
Each National Forest plot was a cluster of 5 points spanning 2.5 acres. At
each point, species, diameter and height were recorded for live and dead trees.
Percent cover of all understory species was also recorded. In addition, at each
point, the inventory crew assigned a best and second best lifeform class and
CALVEG type. At several points, the inventory crew assigned only a best
lifeform or CALVEG type because, in their opinion, there was only one correct
answer (USDA, 1995). The crew had no knowledge of the map labels when
making these evaluations.
The plots installed by PNW were a cluster of 5 points over 6 acres (USDA,
1992). For all 701 sites, crews classified each point of each plot as conifer,
hardwood, rangeland (a combination of shrub and/or herb), or non-vegetated.
The lifeform classification of 70% of these sites was done with photo
interpretation. For 30% of the sites the classification was made on the ground.
All 701 sites were used for the lifeform assessment of lands outside National
Forests. For the 215 sites visited on the ground, crews collected species,
diameter and height on live and dead trees and percent cover on shrub, grass and
herb species. The sites visited on the ground were used for the CALVEG
assessment of lands outside National Forests. CALVEG was not a
field-collected item for the PNW crews. Therefore, a CALVEG type was assigned to
each point by using summaries of the field-collected data (percent cover by
species by vegetation layer), the field plot descriptions, and a CALVEG key.
Assigning fuzzy ratings for each possible map label
The accuracy assessment was based on comparing the map label of each
sample site with evaluations based on ground data. For each site, a rating was
given for all possible lifeform and CALVEG labels of the map without
knowledge of the actual map label at the site. The rating scheme used in this
study was:
5: absolutely right. If this were the map label it would be a perfect match.
4: good. Would be happy to find this label on the map.
3: acceptable. Maybe not the best possible map label but it is acceptable.
2: understandable but wrong. Not an acceptable map label. There is
something about the site that makes the label understandable but there is
clearly a better one.
1: absolutely wrong. The label is absolutely unacceptable.
Ratings for each possible label were derived from the inventory crew’s
evaluations as well as knowledge of vegetation gradients within lifeform and
CALVEG classes. The following procedure was used to assign a ‘fuzzy’ rating
to each possible lifeform and CALVEG class for each site. First, a score of two
was given to a class the field crew considered the best label for the point and a
score of one was given to a class the crew considered second best. These scores
were summed over all the points in the cluster plot and divided by the maximum
possible score (2 * the number of points in the cluster) to obtain a normalized
score. Fuzzy ratings were then assigned to the normalized scores as follows:
normalized score    fuzzy rating
>0.9                5 (absolutely right)
0.6-0.9             4
0.4-0.6             3
0.2-0.4             2
<0.2                1 (absolutely wrong)
Using this approach, the class which was assigned the best rating at each
point by the field crew was assured to be given a fuzzy rating of 5 (absolutely
right) and one that was not assigned to any point in a plot was given a fuzzy
rating of 1 (absolutely wrong). Secondly, because the crews only indicated best
and second best lifeform and CALVEG classes, the ratings of some of the
classes were increased based on expert knowledge of which possible map labels
would be acceptable given the map label that received the highest score.
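The scoring procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names, the input layout (one (best, second best) pair per cluster point), and the example plot are assumptions. The published ranges share their end points, so the sketch also assumes that a score falling exactly on 0.6, 0.4 or 0.2 receives the higher rating. The final adjustment based on expert knowledge is not modeled.

```python
from collections import Counter
from fractions import Fraction


def normalized_scores(points):
    """Return the normalized score of every label named at a plot.

    points: one (best, second_best) pair per cluster point;
    second_best may be None when the crew gave only a best label.
    """
    totals = Counter()
    for best, second in points:
        totals[best] += 2            # best label at a point scores two
        if second is not None:
            totals[second] += 1      # second-best label scores one
    max_score = 2 * len(points)      # maximum possible score for the plot
    return {label: Fraction(score, max_score) for label, score in totals.items()}


def fuzzy_rating(score):
    """Map a normalized score to a fuzzy rating from 1 to 5.

    Assumption: boundary values shared by two ranges go to the higher rating.
    """
    if score > Fraction(9, 10):
        return 5
    if score >= Fraction(6, 10):
        return 4
    if score >= Fraction(4, 10):
        return 3
    if score >= Fraction(2, 10):
        return 2
    return 1


def rate_plot(points, classes):
    """Rate every possible map class; classes never named by the crew score zero."""
    scores = normalized_scores(points)
    return {c: fuzzy_rating(scores.get(c, Fraction(0))) for c in classes}


# Hypothetical 5-point plot: (best, second best) at each point.
plot = [
    ("conifer", "shrub"),
    ("conifer", "shrub"),
    ("conifer", None),
    ("shrub", "conifer"),
    ("conifer", "shrub"),
]
ratings = rate_plot(plot, ["conifer", "shrub", "herbaceous"])
print(ratings)  # {'conifer': 4, 'shrub': 3, 'herbaceous': 1}
```

In this hypothetical plot, conifer scores 9 of a possible 10, which falls just short of the >0.9 cutoff and so rates 4 rather than 5.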
RESULTS AND DISCUSSION
Lifeform Accuracy
Lifeform was the first level of the accuracy assessment to be evaluated
(table 1). The fuzzy logic approach provided two measures of accuracy: the
MAX operator and the RIGHT operator. The MAX operator was the more
conservative measure of accuracy. This operator measured how frequently the
map label was the best choice for the site. The RIGHT operator accepted
matches using any degree of right, which in this assessment was any score greater
than or equal to 3. In other words, the RIGHT operator measured how
frequently the map label was an acceptable choice for the site. Using the MAX
operator, the overall lifeform accuracy of the map was between 77% and 88%.
Using the RIGHT operator, the overall lifeform accuracy of the map increased
to between 84% and 96%. The accuracy of the map for lifeform was also
weighted by the area of each class in the map. Of the classes with an adequate
sample, the least accurate was the shrub class. The matrices below (tables 2-4)
show between which classes confusion occurred. For some classes, there are
more errors than sites because, at some sites, more than one class had a higher
rating than the map label.
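The two accuracy measures can be illustrated with a short Python sketch. It assumes each site is stored as a dictionary of fuzzy ratings (1 to 5) per class plus the map label at that site; the function names and the example data are hypothetical. Ties for the highest rating are counted as a MAX match, following the usual definition in Gopal and Woodcock (1994), and RIGHT uses a rating of 3 (acceptable) or better. The paper does not detail the area weighting; the weighted mean shown here is one plausible reading.

```python
RIGHT_THRESHOLD = 3  # ratings of 3 (acceptable), 4 and 5 count as 'right'


def max_match(ratings, map_label):
    """MAX: the map label has the highest rating at the site (ties count)."""
    return ratings[map_label] == max(ratings.values())


def right_match(ratings, map_label, threshold=RIGHT_THRESHOLD):
    """RIGHT: the map label has at least an acceptable rating at the site."""
    return ratings[map_label] >= threshold


def accuracy(sites, operator):
    """Fraction of sites at which the operator accepts the map label."""
    hits = sum(1 for ratings, map_label in sites if operator(ratings, map_label))
    return hits / len(sites)


def area_weighted(class_accuracy, class_area):
    """Assumed weighting: mean of class accuracies weighted by mapped area."""
    total_area = sum(class_area[c] for c in class_accuracy)
    return sum(class_accuracy[c] * class_area[c] for c in class_accuracy) / total_area


# Hypothetical sites: (fuzzy ratings per class, map label at the site).
sites = [
    ({"conifer": 5, "shrub": 3, "herbaceous": 1}, "conifer"),  # MAX and RIGHT
    ({"conifer": 4, "shrub": 3, "herbaceous": 1}, "shrub"),    # RIGHT only
    ({"conifer": 1, "shrub": 2, "herbaceous": 5}, "shrub"),    # neither
    ({"conifer": 5, "shrub": 1, "herbaceous": 1}, "conifer"),  # MAX and RIGHT
]
print(accuracy(sites, max_match))    # 0.5
print(accuracy(sites, right_match))  # 0.75
```

RIGHT can never be lower than MAX when every site has at least one class rated 3 or higher, which is why the RIGHT figures reported here are always the larger of the two.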
These matrices identified the number of times classes received a rating
greater than the map label. Columns show errors of omission and rows show
errors of commission. An error of omission means an area of a ‘known’ class
has been omitted from the map. An error of commission means a particular
mapped class includes areas that are better labeled as other classes. In the shrub
class on the Modoc National Forest, there were many more errors of
commission than omission meaning the shrub class was probably overmapped.
Because most of these errors of commission occurred with the conifer class, the
conifer class was probably undermapped and misidentified as shrub. The
classification system required that areas with 10% conifer cover be mapped as
conifer. However, there were typically areas of sparse conifer cover that had
extensive shrub understories and spectrally ‘looked’ more like a shrub lifeform.
The increase in accuracy in the shrub class using the RIGHT operator indicated
that the confusion between the conifer and shrub classes was due to these sparse
conifer stands, as the ‘fuzzy’ ratings accounted for them.
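The way such a matrix is tallied can be sketched in Python. The data layout, function names and example sites are assumptions for illustration only. Each site adds one count for every class rated higher than its map label, with rows indexed by the map label (commission) and columns by the better-rated class (omission), so a single site can contribute more than one error.

```python
def mismatch_matrix(sites):
    """Count, per map label (row), the classes rated higher than it (column)."""
    matrix = {}
    for ratings, map_label in sites:
        row = matrix.setdefault(map_label, {})
        for label, rating in ratings.items():
            if rating > ratings[map_label]:
                row[label] = row.get(label, 0) + 1
    return matrix


def commission(matrix, label):
    """Row total: times another class was a better label where `label` was mapped."""
    return sum(matrix.get(label, {}).values())


def omission(matrix, label):
    """Column total: times `label` was a better choice than what was mapped."""
    return sum(row.get(label, 0) for row in matrix.values())


# Hypothetical sites: (fuzzy ratings per class, map label at the site).
sites = [
    ({"conifer": 5, "shrub": 3, "herbaceous": 1}, "shrub"),
    ({"conifer": 4, "shrub": 2, "herbaceous": 3}, "shrub"),    # two better classes
    ({"conifer": 5, "shrub": 2, "herbaceous": 1}, "conifer"),  # no error
    ({"conifer": 2, "shrub": 5, "herbaceous": 1}, "conifer"),
]
m = mismatch_matrix(sites)
print(commission(m, "shrub"), omission(m, "shrub"))      # 3 1
print(commission(m, "conifer"), omission(m, "conifer"))  # 1 2
```

In this made-up example the shrub row holds three errors from only two shrub-labeled sites, and shrub has more commission than omission, the pattern read in the text as overmapping.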
The majority of the lifeform confusion on the Lassen National Forest
portion of the map was between the conifer and the shrub classes. The matrix
below (table 3) would suggest that the conifer class was somewhat overmapped
rather than undermapped as on the Modoc National Forest part of the map.
However, because most of the errors of commission were with the shrub class
and most of the errors of omission were also with the shrub class, it was difficult
to predict a trend in the error between conifer and shrub. On this portion of the
map, the shrub class was the least accurate (table 1) and the error was with all
other lifeforms (table 3). Confusion between shrub, herbaceous, and
non-vegetated classes was probably due to spectral similarity between desert-type
shrub communities and dry grass or barren ground.
The overall lifeform accuracy of the map was somewhat less for areas
outside National Forests. The combined shrub/herb class was the least accurate,
probably because it was an aggregated class
consisting of chaparral, herbaceous and shrub types. This class was overmapped
and the conifer and hardwood types were somewhat undermapped. A review of
the site data for this portion of the map suggested that the confusion between the
conifer and shrub/herb classes occurred within western juniper stands and the
confusion between the hardwood and shrub/herb classes primarily occurred
within blue oak stands. This confusion is understandable considering that both
western juniper and blue oak communities often have widely spaced trees with
shrub/herb understories.
CALVEG accuracy
There was an adequate sample to assess the accuracy of the conifer
CALVEG classes on all three portions of the map, and the shrub CALVEG
classes on the Modoc National Forest.
Conifer CALVEG accuracy
Overall accuracy of the conifer CALVEG map labels was greater than 75%
using the RIGHT operator. On the Modoc National Forest portion of the map,
three classes that comprised 76% of the conifer area (white fir, western juniper,
and eastside pine) were highly accurate. The most troublesome class on the
Modoc National Forest was the mixed-conifer fir class, where the map label was
the best choice for the site only 3% of the time. There was also low accuracy
in the red fir class. In the mixed-conifer fir class most of the confusion was with
eastside pine and white fir (table 6). Mixed-conifer fir accuracy increased
significantly when using the RIGHT operator. In the Warner Mountains,
mixed-conifer fir is a ‘transitional’ type in elevation between eastside pine and white
fir. Thus, mixed-conifer fir would be considered an acceptable class for some
eastside pine and white fir sites although not the best answer. The confusion
matrix (table 6) showed that most of the errors in the mixed-conifer fir class were
errors of commission, indicating that too much mixed-conifer fir was mapped.
The majority of confusion in the red fir class was with lodgepole pine and
whitebark pine. The error is understandable given that red fir is a major
associate of lodgepole pine in the Medicine Lakes area and a major associate of
whitebark pine in alpine areas.
On the Lassen National Forest portion of the map, accuracy was low for the
mixed-conifer types when using the MAX operator. However, accuracy
increased dramatically using the RIGHT operator. This increase was seen on
the Modoc National Forest portion of the map, as well, and is probably
indicative of mixed classes in general. The greatest amount of confusion in the
mixed-conifer fir class was with the white fir class (table 7). Users of the map
can expect to find areas of the map labeled as mixed-conifer fir that are actually
better labeled as white fir. Similarly, the mixed-conifer pine appeared to be
overmapped as there were more errors of commission than omission in this
class. Users can expect to find areas of the map labeled mixed-conifer pine
that would be better labeled eastside pine or ponderosa pine.
Using the MAX operator, the accuracy of conifer CALVEG map labels for
areas outside National Forests was low (35%) but increased to 78% using the
RIGHT operator (table 1). As only four of the twelve mapped CALVEG classes
were adequately sampled for accuracy assessment, the overall conifer CALVEG
accuracy figures for this portion of the map could be misleading. The
mixed-conifer pine class may be overmapped and, on a number of sites, better labeled as
ponderosa pine (table 8). The ponderosa pine class showed confusion with a
number of the CALVEG classes (table 8).
Shrub CALVEG accuracy
The overall accuracy for the shrub classes on the Modoc National Forest
using the MAX operator was 45%, but increased to 91% using the RIGHT
operator (table 9). Most of the confusion was between basin sagebrush, low
sagebrush and bitterbrush (table 10), the three classes with a sufficient sample
for accuracy assessment. The magnitude of error between these classes was not
large, which is why the accuracy improved to 91% using the RIGHT operator.
This is not surprising considering that the CALVEG description for basin
sagebrush lists low sagebrush as a likely associate, and bitterbrush is associated
with both the basin sagebrush and the low sagebrush classes. As seen in table
10, users of these maps may expect basin sagebrush to be ‘overmapped’ in areas
of low sagebrush. In addition, bitterbrush and low sagebrush are likely to be
overmapped in areas of basin sagebrush.
CONCLUSIONS
Overall accuracy at the lifeform level was high for all areas assessed. Using
the MAX operator, lifeform accuracy ranged from 77% for areas outside
National Forests to over 85% for both the Modoc and Lassen National Forests.
When the RIGHT operator was used, the accuracy increased to 84% for areas
outside National Forests and to greater than 90% for areas within the two
National Forests. Accuracy for the CALVEG types that were adequately
sampled was not as high as for lifeform, but was generally greater than 75%
using the RIGHT operator. Because lifeform accuracy was high, any
aggregation of CALVEG types to more general categories (e.g., Wildlife Habitat
Relationship types or Society of American Foresters types) would typically result
in greater accuracy. A number of classes in the map were undersampled and
results for these classes should be used with caution. However, as a function of
the FIA grid inventory design, classes that cover most of the map were
adequately sampled. Additional accuracy assessment sites are recommended for
undersampled classes.
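A rough sense of why small classes were undersampled follows from the grid spacing alone. The sketch below assumes each plot represents one 3.4 by 3.4 mile grid cell and uses 640 acres per square mile; the 50,000 acre class is hypothetical, and actual plot counts depend on how the grid happens to fall across each class.

```python
ACRES_PER_SQUARE_MILE = 640
GRID_SPACING_MILES = 3.4  # FIA grid spacing reported in the Methods

# Area represented by one plot if every grid cell holds exactly one plot.
acres_per_plot = GRID_SPACING_MILES ** 2 * ACRES_PER_SQUARE_MILE


def expected_plots(class_area_acres):
    """Approximate number of grid plots expected to fall in a mapped class."""
    return class_area_acres / acres_per_plot


print(round(acres_per_plot))               # 7398
print(round(expected_plots(50_000), 1))    # 6.8
```

Under these assumptions a class needs to cover several hundred thousand acres before the grid alone supplies a few dozen accuracy sites, which is consistent with the recommendation to add sites for the rarer classes.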
Using inventory data to assess the accuracy of a vegetation map is a unique
and promising approach. Because assessing the accuracy of such a large area is
typically very expensive (the value of the data set in this assessment is over
$690,000), using inventory data can provide a cost effective way of assessing
the accuracy of vegetation maps. The initial set of data could be supplemented
by techniques, such as post-stratification, cluster sampling, double sampling, or
regression estimators similar to those suggested by Stehman (1996). In this
way, information needed to adequately assess the accuracy of vegetation maps
can be incorporated into standard forest inventory designs.
LITERATURE CITED
Gopal, S. and C.E. Woodcock. (1994). Theory and Methods for Accuracy
Assessment of Thematic Maps Using Fuzzy Sets, Photogrammetric Engineering
and Remote Sensing, 60(2): 181-188.
Miller, S., H. Eng, M. Byrne, J. Milliken, and M. Rosenberg. (1994). Northeastern
California Vegetation Mapping: A Joint Agency Effort, Remote Sensing and
Ecosystem Management: Proceedings of the Fifth Forest Service Remote
Sensing Applications Conference, April 11-15, 1994, pp. 115-125. ASPRS,
Bethesda, Maryland.
Stehman, S.V. (1996). Cost-effective, Practical Sampling Strategies for Accuracy
Assessment of Large-Area Thematic Maps, Spatial Accuracy Assessment in
Natural Resources and Environmental Sciences: Second International
Symposium, U.S.D.A. Forest Service, Rocky Mountain Forest and Range
Experiment Station, Fort Collins, CO, General Technical Report RM-GTR-277,
pp. 485-492.
USDA, U.S. Forest Service - Regional Ecology Group. (1981). CALVEG: A
Classification of California Vegetation, San Francisco, CA. 168p.
USDA, U.S. Forest Service - PNW Research. (1992). Field Manual for
California Inventory, Portland, OR.
USDA, U.S. Forest Service - Region 5. (1995). Forest Inventory and Analysis
User's Guide, San Francisco, CA.
Zadeh, L. (1965). Fuzzy Sets, Information and Control, 8:338-353.
Zadeh, L. (1973). Outline of a New Approach to the Analysis of Complex
Systems and Decision Processes, IEEE Transactions on Systems, Man, and
Cybernetics, SMC-3:28-44.
ACKNOWLEDGMENTS
Thanks to the California Department of Forestry and Fire Protection for
joint funding of this effort.