This file was created by scanning the printed publication.
Errors identified by the software have been corrected;
however, some errors may remain.
Expanding Applications, Data, and Models in
a Forest Inventory of Northern Utah, USA 1
Gretchen G. Moisen 2
Thomas C. Edwards, Jr. 3
Tracey S. Frescino 4
Abstract: Forest inventories, like those conducted by the USDA
Forest Service, Forest Inventory and Analysis (FIA) Program in the
Interior West, U.S.A., are under increased pressure to provide
better information and reduce costs. Here, we describe our ongoing
efforts in the Interior West of the United States to expand
traditional forest inventory strategies to accommodate a wider
variety of user-defined products, auxiliary data inputs, and statistical
models. To the traditional product line of estimates of population
totals, we add spatial depictions of the forest resources as well
as exploratory data analyses. To the existing forest inventory
ground and photo plots we add spatially explicit digital data sets
including elevation, aspect, slope, geology, precipitation, AVHRR-
and TM-based vegetation cover types, as well as UTM coordinates.
To the current model of double sampling for stratification we add
generalized linear and additive models, classification and regression trees, and artificial neural networks. These expansions are
illustrated through a synopsis of ongoing research studies in the
northern Utah mountains.
Over 60 years ago, the United States recognized the need
for information on the supply and condition of the Nation's
timber resources and established a national forest inventory
program under the McSweeney-McNary Act of 1928. That
Act was expanded by the Forest and Rangeland Renewable
Research Act of 1978 to include all forest resources. It is
under this Act that the Forest Service Forest Inventory and
Analysis Program (FIA) seeks to maintain a comprehensive
inventory of the status, trends, use and health of the country's
diverse forest ecosystems. Networks of remotely sensed and
field plot locations have been established on nearly all
forested lands throughout the Interior West by the Interior
West Resources, Monitoring, and Evaluation (IWRIME)
Program. Using a double sampling for stratification strategy, estimates of areal extent and structural attributes of
these forested lands are reported at regional scales approximately
every 10 years. More recently, emphasis has been
placed on integrating forest inventory data with satellite-based
information to improve precision of these estimates,
as well as to produce maps of forest resources, explore
ecological relationships, and monitor change through time.
This recent emphasis is in response to demands by natural
resource managers and scientists to know not only how
much and what type of vegetation exists over an extensive
area, but where it is located, and how it is changing through
ecological processes, management activities, or catastrophic
events.
1 Paper presented at the North American Science Symposium: Toward a
Unified Framework for Inventorying and Monitoring Forest Ecosystem
Resources, Guadalajara, Mexico, November 1-6, 1998.
2 Gretchen G. Moisen is Research Forester, USDA Forest Service, Rocky
Mountain Research Station, Ogden, UT 84401 U.S.A. e-mail:
gmoisen/rmrs_ogdenfsl@fs.fed.us
3 Thomas C. Edwards, Jr. is Research Ecologist and Associate Professor,
USGS Biological Resources Division, Utah Cooperative Fish and Wildlife
Research Unit, Department of Fisheries and Wildlife, Utah State University,
Logan, UT. e-mail: tce@nr.usu.edu
4 Tracey S. Frescino is Forester, USDA Forest Service, Rocky Mountain
Research Station, Ogden, UT 84401 U.S.A. e-mail:
tfrescino/rmrs_ogdenfsl@fs.fed.us
Development of an approach to meet these multiple objectives is hindered by a number of challenges that can be
visualized along three conceptual axes: data inputs, statistical models, and applications. Each of these axes poses
unique challenges. Challenges arise when model inputs
come from diverse data sources. For example, field data may
be collected under a wide variety of sample designs having
inconsistent sample plot shapes and sizes and unknown
positional error. Available digital data may exist at vastly
different scales than that collected on the ground, and may
include unknown sources of error. Also, definitions of variables may vary among cooperating agencies. Statistical
modelling challenges are also numerous and varied. Both
response and predictor variables in forest structure models
may be continuous or discrete. The relationship between
response and predictors may be nonlinear or not easily
described by a parametric relationship. Sample points may
be spatially and temporally correlated, and often we need to
model a multivariate response. In addition, inventory estimates of population totals must be unbiased. Finally, application challenges include needs for computational efficiency,
developing methodology that is suitable to a production
environment, and, perhaps most importantly, delivering
products along with validation information that are relevant to specific user needs.
Our research efforts have been conducted in the Northern
Utah Mountain Ecoregion (Omernik 1987) (Fig. 1). Field
data in this ecoregion were collected from 1991-1994 by
IWRIME. These 1 acre field plots were established on a
2.5 km offset grid for National Forest lands and on a 5 km
grid for other ownerships, giving approximately 1500 plots for modelling in
our study region. At each sample plot, forest variables such
as site attributes, vegetation structure, and individual tree
characteristics were measured. Details of the sample design, initial inventory estimates and analyses are reported
in O'Brien (1996). The problem we face is how best to link
this extant forest inventory data with a variety of satellite-based information to meet multiple inventory objectives in
light of the challenges along each of the data input, model,
and application axes. Here we describe our ongoing efforts in
northern Utah to expand traditional forest inventory
strategies to accommodate a wider variety of user-defined
products, auxiliary data inputs, and statistical models.
USDA Forest Service Proceedings RMRS-P-12. 1999
Figure 1. Northern Utah Mountain Ecoregion study
area for numerous research activities.
Conceptual Framework
Applications
One of our most difficult challenges may be to develop
products that are relevant to specific user needs. While
traditional estimates quantify the forest resources regionally, there are a wide variety of user groups in the U.S. and
abroad that desire more diverse information from regional
forest inventories. These information needs are more frequently driven by smaller spatial events than those addressed by regional analyses. Questions posed by these
groups include: how much of a particular vegetation class or
structure exists over a given area? How is that class and
structure distributed in space? What ecological processes
drive that distribution? And, how do we expect it to change?
For example, land managers often desire estimates of area
by forest type within small areas like a ranger district, or
perhaps estimates of timber volume within the digitized
boundaries of a beetle kill area. While these small area
estimates are essential to improve management, the most
valuable management tool is a map depicting the spatial
arrangement of forest attributes. Vegetation cover-type maps
produced by programs like the USDI Gap Analysis program
(see Homer and others 1997; Scott and others 1993) have
been useful to some extent. However, most, if not all, of these
maps lack any spatial depiction of vegetation structure,
reducing their utility for applications like the identification
of suitable wildlife habitat (Edwards and others 1996).
There is also a recent emphasis on increasing analytical
capabilities by using extant inventory data to explore ecological relationships in forested systems. Such exploration
might help address management needs like predicting
stand growth response to management activities on less-studied
resources like woodlands, or perhaps to reduce labor-intensive
sampling activities by modelling expensive field
variables as functions of site characteristics. In addition, the
question of how the resource is changing through time has
become more pressing with recent Federal legislation requiring annual estimates of forest population totals.
Auxiliary Data Inputs
A wide variety of digital data sets are available in our
study region. Data layers used in our studies include elevation, aspect, slope, geology, precipitation, geographic coordinates, as well as raw spectral values, indices, and vegetation
cover-types based on satellite data from both the advanced
very high resolution radiometer (AVHRR) and Thematic
Mapper (TM) platforms. Elevation, aspect and slope were
obtained from 90 m digital elevation models produced by the
Defense Mapping Agency (DMA). Geology data were obtained from a digitized coverage of a 1:500,000 stable base
mylar of the geology of Utah (Hintze 1980). Precipitation
data came from a downscaling (Zimmermann, unpublished
data) of coarse-scale Prism climate maps (Daly and others
1994). AVHRR-based data included the normalized difference vegetation index (NDVI), derived from the visible and
near infrared spectral values (Loveland and others 1991)
and a vegetation class from the 1993 Resources Planning Act
forest type group map (Powell and others 1993) identifying
11 vegetation classes in the Northern Utah Mountain
Ecoregion. Finally, TM-based data included red, near-infrared, and mid-infrared spectral values as well as a vegetation
class based on the 1 ha cover-map produced by the Utah Gap
Analysis Project (Edwards and others 1995, Homer and
others 1997).
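The NDVI mentioned above is conventionally computed as the difference between the near-infrared and visible (red) values divided by their sum. The following is a minimal sketch of that standard formula only; the function name and the pixel values are hypothetical, and it does not reproduce the processing used to build the AVHRR product cited here.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for a single pixel.

    red, nir: non-negative values for the visible red and near-infrared
    bands. Returns (nir - red) / (nir + red), or None when both bands
    are zero and the ratio is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical (red, nir) pixel values, for illustration only.
pixels = [(0.05, 0.45), (0.20, 0.25), (0.30, 0.30)]
for red, nir in pixels:
    print(red, nir, ndvi(red, nir))
```

For non-negative inputs the index lies between -1 and 1, with larger values generally indicating denser green vegetation.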
Statistical Models
Linking the diverse data inputs described above to meet
diverse user needs poses interesting statistical challenges.
Here we have a (possibly multivariate) response collected at
n sample locations on an x-y grid. We have a large number
of predictor variables whose functional relationship to the
response may be highly nonlinear, with complex interactions amongst the predictor variables. We want to model the
response as a function of the predictor variables for the
purposes of predicting in space, estimating population
totals and exploring ecological relationships (Fig. 2). In
addition, the modeling has to be done in a "production"
environment, i.e. repeatable by a variety of analysts frequently working on space-limited hardware with tight
deadlines and small budgets. The question, then, is which of
a myriad of statistical tools is most appropriate?
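To make the concern about nonlinear relationships concrete, the sketch below fits a straight line and a one-split regression tree to the same small data set and compares their root mean squared errors. Everything here is an illustrative assumption: the elevation and volume values are invented, the function names are hypothetical, and the one-split tree is a drastic simplification of the tree-based methods discussed in this paper, not the models fitted in our studies.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns a predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda v: intercept + slope * v


def fit_stump(x, y):
    """One-split regression tree: pick the cut point that minimizes the
    summed squared error around the two group means."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between identical predictor values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = (sum((v - mean_left) ** 2 for v in left)
               + sum((v - mean_right) ** 2 for v in right))
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, cut, mean_left, mean_right)
    if best is None:
        raise ValueError("predictor has no distinct values to split on")
    _, cut, mean_left, mean_right = best
    return lambda v: mean_left if v < cut else mean_right


def rmse(model, x, y):
    """Root mean squared error of a fitted predictor on (x, y)."""
    return (sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x)) ** 0.5


# Invented data: volume is near zero below an elevation threshold.
elevation = [1500, 1700, 1900, 2100, 2300, 2500, 2700, 2900]
volume = [0, 0, 1, 0, 30, 32, 29, 31]

line = fit_line(elevation, volume)
stump = fit_stump(elevation, volume)
print("linear RMSE:", rmse(line, elevation, volume))
print("stump RMSE: ", rmse(stump, elevation, volume))
```

With a step-like response such as this one, the single split fits far better than the line; with a truly linear response the ranking would reverse, which is why no single model class can be assumed best in advance.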
Figure 2. Strategy for merging regional forest inventory
data with satellite-based information for meeting multiple
inventory objectives through a variety of statistical models.
While a wide variety of approaches are available, we are
currently considering five classes of models for meeting
multiple forest inventory objectives. These include the
classical linear model as well as a suite of 4 nonlinear
regression methods. Linear models have been used extensively in forest inventory applications because they are fast
to compute, easy to interpret, and require relatively few data
points. In addition, they can be nested in a probability-based
estimation strategy through stratified sampling or regression estimators (like those currently used in forest inventories) and can produce quite accurate predictions when the
process generating the response is, in fact, linear. However,
predictions are much less accurate when the relationship
between the predictor and the response is highly nonlinear.
Given this constraint, we are likely to be able to extract more
information from the predictor variables through more flexible model structures capable of handling nonlinear relationships. Nonlinear methods considered include generalized linear and additive models (GLMs and GAMs)
(McCullagh and Nelder 1989, Hastie and Tibshirani 1990),
classification and regression trees (CART) (Morgan and
Sonquist 1963, Breiman and others 1984), multivariate
adaptive regression splines (MARS) (Friedman 1991), and
artificial neural networks (ANN) (Ripley 1996). These nonlinear models were chosen because all are believed to be
competitive for prediction when there are a small to moderate number of predictor variables (less than 10), as is the
case for our forest inventory application. See DeVeaux and
others, 1993, and DeVeaux, 1995, for discussions comparing
these techniques.
Questions and Research Activities
in Northern Utah
The following paragraphs summarize recent and ongoing research activities in the Northern Utah Mountain
Ecoregion. Each individual study represents an expansion
of traditional methods along one or more of the conceptual
axes defining our research.
AVHRR vs. PI: How Much?
Traditionally, estimates of population totals have been
constructed through two-phase sampling procedures where phase one consists of aerial photo based information collected on an intensive sample grid, and phase two
consists of a subset of that grid visited in the field. More
recently, questions have been raised about the cost-efficiency
of using satellite-based information for stratification in lieu
of photo-interpretation. A study sponsored by the FIA Remote Sensing Band (USDA, In preparation), is underway to
examine the relative precision of estimates of area by forest
type and total volumes of major tree species in 6 ecologically
different states within the U.S. under several stratification
strategies. Traditional two-phase sampling estimates using
photo-interpreted information in phase one are compared to
estimates obtained when classified AVHRR and an AVHRR-based vegetation index are used for stratification and regression estimation, respectively. Preliminary results from
the Northern Utah Mountain Ecoregion show that a tremendous reduction in cost can be realized through the use of
classified AVHRR data over traditional photo-interpreted
data for stratification with only minimal loss in precision.
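The two-phase logic described above can be sketched as follows. Phase one supplies the proportion of grid points falling in each stratum (from photo interpretation or classified AVHRR), and phase two supplies field measurements within each stratum. The variance expression is a commonly used large-sample approximation and is an assumption here, as are the function name, the stratum labels, and all numbers; the study's own estimators may differ in detail.

```python
def double_sampling_estimate(phase1_counts, phase2_values):
    """Double sampling for stratification.

    phase1_counts: {stratum: number of phase-one points in the stratum}
    phase2_values: {stratum: list of field-plot values in the stratum};
                   each stratum needs at least two field plots.
    Returns (estimated mean, approximate standard error).
    """
    n1 = sum(phase1_counts.values())
    weights = {h: count / n1 for h, count in phase1_counts.items()}

    means = {}
    variances = {}
    for h, values in phase2_values.items():
        n_h = len(values)
        mean_h = sum(values) / n_h
        means[h] = mean_h
        variances[h] = sum((v - mean_h) ** 2 for v in values) / (n_h - 1)

    mean = sum(weights[h] * means[h] for h in weights)
    # Within-stratum sampling error from the phase-two field plots.
    within = sum(weights[h] ** 2 * variances[h] / len(phase2_values[h])
                 for h in weights)
    # Extra error because stratum weights are themselves estimated.
    between = sum(weights[h] * (means[h] - mean) ** 2 for h in weights) / n1
    return mean, (within + between) ** 0.5


# Invented example: 1 = plot is a given forest type, 0 = it is not.
phase1 = {"forest": 600, "nonforest": 400}
phase2 = {"forest": [1, 1, 1, 0], "nonforest": [0, 0, 0, 1]}
proportion, se = double_sampling_estimate(phase1, phase2)
print(proportion, se)
```

Multiplying the estimated proportion by the total area of the population gives an estimate of area by forest type; swapping the source of the phase-one strata changes only `phase1_counts`, which is what makes the cost comparison between photo interpretation and AVHRR straightforward.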
GLMs and Digital Data: How Much and
Where?
In Moisen and others (in review), we illustrate how
generalized linear models can be used to construct approximately unbiased and efficient estimates of population totals
while providing a mechanism for spatial prediction for
mapping of forest structure. We model forest type and
timber volume of five tree species groups as functions of a
variety of predictor variables in the northern Utah mountains. Predictor variables include elevation, aspect, slope,
and geographic coordinates, as well as vegetation cover-types based on satellite data from both the Advanced Very
High Resolution Radiometer (AVHRR) and Thematic Mapper (TM) platforms. We examine the relative precision of
estimates of area by forest type and mean cubic-foot volumes
under six different models including the traditional double
sampling for stratification strategy. Only very small gains
in precision were realized through the use of expensive
photo-interpreted or TM-based data for stratification, while
models based on topography and spatial coordinates alone
were competitive (Fig. 3a,b). We also compare the predictive
capability of the models through various map accuracy
measures. The models including the TM-based vegetation
cover-types performed best overall, while topography and spatial coordinates alone provided substantial information at very low
cost (Fig. 3c,d).
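The two map accuracy measures reported in Figure 3, percent correctly classified (PCC) for forest type and root mean squared error (RMSE) for volume, can be computed as in the sketch below. The function names, type codes, and values are hypothetical and serve only to show the arithmetic.

```python
def percent_correctly_classified(observed, predicted):
    """Percent of locations where the predicted class equals the observed class."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for o, p in zip(observed, predicted) if o == p)
    return 100.0 * hits / len(observed)


def root_mean_squared_error(observed, predicted):
    """Square root of the mean squared difference between observed and predicted."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return (squared / len(observed)) ** 0.5


# Invented validation data for five plots (types) and three plots (volume).
field_type = ["AS", "DF", "LP", "SF", "LP"]
map_type = ["AS", "LP", "LP", "SF", "LP"]
field_volume = [10.0, 20.0, 30.0]
map_volume = [12.0, 18.0, 30.0]

print(percent_correctly_classified(field_type, map_type))
print(root_mean_squared_error(field_volume, map_volume))
```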
GAMs and Digital Data: Where and Why?
Frescino and others (in review) modelled forest composition and structural diversity in the Uinta Mountains, Utah,
as functions of satellite spectral data and spatially explicit
environmental variables through generalized additive models. Measures of vegetation composition and structural diversity were available from extant forest inventory data.
Three types of satellite data included raw TM spectral data,
a Gap Analysis classified TM, and a vegetation index based
on AVHRR. Environmental predictor variables included
maps of temperature, precipitation, elevation, aspect, slope,
and geology. Spatially explicit predictions were generated
for forest classification, presence of lodgepole pine, basal
area of forest trees, percent cover of shrubs, and density of
snags within a user-friendly display environment (Fig. 4).
The maps were validated using an independent set of field
data collected from the Evanston Ranger District within the
Uinta Mountain Range. The models predicting the presence
of forest and lodgepole pine were 88% and 80% accurate,
respectively, within the Evanston Ranger District, and an
average of 62% of the predictions of basal area, shrub cover,
and snag density fell within an approximate 15% deviation
from the field validation values. The addition of TM spectral
data and the Gap Analysis TM-classified data were found to
contribute significantly to the models' predictions, with
some contribution from AVHRR data. The methods used in
this study provide a systematic approach for delineating
structural features within forest habitats, thus offering an
efficient spatial tool for making management decisions.
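A validation summary of the kind reported above, the share of predictions falling within a stated deviation of the field values, can be sketched as follows. The sketch assumes the deviation is measured relative to each field value; that definition, the function name, and the numbers are assumptions for illustration and may not match the calculation in Frescino and others (in review).

```python
def percent_within_tolerance(observed, predicted, tolerance=0.15):
    """Percent of predictions within tolerance * |observed| of the observed value."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for o, p in zip(observed, predicted)
               if abs(p - o) <= tolerance * abs(o))
    return 100.0 * hits / len(observed)


# Invented basal area values for four validation plots.
field_basal_area = [100.0, 200.0, 50.0, 80.0]
predicted_basal_area = [110.0, 150.0, 55.0, 80.0]
print(percent_within_tolerance(field_basal_area, predicted_basal_area))
```

Under this relative definition a field value of zero can only be matched exactly, so variables with many zeros (snag density, for instance) might call for an absolute tolerance instead.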
Modern Regression Methods: How Much,
Where, and Why?
There are numerous ways in which forest class and
structure variables can be modeled as functions of remotely
sensed variables, yet little work has been done to determine
which statistical tools are best suited to the tasks given
multiple objectives and logistical constraints. Moisen and
others (1998) discuss ongoing work comparing the relative
performance of linear models, GAMs, CARTs, MARS, and
ANNs in meeting multiple forest inventory objectives
(Fig. 5). Models have been built for a variety of forest class
and structure variables using forest inventory and satellite-based information in the mountains of northern Utah, and
extensive realistic simulations are under construction
(Moisen, in preparation). The relative performance of each
of the five classes of models mentioned above is being
evaluated according to the following criteria: 1) accuracy of
spatial prediction; 2) efficiency in estimating population
totals; 3) interpretability; 4) suitability to a production
environment.
Figure 3. Forest type and cubic-foot volume for 5 species groups
were modelled as functions of several sets of predictor variables.
Species groups included all timber species (All), Aspen (Asp),
Douglas-fir (DF), lodgepole pine (LP), and spruce-fir (SF). Sets of
predictor variables included classified AVHRR data (A), topographic
variables and spatial position (topo), classified AVHRR along with
topographic and spatial variables (Atopo), classified TM data (T),
classified TM along with topographic and spatial variables (Ttopo),
and photo-interpreted forest type collected on a 1 km grid (PI).
Standard errors on estimates of area and volume by species group
are illustrated in Figures 3a and 3b, respectively. Figures 3c and 3d
illustrate the accuracy of maps of forest type and tree volume per acre
under each of 5 predictor sets. Accuracy of type maps was expressed
as percent correctly classified (PCC) while accuracy of tree volume maps was
expressed as the root mean squared error (RMSE). See Moisen and
others (In review) for details.
Figure 4. Predictions and summary statistics over seven ranger districts in
the Uinta Mountains. See Frescino and others (In review) for details.
Figure 5. Linear models (LM), generalized additive models (GAMs), classification and
regression trees (CART), multivariate adaptive regression splines (MARS), and artificial neural
networks (ANN) each offer unique opportunities and challenges if used as tools to meet multiple
inventory objectives. See Moisen and others (1998) for details.
Figure 6. Maps of forest attributes are needed for identification of suitable wildlife habitat.
(Panel labels: surface maps of wildlife variables; probability of presence.)
Wildlife Habitat: Where and Why?
Last, we are examining the ability to link the fine-scale
resolution obtained from site-specific wildlife models, and
fine-scale depictions of forest attributes, to large scale predictive models of wildlife. We first model the specific structural attributes of forest habitat following techniques described above. This process creates a statistical model relating
a response (e.g., snag density, canopy cover, tree density) to
a series of predictor variables (e.g., topographic variables,
classified TM data, spatial position). From this process, a
series of maps of forest attributes can be generated, each of
which is a spatial representation of a predictor variable in a
wildlife habitat model. To generate the probability of wildlife presence, each cell of each variable map is run through
the predictive wildlife model and a probability of presence
calculated (Fig. 6). Preliminary field tests of the predictive
capabilities of this approach for cavity nesting birds in aspen
forests indicate accuracies in the 60-85% range (Lawler
and Edwards, Unpublished data). These results suggest
that data, such as those collected by FIA, have applications
to other aspects of forest management beyond simple estimation of population totals.
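The cell-by-cell step described above can be sketched as follows. The sketch assumes, purely for illustration, that the wildlife model is a logistic regression; the form of the actual wildlife models is not specified here, and the attribute names, coefficients, and map values are hypothetical.

```python
import math


def presence_probability(intercept, coefficients, cell_values):
    """Logistic model: probability of presence for one map cell."""
    linear = intercept + sum(coefficients[name] * cell_values[name]
                             for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


def probability_map(intercept, coefficients, attribute_maps):
    """Run every cell of the forest attribute maps through the model.

    attribute_maps: {attribute name: 2-D list of cell values}; all maps
    are assumed to share the same grid dimensions.
    """
    names = list(coefficients)
    first = attribute_maps[names[0]]
    result = []
    for r in range(len(first)):
        row = []
        for c in range(len(first[0])):
            cell = {name: attribute_maps[name][r][c] for name in names}
            row.append(presence_probability(intercept, coefficients, cell))
        result.append(row)
    return result


# Hypothetical 2 x 2 attribute maps and model coefficients.
maps = {
    "snag_density": [[0.0, 4.0], [8.0, 12.0]],
    "canopy_cover": [[0.0, 20.0], [40.0, 60.0]],
}
coefs = {"snag_density": 0.25, "canopy_cover": 0.01}
grid = probability_map(-2.0, coefs, maps)
for row in grid:
    print(row)
```

Because each attribute map is itself a model prediction, errors in the forest attribute models carry through to the probability map, which is one reason field testing of the final predictions matters.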
Summary
Here we have presented a conceptual framework for
merging regional forest inventory data with satellite-based information for meeting multiple inventory objectives. We have described our ongoing efforts in the Interior
West to expand traditional forest inventory strategies to
accommodate a wider variety of auxiliary data inputs, statistical models, and user-defined products. Our goal is to
continue research into the strengths and weaknesses of
differing approaches, exploring how alternative data inputs,
statistical models and applications as defined by users affect
our ability to inventory and monitor forest attributes.
References
Breiman, L.; Friedman, J. H.; Olshen, R. A.; and Stone, C. J. 1984.
Classification and Regression Trees. Monterey, CA: Wadsworth
and Brooks/Cole.
Daly, C.; Neilson, R. P.; and Phillips, D. L. 1994. A statistical-topographic model for mapping climatological precipitation
over mountainous terrain. Journal of Applied Meteorology 33:
140-158.
DeVeaux, R. D. 1995. A guided tour of modern regression methods.
In: 1995 Fall Technical Conference: Section on Physical and
Engineering Sciences: Proceedings of conference; St. Louis, MO.
DeVeaux, R. D.; Psichogios, D. C.; and Ungar, L. H. 1993. A
comparison of two nonparametric estimation schemes: MARS
and Neural Networks. Computers in Chemical Engineering 8:
819-837.
Edwards, T. C., Jr.; Homer, C. G.; Bassett, S. D.; Falconer, A.;
Ramsey, R. D.; and Wight, D. W. 1995. Utah Gap Analysis: an
environmental information system. Technical Report 95-1, Utah
Cooperative Fish and Wildlife Research Unit, Utah State University, Logan, Utah. 1138pp + 2 CD-ROMs.
Frescino, T. S.; Edwards, T.C., Jr.; and Moisen, G. G. [In review].
Modelling spatially explicit structural attributes using generalized additive models. Submitted to Ecological Applications.
Friedman, J. H. 1991. Multivariate adaptive regression splines.
Annals of Statistics 19: 1-141.
Hastie, T.; and Tibshirani, R. J. 1990. Generalized Additive Models.
New York: Chapman and Hall. 335 p.
Hintze, L. F. 1980. Geologic map index of Utah. Utah Geological
and Mineralogical Survey, Salt Lake City, Utah.
Homer, C. H.; Ramsey, R. D.; Edwards, T. C., Jr.; and Falconer, A.
1997. Landscape cover-type modelling using a multi-scene
TM mosaic. Photogrammetric Engineering and Remote Sensing
63: 59-67.
Lawler, J. J. L. and Edwards, T. C., Jr. [Unpublished data.]
Utah State University, Department of Fisheries and Wildlife,
Logan, UT.
Loveland, T. R.; Merchant, J. W.; Ohlen, D. O.; and Brown, J. F. 1991.
Development of a land-cover characteristics database for the
conterminous U.S. Photogrammetric Engineering and Remote
Sensing 57: 1453-1463.
McCullagh, P. and Nelder, J. A. 1989. Generalized Linear Models.
New York: Chapman and Hall. 511 p.
Moisen, G. G.; Edwards, T. C., Jr.; and Van Hooser, D. 1997.
Merging Regional Forest Inventory Data with Satellite-based
Information Through Nonlinear Regression Methods. In:
T. Ranchin and L. Wald, eds. Proceedings of the Second International Conference on the Fusion of Earth Data; Sophia
Antipolis, France; January 1998. p. 123-128.
Moisen, G. G. and T. C. Edwards, Jr. [In review]. Use of generalized
linear models and digital data in a forest inventory of Utah.
Submitted to Journal of Agricultural, Biological and Environmental Statistics.
Moisen, G. G. [In preparation]. Modern regression methods for
meeting multiple forest inventory objectives: a comparative
study. Utah State University, Department of Mathematics and
Statistics. Ph.D. dissertation in preparation.
Morgan, J. N. and Sonquist, J. A. 1963. Problems in the analysis of
survey data and a proposal. Journal of the American Statistical
Association 58: 415-434.
O'Brien, R. A. 1996. Forest resources of Northern Utah Ecoregion.
Resour. Bull. INT-RB-87. Ogden, UT: U.S. Department of Agriculture, Forest Service, Intermountain Research Station. 34 pp.
Omernik, J. M. 1987. Map supplement: ecoregions of the conterminous United States. Annals of the Association of American Geographers 77: 118-125 (map).
Powell, D. S.; Faulkner, J. L.; Darr, D. R.; Zhu, Z.; and MacCleery,
D. W. 1993. Forest resources of the United States, 1992. General
Technical Report RM-234. Fort Collins, CO: U.S. Department of
Agriculture, Forest Service, Rocky Mountain Forest and Range
Experiment Station. 132 p. + map.
Ripley, B. D. 1996. Pattern Recognition and Neural Networks. New
York: Cambridge University Press. 403 pp.
Scott, J. M.; Davis, F.; Csuti, B.; Noss, R.; Butterfield, B.; Caicco, S.;
Groves, C.; Edwards, T. C., Jr.; Ulliman, J.; Anderson, H.;
Derchia, F.; and Wright, R. G. 1993. Gap Analysis: A geographic
approach to protection of biological diversity. Wildl. Monogr.
No. 123.
USDA. [In preparation]. Satellite-based stratification alternatives
for forest inventory. Study sponsored by the FIA National
Remote Sensing Band.
Zimmermann, N. [Unpublished data.] Utah State University, Department of Forestry, Logan, UT.