
Forest Health Monitoring Information Management System 1
Audrey Mac Leod 2
Harvey Berenberg 3
Brian Cordova 4
Susan Hua 5
Matthew Kinkenon 6
Chuck Liff 7
1 Paper presented at the North American Science Symposium: Toward a Unified Framework for Inventorying and Monitoring Forest Ecosystem Resources, Guadalajara, Mexico, November 1-6, 1998.
2 Audrey Mac Leod is Senior Programmer/Analyst, FHM-IS, University of Nevada, Las Vegas, located at Las Vegas, Nevada.
3 Harvey Berenberg is Programmer/Analyst, FHM-IS, University of Nevada, Las Vegas, located at Las Vegas, Nevada.
4 Brian Cordova is Programmer/Analyst, FHM-IS, University of Nevada, Las Vegas, located at Las Vegas, Nevada.
5 Susan Hua is Programmer/Analyst, FHM-IS, University of Nevada, Las Vegas, located at Las Vegas, Nevada.
6 Matthew Kinkenon is Consultant Programmer/Analyst, FHM-IS, located at Las Vegas, Nevada.
7 Chuck Liff is FHM-IS Manager, USDA Forest Service, located at Las Vegas, Nevada.
Abstract-The United States Department of Agriculture (USDA)
Forest Health Monitoring (FHM) program was established by
several Federal and State agencies to monitor, make assessments,
and report on the health of the Nation's forests. The FHM program
reports on the long-term status, changes, and trends in the health
of US forests. The mission of the FHM Information Management
System (FHM-IMS) is to produce and maintain a system that
supports all FHM activities from field data collection to data
distribution. Data are collected in the field using handheld Portable Data Recorders. The field data are loaded into an Oracle
database. The unique feature of the FHM-IMS is that all data
processing takes place in the Oracle database environment. This
allows for quick and efficient data processing to meet the goal of
providing the users with data of known quality. To meet that goal,
the system uses a number of innovative mechanisms including a
database model that incorporates the business rules of FHM.
Changes to the data are tracked with an automated audit system.
The FHM-IMS provides the FHM community and other users with
quality FHM data in various formats suitable for scientific analysis.
The data are distributed in a number of ways including the World
Wide Web.
FHM Indicators
FHM indicators are measurements or groups of measurements. An indicator is defined as any biological or nonbiological component of the environment that quantitatively estimates the condition or change in condition of ecological resources, the magnitude of stress, or the exposure of a biological component to stress. The core FHM indicators are mensuration, crown evaluation, damage, lichen communities, soils, and ozone bioindicator plants.
FHM Data Model
The FHM data model mirrors the business rules of
FHM data acquisition and provides enough flexibility so
that the data may be stored in the FHM database and used
for a variety of studies. The FHM data model has two points
of emphasis:
1. The model stores variable plot data annually, but
stores permanent plot data only in the year that the
plot was established. For example, when a plot is
established the tree species codes and diameters at
breast height (DBH) are stored. Thereafter, only the
DBHs are stored. The FHM database consists of a set of permanent and yearly tables to support this philosophy (a minimal sketch follows this list). This design paradigm accurately mirrors the functionality of a fixed-area plot design and is used in other United States Forest Service (USFS) projects.
2. In support of the FHM common plot and sampling design, the model stores data that are common to various projects in a single database and shares them among the projects.
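The paper does not reproduce the FHM schema, so the following is only a minimal sketch, under assumed (hypothetical) table and column names, of how the permanent/yearly split described in point 1 can be expressed in Oracle SQL:

    -- Hypothetical sketch: attributes recorded only at plot establishment
    -- live in a permanent table, written once.
    CREATE TABLE tree_permanent (
      plot_number   NUMBER       NOT NULL,
      tree_number   NUMBER       NOT NULL,
      species_code  VARCHAR2(4),           -- stored only in the establishment year
      CONSTRAINT pk_tree_permanent PRIMARY KEY (plot_number, tree_number)
    );

    -- Remeasured attributes such as DBH go into a yearly table, one row per
    -- tree per measurement year.
    CREATE TABLE tree_yearly (
      plot_number       NUMBER     NOT NULL,
      tree_number       NUMBER     NOT NULL,
      measurement_year  NUMBER(4)  NOT NULL,
      dbh               NUMBER(5,1),        -- diameter at breast height, inches
      CONSTRAINT pk_tree_yearly
        PRIMARY KEY (plot_number, tree_number, measurement_year),
      CONSTRAINT fk_tree_yearly_perm
        FOREIGN KEY (plot_number, tree_number)
        REFERENCES tree_permanent (plot_number, tree_number)
    );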
Data Acquisition
FHM field data are collected on a Portable Data Recorder
(PDR) running a customized data acquisition program (Tally).
Tally uses various configuration files, menus, and display
screens to present data and request input from the person
collecting the data. The configuration files are defined on a
regional basis. They allow the individual FHM regions to
make modifications to the National FHM data set and still
operate within the framework established by the FHM Information Management group. Once data are entered, logic and
completion checks are run against the data in real time to
ensure data quality.
Data Loading
The field data contained in the Tally data files are loaded
into an Oracle Relational Database Management System
(RDBMS). The FHM data loading program, Tally cracker,
parses the Tally data files and loads the field data into the
FHM database. Once the data have been loaded into the FHM
database, the Tally data files are archived and maintained
for historical purposes. The FHM database field data are
considered official and used for all further processing.
Tally cracker was designed and built to maximize the use
of the standard UNIX tools and to utilize configuration files.
This approach has led to greater system efficiency and
cost savings. The configuration files provide flexibility in
loading data from various regions and adaptability with
respect to annual design changes. Processing the field data
wholly within the RDBMS offers greater reliability and
efficiency.
Referential Integrity Checks
Referential Integrity (RI) is the process that Oracle uses
to enforce the business rules of the database. The business
rules are the foreign key constraints placed on table attributes by virtue of their relationship. For example, a
business rule is that the condition class for a tree must be in
the subplot condition class list. A foreign key constraint
would be defined for the attribute condition class in the tree
table, referencing the attribute condition class in the subplot condition class table. If any condition class data violate this RI check (that is, the foreign key constraint), the constraint will not be enabled and the offending data will be placed in error tables.
The FHM-IMS utilizes the built-in features of the Oracle
product to implement the RI checks. Prior to loading the
field data, RI is disabled, which ensures that all the field data
are loaded into the FHM database. Once the data are loaded,
the RI checks are enabled. If there are data that violate an
RI check, the constraint will not be enabled and the offending data will be inserted into error tables. A report generation program queries the error tables and produces a file
containing the RI errors. This file is sent to the appropriate
regional lead for correction. The corrections are returned to
the FHM-IMS staff, entered into the database, and the RI
check process is repeated until all RI errors have been
resolved.
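The constraint definitions themselves are not shown in the paper; the following is a hedged sketch, with hypothetical table, column, and constraint names, of how this disable/load/enable cycle is typically expressed in Oracle:

    -- Hypothetical foreign key enforcing the business rule that a tree's
    -- condition class must appear in the subplot condition class list.
    -- It is created disabled so the field data load cannot be blocked.
    ALTER TABLE tree ADD CONSTRAINT fk_tree_condition_class
      FOREIGN KEY (plot_number, subplot_number, condition_class)
      REFERENCES subplot_condition_class (plot_number, subplot_number, condition_class)
      DISABLE;

    -- After the load, enabling the constraint validates the data; rows that
    -- violate it are recorded (by ROWID) in an exceptions table created
    -- beforehand, for example with Oracle's utlexcpt.sql script.
    ALTER TABLE tree ENABLE CONSTRAINT fk_tree_condition_class
      EXCEPTIONS INTO exceptions;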
The mechanisms used by the FHM-IMS RI check process
provide an efficient and flexible means of adding, modifying,
and deleting relationships without having to modify customized software.
SQL Logic Checks
Approximately four hundred (400) data checks were of
sufficient complexity that they required the development of
a validation system. An example of this type of check is:
Saplings must have a diameter at breast height (DBH)
between 1 and 4.9 inches. The approach taken to implement
these checks was to develop a set of queries and error tables
within the FHM database. Each logic check was categorized,
given a unique number within its category, coded as a SQL
query, and stored in the FHM database SQL check table.
This approach allows new logic checks to be developed
quickly and efficiently. And since each check is a self-contained item, concerns about the second-order effects of changing an individual check are eliminated. Prior to
being stored in the SQL check table, the SQL query is transformed into an SQL insert statement. When executed, this statement places data that violate the logic check into
the appropriate error table. The SQL check error tables
are the boundary, condition class, plot, point, seedling, site
tree, and tree error tables.
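To make the mechanism concrete, the sapling DBH rule might be stored roughly as the following insert statement (a sketch only; the check number and all table and column names are hypothetical):

    -- Hypothetical SQL logic check: saplings must have a DBH between 1 and
    -- 4.9 inches. Violations are written to the tree error table together
    -- with the category and number that identify the check.
    INSERT INTO tree_error
           (check_category, check_number, plot_number, subplot_number,
            tree_number, dbh)
    SELECT 'TREE', 27, t.plot_number, t.subplot_number, t.tree_number, t.dbh
    FROM   tree t
    WHERE  t.tree_class = 'SAPLING'          -- hypothetical sapling code
    AND    t.dbh NOT BETWEEN 1.0 AND 4.9;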
The FHM-IMS SQL check process executes each of the
checks in the SQL check table, then queries the SQL check
error tables and produces the error reports. These reports
are ASCII files in a predefined format suitable for editing.
The regional leads correct the erroneous data. The corrected
error reports are then processed; that is, data that were changed are loaded into the appropriate SQL check error
table. The corrected data are stored in the same row as the
original data. This allows a single query to identify both the
original and corrected data associated with a given check.
The SQL check error tables also provide an audit trail for
tracking data errors. Once the corrections have been loaded,
the SQL check process queries the error tables and updates
the actual FHM database table rows. This process is repeated until all data errors are resolved.
Value Added Processing
Additional checks that could not be adequately handled
using the SQL check process have been grouped together in
a process referred to as outside checks. The FHM plot design
is a fixed pattern containing four subplots. Some plots have
more than one land use, forest type, stand origin, stand size
or past disturbance resulting in different condition classes
being mapped on subplots by the crews. Errors can sometimes arise due to improper coding of condition classes by
field crews. For example, a crew may record missed trees in a previous condition class that was in a nonforested condition or that was in a condition class not on the subplot.
Another example would be a crew entering more than one
condition class on a plot that differs only by the stand age
(less than 50 years). A final example is a crew changing the previous DBH, diameter at root collar (DRC), or site tree DBH that was downloaded to the PDR from the previous survey. A report is
produced for each of these potential errors and the regional
leads make a determination if a change is required to the
database. This is an easy-to-follow three-step process: (1) determine the error, (2) decide what change to make, and (3) update the database.
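As a hedged example of how one such report could be produced (the actual implementation is not described in the paper, and all table and column names below are hypothetical), the duplicate stand-age condition classes might be listed with a self-join; the regional leads then review the output and decide whether a change is required:

    -- Hypothetical report query: condition classes on the same plot that
    -- agree on every mapped attribute except stand age, with both stands
    -- younger than 50 years.
    SELECT a.plot_number,
           a.condition_class AS condition_class_1,
           b.condition_class AS condition_class_2
    FROM   condition_class a, condition_class b
    WHERE  a.plot_number     = b.plot_number
    AND    a.condition_class < b.condition_class
    AND    a.land_use        = b.land_use
    AND    a.forest_type     = b.forest_type
    AND    a.stand_origin    = b.stand_origin
    AND    a.stand_size      = b.stand_size
    AND    a.stand_age      <> b.stand_age
    AND    a.stand_age       < 50
    AND    b.stand_age       < 50;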
Special value-added processes are contained in a program
called PNSN (Program Needing Snappy Name). Some of the functions of PNSN are: identifying sapling outgrowth; providing pointers for plot entrance year, plot exit year, previous visit year, previous tally year, and condition class year; assigning tree numbers for missed and extra trees; handling shrunken trees; and creating multiple plot numbers. The design of the database allows for multiple
"layers" of data. The main layer of the database is the
detection monitoring data. The database can handle additional layers such as quality assurance, pilot studies or
special projects. PNSN provides a method for separating the
data for easier access by the user community. Additionally,
PNSN allows the analyst an easier means of tracking the
progression of a tree from entrance to exit in the database.
The FHM-IMS also needs to handle situations that occur infrequently but are nevertheless important. For example, there is a process in the FHM-IMS that handles the situation
where a tree is tallied as a living tree during one visit, dead
during a subsequent visit and then living again during the
current visit. It is appropriately called the zombie process.
The zombie process is driven by the business rules and
accordingly handles each occurrence in exactly the same
predictable way.
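The implementation is not shown in the paper, but a hedged sketch of how zombie trees might be identified, assuming hypothetical table and column names and one tree row per measurement year, is:

    -- Hypothetical query flagging "zombie" trees: recorded dead on an
    -- earlier visit but tallied as live on the current visit.
    SELECT cur.plot_number, cur.subplot_number, cur.tree_number
    FROM   tree cur, tree prev
    WHERE  prev.plot_number      = cur.plot_number
    AND    prev.subplot_number   = cur.subplot_number
    AND    prev.tree_number      = cur.tree_number
    AND    prev.measurement_year < cur.measurement_year
    AND    prev.tree_status      = 'DEAD'    -- hypothetical status codes
    AND    cur.tree_status       = 'LIVE';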
The FHM-IMS also contains processes that derive additional data from the field data and then store them in the database. Calculating the percent of area for each condition class on a plot is one of these processes. This process would take too long to run in real time and therefore is run
only once after the data are loaded. It can then be accessed
by the same means as the field data for use in further
analysis. This fulfills the requirement that the percent of
area data can be used for area expansion in the map plot
design.
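The calculation itself is not spelled out in the paper; under the assumption (hypothetical names and layout) that each subplot record carries the proportion of the subplot occupied by a condition class, the derivation might look like:

    -- Hypothetical derivation of percent area per condition class on a plot,
    -- averaging the condition proportions over the four subplots and storing
    -- the result for later use in area expansion.
    INSERT INTO condition_class_area (plot_number, condition_class, pct_area)
    SELECT plot_number,
           condition_class,
           100 * SUM(condition_proportion) / 4    -- four subplots per plot
    FROM   subplot_condition_class
    GROUP  BY plot_number, condition_class;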
View-Like Units (VLU) Creation
For most users the relational database and relational technology can be confusing. The design of the database lays out like a London Subway map, with data contained in several tables. In this form the data are not very useful to the end user. The information management group has created several tables based on what the user community has determined most useful for data analysis. The tables that are most useful for the user community are the PLOT_VIEW, POINT_VIEW, CONDITION_CLASS_VIEW, and TREE_VIEW. The TREE_VIEW was essentially designed to be one-stop shopping for the end user. It contains plot, point, condition class, and tree level information. These VLUs have been constructed in a manner that allows quick access to the data in the database.
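The actual VLU definitions are not given in the paper; a hedged sketch of how a denormalized TREE_VIEW table can be built, using hypothetical column names and comma-style Oracle join syntax, is:

    -- Hypothetical one-stop-shopping table combining plot, point, condition
    -- class, and tree level attributes for analysts.
    CREATE TABLE tree_view AS
    SELECT p.plot_number, p.measurement_year,
           pt.point_number,
           cc.condition_class, cc.forest_type, cc.land_use,
           t.tree_number, t.species_code, t.dbh, t.crown_ratio
    FROM   plot p, point pt, condition_class cc, tree t
    WHERE  pt.plot_number      = p.plot_number
    AND    pt.measurement_year = p.measurement_year
    AND    cc.plot_number      = pt.plot_number
    AND    cc.measurement_year = pt.measurement_year
    AND    t.plot_number       = pt.plot_number
    AND    t.point_number      = pt.point_number
    AND    t.measurement_year  = pt.measurement_year
    AND    t.condition_class   = cc.condition_class;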
FHM-IMS Audit System
The FHM-IMS audit system uses the built-in features of
the database to track changes to the database after the data
have been verified. FHM data are not static and the audit
system provides a means of auditing data changes.
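The paper does not say which built-in feature is used; one common Oracle mechanism for this kind of change tracking is a trigger, sketched here with hypothetical table and column names:

    -- Hypothetical audit trigger: whenever a verified DBH value is updated,
    -- the old and new values are written to an audit table along with who
    -- made the change and when.
    CREATE OR REPLACE TRIGGER tree_dbh_audit
    AFTER UPDATE OF dbh ON tree
    FOR EACH ROW
    BEGIN
      INSERT INTO tree_audit
             (plot_number, subplot_number, tree_number,
              old_dbh, new_dbh, changed_by, changed_on)
      VALUES (:OLD.plot_number, :OLD.subplot_number, :OLD.tree_number,
              :OLD.dbh, :NEW.dbh, USER, SYSDATE);
    END;
    /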
Data Distribution
Creating easy-to-understand and useful tables, tailor-made for the end user, is only part of the task. The FHM-IMS needs to provide access to the data as well as providing the data to the end user in a usable format. The Oracle Data Browser is a tool that is available to the user and allows the user to graphically view the data contained in the VLUs. The end user can connect to the database and point and click on the attributes they require. The Oracle database can be reached across the network as easily as if the database were sitting on their desk. For those users who do not have access to the Internet, the FHM-IMS staff provides the data to the user community in SAS data sets. The SAS data sets mirror the original VLU tables. The data sets are compressed and available through FTP or direct mail. In some cases the data can be output to ASCII comma-delimited files. This allows the data to be read into another format most comfortable for the end user.
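The export mechanism is not detailed in the paper; a hedged SQL*Plus sketch of spooling a VLU to a comma-delimited ASCII file (hypothetical column names) is:

    -- Hypothetical SQL*Plus script that writes TREE_VIEW rows to a
    -- comma-delimited ASCII file suitable for import elsewhere.
    SET PAGESIZE 0 LINESIZE 200 FEEDBACK OFF HEADING OFF TRIMSPOOL ON
    SPOOL tree_view.csv
    SELECT plot_number     || ',' ||
           point_number    || ',' ||
           condition_class || ',' ||
           tree_number     || ',' ||
           species_code    || ',' ||
           dbh
    FROM   tree_view
    ORDER  BY plot_number, point_number, tree_number;
    SPOOL OFF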
Conclusions
The FHM-IMS provides a flexible, efficient, portable, and
scalable system for loading, verifying, maintaining, and
distributing a change history of forestry data taken on a
permanent plot network. The FHM database design is robust enough to handle data for many studies, allowing data
to be shared among studies where appropriate. The paradigm of only storing certain "permanent" data one time in
the FHM database has improved consistency and given
analysts higher quality data. The Tally cracker program
allows configuration files to be updated on a yearly basis,
and has eliminated the need for special programming to
load data files. The general approach taken by FHM Information Management in solving the data verification problem has allowed the system to be extremely flexible and
portable. It has also cut down on programming maintenance
because all logic checks are SQL queries and these are
simple to construct. The FHM-IMS has addressed the needs
of the user community by presenting data to the user in a
variety of formats. The use of the FHM-IMS has increased
the quality of the data presented to the user community. It
has also provided a flexible framework to allow data to be
processed efficiently even when design changes take place
within the FHM sampling system.