Forest Health Monitoring Information Management System 1

Audrey Mac Leod 2
Harvey Berenberg 3
Brian Cordova 4
Susan Hua 5
Matthew Kinkenon 6
Chuck Liff 7

1 Paper presented at the North American Science Symposium: Toward a Unified Framework for Inventorying and Monitoring Forest Ecosystem Resources, Guadalajara, Mexico, November 1-6, 1998.
2 Audrey Mac Leod is Senior Programmer/Analyst, FHM-IS, University of Nevada, Las Vegas, located at Las Vegas, Nevada.
3 Harvey Berenberg is Programmer/Analyst, FHM-IS, University of Nevada, Las Vegas, located at Las Vegas, Nevada.
4 Brian Cordova is Programmer/Analyst, FHM-IS, University of Nevada, Las Vegas, located at Las Vegas, Nevada.
5 Susan Hua is Programmer/Analyst, FHM-IS, University of Nevada, Las Vegas, located at Las Vegas, Nevada.
6 Matthew Kinkenon is Consultant Programmer/Analyst, FHM-IS, located at Las Vegas, Nevada.
7 Chuck Liff is FHM-IS Manager, USDA Forest Service, located at Las Vegas, Nevada.

USDA Forest Service Proceedings RMRS-P-12. 1999

Abstract: The United States Department of Agriculture (USDA) Forest Health Monitoring (FHM) program was established by several Federal and State agencies to monitor, assess, and report on the health of the Nation's forests. The FHM program reports on the long-term status, changes, and trends in the health of US forests. The mission of the FHM Information Management System (FHM-IMS) is to produce and maintain a system that supports all FHM activities from field data collection to data distribution. Data are collected in the field using handheld Portable Data Recorders. The field data are loaded into an Oracle database. The unique feature of the FHM-IMS is that all data processing takes place in the Oracle database environment. This allows for quick and efficient data processing to meet the goal of providing users with data of known quality. To meet that goal, the system uses a number of innovative mechanisms, including a database model that incorporates the business rules of FHM. Changes to the data are tracked with an automated audit system. The FHM-IMS provides the FHM community and other users with quality FHM data in various formats suitable for scientific analysis. The data are distributed in a number of ways, including the World Wide Web.

FHM Indicators

FHM indicators are measurements or groups of measurements. An indicator is defined as any biological or nonbiological component of the environment that quantitatively estimates the condition or change in condition of ecological resources, the magnitude of stress, or the exposure of a biological component to stress. The core FHM indicators are mensuration, crown evaluation, damage, lichen communities, soils, and ozone bioindicator plants.

FHM Data Model

The FHM data model mirrors the business rules of FHM data acquisition and provides enough flexibility that the data may be stored in the FHM database and used for a variety of studies. The FHM data model has two points of emphasis:

1. The model stores variable plot data annually, but stores permanent plot data only in the year that the plot was established. For example, when a plot is established the tree species codes and diameters at breast height (DBH) are stored. Thereafter, only the DBHs are stored. The FHM database consists of a set of permanent and yearly tables to support this philosophy. This design paradigm accurately mirrors the functionality of a fixed-area plot design and is used in other United States Forest Service (USFS) projects.

2. In support of the FHM common plot and sampling design, the model stores data that are common to various projects in a single database and shares them among the projects.

Data Acquisition

FHM field data are collected on a Portable Data Recorder (PDR) running a customized data acquisition program (Tally). Tally uses various configuration files, menus, and display screens to present data and request input from the person collecting the data. The configuration files are defined on a regional basis. They allow the individual FHM regions to make modifications to the National FHM data set and still operate within the framework established by FHM Information Management. Once data are entered, logic and completion checks are run against the data in real time to ensure data quality.

Data Loading

The field data contained in the Tally data files are loaded into an Oracle Relational Database Management System (RDBMS). The FHM data loading program, Tally cracker, parses the Tally data files and loads the field data into the FHM database. Once the data have been loaded into the FHM database, the Tally data files are archived and maintained for historical purposes. The field data in the FHM database are considered official and are used for all further processing.

Tally cracker was designed and built to maximize the use of the standard UNIX tools and to utilize configuration files. This approach has led to greater system efficiency and cost savings. The configuration files provide flexibility in loading data from various regions and adaptability with respect to annual design changes. Processing the field data wholly within the RDBMS offers greater reliability and efficiency.

Referential Integrity Checks

Referential Integrity (RI) is the process that Oracle uses to enforce the business rules of the database. The business rules are the foreign key constraints placed on table attributes by virtue of their relationship. For example, a business rule is that the condition class for a tree must be in the subplot condition class list.
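The example business rule can be sketched as a foreign key constraint. The following is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of Oracle, and the table and column names (subplot_condition_class, tree, plot_id, subplot, tree_no, condition_class) are hypothetical, not taken from the FHM schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: a tree's condition class must appear in the
# condition class list of the subplot on which the tree stands.
conn.executescript("""
CREATE TABLE subplot_condition_class (
    plot_id INTEGER,
    subplot INTEGER,
    condition_class INTEGER,
    PRIMARY KEY (plot_id, subplot, condition_class)
);
CREATE TABLE tree (
    plot_id INTEGER,
    subplot INTEGER,
    tree_no INTEGER,
    condition_class INTEGER,
    PRIMARY KEY (plot_id, subplot, tree_no),
    FOREIGN KEY (plot_id, subplot, condition_class)
        REFERENCES subplot_condition_class (plot_id, subplot, condition_class)
);
""")

# Subplot 1 of plot 1 has only condition class 1 mapped.
conn.execute("INSERT INTO subplot_condition_class VALUES (1, 1, 1)")


def try_insert(row):
    """Return True if the tree row satisfies the constraint, else False."""
    try:
        conn.execute("INSERT INTO tree VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert((1, 1, 101, 1))  # condition class 1 is in the list
rejected = try_insert((1, 1, 102, 2))  # condition class 2 is not
```

In this sketch the database itself rejects the second tree, so no custom validation code has to know about the rule.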
A foreign key constraint would be defined for the attribute condition class in the tree table, referencing the attribute condition class in the subplot condition class table. If any condition class data violate the RI check (that is, the foreign key constraint), RI will not be enabled for that constraint and the offending data will be placed in error tables.

The FHM-IMS utilizes the built-in features of the Oracle product to implement the RI checks. Prior to loading the field data, RI is disabled, which ensures that all the field data are loaded into the FHM database. Once the data are loaded, the RI checks are enabled. If there are data that violate an RI check, the constraint will not be enabled and the offending data will be inserted into error tables. A report generation program queries the error tables and produces a file containing the RI errors. This file is sent to the appropriate regional lead for correction. The corrections are returned to the FHM-IMS staff and entered into the database, and the RI check process is repeated until all RI errors have been resolved. The mechanisms used by the FHM-IMS RI check process provide an efficient and flexible means of adding, modifying, and deleting relationships without having to modify customized software.

SQL Logic Checks

Approximately four hundred (400) data checks were of sufficient complexity that they required the development of a validation system. An example of this type of check is: Saplings must have a diameter at breast height (DBH) between 1 and 4.9 inches. The approach taken to implement these checks was to develop a set of queries and error tables within the FHM database. Each logic check was categorized, given a unique number within its category, coded as a SQL query, and stored in the FHM database SQL check table. This approach allows new logic checks to be developed quickly and efficiently.
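A table-driven check of this kind might be sketched as follows. This is an illustration only, using Python's sqlite3 module rather than Oracle; the table names (tree, tree_error, sql_check), their columns, and the helper functions to_insert_statement and run_checks are all hypothetical. Each check query is wrapped in an INSERT so that running the stored statement copies violating rows into an error table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tree (plot_id INTEGER, tree_no INTEGER, tree_type TEXT, dbh REAL);
CREATE TABLE tree_error (check_category TEXT, check_no INTEGER,
                         plot_id INTEGER, tree_no INTEGER, dbh REAL);
CREATE TABLE sql_check (category TEXT, check_no INTEGER, description TEXT,
                        statement TEXT, PRIMARY KEY (category, check_no));
""")

conn.executemany("INSERT INTO tree VALUES (?, ?, ?, ?)", [
    (1, 1, "SAPLING", 2.3),   # valid sapling
    (1, 2, "SAPLING", 6.1),   # violates the sapling DBH rule
    (1, 3, "TREE", 12.0),     # not a sapling, so the rule does not apply
])


def to_insert_statement(category, check_no, error_table, query):
    """Wrap a check query so its result rows land in the error table."""
    return (
        f"INSERT INTO {error_table} "
        f"SELECT '{category}', {check_no}, v.* FROM ({query}) AS v"
    )


# The check from the text: saplings must have a DBH between 1 and 4.9 inches.
sapling_query = (
    "SELECT plot_id, tree_no, dbh FROM tree "
    "WHERE tree_type = 'SAPLING' AND (dbh < 1.0 OR dbh > 4.9)"
)
conn.execute(
    "INSERT INTO sql_check VALUES (?, ?, ?, ?)",
    ("TREE", 1, "Saplings must have a DBH between 1 and 4.9 inches",
     to_insert_statement("TREE", 1, "tree_error", sapling_query)),
)


def run_checks(connection):
    """Execute every stored check; violations accumulate in error tables."""
    rows = connection.execute(
        "SELECT statement FROM sql_check ORDER BY category, check_no"
    ).fetchall()
    for (statement,) in rows:
        connection.execute(statement)


run_checks(conn)
errors = conn.execute(
    "SELECT check_category, check_no, plot_id, tree_no, dbh FROM tree_error"
).fetchall()
```

Adding a new check in this sketch means inserting one more row into sql_check; run_checks itself does not change.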
Because each check is a self-contained item, concerns about the second-order effects of changing an individual check are eliminated. Prior to being stored in the SQL check table, the SQL query is transformed into an SQL insert statement. When executed, this statement places data that violate the logic check into the appropriate error table. The SQL check error tables are the boundary, condition class, plot, point, seedling, site tree, and tree error tables.

The FHM-IMS SQL check process executes each of the checks in the SQL check table, then queries the SQL check error tables and produces the error reports. These reports are ASCII files in a predefined format suitable for editing. The regional leads correct the erroneous data. The corrected error reports are then processed; that is, data that were changed are loaded into the appropriate SQL check error table. The corrected data are stored in the same row as the original data. This allows a single query to identify both the original and corrected data associated with a given check. The SQL check error tables also provide an audit trail for tracking data errors. Once the corrections have been loaded, the SQL check process queries the error tables and updates the actual FHM database table rows. This process is repeated until all data errors are resolved.

Value Added Processing

Additional checks that could not be adequately handled using the SQL check process have been grouped together in a process referred to as outside checks. The FHM plot design is a fixed pattern containing four subplots. Some plots have more than one land use, forest type, stand origin, stand size, or past disturbance, resulting in different condition classes being mapped on subplots by the crews. Errors can sometimes arise from improper coding of condition classes by field crews. For example, a crew may record missed trees in a previous condition class that was in a nonforested condition or that was in a condition class not on the subplot.
Another example is a crew entering more than one condition class on a plot where the classes differ only by stand age (less than 50 years). A final example is a crew changing the previous DBH, DRC, or site tree DBH that was downloaded to the PDR from the previous survey. A report is produced for each of these potential errors, and the regional leads determine whether a change to the database is required. This is an easy-to-follow three-step process: (1) determine the error, (2) decide what change to make, and (3) update the database.

Special value-added processes are contained in a program called PNSN (Program Needing Snappy Name). The following is a list of some of the functions of PNSN: identifying sapling outgrowth; providing pointers for plot entrance year, plot exit year, previous visit year, previous tally year, and condition class year; assigning tree numbers for missed and extra trees, shrunken trees; and creating multiple plot numbers.

The design of the database allows for multiple "layers" of data. The main layer of the database is the detection monitoring data. The database can handle additional layers such as quality assurance, pilot studies, or special projects. PNSN provides a method for separating the data for easier access by the user community. Additionally, PNSN gives the analyst an easier means of tracking the progression of a tree from entrance to exit in the database.

The FHM-IMS also needs to handle situations that occur infrequently but are nevertheless important. For example, there is a process in the FHM-IMS that handles the situation where a tree is tallied as a living tree during one visit, dead during a subsequent visit, and then living again during the current visit. It is appropriately called the zombie process. The zombie process is driven by the business rules and accordingly handles each occurrence in exactly the same predictable way.
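The situation the zombie process looks for can be sketched as a small detection routine. This is a minimal illustration, assuming each tree's history is available as (year, status) pairs with the hypothetical status codes "LIVE" and "DEAD"; the function name is_zombie is invented here, and the sketch covers only detection, not how the FHM-IMS resolves each occurrence.

```python
from typing import List, Tuple


def is_zombie(visits: List[Tuple[int, str]]) -> bool:
    """True if the tree was live, later dead, and is live at the latest visit."""
    # Order the visits by year and keep only the status codes.
    statuses = [status for _, status in sorted(visits)]
    if len(statuses) < 3 or statuses[-1] != "LIVE":
        return False
    earlier = statuses[:-1]
    if "LIVE" not in earlier:
        return False
    # A DEAD tally must follow the first LIVE tally among the earlier visits.
    first_live = earlier.index("LIVE")
    return "DEAD" in earlier[first_live + 1:]


zombie = is_zombie([(1991, "LIVE"), (1994, "DEAD"), (1998, "LIVE")])
died = is_zombie([(1991, "LIVE"), (1994, "LIVE"), (1998, "DEAD")])
late_entry = is_zombie([(1991, "DEAD"), (1994, "LIVE"), (1998, "LIVE")])
```

Because the rule is a pure function of the visit history, every occurrence is classified the same way regardless of when or by whom the check is run.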
The FHM-IMS also contains processes that derive additional data from the field data and then store them in the database. Calculating the percent of area for each condition class on a plot is one of those processes. This process would take too long to run in real time and therefore is run only once, after the data are loaded. The result can then be accessed by the same means as the field data for use in further analysis. This fulfills the requirement that the percent of area data can be used for area expansion in the map plot design.

View-Like Units (VLU) Creation

For most users the relational database and relational technology can be confusing. The database design is laid out like a London subway map, with data contained in several tables. In this form the data are not very useful to the end user. The information management group has created several tables based on what the user community has determined to be most useful for data analysis. The tables that are most useful for the user community are the PLOT_VIEW, POINT_VIEW, CONDITION_CLASS_VIEW, and TREE_VIEW. The TREE_VIEW was essentially designed to be one-stop shopping for the end user. It contains plot, point, condition class, and tree level information. These VLUs have been constructed in a manner that allows quick access to the data in the database.

FHM-IMS Audit System

The FHM-IMS audit system uses the built-in features of the database to track changes to the database after the data have been verified. FHM data are not static, and the audit system provides a means of auditing data changes.

Data Distribution

Creating easy-to-understand and useful tables, tailor-made for the end user, is only part of the task. The FHM-IMS needs to provide access to the data as well as provide the data to the end user in a usable format.
The Oracle Data Browser is a tool that is available to the user and allows the user to graphically view the data contained in the VLUs. The end user can connect to the database and point and click on the attributes they require. The Oracle database can be reached across the network as easily as if the database were sitting on the user's desk.

For those users who do not have access to the Internet, the FHM-IM staff provides the data to the user community in SAS data sets. The SAS data sets mirror the original VLU tables. The data sets are compressed and available through FTP or direct mail. In some cases the data can be output to ASCII comma-delimited files. This allows the data to be read into whatever format is most comfortable for the end user.

Conclusions

The FHM-IMS provides a flexible, efficient, portable, and scalable system for loading, verifying, maintaining, and distributing a change history of forestry data taken on a permanent plot network. The FHM database design is robust enough to handle data for many studies, allowing data to be shared among studies where appropriate. The paradigm of storing certain "permanent" data only one time in the FHM database has improved consistency and given analysts higher quality data. The Tally cracker program allows configuration files to be updated on a yearly basis and has eliminated the need for special programming to load data files.

The general approach taken by FHM Information Management in solving the data verification problem has allowed the system to be extremely flexible and portable. It has also cut down on programming maintenance because all logic checks are SQL queries, and these are simple to construct.

The FHM-IMS has addressed the needs of the user community by presenting data to the user in a variety of formats. The use of the FHM-IMS has increased the quality of the data presented to the user community. It has also provided a flexible framework that allows data to be processed efficiently even when design changes take place within the FHM sampling system.