
Knowledge-Based Approach to
Watershed-Scale TMDL Assessment1
Keith Reynolds2
Mark Jensen 3
James Andreasen 4
Iris Goodman 5
Abstract: The Ecosystem Management Decision Support
(EMDS) system is an application framework for knowledge-based
decision support of ecological landscape analysis at any geographic
scale. The system integrates geographic information system and
knowledge base system technologies to provide an analytical tool
for environmental assessment and monitoring. The basic objective
of EMDS is to improve the quality and completeness of environmental assessments and the efficiency with which they are performed.
The USDA Forest Service and Environmental Protection Agency
have cooperatively developed an EMDS knowledge base for assessment and monitoring of ecological states and processes in 6th code
watersheds. The knowledge base evaluates watershed processes,
patterns, general effects of human influence, and specific effects on
salmon habitat.
The Total Maximum Daily Load (TMDL) program, Section 303(d) of the Clean Water Act, identifies sources of
pollution remaining after end-of-pipe discharges are regulated and the best available technology is applied. Remaining
sources of pollutants are termed non-point sources (NPS).
Under requirements of the Act, States develop lists of waters
that do not meet State water quality standards, even after
point sources of pollution have installed required levels of
pollution control technology. States must establish priority
rankings based on severity of pollution and beneficial uses of
water bodies, such as recreation or fishing, and must develop
TMDLs for waters on the lists. TMDLs specify amounts of
pollutants that need to be reduced to meet State water
quality standards and allocate pollution control responsibilities among pollution sources in a watershed. The U.S.
Environmental Protection Agency (EPA) has established
a five-step approach to setting TMDLs: (1) identify waters requiring TMDLs; (2) establish priority rankings and targets;
(3) develop TMDLs; (4) implement control actions; and
(5) assess control actions.
1Paper presented at the North American Science Symposium: Toward a
Unified Framework for Inventorying and Monitoring Forest Ecosystem
Resources, Guadalajara, Mexico, November 1-6, 1998.
2Keith Reynolds is a Research Forester, USDA Forest Service, Pacific
Northwest Research Station, 3200 SW Jefferson Way, Corvallis, OR 97331,
Phone: (541) 750-7434; Fax: (541) 750-7329; e-mail:
3Mark Jensen is a Hydrologist, USDA Forest Service, Northern Region
Headquarters, located at Missoula, MT.
4James Andreasen is a Research Fisheries Biologist, U.S. Environmental
Protection Agency, National Center for Environmental Assessment, located
at Washington, DC, Headquarters.
5Iris Goodman is a Research Ecologist, U.S. Environmental Protection
Agency Landscape Ecology Branch, located at Las Vegas, NV.
USDA Forest Service Proceedings RMRS-P-12. 1999
Conventional TMDL development is carried out on individual stream reaches by analyzing impaired conditions.
Conventional methods for TMDL analysis cannot address
the spatial and temporal scales required to: (1) establish
adequate reference conditions for NPS parameters; (2) estimate the predictive capabilities of scale relations for spatially continuous ecoregions; (3) project likely scenarios of
water quality change due to changes in land use, cover, or
climate; (4) relate monitoring technologies and standards to
defined ecoregional scales; and (5) establish schedules for
TMDL development that are ecologically meaningful and
compatible with Federal Agency responsibilities under the
Endangered Species Act.
The EPA Office of Research and Development and the
Forest Service (U.S. Department of Agriculture) are cooperatively developing new analytical techniques for landscape-scale TMDL assessment, using knowledge-based processing of landscape databases that enable environmental
managers to make better decisions. The objectives of this
study were to design a knowledge base as a logical framework for assessment of 6th code watershed condition and
illustrate its application in landscape analysis with the
Ecosystem Management Decision Support (EMDS) system
(Reynolds 1999a; Reynolds and others 1997a, 1997b).
Materials and Methods
NetWeaver Knowledge Bases
This section summarizes key concepts and constructs
related to design and use of NetWeaver knowledge bases
(Stone and others 1986). Reynolds (1999b) gives a more
detailed description of the technology as implemented in
EMDS. Formally, a knowledge base is a meta database
that provides a specification for interpreting information.
Knowledge bases in this sense effectively are cognitive
maps of the elements in a problem domain and the logical
relations among those elements. In the context of watershed
assessment, for example, the elements of the problem are
typically ecosystem states and processes related to vegetation structure and composition, water quality, stream flow
properties, etc.
The primary structural element of a knowledge base as
implemented in NetWeaver is the network whose function is
to evaluate a proposition. The key attribute of a network is
its truth value, which is a measure of the degree to which
the proposition is true, based upon the state of logically
antecedent conditions. NetWeaver networks are recursive
insofar as a network may be evaluated in terms of other
networks. For example, the network for watershed processes (Figure 1) is evaluated in terms of its logically
antecedent networks hydrologic processes, erosion processes, and fire processes (hereafter, NetWeaver objects
are identified in bold type). Thus, the proposition that
watershed processes are within a suitable range of reference
conditions is true to the degree that the propositions associated with its logically antecedent networks are true. The
network architectures under hydrologic processes, erosion processes, and fire processes define the manner in
which these networks are evaluated in turn and so on.
Logical operators in NetWeaver such as AND and OR are
fuzzy logic operators. That is, they perform fuzzy math
operations that propagate truth values, derived at the level
of data links, upward through the logical structure of a
knowledge base. Zadeh (1965, 1968) presented basic concepts of approximate reasoning with fuzzy logic. Subsequent
concept papers (Zadeh 1975a, 1975b, 1975c) elaborated on
the syntax and semantics of linguistic variables, laying the
foundation for what has become a significant new branch of
mathematics. Fuzzy logic is concerned with quantification of
set membership and associated set operations.
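The recursive evaluation of networks and the upward propagation of truth values can be illustrated with a minimal Python sketch. This is not NetWeaver's implementation. It assumes truth values on a 0 to 1 scale and uses Zadeh's minimum and maximum as the fuzzy AND and OR; the scale and operators actually used by NetWeaver may differ. The data link names and truth values under each network are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class DataLink:
    """Elementary node: holds a truth value already derived from data."""
    name: str
    truth: float  # assumed scale: 0.0 = fully false, 1.0 = fully true

    def evaluate(self) -> float:
        return self.truth


@dataclass
class Network:
    """Evaluates a proposition from its logically antecedent nodes."""
    name: str
    operator: Callable[[List[float]], float]
    antecedents: List[Union["Network", DataLink]] = field(default_factory=list)

    def evaluate(self) -> float:
        # Recursion: each antecedent may itself be a network.
        return self.operator([node.evaluate() for node in self.antecedents])


def fuzzy_and(values: List[float]) -> float:
    """Zadeh AND (assumption): the weakest premise limits the result."""
    return min(values)


def fuzzy_or(values: List[float]) -> float:
    """Zadeh OR (assumption): the strongest premise carries the result."""
    return max(values)


# Hypothetical data links and truth values under the three networks of Figure 1.
watershed_processes = Network("watershed processes", fuzzy_and, [
    Network("hydrologic processes", fuzzy_and,
            [DataLink("hypothetical flow item A", 0.9),
             DataLink("hypothetical flow item B", 0.7)]),
    Network("erosion processes", fuzzy_and,
            [DataLink("surface erosion", 0.6),
             DataLink("sediment delivery", 0.8)]),
    Network("fire processes", fuzzy_or,
            [DataLink("hypothetical fire item A", 0.4),
             DataLink("hypothetical fire item B", 0.5)]),
])

print(watershed_processes.evaluate())  # 0.5
```

In this sketch the top-level truth value is limited by the weakest antecedent network, which is the behavior expected of a conjunction of premises.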
Data links (graphically illustrated as rectangles in figures) essentially are elementary networks. Like networks,
data links may evaluate a proposition, yielding a truth value
(although data links do not necessarily evaluate anything).
The primary distinction between networks and data links
that yield truth values is that data links only evaluate data
rather than a logical expression of antecedent conditions.
Data links evaluate a proposition by comparing the value of
a data item, or the result of a mathematical expression
involving one or more data items, to an argument that
defines the conditions under which the proposition is considered true. An argument may test for a simple true/false
condition as in classical rule-based systems based on bivalent logic, or an argument may be a fuzzy membership
function that tests an observed value's degree of membership in a fuzzy subset (Kaufmann 1975, Zadeh 1992). A fuzzy
membership function provides an explicit mathematical
expression for testing an observation's degree of affinity for
the concept represented by the fuzzy subset.
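The difference between a bivalent argument and a fuzzy membership function can be sketched as follows. The linear ramp shape, the 0 to 1 truth scale, the function names, and the numeric values are all assumptions made for illustration; the source does not specify the shapes of the membership functions used in the knowledge base.

```python
def crisp_argument(value: float, threshold: float) -> float:
    """Bivalent test: the proposition is true only at or below the threshold."""
    return 1.0 if value <= threshold else 0.0


def ramp_membership(value: float, full_true: float, full_false: float) -> float:
    """Linear fuzzy membership (assumed shape).

    Returns 1.0 at `full_true`, 0.0 at `full_false`, interpolates linearly
    in between, and clamps outside that interval. Works whether larger or
    smaller observed values are the favorable ones.
    """
    if full_true == full_false:
        raise ValueError("full_true and full_false must differ")
    fraction = (value - full_false) / (full_true - full_false)
    return max(0.0, min(1.0, fraction))


# Hypothetical data item: an observed value of 20 compared against an
# argument that is fully true at 10 or less and fully false at 30 or more.
print(crisp_argument(20.0, 15.0))         # 0.0
print(ramp_membership(20.0, 10.0, 30.0))  # 0.5
```

The crisp test discards the information that the observation lies midway between the two limits, whereas the ramp reports partial membership.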
Problem Domain
Given the objectives of the EPA water quality assessment
program discussed in the introduction, the primary knowledge base topics included in design were watershed processes, watershed patterns, general effects of human influence, and specific effects of human influences on aquatic
species (Table 1). A key decision, made early in the design
process, was that the method of assessment be sufficiently
general for application in any geographic region. This design
criterion was implemented by constructing all fuzzy membership functions as dynamically-defined functions of data
representing standards. That is, all fuzzy membership functions in the knowledge base are defined by standards input
during analysis. All data are evaluated by comparison to
standards for which we conceptually distinguish three basic
types: reference conditions representing attributes of
unmanaged watersheds, management standards set by resource management agencies such as the USDA Forest
Service, and regulatory standards set by regulatory agencies such as the EPA.

Figure 1.-Network for watershed processes. The truth of the proposition that watershed processes are within a suitable range of conditions depends on the degree to which its three premises, represented
by the networks, hydrologic processes, erosion processes, and fire
processes, are true.
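The idea of defining membership functions from standards supplied as data can be sketched in Python as follows. The table layout, the region names, the item name, the numeric limits, and the linear ramp shape are all hypothetical; the source states only that membership functions are defined by standards input during analysis.

```python
from typing import Callable, Dict, Tuple

# Hypothetical standards table keyed by (region, standard type, data item).
# Each entry gives the value at which the proposition is fully true and the
# value at which it is fully false.
StandardsTable = Dict[Tuple[str, str, str], Tuple[float, float]]

standards: StandardsTable = {
    ("region_a", "reference", "surfErosMean"): (2.0, 6.0),
    ("region_b", "reference", "surfErosMean"): (5.0, 15.0),
}


def build_membership(table: StandardsTable, region: str,
                     kind: str, item: str) -> Callable[[float], float]:
    """Create a membership function at analysis time from a standards entry."""
    full_true, full_false = table[(region, kind, item)]
    span = full_true - full_false
    if span == 0:
        raise ValueError("standard limits must differ")

    def membership(value: float) -> float:
        # Assumed linear ramp, clamped to the 0..1 truth scale.
        return max(0.0, min(1.0, (value - full_false) / span))

    return membership


# The same observation is judged differently under each region's standards.
region_a = build_membership(standards, "region_a", "reference", "surfErosMean")
region_b = build_membership(standards, "region_b", "reference", "surfErosMean")
print(region_a(5.0), region_b(5.0))  # 0.25 1.0
```

Because the logic is fixed and only the standards table changes, the same knowledge base structure can in principle be reused across regions, which is the design goal described above.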
Knowledge Base Application in EMDS
Major components of the EMDS system (Reynolds 1999a)
include the NetWeaver knowledge base system, the EMDS
Arcview application extension, and the Assessment system
(Figure 2). This section briefly summarizes system structure
and function in terms of system level objects, their methods,
and relations. More detailed descriptions of the system are
provided in Reynolds (1999a) and Reynolds and others
(1997a, 1997b).
The NetWeaver knowledge base system (Reynolds 1999b)
is composed of an engine and a graphic user interface for
knowledge base developers that provides controls for designing, editing and interactively evaluating knowledge bases
(Figure 2). Primary components of the EMDS Arcview application extension are the DataEngine and MapDisplay objects that customize the Arcview environment with methods
and data structures required to integrate NetWeaver's
knowledge-based reasoning schema into Arcview (Figure 2).
The Assessment system is a graphic user interface to the
NetWeaver engine for EMDS application end-users that
controls setup and running of analyses, runtime editing of
knowledge bases, and display of maps, tables, graphs, and
evaluated knowledge base state related to analyses.

Table 1.-Primary networks in the knowledge base for assessing
watershed condition.

Network name          Proposition evaluated by network
watershed processes   Watershed processes are within acceptable ranges.
watershed patterns    Watershed patterns are within acceptable ranges.
human influence       Aggregate effects of human influence are within acceptable ranges.
aquatic species       Likelihood of long-term viability of aquatic species is good.

Figure 2.-System level object diagram of EMDS system. Lines indicate object relations and annotations on lines indicate
primary nature of the relation ("uses" indicates a general relation in which several to many methods of the used object are
relevant). Text items within objects of the form "xxx()" indicate object methods. Only key methods are shown for each object.
Results
The primary networks for assessing watershed condition
are watershed processes, watershed patterns, human
influence, and aquatic species. Each network evaluates
a specific proposition about the state of watershed condition
(Table 1). An example analysis of erosion processes in a
portion of the Columbia River Basin was performed to
illustrate landscape application of the knowledge base for
watershed condition in EMDS (Figure 3). The Assessment
system (Figure 2) was used to specifically select the erosion
processes network (Figure 1) for evaluation in our example. In general, the Assessment system can be used to
select any combination of networks for analysis. Map output
shows the computed truth value for the proposition that
erosion processes are within a suitable range of conditions
for each 6th code watershed in the assessment area selected
for this example.

Figure 3.-Truth value map for the proposition that erosion processes in 6th code watersheds are within a suitable range of
reference conditions.

Table 2.-Propositions associated with networks antecedent to the
erosion processes network.

Network name        Proposition evaluated by network
erosion processes   Erosion processes are within suitable ranges.
surface erosion     Amount of surface erosion is within a suitable range.
mass wasting        Amount of mass wasting is within a suitable range.
debris avalanche    Amount of debris avalanche is within a suitable range.
sediment delivery   Amount of sediment delivery is within a suitable range.

Partial evaluations, based on currently available data,
can be performed in EMDS. Truth values for erosion
processes in the map output (Figure 3) only reflect a partial
evaluation of the network because data values for volumes
of mass wasting and debris avalanche are missing in our
example (Table 2). Ecological assessments frequently are
broad in conceptual scope, requiring evaluation of possibly
numerous and diverse data. Consequently, several to many
data elements needed for complete evaluation of a knowledge base or any of its components may be missing at the
start of an assessment. In our example, complete evaluation
of erosion processes requires data values for volumes of
surface erosion, sediment delivery to streams, mass wasting, and debris avalanche, but only the first two data
elements were available at the time of analysis. However,
given the set of knowledge base objects and their logical
organization within the knowledge base, the NetWeaver
engine computes the relative influence of missing data
(Figure 4).
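One way to picture partial evaluation and the influence of missing data is the following Python sketch. It is not the algorithm used by the NetWeaver engine, which is not described here. It assumes a 0 to 1 truth scale, a fuzzy AND implemented as the minimum, and a simple one-at-a-time sweep in which each missing item is varied from fully false to fully true while the other missing items are held at fully true. The truth values for the two available items are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Observed = Dict[str, Optional[float]]  # None marks a missing data value
Truths = Dict[str, float]


def erosion_processes(truths: Truths) -> float:
    """Assumed fuzzy AND (minimum) over the four antecedents in Table 2."""
    return min(truths["surface erosion"], truths["sediment delivery"],
               truths["mass wasting"], truths["debris avalanche"])


def fill(observed: Observed, default: float,
         overrides: Optional[Truths] = None) -> Truths:
    """Replace missing values with `default`, or with a per-item override."""
    overrides = overrides or {}
    return {name: (overrides.get(name, default) if value is None else value)
            for name, value in observed.items()}


def truth_bounds(network: Callable[[Truths], float],
                 observed: Observed) -> Tuple[float, float]:
    """Lowest and highest truth value consistent with the available data.

    Valid for networks that never decrease when an input increases,
    which holds for minimum and maximum operators.
    """
    return network(fill(observed, 0.0)), network(fill(observed, 1.0))


def missing_influence(network: Callable[[Truths], float],
                      observed: Observed) -> Truths:
    """Hypothetical influence score: outcome range swept by each missing item."""
    scores = {}
    for name, value in observed.items():
        if value is None:
            low = network(fill(observed, 1.0, {name: 0.0}))
            high = network(fill(observed, 1.0, {name: 1.0}))
            scores[name] = high - low
    return scores


# Hypothetical truth values for the two items available in the example.
observed = {"surface erosion": 0.6, "sediment delivery": 0.8,
            "mass wasting": None, "debris avalanche": None}

print(truth_bounds(erosion_processes, observed))       # (0.0, 0.6)
print(missing_influence(erosion_processes, observed))
```

Under these assumptions the two missing items have equal influence, because either one alone could drive the conjunction to fully false; a knowledge base with a deeper or more varied structure would generally rank missing items differently.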
Finally, the Hotlink browser (Figure 2) provides a
means to examine details underlying an evaluation, by
allowing the user to view the evaluated state of the knowledge base for any landscape feature selected on a truth
value map (Figure 5).
Discussion
Application of fuzzy logic to natural resource science
and management is still relatively new. General areas of
application include classification in remote sensing (Blonda
1996), environmental risk assessment (Holland 1994), phytosociology (Moraczewski 1993a, 1993b), geography
(Openshaw 1996), ecosystem research (Salski and Sperlbaum
1991), and environmental assessment (Smith 1995, 1997).
More specific applications include catchment modeling
(Anonymous 1994), cloud classification (Baum et al. 1997),
evaluation of plant nutrient supply (Hahn et al. 1995), soil
interpretation (Mays et al. 1997, McBratney and Odeh 1997),
and land suitability for crop production (Ranst et al. 1996).
The knowledge base for evaluation of watershed condition
was designed for general application. The architecture is
such that the knowledge base should be applicable in any
geographic region with no more than minor adaptation.
Specification of standards as data to be read from a database
is an important ingredient of this general applicability.
Clearly, our approach to a general solution begs the question, "Where do specifications for reference conditions come
from?" We suggest the following approach. For any geographic region or subregion, the vegetation potential of
unmanaged watersheds is conditioned by geographic and
climatic factors (Whittaker 1975). Widely available synecological analysis tools such as detrended correspondence
analysis (Hill and Gauch 1980) provide a basis for arranging
watersheds along geographic and climatic gradients, and
identifying groupings indicative of reasonably separable
vegetation potentials in the absence of management. Most
resource management agencies have sufficiently detailed
GIS coverages to identify watersheds that have experienced
little or no management, and the attributes of such watersheds can be used as reference conditions.
The knowledge-based reasoning schema of NetWeaver
uses an object- and fuzzy logic-based propositional network
architecture for knowledge representation (Reynolds 1999b).
The system facilitates evaluation of complex, abstract topics
such as water quality that depend on numerous, diverse
subordinate conditions because NetWeaver is fundamentally logic based. The object-based architecture of NetWeaver
knowledge bases is conducive to incremental, evolutionary
design of complex knowledge representations (Booch 1994),
which has been recognized as crucial to successful design of
complex systems (Gall 1986). The propositional network
architecture of NetWeaver knowledge bases provides both
the ability to evaluate the influence of missing information
and the ability to reason with incomplete information
(Reynolds and others 1997a, 1997b).

Figure 4.-Relative influence of missing data with respect to completing an analysis of
erosion processes. [Bar chart of relative influence; legend entries: massWasteMean, surfErosMean, massWasteSD, sedDelivSD, massWasteQ4, massWasteQ2.]

Figure 5.-The EMDS NetWeaver browser displays the evaluated state of a knowledge base for selected landscape features in EMDS map outputs.

Use of fuzzy logic in NetWeaver affords significant practical advantages over Bayesian belief networks (Ellison
1996, Howard and Matheson 1981) and classical rule-based
knowledge representations that depend on bivalent (e.g.,
yes/no or true/false) logic (Waterman 1986, Jackson 1990) in
the context of knowledge bases that are conceptually broad
and that include a wide variety of topics. Bayesian belief
networks work well on narrow, well-defined problems, and
may be preferable to fuzzy logic networks when conditional
probabilities of outcomes are known. However, Bayesian
belief networks are difficult to apply to large, general problems because the number of conditional probabilities that
must be specified can quickly become extremely large as
the conceptual scope of a problem increases. In such situations, model design not only becomes difficult to manage, but
many probabilities will not be well characterized and will
therefore need to be supplied by expert judgment, thus negating much of the value to be gained by a more statistically-based
approach to knowledge representation. Similarly, the number of rules required in a bivalent logic knowledge base
increases to unmanageable levels as soon as the model designer attempts to account for shades of outcomes such as
poor, fair, good, excellent, etc. These arguments should not
be taken to imply that fuzzy logic networks are inherently
superior to other forms of knowledge representation. On the
contrary, the various methods just discussed may be highly
complementary to one another. In particular, we believe
that fuzzy logic networks are ideally suited as logical frameworks for integrating model results from a variety of
analytical systems such as simulators, linear programs,
Bayesian belief networks, and rule bases.
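The growth in specification effort described above can be made concrete with a short calculation. The sketch below counts the independent conditional probabilities needed for a single node of a discrete Bayesian belief network, and the number of condition combinations an exhaustive bivalent rule base would have to cover. The node sizes are hypothetical, and the rule count assumes one rule per combination of condition levels, which is a simplification.

```python
import math
from typing import List


def cpt_entries(parent_states: List[int], child_states: int) -> int:
    """Independent probabilities in one node's conditional probability table.

    One distribution over the child is needed for every combination of
    parent states, and each distribution has (child_states - 1) free values.
    """
    return math.prod(parent_states) * (child_states - 1)


def rule_combinations(condition_levels: List[int]) -> int:
    """Combinations of condition levels an exhaustive rule base must cover."""
    return math.prod(condition_levels)


# Four binary parents and a binary child.
print(cpt_entries([2, 2, 2, 2], 2))     # 16
# Four parents graded poor/fair/good/excellent, child graded the same way.
print(cpt_entries([4, 4, 4, 4], 4))     # 768
# Ten such parents.
print(cpt_entries([4] * 10, 4))         # 3145728
# Exhaustive rules over four conditions: two levels versus four levels each.
print(rule_combinations([2, 2, 2, 2]))  # 16
print(rule_combinations([4, 4, 4, 4]))  # 256
```

Both counts grow multiplicatively with the number of antecedents and with the number of grades per antecedent, which is the scaling problem the argument above relies on.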
Conclusions
A knowledge-based approach to landscape analysis for
TMDL assessment was shown to be quite feasible with
application of the EMDS system despite the broad conceptual scope of the problem domain. The complete knowledge
base has large data requirements, but any combination of
networks, representing subsets of the full knowledge base,
may be selected for analysis. Key advantages of a landscape
analysis based on fuzzy logic networks as implemented in
NetWeaver and used in EMDS include the ability to reason
with incomplete information, and the ability to evaluate the
influence of missing information. Fuzzy-logic based landscape analysis may be most useful for construction of logical
frameworks within which a wide variety of analytical results
can be effectively integrated into a single, coherent analysis.
