From: AAAI Technical Report WS-98-09. Compilation copyright © 1998, AAAI (www.aaai.org). All rights reserved.
Mixed Depth Representations in the B2 System

Susan W. McRoy 1,2, Susan M. Haller 3, and Syed S. Ali 1

1 Department of Mathematical Sciences
2 Department of Electrical Engineering and Computer Science
University of Wisconsin-Milwaukee
Milwaukee, WI 53201

3 Computer Science and Engineering Department
University of Wisconsin-Parkside
Kenosha, WI 53141
Introduction
This paper gives an overview of our tutoring system, B2. It describes the natural language and knowledge representation components of B2 and our approach to the representation of questions and requests. The domain that we have developed most thoroughly helps medical students learn a statistical model for medical diagnosis, and many of the examples will be taken from this domain. According to B2's plans for tutoring, the system generates story problems that describe a scenario and then asks the student about conclusions that might be drawn. B2 also supports requests to explain its reasoning and questions about facts.
The B2 Architecture
The B2 system consists of seven components (see
Figure 1). In the diagram, solid, directed arrows
indicate the direction of information flow between
components. The system gets the user’s input using
a graphical user interface that supports both natural language interaction
and mouse inputs. The
Parser component of the Parser/Generator performs
the first level of processing on the user input using
its grammar and the domain information from the
Knowledge Representation Component. The Parser
interprets the user’s inputs to form propositional representations of surface-level utterances for the Discourse Analyzer. The Generator produces natural
language outputs from the text messages (propositional descriptions of text) that it receives from the
Discourse Planner.
The system as a whole is controlled by a module called the Discourse Analyzer. The Discourse
Analyzer determines an appropriate response to the
user's actions on the basis of a model of the discourse
and a model of the domain, stored in the knowledge representation
component. The Analyzer invokes the Discourse Planner to select the content of
the response and to structure it. The Analyzer relies on a component called the Mediator to interact
with the Bayesian network processor. This Mediator
processes domain level information, such as ranking
the effectiveness of alternative diagnostic tests. The
Mediator also handles the information interchange
between the propositional information that is used
by the Analyzer and the probabilistic data that is
used by the Bayesian network processor. All phases of this process are recorded in the knowledge representation component, resulting in a complete history of the discourse. Thus, the knowledge representation component serves as a central "blackboard" for all other components.
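To make the blackboard idea concrete, the following sketch (in Python, with entirely illustrative class and method names; it is not B2's actual code) shows components that communicate only by asserting propositions into, and reading them back from, one shared store:

# Minimal blackboard sketch: every component reads from and writes to one
# shared knowledge store, so a complete history of the interaction accumulates
# in a single place. All names below are illustrative assumptions.

class KnowledgeStore:
    """Shared propositional store recording everything any component asserts."""
    def __init__(self):
        self.propositions = []

    def assert_prop(self, source, proposition):
        self.propositions.append((source, proposition))

    def history(self):
        return list(self.propositions)


class Parser:
    def interpret(self, kb, text):
        # Surface-level propositional representation of the user's input.
        kb.assert_prop("parser", ("say", "user", text))


class DiscourseAnalyzer:
    def respond(self, kb):
        # Choose a discourse action based on what is already on the blackboard.
        _source, last_prop = kb.history()[-1]
        kb.assert_prop("analyzer", ("interpretation-of", last_prop, "request(story)"))


kb = KnowledgeStore()
Parser().interpret(kb, "Tell me a story")
DiscourseAnalyzer().respond(kb)
print(kb.history())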
During the initialization of the system, there is a
one-time transfer of information from a file that contains a specification of the Bayesian network both to
the Bayesian network processor and to the Knowledge Representation Component. The Mediator converts the specification into a propositional representation that captures the connectivity of the original
Bayesian network.
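As an illustration of this conversion, the following sketch (Python; the spec format and relation names are assumptions for exposition, not B2's implementation) turns a node-and-parents description of a Bayesian network into propositional assertions that capture only its connectivity:

# Sketch of the one-time conversion: given a Bayesian-network specification
# listing each node and its parents, emit propositional assertions recording
# the network's connectivity. The relation names "is-a" and "influences"
# are illustrative assumptions.

bn_spec = {
    "Gallstones": [],                  # no parents
    "HIDA-result": ["Gallstones"],     # the HIDA test result depends on Gallstones
}

def network_to_propositions(spec):
    propositions = []
    for node, parents in spec.items():
        propositions.append(("is-a", node, "bayesian-net-node"))
        for parent in parents:
            propositions.append(("influences", parent, node))
    return propositions

for proposition in network_to_propositions(bn_spec):
    print(proposition)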
The Knowledge Representation Blackboard
B2 represents both domain knowledge and discourse knowledge in a uniform framework as a propositional semantic network. A propositional semantic network is a framework for representing the concepts of a cognitive agent who is capable of using language (hence the term semantic). The information is represented as a graph composed of nodes and labeled directed arcs. In a propositional semantic network, the propositions are represented by the nodes, rather than the arcs; arcs represent only non-conceptual binary relations between nodes. The particular systems that are being used for B2 are SNePS and ANALOG (Ali 1994a; 1994b; Shapiro & Group 1992), which provide facilities for building and finding nodes, as well as for reasoning and truth maintenance.
Figure 1: The B2 architecture. (Components shown: the GUI, the Parser/Generator, the Discourse Analyzer, the Discourse Planner, the Knowledge Representation Component, the Mediator, and the Hugin Bayesian network processor.)
These systems satisfy the following additional constraints:
1. Each node represents a unique concept.
2. Each concept represented in the network is represented by a unique node.
3. The knowledge represented about each concept is
represented by the structure of the entire network
connected to the node that represents that concept.
These constraints allow efficient inference when processing natural language. For example, such networks can represent complex descriptions (common in the medical domain), and can support the resolution of ellipsis and anaphora, as well as general reasoning tasks such as subsumption (Ali 1994a; 1994b; Maida & Shapiro 1982; Shapiro & Rapaport 1987; 1992).
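The following sketch (Python; illustrative only, not the SNePS implementation) shows how the two uniqueness constraints can be enforced by building or finding each node exactly once, so that distinct propositions mentioning the same concept share a single node:

# Build-or-find sketch: one node per concept, one concept per node.
# Propositions are themselves nodes; arcs are labeled binary relations.

class Node:
    def __init__(self, label):
        self.label = label
        self.in_arcs = []          # (relation, proposition-node) pairs

    def __repr__(self):
        return self.label

_nodes = {}

def build_or_find(label):
    """Return the unique node for a concept, creating it only if it is new."""
    if label not in _nodes:
        _nodes[label] = Node(label)
    return _nodes[label]

def assert_proposition(prop_label, **arcs):
    """Assert a proposition node whose labeled arcs point at concept nodes."""
    prop = build_or_find(prop_label)
    for relation, target_label in arcs.items():
        build_or_find(target_label).in_arcs.append((relation, prop))
    return prop

# Two propositions about HIDA (e.g., via a MEMBER-CLASS case frame) share one
# node, so everything known about HIDA is reachable from that single node:
assert_proposition("M1", member="HIDA", cls="diagnostic-test")
assert_proposition("M2", member="HIDA", cls="imaging-procedure")
print(build_or_find("HIDA").in_arcs)   # [('member', M1), ('member', M2)]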
We characterize a knowledge representation as uniform when it allows the representation of different kinds of knowledge in the same knowledge base using the same inference processes. The knowledge representation component of B2 is uniform because it provides a representation of the discourse knowledge, domain knowledge, and probabilistic knowledge (from the Bayesian net). This supports intertask communication and cooperation for interactive processing of tutorial dialogs.
The rule in Figure 2 is a good example of how the uniform representation of information in the semantic network allows us to relate domain information (a medical case) to discourse planning information (a plan to describe it). This network represents a text plan for describing a medical case to the user. Text plans are represented as rules in the knowledge representation. Rules are general statements about objects in the domain; they are represented by using case frames¹ that have FORALL or EXISTS arcs to nodes that represent variables bound by these quantifier arcs. In Figure 2, node M13 is a rule with three universally quantified variables (at the end of the FORALL arcs), an antecedent (at the end of the ANT arc), and a consequent (at the end of the CQ arc). This means that if an instance of the antecedent is believed, then a suitably instantiated instance of the consequent is believed. M13 states that if V1 (which is at the end of the CASE-NUMBER arc) is the case number of a case, and V2 and V3 (which are at the end of CASE-INFO arcs) are two pieces of case information, then a plan to describe the case will conjoin² the two pieces of case information. Node P1 represents the concept that something is a member of the class case, and P2 represents the concept that the case concept has a case number and case information. For more details about the knowledge representation, see (McRoy, Haller, & Ali 1997).
¹ Case frames are conventionally agreed upon sets of arcs emanating from a node that are used to express a proposition. For example, to express that A is a B we use the MEMBER-CLASS case frame, which is a node with a MEMBER arc and a CLASS arc. (Shapiro et al. 1994) provides a dictionary of standard case frames. Additional case frames can be defined as needed.

² "Conjoin" is a technical term from Rhetorical Structure Theory (Mann & Thompson 1986); it refers to co-ordinate conjunction of clauses.
Figure 2: A rule stating that if V1 is the case number of a case, and V2 and V3 are two pieces of case information, then a plan for generating a description of the case will present the two pieces of information in a coordinating conjunction.
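A rough procedural rendering of this rule (Python; the tuple encodings and the sample case are illustrative assumptions, not the SNePS representation) is:

# Sketch of the Figure 2 rule: if an instance of the antecedent is believed
# (V1 is the case number of a case with case information V2 and V3), then an
# instantiated consequent is believed (the plan for describing the case
# conjoins V2 and V3).

def describe_case_rule(belief):
    if belief[0] == "case":
        _, v1, v2, v3 = belief        # bind the three universal variables
        return ("plan", ("describe-case", v1), ("conjoin", v2, v3))
    return None

beliefs = [("case", 27, "case-info-A", "case-info-B")]   # illustrative case

for belief in beliefs:
    consequent = describe_case_rule(belief)
    if consequent is not None:
        print(consequent)   # ('plan', ('describe-case', 27), ('conjoin', ...))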
The Representation of the Discourse

The discourse model has five levels of representation, shown in Figure 3. These levels capture what the student and the system have each said, as well as how their utterances extend the ongoing discourse. Unlike many systems, B2's model of discourse includes a representation of questions and requests, as well as statements of fact. (Systems that do not represent questions and requests typically give these utterances a procedural semantics, interpreting them as operations to be performed.) Having an explicit representation of questions and requests simplifies the interpretation of context-dependent utterances such as Why? or What about HIDA? (Haller 1996).³ It also allows the system to recover from misunderstandings, should they occur (McRoy 1995; McRoy & Hirst 1995). We will consider each of these levels in turn, starting with the utterance level, shown at the bottom of Figure 3.

³ HIDA stands for radio-nuclide hepatobiliary imaging, a diagnostic test.

Figure 3: Five Levels of Representation. (From bottom to top: the utterance level, the sequence of utterances, the system's interpretation of each utterance, exchanges (pairs of interpretations), and the interpretation of exchanges.)
The Utterance Level
For all inputs, the parser produces a representation of its surface content, which the analyzer will assert as part of an occurrence of an event of type SAY. The content of the user's utterance is always represented by what she said literally. In the case of requests, the student may request a story problem directly, as an imperative sentence (Tell me a story), or indirectly, as a declarative sentence that expresses a desire (I want you to tell me a story). The complete representation of the imperative sentence Tell me a story is shown in Figure 4.
Figure 4: Node B3 represents an utterance whose form is imperative, and whose content (M4) is the proposition that the hearer (B1) will tell a story (B2) to the speaker (B5).
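The following sketch (Python dictionaries with assumed keys; not B2's node structure) gives the flavor of this utterance-level record and of the corresponding record for the indirect form of the request:

# Sketch of an utterance-level record like the one in Figure 4: a SAY event
# whose content is the literal, surface-level proposition, with the sentence
# form noted alongside it. Keys and values are illustrative assumptions.

utterance_direct = {
    "event-type": "SAY",
    "form": "imperative",
    "speaker": "user",
    # literal content: the hearer (the system) tells a story to the speaker
    "content": ("tell", {"agent": "system", "object": "a story", "recipient": "user"}),
}

# The indirect request "I want you to tell me a story" gets a different surface
# content (a declarative 'want' proposition), though the same interpretation
# will be assigned to it later, at the interpretation level.
utterance_indirect = {
    "event-type": "SAY",
    "form": "declarative",
    "speaker": "user",
    "content": ("want", "user",
                ("tell", {"agent": "system", "object": "a story", "recipient": "user"})),
}

print(utterance_direct)
print(utterance_indirect)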
For the system's utterances, the utterance level representation corresponds to a text generation event (this contains much more fine-grained information about the system's utterance, such as mode and tense). The content of the system's utterance is the text message that is sent to the language generator.
Sequence of Utterances

The second level corresponds to the sequence of utterances. (This level is comparable to the linguistic structure in the tripartite model of (Grosz & Sidner 1986).) In the semantic network, we represent the sequencing of utterances explicitly, with asserted propositions that use the BEFORE-AFTER case frame. The order in which utterances occurred (system and user) can be determined by traversing these structures. This representation is discussed in detail in (McRoy, Haller, & Ali 1997).
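The sketch below (Python; the pair encoding is an assumption) shows how explicit BEFORE-AFTER assertions allow the utterance order to be recovered by traversal:

# Sketch: explicit BEFORE-AFTER assertions, one per adjacent pair of utterances,
# let the order of the dialog be recovered by walking the chain.

before_after = [
    ("utt-1", "utt-2"),   # proposition: utt-1 occurred before utt-2
    ("utt-2", "utt-3"),
]

def utterance_order(pairs):
    """Walk the BEFORE-AFTER chain starting from the utterance nothing precedes."""
    afters = {after for _, after in pairs}
    successor = dict(pairs)
    current = next(before for before, _ in pairs if before not in afters)
    ordered = [current]
    while current in successor:
        current = successor[current]
        ordered.append(current)
    return ordered

print(utterance_order(before_after))   # ['utt-1', 'utt-2', 'utt-3']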
The Interpretation Level

In the third level, we represent the system's interpretation of each utterance. Each utterance event (from level 1) will have an associated system interpretation, which is represented using the INTERPRETATION_OF-INTERPRETATION case frame. For example, consider the interpretation of the utterance Tell me a story (as well as I want you to tell me a story), shown in Figure 5. (Every utterance has one or more interpretations; at any time, only one is believed, and a justification-based truth maintenance system is used to track changes in belief.)

Figure 5: Node M31 is a proposition that the interpretation of Tell me a story (which is glossed in this figure) is M22. Node M22 is the proposition that the user requested that the system describe a case to the user. (Describing a case is a domain-specific action; the pronouns from the utterance level have been interpreted according to the context.)
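A simplified rendering of this bookkeeping (Python; the dictionary encoding and the revise helper are assumptions, standing in for the truth maintenance system) is:

# Sketch of the interpretation level: an utterance event may have several
# candidate interpretations, only one of which is believed at a time.
# (B2 uses a justification-based truth maintenance system for this; the
# structures below are merely illustrative.)

interpretations = {
    "utt-tell-me-a-story": [
        {"content": "the user requested that the system describe a case",
         "believed": True},
        {"content": "the user requested some other kind of narrative",
         "believed": False},
    ],
}

def revise(utterance, new_index):
    """Retract the currently believed candidate and believe another instead."""
    for i, candidate in enumerate(interpretations[utterance]):
        candidate["believed"] = (i == new_index)

revise("utt-tell-me-a-story", 0)
print([c["content"] for c in interpretations["utt-tell-me-a-story"] if c["believed"]])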
The Exchange Interpretation and Exchange Levels

The fourth and fifth levels of representation in our discourse model are exchanges and interpretations of exchanges, respectively. A conversational exchange is a pair of interpreted events that fit one of the conventional structures for dialog (e.g., QUESTION-ANSWER). Figure 6 gives the network representation of a conversational exchange and its interpretation. Node M113 represents the exchange in which the system has asked a question and the user has answered it. Using the MEMBER-CLASS case frame, propositional node M115 asserts that the node M113 is an exchange. Propositional node M112 represents the system's interpretation of this exchange: that the user has accepted the system's question (i.e., that the user has understood the question and requires no further clarification). Finally, propositional node M116 represents the system's belief that node M112 is the interpretation of the exchange represented by node M113.

Figure 6: Node M115 represents the proposition that node M113 is an exchange comprised of the events M99 and M108. M108 is the proposition that The user answered "HIDA is the best test to rule in Gallstones". Additionally, node M116 represents the proposition that the interpretation of M113 is event M112. M112 is the proposition that the user has accepted M96. (M96 is the question that the system asked in event M99.)
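The structure of Figure 6 can be sketched as follows (Python tuples; the glosses of the events are assumptions based on the caption, not the actual node contents):

# Sketch of the fourth and fifth levels, following Figure 6.

m99 = ("event", "M99", "the system asked its question (M96)")
m108 = ("event", "M108",
        'the user answered "HIDA is the best test to rule in Gallstones"')

m113 = ("exchange", m99, m108)               # a QUESTION-ANSWER pair
m115 = ("member-class", m113, "exchange")    # M115: M113 is an exchange
m112 = ("accept", "user", "M96")             # M112: the user accepted question M96
m116 = ("interpretation-of", m113, m112)     # M116: M112 interprets M113

for node in (m115, m116):
    print(node)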
Interaction among the Levels

A major advantage of the network representation is the knowledge sharing between these five levels. We term this knowledge sharing associativity. It occurs because the representation is uniform and every concept is represented by a unique node (see the section on the knowledge representation blackboard). As a result, we can retrieve and make use of information that is represented in the network implicitly, by the arcs that connect propositional nodes. For example, if the system needed to explain why the user had said HIDA, it could follow the arcs from the node representing the utterance that User said HIDA to the system's interpretation of that utterance, node M108, to determine that

- The user's utterance was understood as the answer within an exchange (node M113), and
- The user's answer indicated her acceptance and understanding of the discourse up to that point (node M112).

This same representation could be used to explain why the system believed that the user had understood the system's question. This associativity in the network is vital if the interaction starts to fail.
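The kind of arc-following this enables can be sketched as follows (Python; the link table and relation names are stand-ins for the shared semantic network, not B2's actual structures):

# Sketch of associativity: starting from the node for the user's "HIDA"
# utterance, follow explicit links to its interpretation, to the exchange that
# contains it, and to the interpretation of that exchange.

links = {
    ("utt-HIDA", "interpretation-of"): "M108",  # interpretation of the utterance
    ("M108", "answer-within"): "M113",          # M108 answers within exchange M113
    ("M113", "interpretation-of"): "M112",      # M112: the user accepted the question
}

def explain(utterance):
    interpretation = links[(utterance, "interpretation-of")]
    exchange = links[(interpretation, "answer-within")]
    exchange_interpretation = links[(exchange, "interpretation-of")]
    return (f"{utterance} was interpreted as {interpretation}, the answer within "
            f"exchange {exchange}, which the system took as acceptance "
            f"({exchange_interpretation}).")

print(explain("utt-HIDA"))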
Summary

The goal of the B2 project is to give students an opportunity to practice their decision-making skills. We give students the opportunity to ask the system to explain what factors were most influential to its decision and why.
The natural language processing and knowledge representation components of B2 are general purpose. B2 builds a five-level model of the discourse that represents what was literally said, what was meant, and how each utterance and its interpretation relate to previous ones. This is necessary because students' utterances may be short and ambiguous, requiring extensive reasoning about the domain or the discourse model to fully resolve. We have shown how our mixed-depth representations encode syntactic and conceptual information in the same structure. This allows us to defer any extensive reasoning until it is needed, rather than doing it when parsing. We use the same representation framework to produce a detailed representation of requests and to produce a representation of questions. The representations use the same knowledge representation framework that is used to reason about discourse processing and domain information, so that the system can reason with (and about) the utterances.

Acknowledgements

This work was supported by the National Science Foundation, under grants IRI-9701617 and IRI-9523666, and by a gift from the University of Wisconsin Medical School, Department of Medicine.

References
Ali, S. S. 1994a. A Logical Language for Natural Language Processing. In Proceedings of the 10th Biennial Canadian Artificial Intelligence Conference, 187-196.

Ali, S. S. 1994b. A "Natural Logic" for Natural Language Processing and Knowledge Representation. Ph.D. Dissertation, State University of New York at Buffalo, Computer Science.

Grosz, B. J., and Sidner, C. L. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics 12.

Haller, S. 1996. Planning text about plans interactively. International Journal of Expert Systems, 85-112.

Maida, A. S., and Shapiro, S. C. 1982. Intensional concepts in propositional semantic networks. Cognitive Science 6(4):291-330. Reprinted in R. J. Brachman and H. J. Levesque, eds., Readings in Knowledge Representation, Morgan Kaufmann, Los Altos, CA, 1985, 170-189.

Mann, W., and Thompson, S. 1986. Rhetorical structure theory: Description and construction of text structures. In Kempen, G., ed., Natural Language Generation. Boston: Kluwer Academic Publishers. 279-300.
McRoy, S. W., and Hirst, G. 1995. The repair of speech act misunderstandings by abductive inference. Computational Linguistics 21(4):435-478.

McRoy, S.; Haller, S.; and Ali, S. 1997. Uniform knowledge representation for NLP in the B2 system. Journal of Natural Language Engineering 3(2).

McRoy, S. W. 1995. Misunderstanding and the negotiation of meaning. Knowledge-based Systems 8(2-3):126-134.

Shapiro, S. C., and Group, T. S. I. 1992. SNePS-2.1 User's Manual. Department of Computer Science, SUNY at Buffalo.

Shapiro, S. C., and Rapaport, W. J. 1987. SNePS considered as a fully intensional propositional semantic network. In Cercone, N., and McCalla, G., eds., The Knowledge Frontier. New York: Springer-Verlag. 263-315.

Shapiro, S. C., and Rapaport, W. J. 1992. The SNePS family. Computers & Mathematics with Applications 23(2-5).

Shapiro, S.; Rapaport, W.; Cho, S.-H.; Choi, J.; Feit, E.; Haller, S.; Kankiewicz, J.; and Kumar, D. 1994. A dictionary of SNePS case frames.