
From: AAAI Technical Report SS-95-05. Compilation copyright © 1995, AAAI (www.aaai.org). All rights reserved.
Modeling Case-based Planning for Repairing Reasoning Failures
Susan Fox
David B. Leake
Computer Science Department
Indiana University
Bloomington, IN 47405
{sfox, leake}@cs.indiana.edu
Abstract
One application of models of reasoning behavior is to allow a reasoner to introspectively detect and repair failures of its own reasoning process. We address the issues of the transferability of such models versus the specificity of the knowledge in them, the kinds of knowledge needed for self-modeling and how that knowledge is structured, and the evaluation of introspective reasoning systems. We present the ROBBIE system, which implements a model of its planning processes to improve the planner in response to reasoning failures. We show how ROBBIE's hierarchical model balances model generality with access to implementation-specific details, and discuss the qualitative and quantitative measures we have used for evaluating its introspective component.
Introduction
Many motivations underlie current interest in introspective reasoning and learning. From a functional perspective, introspective reasoning has the potential benefit of allowing the reasoner to refine its own reasoning methods, expanding its capabilities over time and adapting its reasoning to respond effectively to novel circumstances. In complex domains it is difficult or impossible to predict ahead of time all the knowledge and reasoning methods the system will need. A system which can learn new knowledge and new reasoning methods should be able to perform better under those circumstances. From a more general perspective, development of a model for this task will help us to understand and evaluate reasoning behavior and the knowledge needed to capture it.
In order to learn about its reasoning methods, a system must be able to detect opportunities to learn, which are defined in our system by places where expectations about ideal system performance fail (Leake, 1992; Krulwich, Birnbaum, & Collins, 1992; Hammond, 1989; Ram, 1989; Schank, 1986; Riesbeck, 1981). When actual performance differs from expected ideal performance, the system learns by assigning blame for the failure and repairing the flaw in the underlying system. All these tasks require knowledge about how the system reasons, and what the expected results of that reasoning are. There are several recent approaches to the task of introspective reasoning: RAPTER (Freed & Collins, 1994a, 1994b) uses expectations about a reactive planning task to diagnose and repair failures, Meta-AQUA (Ram & Cox, 1994) maintains a set of templates for reasoning failures with applicable repairs to apply to failed reasoning traces, Autognostic (Stroulia & Goel, 1994) uses a Structure-Behavior-Function model of its own reasoning to find learning opportunities, and IULIAN (Oehlmann, Edwards, & Sleeman, 1994, 1995) uses questions about its own reasoning and knowledge to re-index its memory and to regulate its processing. Our approach, ROBBIE¹ (Fox & Leake, 1994), models the desired behavior of its underlying case-based planning component as a set of expectations about the behavior of the system during the planning process. ROBBIE monitors the reasoning of its underlying system, comparing its performance to a model of the "ideal" performance of the case-based reasoning process, as first proposed by Birnbaum et al. (1991). The model contains expectations about each portion of the system's reasoning processes. These expectations, assertions that would hold for an ideal CBR system, are organized by the component of the system they refer to, their level of specificity, and their relations to other expectations. The questions of what expectations are required, at what levels of abstraction, and how they relate to each other lie at the heart of this work.
In this paper we focus on a few issues of importance to systems which use introspective reasoning for self-improvement. In particular, we consider the tradeoff between creating a general, transferable model and creating a model with sufficient detail to guide precise diagnosis and repairs, and we consider the issue of evaluating introspective learning both as a methodology and in terms of specific uses.
Generality vs. Specificity: In order to facilitate the application of a self-modeling framework to many different systems, we must keep the model as general as possible and use mechanisms independent of both the implemented system and the particular task. At the same time, detailed descriptions of the underlying mechanisms and domain are needed in order for the self-model to determine concrete repairs. In ROBBIE, we propose an approach to introspective learning that strikes a balance between the desired generality and the needed specificity, and which has other benefits of its own in simplifying access to the model.
The mechanisms in ROBBIE which manipulate its introspective model are independent of ROBBIE's domain and underlying system, providing a few simple means of communication between the introspective reasoner and the underlying system. The vocabulary in which the model is represented is designed to describe data and reasoning tasks without being specific to a particular implementation. The model structure preserves as much generality as possible by maintaining a hierarchy of assertions (expectations) which keeps task- and implementation-specific details separate from generalities that might be more transferable to other tasks and domains (Fox & Leake, 1994).
¹Re-Organization Of Behavior By Introspective Evaluation.
Evaluating the method: Evaluation of AI systems is important to verify that the claims made about their performance actually hold. Up to this point little concrete evaluation has been attempted for introspective reasoning systems; we will discuss possible means for evaluating such systems and describe how we have begun to evaluate ROBBIE.

In order to fully analyze ROBBIE's performance, we must develop criteria for judging how good or "useful" its method is; we must justify the effort expended both in terms of what we may learn about modeling mental states and in terms of the tangible benefits of designing such a system.

By analyzing ROBBIE's approach, we can learn something of the knowledge needs of systems for doing introspective diagnosis and repair, and of how that knowledge should be structured. For example, expectations at multiple levels of abstraction seem to make the modeling as well as the transferring of goals more tractable. Making fine discriminations among the kinds of relationships between expectations seems to improve the focus of assigning blame when a failure does occur.

One practical justification for using introspective reasoning is the potential for improved performance; to support such a claim we must determine, quantitatively and qualitatively, to what extent performance has improved. Potential evaluation methods should provide some measure of the magnitude of improvement introspective reasoning produces: one possible evaluation method is to compare the performance of the bottom-level system alone with that of the system as a whole. In addition, we should define more qualitative methods, such as learning the "right" new reasoning, or producing "better" output results.

We first describe the ROBBIE system in detail and present an example of the sort of introspective learning ROBBIE performs. Then we will consider the issues described above to see how ROBBIE fits in and what we can conclude.

The ROBBIE system

The ROBBIE system is, at the most basic task level, a planning system, which interacts with a user and a simulated world to generate and execute plans for that world. That "performance" task is performed by a case-based planner (Hammond, 1989; Alterman, 1986; Kolodner, 1993), combined with a simple reactive-style execution system (Firby, 1989). Overarching the performance task is the task of learning introspectively about the planning and execution process itself, which is done using model-based reasoning about the system's own reasoning process (Birnbaum et al., 1991; Collins, Birnbaum, Krulwich, & Freed, 1993; Birnbaum, Collins, Freed, & Krulwich, 1990). This higher-level task is performed by a separate component which interacts with the planner (see Figure 1).

[Figure 1: ROBBIE Architecture (including the World Simulator).]

Presented with a starting location (usually the current location of the simulated robot) and a goal location to reach, ROBBIE's case-based component retrieves the most similar matching solution in memory. Similarity is initially judged by a naive method comparing the geographic "closeness" of the starting and goal locations in the current situation to those in the solutions in memory. ROBBIE can learn new features to use in assessing similarity. The solution retrieved from memory is adapted by trying to map the actual starting and ending locations onto the retrieved ones. The resulting plan is executed by the reactive planning component, taking each high-level plan step as a goal to be reached. This execution provides an evaluation of the quality of the adapted plan.

During the plan generation and execution process, the introspective component monitors the reasoning of the case-based and reactive components for discrepancies between its expectations and the actual results. ROBBIE uses a model of the underlying planning process to provide expectations about its performance. The model is a structured set of assertions about the ideal behavior of the case-based planner (Birnbaum et al., 1991; Fox & Leake, 1994). During the monitoring process, only those assertions relevant to the current portion of the reasoning task need be considered. In diagnosing a discovered failure, the entire model may be reconsidered, as a problem might not be discovered until well after it was introduced (i.e., retrieval of a bad case might not produce an explicit failure until plan execution).

The failures ROBBIE may detect include both catastrophes, in which the planner incorrectly solves a problem or cannot reach a solution, and hidden failures, which involve inefficient processing or successful but non-optimal solutions. For example, ROBBIE expects that it will know and use all the relevant features of a problem to retrieve the best old solution. This assertion could be violated, yet a solution still be possible from the less-than-optimal retrieved case.

When a discrepancy is discovered, the network of related assertions is reconsidered, drawing from a trace of the reasoning so far and those portions of the model reachable from the original failed assertion. Through this process the system will determine the root cause and possible repair for the noticed failure. For the failure above (that it will know and use all relevant features), ROBBIE might discover, in storing the solution gained from a poorly retrieved case, that the solution retrieved was not the best one. The introspective reasoner can work back from that noticed failure to the deeper cause: the lack of a relevant feature. ROBBIE can alter the features used in retrieval to include one that would have distinguished the "real" best solution. The example below addresses this problem in more detail.

The planner may be suspended while a repair is found and implemented, or it may be permitted to continue until more information becomes available to the introspective reasoner. After a repair has been implemented, the planner may continue from the point where a problem was observed or may be reset to a prior point in the reasoning task from which the system can proceed normally.
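To make this control cycle concrete, the following sketch shows one way the plan-monitor-repair interaction could be organized. It is our own illustration under assumed interfaces; the paper does not reproduce ROBBIE's code, and the names used here (stages, checks, diagnose, repair) are hypothetical.

```python
# Our own sketch of the control cycle described above; not ROBBIE's actual
# code. The planner's stages run in order, the introspective monitor checks
# the assertions relevant to each stage, and a detected failure is diagnosed
# and repaired before processing resumes.

def run_with_monitoring(problem, stages, relevant_assertions, diagnose, repair):
    """stages: ordered (name, step) pairs, each step mapping state -> state.
    relevant_assertions(name): checks for that stage; each check returns None
    when the expectation holds, or a description of the failed assertion."""
    state = {"problem": problem, "trace": []}
    for name, step in stages:
        state = step(state)
        state["trace"].append(name)
        for check in relevant_assertions(name):
            failure = check(state)
            if failure is not None:
                root_cause = diagnose(failure, state)  # search the model
                state = repair(root_cause, state)      # e.g. add an index feature
    return state
```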
ROBBIE's self-model

The introspective reasoning model is used to monitor the system's reasoning processes, and to diagnose and repair failures that occur when the assertions of ideal reasoning performance fail to be true of the actual reasoning performance. The assertions describe expectations about the reasoning processes for each component of the planning system; Figure 2 shows a portion of the current model for ROBBIE, with assertions described in English. In this section we will describe what assertions the model contains, how they are structured, what that means for the assertions in Figure 2, and the benefits gained by a hierarchical model.

[Figure 2: Sample Assertions — a subset of the model's assertions for the Retriever (abstract: "Retriever will find a case"; "Retriever will output a valid case") and the Adaptor (abstract: "Adaptor will get an adaptable case"; "Adaptation will succeed"; mid-level: "Adaptor will produce a complete case"; specific: "Adaptor will complete in less than N steps"), connected by "seq," "spec," and "abstr" links.]
Assertions in the model
The model must provide expectations for the reasoning processes of each component of the planner. The case-based planning system consists of components which perform specific parts of the CBR task: Anticipator, Retriever, Adaptor, Executor, and Storer. The Anticipator takes an initial problem description and creates an index to compare to the cases in memory. The Retriever uses that index to select the most similar solution in memory, the Adaptor changes the old solution to match the new problem, the Executor evaluates the solution by executing it, and the Storer adds the new solution to memory for future use.

The assertions in the model describe the components at different levels of specificity. At the abstract level are assertions much like the description given above. High-level assertions provide a trace of the overall flow of control and information through the planner, without using any details specific to ROBBIE. At lower levels, assertions refer to specific aspects of ROBBIE's implementation: the algorithms used for doing retrieval, adaptation, execution, and so forth. Lower-level assertions often have repairs associated with them, because they can refer to actual parts of ROBBIE which can be altered.

Several components of the planner are implemented as case-based systems themselves, sharing the same memory and retrieval mechanisms as the planner as a whole. For example, anticipation is viewed as a process of selecting and applying cases which specify features to be added to the problem description. Because of the re-use of the case-based mechanisms for more than one purpose, the details of the model are simplified for those case-based components; the model of CBR as a whole provides expectations for each of them, as well as for the planner.
Structure of the model
The assertions are structured by the component to which an assertion refers, the level of specificity of the assertion, and by connections to other related assertions. Dividing assertions into groups by their components facilitates monitoring the reasoning processes for deviations; the only assertions which must be monitored are those referring to the current component of processing. Assertions which belong to a particular component are also likely to be closely related to each other.

Assertions are arranged hierarchically depending on how specific they are to ROBBIE's implementation. A separation by hierarchy simplifies the task of updating the model when things change, and of transferring portions of the model to new underlying systems. In addition, it separates different ways of thinking about the reasoning task: the abstract levels link components together and describe how information and control pass between them, while low-level assertions describe portions of particular components and the specific information needs and algorithms for them.
Each assertion is linked to the other assertions which are related to it. These links guide the introspective reasoner in explaining and repairing a detected failure by focusing on the most fruitful portions of the model. There are four kinds of links, which the introspective reasoner treats differently during the search for the deep cause of a failure: an abstraction link connects a low-level assertion to its high-level counterpart, a specification link is the symmetric counterpart of the abstraction link, a sequence link connects two assertions (at the same level of specificity) when one assertion refers to an earlier part of the reasoning process, and a co-occurs link connects two assertions which tend to fail or succeed together. These classes of links between assertions are preliminary; we expect to refine the classes as the model is completed.
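One possible rendering of this structure is sketched below: assertions carry a component and a level of specificity, and are connected by the four link types just listed. The representation and names are ours (ROBBIE's actual assertions are written in a limited predicate-calculus vocabulary); the example assertions paraphrase the Figure 2 sample.

```python
# Sketch (our own) of the assertion network: assertions grouped by component
# and level of specificity, connected by typed links.

from dataclasses import dataclass, field

LINK_TYPES = ("abstraction", "specification", "sequence", "co-occurs")

@dataclass
class Assertion:
    ident: str
    component: str                      # e.g. "Retriever", "Adaptor"
    level: str                          # "abstract", "mid-level", "specific"
    statement: str                      # English gloss of the expectation
    links: dict = field(default_factory=lambda: {t: [] for t in LINK_TYPES})

def link(a: Assertion, b: Assertion, kind: str) -> None:
    assert kind in LINK_TYPES
    a.links[kind].append(b.ident)

# Paraphrased from Figure 2:
r1 = Assertion("r1", "Retriever", "abstract", "Retriever will find a case")
r2 = Assertion("r2", "Retriever", "abstract", "Retriever will output a valid case")
a1 = Assertion("a1", "Adaptor", "abstract", "Adaptor will get an adaptable case")
a2 = Assertion("a2", "Adaptor", "abstract", "Adaptation will succeed")
a3 = Assertion("a3", "Adaptor", "specific", "Adaptor will complete in less than N steps")

link(r1, r2, "sequence")        # r2 describes a later part of retrieval
link(r2, a1, "sequence")        # sequence links can cross components
link(a2, a3, "specification")   # a3 spells a2 out in implementation terms
link(a3, a2, "abstraction")     # abstraction is the symmetric direction
```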
Sample of the model
Figure 2 represents a portion of the model ROBBIE uses, showing a subset of the assertions for two components of the case-based planner. Assertions are grouped, first by component, and then by level of specificity. The number of levels depends on the component in question; the figure shows three levels for the Adaptor: abstract, mid-level, and specific. Assertions which are grouped by specificity and component are considered together during the monitoring process.

Assertions are connected by several different kinds of links; three appear in Figure 2: "seq," "spec," and "abstr." The "seq" links encode the order in which events occur in the underlying system; "spec" and "abstr" links are symmetric and link assertions at one level to corresponding specifications or abstractions at another level. Assertions are written here in English for convenience; the actual assertions use a limited vocabulary in predicate calculus.
The first component in Figure 2 is the Retriever, for which only two abstract assertions appear. In the complete model, these assertions have links to other abstract and specific assertions, omitted here to simplify the example. The first assertion states that the Retriever will always find some matching case. The "seq" link indicates that the next assertion comes later in the retrieval process: it states that the final result of retrieval will be the right kind of case. A memory might contain different kinds of memory structures (as ROBBIE's does); this asserts that the Retriever will find a plan, and not (for instance) an adaptation strategy, if it is looking for a plan.

Sequence links often connect assertions from two different components at the abstract level. The next assertion in sequence is an Adaptor assertion, that the Adaptor will be given an adaptable case ("adaptable" would be defined by specific assertions not included here). It is followed by an assertion stating that adaptation will succeed in producing some answer in a limited amount of time. This assertion is linked to a specification that describes, in details specific to ROBBIE's implementation, exactly how to judge the "success" of the Adaptor. The last abstract assertion is, as for the Retriever, concerned with the correct output of the component; it is linked to a mid-level assertion which defines "executable" in terms of "complete", and would have more specific assertions below it.
Benefits of a hierarchical model
A hierarchical model such as ROBBIE uses provides two advantages over an approach using only general or only specific expectations. First, for knowledge re-use, we can encapsulate those parts of the knowledge which would apply across different systems and keep that part of the model's knowledge when doing the transfer. Equally important, however, is the value of both kinds of expectations in monitoring and repairing the underlying system. High-level assertions provide an overview of the planner's process, and allow us to connect the functioning of one component with another at the "right" level of abstraction: we don't need to trace each specific step from storing the solution back to creating the index; the abstract assertions provide access to other components at a general flow-of-control level of description. In searching for the root cause of a failure, we can use the high-level assertions to select appropriate components to consider and, from there, the appropriate specific assertions for that component. Without the lower-level assertions which describe the actual processing of the system, it becomes nearly impossible to detect failures or to specify good repairs for them. We therefore must design a model which incorporates both levels of description; ROBBIE's hierarchical and component-oriented model is one such design.
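As an illustration of how such a search might be focused by the links, the sketch below walks outward from a failed assertion, reaching general flow-of-control assertions before detailed ones. This is one plausible reading of the search described above, not ROBBIE's actual diagnosis algorithm, and the example network paraphrases the index-feature failure discussed later.

```python
# Illustrative blame-assignment search (our own sketch): starting from a
# failed assertion, neighbors are enqueued abstraction-first so that general,
# flow-of-control assertions are reached before implementation-specific ones.

from collections import deque

LINK_ORDER = ("abstraction", "sequence", "specification", "co-occurs")

def candidate_causes(failed, links, limit=10):
    """links maps (assertion id, link kind) -> list of neighboring ids."""
    seen, queue, candidates = {failed}, deque([failed]), []
    while queue and len(candidates) < limit:
        current = queue.popleft()
        candidates.append(current)
        for kind in LINK_ORDER:
            for neighbor in links.get((current, kind), []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return candidates

# Toy network paraphrasing the index-feature failure in the example below:
links = {
    ("stored plan is closest to the retrieved case", "abstraction"):
        ["retrieval will operate successfully"],
    ("retrieval will operate successfully", "sequence"):
        ["the index will select the closest case"],
    ("the index will select the closest case", "specification"):
        ["the index will include all the relevant features"],
}
print(candidate_causes("stored plan is closest to the retrieved case", links))
# -> ends with "the index will include all the relevant features"
```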
There are still many unanswered or incompletely answered questions about ROBBIE's approach. ROBBIE's current model is incomplete, incorporating a fraction of the assertions we expect to need and having very few repairs at its disposal. Our immediate task is to expand the model and repairs; to do this we must determine to a finer degree what knowledge is required. We must consider how many levels of abstraction in the model hierarchy are useful for ROBBIE. We must also catalog more completely the kinds of links between assertions, as we see what effect the current divisions have on the model's processing. Ideally we would test the model structure under fire by using it to implement introspective reasoning for a different underlying system.
Example: learning new index features

To make the discussion more concrete, let us consider a case in which ROBBIE alters the set of features used to index its memory. ROBBIE's underlying task is to create and execute plans for navigating city streets in a simulated world as a pedestrian. The system has access to previous routes it has taken and to a map of the world which does not include dynamically changing details. Such details at present include traffic lights, against which the system must not cross and which can break down, and street closings. The case-based process must measure the similarity between the goal index and the indices of cases in memory to select the case which is easiest to adapt into a new solution; ROBBIE originally selects cases based on how similar the starting and ending locations are to those in memory. Such an index, while it seems an obvious approach, is not sufficient, as the following example will make clear.
Figure 3 shows a portion of the world map relevant to this problem. ROBBIE has in memory plan A, which describes how to travel from location L1 to location L2, and plan B, which describes how to get from location L2 to location L3. Figure 4 shows the steps of each plan. The current task is to get from location L4 to location L2.

[Figure 3: Map of simulated world, showing the locations along Birch Street and the wasted step in the adapted route.]

Plan A:
• Turn south
• Move south to south side of Birch
• Turn west
• Move west to L2

Plan B:
• Turn east
• Move east to L3

Figure 4: Plans in memory

Using the geographic closeness of starting and ending locations alone to judge similarity, plan A appears to be the closest because it shares the same ending location (ROBBIE's retrieval criteria do not include knowledge about reversals of known routes, so plan B does not look similar at all). Plan A is selected and adapted to create plan C (dashed line in Figure 3 and in Figure 5). During this process, the introspective reasoner monitors the system's behavior but detects nothing wrong. When the plan is executed, however, the wasted plan steps will be eliminated: the goal of the first two steps in plan C is to be on the south side of Birch, which is already true, so the steps will be skipped (see Figure 5). When the resulting plan is stored into memory, an introspective failure is detected: an assertion in the model states that the final solution stored will have plan steps which are more similar to the retrieved case than to any other case in memory. In comparing the final plan C to cases A and B in memory, it is clear that plan B has the more similar solution.

Before execution:
• Turn south
• Move south to south side of Birch
• Turn west
• Move west to L2

After execution:
• Turn west
• Move west to L2

Figure 5: Plan C before and after execution
In explaining the cause of this assertion failure, ROBBIE reconsiders related assertions in the model, moving up in the hierarchy of assertions to the general assertion that "retrieval will operate successfully." It will consider high-level assertions prior to the general one, such as "the index will select the closest case." That high-level assertion belongs to the Anticipator component; ROBBIE will also move downward from high-level to more specific assertions, including "the index will include all the relevant features to retrieve the closest case." In re-evaluating this last assertion in the context of the failure, the system discovers a feature of the cases it had not used before: that each involves moving straight along an east/west street. This shows that the assertion "the index will include all the relevant features to retrieve the closest case" failed. The assertion suggests a repair: add "moves straight on east/west street" to the features used in indexing cases, and re-index memory to include the new feature.
In the future, any problem which involves moving straight along an east/west street will be indexed by the new feature, and will match most closely other cases which also include that feature in their index. Once the introspective reasoner has evaluated and repaired the problem, processing continues normally. Notice that the failure in question here is not a catastrophic one, but it does represent wasted effort on the part of the planner, effort that would otherwise be repeated and compounded in the future.
The situation above is an example of ROBBIE's introspective learning for a single goal. The ramifications of learning a new feature will only become clear over a sequence of goals. In order to study the improvement introspective reasoning provides for ROBBIE, we ran a set of experiments which presented ROBBIE with twenty-six sequences of goals, executing each sequence with and without introspective reasoning. One sequence was carefully designed to be easy for ROBBIE to handle; the other sequences were randomly perturbed versions of the first. We measured the number of problems ROBBIE successfully handled for each sequence, and found that in almost every case ROBBIE could handle more problems with introspective learning than without (in one anomalous case the overall performance was so poor that introspective learning could provide no benefit at all). We also measured the percentage of cases in memory which were considered during the retrieval process, over the sequence of retrievals made in solving the sequence of goals: the percentage considered when introspective reasoning was used dropped significantly below the percentage considered without introspective reasoning. ROBBIE, using introspective reasoning to re-index its memory, considered fewer irrelevant cases at the same time as it improved its overall success rate.
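For readers who want the shape of this comparison, a schematic harness is sketched below. It only shows how the two reported measures (problems solved, and the fraction of memory considered at retrieval) could be tallied with introspection on and off; the make_system and solve interfaces are assumptions, and no data from the actual experiments is reproduced here.

```python
# Schematic evaluation harness (our own sketch, not the experimental code):
# run each goal sequence with and without introspective learning, tallying
# problems solved and the mean fraction of memory considered during retrieval.

def evaluate(goal_sequences, make_system):
    """make_system(introspection) -> object whose solve(goal) returns
    (succeeded, cases_considered, memory_size). Hypothetical interface."""
    results = {}
    for introspection in (False, True):
        solved, fraction_sum, retrievals = 0, 0.0, 0
        for sequence in goal_sequences:
            system = make_system(introspection)    # fresh system per sequence
            for goal in sequence:
                ok, considered, memory_size = system.solve(goal)
                solved += int(ok)
                fraction_sum += considered / max(memory_size, 1)
                retrievals += 1
        results[introspection] = {
            "problems_solved": solved,
            "mean_fraction_considered": fraction_sum / max(retrievals, 1),
        }
    return results
```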
Ramifications for general issues

We have now described the ROBBIE system in some detail; we must come back to the issues alluded to briefly above. We will discuss the tradeoff between the generality, and hence transferability, of a self-model framework and the specificity of details the model needs to accurately detect and repair failures. We will also discuss means for evaluating the benefit of learning about reasoning methods. We will describe our attempts to address these issues with the ROBBIE system, sketch our conclusions, and describe how ROBBIE relates to other work in this area.
Generality vs. Specificity
Ideally one could develop a framework for reasoning about mental processes that could be transferred with minor changes to provide self-models for a wide array of underlying systems and tasks (vision, planning, etc.) and for a wide variety of modeling tasks (modeling others' reasoning, explaining reasoning behavior, analyzing its own actions, etc.). While we must admit that such a universal framework is, at least for now, out of reach, it is certainly possible to share higher-level insights about mental reasoning, and to develop specific frameworks for more limited tasks and domains. There will be commonalities among the kinds of knowledge, and the useful forms for representing that knowledge, needed to reason about mental actions. Beyond that, it seems reasonable to expect more concrete sharing of model forms and knowledge within a particular kind of self-modeling task.

In developing models of introspection, we will be torn between our desire for transferable models and the reality that a model must include a great deal of system-specific knowledge. Developing approaches that maintain the generality of the model as much as possible means focusing on separating details from the functioning of the model, keeping the mechanisms and vocabulary used as independent as possible, and emphasizing the kinds of knowledge needed. Specifying classes of knowledge and useful organizations of that knowledge for describing mental actions will provide the largest gain across modeling tasks.
The problem of integrating a general approach to self-modeling with the details needed to use the model has been one we have tried to address with ROBBIE from the beginning. We designed general mechanisms for monitoring the underlying reasoning and accessing the declarative model which depend in no way on the contents of that model. We are developing a general vocabulary for describing the assertions in the model to complete the generality of the mechanisms. Within this framework a model may be constructed for a very different system sharing little in common with the implemented one. Keeping a hierarchy of assertions, and organizing them by component, allows substitution of pieces of the model for a new system without requiring a completely new model. For example, a CBR system could keep the upper tiers of the model for each component similar to ROBBIE's, adding only new lower-level details, or a variation on ROBBIE which used a different adaptation mechanism could substitute new assertions for that component alone.

Of perhaps greater importance in terms of transferability is what we now understand about the kinds of knowledge and the model structure required for this task. In developing a model for this system we also develop a template for what to include in models of other systems; ROBBIE's model demonstrates the value of incorporating multiple levels of knowledge about reasoning tasks. The ROBBIE system's diagnosis capabilities were improved by having high-level knowledge that provided a general flow of control and information, along with specific details about the system's operation (tied into that higher level). Considering high-level assertions when assigning blame leads the system to consider other assertions distant in terms of the reasoning trace but close in terms of the flow of control. The system should more easily trace the reasoning behavior from a detected failure back to the original cause. In a similar way, distinguishing different kinds of relationships between pieces of knowledge focuses the model on the most relevant pieces; in ROBBIE, the model includes specification and abstraction links, links that indicate the sequence of reasoning, and causal links that connect assertions likely to fail or succeed together. The model could choose to follow specification links when trying to determine a repair, or could avoid testing assertions which are specifications of a high-level assertion that has not failed. A model without distinct connections between assertions could not as accurately gauge which assertions are relevant under a given set of circumstances.

Many other systems have also approached the problem of generality of mechanism and transferability. Cox & Freed (1994) identify knowledge about how general and specific knowledge combines as a key element for a self-reasoning system. Freed's RAPTER (Freed & Collins, 1994b) uses a general set of representations for expectations and repairs, and a general mechanism to manipulate them, while the content of its representations is specific to the RAPTER system. Stroulia's Autognostic (Stroulia & Goel, 1994) applies an existing kind of model (used for modeling physical machines) to implement a self-model, and the model and mechanisms have been successfully applied to two independent systems (Kritik2 (Stroulia & Goel, 1992) and Router (Goel, Callantine, Shankar, & Chandrasekaran, 1991)). Meta-AQUA (Ram & Cox, 1994) uses abstract descriptions of reasoning traces that might arise under any similar reasoning/explanation task.

Evaluating self-modeling systems

It is often problematic in AI to explain exactly what a given system has accomplished besides showing that some implementation is possible. It is important to demonstrate the advantages of any learning system in terms of the breadth of problems it can solve and the applicability of its ideas in general. At this point, attempts to evaluate introspective reasoning have been limited; we have, however, made an effort to evaluate ROBBIE's mechanisms and performance.

We must determine when using a self-model provides a benefit, and how to demonstrate the extent of that benefit. That benefit may be in advancing our knowledge of what self-reasoning entails and the ramifications for mental modeling in general. The benefit may also lie on the practical side: systems with the power to improve their own mechanisms should solve more problems, solve problems more effectively, produce better solutions, and respond more flexibly to novel situations than their non-introspective counterparts. The expense of modeling reasoning behavior makes evaluating its success as a practical tool of particular importance.

How to measure the performance of an introspective learning system is itself a difficult question and may depend on the system; possible measures include the breadth and number of problems solved that were impossible previously, the speed and efficiency of the reasoning process and the solutions produced, and many others. Many systems which use a model of reasoning, including ROBBIE, are two-level systems which make a relatively firm distinction between the reasoning being modeled and the reasoning used to do the modeling; one possible evaluation method is to compare the performance of the bottom-level system with that of the system as a whole. As a qualitative evaluation, we can ask if a system like ROBBIE detects the "right" failures, assigns blame correctly, and repairs the system the "right" way. Other work has been less explicit about concrete means of evaluating systems. Cox (1995) has described classes of reasoning behavior and failures that people experience, and that systems which model reasoning behavior should address; that set provides a qualitative guide for judging models of reasoning. Autognostic (Stroulia & Goel, 1994) provides another kind of evaluation by directly proving the applicability of its model to different underlying systems.

We have begun evaluating ROBBIE using a practically oriented criterion: the addition of introspective reasoning should produce quantitative as well as qualitative improvements in the performance of the overall system. We are in the process of performing extensive experiments to test ROBBIE's performance over long sequences of problems. By collecting statistics on the success of the system with and without introspective learning, we can quantify its effect. Some tentative and preliminary results are in (Fox & Leake, 1994). We have completed one set of experiments (described above) which used the number of successful cases over a sequence and the percentage of cases in memory considered during retrieval to reveal differences in ROBBIE's performance with and without introspective reasoning. Initial results of that experiment are encouraging.

Focusing too heavily on quantitative measures may overlook some important features of introspective reasoning; it is difficult to quantify the quality of a solution, or the elegance of the reasoning that created it. We must be aware of and seek out those more qualitative benefits as well. We may find objective measures of solution quality through common sense in some domains, through comparisons with human-created solutions, or through surveys eliciting quality judgements. Elegance of reasoning is an even more subjective issue, but by similar methods some objective judgement can be reached.
Conclusions

An issue at the heart of self-modeling systems is the question of what kinds of knowledge are required for the system to perform its tasks, and how that knowledge is to be represented. While this is an ongoing research issue, we have proposed a structure for self-modeling that allows for flexibility of application and is designed to allow for transfer of some part of the model to new applications.

The ROBBIE system, while still incomplete, addresses several important issues for modeling reasoning behavior, and introspective learning in particular. Our conclusions about the structural requirements of ROBBIE's model should be applicable to a general model of reasoning, and the approach to preserving the re-usability of the model may also provide pointers for future work on the transfer of reasoning knowledge. ROBBIE benefited from using multiple levels of knowledge to focus on the most relevant portions of the model; determining what the important levels of knowledge are and how multiple levels affect reasoning models may be beneficial to a wide range of models of reasoning behavior.

In developing a model of the reasoning process, we must strike a balance between the generality and transferability of the model and the specific knowledge required to detect and specify repairs. The ROBBIE system uses a hierarchical model accessed by system-independent mechanisms in order to find that balance. To achieve generality, mechanisms for introspective reasoning should work with any set of assertions needed for a system, requiring a general vocabulary or a framework for a vocabulary. Separating assertions which make statements about the general kind of underlying system (here, case-based reasoning) from those that refer to implementation or knowledge details of the specific system makes it easier to convert the model to apply to a new but similar system. We claim that a model must include knowledge about the reasoning process at multiple levels of abstraction: a high-level description tied to lower-level details. Doing so helps us to keep the generality of the model and also allows us to use the model at the "right" level for diagnosis, by using abstract descriptions to trace the general flow of control and knowledge rather than plodding through every detailed step of the reasoning process. Using the right level of abstraction may focus the diagnosis on promising areas of the model while avoiding unnecessary or unpromising details.

We claim that the problem of evaluating models of reasoning behavior must be addressed because of the potential expense of such models. We can choose to evaluate a system in terms of its benefit as a model of mental actions: what we learn about possible model structures and knowledge needs provides one kind of justification, as does the extent to which a model covers the scope of introspective reasoning for the task. We may also evaluate the practical benefits of using a model of reasoning behavior. For the purposes of a system using the model for self-repair, we may judge the quality of the overall system compared to one without learning, or use a qualitative gauge of the repairs made to the system. To judge the quality of the overall system, various measures might be proposed: breadth of problems solved, quality of solutions, speed of processing, and so forth. We have begun this process by trying to find some quantitative measures of ROBBIE's improvement when introspective learning is enabled, while also considering ways to express qualitative measures such as "improving in the right way" or "learning from the right failures."
Acknowledgements
This work is supported in part by the National Science Foundation under Grant No. IRI-9409348.
References
Alterman, R. (1986). An adaptive planner. In Proceedings of the Fifth National Conference on Artificial Intelligence, pp. 65-69, Philadelphia, PA. AAAI.

Birnbaum, L., Collins, G., Brand, M., Freed, M., Krulwich, B., & Pryor, L. (1991). A model-based approach to the construction of adaptive case-based planning systems. In Bareiss, R. (Ed.), Proceedings of the Case-Based Reasoning Workshop, pp. 215-224, San Mateo. DARPA, Morgan Kaufmann.

Birnbaum, L., Collins, G., Freed, M., & Krulwich, B. (1990). Model-based diagnosis of planning failures. In Proceedings of the Eighth National Conference on Artificial Intelligence, pp. 318-323, Boston, MA. AAAI.

Collins, G., Birnbaum, L., Krulwich, B., & Freed, M. (1993). The role of self-models in learning to plan. In Foundations of Knowledge Acquisition: Machine Learning, pp. 83-116. Kluwer Academic Publishers.

Cox, M. (1995). Representing mental events (or the lack thereof). In Proceedings of the 1995 AAAI Spring Symposium on Representing Mental States and Mechanisms. (In press.)

Cox, M. & Freed, M. (1994). Using knowledge of cognitive behavior to learn from failure. In Proceedings of the Seventh International Conference on Systems Research, Informatics and Cybernetics, pp. 142-147, Baden-Baden, Germany.

Firby, R. J. (1989). Adaptive Execution in Complex Dynamic Worlds. Ph.D. thesis, Yale University, Computer Science Department. Technical Report 672.

Fox, S. & Leake, D. (1994). Using introspective reasoning to guide index refinement in case-based reasoning. In Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society, pp. 324-329, Atlanta, GA. Lawrence Erlbaum Associates.

Freed, M. & Collins, G. (1994a). Adapting routines to improve task coordination. In Proceedings of the 1994 Conference on AI Planning Systems, pp. 255-259.

Freed, M. & Collins, G. (1994b). Learning to prevent task interactions. In desJardins, M. & Ram, A. (Eds.), Proceedings of the 1994 AAAI Spring Symposium on Goal-driven Learning, pp. 28-35. AAAI Press.

Goel, A., Callantine, T., Shankar, M., & Chandrasekaran, B. (1991). Representation, organization, and use of topographic models of physical spaces for route planning. In Proceedings of the Seventh IEEE Conference on AI Applications, pp. 308-314. IEEE Computer Society Press.

Hammond, K. (1989). Case-Based Planning: Viewing Planning as a Memory Task. Academic Press, San Diego.

Kolodner, J. (1993). Case-Based Reasoning. Morgan Kaufmann, San Mateo, CA.

Krulwich, B., Birnbaum, L., & Collins, G. (1992). Learning several lessons from one experience. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, pp. 242-247, Bloomington, IN. Cognitive Science Society.

Leake, D. (1992). Evaluating Explanations: A Content Theory. Lawrence Erlbaum Associates, Hillsdale, NJ.

Oehlmann, R., Edwards, P., & Sleeman, D. (1994). Changing the viewpoint: Re-indexing by introspective questioning. In Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society, pp. 675-680. Lawrence Erlbaum Associates.

Oehlmann, R., Edwards, P., & Sleeman, D. (1995). Introspection planning: Representing metacognitive experience. In Proceedings of the 1995 AAAI Spring Symposium on Representing Mental States and Mechanisms. (In press.)

Ram, A. (1989). Question-driven understanding: An integrated theory of story understanding, memory and learning. Ph.D. thesis, Yale University, New Haven, CT. Computer Science Department Technical Report 710.

Ram, A. & Cox, M. (1994). Introspective reasoning using meta-explanations for multistrategy learning. In Michalski, R. & Tecuci, G. (Eds.), Machine Learning: A Multistrategy Approach, Vol. IV, pp. 349-377. Morgan Kaufmann.

Riesbeck, C. (1981). Failure-driven reminding for incremental learning. In Proceedings of the Seventh International Joint Conference on Artificial Intelligence, pp. 115-120, Vancouver, B.C. IJCAI.

Schank, R. (1986). Explanation Patterns: Understanding Mechanically and Creatively. Lawrence Erlbaum Associates, Hillsdale, NJ.

Stroulia, E. & Goel, A. (1992). Generic teleological mechanisms and their use in case adaptation. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, pp. 319-324, Bloomington, IN. Cognitive Science Society.

Stroulia, E. & Goel, A. (1994). Task structures: What to learn? In desJardins, M. & Ram, A. (Eds.), Proceedings of the 1994 AAAI Spring Symposium on Goal-driven Learning, pp. 112-121. AAAI Press.