From: AAAI Technical Report SS-94-06. Compilation copyright © 1994, AAAI (www.aaai.org). All rights reserved.
CONSTRUCTING BELIEF
NETWORKS TO EVALUATE PLANS
Christopher Elsaesser
AI Technical Center
The MITRE Corporation
7525 Colshire Drive
McLean, VA 22102
chris@starbase.mitre.org

Paul E. Lehner
Systems Eng. Dept. & C3I Center
George Mason University
Fairfax, VA 22030
& The MITRE Corporation
plehner@masonl.gmu.edu

Scott A. Musman
AI Technical Center
The MITRE Corporation
7525 Colshire Drive
McLean, VA 22102
musman@starbase.mitre.org

ABSTRACT
This paper examines the problem of constructing belief networks to evaluate plans produced by a knowledge-based planner. Techniques are presented for handling various types of complicating plan features. These include plans with context-dependent consequences, indirect consequences, actions with preconditions that must be true during the execution of an action, contingencies, multiple levels of abstraction, multiple execution agents with partially-ordered and temporally overlapping actions, and plans which reference specific times and time durations.

Content areas: planning, probabilistic reasoning

1. INTRODUCTION
Uncertainty is ubiquitous in planning problems. Despite this, few knowledge-based planning systems have been developed that can reason explicitly about uncertainty. Instead, most knowledge-based planning systems are based solely on symbolic reasoning (Allen et al., 1990). Although these systems may employ techniques that adapt a plan to unanticipated events, they cannot generate quantitative uncertainty estimates of possible future states. Consequently, the best these planners can do is react. They cannot generate plans that are deduced to be robust against probable futures.
Recently a number of researchers have recognized the importance of uncertainty in automated planning and are developing approaches to address it (e.g., Dean and Wellman, 1991; Hanks, 1990; Kushmerick et al., 1993). Common to many of these approaches is the use of belief networks to represent and reason about uncertainties in plans. To date, however, research in the use of belief networks to reason about uncertainty in planning has been restricted to limited types of plans. Most, for instance, assume a single execution agent, a single level of abstraction and no contingencies.

If belief networks are to provide a foundation for probabilistic planning, then we need to examine the extent to which different plan features can be represented in belief networks. This paper examines this issue. In particular, we show how to develop belief networks that can handle a variety of plan features. All of the capabilities described below are being implemented as part of the AP planning system. The AP system is designed for adversarial planning problems where each planning agent may have multiple execution agents that execute coordinated activities (Elsaesser and MacMillan, 1991).
Figure 1. Example of Action and Persistence Models.
2. BASIC APPROACH
The basic idea behind using belief networks for plan evaluation is to construct a belief network from a knowledge base of probabilistic action models and probabilistic persistence models (Wellman, 1990). A probabilistic action model specifies probability distributions on a set of consequence predicates conditioned on the state of a set of predecessor predicates.

For example, the probabilistic action model depicted in Figure 1 asserts that the location of the object referenced by obj, (Loc obj), in the situation after the action moving obj from loc1 to loc2, (Move obj loc1 loc2), is completed (situation Si+1) is a probabilistic function of the location of obj in the prior situation. Similarly, the probabilistic persistence model for (Loc obj) is a probabilistic function of the state of (Loc obj) in the previous state.
Consider the two-step plan (Move A L1 L2) --> (Move B L3 L1). To build a belief network to evaluate this plan, one can begin by sequentially pasting onto the belief network the probabilistic action model for each action (Figure 2a). When pasting onto the belief network, conflicting information already in the network is replaced. Note that the network in Figure 2a is incomplete, since there are nodes in future states which are not connected to the current state. To complete this network, it is necessary to work backwards through the network, sequentially pasting into the network the necessary persistence models (Figure 2b). When pasting into the network, current entries in the network are not changed, but previously unspecified nodes and probability assessments may be entered.
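The two pasting steps can be illustrated with a minimal Python sketch. The data layout (a dictionary keyed by predicate and situation index), the function name `build_pe_net`, and the object name B in the second action are assumptions made for illustration, not part of the AP system. The sketch builds only the graph structure, omits action nodes and probabilities, and, as a simplification, pastes a persistence model for every unspecified predicate rather than only the necessary ones.

```python
# Hypothetical sketch: a PE-net skeleton as {(predicate, situation): entry}.
# Only the graph structure is built; probabilities are omitted for brevity.

def build_pe_net(plan, predicates):
    """plan: list of action models, each {consequence: [predecessor predicates]}."""
    net = {(p, 0): {"parents": [], "source": "initial"} for p in predicates}

    # Step 1: paste ON the action models in plan order. Pasting on replaces
    # any conflicting entry already in the network.
    for i, action_model in enumerate(plan):
        for consequence, predecessors in action_model.items():
            net[(consequence, i + 1)] = {
                "parents": [(q, i) for q in predecessors],
                "source": "action",
            }

    # Step 2: work backwards and paste IN persistence models. Pasting in
    # never changes an existing entry; it only fills unspecified nodes.
    for i in range(len(plan), 0, -1):
        for p in predicates:
            net.setdefault(
                (p, i), {"parents": [(p, i - 1)], "source": "persistence"}
            )
    return net


# Two-step plan: move A, then move a second object (assumed here to be B).
plan = [{"Loc A": ["Loc A"]}, {"Loc B": ["Loc B"]}]
pe_net = build_pe_net(plan, ["Loc A", "Loc B"])
```

In this sketch, `setdefault` captures the difference between the two operations: pasting on assigns unconditionally, while pasting in leaves existing entries untouched.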
We refer to a belief network, such as the one shown in Figure 2b, as a plan evaluation network or PE-net. Once a PE-net is constructed, existing algorithms can be applied to it to calculate the marginal probability of any node in the PE-net as a function of information about the initial or future state.

3. PLAN FEATURES THAT COMPLICATE PE-NET CONSTRUCTION.
PE-net construction is straightforward for simple linear plans such as the one mentioned above. However, as plans get more complex, the process of constructing a PE-net becomes correspondingly more complex. Below we show how to handle a number of these complexities.

3.1 PARTIAL MODELS
The PE-net approach to plan evaluation assumes a knowledge base of action and persistence models, each of which is a small, partially-specified belief network. Given the number of actions and predicates that may be mentioned in the knowledge base, it is unlikely that all of the conditional probabilities mentioned in all of these networks will be specified. It is more likely that the probabilities for consequence predicates will only be specified for a subset of the predecessor states. We handle this as follows. Wherever the action model is under-specified, we paste into the PE-net the persistence models for the consequent predicates. Wherever the persistence model is under-specified, we paste into the PE-net a default persistence model. For our applications the default persistence model asserts that no change will take place. Using this technique, all the conditional probabilities in the network will be specified. All that remains is to specify the unconditional probabilities for the initial state.
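The fallback rule for under-specified models (action model, then persistence model, then the no-change default) can be sketched as a lookup chain. This is a simplified illustration in which a predicate is conditioned only on its own previous state; the function name and the probability values are hypothetical.

```python
def next_state_distribution(prev_state, action_cpt, persistence_cpt):
    """Return P(state in S_i+1 | prev_state), falling back where a model is silent.

    action_cpt / persistence_cpt: {prev_state: {next_state: probability}},
    each possibly specified for only a subset of predecessor states.
    """
    if prev_state in action_cpt:          # action model covers this case
        return action_cpt[prev_state]
    if prev_state in persistence_cpt:     # otherwise use the persistence model
        return persistence_cpt[prev_state]
    return {prev_state: 1.0}              # default persistence: no change


# Illustrative (made-up) numbers for a Move from L1 to L2.
move_cpt = {"L1": {"L2": 0.9, "L1": 0.1}}
persist_cpt = {"L2": {"L2": 0.95, "L3": 0.05}}

from_l1 = next_state_distribution("L1", move_cpt, persist_cpt)
from_l2 = next_state_distribution("L2", move_cpt, persist_cpt)
from_l3 = next_state_distribution("L3", move_cpt, persist_cpt)
```

With this chain every predecessor state yields some distribution, which is the property the text relies on: only the unconditional probabilities of the initial state remain to be supplied.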
Figure 2. Constructing a PE-net. (2a) PE-network after action models are pasted on. (2b) PE-network after persistence models are pasted in.
Figure 3. Problematic PE-net with derived effects.
3.2 DERIVED EFFECTS
In developing a PE-net, it is important to separate causal effects from derived effects. Causal effects are links that go from a predicate node in one situation to a predicate node in a later situation. Derived effects are defined by links between two predicate nodes in the same situation.

To illustrate the kind of problem that may be encountered, consider the simple PE-net in Figure 3. This PE-net is for a single Move action. It also includes (At L1) nodes which indicate what object is at location L1. Clearly, if (At L1)=X then (Loc X)=L1. Consequently, the status of (Loc X) can sometimes be derived from the status of (At L1) in the same situation. It seems natural therefore to paste onto the PE-net an arc from (At L1) to (Loc X), where the conditional probability P((Loc X)=L1 | (At L1)=X & anything else) = 1 is specified. This is an example of a derived effect. Now, assume that the move action is completely reliable, all persistence models are the no-change default model, and X is initially at L1. Given these assumptions, we would expect (Loc X)=L2 in S1 with certainty. However, the PE-net in Figure 3 implies (Loc X)=L1 in S1 with certainty! The addition of the derived effect unexpectedly resulted in the persistence model for (At L1) overriding the action model.
In general, this type of problem occurs because derived effects serve to complete an incomplete causal model. In theory, it is possible to do away with derived effects altogether. If causality is temporal, then a complete causal model going from Si to Si+1 would account for all interactions within a situation. In Figure 3, for instance, a complete action model would have both (Loc X) and (At L1) as consequence predicates. This would remove the need to directly connect (Loc X) and (At L1) within the same situation. Unfortunately, the knowledge engineering effort required to develop a complete causal model is prohibitive, since it would require the specification of conditional probabilities for all direct and indirect consequences of an action.
Figure 4. PE-net with primitive and derived predicates.
Our approach to derived effects is a compromise between complete causal modeling and the liberal use of derived effects. All PE-nets constructed by our system have the structure depicted in Figure 4. Predicates are split into two levels. Primitive predicates do not have interconnections within a situation. It is assumed that they are only conditioned on the state of the nodes in the previous situation. It is up to the knowledge engineer of the action and persistence models to ensure that the models are causally complete with respect to primitive predicates. Predicates at the derived level can only be conditioned on other nodes in the same situation. The set of predicates at the derived level may change from situation to situation; only relevant derived-level predicates are included. Enforcing this structure removes the problems with derived effects.

This approach requires that any predicate mentioned as a consequence in an action model must be a primitive predicate. In Figure 3, therefore, (Loc X) would need to be a primitive node, (At L1) a derived node, and the arcs would go from (Loc X) to (At L1). This change would repair the problem in Figure 3.
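The two-level discipline lends itself to a mechanical check. The sketch below is an illustration under assumed data structures, not the AP implementation: it flags any arc into a primitive node that does not come from the previous situation, and any arc into a derived node that does not come from the same situation. The two example networks are loosely modeled on the Figure 3 discussion, with action nodes omitted.

```python
def two_level_violations(net, primitive):
    """net: {(pred, situation): [(parent_pred, parent_situation), ...]}
    primitive: set of predicate names treated as primitive.
    Returns the list of offending arcs as (parent_node, child_node) pairs."""
    violations = []
    for (pred, s), parents in net.items():
        for parent_pred, parent_s in parents:
            if pred in primitive:
                ok = parent_s == s - 1   # primitive: previous situation only
            else:
                ok = parent_s == s       # derived: same situation only
            if not ok:
                violations.append(((parent_pred, parent_s), (pred, s)))
    return violations


primitive = {"Loc X"}

# Problematic structure: a same-situation arc into (Loc X), and a
# persistence arc into (At L1).
figure3_like = {
    ("Loc X", 0): [],
    ("At L1", 0): [],
    ("Loc X", 1): [("Loc X", 0), ("At L1", 1)],
    ("At L1", 1): [("At L1", 0)],
}

# Repaired structure: (At L1) is derived from (Loc X) within each situation.
repaired = {
    ("Loc X", 0): [],
    ("At L1", 0): [("Loc X", 0)],
    ("Loc X", 1): [("Loc X", 0)],
    ("At L1", 1): [("Loc X", 1)],
}

bad_arcs = two_level_violations(figure3_like, primitive)
good_arcs = two_level_violations(repaired, primitive)
```

A check of this kind only enforces the structural constraint; whether the primitive-level models are causally complete remains the knowledge engineer's responsibility.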
3.3 CONTEXT-DEPENDENT EFFECTS
Many planners have action models where the consequences of an action are functions of the situation in which the action was executed (Wilkins, 1988). In a PE-net this can be handled by invoking these same functions to determine possible node states in situation Si+1 as a function of the possible node states in situation Si. Iterating through the states in this way will enumerate all possible states for each node.

3.4 PLANNED CONTINGENCIES
A plan contains contingent actions when the decision to execute an action (or which action) is contingent on the situation. In a PE-net, contingent actions can be handled by combining actions into a single node, and then conditioning the merged action node on the nodes which determine which action will be executed. Figure 5 depicts a network with contingent actions.

There are two things to note here. First, actions can be made contingent on whether or not previous actions were executed. Consequently, it is straightforward to represent a contingent action sequence (i.e., a contingency plan). Second, there is no requirement that action selection be a deterministic function of the situation. It could be probabilistic, to reflect possible uncertainties about the agent's ability to detect the true status of a situation. Alternatively, one could make the action contingent on a sensor report and make the sensor report a probabilistic function of the situation.
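A merged, contingent action node whose choice depends on a noisy sensor report can be sketched as follows. All names and numbers here are hypothetical; the sketch marginalizes over the true state and the sensor report to obtain the distribution over which action is executed.

```python
def action_distribution(p_state, p_report_given_state, policy):
    """Marginal distribution over the merged action node.

    p_state: {state: P(state)} for the conditioning node in the situation.
    p_report_given_state: {state: {report: P(report | state)}}.
    policy: {report: action}, a deterministic choice given the report.
    """
    dist = {}
    for state, p_s in p_state.items():
        for report, p_r in p_report_given_state[state].items():
            action = policy[report]
            dist[action] = dist.get(action, 0.0) + p_s * p_r
    return dist


# Made-up numbers: where object A really is, and how reliable the sensor is.
p_loc_a = {"L1": 0.7, "L2": 0.3}
sensor = {
    "L1": {"sees L1": 0.9, "sees L2": 0.1},
    "L2": {"sees L1": 0.2, "sees L2": 0.8},
}
policy = {"sees L1": "Move A L1 L2", "sees L2": "Move B L2 L1"}

p_action = action_distribution(p_loc_a, sensor, policy)
# Move A: 0.7*0.9 + 0.3*0.2 = 0.69; Move B: 0.7*0.1 + 0.3*0.8 = 0.31
```

Although the policy is deterministic given the report, action selection is still a probabilistic function of the true situation, because the report itself is uncertain.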
3.5 MULTIPLE LEVELS OF ABSTRACTION.
Many planners use operators at varying levels of abstraction. As a result, there may be plans that are only partially detailed. In order to build PE-nets for such plans, it is necessary to have probabilistic action models for operators at each level of abstraction. High-level actions can be pasted onto the network in exactly the same manner as less abstract actions.
Figure 5. PE-net for plan with contingent actions.
Figure 6. Example hierarchical plan. B, C, D are high-level actions. C2 is an alternative subplan, which can be selected instead of C1. C1a, C1b, and C1c are executable.
When an abstract operator is expanded, the PE-subnet for that expansion should be pasted onto the overall PE-net. There are two things to note about the PE-subnet. First, not only should it contain the actions that are selected to be part of the plan, but it should also contain the actions that were enumerated, but not selected. To illustrate, consider the plan in Figure 6. B, C and D are abstract actions, each capable of expansion. After expanding C, it turns out that there are two possible approaches to achieving C, namely C1 and C2. C1 is selected for inclusion in the plan and is further expanded to the sequence of actions C1a, C1b and C1c. To construct the PE-net, a subnet that combines C1 and C2 into a contingent action node is constructed and pasted onto the PE-net. This requires enumerating the conditions under which the alternative action will be selected. After this, the subnet for the C1a, C1b, C1c sequence is constructed and pasted onto the PE-net. The second thing to note is that when a PE-subnet for an expanded subplan is pasted onto a PE-net, it doesn't necessarily override everything in the more abstract action model. There may be consequence predicates of the higher-level action model that are not mentioned in the lower-level action models.
One advantage of using PE-nets to evaluate hierarchical plans is that the PE-net can be processed to estimate both the probability that the current plan will succeed and the probability that the current plan will lead to success (i.e., the probability that the plan can be successfully modified during execution). The probability that the current plan will succeed is the joint probability that the goal conditions (represented as specific states on specified predicates) will be true in the final situation and that the (most detailed) steps in the current plan will be executed, while the probability that the plan will lead to success is just the probability that the goal conditions will be true in the final situation.
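The difference between the two quantities is the difference between a joint and a marginal probability. The sketch below uses a made-up joint distribution over two Boolean events (goal conditions hold in the final situation; the most detailed steps are executed as planned) purely to illustrate the arithmetic. In practice these values would come from inference on the PE-net.

```python
# Hypothetical joint distribution P(goal_true, steps_executed).
joint = {
    (True, True): 0.60,
    (True, False): 0.15,
    (False, True): 0.10,
    (False, False): 0.15,
}


def p_plan_succeeds(joint):
    """Joint probability: goal holds AND the current steps are executed."""
    return joint[(True, True)]


def p_plan_leads_to_success(joint):
    """Marginal probability: goal holds, whichever steps end up executed."""
    return sum(p for (goal, _executed), p in joint.items() if goal)


p_succeed = p_plan_succeeds(joint)
p_leads = p_plan_leads_to_success(joint)
```

Because the joint event is contained in the marginal one, the first probability can never exceed the second; the gap reflects cases in which the goal is reached only after the plan is modified.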
3.6 OVERLAPPING ACTIONS, DURING CONDITIONS AND EFFECTS.
In AP, a planning agent may plan the coordinated activity of multiple execution agents. Although the plan for each execution agent is linear, the overall plan will contain multiple simultaneous actions with interlocking start and end situations. To relate the effects of overlapping actions, AP action models use during conditions and during effects. A during condition is a proposition that must be true during execution of an action in order for some effect to occur. Similarly, a during effect is an effect that occurs during the execution of an action, rather than in the end situation of that action.
If the probabilistic action and persistence models do not mention specific times (see below), then PE-net construction for plans with overlapping actions proceeds by arbitrarily selecting a linear ordering on the situations that is consistent with the interlock constraints, and then pasting onto the PE-net any during conditions and effects of an action for the nodes in the situations between the start and end situation of that action. The probability estimates derived from such a PE-net have two useful characteristics. First, they are minimum estimates. This is because the planning agent can choose to further constrain the plan so that the execution agents will execute the actions in a way that satisfies the linear ordering on the situations. Second, in practical applications the probability estimates of the goal conditions are not likely to change substantially if a different linear ordering is selected. This is because nonlinear planners (such as AP) are specifically designed to impose order constraints whenever the current constraints leave attainment of the goal conditions in doubt. Consequently, while it is certainly possible for a nonlinear planner to miss an important order constraint, a planner that does this often is unlikely to transition to practical applications.
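Selecting a linear ordering consistent with the interlock constraints amounts to a topological sort of the situations. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the situation names, the constraint set, and the helper functions are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def linear_order(interlocks):
    """interlocks: {situation: set of situations that must precede it}.
    Returns one linear ordering consistent with the constraints."""
    return list(TopologicalSorter(interlocks).static_order())


def situations_between(order, start, end):
    """Situations strictly between an action's start and end situation;
    these are where its during conditions and effects would be pasted."""
    i, j = order.index(start), order.index(end)
    return order[i + 1 : j]


# Two agents act in parallel from S0; both must finish before S3.
interlocks = {"S0": set(), "S1": {"S0"}, "S2": {"S0"}, "S3": {"S1", "S2"}}
order = linear_order(interlocks)

# An action that starts in S0 and ends in S3 overlaps S1 and S2.
inside = situations_between(order, "S0", "S3")
```

The relative order of S1 and S2 is left to the sorter, which matches the arbitrary choice described in the text: any ordering that respects the interlocks is acceptable.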
3.7 REFERENCES TO SPECIFIC TIMES AND DURATIONS.
One can easily introduce time into situations by adding a predicate for clock time and having action models that assign a probability distribution over the clock time in the end situation conditioned on the clock time in the start situation. If clock times are included, then the probabilistic persistence models can use time elapsed since the previous situation as a conditioning variable.
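A clock-time predicate can be propagated through a sequence of actions by summing over start times and durations. The following sketch assumes discrete clock times and a hypothetical `duration_model` function that returns a duration distribution for a given start time; the duration values are invented for illustration.

```python
def end_time_distribution(start_dist, duration_model):
    """P(clock time in end situation), given P(clock time in start situation).

    start_dist: {start_time: probability}
    duration_model: function start_time -> {duration: probability}
    """
    end = {}
    for start, p_start in start_dist.items():
        for duration, p_dur in duration_model(start).items():
            t = start + duration
            end[t] = end.get(t, 0.0) + p_start * p_dur
    return end


def two_or_four(start):
    # Hypothetical action taking 2 or 4 minutes, regardless of start time.
    return {2: 0.5, 4: 0.5}


def one_or_three(start):
    # Hypothetical follow-on action taking 1 or 3 minutes.
    return {1: 0.25, 3: 0.75}


s1_clock = end_time_distribution({0: 1.0}, two_or_four)
s2_clock = end_time_distribution(s1_clock, one_or_three)
```

For a linear plan like this one, every possible clock time in a later situation is at or after every compatible clock time in the earlier one, so elapsed times used by the persistence models are never negative.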
Figure 7. Structure of PE-net for plan that lacks a clear temporal order on situations. (Panels 7a and 7b.)
This approach works well for linear plans, where the sequence of situations is necessarily in temporal order no matter what the distribution of situation clock times. Unfortunately, this does not always hold for plans with overlapping actions. To illustrate the problem that may result from overlapping actions, consider the plan in Figure 7a. In this plan actions A1 and A2 begin together. A1 takes either 2 or 4 minutes to complete; A2 takes 1 or 6 minutes. As a result, the clock time for S1 is either 2 or 4, and for S2 it is either 1 or 6. If the PE-net for this plan orders the situations S0-->S1-->S2-->S3, then there are possible states for Clock-time in S1 that come after some states for Clock-time in S2. As a result, the persistence models must condition the probability distribution over the other nodes in S1 as a function of negative elapsed times. This is obviously intolerable.
A solution to this problem is to split situations so that the temporal ordering of the situations is guaranteed. For instance, as shown in Figure 7b, S2 can be split into S2a and S2b. A new node, Relative-end-time, is added. The probabilistic action model for A2 is pasted onto S2a whenever Relative-end-time is negative. Otherwise it is pasted onto S2b. This solution guarantees that the situations are in temporal order, even though the clock times for the situations may overlap.
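The splitting rule can be traced through the A1/A2 example with a short sketch. Two assumptions are made here that the text does not state: Relative-end-time is taken to be the end time of A2 minus the end time of A1, and the two durations are treated as independent and equally likely.

```python
from itertools import product

# Durations from the example; the 0.5 probabilities are assumed.
a1_durations = {2: 0.5, 4: 0.5}
a2_durations = {1: 0.5, 6: 0.5}


def split_assignment(a1, a2):
    """Probability that A2's end situation is S2a (before S1) or S2b (after S1)."""
    result = {"S2a": 0.0, "S2b": 0.0}
    for (t1, p1), (t2, p2) in product(a1.items(), a2.items()):
        relative_end_time = t2 - t1          # assumed definition
        target = "S2a" if relative_end_time < 0 else "S2b"
        result[target] += p1 * p2
    return result


# Elapsed times S1 -> S2 under the naive ordering S0 --> S1 --> S2.
elapsed = sorted(t2 - t1 for t1, t2 in product(a1_durations, a2_durations))

assignment = split_assignment(a1_durations, a2_durations)
```

Two of the four duration combinations give a negative elapsed time under the naive ordering. After the split, those cases are routed to S2a, which precedes S1, so every arc in the ordering S0, S2a, S1, S2b spans a non-negative elapsed time.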
4. DISCUSSION
Our work to date suggests that automated procedures can be developed for constructing PE-nets for plans that contain a variety of complicating features. Belief networks do seem to provide an adequate formal foundation for probabilistic evaluation of plans, and automated construction of these nets is feasible.

Clearly, a great concern is computational complexity. Our work to date suggests that for linear plans the number of nodes in a PE-net grows linearly with the length of a plan. However, unless care is taken, the number of node states will increase exponentially with the length of the plan. This will occur whenever multiple node states are generated for each node state in a previous situation. This problem can be mitigated somewhat by defining an "OTHER" node state, which combines into a single node state a set of node states that seem to have little relevance to evaluating the plan. In general, if the action and persistence models are carefully engineered, then we anticipate that the number of node states will increase linearly with the length of a linear plan.

Nonlinear plans are more problematic. If relative end time nodes are inserted then, as the example in Section 3.7 indicates, the number of situations will increase rapidly, where every situation will contain most of the primitive predicates mentioned in any of the action models. The rate of increase is not exponential, but it is substantial.

Finally, the cost of exact processing of a belief net increases exponentially with the size of the network in the worst case (Cooper, 1990). This suggests that approximate (e.g., Monte Carlo) algorithms should be used to process large PE-nets.

References
Allen, J., Hendler, J. and Tate, A. (eds.) (1990) Readings in Planning. San Mateo, CA: Morgan Kaufmann.
Cooper, G.F. (1990) The computational complexity of probabilistic inference using Bayesian belief networks, Artificial Intelligence, 42, 393-405.
Dean, T. and Wellman, M. (1991) Planning and Control. San Mateo, CA: Morgan Kaufmann.
Elsaesser, C. and Macmillan, T.R. (1991) Representation and Algorithms for Multiagent Adversarial Planning, Technical Report MTR-91W000207, MITRE Corporation, December 1991.
Hanks, S. (1990) Projecting Plans for Uncertain Worlds. Technical Report 756, Yale University, Dept. of Computer Science.
Kushmerick, N., Hanks, S. and Weld, D. (1993) An Algorithm for Probabilistic Planning, Technical Report 93-06-03, Dept. of Computer Science and Engineering, Univ. of Washington.
Wellman, M. P. (1990) The STRIPS assumption for planning under uncertainty. In Proceedings AAAI-90, Menlo Park, CA: AAAI Press, 198-203.
Wilkins, D. (1988) Practical Planning: Extending the Classical AI Planning Paradigm. San Mateo, CA: Morgan Kaufmann.