Defeasible Planning
John L. Pollock
Department of Philosophy
University of Arizona
Tucson, Arizona 85721
(e-mail: pollock@arizona.edu)
From: AAAI Technical Report WS-98-02. Compilation copyright © 1998, AAAI (www.aaai.org). All rights reserved.
Abstract
Planning theory has traditionally made the assumption that the planner begins with all relevant knowledge for solving the problem. Planning agents operating in a complex and dynamic environment cannot make that assumption. They are simultaneously planning agents and epistemic agents, and the pursuit of knowledge is driven by the planning. In particular, the search for a plan initiates the search for threats. Because the epistemic investigation can be arbitrarily complex and may be non-terminating, this has the consequence that the planner cannot wait for the epistemic investigation to terminate before continuing the plan search. This in turn has the consequence that the planning cannot be done by a traditional algorithmic planner. It is argued that the planning must instead be done defeasibly, making the default assumption that there are no threats and then modifying plans as threats are discovered. This paper sketches how to build such a planner based upon the OSCAR defeasible reasoner. The resulting planner performs essentially the same search as UCPOP, but does it by reasoning defeasibly about plans rather than running conventional plan search algorithms. A beneficial side-effect of such defeasible planning is that updating plans in the face of changes in the agent's beliefs (often reflecting changes in the world) can be done more efficiently than by replanning.
1. Planning with Variable Knowledge
Most applications of AI planning theory assume that the planner comes to the planning problem equipped with the knowledge needed to solve the planning problem. In a complex and variable environment, that assumption can fail in three different ways:
First, the beliefs of the planning agent will typically be fallible. They will be based on the best evidence currently available to it, but as more information is acquired, the agent may have to change its mind about some of its beliefs. Put another way, the agent's epistemic reasoning will be defeasible. Planning based upon such defeasibly held beliefs must also be defeasible. If a plan assumes a belief that is subsequently retracted, then the plan must be retracted as well.
Second, as the world changes, the agent's beliefs about its current situation must be updated to keep track of the world. A plan predicated on the world's being a certain way must be retracted if the world changes so as to falsify the assumptions of the plan.
Third, and most important for the present paper, the agent's knowledge of the world, even if accurate, will typically be incomplete. An agent cannot be assumed to know everything there is to know about a complex environment. Instead, it must be equipped with cognitive facilities enabling it to search for and acquire further knowledge as it needs it. The search for additional knowledge can take two forms. Sometimes the requisite knowledge can be obtained by reasoning from what the agent already knows. But sometimes reasoning by itself will not provide the needed knowledge. The agent may have to undertake empirical investigations. The simplest empirical investigations will consist of the agent's perceiving its surroundings. But most empirical investigations require the agent to first take actions and then perceive the results. For example, in order to find out what time it is the agent may first have to go into the next room and position itself before the clock, and then perceive the display on the clock face.
It is useful to follow philosophers in distinguishing between epistemic cognition and practical cognition. Epistemic cognition is cognition about what to believe, and practical cognition is cognition about what to do. In the current context, practical cognition is just planning. Planning presupposes beliefs about the agent's environment, and those beliefs are produced by epistemic cognition. The above observations can be summarized by saying that (1) epistemic cognition is defeasible, and a planning agent must be prepared to revise its plans as its defeasibly held beliefs change, and (2) in order to solve a planning problem the agent may have to acquire more information, both through reasoning and through empirical investigation.
2. Algorithmic Goal-Regression Planning
The purpose of this paper is to investigate the extent to which existing goal-regression planners can accommodate the observations of section one. I will argue that these observations require important changes to current AI planning technology. Not only must the adoption of plans be defeasible; in a sense to be explained below, the plan search itself must be done defeasibly.
Typical goal-regression planners are SNLP (McAllester and Rosenblitt 1991), UCPOP (Penberthy and Weld 1992, Weld 1994), and PRODIGY (Carbonell, Knoblock, and Minton 1991). Goal-regression planning is based upon two kinds of information about the world. First, there is information about the agent's current situation, typically symbolized as a set of literals. Second, there is information to the effect that if an action is performed under certain circumstances, it will have a certain result. I will represent this using a planning-conditional, of the form "(subgoal & action) ⇒ goal". Goal-regression planning works backwards from goals to subgoals, using planning-conditionals. Given an interest in finding a plan for achieving goal and a planning-conditional "(subgoal & action) ⇒ goal", a goal-regression planner tries to find a subplan for achieving subgoal, and given such a subplan it constructs a plan for achieving goal by adding action to the end of the subplan.
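As a concrete illustration of this backward chaining, the following sketch (in Python, which is not the language of the OSCAR implementation) regresses a goal through a list of planning-conditionals. The data structures, function names, and example conditionals are assumptions introduced here for illustration; they are not part of Pollock's planner.

    # Minimal sketch of goal-regression through planning-conditionals.
    # All names are hypothetical; this is not the OSCAR implementation.

    from typing import NamedTuple, Optional

    class Conditional(NamedTuple):
        subgoal: str          # condition that must hold before acting
        action: str           # action to perform
        goal: str             # literal the action achieves

    def regress(goal: str, state: set, conditionals: list,
                depth: int = 10) -> Optional[list]:
        """Work backwards from goal to subgoals, returning a list of actions."""
        if goal in state:                     # degenerate case: goal already true
            return []
        if depth == 0:
            return None
        for c in conditionals:
            if c.goal == goal:                # conditional whose consequent is goal
                subplan = regress(c.subgoal, state, conditionals, depth - 1)
                if subplan is not None:       # append the action to the subplan
                    return subplan + [c.action]
        return None

    # Example: to have coffee, grind beans, then brew.
    conds = [Conditional("have-ground-beans", "brew", "have-coffee"),
             Conditional("have-beans", "grind", "have-ground-beans")]
    print(regress("have-coffee", {"have-beans"}, conds))  # ['grind', 'brew']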
A crucial complication arises when a goal or subgoal is a conjunction, because the planner will not usually have planning-conditionals whose consequents are the desired conjunctions. Goal-regression planning handles this by planning separately for the individual conjuncts, and then merging the separate subplans into a single plan for the conjunction. Unfortunately, merging the separate subplans into a single plan may create destructive interference, wherein executing the steps of one subplan may have results that will prevent another subplan from working. Goal-regression planners handle this by checking for destructive interference and attempting to repair it before proposing the merged plan as a plan for the conjunctive goal.
To illustrate, when UCPOP merges two plans, it checks to see whether (1) a step s1 of one plan is intended to achieve a subgoal g whose satisfaction is presupposed by a later step s2, (2) there is a step s of the other plan that prescribes an action A for which there is a planning-conditional of the form (precondition & A) ⇒ ~g, and (3) it is consistent with the ordering constraints of the merged plan to execute s between s1 and s2. s is said to threaten the causal-link between s1 and s2, and UCPOP attempts to resolve threats by either adding ordering-constraints that preclude s's being executed between s1 and s2 (promotion or demotion) or adding steps to the plan that will make precondition false at the appropriate time (confrontation). The merged plan is not proposed as a plan for the conjunction until either (1) it is verified that there are no threats, or (2) threats are repaired by promotion, demotion, or confrontation. Such a planner can be proven sound and complete (Penberthy and Weld 1992).
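The following sketch makes the threat check just described explicit for a merged plan represented as steps, ordering-constraints, and causal-links. It is an illustrative reconstruction under assumed data structures, not UCPOP's code, and the brute-force enumeration of step orders is only meant to spell out condition (3).

    # Sketch of the threat check described above: a step s threatens the
    # causal-link s1 --g--> s2 if s can fall between s1 and s2 and some
    # planning-conditional says s's action can make g false.
    # Illustrative only; the representation is an assumption, not UCPOP's.

    from itertools import permutations

    def consistent_between(s, s1, s2, orderings, steps):
        """True if some total order respecting `orderings` puts s between s1 and s2."""
        for order in permutations(steps):
            if all(order.index(a) < order.index(b) for a, b in orderings):
                if order.index(s1) < order.index(s) < order.index(s2):
                    return True
        return False

    def find_threats(causal_links, steps, orderings, negates):
        """causal_links: (s1, g, s2) triples; steps: step-id -> action;
        negates: action -> set of literals some conditional says it can falsify."""
        threats = []
        for (s1, g, s2) in causal_links:
            for s, action in steps.items():
                if s in (s1, s2):
                    continue
                if g in negates.get(action, set()) and \
                   consistent_between(s, s1, s2, orderings, list(steps)):
                    threats.append((s, s1, s2, g))
        return threats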
Conventional goal-regression planners are algorithmic planners, in the sense that in solving a planning problem, such a planner executes an effective computation, i.e., the set of pairs (problem, solution) that characterize the planner is recursively enumerable. This is because each step of the plan-search is dictated by the planning algorithm, the search for threats is performed by simply searching a precompiled finite list of planning-conditionals, and the repair of threats is done mechanically.
3. Problems for Algorithmic Goal-Regression Planning
In a complex and changing environment, the beliefs upon which goal-regression planning is based will be held only defeasibly. If a crucial belief is retracted, then a plan based upon it must also be retracted and an attempt made to either repair it or replace it. I will refer to this as plan-updating. Plan-updating need not necessitate that the planning itself be done differently than by using conventional planning algorithms. If crucial beliefs change, the algorithm can be run over again with the new beliefs. On the other hand, as we will see below, there may be a more efficient way to handle plan-updating.
We have also seen that in a complex environment, an agent's knowledge will be both incomplete and potentially completable. The sense in which the agent's knowledge is potentially completable is not that the agent can ever know it all, but rather that when a crucial piece of knowledge is missing it is always in principle possible that the agent can acquire that particular missing piece of knowledge either through reasoning from extant beliefs or by empirical investigation. The potential completability of the agent's knowledge creates major difficulties for algorithmic goal-regression planning.
The problem arises from the fact that it cannot be assumed that a planning agent operating in a complex environment has exactly the knowledge it needs to solve a planning problem. Such an agent must build its own knowledge base. The system designer can get things started by providing background knowledge, but the agent must be provided with cognitive machinery enabling its knowledge base to grow and evolve as it gains experience of its environment, senses its immediate surroundings, and reasons about the consequences of beliefs it already holds. The more complex the environment, the more the agent will have to be self-sufficient for knowledge acquisition. I have distinguished between practical cognition and epistemic cognition. The principal function of epistemic cognition in an autonomous agent is to provide the information needed for practical cognition. As such, the course of epistemic cognition is driven by practical interests. Rather than coming to the planning problem equipped with all the knowledge required for its solution, the planning problem itself directs epistemic cognition, focusing epistemic endeavors on the pursuit of information that will be helpful in solving current planning problems.
Paramount among this information is knowledge about what will happen if certain actions are taken under certain circumstances, i.e., planning-conditionals. Sometimes the agent already knows what will happen, but often it has to figure it out. At the very least this will require reasoning from current knowledge. In many cases it will require the empirical acquisition of new knowledge that cannot be obtained just by reasoning from what is already known. To use the example given above, in order to construct a plan the planning agent may have to find out what time it is, and it may be able to do that only by examining the world in some way (e.g., it may have to go into the next room and look at the clock). In general, such empirical investigations are carried out by performing actions (not just by reasoning). Figuring out what actions to perform is a matter of engaging in further planning. The agent acquires the epistemic goal of acquiring certain information, and then plans for how to accomplish that. So planning drives epistemic investigation which may in turn drive further planning. It follows that an essential characteristic of planning agents is that planning and epistemic cognition are interleaved. It is accordingly impossible to require of a planning agent capable of functioning in realistically complex environments that it acquire all the requisite knowledge before beginning the plan search.
Now let us apply this to the question whether a planning agent can perform its planning by implementing planning algorithms. That is only possible if destructive interference is computable, which in turn requires that the consequences of actions be computable. As we have seen, autonomous planning agents cannot rely on precompiled knowledge. They must engage in genuine reasoning about the consequences of actions, and we should not expect that reasoning to be any simpler than general epistemic reasoning. Realistically, epistemic reasoning must be defeasible, which makes the set of conclusions at best Δ2.¹ But even if we could construct an agent that did only first-order deductive reasoning, the set of conclusions is not effectively computable--it is recursively enumerable. Even for such an unrealistically oversimplified planner, destructive interference will not be computable--the set of destructive interferences will be only r.e. This means that when the planning algorithm computes plans for the conjuncts of a conjunctive goal and then considers whether they can be merged without destructive interference, the reasoning required to find any particular destructive interference may take indefinitely long, and if there is no destructive interference, there will be no point at which the planner can draw the conclusion that there is none simply on the grounds that none has been found. Thus a planning algorithm that waits for such assurance before going on will bog down at this point and will never be able to produce the merged plan for the conjunctive goal.

1 For a discussion of this, see Pollock (1995), chapter three.
If destructive interference is not computable, how can a planner get away with dividing conjunctive goals into separate conjuncts and planning for each conjunct separately?
The key to this problem emerges from considering how human beings solve it. Humans assume defeasibly that the separate plans do not destructively interfere with one another, and so infer defeasibly that the merged plan is a good plan for the conjunctive goal. Having made this defeasible inference, human planners then look for destructive interference that would defeat it, but they do not regard it as essential to establish that there is no destructive interference before they make the inference. And if, at the time plan execution is to begin, no destructive interference has been discovered, then we humans go ahead and execute the plan despite the fact that we have not proven conclusively that there is no destructive interference.
One may be tempted to suppose that human beings are making an unreasonable leap of faith here, and that a more rational agent would postpone plan execution until it has been established that there is no destructive interference. However, the logic of the epistemic search for destructive interference makes that logically impossible. Given a logically complex knowledge base, there will not, in general, be a point at which an agent can conclude with certainty that there is no destructive interference within a plan, so an agent that required such certainty would be unable to plan for conjunctive goals.
The upshot of this is that a rational agent operating in a realistically complex environment must make defeasible assumptions in the course of its planning, and then be prepared to change its planning decisions later if subsequent epistemic reasoning defeats those defeasible assumptions. In other words, the reasoning involved in planning must be a species of defeasible reasoning. Planning in autonomous agents cannot be done algorithmically.
The general way goal-regression planning must work is by using planning-conditionals to reason backwards from goals to subgoals, splitting conjunctive goals into their conjuncts and planning for them separately, and then merging the plans for the individual conjuncts into a combined plan for the conjunctive goal. The planning agent will infer defeasibly that the merged plan is a solution to the planning problem. A defeater for this defeasible inference consists of discovering that the plan contains destructive interference. Whenever a defeasible reasoner makes a defeasible inference, it must adopt interest in finding defeaters, so in this case the agent will adopt interest in finding destructive interference.2 Finding such interference should lead the agent to try various ways of repairing the plan to eliminate the interference, and then lead to a defeasible inference that the repaired plan is a solution to the planning problem.

2 For more about the dynamics of defeasible reasoning, see chapter four of Pollock [1995].
4. Rules for Defeasible Planning
The OSCAR system of defeasible reasoning3 was constructed with epistemic reasoning in mind, but by providing it with appropriate inference-schemes, it can also implement the kind of defeasible planning described above. The simplest way to do this is to regard a solution to a planning problem as an epistemic conclusion to the effect that a certain plan will achieve the goal. Defeasible planning is then defeasible reasoning in support of this epistemic conclusion. In this section I will give a brief description of how this reasoning can be performed.
The simplest case of means-end reasoning is the degenerate case in which the goal to be achieved is already true, and hence nothing needs to be done to achieve it. A null-plan for the goal goal is a plan with no plan-steps. The degenerate case of means-end reasoning can then be regarded as proceeding in accordance with the following inference-scheme:

PROPOSE-NULL-PLAN
Given an interest in finding a plan for achieving goal, if goal is already true, take that as a conclusive (nondefeasible) reason for concluding that a null-plan will achieve goal.
The core of goal-regression planning consists of using planning-conditionals to reason backwards from goals to subgoals. This can be formulated as follows:
GOAL-REGRESSION
Given an interest in finding a plan for achieving G, adopt interest in finding planning-conditionals (A & C) ⇒ G having G as their consequent. Given such a conditional, adopt an interest in finding a plan for achieving C. If a plan subplan is proposed for achieving C, construct a plan by (1) adding a new step to the end of subplan, where the new step prescribes the action A, and (2) ordering the new step after all steps of subplan. Infer nondefeasibly that the new plan will achieve G.
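A minimal sketch of what PROPOSE-NULL-PLAN and GOAL-REGRESSION construct, assuming a plan is simply a pair of plan-steps and ordering-constraints; the representation and helper names are illustrative assumptions rather than OSCAR's interface.

    # A plan is sketched here as (steps, orderings), where steps maps
    # step-ids to actions.  Illustrative only.

    from itertools import count

    _ids = count(1)

    def propose_null_plan():
        """PROPOSE-NULL-PLAN: a plan with no steps and no ordering constraints."""
        return {}, set()

    def regress_plan(subplan, action):
        """Given a subplan for C and a conditional (A & C) => G, build a plan for G
        by adding a new step prescribing A, ordered after every step of subplan."""
        steps, orderings = subplan
        new_step = next(_ids)
        new_steps = dict(steps)
        new_steps[new_step] = action
        new_orderings = set(orderings) | {(s, new_step) for s in steps}
        return new_steps, new_orderings

    # Usage: plan for "have-coffee" from the null plan via two conditionals.
    p0 = propose_null_plan()          # "have-beans" is already true
    p1 = regress_plan(p0, "grind")    # (have-beans & grind) => have-ground-beans
    p2 = regress_plan(p1, "brew")     # (have-ground-beans & brew) => have-coffee
    print(p2)                         # two steps, with 'brew' ordered after 'grind'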
Given two plans plan1 and plan2, let plan1 + plan2 be the plan that results from combining the plan-steps and ordering-constraints of each. We can plan for conjunctive goals by using the following defeasible inference-scheme:

SPLIT-CONJUNCTIVE-GOAL
Given an interest in finding a plan for achieving a conjunctive goal (G1 & G2), adopt interest in finding plans plan1 for G1 and plan2 for G2. If such plans are proposed, infer defeasibly that plan1 + plan2 will achieve (G1 & G2).
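The following sketch illustrates plan1 + plan2 and the defeasible character of SPLIT-CONJUNCTIVE-GOAL: the merged plan is proposed at once, and the conclusion that it achieves the conjunctive goal is recorded as defeasible together with an interest in the threats that would defeat it. The Conclusion record and function names are illustrative assumptions, not OSCAR's internal representation.

    # Sketch of plan1 + plan2 and the defeasible SPLIT-CONJUNCTIVE-GOAL inference.

    from dataclasses import dataclass, field

    @dataclass
    class Conclusion:
        content: str
        defeasible: bool = True
        defeated: bool = False
        interests: list = field(default_factory=list)   # defeaters to look for

    def merge_plans(plan1, plan2):
        """plan1 + plan2: union of the plan-steps and ordering-constraints."""
        steps1, ord1 = plan1
        steps2, ord2 = plan2
        return {**steps1, **steps2}, ord1 | ord2

    def split_conjunctive_goal(g1, g2, plan1, plan2):
        merged = merge_plans(plan1, plan2)
        c = Conclusion(content=f"merged plan achieves ({g1} & {g2})")
        # The inference is made at once; the search for threats comes afterwards.
        c.interests.append("threats to causal-links of the merged plan")
        return merged, c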
SPLIT-CONJUNCTIVE-GOAL is a rule of defeasible reasoning. Any such inference-scheme must be supplemented with an account of what the defeaters are for reasoning in accordance with it. We can follow the lead of UCPOP and adopt the following defeater:
FIND-THREATS
Given an inference in accordance with SPLIT-CONJUNCTIVE-GOAL to the conclusion that plan3 will achieve (G1 & G2), for each plan-step s2 of plan3, if s2 assumes that some earlier step s1 achieves a subgoal g,4 then for each plan-step s of plan3, if it is consistent with the ordering-constraints of plan3 that s occur between s1 and s2, and the action prescribed by s is A, adopt interest in finding a planning-conditional of the form (C & A) ⇒ ~g. Take the inference to the conclusion that plan3 will achieve (G1 & G2) to be defeated by the discovery of such a planning-conditional.

3 The theory behind the OSCAR defeasible reasoner is presented in Pollock [1995]. The details of the implementation are described in The OSCAR Manual, which can be downloaded from http://www.u.arizona.edu/~pollock/.

4 Such assumptions are recorded in the standard way, using causal-links.
The resolution of threats can be handled by a pair of
defeasible inference-schemes:
ADD-ORDERING-CONSTRAINT
Given an interest in finding a plan for achieving a conjunctive goal (g1 & g2), and plans plan1 for g1 and plan2 for g2, if plan3 is a putative plan for (g1 & g2) constructed by merging plans plan1 and plan2 (and possibly other plans), but a plan-step s of plan3 threatens a causal-link between steps s1 and s2 of plan3, construct a plan plan+ by adding the ordering-constraint that s not occur between s1 and s2 (if this can be done consistently) and infer defeasibly that plan+ will achieve (g1 & g2).

CONFRONTATION
Given an interest in finding a plan for achieving a conjunctive goal (g1 & g2), and plans plan1 for g1 and plan2 for g2, if plan3 is a putative plan for (g1 & g2) constructed by merging plans plan1 and plan2 (and possibly other plans), but a plan-step s of plan3 threatens a causal-link between steps s1 and s2 of plan3 by way of a planning-conditional (C & A) ⇒ ~g, adopt interest in finding a plan for achieving ~C (or if C is a conjunction, for each conjunct of C, adopt interest in finding a plan for achieving its negation). If a plan repair-plan is proposed for achieving ~C or the negation of one of its conjuncts, construct a new plan plan+ by merging repair-plan with plan3, and ordering the final step of repair-plan before the step that dictates A and threatens the causal-link. If this ordering can be done consistently, infer defeasibly that plan+ will achieve (g1 & g2).
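As a rough illustration of the two repair moves, the sketch below either adds an ordering-constraint that pushes the threatening step outside the protected interval (promotion or demotion), or merges in a repair-plan for ~C and orders its final step before the threatening step (confrontation). The plan representation, the supplied consistency check, and the assumption that step-ids increase as steps are added are all illustrative assumptions made for the example.

    # Sketch of the two repair moves over a plan represented as (steps, orderings).

    def add_ordering_constraint(orderings, s, s1, s2, consistent):
        """Try s < s1 (demotion) or s2 < s (promotion); return new orderings or None."""
        for extra in [(s, s1), (s2, s)]:
            candidate = orderings | {extra}
            if consistent(candidate):          # caller supplies a consistency check
                return candidate
        return None

    def confrontation(plan, repair_plan, threatening_step):
        """Merge repair_plan (assumed to achieve ~C) into plan and order its final
        step before the step that would otherwise falsify the protected subgoal."""
        steps, orderings = plan
        r_steps, r_orderings = repair_plan
        last_repair_step = max(r_steps)        # assumes step-ids increase with addition
        merged_steps = {**steps, **r_steps}
        merged_orderings = orderings | r_orderings | {(last_repair_step, threatening_step)}
        return merged_steps, merged_orderings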
The repaired plan produced by ADD-ORDERING-CONSTRAINT or CONFRONTATION resolves just one threat. Other threats may remain, so these two inference-schemes must also be regarded as defeated by finding threats. This requires us to generalize FIND-THREATS to apply to inferences made in accordance with these inference-schemes as well as SPLIT-CONJUNCTIVE-GOAL. ADD-ORDERING-CONSTRAINT and CONFRONTATION will automatically lead to attempts to resolve such additional threats.
An experimental planner that implements reasoning of the general sort just described has been constructed within the OSCAR defeasible reasoner, and can be downloaded from my website.5

5 The URL is http://www.u.arizona.edu/~pollock/. Two technical reports describing the planner in further detail can also be downloaded. "Reasoning defeasibly about plans" describes the construction of the planner, and "The logical foundations of goal-regression planning" formulates a general theory of goal-regression planning.
The need for defeasible planning is driven by the fact that a planning agent's knowledge of the consequences of actions can be expected to be incomplete but potentially completable. A second source of defeasibility lies in the fact that the beliefs upon which planning is based are themselves held only defeasibly, and if they are retracted the plan must be retracted as well. Plan-updating is the process of repairing or replacing plans in the face of such belief-updating. One way of updating plans is to perform the planning all over again with the new beliefs. However, that may be quite inefficient because large parts of the planning process may be unchanged, and it would be desirable to avoid repeating them. This inefficiency is automatically avoided when plans are constructed as above by reasoning about them defeasibly. This reasoning is just more epistemic reasoning, and as such is integrated with the same reasoning that leads to the retraction of beliefs. When beliefs used in the planning are retracted (defeated), then the specific parts of the plan-reasoning that depend upon those beliefs will also be defeated. But other parts of the plan-reasoning will not be defeated, and so the conclusions drawn in that part of the reasoning can be used without further ado in the process of trying to repair the defeated plan. The reasoning need not be done over again.
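The following sketch illustrates why this is cheaper than replanning: if each plan conclusion records the beliefs and sub-conclusions it was inferred from, retracting a belief defeats only the conclusions downstream of it, and everything else survives for reuse. The dependency graph and example beliefs are purely illustrative assumptions.

    # Sketch of dependency-based retraction for plan-updating.

    def defeated_conclusions(dependents, retracted_belief):
        """dependents: belief/conclusion -> set of conclusions inferred from it.
        Return every conclusion reachable from the retracted belief."""
        defeated, frontier = set(), [retracted_belief]
        while frontier:
            node = frontier.pop()
            for concl in dependents.get(node, set()):
                if concl not in defeated:
                    defeated.add(concl)
                    frontier.append(concl)
        return defeated

    # Example: retracting "door is unlocked" defeats only the reasoning that used it;
    # the reasoning behind subplan-B survives and can be reused during repair.
    deps = {"door is unlocked":  {"subplan-A works"},
            "subplan-A works":   {"merged plan works"},
            "clock in next room": {"subplan-B works"}}
    print(defeated_conclusions(deps, "door is unlocked"))
    # {'subplan-A works', 'merged plan works'}  -- set order may vary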
5. Evaluating a Defeasible Planner
An algorithmic planner is evaluated by asking whether it is sound and complete. It is sound if every solution it proposes is a correct solution, and it is complete if it finds a solution whenever one exists. But how can we evaluate a defeasible planner? It will inevitably find unsound solutions. Hopefully, it will retract them later.
A distinction can be made between the conclusions that a defeasible reasoner is justified in holding, at any given stage of its reasoning, and the warranted conclusions that it will be justified in holding at the limit, when all possible relevant reasoning has been performed.6 What we want of a defeasible planner is that it will eventually draw warranted conclusions that constitute solutions to planning problems. Let us call the plans endorsed by warranted conclusions warranted plans. We might require that warranted plans are always solutions, and that whenever there is a solution there will be a correct warranted solution. However, as a criterion of adequacy for the planning reasoning this is too strong. The reasoner might be warranted in taking an unsound plan to be a solution simply because the reasoner is unable to draw the conclusion that some relevant fact about the world is a fact or that some relevant consequence of an action is a consequence of that action. For the same reason it may be unable to find some correct solution.
We can usefully separate the plan-reasoning from the reasoning aimed at finding factual knowledge of use in the planning. The reason-schemes used in planning may be beyond reproach, but the reasoner may still find incorrect plans and fail to find correct ones because its factual reasoning is inadequate. This separation can be achieved by simply giving the reasoner all the factual knowledge (including planning-conditionals) that is relevant to solving the problem, and then asking whether under those circumstances all its warranted plans are correct and whether it is always able to find a warranted solution when there is a solution. Let us understand soundness and completeness for defeasible planners in this way.
Giving the reasoner all the relevant factual knowledge has the effect of turning the defeasible planner into an algorithmic planner. Search for defeaters for any particular inference will terminate after a single step, so for each plan there will be a determinate point at which there is no more relevant reasoning to be done. We can take the planner to "return" the plan iff at that point it is justified in concluding that the plan is a solution to the planning problem. Because all the relevant reasoning has been done, the plan will be warranted iff the reasoner is justified in drawing that conclusion at that point.
Looking at the OSCAR planner in this way, it is equivalent to UCPOP, and sound and complete for the same class of planning problems, with the exception that it does not handle universally and existentially quantified goals.

6 This is made more precise in chapter three of Pollock [1995].
6. Conclusions
A planning agent operating in a complex environment cannot be provided from the outset with exactly the information it needs to solve, without further reasoning, all the planning problems it may encounter. It may have to engage in arbitrarily complex epistemic reasoning both about what is true in the start-state and what the consequences of actions may be under various circumstances. This has the result that the set of threats is not recursive, and that has the consequence that planning cannot proceed algorithmically. Instead, a planning agent must assume defeasibly that there are no unresolved threats until one is found. An implementation of such a defeasible planning agent was described.
Acknowledgments
This work was supported by NSF grant no. IRI-9634106.
References
Carbonell, J. G., Knoblock, C. A., and Minton, S.
1991 PRODIGY: An Integrated Architecture for Planning and Learning. In K. VanLehn (ed.), Architectures for Intelligence, pp. 241-278, Lawrence Erlbaum Associates, Hillsdale, N.J.
McAllester, David, and Rosenblitt, David
1991 "Systematic nonlinear planning", Proceedings of AAAI-91, 634-639.
Penberthy, J. Scott, and Weld, Daniel
1992 "UCPOP: a sound, complete, partial order planner for ADL". Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning, 103-114.
Pollock, John
1995 Cognitive Carpentry, MIT Press.
Weld, Daniel
1994 "An introduction to least commitment planning", AI Magazine 15, 27-62.