From: AAAI Technical Report SS-93-03. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.
Neoclassical Planning
Alberto Maria Segre†    David Sturgill†    Jennifer Turney‡
segre@cs.cornell.edu    sturgill@cs.cornell.edu    turney@wilkes1.wilkes.edu

†Department of Computer Science
Cornell University
Ithaca, NY 14853

‡Department of Mathematics and Computer Science
Wilkes University
Wilkes-Barre, PA 18766
Abstract

Classical planning systems have historically been used to provide a domain-independent framework for expressing and using domain-dependent problem solving information. Unfortunately, these systems are intrinsically intractable. In this paper, we describe a system that blends a classical planning formalism with a novel adaptive inference engine in order to produce an adaptive, approximate planner.

1. Introduction

Classical planning systems perform search. This observation highlights the fundamental problem with these systems: in large domains, like those typically encountered in real-world planning problems, intrinsically weak methods coupled with a classical planning domain representation are overwhelmed by the size of the search space.

In order to be of practical use, classical planning systems must necessarily rely on domain-specific heuristics to reduce the amount of search. Real planning domains often require complex domain-dependent heuristics that are difficult to design and even harder to refine or correct. The message of this paper is that real-world planning systems must necessarily be adaptive; that is, they must reconcile themselves to their environment by improving, refining, and perfecting their own search reduction heuristics. Therefore, by incorporating automatic acquisition of search reduction heuristics in the underlying inference engine design, we can make a classical planning system practically useful even in nontrivial domains. We claim our planning system is essentially a classical planning system that promises to scale well; in short, a neoclassical planner.

This paper describes an adaptive planning system based on an extension of the situation calculus called PERFLOG [5] and a novel adaptive inference engine: a theorem prover whose performance changes with experience [18]. Thus, because of its underlying theorem prover, our planner is both approximate and incremental, in that it is able to generate plausible candidate plans quickly and refine these plans when allotted more computation time. The planner serves as one component of our SEPIA intelligent agent architecture [19]; SEPIA is a hybrid system, combining our neoclassical planning component with a real-time reactive executive.
We begin by introducing our theorem prover and its general capabilities, exclusive of its search reduction techniques. Next, we briefly describe the PERFLOG formalism, and how it is supported by our adaptive inference engine. We next turn to the theorem prover's underlying search reduction techniques: explanation-based learning, bounded-overhead success and failure caching, heuristic antecedent reordering strategies, learned-rule management facilities, and a dynamic abstraction mechanism. Finally, we briefly discuss our current research on building a distributed (i.e., large-grain parallel) version of our adaptive inference engine.
2. The Inference Engine

We have implemented a backward-chaining definite-clause theorem prover in Common Lisp. The prover's inference scheme is essentially equivalent to PROLOG's SLD-resolution. Axioms are stored in a discrimination-net database along with rules indexed by the rule head. The database performs a pattern-matching retrieval guaranteed to return a superset of those database entries which unify with the retrieval pattern. The cost of a single database retrieval in this model grows linearly with the number of matches found and (roughly) logarithmically with the number of entries in the database.
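As a rough illustration, the following Common Lisp sketch indexes clauses only on the predicate symbol of the head. This is a much coarser index than a discrimination net, but it preserves the essential contract described above: retrieval returns a superset of the entries that unify with the query pattern. The clause representation and function names are ours for exposition only, not the prover's actual interface.

;;; A sketch of head-indexed clause retrieval.  Clauses are lists of the
;;; form (HEAD . BODY) with variables written as ?-prefixed symbols.
(defvar *clause-db* (make-hash-table :test #'eq)
  "Maps a predicate symbol to the list of clauses stored under it.")

(defun add-clause (head &rest body)
  "Store the definite clause HEAD <- BODY, indexed by the head's predicate."
  (push (cons head body) (gethash (first head) *clause-db*)))

(defun candidate-clauses (query)
  "Return every stored clause whose head might unify with QUERY: a
superset of the clauses whose heads actually unify with it."
  (gethash (first query) *clause-db*))

;; Example: a fact indexed under ON and a rule indexed under ABOVE.
(add-clause '(on a b))
(add-clause '(above ?x ?y) '(on ?x ?y))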
The system relies on iterative deepening [10] in order to force completeness in recursive domains while still taking advantage of depth-first search's favorable storage characteristics. As generally practiced, iterative deepening involves limiting depth-first search exploration to a fixed depth. If no solution is found by the time the depth-limited search space is exhausted, the depth limit is incremented and the search is restarted. In return for completeness in recursive domains, depth-first iterative deepening generally incurs a constant factor overhead when compared to regular depth-first search; the size of this constant depends on the branching factor of the search space and the value of the depth increment. Changing the increment changes the order of exploration of the implicit search space and, therefore, the performance of the inference engine.
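The following Common Lisp sketch shows the control structure we have in mind: a generic depth-first iterative deepening search over an arbitrary successor function. It is not the prover's actual search code, and the function names and interface are ours for exposition only.

;;; A sketch of depth-first iterative deepening over a generic successor
;;; function.
(defun depth-limited-search (node goal-p successors limit)
  "Depth-first search cut off below LIMIT.  Returns a goal node, :CUTOFF
if the depth limit was reached somewhere below NODE, or NIL if the space
below NODE was exhausted without finding a goal."
  (cond ((funcall goal-p node) node)
        ((zerop limit) :cutoff)
        (t (let ((cutoff-p nil))
             (dolist (child (funcall successors node)
                            (if cutoff-p :cutoff nil))
               (let ((result (depth-limited-search
                              child goal-p successors (1- limit))))
                 (cond ((eq result :cutoff) (setf cutoff-p t))
                       (result (return result)))))))))

(defun iterative-deepening-search (root goal-p successors &key (increment 1))
  "Restart the depth-limited search with successively larger limits.
INCREMENT is the iterative-deepening increment discussed above."
  (loop for limit from 0 by increment
        for result = (depth-limited-search root goal-p successors limit)
        unless (eq result :cutoff) return result))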
We perform iterative deepening on a generalized, user-defined notion of depth while respecting the overall search resource limit specified at query time. Fixing a depth-update function (and thus a precise definition of depth) and an iterative-deepening increment establishes the exploration order of the theorem prover. For example, one might define the iterative-deepening update function to compute depth of the search; with this strategy, the system is performing traditional iterative deepening. Alternatively, one might specify update functions for conspiratorial iterative deepening [4], iterative broadening¹ [7], or almost any other search strategy.

¹ The conspiracy size of a subgoal corresponds to the number of other, as yet unsolved, subgoals in the current proof structure. Thus conspiratorial best-first search prefers narrow proofs to bushy proofs, regardless of the actual depth of the resulting derivation. Iterative broadening is an analogous idea that performs iterative deepening on the breadth of the candidate proofs.

Unlike PROLOG, our theorem prover produces a structure representing the derivation tree for a successful query rather than just an answer substitution. When a failure is returned, the prover indicates whether the failure was due to exceeding a resource limit or if in fact we can be guaranteed absence of any solution. In addition, the prover supports procedural attachment (i.e., escape to Lisp), which, among other things, allows us to alter the prover's resource limits dynamically.

3. PERFLOG

The PERFLOG planning formalism is an adaptation of classical situation-calculus planning without explicit frame axioms. In their place, PERFLOG uses two simple meta-theoretic rules called causation and cancellation. The causation rule:

    holds(?p, do(?a, ?s)) ← causes(?a, ?p, ?s)

states that proposition ?p is true in the situation resulting from performing action ?a in previous situation ?s if action ?a is known to cause ?p when applied in situation ?s. For example, if one were operating in the blocks world, it would be reasonable to assert:

    causes(pickup(?x), holding(?x), ?s)

as part of the description of the pickup operator.²

² While it is possible to describe plan knowledge as a collection of rules that rely explicitly on the PERFLOG causation and cancellation rules, it is much more convenient to describe problem solving operators in a language similar to that of STRIPS [6]. Thus a SEPIA operator consists of sets of preconditions and subgoals, as well as an add list and a delete list. While the add list and delete list have the normal STRIPS semantics, preconditions and subgoals differ as in the ARMS planner [15]. More precisely, preconditions are immutable relations, e.g., block(A), which the planner makes no attempt to expand recursively. Subgoals are relations such as on(A, B) which may be recursively achieved by the planner. In the situation calculus, these relations appear augmented with a situation variable, e.g., holds(on(A, B), ?s). SEPIA operators are automatically expanded into collections of PERFLOG rules.

The cancellation rule obviates the need for explicit frame axioms:

    holds(?p, do(?a, ?s)) ← holds(?p, ?s) ∧ ¬cancels(?a, ?p, ?s)

This rule states that one can conclude that ?p holds in the situation resulting from performing action ?a in previous situation ?s provided ?p was previously true and action ?a does not cancel ?p when applied to situation ?s.

Others have proposed similar schemes in the past. The cancellation rule is similar to McCarthy's inertia axiom for the situation calculus [12]:

    holds(?p, do(?a, ?s)) ← holds(?p, ?s) ∧ ¬abnormal(?p, ?s)

Unlike abnormal, however, cancels discriminates on the action ?a. In addition, when treated with the standard PROLOG negation by resource-limited failure strategy [1], the negated antecedent ¬cancels(?a, ?p, ?s) essentially constitutes an explicit assumption: something that is true only because we fail to find a counterproof.

We can use our theorem prover's run-time resource allocation capability to further extend the operational status of PERFLOG assumptions. To see how, consider initially allocating exactly zero resources to all negated subgoals (akin to having asserted an extra database fact ¬?x). This causes every PERFLOG assumption subgoal to succeed immediately, in effect pruning the AND/OR search tree below assumption subgoals (note that the PERFLOG cancellation rule is the sole source of negated literals in a SEPIA domain theory). Once such a proof has been obtained, we increment the resources allocated to negated subgoals and reconstruct the proof, in essence performing iterative deepening in assumption space. As additional resources are allocated to verifying a proof containing assumptions, we can increase our confidence that the assumptions are, in fact, warranted.
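The following Common Lisp sketch illustrates this iterative deepening in assumption space. The PROVE entry point, its return values, and the surrounding function names are assumptions made for exposition; they are not the prover's actual interface.

;;; A sketch of iterative deepening in assumption space.  PROVE is assumed
;;; to take a literal and a resource limit and return :PROVED, :EXHAUSTED
;;; (the search space was exhausted without a proof), or :RESOURCE-LIMIT.
(defun verify-assumptions (assumed prove limit)
  "ASSUMED holds literals such as (CANCELS (PICKUP A) (ON A B) S0), each
granted zero resources and therefore assumed false.  Retry each with
LIMIT resources and sort them into refuted, warranted, and still-open."
  (let (refuted warranted still-open)
    (dolist (lit assumed (values refuted warranted still-open))
      (case (funcall prove lit limit)
        (:proved    (push lit refuted))     ; a counterproof was found
        (:exhausted (push lit warranted))   ; no counterproof exists
        (t          (push lit still-open))))))

(defun deepen-assumptions (assumed prove &key (increment 10) (passes 5))
  "Allocate successively more resources to the outstanding assumptions."
  (loop for limit from increment by increment
        repeat passes
        do (multiple-value-bind (refuted warranted still-open)
               (verify-assumptions assumed prove limit)
             (declare (ignore refuted warranted))
             (setf assumed still-open))
        finally (return assumed)))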
4. Speedup Learning Techniques

Our inference engine relies on speedup learning techniques to improve its performance with experience. The underlying assumption is that not all situations are equally likely to occur, and that certain situational idioms arise repeatedly; it is thus more profitable to bias search ordering within the prover so that it succeeds more quickly, more often.
One speedup mechanism used by the prover is an internal success and failure cache to record previously proven or unproven subgoals. Caching was already known to be empirically effective in reducing search for problem solving systems [4, 14]. Previous work on caching, however, relied exclusively on caches of potentially infinite size and, therefore, potentially unlimited overhead cost. Our bounded-overhead caches are of a predetermined, user-defined size and rely on user-specified cache-management policies to replace less useful cache entries. In [20], we show empirically how such bounded-overhead caches can be quite useful in reducing search without the increasing overhead associated with using more traditional infinite-size caches.
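The following Common Lisp sketch illustrates one such bounded-overhead cache, assuming a least-recently-used replacement policy and a cache keyed on the subgoal literal itself; the actual prover supports arbitrary user-specified sizes and replacement policies, and the structure and function names here are illustrative only.

;;; A sketch of a fixed-size success/failure cache with LRU replacement.
(defstruct (subgoal-cache (:constructor make-subgoal-cache (size)))
  size
  (entries '()))   ; list of (SUBGOAL . RESULT), most recently used first

(defun cache-lookup (cache subgoal)
  "Return the cached :SUCCESS or :FAILURE for SUBGOAL, or NIL on a miss.
A hit moves the entry to the front so it survives replacement longer."
  (let ((entry (assoc subgoal (subgoal-cache-entries cache) :test #'equal)))
    (when entry
      (setf (subgoal-cache-entries cache)
            (cons entry (remove entry (subgoal-cache-entries cache) :test #'eq)))
      (cdr entry))))

(defun cache-record (cache subgoal result)
  "Record RESULT for SUBGOAL, evicting the least recently used entry once
the fixed size is exceeded, so that cache overhead remains bounded."
  (push (cons subgoal result) (subgoal-cache-entries cache))
  (when (> (length (subgoal-cache-entries cache)) (subgoal-cache-size cache))
    (setf (subgoal-cache-entries cache)
          (butlast (subgoal-cache-entries cache)))))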
Another speedup technique incorporated in our theorem prover is an explanation-based learning (EBL) component based on the EBL* formulation of explanation-based learning algorithms [16]. An EBL* algorithm takes a completed proof as its input and produces as output a macro-operator which can be added to the original domain theory. The addition of a macro-operator biases the order in which the search space (implicitly defined by the domain theory and query) is explored. EBL* algorithms differ from traditional EBL algorithms in the process by which the macro-operator is produced from the input proof. EBL* provides a set of five proof-transformation operators which, when applied to the input proof in some prespecified fashion, result in a new macro-operator. Any extant EBL algorithm can be restated as some particular sequence of EBL* operations. The EBL* framework casts generalization as a heuristic search through the space of transformed explanations; new EBL* algorithms correspond to explicit search-control heuristics for the generalization process.

It is often hard to tell whether a given speedup technique's advantages outweigh the problems it introduces. For example, while the use of EBL may provide some reduction of search, indiscriminate application may also entail some increase in search (the well-known utility problem of [13]). To postpone the utility problem somewhat, our prover actively manages the rules acquired via EBL*, disposing of those which do not prove to be useful as time passes. But even without active learned macro-operator management, in [17] we demonstrate how subgoal caching and EBL* together can sidestep the utility problem, in short showing greater strength combined than their individual performance might imply.

Why do EBL* and subgoal caching work so well together? EBL* introduces redundancy in the search space and therefore suffers from the utility problem, which, loosely put, results from backtracking over these redundant paths. Success and failure caching serve to prune redundant search by recognizing the path as either valid or fruitless. Thus caching can work to mitigate the utility problem introduced by EBL* macro-operators, providing the search reduction benefits of the learned rules while avoiding their shortcomings.

5. Dynamic Abstraction Hierarchies

Yet another search reduction technique employed by our planner extends the notion of an assumption further, exploiting the resource allocation mechanism to dynamically form approximate abstraction hierarchies [21]. Previous work in hierarchical planning consists of static techniques which rely on certain characteristics of the problem domain, such as Knoblock's ordered monotonicity property [9]. Such techniques cannot be expected to make progress in large complex domain theories, where ordered monotonicity generally does not hold.

There are two ways to look at our use of assumptions. First, making an assumption essentially postpones a subgoal. Thus our treatment of assumptions could be seen as a global antecedent reordering strategy. A second, and perhaps more useful, way of looking at assumptions, however, is as an approximation of hierarchical planning. Consider that once a plan containing explicit assumptions has been produced, the prover is used to verify that the assumptions are in fact warranted; essentially expanding a more abstract plan into a more detailed one.³

³ Additional assumptions might be made in the course of verifying an assumption; these assumptions will in turn require additional verification by the theorem prover. Thus our hierarchies may be many levels deep.
We justify our expectation that PERFLOG assumptions will usually be provable since each assumption corresponds to a frame axiom; the vast majority of relations in the world remain unchanged when any one operator is applied. While we may expect an expansion exists, there is no guarantee that a valid expansion exists: our hierarchies are only approximate.

We can extend the PERFLOG notion of an assumption even further in order to efficiently build and manage ephemeral, approximate, abstraction hierarchies. The intuition is that any subgoal, and not just the negated antecedent in the cancellation axiom, might make a reasonable assumption if we have some reason to believe the subgoal could be verified given sufficient resources. We support our belief dynamically using success and failure statistics accumulated by the inference engine in a special-purpose data structure during execution. The data structure, created at rule definition time and maintained by the prover, keeps track of how many successful and unsuccessful attempts have been made for each type of subgoal and how much their eventual proofs or failures cost. Each subgoal is used as an index into this data structure, and, if appropriate conditions are met based on statistics from similar previously encountered subgoals, the subgoal is assumed to be true and its proof is deferred. Our motivation is the same as for traditional hierarchical planning: don't sweat the details until you have an abstract plan. What exactly constitutes a detail, however, is determined on the fly.
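The following Common Lisp sketch illustrates one plausible form for such a statistics table and the resulting assumption test; the record layout, the indexing scheme (here simply the subgoal's predicate symbol), and the thresholds are assumptions made for exposition rather than the prover's actual data structure.

;;; A sketch of per-subgoal success/failure statistics and the test that
;;; decides, on the fly, whether a subgoal may be assumed and deferred.
(defstruct subgoal-stats
  (successes 0) (failures 0) (total-cost 0))

(defvar *subgoal-stats* (make-hash-table :test #'eq)
  "Maps an index (here, the subgoal's predicate symbol) to statistics
accumulated over previously encountered subgoals.")

(defun note-attempt (subgoal succeeded-p cost)
  "Record the outcome and cost of one proof attempt for SUBGOAL."
  (let ((s (or (gethash (first subgoal) *subgoal-stats*)
               (setf (gethash (first subgoal) *subgoal-stats*)
                     (make-subgoal-stats)))))
    (if succeeded-p
        (incf (subgoal-stats-successes s))
        (incf (subgoal-stats-failures s)))
    (incf (subgoal-stats-total-cost s) cost)))

(defun assume-p (subgoal &key (min-attempts 5) (success-threshold 9/10))
  "Assume SUBGOAL (deferring its proof) only when similar subgoals have
been attempted often enough and have almost always succeeded."
  (let ((s (gethash (first subgoal) *subgoal-stats*)))
    (and s
         (let ((attempts (+ (subgoal-stats-successes s)
                            (subgoal-stats-failures s))))
           (and (>= attempts min-attempts)
                (>= (/ (subgoal-stats-successes s) attempts)
                    success-threshold))))))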
6. Distributed Adaptive Inference

Recently, we have begun work on a new implementation of our adaptive theorem prover that is to be configured to run transparently in a heterogeneous distributed environment (e.g., on a network of workstations). The hope is that a distributed system will scale up to larger, more realistic target problems. We have implemented a prototype AND-parallel distributed theorem prover that simulates a network of workstations using only one machine.

Many others have suggested distributed or parallel implementations of PROLOG in the past. Most have proposed implementing OR-parallel systems [2]. The disadvantage of OR-parallel search is that its potential payoff is limited: in the best case, the expected elapsed time for a proof cannot be less than the expected elapsed time for a perfect serial search (i.e., a serial search that makes only correct choices). A few have pursued more complex AND-parallel systems, notwithstanding their intrinsic problems [3, 8]. Descendants of an AND node may depend on each other since they may share one or more variables: solving sibling subgoals independently may not be sufficient to obtain a proof of their parent AND node. In recompense, the potential payoff is much higher than for OR parallelism, since, in the ideal case, all of the work performed by every processor contributes to the solution.

We are pursuing an unrestricted AND-parallel design, one that does not require compile-time determination of subgoal independence nor any sort of restriction on which subgoal can bind a shared variable. What makes our work different from previous approaches is that our speedup techniques distribute cleanly and elegantly. For example, it is not necessary that each processor maintain identical subgoal caches.

Our system exploits the synergy between parallelism and each server's own adaptive bias. Recall that speedup techniques rely on an assumption regarding the future query distribution. Unlike serial applications of speedup techniques, our distributed inference engine has at least some control over the subgoal distribution sent to each individual server. By greedily selecting a mapping of subgoals to proof servers, we can exploit each individual server's bias, essentially training servers as experts in some slice of the domain theory. Subgoals can then be automatically assigned to the server that is expected to satisfy them most quickly. By exploiting the individualized strengths of each processor at the time subgoals are distributed for solution, we hope to further increase the throughput of the inference engine.

7. Discussion

We have briefly described an approach to planning that relies on the use of a special adaptive inference engine. The resulting incremental, approximate, neoclassical planner circumvents the problem of scaling by attacking the typically intractable search spaces of real planning domains with a variety of speedup and approximation techniques. Our latest work attempts to extend the system to operate on a loosely-coupled network of processors.
While we have used the PERFLOG formalism as both inspiration and as a convenient descriptive point of departure, we should note that there is nothing intrinsically special about this formalism. Indeed, our adaptive inference engine could be used with any planning formalism, such as, for example, situation calculus or even SNLP [11], even though the use of EBL* would violate systematicity for the latter. In fact, there is nothing special about planning: our adaptive inference engine is equally applicable to any definite-clause domain theory. Nevertheless, our new operationalization of classical planning promises to scale well by attacking the typically intractable search spaces of real planning domains with a variety of speedup and approximation techniques.
Acknowledgements

Charles Elkan, Geoff Gordon, Alex Russell, and Daniel Scharstein contributed to the research reported here. Support for our research was provided by the Office of Naval Research grant N00014-90-J-1542, and through gifts from the Xerox Corporation and the Hewlett-Packard Company.

References

1. K. Clark, "Negation as Failure," in Logic and Databases, H. Gallaire and J. Minker (editors), Plenum Press, NY, 1978, pp. 293-322.

2. S. A. Delgado-Rannauro, "OR-Parallel Logic Computation Models," in Implementations of Distributed Prolog, P. Kacsuk and M. J. Wise (editors), John Wiley & Sons, Chichester, 1992, pp. 3-26.

3. S. A. Delgado-Rannauro, "Restricted AND- and AND/OR-Parallel Logic Computational Models," in Implementations of Distributed Prolog, P. Kacsuk and M. J. Wise (editors), John Wiley & Sons, Chichester, 1992, pp. 121-141.

4. C. Elkan, "Conspiracy Numbers and Caching for Searching And/Or Trees and Theorem-Proving," Proc. of the Eleventh Intl. Joint Conf. on AI, Detroit, MI, Aug. 1989, pp. 341-346.

5. C. Elkan, "Incremental, Approximate Planning," Proc. of the Natl. Conf. on AI, Boston, MA, July 1990, pp. 145-150.

6. R. E. Fikes, P. E. Hart and N. J. Nilsson, "Learning and Executing Generalized Robot Plans," AI 3 (1972), pp. 251-288.

7. M. Ginsberg and W. Harvey, "Iterative Broadening," Proc. of the Natl. Conf. on AI, Boston, MA, July 1990, pp. 216-220.

8. M. V. Hermenegildo, "An Abstract Machine Based Execution Model for Computer Architecture Design and Efficient Implementation of Logic Programs in Parallel," Tech. Rep. 86-20, Ph.D. Thesis, The University of Texas at Austin, Austin, TX, 1986.

9. C. A. Knoblock, "Learning Abstraction Hierarchies for Problem Solving," Proc. of the Natl. Conf. on AI, Boston, MA, July 1990, pp. 923-928.

10. R. Korf, "Depth-First Iterative Deepening: An Optimal Admissible Tree Search," AI 27, 1 (1985), pp. 97-109.

11. D. McAllester and D. Rosenblitt, "Systematic Nonlinear Planning," Proc. of the Natl. Conf. on AI, Anaheim, CA, July 1991, pp. 634-639.

12. J. McCarthy, "Applications of Circumscription to Formalizing Common Sense Knowledge," Proc. of the Nonmonotonic Reasoning Workshop, 1984, pp. 151-164.

13. S. Minton, "Quantitative Results Concerning the Utility of Explanation-Based Learning," Proc. of the Natl. Conf. on AI, St. Paul, MN, July 1988, pp. 564-569.

14. D. Plaisted, "Non-Horn Clause Logic Programming Without Contrapositives," Journ. of Automated Reasoning 4 (1988), pp. 287-325.

15. A. M. Segre, Machine Learning of Robot Assembly Plans, Kluwer Academic Publishers, Hingham, MA, Mar. 1988.

16. A. M. Segre and C. Elkan, "A Provably Complete Family of EBL Algorithms," Working Paper, Dept. of Computer Science, Cornell Univ., Ithaca, NY, Nov. 1990.

17. A. M. Segre, "On Combining Multiple Speedup Techniques," Proc. of the Ninth Intl. Machine Learning Conf., Aberdeen, Scotland, July 1992, pp. 400-405.

18. A. M. Segre, C. Elkan, D. Scharstein, G. Gordon and A. Russell, "Adaptive Inference," in Machine Learning: Induction, Analogy, and Discovery, S. Chipman and A. Meyrowitz (editors), Kluwer Academic Publishers, Hingham, MA, Jan. 1993.

19. A. M. Segre and J. Turney, "Planning, Acting, and Learning in a Dynamic Domain," in Machine Learning Methods for Planning, S. Minton (editor), Morgan Kaufmann Publishers, San Mateo, CA, to appear, Mar. 1993.

20. A. M. Segre and D. Scharstein, "Bounded-Overhead Caching for Definite-Clause Theorem Proving," to appear in Journal of Automated Reasoning, 1993.

21. J. Turney, "DAM: Dynamic, Approximate Abstraction for Resource-Limited Planning," Ph.D. Thesis, Dept. of Computer Science, Cornell Univ., Ithaca, NY, in preparation, 1993.