From: AAAI Technical Report SS-93-03. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.

Neoclassical Planning

Alberto Maria Segre†          David Sturgill†          Jennifer Turney‡
segre@cs.cornell.edu          sturgill@cs.cornell.edu          turney@wilkes1.wilkes.edu

†Department of Computer Science, Cornell University, Ithaca, NY 14853
‡Department of Mathematics and Computer Science, Wilkes University, Wilkes-Barre, PA 18766

Abstract

Classical planning systems have historically been used to provide a domain-independent framework for expressing and using domain-dependent problem solving information. Unfortunately, these systems are intrinsically intractable. In this paper, we describe a system that blends a classical planning formalism with a novel adaptive inference engine in order to produce an adaptive, approximate planner.

1. Introduction

Classical planning systems perform search. This observation highlights the fundamental problem with these systems: in large domains, like those typically encountered in real-world planning problems, intrinsically weak methods coupled with a classical planning domain representation are overwhelmed by the size of the search space. In order to be of practical use, classical planning systems must necessarily rely on domain-specific heuristics to reduce the amount of search. Real planning domains often require complex domain-dependent heuristics that are difficult to design and even harder to refine or correct.

The message of this paper is that real-world planning systems must necessarily be adaptive; that is, they must reconcile themselves to their environment by improving, refining, and perfecting their own search reduction heuristics. Therefore, by incorporating automatic acquisition of search reduction heuristics in the underlying inference engine design, we can make a classical planning system practically useful even in nontrivial domains. We claim our planning system is essentially a classical planning system that promises to scale well; in short, a neoclassical planner.

This paper describes an adaptive planning system based on an extension of the situation calculus called PERFLOG [5] and a novel adaptive inference engine: a theorem prover whose performance changes with experience [18]. Because of its underlying theorem prover, our planner is both approximate and incremental, in that it is able to generate plausible candidate plans quickly and to refine these plans when allotted more computation time. The planner serves as one component of our SEPIA intelligent agent architecture [19]; SEPIA is a hybrid system, combining our neoclassical planning component with a real-time reactive executive.

We begin by introducing our theorem prover and its general capabilities, exclusive of its search reduction techniques. Next, we briefly describe the PERFLOG formalism and how it is supported by our adaptive inference engine. We then turn to the theorem prover's underlying search reduction techniques: explanation-based learning, bounded-overhead success and failure caching, heuristic antecedent reordering strategies, learned-rule management facilities, and a dynamic abstraction mechanism. Finally, we briefly discuss our current research on building a distributed (i.e., large-grain parallel) version of our adaptive inference engine.

2. The Inference Engine

We have implemented a backward-chaining definite-clause theorem prover in Common Lisp. The prover's inference scheme is essentially equivalent to PROLOG's SLD-resolution. Axioms are stored in a discrimination-net database along with rules indexed by the rule head. The database performs a pattern-matching retrieval guaranteed to return a superset of those database entries which unify with the retrieval pattern. The cost of a single database retrieval in this model grows linearly with the number of matches found and (roughly) logarithmically with the number of entries in the database.
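To make the indexing scheme concrete, the following Common Lisp sketch stores each rule under the predicate symbol of its head and retrieves every rule sharing that symbol. This is our own illustration rather than the actual SEPIA code: a real discrimination net also discriminates on argument structure, and the names *rule-index*, add-rule, and candidate-rules are invented here. The point is only that retrieval cheaply returns a superset of the unifiable entries, which the prover must then filter by unification.

    ;; Minimal sketch of a head-indexed rule database (illustrative only).
    ;; A real discrimination net indexes deeper argument structure; here we
    ;; index only on the head's predicate symbol, which already yields a
    ;; superset of the entries that actually unify with a retrieval pattern.
    (defvar *rule-index* (make-hash-table :test #'eq))

    (defun add-rule (head body)
      "Store the rule HEAD <- BODY under the predicate symbol of HEAD."
      (push (cons head body) (gethash (first head) *rule-index*)))

    (defun candidate-rules (goal)
      "Return every stored rule whose head shares GOAL's predicate symbol.
    Callers must still unify GOAL against each candidate head."
      (gethash (first goal) *rule-index*))

    ;; Example: the two PERFLOG meta-rules are both indexed under HOLDS.
    (add-rule '(holds ?p (do ?a ?s)) '((causes ?a ?p ?s)))
    (add-rule '(holds ?p (do ?a ?s)) '((holds ?p ?s) (not (cancels ?a ?p ?s))))
    ;; (candidate-rules '(holds (on a b) (do (pickup a) s0)))  => both rules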
The system relies on iterative deepening [10] in order to force completeness in recursive domains while still taking advantage of depth-first search's favorable storage characteristics. As generally practiced, iterative deepening involves limiting depth-first search exploration to a fixed depth. If no solution is found by the time the depth-limited search space is exhausted, the depth limit is incremented and the search is restarted. In return for completeness in recursive domains, depth-first iterative deepening generally incurs a constant factor overhead when compared to regular depth-first search; the size of this constant depends on the branching factor of the search space and the value of the depth increment. Changing the increment changes the order of exploration of the implicit search space and, therefore, the performance of the inference engine.

We perform iterative deepening on a generalized, user-defined notion of depth while respecting the overall search resource limit specified at query time. Fixing a depth-update function (and thus a precise definition of depth) and an iterative-deepening increment establishes the exploration order of the theorem prover. For example, one might define the iterative-deepening update function to compute the depth of the search; with this strategy, the system performs traditional iterative deepening. Alternatively, one might specify update functions for conspiratorial iterative deepening [4], iterative broadening [7] (see footnote 1), or almost any other search strategy.

Footnote 1: The conspiracy size of a subgoal corresponds to the number of other, as yet unsolved, subgoals in the current proof structure. Thus conspiratorial best-first search prefers narrow proofs to bushy proofs, regardless of the actual depth of the resulting derivation. Iterative broadening is an analogous idea that performs iterative deepening on the breadth of the candidate proofs.
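The following self-contained Common Lisp sketch illustrates iterative deepening over a user-supplied depth-update function. It is our reconstruction rather than the SEPIA prover: the rules are propositional, unification and the query-time resource limit are omitted, and the names *rules*, prove-bounded, and prove are ours. Supplying a different depth-fn or increment changes which proofs fall inside a given bound, and hence the exploration order.

    ;; Sketch of iterative deepening on a user-defined notion of depth.
    ;; Rules are propositional: (head subgoal1 subgoal2 ...).
    (defvar *rules* '((a b c) (b) (c) (d d)))   ; note: d is recursive

    (defun prove-bounded (goal depth bound depth-fn)
      "Return T if GOAL is provable without the user-defined depth,
    updated by DEPTH-FN at each step, exceeding BOUND."
      (and (<= depth bound)
           (some (lambda (rule)
                   (and (eq (first rule) goal)
                        (every (lambda (sub)
                                 (prove-bounded sub
                                                (funcall depth-fn depth sub)
                                                bound depth-fn))
                               (rest rule))))
                 *rules*)))

    (defun prove (goal &key (depth-fn (lambda (d sub)
                                        (declare (ignore sub))
                                        (1+ d)))   ; default: classic depth
                            (increment 1)
                            (max-bound 20))
      "Iterative deepening: restart PROVE-BOUNDED with a larger bound until
    GOAL is proven or MAX-BOUND is reached.  Changing DEPTH-FN or INCREMENT
    changes the exploration order, as described in the text."
      (loop for bound from increment to max-bound by increment
            thereis (prove-bounded goal 0 bound depth-fn)))

    ;; (prove 'a) => T        (prove 'd) => NIL once MAX-BOUND is exhausted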
Unlike PROLOG, our theorem prover produces a structure representing the derivation tree for a successful query rather than just an answer substitution. When a failure is returned, the prover indicates whether the failure was due to exceeding a resource limit or whether we can in fact be guaranteed the absence of any solution. In addition, the prover supports procedural attachment (i.e., escape to Lisp), which, among other things, allows us to alter the prover's resource limits dynamically.

3. PERFLOG

The PERFLOG planning formalism is an adaptation of classical situation-calculus planning without explicit frame axioms. In their place, PERFLOG uses two simple meta-theoretic rules called causation and cancellation. The causation rule:

holds(?p, do(?a, ?s)) ← causes(?a, ?p, ?s)

states that proposition ?p is true in the situation resulting from performing action ?a in previous situation ?s if action ?a is known to cause ?p when applied in situation ?s. For example, if one were operating in the blocks world, it would be reasonable to assert:

causes(pickup(?x), holding(?x), ?s)

as part of the description of the pickup operator (see footnote 2).

Footnote 2: While it is possible to describe plan knowledge as a collection of rules that rely explicitly on the PERFLOG causation and cancellation rules, it is much more convenient to describe problem solving operators in a language similar to that of STRIPS [6]. Thus a SEPIA operator consists of sets of preconditions and subgoals, as well as an add list and a delete list. While the add list and delete list have the normal STRIPS semantics, preconditions and subgoals differ as in the ARMS planner [15]. More precisely, preconditions are immutable relations, e.g., block(A), which the planner makes no attempt to expand recursively. Subgoals are relations such as on(A,B) which may be recursively achieved by the planner. In the situation calculus, these relations appear augmented with a situation variable, e.g., holds(on(A,B), ?s). SEPIA operators are automatically expanded into collections of PERFLOG rules.

The cancellation rule obviates the need for explicit frame axioms:

holds(?p, do(?a, ?s)) ← holds(?p, ?s) ∧ ¬cancels(?a, ?p, ?s)

This rule states that one can conclude that ?p holds in the situation resulting from performing action ?a in previous situation ?s provided ?p was previously true and action ?a does not cancel ?p when applied to situation ?s. Others have proposed similar schemes in the past. The cancellation rule is similar to McCarthy's inertia axiom for the situation calculus [12]:

holds(?p, do(?a, ?s)) ← holds(?p, ?s) ∧ ¬abnormal(?p, ?s)

Unlike abnormal, however, cancels discriminates on the action ?a. In addition, when treated with the standard PROLOG negation by resource-limited failure strategy [1], the negated antecedent ¬cancels(?a, ?p, ?s) essentially constitutes an explicit assumption: something that is true only because we fail to find a counterproof.

We can use our theorem prover's run-time resource allocation capability to further extend the operational status of PERFLOG assumptions. To see how, consider initially allocating exactly zero resources to all negated subgoals (akin to having asserted an extra database fact ¬?x). This causes every PERFLOG assumption subgoal to succeed immediately, in effect pruning the AND/OR search tree below assumption subgoals (note that the PERFLOG cancellation rule is the sole source of negated literals in a SEPIA domain theory). Once such a proof has been obtained, we increment the resources allocated to negated subgoals and reconstruct the proof, in essence performing iterative deepening in assumption space. As additional resources are allocated to verifying a proof containing assumptions, we can increase our confidence that the assumptions are, in fact, warranted.
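A minimal sketch of this iterative deepening in assumption space follows. It is our own reconstruction, not the SEPIA prover: verify stands in for the prover applied to a single assumed literal under an effort budget, and the keyword parameters and status keywords (:proven, :disproven, :unknown) are invented for the example. With a zero budget every assumption stands, mirroring the initial proof; each pass then raises the budget and retires the assumptions that can now be verified.

    ;; Sketch of iterative deepening in assumption space (illustrative only).
    ;; ASSUMPTIONS is the list of literals assumed true at zero cost (e.g.
    ;; the negated CANCELS antecedents of a candidate plan).  VERIFY takes a
    ;; literal and an effort budget and returns :PROVEN, :DISPROVEN, or
    ;; :UNKNOWN.  Each pass raises the budget by INCREMENT and drops the
    ;; assumptions that are now proven; any disproof aborts immediately.
    (defun check-assumptions (assumptions verify
                              &key (increment 100) (max-effort 1000))
      "Return (values unconfirmed-assumptions disproven-literal-or-nil).
    The fewer assumptions remain, the more confidence the plan has earned."
      (loop for effort from increment to max-effort by increment
            do (loop for a in assumptions
                     for status = (funcall verify a effort)
                     when (eq status :disproven)
                       do (return-from check-assumptions (values nil a))
                     when (eq status :proven)
                       do (setf assumptions (remove a assumptions))))
      (values assumptions nil))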
4. Speedup Learning Techniques

Our inference engine relies on speedup learning techniques to improve its performance with experience. The underlying assumption is that not all situations are equally likely to occur; certain situational idioms arise repeatedly, and it is thus more profitable to bias search ordering within the prover so that it will succeed more quickly, more often.

One speedup mechanism used by the prover is an internal success and failure cache to record previously proven or unproven subgoals. Caching was already known to be empirically effective in reducing search for problem solving systems [4,14]. Previous work on caching, however, relied exclusively on caches of potentially infinite size and, therefore, potentially unlimited overhead cost. Our bounded-overhead caches are of a predetermined, user-defined size and rely on user-specified cache-management policies to replace less useful cache entries. In [20], we show empirically how such bounded-overhead caches can be quite useful in reducing search without the increasing overhead associated with more traditional infinite-size caches.
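The sketch below shows one way a bounded-overhead success/failure cache might look. It is a simplification of the scheme described in [20], not the actual implementation: the entry format, the single FIFO replacement policy, and the names subgoal-cache, cache-put, and cache-lookup are ours, whereas the real caches accept user-specified management policies.

    ;; Sketch of a bounded-overhead success/failure cache (illustrative).
    (defstruct (subgoal-cache (:constructor make-subgoal-cache (size)))
      size
      (table (make-hash-table :test #'equal))
      (queue '()))   ; insertion order, oldest entry last, for replacement

    (defun cache-put (cache subgoal outcome)
      "Record OUTCOME (:success or :failure) for SUBGOAL, evicting the
    oldest entry when the cache is full so lookup overhead stays bounded."
      (let ((table (subgoal-cache-table cache)))
        (unless (gethash subgoal table)
          (when (>= (hash-table-count table) (subgoal-cache-size cache))
            (remhash (car (last (subgoal-cache-queue cache))) table)
            (setf (subgoal-cache-queue cache)
                  (butlast (subgoal-cache-queue cache))))
          (push subgoal (subgoal-cache-queue cache)))
        (setf (gethash subgoal table) outcome)))

    (defun cache-lookup (cache subgoal)
      "Return :SUCCESS, :FAILURE, or NIL if SUBGOAL has not been cached."
      (gethash subgoal (subgoal-cache-table cache)))

    ;; (defvar *cache* (make-subgoal-cache 1000))
    ;; (cache-put *cache* '(holds (on a b) s1) :success)
    ;; (cache-lookup *cache* '(holds (on a b) s1))  => :SUCCESS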
Another speedup technique incorporated in our theorem prover is an explanation-based learning (EBL) component based on the EBL* formulation of explanation-based learning algorithms [16]. An EBL* algorithm takes a completed proof as its input and produces as output a macro-operator which can be added to the original domain theory. The addition of a macro-operator biases the order in which the search space (implicitly defined by the domain theory and query) is explored. EBL* algorithms differ from traditional EBL algorithms in the process by which the macro-operator is produced from the input proof. EBL* provides a set of five proof-transformation operators which, when applied to the input proof in some prespecified fashion, result in a new macro-operator. Any extant EBL algorithm can be restated as some particular sequence of EBL* operations. The EBL* framework casts generalization as a heuristic search through the space of transformed explanations; new EBL* algorithms correspond to explicit search-control heuristics for the generalization process.

It is often hard to tell whether a given speedup technique's advantages outweigh the problems it introduces. For example, while the use of EBL may provide some reduction of search, indiscriminate application may also entail some increase in search (the well-known utility problem of [13]). To postpone the utility problem somewhat, our prover actively manages the rules acquired via EBL*, disposing of those which do not prove to be useful as time passes. But even without active learned macro-operator management, in [17] we demonstrate how subgoal caching and EBL* together can sidestep the utility problem, in short showing greater strength combined than their individual performance might imply.

Why do EBL* and subgoal caching work so well together? EBL* introduces redundancy in the search space and therefore suffers from the utility problem, which, loosely put, results from backtracking over these redundant paths. Success and failure caching serve to prune redundant search by recognizing a path as either valid or fruitless. Thus caching can work to mitigate the utility problem introduced by EBL* macro-operators, providing the search reduction benefits of the learned rules while avoiding their shortcomings.

5. Dynamic Abstraction Hierarchies

Yet another search reduction technique employed by our planner extends the notion of an assumption further, exploiting the resource allocation mechanism to dynamically form approximate abstraction hierarchies [21]. Previous work in hierarchical planning consists of static techniques which rely on certain characteristics of the problem domain, such as Knoblock's ordered monotonicity property [9]. Such techniques cannot be expected to make progress in large, complex domain theories, where ordered monotonicity generally does not hold.

There are two ways to look at our use of assumptions. First, making an assumption essentially postpones a subgoal. Thus our treatment of assumptions could be seen as a global antecedent reordering strategy. A second, and perhaps more useful, way of looking at assumptions, however, is as an approximation of hierarchical planning. Consider that once a plan containing explicit assumptions has been produced, the prover is used to verify that the assumptions are in fact warranted, essentially expanding a more abstract plan into a more detailed one (see footnote 3). We justify our expectation that PERFLOG assumptions will usually be provable since each assumption corresponds to a frame axiom; the vast majority of relations in the world remain unchanged when any one operator is applied. While we may expect that an expansion exists, there is no guarantee that a valid expansion exists: our hierarchies are only approximate.

Footnote 3: Additional assumptions might be made in the course of verifying an assumption; these assumptions will in turn require additional verification by the theorem prover. Thus our hierarchies may be many levels deep.

We can extend the PERFLOG notion of an assumption even further in order to efficiently build and manage ephemeral, approximate abstraction hierarchies. The intuition is that any subgoal, and not just the negated antecedent in the cancellation axiom, might make a reasonable assumption if we have some reason to believe the subgoal could be verified given sufficient resources. We support our belief dynamically using success and failure statistics accumulated by the inference engine in a special-purpose data structure during query execution. The data structure, created at rule definition time and maintained by the prover, keeps track of how many successful and unsuccessful attempts have been made for each type of subgoal and how much their eventual proofs or failures cost. Each subgoal is used as an index into this data structure, and, if appropriate conditions are met based on statistics from similar previously encountered subgoals, the subgoal is assumed to be true and its proof is deferred. Our motivation is the same as for traditional hierarchical planning: don't sweat the details until you have an abstract plan. What exactly constitutes a detail, however, is determined on the fly.
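The following sketch suggests what such a statistics structure might look like. It is our own illustration, not the SEPIA data structure: indexing by the subgoal's predicate symbol and the particular attempt-count and success-rate thresholds in assumable-p are assumptions made for the example.

    ;; Sketch of per-subgoal-type success/failure statistics used to decide,
    ;; on the fly, whether a subgoal should simply be assumed (illustrative).
    (defstruct stats (successes 0) (failures 0) (total-cost 0))

    (defvar *subgoal-stats* (make-hash-table :test #'eq))  ; keyed by predicate

    (defun record-attempt (subgoal succeededp cost)
      "Update the statistics for SUBGOAL's predicate after a proof attempt."
      (let ((s (or (gethash (first subgoal) *subgoal-stats*)
                   (setf (gethash (first subgoal) *subgoal-stats*)
                         (make-stats)))))
        (if succeededp (incf (stats-successes s)) (incf (stats-failures s)))
        (incf (stats-total-cost s) cost)))

    (defun assumable-p (subgoal &key (min-attempts 20) (min-success-rate 0.9))
      "Assume SUBGOAL (deferring its proof) when similar subgoals have
    almost always been provable in the past."
      (let ((s (gethash (first subgoal) *subgoal-stats*)))
        (and s
             (>= (+ (stats-successes s) (stats-failures s)) min-attempts)
             (>= (/ (stats-successes s)
                    (+ (stats-successes s) (stats-failures s)))
                 min-success-rate))))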
6. Distributed Adaptive Inference

Recently, we have begun work on a new implementation of our adaptive theorem prover that is to be configured to run transparently in a heterogeneous distributed environment (e.g., on a network of workstations). The hope is that a distributed system will scale up to larger, more realistic target problems. We have implemented a prototype AND-parallel distributed theorem prover that simulates a network of workstations using only one machine.

Many others have suggested distributed or parallel implementations of PROLOG in the past. Most have proposed implementing OR-parallel systems [2]. The disadvantage of OR-parallel search is that its potential payoff is limited: in the best case, the expected elapsed time for a proof cannot be less than the expected elapsed time for a perfect serial search (i.e., a serial search that makes only correct choices). A few have pursued more complex AND-parallel systems, notwithstanding their intrinsic problems [3,8]. Descendants of an AND node may depend on each other since they may share one or more variables: solving sibling subgoals independently may not be sufficient to obtain a proof of their parent node. In recompense, the potential payoff is much higher than for OR parallelism, since, in the ideal case, all of the work performed by every processor contributes to the solution.

We are pursuing an unrestricted AND-parallel design, one that does not require compile-time determination of subgoal independence nor any sort of restriction on which subgoal can bind a shared variable. What makes our work different from previous approaches is that our speedup techniques distribute cleanly and elegantly. For example, it is not necessary that each processor maintain identical subgoal caches.

Our system exploits the synergy between parallelism and each server's own adaptive bias. Recall that speedup techniques rely on an assumption regarding the future query distribution. Unlike serial applications of speedup techniques, our distributed inference engine has at least some control over the subgoal distribution sent to each individual server. By greedily selecting a mapping of subgoals to proof servers, we can exploit each individual server's bias, essentially training servers as experts in some slice of the domain theory. Subgoals can then be automatically assigned to the server that is expected to satisfy them most quickly. By exploiting the individualized strengths of each processor at the time subgoals are distributed for solution, we hope to further increase the throughput of the inference engine.
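As a rough illustration of the greedy mapping, the sketch below assigns each subgoal to the server with the smallest estimated solution cost. It is our own reconstruction: expected-cost stands in for whatever per-server statistics the real system consults, and load balancing and tie-breaking are ignored.

    ;; Sketch of a greedy subgoal-to-server mapping (illustrative only).
    ;; EXPECTED-COST is a function of a server and a subgoal returning the
    ;; estimated time for that server to satisfy the subgoal.
    (defun assign-subgoals (subgoals servers expected-cost)
      "Return an alist mapping each subgoal to the server whose estimated
    cost, (funcall EXPECTED-COST server subgoal), is smallest."
      (mapcar (lambda (g)
                (cons g (reduce (lambda (s1 s2)
                                  (if (<= (funcall expected-cost s1 g)
                                          (funcall expected-cost s2 g))
                                      s1
                                      s2))
                                servers)))
              subgoals))

    ;; Example with a toy cost function:
    ;; (assign-subgoals '((holds (on a b) ?s) (clear c))
    ;;                  '(server-1 server-2)
    ;;                  (lambda (srv g)
    ;;                    (declare (ignore g))
    ;;                    (if (eq srv 'server-1) 2 5)))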
7. Discussion

We have briefly described an approach to planning that relies on the use of a special adaptive inference engine. The resulting incremental, approximate, neoclassical planner circumvents the problem of scaling by attacking the typically intractable search spaces of real planning domains with a variety of speedup and approximation techniques. Our latest work attempts to extend the system to operate on a loosely-coupled network of processors.

While we have used the PERFLOG formalism as both inspiration and a convenient descriptive point of departure, we should note that there is nothing intrinsically special about this formalism. Indeed, our adaptive inference engine could be used with any planning formalism, such as, for example, the situation calculus or even SNLP [11], even though the use of EBL* would violate systematicity for the latter. In fact, there is nothing special about planning: our adaptive inference engine is equally applicable to any definite-clause domain theory. Nevertheless, our new operationalization of classical planning promises to scale well by attacking the typically intractable search spaces of real planning domains with a variety of speedup and approximation techniques.

Acknowledgements

Charles Elkan, Geoff Gordon, Alex Russell, and Daniel Scharstein contributed to the research reported here. Support for our research was provided by Office of Naval Research grant N00014-90-J-1542 and through gifts from the Xerox Corporation and the Hewlett-Packard Company.

References

1. K. Clark, "Negation as Failure," in Logic and Databases, H. Gallaire and J. Minker (editors), Plenum Press, NY, 1978, pp. 293-322.
2. S. A. Delgado-Rannauro, "OR-Parallel Logic Computation Models," in Implementations of Distributed Prolog, P. Kacsuk and M. J. Wise (editors), John Wiley & Sons, Chichester, 1992, pp. 3-26.
3. S. A. Delgado-Rannauro, "Restricted AND- and AND/OR-Parallel Logic Computational Models," in Implementations of Distributed Prolog, P. Kacsuk and M. J. Wise (editors), John Wiley & Sons, Chichester, 1992, pp. 121-141.
4. C. Elkan, "Conspiracy Numbers and Caching for Searching And/Or Trees and Theorem-Proving," Proc. of the Eleventh Intl. Joint Conf. on AI, Detroit, MI, Aug. 1989, pp. 341-346.
5. C. Elkan, "Incremental, Approximate Planning," Proc. of the Natl. Conf. on AI, Boston, MA, July 1990, pp. 145-150.
6. R. E. Fikes, P. E. Hart and N. J. Nilsson, "Learning and Executing Generalized Robot Plans," AI 3 (1972), pp. 251-288.
7. M. Ginsberg and W. Harvey, "Iterative Broadening," Proc. of the Natl. Conf. on AI, Boston, MA, July 1990, pp. 216-220.
8. M. V. Hermenegildo, "An Abstract Machine Based Execution Model for Computer Architecture Design and Efficient Implementation of Logic Programs in Parallel," Tech. Rep. 86-20, Ph.D. Thesis, The University of Texas at Austin, Austin, TX, 1986.
9. C. A. Knoblock, "Learning Abstraction Hierarchies for Problem Solving," Proc. of the Natl. Conf. on AI, Boston, MA, July 1990, pp. 923-928.
10. R. Korf, "Depth-First Iterative Deepening: An Optimal Admissible Tree Search," AI 27, 1 (1985), pp. 97-109.
11. D. McAllester and D. Rosenblitt, "Systematic Nonlinear Planning," Proc. of the Natl. Conf. on AI, Anaheim, CA, July 1991, pp. 634-639.
12. J. M. McCarthy, "Applications of Circumscription to Formalising Common Sense Knowledge," Proc. of the Nonmonotonic Reasoning Workshop, 1984, pp. 151-164.
13. S. Minton, "Quantitative Results Concerning the Utility of Explanation-Based Learning," Proc. of the Natl. Conf. on AI, St. Paul, MN, July 1988, pp. 564-569.
14. D. Plaisted, "Non-Horn Clause Logic Programming Without Contrapositives," Journ. of Automated Reasoning 4 (1988), pp. 287-325.
15. A. M. Segre, Machine Learning of Robot Assembly Plans, Kluwer Academic Publishers, Hingham, MA, Mar. 1988.
16. A. M. Segre and C. Elkan, "A Provably Complete Family of EBL Algorithms," Working Paper, Dept. of Computer Science, Cornell Univ., Ithaca, NY, Nov. 1990.
17. A. M. Segre, "On Combining Multiple Speedup Techniques," Proc. of the Ninth Intl. Machine Learning Conf., Aberdeen, Scotland, July 1992, pp. 400-405.
18. A. M. Segre, C. Elkan, D. Scharstein, G. Gordon and A. Russell, "Adaptive Inference," in Machine Learning: Induction, Analogy, and Discovery, S. Chipman and A. Meyrowitz (editors), Kluwer Academic Publishers, Hingham, MA, Jan. 1993.
19. A. M. Segre and J. Turney, "Planning, Acting, and Learning in a Dynamic Domain," in Machine Learning Methods for Planning, S. Minton (editor), Morgan Kaufmann Publishers, San Mateo, CA, to appear, Mar. 1993.
20. A. M. Segre and D. Scharstein, "Bounded-Overhead Caching for Definite-Clause Theorem Proving," to appear in Journal of Automated Reasoning, 1993.
21. J. Turney, "DAM: Dynamic, Approximate Abstraction for Resource-Limited Planning," Ph.D. Thesis, Dept. of Computer Science, Cornell Univ., Ithaca, NY, in preparation, 1993.