AIPS-2000 Planning Competition
Fahiem Bacchus
University of Toronto
4/17/00

Overview
• AIPS-98 featured the first competition.
• 4 competitors in a STRIPS track, 2 in an ADL track
• This year's competition:
• 15 competitors
• A fully automatic track with STRIPS & ADL domains
• A hand-tailored track allowing domain-dependent information.

Made Possible By
• Michael Ady, Winter-City Software, Edmonton, Alberta, Canada.
• John DiMarco and the Department of Computer Science, University of Toronto, systems support staff.
• The other members of the organizing committee: Henry Kautz, David E. Smith, Derek Long, Hector Geffner, & Jana Koehler.
• The competitors, who were willing to subject their work to very public scrutiny.
• Franz Inc., for providing a free copy of Allegro Common Lisp for Linux.

The Competitors
1. Blackbox
Yi-Cheng Huang, Bart Selman (Cornell University); Henry Kautz (AT&T Research)
• Constructs a Graphplan graph, converts it into a Boolean satisfiability problem, and then attempts to solve the problem with various satisfiability engines.
• Competed in AIPS-98.
• Competed in the fully automated track.

The Competitors
2. MIPS
Stefan Edelkamp, Malte Helmert (University of Freiburg)
• "Intelligent Model checking and Planning System"
• Uses BDDs to compactly store and maintain sets of propositionally represented states, and a heuristic symbolic search engine as well as a heuristic single-state search engine.
• Competed in the fully automated track.

The Competitors
3. System R
Fangzhen Lin (Hong Kong University of Science and Technology)
• Based on a regression/progression algorithm, like a sound version of the original STRIPS algorithm.
• Competed in both the fully automated and hand tailored tracks.
• In the hand tailored track it used domain-specific information about (1) the ordering of subgoals; (2) pruning of unachievable goals; and (3) the way a subgoal is solved by regressing it to a new conjunctive goal.

The Competitors
4. FF
Joerg Hoffmann (Albert Ludwigs University)
• FF (Fast-Forward) employs heuristic search like HSP, but extends the HSP heuristic to include information from GRAPHPLAN's plan extraction phase.
• Employs a local search strategy that combines hill-climbing with systematic search.
• Competed in the fully automated track.

The Competitors
5. HSP2
Hector Geffner, Blai Bonet (Universidad Simon Bolivar)
• A heuristic-search planner that descends from the HSP planner that competed in the AIPS-98 contest.
• Planning instances are mapped into state-space search problems that are solved with heuristics extracted from the representation.
• HSP2 supports both forward and backward search and several heuristics, and uses a weighted A* search.
• Competed in the fully automated track.

The Competitors
6. IPP
Jana Koehler (Schindler Lifts Ltd.); Joerg Hoffmann, Michael Brenner (University of Freiburg)
• IPP is based on searching GraphPlan planning graphs that have been extended to handle ADL actions.
• Essentially the same system as that entered in the AIPS-98 competition.
• Competed in the fully automated track.

The Competitors
7. PropPlan
Michael Fourman (University of Edinburgh)
• PropPlan uses naive breadth-first state-space search but employs Ordered Binary Decision Diagrams to optimize its state-space exploration.
• Operators are represented directly by efficient operations on BDDs.
• Like GraphPlan, the competition version of PropPlan uses forward chaining to establish a layered set of reachable states until the goal set is reached, then backward chaining for plan extraction.
• Competed in the fully automated track.

The Competitors
8.
TokenPlan
Yannick Meiller, Patrick Fabiani (ONERA - Center of Toulouse)
• Based on the use of colored Petri nets, which can encode mutex relations through the tokens' colors.
• Depending on how tokens are propagated in the net, a range of search techniques can be emulated.
• Competed in the fully automated track.

The Competitors
9. STAN
Maria Fox, Derek Long (University of Durham)
• STAN employs a hybrid of two planning strategies:
  1. The original Graphplan-based STAN algorithm.
  2. A forward planner using a heuristic function based on the length of the relaxed plan (as in HSP and FF).
• Uses automatic domain analysis to select between these strategies.
• The domain analysis techniques include type and invariant detection, as well as the automatic identification of certain combinatorial optimization sub-problems.
• Competed in the fully automated track.

The Competitors
10. BDDPlan
Hans-Peter Stoerr (Dresden University of Technology)
• BDDPlan uses BDDs to support reasoning in the Fluent Calculus, a framework for reasoning about actions in first-order logic.
• Model checking algorithms are used to do an implicit breadth-first search.
• Competed in the fully automated track.

The Competitors
11. AltAlt
Biplav Srivastava, Terry Zimmerman, BinhMinh Do, XuanLong Nguyen, Zaiqing Nie, Ullas Nambiar, Romeo Sanchez (Arizona State University)
• AltAlt uses effective and admissible heuristics extracted from the planning graph to drive a backward state-space search.
• Competed in the fully automated track.

The Competitors
12. GRT
Ioannis Refanidis, Ioannis Vlahavas, Dimitris Vrakas (Aristotle University)
• The GRT (Greedy Regression Table) planner is a heuristic state-space planner. Like HSP, it employs a best-first search using estimated distances to the goal.
• Competed in the fully automated track.

The Competitors
13.
PbR
Jose Luis Ambite, Craig Knoblock, Steve Minton (University of Southern California/Information Sciences Institute)
• Planning by Rewriting (PbR) generates plans by using a set of plan rewriting rules and local search techniques to transform an easy-to-generate, poor-quality initial plan into a higher-quality plan.
• The rewriting rules were developed semi-automatically: some proposed by a learning algorithm, some defined manually.
• The initial plan generators were hand-coded for each domain.
• Competed in the hand tailored track.

The Competitors
14. SHOP
Dana Nau, Yue (Jason) Cao, Hector Munoz-Avila, Amnon Lotem (University of Maryland)
• SHOP (Simple Hierarchical Ordered Planner) is an HTN planning system that plans for tasks in the same order in which they will later be executed.
• This simplifies goal interactions and provides a complete world state at each step of the planning process, as with forward-chaining planners.
• The complete state information allows SHOP to encode effective domain-specific planning algorithms and knowledge.
• Competed in the hand tailored track.

The Competitors
15. TALplanner
Jonas Kvarnstrom, Patrick Doherty, Patrik Haslum (Linkoping University)
• TALplanner is a forward-chaining planner based on the TLPlan system (Bacchus & Kabanza).
• Domain-dependent search control knowledge is expressed declaratively as formulas of a temporal logic and used to control forward chaining.
• TALplanner uses TAL, a narrative temporal logic for reasoning about action and change.
• Competed in the hand tailored track.

The Results

Domain 1: Logistics World
• Move a set of packages between locations using trucks within the same city and airplanes between cities.
• Limited interaction between goals.
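
As a rough illustration of the logistics domain described above, the following Python sketch models STRIPS-style actions (preconditions, add list, delete list) for trucks and airplanes. The predicate and action names (`at`, `in`, `in_city`, `airport`, `load`, `unload`, `drive`, `fly`) and the tiny problem instance are assumptions made for this example; they are not the competition's actual domain encoding.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

Fact = Tuple[str, ...]
State = FrozenSet[Fact]


class Action(NamedTuple):
    name: str
    pre: FrozenSet[Fact]      # facts that must hold before the action
    add: FrozenSet[Fact]      # facts made true by the action
    delete: FrozenSet[Fact]   # facts made false by the action


def applicable(state: State, action: Action) -> bool:
    return action.pre <= state


def apply_action(state: State, action: Action) -> State:
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add


def load(pkg: str, vehicle: str, loc: str) -> Action:
    return Action(f"load({pkg},{vehicle},{loc})",
                  pre=frozenset({("at", pkg, loc), ("at", vehicle, loc)}),
                  add=frozenset({("in", pkg, vehicle)}),
                  delete=frozenset({("at", pkg, loc)}))


def unload(pkg: str, vehicle: str, loc: str) -> Action:
    return Action(f"unload({pkg},{vehicle},{loc})",
                  pre=frozenset({("in", pkg, vehicle), ("at", vehicle, loc)}),
                  add=frozenset({("at", pkg, loc)}),
                  delete=frozenset({("in", pkg, vehicle)}))


def drive(truck: str, src: str, dst: str, city: str) -> Action:
    # Trucks may only move between locations of the same city.
    return Action(f"drive({truck},{src},{dst})",
                  pre=frozenset({("at", truck, src),
                                 ("in_city", src, city),
                                 ("in_city", dst, city)}),
                  add=frozenset({("at", truck, dst)}),
                  delete=frozenset({("at", truck, src)}))


def fly(plane: str, src: str, dst: str) -> Action:
    # Airplanes may only move between airports (possibly in different cities).
    return Action(f"fly({plane},{src},{dst})",
                  pre=frozenset({("at", plane, src),
                                 ("airport", src), ("airport", dst)}),
                  add=frozenset({("at", plane, dst)}),
                  delete=frozenset({("at", plane, src)}))


def run(state: State, plan: List[Action]) -> State:
    for action in plan:
        state = apply_action(state, action)
    return state


# Hypothetical instance: one package, one truck, one plane, two cities.
initial: State = frozenset({
    ("in_city", "office_a", "city_a"), ("in_city", "airport_a", "city_a"),
    ("in_city", "airport_b", "city_b"),
    ("airport", "airport_a"), ("airport", "airport_b"),
    ("at", "p1", "office_a"), ("at", "truck_a", "office_a"),
    ("at", "plane", "airport_a"),
})
goal: Fact = ("at", "p1", "airport_b")

plan = [
    load("p1", "truck_a", "office_a"),
    drive("truck_a", "office_a", "airport_a", "city_a"),
    unload("p1", "truck_a", "airport_a"),
    load("p1", "plane", "airport_a"),
    fly("plane", "airport_a", "airport_b"),
    unload("p1", "plane", "airport_b"),
]
final = run(initial, plan)
print(goal in final)
```

Because each package can usually be routed independently of the others, a model like this also hints at why goals in this domain interact only to a limited degree.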

[Chart: Fully Automated Logistics Time Comparison. Y-axis: Seconds. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, AltAlt, GRT.]

Planners doing well enough to scale to bigger problems:
• System R
• GRT
• HSP2
• STAN
• MIPS
• FF

[Chart: Fully Automated Logistics Time Comparison (larger problems). Y-axis: Seconds. Planners: System R, FF, HSP2, STAN, GRT, MIPS.]

Logistics Domain Time
• FF, MIPS a bit better
• STAN and HSP2
• Then GRT

[Chart: Fully Automated Logistics # Steps Comparison. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, AltAlt, GRT.]

[Chart: Fully Automated Logistics # Steps Comparison (larger problems). Planners: MIPS, System R, FF, HSP2, STAN, GRT.]

Logistics Plan Length
• STAN is generating the shortest plans.
• GRT, MIPS, FF about the same.
• HSP2 a bit worse.
• System R generates very long plans in this domain.

Hand Tailored

[Chart: Hand Tailored Logistics Time Comparison. Y-axis: Seconds. Planners: System R, SHOP, TALplanner.]

[Chart: Hand Tailored Logistics # Steps Comparison. Planners: System R, SHOP, TALplanner.]

[Chart: Hand Tailored Logistics # Steps Comparison. Planners: SHOP, TALplanner.]

Hand Tailored Logistics
• TALplanner is extremely fast (the largest problems in less than a second).
• SHOP and then System R.
• SHOP and TALplanner are generating short plans (with SHOP a bit better).
• System R is generating very long plans.

Domain 2: Blocks World
• Stack a set of blocks.
• A high degree of interaction between goals.
• Easy for people, can be hard for automated planners.
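
To make the goal interaction noted above concrete, here is a minimal Python sketch of a breadth-first blocks-world planner. The three-block instance (the classic "Sussman anomaly") and the state encoding are assumptions chosen for illustration; they are not one of the competition problems, and the search is not any competitor's algorithm.

```python
from collections import deque

# A state is a frozenset of stacks; each stack is a tuple listed bottom-to-top.


def successors(state):
    """Yield ((block, destination), next_state) for every legal single move."""
    for src in state:
        block = src[-1]                      # only the top block can move
        rest = state - {src}
        remaining = frozenset({src[:-1]}) if len(src) > 1 else frozenset()
        if len(src) > 1:                     # move the block onto the table
            yield (block, "table"), rest | remaining | {(block,)}
        for dst in rest:                     # move the block onto another stack
            yield (block, dst[-1]), (rest - {dst}) | remaining | {dst + (block,)}


def holds(state, on_pairs):
    """True if every (upper, lower) pair in on_pairs is satisfied in state."""
    on = {(s[i + 1], s[i]) for s in state for i in range(len(s) - 1)}
    return all(pair in on for pair in on_pairs)


def bfs_plan(start, goal):
    """Return a shortest list of moves reaching the goal, or None."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if holds(state, goal):
            return plan
        for move, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [move]))
    return None


# C sits on A; B is on the table. Goal: A on B and B on C.
start = frozenset({("A", "C"), ("B",)})
goal = [("A", "B"), ("B", "C")]
plan = bfs_plan(start, goal)
print(plan)

# Achieving the subgoal on(B, C) immediately (stacking B on C while C is
# still on A) has to be undone before on(A, B) can be reached.
greedy_state = frozenset({("A", "C", "B")})
detour = bfs_plan(greedy_state, goal)
print(1 + len(detour), "moves via the greedy first step versus", len(plan))
```

The shortest plan needs three moves, while satisfying one subgoal greedily first costs five in total, which is the kind of interaction that makes this domain hard for planners that treat goals independently.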

[Chart: Fully Automated Blocks Time Comparison. Y-axis: Seconds. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, AltAlt, GRT.]

Blocks
• A wide variation in performance.
• Only FF, System R, and HSP2 are able to solve the larger problems.

[Chart: Fully Automated Blocks Time Comparison (larger problems). Y-axis: Seconds. Planners: System R, FF, HSP2.]

Blocks
• System R scales very consistently in this domain.
• FF can occasionally solve large problems fast, but it has many misses.

[Chart: Fully Automated Blocks # Steps Comparison. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, AltAlt, GRT.]

[Chart: Fully Automated Blocks # Steps Comparison (larger problems). Planners: System R, FF, HSP2.]

Blocks
• System R generates short plans.
• FF, when it succeeds, also generates short plans.
• HSP2 can generate very long plans in this domain.

Blocks Hand Tailored

[Chart: Blocks Hand Tailored Time Comparison. Y-axis: Seconds. Planners: System R, PbR, SHOP, TALplanner.]

[Chart: Hand Tailored Blocks # Steps Comparison. Planners: System R, PbR, SHOP, TALplanner.]

Blocks: Hand Tailored
• TALplanner is very fast.
• System R, then SHOP and PbR.
• They all generate plans of a similar length.

Blocks Some Harder Problems

[Chart: Hand Tailored Blocks Time Comparison (harder problems). Y-axis: Seconds. Planners: PbR, System R, SHOP, TALplanner.]

Blocks
• TALplanner scales very well in this domain, solving 500-block problems in about 1.5 seconds.
• System R also scales well enough to solve the hardest problems of the .

Domain 3: Schedule World
• Machine a collection of parts.
• Goals are mostly non-interacting, but they compete for "resources" (time on machines), and different goals on the same part can clobber one another.
• Originally a Prodigy domain.

Schedule World
• This is an ADL domain with many actions requiring conditional effects.
• Only MIPS, FF, HSP2, IPP, PropPlan, and BDDPlan could deal with this domain.

[Chart: Fully Automated Schedule Time Comparison. Y-axis: Seconds. Planners: MIPS, FF, HSP2, IPP, PropPlan, BDDPlan.]

Schedule
• FF is the only planner that scales to the harder problems in this domain.
• The lengths of the solutions are roughly comparable.

Schedule: Hand Tailored

[Chart: Hand Tailored Schedule Time Comparison. Y-axis: Seconds. Planners: PbR, BDDPlan, TALplanner.]

[Chart: Hand Tailored Schedule # Steps Comparison. Planners: PbR, BDDPlan, TALplanner.]

Schedule: Hand Tailored
• TALplanner is generating short solutions in about 0.15 seconds on the largest problems.
• PbR takes longer and generates inferior plans.
• Interestingly, FF is taking 10-25 seconds on the largest problems, generating slightly longer solutions than TALplanner, but fully automatically.

Domain 4: Freecell World
• Freecell is a solitaire card game that comes with Microsoft Windows.
• Freecell demo.

[Chart: Fully Automatic Freecell Time Comparison. Y-axis: Seconds. Planners: BlackBox, MIPS, FF, HSP2, IPP, PropPlan, TokenPlan, STAN, BDDPlan, GRT.]

[Chart: Fully Automatic Freecell # Steps Comparison. Planners: BlackBox, MIPS, System R, FF, HSP2, IPP, TokenPlan, STAN, GRT.]

Freecell
• STAN takes the least time, but does not solve the hardest problems and generates long solutions.
• FF is best at the harder problems, but cannot solve all the problems.
• HSP2 also solves some larger problems (but takes a long time on them).

Freecell: Hand Tailored

[Chart: Hand Tailored Freecell Time Comparison. Y-axis: Seconds. Planners: System R, SHOP, TALplanner.]

[Chart: Hand Tailored Freecell # Steps Comparison (log scale). Planners: System R, SHOP, TALplanner.]

Freecell
• TALplanner can be fast and solves all of the problems, but it can generate very long plans (4500 steps).
• SHOP generates reasonable plans in a reasonable amount of time. System R takes longer.
• None of these planners performs much better than the fully automatic planners in this domain.

Domain 5: Mic10 Elevator World
• Based on the Miconic-10 elevator controller developed by Schindler Lifts Ltd.
• Contributed by Jana Koehler.
• Problems involve controlling a sophisticated elevator to move passengers to their destinations.
• Various constraints on movement, including priority passengers, passengers that must go nonstop, and passengers that must be accompanied.

Miconic 10
• Four separate problems:
• A simple STRIPS version.
• A simplified ADL version.
• Jana's original ADL version.
• The original ADL version plus the constraint that only 6 people can be on board the elevator at a time.

[Chart: STRIPS Miconic 10 Time Comparison. Y-axis: Seconds. Planners: System R, TokenPlan, STAN, AltAlt, GRT.]

[Chart: STRIPS Miconic # Steps Comparison. Planners: System R, TokenPlan, STAN, AltAlt, GRT.]

STRIPS Miconic10
• STAN is slightly faster and generates slightly shorter solutions.
• GRT also does well on both criteria.

[Chart: Simple Miconic 10 Time Comparison. Y-axis: Seconds. Planner: HSP2.]

[Chart: Simple ADL Miconic10 # Steps Comparison. Planner: HSP2.]

Simple Miconic 10
• HSP2 was the only planner to successfully solve the problems.
• Generates reasonable solutions quickly in this domain.

[Chart: Full Miconic 10 Time Comparison. Y-axis: Seconds. Planners: FF, IPP, PropPlan.]

[Chart: Full Miconic 10 # Steps Comparison. Planners: FF, IPP, PropPlan.]

Full ADL-Miconic10
• PropPlan generates minimal-length solutions.
• FF is faster.

Miconic10: ADL+Constraint

[Chart: Miconic 10 + Constraint Time Comparison. Y-axis: Seconds. Planners: PropPlan, TALplanner.]

[Chart: Miconic 10 + Constraint # Steps Comparison. Planners: PropPlan, TALplanner.]

Miconic10 + Constraints
• PropPlan is generating minimal-length plans.
• TALplanner is generating reasonably good plans, quickly.
• PropPlan is not using any domain-specific control.

Distinguished Planners
• Schindler Lifts Ltd. is providing a special award for performance on the Miconic-10 domain.

Distinguished Planners
• Celcorp is providing a set of awards for performance in the competition.
• There are many metrics, and it is impossible to say that any one planner was the best.
• But some planners did distinguish themselves in different ways.
• I have selected a set of such planners as worthy of "special distinction."

Distinguished Planners Group B
• STAN
• HSP2
• MIPS
• System R

Distinguished Planners Group A
• Two planners demonstrated performance that was even more distinguished.
• TALplanner
• FF