TDDC17: Introduction to Automated Planning
Jonas Kvarnström, Automated Planning Group, Department of Computer and Information Science, Linköping University

One way of defining planning: using knowledge about the world, including possible actions and their results, to decide what to do and when in order to achieve an objective, before you actually start doing it. A planning problem has two parts:
▪ General knowledge: the problem domain
▪ Specific knowledge and objective: the problem instance

Towers of Hanoi
Some applications are simple, well-structured, almost toy problems. Towers of Hanoi has a simple structure, with a single agent acting:
▪ Domain: there are pegs and disks; a disk can be on a peg, and one disk can be above another. Actions: move the topmost disk from peg x to peg y, without placing a larger disk on a smaller one.
▪ Instance: 3 pegs, 7 disks; all disks are currently on peg 2, in order of increasing size. We want: all disks on the third peg, in order of increasing size.

Logistics
Some applications are larger, but still well-structured.
▪ What you can do: drive to a location, put package p in truck t, deliver p from t.
▪ Objective: deliver 20,000 packages (efficiently!).
▪ Which packages go in which truck, in which order? Which roads should be used?

Shakey
Classical robot example: Shakey (1969).
▪ Available actions: moving to another location, turning light switches on and off, opening and closing doors, pushing movable objects around, …
▪ Goals: be in room 4 with objects A, B, C.

Miconic 10 Elevators
Tall buildings with multiple elevators: how do we serve people as efficiently as possible?
▪ Schindler's Miconic 10 system: you enter your destination before you board.
▪ A plan is generated, determining which elevator goes to which floor, and in which order. This saves time: with planning, 3 elevators could serve as much traffic as 5 elevators with earlier algorithms.
▪ http://www.youtube.com/watch?v=qXdn6ynwpiI

Xerox Printers
Xerox built experimental reconfigurable modular printers.
▪ Prototype: 170 individually controlled modules, 4 print engines.
▪ Objective: finish each print job as quickly as possible.

Bending Sheet Metal
▪ Initially: there is a flat sheet of metal.
▪ Objective: the sheet should be in a specific shape.
▪ Constraint: the piece must not collide with anything when moved!
▪ An optimized plan for bending the sheet saves a lot of time = money!

NASA Earth Observing-1 Mission
Satellite in low earth orbit.
▪ Can only communicate 8 × 10 minutes/day; operates for long periods without supervision.
▪ CASPER software: Continuous Activity Scheduling, Planning, Execution and Replanning.
▪ Dramatically improved results: interesting events (volcanic eruptions, …) are analyzed, targets to view are planned depending on previous observations, and downlink activities are planned around limited bandwidth.
▪ http://ase.jpl.nasa.gov/

SIADEX
Decision support system for designing forest fire fighting plans.
▪ Needs to consider the allocation of limited resources.
▪ Plans must be developed in cooperation with humans – people may die!

Unmanned Aerial Vehicles
Unmanned aerial vehicles assisting in emergency situations.

Why should Computers Plan?
Why should computers create plans?
▪ Manual planning can be boring and inefficient.
▪ Automated planning may create higher-quality plans: software can systematically optimize and investigate millions of alternatives.
▪ Automated planning can be applied where the agent is: satellites cannot always communicate with ground operators, and spacecraft or robots on other planets may be hours away by radio.

Blocks World (1)
Simple example: the Blocks World. Given the current state of the world (A on C, with B and C on the table) and your greatest desire (the tower A on B on C), you want to decide in advance how to achieve the goal. By definition, this is planning.

Blocks World (2)
Suppose we can:
▪ Pick up a block from the table, or from another block
▪ Place a block on the table, or on another block
Then we can achieve any set of "towers" by:
▪ Systematically dismantling all towers, placing the blocks on the table
▪ Repeatedly choosing a block whose destination is ready, and placing it in its desired location
For the example above, after putting A on the table: pick up B, place B on C; pick up A, place A on B; done!

Blocks World (3): Domain-Specific
Instead of actually moving blocks, let a program create a plan:
  [ unstack(A,C), putdown(A), pickup(B), stack(B,C), pickup(A), stack(A,B) ]
Then you have created a planner: something that decides what to do to achieve a designated goal – but in a domain-specific way. Problems:
▪ A straightforward solver generally creates inefficient plans; good solvers are usually far less straightforward, and time-consuming to write.
▪ If the problem changes, more programming is required.

Domain-Independent 1: What we want
A domain-independent planner takes two inputs:
▪ A high-level domain description: a general description of the Blocks World, Towers of Hanoi, Miconic 10 Elevators, Xerox Printers, or …
▪ A high-level instance description: for the Blocks World, the locations of the blocks; for Miconic 10 Elevators, a list of elevators and people, their current locations, …

Domain-Independent 2: How?
The domain-independent planner itself is:
▪ Written for generic planning problems
▪ Difficult to create (but created once)
▪ Improved once – and all domains benefit
Its output is a plan.

Domain-Independent 3: Models
First requirement: a formal language for domains + instances! To design a planner we must fully understand its inputs and expected results.
▪ We want a general language, allowing a wide variety of domains to be modeled.
▪ We want good algorithms that generate plans quickly and generate high-quality plans – which is easier with more limited languages.
These are conflicting desires!

Domain-Independent 4: Tradeoffs
Many early planners made similar tradeoffs in terms of:
▪ The expressivity of the modeling language
▪ The underlying assumptions about the world
At the time this was simply called "planning" or "problem solving"; later these approaches were grouped together and called "classical planning". Quite restricted, but a good place to start.

Example: Dock Worker Robots (DWR)
Containers are shipped in and out of a harbor. Cranes move containers between "piles" and robotic trucks.

PDDL
PDDL: the Planning Domain Definition Language.
▪ Origins: First International Planning Competition, 1998. The most used language today.
▪ General, with many expressivity levels. The lowest level of expressivity is called STRIPS: one specific syntax/semantics for planning domains/instances, with specific assumptions and restrictions associated with "classical planning". The name is taken from Shakey's planner, the Stanford Research Institute Problem Solver (but it is not exactly what the STRIPS planner supported).
▪ Concrete PDDL constructs will illustrate what we are truly interested in: the underlying concepts!
Classical planning also forms the basis of most non-classical planners.

PDDL: Domain and Problem Definition
PDDL separates domains from problem instances.
▪ Domain file: named, e.g. (define (domain dock-worker-robots) … ). Contains the general properties of all DWR problems.
▪ Problem instance file: named and associated with a domain, e.g. (define (problem dwr-problem-1) (:domain dock-worker-robots) … ). Contains the specific properties of a particular DWR problem instance.

PDDL: Expressivity Specification
Domains declare their expressivity requirements, telling the planner which features you need:
  (define (domain dock-worker-robots)
    (:requirements :strips)   ;; Standard level of expressivity
    …)                        ;; Remaining domain information goes here!
We will see some other levels as well. Note the colon before many keywords, used to avoid collisions when new keywords are added. Warning: many planners' parsers ignore expressivity specifications.

Objects 0: First-Order
Many planning languages have their roots in logic – you've already seen this!
▪ Propositional syntax: plain propositional facts such as ~B1,1, ~S1,1, OK1,1, OK2,1, OK1. Used in some planners to simplify parsing.
▪ First-order syntax: objects and predicates (relations) such as On(B,A), On(C,Floor), Clear(B), Clear(C). Used in most implementations.

Objects 1: First-order objects exist
First-order representation: a crane moves containers between piles and robots. We assume a finite number of objects.
▪ A robot is an automated truck moving containers between locations.
▪ A container can be stacked, picked up, or loaded onto robots.
▪ A pile is a stack of containers – at the bottom, there is a pallet.
▪ A location is an area that can be reached by a single crane; it can contain several piles and at most one robot.
We can skip: hooks, wheels, rotation angles, drivers. Essential: determine what is relevant for the problem and objective!

Objects 2: PDDL Object Types
In PDDL and most planners, we begin by defining object types:
  (define (domain dock-worker-robots)
    (:requirements :strips :typing)
    (:types location   ; there are several connected locations in the harbor
            pile       ; attached to a location, holds a pallet + a stack of containers
            robot      ; holds at most 1 container, only 1 robot per location
            crane      ; belongs to a location, used to pick up containers
            container)
    …)

Objects 3: PDDL Type Hierarchies
Many planners support type hierarchies. Convenient, but often not used in domain examples:
  (:types container robot - movable   ; containers and robots are movable objects
   …)
There is a predefined "topmost supertype": object. Earlier, planners used type predicates instead – islocation(obj), ispile(obj), … – and these are still present in some benchmark domains.

Objects 4: PDDL Object Definitions
Objects are generally specified in the problem instance:
  (define (problem dwr-problem-1)
    (:domain dock-worker-robots)
    (:objects r1 - robot
              loc1 loc2 - location
              k1 - crane
              p1 p2 - pile
              c1 c2 c3 pallet - container))

Facts
In first-order representations of planning problems, any fact is represented as a logical atom: a predicate symbol + arguments.
▪ Properties of the world: raining – it is raining [not a DWR predicate!]
▪ Properties of single objects: empty(crane) – the crane is not holding anything
▪ Relations between objects: attached(pile, location) – the pile is in the given location
▪ Relations between more than two objects: can-move(robot, location, location) – the robot can move between the two locations [not a DWR predicate]
Essential: determine what is relevant for the problem and objective!
Predicates in PDDL
PDDL uses a Lisp-like syntax for predicates and atoms. Variables are prefixed with "?".
  (define (domain dock-worker-robots)
    (:requirements …)
    (:predicates
      ;; "Fixed" predicates (can't change):
      (adjacent ?l1 ?l2 - location)        ; can move from ?l1 directly to ?l2
      (attached ?p - pile ?l - location)   ; pile ?p is attached to location ?l
      (belong ?k - crane ?l - location)    ; crane ?k belongs to location ?l
      ;; "Dynamic" predicates (modified by actions):
      (at ?r - robot ?l - location)        ; robot ?r is at location ?l
      (occupied ?l - location)             ; there is a robot at location ?l
      (loaded ?r - robot ?c - container)   ; robot ?r is loaded with container ?c
      (unloaded ?r - robot)                ; robot ?r is empty
      (holding ?k - crane ?c - container)  ; crane ?k is holding container ?c
      (empty ?k - crane)                   ; crane ?k is not holding anything
      (in ?c - container ?p - pile)        ; container ?c is somewhere in pile ?p
      (top ?c - container ?p - pile)       ; container ?c is on top of pile ?p
      (on ?c1 ?c2 - container)             ; container ?c1 is on container ?c2
    ))

States 1: State of the World
A state (of the world) should specify exactly which facts are true/false in the world at a given time. In PDDL, facts are ground atoms.
▪ We know all predicates that exist, and their argument types: adjacent(location, location), …
▪ We know which objects exist for each type.
▪ So we can calculate all ground atoms: adjacent(loc1,loc1), adjacent(loc1,loc2), …, attached(pile1,loc1), …
Every assignment of true/false to the ground atoms is a distinct state, so we can find all possible states. The number of states is 2^(number of atoms) – enormous, but finite (for classical planning!). So: states have internal structure!

States 2: Internal Structure
Instead of knowing only "this is state s0", we know "this is a state where top(A) is true, top(B) is false, …". For example, in a state s0 with the tower A-B-C, top(A) is true while top(B), top(C), on(A,A), on(A,C), on(B,A), … are false; in a state s1 where A has been removed, top(B) is true and top(A) is false. These are the facts to keep track of! Structure is essential – we will see later how planners make use of structured states.

States 3: Efficient Representation
Efficient specification and storage for a single state: specify which atoms are true. All other atoms have to be false – what else would they be? A state of the world is then specified as a set containing all variable-free atoms that are true in the world:
  s0 = { on(A,B), on(B,C), in(A,2), in(B,2), in(C,2), top(A), bot(C) }
We can also see the difference between two states:
▪ top(A) ∈ s0, so top(A) is true in s0
▪ top(B) ∉ s0, so top(B) is false in s0
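As a small aside, here is a minimal Python sketch (not from the lecture) of this representation: ground atoms enumerated from typed objects, and a state as a set of the atoms that are true. All names (objects, ground_atoms, the tiny object list) are illustrative assumptions, not a standard planner API.

from itertools import product

# Typed objects of a small DWR-like instance (assumed for illustration).
objects = {
    "robot": ["r1"],
    "location": ["loc1", "loc2"],
}

# A ground atom is just a tuple: (predicate, arg1, arg2, ...).
def ground_atoms(predicate, *arg_types):
    """Enumerate all ground atoms for one predicate, given its argument types."""
    for args in product(*(objects[t] for t in arg_types)):
        yield (predicate,) + args

# All ground atoms of at(robot, location): at(r1,loc1), at(r1,loc2).
all_at_atoms = list(ground_atoms("at", "robot", "location"))

# A state is a set of the ground atoms that are TRUE; everything else is false.
s0 = frozenset({("at", "r1", "loc2"), ("unloaded", "r1"), ("occupied", "loc2")})

print(("at", "r1", "loc2") in s0)   # True  -- at(r1,loc2) holds in s0
print(("at", "r1", "loc1") in s0)   # False -- at(r1,loc1) does not hold in s0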
States 4: Initial State in PDDL
In classical STRIPS planning we assume complete information about the initial state (before any action), so we can use a set/list of true atoms. Complete relative to the model: we must know everything about those predicates and objects we have specified – but not whether it's raining! Note the Lisp-like notation again: (attached p1 loc1), not attached(p1,loc1).
  (define (problem dwr-problem-1)
    (:domain dock-worker-robots)
    (:objects …)
    (:init (attached p1 loc1) (in c1 p1) (on c1 pallet) (in c3 p1) (on c3 c1) (top c3 p1)
           (attached p2 loc1) (in c2 p2) (on c2 pallet) (top c2 p2)
           (belong crane1 loc1) (empty crane1)
           (at r1 loc2) (unloaded r1) (occupied loc2)
           (adjacent loc1 loc2) (adjacent loc2 loc1)))

States 5: Goal States
Classical STRIPS planning: we want to reach one of possibly many goal states. In PDDL we specify the set of goal states using a goal formula; with basic STRIPS expressivity the formula is a conjunction of positive literals. Example: containers c1 and c3 should be in pile p2, and we don't care about their order or any other fact:
  (define (problem dwr-problem-1)
    (:domain dock-worker-robots)
    (:objects …)
    (:goal (and (in c1 p2) (in c3 p2))))

Operators 1: Why?
Planning problem: we know the current (initial) state of the world, and we want to decide which actions can take us to a goal state. We need to know: which actions can be executed in a given state, and where would you end up? Plain search (lectures 2–3) assumed a transition / successor function – that's what we want, but how do we specify it succinctly?

Operators 2: PDDL
PDDL supports action schemas:
  (define (domain dock-worker-robots)
    …
    (:action move
      :parameters (?r - robot ?from ?to - location)
      :precondition (and (adjacent ?from ?to) (at ?r ?from) (not (occupied ?to)))
      :effect (and (at ?r ?to) (occupied ?to)
                   (not (occupied ?from)) (not (at ?r ?from)))))
▪ The action is applicable in a state s if its precondition is true in s.
▪ The result of applying the action in state s is: s – negated effects + positive effects.
▪ Preconditions may use connectives – (and …) (or …) (imply …) (not …) – and quantifiers – (exists (?v) …) (forall (?v) …) – but many planners only support some of these!

Operators 3: Instances
The planner instantiates these schemas, applying them to any combination of parameters of the correct type. One instance of move is move(r1,loc2,loc1):
▪ Applicable if (adjacent loc2 loc1) and (at r1 loc2) hold and (occupied loc1) does not.
▪ It modifies the current state, adding and removing facts.

Operators 4: Step by Step
In classical planning (the basic, limited form):
▪ Each action corresponds to one state update: s1 –a1→ s2 –a2→ s3 –a3→ s4 …
▪ Time is not modeled, and there are never multiple state updates for one action.
▪ We know the initial state and how states are changed by actions. Everything is deterministic, so we can completely predict the state of the world after a sequence of actions!
The solution to the problem will be a sequence of actions.

Operators 5: Terminology
Terminology can differ.
▪ The general schema can be called: operator, operator schema, action, action schema, …
▪ The specific instance, with parameters of the correct type specified, can be called: action, action instance, operator instance, …
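To make the applicability and state-update rules above concrete, here is a minimal Python sketch (not from the lecture) of a ground action instance such as move(r1,loc2,loc1), represented with precondition/add/delete sets. It follows the STRIPS-style semantics described above: applicable if the (positive) precondition atoms hold, and the result is s minus the negative effects plus the positive effects. All names are illustrative assumptions; the negative precondition (not (occupied loc1)) is ignored here to keep the sketch at basic STRIPS level.

from collections import namedtuple

Action = namedtuple("Action", ["name", "precond", "add", "delete"])

move_r1_loc2_loc1 = Action(
    name="move(r1,loc2,loc1)",
    precond={("adjacent", "loc2", "loc1"), ("at", "r1", "loc2")},
    add={("at", "r1", "loc1"), ("occupied", "loc1")},
    delete={("at", "r1", "loc2"), ("occupied", "loc2")},
)

def applicable(state, action):
    """True if all (positive) precondition atoms are in the state."""
    return action.precond <= state

def apply_action(state, action):
    """The resulting state: s - delete effects + add effects."""
    return (state - action.delete) | action.add

s0 = {("adjacent", "loc2", "loc1"), ("adjacent", "loc1", "loc2"),
      ("at", "r1", "loc2"), ("occupied", "loc2"), ("unloaded", "r1")}
if applicable(s0, move_r1_loc2_loc1):
    s1 = apply_action(s0, move_r1_loc2_loc1)   # now (at r1 loc1) and (occupied loc1) hold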
Back to the Blocks World
Simple: it allows us to focus on concepts, not details.

Blocks World (1)
A common blocks world version has 4 operators:
▪ (pickup ?x)      – takes ?x from the table
▪ (putdown ?x)     – puts ?x on the table
▪ (unstack ?x ?y)  – takes ?x from on top of ?y
▪ (stack ?x ?y)    – puts ?x on top of ?y

Blocks World (2)
For later reference, the predicates (relations) used are:
▪ (on ?x ?y)    – block ?x is on block ?y
▪ (ontable ?x)  – ?x is on the table
▪ (clear ?x)    – we can place a block on top of ?x
▪ (holding ?x)  – the robot is holding block ?x
▪ (handempty)   – the robot is not holding any block
For example, the state where A is on C and B and D are on the table is { (clear A) (on A C) (ontable C) (clear B) (ontable B) (clear D) (ontable D) (handempty) }; from there, unstack(A,C), putdown(A), pickup(B), stack(B,C) places B on C.

Blocks World (3)
The four operators in PDDL:
  (:action pickup
    :parameters (?x)
    :precondition (and (clear ?x) (ontable ?x) (handempty))
    :effect (and (not (ontable ?x)) (not (clear ?x)) (not (handempty)) (holding ?x)))

  (:action putdown
    :parameters (?x)
    :precondition (holding ?x)
    :effect (and (ontable ?x) (clear ?x) (handempty) (not (holding ?x))))

  (:action unstack
    :parameters (?top ?below)
    :precondition (and (on ?top ?below) (clear ?top) (handempty))
    :effect (and (holding ?top) (clear ?below)
                 (not (clear ?top)) (not (handempty)) (not (on ?top ?below))))

  (:action stack
    :parameters (?top ?below)
    :precondition (and (holding ?top) (clear ?below))
    :effect (and (not (holding ?top)) (not (clear ?below))
                 (clear ?top) (handempty) (on ?top ?below)))
A search strategy chooses which one to expand next If we use search: What search space? Forward Search 1: Vacuum World 52 Forward state space search: As in the Vacuum World A search node is simply a world state Successor function: ▪ One outgoing edge for every executable (applicable) action ▪ The action specifies where the edge will lead We assume an initial state: And zero or more goal states ("no dirt"): Find a path (not necessarily shortest) – SRS, RSLS, LRLRLSSSRLRS, … jonkv@ida 49 jonkv@ida Planning as Search Towers of Hanoi 3 disks, 3 pegs 27 states 54 jonkv@ida A classical solution plan is an action sequence taking you from the init state to a goal state Initial/current state Forward Search 3: Larger Example 56 jonkv@ida 53 jonkv@ida Forward Search 2: ToH, Actions Larger state space – interesting symmetry 7 disks 2187 states 6558 transitions, [state, action] state Forward Search 4: Blocks World, 3 blocks 55 jonkv@ida Goal state Forward Search 5: Blocks World, 4 blocks 125 states 272 transitions Initial (current) state Goal states 866 states 2090 transitions Forward Search 7 58 jonkv@ida 57 jonkv@ida Forward Search 6: Blocks World, 5 blocks Extremely many states – must generate the graph as we go Forward State Space Forward planning, forward-chaining, progression: Begin in the initial state Initial search node 0 = initial state Generate the initial state = initial node from the initial state description (PDDL) Edges correspond to actions Child node 1 = result state Child node 2 = result state The successor function / branching rule: To expand a state s, generate all states that result from applying an action that is applicable in s 59 Forward Search 9: Step by step 60 Blocks world example: Incremental expansion: Choose a node (using search strategy) Generate the initial state Expand all possible successors ▪ “What actions are applicable in the current state, and where will they take me?” ▪ Generates new states by applying effects Repeat until a goal node is found! A C B D A A C B D C B D A C B D A C B D A C B D A C B C B D A D The BW is symmetric. Not true for all domains! jonkv@ida Forward Search 8: Step by step jonkv@ida Now we have multiple unexpanded nodes! A search strategy chooses which one to expand next ▪ Given that all actions have the same cost, Dijkstra will first consider all plans that stack less than 400 blocks! 
Dijkstra
Which search strategy should we use? Breadth first, depth first, iterative deepening, …? One possible strategy is Dijkstra's algorithm, similar to uniform-cost search:
▪ Strategy: pick the cheapest path from the list of expandable paths.
▪ Efficient in terms of the search space size: O(|E| + |V| log |V|), where |V| is the number of nodes and |E| the number of edges.
▪ Generates optimal (cheapest) paths/plans!
Would this algorithm be a solution?

Dijkstra's Algorithm: Analysis
Blocks world: 400 blocks initially on the table, and the goal is a 400-block tower. Given that all actions have the same cost, Dijkstra will first consider all plans that stack fewer than 400 blocks!
▪ Stacking 1 block: 400 × 399 plans, …
▪ Stacking 2 blocks: more than 400 × 399 × 399 × 398 plans, …
▪ In total, more than 1.63 × 10^1735 plans to consider.
The search space is exponential in the size of the input description…

Fast Computers, Many Cores
But computers are getting very fast! Suppose we can check 10^20 states per second – more than 10 billion states per clock cycle for today's computers, each state involving complex operations. Then it will only take 10^1735 / 10^20 = 10^1715 seconds… But we have multiple cores! The universe has at most 10^87 particles, including electrons, so let's suppose every one of them is a CPU core: only 10^1628 seconds > 10^1620 years. The universe is around 10^10 years old. Hopeless? No: we need informed search!

Domain-Independent Heuristics
We need:
▪ A heuristic function h(n) estimating the cost or difficulty of reaching the goal from node n. In this lecture, cost = number of actions; we could also use action-specific costs.
▪ A search strategy selecting nodes depending on h(n).
A very simple domain-independent heuristic: count the number of facts that have the "wrong value" in the current state (similar to "number of pieces out of place"); a small sketch of this idea follows below.
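Here is a minimal Python sketch, not taken from the lecture, of one simple variant of this idea: the goal is given as a set of atoms that must become true, and h counts the goal atoms not yet satisfied. The slide examples count every fact whose value differs from a fully specified goal configuration, which is slightly stronger; all names here are illustrative assumptions.

def goal_count(state, goal):
    """Number of goal atoms that do not hold in the given state."""
    return len(goal - state)

# Example: goal on(A,B), on(B,C) in the state where A is on C and B is on the table.
s = {("on", "A", "C"), ("ontable", "B"), ("ontable", "C"),
     ("clear", "A"), ("clear", "B"), ("handempty",)}
g = {("on", "A", "B"), ("on", "B", "C")}
print(goal_count(s, g))   # 2 -- neither goal atom holds yet

Compare the "A* with goal count heuristics" entry in the statistics further below.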
How is this different from the heuristics in the search lectures? Previously we adapted heuristics to the problem at hand:
▪ Romania Travel → straight-line distance
▪ 8-puzzle → pieces out of place, sum of Manhattan distances
Now we want to define a general heuristic function, without knowing what planning problem is going to be solved!

Domain-Independent Heuristics (1)
[figure: a four-block example where 4 facts have the "wrong value" and can all be fixed with a single action]

Domain-Independent Heuristics (2)
Example: A is on C, and B and C are on the table; the goal is the tower C on A on B, with the optimal plan unstack(A,C), stack(A,B), pickup(C), stack(C,A). Initially 7 facts have the "wrong value": on(A,C), ontable(C), clear(A) and clear(B) hold but must become false, while on(A,B), on(C,A) and clear(C) must become true. After unstack(A,C) the count drops to 6: on(A,C), clear(A) and ¬clear(C) are "repaired", but handempty is "destroyed" and holding(A) appears.

Domain-Independent Heuristics (3)
The heuristic is not perfect: we must often pass states where more facts are "wrong" on our way to a goal state. [figure: search tree from a four-block state, with heuristic values between 4 and 9 along the optimal plan unstack(A,C), putdown(A), pickup(B), stack(B,C), pickup(A), stack(A,B)] It is also generally not admissible (unlike in the 8-puzzle)! That matters to some heuristic search algorithms (not all). Neither issue is fatal in itself: heuristics often lead you astray.

Example Statistics
Planning Competition 2011, Elevators domain, problem 1:
▪ A* with goal count heuristic: 108922864 states generated, gave up.
▪ LAMA 2011 planner (good heuristics, other strategy): solution with 79 steps; 13236 states generated, 425 evaluated/expanded.
It turns out that counting goals is not a very useful heuristic…
▪ Elevators, problem 5, LAMA 2011: solution with 112 steps; 41811 states generated, 1317 evaluated/expanded.
▪ Elevators, problem 20, LAMA 2011: solution with 354 steps; 1364657 states generated, 14985 evaluated/expanded.
Even a state-of-the-art planner can't go directly to a goal state – it generates many more states than those actually on the path to the goal…

Creating Admissible Heuristic Functions: The General Relaxation Principle and Delete Relaxation

Relaxed Planning Problems
A classical planning problem P has a set of solutions:
  Solutions(P) = { π : π is an executable sequence of actions leading from s0 to a state in Sg }

Relaxed Planning Problems (2)
Suppose that:
▪ P is a classical planning problem,
▪ P' is another classical planning problem, and
▪ Solutions(P) ⊆ Solutions(P').
Then P' is a relaxation of P: the constraints on what constitutes a solution have been relaxed. If a plan achieves the goal in P, the same plan also achieves the goal in P' – but P' can have additional solutions. Remember the concepts from this definition! Not just "we relaxed the constraints": now that the problem is defined, we can be more precise – if you transform the planning problem and all "old" solutions still remain solutions, the transformation is a relaxation.

Relaxation
Definition of relaxation: change the problem so that all old solutions remain valid, while new solutions may become possible!
An optimal solution to a relaxed problem can never be more expensive than an optimal solution to the original problem. You have seen relaxations adapted to the 8-puzzle; what can you do independently of the planning domain used?

Relaxation Example 1: Adding new actions
This modifies the state transition system (states/actions): new transitions become available. All old solutions are still valid, but new solutions may exist – in this particular example, a shorter solution exists.

Relaxation Example 2: Adding goal states
For example, the goal on(B,A) ∧ on(A,C) is extended with an alternative involving on(C,B), so that more states count as goal states. The same states and transitions are retained; all old solutions are still valid, but new solutions may exist. In this particular example, the optimal solution retains the same length.

Delete Relaxation (1)
One domain-independent technique: delete relaxation. Assume a classical problem with:
▪ Positive preconditions
▪ Positive goals
▪ Both negative and positive effects
Relaxation: remove all negative effects (all "delete effects")! For example, (unstack ?x ?y) before the transformation:
  :precondition (and (handempty) (clear ?x) (on ?x ?y))
  :effect (and (not (handempty)) (holding ?x)
               (not (clear ?x)) (clear ?y) (not (on ?x ?y)))
and after the transformation:
  :precondition (and (handempty) (clear ?x) (on ?x ?y))
  :effect (and (holding ?x) (clear ?y))

Delete Relaxation (2): Example
In the original problem, unstack(A,C) leads from
  { on(A,C), ontable(B), ontable(C), clear(A), clear(B), handempty }
to a state where on(A,C), clear(A) and handempty no longer hold. In the delete-relaxed problem, the same action instead leads to
  { on(A,C), ontable(B), ontable(C), clear(A), clear(B), clear(C), holding(A), handempty }
Any action that is applicable (or goal that is satisfied) after unstack(A,C) in the original problem is also applicable (satisfied) after unstack(A,C) here, since :strips preconditions and goals are positive! So, given any action sequence:
▪ If it is executable in P, it is executable in delete-relaxed P'.
▪ If it results in a goal state in P, it results in a goal state in delete-relaxed P'.
This is a relaxation, and it is easy to apply mechanically: remove all negative effects ("the delete effects are relaxed/removed").

Delete Relaxation (3): Comments
Delete relaxation is "strange": it leads to "non-physical" states, such as one where holding(A), clear(A) and on(A,C) all hold. Notice that transitions are not added but moved (leading to other states), and the new states have no physical "meaning". This is not a problem: all we need is for solutions to be preserved – and they are!

Delete Relaxation (4): Heuristic
Important: you cannot "use a relaxed problem as a heuristic" – what would that mean? You use the cost of an optimal solution to the relaxed problem as a heuristic. Solving the relaxed problem suboptimally can result in a more expensive solution, which makes the heuristic inadmissible; you have to solve it optimally to get the admissibility guarantee. This gives us the optimal delete relaxation heuristic h+(n):
  h+(n) = the cost of an optimal solution to the delete-relaxed problem, starting in node n

Important Issues
▪ You don't just solve the relaxed problem once. Every time you reach a new state and want to calculate a heuristic, you have to solve the relaxed problem of getting from that state to the goal.
▪ We need a heuristic value for every expanded state – and calculating h+(n) still takes too much time (it requires planning)!
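To round off, here is a minimal Python sketch (not the lecture's code) of applying delete relaxation to a ground action represented with precond/add/delete sets, as in the earlier sketches. All names are illustrative assumptions.

from collections import namedtuple

Action = namedtuple("Action", ["name", "precond", "add", "delete"])

def delete_relax(action):
    """The delete-relaxed version of an action: same precondition and
    positive effects, but no delete effects."""
    return Action(action.name, action.precond, action.add, frozenset())

def relaxed_apply(state, action):
    """Applying a relaxed action only ever adds atoms, so states grow monotonically."""
    return state | action.add

# Example: delete-relaxing unstack(A,C).
unstack_A_C = Action(
    name="unstack(A,C)",
    precond={("handempty",), ("clear", "A"), ("on", "A", "C")},
    add={("holding", "A"), ("clear", "C")},
    delete={("handempty",), ("clear", "A"), ("on", "A", "C")},
)
relaxed = delete_relax(unstack_A_C)

s = {("on", "A", "C"), ("ontable", "B"), ("ontable", "C"),
     ("clear", "A"), ("clear", "B"), ("handempty",)}
print(relaxed_apply(s, relaxed))
# The result still contains on(A,C), clear(A) and handempty -- a "non-physical"
# state, but any positive precondition or goal satisfied in the original result
# is also satisfied here.

Computing h+(n) exactly on top of a representation like this would itself mean solving the delete-relaxed problem optimally for every expanded state, which is exactly the cost issue noted above.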