Planning Graphs
-- As a search space
-- To calculate informed heuristics
Jonas Kvarnström
Automated Planning Group
Department of Computer and Information Science
Linköping University
jonas.kvarnstrom@liu.se – 2016
BACKWARD SEARCH


Recap: Backward Search

 We know if the effects of an action can contribute to the goal
 We don't know if we can reach a state where its preconditions are true,
  so that we can execute it

[Figure: regressing the goal at(LiU) yields subgoals such as
{at(home), have-heli} or {at(home), have-shoes}]

One solution: Heuristics. But other methods exist…

Reachable States

Suppose that:
 We are interested in sequential plans with few actions

Suppose that we could:
 Quickly calculate the set of reachable states at time 0, 1, 2, 3, …
 Quickly test formulas in such sets

Then we could also:
 Use this to prune the backward search tree!

Reachable States

First step: Determine the length of a shortest solution
 Doesn't tell us the actual solutions: we know states, not paths

[Figure: starting from the initial state at time 0, 10 states are
reachable at time 1, 200 at time 2, 4000 at time 3, and 12000 at time 4.
The goal is not satisfied in any state at times 0–3, but is satisfied in
14 of the states at time 4: all shortest solutions must have 4 actions!]

Backward Search with Pruning

Second step: Backward search with pruning

[Figure: regression from the goal at(LiU), which is true in 14 of the
12000 reachable states at time 4. One relevant action: fly-heli, with
preconditions {at(home), have-heli}. These are not true (reachable) in
any of the 4000 states at time 3! We know the preconditions can't be
achieved in 3 steps → backtrack.]
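In code, this kind of pruning might look as follows. A minimal sketch:
regress(goal, action) and satisfiable_at(subgoal, t) are hypothetical
helpers that compute the regressed subgoal and test whether any
(possibly) reachable state at time t satisfies it.

def backward_search(goal, t, relevant_actions):
    """Try to achieve `goal` at time t, working back toward time 0."""
    if t == 0:
        return [] if satisfiable_at(goal, 0) else None  # the initial state?
    for a in relevant_actions(goal):
        subgoal = regress(goal, a)
        if not satisfiable_at(subgoal, t - 1):
            continue              # prune: preconds unreachable at time t-1
        plan = backward_search(subgoal, t - 1, relevant_actions)
        if plan is not None:
            return plan + [a]
    return None                   # no relevant action works: backtrack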

Backward Search with Pruning (2)

There must be some other way of achieving at(LiU)!

[Figure: another relevant action: walk, with preconditions
{at(home), have-shoes}. These are achievable at time 3 (true in several
states). Continue backward search as usual, using reachable states to
prune the search tree.]
Problem

Problem: Fast and exact computation is impossible!

 "Suppose we could quickly calculate all reachable states…"
 In most cases, calculating exactly the reachable states
  would take far too much time and space!
 Argument as before
  ("otherwise planning would be easy, and we know it isn't")

Possibly Reachable States (1)

Solution: Don't be exact!
 Quickly calculate an overestimate
 Anything outside this set is definitely not reachable – still useful for pruning

[Figure: at time 1, 15 states are possibly reachable, including the 10
truly reachable; at time 2, 500, including the 200; at time 3, 15000,
including all 4000 truly reachable (out of a billion?); at time 4,
42000, including all 12000 truly reachable.]

Possibly Reachable States (2)

Planning algorithm:
 Keep calculating until we find a timepoint
  where the goal might be achievable

[Figure: the goal is not satisfied in any possibly reachable state at
times 0–2. At time 3 it is satisfied in 37 possibly reachable states:
a shortest solution must have at least 3 actions!]

Possibly Reachable States (3)

Backward search will verify what is truly reachable
 In a much smaller search space than plain backward search

[Figure: the goal seems to be reachable at time 3, but we can't be sure
(overestimating!). These are all of the relevant actions: fly-heli and
walk, both with goal at(LiU), but the preconditions of both are
unsatisfied among the possibly reachable states at time 2. (Of course
we don't always detect the bad choice in a single time step!)]

Possibly Reachable States (4)

Extend one time step and try again
 A larger number of states may be truly reachable then

[Figure: at time 4, fly-heli's preconditions are still not satisfied
(pruning still possible!), while walk's preconditions seem OK. Continue
backward search, using reachable states to prune the search tree
→ smaller search space.]

Iterative Search

This is a form of iterative deepening search!
 Time step = state step (1 time step → 0 actions)

Classical problem P:
 Loop for i = 1, 2, …
  ▪ Try to find a plan with i time steps (including the initial state)
  ▪ If one is found: Plan!
  ▪ Otherwise: no solution with i−1 actions exists; continue with i+1

{ solutions to P } = ⋃ { solutions to P with i time steps | i ∈ ℕ, i ≥ 0 }
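As code, the outer loop might look as follows. A minimal sketch:
find_plan_with_steps(problem, i) is a hypothetical helper that searches
for a plan with i time steps (including the initial state) and returns
None if no such plan exists.

def iterative_plan_search(problem, find_plan_with_steps, max_steps=1000):
    for i in range(1, max_steps + 1):
        plan = find_plan_with_steps(problem, i)
        if plan is not None:
            return plan     # Plan! No solution with i-1 actions existed.
    return None             # Gave up within the step bound.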
An efficient representation for possibly reachable states
Planning Graph

A Planning Graph also considers possibly executable actions
 Useful to generate states – also useful in backwards search!

[Figure: k+1 proposition levels (which propositions may possibly hold
in each state?) alternating with k action levels (which actions may
possibly be executed in each step?): initial state → possibly
executable actions → possibly reachable states → …]

GraphPlan: Plan Structure

GraphPlan's plans are sequences of sets of actions
 → Fewer levels required!

Example:
 { load(Package1,Truck1), load(Package2,Truck2), load(Package3,Truck3) }
  – can be executed in arbitrary order
 { drive(T1,A,B), drive(T2,C,D), drive(T3,E,F) }
  – arbitrary order
 { unload(Package1,Truck1), unload(Package2,Truck2), unload(Package3,Truck3) }
  – can be executed in arbitrary order

Not necessarily in parallel – the original objective was a sequential plan

Running Example

Running example due to Dan Weld (modified):
 Prepare and serve a surprise dinner,
 take out the garbage,
 and make sure the present is wrapped before waking your sweetheart!

s0 = {clean, garbage, asleep}
g = {clean, ¬garbage, served, wrapped}

Action     Preconds   Effects
cook()     clean      dinner
serve()    dinner     served
wrap()     asleep     wrapped
carry()    garbage    ¬garbage, ¬clean
roll()     garbage    ¬garbage, ¬asleep
clean()    ¬clean     clean
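For the sketches that follow, one possible encoding of this domain as
plain Python data (this encoding is an assumption for illustration;
literals are strings, with "-p" denoting ¬p):

# Each action maps to a (preconditions, effects) pair of literal sets.
ACTIONS = {
    "cook":  ({"clean"},   {"dinner"}),
    "serve": ({"dinner"},  {"served"}),
    "wrap":  ({"asleep"},  {"wrapped"}),
    "carry": ({"garbage"}, {"-garbage", "-clean"}),
    "roll":  ({"garbage"}, {"-garbage", "-asleep"}),
    "clean": ({"-clean"},  {"clean"}),
}
S0   = {"clean", "garbage", "asleep"}
GOAL = {"clean", "-garbage", "served", "wrapped"}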
Reachable States

 Time 0:
  ▪ s0 = {clean, garbage, asleep}
 Time 1:
  ▪ cook → {clean, garbage, asleep, dinner}
  ▪ serve → impossible
  ▪ wrap → {clean, garbage, asleep, wrapped}
  ▪ carry → {asleep}
  ▪ roll → {clean}
  ▪ clean → impossible
  ▪ cook+wrap → {garbage, clean, asleep, dinner, wrapped}
  ▪ cook+roll → {clean, dinner}
  ▪ …
 Time 2:
  ▪ cook/cook → {clean, garbage, asleep, dinner}
  ▪ cook/serve → {clean, garbage, asleep, dinner, served}
  ▪ cook/wrap → {clean, garbage, asleep, dinner, wrapped}
  ▪ cook/carry → {asleep, dinner}
  ▪ cook/roll → {clean, dinner}
  ▪ cook/clean → not possible
  ▪ wrap/cook → …

Can't calculate and store all reachable states, one at a time…
→ Let's calculate reachable literals instead!
Reachable Literals (1)

 Time 0:
  ▪ s0 = {garbage, clean, asleep}
 Time 1:
  ▪ cook → s1 = {garbage, clean, asleep, dinner}
  ▪ (serve) → not applicable
  ▪ wrap → s2 = {garbage, clean, asleep, wrapped}
  ▪ carry → s3 = {asleep}
  ▪ roll → s4 = {clean}
  ▪ (clean) → not applicable

No need to consider sets of actions:
all literals made true by any combination of "parallel" actions are already there

[Figure: state level 0 = {garbage, clean, asleep, ¬dinner, ¬served,
¬wrapped}; an action level; then state level 1, which contains every
literal except served. Depending on which actions we choose here…]

 We can't reach all combinations of literals in state level 1…
 But we can reach only such combinations!
 Cannot reach a state where served is true
Reachable Literals (2)

Planning Graph Extension:
 Start with one "state level"
  ▪ Set of reachable literals
 For each applicable action
  ▪ Add its effects to the next state level
  ▪ Add edges to preconditions and to effects
   for bookkeeping (used later!)

[Figure: state level 0 = {garbage, clean, asleep, ¬dinner, ¬served,
¬wrapped}; action level 1 = {carry, roll, cook, wrap}; state level 1 =
{¬garbage, ¬clean, ¬asleep, dinner, wrapped}, with edges to each
action's preconditions and effects.]

But wait! Some propositions are missing…
Reachable Literals (3)

Depending on the actions chosen,
facts could persist from the previous level!

To handle this consistently: maintenance (noop) actions
 One for each literal l
 Precond = effect = l

Action         Precond    Effects
noop-dinner    dinner     dinner
noop-¬dinner   ¬dinner    ¬dinner
…

[Figure: with the noops added, state level 1 now contains every literal
except served: garbage, ¬garbage, clean, ¬clean, asleep, ¬asleep,
dinner, ¬dinner, ¬served, wrapped, ¬wrapped.]
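A minimal sketch of one expansion step, using the encoding introduced
earlier: close state level 0 under the closed-world assumption, include
every action (noops included) whose preconditions are present, and
collect all effects into the next state level.

def neg(l):
    """Negate a literal: "p" <-> "-p"."""
    return l[1:] if l.startswith("-") else "-" + l

def literal_closure(state, atoms):
    """State level 0: each atom, or its negation (closed world)."""
    return {a if a in state else neg(a) for a in atoms}

def expand_level(level, actions):
    """One action level plus the following state level."""
    acts = {name: ae for name, ae in actions.items()
            if ae[0] <= level}              # possibly applicable actions
    for l in level:
        acts["noop-" + l] = ({l}, {l})      # maintenance (noop) actions
    nxt = set()
    for pre, eff in acts.values():
        nxt |= eff
    return acts, nxt

ATOMS = {"garbage", "clean", "asleep", "dinner", "served", "wrapped"}
SL0 = literal_closure(S0, ATOMS)
AL1, SL1 = expand_level(SL0, ACTIONS)
# SL1 now contains every literal except "served", as on the slide.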
Reachable Literals (4)

Now the graph is sound
 If an action might be executable, it is part of the graph
 If a literal might hold in a given state, it is part of the graph

But it is quite "weak"!
 Even at state level 1, it seems any literal except served can be achieved

We need more information
 In an efficiently useful format
 → Mutual exclusion
Mutex 1: Inconsistent Effects

Two actions in a level are mutex if their effects are inconsistent
 Can't execute them in parallel, and order of execution is not arbitrary

Examples at action level 1:
 carry / noop-garbage, roll / noop-garbage
  ▪ One causes garbage, the others cause not garbage
 roll / noop-asleep
 cook / noop-¬dinner
 wrap / noop-¬wrapped
 carry / noop-clean

No mutexes at state level 0: we assume a consistent initial state!
Mutex 2: Interference

Two actions in one level are mutex
if one destroys a precondition of the other
 Can't be executed in arbitrary order

 carry is mutex with noop-garbage
  ▪ carry deletes garbage
  ▪ noop-garbage needs garbage
 roll is mutex with wrap
  ▪ roll deletes asleep
  ▪ wrap needs asleep
 …
Mutex 3: Inconsistent Support

Two propositions are mutex
if one is the negation of the other
 Can't be true at the same time…
Mutex 4: Inconsistent Support

Two propositions are mutex if they have inconsistent support:
all actions that achieve them are pairwise mutex in the previous level

 ¬asleep can only be achieved by roll,
  wrapped can only be achieved by wrap,
  and roll/wrap are mutex
  → ¬asleep and wrapped are mutex
 ¬clean can only be achieved by carry,
  dinner can only be achieved by cook,
  and carry/cook are mutex
  → ¬clean and dinner are mutex
Mutual Exclusion: Summary

Two actions at the same action-level are mutex if:
 Inconsistent effects: an effect of one negates an effect of the other
 Interference: one deletes a precondition of the other
 Competing needs: they have mutually exclusive preconditions
Otherwise they don't interfere with each other
 → Both may appear at the same time step in a solution plan

Two literals at the same state-level are mutex if:
 Inconsistent support: one is the negation of the other,
  or all ways of achieving them are pairwise mutex

→ Recursive propagation of mutexes
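A minimal sketch of these checks, for the (preconditions, effects)
representation used earlier. literal_mutex_prev and action_mutex_pairs
are assumed to hold the mutex pairs (as frozensets) already computed for
the preceding level.

def neg(l):                                  # as in the expansion sketch
    return l[1:] if l.startswith("-") else "-" + l

def actions_mutex(a1, a2, literal_mutex_prev):
    """Inconsistent effects, interference, or competing needs."""
    (pre1, eff1), (pre2, eff2) = a1, a2
    if any(neg(e) in eff2 for e in eff1):              # inconsistent effects
        return True
    if any(neg(e) in pre2 for e in eff1) or \
       any(neg(e) in pre1 for e in eff2):              # interference
        return True
    return any(frozenset((p, q)) in literal_mutex_prev # competing needs
               for p in pre1 for q in pre2)

def literals_mutex(p, q, achievers, action_mutex_pairs):
    """Negation, or inconsistent support: all achiever pairs mutex."""
    if p == neg(q):
        return True
    # A single common achiever (a1 == a2) makes them non-mutex.
    return all(frozenset((a1, a2)) in action_mutex_pairs
               for a1 in achievers[p] for a2 in achievers[q])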

Early Solution Check

Is there a possible solution?
 Goal g = {clean, ¬garbage, served, wrapped}
 No: cannot reach a state where served is true
  in a single (multi-action) step
Expanded Planning Graph

[Figure: the planning graph expanded with action level 2 and state
level 2; state level 2 contains every literal, including served.]

All goal literals are present in level 2, and none of them are (known to be) mutex!
Solution Extraction (1)

g = {clean, ¬garbage, served, wrapped}

We seem to have 12 alternatives at the last layer (3·2·1·2):
 3 ways of achieving ¬garbage
 2 ways of achieving clean
 1 way of achieving served
 2 ways of achieving wrapped
Solution Extraction (2)

g = {clean, ¬garbage, served, wrapped}

Here: an inconsistent alternative (mutex actions chosen)…
Solution Extraction (3)

g = {clean, ¬garbage, served, wrapped}

This is one of the consistent combinations
– a successor "multi-action" in this backwards search

The choice is a backtrack point!
(May have to backtrack, try to achieve ¬garbage by roll instead)
Solution Extraction (4)

The successor has a new goal
 This goal consists of all preconditions of the chosen actions
 (Not a backtrack point)
 The goal is not mutually exclusive
  (if so, it would be unreachable)

Successor goal = {clean, garbage, dinner, wrapped}
Solution Extraction (5)

One possible solution:
 1) cook/wrap in any order,
 2) serve/roll in any order
Solution Extraction (6)

A form of backwards search,
but only among the actions in the graph
(generally much fewer, especially with mutexes)!

procedure Solution-extraction(g, i)
  // g: the set of goals we are trying to achieve
  // i: the level of the state si, starting at the highest level
  if i = 0 then return the solution
  nondeterministically choose a set of non-mutex actions
    ("real" actions and/or maintenance actions)
    to use in state si−1 to achieve g
    (must achieve the entire goal!)
  if no such set exists then fail (backtrack)
  g' := {the preconditions of the chosen actions}
  Solution-extraction(g', i−1)
end Solution-extraction
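The nondeterministic choice can be rendered as deterministic
backtracking. A minimal sketch: achiever_sets(g, i) is a hypothetical
helper that enumerates the non-mutex sets of actions (real and/or
maintenance) in action level i that jointly achieve g, and preconds
maps each action to its precondition set.

def solution_extraction(g, i, achiever_sets, preconds):
    """Return a list of action sets (one per time step), or None."""
    if i == 0:
        return []                            # reached the initial state
    for acts in achiever_sets(g, i):         # the backtrack point
        g2 = set()
        for a in acts:
            g2 |= preconds[a]                # new goal: all preconditions
        rest = solution_extraction(g2, i - 1, achiever_sets, preconds)
        if rest is not None:
            return rest + [set(acts)]
    return None                              # no set works: fail (backtrack)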
Important Properties

Possible literals:
 What is achieved is always carried forward by no-ops
 → Monotonically increase over state levels

Possible actions:
 Action included if all precondition literals exist in the preceding state level
 → Monotonically increase over action levels

[Figure: levels 0–2 of the example graph, illustrating that each state
level contains all literals of the previous one.]
Important Properties (2)

Mutex relationships:
 Mutexes between included literals monotonically decrease
  ▪ If two literals could be achieved together in the previous level,
   we could always just "do nothing", preserving them to the next level
 (Mutexes to newly added literals can be introduced)

At some point, the Planning Graph "levels off"
 After some time k all levels are identical
 (Why?)

A planning graph is exactly what we described here –
not some arbitrary planning-related graph
[Figure: top, real state reachability for a Rover problem ("avail"
facts not shown); bottom, the Planning Graph for the same problem
(mutexes not shown). At each step, an overestimate of reachable
properties / applicable actions!]
Parallel Optimality

A form of iterative deepening:
 Classical problem P
 Loop for i = 1, 2, …
  ▪ Try to find a plan with i time steps (including the initial state)
  ▪ If one is found: Plan!

Therefore, GraphPlan is optimal in the number of time steps
 Not very useful: we normally care much more about
  ▪ Total action cost
  ▪ Number of actions (special case where action cost = 1)
  ▪ Total execution time ("makespan")

[Figure: the three-step load/drive/unload plan from before, where each
time step contains three actions.]
Introduction

Heuristics as approximations of h+ (optimal delete relaxation):

 h1(s) ≤ h+(s), using cost(p ∧ q) = max(cost(p), cost(q))
  ▪ Optimistic: as if achieving the most expensive goal
   would always achieve all the others
  ▪ Gives far too little information
 h0(s) = hadd(s) ≥ h+(s), using cost(p ∧ q) = sum(cost(p), cost(q))
  ▪ Pessimistic: as if achieving one goal
   could never help in achieving any other
  ▪ Informative, but always exceeds h+
   and can exceed even h* by a large margin!

How can we take some interactions into account?
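Both cost propagations can be sketched as one fixpoint computation with
a pluggable combination function. A minimal sketch, assuming unit
action costs and positive (delete-relaxed) preconditions and effects:

def goal_cost(state, actions, goal, combine):
    """Fixpoint of atom costs; combine over precondition/goal sets."""
    cost = {p: 0 for p in state}
    changed = True
    while changed:
        changed = False
        for pre, eff in actions.values():
            if all(p in cost for p in pre):
                c = 1 + combine([cost[p] for p in pre])
                for q in eff:
                    if c < cost.get(q, float("inf")):
                        cost[q] = c
                        changed = True
    if any(g not in cost for g in goal):
        return float("inf")                    # goal unreachable
    return combine([cost[g] for g in goal])

def h1(s, actions, goal):                      # max: optimistic
    return goal_cost(s, actions, goal, lambda xs: max(xs, default=0))

def h_add(s, actions, goal):                   # sum: pessimistic
    return goal_cost(s, actions, goal, sum)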
Relaxed Planning Graphs

The planning graph takes many interactions into account
 One possible heuristic h(s):
  ▪ Use GraphPlan to find a solution starting in s
  ▪ Return the number of actions in the solution
 Too slow (requires plan generation), but:

Let's apply delete relaxation, then construct a planning graph!
(Called a relaxed planning graph – pioneered by FastForward, FF)

Recall: delete relaxation assumes we have positive preconds and goals
Building Relaxed Planning Graphs

Building a relaxed planning graph:
 Construct state level 0 (SL0)
  ▪ Atoms in the initial state
 Construct action level 1 (AL1)
  ▪ Actions whose preconditions are included in SL0
 Two actions in AL1 would be mutex if:
  ▪ Their effects are inconsistent
  ▪ One destroys a precondition of the other
  → Can't happen! Only positive effects!
 Construct state level 1 (SL1)
  ▪ All effects of actions in AL1
 Two propositions in SL1 would be mutex if:
  ▪ One is the negation of another
  ▪ All actions that achieve them are mutex
  → Can't happen! Only positive propositions, no mutex actions!
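A minimal sketch of this construction, for delete-relaxed actions given
as (precondition, effect) sets of positive atoms:

def build_rpg(s0, actions, goal, max_levels=100):
    """Return the state levels SL0, SL1, ... of a relaxed planning graph,
    stopping when the goal is present (or the graph levels off)."""
    levels = [set(s0)]
    while not goal <= levels[-1]:
        cur = levels[-1]
        nxt = set(cur)                    # noops: atoms always persist
        for pre, eff in actions.values():
            if pre <= cur:                # possibly applicable action
                nxt |= eff
        if nxt == cur or len(levels) > max_levels:
            return None                   # leveled off: goal unreachable
        levels.append(nxt)
    return levels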
No Backtracking

Recall the backwards search procedure (the situation with delete effects):
 ▪ Goal specifies which propositions to achieve in SL2
 ▪ Choose one of many possible sets of achieving actions in AL2
 ▪ If they are mutex, backtrack
 ▪ Determine which propositions must be achieved in SL1 (preconds of actions)
 ▪ Choose actions in AL1
 ▪ If they are mutex, backtrack
 ▪ …

No delete effects
→ No mutexes
→ No backtracking
Properties of Relaxed Planning Graphs

The relaxed planning graph considers positive interactions
 For example, when one action achieves multiple goals
 Ignores negative interactions

 No delete effects → no mutexes to calculate
  (no inconsistent effects, no interference, …)
 No mutexes exist → can select more actions per level,
  fewer levels required
 No mutexes exist → no backtracking needed in solution extraction
 Can extract a GraphPlan-optimal relaxed plan in polynomial time

Heuristic: hFF(s) = number of actions in a relaxed plan from state s
Complexity

How can this be efficient?
Sounds as if we calculate h+, which is NP-complete!

 The plan that is extracted is only GraphPlan-optimal!
  ▪ Optimal number of time steps
  ▪ Possibly sub-optimal number of actions (or suboptimal action costs)
  ▪ → hFF is not admissible:
   can be greater than h+ (but not smaller!)
   and can be greater than h* (or smaller)
 Still, the delete-relaxed plan can take positive interactions into account
  ▪ → Often closer to true costs than hadd is
 Plan extraction can use several heuristics (!)
  ▪ Trying to reduce the sequential length of the relaxed plan,
   to get even closer to true costs
Plan Extraction Heuristics (1)

Recall that plan extraction uses backward search
 For each goal fact at a given level, we must find an achiever

[Figure:
 1. Here, some chosen action has at(β) as a precondition
 2. Here, at(β) must be achieved
 3. Here, at least one of drive(α,β), noop-at-β, drive(γ,β)
  must be the achiever]
Plan Extraction Heuristics (1): Noop

1. If we need to achieve this fact…
2. And there's a NOOP action available in the preceding level…
 Then use the noop action as the achiever
  ▪ Achieve every fact as early as possible!

In a delete-relaxed problem, this strategy cannot lead to backtracking
 If there is a noop action, then the fact can be achieved earlier
 There are no delete effects → no action can conflict with achieving it earlier
Plan Extraction Heuristics (2)

Difficulty heuristic:
 If there is no maintenance action available for f in level i−1,
  choose an action whose preconditions seem "easy" to satisfy
 Then:
  ▪ Let p be a precondition fact
  ▪ difficulty(p) = index of the first state level where p appears
 And:
  ▪ Let a be an action
  ▪ difficulty(a) = sum { index of the first state level where p appears |
    p is a precondition of a }
 Select an action with minimal difficulty
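A minimal sketch of this measure; first_level is assumed to map each
fact to the index of the first state level where it appears:

def difficulty(action_name, actions, first_level):
    """Sum of first-appearance levels over the action's preconditions."""
    pre, _eff = actions[action_name]
    return sum(first_level[p] for p in pre)

def easiest_achiever(candidates, actions, first_level):
    """Among candidate achievers, pick the least difficult one."""
    return min(candidates, key=lambda a: difficulty(a, actions, first_level))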
Simplified Plan Extraction

function ExtractPlan(plan graph P0 A0 P1 … An−1 Pn, goal G)
  for i = n … 1 do
    Gi ← goals reached at level i
      // Partition the goals of the problem instance
      // depending on the level where they are first reached
  end for
  for i = n … 1 do
    for all g ∈ Gi not marked ACHIEVED at time i do
      // All goals that could not be reached before level i
      // must be reached at level i
      find a min-difficulty a ∈ Ai−1 s.t. g ∈ add(a)
      RPi−1 ← RPi−1 ∪ {a}        // Add to the relaxed plan
      for all f ∈ prec(a) do
        // Must achieve prec(a) at some time! Heuristic: do this at
        // the first level we can – layerof(f)
        Glayerof(f) ← Glayerof(f) ∪ {f}
      end for
      for all f ∈ add(a) do
        // One action can achieve more than one goal –
        // mark them all as "done"
        mark f as ACHIEVED at times i−1 and i
      end for
    end for
  end for
  return RP                      // The relaxed plan
end function
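The same procedure as a runnable sketch, using the state levels from
build_rpg above and the difficulty measure; noops are implicit here
(each goal is simply scheduled at the first level where it appears):

def extract_relaxed_plan(levels, actions, goal):
    """Return one set of actions per step; hFF = total action count."""
    n = len(levels) - 1
    first = {}                                 # fact -> first state level
    for i, lev in enumerate(levels):
        for f in lev:
            first.setdefault(f, i)
    goals = [set() for _ in range(n + 1)]
    for g in goal:
        goals[first[g]].add(g)                 # partition goals by level
    plan = [set() for _ in range(n)]
    for i in range(n, 0, -1):
        achieved = set()
        for g in sorted(goals[i]):
            if g in achieved:
                continue                       # already handled at level i
            cands = [a for a, (pre, eff) in actions.items()
                     if g in eff and pre <= levels[i - 1]]
            a = min(cands, key=lambda c:       # difficulty heuristic
                    sum(first[p] for p in actions[c][0]))
            plan[i - 1].add(a)
            achieved |= actions[a][1]          # one action, several goals
            for p in actions[a][0]:
                goals[first[p]].add(p)         # schedule preconds early
    return plan

# hFF(s) would then be sum(len(step) for step in extract_relaxed_plan(...)).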
HSP

Recall hill climbing in HSP!
 Works approximately like this (some intricacies omitted):

  impasses = 0;
  unexpanded = { };
  current = initialNode;
  while (not yet reached the goal) {
    children ← expand(current);           // apply all applicable actions
    if (children = ∅) {
      current = pop(unexpanded);
    } else {
      bestChild ← best(children);         // choose a child with minimal h
      add other children to unexpanded in order of h(n);
      if (h(bestChild) ≥ h(current)) {    // allow a few steps without
        impasses++;                       // improvement of heuristic value
        if (impasses == threshold) {      // too many such steps →
          current = pop(unexpanded);      // restart at some other point
          impasses = 0;
          continue;
        }
      }
      current = bestChild;
    }
  }
Enforced Hill Climbing

FF uses enforced hill climbing – approximately:
 s ← init-state
 repeat
  expand breadth-first from s until a better state s' is found;
  s ← s'
 until a goal state is found

Compared to HSP's hill climbing:
 Wait longer to decide which branch to take
 Don't restart – keep going

[Figure: two breadth-first steps; states beyond the first improving
state are not expanded.]
Properties of EHC

Is Enforced Hill-Climbing efficient / effective?
 In many cases (when paired with FF's Relaxed Planning Graph heuristic)
 But plateaus / local minima must be escaped using breadth-first search
  ▪ Large problems → many actions to explore
  ▪ Analysis: Hoffmann (2005),
   "Where 'ignoring delete lists' works: Local search topology in planning benchmarks"

Is Enforced Hill-Climbing complete?
 No!
 You could still choose a state from which the goal is unreachable

If you reach a dead end:
 HSP used random restarts
 FF uses best-first search from the initial state,
  using only the inadmissible FF heuristic for guidance
  (no consideration of "cost so far")
Helpful Actions

Pruning technique in FF: helpful actions in state s
 Recall: FF's heuristic function for state s
  ▪ Construct a relaxed planning graph starting in s
  ▪ Extract a relaxed plan
   (choose actions among those potentially executable in each level)
 Helpful actions:
  ▪ The actions selected in action level 1
  ▪ Plus all other actions that could achieve the same subgoals
   (but did not happen to be chosen)
  ▪ More likely to be useful to the plan than other executable actions
 FF constrains EHC to only use helpful actions!
  ▪ Sometimes also called preferred operators
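A minimal sketch of this test, assuming relaxed-plan extraction has
already produced goals1, the set of subgoals scheduled at state level 1
(goals[1] in the extraction sketch above):

def helpful_actions(state, actions, goals1):
    """Applicable actions that add at least one level-1 subgoal."""
    return {a for a, (pre, eff) in actions.items()
            if pre <= state and eff & goals1}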
FF: Helpful Actions (2)

Delete-relaxed problem:
 s0 = {clean, garbage, asleep}
 g = {clean, served, wrapped}

Action     Preconds   Effects
cook()     clean      dinner
serve()    dinner     served
wrap()     asleep     wrapped
carry()    garbage    –
roll()     garbage    –
clean()    clean      clean

[Figure: suppose this is the relaxed plan generated by FF:
state level 0 {garbage, clean, asleep} → action level 1 with cook and
wrap selected (carry and roll also present) → state level 1
{garbage, clean, asleep, dinner, wrapped} → …more levels]

Possible in level 1: carry, roll, clean, cook, wrap
Used in relaxed plan: cook, wrap
Achieved at level 1: clean, dinner, wrapped
Helpful actions: cook, wrap
FF: EHC with Helpful Actions

EHC with helpful actions:
 Non-helpful actions crossed over, never expanded

[Figure: a search tree labeled with heuristic values (12, 10, 11, 5,
16, 6, 10, 8, 7, …); branches reached via non-helpful actions are
crossed out and never expanded.]
FF: EHC with Helpful Actions (2)

EHC with helpful actions:

 EHC(initial state I, goal G)
  plan ← EMPTY
  s ← I
  while hFF(s) != 0 do
    execute breadth-first search from s,
     using only helpful actions,
     to find the first s' such that hFF(s') < hFF(s)
    if no such state is found then fail
    plan ← plan + actions on the path to s'
    s ← s'
  end while
  return plan

Incomplete if there are dead ends!
If EHC fails, fall back on best-first search using f(s) = hFF(s)
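A runnable rendering of this loop. A minimal sketch, assuming
hypothetical helpers: h(s) (e.g. hFF) and successors(s), which yields
(action, next_state) pairs already restricted to helpful actions;
states must be hashable.

from collections import deque

def ehc(s0, h, successors):
    plan, s = [], s0
    while h(s) != 0:
        frontier = deque([(s, [])])
        seen = {s}
        found = None
        while frontier and found is None:      # breadth-first search
            cur, path = frontier.popleft()
            for a, nxt in successors(cur):
                if nxt in seen:
                    continue
                seen.add(nxt)
                if h(nxt) < h(s):              # first strictly better state
                    found = (nxt, path + [a])
                    break
                frontier.append((nxt, path + [a]))
        if found is None:
            return None    # fail: fall back on best-first search with hFF
        s, steps = found
        plan += steps
    return plan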