Planning Graphs
-- As a search space
-- To calculate informed heuristics
Jonas Kvarnström
Automated Planning Group
Department of Computer and Information Science
Linköping University
jonas.kvarnstrom@liu.se – 2016
BACKWARD SEARCH


Recap: Backward Search

 We know if the effects of an action can contribute to the goal
 We don't know if we can reach a state where its preconditions are true,
  so that we can execute it

[Figure: regressing the goal at(LiU) yields subgoals such as
{at(home), have-heli} or {at(home), have-shoes}]

One solution: Heuristics. But other methods exist…

Reachable States

Suppose that:
 We are interested in sequential plans with few actions

Suppose that we could:
 Quickly calculate the set of reachable states at time 0, 1, 2, 3, …
 Quickly test formulas in such sets

Then we could also:
 Use this to prune the backward search tree!

Reachable States

First step: Determine the length of a shortest solution
 Doesn't tell us the actual solutions: we know states, not paths

[Figure: starting from the initial state at time 0, 10 states are
reachable at time 1, 200 at time 2, 4000 at time 3, and 12000 at time 4.
The goal is not satisfied in any state at times 0–3, but is satisfied in
14 of the states at time 4: all shortest solutions must have 4 actions!]

Backward Search with Pruning

Second step: Backward search with pruning

[Figure: regression from the goal at(LiU), which is true in 14 of the
12000 reachable states at time 4. One relevant action: fly-heli, with
preconditions {at(home), have-heli}. These are not true (reachable) in
any of the 4000 states at time 3! We know the preconditions can't be
achieved in 3 steps → backtrack.]
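In code, this kind of pruning might look as follows. A minimal sketch:
regress(goal, action) and satisfiable_at(subgoal, t) are hypothetical
helpers that compute the regressed subgoal and test whether any
(possibly) reachable state at time t satisfies it.

def backward_search(goal, t, relevant_actions):
    """Try to achieve `goal` at time t, working back toward time 0."""
    if t == 0:
        return [] if satisfiable_at(goal, 0) else None  # the initial state?
    for a in relevant_actions(goal):
        subgoal = regress(goal, a)
        if not satisfiable_at(subgoal, t - 1):
            continue              # prune: preconds unreachable at time t-1
        plan = backward_search(subgoal, t - 1, relevant_actions)
        if plan is not None:
            return plan + [a]
    return None                   # no relevant action works: backtrack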

Backward Search with Pruning (2)

There must be some other way of achieving at(LiU)!

[Figure: another relevant action: walk, with preconditions
{at(home), have-shoes}. These are achievable at time 3 (true in several
states). Continue backward search as usual, using reachable states to
prune the search tree.]
Problem

Problem: Fast and exact computation is impossible!

 "Suppose we could quickly calculate all reachable states…"
 In most cases, calculating exactly the reachable states
  would take far too much time and space!
 Argument as before
  ("otherwise planning would be easy, and we know it isn't")

Possibly Reachable States (1)

Solution: Don't be exact!
 Quickly calculate an overestimate
 Anything outside this set is definitely not reachable – still useful for pruning

[Figure: at time 1, 15 states are possibly reachable, including the 10
truly reachable; at time 2, 500, including the 200; at time 3, 15000,
including all 4000 truly reachable (out of a billion?); at time 4,
42000, including all 12000 truly reachable.]

Possibly Reachable States (2)

Planning algorithm:
 Keep calculating until we find a timepoint
  where the goal might be achievable

[Figure: the goal is not satisfied in any possibly reachable state at
times 0–2. At time 3 it is satisfied in 37 possibly reachable states:
a shortest solution must have at least 3 actions!]

Possibly Reachable States (3)

Backward search will verify what is truly reachable
 In a much smaller search space than plain backward search

[Figure: the goal seems to be reachable at time 3, but we can't be sure
(overestimating!). These are all of the relevant actions: fly-heli and
walk, both with goal at(LiU), but the preconditions of both are
unsatisfied among the possibly reachable states at time 2. (Of course
we don't always detect the bad choice in a single time step!)]

Possibly Reachable States (4)

Extend one time step and try again
 A larger number of states may be truly reachable then

[Figure: at time 4, fly-heli's preconditions are still not satisfied
(pruning still possible!), while walk's preconditions seem OK. Continue
backward search, using reachable states to prune the search tree
→ smaller search space.]

Iterative Search

This is a form of iterative deepening search!
 Time step = state step (1 time step → 0 actions)

Classical problem P:
 Loop for i = 1, 2, …
  ▪ Try to find a plan with i time steps (including the initial state)
  ▪ If one is found: Plan!
  ▪ Otherwise: no solution with i−1 actions exists; continue with i+1

{ solutions to P } = ⋃ { solutions to P with i time steps | i ∈ ℕ, i ≥ 0 }
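As code, the outer loop might look as follows. A minimal sketch:
find_plan_with_steps(problem, i) is a hypothetical helper that searches
for a plan with i time steps (including the initial state) and returns
None if no such plan exists.

def iterative_plan_search(problem, find_plan_with_steps, max_steps=1000):
    for i in range(1, max_steps + 1):
        plan = find_plan_with_steps(problem, i)
        if plan is not None:
            return plan     # Plan! No solution with i-1 actions existed.
    return None             # Gave up within the step bound.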
An efficient representation for possibly reachable states
Planning Graph

A Planning Graph also considers possibly executable actions
 Useful to generate states – also useful in backwards search!

[Figure: k+1 proposition levels (which propositions may possibly hold
in each state?) alternating with k action levels (which actions may
possibly be executed in each step?): initial state → possibly
executable actions → possibly reachable states → …]

GraphPlan: Plan Structure

GraphPlan's plans are sequences of sets of actions
 → Fewer levels required!

Example:
 { load(Package1,Truck1), load(Package2,Truck2), load(Package3,Truck3) }
  – can be executed in arbitrary order
 { drive(T1,A,B), drive(T2,C,D), drive(T3,E,F) }
  – arbitrary order
 { unload(Package1,Truck1), unload(Package2,Truck2), unload(Package3,Truck3) }
  – can be executed in arbitrary order

Not necessarily in parallel – the original objective was a sequential plan

Running Example

Running example due to Dan Weld (modified):
 Prepare and serve a surprise dinner,
 take out the garbage,
 and make sure the present is wrapped before waking your sweetheart!

s0 = {clean, garbage, asleep}
g = {clean, ¬garbage, served, wrapped}

Action     Preconds   Effects
cook()     clean      dinner
serve()    dinner     served
wrap()     asleep     wrapped
carry()    garbage    ¬garbage, ¬clean
roll()     garbage    ¬garbage, ¬asleep
clean()    ¬clean     clean
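For the sketches that follow, one possible encoding of this domain as
plain Python data (this encoding is an assumption for illustration;
literals are strings, with "-p" denoting ¬p):

# Each action maps to a (preconditions, effects) pair of literal sets.
ACTIONS = {
    "cook":  ({"clean"},   {"dinner"}),
    "serve": ({"dinner"},  {"served"}),
    "wrap":  ({"asleep"},  {"wrapped"}),
    "carry": ({"garbage"}, {"-garbage", "-clean"}),
    "roll":  ({"garbage"}, {"-garbage", "-asleep"}),
    "clean": ({"-clean"},  {"clean"}),
}
S0   = {"clean", "garbage", "asleep"}
GOAL = {"clean", "-garbage", "served", "wrapped"}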
Reachable States

 Time 0:
  ▪ s0 = {clean, garbage, asleep}
 Time 1:
  ▪ cook → {clean, garbage, asleep, dinner}
  ▪ serve → impossible
  ▪ wrap → {clean, garbage, asleep, wrapped}
  ▪ carry → {asleep}
  ▪ roll → {clean}
  ▪ clean → impossible
  ▪ cook+wrap → {garbage, clean, asleep, dinner, wrapped}
  ▪ cook+roll → {clean, dinner}
  ▪ …
 Time 2:
  ▪ cook/cook → {clean, garbage, asleep, dinner}
  ▪ cook/serve → {clean, garbage, asleep, dinner, served}
  ▪ cook/wrap → {clean, garbage, asleep, dinner, wrapped}
  ▪ cook/carry → {asleep, dinner}
  ▪ cook/roll → {clean, dinner}
  ▪ cook/clean → not possible
  ▪ wrap/cook → …

Can't calculate and store all reachable states, one at a time…
→ Let's calculate reachable literals instead!
Reachable Literals (1)

 Time 0:
  ▪ s0 = {garbage, clean, asleep}
 Time 1:
  ▪ cook → s1 = {garbage, clean, asleep, dinner}
  ▪ (serve) → not applicable
  ▪ wrap → s2 = {garbage, clean, asleep, wrapped}
  ▪ carry → s3 = {asleep}
  ▪ roll → s4 = {clean}
  ▪ (clean) → not applicable

No need to consider sets of actions:
all literals made true by any combination of "parallel" actions are already there

[Figure: state level 0 = {garbage, clean, asleep, ¬dinner, ¬served,
¬wrapped}; an action level; then state level 1, which contains every
literal except served. Depending on which actions we choose here…]

 We can't reach all combinations of literals in state level 1…
 But we can reach only such combinations!
 Cannot reach a state where served is true
Reachable Literals (2)

Planning Graph Extension:
 Start with one "state level"
  ▪ Set of reachable literals
 For each applicable action
  ▪ Add its effects to the next state level
  ▪ Add edges to preconditions and to effects
   for bookkeeping (used later!)

[Figure: state level 0 = {garbage, clean, asleep, ¬dinner, ¬served,
¬wrapped}; action level 1 = {carry, roll, cook, wrap}; state level 1 =
{¬garbage, ¬clean, ¬asleep, dinner, wrapped}, with edges to each
action's preconditions and effects.]

But wait! Some propositions are missing…
Reachable Literals (3)

Depending on the actions chosen,
facts could persist from the previous level!

To handle this consistently: maintenance (noop) actions
 One for each literal l
 Precond = effect = l

Action         Precond    Effects
noop-dinner    dinner     dinner
noop-¬dinner   ¬dinner    ¬dinner
…

[Figure: with the noops added, state level 1 now contains every literal
except served: garbage, ¬garbage, clean, ¬clean, asleep, ¬asleep,
dinner, ¬dinner, ¬served, wrapped, ¬wrapped.]
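A minimal sketch of one expansion step, using the encoding introduced
earlier: close state level 0 under the closed-world assumption, include
every action (noops included) whose preconditions are present, and
collect all effects into the next state level.

def neg(l):
    """Negate a literal: "p" <-> "-p"."""
    return l[1:] if l.startswith("-") else "-" + l

def literal_closure(state, atoms):
    """State level 0: each atom, or its negation (closed world)."""
    return {a if a in state else neg(a) for a in atoms}

def expand_level(level, actions):
    """One action level plus the following state level."""
    acts = {name: ae for name, ae in actions.items()
            if ae[0] <= level}              # possibly applicable actions
    for l in level:
        acts["noop-" + l] = ({l}, {l})      # maintenance (noop) actions
    nxt = set()
    for pre, eff in acts.values():
        nxt |= eff
    return acts, nxt

ATOMS = {"garbage", "clean", "asleep", "dinner", "served", "wrapped"}
SL0 = literal_closure(S0, ATOMS)
AL1, SL1 = expand_level(SL0, ACTIONS)
# SL1 now contains every literal except "served", as on the slide.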
Reachable Literals (4)

Now the graph is sound
 If an action might be executable, it is part of the graph
 If a literal might hold in a given state, it is part of the graph

But it is quite "weak"!
 Even at state level 1, it seems any literal except served can be achieved

We need more information
 In an efficiently useful format
 → Mutual exclusion
Mutex 1: Inconsistent Effects

Two actions in a level are mutex if their effects are inconsistent
 Can't execute them in parallel, and order of execution is not arbitrary

Examples at action level 1:
 carry / noop-garbage, roll / noop-garbage
  ▪ One causes garbage, the others cause not garbage
 roll / noop-asleep
 cook / noop-¬dinner
 wrap / noop-¬wrapped
 carry / noop-clean

No mutexes at state level 0: we assume a consistent initial state!
Mutex 2: Interference

Two actions in one level are mutex
if one destroys a precondition of the other
 Can't be executed in arbitrary order

 carry is mutex with noop-garbage
  ▪ carry deletes garbage
  ▪ noop-garbage needs garbage
 roll is mutex with wrap
  ▪ roll deletes asleep
  ▪ wrap needs asleep
 …
Mutex 3: Inconsistent Support

Two propositions are mutex
if one is the negation of the other
 Can't be true at the same time…
Mutex 4: Inconsistent Support

Two propositions are mutex if they have inconsistent support:
all actions that achieve them are pairwise mutex in the previous level

 ¬asleep can only be achieved by roll,
  wrapped can only be achieved by wrap,
  and roll/wrap are mutex
  → ¬asleep and wrapped are mutex
 ¬clean can only be achieved by carry,
  dinner can only be achieved by cook,
  and carry/cook are mutex
  → ¬clean and dinner are mutex
Mutual Exclusion: Summary

Two actions at the same action-level are mutex if:
 Inconsistent effects: an effect of one negates an effect of the other
 Interference: one deletes a precondition of the other
 Competing needs: they have mutually exclusive preconditions
Otherwise they don't interfere with each other
 → Both may appear at the same time step in a solution plan

Two literals at the same state-level are mutex if:
 Inconsistent support: one is the negation of the other,
  or all ways of achieving them are pairwise mutex

→ Recursive propagation of mutexes
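A minimal sketch of these checks, for the (preconditions, effects)
representation used earlier. literal_mutex_prev and action_mutex_pairs
are assumed to hold the mutex pairs (as frozensets) already computed for
the preceding level.

def neg(l):                                  # as in the expansion sketch
    return l[1:] if l.startswith("-") else "-" + l

def actions_mutex(a1, a2, literal_mutex_prev):
    """Inconsistent effects, interference, or competing needs."""
    (pre1, eff1), (pre2, eff2) = a1, a2
    if any(neg(e) in eff2 for e in eff1):              # inconsistent effects
        return True
    if any(neg(e) in pre2 for e in eff1) or \
       any(neg(e) in pre1 for e in eff2):              # interference
        return True
    return any(frozenset((p, q)) in literal_mutex_prev # competing needs
               for p in pre1 for q in pre2)

def literals_mutex(p, q, achievers, action_mutex_pairs):
    """Negation, or inconsistent support: all achiever pairs mutex."""
    if p == neg(q):
        return True
    # A single common achiever (a1 == a2) makes them non-mutex.
    return all(frozenset((a1, a2)) in action_mutex_pairs
               for a1 in achievers[p] for a2 in achievers[q])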

Early Solution Check

Is there a possible solution?
 Goal g = {clean, ¬garbage, served, wrapped}
 No: cannot reach a state where served is true
  in a single (multi-action) step
Expanded Planning Graph

[Figure: the planning graph expanded with action level 2 and state
level 2; state level 2 contains every literal, including served.]

All goal literals are present in level 2, and none of them are (known to be) mutex!
Solution Extraction (1)

g = {clean, ¬garbage, served, wrapped}

We seem to have 12 alternatives at the last layer (3·2·1·2):
 3 ways of achieving ¬garbage
 2 ways of achieving clean
 1 way of achieving served
 2 ways of achieving wrapped
Solution Extraction (2)

g = {clean, ¬garbage, served, wrapped}

Here: an inconsistent alternative (mutex actions chosen)…
Solution Extraction (3)

g = {clean, ¬garbage, served, wrapped}

This is one of the consistent combinations
– a successor "multi-action" in this backwards search

The choice is a backtrack point!
(May have to backtrack, try to achieve ¬garbage by roll instead)
Solution Extraction (4)

The successor has a new goal
 This goal consists of all preconditions of the chosen actions
 (Not a backtrack point)
 The goal is not mutually exclusive
  (if so, it would be unreachable)

Successor goal = {clean, garbage, dinner, wrapped}
Solution Extraction (5)

One possible solution:
 1) cook/wrap in any order,
 2) serve/roll in any order
Solution Extraction (6)

A form of backwards search,
but only among the actions in the graph
(generally much fewer, especially with mutexes)!

procedure Solution-extraction(g, i)
  // g: the set of goals we are trying to achieve
  // i: the level of the state si, starting at the highest level
  if i = 0 then return the solution
  nondeterministically choose a set of non-mutex actions
    ("real" actions and/or maintenance actions)
    to use in state si−1 to achieve g
    (must achieve the entire goal!)
  if no such set exists then fail (backtrack)
  g' := {the preconditions of the chosen actions}
  Solution-extraction(g', i−1)
end Solution-extraction
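The nondeterministic choice can be rendered as deterministic
backtracking. A minimal sketch: achiever_sets(g, i) is a hypothetical
helper that enumerates the non-mutex sets of actions (real and/or
maintenance) in action level i that jointly achieve g, and preconds
maps each action to its precondition set.

def solution_extraction(g, i, achiever_sets, preconds):
    """Return a list of action sets (one per time step), or None."""
    if i == 0:
        return []                            # reached the initial state
    for acts in achiever_sets(g, i):         # the backtrack point
        g2 = set()
        for a in acts:
            g2 |= preconds[a]                # new goal: all preconditions
        rest = solution_extraction(g2, i - 1, achiever_sets, preconds)
        if rest is not None:
            return rest + [set(acts)]
    return None                              # no set works: fail (backtrack)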
Important Properties

Possible literals:
 What is achieved is always carried forward by no-ops
 → Monotonically increase over state levels

Possible actions:
 Action included if all precondition literals exist in the preceding state level
 → Monotonically increase over action levels

[Figure: levels 0–2 of the example graph, illustrating that each state
level contains all literals of the previous one.]
Important Properties (2)

Mutex relationships:
 Mutexes between included literals monotonically decrease
  ▪ If two literals could be achieved together in the previous level,
   we could always just "do nothing", preserving them to the next level
 (Mutexes to newly added literals can be introduced)

At some point, the Planning Graph "levels off"
 After some time k all levels are identical
 (Why?)

A planning graph is exactly what we described here –
not some arbitrary planning-related graph
[Figure: top, real state reachability for a Rover problem ("avail"
facts not shown); bottom, the Planning Graph for the same problem
(mutexes not shown). At each step, an overestimate of reachable
properties / applicable actions!]
Parallel Optimality

A form of iterative deepening:
 Classical problem P
 Loop for i = 1, 2, …
  ▪ Try to find a plan with i time steps (including the initial state)
  ▪ If one is found: Plan!

Therefore, GraphPlan is optimal in the number of time steps
 Not very useful: we normally care much more about
  ▪ Total action cost
  ▪ Number of actions (special case where action cost = 1)
  ▪ Total execution time ("makespan")

[Figure: the three-step load/drive/unload plan from before, where each
time step contains three actions.]
Introduction

Heuristics as approximations of h+ (optimal delete relaxation):

 h1(s) ≤ h+(s), using cost(p ∧ q) = max(cost(p), cost(q))
  ▪ Optimistic: as if achieving the most expensive goal
   would always achieve all the others
  ▪ Gives far too little information
 h0(s) = hadd(s) ≥ h+(s), using cost(p ∧ q) = sum(cost(p), cost(q))
  ▪ Pessimistic: as if achieving one goal
   could never help in achieving any other
  ▪ Informative, but always exceeds h+
   and can exceed even h* by a large margin!

How can we take some interactions into account?
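Both cost propagations can be sketched as one fixpoint computation with
a pluggable combination function. A minimal sketch, assuming unit
action costs and positive (delete-relaxed) preconditions and effects:

def goal_cost(state, actions, goal, combine):
    """Fixpoint of atom costs; combine over precondition/goal sets."""
    cost = {p: 0 for p in state}
    changed = True
    while changed:
        changed = False
        for pre, eff in actions.values():
            if all(p in cost for p in pre):
                c = 1 + combine([cost[p] for p in pre])
                for q in eff:
                    if c < cost.get(q, float("inf")):
                        cost[q] = c
                        changed = True
    if any(g not in cost for g in goal):
        return float("inf")                    # goal unreachable
    return combine([cost[g] for g in goal])

def h1(s, actions, goal):                      # max: optimistic
    return goal_cost(s, actions, goal, lambda xs: max(xs, default=0))

def h_add(s, actions, goal):                   # sum: pessimistic
    return goal_cost(s, actions, goal, sum)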
Relaxed Planning Graphs

The planning graph takes many interactions into account
 One possible heuristic h(s):
  ▪ Use GraphPlan to find a solution starting in s
  ▪ Return the number of actions in the solution
 Too slow (requires plan generation), but:

Let's apply delete relaxation, then construct a planning graph!
(Called a relaxed planning graph – pioneered by FastForward, FF)

Recall: delete relaxation assumes we have positive preconds and goals
Building Relaxed Planning Graphs

Building a relaxed planning graph:
 Construct state level 0 (SL0)
  ▪ Atoms in the initial state
 Construct action level 1 (AL1)
  ▪ Actions whose preconditions are included in SL0
 Two actions in AL1 would be mutex if:
  ▪ Their effects are inconsistent
  ▪ One destroys a precondition of the other
  → Can't happen! Only positive effects!
 Construct state level 1 (SL1)
  ▪ All effects of actions in AL1
 Two propositions in SL1 would be mutex if:
  ▪ One is the negation of another
  ▪ All actions that achieve them are mutex
  → Can't happen! Only positive propositions, no mutex actions!
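A minimal sketch of this construction, for delete-relaxed actions given
as (precondition, effect) sets of positive atoms:

def build_rpg(s0, actions, goal, max_levels=100):
    """Return the state levels SL0, SL1, ... of a relaxed planning graph,
    stopping when the goal is present (or the graph levels off)."""
    levels = [set(s0)]
    while not goal <= levels[-1]:
        cur = levels[-1]
        nxt = set(cur)                    # noops: atoms always persist
        for pre, eff in actions.values():
            if pre <= cur:                # possibly applicable action
                nxt |= eff
        if nxt == cur or len(levels) > max_levels:
            return None                   # leveled off: goal unreachable
        levels.append(nxt)
    return levels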
No Backtracking

Recall the backwards search procedure (the situation with delete effects):
 ▪ Goal specifies which propositions to achieve in SL2
 ▪ Choose one of many possible sets of achieving actions in AL2
 ▪ If they are mutex, backtrack
 ▪ Determine which propositions must be achieved in SL1 (preconds of actions)
 ▪ Choose actions in AL1
 ▪ If they are mutex, backtrack
 ▪ …

No delete effects
→ No mutexes
→ No backtracking
Properties of Relaxed Planning Graphs

The relaxed planning graph considers positive interactions
 For example, when one action achieves multiple goals
 Ignores negative interactions

 No delete effects → no mutexes to calculate
  (no inconsistent effects, no interference, …)
 No mutexes exist → can select more actions per level,
  fewer levels required
 No mutexes exist → no backtracking needed in solution extraction
 Can extract a GraphPlan-optimal relaxed plan in polynomial time

Heuristic: hFF(s) = number of actions in a relaxed plan from state s
Complexity

How can this be efficient?
Sounds as if we calculate h+, which is NP-complete!

 The plan that is extracted is only GraphPlan-optimal!
  ▪ Optimal number of time steps
  ▪ Possibly sub-optimal number of actions (or suboptimal action costs)
  ▪ → hFF is not admissible:
   can be greater than h+ (but not smaller!)
   and can be greater than h* (or smaller)
 Still, the delete-relaxed plan can take positive interactions into account
  ▪ → Often closer to true costs than hadd is
 Plan extraction can use several heuristics (!)
  ▪ Trying to reduce the sequential length of the relaxed plan,
   to get even closer to true costs
Plan Extraction Heuristics (1)

Recall that plan extraction uses backward search
 For each goal fact at a given level, we must find an achiever

[Figure:
 1. Here, some chosen action has at(β) as a precondition
 2. Here, at(β) must be achieved
 3. Here, at least one of drive(α,β), noop-at-β, drive(γ,β)
  must be the achiever]
Plan Extraction Heuristics (1): Noop

1. If we need to achieve this fact…
2. And there's a NOOP action available in the preceding level…
 Then use the noop action as the achiever
  ▪ Achieve every fact as early as possible!

In a delete-relaxed problem, this strategy cannot lead to backtracking
 If there is a noop action, then the fact can be achieved earlier
 There are no delete effects → no action can conflict with achieving it earlier
Plan Extraction Heuristics (2)

Difficulty heuristic:
 If there is no maintenance action available for f in level i−1,
  choose an action whose preconditions seem "easy" to satisfy
 Then:
  ▪ Let p be a precondition fact
  ▪ difficulty(p) = index of the first state level where p appears
 And:
  ▪ Let a be an action
  ▪ difficulty(a) = sum { index of the first state level where p appears |
    p is a precondition of a }
 Select an action with minimal difficulty
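A minimal sketch of this measure; first_level is assumed to map each
fact to the index of the first state level where it appears:

def difficulty(action_name, actions, first_level):
    """Sum of first-appearance levels over the action's preconditions."""
    pre, _eff = actions[action_name]
    return sum(first_level[p] for p in pre)

def easiest_achiever(candidates, actions, first_level):
    """Among candidate achievers, pick the least difficult one."""
    return min(candidates, key=lambda a: difficulty(a, actions, first_level))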
Simplified Plan Extraction

function ExtractPlan(plan graph P0 A0 P1 … An−1 Pn, goal G)
  for i = n … 1 do
    Gi ← goals reached at level i
      // Partition the goals of the problem instance
      // depending on the level where they are first reached
  end for
  for i = n … 1 do
    for all g ∈ Gi not marked ACHIEVED at time i do
      // All goals that could not be reached before level i
      // must be reached at level i
      find a min-difficulty a ∈ Ai−1 s.t. g ∈ add(a)
      RPi−1 ← RPi−1 ∪ {a}        // Add to the relaxed plan
      for all f ∈ prec(a) do
        // Must achieve prec(a) at some time! Heuristic: do this at
        // the first level we can – layerof(f)
        Glayerof(f) ← Glayerof(f) ∪ {f}
      end for
      for all f ∈ add(a) do
        // One action can achieve more than one goal –
        // mark them all as "done"
        mark f as ACHIEVED at times i−1 and i
      end for
    end for
  end for
  return RP                      // The relaxed plan
end function
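The same procedure as a runnable sketch, using the state levels from
build_rpg above and the difficulty measure; noops are implicit here
(each goal is simply scheduled at the first level where it appears):

def extract_relaxed_plan(levels, actions, goal):
    """Return one set of actions per step; hFF = total action count."""
    n = len(levels) - 1
    first = {}                                 # fact -> first state level
    for i, lev in enumerate(levels):
        for f in lev:
            first.setdefault(f, i)
    goals = [set() for _ in range(n + 1)]
    for g in goal:
        goals[first[g]].add(g)                 # partition goals by level
    plan = [set() for _ in range(n)]
    for i in range(n, 0, -1):
        achieved = set()
        for g in sorted(goals[i]):
            if g in achieved:
                continue                       # already handled at level i
            cands = [a for a, (pre, eff) in actions.items()
                     if g in eff and pre <= levels[i - 1]]
            a = min(cands, key=lambda c:       # difficulty heuristic
                    sum(first[p] for p in actions[c][0]))
            plan[i - 1].add(a)
            achieved |= actions[a][1]          # one action, several goals
            for p in actions[a][0]:
                goals[first[p]].add(p)         # schedule preconds early
    return plan

# hFF(s) would then be sum(len(step) for step in extract_relaxed_plan(...)).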
HSP

Recall hill climbing in HSP!
 Works approximately like this (some intricacies omitted):

  impasses = 0;
  unexpanded = { };
  current = initialNode;
  while (not yet reached the goal) {
    children ← expand(current);           // apply all applicable actions
    if (children = ∅) {
      current = pop(unexpanded);
    } else {
      bestChild ← best(children);         // choose a child with minimal h
      add other children to unexpanded in order of h(n);
      if (h(bestChild) ≥ h(current)) {    // allow a few steps without
        impasses++;                       // improvement of heuristic value
        if (impasses == threshold) {      // too many such steps →
          current = pop(unexpanded);      // restart at some other point
          impasses = 0;
          continue;
        }
      }
      current = bestChild;
    }
  }
Enforced Hill Climbing

FF uses enforced hill climbing – approximately:
 s ← init-state
 repeat
  expand breadth-first from s until a better state s' is found;
  s ← s'
 until a goal state is found

Compared to HSP's hill climbing:
 Wait longer to decide which branch to take
 Don't restart – keep going

[Figure: two breadth-first steps; states beyond the first improving
state are not expanded.]
Properties of EHC

Is Enforced Hill-Climbing efficient / effective?
 In many cases (when paired with FF's Relaxed Planning Graph heuristic)
 But plateaus / local minima must be escaped using breadth-first search
  ▪ Large problems → many actions to explore
  ▪ Analysis: Hoffmann (2005),
   "Where 'ignoring delete lists' works: Local search topology in planning benchmarks"

Is Enforced Hill-Climbing complete?
 No!
 You could still choose a state from which the goal is unreachable

If you reach a dead end:
 HSP used random restarts
 FF uses best-first search from the initial state,
  using only the inadmissible FF heuristic for guidance
  (no consideration of "cost so far")
Helpful Actions

Pruning technique in FF: helpful actions in state s
 Recall: FF's heuristic function for state s
  ▪ Construct a relaxed planning graph starting in s
  ▪ Extract a relaxed plan
   (choose actions among those potentially executable in each level)
 Helpful actions:
  ▪ The actions selected in action level 1
  ▪ Plus all other actions that could achieve the same subgoals
   (but did not happen to be chosen)
  ▪ More likely to be useful to the plan than other executable actions
 FF constrains EHC to only use helpful actions!
  ▪ Sometimes also called preferred operators
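A minimal sketch of this test, assuming relaxed-plan extraction has
already produced goals1, the set of subgoals scheduled at state level 1
(goals[1] in the extraction sketch above):

def helpful_actions(state, actions, goals1):
    """Applicable actions that add at least one level-1 subgoal."""
    return {a for a, (pre, eff) in actions.items()
            if pre <= state and eff & goals1}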
FF: Helpful Actions (2)

Delete-relaxed problem:
 s0 = {clean, garbage, asleep}
 g = {clean, served, wrapped}

Action     Preconds   Effects
cook()     clean      dinner
serve()    dinner     served
wrap()     asleep     wrapped
carry()    garbage    –
roll()     garbage    –
clean()    clean      clean

[Figure: suppose this is the relaxed plan generated by FF:
state level 0 {garbage, clean, asleep} → action level 1 with cook and
wrap selected (carry and roll also present) → state level 1
{garbage, clean, asleep, dinner, wrapped} → …more levels]

Possible in level 1: carry, roll, clean, cook, wrap
Used in relaxed plan: cook, wrap
Achieved at level 1: clean, dinner, wrapped
Helpful actions: cook, wrap
FF: EHC with Helpful Actions

EHC with helpful actions:
 Non-helpful actions crossed over, never expanded

[Figure: a search tree labeled with heuristic values (12, 10, 11, 5,
16, 6, 10, 8, 7, …); branches reached via non-helpful actions are
crossed out and never expanded.]
FF: EHC with Helpful Actions (2)

EHC with helpful actions:

 EHC(initial state I, goal G)
  plan ← EMPTY
  s ← I
  while hFF(s) != 0 do
    execute breadth-first search from s,
     using only helpful actions,
     to find the first s' such that hFF(s') < hFF(s)
    if no such state is found then fail
    plan ← plan + actions on the path to s'
    s ← s'
  end while
  return plan

Incomplete if there are dead ends!
If EHC fails, fall back on best-first search using f(s) = hFF(s)
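A runnable rendering of this loop. A minimal sketch, assuming
hypothetical helpers: h(s) (e.g. hFF) and successors(s), which yields
(action, next_state) pairs already restricted to helpful actions;
states must be hashable.

from collections import deque

def ehc(s0, h, successors):
    plan, s = [], s0
    while h(s) != 0:
        frontier = deque([(s, [])])
        seen = {s}
        found = None
        while frontier and found is None:      # breadth-first search
            cur, path = frontier.popleft()
            for a, nxt in successors(cur):
                if nxt in seen:
                    continue
                seen.add(nxt)
                if h(nxt) < h(s):              # first strictly better state
                    found = (nxt, path + [a])
                    break
                frontier.append((nxt, path + [a]))
        if found is None:
            return None    # fail: fall back on best-first search with hFF
        s, steps = found
        plan += steps
    return plan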