Automated Planning Forward-Chaining Search

Automated Planning
Forward-Chaining Search
Jonas Kvarnström
Automated Planning Group
Department of Computer and Information Science
Linköping University
jonas.kvarnstrom@liu.se – 2015

Our next example domain: The Blocks World
 A simple example domain,
allowing us to focus on algorithms and concepts, not domain details
[Figure: you, an initial state with blocks A, B and C, and your greatest desire – the same blocks rearranged into a tower]
jonkv@ida
Blocks World (1)

We will generate classical sequential plans
 One object type: Blocks
 A common blocks world version, with 4 operators:
▪ (pickup ?x)      – takes ?x from the table
▪ (putdown ?x)     – puts ?x on the table
▪ (unstack ?x ?y)  – takes ?x from on top of ?y
▪ (stack ?x ?y)    – puts ?x on top of ?y
 Predicates used:
▪ (on ?x ?y)    – block ?x is on block ?y
▪ (ontable ?x)  – ?x is on the table
▪ (clear ?x)    – we can place a block on top of ?x:
                  (not (exists (?y) (on ?y ?x)))
▪ (holding ?x)  – the robot is holding block ?x
▪ (handempty)   – the robot is not holding any block:
                  (not (exists (?x) (holding ?x)))
[Figure: blocks A, B, C, D and an example action sequence – unstack(A,C), putdown(A), pickup(B), stack(B,C)]
Blocks World (2)
(:action pickup
 :parameters (?x)
 :precondition (and (clear ?x) (ontable ?x)
                    (handempty))
 :effect (and (not (ontable ?x))
              (not (clear ?x))
              (not (handempty))
              (holding ?x)))

(:action putdown
 :parameters (?x)
 :precondition (holding ?x)
 :effect (and (ontable ?x)
              (clear ?x)
              (handempty)
              (not (holding ?x))))

(:action unstack
 :parameters (?top ?below)
 :precondition (and (on ?top ?below)
                    (clear ?top) (handempty))
 :effect (and (holding ?top)
              (clear ?below)
              (not (clear ?top))
              (not (handempty))
              (not (on ?top ?below))))

(:action stack
 :parameters (?top ?below)
 :precondition (and (holding ?top)
                    (clear ?below))
 :effect (and (not (holding ?top))
              (not (clear ?below))
              (clear ?top)
              (handempty)
              (on ?top ?below)))
Blocks World (3): Operator Reference
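To make the semantics of these effect lists concrete, here is a minimal Python sketch (the frozenset representation and function names are my own, not from the slides): a state is a set of ground atoms, and applying pickup deletes and adds atoms exactly as listed in its :effect.

```python
# A state is a frozenset of ground atoms, each written as a tuple.
# Illustrative sketch of how (pickup ?x)'s precondition and effects act.

def applicable_pickup(state, x):
    """Precondition of (pickup ?x): clear, on the table, hand empty."""
    return {("clear", x), ("ontable", x), ("handempty",)} <= state

def apply_pickup(state, x):
    """Effects of (pickup ?x): remove the deleted atoms, add the new ones."""
    assert applicable_pickup(state, x)
    deletes = {("ontable", x), ("clear", x), ("handempty",)}
    adds = {("holding", x)}
    return frozenset((state - deletes) | adds)

s0 = frozenset({("ontable", "A"), ("clear", "A"), ("handempty",)})
s1 = apply_pickup(s0, "A")
print(sorted(s1))  # [('holding', 'A')]
```

The same delete-then-add pattern implements all four operators.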
We assume we know the initial state –
let's see which states are reachable from there!
Here: Start with s0 = all blocks on the table
[Figure: for a single block A, pickup(A) leads from
s0 = { handempty, ontable(A), clear(A) } to { holding(A) }, and back via putdown(A).
Many other states "exist" – such as { handempty, clear(A), ontable(A), on(A,A) }
or { holding(A), clear(A), ontable(A) } – but are not reachable
from the current starting state.]
Reachable State Space: BW size 1
5 states, 8 transitions
[Figure: the reachable state space for two blocks –
A on Table, B on Table;
Holding A, B on Table;
A on Table, Holding B;
B on A on Table;
A on B on Table]
Reachable State Space: BW size 2
22 states, 42 transitions
[Figure: the reachable state space for three blocks,
starting from A on Table, B on Table, C on Table]
Reachable State Space: BW size 3
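The counts on these slides can be reproduced with a small reachability sweep. This is an illustrative sketch (my own encoding, states as frozensets of ground atoms): it regenerates the 5 states / 8 transitions for two blocks and 22 / 42 for three.

```python
def successors(state, blocks):
    """All (action, next state) pairs given by the four operators."""
    res = []
    for x in blocks:
        if {("clear", x), ("ontable", x), ("handempty",)} <= state:
            res.append((("pickup", x),
                        state - {("ontable", x), ("clear", x), ("handempty",)}
                              | {("holding", x)}))
        if ("holding", x) in state:
            res.append((("putdown", x),
                        state - {("holding", x)}
                              | {("ontable", x), ("clear", x), ("handempty",)}))
        for y in blocks:
            if y == x:
                continue
            if {("on", x, y), ("clear", x), ("handempty",)} <= state:
                res.append((("unstack", x, y),
                            state - {("on", x, y), ("clear", x), ("handempty",)}
                                  | {("holding", x), ("clear", y)}))
            if {("holding", x), ("clear", y)} <= state:
                res.append((("stack", x, y),
                            state - {("holding", x), ("clear", y)}
                                  | {("on", x, y), ("clear", x), ("handempty",)}))
    return res

def reachable(blocks):
    """Count states and transitions reachable from 'all on table'."""
    s0 = frozenset({("handempty",)}
                   | {("ontable", b) for b in blocks}
                   | {("clear", b) for b in blocks})
    seen, frontier, transitions = {s0}, [s0], 0
    while frontier:
        s = frontier.pop()
        for _, s2 in successors(s, blocks):
            transitions += 1
            if s2 not in seen:
                seen.add(s2)
                frontier.append(s2)
    return len(seen), transitions

print(reachable(["A", "B"]))       # (5, 8)
print(reachable(["A", "B", "C"]))  # (22, 42)
```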
[Figure: the reachable state space, with the initial (current) state
and the goal states marked]
We simply need a path in a graph –
not even a shortest path!
The Planning Problem
Paths are found through graph search
Search is the basis
for many (most?) forms of planning!
Many search methods already exist –
can we simply apply them?
 We'll begin with the most natural idea:
 Start in the initial state
 Apply actions step by step
 See where you end up

 Many names, one concept:
 Forward search
 Forward-chaining search
 Forward state space search
 Progression
 …
Forward State Space Search (1)
Many states – let's generate the reachable ones as we go!
Forward planning, forward-chaining, progression: Begin in the initial state.
The initial search node 0 corresponds directly to the initial state;
edges correspond to actions, and child nodes 1, 2, … correspond to result states.
The successor function / branching rule:
To expand a state s,
generate all states that result from
applying an action that is applicable in s.
Now we have multiple unexpanded nodes!
A search strategy chooses which one to expand next.
Forward State Space Search (2)
 Blocks world example:
 Generate the initial state = initial node
from the initial state description in the problem
[Figure: initial state with A on C, and B and D on the table]
Forward State Space Search (3)
 Incremental expansion: Choose a node
▪ First time, the initial state – later, depends on the search strategy used
 Expand all possible successors
▪ “What actions are applicable in the current state, and where will they take me?”
▪ Generates new states by applying effects
 Repeat until a goal node is found!
[Figure: the state with A on C and B, D on the table, and its successor states]
 Notice that the BW lacks dead ends.
 In fact, it is even symmetric.
 This is not true for all domains!
Forward State Space Search (4)
General Forward State Space Search Algorithm

 forward-search(operators, s0, g) {
     open ← { ⟨s0, ε⟩ }
     while (open ≠ ∅) {
         use a strategy to select and remove ⟨s, path⟩ from open
         if goal g satisfied in state s then return path
         // Expand the node:
         foreach a ∈ { ground instances of operators applicable in state s } {
             s’ ← apply(a, s)            // dynamically generate a new state
             path’ ← append(path, a)
             add ⟨s’, path’⟩ to open
         }
     }
     return failure
 }

What strategies are available and useful?
The algorithm is always sound;
completeness depends on the strategy.
To simplify extracting a plan,
a state space search node can include
the plan used to reach that state –
still generally called state space search…
jonkv@ida
Forward State Space Search (5)
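A minimal executable version of the template above, with FIFO selection from open (i.e. breadth-first search) as the strategy and a tiny hard-coded two-block successor function. The duplicate-detection set is my addition, not part of the template.

```python
from collections import deque

def forward_search(s0, goal, succ):
    """Forward-chaining search; the 'strategy' here is FIFO selection
    from open, i.e. breadth-first search."""
    open_list = deque([(s0, ())])          # <state, path> pairs
    seen = {s0}                            # duplicate detection (an addition)
    while open_list:
        s, path = open_list.popleft()      # strategy: oldest node first
        if goal <= s:
            return list(path)
        for a, s2 in succ(s):              # expand the node
            if s2 not in seen:
                seen.add(s2)
                open_list.append((s2, path + (a,)))
    return None                            # failure

# Tiny blocks-world successor function for exactly two blocks, A and B.
def succ(state):
    out = []
    for x in ("A", "B"):
        other = "B" if x == "A" else "A"
        if {("clear", x), ("ontable", x), ("handempty",)} <= state:
            out.append((("pickup", x),
                        state - {("ontable", x), ("clear", x), ("handempty",)}
                              | {("holding", x)}))
        if ("holding", x) in state:
            out.append((("putdown", x),
                        state - {("holding", x)}
                              | {("ontable", x), ("clear", x), ("handempty",)}))
            if ("clear", other) in state:
                out.append((("stack", x, other),
                            state - {("holding", x), ("clear", other)}
                                  | {("on", x, other), ("clear", x), ("handempty",)}))
        if {("on", x, other), ("clear", x), ("handempty",)} <= state:
            out.append((("unstack", x, other),
                        state - {("on", x, other), ("clear", x), ("handempty",)}
                              | {("holding", x), ("clear", other)}))
    return out

s0 = frozenset({("ontable", "A"), ("ontable", "B"),
                ("clear", "A"), ("clear", "B"), ("handempty",)})
plan = forward_search(s0, frozenset({("on", "A", "B")}), succ)
print(plan)  # [('pickup', 'A'), ('stack', 'A', 'B')]
```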
Forward State Space Search: Dijkstra
First search strategy: Dijkstra’s algorithm
 Matches the given forward search ”template”
▪ Selects from open a node n with minimal g(n):
  the cost of reaching n from the starting point
 Efficient graph search algorithm: O(|E| + |V| log |V|)
▪ |E| = the number of edges, |V| = the number of nodes
 Optimal: Returns minimum-cost plans

 Simple problem, for illustration:
▪ Navigation in a grid
▪ Each state specifies only the coordinates of the robot:
  two state variables
▪ Actions: Move left, move right, … (cost = 1)
▪ Single goal node
[Figure: a grid with a start cell, a goal cell, and an obstacle]

Dijkstra’s Algorithm:
Expands in a "circle" – but not a geometric circle!
The distance measure is path cost from the initial state.
Animation from Wikimedia Commons
Dijkstra’s Algorithm (2)
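A sketch of Dijkstra's algorithm on such a grid (the 5×5 layout and the obstacle wall are made up for illustration): uniform-cost search with a priority queue keyed on g(n).

```python
import heapq

def dijkstra(start, goal, passable):
    """Uniform-cost (Dijkstra) search on a 4-connected grid, cost 1 per move:
    always expand the open node with minimal g(n) = cost from the start."""
    open_heap = [(0, start, [])]
    best = {start: 0}                      # cheapest known cost per cell
    while open_heap:
        g, (x, y), path = heapq.heappop(open_heap)
        if (x, y) == goal:
            return g, path + [(x, y)]
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if passable(nxt) and g + 1 < best.get(nxt, float("inf")):
                best[nxt] = g + 1
                heapq.heappush(open_heap, (g + 1, nxt, path + [(x, y)]))
    return None

# A 5x5 grid with a short wall in column 2 (made-up layout).
wall = {(2, 0), (2, 1)}
def passable(p):
    x, y = p
    return 0 <= x < 5 and 0 <= y < 5 and p not in wall

cost, path = dijkstra((0, 0), (4, 0), passable)
print(cost)  # 8 -- the path must detour around the wall
```

Without the wall the cost would be 4; the detour makes every cell of cost < 8 get expanded first, illustrating the "circle" of path costs.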
All done?
Dijkstra explores all states that can be reached more cheaply
than the cheapest goal node.
Usually we have many more ”dimensions” and
many more nodes within a given distance
(this was just a trivial 2-dimensional 8-connected example)!
[Figure: expansion contours at costs 1, 2, 3, 4, 6, 7, 8 around the start,
with the goal nodes marked]
Dijkstra’s Algorithm (3)
125 states, 272 transitions
Reachable State Space: BW size 4
866 states, 2090 transitions
Reachable State Space: BW size 5
Blocks   States                                       States reachable         Transitions (edges)
                                                      from "all on table"      in reachable part
0        2                                            1                        0
1        32                                           2                        2
2        2048                                         5                        8
3        524288                                       22                       42
4        536870912                                    125                      272
5        2199023255552                                866                      2090
6        36028797018963968                            7057                     18552
7        2361183241434822606848                       65990                    186578
8        618970019642690137449562112                  695417                   2094752
9        649037107316853453566312041152512            …                        …
10       2722258935367507707706996859454145691648     …                        …

Even the number of reachable states grows too quickly!
Reachable State Space: BW sizes 0–8
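The "States" column follows directly from counting ground atoms: n² on-atoms, plus n each of ontable, clear and holding, plus handempty, giving 2^(n² + 3n + 1) truth assignments. A quick check against the table:

```python
# Total number of states for n blocks = number of truth assignments over
# the ground atoms: on (n^2), ontable (n), clear (n), holding (n),
# handempty (1), i.e. 2**(n*n + 3*n + 1).
def total_states(n):
    return 2 ** (n * n + 3 * n + 1)

for n in range(6):
    print(n, total_states(n))
# 0 2
# 1 32
# 2 2048
# 3 524288
# 4 536870912
# 5 2199023255552
```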
400 blocks
 Blocks world, 400 blocks initially on the table, goal is a 400-block tower
▪ Given uniform action costs (the same cost for all actions),
  Dijkstra will always consider all plans that stack fewer than 400 blocks!
▪ Stacking 1 block: 400·399 plans, …
▪ Stacking 2 blocks: > 400·399 · 399·398 plans, …
▪ In total: more than 1.63 · 10^1735 plans

Dijkstra is efficient in terms of the search space size: O(|E| + |V| log |V|) –
but the search space is exponential in the size of the input description…
 But computers are getting very fast!
 Suppose we can check 10^20 states per second
▪ That is more than 10 billion states per clock cycle for today’s computers,
  each state involving complex operations
 Then it will only take 10^1735 / 10^20 = 10^1715 seconds…

 But we have multiple cores!
 The universe has at most 10^87 particles, including electrons, …
 Let’s suppose every one is a CPU core
 ⇒ only 10^1628 seconds > 10^1620 years
 The universe is around 10^10 years old
Fast Computers, Many Cores
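The back-of-the-envelope arithmetic works purely with exponents of ten, since dividing powers of ten just subtracts exponents (a sketch of the slide's estimate):

```python
# All values below are exponents of 10.
plans = 1735              # ~10^1735 candidate plans to consider
per_second = 20           # 10^20 states checked per second (very generous)
particles = 87            # at most 10^87 particles in the universe

single_core_seconds = plans - per_second            # 10^1715 seconds
all_particle_seconds = single_core_seconds - particles  # 10^1628 seconds
# A year is about 3.2 * 10^7 seconds, so this still exceeds 10^1620 years.
print(single_core_seconds, all_particle_seconds)  # 1715 1628
```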
 Dijkstra’s algorithm is completely impractical here
 Visits all nodes with cost < cost(optimal solution)

 Breadth-first search would not work
 Visits all nodes with length < length(optimal solution)

 Iterative deepening would not work
 Saves space, but still takes too much time

 Depth-first search would normally not work
 Could work in some domains and some problems, by pure luck…
 Usually it either doesn’t find the goal, or finds very inefficient plans
 [movies/4_no-rules]
Impractical Algorithms
Depth-first search:
 Always prefers adding a new action to the current action sequence
 Always adds the first action it can find
[Figure: a depth-first search trajectory wandering through the state space
past the goal nodes]
Depth First Search Example
 Is there still hope for planning?
 Of course there is!
 Our trivial planning method uses blind search – it tries everything!
 We wouldn’t choose such silly actions – so why should the computer?

 Planning is part of Artificial Intelligence!
 We should develop methods to judge which actions are promising,
given our goals
Hopeless?
Next 3–4 lectures: Using heuristic functions
We will focus on using heuristic functions
to prioritize the search order.
Usually without pruning: Even low-priority nodes are kept,
and may have to be visited later.
Resilient: Make a "bad decision" locally
 you can come back later.
Memory usage: You still have to keep all nodes,
in case you need to go back later.
Heuristics (1)
General Heuristic Forward Search Algorithm

 heuristic-forward-search(ops, s0, g) {
     open ← { ⟨s0, ε⟩ }
     while (open ≠ ∅) {
         use a heuristic search strategy to select and remove a node n = ⟨s, path⟩ from open
         if goal-satisfied(g, s) then return path
         // Expanding node n = creating its successors:
         foreach a ∈ groundapp(ops, s) {
             s’ ← apply(a, s)
             path’ ← append(path, a)
             add ⟨s’, path’⟩ to open
         }
     }
     return failure
 }

Strategies: A*, simulated annealing, hill-climbing, …

 The strategy selects nodes from the open set depending on:
▪ The heuristic value h(n)
▪ Possibly other factors, such as g(n) = the cost of reaching n

 What makes a good heuristic depends on:
▪ The algorithm (examples later)
▪ The purpose (good solutions / finding solutions quickly)
Heuristics (2)
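A runnable sketch of this template where the strategy always selects the open node with the smallest h(n), i.e. greedy best-first search. The heuristic here – counting unsatisfied goal atoms – is my own simple choice for illustration, not one of the heuristics developed later in the course.

```python
import heapq
from itertools import count

def successors(state, blocks):
    """Successor states via the four blocks-world operators."""
    res = []
    for x in blocks:
        if {("clear", x), ("ontable", x), ("handempty",)} <= state:
            res.append((("pickup", x),
                        state - {("ontable", x), ("clear", x), ("handempty",)}
                              | {("holding", x)}))
        if ("holding", x) in state:
            res.append((("putdown", x),
                        state - {("holding", x)}
                              | {("ontable", x), ("clear", x), ("handempty",)}))
        for y in blocks:
            if y != x and {("on", x, y), ("clear", x), ("handempty",)} <= state:
                res.append((("unstack", x, y),
                            state - {("on", x, y), ("clear", x), ("handempty",)}
                                  | {("holding", x), ("clear", y)}))
            if y != x and {("holding", x), ("clear", y)} <= state:
                res.append((("stack", x, y),
                            state - {("holding", x), ("clear", y)}
                                  | {("on", x, y), ("clear", x), ("handempty",)}))
    return res

def goal_count(state, goal):
    """A simple heuristic (my choice): number of goal atoms not yet true."""
    return len(goal - state)

def greedy_search(s0, goal, blocks):
    """Heuristic forward search: always expand the node with lowest h(n)."""
    tie = count()  # tie-breaker so heapq never compares states directly
    open_heap = [(goal_count(s0, goal), next(tie), s0, [])]
    seen = {s0}
    while open_heap:
        _, _, s, path = heapq.heappop(open_heap)
        if goal <= s:
            return path
        for a, s2 in successors(s, blocks):
            if s2 not in seen:
                seen.add(s2)
                heapq.heappush(open_heap,
                               (goal_count(s2, goal), next(tie), s2, path + [a]))
    return None

blocks = ("A", "B", "C")
s0 = frozenset({("handempty",)}
               | {("ontable", b) for b in blocks}
               | {("clear", b) for b in blocks})
goal = frozenset({("on", "A", "B"), ("on", "B", "C")})
print(greedy_search(s0, goal, blocks))
```

With duplicate detection over a finite state space, this is still complete; the heuristic only changes the order in which nodes are visited.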
Example: 3 blocks, all on the table in s0
We now have 1 open node, s0, which is unexpanded.
Heuristics (3)
We visit s0 and expand it.
We now have 3 open nodes, which are unexpanded.
A heuristic function estimates the distance from each open node to the goal:
we calculate h(s1), h(s2), h(s3).
A search strategy uses these values (and other information) to prioritize between them.
Heuristics (4)
If we choose to visit s1:
We now have 4 open nodes, which are unexpanded.
2 new heuristic values are calculated: h(s16), h(s17).
The search strategy now has 4 nodes to prioritize.
Heuristics (5)
 We tend to use toy examples in very simple domains
 To learn fundamental principles
 To create readable, comprehensible examples

 Always remember:
 Real-world problems are larger and more complex!
Note on Examples
As indicated: Two aspects of using heuristic guidance

Defining a search strategy able to take guidance into account.
Examples: A* uses a heuristic (function);
hill-climbing uses a heuristic… differently!

Generating the actual guidance as input to the search strategy.
Example: Finding a suitable heuristic function for A* or hill-climbing.
This can be domain-specific, given as input in the planning problem,
or domain-independent, generated automatically by the planner
given the problem domain.

We will consider both – heuristics more than strategies.
Two Aspects of Heuristic Guidance
Two distinct objectives for heuristic guidance

Find a solution quickly:
Prioritize nodes where you think you can easily find a way
to a goal node in the search space.
Preferred: Accumulated plan cost 50, estimated distance to goal 10.

Find a good solution:
Prioritize nodes where you think you can find a way
to a good (cheap) solution, even if finding it will be difficult.
Preferred: Accumulated plan cost 5, estimated distance to goal 30.

Often one strategy + heuristic can achieve both reasonably well,
but for optimum performance, the distinction can be important!
Two Uses for Heuristic Guidance
 What properties do good heuristic functions have?
 Informative: Provide good guidance to the search strategy
[Figure: a planning problem and a heuristic function feed into a heuristic
search algorithm, which yields performance and plan quality;
test on a variety of benchmark examples]
Some Desired Properties (1)
 What properties do good heuristic functions have?
 Efficiently computable!
▪ Spend as little time as possible deciding which nodes to expand
 Balanced…
▪ Many planners spend almost all their time calculating heuristics
▪ But: Don’t spend more time computing h than you gain by expanding fewer nodes!
▪ Illustrative (made-up) example:

Heuristic quality   Nodes expanded   Expanding one node   Calculating h for one node   Total time
Worst               100000           100 μs               1 μs                         10100 ms
Better              20000            100 μs               10 μs                        2200 ms
…                   5000             100 μs               100 μs                       1000 ms
…                   2000             100 μs               1000 μs                      2200 ms
…                   500              100 μs               10000 μs                     5050 ms
Best                200              100 μs               100000 μs                    20020 ms
Some Desired Properties (2)
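Each total in the table is just nodes expanded × (expansion time + heuristic-computation time); a quick recomputation:

```python
# Recomputing the (made-up) totals from the table:
# total time = nodes expanded * (expand time + h time per node).
expand_us = 100                       # expanding one node: 100 microseconds
rows = [                              # (label, nodes expanded, h time in us)
    ("Worst",  100000, 1),
    ("Better", 20000, 10),
    ("...",    5000, 100),
    ("...",    2000, 1000),
    ("...",    500, 10000),
    ("Best",   200, 100000),
]
for label, nodes, h_us in rows:
    total_ms = nodes * (expand_us + h_us) // 1000
    print(label, nodes, h_us, total_ms)
# Totals: 10100, 2200, 1000, 2200, 5050, 20020 ms
```

Note that the best heuristic is not the one with the lowest total time: past a point, computing h costs more than the expansions it saves.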