8-Puzzle

advertisement
CS 440 / ECE 448
Introduction to Artificial Intelligence
Spring 2010
Lecture #4
Instructor: Eyal Amir
Grad TAs: Wen Pu, Yonatan Bisk
Undergrad TAs: Sam Johnson, Nikhil Johri
When to Use Search Techniques?
The search space is small, and
There is no other available techniques, or
It is not worth the effort to develop a more
efficient technique
The search space is large, and
There is no other available techniques, and
There exist “good” heuristics
Models, |=, math proofs
Alpha |= Beta
Alpha, Beta – propositional formulas
M |= Alpha “M models Alpha” means “Alpha
evaluated to TRUE in model M”
Math. Proofs: example
A B |= A
(another one later in today’s class)
Search Algorithms
Blind search – BFS, DFS, ID, uniform cost
no notion concept of the “right direction”
can only recognize goal once it’s achieved
Heuristic search – we have rough idea of how
good various states are, and use this
knowledge to guide our search
Types of heuristic search
Best First
A* is a special case
BFS is a special case
ID A*
ID is a special case
Hill climbing
Simulated Annealing
A* Example: 8-Puzzle
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
3+3
1+5
2+3
3+4
5+2
0+4
3+2
1+3
2+3
5+0
3+4
1+5
2+4
4+1
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=4
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=4
4
6
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=4
4
5
6
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
5
4
Cutoff=4
4
5
6
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
6
5
4
5
6
6
4
Cutoff=4
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
4
6
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
4
5
6
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
4
5
7
6
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
5
4
5
7
6
6
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
5
4
5
7
6
6
5
ID A*: 8-Puzzle Example
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
5
4
5
7
6
6
5
Hill climbing example
star
t
5
3
h=
-3
2 8 3
1 6 4
7
5
h=
-4
5
2 8 3
1
4 h=
7 6 5 -3
2
3
1 8 4
7 6 5
4
goa
l
2
1 2 3
8
4 h=
7 6 5 0
1 2 3
8 4 h=
7 6 5 -1
2 3
1 8 4 h=
7 6 5 -2
4
f(n) = -(number
of tiles out of
Best-first search
Idea: use an evaluation function f(n) for
each node n
Expand unexpanded node n with min f(n)
Implementation: FRINGE is queue sorted
by decreasing order of desirability
Greedy search
A* search
Greedy Search
h(n) – a ‘heuristic’ function estimating the
distance to the goal
Greedy Best First: expand argmin_n h(n)
thus, f(v) = h(v)
Informed Search
Add domain-specific information to select
the best path along which to continue
searching
h(n) = estimated cost (or distance) of
minimal cost path from n to a goal state.
The heuristic function is an estimate,
based on domain-specific information that
is computable from the current state
description, of how close we are to a goal
Robot Navigation
Robot Navigation
f(N) = h(N), with h(N) = Manhattan distance to the goal
8
7
7
6
5
4
5
4
3
3
2
6
7
6
8
7
3
2
3
4
5
6
5
1
0
1
2
4
5
6
5
4
3
2
3
4
5
6
Robot Navigation
f(N) = h(N), with h(N) = Manhattan distance to the goal
8
7
7
6
5
4
5
4
3
3
2
1
5
4
3
6
77
6
8
7
6
3
2
3
4
5
6
5
00 1 2
4
What happened???5
2
3
4
5
6
Greedy Search
f(N) = h(N) greedy best-first
Is it complete?
If we eliminate endless loops, yes
Is it optimal?
More informed search
Our goal is not to minimize the distance from
the current head of our path to the goal, we
want to minimize the overall length of the path
to the goal!
Let g(N) be the cost of the best
path found so far between the initial
node and N
f(N) = g(N) + h(N)
Robot Navigation
f(N) = g(N)+h(N), with h(N) = Manhattan distance to goal
8+
8
3
7+
7
2
6+
6
1
7+
7
0
8+
8
1
7+ 6+ 5+
7 6 5
4 3
5 6
5+ 4+
5 4
6 7
3
6+
6
1
7+ 6+ 5+
7 6 5
2 3 4
4+ 3+ 2+ 3+1
4 3 2 30 4
7 8 9
3+
3
8
2+ 1+1 0+1
2 10 01 1 2
9
5
6
5
4
5
4+ 3+ 2+ 3+
4 3 2 3 4
5 6 7 8
5
6
Can we Prove Anything?
If the state space is finite and we avoid
repeated states, the search is complete,
but in general is not optimal
Proof: ?
If the state space is finite and we do not
avoid repeated states, the search is in
general not complete
Admissible heuristic
Let h*(N) be the true cost of the optimal
path from N to a goal node
Heuristic h(N) is admissible if:
0
h(N)
h*(N)
An admissible heuristic is always
optimistic
A* Search
Evaluation function:
f(N) = g(N) + h(N)
where:
g(N) is the cost of the best path found so far to N
h(N) is an admissible heuristic
Then, best-first search with this evaluation function
is called A* search
Important AI algorithm developed by Fikes and Nilsson
in early 70s. Originally used in Shakey robot.
Completeness & Optimality of A*
Claim 1: If there is a path from the initial
to a goal node, A* (with no removal of
repeated states) terminates by finding the
best path, hence is:
complete
optimal
requirements:
0<
c(N,N’) - c(N,N’) – cost of going from N to N’
Completeness of A*
Theorem: If there is a finite path from the
initial state to a goal node, A* will find it.
Proof of Completeness
Intuition (not math. Proof):
Let g be the cost of a best path to a goal
node
No path in search tree can get longer than
g/ , before the goal node is expanded
Optimality of A*
Theorem: If h(n) is admissable, then A* is
optimal (finds an optimal path).
Proof of Optimality
f(G1) = g(G1)
N
G
1
G
2
f(N) = g(N) + h(N)
f(G1)
g(N) + h(N)
Cost of best path
to a goal thru N
g(N) + h*(N)
g(N) + h*(N)
Example of Evaluation Function
f(N) = (sum of distances of each tile to its goal)
+ 3 x (sum of score functions for each tile)
where score function for a non-central tile is 2 if it is
not followed by the correct tile in clockwise order
and 0 otherwise
5
8
4 2 1
7 3 6
N
1 2 3 f(N) = 2 + 3 + 0 + 1 + 3 + 0 + 3 + 1
4 5 6 3x(2 + 2 + 2 + 2 + 2 + 0 + 2)
= 49
7 8
goa
l
Heuristic Function
Function h(N) that estimates the cost of
the cheapest path from node N to goal
node.
Example: 8-puzzle
5
8
4 2 1
7 3 6
N
1 2 3 h(N) = number of misplaced tiles
4 5 6 =6
7 8
goa
l
Heuristic Function
Function h(N) that estimate the cost of the
cheapest path from node N to goal node.
Example: 8-puzzle
5
8
4 2 1
7 3 6
N
1 2 3 h(N) = sum of the distances of
4 5 6 every tile to its goal position
=2+3+0+1+3+0+3+1
7 8
= 13
goa
l
8-Puzzle
f(N) = h(N) = number of misplaced tiles
3
5
3
4
3
4
2
4
2
3
3
0
4
5
4
1
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
3+3
1+5
2+3
3+4
5+2
0+4
3+2
1+3
2+3
5+0
3+4
1+5
2+4
4+1
8-Puzzle
f(N) = h(N) =
6
distances of tiles to goal
5
2
5
2
4
3
0
4
6
5
1
8-Puzzle
5
8
1
2
3
6
4
2
1
4
5
7
3
6
7
8
N
goa
h1(N) = number of misplaced tiles l= 6 is
admissible
h2(N) = sum of distances of each tile to goal = 13
is admissible
h3(N) = (sum of distances of each tile to goal)
+ 3 x (sum of score functions for each tile) = 49
is not admissible
8-Puzzle
f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles
3+3
1+5
2+3
3+4
5+2
0+4
3+2
1+3
2+3
5+0
3+4
1+5
2+4
4+1
Robot navigation
f(N) = g(N) + h(N), with h(N) = straight-line distance from N to goal
Cost of one horizontal/vertical step = 1
Cost of one diagonal step = 2
About Repeated States
f(N1)=g(N1)+h(N1)
N
N
S
1
1
f(N)=g(N)+h(N)
g(N1) < g
(N)
h(N) < h
(N1)
g(N2) < g
f(N) < f(N1)
(N)
S
N
2 2)+h(N)
f(N2)=g(N
Consistent Heuristic
The admissible heuristic h is consistent
(or satisfies the monotone restriction) if for
every node N and every successor N’ of
N:
N
h(N)
c(N,N’) + h(N’)
(triangle inequality)
c(N,
N’)
N
’h(N’)
h(N)
8-Puzzle
5
8
1
2
3
6
4
2
1
4
5
7
3
6
7
8
N
goa
l
h1(N) = number of misplaced
tiles
h2(N) = sum of distances of each tile to goal
are both consistent
Robot navigation
Cost of one horizontal/vertical step = 1
Cost of one diagonal step = 2
h(N) = straight-line distance to the goal is consistent
Claims
If h is consistent, then the function f along
any path is non-decreasing:
N
f(N) = g(N) + h(N)
f(N’) = g(N) +c(N,N’) + h(N’)
c(N,
N’)
N
’h(N’)
h(N)
Claims
If h is consistent, then the function f along
any path is non-decreasing:
N
f(N) = g(N) + h(N)
f(N’) = g(N) +c(N,N’) + h(N’)
h(N) c(N,N’) + h(N’)
f(N) f(N’)
c(N,
N’)
N
’h(N’)
h(N)
Claims
If h is consistent, then the function f along
any path is non-decreasing:
N
f(N) = g(N) + h(N)
f(N’) = g(N) +c(N,N’) + h(N’)
h(N) c(N,N’) + h(N’)
f(N) f(N’)
c(N,
N’)
N
’h(N’)
h(N)
If h is consistent, then whenever A* expands
a node it has already found an optimal path
to the state associated with this node
Avoiding Repeated States in A*
If the heuristic h is consistent, then:
Let CLOSED be the list of states
associated with expanded nodes
When a new node N is generated:
If its state is in CLOSED, then discard N
If it has the same state as another node in
the fringe, then discard the node with the
largest f
Complexity of Consistent A*
Let s be the size of the state space
Let r be the maximal number of states
that can be attained in one step from any
state
Assume that the time needed to test if a
state is in CLOSED is O(1)
The time complexity of A* is O(s r log s)
Heuristic Accuracy
h(N) = 0 for all nodes is admissible and
consistent. Hence, breadth-first is a special
case of A* !!!
Let h1 and h2 be two admissible and
consistent heuristics such that for all nodes N:
h1(N) h2(N).
Then, every node expanded by A* using h2 is
also expanded by A* using h1.
h2 is more informed than h1
Iterative Deepening A* (IDA*)
Use f(N) = g(N) + h(N) with admissible
and consistent h
Each iteration is depth-first with cutoff on
the value of f of expanded nodes
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=4
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=4
4
6
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=4
4
5
6
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
5
4
Cutoff=4
4
5
6
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
6
5
4
5
6
6
4
Cutoff=4
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
4
6
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
4
5
6
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
4
5
7
6
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
5
4
5
7
6
6
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
5
4
5
7
6
6
5
f(N) 8-Puzzle
= g(N) + h(N)
with h(N) = number of misplaced tiles
4
Cutoff=5
5
4
5
7
6
6
5
About Heuristics
Heuristics are intended to orient the search along
promising paths
The time spent computing heuristics must be
recovered by a better search
After all, a heuristic function could consist of solving
the problem; then it would perfectly guide the search
Deciding which node to expand is sometimes called
meta-reasoning
Heuristics may not always look like numbers and may
involve large amount of knowledge
What’s the Issue?
Search is an iterative local procedure
Good heuristics should provide some
global look-ahead (at low computational
cost)
Another approach…
for optimization problems
rather than constructing an optimal solution
from scratch, start with a suboptimal solution
and iteratively improve it
Local Search Algorithms
Hill-climbing or Gradient descent
Potential Fields
Simulated Annealing
Genetic Algorithms, others…
Hill-climbing search
If there exists a successor s for the current state n
such that
h(s) < h(n)
h(s) <= h(t) for all the successors t of n,
then move from n to s. Otherwise, halt at n.
Looks one step ahead to determine if any successor
is better than the current state; if there is, move to
the best successor.
Similar to Greedy search in that it uses h, but does
not allow backtracking or jumping to an alternative
path since it doesn’t “remember” where it has been.
Not complete since the search will terminate at "local
minima," "plateaus," and "ridges."
Hill climbing on a surface of
states
Height Defined
by Evaluation
Function
Hill climbing
Steepest descent (~ greedy best-first with
no search) may get stuck into local
minimum
Robot Navigation
Local-minimum problem
f(N) = h(N) = straight distance to the goal
Examples of problems with HC
applet
Drawbacks of hill climbing
Problems:
Local Maxima: peaks that aren’t the highest
point in the space
Plateaus: the space has a broad flat region
that gives the search algorithm no direction
(random walk)
Ridges: flat like a plateau, but with dropoffs to
the sides; steps to the North, East, South and
West may go down, but a step to the NW may
go up.
Remedy:
Introduce randomness
Random restart.
What’s the Issue?
Search is an iterative local procedure
Good heuristics should provide some
global look-ahead (at low computational
cost)
Hill climbing example
star
t
5
3
h=
-3
2 8 3
1 6 4
7
5
h=
-4
5
2 8 3
1
4 h=
7 6 5 -3
2
3
1 8 4
7 6 5
4
goa
l
2
1 2 3
8
4 h=
7 6 5 0
1 2 3
8 4 h=
7 6 5 -1
2 3
1 8 4 h=
7 6 5 -2
4
f(n) = -(number
of tiles out of
Example of a local maximum
sta
rt
1 2 5
7 4
8 6 3
3
1 2 5
7 4
8 6 3
4
1 2 5
7 4 8 6 3 4
1 2 5
7 4 8 6 3 4
go
al
1 2 5
7 4 0
8 6 3
Potential Fields
Idea: modify the heuristic function
Goal is gravity well, drawing the robot toward it
Obstacles have repelling fields, pushing the
robot away from them
This causes robot to “slide” around obstacles
Potential field defined as sum of attractor field
which get higher as you get closer to the goal
and the indivual obstacle repelling field (often
fixed radius that increases exponentially closer
to the obstacle)
Does it always work?
No.
But, it often works very well in practice
Advantage #1: can search a very large
search space without maintaining fringe of
possiblities
Scales well to high dimensions, where no
other methods work
Example: motion planning
Advantage #2: local method. Can be done
online
Example: RoboSoccer
All robots have same field: attracted to the ball
Repulsive potential to other players
Kicking field: attractive potential to the ball and
local repulsive potential if clase to the ball, but
not facing the direction of the opponent’s goal.
Result is tangent, player goes around the ball.
Single team: kicking field + repulsive field to
avoid hitting other players + player position
fields (paraboilic if outside your area of the field,
0 inside). Player nearest to the ball has the
largest attractive coefficient, avoids all players
crowding the ball.
Simulated annealing
Simulated annealing (SA) exploits an analogy between
the way in which a metal cools and freezes into a
minimum-energy crystalline structure (the annealing
process) and the search for a minimum [or maximum] in
a more general system.
SA can avoid becoming trapped at local minima.
SA uses a random search that accepts changes that
increase objective function f, as well as some that
decrease it.
SA uses a control parameter T, which by analogy with
the original application is known as the system
“temperature.”
T starts out high and gradually decreases toward 0.
Simulated annealing (cont.)
A “bad” move from A to B is accepted with a
probability
(f(B)-f(A)/T)
e
The higher the temperature, the more likely it
is that a bad move can be made.
As T tends to zero, this probability tends to
zero, and SA becomes more like hill climbing
If T is lowered slowly enough, SA is complete
and admissible.
The simulated annealing algorithm
Summary: Local Search Algorithms
Steepest descent (~ greedy best-first with
no search) may get stuck into local
minimum
Better Heuristics: Potential Fields
Simulated annealing
Genetic algorithms
When to Use Search Techniques?
The search space is small, and
There is no other available techniques, or
It is not worth the effort to develop a more
efficient technique
The search space is large, and
There is no other available techniques, and
There exist “good” heuristics
Summary
Heuristic function
Best-first search
Admissible heuristic and A*
A* is complete and optimal
Consistent heuristic and repeated states
Heuristic accuracy
IDA*
Modified Search Algorithm
1. INSERT(initial-node,FRINGE)
2. Repeat:
If FRINGE is empty then return failure
n REMOVE(FRINGE)
s STATE(n)
If GOAL?(s) then return path or goal state
For every state s’ in SUCCESSORS(s)
Create a node n’
INSERT(n’,FRINGE)
Download