Problem Solving

advertisement
Problem Solving
Problem Solving
• Game Playing
• Planning
Game Playing
•
•
•
•
Attractive AI problem, because it is abstract
One of the oldest domains of AI
In most cases, the world state is fully accessible
Computer representation of the situation can be
clear and exact
Game Playing
• Challenging: uncertainty introduced by the
– opponents and the
– complexity of the problem (full search is impossible)
• Hard: in chess, branching factor is about 35, and
50 moves by each player = 35100 nodes to search
1040 possible legal board states.
Games vs. Search Problems
• “Unpredictable” opponent  solution is a contingency plan
• Time limits  unlikely to find goal, must approximate.
Plan of attack
• Algorithm for perfect play (Von Neumann, 1944)
• Finite horizon, approximate evaluation (Zuse, 1945; Shannon,
1950; Samuel, 1952 –57)
• Pruning to reduce costs (McCarthy, 1956)
Types of Games
Perfect info
Imperfect info
deterministic
chance
Chess,
checkers, go
Backgammon,
monopoly
?
Bridge, poker
Two-Person Perfect Information
Game
Two agents, players, move in turn until one of them
wins, or the result is a draw.
Each player has a complete and perfect model of the
environment
Two-Person Perfect Information
Game
initial state: initial position and who goes first
operators: legal moves
terminal test: game over?
utility function: outcome (win:+1, lose:-1, draw:0, etc.)
•
•
•
two players (MIN and MAX) taking turns to maximize
their chances of winning (each turn generates one ply)
one player’s victory is another’s defeat
need a strategy to win no matter what the opponent does
MINIMAX : Strategy for Two-Person
Perfect Info Games
The mini-max procedure is a depth first, depth
limited search procedure. At the current state next
possible moves are generated by plausible move
generator.
Then static evaluation function is applied to chose
the best move.We assume that the static evaluation
function returns large values for the player and small
numbers for the opponent.
MINIMAX PROCEDURE
Assume that 10 means a win
for the player -10 means a
win for the opponent and 0
means a tie. If the two
players are at the same level
of knowledge then moves
will be taken as maximizing
when it is the players turn
and minimizing when it is the
opponents turn.
We should chose move B to maximize our advantage
MINIMAX PROCEDURE
But, we usually carry the search for more than one step.
If we take move B the opponent will make move F which is more to
his advantage.
MINIMAX PROCEDURE
We propagate the static evaluation function values upwards and
chose to make move C.
MAX
MIN
(-4)
MINIMAX PROCEDURE
• Complete search of most game graphs is computationally
infeasible.
• After search terminates, an estimate of the best first move
is made.
• This can be made by applying a static evaluation function
to the leaf nodes of the search tree. The evaluation function
measures the worth of a leaf node.
• Positions favorable to MAX evaluate to a large positive
value, where positions favorable to MIN evaluate to a
negative value.
• Values near zero correspond to game positions not
particularly favorable to either MAX or MIN
MINIMAX PROCEDURE
The backed up values are based on “looking ahead” in the
game tree and therefore depend on features occurring near
the end of the game.
MINIMAX PROCEDURE
3
MAX
3
MIN
MAX
2
9
3
35
2
0
0
9
2
7
0
7
4
2
6
1 5
6
Tic-Tac-Toe
Assume MAX marks Xs, MIN marks Os
• MAX plays first
• With a depth bound of 2, we generate all the nodes till
level 2, then apply static evaluation function to the
positions at these nodes.
If p is not a winning position for either of the players
• e(p) = number of complete rows, columns or diagonals
that are still open
• e(p) =  if p is a win for MAX
• e(p) = -  if p is a win for MIN
Tic-Tac-Toe
• Thus if p is
O
X
We have e(p) = 6 –4 = 2
We make use of symmetries in generating successor positions;
thus the following game states are all considered identical.
O
O
X
X
X
X
O
O
MINIMAX PROCEDURE
MINIMAX PROCEDURE
MINIMAX PROCEDURE
The Alpha-Beta PROCEDURE
 Cuts
When the current maximum value is greater than the
successor’s min value, don’t look further on that
min subtree:
MAX >=4
4
4
6
MIN =< 2
2
discard
Right subtree can be at most
2, so MAX will always
choose the left path regardless
of what appears next.
The Alpha-Beta PROCEDURE
 Cuts
When the current min value is les than the
successor’s max value, don’t look further on that
max subtree:
MIN =< 3
MAX >=5
3
1
3
5
discard
Right subtree can be at least
5, so MIN will always choose
the left path regardless of
what appears next.
The Alpha-Beta PROCEDURE
3 C
MAX
MIN
MAX
2
3
3
B
35
A
0 D
2
0
E
2
0
1) A has = 3 (A will be no larger than 3 )
B is  pruned
2) C has  = 3 (C will be no smaller than 3
2
1
D is  pruned, since 0<3
E is  pruned, since 2<3
Search Efficiency of  -  Procedure
• Pruning does not affect the final result
• In order to perform  -  cutoffs, at least some part
of the search tree must be generated to the max.
depth, because alpha and beta values must be
based on the static values of the tip nodes.
• Good move ordering improves effectiveness of
pruning
Ordering is important for good Pruning
For MIN, sorting successor’s utility in an increasing order is
better (shown above; left).
For MAX, sorting in decreasing order is better.
 - Pruning Properties
• A game of m moves can be represented with a tree
of depth m. If the branching factor is b then we
have bm tip nodes.
• With perfect ordering (lowest successor values
first for MIN nodes and the highest successor
values first for MAX nodes), the number of cut
offs is maximized and the number of tip nodes
generated is minimized.
– time complexity = bm/2.
– bm/2 = (b1/2)m ,thus b= 35 in chess reduces to 6.
 - Pruning Properties
• That is, the number of tip nodes at depth d that
would be generated by optimal alpha-beta search
is about the same as the number of tip nodes that
would have been generated at depth d/2 without
alpha-beta
Planning – Means Ends Analysis
• In some problems like guiding a robot around a house
description of a single state is very complex.
• A given action on the part of the robot will change only
very small part of the total state.
• Instead of writing rules that describe transformations
between states, we would like to write rules that describe
only the affected parts of the state description. The rest can
be assumed to stay constant.
• These methods focus on ways of decomposing the original
problem into appropriate subparts and on ways of
recording and handling interactions among the subparts as
they are detected during the problem solving process.
Planning – Means Ends Analysis
• The means ends analysis centers around the
detection of the difference between the current
state and the goal state. Once such a difference is
isolated, an operator that can reduce the difference
(means) must be found. In other words it relies on
a set of rules that transform one problem state into
another (ends).
• Example: Suppose a robot is given a task of
moving a desk with two things on it from one
room to another.
Means-Ends Analysis
Operator (means)
Preconditons
Results (ends)
PUSH(obj, loc)
at(robot, obj),
large(obj), clear(obj),
armempty
at(obj, loc), at(robot, loc)
CARRY(obj, loc)
at(robot, obj),
small(obj)
at(obj, loc), at(robot, loc)
WALK(loc)
none
at(robot, loc)
PICKUP(obj)
at(robot, obj)
armempty
holding(obj)
PUTDOWN(obj)
holding(obj)
~holding(obj)
PLACE(obj1, obj2)
at(robot, obj2),
holding(obj1)
on(obj1, obj2)
Means-Ends Analysis
• The main difference between the start state and the
goal state is the location of the desk
• To reduce this difference PUSH or CARRY
operators are available.
• CARRY reaches a dead-end
obj is not small
• PUSH
A
Start
B
C
Push
D
Goal
Means-Ends Analysis
A
B
Start
C
Push
A
Start
D
Goal
B
Walk Pick up Put down
Pick up Put down Push
C
E
D
Place
This difference can be reduced by using WALK to get
the robot back to the objects followed by PICKUP and
CARRY
References
• Nilsson, N.J. Artificial Intelligence: A new
Synthesis, Morgan Kaufmann, 1998
• Luger, Stubblefield. Artificial Intelligence
• Rich, Artificial Intelligence
Download