Adversarial Search Chapter 6 Outline Optimal decisions α-β pruning Imperfect, real-time decisions Games as search problems Engaged intellectual faculties of humans. Board games such as Chess. Game playing is appealing target of AI. State of the game is easy to represent . And Actions are restricted. Games as search problems This makes the Game playing an idealizations of world in which hostile agents act so as to diminish one’s well being. Early researchers choose chess fro several reasons. A chess playing computer is an existing proof of machine doing some thing thought to require intelligence. Easy to represent the game as search through a space of possible gaming positions . Games as search problems The presence of opponent makes the decision problem complicated. It introduces “Uncertainty”. So, all game playing programs must deal with contingency problems. Means prediction is impossible. Games as search problems The complexity of the games introduces completely a new kind of uncertainty that we have not seen so far. It does not arise because it has missing information but one does not have a time to calculate the exact consequences of any move. Instead, one has to make the best guess based on past experience and act before he is sure what action to take. Games as search problems Game usually has the time limits. Game playing research has therefore spawned a number of interesting ideas on how to make the best use of time to reach good decisions. Now ,the next discussion is how to find the theoretically best move. Then we look techniques for choosing a good move when time is limited. Pruning allows us to to ignore portions of the search tree that makes no difference to the final choice. Heuristic function allows us to approximate the true utility of state without complete search Perfect Decisions in two person Games The general case of a game with two players, whom we will call MAX and M1N, MAX moves first, and then they take turns moving until the game is over. At the end of the game, points are awarded to the winning player. Formal definition of game as search problem The initial state, which includes the board position and an indication of whose move it is. A set of operators, which define the legal moves that a player can make. A terminal test, which determines when the game is over. States where the game has ended are called terminal states. A utility function (also called a payoff function), which gives a numeric value for the outcome of a game. In chess, the outcome is a win, loss, or draw, which we can represent by the values +1, —1, or 0 Perfect Decisions in two person Games MAX would have to search for a sequence of moves that leads to a terminal state that is a winner .then go ahead and make the first move in the sequence. Unfortunately, MIN has something to say about it. MAX therefore must find a strategy that will lead to a winning terminal state regardless of what MIN does, where the strategy includes the correct move for MAX for each possible move by MIN. We will begin by showing how to find the optimal (or rational) strategy. Game tree (2-player, deterministic, turns) Figure shows part of the search tree for the game of Tic-TacToe. From the initial state, has a choice of nine possible moves. Play alternates between MAX placing x's and MIN placing o's until we reach leaf nodes corresponding to terminal states: states where one player has three in a row or all the squares are filled. The number on each leaf node indicates the utility value of the terminal state from the point of view of MAX; high values are assumed to be good for MAX and bad for MIN .It is MAX'S job to use the search tree to determine the best move. Game tree (2-player, deterministic, turns) Game tree Even a simple game like Tic-Tac-Toe is too complex to show the whole search tree, so we will switch to the absolutely trivial game in Figure . The possible moves for MAX are labelled A I , AT, and AS. The possible replies loA\ for MIN are A11, A12, A13, and so on. This particular game ends after one move each by MAX and MIN. Game tree Minimax Algorithm The minimax algorithm is designed to determine the optimal strategy for MAX, and thus to decide what the best first move is. The algorithm consists of five steps: Generate the whole game tree, all the way down to the terminal states. Apply the utility function to each terminal state to get its value. Use the utility of the terminal states to determine the utility of the nodes one level higher up in the search tree. Minimax Algorithm Continue backing up the values from the leaf nodes toward the root, one layer at a time. Eventually, the backed-up values reach the top of the tree; at that point, MAX chooses the move that leads to the highest value. In the topmost A node of , MAX has a choice of three moves that will lead to states with utility 3, 2, and 2, respectively. Thus, MAX's best opening move is A1. This is called the minimax decision, because it maximizes the utility under the assumption that the opponent will play perfectly to minimize it. Minimax Perfect play for deterministic games Idea: choose move to position with highest minimax value = best achievable payoff against best play E.g., 2-ply game: Minimax algorithm Properties of minimax Complete? Yes (if tree is finite) Optimal? Yes (against an optimal opponent) Time complexity? O(bm) (m is depth and b is legal move on each point) Space complexity? O(bm) (depth-first exploration) For chess, b ≈ 35, m ≈100 for "reasonable" games exact solution completely infeasible α-β pruning example α-β pruning example α-β pruning example α-β pruning example α-β pruning example Properties of α-β Pruning does not affect final result Good move ordering improves effectiveness of pruning With "perfect ordering," time complexity = O(bm/2) doubles depth of search A simple example of the value of reasoning about which computations are relevant (a form of metareasoning) Why is it called α-β? α is the value of the best (i.e., highestvalue) choice found so far at any choice point along the path for max If v is worse than α, max will avoid it prune that branch The α-β algorithm The α-β algorithm Resource limits Suppose we have 100 secs, explore 104 nodes/sec 106 nodes per move Standard approach: cutoff test: e.g., depth limit (perhaps add quiescence search) Evaluation functions For chess, typically linear weighted sum of features Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s) e.g., w1 = 9 with f1(s) = (number of white queens) – (number of black queens), etc. Cutting off search MinimaxCutoff is identical to MinimaxValue except 1. 2. Terminal? is replaced by Cutoff? Utility is replaced by Eval 3. Does it work in practice? bm = 106, b=35 m=4 4-ply lookahead is a hopeless chess player! Deterministic games in practice Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply. Othello: human champions refuse to compete against computers, who are too good. Go: human champions refuse to compete against computers, Summary Games are fun to work on! They illustrate several important points about AI perfection is unattainable must approximate good idea to think about what to think about