Chapter Three: Game Theory

Game theory is the systematic study of how rational agents behave in strategic situations. It is the formal study of decision making in which several players must make choices that potentially affect the interests of the other players. In a strategic setting, a person may no longer have an obvious choice that is best for him or her: what is best for one decision maker may depend on what the other is doing, and vice versa.

The object of study in game theory is the game, which is a formal model of an interactive situation. A game is an abstract model of a strategic situation. Even the most basic games have three essential elements: players, strategies, and payoffs. In complicated settings it is sometimes also necessary to specify additional elements, such as the sequence of moves and the information that players have when they move (who knows what, and when), to describe the game fully. In a game, two or more decision makers know that their decisions are strategically interdependent: the outcome of the decisions taken by one decision maker depends on the decisions of the other(s). To work out the consequences of their actions, each player must form expectations about how the other player(s) will act. Game theory is concerned with the analysis of such decision problems. A game typically involves several players; a game with only one player is usually called a decision problem. The concepts of game theory provide a language in which to formulate, structure, analyze, and understand strategic scenarios.

The earliest formal game-theoretic analysis is the study of duopoly by Antoine Cournot. Game theory was established as a subject in its own right after the 1944 publication of "Theory of Games and Economic Behavior" by John von Neumann and Oskar Morgenstern. This book provided much of the basic terminology and problem setup that is still in use today.
In 1950 John Nash demonstrated that finite games always have an equilibrium point, at which all players choose the actions that are best for them given their opponents' choices.

3.1 Basic Elements of a Game

Players: Each decision maker in a game is called a player. These players may be individuals, firms (as in markets with few firms), or entire nations (as in military conflicts). A player is characterized by the ability to choose from among a set of possible actions. Usually the number of players is fixed throughout the "play" of the game, and games are often characterized by the number of players involved (two-player, three-player, or n-player games). Here our discussion focuses on two-player games because this is the simplest strategic setting.

Strategy: Each course of action open to a player during the game is called a strategy.

Payoffs: The final returns to the players at the conclusion of a game are called payoffs. A payoff is a number that reflects the desirability of an outcome to a player, for whatever reason. Payoffs are measured in levels of utility obtained by the players. For simplicity, monetary payoffs (say, profits for firms) are often used; more generally, payoffs can incorporate nonmonetary outcomes such as prestige, emotion, risk preferences, and so forth. Players are assumed to prefer higher payoffs to lower payoffs. In a two-player game, u1(s1, s2) denotes player 1's payoff given that he or she chooses strategy s1 and the other player chooses s2; similarly, u2(s2, s1) denotes player 2's payoff. The fact that player 1's payoff may depend on player 2's strategy (and vice versa) is where the strategic interdependence shows up.

3.2 The Payoff Matrix of the Game

The payoff matrix of the game is a table that summarizes the different possible outcomes of the game.

Table 3.1 The payoff matrix of an advertising game

                                      Firm B
                             Advertise (L)   Don't Advertise (R)
Firm A  Advertise (T)           (10, 5)           (15, 0)
        Don't Advertise (B)     (6, 8)            (10, 2)

Example: Table 3.1 is the payoff matrix of an advertising game between firm A and firm B.
Firm A has two strategies: it can either advertise (top, T) or not advertise (bottom, B); similarly, firm B can either advertise (left, L) or not advertise (right, R). The first and second entries in each cell represent the payoffs of firm A and firm B respectively. For instance, when both firm A and firm B advertise, the benefit of A is 10 and the benefit of B is 5. The payoff matrix of a game simply depicts the payoffs to each player for each combination of strategies that is chosen.

3.3 Dominant Strategy Equilibrium

A dominant strategy is a strategy that is best for a player no matter what an opponent does. From the payoff matrix of Table 3.1 it can easily be seen that firm A should advertise no matter what firm B does: firm A does best by advertising. Thus advertising is the dominant strategy for firm A, and the same is true for firm B. So, assuming that the two firms are rational, the outcome of the game will be that both firms advertise. An equilibrium is called a dominant-strategy equilibrium if the strategy of each player is a dominant strategy. If there is a dominant strategy for each player in a game, it will be the equilibrium outcome of the game. However, dominant strategies often do not exist, so it is rarely possible to predict the outcome of a game based on this equilibrium concept alone.

3.4 Nash Equilibrium

A dominant-strategy equilibrium is stable, but one or more players may not have a dominant strategy. If there are two players in a game (say A and B), a Nash equilibrium implies that A is doing the best it can given B's choice, and B is doing the best it can given what A is doing. A Nash equilibrium is a set of actions or strategies such that each player believes it is doing the best it can, given the actions of its opponents. The Nash equilibrium is named after John Nash, an American mathematician who formulated this fundamental concept of game theory in 1951.
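The dominance check of Section 3.3 can be mechanized. The following is a minimal sketch in Python (the function names are ours) using the Table 3.1 payoffs; it reports each firm's strictly dominant strategy, if one exists.

```python
# Payoff matrices for the advertising game of Table 3.1.
# Rows: firm A's strategy (0 = Advertise, 1 = Don't Advertise);
# columns: firm B's strategy in the same order.
A_PAY = [[10, 15], [6, 10]]   # firm A's payoffs
B_PAY = [[5, 0], [8, 2]]      # firm B's payoffs

def dominant_row(payoffs):
    """Return the row that strictly dominates every other row, or None."""
    rows, cols = len(payoffs), len(payoffs[0])
    for r in range(rows):
        if all(all(payoffs[r][c] > payoffs[other][c] for c in range(cols))
               for other in range(rows) if other != r):
            return r
    return None

def dominant_col(payoffs):
    """Return the column that strictly dominates, or None."""
    # Transpose so the row check can be reused for the column player.
    transposed = [list(col) for col in zip(*payoffs)]
    return dominant_row(transposed)

print(dominant_row(A_PAY))  # 0 -> Advertise is dominant for firm A
print(dominant_col(B_PAY))  # 0 -> Advertise is dominant for firm B
```

Since both firms have strategy 0 (Advertise) as a dominant strategy, (Advertise, Advertise) is the dominant-strategy equilibrium, as argued in the text.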
In 1994 he received the Nobel Prize in economics, along with two other game theory pioneers, John Harsanyi and Reinhard Selten. The Nash equilibrium notion has a certain logic. Unfortunately, it also has some problems. First, a game may have more than one Nash equilibrium. Second, there are games that have no Nash equilibrium (in pure strategies). In general, a game may have a single Nash equilibrium, several, or none.

I. Example of a game with one Nash equilibrium

Table 3.2 A game with one Nash equilibrium

                                 Firm B
                        Advertise    Don't Advertise
Firm A  Advertise         (10, 5)        (15, 0)
        Don't Advertise   (5, 8)         (20, 2)

The game in Table 3.2 has one Nash equilibrium. As can be seen from the table, firm A has no dominant strategy while firm B has a dominant strategy, namely Advertise. The Nash equilibrium of the game is (Advertise, Advertise). The game described in Table 3.1 also has one Nash equilibrium.

II. Example of a game with two Nash equilibria: The Battle of the Sexes

This is a two-player coordination game. Imagine a couple who agreed to meet this evening but cannot recall whether they will be attending a ballet or a football match. The husband would most of all like to go to the football game, but the wife would like to go to the ballet. Both would prefer to go to the same place rather than to different ones. Assuming that each spouse would be unhappy without the other, what is the Nash equilibrium? Neither the wife nor the husband has a dominant strategy, and the game has two Nash equilibria: (Ballet, Ballet) and (Football, Football).

Table 3.3 A game with two Nash equilibria

                      Husband
                 Ballet    Football
Wife  Ballet     (3, 2)     (0, 0)
      Football   (0, 0)     (2, 3)

So what is the problem with games that have multiple Nash equilibria? There are at least two problems caused by the multiplicity of equilibria. First, the predictive power of a theory that makes several predictions is limited; second, such a theory is only of limited use in helping players identify optimal strategies.

III.
Example of games with no Nash equilibrium (in pure strategies): Matching Pennies

Two people simultaneously choose whether to show the head or the tail of a coin. If they show the same side, player two pays player one 2 dollars. If they show different sides, player one pays player two 2 dollars. Thus each player has two strategies, which we abbreviate as Heads and Tails, and we can depict the strategic interaction in a game matrix. The entry in the box (Heads, Tails) indicates that player one gets -2 and player two gets +2 if this particular combination of strategies is chosen. Note that in each cell the payoff to player one is just the negative of the payoff to player two; in other words, this is a zero-sum game. In zero-sum games the interests of the players are diametrically opposed, and such games are particularly simple to analyze. However, most games of interest to economists are not zero-sum games.

The game matrix of matching pennies:

                       Player 2
                   Heads      Tails
Player 1  Heads   (2, -2)    (-2, 2)
          Tails   (-2, 2)    (2, -2)

The game matrix shown in Table 3.4 is another example of a game with no Nash equilibrium in pure strategies. In this game, if firm A chooses to advertise, firm B wants to advertise too; but if firm B advertises, then firm A prefers not to advertise. Similarly, if firm A does not advertise, then firm B prefers not to advertise; but if firm B does not advertise, then A wants to advertise. Thus there is no Nash equilibrium: the game is circular in this sense.

Table 3.4 A game matrix for advertisement

                                 Firm B
                        Advertise    Don't Advertise
Firm A  Advertise         (0, 0)        (0, -1)
        Don't Advertise   (1, 0)        (-1, 3)

There is an easy procedure to determine the set of Nash equilibria for games in matrix form. First, one successively goes through all the strategies of player 1 and marks the respective best response(s) of player 2. Then one repeats the whole procedure with the strategies of player 2 and marks player 1's best responses.
If there are cells in which there are marks for both players, then the strategy profiles associated with those cells are the Nash equilibria.

3.5 Mixed Strategy Equilibrium

First let us refine the description of strategies. We ought to refer to those discussed so far as pure strategies: we have been thinking of each agent as choosing a strategy once and for all, making one choice and sticking to it. However, if we enlarge our definition of strategies, we can find a new sort of Nash equilibrium, for example for the game in Table 3.4. The idea is to allow the agents to randomize over their strategies, that is, to assign a probability to each choice and to play their choices according to those probabilities. For example, in the game of Table 3.4, firm A might choose to advertise 50 percent of the time and not advertise 50 percent of the time, while firm B does the same. This kind of strategy is called a mixed strategy.

Definition: Given a finite set S_i of pure strategies for agent i, a mixed strategy is a probability distribution over the elements of S_i. Writing the elements of S_i as (s_1, s_2, ...), we can represent the mixed strategy by a probability vector p_i = (p_1, p_2, ...), where p_k is the probability that s_k is the strategy actually adopted by agent i.

If firms A and B follow the mixed strategies given above, playing each of their choices half the time, then each of the four cells of the payoff matrix occurs with probability 1/4. Thus the average payoff to firm A will be 0, and the average payoff to firm B will be 1/2.

Another example is the simple one-shot simultaneous-move game shown in Table 3.5. Each player has two pure strategies: L (left) and R (right) for player A, and U (up) and D (down) for player B.
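The best-response marking procedure just described translates directly into code. The sketch below (the function name is ours) applies it to the Table 3.2 game and to matching pennies.

```python
# The best-response marking procedure, sketched for a two-player game
# in matrix form.  Payoffs here are those of Table 3.2: rows are firm
# A's strategies, columns firm B's, in the order (Advertise, Don't).
A_PAY = [[10, 15], [5, 20]]   # row player's payoffs
B_PAY = [[5, 0], [8, 2]]      # column player's payoffs

def pure_nash_equilibria(a_pay, b_pay):
    rows, cols = len(a_pay), len(a_pay[0])
    # Mark the row player's best response(s) to each column...
    a_marks = {(r, c) for c in range(cols) for r in range(rows)
               if a_pay[r][c] == max(a_pay[i][c] for i in range(rows))}
    # ...and the column player's best response(s) to each row.
    b_marks = {(r, c) for r in range(rows) for c in range(cols)
               if b_pay[r][c] == max(b_pay[r][j] for j in range(cols))}
    # Cells marked for both players are the pure-strategy Nash equilibria.
    return sorted(a_marks & b_marks)

print(pure_nash_equilibria(A_PAY, B_PAY))  # [(0, 0)] -> (Advertise, Advertise)
# Matching pennies has no cell marked for both players:
print(pure_nash_equilibria([[2, -2], [-2, 2]], [[-2, 2], [2, -2]]))  # []
```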
If the players choose pure strategies, there is no Nash equilibrium. Now suppose that the players can choose the probability with which to play each of their pure strategies.

Table 3.5 A game matrix with no pure-strategy equilibrium

                       Player B
                   Up (U)     Down (D)
Player A  Left (L)   (5, 10)    (4, 8)
          Right (R)  (10, 2)    (2, 4)

Denote the probability that player A chooses L by p_L, so that she chooses R with probability (1 - p_L). Similarly, p_U is the probability that player B chooses U. A mixed strategy for player A is a choice of p_L, which determines the probability with which she plays her L and R pure strategies; similarly, a mixed strategy for player B is a choice of the probability p_U with which he plays U. Games in mixed strategies include games in pure strategies as special cases in which p_L and p_U are restricted to be 0 or 1. Given the chosen probabilities, the expected payoffs to the players are

π_A(p_L, p_U) = 5 p_L p_U + 4 p_L (1 - p_U) + 10 (1 - p_L) p_U + 2 (1 - p_L)(1 - p_U)
π_B(p_L, p_U) = 10 p_L p_U + 8 p_L (1 - p_U) + 2 (1 - p_L) p_U + 4 (1 - p_L)(1 - p_U)

A mixed-strategy Nash equilibrium is a pair of mixed strategies (p_L*, p_U*) with the property that

π_A(p_L*, p_U*) ≥ π_A(p_L, p_U*) for all p_L in [0, 1], and
π_B(p_L*, p_U*) ≥ π_B(p_L*, p_U) for all p_U in [0, 1].

That is, the mixed strategies must be best responses to each other. The partial derivative of π_A with respect to p_L is

∂π_A/∂p_L = 5 p_U + 4 (1 - p_U) - 10 p_U - 2 (1 - p_U) = 2 - 7 p_U,

which is positive, zero, or negative as p_U is less than, equal to, or greater than 2/7, respectively. Thus player A's best response to p_U < 2/7 is to set p_L = 1, and to p_U > 2/7 is to set p_L = 0. If p_U = 2/7, player A gets the same expected payoff from choosing any p_L in [0, 1]. The partial derivative of π_B with respect to p_U is

∂π_B/∂p_U = 10 p_L - 8 p_L + 2 (1 - p_L) - 4 (1 - p_L) = 4 p_L - 2,

which is positive, zero, or negative as p_L is greater than, equal to, or less than 1/2, respectively.
Thus player B's best response to p_L < 1/2 is to set p_U = 0, and to p_L > 1/2 is to set p_U = 1. If p_L = 1/2, player B is indifferent among all p_U in [0, 1].

There is an equilibrium in mixed strategies for this one-shot game at (p_L*, p_U*) = (1/2, 2/7). This pair of mixed strategies are best responses to each other: if player A plays her pure strategy L with probability 1/2, then player B cannot do better than to play U with probability 2/7; and faced with player B playing U with probability 2/7, player A cannot do better than to play L with probability 1/2. It is possible to show that a Nash equilibrium in mixed strategies exists for all games in which the players have a finite number of pure strategies.

3.6 The Prisoner's Dilemma

Another problem with the Nash equilibrium of a game is that it does not necessarily lead to Pareto-efficient outcomes. Consider, for example, the game depicted in Table 3.6, known as the prisoner's dilemma. The original discussion of the game considered a situation where two prisoners who were partners in a crime were being questioned in separate rooms. Each prisoner had a choice of confessing to the crime, thereby implicating the other, or denying that he had participated in the crime.

Table 3.6 The prisoner's dilemma

                        Player B
                   Confess      Deny
Player A  Confess  (-3, -3)    (0, -6)
          Deny     (-6, 0)     (-1, -1)

If only one prisoner confessed, then he would go free, and the authorities would throw the book at the other prisoner, requiring him to spend 6 months in prison. If both prisoners denied being involved, then both would be held for 1 month on a technicality, and if both prisoners confessed they would both be held for 3 months. The payoff matrix for this game is given in Table 3.6.
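As a numerical check on the mixed-strategy calculations of Section 3.5, the sketch below encodes the expected-payoff formulas for the game of Table 3.5 and verifies that at (p_L*, p_U*) = (1/2, 2/7) neither player has a profitable unilateral deviation.

```python
# Expected payoffs for the mixed-strategy game of Table 3.5, taken
# directly from the formulas in the text.
def pi_A(pL, pU):
    return 5*pL*pU + 4*pL*(1 - pU) + 10*(1 - pL)*pU + 2*(1 - pL)*(1 - pU)

def pi_B(pL, pU):
    return 10*pL*pU + 8*pL*(1 - pU) + 2*(1 - pL)*pU + 4*(1 - pL)*(1 - pU)

pL_star, pU_star = 1/2, 2/7

# At the equilibrium each player is indifferent over all of their own
# mixtures, so no deviation raises an expected payoff.
for p in [0.0, 0.25, 0.5, 0.75, 1.0]:
    assert pi_A(p, pU_star) <= pi_A(pL_star, pU_star) + 1e-12
    assert pi_B(pL_star, p) <= pi_B(pL_star, pU_star) + 1e-12

# Equilibrium expected payoffs: about 30/7 ≈ 4.286 for A, 6.0 for B.
print(pi_A(pL_star, pU_star), pi_B(pL_star, pU_star))
```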
The entries in each cell of the matrix represent the utility that each of the agents assigns to the various prison terms, which for simplicity we take to be the negative of the length of the prison term. Put yourself in the position of player A. If player B decides to deny committing the crime, then you are certainly better off confessing, since then you'll get off free. Similarly, if player B confesses, then you'll be better off confessing, since then you get a sentence of 3 months rather than a sentence of 6 months. Thus whatever player B does, player A is better off confessing. The same goes for player B: he is better off confessing as well. Thus the unique Nash equilibrium of this game is for both players to confess. In fact, both players confessing is not only a Nash equilibrium but a dominant-strategy equilibrium, since each player has the same optimal choice independent of the other player's choice.

But if they could both just hang tight (deny), they would each be better off! If both could be sure the other would hold out, and both could agree to hold out themselves, they would each get a payoff of -1, which would make each of them better off. The strategy pair (deny, deny) is Pareto efficient: there is no other strategy choice that makes both players better off. The strategy pair (confess, confess), by contrast, is Pareto inefficient. The problem is that there is no way for the two prisoners to coordinate their actions; if each could trust the other, they could both be made better off.

The prisoner's dilemma applies to a wide range of economic and political phenomena. Consider, for example, the problem of arms control. Interpret the strategy "confess" as "deploy a new missile" and the strategy "deny" as "don't deploy." Note that the payoffs are reasonable: if my opponent deploys his missile, I certainly want to deploy, even though the best strategy for both of us is to agree not to deploy.
But if there is no way to make a binding agreement, we each end up deploying the missile and are both made worse off. Another good example is the problem of cheating in a cartel. Now interpret confess as "produce more than your quota of output" and deny as "stick to the original quota." If you think the other firm is going to stick to its quota, it will pay you to produce more than your own quota. And if you think that the other firm will overproduce, then you might as well do so too.

The prisoner's dilemma has provoked a lot of controversy as to what is the "correct" way to play the game or, more precisely, what is a reasonable way to play the game. The answer seems to depend on whether you are playing a one-shot game or whether the game is to be repeated an indefinite number of times. If the game is going to be played just once, the strategy of defecting (in this example, confessing) seems reasonable: whatever the other fellow does, you are better off, and you have no way of influencing the other person's behavior.

3.7 Repeated Games and Enforcing a Cartel

In the preceding section the players met only once and played the prisoner's dilemma game a single time. However, the situation is different if the game is to be played repeatedly by the same players. In this case there are new strategic possibilities open to each player. If the other player chooses to defect on one round, then you can choose to defect on the next round; thus your opponent can be "punished" for "bad" behavior. In a repeated game, each player has the opportunity to establish a reputation for cooperation and thereby encourage the other player to do the same. Whether this kind of strategy is viable depends on whether the game is going to be played a fixed number of times or an indefinite number of times.

Let us consider the first case, where both players know that the game is going to be played, say, 10 times. What will the outcome be?
Consider round 10. This is the last time the game will be played, by assumption. In this case it seems likely that each player will choose the dominant-strategy equilibrium and defect: playing the game for the last time is just like playing it once, so we should expect the same outcome. Now consider what will happen on round 9. We have just concluded that each player will defect on round 10. So why cooperate on round 9? If you cooperate, the other player might as well defect now and exploit your good nature. Each player can reason the same way, and thus each will defect. Now consider round 8: if the other person is going to defect on round 9 . . . and so it goes. If the game has a known, fixed number of rounds, then each player will defect on every round. If there is no way to enforce cooperation on the last round, there will be no way to enforce cooperation on the next-to-last round, and so on.

Players cooperate because they hope that cooperation will induce further cooperation in the future. But this requires that there always be the possibility of future play. Since there is no possibility of future play in the last round, no one will cooperate then. But then why should anyone cooperate on the next-to-last round? Or the one before that? And so the cooperative solution "unravels" from the end in a prisoner's dilemma with a known, fixed number of plays.

But if the game is going to be repeated an indefinite number of times, then you do have a way of influencing your opponent's behavior: if he refuses to cooperate this time, you can refuse to cooperate next time. As long as both parties care enough about future payoffs, the threat of noncooperation in the future may be sufficient to convince people to play the Pareto-efficient strategy.

In Chapter Two we discussed the behavior of duopolists playing a price-setting game.
We argued there that if each duopolist could choose his price, then the equilibrium outcome would be the competitive equilibrium. If each firm thought that the other firm would keep its price fixed, then each firm would find it profitable to undercut the other. The only case where this would not be true is when each firm is charging the lowest possible price, say the constant marginal cost c common to the two firms. In the terminology of this chapter, each firm charging price c is a Nash equilibrium in pricing strategies, which we called a Bertrand equilibrium in Chapter Two.

The payoff matrix for the duopoly game in pricing strategies has the same structure as the prisoner's dilemma. If each firm charges a high price, then both get large profits; this is the situation in which they are cooperating to maintain the monopoly outcome. But if one firm is charging a high price, then it will pay the other firm to cut its price a little, capture the other fellow's market, and thereby get even higher profits. But if both firms cut their prices, they both end up making lower profits. Whatever price the other fellow is charging, it will always pay you to shave your price a little bit. The Nash equilibrium occurs when each firm is charging the lowest possible price.

However, if the game is repeated an indefinite number of times, there may be other possible outcomes. Suppose that you decide to play tit for tat: if the other fellow cuts his price this week, you will cut yours next week. If each player knows that the other is playing tit for tat, then each would be fearful of cutting his price and starting a price war. The threat implicit in tit for tat may allow the firms to maintain high prices.

3.8 Sequential Games

Up until now we have been thinking about games in which both players act simultaneously. But in many situations one player gets to move first, and the other player responds.
An example of this is the Stackelberg model described in Chapter Two, where one player is a leader and the other is a follower. Let's describe a game like this. In the first round, player A chooses top or bottom. Player B observes the first player's choice and then chooses left or right. The payoffs are illustrated in the game matrix in Table 3.7.

Table 3.7 The payoff matrix of a sequential game

                      Player B
                  Left      Right
Player A  Top    (1, 9)     (1, 9)
          Bottom (0, 0)     (2, 1)

Note that when the game is presented in this form (the game matrix, also known as the normal form), it has two Nash equilibria: (top, left) and (bottom, right). However, we'll show below that one of these equilibria isn't really reasonable. The payoff matrix hides the fact that one player gets to know what the other player has chosen before making his own choice. In this case it is more useful to consider a diagram that illustrates the asymmetric nature of the game. Figure 3.1 is a picture of the game in extensive form, a way to represent the game that shows the time pattern of the choices. First player A has to choose top or bottom, and then player B has to choose left or right; but when B makes his choice, he will know what A has done.

The way to analyze this game is to go to the end and work backward. Suppose that player A has already made his choice and we are sitting at one branch of the game tree. If player A has chosen top, then it doesn't matter what player B does, and the payoff is (1, 9). If player A has chosen bottom, then the sensible thing for player B to do is to choose right, and the payoff is (2, 1). Now think about player A's initial choice. If he chooses top, the outcome will be (1, 9), so he will get a payoff of 1. But if he chooses bottom, he gets a payoff of 2. So the sensible thing for him to do is to choose bottom. Thus the equilibrium choices in the game will be (bottom, right), so that the payoff to player A will be 2 and to player B will be 1.
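The work-backward argument can be written as a short backward-induction routine for this two-stage game (a sketch; the function name is ours).

```python
# Backward induction for the sequential game of Table 3.7.
# Player A moves first; player B observes A's move and replies.
# Payoffs are (A's payoff, B's payoff).
PAYOFF = {('Top', 'Left'): (1, 9), ('Top', 'Right'): (1, 9),
          ('Bottom', 'Left'): (0, 0), ('Bottom', 'Right'): (2, 1)}

def solve_by_backward_induction(payoff):
    a_moves = list(dict.fromkeys(a for a, _ in payoff))
    b_moves = list(dict.fromkeys(b for _, b in payoff))
    # Start at the end: at each of B's decision nodes, find B's best reply.
    best_reply = {a: max(b_moves, key=lambda b: payoff[(a, b)][1])
                  for a in a_moves}
    # Then work backward: A picks the move whose induced outcome is best for A.
    a_star = max(a_moves, key=lambda a: payoff[(a, best_reply[a])][0])
    return a_star, best_reply[a_star]

print(solve_by_backward_induction(PAYOFF))  # ('Bottom', 'Right')
```

The routine reproduces the reasoning in the text: B's best reply to Bottom is Right (payoff 1 rather than 0), so A compares a payoff of 1 from Top with 2 from Bottom and chooses Bottom.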
The strategies (top, left) are not a reasonable equilibrium in this sequential game. That is, they are not an equilibrium given the order in which the players actually get to make their choices. It is true that if player A chooses top, player B could choose left, but it would be silly for player A ever to choose top.

Player A chooses:
  Top    -> Player B chooses: Left -> (1, 9);  Right -> (1, 9)
  Bottom -> Player B chooses: Left -> (0, 0);  Right -> (2, 1)

Figure 3.1: Extensive form of the game. This way of depicting a game indicates the order in which the players move.

3.9 A Game of Entry Deterrence

In our examination of oligopoly we took the number of firms in the industry as fixed. But in many situations entry is possible. Of course, it is in the interest of the firms already in the industry to try to prevent such entry. Since they are already in the industry, they get to move first and thus have an advantage in choosing ways to keep their opponents out.

Suppose, for example, that we consider a monopolist facing a threat of entry by another firm. The entrant decides whether or not to come into the market, and then the incumbent decides whether or not to cut its price in response. If the entrant decides to stay out, it gets a payoff of 1 and the incumbent gets a payoff of 9. If the entrant decides to come in, then its payoff depends on whether the incumbent fights (by competing vigorously) or not. If the incumbent fights, then we suppose that both players end up with 0. On the other hand, if the incumbent decides not to fight, we suppose that the entrant gets 2 and the incumbent gets 1.

Note that this is exactly the structure of the sequential game we studied earlier, so it has a structure identical to that depicted in Figure 3.1. The incumbent is player B, while the potential entrant is player A. The top strategy is to stay out, and the bottom strategy is to enter. The left strategy is to fight and the right strategy is not to fight.
As we've seen, in this game the equilibrium outcome is for the potential entrant to enter and the incumbent not to fight. The incumbent's problem is that he cannot precommit himself to fighting if the other firm enters. Once the other firm has entered, the damage is done, and the rational thing for the incumbent to do is to live and let live. Insofar as the potential entrant recognizes this, he will correctly view any threats to fight as empty.

But suppose that the incumbent can purchase some extra production capacity that will allow him to produce more output at his current marginal cost. Of course, if he remains a monopolist, he won't want to actually use this capacity, since he is already producing the profit-maximizing monopoly output. But if the other firm enters, the incumbent will be able to produce so much output that he may well be able to compete much more successfully against the new entrant. By investing in the extra capacity, he lowers his cost of fighting if the other firm tries to enter. Let us assume that if he purchases the extra capacity and chooses to fight, he will make a profit of 2. This changes the game tree to the form depicted in Figure 3.2; payoffs are (entrant, incumbent).

Entrant chooses:
  Stay out -> Incumbent chooses: Fight -> (1, 9);  Don't fight -> (1, 9)
  Enter    -> Incumbent chooses: Fight -> (0, 2);  Don't fight -> (2, 1)

Figure 3.2: The new entry game. This figure depicts the entry game with the changed payoffs.

Now, because of the increased capacity, the threat of fighting is credible. If the potential entrant comes into the market, the incumbent will get a payoff of 2 if he fights and 1 if he doesn't; thus the incumbent will rationally choose to fight. The entrant will therefore get a payoff of 0 if he enters, and a payoff of 1 if he stays out. The sensible thing for the potential entrant to do is to stay out. But this means that the incumbent will remain a monopolist and never have to use his extra capacity!
Despite this, it is worthwhile for the monopolist to invest in the extra capacity in order to make credible the threat of fighting if a new firm tries to enter the market. By investing in "excess" capacity, the monopolist has signaled to the potential entrant that he will be able to successfully defend his market.
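Both versions of the entry game can be solved with the same work-backward logic. The sketch below (the function name is ours) parameterizes the incumbent's payoff from fighting, which is 0 without the extra capacity and 2 with it; the other payoffs come from the text.

```python
# The entry-deterrence game solved by backward reasoning.
# From the text: staying out gives the entrant 1 and the incumbent 9;
# accommodation after entry gives (entrant 2, incumbent 1); fighting
# after entry gives the entrant 0 and the incumbent fight_profit.
def entry_outcome(fight_profit):
    accommodate_profit = 1                     # incumbent's payoff if it doesn't fight
    incumbent_fights = fight_profit > accommodate_profit
    entrant_payoff_if_enter = 0 if incumbent_fights else 2
    if entrant_payoff_if_enter > 1:            # 1 = entrant's payoff from staying out
        return 'enter', ('fight' if incumbent_fights else 'accommodate')
    return 'stay out', None

print(entry_outcome(fight_profit=0))  # ('enter', 'accommodate')
print(entry_outcome(fight_profit=2))  # ('stay out', None)
```

With fight_profit = 0 the threat to fight is empty and entry occurs; raising it to 2 makes the threat credible, so the entrant stays out and the capacity is never used, exactly as argued above.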