ECON 3210/4210 Decisions, Markets and Incentives
Lecture notes 10.10.05
Nils-Henrik von der Fehr

DYNAMIC GAMES OF COMPLETE INFORMATION

Introduction

Static versus dynamic games
- sequential move order: some players act only after having observed others' choices
- early movers may be able to affect the play of later movers
- Extensive Form vs. Normal Form representations
- refinements of Nash Equilibrium: backward induction, Subgame Perfect Equilibrium

An example

Consider again the following example. A new type of consumer product is about to be introduced (relevant examples include music machines and computer game consoles). There are two competing technologies, controlled by different firms. Both firms would like there to be one standard, as this would increase total sales. However, each firm would like its own technology to become the standard, as this would mean higher revenues for itself (from own sales as well as from licensing of the technology to the competitor). The strategic choice is whether to adopt one's own technology or that of the competitor. Payoffs are as follows:

                                Firm 2
                       Technology 1   Technology 2
Firm 1  Technology 1       2,1            0,0
        Technology 2       0,0            1,2

In this game there are two Nash equilibria: both firms choosing Technology 1, and both firms choosing Technology 2.

Consider now the Extensive Form representation of this game, in which we assume that Firm 1 makes its choice before Firm 2.

Figure: game tree (decision nodes; payoffs given at end nodes)

Again there are two Nash equilibria, but is the Nash equilibrium in which both firms choose Technology 2 reasonable?

Subgame Perfection

Backward induction in the above example: start at the nodes at which Firm 2 makes its decision and derive the optimal choices in these subgames; then go back to the node at which Firm 1 makes its decision and consider Firm 1's choice, given that Firm 2 is expected to choose optimally.
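The backward-induction argument can be sketched in code. A minimal Python sketch, using the payoffs from the table above; the function names and action labels are illustrative, not part of the example itself:

```python
# Backward induction in the technology-standard game.
# Firm 1 moves first; Firm 2 observes Firm 1's choice and then moves.
# payoffs[(a1, a2)] = (payoff to Firm 1, payoff to Firm 2), from the table above.
payoffs = {
    ("Tech1", "Tech1"): (2, 1),
    ("Tech1", "Tech2"): (0, 0),
    ("Tech2", "Tech1"): (0, 0),
    ("Tech2", "Tech2"): (1, 2),
}
choices = ["Tech1", "Tech2"]

def firm2_best_reply(a1):
    # In the subgame after Firm 1 plays a1, Firm 2 maximises its own payoff.
    return max(choices, key=lambda a2: payoffs[(a1, a2)][1])

def backward_induction():
    # Firm 1 anticipates Firm 2's optimal reply in each subgame.
    a1 = max(choices, key=lambda a: payoffs[(a, firm2_best_reply(a))][0])
    a2 = firm2_best_reply(a1)
    return a1, a2, payoffs[(a1, a2)]

print(backward_induction())  # ('Tech1', 'Tech1', (2, 1))
```

The sketch singles out (Technology 1, Technology 1): the equilibrium in which both firms choose Technology 2, while a Nash equilibrium, does not survive backward induction.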
Subgame: a part of a larger game.

Definition of Subgame Perfect Equilibrium: strategies constitute a Subgame Perfect Equilibrium if they constitute a Nash equilibrium of every subgame of the game (including the game itself).

Commitment

Classic example: Caesar at the Rubicon.

cf. Nobel Prize winner Thomas C. Schelling, "The Strategy of Conflict".

Repeated Games

Repetition of a one-shot (or stage) game:
- supergame
- finite versus infinite repetitions

Actions versus strategies:
- an action is the choice at any particular stage
- a strategy defines a plan of actions
- contingent strategies depend on what happened earlier in the game

Finite repetitions

Example: the repeated Prisoners' Dilemma

                     Player 2
                    L        R
Player 1    U      0,0     -2,1
            D     1,-2    -1,-1

Consider first the last stage of the game: it has a unique equilibrium. Consequently, at the second-to-last stage, players should realise that play in the last stage will not be influenced by what they do at the current stage, and hence should play the one-shot equilibrium here also. Similar backward-induction reasoning establishes that the (subgame perfect) equilibrium is unique.

This example hints at a more general result: when the stage game has a unique Nash equilibrium then, for any finite number of repetitions, the repeated game has a unique Subgame Perfect Equilibrium, in which the Nash equilibrium is played at every stage.

However, when the stage game has multiple equilibria, the repeated game may have Subgame Perfect equilibria that do not correspond to equilibria of the one-shot game.

Example: Cabral

                     Player 2
                   L       C       R
Player 1    U     5,5     3,6     0,0
            M     6,3     4,4     0,0
            D     0,0     0,0     1,1

There are two Nash equilibria in the one-shot game: (M,C) and (D,R). However, the outcome (5,5) may be sustained as an equilibrium if the game is repeated twice (or more), by the following strategies:

Player 1: at stage 1, play U. At the second stage, play M if (U,L) was played in the first stage; otherwise, play D.

Player 2: at stage 1, play L.
At the second stage, play C if (U,L) was played in the first stage; otherwise, play R.

Clearly, at the second stage, the strategies constitute a Nash equilibrium in both subgames. Moreover, the payoff from equilibrium play is 9 (= 5 + 4) for both players, while the maximum achievable from any other strategy is 7 (= 6 + 1). Hence, the strategies constitute a Nash equilibrium of the overall game also.

Note that the threat of triggering a 'bad' equilibrium disciplines players. In other words, threats and promises may affect future behaviour. But are they credible in the above example? When reaching the second stage, would it not be reasonable to play (M,C) – since this payoff dominates (D,R) – whatever happened at the first stage; in particular, should the players not 'renegotiate' after a first-stage deviation? If they do, however, (U,L) cannot be sustained as an equilibrium in the first stage. One implication is that threats and promises are credible only if they are not based on payoff-dominated outcomes (see Gibbons, 87-88).

How would payoff-maximising equilibrium strategies look in a corresponding game with T > 2 repetitions?

Infinite repetitions

Example: the Prisoners' Dilemma game. Suppose:
- the outcome of stage t-1 is observed before stage t begins;
- players discount payoffs with factor δ ∈ [0,1].

Denoting the stage payoff to player i by π_it, and taking the current period to be period 0, total discounted profits become

Π_i = Σ_{t=0}^{∞} δ^t π_it.

Consider the following set of strategies:

Player 1: at the first stage, play U. Continue to play U so long as (U,L) has been played in all previous stages; otherwise, play D.

Player 2: at the first stage, play L. Continue to play L so long as (U,L) has been played in all previous stages; otherwise, play R.

Comparison of payoffs: payoffs along the equilibrium path – i.e. play of (U,L) in every stage – are 0. The payoff to Player 1 from deviating to D in the current round is

Π_1 = 1 + Σ_{t=1}^{∞} δ^t (−1) = 1 − δ/(1 − δ) = (1 − 2δ)/(1 − δ).

This is the same as what Player 2 would obtain from playing R in the current round. The critical discount factor that makes (U,L) a Subgame Perfect Equilibrium is δ̄ = 0.5; that is, (U,L) can be sustained as an equilibrium if and only if δ ≥ δ̄ = 0.5.

Folk theorem: when players are sufficiently patient, any feasible combination of payoffs that yields each player at least as much as in some Nash equilibrium of the one-shot game may be sustained as the average payoff of the infinitely repeated game.

Conclusion: in infinitely repeated games there may be equilibria that do not correspond to equilibria of the one-shot game, even if the one-shot game has a unique equilibrium.

Interpretation of the discount factor: suppose the interest rate is r and that with probability p the game stops after the current period. Then the discount factor may be written

δ = (1 − p)/(1 + r).

In other words, infinite repetitions may be associated with the case in which there is no fixed end date.

Perfect and imperfect information

Any game – whether static or dynamic – may be represented in both Normal Form and Extensive Form. Consider the following Normal-Form representation:

                     Player 2
                    L        R
Player 1    U      0,2     -2,1
            D     1,-2    -1,-1

The game has a unique Nash equilibrium, (D,R) (this solution may also be found by iterated elimination of dominated strategies).

We may formulate the simultaneous-move game in extensive form by using the concept of information sets. An information set for a player is a collection of decision nodes satisfying (i) the player has the move at every node in the information set, and (ii) when play of the game reaches a node in the information set, the player with the move does not know which node in the information set has been reached. In the game tree, we may indicate that a collection of decision nodes constitutes an information set by connecting the nodes with a line (alternatively, by drawing a circle around these nodes).
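The iterated elimination of dominated strategies mentioned above can be sketched mechanically. A minimal Python sketch, with the payoffs of the Normal-Form table above; the helper names are illustrative:

```python
# Iterated elimination of strictly dominated strategies in the 2x2
# simultaneous-move game above.
# payoffs[(a1, a2)] = (payoff to Player 1, payoff to Player 2)
payoffs = {
    ("U", "L"): (0, 2), ("U", "R"): (-2, 1),
    ("D", "L"): (1, -2), ("D", "R"): (-1, -1),
}

def pay(player, own, opp):
    # Payoff to `player` (0 or 1) when playing `own` against `opp`.
    pair = (own, opp) if player == 0 else (opp, own)
    return payoffs[pair][player]

def find_dominated(own_strats, opp_strats, player):
    # Return a strategy of `player` that is strictly dominated, if any:
    # some alternative t does strictly better against every surviving
    # opponent strategy.
    for s in own_strats:
        for t in own_strats:
            if t != s and all(pay(player, t, o) > pay(player, s, o)
                              for o in opp_strats):
                return s
    return None

strats = [["U", "D"], ["L", "R"]]
eliminated = True
while eliminated:
    eliminated = False
    for player in (0, 1):
        s = find_dominated(strats[player], strats[1 - player], player)
        if s is not None:
            strats[player].remove(s)
            eliminated = True

print(strats)  # [['D'], ['R']]
```

The procedure first removes U (strictly dominated by D for Player 1), after which L becomes strictly dominated by R for Player 2, leaving the unique Nash equilibrium (D,R).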
Figure: game tree

Consider then the dynamic version of the game, in which Player 1 moves first.

Figure: game tree

In this game there is a unique Subgame Perfect Equilibrium strategy profile, (U,(L,R)), where the formulation (L,R) indicates that Player 2's strategy is contingent; that is, Player 2 chooses L if Player 1 chooses U, and R if Player 1 chooses D. The strategy profile (D,R) is a Nash Equilibrium, but not a Subgame Perfect Equilibrium.

This game captures the idea that commitment is valuable: by choosing U first, Player 1 forces Player 2 to play L.

Perfect information may be defined to mean that at each move in the game the player with the move knows the full history of the play of the game thus far. Equivalently, perfect information means that every information set is a singleton. Imperfect information then means that there is at least one non-singleton information set in the game. A dynamic game of complete but imperfect information can be represented in extensive form by using non-singleton information sets to indicate what each player knows (and does not know) when he or she has the move.
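The claim that (D,R) is a Nash Equilibrium but not Subgame Perfect can be checked mechanically by enumerating Player 2's four contingent strategies (a reply to U and a reply to D). A minimal Python sketch of this check; the function names are illustrative:

```python
from itertools import product

# Sequential version of the game above: Player 1 chooses U or D,
# Player 2 observes the choice and then picks L or R.
payoffs = {
    ("U", "L"): (0, 2), ("U", "R"): (-2, 1),
    ("D", "L"): (1, -2), ("D", "R"): (-1, -1),
}

p1_actions = ["U", "D"]
# A strategy for Player 2 maps each of Player 1's actions to a reply,
# e.g. {"U": "L", "D": "R"} is the contingent strategy written (L,R).
p2_strategies = [dict(zip(p1_actions, replies))
                 for replies in product(["L", "R"], repeat=2)]

def outcome(a1, s2):
    return payoffs[(a1, s2[a1])]

def is_nash(a1, s2):
    # Player 1 cannot gain by switching actions...
    if any(outcome(b, s2)[0] > outcome(a1, s2)[0] for b in p1_actions):
        return False
    # ...and Player 2 cannot gain by switching to any other full strategy.
    return not any(outcome(a1, t2)[1] > outcome(a1, s2)[1]
                   for t2 in p2_strategies)

def is_subgame_perfect(a1, s2):
    # Player 2's reply must be optimal at every node, on and off the path.
    sequentially_rational = all(
        payoffs[(a, s2[a])][1] == max(payoffs[(a, r)][1] for r in ["L", "R"])
        for a in p1_actions)
    return sequentially_rational and is_nash(a1, s2)

for a1 in p1_actions:
    for s2 in p2_strategies:
        if is_nash(a1, s2):
            label = "SPE" if is_subgame_perfect(a1, s2) else "NE only"
            print(a1, (s2["U"], s2["D"]), label)
```

The enumeration yields two Nash equilibria: (U,(L,R)), which is Subgame Perfect, and the profile in which Player 1 plays D and Player 2 plays R at both nodes, which fails sequential rationality at the (off-path) node following U.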