OR 3: Chapter 10 - Infinitely Repeated Games

OR 3: Chapter 10 - Infinitely Repeated Games Recap In the previous chapter: • We defined repeated games; • We showed that a sequence of stage Nash games would give a subgame perfect equilibria; • We considered a game, illustrating how to identify equilibria that are not a sequence of stage Nash profiles. In this chapter we'll take a look at what happens when games are repeatedly infinitely. Discounting To illustrate infinitely repeated games (𝑇 → ∞) we will consider a Prisoners dilemma as our stage game. ( (2,2) (0,3) ) (3,0) (1,1) Let us denote 𝑠𝐶 as the strategy "cooperate at every stage". Let us denote 𝑠𝐷 as the strategy "defect at every stage". If we assume that both players play 𝑠𝐶 their utility would be: ∞ 𝑈1 (𝑠𝐶 , 𝑠𝐶 ) = 𝑈2 (𝑠𝐶 , 𝑠𝐶 ) = ∑ 2 > ∞ 𝑖=1 Similarly: ∞ 𝑈1 (𝑠𝐷 , 𝑠𝐷 ) = 𝑈2 (𝑠𝐷 , 𝑠𝐷 ) = ∑ 1 > ∞ 𝑖=1 It is impossible to compare these two strategies. To be able to carry out analysis of strategies in infinitely repeated games we make use of a discounting factor 0 < 𝛿 < 1. The interpretation of 𝛿 is that there is less importance given to future payoffs. One way of thinking about this is that "the probability of recieveing the future payoffs decreases with time". In this case we write the utility in an infinitely repeated game as: ∞ 𝑈𝑖 (𝑟, 𝑠) = ∑ 𝛿 𝑡−1 𝑢𝑖 𝑟(𝑡), 𝑠(𝑡) 𝑡=1 Thus: ∞ 𝑈1 (𝑠𝐶 , 𝑠𝐶 ) = 𝑈2 (𝑠𝐶 , 𝑠𝐶 ) = 2 ∑ 𝛿 𝑡−1 = 2/(1 − 𝛿) 𝑖=1 and: ∞ 𝑈1 (𝑠𝐷 , 𝑠𝐷 ) = 𝑈2 (𝑠𝐷 , 𝑠𝐷 ) = ∑ 𝛿 𝑡−1 = 1/(1 − 𝛿) 𝑖=1 Conditions for cooperation in Prisoner's Dilemmas Let us consider the "Grimm trigger" strategy (which we denote 𝑠𝐺 ): "Start by cooperating until your opponent defects at which point defect in all future stages." If both players play 𝑠𝐺 we have 𝑠𝐺 = 𝑠𝐶 : 𝑈1 (𝑠𝐺 , 𝑠𝐺 ) = 𝑈1 (𝑠𝐺 , 𝑠𝐺 ) = 2/(1 − 𝛿) If we assume that 𝑆1 = 𝑆2 = {𝑠𝐶 , 𝑠𝐷 , 𝑠𝐺 } and player 2 deviates from 𝑆𝐺 at the first stage to 𝑠𝐷 we get: ∞ 𝑈2 (𝑠𝐶 , 𝑠𝐺 ) = 3 + ∑ 𝛿 𝑡−1 = 3 + 𝛿/(1 − 𝛿) 𝑡=2 Deviation from 𝑠𝐺 to 𝑠𝐷 is rational if and only if: 2/(1 − 𝛿) < 3 + 𝛿/(1 − 𝛿) ⇔ 𝛿 < 1/2 thus if 𝛿 is large enough (𝑠𝐺 , 𝑠𝐺 ) is a Nash equilibrium. Importantly (𝑠𝐺 , 𝑠𝐺 ) is not a subgame perfect Nash equilibrium. Consider the subgame following (𝑟1 , 𝑠2 ) having been played in the first stage of the game. Assume that player 1 adhers to 𝑠𝐺 : 1. If player 2 also plays 𝑠𝐺 then the first stage of the subgame will be (𝑟2 , 𝑠1 ) (player 1 punishes while player 2 sticks with 𝑠1 as player 1 played 𝑟1 in previous stage). All subsequent plays will be (𝑟2 , 𝑠2 ) so player 2's utility will be: ∞ 0 + ∑ 𝛿 𝑡−1 = 𝛿/(1 − 𝛿) 𝑡=2 2. If player 2 deviates from 𝑠𝐺 and chooses to play 𝐷 in every period of the subgame then player 2's utility will be: ∞ ∑ 𝛿 𝑡−1 = (1 − 𝛿) 𝑡=1 which is a rational deviation (as 0 < 𝛿 < 1). Two questions arise: 1. Can we always find a strategy in a repeated game that gives us a better outcome than simply repeating the stage Nash equilibria? (Like 𝑠𝐺 ) 2. Can we also find a strategy with the above property that in fact is subgame perfect? (Unlike 𝑠𝐺 ) Folk theorm The answer is yes! To prove this we need to define a couple of things. Definition of an average payoff If we interpret 𝛿 as the probability of the repeated game not ending then the average length of the game is: 𝑇= 1 1−𝛿 We can use this to define the average payoffs per stage: 1 𝑇 𝑈𝑖 (𝑟, 𝑠) = (1 − 𝛿)𝑈𝑖 (𝑟, 𝑠) This average payoff is a tool that allows us to compare the payoffs in an infitely repeated game to the payoff in a single stage game. Definition of individually rational payoffs Individually rational payoffs are average payoffs that exceed the stage game Nash equilibrium payoffs for both players. As an example consider the plot corresponding to a repeated Prisoner's Dilemma. Convex hull of payoffs to a prisoners dilemma. The feasible average payoffs correspond to the feasible payoffs in the stage game. The individually rational payoffs show the payoffs that are better for both players than the stag Nash equilibrium. The following theorem states that we can choose a particular discount rate that for which there exists a subgame perfect Nash equilibrium that would give any individually rational payoff pair! Folk Theorem for infinetely repeated games Let (𝑢1∗ , 𝑢2∗ ) be a pair of Nash equilibrium payoffs for a stage game. For every individually rational pair (𝑣1 , 𝑣2 ) there exists 𝛿 such that for all 1 > 𝛿 > 𝛿 > 0 there is a subgame perfect Nash equilibrium with payoffs (𝑣1 , 𝑣2 ). Proof Let (𝜎1∗ , 𝜎2∗ ) be the stage Nash profile that yields (𝑢1∗ , 𝑢2∗ ). Now assume that playing 𝜎1 ∈ 𝛥𝑆1 and 𝜎2 ∈ 𝛥𝑆2 in every stage gives (𝑣1 , 𝑣2 ) (an individual rational payoff pair). Consider the following strategy: "Begin by using 𝜎𝑖 and continue to use 𝜎𝑖 as long as both players use the agreed strategies. If any player deviates: use 𝜎𝑖∗ for all future stages." We begin by proving that the above is a Nash equilibrium. Without loss of generality if player 1 deviates to 𝜎1 ʹ ∈ 𝛥𝑆1 such that 𝑢1 (𝜎1 ʹ, 𝜎2 ) > 𝑣1 in stage 𝑘 then: 𝑘−1 (𝑘) 𝑈1 = ∑𝛿 𝑘 𝑡−1 𝑣1 + 𝛿 𝑘−1 𝑢1 (𝜎1 ʹ, 𝜎2 ) + 𝑡=1 𝑢1∗ ( 1 − ∑ 𝛿 𝑡−1 ) 1−𝛿 𝑡=1 Recalling that player 1 would receive 𝑣1 in every stage with no devitation, the biggest gain to be made from deviating is if player 1 deviates in the first stage (all future gains are more 𝑣1 (1) heavily discounted). Thus if we can find 𝛿 such that 𝛿 > 𝛿 implies that 𝑈1 ≤ 1−𝑞 then player 1 has no incentive to deviate. 𝛿 1−𝛿 (1 − 𝛿)𝑢1 (𝜎1 ʹ, 𝜎2 ) + 𝑢1∗ 𝛿 𝑢1 (𝜎1 ʹ, 𝜎2 ) − 𝑣1 (1) 𝑈1 = 𝑢1 (𝜎1 ʹ, 𝜎2 ) + 𝑢1∗ 𝑣1 1−𝛿 ≤ 𝑣1 ≤ 𝛿(𝑢1 (𝜎1 ʹ, 𝜎2 ) − 𝑢1∗ ) ≤ 𝑢 (𝜎 ʹ,𝜎 )−𝑣 as 𝑢1 (𝜎1 ʹ, 𝜎2 ) > 𝑣1 > 𝑢1∗ , taking 𝛿 = 𝑢 1(𝜎1ʹ,𝜎 2)−𝑢1∗ gives the required required result for 1 1 2 1 player 1 and repeating the argument for player 2 completes the proof of the fact that the prescribed strategy is a Nash equilibrium. By construction this strategy is also a subgame perfect Nash equilibrium. Given any history both players will act in the same way and no player will have an incentive to deviate: • If we consider a subgame just after any player has deviated from 𝜎𝑖 then both players use 𝜎𝑖∗ . • If we consider a subgame just after no player has deviated from 𝜎𝑖 then both players continue to use 𝜎𝑖 .

OR 3: Chapter 10 - Infinitely Repeated Games

Related documents

Products

Support

OR 3: Chapter 10 - Infinitely Repeated Games

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib