Some Problems from Chapter 13

Final Lecture
Thoughts on subgame perfection?
“Life can only be understood backwards; but it
must be lived forwards.”
Søren Kierkegaard
Some Problems from Chapter 12
Problem 1, Chapter 12
Find a separating equilibrium for this game
Equilibrium in Signaling games
• In this signaling game, Player 1 is the sender,
Player 2 is the receiver.
• In a Bayes’ Nash equilibrium for a signaling
game, we need to specify the receiver’s
beliefs.
• Then we check whether when receiver takes
action based on these beliefs, the outcome is
consistent with these beliefs.
Getting started
• Since player 1 can be one of two types and
there are two possible actions A and B for
player 1, there are only two possible strategies
for player 1 that result in separating equilibria.
These are
– Choose A if type s and B if type t
– Choose B if type s and A if type t
• Let’s see if either or both of these strategies
“works”.
Strategies and beliefs
• Let’s see if we can find beliefs for the receiver
(player 2) that make for a separating
equilibrium where player 1 plays A if type s
and B if type t.
• Recall that Player 2 sees what player 1 played,
but does not see his type.
• But if senders are using the above strategy,
then Player 2 believes that those who play A
are type s and those who play B are type t.
Best responses for 2
• Then if player 2 sees action A, he believes that
player 1 is type s and his best response given
his beliefs is to take action y.
• If player 2 sees action B, he believes that
player 1 is type t and his best response given
these beliefs is to take action x.
Best responses for 1
• If Player 1 believes that player 2 plays y when he
sees action A and x when he sees action B, what
will Player 1 do?
• Look at the payoffs. If Player 1 is a type s, then he
would rather that Player 2 play y than x. If he is
of type t, he would rather Player 2 play x than y.
• So his best response to the way player 2 responds
to messages is to send message A when he is type
s and B when he is type t.
Beliefs confirmed
• So we see that if the receiver believes that
sender will send message A if he is type s and
send message B if he is of type t, then in the
resulting Nash equilibrium, the receiver’s
beliefs are confirmed. This is what happens.
Another separating equilibrium?
• Suppose Player 2 believes that Player 1 will
send message B if he is type s and A if he is
type t.
– Then if Player 2 sees message A, he believes he is
playing a type t and his best response is x
– If he sees message B, his best response is y.
• But type s wants player 2 to play y and type t
wants player 2 to play x.
• So what is Player 1’s strategy?
Beliefs not confirmed
• Suppose that Player 2 believes that Player 1 will
send message B if he is type s and A if he is type t.
• Then we have shown that when Player 2 acts
according to these beliefs, the best strategy for
Player 1 is send message A if he is type s and B if
he is type t.
• So these beliefs are not confirmed. There is not a
separating equilibrium in which Player 2 has
these beliefs.
Chapter 12, Problem 2
Nature determines Player 1’s type, which is either
t=-1,t= 1,t=2, or t=3, each with probability ¼.
Sender learns his type and sends one of three
possible messages, bumpy, smooth or slick.
Receiver observes message (but not type) and
chooses one of three actions: a=0, a=5, or a=10.
If sender is type t and receiver takes action a,
payoff of sender is a×t and payoff of receiver is
2a×t. (Typo in textbook: last word should be
“action”, not “payoff”.)
Separating equilibrium
• Part a) asks “Find a separating perfect Bayes Nash
equilibrium”.
Answer: There isn’t one. There couldn’t be, since
in a separating equilibrium each type takes a
different action. But there are 4 types and only 3
messages you can send.
I think the author should have asked the question in
the form: “Is there a separating PBNE? If so, find
it.”
Semi-separating PBNE
• You might be able to guess that it will be fairly
easy to separate the type t=-1 from the other
types.
• Notice that this type and only this type wants
the receiver to take action 0.
• Note also that the receiver will want to take
action 0 if and only if the sender’s type is t=-1.
Let’s try this
• Start with receiver’s beliefs. Suppose receiver
believes that sender’s strategy is
– Say “bumpy” if you are of type -1 and say
“smooth” or “slick” if you are of type 1, 2, or 3.
• If receiver hears “bumpy” and believes that
those who say “bumpy” are type -1, then his
best response is 0. If receiver hears
“smooth” or “slick”, his best response is 10.
Sender’s response
• Suppose that sender believes that receiver’s
strategy is
– If sender says “bumpy”, take action 0, if sender says
“smooth” or “slick”, take action 10.
• Type -1 senders want receiver to take action 0.
Other types of senders want receivers to take
action 10.
• So given receiver’s strategy, best response of
sender is
– Say “bumpy” if type=-1, say “smooth” or “slick” if
type=1,2, or 3.
Beliefs confirmed
• So we see that if receiver believes that sender
will say “bumpy” if of type -1 and otherwise
will say “smooth” or “slick”, then in the
resulting Nash equilibrium, the receiver’s
beliefs about how senders behave will be
confirmed.
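The belief check above can be sketched in a few lines of Python. This is only an illustration: the uniform posterior over types 1, 2, 3 after “smooth” or “slick” is an assumption (any posterior on those three types gives the same best response), and the function names are made up for this sketch.

```python
from fractions import Fraction

TYPES = [-1, 1, 2, 3]
ACTIONS = [0, 5, 10]

def sender_payoff(a, t):
    return a * t

def receiver_payoff(a, t):
    return 2 * a * t

def receiver_best_action(belief):
    """belief maps each type to its posterior probability."""
    def expected(a):
        return sum(p * receiver_payoff(a, t) for t, p in belief.items())
    return max(ACTIONS, key=expected)

# Posteriors implied by the proposed sender strategy.
belief_bumpy = {-1: Fraction(1)}
# Assumption: uniform posterior over the remaining types.
belief_other = {t: Fraction(1, 3) for t in (1, 2, 3)}

response = {
    "bumpy": receiver_best_action(belief_bumpy),
    "smooth": receiver_best_action(belief_other),
    "slick": receiver_best_action(belief_other),
}

def sender_best_messages(t):
    """Messages that maximize the payoff of a type-t sender, given response."""
    best = max(sender_payoff(a, t) for a in response.values())
    return sorted(m for m, a in response.items() if sender_payoff(a, t) == best)

for t in TYPES:
    print(t, sender_best_messages(t))
```

Type -1 ends up preferring “bumpy” and every other type is indifferent between “smooth” and “slick”, which matches the beliefs the receiver started with.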
Some Problems from Chapter 13
Problem 7 (the doctors)
• N doctors share a practice and share all
income from it. Doctor can exert effort level,
1,2,…10. Profit of firm is 2(e1+e2+…en) where
ei is effort level of Dr. i.
• Payoff to Dr. i is (1/n) 2(e1+e2+…en) –ei
• What is a Nash equilibrium?
• What would be a best cooperative outcome?
Clicker question
In the stage game, what is the Nash equilibrium
effort level for each doctor?
A) 1
B) 2
C) 5
D) 10
E) There is no pure strategy Nash equilibrium.
Another question
• If one doctor believed that all other doctors
would follow her example and work just as
hard as she does, what effort level would this
doctor choose?
A) 1
B) 3
C) 5
D) 10
• When payoff to each Dr. i is
• (1/n) 2(e1+e2+…en) –ei
• What is the payoff to each doctor if all doctors
choose effort level e*?
A) ne*
B) 0
C) e*/2
D) e*
E) e*/n
Incentivizing with a grim trigger
• Suppose they all use the following grim trigger
strategy. Work at effort level e*>1 so long as
all others work that hard. If anybody works
less hard at any time, then provide e=1 in all
future periods.
• For what discount rates will this strategy be a
SPNE?
Let’s try e*=10 and n=3.
• The most interesting of the SPNE is the one
where everybody works at e*=10.
• Let’s do this one first.
• Grim trigger strategy: Do e=10 so long as
everybody else does e=10. If anybody ever
works less, revert to e=1.
Payoffs
• If everybody plays this strategy, all work at
e=10 in all periods.
– They all get payoffs of 10 in each period for
expected payoff of 10(1+d+d^2+…)=10/(1-d)
• If somebody works less than 10, say e=1 in the
first period, her payoff in period 1 is
– 2×(21/3)-1=13
– Her payoff in all future periods would then be 1.
Comparisons
• Expected value from playing grim trigger with
effort 10 is 10/(1-d)
• Expected payoff from defecting in first period is
13+1×(d+d^2+d^3+…)=13+d/(1-d)
• Grim trigger is NE if
10/(1-d)>13+d/(1-d)
This is the case when 10>13(1-d)+d, which implies
d>1/4.
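As a rough numerical check of this comparison, the two payoff streams can be summed directly. This is a sketch only: the helper names are hypothetical, and the infinite sums are truncated at 500 periods, which is accurate here because d is well below 1.

```python
def stream_value(first, later, d, periods=500):
    """Value of `first` today plus `later` in every following period (truncated sum)."""
    return first + sum(later * d**t for t in range(1, periods))

def cooperation_pays(d):
    """True if working at e=10 forever beats shirking once and then getting 1 forever."""
    stay = stream_value(10, 10, d)      # approximately 10/(1-d)
    shirk = stream_value(13, 1, d)      # approximately 13 + d/(1-d)
    return stay > shirk

for d in (0.2, 0.3):
    print(d, cooperation_pays(d))
```

Discount factors on either side of 1/4 give opposite answers, in line with the algebra.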
For n=3 and e=e*>1
• We can show e=e*>1 can be sustained by a
grim trigger strategy with 3 doctors so long as
d>1/4.
Now let’s try e*=5 and n=3
• Consider grim trigger strategy: Work at effort
level 5 so long as nobody shirks and work at
level 1 if anybody works less.
• Expected payoff to each player if everybody
does this is 5 in every period and 5/(1-d) over
whole game.
• If you deviate and work at level 1, your payoff
in first period is 2×(11/3)-1=19/3 and then you
would get 1 in all future periods.
Comparison
• Grim trigger sustains effort level 5 if
5/(1-d)>(19/3)+d/(1-d)
This is equivalent to d>1/4.
General e*>1 and n=3
• If all play this strategy, they each get a payoff
of 2(3e*)/3-e*=e* in each stage game.
• Discounted total payoff would be e*/(1-d)
• If somebody provides only e=1 in the first
period, the best that Dr. could get is
2(2e*+1)/3-1=(4e*-1)/3 in the first period and
2(3)/3-1=1 in all future periods.
The discounted total value of this stream is
(4e*-1)/3+d/(1-d).
Comparison
• If the other players are playing the grim trigger
strategy sustaining e*, then playing that
strategy will be a best response if
e*/(1-d)>(4e*-1)/3+d/(1-d).
• This is equivalent to d>1/4
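The general claim can be checked for every effort level with exact arithmetic. The sketch below assumes n=3 and uses the rearrangement d > (deviate - cooperate)/(deviate - punish), which follows from multiplying the comparison through by (1-d); the function names are illustrative, not from the textbook.

```python
from fractions import Fraction

EFFORTS = range(1, 11)  # effort levels 1, 2, ..., 10

def payoff(own, others_total, n):
    """Stage payoff (1/n)*2*(total effort) - own effort."""
    return Fraction(2 * (own + others_total), n) - own

def best_response(others_total, n):
    """Effort level that maximizes the stage payoff against the others' total."""
    return max(EFFORTS, key=lambda e: payoff(e, others_total, n))

def threshold(e_star, n):
    """Critical d for a grim trigger sustaining e_star > 1 with reversion to e=1."""
    others = (n - 1) * e_star
    cooperate = payoff(e_star, others, n)
    deviate = payoff(best_response(others, n), others, n)
    punish = payoff(1, n - 1, n)
    return (deviate - cooperate) / (deviate - punish)

print(best_response(20, 3))                       # stage-game best response
print({e: threshold(e, 3) for e in range(2, 11)})  # critical d for each e*
```

With three doctors the best response is always the minimum effort, and the threshold comes out the same for every e* between 2 and 10.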
Part (b)
• What if doctors can only see total done by
others, but not what each individual did?
• Same grim trigger works, just keys on the
total.
Problem 7 (old edition)
The stage game:
• Payoff to player 1 is V1(x1,x2)=5+x1-2x2
• Payoff to player 2 is V2(x1,x2)=5+x2-2x1
• Strategy set for each player is the interval [1,4]
What is a Nash equilibrium for the stage game?
A) Both players choose 4
B) Both players choose 3
C) Both players choose 2
D) Both players choose 1
E) There is no pure strategy Nash equilibrium
Part b (i)
• If the strategy set is X={2,3}, when is there a
subgame perfect Nash equilibrium in which both
players play a “grim strategy”: always play 2 so
long as nobody has ever played anything else, but
play 3 forever if anyone ever plays 3?
• Note that “both play 3” is the only N.E. for the
stage game.
• Compare payoff v(2,2) forever with payoff v(3,2)
in first period, then v(3,3) ever after.
• That is, compare 3 forever with 4 in the first
period and then 2 forever.
Payoff if both play 2 always
• Payoff in stage game to either player if both
play 2 is
V(2,2)=5+2-2×2=3.
• Expected payoff to each if both play 2 forever
is 3(1+d+d^2+…)=3/(1-d)
Payoff from playing 3
• If you play 3 in period 1 where the other player
plays 2, and if in all future periods the other
player plays 3, the best you can do after period 1
is play 3.
• Expected payoff in first period would be
V(3,2)=5+3-2×2=4
• Expected payoff in future periods where play
continues would be V(3,3)=5+3-2×3=2.
• Total expected payoff is then
4+2(d+d^2+d^3+…)=4+2d/(1-d).
Comparison
• If you play the grim trigger strategy, play 2 so
long as nobody has played 3, but play 3 forever if
anybody ever plays 3, your payoff is the
discounted value of 3 forever: 3/(1-d)
• If the other player is playing this grim trigger
strategy, then the best you can get by playing
something else is 4+2d/(1-d).
• This grim trigger is a SPNE if 3/(1-d)>4+2d/(1-d)
• Equivalently, 3>4(1-d)+2d, or d>1/2.
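A small script can confirm both the stage-game equilibrium and the cutoff for X={2,3}. It is a sketch under the payoff function stated above, V(own, other)=5+own-2×other; the helper names are invented for illustration.

```python
from fractions import Fraction
from itertools import product

def v(own, other):
    """Stage payoff to a player choosing `own` when the opponent chooses `other`."""
    return 5 + own - 2 * other

def stage_nash(actions):
    """All pure-strategy Nash equilibria of the stage game on a finite action set."""
    equilibria = []
    for x1, x2 in product(actions, repeat=2):
        best1 = v(x1, x2) == max(v(a, x2) for a in actions)
        best2 = v(x2, x1) == max(v(a, x1) for a in actions)
        if best1 and best2:
            equilibria.append((x1, x2))
    return equilibria

cooperate = v(2, 2)  # both play 2
deviate = v(3, 2)    # play 3 against an opponent playing 2
punish = v(3, 3)     # both play 3 forever after
# cooperate/(1-d) > deviate + d*punish/(1-d) rearranges to d > this ratio.
d_star = Fraction(deviate - cooperate, deviate - punish)

print(stage_nash([2, 3]), cooperate, deviate, punish, d_star)
```

The same `stage_nash` helper can be pointed at other finite action sets, for example `[1, 2, 3, 4]`, to explore the clicker question.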
Part b(ii) X=[1,4]
• When is there a subgame perfect equilibrium
where everybody does y so long as nobody has
ever done anything differently and everybody
does z>y if anyone ever does anything other than
y?
• First of all, it must be that z=4, because actions
after a violation must be Nash for stage game.
• When is it true that getting V(y,y) forever is better
than getting V(4,y) in the first period and then
V(4,4) forever?
Comparison
• V(y,y) forever is worth V(y,y)/(1-d)=(5-y)/(1-d)
• V(4,y) and then V(4,4) forever is worth
9-2y+d+d^2+…=9-2y+d/(1-d), since V(4,y)=5+4-2y
and V(4,4)=1.
• Works out that (5-y)/(1-d)>9-2y+d/(1-d) if
d(8-2y)>4-y, that is, if d>1/2 whenever y<4.
– (Of course the problem also requires 1≤y≤4.)
– Notice that the required discount factor does not
depend on y: any y<4, including y=1, can be
sustained forever if d>1/2.
Problem 2, Chapter 13
Exploring the problem
• Note that the strategy profile {c, x} yields the
highest total payoff for the two players and
payoff is equally divided.
• Is this a Nash equilibrium? Why not?
• What are the Nash equilibria?
• Can we sustain repeated play of c, x by
subgame perfect grim trigger strategies that
revert to a not-so-good Nash equilibrium if
anyone fails to play c or x?
Best Responses and the Four Nash Equilibria
Question 2, Part a
• When is there a SPNE where:
– Player 1 Plays strategy Cdgrim; chooses c so long as
all previous play is c,x but moves to d forever if
Player 2 ever plays anything but x
– Player 2 Plays strategy Xygrim: Choose x so long as all
previous play is c,x but moves to y forever if Player 1
ever plays anything but c.
Checking for SPNE
• If Player 2 plays Strategy Xygrim, Player 1’s
payoff from playing Cdgrim is 7 in every period so
long as the game lasts. Expected payoff from this
strategy is 7/(1-d).
• If Player 1 plays anything other than c at any time,
on every later play, Player 2 will play y.
• Best possibility for Player 1 would be to play b and
then d forever. Expected payoff from this strategy
is 8+6d/(1-d).
• Note that once 1 has ticked off 2, 2 will always
play y and d is a best response to y. And
perpetual y is a best response to perpetual d.
Comparing
• Sticking with strategy Cdgrim and continuing
to play C is better than any other play if
7/(1-d)>8+6d/(1-d)
This implies 7>8(1-d) +6d, which implies that
d>1/2.
Other SPNE
Grim trigger strategies that revert to other Nash
equilibria are also SPNE for sufficiently large d.
For example, suppose Player 1 reverts to b
forever and 2 reverts to w forever if anyone fails
to do c or x.
This works if 7/(1-d)>8+3d/(1-d). Equivalently
d>1/5.
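The two cutoffs can be compared side by side. The sketch below uses only the payoffs quoted on these slides (7 from cooperating, 8 from the best one-shot deviation, 6 or 3 in the punishment equilibrium); the helper name `grim_threshold` is hypothetical.

```python
from fractions import Fraction

def grim_threshold(cooperate, deviate, punish):
    """Critical d: cooperate/(1-d) > deviate + d*punish/(1-d) iff d exceeds this."""
    return Fraction(deviate - cooperate, deviate - punish)

COOPERATE = 7   # payoff from (c, x)
DEVIATE = 8     # Player 1's payoff from playing b against x
PUNISH = {"revert to (d, y)": 6, "revert to (b, w)": 3}

thresholds = {name: grim_threshold(COOPERATE, DEVIATE, p) for name, p in PUNISH.items()}
for name, d_star in thresholds.items():
    print(name, d_star)
```

The harsher the punishment equilibrium, the lower the discount factor needed to sustain cooperation.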
Part b of question 2
• Don’t worry about working this one. It
involves an intricate pattern of responses that
is hard to follow and in my opinion not worth
the effort required to work it out.
Getting cooperation in finite games
• Chapter 13, Problem 3 illustrates an important
and interesting idea.
• In games that have two (or more) Nash
equilibria, it is sometimes possible to get
cooperation in early rounds.
• The idea. Although we must end up at a Nash
equilibrium, the course of play can determine
which one we wind up at.
Chapter 13, Problem 3
• We play the stage game from Problem 2
repeatedly, but only 3 times. Show that some
“cooperative behavior” can be sustained in
Nash equilibrium.
• This game has more than one Nash
equilibrium and one is better for both than
the others.
• This is what gives us a shot.
What we learned before.
• If the stage game has only one Nash equilibrium,
then a game consisting of a finite number of
repetitions has only one SPNE
• In this equilibrium, everybody always plays the
Nash equilibrium action from the stage game.
• When there is more than one N.E. for the stage
game, we can use the threat of reverting to the
worse Nash equilibrium to incentivize good
behavior in early rounds.
Proposed SPNE
• Player 1: Strategy A1-- Play c in period 1 and c
in period 2 if other played x in period 1.
Otherwise play b in periods 2 and 3. If Player
2 plays x in both rounds 1 and 2, then play d
in round 3.
• Player 2: Strategy A2-- Play x in period 1 and x
in period 2 if other played c in period 1.
Otherwise play w in periods 2 and 3. If Player
1 plays c in periods 1 and 2, play y in period 3.
Checking that A1,A2 is a SPNE
• Let’s work backwards. For each possible course of play in
first two rounds, the third round is a regular subgame. Play
in each of these subgames must be a N.E. One of these
subgames occurs where 1 has played c twice and 2 has
played x twice. Strategies A1 and A2 have player 1 play d
and player 2 play y in this case. This is a Nash equilibrium.
• In other subgames for last play, someone has done
something other than c or x. In this case, strategies A1 and
A2 prescribe b for 1 and w for 2. This is a Nash equilibrium
as well.
• So the A1 and A2 prescribe Nash equilibria for all of the
“last play” subgames.
Subgames after first play
• After the first play of the game, there are 25
different regular subgames corresponding to
different actions on first play by the players.
• If on the first play, Player 1 did c and Player 2
did x, then if 1 follows A1 and 2 follows A2,
they will play c and x on second round and d
and y on third round. They will each get payoff
7+7+6=20.
• Could Player 1 do better in this subgame?
• The best deviation from strategy A1 for Player 1
would be to play b rather than c at this point.
Why?
• If Player 1 plays c on round 1 and b on round 2
and player 2 is playing A2, then Player 2 will play
x on rounds 1 and 2 and w on round 3.
• Best Player 1 can do then is to play b on round 3
and get total payoff 7+8+3=18
• Since playing A1 gives him 20>18, A1 prescribes
Nash equilibrium play on this subgame.
What about the other 24 subgames
after first round?
• In the other subgames after the first round,
somebody has played something other than c
or x.
• In this case, if Player 2 is playing A2, Player 2
will play w in the next two rounds.
• If Player 2 is playing w in next two rounds,
best response for Player 1 is to play b in next
two rounds, which is what Strategy A1
prescribes.
Conclusion for these subgames
• We have seen that at all subgames starting
after the first round, A1 prescribes best
responses to A2.
• Symmetric reasoning shows that A2 prescribes
best responses to A1.
• Thus we have shown that A1 and A2 prescribe
Nash equilibrium play in all regular proper
subgames.
Conclusion for Full Game
• We still need to show that A1, A2 is a Nash
equilibrium for the full game.
• We saw that payoff to Player 1 from A1 is 20.
• Suppose Player 1 plays something other than c on
first round. Then A2 will have 2 play w in the next
two rounds.
• Best thing other than c for Player 1 on first round
is b. After that given that 2 is playing w, playing b
is best in the next two rounds for Player 1.
• So best Player 1 can get by deviating in first round
is 8+3+3=14<20.
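The three totals for Player 1 can be tallied mechanically. This sketch uses only the four stage payoffs quoted on these slides; the rest of the payoff matrix is not reproduced here, so the dictionary below is deliberately partial.

```python
# Player 1's stage payoffs quoted in the slides (partial matrix only).
P1_PAYOFF = {("c", "x"): 7, ("b", "x"): 8, ("d", "y"): 6, ("b", "w"): 3}

def total(path):
    """Undiscounted sum of Player 1's payoffs along a three-round path."""
    return sum(P1_PAYOFF[profile] for profile in path)

equilibrium_path = [("c", "x"), ("c", "x"), ("d", "y")]  # both follow A1, A2
deviate_round_2 = [("c", "x"), ("b", "x"), ("b", "w")]   # 1 switches to b in round 2
deviate_round_1 = [("b", "x"), ("b", "w"), ("b", "w")]   # 1 switches to b in round 1

print(total(equilibrium_path), total(deviate_round_2), total(deviate_round_1))
```

Following A1 gives the largest total, so neither the round 1 nor the round 2 deviation pays.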
Conclusion
• Symmetric reasoning applies to Player 2.
• The strategy profile A1, A2 is a subgame
perfect Nash equilibrium since the
substrategies prescribed in each subgame are
Nash equilibria (best responses to each other).
Understanding what happens
• In a subgame perfect Nash equilibrium for a
finitely repeated game, it must be that play in the
last round is a Nash equilibrium for the stage
game.
• In this example, there is more than one Nash
equilibrium we could wind up at.
• We can get cooperation in early rounds by threats
of going to a “bad” last period equilibrium if
others misbehave, while doing your part for a
“better” last period Nash equilibrium if others
behave.
Problem 4, Ch 13
a) Define a grim-trigger strategy profile.
b) Derive conditions whereby this strategy profile is a SPNE.
(proposed answer to b: d>3/4)
Hints for Problem 4
• What is a nice outcome for stage game?
• What is a Nash equilibrium for this game?
• Define “grim trigger” strategies in which each
player does her part of a nice outcome so long
as the other does his part, but if either ever
does anything else, both revert to the Nash
equilibrium forever.
• Find payoffs from always playing “nice”.
• Find best you can do by “defecting” from nice
play when other is playing the grim trigger.
Problem 5, Ch 13
a) Find a SPNE strategy profile that
results in an outcome path where both
players choose x in every period.
• Note: x,x is not a Nash equilibrium for stage
game, but w,w and z,z are.
• We see that x,x is better for both than either
w,w or z,z.
• We could construct trigger strategies with
either w,w or z,z as the threat.
• For what values of d is there a SPNE trigger
strategy with z,z being the threat?
Proposed answer to part a
• With z,z as the reversion “punishment”, we
need 6/(1-d)>10+3d/(1-d). This means d>4/7.
• There is also a SPNE in which the reversion is
to w,w for some values of d.
• For you to figure out: What values of d?
Part b) Find a SPNE strategy profile that results
in an outcome path where players choose x in
odd numbered periods and y in even periods.
• Try strategies. Continue to abide by the rule “play x
in odd periods, y in even” so long as nobody has
ever violated this rule. If anybody violates the rule,
play z forever.
• Payoff from playing this rule forever is
6+8d+6d^2+8d^3+6d^4+8d^5+6d^6+…
=6(1+d^2+d^4+d^6+…)+8d(1+d^2+d^4+…)
=6(1+d^2+(d^2)^2+(d^2)^3+…)+8d(1+d^2+(d^2)^2+(d^2)^3+…)
=6/(1-d^2)+8d/(1-d^2)
Payoff from violating rule
• Most profitable violation is choose d at start.
If other is playing the proposed trigger strategy,
Other will play x on first play and violator will get 10 on
first play. But ever after, other will play z, and best
violator can do is play z. Payoff from doing this is
10+3d/(1-d).
• Proposed strategy profile is a SPNE if
6/(1-d^2)+8d/(1-d^2)>10+3d/(1-d). This is true if 7d^2+5d>4.
We see that the left side of this inequality is increasing in
d. We also see that the inequality holds for d=1/2
(7/4+5/2=17/4>4), but not for d=2/5 (28/25+2=78/25<4).
Solving the quadratic shows that it holds when
d>(√137-5)/14≈0.479.
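The quadratic can be solved numerically and the two payoff streams compared directly. This is a sketch that takes the closed forms on this slide at face value and, like the slide, only considers a deviation at the start; the function names are illustrative.

```python
import math

def rule_value(d):
    """Value of alternating 6, 8, 6, 8, ...: 6/(1-d^2) + 8d/(1-d^2)."""
    return (6 + 8 * d) / (1 - d**2)

def deviation_value(d):
    """Value of getting 10 once and then 3 forever: 10 + 3d/(1-d)."""
    return 10 + 3 * d / (1 - d)

# Positive root of 7d^2 + 5d - 4 = 0 via the quadratic formula.
d_star = (-5 + math.sqrt(5**2 + 4 * 7 * 4)) / (2 * 7)

print(round(d_star, 4))
for d in (0.4, 0.5, 0.9):
    print(d, rule_value(d) > deviation_value(d))
```

Discount factors above `d_star` favor sticking with the rule, and those below it favor violating it.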
Part c) Find a SPNE strategy profile that results
in an outcome path in which players choose x in
first 10 periods, then always choose z.
• There ain’t one. Can you see why?
Part d) You should be able to show that the one
and only grim trigger strategy that does this is
one where players revert to z if someone ever
deviates from choosing y.
Final exam
• Exam will ask questions from all chapters.
• Some problems will be easy, some will be
harder.
May all your subgames be happy..
Even if not always regular and proper.