Uploaded by princedemption

3.2.NashEq2x2NormalFormGames.1.0

advertisement
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 1
Finding Mixed-strategy Nash Equilibria
in 2˜2 Games
ÙÛ
Introduction ______________________________________________________________________1
The canonical game________________________________________________________________1
Best-response correspondences ______________________________________________________2
A’s payoff as a function of the mixed-strategy profile _________________________________________________
A’s best-response correspondence ________________________________________________________________
Case 1: complete indifference _________________________________________________________________
Case 2: A has a dominant pure strategy __________________________________________________________
Case 3: A plays strategically __________________________________________________________________
Summary of the best-response correspondence____________________________________________________
B’s best-response correspondence ________________________________________________________________
2
3
3
4
4
5
5
The Nash equilibria are the points in the intersection of the graphs of A’s and B’s best-response
correspondences____________________________________________________________________6
A typical example__________________________________________________________________6
A’s best-response correspondence ____________________________________________________6
B’s best-response correspondence ____________________________________________________7
The Nash set ____________________________________________________________________9
A nongeneric example _____________________________________________________________10
Introduction
We’ll now see explicitly how to find the set of (mixed-strategy) Nash equilibria for two-player games
where each player has a strategy space containing two actions (i.e. a “2˜2 matrix game”). After setting
up the analytical framework and deriving some general results for such games, we will apply this
technique to two particular games. The first game is a typical and straightforwardly solved example; the
second is nongeneric in the sense that it has an infinite number of equilibria. 1 For each game we will
compute the graph of each player’s best-response correspondence and identify the set of Nash equilibria
by finding the intersection of these two graphs.
The canonical game
We consider the two-player strategic-form game in Figure 1. We assign rows to player A and columns
to player B. A’s strategy space is SA={U,D} and B’s is S B={l,r}. Because each player has only two
actions, each of her mixed strategies can be described by a single number (p for A and q for B)
© 1992 by Jim Ratliff , <jim@virtualperfection.com>, <http://virtualperfection.com/gametheory>.
1
Here nongeneric means that the phenomenon depends very sensitively on the exact payoffs. If the payoffs were perturbed the slightest
bit, then the phenomenon would disappear.
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 2
belonging to the unit interval [0,1]. A mixed-strategy profile for this game, then, is an ordered pair
(p,q)˙[0,1]˜[0,1]. We denote the players’ payoffs resulting from pure-strategy profiles by subscripted
a’s and b’s, respectively. E.g. the payoff for A when A plays D and B plays r is aDr.
B
l: q r: 1_q
U: p
aUl ,bUl aUr,bUr
A
D:1_p aDl ,bDl aDr,bDr
Figure 1: The canonical two-player, two-action-per-player strategic-form game.
The pure-strategy equilibria, if any, of such a game are easily found by inspection of the payoffs in
each cell, each cell corresponding to a pure-strategy profile. A particular pure-strategy profile is a Nash
equilibrium if and only if 1 that cell’s payoff to the row player (viz. A) is a (weak) maximum over all of
A’s payoffs in that column (otherwise the row player could profitably deviate by picking a different row
given B’s choice of column) and 2 that cell’s payoff to the column player (viz. B) is a (weak) maximum
over all of B’s payoffs in that row. For example, the pure-strategy profile (U,r) would be a Nash
equilibrium if and only if the payoffs were such that aUr≥a Dr and bUr≥b Ul.
Best-response correspondences
Finding the pure-strategy equilibria was immediate. Finding the mixed-strategy equilibria takes a
little more work, however. To do this we need first to find each player’s best-response correspondence.
We will show in detail how to compute player A’s correspondence. Player B’s is found in exactly the
same way.
Player A’s best-response correspondence specifies, for each mixed strategy q played by B, the set of
mixed strategies p which are best responses for A. I.e. it is a correspondence pÆ which associates with
every q˙[0,1] a set pƪqºÓ[0,1] such that every element of pƪqº is a best response by A to B’s choice
q. The graph of pÆ is the set of points
{(p,q):Ùq˙[0,1], p˙pƪqº}.
(1)
A’s payoff as a function of the mixed-strategy profile
To find A’s best-response correspondence we first compute her expected payoff for an arbitrary
mixed-strategy profile (p,q) by weighting each of A’s pure-strategy profile payoffs by the probability of
that profile’s occurrence as determined by the mixed-strategy profile (p,q):1
uAªp;Ùqº=pqÛa Ul + p(1_q)Ûa Ur + (1_p)qÛa Dl + (1_p)(1_q)ÛaDr.
(2)
A’s utility maximization problem is
1
The semicolon in “uªp;Ùqº” is used to denote that, while p is a choice variable for A, q is a parameter outside of A’s control.
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 3
max uAªp;Ùqº.
(3)
p˙[0,1]
Because p is A’s choice variable, it will be convenient to rewrite equation (2) as an affine function of p:1
uAªp;qº = p[(aUl_aUr_aDl+aDr)q+(aUr_aDr)] + [(aDl_aDr)q+aDr],
=∂ªqºp+[(a Dl_aDr)q+aDr].
(4a)
(4b)
Of interest here is the sign of the coefficient of p,
∂ªqº fi (a Ul_aUr_aDl+aDr)q+(aUr_aDr),
(5)
which is itself an affine function of q.
A’s best-response correspondence
For a given q, the function u Aªp;Ùqº will be maximized with respect to p either 1 at the unit interval’s
right endpoint (viz. p=1) if ∂ªqº is positive, 2 at the interval’s left endpoint (viz. p=0) if ∂ªqº is
negative, or 3 for every p˙[0,1] if ∂ªqº is zero, because u Aªp;Ùqº is then constant with respect to p.
Now we consider the behavior of A’s best response as a function of q. There are three major cases to
consider.
Case 1: complete indifference
A’s payoffs could be such that ∂ªqº=0 for all q.2 In this case A’s best-response correspondence would
be independent of q and would simply be the unit interval itself: Åq˙[0,1], pƪqº=[0,1]. In other
words A would be willing to play any mixed strategy regardless of B’s choice of strategy. The graph of
pÆ in this case is the entire unit square [0,1]˜[0,1]. (See Figure 2.)
1
2
I would be happy to say linear here instead of affine. The strict definition of linear seems to be made consistently in linear algebra, but
the less restrictive definition seems to be tolerated in other contexts.
This would require that in each column A receives the same payoff in each of the two rows.
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 4
Figure 2: Best-response correspondence when A is completely indifferent.
Case 2: A has a dominant pure strategy
If this is not the case, i.e. if ∂ªqº is not identically zero, then—because ∂ªqº is affine—there will be
exactly one value qà at which ∂ªqàº=0. For all q to one side of qà, ∂ªqº will be positive; for all q on the
other side of qà, ∂ªqº will be negative. However, this qà need not be an element of [0,1]. If qà˙÷[0,1],
then all q˙[0,1] will lie on a common side of qà and therefore ∂ªqº will have a single sign throughout
the interval [0,1]. Therefore A will have the same best response for every q (viz. p=1 if ∂ªqº>0 on
[0,1]; p=0, if ∂ªqº<0 on [0,1]); i.e. A has a strongly dominant pure strategy. (See Figure 3.)
Case 3: A plays strategically
Now we consider the case where A plays strategically: her optimal strategy depends upon her
opponent’s strategy. If qà˙(0,1), then the unit interval is divided into two subintervals—those points to
the right of qà and those points to the left of qà—in each of which A plays a different pure strategy. At
exactly qà, A can mix with any p˙[0,1]. (See Figure 4.)
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 5
Figure 3: Best-response correspondence when A has a dominant pure strategy (p=1).
Figure 4: Best-response correspondence when A plays strategically.
If qà=0 or qà=1, then A can mix at qà and will play a common pure strategy for every other
q˙[0,1]. In this case the pure strategy which A plays against q˙(0,1) is a weakly dominant strategy. (It
is strictly best against q˙[0,1]_{qà} and as good as anything else against qà.)
Summary of the best-response correspondence
We see that—in a 2˜2 matrix game—each player will either 1 be free to mix regardless of her
opponent’s strategy (a case which ‘almost never’ occurs), 2 play a dominant pure strategy, or 3 be free
to mix in response to exactly one of her opponent’s possible choices of strategy but for every other
choice will play a pure strategy.
B’s best-response correspondence
Player B’s best-response correspondence specifies, for each mixed strategy p played by A, the set of
mixed strategies q which are best responses for B. I.e. it is a correspondence qÆ which associates with
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 6
every p˙[0,1] a set qƪpºÓ[0,1] such that every element of qƪpº is a best response by B to A’s choice
p. The graph of qÆ is the set of points
{(p,q):p˙[0,1],#q˙qƪpº}.
(6)
This correspondence is found using the same method of analysis we used for A’s. You will easily
show that
uBªq;pº = ©ªpºq+ [(bUr_bDr)p+bDr],
(7)
©ªpºfi(b Ul_bUr_bDl+bDr)p+(bDl_bDr).
(8)
where
The Nash equilibria are the points in the intersection of the graphs of A’s and B’s best-response
correspondences
We know that a mixed-strategy profile (p,q) is a Nash equilibrium if and only if 1 p is a best response
by A to B’s choice q and 2 q is a best response by B to A’s choice p. We see from (1) that the first
requirement is equivalent to (p,q) belonging to the graph of pÆ, and from (6) we see that the second
requirement is equivalent to (p,q) belonging to the graph of qÆ. Therefore we see that (p,q) is a Nash
equilibrium if and only if it belongs to the intersection of the graphs of the best-response
correspondences pÆ and qÆ. We can write the set of Nash equilibria, then, as
{(p,q)˙[0,1]˜[0,1]: p˙pƪqº, q˙qƪpº}.
(9)
A typical example
Consider the 2˜2 game in Figure 5. First we immediately observe that there are two pure-strategy
Nash equilibria: (U,r) and (D,l).
Figure 5: A typical 2˜2 game.
A’s best-response correspondence
Now we find A’s best-response correspondence. From (5) we see that
∂ªqº = ¥6q+3,
(10)
which vanishes at qà=1 /2 . Because ∂ªqº is decreasing in q we see that A will choose the pure strategy
p=1 against q’s on the interval [0,1 /2 ) and the pure strategy p=0 against q’s on the interval (1 /2 ,1].
Against q=qà= 1 /2 , A is free to choose any mixing probability. Player A’s best-response correspondence
pÆ is plotted in Figure 6.
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 7
Figure 6: A’s best-response correspondence for the game of Figure 5.
We can also derive A’s best-response correspondence graphically by plotting her payoff to her
different pure strategies as a function of B’s mixed-strategy choice q. Using (4b) and (10) we have
uAªp;Ùqº = (¥6q+3)p + 4q.
(11)
Evaluating this payoff function at A’s pure-strategy choices p=1 and p=0, respectively, we have
uAªU;Ùqº = 3_2q,
(12)
uAªD;Ùqº = 4q.
(13)
Both of these functions at plotted for q˙[0,1] in Figure 7. These two lines intersect when q=qà= 1 /2 ;
i.e. they intersect at the mixed strategy for B at which we earlier determined A would be willing to mix.
To the left of this point, A’s payoff to U is higher than her payoff to D; the reverse is true on the other
side of qà. Therefore A’s best response is to play U against q˙[0, 1 /2 ) and D against q˙(1 /2 ,1]. At the
intersection point qà, A is indifferent to playing U or D, so she is free to mix between them. This is
exactly the best-response correspondence we derived analytically above.
B’s best-response correspondence
We similarly find B’s best-response correspondence. From (8) we find that
©ªpº = ¥4p+3,
(14)
which decreases in p and vanishes at pà=3 /4 . Player B, then, chooses the pure strategy q=1 against p’s
on the interval [0,3 /4 ) and the pure strategy q=0 against p’s on the interval (Ù3 /4 ,1]. Against p=pà= 3 /4 ,
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 8
B is free to choose any mixing probability. Player B’s best-response correspondence qÆ is plotted in
Figure 8.
Figure 7: A’s pure-strategy payoffs as a function of B’s mixed strategy q.
Figure 8: B’s best-response correspondence for the game of Figure 5.
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 9
Using (7) and (14) we compute B’s payoff to the mixed-strategy profile (p,q) to be
uBªq;Ùpº = (¥4p+3)q + (p_1).
(15)
Evaluating this at B’s pure-strategy choices we get
uBªl;Ùpº = 2_3p,
(16)
uBªr;Ùpº = p_1.
(17)
These functions are plotted for p˙[0,1] in Figure 9. We again note that their intersection—in this case at
pà=3 /4 —occurs at the opponent’s mixed strategy which allows mixing by the player choosing a best
response. The same reasoning we used above to graphically derive A’s best-response correspondence
works here, and we arrive at the same behavior rules we found analytically.
Figure 9: B’s pure-strategy payoffs as a function of A’s mixed strategy p.
The Nash set
Both A’s and B’s best-response correspondences are plotted together in Figure 10. We see that the
intersection of the graphs of the two best-response correspondences contains exactly three points, each
corresponding to a mixed-strategy profile (p,q): (0,1), ( 3 /4 ,1 /2 ), and (1,0). The first and last of these
correspond to the two pure-strategy Nash equilibria we identified earlier. Note that the additional
equilibrium we found is the strategy profile (pà,qà). This strategy profile will in general be the only
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 10
Figure 10: The players’ best-response correspondences,
the Nash set, and equilibrium payoffs.
non-pure equilibrium strategy profile when pà and qà both lie in the interior of the unit interval.1 Note
as well that there is an odd number of Nash equilibria for this game, as is “almost always” the case. The
payoff vectors for these equilibria are (4,2), (2,¥¤), and (3,0), respectively. [The mixed-strategy profile
payoffs are computed using (11) and (15).] Note that the equilibrium payoffs are completely Pareto
ranked.
A nongeneric example
We now consider the two-player normal form game in Figure 11. We immediately determine that the
unique pure-strategy equilibrium is (D,r).
Figure 11: A nongeneric two-player game.
To find A’s best-response correspondence we use (5) to compute that
∂ªqº = ¥3q,
1
(18)
It is not possible when pà,qà˙(0,1) that in equilibrium only one player mixes. Assume, say, that A mixes. We know that A will mix
only when q=qà˙(0,1). Therefore B is mixing as well.
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 11
which decreases in q and vanishes at qà=0. Therefore A plays the pure strategy p=0 against any
q˙(0,1] and she is willing to mix against q=qà=0. Therefore D is a weakly dominant strategy for A.
Her best-response correspondence pƪqº is plotted in Figure 12.
To find B’s best-response correspondence we use (8) and find that
©ªpº = äp_1.
(19)
This function increases in p and vanishes at pà=2 /3 . Therefore B will play the pure strategy q=0 against
any p˙[0,2 /3 ), will play the pure strategy q=1 against any p˙(Ù2 /3 ,1], and will be free to mix for
p=pà= 2 /3 . B’s best-response correspondence qƪpº is also plotted in Figure 12.
Inspection of Figure 12 shows that the intersection of the graphs of A’s and B’s best-response
correspondences is a line segment along which B plays q=0 and A mixes with any probability p on
[0,2 /3 ]. We note that the unique pure-strategy Nash equilibrium we identified earlier is the left endpoint
of this set. This example is nongeneric in that we have an infinity (in fact, a continuum) of equilibria: a
situation which generically never happens.
Figure 12: A game with a continuum of equilibria.
Before leaving this example we should also take note of the equilibrium payoffs. At the pure-strategy
equilibrium each player gets 3. (See Figure 12.) Player A is indifferent to mixing between U and D,
given that B is playing r. However, this mixing hurts B. At the right-hand endpoint of the Nash set, A
still receives 3 but B’s payoff, which has been decreasing linearly with A’s mixing probability p, has
declined to 5 /3 . Note that, at the alternative strategy profile in which B plays l and A mixes with p= 2 /3 , B
would get the same payoff as playing r, but A would get only 2, rather than 3. If A mixed any more
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Finding Mixed Strategy Nash Equilibria in 2˜2 Games
Page 12
strongly toward U than p=2 /3 , B would defect to the alternative strategy l, giving A less than in this
equilibrium. This is what determines the location of the right-hand endpoint of the Nash set. In this
game the set of equilibria are Pareto ranked. Obviously, player B would prefer to coordinate on the purestrategy equilibrium, and there is no reason A should disagree.
jim@virtualperfection.com
Jim Ratliff
virtualperfection.com/gametheory
Download