IE616 Notes (4)

INTRODUCTION TO GAME THEORY
K.S. MALLIKARJUNA RAO
Abstract. This set of notes on "Game Theory" is based on lectures given on multiple
occasions.
Contents

Part 1. Combinatorial Games
  § 1.  Combinatorial Games
  § 2.  Take Away Game
  § 3.  Game of Nim
    § 3.1. Positions
    § 3.2. Nimber Arithmetic
    § 3.3. Solution of the Nim Game
  § 4.  Zermelo Theorem
  § 5.  Game of Hex

Part 2. Non-Cooperative Games
  § 6.  Oligopoly
    § 6.1. Monopoly
    § 6.2. Cournot's Duopoly
    § 6.3. Bertrand's Duopoly
  § 7.  Matching Pennies
  § 8.  Rock-Paper-Scissors Game
  § 9.  Prisoner's Dilemma
  § 10. BoS
  § 11. Matrix Games
  § 12. Continuous Games
  § 13. Nonzero-sum Bimatrix Games
  § 14. Nonzero-sum Continuous Games
  § 15. Lemke-Howson Algorithm
  § 16. Correlated Equilibria
  § 17. Congestion and Potential Games
  § 18. Evolutionary Game Theory
  § 19. Replicator Dynamics
  § 20. Fictitious Play
  § 21. Cooperative Games
  § 22. Nucleolus
  § 23. Utility Under Certainty
    § 23.1. Preference Relations and Utility Representation
References

Industrial Engineering & Operations Research, IIT-Bombay, Powai, Mumbai 400 076, India.
Email: mallik.rao@iitb.ac.in
URL: http://www.ieor.iitb.ac.in/~mallik
Part 1. Combinatorial Games
§ 1. Combinatorial Games
(1.1). Definition. A combinatorial game is a game in which two players take turns making
moves; both of them have complete information about what has happened in the game so far
and what each player’s options are from each position.
In particular, combinatorial games have the following properties:
  • There are two players who make moves alternately.
  • There is a finite set of positions in the game.
  • The rules specify how each player can move to some position from the current position.
  • The game ends when a player cannot move.
  • The game ends eventually.
(1.2). Definition. If the set of available moves depends only on the position and not on which
of the two players is moving, then the game is called an impartial game. Otherwise it is called
a partisan game.
Tic-Tac-Toe is an example of an impartial game, whereas Chess is a partisan game.
§ 2. Take Away Game
There is a pile of 21 sticks. In each turn, a player can take one, two, or three sticks. The player
who takes the last stick is the winner (Normal Play). To understand the optimal strategy, suppose the
number of sticks is n ≤ 3. Obviously, the first player can remove all the sticks and win the
game. Suppose there are four sticks; then the second player wins. More generally, if there are n
sticks, then the second player wins if n is a multiple of 4; otherwise the first player wins. Thus with 21
sticks, the first player wins.
To prove this, we use mathematical induction. From the above, we know that the claim is
true for n ∈ {1, 2, 3, 4}. So we assume n ≥ 5 and that the claim is true for all k < n. Now,
applying the division algorithm, we write n = 4q + r with 0 ≤ r ≤ 3. Suppose r ≠ 0. In the first turn,
Player 1 removes r sticks; then the number of sticks available is 4q. Now the induction
hypothesis (since 4q < n) shows that Player 1 will win. Next assume that r = 0. In the first
turn, Player 1 has to pick s ∈ {1, 2, 3} sticks; Player 2 can then pick 4 − s sticks, so that the number
of remaining sticks is 4(q − 1) < n, again a multiple of 4. Now the induction hypothesis completes the proof.
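The induction above can also be checked by brute force; a minimal sketch in Python (the function names are my own):

```python
from functools import lru_cache

# Winner of the take-away game (remove 1-3 sticks per turn; the player
# taking the last stick wins): the first player wins iff n mod 4 != 0.
def first_player_wins(n):
    return n % 4 != 0

def optimal_move(n):
    """Sticks to remove now, or None if every move loses against perfect play."""
    r = n % 4
    return r if r != 0 else None

# Brute-force verification of the induction: a position is winning iff
# some legal move leads to a position that is losing for the opponent.
@lru_cache(maxsize=None)
def wins_by_search(n):
    return any(not wins_by_search(n - k) for k in (1, 2, 3) if k <= n)

assert all(first_player_wins(n) == wins_by_search(n) for n in range(1, 100))
assert first_player_wins(21) and optimal_move(21) == 1
```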
§ 3. Game of Nim
There are three (or more) piles, or nim-heaps, of stones. Players alternately remove any
positive number of stones from a single pile until there are no stones left. There are two variations to decide
on the winner.
  • Normal Play - The player to remove the last stone wins.
  • Misère Play - The player that is forced to take the last stone loses.
§ 3.1. Positions.
(3.1). Definition. In any impartial game, the position of the game is said to be a P-position
if it secures a win for the previous player (the one who just made his move). The position is an
N-position if it secures a win for the next player (the one to make a move).
In normal play of the Nim game with three heaps, (0, 0, 1) is an N-position and (1, 1, 0) is a P-position. To find, in
general, whether a Nim position is P or N, we work backwards (using backward induction).
  • Terminal positions are P-positions.
  • Every position that can reach a P-position is an N-position.
  • Positions that can only move to N-positions are P-positions.
  • Repeat the above procedure until all positions are labelled.
§ 3.2. Nimber Arithmetic. To add two numbers, we first write them in binary form and then take
the exclusive or (XOR) of the corresponding digits, without carrying.
As an example, take 3 and 5:

      3 =   011
    ⊕ 5 = ⊕ 101
      6     110

Note that in the XOR operation, 1 ⊕ 1 = 0 = 0 ⊕ 0 and 1 ⊕ 0 = 0 ⊕ 1 = 1. The easiest way to look
at this is that if we are adding an odd number of ones, then the answer is 1; an even number of
ones gives the answer 0.
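In code, nimber addition is just the bitwise XOR operator; a small sketch:

```python
# Nimber addition: write the numbers in binary and add corresponding
# digits without carrying, i.e., bitwise XOR.
def nim_add(*xs):
    s = 0
    for x in xs:
        s ^= x
    return s

assert nim_add(3, 5) == 6       # 011 XOR 101 = 110
assert nim_add(7, 7) == 0       # x XOR x = 0 for any x
assert nim_add(1, 2, 3) == 0    # (1, 2, 3) is a balanced Nim position
```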
§ 3.3. Solution of the Nim Game. The nim-sum of all the heaps plays the key role in the solution.
Note that the nim-sum at the end of the game is zero. There are other positions where the nim-sum is zero,
e.g., x ⊕ x = 0 for any x.
(3.2). Theorem. The winning strategy in normal play Nim is to finish every move with a nim-sum
of 0.
The proof of this theorem is split into a couple of lemmas.
(3.3). Lemma. If the nim-sum is zero after a player's turn, then the next move must change it to
non-zero.
Proof. Let the nim-heaps be (x_1, x_2, …, x_n) and s = x_1 ⊕ x_2 ⊕ ⋯ ⊕ x_n.
Let t = y_1 ⊕ y_2 ⊕ ⋯ ⊕ y_n be the nim-sum of the heaps after the move. Note that x_i = y_i for all i
except for one, say k.
Now

    t = 0 ⊕ t = s ⊕ s ⊕ t = s ⊕ (x_1 ⊕ y_1) ⊕ ⋯ ⊕ (x_n ⊕ y_n) = s ⊕ (x_k ⊕ y_k).

Clearly, if s = 0 then t ≠ 0 (since x_k ≠ y_k), completing the proof of the lemma.
□
When the nim-sum is zero, we think of the position as balanced. From a balanced position, we
can only move to an unbalanced position. But from an unbalanced position, it is possible to move
either to another unbalanced position or to a balanced one.
(3.4). Lemma. It is always possible to make the nim-sum zero on your turn if it wasn't already
zero at the beginning of your turn.
Proof. Let d be the position of the most significant bit in s. Choose a heap x_k whose binary
expansion has a 1 in position d (such a heap exists, for otherwise bit d of s would be 0).
Now choose to make the new value of the heap y_k = s ⊕ x_k by removing x_k − y_k stones from
the heap; this is a legal move, since y_k < x_k (bit d gets cleared). The new nim-sum is

    t = s ⊕ x_k ⊕ y_k = s ⊕ x_k ⊕ x_k ⊕ s = 0,

completing the proof.
□
Proof of the theorem. If you start off by making your first move so that the nim-sum is zero, then
on each turn your opponent will disturb the sum and you will in turn set it back to zero.
By Lemma (3.3), the opponent has no choice but to disturb the nim-sum, and by Lemma (3.4),
you can set it back to zero. Since the total number of stones strictly decreases, eventually the
position with no stones left (nim-sum zero) is reached after one of your moves, i.e., you take the
last stone and win.
□
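Lemma (3.4) is constructive; the resulting strategy can be sketched as follows (function name and encoding are my own):

```python
# Constructive winning move from Lemma (3.4): if the nim-sum s of the
# heaps is non-zero, reduce a suitable heap x_k to y_k = s XOR x_k.
def nim_move(heaps):
    """Return (heap index k, new heap size y_k) making the nim-sum zero,
    or None if the position is already balanced."""
    s = 0
    for x in heaps:
        s ^= x
    if s == 0:
        return None                  # balanced: every move unbalances (Lemma 3.3)
    for k, x in enumerate(heaps):
        y = s ^ x
        if y < x:                    # heap k has a 1 in the top bit of s
            return k, y

assert nim_move([1, 1]) is None      # balanced position
k, y = nim_move([3, 5, 7])           # nim-sum is 3 ^ 5 ^ 7 = 1
new_heaps = [3, 5, 7]
new_heaps[k] = y
assert new_heaps[0] ^ new_heaps[1] ^ new_heaps[2] == 0
```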
(3.5). Exercise. Find a strategy for the Nim game with misère play.
(3.6). Remark. The following points will help us while playing the Nim game.
  • Whenever possible, reduce to two heaps containing the same number of stones each. Then
    mimic your opponent's moves.
  • Visualising the binary arithmetic for large numbers is hard for us. An easy way to make
    the nim-sum zero is to always leave an even number of subpiles of each power of two, starting with the
    largest subpile possible, where a subpile is a group of stones within a nim-heap.
§ 4. Zermelo Theorem
(4.1). Theorem. Every finite two-player game of perfect information is determined, i.e., either
  • the first player has a winning strategy, or
  • the second player has a winning strategy, or
  • both of them have strategies such that the game ends in a draw.
Proof. We use mathematical induction on the depth of the game tree. If the depth is zero, the
win/loss/draw is clearly determined. So assume that every game of depth < n is determined, and
consider a game of depth n. At the root, one of the players makes a decision among k possible
moves (say). Each of these k decisions leads to a sub-game, giving k sub-games T_1, T_2, …, T_k.
The depth of each of these sub-games is clearly < n, and hence the win/loss/draw is determined
for these sub-games. The player at the root simply picks the sub-game whose determined outcome
is best for him, which determines the win/loss/draw for the game of depth n that we started with.
We now provide another argument using De Morgan's laws. Note that
  • the first player has a winning strategy ⟺ ∃x_1 ∀y_1 ∃x_2 ∀y_2 ⋯ such that the first player wins;
  • the second player has a winning strategy ⟺ ∀x_1 ∃y_1 ∀x_2 ∃y_2 ⋯ such that the second player
    wins;
  • or the negation of (∃x_1 ∀y_1 ∃x_2 ∀y_2 ⋯) ∪ (∀x_1 ∃y_1 ∀x_2 ∃y_2 ⋯) happens. By De Morgan's
    laws, this is equivalent to

    (∀x_1 ∃y_1 ∀x_2 ∃y_2 ⋯ : Player 1 loses or draws) ∩ (∃x_1 ∀y_1 ∃x_2 ∀y_2 ⋯ : Player 2 loses or draws).

Clearly the third bullet implies that the game ends in a draw, proving the theorem.
□
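The induction in the first proof is exactly backward induction on the game tree; a minimal sketch (the tree encoding and names are my own):

```python
# Backward induction on a game tree, mirroring the induction in the proof.
# A tree is either a leaf outcome ("WIN1", "WIN2", "DRAW") or a pair
# (player_to_move, list_of_subtrees).
ORDER = {1: ("WIN1", "DRAW", "WIN2"), 2: ("WIN2", "DRAW", "WIN1")}

def value(tree):
    if isinstance(tree, str):                  # depth 0: already determined
        return tree
    player, subgames = tree
    outcomes = {value(t) for t in subgames}    # each sub-game has depth < n
    for preferred in ORDER[player]:            # the mover picks his best outcome
        if preferred in outcomes:
            return preferred

# Player 1 chooses between a branch where player 2 wins outright and a
# sub-game that ends in a draw, so the whole game is determined as a draw.
assert value((1, ["WIN2", (2, ["DRAW", "DRAW"])])) == "DRAW"
```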
§ 5. Game of Hex
A hex board is an array of regular hexagons arranged into a diamond shape in such a way
that there are the same number of hexagons along each side of the board.
Two players play alternately, as in tic-tac-toe. Each player, in his turn, places his symbol
(blue or red coloured, according to the player) on one of the hexagons. The winner is the player
who first obtains a connected path of adjacent hexagons stretching between the two sides of the board
carrying that player's label. The game was invented by Piet Hein, a Danish scientist, mathematician,
writer, and poet, in 1942. It was rediscovered by John Nash at Princeton in 1948, and became popular there.
We want to address the following questions.
(1) How do we solve this game?
(2) Does the game have a winning strategy for either player?
(3) Can there be a draw?
(5.1). Theorem. There can be no draw in the Game of Hex.
Proof. Imagine the playing board of Hex is made out of paper. Whenever red moves, he colours
the hexagon of his choice. Whenever blue moves, he cuts out the hexagon of his choice. At the end
of the game, pick up the board by its two red edges: either it holds together in a single piece
(the coloured hexagons contain a chain connecting the two red sides, so red has won), or it falls into
at least two pieces along the cuts (the removed hexagons contain a chain connecting the two blue
sides, so blue has won). This completes the proof.
□
The above proof is quite intuitive, but it uses one of the most important theorems, known
as the Jordan curve theorem. In fact, one can show that the Game of Hex result in turn proves
the Jordan curve theorem.
(5.2). Theorem. The first player has a winning strategy.
Proof. Suppose the second player has a winning strategy. Because moves by the players are symmetric, it is possible for the first player to adopt the second player’s winning strategy as follows:
The first player, on his first move, just colors in an arbitrarily chosen hexagon. Subsequently,
for each move by the other player, the first player responds with the appropriate move dictated
by second player’s winning strategy. This is called “stealing the strategy” and is used by Nash
in his proof.
If the strategy requires that first player move in the spot that he chose in his first turn and
there are empty hexagons left, he just picks another arbitrary spot and moves there instead.
Having an extra hexagon on the board can never hurt the first player - it can only help him.
In this way, the first player, too, is guaranteed to win, implying that both players have winning
strategies, a contradiction.
□
Why is this interesting, in particular for this course? The no-draw property of Hex provides a
proof of the Brouwer fixed point theorem, a very important result across many disciplines of
mathematics.
Part 2. Non-Cooperative Games
§ 6. Oligopoly
§ 6.1. Monopoly. Consider a firm which sells its product in the market. The demand for the
product is at most q0, and the price of the product decreases linearly with the quantity
available in the market. In particular, we assume that the price is given by

    p(q) = p0 (1 - q/q0)   if q ≤ q0,
           0               if q > q0,

where p0 is the maximum price the product can have. If the quantity produced is more than the
market can demand, i.e., q0, then the price will be zero.
The firm incurs a production cost c per unit. Then, the profit of the firm, when it produces
q units, is given by

    Π(q) := q p(q) - c q   if q ≤ q0,
            0              if q > q0.
If p0 ≤ c, the firm will not produce, and hence we assume c < p0. The firm's objective is to
choose q to maximize its profit. Since c < p0, we can find q ∈ (0, q0) such that Π(q) > 0.
Therefore the optimal q will be interior to (0, q0). Also note that the profit function is concave
on (0, q0). Therefore the optimal quantity qm is characterised by

    ∂Π/∂q (qm) = 0.
Solving this yields

    qm = (q0/2) (1 - c/p0)

and the optimal profit is given by

    Π(qm) = (q0 p0 / 4) (1 - c/p0)².
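The closed forms can be sanity-checked against a brute-force grid search; a sketch with arbitrary illustrative parameter values:

```python
# Check q_m = (q0/2)(1 - c/p0) and Pi(q_m) = (q0*p0/4)(1 - c/p0)^2
# against a numerical maximisation of Pi(q) = q*p(q) - c*q on [0, q0].
q0, p0, c = 100.0, 10.0, 4.0        # illustrative values with c < p0

def profit(q):                      # valid for 0 <= q <= q0
    price = p0 * (1 - q / q0)
    return q * price - c * q

qm = (q0 / 2) * (1 - c / p0)                    # closed-form optimum (= 30 here)
grid = [i * q0 / 10000 for i in range(10001)]   # fine grid on [0, q0]
q_best = max(grid, key=profit)

assert abs(q_best - qm) < 1e-9
assert abs(profit(qm) - (q0 * p0 / 4) * (1 - c / p0) ** 2) < 1e-9
```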
§ 6.2. Cournot's Duopoly. A duopoly is a situation in which two firms control the market
for a certain commodity. When there are more firms, it is an oligopoly. The duopoly problem is to
decide how the firms adjust their production to maximize their profits. The duopoly problem was
studied by Cournot (1838). His work can be seen as a precursor to the Nash equilibrium; due to this,
some authors use "Cournot-Nash equilibrium" instead of "Nash equilibrium" in the case of Cournot's
oligopoly problem.
Two firms, 1 and 2, produce and sell a product on the same market. The price of the product decreases
proportionally to the supply. Let qi be the number of items produced by firm i, i = 1, 2, and let q0 and
p0 be the highest reasonable production level and the highest possible price. The price, when the total
quantity produced is q = q1 + q2, is

    p(q) = p0 (1 - q/q0)   if q < q0,
           0               if q ≥ q0.

The marginal cost of production is c for both firms; p0 ≤ c is meaningless (no profit), so we assume
p0 > c.
The strategies of the firms are the quantities q1 and q2, both taken from the interval [0, q0]. The
payoffs are given by

    Πi(q1, q2) = qi p(q) - c qi.
What is the quantity to be produced by each company to have maximum profits possible?
Given a strategy q2 of firm 2, what is the best response of firm 1? It is the quantity q̂1(q2) which
maximizes the profit for firm 1. Using the arguments in the monopoly situation, we get

    q̂1(q2) = (q0/2) (1 - q2/q0 - c/p0).
Similarly, the best response of firm 2 to a given strategy q1 of firm 1 is given by

    q̂2(q1) = (q0/2) (1 - q1/q0 - c/p0).
The solution for the problem is to choose a pair (q1*, q2*) such that q1* is a best response to q2* and
vice versa. This is called a "Cournot-Nash Equilibrium". This amounts to solving the system of
equations

    q1* = (q0/2) (1 - q2*/q0 - c/p0)
    q2* = (q0/2) (1 - q1*/q0 - c/p0).

This system has a unique solution, given by

    q1* = q2* = q* = (q0/3) (1 - c/p0).
The equilibrium payoff is given by

    Π1(q1*, q2*) = Π2(q1*, q2*) = (q0 p0 / 9) (1 - c/p0)².
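Iterating the two best-response maps converges to the Cournot-Nash equilibrium; a small sketch with arbitrary parameter values:

```python
# Best-response dynamics for the Cournot duopoly: repeatedly apply
# q_i <- (q0/2)(1 - q_j/q0 - c/p0) and watch the pair settle down.
q0, p0, c = 100.0, 10.0, 4.0        # illustrative values with c < p0

def best_response(q_other):
    return max(0.0, (q0 / 2) * (1 - q_other / q0 - c / p0))

q1 = q2 = 0.0
for _ in range(200):                 # the map is a contraction, so this converges
    q1, q2 = best_response(q2), best_response(q1)

q_star = (q0 / 3) * (1 - c / p0)     # closed-form equilibrium quantity (= 20 here)
assert abs(q1 - q_star) < 1e-9 and abs(q2 - q_star) < 1e-9
```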
We will now compare the situation with the monopolistic market. Assume that the two firms
form a cartel and agree to produce the same amount of product. In this case each firm will
produce qm/2 units, and they share the monopoly profit (q0 p0 / 4)(1 - c/p0)² equally, so each
earns (q0 p0 / 8)(1 - c/p0)². This profit is clearly higher than the profit each gets in the duopoly
setting.
The firms' decisions of whether to go with duopoly or monopoly (cartel) can be visualized as a
two-player game with the payoff matrix below, where every entry carries the common factor
(1 - c/p0)²:

                              Player 2
                    Duopoly                  Cartel
          Duopoly   q0p0/9 , q0p0/9          5q0p0/36 , 5q0p0/48
 Player 1
          Cartel    5q0p0/48 , 5q0p0/36      q0p0/8 , q0p0/8
From this table, we can see that the Nash equilibrium gives a worse payoff than what can be achieved
by cooperation. However, the cooperation is not stable. This illustrates the Prisoner's dilemma.
Note that the total quantity (2q0/3)(1 - c/p0) available in the market under the Nash equilibrium
is strictly higher than the corresponding quantity (q0/2)(1 - c/p0) under the cooperative outcome.
Correspondingly, the price of the product under the Nash equilibrium is lower than the price
under the cooperative equilibrium. From the consumers' perspective, the Nash equilibrium is
better than the cooperative equilibrium. Thus, the firms' objectives and the consumers' interests
are at odds.
§ 6.3. Bertrand's Duopoly. The situation is the same as the previous one, but the strategies are now
the prices and not the quantities. The firm with the lower price captures the market: this firm sells
the whole product and the other one sells nothing. In case of equal prices, the firms share the market
equally. The demand function q(p) is given by

    q(p) = q0 (1 - p/p0)

for p < p0. The payoffs are given by
    Π1(p1, p2) = (p1 - c) q(p1)     if p1 < p2,
                 (p1 - c) q(p1)/2   if p1 = p2,
                 0                  if p1 > p2,

    Π2(p1, p2) = (p2 - c) q(p2)     if p2 < p1,
                 (p2 - c) q(p2)/2   if p2 = p1,
                 0                  if p2 > p1.
Here p0 > c denotes the highest reasonable price for the product. It is a symmetric game with strategy
spaces P1 = P2 = [c, p0].
  • p1 > p2 cannot be a best response to p2 whenever p2 > c. Similarly, p2 > p1 cannot be a
    best response to p1 whenever p1 > c.
  • c = p2 < p1 ≤ p0: in this case p2 is not the best response to p1, as choosing any price
    between c and p1 yields a better payoff for firm 2. So this case cannot give a Nash
    equilibrium.
  • Similarly, the case c = p1 < p2 ≤ p0 cannot give a Nash equilibrium.
  • c < p1 = p2 ≤ p0: a slight decrease in pi gives firm i a better payoff than pi, so this case also
    cannot give a Nash equilibrium.
  • The remaining case is c = p1 = p2. In this case the payoff to both firms is zero, and this is the
    Nash equilibrium.
In the monopolist case, pm = (p0 + c)/2 is the optimal price and the optimal profit is

    Πm(pm) = (pm - c) q0 (1 - pm/p0).
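The case analysis above can be verified by brute force on a discretized price grid (a sketch with arbitrary parameters; a grid is only an approximation of the continuum of prices):

```python
# Brute-force search for pure Nash equilibria of the Bertrand game on a
# price grid in [c, p0] (illustrative parameter values).
q0, p0, c = 100.0, 10.0, 4.0
step = (p0 - c) / 60
prices = [c + k * step for k in range(61)]

def payoff(p_own, p_other):
    q = q0 * (1 - p_own / p0)
    if p_own < p_other:
        return (p_own - c) * q       # lower price captures the whole market
    if p_own == p_other:
        return (p_own - c) * q / 2   # equal prices share the market
    return 0.0

def is_nash(p1, p2):
    return all(payoff(d, p2) <= payoff(p1, p2) + 1e-12 for d in prices) and \
           all(payoff(d, p1) <= payoff(p2, p1) + 1e-12 for d in prices)

equilibria = [(p1, p2) for p1 in prices for p2 in prices if is_nash(p1, p2)]

# (c, c) survives; the only other survivor is both firms pricing one grid
# step above cost, a discretization artifact that shrinks with the grid.
assert (c, c) in equilibria
assert all(abs(p1 - c) <= step + 1e-9 and abs(p2 - c) <= step + 1e-9
           for p1, p2 in equilibria)
```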
§ 7. Matching Pennies
There are two players, each with a coin. Both of them place their coins on the table simultaneously.
If the two coins match, player 1 wins and gets a rupee from player 2; otherwise player 2 wins
and gets a rupee from player 1. This game can be represented in the following table.
                    Player 2
                 Head      Tail
          Head   1, -1    -1, 1
 Player 1
          Tail  -1, 1      1, -1
What should the players choose? Suppose the first player chooses x ∈ {H, T} and the second
player chooses y ∈ {H, T}. Let π1(x, y) be the payoff that the first player receives and let
π2(x, y) be the payoff that the second player receives. Note that π1(x, y) + π2(x, y) = 0 for all x
and y. Such games are called zero-sum games. Since maximizing π2 over the choices/strategies
of the second player is the same as minimizing π1 over the choices/strategies of the second player, we can
view this problem as follows: the first player maximizes π1 (hereafter denoted by π) over his/her
choices/strategies while the second player minimizes π over his/her choices/strategies.
The first player's thought process goes as follows: if I choose x, the second player will choose y so as to
minimize π(x, y); therefore, the best I can do is to choose the x which maximizes min_y π(x, y).
Thus Player 1 considers the (max-min) optimization problem

    max_x min_y π(x, y).

Let this value be denoted by v⁻. Any choice/strategy x which guarantees the first player a value of at
least v⁻ is called a security strategy of the first player.
In a similar fashion, the second player considers the (min-max) optimization problem

    min_y max_x π(x, y).

We denote this value by v⁺. Any choice/strategy y which guarantees that the second player pays no
more than v⁺ is called a security strategy of the second player.
The security strategies of the players can be interpreted as worst-case strategies. They are
also called maxmin (or minmax) strategies of the respective players. A simple computation shows
that

    v⁻ = -1 and v⁺ = 1,

implying that v⁻ < v⁺. In fact, v⁻ ≤ v⁺ in any example. To see this, first note that

    min_y π(x, y) ≤ π(x, y′) ≤ max_{x′} π(x′, y′)

for all x, y′. Therefore

    max_x min_y π(x, y) ≤ min_y max_{x′} π(x′, y),

implying that v⁻ ≤ v⁺.
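A quick sketch of this computation in Python (the ½-½ mixture at the end anticipates the equalizer strategy discussed next):

```python
# Payoff matrix pi(x, y) for the maximizing row player in matching pennies.
payoff = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
moves = ("H", "T")

v_minus = max(min(payoff[x, y] for y in moves) for x in moves)  # max-min
v_plus = min(max(payoff[x, y] for x in moves) for y in moves)   # min-max

assert v_minus == -1 and v_plus == 1 and v_minus <= v_plus

# The mixed strategy x* = (1/2, 1/2) yields expected payoff 0 against
# every pure reply of the opponent.
assert all(sum(0.5 * payoff[x, y] for x in moves) == 0 for y in moves)
```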
Now, suppose the first player has a random device which suggests the choice he has to make.
Let us say he follows the suggested strategy x*, which chooses H with probability 1/2 and T with
probability 1/2. In such a case his payoff will be an expectation, given by

    π(x*, ·) = (1/2) π(H, ·) + (1/2) π(T, ·).

It is easy to check that π(x*, H) = π(x*, T) = 0. Even if the second player chooses a strategy
y suggested by a random device (of his own), it is still true that π(x*, y) = 0. Such
a strategy x* is called an equalizer strategy of Player 1. Similarly, the strategy of Player 2 which
suggests H with probability 1/2 and T with probability 1/2 is an equalizing strategy of Player 2.
The strategies just discussed are called mixed (or randomized) strategies. Let Δ1 and Δ2
denote, respectively, the sets of mixed strategies of the two players. The earlier arguments can
be extended to show that

    max_{x∈Δ1} min_{y∈Δ2} π(x, y) ≤ min_{y∈Δ2} max_{x∈Δ1} π(x, y).

In fact, we can show that these two are equal. With an abuse of notation, we use v⁻ to denote
the term on the left side and v⁺ to denote the term on the right side. These are known as the lower
and upper values in the class of mixed strategies, respectively. We will now show that v⁻ = v⁺.
Note that

    max_{x∈Δ1} min_{y∈Δ2} π(x, y) ≥ min_{y∈Δ2} π(x*, y) = 0

and

    min_{y∈Δ2} max_{x∈Δ1} π(x, y) ≤ max_{x∈Δ1} π(x, y*) = 0.

This clearly implies that v⁻ = v⁺ = 0.
Another property (exercise) of the pair (x*, y*) of equalizer strategies is that

    π(x, y*) ≤ π(x*, y*) ≤ π(x*, y)

for all x ∈ Δ1 and y ∈ Δ2. Any pair of strategies (x*, y*) ∈ Δ1 × Δ2 satisfying this property is
called a saddle point equilibrium of the game. To show that v⁻ = v⁺, it is enough to have the
existence of a saddle point equilibrium. We now prove this.
Let (x*, y*) ∈ Δ1 × Δ2 be a saddle point equilibrium. Since

    π(x, y*) ≤ π(x*, y*)

for all x ∈ Δ1, we have

    max_{x∈Δ1} π(x, y*) ≤ π(x*, y*),

which, in turn, implies

    min_{y∈Δ2} max_{x∈Δ1} π(x, y) ≤ max_{x∈Δ1} π(x, y*) ≤ π(x*, y*),

and hence v⁺ ≤ π(x*, y*). From the inequality

    π(x*, y*) ≤ π(x*, y)

for all y ∈ Δ2, we have

    π(x*, y*) ≤ min_{y∈Δ2} π(x*, y) ≤ max_{x∈Δ1} min_{y∈Δ2} π(x, y),

proving π(x*, y*) ≤ v⁻. Hence v⁻ = v⁺.
In view of these observations, it is natural to ask: if v⁻ = v⁺, does there exist a saddle point
equilibrium? Indeed, this is true provided the sets of strategies are compact. Note that, in our
example,

    Δ1 = {x = (x1, x2) ∈ ℝ²₊ : x1 + x2 = 1}

and

    Δ2 = {y = (y1, y2) ∈ ℝ²₊ : y1 + y2 = 1}.

Furthermore, both strategy spaces are compact (closed and bounded) and convex. Due to
the compactness, we can choose x* ∈ Δ1, y* ∈ Δ2 such that

    max_{x∈Δ1} min_{y∈Δ2} π(x, y) = min_{y∈Δ2} π(x*, y)

and

    min_{y∈Δ2} max_{x∈Δ1} π(x, y) = max_{x∈Δ1} π(x, y*).

Now

    π(x*, y*) ≥ min_{y∈Δ2} π(x*, y) = max_{x∈Δ1} min_{y∈Δ2} π(x, y)

and

    π(x*, y*) ≤ max_{x∈Δ1} π(x, y*) = min_{y∈Δ2} max_{x∈Δ1} π(x, y).

From these two inequalities and the hypothesis that v⁻ = v⁺, we obtain v⁻ = v⁺ = π(x*, y*).
It is now not hard to verify that

    π(x, y*) ≤ π(x*, y*) ≤ π(x*, y)

for all x ∈ Δ1 and y ∈ Δ2. Thus, the existence of a saddle point equilibrium is equivalent
to the fact that the lower and upper values are the same. The main question (in zero-sum games) is to
prove the existence of the value, i.e., that the lower and upper values coincide. This result was
established by von Neumann and is known as the von Neumann minimax theorem.
§ 8. Rock-Paper-Scissors Game
This is a famous children's game, played by two players. They simultaneously display
their hands in one of three shapes denoting a rock, a paper, or scissors. Rock wins over
scissors (rock shatters scissors), scissors win over paper (scissors cut paper), and paper
wins over rock (paper covers rock). The winner takes a rupee from the opponent. If
both display the same shape, the game is drawn. This game can be tabulated as
                     Player 2
                 R        S        P
           R   0, 0     1, -1    -1, 1
 Player 1  S  -1, 1     0, 0      1, -1
           P   1, -1   -1, 1      0, 0
§ 9. Prisoner’s Dilemma
The Prisoner's dilemma was framed by Merrill Flood and Melvin Dresher, working at RAND in 1950,
and Albert W. Tucker formalized the version used now. Two individuals who have committed a
serious crime are apprehended. Lacking incriminating evidence, the prosecution can obtain an
indictment only by persuading one (or both) of the prisoners to confess to the crime; the
witnesses available suffice only to convict them of a minor offense. If neither confesses, both will be
charged with the minor offense and pay a moderate fine. They are put in separate cells and asked
to fink on the other. Finking corresponds to a strategy D (Defect) and not finking corresponds
to C (Cooperate with the other prisoner). Each of them is told that if he finks on the other,
he will be released with no fine.
The above situation defines a two-player strategic-form game in which each player has two
strategies: D, which stands for defection, betraying your fellow criminal by confessing, and C,
which stands for cooperation, cooperating with your fellow criminal and not confessing to the crime.
This situation is represented in the following table
                     Player 2
                  D          C
           D   -6, -6     0, -10
 Player 1
           C   -10, 0     -1, -1
Now the question is: what should the two criminals do? Let us start with the thought process
of Player 1.
  • Suppose Player 2 decides to play C. Then I will benefit by playing D.
  • Even if Player 2 decides to play D, I will benefit by playing D.
  • Therefore I should play D.
A similar thought process suggests choosing D for Player 2 also. Thus the pair (D, D) is the
outcome of this game and is called the Nash equilibrium of the game.
This result can also be seen from a domination argument. The choice D strictly
dominates the choice C for both players, in the sense that the payoff that a player receives
under the choice D is higher than that under the choice C irrespective of the choice of the other
player. Thus, the "rationality" of the players forces them to choose D, and hence (D, D) is the
outcome of the game.
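The dominance argument can be checked mechanically; a small sketch over the table above:

```python
# Payoffs (player 1, player 2) from the prisoner's dilemma table.
payoffs = {
    ("D", "D"): (-6, -6), ("D", "C"): (0, -10),
    ("C", "D"): (-10, 0), ("C", "C"): (-1, -1),
}

# D strictly dominates C for player 1: better against every choice of player 2.
assert all(payoffs["D", y][0] > payoffs["C", y][0] for y in ("D", "C"))
# By symmetry, the same holds for player 2.
assert all(payoffs[x, "D"][1] > payoffs[x, "C"][1] for x in ("D", "C"))

# Yet mutual cooperation beats mutual defection for both players,
# which is exactly the dilemma.
assert payoffs["C", "C"][0] > payoffs["D", "D"][0]
assert payoffs["C", "C"][1] > payoffs["D", "D"][1]
```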
If both players refrain from defecting, they are better off, with imprisonment for only one
year. However, (C, C) is not stable: if one player chooses C, then the other player will switch to
D, whereby he gets no imprisonment. Is there a way to ensure cooperation
among the players? This is a big question, and people have been working on it. The situation
of the Prisoner's dilemma appears naturally in various settings. One such situation appears in
oligopoly. Some other examples include
  • Arms races between superpowers or rival nations.
  • Students sharing a room need to clean the room. Each student would prefer that his
    roommate clean the room. In the end, nobody puts in the effort and the room does not get
    cleaned.
  • Competition between two firms selling similar products. Take the example of Coca-Cola
    and Pepsi. Each must decide on a pricing strategy. If both of them charge a higher
    price, they can exploit their joint market and make a good profit. However, if one of
    them sets a competitively low price, it attracts more customers from the rival, and its
    profit rises higher than before.
The Prisoner's dilemma can be viewed from several angles.
  • Moral issue: taking the high road (not defecting on the other) leads to the best overall
    outcome.
  • Community issue: how can we persuade people to do what is best for the group, instead
    of what is best for themselves (and not best for the group)?
  • Truth and falsehood issue: the above story does not say whether the evidence given by
    the prisoners is true or false. If one person gives evidence, it convicts the other; that's
    all.
  • Communication issue: if the two prisoners were allowed to communicate with each other,
    they could mutually decide not to attack the other and get the best overall outcome.
A Story from Childhood Days
  • A king, while performing a yagna, decides to make his praja (subjects) participate, so that they
    too will attain the punya (merit).
  • For this, he asks all his praja to bring some milk and pour it into a large vessel.
  • The result: the vessel contains only water. Each subject, expecting everyone else to bring
    milk, brought water instead.
  • This story is a multiplayer version of the "Donation Game", a variant of the Prisoner's dilemma.
The Donation Game
  • The payoffs in the donation game for the two-player case are given below:

                        Player 2
                     D            C
              D    0, 0         b, -c
    Player 1
              C    -c, b      b - c, b - c

    Here C refers to cooperation, which benefits the other player, whereas D refers to defecting
    on the other player: b is the benefit received from the other player's cooperation and c is the
    cost of cooperating (with b > c).
  • A natural question is the following: can the prisoners extricate themselves from the
    dilemma and sustain cooperation when each of them has a powerful incentive to cheat?
    If so, how?
    - The idea is to consider repeated play of the same game. The competition between Coke
      and Pepsi takes place every day; if they keep cheating, the gains will be smaller
      than under cooperation, so we can expect them to cooperate in the long run.
    - Can Gandhigiri be explained through the iterated play of the Prisoner's dilemma?
  • The cheater's reward comes at once, while the loss from punishment lies in the future. If the
    future payoffs are heavily discounted, then the loss may be insufficient to deter cheating.
    Thus cooperation is harder to sustain among very impatient players (governments, for
    example).
  • Punishment will not work unless cheating can be detected and punished. If the actions
    of companies are not easily detected, then the companies will be tempted to defect.
  • Punishment can be made automatic by following strategies like "tit for tat", popularised
    by Robert Axelrod: you cheat if your rival cheated you in the previous round.
  • A fixed, finite number of repetitions is logically inadequate to yield cooperation.
  • Cooperation can also arise if the group has a large leader who stands to lose a lot from
    outright competition and therefore exercises restraint, even though he knows the small
    players will cheat. Saudi Arabia's role of "swing producer" (a supplier of a commodity
    controlling large deposits and possessing large spare production capacity) in the OPEC
    (Organization of the Petroleum Exporting Countries) cartel is an instance of this. A swing
    producer is able to increase or decrease commodity supply at minimal additional internal
    cost, and is thus able to influence prices and balance the markets, providing downside
    protection in the short to middle term.
§ 10. BoS
Two people wish to go out together. They have two options: watching a movie (M) or going for dinner (D). One prefers the movie and the other prefers dinner. If they go to different things, each of them is unhappy, having no company. This situation is described in the following table.

                 Player 2
                 M        D
    Player 1  M  2, 1     0, 0
              D  0, 0     1, 2
§ 11. Matrix Games
A matrix game is described by a single matrix. There are two players, the row player (player 1) and the column player (player 2). The row player chooses a row and the column player chooses a column, independently of each other. The corresponding entry is the payoff received by the row player from the column player. Generally we assume that the row player is the maximizer and the column player is the minimizer: the row player chooses a row to maximize his payoff, whereas the column player chooses a column to minimize the amount he pays. A pair of optimal choices of row and column is called a pure saddle point equilibrium. In general, a pure saddle point equilibrium need not exist. In a seminal work, which can be considered the starting point of game theory, von Neumann proved the minimax theorem, which establishes the existence of saddle point equilibrium in mixed strategies. We will now discuss this result.
Let $A$ be an $m \times n$ matrix, meaning that player 1 has $m$ pure strategies and player 2 has $n$ pure strategies. A mixed strategy for player 1 is a probability vector $x = (x_1, x_2, \cdots, x_m)' \in \mathbb{R}^m$, i.e.,
$$\sum_{i=1}^m x_i = 1 \quad \text{and} \quad x_i \ge 0, \; i = 1, 2, \cdots, m.$$
Here $x_i$ represents the probability with which player 1 picks row $i$. Similarly, a mixed strategy for player 2 is a probability vector $y = (y_1, y_2, \cdots, y_n)' \in \mathbb{R}^n$, i.e.,
$$\sum_{j=1}^n y_j = 1 \quad \text{and} \quad y_j \ge 0, \; j = 1, 2, \cdots, n.$$
As earlier, $y_j$ represents the probability with which player 2 picks column $j$. Let $\Delta_m$ denote the set of all mixed strategies for player 1 and $\Delta_n$ the set of all mixed strategies for player 2. Note that both $\Delta_m$ and $\Delta_n$ are convex and compact subsets of the respective euclidean spaces, and that the vectors $x$ and $y$ are understood as column vectors.
If player 1 chooses a mixed strategy $x$ and player 2 chooses a mixed strategy $y$, then the (expected) payoff received by player 1 from player 2 is given by
$$\pi(x, y) = \sum_{i=1}^m \sum_{j=1}^n x_i y_j a_{ij} = x'Ay.$$
The meaning of this payoff function is self-explanatory: row $i$ is picked with probability $x_i$ and column $j$ with probability $y_j$, independently, so player 1 receives $a_{ij}$ with probability $x_i y_j$, and the expected payoff is the sum of all these contributions.
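Numerically, the expected payoff $x'Ay$ is a single matrix product. A quick sketch (matrix and strategies chosen arbitrarily for illustration):

```python
import numpy as np

# An arbitrary 2x3 payoff matrix for the row player (illustrative values).
A = np.array([[1.0, -2.0, 3.0],
              [0.0,  4.0, -1.0]])

x = np.array([0.5, 0.5])        # mixed strategy of player 1 (rows)
y = np.array([0.2, 0.3, 0.5])   # mixed strategy of player 2 (columns)

# Expected payoff pi(x, y) = sum_i sum_j x_i y_j a_ij = x' A y
pi = x @ A @ y
print(pi)   # 0.9
```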
Let us consider the situation of player 1. Since the players are assumed to be rational, he will reason as follows: if I choose a mixed strategy $x$, then player 2 will choose a mixed strategy which minimizes the quantity $x'Ay$ over $y \in \Delta_n$. So player 1's best choice is a mixed strategy which maximizes this minimum value. Thus player 1, playing best, will secure at least
$$\max_{x \in \Delta_m} \min_{y \in \Delta_n} x'Ay.$$
This value, denoted by $V^-(A)$, is called the security level of player 1, and any strategy that secures him at least this value is called an optimal (prudent) strategy for player 1.
In a similar fashion, player 2 will choose his strategy so as not to pay more than
$$\min_{y \in \Delta_n} \max_{x \in \Delta_m} x'Ay.$$
This value, denoted by $V^+(A)$, is called the security level of player 2, and any strategy which guarantees he pays no more than this value is called an optimal (prudent) strategy for player 2. The values $V^-(A)$ and $V^+(A)$ are also called the lower and upper values of the game respectively.
Note that we always have
$$V^-(A) \le V^+(A).$$
When these two values are equal, the common value is called the value of the game and is denoted by $V(A)$. The first major theorem of game theory (proved by von Neumann) shows that the upper and lower values, when mixed strategies are used, are always equal.
(11.1). Theorem (Minmax Theorem, von Neumann). Every finite zero-sum game admits a value.
Before proceeding with the proof we recall two results.
(11.2). Proposition. Let $C$ be a compact convex subset of a euclidean space $\mathbb{R}^m$ with $0 \notin C$. Then there exists a vector $z \in \mathbb{R}^m$ such that
$$z \cdot x > 0 \quad \text{for all } x \in C.$$
Proof. Since $C$ is compact and convex, there exists a unique point $z \in C$ of minimal norm, i.e.,
$$\|z\|^2 \le \|x\|^2 \quad \text{for every } x \in C,$$
and $z \ne 0$ since $0 \notin C$. Now for any $x \in C$, we have $(1-\alpha)z + \alpha x \in C$ for all $\alpha \in (0, 1)$. Therefore
$$\|z\|^2 \le \|(1-\alpha)z + \alpha x\|^2 = (1-\alpha)^2\|z\|^2 + 2\alpha(1-\alpha)\, z \cdot x + \alpha^2\|x\|^2.$$
Therefore,
$$0 \le \alpha(\alpha - 2)\|z\|^2 + 2\alpha(1-\alpha)\, z \cdot x + \alpha^2\|x\|^2.$$
Dividing by $\alpha$, we have
$$0 \le (\alpha - 2)\|z\|^2 + 2(1-\alpha)\, z \cdot x + \alpha\|x\|^2.$$
Letting $\alpha \to 0$, we have
$$0 \le -2\|z\|^2 + 2\, z \cdot x,$$
which gives the required inequality
$$z \cdot x \ge \|z\|^2 > 0. \qquad \square$$
(11.3). Proposition. Let $A$ be any matrix of order $m \times n$. Then either
(1) there exists $x \in \mathbb{R}^m$, $x \ne 0$, $x \ge 0$ such that $x'A \ge 0$; or
(2) there exists $y \in \mathbb{R}^n$, $y \ne 0$, $y \ge 0$ such that $Ay \le 0$.
Proof. Let $e_1, e_2, \cdots, e_n$ be the unit vectors in $\mathbb{R}^n$ and let the rows of $A$ be denoted by $a_1, a_2, \cdots, a_m \in \mathbb{R}^n$. Let $C$ be the convex hull of $-e_1, -e_2, \cdots, -e_n$ and $a_1, a_2, \cdots, a_m$; then $C$ is a compact convex subset of $\mathbb{R}^n$. Two cases arise: $0 \in C$ or $0 \notin C$.
Case $0 \in C$: In this case, there exist non-negative real numbers $x_1, \cdots, x_m, \eta_1, \cdots, \eta_n$ such that
$$x_1 a_1 + x_2 a_2 + \cdots + x_m a_m - \eta_1 e_1 - \eta_2 e_2 - \cdots - \eta_n e_n = 0$$
and $x_1 + \cdots + x_m + \eta_1 + \cdots + \eta_n = 1$. Clearly not all of $x_1, \cdots, x_m$ can be zero. Indeed, if $x_1 = \cdots = x_m = 0$, then we must have
$$\eta_1 e_1 + \eta_2 e_2 + \cdots + \eta_n e_n = 0, \quad \eta_1 + \eta_2 + \cdots + \eta_n = 1,$$
which contradicts the linear independence of the vectors $e_1, \cdots, e_n$. Thus we have non-negative real numbers $x_1, \cdots, x_m$, not all of them zero, such that
$$x_1 a_1 + x_2 a_2 + \cdots + x_m a_m = \eta,$$
where $\eta = (\eta_1, \cdots, \eta_n) \in \mathbb{R}^n$. Note that $\eta \ge 0$. In other words,
$$x'A = \eta' \ge 0,$$
where $x = (x_1, \cdots, x_m)' \in \mathbb{R}^m$, $x \ne 0$, $x \ge 0$. This proves (1).
Case $0 \notin C$: By the previous proposition, there is a hyperplane separating $0$ and $C$; in other words, there exists $z \in \mathbb{R}^n$ such that
$$x \cdot z > 0 \quad \text{for every } x \in C.$$
Since $-e_i \in C$, we must have $z_i < 0$, and hence $z \ne 0$, $z \le 0$. Also $a_i \in C$ and hence $a_i \cdot z > 0$ for every $i = 1, 2, \cdots, m$; thus $Az > 0$. Now taking $y = -z$, we obtain $y \ge 0$, $y \ne 0$ and $Ay < 0$, which proves (2). $\square$
With these two propositions in hand, we are now ready to prove the minmax theorem.
Proof. (Minmax Theorem)
By the previous result, one of two cases holds: either there exists $x \ge 0$ in $\mathbb{R}^m$, $x \ne 0$, such that $x'A \ge 0$, or there exists $y \ge 0$ in $\mathbb{R}^n$, $y \ne 0$, such that $Ay \le 0$. Letting $\bar{x} = x / \sum_i x_i$ and $\bar{y} = y / \sum_j y_j$, we note that $\bar{x} \in \Delta_m$, $\bar{y} \in \Delta_n$, and either $\bar{x}'A \ge 0$ or $A\bar{y} \le 0$.
The first case means that $\bar{x}'Ay \ge 0$ for every $y \in \Delta_n$, so the lower value of the game satisfies
$$V^-(A) = \max_{x \in \Delta_m} \min_{y \in \Delta_n} x'Ay \ge 0.$$
The second case means that $x'A\bar{y} \le 0$ for every $x \in \Delta_m$, which gives
$$V^+(A) = \min_{y \in \Delta_n} \max_{x \in \Delta_m} x'Ay \le 0.$$
Thus we always have either $V^-(A) \ge 0$ or $V^+(A) \le 0$. Now apply this to the shifted matrix $B = ((a_{ij} - c))$, where $c \in \mathbb{R}$. Note that $V^-(B) = V^-(A) - c$ and $V^+(B) = V^+(A) - c$. Thus for every $c \in \mathbb{R}$ we must have
$$V^-(A) \ge c \quad \text{or} \quad V^+(A) \le c.$$
If $V^-(A) < V^+(A)$, any $c$ strictly between them would violate both alternatives. Hence $V^-(A) = V^+(A)$, which completes the proof of the minmax theorem.
$\square$
(11.4). Remark. We can view Farkas' lemma as the minmax theorem in disguise. Farkas' lemma says that either $\max_x \min_y \langle x, Ay \rangle \ge 0$ or $\min_y \max_x \langle x, Ay \rangle \le 0$. In fact, the two results can be shown to be equivalent.
We now recall the definition of saddle point equilibrium. A pair of mixed strategies $(x^*, y^*) \in \Delta_m \times \Delta_n$ is said to be a saddle point equilibrium provided
$$x'Ay^* \le x^{*\prime}Ay^* \le x^{*\prime}Ay$$
for every $x \in \Delta_m$ and $y \in \Delta_n$.
We now show that the existence of saddle point equilibrium and the existence of value are
equivalent.
(11.5). Theorem. A game admits a value if and only if it has a saddle point equilibrium.
Proof. Exercise. $\square$
We now discuss several properties of zero-sum games.
(11.6). Proposition. Suppose $(x^*, y^*)$ and $(\hat{x}, \hat{y})$ are two saddle point equilibria of the game $A$. Then
$$x^{*\prime}Ay^* = \hat{x}'A\hat{y},$$
and both $(x^*, \hat{y})$ and $(\hat{x}, y^*)$ are also saddle points.
Proof. Exercise. $\square$
For a given mixed strategy $x$ of player 1, let $BR^2(x)$ denote the set of all mixed strategies $\hat{y}$ of player 2 such that
$$x'A\hat{y} = \min_{y \in \Delta_n} x'Ay.$$
Any strategy in $BR^2(x)$ is called a best response of player 2 to the mixed strategy $x$ of player 1. Similarly, the set $BR^1(y)$ of best responses of player 1 to the mixed strategy $y$ of player 2 is given by
$$BR^1(y) = \left\{ \hat{x} \in \Delta_m : \hat{x}'Ay = \max_{x \in \Delta_m} x'Ay \right\}.$$
Set $BR(x, y) = BR^1(y) \times BR^2(x) \subseteq \Delta_m \times \Delta_n$; then $BR$ defines a set-valued map from $\Delta_m \times \Delta_n$ to itself, denoted $BR : \Delta_m \times \Delta_n \rightrightarrows \Delta_m \times \Delta_n$.
It is easy to observe the following:
(11.7). Proposition. A point $(x^*, y^*) \in \Delta_m \times \Delta_n$ is a saddle point equilibrium if and only if $(x^*, y^*) \in BR(x^*, y^*)$; in other words, $x^* \in BR^1(y^*)$ and $y^* \in BR^2(x^*)$.
Proof. Exercise. $\square$
(11.8). Remark. A point $(x^*, y^*)$ such that $(x^*, y^*) \in BR(x^*, y^*)$ is called a fixed point of the best response map. Thus proving the existence of a saddle point equilibrium amounts to showing the existence of a fixed point of the best response map. This can be done using the Kakutani fixed point theorem. We will not give the details here, as we will use this idea later while studying nonzero-sum games.
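Since an expected payoff is linear in each player's strategy, best responses can be read off from the pure strategies. A small numerical sketch (illustrative matrix, not from the notes) computing the pure best responses to given mixed strategies:

```python
import numpy as np

# Illustrative payoff matrix (row player maximizes, column player minimizes).
A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

def br2_pure(x, A):
    """Indices of player 2's pure best responses: columns minimizing x'A."""
    v = x @ A
    return np.flatnonzero(np.isclose(v, v.min()))

def br1_pure(y, A):
    """Indices of player 1's pure best responses: rows maximizing Ay."""
    v = A @ y
    return np.flatnonzero(np.isclose(v, v.max()))

x = np.array([0.5, 0.5])
y = np.array([0.25, 0.75])
print(br2_pure(x, A))   # [0 1]: both columns attain the minimum of x'A
print(br1_pure(y, A))   # [0 1]: both rows attain the maximum of Ay
```

For this particular $(x, y)$ every pure strategy of each player is a best response, consistent with $(x, y)$ being the saddle point of this matrix (value $3/2$).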
For a fixed mixed strategy $x$ of player 1, note that
$$\min_{y \in \Delta_n} x'Ay = \min_{1 \le j \le n} x'Ae_j.$$
Here $e_1, e_2, \cdots, e_n$ are the pure strategies of player 2; the minimum of a linear function over the simplex is attained at a vertex. Thus finding an optimal strategy $x$ for player 1 amounts to the following optimization problem:
$$\text{maximize } \min_{1 \le j \le n} x'Ae_j \quad \text{s.t. } x_i \ge 0, \; \sum_{i=1}^m x_i = 1.$$
Let $t = \min_{1 \le j \le n} x'Ae_j$; then $t \le x'Ae_j$ for every $j$. Using this fact in the above optimization problem, we get
$$\text{maximize } t \quad \text{s.t. } t \le x'Ae_j \text{ for all } j; \quad x_i \ge 0; \quad \sum_{i=1}^m x_i = 1.$$
This is a linear program, and its value equals the lower value of the game. Similarly, we can write another linear program from the second player's perspective. We now give these details.
When player 2 fixes his strategy $y$, we have
$$\max_{x \in \Delta_m} x'Ay = \max_{1 \le i \le m} e_i'Ay.$$
Thus the corresponding linear program is
$$\text{minimize } s \quad \text{s.t. } s \ge e_i'Ay \text{ for all } i; \quad y_j \ge 0; \quad \sum_{j=1}^n y_j = 1.$$
The value of this linear program is the upper value of the game. It is not difficult to see that the two linear programs are dual to each other. Since both are feasible finite linear programs, LP duality gives that their values are equal, which provides yet another proof of the minmax theorem.
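The player-1 linear program above can be handed directly to an LP solver. A sketch using SciPy (assuming `scipy` is available; the matrix is a matching-pennies example, not from the notes):

```python
import numpy as np
from scipy.optimize import linprog

def solve_game(A):
    """Row player's optimal mixed strategy and the game value, via the LP:
    maximize t  s.t.  t <= x'Ae_j for all j,  x in the simplex."""
    m, n = A.shape
    # Variables: (x_1, ..., x_m, t); linprog minimizes, so use objective -t.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # Inequalities: t - (A^T x)_j <= 0 for every column j.
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Equality: sum_i x_i = 1 (t has coefficient 0).
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]   # x >= 0, t free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Matching pennies: value 0, optimal strategy (1/2, 1/2).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, v = solve_game(A)
print(x, v)
```

By duality, solving the column player's LP (or reading off the dual variables) gives the other optimal strategy and the same value.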
§ 12. Continuous Games
Let S1 and S2 be the action spaces of player 1 and player 2 respectively. Let f : S1 ˆ S2 Ñ R
be a continuous function. Player 1 chooses an action x from S1 and player 2 chooses y from S2 .
As a result player 1 receives $f(x, y)$ from player 2. The objective of player 1 is to choose his action to maximize $f$, while player 2 chooses his action to minimize $f$.
If player 1 chooses an action $x \in S_1$, then the best he can guarantee is $\min_y f(x, y)$. Hence he chooses an $x^*$ which maximizes this minimum value, i.e.,
$$\max_x \min_y f(x, y) = \min_y f(x^*, y).$$
This maxmin value is called the security level of player 1. In a similar way, the security level of player 2 can be introduced and is given by the minmax value $\min_y \max_x f(x, y)$. A pair of strategies $(x^*, y^*)$ chosen as above are called minmax (or maxmin) strategies of the players.
(12.1). Proposition. The security level of player 1 is always less than or equal to the security level of player 2, i.e.,
$$\max_x \min_y f(x, y) \le \min_y \max_x f(x, y).$$
One can find numerous examples showing that the above inequality can be strict. When the security levels of the two players coincide, the pair of minmax strategies plays a crucial role in the study of zero-sum games; minmax strategies can be shown to be equivalent to saddle point equilibria. We now introduce the definition of saddle point equilibrium.
(12.2). Definition (Saddle Point Equilibrium). A pair of strategies $(x^*, y^*) \in S_1 \times S_2$ is called a saddle point equilibrium if
$$f(x, y^*) \le f(x^*, y^*) \le f(x^*, y)$$
for all $(x, y) \in S_1 \times S_2$.
We now list some properties of saddle point equilibria.
(12.3). Proposition. If $(x^*, y^*)$ is a saddle point equilibrium, then
$$f(x^*, y^*) = \max_x \min_y f(x, y) = \min_y \max_x f(x, y).$$
(12.4). Proposition. If
$$\max_x \min_y f(x, y) = \min_y \max_x f(x, y)$$
and the outer maximizer $x^*$ and outer minimizer $y^*$ exist, then $(x^*, y^*)$ is a saddle point equilibrium. In other words, minmax strategies and saddle point equilibria coincide.
(12.5). Theorem (Minmax Theorem). Let each $S_i$ be a compact convex subset of a euclidean space and let $f$ be a concave-convex function (i.e., $f$ is concave in the $x$-variable when $y$ is fixed and convex in the $y$-variable when $x$ is fixed). Then a saddle point equilibrium exists.
Proof. We first assume that $f$ is strictly concave in the $x$-variable and strictly convex in the $y$-variable when the other variable is fixed.
By strict convexity, for each $x$ there is a unique $y(x)$ such that
$$f(x, y(x)) = \min_y f(x, y) =: m(x).$$
Since $f$ is uniformly continuous (being continuous on a compact set), $y(x)$ is continuous as a function of $x$. Also $m(x)$ is concave, being the minimum of a family of concave functions. Note that the minimum of a family of concave functions need not be continuous in general; however, our $m(x)$ is continuous (Verify!). Let $x^*$ be such that
$$m(x^*) = \max_x m(x) = \max_x \min_y f(x, y).$$
Our aim is to show that $(x^*, y(x^*))$ is a saddle point equilibrium. Note that
$$m(x^*) = f(x^*, y(x^*)) \le f(x^*, y) \quad \text{for all } y \in S_2 \tag{12.6}$$
from the definition of $y(x)$. Also,
$$f(x^*, y(x^*)) = m(x^*) \ge m(x) \quad \text{for all } x \in S_1 \tag{12.7}$$
from the definition of $x^*$. Fix $x \in S_1$ and $t \in (0, 1)$, and let $\tilde{y} = y((1-t)x^* + tx)$. By concavity of $f(\cdot, \tilde{y})$,
$$m((1-t)x^* + tx) = f((1-t)x^* + tx, \tilde{y}) \ge (1-t)f(x^*, \tilde{y}) + t f(x, \tilde{y}). \tag{12.8}$$
Using (12.6), (12.7) and (12.8), we now get
$$m(x^*) \ge m((1-t)x^* + tx) \ge (1-t)f(x^*, \tilde{y}) + t f(x, \tilde{y}) \ge (1-t)m(x^*) + t f(x, \tilde{y}),$$
which further implies that
$$m(x^*) \ge f(x, \tilde{y}).$$
Now letting $t \to 0$, we have $\tilde{y} = y((1-t)x^* + tx) \to y(x^*)$ by continuity of $y(\cdot)$, and we obtain
$$m(x^*) \ge f(x, y(x^*))$$
for all $x \in S_1$. Combining this with (12.6), we get
$$f(x, y(x^*)) \le f(x^*, y(x^*)) \le f(x^*, y)$$
for all $(x, y) \in S_1 \times S_2$. Thus $(x^*, y(x^*))$ is a saddle point equilibrium. This proves the theorem in the strictly concave/convex case. We now prove the general case.
Let
$$f^\epsilon(x, y) = f(x, y) - \epsilon|x|^2 + \epsilon|y|^2$$
for $\epsilon > 0$. Then $f^\epsilon$ is strictly concave in $x$ and strictly convex in $y$ when the other variable is fixed. Thus, by the above, there is a pair $(x^\epsilon, y^\epsilon) \in S_1 \times S_2$ such that
$$f^\epsilon(x, y^\epsilon) \le f^\epsilon(x^\epsilon, y^\epsilon) \le f^\epsilon(x^\epsilon, y). \tag{12.9}$$
Since $S_1, S_2$ are compact, we can extract a convergent subsequence; with an abuse of notation, we denote it by $\{(x^\epsilon, y^\epsilon)\}$ itself, with $(x^\epsilon, y^\epsilon) \to (x^*, y^*)$. Letting $\epsilon \to 0$ in (12.9), we obtain
$$f(x, y^*) \le f(x^*, y^*) \le f(x^*, y)$$
for all $(x, y) \in S_1 \times S_2$, which completes the proof of the theorem.
$\square$
The proof above is due to Karlin [7]. The usual way to prove this theorem is via fixed point theorems, but the above proof avoids them, relying instead on strict convexity/concavity. We now state the general minmax theorem.
(12.10). Theorem (Minmax Theorem). Let $X$ and $Y$ be compact convex subsets of topological vector spaces and let $f : X \times Y \to \mathbb{R}$ be a continuous function. Assume that $f$ is concave in the $x$-variable when $y$ is fixed and convex in the $y$-variable when $x$ is fixed. Then there exists a point $(x^*, y^*) \in X \times Y$ such that
$$f(x, y^*) \le f(x^*, y^*) \le f(x^*, y).$$
In other words, there is a saddle point equilibrium.
Proof. The above proof due to Karlin can be replicated, with suitable modifications, when $X$ and $Y$ are compact convex subsets of normed linear spaces. For general topological vector spaces one uses fixed point theorems; we omit the details. $\square$
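A small numerical sanity check of the minmax equality for a concave-convex function (an illustrative quadratic on a coarse grid, not from the notes):

```python
import numpy as np

# f(x, y) = -x^2 + xy + y^2 is concave in x and convex in y;
# on [-1, 1]^2 its saddle point is (0, 0) with value 0.
def f(x, y):
    return -x**2 + x * y + y**2

grid = np.linspace(-1.0, 1.0, 201)          # includes 0 exactly
F = f(grid[:, None], grid[None, :])         # F[i, j] = f(x_i, y_j)
maxmin = F.min(axis=1).max()                # max over x of min over y
minmax = F.max(axis=0).min()                # min over y of max over x
print(maxmin, minmax)                       # both 0, attained at (0, 0)
```

Both iterated optima agree at the saddle value, as Theorem (12.5) predicts for a concave-convex $f$.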
(12.11). Exercise. Consider the following extension of the matching pennies game in which each player has countably infinitely many pure strategies: the payoff matrix is the infinite "checkerboard" whose $(i, j)$ entry is $(1, -1)$ when $i + j$ is even and $(-1, 1)$ when $i + j$ is odd,
$$\begin{pmatrix} 1,-1 & -1,1 & 1,-1 & -1,1 & \cdots \\ -1,1 & 1,-1 & -1,1 & 1,-1 & \cdots \\ 1,-1 & -1,1 & 1,-1 & -1,1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
Describe the set of all equilibria with positive probability on each of the choices available.
§ 13. Nonzero-sum Bimatrix Games
Let $(A, B)$ be a bimatrix game. Our aim is to prove the following theorem due to Nash.
(13.1). Theorem (Nash). There exists a Nash equilibrium.
Proof. Recall that a pair of mixed strategies $(x^*, y^*)$ is a Nash equilibrium if and only if
$$x^{*\prime}Ay^* \ge e_i'Ay^* \quad \text{and} \quad x^{*\prime}By^* \ge x^{*\prime}Be_j$$
for every $i$ and $j$.
Define $f : \Delta_m \times \Delta_n \to \Delta_m \times \Delta_n$ by $f(x, y) = (x', y')$, where
$$c_i(x, y) = \max(0,\ e_i'Ay - x'Ay), \qquad d_j(x, y) = \max(0,\ x'Be_j - x'By)$$
and
$$x_i' = \frac{x_i + c_i(x, y)}{1 + \sum_{l=1}^m c_l(x, y)}, \qquad y_j' = \frac{y_j + d_j(x, y)}{1 + \sum_{k=1}^n d_k(x, y)}.$$
Clearly $f$ is a continuous map of the compact convex set $\Delta_m \times \Delta_n$ into itself, so by the Brouwer fixed point theorem it has a fixed point. Let $(x^*, y^*)$ be a fixed point of $f$. Then
$$x_i^* = \frac{x_i^* + c_i(x^*, y^*)}{1 + \sum_{l=1}^m c_l(x^*, y^*)} \quad \text{and} \quad y_j^* = \frac{y_j^* + d_j(x^*, y^*)}{1 + \sum_{k=1}^n d_k(x^*, y^*)}. \tag{13.2}$$
We now claim that $\sum_{l=1}^m c_l(x^*, y^*) = 0$ and $\sum_{k=1}^n d_k(x^*, y^*) = 0$. Suppose $\sum_{l=1}^m c_l(x^*, y^*) \ne 0$; then $\sum_{l=1}^m c_l(x^*, y^*) > 0$ (since the $c_i$'s are non-negative).
Let $I = \{i : c_i(x^*, y^*) > 0\}$. Note that
$$x_i^* \sum_{l=1}^m c_l(x^*, y^*) = c_i(x^*, y^*)$$
from (13.2). If $i \notin I$, this equality implies that $x_i^* = 0$; hence $\sum_{i \in I} x_i^* = 1$, and for $i \in I$ we have $x_i^* > 0$ and $e_i'Ay^* > x^{*\prime}Ay^*$. Now
$$x^{*\prime}Ay^* = \sum_{i=1}^m x_i^*\, e_i'Ay^* = \sum_{i \in I} x_i^*\, e_i'Ay^* > \sum_{i \in I} x_i^*\, x^{*\prime}Ay^* = x^{*\prime}Ay^* \sum_{i \in I} x_i^* = x^{*\prime}Ay^*,$$
which is a contradiction. Therefore the claim $\sum_{l=1}^m c_l(x^*, y^*) = 0$ holds. Similarly $\sum_{k=1}^n d_k(x^*, y^*) = 0$, and hence $c_i(x^*, y^*) = 0$ as well as $d_j(x^*, y^*) = 0$. Thus
$$e_i'Ay^* \le x^{*\prime}Ay^* \quad \text{and} \quad x^{*\prime}Be_j \le x^{*\prime}By^*$$
for each $i = 1, 2, \cdots, m$ and $j = 1, 2, \cdots, n$, proving that $(x^*, y^*)$ is a Nash equilibrium. $\square$
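The map $f$ from the proof is easy to implement, and one can check numerically that a Nash equilibrium is a fixed point. A sketch (matching pennies written as a bimatrix game, for illustration):

```python
import numpy as np

def nash_map(x, y, A, B):
    """One application of the improvement map f from the proof."""
    c = np.maximum(0.0, A @ y - x @ A @ y)   # c_i = max(0, e_i'Ay - x'Ay)
    d = np.maximum(0.0, x @ B - x @ B @ y)   # d_j = max(0, x'Be_j - x'By)
    return (x + c) / (1 + c.sum()), (y + d) / (1 + d.sum())

# Matching pennies as a bimatrix game (B = -A).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A
x = np.array([0.5, 0.5])
y = np.array([0.5, 0.5])     # the unique Nash equilibrium
x1, y1 = nash_map(x, y, A, B)
print(x1, y1)                # unchanged: the equilibrium is a fixed point of f
```

Note that iterating $f$ from an arbitrary starting point need not converge to an equilibrium; the proof only uses the existence of a fixed point, not convergence of the iteration.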
Digression: Brouwer Fixed Point Theorem
The Brouwer fixed point theorem is one of the central results in various disciplines of mathematics. The standard proofs use either algebraic topology or degree theory. Here we provide a proof due to Lax [8] using the change of variables formula.
(13.3). Proposition. Let $\phi : \mathbb{R}^n \to \mathbb{R}^n$ be a continuously differentiable function such that $\phi(x) = x$ for all $|x| \ge 1$. Then $\phi$ is onto.
Proof. Suppose $\phi$ is not onto. Then there exists $y_0$ with $|y_0| < 1$ such that $\phi(x) \ne y_0$ for every $x \in \mathbb{R}^n$. Since $\phi$ maps the closed unit ball onto a closed set (Exercise!), there must be a neighborhood of $y_0$ with no preimages. Let $\epsilon > 0$ be such that $B(y_0, \epsilon) \cap \mathrm{Range}(\phi) = \emptyset$. Choose any continuous function $f$ with support contained in $B(y_0, \epsilon)$ and $\int f(y)\,dy \ne 0$. Using the change of variables formula (valid here since $\phi$ is the identity outside the unit ball), we have
$$0 \ne \int f(y)\,dy = \int f(\phi(x)) J(x)\,dx = 0,$$
since the support of $f$ does not meet the range of $\phi$, so $f \circ \phi \equiv 0$. Here $J$ is the determinant of the Jacobian of $\phi$. Thus we have a contradiction, which completes the proof. $\square$
(13.4). Proposition. Let $\phi$ be a continuous map of the unit ball into $\mathbb{R}^n$ that is the identity on the boundary. Then the image of $\phi$ covers every point of the unit ball.
Proof. Extend $\phi$ outside the unit ball by $\phi(x) = x$; then $\phi$ is clearly continuous. Now approximate $\phi$ by smooth functions which are the identity outside the unit ball (Exercise!). By the previous proposition, each such smooth function covers the unit ball. By continuity and compactness arguments, $\phi$ also covers the unit ball. $\square$
The above proposition actually proves the famous result known as the no retraction theorem: there cannot be a continuous map from the unit ball onto the unit sphere which is the identity on the unit sphere. Once we have this result, we can easily prove the Brouwer fixed point theorem. In fact, it is known that the Brouwer fixed point theorem and the no retraction theorem are equivalent.
(13.5). Theorem (Brouwer Fixed Point Theorem). Any continuous function $f$ from the unit ball to itself has a fixed point, i.e., there exists a point $x$ such that $f(x) = x$.
Proof. Suppose $f$ has no fixed point. Let $\phi(x)$ denote the point where the ray starting at $f(x)$ and passing through $x$ meets the boundary of the unit ball. Then $\phi$ is a continuous function which maps the unit ball into the unit sphere and is the identity on the unit sphere. Thus it satisfies the hypothesis of the previous proposition but not the conclusion, and the theorem is proved. $\square$
(13.6). Theorem (Kakutani Fixed Point Theorem). Let $X \subseteq \mathbb{R}^n$ be compact and convex. For each $x \in X$, let $F(x)$ be a nonempty convex subset of $X$. Assume that the graph of the set-valued map $F$ is closed in $X \times X$. Then there is a point $x^* \in X$ such that $x^* \in F(x^*)$.
§ 14. Nonzero-sum Continuous Games
Let $N$ denote the set of players; with an abuse of notation, we also use $N$ for the number of players. Let $S_i$, a subset of a metric space, be the action space of player $i$, $i = 1, 2, \cdots, N$, and denote $S = S_1 \times S_2 \times \cdots \times S_N$. Let $f_i : S \to \mathbb{R}$ be the payoff function of the $i$th player. The objective of each player is to maximize his payoff $f_i$. The central concept of non-cooperative game theory is the equilibrium concept introduced by J. Nash. We first introduce the notation: for $s \in S$ and $s_i^* \in S_i$, $(s_{-i}; s_i^*)$ denotes the strategy profile $(s_1, \cdots, s_{i-1}, s_i^*, s_{i+1}, \cdots, s_N)$.
(14.1). Definition (Nash Equilibrium). A strategy profile $s^* = (s_1^*, s_2^*, \cdots, s_N^*)$ is called a Nash equilibrium if
$$f_i(s^*) \ge f_i(s_{-i}^*; s_i)$$
for all $s_i \in S_i$ and $i = 1, 2, \cdots, N$.
(14.2). Remark. The interesting feature of Nash equilibrium is that a unilateral deviation by a player, when the others stick to the equilibrium, cannot yield him a higher payoff.
Our aim is to establish the existence of Nash equilibrium. We restrict attention to the two-player case; the proof extends to the multiplayer case without much difficulty.
Let $S_1, S_2$ be the action spaces of players 1 and 2 respectively, with payoff $f_i : S_1 \times S_2 \to \mathbb{R}$ for player $i$. We assume that $S_1$ and $S_2$ are compact and convex subsets of some euclidean space, and that each $f_i$ is concave in player $i$'s own variable when the other variable is fixed. In this case, the definition of Nash equilibrium takes the following form: a pair $(x^*, y^*) \in S_1 \times S_2$ is a Nash equilibrium if
$$f_1(x, y^*) \le f_1(x^*, y^*) \quad \text{and} \quad f_2(x^*, y) \le f_2(x^*, y^*)$$
for all $x \in S_1$ and $y \in S_2$.
The following proof is due to Geanakoplos.
Assume $f_1$ is concave in $x$ and $f_2$ is concave in $y$. Also assume that both $f_1$ and $f_2$ are jointly continuous.
For $(\bar{x}, \bar{y}) \in S_1 \times S_2$, define $\phi = (\phi_1, \phi_2) : S_1 \times S_2 \to S_1 \times S_2$ by
$$\phi_1(\bar{x}, \bar{y}) = \arg\max_{x \in S_1} \{ f_1(x, \bar{y}) - |x - \bar{x}|^2 \}$$
and
$$\phi_2(\bar{x}, \bar{y}) = \arg\max_{y \in S_2} \{ f_2(\bar{x}, y) - |y - \bar{y}|^2 \}.$$
The maximands are strictly concave, so $\phi_1$ and $\phi_2$ are single valued and $\phi : S_1 \times S_2 \to S_1 \times S_2$ defines a continuous function. Now the Brouwer fixed point theorem guarantees the existence of $(x^*, y^*) \in S_1 \times S_2$ such that $\phi(x^*, y^*) = (x^*, y^*)$. Therefore,
$$x^* = \arg\max_{x \in S_1} \{ f_1(x, y^*) - |x - x^*|^2 \}$$
and
$$y^* = \arg\max_{y \in S_2} \{ f_2(x^*, y) - |y - y^*|^2 \}.$$
Fix $x \in S_1$, let $\lambda \in (0, 1)$, and consider $\lambda x + (1-\lambda)x^* \in S_1$. By the fixed point property of $x^*$,
$$f_1(x^*, y^*) \ge f_1(\lambda x + (1-\lambda)x^*, y^*) - |\lambda x + (1-\lambda)x^* - x^*|^2$$
$$= f_1(\lambda x + (1-\lambda)x^*, y^*) - \lambda^2 |x - x^*|^2$$
$$\ge \lambda f_1(x, y^*) + (1-\lambda) f_1(x^*, y^*) - \lambda^2 |x - x^*|^2,$$
using the concavity of $f_1$ in $x$. Rearranging, we get
$$\lambda f_1(x^*, y^*) \ge \lambda f_1(x, y^*) - \lambda^2 |x - x^*|^2.$$
Now dividing both sides by $\lambda > 0$ and then letting $\lambda \to 0$, we obtain
$$f_1(x^*, y^*) \ge f_1(x, y^*).$$
Since $x \in S_1$ is arbitrary, we get the optimality of $x^*$. In a similar fashion, we can show the optimality of $y^*$. Therefore $(x^*, y^*)$ is a Nash equilibrium. $\square$
§ 15. Lemke-Howson Algorithm
(15.1). Proposition. Let $(A, B)$ be a bimatrix game. A mixed strategy $x^* \in \Delta_1$ is a best response to $y^* \in \Delta_2$ if and only if, for all $i = 1, 2, \cdots, m$,
$$x_i^* > 0 \Rightarrow (Ay^*)_i = \max\{ (Ay^*)_k : k = 1, 2, \cdots, m \}.$$
Proof. Let $u = \langle x^*, Ay^* \rangle$ and $v = \max\{ (Ay^*)_k : k = 1, 2, \cdots, m \}$. For any $x \in \Delta_1$,
$$\langle x, Ay^* \rangle = \sum_{k=1}^m x_k (Ay^*)_k \le v,$$
with equality exactly when $x$ is supported on rows attaining $v$. If every $i$ with $x_i^* > 0$ attains the maximum, then $u = v \ge \langle x, Ay^* \rangle$ for all $x \in \Delta_1$, so $x^*$ is a best response. Conversely, if $(Ay^*)_i < v$ for some $i$ with $x_i^* > 0$, then
$$u = \sum_{k : x_k^* > 0} x_k^* (Ay^*)_k < v,$$
and moving the weight $x_i^*$ to a row attaining $v$ strictly increases the payoff, so $x^*$ is not a best response. $\square$
As a corollary, we have
(15.2). Corollary. A pair of mixed strategies $(x^*, y^*)$ is a Nash equilibrium if and only if
$$Ay^* \le v e \quad \text{and} \quad B'x^* \le u e$$
for some $u$ and $v$, together with $\langle x^*, Ay^* - v e \rangle = 0$ and $\langle y^*, B'x^* - u e \rangle = 0$.
Proof. Left as an exercise. $\square$
Define
$$P = \Big\{ (u, x) \in \mathbb{R} \times \mathbb{R}^m : x \ge 0, \ \sum_i x_i = 1, \ B^T x \le u e \Big\},$$
$$Q = \Big\{ (v, y) \in \mathbb{R} \times \mathbb{R}^n : y \ge 0, \ \sum_j y_j = 1, \ Ay \le v e \Big\},$$
$$\bar{P} = \{ x \in \mathbb{R}^m : x \ge 0, \ B^T x \le e \}, \qquad \bar{Q} = \{ y \in \mathbb{R}^n : y \ge 0, \ Ay \le e \}.$$
Note that $\bar{P}$ represents a set of "artificial" strategies for Player 1, in the sense that any normalized nonzero vector of $\bar{P}$ is a mixed strategy of Player 1. In fact, for each $(u, x) \in P$ with $u > 0$, taking $\tilde{x} = \frac{1}{u}x$ we see that $\tilde{x} \in \bar{P}$, while for $x (\ne 0) \in \bar{P}$ we have $\big( \frac{1}{\sum_i x_i}, \frac{1}{\sum_i x_i} x \big) \in P$. Similarly $\bar{Q}$ represents the set of "artificial" strategies for Player 2: any normalized nonzero vector of $\bar{Q}$ is a mixed strategy of Player 2, and the points of $Q$ and $\bar{Q}$ are connected in the same way. From Corollary (15.2), we know that there is a one-to-one correspondence between the set of Nash equilibria and certain pairs of extreme points of the polyhedra $P$ and $Q$.
For notational convenience, we write $y \in \mathbb{R}^n$ as $y = (y_{m+1}, y_{m+2}, \cdots, y_{m+n})$, and set $M = \{1, 2, \cdots, m\}$ and $N = \{m+1, m+2, \cdots, m+n\}$. For $x \in \bar{P}$ and $y \in \bar{Q}$, we define the label sets of $x$ and $y$ respectively by
$$L(x) = \{ i \in M : x_i = 0 \} \cup \{ j \in N : (x^T B)_j = 1 \},$$
$$L(y) = \{ j \in N : y_j = 0 \} \cup \{ i \in M : (Ay)_i = 1 \}.$$
We now assume that the bimatrix game is nondegenerate, i.e., for any $x \in \Delta_1$ the number of pure best responses to $x$ is at most $|S(x)|$, the size of the support of $x$, and likewise for any $y \in \Delta_2$ the number of pure best responses to $y$ is at most $|S(y)|$. Under the non-degeneracy assumption, $|L(x)| \le m$ and $|L(y)| \le n$.
(15.3). Theorem. A pair $(x, y)$ corresponds to a Nash equilibrium if and only if it is completely labelled: $L(x) \cup L(y) = M \cup N$.
Proof. Suppose $L(x) \cup L(y) = M \cup N$. Let $M_1 = \{i : x_i = 0\}$, $M_2 = \{i : (Ay)_i = 1\}$, $N_1 = \{j : y_j = 0\}$, $N_2 = \{j : (x^T B)_j = 1\}$. Since $|L(x)| \le m$, $|L(y)| \le n$ and $L(x) \cup L(y) = M \cup N$, we must have $M_1 \cup M_2 = M$ and $N_1 \cup N_2 = N$. Now $x_i > 0$ implies $i \in M_2$, i.e., $e_i$ is a best response to $y$, which implies that $x$ is a best response to $y$. Similarly $y$ is a best response to $x$, and hence $(x, y)$ corresponds to a Nash equilibrium.
Conversely, assume that $(x, y)$ corresponds to a Nash equilibrium. For $i \notin S(x)$, $x_i = 0$, and hence $M \setminus S(x) \subseteq L(x)$. Now let $j \in S(y)$; then $e_j$ is a best response to $x$ and hence $(x^T B)_j = 1$, implying that $S(y) \subseteq L(x)$. Thus $(M \setminus S(x)) \cup S(y) \subseteq L(x)$. Since the game is non-degenerate, $|S(x)| = |S(y)|$. Therefore $L(x)$ contains exactly $m$ elements, implying that $L(x) = (M \setminus S(x)) \cup S(y)$. Similarly, $L(y) = (N \setminus S(y)) \cup S(x)$. Hence $L(x) \cup L(y) = M \cup N$. $\square$
We now introduce graphs G1 and G2 with vertices given by the extreme points of P̄ and Q̄
respectively. Two extreme points x and x̃ of P̄ are connected by an edge in G1 if and only if
they are adjacent extreme points. Similarly the edges of G2 are defined.
We now introduce the product graph $G = G_1 \times G_2$, whose vertex set is $V(G_1) \times V(G_2)$. There is an edge between $z = (x, y)$ and $z' = (x', y')$ if and only if $(x, x') \in E(G_1)$ and $y = y'$, or $x = x'$ and $(y, y') \in E(G_2)$. For each vertex $z = (x, y)$, the label set of $z$ is given by $L(z) = L(x) \cup L(y)$. For $k \in M \cup N$, define
$$U_k = \{ z \in V(G) : L(z) \supseteq (M \cup N) \setminus \{k\} \}.$$
Vertices in $U_k$ are called "$k$-almost" completely labelled vertices. Note that for $k \ne l$, $U_k \cap U_l$ consists exactly of the completely labelled vertices.
(15.4). Theorem. For any $k \in M \cup N$:
(1) $(0, 0)$ and all Nash equilibrium points belong to $U_k$; furthermore, their degree in the graph induced by $U_k$ is exactly one.
(2) The degree of every other vertex of $U_k$ in the graph induced by $U_k$ is two.
Proof. Since the label set of $(0, 0)$ and of any Nash equilibrium is exactly $M \cup N$, all these points belong to $U_k$ for each $k \in M \cup N$. Let $z = (x, y)$ be one such point. Without loss of generality, suppose $k \in L(x)$. In the graph $G_1$, among all edges incident to $x$, there is exactly one leading to a vertex $x'$ without label $k$ (obtained by relaxing the binding constraint corresponding to label $k$). It is easy to see that $z$ and $(x', y)$ share an edge, showing that $(x', y)$ is the only neighbour of $z$ in the graph induced by $U_k$.
Let $z = (x, y)$ be any other point of $U_k$. Then there must be a duplicate label $l$ with $l \in L(x) \cap L(y)$. Now $z$ has one neighbour through the graph $G_1$ (dropping $l$ from $L(x)$) and another through the graph $G_2$ (dropping $l$ from $L(y)$), implying that its degree is two. $\square$
Thus, in a non-degenerate bimatrix game, the set of $k$-almost completely labelled vertices in $G$ together with the induced edges consists of disjoint paths and cycles. The end points of the paths correspond to $(0, 0)$ and the Nash equilibria of the game.
(15.5). Corollary. A non-degenerate bimatrix game has an odd number of Nash equilibria.
Based on this theorem, we have the following Lemke-Howson algorithm for computing Nash equilibria of non-degenerate games.
(1) Input: non-degenerate bimatrix game $(A, B)$.
(2) Choose $k \in M \cup N$.
(3) Start with $z = (x, y) = (0, 0) \in V(G)$. Drop label $k$ from $z$.
(4) Let $z$ be the current vertex and let $l$ be the label picked up by the move. If $l = k$, then $z$ corresponds to a Nash equilibrium; stop. If $l \ne k$, drop $l$ from $z$ and repeat.
The Lemke-Howson algorithm starts from the origin and follows a path in $U_k$. This path cannot be a cycle, since $(0, 0)$ has degree 1; it must therefore end at a Nash equilibrium.
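The oddness corollary is easy to check on a small game. The sketch below does not implement the pivoting itself; it finds all equilibria of a nondegenerate 2×2 bimatrix game by support enumeration, which is enough to count them (BoS payoffs used for illustration):

```python
import numpy as np

def equilibria_2x2(A, B, tol=1e-9):
    """All Nash equilibria of a nondegenerate 2x2 bimatrix game."""
    eqs = []
    # Pure profiles: (i, j) is an equilibrium iff neither player gains by deviating.
    for i in range(2):
        for j in range(2):
            if A[i, j] >= A[1 - i, j] - tol and B[i, j] >= B[i, 1 - j] - tol:
                eqs.append((np.eye(2)[i], np.eye(2)[j]))
    # Full-support equilibrium: each player mixes to make the other indifferent.
    # Player 1's (p, 1-p) solves x'B e_1 = x'B e_2; player 2's (q, 1-q) solves
    # (Ay)_1 = (Ay)_2.
    db = (B[0, 0] - B[0, 1]) + (B[1, 1] - B[1, 0])
    da = (A[0, 0] - A[1, 0]) + (A[1, 1] - A[0, 1])
    if abs(da) > tol and abs(db) > tol:
        p = (B[1, 1] - B[1, 0]) / db        # probability of row 1
        q = (A[1, 1] - A[0, 1]) / da        # probability of column 1
        if tol < p < 1 - tol and tol < q < 1 - tol:
            eqs.append((np.array([p, 1 - p]), np.array([q, 1 - q])))
    return eqs

# BoS from Section 10: two pure equilibria plus one mixed -- an odd number.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
eqs = equilibria_2x2(A, B)
print(len(eqs))   # 3
```

For BoS the mixed equilibrium found this way is $((\frac{2}{3}, \frac{1}{3}), (\frac{1}{3}, \frac{2}{3}))$, and the count 3 is odd, as Corollary (15.5) requires.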
§ 16. Correlated Equilibria
We start with a few examples.
(16.1). Example. Consider the battle of the sexes game from § 10:

                 Player 2
                 M        D
    Player 1  M  2, 1     0, 0
              D  0, 0     1, 2

The game has two pure Nash equilibria, in which the players receive either $(2, 1)$ or $(1, 2)$. There is also a mixed Nash equilibrium $\big( (\frac{2}{3}, \frac{1}{3}), (\frac{1}{3}, \frac{2}{3}) \big)$, where each player receives a reward of $\frac{2}{3}$. Note that this reward is lower than the worse reward either player gets under a pure Nash equilibrium.
Now consider the following situation: a third party advises their choices according to a coin toss which is observed by both of them: choose M if the coin lands heads and choose D if it lands tails. If they follow this advice, each player receives $\frac{3}{2}$. The interesting fact is that no player has an incentive to deviate from this strategy. Another interesting point to note here is that the payoff they receive is higher than in the mixed Nash equilibrium.
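A quick numeric check of the coin-toss device (BoS payoffs as above): conditional on each recommendation, obeying is a best response, and each player's expected payoff is 3/2:

```python
import numpy as np

# BoS payoffs, rows/cols indexed (M, D) = (0, 1).
A = np.array([[2.0, 0.0], [0.0, 1.0]])   # player 1
B = np.array([[1.0, 0.0], [0.0, 2.0]])   # player 2
mu = np.array([[0.5, 0.0], [0.0, 0.5]])  # coin toss: (M,M) and (D,D) w.p. 1/2 each

def is_correlated_eq(mu, A, B, tol=1e-9):
    # Player 1: conditional on being told row s, obeying must beat
    # every deviation t (sum over the opponent's recommendations).
    for s in range(2):
        for t in range(2):
            if mu[s] @ A[s] < mu[s] @ A[t] - tol:
                return False
    # Player 2: symmetric check over columns.
    for s in range(2):
        for t in range(2):
            if mu[:, s] @ B[:, s] < mu[:, s] @ B[:, t] - tol:
                return False
    return True

print(is_correlated_eq(mu, A, B))        # True
print((mu * A).sum(), (mu * B).sum())    # expected payoffs: 1.5 each
```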
(16.2). Example. We now consider another game, called the game of chicken:

                 Player 2
                 Y        D
    Player 1  Y  3, 3     0, 5
              D  5, 0     -4, -4

The story behind this game is the following: two drivers arrive at the same time at an intersection. Each would like to drive on (strategy D) rather than yield (strategy Y), but if both drive they risk damaging their cars; if both yield they may waste time, but there is no risk of damage. The game has three Nash equilibria: $(Y, D)$, $(D, Y)$, and a mixed equilibrium where each player drives with probability $\frac{1}{3}$. The players' expected payoffs under these equilibria are $(0, 5)$, $(5, 0)$ and $(2, 2)$ respectively.
Suppose we install a traffic light which instructs each player whether to yield or drive. For example, the light could choose uniformly at random between $(Y, D)$ and $(D, Y)$. In this case the players' payoffs are $(2.5, 2.5)$, and neither has an incentive to deviate, assuming the other obeys the light.
We can also consider the situation where the light chooses from tpY, Dq, pD, Y q, pY, Y qu where
the third one is chosen with probability p and the first and second are chosen with probability
1´p
2 .
Given that a player is instructed to yield, the player knows that the other player has been told to yield with conditional probability
\[
p_Y = \frac{p}{p + (1-p)/2}
\]
and to drive with conditional probability
\[
p_D = \frac{(1-p)/2}{p + (1-p)/2}.
\]
Therefore the player's utility for yielding is $3 p_Y$, while the utility for driving is $5 p_Y - 4 p_D$. Hence the player will not deviate from the instruction as long as $3 p_Y \ge 5 p_Y - 4 p_D$. Simplifying, we get $p \le 1/2$.
Each player's utility is $3p + 5(1-p)/2$. Choosing $p = 1/2$, the players' utilities are $(2.75, 2.75)$, which exceed those in any Nash equilibrium.
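The incentive constraint and the resulting utilities can be verified directly; here is a small illustrative check (ours, not part of the notes), again with exact rational arithmetic:

```python
from fractions import Fraction as F

def conditional_probs(p):
    # Signal distribution: (Y,Y) w.p. p, (Y,D) and (D,Y) w.p. (1-p)/2 each.
    # Given the instruction Y, the opponent was told Y or D with:
    norm = p + (1 - p) / 2
    return p / norm, ((1 - p) / 2) / norm  # (p_Y, p_D)

def obeys(p):
    pY, pD = conditional_probs(p)
    return 3 * pY >= 5 * pY - 4 * pD  # yielding at least as good as driving

# The incentive constraint holds exactly when p <= 1/2.
assert obeys(F(1, 2)) and obeys(F(1, 4)) and not obeys(F(3, 4))

# Each player's expected utility is 3p + 5(1-p)/2; at p = 1/2 it is 2.75.
utility = lambda p: 3 * p + F(5) * (1 - p) / 2
print(utility(F(1, 2)))  # 11/4
```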
We will now introduce the above equilibrium concept formally.
(16.3). Definition. Let $G = (S_1, S_2, u_1, u_2)$ be a two-player finite game. A distribution $\mu \in \wp(S_1 \times S_2)$ is called a correlated strategy.
A correlated strategy $\mu$ is said to be a correlated equilibrium if for every player $i$ and every $s_i, t_i \in S_i$, it holds that
\[
\sum_{s_{-i} \in S_{-i}} \mu(s_{-i}; s_i) \, u_i(s_{-i}; s_i) \;\ge\; \sum_{s_{-i} \in S_{-i}} \mu(s_{-i}; s_i) \, u_i(s_{-i}; t_i).
\]
Note that player $i$'s expected utility under a correlated equilibrium $\mu$ is $\sum_{s \in S} u_i(s) \mu(s)$.
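The defining inequalities can be checked mechanically for any candidate distribution. The sketch below (ours, not from the notes) implements the check for a two-player game and applies it to the public-coin distribution of Example (16.1):

```python
from fractions import Fraction as F

def is_correlated_eq(mu, u1, u2):
    """Check Definition (16.3): for each player and each pair (s_i, t_i),
    sum over the opponent's actions of mu * (payoff of obeying - payoff of
    deviating) must be non-negative."""
    A = range(len(u1))      # player 1's strategies
    B = range(len(u1[0]))   # player 2's strategies
    for s in A:             # player 1: recommended s, deviation t
        for t in A:
            if sum(mu[(s, b)] * (u1[s][b] - u1[t][b]) for b in B) < 0:
                return False
    for s in B:             # player 2: recommended s, deviation t
        for t in B:
            if sum(mu[(a, s)] * (u2[a][s] - u2[a][t]) for a in A) < 0:
                return False
    return True

# Battle of Sexes with the public-coin distribution on (M, M) and (D, D).
u1 = [[F(2), F(0)], [F(0), F(1)]]
u2 = [[F(1), F(0)], [F(0), F(2)]]
mu = {(a, b): (F(1, 2) if a == b else F(0)) for a in range(2) for b in range(2)}
print(is_correlated_eq(mu, u1, u2))  # True
```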
Several interesting consequences follow from the definition of correlated equilibrium.
(16.4). Proposition. The set of correlated equilibria is closed, and hence compact. Moreover, it is convex.
(16.5). Proposition. Every Nash equilibrium is a correlated equilibrium.
Incidentally, the above proposition also proves the existence of a correlated equilibrium. Note that correlated equilibria are defined in terms of linear inequalities. It is therefore natural to ask whether existence can be proved without using Nash equilibrium. The answer is yes, and such a proof is provided by Hart and Mas-Colell.
(16.6). Example. This example is taken from https://economics.stackexchange.com/questions/21583/example-of-a-game-with-no-nash-equilibria-but-at-least-one-correlated-equilibr
Consider a game with three players. Players 1 and 2 have strategy space $[0,1] \times \mathbb{N}$, while Player 3's strategy space is $[0,1]$. We denote the generic strategies of these players respectively as $(x, m)$, $(y, n)$ and $x'$.
The payoff of Player 3 is $1$ if $x' = x$ and $0$ otherwise. The payoff of Player 1 as well as Player 2 is $2$ if $y = x \ne x'$, and both of them get $-2$ if $x = x'$. If $y \ne x \ne x'$, then the player with the higher number in the second coordinate gets a payoff of $1$ and the other gets $-1$. If they pick the same number, then both get $0$.
This game has no Nash equilibrium, but it has a correlated equilibrium.
§ 17. Congestion and Potential Games
(17.1). Definition. Let $(n, S_1, S_2, \dots, S_n, u_1, u_2, \dots, u_n)$ be a normal form game and let $S = S_1 \times S_2 \times \dots \times S_n$. The game is said to be a potential game if there exists a function $f : S \to \mathbb{R}$ such that
\[
u_i(s) - u_i(s'_i; s_{-i}) = f(s) - f(s'_i; s_{-i})
\]
for all $s \in S$, $s'_i \in S_i$ and $i = 1, 2, \dots, n$.
(17.2). Example. Consider the game Battle of Sexes
\[
\begin{pmatrix} 2, 1 & 0, 0 \\ 0, 0 & 1, 2 \end{pmatrix}.
\]
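Battle of Sexes is in fact a potential game. The candidate potential below is our own computation (it is not given in the notes); the code checks that every unilateral deviation changes the deviator's payoff by exactly the change in the potential:

```python
# Exact-potential check for Battle of Sexes. A function f on strategy
# profiles is a potential if, for every unilateral deviation, the change
# in the deviator's payoff equals the change in f.
u1 = [[2, 0], [0, 1]]
u2 = [[1, 0], [0, 2]]
f  = [[2, 1], [0, 2]]  # candidate potential (our own computation)

for a in range(2):
    for b in range(2):
        for a2 in range(2):  # player 1 deviates a -> a2
            assert u1[a][b] - u1[a2][b] == f[a][b] - f[a2][b]
        for b2 in range(2):  # player 2 deviates b -> b2
            assert u2[a][b] - u2[a][b2] == f[a][b] - f[a][b2]
print("f is an exact potential for Battle of Sexes")
```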
§ 18. Evolutionary Game Theory
Consider a symmetric game given by a payoff matrix $A$. Recall that such a symmetric game is a bimatrix game in which $A$ and $A^T$ are respectively the payoff matrices of players 1 and 2. In particular, both players have the same set of pure strategies $S = \{e^1, e^2, \dots, e^k\}$ and mixed strategies
\[
\Delta = \{ x \in \mathbb{R}^k : x_1 + \dots + x_k = 1, \ x_i \ge 0, \ 1 \le i \le k \}.
\]
When a strategy $x$ is played against $y$, then $x$ receives the payoff
\[
f(x, y) := \langle x, Ay \rangle = x^T A y = \sum_{i,j=1}^k a_{ij} x_i y_j,
\]
and $y$ receives $f(y, x)$.
Consider a large population of individuals who pair off randomly and play the symmetric game with payoff matrix $A$. Let the incumbent strategy (of all individuals) be $x$. Suppose that an $\epsilon$ fraction of individuals become mutants and start playing $y$. This gives rise to a population in which a $1 - \epsilon$ proportion plays $x$ and the remaining $\epsilon$ proportion plays $y$. In such a population, the average fitness (or payoff) of $x$-individuals is $f(x, \epsilon y + (1-\epsilon)x)$, and that of $y$-individuals is $f(y, \epsilon y + (1-\epsilon)x)$.
Biological intuition suggests that evolutionary forces select against the mutant strategy $y$ if and only if its post-entry payoff is lower than that of the incumbent strategy $x$; that is,
\[
f(x, \epsilon y + (1-\epsilon)x) > f(y, \epsilon y + (1-\epsilon)x).
\]
(18.1). Definition. A strategy $x \in \Delta$ is an evolutionarily stable strategy (ESS, for short) if for every $y \ne x$, there exists $\bar\epsilon \in (0, 1)$ such that
\[
f(x, \epsilon y + (1-\epsilon)x) > f(y, \epsilon y + (1-\epsilon)x), \qquad 0 < \epsilon < \bar\epsilon. \tag{18.2}
\]
The largest fraction $\bar\epsilon$ satisfying (18.2) is called the invasion barrier of the incumbent strategy $x$ against the mutant strategy $y$. So an ESS may be understood as a strategy having a positive invasion barrier against all mutations.
Let us now rewrite (18.2) as
\[
\epsilon \big[ f(x, y) - f(y, y) \big] + (1-\epsilon) \big[ f(x, x) - f(y, x) \big] > 0, \qquad 0 < \epsilon < \bar\epsilon. \tag{18.3}
\]
A careful observation of this yields the next theorem whose proof is left as an exercise.
(18.4). Theorem. For $x \in \Delta$, the following two statements are equivalent:
(i). $x$ is an ESS.
(ii). $x$ is a symmetric NE and, for any other best response $y$ (against $x$), $f(x, y) > f(y, y)$.
An immediate consequence of this theorem is that every strict symmetric NE is an ESS. It also follows that an ESS is necessarily a symmetric Nash equilibrium. The converses of these two statements may fail, as the next two examples illustrate.
(18.5). Example (Hawk-Dove game). Consider a species of birds pairing off at random and competing for resources. Each bird is programmed to behave like a 'Hawk' or a 'Dove'. When two Hawks compete, the fight continues till one gets seriously injured; the injury is very costly compared with the reward of success. When two Doves compete, both keep displaying till one retreats; there is no injury in this case. When a Dove faces a Hawk, the Dove retreats immediately without injury. Such a Hawk-Dove game may be represented by the matrix
\[
A = \begin{pmatrix} -1 & 2 \\ 0 & 1 \end{pmatrix}.
\]
Consider the mixed strategy $x = (1/2, 1/2)$ and note that
\[
f(y, x) = -\frac{y_1}{2} + y_1 + \frac{1 - y_1}{2} = \frac12,
\]
for every mixed strategy $y$. Also note that
\[
f(x, y) - f(y, y) = 2 \Big( y_1 - \frac12 \Big)^2 \ge 0.
\]
Here equality holds only if $y = x$. Therefore, by Theorem (18.4), $x$ is an ESS.
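Both identities are easy to confirm numerically; a brief illustrative check (ours, not from the notes) over a grid of mixed strategies:

```python
from fractions import Fraction as F

A = [[F(-1), F(2)], [F(0), F(1)]]   # Hawk-Dove payoff matrix

def f(x, y):  # payoff to strategy x against strategy y
    return sum(A[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

x = [F(1, 2), F(1, 2)]
for k in range(11):                  # sample mixed strategies y
    y = [F(k, 10), 1 - F(k, 10)]
    assert f(y, x) == F(1, 2)        # every y earns exactly 1/2 against x
    assert f(x, y) - f(y, y) == 2 * (y[0] - F(1, 2)) ** 2  # ESS margin
print("x = (1/2, 1/2) satisfies the ESS conditions of Theorem (18.4)")
```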
(18.6). Example (Rock-Scissors-Paper game). This game has three pure strategies: Rock, Scissors, Paper. Rock beats Scissors, Scissors beats Paper, Paper beats Rock. The payoff matrix is
\[
A = \begin{pmatrix} 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & -1 & 0 \end{pmatrix}.
\]
It can easily be shown that $x = (1/3, 1/3, 1/3)$ is the only symmetric NE. Clearly all strategies are best responses against $x$. However, $f(e^1, e^1) = 0 = f(x, e^1)$. Therefore, by Theorem (18.4), $x$ is not an ESS.
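The failure of the strict inequality in Theorem (18.4)(ii) can be seen directly; an illustrative check (ours):

```python
from fractions import Fraction as F

A = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]   # Rock-Scissors-Paper

def f(x, y):
    return sum(F(A[i][j]) * x[i] * y[j] for i in range(3) for j in range(3))

x = [F(1, 3)] * 3
e1 = [F(1), F(0), F(0)]
# Every pure strategy earns 0 against x, so all are best responses ...
assert all(f([F(int(i == k)) for k in range(3)], x) == 0 for i in range(3))
# ... but the ESS condition f(x, e1) > f(e1, e1) fails with equality:
assert f(x, e1) == 0 == f(e1, e1)
print("x = (1/3, 1/3, 1/3) is a symmetric NE but not an ESS")
```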
The next theorem provides some insight into the geometry of the set of all evolutionarily stable strategies in a game. Recall that the support $S(x)$ of a strategy $x$ is the set of all pure strategies $e^i$ with $x_i > 0$. Also note that the convex hull $\Lambda(x)$ of $S(x)$ is the face of $\Delta$ generated by (i.e., the smallest face containing) $x$.
(18.7). Theorem. If $x \in \Delta$ is an ESS, then there is no other symmetric NE in $\Lambda(x)$.
Proof. Let $x$ be an ESS. By Theorem (18.4), $x$ is a symmetric NE, and hence all pure strategies in $S(x)$ are best responses to $x$. This implies that $\Lambda(x) \subset BR(x)$. Consequently every $y \in \Lambda(x)$ with $y \ne x$ is a best response to $x$, so, by Theorem (18.4), $f(x, y) > f(y, y)$; in particular, $y \notin BR(y)$. That is, no strategy in $\Lambda(x)$ other than $x$ is a symmetric NE.
□
(18.8). Remark. In view of Theorems (18.4) and (18.7), one also infers that if $x$ is an ESS, then there is no other ESS in the face generated by $x$. In particular, it follows that if there is an ESS in the interior of $\Delta$, then it is the only ESS of the game.
To obtain another useful characterization of an ESS, note that the inequality (18.2) is equivalent to the inequality
\[
f(x, \epsilon y + (1-\epsilon)x) > f(\epsilon y + (1-\epsilon)x, \ \epsilon y + (1-\epsilon)x). \tag{18.9}
\]
This motivates the next theorem, and the proof is left as an exercise to the reader.
(18.10). Theorem. For $x \in \Delta$, the following two statements are equivalent:
(i). $x$ is an ESS.
(ii). There exists a neighbourhood $U$ (relative to $\Delta$) of $x$ such that $f(x, y) > f(y, y)$ for every $y \in U$, $y \ne x$.
(18.11). Remark. If x satisfies the conditions in Theorem (18.10)(ii), then some authors define
it as a locally superior strategy. So the above theorem says that a strategy is an ESS if and only
if it is locally superior. This result will be helpful in deriving important dynamic properties of
ESS in the next section.
§ 19. Replicator Dynamics
The notion of ESS relies upon implicit dynamical considerations. In certain situations, the
underlying dynamics can be modeled by a system of ordinary differential equations on the simplex
∆.
Consider a large population of individuals, each of whom is programmed to adopt a certain pure strategy from $\{e^1, e^2, \dots, e^k\}$ in a symmetric game with payoff matrix $A$. Let $n_i(t)$ be the number of individuals adopting $e^i$ at time $t$. Then $n(t) = \sum_{i=1}^k n_i(t)$ is the total population size at time $t$. The associated population state $x(t)$ at time $t$ is the transpose of the vector $(x_1(t), x_2(t), \dots, x_k(t))$, where
\[
x_i(t) = \frac{n_i(t)}{n(t)}, \qquad 1 \le i \le k.
\]
Clearly $x_i(t)$ is the proportion of individuals programmed to play $e^i$ at time $t$. In this way, a population state $x(t)$ can be considered as a mixed strategy in the simplex $\Delta$.
We also note that, at time $t$, the average payoff to an individual adopting $e^i$, in a random match, is $f(e^i, x(t))$. The population average payoff is $f(x(t), x(t))$. Let
\[
\sigma(e^i, x) = f(e^i, x) - f(x, x).
\]
The relative rate of change $\dot x_i(t)/x_i(t)$ is a measure of the evolutionary success of $e^i$-strategists. Following the basic tenet of Darwinism, we may express this success as the fitness difference $\sigma(e^i, x(t))$. Thus we obtain the replicator dynamics:
\[
\dot x_i = x_i \, \sigma(e^i, x), \qquad i = 1, 2, \dots, k. \tag{19.1}
\]
Since $\sum_{i=1}^k \dot x_i = 0$, the mixed strategy simplex $\Delta$ is invariant under the replicator dynamics (19.1). This, together with the fact that the R.H.S. of (19.1) is a polynomial of degree at most three, yields existence and uniqueness for the replicator dynamics. That is, for each initial state $x(0) \in \Delta$, the replicator dynamics admits a unique global solution $x(t)$ which remains in $\Delta$ for all time. In addition, by the very structure of the replicator dynamics, each face of $\Delta$ (as well as its boundary and interior) is invariant.
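As an illustration (ours, not part of the notes), a forward-Euler discretization of (19.1) for the Hawk-Dove game of Example (18.5) shows an interior trajectory approaching the ESS $(1/2, 1/2)$; note that the increments sum to zero, so the simplex is preserved exactly:

```python
# Euler discretization of the replicator dynamics (19.1) for Hawk-Dove.
A = [[-1.0, 2.0], [0.0, 1.0]]

def payoff(x, y):
    return sum(A[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

x = [0.9, 0.1]                     # interior initial state
dt = 0.01
for _ in range(20000):
    fx = payoff(x, x)              # population average payoff
    growth = [x[i] * (sum(A[i][j] * x[j] for j in range(2)) - fx)
              for i in range(2)]   # x_i * sigma(e^i, x); sums to zero
    x = [x[i] + dt * growth[i] for i in range(2)]

print(x)  # approximately [0.5, 0.5], the ESS of Example (18.5)
```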
Clearly each vertex (pure strategy) of $\Delta$ is a stationary point of (19.1). But there can be other interesting stationary points. Denote by $\Delta^0$ the set of all stationary points in $\Delta$; that is,
\[
\Delta^0 = \{ x \in \Delta : \sigma(e^i, x) = 0 \ \text{for all } i \in S(x) \}.
\]
The next result relates the stationary points of (19.1) to the symmetric NE of the associated game.
(19.2). Theorem. Let $x \in \Delta$. If $x$ is a symmetric NE, then it is a stationary point of (19.1). The converse is true when any of the following conditions holds:
(c1). $x$ is in the interior of $\Delta$.
(c2). $x$ is a limit state of a trajectory lying in the interior of $\Delta$.
(c3). $x$ is a Lyapunov stable state of (19.1).
Proof. If $x$ is a symmetric NE, then clearly
\[
f(e^i, x) = f(x, x) \qquad \text{for all } e^i \in S(x).
\]
This implies that $\sigma(e^i, x) = 0$ whenever $x_i > 0$. Therefore $x$ is a stationary point of (19.1).
We now prove the converse under (c1). If $x$ is an interior stationary point, then $\sigma(e^i, x) = f(e^i, x) - f(x, x) = 0$ for $i = 1, 2, \dots, k$. This implies that $BR(x) = \Delta$; in particular $x \in BR(x)$, and hence $x$ is a symmetric NE.
To prove the converse under (c2), assume that the stationary point $x$ is a limit state; that is,
\[
x = \lim_{t \to \infty} x(t), \tag{19.3}
\]
where $x(t)$ satisfies (19.1) and lies in the interior of $\Delta$ for all $t \ge 0$. Note that
\[
x_i(t) = x_i(t_0) \, e^{\int_{t_0}^t \sigma(e^i, x(s)) \, ds}, \qquad 1 \le i \le k. \tag{19.4}
\]
If possible, suppose $x$ is not a symmetric NE. Then there would be a pure strategy, say $e^j$, satisfying $f(e^j, x) > f(x, x)$. This implies that
\[
2\delta := \sigma(e^j, x) > 0,
\]
and hence (by the continuity of $f$) that there exists a neighbourhood $U$ of $x$ (relative to $\Delta$) such that
\[
\sigma(e^j, y) \ge \delta \qquad \text{for all } y \in U. \tag{19.5}
\]
In view of (19.3), there is $t_0$ (large enough) such that $x(t) \in U$ whenever $t \ge t_0$. This and (19.5) yield
\[
\sigma(e^j, x(t)) \ge \delta \qquad \text{for all } t \ge t_0.
\]
Substituting this into (19.4), we obtain
\[
x_j(t) \ge x_j(t_0) \, e^{\delta (t - t_0)}, \qquad \forall t \ge t_0. \tag{19.6}
\]
Since $x_j(t) \le 1$ for all $t$, (19.6) is a contradiction, and hence $x$ has to be a symmetric NE.
It remains to prove the converse under (c3). To this end, take a Lyapunov stable stationary point $x$ of (19.1). If $x$ is not a symmetric NE, then, arguing as above, there exist a pure strategy $e^j$ and $t_0$ such that (19.6) is satisfied. This contradicts the Lyapunov stability of $x$.
□
From the above theorem, it follows that all Lyapunov stable states of the replicator dynamics are symmetric NE. The next theorem shows that every ESS is an asymptotically stable state of the replicator dynamics (19.1).
(19.7). Theorem. If $x$ is an ESS, then it is an asymptotically stable state of (19.1).
Proof. Let $x \in \Delta$ be an ESS. By Theorem (18.10), there exists a neighbourhood $U$ of $x$ (relative to $\Delta$) such that
\[
\sigma(x, y) := f(x, y) - f(y, y) > 0 \qquad \forall y \in U \setminus \{x\}. \tag{19.8}
\]
We use Lyapunov's direct method to prove that $x$ is asymptotically stable. Consider the relative neighbourhood $O$ of $x$, where
\[
O = \{ y \in \Delta : S(x) \subset S(y) \}.
\]
Define $V : O \to \mathbb{R}$ by
\[
V(y) = \sum_{i \in S(x)} x_i \log\Big( \frac{x_i}{y_i} \Big).
\]
Clearly $V$ is continuous and $V(x) = 0$. Now, for $y \in O \setminus \{x\}$,
\[
V(y) = - \sum_{i \in S(x)} x_i \log\Big( \frac{y_i}{x_i} \Big)
\;>\; - \sum_{i \in S(x)} x_i \Big( \frac{y_i}{x_i} - 1 \Big)
\;=\; 1 - \sum_{i \in S(x)} y_i \;\ge\; 0,
\]
using $\log r < r - 1$ for all $r \ne 1$.
Therefore the proof will be complete once we show that the time derivative of $V(y(t))$ along the trajectories of (19.1) is strictly negative. We compute
\[
\frac{d}{dt} V(y(t))
= \sum_{i=1}^k \frac{\partial V}{\partial y_i}(y(t)) \, \dot y_i(t)
= \sum_{i \in S(x)} \frac{-x_i}{y_i(t)} \, y_i(t) \, \sigma(e^i, y(t))
= - \sigma(x, y(t)).
\]
In view of (19.8), the time derivative of the Lyapunov function $V$ along the trajectories of the replicator dynamics is therefore strictly negative.
□
(19.9). Remark. If $x$ is an interior ESS, then, as in the previous theorem, it can be shown to be globally asymptotically stable; that is, $\mathrm{int}(\Delta)$ is its basin of attraction.
The following example shows that the converse is not true, in general. That is, an asymptotically stable state of the replicator dynamics may fail to be an ESS.
(19.10). Example. Consider the symmetric game with payoff matrix
\[
A = \begin{pmatrix} 1 & 5 & 0 \\ 0 & 1 & 5 \\ 5 & 0 & 4 \end{pmatrix}.
\]
It is clear that the pure strategies $e^1, e^2, e^3$ are not symmetric NE. Furthermore, for any mixed strategy $y$,
\[
f(e^1, y) = y_1 + 5y_2, \qquad f(e^2, y) = y_2 + 5y_3, \qquad f(e^3, y) = 5y_1 + 4y_3.
\]
From this, it follows that
\[
f(e^1, y) = f(e^2, y) \iff 6y_1 + 9y_2 = 5, \qquad
f(e^2, y) = f(e^3, y) \iff 6y_1 = 1, \qquad
f(e^1, y) = f(e^3, y) \iff 9y_2 = 4.
\]
It follows that the game has only one symmetric NE, namely $x = (\tfrac16, \tfrac49, \tfrac{7}{18})$. Note also that all mixed strategies are best responses to $x$. We observe that
\[
f(x, e^3) = 5x_2 + 4x_3 = \tfrac{34}{9}.
\]
But
\[
f(e^3, e^3) = 4 > \tfrac{34}{9},
\]
and so $x$ is not an ESS. Nevertheless, we can show that $x$ is an asymptotically stable state of the associated replicator dynamics:
\[
\begin{cases}
\dot y_1 = -y_1 \big( y_1^2 + y_2^2 + 4y_3^2 + 5y_1y_2 + 5y_2y_3 + 5y_3y_1 - y_1 - 5y_2 \big), \\
\dot y_2 = -y_2 \big( y_1^2 + y_2^2 + 4y_3^2 + 5y_1y_2 + 5y_2y_3 + 5y_3y_1 - y_2 - 5y_3 \big), \\
\dot y_3 = -y_3 \big( y_1^2 + y_2^2 + 4y_3^2 + 5y_1y_2 + 5y_2y_3 + 5y_3y_1 - 5y_1 - 4y_3 \big).
\end{cases} \tag{19.11}
\]
The R.H.S. of (19.11) is a map from $\mathbb{R}^3$ to $\mathbb{R}^3$, and its gradient matrix at $x$ is
\[
\begin{pmatrix} -7/12 & 2/9 & -37/36 \\ -2 & -32/27 & -14/27 \\ 7/36 & -77/54 & -91/108 \end{pmatrix}.
\]
The eigenvalues of this matrix have strictly negative real parts, and hence $x$ is an asymptotically stable stationary point of (19.11).
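The claims of this example are easy to verify exactly; an illustrative check (ours, not from the notes) confirms that $x$ is stationary for the replicator dynamics but fails the ESS condition:

```python
from fractions import Fraction as F

A = [[1, 5, 0], [0, 1, 5], [5, 0, 4]]
x = [F(1, 6), F(4, 9), F(7, 18)]

def f(p, q):
    return sum(F(A[i][j]) * p[i] * q[j] for i in range(3) for j in range(3))

# x is a symmetric NE: every pure strategy earns the same payoff against x,
# so sigma(e^i, x) = 0 for all i and x is stationary for (19.11).
payoffs = [sum(F(A[i][j]) * x[j] for j in range(3)) for i in range(3)]
assert payoffs[0] == payoffs[1] == payoffs[2] == f(x, x)

# x is not an ESS: e^3 is a best response to x, yet f(e^3, e^3) > f(x, e^3).
e3 = [F(0), F(0), F(1)]
assert f(e3, e3) == 4 > f(x, e3)
print("x is stationary for the replicator dynamics but not an ESS")
```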
§ 20. Fictitious Play
Fictitious play is a simple iterative procedure introduced by Brown [2]; the proof of convergence was given by Robinson [14]. In this method, both players play the same game repeatedly. At each stage, a player, after observing the other player's actions and the history of play, chooses a pure strategy which is a best response to the empirical strategy of the other player. In suitable situations, these empirical strategies converge to optimal strategies. We now describe the procedure.
(1) Player 1 chooses a pure strategy $\alpha_1$. Then $x^1 = \alpha_1$.
(2) Player 2 chooses a best response $\beta_1$ to $x^1$ in pure strategies. Then $y^1 = \beta_1$.
(3) Player 1 chooses a best response $\alpha_2$ in pure strategies to $y^1$. Then $x^2 = \tfrac12 \alpha_1 + \tfrac12 \alpha_2$.
(4) Player 2 chooses a best response $\beta_2$ in pure strategies to $x^2$. Then $y^2 = \tfrac12 \beta_1 + \tfrac12 \beta_2$.
(5) Player 1 chooses a best response $\alpha_3$ in pure strategies to $y^2$. Then $x^3 = \tfrac13 \alpha_1 + \tfrac13 \alpha_2 + \tfrac13 \alpha_3$.
(6) Repeat steps 4 and 5 with $x^k, y^k$, $k \ge 3$.
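The steps above can be sketched in code. The following illustrative implementation (ours, with first-index tie-breaking) runs fictitious play on the zero-sum matching-pennies game $A = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$, whose value is $0$ with optimal strategies $(1/2, 1/2)$; the empirical strategies are maintained as action counts:

```python
# Fictitious play for matching pennies (player 1 maximizes, player 2 minimizes).
A = [[1, -1], [-1, 1]]
c1 = [0, 0]   # player 1's action counts
c2 = [0, 0]   # player 2's action counts

c1[0] += 1    # step (1): player 1 starts with an arbitrary pure strategy
for t in range(1, 20000):
    x = [c / sum(c1) for c in c1]   # empirical strategy of player 1
    # player 2 best-responds (minimizes) against x in pure strategies
    col_payoffs = [sum(A[i][j] * x[i] for i in range(2)) for j in range(2)]
    c2[col_payoffs.index(min(col_payoffs))] += 1
    y = [c / sum(c2) for c in c2]   # empirical strategy of player 2
    # player 1 best-responds (maximizes) against y in pure strategies
    row_payoffs = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]
    c1[row_payoffs.index(max(row_payoffs))] += 1

x = [c / sum(c1) for c in c1]
y = [c / sum(c2) for c in c2]
print(x, y)  # both close to [0.5, 0.5], the optimal strategies
```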
In the following we provide a proof of this result, which closely follows [7]. Let $A = ((a_{ij}))$ be an $m \times n$ matrix giving the payoffs of a zero-sum game, and let $v$ be the value of the game (which exists by the von Neumann minimax theorem). Let $\Delta_m$ and $\Delta_n$ denote the mixed strategy spaces of the two players. Let $a_i$ denote the $i$th row of $A$ and $b_j$ its $j$th column.
(20.1). Definition. Consider sequences of vectors $c(t)$ and $r(t)$, of dimensions $m$ and $n$ respectively, defined by
(1) $\max_i c_i(0) = \min_j r_j(0)$;
(2) $r(t+1) = r(t) + a_l$, where $l \in \arg\max_i c_i(t)$ and $a_l$ is the $l$th row of $A$;
(3) $c(t) = c(t-1) + b_k$, where $k \in \arg\min_j r_j(t)$ and $b_k$ is the $k$th column of $A$.
Such a pair of sequences is called a vector system.
Let $n_i$, $m_j$ denote the number of times that $a_i$, $b_j$ are added to $r(0)$, $c(0)$ in forming $r(t)$, $c(t)$ respectively. Then $\sum_i n_i = \sum_j m_j = t$. Let $x_i = n_i/t$ and $y_j = m_j/t$; then $x$ and $y$ are respectively mixed strategies of players 1 and 2. Without loss of generality we assume both players have the same number of pure strategies. Now note that
\[
c_i(t) = c_i(0) + t \sum_j a_{ij} y_j \qquad \text{and} \qquad r_j(t) = r_j(0) + t \sum_i a_{ij} x_i.
\]
It is easy to see that
\[
\min_j \frac{r_j(t)}{t} \le \mathrm{value}(A) \le \max_i \frac{c_i(t)}{t}.
\]
Consider
\[
c_i(t) - r_j(t) = c_i(0) - r_j(0) + t \sum_k \{ y_k a_{ik} - x_k a_{kj} \}.
\]
We now assume that the matrix game $A$ is symmetric, i.e., $A = -A^T$. Then
\[
c_i(t) - r_i(t) = c_i(0) - r_i(0) + t \sum_k a_{ik} (y_k + x_k).
\]
Now take $z = \tfrac12 (x + y)$; then
\[
c_i(t) - r_i(t) = c_i(0) - r_i(0) + 2t \sum_k a_{ik} z_k.
\]
Without loss of generality, we may assume that all the entries of $A$ are non-negative. To estimate the difference $c_i(t) - r_j(t)$, consider the game with payoff matrix
\[
B = \begin{pmatrix} 0 & -A^T \\ A & 0 \end{pmatrix}.
\]
Note that $z = \tfrac12 (x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n)$ is a valid mixed strategy of this game. If the maximizing player chooses this mixed strategy $z$ and the minimizing player chooses rows $i$ and $j$ with equal probability, then the resulting payoff is $\tfrac14 \sum_k (y_k a_{ik} - x_k a_{kj})$. Since $B$ is skew-symmetric, the value of $B$ is zero. Thus
\[
\min_{i,j} \frac{c_i(t) - r_j(t)}{t} \to 0 \qquad \text{as } t \to \infty.
\]
Consequently, both $\max_i c_i(t)/t$ and $\min_j r_j(t)/t$ converge to the same value, namely the value of $A$. Note also that
\[
c_i(t) - r_i(t) = c_i(0) - r_i(0) + t \sum_k \{ y_k a_{ik} - x_k a_{ki} \}.
\]
Thus, for a symmetric game, $c_i(t) = r_i(t)$.
§ 21. Cooperative Games
A transferable utility game (TU game) is specified by pN, vq, where N is the set of players
and v : 2N Ñ R. It is also called game in characteristic form. The value vpSq for a subset S Ď N
is called the worth of coalition S. Needless to say, the subsets of N are called coalitions.
§ 22. Nucleolus
Nucleolus is another solution concept for TU games. It is manifestation of Rawlsian social
welfare, wherein the the welfare of the worst-off player is maximized. The welfare, here, is
measured in terms of the excess function. Now consider a TU game pN, vq and consider a
coalition C of N . The excess epC, xq of a coalition at an allocation x is defined by
epC, xq “ vpCq ´ xpCq.
ř
Note that xpCq “ iPC xi . Whenever epC, xq ą 0, we can see this as the amount of dissatisfaction
or complaint of the players of C from the allocation x. Using this notation, we can see that
CorepN, vq “ tx P Rn |x is an imputation and epC, xq ď 0, C Ď N u.
Let e˚ pxq be the arrangement of the values tepC, xq, C Ď N u in the decreasing order. Thus e˚ pxq
N
can be seen as a vector R2 ´1 . Here we excluded the excess corresponding to empty coaltion,
which is zero always.
Let $\succeq_{\mathrm{lex}}$ denote the lexicographic ordering of $\mathbb{R}^{2^{|N|}-1}$: for $u, v \in \mathbb{R}^{2^{|N|}-1}$, $u \succ_{\mathrm{lex}} v$ if and only if there is $t$ such that $u_i = v_i$ for $1 \le i < t$ and $u_t > v_t$; and $u \succeq_{\mathrm{lex}} v$ if either $u = v$ or $u \succ_{\mathrm{lex}} v$.
We say that an allocation $y$ is better than $x$ provided $e^*(x) \succeq_{\mathrm{lex}} e^*(y)$. The nucleolus $Nu(N, v)$ is then the set
\[
Nu(N, v) = \{ x \in \mathbb{R}^n : x \text{ is an imputation and } e^*(y) \succeq_{\mathrm{lex}} e^*(x) \text{ for each imputation } y \}.
\]
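As an illustration (a toy game of our own, not from the notes), the sorted excess vectors of two allocations in a symmetric three-player game can be compared lexicographically; the equal split has the lexicographically smaller vector and is therefore the better allocation:

```python
from fractions import Fraction as F
from itertools import combinations

# Toy 3-player TU game: v(N) = 1, v of any pair = 1/2, v of singletons = 0.
N = (1, 2, 3)
v = {}
for r in range(1, 4):
    for C in combinations(N, r):
        v[C] = F(1) if len(C) == 3 else (F(1, 2) if len(C) == 2 else F(0))

def sorted_excesses(x):
    """e*(x): excesses v(C) - x(C) over nonempty coalitions, decreasing."""
    exc = [v[C] - sum(x[i - 1] for i in C)
           for r in range(1, 4) for C in combinations(N, r)]
    return sorted(exc, reverse=True)

equal  = [F(1, 3)] * 3
skewed = [F(1, 2), F(1, 4), F(1, 4)]
# Python list comparison is lexicographic, matching the ordering above:
assert sorted_excesses(equal) < sorted_excesses(skewed)
print(sorted_excesses(equal))
```

For this symmetric game the equal split is in fact the nucleolus (a standard symmetry argument, stated here as an aside).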
§ 23. Utility Under Certainty
§ 23.1. Preference Relations and Utility Representation. Let $X$ be a nonempty set, interpreted as the set of alternatives available to a decision maker. A preference relation $\preceq$ on $X$ is a binary relation satisfying
• Completeness: for all $x, y \in X$, either $x \preceq y$ or $y \preceq x$;
• Transitivity: for all $x, y, z \in X$, if $x \preceq y$ and $y \preceq z$, then $x \preceq z$.
Such a preference relation is said to be rational.
Any order-preserving map $u : X \to \mathbb{R}$ is called a utility function. Note that if there is a utility function, then the preference relation is necessarily rational.
If $x \preceq y$ and $y \preceq x$, then we write $x \sim y$ and say that $x$ and $y$ are indifferent. We write $x \prec y$ to mean that $x \preceq y$ but not $y \preceq x$; in this case $y$ is strictly preferred to $x$. The sets of alternatives strictly worse and strictly better than $y \in X$ are denoted respectively by
\[
W(y) = \{ x \in X : x \prec y \} \qquad \text{and} \qquad B(y) = \{ x \in X : x \succ y \}.
\]
Given a topology on X, we say that the preference relation is continuous if for each y, the sets
W pyq and Bpyq are open. The preference relation is upper semicontinuous if W pyq is open for
each y. Lower semicontinuity of the preference relation is defined analogously.
The set $X$ is called Jaffray order separable if there is a countable subset $D \subset X$ such that for all $x, y \in X$,
\[
x \prec y \implies \exists \, d, d' \in D : x \preceq d \prec d' \preceq y.
\]
One can verify that there is a utility function if and only if $X$ is Jaffray order separable.
(23.1). Proposition. If $X$ is countable, then there always exists a utility function.
Proof. Let $X = \{x_1, x_2, \dots\}$ and let $\{a_i\}$ be a sequence of positive numbers with $\sum_i a_i < \infty$. Set
\[
u(x) = \sum_{x_i \preceq x} a_i;
\]
then $u$ is a utility function.
□
Alternative proof. Let $X = \{x_1, x_2, \dots\}$ and define $u$ inductively as follows:
• $u(x_1) = 0$;
• \[
u(x_2) = \begin{cases} u(x_1) & \text{if } x_1 \sim x_2, \\[2pt] \dfrac{u(x_1) - 1}{2} & \text{if } x_2 \prec x_1, \\[4pt] \dfrac{u(x_1) + 1}{2} & \text{if } x_1 \prec x_2; \end{cases}
\]
• in general, $u(x_{n+1})$ equals $u(x_i)$ if $x_{n+1} \sim x_i$ for some $i \le n$, and otherwise is placed strictly between the values of the nearest strict neighbours of $x_{n+1}$ among $x_1, \dots, x_n$ (or beyond the largest or smallest value, as in the second step).
□
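The construction in the first proof is easy to carry out concretely; the sketch below (a hypothetical finite example of ours, with the preference encoded by a rank function in which ties model indifference) uses $a_i = 2^{-i}$:

```python
# Utility via u(x) = sum of a_i over alternatives x_i preceq x, a_i = 2^{-i}.
rank = {"x1": 2, "x2": 0, "x3": 1, "x4": 2, "x5": 3}   # higher = better

def u(x):
    return sum(2.0 ** -(i + 1)
               for i, xi in enumerate(sorted(rank))    # enumerate X
               if rank[xi] <= rank[x])                  # x_i preceq x

# u represents the preference: order and indifference are both preserved.
for a in rank:
    for b in rank:
        assert (rank[a] <= rank[b]) == (u(a) <= u(b))
print(u("x2"), u("x5"))
```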
(23.2). Definition (Lexicographic Preference). Let $X$ and $Y$ be two sets with preference relations $\preceq_1$ and $\preceq_2$ respectively. On $X \times Y$, we define the preference relation $\preceq$ as follows:
\[
(x, y) \preceq (x', y') \quad \text{if } x \prec_1 x', \text{ or } x \sim_1 x' \text{ and } y \preceq_2 y'.
\]
The preference relation $\preceq$ is called the lexicographic preference.
(23.3). Proposition. Let $X = Y = [0, 1]$. On $X \times Y$, consider the lexicographic order induced by the usual order on $[0, 1]$. There is no utility function which represents this order on $X \times Y$.
Proof. Suppose $u$ is a utility function representing the lexicographic order on $[0,1] \times [0,1]$. For each $a \in [0, 1]$, note that $(a, 0) \prec (a, 1)$, so we can choose a rational number $q(a) \in (u(a, 0), u(a, 1))$. If $a < a'$, then $(a, 1) \prec (a', 0)$, so $u(a, 1) < u(a', 0)$ and the chosen intervals are disjoint; hence $q : [0, 1] \to \mathbb{Q}$ is one-to-one. This is a contradiction, since $[0, 1]$ is uncountable and $\mathbb{Q}$ is countable. Thus there cannot be any utility function representing the lexicographic order on $[0,1] \times [0,1]$.
□
(23.4). Exercise. Show that lexicographic preference is not continuous.
(23.5). Definition. A preference relation on $X$ is said to be separable if there is a countable set $D \subseteq X$ such that
\[
x \prec y \implies x \preceq z \preceq y \ \text{for some } z \in D.
\]
(23.6). Theorem. Let $X$ be uncountable. The preference relation is representable by a utility function if and only if it is separable.
Proof. The existence of a utility function clearly implies the separability of the preference relation, so we prove only the converse.
Let $D$ be a countable set witnessing separability, and let $\mu$ be a probability distribution on $D$ with $\mu_y > 0$ for every $y \in D$. For any $x \in X$, let
\[
u(x) = \sum_{y \in W(x) \cap D} \mu_y \ - \sum_{y \in B(x) \cap D} \mu_y.
\]
Then $u$ satisfies the desired properties.
□
References
[1] T. Börgers and R. Sarin, Learning through reinforcement and replicator dynamics, J. Econ. Theory, 77 (1997), 1–14.
[2] G. W. Brown, Iterative solutions of games by fictitious play, in Koopmans (ed.), 374–376, 1951.
[3] G. W. Brown and J. von Neumann, Solutions of games by differential equations, in Kuhn and Tucker (eds.), 73–79, 1950.
[4] T. Fujimoto, An extension of Tarski's fixed point theorem and its application to isotone complementarity problems, Math. Programming, 28 (1984), 116–118.
[5] J. Gait, Stability in the gaming equation, Bull. Aust. Math. Soc., 21 (1980), 207–210.
[6] J. Geanakoplos, Nash and Walras equilibrium via Brouwer, Economic Theory, 21 (2003), 585–603.
[7] S. Karlin, Mathematical Methods and Theory in Games, Programming, and Economics, Dover, 1992.
[8] P. D. Lax, Change of variables in multiple integrals, Amer. Math. Monthly, 106 (1999), 497–501.
[9] R. D. Luce and H. Raiffa, Games and Decisions, Dover, 1957.
[10] A. Mas-Colell, M. D. Whinston, and J. R. Green, Microeconomic Theory, Oxford University Press, New York, 1995.
[11] J.-F. Mertens, S. Sorin, and S. Zamir, Repeated Games, Econometric Society Monographs 55, Cambridge University Press, New York, 2015.
[12] H. Nikaidô, Stability of equilibrium by the Brown–von Neumann differential equation, Econometrica, 27 (1959), 654–671.
[13] T. E. S. Raghavan, Completely mixed strategies in bimatrix games, J. London Math. Soc. (2), 2 (1970), 709–712.
[14] J. Robinson, An iterative method of solving a game, Annals of Mathematics, 54 (1951), 296–301.
[15] A. Tarski, A lattice-theoretical fixpoint theorem and its applications, Pacific J. Math., 5 (1955), 285–309.