Matrix Games

Mahesh Arumugam
Borzoo Bonakdarpour
Ali Ebnenasir
CSE 960: Selected Topics in Algorithms and Complexity
Instructor: Dr. Torng
Outline
• Basic concepts
• Problem statement
• LP Formulation of Matrix Games
• Minimax Theorem
• Gambling
• Bluffing and Underbidding
Basic Concepts
• Game: A description of strategic interaction
between rational parties based on a set of rules
• Rules: Constraints on the set of actions that each
party can take and the players’ interest
• Finite Game: Set of actions of each player is finite
• Two-Player Game: There exist only two players
[OR94] Osborne and Rubinstein, A Course in Game Theory, MIT Press, 1994.
Example:
The Game of Morra
• Rule:
– Each player hides one or two francs, and
– Tries to guess how many francs the other player has
hidden
• Payoff:
– If only one player guesses correctly
• he wins the total amount of hidden money
– Otherwise, the result is a draw
The Game of Morra:
Pure Strategies
• Possible courses of action for each player
– Hide one, guess one → [1,1]
– Hide one, guess two → [1,2]
– Hide two, guess one → [2,1]
– Hide two, guess two → [2,2]
• Pure strategy: a course of action
– Denoted [x,y]; i.e., hide x, guess y
The Game of Morra:
Payoff Matrix
Payoff to the row player A; x = [x1, x2, x3, x4], y = [y1, y2, y3, y4]

                      B
                y1     y2     y3     y4
             [1,1]  [1,2]  [2,1]  [2,2]
A  x1 [1,1]      0      2     -3      0
   x2 [1,2]     -2      0      0      3
   x3 [2,1]      3      0      0     -4
   x4 [2,2]      0     -3      4      0

xi – probability that row i is selected by row player
yj – relative frequency with which column j is selected
by column player
– X and Y are stochastic vectors
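
The payoff matrix can be rebuilt from the rules of Morra rather than copied by hand. The following is a minimal sketch; the names `STRATEGIES`, `morra_payoff`, and `morra_matrix` are illustrative and do not come from the slides.

```python
# Pure strategies (hide, guess), in the order used on the slide.
STRATEGIES = [(1, 1), (1, 2), (2, 1), (2, 2)]


def morra_payoff(row, col):
    """Payoff to the row player when strategy `row` meets strategy `col`."""
    row_hide, row_guess = row
    col_hide, col_guess = col
    row_correct = row_guess == col_hide
    col_correct = col_guess == row_hide
    total_hidden = row_hide + col_hide
    if row_correct and not col_correct:
        return total_hidden      # only the row player guessed correctly
    if col_correct and not row_correct:
        return -total_hidden     # only the column player guessed correctly
    return 0                     # both right or both wrong: a draw


def morra_matrix():
    """Return the 4x4 payoff matrix A = [a_ij] as a list of rows."""
    return [[morra_payoff(r, c) for c in STRATEGIES] for r in STRATEGIES]


if __name__ == "__main__":
    for strategy, payoffs in zip(STRATEGIES, morra_matrix()):
        print(list(strategy), payoffs)
```

Each entry satisfies a_ij = -a_ji, which reflects that both players have the same options in this game.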
The Game of Morra - Cont’d
• A plays only [1,2] or [2,1], each with probability 0.5
• B plays
– [1,1], [1,2], [2,1], [2,2] in c1, c2, c3, c4 rounds, respectively
• c1 + c2 + c3 + c4 = N, where N is the total number of rounds
• Record of the game (on average)
– In c1/2 rounds, A played [1,2] and B played [1,1]: A loses 2 francs
– In c1/2 rounds, A played [2,1] and B played [1,1]: A wins 3 francs
– In c4/2 rounds, A played [1,2] and B played [2,2]: A wins 3 francs
– In c4/2 rounds, A played [2,1] and B played [2,2]: A loses 4 francs
– All other rounds result in a draw
• Total winnings of A: (c1 – c4)/2 francs
What if the roles of A and B are swapped?
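
The bookkeeping above can be checked with exact arithmetic. This is a small sketch, assuming the Morra payoff matrix of the previous slide; `expected_per_column` and `total_winnings` are hypothetical helper names, and the round counts in the usage note are made up for illustration.

```python
from fractions import Fraction

# Morra payoffs to the row player A (rows and columns: [1,1], [1,2], [2,1], [2,2]).
A = [
    [0, 2, -3, 0],
    [-2, 0, 0, 3],
    [3, 0, 0, -4],
    [0, -3, 4, 0],
]

half = Fraction(1, 2)
x = [0, half, half, 0]  # A plays [1,2] or [2,1], each with probability 1/2


def expected_per_column(A, x):
    """Expected payoff to A against each pure strategy j of B: sum_i a_ij * x_i."""
    return [sum(x[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def total_winnings(A, x, counts):
    """Expected total for A when B plays column j in counts[j] rounds."""
    return sum(c * p for c, p in zip(counts, expected_per_column(A, x)))


if __name__ == "__main__":
    print(expected_per_column(A, x))
    print(total_winnings(A, x, [10, 5, 5, 20]))
```

With c1 = 10 and c4 = 20 the total is (10 - 20)/2 = -5 francs, so this 50/50 mix can be exploited by a B who favors [2,2]. Because the matrix satisfies a_ij = -a_ji, the same calculation applies with the roles of A and B swapped.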
Basic Concepts - Cont’d
• Round: a course of actions in which each player moves once
• Payoff: the value gained by a player in a round
• The Payoff Matrix defines a game for two players
• Zero-sum game: The sum of the average payoffs of the two players is 0

Payoff matrix A = [aij], with m rows and n columns:
– Rows i = 1, 2, …, m: possible moves of the row player
– Columns j = 1, 2, …, n: possible moves of the column player
– Entry aij: the resulting payoff of the row player when row i meets column j
Problem Statement
Given the payoff matrix A = [aij ],
– identify a mixture of moves for the row player (a stochastic vector x)
whose guaranteed average payoff per round is as large as possible, no matter
what moves the column player makes
LP Formulation of Matrix Games
xi – probability that row i is selected by row player
yj – relative frequency with which column j is selected
by column player
– X and Y are stochastic vectors
• Average payoff to the row player in each round:
\[ \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} x_i y_j \quad \text{or} \quad xAy \]
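
The double sum translates directly into code. The sketch below uses exact fractions and the Morra matrix as sample data; `average_payoff` is an illustrative name, not something defined in the slides.

```python
from fractions import Fraction


def average_payoff(A, x, y):
    """Return xAy = sum_i sum_j a_ij * x_i * y_j for stochastic vectors x and y."""
    assert sum(x) == 1 and all(p >= 0 for p in x), "x must be stochastic"
    assert sum(y) == 1 and all(q >= 0 for q in y), "y must be stochastic"
    return sum(
        A[i][j] * x[i] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


# Sample data: the Morra payoff matrix (rows and columns: [1,1], [1,2], [2,1], [2,2]).
MORRA = [
    [0, 2, -3, 0],
    [-2, 0, 0, 3],
    [3, 0, 0, -4],
    [0, -3, 4, 0],
]

if __name__ == "__main__":
    x = [0, Fraction(1, 2), Fraction(1, 2), 0]
    uniform = [Fraction(1, 4)] * 4
    print(average_payoff(MORRA, x, [1, 0, 0, 0]))  # B always plays [1,1]
    print(average_payoff(MORRA, x, uniform))       # B mixes uniformly
```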
LP Formulation of Matrix Games - Cont’d
• If the row player adopts the strategy specified by stochastic
vector x, he is assured to win
\[ \min_{y} xAy = \min_{j} \sum_{i=1}^{m} a_{ij} x_i \]
• The objective is to maximize this payoff:
\[ \text{maximize } \min_{j} \sum_{i=1}^{m} a_{ij} x_i \quad \text{s.t. } \sum_{i=1}^{m} x_i = 1, \quad x_i \ge 0 \ (i = 1, 2, \ldots, m) \]
or
\[ \text{maximize } z \quad \text{s.t. } z - \sum_{i=1}^{m} a_{ij} x_i \le 0 \ (j = 1, 2, \ldots, n), \quad \sum_{i=1}^{m} x_i = 1, \quad x_i \ge 0 \ (i = 1, 2, \ldots, m) \]
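
To make the objective concrete, the sketch below evaluates the guaranteed payoff min_j Σ_i a_ij x_i for two candidate mixes in Morra and checks the constraints of the second formulation. It does not solve the LP; `guaranteed_payoff` and `is_feasible` are hypothetical helpers, and the uniform mix is an arbitrary example.

```python
from fractions import Fraction

# Morra payoffs to the row player (rows and columns: [1,1], [1,2], [2,1], [2,2]).
A = [
    [0, 2, -3, 0],
    [-2, 0, 0, 3],
    [3, 0, 0, -4],
    [0, -3, 4, 0],
]


def guaranteed_payoff(A, x):
    """Value the row player is assured of with mix x: min_j sum_i a_ij * x_i."""
    m, n = len(A), len(A[0])
    return min(sum(A[i][j] * x[i] for i in range(m)) for j in range(n))


def is_feasible(A, x, z):
    """Check the LP constraints: z - sum_i a_ij x_i <= 0 for all j, x stochastic."""
    m, n = len(A), len(A[0])
    stochastic = sum(x) == 1 and all(p >= 0 for p in x)
    return stochastic and all(
        z - sum(A[i][j] * x[i] for i in range(m)) <= 0 for j in range(n)
    )


half_half = [0, Fraction(1, 2), Fraction(1, 2), 0]
uniform = [Fraction(1, 4)] * 4

if __name__ == "__main__":
    print(guaranteed_payoff(A, half_half))  # worst case of the 50/50 mix
    print(guaranteed_payoff(A, uniform))    # worst case of the uniform mix
```

For a fixed x, the largest feasible z is exactly `guaranteed_payoff(A, x)`, which is why maximizing z over (x, z) is equivalent to the max-min problem.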
LP Formulation of Matrix Games - Cont’d
• What is the dual of this problem?
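P: maximize z
\[ \text{s.t. } z - \sum_{i=1}^{m} a_{ij} x_i \le 0 \ (j = 1, 2, \ldots, n), \quad \sum_{i=1}^{m} x_i = 1, \quad x_i \ge 0 \ (i = 1, 2, \ldots, m) \]
D: minimize w
\[ \text{s.t. } w - \sum_{j=1}^{n} a_{ij} y_j \ge 0 \ (i = 1, 2, \ldots, m), \quad \sum_{j=1}^{n} y_j = 1, \quad y_j \ge 0 \ (j = 1, 2, \ldots, n) \]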
• What does this problem formalize?
The column player’s optimal strategy and the ceiling w on his average
loss per round that he is assured of if he adopts such a strategy!
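
As a short worked step (not on the original slide), weak duality for this pair can be written out directly. For any (x, z) feasible in P and any (y, w) feasible in D:

```latex
\begin{align*}
z &= \sum_{j=1}^{n} y_j \, z
    && \text{since } \sum_{j=1}^{n} y_j = 1 \\
  &\le \sum_{j=1}^{n} y_j \sum_{i=1}^{m} a_{ij} x_i
    && \text{constraints of P, } y_j \ge 0 \\
  &= \sum_{i=1}^{m} x_i \sum_{j=1}^{n} a_{ij} y_j
    && \text{both sides equal } xAy \\
  &\le \sum_{i=1}^{m} x_i \, w = w
    && \text{constraints of D, } x_i \ge 0,\ \sum_{i=1}^{m} x_i = 1
\end{align*}
```

So whatever the row player can guarantee never exceeds the ceiling the column player can enforce; the Minimax Theorem says the two bounds meet.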
Minimax Theorem
For every m × n matrix A there is a stochastic row vector x*
of length m and a stochastic column vector y* of length n
such that
\[ \min_{y} x^{*}Ay = \max_{x} xAy^{*} \]
with the minimum taken over all stochastic column vectors y
of length n and the maximum taken over all stochastic row
vectors x of length m.
Value of game
In a game, \( v = \min_{y} x^{*}Ay = \max_{x} xAy^{*} \) is called the value of
that game.
What are the implications of this theorem?
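
One practical implication is that a claimed optimal pair (x*, y*) can be certified without solving anything: if the row player's guarantee equals the column player's ceiling, both strategies are optimal. The sketch below checks this for Morra. The candidate (0, 3/5, 2/5, 0) is an assumption of this example, not a value given in the slides; the code only verifies it.

```python
from fractions import Fraction

# Morra payoffs to the row player (rows and columns: [1,1], [1,2], [2,1], [2,2]).
A = [
    [0, 2, -3, 0],
    [-2, 0, 0, 3],
    [3, 0, 0, -4],
    [0, -3, 4, 0],
]


def row_guarantee(A, x):
    """min over y of xAy, attained at a pure strategy: min_j sum_i a_ij x_i."""
    return min(sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0])))


def column_ceiling(A, y):
    """max over x of xAy, attained at a pure strategy: max_i sum_j a_ij y_j."""
    return max(sum(A[i][j] * y[j] for j in range(len(A[0]))) for i in range(len(A)))


# Assumed candidate strategies (to be verified, not taken from the slides).
x_star = [0, Fraction(3, 5), Fraction(2, 5), 0]
y_star = [0, Fraction(3, 5), Fraction(2, 5), 0]

if __name__ == "__main__":
    print(row_guarantee(A, x_star), column_ceiling(A, y_star))
```

Both numbers come out equal, so under this check the pair is optimal and the common number is the value of the game.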
Ready for Gambling?!!
• As long as a player adopts an optimal strategy, the player
can reveal it to the opponent
• Example: (The Game of Morra)
– The column player announces his/her guess
– The row player either guesses independently of the opponent or
adjusts his/her guess based on the extra information
– Additional pure strategies for the row player
• Hide 1, make the same guess → [1,S]
• Hide 1, make a different guess → [1,D]
• Hide 2, make the same guess → [2,S]
• Hide 2, make a different guess → [2,D]
Gambling:
Payoff Matrix and LP Solution
Payoff to the row player (rows: row player's strategy; columns: column player's strategy)

        [1,1]  [1,2]  [2,1]  [2,2]
[1,1]       0      2     -3      0
[1,2]      -2      0      0      3
[2,1]       3      0      0     -4
[2,2]       0     -3      4      0
[1,S]       0      0     -3      3
[1,D]      -2      2      0      0
[2,S]       3     -3      0      0
[2,D]       0      0      4     -4

Consider the optimal solution
x=[0, 56/99, 40/99, 0, 0, 2/99, 0, 1/99]
y=[28/99, 30/99, 21/99, 20/99]
Game value = 4/99
– The row player is assured to win at least this amount on average
– The column player is assured to lose no more than this amount on average
Do you think this game is fair?
What does this suggest?
Revealing the guess hurts the prospects of the column player only slightly:
the original game is symmetric, so its value is 0, and the value here is just 4/99!!
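
The 8x4 matrix and the stated solution can be checked against the rules. This is a sketch under the reading that "S" means the row player repeats the announced guess and "D" means the row player makes the other guess; the function and variable names are illustrative.

```python
from fractions import Fraction

COLUMN_STRATEGIES = [(1, 1), (1, 2), (2, 1), (2, 2)]
# Fixed guesses first, then guesses that depend on the announced guess.
ROW_STRATEGIES = [(1, 1), (1, 2), (2, 1), (2, 2),
                  (1, "S"), (1, "D"), (2, "S"), (2, "D")]


def payoff(row, col):
    """Payoff to the row player, who hears the column player's guess first."""
    row_hide, row_rule = row
    col_hide, col_guess = col
    if row_rule == "S":
        row_guess = col_guess        # same guess as announced
    elif row_rule == "D":
        row_guess = 3 - col_guess    # the other value in {1, 2}
    else:
        row_guess = row_rule         # guess fixed in advance
    row_correct = row_guess == col_hide
    col_correct = col_guess == row_hide
    total = row_hide + col_hide
    if row_correct and not col_correct:
        return total
    if col_correct and not row_correct:
        return -total
    return 0


A = [[payoff(r, c) for c in COLUMN_STRATEGIES] for r in ROW_STRATEGIES]

# Solution quoted on the slide.
x = [Fraction(k, 99) for k in (0, 56, 40, 0, 0, 2, 0, 1)]
y = [Fraction(k, 99) for k in (28, 30, 21, 20)]

row_guarantee = min(sum(A[i][j] * x[i] for i in range(8)) for j in range(4))
column_ceiling = max(sum(A[i][j] * y[j] for j in range(4)) for i in range(8))

if __name__ == "__main__":
    print(row_guarantee, column_ceiling)
```

The guarantee and the ceiling coincide, which certifies that x and y are optimal and that the common number is the game value.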
How about Bluffing or Underbidding?
• Are bluffing or underbidding rational strategies?
• Example: (Game invented by H. W. Kuhn)
– 2 players and a deck of three cards numbered 1, 2, and 3; each player holds one card
– On each turn, a player either bets or passes
– Play terminates when
• Bet is answered by bet; payoff 2 to player holding higher card
• Pass is answered by pass; payoff 1 to player holding higher
card
• Bet is answered by pass; payoff 1 to the player who bets
Bluffing, Underbidding:
Pure Strategies
• A’s strategies
1. Pass; if B bets, pass again
2. Pass; if B bets, bet again
3. Bet
3x3x3 pure strategies
• x1x2x3 – strategy for A
instructing him to follow line
xj when holding j
• B’s strategies
1. Pass no matter what A did
2. If A passes, pass; if A bets, bet
3. If A passes, bet; if A bets, pass
4. Bet no matter what A did
4x4x4 pure strategies
• y1y2y3 – strategy for B
Payoff matrix size: 27x64, reduced to 8x4!
Holding 1: A – refrain from line 2; B – refrain from lines 2 and 4;
Holding 3: A – refrain from line 1; B – refrain from lines 1, 2 and 3;
Holding 2: choose to pass in the first round; lines 1 or 2
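
The strategy counts can be reproduced by enumeration. The sketch below assumes that the "Holding 2" restriction applies to both players, which matches the row and column labels of the payoff matrix on the next slide; `A_ALLOWED`, `B_ALLOWED`, and `label` are illustrative names.

```python
from itertools import product

A_LINES = (1, 2, 3)
B_LINES = (1, 2, 3, 4)

# One line per card held (1, 2, 3): all pure strategies before any pruning.
all_a = list(product(A_LINES, repeat=3))
all_b = list(product(B_LINES, repeat=3))

# Lines still considered for each card held (index 0 is card 1, and so on).
A_ALLOWED = [(1, 3), (1, 2), (2, 3)]
B_ALLOWED = [(1, 3), (1, 2), (4,)]

reduced_a = list(product(*A_ALLOWED))
reduced_b = list(product(*B_ALLOWED))


def label(strategy):
    """Write a strategy such as (1, 2, 3) in the slide's x1x2x3 notation."""
    return "".join(str(line) for line in strategy)


if __name__ == "__main__":
    print(len(all_a), len(all_b))
    print([label(s) for s in reduced_a])
    print([label(s) for s in reduced_b])
```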
Bluffing, Underbidding:
Payoff Matrix and LP Solution
Payoff to A (rows: A's strategy x1x2x3; columns: B's strategy y1y2y3)

        114    124    314    324
112       0      0   -1/6   -1/6
113       0    1/6   -1/3   -1/6
122    -1/6   -1/6    1/6    1/6
123    -1/6      0      0    1/6
312     1/6   -1/3      0   -1/2
313     1/6   -1/6   -1/6   -1/2
322       0   -1/2    1/3   -1/6
323       0   -1/3    1/6   -1/6

Consider the optimal solution
A: [1/3, 0, 0, 1/2, 1/6, 0, 0, 0]
B: [2/3, 0, 0, 1/3]
Game Value = -1/18
Holding 1: BLUFF
– A bets 1/6 of the time!
– B bets 1/3 of the time (after A passes)!
Holding 3: UNDERBID
– A passes in the first round 1/2 of the time!
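
The payoff matrix and the game value can be recomputed from the rules. This sketch assumes the six deals of two distinct cards are equally likely and that A moves first, as the strategy lists imply; `play` and `expected_payoff` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import permutations

ROWS = ["112", "113", "122", "123", "312", "313", "322", "323"]  # A: x1x2x3
COLS = ["114", "124", "314", "324"]                              # B: y1y2y3


def play(a_card, b_card, a_line, b_line):
    """Payoff to A for one deal, given the line each player follows."""
    high = 1 if a_card > b_card else -1   # +1 if A holds the higher card
    if a_line == 3:                       # A bets
        # B answers with a bet (lines 2, 4) or a pass (lines 1, 3).
        return 2 * high if b_line in (2, 4) else 1
    if b_line in (1, 2):                  # A passes, B passes
        return high
    # A passes, B bets: A bets again (line 2) or passes again (line 1).
    return 2 * high if a_line == 2 else -1


def expected_payoff(a_strategy, b_strategy):
    """Average payoff to A over the six equally likely deals."""
    total = sum(
        play(a, b, int(a_strategy[a - 1]), int(b_strategy[b - 1]))
        for a, b in permutations((1, 2, 3), 2)
    )
    return Fraction(total, 6)


M = [[expected_payoff(r, c) for c in COLS] for r in ROWS]

# Solution quoted on the slide.
x = [Fraction(1, 3), 0, 0, Fraction(1, 2), Fraction(1, 6), 0, 0, 0]
y = [Fraction(2, 3), 0, 0, Fraction(1, 3)]

a_guarantee = min(sum(M[i][j] * x[i] for i in range(8)) for j in range(4))
b_ceiling = max(sum(M[i][j] * y[j] for j in range(4)) for i in range(8))

if __name__ == "__main__":
    print(a_guarantee, b_ceiling)
```

A's guarantee and B's ceiling coincide at the stated game value, so the bluffing and underbidding frequencies above are part of an optimal strategy and not a mistake.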
Thank U!
LP Formulation of Matrix Games: Identity (15.1)
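\[ \min_{y} xAy = \min_{j} \sum_{i=1}^{m} a_{ij} x_i \]
– It is trivial that
\[ \min_{y} xAy \le \min_{j} \sum_{i=1}^{m} a_{ij} x_i \]
since each pure strategy j (a unit vector) is itself a stochastic vector y
– Now, we show
\[ \min_{y} xAy \ge \min_{j} \sum_{i=1}^{m} a_{ij} x_i \]
– Let \( t = \min_{j} \sum_{i=1}^{m} a_{ij} x_i \); thus, for every stochastic vector y, we have
\[ xAy = \sum_{j=1}^{n} y_j \Big( \sum_{i=1}^{m} a_{ij} x_i \Big) \ge \sum_{j=1}^{n} y_j \, t = t \]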