Game_Theory

A very little Game Theory
Math 20
Linear Algebra and Multivariable Calculus
October 13, 2004
A Game of Chance
• You and I each have a six-sided die
• We roll and the loser pays the winner the difference in the numbers shown
• If we play this a number of times, who’s going to win?
The Payoff Matrix
• Lists one player’s (call him/her R) possible outcomes versus another player’s (call him/her C) outcomes
• Each $a_{ij}$ represents the payoff from C to R if outcomes i for R and j for C occur (a zero-sum game).

                         C’s outcomes
                    1    2    3    4    5    6
R’s outcomes   1    0   -1   -2   -3   -4   -5
               2    1    0   -1   -2   -3   -4
               3    2    1    0   -1   -2   -3
               4    3    2    1    0   -1   -2
               5    4    3    2    1    0   -1
               6    5    4    3    2    1    0
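Because the loser pays the winner the difference in the rolls, the entry in row i and column j of this table is i − j. A minimal Python sketch that rebuilds the matrix from that rule (the function name `dice_payoff_matrix` is an illustrative choice, not something from the slides):

```python
def dice_payoff_matrix(sides=6):
    """Payoff from C to R when R rolls i (row) and C rolls j (column)."""
    return [[i - j for j in range(1, sides + 1)]
            for i in range(1, sides + 1)]

A = dice_payoff_matrix()
for row in A:
    # Print each row in aligned columns, as in the table above
    print(" ".join(f"{entry:3d}" for entry in row))
```

Row 6 contains no negative entries, which is why R never loses money when showing a 6.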
Expected Value
• Let the probabilities of R’s outcomes and C’s outcomes be given by probability vectors
\[
p = \begin{pmatrix} p_1 & p_2 & \cdots & p_n \end{pmatrix},
\qquad
q = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{pmatrix}
\]
• The probability of R having outcome i and C having outcome j is therefore $p_i q_j$.
• The expected value of R’s payoff is
\[
E(p,q) = \sum_{i,j=1}^{n} p_i a_{ij} q_j = pAq
\]
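The double sum translates directly into code. The sketch below is one possible way to write it, assuming p and q are plain Python lists and A is a list of rows; the name `expected_value` is a hypothetical helper. It is tried on the dice matrix with two pure strategies, where the expected value should reduce to a single matrix entry.

```python
def expected_value(p, A, q):
    """E(p, q) = sum over i, j of p[i] * A[i][j] * q[j]."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(p))
               for j in range(len(q)))

# Dice game: entry (i, j) is i - j for rolls numbered 1..6
A = [[i - j for j in range(1, 7)] for i in range(1, 7)]

p = [0, 0, 0, 1, 0, 0]  # R always shows a 4
q = [0, 1, 0, 0, 0, 0]  # C always shows a 2
print(expected_value(p, A, q))  # 2, the entry in row 4, column 2
```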
Expected Value of this Game
\[
E(p,q) =
\begin{pmatrix} \tfrac16 & \tfrac16 & \tfrac16 & \tfrac16 & \tfrac16 & \tfrac16 \end{pmatrix}
\begin{pmatrix}
0 & -1 & -2 & -3 & -4 & -5 \\
1 & 0 & -1 & -2 & -3 & -4 \\
2 & 1 & 0 & -1 & -2 & -3 \\
3 & 2 & 1 & 0 & -1 & -2 \\
4 & 3 & 2 & 1 & 0 & -1 \\
5 & 4 & 3 & 2 & 1 & 0
\end{pmatrix}
\begin{pmatrix} \tfrac16 \\ \tfrac16 \\ \tfrac16 \\ \tfrac16 \\ \tfrac16 \\ \tfrac16 \end{pmatrix}
= \frac16 \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 \end{pmatrix}
\cdot \frac16 \begin{pmatrix} -15 \\ -9 \\ -3 \\ 3 \\ 9 \\ 15 \end{pmatrix}
= 0
\]
• A “fair game” if the dice are fair.
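The same two-step computation (first Aq, then p times the result) can be reproduced in exact arithmetic. This is only a sketch; it assumes the payoff rule i − j for the dice game and uses Python's `fractions` module so that no rounding occurs.

```python
from fractions import Fraction

# Dice game payoff matrix: entry (i, j) is i - j
A = [[i - j for j in range(1, 7)] for i in range(1, 7)]
fair = [Fraction(1, 6)] * 6

# Step 1: A q gives R's expected payoff for each fixed roll against a fair die
Aq = [sum(A[i][j] * fair[j] for j in range(6)) for i in range(6)]

# Step 2: p (A q) averages those payoffs over R's own fair die
E = sum(fair[i] * Aq[i] for i in range(6))

print([str(x) for x in Aq])  # the column (1/6)(-15, -9, -3, 3, 9, 15), reduced
print(E)                     # 0
```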
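Expected value with an unfair die
• Suppose
\[
p = \begin{pmatrix} \tfrac1{10} & \tfrac1{10} & \tfrac15 & \tfrac15 & \tfrac15 & \tfrac15 \end{pmatrix}
\]
• Then
\[
\begin{aligned}
E(p,q) &=
\begin{pmatrix} \tfrac1{10} & \tfrac1{10} & \tfrac15 & \tfrac15 & \tfrac15 & \tfrac15 \end{pmatrix}
\begin{pmatrix}
0 & -1 & -2 & -3 & -4 & -5 \\
1 & 0 & -1 & -2 & -3 & -4 \\
2 & 1 & 0 & -1 & -2 & -3 \\
3 & 2 & 1 & 0 & -1 & -2 \\
4 & 3 & 2 & 1 & 0 & -1 \\
5 & 4 & 3 & 2 & 1 & 0
\end{pmatrix}
\begin{pmatrix} \tfrac16 \\ \tfrac16 \\ \tfrac16 \\ \tfrac16 \\ \tfrac16 \\ \tfrac16 \end{pmatrix} \\
&= \frac1{10} \begin{pmatrix} 1 & 1 & 2 & 2 & 2 & 2 \end{pmatrix}
\cdot \frac16 \begin{pmatrix} -15 \\ -9 \\ -3 \\ 3 \\ 9 \\ 15 \end{pmatrix} \\
&= \frac1{60} \bigl[ (-15) + (-9) + 2(-3) + 2 \cdot 3 + 2 \cdot 9 + 2 \cdot 15 \bigr]
= \frac{24}{60} = \frac25
\end{aligned}
\]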
Strategies
• What if we could choose a die to be as biased as we wanted?
• In other words, what if we could choose a strategy p for this game?
• Clearly, we’d want to get a 6 all the time!
(The payoff matrix is the same as before: R’s outcomes index the rows, C’s outcomes the columns.)
Flu Vaccination
• Suppose there are two flu strains, and we have two flu vaccines to combat them.
• We don’t know the distribution of strains
• Neither pure strategy is the clear favorite
• Is there a combination of vaccines that maximizes immunity?

              Strain 1   Strain 2
Vaccine 1       0.85       0.70
Vaccine 2       0.60       0.90
Fundamental Theorem of Zero-Sum Games
• There exist optimal strategies p* for R and q* for C such that for all strategies p and q:
  E(p*,q) ≥ E(p*,q*) ≥ E(p,q*)
• E(p*,q*) is called the value v of the game
• In other words, R can guarantee a lower bound on his/her payoff and C can guarantee an upper bound on how much he/she loses
• This value could be negative, in which case C has the advantage
Fundamental Problem of Zero-Sum Games
• Find the p* and q*!
• In general, this requires linear programming. Next week!
• There are some games in which we can find optimal strategies now:
   ◦ Strictly-determined games
   ◦ 2×2 non-strictly-determined games
Network Programming
• Suppose we have two networks, NBC and CBS
• Each chooses which program to show in a certain time slot
• Viewer share varies depending on these combinations
• How can NBC get the most viewers?
Payoff Matrix
                                      CBS shows
                      60 Minutes   Survivor   CSI   Everybody Loves Raymond
NBC shows
  Friends                 60          20       30            55
  Dateline                50          75       45            60
  Law & Order             70          45       35            30
NBC’s Strategy
• NBC wants to maximize NBC’s minimum share
• In airing Dateline, NBC’s share is at least 45
• This is a good strategy for NBC

        60 M   Surv   CSI   ELR  |  row min
F        60     20     30    55  |    20
DL       50     75     45    60  |    45
L&O      70     45     35    30  |    30
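NBC's reasoning is a maximin computation: take the smallest entry in each row, then pick the row where that smallest entry is largest. A short sketch, with the share table typed in as a dictionary (the variable names are illustrative):

```python
# NBC's share for each NBC show (rows) against
# 60 Minutes, Survivor, CSI, Everybody Loves Raymond (columns)
shares = {
    "Friends":     [60, 20, 30, 55],
    "Dateline":    [50, 75, 45, 60],
    "Law & Order": [70, 45, 35, 30],
}

# Worst case for NBC in each row
row_min = {show: min(row) for show, row in shares.items()}

# NBC picks the show whose worst case is best
best_show = max(row_min, key=row_min.get)

print(row_min)
print(best_show, row_min[best_show])  # Dateline 45
```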
CBS’s Strategy
• CBS wants to minimize NBC’s maximum share
• In airing CSI, CBS keeps NBC’s share no bigger than 45
• This is a good strategy for CBS

          60 M   Surv   CSI   ELR
F          60     20     30    55
DL         50     75     45    60
L&O        70     45     35    30
col max    70     75     45    60
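CBS's reasoning is the mirror image, a minimax computation over columns: take the largest entry in each column, then pick the column where that largest entry is smallest. A sketch under the same illustrative naming:

```python
cbs_shows = ["60 Minutes", "Survivor", "CSI", "Everybody Loves Raymond"]

# NBC's share; rows are Friends, Dateline, Law & Order
A = [
    [60, 20, 30, 55],
    [50, 75, 45, 60],
    [70, 45, 35, 30],
]

# Best case for NBC (worst case for CBS) in each column
col_max = {show: max(row[j] for row in A) for j, show in enumerate(cbs_shows)}

# CBS picks the show that caps NBC's share as low as possible
best_show = min(col_max, key=col_max.get)

print(col_max)
print(best_show, col_max[best_show])  # CSI 45
```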
Equilibrium
• (Dateline, CSI) is an equilibrium pair of strategies
• Assuming NBC airs Dateline, CBS’s best choice is to air CSI, and vice versa

        60 M   Surv   CSI    ELR
F        60     20     30     55
DL       50     75    [45]    60
L&O      70     45     35     30
Characteristics of an Equilibrium
• Let A be a payoff matrix. A saddle point is an entry $a_{rs}$ which is the minimum entry in its row and the maximum entry in its column.
• A game whose payoff matrix has a saddle point is called strictly determined
• Payoff matrices can have multiple saddle points
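The definition of a saddle point can be checked entry by entry. The function below is a straightforward sketch of that check, not an optimized search; `saddle_points` is a hypothetical name and the indices it returns are 0-based.

```python
def saddle_points(A):
    """Return (row, column, value) for every entry that is both the
    minimum of its row and the maximum of its column (0-based indices)."""
    points = []
    for r, row in enumerate(A):
        for s, value in enumerate(row):
            column = [other_row[s] for other_row in A]
            if value == min(row) and value == max(column):
                points.append((r, s, value))
    return points

network = [[60, 20, 30, 55], [50, 75, 45, 60], [70, 45, 35, 30]]
dice = [[i - j for j in range(1, 7)] for i in range(1, 7)]
flu = [[0.85, 0.70], [0.60, 0.90]]

print(saddle_points(network))  # [(1, 2, 45)]  -> (Dateline, CSI)
print(saddle_points(dice))     # [(5, 5, 0)]   -> both players show a 6
print(saddle_points(flu))      # []            -> not strictly determined
```

An empty result means no pure strategy pair is an equilibrium, so mixed strategies are needed.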
Pure Strategies are optimal in Strictly-Determined Games
• If $a_{rs}$ is a saddle point, then $e_r^T$ is an optimal strategy for R and $e_s$ is an optimal strategy for C.
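Proof
• For any strategy q of C, since $a_{rs}$ is the minimum entry in row r:
\[
\begin{aligned}
E(e_r^T, q) = e_r^T A q
&= \begin{pmatrix} a_{r1} & a_{r2} & \cdots & a_{rn} \end{pmatrix}
\begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{pmatrix} \\
&= a_{r1} q_1 + a_{r2} q_2 + \cdots + a_{rn} q_n \\
&\ge a_{rs} q_1 + a_{rs} q_2 + \cdots + a_{rs} q_n \\
&= a_{rs} (q_1 + \cdots + q_n) = a_{rs} = E(e_r^T, e_s)
\end{aligned}
\]
• For any strategy p of R, since $a_{rs}$ is the maximum entry in column s:
\[
\begin{aligned}
E(p, e_s) = p A e_s
&= \begin{pmatrix} p_1 & p_2 & \cdots & p_m \end{pmatrix}
\begin{pmatrix} a_{1s} \\ a_{2s} \\ \vdots \\ a_{ms} \end{pmatrix} \\
&= p_1 a_{1s} + p_2 a_{2s} + \cdots + p_m a_{ms} \\
&\le p_1 a_{rs} + p_2 a_{rs} + \cdots + p_m a_{rs} \\
&= (p_1 + p_2 + \cdots + p_m) a_{rs} = a_{rs} = E(e_r^T, e_s)
\end{aligned}
\]
• So for all strategies p and q:
  $E(e_r^T, q) \ge E(e_r^T, e_s) \ge E(p, e_s)$
• Therefore we have found the optimal strategies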
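2x2 non-strictly determined
• In this case we can compute E(p,q) by hand in terms of $p_1$ and $q_1$
\[
\begin{aligned}
E(p,q) &= \begin{pmatrix} p_1 & p_2 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} q_1 \\ q_2 \end{pmatrix} \\
&= p_1 a_{11} q_1 + p_1 a_{12} q_2 + p_2 a_{21} q_1 + p_2 a_{22} q_2 \\
&= p_1 a_{11} q_1 + p_1 a_{12} (1 - q_1) + (1 - p_1) a_{21} q_1 + (1 - p_1) a_{22} (1 - q_1) \\
&= (a_{11} + a_{22} - a_{12} - a_{21}) p_1 q_1 - (a_{22} - a_{21}) q_1 + (a_{12} - a_{22}) p_1 + a_{22}
\end{aligned}
\]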
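Algebra like this is easy to get wrong by a sign, so a numeric spot check can help. The sketch below compares the fully expanded form with the form collected in powers of p1 and q1 on a grid of probabilities. The matrix entries are taken from the vaccination example, and both function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def expanded(a11, a12, a21, a22, p1, q1):
    """E(p, q) written out term by term with p2 = 1 - p1, q2 = 1 - q1."""
    p2, q2 = 1 - p1, 1 - q1
    return p1 * a11 * q1 + p1 * a12 * q2 + p2 * a21 * q1 + p2 * a22 * q2

def collected(a11, a12, a21, a22, p1, q1):
    """The same quantity grouped into p1*q1, q1, p1 and constant terms."""
    return ((a11 + a22 - a12 - a21) * p1 * q1
            - (a22 - a21) * q1
            + (a12 - a22) * p1
            + a22)

entries = (Fraction(17, 20), Fraction(7, 10), Fraction(3, 5), Fraction(9, 10))
grid = [Fraction(k, 4) for k in range(5)]  # 0, 1/4, 1/2, 3/4, 1

agree = all(expanded(*entries, p1, q1) == collected(*entries, p1, q1)
            for p1, q1 in product(grid, grid))
print(agree)  # True
```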
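Optimal Strategy for 2x2 non-SD
• Let
\[
p_1 = p_1^* = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}}; \qquad p_2 = 1 - p_1
\]
• This is between 0 and 1 if A has no saddle points
• Then the $q_1$ terms cancel, so for every q
\[
E(p,q) = \frac{(a_{12} - a_{22})(a_{22} - a_{21})}{a_{11} + a_{22} - a_{12} - a_{21}} + a_{22}
= \frac{a_{11} a_{22} - a_{12} a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}}
\]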
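Optimal set of strategies
• We have
\[
p^* = \begin{pmatrix}
\dfrac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} &
\dfrac{a_{11} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}}
\end{pmatrix}
\]
\[
q^* = \begin{pmatrix}
\dfrac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} \\[2ex]
\dfrac{a_{11} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}}
\end{pmatrix}
\]
\[
v = \frac{a_{11} a_{22} - a_{12} a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}}
\]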
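These three formulas fit in one small function. The sketch below assumes the caller has already confirmed that the matrix has no saddle point, since the formulas are only stated for that case. The name `solve_2x2` and the example matrix are hypothetical and do not come from the slides.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22):
    """Return (p*, q*, v) for a 2x2 game with no saddle point.

    The caller is responsible for checking that no saddle point exists.
    """
    d = a11 + a22 - a12 - a21
    if d == 0:
        raise ValueError("denominator is zero; the formulas do not apply")
    p = ((a22 - a21) / d, (a11 - a12) / d)
    q = ((a22 - a12) / d, (a11 - a21) / d)
    v = (a11 * a22 - a12 * a21) / d
    return p, q, v

# Hypothetical matrix [[3, -1], [-2, 4]]: row minima -1 and -2,
# column maxima 3 and 4, so there is no saddle point.
p, q, v = solve_2x2(Fraction(3), Fraction(-1), Fraction(-2), Fraction(4))
print(p)  # (Fraction(3, 5), Fraction(2, 5))
print(q)  # (Fraction(1, 2), Fraction(1, 2))
print(v)  # 1
```

Passing `Fraction` values keeps the results exact; plain integers would work too but would produce floats.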
Flu Vaccination
              Strain 1   Strain 2
Vaccine 1       0.85       0.70
Vaccine 2       0.60       0.90

\[
p_1 = \frac{.90 - .60}{.85 + .90 - .70 - .60} = \frac{.30}{.45} = \frac23,
\qquad p_2 = \frac13
\]
\[
q_1 = \frac{.90 - .70}{.85 + .90 - .70 - .60} = \frac{.20}{.45} = \frac49,
\qquad q_2 = \frac59
\]
\[
v = \frac{(.85)(.90) - (.70)(.60)}{.85 + .90 - .70 - .60} = \frac{.345}{.45} \approx .767
\]
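The same numbers come out of a few lines of exact arithmetic. This sketch applies the 2x2 formulas directly to the vaccine table; the effectiveness values are entered as decimal strings so that `Fraction` stores them exactly.

```python
from fractions import Fraction

# Rows: vaccine 1, vaccine 2; columns: strain 1, strain 2
a11, a12 = Fraction("0.85"), Fraction("0.70")
a21, a22 = Fraction("0.60"), Fraction("0.90")

d = a11 + a22 - a12 - a21          # common denominator, 9/20 = 0.45
p1 = (a22 - a21) / d               # share of population given vaccine 1
q1 = (a22 - a12) / d               # worst-case share of strain 1
v = (a11 * a22 - a12 * a21) / d    # value of the game

print(p1, 1 - p1)  # 2/3 1/3
print(q1, 1 - q1)  # 4/9 5/9
print(v)           # 23/30, about 0.767
```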
Flu Vaccination
• So we should give 2/3 of the population vaccine 1 and 1/3 vaccine 2
• The worst that could happen is a 4:5 distribution of strains
• In this case we cover 76.7% of the population
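One way to see why the 2/3 : 1/3 split is safe is to compute the coverage it gives against each strain separately and compare that with the worst case of each single vaccine. A minimal sketch with illustrative variable names:

```python
from fractions import Fraction

# Rows: vaccine 1, vaccine 2; columns: strain 1, strain 2
coverage = [[Fraction("0.85"), Fraction("0.70")],
            [Fraction("0.60"), Fraction("0.90")]]
p = [Fraction(2, 3), Fraction(1, 3)]  # share of population per vaccine

# Coverage of the mixed plan if only strain 1 (or only strain 2) circulates
by_strain = [sum(p[i] * coverage[i][j] for i in range(2)) for j in range(2)]

# Worst-case coverage when everyone gets the same vaccine
worst_pure = [min(row) for row in coverage]

print(by_strain)   # both entries equal 23/30
print(worst_pure)  # 7/10 for vaccine 1, 3/5 for vaccine 2
```

Since the mixed plan covers 23/30 against either strain, it covers 23/30 against any mix of the two, which beats the guaranteed coverage of either vaccine alone.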
Other Applications of GT
• War
   ◦ Battle of Bismarck Sea
• Business
   ◦ Product Introduction
   ◦ Pricing
• Dating