Tutorial on Game Theory

Introduction to Game theory
Presented by: George Fortetsanakis
Game theory
• Game theory attempts to mathematically capture
behavior in strategic situations, or games, in which an
individual's success in making choices depends on the
choices of others (Myerson, 1991).
• A game consists of the following elements
– Players: Who participates in the game?
– Strategies: What can each player do?
– Payoff: What is the outcome of the game?
Normal form game
• Consider a simple game between two players α and β.
– Player α has n strategies s1, s2, ..., sn.
– Player β has m strategies t1, t2, …, tm.
• Each player receives a payoff when he chooses a certain
strategy.
– πα(i,j) is the payoff of player α if α chooses the strategy si and β
chooses the strategy tj.
– πβ(i,j) is the payoff of player β if α chooses the strategy si and β
chooses the strategy tj.
Bi-matrix representation
• Let πα and πβ be n×m matrices with entries πα(i,j) and
πβ(i,j). The game is now conveniently represented using
the bi-matrix notation: cell (i,j) lists the pair (πα(i,j), πβ(i,j)).
Example 1: Oil producing countries 1/2
• Two oil producing countries, SA and IR, can each produce
either 2 million or 4 million barrels per day.
– The total production level will be either 4, 6, or 8 million barrels
per day.
– Due to market demand the corresponding price per barrel will
be $25, $17, or $12.
• The cost of producing one barrel is $5.
Example 1: Oil producing countries 2/2
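• If a player does not know the action of the other player, it is
preferable to produce 4 million barrels: this yields the higher
profit whichever level the other player chooses (48 vs. 40, or
28 vs. 24 million dollars per day).
– If both do so, each player will end up earning 28 million dollars per day.
• If the two players cooperate they will choose to produce 2
million barrels each.
– Each player will end up earning 40 million dollars per day.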
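The figures in Example 1 can be reproduced with a short script. This is only a sketch: the names `PRICE`, `COST`, `LEVELS`, `profit`, and `bimatrix` are illustrative assumptions, and profit is taken to be output times (price minus cost), in million dollars per day.

```python
# Sketch of Example 1; all names below are illustrative assumptions.
PRICE = {4: 25, 6: 17, 8: 12}   # total output (million barrels/day) -> price per barrel ($)
COST = 5                         # production cost per barrel ($)
LEVELS = (2, 4)                  # output choices (million barrels/day)


def profit(own, other):
    """Daily profit in million dollars for a country producing `own`."""
    return own * (PRICE[own + other] - COST)


# Bi-matrix: entry (a, b) holds (profit of SA, profit of IR)
# when SA produces a and IR produces b.
bimatrix = {(a, b): (profit(a, b), profit(b, a)) for a in LEVELS for b in LEVELS}

for a in LEVELS:
    print([bimatrix[(a, b)] for b in LEVELS])

# Producing 4 beats producing 2 whatever the other country does.
dominant = all(profit(4, other) > profit(2, other) for other in LEVELS)
print("4 dominates 2:", dominant)
```

The two printed rows form the bi-matrix of the game, with (40, 40) for joint low output and (28, 28) for joint high output.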
Example 2: Rock-Paper-Scissors 1/2
• Each player chooses among the strategies s1 = Rock, s2 =
Paper, or s3 = Scissors.
– Paper wins over Rock, Rock wins over Scissors and Scissors wins
over Paper.
• The winner gets $1 from the loser and no money is
exchanged in the case of a tie.
Example 2: Rock-Paper-Scissors 2/2
• Example of a zero-sum game: the payoff of one player is
the negative of the payoff of the other player.
• Best way to play: choose each of the three strategies with
probability 1/3.
Mixed strategies
• The best way to play the Rock, Paper, Scissors game is
stochastic and it can be represented with the probability
vector:
– P = (pR, pP, pS) = (1/3, 1/3, 1/3).
– This is an example of a mixed strategy.
• Generalization: We consider a player that can choose
among the strategies s1, s2, …, sn. We define a mixed
strategy as a probability vector over s1, s2, …, sn:
– \( P = (p_{s_1}, p_{s_2}, \dots, p_{s_n}), \quad p_{s_i} \ge 0, \quad \sum_{i=1}^{n} p_{s_i} = 1 \)
Example 3: Hawk and Dove game 1/3
• Animals of a territorial species engage in fights over
territories. Their behavior comes in two variants.
– Hawk behavior: Animals fight until either victory or injury
ensues.
– Dove behavior: Animals display hostility at first but retreat at the first
sign of attack from the opponent.
• We define:
– υ: value of the territory won after a fight.
– w: cost of injury.
Example 3: Hawk and Dove game 2/3
• We distinguish the following cases:
– Two Hawks meet: If each Hawk wins with probability 1/2, the expected
payoff of a Hawk is υ*1/2 – w*1/2 = (υ-w)/2.
– Two Doves meet: Each Dove could win with probability 1/2 thus
the expected payoff of a Dove is υ/2.
– A Hawk and a Dove meet: The Hawk always wins achieving a
payoff υ while the Dove gains nothing.
Example 3: Hawk and Dove game 3/3
• Consider a large population of N animals consisting of N1
Hawks and N2 Doves.
– When an animal engages in a fight, it meets a Hawk with
probability p1 = N1/N and a Dove with probability p2 = N2/N.
• Expected payoff of a Hawk = \( p_1 \frac{\upsilon - w}{2} + p_2 \upsilon \)
• Expected payoff of a Dove = \( p_2 \frac{\upsilon}{2} \)
• The population reaches an equilibrium when \( p_1 = \frac{\upsilon}{w} \)
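The equilibrium share of Hawks can be checked numerically. The sketch below assumes that equilibrium means Hawks and Doves earn the same expected payoff, that p2 = 1 - p1, and that w > υ so that υ/w is a valid probability. The values υ = 2 and w = 6 are arbitrary illustrative choices, not taken from the slides.

```python
from fractions import Fraction


def hawk_payoff(p1, v, w):
    """Expected payoff of a Hawk when a fraction p1 of the population are Hawks."""
    return p1 * (v - w) / 2 + (1 - p1) * v


def dove_payoff(p1, v, w):
    """Expected payoff of a Dove when a fraction p1 of the population are Hawks."""
    return (1 - p1) * v / 2


v, w = Fraction(2), Fraction(6)   # illustrative values with w > v
p1_star = v / w                   # candidate equilibrium share of Hawks

print(p1_star, hawk_payoff(p1_star, v, w), dove_payoff(p1_star, v, w))
```

At `p1_star` the two payoffs coincide. With fewer Hawks than that, a Hawk earns more than a Dove, and with more Hawks the ordering reverses.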
Strategies and payoffs
• Consider a game in which a player can choose a strategy
from the set S = {s1, s2, …, sn}.
– All members of the set S are called pure strategies.
– A mixed strategy is a probability vector on the elements of S.
• The set of all strategies (pure and mixed) is denoted by
Δ(S).
– Δ(S) is convex: given two mixed strategies p and q, the convex
combination ap + (1-a)q, is also a mixed strategy, ∀ a ∈[0, 1]
Example on R3
• If S = {s1, s2, s3} then Δ(S) is the triangle in R3 with
vertices (1,0,0), (0,1,0), and (0,0,1), which correspond to the
pure strategies s1, s2, and s3.
Payoff function
• A payoff function π : S → R assigns a value πi to each
pure strategy si. We identify the function with the vector:
\[ \pi = (\pi_1, \pi_2, \dots, \pi_n) \in \mathbb{R}^n \]
• If p is a mixed strategy, the payoff is a random variable
whose expected value is the following:
\[ E_p[\pi] = \sum_{i=1}^{n} \pi_i p_i = \langle \pi, p \rangle \]
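The expected payoff is just an inner product, which the following minimal sketch computes. The function name `expected_payoff` and the numbers in `pi` and `p` are hypothetical, chosen only to illustrate the formula.

```python
def expected_payoff(pi, p):
    """Inner product <pi, p> of a payoff vector and a mixed strategy."""
    if len(pi) != len(p):
        raise ValueError("pi and p must have the same length")
    return sum(pi_i * p_i for pi_i, p_i in zip(pi, p))


pi = (3, 0, -1)            # hypothetical payoffs of s1, s2, s3
p = (0.5, 0.25, 0.25)      # a mixed strategy on {s1, s2, s3}
print(expected_payoff(pi, p))
```

For a pure strategy, written as a vector with a single 1, the same function returns the corresponding entry of `pi`.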
Best response
• A strategy si ∈ S is a pure best response to the payoff π if:
\[ \pi_i = \max_k \pi_k \]
• A mixed strategy that is a convex combination of pure
best response strategies is also a best response to π.
• Formally, a strategy p* is a best response to the payoff π if
p* maximizes <π, p>, or equivalently:
\[ p^* \in BR(\pi) \iff \langle \pi, p^* \rangle = \max_{p \in \Delta(S)} \langle \pi, p \rangle \]
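One way to test the best-response condition in code is sketched below. It relies on the fact that a mixed strategy maximizes <π, p> exactly when it places positive probability only on pure best responses. The function names and the payoff vector are assumptions for illustration.

```python
def pure_best_responses(pi):
    """Indices i with pi[i] == max_k pi[k]."""
    best = max(pi)
    return [i for i, value in enumerate(pi) if value == best]


def is_best_response(pi, p, tol=1e-12):
    """True if the mixed strategy p puts probability only on pure best responses."""
    best = set(pure_best_responses(pi))
    return all(p_i <= tol or i in best for i, p_i in enumerate(p))


pi = (1, 4, 4)   # hypothetical payoffs: s2 and s3 are both optimal
print(pure_best_responses(pi))
print(is_best_response(pi, (0, 0.3, 0.7)), is_best_response(pi, (0.1, 0.2, 0.7)))
```

Indices are zero-based, so the list `[1, 2]` refers to s2 and s3.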
Example on R3
• If the pure best response strategies are s2 and s3 then the set
of all best responses (pure and mixed) is the edge of Δ(S)
joining s2 and s3, i.e. the strategies (0, a, 1-a) with a ∈ [0, 1].
Normal form games
• A finite game in normal form can be described by the
following data:
• A finite set of players Γ = {γ1, γ2, …, γn}.
• A set of pure strategies Sγ for each player γ ∈ Γ.
– The set S = ×γ∈Γ Sγ is the set of strategy profiles and an element s =
(sγ)γ∈Γ assigns a pure strategy to each player.
• A payoff function πγ : S → R for each player γ ∈ Γ that
assigns a payoff to player γ given a strategy profile s.
Mixed strategy profiles
• We denote by pγ a (possibly mixed) strategy for the
player γ, i.e. pγ ∈ Δ(Sγ). The set of mixed strategy profiles
is:
\[ \Delta = \times_{\gamma \in \Gamma} \Delta(S_\gamma) \]
• An element p ∈ Δ contains the mixed strategies that are
chosen by all players and is written as p = (pγ1, pγ2, …, pγn).
• If p is a mixed strategy profile then the payoff of player γ
is a random variable whose expected value is:
\[ E_p[\pi_\gamma] = \pi_\gamma(p) = \sum_{i_1 \in S_{\gamma_1}} \cdots \sum_{i_n \in S_{\gamma_n}} \pi_\gamma(i_1, \dots, i_n) \, p_{i_1,\gamma_1} \cdots p_{i_n,\gamma_n} \]
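The nested sum over pure strategy profiles can be sketched as a loop over all index tuples, weighting each payoff by the product of the players' probabilities. The function name, the representation of a profile as a list of probability vectors, and the three-player game used as an example are all assumptions.

```python
from itertools import product


def expected_payoff(payoff, profile):
    """Expected payoff of one player under a mixed strategy profile.

    payoff  : function mapping a tuple of pure-strategy indices to a number
    profile : list of probability vectors, one per player
    """
    total = 0
    for pure in product(*(range(len(p)) for p in profile)):
        prob = 1
        for player, i in enumerate(pure):
            prob *= profile[player][i]   # players randomize independently
        total += payoff(pure) * prob
    return total


# Hypothetical 3-player game: the player earns 1 if all three choose the same index.
all_equal = lambda s: 1 if len(set(s)) == 1 else 0
value = expected_payoff(all_equal, [(0.5, 0.5)] * 3)
print(value)
```

The loop visits every pure profile, so its cost grows with the product of the strategy-set sizes. That is acceptable for small examples like this one.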
New notation
• We introduce the notation s-α to denote the pure
strategy profile for all players except α, i.e.
\[ s_{-\alpha} = (s_\gamma)_{\gamma \neq \alpha} \in \times_{\gamma \neq \alpha} S_\gamma \]
• Similarly we denote by p-α the profile of mixed strategies
for all players except player α, i.e.
\[ p_{-\alpha} = (p_\gamma)_{\gamma \neq \alpha} \in \times_{\gamma \neq \alpha} \Delta(S_\gamma) \]
Nash equilibrium
• A Nash equilibrium is a strategy profile p in which no
player can improve his payoff by changing his strategy
given that the other players leave their own strategy
unchanged.
• Formally, a Nash equilibrium for the game (Γ, S, {πγ}γ∈Γ) is
a strategy profile p* ∈ Δ, such that for every γ, p*γ is a
best response for the player γ given the strategy profile
p*-γ of the other players, i.e.
\[ \pi_\gamma(p^*) = \max_{q_\gamma \in \Delta(S_\gamma)} \pi_\gamma(q_\gamma, p^*_{-\gamma}), \quad \forall \gamma \in \Gamma \]
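For two players, the Nash condition can be checked directly. The sketch below is one possible implementation, not a prescribed method. Because the expected payoff is linear in a player's own mixed strategy, it only compares against pure-strategy deviations. The function names are assumptions, and the matrices `A` and `B` hold the profits computed from Example 1 (oil producers).

```python
def payoff(matrix, p, q):
    """Expected payoff sum_ij matrix[i][j] * p[i] * q[j]."""
    return sum(matrix[i][j] * p[i] * q[j]
               for i in range(len(p)) for j in range(len(q)))


def pure(k, n):
    """Pure strategy k written as a probability vector of length n."""
    return [1 if i == k else 0 for i in range(n)]


def is_nash(A, B, p, q, tol=1e-9):
    """Check the Nash condition for a two-player game.

    A[i][j] and B[i][j] are the payoffs of the row and column player.
    Only pure-strategy deviations are tested, which is sufficient because
    the expected payoff is linear in a player's own strategy.
    """
    n, m = len(p), len(q)
    row_ok = all(payoff(A, pure(i, n), q) <= payoff(A, p, q) + tol for i in range(n))
    col_ok = all(payoff(B, p, pure(j, m)) <= payoff(B, p, q) + tol for j in range(m))
    return row_ok and col_ok


# Example 1: rows/columns are outputs of 2 and 4 million barrels per day.
A = [[40, 24], [48, 28]]   # profits of SA (row player)
B = [[40, 48], [24, 28]]   # profits of IR (column player)
nash_44 = is_nash(A, B, [0, 1], [0, 1])   # both produce 4
nash_22 = is_nash(A, B, [1, 0], [1, 0])   # both produce 2
print(nash_44, nash_22)
```

Under these payoffs, joint high output satisfies the condition while the cooperative outcome does not, since each country gains by deviating to 4 million barrels.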
Example: Matching pennies game 1/4
• Two children, each holding a penny, independently choose
which side of their coin to show.
– Child 1 wins if both coins show the same side and child 2 wins
otherwise.
– The loser pays $1 to the winner.
Example: Matching pennies game 2/4
• Child 1 chooses the mixed strategy p1 = (p1,H, p1,T) = (p, 1-p)
• Child 2 chooses the mixed strategy p2 = (p2,H, p2,T) = (q, 1-q)
• Expected payoff for Child 1:
\[ \pi_1(p_1, p_2) = \sum_{i \in S_1} \sum_{j \in S_2} \pi_1(i,j) \, p_{1,i} \, p_{2,j} = 1 \cdot p_{1,H} p_{2,H} - 1 \cdot p_{1,H} p_{2,T} - 1 \cdot p_{1,T} p_{2,H} + 1 \cdot p_{1,T} p_{2,T} \]
\[ = pq - p(1-q) - (1-p)q + (1-p)(1-q) = p(4q-2) - 2q + 1 \]
• Expected payoff for Child 2:
\[ \pi_2(p_1, p_2) = \sum_{i \in S_1} \sum_{j \in S_2} \pi_2(i,j) \, p_{1,i} \, p_{2,j} = -1 \cdot p_{1,H} p_{2,H} + 1 \cdot p_{1,H} p_{2,T} + 1 \cdot p_{1,T} p_{2,H} - 1 \cdot p_{1,T} p_{2,T} \]
\[ = -pq + p(1-q) + (1-p)q - (1-p)(1-q) = q(2-4p) + 2p - 1 \]
Example: Matching pennies game 3/4
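• BR of child 1 to the mixed strategy p2 of child 2.
\[ BR_1(p_2) = \arg\max_{p} \pi_1(p_1, p_2) = \arg\max_{p} \left[ p(4q-2) - 2q + 1 \right] \]
• The objective is linear in p, so the above problem can be solved
using linear programming:
\[ BR_1(p_2) = \begin{cases} 1 & q > 1/2 \\ 0 & q < 1/2 \\ \text{any } p \in [0,1] & q = 1/2 \end{cases} \]
• BR of child 2 to the mixed strategy p1 of child 1.
\[ BR_2(p_1) = \arg\max_{q} \pi_2(p_1, p_2) = \arg\max_{q} \left[ q(2-4p) + 2p - 1 \right] \]
\[ BR_2(p_1) = \begin{cases} 1 & p < 1/2 \\ 0 & p > 1/2 \\ \text{any } q \in [0,1] & p = 1/2 \end{cases} \]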
Example: Matching pennies game 4/4
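• The NE of the game is the crossing point of BR1(p2) and
BR2(p1).
– With p1 = (p1,H, p1,T) = (p, 1-p) and p2 = (p2,H, p2,T) = (q, 1-q),
the two best-response curves cross at p = q = 1/2, so in the NE each
child shows heads or tails with probability 1/2.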
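The matching pennies analysis can be replayed numerically. The sketch below encodes child 1's expected payoff and both best-response correspondences as intervals, treating the indifferent case at 1/2 as the whole interval [0, 1], and then searches a coarse grid for points lying on both. The function names and the grid are assumptions for illustration.

```python
def payoff_child1(p, q):
    """Expected payoff of child 1; p and q are the probabilities of showing heads."""
    return p * q - p * (1 - q) - (1 - p) * q + (1 - p) * (1 - q)


def best_responses_child1(q):
    """Maximizing values of p, returned as an interval (low, high)."""
    if q > 0.5:
        return (1.0, 1.0)
    if q < 0.5:
        return (0.0, 0.0)
    return (0.0, 1.0)   # indifferent: every p in [0, 1] is optimal


def best_responses_child2(p):
    """Maximizing values of q, returned as an interval (low, high)."""
    if p < 0.5:
        return (1.0, 1.0)
    if p > 0.5:
        return (0.0, 0.0)
    return (0.0, 1.0)   # indifferent: every q in [0, 1] is optimal


def is_equilibrium(p, q):
    """True if p is a best response to q and q is a best response to p."""
    lo1, hi1 = best_responses_child1(q)
    lo2, hi2 = best_responses_child2(p)
    return lo1 <= p <= hi1 and lo2 <= q <= hi2


grid = [i / 4 for i in range(5)]   # 0, 0.25, 0.5, 0.75, 1
equilibria = [(p, q) for p in grid for q in grid if is_equilibrium(p, q)]
print(equilibria)
```

On this grid the only point that survives is p = q = 1/2, where child 1's expected payoff is 0.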