Bayesian Games - University of Virginia

Bayesian Games
Matthew H. Henry
November 10, 2004
References
1. Axelrod, Robert. 1987. “The evolution of strategies in iterated prisoner’s dilemma.” Genetic Algorithms and Simulated Annealing (ed. L. Davis). London: Pitman, pp. 32-43.
2. Gibbons, Robert. 1992. Game Theory for Applied Economists. Princeton, New Jersey: Princeton University Press.
3. Harsanyi, John C. 1967. “Games with Incomplete Information Played by Bayesian Players, Parts I, II and III.” Management Science 14:159-182, 320-334, 486-502.
4. Sigmund, Karl. 1993. Games of Life – Explorations in Ecology, Evolution, and Behaviour. Oxford, England: Oxford University Press.
Slide 1 of 24
Outline
•
Static Games with Bayesian Players
– Example: Scalping Tickets
– Nash Equilibria for Matrix Games with Incomplete Information: Generals
– Nash Equilibria for Games with Asymmetric Information: Cournot Model
– Nash Equilibria for Games with Continuous Type Space: Auction
•
Dynamic Games with Bayesian Players
– Perfect Bayesian Equilibrium for Games with Incomplete or Imperfect Information
– Example: 3-Player Game Tree
•
Signaling Games
– Perfect Bayesian Equilibrium for Signaling Games
– Example: Job Market Signaling
Slide 2 of 24
Static Games with Incomplete Information
•
Static games
– Players move simultaneously
– No observation of opponent move history
•
Games with incomplete information
– One or more players lack full information regarding the payoff functions and
strategies available
– We shall limit the information deficit to knowledge of the players’ states (or types)
•
Player type implies (and is implied by) payoff function
•
Matrix games will have a unique payoff matrix for each player type match-up
Slide 3 of 24
Example: Scalping Tickets
•
For example, consider a scenario in which you and the Cavalier are each
scalping tickets for beer money before the UVa-Miami football game
•
For every discrete round of the game, each player assumes one of two types
and can take one of two actions (stand in one of two locations)
– Types: Buyer or Seller
– Locations: in front of Durty Nellie’s Pub or at the Fry’s Spring Garage
•
You know whether you are buying or selling, and you believe that the Cavalier is
buying with probability p and selling with probability 1 - p
•
Four payoff matrices for the four possible type match-ups
•
Choose a spot to maximize profit (Durty Nellie’s or Fry’s Spring Garage) based
on your type and your best guess of the Cav’s type
Slide 4 of 24
A Better Example from Harsanyi
•
Consider two Generals A and B
– A seeks to maximize (maximin) the payoff and B seeks to minimize (minimax) it
– Fixed action sets: (a1, a2) for A and (b1, b2) for B
– Each leads an army which assumes one of two states: Strong or Weak
•
This yields four possible match-ups – (AS, BS), (AS, BW), (AW, BS), (AW, BW) –
with corresponding payoff matrices, each having its own Nash equilibrium:
Payoffs to A (rows: A's action; columns: B's action):

(AS, BS)      b1     b2
   a1          2      5
   a2         -1     20

(AS, BW)      b1     b2
   a1        -24    -36
   a2          0     24

(AW, BS)      b1     b2
   a1         28     15
   a2         40      4

(AW, BW)      b1     b2
   a1         12     20
   a2          2     13
[Harsanyi]
Slide 5 of 24
Bayesian Players
•
Each player knows his own state and estimates his opponent’s state
•
Each player has a pure strategy for every possible match-up
•
Each player forms a strategy based on the expected payoff
•
To continue the example given by Harsanyi, consider the following
probabilities of occurrence for the four possible match-ups:
          BS      BW
AS       4/10    1/10
AW       2/10    3/10
Slide 6 of 24
Bayesian Nash Equilibrium
BS b1, BW b1
BS b1, BW b2
BS b2, BW b1
BS b2, BW b2
This yields the following payoff matrix and a single pure strategy Nash equilibrium:
AS a1, AW a1
7.6
8.8
6.2
7.4
AS a1, AW a2
7.0
9.1
1.0
3.1
AS a2, AW a1
8.8
13.6
14.6
19.4
AS a2, AW a2
8.2
13.9
9.4
15.1
Example calculation: Bayesian Nash Equilibrium payoff
= (.4)(-1) + (.1)(0) + (.2)(28) + (.3)(12) = 8.8
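
The expected-payoff table and its equilibrium can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: it re-enters the four state-dependent payoff matrices and the match-up probabilities from the example, builds the 4x4 matrix of expected payoffs to A, and looks for a saddle point (a cell that is the minimum of its row and the maximum of its column, since A maximizes and B minimizes). The names `PAYOFF`, `PROB`, `STRATEGIES` and `saddle_points` are illustrative choices.

```python
from itertools import product

# Payoffs to General A, indexed [A's action][B's action] (0 -> a1/b1, 1 -> a2/b2),
# for each (A state, B state) match-up in the example.
PAYOFF = {
    ("S", "S"): [[2, 5], [-1, 20]],
    ("S", "W"): [[-24, -36], [0, 24]],
    ("W", "S"): [[28, 15], [40, 4]],
    ("W", "W"): [[12, 20], [2, 13]],
}
# Probability of each match-up.
PROB = {("S", "S"): 0.4, ("S", "W"): 0.1, ("W", "S"): 0.2, ("W", "W"): 0.3}

# A strategy assigns an action to each of the player's own states (Strong, Weak).
STRATEGIES = [dict(zip("SW", acts)) for acts in product((0, 1), repeat=2)]


def expected_payoff(strat_a, strat_b):
    """Expected payoff to A when both players follow state-dependent strategies."""
    return sum(
        PROB[sa, sb] * PAYOFF[sa, sb][strat_a[sa]][strat_b[sb]]
        for sa, sb in PROB
    )


# Rounded to one decimal to match the table and avoid float noise in comparisons.
matrix = [[round(expected_payoff(a, b), 1) for b in STRATEGIES] for a in STRATEGIES]


def saddle_points(m):
    """Cells that are the minimum of their row and the maximum of their column."""
    return [
        (i, j)
        for i, row in enumerate(m)
        for j, x in enumerate(row)
        if x == min(row) and x == max(r[j] for r in m)
    ]


for row in matrix:
    print(row)
print(saddle_points(matrix))
```

Rows and columns follow the order of the table (row index 2 is "AS a2, AW a1", column index 0 is "BS b1, BW b1").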
Slide 7 of 24
Interpretation of Bayesian Nash Equilibrium
•
Player A takes action a2 if Strong and a1 if Weak.
•
Player B takes action b1 irrespective of state.
•
This equilibrium emerges from the known probabilities of each possible match-up
•
Nash optimal: best response (in a Bayesian sense) on the part of each player to
the strategy of his opponent
•
Note that each player has a pure state-dependent strategy
(However, an outside observer could interpret it as a mixed strategy, with
Nature playing the part of a third indifferent player who randomly chooses
states for players A and B according to fixed probability distributions)
Slide 8 of 24
Static Bayesian Game #2: Cournot Model
•
Consider a Cournot model comprising two firms A and B producing the same
commodity to satisfy market demand, D.
•
The commodity price on the market is given by
\[ P = \begin{cases} D - q_A - q_B & \text{if } D > q_A + q_B \\ 0 & \text{otherwise} \end{cases} \]
•
Firm A’s cost of producing the commodity is cAqA
– cA is the marginal cost
– qA is the quantity that Firm A produces.
•
Firm B’s cost of producing the commodity is
– cB1qB, with probability p
– cB2qB, with probability (1-p).
•
Player state defined by its marginal cost
•
Each firm seeks to maximize its profit by anticipating the market price
Slide 9 of 24
Cournot Model and Asymmetric Information
•
Firm B knows its state and Firm A’s state
•
Firm A knows its own marginal cost but can only estimate Firm B’s state
•
Each firm knows of the other’s degree of knowledge
•
Gibbons calls this a Bayesian game with asymmetric information
•
Firm A chooses the optimal quantity qA to produce:
\[ \max_{q_A} \; \Big\{ p\,[D - q_A - q_B(c_{B1}) - c_A]\,q_A + (1-p)\,[D - q_A - q_B(c_{B2}) - c_A]\,q_A \Big\} \]
•
Firm B chooses the optimal quantity qB to produce
For cB1:
\[ \max_{q_B} \; [D - q_A - q_B - c_{B1}]\,q_B, \quad \text{where } q_B = q_B(c_{B1}) \]
For cB2:
\[ \max_{q_B} \; [D - q_A - q_B - c_{B2}]\,q_B, \quad \text{where } q_B = q_B(c_{B2}) \]
Slide 10 of 24
Analytical Solution: Bayesian Nash Equilibrium
System of Equations:
\[ q_A^* = \tfrac{1}{2}\Big\{ p\,[D - q_B^*(c_{B1}) - c_A] + (1-p)\,[D - q_B^*(c_{B2}) - c_A] \Big\} \]
\[ q_B^*(c_{B1}) = \frac{D - q_A^* - c_{B1}}{2}, \quad \text{Firm B's solution if it is in state 1} \]
\[ q_B^*(c_{B2}) = \frac{D - q_A^* - c_{B2}}{2}, \quad \text{Firm B's solution if it is in state 2} \]
Solutions:
\[ q_A^* = \frac{D - 2c_A + p\,c_{B1} + (1-p)\,c_{B2}}{3} \]
\[ q_B^*(c_{B1}) = \frac{D - 2c_{B1} + c_A}{3} - \frac{1-p}{6}\,(c_{B2} - c_{B1}) \]
\[ q_B^*(c_{B2}) = \frac{D - 2c_{B2} + c_A}{3} + \frac{p}{6}\,(c_{B2} - c_{B1}) \]
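
As a sanity check on the closed-form solutions, the sketch below (an illustration, not part of the original slides) evaluates them for one set of hypothetical parameter values and confirms that each quantity is a best response to the others. Firm A's best response is written with the factor 1/2 that follows from its first-order condition. The function names and the numbers in `params` are assumptions chosen only for illustration, and an interior solution (positive price and quantities) is assumed.

```python
def cournot_bne(D, c_a, c_b1, c_b2, p):
    """Closed-form Bayesian Nash equilibrium quantities (interior solution assumed)."""
    q_a = (D - 2 * c_a + p * c_b1 + (1 - p) * c_b2) / 3
    q_b1 = (D - 2 * c_b1 + c_a) / 3 - (1 - p) * (c_b2 - c_b1) / 6
    q_b2 = (D - 2 * c_b2 + c_a) / 3 + p * (c_b2 - c_b1) / 6
    return q_a, q_b1, q_b2


def best_response_residuals(D, c_a, c_b1, c_b2, p, q_a, q_b1, q_b2):
    """Distance of each quantity from the best response to the other firm's quantities."""
    # Firm A responds to the expected quantity of Firm B.
    br_a = (p * (D - q_b1 - c_a) + (1 - p) * (D - q_b2 - c_a)) / 2
    # Firm B knows its own cost and responds to Firm A's quantity.
    br_b1 = (D - q_a - c_b1) / 2
    br_b2 = (D - q_a - c_b2) / 2
    return q_a - br_a, q_b1 - br_b1, q_b2 - br_b2


# Hypothetical parameter values, chosen only for illustration.
params = dict(D=10.0, c_a=2.0, c_b1=1.0, c_b2=3.0, p=0.5)
q = cournot_bne(**params)
residuals = best_response_residuals(**params, q_a=q[0], q_b1=q[1], q_b2=q[2])
print(q)
print(residuals)
```

Residuals that vanish up to rounding error indicate that the three quantities solve the system of best-response equations simultaneously.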
Slide 11 of 24
Bayesian Game with Continuous Type Space: Auction
•
Consider an auction comprising two bidders and one item
•
Players offer bids, b1 or b2, for the item
•
b1, b2 ∈ [0, 1]
•
Each bidder values the item at v1 or v2 with payoff v1 – p or v2 – p, respectively, where p is the price the winning bidder pays
•
v1, v2 ∈ [0, 1]
\[ \max_{b_i} \; (v_i - b_i)\,P\{b_i > b_j\} + \tfrac{1}{2}\,(v_i - b_i)\,P\{b_i = b_j\} \]
Note: The latter term in this utility function applies only when bids are offered
in fixed increments. For bids from the continuous set [0,1], this term is zero.
Slide 12 of 24
Linear Equilibrium
•
We simplify the search for equilibrium by limiting the solution to the linear form
bi(vi) = ai + civi
•
This does not limit the player action spaces to linear strategies, but simply looks for a
linear equilibrium solution
•
We can assume that a player i will neither bid above the expected highest bid nor
below the lowest expected bid of player j
•
Therefore, aj ≤ bi ≤ aj + cj, since vj ∈ [0,1] and is a uniformly distributed random variable
Slide 13 of 24
Linear Equilibrium
•
This gives us:
\[ P\{b_i > b_j\} = P\{b_i > a_j + c_j v_j\} = P\Big\{ v_j < \frac{b_i - a_j}{c_j} \Big\} = \frac{b_i - a_j}{c_j} \]
and
\[ \max_{b_i} \; (v_i - b_i)\,\frac{b_i - a_j}{c_j} \]
or, from the first-order condition $-2b_i + v_i + a_j = 0$,
\[ b_i(v_i) = \begin{cases} \dfrac{v_i + a_j}{2} & \text{for } v_i \ge a_j \\ a_j & \text{otherwise} \end{cases} \]
similarly
\[ b_j(v_j) = \begin{cases} \dfrac{v_j + a_i}{2} & \text{for } v_j \ge a_i \\ a_i & \text{otherwise} \end{cases} \]
Slide 14 of 24
Linear Equilibrium
•
Since we are looking for a linear solution, ai ≤ 0 and aj ≤ 0, since values greater
than zero would yield a non-linear solution or, if greater than 1, would yield an
infeasible solution since neither bidder will offer more than he values the item.
•
Thus, since the bids must be non-negative, ai = aj = 0, and the solution is that
each bidder will offer one half his valuation of the item.
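
The result can be checked numerically. The sketch below (an illustration, not part of the original slides) assumes bidder j follows the equilibrium rule bj = vj/2 with vj uniform on [0, 1], and grid-searches bidder i's bid for the one that maximizes expected payoff. The function names and the grid resolution are illustrative assumptions.

```python
def expected_payoff(v_i, b_i):
    """Bidder i's expected payoff when bidder j bids v_j / 2 with v_j uniform on [0, 1]."""
    # P(b_i > v_j / 2) = P(v_j < 2 * b_i), clipped to the interval [0, 1].
    win_prob = min(max(2.0 * b_i, 0.0), 1.0)
    return (v_i - b_i) * win_prob


def best_bid(v_i, steps=1000):
    """Grid search over bids in [0, 1] for the payoff-maximizing bid."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda b: expected_payoff(v_i, b))


for v in (0.2, 0.5, 0.8):
    print(v, best_bid(v))
```

For each valuation tried, the best bid found on the grid is half the valuation, consistent with bi(vi) = vi/2 being a best response to itself.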
Slide 15 of 24
Dynamic Games with Bayesian Players
•
Dynamic games with incomplete or imperfect information
– Players move after observing the actions taken by their opponents.
– Recall from the initial discussion on static games that information incompleteness
implied an information deficit with respect to an opponent’s type or state
– Information imperfection implies that each successive player’s move is based on
complete information about the state of the other players but flawed information
about the state of the game; i.e., the play history on the part of his opponents
•
These games require a new solution concept: perfect Bayesian equilibrium
Slide 16 of 24
Perfect Bayesian Equilibrium
Gibbons gives the following four requirements for a perfect Bayesian equilibrium:
1. For each game turn, the moving player must have a belief about the state of the game,
i.e. the play history to that point, in the form of a probability distribution over the set of
the possible game sub-states at that point.
2. Given their beliefs, the players’ strategies must be sequentially rational.
Note: An example of a strategy that is not sequentially rational (but is effective under some
circumstances) is tit-for-tat in repeated prisoner’s dilemma games. [Axelrod, Sigmund]
3. At each game state on the equilibrium path, beliefs are formed by observation-driven
Bayes’ rule and players’ equilibrium strategies.
(For a given equilibrium in a sequential game, a game state is on the equilibrium path if it will be reached with
positive probability when the game is played according to equilibrium strategies. Otherwise, the state is off the
equilibrium path.)
4. For game states off the equilibrium path, beliefs are formed by Bayes’ rule and players’
equilibrium strategies where possible.
Slide 17 of 24
Simple Example
Consider the following 3-player Game Tree. Each set of nodes corresponding to outcomes
associated with any particular player’s move represents a possible game state.
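[Figure: 3-player game tree]
P1 chooses A or D.
– A: the game ends with payoffs (2, 0, 0)
– D: P2 chooses L or R
P3 moves without observing P2’s choice, holding belief [p] that P2 chose L and [1-p] that P2 chose R, and chooses L’ or R’.
Payoffs (P1, P2, P3):
– (D, L, L’): (1, 2, 1)
– (D, L, R’): (3, 3, 3)
– (D, R, L’): (0, 1, 2)
– (D, R, R’): (0, 1, 1)
R1. This requirement is relevant for P3 only since if P1 chooses A, the game is over, and thus
P2 has only to believe that he is in state D if he has a turn. Player 3 must conclude that p
= 1 since R is dominated by L for player 2.
R2. Given this belief, Player 3 must choose R’.
R3. This requirement is satisfied by R1.
R4. This requirement is trivially satisfied since there are no states off the equilibrium path.
Thus, the equilibrium (D,L,R’) can be confirmed by inspection.
Slide 18 of 24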
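
The reasoning in R1 and R2 can be traced in a few lines of code. The sketch below (an illustration, not part of the original slides) encodes the terminal payoffs of the tree, checks that L dominates R for Player 2, and then computes each player's best response given Player 3's belief p = 1. The function names are illustrative assumptions.

```python
# Terminal payoffs (P1, P2, P3) after Player 1 plays D, keyed by (P2 move, P3 move).
PAYOFFS = {
    ("L", "L'"): (1, 2, 1),
    ("L", "R'"): (3, 3, 3),
    ("R", "L'"): (0, 1, 2),
    ("R", "R'"): (0, 1, 1),
}
PAYOFF_A = (2, 0, 0)  # Player 1 plays A and the game ends.


def p3_best_response(p):
    """Player 3's move maximizing expected payoff, given belief p that Player 2 played L."""
    def expected(move):
        return p * PAYOFFS["L", move][2] + (1 - p) * PAYOFFS["R", move][2]
    return max(("L'", "R'"), key=expected)


def p2_best_response(p3_move):
    """Player 2's payoff-maximizing move, anticipating Player 3's move."""
    return max(("L", "R"), key=lambda m: PAYOFFS[m, p3_move][1])


def p1_best_response(p2_move, p3_move):
    """Player 1 plays D only if the anticipated outcome beats the payoff from A."""
    return "D" if PAYOFFS[p2_move, p3_move][0] > PAYOFF_A[0] else "A"


# L strictly dominates R for Player 2, so Player 3's belief must be p = 1.
l_dominates_r = all(PAYOFFS["L", m][1] > PAYOFFS["R", m][1] for m in ("L'", "R'"))
p3 = p3_best_response(p=1.0)
p2 = p2_best_response(p3)
p1 = p1_best_response(p2, p3)
print(l_dominates_r, (p1, p2, p3))
```

Calling `p3_best_response` with other beliefs shows how Player 3's choice depends on p: with these payoffs R’ is preferred whenever p exceeds 1/3.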
Signaling Games
•
Games of two players with incomplete information about the opponent’s type
•
One player is the Sender, one is the Receiver.
•
Nature draws a type for the Sender according to a probability distribution on
the set of feasible types.
•
The Sender observes his type and sends a message based on that type. The
Sender can follow pooling, separating or hybrid strategies.
– A pooling Sender transmits the same message regardless of type.
– A separating Sender transmits a different message for each type.
•
The Receiver observes the message but not the type and chooses an action.
•
Payoffs to the Sender and Receiver are each a function of Sender type, message
and Receiver action.
Slide 19 of 24
Requirements for Perfect Bayesian Equilibrium in Signaling Games
1. After observing the Sender’s message, the Receiver must have a belief about
the Sender’s type in the form of a probability distribution conditional upon the
message transmitted.
2R. For each message observed, the Receiver’s action must maximize the
Receiver’s expected payoff, given the belief about the Sender’s type.
2S. For each type determined by Nature, the Sender’s message must maximize his
expected payoff, given the Receiver’s strategy, defined as the set of actions to
be taken as functions of the message transmitted.
3. For each message transmittable by the Sender, if there exists a sender type
such that the message is optimal for that type, then the Receiver’s belief about
the Sender’s type must be derivable from Bayes’ rule and the Sender’s
strategy.
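
Requirement 3 amounts to applying Bayes’ rule to the prior over types and the Sender’s strategy. The sketch below (an illustration, not part of the original slides) computes the Receiver’s belief after a message and returns `None` for a message that no type sends, where Bayes’ rule places no restriction. The function name, the message labels `m1` and `m2`, and the prior are illustrative assumptions.

```python
def receiver_belief(prior, strategy, message):
    """Posterior over Sender types after `message`, or None if no type sends it.

    prior:    {type: probability that Nature draws that type}
    strategy: {type: {message: probability that this type sends the message}}
    """
    joint = {t: prior[t] * strategy[t].get(message, 0.0) for t in prior}
    total = sum(joint.values())
    if total == 0:
        return None  # off the equilibrium path: Bayes' rule does not apply
    return {t: joint[t] / total for t in prior}


prior = {"High": 0.25, "Low": 0.75}  # hypothetical prior over types
pooling = {"High": {"m1": 1.0}, "Low": {"m1": 1.0}}
separating = {"High": {"m1": 1.0}, "Low": {"m2": 1.0}}

print(receiver_belief(prior, pooling, "m1"))
print(receiver_belief(prior, separating, "m1"))
print(receiver_belief(prior, pooling, "m2"))
```

Under the pooling strategy the message carries no information and the belief equals the prior; under the separating strategy each message reveals the type.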
Slide 20 of 24
Example: Job Market Signaling
•
Nature determines a worker’s (the Sender) productive ability, which can be either High
or Low. The probability that his ability is High is q.
•
The worker observes his ability and chooses a level of education (his message to
potential employers).
•
The hiring market (the Receiver) observes the worker’s level of education and, based on
a belief about the worker’s ability, offers a wage (Receiver’s action).
•
Payoff to the worker is W – C(a, e), where W is the wage offered and C is the cost (financial
cost plus intellectual difficulty) of attaining a particular level of education as a function of
ability a and education level e. Presumably, the cost of attaining a higher level of
education is relatively high for a Low ability worker due to the additional intellectual
difficulty sustained by the worker in its pursuit.
•
Payoff to the hiring market is P(a, e) – W, where P is the level of productivity supplied
by the worker as a function of ability and education level.
Slide 21 of 24
Complete Information Solution
[Figure: wage W versus education level e, showing the productivity lines P(H,e) and P(L,e), the indifference curves IH and IL, and the complete-information outcomes (e*(L), W*(L)) and (e*(H), W*(H)).]
Note the marginal cost of education is higher for a Low ability worker, thus he
would require a higher relative salary to justify pursuing a higher education, hence
the steeper indifference curve.
The Productivity lines are found from the Nash solution W(e) = P(a,e) in which the
market, which is presumed to be competitive and therefore devoid of excess
profit, offers a wage equal to the expected level of productivity.
Slide 22 of 24
Pooling Equilibria and the Power of Envy
•
Suppose now that the hiring market has incomplete information about the
worker’s type and only observes the level of education attained by the
workers.
•
Suppose further that a Low ability worker is envious of a High ability worker’s
salary and decides to attempt to masquerade as a High ability worker by
getting a more advanced degree.
•
This constitutes a pooling strategy since the worker will attempt to signal to
the hiring market that he is of High ability irrespective of type.
Note, this is only rational if the following inequality holds:
W*(H) - C[L,e*(H)] > W*(L) – C[L,e*(L)]
Slide 23 of 24
Masquerading Workers with Pooling Strategies
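[Figure: wage W versus education level e, showing the productivity lines P(H,e) and P(L,e), the expected productivity line qP(H,e) + (1-q)P(L,e), the indifference curves IH and IL, the complete-information point (e*(L), W*(L)), and the pooling point (e*p, W*p).]
Here the Nash equilibrium sets the wage at W*p, where the expected Productivity line
intersects both indifference curves.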
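
The masquerading condition W*(H) - C[L,e*(H)] > W*(L) – C[L,e*(L)] and the expected productivity line qP(H,e) + (1-q)P(L,e) can each be written as a one-line check. The sketch below is an illustration, not part of the original slides; the function names and every number used are hypothetical, since the slides do not specify functional forms for C or P.

```python
def masquerade_is_rational(w_high, w_low, cost_low_at_e_high, cost_low_at_e_low):
    """True if a Low ability worker gains by acquiring the High ability education level."""
    return w_high - cost_low_at_e_high > w_low - cost_low_at_e_low


def pooled_wage(q, prod_high, prod_low):
    """Competitive wage when both types choose the same education: expected productivity."""
    return q * prod_high + (1 - q) * prod_low


# Hypothetical values, chosen only for illustration.
print(masquerade_is_rational(w_high=10.0, w_low=6.0,
                             cost_low_at_e_high=3.0, cost_low_at_e_low=1.0))
print(pooled_wage(q=0.25, prod_high=12.0, prod_low=6.0))
```

With these numbers the Low ability worker nets 7 by masquerading against 5 otherwise, so the inequality holds; raising the cost of the higher education level for the Low type above 5 would reverse it.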
Slide 24 of 24