Lecture 3-4

Dynamic Games of Complete
Information
Extensive form games
• To model games with a dynamic structure
• Main issues with a dynamic structure:
1. Information structure: who knows what
and when?
2. Credibility
3. Commitment
4. The idea of Backward Induction
The Stackelberg game
• A dynamic version of the Cournot game
• Player 1, “Stackelberg leader” chooses
output q1 first
• Player 2, “Stackelberg follower” chooses
output q2 next
• Demand is linear: p(q)=12-q
• Player i’s utility is ui(q1 , q2 )=[12- (q1 + q2)]qi
• What is the Stackelberg equilibrium?
• Are there other Nash equilibria?
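A minimal sketch of the backward-induction calculation for this game, assuming zero production cost (as in the utility function above). The function names and the grid over q1 are illustrative choices, not part of the lecture.

```python
from fractions import Fraction

A = Fraction(12)  # demand intercept in p(q) = 12 - q

def follower_best_response(q1):
    # Follower maximizes (A - q1 - q2) * q2, so q2 = (A - q1) / 2 (never negative).
    return max(Fraction(0), (A - q1) / 2)

def leader_payoff(q1):
    # Leader anticipates the follower's best response (backward induction).
    q2 = follower_best_response(q1)
    return (A - q1 - q2) * q1

def follower_payoff(q1):
    q2 = follower_best_response(q1)
    return (A - q1 - q2) * q2

# Illustrative grid search over q1 in steps of 1/100 on [0, 12].
grid = [Fraction(k, 100) for k in range(0, 1201)]
q1_star = max(grid, key=leader_payoff)
q2_star = follower_best_response(q1_star)

print(q1_star, q2_star, leader_payoff(q1_star), follower_payoff(q1_star))
```

Under these assumptions the leader produces 6 and the follower 3. On the second question: the profile in which the follower plays 4 after every q1 and the leader plays 4 is also a Nash equilibrium, but the follower's threat is not credible off the equilibrium path.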
Model of strategic investment
• Firms 1, 2 have average cost of 2 per unit
• Firm 1 can install new technology at cost f.
Then average cost is zero
• Firm 2 can observe firm 1’s investment
• The two firms then move simultaneously to set
quantity. Demand is p(q)=14-q
• How should firm 1 forecast its rival’s output?
• Backward induction not directly applicable. Why?
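Because the quantity stage is simultaneous, each investment decision leads to a subgame that has to be solved as a Cournot game rather than by a single player's choice. A rough sketch of that calculation, assuming the standard linear Cournot formula q_i = (a - 2c_i + c_j)/3; the function name `cournot` and the variable names are hypothetical.

```python
from fractions import Fraction

A = Fraction(14)         # demand intercept in p(q) = 14 - q
BASE_COST = Fraction(2)  # average cost without the new technology

def cournot(c1, c2):
    """Nash equilibrium of the linear Cournot duopoly with unit costs c1, c2.

    Returns (q1, q2, profit1, profit2), gross of any investment cost f.
    """
    q1 = (A - 2 * c1 + c2) / 3
    q2 = (A - 2 * c2 + c1) / 3
    price = A - q1 - q2
    return q1, q2, (price - c1) * q1, (price - c2) * q2

no_invest = cournot(BASE_COST, BASE_COST)  # both firms have cost 2
invest = cournot(Fraction(0), BASE_COST)   # firm 1 has cost 0 after investing

# Firm 1 gains from investing only if f is below the increase in its Cournot profit.
max_f = invest[2] - no_invest[2]
print(no_invest, invest, max_f)
```

In this sketch firm 1 forecasts its rival's output as 4 without the investment and 10/3 with it, and investing pays when f < 112/9.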
The Extensive form
• Building blocks of the extensive form game:
1. The set of players
2. The order of moves - i.e. who moves when
3. The players’ payoffs as a function of moves
4. What the players’ choices are when they move
5. What a player knows when making his choice
6. Probability distribution over any exogenous events
Extensive form trees
• Rules for forming trees
1. Single starting point
2. No cycles
3. One way to proceed
• Define precedence relation: a ≺ b ⇔ a precedes b
1. ‘≺’ is asymmetric: a ≺ b means b ⊀ a
2. ‘≺’ is transitive
3. x′ ≺ x and x″ ≺ x implies x′ ≺ x″ or x″ ≺ x′
4. There is single initial node
Formalizing the extensive form
1. Let I be the finite set of players, with typical player i ∈ I
2. Let i(x) be the set of players that move at node x
3. Let Z be the set of terminal nodes. Maps ui: Z → R with values ui(z) give i’s payoffs to a sequence of moves ending at z
4. Let A(x) be the set of feasible actions at node x
5. Information sets h partition the nodes of the tree:
a. Each node x is in only one information set h(x)
b. If x′ ∈ h(x), then the player moving at x does not know whether he is at x or x′
c. If x′ ∈ h(x), then the same player moves at x and x′
d. If x′ ∈ h(x), then A(x) = A(x′). Thus A(h) is the action set at information set h
Example
• Two people want to go to a Broadway
musical in great demand
• There is exactly one ticket left, and
whoever arrives first gets it
• There are three transportation choices:
c(cab); b(bus); s(subway)
• Player 1 leaves home a little earlier
• A cab is faster than the subway, which is
faster than a bus
Strategies & equilibria in extensive form
• Let $H_i$ be the set of player i’s information sets
• Let $A_i = \bigcup_{h_i \in H_i} A(h_i)$ be the set of all actions for i
• A pure strategy for i is a map $s_i : H_i \to A_i$, with $s_i(h_i) \in A(h_i)$ for all $h_i \in H_i$
• The set of pure strategies for i is $S_i = \times_{h_i \in H_i} A(h_i)$
• The number of i’s pure strategies is given by the product $\# S_i = \prod_{h_i \in H_i} \#(A(h_i))$
• Mixed strategies in extensive form are called behavior strategies. Let $\Delta(A(h_i))$ be the set of probability distributions on $A(h_i)$. A behavior strategy for i, denoted $b_i$, is an element of the Cartesian product $\times_{h_i \in H_i} \Delta(A(h_i))$
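To illustrate the counting formula, here is a small sketch that enumerates pure strategies as one action per information set. It uses the ticket example's three transport modes; whether player 2 observes player 1's choice is an assumption made here for illustration, and the information-set labels are hypothetical.

```python
from itertools import product
from math import prod

def pure_strategies(action_sets):
    """Enumerate pure strategies: one action chosen at every information set.

    action_sets maps an information-set label to its list of feasible actions.
    """
    labels = list(action_sets)
    choices = product(*(action_sets[h] for h in labels))
    return [dict(zip(labels, choice)) for choice in choices]

modes = ["c", "b", "s"]  # cab, bus, subway

# Assumption: player 2 observes player 1's mode, so player 2 has three information sets.
observed = {f"after_{m}": modes for m in modes}
# Alternative assumption: player 2 does not observe it, so there is one information set.
unobserved = {"no_observation": modes}

print(len(pure_strategies(observed)), len(pure_strategies(unobserved)))
print(prod(len(a) for a in observed.values()))  # the product formula for #S_i
```

The enumeration and the product formula agree: 3 × 3 × 3 = 27 contingent plans when the choice is observed, and 3 when it is not.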
Strategic-form versus extensive-form
• Using its pure strategies and payoffs, an extensive form can be
transformed to strategic form
• Extensive form interpretation: player i waits until hi is reached
before deciding how to play there
• Strategic form interpretation: player i makes a complete
contingent plan in advance
• Games of perfect information, in which all information sets are singletons, constitute a special class
• Any mixed strategy σi (strat form) generates a behavior strategy
bi (ext form), but many different σi’s can generate the same bi
• Theorem (Kuhn 1953):
In a game of perfect recall, mixed and behavior strategies are
equivalent
Backward induction & Subgame perfection
• Theorem (Zermelo 1913; Kuhn 1953)
A finite game of perfect information has a pure strategy
Nash equilibrium
• Subgame perfection is the analog of backward induction
for multi-player situations
• G is a proper subgame of an extensive form game T if it
1. Starts at a single node x of T
2. Contains all successors of x
3. If x′ ∈ G and x″ ∈ h(x′), then x″ ∈ G
• A behavior strategy profile σ of an extensive form game is a subgame perfect equilibrium if the restriction of σ to G is a Nash equilibrium of G for every proper subgame G
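A minimal sketch of backward induction on a finite game of perfect information. The tree encoding (nested dicts with payoff tuples at terminal nodes) and the two-player entry game are hypothetical illustrations, not taken from the lecture; ties are broken in favor of the first action listed.

```python
def backward_induction(node):
    """Solve a finite perfect-information tree from the terminal nodes upward.

    A terminal node is a tuple of payoffs, one entry per player.
    A decision node is a dict with keys "name", "player", and "moves"
    (a mapping from action label to child node).
    Returns (payoffs, plan), where plan maps node names to chosen actions.
    """
    if isinstance(node, tuple):
        return node, {}
    best_action, best_payoffs, plan = None, None, {}
    for action, child in node["moves"].items():
        payoffs, sub_plan = backward_induction(child)
        plan.update(sub_plan)  # keep choices at every node, on or off the path
        mover = node["player"]
        if best_payoffs is None or payoffs[mover] > best_payoffs[mover]:
            best_action, best_payoffs = action, payoffs
    plan[node["name"]] = best_action
    return best_payoffs, plan

# Hypothetical entry game: player 0 is the entrant, player 1 the incumbent.
entry_game = {
    "name": "entrant", "player": 0,
    "moves": {
        "stay out": (0, 2),
        "enter": {
            "name": "incumbent", "player": 1,
            "moves": {"fight": (-1, -1), "accommodate": (1, 1)},
        },
    },
}

payoffs, plan = backward_induction(entry_game)
print(payoffs, plan)
```

The returned plan specifies an action at every decision node, including nodes off the equilibrium path, which is what subgame perfection requires of a strategy.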
Multi-stage games with observed
actions
1. There are k stages: 0, 1, …, k-1
2. All players know the actions chosen at all
previous stages
3. All players move simultaneously in each
stage
4. This includes games where players move
alternately (all other players have strategy:
“do nothing”)
Multi-stage games with observed
actions
• Let $a^0 \equiv (a_1^0, a_2^0, \ldots, a_I^0)$ be the stage-0 action profile
• At the beginning of stage 1, players know history $h^1$, which is just $a^0$
• Let $A_i(h^1)$ be player i’s action set at stage 1 with history $h^1$
• $h^{k+1}$ is the history at the end of stage k, $h^{k+1} = (a^0, a^1, \ldots, a^k)$, and $A_i(h^{k+1})$ is player i’s action set at stage k+1
• If the game has K stages, $H^K$ is the set of all ‘terminal histories’
• A pure strategy for i is a sequence of maps $\{s_i^k\}_{k=0}^{K}$ such that $s_i^k : H^k \to A_i(H^k)$, where $A_i(H^k) = \bigcup_{h^k \in H^k} A_i(h^k)$
Multi-stage games with observed actions
• Payoffs are defined on terminal histories,
ui: Hk+1→R
• In most applications, payoffs are additively
separable over stages. This isn’t necessary
• The game from stage k on with history hk is a
proper subgame G(hk), and a strategy profile s for
whole game induces si│hk for subgame G(hk)
• A Nash equilibrium s satisfies the familiar condition
$u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})$ for all $s_i'$
• A Nash equilibrium s is subgame perfect if si│hk is a
Nash equilibrium for every subgame G(hk)
Principle of optimality and subgame
perfection
• For multi-stage games with observed actions, we
have a useful characterization of subgame perfection
for the finite-horizon case
• Theorem (One-stage-deviation principle)
A strategy profile ‘s’ is subgame perfect iff no player i
can gain by deviating from ‘s’ in a single stage and
conforming to ‘s’ thereafter
• This theorem can be extended to the infinite horizon
case
Rubinstein Bargaining model
• Two players have to share a pie of size 1
• The game:
Step1: In periods 0, 2, 4,…, player 1 proposes a split
(x, 1-x)
Step 2: If player 2 accepts in period 2k, game ends.
If he rejects, he proposes (x, 1-x) in period 2k+1
Step 3: If player 1 accepts, game ends. Else, Step 1
• Discount factors are δ1, δ2, and if split (x, 1-x) is accepted at time t, payoffs are $(\delta_1^t x, \delta_2^t (1-x))$
• This is an infinite-horizon game of perfect info
Subgame perfect equilibrium
• A continuation payoff of a strategy profile in
subgame starting at t is utility in time-t units of
outcome induced by that profile
• Let $\underline{v}_1, \overline{v}_1$ be player 1’s lowest and highest continuation payoffs in any subgame that begins with player 1 making an offer
• Let $\underline{w}_1, \overline{w}_1$ be player 1’s lowest and highest continuation payoffs in any subgame that begins with player 2 making an offer
• Similarly define $\underline{v}_2, \overline{v}_2, \underline{w}_2, \overline{w}_2$ for player 2
Subgame perfect equilibrium
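• When 1 makes an offer, 2 will accept if he gets more than $\delta_2 \overline{v}_2$. Hence, $\underline{v}_1 \ge 1 - \delta_2 \overline{v}_2$. Also, by symmetry, $\underline{v}_2 \ge 1 - \delta_1 \overline{v}_1$
• Suppose player 1 makes offer (x, 1-x). By rejecting, 2 can get at least $\delta_2 \underline{v}_2$, so he accepts only if $1 - x \ge \delta_2 \underline{v}_2$. Thus, $x \le 1 - \delta_2 \underline{v}_2$. This implies that $\sup x \le 1 - \delta_2 \underline{v}_2 \Rightarrow \overline{v}_1 \le 1 - \delta_2 \underline{v}_2$
• Again, by symmetry, $\overline{v}_2 \le 1 - \delta_1 \underline{v}_1$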
Subgame perfect equilibrium
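• Now, $\underline{v}_1 \ge 1 - \delta_2 \overline{v}_2 \ge 1 - \delta_2 (1 - \delta_1 \underline{v}_1)$, so $\underline{v}_1 \ge \frac{1 - \delta_2}{1 - \delta_1 \delta_2}$
• Similarly, $\overline{v}_1 \le 1 - \delta_2 \underline{v}_2 \le 1 - \delta_2 (1 - \delta_1 \overline{v}_1)$, which gives $\overline{v}_1 \le \frac{1 - \delta_2}{1 - \delta_1 \delta_2}$
• But $\underline{v}_1 \le \overline{v}_1$, and thus $\underline{v}_1 = \overline{v}_1 = \frac{1 - \delta_2}{1 - \delta_1 \delta_2}$
• Proceeding similarly, $\underline{v}_2 = \overline{v}_2 = \frac{1 - \delta_1}{1 - \delta_1 \delta_2}$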
• What is the effect of patience?
As δ1→1, for fixed δ2, we have v1→1, and player
1 gets entire pie.
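A short sketch of the equilibrium share formula, compared with a finite-horizon version of the game solved by backward induction. The truncation (the last proposer keeps the whole pie) and both function names are assumptions made for illustration; the lecture's game is infinite-horizon.

```python
from fractions import Fraction

def rubinstein_share(d1, d2):
    """Player 1's share when proposing: (1 - d2) / (1 - d1 * d2)."""
    return (1 - d2) / (1 - d1 * d2)

def finite_horizon_share(d1, d2, periods):
    """Player 1's period-0 share in a game truncated after `periods` periods.

    Assumption: in the final period the proposer keeps the whole pie.
    Player 1 proposes in even periods, player 2 in odd periods.
    """
    share = 1  # proposer's share in the last period
    for t in range(periods - 2, -1, -1):
        # The responder at t proposes at t+1, so he must be offered his
        # discounted share from t+1; the proposer at t keeps the rest.
        responder_discount = d2 if t % 2 == 0 else d1
        share = 1 - responder_discount * share
    return share

half = Fraction(1, 2)
print(rubinstein_share(half, half))          # infinite-horizon share of player 1
print(finite_horizon_share(half, half, 2))   # two-period truncation
print(finite_horizon_share(0.5, 0.5, 41))    # long truncation, floating point
print(rubinstein_share(Fraction(99, 100), half))  # a very patient player 1
```

As the horizon grows, the truncated game's share approaches the closed-form value, and raising δ1 toward 1 with δ2 fixed pushes player 1's share toward the whole pie.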
A model of R&D race
• Firms R and S are conducting R&D
• Several stages need to be completed
• Simplifying assumptions:
1. Distance from goal can be measured. E.g. firm S is n-steps
away from completion
2. Either firm can move 1, 2, or 3 steps
3. It costs $2/7/15 to move 1/2/3 steps
4. Firm completing all the steps first gets patent worth 20
• What would happen if R-S were a cartel, maximizing
joint profits?
- Since only one firm gets patent, only it does R&D
- Chosen firm moves 1-step at a time, and firm closer to
finishing is chosen
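The cartel's choice of step size can be checked with a small dynamic program over the remaining distance. This is a sketch that ignores discounting and the time taken (an assumption; the slide does not specify either), and the function names are hypothetical.

```python
from functools import lru_cache

STEP_COST = {1: 2, 2: 7, 3: 15}  # cost of moving 1, 2, or 3 steps
PATENT_VALUE = 20

@lru_cache(maxsize=None)
def min_cost(n):
    """Cheapest total cost of covering n remaining steps."""
    if n <= 0:
        return 0
    return min(cost + min_cost(n - k) for k, cost in STEP_COST.items() if k <= n)

def cartel_profit(n):
    """Joint profit if only the firm that is n steps from the goal does R&D."""
    return PATENT_VALUE - min_cost(n)

for n in (1, 3, 5, 10):
    print(n, min_cost(n), cartel_profit(n))
```

Since one step at a time costs 2 per step, against 3.5 and 5 per step for the larger moves, the cheapest path always uses single steps. Choosing the closer firm minimizes the number of steps to pay for.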
The extensive form of R&D game
• Suppose the firms take turns deciding on R&D
investment: becomes a game of perfect info
• Converting to extensive-form:
1. Transform to location-space picture
2. Let (r, s) be coordinates of R and S , with r
depicting how far R is from finishing
Subgame perfect equilibrium of
R&D game
• If S is in R’s safety zone, whatever the zone number, it should drop out of the race
• Firm S in its own safety zone spends the minimum amount on R&D, moves one step at a time and wins the patent
• In Trigger zone n, each firm spends what it needs to (profitably) to get an advantage and move the game to its safety zone n-1
David vs Goliath in Entry Decisions
• Suppose Goliath has $700 and David has
$300
• They are gambling types, and prefer
roulette
• Whoever ends up with more money after
the next round will win ultimately
• Suppose David moves first and makes the
safest bet
• He can never win
David vs Goliath in Entry Decisions
• He should take one of the more risky
gambles
• Bets $300 that the ball would land on a
multiple of 3 – wins $900 w.p. 12/37
• What is Goliath’s best response?
• To exactly imitate David’s bet !
• Again, David can never win
• Is there any hope for David?
David vs Goliath in Entry Decisions
• David should have gone second and
differentiated himself
• This situation is parallel to new product launch
decisions when a firm with shallow pockets
competes against a firm with deep pockets
• If going second is not feasible, then entrant
should take riskier bets – like launching a
product with some chance of failing!!
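The imitation argument from the roulette example can be checked with simple arithmetic. The sketch below assumes a 37-slot wheel and a 2-to-1 payout on the multiple-of-3 bet (consistent with the $900 and 12/37 on the slide); the function names are hypothetical.

```python
from fractions import Fraction

P_WIN = Fraction(12, 37)  # 12 multiples of 3 among the 37 slots
PAYOUT = 2                # assumed 2-to-1 payout on this bet

def final_wealth(wealth, stake, wins):
    """Wealth after one spin, given the stake and whether the bet wins."""
    return wealth + PAYOUT * stake if wins else wealth - stake

def david_win_probability(goliath_imitates):
    """Probability that David ends the round with more money than Goliath."""
    prob = Fraction(0)
    for wins, p in ((True, P_WIN), (False, 1 - P_WIN)):
        david = final_wealth(300, 300, wins)
        # If Goliath imitates, he places the same $300 bet on the same spin.
        goliath = final_wealth(700, 300, wins) if goliath_imitates else 700
        if david > goliath:
            prob += p
    return prob

print(david_win_probability(goliath_imitates=False))
print(david_win_probability(goliath_imitates=True))
```

When Goliath copies the bet, both fortunes move together and the $400 gap is preserved in every outcome, so moving second lets the richer player lock in the lead.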
Patent races
• When is there competition or monopoly?
• Depends on possibility of preemption and
leapfrogging
• Not about chance of winning, but about chance of
being favorite
• Consider two firms i=1,2, and let value of patent be V
• Productivity of R&D increases over time
• Let ωi(t) be firm i’s total experience at time t
• μ(ωi(t)) is firm i’s hazard rate at t. Let μi(t)=μ(ωi(t))
• Time to discovery is an exponential waiting time
• Note, discovery is stochastic and firm 2 (which enters at t2 > 0) could discover before firm 1 (which enters at t1 = 0)
Assumptions and preliminaries
• Cost is c and common discount rate is r
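• Probability that no firm makes a discovery before time t is
$\exp\left[-\int_0^t (\mu_1(\tau) + \mu_2(\tau))\, d\tau\right]$
• Expected value of the patent race for firm i is
$\Pi_i = \int_{t_i}^{\infty} \exp\left\{-\left[rt + \int_0^t (\mu_1(\tau) + \mu_2(\tau))\, d\tau\right]\right\} \{\mu_i(t) V - c\}\, dt$
• R&D is viable for a monopoly:
$\int_0^{\infty} \exp\left\{-\left[rt + \int_0^t \mu(\tau)\, d\tau\right]\right\} \{\mu(t) V - c\}\, dt > 0$
• It is unprofitable for both firms to always do R&D:
$\int_0^{\infty} \exp\left\{-\left[rt + 2\int_0^t \mu(\tau)\, d\tau\right]\right\} \{\mu(t) V - c\}\, dt < 0$
• State of competition is the pair of experiences $\tilde{\omega} = (\omega_1, \omega_2)$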
Model without leapfrogging
• Result 1 (ε-preemption): In the unique subgame perfect
equilibrium, whatever t2, firm 1 engages in R&D and firm 2
drops out of the race.
• Sketch of proof:
i. Let Ω(ω) be firm 1’s experience such that it has zero payoff
when firm 2 has experience ω, and both firms do R&D
ii. Show that a firm does R&D until discovery or drops out
immediately
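a) Suppose not. Let the initial state be $\tilde{\omega} = (\omega_1, \omega_2)$, and suppose both firms do R&D until time t, when firm 1 drops out with zero profits
b) Then $\tilde{\omega}(t) = (\Omega(\omega_2 + t), \omega_2 + t)$ and $\Pi_1(\Omega(\omega_2 + t), \omega_2 + t) = \Pi_1(\tilde{\omega}(t)) = 0$
c) Firm 1’s expected profit at $\tilde{\omega}$ is
$\Pi_1(\tilde{\omega}) = \int_0^t \exp\left\{-\left[rs + \int_0^s [\mu(\omega_1 + \tau) + \mu(\omega_2 + \tau)]\, d\tau\right]\right\} \{\mu(\omega_1 + s) V - c\}\, ds + \Pi_1(\tilde{\omega}(t)) \exp\left\{-\left[rt + \int_0^t [\mu(\omega_1 + \tau) + \mu(\omega_2 + \tau)]\, d\tau\right]\right\}$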
Sketch of proof of Result 1 (contd.):
d) Since $\Pi_1(\tilde{\omega}(t)) = 0$ and $\mu(\omega_1 + s)V - c < 0$ for s ≤ t, it follows that $\Pi_1(\tilde{\omega}) < 0$, so firm 1 would not join the patent race: a contradiction
iii. The strategy: ‘If ωi≥ ωj, firm i stays in and j drops out iff
ωj ≤ Ω(ωi)’, is subgame perfect
iv. Suppose there exists t such that, conditional on neither
having dropped out at t, firm 1 drops out with some prob.
v. Let t 1 be the supremum of such times for firm 1, and t 2 the corresponding supremum for firm 2
vi. Claim: t1  t 2
vii. Claim: Firm 2 drops out with probability 1 at time t 1
viii. Then, firm 1 will not drop out at time t 1 - ε for small ε
ix. Proceeding similarly, firm 1 never drops out
x. Then, firm 2 never does R&D
Model with leapfrogging
• There are 2 stages: preliminary and final stages
• Costs are c1 and c2
• Time-t experience for firm i in stage j is $\omega_i^j(t)$
• Probabilities of making the 1st and 2nd stage discoveries at time t, if they have not been made before, are μi(t) and θ
• Cannot accumulate 2nd stage experience without making 1st stage discovery
• Payoff for a 2nd stage monopolist is $M^2 = \int_0^{\infty} e^{-(r+\theta)t} (\theta V - c_2)\, dt = (\theta V - c_2)/(r + \theta)$, and for a 2nd stage duopolist it is $W^2 = (\theta V - c_2)/(r + 2\theta)$
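The closed forms for the second-stage payoffs can be sanity-checked by numerical integration. This is a sketch with made-up parameter values (θ = 0.3, V = 100, c2 = 5, r = 0.05) and a simple trapezoid rule on a truncated horizon; none of these choices come from the lecture.

```python
import math

def monopoly_value(theta, V, c2, r):
    """Closed form (theta*V - c2) / (r + theta) for a 2nd stage monopolist."""
    return (theta * V - c2) / (r + theta)

def duopoly_value(theta, V, c2, r):
    """Closed form (theta*V - c2) / (r + 2*theta) for a 2nd stage duopolist."""
    return (theta * V - c2) / (r + 2 * theta)

def discounted_flow_integral(rate, flow, horizon=200.0, steps=200_000):
    """Trapezoid approximation of the integral of exp(-rate*t) * flow on [0, horizon]."""
    dt = horizon / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-rate * k * dt) * flow
    return total * dt

# Hypothetical parameter values, chosen only for illustration.
theta, V, c2, r = 0.3, 100.0, 5.0, 0.05
flow = theta * V - c2

print(monopoly_value(theta, V, c2, r), discounted_flow_integral(r + theta, flow))
print(duopoly_value(theta, V, c2, r), discounted_flow_integral(r + 2 * theta, flow))
```

The effective discount rate is r + θ for the monopolist (interest plus its own discovery hazard) and r + 2θ for a duopolist, since the rival's discovery also ends the race.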
Model with leapfrogging
• Result 2: There exists a SPNE where the leading firm always does R&D unless the rival does R&D and completes the 1st stage before $\hat{\omega}^1$. The follower either (i) drops out at the start, (ii) does R&D until $\hat{\omega}^1$, or (iii) always does R&D unless the leader passes the 1st stage before ($\hat{\omega}^1$ + t2)
Model with leapfrogging
• Sketch of proof of Result 2:
i. There are 2 decision points: one firm has finished the 1st stage, or both are in the 1st stage
ii. Let $\hat{\omega}^1$ be the level of 1st stage experience such that a firm with experience less (greater) than $\hat{\omega}^1$ will drop out (stay in) even if the rival has completed stage 1. It is defined by
$\int_0^{\infty} \exp\left\{-(r + \theta)s - \int_0^s \mu(\hat{\omega}^1 + \tau)\, d\tau\right\} \{\mu(\hat{\omega}^1 + s) W^2 - c_1\}\, ds = 0$
iii a. If 2 has completed stage 1, firm 1 should stay in if t ≥ $\hat{\omega}^1$
b. If 1 has completed stage 1, firm 2 should stay in if t - t2 ≥ $\hat{\omega}^1$
Model with leapfrogging
• Sketch of proof of Result 2 (contd.):
iv. Suppose neither has made the 1st stage discovery at time $\hat{\omega}^1$
v. Firm 1 will hit experience level $\hat{\omega}^1$ before firm 2. After that it will stay in.
vi. Can 2 do R&D profitably when both stay in forever? If yes, both do R&D unless firm 1 passes stage 1 before t2 + $\hat{\omega}^1$. Else, the subgame from $\hat{\omega}^1$ onwards resembles the no-leapfrogging situation. From Result 1, 2 will drop out at $\hat{\omega}^1$
vii. Consider the race before time $\hat{\omega}^1$. A straightforward extension of the arguments in Result 1 shows that either
- there is ε-preemption and 2 quits at t = 0, or
- firm 2 does R&D till t = t2 + $\hat{\omega}^1$