NGM presentation on strategic interaction and game theory

Next Generation Management
Interdependent decision making:
introduction to game theory
MSBM/MECB/MMK
28th March 2012
© Malcolm Brady, 2011
Malcolm Brady
DCU Business School
Dublin City University
Game theory
• Originally developed in the early 20th century by mathematicians and
economists
– von Neumann, Morgenstern
• to address the problem of interdependent decision making
– ie. a situation where a decision made by one party influences a decision
made by another party
• and has since become the dominant paradigm in industrial
organisation economics,
– ousting the structure-conduct-performance paradigm
• and is steadily making inroads into the field of strategic management
• Eight recent Nobel prizewinners have been game theorists –
1994:Nash, Selten, Harsanyi; 2005: Aumann, Schelling; 2007:
Myerson, Hurwicz, Maskin.
Examples:
Dark Knight http://www.youtube.com/watch?v=tc1awt6v2M0
Beautiful Mind http://www.youtube.com/watch?v=CemLiSI5ox8
The Good… http://www.gametheory.net/media/GoodBadUgly.mov
http://www.youtube.com/watch?v=J0BrdMi-oyc
Rational behaviour
‘It is not from the benevolence of the butcher, the brewer, or the baker
that we expect our dinner, but from their regard to their own self-interest
…and he is in this, as in many other cases, led by an invisible hand to
promote an end which was no part of his intention’
Adam Smith, 1776 (1994, I.15, IV.485)
• The world is driven by self-interest
• Rational economic man – maximises their own utility
• Smith saw price and the market as the invisible hand
• Individual action may lead to an unexpected end
Menu
• Starter: salad $4 or prawns $10
• Mains: burger $13 or steak $25
• Dessert: icecream $4 or pavlova $4
• Coffee: $3
• Wine: bottle $25 shared
• Do you drink: a little or a lot?
‘Two roads diverged in a wood, and I
I took the road less travelled by,
And that has made all the difference.’
Robert Frost

A simple decision, made in a yellow wood, that made all the difference.
Dixit and Nalebuff, 1993:34
Decision Tree

[Tree: Newcleaners decides whether to enter the market; if it enters, Fastcleaners decides how to respond]
• Enter, and Fastcleaners accommodates: $100k to Newcleaners
• Enter, and Fastcleaners fights a price war: -$200k to Newcleaners
• Do not enter: $0 to Newcleaners

Expected value of entering = 0.5 × $100k + 0.5 × (-$200k) = -$50k
Expected value of not entering = 0
=> Do not enter
But how do we determine the probabilities of what Fastcleaners will do?
Dixit and Nalebuff, 1993:37
Game tree

[The same tree, but now Fastcleaners' payoffs are shown as well]
• Enter, and Fastcleaners accommodates: $100k to Newcleaners, $100k to Fastcleaners
• Enter, and Fastcleaners fights a price war: -$200k to Newcleaners, -$100k to Fastcleaners
• Do not enter: $0 to Newcleaners, $300k to Fastcleaners

Now what happens?
Look ahead, and reason backwards.
Dixit and Nalebuff, 1993:37
Form of games
• Extensive form
– Structured as a decision tree
– Multiple stages are evident
– Becomes cumbersome if games are complex
• Strategic form (or normal form)
– Structured as a payoff matrix
– Summarised form
Extensive form
• Player
  – Each independent decision maker is a player
  – Chance can also be a player (eg. throw of a die, act of God)
• Move
  – A decision point in a game, at which alternative choices are available to a player
• Action
  – The alternative choices available to a player at a move
• Outcome
  – An end point of the game
• Payoff
  – The value received by a player at an outcome
• Game tree
  – The order of decision making is displayed as a tree
  – A tree is a connected graph containing no loops and beginning at a single node (known as the root node)
  – A tree is not necessarily a unique representation of a game
Assumptions
• Each player acts so as to maximise their utility ie. to gain
maximum payoff
• Each player has preferences over the various outcomes
of the game and chooses actions to achieve their
preferred outcome
– ie. each player is ‘rational’ in the sense that, given two
alternatives, he will always choose the one that he prefers ie. the
one with the larger utility
• Each player has full knowledge of the game in extensive
form
– Each player knows the preference pattern and payoff function of
the other players
– Each player is fully aware of the rules of the game
Strategic (normal) form
• In principle we could ask each player to state in advance what
he/she would do in each situation which might arise in the play of a
game. From this information an umpire could carry out the play of
the game without further aid from the players and thereby determine
the payoffs. Such a description of decisions for each possible
situation is known as a pure strategy for a player.
• If player i has q information sets, then a q-tuple of integers (y1, y2, …, yq), one choice for each information set, represents a pure strategy
• We can then label the strategies 1 to t where t is the total number of
strategies available to the player. Note that t is a finite number but
could be a large number.
• We can now represent the game using only the strategies for each
player.
Strategic form – payoff matrix

                 Player 2
                 a        b
Player 1   A    1,1      2,1
           B    1,2      2,2

A, B: strategies available to player 1
a, b: strategies available to player 2
(x, y): payoff for players 1, 2 respectively

[Equivalent game tree: player 1 chooses A or B; player 2 then chooses a or b, giving outcomes (1, 1), (2, 1), (1, 2) and (2, 2)]

The two representations are equivalent.
Example: advertising campaign

                           Tesco Ireland
                           small              big
Dunnes    small      $100m, $100m        $70m, $110m
          big        $110m, $70m         $80m, $80m

(payoffs shown as Dunnes, Tesco)

Adapted from Grant, CSA, 2010:101
See also: Brandenburger and Nalebuff, 1995, HBR, July-Aug
Strategic form
• This strategic form game is equivalent to the extensive form (tree structure) above
• In this example each player's strategy consists of just one action; however, a strategy is a set of actions and the set may comprise more than one action
• Formally, a game in normal form consists of
  – A set of n players
  – n sets of pure strategies si, one for each player
  – n payoff functions mi, one for each player, whose values depend on the strategy choices of all the players
Zero Sum Games
• In zero sum games we have two players and each player has only
one move.
• These moves are taken simultaneously (or if in succession the
player who moves second has no knowledge of the choice made by
the first player)
• This is not as restrictive as it seems as many interesting situations
involve only two players and all extensive form games can be
transposed into normal form where each player has only one move
(ie. makes one choice from a set of strategies).
• Where players have strictly opposing preferences among outcomes
then it is a strictly competitive game
• eg. player 1 prefers outcome x to outcome y
and player 2 prefers outcome y to outcome x
• Payoffs for players are diametrically opposed
• ie. p2 = -p1, or equivalently p1 + p2 = 0 (zero sum)
Zero Sum Game example

[Payoff matrix for a two-person zero-sum game: player 1 chooses among strategies α1–α5 (rows), player 2 among β1–β4 (columns); each entry is player 1's gain and player 2's loss]

Luce and Raiffa, 1957:61
Infinite loop
• Player 1 would like to play strategy α4 and get a
payoff of 25; but on realising this then player 2
would play β3 and keep her loss down to 2;
player 1, realising this, would then prefer to play
α2 and so gain 8; but then player 2 would prefer
to play β1 and lose zero; player 1 would then
prefer α1 and gain 16; player 2 would then
prefer β3, player 1 α2, then player 2 β1 …
• We could equally start with player 2 and get a
different infinite sequence of choices
Maximin and Minimax
• Player 1: determines the minimum he can gain
by playing each strategy, then chooses the
maximum of these minima
• Player 2: determines the maximum she can lose
by playing each strategy, then chooses the
minimum of these maxima
• => Player 1 plays α3 and thereby ensures that
he receives a payoff of at least 4
• =>Player 2 plays β2 and ensures that she
makes a loss of no more than 4
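A minimal sketch (not part of the original deck) of the maximin and minimax calculations; the payoff matrix used here is the Bismarck Sea game that appears on a later slide, where the saddle point turns out to be (North, North):

```python
# Zero-sum game: entries are player 1's gains and player 2's losses.
# Rows: US searches North/South; columns: Japanese convoy sails North/South.
payoffs = [
    [2, 2],   # US North vs Japan North / South (days of bombing)
    [1, 3],   # US South vs Japan North / South
]
row_names = col_names = ["North", "South"]

# Player 1 (maximin): worst payoff of each row, then the best of these minima.
row_minima = [min(row) for row in payoffs]
maximin_value = max(row_minima)
maximin_row = row_minima.index(maximin_value)

# Player 2 (minimax): worst loss of each column, then the best of these maxima.
col_maxima = [max(row[j] for row in payoffs) for j in range(len(payoffs[0]))]
minimax_value = min(col_maxima)
minimax_col = col_maxima.index(minimax_value)

print(f"Player 1 maximin: play {row_names[maximin_row]}, gaining at least {maximin_value}")
print(f"Player 2 minimax: play {col_names[minimax_col]}, losing at most {minimax_value}")

# When the two values coincide the cell is a saddle point (an equilibrium pair).
if maximin_value == minimax_value:
    print(f"Saddle point at ({row_names[maximin_row]}, {col_names[minimax_col]})")
```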
Maximin

[The same payoff matrix as above, with the minimum of each row and player 1's maximin strategy α3 highlighted]
Minimax

[The same payoff matrix as above, with the maximum of each column and player 2's minimax strategy β2 highlighted]
Equilibrium pair
• A pair of strategies (α, β) is an equilibrium pair if the corresponding
entry in the payoff matrix is the minimum value in its row and the
maximum value in its column
• At that cell neither player will have an incentive to change his
strategy
• Existence: strictly competitive games do not necessarily have an
equilibrium pair
[Example payoff matrix illustrating existence and uniqueness of equilibrium pairs]
• Uniqueness: several equilibrium pairs may exist in a strictly
competitive game
• Is (α3, β2) above an equilibrium pair?
Example: Bismarck Sea

• Outcome represents the number of bombing days by US planes on the Japanese convoy
• Both commanders chose the north channel

                       Japanese convoy
                       N              S
US search     N        2              2
              S        1              3

N: north of New Britain, poor visibility
S: south of New Britain, clear weather

• Japanese convoy sighted after one day and suffered severe losses
• Cannot say that the Japanese commander erred in his decision: the north route was at least as good as the south route against either of the two US strategies
Source: Luce and Raiffa, 1957:64
Minimax theorem
• For any two person zero sum game there
exists a mixed strategy (maximin) for
player 1 which guarantees him at least v,
and a mixed strategy (minimax) for player
2 which guarantees that player 1 gets at
most v. These mixed strategies are in
equilibrium.
• The unique number v is called the value of
the game
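The theorem can be illustrated numerically: the value v and an optimal mixed strategy can be found by linear programming. A sketch, assuming SciPy is available, applied to matching pennies (a 2×2 zero-sum game with no saddle point in pure strategies):

```python
import numpy as np
from scipy.optimize import linprog

# Entries are player 1's gains; matching pennies has no pure-strategy saddle point.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
m, n = A.shape

# Variables: x_1..x_m (player 1's mixed strategy) and v (the game value).
# Maximise v (i.e. minimise -v) subject to:
#   sum_i x_i * A[i, j] >= v for every column j,  sum_i x_i = 1,  x >= 0.
c = np.concatenate([np.zeros(m), [-1.0]])
A_ub = np.hstack([-A.T, np.ones((n, 1))])   # v - sum_i x_i A[i, j] <= 0
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
b_eq = np.array([1.0])
bounds = [(0, None)] * m + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:m], res.x[m]
print("Player 1's optimal mixed strategy:", x.round(3))   # expect [0.5, 0.5]
print("Value of the game:", round(v, 3))                  # expect 0.0
```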
Non-zero sum games

          β1          β2
α1    a11, b11    a12, b12
α2    a21, b21    a22, b22

• We now move on to two person non-zero sum non-cooperative games
• By a cooperative game is meant a game in which the players have complete freedom of preplay communication to make joint binding agreements
• In a non-cooperative game no preplay communication is permitted
• Cooperation does not arise in zero sum games because the players' interests are diametrically opposed
Battle of the sexes

                      woman
                thriller   romance
man   thriller     2,1       0,0
      romance      0,0       1,2

• Couple going out
• Man prefers thriller to romance
• Woman prefers romance to thriller
• Both prefer to agree on something and go together than go out separately

Video explaining the mixed strategy NE for this game:
http://www.youtube.com/watch?v=VjkShMpDzLc
http://www.youtube.com/watch?v=08JlYCgckDQ&NR=1
Coordination game
• BoS is an example of a coordination game
– Issue is ‘how to achieve coordination’?
• Shows power of declaring your strategy and sticking to it
– Eg. man declares he has bought cinema tickets and, guess
what, it’s for the latest thriller
– A credible pre-play commitment
• In a zero sum game declaring your strategy in advance
is never to your advantage
• In a coordination game it can be to your advantage to
declare in advance and have a reputation for inflexibility
– Second player acting in their own self-interest works to the
advantage of the first player
Dominant strategy

         L        M        R
U      4,3      5,1      6,2
M      2,1      8,4      3,6
D      3,0      9,6      2,8

• Solution is found by the method of iterated dominance
• For player II: R strictly dominates M
  – ie. player II's payoff from R exceeds that from M for every row i = U, M, D
• Then for player I: U strictly dominates M, and U strictly dominates D
• Then player II prefers L to R
• => play is (U, L) and the outcome is (4, 3)
Fudenberg and Tirole, 1991:5
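A minimal sketch (not part of the original deck) of iterated elimination of strictly dominated strategies, applied to the 3×3 matrix above:

```python
# Payoffs are (player I, player II); rows U, M, D for player I, columns L, M, R for player II.
payoffs = {
    ("U", "L"): (4, 3), ("U", "M"): (5, 1), ("U", "R"): (6, 2),
    ("M", "L"): (2, 1), ("M", "M"): (8, 4), ("M", "R"): (3, 6),
    ("D", "L"): (3, 0), ("D", "M"): (9, 6), ("D", "R"): (2, 8),
}
rows, cols = ["U", "M", "D"], ["L", "M", "R"]

def dominated(own, other, idx):
    """Strategies of player idx (0 = row player, 1 = column player) that are strictly dominated."""
    out = []
    for s in own:
        for t in own:
            if t != s and all(
                (payoffs[(t, o)] if idx == 0 else payoffs[(o, t)])[idx]
                > (payoffs[(s, o)] if idx == 0 else payoffs[(o, s)])[idx]
                for o in other
            ):
                out.append(s)
                break
    return out

changed = True
while changed:                          # keep deleting until nothing more is dominated
    changed = False
    for r in dominated(rows, cols, 0):
        rows.remove(r); changed = True
    for c in dominated(cols, rows, 1):
        cols.remove(c); changed = True

print("Surviving strategies:", rows, cols)                # expect ['U'] ['L']
print("Predicted outcome:", payoffs[(rows[0], cols[0])])  # expect (4, 3)
```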
Iterated dominance

[Payoff matrix from Fudenberg and Tirole, 1991:8: rows U and D for player I, columns L and R for player II; the U row pays (8, 10) against L and (-100, 9) against R, while D gives player I a modest but safe payoff against either column]
• Not all games can be solved by iterated
dominance
• Order of iteration does not affect the prediction
under strict dominance
– But does under weak dominance
• Result also depends on behaviour and
anticipated behaviour of players
– Eg. iterated dominance gives (U,L) as solution but
experiments show that D is often chosen by students
• Because of fear of other player making ‘mistake’ ie. not
behaving rationally and first player ending up with -100
Fudenberg and Tirole, 1991:8
Prisoner's Dilemma

                       don't testify     testify
don't testify              -1,-1          -5,0
testify                     0,-5          -3,-3

• Iterated dominance predicts (testify, testify) with an outcome of (-3,-3) ie. both prisoners going down for 3 years
• Even though a solution exists (don't, don't) which is better for both players
• Self interest can lead to inefficient outcomes
• Even if players agree not to testify beforehand the agreement will not be binding: each will be motivated to defect from the agreement

We must re-consider the concept of optimality when we are dealing with several independent decision makers and where outcomes are interdependent.

See video explaining the prisoner's dilemma:
http://www.youtube.com/watch?v=IotsMu1J8fA&feature=fvwrel
Teamwork

           work     shirk
work        1,1     -1,2
shirk       2,-1     0,0

• Work leads to a contribution of 1
• Shirk leads to a contribution of zero
• Team output is 4 * sum of contributions
• Output is equally divided
• Cost of working is 3; cost of shirking is zero
• Moral hazard in teamwork
  – Is there a motivation to shirk? (see the sketch below)
Fudenberg and Tirole, 1991:10
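A small sketch re-deriving the matrix above from these rules:

```python
# Contribution: work = 1, shirk = 0; team output = 4 * sum of contributions,
# shared equally between the two players; cost of working = 3, of shirking = 0.
def payoff(my_action, other_action):
    contribution = {"work": 1, "shirk": 0}
    cost = {"work": 3, "shirk": 0}
    output = 4 * (contribution[my_action] + contribution[other_action])
    return output / 2 - cost[my_action]

for a in ("work", "shirk"):
    for b in ("work", "shirk"):
        print(f"({a}, {b}): row player {payoff(a, b):+.0f}, column player {payoff(b, a):+.0f}")

# Output reproduces the matrix: (1,1), (-1,2), (2,-1), (0,0).
# Whatever the other player does, shirking pays 1 more than working
# (save 3 in effort, lose only 2 in output share), so shirking is the dominant choice.
```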
Price and profit

         hi       lo
hi      3,3      1,4
lo      4,1      2,2

• Each firm has to choose whether to set a low or a high price
• Both can do well if both set high prices
• A firm gains an advantage over its rival if the rival sets a high price while it sets a low price
• Both firms are driven to the prisoner's dilemma outcome
McAleese, 2004:145
Nash equilibrium

          β1          β2
α1    a11, b11    a12, b12
α2    a21, b21    a22, b22

• 'An array of strategies, one for each player, such that no player has an incentive (in terms of improving his payoff) to deviate from his part of the strategy array' (Kreps 1990:28)
• For (α1, β1) to be a Nash Equilibrium in pure strategies: a11 ≥ a21 and b11 ≥ b12
• If both players settle on this pair of strategies then neither will have an incentive to move from their choice – neither player can improve their payoff by shifting to an alternative strategy (see the sketch below)
• Nash's theorem – all non-cooperative games with a finite number of players and strategies have a mixed strategy equilibrium
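A minimal sketch (not part of the original deck) of checking the pure-strategy Nash condition cell by cell, using the Battle of the Sexes payoffs from the earlier slide:

```python
# Payoffs are (man, woman); the man is the row player.
payoffs = {
    ("thriller", "thriller"): (2, 1), ("thriller", "romance"): (0, 0),
    ("romance", "thriller"): (0, 0),  ("romance", "romance"): (1, 2),
}
S1 = ["thriller", "romance"]   # man's strategies
S2 = ["thriller", "romance"]   # woman's strategies

def is_nash(s1, s2):
    u1, u2 = payoffs[(s1, s2)]
    no_gain_1 = all(payoffs[(d, s2)][0] <= u1 for d in S1)   # man cannot gain by deviating
    no_gain_2 = all(payoffs[(s1, d)][1] <= u2 for d in S2)   # woman cannot gain by deviating
    return no_gain_1 and no_gain_2

print([(s1, s2) for s1 in S1 for s2 in S2 if is_nash(s1, s2)])
# expect the two pure NE: (thriller, thriller) and (romance, romance)
```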
Examples of Nash Equilibria

Battle of the Sexes: two NE in pure strategies exist
   2,1    0,0
   0,0    1,2

Prisoner's dilemma: one NE in pure strategies exists
   8,8    0,10
   10,0   4,4

Stag hunt: two NE in pure strategies exist
   5,5    0,3
   3,0    3,3

Payoffs in top right cell altered: no NE in pure strategies exists
   8,8    5,7
   10,0   4,4
Focal point

        r        l
r     -2,-2    -3,-3
l     -3,-3     0,0

• Two cars round a corner, both of them in the centre of a narrow road
• Two NE exist: (l,l) and (r,r)
• (l,l) is Pareto dominant as the payoff is better for both players, ie. (0,0) is better than (-2,-2) for both
• Multiple Nash equilibria can exist in a game
• This raises the issue of which NE players will settle on
• The coordination problem can be solved by focal considerations
• Both drivers naturally turn left, and both know that the other driver will naturally turn left, and so (left, left) is the likely equilibrium
• But if both drivers are foreigners and this is obvious (eg. both cars have foreign registrations) then (right, right) also becomes a possibility. The outcome here is less certain as there are now two focal considerations
Stag Hunt game

         stag    hare
stag     5,5     0,3
hare     3,0     3,3

'If a group of hunters set out to take a stag, they are fully aware that they would all have to remain faithfully at their posts in order to succeed; but if a hare happens to pass one of them, there can be no doubt that he pursued it without qualm, and that once he had caught his prey, he cared very little whether or not he had made his companions miss theirs.'
Jean-Jacques Rousseau, quoted in Fudenberg and Tirole 1991

Party game: replace stag with 'arrive early' and hare with 'arrive late'

• If hunters cooperate they get a stag
• If hunters hunt separately they each get a hare
• (Stag, stag) is Pareto optimal
  – Payoffs are best for at least one player and at least as good for the other players
• But (hare, hare) is risk dominant
  – Lower ultimate payoff but less risky
  – Has a higher security level for each player (see the sketch below)
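A quick sketch of the security-level comparison behind risk dominance:

```python
# Stag Hunt payoffs (row player, column player).
payoffs = {("stag", "stag"): (5, 5), ("stag", "hare"): (0, 3),
           ("hare", "stag"): (3, 0), ("hare", "hare"): (3, 3)}

# The security level of an action is the worst payoff it can yield.
for action in ("stag", "hare"):
    worst = min(payoffs[(action, other)][0] for other in ("stag", "hare"))
    print(f"Security level of {action}: {worst}")
# stag: 0, hare: 3 -> hare has the higher security level (risk dominant),
# even though (stag, stag) gives both players the highest payoff (payoff dominant).
```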
Chicken

          tough    weak
tough     -1,-1    2,1
weak       1,2     0,0

• Similar to Battle of the Sexes
• But is an anti-coordination game
• With disastrous consequences for coordination on (tough, tough)

Logan Airport Near Miss, 9 June 2005
Aer Lingus Airbus A330 and US Airways Boeing 737 take off on two different runways that intersect
See simulation: http://www.ntsb.gov/Recs/mostwanted/federal_200511/animation.htm
See report: http://www.ntsb.gov/ntsb/brief.asp?ev_id=20050624X00863&key=1
What factors saved the day here?
Repeated Game: Tit for Tat
• Axelrod set up computer tournament where Tit
for Tat strategy yielded best outcome in an
infinitely repeated game
• Tit for Tat: cooperate in the first move and
thereafter do whatever the other player did on
previous move
• => cooperation can emerge even in a world of
egoists without central authority
• World without governance not necessarily one
where life is ‘solitary, poor, nasty, brutish and
short’ (Hobbes, 1651)!
Axelrod, 1984:10
Tit for Tat
• Nice
  – Always begin by cooperating; never first to defect
• Forgiving
  – Only defect once, then resume cooperation
• Retaliatory
  – If the other player defects, always punish by defecting on the next move
• Clear
  – The other player quickly comes to understand the tit for tat strategy
• Robust
  – Tit for tat proved the best strategy in Axelrod's tournaments; it also proved the best strategy in John Maynard Smith's evolutionary simulations (survival of the fittest rule)
• Tit for Tat is a trigger strategy
• Grim trigger is another trigger strategy, but unforgiving:
  – cooperate initially but, on defection by the opponent, defect forever
Tournament

           coop     defect
coop       3,3       0,5
defect     5,0       1,1

• Two players playing tit for tat both achieve payoffs of 3 + 3δ + 3δ² + 3δ³ + … = 3/(1 - δ)
• Suppose player 2 plays all-defect. Then player 1 achieves 0 + δ + δ² + δ³ + … = δ/(1 - δ) and player 2 achieves 5 + δ + δ² + δ³ + … = 5 + δ/(1 - δ)
• Player 2 is better off playing tit for tat provided 3/(1 - δ) > 5 + δ/(1 - δ), ie. δ > ½
• Above was the model used in the tournaments
• δ is the discount factor
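A small numerical check of these discounted sums (a sketch, not part of the original deck):

```python
def discounted_total(stage_payoffs, delta):
    """Sum of payoff_t * delta**t over a long (effectively infinite) horizon."""
    return sum(p * delta**t for t, p in enumerate(stage_payoffs))

delta = 0.9
tft_vs_tft = discounted_total([3] * 10_000, delta)        # both cooperate every round
tft_vs_alld = discounted_total([0] + [1] * 9_999, delta)  # TFT exploited once, then mutual defection
alld_vs_tft = discounted_total([5] + [1] * 9_999, delta)  # defector gains once, then mutual defection

print(round(tft_vs_tft, 2))    # ~30.0 = 3/(1 - delta)
print(round(tft_vs_alld, 2))   # ~9.0  = delta/(1 - delta)
print(round(alld_vs_tft, 2))   # ~14.0 = 5 + delta/(1 - delta)

# Cooperating with a tit-for-tat opponent beats defecting only when delta > 1/2:
print(3 / (1 - delta) > 5 + delta / (1 - delta))   # True for delta = 0.9
```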
Repeated prisoner's dilemma

Discounted payoffs (row player listed first):
• (Tit for tat, Tit for tat): each gets 3 + 3δ + 3δ² + 3δ³ + … = 3/(1 - δ)
• (Tit for tat, All defect): the tit-for-tat player gets 0 + δ + δ² + δ³ + … = δ/(1 - δ); the defector gets 5 + δ + δ² + δ³ + … = 5 + δ/(1 - δ)
• (All defect, All defect): each gets 1 + δ + δ² + δ³ + … = 1/(1 - δ)

For a discount factor of 0.9 this reduces to:

                 Tit for tat    All defect
Tit for tat        30, 30          9, 14
All defect         14, 9          10, 10

This game is not dominance solvable, but we will see shortly that the pair of strategies (tit for tat, tit for tat) yields a Nash equilibrium with payoffs of (30, 30).
Note that the pair of strategies (all defect, all defect) with payoffs (10, 10) is also a Nash equilibrium.
Folk Theorem
• In an infinitely repeated game it is possible to obtain the cooperative result, unless interest rates are too high (or the discount factor too low)

[Figure: the space of (player 1 payoff, player 2 payoff) pairs, with each axis running from 1 to 5. Each player can ensure a payoff of 1 for himself. The folk theorem says that any outcome in the shaded area, where each player does at least as well as the payoff he can ensure alone, can be obtained in an infinitely repeated game: both players can do better than they would under a one-shot game.]

Martin, 2001:40
Backward Induction

[Game tree: player 1 chooses L or R; after L, player 2 chooses l, giving (3, 1), or r, giving (1, 2); after R, player 2 chooses l, giving (2, 1), or r, giving (0, 0)]

• Start at the last decision node
• Ask what choice the decision maker at that node would make
• Use a dotted line to indicate this decision
• Then move back a decision node and repeat the process
• Continue until you reach the root node
• The dotted line path indicates the solution found by backward induction
• Strategy pair (R, (rl)) is the solution and yields payoff (2, 1) (see the sketch below)
• Note that a player's strategy requires a choice at each decision node for that player
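A minimal sketch (not part of the original deck) of backward induction applied to the tree above:

```python
# A node is either a leaf payoff pair such as (3, 1), or (player, {action: subtree}).
tree = ("1", {
    "L": ("2", {"l": (3, 1), "r": (1, 2)}),
    "R": ("2", {"l": (2, 1), "r": (0, 0)}),
})

def solve(node, label="root"):
    """Return (payoffs of the induced outcome, choices made at every decision node)."""
    if not isinstance(node[1], dict):          # leaf: just return its payoff pair
        return node, []
    player, branches = node
    idx = 0 if player == "1" else 1            # which payoff this player maximises
    choices, results = [], {}
    for action, child in branches.items():
        payoffs, sub_choices = solve(child, label + action)
        results[action] = payoffs
        choices += sub_choices                 # keep the choices made in every subgame
    best = max(results, key=lambda a: results[a][idx])
    choices.append(f"player {player} at node '{label}' chooses {best}")
    return results[best], choices

payoffs, choices = solve(tree)
print(payoffs)         # expect (2, 1)
for c in choices:      # player 2 chooses r after L and l after R; player 1 chooses R
    print(c)
```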
Subgame Perfection

        ll      lr      rl      rr
L      3,1     3,1     1,2     1,2
R      2,1     0,0     2,1     0,0

(Player 2's strategy names give his action after L and his action after R. A cell where player 1's payoff is the highest in its column and player 2's payoff is the highest in its row indicates a NE.)

[Subgame following R: player 2 chooses l, giving (2, 1), or r, giving (0, 0)]

• Subgame perfection places a further restriction on NE and narrows down the possible solutions to a game
• Subgame perfection implies that each subgame must also represent a NE
• Two NE exist: (R, rl) and (L, rr), but only (R, rl) is subgame perfect ie. these choices would also be made in every subgame
• (L, rr) is not SPNE because player two will not choose r in the subgame following R
Selten
Gibbons, 1992:115
Credible Threat
• NE of (L,rr) is sustained by the threat by player 2
of playing r
• But this threat is not credible
– Refer back to earlier slide where threat to fight new
entrant with price war was not credible
• Easier to see this in extensive form; not obvious
in normal form
• Technical note: A subgame is a subset of the game that is
itself a game in extensive form. It starts at a singleton node,
and if any information set is reached then all nodes in that
information set must be reachable
Some further reading
• Brandenburger A, Nalebuff B. 1995. 'The right game: use game theory to shape strategy', Harvard Business Review 73(4):57-71.
• Gibbons R. 1997. 'An introduction to applicable game theory', Journal of Economic Perspectives 11(1):127-149.
• Dixit A, Nalebuff B. 1991. Thinking Strategically, WW Norton.
• Dutta P. 1999. Strategies and Games, MIT Press.