
4 Sequential Equilibrium

Game Theory
Sequential Equilibria
Ruitian Lang
ANU
October 11, 2021
Overview
Consider a player’s decision making at a non-singleton information set
under some conjecture about the opponents’ strategies (for example,
that they play their equilibrium strategies).
In order to evaluate his options, the player needs to assign probabilities
to the histories contained in the information set.
In subgame perfect equilibrium (SPE), we do not consider decision
making at these information sets specifically.
In perfect public equilibrium (PPE), restricting attention to public
strategies makes such probabilities irrelevant.
Table of contents
1. Conditional Probabilities
2. Definition and the First Example
3. Some More Examples
Terminology of probability theory
We usually have a big set Ω containing all possible outcomes. In
extensive form games, that set is usually the set of all histories or the
set of histories in a particular information set.
A probability measure P assigns a non-negative number to each
(measurable) subset of outcomes, such that P(Ω) = 1 and for any
countably many disjoint subsets $A_1, A_2, \ldots$ of Ω,
\[ P\Bigl(\bigcup_{j=1}^{\infty} A_j\Bigr) = \sum_{j=1}^{\infty} P(A_j). \]
The above equation implies the requirement that the infinite series on
the right hand side converges. We only need the case for finitely many
sets $A_j$.
Terminology of probability theory
In talking about probability measures on a finite set I, it is convenient
to represent a measure P by a probability mass function p : I → R
such that p(j) ≥ 0 for every j ∈ I and $\sum_{j \in I} p(j) = 1$. The
probability measure of a subset J ⊂ I is then $P(J) = \sum_{j \in J} p(j)$.
However, we often abuse notation and write P({j}) = p(j) as “P(j)”
and talk about the probability of “j” instead of that of {j}.
In game theory, we often need to talk about different probability
measures on the same set, like the different probability measures on Y
depending on the action profile in public monitoring games.
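The relationship between a probability mass function and the measure it induces can be sketched in a few lines. The set {"a", "b", "c"} and its masses below are hypothetical, chosen only for illustration; exact fractions are used so that the sums are not affected by rounding.

```python
from fractions import Fraction

# Hypothetical finite set I = {"a", "b", "c"} with a probability mass function p.
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def is_pmf(p):
    """Check that p(j) >= 0 for every j and that the masses sum to 1."""
    return all(mass >= 0 for mass in p.values()) and sum(p.values()) == 1

def measure(p, J):
    """P(J) = sum of p(j) over j in J, for a subset J of the finite set."""
    return sum((p[j] for j in J), Fraction(0))

print(is_pmf(p))                 # validity check
print(measure(p, {"a", "c"}))    # probability of the subset {"a", "c"}
```

Writing `measure(p, {"a"})` for the singleton corresponds to the shorthand “P(a)” described above.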
Some motivation I
Focus on Player i and fix the opponents’ strategies. Imagine that the
opponents’ (including Nature’s) play lands Player i in one of the
following disjoint information sets: I1 , ..., Im . At Ij , Player i chooses an
action aj ∈ Aj and there is no future decision for Player i to make.
Let P be the probability measure on I1 ∪ I2 ∪ ... ∪ Im so that P(A) is
the probability that a subset A of histories is reached based on the
opponents’ strategies.
For each j, Player i needs a probability measure Pj on Ij to evaluate his
options at the information set.
Some motivation II
For a history $h \in I_j$ and action $a \in A_j$, let v(a, h) be Player i’s expected
payoff from playing a at h (assuming that the opponents play according to
their strategies in the future).
Then Player i should choose $a_j \in A_j$ to maximize
$\sum_{h \in I_j} v(a_j, h) P_j(\{h\})$.
After an $a_j$ is chosen for each j, we hope that Player i’s expected
payoff from the entire game is
$\sum_{j=1}^{m} \sum_{h \in I_j} v(a_j, h) P_j(\{h\}) P(I_j)$.
However, Player i’s expected payoff from playing this strategy is
$\sum_{j=1}^{m} \sum_{h \in I_j} v(a_j, h) P(\{h\})$ by definition.
Some motivation III
The correct way of specifying Pj for each information set Ij should be
independent of the payoff function. Therefore, we require that
Pj ({h})P(Ij ) = P({h}) for every h ∈ Ij .
To accommodate the case where Ij is infinite, we generalize the
requirement to the condition that for every (measurable) subset A of Ij ,
Pj (A)P(Ij ) = P(A).
When Ij is reached with positive probability, the condition determines
Pj : Pj (A) = P(A)/P(Ij ) for every (measurable) A ⊂ Ij .
When Ij is reached with zero probability, the condition places no
restriction on Pj : our entire derivation (and the entire probability
theory) says nothing.
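The requirement Pj({h})P(Ij) = P({h}) can be checked numerically. The sketch below uses hypothetical reach probabilities and payoffs (none of the numbers come from the slides) and confirms that, once Pj is the conditional probability, the payoff computed information set by information set equals the payoff computed directly.

```python
from fractions import Fraction

# Hypothetical reach probabilities P({h}) over two information sets.
info_sets = {"I1": ["h1", "h2"], "I2": ["h3"]}
P = {"h1": Fraction(1, 2), "h2": Fraction(1, 4), "h3": Fraction(1, 4)}
# Hypothetical payoffs v(a_j, h) for the action a_j chosen at each I_j.
v = {"h1": 3, "h2": -1, "h3": 2}

def conditional(P, I):
    """P_j(h) = P({h}) / P(I_j); defined only when P(I_j) > 0."""
    total = sum(P[h] for h in I)
    if total == 0:
        raise ValueError("information set reached with zero probability")
    return {h: P[h] / total for h in I}

# Payoff computed information set by information set, weighted by P(I_j).
by_info_set = Fraction(0)
for I in info_sets.values():
    Pj = conditional(P, I)
    reach = sum(P[h] for h in I)                     # P(I_j)
    by_info_set += sum(v[h] * Pj[h] for h in I) * reach

# Payoff computed directly from the reach probabilities.
direct = sum(v[h] * P[h] for h in P)

print(by_info_set, direct)
```

The `ValueError` branch marks the case the derivation is silent about: an information set that is reached with probability zero.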
Conditional probabilities
Definition
Let P be a probability measure on Ω and A and B be (measurable) subsets
of Ω such that P(B) > 0. The probability of A conditional on B is defined
as P(A|B) = P(A ∩ B)/P(B).
Theorem
(Bayes) Let P be a probability measure on Ω, and let $A, B_1, \ldots, B_m$ be
(measurable) subsets of Ω such that (a) $B_i \cap B_j = \emptyset$ for $i \neq j$;
(b) $\bigcup_{j=1}^{m} B_j = \Omega$; (c) P(A) > 0; and (d) $P(B_j) > 0$ for
j = 1, ..., m. Then
\[ P(B_j \mid A) = \frac{P(A \mid B_j) P(B_j)}{\sum_{k=1}^{m} P(A \mid B_k) P(B_k)}, \quad \text{for } j = 1, \ldots, m. \]
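As a small illustration of the theorem, the sketch below computes the posterior over the cells of a partition from priors and likelihoods. The two-cell partition and its numbers are hypothetical.

```python
from fractions import Fraction

def bayes_posterior(prior, likelihood):
    """Return P(B_j | A) for each j.

    prior[j]      = P(B_j), all positive and summing to 1
    likelihood[j] = P(A | B_j)
    """
    joint = {j: likelihood[j] * prior[j] for j in prior}
    p_A = sum(joint.values())  # P(A), by summing over the partition
    if p_A == 0:
        raise ValueError("P(A) = 0: the conditional probability is undefined")
    return {j: joint[j] / p_A for j in prior}

# Hypothetical numbers: two cells B1, B2 of a partition of the outcome set.
prior = {"B1": Fraction(1, 4), "B2": Fraction(3, 4)}
likelihood = {"B1": Fraction(4, 5), "B2": Fraction(1, 5)}
posterior = bayes_posterior(prior, likelihood)
print(posterior)
```

The explicit check on P(A) mirrors condition (c): the formula has nothing to say when the conditioning event has probability zero.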
Assessments
Definition
In an extensive form game, an assessment of Player i assigns to each
information set I of Player i a mixed action $\sigma_I$ on $A_I$ (where $A_I$ is the
set of actions available at I) and a probability measure $P_I$ on I.
Definition
Fix an extensive form game that is continuous at infinity and an assessment
profile. Fix a Player i and one of his information sets I, where his set of
available actions is denoted by $A_I$. For each $a \in A_I$ and h ∈ I, denote by
$v_i(a, h)$ his expected payoff when every player (including i himself) follows
the strategy profile after the history (h, a). Player i’s assessment is called
sequentially rational if for each of his information sets I, each action a in
the support of $\sigma_I$ maximizes $E_I[v_i(a, h)] = \sum_{h \in I} v_i(a, h) P_I(h)$.
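The sequential rationality condition at a single information set can be sketched as a direct check: compute the expected payoff of every action under the belief and verify that each action played with positive probability attains the maximum. The information set, belief, and continuation payoffs below are hypothetical, and the function names are illustrative only.

```python
from fractions import Fraction

def expected_payoffs(v, belief):
    """E_I[v(a, h)] = sum over h of v[a][h] * belief[h], for every action a."""
    return {a: sum(v[a][h] * belief[h] for h in belief) for a in v}

def is_sequentially_rational_at(sigma, v, belief):
    """Every action in the support of sigma must attain the maximum."""
    payoffs = expected_payoffs(v, belief)
    best = max(payoffs.values())
    return all(payoffs[a] == best for a, prob in sigma.items() if prob > 0)

# Hypothetical information set with histories h1, h2 and actions L, R.
v = {"L": {"h1": 3, "h2": 0}, "R": {"h1": 0, "h2": Fraction(3, 2)}}
belief = {"h1": Fraction(1, 3), "h2": Fraction(2, 3)}
mixed = {"L": Fraction(1, 2), "R": Fraction(1, 2)}

print(expected_payoffs(v, belief))
print(is_sequentially_rational_at(mixed, v, belief))
```

Under this belief both actions yield the same expected payoff, so mixing is allowed; shifting the belief toward h1 would make only L sequentially rational.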
Equilibrium refinements
For an information set I which is reached with positive probability in
equilibrium, the probability measure PI is given by conditional
probabilities.
Probabilists do not care about information sets I that are not reached
with positive probabilities as decisions made there do not affect
expected payoffs.
Economists do care about such information sets as decisions made
there affect other players’ decision making. Different proposals have
been made about restrictions on PI when P(I) = 0.
Each proposal leads to a different solution concept. We focus on a
standard one called “sequential equilibrium” and will briefly mention
another (specifically for signaling games).
The scandal game
A journalist learns of a scandal involving a celebrity and demands a payment
from the celebrity to hide that scandal.
1. The reader decides whether to read the newspaper.
2. Without observing whether the reader will read the newspaper, the
celebrity decides whether to pay the journalist.
3. Without observing the reader’s action but after observing whether the
celebrity pays, the journalist chooses whether to publicize the scandal.
The scandal game (cont.)
Payoffs:
Reader: 0 if not reading; 1 if reading a scandal; -1 if reading a
newspaper with no scandal.
Celebrity: 0 if the scandal is not publicized; -5 if the scandal is
publicized without being read; -10 if the scandal is publicized and
read. Loses an additional 2 if making the payment.
Journalist: 0 if hiding the scandal; 1 if the scandal is publicized and
read; -5 if the scandal is publicized without being read. Gains 2 if
the payment is made, but loses 5 if publicizing the scandal after
payment (through bad reputation or the celebrity’s retaliation).
The scandal game: surprising move
Consider the following strategy profile: the reader does not read, the
celebrity pays, and the journalist publicizes if and only if no payment is
made.
The reader’s and the celebrity’s moves are obviously sequentially
rational. The question is whether the journalist’s move without
payment is sequentially rational.
That move can be justified by the belief that the reader will read
the newspaper. Is that belief reasonable?
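How strong that belief must be can be read off the journalist’s payoffs at the information set reached after no payment: publicizing gives 1 if the scandal is read and -5 if not, while hiding gives 0. The sketch below is only an illustration of that comparison; the function and variable names are not from the slides.

```python
from fractions import Fraction

# Journalist's payoffs at the information set reached after no payment,
# taken from the payoff description of the scandal game.
PUBLICIZE = {"read": 1, "not_read": -5}
HIDE = {"read": 0, "not_read": 0}

def journalist_payoff(action_payoffs, belief_read):
    """Expected payoff when the journalist believes the reader reads the
    newspaper with probability belief_read."""
    return (action_payoffs["read"] * belief_read
            + action_payoffs["not_read"] * (1 - belief_read))

def publicizing_is_optimal(belief_read):
    """True if publicizing is at least as good as hiding under the belief."""
    return (journalist_payoff(PUBLICIZE, belief_read)
            >= journalist_payoff(HIDE, belief_read))

for belief in (Fraction(0), Fraction(4, 5), Fraction(5, 6), Fraction(1)):
    print(belief, publicizing_is_optimal(belief))
```

Publicizing yields 6p - 5 under a belief p that the reader reads, so it is optimal only when p is at least 5/6.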
Consistency
Definition
A behavior strategy is called fully mixed if its support at every information
set is the entire set of the available actions at that information set.
Definition
An assessment of Player i which assigns to each information set I of Player
i the probability measure $P_I$ is called consistent with a strategy profile s if
there exists a sequence $\{s^{(k)}\}_{k=1}^{\infty}$ of fully mixed strategy profiles
converging to s such that the corresponding conditional probabilities
$P_I^{(k)}$ converge to $P_I$ as k → ∞, for every information set I of Player i.
Consistency (cont.)
In case every information set is finite:
A fully mixed behavior strategy assigns a positive probability to every
action.
For every history h ∈ I, $P_I^{(k)}(\{h\}) \to P_I(\{h\})$ as k → ∞.
Definition
An assessment profile of an extensive form game (continuous at infinity) is
called a sequential equilibrium if it is consistent and sequentially rational.
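Consistency can be illustrated on the scandal game, for the profile in which the reader does not read and the celebrity pays. Under any fully mixed profile the journalist’s information set after no payment is reached with positive probability, so the conditional belief is well defined. The sketch below picks one hypothetical sequence of trembles (sizes 1/k and 1/k², an arbitrary choice) and computes the belief that the reader has read.

```python
from fractions import Fraction

def belief_read_given_no_payment(eps_read, eps_no_pay):
    """Conditional probability that the reader read, given no payment, when
    the reader reads with probability eps_read and the celebrity refuses to
    pay with probability eps_no_pay (both strictly between 0 and 1)."""
    reach_read = eps_read * eps_no_pay            # history (read, no payment)
    reach_not_read = (1 - eps_read) * eps_no_pay  # history (not read, no payment)
    return reach_read / (reach_read + reach_not_read)

# Hypothetical trembles: 1/k for the reader, 1/k**2 for the celebrity.
beliefs = [belief_read_given_no_payment(Fraction(1, k), Fraction(1, k**2))
           for k in (10, 100, 1000)]
print(beliefs)
```

Because the celebrity moves without observing the reader, the tremble of the celebrity cancels and the belief equals the reader’s own tremble, which must vanish along any sequence converging to “do not read”. The only consistent belief therefore puts probability zero on the scandal being read, under which publicizing yields -5 against 0 from hiding.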
Sequential equilibria
It can be shown that every finite extensive form game has a sequential
equilibrium; actually, each of them has a more refined solution, a
(trembling-hand) perfect equilibrium.
There is no mechanical procedure for finding all sequential equilibria.
The proof of the existence theorem involves two non-constructive steps.
To check that a given assessment profile is a sequential equilibrium, we
first check consistency by constructing a suitable sequence of fully
mixed strategy profiles and then check sequential rationality.
Perfect Bayesian Equilibria
Many textbooks use a solution concept called “Perfect Bayesian
Equilibrium” (PBE), which I do not recommend using.
There is no unified definition of PBE: different textbooks use different
definitions. One version is to define PBE as a synonym of sequential
equilibrium.
Therefore, if you give a presentation and somebody asks you what you
mean by a sequential equilibrium, just say “PBE”.
The certificate game
There is a seller of a used car and a buyer. The car may be roadworthy
or not roadworthy; the seller knows this but the buyer does not.
The cost of a roadworthy inspection is c > 0, the value of a roadworthy
car is v > c and the value of a non-roadworthy car is zero.
The seller may sell a roadworthy car with a certificate (by ordering the
inspection himself) or sell a car without a certificate. Either way, the
seller makes a take-it-or-leave-it offer. A buyer of a car without a
certificate needs to order the inspection.
The seller’s payoff is payment (if any) minus cost of inspection (if any).
The buyer’s payoff is the value of the car (if purchased) minus payment
(if any) minus cost of inspection (if any).
Persuasion models
There is a seller and a buyer. Nature chooses the seller’s type from a
finite set Θ = {0, 1, ..., m}.
There is a set M = {1, 2, ..., m} of certificates. For every k ∈ M, only
types θ ≥ k can obtain Certificate k.
For simplicity, assume that obtaining an available certificate is free.
The seller chooses which certificate to obtain and present to the buyer.
There is a continuation game after that. The model is called voluntary
disclosure of verifiable information or persuasion.
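The verifiability structure can be sketched as two small functions: which certificates a type can obtain, and which types could have presented a given certificate. Treating “no certificate” as an option open to every type is an assumption made here for illustration; the function names are hypothetical.

```python
def available_certificates(theta, m):
    """Certificates k in M = {1, ..., m} that type theta can obtain (theta >= k)."""
    return [k for k in range(1, m + 1) if theta >= k]

def types_consistent_with(certificate, m):
    """Types in Theta = {0, ..., m} that could have presented the certificate.

    certificate=None stands for presenting no certificate, assumed here to be
    possible for every type.
    """
    if certificate is None:
        return list(range(0, m + 1))
    return [theta for theta in range(0, m + 1) if theta >= certificate]

m = 3  # hypothetical number of certificates
print(available_certificates(2, m))
print(types_consistent_with(2, m))
print(types_consistent_with(None, m))
```

Seeing Certificate k lets the buyer rule out every type below k, which is what makes the information verifiable.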
A persuasion game
For simplicity, assume that after a certificate (if any) is presented, the
seller makes a take-it-or-leave-it offer to the buyer.
The seller’s payoff is payment from the buyer. The buyer’s payoff is the
value vθ of the product (if purchased) minus payment (if any).
Assume that v0 < v1 < ... < vm .
Overview
In three classes of models, a player takes an action to actively influence
other players’ beliefs in order to improve his future payoff.
Career concerns (signal jamming): The player does not know what he is.
Signaling: The player knows what he is and tries to portray himself
truthfully.
Reputation: The player knows what he is and tries to pretend to be
something else.
A simple career concern model
There is a worker and two employers. Nature chooses a worker’s talent
θ ∈ {H, L}. The probability that θ = H is µ ∈ (0, 1).
There are two periods. In each period t, the following events happen:
1. Each employer i makes a wage offer $w_{i,t}$ (without observing the other
employer’s offer).
2. The worker chooses which offer to accept (or accepts neither).
3. If working, the worker chooses a binary effort $e_t \in \{0, 1\}$ and a binary
outcome $y_t \in \{R, 0\}$ is generated and publicly observed.
A simple career concern model (cont.)
The probability that $y_t = R$ is $q_\theta + a e_t$, where $0 < a < 1 - q_H$.
When hiring the worker, Employer i’s stage payoff is $y_t - w_{i,t}$ and the
worker’s stage payoff is $w_{i,t} - c e_t$ with c > 0.
All players discount future payoffs by δ ∈ (0, 1).
The two employers essentially engage in Bertrand competition, so they
both offer a wage equal to the expected revenue.
Career concerns
There is no contractual or discretionary bonus, so there is no direct
incentive for effort.
However, the worker may exert effort to improve the employers’ beliefs
about his talent.
When inferring the worker’s talent from his output, both
employers take into account the “distortion” created by the effort.
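How the employers’ inference depends on the effort they conjecture can be sketched with Bayes’ rule. All parameter values below are hypothetical (chosen so that 0 < a < 1 - q_H), and the second-period wage assumes zero effort in the last period, on the grounds that there is no later belief left to influence; that assumption is not stated on the slides.

```python
from fractions import Fraction

def posterior_high(mu, q_H, q_L, a, conjectured_effort, outcome_is_R):
    """Employers' posterior that theta = H after the period-1 outcome,
    given the effort level they conjecture the worker chose."""
    p_H = q_H + a * conjectured_effort   # P(y = R | theta = H)
    p_L = q_L + a * conjectured_effort   # P(y = R | theta = L)
    if not outcome_is_R:
        p_H, p_L = 1 - p_H, 1 - p_L
    return mu * p_H / (mu * p_H + (1 - mu) * p_L)

def expected_revenue(mu, q_H, q_L, a, effort, R):
    """Expected output R * P(y = R) under belief mu and a given effort."""
    return R * (mu * q_H + (1 - mu) * q_L + a * effort)

# Hypothetical parameters.
mu, q_H, q_L, a, R = Fraction(1, 2), Fraction(3, 5), Fraction(1, 5), Fraction(1, 5), 10

with_effort = posterior_high(mu, q_H, q_L, a, 1, True)
without_effort = posterior_high(mu, q_H, q_L, a, 0, True)
wage_2 = expected_revenue(with_effort, q_H, q_L, a, 0, R)  # assumed zero effort
print(with_effort, without_effort, wage_2)
```

The same outcome R is less flattering when effort is conjectured, because effort raises the success probability of both types by the same amount and so makes success less informative about talent.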
Signal jamming
Signal jamming refers to a situation where a player exerts a costly
effort to distort a signal without observing the signal or the true state.
Career concern is one example of signal jamming where the effort is
socially beneficial. There are other models where the effort is a social
waste, such as advertising or manipulating reports to a supervisor.
A rational receiver of the signal should “deduct” the expected
distortion from the observed signal. In the most commonly used
functional form, the deduction is perfect.
When the deduction is perfect, the equilibrium outcome (apart from
the cost and benefit of the effort) is the same as when the sender has
no opportunity to distort.
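The slides do not specify which functional form is meant. As an illustration only, the sketch below assumes an additive normal specification, y = θ + e + noise with θ and the noise independent and normally distributed, in which the receiver simply subtracts the conjectured effort before updating. All numbers are hypothetical.

```python
from fractions import Fraction

def posterior_mean(y, conjectured_effort, prior_mean, var_theta, var_noise):
    """E[theta | y] when y = theta + e + noise, theta ~ N(prior_mean, var_theta),
    noise ~ N(0, var_noise), and the receiver conjectures effort e."""
    weight = var_theta / (var_theta + var_noise)
    return prior_mean + weight * (y - conjectured_effort - prior_mean)

# Hypothetical values; Fractions keep the arithmetic exact.
prior_mean, var_theta, var_noise = Fraction(0), Fraction(1), Fraction(3)
undistorted = Fraction(13, 10)   # theta + noise, never observed directly
effort = Fraction(2, 5)          # equilibrium effort, correctly conjectured

with_jamming = posterior_mean(undistorted + effort, effort,
                              prior_mean, var_theta, var_noise)
no_jamming = posterior_mean(undistorted, Fraction(0),
                            prior_mean, var_theta, var_noise)
print(with_jamming, no_jamming)
```

When the conjecture is correct, the two posteriors coincide: the effort is spent, but the receiver learns exactly what she would have learned without it.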
Spence’s theory of education
Nature chooses a student’s type θ ∈ {H, L}.
The student observes his type and decides whether to pursue an
advanced degree m ∈ {D, N}.
An employer observes the student’s degree choice but not his type and
chooses whether to offer a job.
The cost of getting the advanced degree depends on the student’s type.
Signaling
There is a sender and a receiver. Nature chooses the sender’s type
θ ∈ Θ.
The sender observes his type and chooses a message m ∈ M from a
pre-defined set of messages M.
The receiver observes m but not θ and makes some decision d ∈ D.
The receiver’s payoff uR depends on θ, m and d; crucially, the sender’s
payoff uS depends on m as well as d and θ.
Signaling (cont.)
Compared with persuasion, the message is not a verifiable piece of
information about the sender’s type. In theory, every message can be
sent by all types.
However, the “cost” of sending a message depends on the sender’s
type.
This is different from cheap talk, where all messages are costless.
Separating vs pooling
In a separating equilibrium, any two different types send different
messages. In a pooling equilibrium, all types send the same message.
An equilibrium may also be mixed when the sender uses a
non-degenerate behavior strategy, or partially separating when the type
space is partitioned into several groups and all types in the same group
send the same message.
Since every message is available to all types (albeit at different costs),
consistency imposes no restriction on the receiver’s belief upon seeing a
message that is not supposed to be sent by any type.
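For pure sender strategies, the classification above and the resulting beliefs can be sketched as follows. The strategy is represented as a hypothetical dictionary from types to messages, and the prior is illustrative; returning `None` off the path reflects that Bayes’ rule, and here consistency too, leaves the belief unrestricted.

```python
from fractions import Fraction

def classify(strategy):
    """Classify a pure sender strategy given as {type: message}."""
    n_types = len(strategy)
    n_messages = len(set(strategy.values()))
    if n_messages == 1:
        return "pooling"
    if n_messages == n_types:
        return "separating"
    return "partially separating"

def on_path_belief(strategy, prior, message):
    """Receiver's posterior over types after a message, by Bayes' rule.
    Returns None for a message that no type is supposed to send."""
    senders = {t: prior[t] for t in strategy if strategy[t] == message}
    total = sum(senders.values())
    if total == 0:
        return None
    return {t: mass / total for t, mass in senders.items()}

prior = {"H": Fraction(1, 4), "L": Fraction(3, 4)}   # hypothetical prior
pooling = {"H": "N", "L": "N"}
separating = {"H": "D", "L": "N"}

print(classify(pooling), on_path_belief(pooling, prior, "N"))
print(classify(separating), on_path_belief(separating, prior, "D"))
print(on_path_belief(pooling, prior, "D"))   # off the path
```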
The intuitive criterion
Let Ã(m) be the set of the receiver’s actions that are optimal under some
probability measure on Θ when she receives message m. (Ã(m) is
independent of m if the receiver’s payoff function is so.)
Fix an assessment profile and denote by $v_S(\theta)$ the expected payoff of a
sender with type θ ∈ Θ.
Intuitive criterion: for every message m ∈ M, if there is a θ ∈ Θ such
that $u_S(a, m, \theta) \ge v_S(\theta)$ for no a ∈ Ã(m) and another θ′ ∈ Θ such
that $u_S(a, m, \theta') \ge v_S(\theta')$ for some a ∈ Ã(m), then zero probability
should be assigned to this θ by the receiver upon seeing m.
In words, if sending a message m may (weakly) benefit θ′ but always
makes θ strictly worse off, then we rule out θ upon seeing m.
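For finite type and action sets the criterion can be checked mechanically. The sketch below takes the set Ã(m) as given rather than deriving it, and applies the test to a hypothetical education example: wages, degree costs, and the candidate pooling payoffs v_S are all invented for illustration.

```python
def excluded_types(types, candidate_actions, u_S, v_S, m):
    """Types to which the receiver should assign zero probability after m.

    candidate_actions plays the role of the set of receiver actions that are
    optimal under some belief after m; it is supplied by the caller.
    """
    never_gain = {t for t in types
                  if all(u_S(a, m, t) < v_S[t] for a in candidate_actions)}
    may_gain = {t for t in types
                if any(u_S(a, m, t) >= v_S[t] for a in candidate_actions)}
    # The criterion only bites if some other type may (weakly) benefit.
    return never_gain if may_gain else set()

# Hypothetical payoffs: the wage is 2 if hired, and the degree D costs
# 1 for type H and 3 for type L; no degree (N) is free.
WAGE = {"hire": 2, "reject": 0}
COST = {("D", "H"): 1, ("D", "L"): 3, ("N", "H"): 0, ("N", "L"): 0}

def u_S(a, m, theta):
    return WAGE[a] - COST[(m, theta)]

# Candidate pooling outcome: nobody gets the degree and nobody is hired.
v_S = {"H": 0, "L": 0}
ruled_out = excluded_types(["H", "L"], ["hire", "reject"], u_S, v_S, "D")
print(ruled_out)
```

In this example type L loses from the degree whatever the employer does, while type H gains if hired, so the employer should attribute an unexpected degree to type H.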
Remarks (optional)
The criterion rules out “counterintuitive” beliefs that put weight on
types that cannot benefit from sending the message under consideration.
The criterion does not follow from trembling-hand perfection.
Even with supermodularity, the criterion does not unravel the game; a
stronger and much less intuitive criterion (called “divinity I”) does so.
The idea of reputation
There is a behavioral type that commits to a particular strategy and a
rational (opportunistic) type that maximizes some payoff function.
The rational type’s payoff is increasing in other players’ belief that he is
behavioral.
In early stages of the game, the rational type may imitate the
behavioral type to build a “reputation” that he is behavioral, in order to
show his true face and earn a high payoff later on.
A concession game
This is a hugely simplified version of Abreu and Gul (2000) “Bargaining
and Reputation”.
There are two players splitting a unit rent. Each player i has announced
a demand $\alpha_i \in ((1 + \delta)^{-1}, 1)$.
In Period t = 0 or 2, Player 1 decides whether to concede (receiving
$\delta^t (1 - \alpha_2)$ and ending the game). In Period 1, Player 2 decides
whether to concede (receiving $\delta (1 - \alpha_1)$ and ending the game).
If nobody concedes by the end of Period 2, the game ends with both
players receiving zero payoff.
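The three-period game can be solved by backward induction. The sketch below does so under one assumption that the description leaves implicit: when a player concedes in period t, the opponent receives his own demand, discounted to that period. The parameter values are hypothetical and satisfy the restriction on the demands.

```python
from fractions import Fraction

def first_concession_period(alpha1, alpha2, delta):
    """Backward induction; returns the period in which the game ends."""
    # Period 2: Player 1 concedes, since delta**2 * (1 - alpha2) > 0.
    p1_value_at_2 = delta**2 * (1 - alpha2)
    p2_value_if_p1_concedes_at_2 = delta**2 * alpha2   # assumed payoff
    # Period 1: Player 2 compares conceding now with waiting for period 2.
    p2_concedes_at_1 = delta * (1 - alpha1) > p2_value_if_p1_concedes_at_2
    # Period 0: Player 1 compares conceding now with the continuation value
    # (delta * alpha1 is the assumed payoff if Player 2 concedes at t = 1).
    p1_wait_value = delta * alpha1 if p2_concedes_at_1 else p1_value_at_2
    if (1 - alpha2) >= p1_wait_value:
        return 0
    return 1 if p2_concedes_at_1 else 2

delta = Fraction(1, 2)                       # demands must exceed 1/(1+delta) = 2/3
alpha1, alpha2 = Fraction(3, 4), Fraction(3, 4)
print(first_concession_period(alpha1, alpha2, delta))
```

The lower bound on the demands is what makes Player 2 prefer to wait in Period 1: it guarantees that δα₂ exceeds 1 - α₁.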
A concession game (cont.)
Under the above assumptions, the game has perfect information and
can be solved backward. In equilibrium, Player 1 concedes immediately.
Now assume that Player i is of a committed type with probability
qi ∈ (0, 1) (independent of the other player’s type). Each player knows
his own type at the beginning but not the opponent’s type.
A committed type never concedes. A non-committed (rational or
opportunistic) type maximizes his expected payoff.
Would a rational Player 1 concede immediately?
A concession game (cont.)
Unless q2 is sufficiently high, there is no equilibrium in which a rational
Player 1 concedes immediately.
The result can be interpreted as follows: a rational Player 1 wants to
pretend to be committed (for as long as possible) in the hope that it
will convince Player 2 to concede.
The full model considers a continuous-time version of the concession
game and both players concede with some positive rate after a while.
This outcome is referred to as a war of attrition.
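For the three-period version, the claim about immediate concession can be checked with a short calculation. Suppose a rational Player 1 concedes at t = 0. Then no concession at t = 0 reveals that Player 1 is committed, so a rational Player 2 concedes at t = 1. The sketch below computes the rational Player 1’s gain from deviating to waiting, assuming (this is not stated on the slides) that he receives δα₁ when Player 2 concedes at t = 1. The numbers are hypothetical.

```python
from fractions import Fraction

def deviation_gain(alpha1, alpha2, delta, q2):
    """Rational Player 1's gain from waiting instead of conceding at t = 0,
    in a candidate equilibrium where rational Player 1 concedes at once."""
    concede_now = 1 - alpha2
    # With probability 1 - q2 Player 2 is rational and concedes at t = 1;
    # with probability q2 he is committed and Player 1 concedes at t = 2.
    wait = (1 - q2) * delta * alpha1 + q2 * delta**2 * (1 - alpha2)
    return wait - concede_now

def q2_threshold(alpha1, alpha2, delta):
    """Value of q2 at which the deviation gain is exactly zero."""
    return ((delta * alpha1 - (1 - alpha2))
            / (delta * alpha1 - delta**2 * (1 - alpha2)))

delta, alpha1, alpha2 = Fraction(1, 2), Fraction(3, 4), Fraction(3, 4)
threshold = q2_threshold(alpha1, alpha2, delta)
print(threshold, deviation_gain(alpha1, alpha2, delta, Fraction(1, 5)))
```

Below the threshold the deviation is profitable, so immediate concession by the rational Player 1 cannot be part of an equilibrium. The calculation only gives a necessary condition; it does not by itself describe the equilibrium when q2 is low.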