WORKING PAPER
DEPARTMENT OF ECONOMICS

AN APPROXIMATE FOLK THEOREM WITH
IMPERFECT PRIVATE INFORMATION

Drew Fudenberg
and
David K. Levine

No. 525                                            March 1989

Massachusetts Institute of Technology
50 Memorial Drive
Cambridge, Mass. 02139
AN APPROXIMATE FOLK THEOREM WITH
IMPERFECT PRIVATE INFORMATION*

by

Drew Fudenberg**
and
David K. Levine***

March 1989

*   We are grateful to Andreu Mas-Colell. NSF Grants 87-08616 and 86-09697
    and a grant from the UCLA Academic Senate provided financial support.
**  Department of Economics, Massachusetts Institute of Technology.
*** Department of Economics, University of California, Los Angeles; part
    of this research was conducted while at the University of Minnesota.
Abstract

We give a partial folk theorem for approximate equilibria of a class of
discounted repeated games where each player receives a private signal of
the play of his opponents. Our condition is that the game be
"informationally connected," meaning that each player i has a deviation
that can be statistically detected by player j regardless of the action
of any third player k. Under the same condition, we obtain a partial
folk theorem for the exact equilibria of the game with time-average
payoffs.

JEL Classification numbers: 022, 026

Keywords: Repeated games, folk theorem, private information, approximate
equilibrium
1. Introduction

We give a partial folk theorem for the approximate equilibria of a
class of repeated games with imperfect information satisfying an
informational linkage condition. Every payoff vector which exceeds a
mutual threat point and is generated by one-shot mixed strategies from
which no player can profitably deviate without having some effect on other
players is an approximate sequential equilibrium payoff if the discount
factor is close enough to one.
The class of repeated games we consider has complete but imperfect
information. Each period, player i chooses an action a_i, then
observes his own payoff and also a signal z_i of the play of his
opponents. This model includes as a special case games of imperfect
public information as defined by Fudenberg-Levine-Maskin [1989]. In
these games, there is a publicly observed variable y, and each player
i observes z_i = (y, a_i). The public information case includes the
literature on repeated oligopoly (Green-Porter [1984],
Abreu-Pearce-Stacchetti [1986]), repeated principal-agent games (Radner
[1981, 1985], Rubinstein-Yaari [1983]), repeated partnerships (Radner
[1986], Radner-Myerson-Maskin [1986]), as well as the classic case of
observable actions (Aumann-Shapley, Friedman [1971], Rubinstein [1981],
Fudenberg-Maskin [1986a]). It also includes the examples studied in the
literature of repeated games with "semi-standard information," as
discussed in Sorin [1988].
While examples of repeated games that have been formally studied have
public information, players have private information in some situations of
economic interest.
Indeed, the central feature in Stigler's [1961] model of
secret price-cutting is that each firm's sales depend on the prices of its
rivals, but firms only observe their own demand.
The goal of this paper is
to show that a partial folk theorem extends to these cases, provided we
relax the notion of equilibrium slightly.
The key element in any folk theorem is that deviators must be punished.
This requires, with three or more players, that non-deviators coordinate
their punishments, and may even require a player to cooperate in his own
punishment. With public information, the necessary coordination can be
accomplished by conditioning play on the commonly observed outcome. When
players receive different private signals, they may disagree on the need
for punishment: some players may believe that punishment is required while
others do not realize this. The most straightforward way to prevent such
confusion from dominating play is to periodically "restart" the strategies
at commonly known times, ending confusion and permitting players to
recoordinate their play. This, however, makes it difficult to punish
deviators near the point at which play restarts, and forces us toward
approximate equilibrium.
The use of approximate equilibrium leads to an important
simplification, because it allows the use of review strategies of the type
introduced by Radner [1981]. Under these strategies, each player
calculates a statistic indicating whether a deviation has occurred. If
the player's statistic crosses a threshold level at a commonly known time,
he communicates this fact during a communication stage of the type
introduced by Lehrer [1986]. Our informational linkage condition ensures
that all players correctly interpret this communication, so that the
communications stage allows coordination of punishments. The importance
of approximate equilibrium is that Radner-type review strategies do not
punish small deviations, i.e., those that do not cause players' statistics
to cross the threshold level. Consequently, there will typically be small
gains to deviating. A second consequence of approximate equilibrium is
that sequentiality loses its force, and the review strategies are perfect
whenever they are Nash. This is because punishments end when the
strategies are restarted, so the cost to punishers is small. Thus the
punishments are "credible" in the sense that carrying them out is an
approximate equilibrium.
Our results are closely connected to those of Lehrer [1986, 1988a,b,c,d,
1989], who considers time-average Nash equilibrium of various classes of
games with private information and imperfect monitoring but with
non-stochastic outcomes. In addition to the deterministic nature of
outcomes, Lehrer's work has a different emphasis than ours, focusing on
completely characterizing equilibrium payoffs under alternative
specifications of what happens when a time average fails to exist. Our
work provides a partial characterization of payoffs for a special but
important class of games.

A secondary goal of our paper is to clarify the connection between
Lehrer's work and other work on repeated time-average games, and work on
discounted games such as ours. In particular we exposit the connection
between approximate discounted equilibrium and exact time-average
equilibrium, which is to some extent already known to those who have
studied repeated games. We emphasize the fact that the time-average
equilibrium payoff set includes limits of approximate as well as exact
discounted equilibria. Although we show that the converse is false in
general, the type of construction used by Lehrer (and us) yields both an
approximate discounted and a time-average theorem. For economists, who
are typically hostile to the notion of time-average equilibrium, we hope
to make clear that results on repeated games with time-average payoffs are
relevant, provided the logic of approximate equilibrium is accepted.
If we consider the set of equilibrium payoffs as potential contracts in
a mechanism design problem, there is an interpretation of approximate
equilibrium that deserves emphasis. The set of equilibrium payoffs in our
theorem in some cases strictly exceeds that for exact equilibria. This
means that exact equilibrium may be substantially too pessimistic. The
efficiency frontier for contracts may be substantially improved if people
can be persuaded to forego the pursuit of very small private benefits.
For example, even if ethical standards have only a small impact on
behavior, they may significantly improve contracting possibilities.
Section 2 of the paper lays out the repeated games model. Section 3
establishes a connection between approximate sequential discounted
equilibria with discount factor near one, and time-average equilibrium.
In particular, both types of equilibria can be constructed from
finite-horizon approximate Nash equilibria. Section 4 describes our
informational linkage condition, and proves a partial folk theorem.
2. The Model

In the stage game, each player i, i = 1 to N, simultaneously
chooses an action a_i from a finite set A_i with m_i elements.
Each player i observes an outcome z_i \in Z_i, a finite set with
M_i elements. We let z = (z_1, ..., z_n) and Z = \times_{i=1}^{n}
Z_i. Each action profile a \in A = \times_{i=1}^{n} A_i induces a
probability distribution \pi_z(a) over outcomes z. Each player i's
realized payoff r_i(z_i) depends on his own observed outcome only; the
opponents' actions matter only in their influence on the distribution over
outcomes. Player i's expected payoff to an action profile a is

    g_i(a) = \sum_{z \in Z} \pi_z(a) r_i(z_i).

We also define

    d = \max_{i, a_i, a'_i, a_{-i}} [g_i(a'_i, a_{-i}) - g_i(a_i, a_{-i})]

to be the maximum one-period gain any player could obtain by playing one
of his actions a'_i instead of another a_i.
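As an illustration of these definitions (not part of the original
argument), the following minimal Python sketch computes g_i(a) and the
bound d by brute force for a small hypothetical two-player game; the
signal technology and payoffs here are invented for the example.

    import itertools
    import numpy as np

    # Hypothetical primitives for a 2-player stage game (illustrative only).
    # pi[a] is the distribution over joint outcomes z = (z_1, z_2) at the
    # pure profile a; r[i][z_i] is player i's realized payoff at z_i.
    A = [range(2), range(2)]                    # two actions per player
    Z = [range(2), range(2)]                    # two outcomes per player
    rng = np.random.default_rng(0)
    pi = {a: rng.dirichlet(np.ones(4)).reshape(2, 2)
          for a in itertools.product(*A)}       # pi[a][z1, z2]
    r = [np.array([0.0, 1.0]), np.array([0.0, 1.0])]

    def g(i, a):
        """Expected payoff g_i(a) = sum_z pi_z(a) r_i(z_i)."""
        return sum(pi[a][z] * r[i][z[i]] for z in itertools.product(*Z))

    # d = max over players and unilateral deviations of the one-period gain.
    d = max(g(i, a[:i] + (b,) + a[i + 1:]) - g(i, a)
            for i in range(2)
            for a in itertools.product(*A)
            for b in A[i])
    print("d =", d)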
We will also wish to consider various types of correlated strategy
profiles. A correlated action profile \alpha is simply a probability
distribution over A. We write (a_i, \alpha_{-i}) for the correlated
action in which player i plays a_i and the other players correlate
according to \alpha. We write \pi(\cdot, \alpha) for the induced
probability distribution over Z, and \alpha_i for the marginal
induced over A_i. If the play of players is independent, we refer to
\alpha as a mixed action profile, in which case \alpha is
characterized by and may be identified with the vector of marginals
(\alpha_1, ..., \alpha_n). A belief by player i about his opponents'
play is such a probability distribution over A_{-i}.
For any correlated action profile \alpha, we can calculate the
induced distribution over outcomes

    \pi_z(\alpha) = \sum_{a \in A} \pi_z(a) \alpha(a).

We can also calculate the marginal distribution \pi_i(z_i, \alpha) over
player i's outcomes. Finally, we may calculate the expected payoff to
player i,

    g_i(\alpha) = \sum_{z \in Z} \pi_z(\alpha) r_i(z_i).

In the repeated game, in each period t = 1, 2, ..., the stage game
is played, and the corresponding outcome is then revealed. The history
for player i at time t is h_i(t) = (a_i(1), z_i(1), ..., a_i(t),
z_i(t)). We also let h_i(0) denote the null history existing before
play begins.
A strategy \sigma_i for player i is a sequence of maps from his
private history h_i(t-1) to probability distributions over A_i. A
system of beliefs b_i for player i specifies for each time t and
private history h_i(t) a probability distribution b_i(h_i(t)) over
private histories of other players of length t. A profile of beliefs
b is consistent with the strategy profile \sigma if for every finite
time T the truncation of b is consistent with the truncation of
\sigma in the sense of Kreps-Wilson [1982]. This requires that there
exist for each T a sequence of truncated strategy profiles
\sigma^{T,\tau} \to \sigma^T putting strictly positive probability on
every action, such that the unique sequence of truncated belief profiles
derived from Bayes law satisfies b^{T,\tau} \to b^T.
Given a system of beliefs b and a history h_i(t), a distribution is
induced over the histories of all players. Given this distribution and
the strategy profile \sigma, a corresponding distribution is induced
over play at all times. In turn this gives an expected payoff in period
\tau to player i, which we denote by G_i(\tau, h_i(t), b_i, \sigma).
The corresponding normalized present value to player i at discount
factor \delta < 1 is

    W_i(h_i(t), b_i, \sigma, \delta)
        = (1-\delta) \sum_{\tau=t}^{\infty} \delta^{\tau-t}
          G_i(\tau, h_i(t), b_i, \sigma),

while if \delta = 1,

    W_i^T(h_i(t), b_i, \sigma)
        = (1/T) \sum_{\tau=t}^{t+T-1} G_i(\tau, h_i(t), b_i, \sigma)

and

    W_i(h_i(t), b_i, \sigma, 1)
        = \limsup_{T \to \infty} W_i^T(h_i(t), b_i, \sigma).
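As a numerical illustration of the two criteria (not in the original),
the following sketch compares the normalized present value and the finite
time average of a payoff stream; for a stream that settles down, the
discounted value approaches the time average as the discount factor goes
to one. (The stream here is truncated, so the discounted value is only
approximate.)

    import numpy as np

    def discounted_value(payoffs, delta):
        """Normalized present value (1-delta) * sum_t delta^(t-1) g_t."""
        t = np.arange(len(payoffs))
        return (1 - delta) * np.sum(delta**t * np.asarray(payoffs))

    def time_average(payoffs):
        """Finite time average (1/T) * sum_t g_t."""
        return np.mean(payoffs)

    stream = [0.0] * 10 + [1.0] * 990
    for delta in (0.9, 0.99, 0.999):
        print(delta, discounted_value(stream, delta), time_average(stream))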
Notice that in the case of the initial null history h_i(0), the
present value or time-average expected payoff is the same for all beliefs
consistent with \sigma. To emphasize this, we write
W_i(\sigma, \delta) in place of W_i(h_i(0), b_i, \sigma, \delta), and
W_i^T(\sigma) in place of W_i^T(h_i(0), b_i, \sigma). For \delta <
1, a strategy profile \sigma is an \epsilon-Nash equilibrium if for
each player i and strategy \sigma'_i

    (2.1)    W_i((\sigma'_i, \sigma_{-i}), \delta)
                 \le W_i(\sigma, \delta) + \epsilon.

There is also a corresponding notion of a truncated \epsilon-Nash
equilibrium, where we require

    W_i^T(\sigma'_i, \sigma_{-i}) \le W_i^T(\sigma) + \epsilon.
An \epsilon-sequential equilibrium is defined in a similar way, except
that we require there exist beliefs b consistent with \sigma such
that

    (2.2)    W_i(h_i(t), b_i, (\sigma'_i, \sigma_{-i}), \delta)
                 \le W_i(h_i(t), b_i, \sigma, \delta) + \epsilon

for all players i and all histories h_i(t) (not merely the null
initial history). Notice that W_i is measured as a present value at
time t, rather than time 1. Consequently, an \epsilon-sequential
equilibrium is defined so that the gain to deviating at time t is of
order \epsilon in time-t units. Notice that this definition is
stronger than the usual version, found in Fudenberg and Levine [1983] for
example: ordinarily the gain to deviating at time t is measured in
time-0 units, not time-t units.
For \delta = 1, we will impose a regularity condition on the
equilibrium. A uniform Nash equilibrium is a profile \sigma such that

    (2.3)    W_i^T(\sigma) converges, and

    (2.4)    for all \rho > 0, there exists T' such that T > T'
             implies, for all strategies \sigma'_i,
             W_i^T(\sigma'_i, \sigma_{-i}) \le W_i^T(\sigma) + \rho.

A uniform sequential equilibrium is a profile \sigma such that there
exist consistent beliefs b such that

    (2.5)    W_i^T(h_i(t), b_i, \sigma) converges for all histories
             h_i(t) and consistent beliefs b, and

    (2.6)    for all \rho > 0, there exists T' such that T > T'
             implies, for all histories h_i(t) and strategies
             \sigma'_i,

             W_i^T(h_i(t), b_i, (\sigma'_i, \sigma_{-i}))
                 \le W_i^T(h_i(t), b_i, \sigma) + \rho.

When \sigma satisfies condition (2.3) we call it regular; if it
satisfies (2.5) we say it is sequentially regular.
Our definitions of time-average equilibrium require that the
appropriate regularity condition be satisfied. In other words, we have
required that the time-average payoff exist if no player deviates, and
that there be a uniform bound on the gain to deviating in any sufficiently
long subgame. Both conditions are responses to our unease in comparing
lim infs or lim sups of payoffs when the limit does not exist. We impose
the conditions because we interpret time averaging as the idealized
version of a game with a long finite horizon or very little discounting.
Below we show that non-uniform equilibria cannot always be interpreted in
this way.
3. Discounting and Time Averaging

Our goal is to characterize the set of equilibrium payoffs when players
are very patient. In the next section we characterize truncated
\epsilon-Nash equilibrium payoffs with small \epsilon as containing a
certain set V*. Since this notion of equilibrium is not the most
economically meaningful, we first show that this is in fact a strong
characterization of payoffs. Specifically we show:
Theorem 3.1: Suppose there exists a sequence of times T_n, non-negative
numbers \epsilon_n \to 0, and strategy profiles \sigma_n such that
\sigma_n is a T_n-truncated time-average \epsilon_n-Nash equilibrium
and W^{T_n}(\sigma_n) \to v. Then:

(A) There exists a sequence of discount factors \delta_n \to 1,
non-negative numbers \hat{\epsilon}_n \to 0, and strategy profiles
\hat{\sigma}_n such that \hat{\sigma}_n is an
\hat{\epsilon}_n-sequential equilibrium for \delta_n and
W(\hat{\sigma}_n, \delta_n) \to v.

(B) There exists a uniform sequential equilibrium strategy profile
\hat{\sigma} such that W(\hat{\sigma}, 1) = v.
Remark 1: It is well known that the hypotheses of the theorem imply that
\sigma is a time-average Nash equilibrium (see, for example, Sorin
[1988]).

Remark 2: The hypothesis is weak in the sense that only approximate Nash
equilibria are required. However, as long as we consider approximate
equilibrium, or infinite time-average equilibrium, sequentiality has
little force; (A) and (B) show this. Roughly, the reason is that after a
deviation, no further deviations are anticipated, so the cost of
punishment need only be paid once. This has negligible cost if players
are very patient. It is easily seen in the following example, which is a
special case of the main theorem in the next section.
Consider a game in which players perfectly observe each other's past
play, and suppose there is a pure action profile a for which g(a)
strictly exceeds the minmax for all players. Clearly there is a discount
factor \underline{\delta} < 1 and number of periods K such that if
\delta > \underline{\delta}, the loss to each player of being minmaxed
by his opponents for K periods exceeds any possible one-shot gain of
deviating from a. Consider then a strategy profile consisting of a
review phase in which all players play a, and n different punishment
phases, in the i-th of which player i is minmaxed for K periods.
Initially, the game is in the review phase, and the review phase continues
as long as no player deviates, or more than one player deviates
simultaneously. Whenever a single player deviates, a punishment phase
against that player immediately begins. The punishment phase continues
K periods, regardless of whether any further deviation occurs; then the
game restarts with a new review phase. Since no player can profitably
deviate during a review phase for \delta > \underline{\delta}, these
strategies form a Nash equilibrium with payoffs g(a). It is equally
clear that they are not generally sequential (subgame perfect), because
players may profitably deviate during punishment phases.
In proving the folk theorem, Fudenberg-Maskin [1986a] construct
strategies that induce the players to carry out punishments by providing
"rewards" for doing so. But such rewards are not provided by our review
strategies. Indeed, the Fudenberg-Maskin construction is only possible
if the game satisfies a "full-dimensionality" condition that we have not
imposed. Thus the conclusion of (A) cannot in general be strengthened to
\epsilon(\delta) = 0. The review strategies do, however, form an
\epsilon(\delta)-sequential equilibrium with \epsilon(\delta) \to 0 as
\delta \to 1. (Note that with observable actions, sequential
equilibrium and subgame-perfect equilibrium are equivalent.) To show
this, we need only calculate the greatest gain to deviating during a
punishment phase. Recall that d is the greatest one-shot gain to any
player of deviating from any profile. Consequently, the gain is at most
(1-\delta)Kd, which clearly goes to zero as \delta \to 1. The point
is that the punishment lasts only K periods, and players never expect
to engage in punishment again. Consequently, with extreme patience, the
cost of the punishment is negligible. In particular, with time-averaging
this is an exact equilibrium.
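The review/punishment structure just described is a simple finite-state
machine on the publicly observed history. The following minimal Python
sketch (an illustration, not in the original; K and the transition rules
are the placeholders from the example above) makes the dynamics explicit.

    K = 5  # punishment length; a placeholder constant for the example

    def next_state(state, deviators):
        """Transition of the publicly observed profile, given the set of
        players who deviated this period."""
        if state == "review":
            if len(deviators) == 1:          # a single deviator is punished
                return ("punish", next(iter(deviators)), K)
            return "review"                  # none, or simultaneous deviators
        _, i, remaining = state
        if remaining > 1:
            return ("punish", i, remaining - 1)
        return "review"                      # punishment over: restart review

    state = "review"
    for devs in [set(), {2}, set(), set(), set(), set(), set()]:
        state = next_state(state, devs)
        print(state)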
Proof of (A): Consider the strategy \hat{\sigma}_n of playing
\sigma_n from 1 to T_n, then starting over and playing \sigma_n
from T_n + 1 to 2T_n, and so forth. We call periods 1 to T_n
"round 1", T_n + 1 to 2T_n "round 2", and so on. Fix \delta < 1
and any beliefs consistent with \hat{\sigma}_n. Let a history
(h_1(t), ..., h_n(t)) be given. Let us calculate an upper bound on any
player's gain to deviating in time-t normalized present value. Fix a
time t and choose k so that kT_n < t \le (k+1)T_n, so that t
is in round k+1. Play from round k+2 on is independent of what
happened during the previous rounds, and since \sigma_n is a truncated
time-average \epsilon_n-Nash equilibrium, we know that, regardless of
the history, no player can gain more than \epsilon_n per round in
time-average terms by deviating in any later round. Since the maximum
per-period gain to deviating is d, at most (1-\delta)T_n d can be
gained by deviating in round k+1; at most \epsilon_n can be gained in
round k+2, at most \delta^{T_n}\epsilon_n in round k+3, and so
forth. Consequently, no player can gain more than

    (1-\delta) T_n (d + \epsilon_n/(1-\delta^{T_n})).

As \delta \to 1, this quantity approaches \epsilon_n by l'Hopital's
rule. Clearly, then, we may choose \delta_n close enough to 1 that
\hat{\sigma}_n is a 2\epsilon_n-sequential equilibrium. In addition,
because of the repeated structure of \hat{\sigma}_n, it is clear that
W_i(\hat{\sigma}_n, \delta_n) \to W_i^{T_n}(\sigma_n) as \delta_n
\to 1, so that W_i(\hat{\sigma}_n, \delta_n) \to v_i.
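For completeness, the limit invoked in the last step can be checked
directly:

    \lim_{\delta \to 1} (1-\delta) T_n
        \Bigl( d + \frac{\epsilon_n}{1-\delta^{T_n}} \Bigr)
      = \underbrace{\lim_{\delta \to 1} (1-\delta) T_n d}_{=0}
        + T_n \epsilon_n \lim_{\delta \to 1}
          \frac{1-\delta}{1-\delta^{T_n}}
      = T_n \epsilon_n \cdot \frac{1}{T_n}
      = \epsilon_n ,

where the last limit follows from l'Hopital's rule, differentiating
numerator and denominator with respect to \delta.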
Proof of (B): Consider the strategy \hat{\sigma} of playing \sigma_1
from 1 to T_1, \sigma_2 from T_1 + 1 to T_1 + T_2, and so
forth. Fix any beliefs consistent with \hat{\sigma}. Let a history
(h_1(t), ..., h_n(t)) be given, and let t be in round k. By
deviating in the current round, player i can gain at most d per
period, as play in the future is independent of play in this round. The
per-period gain in round k+1 is at most \epsilon_{k+1}, in round
k+2 at most \epsilon_{k+2}, and so forth. Since \epsilon_k \to 0,
this implies that the time-average gain to deviating is zero, uniformly
over histories. Since W_i^{T_k}(\sigma_k) \to v_i, \hat{\sigma} is
a uniform sequential equilibrium, and this yields the desired conclusion.
Each of the two conclusions of Theorem 3.1 has strengths and
weaknesses. The limit of discounting is more appealing than
time-averaging, which suggests interpretation (A), but exact equilibrium
is more appealing than approximate equilibrium, which suggests
interpretation (B). In this context, it is worth reiterating that the
conclusion of (A) cannot be strengthened to \epsilon = 0. One
interpretation of this fact is that the set of time-average equilibria
includes not only the limits of exact discounted equilibria, but the
limits of approximate discounted equilibria as well.
If we weaken definition (2.6) to allow the gain to deviating, \rho,
to depend on T and t, we have a time-average sequential equilibrium,
rather than a uniform one. That these equilibria include all limits of
discounted sequential equilibria is shown by:

Proposition 3.2: If \sigma is sequentially regular, and is an
\epsilon(\delta)-sequential equilibrium for discount factor \delta
with \epsilon(\delta) \to 0, then \sigma is a time-average sequential
equilibrium.
Proof: This effectively follows from the fact that the limit points, as
\delta \to 1, of the normalized present values of a sequence of payoffs
are contained in the set of limit points of the finite time averages.
This implies that if \sigma is sequentially regular, then the
discounted value of payoffs along the continuation equilibrium approaches
the value with time averaging. Deviations from the continuation payoff,
when evaluated with the discounting criterion, yield values whose limits
are no greater than the lim sup of the time average.
It is worth noting that the converse of this proposition is not true.
An example of a time-average equilibrium that is not even an approximate
equilibrium with discounting* can be found in the following "guru" game.
Player 1 is the guru. He chooses one of three actions, 0, 1, or 2,
each period. Player 2 is a disciple. He also chooses one of two
actions. Each period the guru receives zero, while the disciple receives
an amount equal to the action chosen by the guru. In other words, the
guru is completely indifferent, while the disciple cares only about what
the guru does. Notice that sequentiality reduces to subgame perfection
in this example, and that subgame perfection has no force. Any strategy
by the guru is optimal in any subgame, while any Nash equilibrium need
only be adjusted for the disciple so that he chooses an optimum in
subgames not actually reached with positive probability in equilibrium.
(There is a trivial complication in that the guru has strategies such
that no optimum for the disciple exists.)
Consider the following equilibrium with time averaging. As long as the
disciple never played 2 in the past, the guru plays 1. Let t denote
the first period in which the disciple plays 2. Then the guru plays 2
in periods t+1 to 2t, plays 0 in periods 2t+1, ..., 3t+1, and in
period 3t+2 the guru reverts back to 1 forever. Regardless of the
history, the disciple always plays 1. With time averaging the disciple
is completely indifferent between all his deviations, since they all
yield a time average of 1. Similarly, the guru is clearly indifferent
between all his actions. Consequently this is a time-average
equilibrium. (In fact, it is also an equilibrium when the players' time
preference is represented by the overtaking criterion.) On the other
hand, for any discount factor \delta between 1/2 and 1 we can always
find a time t such that \delta^t lies between 1/4 and 1/2, and by
deviating at such a time player 2 gains at least 1/32. Consequently
this configuration is only an \epsilon-equilibrium for \epsilon >
1/32, and in particular, \epsilon does not converge to zero as
\delta goes to 1.
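The discounting claim can be checked numerically. The sketch below (an
illustration, not in the original) assumes the reconstruction above:
after a trigger at t, the guru plays 2 for t periods, 0 for t+1
periods, then 1 forever; the gain is the normalized present value, at the
trigger date, of the difference from the all-1s path.

    def deviation_gain(delta, t):
        """Normalized PV gain to the disciple from triggering at period t:
        t periods of (2 - 1), then t + 1 periods of (0 - 1), then no
        difference; in closed form 1 - 2*delta**t + delta**(2*t + 1)."""
        return 1 - 2 * delta**t + delta**(2 * t + 1)

    for delta in (0.6, 0.8, 0.95, 0.99):
        best = max(deviation_gain(delta, t) for t in range(1, 500))
        print(delta, round(best, 4))   # stays bounded away from zero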
We would argue that time-averaging is of interest only insofar as it
captures some sort of limit of discounting. Consequently, we argue that
this equilibrium of the guru game does not make good economic sense. We
do not know whether Proposition 3.2 or its converse holds for uniform
time-average equilibrium. If so, it is a strong argument in favor of
restricting attention to uniform equilibria.
4. A Folk Theorem

We now restrict attention to a limited, but important class of games,
which we call informationally connected games. All two-player games are
informationally connected. If N > 2, we say that player i is
directly connected to player j \ne i despite player k \ne i, j if
there exist a mixed action profile \alpha and a mixed action
\alpha'_i of player i such that

    \pi_j(\cdot, \alpha'_i, a_k, \alpha_{-i-k})
        \ne \pi_j(\cdot, a_k, \alpha_{-k})

regardless of the action a_k that player k plays. In other words,
at \alpha, player i has a deviation that can be potentially detected
by player j regardless of the play of player k. We say that player
i is connected to player j despite player k if there exists a
sequence of players i_1 = i, i_2, ..., i_n = j, with i_m \ne k
for all m, such that i_m is directly connected to i_{m+1} despite
player k. In other words, a message can always be passed from i to
j regardless of which single other player might try to interfere. A
game is informationally connected if every player is connected to every
other player.

* Note that the counterexample is to the strategy profile \sigma not
being an approximate equilibrium with discounting. From the folk
theorem, we know that there are other strategies that are exact
equilibria and yield approximately the same payoff. However, this is not
true in games with a Markov structure and absorbing states, as shown by
the example of Sorin [1986]. In fact, his example has equilibrium
payoffs with time averaging that are not even approximately equilibrium
payoffs with discounting.
For an example of such a game, suppose that each player determines an
output level of zero or one. "Average" price is a strictly decreasing
function of total output, and each player observes an "own price" that is
an independent random function of average price, with the property that
the distribution corresponding to a strictly higher average price
strictly stochastically dominates that corresponding to the lower price.
In this case, all players can choose zero output levels, and if any
player produces a unit of output, this changes the distribution of prices
for all other players, with a deviation by a second player merely
enhancing the signal by lowering the distribution of prices still
further.
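Direct connectedness can be tested mechanically from the signal
technology. The Python sketch below (an illustration under stated
assumptions, not the paper's construction) is a partial test: it checks
only pure deviations a'_i for player i and only pure interference
a_k by player k, which is sufficient but not necessary for the
definition.

    import itertools
    import numpy as np

    def directly_connected(pi_j, sizes, alpha, i, k, tol=1e-9):
        """Sufficient check that i is directly connected to j despite k
        at the mixed profile alpha. pi_j(a) is j's signal distribution
        (a numpy vector) at the pure profile a; sizes[m] is the number of
        actions of player m; alpha[m] is player m's mixed action."""
        def dist(overrides):
            # j's signal distribution with some players' actions fixed.
            total = 0.0
            for a in itertools.product(*[range(n) for n in sizes]):
                if any(a[m] != am for m, am in overrides.items()):
                    continue
                w = np.prod([alpha[m][a[m]] for m in range(len(sizes))
                             if m not in overrides])
                total = total + w * pi_j(a)
            return total
        for ai in range(sizes[i]):
            if all(np.max(np.abs(dist({i: ai, k: ak}) - dist({k: ak})))
                   > tol for ak in range(sizes[k])):
                return True
        return False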
A mutual threat point \bar{v} is a payoff vector for which there
exists a mutual punishment action \bar{\alpha} such that
g_i(\alpha'_i, \bar{\alpha}_{-i}) \le \bar{v}_i for all players i
and mixed actions \alpha'_i. With three or more players, the vector
v* of minmax values need not be a mutual threat point, as there may
not be a single action profile that simultaneously holds all of the
players to their minmax values. When such a profile exists, the game is
said to satisfy the "mutual minmax property" (Fudenberg-Maskin [1986a]).
One example where this property obtains is a repeated quantity-setting
oligopoly with capacity constraints, where the profile "all players
produce to capacity" serves to minmax all of the players.
A payoff vector that weakly Pareto-dominates a mutual threat point is
called mutually punishable; the closure of the convex hull of such
payoffs is the mutually punishable set \bar{V}. We say that a payoff
vector v is (independently) enforceable if there exists a mixed action
profile \alpha with g(\alpha) = v, and such that if for some player
i and mixed action \alpha'_i, g_i(\alpha'_i, \alpha_{-i}) > v_i,
then for some other player j \ne i,
\pi_j(\cdot, \alpha'_i, \alpha_{-i}) \ne \pi_j(\cdot, \alpha). In
other words, any improving deviation for player i can potentially be
detected by some other player j. The enforceable set \hat{V} is the
closure of the convex hull of the enforceable payoffs. Notice that in
addition to the static Nash equilibrium payoffs, which are clearly
enforceable, every extremal Pareto efficient payoff is. This is because
any extremal payoff is generated by a pure action profile, and any
efficient pure action profile is enforceable: if an action profile is
not enforceable, one player can strictly improve himself without anyone
else knowing or caring. Note also that if the unconditional play in any
Nash equilibrium at time t is a mixed (rather than correlated) action
profile, it is clear that the profile must be enforceable (except, in
the infinite time-average case, for a negligible fraction of periods).
However, it is possible that unenforceable payoffs can be achieved
through correlation.

Finally, we define the enforceable mutually punishable set
V* = \hat{V} \cap \bar{V}, which is closed, convex, and contains at
least the convex hull of static Nash equilibrium payoffs.
We can now prove:

Theorem 4.1 (Folk Theorem): In an informationally connected game, if
v \in V*, there exists a sequence of times T_n, of non-negative
numbers \epsilon_n \to 0, and strategy profiles \sigma_n, such that
\sigma_n is a T_n-truncated time-average \epsilon_n-Nash
equilibrium and W^{T_n}(\sigma_n) \to v.

Remark: This is a partial folk theorem in that, as remarked earlier, the
mutually enforceable payoffs may be a strictly smaller set than the
individually rational socially feasible ones. Moreover, in games with
imperfect observation of the opponents' actions and three or more
players, even the individually rational payoffs may not be a lower bound
on the set of payoffs that can arise as equilibria (see
Fudenberg-Levine-Maskin for an example). Finally, we do not know whether
the set of equilibria can include payoffs not in V*.
The proof constructs strategies that have three stages. In review
stages, players play to obtain the target payoff v. At the end of
review stages players "test" to see if their own payoff was sufficiently
close to the target level. Then follows a "communication stage" where
players use their actions to "communicate" whether or not the review was
passed; the assumption of informational connectedness is used to ensure
that such communication is possible. Finally, if players learn that the
test was failed, they revert to a "punishment state" for the remainder of
the game.

The idea of using strategies with reviews and punishments was
introduced by Radner [1981, 1986], who did not need the communication
phase because he studied models with publicly observed outcomes. Lehrer
[1986] introduces the idea of a communications phase to coordinate play.
Our proof is in some ways simpler than Radner's, as we establish only the
existence of \epsilon-equilibria of the finite-horizon games and then
appeal to Theorem 3.1. Lehrer's [1986] proof is more complex than ours
because he obtains a larger set of payoffs.
The key to proving Theorem 4.1 is a lemma proven in the Appendix:

Lemma 4.2: If \alpha is enforceable, then there exist M_j-vectors of
weights \lambda_j, j \ne i, and a constant \eta such that for all
a_i,

    \sum_{j \ne i} \lambda_j \cdot \pi_j(\cdot, a_i, \alpha_{-i})
        \ge g_i(a_i, \alpha_{-i}) + \eta,

with

    \sum_{j \ne i} \lambda_j \cdot \pi_j(\cdot, \alpha)
        = g_i(\alpha) + \eta.

This shows how we can reduce the problem of detecting profitable
deviations by player i to a one-dimensional linear test: we need only
deter those a_i that lead to information vectors
\pi_j(\cdot, a_i, \alpha_{-i}) lying above the hyperplane whose
existence is asserted in the Lemma. This makes the problem very similar
to one with publicly observed signals.
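Since the lemma asserts a separating hyperplane, the weights can in
practice be computed by linear programming. The following sketch (an
illustration, not the paper's construction; it assumes SciPy's linprog is
available) encodes the lemma's inequalities directly and reports a
feasible (lambda, eta) when one exists.

    import numpy as np
    from scipy.optimize import linprog

    def lemma_42_weights(P, Pbar, g, gbar):
        """Feasibility LP for the hyperplane in Lemma 4.2.
        P[a]  : the stacked signal distributions (pi_j(., a_i, alpha_-i)),
                j != i, for each pure deviation a_i, as one long vector.
        Pbar  : the stacked distributions at alpha itself.
        g[a]  : g_i(a_i, alpha_-i);  gbar : g_i(alpha).
        Returns (lambda, eta) with lambda . P[a] >= g[a] + eta for all
        a_i and lambda . Pbar = gbar + eta, if such weights exist."""
        P = np.asarray(P); g = np.asarray(g)
        n = P.shape[1]
        # Variables x = (lambda_1, ..., lambda_n, eta).
        A_ub = np.hstack([-P, np.ones((len(g), 1))])  # -P.lam + eta <= -g
        b_ub = -g
        A_eq = np.hstack([np.asarray(Pbar)[None, :], -np.ones((1, 1))])
        b_eq = np.array([gbar])
        res = linprog(np.zeros(n + 1), A_ub=A_ub, b_ub=b_ub,
                      A_eq=A_eq, b_eq=b_eq,
                      bounds=[(None, None)] * (n + 1))
        return (res.x[:n], res.x[n]) if res.success else None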
Proof of Theorem 4.1: Fix v \in V*, and suppose \epsilon > 0 is
given. We show how to find T and \sigma such that \sigma is a
T-truncated time-average 100\epsilon-Nash equilibrium, and
|W_i^T(\sigma) - v_i| < 100\epsilon. This clearly suffices. The
proof proceeds in several steps. First we construct a class of strategy
profiles that depends on a vector of constants determined by the game and
v, and on constants \ell^R, \ell^C, \ell', and \epsilon' that are
free parameters. These constants index the relative lengths of the
various phases of play, the absolute length of the phases, and the number
of times the phases are repeated; the length of the game T is
implicitly determined by these constants. We then show how to choose the
constants to make W_i^T(\sigma) within 100\epsilon of v_i.
Step 1 (Payoffs): Since v \in V*, v \in \hat{V} and v \in \bar{V}.
This means we can find a finite L'-dimensional vector of non-negative
coefficients \mu^1, ..., \mu^{L'} summing to 1, and payoffs
v^1, ..., v^{L'}, where each v^h is enforceable and
\sum_{h=1}^{L'} \mu^h v^h = v. Corresponding to these are mixed
action profiles \alpha^1, ..., \alpha^{L'} such that \alpha^h
yields payoff v^h. Since v \in \bar{V}, v is mutually punishable,
and there is a mutual punishment action \bar{\alpha} that enforces a
mutual threat point \bar{v} \le v. Moreover, it is clear that we can
find L'-dimensional vectors of integers L^1, ..., L^{L'} summing to
\bar{L} such that

    |\sum_{h=1}^{L'} (L^h/\bar{L}) v^h - v| < \epsilon.
Step 2 (Temporal Structure): We first describe the temporal structure of
the game, which consists of \ell' repetitions of a review stage
followed by a communications stage. Set L^C = N(N-2)(N-1)!. A review
stage lasts \ell^R \bar{L} periods, and a communications stage lasts
\ell^C L^C periods. Each stage is further subdivided into phases: a
review stage into L' phases indexed by h = 1, ..., L', with phase
h lasting \ell^R L^h periods, and a communications stage into L^C
phases, each lasting \ell^C periods. The temporal structure is
outlined in Figure 3.1; the game lasts
T = \ell'(\ell^R \bar{L} + \ell^C L^C) periods.

Each communications phase is assigned an index (i,j,k) corresponding
to a triple of players. (Each index generally occurs more than once.)
Fixing the third index k, the (i,j) indices are determined by taking
the set of all players but k and calculating every permutation of the
set, giving rise to (N-1)! blocks of (N-2) phases. Fix a
permutation (i_1, i_2, ..., i_{N-1}); the first phase in the block is
(i_1, i_2, k), the second (i_2, i_3, k), and so forth up to
(i_{N-2}, i_{N-1}, k). The point of this rather complex structure is
that, because the game is informationally connected, each player has an
opportunity to send a signal to all other players, without being blocked
by any other single player.
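The enumeration of phase indices is mechanical; the following sketch (an
illustration, not in the original) generates them and confirms the count
L^C = N(N-2)(N-1)!.

    from itertools import permutations

    def communication_phases(N):
        """For each excluded player k, every permutation of the remaining
        players contributes its N-2 adjacent sender/receiver pairs."""
        phases = []
        players = range(1, N + 1)
        for k in players:
            others = [p for p in players if p != k]
            for perm in permutations(others):
                for m in range(len(perm) - 1):
                    phases.append((perm[m], perm[m + 1], k))
        return phases

    print(len(communication_phases(4)))   # 4 * 2 * 3! = 48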
Using the fact that the game is informationally connected, we associate
distinct triples (i,j,k) with certain mixed strategies. If i is
connected to j despite k, we say that (i,j,k) is an active link;
we let \alpha^{ijk} be the mixed action profile allowing communication
and \alpha_i^{ijk} the deviation for player i that enables him to
communicate with j despite k. If (i,j,k) is not an active link,
we define \alpha^{ijk} and \alpha_i^{ijk} arbitrarily.

Step 3 (Strategies): We now describe player i's strategy. He may be
in one of two states, a reward state or a punishment state. He begins
the game in the reward state. The punishment state is absorbing, so once
it is reached, player i remains there forever. We must describe how a
player plays in each state, and when he moves from the reward to the
punishment state.

[Figure 3.1: Temporal Structure]
In the reward state and review stage, player i regards the stage as
divided into L' phases, with phase h lasting \ell^R L^h periods;
during the h-th phase he plays \alpha_i^h. In the punishment state
and review stage, player i regards the stage as subdivided into the
same L' phases; during the h-th phase he plays \bar{\alpha}_i. In
the reward state and communications stage, in the phase indexed by
(i',j,k) he plays \alpha_i^{i'jk}. In the punishment state and
communications stage, in the phase indexed by (i',j,k) he plays
\alpha_i^{i'jk} if i \ne i', and the communication deviation
\alpha_{i'}^{i'jk} if i = i' and (i',j,k) is an active link.

Player i can change states only at the end of a review stage, or at
the end of a communications phase (j,i,k) that is an active link. The
transition is determined by a parameter \epsilon'. At the end of each
reward phase h of the review stage, player i calculates
\hat{\pi}_i(z_i), the fraction of the \ell^R L^h periods in which
z_i occurred. If at the end of the review stage
|\hat{\pi}_i(\cdot) - \pi_i(\cdot, \alpha^h)| > \epsilon' for some
h, player i switches to the punishment state. At the end of the
communications phase corresponding to a triple (j,i,k) that is an
active link, player i calculates \hat{\pi}_i(z_i), the fraction of
the \ell^C periods in which z_i occurred; if
|\hat{\pi}_i(\cdot) - \pi_i(\cdot, \alpha^{jik})| > \epsilon', player
i switches to the punishment state. In all other cases player i's
state does not change; in particular, the punishment state is absorbing.

We must show how to choose \ell^R, \ell^C, \ell', and \epsilon'
to give expected payoffs within 100\epsilon of v, and no greater
than a 100\epsilon gain to any player from deviating.
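The end-of-phase test is a simple comparison of empirical and theoretical
signal frequencies. A minimal sketch (an illustration, not in the
original; the norm used for the test is not specified in the text, so the
sup norm is assumed):

    import numpy as np

    def passes_review(signals, pi_theoretical, eps_prime):
        """Pass if no cell of the empirical signal distribution differs
        from the theoretical one under alpha^h by more than eps_prime."""
        counts = np.bincount(np.asarray(signals),
                             minlength=len(pi_theoretical))
        pi_hat = counts / len(signals)
        return np.max(np.abs(pi_hat - np.asarray(pi_theoretical))) \
            <= eps_prime

    # Signals drawn from the theoretical distribution pass the test with
    # probability approaching one as the phase length grows (weak law).
    rng = np.random.default_rng(1)
    pi_h = [0.5, 0.3, 0.2]
    draws = rng.choice(3, size=2000, p=pi_h)
    print(passes_review(draws, pi_h, eps_prime=0.05))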
Step 4 (Payoffs): First we consider the equilibrium payoffs. Let
\pi' denote the probability that no player ever enters the punishment
state, and let v' be the expected payoff vector conditional on this
event. Let \tilde{v} = \sum_h (L^h/\bar{L}) v^h. The payoff v'
differs from \tilde{v} solely due to the communications phases,
yielding

    (4.1)    |v' - \tilde{v}|
                 \le d / (1 + (\ell^R/\ell^C)(\bar{L}/L^C)),

where recall that d is the greatest difference between any payoffs.
Since |\tilde{v} - v| < \epsilon by construction, we conclude

    (4.2)    |W_i^T(\sigma) - v_i|
                 \le \epsilon + d / (1 + (\ell^R/\ell^C)(\bar{L}/L^C))
                     + (1-\pi')d.

This shows we must take (\ell^R/\ell^C)(\bar{L}/L^C) very large and
\pi' close to one. We will show that the weak law of large numbers
implies that \pi' is close to one if \ell^R and \ell^C are both
sufficiently large (relative to \epsilon').
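The communications correction in (4.1) comes from a counting identity:

    \frac{\text{communications periods}}{\text{all periods}}
      = \frac{\ell' \ell^C L^C}{\ell'(\ell^R \bar{L} + \ell^C L^C)}
      = \frac{1}{1 + (\ell^R/\ell^C)(\bar{L}/L^C)} ,

and play in each such period can move the expected payoff by at most d,
which gives the bound.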
Step 5 (Gain to Deviating): Next, we consider how much a player might
gain by deviating. During a communications stage (or in any period, for
that matter) the greatest per-period gain is d. However, only a
fraction \ell^C L^C / (\ell^R \bar{L} + \ell^C L^C) of the periods in
the game lie in communications phases, so the greatest per-period gain
over the whole game due to deviations during communications is

    (4.3)    [communications]
                 d / (1 + (\ell^R/\ell^C)(\bar{L}/L^C)).

Next we consider review stages. Fixing a deviator, there are three
possibilities: all other players are in the punishment state; all are in
the reward state; or they disagree on the state. Of course a player
contemplating deviating may not be certain which of these is true, but
since he can only benefit from this extra information, we may suppose he
knows which case he faces.

If all other players are in the punishment state, they will remain
there regardless of player i's play. Regardless of how he plays
during a punishment phase, player i gets at most \bar{v}_i per
period. Since \bar{v}_i \le v_i and |W_i^T(\sigma) - v_i| is
bounded by (4.2), the largest gain over W_i^T(\sigma) that player i
can obtain by deviating through the punishment state is

    (4.4)    [punishment]
                 \bar{v}_i + \epsilon - W_i^T(\sigma)
                     \le 2\epsilon
                         + d / (1 + (\ell^R/\ell^C)(\bar{L}/L^C))
                         + (1-\pi')d,

where the inequality follows from (4.2).
Next, suppose that the other players disagree about the state (so in
particular, N > 2). Then player i can possibly get d per period.
Let \bar{\pi} be a lower bound on the probability that all opponents
agree on the punishment state at the end of the subsequent communications
stage (and therefore agree for the rest of the game). Then the deviating
player can at best hope to get d for one stage with certainty, and
d in all periods with probability (1-\bar{\pi}). Since there are
\ell' review stages, d per period for one stage is actually worth
d/\ell', yielding

    (4.5)    [confusion]    d/\ell' + (1-\bar{\pi})d.
This leaves review stages where all opponents are in the reward state.
Let 1-\bar{\pi}' be an upper bound on the probability that player i
both gains more than 2\epsilon and no opponent enters the punishment
state at the end of the stage. Then player i can at best gain
2\epsilon in all periods, and can gain more than 2\epsilon and
remain with all opponents in the reward state next stage with probability
no more than 1-\bar{\pi}'. Since at best he can gain d, player i
gains at most

    (4.6)    [reward]    2\epsilon + d/\ell' + (1-\bar{\pi}')d

in reward stages.
Adding these bounds, we find an upper bound on the gain to deviating:

    (4.7)    [total]
                 4\epsilon
                     + 2d / (1 + (\ell^R/\ell^C)(\bar{L}/L^C))
                     + 2d/\ell'
                     + (1-\bar{\pi})d + (1-\bar{\pi}')d + (1-\pi')d.

Our goal is to choose \ell^R, \ell^C, \ell', and \epsilon' to
simultaneously make (4.2) and (4.7) smaller than 100\epsilon.

Step 6: Fix \ell' so that d/\ell' < \epsilon, and let
\ell^C = \ell^C(\ell^R) be the largest integer such that
d / (1 + (\ell^R/\ell^C)(\bar{L}/L^C)) \le \epsilon. We may then
simplify (4.2) to

    (4.2')   |W_i^T(\sigma) - v_i| \le 2\epsilon + (1-\pi')d,

and (4.7) to

    (4.7')   8\epsilon + (1-\bar{\pi})d + (1-\bar{\pi}')d
                 + (1-\pi')d.

Consequently, it suffices to show that we can choose \ell^R (with
\ell^C = \ell^C(\ell^R)) and \epsilon' so that (1-\pi'),
(1-\bar{\pi}), and (1-\bar{\pi}') are all less than or equal to
\epsilon/d.
By Lemma A.1 in the Appendix, for the h-th reward phase there exist
\epsilon^h > 0 and \ell^h such that for all \epsilon' \le
\epsilon^h and \ell^R \ge \ell^h, the probability that either player
i gains no more than 2\epsilon in the phase or at least one opponent
switches to the punishment state at the end of the stage is at least
(1-\epsilon/d)^{1/L'}. Consequently, for \epsilon' \le \min_h
\epsilon^h and \ell^R \ge \max_h \ell^h, we have \bar{\pi}' \ge
1 - \epsilon/d.

By Lemma A.2 in the Appendix, for the communications links (j,k,i)
that are active, there exist \epsilon^{jki} > 0 and \ell^{jki} such
that for \epsilon' \le \epsilon^{jki} and \ell^C \ge \ell^{jki},
if player j plays \alpha_j^{jki}, the probability of player i
switching to the punishment state at the end of the phase is at least
(1-\epsilon/d)^{1/L^C}, regardless of the play of player k.
Consequently, for \epsilon' \le \min \epsilon^{jki} and \ell^C \ge
\max \ell^{jki}, we have \bar{\pi} \ge 1 - \epsilon/d.

Finally, by the weak law of large numbers, there exists \ell^{**}
such that for \ell^R \ge \ell^{**} and \ell^C = \ell^C(\ell^R) we
have \pi' \ge 1 - \epsilon/d, provided \epsilon' \le
\min(\epsilon^h, \epsilon^{jki}). Choosing \epsilon' =
\min(\epsilon^h, \epsilon^{jki}) and \ell^R = \max(\ell^*,
\ell^{**}), where \ell^* = \max(\ell^h, \ell^{jki}), then completes
the proof.
Remark: How frequently does punishment occur? As T \to \infty, the
proof shows that the probability of punishment goes to zero. Examining
the proof of Theorem 3.1, we see that infinite equilibria are constructed
by splicing together a sequence of truncated equilibria. For approximate
discounted equilibria, the construction repeats the same equilibrium over
and over, so that the probability of punishment in each round is
constant. By the zero-one law, this means punishment occurs infinitely
often.

In the case of time-average equilibria, it is irrelevant whether
punishment occurs infinitely often or not. The truncated equilibria that
are spliced together have decreasing probabilities of punishment, say
\pi_1, \pi_2, ... \to 0. By the zero-one law, if \sum_n \pi_n =
\infty, punishment occurs infinitely often, while if \sum_n \pi_n <
\infty, punishment almost surely stops in finite time. However, we can
choose a subsequence (\pi'_n) such that \sum_n \pi'_n < \infty.
The corresponding truncated equilibria, when spliced together, form a
time-average equilibrium in which punishment almost surely ceases. On
the other hand, we may form a sequence in which the t-th truncated
equilibrium is repeated 1/\pi_t times. Splicing this sequence
together clearly yields a time-average equilibrium with \sum_n \pi'_n
= \infty, and so punishment occurs infinitely often.

This discussion should be contrasted with the results of Radner
[1981], in which punishment occurs infinitely often, and
Fudenberg-Kreps-Maskin [1989], in which it almost surely ceases.
APPENDIX

Proof of Lemma 4.2: We may regard (\pi, g_i) as a vector with
components \pi_j, j \ne i, and g_i in the
\sum_{j \ne i} M_j + 1 dimensional Euclidean space. Define F to be
the subset of this space such that there exists a mixed strategy
\alpha'_i for player i with \pi_j = \pi_j(\cdot, \alpha'_i,
\alpha_{-i}) and g_i \le g_i(\alpha'_i, \alpha_{-i}), and let
(\bar{\pi}, \bar{g}_i) be the vector with \bar{\pi}_j =
\pi_j(\cdot, \alpha) and \bar{g}_i = g_i(\alpha). Then
(\bar{\pi}, \bar{g}_i) \in F. Moreover, F is a convex polyhedral
set and its extreme points correspond to pure strategies for player i.
Since F is convex polyhedral, we may characterize F by the linear
inequalities

    A(\pi - \bar{\pi}) + b(g_i - \bar{g}_i) \ge c,

where A is a matrix and b and c are vectors. Since
(\bar{\pi}, \bar{g}_i) \in F, c \le 0; enforceability implies that
if (\pi, g_i) \in F and \pi = \bar{\pi}, then g_i \le \bar{g}_i.

Consider minimizing \lambda(\pi - \bar{\pi}) - (g_i - \bar{g}_i)
subject to (\pi, g_i) \in F. Suppose this problem has a solution
equal to zero. Then \lambda \cdot \pi \ge g_i + \eta for all
(\pi, g_i) \in F, with equality at (\bar{\pi}, \bar{g}_i), where
\eta = \lambda \cdot \bar{\pi} - \bar{g}_i; this is the desired
conclusion. Since a feasible solution to the primal exists, a minimum of
zero exists if and only if the dual has a feasible plan yielding zero.
The dual is to find \omega \ge 0 with (\omega A, \omega b) =
(\lambda, -1) and \omega c = 0. By enforceability, there is some
component p with c_p = 0 and b_p < 0: otherwise every constraint
binding at (\bar{\pi}, \bar{g}_i) would permit increasing g_i
without changing \pi, contradicting enforceability. Choose
\omega_q = 0 for q \ne p and \omega_p = -1/b_p > 0, and set
\lambda = \omega A. This is the desired solution.
Lemma A.1: Suppose players j \ne i are in the reward state and follow
their equilibrium strategies in review phase h. For every \epsilon'
and \epsilon there exist \ell^h and \beta > 0 such that for
\ell^R \ge \ell^h, regardless of the strategy used by player i,

    Prob{ |\hat{\pi}_j(\cdot) - \pi_j(\cdot, \alpha^h)| < \epsilon'
          for all j \ne i, and
          (1/(L^h \ell^R)) \sum_t r_i(t) > g_i(\alpha^h) + 2\epsilon }
        \le 1 - \beta.

(Recall that r_i(t) = r_i(z_i(t)) is player i's realized payoff in
period t.)

Proof: From Lemma 4.2 there exist, for player i, M_j-vectors of
weights \lambda_j and a constant \eta such that

    \sum_{j \ne i} \lambda_j \cdot \pi_j(\cdot, a_i, \alpha^h_{-i})
        \ge g_i(a_i, \alpha^h_{-i}) + \eta

for all a_i, and

    \sum_{j \ne i} \lambda_j \cdot \pi_j(\cdot, \alpha^h)
        = g_i(\alpha^h) + \eta.

Consider then the random variable

    x(t) = \sum_{j \ne i} \lambda_j(z_j(t)),

where \lambda_j(z_j(t)) is the component of \lambda_j corresponding
to the realized signal z_j(t). The idea of this construction is to use
a single scalar random variable x(t) to summarize the information
received by all other players. Let

    \bar{x} = (1/(L^h \ell^R)) \sum_t x(t)
            = \sum_{j \ne i} \lambda_j \cdot \hat{\pi}_j(\cdot).

It is clear then that there exists \tilde{\epsilon} such that if
\bar{x} > g_i(\alpha^h) + \eta + \epsilon, then
|\hat{\pi}_j - \pi_j(\cdot, \alpha^h)| > \tilde{\epsilon} for some
j. That is, if the sample mean \bar{x} is far from what it would be
had i played \alpha_i^h, then some player j \ne i will observe
that his empirical information is far from its theoretical distribution
under \alpha^h. Consequently, it suffices to show, with
\bar{r} = (1/(L^h \ell^R)) \sum_t r_i(t), that

    Prob{ \bar{x} \le g_i(\alpha^h) + \eta + \epsilon and
          \bar{r} > g_i(\alpha^h) + 2\epsilon } \le 1 - \beta.

Fix a strategy for player i and consider

    \tilde{x}(t) = x(t) - E[x(t) | x(t-1), r(t-1), ...].

These are uncorrelated random variables with zero mean, and they are
bounded independent of the particular strategy of player i. Similar
considerations apply to

    \tilde{r}(t) = r_i(t) - E[r_i(t) | x(t-1), r(t-1), ...].

Consequently, the weak law of large numbers shows that, as \ell^R \to
\infty, the sample means of \tilde{x} and \tilde{r} converge to
zero in probability, uniformly over strategies for player i. Let
\alpha_i(t) be player i's mixed action at t conditional on his
private history. Since

    E[x(t) | ...] = \sum_{j \ne i} \lambda_j \cdot
        \pi_j(\cdot, \alpha_i(t), \alpha^h_{-i})
        \ge g_i(\alpha_i(t), \alpha^h_{-i}) + \eta
        = E[r_i(t) | ...] + \eta,

the event \bar{r} > g_i(\alpha^h) + 2\epsilon and \bar{x} \le
g_i(\alpha^h) + \eta + \epsilon implies \bar{x} - \bar{r} < \eta -
\epsilon, and hence that the sample mean of \tilde{x} - \tilde{r} is
below -\epsilon. Since this converges to zero in probability, this
gives the desired conclusion.

Lemma A.2: Suppose player j is in the punishment state, that
(j,k,i) is an active communication phase, and that all players
k' \ne i, j follow their equilibrium strategies. Then for every
\epsilon' there exist \ell^{jki} and \beta^{jki} > 0 such that
for \ell^C \ge \ell^{jki},

    Prob{ |\hat{\pi}_i(\cdot) - \pi_i(\cdot, \alpha^{jki})| > \epsilon' }
        \ge \beta^{jki},

regardless of the play of player k.

Proof: This is similar to, but simpler than, Lemma A.1. Observe that
the set of vectors
\pi_i(\cdot, \alpha_j^{jki}, a_k, \alpha^{jki}_{-j-k}) for different
a_k is compact and convex, and by assumption (the link is active) does
not contain \pi_i(\cdot, \alpha^{jki}). Consequently, we may find an
M_i-vector of weights \lambda and a scalar \eta > 0 such that

    \lambda \cdot \pi_i(\cdot, \alpha_j^{jki}, a_k, \alpha^{jki}_{-j-k})
        \ge \lambda \cdot \pi_i(\cdot, \alpha^{jki}) + \eta

for all a_k. Set x(t) = \lambda(z_i(t)). Again,

    \tilde{x}(t) = x(t) - E[x(t) | x(t-1), ...]

are uncorrelated, bounded random variables with zero mean, whose sample
mean converges uniformly to zero in probability. Since

    E[x(t) | ...] = \lambda \cdot \pi_i(\cdot, \alpha_j^{jki},
        \alpha_k(t), \alpha^{jki}_{-j-k})
        \ge \lambda \cdot \pi_i(\cdot, \alpha^{jki}) + \eta,

we have \bar{x} > \lambda \cdot \pi_i(\cdot, \alpha^{jki}) + \eta/2
with probability approaching one, and hence
|\hat{\pi}_i - \pi_i(\cdot, \alpha^{jki})| > \epsilon' for
\epsilon' small enough. Again, we get the desired conclusion.
REFERENCES

Abreu, D., D. Pearce and E. Stacchetti (1987), "Toward a Theory of
Discounted Repeated Games With Imperfect Monitoring," unpublished ms.,
Harvard.

Abreu, D., D. Pearce and E. Stacchetti (1986), "Optimal Cartel Monitoring
With Imperfect Information," Journal of Economic Theory, 39, 251-69.

Friedman, J. (1971), "A Non-Cooperative Equilibrium for Supergames,"
Review of Economic Studies, 38, 1-12.

Fudenberg, D., D. Kreps and E. Maskin (1989), "Repeated Games With Long
Run and Short Run Players," mimeo, MIT.

Fudenberg, D., D. Levine, and E. Maskin (1989), "The Folk Theorem With
Imperfect Public Information," mimeo.

Fudenberg, D., and E. Maskin (1986a), "Folk Theorems for Repeated Games
With Discounting and Incomplete Information," Econometrica, 54, 533-54.

Fudenberg, D., and E. Maskin (1986b), "Discounted Repeated Games With
One-Sided Moral Hazard," unpublished ms., MIT.

Fudenberg, D., and E. Maskin (1989), "The Dispensability of Public
Randomizations in Discounted Repeated Games," unpublished ms., MIT.

Fudenberg, D., and D. Levine (1983), "Subgame Perfect Equilibria of
Finite- and Infinite-Horizon Games," Journal of Economic Theory, 31
(December), 251-68.

Green, E., and R. Porter (1984), "Noncooperative Collusion Under
Imperfect Price Information," Econometrica, 52, 87-100.

Kreps, D., and R. Wilson (1982), "Sequential Equilibrium," Econometrica,
50, 863-94.

Lehrer, E. (1985), "Two-Player Repeated Games With Non-Observable
Actions and Observable Payoffs," mimeo, Northwestern.

Lehrer, E. (1988a), "Extensions of the Folk Theorem to Games With
Imperfect Monitoring," mimeo, Northwestern.

Lehrer, E. (1988b), "Internal Correlation in Repeated Games,"
Northwestern CMSEMS Discussion Paper #800.

Lehrer, E. (1988c), "Nash Equilibria of n-Player Repeated Games With
Semi-Standard Information," mimeo, Northwestern.

Lehrer, E. (1988d), "Two Players' Repeated Games With Non-Observable
Actions," mimeo, Northwestern.

Lehrer, E. (1989), "Lower Equilibrium Payoffs in Two-Players Repeated
Games With Nonobservable Actions," forthcoming, International Journal of
Game Theory.

Porter, R. (1983), "Optimal Cartel Trigger-Price Strategies," Journal of
Economic Theory, 29, 313-38.

Radner, R. (1981), "Monitoring Cooperative Agreements in a Repeated
Principal-Agent Relationship," Econometrica, 49, 1127-48.

Radner, R. (1985), "Repeated Principal Agent Games With Discounting,"
Econometrica, 53, 1173-98.

Radner, R. (1986), "Repeated Partnership Games With Imperfect Monitoring
and No Discounting," Review of Economic Studies, 53, 43-58.

Radner, R., R. Myerson and E. Maskin (1986), "An Example of a Repeated
Partnership Game With Discounting and With Uniformly Inefficient
Equilibria," Review of Economic Studies, 53, 59-70.

Rubinstein, A. and M. Yaari (1983), "Repeated Insurance Contracts and
Moral Hazard," Journal of Economic Theory, 30, 74-97.

Sorin, S. (1986), "Asymptotic Properties of a Non-Zero Sum Stochastic
Game," International Journal of Game Theory, 15, 101-107.

Sorin, S. (1988), "Supergames," mimeo, CORE.

Stigler, G. (1961), "The Economics of Information," Journal of Political
Economy, 69, 213-25.