Improving the Relationship: A Principal-Agent Model of Progressive Learning and Path Dependence ∗

Avidit Acharya† and Juan Ortner‡
May 3, 2016
Preliminary - comments welcome
Abstract
The ratchet effect literature studies principal-agent relationships in which the principal lacks inter-temporal commitment power and the agent possesses persistent private
information. A key message of this literature is that in such a setting the possibilities for learning by the principal are typically limited and the principal must content
herself with low values, giving up substantial informational rents to the agent. This
paper investigates a new source of potential learning for the principal: time-varying
productivity shocks. We show that when the incentive environment is subject to these
shocks, the principal may be able to increase her value by gradually learning the agent’s
private information over time. We find conditions under which full learning eventually
takes place so that the principal eventually achieves her first best payoffs, as well as
conditions under which learning is path dependent in the sense that the history of early
shocks determines the principal’s long run value. We show how a strategic principal
leverages these shocks to both increase her value and potentially achieve efficiency even
when commitment is not possible.
JEL Classification Codes: C73, D86
Key words: principal-agent model, adverse selection, ratchet effect, inefficiency,
learning, path dependence
∗ For helpful comments, we would like to thank Stephane Wolton and seminar audiences at Boston University, Stanford, Berkeley and the LSE/NYU political economy conference.
† Assistant Professor of Political Science, 616 Serra Street, Stanford University, Stanford CA 94305 (email: avidit@stanford.edu).
‡ Assistant Professor of Economics, Boston University, 270 Bay State Road, Boston MA 02215 (email: jortner@bu.edu).
1 Introduction
An early literature on the ratchet effect studies long-run principal-agent relationships, finding
that the value of the relationship to the principal suffers when the agent has persistent private
information and the principal lacks long-term commitment power. The lack of commitment
power, in particular, hinders the principal’s ability to incentivize information disclosure, and
as a result, the principal has to give up substantial informational rents to the agent. One
key message of this literature is that without commitment power, the principal’s ability to
improve the relationship by learning the agent’s private information is severely limited.
In this paper, we investigate a new source of learning for the principal that mitigates
the ratchet effect: time-varying productivity shocks. Such shocks are a natural feature of
most dynamic relationships, and our model departs from previous work (e.g., Hart and Tirole
(1988) and Schmidt (1993)) only with respect to these shocks. In particular, we maintain the
assumption that the principal lacks commitment power, and the agent’s private information
is persistent. Although these productivity shocks further complicate the environment, we
show that they provide the principal with valuable opportunities to learn the agent’s private
information over time. We are interested in how the value of the relationship for the principal
evolves in the presence of these shocks.1
In each period of our model, the principal offers the agent a transfer in exchange for
taking an action that benefits the principal. The principal has short-term, but not long-term, commitment power: she can credibly promise to pay a transfer in the current period
if the agent takes the action, but she cannot commit to any future transfers. The principal
is able to observe the agent’s decision, but the agent’s cost of taking the action is his private
information and is constant over time. In each period, the realization of a productivity shock
affects the size of the benefit that the principal obtains from having the agent take the action.
The current level of productivity is publicly observed by both the principal and the agent at
the start of the period, and productivity evolves over time as a Markov process.
The basic structure of our model is motivated by the focus of the ratchet effect literature
and facilitates a direct comparison of our results to existing results by Hart and Tirole (1988),
1 Other papers examine alternative approaches to mitigating the ratchet effect. For example, Kanemoto
and MacLeod (1992) show that piece-rate contracts alleviate the ratchet effect in labor contracting when
there is competition for second-hand workers. Carmichael and MacLeod (2000) show that the ratchet effect
can be mitigated when the principal and agent play a cooperative equilibrium that is sustained via the
threat of future punishment. Fiocco and Strausz (2015) show that the ratchet effect can be mitigated when
contracting is strategically delegated to an independent third party. Our paper differs from this work in that
we do not introduce external sources of contract enforcement nor do we reintroduce commitment through
the back door by allowing for punishment strategies.
Schmidt (1993), Gerardi and Maestri (2015), and others.2 These authors consider stationary
models in which the agent has persistent private information, and show that the principal’s
inability to commit to long term contracts severely limits what she can learn about the agent.3
Hart and Tirole (1988) and Schmidt (1993) establish their results by studying games with
long but finite time horizons, while Gerardi and Maestri (2015) assume that the time horizon
is infinite. We take the latter approach, and focus on pure strategy Markovian equilibria that
are optimal for the principal. In these equilibria, the shock variable and principal’s belief are
treated as the relevant state variables. These restrictions keep our model closely related to
the existing work in the sense that without the productivity shocks, our equilibrium collapses
to the standard ratchet effect equilibrium studied in the literature.4
Our analysis produces three key results that contrast with the main results of the previous
work cited above. First, we show that in the presence of productivity shocks, the principal’s
lack of commitment power may produce lasting inefficiencies. To see why, consider a productivity shock at which it is only efficient for low cost agents to take the action. If the
equilibrium outcome were efficient, the principal would learn information about the agent’s
cost after observing the agent’s choice of action. Given the principal’s inability to commit,
a low cost agent would not be willing to reveal his private information if the benefit that
he obtains from pooling with high cost types is sufficiently large. When this happens, the
equilibrium must necessarily be inefficient.
Second, and more importantly, we show that the principal might be able to gradually
learn the agent’s cost over time. In some cases, she may be able to eventually learn the
agent’s exact cost, while in other situations only some (but not full) learning takes place.
The principal’s ability to learn arises because the benefit that low cost types obtain by pooling
with high cost types changes over time, together with the level of productivity. Specifically,
mimicking a high cost type becomes less profitable for a low cost type when productivity is
low. In such periods, it becomes cheaper for the principal to extract information from low
cost agent types, and thus she may find it optimal to do so.5
2 Other papers in the literature include Freixas et al. (1985), Gibbons (1987), Laffont and Tirole (1988),
Dewatripont (1989), and, more recently, Halac (2012) and Malcomson (2015).
3 One paper in which the environment is non-stationary is Blume (1998), in which the agent's type is
changing over time. In our model, the agent’s private information is fixed throughout the entire game, and
the non-stationarity of our environment arises from changes in productivity over time.
4 Specifically, in the case of no shocks, the path of play generated by our equilibrium is, for arbitrarily long histories, identical to the equilibrium path of a corresponding game in which the time horizon is long but finite, as in Hart and Tirole (1988) and Schmidt (1993).
5 The idea that time-varying shocks can ameliorate a player's lack of commitment also appears in Ortner (2016), who studies the problem of a durable goods monopolist with time-varying production costs. The paper shows that, unlike the classic Coase conjecture results in Gul et al. (1986) and Fudenberg et al. (1985), a monopolist with time-varying costs may extract rents from consumers.
Third, and finally, we uncover an interesting feature of the equilibrium: it may be path-dependent. By this we mean that the information that the principal is able to learn about the
agent’s type along the path of play may depend on the sequence of productivity shocks that
was realized early on in the relationship. Consequently, the principal’s long-run value from
the relationship may also depend crucially on the path of productivity in the early stages.
This is true even when the process governing the evolution of productivity is ergodic and our
equilibrium concept is Markovian.
To understand the intuition behind our path dependence result, consider a setting with
three possible agent types, with costs c1 < c2 < c3. In this setting, the information rents that a c1-agent type gets by mimicking a c2-agent type depend on how often the c2 type is
expected to take the productive action in the future. In turn, how often a c2 type takes the
action depends on the principal’s beliefs. If the principal assigns positive probability to the
agent’s type being c3 , the c2 -type has an incentive to mimic the c3 -type by not taking the
action in periods when productivity is low. This incentive disappears if at some point along
the path of play the principal learns that the agent’s cost is not c3 . As a result, there may
be levels of productivity at which it is profitable for the principal to incentivize a c1 -type
to reveal his type when she assigns positive probability to all three agent types. However,
if at some point in the past the principal learned that the agent's cost is not c3, it becomes
too expensive for her to incentivize a c1 -agent type to reveal his private information. The
principal then never fully learns the agent’s type.
Our paper is relevant to the set of applications that motivated the original work of Hart
and Tirole (1988) on contract renegotiation, particularly repeated buyer-seller relationships.
As a model of repeated bargaining with one-sided offers, it is also relevant to the literature
on repeated negotiations in which path dependence has been highlighted as an important
empirical feature (see, e.g., Kennan, 2001). The two main assumptions of our model—that
inter-temporal contracts are not available, and that the relationship is periodically hit by
productivity shocks—make our results especially relevant to a range of applications for which
these assumptions are natural. The literature on relational contracting, for example, starts
from the premise that not everything can be contracted (see, e.g., Levin, 2003). Shareholders
may provide the managers of their firm with short term incentives, but they cannot always
make long-term commitments. In addition, as the firm’s investment opportunities change,
its productivity will also vary over time.
Similarly, in many political economy applications, it is natural to assume that key actors like governments and international organizations cannot always make credible long-term
commitments. Dixit (2000), for example, suggests that the IMF’s relationship with a client
government is a principal-agent relationship. He writes that “IMF programs are incentive
schemes or mechanisms” and that “viewing them explicitly in this way ... reminds us of
the essential common elements of such problems, for example asymmetries of information
and observation, credibility of commitment, [etc.]” He further explains how “the outcomes
[of IMF programs] are eventually beneficial, for example lower inflation and better access to
international financial markets, but taking the required actions can have some economic costs
to a country, for example higher unemployment in the short run, and political costs to its
government, for example a reduction in subsidies to its favored groups.” Our model speaks
to this view, especially if the client government has private information about the political
costs of reform. Since the benefits to economic reform will, in general, depend on the state
of the economy as it moves through different points on the business cycle, our assumption
that there are productivity shocks to the environment is also natural in this setting.
2 Model

2.1 Setup
Consider the following repeated interaction between a principal and an agent. Time is discrete
and indexed by t = 0, 1, 2, ..., ∞. Both players are risk-neutral expected utility maximizers
and share a common discount factor δ < 1.6
At the start of each period t, a state bt is drawn from a finite set of states B, and is
publicly revealed. After observing bt ∈ B, the principal chooses a transfer Tt ≥ 0 to
offer the agent in exchange for taking a productive action. The agent then decides whether
or not to take the action. We denote the agent's choice by at ∈ {0, 1}, where at = 1 means that the agent takes the action in period t. The action provides the principal with a benefit equal to
bt . We assume that b > 0 for all b ∈ B. The action, however, has a cost c > 0 to the agent.
This cost is private information to the agent, and is fixed through time. The set of possible
costs is C = {c1 , ..., cK }, and the principal has a prior belief µ0 ∈ ∆(C) with full support.
At the end of each period, the principal observes the agent’s choice and updates her beliefs
about the agent’s cost. The players receive their payoffs and the game moves to the next
period.7
6 Our results remain qualitatively unchanged if the principal and agent have different discount factors.
7 As in Hart and Tirole (1988) and Schmidt (1993), the principal in our model can commit to paying the transfer within the current period, but cannot commit to a schedule of transfers in future periods.
These assumptions imply that the payoffs to the principal and an agent of cost type c = ck
at the end of period t are, respectively,
u(bt , Tt , at ) = at [bt − Tt ]
vk (bt , Tt , at ) = at [Tt − ck ]
We assume, without loss of generality, that the agent’s possible costs are ordered so that
0 < c1 < c2 < ... < cK. To avoid having to deal with knife-edge cases, we further assume that b ≠ ck for all b ∈ B and ck ∈ C. This means that it is socially optimal (efficient) for an
agent with cost ck to take the productive action in state b if and only if b − ck > 0. Let the
set of states for which it is socially optimal for an agent with cost ck to take the action be
Ek := {b ∈ B : b − ck > 0}.
We refer to Ek as the efficiency set for type ck . Note that by our assumptions on the ordering
of types, the efficiency sets are nested, i.e., Ek′ ⊆ Ek for all k′ ≥ k.
We assume that the evolution of states is governed by a Markov process with transition matrix [Qb,b′]b,b′∈B. We further assume that this process is relatively persistent. To formalize
this, first define the following function: for any state b ∈ B and subset of states B ⊆ B, let
"
X(b, B) := E
∞
X
#
δ t 1{bt ∈B} |b0 = b ,
t=1
where E[·|b0 = b] denotes the expectation operator with respect to the Markov process
governing state transitions, given that the period 0 state is b. Term X(b, B) is the expected
discounted amount of time that the process visits a state in B in the future, given that the
current state is b. The following assumption then captures the idea that discounting is not
too high, and that states are relatively persistent.
Assumption 1 (discounting/persistence) X(b, {b}) > 1 for all b ∈ B.
When there are no shocks (i.e., the state is fully persistent so that B is a singleton) the above
assumption holds when δ > 1/2. In general, for any ergodic process, the assumption holds
whenever δ is above a cutoff δ̄ > 1/2. Conversely, for any δ > 1/2, the assumption holds
whenever the process is sufficiently persistent; that is, whenever Prob(bt+1 = b|bt = b) is
sufficiently large for all b ∈ B.
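To make Assumption 1 concrete, note that the vector of values X(·, B) has the closed form δQ(I − δQ)⁻¹1B, where 1B is the indicator vector of B. The sketch below (our own illustration, with hypothetical values δ = 0.9 and a symmetric two-state chain in which each state persists with probability 0.8) checks the assumption numerically:

```python
import numpy as np

def X(Q, delta, B_idx):
    """X(b, B) = E[ sum_{t>=1} delta^t 1{b_t in B} | b_0 = b ],
    computed in closed form as delta*Q (I - delta*Q)^{-1} 1_B."""
    n = Q.shape[0]
    e = np.zeros(n)
    e[list(B_idx)] = 1.0
    return delta * Q @ np.linalg.solve(np.eye(n) - delta * Q, e)

# Hypothetical two-state chain: each state persists with probability 0.8
delta = 0.9
Q = np.array([[0.8, 0.2],
              [0.2, 0.8]])

x_self = X(Q, delta, {0})[0]   # X(b, {b}); same for either state by symmetry
print(x_self > 1)              # Assumption 1 holds for these parameters
```

In the no-shock case where B is a singleton, the formula reduces to X(b, {b}) = δ/(1 − δ), which exceeds 1 exactly when δ > 1/2, matching the remark above.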
2.2 Histories, Strategies and Equilibrium Concept
A history ht = ⟨(b0, T0, a0), ..., (bt−1, Tt−1, at−1)⟩ records the states, transfers and agent's action choices that have been realized from the beginning of the game until the start of period
t. For any two histories ht′ and ht with t′ ≥ t, we write ht′ ≻ ht if the first t period entries of ht′ are the same as the t period entries of ht. As usual, we let Ht denote the set of period t histories and H = ∪t≥0 Ht the set of all histories. A pure strategy for the principal is a
function τ : H × B → R+ , which maps histories and the current state to transfer offers T . A
pure strategy for the agent is a collection of mappings {αk}, k = 1, ..., K, with αk : H × B × R+ → {0, 1}, each of which maps the current history, current state and current transfer offer to the action choice a ∈ {0, 1} for a particular type ck. Given a pair of strategies σ = ⟨τ, {αk}⟩, the
continuation payoffs of the principal and an agent with cost ck after history ht and shock
realization bt are denoted U σ [ht , bt ] and Vkσ [ht , bt ] respectively. The principal’s belief about
the agent’s cost after history ht is denoted µ[ht ] and is given by a mapping µ : H → ∆(C).
A pure strategy perfect Bayesian equilibrium (PBE) is a pair of strategies σ and posterior
beliefs µ for the principal such that the strategies form a Bayesian Nash equilibrium in every
continuation game given the posterior beliefs, and beliefs are consistent with Bayes rule
whenever possible. Thus, pure strategy PBE can be denoted by the pair (σ, µ). We use
the term “equilibrium” to refer to pure strategy PBE that satisfy the two conditions below,
where we identify τ (ht , ·) and αk (ht , ·, ·) with the continuation strategies of the principal and
agent with cost ck , given the occurrence of history ht .
R1. (Markovian condition) For all histories ht and ht′, if µ[ht] = µ[ht′] then τ(ht, ·) = τ(ht′, ·) and αk(ht, ·, ·) = αk(ht′, ·, ·) for all k.
R2. (best for principal) There is no history ht, shock bt ∈ B and pure strategy PBE (σ′, µ′) that also satisfies the Markovian condition, R1, for which

Uσ′[ht, bt] > Uσ[ht, bt].
R1 says that the principal’s and agent’s strategies depend on history only through the principal’s current beliefs. R2 says that after every history, the equilibrium yields the highest
possible continuation payoff to the principal among all pure strategy PBE that satisfy R1.
We impose these restrictions to rule out indirect sources of commitment for the principal.
In particular, R1 rules out equilibria in which the threat of punishment enforces high continuation payoffs for the agent. R2 rules out Markovian equilibria in which off path beliefs are constructed in ways that make the principal give the agent extra rents beyond his informational rents.8 As we show in Lemma 0 below, the main equilibrium implications of these restrictions are that the highest cost type in the support of the principal's belief has a zero continuation payoff at any history, and that local incentive constraints always bind with equality.
3 Equilibrium Analysis

3.1 Incentive Constraints
Fix any equilibrium (σ, µ) = ((τ, {αk}), µ) and, given the equilibrium, let at,k be a random
variable indicating the action that agent type ck takes in period t. We will use C[ht ] to denote
the support of µ[ht ] and k[ht ] := max{k : ck ∈ C[ht ]} to denote the highest index of types
in the support of the principal’s beliefs. For any history ht , any pair ci , cj ∈ C[ht ], and any
b ∈ B, let
"∞
#
X 0
σ
Vi→j
[ht , b] := Eσj
δ t −t at0 ,j (Tt0 − ci )|ht , bt = b
t0 =t
be the expected discounted payoff that type ci would obtain after history ht when bt = b from
following the equilibrium strategy of type cj . Here, Eσj [·|ht , bt = b] denotes the expectation
over future events given type ci ’s deviation to type cj ’s strategy after history ht when bt = b
and when all other types play according to σ. For any ci ∈ C[ht], the continuation value of an agent with cost ci at history ht is simply Vσi[ht, b] = Vσi→i[ht, b]. Then, note that
"
σ
[ht , b] = Eσj
Vi→j
∞
X
#
0
δ t −t at0 ,j (Tt0 − cj )|ht , bt = b + Eσj
t0 =t
"
∞
X
#
0
δ t −t at0 ,j (cj − ci )|ht , bt = b
t0 =t
= Vjσ [ht , b] + (cj − ci )Aσj [ht , b]
(1)
where Vjσ [ht , b] is type cj ’s continuation value at history (ht , bt = b), and
Aσj[ht, b] := Eσj[ Σt′≥t δ^(t′−t) at′,j | ht, bt = b ]
is the expected discounted number of times that type cj takes the action after history ht ,
according to σ. Equation (1) says that type ci ’s payoff from deviating to cj ’s strategy can
8 Markovian equilibria in which the principal offers high transfers to the agent can be constructed by
specifying off path beliefs that “punish” an agent who accepts low transfers. Such beliefs incentivize the
agent to reject low transfers, and by doing this they also incentivize the principal to offer high transfers.
be decomposed into two parts: type cj ’s continuation value, and an informational rent (cj −
ci )Aσj [ht , bt ], which depends on how frequently cj is expected to take the action in the future.
Incentive compatibility requires that for all histories ht, all shocks bt ∈ B and every pair of types ci, cj ∈ C[ht], Vσi[ht, bt] ≥ Vσi→j[ht, bt], or, using (1),

Vσi[ht, bt] ≥ Vσj[ht, bt] + (cj − ci) Aσj[ht, bt]    ∀(ht, bt), ∀ci, cj ∈ C[ht]    (2)
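Since at′,j(Tt′ − ci) = at′,j(Tt′ − cj) + at′,j(cj − ci) term by term, the decomposition in (1) holds path by path, and hence in expectation. A quick numerical sanity check, using made-up actions and transfers that do not come from any equilibrium strategy:

```python
import random

random.seed(0)
delta, ci, cj = 0.9, 1.0, 2.0

# An arbitrary simulated path of (action, transfer) pairs -- purely
# illustrative, not derived from any equilibrium strategy.
path = [(random.choice([0, 1]), random.uniform(cj, cj + 1)) for _ in range(50)]

V_ij = sum(delta**t * a * (T - ci) for t, (a, T) in enumerate(path))
V_j  = sum(delta**t * a * (T - cj) for t, (a, T) in enumerate(path))
A_j  = sum(delta**t * a for t, (a, T) in enumerate(path))

# Decomposition (1): V_{i->j} = V_j + (c_j - c_i) * A_j
assert abs(V_ij - (V_j + (cj - ci) * A_j)) < 1e-9
```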
We then have the following fact, which follows from equilibrium conditions R1 and R2. Part
(i) says that the highest cost type in the support of the principal’s beliefs obtains a zero
continuation payoff, while part (ii) says that local incentive constraints bind with equality.
Lemma 0. Fix any equilibrium (σ, µ) and history ht, and if necessary renumber the types so that C[ht] = {c1, c2, ..., ck[ht]} with c1 < c2 < ... < ck[ht]. Then, for all ht ∈ H and all b ∈ B,

(i) Vσk[ht][ht, b] = 0.

(ii) If |C[ht]| ≥ 2, then Vσi[ht, b] = Vσi+1[ht, b] + (ci+1 − ci) Aσi+1[ht, b] for all ci, ci+1 ∈ C[ht].
Proof. See Appendix A.

3.2 Equilibrium Characterization
We now describe the (essentially) unique equilibrium of the game. Recall that k[ht ] denotes
the highest index in the support of the principal’s equilibrium beliefs at any history ht , and
that Ek is the set of states at which it is socially optimal for type ck ∈ C to take the action.
Theorem 1. The set of equilibria is non-empty, and all equilibria are payoff equivalent.
For every history ht and every bt ∈ B, in equilibrium we have:
(i) If bt ∈ Ek[ht ] , then the principal offers transfer Tt = ck[ht ] , and all types in C[ht ] take
the action.
(ii) If bt ∉ Ek[ht] and X(bt, Ek[ht]) > 1, then no type in C[ht] takes the action.

(iii) If bt ∉ Ek[ht] and X(bt, Ek[ht]) ≤ 1, then there is a threshold type ck∗ ∈ C[ht] such that
types in C − := {ck ∈ C[ht ] : ck < ck∗ } that are below the threshold take the action
while types in C + := {ck ∈ C[ht ] : ck ≥ ck∗ } that are above the threshold don’t take the
action. If C − is non-empty, the transfer that is accepted by types in C − is,
Tt = cj∗ + Vσk∗[ht, bt] + (ck∗ − cj∗) Aσk∗[ht, bt],    (∗)
where cj ∗ = max C − .
Proof. See Appendix B.

Part (i) characterizes a situation in which an efficient ratchet effect is in play, replicating
the main finding of the ratchet effect literature. In this case, it is socially optimal for all
agent types in the principal’s support to take the action (and they all do) but the principal
must compensate the agent as if he was the highest cost type. For example, Hart and Tirole
(1988) and Schmidt (1993) consider a special case of our model in which the benefit from
taking the action is constant over time (for all t, the state is bt = b for some constant b) and
strictly larger than the highest cost (b > cK ). Thus, part (i) of Theorem 1 applies. In each
period t, the principal offers a transfer Tt = cK that all agent types accept, and she never
learns anything about the agent’s type.
Part (ii) characterizes situations where an inefficient ratchet effect is in play: low cost
types pool with high cost types and don’t take the productive action even if the principal
would be willing to fully compensate their costs. The reason is that if bt ∉ Ek[ht], then the payoff that an agent with cost ck < ck[ht] obtains by pooling with type ck[ht] is (ck[ht] − ck)Aσk[ht][ht, bt] = (ck[ht] − ck)X(bt, Ek[ht]), which follows because an agent with cost ck[ht] takes the action at time t′ ≥ t if and only if bt′ ∈ Ek[ht]. Suppose there exists a nonempty subset C ⊂ C[ht]\{ck[ht]} of types that take the action after history ht at a state bt ∉ Ek[ht] with X(bt, Ek[ht]) > 1. Let c∗ = max C. By Lemma 0, an agent with cost c∗ obtains a total payoff
of Tt − c∗ + 0 by taking the action at time t.9 Since this payoff must be larger than what
this type would get by mimicking an agent with cost ck[ht ] , it must be that Tt ≥ c∗ + (ck[ht ] −
c∗ )X(bt , Ek[ht ] ) > ck[ht ] , where the second inequality follows because X(bt , Ek[ht ] ) > 1. But
this cannot occur in an equilibrium, since by Lemma 0(i) an agent with type ck[ht ] would
strictly prefer to accept offer Tt > ck[ht ] and take the action. Therefore, for all bt ∈
/ Ek[ht ]
with X(bt , Ek[ht ] ) > 1, no type takes the action.
Part (iii) characterizes situations where learning may take place. Specifically, learning
takes place when the set C − is nonempty. Although the theorem does not tell us under
which conditions C − is nonempty, we provide sufficient conditions for learning to occur in
Section 4 below. In Appendix B.3 we provide a characterization of the threshold ck∗ as the
unique solution to a finite maximization problem. Building on this, we also characterize the
principal’s equilibrium payoffs as the unique fixed point of a contraction mapping.
9 Indeed, under the proposed equilibrium, the principal infers that the agent's type is in C if she observes
that the agent took the action at time t. Hence, by Lemma 0, an agent with cost c∗ obtains a continuation
payoff of zero from time t + 1 onwards.
3.3 Examples
We now present two examples that illustrate some of the main equilibrium features of our
model. The first highlights the fact that the equilibrium outcome in our model can be inefficient.
This contrasts with the results in Hart and Tirole (1988) and Schmidt (1993), where the
equilibrium is always socially optimal.
Example 1 (inefficient ratchet effect) Suppose that there are two states, B = {bL , bH }, with
0 < bL < bH , and two types, C = {c1 , c2 } (recall our assumption that c2 > c1 ). Let the
efficiency sets be E1 = {bL , bH } and E2 = {bH }, and assume that X(bL , {bH }) > 1.
Consider a history ht such that C[ht ] = {c1 , c2 }. Theorem 1(i) implies that, at such a
history, both types take the action if bt = bH , receiving a transfer equal to c2 . On the other
hand, Theorem 1(ii) implies that neither type takes the action if bt = bL . Indeed, when
X(bL , {bH }) > 1 the benefit that a c1 -agent obtains by pooling with a c2 -agent is so large
that there does not exist any offer that a c1 -agent would accept but a c2 -agent would reject.
As a result, the principal never learns the agent’s type in equilibrium. Inefficiencies arise in
all periods t in which bt = bL : an agent with cost c1 never takes the action when the state is
bL , even though it is socially optimal for him to do so.
The next example illustrates a situation in which the principal is able to learn the agent’s
type, and the equilibrium outcome is efficient. This too contrasts with earlier work on the
ratchet effect in which there is no learning by the principal on the path of play.
Example 2 (efficiency and learning) The environment is the same as in Example 1, with the
only difference that X(bL , {bH }) < 1. Consider a history ht such that C[ht ] = {c1 , c2 }. As
in Example 1, both types take the action in period t if bt = bH , i.e., they take the action in
the high productivity state. The difference is that, if bt = bL , the principal offers a transfer
Tt that a c2 -agent rejects, but a c1 -agent accepts. To see why, note first that by Theorem
1, an agent of type c2 does not take the action at time t if bt = bL . Suppose that type
c1 does not take the action when bt = bL either. Since the equilibrium is Markovian, this
implies that the principal never learns the agent's type, and her payoff at time t when bt = bL is U = X(bL, {bH})[bH − c2]. If instead the principal were to make an offer that only
an agent with cost c1 accepts, then by Theorem 1(iii) and Lemma 0(i) the principal’s offer
would exactly compensate type c1 for revealing his type, i.e. T − c1 = X(bL , {bH })(c2 − c1 ).
Note that X(bL , {bH }) < 1 implies that T < c2 , so an agent with cost c2 rejects this offer.
Conditional on the agent’s cost being c1 , the principal’s payoff from making offer T when
bt = bL is
U [c1 ] = bL − T + X(bL , {bL })[bL − c1 ] + X(bL , {bH })[bH − c1 ]
= [1 + X(bL , {bL })][bL − c1 ] + X(bL , {bH })[bH − c2 ].
On the other hand, conditional on the agent’s type being c2 , the agent would reject the
transfer and the principal’s payoff would be U [c2 ] = X(bL , {bH })[bH − c2 ] = U . The principal
finds it optimal to make offer T if µ0 [c1 ]U [c1 ] + µ0 [c2 ]U [c2 ] > U , where µ0 [cj ] is the prior
probability that the agent’s cost is cj . Since U [c2 ] = U and since U [c1 ] > U , this inequality
holds. Therefore, in any Markovian PBE satisfying R2, type c1 takes the action in state bL
when C[ht ] = {c1 , c2 }.
Finally, note that the principal learns the agent's type at time t∗ = min{t : bt = bL}, and the equilibrium outcome is efficient from time t∗ + 1 onwards: type ci takes the action at time t′ > t∗ if and only if bt′ ∈ Ei. Moreover, Lemma 0(i) guarantees that the principal extracts all of the surplus from time t∗ + 1 onwards, paying the agent a transfer equal to his cost.
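The logic of Example 2 can be checked numerically. The sketch below uses hypothetical parameter values of our own choosing (bL = 3, bH = 10, c1 = 1, c2 = 4, δ = 0.6, and a symmetric chain in which each state persists with probability 0.9), which put the relationship in Example 2's regime X(bL, {bH}) < 1:

```python
import numpy as np

# Hypothetical parameter values for Example 2 (not from the paper):
bL, bH, c1, c2 = 3.0, 10.0, 1.0, 4.0
delta, stay = 0.6, 0.9           # each state persists with probability 0.9
Q = np.array([[stay, 1 - stay],  # state order: (bL, bH)
              [1 - stay, stay]])

def X(B_idx):
    # X(b, B) = E[ sum_{t>=1} delta^t 1{b_t in B} | b_0 = b ]
    e = np.zeros(2)
    e[list(B_idx)] = 1.0
    return delta * Q @ np.linalg.solve(np.eye(2) - delta * Q, e)

xH = X({1})[0]                   # X(bL, {bH})
xL = X({0})[0]                   # X(bL, {bL})
assert xH < 1                    # Example 2's regime: learning is possible

T = c1 + xH * (c2 - c1)          # offer that exactly compensates type c1
assert T < c2                    # so a c2-agent rejects it

U_pool   = xH * (bH - c2)                         # payoff with no learning
U_screen = (1 + xL) * (bL - c1) + xH * (bH - c2)  # payoff if the agent is c1
assert U_screen > U_pool   # screening is profitable for any full-support prior
```

Raising δ (e.g., to 0.9) pushes X(bL, {bH}) above 1 for this chain, returning to Example 1's no-learning regime.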
Example 2 has three notable features. First, despite her lack of commitment, the principal
is able to learn the agent’s type. Learning takes place the first time the relationship hits the
low productivity state. In the next subsection we present conditions under which this result
generalizes. Second, the principal’s value increases over time, since the surplus she extracts
from the agent increases as she learns the agent’s type. In Section 4 we characterize general
conditions under which full learning eventually takes place.
Third, the equilibrium exhibits a form of path-dependence: equilibrium play at time t
depends on the entire history of shocks up to period t. Before state bL is reached the principal
pays a transfer equal to the agent’s highest cost c2 , to get both types to take the action. After
state bL is visited, if the principal finds that the agent has low cost, then she pays a transfer
equal to the low type’s cost. Note, however, that the path dependence in this example is
short-lived: after state bL is visited for the first time, the principal learns the agent’s type and
the equilibrium outcome from that point on is independent of the prior history of shocks.
It turns out, however, that this is not a general property of our model. In Section 4 we
demonstrate how the equilibrium may also display long-run path dependence.
3.4 Learning in Bad Times
Under natural conditions on the process [Qb,b′] that governs the evolution of the stochastic shock, the principal learns about the agent's type only in “bad times,” i.e., only when the
benefit bt is small. Recall that µ[ht ] denotes the principal’s beliefs about the agent’s type
after history ht . Then, we have:
Proposition 1. (learning in bad times) Suppose that for all ck ∈ C and all b, b0 ∈ B\Ek such
that b < b0 , X(b, Ek ) ≤ X(b0 , Ek ). Then, in any equilibrium and for every history ht there
exists a state b[ht ] ∈ B such that µ[ht+1 ] ≠ µ[ht ] only if bt ≤ b[ht ].
Proof. By Theorem 1, after history ht the principal learns at state b only if X(b, Ek[ht ] ) ≤ 1.
By the assumption that, for all types ck , b < b0 implies X(b, Ek ) ≤ X(b0 , Ek ), there exists a
state b[ht ] ∈ B such that X(b, Ek[ht ] ) ≤ 1 if and only if b ≤ b[ht ].
Proposition 1 provides conditions under which the principal only updates her beliefs about the
agent’s type at states at which the benefits from taking the productive action are sufficiently
small. The reason is that under the premise of the proposition, the informational rent that
agents with type ci < ck[ht ] get from mimicking an agent with the highest cost ck[ht ] is
decreasing in the realization of bt . As a result, the principal is only able to learn about the
agent’s type when bt is small.
4 Long Run Properties
In this section, we study the long run properties of the equilibrium characterized in the
previous section. Before stating our results, we introduce some additional notation, and
make a preliminary observation.
An equilibrium outcome can be written as an infinite sequence h∞ = ⟨(bt , Tt , at )⟩t≥0 , or
equivalently as an infinite sequence of equilibrium histories h∞ = {ht }t≥0 such that ht+1 ≻ ht
for all t. Because we focus on pure strategy Markovian equilibria and because the sets
of types and states are finite, for any equilibrium outcome h∞ there exists a time t∗ [h∞ ]
such that µ[ht ] = µ[ht∗ [h∞ ] ] for all ht ≻ ht∗ [h∞ ] . That is, given an equilibrium outcome,
learning always stops after some time t∗ [h∞ ]. Therefore, given an agent's type c ∈ C and
an equilibrium outcome h∞ that can arise when the agent has type c, in every period after
t∗ [h∞ ] the principal's continuation payoff depends only on the realization of the current
period shock. Formally, given any equilibrium outcome h∞ = {ht }t≥0 that is possible when
the agent's true type is c = ck ∈ C, the principal's equilibrium continuation value conditional
on the agent's type being c = ck can be written as U σ (bt |ht∗ [h∞ ] , c = ck ) for all ht ≻ ht∗ [h∞ ] .
This says that after period t∗ [h∞ ] we need only keep track of the current shock and of what
beliefs were at the end of period t∗ [h∞ ], since beliefs are constant from this period onwards.
We use this fact in the next two subsections to study properties of the principal's long run
value.
4.1 The Principal's Long Run Value
We start by studying the extent to which the principal can learn the agent’s type, and how
the efficiency of the relationship might improve over time.
For all b ∈ B and all ck ∈ C, the principal's first best payoffs conditional on the current
shock being b and the agent's type being c = ck are given by

U ∗ (b|ck ) := E[ Σ_{t0 =t}^∞ δ^{t0 −t} (bt0 − ck ) 1{bt0 ∈ Ek } | bt = b ].
Thus, under the first best outcome the agent takes the action whenever it is socially optimal
and the principal always compensates the agent his exact cost. We then say that an equilibrium is long run first best if for all ck ∈ C and for every equilibrium outcome h∞ that arises
with positive probability when the agent’s type is c = ck ,
U σ (bt |ht∗ [h∞ ] , c = ck ) = U ∗ (bt |ck ) ∀t ≥ t∗ [h∞ ] and ∀bt ∈ B.
This says that no matter what the agent’s true type is, and no matter what the equilibrium
outcome is, once learning has stopped the principal achieves her first best payoff at every
subsequent realization of the shock. The following proposition reports a sufficient condition
for the principal to always eventually achieve her first best payoffs.
Proposition 2. (long run first best) Suppose that [Qb,b0 ] is ergodic and that for all ck ∈
C\{cK } there exists b ∈ Ek \Ek+1 such that X(b, Ek+1 ) < 1. Then, the equilibrium is long
run first best.
Proof. See Appendix C.

Note that an equilibrium is long run first best if and only if the principal always eventually
learns the agent’s type, i.e., if and only if for all c ∈ C and every equilibrium outcome h∞ that
is possible when the agent’s true cost is c, we have µ[ht∗ [h∞ ] ](c) = 1. The proof of Proposition
2 shows that, under the sufficient condition, with probability 1 the principal eventually learns
the agent’s type. The condition guarantees that, for any history ht such that |C[ht ]| ≥ 2,
there exists at least one state b ∈ B at which the principal finds it optimal to make an offer
that only a strict subset of types accept. Therefore, if the process [Qb,b0 ] is ergodic, then it is
certain that the principal will eventually learn the agent’s type, and from that point onwards
she will obtain her first best payoffs.
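The role of ergodicity here can be illustrated with a small simulation; the chain below and the identity of the "revealing" state are hypothetical, chosen only so that every state communicates. Under a strictly positive transition matrix, every simulated path reaches the revealing state in finite time, so learning completes almost surely:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ergodic chain on three states; suppose state 0 is the
# one "revealing" state at which the principal's offer separates types.
Q = np.array([[0.2, 0.4, 0.4],
              [0.1, 0.5, 0.4],
              [0.1, 0.4, 0.5]])
REVEALING = 0

def first_revealing_time(b0, horizon=1000):
    """Return the first period the chain hits the revealing state."""
    b = b0
    for t in range(horizon):
        if b == REVEALING:
            return t
        b = rng.choice(3, p=Q[b])
    return None  # not hit within the horizon

hit_times = [first_revealing_time(b0=2) for _ in range(200)]
print(all(t is not None for t in hit_times))
```

Each step reaches the revealing state with probability at least 0.1, so the probability of never separating types decays geometrically, mirroring the "with probability 1" statement in the proof.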
If an equilibrium is long run first best then it is also long run efficient, i.e. for all ck ∈ C
and for every equilibrium outcome h∞ that is possible when the agent’s true cost is ck ,
an agent with cost ck takes the action in each period t > t∗ [h∞ ] if and only if bt ∈ Ek .
However, the converse of this statement is not true. Because of this, it is easy to find weaker
sufficient conditions under which long run efficiency holds. One such condition is that [Qb,b0 ]
is ergodic and for all ck ∈ C\{cK } there exists b ∈ Ek \Ek̄ such that X(b, Ek̄ ) < 1, where
k̄ = min{j ≥ k : Ej ≠ Ek }. This condition guarantees that the principal's beliefs will
eventually place unit mass on the set of types that share the same efficiency set with the
agent’s true type. After this happens, even if the principal does not achieve her first best
payoff by further learning the agent’s type, the agent takes the action if and only if it is
socially optimal to do so.
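The first best benchmark defined above is straightforward to compute: since the flow payoff is bt − ck exactly when bt ∈ Ek, the value U∗(·|ck) solves a linear system in the transition matrix. A minimal sketch with hypothetical parameters:

```python
import numpy as np

# Hypothetical parameters: states b = (1, 2, 4), a type with cost
# c_k = 1.5, and efficiency set E_k = {b : b > c_k}.
delta = 0.9
b_vals = np.array([1.0, 2.0, 4.0])
c_k = 1.5
Q = np.array([[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
in_Ek = b_vals > c_k                 # action is efficient iff b is in E_k

# First best: flow (b - c_k) whenever b is in E_k, so
#   U*(b) = 1_{E_k}(b) (b - c_k) + delta * sum_{b'} Q[b, b'] U*(b').
flow = np.where(in_Ek, b_vals - c_k, 0.0)
U_star = np.linalg.solve(np.eye(3) - delta * Q, flow)
print(U_star)
```

With a stochastically monotone transition matrix like the one above, U∗ is increasing in the current shock, which is the comparison underlying the long run efficiency discussion.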
The next and final result of this section provides a partial counterpart to Proposition 2 by
presenting conditions under which the equilibrium is not long run first best, and conditions
under which it is not long run efficient.
Proposition 3. (no long run first best; no long run efficiency) Let ht be an equilibrium
history such that |C[ht ]| ≥ 2 and X(b, Ek[ht ] ) > 1 for all b ∈ B. Then C[ht0 ] = C[ht ] (and
thus |C[ht0 ]| ≥ 2) for all histories ht0 ≻ ht , so the equilibrium is not long run first best. If,
in addition, there exists ci ∈ C[ht ] such that Ei ≠ Ek[ht ] , then the equilibrium is not long run
efficient either.
Proof. Follows from Theorem 1.

4.2 Path Dependence
In the examples of Section 3.3 the principal always learns the same amount of information
about the agent’s type in the long run. As a result, even if equilibrium play may exhibit
path-dependence in the short-run, as in Example 2, the principal’s long run value from the
relationship, conditional on the agent’s type, is independent of the history of play.
In this section we show that this is not a general property of the equilibrium of our model.
We show here that the learning process, and hence the principal’s value from the relationship,
may exhibit path dependence even in the long run. We say that an equilibrium exhibits
long run path dependence if for some type of the agent c = ck ∈ C there are two equilibrium
outcomes, h∞ and h̃∞ , that arise with positive probability when the agent’s type is c = ck ,
such that

U σ (·|ht∗ [h∞ ] , c = ck ) ≠ U σ (·|ht∗ [h̃∞ ] , c = ck ).
As we have emphasized repeatedly, the equilibrium may exhibit long run path dependence
even when the process [Qb,b0 ] governing the evolution of shocks is ergodic. In fact, our next
example illustrates how easily long run path dependence can arise when the productivity
shock process [Qb,b0 ] is not ergodic.
Example 3 Let C = {c1 , c2 }, and B = {bL , bM , bH }, with bL < bM < bH . Suppose that
E1 = {bL , bM , bH } and E2 = {bM , bH }. Suppose the process [Qb,b0 ] satisfies: (i) X(bL , E2 ) < 1,
and (ii) QbH ,bH = 1 and Qb,b0 ∈ (0, 1) for all (b, b0 ) ≠ (bH , bH ) (recall that Qb,b0 denotes the
probability of transitioning to state b0 from state b). Thus, state bH is absorbing. By Theorem
1, if bt = bH , then from period t onwards the principal makes an offer equal to ck[ht ] and all
agent types in C[ht ] accept.
Consider a history ht with C[ht ] = {c1 , c2 }. By Theorem 1, if bt = bM the principal makes
an offer Tt = c2 that both types of agents accept. If bt = bL , by arguments similar to those in
Example 2, the principal finds it optimal to make an offer Tt = c1 +X(bL , E2 )(c2 −c1 ) ∈ (c1 , c2 )
that an agent with cost c1 accepts and that an agent with cost c2 rejects. Therefore, the
principal learns the agent’s type.
Suppose that the agent’s true type is c = c1 , and consider the following two histories, ht
and h̃t :
ht = ⟨(bt0 = bM , Tt0 = c2 , at0 = 1) for t0 = 1, . . . , t − 1⟩,
h̃t = ⟨(bt0 = bM , Tt0 = c2 , at0 = 1) for t0 = 1, . . . , t − 2; (bt−1 = bL , Tt−1 = T̃ , at−1 = 1)⟩.
Under history ht , bt0 = bM for all t0 ≤ t − 1, so the principal’s beliefs after ht is realized are
equal to her prior. Under history h̃t the principal learns that the agent’s type is c1 at time
t − 1. Suppose that bt = bH , so that bt0 = bH for all t0 ≥ t. Under history ht , the principal
doesn’t know the agent’s type at t, and therefore offers a transfer Tt0 = c2 for all t0 ≥ t, which
both agent types accept. Instead, under history h̃t the principal knows that the agent’s type
is c1 , and therefore offers transfer Tt0 = c1 for all t0 ≥ t, and the agent accepts it. Therefore,
when the agent’s type is c1 , the principal’s continuation payoff value at history (ht , bt = bH )
1
1
(bH − c2 ), while her payoff at history (h̃t , bt = bH ) is 1−δ
(bH − c1 ).
is 1−δ
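With hypothetical numbers in place of bH, c1, c2 and δ, the gap between the two continuation values is the capitalized informational rent (c2 − c1)/(1 − δ):

```python
# Hypothetical parameter values for Example 3.
delta, b_H, c1, c2 = 0.9, 5.0, 1.0, 2.0

# Once the absorbing state b_H is reached, the principal pays c2 forever
# after the uninformed history h_t, and c1 forever after the informed one.
v_uninformed = (b_H - c2) / (1 - delta)
v_informed = (b_H - c1) / (1 - delta)

gap = v_informed - v_uninformed
print(v_uninformed, v_informed, gap)
```

The gap does not vanish as t grows, which is exactly the sense in which the early realization of bL (or its absence) has a permanent payoff consequence here.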
We now establish that equilibrium may exhibit long-run path dependence even when [Qb,b0 ]
is ergodic. Let B = {b1, b2, b3, b4}, with b1 < b2 < b3 < b4 and C = {c1 , c2 , c3 }, and
assume that the efficiency sets are E1 = E2 = {b2, b3, b4} and E3 = {b4}. Thus, in the most
productive state, it is socially optimal for all types to take the productive action; in the next
two most productive states, it is socially optimal for only the two lowest cost types to take
the productive action; and in the least productive state it is not socially optimal for any type
to take the productive action. The following proposition shows that equilibrium may have
long-run path dependence even when every entry of the transition matrix [Qb,b0 ] is positive.
Proposition 4. (long run path dependence) Suppose that the agent's cost is c1 , all of the
entries of [Qb,b0 ] are positive, X(b3, {b4}) > 1 and X(b2, {b4}) < 1.
If |b − c1 | is small enough, |b − c2 | is large enough, Qb,b1 is small enough for all b ≠ b1
and Qb,b2 is small enough for all b ≠ b2, then the equilibrium has long-run path dependence.
In particular, there exist two equilibrium outcomes, h∞ and h̃∞ , such that

C[ht∗ [h∞ ] ] = {c1 } ≠ {c1 , c2 } = C[ht∗ [h̃∞ ] ],    (†)

and thus U σ (·|ht∗ [h∞ ] , c = c1 ) ≠ U σ (·|h̃t∗ [h̃∞ ] , c = c1 ).
Proof. See Appendix D.

Proposition 4 shows that, even when the process [Qb,b0 ] is ergodic, the information that
the principal learns about the agent’s type in the long run might be influenced by the history
of productivity shocks early on in the relationship. In particular, when the agent’s true cost
is c1 , under some sequences of productivity shocks the principal eventually ends up learning
the agent’s exact cost and achieving her first best payoff. Under other sequences of shocks
the principal only learns that the agent’s cost is in the set {c1 , c2 }, after which learning stops.
In this case, the principal never achieves her first best payoff, and in the long run she must
pay a transfer equal to c2 whenever she incentivizes the agent to take the action; that is, she
may be giving up quite substantial informational rents even in the long run. Therefore, the
early shocks have a lasting effect on the principal’s equilibrium value.
The intuition behind Proposition 4 is as follows. The information rents that a c1 -agent
gets by mimicking a c2 -agent depend on how often the c2 -agent is expected to take the
productive action in the future (see equation (1)). In turn, how often a c2 -agent takes the
productive action depends on the principal’s beliefs. Indeed, if the principal assigns positive
probability to the agent’s type being c3 , under the assumptions in Proposition 4 a c2 -agent
will not take the productive action at periods such that bt = b2. In contrast, if the principal
learns along the path of play that the agent’s type is not c3 , from that time onwards a c2 -agent
will take the action whenever the state is in E2 = {b2, b3, b4}.
As a consequence of this, at histories (ht , bt ) with C[ht ] = {c1 , c2 , c3 } and bt = b1, it is
profitable for the principal to make an offer that only a c1 -agent accepts (i.e., an offer that
induces a c1 -agent to reveal his type). In contrast, at histories (ht , bt ) with C[ht ] = {c1 , c2 },
inducing a c1 -agent to reveal his private information is too expensive, and the principal is
unable to fully learn the agent’s type.
At the same time, in the proof of Proposition 4 we show that, at histories (ht , bt ) with
C[ht ] = {c1 , c2 , c3 } and bt = b2, the principal finds it optimal to make an offer that only types
in {c1 , c2 } accept. This, together with the arguments above, explains why the equilibrium
displays long run path-dependence. Suppose the agent’s type is c1 . Then, if state b = b1
is visited before state b = b2, the principal will learn the agent’s type and from that point
onwards she will extract all the surplus. In contrast, if state b = b2 is visited before state
b = b1, the principal learns that the agent’s type is in {c1 , c2 }, and then learning stops.
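This race between the two learning states can be made concrete in a seeded simulation; the everywhere-positive transition matrix below is hypothetical and is not calibrated to the conditions of Proposition 4, but it shows that under ergodicity both orderings — b1 before b2, and b2 before b1 — arise with positive probability:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ergodic chain on B = {b1, b2, b3, b4} (indices 0..3),
# with every entry positive, as in Proposition 4.
Q = np.array([[0.10, 0.10, 0.40, 0.40],
              [0.10, 0.10, 0.40, 0.40],
              [0.05, 0.05, 0.45, 0.45],
              [0.05, 0.05, 0.45, 0.45]])

def first_learning_state(b0=3, horizon=10_000):
    """Return 0 if b1 is hit before b2 (full learning: C = {c1}),
    and 1 if b2 is hit first (partial learning: C = {c1, c2})."""
    b = b0
    for _ in range(horizon):
        b = rng.choice(4, p=Q[b])
        if b in (0, 1):
            return b
    raise RuntimeError("neither learning state reached")

outcomes = [first_learning_state() for _ in range(500)]
share_full = outcomes.count(0) / len(outcomes)
print(share_full)
```

Since both events occur with positive frequency, an outside observer would see ex ante identical relationships settling on different long run belief sets, and hence different long run values.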
Path dependence has been highlighted as an important phenomenon in organizational
economics, especially in understanding why seemingly identical firms have persistent differences in performance – see Gibbons (2010).10 With informational asymmetries, as in our
model, shocks to the environment provide variation in learning about firm productivity over
time. The path of learning then has implications for long run performance. In particular,
two firms with similar structures may embark upon two very different learning paths, even
though their internal institutions for providing incentives might appear to be identical.11
5 Conclusion
Productivity shocks are a natural feature of most economic environments, and the incentives
that economic agents face in completely stationary environments can be very different from
the incentives they face in environments subject to these shocks. Our results show that
this is true of the traditional ratchet effect literature. The message of this literature is
that outside institutions that provide contract enforcement can help improve the principal’s
welfare. Our results show that even without such institutions, a strategic principal can use
productivity shocks to her advantage to gradually learn the agent's private information and
improve her own welfare. In addition, a relationship that was initially highly inefficient may
become efficient over time. On the other hand, whether or not the relationship ever becomes
efficient, and how profitable it becomes for the principal, may be path dependent.

10 Luria (1996), for example, shows that in the American metal manufacturing industry in the 1990s, some
firms were more than three times more productive than others (by labor productivity) even though they sold
essentially the same product to the same customers. Similarly, Hallward-Driemeier et al. (2001) show that
the top quartile of Indonesian electronics manufacturers were more than eight times as productive as the
bottom quartile, even though all firms were supplying similar products on a competitive global market.
11 Other dynamic principal-agent models giving rise to path dependence include Chassang (2010), Li and
Matouschek (2013) and Halac and Prat (2015).
Appendix
A. Proof of Lemma 0
Proof of part (i). The proof is by strong induction on the cardinality of the support of
the principal’s beliefs, C[ht ]. Fix an equilibrium (σ, µ), and note that the claim is true for
all histories ht such that |C[ht ]| = 1. Suppose next that the claim is true for all histories h̃t̃
with |C[h̃t̃ ]| ≤ n − 1, and consider a history ht with |C[ht ]| = n.
Suppose by contradiction that Vk[ht ]σ [ht , bt ] > 0. Then, there must exist a state bt0 and
history ht0 ≻ ht that arises on the path of play at which type ck[ht ] receives a transfer
Tt0 > ck[ht ] that the type accepts. Note first that, since type ck[ht ] accepts offer Tt0 , all types
in the support of C[ht0 ] must also accept it. If this were not true, then there would be a
highest type ck ∈ C[ht0 ] that rejects. By the induction hypothesis, the equilibrium payoff
that this type obtains at history ht0 is Vkσ [ht0 , bt0 ] = 0, since this type will be the highest cost
in the support of the principal's beliefs following a rejection. But this cannot be, since
type ck can get a payoff of at least Tt0 − ck > 0 by accepting the principal's offer at time t0 .
We now construct an alternative strategy profile σ̃ that is otherwise identical to σ except
that in state bt0 and history ht0 the agent is offered a transfer T̃ ∈ (ck[ht ] , Tt0 ). Specify the
principal’s beliefs at history (ht0 , bt0 ) as follows: regardless of the agent’s action, the principal’s
beliefs at the end of period t0 are the same as her beliefs at the beginning of the period. At
all other histories, the principal’s beliefs are the same as in the original equilibrium. Note
that, given these beliefs, at history ht0 all agent types in C[ht0 ] find it strictly optimal to
accept the principal’s offer T̃ and take the action. Thus, the principal’s payoff at history ht0
is larger than her payoff under the original equilibrium, which contradicts R2.
Proof of part (ii). The proof is by induction on the cardinality of C[ht ]. Consider first a
history ht such that |C[ht ]| = 2. Without loss of generality, let C[ht ] = {c1 , c2 }, with c1 < c2 .
There are two cases to consider: (i) for all histories ht0 ≻ ht , µ[ht0 ] = µ[ht ], i.e., there is no
more learning; and (ii) there exists a history ht0 ≻ ht such that µ[ht0 ] ≠ µ[ht ].
Consider first case (i). Since µ[ht0 ] = µ[ht ] for all ht0 ≻ ht , both types of agents take the
productive action at the same times. This implies that Aσ2 [ht , bt ] = Aσ1 [ht , bt ]. Moreover, by
part (i) of the Lemma, the transfer that the principal pays when the productive action is
taken is equal to c2 . Hence,

V1σ [ht , bt ] = E[ Σ_{t0 =t}^∞ δ^{t0 −t} (Tt0 − c1 )at0 ,1 | ht ] = V2σ [ht , bt ] + Aσ2 [ht , bt ](c2 − c1 ),

where we have used the facts that V2σ [ht , bt ] = 0 and Tt0 = c2 for all t0 such that at0 ,1 = at0 ,2 = 1.
Consider next case (ii), and let t̄ = min{t0 ≥ t : at0 ,1 ≠ at0 ,2 }. Hence, at time t̄ only one
type of agent in {c1 , c2 } takes the action. Note that an agent of type c1 must take the action.
To see why, suppose that it is only the agent of type c2 that takes the action. By part (i) of
the Lemma, the transfer Tt̄ that the principal pays the agent must be equal to c2 . The payoff
that an agent with type c1 gets by accepting the offer Tt̄ is bounded below by c2 − c1 > 0. In
contrast, by part (i) of the Lemma, an agent of type c1 would obtain a continuation payoff
of zero by rejecting this offer. Hence, it must be that only an agent with type c1 takes the
action at time t̄.
Note that the total payoff that an agent with type c1 gets from time t̄ onwards must
satisfy V1σ [ht̄ , bt̄ ] = Tt̄ − c1 ≥ V2σ [ht̄ , bt̄ ] + Aσ2 [ht̄ , bt̄ ](c2 − c1 ), where the inequality follows since
an agent of type c1 can get a payoff equal to the right-hand side by mimicking an agent with
type c2 . Since we focus on stationary PBE that are optimal for the principal, the transfer
that the principal offers the agent at time t̄ must be Tt̄ = c1 + V2σ [ht̄ , bt̄ ] + Aσ2 [ht̄ , bt̄ ](c2 − c1 ),
and so

V1σ [ht̄ , bt̄ ] = V2σ [ht̄ , bt̄ ] + Aσ2 [ht̄ , bt̄ ](c2 − c1 ).    (3)
Note next that, for all t0 ∈ {t, . . . , t̄ − 1}, at0 ,1 = at0 ,2 , i.e., both types of agents take the same
action, and that Tt0 = c2 whenever at0 ,1 = at0 ,2 = 1, i.e., the principal pays a transfer equal
to c2 whenever the high cost agent takes the action. Therefore,

V1σ [ht , bt ] = E[ ( Σ_{t0 =t}^{t̄−1} δ^{t0 −t} (Tt0 − c1 )at0 ,1 ) + δ^{t̄−t} V1σ [ht̄ , bt̄ ] | ht , bt ]
= E[ ( Σ_{t0 =t}^{t̄−1} δ^{t0 −t} (c2 − c1 )at0 ,2 ) + δ^{t̄−t} Aσ2 [ht̄ , bt̄ ](c2 − c1 ) | ht , bt ]
= V2σ [ht , bt ] + Aσ2 [ht , bt ](c2 − c1 ),    (4)

where we have used (3), and the fact that V2σ [ht , bt ] = 0. Therefore, the result of the lemma
holds for all ht such that |C[ht ]| = 2.
Suppose next that the result holds for all h̃t̃ such that |C[h̃t̃ ]| ≤ n − 1, and consider a
history ht such that |C[ht ]| = n. Consider two "adjacent" types ci , ci+1 ∈ C[ht ]. We have two
possible cases: (i) with probability 1, types ci and ci+1 take the same action at all histories
ht0 ≻ ht ; (ii) there exists a history ht0 ≻ ht at which types ci and ci+1 take different actions.
Under case (i),

Viσ [ht , bt ] = E[ Σ_{t0 =t}^∞ δ^{t0 −t} (Tt0 − ci )at0 ,i | ht , bt ]
= E[ Σ_{t0 =t}^∞ δ^{t0 −t} (Tt0 − ci+1 )at0 ,i+1 | ht , bt ] + E[ Σ_{t0 =t}^∞ δ^{t0 −t} (ci+1 − ci )at0 ,i+1 | ht , bt ]
= Vi+1σ [ht , bt ] + Aσi+1 [ht , bt ](ci+1 − ci ).    (5)
For case (ii), let t̄ = min{t0 ≥ t : at0 ,i+1 ≠ at0 ,i } be the first time after t at which types
ci and ci+1 take different actions. Let ck ∈ C[ht ] be the highest cost type that takes the
action at time t̄. The transfer Tt̄ that the principal offers at time t̄ must satisfy Vkσ [ht̄ , bt̄ ] =
Tt̄ − ck + 0 = Vk+1σ [ht̄ , bt̄ ] + Aσk+1 [ht̄ , bt̄ ](ck+1 − ck ).12 Note further that Vk+1σ [ht̄ , bt̄ ] ≥ Tt̄ − ck+1 ,
since an agent with cost ck+1 can guarantee Tt̄ − ck+1 by taking the action at time t̄ and
then not taking the action in all future periods. This, combined with the previous equality,
implies that Aσk+1 [ht̄ , bt̄ ] ≤ 1.
We now use this to show that all types below ck also take the action at time t̄. This
implies that all agents in the support of C[ht ] with cost weakly lower than ck take the action
at t̄, and all agents with cost weakly greater than ck+1 do not take the action. Note that
this implies that ci = ck (since types ci and ci+1 take different actions at time t̄). Suppose
for the sake of contradiction that this is not true, and let cj be the highest cost type below
ck that does not take the action. The payoff that this agent gets from not taking the
action is Vjσ [ht̄ , bt̄ ] = Vk+1σ [ht̄ , bt̄ ] + Aσk+1 [ht̄ , bt̄ ](ck+1 − cj ), which follows since at time t̄ types
cj and ck+1 do not take the action and since, by the induction hypothesis, from time t̄ + 1
onwards the payoff that an agent with cost cj gets is equal to what this agent would get by
mimicking an agent with cost ck+1 . On the other hand, the payoff that agent cj obtains by
taking the action and mimicking type ck is

Vkσ [ht̄ , bt̄ ] + Aσk [ht̄ , bt̄ ](ck − cj ) = Tt̄ − ck + Aσk [ht̄ , bt̄ ](ck − cj )
= Vk+1σ [ht̄ , bt̄ ] + Aσk+1 [ht̄ , bt̄ ](ck+1 − ck ) + Aσk [ht̄ , bt̄ ](ck − cj )
> Vk+1σ [ht̄ , bt̄ ] + Aσk+1 [ht̄ , bt̄ ](ck+1 − cj ),    (6)

which follows since Aσk+1 [ht̄ , bt̄ ] ≤ 1 < Aσk [ht̄ , bt̄ ]. Hence, type cj strictly prefers to take the
action, a contradiction. Therefore, all types below ck take the action at time t̄, and so ci = ck .
12 The first equality follows since, after time t̄, type ck is the highest type in the support of the principal's
beliefs if the agent takes action a = 1 at time t̄.
By the arguments above, the payoff that an agent of type ci = ck obtains at time t̄ is

Viσ [ht̄ , bt̄ ] = Tt̄ − ci + 0 = Vi+1σ [ht̄ , bt̄ ] + Aσi+1 [ht̄ , bt̄ ](ci+1 − ci ),    (7)

since the transfer that the principal offers at time t̄ is Tt̄ = ci + Vi+1σ [ht̄ , bt̄ ] + Aσi+1 [ht̄ , bt̄ ](ci+1 − ci ).
Moreover,

Viσ [ht , bt ] = E[ ( Σ_{t0 =t}^{t̄−1} δ^{t0 −t} (Tt0 − ci )at0 ,i ) + δ^{t̄−t} Viσ [ht̄ , bt̄ ] | ht , bt ]
= E[ ( Σ_{t0 =t}^{t̄−1} δ^{t0 −t} ((Tt0 − ci+1 )at0 ,i+1 + (ci+1 − ci )at0 ,i+1 ) ) | ht , bt ]
+ E[ δ^{t̄−t} ( Vi+1σ [ht̄ , bt̄ ] + Aσi+1 [ht̄ , bt̄ ](ci+1 − ci ) ) | ht , bt ]
= Vi+1σ [ht , bt ] + Aσi+1 [ht , bt ](ci+1 − ci ),    (8)

where the second equality follows since at0 ,i = at0 ,i+1 for all t0 ∈ [t, t̄ − 1]. Hence, the result
also holds for histories ht with |C[ht ]| = n.
B. Proof of Theorem 1
The proof proceeds in three steps. First we analyze the case where bt ∈ Ek[ht ] , establishing
part (i) of the theorem. Then we analyze the case where bt ∉ Ek[ht ] , establishing (ii) and
(iii). Finally, we show that equilibrium exists and has unique payoffs. In doing so, we also
characterize the threshold type ck∗ defined in part (iii).
B.1. Proof of part (i) (the case of bt ∈ Ek[ht ] )
We prove part (i) of the theorem by strong induction on the size of the set C[ht ]. If C[ht ] is
a singleton {ck }, the statement of part (i) holds: by R1-R2, the principal offers the agent a
transfer Tt0 = ck at all times t0 ≥ t such that bt0 ∈ Ek and the agent accepts, and she offers
some transfer Tt0 < ck at all times t0 ≥ t such that bt0 ∉ Ek , and the agent rejects.
Suppose next that the claim is true for all histories ht0 such that |C[ht0 ]| ≤ n − 1, and let
ht be a history such that |C[ht ]| = n, and let bt ∈ Ek[ht ] . In a PBE that satisfies R1-R2, it
cannot be that the principal makes an offer that no type in C[ht ] accepts. Indeed, suppose
that no type in C[ht ] takes the action. Consider an alternative Markovian PBE which is
identical to the original PBE, except that when the principal’s beliefs are µ[ht ] and the shock
is bt , the principal makes an offer T = ck[ht ] , and all agent types in C[ht ] accept any offer
weakly larger than T . The principal’s beliefs after this period are equal to µ[ht ] regardless
of the agent’s action. Note that it is optimal to all types of agents to accept this offer, and
it is optimal for the principal to make this offer. Moreover, since bt ∈ Ek[ht ] , the payoff that
the principal obtains from this PBE is strictly larger than her payoff under the original PBE.
But this cannot be, since the original PBE satisfies R1-R2. Hence, if bt ∈ Ek[ht ] , at least a
subset of types in C[ht ] take the action at time t.
Suppose for the sake of contradiction that the principal makes an offer Tt that only a
subset C ⊊ C[ht ] of types accept, and let cj = max C. By Lemma 0, the payoff of type cj
from taking the productive action is Tt − cj + 0. Since an agent with cost cj can mimic the
strategy of type ck[ht ] , incentive compatibility implies that
Tt − cj ≥ Vk[ht ]σ [ht , bt ] + (ck[ht ] − cj )Aσk[ht ] [ht , bt ]
≥ (ck[ht ] − cj )X(bt , Ek[ht ] ) > ck[ht ] − cj .    (9)
The first inequality follows from equation (2) in the main text. The second inequality follows
from Lemma 0 and the fact that Aσk[ht ] [ht , bt ] ≥ X(bt , Ek[ht ] ). To see why, note that ck[ht ] ∉ C,13
so at most n − 1 types accept the principal's offer. Thus, the inductive hypothesis implies
that if the agent rejects the offer, then in all periods after t the principal will get all the
remaining types to take the action whenever bt0 ∈ Ek[ht ] . The last inequality in equation (9)
follows from the fact that X(bt , Ek[ht ] ) ≥ X(bt , {bt }) > 1, where the first inequality is due to the
fact that bt ∈ Ek[ht ] and the second is by Assumption 1.
On the other hand, because Lemma 0 implies that an agent with type ck[ht ] has a continuation value of zero, the transfer Tt that the principal offers must be weakly smaller than
ck[ht ] ; otherwise, if Tt > ck[ht ] , an agent with type ck[ht ] could guarantee himself a strictly
positive payoff by accepting the offer. But this contradicts (9). Hence, it must be that all
agents in C[ht ] take action a = 1 at history (ht , bt ) if bt ∈ Ek[ht ] .
13 Suppose, for the sake of contradiction, that type ck[ht ] ∈ C. Since by Lemma 0 this type's continuation
payoff is zero for all histories, it must be that Tt ≥ ck[ht ] . Let ci = max C[ht ]\C. Since ci rejects the offer
today and becomes the highest cost in the support of the principal's beliefs tomorrow, Lemma 0 implies that
Viσ [ht ] = 0. But this cannot be, since this agent can guarantee a payoff of at least Tt − ci ≥ ck[ht ] − ci > 0
by accepting the offer. Contradiction.
B.2. Proof of parts (ii) & (iii) (the case of bt ∉ Ek[ht ] )
In both parts (ii) and (iii) of the theorem, the highest cost type in the principal's support
ck[ht ] does not take the productive action when bt ∉ Ek[ht ] . We prove this in Lemma 1 below,
and use the lemma to prove parts (ii) and (iii) separately.
Lemma 1. Fix any equilibrium (σ, µ) and history ht . If bt ∉ Ek[ht ] , then an agent with cost
ck[ht ] does not take the productive action.
Proof. Suppose for the sake of contradiction that an agent with type ck[ht ] does take the action
when bt ∉ Ek[ht ] . Since, by Lemma 0, this type's payoff must equal zero at all histories, it
must be that the offer that is accepted is Tt = ck[ht ] . We now show that if the principal makes
such an offer, then all agent types will accept the offer and take the productive action. To
see this, suppose some types reject the offer. Let cj be the highest cost type that rejects the
offer. By Lemma 0, type cj earns a continuation payoff of zero. So, the payoff that type cj
gets by rejecting the offer is zero. However, this type can guarantee itself a payoff of at least
Tt − cj = ck[ht ] − cj > 0 by accepting the current offer. Hence, it cannot be that some agents
reject the offer Tt = ck[ht ] when an agent with type ck[ht ] accepts the offer.
It then follows that if type ck[ht ] accepts the offer, then the principal will not learn anything
about the agent’s type. Since bt ∈
/ Ek[ht ] , her flow payoff from making the offer is bt −ck[ht ] < 0.
If instead the principal offers a transfer equal to 0 that all agents reject, she would obtain
a current payoff of zero and have the same beliefs as in the case where everyone accepts.
Therefore, by conditions R1-R2, she would have the same continuation payoff as well. Since
bt − ck[ht ] < 0, the principal obtains a higher payoff from following the second strategy.
Proof of part (ii). Fix a history ht and let bt ∈ B\Ek[ht ] be such that X(bt , Ek[ht ] ) > 1.
By Lemma 1, type ck[ht ] doesn't take the productive action at time t if bt ∉ Ek[ht ] . Suppose,
for the sake of contradiction, that there is a nonempty set of types C ⊊ C[ht ] that do take
the productive action. Let cj = max C. By Lemma 0 type cj obtains a continuation payoff
of zero starting in period t + 1. Hence, type cj receives a payoff Tt − cj + 0 from taking the
productive action in period t. Since this payoff must be weakly larger than the payoff the
agent would obtain by not taking the action and mimicking the strategy of agent ck[ht ] in all
future periods, it follows that

Tt − cj ≥ Vk[ht ]σ [ht , bt ] + (ck[ht ] − cj )Aσk[ht ] [ht , bt ]
≥ (ck[ht ] − cj )X(bt , Ek[ht ] ) > ck[ht ] − cj ,    (10)

where the first line follows from incentive compatibility, the second line follows from the fact
that at0 ,k[ht ] = 1 for all times t0 ≥ t such that bt0 ∈ Ek[ht ] (by the result of part (i) proven
above), and the final inequality follows since X(bt , Ek[ht ] ) > 1 by assumption. The inequalities in
(10) imply that Tt > ck[ht ] . But then by Lemma 0, it would be strictly optimal for type ck[ht ]
to deviate by accepting the transfer and taking the productive action. So it must be that all
agent types in C[ht ] take action at = 0.
Proof of part (iii). We start by showing that the set of types that accept the offer has
the form C − = {ck ∈ C[ht ] : ck < ck∗ } for some ck∗ ∈ C[ht ]. The result is clearly true if no
agent type takes the action, in which case set ck∗ = min C[ht ]; or if only an agent with type
min C[ht ] takes the action, in which case set ck∗ equal to the second lowest cost in C[ht ].
Therefore, suppose that an agent with type larger than min C[ht ] takes the action, and
let cj ∗ ∈ C[ht ] be the highest cost agent that takes the action. Since bt ∉ Ek[ht ] , by Lemma
1 it must be that cj ∗ < ck[ht ] . By Lemma 0, type cj ∗ ’s payoff is Tt − cj ∗ , since from date
t + 1 onwards this type will be the largest cost in the support of the principal’s beliefs if the
principal observes that the agent took the action at time t. Let ck∗ = min{ck ∈ C[ht ] : ck >
cj ∗ }, and note that (2) implies that
Tt − cj ∗ ≥ Vkσ∗ [ht , bt ] + (ck∗ − cj ∗ )Aσk∗ [ht , bt ]
(11)
Furthermore, type ck∗ can guarantee himself a payoff of Tt − ck∗ by taking the action once
and never taking the action again. Therefore, it must be that
Vkσ∗ [ht , bt ] ≥ Tt − ck∗ ≥ cj ∗ − ck∗ + Vkσ∗ [ht , bt ] + (ck∗ − cj ∗ )Aσk∗ [ht , bt ]
=⇒ 1 ≥ Aσk∗ [ht , bt ]
(12)
where the second inequality in the first line follows from (11).
We now show that all types ci ∈ C[ht] with ci < cj∗ also take the action at time t. Suppose for the sake of contradiction that this is not true, and let ci∗ ∈ C[ht] be the highest cost type lower than cj∗ that does not take the action. The payoff that this type would get by taking the action at time t and then mimicking type cj∗ is

    V^σ_{i∗→j∗}[ht, bt] = Tt − ci∗ + (cj∗ − ci∗)A^σ_{j∗}[ht, bt]
                        = Tt − ci∗ + (cj∗ − ci∗)X(bt, E_{j∗})
                        ≥ (cj∗ − ci∗)[1 + X(bt, E_{j∗})] + V^σ_{k∗}[ht, bt] + (ck∗ − cj∗)A^σ_{k∗}[ht, bt],     (13)

where the first line follows from the fact that type cj∗ is the highest type in the support of the principal's beliefs in period t + 1, so he receives a payoff of 0 from t + 1 onwards; the second follows from part (i) and Lemma 1, which imply that type cj∗ takes the action in periods t′ ≥ t + 1 only when bt′ ∈ E_{j∗}; and the third follows by applying the inequality in (11).
On the other hand, by Lemma 0(ii), the payoff that type ci∗ gets by rejecting the offer at time t is equal to the payoff she would get by mimicking type ck∗, since the principal will believe for sure that the agent does not have type in {c_{i∗+1}, ..., c_{j∗}} ⊆ C[ht] after observing a rejection. That is, type ci∗'s payoff is

    V^σ_{i∗}[ht, bt] = V^σ_{i∗→k∗}[ht, bt] = V^σ_{k∗}[ht, bt] + (ck∗ − ci∗)A^σ_{k∗}[ht, bt].     (14)

From equations (13) and (14), it follows that

    V^σ_{i∗}[ht, bt] − V^σ_{i∗→j∗}[ht, bt] ≤ (cj∗ − ci∗)(A^σ_{k∗}[ht, bt] − [1 + X(bt, E_{j∗})]) < 0,

where the strict inequality follows after using (12). Hence, type ci∗ strictly prefers to mimic type cj∗ and take the action at time t than to not take it, a contradiction. Hence, all types ci ∈ C[ht] with ci ≤ cj∗ take the action at t, and so the set of types taking the action takes the form C− = {cj ∈ C[ht] : cj < ck∗}.
Finally, it is clear that in equilibrium, the transfer that the principal will pay at time t if all agents with type ci ∈ C− take the action is given by (∗). The payoff that an agent with type cj∗ = max C− gets by accepting the offer is Tt − cj∗, while her payoff from rejecting the offer and mimicking type ck∗ = min C[ht]\C− is V^σ_{k∗}[ht, bt] + (ck∗ − cj∗)A^σ_{k∗}[ht, bt]. Hence, the lowest offer that a cj∗-agent accepts is Tt = cj∗ + V^σ_{k∗}[ht, bt] + (ck∗ − cj∗)A^σ_{k∗}[ht, bt].
B.3. Proof of Existence and Uniqueness
For each history ht and each cj ∈ C[ht], let C^+_j[ht] = {ci ∈ C[ht] : ci ≥ cj}. For each history ht and state b ∈ B, let (ht, bt) denote the concatenation of history ht = ⟨bt′, Tt′, at′⟩_{t′=0}^{t−1} together with state realization bt. Let

    A^σ_{j+}[ht, bt] := E^σ_j[ ∑_{t′=t+1}^{∞} δ^{t′−t} a_{t′,j} | (ht, bt) and C[ht+1] = C^+_j[ht] ].

That is, A^σ_{j+}[ht, bt] is the expected discounted fraction of time at which an agent with type cj takes the action after history (ht, bt) if the beliefs of the principal at time t + 1 have support C^+_j[ht]. We then have:
Lemma 2. Fix any equilibrium (σ, µ) and history-state pair (ht, bt). Then, there exists an offer T ≥ 0 such that types ci ∈ C[ht] with ci < cj accept at time t and types ci ∈ C[ht] with ci ≥ cj reject if and only if A^σ_{j+}[ht, bt] ≤ 1.
Proof. First, suppose such an offer T exists, and let ck be the highest type in C[ht] that accepts T. Let cj be the lowest type in C[ht] that rejects the offer. By Lemma 0, the expected discounted payoff that an agent with type ck gets from accepting the offer is T − ck + 0. The payoff that type ck obtains by rejecting the offer and mimicking type cj from time t + 1 onwards is V^σ_j[ht, bt] + (cj − ck)A^σ_{j+}[ht, bt], since, if the offer is rejected at t, the principal's beliefs in period t + 1 are that the agent's type lies in the set C^+_j[ht], and thus A^σ_j[ht, bt] = A^σ_{j+}[ht, bt]. Therefore, the offer T that the principal makes must satisfy

    T − ck ≥ V^σ_j[ht, bt] + (cj − ck)A^σ_{j+}[ht, bt].     (15)

Note that an agent with type cj can guarantee herself a payoff of T − cj by taking the action in period t and then never taking it again; therefore, incentive compatibility implies

    V^σ_j[ht, bt] ≥ T − cj = V^σ_j[ht, bt] + (cj − ck)(A^σ_{j+}[ht, bt] − 1)
    ⟹ 1 ≥ A^σ_{j+}[ht, bt],

where the equality in the first line follows after substituting T from (15).
Suppose next that A^σ_{j+}[ht, bt] ≤ 1, and suppose the principal makes offer T = ck + V^σ_j[ht, bt] + (cj − ck)A^σ_{j+}[ht, bt], which only agents with type cℓ ∈ C[ht], cℓ ≤ ck, are supposed to accept. The payoff that an agent with cost ck obtains by accepting the offer is T − ck, which is exactly what he would obtain by rejecting the offer and mimicking type cj. Hence, type ck has an incentive to accept such an offer. Similarly, one can check that all types cℓ ∈ C[ht], cℓ < ck, also have an incentive to accept the offer. If the agent accepts such an offer and takes the action in period t, the principal will believe that the agent's type lies in {cℓ ∈ C[ht] : cℓ ≤ ck}. Note that, in all periods t′ > t, the principal will never offer Tt′ > ck.
Consider the incentives of an agent with type cj at time t. The payoff that this agent gets from accepting the offer is T − cj, since from t + 1 onwards the agent will never accept any equilibrium offer. This is because all subsequent offers will be lower than ck < cj. On the other hand, the agent's payoff from rejecting the offer is V^σ_j[ht, bt] ≥ T − cj = ck − cj + V^σ_j[ht, bt] + (cj − ck)A^σ_{j+}[ht, bt] = V^σ_j[ht, bt] + (cj − ck)(A^σ_{j+}[ht, bt] − 1), where the inequality follows since A^σ_{j+}[ht, bt] ≤ 1.
The proof of existence and uniqueness relies on Lemma 2 and uses strong induction on the cardinality of C[ht]. Clearly, an equilibrium exists and equilibrium payoffs are unique at histories ht such that C[ht] is a singleton {ck}: in this case, the principal offers the agent a transfer Tt′ = ck at all times t′ ≥ t such that bt′ ∈ Ek (which the agent accepts) and offers some transfer Tt′ < ck at all times t′ ≥ t such that bt′ ∉ Ek.
Suppose next that an equilibrium exists and equilibrium payoffs are unique for all histories h̃t̃ such that |C[h̃t̃]| ≤ n − 1, and let ht be a history such that |C[ht]| = n. Fix a candidate for equilibrium (σ, µ), and let U^σ[bt, µ[ht]] denote the principal's equilibrium payoffs when her beliefs are µ[ht] and the shock is bt. We now show that, when the principal's beliefs are µ[ht], equilibrium payoffs are also unique.
If bt ∈ E_{k[ht]}, then by part (i) it must be that all agent types in C[ht] take the action in period t and Tt = c_{k[ht]}; hence, at such states

    U^σ[bt, µ[ht]] = bt − c_{k[ht]} + δE[U^σ[bt+1, µ[ht]] | bt].

If bt ∉ E_{k[ht]} and X(bt, E_{k[ht]}) > 1, then by part (ii), no agent type in C[ht] takes the action (in this case, the principal makes an offer T small enough that all agents reject); hence, at such states

    U^σ[bt, µ[ht]] = δE[U^σ[bt+1, µ[ht]] | bt].

In either case, the principal doesn't learn anything about the agent's type, since all types of agents in C[ht] take the same action, so her beliefs don't change.
Finally, consider states bt ∉ E_{k[ht]} with X(bt, E_{k[ht]}) ≤ 1. Two things can happen at such a state: (i) no type of agent in C[ht] takes the action, or (ii) a strict subset of types in C[ht] don't take the action and the rest do.14 In case (i), the beliefs of the principal at time t + 1 would be the same as the beliefs of the principal at time t, and her payoffs are

    U^σ[bt, µ[ht]] = δE[U^σ[bt+1, µ[ht]] | bt].

In case (ii), the set of agent types not taking the action has the form C^+_j[ht] = {ci ∈ C[ht] : ci ≥ cj} for some cj ∈ C[ht]. So in case (ii) the support of the beliefs of the principal at time t + 1 would be C^+_j[ht] if the agent doesn't take the action, and C[ht]\C^+_j[ht] if he does.
By Lemma 2, there exists an offer that types C^+_j[ht] reject and types C[ht]\C^+_j[ht] accept if and only if A^σ_{j+}[ht, bt] ≤ 1. Note that, by the induction hypothesis, A^σ_{j+}[ht, bt] is uniquely determined.15 Let C∗[ht, bt] = {ci ∈ C[ht] : A^σ_{i+}[ht, bt] ≤ 1}. Without loss of generality, renumber the types in C[ht] so that C[ht] = {c1, ..., c_{k[ht]}}, with c1 < ... < c_{k[ht]}. For each ci ∈ C∗[ht, bt], let

    T∗_{t,i−1} = c_{i−1} + V^σ_i[ht, bt] + A^σ_{i+}[ht, bt](ci − c_{i−1})

be the offer that leaves an agent with type c_{i−1} indifferent between accepting and rejecting when all types in C^+_i[ht] reject the offer and all types in C[ht]\C^+_i[ht] accept. Note that T∗_{t,i−1} is the best offer for a principal who wants to get all agents with types in C[ht]\C^+_i[ht] to take the action and all agents with types in C^+_i[ht] to not take the action.

14 By Lemma 1, in equilibrium an agent with cost c_{k[ht]} doesn't take the action.
Let T = {T∗_{t,i−1} : ci ∈ C∗[ht, bt]}. At states bt ∉ E_{k[ht]} with X(bt, E_{k[ht]}) ≤ 1, the principal must choose optimally whether to make an offer in T or to make a low offer (for example, Tt = 0) that all agents reject: an offer Tt = T∗_{t,i−1} would be accepted by types in C[ht]\C^+_i[ht] and rejected by types in C^+_i[ht], while an offer Tt = 0 will be rejected by everyone. For each T∗_{t,i−1} ∈ T, let p(T∗_{t,i−1}) be the probability that the offer is accepted; i.e., the probability that the agent has cost weakly smaller than c_{i−1}. Let U^σ[bt, T∗_{t,i−1}, at = 1] and U^σ[bt, T∗_{t,i−1}, at = 0] denote the principal's expected continuation payoffs if the offer T∗_{t,i−1} ∈ T is accepted and rejected, respectively, at state bt. Note that these payoffs are uniquely pinned down by the induction hypothesis: after observing whether the agent accepted or rejected the offer, the cardinality of the support of the principal's beliefs will be weakly lower than n − 1. Let

    U∗(b) = max_{T ∈ T} { p(T)(b − T + U^σ[b, T, 1]) + (1 − p(T))U^σ[b, T, 0] }
and let T (b) be a maximizer of this expression.
Partition the states B as follows:

    B1 = E_{k[ht]}
    B2 = {b ∈ B\B1 : X(b, E_{k[ht]}) > 1}
    B3 = {b ∈ B\B1 : X(b, E_{k[ht]}) ≤ 1}

15 A^σ_{j+}[ht, bt] is determined in equilibrium when the principal has beliefs with support C^+_j[ht], and the induction hypothesis states that the continuation equilibrium is unique when the cardinality of the support of the principal's beliefs is less than n.
By our arguments above, the principal's payoff U^σ[b, µ[ht]] satisfies:

    U^σ[b, µ[ht]] = { b − c_{k[ht]} + δE[U^σ[bt+1, µ[ht]] | bt = b]     if b ∈ B1
                    { δE[U^σ[bt+1, µ[ht]] | bt = b]                     if b ∈ B2
                    { max{U∗(b), δE[U^σ[bt+1, µ[ht]] | bt = b]}         if b ∈ B3     (16)

Let F be the set of functions from B to R and let Φ : F → F be the operator such that, for every f ∈ F,

    Φ(f)(b) = { b − c_{k[ht]} + δE[f(bt+1) | bt = b]     if b ∈ B1
              { δE[f(bt+1) | bt = b]                     if b ∈ B2
              { max{U∗(b), δE[f(bt+1) | bt = b]}         if b ∈ B3

One can check that Φ is a contraction of modulus δ < 1, and therefore has a unique fixed point. Moreover, by (16), the principal's equilibrium payoffs U^σ[b, µ[ht]] are a fixed point of Φ. These two observations together imply that the principal's equilibrium payoffs U^σ[b, µ[ht]] are unique. Finally, the equilibrium strategies at (ht, bt) can be immediately derived from (16).
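Because Φ is a contraction of modulus δ, its fixed point can be computed by straightforward iteration. A minimal numerical sketch, in which the transition matrix, the partition B1/B2/B3, the cost c_{k[ht]}, the shock values, and the stand-in for U∗(b) are all illustrative assumptions rather than objects from the paper:

```python
import numpy as np

# Hypothetical primitives on a 4-state shock space (not from the paper).
Q = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.1, 0.5, 0.2, 0.2],
    [0.2, 0.2, 0.4, 0.2],
    [0.1, 0.2, 0.3, 0.4],
])
delta, c_k = 0.9, 1.0
B1, B2, B3 = [3], [2], [0, 1]              # hypothetical partition of B
U_star = np.array([0.5, 0.8, 0.0, 0.0])    # stand-in for U*(b), used on B3 only
b_vals = np.array([0.2, 0.6, 1.5, 2.0])    # shock values b

def Phi(f):
    """One application of the operator Phi defined in the text."""
    cont = delta * Q @ f                        # delta * E[f(b_{t+1}) | b_t = b]
    out = np.empty_like(f)
    out[B1] = b_vals[B1] - c_k + cont[B1]       # pooling states
    out[B2] = cont[B2]                          # no-action states
    out[B3] = np.maximum(U_star[B3], cont[B3])  # option to separate
    return out

# Phi is a contraction of modulus delta (rows of Q sum to 1 and max is
# non-expansive), so iteration converges to its unique fixed point.
f = np.zeros(4)
for _ in range(2000):
    f = Phi(f)
```

The fixed point plays the role of U^σ[b, µ[ht]] in the uniqueness argument; the convergence rate is δ per iteration.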
C. Proof of Proposition 2
Fix a history ht such that |C[ht]| ≥ 2 and, without loss of generality, renumber the types so that C[ht] = {c1, ..., c_{k[ht]}} with c1 < ... < c_{k[ht]}. We start by showing that for every such history, there exists a shock realization b ∈ B with the property that at state (µ[ht], b) the principal makes an offer that a strict subset of the types in C[ht] accepts.
Suppose for the sake of contradiction that this is not true. Note that this implies that µ[ht′] = µ[ht] for every ht′ ≻ ht. By Theorem 1, this further implies that after history ht, the agent only takes the action when the shock is in E_{k[ht]}, and receives a transfer equal to c_{k[ht]}. Therefore, the principal's payoff after history (ht, b) is

    U^σ[ht, b] = E[ ∑_{t′=t}^{∞} δ^{t′−t} (bt′ − c_{k[ht]}) 1{bt′ ∈ E_{k[ht]}} | bt = b ].
Let b ∈ E_{k[ht]−1} be such that X(b, E_{k[ht]}) < 1. By Assumption 2 such a shock b exists. Suppose that the shock at time t after history ht is b, and let ε > 0 be small enough such that

    T = c_{k[ht]−1} + X(b, E_{k[ht]})(c_{k[ht]} − c_{k[ht]−1}) + ε < c_{k[ht]}.     (17)

Note that at state (µ[ht], b), an offer equal to T is accepted by all types below c_{k[ht]}, and is rejected by type c_{k[ht]}.16 The principal's payoff from making an offer T conditional on the agent's type being c_{k[ht]} is U^σ[ht, b]. On the other hand, when the agent's type is lower than c_{k[ht]}, the principal obtains b − T at period t if she offers transfer T, and learns that the agent's type is not c_{k[ht]}. From period t + 1 onwards, the principal's payoff is bounded below by what she could obtain if at all periods t′ > t she offers Tt′ = c_{k[ht]−1} whenever bt′ ∈ E_{k[ht]−1} (an offer which is accepted by all types), and offers Tt′ = 0 otherwise (which is rejected by all types). The payoff that the principal obtains from following this strategy when the agent's cost is lower than c_{k[ht]} is
"
U =b−T +E
∞
X
#
0
δ t −t (bt0 − ck[ht ]−1 )1{bt0 ∈Ek[h ]−1 } |bt = b
t
t0 =t+1
"
= b − ck[ht ]−1 − + E
∞
X
#
δ
t0 −t
(bt0 − ck[ht ] )1{bt0 ∈Ek[h ] } |bt = b
t
t0 =t+1
"
+E
∞
X
#
δ
t0 −t
(bt0 − ck[ht ]−1 )1{bt0 ∈Ek[h ]−1 \Ek[h ] } |bt = b
t
t0 =t+1
t
= U σ [ht , b] + b − ck[ht ]−1 − " ∞
#
X 0
+E
δ t −t (bt0 − ck[ht ]−1 )1{bt0 ∈Ek[h ]−1 \Ek[h ] } |bt = b ,
t
t0 =t+1
t
where the first line follows from substituting (17). Then, from the third line we see that
if > 0 is small enough then U strictly larger than U σ [ht , b]. But this cannot be, since
the proposed strategy profile was an equilibrium. Therefore, for all histories ht such that
|C[ht ]| ≥ 2, there exists b ∈ B with the property that at state (µ[ht ], b) the principal makes
an offer that a strict subset of the types in C[ht ] accept.
We now use this result to establish the proposition. Note first that this result, together with the assumption that [Q_{b,b′}] is ergodic, implies that there is long run learning in equilibrium. As long as C[ht] has two or more elements, there will be some shock realization at which the principal makes an offer that only a strict subset of types in C[ht] accepts. Since there are finitely many types, the principal will end up learning the agent's type.
Finally, suppose that the history ht is such that C[ht] = {ci}. Then, from time t onwards the principal's payoff is U^σ[ht, b] = E[ ∑_{t′=t}^{∞} δ^{t′−t} (bt′ − ci) 1{bt′ ∈ Ei} | bt = b ] = U∗_i(b | c = ci), which is the first best payoff. This and the previous arguments imply that the equilibrium is long run first best, has long run learning, and is long run efficient.

16 Indeed, an agent with cost ci < c_{k[ht]} obtains a payoff that is strictly larger by accepting offer T than what he obtains by rejecting and continuing to play the equilibrium.
D. Proof of Proposition 4
To prove the proposition we show that, under the assumptions of Proposition 4, the unique equilibrium has the following properties.
(i) If ht is such that C[ht] = {c1, c2}, then µ[ht′] = µ[ht] for all ht′ ≻ ht (i.e., there is no more learning by the principal from time t onwards).
(ii) If ht is such that C[ht ] = {c2 , c3 }, the principal learns the agent’s type at time t if and
only if bt = b2.
(iii) For histories ht such that C[ht ] = {c1 , c2 , c3 }: if bt = b1, type c1 takes action a = 1
while types c2 and c3 take action a = 0; if bt = b2, types c1 and c2 take action a = 1
and type c3 takes action a = 0; if bt = b3, all agent types take action a = 0; and if
bt = b4, all agent types take action a = 1.
Before proving properties (i)-(iii), we note that they imply the desired result. Indeed,
when the agent’s type is c1 , properties (i)-(iii) imply that the principal eventually learns the
agent’s type if and only if t(b1) := min{t ≥ 0 : bt = b1} < t(b2) := min{t ≥ 0 : bt = b2} (i.e.,
if state b1 is visited before state b2).
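The condition t(b1) < t(b2) is a first-passage race between two states of the shock process, and its probability can be read off a small linear system. A sketch under a hypothetical transition matrix (illustrative numbers only, not from the paper):

```python
import numpy as np

# P[state b1 is visited before state b2 | b_0 = b] for a hypothetical
# 4-state transition matrix Q over shocks b1..b4.
Q = np.array([
    [0.4, 0.2, 0.2, 0.2],
    [0.2, 0.4, 0.2, 0.2],
    [0.2, 0.2, 0.4, 0.2],
    [0.2, 0.2, 0.2, 0.4],
])
i_b1, i_b2 = 0, 1      # indices of shocks b1 and b2
free = [2, 3]          # states where the race is still undecided

# Boundary conditions h(b1) = 1, h(b2) = 0, and h = Q h on the remaining
# states: a small linear system for the undecided states.
A = np.eye(len(free)) - Q[np.ix_(free, free)]
h_free = np.linalg.solve(A, Q[free, i_b1])
h = np.zeros(4)
h[i_b1] = 1.0
h[free] = h_free
```

With this symmetric Q the race is even from b3 and b4, so the long-run outcome is genuinely path dependent: either history occurs with positive probability.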
Proof of Property (i). Note first that, by Theorem 1, after such a history the principal makes a pooling offer that both types accept if bt ∈ E2 = {b2, b3, b4}. To establish the result, we show that if bt = b1, types c1 and c2 take action a = 0 after history ht. If the principal makes a separating offer that only a c1-agent accepts, she pays a transfer Tt = c1 + X(b1, E2)(c2 − c1). The principal's payoff from making such an offer, conditional on the agent being type c1, is

    Ũ^sc[c1] = b1 − Tt + E[ ∑_{t′>t} δ^{t′−t} 1{bt′ ∈ E1} (bt′ − c1) | bt = b1 ]
             = b1 − c1 + ∑_{b∈{b2,b3,b4}} X(b1, {b})[b − c2].

Her payoff from making that offer conditional on the agent's type being c2 is Ũ^sc[c2] = ∑_{b∈{b2,b3,b4}} X(b1, {b})[b − c2]. If the principal doesn't make a separating offer when bt = b1, she never learns the agent's true type and gets a payoff Ũ^nsc = ∑_{b∈{b2,b3,b4}} X(b1, {b})[b − c2]. Since b1 − c1 < 0 by assumption, Ũ^nsc > µ[ht][c1]Ũ^sc[c1] + µ[ht][c2]Ũ^sc[c2], and therefore the principal chooses not to make a separating offer.
Proof of Property (ii). Theorem 1 implies that, after such a history, the principal makes a pooling offer that both types accept if bt ∈ E3 = {b4}. Theorem 1 also implies that, if bt = b3, then after such a history the principal makes an offer that both types reject (since X(b3, {b4}) > 1 by assumption). So it remains to show that, after history ht, the principal makes an offer that a c2-agent accepts and a c3-agent rejects if bt = b2, and that the principal makes an offer that both types reject if bt = b1.
Suppose bt = b2. Let U[ci] be the principal's value at history (ht, bt = b2) conditional on the agent's type being ci ∈ {c2, c3}, and let Vi be the value of an agent of type ci at history (ht, bt = b2). Note that U[c2] + V2 ≤ b2 − c2 + ∑_{b∈{b2,b3,b4}} X(b2, {b})[b − c2], since the right-hand side of this inequality corresponds to the efficient total payoff when the agent is of type c2 (i.e., the agent taking the action if and only if the state is in E2). Note also that incentive compatibility implies V2 ≥ X(b2, {b4})(c3 − c2), since a c2-agent can mimic a c3-agent forever and obtain X(b2, {b4})(c3 − c2). It thus follows that U[c2] ≤ b2 − c2 + X(b2, {b4})[b4 − c3] + ∑_{b∈{b2,b3}} X(b2, {b})[b − c2].
If when bt = b2 the principal makes an offer that only a c2-agent accepts, the offer must satisfy Tt = c2 + X(b2, {b4})(c3 − c2) < c3. The principal's payoff from making such an offer when the agent's type is c2 is

    b2 − Tt + ∑_{b∈{b2,b3,b4}} X(b2, {b})[b − c2]
    = b2 − c2 + X(b2, {b4})[b4 − c3] + ∑_{b∈{b2,b3}} X(b2, {b})[b − c2],     (18)

which, from the arguments in the previous paragraph, is the highest payoff that the principal can ever get from a c2-agent after history (ht, bt = b2). Hence, it is optimal for the principal to make such a separating offer.17

17 Indeed, the principal's payoff from making an offer equal to Tt when the agent's type is c3 is X(b2, {b4})[b4 − c3], which is also the most that she can extract from an agent of type c3.

Suppose next that bt = b1. If the principal makes an offer that a c2-agent accepts and a c3-agent rejects, she pays a transfer Tt = c2 + X(b1, E3)(c3 − c2). Thus, the principal's payoff
from making such an offer, conditional on the agent being type c2, is

    Ũ^sc[c2] = b1 − Tt + ∑_{b∈{b2,b3,b4}} X(b1, {b})[b − c2]
             = b1 − c2 + X(b1, {b4})[b4 − c3] + ∑_{b∈{b2,b3}} X(b1, {b})[b − c2].
If the principal makes an offer that both types reject when bt = b1, then by the arguments above she learns the agent's type the first time at which shock b2 is reached. Let ť be the random variable that indicates the next date at which shock b2 is realized. Then, conditional on the agent's type being c2, the principal's payoff from making an offer that both types reject when bt = b1 is

    Ũ^nsc[c2] = E[ ∑_{t′=t+1}^{ť−1} δ^{t′−t} 1{bt′ = b4}(b4 − c3) | bt = b1 ]
              + E[ δ^{ť−t} (b2 − c2 + X(b2, {b4})[b4 − c3] + ∑_{b∈{b2,b3}} X(b2, {b})[b − c2]) | bt = b1 ]
              = X(b1, {b4})[b4 − c3] + X(b1, {b2})[b2 − c2] + E[δ^{ť−t} | bt = b1] X(b2, {b3})[b3 − c2],

where we used (18), which is the payoff that the principal obtains from an agent with type c2 when the state is b2 and the support of her beliefs is {c2, c3}. Then, we have

    Ũ^nsc[c2] − Ũ^sc[c2] = −[b1 − c2] − (X(b1, {b3}) − E[δ^{ť−t} | bt = b1] X(b2, {b3}))[b3 − c2],

which is positive when |b1 − c2| is large enough, since b1 − c2 < 0 by assumption. Since the principal's payoff conditional on the agent's type being c3 is the same regardless of whether she makes a separating offer or not when bt = b1 (i.e., in either case the principal earns X(b1, {b4})(b4 − c3)), the principal chooses not to make an offer that c2 accepts and c3 rejects when bt = b1.
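The comparison above repeatedly uses the discounted first-passage factor E[δ^{ť−t} | bt = b1]. On a finite chain this, too, solves a linear system; the sketch below uses a hypothetical Q and δ, not values from the paper.

```python
import numpy as np

# E[delta^(t_check - t) | b_t = b], where t_check is the next date (> t)
# at which the shock equals b2; hypothetical transition matrix over b1..b4.
Q = np.array([
    [0.4, 0.2, 0.2, 0.2],
    [0.2, 0.4, 0.2, 0.2],
    [0.2, 0.2, 0.4, 0.2],
    [0.2, 0.2, 0.2, 0.4],
])
delta, i_b2 = 0.9, 1    # discount factor; index of shock b2

# g solves g(b) = delta * sum_{b'} Q[b,b'] * (1{b' = b2} + 1{b' != b2} g(b')):
# one step of discounting, stopping the first time b2 is reached.
mask = np.ones(4)
mask[i_b2] = 0.0                            # absorb at b2
A = np.eye(4) - delta * (Q * mask[None, :])
g = np.linalg.solve(A, delta * Q[:, i_b2])
```

The entries lie strictly between 0 and 1; when transitions into b2 are rare (the Q_{b,b2} < ε case invoked in the text), g is small.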
Proof of Property (iii). Suppose C[ht] = {c1, c2, c3}. Theorem 1 implies that all agent types take action a = 1 if bt = b4, and all agent types take action a = 0 if bt = b3 (this last claim follows since X(b3, {b4}) > 1).
Suppose next that C[ht] = {c1, c2, c3} and bt = b2. We first claim that if the principal makes an offer that only a subset of types accept at state b2, then this offer must be such that types in {c1, c2} take action a = 1 and type c3 takes action a = 0. To see this, suppose that she instead makes an offer that only an agent with type c1 accepts, and that agents with types in {c2, c3} reject. The offer that she makes in this case satisfies Tt − c1 = V^σ_2[ht, bt] + A^σ_2[ht, bt](c2 − c1). By property (ii) above, under this proposed equilibrium a c2-agent will from period t + 1 onwards take the action at all times t′ > t such that bt′ = b2.18 Therefore, A^σ_2[ht, bt] ≥ X(b2, {b2}) > 1, where the last inequality follows from Assumption 1. The payoff that an agent of type c2 obtains by accepting offer Tt at time t is bounded below by Tt − c2 = c1 − c2 + V^σ_2[ht, bt] + A^σ_2[ht, bt](c2 − c1) > V^σ_2[ht, bt], where the inequality follows since A^σ_2[ht, bt] > 1. But this cannot be, since V^σ_2[ht, bt] is the equilibrium payoff of an agent with type c2. Therefore, either the principal makes an offer that only types in {c1, c2} accept in state b2, or she makes an offer that all types reject.
We now show that the principal makes an offer that types in {c1, c2} accept and type c3 rejects when bt = b2 and C[ht] = {c1, c2, c3}. If she makes an offer that agents with cost in {c1, c2} accept and a c3-agent rejects, then she pays a transfer Tt = c2 + X(b2, {b4})(c3 − c2). Note then that, by property (i) above, when the agent's cost is in {c1, c2}, the principal stops learning: for all times t′ > t the principal makes an offer Tt′ = c2 that both types accept when bt′ ∈ E2, and she makes a low offer Tt′ = 0 that both types reject when bt′ ∉ E2. Therefore, conditional on the agent's type being either c1 or c2, the principal's payoff from making at time t an offer Tt that agents with cost in {c1, c2} accept and a c3-agent rejects is

    Û^sc[{c1, c2}] = b2 − Tt + ∑_{b∈{b2,b3,b4}} X(b2, {b})[b − c2]
                   = b2 − c2 + X(b2, {b4})[b4 − c3] + ∑_{b∈{b2,b3}} X(b2, {b})[b − c2].

On the other hand, if she does not make an offer that a subset of types accept when bt = b2, then the principal's payoff conditional on the agent being of type ci ∈ {c1, c2} is bounded above by

    Û^nsc[ci] = E[ ∑_{t′=t}^{t̂−1} δ^{t′−t} 1{bt′ = b4}(b4 − c3) + δ^{t̂−t} ∑_{b∈Ei} X(b1, {b})(b − ci) | bt = b2 ],

where t̂ denotes the next period at which state b1 is realized. Note that there exists ε > 0 small enough such that, if Q_{b,b1} < ε for all b ≠ b1, then Û^sc[{c1, c2}] > Û^nsc[ci] for i = 1, 2. Finally, note that the payoff that the principal obtains from an agent of type c3 at history ht when bt = b2 is X(b2, {b4})(b4 − c3), regardless of whether the principal makes a separating offer or not. Therefore, if Q_{b,b1} < ε for all b ≠ b1, when C[ht] = {c1, c2, c3} and bt = b2 the principal makes an offer Tt that only types in {c1, c2} accept.

18 Indeed, under the proposed equilibrium, if the offer is rejected the principal learns that the agent's type is in {c2, c3}. By property (ii), if the agent's type is c2, the principal will learn the agent's type the first time the shock is b2 (because at that time an agent with type c2 will take the action, while an agent with type c3 won't), and from that point onwards the agent will take the action when the shock is in E2 = {b2, b3, b4}.
Finally, we show that when C[ht] = {c1, c2, c3} and bt = b1, the principal makes an offer that only type c1 accepts. Let ť be the random variable that indicates the next date at which state b2 is realized. If the principal makes an offer Tt that only a c1-agent accepts, this offer satisfies

    Tt − c1 = V^σ_2[ht, b1] + A^σ_2[ht, b1](c2 − c1)
            = X(b1, {b4})(c3 − c1) + (X(b1, {b2}) + E[δ^{ť−t} | bt = b1] X(b2, {b3}))(c2 − c1),     (19)

where the equality follows since V^σ_2[ht, b1] = X(b1, {b4})(c3 − c2) and since, by property (ii), when the support of the principal's beliefs is {c2, c3} and the agent's type is c2, the principal learns the agent's type at time ť.19 Therefore, the principal's equilibrium payoff from making an offer that only an agent with cost c1 accepts at state b1 is

    Ǔ^sc[c1] = b1 − Tt + ∑_{b∈{b2,b3,b4}} X(b1, {b})[b − c1]
             = b1 − c1 + X(b1, {b4})[b4 − c3] + X(b1, {b3})[b3 − c1]
               + X(b1, {b2})[b2 − c2] − E[δ^{ť−t} | bt = b1] X(b2, {b3})(c2 − c1),

where the second line follows from substituting the transfer in (19). On the other hand, the principal's payoff from making such an offer at state b1, conditional on the agent's type being
19 Indeed, the fact that the principal learns the agent's type at time ť implies that

    A^σ_2[ht, b1] = E[ ∑_{t′=t}^{ť−1} δ^{t′−t} 1{bt′ = b4} + δ^{ť−t} ∑_{t′=ť}^{∞} δ^{t′−ť} 1{bt′ ∈ E2} | bt = b1 ]
                  = X(b1, {b4}) + X(b1, {b2}) + E[δ^{ť−t} | bt = b1] X(b2, {b3}).

Since X(b1, {b4}) < 1, there exists ε > 0 such that A^σ_2[ht, b1] < 1 whenever Q_{b,b2} < ε for all b ≠ b2.

c2, is
    Ǔ^sc[c2] = E[ ∑_{τ=t}^{ť−1} δ^{τ−t} 1{bτ = b4}(b4 − c3) | bt = b1 ]
             + E[ δ^{ť−t} (b2 − c2 − X(b2, {b4})(c3 − c2)) + ∑_{τ=ť+1}^{∞} δ^{τ−t} 1{bτ ∈ E2}(bτ − c2) | bt = b1 ]
             = X(b1, {b4})(b4 − c3) + X(b1, {b2})(b2 − c2) + E[δ^{ť−t} | bt = b1] X(b2, {b3})(b3 − c2),

where we used the fact that, when the support of her beliefs is {c2, c3}, the principal makes an offer that only a c2-agent accepts when the state is b2 (the offer that she makes at that point is T = c2 + X(b2, {b4})(c3 − c2)).
Alternatively, suppose the principal makes an offer that both c1 and c2 accept but c3 rejects. Then she pays a transfer Tt = c2 + X(b1, {b4})(c3 − c2); thus, her payoff from learning that the agent's type is in {c1, c2} in state b1 is

    Ū^sc[{c1, c2}] = b1 − Tt + ∑_{b∈{b2,b3,b4}} X(b1, {b})(b − c2)
                   = b1 − c2 + X(b1, {b4})[b4 − c3] + X(b1, {b2})[b2 − c2] + X(b1, {b3})[b3 − c2],

where we used the fact that the principal never learns anything more about the agent's type when the support of her beliefs is {c1, c2} (see property (i) above). Note that there exist η > 0 and K > 0 such that, if Q_{b,b2} < η for all b ≠ b2 and if b1 − c2 < −K, then

    Ǔ^sc[c1] − Ū^sc[{c1, c2}] = (1 + X(b1, {b3}) − E[δ^{ť−t} | bt = b1] X(b2, {b3}))(c2 − c1) > 0 and
    Ǔ^sc[c2] − Ū^sc[{c1, c2}] = (E[δ^{ť−t} | bt = b1] X(b2, {b3}) − X(b1, {b3}))(b3 − c2) − (b1 − c2) > 0.
Therefore, under these conditions, at state b1 the principal strictly prefers to make an offer that a c1-agent accepts and agents with cost c ∈ {c2, c3} reject than to make an offer that agents with cost in {c1, c2} accept and a c3-agent rejects.
However, the principal may choose to make an offer that all agent types reject when bt = b1 and C[ht] = {c1, c2, c3}. In this case, by the arguments above, the next time the state is equal to b2 the principal will make an offer that only types in {c1, c2} accept. The offer that she makes in this case is such that T − c2 = X(b2, {b4})(c3 − c2). Then, from that point onwards, she will never learn more (by property (i) above). In this case, the principal's payoff conditional on the agent's type being in {c1, c2} is

    Ū^nsc = E[ ∑_{τ=t}^{ť−1} δ^{τ−t} 1{bτ = b4}(bτ − c3) | bt = b1 ]
          + E[ δ^{ť−t} ((b2 − T) + ∑_{b∈E2} X(b2, {b})(b − c2)) | bt = b1 ]
          = X(b1, {b4})[b4 − c3] + X(b1, {b2})[b2 − c2] + E[δ^{ť−t} | bt = b1] X(b2, {b3})[b3 − c2].

Note that there exist η′ > 0 and ε′ > 0 such that, if Q_{b,b2} < η′ for all b ≠ b2, and if b1 − c1 > −ε′, then

    Ǔ^sc[c1] − Ū^nsc = b1 − c1 + (X(b1, {b3}) − E[δ^{ť−t} | bt = b1] X(b2, {b3}))[b3 − c1] > 0 and
    Ǔ^sc[c2] − Ū^nsc = 0.

Therefore, under these conditions, the principal makes an offer that type c1 accepts and types in {c2, c3} reject when C[ht] = {c1, c2, c3} and bt = b1.