December 2012 Optimal Dynamic Contracting∗ Abstract We study a simple dynamic Principal-Agent model in which the agent’s types are serially correlated. In these models, the standard approach consists in first solving a relaxed version in which only local incentive compatibility constraints are considered, and then in proving that the local constraints are sufficient for implementability. We show that, with the exception of few notable examples highlighted in the literature, this approach is not generally valid: even assuming standard regularity conditions, both local and global incentive constraints are generally binding when serial correlation is sufficiently high. We uncover a number of interesting features of the optimal contract that cannot be observed in the special environments in which the standard approach works. Finally, we show that even in complex environments, approximately optimal allocations can be easily characterized by focusing on a particular class of contracts in which the allocation is forced to be monotonic. Marco Battaglini Department of Economics Princeton University Princeton NJ 08544 mbattagl@princeton.edu Rohit Lamba Department of Economics Princeton University Princeton NJ 08544 rlamba@Princeton.edu ∗ For useful comments and discussions we thank Dilip Abreu, Dirk Bergemann, Sylvain Chassang, Stephen Morris, Wolfgang Pesendorfer, Roland Strausz, Balazs Szentes, Juuso Valimaki. Rohit Lamba is especially grateful to Stephen Morris for guidance and encouragement. We thank Edoardo Grillo for outstanding research assistance. 1 Introduction Most contractual relationships have a dynamic nature, involving long term, non-anonymous interaction between a principal and an agent. Examples of these contractual relationships include the cases of income taxation, of regulation, or of a monopolist repeatedly selling a non durable good to a buyer. In these environments contracts can be made contingent on past realizations of the agent’s type, allowing the principal to use the agent’s revealed preferences to screen future types’ realizations. This may be particularly useful in limiting asymmetric information and agency problems when the agent’s type is persistent over time. Despite recent advances in contract theory, there is still a limited understanding about how to use this information to design optimal screening contracts. Dynamic contracts are difficult to study because they involve a large number of incentive compatibility constraints. The analysis of optimal dynamic contracts has therefore been limited to economic environments in which a form of the “first-order approach” can be applied: that is, in which the optimal contract can be fully characterized using only the necessary conditions implied by the local incentive compatibility constraints. While the first-order approach can be generally applied in static environments under mild regularity assumptions, in dynamic environments local incentive compatibility constraints have proven to be sufficient only in very special economic environments.1 This leaves two questions open. approach? First, what is the general applicability of the first-order In order to characterize the optimal contract in general environments, can we just proceed by ignoring the global incentive constraints? Second, in environments in which the firstorder approach does not hold, what does the optimal contract look like? Are there phenomena associated with dynamic contracts that we are ignoring by focusing on environments in which solving the contract is easy? For all environments in which the optimal contract has been characterized the first-order approach could be applied: this may suggest that under appropriate regularity conditions the answer to the first question is affirmative, and the second question is therefore irrelevant. In this paper we show that this is not generally correct. Even in the simplest (and most natural) environments, both local and global constraint are typically binding in dynamic environments. When global incentive constraints cannot be ignored, moreover, the optimal contract shows features that are not apparent in environments in which the first-order 1 We will discuss the literature in greater details in the next subsection. 1 approach is valid. To make our point, we consider a simple Principal-Agent model in which a monopolist repeatedly sells a non durable good to a buyer. The marginal valuation of the buyer is his private information, and it evolves over time according to a general -state Markov process. Higher types are assumed to have higher marginal valuations and their associated conditional distribution on future types first-order stochastically dominates the distribution of lower types. We present three sets of results. We first explore the applicability of the first-order approach. We show that if we ignore the global constraints, the necessary local incentive compatibility constraints allow us to state a “dynamic envelope theorem” through which we can express the agent’s equilibrium rents just as a function of the expected allocation. The dynamic envelope theorem in turn allows a simple characterization of the profit maximizing contract in the relaxed problem that includes only local incentive constraints (first-order optimal contract, or FO-contract). We also show that the envelope formula and a simple form of monotonicity of the allocation are sufficient for implementability.2 Monotonicity requires that if º b , then ( ) ≥ (b ), where )) is the quantity allocated following a history (resp., b ).3 ( ) (resp., (b Although this the optimal dynamic contract has been characterized in the existing literature.4 These results condition is only sufficient and quite strong, it is verified for virtually all environments in which directly extend the analysis in Battaglini [2005] who studied the same environment under the assumption of only two types. This generalization allows us to put into a common framework the key features of the environments in which the optimal contract has been characterized in the literature. We then show that the environments for which the dynamic envelope formula is sufficient to characterize the optimal dynamic contract are quite special. In general, even in the simplest examples, the allocation is not monotonic and local incentive constraints are not sufficient for implementability. We proceed in two steps. To gain intuition on the structure of the optimal contract, we start by proving this result in a simple case with three types and two periods where the optimal contract can be fully solved in closed form. The characterization shows that the seller 2 An allocation is implementable if there exist transfers such that the contract is individually rational and incentive compatible. 3 A history is a vector of reports = (1 ), so as usual º if ≥ ∀ ≤ . 4 Less intuitive necessary and sufficient conditions, or weaker sufficient conditions can be easily stated, but they are of little, if any, practical interest. 2 typically finds it optimal to offer a continuation utility in the second period that is not monotonic in the revealed first period type. The optimal contract is characterized by what we call dynamic pooling: strategic, state contingent treatment of types in which types may be initially separated, to be then pooled conditioned on particular histories. We then generalize the analysis showing that with types and periods the applicability of the first-order approach is even more limited. Indeed, under quite weak conditions on the transition matrix, the first-order approach always fails for 2 if the time horizon is sufficiently important (that is when types’ persistence, number of periods and the discount factor are high enough). Numerical calculations show that in general the level of persistence needed to have the first-order approach fail is quite low. The intuition for why the analysis cannot be limited to local constraints in dynamic environments is simple. In a dynamic principal agent problem the informational advantage of an agent is not limited to his/her unobservable current type, but it also includes the (privately known) conditional distribution of future types. When there are types, the conditional distribution is an − 1 dimensional vector. A dynamic problem, however, is different from a standard static multidimensional screening problem because the time framework provides a particular structure to the incentive compatibility constraints: at any point in time what matters to the agent is only the expected rent. It may be conjectured that under sufficient regularity conditions the expected rents may be monotonic in the types, thus eliminating the multidimensionality problem. The main contribution of this paper is to show that in general environments expected rents are typically non monotonic, even assuming standard regularity conditions.5 This makes it impossible to reduce the design of the optimal contract to a standard one-dimensional screening problem. What does the optimal contract look like in general environments with large and ? What kind of advice can we give to a seller who needs to design an optimal contract? In our final contribution, we identify a particular class of allocations for which the optimal implementable contract can be easily characterized: monotonic contracts. Quantities in monotonic contracts are forced to be non-decreasing in types (a restriction that we call ironing). Restricting to this subset of contracts is not optimal in general. We show, however, that the optimal monotonic contract converges in probability to the optimal contract as types become highly persistent and thus the loss in the monopolist’s profit goes to zero. Further, numerical calculations show that the optimal monotonic contract performs very well, ensuring a minimal loss in objective, even with lower levels 5 The economic intuition for this phenomenon is discussed in detail in Section 4.1 following Lemma 4. 3 of persistence. In the following, we proceed as follows. In Section 2 we present our simple benchmark model. In section 3 we study the general problem. In Section 4 we characterize the optimal contract in the case with 3 types and 2 periods. In Section 5 we study monotonic contracts. Section 6 concludes the analysis. We discuss the related literature in the next subsection. 1.1 Related literature The first paper to present a general theory of dynamic contracts, and to state a dynamic version of the so-called “envelope formula,” is Baron and Besanko [1984]. The envelope formula, an essential result in static contract theory, is an implication of the local incentive compatibility constraints: it allows us to express the agent’s equilibrium rent purely in terms of the final allocation.6 In static environments, under standard regularity conditions, the formula is sufficient for incentive compatibility and therefore drastically simplifies the constraint set. In their paper, Baron and Besanko [1984] state the formula in general terms and show it to be sufficient in two benchmark cases: when types are constant over time, in which case the optimal dynamic contract corresponds to a repetition of the static optimum; and when types’ realizations are independently distributed over time, in which case the optimal contract is efficient starting from period 2. However, they do not study the conditions under which the envelope formula is sufficient for incentive compatibility in the general case with imperfectly correlated types. Besanko [1985] and Battaglini [2005] are the first papers to present natural environments in which the dynamic envelope theorem is sufficient and so the optimal contract can be fully characterized even with an infinite horizon.7 Besanko [1985] studies a model in which the type follows a first-order autoregressive process: the type at time is = −1 + (1 − ) , where is the realization of an i.i.d random variable and ∈ (0 1) is a parameter measuring persistence. This environment is particularly simple to study because the stochastic process can be separated in a deterministic part that induces persistence, and in a shock that is uncorrelated with the agent’s type. Because the shock at + 1 is uncorrelated to the type at , the agent at time does not have any private informational advantage on the stochastic part of his future type: the 6 See Stole [2001], Laffont and Martimort [2002], Milgrom (2004), and Bolton and Dewatripont [2005] for general discussions of the envelope formula in the static case. 7 A number of other paper have studied dynamic contracts in two period environments. See for example Laffont and Tirole [1990]. 4 shock is therefore irrelevant for incentive compatibility. As shown by Besanko [1985], the optimal contract in this environment is qualitatively similar to a fully deterministic one in the sense that the equilibrium distortions are independent of the distribution of the shocks, .8 [2005] considers a two types Markov process with correlated types. Battaglini In this case the agent has private information both on his current type and on the shape of future types’ distributions. The distortions in the allocation are, thus, state contingent and depend on the entire history of the contract. However, these distortions persist only along the “lowest” history in which the agent’s valuation is low in every period and they decrease over time, converging to zero as time becomes inifinitely large. The contract thus converges to efficiency along all histories.9 The model that we study in the reminder of the paper is a direct generalization of this framework to the case with types.10 A recent literature has attempted to apply the first-order approach to general environments. The implicit assumption in this literature is that the first-order approach works if a sufficient dose of “regularity” conditions is imposed on the model. Pavan, Segal and Toikka [2012] have studied the necessary conditions characterized by local incentive compatibility in the form of a dynamic envelope formula in quite general environments. These formulas have been used in recent applied work to study a variety of principal agent models in which sufficient conditions for the validity of the first-order approach have not been established. This literature typically either assumes sufficiency without checking, or relies on numerical methods to verify a sample of the global incentive constraints.11 The problem with this approach is that the verification of all incentive compatibility constraints is impossible in an infinite horizon problem, and impractical in non trivial finite problems, since the number of constraints grows exponentially with the types and periods. All the papers in the literature discussed above study environments in which the allocation is determined in every period, as is natural to assume in application such as nonlinear pricing (where 8 This model has been used in a variety of applications: regulation (Besanko [1985]), managerial compensation (Garret and Pavan [2012]) and others (see Pavan, Segal and Toikka [2012]). 9 Applications of this model include regulation (Battaglini [2007]), optimal taxation (Battaglini and Coate [2008]), non-durable goods monopoly with renegotiation (Maestri [2012]). 10 In Section 3.2 we show how the model with persistent Markov shocks and the model with an auto regressive component can be easily combined and studied in the same framework. 11 This approach has been adopted by Farhi and Werning [2011], Golosov, Troshkin and Tsyvinski [2011], Kapicka (2011), Zhang [2009] and others. 5 a good is sold to a customer in every period), or nonlinear taxation (where income is produced in every period). A related literature has studied environments in which the agent receives information gradually over time, but the allocation is determined only in the last period.12 The seminal contribution here is Courty and Li [2000] who consider a monopolist model in which the buyer first observes a signal about the distribution of his type, and then, before the good is sold, observes his type. This literature studies the conditions under which it is optimal for the seller to screen the agent sequentially. The fact that the allocation is determined only at the end makes implementation conceptually simpler in these environments. As in the papers listed before, Courty and Li [2000] focus on environments in which global incentive constraints can be ignored, showing sufficient conditions under which this approach is valid.13 We finally note that our characterization of monotonic contracts fits into a recent literature devoted to the study of approximately optimal mechanisms in environments in which fully optimal mechanisms are hard to characterize (see Madarasz and Prat [2012], Chassang [2011] for recent contributions and Hartline [2012] for a summary of the computer science approach). While parts of this literature deal with more general environments than ours, our approach takes full advantage of the dynamic structure of the framework we study; this allows us to obtain an approximately optimal contract that guarantees incentive compatibility for all types at all histories. 2 Model There are two players, a buyer and a seller. The buyer repeatedly buys a non-durable good from the seller. He enjoys a per-period utility − for units of the good bought at a price . In every period, the seller produces the good with a cost function () = 1 2 2 . The marginal benefit evolves over time according to a Markov process. There are + 1 possible types, Θ = {0 1 }, with − +1 = ∆ 0 for any = 0 − 1. Let N = {0 1 2 } denote the set of all indices of types. The probability that state is reached if the agent is in state P − (+) . The is given by ( | ) = . Let be the conditional CDF, defined ( | ) = =0 distribution of types conditional on being type is denoted α = (0 1 ). We assume that 12 In addition, important work has been done to study the related but different problem of dynamic implementation of efficient mechanisms with multiple agents (see Athey and Segal [2007] and Bergemann and Valimaki [2010]). 13 Other important theoretical contributions in this literature on sequential screening are Eso and Szentes [2007] and Krahmer and Strausz [2011]. A related model of price discrimination with information revelation at the interim stage is presented by Inderst and Ottaviani [2011] to study the market for financial advice. 6 α has full support (so 0 for any ), and α first-order stochastically dominates α for any and any : so, given that higher indices imply lower values, we have ( | ) ≤ ( | ) for any and ≤ . In each period the consumer observes the realization of his own type; the seller, in contrast, can only observe past allocations. At date 0 the seller has a prior μ = (0 ) on the agent’s type; the prior has full support, so 0 for any . For future reference, note that the efficient level of output, that maximizes the surplus, − 12 2 is equal to ( ) = in all periods and after any history of types’ realizations. We assume that time is discrete and the relationship between the buyer and the seller lasts for ≥ 2 periods. In period 1 the seller offers a supply contract to the buyer. The buyer can reject the offer or accept it; in the latter case the buyer can walk away from the relationship at any time ≥ 1 if the expected continuation utility offered by the contract falls below the reservation value = 0. In line with the standard model of price discrimination, the monopolist commits to the contract that is offered. The common discount factor is ∈ (0 1). It is easy to show that in this environment a form of the revelation principle is valid, which allows us to consider, without loss of generality, only contracts that in each period depend on the revealed type at time and on the history of previous type revelations, i.e. the contract hp qi ¢ ¡ ¯ ¢¢ ¡ ¡ ¯ can be written as hp qi = ¯−1 ¯−1 =1 , where −1 and are, respectively, the public history up to period − 1 and the type revealed at time 14 In general, can be defined © ª recursively as = −1 , 0 = ∅. The set of possible histories at time is denoted (for simplicity = ). Let be the mapping that projects the first elements of a vector. The set of full histories that follow till time is given by ( ) = { ∈ | () = }. A strategy for a seller consists of offering a direct mechanism hp qi as described above. The strategy of a n o b−1 , where is consumer is, at least potentially, contingent on a richer history = −1 the actual type every period and b is the revealed type. Note that 0 = ∅. For a given contract, a strategy for the consumer, then, is simply a function that maps a history into a revealed type: 7→ ( ). In the study of static models it is generally assumed that all types are served, i.e., each type is offered a positive quantity. This assumption considerably simplifies the analysis because it permits ignoring the non-negativity constraints in quantities. In our model, as in a static model, 14 When it does not create confusion, the subscript of signifies time period or a specific type depending on the context: so when we write , we mean . The extra notation would be superfluous in most of the paper. 7 this property is guaranteed by the assumption that ∆ is sufficiently small. In the following we ¢ ¡ ¯ will assume that this is the case, and so ignore the non-negativity constraints on ¯−1 15 3 The dynamic envelope formula and optimality In this section we characterize the seller’s problem and we discuss the standard approach that has been used in the literature to solve it- the so called first-order approach. The seller’s problem consists of choosing a contract hp qi that maximizes profits under two sets of constraints: incentive compatibility constraints, that guarantee that agent does not want to report being a type after any history ; and individual rationality constraints that guarantee that all types receive at least their reservation utility = 0 after any history . Since the choice of prices and quantities corresponds to the choices of utilities and quantities levels for the buyer, this problem can be conve¢ ¡ ¯ ¢¢ ¡ ¡ ¯ niently represented as a choice of utilities and quantities hU qi = ¯−1 ¯−1 =1 , ¢ ¡ ¯ where ¯−1 is the expected utility of a type after history −1 . The generic incentive compatibility constraint (−1 ) requires ( |−1 ) ≥ ( ; |−1 ) where ( ; |−1 ) the expected utility for a type of reporting to be a type at time after history −1 . This constraint can be easily rewritten in terms of hU qi as: ( |−1 ) ≥ ( ; |−1 ) = ( |−1 ) + (1) X =0 ( − ) ( |−1 ) + ( − )∆( |−1 ) The individual rationality constraint for type at history −1 , (−1 ), is a simple non negativity constraint: ( |−1 ) ≥ 0 (2) For future reference, we call local downward constraints the incentive constraints that are associated with a deviation to a contiguous lower type (i.e. +1 (−1 )), and local upward constraints the incentive constraints that are associated with a deviation to a contiguous higher type (i.e. +1 (−1 )). We refer to all the other constraints as global. A contract that satisfies all the incentive and rationality constraints is said to be implementable. 15 The exact condition that guarantees that the non-negativity constraints is non binding will be obvious in the analysis since the optimal contract will be characterized in closed form. 8 The monopolist’s problem constitutes maximizing expected surplus net of the buyers expected equilibrium rents: ⎧ P ⎪ ⎪ ⎨ E [(q)] − ( |0 ) =0 max hUqi ⎪ ⎪ ⎩ s.t. (−1 ), (−1 ) for any and, ∀ −1 ∈ −1 ∀ ⎫ ⎪ ⎪ ⎬ ⎪ ⎪ ⎭ (3) This is a standard maximization problem of a concave function under a system of linear constraints. As and increase, however, the number of variables and constraints becomes prohibitively large making (3) analytically intractable. The standard approach in the literature is to first study a relaxed problem in which only individual rationality constraint of the lowest type and the local downward constraints +1 ( ) are considered. The remaining constraints can be verified ex-post after the solution of the relaxed problem has been characterized. Definition 1. A contract is first-order optimal if and only if it maximizes profits under the following constraints: +1 (−1 ) and (−1 ), ∀ ∈ N \{ } ∀−1 ∈ −1 ∀ The interest in the FO-optimal contracts derives from the fact that in many environments they coincide with the optimal contracts. Under standard assumptions, the FO-optimal contract coincides with the optimal contract in a static environment, both with finite and continuous type spaces (Stole [2001]).16 This approach has also been used in all papers that have extended the Principal-Agent model to dynamic environments: the first-order autoregressive environment (Besanko [1985]) and the Markov environment with two types (Battaglini [2005]). Perhaps, more importantly, the applied literature often focuses on FO-optimal contracts even in the absence of an explicit proof that local incentive constraints are sufficient for implementability.17 It is easy to show that when we consider the relaxed problem with only local downward constraints, the incentive compatibility constraints can be assumed to hold as equalities.18 This allows us to eliminate utilities from the optimization problem and drastically simplify the constraint set. Define: ∆ ( | ) = ( | ) − ( |−1 ) 16 For the utility funtion we are assuming in this paper, the FO-optimal contract is optimal in a static environment if the prior satisfies the monotone hazard rate condition. See Stole [2001] for discussion of these results. 17 See the Section 1.1 for a discussion of this literature. 18 The details of the statements made in this section are formally proven in the appendix. 9 Recalling that ( ) is the set of histories follow , and representing by the th element of hisotry , we have the following characterization of the agent’s utility as a function of only q:19 Lemma 1. In correspondence to a FO-optimal contract we have: ( |−1 ) − (+1 |−1 ) ∆ = (+1 |−1 ) X X + ̂∈(−1 +1 ) − Y =+1 for any ∈ N \{ } ∈ −1 and = 1 . ³ ¯ ´ ¯ ∆ ̂ ¯̂−1 (̂ |̂ −1 ) (4) Lemma 1 presents a straightforward dynamic extension of the envelope formula introduced by Myerson [1981]. This can be seen by taking to zero, in which case the second term in the right hand side vanishes and (4) coincides with the classic static formula. A continuous type version of the formula is presented in Baron and Besanko [1984] for the case in which = 3 and by Besanko [1985] for an infinite horizon model with first-order autoregressive types in which shocks have independent realizations. Battaglini [2005] states the formula for a Markov process with two states: (4) is a direct, but more involved extension of this result for the case with |Θ| ≥ 2. Pavan, Segal and Toikka [2012] present a general version of the formula for a continuous type space and other stochastic processes. Lemma 1 allows us to express the utility vector only in terms of q. Define: ⎡ ∗ −1 ( | − ⎢ X ⎢ ; q) = ∆ ⎢ ⎣ + =1 −1 P P ̂∈(−1 + ) (+ | − ) ³ ¯ ´ Q ¯ ∆ ̂ ¯̂−1 (̂ |̂ −1 ) =+1 ⎤ ⎥ ⎥ ⎥ ⎦ (5) for any , and ∗ ( |−1 ; q) = 0. The following result derives immediately from (4): Corollary 1. In correspondence to a FO-optimal contract, we have ( |−1 ) = ∗ ( |−1 ; q) for any ∈ N , −1 ∈ −1 ∀ Note, that this result allows us to express ( |0 ) solely as a function of q. The FO-optimal contract can now be characterized as the solution of the following program: ) ( X ∗ 0 max E [(q)] − ( | ; q) q (6) =0 19 To interpret (4), note that, given a history ̂ = (̂ ̂ ), ̂ is the realization of the type at time ≤ . It 1 follows that (̂ |̂ −1 ) is the quantity at time when the realized history is ̂ −1 . 10 This problem can be solved to obtain the closed form solutions presented in Proposition 1. Let ( ) be equal to 1 at = 1, and for 1, define: ⎧ ⎪ ⎪ ⎨ 0 if = 0 for any ≤ ( ) = −1 Q ³ ∆ ( +1 | ) ´ ⎪ ⎪ ⎩ ( | ) =1 +1 From the first-order conditions of (6) we obtain:20 Proposition 1. In correspondence to the FO-optimal contract we have: ¡ ¢ ∗ |−1 = − 1− = (−1 )∆ ∀ ∈ N ∀ ∈ −1 ∀ (7) for t1. where = for t=1 and = −1 1 The FO-optimal contract has very distinctive characteristics. As soon as the type becomes 0 , the highest type, the contract becomes efficient in all following periods, a phenomenon that has been called the “Generalized No-Distortion at the Top”. For any other history, the quantities are i hP −1 −1 )∆: this formula is complicated distorted below and the distortion is =0 ( because the wedge is state contingent and it depends on the entire history. The distortion, however, converges to zero as the serial correlation of types converges to zero. Given the (relatively) simple characterization of Proposition 1, the key question is: when is it with loss of generality to focus on the first-order approach? Substituting (7) for the relevant allocation, (1) and (2) are the necessary and sufficient conditions on the primitives for the implementability of ∗ , but they are not very practical to check. A much simpler sufficient condition for ¡ ¢ the validity of the first-order approach is the following. Let ( ) = |−1 be an allocation if ≥ b ∀ ≤ . We have: after history , and let º b ) for any º b . Definition 2. An allocation is monotonic if ( ) ≥ (b The following result was used in Battaglini [2005] to establish the sufficiency of (5) for the case |Θ| = 2. The generalization to the case with types is immediate:21 20 Note that, in the following expression, (−1 ) corresponds to ( ) for = −1 . 21 Indeed in Battaglini [2005] it is shown that a weaker monotonicity condition than the one in Proposition 2 −1 is actually sufficient. The condition requires that is non decreasing in (see Step =0 ∆ 1 of Claim 2 in Battaglini [2005]). This condition is implied by the monotonicity of the allocation as defined in Definition 2. Analogous monotonicity results applicable to other stochastic processes have been presented by Pavan, Segal and Toikka [2012], see Theorem 4. 11 Proposition 2. The envelope formula (5) and monotonicity of the FO-optimal contract are sufficient for implementability. Proposition 2 directly parallels the well known results in static environments that show that local incentive compatibility (i.e. the envelope formula) and monotonicity of the allocation are necessary and sufficient for implementability. The result is however weaker for two reasons: first the monotonicity condition is stronger than in a static environment, since it involves all histories following a report; second, the result is only sufficient.22 The problem with Proposition 2 is that it is useful only to the extent that it is easy to apply. The are a number of applications in which the FO-optimal contract is monotonic, but the number is limited. Example 1. Battaglini [2005] has considered a model identical to the model presented above, but with |Θ| = 2: a high type and low type. In this case, the validity of the first-order approach is established as follows. Note, that the FO-optimal contract is always distorted below and as soon as the type is high, the contract becomes efficient. With two types distortions are present only in the history in which the type is always low. This history corresponds to the lowest history according to the order º introduced above. It follows that the contract is monotonic according to definition 1, and so the FO-optimal contract is optimal.23 Example 2. Besanko [1985] has considered a case in which = −1 + (1 − ) , where is the realization of an i.i.d. random variable and ∈ (0 1) is a parameter measuring persistence. The Markovian framework developed above can be easily adapted to generalize Besanko’s environment. We present a two period model to drive home the point in a simple fashion. The type in the first period 1 is drawn from some prior with support 0 ,where − +1 = ∆. The second period type 2 is determined by the following stochastic process. 2 = 1 + (1 − )b2 where 1 is the realization at = 1 and b2 is the random realization of a Markov process with with + 1 states, the same support as 1 , and transition probabilities ( | ) = (where is the probability of b2 = given 1 = ). Note that measure the level of autocorrelation. If 22 This sufficient condition is stronger than what it is needed. The point here is to state the simplest sufficient condition that allows us to apply the first order condition to all the environments in which it is known to work. 23 Boleslavsky and Said [2012] generalize this two type model assuming continuous types at = 0 with binary shocks in the following periods. They also consider a version with a continuum of shocks, but in this case they directly assume that the quantities are monotonic in the reported values. 12 we assume = for any , then we have i.i.d. shocks and we are in a model isomorphic to Besanko [1985]. When we consider only local incentive constraints, it is easy to show that they must hold as equalities. In period 2, we have ( | ) = (+1 | ) + (1 − ) ∆(+1 | ) where ( | ) and ( | ) are respectively the second period utility and quantity of the agent who has type in the first period and and +(1−) in the second. Without loss of generality P we can set ( | ) = 0, so we have ( | ) = (1 − ) ∆ =+1 ( | ). Similarly, in the first period, we have " ( ) = (+1 ) + ∆(+1 |0 ) + ∆ X ¡ ¢ − (+1) ( |+1 ) + X =0 # ( |+1 ) (8) =0 The expected utility of the agent with type is equal to the utility of type +1 plus an informational rent. The informational rent can be decomposed into two parts. First, we have a P ( |+1 ): the realization of type at time 1 affects deterministic part ∆(+1 |0 ) + ∆ =0 rents at = 1 (i.e., ∆(+1 |0 )), but it effects rents at = 2 as well, in a way that is proportional P ( |+1 )). Second, we have the stochastic to the degree of autocorrelation (i.e., ∆ =0 ¡ ¢ P − (+1) ( |+1 ). This term depends only on the fact that a type and a type part =0 + 1 have different expectation on the probabilities of future types. Besanko [1985] assumes that the shocks are i.i.d., so that = (+1) . In this case, the stochastic term of the information rent is zero. The distortions are then exclusively deterministic. The first order optimal quantities are given by ( |0 ) = − 1− = ∆ in period 1, and ( | ) = + (1 − ) − 1− = ∆ in period 2. It is immediate to observe that, under monotone hazard rate on the priors, quantities are monotonic in the sense of Definition 1. It follows from Proposition 2 that the first order approach works and these quantities describe the optimal contract. Following the same steps, we can generalize the analysis to periods as in Besanko [1985], assuming = −1 + (1 − )b where b has the same i.i.d. distribution as b2 . In this case it is easy to verify that we have: ( |−1 ) = + (1 − ) − −1 · 13 1− = ∆ −1 where is the realization at time − 1 (i.e. −1 −1 ) and is the realization at time 1 (i.e. 1 ). Note that + (1 − ) is the first best efficient quantity: the optimal contract is characterized by deterministic distortions that are independent of the Markov process governing the evolution of types.24 Example 3. When the shock is i.i.d. it is easy to extend the analysis of Example 2 to a case with X X − + (1 − 0 )b with = 1. The example can also -order autoregression: = =1 =0 be extended to a case in which = ( −1 −1 −1 −1 ) + (1 − )b , where (−1 −1 ) is a function of , the types up to − 1, and of , the allocation up to − 1. To see this point, note that X − or (−1 −1 ) are just constants for all types, so they do not at time the terms =1 have any effect on incentives to reveal the true type. Another simple variation is to assume that = 1 (−1 −1 ) + (1 − )2 (−1 −1 )b , where (−1 −1 ) = 1 2 are functions of −1 , the types up to − 1, and of −1 , the allocation up to − 1. Both of these extensions of Besanko’s model have been suggested by Pavan, Segal and Toikka [2012]. The key assumption here is that the shock is independent of the agent’s type. The examples presented above show that the first-order approach can be extended to study quite complex dynamic environments. basic assumptions. All the examples, however, can be reconducted to two The environment studied in Besanko [1985] allows for many possible types (infact a continuum), but with the stochastic shocks being uncorrelated to the agent’s type. In this environment the shocks are irrelevant for the equilibrium distortions, which are independent of the history of realized types (except for the first). The environment studied in Battaglini [2005] allows the conditional distributions of the types to depend on the type, but limits the analysis to two types only. In this case the optimal contract is history dependent. These two environments have a common feature that explains why it has been impossible to find other natural environments in which the first-order approach works: they limit, or completely neutralize the “multidimensionality” of the dynamic screening problem. When there are types, an agent’s type is typically an − 1 dimensional object: the type includes the shape of the conditional distribution function, in the model described above. Limiting the analysis to these environments, however, is with loss of generality and it artificially limits the range of economic phenomena that naturally arise in dynamic screening problems, as the example in Section 4.1 will clearly illustrate. 24 Besanko [1985] assumses the intial type to be determinstic, so there are no priors in the model. The distortions, therefore, are just functions of - distortion in period being −1 . 14 4 Global constraints and optimality: impossibility results In this section we discuss the conditions under which the first-order approach cannot be used, and explore the structure of the optimal contract in these situations. We first present a simple example with three types and two periods that can be fully solved. This example allows us to analyze the complexities of an optimal dynamic contract when there are more than two types and the shocks are not independent from the types. The example shows that global constraints cannot in general be ignored and, even when they can be ignored, the envelope formula (4) is not sufficient to characterize the optimal contract because additional monotonicity constraints need to be satisfied. It also shows that, contrary to what happens when the first-order approach can be adopted, the seller finds it optimal to strategically pool the types to extract higher rents: pooling is dynamic and history dependent. We then generalize the results of the example by showing that the first-order approach always fails when the time horizon is sufficiently important. 4.1 A fully characterized optimal contract In this section we analyze a three types and two periods model. In particular, the set of types is { } where with − = ∆ = − . The transition probabilities between period one and two are described by (|) = and (|0 ) = 1− 2 for any 0 ∈ { } 6= 0 , with 13. In this model, therefore, measures the persistence of the types. Conditional on not being the same type, there is an equal probability of being any of the other two types. An appealing feature of this simple transition matrix is that it ranks the types according to first-order scholastic dominance, as requested in the model described in the previous sections. The first-order approach cannot generally be used in this setting. To characterize the optimal contract we therefore focus on a weakly relaxed program that constitutes problem (3) with |Θ| = 3 and = 2, once we replace the constraint set with the following constraints: (9) ( ) ( ) ( ) () () () where is the individual rationality constraint of type at = 0, is incentive compatibility constraint requiring that type doesn’t want to misreport being a type in period 1, and () is 15 Figure 1: The dashed arrows are the constraints in the WR-problem that are ignored in the first order approach. the incentive compatibility constraint requiring that type doesn’t want to misreport being a type in period 2, after the agent reports to be a type in period 1. With respect to (3), this problem has two key differences. First, now we are ignoring all the individual rationality constraints of the lowest type in period 2 and incentive compatibility constraints after history . Second, and most importantly, we are adding three new constraints: the global downward constraint and the local upward constraints ( ) () in period 2. The constraint set of the problem is illustrated in the relevant history tree in Figure 1. In the following we will refer to this program as the WR-program. Since this is a three type and two period model we simplify notation. Let be the expected utility of type in the first period and () be the expected utility of type after history in the second period. Note that since the second period is the terminal period, the expected utility and stage utility are the same. Similarly, we define and () to be the first and second period allocations respectively. The following lemma allows to simplify the constraint set:25 Lemma 2. In the WR-program, constraints , , bind at the optimum. We can now use the equalities implied by Lemma 2, to reduce the number of free variables 25 When only the usual local downward incentive compatibility constraints are considered, the following result is immediate. If, for example were not binding, the principal could simply raise the price a type is paying. In a WR-problem the proof of the result is complicated by the additional constraints: reducing type ’s rent at = 0 may conflict with . 16 in the optimization problem. In particular we can eliminate the period 1 utility vectors. Define () = () − () and () = () − () for = . The variable () is the net utility of reporting to be type rather than a type after history . Using this notation, we can rewrite the WR-program as a maximization problem in which the control variables are the quantities q and second period marginal utilities ω: ⎧ " # ⎪ ¡ ¢ P P ⎪ 1 1 ⎪ − 2 2 + ( | ) () − 2 ()2 ⎪ ⎪ ⎪ = = ⎪ ⎨ ¤ £ max − ∆ + 3−1 hqi ⎪ 2 ( ) ⎪ ⎪ ⎪ ⎪ £ ¤ ⎪ ⎪ ⎩ −( + ) ∆ + 3−1 2 () subject to [] : [ ( )] : ∆ + ⎫ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎬ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎭ (10) 3 − 1 3 − 1 ( ) ≥ ∆ + () 2 2 ( ) ≥ ∆ ( ) | [ ()] : () ≥ ∆ () [ ( )] : ( ) ≥ ∆ ( ) | [ ()] : () ≥ ∆ () [ ( )] : ( ) ≤ ∆ ( ) | [ ()] : () ≤ ∆ () where the variables in the square brackets on the left are the Lagrange multipliers associated with the constraints. Program (10) is a standard maximization problem, but it is complicated by a still significantly large number of constraints. The first constraint is global: it prevents the high type from reporting to be a low type. All the remaining constraints are local, associated with deviations to a contiguous type. The difference between (10) and the problem of the first-order approach (6) is the global constraint and the presence of the upward constraints ( ) and (). The upward constraints ( ) and () are essentially monotonicity conditions requiring () ≥ () for = 26 We cannot ignore any of these three constraints. Moreover now we cannot assume without loss of generality that all local downward incentive constraints are binding at = 2: so the envelope formula (4) cannot be directly applied. This is why we still have utilities in the objective function. We have: Lemma 3. A contract is optimal if and only if it solves the WR-program. 26 To see this note that given (), () ≥ () if and only if () is satisfied. 17 Figure 2: Fully characterized contract We can therefore focus on problem (10). The analysis can be divided into two cases: first the case in which the global constraint can be ignored and so it is sufficient to look at local constraints, i.e. = 0; second, the case in which the global constraint is binding, i.e. 0. The following result characterizes the necessary and sufficient condition for = 0. For a given and , the environment is fully described by two parameters, , and therefore it can be represented in the two dimensional box ( ) ∈ ( ) = (0 1 − ) × (13 1).27 In the rest of the analysis we will fix and and study how the equilibrium changes as we change . This approach is without loss of generality and it allows for simpler statements (and a graphical representation) of the relevant cases. We have: Lemma 4. There exists a threshold ∗ () such that the global incentive constraint can be ignored if and only if ≥ ∗ (). The reason why the global constraint may bind in this WR-program is that in correspondence with the FO-optimal contract, the expected rent of an agent in the second period is not generally monotonic in the type revealed in the first. 27 To see this point, consider the global incentive The thresholds defined below do not depend on the types . 18 constraint . Because is binding, the rent of a high type is equal to the expected utility that a type receives in equilibrium by reporting to be a type , that is ( ; ),where ( ; ) is the expected utility that a type receives in equilibrium by reporting to be a type . The global constraint can therefore be written as ( ; ) ≥ ( ; ), or: ∆ X = + X X 3 − 1 3 − 1 ( ) ≥ 2 ∆ + () ∆ ∆ 2 2 = = P ∆, on the left hand Each side of the inequality has two components: a “static” term, = P 3−1 side, and 2 ∆, on the right hand side; and a dynamic term, 2 ∆ ( ), on the left = P hand side and 3−1 = (), on the right hand side. 2 ∆ It can be verified that the static term on the right hand side is always lower than the static rent on the left hand side since ≤ .28 The same, however, is not true for the dynamic terms P P because = () may be strictly larger than = ( ). For example, we always have P P ( ) ≤ (). When = () = ( ), may become binding. Why does the seller find it optimal to offer a lower quantity to an type after history than after history (i.e., ( ) ≤ ())? In static contracts, the intuition behind ≥ is relatively simple. The quantities chosen after history determine the informational rent of type and of type 29 ; the quantities chosen after determine only the informational rent of type : this is why in a static environment, we should expect that distorting quantities after should have a large impact on total expected rents in the first period, and so be a better choice for the seller. Dynamic contracts, however, are different, as illustrated by Figure 3. Consider the effect a reduction in ( ) keeping the rent of type constant at = 1 (see the left panel of Figure 3). The change implies a decline in the rent of ( ) by, say, ∆. The rent of type at = 1, therefore, remains constant if monetary transfers at = 1 increase by ( | ) ∆.30 After the change, the rent of at = 1 (that in equilibrium is equal to the utility of falsely reporting to be ) decreases by ( | ) ∆ because of the expected change in contract at = 2 and increases by ( | ) ∆, because of the monetary compensation at = 1: the rent of type is therefore reduced by a net [ ( | ) − ( | )] ∆ = 3−1 2 ∆. Assuming to be slack, the seller 28 Explicit solutions for the optimal quantities for all feasible parameters are presented in Table 1 in the appendix. 29 As in static contracts, type receives the informational rent of type plus an additional rent. 30 Recollect from Section 2 that given type in period , the probability of being type in period + 1 is given by ( | ). In a slight abuse of notation, we write it as ( | ). 19 UH f (H H ) f ( H M ) 3 1 2 UH f (H M ) f (H L) 0 uH (M ) UM f (H M ) f (H L) 0 qM (M ) UM f (H M ) uH (L) qM (L) UL f (H L) Figure 3: will choose the distortion to solve the trade-off between the marginal benefit of reducing the rent of , and the marginal cost produced by the reduction in surplus, implying:31 ( ) = − 3 − 1 2 Consider now a correspondent reduction in () that keeps the rent of the type constant at = 0 (see the right panel of Figure 3) By the same logic as above, this change induces a decrease in the rents of at = 1 equal to [ ( | ) − ( | )] ∆. The rent of is a linear function of , so it is also reduced by the same amount. In our model, however we have that ( | ) and ( | ) are both equal to (1 − )2, so the net benefit of the reduction in () is zero. It follows that: () = − and so the contract is not monotonic. 3 − 1 = ( ) 2 This is a general result: for any Markov process, as persistence converges to one, ( | ) and ( | ) must be both converging to zero, so the benefit of distorting () must be converging to zero; the net benefit of distorting ( ), on the contrary, increases with . The optimal contract must therefore be non-monotonic with sufficiently high persistence. 31 The marginal benefit is equal to (3 − 1) 2 · ∆ and marginal cost is equal 2 )2 to () − ( − ( () − ∆) − ( (2)−∆) . Ignoring the ∆2 term, and equating 2 the two gives us the quantity. 20 The general lesson here is that non-monotonic allocations should not be seen as a phenomenon induced by lack of “regularity”. When types are sufficiently highly correlated, a non monotonic allocation naturally arise as a consequence of profit maximization. Further, if is binding, a reduction in the equilibrium rent of type violates the constraint, thereby creating an addtional layer of complexity, beyond the tradeoffs mentioned above. Within the two regions defined by ∗ (), the particular shape of the optimal contract depends on the remaining set of binding constraints. The following proposition describe how the optimal contract looks like for ≥ ∗ (), when the global constraint an be ignored: Proposition 3. Assume ≥ ∗ (). There is a threshold 0 ( ), such that: - Case A1. If 0 ( ), the optimal contract is fully separating and first-order optimal. - Case A2. If ≥ 0 ( ), the optimal contract is fully separating after all histories except . After this history types M and L are pooled: ( ) = ( ). Regions A1 and A2 are illustrated in Figure 2 in a simple parametric example, where the threshold 0 ( ) is represented by a dashed line.32 sufficient to characterize the optimal contract. In region 1, the envelope formula is It is interesting to note, that in this region, however, the optimal contract is not monotonic. This is a feature that, as discussed in Section 3, is never true when = 2, or when the shocks are serially uncorrelated as in Besanko’s autoregressive model. In addition, Case A2 in Proposition 3 shows that even if we can ignore the global incentive compatibility constraints, the optimal contract cannot be characterized by simply relying on the envelope formula (4). Under standard regularity conditions, in static environments the contract that is optimal under only local constraints is monotonic. a dynamic environment. This is not necessarily the case in While in this case the “missing” binding constraint set can be easily identified, in more general environments with types and periods, it is likely to be hard to pin down which incentive constraints are binding. In these cases the envelope formula alone is not sufficient to characterize the optimal contract. When ∗ () both the global constraint and the local constraints and are simultaneously binding in the first period. There are three relevant cases. The following result characterizes the optimal contract in these situations: 32 In Figure 2 we assume = 025 and = 095. 21 Proposition 4. There exists a threshold ∗∗ () such that: - Case B1&B2. Assume ∈ [∗∗ () ∗ ()). The optimal contract is fully separating at = 1. There exists a threshold 0 () such that the optimal contract is fully separating at = 2 as well if 0 () (case B1). If ≤ 0 (), types and are pooled after history : ( ) = ( ) (case B2). - Case B3. If ∗∗ (), the optimal contract pools types and in the first period: = . In the second period, after history , the contract is separating and efficient. After histories and , types and are pooled across both histories: () = () and ( ) = () for = . Propositions 3 and 4 provide a full characterization of the optimal contract. Table 1, presented in the appendix, enlists closed form solutions of the optimal quantities for the entire parameter space. It is interesting to note that although in region 1 the global constraint is binding, we have full separation of the types. In region 2, the global constraint is binding and we have separation in the first period, but not in the second; in region 3, we have pooling of and both in the first and second periods. An important lesson of this example is the significance of pooling in dynamic contracts, a phenomenon that is not apparent in environments that are solvable with the simple envelope formula. Pooling may occur only in period 2 (regions 2 and 2) or both in period 1 and period 2 (region 3). Regions 2 and 2 are interesting because they involve strategic separation of types in period 1 followed by history dependent pooling in period 2, we term this dynamic pooling. Finally, note that in dynamic contracts, unline static, global incentive constraints can bind even types are fully separated. In cases 1 and 2, first period quantites are separated at the opitmum even though the first period global incentve constraint binds. 4.2 General impossibility results The results of the previous section show that even in a very simple example, the envelope theorem and local incentive constraints are not sufficient to characterize the optimal contract. In this section we generalize the results by showing that with types and periods, the first order approach is not valid if the time horizon and persistence are sufficienty important. To make this point, we continue to assume that types’ persistence is parameterized by a single parameter 22 , such that = . Now, however, besides allowing for any |Θ| ≥ 2 and ≥ 2, we put no restrictions on the “off-diagonal” probabilities ( 6= ), that can be written as = (1 − ), P = 1 ∀ . We say that the first-order approach fails where are generic constants with 6= as → 1 if there is an ∗ such that for ∗ , at least one of the omitted global constraints fails to be satisfied when the allocation is given by the first-order optimal contract (7). Proposition 5: There exist a ∗ 1, and a ∗ ≥ 2 such that for any ∗ and ≥ ∗ the first-order approach fails as → 1. Proposition 5 shows that the first-order approach fails precisely when it seems more important: when the future is significant (many periods, non myopic agent) and when types are sufficiently persistent. Note that the proposition above puts no restriction whatsoever on the priors. The fact that we require the time horizon to be sufficiently important for the result to hold should not be surprising: when types have low levels of persistence, then the dynamic nature of the environment has less bite since the agent has limited information about the future when signing the contract. Corollary 2 presented below shows that a high and are not necessary for the FOA to fail. Consider environments that satisfy the following weak assumption: Assumption 1. 1− −1 − (−2) −1 (−1) 1. This assumption is weak because it is natural to assume that the lowest two types do not have − → ∞ if −1 → 0 as increases. In a large mass when is large. Note that 1−−1 −1 this case, if the ratio (−2) ( −1) is finite, then Assumption 1 is verified for sufficiently large. An example that satisfies Assumption 1 is the case with a uniform prior = 1( + 1), and a transition matrix = , = (1 − ) for 6= as in the model of Section 4.1. We have: Corollary 2 Suppose Assumption 1 holds. Then, there exists ∗ ∈ (0 1) such that the first-order approach fails for all 0 and ≥ 2.33 The table in figure 4 provides additional support to the results of Proposition 5 and Corollary 2 by computing the lower bound ∗ for = 4 (5 types) and = 3, and a variety of configurations of the other parameters. In the table it is assumed that the types follow a “bell shaped” transition 33 Note that the FOA does not fail when = 1, but it may fail with 1 for any arbitrarily small. Indeed, when = 1 there is no second period in which the incentive compatibility constraints can fail; when 1 we can always find a future ≤ in which some necessary incentive constraint fails. 23 Figure 4: Failure of FOA for |Θ| = 5 = 3. matrix in which the probability that type becomes a type declines with the distance between and : = Ω 1 1 + · | − | where is a parameter regulating the curvature of the transition probabilities and Ω = (11) P 1 6= 1+·|−| is a constant chosen to have the probabilities sum to one. The left hand panel shows examples in which = 0, and so the off-diagonal probabilities are uniform as in the model of Section 4.1; the right hand panel assumes = 1. The fact that global constraints are binding when types are highly correlated may seem surprising since it is well known that when types are constant (and so perfectly correlated) the optimal contract is just a repetition of the optimal static contract (Baron and Besanko [1984]). As persistence converges to one, shouldn’t we expect the optimal contract to converge to the optimal contract when persistence is one? If this were the case, then local incentive constraints would be sufficient. Proposition 5 and Corollary 2 show that there is a failure of lower-hemicontinuity of the optimal contract with respect to types’ persistence. The intuition is very simple. As persistence converges to one, the FO-optimal contract does converge to the optimal static contract on the “main history,” that is the history in which types remain constant. Outside this history, however, the contract is generally not monotonic, even for arbitrarily high levels of persistence. Implementability fails precisely because of this non monotonicity, rendering the FO-optimal contract not incentive compatible. So, then, why is optimal to have repetition of the static optimal 24 contract when persistence is perfect? The reason is that any allocation is optimal (including a repetition of the static contract) after an history that has probability exactly equal to zero. However, even an arbitrarily small probability to reach the non-constant histories makes the repetition of the static contract suboptimal and demands non-montonic allocations as part of the optimum. 5 Ironing, implementability and optimality The results of the previous sections make clear that in order to solve for an optimal contract, the principal cannot generally use the first-order approach and limit the analysis to local incentive compatibility constraints. Without the first-order approach, we have no systematic way to simplify the constraint set: potentially, all constraints could be binding. extremely complicated even from a numerical point of view. This may make the analysis What does the optimal contract looks like in general environments with large and ? What kind of advice can we give to a seller who needs to design an optimal contract? In this section we show that there is a class of contracts that is relatively easy to characterize, and that induces a minimal loss (if any) on the principal’s payoff precisely when the first-order approach fails, that is, when the agent’s types are highly persistent. This class consists of contracts that are monotonic as in Definition 2. In static environments the envelope formula plus monotonicity are necessary and sufficient for a contract to be implementable: if we ignore the monotonicity constraint, then the contract must be ironed out to make it monotonic, otherwise implementability fails (see Myerson [1981]). In a dynamic environment monotonicity is not necessary: it follows that if we impose monotonicity in the seller’s problem, we guarantee implementability even if we ignore the global constraints, but we may obtain a suboptimal contract. The main result of this session is that as types’ persistence converge to one, the optimal monotonic contract converges in probability to the optimal contract, and so the loss from focusing on this class of contracts converges to zero. Define M as the set of monotonic contracts: ⎧ ¯ ¯ ⎪ ⎪ ⎨ ¯¯ ( |−1 ) ≥ (+1 |−1 ) , and ( |−1 ) ≥ ( |b −1 ), M = q ¯¯ ⎪ ⎪ ⎩ ¯¯ = 1 ∀−1 and −1 º b −1 ⎫ ⎪ ⎪ ⎬ ⎪ ⎪ ⎭ (12) if ≥ b ∀ ≤ . It follows immediately from Proposition 2 that the where, as before, º b 25 Figure 5: optimal monotonic contract can be characterized by solving the following program: ) ( X ∗ 0 max E [(q)] − ( ; q) q∈M (13) =0 where ∗ ( 0 ; q) is given by the envelope formula (5). Problem (13), moreover, is sufficiently tractable to allow a partial characterization of the properties of its solution. ¡ ¯ ¢ Proposition 6. In the optimal monotonic allocation, ¯−1 ≤ for any and −1 . More¯ ¡¯ ¡ ¯ ¢ ¢ over, for any arbitrarily small 1 2 0 there exists a ∗ such that Pr ¯ ¯−1 − ¯ 1 ≤ 2 for any environment with ≥ ∗ periods. The first part of the proposition establishes that analogous to the static model, the optimal monotonic contract is uniformly downward distorted. The second part of the proposition states that the contract converges to an efficient contract in probability. What about the loss in profits for high levels of persistence, owing to the deployment of the suboptimal monotonic contracts? Let q = { ( )} ∈ be an allocation and let q∗∗ = { ∗∗ ( )} ∈ be the optimal allocation. Let be the identity matrix that describes the transition matrix when types are perfectly correlated. As types become perfectly persistent, we must have that the transition matrix converges to , i.e. α → . We say that q converges in probability to the 26 optimal allocation as types become perfectly persistent if lim→ Pr (| ( ) − ∗∗ ( )| ≥ ) = 0 for any 0. Proposition 7. The optimal monotonic contract converges in probability to the optimal allocation as types become perfectly persistent. A major implication of above proposition is that, the seller’s profit associated with the optimal monotonic contract converges to the profit level in the optimal contract as types become increasingly persistent. The table in Figure 5 illustrates the loss of profits associated with the optimal monotonic contract in an example with 3 periods and 3 types and the Markov matrix used in section 4.1. The loss is expressed as a percentage of the profit in the optimal contract. As can be seen, the approximation is quite good for all cases, with a loss of profit that is never higher than 0.06%. The table also provides the loss that would be incurred by ignoring the dynamic nature of the problem and repeatedly offering a state uncontingent optimal static contract: the loss can be higher than 10% of the optimal profits, even in this simple example with only 3 periods. Naturally larger losses should be expected with longer horizons. It is interesting to note the inverse-U relationship between losses and the level of persistence. As persistence increases, losses increase, peak and then come down again. The reason is simple. At = 13, the model is akin to the iid shock framework, where we know that the optimal contract is monotonic. At the other extreme, = 1, the optimal contract constitutes repititon of the static opitmum which too is monotonic. As we increase , the distortions vary and the probability of non-constant histories decreases. Thus, the loss in using monotonic contracts increases with the non-monotonicities only to be supressed in probability by the increasing weight of constant histories along which the optimal monotonic allocation converges to the optimal allocation. The results of this section may be useful in applied work. As mentioned in the introduction, many works in the applied literature postulate that the first-order approach works. The risk is that the contracts thus characterized are not incentive compatible. Further in the most natural environments, this risk can not be fully resolved by numerical methods.. To the extent that it is not possible to check all the incentive compatibility constraints, studying the optimal monotonic contract may be a more robust option, since it guarantees implementability and it is equal to the true optimal contract with high probability when types are highly persistent. 27 6 Conclusion In this paper we have studied a simple Principal-Agent model in which the agent’s type is private information and follows a Markov process. We have presented three sets of results. First, following the standard approach in the literature, we have studied the optimal contract when only local incentive constraints are considered. We have shown that the agent’s equilibrium rents can be represented purely as a function of the allocation through a dynamic version of the so called “envelope formula.” Moreover, as in static model, the envelope formula and a natural monotonicity condition on the allocation guarantee that the contract is implementable. Although this condition is only sufficient and quite strong, it is verified for virtually all the natural environments in which the optimal dynamic contract has been characterized in the existing literature. Second, and most importantly, we have shown that the environments for which the dynamic envelope formula is sufficient to characterize the optimal dynamic contract are quite special. In general, even in the simplest examples, the allocation is not monotonic and local incentive constraints are not sufficient for implementability. We have first proved this result in a simple case with three types and two periods where the optimal contract can be fully solved in closed form. The characterization shows that the optimal contract is characterized by what we call dynamic pooling: strategic, state contingent treatment of types in which types may be initially separated, but then be pooled conditioned on particular histories. We have then generalized the analysis showing that with types and periods the applicability of the first-order approach is even more limited: under quite weak conditions it fails when the time horizon is sufficiently important. The fact that the envelope formula is not sufficient to characterize implementability suggests that a new approach is required to study dynamic contracts in general environments. Finally, we have shown that some insights on the optimal contract in general environments with many types and periods can be gained by studying a simple class of suboptimal contracts: monotonic contracts, in which non monotonicities in the allocation are “ironed” out. The appeal of this class is that the optimal monotonic contract converges in probability to the optimal contract as the persistence of types converges to one, that is precisely when the first-order approach tends to fail. The analysis suggests a number of important research questions. The characterization of the optimal contract with three types and two periods suggests that state dependent pooling of types 28 plays an important role in dynamic screening. The example suggests a number of features that are natural to expect to hold in more general environments as well. The analysis in section 5, moreover, suggests that even when it is not possible to fully characterize the optimal contract, useful insights can be gained by studying contracts that are approximately optimal. the further development of these ideas for future research. 29 We leave References Athey S. and I. Segal (2007), “An Efficient Dynamic Mechanism,”working paper. Baron, D. and D. Besanko (1984), “Regulation and Information in a Continuing Relationship,” Information Economics and Policy, 1(3):267-302. Battaglini, M. (2005), “Long-term Contracting with Markovian consumers.” American Economic Review. 95, 637—658. Battaglini, M. (2007). “Optimality and Renegotiation in Dynamic Contracting,” Games and Economic Behavior, 60 (2), 213-246. Battaglini M. and S. Coate (2008). “Pareto Efficient Income Taxation with Stochastic Abilities,” Journal of Public Economics, 92:3-4, pp. 844-68. Bergemann D. and J. Valimaki (2010). “The Dynamic Pivot Mechanism,” Econometrica, 78(2), 771-789. Besanko, David (1985). “Multiperiod Contracts between Prinipal and Agent with Adverse selection,” Economic Letters, 17, 33-37. Boleslavsky Raphael and Said Maher. “Progressive Screening: Long-Term Contracting with Privately Known Stochastic Process,” Review of Economic Studies, forthcoming. Bolton, P. and M. Dewatripont (2005). Contract Theory. The MIT Press, Cambridge, MA. Chassang, S. (2011), “Calibrating Incentive Contracts,” working paper. Courty, P. and H. Li (2000), “Sequential Screening,” Review of Economic Studies, 67(4):697718. Eso, P and B. Szentes (2007), “Optimal Information Disclosure in Auctions and the Handicap Auction,” Review of Economic Studies, 74(3), 705-731. Farhi, E. and I. Werning (2011), “Insurance and Taxation over the Life Cycle, ” working paper. Garrett, D. and A. Pavan (2012), “Managerial Turnover in a Changing World, ,” working paper. Golosov, M., M. Troshkin, and A. Tsyvinski, (2011), “Optimal Dynamic Taxes,” working paper. Hartline, J. (2012), “Approximation in Mechanism Design,” American Economic Review P&P. Inderst R. and M. Ottaviani (2011), “How (not) to pay for advice: A framework for consumer financial protection,” working paper. Kapicka, M., (2011), “Information Economies with Persistent Shocks: A First-Order Approach,” working paper. Krahmer, D. and R. Strausz (2011), “The Benefits of Sequential Screening ” working paper. 30 Laffont, J.J. and D. Martimort (2002), The Theory of Incentives, Princeton University Press, Princeton, NJ. Laffont J.J. and J. Tirole (1990): “Adverse Selection and Renegotiation in Procurement.” Review of Economic Studies, 1990, 57(4), pp. 597—625. Laffont, Jean-Jacques and J. Tirole (1993), A theory of incentives in procurement and regulation. Cambridge, MA: MIT Press, 1993. Madarasz, K. and A. Prat (2012), “Screening with an Approximate Type Space,” working paper. Maestri, L. (2011), “Dynamic Contracting under Adverse Selection and Renegotiation, ” working paper. Milgrom, P. (2004), Putting Auction theory to Work, Cambridge University Press. Myerson R. (1981), “Optimal Auction Design, ” Mathematics of Operations Research. Pavan, A., I. Segal, and J. Toikka (2012), “Dynamic Mechanism Design,” working paper. Stole, Lars (2001), “Lectures on the Theory of Contracts and Organizations,” mimeo, The University of Chicago. Zhang Yuzhe (2009), “Dynamic Contracting with Persistent Shocks,” Journal of Economic Theory, 144, 635-675. 31 7 Appendix 7.1 Proof of Lemma 1 and Corollary 1 We first show that all the constraints in the relaxed problem can be assumed to hold as equalities. Lemma A1. In a FO-relaxed problem: (−1 ) can be assumed to hold as equality for all −1 ∈ −1 ; +1 (−1 ) can be assumed to hold as an equality for all −1 ∈ −1 and = 0 1 − 1. Proof. The proof of this result is in the on line appendix. The appendix is available for ¥ download at http://www.princeton.edu/˜mbattagl/OptDyn appendix.pdf. We can now prove Lemma 1 and Corollary 1 together. We shall proceed by (backward) induction on . Note that at = , Lemma A1 implies: ( | −1 ) = 0 and ( | −1 ) = ∆ − X =1 (+ | −1 ) ∀ ≤ − 1 (14) Similarly, for = − 1, we have for ≤ − 1: ( | −2 ) = ∆(+1 | −2 ) + (+1 | −2 ) + = − X =1 = ∆ X ( − (+1) ) ( | −2 +1 ) =0 # X ((+−1) − (+) ) ( | −2 + ) ∆(+ | −2 ) + " − X =1 " =0 −2 (+ | )+ −1 X =0 ((+−1) − (+) ) − X =1 −2 (+ | + ) # Now, it is easy to verify that: −1 X =0 ((+−1) − (+) ) − X =1 −2 (+ | + ) = "−1 X X =1 = X =1 =0 (+−1) − −1 X =0 (+) ( | −2 + ) ∆ ( |+ )( | −2 + ) It follows that we can write: # " − X X ( | −2 ) = ∆ ∆ ( |+ )( | −2 + ) (+ | −2 ) + =1 = ∆ − X =1 ⎡ ⎣(+ | −2 ) + =1 X X ̂∈( −2 + ) −1 32 # − Y =+1 (15) ⎤ ´ ³ ¯ ¯ ∆ ̂ ¯̂−1 (̂ |̂ −1 )⎦ It is easy to see that (14) and (15) prove the statement in Corollary 1 and in Lemma 1 respectively for = and = − 1. We therefore conclude that our hypothesis holds for ≥ − 1. Next, suppose it holds for + 1 where ≥ − 2. We want to show that it holds for . We have, X ¡ ¢ ¡ ¢ ( |−1 ) = ∆ +1 |−1 + +1 |−1 + ( − (+1) ) ( |−1 +1 ) = − X =1 = ∆ (16) =0 " # X ¡ ¡ ¢ ¢ −1 −1 ((+−1) − (+) ) | + ∆ + | + − X =1 =0 " à −1 − X X ¡ ¡ ¢ ¢ + |−1 + + |−1 + + ((+−1) − (+) ) =1 =0 X ̂∈(−1 + + ) X −(+1) +1 Y =+2 ³ ¯ ´ ³ ´ ¯ ∆ ̂ ¯̂−1 ̂ |̂ −1 !# where the third equality follows from the induction hypothesis. Now, −1 X =0 ((+−1) − (+) ) − X =1 −1 (+ | + ) = X =1 ∆ ( |+ )( |−1 + ) (17) and, −1 X =0 ((+−1) −(+) ) − X X −1 X =0 ((+−1) − (+) ) X X = Y −(+1) ̂∈(−1 + ) +1 =+2 With some algebra we can write: −1 X =0 ((+−1) − (+) ) Combining (17) and (18) we obtain: X =1 =δ ¤ £ ∆ ( |+ ) ( |−1 + ) + X =1 = ⎡ ∆ ( |+ ) ⎣( |−1 + )+ X X ̂∈(−1 + ) − −(+1) =1 ̂∈(−1 + + ) +1 = where, X Y =+1 − X Y =+2 − X ³ ¯ ´ ³ ´ ¯ ∆ ̂ ¯̂−1 ̂ |̂ −1 + =1 ³ ¯ ´ ³ ´ ¯ ∆ ̂ ¯̂−1 ̂ |̂ −1 + = =1 X =1 ∆ ( |+ ) (18) (19) X X ̂∈(−1 + ) +1 ³ ¯ ´ ³ ´ ¯ ∆ ̂ ¯̂−1 ̂ |̂ −1 33 −(+1) Y =+2 ⎤ ³ ¯ ´ ³ ´ ¯ ∆ ̂ ¯̂−1 ̂ |̂ −1 ⎦ Combining (16) and (19), we obtain: ⎡ − X ¡ ¡ ¢ ¢ ⎣ + |−1 + |−1 = ∆ =1 This proves Corollary 1. X X − ̂∈(−1 + ) Y =+1 ⎤ ³ ¯ ´ ¯ ∆ ̂ ¯̂−1 (̂ |̂ −1 )⎦ Subtracting (+1 |−1 ) and dividing by ∆ from the above ex- pression gives us Lemma 1. 7.2 Proof of Proposition 2 ¯ ¯ ¯ Recall that ∆ ( ¯−1 ) = ( ¯−1 ) − ( ¯−1 +1 ). We start with a useful lemmata. ¯ Lemma A2. If ( |−1 ) and ∆ ( ¯−1 ) are non increasing in for any −1 , then (5) implies that local upward incentive compatibility constraints are satisfied. Proof. The proof of this result is in the on line appendix. Lemma A3. ¥ ¯ If ( |−1 ) and ∆ ( ¯−1 ) are non increasing in for any −1 and (5) holds, then the local incentive compatibility constraints imply the global incentive compatibility constraints. Proof. The proof of this result is in the on line appendix. ¥ Given the lemmas presented above, Proposition 2 is proven if we establish that when the ¯ allocation is monotonic as defined in Definition 2, then ( |−1 ) and ∆ ( ¯−1 ) are non increasing in for any −1 . The fact that ( |−1 ) is non increasing in for any −1 is an ¯ immediate consequence of the monotonicity. The fact that ∆ ( ¯−1 ) is non increasing in for any −1 is established by the following result. ¯ Lemma A4. If the allocation is monotonic, then ∆ ( ¯−1 ) is non increasing in i for any −1 . ¯ ¯ ¯ Proof. Note first that ( ¯−1 ) = ( ¯−1 +1 ) = 0, so ∆ ( ¯−1 ) = 0. By Lemma 1, we have: (−1 |−1 )∆ = ( |−1 ) + X X ̂∈(−1 −1 ) +1 34 −−1 Y =+2 ´ ³ ¯ ¯ ∆ ̂ ¯̂−1 (̂ |̂ −1 ) (20) It is useful to write this expression with a different notation. Let () be set of realizations of length − that start with the first element equal to (we denote is the typical element of (), so 1 = ). A possible history of length ≥ following with ( + 1)-th element equal © ª to (+1 = ) is then = { − } for ∈ () (by convention we write = 0 ). We can then write: ¯ (−1 ¯−1 )∆ = ( |−1 ) + ⎡ ⎤ ∆ ( | ) · −1 X X ⎢ ⎥ ⎥ =+2 −−1 ⎢ ⎣ ⎦ ∈ (−1) +1 ( |−1 −−1 ) Q (21) Similarly we can write: ¯ (−1 ¯−1 +1 )∆ = ( |−1 +1 ) + X ∈ (−1) X +1 ⎡ ⎢ −−1 ⎢ ⎣ Q =+2 (22) ⎤ ∆ ( | −1 ) · ( |−1 +1 −−1 ) ⎥ ⎥ (23) ⎦ Therefore we have: ¯ ∆ ( −1 ¯−1 )∆ = ( |−1 ) − ( |−1 +1 ) + X ∈ (−1) X +1 −−1 " Y =+2 # ⎡ ⎢ ∆ ( | −1 ) ⎢ ⎣ −1 ( | −−1 ) −( |−1 +1 −−1 ) Note that by monotonicity, we must have ( |−1 ) − ( |−1 +1 ) ≥ 0 and ⎤ ⎥ ⎥ ⎦ ( |−1 −−1 ) − ( |−1 +1 −−1 ) ≥ 0 ¯ ¯ It follows that ∆ ( −1 ¯−1 ) ≥ ∆ ( ¯−1 ). ¯ Assume now that ∆ ( ¯−1 ) is ¯ monotonic in for ≥ . We prove the result by induction proving that ∆ (−1 ¯−1 ) ≥ ¯ ∆ ( ¯−1 ). Applying Lemma 1 and using the notation developed above, we have: 35 ¯ ∆ (−1 ¯−1 )∆ ¯ = ∆ ( ¯−1 )∆ + ( |−1 ) − ( |−1 +1 ) ⎡ " # X X Y ⎢ ( |−1 −−1 ) + −−1 ∆ ( | −1 ) ⎢ ⎣ =+2 ∈ (−1) +1 −( |−1 +1 −−1 ) ¯ ¯ And so by monotonicity of the allocation we have ∆ (−1 ¯−1 ) ≥ ∆ ( ¯−1 ). 7.3 ⎤ ⎥ ⎥ ⎦ ¥ Proof of Lemma 2 First, we prove a useful lemma that will be invoked in the proof of Lemma 2. Lemma A5. The optimal solution satisfies: ≤ and () ≤ . Proof. The proof of this result is in the on line appendix. ¥ Now, we show that binds. Suppose not. Decrease by the same small amount. The first period incentive compatibility constraints continue to hold and the second period constraints are unaffected. This increases the profit of the monopolist without disturbing any of the constraints, giving us a contradiction. Thus, = 0. Next, we show that binds. Suppose not. Decrease by . Then, all the constraints are satisfied and we increase the monopolist’s profit, giving us a contradiction. Using these two binding constraints we can eliminate and from the maximization problem. In particular, can now be written as ≥ ∆ ( + ) + 3 − 1 [( ( ) − ( )) + ( () − ())] 2 Also, is given by ≥ 2∆ + 3 − 1 [ () − ()] 2 First, note that at least one of and must bind. If not, then we can decrease and increase the monopolist’s profit. Suppose does not bind. Then, must bind. Thus, we can eliminate from the maximiztion problem. In particular, can now be written as ∆ + 3 − 1 3 − 1 [ () − ()] ≥ ∆ + [ ( ) − ( )] 2 2 36 Second, we claim that if and bind and does not bind, then () binds. Suppose () − () ∆ (). Decrease () by , thereby, increasing the profit of the monopolist without disturbing any of the remaining constraints, giving us a contradiction. Thus, () must bind. Rewriting , using ( ) and the binding () we get ∆ + 3 − 1 3 − 1 ∆ () ≥ ∆ + ∆ 2 2 Since does not bind, it is easy to see that = . In Lemma A5, we showed that ≤ (and thus ), and () ≤ . These clearly contradict the above inequality. Thus, we must have that binds. 7.4 ¥ Proof of Lemma 3 We prove the lemma as follows. Let U = ( ) be the vector of expected utilities, mapping an history to the corresponding agent’s expected utility. First, we construct a vector of utilities U using the solution of the WR-problem, hω qi. We then show that the solution hU qi satisfies all the constraints of the seller’s profit maximization problem and it maximizes profits. We proceed in two steps: Step 1. We set ( ) (), () all equal to zero. We also define: ( ) = ( ) () = () () = ∆ () ( ) = ( ) + ( ) () = () + () () = ∆ ( () + ()) Since , and hold as an equality, we must have: = 0 = ∆ + = 3 − 1 () and 2 3 − 1 + ∆ + ( ) 2 Step 2. We now show that hU qi satisfies all the constraints of the profit maximizing problem. By construction it is immediate that hU qi satisfies all the constraints in the WR-problem. It 37 reamains to be shown that it also satisfies the other constraints, (24) () () () ( ) () () () () ()) ( ) ( ) ( ) () () () First, we show that is satisfied. From we have 3 − 1 [ () − ()] 2 3 − 1 = ∆ + [Using ] [ () − ()] 2 3 − 1 ≥ ∆ + [Using ()] ∆ () 0 2 = + ∆ + Similarly, we can show that is satisfied. To prove the remaining constraints we need the following properties of the solution of the WR-problem. Lemma A6. For all parameter configurations, in the WR solution we have: 1. () = for = , ( ) ( ) ≤ and ( ) ≥ () 2. ( ) = ∆ ( ) ( ) = ∆ ( ) () = ∆ (); 3. quantities at = 2 are nondecreasing in type after any history; 4. ≥ ≥ . Proof. The proof of this result is in the on line appendix. ¥ Consider the first period constraints. To show that holds it is sufficient to prove: ∙ ¸ 1− 1− 0 = ≥ + () + (25) ( ) + ( ) 2 2 3 − 1 = − ∆ − ( ) 2 3 − 1 = − ∆ − ( ) 2 Since = ∆ + 3−1 2 (), (25) can be written as: + 3 − 1 3 − 1 ( ) ≥ + () 2 2 The fact that tis inequality is satisfied follows from Point 1 and 4 in Lemma A6. (In the following, when we mention a point, we refer to the points of Lemma A6.) 38 Next, we show that holds. From we have: = + ∆ + 3 − 1 [ ( ) − ( )] 2 Thus, 3 − 1 [ ( ) − ( )] 2 3 − 1 = − ∆ − [ () − ()] 2 3 − 1 + ∆( − ) + [( () − ()) − ( ( ) − ( ))] 2 3 − 1 − ∆ − [ () − ()] 2 = − ∆ − The last inequality follows from the observation that: () − () ≥ ∆ () = ∆ ∆ ( ) = ( ) − ( ) (26) where the first inequality follows from the definition of (), the first equality and the second inequality follow from Point 1. From (26) and the fact that (Point 4), it follows that holds. We now turn to . Using first and then , we have: 3 − 1 [ ( ) − ( )] 2 3 − 1 3 − 1 ≥ − ∆ − [ () − ()] − ∆ − [ ( ) − ( )] 2 2 3 − 1 = − 2∆ − [ () − ()] 2 3 − 1 + ∆ ( − ) + [( () − ()) − ( ( ) − ( ))] 2 3 − 1 − 2∆ − [ () − ()] 2 ≥ − ∆ − The last inequality follows from the observation that: () − () ≥ ∆ () = ∆ ≥ ∆ ( ) = ( ) − ( ) (27) where the first inequality follows from the definition of (), the first equality and the second inequality follow from Point 1. From (27) and (Point 4), it follows that holds. Consider now the second period constraints. The constraints ( ), () (), (), and ()) follow immediately by the definition of the utilities at = 2. The proof that hU qi solves the seller’s problem is therefore completed if we prove that it satisfies the 39 constraints in the last two lines of (24). This result follows from the fact that the local downward incentive constraints are satisfied in period 2 and quantities are weakly monotonic after any history (Point 3). Finally, to see that the contract is optimal, we note that it maximizes expected profits in the less restricted WR-problem, so it must be optimal in the seller’s problem. Note moreover that since the original problem is concave in this is in fact the unique solution (in quantities). ¥ 7.5 Proof of Lemma 4 For the reminder of the proof, it is useful to state the first order conditions of the WR-problem. It is easy to see that the type always gets the efficient quantity. After history , moreover, quantities are always efficient, implying: = ( ) = () = and () = () = () = . The remaining first-order conditions are given by: [ ] : ( − ) − ∆ + ∆ = 0 [ ] : ( − ) − ( + ) ∆ − ∆ = 0 [ ( )] : ( − ( )) − ( )∆ + ( )∆ = 0 1− ( − ( )) − ( )∆ = 0 2 1− [ ()] : ( − ()) − ()∆ + ()∆ = 0 2 [ ( )] : [ ()] : ( − ()) − ()∆ = 0 [ ( )] : − 3 − 1 3 − 1 + + ( ) = 0 2 2 [ ( )] : ( ) − ( ) = 0 [ ()] : [ ()] : 3 − 1 + () = 0 2 3 − 1 − ( + ) + () − () = 0 2 − We can now proceed with the proof. In the reminderof this section, we firs characterize the optimal allocation assuming = 0. We then derive the conditions under which the assumption of = 0 is admissible. Assuming = 0, we have = − ∆ and = − 40 + ∆ (28) Clearly, = 0 implies () = 0. Also, it is easy to show that () = 0, else () , which contradicts lemma A5. We therefore have () = ( + ) 3−1 2 , and the solution after history is given by: () = and () = − + 3 − 1 ∆ 2 and ( ) = ( ). Next, note that we must have ( ) = 3−1 2 (29) We have two possible cases: Case 1 (Region A1). ( ) = ( ) = 0. In this case: ( ) = − 3 − 1 ∆ 2 For this to be a solution, we must have − and ( ) = 3−1 2 ∆ 0 ( ) = (30) ≥ , so ≤ 0 ( ) where 3 − 2 We conclude that for ≤ 0 ( ) the solution is given by = , () = , () = for all = in addition to (28)-(30). Case 2 (Region A2). ( ) = ( ) 0. Then, ( ) and ( ) are both equal to a constant . From the first order condition with respect to ( ) and ( ) we have: ( ) = ( ) = 2 1− 3 − 1 + − ∆ 1+ 1+ 1 + (31) We conclude that for 0 ( ) the solution is given by = , () = , () = for all = , (28)-(29) and (31). To characterize the necessary and sufficient condition for = 0, we need to verify that given the solution defined above, is satisfied. Plugging in the values of Case 1, we obtain: − 3 − 1 ∆ + 2 µ − 3 − 1 ∆ 2 ¶ ≥ − + 3 − 1 ∆ + 2 (32) that is, ³ ¡ ¢2 ´ (1 − ) 1 + 3−1 2 ∗ ³ ≥ ¡ ¢ ´ = 1 () 3−1 2 1 + 1 + 2 41 (33) Plugging in the values of Case 2, we obtain: 3 − 1 − ∆ + 2 µ 1− 3 − 1 2 + − ∆ 1+ 1+ 1 + ¶ ≥ − + 3 − 1 ∆ + 2 (34) that is, ³ ´ 2 (1 − ) 1 + (3−1) 2(1+) ´ = ∗2 () ³ ≥ 3−1 1 + 1 − 1+ (1 − 2) (35) Let us define ∗ () = min{∗1 () ∗2 ()}. We have the following result. Lemma A7. If is such that ≥ ∗ () and ≤ 0 ( ) then the optimal contract is as described in Case 1 presented above. If ≥ ∗ () and 0 ( ) then the optimal contract is as described in Case 2 presented above. Proof. We first prove that when ≤ 0 ( ), then ≥ ∗ () implies ≥ ∗1 (). To this end, we prove the counterpositive: when ≤ 0 ( ), ∗1 () implies ∗ (). Note that: 1. the left hand side of (32) and (34) are the same; 2. the right hand side of (32) is not larger than the right hand side of (34) if and only if ≤ 2 3−1 , that is if ≤ 0 ( ). It follows that if ∗1 (), then neither (32) nor (34) hold, implying ∗2 () as well: we therefore conclude that ∗ (). Given this, the conditions ≥ ∗ () and ≤ 0 ( ) imply the conditions ≥ ∗1 () and ≤ 0 ( ), so by the discussion presented above, the allocation described in Case 1 is an optimal solution of the WR-problem. By a similar argument, we can prove that when 0 ( ), then ≥ ∗ () implies ≥ ∗2 (). This implies sthat when we have ≥ ∗ () and 0 ( ), then the allocation described in Case 2 is an optimal solution of the WR-problem. ¥ Finally note that Case 1 and Case 2 described above are the only possible allocations consistent with = 0. So, if ∗ (), the largrange multiplier of must be binding. 7.6 Proof of Proposition 3 The result follows from Lemma A7. 7.7 ¥ Proof of Proposition 4 We first prove a useful lemma. 42 Lemma A8. The optimal solution satisfies: ≤ − + ∆, () ≤ − + 3−1 2 ∆ and ( ) ≤ . Proof. The proof of this result is in the on line appendix. ¥ Keep in mind that 0 ⇒ () 0. It follows from the first order condition with respect to (). Next, in order to characterize the quantities after history , we prove a useful lemma. Lemma A9. 0 ⇒ ( ) 0. Proof. The proof of this result is in the on line appendix. ¥ We divide the proof of Proposition 4 into two steps. First, in Section 7.7.1, we assume that () is not binding and we characterize the parameter region in which this assumption is correct. This will allow us to define the regions B1 and B2 described in Proposition 4. Then, in Section 7.7.2, we characterize the optimal contract when () is binding, region B3. 7.7.1 Characterization of Regions B1 and B2 Let us assume () = 0. Since ∗ (), we have 0. From the first order conditions, we obtain: − + + ∆, = − ∆ 3 − 1 + 3 − 1 − ∆, () = − ∆ 1 − 2 = − () = (36) Since 0, we have ( ) 0 and () 0. Thus, + 3 − 1 3 − 1 ( ) = + () 2 2 (37) There are two relevant cases. We use 1 to denote from Case 1 and 2 from Case 2. Case 1 (Region B1). ( ) = ( ) = 0. Then, from the first-order conditions: ( ) = − − 1 3 − 1 ∆ 2 and ( ) = (38) Substituting, the values from (36)-(36) and (38) in equation (37) we obtain: 1 + 1 3 − 1 1 3 − 1 3 − 1 − 1 3 − 1 − 1 + + = 2 1 − 2 2 43 (39) which gives: 1 = 1 () = ¡ ¢ 3−1 1 + 3−1 − 1 2 2 ³ ´ ¡ ¢ 3−1 3−1 1 + 3−1 + 1 1 + 3−1 2 2 2 1− 1 (40) Clearly, for this case to be valid, we must justify the assumption that ( ) = ( ) = 0. A necessary and sufficient condition for this is ( ) ≥ ( ). Given (38), this condition can be rewritten as: −1 3−1 2 ≤ 1, where 1 is given by (40). This condition is implied by: ≥ 1 + (1 − )0 () − 0 ()0 () = 0 () 0 () (1 + 0 ()) where 0 () = 1 + 3 − 1 3 − 1 3 − 1 3 − 1 2 0 () = 1 + and 0 () = 2 2 2 1− 3 − 1 It follows that (under the assumption that () = 0) the solution is given by (36)-(36), (38) and (40) when when ≥ 0 (). Case 2 (Region B2). For 0 () we must have ( ) = ( ) 0. In this case, we must have: ( ) = ( ) = 2 1− − 2 3 − 1 + − ∆ 1+ 1+ 1+ Substituting ( ) and () equation (37) we obtain: µ ¶ 1 + 2 3 − 1 2 3 − 1 1 − 3 − 1 − 2 3 − 1 − 2 + + − = 2 1 − 1+ 2 1+ (41) (42) which gives 2 = ³ ´ ³ ´ 3−1 1 3−1 1− − 1 + 3−1 − 2 1+ 2 1+ ³ ´ ³ ´ 3−1 3−1 1 3−1 3−1 1 + 2 1+ + 1 + 2 1− 1 (43) It follows that (under the assumption that () = 0) the solution is given by (36)-(36), (41) and 43 when 0 (). We now complete the analysis of this section by characterizing the conditions under which we can ignore the () constraint and so () = 0. It is easy to see that () is satisfied if and only if () ≥ (). We have () ≥ () if and only if: µ ¶−1 µ ¶ 1 − 3 − 1 1 3 − 1 1+ ≤ 1 − 2 44 (44) Thus, for Case 1 we have, ¡ ¢ ¶−1 µ ¶ µ 3−1 3−1 − 1 1 − 3 − 1 1 3 − 1 1 + 2 2 ³ ´ 1+ ≤ ¡ ¢ 3−1 1 3−1 3−1 1 − 2 + 1 1 + 3−1 1 + 2 2 2 1− Define 3 − 1 3 − 1 3 − 1 3 − 1 1 ( ) = 1 + and 2 2 2 1− µ ¶−1 µ ¶ 1 3 − 1 1 − 3 − 1 1+ 1 ( ) = 1 − 2 1 ( ) = 1 + We can then write the previous inequality as: ≥ 1 ( ) [1 − − 1 ( )] = ∗∗ 1 () 1 + 1 ( ) + 1 ( ) 1 ( ) Next, for Case 2, we have () ≥ () iff, ³ ´ ³ ´ 1− 3−1 3−1 µ ¶−1 µ ¶ − 1 − 3−1 1 + 2 1+ 2 1+ 1 3 − 1 1 − 3 − 1 ³ ´ ³ ´≤ 1+ 3−1 1 3−1 3−1 1 − 2 + 1 1 + 3−1 1 + 2 1+ 2 1− Define 3 − 1 3 − 1 1 3 − 1 1 − − 2 ( ) = 2 1+ 2 1+ µ ¶−1 µ ¶ 1 3 − 1 3 − 1 3 − 1 1 − 3 − 1 2 ( ) = 1+ 2 ( ) = 1 + 1 − 2 2 1− 2 ( ) = 1 + Rearranging, we obtain: ≥ 2 ( ) [1 − − 2 ( )] = ∗∗ 2 () (2 ( ) + 2 ( )) + 2 ( ) 2 ( ) ∗∗ Let us define ∗∗ () = min {∗ () ∗∗ 1 () 2 ()} We have: Lemma A10. If ∈ [∗∗ () ∗ ()] and ≥ 0 (), then the solution of the WR-problem is given by the solution in Case 1 presented above. If ∈ [∗∗ () ∗ ()] and 0 (), then the solution of the WR-problem is given by the solution in Case 2 presented above. ∗ Proof. We first show that if ∈ [∗∗ () ∗ ()] and ≥ 0 (), then ∈ [∗∗ 1 () ()] and ≥ 0 (). This implies that the solution is given by Case 1. Assume ∗∗ 1 (). In this case, (44) does not hold with 1 . This implies that (44) does not hold with 2 as well if 2 ≥ 1 . Subtracting equation (42) from equation (39), we get ∙ µ ¶¸ ∙ ¸ 1 3 − 1 1 3 − 1 1 1 3 − 1 3 − 1 1 − − 1 3 − 1 + + (1 − 2 ) + = −1 2 1 − 1 + 2 1+ 2 (45) 45 So, we have that 2 ≥ 1 if: − 1 3 − 1 − 1 ≤ 0 2 which is implied by ≥ 0 (). ∗∗ It follows that if ∗∗ 1 (), then (), a contradiction. We conclude that it must be ≥ ∗∗ 1 (). ∗ We now show that if ∈ [∗∗ () ∗ ()] and 0 (), then ∈ [∗∗ 2 () ()] and 0 (). This implies that the solution is given by Case 2. Assume ∗∗ 2 (). In this case,(44) does not hold with 2 . This implies that (44) does not hold with 1 as well if 1 ≥ 2 . From (45) we have that this always true if 0 (). It follows that if ∗∗ 2 (), then ∗∗ (), a contradiction. We conclude that it must be ≥ ∗∗ 2 (). 7.7.2 ¥ Characterization of Region B3 Finally, we characterize the contract when ∗∗ () and so both 0 and () 0. This is region B3. In this case: = − − ∆ and = − + + ∆ (46) We also have that () 0 implies () = (), so: () = () = 1− 2 + + 3 − 1 + − ∆ 1+ 1+ 1+ From Lemma A8, we have () ≤ − + 3−1 2 ∆. (47) Also, when () 0, the above inequality is strict. Thus, substituting the optimal value of (), we obtain: 1− 3 − 1 + 3 − 1 + 0 1 − 2 (48) Note that as () converges to zero, (48) is the exact violation of ≥ ∗∗ (), that is, inequality (44). To characterize the quantities after history , we now show that ( ) = ( ) 0. Lemma A11. () 0 ⇒ ( ) = () 0. Proof. The proof of this result is in the on line appendix. ¥ It follows that ( ) = ( ) = 2 1− − 3 − 1 + − ∆ 1+ 1+ 1 + 46 Finally, substituting the optimal values in as equality, we obtain: ¶ µ − + + =0 + 1− (49) that implies = . Note that equation (49) gives the value of , which uniquely defines the solution at the optimum. In particular note that type and are treated as one, that is, = and ( ) = ( ) = () = () (50) We conclude that the solution of the WR-problem in region B3 ( ∗∗ ()) is given by: (46),(47), (50) and (49). Table 1 summarizs the solution of the WR-problem describing the optimal allocation for each possible case. 7.8 ¥ Proof of Proposition 5 We prove the result by showing that, under the conditions stated in proposition, the FO-optimal contract violates the second period global incentive constraint −2 (−1 ). To this end, we first make a few useful observations. As → 1, the Markov matrix ( ) converges to the identity matrix, so ∆ ( | ) → 1 and ∆ ( | ) → 0 for ∀ 6= In addition, since ⎧ ¢ P ¡ ⎪ ⎪ ⎪ (−1) − ≥ ⎨ ∆ ( | ) ≤−1 = ⎪ P ⎪ ⎪ ( − ) ⎩ −1 ≥ we have that ∆ ( | ) is independent of for any 6= and, as it is easy to verify, ∆ ( |−1 ) −1 =1− −2 −1 1 and ∆ (−1 | ) −1 =1− Lemma A12. −2 ( −1 ) holds if and only if ⎡ ∆ (( −1 | −1 ) − ( | −1 )) + X =0 ⎢ ⎢ ⎣ −1 −1 1. Finaly, we have that: ¡ ¢ ( −2) − ( −1) · ( ( | −1 −1 ) − ( | −1 )) ⎤ ⎥ ⎥≥0 ⎦ (51) where ( | −1 ) = ∗ ( −1 ; q), as defined in (5). Proof. The global incentive compatibility constraint −1+1 ( ) can be written as: (−1 | ) − (+1 | ) ≥ 2∆(+1 | ) + 47 X ¡ ¢ (−1) − (+1) ( | +1 ) =0 (52) Note that (−1 |−1 ) − (+1 |−1 ) = ( (−1 | −1 ) − ( | −1 )) + ( ( | −1 ) − (+1 | −1 )) So using −1 ( −1 ) and +1 ( −1 ), we have: ⎡ (−1 |−1 ) − (+1 |−1 ) ⎢ ⎢ ¡ ⎣ ¢ P −2∆(+1 | −1 ) − (−1) − (+1) ( |−1 +1 ) =0 ∆ (( | −1 ) − (+1 | −1 )) + ⎤ ⎥ ⎥= ⎦ (53) X ¡ ¢ (−1) − [ ( |−1 ) − ( |−1 +1 )] =0 Using (52) and (53), it easy to conclude that that −2 (−1 ) fails if and only if (51) fails. ¥ We now prove that, under the conditions of the Proposition, (51) is violated. Note first that " P µ ¶# 1− ∆ (−1 | −1 ) ∆ ( | −1 ) = −1 − ∆ (−1 |−1 ) − ( |−1 ) = 1 − −1 −1 −1 −1 ∙ ¸ 1− =−1 |−1 ) so lim→1 [( −1 | −1 ) − ( | −1 )] = 1 − (1 − ) ∆, where = ∆(−1 −1 1. We also have that: X ¡ ¢ ( −2) − ( −1) [ ( | −1 −1 ) − ( | −1 )] =0 ⎤ ⎡ ⎢ (−2 |−1 −1 ) − ( −1 |−1 −1 ) ⎥ ⎥ + () = ⎢ ⎦ ⎣ − ( −2 | −1 ) + ( −1 | −1 ) (54) (55) where () is such that () → 0 as → 1. Let ̃ () be an history in which the realization is in evert period for periods. Using (4), we have: ⎤ ⎡ ⎢ ( −2 | −1 −1 ) ⎥ X ⎥= ⎢ (∆ ( −1 | −1 ))−3 ( −1 |̃−1 (−1 ))∆ + () ⎦ ⎣ =3 − ( −1 | −1 −1 ) Taking the limit as → 1 we have (−1 |̃−1 ( −1 )) → −1 − 1. So we can write: ⎡ ⎤ 1− =−1 ∆, −1 and ∆ (−1 | −1 ) → # " P X ⎢ ( −2 | −1 −1 ) ⎥ 1 − =−1 −2 ⎥ = lim ⎢ ∆ ∆ −1 − ⎦ →1 ⎣ −1 =3 − (−1 |−1 −1 ) # " P 1− 1 − −2 = −1 = ∆ ∆ −1 − 1− −1 48 Similarly, we have: [ ( −2 | −1 ) − ( −1 | −1 )] = X (∆ ( −1 | −1 ))−3 ( −1 | −1 ̃−3 ( −1 ))∆ + () =3 To interpret this expression, note that (−1 |−1 ̃−3 (−1 )) is the quantity after a realn o ization −1 following the history −1 = −1 ̃−3 (−1 ) . Taking the limit: # " P 1 − = −1 1 − −2 lim [ ( −2 | −1 ) − ( −1 | −1 )] = ̂∆ ∆ −1 − →1 1− −1 where ̂ = ∆ ( |−1 ) ∆ (−1 | ) −1 −1 1. We therefore have that: ⎤ ⎡ −1 X ¡ ¢ ⎢ ( | −1 −1 ) ⎥ ⎥ = −1 − ( −2) − (−1) ⎢ lim ⎦ ⎣ →1 1− =0 − ( |−1 ) " 1− P = −1 −1 # (1 − ̂) ∆ We conclude that as → 1, the left hand side of inequality (51) converges (modulo a factor of proportionality ∆) to: 1− 1− P =−1 −1 ∙ ¸ 1 − −2 1−+ (1 − ̂) 1− ¥ Clearly, there exist thresholds for and above which the this expression is negative. 7.9 Proof of Proposition 6 We proceed in two steps. ¢ ¡ ¯ Step 1. We say that a quantity ¯−1 is distorted downward (respectively, upward) if ¡ ¯ ¢ ¢ ¡ ¯ ¯−1 ≤ (respectively, ¯−1 ). We first show that in the optimal monotonic contract distortions are all downward. Consider the constraint set of (13), as described by M. Define, ¢ ¡ Γ −1 = ⎧ ⎪ ⎪ ⎨ b −1 |∃ ≤ − 1 s.t. given −1 = for some = 0 1 − 1; ⎪ ⎪ ⎩ we have b −1 = +1 and −1 =b −1 ∀ 6= ⎫ ⎪ ⎪ ⎬ ⎪ ⎪ ⎭ ¡ ¢ Thus, Γ −1 is the set of histories that differ from −1 only once and the distinct element is contiguous lower type. It is easy to see that a contract is monotonic if and only if for any history 49 ´ ³ ¡ ¢ ¡ ¢ ¡ ¢ −1 : 1. |−1 ≥ +1 |−1 for all ; and, 2. |−1 ≥ |b −1 for all and ¡ ¢ for all b −1 ∈ Γ −1 . Next, we introduce the following complete order on the set of all histories at time . For ³ ´ −1 , let ∗ −1 b −1 be the first period in which they diverge: any two histories −1 and b ³ ´ n o ³ ´ −1 ∗ −1 b −1 b −1 = min 0 ≤ ≤ − 1 s.t. −1 = 6 , with = − 1 if −1 = ∗ −1 b b −1 . We say that −1 º∗ b −1 if −1 ≥b −1 . It is easy to verify that the −1 ) −1 ) ∗ (−1 ∗ (−1 −1 order º∗ is complete, so without loss we can order the histories at time from largest ( ) to smallest (−1 ), where the largest (smallest) history has all realizations equal to 0 ( ). Also, ¡ ¢ −1 ∈ Γ −1 . note that, −1 º∗ b −1 for all b Consider period , and the smallest history of length − 1 (−1 ), in which all the realiza¡ ¢ tions are . It is immediate to see that |−1 can not be distorted upward. To see ¢ ¡ ¯ If it were distorted upthis note that ¯−1 is on the left hand side of no constraint.34 ¢ ¡ ward, then a marginal decrease in |−1 would relax all constraints and increase surplus. ¢ ¡ Now, consider −1 |−1 : this quantity appears on the left hand side of only one constraint: ¢ ¡ ¢ ¡ −1 |−1 ≥ |−1 . If this constraint is not binding, then by the argument presented ¡ ¢ ¢ ¡ ¢ ¡ above, −1 |−1 ≤ −1 . Assume it is binding. In this case −1 |−1 = |−1 ≤ ¡ ¯ ¢ ≤ −1 . Proceeding inductively with a similar argument, we can prove that ¯−1 ≤ for all . Note that the case for first period qunatities, when the history is just the empty set, is already covered by the above paragraph. Thus, now we consider ≥ 2. Assume, as an induction step, that ¡ ¢ −1 º∗ −1 , such that b −1 º∗ −1 º∗ −1 implies |−1 ≤ there is a history b −1 , where b −1 −1 º∗ −1 −1 6= for all . Let us also introduce a useful definition. For any −1 with £ ¤+ and ≥ 2, define −1 as the smallest history larger than −1 according to the order º∗ in the ª ¤+ © £ following inductive way. If = 2, then −1 = −1 (−1 ) −1 −1 + ∆ ; if 2 then: ⎧ ¢ ⎪ ⎪ ¡ −1 −1 −1 £ −1 ¤+ ⎨ −1 ( ) −1 + ∆ if −1 0 = ³£ ´ ¤+ ⎪ ⎪ ⎩ = −1 (−1 ) if −1 0 −1 µ h i+ ¶ where projects the first elements of a vector.35 We intend to show that | b ≤ −1 34 We say that a quantity is on the left hand side of a given constraint if in that constraint it must be larger than some other quantity. 35 −1 Recollect that −1 is a vector of length : −1 = (−1 −1 −1 = ∅. So, −1 (−1 ) = 0 1 −1 ), where 0 50 µ h i+ ¶ −1 b for all . Now, | appears on the left hand side in the following constraints: µ µh h ³ i+ ¶ ´ i+ ¶ | b −1 −1 for all e −1 ≥ |e . If none of these constraints bind, −1 ∈ Γ b then as before, we have the desired inequality. Suppose at least of them binds. Clearly, by the µh i+ i+ ¶ h −1 −1 ∗ e −1 −1 −1 b e b b , we have º for all ∈Γ . Thus, by the induction definition of µh ³ ´ i+ ¶ hypothesis |e −1 ≤ for all e −1 ∈ Γ b −1 . Since the the inequality constraint µ h ³ i+ ¶ ´ binds for some e −1 , we have | b −1 −1 ≤ . = |e µ h i+ ¶ Next, consider | b −1 . It appears on the left hand side in the following constraints: µ ¶ µ µ h h h ³ i+ i+ ¶ i+ ¶ ´ −1 −1 −1 b b b −1 | −1 for all e ≥ | and −1 | ≥ −1 |e −1 ∈ µh ¶ i+ Γ b −1 . If none of these constraints bind, then as before, we have the desired inequality. If µ h i+ ¶ −1 b the first one binds then, −1 | ≤ −1 . If any of the latter one binds, then invoking the induction hypothesis, as argued in the case above, we have the desired inequality. ¢ ¡ Proceeding inductively, we can show |−1 ≤ for all and −1 . Step 2. We now prove that the allocation is asymptotially efficient. Consider problem (13). From ¢ ¡ ¢ ¡ this problem eliminate the constraint 0 |0 ≥ 1 |0 and all the monotonicity constraints that involve quantities following an history in which the agents reports to be a type 0 . It is easy to see that in this problem the quantities offered after the agent reports (or has reported) ¡ ¢ −1 −1 = to be 0 are efficient: |−1 = for = 0 and/or ∀−1 ∈ ≥ 2, where ª © −1 ¯ ¯∃ ≤ − 1 s.t. −1 = 0 . Following the same approach as in Step 1, it can be shown that the solution of this relaxed problem is monotonic and so it coincides with the optimal monotonic contract. Since the probability of the event in which no type realization in periods is equal to 0 converges to zero as → ∞, this solution, and so the optimal monotonic contract, is asymptotically efficient. 7.10 ¥ Proof of Proposition 7 We prove that for any given , the optimal monotonic contract converges in probability to the optimal contract. Let Π (), Π () and Π∗∗ () be the profits obtained by the seller from, respectively, the repetition of the optimal static contract, the optimal monotonic contract and the (−1 −1 0 −2 ). 51 optimal contract when the discount is . Because the repetition of the optimal static contract is a monotonic dynamic contract, we must have Π () ∈ [Π () Π∗∗ ()]. Now note that when types are constant and = , it is well known that the repetition in every period of the optimal static contract is optimal.36 Since Π () Π () and Π∗∗ () are continuous in by the theorem of the maximum, we must have that for any sequence → and 0 there must be a such that for , we have |Π ( ) − Π∗∗ ( )| ≤ |Π ( ) − Π∗∗ ( )| ¡ ¢ ¢ ¡ Let |−1 and ∗∗ |−1 be the quantity chosen in the optimal monotonic contract and in an optimal contract when persistence is after an history = −1 . Assume that the optimal monotonic contract does not converge in probability to the optimal contract. Then there is a sequence → and two thresholds 1 0 and an 2 0 such that for any there is a history ¯ ¡ ¢ ¡ ¢¯ ∗∗ ¯ 1 . with probability at least 2 in which ¯ |−1 |−1 = −1 − ¡ ¢ Since ∗∗ |−1 is uniquely defined for any , the profit function is strictly concave and there is a finite number of history nodes, we conclude that we must have that there is a 0 such that Π∗∗ ( ) − Π ( ) for any , a contradiction. It follows that the optimal monotonic contract converges in probability to the optimal contract. ¥ 36 This result is shown, for example, in Laffont and Tirole (1993), and it can be easily deduced studying problem (13). To see it, note that the repetition of the static contract is incentive compatible and individually rational. Then note that when Λ = , the first order optimal contract coincides with the static optimal contract along the histories in which the agent reports always the same type. Since the other histories have probability zero, the profit from the repetition of the static contract is the same of the profit from the FO-optimal contract. Since the FO-optimal contract yields a profit not inferior to the optimal contract, the result is proven. 52 H M H M L H M M H 2 M 2 L H M M M H M 1 M H 1 M L M L H M qL M M H M qM H qH(θ) i qi(H) H 3 1 M 2 qM(M) L qL(M) H 1 3 1 M 2 L M M 2 3 1 L 1 1 3 1 L 1 M qM(L) L H M 3 1 M 2 qL(L) 2 3 1 M 3 1 1 1 2 H H 2 1 1 M 1 L M 1 M 1 L L 1 2 2 3 1 M L H 1 1 M 1 M 1 2 3 1 H 1 M 1 L M 1 M Table 1: The optimal contract when |Θ|=3 and T=2. B3 B2 B1 A2 A1 qH