Social Insurance, Information Revelation, and Lack of Commitment

Mikhail Golosov (Princeton)    Luigi Iovino (Bocconi)

October 2014

Abstract

We study the optimal provision of insurance against unobservable idiosyncratic shocks in a setting in which a benevolent government cannot commit. A continuum of agents and the government play an infinitely repeated game. Actions of the government are constrained only by the threat of reverting to the worst perfect Bayesian equilibrium (PBE). We construct a recursive problem that characterizes the resource allocation and information revelation on the Pareto frontier of the set of PBE. We prove a version of the Revelation Principle and find an upper bound on the maximum number of messages that are needed to achieve the optimal allocation. Agents play mixed strategies over that message set to limit the amount of information transmitted to the government. The central feature of the optimal contract is that agents who enter the period with low implicitly-promised lifetime utilities reveal no information to the government and receive no insurance against the current period shock, while agents with high promised utilities reveal precise information about their current shock and receive insurance as in economies with full commitment by the government.

We thank Mark Aguiar, Fernando Alvarez, V.V. Chari, Hugo Hopenhayn, Ramon Marimon, Stephen Morris, Ali Shourideh, Chris Sleet, Pierre Yared, Sevin Yeltekin, and Ariel Zetlin-Jones for invaluable suggestions, and all the participants at the seminars at Bocconi, Carnegie Mellon, Duke, EIEF, Georgetown, HSE, Minnesota, NY Fed, Norwegian Business School, Paris School of Economics, Penn State, Princeton, University of Lausanne, University of Vienna, Washington University, the SED 2013, the SITE 2013, the ESSET 2013 meeting in Gerzensee, the 12th Hydra Workshop on Dynamic Macroeconomics, the 2014 Econometric Society meeting in Minneapolis, and the EFMPL Workshop at NBER SI 2015. Golosov thanks the NSF for support and the EIEF for hospitality. Iovino thanks NYU Stern and NYU Econ for hospitality. We thank Sergii Kiiashko and Pedro Olea for excellent research assistance.

1 Introduction

The major insight of the normative public finance literature is that there are substantial benefits from using past and present information about individuals to provide them with insurance against shocks and incentives to work. A common assumption of the normative literature is that the government is a benevolent social planner with perfect ability to commit. The more information such a planner has, the more efficiently she can allocate resources.[1] The political economy literature has long emphasized that such commitment may be difficult to achieve in practice.[2] Self-interested politicians and voters, whom we broadly refer to as "the government", are tempted to re-optimize over time and choose new policies. When the government cannot commit, the benefits of providing more information to the government are less clear. Better informed governments may allocate resources more efficiently, as in the conventional normative analysis, but may also be more tempted to depart from the ex-ante desirable policies. The analysis of such environments is difficult because the main analytical tool for studying private information economies, the Revelation Principle, fails when the decision maker cannot commit.
In this paper we study optimal information revelation and resource allocation in a simple model of social insurance: the unobservable taste shock environment of Atkeson and Lucas (1992). This environment, together with the closely related models of Green (1987), Thomas and Worrall (1990), and Phelan and Townsend (1991), provides the theoretical foundation for much recent work in macro and public finance. This setup and its extensions have been used to study the design of unemployment and disability insurance (Hopenhayn and Nicolini (1997), Golosov and Tsyvinski (2006)), life cycle taxation (Farhi and Werning (2013), Golosov, Troshkin and Tsyvinski (2011)), human capital policies (Stantcheva (2014)), firm dynamics (Clementi and Hopenhayn (2006)), military conflict (Yared (2010)), and international borrowing and lending (Dovis (2009)). In the key departure from that literature we assume that resources are allocated by the government, which, although benevolent, lacks commitment. We study how information revelation affects the incentives of the government to provide insurance in such settings and characterize the properties of the optimal contract.

[1] The seminal work of Mirrlees (1971) started a large literature in public finance on taxation, redistribution and social insurance in the presence of private information about individuals' types. The well-known work of Akerlof (1978) on "tagging" is another early example of how a benevolent government can use information about individuals to improve efficiency. For surveys of the recent literature on social insurance and private information see Golosov, Tsyvinski and Werning (2006) and Kocherlakota (2010).

[2] There is a vast literature in political economy that studies the frictions that policymakers face. For our purposes, the work of Acemoglu (2003) and Besley and Coate (1998) is particularly relevant; they argue that inefficiencies in a large class of politico-economic models can be traced back to the lack of commitment. Kydland and Prescott (1977) is the seminal contribution that first analyzed policy choices when the policymaker cannot commit.

Formally, we study a repeated game between the government and a continuum of atomless citizens in the spirit of Chari and Kehoe (1990) and focus on the perfect Bayesian equilibria (PBE) that are not Pareto-dominated by other PBE. The economy is endowed with a constant amount of a perishable good in each period and the government allocates that endowment among agents. Agents receive privately observable taste shocks that follow a Markov process, which is assumed to be i.i.d. in the baseline version of our model. Agents transmit information about their shocks to the government by sending messages. The government uses these messages to form posterior beliefs about the realization of agents' types and to allocate resources. The main friction is that ex post, upon learning information about the agent's type, the government is tempted to allocate resources differently from what is required ex ante to provide incentives to reveal information. The more precise the information the government has about the agents' types, the higher its payoff from deviation.

Our paper makes three contributions. First, we construct a recursive formulation of our problem. The key difficulty that we need to overcome is that the value of the worst equilibrium depends on the implicit promises made to all agents.
The standard recursive techniques, which characterize optimal insurance for each history of past shocks in isolation from other histories, do not apply directly, since the information that any agent reveals affects the government's incentives to renege on the promises made to other agents. We make progress by showing how to construct an upper bound for the value of deviation which coincides with that value if all agents play the best PBE and takes a weakly higher value for any other strategy. This upper bound can be represented as a history-by-history integral of functions that depend only on the current reporting strategy of a given agent, and thus can be represented recursively. Since the original best PBE strategies still satisfy all the constraints of this modified, tighter problem, the solution to this recursive problem allows us to uncover the best PBE. The resulting recursive problem is very simple. It is essentially a standard problem familiar from the recursive contract literature with two modifications: agents are allowed to choose mixed rather than pure strategies over their reports, and there is an extra term in the planner's objective function that captures the "temptation" costs of revealing more information.

Our second contribution is to show a version of the Revelation Principle for our setting. It is well known that standard direct revelation mechanisms, in which agents report their types truthfully to the government, are not efficient in this type of setting. Bester and Strausz (2001) showed that, in a repeated game between one principal and one agent whose type can take N values, the message space can be restricted to only N messages over which agents play mixed reporting strategies. These techniques do not apply to economies with more than one agent.[3] We show that in our baseline version of the model with a continuum of agents and i.i.d. shocks it is without loss of generality to restrict attention to message spaces with at most 2N - 1 messages. This result follows from the fact that with i.i.d. shocks our recursive characterization satisfies a single-crossing property, and therefore all messages that give any type n the highest utility can be partitioned into three regions: those that also give the highest utility to type n + 1, those that also give the highest utility to type n - 1, and those that give the highest utility only to type n. There can be at most 2N - 1 such partitions, and the convexity of our problem implies that it is sufficient to have one message per partition.

Our third contribution is the characterization of the properties of the optimal contract and of efficient information revelation. As in the full commitment case of Atkeson and Lucas (1992), it is optimal to allocate resources to agents with temporarily high taste shocks by implicitly promising to increase the future lifetime utility of agents with temporarily low taste shocks. As agents experience different histories of shocks, there is a distribution of promised utilities at any given time. We show that it is efficient for agents who enter the period with different promised utilities to reveal different amounts of information. In particular, we show that under quite general conditions agents who enter the period with low promised utilities reveal no information about their idiosyncratic shock in that period and receive no insurance. In contrast, under some additional assumptions on the utility function and the distribution of shocks, agents who enter the period with high promised utilities reveal their private information fully to the government.
The optimal insurance contract for such agents closely resembles the contract when the government can commit.

The intuition for this result can be seen by comparing the costs and benefits of revealing information to the government. The costs are driven by the temptation to deviate from the ex-ante optimal plan and re-optimize. When the government deviates from the best equilibrium, it reneges on all past implicit promises and allocates consumption to each agent based on its posterior beliefs about the agent's current type. Therefore, the incentives for the government to deviate depend only on the total amount of information it has, but not on which agents reveal it. On the contrary, the benefits of information revelation depend on the efficiency gains from better information on the equilibrium path, which in turn depend on the implicit promises with which an agent enters the period. The general lesson is that the agents for whom better information leads to the highest efficiency gains on the equilibrium path should send precise signals to the government. In the Atkeson and Lucas (1992) multiplicative taste shock environment those are the agents with high promised utility.

[3] See also Skreta (2006), who extends the analysis to an arbitrary type space and shows the optimal selling mechanism in a game between one buyer and one seller. Bester and Strausz (2000) provide an example of an economy with two agents to illustrate that with more than one agent message spaces with only N elements are restrictive.

Participation constraints of the government prevent the emergence of the extreme inequality, known as immiseration, which is a common feature of environments with commitment. The optimal contract exhibits mean reversion, so that agents with low current promised utilities receive in expectation higher future promises, and vice versa. This implies that in an invariant distribution no agent is stuck in the no-insurance region forever: in a finite number of steps he reaches a point at which he reveals some information about his shock and some insurance is provided. Thus, we show that in the invariant distribution there is generally an endogenous lower bound below which an agent's promised utility never falls.

Our paper is related to a relatively small literature on mechanism design without commitment. Roberts (1984) was one of the first to explore the implications of lack of commitment for social insurance. He studied a dynamic economy in which types are private information but do not change over time. More recently, Sleet and Yeltekin (2006), Sleet and Yeltekin (2008), Acemoglu, Golosov and Tsyvinski (2010), and Farhi et al. (2012) all studied versions of dynamic economies with idiosyncratic shocks closely related to our economy, but made various assumptions on the commitment technology and the shock processes to ensure that any information becomes obsolete once the government deviates. In contrast, the focus of our paper is on understanding incentives to reveal information and their interaction with the incentives of the government. Our paper is also related to the literature on the ratchet effect, e.g. Freixas, Guesnerie and Tirole (1985) and Laffont and Tirole (1988). These authors pointed out that without commitment it is difficult to establish which incentive constraints bind, which significantly inhibited further development of this literature. Some papers have thus focused on the analysis of suboptimal but tractable incentive schemes.
We show that the problem simplifies substantially when the principal interacts with a continuum of agents, since no individual agent's report affects the aggregate distribution of the signals received by the principal. In this case we can obtain a tight characterization of the optimal information revelation for high and low values of promised utilities.

Our results about efficient information revelation are also related to the insights on optimal monitoring in Aiyagari and Alvarez (1995). In their paper the government has commitment but can also use a costly monitoring technology to verify the agents' reports. They characterize how monitoring probabilities depend on the agents' promised values. Although our environment and theirs are very different in many respects, we share the same insight that more information should be revealed by those agents for whom the efficiency gains from better information are the highest. Bisin and Rampini (2006) pointed out that in general it might be desirable to hide information from a benevolent government in a two-period economy. Finally, our recursive formulation builds on the literature on recursive contracts with private information and a patient social planner. The work of Farhi and Werning (2007) provides recursive tools that are particularly useful in our setting.

The rest of the paper is organized as follows. Section 2 describes our baseline environment with i.i.d. shocks. Section 3 derives the recursive characterization and shows a version of the Revelation Principle. Section 4 analyzes efficient information revelation and optimal insurance. Section 5 extends our analysis to general Markov shocks.

2 The model

The economy is populated by a continuum of agents of total measure 1 and the government. There is an infinite number of periods, $t = 0, 1, 2, \ldots$ The economy is endowed with $e$ units of a perishable good in each period. An agent's instantaneous utility from consuming $c_t$ units of the good in period $t$ is given by $\theta_t U(c_t)$, where $U : \mathbb{R}_+ \to \mathbb{R}$ is an increasing, strictly concave, continuously differentiable function. The utility function $U$ satisfies the Inada conditions $\lim_{c\to 0} U'(c) = \infty$ and $\lim_{c\to\infty} U'(c) = 0$, and it may be bounded or unbounded. All agents have a common discount factor $\beta$. Let $\underline{v} = \lim_{c\to 0} U(c)/(1-\beta)$ and $\bar{v} = \lim_{c\to\infty} U(c)/(1-\beta)$ be the lower and upper bounds on the lifetime expected utility of the agents ($\underline{v}$ and $\bar{v}$ may be finite or infinite). For our purposes it is more convenient to work with utils rather than consumption units. Let $C \equiv U^{-1}$ be the inverse of the utility function.

The taste shock $\theta_t$ takes values in a finite set $\Theta$ with cardinality $|\Theta|$. For most of the analysis we assume that the $\theta_t$ are i.i.d. across agents and across time, but we relax this assumption in Section 5. Let $\pi(\theta) > 0$ be the probability of realization $\theta \in \Theta$. We assume that $\theta_1 < \ldots < \theta_{|\Theta|}$ and normalize $\sum_{\theta\in\Theta}\pi(\theta)\,\theta = 1$. We use superscript $t$ notation to denote a history of realizations of any variable up to period $t$, e.g. $\theta^t = (\theta_0, \ldots, \theta_t)$. Let $\pi(\theta^t)$ denote the probability of history $\theta^t$. We assume that types are private information.

Each agent is identified with a real number $v \in [\underline{v}, \bar{v}]$, which we soon interpret as a lifetime promised utility. All consumers with the same $v$ are treated symmetrically. The distribution of promises $v$ is denoted by $\psi$. As will become clear shortly, the initial distribution $\psi$ gives us a flexible way to trace the Pareto frontier of the set of subgame perfect equilibria.
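To fix ideas for the numerical sketches used below, the following snippet defines one concrete parameterization of these primitives. The CRRA form $U(c) = a c^{1/a}$ anticipates Section 4.1; all numbers (the shock grid, probabilities, endowment, discount factor) are illustrative choices of ours rather than values taken from the paper.

```python
import numpy as np

# Illustrative primitives (hypothetical parameterization, not from the paper).
a = 2.0                      # CRRA parameter: U(c) = a * c**(1/a), with 1/a < 1
beta = 0.95                  # common discount factor
e = 1.0                      # per-period aggregate endowment

def U(c):
    """Instantaneous utility of consumption."""
    return a * c ** (1.0 / a)

def C(u):
    """Inverse utility: resource cost of delivering u utils."""
    return (u / a) ** a

theta = np.array([0.8, 1.2])          # taste shocks, theta_1 < theta_2
pi = np.array([0.5, 0.5])             # i.i.d. shock probabilities
assert abs(pi @ theta - 1.0) < 1e-12  # normalization E[theta] = 1
```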
The government collects reports from agents about their types and allocates consumption subject to aggregate feasibility in each period. The government is benevolent and its payoff is given by the sum of expected lifetime utilities of all agents.

2.1 The game between the government and agents

The physical structure of our environment corresponds exactly to the model of Atkeson and Lucas (1992). The constrained efficient allocations that they characterize can be achieved by a government which is able to commit in period 0 to infinite-period contracts. We focus on the environment without commitment. We model the interaction between the government and individuals along the lines of the literature on sustainable plans (Chari and Kehoe (1990), Chari and Kehoe (1993)).

Before formally defining the game we briefly outline its structure. The government does not observe agents' types, so it collects information by having the agents submit messages. More specifically, at the beginning of any period $t$ the government chooses a message set $M_t$ from some space $\mathcal{M}$. All agents observe their types and submit messages $m_t \in M_t$. The government allocates consumption to agents as a function of current and past reports. To capture the main trade-off in information revelation we assume that the government cannot commit even within a period: it can pick any feasible allocation of resources after collecting agents' reports. This allows for a simple and transparent analysis in our benchmark case of i.i.d. shocks. As we discuss in Section 5, much of the analysis carries through to the case when shocks are persistent.[4]

We now formally define our game. We start with information sets. Let $S^{t-1}$ be the summary of information available to the government at the end of period $t-1$. We define this set recursively, with $S^{-1} = \emptyset$; $S^{t-1}$ is the space of public histories. Each period has three stages. In the first stage the government chooses a message set $M_t : S^{t-1} \to \mathcal{M}$. In the second stage agents send reports about their types. Each agent $i$ observes the realization of his type in period $t$, $\theta_{i,t}$, and of a payoff-irrelevant random variable $z_{i,t}$ uniformly distributed on $Z = [0,1]$. The realizations of $z_{i,t}$ are publicly observable. Each agent chooses a reporting strategy $\tilde{\sigma}_{i,t}$ over $M_t$ as a function of the aggregate history $\left(S^{t-1}, M_t\right)$; his past history of reports and sunspot realizations $h_i^{t-1} = \left(v_i, m_i^{t-1}, z_i^{t-1}\right)$; the current realization $z_{i,t}$; and the history of his shocks $\theta_i^t$. Realizations of $h_i^{t-1}$ are publicly observed and we call them the personal history, with the space of $h^{t-1}$ denoted by $H^{t-1}$. Realizations of $\theta_i^t$ are privately observed and we call them the private history.

[4] A small recent literature on social insurance without commitment sidesteps the possibility that the government may misuse information when it deviates from equilibrium strategies. For example, Acemoglu, Golosov and Tsyvinski (2010) allow agents to stop interacting with the government if it deviates, while Sleet and Yeltekin (2008) and Farhi et al. (2012) assume that shocks are i.i.d. but there is commitment within the period. All these environments are constructed so that the information about individuals becomes obsolete if the government deviates.
We assume that all agents i with the same history choose the same strategy ~ i;t ; and that the law of large numbers holds so that strategies ~ i;t generate a distribution of reports simplify the exposition, we will refer to t over Mt : To as agents’strategy. Finally, in the last stage of the t period the government chooses utility allocations ut for each personal history ht as a function of S t = S t 1; M ; t : The distribution of ut ; t t, at the beginning of the next period, S t : Let t and G;t be the spaces of feasible denote the whole in…nite sequence of t G;t and Mt and S t 1 constitute the aggregate history be the government strategy in period t and G;t respectively. We use boldface letters x to fxt g1 t=0 : For our purposes it is useful to keep track of two objects, the distribution of the agents’ t t t t (h ; jS ) as the measure Bm B where At private histories and that of personal histories. For any S t de…ne of agents with histories Mt [v; v] Zt is a Borel set of of Mt t Z set A of t t t (h jS ) H t as t 1 : It is de…ned recursively. Let can be represented as a product At = At Ht 1 t 1 : The measure At jS t = Let t ht ; t At 1 jS = 1 1 : Any Borel set At of Bz 1 and Bm ; Bz ; B are the mt -,z- and -sections of some Borel set t t 1 ht ; t jS t over space H t Pr (z 2 Bz ) Pr ( 2 B ) t t is de…ned as Bm jS t 1 ; Mt ; At 1 Bz B : be the measure of agents with personal histories ht ; de…ned for each Borel t AjS t = Z t t A t jS t d t t : (1) We require the government’s strategy to be feasible, so that it satis…es for any S t Z C ut ; S t d t jS t e: (2) Ht t Citizens’strategies induce posterior beliefs pt over ht and St : These beliefs are de…ne for all and satisfy Bayes’rule, i.e. that for any Borel set A of H t and any t ; Z t t pt t jS t ; h t dhjS t = t A jS whenever t AjS t > 0: (3) A Finally, we de…ne continuation payo¤s for agents and the government. Any pair ( ; generates a stochastic process for utility allocation u: We use E( pectation of the random variable xt : St measures generated by strategies ( ; G) ; pectation given some Borel subset A of Ht t and E( St Ht 7 ; ; G) G) xt to denote the ex- ! R with respect to the probability G) [xt jA] to denote the conditional ex- t: The lifetime utility of agent v is is P1 t t ut jv t=0 R P1 t E( ; G ) t ut = t=0 then E( ; G) : Since the government is benevolent and utilitarian, its payo¤ P1 t (dv) : E( ; G ) t ut jv t=0 De…nition 1 A Perfect Bayesian Equilibrium (PBE) is a strategy pro…le ( ; G) and a belief system p such that conditions (2) and (3) are satis…ed, and both agents and the government choose and G as their best responses, i.e. 
Agents' best response: for every $\left(S^{T-1}, M_T, h^{T-1}, \theta^T\right)$,
$$\mathbb{E}_{(\boldsymbol{\sigma},\boldsymbol{\sigma}_G)}\left[\sum_{t=T}^{\infty}\beta^{t-T}\theta_t u_t\,\Big|\,S^{T-1}, M_T, h^{T-1}, \theta^T\right] \ge \mathbb{E}_{(\boldsymbol{\sigma}',\boldsymbol{\sigma}_G)}\left[\sum_{t=T}^{\infty}\beta^{t-T}\theta_t u_t\,\Big|\,S^{T-1}, M_T, h^{T-1}, \theta^T\right] \quad\text{for all } \boldsymbol{\sigma}'\in\boldsymbol{\Sigma}; \tag{4}$$

Government's best response: for every $\tilde{S}^T$,
$$\int\mathbb{E}_{(\boldsymbol{\sigma},\boldsymbol{\sigma}_G)}\left[\sum_{t=T}^{\infty}\beta^{t-T}\theta_t u_t\,\Big|\,\tilde{S}^T, v\right]\psi(dv) \ge \int\mathbb{E}_{(\boldsymbol{\sigma},\boldsymbol{\sigma}'_G)}\left[\sum_{t=T}^{\infty}\beta^{t-T}\theta_t u_t\,\Big|\,\tilde{S}^T, v\right]\psi(dv) \quad\text{for all } \boldsymbol{\sigma}'_G\in\boldsymbol{\Sigma}_G. \tag{5}$$

We focus on equilibria that maximize the payoff of the utilitarian government subject to delivering lifetime utility of at least $v$ to an agent that belongs to family $v$. We call it a best equilibrium and define it as follows.

Definition 2. A triple $\left(\boldsymbol{\sigma},\boldsymbol{\sigma}_G,\mathbf{p}\right)$ is a best equilibrium if it is a PBE such that
$$\mathbb{E}_{(\boldsymbol{\sigma},\boldsymbol{\sigma}_G)}\left[\sum_{t=0}^{\infty}\beta^t\theta_t u_t\,\Big|\,v\right] \ge v \quad\text{for all } v, \tag{6}$$
and there is no other PBE $\left(\boldsymbol{\sigma}',\boldsymbol{\sigma}'_G,\mathbf{p}'\right)$ that satisfies (6) and delivers a higher payoff to the government,
$$\mathbb{E}_{(\boldsymbol{\sigma}',\boldsymbol{\sigma}'_G)}\left[\sum_{t=0}^{\infty}\beta^t\theta_t u_t\right] > \mathbb{E}_{(\boldsymbol{\sigma},\boldsymbol{\sigma}_G)}\left[\sum_{t=0}^{\infty}\beta^t\theta_t u_t\right].$$

Throughout the paper we assume that $\psi$ is such that a best equilibrium exists. We further assume that $\psi$ is such that (6) holds with equality in a best equilibrium. This assumption simplifies the recursive formulation, since all the "promise keeping" constraints can be defined as equalities in that case. Without this assumption our period-0 recursive formulation would have to be defined with an inequality, which would make the notation more bulky but would not affect any of our results.

3 The recursive problem and the Revelation Principle

In this section we discuss two intermediate results that are necessary for our analysis of optimal information revelation. First, we derive the recursive problem whose solution characterizes the optimal reporting strategy and consumption allocation in best PBEs. Recursive techniques are the standard tool for solving dynamic contracting problems; they essentially allow one to solve for optimal insurance after any given history of shocks separately from the other histories. In our game-theoretic setting the standard approach to constructing Bellman equations is not applicable: the government's incentive to deviate from a best equilibrium depends on the reporting strategies of all agents, so that the usual history-by-history separation of incentives is impossible. To overcome this difficulty, we consider a constrained problem in which we replace the function that characterizes the value of deviation for the government with another function which is linear in the distribution of the past histories of reports. We show that this function can be chosen such that a solution to the original problem is also a solution to the constrained problem. The advantage is that the linearity of the constrained problem in the distribution of past reports allows us to solve for the optimal contract recursively, history by history. The key insight for this construction is that the Lagrange multiplier on the resource constraint in the best one-shot deviation from the best equilibrium strategies summarizes all relevant information.

The second result of this section characterizes the minimum dimension of the message space $\mathcal{M}$ that is needed to support best PBEs. Reduction of an arbitrary message space $\mathcal{M}$ to a smaller-dimensional object is important for the tractability of the maximization problem. This step is similar in spirit to invoking the Revelation Principle in standard mechanism design problems with commitment, and we refer to this result as the (generalized) Revelation Principle. Although these steps are crucial for our ultimate goal, the characterization of efficient information revelation, these tools are of interest in themselves.
Many environments without commitment share the feature that the value of deviation for the "principal" depends on the actions of many agents, and the recursive techniques developed in our setting should also be applicable to such environments. To the best of our knowledge, there is no version of the Revelation Principle for economies with multiple agents when the principal lacks commitment. A well-known paper by Bester and Strausz (2001) developed a version of the Revelation Principle for an economy with one agent, but their approach and main result do not extend to multi-agent settings.

3.1 The recursive problem

We start by describing a PBE that delivers the lowest payoff to the government, to which we refer as the worst PBE. It is easy to show that there are worst equilibria in which no information is revealed.[5] To see this, let $m^\star(M)$ be an arbitrary message from a set $M\in\mathcal{M}$. Consider a strategy profile $\left(\boldsymbol{\sigma}^w,\boldsymbol{\sigma}_G^w\right)$ in which all agents report message $m^\star(M)$ with probability 1 for all $M$, and the government allocates the same utility $u_t^w = C^{-1}(e)$ for any message $m$ it receives. The beliefs $\mathbf{p}^w$ that satisfy (3) are given by $p_t^w\left(\theta^t\mid h^t, S^t\right) = \pi\left(\theta^t\right)$ for all $\left(h^t, S^t\right)$.

Lemma 1. $\left(\boldsymbol{\sigma}^w,\boldsymbol{\sigma}_G^w,\mathbf{p}^w\right)$ is a worst PBE.

Proof. The best response of the government to the agents' reporting strategy $\boldsymbol{\sigma}^w$ is to allocate the same utility to all agents, since its beliefs $\mathbf{p}^w$ do not depend on $m$ and $C$ is strictly convex. Given this allocation rule, playing $\boldsymbol{\sigma}^w$ is optimal for agents; therefore $\left(\boldsymbol{\sigma}^w,\boldsymbol{\sigma}_G^w,\mathbf{p}^w\right)$ is a PBE that delivers the government payoff $U(e)/(1-\beta)$. Strategy $\boldsymbol{\sigma}_G^w$ is feasible for any reporting strategy that agents choose, so the government can attain payoff $U(e)/(1-\beta)$ in any PBE, which proves that $\left(\boldsymbol{\sigma}^w,\boldsymbol{\sigma}_G^w,\mathbf{p}^w\right)$ is a worst PBE for the government.

Standard arguments establish that best equilibria are supported by reverting to the worst PBE following any deviation of the government from its equilibrium strategies. We split our problem in two parts. First, we fix any sequence of message sets $\mathbf{M}$ and describe the maximization problem that characterizes the best PBE given that sequence of message sets, which we call the best equilibria given $\mathbf{M}$. We then show that the same payoff can be achieved with a much simpler message set that has only a finite number of messages. That set is independent of $\mathbf{M}$, and hence without loss of generality it can be chosen as the message set in the best PBE.

We start with several preliminary observations that simplify the notation. Since there is no aggregate uncertainty, the aggregate distribution of reports and allocations is a deterministic sequence. Therefore, it is redundant to keep track of the aggregate history $S^t$ in best equilibria and we drop this explicit dependence from our notation. Moreover, without loss of generality we can restrict attention to reporting strategies $\sigma_t$ that depend on $h^{t-1}$, $z_t$ and $\theta_t$, but not on the past history of shocks $\theta^{t-1}$.[6] This result holds more generally with Markov shocks: as long as $\theta_t$ follows a Markov process, $\theta^{t-1}$ is payoff irrelevant and agents with the same $\left(h^{t-1},\theta_t\right)$ can replicate each other's strategies for all past $\theta^{t-1}$. This implies that all information about agents at the beginning of period $t$ can be summarized by the distribution of reports $\Lambda_{t-1}$.

[5] In fact, a stronger statement is true: in any worst equilibrium no information is revealed. This statement follows from our Lemma 2.

[6] In the Supplementary Material we show that for any PBE $\left(\boldsymbol{\sigma}',\boldsymbol{\sigma}'_G,\mathbf{p}'\right)$ there is another PBE $\left(\boldsymbol{\sigma},\boldsymbol{\sigma}_G,\mathbf{p}\right)$ which delivers the same payoff to all agents and the government as $\left(\boldsymbol{\sigma}',\boldsymbol{\sigma}'_G,\mathbf{p}'\right)$ (i.e. $\left(\boldsymbol{\sigma},\boldsymbol{\sigma}_G,\mathbf{p}\right)$ is payoff-equivalent to $\left(\boldsymbol{\sigma}',\boldsymbol{\sigma}'_G,\mathbf{p}'\right)$) and has the property that $\sigma_t$ does not depend on $\theta^{t-1}$.
We now turn to setting up the maximization problem that characterizes best equilibria given $\mathbf{M}$. Since any deviation by the government from a best equilibrium triggers a switch to the worst PBE, it is sufficient to consider only the best one-shot deviation from equilibrium strategies by the government. The government that observes the distribution of personal histories $\Lambda_t$ can achieve the maximum payoff $\tilde{W}_t\left(\Lambda_t\right)$ defined as
$$\tilde{W}_t\left(\Lambda_t\right) = \sup_{\{u(h)\}_{h\in H^t}} \int_{H^t}\mathbb{E}_{p_t}\left[\theta\mid h\right]u(h)\,d\Lambda_t \tag{7}$$
subject to
$$\int_{H^t} C\left(u(h)\right)d\Lambda_t \le e. \tag{8}$$
Therefore, any best equilibrium given $\mathbf{M}$ is a solution to
$$\max_{\mathbf{u},\boldsymbol{\sigma}}\ \mathbb{E}\sum_{t=0}^{\infty}\beta^t\theta_t u_t \tag{9}$$
subject to (2),
$$\mathbb{E}\sum_{s=t}^{\infty}\beta^{s-t}\theta_s u_s \ge \tilde{W}_t\left(\Lambda_t\right) + \frac{\beta}{1-\beta}U(e) \quad\text{for all } t, \tag{10}$$
$$\mathbb{E}_{\boldsymbol{\sigma}}\sum_{t=0}^{\infty}\beta^t\theta_t u_t \ge \mathbb{E}_{\boldsymbol{\sigma}'}\sum_{t=0}^{\infty}\beta^t\theta_t u_t \quad\text{for all } \boldsymbol{\sigma}', \tag{11}$$
and
$$\mathbb{E}\left[\sum_{t=0}^{\infty}\beta^t\theta_t u_t\,\Big|\,v\right] = v. \tag{12}$$

The main difficulty in the analysis of problem (9) is that the sustainability constraint (10) depends on $\tilde{W}_t$, which is a non-linear function of the past reports embedded in the probability distribution $\Lambda_{t-1}$. To make progress, we replace $\tilde{W}_t$ with a function that (i) is linear in $\Lambda_{t-1}$, (ii) is weakly greater than $\tilde{W}_t$ for all $\Lambda_t$, and (iii) equals $\tilde{W}_t$ at a best equilibrium's distribution of reports. This defines a more constrained maximization problem for which the best equilibrium is still a solution. The linearity of the constraining function in $\Lambda_{t-1}$ allows us to write the constrained problem recursively.

Let $\left(\mathbf{u}^*,\boldsymbol{\sigma}^*\right)$ be a solution to (9) and $\Lambda_t^*$ be the distribution of personal histories induced by $\boldsymbol{\sigma}^*$. Let $\lambda_t^w \ge 0$ be the Lagrange multiplier in problem (7) given $\Lambda_t^*$. For any mapping $\sigma : \Theta\to\Delta(M_t)$, define
$$W_t(\sigma) = \max_{\{u(m)\}_{m\in M_t}}\int_{\Theta}\int_{M_t}\left(\theta\,u(m) - \lambda_t^w C\left(u(m)\right)\right)\sigma(dm\mid\theta)\,\pi(d\theta) + \lambda_t^w e. \tag{13}$$
We use $\{u^w(m)\}_{m\in M_t}$ to denote the solution to (13), which is given by the equation $\lambda_t^w C'\left(u^w(m)\right) = \mathbb{E}\left[\theta\mid m\right]$. The function $W_t$ plays an important role in our analysis. We show that $\int_{H^{t-1}} W_t\left(\sigma_t\right)dz_t\,d\Lambda_{t-1}$ is an upper bound for $\tilde{W}_t\left(\Lambda_t\right)$ that satisfies the three properties needed for the recursive characterization.

Before formally proving this result, it is useful to introduce some definitions. We say that a reporting strategy $\sigma$ is uninformative if $\mathbb{E}_\sigma\left[\theta\mid m\right] = \sum_{\theta}\pi(\theta)\theta = 1$ a.s., i.e. for almost every message $m$ with respect to the measure generated by $\sigma$. Uninformative reporting strategies reveal no additional information other than the unconditional expectation of $\theta$. All other strategies are called informative. We also need to generalize the notion of the envelope theorem for $W_t$. For any $\sigma$ and $\sigma'$, define the derivative of $W_t(\sigma)$ in the direction $\sigma'$ as
$$\frac{\partial W_t(\sigma)}{\partial\sigma'} \equiv \lim_{\alpha\downarrow 0}\frac{W_t\left((1-\alpha)\sigma + \alpha\sigma'\right) - W_t(\sigma)}{\alpha}.$$
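As an illustration of problem (7) and of the role of the multiplier, the following sketch computes the government's best one-shot deviation payoff when the posterior beliefs generated by the reports take finitely many values. The primitives (`C`, `theta`, `pi`) are the illustrative ones defined above, and the bisection over the multiplier is simply our way of solving the first-order condition $\mathbb{E}[\theta\mid m] = \lambda^w C'(u(m))$ together with the resource constraint; none of this code comes from the paper.

```python
import numpy as np
from scipy.optimize import brentq

def deviation_value(post_means, weights, C, Cprime_inv, e):
    """Best one-shot deviation payoff of the government.

    post_means[j] is E[theta | message j], weights[j] is the mass of agents
    sending message j.  The government maximizes
        sum_j weights[j] * post_means[j] * u_j
    subject to sum_j weights[j] * C(u_j) <= e.  The FOC is
    post_means[j] = lam * C'(u_j); we search for the multiplier lam that
    makes the resource constraint bind.
    """
    def excess_resources(lam):
        u = Cprime_inv(post_means / lam)
        return e - weights @ C(u)

    lam = brentq(excess_resources, 1e-8, 1e8)
    u = Cprime_inv(post_means / lam)
    return weights @ (post_means * u), lam

# Example with U(c) = a*c**(1/a), so C(u) = (u/a)**a and C'(u) = (u/a)**(a-1).
a, e = 2.0, 1.0
C = lambda u: (u / a) ** a
Cprime_inv = lambda x: a * x ** (1.0 / (a - 1.0))   # inverse of C'

# Fully revealing reports vs. uninformative reports (two equally likely types).
theta = np.array([0.8, 1.2]); pi = np.array([0.5, 0.5])
W_full, _ = deviation_value(theta, pi, C, Cprime_inv, e)
W_none, _ = deviation_value(np.array([pi @ theta]), np.array([1.0]), C, Cprime_inv, e)
print(W_full >= W_none)   # better information raises the deviation payoff
```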
With these definitions we can state the following lemma, which is important for all subsequent analysis.

Lemma 2. The multiplier $\lambda_t^w$ is uniformly bounded away from 0 and belongs to a compact set; therefore $u^w(m)$ belongs to a compact set in the interior of $\left[(1-\beta)\underline{v}, (1-\beta)\bar{v}\right]$ for all $\sigma$. The family $\{W_t\}_t$ and all families of its directional derivatives are uniformly bounded. The function $W_t$ is well defined, continuous, convex, and is minimized if and only if $\sigma$ is uninformative. For any $\left(\sigma_t,\Lambda_{t-1}\right)$,
$$\tilde{W}_t\left(\Lambda_t\right) \le \int_{H^{t-1}} W_t\left(\sigma_t\left(\cdot\mid h^{t-1},z,\cdot\right)\right)dz\,d\Lambda_{t-1}, \tag{14}$$
with equality if $\left(\sigma_t,\Lambda_{t-1}\right) = \left(\sigma_t^*,\Lambda_{t-1}^*\right)$. The derivative $\partial W_t(\sigma)/\partial\sigma'$ is well defined for all $\sigma,\sigma'$ and satisfies
$$\frac{\partial W_t(\sigma)}{\partial\sigma'} = \int_{\Theta}\int_{M_t}\left(\theta\,u^w(m) - \lambda_t^w C\left(u^w(m)\right)\right)\left(\sigma'(dm\mid\theta) - \sigma(dm\mid\theta)\right)\pi(d\theta). \tag{15}$$

Consider now a modified maximization problem (9) in which we replace (10) with
$$\mathbb{E}\sum_{s=t}^{\infty}\beta^{s-t}\theta_s u_s \ge \int_{H^{t-1}} W_t\left(\sigma_t\left(\cdot\mid h^{t-1},z,\cdot\right)\right)dz\,d\Lambda_{t-1} + \frac{\beta}{1-\beta}U(e). \tag{16}$$
The constraint set is tighter in the modified problem, but $\left(\mathbf{u}^*,\boldsymbol{\sigma}^*\right)$ still satisfies all the constraints; therefore $\left(\mathbf{u}^*,\boldsymbol{\sigma}^*\right)$ is a solution to the modified problem. One can then use standard techniques (see, e.g., Marcet and Marimon (2009) and Sleet and Yeltekin (2008)) and Lagrange multipliers to re-write it as
$$\mathcal{L} = \max_{\mathbf{u},\boldsymbol{\sigma}}\ \mathbb{E}\sum_{t=0}^{\infty}\beta^t\left[\theta_t u_t - \lambda_t C(u_t) - \chi_t W_t\right] \tag{17}$$
subject to (11) and (12), for some non-negative sequences $\{\lambda_t,\chi_t,\hat\beta_t\}_{t=0}^{\infty}$, with the property that $\hat\beta_t \ge \beta$, with strict inequality if and only if constraint (16) binds in period $t$.[7] We interpret $\hat\beta_t$ as the effective discount factor of the government in the best equilibrium. It is greater than the agents' discount factor because the government also needs to take into account how its actions affect the sustainability constraints in the future. This is a general result about the role of the lack of commitment in discounting that was highlighted, for example, by Sleet and Yeltekin (2008) and Farhi et al. (2012).

The maximization problem (17) shows the costs and benefits of information revelation. The more information the agents reveal, the easier it is for the planner to maximize the cost-weighted utility function of the agents, $\theta_t u_t - \lambda_t C(u_t)$. At the same time, better information also increases the temptation to deviate, captured by the term $\chi_t W_t$. This trade-off is at the heart of our characterization in Section 4.

We make two assumptions on the properties of these Lagrange multipliers, which we maintain throughout the analysis: $\limsup_t \lambda_t > 0$ and $\liminf_t \chi_t > 0$. These conditions simplify technical arguments, and in the Supplementary Material we provide sufficient conditions on $C$ that ensure that these assumptions are satisfied. Since the Lagrange multipliers should satisfy these properties much more generally, for example whenever the economy converges to an invariant distribution and long-run immiseration is a feature of the optimal contract with full commitment, we opted to make assumptions on the multipliers directly.[8]

An important simplifying feature of the modified maximization problem and the Lagrangian (17) is that the distribution of past reports $\Lambda_{t-1}$ enters the objective function and the constraints linearly. This allows us to solve for the optimal allocations and reporting strategies separately for each history $h^{t-1}$. We do so by extending the recursive techniques developed by Farhi and Werning (2007), who study an economy with commitment but in which the principal is more patient than the agents.

[7] This representation is possible since allowing the government and agents to condition their strategies on $z$ convexifies the maximization problem (9). The formal arguments are straightforward but cumbersome; we present their sketch in the Supplementary Material.

[8] Immiseration is a common feature of long-term contracts with commitment. See Ljungqvist and Sargent (2004) for a textbook discussion and Phelan (2006) and Hosseini, Jones and Shourideh (2013) for recent contributions to the literature. We discuss the existence of the invariant distribution in Section 5.1.
Let $k_0(v)$ be the value of the objective function (17) for family $v$, defined as
$$k_0(v) = \max_{\mathbf{u},\boldsymbol{\sigma}}\ \mathbb{E}\left[\sum_{t=0}^{\infty}\beta^t\left(\theta_t u_t - \lambda_t C(u_t) - \chi_t W_t\right)\,\Big|\,v\right] \tag{18}$$
subject to (11) and (12). The Lagrangian defined in (17) satisfies $\mathcal{L} = \int k_0(v)\,\psi(dv)$. Similarly, we can define[9]
$$k_t(v) = \max_{\mathbf{u},\boldsymbol{\sigma}}\ \mathbb{E}\left[\sum_{s=0}^{\infty}\beta^s\left(\theta_{t+s} u_s - \lambda_{t+s} C(u_s) - \chi_{t+s} W_s\right)\right] \tag{19}$$
subject to (11) and (12). The next proposition shows the relationship between $k_t(v)$ and $k_{t+1}(v)$.

Proposition 1. The value function $k_t(v)$ is continuous, concave and differentiable, with $\lim_{v\to\bar{v}}k_t'(v) = -\infty$. If utility is unbounded below, then $\lim_{v\to\underline{v}}k_t'(v) = 1$; if utility is bounded below, then $\lim_{v\to\underline{v}}k_t'(v) \le 1$. The value function $k_t(v)$ satisfies the Bellman equation
$$k_t(v) = \max_{\substack{\{u(m,z),\,w(m,z)\}_{(m,z)\in M_t\times Z} \\ \sigma(\cdot\mid z,\theta)\in\Delta(M_t)}} \mathbb{E}_\sigma\left[\theta u - \lambda_t C(u) + \hat\beta_{t+1}k_{t+1}(w)\right] - \chi_t\int_0^1 W_t\left(\sigma(\cdot\mid z,\cdot)\right)dz \tag{20}$$
subject to
$$v = \mathbb{E}_\sigma\left[\theta u + \beta w\right], \tag{21}$$
$$\mathbb{E}_\sigma\left[\hat\theta u + \beta w\,\Big|\,\hat\theta, z\right] \ge \mathbb{E}_{\sigma'}\left[\hat\theta u + \beta w\,\Big|\,\hat\theta, z\right] \quad\text{for all } z,\hat\theta,\sigma'. \tag{22}$$
Without loss of generality, the optimum is achieved by randomization between at most two points, i.e. one can choose $\bar{z}\in[0,1]$ and set $\left(u(\cdot,z'), w(\cdot,z'), \sigma(\cdot\mid z',\cdot)\right) = \left(u(\cdot,z''), w(\cdot,z''), \sigma(\cdot\mid z'',\cdot)\right)$ for all $z',z''\le\bar{z}$ and all $z',z''>\bar{z}$.

Most of the proof of this proposition follows the arguments of Farhi and Werning (2007) and is provided in the Supplementary Material. The solution to this Bellman equation is attained by some functions $g_t^u(v,\theta,z)$, $g_t^w(v,\theta,z)$ and $g_t^\sigma(v,\theta,z)$. We can use these policy functions to generate $u_t$ and $\sigma_t$ recursively. Standard arguments establish when these $(\mathbf{u},\boldsymbol{\sigma})$ are a solution to the original problem (9).

Proposition 2. If $(\mathbf{u},\boldsymbol{\sigma})$ attains the maximum in (18), then it is generated by $\left(g^u,g^w,g^\sigma\right)$. If $(\mathbf{u},\boldsymbol{\sigma})$ generated by $\left(g^u,g^w,g^\sigma\right)$ is such that
$$\lim_{t\to\infty}\mathbb{E}\,\beta^t v_t = 0, \tag{23}$$
then it is a solution to (18).

Condition (23) can be verified in the same way as in Farhi and Werning (2007).

[9] Variables $(\mathbf{u},\boldsymbol{\sigma}) = \{u_t,\sigma_t\}_{t=0}^{\infty}$ are defined over slightly different spaces in problems (18) and (19). The sequence of message sets is $\{M_t\}_{t=0}^{\infty}$ in (18), while it is $\{M_{t+s}\}_{s=0}^{\infty}$ in (19).
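To make the recursion in Proposition 1 concrete, here is a deliberately stripped-down value-function-iteration sketch. It uses the illustrative primitives above, fixes the multipliers $\lambda_t$, $\chi_t$, $\hat\beta_{t+1}$ at constant hypothetical values, and evaluates only the uninformative reporting strategy, so the $\chi_t W_t$ term is a constant and the incentive constraints are moot. It is meant only to show how (20)-(21) could be iterated on a grid, not to reproduce the paper's characterization.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative constants (hypothetical, not taken from the paper).
beta, beta_hat = 0.95, 0.96        # agents' and government's effective discount factors
lam, chi = 1.0, 0.05               # multipliers on resources and sustainability
a = 2.0
C = lambda u: (u / a) ** a         # inverse utility for U(c) = a*c**(1/a)
theta_bar = 1.0                    # E[theta] under the normalization
W_uninf = 1.0                      # W_t evaluated at the uninformative strategy (a constant here)

v_grid = np.linspace(0.5, 40.0, 200)
k = np.zeros_like(v_grid)          # initial guess for k_{t+1}

def bellman_uninformative(v, k_next):
    """Solve (20)-(21) at promised utility v, restricted to the uninformative strategy:
    a single (u, w) with v = theta_bar*u + beta*w and no information revealed."""
    def neg_value(u):
        w = (v - theta_bar * u) / beta
        w = np.clip(w, v_grid[0], v_grid[-1])   # crude truncation to stay on the grid
        return -(theta_bar * u - lam * C(u)
                 + beta_hat * np.interp(w, v_grid, k_next) - chi * W_uninf)
    u_max = v / theta_bar                       # keeps the implied w non-negative
    res = minimize_scalar(neg_value, bounds=(1e-6, u_max), method="bounded")
    return -res.fun

for _ in range(200):                            # value function iteration
    k = np.array([bellman_uninformative(v, k) for v in v_grid])
```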
3.2 The Revelation Principle

The recursive problem (20) is similar to the usual recursive dynamic contract formulation with commitment, with two modifications. First, agents play a mixed reporting strategy over the set $M_t$ rather than a pure reporting strategy over the set $\Theta$. Second, there is an additional term, $\chi_t W_t$, that captures the cost of information revelation. In this section we further simplify the problem by showing that the incentive constraints (22) impose a lot of structure on agents' reporting strategies and simplify the message space that needs to be considered in best PBEs. Since the incentive constraints (22) hold for each $z$, we drop in this section the explicit dependence of strategies on the payoff-irrelevant variable, and our arguments should be understood to apply for any $z$.

Let $\{u(m), w(m)\}_{m\in M_t}$ be any allocation that satisfies (22). Let $M_t(\theta)$ be the subset of messages that give type $\theta$ the highest utility, i.e.
$$\theta u(m') + \beta w(m') \le \theta u(m) + \beta w(m) \quad\text{for all } m\in M_t(\theta),\ m'\in M_t.$$
Summing this inequality with the analogous condition for type $\theta'$ gives
$$\left(\theta - \theta'\right)\left(u(m) - u(m')\right) \ge 0 \quad\text{for all } \theta,\theta',\ m\in M_t(\theta),\ m'\in M_t(\theta'). \tag{24}$$
One immediate implication of this equation is that any incentive compatible allocation is monotone in types, i.e. $\theta > \theta'$ implies $u(m) \ge u(m')$ and $w(m) \le w(m')$. It follows that each message set $M_t(\theta_i)$ can be partitioned into three regions: the messages that also belong to $M_t(\theta_{i+1})$, the messages that also belong to $M_t(\theta_{i-1})$, and the messages that do not belong to any other $M_t(\theta_j)$. The next proposition shows that it is without loss of generality to assume that agents send only one message per partition. Since there can be at most $2|\Theta| - 1$ such partitions, this also gives an upper bound on the cardinality of the message space needed in best PBEs.

Proposition 3. Any best PBE is payoff equivalent to a PBE in which agents report no more than $2|\Theta| - 1$ messages after any history $h^t$. Furthermore, such a PBE can be constructed so that if $m \ne \hat{m}$ then the utility allocation in equilibrium, $u_t$, satisfies $u_t\left(h^t, m\right) \ne u_t\left(h^t, \hat{m}\right)$.

We briefly sketch the key steps of the argument. Consider any partition $M_t(\theta_i)\cap M_t(\theta_{i+1})$. Equation (24) implies that $\left(u(m), w(m)\right)$ must be the same for all $m\in M_t(\theta_i)\cap M_t(\theta_{i+1})$. Therefore, all reporting strategies that keep the same probability of reporting messages $m\notin M_t(\theta_i)\cap M_t(\theta_{i+1})$ have the same on-the-equilibrium-path payoff $\mathbb{E}\left[\theta u - \lambda_t C(u) + \hat\beta_{t+1}k_{t+1}(w)\right]$. Thus, in a best PBE the reporting of messages in $M_t(\theta_i)\cap M_t(\theta_{i+1})$ must minimize the off-the-equilibrium-path payoff $\mathbb{E}\left[\chi_t W_t\right]$. Convexity of $W_t$ implies that such a reporting strategy keeps the posterior beliefs of the government the same for all $m\in M_t(\theta_i)\cap M_t(\theta_{i+1})$. Then it is without loss of generality to allow only one message for each $M_t(\theta_i)\cap M_t(\theta_{i+1})$. The reverse arguments allow us to reach the same conclusion for the subset of $M_t(\theta_i)$ that does not intersect any other $M_t(\theta_j)$, which we denote $M_t^{\mathrm{excl}}(\theta_i)$. Conditional on observing any message sent with positive probability from the subset $M_t^{\mathrm{excl}}(\theta_i)$, the government learns that the sender is of type $\theta_i$ with certainty. Therefore the off-the-equilibrium-path payoff does not depend on the exact probability with which type $\theta_i$ reports messages in $M_t^{\mathrm{excl}}(\theta_i)$. Strict concavity of the on-the-equilibrium-path payoff implies that the government chooses the same $\left(u(m), w(m)\right)$ for all $m\in M_t^{\mathrm{excl}}(\theta_i)$, which again implies that it is without loss of generality to restrict attention to only one message.

Proposition 3 is very useful for the characterization of the optimal incentive provision and information revelation in the next section. It also allows us to prove a version of the Revelation Principle in our economy. Let $M^*$ be any set of cardinality $2|\Theta| - 1$.

Corollary 1. Any best PBE given $\mathbf{M}$ is payoff equivalent to a best PBE given $M_t = M^*$ for all $t$. The incentive constraint (22) for such an equilibrium can be written, for each $z$, as
$$\sigma\left(m_{2i-1}\mid\theta_i\right) > 0, \qquad \theta_i u\left(m_{2i-1}\right) + \beta w\left(m_{2i-1}\right) \ge \theta_i u(m) + \beta w(m),$$
$$\sigma\left(m\mid\theta_i\right)\left[\left(\theta_i u\left(m_{2i-1}\right) + \beta w\left(m_{2i-1}\right)\right) - \left(\theta_i u(m) + \beta w(m)\right)\right] = 0 \tag{25}$$
for all $i = 1,\ldots,|\Theta|$ and $m\in M^*$, where $M^* = \{m_1,\ldots,m_{2|\Theta|-1}\}$.

This result shows that in general the dimensionality of the message space with a continuum of agents is finite but bigger than the dimensionality of the message space with only one agent. In the latter case Bester and Strausz (2001) showed that without loss of generality one can restrict attention to a message space $M = \Theta$. Their result does not extend to economies with multiple agents.[10] Corollary 1 shows that even with a continuum of agents the dimensionality of the message space remains small.

[10] See Bester and Strausz (2000) for a counterexample with two agents.
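To illustrate the counting behind Proposition 3 in a concrete case (our example, not the paper's), suppose $\Theta = \{\theta_1, \theta_2, \theta_3\}$. Monotonicity (24) implies that any message that is optimal for both $\theta_1$ and $\theta_3$ is also optimal for $\theta_2$, so the only relevant regions are
$$M_t^{\mathrm{excl}}(\theta_1),\quad M_t(\theta_1)\cap M_t(\theta_2),\quad M_t^{\mathrm{excl}}(\theta_2),\quad M_t(\theta_2)\cap M_t(\theta_3),\quad M_t^{\mathrm{excl}}(\theta_3),$$
i.e. at most $2\cdot 3 - 1 = 5$ regions, and hence at most five messages: one "own" message per type plus one shared message for each adjacent pair of types over which the two types may mix.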
4 Characterization

In this section we characterize the properties of efficient information revelation. We start with a simplified environment that illustrates the main trade-offs very transparently. We then extend the insights developed by this example to our general setting.

4.1 A simplified environment

We make two simplifying assumptions in this section. First, we assume that the government can deviate from its implicit promises in period $t$ only if it deviates in period 0. This simplifies the analysis of the best equilibrium, as it is sufficient to ensure that the government's sustainability constraint (10) holds only in period 0. This case is isomorphic to our Lagrangian formulation with $\chi_t = 0$ for all $t > 0$. Second, we assume that agents' preferences can be represented by the utility function $U(c) = ac^{1/a}$ when $1/a < 1$, with $U(c) = \ln c$ when $1/a = 0$.

Under the first simplifying assumption the Lagrangian (17) can be written, up to a constant, as
$$\mathcal{L} = \max_{\mathbf{u},\boldsymbol{\sigma}}\ \mathbb{E}\left\{\sum_{t=0}^{\infty}\beta^t\left(\theta_t u_t - \lambda_t C(u_t)\right)\right\} - \chi_0 W_0$$
subject to (11) and (12). First, consider the sub-game starting from $t = 1$. Since the sustainability constraints do not bind for all $t \ge 1$, the standard Revelation Principle applies and the maximization problem that characterizes the optimal allocations is identical to that in Atkeson and Lucas (1992). It is easy to show that the value function with CRRA preferences takes the form $\kappa_1(v) = \mathrm{const}\cdot|v|^a$ when $1/a \ne 0$ and $\kappa_1(v) = \mathrm{const}\cdot\exp(v)$ when $1/a = 0$.

For any reporting strategy $\sigma\in\Delta(M_0)$ in period 0, let $\kappa_0(v,\sigma)$ be defined as
$$\kappa_0(v,\sigma) = \max_{\{u(m),w(m)\}_{m\in M_0}}\ \mathbb{E}_\sigma\left[\theta u - \lambda_0 C(u) + \beta\kappa_1(w)\right]$$
subject to (21) and (22). The function $\kappa_0(v,\sigma)$ is interpreted as the value function for the government on the equilibrium path if an agent plays reporting strategy $\sigma$. It inherits the homogeneity property of the function $\kappa_1$, namely $\kappa_0(v,\sigma) = d(\sigma)|v|^a$ for some constant $d(\sigma)$. If no public randomization is allowed, the actual value function $\kappa_0(v)$ is given by
$$\kappa_0(v) = \max_\sigma\ \kappa_0(v,\sigma) - \chi_0 W_0(\sigma).$$
With public randomization the maximum is taken over the convex hull of the function on the right-hand side.

Consider first the case when $|\Theta| = 2$ and agents can play only two reporting strategies: a full information revelation strategy $\sigma^{in}$, when each type reports a distinct message with probability 1, and a no information revelation strategy $\sigma^{un}$, when each type randomizes between all messages with the same probability. Since the former strategy has a higher payoff both on and off the equilibrium path, $d(\sigma^{in}) > d(\sigma^{un})$ and $W_0(\sigma^{in}) > W_0(\sigma^{un})$. Figure 1 plots $\kappa_0(v,\sigma) - \chi_0 W_0(\sigma)$ for $\sigma\in\{\sigma^{in},\sigma^{un}\}$, as well as $\kappa_0(v)$ with public randomization. The key observation is that the function $\kappa_0(\cdot,\sigma^{in}) - \chi_0 W_0(\sigma^{in})$ intersects $\kappa_0(\cdot,\sigma^{un}) - \chi_0 W_0(\sigma^{un})$ only once, from below, at some point $v_2$. This shows that the uninformative strategy $\sigma^{un}$ gives higher value to the government than $\sigma^{in}$ for all $v < v_2$ and lower value for all $v > v_2$. Allowing public randomization may further increase the government's payoff and create an intermediate region (region $(v_1, v_3)$ on the graph) in which the reporting strategy is stochastic, but it does not alter the main conclusion: no information revelation is optimal for low values of $v$ and full information is optimal for high values of $v$.

To understand the intuition for this result, it is useful to compare the on-the-equilibrium-path benefits of information revelation, $\kappa_0(v,\sigma^{in}) - \kappa_0(v,\sigma^{un})$, to the off-the-equilibrium-path costs, $\chi_0\left[W_0(\sigma^{in}) - W_0(\sigma^{un})\right]$. Given our CRRA assumption, it is easy to see that the benefits are monotonically increasing in $v$, equal to zero as $v\to\underline{v}$ and to infinity as $v\to\bar{v}$.
The off-the-equilibrium-path costs of information revelation do not depend on $v$. Therefore, there must exist a region of low enough $v$ where the costs of better information revelation exceed the benefits and a region of high enough $v$ where the opposite conclusion holds. This conclusion does not depend on the fact that we restricted attention to only two reporting strategies. When $|\Theta| = 2$, any $\sigma$ that reveals some but not all information to the government satisfies $d(\sigma^{in}) > d(\sigma)$ and $W_0(\sigma) > W_0(\sigma^{un})$. This implies that the reporting strategies $\sigma^{un}$ and $\sigma^{in}$ have higher payoffs than $\sigma$ for low and high values of $v$, respectively.

[Figure 1: Value functions in the simplified example. Solid line: $\kappa_0(v,\sigma^{in}) - \chi_0 W_0(\sigma^{in})$; dashed: $\kappa_0(v,\sigma^{un}) - \chi_0 W_0(\sigma^{un})$; dotted: $\kappa_0(v)$ with public randomization. No information is revealed below $v_1$ and full information above $v_3$.]

When $|\Theta| > 2$, it is still true that $W_0(\sigma) > W_0(\sigma^{un})$, and therefore no information revelation is optimal for low enough $v$. However, full information revelation no longer gives strictly higher value than any other reporting strategy on the equilibrium path. The reason is that bunching might be optimal even in standard mechanism design problems with commitment. Our result for high values of $v$ extends if the distribution of types is such that no bunching is desirable with commitment (equation (33) provides a sufficient condition for that). More generally, it can be shown that some information revelation is desirable for high $v$; in particular, it is optimal for type $\theta_1$ to reveal his type perfectly.

Finally, we comment on the role of the assumption of constant relative risk aversion. Our arguments for no information revelation for low values of $v$ only used the fact that $\sigma^{un}$ provides the lowest value to the government off the equilibrium path. This is true for any preferences. The arguments for the efficiency of full information revelation rely on the fact that $\kappa_0(v,\sigma^{in}) - \kappa_0(v,\sigma^{un})$ becomes unboundedly large for high values of $v$. A sufficient condition for this is that absolute risk aversion goes to zero as consumption goes to infinity, and our conclusion holds for all preferences satisfying this condition.

This example illustrates a general principle that goes beyond the particular taste-shock model that we consider in this paper. Since the off-the-equilibrium-path costs of information revelation do not depend on past promises (the principal simply reneges on them when it deviates) but the benefits do, it is optimal to reveal less information for those agents for whom the on-the-equilibrium-path gains are lowest. The gains from better information are monotone in $v$ in the taste shock economy that we consider in this paper, which allows us to obtain sharp cut-offs, but the general principle holds regardless of this property.
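The cutoff logic of Figure 1 can be reproduced in a few lines: treat the on-path gain from full revelation as an increasing function of $v$ that vanishes at $\underline{v}$ and grows without bound, and the off-path cost as a constant, then solve for the crossing point $v_2$. All numbers below are hypothetical stand-ins for $d(\sigma^{in}) - d(\sigma^{un})$ and $\chi_0\left[W_0(\sigma^{in}) - W_0(\sigma^{un})\right]$, not values computed from the model.

```python
import numpy as np
from scipy.optimize import brentq

a = 2.0
d_gap = 0.3      # stand-in for d(sigma_in) - d(sigma_un), the on-path gain coefficient
cost_gap = 5.0   # stand-in for chi_0 * (W_0(sigma_in) - W_0(sigma_un)), the off-path cost

def net_gain_from_full_info(v):
    """On-path benefit of full revelation minus its off-path temptation cost."""
    return d_gap * abs(v) ** a - cost_gap

# The benefit is zero near the lower bound of v and unbounded for large v,
# so a unique cutoff v2 exists.
v2 = brentq(net_gain_from_full_info, 1e-9, 1e6)
print(v2)        # below v2 no revelation is preferred, above v2 full revelation wins
```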
Since kt and kt+1 are concave functions with interior maxima, it implies that period t + 1 promises have a drift towards the value that maximizes kt+1 : This drift component implies higher future promises for low-v agents and lower promises for high-v agents. This mean-reverting drift helps the government to relax sustainability constraints in 19 future periods. The second condition provides a linkage between promised and current utility allocations. For our purposes it is important to establish bounds for uv and wv : The optimality conditions (see Appendix for details) imply that 1 kt0 (v) tC 1 0 (uv (m; z)) 1 kt0 (v) j j; (28) and kt0 (v) + % 1 1 ^ t+1 ! 1 0 kt+1 (wv (m; z)) % 1 for some numbers %; % with a property that %; % ! = ^ t+1 as kt0 (v) + j j 1 1 ^ t+1 ! (29) ! 0: Equation (28) shows the role of private information. If agents’ types were observable, the planner would equalize the marginal costs of providing 1 util to type , 1 1 tC 0 (u) ; to the shadow cost of past promises, kt0 (v) ; while allocating the same promised utility w to all types. This would give more consumption to higher types, which is not incentive compatible. With private information the dispersion of current period consumption allocations is smaller, which can be seen by the bounds (28). Equation (29) shows, in addition, how the sustainability constraints in future periods interact with the provision of incentives in the current period. It shows that bounds for future promises 1 0 kt+1 (wv ) shrink as v ! v: When future sustainability constraints bind, the government realizes that more dispersion of inequality tomorrow makes it harder to sustain e¢ cient outcomes, which limits the dispersion of promises that the government gives and prevents the long run immiseration. Conditions similar to (26)-(29) also appear in the problems in which the government can fully commit (see Atkeson and Lucas (1992) and Farhi and Werning (2007)). Our analysis generalizes those conditions to the environments with no commitment. A novel feature of such environments is that complete information revelation is generally suboptimal. We next turn to the characterization of the e¢ cient information revelation. Suppose that it is optimal for type optimality for the reporting strategy uv m00 ; z tC uv m0 ; z = t 00 uw v m ;z w t C tC v to randomize between messages m0 and m00 . The together with the envelope theorem yields11 uv m00 ; z uv m0 ; z 00 uw v m ;z w t C Formally, we take a derivative of v in the direction 0 ; de…ned as 0 (m0 j ; z) = (m00 j ; z) = 0 and 0 (mj ; z) = v (mj ; z) for all other m; ; and apply (15). 20 (30) + ^ t+1 kt+1 wv m0 ; z 0 uw v m ;z 11 0 + ^ t+1 kt+1 wv m00 ; z 0 uw v m ;z v (m0 j ; z) + : v (m00 j ; z) ; Equation (30) captures bene…ts and costs of information revelation. The left hand side of the equation captures the bene…ts of better information revelation on the equilibrium path. It compares the “cost-weighted” utility that the government obtains for type messages m0 and m00 : This term is the general form of the expression 0 v; if it receives in 0 (v; un ) that appeared in our simpli…ed example in Section 4.1. The right hand side of the equation captures the cost of information revelation o¤ the equilibrium path. If message m00 reveals more information about further from than m0 ; in a sense that the expectation of conditional on m00 is than the expectation conditional on m0 ; this expression is positive. The right hand side of (30) is the analogue of 0 W0 in W0 ( un ) in Section 4.1. 
Equation (30) develops the key intuition for e¢ cient information revelation. First, consider information revelation for low values of v: Bounds (28) and (29) imply the expression on the left hand side of (30) converges to zero as v ! v; so that the gains from information revelation disappear for low values of v. Therefore, the optimal reporting strategy must converge to a strategy that minimizes the o¤ the equilibrium path cost of information revelation. Such strategy must be uninformative by Lemma 2. Note that the uninformative strategy assigns the same posterior belief for any reported message, so that that right hand side of (30) is zero. In the Appendix we prove a stronger result that the uninformative strategy is optimal for all v su¢ ciently low. Theorem 1 Suppose that the sustainability constraint (10) binds in periods t and t + 1: There exists vt > v such that for all v 2 v; vt strategy v the allocation (uv ; wv ; v) does not depend on z; is uninformative and (uv (m) ; wv (m)) = (uv (m0 ) ; wv (m0 )) for all m; m0 : We now turn to the analysis of information revelation for high values of v: Consider the case when j j = 2: By Proposition 3 we can restrict attention to at most 3 messages, such that reporting two of those messages reveals the type of the sender perfectly. Therefore the left hand side of (30) compares on-the-equilibrium-path bene…ts from a message that reveals full information about a sender and a message that reveals only some of the information. Bounds (28) and (29) can be used to show that as long as the coe¢ cient of absolute risk aversion, U 00 (c)=U 0 (c) goes to zero for high c; the informational gains from better information must go to in…nity as v ! v: Since the informational costs are bounded by Lemma 2, this implies that equation (30) cannot hold and each agent must reveal his type fully for all v high enough. Formally, suppose that utility satis…es Assumption 1 (decreasing absolute risk aversion) U is twice continuously di¤ erentiable 21 and lim U 00 (c) =U (c) = 0: (31) c!1 We say that messages as type v reveals full information about type i with positive probability. i if no type j 6= i sends the same Theorem 2 Suppose that Assumption 1 is satis…ed and j j = 2: Then there exists vt+ < v vt+ ; such that for all v v reveals full information about each type. Previous arguments can also be extended for the …rst and the last type of arbitrary message spaces. Corollary 2 Suppose that Assumption 1 is satis…ed. 1. (no informational distortions at the top). Then 1 v reveals full information about for all v su¢ ciently high. 2. (no informational distortions at the bottom). Suppose in addition that % j j 1 j j information about j j 1 j j > j j 1 + j j j j 1 j j 2 : Then v 0 and reveals full for all v su¢ ciently high. The …rst part of this Corollary shows that full information for the lowest type 1 is optimal for high v: It is an informational analogue of the “no distortion at the top” result from the mechanism design literature (see, e.g. Mirrlees (1971)). In the mechanism design version of our environment, the incentive constraints bind upward, so that that no other type wants to pretend to be 1: 1 is the “top type”in a sense This property of the top types drives both the “no distortion at the top” result in standard mechanism design models with commitment and part 1 of Corollary 2 in our environment. Part 2 of Corollary 2 shows that informational distortions are suboptimal for the highest j j as long as the types are not too close to each other. 
The extra condition in part 2 is needed to rule out bunching. When some types are close, it might be optimal to give them the same allocations even if they reveal their types fully (as, for example, in models with commitment) in order to provide incentives for other types. Since bunching is suboptimal with two types, no additional conditions were necessary in Theorem 2.

More insight into the behavior of the interior types can be gained by assuming that the utility function satisfies $U(c) = ac^{1/a}$ for $a > 1$. In this case it is easy to find the limiting allocations towards which $(u_v, w_v, \sigma_v)$ converge as $v \to \bar{v}$. In particular, we can use the homogeneity of $C$ and the boundedness from below of $u$ to show that for any $v$, $k_t(v_0 v)/v_0^a \to \tilde{k}_t(v)$ as $v_0 \to \bar{v}$, where the function $\tilde{k}_t(v)$ is defined by
\[
\tilde{k}_t(v) = \max_{\{u(m), w(m), \sigma(m|\theta)\}_{m \in M^*},\ \sigma(\cdot|\theta) \in \Delta(M^*)} E^{\sigma}\Big[\theta u - \lambda_t C(u) + \hat{\beta}_{t+1} k_{t+1}(w)\Big] \tag{32}
\]
subject to (22) and (21). Moreover, $\sigma_v$ converges to a solution $\tilde{\sigma}$ of this problem.

The limiting problem (32) has no cost of information revelation. It is equivalent to a standard mechanism design problem in which agents can send reports over a redundantly large message set $M^*$. If it is suboptimal to bunch different types when agents report over $\Theta$, it is also suboptimal to bunch types when agents report over $M^*$. A sufficient condition to rule out bunching in mechanism design problems is an assumption that types are not too close, which we state as Assumption 2. It turns out to be a sufficient condition for full information revelation for high enough $v$ in our economy without commitment.

Assumption 2 The distribution $\pi$ of $\theta \in \Theta$ satisfies
\[
\pi(\theta_{n-1})\left[\theta_n - \theta_{n-1}\right] - \left(\theta_{n-1} - \theta_{n-2}\right) \sum_{i=n-1}^{|\Theta|} \pi(\theta_i) \;\ge\; 0 \quad \text{for all } n > 2. \tag{33}
\]

Note that this assumption is satisfied under the conditions of Theorem 2 and Corollary 2. Suboptimality of bunching in the limit implies that incomplete information revelation is suboptimal for all sufficiently high $v$. The proof of this result is a tedious but mostly straightforward application of the Theorem of the Maximum and we leave it for the Supplementary Material.

Proposition 4 Suppose that $U(c) = ac^{1/a}$ for $a > 1$ and Assumption 2 is satisfied. Then there exists $v_t^+ < \bar{v}$ such that for all $v \ge v_t^+$, $\sigma_v$ reveals full information about each type.

5 Extensions

We discuss two extensions of the baseline analysis.

5.1 Invariant distribution

In the previous sections we used an arbitrary distribution of initial lifetime utilities. In this section we briefly discuss two implications for our analysis when this distribution is an invariant distribution. In this case the multipliers $\hat{\beta}$, $\lambda$ and $\mu$ do not depend on $t$, with $\hat{\beta} > \beta$ and $\mu > 0$ needed to prevent immiseration.^{12} For given $\hat{\beta}$, $\lambda$, $\mu$, one can follow the arguments of Farhi and Werning (2007) to show that there exists an initial distribution with the property that the distribution of promised utilities generated by a solution to (20) coincides with it in every period $t$.

12 This distribution is invariant if the feasibility and sustainability constraints hold with equality. We leave it for future research to explore under which conditions these constraints are satisfied.

One implication of the invariant distribution is that no agent is stuck forever in the region in which no information is revealed. To see this, note that equation (29) shows that an agent's promised utility strictly increases with positive probability if $v_t < v^*$ and strictly decreases with positive probability if $v_t > v^*$, where $v^*$ is given by $k'(v^*) = 0$. Suppose there is no information revelation for all $v \le v^* + a$, where $a \ge 0$. Then the invariant distribution must assign all the mass to this region and, moreover, no information about agents is revealed.
But in this case the sustainability constraint is slack, which is a contradiction. Therefore the "no information" region lies strictly below $v^*$.

Another implication of Theorem 1 and the bounds (29) in this case is that there must exist an endogenous lower bound below which agents' lifetime utility never falls. Once an agent's utility reaches that bound, he bounces back from it with positive probability. These dynamics resemble those of Atkeson and Lucas (1995), except that in their case the utility bound was set exogenously. Near the endogenous lower bound an agent may reveal no information and receive no insurance against that period's shocks.

5.2 Persistent shocks

We chose to focus on iid shocks in our benchmark analysis. Most of our key results extend with few modifications to Markov shocks. When the shocks are Markov, let $\pi(\theta|\tilde{\theta})$ denote the probability of realization of shock $\theta$ conditional on shock $\tilde{\theta}$ in the previous period. We assume that $\pi(\theta|\tilde{\theta}) > 0$ for all $\theta, \tilde{\theta}$. Let $\pi_s(\theta|\tilde{\theta})$ be the probability of realization of $\theta$ conditional on shock $\tilde{\theta}$ being realized $s$ periods ago.

Many arguments in the persistent case are straightforward extensions of our previous analysis. We briefly sketch them in this section, leaving the details for the Supplementary Material. Following the same steps as before, we can show that in the worst equilibrium there is no information revelation to the government. The payoff in that equilibrium, unlike in the iid case, depends on the prior beliefs that the government has. The maximum payoff that the government can achieve in any period $t$ is given by
\[
\tilde{W}_t(\pi_t) = \sup_{\{u_{t+s}(h)\}_{h \in H^{t-1}, s \ge 0}} \int_{H^{t-1}} \sum_{s=0}^{\infty} \beta^s\, E_{p_{t+s}}[\theta|h]\, u_{t+s}(h)\, d\pi_t
\]
subject to the feasibility constraint (8) holding for all $t+s$ and
\[
p_{t+s}\big(\theta|h^{t-1}\big) = \int \pi_s\big(\theta|\tilde{\theta}\big)\, p_t\big(d\tilde{\theta}|h^{t-1}\big) \quad \text{for } s > 0.
\]
Similarly to the iid case, we can bound $\tilde{W}_t(\pi_t)$ with a function that is linear in $\pi_t$. Define the analogue of (13) as
\[
W_t(\sigma, p) = \max_{\{u_{t+s}(m)\}_{s \ge 0}} \int_{M_t} \sum_{s=0}^{\infty} \beta^s \left( \int \theta_s\, \pi_s\big(d\theta_s|\theta\big)\, u_{t+s}(m) - \lambda_{t,t+s}^{w}\big(C(u_{t+s}(m)) - e\big) \right) \sigma(dm|\theta)\, p(d\theta);
\]
then Lemma 2, and in particular the bound (14), can be extended to $W_t(\sigma_t, p_t)$. It is then straightforward to obtain the analogue of the Lagrangian (17). This Lagrangian can be rewritten recursively using the techniques of Fernandes and Phelan (2000), who studied an economy with persistent shocks and commitment. In their formulation the state is a vector of promised utilities $v = \big(v^{\theta_1}, \dots, v^{\theta_{|\Theta|}}\big)$ and the realization of the shock in the previous period. In our environment $v$ remains a state variable, with an analogue of the promise keeping constraints (21) holding for each type,
\[
v^{\theta_i} = \int \big[\theta u(m,z) + \beta w(m,z,\theta)\big]\, \sigma(dm|z,\theta)\, dz\, \pi\big(d\theta|\theta_i\big) \equiv E\big[\theta u + \beta w \,\big|\, \theta_i\big]. \tag{34}
\]
In environments with commitment each agent plays a pure reporting strategy and $\theta$ is known along the equilibrium path. In our setting $\theta$ is not known, but the planner forms posterior beliefs about it. Therefore, the only difference from the commitment environment is that the posterior belief $p$ replaces $\theta$ as the state variable. The recursive formulation is given by
\[
k_t(v, p) = \max_{(u, w, \sigma),\ \sigma(\cdot|\theta) \in \Delta(M^*),\ p' \in \Delta(\Theta)} \sum_{\theta \in \Theta} p(\theta)\, E\Big[\theta u - \lambda_t C(u) + \hat{\beta}_{t+1} k_{t+1}\big(w, p'\big) \,\Big|\, \theta\Big] - \mu_t \int W_t\, dz \tag{35}
\]
subject to (34), the incentive constraint
\[
E^{\sigma(\cdot|\theta,z)}\big[\theta u + \beta w \,\big|\, \theta, z\big] \;\ge\; E^{\sigma(\cdot|\theta',z)}\big[\theta u + \beta w \,\big|\, \theta, z\big] \quad \text{for all } z, \theta, \theta',
\]
and the Bayes rule condition that requires $p'$ to satisfy, whenever defined,^{13}
\[
p'\big(\theta'|m, z\big) = \frac{\sum_{\theta \in \Theta} \sigma(m|\theta)\, \pi\big(\theta'|\theta\big)\, p(\theta)}{\sum_{(\theta, \tilde{\theta}) \in \Theta^2} \sigma(m|\theta)\, \pi\big(\tilde{\theta}|\theta\big)\, p(\theta)}.
\]

13 Strictly speaking, $m$ should be replaced with any Borel set $A$ of messages, analogously to definition (3).

The analysis of dynamic mechanism design problems with persistent shocks is more involved since the recursive problem with additional state variables is more complicated.
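To make the role of the additional belief state concrete, the following minimal Python sketch computes the posterior update described by the Bayes rule condition above: the government first updates its belief about the current type from the observed message and then pushes the belief forward through the Markov transition. The two-point type grid, the transition matrix, the prior, and the mixed reporting strategy are hypothetical numbers chosen only to illustrate the mechanics.

import numpy as np

# Illustrative posterior update for the belief state in (35); all numbers are assumed.
theta = np.array([1.0, 2.0])             # two taste-shock values
P = np.array([[0.8, 0.2],                # P[i, j] = prob(theta_j tomorrow | theta_i today)
              [0.3, 0.7]])
p = np.array([0.5, 0.5])                 # government's prior over today's type
sigma = np.array([[0.9, 0.1],            # sigma[i, m] = prob(type i sends message m)
                  [0.4, 0.6]])

def posterior_next(p, sigma, P, m):
    """Belief over next period's type after observing message m today."""
    joint = sigma[:, m] * p                 # prob(today's type = i and message = m)
    if joint.sum() == 0.0:
        raise ValueError("message has zero probability under (p, sigma)")
    p_today = joint / joint.sum()           # Bayes update on today's type
    return p_today @ P                      # push forward through the Markov kernel

for m in range(sigma.shape[1]):
    print(m, posterior_next(p, sigma, P, m))

If the reporting strategy is uninformative (identical rows of sigma), the update leaves the belief over today's type equal to the prior for every message, so the continuation belief is the same for every message and equals the prior pushed forward through the transition matrix; this is the sense in which an uninformative strategy transmits no information to the government.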
The broad lessons about the desirability of information revelation for high and low promised values (now in the sense of increasing each element of the vector $v$) still hold. First of all, the example in Section 4.1 remains virtually unchanged. The functions defined in that example are homogeneous of degree $a$ in $v$: scaling $v$ by any $x > 0$ scales their values by $x^a$. Note that $v \ge 0$ if $1/a \in (0,1)$ and $v < 0$ if $1/a < 0$. Therefore, no information revelation is optimal for low values of $xv$ (i.e., $x \to 0$ if $1/a \in [0,1)$ and $x \to \infty$ if $1/a < 0$) and full information revelation is optimal for high values of $xv$ in the absence of bunching. We can also generalize Theorem 1 to the persistent case (see the Supplementary Material for details).

Proposition 5 Suppose that utility is bounded. Let $\sigma_{v,p}$ be a solution to (35) and $\bar{\sigma}$ an uninformative strategy. If the sustainability constraint (10) binds in period $t$, then $\Pr\big(\sigma_{v,p} \to \bar{\sigma}\big) \to 1$ as $v \to 0$ for all $p$.

6 Final remarks

In this paper we took a step towards developing a theory of social insurance in a setting in which the principal cannot commit. We focused on the simplest version of no commitment that involves direct, one-shot communication between the principal and the agents, and showed how such a model can be incorporated into the standard recursive contracting framework with relatively few modifications.

Our approach can be extended to other types of communication. For example, a literature starting with the seminal work of Myerson (1982) considered communication protocols that involve a mediator. Such protocols may further increase welfare (see, e.g., Bester and Strausz (2007)) but they require some ability of the policy maker to commit to using the mediating device. The key difference is that, after sending any message $m$, the agent receives a stochastic allocation, so that the incentive constraint (25) holds in expectation rather than message-by-message. The two main insights of the paper – the recursive characterization and Theorems 1 and 2 – depend little on the detailed structure of the incentive constraints. We believe that our analysis can be extended to such settings in a relatively straightforward way and leave it to future research.

Another important direction that needs to be explored in future work is how the allocations in the best equilibria can be decentralized through a system of taxes and transfers. One of the most fruitful ways to proceed may be to extend our basic set up to incorporate margins such as labor supply or job search effort, and to consider decentralizations along the lines of the labor taxation in Albanesi and Sleet (2006) and the unemployment insurance in Atkeson and Lucas (1995).

7 Appendix

7.1 Proof of Lemma 2

First we show that the sequence $\{\lambda_t^w\}_t$ is uniformly bounded and bounded away from 0. The objective function in (7) is concave and the constraint set is convex; therefore a solution to (7), $\{u_t^w(h^t)\}_{h^t \in H^t}$, exists and satisfies
\[
C'\big(u_t^w(h^t)\big) = \frac{1}{\lambda_t^w}\, E_{\pi_t}\big[\theta|h^t\big].
\]
The Lagrange multiplier $\lambda_t^w$ is given by
\[
\int_{H^t} C\left(C'^{-1}\left(\frac{1}{\lambda_t^w}\, E_{\pi_t}\big[\theta|h^t\big]\right)\right) d\pi_t = e. \tag{36}
\]
Since $E_{\pi_t}[\theta|h] \in [\theta_1, \theta_{|\Theta|}]$, the left hand side of (36) is continuous in $\lambda_t^w$ and goes to 0 and to infinity as $\lambda_t^w$ goes to infinity and to 0, respectively. Therefore $\lambda_t^w$ is bounded away from 0 from below and bounded above, and these bounds can be chosen independently of $t$. This also implies that the supremum in (7) is attained.
The same arguments applied to (13) establish that Wt is well de…ned, uw (m) satis…es 1 C 0 (uw (m)) = 1 w; t [ jm] 2 wE t j j w t ; (37) and family fWt gt is uniformly bounded. Continuity of Wt then follows from the Theorem of Maximum since (37) implies that we can restrict fu (m)gm2Mt in maximization problem (13) to a compact set. 0; To see that Wt is convex, for any Wt 0 + (1 00 ) = 00 Z max fu(m)gm2M ) Z max fu(m)gm2M = 0 Wt Z w t C [ u (m) Mt t ) w t C [ u (m) [ u (m) Mt + (1 2 [0; 1] ; Mt t + (1 and max fu(m)gm2M + (1 t Z ) Wt Mt 00 (u(m))] (u(m))] w t C 00 (mj ) (d ) (dmj ) (d ) + (u(m))] [ u (m) 0 w t C 0 w t e (dmj ) (d ) (u(m))] 00 (dmj ) (d ) + : To see that Wt is minimized if and only if a strategy is uninformative, consider any uninformative . From (37), the solution to (13) given Wt ( ) = C 0 1 1 w t w t C C0 1 1 w t : Let 27 satis…es C 0 (uw (m)) = 1w a.s. and hence t R (A) = (Aj ) (d ) for any Borel A of w t e message set M and note that for any Wt ( ) = = max Wt ( ) Z fu(m)gm2M t max fu(m)gm2M t w t C [( u (m) Mt Z Mt f(E [ jm] u (m) ( uw (u (m))) w t C w t C (uw ))] (dmj ) (d ) (E [ jm] uw (u (m))) w t C (uw ))g (dm) : The expression in curly bracket is non-negative, which implies that Wt ( ) all : Moreover for informative jE [ jA] , there is a set of messages A with Wt ( ) for (A) > 0 such that 1j > 0: For all such messages the expression in curly brackets is strictly positive since uw does not satisfy the optimality condition (37) for m 2 A: Since expression above is strictly positive therefore any informative (A) > 0; the cannot be a minimum. We prove (15) by using Theorem 3 in Milgrom and Segal (2002). For completeness, we state it here adapted to our set up Theorem 3 Let V (t) = supx2X f (x; t) where t 2 [0; 1] ; and let x (t) be a solution to this problem given t and ft be the derivative of f with respect to t: Suppose that (i) for any t a solution x (t) exists, (ii) fft (x; )gx2X is equicontinuous, (iii) supx2X jf (x; t) j < 1 at t = t0 , and (iv) ft (x (t) ; t0 ) is continuous in t at t = t0 : Then V 0 (t0 ) exists and V 0 (t0 ) = ft (x (t0 ) ; t0 ) : In our case u = fu (m)gm2Mt can be restricted to a compact set and Z w ) (mj ) + 0 (mj ) dm (d ) ( u (m) t C (u (m))) (1 Mt is continuous in fu (m)gm2Mt ; hence conditions (i) and (iii) of this theorem are satis…ed. The derivative with respect to Z f (fug ; ) is ( u (m) Mt w t C (u (m))) 0 (mj ) (mj ) dm (d ) ; and it does not depend on , therefore ff (fug ; )gfug is equicontinuous. For any the solution to (13), uw ( ) ; given (1 ) 0 + mum, continuous in : Therefore, f (fu ( )g ; 0) 2 [0; 1] ; is unique and, by the Theorem of Maxi- is continuous in at = 0; which veri…es condition (iv). Since sequence f t gt is uniformly bounded (away from zero from below), equation (37) implies that uw (m) is bounded and those bound do not depend on t: Therefore all families of directional derivatives f@Wt ( ) =@ 0 gt are uniformly bounded. 28 We prove (14) next. By Lagrange Duality theorem (Luenberger (1969), Theorem 1, p. ~ t ( t ) can be written as a minmax problem 224), W ~ t ( t ) = min W max Z max 0 fu(h)gh2H t fu(h)gh2H t = = max fu(h)gh2H t Z Ht Z Wt Ht 1 Ht t Z Ht (E t [ jh] u (h) w t C (E t [ jh] u (h) Z 1 Mt jht 1 n ; C (u (h))) d u ht dzd 1 By de…nition of 7.2 w t C ;m t + e w t e + u(ht 1 ; m) o t dmjht 1 ; (d ) dzd t 1: where the inequality follows from the fact that w t (u (h))) d t w t may not be a minimizer for an arbitrary this inequality becomes equality at t; t: which establishes (14). 
Proof of Proposition 3 First we prove some preliminary results. Fix z and consider any incentive compatible allocation that satis…es (22). Let Mt ( ) be as de…ned in the text. For any i let Mt ( i ) = Mt ( i ) \ Mt ( i 1 ), Mt+ ( i ) = Mt ( i ) \ Mt ( i+1 ) and Mtexcl ( i ) = Mt ( i ) n ([j6=i Mt ( j )) : Lemma 3 For all i; Mt ( i ) = Mt ( i ) [ Mtexcl ( i ) [ Mt+ ( i ) : Proof. Suppose m 2 Mt ( i ) nMtexcl ( i ) : Then there exists j such that m 2 Mt ( j ) : Without loss of generality let j i+1 u > i: For any m0 2 Mt ( m0 + w m0 i+1 ) i+1 u (m) + w (m) and i u (m) + w (m) iu m0 + w m0 ; j u (m) + w (m) ju m0 + w m0 : The sum of the …rst the second inequalities implies u (m0 ) the third inequalities implies Lemma 4 Let (u ; w ; u (m0 ) u (m) ; therefore u (m0 ) u (m), the sum of the …rst and = u (m) and m 2 Mt ( i+1 ) : ) be a solution to (20) for a given z: Then (u (m0 ) ; w (m0 )) = (u (m00 ) ; w (m00 )) for all m0 ; m00 2 Mt ( t ) ; for all m0 ; m00 2 Mt+ ( t ) and for almost all (with respect to measure ) m0 ; m00 2 Mtexcl ( i ) : 29 t 1 Proof. The …rst part of the lemma follows immediately from the discussion in the text, so u (m) jm 2 Mtexcl ( ) ; w ^=E we prove the second part. Let u ^=E and consider an alternative (u0 ; w0 ; ) de…ned as (u0 (m) ; w0 (m)) = (u (m) ; w (m)) if m 2 = Mtexcl ( ) ; and (u0 (m) ; w0 (m)) = (^ u; w) ^ if m 2 Mtexcl ( ) : For any 0 therefore 0 u m0 + w m0 > 0 u (m0 ) + w (m0 ) > 0 m2 0 6= ; m0 2 Mt 0 u (m) + w (m) for all m 2 Mtexcl ( ) ; u ^ + w, ^ which implies that (u0 ; w0 ; also satis…es (21) by construction, since Mtexcl w (m) jm 2 Mtexcl ( ) ( ) : The objective function E ) satis…es (22). It gets the same utility from reporting any message h i ^ kt+1 (w) is strictly convex in u C (u) + t t+1 (u; w) ; and would take a strictly higher value at (u0 ; w0 ) than at (u ; w ) unless (u ; w ) is constant on Mtexcl ( i ) a.s. Lemma 5 Let (u ; w ; ) be a solution to (20) for a given z: Suppose there exists a set P ~j ~ M ( ) > 0 and u (m0 ) = u (m00 ) for almost all (with respect to M Mt s.t. ~ . If t > 0 then E [ jm0 ] = E [ jm00 ] a.s: measure ) m0 ; m00 2 M ~ and consider an alternative strategy 0 de…ned as 0 (mj ) = Proof. Fix any m ^ 2 M ~ and 0 (mj ~j (mj ) if m 2 = M ^ ) = M for all : Any agent reports any subset of messages with a positive probability with reporting strategy with a positive probability with 0 : Therefore the strategy pro…le 2 [0; 1]. Let h fm E u ^ ( ) only if he reports that subset = (1 ) t Wt ( ): + 0 satis…es (21) and (22) for any ^ t C (u ) + t+1 kt+1 (w ) Since u (m0 ) = u (m00 ) ; w (m0 ) = w (m00 ) ; 0 fm ^ ( )j =0 = = i @Wt ( ) @ w w ~) f(E 0 [ jm] ^ uw (m) ^ ^ Pr(m 2 M t C(u (m))) Zt w w ( uw (m) t C(u (m))) (mj )dm (d )g: t ~ M 0 ( )j Optimality of = 0 implies that fm ^ h i w w ~ uw (m) E jM ^ ^ E t C (u (m)) 0 for all m; ^ so that it can be written as h i w w ~ E [ jm] uw (m) ^ t C(u (m))jM all m: =0 ~ of both sides, we obtain E Taking expectations conditional on M h i ~ which implies that cov (E [ jm] ; uw (m)) E [ jm] uw (m)jM E optimality (37) implies that uw h (m) is monotonically increasing in E 30 i ~ jM E h ~ uw (m)j ^ M i 0: On the other hand, [ jm] ; and therefore cov (E a.s. [ jm] ; uw (m)) 0: The two conditions can be satis…ed only if E Proof of Proposition 3 and Corollary 1. 
any solution to (20) can take at most 2j j [ jm0 ] = E Lemmas 4 and 5 show that for a given z; 1 distinct values of (u (m) ; w (m) ; E except for possible values that occur with probability 0 under reporting strategy 5 is stated for the case t > 0; but when t [ jm00 ] [ jm]) (Lemma = 0 the problem is isomorphic to the standard mechanism design problem and usual arguments apply to show that j j distinct allocations represent a solution). The outcomes that occur with probability zero can be replaced with any outcome from that set of 2j j 1 distinct values. Therefore without loss of generality the set M can be restricted to at most 2j j 1 messages in (20), and hence, by Proposition 2, to M in the modi…ed problem (17). It remains to verify that it is a best equilibrium since not every solution to (17) needs to be a PBE. Let ( ; G; p ) be a best PBE that generates (u ; M ) along the equilibrium path. By the arguments leading to construction of problem (17), (u ; ) is a solution to that prob- lem given M : It is easy to show that any equilibrium is payo¤-equivalent to one in which the P1 s t us jht 1 E allocations in period t depends only on the expected utility wt ht 1 s=t but not on the particular realizations of shocks in the previous t the Supplementary materials) and so we assume that (u ; 1 periods (Lemma ?? in ) satis…es this property. Pre- vious arguments imply that after any history ht there can be at most 2j j 1 distinct n h i o values of ut ht ; mt ; wt ht ; mt ; E jht ; mt that occur with probability 1 given mt o n jht ; : Therefore it is possible to construct another strategy 0 which is incentive t 2 compatible and delivers the same payo¤ to the agents as tribution of histories any Borel set A of 0 with a property that Pr ([(1 ) v; (1 E C (ut ) ; which veri…es that ( 0 ; ) v] G; p G) generates a dis- 2 A) = Pr ((ut ; E t ) 2 A) for ~ t ( ) and E 0 C (ut ) = ~ t ( 0t ) = W ) : Therefore W t 0 ((ut ; E : The pair ( 0 ; t) ) is a best PBE and proves Proposition 3. If removing all non-distinct messages leaves fewer than jM j messages, extra messages can be added and agents strategies extended to keep posterior probabilities for such messages the same as for some of the original messages. By appropriately choosing which messages to add it is possible construct strategies that satisfy (25). This argument extends for all histories, and hence any value that k0 (v) achieves given any M; can also be achieved given Mt = M for all t: This completes the proof of Corollary 1. 31 7.3 Proofs for Section 4 First, we use Proposition 3 to simplify our problem. We drop explicit dependence of all variables on z and throughout this section all the results are understood to hold for all z: By Proposition 3 we can restrict attention to messages in a set Mv of cardinality no greater than 2j j 1 such that no two distinct reports give the agent the same allocation. We use Nv to denote cardinality of Mv and Mv ( ) to denote the set of messages in Mv that maximize agent ’s utility. 
The incentive constraints (22) can be written as u m0 + w m0 ; u (m ) + w (m ) m0 j u m0 + w m0 ( u (m ) + w (m )) The triple (uv ; wv ; =0 8 is a solution to h ^ max E (1 v) u t C (u) + t+1 kt+1 (w) (38) ; m 2 Mv ( ) ; m0 2 Mv : 2 v) v u;w; subject to (38) where v w i t Wt ( ) (39) = kt0 (v) :14 Without loss of generality we arrange messages in Mv so that uv (m1 ) < ::: < uv (mNv ) ; which also implies that wv (m1 ) > ::: > wv (mNv ) : Let 0 ( ; m ; m0 ) and 00 ( ; m ; m0 ) be Lagrange multipliers on the …rst and second constraint (38). We set ( ; m; m0 ) = 0 for all m 2 = Mv ( ) and ( ; m; m) = 0 for all m; so that 0 ; 00 are well de…ned for all ( ; m; m0 ) : The next lemma establishes bounds on optimal allocations that are important for establishing our main results in this section. Lemma 6 (uv ; wv ; (29) hold, with % = v) satis…es (26): For all v such that kt0 (v) 1+ ^ j j 1 t+1 1 ;%= j j +1 1 ^ t+1 j j : Proof. The …rst order conditions for wv (m) are " # X ^ t+1 X 0 kt+1 (wv (m)) v (mj ) ( )+ v 2 X ( ;m0 )2 Sum over m to obtain X ( ;m)2 Mv " ^ ( ;m0 )2 0 t+1 0 kt+1 (wv 0 # (m)) v ; m; m0 + v (mj ) 00 ; m; m0 Mv ; m0 ; m + Mv 1, equations (27), (28) and v m0 j (mj ) ( ) = v 0 ; m0 ; m = 0: (40) = k 0 (v) ; 14 To see this, let v be the Lagrange multiplier on the promise keeping constraint in the original, convexi…ed Bellman equation. By Proposition 1 and the envelope theorem it satis…es v = kt0 (v) : Form a Lagrangian with v and observe that it can be maximized for each z separately, which yields (39). 32 which is expression (26). For what follows, it is useful to establish bounds on the Lagrange multipliers. Consider 0 a message m1 : If Nv = 1; then trivially Nv > 1 and let 0 be the largest strictly suboptimal for any with 0 > 0 : Moreover, 0 > 0 ; 00 and are two scalars equal to zero. Suppose such that m1 2 Mv ( ) : Since sending message m1 is ( ; m0 ; m1 ) = 0 uv (m1 ) + wv (m1 ) v (m1 j ) 00 ( ; m0 ; m1 ) = 0 for all ( ; m0 ) uv (m) + wv (m) and uv (m1 ) < uv (m) for all m 6= m1 ; which implies u (m1 ) + w (m1 ) > u (m) + w (m) for all m 6= m1 if Therefore, 0 ( ; m1 ; m0 ) = v write (40) for m1 as (m0 j ) ^ 00 ( ; m1 ; m0 ) = 0 for all ( ; m0 ) with t+1 0 kt+1 (wv (m1 )) v ! 0 < < 0 : : Thus we can = # (m1 ) ; (41) where # (m1 ) = P m0 2Mv 0 0 ; m1 ; m0 + m0 j v 0 00 0 0 0 ; m1 ; m0 ; m0 ; m1 + P v (m1 j ) ( ) 2 Since w is decreasing in m and kt+1 is concave, v; ^ t+1 ^ 0 kt+1 (wv (m1 )) t+1 E v v m1 j 0 00 0 0 kt+1 (wv ) = which implies that # (m1 ) Similar steps for mNv establish # (mNv ) 0: (42) 0: The …rst order conditions for uv (m) are X (1 v) tC 0 (uv (m)) 2 X ( ;m0 )2 where (m) v (mj ) ( )+ X ( ;m0 )2 0 ; m0 ; m + Mv v 0 ; m; m0 + v 00 (mj ) ; m; m0 Mv m0 j 00 ; m0 ; m 0 is the Lagrange multipliers on the constraint u (m) = (m) ; (43) U (0) : It is zero if utility is unbounded below. Moreover, since we assumed uv (m1 ) < ::: < uv (mNv ) ; it can only be strictly positive for m1 even in the case when utility is bounded below. We guess and verify that if kt0 (v) = v 1; then (m1 ) = 0:15 Suppose v (m1 ) = 0: Then the …rst order condition for uv (m1 ) becomes X (1 v) tC 0 (uv (m1 )) 2 15 More precisely, for a …xed su¢ cient. v; v (m1 j ) ( ) = # (m1 ) X 2 v ! (m1 j ) ( ) 0 : (44) maximization problem (39) is convex and thus necessary conditions are also 33 ; m0 ; m1 : Combining it with (42), it implies that tC 0 (uv (m1 )) (1 v) E [ jm1 ] v 0 and veri…es that the lower bound for uv (m1 ) does not bind. 
Analogous arguments establish (1 v) E [ jmNv ] v tC 0 (uv (mNv )) ; which, together with the monotonicity assumption on uv (m) ; gives (28). For the rest of the proof consider v that satis…es kt0 (v) 1: Summing (43) over m when (m) = 0 implies (27). Finally, combining (41) with (44) we obtain ^ t+1 0 kt+1 (wv (m1 )) = v + v + 1 1 1 tC 0 (u v (m1 )) v tE v 0 0 where we used v 0 v ( E v [ jm1 ] E v [ jm1 ] 1) + 1 1 0 1 0 + + 1 0 1 tC 0 (uv (m1 )) 1 0 v v; [C 0 (uv )] = 1 v to get the second inequality. Re- arrange it to get 1 0 kt+1 (wv (m1 )) 0 ^ ^ 1 0 t+1 +1 1 j j (1 +1 (1 1 t+1 v) + v) 1 + ^ 1 t+1 ^ ! t+1 ! : This, together with monotonicity of wv (m) ; establishes the upper bound in (29). Analogous manipulation of the …rst order condition for mNv establishes the lower bound in (29). We use this lemma to establish bounds that we use to characterize information revelation for low v: Lemma 7 (a) If utility is unbounded below and constraint (10) binds in period t+1; then there are scalars Au;t > 0; Aw;t > 0 and vt such that for all v vt any solution to (39) satis…es 0 uv (m) uv m0 Au;t for all m; m0 ; 0 wv (m) wv m0 Aw;t for all m; m0 : (b) If utility is bounded below, without loss of generality by zero, then limv!v uv (m) = 0 and limv!v wv (m) = 0 for all m: 34 Proof. (a) When (10) binds in t + 1; ^ t+1 > 0 1 kt (v) is unbounded below, limv! and therefore 1 ^ > 0: When utility t+1 = 1 and therefore expression (29) implies that there exists vt such that 1 2 1 ^ t+1 ! 3 2 0 kt+1 (wv (m)) 1 1 ^ t+1 ! for all m; v vt : This establishes bounds for wv (m) : The incentive constraint 1 uv (m1 ) + wv (m1 ) 1 uv (mNv ) + wv (mNv ) together with monotonicity wv (m1 ) > ::: > wv (mNv ) ; uv (m1 ) < ::: < uv (mNv ) imply that (wv (m1 ) w (mNv )) 1 uv (mNv ) uv (m1 ) 0 establishing bounds for uv : (b) When utility is bounded, Proposition 1 shows that limv!v kt0 (v) = 1: For any the largest v that satisfy kt0 (v ) close to v = 0: Let v~ = E v : By choosing arbitrarily high we can set v arbitrarily [ uv + wv ] : Due to the possibility of randomization, in general v~ 6= v but kt0 (v) = kt0 (~ v ) : Therefore v~ v …nd v~ = E E v v if v v : Then [ uv + wv ] (45) [ 1 uv + wv ] X ( 1) (mj 1 ) [ 1 uv (m) + wv (m)] v m2Mv ( 1) ( 1 ) [ 1 uv (m1 ) + wv (m1 )] ( 1 ) wv (m1 ) 0: Here the second, third and …fth lines follows from nonnegativity of u and w; and the fourth line follows from the fact that if 1 (mj 1 ) > 0; then (uv (m) ; wv (m)) gives the same utility to as (uv (m1 ) ; wv (m1 )) : Since v limv!v wv (m1 ) = 0: Since wv (m1 ) can be chosen to be arbitrarily close to 0, this implies that wv (m) ; this implies that limv!v wv (m) = 0 for all m: Analogous arguments show that limv!v uv (m) = 0: 7.3.1 Proofs for low v We now can prove the …rst limiting result about optimal information revelation. Lemma 8 Suppose that sustainability constraint (10) binds in periods t and t + 1. Then converges to an uninformed strategy as v ! v: 35 v Proof. Let be any uninformative strategy. 
Let uv (m) = E uv ; wv (m) = E v wv for v all m: Note that since uv (m) is increasing in m uv (m1 ) = uv (m1 ) E uv (m) v uv mj j E v = uv mj j ; and an analogues relationship holds for wv : Pro…le (uv ; wv ; ) is incentive compatible and satis…es E v [ uv + wv ] = E [ uv + wv ] : The value of the objective function (39) evaluated as (uv ; wv ; v) (46) should be higher than eval- uated at (uv ; wv ; ) ; E v h ^ t C (uv ) + t+1 kt+1 (wv ) uv uv tC (uv ) + ^ t+1 kt+1 (wv ) t Wt ( t Wt ( v) ): i (47) First, suppose that utility is bounded below, without loss of generality by zero. From Lemma 7, uv (m) ! 0 and wv (m) ! 0 for any m as v ! 0; and therefore uv and wv also converge to zero. Hence in the limit equation (47) becomes lim sup v!v Since Wt ( v) t (Wt ( Wt ( ) by Lemma 2 and exists and satis…es limv!v Wt ( v) t ) Wt ( v )) 0: (48) > 0 when (10) binds in periods t, limv!v Wt ( = Wt ( ) : Wt ( ) is continuous in v) and achieves its mini- mum only at uninformative reporting strategies by Lemma 2, therefore v must converge to some uninformative strategy. Now suppose that utility is unbounded below. By the mean value theorem t (C ^ (uv (m)) C (uv )) = tC 0 (~ uv (m)) (uv (m) uv ) ; 0 kt+1 (wv )) = ^ t+1 kt+1 (w ~v (m)) (wv (m) t+1 (kt+1 (wv ) wv ) ; for some u ~v (m) that takes values between uv (m) and uv and for some w ~v (m) that takes values between wv (m) and wv : By construction, uv (m1 ) wv wv (mNv ), therefore, uv (m1 ) From (28) and (29) limv! 1C 0 (~ uv w ~v (m) uv uv (mNv ) and wv (m1 ) uv (mNv ) and wv (m1 ) (m)) = 0 and limv! w ~v (m) wv (mNv ). 0 ~v (m)) = = ^ t+1 for all m: 1 kt+1 (w We have lim E v lim E v v! 1 = v! 1 = 0; h h (uv tC uv ) 0 (~ uv ) (uv t (C (uv ) 0 uv ) + ^ t+1 kt+1 (w ~v ) 36 i kt+1 (wv )) i wv ) C (uv )) + ^ t+1 (kt+1 (wv ) (wv where the second line uses the mean value theorem and (46), and the last line uses the fact that (uv (m) uv ) and (wv (m) wv ) are bounded for low v by Lemma 7. This implies that (48) holds when utility is unbounded and that converges to an uninformative strategy. v We now can prove Theorem 1. Proof of Theorem 1. Suppose Nv > 1 for some realization of z and consider (uv ; wv ; that solve (39) for such z: Let allocation when priors Wt ( = v) max + [E m m2 : Let uw be the optimal w t C [( u (m) ( uw (u (m))) w t C (uw ))] v (mj ) ( ) m; 2Mv 8 < = max [E fu(m)gm2Mv : X 0 v be the largest such that m1 2 Mv are uninformative: Then Wt ( ) X fu(m)gm2Mv 0 v v v) v w t C [ jm1 ] u (m1 ) w t C [ jm] u (m) (u (m1 )) (u (m)) (E v (E v [ jm1 ] uw [ jm] uw w t C w t C 0 (uw ))] @ 0 (uw ))] @ X 0 v v (m1 j ) ( )A 19 = A (mj ) ( ) v ; X 0 v All terms in square brackets are non-negative since the choice u (m) = uw is feasible. They are strictly positive if E v [ jm] 6= 1; since uw is the optimal allocation for E [ jm] = 1: From Lemma 8 the left hand side of this expression goes to zero as v ! v; therefore the right hand side should also go to zero. This is possible either if (a) E v [ jm1 ] ! 1 (and hence 0v ! j j ) and P P 0 0 v (mj ) ( ) ! 0 for all m > m1 ; or (b) v (m1 j ) ( ) ! 0; E v [ jm2 ] ! v v P 0 1 and 0 v (mj ) ( ) ! 0 for all m > m2 (and hence v ! 1 and m2 2 Mv ( 1 )). The v P other possibilities are ruled out since if E v [ jm] ! 
1 and 0 v (mj ) ( ) 9 0 for some v P 0 0 m > m2 ; then for some m0 m2 we would have 0 v (m j ) ( ) 9 0 and E v [ jm ] 9 v 1.16 Since it is impossible to have j j and 1 to be indi¤erent between more than two distinct allocations in the optimum, for any " > 0 there must be some vt such that for all v solution has Nv = 2 and either (a) m1 2 Mv ", E E v E v j j , v [ jm1 ] arbitrarily close to 1, or (b) m2 2 Mv ( 1 ) ; m2 j v j j (m1 j 1 ) v "; E [ jm2 ] = v [ jm1 ] = 0 [ jm2 ] arbitrarily close to 1. In case (a) (37) implies that uw v (m2 ) = C 0 uw v (m1 ) ! C 1 (1= w ) ; t 0 and in case (b) uw v (m1 ) = C 1( w 1= t ) 1 and uw v (m2 ) ! vt the j j 1 > 1 and < 1 and w and j j= t 0 1 C (1= w t ): We can now rule cases (a) and (b) for v low enough. Consider case (b), case (a) is ruled out analogously. In this case 16 Note that all m > m2 . 1 v (m1 j 1 ) < 1 for v low enough and the optimality condition can be indi¤ererent between at most two distict allocations and therefore 37 v 1 (mj 1 ) = 0 for (30) is 1 (uv (m2 ) uv (m1 )) + ^ t+1 [kt+1 (wv (m2 )) = [( 1 uw v (m2 ) Function w t C 1u w t C t (C (uv (m2 )) C (uv (m1 ))) (49) kt+1 (wv (m1 ))] (uw v (m2 ))) ( 1 uw v (m1 ) w t C (uw v (m1 )))] : (u) is strictly convex and achieves its maximum u ^ at u ^ = C0 1( w 1= t ) = w w uw v (m1 ) : Since uv (m2 ) is bounded away from uv (m1 ) ; the right hand side of (49) is strictly negative and bounded away from 0. When utility is bounded below, Lemma 7 established that all terms on the left hand side of (49) go to zero as v ! v; yielding a contradiction. When utility is unbounded below, substitute the indi¤erence condition 1 (uv (m2 ) uv (m1 )) = (wv (m1 ) wv (m2 )) into (49) and apply the mean value theorem t (C (uv (m2 )) C (uv (m1 ))) = tC 0 (~ u) (uv (m2 ) uv (m1 )) and ^ t+1 (kt+1 (wv ^ = (m2 )) 0 ~ t+1 kt+1 (w) kt+1 (wv (m1 ))) (wv (m2 ) (wv (m2 ) wv (m1 )) wv (m1 )) for some u ~ 2 (uv (m1 ) ; uv (m2 )) and w ~ 2 (wv (m2 ) ; wv (m1 )) : The terms (uv (m2 ) uv (m1 )) u) = 0; and (wv (m2 ) wv (m1 )) are bounded for small v by Lemma 7 and limv! 1 t C 0 (~ 0 ^ limv! 1 t+1 kt+1 (w) ~ = 0 by Lemma 6. Therefore, the left hand side of (49) converges to zero for utilities unbounded below, yielding a contradiction. This proves that Nv = 1 and v is uninformative for all su¢ ciently low v independently of the realization of z: For uninformative 7.3.2 v; there is a unique (uv ; wv ) that solves (39) by strict convexity of u tC (u) : Proofs for high v We prove Corollary 2, since Theorem 2 is a special case of it. Corollary 2. Suppose v is su¢ ciently high so that case that 1 and 2 v = kt0 (v) < 1: We …rst rule out the send the same message with probability 1 for high v: From (41) and (44) we have that there is #v 0 such that ^ t+1 0 kt+1 (wv (m1 )) = 38 v #v ; and tC where ~ = ( 1 and 2 1 1 + 0 (uv (m1 )) = (1 2 2) = ( 1 + 2) v) E > 1: v [ jm1 ] + #v (1 v) ~ + #v ; The last inequality follows from the fact that types play message m1 with probability 1: De…ne function f as f (x) 1 (uv (m1 ) x) tC x) + ^ t+1 kt+1 wv (m1 ) + (uv (m1 ) This function is concave with 0 f (0) = tC 1 0 ^ (uv (m1 )) + Let x ^v be a solution to f 0 (^ xv ) = (1 1 v) t+1 0 kt+1 (wv ~ ! (m1 )) > (1 v) 1 x : ~ =2 and let (^ uv ; w ^v ) = uv (m1 ) 1 1 : x ^v ; wv (m1 ) + Since f is concave, x ^v > 0 and u ^v < uv (m1 ) ; w ^v > wv (m1 ) : Claim 1. Allocation (^ uv ; w ^v ) satis…es bounds (28) and (29). 
^v < xv ; so we establish the claim Let xv be a solution to f 0 (xv ) = 0: By concavity, 0 < x by proving the stronger statement that uv (m1 ) and (29). By de…nition, " 1 1 ^ t+1 0 kt+1 wv (m1 ) + xv and wv (m1 ) + 1 # xv = tC 0 1 (uv (m1 ) xv satisfy bounds (28) xv ) and by xv > 0 tC ^ t+1 0 kt+1 0 (uv (m1 ) wv (m1 ) + xv ) < 1 tC ^ xv 0 (uv (m1 )) ; t+1 0 kt+1 (wv (m1 )) : These conditions imply that 1 (1 1 (1 v) tC v) 1 0 " (uv (m1 ) ^ 1 xv ) < t+1 0 kt+1 j j (1 wv (m1 ) + v) ; 1 xv # < (1 v ) j j; establishing the bounds (28) and (29). Claim 2. (1 ^v v) x ! 1 as v ! v: In Supplementary material we showed (Lemma ??) that Assumption 1 implies that there ^v such that are numbers Bv and B C 0 (uv (m1 )) 0 kt+1 (wv (m1 )) C 0 (^ uv ) 0 kt+1 (w ^v ) Bv 2 0 ^v ; 2 C (uv (m1 )) x (m1 ))] ^v B 2 0 kt+1 (w ^v ) x ^v ; 2 1 0 kt+1 (w ^v ) [C 0 (uv 1 39 (50) 1 x ^v : ^v = 1 ) v; wv (m1 ) ! v then Bv = [C 0 (uv (m1 ))]2 ! 0 and B and if u ^v ! (1 t+1 ^ 0 E v kt+1 (wv ) t+1 2 ! 0: ) v as v ! v: Since by (27) kt0 (v) = The bounds from Claim 1 establish that u ^v ! (1 ^ 0 kt+1 (w ^v ) 0 kt+1 (wv (m1 )) ; this implies that wv (m1 ) ! v as v ! v: Therefore by Lemma ?? the …rst terms on the right hand side of these expression go to zero as v ! v:17 We have ~ 1 1 2 1 = v 1 1 ( f 0 (0) " t v f 0 (^ xv ) C 0 (uv (m1 )) tC C 0 (uv (m1 )) Bv 1 [C 0 (uv (m1 ))]2 v From Claim 1, both C 0 (uv (m1 )) = (1 v) 0 ^ (^ uv ) + 1 2 + t+1 0 kt+1 (wv (m1 )) ^v B 1 and 1 1 0 (w ^v ) kt+1 2 0 kt+1 (w ^v ) = (1 ^v v) x the expression in curly brackets goes to 0 as v ! v: Therefore (1 Claim 3. f (^ xv ) 0 kt+1 (w ^v ) 0 kt+1 (w ^v ) 1 v v) f (0) ! 1 as v ! v: # 2 ) (1 ^v : v) x are …nite, therefore ! 1: Applying the mean value theorem, f (^ xv ) f (0) = f 0 (~ xv ) (1 1 v ^v v) x for some x ~v 2 (0; x ^v ) : Since f is concave; f 0 (0) > f 0 (~ xv ) > f 0 (^ xv ) : Therefore h i 1 ~ ~ : The result follows from Claim 2. 1 ; 1 2 We are now ready to show that it is not optimal for types 1 and 2 f (~ xv ) 1 v 2 to send the same message with probability 1 for high enough v: Suppose it is. Consider an alternative strategy ^ v , where ^ v (mj ^ 1 ) = 1 for some message m ^ that gives allocation (^ uv ; w ^v ) ; and ^ v (mj ) = all 6= 1 ; m: Since u ^ < uv (m1 ) ; w ^ > wv (m1 ) and 1 uv (m1 ) + wv (m1 ) = ^v 1u v (mj ) for + w ^v ; this allocation is incentive compatible and delivers utility v to the agent. Then o n h i i n h ^ ^ kt+1 (w) C (u) + W C (u) + k (w) E^v u (^ ) E u v t+1 t t t t v = ( 1 ) ff (^ xv ) f (0)g + t fWt ( v ) t Wt ( Wt (^ v )g : The second term is …nite since Wt is bounded by Lemma 2, therefore this expression is positive for high v from Claim 3. This contradicts the optimality of (uv ; wv ; Using these arguments we can also rule out type 1 v) : sending the same message as 2 with any positive probability. Then the optimality condition (30) implies that n o n o ^ ^ u (m ) C (u (m )) + k (w (m )) u (m ) C (u (m )) + k (w (m )) 1 v 1 v 1 t+1 v 1 1 v 2 v 2 t+1 v 2 t t When kt is twice di¤erentiable, this result can be established without using Lemma ??. In this case kt00 (v) 2 2 00 00 0 00 kt0 (v)] = )v C (u) = [C (u)] = 0 and limv!v kt (v) = [1 t C (uv ) ; and condition (31) implies that limu!(1 0: Claim 2 can then be established by appling the mean value theorem to the left hand side of (50). 
17 40 o ) v is bounded because the expression on the right hand side of (30) is bounded for all : Therefore the value of this strategy can exceed the value of strategy when 1 and 2 send message m2 with probability 1 by only a …nite amount. But then the arguments of the previous paragraph lead to a contradiction. Part 2 of Corollary 2 is proven using analogous arguments. Here we consider function F (x) j j When that F 0 (0) uv mjMv j + x j j 1 j j = const (1 establish that F (^ xv ) allocation for type type j j: j j 1 v) tC > and % uv mjMv j + x + ^ t+1 kt+1 wv mjMv j j j 1 + j j j j 1 j j 2 j j 1 x : holds, we can show 0 ensures the boundary conditions. Previous arguments F (0) ! 1: Note that the perturbation we consider leaves the same j j 1, and gives an allocation uv mjMv j + x ^v ; wv mjMv j j j 1 x ^v to This is incentive compatible but gives a higher value than v: The contradiction then follows from the fact that kt (v) is a decreasing function for high enough v: 41 References Acemoglu, Daron. 2003. “Why not a political Coase theorem? Social con‡ict, commitment, and politics.” Journal of Comparative Economics, 31(4): 620–652. Acemoglu, Daron, Mikhail Golosov, and Aleh Tsyvinski. 2008. “Political economy of mechanisms.” Econometrica, 76(3): 619. Acemoglu, Daron, Mikhail Golosov, and Aleh Tsyvinski. 2010. “Dynamic Mirrlees Taxation under Political Economy Constraints.” Review of Economic Studies, 77(3): 841– 881. Aiyagari, S. Rao, and Fernando Alvarez. 1995. “E¢ cient Dynamic Monitoring of Unemployment Insurance Claims.” Mimeo, University of Chicago. Akerlof, George A. 1978. “The Economics of "Tagging" as Applied to the Optimal Income Tax, Welfare Programs, and Manpower Planning.” The American Economic Review, 68(1): pp. 8–19. Albanesi, Stefania, and Christopher Sleet. 2006. “Dynamic optimal taxation with private information.” Review of Economic Studies, 73(1): 1–30. Atkeson, Andrew, and Robert E. Lucas. 1992. “On E¢ cient Distribution with Private Information.” Review of Economic Studies, 59(3): 427–453. Atkeson, Andrew, and Robert E. Lucas. 1995. “E¢ ciency and Equality in a Simple Model of E¢ cient Unemployment Insurance.” Journal of Economic Theory, 66(1): 64–88. Benveniste, L M, and J A Scheinkman. 1979. “On the Di¤erentiability of the Value Function in Dynamic Models of Economics.” Econometrica, 47(3): 727–32. Besley, Timothy, and Stephen Coate. 1998. “Sources of Ine¢ ciency in a Representative Democracy: A Dynamic Analysis.” American Economic Review, 88(1): 139–56. Bester, Helmut, and Roland Strausz. 2000. “Imperfect Commitment and the Revelation Principle: the Multi-agent Case.” Economics Letters, 69(2): 165–171. Bester, Helmut, and Roland Strausz. 2001. “Contracting with Imperfect Commitment and the Revelation Principle: The Single Agent Case.” Econometrica, 69(4): 1077–98. Bester, Helmut, and Roland Strausz. 2007. “Contracting with imperfect commitment and noisy communication.” Journal of Economic Theory, 136(1): 236–259. 42 Bisin, A., and A.A. Rampini. 2006. “Markets as bene…cial constraints on the government.” Journal of Public Economics, 90(4-5): 601–629. Chari, V V, and Patrick J Kehoe. 1990. “Sustainable Plans.” Journal of Political Economy, 98(4): 783–802. Chari, V V, and Patrick J Kehoe. 1993. “Sustainable Plans and Debt.” Journal of Economic Theory, 61(2): 230–261. Clementi, Gina Luca, and Hugo A Hopenhayn. 2006. “A Theory of Financing Constraints and Firm Dynamics.” The Quarterly Journal of Economics, 121(1): 229–265. Dovis, Alessandro. 2009. 
“Efficient Sovereign Default.” Mimeo, Penn State.
Farhi, Emmanuel, and Ivan Werning. 2007. “Inequality and Social Discounting.” Journal of Political Economy, 115(3): 365–402.
Farhi, Emmanuel, and Ivan Werning. 2013. “Insurance and Taxation over the Life Cycle.” Review of Economic Studies, 80(2): 596–635.
Farhi, Emmanuel, Christopher Sleet, Ivan Werning, and Sevin Yeltekin. 2012. “Nonlinear Capital Taxation Without Commitment.” Review of Economic Studies, 79(4): 1469–1493.
Fernandes, Ana, and Christopher Phelan. 2000. “A Recursive Formulation for Repeated Agency with History Dependence.” Journal of Economic Theory, 91(2): 223–247.
Freixas, Xavier, Roger Guesnerie, and Jean Tirole. 1985. “Planning under Incomplete Information and the Ratchet Effect.” Review of Economic Studies, 52(2): 173–91.
Golosov, Mikhail, Aleh Tsyvinski, and Iván Werning. 2006. “New dynamic public finance: A user’s guide.” NBER Macroeconomics Annual, 21: 317–363.
Golosov, Mikhail, and Aleh Tsyvinski. 2006. “Designing optimal disability insurance: A case for asset testing.” Journal of Political Economy, 114(2): 257–279.
Golosov, Mikhail, Maxim Troshkin, and Aleh Tsyvinski. 2011. “Optimal Dynamic Taxes.” NBER working paper 17642.
Green, E. J. 1987. “Lending and the Smoothing of Uninsurable Income.” In Contractual Arrangements for Intertemporal Trade, ed. E. C. Prescott and N. Wallace. Minneapolis: University of Minnesota Press.
Hopenhayn, Hugo A, and Juan Pablo Nicolini. 1997. “Optimal Unemployment Insurance.” Journal of Political Economy, 105(2): 412–38.
Hosseini, Roozbeh, Larry E. Jones, and Ali Shourideh. 2013. “Optimal Contracting with Dynastic Altruism: Family Size and per Capita Consumption.” Journal of Economic Theory, 148(5): 1806–1840.
Kocherlakota, Narayana. 2010. The New Dynamic Public Finance. Princeton University Press, USA.
Kydland, Finn E, and Edward C Prescott. 1977. “Rules Rather Than Discretion: The Inconsistency of Optimal Plans.” Journal of Political Economy, University of Chicago Press, 85(3): 473–91.
Laffont, Jean-Jacques, and Jean Tirole. 1988. “The Dynamics of Incentive Contracts.” Econometrica, 56(5): 1153–75.
Ljungqvist, Lars, and Thomas Sargent. 2004. Recursive Macroeconomic Theory. MIT Press.
Luenberger, David. 1969. Optimization by Vector Space Methods. Wiley-Interscience.
Marcet, Albert, and Ramon Marimon. 2009. “Recursive Contracts.” Mimeo, European University Institute.
Milgrom, Paul, and Ilya Segal. 2002. “Envelope Theorems for Arbitrary Choice Sets.” Econometrica, 70(2): 583–601.
Mirrlees, James. 1971. “An Exploration in the Theory of Optimum Income Taxation.” Review of Economic Studies, 38(2): 175–208.
Myerson, Roger B. 1982. “Optimal Coordination Mechanisms in Generalized Principal-Agent Problems.” Journal of Mathematical Economics, 10(1): 67–81.
Phelan, Christopher. 2006. “Opportunity and Social Mobility.” Review of Economic Studies, 487–505.
Phelan, Christopher, and Robert M Townsend. 1991. “Computing Multi-period, Information-Constrained Optima.” Review of Economic Studies, Wiley Blackwell, 58(5): 853–81.
Roberts, Kevin. 1984. “The Theoretical Limits of Redistribution.” Review of Economic Studies, 51(2): 177–95.
Rockafellar, R.T. 1972. Convex Analysis. Princeton Mathematical Series, Princeton University Press.
Royden, H.L. 1988. Real Analysis. Mathematics and Statistics, Macmillan.
Rudin, Walter. 1976. Principles of Mathematical Analysis. Third ed., New York: McGraw-Hill Book Co.
Skreta, Vasiliki. 2006. “Sequentially Optimal Mechanisms.” Review of Economic Studies, 73(4): 1085–1111.
Sleet, Christopher, and Sevin Yeltekin. 2006. “Credibility and endogenous societal discounting.” Review of Economic Dynamics, 9(3): 410–437.
Sleet, Christopher, and Sevin Yeltekin. 2008. “Politically credible social insurance.” Journal of Monetary Economics, 55(1): 129–151.
Stantcheva, Stefanie. 2014. “Optimal Taxation and Human Capital Policies over the Life Cycle.” Working paper.
Stokey, Nancy L., Robert E. Lucas, and Edward C. Prescott. 1989. Recursive Methods in Economic Dynamics. Cambridge, MA: Harvard University Press.
Thomas, J., and T. Worrall. 1990. “Income Fluctuation and Asymmetric Information: An Example of a Repeated Principal-agent Problem.” Journal of Economic Theory, 51(2): 367–390.
Yared, Pierre. 2010. “A dynamic theory of war and peace.” Journal of Economic Theory, 145(5): 1921–1950.