Commitment, Flexibility, and Optimal Screening of
Time Inconsistency
Simone Galperti
Northwestern University
November 11, 2012
Abstract
I study the optimal supply of flexible commitment devices to people who value both commitment and flexibility, and whose preferences exhibit varying degrees of time inconsistency. I find that, if time inconsistency is observable, then both a monopolist and a planner will supply devices that enable each person to commit to the efficient level of flexibility. If instead time inconsistency is unobservable, then both suppliers will face a screening problem. To screen a more time-inconsistent from a less time-inconsistent person, the monopolist and (possibly) the planner inefficiently curtail the flexibility of the device tailored to the first person, and include unused options in the device tailored to the second person. These results have important policy implications for designing special savings devices that use tax incentives to help time-inconsistent people adequately save for retirement.
KEYWORDS: Time inconsistency, self-control, commitment, flexibility, contracts, screening, unused items.
JEL CLASSIFICATION: D42, D62, D82, D86, D91, G21, G23.
Department of Economics, Northwestern University, Evanston, IL 60208 (E-mail: simonegalperti2008@u.northwestern.edu). I am indebted to Eddie Dekel and Alessandro Pavan for many, long, and fruitful discussions that greatly improved the paper. I also thank Gene Amromin, Stefano DellaVigna, Roland Eisenhuth, Jeffrey Ely, William Fuchs, Garrett Johnson, Botond Koszegi, David Laibson, Santiago Oliveros, Jonathan Parker, Nicola Persico, Henrique Roscoe de Oliveira, Todd Sarver, Ron Siegel, Bruno Strulovici, Balazs Szentes, Rakesh Vohra, Asher Wolinsky, Michael Whinston, Leeat Yariv, and seminar participants at Northwestern University, New York University, UC Berkeley, and EEA-ESEM 2012. I gratefully acknowledge financial support from the Center of Economic Theory of the Weinberg College of Arts and Sciences of Northwestern University. All remaining errors are mine.
1 Introduction
Evidence from economics and psychology shows that many people have self-control problems (see, e.g., DellaVigna (2009) for a survey). Signs of such problems include amassing excessive credit-card debt, saving too little for retirement, and exercising too little. People are often aware of their self-control problems and thus look for commitment devices. At the same time, they also face uncertainty about their future and thus value flexibility.1 These conflicting desires create a demand for flexible commitment devices. Examples include illiquid savings devices, such as individual retirement accounts (IRAs) and 401(k) plans, gym memberships with large upfront fees but no per-visit charge, and on-line goal-setting tools provided by companies like stickK and GymPact.2
To get a sense of the importance of such devices, consider the U.S. retirement market. As of June 2012 this market had assets of $18.5 trillion, of which $3.3 trillion (17.8%) was in 401(k) plans and $5.1 trillion (26.5%) was in IRAs.3 IRAs and 401(k) plans are examples of "tax-shielded" investment accounts, which obey special tax rules that subsidize savings but also set limits on contributions to the balance as well as penalties for withdrawals. Authorizing such special accounts is also particularly costly: For the fiscal years 2010-2014, the resulting burden on the U.S. federal budget amounts to $711 billion (i.e., 12.7%) of the estimated tax expenditures.4
This paper offers a theoretical investigation of markets for flexible commitment devices. It is the first to consider simultaneously three essential aspects of such markets. First, people demand devices that provide both commitment and flexibility. Second, people differ in their degrees of self-control and consequently demand different devices. Third, the provider of these devices, whether it is a profit-maximizing firm or a welfare-maximizing government, cannot observe people's degrees of self-control when tailoring its supply to individual needs. Thus, it faces the problem of screening each individual according to his preference for commitment and for flexibility. The paper studies this problem and characterizes its profit- and welfare-maximizing solutions.5
I model a market for flexible commitment devices using a dynamic principal-agent framework with two periods.6 In period 1, the principal ('she,' hereafter) offers the agent ('he') an incentive device, which allows him to choose among several actions in period 2, and for each action specifies a payment to the principal. As an example, consider a savings device that allows the agent to contribute to and withdraw from the balance by paying some fee. In period 1, agents desire flexible devices because their preference over period-2 actions depends on an unknown state (Kreps (1979)), which is not contractible. Moreover, in period 1 some agents also desire commitment, because their preference is time inconsistent (Strotz (1956)). Finally, the agents' self-control problems are unobservable, because they alone know their degree of time inconsistency (i.e., their type). For the sake of illustration, suppose that there are only two types: agents of type C are time consistent, and agents of type I are all time inconsistent to the same degree.

1 For an alternative analysis of the trade-offs between commitment and flexibility see, e.g., Amador et al. (2006). I will compare my analysis with theirs in Section 6.2.
2 Other typical examples of commitment devices include automatic drafts from checking to investment accounts, Christmas clubs, rotating savings and credit associations, microcredit savings accounts in developing countries, fat farms, and programs to reduce consumption of cigarettes, alcohol, or drugs. (See Ashraf et al. (2003), Ashraf et al. (2006), DellaVigna and Malmendier (2004, 2006), Bryan et al. (2010).)
3 Investment Company Institute, "The U.S. Retirement Market, Second Quarter 2012" (September 2012).
4 Source: Estimates of The Federal Tax Expenditures for the Fiscal Years 2010-2014 (2010).
5 O'Donoghue and Rabin (2007) as well as Bryan et al. (2010) raise the question of whether markets can provide products to solve people's commitment problems, but the answer lies beyond the scope of their papers. Laibson (1998) questions the optimality of the retirement instruments created by the U.S. government. Amador et al. (2006) study the design of paternalistic commitment policies such as minimum-savings requirements.
6 DellaVigna and Malmendier (2004) and Eliaz and Spiegler (2006) adopt a similar approach.
In Section 3 I show that, if the agents' types were observable, then the principal would offer each of them a (possibly different) device with the best mix of commitment and flexibility, independently of whether she maximizes profits or welfare. That is, such a device helps each type commit to a flexible plan of action that is efficient in each state from the viewpoint of his period-1 preference.7 I call this outcome efficient, adopting the standard interpretation of the agent's period-1 preference as his long-run preference (O'Donoghue and Rabin (1999)). I also show that, in some cases, the principal can sustain the efficient outcome using the same device with both types, so their unobservability is irrelevant. Otherwise, it creates a screening problem.
This screening problem arises because, thanks to his stronger self-control, C values any flexible device strictly more than does I. So, if the principal wants to sustain a flexible outcome with I, she must grant C enough rents to disincentivize his pretending to be I, as in standard screening models. Unfortunately, these rents can jeopardize I's incentives for revealing his time inconsistency. This possibility can lead to the unusual situation in which the principal has to take into account, at once, both types' incentives for mimicking one another when designing each commitment device.
I derive the optimal screening devices in Section 4. When designing the device for I, the principal faces a standard trade-off between maximizing efficiency with him and extracting rents from C. Intuitively, since these rents depend on the flexibility of I's device, the principal should curtail it below the efficient level. Although in principle she can do so in many ways, I am able to provide a tight characterization of the optimal one. Roughly speaking, the principal limits I's choices to a strict subset of the efficient set and induces bunching at (possibly) both ends of his choice range. I provide an illustrative example in Section 1.1 and present the general result in Section 4.2.
When designing the device for C, the principal always wants to maximize its efficiency, but she must also take into account, as noted, that this device may induce I to pretend to be C. To prevent this, she may have to modify the device for C, relative to the symmetric-information, efficient one, in two ways. First, she may add tempting items that C never uses but that I would. These items make I view the device for C as offering less commitment, and thus deter I from pretending to be C.8 Second, the principal may also distort the outcome with C. Thus, we have a failure of the 'no distortion at the top' property that distinguishes previous screening models, both static and dynamic (e.g., Mussa and Rosen (1978); Courty and Li (2000); Battaglini (2005)). Finally, comparing the monopolist's and planner's optimal device for C, I show that if the monopolist has to rely on unused items (respectively, has to distort C's choices), then the planner does as well.

7 This result generalizes a previous result of DellaVigna and Malmendier (2004), as discussed in Section 6.2.
8 A related result appears in Esteban and Miyagawa (2006b). Our models differ in ways that lead to substantive differences in when and how the principal relies on the unused items; I shall discuss these differences in Section 6.2 after presenting my analysis.
This paper contributes to the growing literature on behavioral contract theory.9 It adds to this literature the result that, when a profit-maximizing firm faces diversely time-inconsistent agents, it screens them using contracts with different degrees of flexibility. Moreover, the paper characterizes the specific form that this screening takes: The contract for I inefficiently curtails flexibility as illustrated in Section 1.1; the contract for C, instead, may intentionally inflate flexibility by offering more options than C deems useful.
This paper also contributes to the normative study of optimal paternalism (O'Donoghue and Rabin (2003)). On the one hand, it shows that there is scope for paternalistic interventions in providing flexible commitment devices to people with self-control problems: When the degree of such problems is unobservable, profit-maximizing firms create inefficiencies for the people who most need commitment. On the other hand, since a paternalistic government also cannot observe people's degree of self-control, its best provision of commitment devices may not reach the first best. To illustrate its properties, in Section 5 I address the following question: How should the government design special savings devices (SDs), with specific tax rules that help I appropriately save for retirement, while ensuring that C does not use such SDs? The main points can be summarized as follows: (1) due to asymmetric information, the government faces a trade-off between the corrective and the redistributive role of taxation; (2) solving this trade-off calls for curtailing the liquidity of the special SDs relative to the standard SDs; and (3) the optimal way to do so involves limiting, through binding caps implementable with tax penalties, the ability both to tap and to contribute to a special SD.
Of course, given the simplicity of my setup, these points are very stylized, and extreme caution is required in relating them to real-world situations. However, at a broad level, they resemble some features of the U.S. regulation of IRAs and 401(k) plans, special SDs with tax incentives. Also, one can interpret some evidence in Amromin (2002, 2003) as consistent with the principle that curtailing the liquidity of the special SDs, namely the IRAs and the 401(k) plans, helps keep people with stronger self-control choosing the standard SDs, namely the so-called 'regular taxable accounts.' Overall, my analysis can offer an additional way of looking at the current U.S. regulation of SDs and of assessing its future revisions.
Finally, this paper relates to the literature on dynamic mechanism design. First, it highlights a methodological point: In models with time-inconsistent agents, one cannot in general restrict attention to mechanisms that are defined only on the path of play. Indeed, to characterize the optimal devices, one must allow for options that are off path. This can never occur in standard models without time inconsistency. Second, deriving the devices that optimally screen the agents' time inconsistency requires some non-standard and recent techniques. For reasons that I will explain later, to handle the incentive constraints involving the agents' types, I use Lagrangian methods from Luenberger (1969). This approach differs from a standard optimal-control approach and the standard dynamic-mechanism-design approach (Courty and Li (2000); Pavan, Segal, and Toikka (2011)). Moreover, to ensure that each device satisfies a necessary monotonicity property in the period-2 states, I adapt Toikka's (2011) generalization of Myerson's (1981) ironing procedure to my setting with off-path options.
In Section 6, I provide a more detailed literature review; I also discuss the possibility of introducing overconfident agents into the model. Section 7 concludes. All proofs are in the appendix.

9 This literature includes DellaVigna and Malmendier (2004, 2006); Esteban and Miyagawa (2006, 2007); Eliaz and Spiegler (2006, 2008); Heidhues and Koszegi (2010); and Spiegler (2011), among others.
1.1 An Illustrative Example: Savings Devices
Consider the following situation. In period 1, the principal offers savings devices (SDs) to the agent, who is planning his future savings. An SD allows him to contribute to and withdraw from the balance in period 2 (his working life) at some fee, thereby providing a tool to smooth consumption between that period and some future period of retirement. Specifically, given an SD, in period 2 the agent chooses the action a and pays the fee p, where a > 0 for contributions and a < 0 for withdrawals. For each a, SDs give a return at retirement, according to a rate that I assume fixed and exogenous. On the other hand, the principal can design different SDs by changing how the fee p depends on the action a.10 Finally, in this example, the principal incurs no cost to provide SDs.
The choice of type C and type I among SDs depends on how each evaluates, in period 1, the decisions that he expects to make in period 2. In period 1 each type knows his period-2 income, y, but he is still uncertain about what level of savings a will be optimal in period 2, as this decision hinges on a state s > 0 that realizes only then, e.g., his period-2 health. Specifically, in period 1 each type assigns the utility y − a − p + sU(a) to choosing a and paying p in state s, where s determines the trade-off between immediate consumption, y − a − p, and expected utility at retirement given a, namely U(a). However, when actually choosing in period 2, C and I may use a different utility function. In period 2, C continues to choose (a, p) according to the function y − a − p + sU(a). Instead, in period 2, I chooses (a, p) according to the function y − a − p + sβU(a) with 0 < β < 1. Thus, in period 2, type I systematically overweights his immediate consumption relative to his future utility.

Suppose that, according to the period-1 utility y − a + sU(a), for some states it is optimal to contribute to an SD (a > 0), but for others (e.g., unexpected health problems) it is optimal to withdraw from it (a < 0). For simplicity, let this optimal plan correspond to the function a*(s) = s − a0, where s ranges from s̲ to s̄.11 Note that a* defines what I call the efficient outcome, since the principal's costs are zero.
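The closed form a*(s) = s − a0 is easy to verify numerically under the log specification U(a) = ln(a + a0) mentioned in footnote 11. The following sketch uses an illustrative a0 and a handful of states (my own choices, not values from the paper):

```python
import numpy as np

# Illustrative parameters (not from the paper): a0 and a grid of states.
a0 = 1.0
U = lambda a: np.log(a + a0)  # footnote 11's example specification

# Candidate actions: withdrawals (a < 0) and contributions (a > 0).
a_grid = np.linspace(-0.9, 3.0, 400_000)

for s in [0.5, 1.0, 1.7, 2.5]:
    surplus = s * U(a_grid) - a_grid          # ex-ante surplus sU(a) - a (zero cost)
    a_star = a_grid[np.argmax(surplus)]
    assert abs(a_star - (s - a0)) < 1e-3      # matches a*(s) = s - a0
```

The first-order condition s/(a + a0) = 1 delivers the same answer analytically.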
Before describing the optimal SDs, I want to briefly give some intuition for the relation between the problem of screening time inconsistency and the flexibility of I's SD, and for why this problem leads to its curtailment. Suppose that the principal wants to help I commit to the plan a*. To do so, his SD must feature penalties for withdrawals and rewards for deposits, since in period 2 I cares systematically more about his immediate consumption than about his retirement well-being. How does C evaluate such an SD in period 1? Roughly speaking, since C has stronger self-control, he expects to reap the rewards and avoid the penalties more often than does I; so C expects a larger payoff from I's SD than does I himself. In contrast, suppose that the principal offers I an SD with only one option, a_nf, i.e., no flexibility at all. Then, I doesn't need any penalty or reward to commit to a_nf, and C cannot benefit from his stronger self-control. More generally, by making I's SD less flexible, the principal can reduce the penalties and rewards that help him commit. These changes curtail how much C expects to benefit from lying in period 1 and consequently reduce the rents that keep C from doing so.

10 Note that the fee p can involve two parts, p1 and p2, where only p2 depends on a and p1 is possibly paid in period 1 by the agent as an initial contribution to the SD.
11 This plan is optimal if, for example, U(a) = ln(a + a0).
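This intuition can be checked with a toy menu. Assume (my own illustration, not a derivation from the paper) U(a) = ln(a + a0) and a linear fee p(a) = (β − 1)a, which rewards deposits and penalizes withdrawals just enough that I's self-2 follows the plan a*(s) = s − a0. Type C's self-2 then deviates toward the rewarded high-contribution options, so C's expected period-1 payoff from I's SD strictly exceeds I's:

```python
import numpy as np

# Illustrative numbers, not from the paper.
a0, beta = 1.0, 0.5                    # beta < 1: I's self-2 underweights future utility
states = np.array([1.0, 1.5, 2.0])     # equally likely period-2 states
menu_a = states - a0                   # options implementing the efficient plan a*(s) = s - a0
menu_p = (beta - 1.0) * menu_a         # linear fee: rewards deposits, penalizes withdrawals

U = lambda a: np.log(a + a0)

def self2_choice(s, b):
    """Index of the menu option maximizing self-2's utility s*b*U(a) - a - p."""
    return np.argmax(s * b * U(menu_a) - menu_a - menu_p)

def period1_value(b):
    """Expected period-1 utility sU(a) - a - p of the options self-2 will pick."""
    picks = [self2_choice(s, b) for s in states]
    return np.mean([s * U(menu_a[k]) - menu_a[k] - menu_p[k]
                    for s, k in zip(states, picks)])

# I's self-2 sticks to the efficient plan; C's self-2 grabs the rewarded options.
assert all(self2_choice(s, beta) == i for i, s in enumerate(states))
assert period1_value(1.0) > period1_value(beta)   # C values I's device strictly more
```

With a single-option menu (no flexibility), both types would get the same payoff, which is why curtailing flexibility curtails C's rents.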
I now describe the SDs that the principal offers to screen the two types. I consider the case in which she maximizes her expected profits from the fees p. The detailed derivation is in Section 4.

Consider first the SD designed for I. Figure 1 illustrates his resulting savings decisions, the function denoted by aI, and compares them with the efficient plan a*. Panel (a) illustrates aI for a setting in which I's market share is large, and panel (b) for a setting in which it is small.

[Figure 1: Efficient Outcome vs. Second Best with type I. Panels (a) and (b).]
By comparing aI and a*, we see exactly how the principal curtails the flexibility of the savings opportunities offered to I relative to the first best. Specifically, there are three key properties: (1) I obtains fewer savings options than in the efficient plan, since the range of aI is a strict subset of that of a*; (2) I will withdraw too little for small states, where aI(s) > a*(s), and will contribute too little for large states, where aI(s) < a*(s); and (3) I will not adjust his savings decisions at all to some extreme states, where aI is constant. This last property always occurs for large enough states; it can also occur for small enough states, as shown in panel (b). As I show in the rest of the paper (see Proposition 2), properties (1)-(3) hold quite generally, as long as the principal's goal assigns some weight to profits. Concretely, we can think of the SD offered to I as an account that, besides possibly rewarding contributions and punishing withdrawals, also sets limits on these transactions. Such limits appear, for example, in some individual-development accounts (a form of matched savings accounts) and Christmas-club accounts offered by some financial institutions in the U.S. (see also Ashraf, Gons, Karlan, and Yin (2003)).
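The three properties can be encoded directly. The curve below is a hand-drawn stand-in for aI in the spirit of panel (b); its shape is purely illustrative, not derived from the model, and the assertions mirror properties (1)-(3):

```python
import numpy as np

# Illustrative only: a*(s) = s - a0 on S = [1, 2], and a curtailed, flatter a_I
# that is clipped (bunched) at both ends of its range.
a0 = 1.0
s = np.linspace(1.0, 2.0, 1001)
a_star = s - a0                                    # efficient plan, range [0, 1]
a_I = np.clip(0.8 * (s - 1.0) - 0.05, 0.1, 0.6)    # hand-drawn second-best curve

# (1) Fewer options: the range of a_I is a strict subset of that of a*.
assert a_I.min() > a_star.min() and a_I.max() < a_star.max()
# (2) Too little withdrawal in low states, too little contribution in high states.
assert a_I[0] > a_star[0] and a_I[-1] < a_star[-1]
# (3) Bunching: a_I is constant on sets of extreme states at both ends.
assert np.isclose(a_I[s >= 1.85], 0.6).all()
assert np.isclose(a_I[s <= 1.15], 0.1).all()
```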
Consider now the SD designed for C. Figure 2 compares the savings options that this SD offers C, the function denoted by xC, with the efficient plan a*. We see that C's device commits him to the efficient savings plan a* on the relevant range [s̲, s̄]. But it also includes one option that involves an inefficiently large withdrawal (a̲) in period 2, and that C never uses because he deems it reasonable only for s below the smallest possible state s̲. I shall later discuss how the presence of unused options depends on the primitives of the model.

[Figure 2: Efficient Outcome vs. Second Best with type C.]
2 The Model
A principal interacts with an agent over two periods. In period 1 (the contracting stage), the principal offers the agent a device that gives him the opportunity to choose an action a in period 2 (the consumption stage). The contractible action a belongs to a feasible set A := [a̲, ā] ⊂ R with −∞ < a̲ < ā < +∞. For each a, a device specifies a payment p from the agent to the principal. The principal incurs a production cost in period 2, which is given by the twice differentiable function c : A → R with c′ ≥ 0 and c′′ ≥ 0. The principal can perfectly commit to a device, which, if accepted by the agent, is then binding for both parties. I also assume that there is no spot market for a in period 2.
Agents may have time-inconsistent preferences, but they are always sophisticated. To model time-inconsistent agents, I follow Strotz (1956). For each agent, I call self-1 the self who chooses a device and self-2 the self who chooses an option from the resulting menu. Both selves' preferences depend on a state s, which occurs in period 2 and reflects shocks to taste, income, or health. These shocks induce self-1 to desire flexibility; moreover, they are not contractible, for example because only the agent observes them.12 Conditional on s, the direct utilities of self-1 and self-2 from action a are

u1(a, s) := sU(a) − a   and   u2(a, s, β) := sβU(a) − a,

where the function U : A → R is twice differentiable with U′ > 0 and U′′ < 0. Finally, agents' total utility is u1(a, s) − p for self-1 and u2(a, s, β) − p for self-2.
The positive parameter β determines the preferences' degree of time inconsistency and can lead self-1 to desire commitment. When β ≠ 1, self-1 foresees that, in each state, his self-2 trades off the benefit and the cost of any action a in a systematically different manner. This modeling assumption is based on the idea, proposed by cognitive psychologists, of 'salience:' Decision-makers seem to perceive the cost of their actions (−a) as more (or less) salient than their benefit (sU(a)) depending on whether the decision they face is immediate or occurs in the future (see, e.g., Akerlof (1991) and the references therein).

12 If s were contractible, the perfect ability of the principal to commit would make the solution for the agents' time inconsistency trivial.
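The wedge between the two selves shows up in a two-line computation. Under an illustrative specification U(a) = ln(a + 1) (my choice, not the paper's), self-2 with β < 1 picks a strictly smaller action than self-1 would in every state:

```python
import numpy as np

# Illustrative specification satisfying U' > 0 > U'': U(a) = ln(a + 1).
U = lambda a: np.log(a + 1.0)
a_grid = np.linspace(-0.9, 4.0, 200_000)

for s in [1.2, 1.6, 2.0]:
    a1 = a_grid[np.argmax(s * U(a_grid) - a_grid)]             # self-1's favorite action
    for beta in [0.5, 0.8]:
        a2 = a_grid[np.argmax(s * beta * U(a_grid) - a_grid)]  # self-2's actual choice
        assert a2 < a1   # beta < 1: self-2 underweights the benefit sU(a)
```

With β > 1 the inequality flips, which is the gym case of Example 1 below.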
The illustrative example in Section 1.1 gives one interpretation of the model for the case in which the principal offers savings devices; clearly, adding y to both selves' utilities does not affect their preferences. The model can, however, capture other situations. For instance, the principal may offer gym memberships as in the next example.13

Example 1 In period 1, a gym offers membership contracts. Becoming a member allows the agent to exercise in period 2 (e.g., the following month) at the gym; the time spent there determines a fee p. Exercising d minutes causes an immediate discomfort, but it improves future health by Û(d) (with Û′ > 0 and Û′′ < 0). The agent's self-1 knows that his disutility from d will depend, for example, on whether he has a bad day at work (s). Self-1 may also foresee that self-2 will always overweigh the discomfort of d and thus will tend to exercise less than self-1 wants ex ante. To model the agent's preferences for exercising, we can use the functions Û(d) − sd for self-1 and Û(d) − sβd for self-2, where now β > 1. Letting a = Û(d) and assuming quasi-linearity in p, we can rewrite the agent's preferences over (a, p) as u1(a, s) − p and u2(a, s, β) − p with the properties that I assumed above.
As far as period-2 states are concerned, s belongs to the set S := [s̲, s̄] ⊂ R with 0 < s̲ < s̄ < +∞. This assumption simply says that, in each state, the agent assigns a weight to the benefit and the cost of his actions which is bounded away from zero. States have distribution F, which has continuous and positive density f and is commonly known in period 1.

For clarity's sake, in most of the analysis I focus on the case with only two types: agents of type C are time consistent with β_C = 1, and agents of type I are all time inconsistent to the same degree β_I. I also assume that 0 < β_I < 1, as in the savings example.14 Because the agents are sophisticated, they know their degree of time inconsistency before contracting occurs. In contrast, the principal cannot observe β; she, however, knows the possible types and the share μ ∈ (0, 1) of type C in the market.

To evaluate any device in period 1, type i has to take into account what his self-2 does in period 2. Therefore, I restrict attention to devices for which self-2's optimal decisions are always well-defined, independently of the agent's type. So, the payoff to self-1 of any device is just the expected utility of the resulting decisions of self-2, computed using the period-1 preference. Finally, if an agent rejects all the principal's offers, he receives the outside option, whose value is normalized to zero.15
I adopt the following definition of efficiency, which adheres to the standard interpretation of the self-1 preference as the long-run preference of the agent (O'Donoghue and Rabin (1999)).

Definition 1 (Efficiency) For each state s, the ex-ante social surplus of action a is u1(a, s) − c(a), and the efficient outcome is

a*(s) := argmax_{a ∈ A} [u1(a, s) − c(a)].

13 The interpretation of some gym memberships as examples of commitment devices already appears in DellaVigna and Malmendier's (2004, 2006) study of profit-maximizing contracts designed for time-inconsistent consumers.
14 Later in the paper, I explain how the analysis changes when β_I > 1, as in the gym example (Section 6.1), and I extend my results to the case with more than two types (Section 4.4).
15 I discuss the case in which the outside option has type-dependent values in Section 6.1.
I also assume that it is never efficient to induce the smallest and the largest action, and that the maximum ex-ante social surplus is positive in every state.

Assumption 1 For every s, a*(s) is interior and u1(a*(s), s) − c(a*(s)) > 0.

By standard arguments, the outcome a* is already non-decreasing; by Assumption 1, a* becomes increasing. Thus, a* involves choosing a different action in each state and defines the efficient, benchmark level of flexibility.
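A quick numerical sanity check of this monotonicity, under illustrative primitives chosen to satisfy the stated conditions on U (increasing, strictly concave) and c (nondecreasing, convex); none of the functional forms come from the paper:

```python
import numpy as np

# Illustrative primitives on A = [-0.5, 3]: U(a) = ln(a + 1), c convex and
# nondecreasing on A (c' >= 0 because a - a_lo >= 0 everywhere on A).
a_lo, a_hi = -0.5, 3.0
a = np.linspace(a_lo, a_hi, 100_000)
U = np.log(a + 1.0)
c = 0.05 * (a - a_lo) ** 2

states = np.linspace(1.0, 2.5, 50)
a_star = [a[np.argmax(s * U - a - c)] for s in states]

# Interior maximizers, and the efficient outcome is increasing in the state.
assert a_lo < min(a_star) and max(a_star) < a_hi
assert all(x < y for x, y in zip(a_star, a_star[1:]))
```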
Finally, I assume that, when designing her devices in period 1, the principal maximizes a weighted sum of her expected profits and the expected ex-ante social surplus, with weight α ∈ [0, 1] on profits and 1 − α on welfare. This assumption is simply a convenient way to cover the case of a monopolist (α = 1), as well as the case of a paternalistic planner who may also have to worry about the profitability of her devices (α < 1). Intuitively, this situation may arise when the planner is the exclusive provider of commitment devices and has limited resources to finance them, or when she only regulates the market of such devices and has to ensure that third-party providers expect enough profits to enter the market. In these cases, I could alternatively let the planner maximize the expected ex-ante social surplus subject to the constraint of generating some minimum (possibly negative) level of profits. As I show in Appendix C, this alternative specification would only make α endogenous, without affecting my results.
3 Symmetric Information: A Benchmark

In this section I show that, if the agents' degree of time inconsistency were observable, then the principal would offer each type a device that sustains the efficient outcome, independently of how much she cares about profits. In so doing, she would provide a perfect mix of commitment and flexibility.
When the agent's type is observable, by the Revelation Principle, I can characterize optimal devices using direct mechanisms (DMs) that induce the agent to report truthfully the state s in period 2. Formally, a DM is a pair of functions (a, p), where a : S → A and p : S → R. DMs must satisfy the constraints

∫_S [u1(a(s), s) − p(s)] f(s) ds ≥ 0   (IR)

and

u2(a(s), s, β) − p(s) ≥ u2(a(s′), s, β) − p(s′) for all s, s′.   (IC_β)

Note that the constraint (IR) uses the preference of self-1, whereas the constraints (IC_β) use the preference of self-2. Subject to these constraints, the principal then chooses (a, p) to maximize

(1 − α) ∫_S [u1(a(s), s) − c(a(s))] f(s) ds + α ∫_S [p(s) − c(a(s))] f(s) ds.
Although changing α modifies the principal's goal, it turns out that she always helps time-inconsistent agents fully solve their self-control problems.
Lemma 1 (First Best) Suppose that the agents' degree of time inconsistency is observable. Then, for any β, the principal offers a device that sustains the efficient outcome a* for all α ∈ [0, 1].
The intuition for Lemma 1 is as follows. Since the agent has no private information at the contracting stage, he has no advantage over the principal. Thus, the principal becomes the residual claimant of the utility that the agent expects when choosing a device, i.e., of the expected utility of the agent's self-1. Hence, independently of α, the principal's goal becomes to maximize the expected ex-ante surplus. There is, however, a twist here, compared to a model with only time-consistent agents: When 0 < β < 1, the principal has to make sure that the agent's self-2 is willing to go along with her efficient plan. Doing so is feasible, despite time inconsistency, because both self-1 and self-2 prefer higher actions in higher states. Lemma 1 generalizes a similar result of DellaVigna and Malmendier (2004), which relies on the same intuition.
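The residual-claimant logic can be made explicit by rearranging the principal's objective from the display above: for any α,

```latex
(1-\alpha)\int_S \bigl[u_1(a(s),s)-c(a(s))\bigr]f(s)\,ds
  + \alpha\int_S \bigl[p(s)-c(a(s))\bigr]f(s)\,ds
\;=\;
\int_S \bigl[u_1(a(s),s)-c(a(s))\bigr]f(s)\,ds
  \;-\; \alpha\int_S \bigl[u_1(a(s),s)-p(s)\bigr]f(s)\,ds .
```

The last integral is nonnegative by (IR), and the principal raises payments until it equals zero; what remains is the expected ex-ante surplus, whatever the value of α. This is why the first best survives any weight on profits, as long as payments exist that keep self-2 on the efficient plan.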
The goal of the present paper is to explain why and how the agent's private information about his time inconsistency affects the principal's supply of flexible commitment devices. As I show in Appendix A, if the number of states is finite, then for β_I large enough the principal can (and will) sustain the efficient outcome a* without worrying about the agent's private information. Intuitively, with discrete states there are many incentive schemes that sustain a* with each type. Moreover, if β_I and β_C are close enough, there is a single incentive scheme that sustains a* with both types; so private information does not matter. However, this is never the case with a continuum of states.
4 Asymmetric Information

4.1 The Screening Problem
In this section, I set up the principal's problem of screening the agents' degree of time inconsistency as a dynamic mechanism-design problem. I then show that asymmetric information grants the time-consistent agent an advantage over the time-inconsistent agent, which depends on the flexibility of the latter's device. Finally, I derive necessary and sufficient conditions that will allow me to characterize the optimal screening devices. Hereafter, I will call i's device the i-menu.
The principal has to design two devices, one for each type, while fulfilling two requirements. The first requirement is that type i self-selects the i-menu; the second requirement is that, given the i-menu, in each state i's self-2 chooses the option that the principal wants him to choose in that state. To formally describe this design problem, we can use direct mechanisms (DMs) that induce the agents to reveal sequentially their period-1 and period-2 private information.
In a model with time-inconsistent agents, one needs to be careful when defining the class of DMs to be used in the analysis of the principal's problem. In contrast to models with only time-consistent agents (Myerson (1986)), restricting attention to DMs that let self-1 report β and self-2 report only his new information s entails a loss of generality. To characterize the optimal screening menus, DMs must allow self-2 to report how his preference depends on the combination of β and s. The need for allowing these richer reports will be clarified below (see Proposition 3). Note that, in each state, self-2's preference is pinned down by the single number βs. So, call this number the self-2 'valuation' of U(a), and for each type i define

Θ^i := {θ : θ = β^i s, s ∈ S} = [β^i s̲, β^i s̄].
It is then without loss of generality to focus on DMs that satisfy two properties. First, they assign a pair (a, p) to the agent's sequential reports of β and θ; these reports correspond, respectively, to choosing a menu in period 1 and one of its options in period 2. To track period-2 choices after any period-1 report, I consider DMs that assign pairs (a, p) to coherent as well as incoherent reports; for example, if self-1 reports that he is time consistent (i.e., β^C), each DM still allows self-2 to report a valuation that only a time-inconsistent agent can have (i.e., θ < β^C s̲). The second property that each DM must satisfy is that it ensures that truthfully reporting β is optimal in period 1 and that truthfully reporting θ is optimal in period 2 for any period-1 report about β.16
I can now formally state the principal's screening problem; this requires introducing some notation. Define Θ := [θ̲, θ̄], where θ̲ := β^I s̲ and θ̄ := β^C s̄ (i.e., Θ = co(Θ^C ∪ Θ^I)). A DM is a collection of functions {X, T} = (x^i, t^i)_{i=C,I}, where x^i : Θ → A is an allocation rule and t^i : Θ → R is a payment rule. With a slight abuse of notation, let u₂(a, θ) := u₂(a, s, β) for θ = βs. Also, let U^i(x^j, t^j) be the utility that type i expects in period 1 from choosing the j-menu; let Π^i(x^i, t^i) be the expected profits that the i-menu yields with i; and let W^i(x^i) be the expected surplus that x^i induces with i's self-1; recall that the ex-ante surplus of action a in state s = θ/β^i is u₁(a, θ/β^i) − c(a). The principal's problem is then
P := max_{X,T} (1 − α)W(X) + αΠ(X, T)

s.t., for i = C, I,

u₂(x^i(θ), θ) − t^i(θ) ≥ u₂(x^i(θ′), θ) − t^i(θ′) for θ, θ′ ∈ Θ,   (θ-IC^i)

U^i(x^i, t^i) ≥ U^i(x^{−i}, t^{−i}),   (IC^i)

U^i(x^i, t^i) ≥ 0,   (IR^i)

where

W(X) = μW^C(x^C) + (1 − μ)W^I(x^I)

and

Π(X, T) = μΠ^C(x^C, t^C) + (1 − μ)Π^I(x^I, t^I).
The goal for the rest of this section will be to reduce problem P to a set of simple optimality
conditions that characterize each screening menu.
Starting from the constraints involving θ, it follows from standard arguments that a DM {X, T} satisfies (θ-IC^i) if and only if, for i = C, I,

x^i ∈ M := {x : Θ → A | x non-decreasing},   (MON^i)

and

t^i(θ) = u₂(x^i(θ), θ) − ∫_{θ̲}^{θ} U(x^i(y)) dy − k^i for each θ,   (ENV^i)

where k^i := u₂(x^i(θ̲), θ̲) − t^i(θ̲). Hereafter I will consider only DMs that satisfy the constraint (θ-IC), which I will denote by {X, k} = (x^i, k^i)_{i=C,I} and will simply call DMs. Similarly, I will denote the i-menu by (x^i, k^i).
16 This second property is stronger than requiring that truthfully reporting θ be optimal only conditional on a truthful report about β. However, it entails no loss of generality. See, e.g., Pavan (2007).
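The characterization in (MON^i) and (ENV^i) can be checked numerically. The sketch below is illustrative only: the functional forms are my assumptions, not the paper's. It posits the quasi-linear specification u₂(a, θ) = θU(a) − a with U(a) = log(a + 2), picks a made-up non-decreasing allocation rule on a θ-grid, builds the payments from the envelope formula, and verifies that no valuation prefers another valuation's option.

```python
import numpy as np

# Numerical check of (MON) and (ENV). Illustrative assumptions, not the
# paper's primitives: u2(a, theta) = theta*U(a) - a with U(a) = log(a + 2).

U = lambda a: np.log(a + 2.0)

theta = np.linspace(0.5, 2.0, 301)       # grid over the valuation space
x = np.clip(theta - 1.0, -0.8, 0.9)      # a candidate allocation rule
assert np.all(np.diff(x) >= 0)           # (MON): x is non-decreasing

k = 0.0                                  # the scalar k^i
# (ENV): t(theta) = u2(x(theta), theta) - int_{theta_min}^{theta} U(x(y)) dy - k
u2 = theta * U(x) - x
integral = np.concatenate(
    ([0.0], np.cumsum(0.5 * (U(x[1:]) + U(x[:-1])) * np.diff(theta))))
t = u2 - integral - k

# theta-IC on the grid: V[i, j] = payoff of valuation theta_i from option j.
V = theta[:, None] * U(x)[None, :] - x[None, :] - t[None, :]
assert np.all(np.diag(V) >= V.max(axis=1) - 1e-9)   # truth-telling is optimal
```

Because U(x(·)) is non-decreasing, each trapezoid of the discrete envelope integral is bounded by the local value of U, which is exactly why the discrete incentive constraints hold without further adjustment.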
Consider now the constraints (IC^i). Using condition (ENV^j), we can explicitly write the expected utility to type i from the j-menu as

U^i(x^j, k^j) = E[u₁(x^j(θ̃), θ̃/β^i) − t^j(θ̃) | β = β^i]
             = ∫_{Θ^i} [u₁(x^j(θ), θ/β^i) − u₂(x^j(θ), θ) + ((1 − F^i(θ))/f^i(θ)) U(x^j(θ))] f^i(θ) dθ + ∫_{θ̲}^{θ̲^i} U(x^j(θ)) dθ + k^j,   (EU)

where F^i is the distribution that F induces on Θ^i conditional on the agent being type i. Using expression (EU), we can now study i's incentives to mimic −i. To do so, define

R^i(x^{−i}) := U^i(x^{−i}, k^{−i}) − U^{−i}(x^{−i}, k^{−i})

and rewrite (IC^i) as

U^i(x^i, k^i) ≥ U^{−i}(x^{−i}, k^{−i}) + R^i(x^{−i}).   (R-IC^i)
We see that i’s willingness to mimic i depends on two things: the …rst, U i (x i ; k i ), is the
overall expected payo¤ to i; the second, Ri (x i ), is how much i expects to gain or lose, relative
to i, by facing the menu of i while being of type i. The function Ri (x i ) is crucial, because it
determines whether type i will enjoy information rents due to the allocation sustained with type
i. By analogy, recall that in the screening model of Mussa and Rosen (1978), a high-valuation
buyer has an incentive to misreport his type because he values more any quality o¤ered to lowvaluation buyers (the analog of x i ) than they do. This advantage ultimately determines his
information rents.
In my model, it is type C who enjoys an information advantage at the contracting stage, provided that the I-menu involves some flexibility.

Proposition 1 Time-consistent agents enjoy an information advantage over time-inconsistent agents: For any direct revelation mechanism {X, T} that satisfies (θ-IC), R^C(x^I) ≥ 0 and R^I(x^C) ≤ 0. Furthermore, the (dis)advantage of one type disappears if and only if the menu of the other type involves no flexibility: R^i(x^{−i}) = 0 if and only if x^{−i} is constant over (θ̲, θ̄).17
The intuition for Proposition 1 is simple. Given the menu (x^j, k^j), C's and I's future choices reflect different preferences: u₂(a, s, β^C) and u₂(a, s, β^I). On the other hand, today both C and I use the same preference to evaluate their future choices, and this preference coincides with that governing C's choices. Therefore, today both C and I must like C's future choices better than I's. This explains the first part of the proposition.18 Moreover, today C must be strictly better off than I if their future choices differ with positive probability. As shown in the proof, C's and I's choices will differ with positive probability unless x^j corresponds to a singleton menu. In summary, the cause of i's (dis)advantage is, and can only be, the different behavior of i and of j given the j-menu, a difference that can arise only if the j-menu features some flexibility.
17 The allocation x^{−i} can jump at θ̲ and θ̄ without making R^i(x^{−i}) ≠ 0 simply because the distribution F is atomless. Since this indeterminacy has no economic content, hereafter I focus on the extension of x^{−i} to [θ̲, θ̄] by continuity, whenever possible.
18 In the proof, I only assume that 0 < β^I < β^C ≤ 1. So, what really matters is that C's future preference is 'closer' to today's preference than I's future preference.
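Proposition 1's inequalities can be illustrated numerically. The sketch below uses made-up primitives (the function U, the state grid, and the values of β are my assumptions, not the paper's): both types evaluate a menu with the same period-1 criterion, but self-2's choice discounts U(a) by β, so the type whose self-2 maximizes the period-1 criterion weakly prefers any menu ex ante, with equality for a singleton menu.

```python
import numpy as np

# Illustration of Proposition 1 with made-up primitives: period-2 payoff
# beta*s*U(a) - a - p, period-1 payoff s*U(a) - a - p, U(a) = log(a + 2),
# s uniform on a grid, beta_C = 1 (time consistent), beta_I = 0.5.

U = lambda a: np.log(a + 2.0)
S = np.linspace(1.0, 4.0, 200)
beta_C, beta_I = 1.0, 0.5

def expected_payoff(menu, beta):
    """Period-1 expected payoff when self-2 discounts U(a) by beta."""
    total = 0.0
    for s in S:
        a, p = max(menu, key=lambda ap: beta * s * U(ap[0]) - ap[0] - ap[1])
        total += s * U(a) - a - p        # evaluated with period-1 preferences
    return total / len(S)

flexible = [(-1.0, 0.0), (2.0, 1.0)]     # a flexible (action, payment) menu
singleton = [(0.5, 0.2)]                 # a menu with no flexibility

R_C = expected_payoff(flexible, beta_C) - expected_payoff(flexible, beta_I)
assert R_C > 0                           # C's advantage under a flexible menu
# ...and the advantage vanishes when the menu is a singleton:
assert expected_payoff(singleton, beta_C) == expected_payoff(singleton, beta_I)
```

With these parameters the two types' self-2 choices differ in high states, which is exactly where C's ex-ante advantage comes from; collapsing the menu to one option removes it.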
To further understand Proposition 1, it is useful to think about what would change if the agents were heterogeneous, but all time consistent. Specifically, consider a model that is identical to mine except that the agents' utility function is u₂(a, s, β) − p in both periods, and suppose that U(a) > 0.19 Call I_c the time-consistent agent with β = β^I, and C the one with β = β^C. Note that C expects to have a systematically higher valuation than I_c, as she did before relative to I. But now C has an advantage relative to I_c because, even if their future choices coincide, C enjoys a more than I_c already today. In particular, C's advantage is at work even if the menu of I_c is a singleton.
Using Proposition 1, we can now simplify problem P as follows.
Corollary 1 The DM {X, T} solves problem P if and only if X solves

P′ := max_X μ[W^C(x^C) − αR^C(x^I)] + (1 − μ)W^I(x^I)

s.t. x^C, x^I ∈ M,

−R^I(x^C) ≥ R^C(x^I).   (RR)
By Corollary 1, x^C and x^I solve the principal's problem if and only if they are non-decreasing and satisfy condition (RR), which requires that C's advantage under the I-menu not exceed I's disadvantage under the C-menu. That (RR) is necessary comes from combining the constraints (R-IC^C) and (R-IC^I); that it is also sufficient depends on having only two types.
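The necessity step amounts to adding the two period-1 incentive constraints, in the notation of (R-IC^i):

```latex
% (R-IC^C):  U^C(x^C,k^C) \ge U^I(x^I,k^I) + R^C(x^I)
% (R-IC^I):  U^I(x^I,k^I) \ge U^C(x^C,k^C) + R^I(x^C)
\begin{align*}
U^C(x^C,k^C) + U^I(x^I,k^I)
  &\ge U^C(x^C,k^C) + U^I(x^I,k^I) + R^C(x^I) + R^I(x^C)\\
\Longrightarrow\quad 0 &\ge R^C(x^I) + R^I(x^C)
\quad\Longleftrightarrow\quad -R^I(x^C) \ge R^C(x^I).
\end{align*}
```

The last inequality is exactly condition (RR).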
Finally, the next lemma gives necessary and sufficient conditions for x^C and x^I to solve problem P′ (and hence P). It relies on infinite-dimensional Lagrangian techniques that do not require assuming any property of x^C and x^I beyond the necessary monotonicity condition.

Lemma 2 (Optimality) The allocation rules x^C and x^I solve problem P′ if and only if, for some real number γ ≥ 0, (x^C, x^I, γ) satisfies

x^i ∈ arg max_{x ∈ M} W^i(x) − λ^{−i}R^{−i}(x) for i = C, I,

R^C(x^I) + R^I(x^C) ≤ 0, and γ[R^C(x^I) + R^I(x^C)] = 0,

where λ^C := αμ/(1 − μ) + γ and λ^I := γ(1 − μ)/μ.
Note that, by Lemma 2, we can essentially characterize the optimal x^C and x^I by solving two distinct maximizations, with monotonicity as the only remaining constraint. Moreover, the form of these maximizations suggests deriving each x^i and its properties directly as a function of the weight on the rent term. For this reason, it is worth observing at this point that λ^C is increasing in how much the principal cares about profits (α), how much type C matters for her profits (μ), and how much condition (RR) binds (γ). On the other hand, λ^I is increasing in γ but decreasing in μ. I consider x^I first.
19 A similar model appears in Courty and Li (2000).
4.2 The Optimal Screening Device for Time-Inconsistent Agents
In this section I characterize the properties of the screening device that the principal designs for type I. I show that she distorts the I-menu to limit the rents to type C, and that these distortions always involve curtailing flexibility, as described in the initial example about savings devices. I also show how the distortions change as a function of the market share of each type and the weight that the principal assigns to profits.
By Lemma 2, when designing the I-menu, in general the principal faces a trade-off between maximizing the surplus with I and extracting rents from C. On the one hand, if she offers a flexible I-menu that sustains the efficient outcome with I, then she maximizes not only the surplus with I's self-1, but also her profits from this menu. On the other hand, by Proposition 1, a flexible I-menu triggers C's advantage, thereby boosting his information rents. Since these rents reduce the profits from any C-menu, the principal faces a trade-off. Similarly, in Mussa and Rosen (1978), serving low-valuation buyers efficiently requires granting a high-valuation buyer rents that make trading with him less profitable. However, such rents never lead low-valuation buyers to choose the quality designed for the high-valuation buyer. In the present model, instead, the principal also has to worry about the rents that the I-menu creates for C because they may jeopardize I's incentives for choosing the I-menu (i.e., they may lead to a violation of the condition (RR)).
To gain some intuition for why the trade-off between surplus with I and rents to C leads to distorting the flexibility of the I-menu, consider first an analogy with Mussa and Rosen (1978). In their model, to curb the rents to high-valuation buyers, the monopolist distorts the quality for low-valuation buyers below the efficient level. This is because lowering such a quality curbs the advantage of the high-valuation buyers. Similarly, in the present model, the principal should distort the feature of the I-menu that causes C's advantage, namely, its flexibility. To further help intuition, consider a simple example with two states: s_d > s_w. The efficient choices are to deposit a_d > 0 in s_d and withdraw a_w < 0 in s_w. To commit I to these choices, the I-menu must provide his self-2 with appropriate incentives: Intuitively, the payment 'premium' p^I_w − p^I_d must be large enough for I's self-2 not to choose a_w in s_d. Suppose that, as a result, C would always choose (a_d, p^I_d) from the I-menu, thus expecting a strictly higher payoff from it than does I.20 Also, suppose that I's incentive to choose a_w over a_d decreases as a_w increases (i.e., it becomes less negative). Then, by allowing I to withdraw less than a_w, say â, the principal can decrease the premium (p̂^I − p^I_d) that makes I's self-2 withdraw only in s_w. Since I's choice in s_w is now closer to C's, C gains less from his stronger self-control. But since efficiency calls for withdrawing a_w in s_w and depositing a_d in s_d, having only the options â and a_d amounts to inefficiently low flexibility for I.
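The two-state example can be made concrete with a small numerical sketch. All functional forms below are illustrative assumptions, not the paper's primitives: U(a) = log(a + 2), self-2 payoff βsU(a) − a − p, and zero production cost, so that the efficient action is a*(s) = s − 2. The parameters are chosen so that C always deposits from the I-menu while I withdraws in s_w.

```python
import numpy as np

# Two-state sketch. Illustrative assumptions (not the paper's primitives):
# U(a) = log(a + 2), self-2 payoff beta*s*U(a) - a - p, zero production cost,
# so the efficient action solves s*U'(a) = 1, i.e., a*(s) = s - 2.

U = lambda a: np.log(a + 2.0)
s_w, s_d = 1.0, 4.0            # withdrawal state and deposit state
pi = 0.5                       # probability of s_w
beta_I = 0.1                   # I's self-2 heavily discounts U(a)
a_d = s_d - 2.0                # efficient deposit a_d = 2

def min_premium(a_hat):
    """Smallest premium p_w - p_d deterring I's self-2 from taking a_hat in s_d."""
    return (a_d - a_hat) - beta_I * s_d * (U(a_d) - U(a_hat))

def rent_C(a_hat):
    """C's ex-ante gain over I from the I-menu {(a_hat, p_w), (a_d, p_d)} when
    the premium is minimal, C deposits in both states, and I takes a_hat in s_w."""
    dp = min_premium(a_hat)
    return pi * (s_w * (U(a_d) - U(a_hat)) - (a_d - a_hat) + dp)

# Raising a_hat from the efficient withdrawal a_w = -1 toward a_d shrinks
# both the commitment premium and C's rent.
grid = np.linspace(-1.0, 1.5, 6)
premia = [min_premium(a) for a in grid]
rents = [rent_C(a) for a in grid]
assert all(np.diff(premia) < 0) and all(np.diff(rents) < 0)
assert min(premia) > 0 and min(rents) > 0
```

Under these assumptions C's rent reduces to π(s_w − β^I s_d)[U(a_d) − U(â)], which is strictly decreasing in â: shrinking the gap between I's and C's choices lowers the premium and, with it, C's advantage.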
The bottom line is then this: Reducing the gap between I's and C's choices from the I-menu curbs C's advantage, through the changes in the prizes and penalties (t) that help I commit.21 Therefore, we should reduce I's ability to respond to future information below the efficient level. The optimal way to do so, however, is less immediate than lowering quality as in Mussa and Rosen. We may curtail I's flexibility for intermediate states but leave it for extreme states, where it seems more valuable, or we could do the opposite.

20 By Lemma 5 in Appendix A, for such values of p^I_w and p^I_d, C's self-2 will always choose (a_d, p^I_d) from the I-menu, provided that β^I is small enough.
21 A further discussion of the role of payments appears in Section 6.2 in relation to the model of Amador et al. (2006), which rules them out.
To characterize the optimal allocation x^I, one of the paper's main results, I rewrite the objective W^I(x^I) − λ^C R^C(x^I) in Lemma 2 as an expected virtual surplus (see Appendix B for details). This surplus assigns an intricate virtual valuation to each choice x^I(θ) of I; moreover, it keeps track of how C's rents depend on options x^I(θ) that only C would choose from the I-menu. Before stating the result, let x^I* be the allocation rule that defines the symmetric-information I-menu (up to the scalar k^I): x^I*(θ) = a*(θ/β^I) for θ ∈ Θ^I and x^I*(θ) = x^I*(θ̄^I) otherwise. Also, let

a_nf := arg max_{a ∈ A} E(s)U(a) − a − c(a),

which is the ex-ante efficient action if the agent is not allowed to respond to future information at all; 'nf' stands for 'no flexibility'.
Proposition 2 (Optimal I-menu) For every λ^C ≥ 0, there exists an allocation rule x^I(λ^C) that is non-decreasing in θ and maximizes W^I(x^I) − λ^C R^C(x^I). The optimal x^I(λ^C) is unique and continuous in θ and λ^C. If λ^C > 0, x^I(λ^C) features:

(a) Range Reduction with 'Underconsumption' at the Bottom and 'Overconsumption' at the Top: x^I(θ, λ^C) > x^I*(θ) on a non-empty interval [θ̲^I, θ′), and x^I(θ, λ^C) < x^I*(θ) on a non-empty interval (θ″, θ̄^I];

(b) No Flexibility at the Top: x^I(λ^C) is constant over [θ^b, θ̄] with θ^b < θ̄^I.

In addition it may feature

(c) No Flexibility at the Bottom: x^I(λ^C) can also be constant over [θ̲^I, θ_b] with θ_b > θ̲^I. A sufficient condition for this to happen is that, for all s′ > s in [s̲, s†] ≠ ∅,

[F(s′)/f(s′) − F(s)/f(s)]/(s′ − s) ≥ (1 − β^I)/β^I + 1/(β^I λ^C).

Finally,

lim_{λ^C → +∞} max_θ |x^I(θ, λ^C) − a_nf| = 0 and lim_{λ^C → 0} x^I(θ, λ^C) = x^I*(θ) for θ ∈ Θ.
The proof builds on Toikka's (2011) generalization of Myerson's (1981) ironing technique to initially characterize x^I on path (i.e., over Θ^I), and then explicitly constructs the optimal extension of x^I off path. Uniqueness follows from strict concavity of the function U. A few comments on the properties of the optimal I-menu follow.
By Property (a), the screening I-menu always restricts I's choices to a strict subset of the efficient set of actions, which corresponds to the range of x^I*. The I-menu will in general let I act on his future information: x^I need not be constant at a_nf, and by continuity this flexibility extends over a range of states with a rich set of options. For example, the I-menu will feature some flexibility when the share of time-consistent agents is small (so that λ^C is small) and the utility U(a̲) of the smallest action is low enough (see Corollary 2).
The principal makes I 'overconsume' over large states and 'underconsume' over small states for the following reason. Recall that to curb C's rents, she has to shrink the gap between C's and I's choices implied by x^I, as doing so also shrinks the payment premia between them. Therefore, the key is understanding how distorting x^I affects t^I (through (ENV^I)). To do so, fix θ ∈ Θ^I. On the one hand, raising x^I(θ) has a positive effect on t^I(θ) and a negative effect on t^I(θ′) for every θ′ above θ. So, the difference in actions as well as in payments between θ and θ′ shrinks for every θ′ > θ, but rises for every θ′ < θ. On the other hand, reducing x^I(θ) has the opposite effects. Thus, curbing C's rents involves raising x^I(θ) for θ close to θ̲^I because, intuitively, the mass of θ′ > θ prevails over that of θ′ < θ. Instead, it involves reducing x^I(θ) for θ close to θ̄^I, because for such θs the mass of θ′ < θ prevails over that of θ′ > θ. Of course, the principal must weigh the beneficial effects on R^C of distorting x^I against its welfare costs. Nevertheless, the direction of the distortions does not change.
By Property (b), the screening I-menu always makes I not respond to information over a range of large states. This bunching destroys surplus with I but also curtails C's rents, for the same reason as before. To see why, fix an I-menu and identify the largest action chosen by I, x^I(θ̄^I). If any item has a > x^I(θ̄^I), then tossing it does not affect I's choices, but curbs C's rents: In the states with valuations above θ̄^I, a dishonest C would now choose the smaller action x^I(θ̄^I), which is closer to what I chooses. This argument applies a fortiori if, over those states, C would pick an even smaller action, say a_b < x^I(θ̄^I). This extra step, however, requires that also I choose a smaller action for some s below s̄. The smaller is the set of s affected by this change, the smaller is the surplus loss with I. So, the best strategy is to bunch at a_b every θ that was choosing a larger action (formally, every θ > θ^b) and not to change the allocation for all other θs. Finally, for every threshold θ^b < θ̄^I, a dishonest C would behave more similarly to an honest I for the states in [θ^b, s̄], whereas I must choose a smaller action only for the states in [θ^b/β^I, s̄]. Therefore, for θ^b close enough to θ̄^I = β^I s̄, the surplus with I falls less than do C's rents, and some bunching is always optimal.
By Property (c), the screening I-menu can also make type I not respond to information over a range of small states. By Property (a), the principal curbs C's rents by inefficiently raising x^I(θ) for small θs. However, recall that doing so increases the difference in actions and payments between θ and all θ′ < θ. So, intuitively, the larger is θ, the less the principal wants to raise x^I(θ). In particular, she may wish that a larger θ chose a smaller action than does a smaller θ′. Since this violates the monotonicity condition (MON^I), at most the principal can pool bottom self-2 valuations of type I at the same action. The effect of raising x^I(θ) on the payments of all θ′ < θ gains weight as the ratio F^I(θ)/f^I(θ), which equals β^I F(θ/β^I)/f(θ/β^I), grows. So, the principal will wish to implement a decreasing allocation for θ close to θ̲^I if, in this region, F(θ/β^I)/f(θ/β^I) grows fast enough, as formally stated in Proposition 2. Note that this condition on F is more likely to hold when I's degree of time inconsistency is weaker (i.e., β^I is closer to 1), and when the principal cares more about the rents to C (i.e., λ^C is higher).
Finally, the last part of Proposition 2 describes the distortions of the screening I-menu as the weight λ^C becomes arbitrarily large or small. As the principal becomes extremely concerned about C's rents, she tends to disregard I's desire for flexibility, in the limit offering I only the option a_nf, a radical reduction of flexibility vis-à-vis the first best. For example, whenever the principal cares about profits, she tends to induce such an outcome with I if C represents almost the entire market (if α > 0, λ^C → +∞ as μ → 1). If instead the principal cares very little about C's rents, she tends to offer an I-menu that sustains the efficient outcome.
By introducing further assumptions about the distribution F, it is possible to achieve a more detailed comparison of the screening I-menu (i.e., x^I(λ^C)) with the efficient one (i.e., x^I*). This is because, in general, the virtual valuation defining x^I has a complex form. For illustrative purposes, I consider here the case with uniform F. In this case, a simple relationship also emerges between the weight λ^C and the set of actions in the I-menu, as well as the range of states involving flexibility (i.e., [θ_b, θ^b]).
Lemma 3 Suppose that s is uniformly distributed and β^I > s̲/s̄. Then, the allocation rule x^I(λ^C) crosses x^I* only once and is increasing over [θ_b, θ^b]. Furthermore, as λ^C rises, x^I(λ^C) changes as follows: θ^b and x^I(θ^b, λ^C) decrease, and x^I(θ_b, λ^C) increases; when θ_b > θ̲^I, then θ_b increases as well.

In general, more complicated patterns can occur, including bunching at intermediate points for standard ironing reasons. The main point is that, in the present model, bunching at the bottom and at the top arises for new reasons, which hold for a general class of distributions F.
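The ironing step behind these results can be sketched in a few lines. The procedure, in the spirit of Myerson's technique, replaces the running integral of a non-monotone virtual valuation by its lower convex hull; the hull's slope is the ironed, non-decreasing valuation, and intervals where the hull lies strictly below the integral are bunching intervals. The virtual valuation used below is made up purely for illustration.

```python
import numpy as np

# Sketch of ironing: replace the running integral H of a non-monotone virtual
# valuation psi by its lower convex hull; the hull's slope is the ironed
# (non-decreasing) valuation. The virtual valuation here is made up.

theta = np.linspace(0.0, 1.0, 501)
psi = theta + 0.35 * np.sin(6 * np.pi * theta)   # non-monotone virtual valuation

h = theta[1] - theta[0]
H = np.concatenate(([0.0], np.cumsum(0.5 * (psi[1:] + psi[:-1]) * h)))

# Lower convex hull of the points (theta, H), computed by a monotone scan.
hull = [0, 1]
for i in range(2, len(theta)):
    while len(hull) >= 2:
        j, m = hull[-2], hull[-1]
        # drop m if it lies on or above the chord from j to i
        if (H[m] - H[j]) * (theta[i] - theta[j]) >= (H[i] - H[j]) * (theta[m] - theta[j]):
            hull.pop()
        else:
            break
    hull.append(i)

H_hull = np.interp(theta, theta[hull], H[hull])
psi_ironed = np.gradient(H_hull, theta)

assert np.any(np.diff(psi) < 0)                  # psi itself is not monotone
assert np.all(np.diff(psi_ironed) >= -1e-8)      # ironed valuation is monotone
assert np.all(H_hull <= H + 1e-12)               # hull lies below the integral
```

Flat segments of the hull are exactly the pooled regions; under this procedure an allocation responding to psi_ironed instead of psi satisfies the monotonicity constraint by construction.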
4.3 The Optimal Screening Device for Time-Consistent Agents
In this section I characterize the screening device that the principal designs for type C. I show that, to deter I from mimicking C, the principal may have to modify the efficient C-menu by adding to it unused options and, if these options are not deterring enough, by also distorting C's choices.
By Lemma 2, when designing the C-menu, the principal wants to maximize the surplus with C, but she may also have to worry about how the C-menu affects I's incentives for revealing his time inconsistency. By condition (RR), she can sustain the efficient outcome with C, given a certain I-menu, if and only if I's disadvantage when facing the C-menu (−R^I(x^C)) is at least as large as C's advantage when facing the I-menu (R^C(x^I)). A natural benchmark of a C-menu sustaining the efficient outcome with C is the symmetric-information menu. This menu is defined, up to the scalar k^C, by the allocation rule x^C* with x^C*(θ) = a*(θ/β^C) for θ ∈ Θ^C and x^C*(θ) = x^C*(θ̲^C) otherwise. The next lemma shows that x^C* need not satisfy condition (RR) for any allocation rule x^I.

Lemma 4 Let x^C* and x^I* be the allocation rules associated with the efficient outcome a*. There is a family of distributions F such that (x^C*, x^I*) violates condition (RR) and is therefore infeasible.
The intuition for Lemma 4 is as follows. Note that, under x^C*, in states close to s̲ type I and type C behave similarly, so I's lack of self-control hurts him only a little. Therefore, if such states are likely enough according to F, then ex ante I values the C-menu almost as much as does C, i.e., −R^I(x^C*) is small. On the other hand, since x^I* is flexible, the C-menu must grant C positive rents (R^C(x^I*) > 0), and if these rents are large enough, they will lure I to mimic C. One can show, for example, that if U(a) = √(2a) and F is uniform, then (x^C*, x^I*) is infeasible.
Although Lemma 4 looks at the extreme case of x^C* and x^I*, by the continuity properties of the principal's problem its conclusion extends to a larger set of cases. By Proposition 2, as the principal cares less about C's rents (i.e., λ^C → 0), she tends to design an I-menu that resembles the first-best one, i.e., x^I*. Therefore, if x^C* and x^I* are infeasible, then by continuity x^C* and x^I(λ^C) are also infeasible for λ^C small enough. On the other hand, as the principal cares more about C's rents (i.e., λ^C → +∞), she tends to reduce the I-menu to a single option. So, again by continuity, x^C* and x^I(λ^C) are always feasible for λ^C large enough. For the cases in which x^C* and x^I(λ^C) are infeasible, the question becomes how the principal modifies x^C to ensure that I will report truthfully.
The first strategy that the principal adopts to strengthen I's incentives to reveal his type is unconventional. In models with time-consistent agents, the principal has to distort her offer to one type, and hence his choices, whenever this offer causes mimicking by another type. Instead, in the present model with time-inconsistent agents, the principal may successfully deter I from mimicking C while still offering a C-menu that sustains the efficient outcome with C. To do that, she adds to the C-menu tempting items that exacerbate I's disadvantage when facing such a menu. Since these items are off path, there are many ways to usefully employ them. However, to maximize I's disadvantage under the efficient C-menu, the principal only needs to add one item to it, as shown in the next proposition.22
Proposition 3 (Usefulness of Unused Items) Let the allocation rule x^C_u satisfy x^C_u(θ) = x^C*(θ) for θ ∈ Θ^C and

x^C_u(θ) = x^C*(θ̲^C) if θ_d ≤ θ < θ̲^C, and x^C_u(θ) = a̲ if θ < θ_d.

Then, there exists θ_d > θ̲^I such that x^C_u minimizes R^I(x^C). Furthermore, the rule x^C that sustains the efficient outcome with C and minimizes R^I(x^C) is unique except possibly over [θ_d, θ_u], where θ_d and θ_u depend only on β^I and F.
To relax the constraint (R-IC^I), the principal adds to the C-menu one item that C never chooses, but I would choose in a range of small states. If the state is small enough and I faces the C-menu with the unused item, then I's choice will differ more from C's than if I faced the C-menu without the unused item; I is therefore less willing to mimic C. Although other strategies involving multiple unused items can deter I's mimicking, the most effective strategy requires just one unused item. This is because, from the viewpoint of self-1, the worst possible temptation is a̲, and since this option is off path, the principal does not have to worry about the cost of producing it. However, note that to make lying the least attractive for I, the principal may restrict the range of self-2 valuations for which I would choose a̲; i.e., it is possible that θ_u < θ̲^C. Intuitively, the payment for a̲ controls both the probability and the disappointment that I assigns ex ante to choosing it when C doesn't. A low payment, tailored to a θ_d close to θ̲^C, makes I expect that he would choose a̲ in a larger set of states, but with little disappointment. Instead, a high payment, tailored to a θ_d close to θ̲^I, makes I think that he would choose a̲ in a smaller set of states, but with much disappointment. Clearly, depending on the distribution F, charging the high payment for a̲ may deter mimicking more.
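The role of a single unused item can be illustrated numerically. The primitives below are my assumptions, not the paper's: with U(a) = log(a + 2) and self-2 payoff βsU(a) − a − p, a singleton C-menu is enlarged by one withdrawal option priced so that C's self-2 never takes it while I's self-2 takes it in low states, which strictly lowers I's ex-ante valuation of the C-menu.

```python
import numpy as np

# Illustration of Proposition 3's logic with made-up primitives:
# U(a) = log(a + 2), self-2 payoff beta*s*U(a) - a - p,
# period-1 payoff s*U(a) - a - p, s uniform on a grid.

U = lambda a: np.log(a + 2.0)
S = np.linspace(1.0, 4.0, 400)
beta_C, beta_I = 1.0, 0.5
a_d, p_d = 2.0, 0.0
a_low = -1.5                              # the tempting unused action

def value(menu, beta):
    """Period-1 expected payoff when self-2 discounts U(a) by beta."""
    vals = []
    for s in S:
        a, p = max(menu, key=lambda ap: beta * s * U(ap[0]) - ap[0] - ap[1])
        vals.append(s * U(a) - a - p)
    return float(np.mean(vals))

base = [(a_d, p_d)]
for p_low in (1.6, 2.2):                  # two candidate payments for the unused item
    menu = base + [(a_low, p_low)]
    # C's self-2 never takes the unused item...
    assert all(max(menu, key=lambda ap: s * U(ap[0]) - ap[0] - ap[1]) == (a_d, p_d)
               for s in S)
    # ...but its presence lowers I's ex-ante valuation of the C-menu.
    assert value(menu, beta_I) < value(base, beta_I)
```

Raising p_low shrinks the set of states in which I expects to take the unused item but raises the ex-ante disappointment in each of them; which payment deters mimicking more depends on the distribution of states.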
Although enlarging the C-menu as described in Proposition 3 maximally relaxes the constraint (R-IC^I), it may still fail to make a pair of allocations implementable. Indeed, the enlarged C-menu satisfies condition (RR) if and only if the unused action a̲ is tempting enough.

Corollary 2 Suppose that the allocations x^C* and x^I are not implementable. Then, there is a threshold Δ(x^C, x^I) > 0 such that x^C_u as in Proposition 3 and x^I are implementable if and only if U(a̲) ≤ U(x^C(θ̲^C)) − Δ(x^C, x^I).

22 I discuss how my results relate to Esteban and Miyagawa's (2006b) use of 'decorated' menus in Section 6.2.
Thus, if the principal can enlarge the C-menu with a tempting enough a̲, she can always sustain the efficient outcome with C. The lowest feasible action a̲ can be sufficiently tempting, relative to the lowest efficient action a*(s̲), for several reasons. The principal may be able to offer very unattractive actions, without any technological or legal constraint. For instance, if we view U(a) as the future consequences of some current action, e.g., shopping with credit cards, the worst consequence U(a̲) is likely far worse than the efficient one U(a*(s̲)), which takes into account the current utility of a and the cost c(a), e.g., the cost of a default.
Combining the insights of the previous results, the next proposition summarizes the possible configurations of the screening C-menu as a function of C's market share (μ) and the weight that the principal assigns to profits (α). Recall that λ^C = αμ/(1 − μ) + γ, where γ ≥ 0 is the Lagrangian multiplier associated with condition (RR).

Proposition 4 (Optimal C-menu) There exist thresholds λ^C_1 and λ^C_2, with 0 ≤ λ^C_1 ≤ λ^C_2 < +∞, such that the following holds:

(a) If λ^C ≥ λ^C_2, the C-menu sustains the efficient outcome a* and need not include unused items;

(b) If λ^C_1 < λ^C < λ^C_2, the C-menu sustains a* and includes unused items;

(c) If λ^C ≤ λ^C_1, the C-menu does not sustain a* and includes unused items.

Finally, there exist environments with 0 < λ^C_1 < λ^C_2.
A few remarks follow.
First, part (c) implies that the 'no distortion at the top' property, common in standard screening models, fails in the present model. In standard models, the agents of the 'strongest' type usually achieve an undistorted outcome, as if information were symmetric; for example, in Mussa and Rosen (1978) the buyers with the highest valuation always trade efficiently with the monopolist. In the present model, Proposition 1 suggests that C is the 'strong' type: Just as in Mussa and Rosen the buyers with the highest valuation value any (positive) quality strictly more than do low-valuation buyers, in the present model C values any (flexible) menu strictly more than does I. However, both C and I can receive a menu sustaining an inefficient outcome.
Second, the C-menu is more likely to feature unused items and (possibly) be inefficient when C represents a smaller share of the market (i.e., μ is lower) or the principal cares less about profits (i.e., α is lower). Intuitively, when either μ or α is lower, the principal is willing to grant larger rents to C. But these rents make I more willing to mimic C, so the principal has to balance them by weakening the degree of commitment that I expects from the C-menu. In particular, we have the following corollary.

Corollary 3 If the monopolist (α = 1) has to add unused items to the C-menu, so does the planner (α < 1). Similarly, if the monopolist's solution violates the 'no distortion at the top' property, so does the planner's.

In particular, offering C unused items may be necessary for the planner to be able to sustain the efficient outcome with both types.
Finally, when unused items alone cannot ensure I's truthfulness, the distorted x^C must maximize W^C(x) − λ^I R^I(x) with λ^I > 0. The weight on R^I calls for designing the C-menu so as to worsen I's disadvantage, which depends on the gap between I's and C's choices from the C-menu. To gain some intuition, consider again the savings example with two states. Suppose we start from the efficient C-menu, with only the options to withdraw a_w and deposit a_d, and we add to it the option to withdraw a̲. Recall that, to deter I from mimicking C, the payment for a̲ should be large enough relative to the others. Now suppose that I enjoys withdrawing a̲ more than a_w, but only by a little. To charge a relatively large payment for a̲, we must then raise a_w and (possibly) a_d above efficiency. When there are many states and self-2's valuations overlap across types, deriving the optimal x^C requires more work. The principle is, however, the same: make I's and C's choices from the C-menu more different, in terms of both actions and payments.
4.4 Extension: Many Degrees of Time Inconsistency
I now extend the analysis of the two-type model to a model with more realistic variety in self-control problems. Screening any two agents who have different degrees of time inconsistency raises the same issues as screening C and I. However, with more than two types, deriving the optimal supply of menus is more intricate. Nonetheless, I show that the menu of the least time-inconsistent agents, who represent the 'strongest' type, resembles the C-menu in the two-type model. The menus of the other agents, instead, may inefficiently curtail flexibility; if so, they again induce 'over-consumption' and no flexibility at the top, and 'under-consumption' and possibly no flexibility at the bottom. Finally, not only the menu of the least time-inconsistent agents may feature unused items, but also the menus of intermediate types.
To consider a population featuring more than two degrees of time inconsistency, I modify the model in Section 2 as follows. Let the set of types be {β¹, ..., β^N}, with N > 2 finite; for convenience, order the types so that a higher index coincides with a stronger degree of time inconsistency, i.e., 1 ≥ β¹ > β² > ... > β^N > 0. Finally, let the share of type i be μ^i > 0. All the other primitives remain as in Section 2.
As in Section 4.1, we can study the principal's problem using direct mechanisms. A DM is an array {X, T} = (x^i, t^i)^N_{i=1}, with x^i : Θ → A and t^i : Θ → R, where Θ = co(∪^N_{i=1} Θ^i). It is still without loss of generality to consider DMs such that each (x^i, t^i) satisfies the constraint (θ-IC) (see problem P), so that we can again describe each i-menu as a pair (x^i, k^i) with x^i ∈ M. Letting the payoff that type i expects from the j-menu be U^i(x^j, k^j) as in expression (EU), the period-1 incentive and participation constraints are

(R-IC^{i,j}): U^i(x^i, k^i) ≥ U^j(x^j, k^j) + R^i(x^j)   and   (IR^i): U^i(x^i, k^i) ≥ 0.

For convenience, I have already written the constraints (IC^{i,j}) in terms of the functions R^i(x^j), as in Section 4.1.
With regard to each type’s information (dis)advantage, Proposition 1 generalizes as follows.
A less time-inconsistent agent enjoys an information advantage over any more time-inconsistent
agent; i.e., Ri (xj ) 0 if i < j, and Ri (xj ) 0 if i > j. Moreover, i’s (dis)advantage disappears
if and only if the j-menu makes i and j behave (almost) always identically; i.e., Ri (xj ) = 0 if
i j
and only if xj is constant over (minf i ; j g; maxf ; g). For outside this range, neither i nor
j get to choose xj ( ), so Ri (xj ) cannot depend on it.
With regard to period-1 incentive compatibility, two other insights extend to the general model. First, the information rents granted to a less time-inconsistent agent can jeopardize the incentives of a more time-inconsistent agent to disclose his self-control problems. Second, adding unused items to the menu of a less time-inconsistent agent helps promote truthfulness by a more time-inconsistent agent. Moreover, the most effective way to do so is still given by Proposition 3, except that the thresholds d and u now depend on β^i, β^j, and F.
Consider now the problem of finding an optimal DM {X, k} = (x^i, k^i)_{i=1}^N. Recall that the expected profits from the i-menu are ∫[t^i(θ) − c(x^i(θ))]dF^i, or equivalently W^i(x^i) − U^i(x^i, k^i), where W^i(x^i) is the expected surplus that x^i induces with i's self-1 (see (EU)). We can then write the principal's problem as follows:
    P^N :=  max_{X,k} (1 − α) Σ_{i=1}^N μ^i W^i(x^i) + α Σ_{i=1}^N μ^i [W^i(x^i) − U^i(x^i, k^i)]

            s.t. x^i ∈ M, (R-IC^{i,j}), and (IR^i), for all i, j.
With more than two types, the problem P^N raises new difficulties, which I solve with a novel approach. First, in general, local necessary conditions for period-1 incentive compatibility do not imply global incentive compatibility. A similar issue appears in the literature on dynamic mechanism design with only time-consistent agents, which has developed a specific approach to it (Courty and Li (2000); Pavan, Segal, and Toikka (2011)). This approach involves introducing additional, potentially ad-hoc, restrictions on the primitives so that local conditions ensure global incentive compatibility. I find using this approach for studying a new screening problem unappealing; moreover, finding effective, let alone reasonable, restrictions is particularly hard in my context. Second, in my context, there is no ordering of types that allows one to identify, a priori, which constraints will bind. For these two reasons, to solve P^N, I take a different approach that does not introduce new restrictions on the primitives and deals with period-1 incentive constraints all at once. My approach builds on Lagrangian methods and relies on having a finite number of constraints.
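The logic of handling all incentive constraints at once, with finitely many types, can be illustrated with a deliberately simplified static analogue of P^N. The quadratic surplus W, the linear mimicking-rent function R, and the numerical type values below are assumptions of this sketch, not objects from the paper; the sketch only mirrors the structure "profit = Σ μ^i [W^i(x^i) − U^i]" and the fact that the binding constraints emerge from the computation rather than being fixed a priori.

```python
# Toy 3-type screening sketch (illustrative functional forms, not the
# paper's model): for each profile of allocations, compute the smallest
# rents U^i consistent with ALL incentive constraints simultaneously,
# then search for the profit-maximizing profile.

import itertools

types  = [1.0, 0.8, 0.6]          # "strength" parameters t^1 > t^2 > t^3
shares = [0.3, 0.4, 0.3]          # population shares mu^i

def W(t, x):
    """Expected surplus from allocation x with a type-t agent."""
    return t * x - 0.5 * x ** 2

def R(ti, tj, xj):
    """Rent type i earns by mimicking type j's menu (assumed linear form)."""
    return (ti - tj) * xj

def min_rents(xs, n_iter=100):
    """Smallest U^i with U^i >= U^j + R^{ij}(x^j) and U^i >= 0, by iteration."""
    U = [0.0] * len(types)
    for _ in range(n_iter):
        newU = [max([0.0] + [U[j] + R(types[i], types[j], xs[j])
                             for j in range(len(types)) if j != i])
                for i in range(len(types))]
        if newU == U:          # fixed point reached: all constraints hold
            break
        U = newU
    return U

grid = [k * 0.1 for k in range(11)]            # candidate allocations
best = None
for xs in itertools.product(grid, repeat=3):   # brute force over menus
    U = min_rents(xs)
    profit = sum(m * (W(t, x) - u) for m, t, x, u in zip(shares, types, xs, U))
    if best is None or profit > best[0]:
        best = (profit, xs, U)

profit, xs, U = best
print("allocations:", xs, "rents:", [round(u, 3) for u in U])
```

In this toy, the strongest type's allocation is undistorted while the weaker types' allocations are curtailed, loosely echoing parts (a)-(c) of Proposition 5; which constraints end up active is only revealed by the search itself.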
Applying these methods reveals that, when designing each j-menu, the principal trades off the surplus with j and the rents that the menu causes for (some of) the less time-inconsistent agents (see Appendix B). Such a trade-off generalizes the one highlighted in the two-type model. Furthermore, the principal has to ensure that each j-menu deters any type who is more time-inconsistent than j from mimicking j. Based on the results in Section 4.3, I focus on the case in which unused items suffice to guarantee this property.
Proposition 5 Suppose U(a) is small so that unused items suffice to satisfy (R-IC^{i,j}) for i > j. A solution {X, k} to P^N exists with x^i unique over (θ̲^i, θ̄^i), for every i. Furthermore, we have the following:
(a) The 1-menu always sustains the efficient outcome a*;
(b) If α = 0, then all menus sustain a*; otherwise, at least the N-menu sustains a distorted outcome;
(c) If the i-menu sustains a distorted outcome, it features properties (a), (b), and possibly (c), as in Proposition 2;
(d) For every i < N, the i-menu may have to include unused items with a < x^i(θ̲^i).
As long as the principal cares about profits (α > 0), she will distort the menu of the most time-inconsistent agents, and possibly those of other, intermediate types. Whether intermediate types obtain a distorted menu depends on the exact configuration of the active incentive constraints, which cannot be predicted a priori.
5 Normative Implications for Paternalistic Savings Devices
In this section, I discuss the implications of my analysis for how a paternalistic government should design special SDs to tackle the tendency of some people to save too little for retirement. I argue that, at a broad level, my policy implications are consistent with some features of the U.S. regulation of 'tax-shielded' and taxable investment accounts.23

People's tendency to undersave for retirement is a well-documented phenomenon that has been attributed to self-control problems (Diamond (1977), Thaler (1994), Laibson (1998), Bernheim (2002)) and that has attracted governments' attention. To help people with self-control problems save adequately, a paternalistic government may want to create special SDs that provide appropriate tax incentives. However, when trying to sustain efficient savings decisions with all individuals, the government faces two natural constraints. First, it can usually allocate only limited fiscal resources to finance its policies. Second, people value flexibility, because they are uncertain about their best savings decisions in the future, and they have different tendencies to undersave, because their self-control varies. Since the government cannot observe people's self-control, it has to induce each individual to select voluntarily the SD that best fits his desires for commitment and for flexibility. The question is then how the government should design its SDs, given these constraints.
My model addresses this type of question. To see how, recall our initial example about SDs. Again, each SD allows the agents to contribute to (a > 0) and withdraw from (a < 0) the balance. Now, however, each choice of a leads to a tax burden or benefit p; think of p as the fiscal consequences of a distribution from, or a contribution to, an SD. As in our initial example, it seems natural that efficiency calls for distributions (a < 0) when spending in period 2 has high enough priority (s < s_0), and for contributions (a > 0) otherwise. Finally, since the government can allocate only limited resources to its SDs, it seems natural that it also takes into account their overall profitability, i.e., 0 < α < 1.
The results of Section 4 have the following implications:

The unobservability of people's self-control offers one economic justification for government intervention in the market for SDs. Indeed, asymmetric information leads profit-maximizing firms (α = 1) to supply SDs inefficiently, especially to the agents who actually need commitment devices.
When designing its SDs, a financially constrained government faces a trade-off between corrective and redistributive taxation; its solution calls for curtailing the liquidity of the special SDs, relative to the standard SDs. By Proposition 1, special SDs, tailored to time-inconsistent agents, make time-consistent agents less willing to choose the standard SDs tailored to them. To make the time-consistent agents choose the standard SDs, it is necessary to grant a tax discount on them. Doing so drains fiscal resources, in addition to any tax incentives for the time-inconsistent agents. Curbing the discount on the standard SDs calls for curtailing the flexibility, or liquidity, of the special SDs.

23 Of course, I do not intend this section to be seen as providing an explanation for all the complicated rules governing such accounts in the U.S.
To optimally curtail the liquidity of the special SDs, the government should set limits on the owners' freedom to make contributions and distributions. At one end, the special SDs should induce inefficiently low contributions by setting a binding cap (Proposition 2, Properties (a) and (b); Proposition 5); at the other end, they should induce inefficiently low distributions, possibly again using a binding cap (Proposition 2, Properties (a) and (c); Proposition 5). It is important to remark that, according to the present analysis, the reason for setting limits on distributions from the special SDs is not to help the time-inconsistent agents avoid depleting their savings. Indeed, if the government could observe the agents' self-control, its special SDs could and should help time-inconsistent agents efficiently respond to all future states, through appropriate penalties and rewards and not through curtailed flexibility (recall Section 3).
The government should not take for granted that time-inconsistent agents will always prefer the special SDs to the standard SDs. The tax discount on the standard SDs may lure the time-inconsistent agents into choosing them. To avoid this, the standard SD should include the possibility of taking large distributions in period 2, but make it so costly that only time-inconsistent agents would use it (see Section 4.3). For example, a standard SD could allow unlimited borrowing against its balance, and also charge very high penalties for loans beyond a certain threshold.24
For the sake of completeness, and with the extreme caution that the subject deserves, I will briefly compare my highly stylized results with existing SDs available in the U.S. retirement market. In this market, federal laws regulate the special SDs, called 'tax-shielded' investment accounts (TSAs), as well as the standard SDs, called taxable investment accounts (TAs). As the names suggest, TSAs and TAs have different tax rules. Examples of TSAs include the individual retirement accounts (IRAs) and the 401(k) plans. TSAs account for significant tax expenditures in the federal budget; more generally, savings taxation contributes to financing other redistributive goals. Therefore, budget considerations are likely to play a role in how the U.S. Congress regulates the TSAs and the TAs.

At a broad level, the TSAs, in particular the IRAs and the 401(k) plans, differ from the TAs as follows. Both IRAs and 401(k) plans set maximal contribution limits; TAs don't. These limits are enforced with steep tax penalties for crossing them, and sometimes they bind: In 2007 (2008), about 59% (49%) of IRA owners contributed at the limit (Holden et al. (2010b)), and roughly 11% of all 401(k) participants did so in 2004 (Munnell and Sundén (2006)). Also, both IRAs and 401(k) plans limit distributions; again, TAs don't. Except for a list of qualified
cases (e.g., first-time home purchase for IRAs), any amount withdrawn before the age of 59½ incurs a tax penalty, which seems to actually limit access to these TSAs: According to Holden and Schrass (2008-2010a), the vast majority of IRA withdrawals are retirement related, and only about 5% occur before the owner turns 59½. Moreover, any IRA-backed loan is de facto prohibited; and although some 401(k) plans allow loans, these are capped and subject to quick repayment. Of course, there are many competing reasons that can explain these features. They are, however, consistent with the implications of the model.

24 In settings with naive time-inconsistent agents, the government should obviously be very careful when adding such borrowing options, because, in contrast to my model, it cannot be sure that no agent will choose them. For further discussion, see Section 6.1.
Finally, Amromin (2002, 2003) presents some evidence that is consistent with the principle that curtailing flexibility in the special SDs helps curb their appeal to less time-inconsistent agents. He shows that a significant share of U.S. savers do not take full advantage of the TSAs' tax benefits, and prefer to invest in the TAs because of the TSAs' liquidity constraints. These savers reveal that they care more about flexibility than about commitment, which may depend, among other things, on their being less time inconsistent.
6 Discussion and Literature Review

6.1 Discussion
Overconfidence. The empirical evidence suggests that people can underestimate their self-control problems, i.e., they can be overconfident (see, e.g., DellaVigna 2009 and the references therein). In my model, an agent is overconfident if in period 1 he believes that his type is β̂, but learns in period 2 that it is β < β̂. Overconfidence affects the analysis as follows. On the one hand, the characterization of the information and participation constraints does not change, because it only depends on the belief that each agent holds in period 1 about his type. On the other hand, the principal's problem of finding the optimal screening menus changes. Now the principal has to design each menu taking into account that both sophisticated and overconfident agents can choose it. Therefore, depending on her goal (i.e., on α), she may exploit or counteract the agents' overconfidence. This is especially relevant to the design of unused options. In the two-type model, we saw that in some cases the principal may have to enlarge the C-menu to screen the sophisticated I. However, if in period 1 some agent of type I wrongly believes that his type is C, then the principal may also want to enlarge the C-menu to deal with his overconfidence in period 2. Thus, she may face a trade-off between optimally dealing with the overconfident I and ensuring that the sophisticated I does not choose the C-menu. This trade-off may arise, in particular, in the case of a fully paternalistic planner who cares only about the agents' welfare (i.e., α = 0).25
Outside Option with Type-Dependent Values. Since only the principal can provide flexible commitment devices, we can interpret any period-1 outside option as the set of state-contingent choices that the agents would make in period 2 if left to their own means. Without loss of generality, we can describe these choices through a menu {x^0, k^0}, using the formalism of Section 4.1. For simplicity, consider the two-type model. By Proposition 1, U^C(x^0, k^0) ≥ U^I(x^0, k^0), with equality if and only if x^0 is constant over Θ. In other words, C and I value the outside option equally, as I have assumed throughout the paper, if and only if the menu {x^0, k^0} is a singleton. For example, the outside option may lead to the same default action in every state. Instead, if x^0 is flexible, then C and I value the outside option differently. Similarly, in Mussa and Rosen (1978) a high-valuation buyer would value an outside option featuring strictly positive consumption more than would a low-valuation buyer. When C values {x^0, k^0} more than does I, my analysis changes to the extent that the constraint (IR^C) may bind, together with or even before the constraint (IC^C). Obviously, if x^0 = x^I, then the principal will never offer a distorted I-menu, because she can never extract from type C more than U^C(x^I, k^0) = U^I(x^I, k^0) + R^C(x^I). Similarly, if in Mussa and Rosen the outside option features the efficient level of trade with low-valuation buyers, then the monopolist will never offer them a distorted level of trade. Finally, if (IC^C) binds before (IR^C), then the principal will distort the I-menu as discussed in Section 4.2, possibly stopping if (IR^C) starts to bind too.

25 A more detailed analysis of the model with overconfident agents is available upon request.
Different Period-1 Heterogeneity. In this paper, time-inconsistent agents overweight (β < 1) the immediate benefits (or costs) of their actions when they have to choose one. If all time-inconsistent agents have β > 1, the substance of the paper doesn't change. Significant differences appear, instead, if some agents have β < 1 and others have β > 1. For example, with two types and β^1 > 1 > β^2, neither may have an information advantage (see Appendix A). However, the approach that I introduced in Section 4.4 should be useful for analyzing this case as well.
6.2 Relation to the Previous Literature
As noted in the Introduction, this paper relates to the literature on behavioral contract theory and optimal paternalism. Several papers have already studied how to design incentive schemes to tackle people's self-control problems (e.g., O'Donoghue and Rabin (1999)). The novelty of the present paper consists in studying, in a market environment, the role of asymmetric information about people's conflicting desires for commitment and for flexibility. A more detailed comparison of my work to the closest papers in the literature follows.
Amador, Werning, and Angeletos (AWA) (2006). In AWA, a decision-maker faces a consumption-savings problem and lacks self-control. Anticipating this situation, ex ante he desires commitment, but because he still doesn't know how much he will value consumption, he also desires flexibility. AWA focus on commitment strategies that only entail removing options from the decision-maker's future budget set; they purposefully rule out payments across states to eliminate any form of insurance. AWA's main contribution is to identify necessary and sufficient conditions for the optimal (paternalistic) strategy to coincide with minimum-savings plans.

In contrast, I look at markets in which, to commit to a desired plan, agents obtain incentive schemes from a profit- or welfare-maximizing supplier; I also allow for payments across states. AWA's model is more appropriate for studying public pension plans that involve no payments, such as Social Security and defined-benefit plans. But to study defined-contribution plans and other forms of voluntary savings that receive tax incentives from the government, my model seems better suited.

Furthermore, AWA essentially assume that people's self-control is observable. However, with unobservable self-control, AWA's analysis does not change, because their commitment policies raise no incentive-compatibility issue. To see why, suppose AWA allowed for both time-inconsistent and time-consistent agents, type I and type C. AWA's optimal policies involve an effective minimum-savings constraint for I, but not for C. Given these policies, by definition, C doesn't want to mimic I ex ante, although if C did, ex post C would save at least as much as does I in every state and strictly more in some states. The flip side of AWA's policies is that I's ex-post choices are (almost) always inefficient, according to his ex-ante preference. By contrast, I show that allowing for payments would make it possible to fully solve the agent's self-control problems under symmetric information, but causes screening problems under asymmetric information.
DellaVigna and Malmendier (DM) (2004). DM's influential work studies how firms design two-part tariffs when selling to quasi-hyperbolic buyers goods that cause immediate costs (benefits) and deferred benefits (costs). DM's model helps explain why firms set per-usage prices below (above) their marginal costs, and introduce renewal or cancellation fees.

DM's model differs from mine as follows. First, in DM first- and second-period preferences differ for any given agent, but they coincide across agents within each period. Also, at the first-period contracting stage, some agents correctly forecast their preference at the second-period consumption stage, but others are partially naive in the sense of O'Donoghue and Rabin (2001). Second, DM restrict a priori the agent to a buy-or-not-buy decision, with no quantity variable, a restriction that precludes studying the trade-off between commitment and flexibility. Finally, DM assume that firms know the agents' degree of time inconsistency and of naivete. This symmetric-information assumption is relaxed by Jianye (2011), who takes DM's model and studies whether their predictions about per-usage prices are robust to asymmetric information.

DM also find that firms will provide sophisticated agents with a two-part tariff that perfectly solves their self-control problems. I confirm and extend this result in Section 3. Nonetheless, I cast doubt on DM's positive message in two directions. First, the result is not robust to asymmetric information, which leads a monopolist to supply commitment inefficiently. Second, the result relies on the property, which holds also in DM, that self-1 and self-2 rank any two states in the same order. To see this, suppose the utility function of self-1 is sU(a) − a − p, but that of self-2 is (1/s)U(a) − a − p; the inconsistency is now about which state makes a more valuable. In this case, to be period-2 incentive compatible, an allocation rule x must be non-increasing; so the efficient outcome a*, which is increasing, cannot be sustained with self-2.
Esteban and Miyagawa (EM) (2006b). EM study a standard model of non-linear pricing, except that buyers have Gul-Pesendorfer (2001) preferences and the monopolist can offer them non-singleton menus. EM find that the monopolist may fully extract buyers' surplus using "decorated" menus. EM have two types of buyers (H and L), who have different valuations of the monopolist's good, but also different self-control costs. Buyers dislike exerting self-control and desire commitment. So, intuitively, by adding to the singleton L-menu an item that makes a dishonest H incur high self-control costs, the monopolist makes the L-menu less appealing to H and can extract more surplus from H. In my model, the unused item added to the C-menu works similarly: By making I view the C-menu as offering even less commitment, it increases I's expected cost of mimicking C.

Except for this similarity, EM's paper is quite different from mine. In EM, agents value commitment but not flexibility, so no trade-off can arise between the two. In EM, unused items help screen 'strong' from 'weak' buyers; instead, in my model, they help screen 'weak' from 'strong' agents. In EM, maximizing welfare never requires adding unused items to menus, but maximizing profits may; instead, in my model, maximizing welfare may require unused items while maximizing profits may be possible without them.
Eliaz and Spiegler (ES) (2006). ES analyze a model in which agents sign contracts with a monopolist in period 1, to get access to a set of actions in period 2. All agents have the same utility function u in period 1, which changes to v in period 2 with probability q. The monopolist knows q; instead, in period 1, each agent has his own private belief q̂ about the chance of switching to v. Crucially, in ES the agents' private information affects what they are willing to pay for a given contract in period 1, but it doesn't affect what the monopolist expects them to choose in period 2. Instead, in my model, period-1 private information also matters in predicting the period-2 outcome of a (non-trivial) device, as in adverse-selection models. Furthermore, in ES, the agents have the same information about the environment in both periods, so they care about commitment (unless q̂ = 0), but not about flexibility.
7 Conclusion
I study how a monopolist and a paternalistic planner supply flexible commitment devices to agents who privately know their preference for commitment and for flexibility. If information were symmetric, both the monopolist and the planner would help each agent fully solve his self-control problems. This is generally not true, however, with asymmetric information: Agents with superior self-control enjoy an information advantage, which causes a screening problem. I derive the optimal screening devices using non-standard techniques for solving dynamic mechanism-design problems. To screen a more time-inconsistent from a less time-inconsistent agent, the monopolist and (possibly) the planner inefficiently curtail the flexibility of the device tailored to the first agent, and include unused items in the device tailored to the second agent. Finally, in my model the familiar 'no distortion at the top' property fails.

My results are relevant to the problem of providing commitment devices in several situations in which only time-inconsistent people should use such devices, but they cannot be distinguished a priori from time-consistent people. Governments may want to design savings devices that provide tax incentives for time-inconsistent people to adequately save for retirement. Gyms may want to design special memberships that provide monetary incentives for time-inconsistent people to exercise regularly. Finally, both private and public institutions may want to provide time-inconsistent people with incentives to commit to other goals, such as consuming less alcohol, cigarettes, or drugs.

The paper does not look at the effects of competition among many providers of flexible commitment devices. While a detailed study of this case is beyond the scope of the current paper, I expect that even perfectly competitive markets lead to inefficient outcomes, with distortions qualitatively similar to the monopolistic case. Intuitively, competition pushes firms to generate the largest possible surplus and payoff for each type of agent. However, the asymmetric information creates an adverse-selection problem for the firms, who have to screen the different types. Conditional on sustaining the efficient outcome with time-consistent agents, each firm has to balance the efficiency with time-inconsistent agents against the incentives of the time-consistent agents to pretend to be time-inconsistent. This trade-off is similar to that analyzed in the paper, and hence is likely to have similar implications.
8 Appendix A: Irrelevance of Asymmetric Information with Finitely Many States
In this appendix, I show that if the number of states is finite, then asymmetric information at the contracting stage may be immaterial as far as sustaining the efficient outcome is concerned.
To understand the role of the discreteness of the state space S, consider the case with two states, s2 > s1. If β is observable, the principal sustains a2 = a*(s2) > a*(s1) = a1 (Lemma 1), with payments p1 = p(s1) and p2 = p(s2) that satisfy the condition

    u2(a2, s2, β) − u2(a1, s2, β) ≥ p2 − p1 ≥ u2(a2, s1, β) − u2(a1, s1, β),    (1)

which follows by combining the (IC_β) constraints. Since u2(a, s, β) has strictly increasing differences in (a, s), the discreteness of S creates some slack in the period-2 incentive constraints evaluated at (a2, a1). Thus, for any β, condition (1) does not uniquely pin down p1 and p2.

Now suppose that type C and type I have about the same self-control, i.e., that β^I is almost equal to β^C. Intuitively, to implement the same outcome with both types, it should be possible to use incentive devices that are sufficiently alike. Furthermore, if the discreteness of S leaves some leeway in the choice of payments, it may even be possible to find one device that works for both types. On the other hand, if β^I is very different from β^C, then sustaining the efficient outcome with both types requires different devices. Since β^I < β^C, type I is more tempted than type C to pick a1 also in state s2, and the more so, the lower β^I is relative to β^C. Thus, for I to choose a1 only in state s1, a1 must be sufficiently more expensive than a2, and this price premium must increase as β^I falls. Therefore, it must eventually exceed C's willingness to pay for switching from a2 to a1 in s1.
The following lemma formalizes this intuition. Since the result does not depend on having two types with 0 < β^I < β^C = 1, consider a finite set of types Φ, which may include both β > 1 and β < 1. Let β̄ := max Φ and β̲ := min Φ.

Lemma 5 Suppose S is finite with sN > sN−1 > ... > s1. There exists a single commitment device that sustains the efficient outcome a* with each β ∈ Φ if and only if β̄/β̲ ≤ min_i s_{i+1}/s_i.
Proof. With N states, the period-2 incentive constraints become

    u2(a_i, s_i, β) − p_i ≥ u2(a_j, s_i, β) − p_j    (IC_{i,j})

for all i, j = 1, ..., N, where a_i := a(s_i) and p_i := p(s_i). By standard arguments, it is enough to focus on the adjacent constraints (IC_{i,i−1}) and (IC_{i−1,i}). For i = 2, ..., N, let Δ_i := p_i − p_{i−1}. If a(s_i) = a*(s_i) for all i, then a_N > a_{N−1} > ... > a_1 (Assumption 1). To implement a* with β, we must have, for i = 2, ..., N,

    u2(a_i, s_i, β) − u2(a_{i−1}, s_i, β) ≥ Δ_i ≥ u2(a_i, s_{i−1}, β) − u2(a_{i−1}, s_{i−1}, β).    (CIC_{i,i−1})

Recall that for any s and β, u2(a′, s, β) − u2(a, s, β) = βs(U(a′) − U(a)) + d(a′) − d(a). Let s_k/s_{k−1} = min_i s_i/s_{i−1} and suppose β̄ s_{k−1} > β̲ s_k. Then

    u2(a_k, s_{k−1}, β̄) − u2(a_{k−1}, s_{k−1}, β̄) > u2(a_k, s_k, β̲) − u2(a_{k−1}, s_k, β̲),

and no Δ_k satisfies (CIC_{k,k−1}) for both β̄ and β̲. If instead β̄ s_{i−1} ≤ β̲ s_i for every i = 2, ..., N, then for each β and i,

    u2(a_i, s_i, β) − u2(a_{i−1}, s_i, β) ≥ u2(a_i, s_{i−1}, β̄) − u2(a_{i−1}, s_{i−1}, β̄) ≥ u2(a_i, s_{i−1}, β) − u2(a_{i−1}, s_{i−1}, β).

Set Δ_i := u2(a_i, s_{i−1}, β̄) − u2(a_{i−1}, s_{i−1}, β̄) for each i. Then {Δ_i}_{i=2}^N satisfies all the (CIC_{i,i−1}) conditions for each β. And the payment rule defined by p_i = p_1 + Σ_{j=2}^i Δ_j, with p_1 small enough to ensure participation, sustains a* with each β.
Therefore, if the heterogeneity across agents, measured by β̄/β̲, is small enough, the principal can sustain the efficient outcome without worrying about period-1 incentive constraints.
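The Lemma 5 condition can be checked numerically in the two-state case. The functional forms below (U(a) = a and d(a) = −a²/2, so that u2(a, s, β) = βsa − a²/2) and all parameter values are illustrative assumptions of this sketch, not objects from the paper; the point is only that a single price gap Δ = p2 − p1 exists exactly when the intervals of admissible gaps, one per type, overlap.

```python
# Numerical check of the Lemma 5 condition with two states, under assumed
# forms u2(a, s, b) = b*s*U(a) + d(a), U(a) = a, d(a) = -a**2/2.

def bounds(b, a1, a2, s_lo, s_hi):
    """Interval of price gaps Delta = p2 - p1 implementing (a1, a2) for type b."""
    U = lambda a: a
    d = lambda a: -0.5 * a ** 2
    dU = U(a2) - U(a1)
    dd = d(a2) - d(a1)
    lo = b * s_lo * dU + dd   # self-2 must not take a2 in the low state
    hi = b * s_hi * dU + dd   # self-2 must take a2 in the high state
    return lo, hi

def common_device_exists(betas, s_lo, s_hi, a1, a2):
    """A single Delta works for all types iff all the intervals overlap."""
    los, his = zip(*(bounds(b, a1, a2, s_lo, s_hi) for b in betas))
    return max(los) <= min(his)

s_lo, s_hi = 1.0, 1.5
# heterogeneity within the bound: max(b)/min(b) = 1.0/0.7 <= s_hi/s_lo = 1.5
print(common_device_exists([0.7, 0.9, 1.0], s_lo, s_hi, 0.5, 1.0))  # → True
# heterogeneity beyond the bound: 1.0/0.6 > 1.5
print(common_device_exists([0.6, 1.0], s_lo, s_hi, 0.5, 1.0))       # → False
```

The overlap test max(lo) ≤ min(hi) is exactly the comparison between the lower bound at β̄ and the upper bound at β̲ in the proof.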
The condition in Lemma 5, however, is not necessary for asymmetric information to be irrelevant when sustaining the efficient outcome a*. Even if β̄/β̲ is large, it may be possible to design different devices such that each device sustains a* with one type of agent, and each agent optimally chooses the device designed for his type. To see why, consider a simple example with two types with β_h > β_l, and two states, s2 > s1. Suppose that β_h > 1 > β_l, β_h s1 > β_l s2, but s2 > s1 β_h and s2 β_l > s1. Consider all payments (p1, p2) that satisfy (1) and the (IR) constraint with equality, i.e.,

    (1 − f) p2 + f p1 = (1 − f) u1(a2, s2) + f u1(a1, s1),    (2)

where f := F(s1). Finally, choose (p_1^h, p_2^h) so that h's self-1 strictly prefers a2 in state s2, i.e., u1(a2, s2) − p_2^h > u1(a1, s2) − p_1^h, and (p_1^l, p_2^l) so that l's self-1 strictly prefers a1 in state s1, i.e., u1(a1, s1) − p_1^l > u1(a2, s1) − p_2^l. Then, the device defined by (p_1^l, p_2^l) (respectively (p_1^h, p_2^h)) sustains a* and generates zero expected payoff to the agent if and only if type l (h) chooses it. Moreover, I claim that type l (h) strictly prefers the device with payments (p_1^l, p_2^l) (respectively (p_1^h, p_2^h)). Note that, if the self-1 of either type had to choose in period 2, under either device he would strictly prefer to behave exactly as does his self-2. Therefore, by choosing the 'wrong' device, either type can only decrease his payoff below zero.
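This construction can be verified numerically under assumed functional forms (u1(a, s) = sa − a²/2 and u2(a, s, β) = βsa − a²/2, with parameter values chosen only to satisfy the inequalities above; none of these specifics come from the paper):

```python
# Numerical illustration of the two-device construction: each type gets
# zero expected payoff from 'his' device and a negative payoff from the
# other. All functional forms and parameters are illustrative assumptions.

a1, a2 = 0.5, 1.0
s1, s2 = 1.0, 2.0
f = 0.5                      # f = F(s1)
bh, bl = 1.5, 0.7            # bh*s1 > bl*s2, s2 > s1*bh, s2*bl > s1

u1 = lambda a, s: s * a - 0.5 * a ** 2
u2 = lambda a, s, b: b * s * a - 0.5 * a ** 2

def device(delta):
    """Payments (p1, p2) with gap delta on the zero-expected-payoff line (2)."""
    p1 = (1 - f) * u1(a2, s2) + f * u1(a1, s1) - (1 - f) * delta
    return p1, p1 + delta

def payoff(b, p):
    """Type-b self-1 expected payoff, given self-2's choices under prices p."""
    total = 0.0
    for s, w in [(s1, f), (s2, 1 - f)]:
        # self-2 picks the action maximizing u2 net of its payment
        a, price = max([(a1, p[0]), (a2, p[1])],
                       key=lambda ap: u2(ap[0], s, b) - ap[1])
        total += w * (u1(a, s) - price)
    return total

dU, dd = a2 - a1, -0.5 * (a2 ** 2 - a1 ** 2)
dev_h = device(bh * s1 * dU + dd + 0.01)   # inside h's interval, below s2*dU+dd
dev_l = device(s1 * dU + dd + 0.01)        # above s1*dU+dd, inside l's interval
print(payoff(bh, dev_h), payoff(bl, dev_l))   # both zero (up to rounding)
print(payoff(bh, dev_l), payoff(bl, dev_h))   # both strictly negative
```

Here β_h/β_l ≈ 2.14 > s2/s1 = 2, so Lemma 5's condition fails, yet the pair of devices still sustains a* with truthful self-selection.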
The next lemma provides a necessary condition for asymmetric information to be irrelevant when sustaining a*. Let Φ^1 := Φ ∩ [0, 1] and Φ^2 := Φ ∩ [1, +∞). For k = 1, 2, let β̄^k := max Φ^k and β̲^k := min Φ^k, and set r_k := β̄^k/β̲^k.

Lemma 6 Suppose that S is finite with sN > sN−1 > ... > s1. If max{r_1, r_2} > min_i s_{i+1}/s_i, there is no collection of commitment devices, each tailored to a specific type β ∈ Φ, such that (I) type β chooses 'his' device, (II) each device sustains the efficient outcome a* with the corresponding type β, and (III) all types enjoy the same expected payoff.
Proof. Suppose max{r_1, r_2} = r_1; the other case follows similarly. Suppose that devices that satisfy (I)-(III) exist. Let {p^β}_{β∈Φ} be the corresponding payment schedules, and normalize each type's expected payoff to zero. Consider p^{β̲^1} and denote it by p. I claim that if β̄^1 selects p rather than p^{β̄^1}, he enjoys a positive expected payoff. Given p, let a_i(β) be an optimal choice of β ∈ Φ^1 in s_i (this is well defined by assumption). For β̲^1, a_i(β̲^1) = a*_i for every i. Let S' := {i : s_{i+1}/s_i < r_1} ≠ ∅. We have: (a) for every i, β̄^1 s_i > β̲^1 s_i, and hence a_i(β̄^1) ≥ a*_i; (b) for i ∈ S', β̄^1 s_i > β̲^1 s_{i+1}, and hence a_i(β̄^1) ≥ a*_{i+1} > a*_i. Observations (a) and (b), together with β̄^1 ≤ 1, imply that, for every i,

    u2(a_i(β̄^1), s_i, β̄^1) − p(a_i(β̄^1)) ≥ u2(a*_i, s_i, β̄^1) − p(a*_i)  and hence  u1(a_i(β̄^1), s_i) − p(a_i(β̄^1)) ≥ u1(a*_i, s_i) − p(a*_i),

where the first inequality is strict for i ∈ S'. The expected payoff to β̄^1 from p is then

    Σ_{i=1}^N [u1(a_i(β̄^1), s_i) − p(a_i(β̄^1))] f_i > Σ_{i=1}^N [u1(a*_i, s_i) − p(a*_i)] f_i = 0,

where f_i := F(s_i) − F(s_{i−1}) for i = 2, ..., N and f_1 := F(s_1).
Note that, if either Φ^1 \ {1} = ∅ or Φ^2 \ {1} = ∅, then the sufficient condition in Lemma 5 becomes also necessary for sustaining the efficient outcome without worrying about the agent's private information.
9 Appendix B: Omitted Proofs from Sections 3-4

All proofs allow for β^C ≤ 1, unless the specific result assumes otherwise.

9.1 Proof of Lemma 1
Proof of Lemma 1
The principal’s problem is
R
max (1
) S [u1 (a (s) ; s)
fa;pg
R
c(a(s))]dF +
S
[p (s)
c (a (s))] dF
s.t. (IR) and (IC ).
If > 0, (IR) must bind; if = 0, assume w.l.o.g. that (IR) holds with equality. Thus, the
problem reduces to
R
s.t. (IC ).
max S [u1 (a (s) ; s) c(a(s))]dF
a
Ignoring (IC ), the resulting relaxed problem admits a a as its unique solution (up to the set
of zero measure fs; sg). It remains to show that there exist p that sustains a with self-2. For
any , standard arguments imply that the necessary and su¢ cient condition for the existence
of such a p is that a is non-decreasing in s, a property satis…ed by a .
9.2 Proof of Proposition 1

Using expression (EU) and changing variables, re-write $U^i(x^j, k^j)$ as
$$U^i(x^j, k^j) = \int_{\underline s}^{\bar s} \Big[s\,U(x^j(s\beta^i)) - \int^{s\beta^i} U(x^j(y))\,dy\Big]\,dF + k^j. \qquad (3)$$
Now define for each $s$
$$\Phi(s|x^i) := s(1-\beta^C)\,U(x^i(s\beta^C)) - s(1-\beta^I)\,U(x^i(s\beta^I)) + \int_{s\beta^I}^{s\beta^C} U(x^i(y))\,dy \qquad (4)$$
$$\phantom{\Phi(s|x^i) :} = s(1-\beta^C)\big[U(x^i(s\beta^C)) - U(x^i(s\beta^I))\big] + \int_{s\beta^I}^{s\beta^C} \big[U(x^i(y)) - U(x^i(s\beta^I))\big]\,dy.$$
Note that $x^i \in M$ and $\beta^I < \beta^C \le 1$ imply that $\Phi(s|x^i) \ge 0$ for every $s$. Also, if $x^i(\theta) = a$ for every $\theta \in (\underline\theta, \bar\theta)$, then $\Phi(s|x^i) = 0$ for every $s \in (\underline s, \bar s)$.

Consider first type $C$. Using (3) and (4), we have
$$R^C(x^I) = U^C(x^I, k^I) - U^I(x^I, k^I) = \int_{\underline s}^{\bar s} \Phi(s|x^I)\,dF \ge 0, \qquad (5)$$
with equality if $x^I(\theta) = a$ over $(\underline\theta, \bar\theta)$. Now suppose $x^I(\cdot)$ is not constant over $(\underline\theta, \bar\theta)$. Since $x^I$ is non-decreasing, there exists $\tilde\theta \in (\underline\theta, \bar\theta)$ such that $x^I(\theta) < x^I(\theta')$ for every $\theta, \theta'$ such that $\theta < \tilde\theta < \theta'$. Let $\tilde s_1 := \tilde\theta/\beta^C$ and $\tilde s_2 := \tilde\theta/\beta^I$, and consider the interval $I := (\tilde s_1, \tilde s_2) \cap [\underline s, \bar s] \neq \emptyset$. For $s \in I$, $s\beta^I < \tilde\theta < s\beta^C$; hence $x^I(s\beta^I) < x^I(s\beta^C)$. To prove that $R^C(x^I) > 0$, it is enough to show that
$$\int_I \Big[\int_{s\beta^I}^{s\beta^C} \big[U(x^I(y)) - U(x^I(s\beta^I))\big]\,dy\Big]\,dF > 0. \qquad (6)$$
For $s \in I$, the integrand in (6) satisfies
$$\int_{\tilde\theta}^{s\beta^C} \big[U(x^I(y)) - U(x^I(s\beta^I))\big]\,dy + \int_{s\beta^I}^{\tilde\theta} \big[U(x^I(y)) - U(x^I(s\beta^I))\big]\,dy \;\ge\; \int_{\tilde\theta}^{s\beta^C} \big[U(x^I(y)) - U(x^I(s\beta^I))\big]\,dy > 0,$$
where the first inequality follows from (MON$^I$) and the last from $x^I(y) > x^I(s\beta^I)$ for every $y \in (\tilde\theta, s\beta^C)$. Since $I$ has positive measure, (6) follows.

Now consider type $I$. Using again (3) and (4), we have $R^I(x^C) = -\int_{\underline s}^{\bar s} \Phi(s|x^C)\,dF$, and the properties of $R^I(x^C)$ follow from the same type of arguments used for $R^C(x^I)$.
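The two forms of $\Phi(s|x^i)$ in (4) are algebraically equivalent, and $\Phi \ge 0$ for a non-decreasing rule. A quick numerical check of this identity (all primitives here — the utility `U`, the rule `x`, and the parameter values — are hypothetical stand-ins, not the model's calibration):

```python
import numpy as np

U = np.log                      # illustrative concave utility
x = lambda t: 1.0 + t ** 2      # illustrative non-decreasing allocation rule
bI, bC, s = 0.6, 0.9, 1.3       # beta^I < beta^C <= 1, and one state s

def integral(f, lo, hi, n=20001):
    # simple trapezoid quadrature on a fixed grid
    y = np.linspace(lo, hi, n)
    v = f(y)
    return np.sum(0.5 * (v[1:] + v[:-1])) * (hi - lo) / (n - 1)

Ux = lambda y: U(x(y))

# First form of Phi(s | x) in (4):
phi1 = (s * (1 - bC) * Ux(s * bC) - s * (1 - bI) * Ux(s * bI)
        + integral(Ux, s * bI, s * bC))
# Second form of Phi(s | x) in (4):
phi2 = (s * (1 - bC) * (Ux(s * bC) - Ux(s * bI))
        + integral(lambda y: Ux(y) - Ux(s * bI), s * bI, s * bC))

assert abs(phi1 - phi2) < 1e-8   # the two forms coincide
assert phi1 > 0                  # Phi is positive for a strictly increasing x
```

The equivalence is exact (collect the coefficients on $U(x^i(s\beta^I))$), so the quadrature error cancels between the two forms.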
9.3 Proof of Corollary 1

Using condition (ENV$^j$), the expected profits equal
$$\Pi(X, T) = \mu \int \big[t^C(\theta) - c(x^C(\theta))\big] f^C(\theta)\,d\theta + (1-\mu) \int \big[t^I(\theta) - c(x^I(\theta))\big] f^I(\theta)\,d\theta$$
$$\phantom{\Pi(X, T)} = \mu \big[W^C(x^C) - U^C(x^C, k^C)\big] + (1-\mu) \big[W^I(x^I) - U^I(x^I, k^I)\big].$$
By Proposition 1, (IR$^I$) and (R-IC$^C$) imply (IR$^C$). Now, recall that $U^i(x^i, k^i)$ is decreasing in $k^i$ (see (EU)). So, if $\mu > 0$, (R-IC$^C$) must bind, and (R-IC$^I$) becomes $R^I(x^C) + R^C(x^I) \le 0$. Similarly, if $1-\mu > 0$, (IR$^I$) must also bind. If $\mu = 0$, we can safely focus on DMs that make (R-IC$^C$) and (IR$^I$) hold with equality, since we are ultimately interested in the optimal $X$. Given $X$, we can then uniquely pin down $k$ using the conditions $U^I(x^I, k^I) = 0$ and $U^C(x^C, k^C) = R^C(x^I)$. Substituting these quantities in $\Pi(X, T)$ and rearranging the principal's objective deliver problem $P'$.
9.4 Proof of Lemma 2

Let $\mathcal U := \{u : \Theta \to [U(\underline a), U(\bar a)] \mid u \text{ is non-decreasing}\}$. Then, for every $x \in M$, $u(\cdot) = U(x(\cdot)) \in \mathcal U$; similarly, for every $u \in \mathcal U$, $x(\cdot) = U^{-1}(u(\cdot)) \in M$. Also, let $\widetilde W^i(u) := W^i(U^{-1}(u))$ and $\widetilde R^i(u^j) := R^i(U^{-1}(u^j))$. It is easy to see that problem $P'$ is equivalent to the following problem:
$$P_u = \begin{cases} \max & \mu\,\widetilde W^C(u^C) + (1-\mu)\big[\widetilde W^I(u^I) - \widetilde R^C(u^I)\big] \\ \text{s.t.} & u^C, u^I \in \mathcal U \text{ and } \widetilde R^I(u^C) + \widetilde R^C(u^I) \le 0. \end{cases}$$
The space $\mathcal X := \{(u^C, u^I) \mid u^i : \Theta \to \mathbb R,\; i = C, I\}$ is linear and $\mathcal Y := \mathcal U \times \mathcal U$ is a convex subset of $\mathcal X$. The constraint functional $\widetilde R^I(\cdot) + \widetilde R^C(\cdot)$ maps $\mathcal X$ into $\mathbb R$, which has nonempty, positive, and closed cone $\mathbb R_+$. The objective is concave because $\widetilde W^i(u)$ is so due to the convexity of $U^{-1}$ and $c$. The constraint $\widetilde R^I(\cdot) + \widetilde R^C(\cdot)$ is linear, and there exists $(u^C, u^I) \in \mathcal Y$ such that $\widetilde R^I(u^C) + \widetilde R^C(u^I) < 0$: e.g., $u^C = U(x^{C*})$ and $u^I$ constant.

Let $\lambda \ge 0$ and define the Lagrangian
$$\mathcal L(u^C, u^I; \lambda) := \mu\,\widetilde W^C(u^C) + (1-\mu)\big[\widetilde W^I(u^I) - \widetilde R^C(u^I)\big] - \lambda\big[\widetilde R^I(u^C) + \widetilde R^C(u^I)\big]$$
$$\phantom{\mathcal L(u^C, u^I; \lambda) :} = \mu\big[\widetilde W^C(u^C) - \lambda^I \widetilde R^I(u^C)\big] + (1-\mu)\big[\widetilde W^I(u^I) - \lambda^C \widetilde R^C(u^I)\big], \qquad (7)$$
where $\lambda^I := \lambda/\mu$ and $\lambda^C := 1 + \lambda/(1-\mu)$.

By Corollary 1, p. 219, and Theorem 2, p. 221, of Luenberger (1969), $(u^C, u^I)$ solve $P_u$ if and only if there exists $\lambda \ge 0$ such that, for all $(u_0^C, u_0^I) \in \mathcal Y$ and $\lambda_0 \ge 0$,
$$\mathcal L(u^C, u^I; \lambda_0) \ge \mathcal L(u^C, u^I; \lambda) \ge \mathcal L(u_0^C, u_0^I; \lambda).$$
Given $\lambda \ge 0$, $u^C$ and $u^I$ maximize the first and the second term in brackets in (7) within $\mathcal U$ if and only if $\mathcal L(u^C, u^I; \lambda) \ge \mathcal L(u_0^C, u_0^I; \lambda)$ for every $(u_0^C, u_0^I) \in \mathcal Y$. Moreover, if $(u^C, u^I, \lambda)$ satisfies $\widetilde R^C(u^I) + \widetilde R^I(u^C) \le 0$ and $\lambda\big[\widetilde R^C(u^I) + \widetilde R^I(u^C)\big] = 0$, then $\mathcal L(u^C, u^I; \lambda_0) \ge \mathcal L(u^C, u^I; \lambda)$ for every $\lambda_0 \ge 0$. Finally, if $(u^C, u^I)$ solves $P_u$, then $\widetilde R^C(u^I) + \widetilde R^I(u^C) \le 0$. And if $\widetilde R^C(u^I) + \widetilde R^I(u^C) < 0$, then $\lambda$ must equal zero: otherwise there exists $\lambda_0 \in [0, \lambda)$ such that
$$\mathcal L(u^C, u^I; \lambda_0) - \mathcal L(u^C, u^I; \lambda) = (\lambda - \lambda_0)\big[\widetilde R^I(u^C) + \widetilde R^C(u^I)\big] < 0.$$

Finally, if $(u^C, u^I, \lambda)$ are such that $u^i \in \arg\max_{u \in \mathcal U}\{\widetilde W^i(u) - \lambda^j \widetilde R^j(u)\}$ (for $j \ne i$), $\lambda \ge 0$, $\widetilde R^C(u^I) + \widetilde R^I(u^C) \le 0$, and $\lambda\big[\widetilde R^C(u^I) + \widetilde R^I(u^C)\big] = 0$, then $x^i = U^{-1}(u^i) \in \arg\max_{x \in M}\{W^i(x) - \lambda^j R^j(x)\}$, $R^C(x^I) + R^I(x^C) \le 0$, and $\lambda\big[R^C(x^I) + R^I(x^C)\big] = 0$. Similarly, if $(x^C, x^I, \lambda)$ are such that $x^i \in \arg\max_{x \in M}\{W^i(x) - \lambda^j R^j(x)\}$, $\lambda \ge 0$, $R^C(x^I) + R^I(x^C) \le 0$, and $\lambda\big[R^C(x^I) + R^I(x^C)\big] = 0$, then $u^i = U(x^i) \in \arg\max_{u \in \mathcal U}\{\widetilde W^i(u) - \lambda^j \widetilde R^j(u)\}$, $\widetilde R^C(u^I) + \widetilde R^I(u^C) \le 0$, and $\lambda\big[\widetilde R^C(u^I) + \widetilde R^I(u^C)\big] = 0$.
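The saddle-point logic invoked from Luenberger (1969) — maximize a concave objective, minimize the Lagrangian over the multiplier, and use complementary slackness — can be illustrated on a toy one-dimensional problem. Everything below is a hypothetical stand-in, not the model's objective or constraint:

```python
# Toy problem: max f(u) = -(u - 2)^2  subject to  g(u) = 1 - u >= 0.
f = lambda u: -(u - 2.0) ** 2
g = lambda u: 1.0 - u

def argmax_L(lam):
    # For fixed lam >= 0, maximize L(u) = f(u) + lam * g(u):
    # L'(u) = -2(u - 2) - lam = 0  =>  u = 2 - lam / 2.
    return 2.0 - lam / 2.0

# Bisect on the multiplier until the constraint is enforced.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if g(argmax_L(mid)) < 0:
        lo = mid          # multiplier too small: constraint violated
    else:
        hi = mid          # feasible: try a smaller multiplier
lam = hi
u = argmax_L(lam)

assert abs(u - 1.0) < 1e-6       # constrained optimum
assert abs(lam * g(u)) < 1e-6    # complementary slackness: lam * g = 0
assert lam > 0                   # constraint binds, so the multiplier is positive
```

As in the proof above, a slack constraint would force the multiplier to zero, and a binding one makes it positive.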
9.5 Proof of Proposition 2

Preliminary Step: First, I rewrite the objective $W^I(x^I) - \lambda^C R^C(x^I)$ in Lemma 2 as an expected virtual surplus. To do so, note that
$$W^i(x^i) = \int \big[u_1(x^i(\theta); \theta/\beta^i) - c(x^i(\theta))\big]\,dF^i(\theta). \qquad (8)$$
Next, using (EU), we have
$$R^C(x^I) = \int \Big[\frac{1-\beta^C}{\beta^C}\,\theta\,U(x^I(\theta)) - \int U(x^I(y))\,dy\Big]\,dF^C - \int \Big[\frac{1-\beta^I}{\beta^I}\,\theta\,U(x^I(\theta)) - \int U(x^I(y))\,dy\Big]\,dF^I.$$
Changing the order of integration and rearranging, we can express $R^C(x^I)$ as
$$R^C(x^I) = -\int_{\bar\theta^I}^{\bar\theta^C} U(x^I(\theta))\,g^C(\theta)\,d\theta - \int_{\underline\theta^I}^{\bar\theta^I} U(x^I(\theta))\,V^I(\theta)\,dF^I(\theta), \qquad (9)$$
where $g^C : (\bar\theta^I, \bar\theta^C] \to \mathbb R$ and $V^I : [\underline\theta^I, \bar\theta^I] \to \mathbb R$ are given by
$$g^C(\theta) := \frac{1-\beta^C}{\beta^C}\,\theta\,f^C(\theta) - \big(1 - F^C(\theta)\big) \quad \text{and} \quad V^I(\theta) := v^I(\theta) - \frac{f^C(\theta)}{f^I(\theta)}\,v^C(\theta),$$
with $v^i(\theta) := \dfrac{1-\beta^i}{\beta^i}\,\theta - \dfrac{1 - F^i(\theta)}{f^i(\theta)}$. Using (8) and (9), we get that $W^I(x^I) - \lambda^C R^C(x^I)$ equals the expected virtual surplus
$$VS^I(x^I; \lambda^C) := \int_{\underline\theta^I}^{\bar\theta^I} \big[U(x^I(\theta))\,w^I(\theta; \lambda^C) - c(x^I(\theta))\big]\,dF^I(\theta) + \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} U(x^I(\theta))\,g^C(\theta)\,d\theta, \qquad (11)$$
where, for $\theta \in \Theta^I$,
$$w^I(\theta; \lambda^C) := \theta/\beta^I + \lambda^C\,V^I(\theta), \qquad (10)$$
and $V^I(\theta)$ is the virtual valuation of $U(x^I(\theta))$.

Finally, as in the proof of Lemma 2, it is convenient to work in terms of the functions $u^I \in \mathcal U$. The properties of the corresponding allocation rule follow by letting $x^I = U^{-1}(u^I)$.
Part 1: Existence and Uniqueness.

Step 1: Construction of the generalized version of $VS^I$ using Toikka's (2011) technique over $[\underline\theta^I, \bar\theta^I]$.

Since the density $f^I$ is positive, the inverse function $(F^I)^{-1} : [0,1] \to [\underline\theta^I, \bar\theta^I]$ is well-defined, increasing, and continuous. Fix $\lambda^C$ and define, for $q \in [0,1]$,
$$z(q; \lambda^C) := w^I\big((F^I)^{-1}(q); \lambda^C\big) \quad \text{and} \quad Z(q; \lambda^C) := \int_0^q z(y; \lambda^C)\,dy.$$
The function $z$ is continuous in $q$, except possibly at $q^m := F^I(\theta^m)$, where $\theta^m := \min\{\bar\theta^I, \bar\theta^C\}$: if $\beta^C < 1$ and $\bar\theta^C < \bar\theta^I$, so that $\theta^m = \bar\theta^C$, we have
$$\lim_{\theta \uparrow \theta^m} w^I(\theta; \lambda^C) \;>\; \lim_{\theta \downarrow \theta^m} w^I(\theta; \lambda^C), \qquad (12)$$
because the term $\lambda^C (f^C/f^I)\,v^C$ in $w^I$ drops out above $\bar\theta^C$. Let $\Phi := \text{conv}\,Z$ be the convex hull of $Z$ on $[0,1]$. Let $\omega : [0,1] \to \mathbb R$ be defined as $\omega(q; \lambda^C) := \Phi'(q; \lambda^C)$, whenever $\Phi'(q; \lambda^C)$ exists. Also, w.l.o.g., extend $\omega(q; \lambda^C)$ by right-continuity to all $[0,1)$ and by left-continuity at 1.

Lemma 7 The function $\omega$ is continuous in $q$ and $\lambda^C$.
Proof. (Continuity in $q$). Throughout this part of the proof suppress $\lambda^C$. Continuity at 0 and 1 holds by construction. Since $z$ is continuous at every $q \in (0,1) \setminus \{q^m\}$, so is $\omega$. To see this, note that continuity of $z$ at $q$ implies that $Z'(q) = z(q)$. There are two cases to consider. First, suppose $\Phi(q) < Z(q)$. By definition, $\omega(\cdot)$ must be constant at $\omega(q)$ in a neighborhood of $q$. It follows that $\omega$ is continuous at $q$. Second, suppose $\Phi(q) = Z(q)$. Since $\Phi$ is convex and $Z(y) \ge \Phi(y)$ for every $y$, we have:
$$\Phi'_+(q) = \lim_{y \downarrow q} \frac{\Phi(y) - \Phi(q)}{y - q} \le \lim_{y \downarrow q} \frac{Z(y) - Z(q)}{y - q} = Z'_+(q),$$
$$\Phi'_-(q) = \lim_{y \uparrow q} \frac{\Phi(q) - \Phi(y)}{q - y} \ge \lim_{y \uparrow q} \frac{Z(q) - Z(y)}{q - y} = Z'_-(q).$$
Since $\Phi'_-(q) \le \Phi'_+(q)$ and $Z$ is differentiable at $q$, we have that $\Phi'_-(q) = \Phi'_+(q)$, and $\omega$ is continuous at $q$.

Consider now $q^m$. If $\theta^m = \bar\theta^I$, then $q^m = 1$ and we are done. To prove that $\omega$ is continuous if $q^m \in (0,1)$, it is enough to show that if $z$ jumps at $q^m$, then $\Phi(q^m) < Z(q^m)$. Recall that $z(q^m-) := \lim_{q \uparrow q^m} z(q) = \lim_{\theta \uparrow \theta^m} w^I(\theta; \lambda^C)$ and $z(q^m+) := \lim_{q \downarrow q^m} z(q) = \lim_{\theta \downarrow \theta^m} w^I(\theta; \lambda^C)$. Hence $z$ can at most jump down at $q^m$ (see (12)), in which case $z(q^m-) > z(q^m+)$. Also recall that $z(q^m+) = z(q^m)$. Suppose $\Phi(q^m) = Z(q^m)$. By the same steps as above, $\Phi'_+(q^m) \le Z'_+(q^m) = z(q^m)$. By convexity of $\Phi$, $\omega(q) \le \Phi'_+(q^m)$ for all $q \le q^m$. Therefore, for $q$ close enough to $q^m$ from the left,
$$\Phi(q) = \Phi(q^m) - \int_q^{q^m} \omega(s)\,ds > Z(q^m) - \int_q^{q^m} z(s)\,ds = Z(q).$$
A contradiction.

(Continuity in $\lambda^C$). The function $Z(q; \lambda^C)$ is continuous in $\lambda^C$ for every $q$. So is $\Phi$ if $q \in \{0,1\}$, since $\Phi(0; \lambda^C) = Z(0; \lambda^C)$ and $\Phi(1; \lambda^C) = Z(1; \lambda^C)$. Consider $q \in (0,1)$. For every $q$ and $\lambda^C \ge 0$, $\Phi$ is defined as
$$\Phi(q; \lambda^C) := \min\big\{\alpha Z(q_1; \lambda^C) + (1-\alpha) Z(q_2; \lambda^C) \;\big|\; (\alpha, q_1, q_2) \in [0,1]^3,\; q = \alpha q_1 + (1-\alpha) q_2\big\}.$$
By continuity of $Z(q; \lambda^C)$ and the Maximum Theorem, $\Phi(q; \cdot)$ is continuous in $\lambda^C$ for every $q$. Furthermore, $\Phi(\cdot; \lambda^C)$ is differentiable in $q$ with derivative $\omega(\cdot; \lambda^C)$. Fix $q \in (0,1)$ and any sequence $\{\lambda_n^C\}$ such that $\lim_{n\to\infty} \lambda_n^C = \lambda^C$. Since $\lim_{n\to\infty} \Phi(q; \lambda_n^C) = \Phi(q; \lambda^C)$, it follows from Theorem 25.7, p. 248, of Rockafellar (1970) that $\lim_{n\to\infty} \omega(q; \lambda_n^C) = \omega(q; \lambda^C)$.
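The convex-hull ("ironing") construction of Step 1 is easy to sketch numerically. The following is a minimal illustration with a hypothetical non-monotone virtual valuation (the function `z` below is a stand-in, not the model's $w^I$): it builds $Z$, its convex hull $\Phi$, and the ironed derivative $\omega = \Phi'$, which is non-decreasing by construction.

```python
import numpy as np

# Hypothetical non-monotone "virtual valuation" on quantiles q in [0, 1],
# a stand-in for z(q) = w^I((F^I)^{-1}(q)).
def z(q):
    return q + 0.4 * np.cos(3 * np.pi * q)

q = np.linspace(0.0, 1.0, 1001)
zq = z(q)
# Z(q) := integral_0^q z(y) dy, by the trapezoid rule.
Z = np.concatenate(([0.0], np.cumsum(0.5 * (zq[1:] + zq[:-1]) * np.diff(q))))

# Greatest convex minorant (conv Z): lower hull of the points (q_i, Z_i)
# via a monotone slope scan: pop the middle point whenever slopes fail
# to be increasing.
hull = [0]
for i in range(1, len(q)):
    while len(hull) >= 2:
        a, b = hull[-2], hull[-1]
        if (Z[b] - Z[a]) * (q[i] - q[b]) >= (Z[i] - Z[b]) * (q[b] - q[a]):
            hull.pop()   # b lies on or above the chord from a to i
        else:
            break
    hull.append(i)

Phi = np.interp(q, q[hull], Z[hull])              # piecewise-linear hull
omega = np.diff(Z[hull]) / np.diff(q[hull])       # ironed slopes

assert np.all(Phi <= Z + 1e-9)                    # Phi = conv Z lies below Z
assert np.all(np.diff(omega) >= -1e-9)            # omega is non-decreasing
```

On intervals where $\Phi < Z$, `omega` is constant — exactly the pooling regions used throughout the proof.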
Now, for $\theta \in \Theta^I$, define the generalized virtual valuation
$$\bar w^I(\theta; \lambda^C) := \omega\big(F^I(\theta); \lambda^C\big),$$
which is non-decreasing by construction and continuous by Lemma 7. Let $\Lambda(\cdot) := -U^{-1}(\cdot) - c(U^{-1}(\cdot))$ and replace $w^I$ with $\bar w^I$ and $x = U^{-1}(u)$ in $VS^I$ to get:
$$\overline{VS}^I(u; \lambda^C) := \int_{\underline\theta^I}^{\bar\theta^I} \big[u(\theta)\,\bar w^I(\theta; \lambda^C) + \Lambda(u(\theta))\big]\,dF^I + \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} u(\theta)\,g^C(\theta)\,d\theta.$$
Step 2: Derivation of a candidate solution, which maximizes $\overline{VS}^I$, in two steps.

For $\theta \in \Theta^I$, define $\varphi(y; \theta; \lambda^C) := y\,\bar w^I(\theta; \lambda^C) + \Lambda(y)$ and let
$$\bar u^I(\theta; \lambda^C) := \arg\max_{y \in [U(\underline a), U(\bar a)]} \varphi(y; \theta; \lambda^C) \quad \text{for } \theta \in \Theta^I, \qquad (13)$$
and $\bar u^I(\theta; \lambda^C) = U(\underline a)$ for $\theta \in (\bar\theta^I, \bar\theta^C]$; $\bar u^I$ is the unique pointwise maximizer of $\overline{VS}^I$. Although $\bar u^I(\lambda^C)$ is non-decreasing on $\Theta^I$, it may not be non-decreasing over $\Theta$. The next lemma shows that any monotone maximizer of $\overline{VS}^I$, if it exists, must belong to a certain subclass of $\mathcal U$.
Lemma 8 Suppose $u^I \in \mathcal U$ and $\overline{VS}^I(u^I; \lambda^C) = \max_{u \in \mathcal U} \overline{VS}^I(u; \lambda^C)$. Then, $u^I$ must satisfy
$$u^I(\theta; \lambda^C) = \begin{cases} \bar u^I(\theta; \lambda^C) & \text{if } \theta < \theta^b \\ y^b(\lambda^C) & \text{if } \theta^b \le \theta \le \bar\theta^C, \end{cases}$$
where $\theta^b \in [\underline\theta^I, \bar\theta^I]$ and $y^b(\lambda^C) \ge \bar u^I(\theta^b; \lambda^C)$. Furthermore, if $\theta^b > \underline\theta^I$, then $y^b(\lambda^C) = \bar u^I(\theta^b; \lambda^C)$.
Proof. Suppress $\lambda^C$ and suppose that $u^I \in \mathcal U$ maximizes $\overline{VS}^I$. I claim that $u^I(\theta) = u^I(\bar\theta^I)$ for $\theta \in (\bar\theta^I, \bar\theta^C]$. Otherwise, there exists $\theta_0 \in (\bar\theta^I, \bar\theta^C)$ such that $u^I(\theta) > u^I(\bar\theta^I)$ for every $\theta > \theta_0$. But then $u^I$ can't be optimal within $\mathcal U$: because $g^C < 0$,
$$\int_{\bar\theta^I}^{\bar\theta^C} \big[u^I(\theta) - u^I(\bar\theta^I)\big]\,g^C(\theta)\,d\theta \le \int_{\theta_0}^{\bar\theta^C} \big[u^I(\theta) - u^I(\bar\theta^I)\big]\,g^C(\theta)\,d\theta < 0,$$
so replacing $u^I$ with the constant $u^I(\bar\theta^I)$ on $(\bar\theta^I, \bar\theta^C]$ strictly improves $\overline{VS}^I$.

Now consider $u^I(\theta)$ for $\theta \in \Theta^I$. Recall that $\varphi(y; \theta)$ in (13) is strictly concave in $y$ and continuous in $\theta$. Because $\bar u^I(\theta)$ is continuous and non-decreasing over $\Theta^I$, only two cases can arise.

Case 1: $\bar u^I(\bar\theta^I) \le u^I(\underline\theta^I)$. I claim that $u^I(\theta) = u^I(\underline\theta^I)$ for $\theta \in (\underline\theta^I, \bar\theta^I]$. If not, there exists $\theta_0 > \underline\theta^I$ such that $\bar u^I(\theta) \le u^I(\underline\theta^I) < u^I(\theta)$ for every $\theta \ge \theta_0$. By strict concavity, for $\theta \in (\underline\theta^I, \bar\theta^I]$, $\varphi(u^I(\underline\theta^I); \theta) \ge \varphi(u^I(\theta); \theta)$, with strict inequality at least for every $\theta \ge \theta_0$; hence $\int_{\Theta^I} \varphi(u^I(\underline\theta^I); \theta)\,dF^I > \int_{\Theta^I} \varphi(u^I(\theta); \theta)\,dF^I$, contradicting the optimality of $u^I$.

Case 2: $\bar u^I(\theta^b) > u^I(\underline\theta^I)$ for some $\theta^b \in (\underline\theta^I, \bar\theta^I]$. I claim that $u^I(\theta) = \min\{u_b^I, \bar u^I(\theta)\}$, where $u_b^I := u^I(\bar\theta^I)$. Suppose not. First, consider $(\theta^b, \bar\theta^I]$ and suppose that $u^I(\theta) < u_b^I$ for some $\theta \in (\theta^b, \bar\theta^I]$. Then, by the same argument as in case 1, setting $u^I(\theta) = u_b^I$ for every such $\theta$ is a strict improvement on $u^I$: the resulting function is in $\mathcal U$ and $\int_{\theta^b}^{\bar\theta^I} \varphi(u_b^I; \theta)\,dF^I > \int_{\theta^b}^{\bar\theta^I} \varphi(u^I(\theta); \theta)\,dF^I$. Second, consider $(\underline\theta^I, \theta^b]$ and suppose $u^I(\theta_0) \ne \bar u^I(\theta_0)$ for some $\theta_0$. If $u^I(\theta_0) > \bar u^I(\theta_0)$, then by continuity of $\bar u^I$ and monotonicity of $u^I$ there exists a $\theta_{00} > \theta_0$ such that $u^I(\theta) > \bar u^I(\theta)$ for every $\theta \in (\theta_0, \theta_{00})$. Similarly, if $u^I(\theta_0) < \bar u^I(\theta_0)$, then there exists $\theta_{000} < \theta_0$ such that $u^I(\theta) < \bar u^I(\theta)$ for every $\theta \in (\theta_{000}, \theta_0)$. Finally, since $\bar u^I(\theta)$ is the unique maximizer of $\varphi(\cdot; \theta)$, for $\theta \in (\underline\theta^I, \theta^b]$, $\varphi(\bar u^I(\theta); \theta) \ge \varphi(u^I(\theta); \theta)$, with strict inequality at least for every $\theta \in (\theta_{000}, \theta_0) \cup (\theta_0, \theta_{00})$; hence $\int_{\Theta^I} \varphi(\bar u^I(\theta); \theta)\,dF^I > \int_{\Theta^I} \varphi(u^I(\theta); \theta)\,dF^I$, which contradicts the optimality of $u^I$.

It remains to show that $u^I(\bar\theta^I) > \bar u^I(\bar\theta^I)$ is impossible. Suppose the opposite is true. By the same argument as in case 2, $u^I(\theta) = \bar u^I(\theta)$ for $\theta \in (\underline\theta^I, \bar\theta^I)$. Then setting $u^I(\bar\theta^I) > \bar u^I(\bar\theta^I)$ can't be optimal for the following reason: since $u^I(\theta) = u^I(\bar\theta^I)$ for $\theta \in (\bar\theta^I, \bar\theta^C)$, and $g^C(\theta)$ is negative, reducing $u^I(\bar\theta^I)$ to $\bar u^I(\bar\theta^I)$ satisfies monotonicity and strictly improves $\overline{VS}^I$.

By Lemma 8, $u^I$ must be continuous on $(\underline\theta^I, \bar\theta^C)$. Although Lemma 8 doesn't pin down $u^I$ at $\underline\theta^I$ and $\bar\theta^C$, it is w.l.o.g. to extend $u^I$ at $\underline\theta^I$ and $\bar\theta^C$ by continuity. The next lemma proves that a maximizer of $\overline{VS}^I$ exists, and shows that it is unique over $(\underline\theta^I, \bar\theta^C)$ and, therefore, also over $[\underline\theta^I, \bar\theta^C]$ w.l.o.g..

Lemma 9 There exists $u^I$ such that $\overline{VS}^I(u^I; \lambda^C) = \max_{u \in \mathcal U} \overline{VS}^I(u; \lambda^C)$; such an $u^I$ is unique.
Proof. Suppress C . By Lemma 8, if a solution uI exists, it can take only two forms: (1)
C
C
uI equals a constant y
uI ( I ) over [ I ; ], (2) uI is constant at uI b over [ b ; ], with
b
I
b
, and equals uI ( ) for every
. So we only need to look for a solution within this
subclass of U.
C
I
f I (uI ): the …rst
Subclass 1: If uI constant at y over [ I ; ], then VS (uI ) =VS I (uI ) = W
equality follows from
R1
R I I
wI ( )]dF I = 0 [z (q) ! (q)] dq = 0;
I [w ( )
the second from Proposition 1. Moreover, if uI is constant at y over [ I ;
R I
f I (uI ) = y I ( = I )dF I + (y).
W
C
]
(14)
By continuity and strictly concavity, there exists a unique constant function that maximizes
I
VS (uI ). Call it uI1 .
I
Subclass 2: Using (y; ) in (13), VS equals to the following function of b :
R I
R b
b
= I (uI ( ) ; )dF I + b (uI b ; )dF I + uI b K,
(15)
where K :=
C
R
C
I
g C ( ) d . Continuity of uI implies that
b
I
b
is continuous, therefore there
I
exists an optimal 2 [ ; ] that completely identi…es a maximizer within the second subclass
of functions. Since uI can be locally ‡at, there can be multiple solutions b . Nonetheless, I
claim that there can’t be two solutions 1b and 2b such that uI ( 1b ) 6= uI ( 2b ). Suppose to the
b
contrary that 1b < 2b both maximize
, and uI ( 1b ) < uI ( 2b ). W.l.o.g. assume that 1b
is the largest such that uI ( ) = uI ( 1b ). Let u1 and u2 be the functions corresponding to 1b
C
e := u1 + (1
and 2b , and for 2 (0; 1) let u
) u2 2 U. For 2 ( 1b ; ], u2 ( ) 6= u1 ( ),
whereas for 2 [ I ; 1b ], u2 ( ) = u1 ( ) = uI ( ). By the strict concavity of (y; ), we have
R 1b
R I
e( )K >
(e
u ( ) ; )dF I + 1b (e
u ( ) ; )dF I + u
( 1b ) + (1
) ( 2b ).
I
b
b
C
e is constant on [ 2b ; ] at some uI (~ ), with ~ 2 ( 1b ; 2b ). Therefore, the function
Note that u
b
uI ( ) = minfuI (~ ); uI ( )g satis…es property (2) and, by the same argument as in Lemma 8
(Case 2),
R I
R 1b
b
e ( ) K > ( 2b ).
(e
u ( ) ; )dF I ( ) + 1b (e
u ( ) ; )dF I ( ) + u
(~ )
I
b
The claim follows. Hence any maximizer of
yields a unique uI that satis…es property (2).
I
Call it u2 .
I
I
By an argument similar to that for the uniqueness of uI2 , VS (uI2 ) = VS (uI1 ) if and only if
C
I
uI2
uI1 (over ( I ; )). Therefore, the overall maximizer of VS is unique, and equals uI1 if
I
I
VS (uI1 ) VS (uI2 ), and uI2 otherwise.
Step 3: The unique maximizer of $\overline{VS}^I$, denoted $u^I(\lambda^C)$, is also the unique maximizer of $VS^I(U^{-1}(u))$. The argument modifies Toikka's (2011) proof of Theorem 3.7 and Corollary 3.9 to account for $(\bar\theta^I, \bar\theta^C]$.

Lemma 10 The function $u^I(\lambda^C)$ is the unique maximizer of $VS^I(U^{-1}(u))$.

Proof. Suppress $\lambda^C$. Integrating by parts and recalling that $u \in \mathcal U$, we have
$$\int_{\underline\theta^I}^{\bar\theta^I} u(\theta)\,\big[w^I(\theta) - \bar w^I(\theta)\big]\,dF^I = \int_{\underline\theta^I}^{\bar\theta^I} u(\theta)\,\big[z(F^I(\theta)) - \omega(F^I(\theta))\big]\,dF^I$$
$$= u(\theta)\,\big[Z(F^I(\theta)) - \Phi(F^I(\theta))\big]\Big|_{\underline\theta^I}^{\bar\theta^I} - \int_{\underline\theta^I}^{\bar\theta^I} \big[Z(F^I(\theta)) - \Phi(F^I(\theta))\big]\,du(\theta)$$
$$= \int_{\underline\theta^I}^{\bar\theta^I} \big[\Phi(F^I(\theta)) - Z(F^I(\theta))\big]\,du(\theta) \le 0.$$
The last equality follows from $Z(0) = \Phi(0)$ and $Z(1) = \Phi(1)$; the inequality follows from $u \in \mathcal U$ and $\Phi(q) \le Z(q)$ for $q \in [0,1]$. Re-writing $VS^I(U^{-1}(u))$, we have
$$\sup_{u \in \mathcal U} VS^I(U^{-1}(u)) = \sup_{u \in \mathcal U} \Big\{\overline{VS}^I(u) + \int_{\underline\theta^I}^{\bar\theta^I} \big[\Phi(F^I(\theta)) - Z(F^I(\theta))\big]\,du(\theta)\Big\}.$$
We know that $u^I \in \mathcal U$ and achieves the supremum of $\overline{VS}^I(u)$. We only have to show that
$$\int_{\underline\theta^I}^{\bar\theta^I} \big[\Phi(F^I(\theta)) - Z(F^I(\theta))\big]\,du^I(\theta) = 0. \qquad (16)$$
If $u^I$ is constant over $[\underline\theta^I, \bar\theta^C]$, then $du^I \equiv 0$ and we are done. Otherwise, consider the pointwise solution $\bar u^I$ over $[\underline\theta^I, \bar\theta^I]$ as defined in (13), and a $\theta$ such that $\Phi(F^I(\theta)) < Z(F^I(\theta))$. For some open interval $N$ around $\theta$, $\omega(F^I(\cdot))$ — and hence $\bar w^I$ — is constant, and $\bar u^I$ is constant over $N$. Therefore, $d\bar u^I(\theta)$ assigns zero measure to any such $N$, and satisfies (16). I claim that also $du^I$ does so to any such $N$. Consider $[\theta^b, \bar\theta^C]$, over which $u^I$ is constant at $\bar u^I(\theta^b)$. If $N \subseteq [\theta^b, \bar\theta^C]$, the claim is immediate. The same holds if $N \cap [\theta^b, \bar\theta^C] = \emptyset$, because then $u^I(\theta) = \bar u^I(\theta)$ for $\theta \in N$. Finally, if both $N \cap [\theta^b, \bar\theta^C] \ne \emptyset$ and $N \cap [\underline\theta^I, \theta^b) \ne \emptyset$ (therefore, $\theta^b > \underline\theta^I$), then $u^I$ equals $\bar u^I(\theta^b)$ for every $\theta \in [\theta^b, \bar\theta^C] \cup N$, which implies the claim. Hence (16) holds also for a non-constant $u^I$.

By Lemma 9, for any $\tilde u \in \mathcal U$ that differs from $u^I$ over $(\underline\theta^I, \bar\theta^C)$, $\overline{VS}^I(\tilde u) < \overline{VS}^I(u^I)$. Uniqueness follows on $(\underline\theta^I, \bar\theta^C)$, and extending it to $[\underline\theta^I, \bar\theta^C]$ is w.l.o.g..
Part 2: Continuity and Limit Behavior of $u^I$

I proved continuity of $u^I$ in $\theta$ in Part 1; I now prove continuity in $\lambda^C$. By the definition of $\bar u^I(\theta; \cdot)$ in (13) and the Maximum Theorem, $\bar u^I(\theta; \cdot)$ is continuous in $\lambda^C$ for $\theta \in [\underline\theta^I, \bar\theta^I]$. Now consider $\Sigma(\theta^b; \lambda^C)$ in (15). Pointwise continuity of $\bar w^I(\theta; \cdot)$ and $\bar u^I(\theta; \cdot)$ implies that $\Sigma(\theta^b; \lambda^C)$ is continuous in $\lambda^C$, hence $\theta^b(\lambda^C) := \arg\max_{\theta \in \Theta^I} \Sigma(\theta; \lambda^C)$ is u.h.c.. Furthermore, recall that for every $\theta, \theta' \in \theta^b(\lambda^C)$, $\bar u^I(\theta; \lambda^C) = \bar u^I(\theta'; \lambda^C)$. Take any sequence $\{\lambda_n^C\}$ with $\lim_{n\to\infty} \lambda_n^C = \lambda^C$. Then, $\lim_{n\to\infty} \theta^b(\lambda_n^C) = \theta^b(\lambda^C)$. Recall that the candidate $u_2^I(\lambda^C)$ maximizing $\Sigma(\cdot; \lambda^C)$ satisfies $u_2^I(\theta; \lambda^C) = \min\{\bar u^I(\theta^b(\lambda^C); \lambda^C), \bar u^I(\theta; \lambda^C)\}$ for $\theta \in [\underline\theta^I, \bar\theta^I]$, and $u_2^I(\theta; \lambda^C) = \bar u^I(\theta^b(\lambda^C); \lambda^C)$ for $\theta > \bar\theta^I$. Hence, by continuity of $\bar u^I$, $\lim_{n\to\infty} u_2^I(\theta; \lambda_n^C) = u_2^I(\theta; \lambda^C)$ for every $\theta \in [\underline\theta^I, \bar\theta^C]$. Finally, recall that the constant solution $u_1^I$ in the proof of Lemma 9, as well as (14), is independent of $\lambda^C$. It remains to show that the actual solution $u^I(\lambda_n^C)$ converges pointwise to the actual solution $u^I(\lambda^C)$. Suppose first that $VS^I(U^{-1}(u_1^I)) > VS^I(U^{-1}(u_2^I(\lambda^C))) = \Sigma(\theta^b(\lambda^C); \lambda^C)$. By continuity of $\Sigma$, there exists $N$ such that $n \ge N$ implies $VS^I(U^{-1}(u_1^I)) > VS^I(U^{-1}(u_2^I(\lambda_n^C)))$. Therefore, for $n \ge N$, $u^I(\theta; \lambda_n^C) = u_1^I$ for every $\theta \in [\underline\theta^I, \bar\theta^C]$. Now, suppose that $VS^I(U^{-1}(u_1^I)) < VS^I(U^{-1}(u_2^I(\lambda^C)))$. Similarly, for $n$ large enough $u^I(\lambda_n^C) = u_2^I(\lambda_n^C)$, which converges pointwise to $u^I(\lambda^C)$. Finally, if $VS^I(U^{-1}(u_1^I)) = VS^I(U^{-1}(u_2^I(\lambda^C)))$, then $u_1^I \equiv u_2^I(\lambda^C)$. Hence
$$\big|u_1^I - u^I(\theta; \lambda_n^C)\big| \le \max\big\{0,\; \big|u_1^I - u_2^I(\theta; \lambda_n^C)\big|\big\} \to 0 \text{ as } n \to \infty.$$

To prove that $u^I(\lambda^C) \to u^I = U(x^I)$ pointwise as $\lambda^C \to 0$, it is enough to observe that $VS^I(U^{-1}(u); 0) = \widetilde W^I(u)$. Therefore, $u^I(\theta; 0) = u^I(\theta)$ for $\theta \in (\underline\theta^I, \bar\theta^I)$, which can be extended to $[\underline\theta^I, \bar\theta^C]$ by letting $u^I(\underline\theta^I; 0) = u^I(\underline\theta^I)$ and $u^I(\theta; 0) = u^I(\bar\theta^I)$ for $\theta \ge \bar\theta^I$. Finally, I prove that $\max_\theta \big|u^I(\theta; \lambda^C) - U(a^{nf})\big| \to 0$ as $\lambda^C \to +\infty$ by first showing pointwise convergence. Recall that $u^I(\lambda^C)$ maximizes $VS^I(U^{-1}(u); \lambda^C) = \widetilde W^I(u) - \lambda^C \widetilde R^C(u)$ and that, by Proposition 1, $\widetilde R^C(u) > 0$ for any $u \in \mathcal U$ that is not constant on $(\underline\theta^I, \bar\theta^C)$. Clearly, $u^I(\lambda^C)$ cannot converge to a constant function that takes value $y_0 \ne U(a^{nf})$, because $U(a^{nf})$ is the unique maximizer of $\widetilde W^I(\cdot)$ in (14). Now, suppose that $u^I(\lambda^C)$ converges pointwise to a function, denoted $u_\infty^I$, that is not constant on $(\underline\theta^I, \bar\theta^C)$. Then, there exists $\hat\lambda^C$ large enough so that for every $\lambda^C > \hat\lambda^C$, $u_\infty^I$ is strictly dominated by $U(a^{nf})$. This is because $\widetilde W^I(u_\infty^I)$ is bounded and $\widetilde R^C(u_\infty^I) > 0$; hence there exists $\hat\lambda^C \ge 0$ so that $\widetilde W^I(u_\infty^I) - \hat\lambda^C \widetilde R^C(u_\infty^I) \le \widetilde W^I(U(a^{nf}))$. Now, consider the unique extension of $u^I(\lambda^C)$ by continuity. By monotonicity, we have
$$\max_\theta \big|u^I(\theta; \lambda^C) - U(a^{nf})\big| = \max\big\{\big|u^I(\underline\theta^I; \lambda^C) - U(a^{nf})\big|,\; \big|u^I(\bar\theta^C; \lambda^C) - U(a^{nf})\big|\big\},$$
which completes the argument.
Part 3: Properties (a)-(c) of $u^I$

Property (b): Suppress $\lambda^C$. Recall: (1) $u^I$ satisfies Lemma 8; (2) for $\theta \in \Theta^I$, $\bar u^I$ is defined by (13) and is continuous; (3) $\Sigma(\cdot)$ is as in (15). Suppose $\theta^b > \underline\theta^I$. For $\theta < \theta^b$, $u^I(\theta) = \bar u^I(\theta) < \bar u^I(\theta^b) = u^I(\theta^b)$ by Lemma 8, and $\Sigma(\theta^b) \ge \Sigma(\theta)$ by construction; hence
$$\frac{\Sigma(\theta^b) - \Sigma(\theta)}{\bar u^I(\theta^b) - \bar u^I(\theta)} = \int_\theta^{\theta^b} \frac{\varphi(\bar u^I(y); y) - \varphi(\bar u^I(\theta); y)}{\bar u^I(\theta^b) - \bar u^I(\theta)}\,dF^I + \int_{\theta^b}^{\bar\theta^I} \bar w^I(y)\,dF^I + \frac{\Lambda(\bar u^I(\theta^b)) - \Lambda(\bar u^I(\theta))}{\bar u^I(\theta^b) - \bar u^I(\theta)}\big(1 - F^I(\theta^b)\big) + K \ge 0.$$
Since $\bar u^I$ and $\bar w^I$ are non-decreasing,
$$\int_\theta^{\theta^b} \frac{\varphi(\bar u^I(y); y) - \varphi(\bar u^I(\theta); y)}{\bar u^I(\theta^b) - \bar u^I(\theta)}\,dF^I \le \max\{|\bar w^I(\underline\theta^I)|, |\bar w^I(\theta^b)|\}\,\big(F^I(\theta^b) - F^I(\theta)\big).$$
Since $\bar u^I$ is continuous, there exists $\tilde\theta \in [\underline\theta^I, \theta^b)$ such that $\bar u^I(\theta)$ is interior for $\theta \in (\tilde\theta, \theta^b)$. By the Mean Value Theorem, for every $\theta \in (\tilde\theta, \theta^b)$, there exists $\xi \in (\theta, \theta^b)$ such that
$$\frac{\Lambda(\bar u^I(\theta^b)) - \Lambda(\bar u^I(\theta))}{\bar u^I(\theta^b) - \bar u^I(\theta)} = \Lambda'(\bar u^I(\xi)) = -\bar w^I(\xi).$$
Taking the limit $\theta \uparrow \theta^b$, both bounding terms vanish, and therefore
$$\lim_{\theta \uparrow \theta^b} \frac{\Sigma(\theta^b) - \Sigma(\theta)}{\bar u^I(\theta^b) - \bar u^I(\theta)} = \int_{\theta^b}^{\bar\theta^I} \bar w^I(y)\,dF^I(y) + K + \Lambda'(\bar u^I(\theta^b))\,\big(1 - F^I(\theta^b)\big) \ge 0. \qquad (17)$$
It follows that $\theta^b < \bar\theta^I$ because $K < 0$ and $\Lambda'(\cdot) < 0$.

I claim that there exists $\theta \in (\theta^b, \bar\theta^I]$ such that $\bar u^I(\theta) > \bar u^I(\theta^b)$. Suppose not. If $\bar u^I(\theta^b)$ is interior, $\bar w^I(\theta) = -\Lambda'(\bar u^I(\theta^b))$ for every $\theta \ge \theta^b$, and (17) is violated. If $\bar u^I(\theta^b) = U(\bar a)$ and the set $\Theta^a := \{\theta \in \Theta^I \mid \bar w^I(\theta) > -\Lambda'(U(\bar a))\}$ is non-empty — if $\Theta^a = \emptyset$, we are back to the previous case — then (17) is violated again. To see this, note that since $\theta^b$ is the smallest $\theta$ for which $\bar w^I(\theta) = -\Lambda'(U(\bar a))$, $Z(F^I(\theta^b)) = \Phi(F^I(\theta^b))$, which implies $\int_{\theta^b}^{\bar\theta^I} \bar w^I(y)\,dF^I = \int_{\theta^b}^{\bar\theta^I} w^I(y)\,dF^I$; hence
$$\int_{\theta^b}^{\bar\theta^I} \big[w^I(y) + \Lambda'(U(\bar a))\big]\,dF^I + K = \int_{\theta^b}^{\bar\theta^I} \big(y/\beta^I + \Lambda'(U(\bar a))\big)\,dF^I + \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy + \lambda^C \int_{\theta^b}^{\bar\theta^I} V^I(y)\,dF^I.$$
By Assumption 1 and concavity, $\bar\theta^I/\beta^I + \Lambda'(U(\bar a)) < 0$, so the first integral is negative. The sum of the remaining terms is too, contradicting (17). To see this, integrating by parts and changing variables yields
$$\lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy + \lambda^C \int_{\theta^b}^{\bar\theta^I} V^I(y)\,dF^I < 0, \qquad (18)$$
where the inequality follows from $\bar\theta^I > \theta^b > \underline\theta^I > 0$ and $0 < \beta^I < \beta^C \le 1$.

Now define $\theta^1 := \max\{\theta \mid \bar u^I(\theta) = \bar u^I(\theta^b)\} < \bar\theta^I$. For $\theta > \theta^1$, $\bar u^I(\theta) > \bar u^I(\theta^1)$ and $\Sigma(\theta) \le \Sigma(\theta^1) = \Sigma(\theta^b)$. The same steps that led to (17) yield
$$\lim_{\theta \downarrow \theta^1} \frac{\Sigma(\theta^1) - \Sigma(\theta)}{\bar u^I(\theta^1) - \bar u^I(\theta)} = \int_{\theta^1}^{\bar\theta^I} \bar w^I(y)\,dF^I + K + \Lambda'(\bar u^I(\theta^1))\,\big(1 - F^I(\theta^1)\big) \le 0.$$
As $\bar u^I(y)$ is interior and constant over $[\theta^b, \theta^1]$, $\Lambda'(\bar u^I(y)) = -\bar w^I(y) = -\bar w^I(\theta^b)$, and
$$\int_{\theta^1}^{\bar\theta^I} \big[\bar w^I(y) - \bar w^I(\theta^1)\big]\,dF^I + K = \int_{\theta^b}^{\bar\theta^I} \big[\bar w^I(y) - \bar w^I(\theta^b)\big]\,dF^I + K \le 0.$$
Finally, since $Z(F^I(\theta^b)) = \Phi(F^I(\theta^b))$, the same argument as in Lemma 7 yields $\bar w^I(\theta^b) \ge w^I(\theta^b)$. Therefore,
$$\int_{\theta^b}^{\bar\theta^I} \big[w^I(y) - w^I(\theta^b)\big]\,dF^I + K \le \int_{\theta^b}^{\bar\theta^I} \big[\bar w^I(y) - \bar w^I(\theta^b)\big]\,dF^I + K \le 0,$$
which gives, by rearranging $w^I(y)$,
$$\int_{\theta^b}^{\bar\theta^I} \big[w^I(\theta^b; \lambda^C) - \theta/\beta^I\big]\,dF^I(\theta) \ge \lambda^C \int_{\theta^b}^{\bar\theta^I} V^I(\theta)\,dF^I(\theta) - \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} \big(1 - F^C(\theta)\big)\,d\theta. \qquad (19)$$
Property (a): Since both $u^I(\lambda^C)$ and $u^I = U(x^I)$ are continuous and non-decreasing, it is enough to prove that $u^I(\underline\theta^I; \lambda^C) > u^I(\underline\theta^I)$ and $u^I(\bar\theta^I; \lambda^C) < u^I(\bar\theta^I)$.

Case 1: $u^I$ not constant. This implies that $\theta^b > \underline\theta^I$, and by Lemma 8 $u^I(\bar\theta^I; \lambda^C) = \bar u^I(\theta^b; \lambda^C) = u^I(\theta^b; \lambda^C)$. To show that $u^I(\bar\theta^I; \lambda^C) < u^I(\bar\theta^I)$ for $\lambda^C > 0$, it is enough to prove that $w^I(\theta^b; \lambda^C) < \bar\theta^I/\beta^I$. This inequality holds because $\theta^b$ and $w^I(\theta^b; \lambda^C)$ must satisfy (see the proof of Property (b))
$$\int_{\theta^b}^{\bar\theta^I} \big(w^I(\theta^b; \lambda^C) - y/\beta^I\big)\,dF^I = \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy + \lambda^C \int_{\theta^b}^{\bar\theta^I} V^I(y)\,dF^I < 0.$$
I now show that $u^I(\underline\theta^I; \lambda^C) > u^I(\underline\theta^I)$. Given $\lambda^C > 0$, let $\theta_b := \max\{\theta \mid \bar w^I(\theta; \lambda^C) = \bar w^I(\underline\theta^I; \lambda^C)\}$.

Lemma 11 $\bar w^I(\underline\theta^I; \lambda^C) \le w^I(\underline\theta^I; \lambda^C)$. Moreover, if the inequality is strict, then $\theta_b > \underline\theta^I$.

Proof. Suppress $\lambda^C$ and recall that $\bar w^I(\underline\theta^I) = \omega(0)$ and $w^I(\underline\theta^I) = z(0)$. If $\omega(0) > z(0)$, the same argument as in Lemma 13 leads to a contradiction. Suppose that $\omega(0) < z(0)$ and let $\hat q := \sup\{q \mid \forall q' < q,\; \omega(q') < z(q')\}$; $\hat q > 0$ by continuity. Then, for every $0 < q < \hat q$,
$$Z(q) = Z(0) + \int_0^q z(y)\,dy > \Phi(0) + \int_0^q \omega(y)\,dy = \Phi(q).$$
It follows that $\theta_b \ge (F^I)^{-1}(\hat q) > \underline\theta^I$.

So if $\bar w^I(\underline\theta^I; \lambda^C) = w^I(\underline\theta^I; \lambda^C)$, it equals $(\underline\theta^I/\beta^I)\big(1 + \lambda^C(1-\beta^I)\big) + \lambda^C\,(f^I(\underline\theta^I))^{-1} > \underline\theta^I/\beta^I$. If instead $\bar w^I(\underline\theta^I; \lambda^C) < w^I(\underline\theta^I; \lambda^C)$, then $\bar w^I(\cdot; \lambda^C)$ is constant over $[\underline\theta^I, \theta_b]$ at $\bar w^I(\theta_b; \lambda^C)$. Since at $\theta_b$ it must be that $Z(F^I(\theta_b); \lambda^C) = \Phi(F^I(\theta_b); \lambda^C)$, $\theta_b$ must satisfy
$$\int_{\underline\theta^I}^{\theta_b} \big[w^I(y; \lambda^C) - w^I(\theta_b; \lambda^C)\big]\,dF^I = 0 \qquad (20)$$
or equivalently,
$$\int_{\underline\theta^I}^{\theta_b} \big(y/\beta^I - w^I(\theta_b; \lambda^C)\big)\,dF^I = -\lambda^C \int_{\underline\theta^I}^{\theta_b} V^I(y)\,dF^I. \qquad (21)$$
Integrating by parts, we have
$$\int_{\underline\theta^I}^{\theta_b} V^I(y)\,dF^I = \int_{\underline\theta^I}^{\theta_b} v^I(y)\,dF^I - \int_{\underline\theta^I}^{\theta_b} v^C(y)\,dF^C = \int_{\underline\theta^I}^{\theta_b} (y/\beta^I - \theta_b)\,dF^I - \int_{\underline\theta^I}^{\theta_b} (y/\beta^C - \theta_b)\,dF^C = \int_{\theta_b/\beta^C}^{\theta_b/\beta^I} (s - \theta_b)\,dF > 0,$$
where the last equality follows from a change of variables and the inequality from $\theta_b > \underline\theta^I > 0$ and $0 < \beta^I < \beta^C \le 1$. Therefore the left hand side of (21) must be positive, and $w^I(\theta_b; \lambda^C) > \underline\theta^I/\beta^I$. In either case, $u^I(\underline\theta^I; \lambda^C)$ must be interior and strictly greater than $u^I(\underline\theta^I)$.

Case 2: $u^I$ constant. From the proof of Lemma 9, $u^I(\theta; \lambda^C)$ equals $U(a^{nf})$ for every $\theta$. Since $\underline\theta^I/\beta^I < E(s) < \bar\theta^I/\beta^I$, Assumption 1 implies $u^I(\underline\theta^I) < U(a^{nf}) < u^I(\bar\theta^I)$.

Property (c): Let $\underline\theta^I < \theta_0 \le \theta^m$ (recall $\theta^m = \min\{\bar\theta^I, \bar\theta^C\}$) and consider
$$w^I(\theta_0; \lambda^C) - w^I(\underline\theta^I; \lambda^C) = \frac{\theta_0 - \underline\theta^I}{\beta^I}\big(1 + \lambda^C(1-\beta^I)\big) - \lambda^C \Big[\frac{F^I(\theta_0)}{f^I(\theta_0)} - \frac{F^I(\underline\theta^I)}{f^I(\underline\theta^I)}\Big]. \qquad (22)$$
Although the first part of (22) is always positive since $0 < \beta^I < 1$, the second part of (22) can be negative. Therefore, $w^I(\cdot; \lambda^C)$ can be non-increasing in a neighborhood of $\underline\theta^I$. If so, then $\bar w^I(\cdot; \lambda^C)$ and $u^I(\lambda^C)$ must be constant over $[\underline\theta^I, \theta_b] \ne \emptyset$.

Now note that $\dfrac{F^I(\theta_0)}{f^I(\theta_0)} - \dfrac{F^I(\underline\theta^I)}{f^I(\underline\theta^I)} = \beta^I\Big[\dfrac{F(\theta_0/\beta^I)}{f(\theta_0/\beta^I)} - \dfrac{F(\underline\theta^I/\beta^I)}{f(\underline\theta^I/\beta^I)}\Big]$. Therefore, the condition stated in part (c) of Proposition 2 implies that, for all $\theta_0$ in $[\underline\theta^I, \min\{\beta^I s^y, \theta^m\}]$,
$$\frac{w^I(\theta_0; \lambda^C) - w^I(\underline\theta^I; \lambda^C)}{\lambda^C(\theta_0 - \underline\theta^I)} = \frac{1 + \lambda^C(1-\beta^I)}{\lambda^C \beta^I} - \beta^I\,\frac{F(\theta_0/\beta^I)/f(\theta_0/\beta^I) - F(\underline\theta^I/\beta^I)/f(\underline\theta^I/\beta^I)}{\theta_0 - \underline\theta^I} \le 0,$$
which in turn implies that $\bar w^I(\cdot; \lambda^C)$ and $u^I(\lambda^C)$ must be constant over $[\underline\theta^I, \theta_b] \ne \emptyset$.

9.6 Proof of Lemma 3
For the uniform distribution $F^i$, using expression (11), we have
$$w^I(\theta; \lambda^C) = \begin{cases} (\theta/\beta^I)\big(1 + \lambda^C(1 - 2\beta^I)\big) + \lambda^C \beta^I \bar\theta^C & \text{if } \theta \in [\underline\theta^I, \bar\theta^C) \\ (\theta/\beta^I)\big(1 + \lambda^C(\beta^I - 1)^2\big) & \text{if } \theta \in [\bar\theta^C, \bar\theta^I]. \end{cases}$$
The function $w^I$ is continuous at $\bar\theta^C$. It is increasing and greater than $\theta/\beta^I$ over $[\bar\theta^C, \bar\theta^I]$, because $\lambda^C > 0$ and $\beta^I < 1$; $w^I$ is increasing over $[\underline\theta^I, \bar\theta^C)$ if and only if $\beta^I \le 1/2$ or $\lambda^C < (2\beta^I - 1)^{-1} =: \bar\lambda^C$.

Consider first the behavior of $\theta^b$ and $\theta_b$, when $\theta^b > \theta_b$ and $x^I$ is not constant. If $\beta^I \le 1/2$ or $\lambda^C < \bar\lambda^C$, then $w^I$ is increasing and coincides with $\bar w^I$ (see the proof of Proposition 2); hence $\bar u^I$ (see (13)) is increasing over $[\underline\theta^I, \bar\theta^C]$, which implies $\theta_b = \underline\theta^I$. Otherwise, $\theta_b > \underline\theta^I$ and $\theta_b$ is characterized by (20), which for the uniform distribution boils down to an explicit equation, (23), relating $(\theta_b)^2$, $(\underline\theta^I)^2$, and $\lambda^C$.

Since $w^I$ is always increasing over $[\theta_b, \bar\theta^I]$, $\bar w^I$ coincides with $w^I$ there. Using (19), $\theta^b$ must satisfy
$$\int_{\theta^b}^{\bar\theta^I} \big[w^I(y; \lambda^C) - w^I(\theta^b; \lambda^C)\big]\,dF^I = -\lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy. \qquad (24)$$
Since
$$\frac{\partial}{\partial \theta^b} \int_{\theta^b}^{\bar\theta^I} \big[w^I(y; \lambda^C) - w^I(\theta^b; \lambda^C)\big]\,dF^I = -\frac{\partial w^I(\theta^b; \lambda^C)}{\partial \theta^b}\,\big(F^I(\bar\theta^I) - F^I(\theta^b)\big) < 0$$
for $\lambda^C > 0$, there is a unique $\theta^b > \theta_b$ that satisfies (24). Letting $K = \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy < 0$, condition (24) becomes an explicit piecewise equation in $\theta^b$, with one branch for $\theta^b \ge \bar\theta^C$ and one for $\theta^b < \bar\theta^C$.

If $\beta^I > 1/2$, the function $\theta_b(\lambda^C)$ is constant at $\underline\theta^I$ for $\lambda^C < \bar\lambda^C$, and at $\bar\lambda^C$ it jumps from $\underline\theta^I$ to $\bar\theta^C$. Monotonicity for $\lambda^C > \bar\lambda^C$ follows by applying the Implicit Function Theorem to (23): $d\theta_b/d\lambda^C > 0$. Similarly, for $\theta^b(\lambda^C)$ we have $d\theta^b/d\lambda^C < 0$ in both branches; for the second branch, recall that $\theta_b < \theta^b < \bar\theta^C$ if and only if $\beta^I \le 1/2$ or $\lambda^C < \bar\lambda^C$.

Consider now the behavior of $u^I(\lambda^C) = U(x^I(\lambda^C))$, which matches that of $x^I(\lambda^C)$. By Proposition 2 and Assumption 1, $u^I(\theta; \lambda^C) \in (U(\underline a), U(\bar a))$. Also, $\bar u^I(\theta; \lambda^C) = \arg\max_{y \in [U(\underline a), U(\bar a)]} \{y\,\bar w^I(\theta; \lambda^C) + \Lambda(y)\}$. By strict concavity of $y\,w + \Lambda(y)$, it is enough to consider the properties of $\bar w^I(\cdot; \lambda^C)$ in relation with the function $\theta/\beta^I$. The function $\bar w^I(\cdot; \lambda^C)$ crosses $\theta/\beta^I$ only once, at some $\underline\theta^I < \theta < \bar\theta^I$. Furthermore, for $\theta \in [\theta_b, \theta^b]$, $\bar w^I(\theta; \lambda^C) = w^I(\theta; \lambda^C)$. Hence, it is enough to show that, as $\lambda^C$ rises, $w^I(\theta^b(\lambda^C); \lambda^C)$ falls and $w^I(\theta_b(\lambda^C); \lambda^C)$ rises.

Lemma 12 Suppose $\theta^b$ and $\theta_b$ are characterized by (19) and (20). If $w^I(\theta^b; \lambda^C) > 0$ and $w^I(\theta_b; \lambda^C) > 0$, then $\frac{d}{d\lambda^C} w^I(\theta^b(\lambda^C); \lambda^C) < 0$ and $\frac{d}{d\lambda^C} w^I(\theta_b(\lambda^C); \lambda^C) > 0$.

Proof. It follows by applying the Implicit Function Theorem to (19) and (20).

Consider $w^I(\theta_b(\lambda^C); \lambda^C)$. If $\beta^I \le 1/2$ or $\lambda^C < \bar\lambda^C$, then $\theta_b(\lambda^C) = \underline\theta^I$ and, by the expression above, $w^I(\underline\theta^I; \lambda^C) > 0$. Consider now the case with $\beta^I > 1/2$. As $\lambda^C \uparrow \bar\lambda^C$, $w^I(\underline\theta^I; \lambda^C) \uparrow w^I(\underline\theta^I; \bar\lambda^C) = w^I(\bar\theta^C; \bar\lambda^C)$. By Lemma 12, $w^I(\theta_b(\lambda^C); \lambda^C)$ increases in $\lambda^C$ for $\lambda^C > \bar\lambda^C$, because here $w^I(\theta_b(\lambda^C); \lambda^C)$ is positive when $\theta_b > \bar\theta^C$. Hence, $\lambda^C \downarrow \bar\lambda^C$ implies $w^I(\theta_b(\lambda^C); \lambda^C) \downarrow w^I(\bar\theta^C; \bar\lambda^C)$. By the same argument, $w^I(\theta^b(\lambda^C); \lambda^C)$ decreases in $\lambda^C$, again because here $w^I(\theta^b(\lambda^C); \lambda^C)$ is positive when $\theta_b < \theta^b$.
9.7 Proof of Lemma 4

Using (4) and splitting the integral at the state $s^*$ such that $s^*\beta^I = \underline s\,\beta^C$, we have
$$R^I(x^C) = -\int_{\underline s}^{s^*} \Phi(s|x^C)\,dF - \int_{s^*}^{\bar s} \Phi(s|x^C)\,dF.$$
For every $s \le s^*$, since $x^C(s\beta^I) = x^C(\underline s\,\beta^C)$,
$$\Phi(s|x^C) = s(1-\beta^C)\,U(x^C(s\beta^C)) - s(1-\beta^I)\,U(x^C(\underline s\,\beta^C)) + \int_{s\beta^I}^{s\beta^C} U(x^C(y))\,dy,$$
and because $x^C$ is continuous, $\Phi(s|x^C) \to 0$ as $s \to \underline s$. Now consider $R^C(x^I)$. Recall that, since $\underline s\,\beta^I < \underline s\,\beta^C$, $\Phi(s|x^I) > 0$ for $s < \bar s$. For convenience, fix $s_0 := \frac{1}{2}(\underline s + \bar s)$. By continuity, $\min_{[\underline s, s_0]} \Phi(s|x^I) = \delta > 0$. Choose $s_{\delta/2} > \underline s$ so that $\Phi(s|x^C) \le \delta/2$ for $s \in [\underline s, s_{\delta/2}]$. Finally, let $s_1 := \min\{s_0, s_{\delta/2}\}$. We have:
$$R^C(x^I) \ge \int_{\underline s}^{s_1} \Phi(s|x^I)\,dF \ge \delta\,F(s_1), \qquad \big|R^I(x^C)\big| \le \sup_{s > s_1} \Phi(s|x^C)\,\big(1 - F(s_1)\big) + \frac{\delta}{2}\,F(s_1).$$
Therefore, $R^C(x^I) > \big|R^I(x^C)\big|$ if $F(s_1)\big/\big(1 - F(s_1)\big) > \dfrac{2}{\delta}\,\sup_{s > s_1} \Phi(s|x^C)$.

9.8 Proof of Proposition 3
Using EU, we have
R
RI (xC ) =
I
1
U (xC ( ))
I
R
C
1
U (xC ( ))
C
R
U (xC (y))dy dF I
R
U (xC (y))dy dF C .
Changing the order of integration and rearranging, we can express RI (xC ) as
RI (xC ) =
where the function g I : [ I ;
R
C
C
I
U (xC ( ))g I ( ) d +
] ! R is given by
I
g I ( ) :=
and the function V C : [
C
;
C
C
V C ( ) :=
C
C
U (xC ( ))V C ( ) dF C ,
(25)
fI ( ) + FI ( ) ,
] ! R is given by
1
C
1
I
R
1
FC ( )
fC ( )
fI ( )
fC ( )
I
1
I
1
FI ( )
.
fI ( )
Then, minimizing RI (xC ) over M while ensuring that xC C = xC C ( j C means restrictR I
ing xC to C ) corresponds to maximizing C U (x( ))g I ( ) d by choosing a non-decreasing
function x : [ I ; C ) ! [a; xC ( C )]. Although g I ( I ) < 0 and by continuity also for close to
I
, nothing ensures that g I ( ) is always positive, or at least non-decreasing. I therefore apply
Toikka’s (2011).
43
))],
Let Fe be the uniform distribution on [ I ; C ] and Fe
increasing inverse function. For q 2 [0; 1], de…ne
z (q) := g I (Fe
1
: [0; 1] ! [ I ;
1
(q)) and Z (q) :=
Rq
0
C
] its continuous and
z (y) dy.
Let
:= convZ be the convex hull of Z on [0; 1] (see, e.g., Rockafellar (1970), p.36), i.e.,
the highest convex function on [0; 1] that satis…es
Z. By de…nition,
is continuously
di¤erentiable except at possibly countably many points, so de…ne ! : [0; 1] ! R as
! (q) :=
0
(q) ,
whenever 0 (q) exists. Also, w.l.o.g. extend ! (q) by right-continuity to all [0; 1) and by leftcontinuity at 1. For 2 [ I ; C ], let g I ( ) := !(Fe ( )); so g I is non-decreasing.
I
I
Let m := minf ; C g > I and q m := Fe( m ) > 0. Since g I is continuous over [ I ; ], by
the argument in Lemma 7 ! is continuous over [0; q m ], hence g I is continuous over [ I ; m ].
Lemma 13 g I ( I )
g I ( I ).
Proof. If the opposite holds, then ! (0) > z (0). Since z is continuous over [0; q m ] and ! is
non-decreasing, there exists q > 0 such that ! (y) > z (y) for every y q. Since Z (0) = (0),
it follows that
Rq
Rq
Z (q) = 0 z (y) dy Z (0) < 0 ! (y) dy
(0) = (q) .
A contradiction.
It follows that $\theta^d := \sup\{\theta \in [\underline\theta^I, \underline\theta^C] \mid \bar g^I(\theta) < 0\} > \underline\theta^I$. Similarly, define $\theta^u := \inf\{\theta \in [\underline\theta^I, \underline\theta^C] \mid \bar g^I(\theta) > 0\}$, if the set is non-empty, otherwise $\theta^u := \underline\theta^C$. By Theorem 3.7 of Toikka (2011), any best extension $\bar x^C$ of $x^C$ must satisfy $\bar x^C(\theta) = a$ for $\theta \in (\underline\theta^I, \theta^d)$ and $\bar x^C(\theta) = x^C(\underline\theta^C)$ for $\theta \in (\theta^u, \underline\theta^C)$, if any. Letting $\bar x^C(\theta^u) = x^C(\underline\theta^C)$ is w.l.o.g. Finally, over $[\theta^d, \theta^u)$, $\bar x^C$ can be any non-decreasing function mapping to $[a, x^C(\underline\theta^C)]$, so long as it satisfies the necessary pooling property described by Toikka (see Definition 3.5). By Corollary 3.8 of Toikka (2011), it is then w.l.o.g. to set $\bar x^C(\theta) = x^C(\underline\theta^C)$ for $\theta \in [\theta^d, \theta^u)$.
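The construction above — reparametrize by quantiles, integrate, take the convex hull, and differentiate — is mechanical, and can be illustrated numerically. The sketch below is only an illustration of Toikka-style ironing on a discretized grid; the sample function `g` is a hypothetical stand-in for $g^I$, not an object from the model.

```python
import numpy as np

def iron(g_vals, q):
    """Toikka-style ironing on a grid: return the derivative of the convex
    hull of Z(q) = integral_0^q z(y) dy, a non-decreasing version of g_vals."""
    # Running integral Z by the trapezoid rule (Z[0] = 0).
    Z = np.concatenate(([0.0], np.cumsum((g_vals[1:] + g_vals[:-1]) / 2 * np.diff(q))))
    # Lower convex hull of the points (q, Z): keep indices where the slope
    # sequence is strictly increasing (monotone-chain scan).
    hull = [0]
    for i in range(1, len(q)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            # Drop i1 if it lies on or above the chord from i0 to i.
            if (Z[i1] - Z[i0]) * (q[i] - q[i1]) >= (Z[i] - Z[i1]) * (q[i1] - q[i0]):
                hull.pop()
            else:
                break
        hull.append(i)
    # omega = hull slope, a right-continuous step function in q.
    omega = np.empty_like(g_vals)
    for a, b in zip(hull[:-1], hull[1:]):
        omega[a:b] = (Z[b] - Z[a]) / (q[b] - q[a])
    omega[-1] = omega[-2]
    return omega

q = np.linspace(0.0, 1.0, 2001)
g = np.sin(6 * q) - 0.5              # hypothetical non-monotone virtual value
omega = iron(g, q)
assert np.all(np.diff(omega) >= -1e-9)  # ironed version is non-decreasing
```

The hull scan keeps exactly the quantiles at which the convexified integral touches the original one, so the resulting step function is non-decreasing by construction, mirroring the monotonicity claimed for $\bar g^I$ in the text.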
9.9 Proof of Corollary 2
Fix $x^I_0 = x^I$ and recall the expression for $R^C(x^I)$ in (9). Observe that $g^C(\theta) < 0$ for $\theta \in (\bar\theta^I, \bar\theta^C)$. Therefore, the rule $x^I \in \mathcal M$ that coincides with $x^I_0$ on $[\underline\theta^I, \bar\theta^I]$ and minimizes $R^C(x^I)$ must satisfy $x^I(\theta) = x^I_0(\bar\theta^I)$ for $\theta \in (\bar\theta^I, \bar\theta^C)$. So, let $\hat x^I(\theta) = x^I_0(\bar\theta^I)$ for $\theta \in (\bar\theta^I, \bar\theta^C]$, and $\hat x^I(\theta) = x^I_0(\theta)$ otherwise.
Using (25) and the best extension $x^C_u$ of $x^C$ in Lemma 3, condition (RR) becomes

$R^C(\hat x^I) + \int_{\underline\theta^C}^{\bar\theta^C} U(x^C(\theta))\,V^C(\theta)\,dF^C + U(x^C(\underline\theta^C))\int_{\underline\theta^I}^{\underline\theta^C}\bar g^I(\theta)\,d\theta \ge \big[U(x^C(\underline\theta^C)) - U(a)\big]\int_{\underline\theta^I}^{\theta^d}\bar g^I(\theta)\,d\theta$.

By assumption, the left-hand side is positive; by construction, $R^C(\hat x^I)$ has been minimized, and $\int_{\underline\theta^I}^{\theta^u}\bar g^I(\theta)\,d\theta < 0$. The result follows.
9.10 Proof of Proposition 4
Preliminary Observations.
First, I show that the screening C-menu sustains the efficient outcome with C if and only if the constraint (RR) does not bind. Recall that $\lambda^I = \mu/\alpha$, where $\mu$ is the Lagrangian multiplier associated to (RR).
Lemma 14 There exists $x \in \mathcal M$ with $x(\theta) = x^C(\theta)$ for every $\theta$ that maximizes $W^C(x) - \lambda^I R^I(x)$ if and only if $\lambda^I = 0$.
Proof. Once more, I will work with the functions $u \in \mathcal U$. Suppose $\lambda^I > 0$. Using $\tilde R^I(u) = R^I(U^{-1}(u))$ in (25), rewrite $\tilde W^C(u) - \lambda^I\tilde R^I(u)$ as

$VS^C(U^{-1}(u); \lambda^I) = \int_{\underline\theta^C}^{\bar\theta^C}\big[u(\theta)\,w^C(\theta;\lambda^I) + \Phi(u(\theta))\big]\,dF^C + \lambda^I\int_{\underline\theta^I}^{\underline\theta^C} u(\theta)\,\bar g^I(\theta)\,d\theta$,

where $w^C(\theta;\lambda^I) = \theta/\beta^C - \lambda^I V^C(\theta)$. Let $u^C_u = U(x^C_u)$, where $x^C_u$ is the best enlargement of $x^C$ as in Lemma 3, and let $\mathcal U^C$ be the set of $u \in \mathcal U$ that coincide with $u^C_u$ on $[\underline\theta^C, \bar\theta^C]$. By construction, $VS^C(U^{-1}(u^C_u);\lambda^I) = \max_{u\in\mathcal U^C} VS^C(U^{-1}(u);\lambda^I)$. I claim that there exists $\hat u^C \in \mathcal U\setminus\mathcal U^C$ such that $VS^C(U^{-1}(\hat u^C);\lambda^I) > VS^C(U^{-1}(u^C_u);\lambda^I)$. Focus on $[\theta^m, \bar\theta^C]$, with $\theta^m := \max\{\bar\theta^I, \underline\theta^C\}$, and let $\hat w^C$ be the generalized version of $w^C$ over this interval, obtained with the same techniques as in the proof of Proposition 2. Since $w^C$ is continuous on $[\theta^m, \bar\theta^C]$, so is $\hat w^C$ (Lemma 7). Because $\lambda^I > 0$, $V^C(\theta) < 0$ implies $w^C(\theta;\lambda^I) > \theta/\beta^C$ for $\theta \in [\theta^m, \bar\theta^C)$. I claim that $\hat w^C(\theta^m;\lambda^I) > \theta^m/\beta^C$. By the same argument as in Lemma 11, $\hat w^C(\theta^m;\lambda^I) \le w^C(\theta^m;\lambda^I)$ and, if the inequality is strict, then there exists $\theta_0 > \theta^m$ such that $\hat w^C(\theta;\lambda^I) = w^C(\theta_0;\lambda^I)$ for $\theta \in [\theta^m, \theta_0]$. If $\hat w^C(\theta^m;\lambda^I) = w^C(\theta^m;\lambda^I)$, then the claim follows. If $\hat w^C(\theta^m;\lambda^I) = w^C(\theta_0;\lambda^I)$, then $\hat w^C(\theta^m;\lambda^I) > \theta_0/\beta^C > \theta^m/\beta^C$. Finally, because $\hat w^C$ is continuous and non-decreasing, in either case there exists $\theta_1 > \theta^m$ such that $\hat w^C(\theta;\lambda^I) > \theta/\beta^C$ for $\theta \in [\theta^m, \theta_1]$. Construct $\hat u^C$ by letting $\hat u^C(\theta) = \arg\max_{y\in[U(a),U(\bar a)]}\{y\,\hat w^C(\theta;\lambda^I) + \Phi(y)\}$ if $\theta \in [\theta^m, \theta_1]$, and $\hat u^C(\theta) = u^C_u(\theta)$ if $\theta \in [\underline\theta^I, \theta^m)$. Then $\hat u^C \in \mathcal U$ and $\hat u^C(\theta) \ge u^C_u(\theta)$ for $\theta \in [\theta^m, \bar\theta^C]$, with $\hat u^C(\theta) > u^C_u(\theta)$ for every $\theta \in [\theta^m, \theta_1]$; so $\hat u^C \notin \mathcal U^C$. Finally,

$VS^C(U^{-1}(\hat u^C);\lambda^I) - VS^C(U^{-1}(u^C_u);\lambda^I) = \int_{\theta^m}^{\bar\theta^C}\big[\hat u^C(\theta)\,\hat w^C(\theta;\lambda^I) + \Phi(\hat u^C(\theta))\big]\,dF^C - \int_{\theta^m}^{\bar\theta^C}\big[u^C_u(\theta)\,\hat w^C(\theta;\lambda^I) + \Phi(u^C_u(\theta))\big]\,dF^C > 0$.

It follows that the solution to $\max_{x\in\mathcal M}\{W^C(x) - \lambda^I R^I(x)\}$ differs from $x^C(\theta)$ at least for $\theta \in [\theta^m, \theta_1)$.
Recall that $x^I(\lambda^C)$ maximizes $W^I(x^I) - \lambda^C R^C(x^I)$ over the set $\mathcal M$. By revealed optimality, $\hat\lambda^C > \lambda^C$ implies that $R^C(x^I(\hat\lambda^C)) \le R^C(x^I(\lambda^C))$. Uniqueness of $x^I(\lambda^C)$ (Proposition 2) implies that $R^C(x^I(\hat\lambda^C)) = R^C(x^I(\lambda^C))$ only if $\hat\lambda^C = \lambda^C$. It follows that, for every $\lambda^C \ge 0$, $R^C(x^I(\lambda^C))$ is a decreasing function of $\lambda^C$, $R^C(x^I(\lambda^C)) \le R^C(x^I)$, and $\lim_{\lambda^C\to+\infty} R^C(x^I(\lambda^C)) = 0$ by Propositions 1 and 2.
Part (a): By Proposition 1, $R^I(x^C) < 0$. Define $\lambda^C_2 := \min\{\lambda^C \ge 0 \mid R^C(x^I(\lambda^C)) + R^I(x^C) \le 0\}$. The result follows from Lemma 14. Lemma 4 and continuity of $R^C(x^I(\lambda^C))$ imply that $\lambda^C_2 > 0$ in some environments.
Part (b): Let $x^C_u$ be as in Proposition 3, which satisfies $R^I(x^C_u) < R^I(x^C) < 0$. Define $\lambda^C_1 := \min\{\lambda^C \ge 0 \mid R^C(x^I(\lambda^C)) + R^I(x^C_u) \le 0\}$. Again, the result follows from Lemma 14. Lemma 4 and continuity of $R^C(x^I(\lambda^C))$ imply that $\lambda^C_1 < \lambda^C_2$ in some environments.
Part (c): Lemma 4, Corollary 2, and continuity of $R^C(x^I(\lambda^C))$ imply that $\lambda^C_1 > 0$ in some environments. Again, the result follows from Lemma 14.
9.11 Proof of Proposition 5
Preliminary Observations: Recall that, given $x^j$, there is a one-to-one relationship between $U^j(x^j; k^j)$ and $k^j$. Therefore, define $h^j := U^j(x^j; k^j)$, and write

(R-IC$^{i,j}$): $h^i \ge h^j + R^i(x^j)$, and (IR$^i$): $h^i \ge 0$.

Moreover, as in the proofs of Lemma 2 and Proposition 2, it is convenient to work with the functions $u \in \mathcal U$. Recall that $\tilde W^i(u^i) := W^i(U^{-1}(u^i))$ and $\tilde R^i(u^j) := R^i(U^{-1}(u^j))$.
Step 1: There exists $U(a)$ small enough so that unused items suffice to satisfy (R-IC$^{i,j}$) for $i > j$.

Recall that (R-IC$^{i,j}$) is $h^i \ge h^j + \tilde R^i(u^j)$. If $i < j$, then $\underline\theta^i > \underline\theta^j$ and

$\tilde R^i(u^j) = \int_{\underline\theta^j}^{\underline\theta^i} u^j(\theta)\,G^i(\theta)\,d\theta + \int_{\underline\theta^i}^{\bar\theta^j} u^j(\theta)\,V^{j,i}(\theta)\,dF^j$,

where

$G^i(\theta) = \Big(\frac{1}{\beta^i}-1\Big)\theta f^i(\theta) - (1 - F^i(\theta))$ and $V^{j,i}(\theta) = v^j(\theta) - \frac{f^i(\theta)}{f^j(\theta)}\,v^i(\theta)$;

and if $i > j$,

$\tilde R^i(u^j) = \int_{\underline\theta^i}^{\underline\theta^j} u^j(\theta)\,G^i(\theta)\,d\theta + \int_{\underline\theta^j}^{\bar\theta^j} u^j(\theta)\,V^{j,i}(\theta)\,dF^j$,

where

$G^i(\theta) = \Big(1-\frac{1}{\beta^i}\Big)\theta f^i(\theta) + F^i(\theta)$ and $V^{j,i}(\theta) = \Big(1-\frac{1}{\beta^j}\Big)\theta + \frac{F^j(\theta)}{f^j(\theta)} - \frac{f^i(\theta)}{f^j(\theta)}\Big[\Big(1-\frac{1}{\beta^i}\Big)\theta + \frac{F^i(\theta)}{f^i(\theta)}\Big]$.
Take $j > i$. Suppose that (R-IC$^{j,i}$) fails (while all the other constraints are satisfied), i.e., $h^j < h^i + \tilde R^j(u^i)$. Fix $u^i(\theta)$ for every $\theta \ge \underline\theta^i$, and let $u^i(\theta) = U(a)$ for every $\theta < \underline\theta^i$. Then,

$\tilde R^j(u^i) = U(a)\int_{\underline\theta^j}^{\underline\theta^i} G^j(\theta)\,d\theta + \int_{\underline\theta^i}^{\bar\theta^i} u^i(\theta)\,V^{i,j}(\theta)\,dF^i$.
Lemma 15 $\int_{\underline\theta^j}^{\underline\theta^i} G^j(\theta)\,d\theta < 0$.

Proof. Rewrite the integral as

$\int_{\underline\theta^j}^{\underline\theta^i} (\beta^j - 1)\frac{\theta}{\beta^j}\,f^j(\theta)\,d\theta + \int_{\underline\theta^j}^{\underline\theta^i} F^j(\theta)\,d\theta$.

Integrate the second term by parts, to get

$\int_{\underline\theta^j}^{\underline\theta^i} (\beta^j - 1)\frac{\theta}{\beta^j}\,f^j(\theta)\,d\theta + F^j(\underline\theta^i)\,\underline\theta^i - \int_{\underline\theta^j}^{\underline\theta^i} \theta f^j(\theta)\,d\theta = \int_{\underline\theta^j}^{\underline\theta^i}\Big(\underline\theta^i - \frac{\theta}{\beta^j}\Big)\,f^j(\theta)\,d\theta$.

Finally, note that $\underline\theta^i \le \theta/\beta^j$, with strict inequality for $\theta \in (\underline\theta^j, \underline\theta^i)$.
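The integration by parts in Lemma 15 can be sanity-checked numerically. The distribution and parameters below are hypothetical (uniform $F^j$ on $[1,3]$, $\beta^j = 0.5$, $\underline\theta^i = 2$), and $G^j$ is taken in the form reconstructed here, $(1 - 1/\beta^j)\theta f^j(\theta) + F^j(\theta)$:

```python
import numpy as np

# Hypothetical primitives: uniform F^j on [1, 3], beta^j = 0.5, and
# underline-theta^i = 2 (= underline-theta^j / beta^j).
beta_j, lo_j, hi_j, lo_i = 0.5, 1.0, 3.0, 2.0
f = 1.0 / (hi_j - lo_j)                     # uniform density f^j
t = np.linspace(lo_j, lo_i, 100001)
F = (t - lo_j) / (hi_j - lo_j)              # uniform cdf F^j on the grid

trap = lambda y: np.sum((y[1:] + y[:-1]) / 2 * np.diff(t))

lhs = trap((1 - 1 / beta_j) * t * f + F)    # integral of G^j before the rewrite
rhs = trap((lo_i - t / beta_j) * f)         # integral after integrating by parts

assert abs(lhs - rhs) < 1e-9   # the rewrite is an identity here
assert lhs < 0                 # Lemma 15: the integral is strictly negative
```

Both expressions evaluate to $-0.5$ in this example, and the sign agrees with the lemma.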
It follows that there exists $U(a)$ small enough so that the $\tilde u^i$ so constructed satisfies $h^j \ge h^i + \tilde R^j(\tilde u^i)$. Now, we need to check the other constraints. For all $j' < i$, the values that $u^i$ takes for $\theta < \underline\theta^i$ are irrelevant; so, all (R-IC$^{j',i}$) remain unchanged. For $\hat\jmath > i$ and $\hat\jmath \neq j$, it could be that $\tilde R^{\hat\jmath}(\tilde u^i) > \tilde R^{\hat\jmath}(u^i)$, and $\tilde u^i$ may violate (R-IC$^{\hat\jmath,i}$) while $u^i$ did not. However, since Lemma 15 holds for every $j > i$ and $N$ is finite, clearly we can find $U(a)$ small enough so that no $j > i$ wants to mimic $i$. Thus, for the rest of the proof I will assume that (R-IC$^{j,i}$) never binds for $j > i$.
Step 2: Reformulation of the principal's problem.

As usual, (IR$^N$) and (R-IC$^{i,N}$) imply (IR$^i$) for all $i < N$. Now, let $Y := (\mathcal U \times \mathbb R)^N$ be the subspace of $(X \times \mathbb R)^N$, where $X = \{u \mid u : \Theta \to \mathbb R\}$. Let $\tilde e(\mathbf U, \mathbf h) = \sum_{i=1}^N \pi_i[\tilde W^i(u^i) - h^i]$ and $\tilde W(\mathbf U) = \sum_{i=1}^N \pi_i \tilde W^i(u^i)$. Then, the problem $P^N$ is equivalent to

$\tilde P^N := \max_{\{\mathbf U, \mathbf h\}\in Y}\ (1-\alpha)\tilde W(\mathbf U) + \alpha\,\tilde e(\mathbf U, \mathbf h)$ s.t. $\Psi(\mathbf U, \mathbf h) \le 0$,

where the functional $\Psi : (X \times \mathbb R)^N \to \mathbb R^r$, $r = 1 + \frac{N(N-1)}{2}$, is given by $\Psi_1(\mathbf U, \mathbf h) = -h^N$ and, for $n = 2, \ldots, r$, $\Psi_n(\mathbf U, \mathbf h) = \tilde R^i(u^j) + h^j - h^i$ for some $i \in N$, $j > i$.

Step 3: Existence of interior points.

Lemma 16 In $\tilde P^N$, there exists $\{\mathbf U, \mathbf h\} \in Y$ such that $\Psi(\mathbf U, \mathbf h) < 0$.
Proof. The condition $\Psi(\mathbf U, \mathbf h) < 0$ is equivalent to $h^N > 0$ and $h^i > h^j + \tilde R^i(u^j)$ for every $i \in N$, $j > i$. For every $i \in N$, let $u^i = \bar u^i = U(x^i)$ over $[\underline\theta^i, \bar\theta^i]$, and possibly extend it over $[\underline\theta^N, \underline\theta^i)$ to include appropriate unused items. Note that these extensions are irrelevant for $\tilde R^j(u^i)$ if $j < i$. Recall that $\tilde R^j(u^i) \le 0$ for all $j < i$, and it can be easily shown that $\tilde R^1(u^i) \le \tilde R^j(u^i)$ for all $1 < j < i$. Thus, let $h^N = 1$ and, for every $i < N$, let $h^i = h^{i+1} + \tilde R^1(u^{i+1}) + 1$. Now, fix $i < N$ and consider any $j > i$. We have

$h^i = h^j + \sum_{n=1}^{j-i}\tilde R^1(u^{i+n}) + (j-i) \ge h^j + \tilde R^i(u^j) + (j-i) > h^j + \tilde R^i(u^j)$.

Note that, since the $\tilde R^i(u^j)$ are bounded and $N$ is finite, the constructed vector $\mathbf h$ is well defined.
Step 4: Lagrangian characterization of the solution to $\tilde P^N$.

To characterize the solutions to $\tilde P^N$, I use Corollary 1, p. 219, and Theorem 2, p. 221, of Luenberger (1969). Note that $(X \times \mathbb R)^N$ is a linear vector space, and $\mathbb R^r$ is a normed space with the usual Euclidean norm, whose positive closed cone contains an interior point. $Y$ is a convex subset of $(X \times \mathbb R)^N$. By Lemma 16, $\Psi$ admits interior points. Finally, $\tilde e$ and $\tilde W$ are concave by the assumptions on $U(\cdot)$ and $c(\cdot)$. Hence, the objective is concave and $\Psi(\mathbf U, \mathbf h)$ is convex.

Now, for $\boldsymbol\lambda \in \mathbb R^r_+$, define the Lagrangian

$L(\mathbf U, \mathbf h; \boldsymbol\lambda) = (1-\alpha)\tilde W(\mathbf U) + \alpha\,\tilde e(\mathbf U, \mathbf h) + \lambda^N h^N - \sum_{i=1}^N\sum_{j<i}\lambda^{j,i}\big[\tilde R^j(u^i) + h^i - h^j\big]$
$\qquad = \sum_{i=1}^N \pi_i\Big[\tilde W^i(u^i) - \sum_{j<i}\lambda^{j,i}\tilde R^j(u^i)\Big] + \sum_{i=1}^N h^i\,\eta^i(\alpha, \boldsymbol\pi, \boldsymbol\lambda)$,
where

$\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = \begin{cases} -\alpha\pi_i + \sum_{j>i}\lambda^{i,j} - \sum_{j<i}\lambda^{j,i} & \text{if } i < N, \\ -\alpha\pi_N + \lambda^N - \sum_{j<N}\lambda^{j,N} & \text{if } i = N. \end{cases}$

Then $\{\mathbf U, \mathbf h\}$ solves $\tilde P^N$ if and only if there exists $\boldsymbol\lambda \ge 0$ such that

$L(\mathbf U, \mathbf h; \boldsymbol\lambda') \ge L(\mathbf U, \mathbf h; \boldsymbol\lambda) \ge L(\mathbf U', \mathbf h'; \boldsymbol\lambda)$

for all $\{\mathbf U', \mathbf h'\} \in Y$ and $\boldsymbol\lambda' \ge 0$. The second inequality is equivalent to

$u^i \in \arg\max_{u\in\mathcal U}\ \tilde W^i(u) - \sum_{j<i}\lambda^{j,i}\tilde R^j(u)$ (26)

and

$h^i \in \arg\max_{h\in\mathbb R}\ \eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda)\,h$. (27)

The first inequality is equivalent to the complementary slackness conditions

$h^N \ge 0$ and $\lambda^N h^N = 0$, (28)

and, for every $i \in N$, $j > i$,

$\tilde R^i(u^j) + h^j - h^i \le 0$ and $\lambda^{i,j}\big[\tilde R^i(u^j) + h^j - h^i\big] = 0$. (29)

Lemma 17 If $(\mathbf U, \mathbf h, \boldsymbol\lambda)$ satisfies the conditions (26)-(29), then $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$ for all $i \in N$.
Proof. Recall that $h^i \ge 0$ for all $i \in N$, by combining (IR$^N$) and (R-IC$^{i,N}$). Therefore, for (27) to admit a solution, $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) \le 0$ for all $i \in N$. On the other hand, $(1-\alpha)\tilde W(\mathbf U) + \alpha\,\tilde e(\mathbf U, \mathbf h)$ is bounded below by $U(a^{nf})\,\mathbb E(s) - a^{nf} - c(a^{nf}) > 0$, hence $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) \ge 0$.
This lemma has two immediate implications about the binding constraints.

Corollary 4 If $\alpha = 0$, then $\boldsymbol\lambda = 0$. If $\alpha > 0$, (IR$^N$) binds and, for every $i < N$, there is $j > i$ such that (R-IC$^{i,j}$) binds.
Proof. The second part follows from the last lemma. For the first part, note that, since $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$ for all $i = 1,\ldots,N$,

$0 = \sum_{i=1}^N \eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = \sum_{i=1}^{N-1}\Big[\sum_{j>i}\lambda^{i,j} - \sum_{j<i}\lambda^{j,i}\Big] + \lambda^N - \sum_{j<N}\lambda^{j,N} - \alpha = \lambda^N - \alpha$,

where the last equality follows from rearranging $\sum_{i=1}^{N-1}\sum_{j>i}\lambda^{i,j}$. So, if $\alpha = 0 = \lambda^N$, then $\eta^N(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$ implies $\sum_{j<N}\lambda^{j,N} = 0$. Hence, $\lambda^{j,N} = 0$ for all $j < N$. Suppose that, for all $j \ge i+1$, $\lambda^{n,j} = 0$ for all $n < j$. Then, by $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$, we have $\sum_{j<i}\lambda^{j,i} = \sum_{j>i}\lambda^{i,j} = 0$. Hence, $\lambda^{j,i} = 0$ for all $j < i$.
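The rearrangement used in the first part of the proof — the cross multipliers cancel pairwise, so the $\eta^i$ sum telescopes to $\lambda^N - \alpha$ — can be verified mechanically. This sketch assumes $\eta^i = -\alpha\pi_i + \sum_{j>i}\lambda^{i,j} - \sum_{j<i}\lambda^{j,i}$, with $\lambda^N$ added for $i = N$; all numbers are hypothetical:

```python
import numpy as np

# Check the telescoping identity sum_i eta^i = lam^N - alpha for the
# assumed form of eta^i; the probabilities and multipliers are random.
rng = np.random.default_rng(0)
N, alpha = 5, 0.7
pi = rng.random(N); pi /= pi.sum()         # type probabilities, sum to 1
lam = np.triu(rng.random((N, N)), k=1)     # lam[i, j] = lambda^{i,j}, j > i
lam_N = rng.random()                       # multiplier on (IR^N)

eta = -alpha * pi + lam.sum(axis=1) - lam.sum(axis=0)
eta[-1] += lam_N

# Each lambda^{i,j} enters eta^i with + and eta^j with -, so the cross
# terms cancel and only lam^N - alpha survives.
assert abs(eta.sum() - (lam_N - alpha)) < 1e-12
```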
Therefore, although $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$ makes any $h^i \in \mathbb R$ solve (27), we can use the upward binding constraints to pin down $\mathbf h$ once $\mathbf U$ has been chosen.
Thus, a solution to $\tilde P^N$ exists if we can find $(\mathbf U, \boldsymbol\lambda)$ so that, for every $i = 1,\ldots,N$, $u^i$ solves (26), $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$, and (28) and (29) hold. By replicating the arguments in the proof of Proposition 2 (see Step 5 below), we have that, for every $\boldsymbol\lambda \ge 0$, a solution $u^i$ to (26) always exists, is unique over $(\underline\theta^i, \bar\theta^i)$, and is pointwise continuous in $\boldsymbol\lambda$. Furthermore, if $\lambda^{j,i} \to +\infty$ for some $j < i$, then $u^i \to U(a^{nf})$ over $(\underline\theta^j, \bar\theta^i)$, i.e., $\tilde R^j(u^i) \to 0$. Moreover, since $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$, $\lambda^{i,j'} \to +\infty$ for some $j' > i$, so that $\tilde R^i(u^{j'}) \to 0$ and $h^i \to 0$ (using the binding (R-IC$^{i,j'}$)). Therefore, there exist $\lambda^{j,i}$ large enough to make (29) hold. Finally, (28) always holds with $h^N = 0$.
Step 5: Characterization of expected virtual surplus and existence of a solution.

Fix $i > 1$. Then, by (26), $u^i$ must solve

$\max_{u\in\mathcal U}\ \tilde W^i(u) - \sum_{n=1}^{i-1}\lambda^{n,i}\tilde R^n(u)$,

where $\boldsymbol\lambda^i \in \mathbb R^{i-1}_+$. Now, using the expression of $\tilde R^n(x^i)$, and letting $\Phi(\upsilon) := -c(U^{-1}(\upsilon))$, we have

$\tilde W^i(u^i) - \sum_{n=1}^{i-1}\lambda^{n,i}\tilde R^n(u^i) = -\sum_{n=1}^{i-1}\lambda^{n,i}\int_{\underline\theta^i}^{\underline\theta^n} u^i(\theta)\,G^n(\theta)\,d\theta + \int_{\underline\theta^i}^{\bar\theta^i}\big[u^i(\theta)\,w^i(\theta;\boldsymbol\lambda^i) + \Phi(u^i(\theta))\big]\,dF^i =: VS^i(u^i;\boldsymbol\lambda^i)$,

where

$w^i(\theta;\boldsymbol\lambda^i) := \frac{\theta}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}V^{i,n}(\theta) = \frac{\theta}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}v^i(\theta) - \sum_{n=1}^{i-1}\lambda^{n,i}\frac{f^n(\theta)}{f^i(\theta)}\,v^n(\theta)$.

We can now apply to $VS^i(u^i;\boldsymbol\lambda^i)$ the technique used in the two-type case to characterize $u^I$ (Proposition 2). Clearly, if $\boldsymbol\lambda^i = 0$, then $VS^i(u^i; 0) = \tilde W^i(u^i)$ and $u^i = \bar u^i = U(x^i)$ on $\Theta^i$. For $\theta > \bar\theta^i$, we can let $u^i(\theta) = u^i(\bar\theta^i)$. For $\theta < \underline\theta^i$, $u^i(\theta)$ may be strictly smaller than $u^i(\underline\theta^i)$, to ensure that the downward constraints involving $i$ as the mimicked type are satisfied.
Now, suppose $\lambda^{n,i} > 0$ for some $n < i$. Apply the Myerson-Toikka ironing technique on $\Theta^i$, by letting

$z^i(q;\boldsymbol\lambda^i) = w^i((F^i)^{-1}(q);\boldsymbol\lambda^i)$ and $Z^i(q;\boldsymbol\lambda^i) = \int_0^q z^i(y;\boldsymbol\lambda^i)\,dy$.

Let $\bar Z^i(q;\boldsymbol\lambda^i) = \operatorname{conv} Z^i(q;\boldsymbol\lambda^i)$, and $\omega^i(q;\boldsymbol\lambda^i) = \bar Z^i_q(q;\boldsymbol\lambda^i)$ wherever defined. Extend $\omega^i$ by right-continuity, and at $1$ by left-continuity. For $\omega^i$ to be continuous, it is sufficient to show that, if $z^i$ is discontinuous at $q$, then $z^i$ jumps down at $q$. To see this, note that $w^i$ can be discontinuous only at points of the form $\underline\theta^j$ for $j < i$ and such that $\underline\theta^j \in (\underline\theta^i, \bar\theta^i)$. At such a point, we have

$w^i(\underline\theta^j+;\boldsymbol\lambda^i) = \lim_{\theta\downarrow\underline\theta^j} w^i(\theta;\boldsymbol\lambda^i) = \frac{\underline\theta^j}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}\,v^i(\underline\theta^j) - \sum_{n=1}^{i-1}\lambda^{n,i}\,\frac{f^n(\underline\theta^j)}{f^i(\underline\theta^j)}\,v^n(\underline\theta^j)$.

Note that (1) for $n < j$, $\underline\theta^n > \underline\theta^j$, which implies that $f^n(\underline\theta^j) = 0$, and (2) $v^j(\underline\theta^j) = (1-\beta^j)(\underline\theta^j/\beta^j) \ge 0$. So,

$w^i(\underline\theta^j+;\boldsymbol\lambda^i) = \frac{\underline\theta^j}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}\,v^i(\underline\theta^j) - \sum_{n=j}^{i-1}\lambda^{n,i}\,\frac{f^n(\underline\theta^j)}{f^i(\underline\theta^j)}\,v^n(\underline\theta^j)$.

On the other hand,

$w^i(\underline\theta^j-;\boldsymbol\lambda^i) = \lim_{\theta\uparrow\underline\theta^j} w^i(\theta;\boldsymbol\lambda^i) = \frac{\underline\theta^j}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}\,v^i(\underline\theta^j) - \sum_{n=j+1}^{i-1}\lambda^{n,i}\,\frac{f^n(\underline\theta^j)}{f^i(\underline\theta^j)}\,v^n(\underline\theta^j)$.

Therefore,

$w^i(\underline\theta^j+;\boldsymbol\lambda^i) - w^i(\underline\theta^j-;\boldsymbol\lambda^i) = -\lambda^{j,i}\,\frac{f^j(\underline\theta^j)}{f^i(\underline\theta^j)}\,v^j(\underline\theta^j) \le 0$.

This proves that $z^i$ can at most jump down and, hence, $\omega^i$ is continuous. Letting $\bar w^i(\theta;\boldsymbol\lambda^i) = \omega^i(F^i(\theta);\boldsymbol\lambda^i)$ for $\theta \in \Theta^i$, construct the generalized virtual surplus $\overline{VS}^i$, as we did in the proof of Proposition 2.
Now, note that $G^n(\theta) < 0$ for $\theta \in (\underline\theta^i, \underline\theta^n)$. Therefore, since $\lambda^{n,i} > 0$ for some $n < i$, the first term in $VS^i$ is strictly negative. Let $\underline n := \min\{n : \lambda^{n,i} > 0\}$. Then, over $(\underline\theta^i, \bar\theta^i)$, the characterization in Lemma 8 of any maximizer of $\overline{VS}^I$ extends to $\overline{VS}^i$. In particular, $u^i$ must be constant at $\bar y^i_b$ over $(\bar\theta^i_b, \bar\theta^i)$, where $\bar\theta^i_b \ge \underline\theta^{\underline n}$ and $\bar y^i_b \le \bar u^i(\bar\theta^i)$. Furthermore, $\bar y^i_b = \bar u^i(\bar\theta^i_b)$ if $\bar\theta^i_b > \underline\theta^i$; and $u^i(\theta) = \bar u^i(\theta)$ for $\theta \in [\underline\theta^i, \bar\theta^i_b]$. Next, the same argument as in Lemma 9 for $\overline{VS}^I$ yields that a (unique) maximizer of $\overline{VS}^i$ exists as well. Finally, the same argument as in Lemma 10 implies that the (unique) maximizer of $\overline{VS}^i$ is also the (unique) maximizer of $VS^i$.
Step 6: Properties of the solutions to (26).

Suppose that $\lambda^{n,i} > 0$ for some $n < i$, and let $\underline n$ be defined as before. Letting

$K^i = \sum_{n=\underline n}^{i-1}\lambda^{n,i}\int_{\underline\theta^i}^{\underline\theta^n} G^n(\theta)\,d\theta < 0$,

the analog of the ironing condition applies to $\bar w^i(\bar\theta^i_b;\boldsymbol\lambda^i)$:

$\int_{\bar\theta^i_b}^{\bar\theta^i}\big[w^i(y;\boldsymbol\lambda^i) - \bar w^i(\bar\theta^i_b;\boldsymbol\lambda^i)\big]\,dF^i + K^i = 0$,

and can be written as

$\int_{\bar\theta^i_b}^{\bar\theta^i}\big[\bar w^i(\bar\theta^i_b;\boldsymbol\lambda^i) - (\theta/\beta^i)\big]\,dF^i = \sum_{n=1}^{i-1}\lambda^{n,i}\int_{\bar\theta^i_b}^{\bar\theta^i} V^{i,n}(\theta)\,dF^i + K^i$.

To prove that $\bar w^i(\bar\theta^i_b;\boldsymbol\lambda^i) < \bar\theta^i/\beta^i$, it is enough to show that the right-hand side is negative:

$\sum_{n=\underline n}^{i-1}\lambda^{n,i}\int_{\bar\theta^i_b}^{\bar\theta^i} V^{i,n}(\theta)\,dF^i + \sum_{n=\underline n}^{i-1}\lambda^{n,i}\int_{\underline\theta^i}^{\underline\theta^n} G^n(\theta)\,d\theta < 0$,

which follows by (18). Therefore, $u^i$ exhibits bunching over $[\bar\theta^i_b, \bar\theta^i]$ with $\bar\theta^i_b < \bar\theta^i$, at value $\bar y^i_b < \bar u^i(\bar\theta^i)$.
i
Now, consider the bottom of [ i ; ]. By the same argument as in Lemma 11, wi ( i ; i )
wi ( i ; i ) with strict inequality if ib > i . Furthermore, for < i 1 , wi ; i = = i +
50
Pi
i Pi 1 n;i
v ( ) and wi ( i ; i ) = ( i = i )[1 + 1
] > i = i . So if wi ( i ; i ) =
n=1
w ( ; i ), then ui ( i ; i ) > ui ( i ). Otherwise, there is ironing at the bottom over [ i ; ib ] 6= ?
and the following must hold
1
n=1
i i
n;i i
R
which corresponds to
Now, for each n < i,
R
i
b
i
R
i
b
i
R
V i;n (y) dF i =
R
=
R
=
i
b
i
i
i
y=
i
i
b=
s
i
b
i
b
s
)
i
wi ( ib ;
) dF i = 0,
iP1
) dF i =
n;i
n=1
v i (y) dF i
i
b
i
i
wi (y;
wi ( ib ;
i
y=
i
b
i
R
R
i
b
i
V i;n (y) dF i .
i
b
i
v n (y) dF n
R ib
dF i
y=
i
R ib = n
dF
s
s
n
i
b
i
b
dF n
R
dF =
i
i
b=
i
n
b=
i
b
s
dF > 0.
Therefore, wi ( ib ; i ) > i = i , and ui ( i ; i ) > ui ( i ).
Finally, for bunching at the bottom, note that, for $\theta < \underline\theta^{i-1}$,

$w^i(\theta;\boldsymbol\lambda^i) = \theta/\beta^i + \sum_{n=1}^{i-1}\lambda^{n,i}\big[(1-\beta^i)(\theta/\beta^i) + F^i(\theta)/f^i(\theta)\big]$.

Hence, for $\theta' < \theta < \underline\theta^{i-1}$,

$w^i(\theta';\boldsymbol\lambda^i) < w^i(\theta;\boldsymbol\lambda^i) \iff \frac{\theta-\theta'}{\beta^i}\Big[1 + (1-\beta^i)\sum_{n=1}^{i-1}\lambda^{n,i}\Big] > \sum_{n=1}^{i-1}\lambda^{n,i}\Big[\frac{F^i(\theta')}{f^i(\theta')} - \frac{F^i(\theta)}{f^i(\theta)}\Big]$.

So, a sufficient condition for $w^i(\,\cdot\,;\boldsymbol\lambda^i)$ to be non-increasing in a neighborhood of $\underline\theta^i$ is that, for every $s' > s$ in $[\underline s, s^y]$,

$\frac{F(s')}{f(s')} - \frac{F(s)}{f(s)} \ge (s'-s)\,\frac{1}{\beta^i}\Big[(1-\beta^i) + \sum_{n=1}^{i-1}\frac{1}{\lambda^{n,i}}\Big]$.

Hence, bunching at the bottom is more likely if $\beta^i$ is closer to 1 and $\sum_{n=1}^{i-1}\lambda^{n,i}$ is large enough, i.e., if the principal assigns a high enough shadow value to not increasing the rents of the types below type $i$.
10 Appendix C: The Planner's Constrained Problem

I consider the alternative specification of the planner's problem as that of maximizing the expected ex-ante social surplus subject to a budget constraint. For simplicity, I do so in the two-type model. I suggest looking at Section 4 before reading this part.
Let $B$ be the planner's budget. Assume that $B$ is less than the maximal expected surplus: $B < \mu W^C(x^C) + (1-\mu)W^I(x^I)$. Using (8) and $(X, k)$, the planner's constrained problem is

$PB := \max_{\{X,k\}}\ \mu W^C(x^C) + (1-\mu)W^I(x^I)$ s.t. $x^i \in \mathcal M$, (IR$^i$), and (R-IC$^i$) for $i = I, C$, and $\Pi(X, k) \ge B$. (B)

In $PB$, given $X$, multiple vectors $k$ may be optimal. However, we can safely focus on DMs that make (R-IC$^C$) and (IR$^I$) hold with equality. Then, let

$\Pi(x^C, x^I; B) := \mu W^C(x^C) + (1-\mu)W^I(x^I) - \mu R^C(x^I) - B$,

to reduce $PB$ to

$PB' := \max_X\ \mu W^C(x^C) + (1-\mu)W^I(x^I)$ s.t. $x^C, x^I \in \mathcal M$, (RR), and $\Pi(x^C, x^I; B) \ge 0$. (B)
By the same argument as in the proof of Lemma 2, $PB'$ is equivalent to

$PB^u := \max_{\mathbf U}\ \mu\tilde W^C(u^C) + (1-\mu)\tilde W^I(u^I)$ s.t. $u^C, u^I \in \mathcal U$, $\tilde R^C(u^I) + \tilde R^I(u^C) \le 0$, and $\tilde\Pi(u^C, u^I; B) \ge 0$, (B)

where $\tilde\Pi(u^C, u^I; B) = \Pi(U^{-1}(u^C), U^{-1}(u^I); B)$.
Let $X$ and $Y$ be as in the proof of Lemma 2. By concavity of $U$ and $c$, $\mu\tilde W^C(u^C) + (1-\mu)\tilde W^I(u^I)$ and $\tilde\Pi(u^C, u^I; B)$ are concave; $\tilde R^C(u^I) + \tilde R^I(u^C)$ is linear. The constraint function $\Phi(u^C, u^I) := (\tilde R^C(u^I) + \tilde R^I(u^C),\ -\tilde\Pi(u^C, u^I; B))$ maps to $\mathbb R^2$, which has the non-empty, positive, and closed cone $\mathbb R^2_+$. Let $\lambda \ge 0$ and $\gamma \ge 0$, and define the Lagrangian

$L(u^C, u^I; \lambda, \gamma) := \mu\tilde W^C(u^C) + (1-\mu)\tilde W^I(u^I) + \gamma\,\tilde\Pi(u^C, u^I; B) - \lambda\big[\tilde R^C(u^I) + \tilde R^I(u^C)\big]$
$\qquad = (1+\gamma)\Big\{\mu\Big[\tilde W^C(u^C) - \tfrac{\lambda}{\mu(1+\gamma)}\tilde R^I(u^C)\Big] + (1-\mu)\Big[\tilde W^I(u^I) - \tfrac{\lambda+\gamma\mu}{(1-\mu)(1+\gamma)}\tilde R^C(u^I)\Big]\Big\} - \gamma B$. (30)
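The factoring in display (30) is pure algebra and can be checked numerically. This sketch assumes the profit functional $\tilde\Pi = \mu\tilde W^C + (1-\mu)\tilde W^I - \mu\tilde R^C - B$ and the normalized rent weights $\lambda/(\mu(1+\gamma))$ and $(\lambda+\gamma\mu)/((1-\mu)(1+\gamma))$; both are reconstructions, so treat this only as a check of internal consistency:

```python
import random

# Verify that the direct and factored forms of the Lagrangian agree
# for many random values of the primitives (all hypothetical).
random.seed(1)
for _ in range(100):
    mu = random.uniform(0.01, 0.99)
    lam, gam, B = (random.uniform(0.0, 5.0) for _ in range(3))
    WC, WI, RC, RI = (random.uniform(-5.0, 5.0) for _ in range(4))
    Pi = mu * WC + (1 - mu) * WI - mu * RC - B
    L1 = mu * WC + (1 - mu) * WI + gam * Pi - lam * (RC + RI)
    L2 = (1 + gam) * (mu * (WC - lam / (mu * (1 + gam)) * RI)
                      + (1 - mu) * (WI - (lam + gam * mu) / ((1 - mu) * (1 + gam)) * RC)) - gam * B
    assert abs(L1 - L2) < 1e-9
```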
Now, comparing (30) with (7), we see that here the weights on profits, and therefore on C's rents, depend endogenously on the multiplier $\gamma \ge 0$, rather than on $\alpha$. However, the same fundamental trade-offs arise in finding the optimal $u^C$ and $u^I$ (hence, $x^C$ and $x^I$).
Specifically, by Theorem 2, p. 221, of Luenberger (1969), if $(u^C, u^I, \lambda, \gamma)$ is such that, for all $(u^C_0, u^I_0) \in Y$, $\lambda' \ge 0$, $\gamma' \ge 0$,

$L(u^C, u^I; \lambda', \gamma') \ge L(u^C, u^I; \lambda, \gamma) \ge L(u^C_0, u^I_0; \lambda, \gamma)$, (31)

then $(u^C, u^I)$ solves $PB^u$. Condition (31) essentially implies the same optimality conditions as in Lemma 2. Also, if there exists $(\hat u^C, \hat u^I) \in Y$ such that $\Phi(\hat u^C, \hat u^I) < 0$, then, by Corollary 1, p. 219, of Luenberger (1969), condition (31) is also necessary for $(u^C, u^I)$ to solve $PB^u$.
In particular, let $B_1$ be the profit that the optimal DM — derived in Section 4 — yields in the monopolist's case ($\alpha = 1$).

Lemma 18 There exists $(u^C, u^I) \in \mathcal U^2$ such that $\Phi(u^C, u^I) < 0$ if and only if $B < B_1$.
Proof. ($\Rightarrow$) By definition of $B_1$, if $u^C, u^I \in \mathcal U$ and $\tilde R^C(u^I) + \tilde R^I(u^C) \le 0$ holds, then $\tilde\Pi(u^C, u^I; B_1) \le 0$.

($\Leftarrow$) Consider a solution $(u^C_1, u^I_1)$ to the problem $P^u$ with $\alpha = 1$. Since $B < B_1$, $\tilde\Pi(u^C_1, u^I_1; B) > 0$. If $\tilde R^C(u^I_1) + \tilde R^I(u^C_1) < 0$, we are done. Suppose $\tilde R^C(u^I_1) + \tilde R^I(u^C_1) = 0$. Recall that $u^I_1 = u^I(\hat\lambda^C) = \arg\max_{u\in\mathcal U}\tilde W^I(u) - \hat\lambda^C\,\tilde R^C(u)$ with $\hat\lambda^C > 0$ (Lemma 2). By revealed optimality, $\tilde R^C(u^I(\hat\lambda^C)) \ge \tilde R^C(u^I(\lambda^C))$ for every $\lambda^C > \hat\lambda^C$. Because $\tilde R^C(u^I_1) + \tilde R^I(u^C_1) = 0$, $\tilde R^C(u^I(\hat\lambda^C)) > 0$ and $u^I(\hat\lambda^C)$ is not constant (Proposition 1). I claim that there exists $\tilde\lambda^C > \hat\lambda^C$ such that $\tilde R^C(u^I(\hat\lambda^C)) > \tilde R^C(u^I(\tilde\lambda^C))$: by the proof of Proposition 2, $\tilde R^C(u^I(\lambda^C))$ is continuous in $\lambda^C$, and $\lim_{\lambda^C\to+\infty}\tilde R^C(u^I(\lambda^C)) = 0$ (Proposition 1). Finally, since $\tilde\Pi$ is also continuous in $\lambda^C$, it remains to choose $\lambda^C_\varepsilon > \hat\lambda^C$ so that $\tilde R^C(u^I(\lambda^C_\varepsilon)) = \tilde R^C(u^I(\hat\lambda^C)) - \varepsilon$ and $\tilde\Pi(u^C_1, u^I(\lambda^C_\varepsilon); B) > 0$.
References
[1] Akerlof, G. A. (1991): "Procrastination and Obedience", American Economic Review (Papers and Proceedings), 81, 2, 1-19.
[2] Amador, M., Werning, I. and Angeletos, G.M. (2006): "Commitment vs. Flexibility",
Econometrica, 74, 2, 365-396.
[3] Amromin, G. (2002): "Portfolio Allocation Choices in Taxable and Tax-Deferred Accounts: An Empirical Analysis of Tax Efficiency", Mimeo.
[4] Amromin, G. (2003): "Household Portfolio Choices in Taxable and Tax-Deferred Accounts:
Another Puzzle?", European Finance Review, 7, 547-582.
[5] Ashraf, N., N. Gons, D. S. Karlan and W. Yin, (2003): "A Review of Commitment Savings
Products in Developing Countries", ERD Working Paper No. 45.
[6] Ashraf, N., D. S. Karlan and W. Yin, (2006): "Tying Odysseus to the Mast: Evidence From
a Commitment Savings Product in the Philippines", Quarterly Journal of Economics, 121,
2, 635-672.
[7] Battaglini, M. (2005): "Long-Term Contracting with Markovian Consumers", American Economic Review, 95, 3, 637-658.
[8] Bernheim, D. B. (2002): "Taxation and Savings", in Handbook of Public Economics, Volume 3, Edited by A. J. Auerbach and M. Feldstein.
[9] Bryan, G. and Karlan, D. and Nelson, S. (2010): "Commitment Devices", Annual Review
of Economics.
[10] Courty, P. and Li, H. (2000): "Sequential Screening", Review of Economic Studies, 67, 4,
697–717.
[11] DellaVigna, S. (2009): "Psychology and Economics: Evidence from the Field", Journal of Economic Literature, 47, 2, 315-372.
[12] DellaVigna, S. and Malmendier, U., (2004): "Contract Design And Self-Control: Theory
And Evidence", Quarterly Journal of Economics, 119, 2, 353-402.
[13] DellaVigna, S. and Malmendier, U., (2006): "Paying not to go to the gym", American
Economic Review, 96, 3, 694-719.
[14] Diamond, P. A. (1977): "A Framework for Social Security Analysis", Journal of Public
Economics, 8, 275-298.
[15] Eliaz, K. and Spiegler, R., (2006): "Contracting with diversely naive agents", Review of
Economic Studies, 73, 3, 689-714.
[16] Eliaz, K. and Spiegler, R., (2008): "Consumer optimism and price discrimination", Theoretical Economics, 3, 4, 459–497.
[17] Esteban, S. and Miyagawa, E., (2006a): "Temptation, self-control, and competitive nonlinear pricing", Economics Letters, 90, 3, 348-355.
[18] Esteban, S. and Miyagawa, E., (2006b): "Optimal menu of menus with self-control preferences", Mimeo, Universitat Autònoma de Barcelona.
[19] Esteban, S., Miyagawa, E. and Shum, M. (2007): "Nonlinear pricing with self-control
preferences", Journal of Economic Theory, 135, 1, 306-338.
[20] Gul, F. and W. Pesendorfer (2001): "Temptation and Self-Control", Econometrica, 69, 1403-1435.
[21] Heidhues, P. and Koszegi, B. (2010): "Exploiting naivete about self-control in the credit
market", American Economic Review, Forthcoming.
[22] Holden, S., Ireland, K., Leonard-Chambers, V., and Bodgan, M. (2005): "The Individual
Retirement Account at Age 30: A Retrospective", Investment Company Institute, 11, 1.
[23] Holden, S., Schrass, D. (2008): "The Role of IRAs in U.S. Households' Saving for Retirement, 2008", Investment Company Institute, 18, 1.
[24] Holden, S., Schrass, D. (2009): "The Role of IRAs in U.S. Households' Saving for Retirement, 2009", Investment Company Institute, 19, 1.
[25] Holden, S., Schrass, D. (2010a): "The Role of IRAs in U.S. Households' Saving for Retirement, 2010", Investment Company Institute, 19, 8.
[26] Holden, S., Sabelhaus, J., and Bass, S. (2010b): "The IRA Investor Profile: Traditional IRA Investors' Contribution Activity, 2007 and 2008", Investment Company Institute.
[27] Investment Company Institute (2012): “The U.S. Retirement Market, Second Quarter
2012”(September). http://www.ici.org/info/ret_12_q2_data.xls
[28] Jianye, Yan (2011): "Contracting with Heterogeneous Quasi-Hyperbolic Agents: Investment Good Pricing under Asymmetric Information", Mimeo, Toulouse School of Economics.
[29] Kreps, D. (1979): "A Representation Theorem for Preference for Flexibility", Econometrica,
47, 565-577.
[30] Laibson, D. (1998): "Life-Cycle Consumption and Hyperbolic Discount Functions", European Economic Review, 42, 861-871.
[31] Luenberger, D. G. (1969): "Optimization by Vector Space Methods". New York: Wiley.
[32] Munnell, A. H., Sundén, A. E. (2006): "401(k) Plans Are Still Coming Up Short", Brookings
Institute Press.
[33] Mussa, M., and Rosen, S. (1978): "Monopoly and Product Quality", Journal of Economic
Theory, 18, 2, 301-317.
[34] Myerson, R. B. (1981): "Optimal Auction Design", Mathematics of Operations Research, 6, 1, 58-73.
[35] Myerson, R. B. (1986): "Multistage Games with Communication", Econometrica, 54, 2, 323-358.
[36] O’Donoghue, T. and Rabin, M. (1999): "Doing It Now or Later", American Economic
Review, 89, 1, 103-124.
[37] O’Donoghue, T. and Rabin, M. (1999): "Incentives for Procrastinators", Quarterly Journal
of Economics, 114, 3, 769-816.
[38] O’Donoghue, T. and Rabin, M. (2001): "Choice and Procrastination", Quarterly Journal
of Economics, 116, 1, 121-160.
[39] O’Donoghue, T. and Rabin, M. (2003): "Studying Optimal Paternalism, Illustrated by a
Model of Sin Taxes", American Economic Review, 93, 2, 186-191.
[40] O’Donoghue, T. and Rabin, M. (2007): "Incentives and Self Control", in Richard Blundell,
Whitney Newey, and Torsten Persson, eds., Advances in Economics and Econometrics:
Theory and Applications (Ninth World Congress), Cambridge University Press, August
2007.
[41] Pavan, A. (2007): "Long-Term Contracting in a Changing World", Mimeo. Northwestern
University.
[42] Pavan, A., Segal, I., and J. Toikka: "Dynamic Mechanism Design: Incentive Compatibility, Profit Maximization, and Information Disclosure", Mimeo, Northwestern University.
[43] Rockafellar, R. T. (1970): "Convex Analysis", Princeton University Press.
[44] Royden, H. L. (1988): "Real Analysis", Third Edition, Prentice Hall.
[45] Sandroni, A. and F. Squintani (2007): "Overconfidence, Insurance, and Paternalism", American Economic Review, 97, 5, 1994-2004.
[46] Spiegler, R. (2011): "Bounded Rationality and Industrial Organization," Oxford University
Press.
[47] Strotz, R. H. (1956): "Myopia and Inconsistency in Dynamic Utility Maximization," Review
of Economic Studies, 23, 165-180.
[48] Thaler, R. H. (1994): "Psychology and Savings Policies", American Economic Review, 84,
2, 186-192.
[49] Toikka, J. (2011): "Ironing Without Control", Journal of Economic Theory, 146, 6, 2510-2526.
[50] U.S. government, (2010): "Estimates of The Federal Tax Expenditures for the Fiscal Years
2010-2014", Washington.