Commitment, Flexibility, and Optimal Screening of
Time Inconsistency
Simone Galperti
Northwestern University
November 11, 2012
Abstract
I study the optimal supply of flexible commitment devices to people who value both commitment and flexibility, and whose preferences exhibit varying degrees of time inconsistency. I find that, if time inconsistency is observable, then both a monopolist and a planner will supply devices that enable each person to commit to the efficient level of flexibility. If instead time inconsistency is unobservable, then both suppliers will face a screening problem. To screen a more time-inconsistent from a less time-inconsistent person, the monopolist and (possibly) the planner inefficiently curtail the flexibility of the device tailored to the first person, and include unused options in the device tailored to the second person. These results have important policy implications for designing special savings devices that use tax incentives to help time-inconsistent people adequately save for retirement.
KEYWORDS: Time inconsistency, self-control, commitment, flexibility, contracts, screening, unused items.
JEL CLASSIFICATION: D42, D62, D82, D86, D91, G21, G23.
Department of Economics, Northwestern University, Evanston, IL 60208 (E-mail: simonegalperti2008@u.northwestern.edu). I am indebted to Eddie Dekel and Alessandro Pavan for many, long, and fruitful discussions that greatly improved the paper. I also thank Gene Amromin, Stefano DellaVigna, Roland Eisenhuth, Jeffrey Ely, William Fuchs, Garrett Johnson, Botond Koszegi, David Laibson, Santiago Oliveros, Jonathan Parker, Nicola Persico, Henrique Roscoe de Oliveira, Todd Sarver, Ron Siegel, Bruno Strulovici, Balazs Szentes, Rakesh Vohra, Asher Wolinsky, Michael Whinston, Leeat Yariv, and seminar participants at Northwestern University, New York University, UC Berkeley, and EEA-ESEM 2012. I gratefully acknowledge financial support from the Center of Economic Theory of the Weinberg College of Arts and Sciences of Northwestern University. All remaining errors are mine.
1 Introduction
Evidence from economics and psychology shows that many people have self-control problems (see, e.g., DellaVigna (2009) for a survey). Signs of such problems include amassing excessive credit-card debt, saving too little for retirement, and exercising too little. People are often aware of their self-control problems and thus look for commitment devices. At the same time, they also face uncertainty about their future and thus value flexibility.1 These conflicting desires create a demand for flexible commitment devices. Examples include illiquid savings devices, such as individual retirement accounts (IRAs) and 401(k) plans, gym memberships with large upfront fees but no per-visit charge, and on-line goal-setting tools provided by companies like stickK and GymPact.2
To get a sense of the importance of such devices, consider the U.S. retirement market. As of June 2012 this market had assets of $18.5 trillion, of which $3.3 trillion (17.8%) was in 401(k) plans and $5.1 trillion (26.5%) was in IRAs.3 IRAs and 401(k) plans are examples of "tax-shielded" investment accounts, which obey special tax rules that subsidize savings but also set limits on contributions to the balance as well as penalties for withdrawals. Authorizing such special accounts is also particularly costly: For the fiscal years 2010-2014, the resulting burden on the U.S. federal budget amounts to $711 billion (i.e., 12.7%) of the estimated tax expenditures.4
This paper offers a theoretical investigation of markets for flexible commitment devices. It is the first to consider simultaneously three essential aspects of such markets. First, people demand devices that provide both commitment and flexibility. Second, people differ in their degrees of self-control and consequently demand different devices. Third, the provider of these devices, whether it is a profit-maximizing firm or a welfare-maximizing government, cannot observe people's degrees of self-control when tailoring its supply to individual needs. Thus, it faces the problem of screening each individual according to his preference for commitment and for flexibility. The paper studies this problem and characterizes its profit- and welfare-maximizing solutions.5
I model a market for flexible commitment devices using a dynamic principal-agent framework with two periods.6 In period 1, the principal ('she,' hereafter) offers the agent ('he') an incentive device, which allows him to choose among several actions in period 2, and for each action specifies a payment to the principal. As an example, consider a savings device that allows the agent to contribute to and withdraw from the balance by paying some fee. In period 1, agents desire flexible devices because their preference over period-2 actions depends on an unknown state (Kreps (1979)), which is not contractible. Moreover, in period 1 some agents also desire commitment, because their preference is time inconsistent (Strotz (1956)). Finally, the agents' self-control problems are unobservable, because they alone know their degree of time inconsistency (i.e., their type). For the sake of illustration, suppose that there are only two types: agents of type C are time consistent, and agents of type I are all time inconsistent to the same degree.

1 For an alternative analysis of the trade-offs between commitment and flexibility see, e.g., Amador et al. (2006). I will compare my analysis with theirs in Section 6.2.
2 Other typical examples of commitment devices include automatic drafts from checking to investment accounts, Christmas clubs, rotating savings and credit associations, microcredit savings accounts in developing countries, fat farms, and programs to reduce consumption of cigarettes, alcohol, or drugs. (See Ashraf et al. (2003), Ashraf et al. (2006), DellaVigna and Malmendier (2004, 2006), Bryan et al. (2010).)
3 Investment Company Institute, "The U.S. Retirement Market, Second Quarter 2012" (September 2012).
4 Source: Estimates of The Federal Tax Expenditures for the Fiscal Years 2010-2014 (2010).
5 O'Donoghue and Rabin (2007) as well as Bryan et al. (2010) raise the question of whether markets can provide products to solve people's commitment problems, but the answer lies beyond the scope of their papers. Laibson (1998) questions the optimality of the retirement instruments created by the U.S. government. Amador et al. (2006) study the design of paternalistic commitment policies such as minimum-savings requirements.
6 DellaVigna and Malmendier (2004) and Eliaz and Spiegler (2006) adopt a similar approach.
In Section 3 I show that, if the agents' types were observable, then the principal would offer each of them a (possibly different) device with the best mix of commitment and flexibility, independently of whether she maximizes profits or welfare. That is, such a device helps each type commit to a flexible plan of action that is efficient in each state from the viewpoint of his period-1 preference.7 I call this outcome efficient, adopting the standard interpretation of the agent's period-1 preference as his long-run preference (O'Donoghue and Rabin (1999)). I also show that, in some cases, the principal can sustain the efficient outcome using the same device with both types, so their unobservability is irrelevant. Otherwise, it creates a screening problem.
This screening problem arises because, thanks to his stronger self-control, C values any flexible device strictly more than does I. So, if the principal wants to sustain a flexible outcome with I, she must grant C enough rents to disincentivize his pretending to be I, as in standard screening models. Unfortunately, these rents can jeopardize I's incentives for revealing his time inconsistency. This possibility can lead to the unusual situation in which the principal has to take into account, at once, both types' incentives for mimicking one another when designing each commitment device.
I derive the optimal screening devices in Section 4. When designing the device for I, the principal faces a standard trade-off between maximizing efficiency with him and extracting rents from C. Intuitively, since these rents depend on the flexibility of I's device, the principal should curtail it below the efficient level. Although in principle she can do so in many ways, I am able to provide a tight characterization of the optimal one. Roughly speaking, the principal limits I's choices to a strict subset of the efficient set and induces bunching at (possibly) both ends of his choice range. I provide an illustrative example in Section 1.1 and present the general result in Section 4.2.
When designing the device for C, the principal always wants to maximize its efficiency, but she must also take into account, as noted, that this device may induce I to pretend to be C. To prevent this, she may have to modify the device for C, relative to the symmetric-information, efficient one, in two ways. First, she may add tempting items that C never uses but that I would. These items make I view the device for C as offering less commitment, and thus deter I from pretending to be C.8 Second, the principal may also distort the outcome with C. Thus, we have a failure of the 'no distortion at the top' property that distinguishes previous screening models, both static and dynamic (e.g., Mussa and Rosen (1978); Courty and Li (2000); Battaglini (2005)). Finally, comparing the monopolist's and planner's optimal device for C, I show that if the monopolist has to rely on unused items (respectively, has to distort C's choices), then the planner does as well.

7 This result generalizes a previous result of DellaVigna and Malmendier (2004), as discussed in Section 6.2.
8 A related result appears in Esteban and Miyagawa (2006b). Our models differ in ways that lead to substantive differences in when and how the principal relies on the unused items; I shall discuss these differences in Section 6.2 after presenting my analysis.
This paper contributes to the growing literature on behavioral contract theory.9 It adds to this literature the result that, when a profit-maximizing firm faces diversely time-inconsistent agents, it screens them using contracts with different degrees of flexibility. Moreover, the paper characterizes the specific form that this screening takes: The contract for I inefficiently curtails flexibility as illustrated in Section 1.1; the contract for C, instead, may intentionally inflate flexibility by offering more options than C deems useful.
This paper also contributes to the normative study of optimal paternalism (O'Donoghue and Rabin (2003)). On the one hand, it shows that there is scope for paternalistic interventions in providing flexible commitment devices to people with self-control problems: When the degree of such problems is unobservable, profit-maximizing firms create inefficiencies for the people who most need commitment. On the other hand, since a paternalistic government also cannot observe people's degree of self-control, its best provision of commitment devices may not reach the first best. To illustrate its properties, in Section 5 I address the following question: How should the government design special savings devices (SDs), with specific tax rules that help I appropriately save for retirement, while ensuring that C does not use such SDs? The main points can be summarized as follows: (1) due to asymmetric information, the government faces a trade-off between the corrective and the redistributive role of taxation; (2) solving this trade-off calls for curtailing the liquidity of the special SDs relative to the standard SDs; and (3) the optimal way to do so involves limiting, through binding caps implementable with tax penalties, the ability both to tap and to contribute to a special SD.
Of course, given the simplicity of my setup, these points are very stylized, and extreme caution is required in relating them to real-world situations. However, at a broad level, they resemble some features of the U.S. regulation of IRAs and 401(k) plans, special SDs with tax incentives. Also, one can interpret some evidence in Amromin (2002, 2003) as consistent with the principle that curtailing the liquidity of the special SDs, namely the IRAs and the 401(k) plans, helps keep people with stronger self-control choosing the standard SDs, namely the so-called 'regular taxable accounts.' Overall, my analysis can offer an additional way of looking at the current U.S. regulation of SDs and of assessing its future revisions.
Finally, this paper relates to the literature on dynamic mechanism design. First, it highlights a methodological point: In models with time-inconsistent agents, one cannot in general restrict attention to mechanisms that are defined only on the path of play. Indeed, to characterize the optimal devices, one must allow for options that are off path. This can never occur in standard models without time inconsistency. Second, deriving the devices that optimally screen the agents' time inconsistency requires some non-standard and recent techniques. For reasons that I will explain later, to handle the incentive constraints involving the agents' types, I use Lagrangian methods from Luenberger (1969). This approach differs from a standard optimal-control approach and the standard dynamic-mechanism-design approach (Courty and Li (2000); Pavan, Segal, and Toikka (2011)). Moreover, to ensure that each device satisfies a necessary monotonicity property in the period-2 states, I adapt Toikka's (2011) generalization of Myerson's (1981) ironing procedure to my setting with off-path options.
In Section 6, I provide a more detailed literature review; I also discuss the possibility of introducing overconfident agents into the model. Section 7 concludes. All proofs are in the appendix.

9 This literature includes DellaVigna and Malmendier (2004, 2006); Esteban and Miyagawa (2006, 2007); Eliaz and Spiegler (2006, 2008); Heidhues and Koszegi (2010); and Spiegler (2011), among others.
1.1 An Illustrative Example: Savings Devices
Consider the following situation. In period 1, the principal offers savings devices (SDs) to the agent, who is planning his future savings. An SD allows him to contribute to and withdraw from the balance in period 2 (his working life) at some fee, thereby providing a tool to smooth consumption between that period and some future period of retirement. Specifically, given an SD, in period 2 the agent chooses the action a and pays the fee p, where a > 0 for contributions and a < 0 for withdrawals. For each a, SDs give a return at retirement, according to a rate that I assume fixed and exogenous. On the other hand, the principal can design different SDs by changing how the fee p depends on the action a.10 Finally, in this example, the principal incurs no cost to provide SDs.
The choice of type C and type I among SDs depends on how each evaluates, in period 1, the decisions that he expects to make in period 2. In period 1 each type knows his period-2 income, y, but he is still uncertain about what level of savings a will be optimal in period 2, as this decision hinges on a state s > 0 that realizes only then, e.g., his period-2 health. Specifically, in period 1 each type assigns the utility y − a − p + sU(a) to choosing a and paying p in state s, where s determines the trade-off between immediate consumption, y − a − p, and expected utility at retirement given a, namely U(a). However, when actually choosing in period 2, C and I may use a different utility function. In period 2, C continues to choose (a, p) according to the function y − a − p + sU(a). Instead, in period 2, I chooses (a, p) according to the function y − a − p + sβU(a) with 0 < β < 1. Thus, in period 2, type I systematically overweights his immediate consumption relative to his future utility.

Suppose that, according to the period-1 utility y − a + sU(a), for some states it is optimal to contribute to an SD (a > 0), but for others (e.g., unexpected health problems) it is optimal to withdraw from it (a < 0). For simplicity, let this optimal plan correspond to the function a*(s) = s − a0, where s ranges from s̲ to s̄.11 Note that a* defines what I call the efficient outcome, since the principal's costs are zero.
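The closed form a*(s) = s − a0 is easy to verify numerically under the log specification U(a) = ln(a + a0) mentioned in footnote 11. The following sketch uses an illustrative a0 and a handful of states (my own choices, not values from the paper):

```python
import numpy as np

# Illustrative parameters (not from the paper): a0 and a grid of states.
a0 = 1.0
U = lambda a: np.log(a + a0)  # footnote 11's example specification

# Candidate actions: withdrawals (a < 0) and contributions (a > 0).
a_grid = np.linspace(-0.9, 3.0, 400_000)

for s in [0.5, 1.0, 1.7, 2.5]:
    surplus = s * U(a_grid) - a_grid          # ex-ante surplus sU(a) - a (zero cost)
    a_star = a_grid[np.argmax(surplus)]
    assert abs(a_star - (s - a0)) < 1e-3      # matches a*(s) = s - a0
```

The first-order condition s/(a + a0) = 1 delivers the same answer analytically.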
Before describing the optimal SDs, I want to briefly give some intuition for the relation between the problem of screening time inconsistency and the flexibility of I's SD, and for why this problem leads to its curtailment. Suppose that the principal wants to help I commit to the plan a*. To do so, his SD must feature penalties for withdrawals and rewards for deposits, since in period 2 I cares systematically more about his immediate consumption than about his retirement well-being. How does C evaluate such an SD in period 1? Roughly speaking, since C has stronger self-control, he expects to reap the rewards and avoid the penalties more often than does I; so C expects a larger payoff from I's SD than does I himself. In contrast, suppose that the principal offers I an SD with only one option, a_nf, i.e., no flexibility at all. Then, I doesn't need any penalty or reward to commit to a_nf, and C cannot benefit from his stronger self-control. More generally, by making I's SD less flexible, the principal can reduce the penalties and rewards that help him commit. These changes curtail how much C expects to benefit from lying in period 1 and consequently reduce the rents that keep C from doing so.

10 Note that the fee p can involve two parts, p1 and p2, where only p2 depends on a and p1 is possibly paid in period 1 by the agent as an initial contribution to the SD.
11 This plan is optimal if, for example, U(a) = ln(a + a0).
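This intuition can be checked with a toy menu. Assume (my own illustration, not a derivation from the paper) U(a) = ln(a + a0) and a linear fee p(a) = (β − 1)a, which rewards deposits and penalizes withdrawals just enough that I's self-2 follows the plan a*(s) = s − a0. Type C's self-2 then deviates toward the rewarded high-contribution options, so C's expected period-1 payoff from I's SD strictly exceeds I's:

```python
import numpy as np

# Illustrative numbers, not from the paper.
a0, beta = 1.0, 0.5                    # beta < 1: I's self-2 underweights future utility
states = np.array([1.0, 1.5, 2.0])     # equally likely period-2 states
menu_a = states - a0                   # options implementing the efficient plan a*(s) = s - a0
menu_p = (beta - 1.0) * menu_a         # linear fee: rewards deposits, penalizes withdrawals

U = lambda a: np.log(a + a0)

def self2_choice(s, b):
    """Index of the menu option maximizing self-2's utility s*b*U(a) - a - p."""
    return np.argmax(s * b * U(menu_a) - menu_a - menu_p)

def period1_value(b):
    """Expected period-1 utility sU(a) - a - p of the options self-2 will pick."""
    picks = [self2_choice(s, b) for s in states]
    return np.mean([s * U(menu_a[k]) - menu_a[k] - menu_p[k]
                    for s, k in zip(states, picks)])

# I's self-2 sticks to the efficient plan; C's self-2 grabs the rewarded options.
assert all(self2_choice(s, beta) == i for i, s in enumerate(states))
assert period1_value(1.0) > period1_value(beta)   # C values I's device strictly more
```

With a single-option menu (no flexibility), both types would get the same payoff, which is why curtailing flexibility curtails C's rents.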
I now describe the SDs that the principal offers to screen the two types. I consider the case in which she maximizes her expected profits from the fees p. The detailed derivation is in Section 4.

Consider first the SD designed for I. Figure 1 illustrates his resulting savings decisions, the function denoted by aI, and compares them with the efficient plan a*. Panel (a) illustrates aI for a setting in which I's market share is large, and panel (b) for a setting in which it is small.

[Figure 1: Efficient Outcome vs. Second Best with type I. Panels (a) and (b).]
By comparing aI and a*, we see exactly how the principal curtails the flexibility of the savings opportunities offered to I relative to the first best. Specifically, there are three key properties: (1) I obtains fewer savings options than in the efficient plan, since the range of aI is a strict subset of that of a*; (2) I will withdraw too little for small states, where aI(s) > a*(s), and will contribute too little for large states, where aI(s) < a*(s); and (3) I will not adjust his savings decisions at all to some extreme states, where aI is constant. This last property always occurs for large enough states; it can also occur for small enough states, as shown in panel (b). As I show in the rest of the paper (see Proposition 2), properties (1)-(3) hold quite generally, as long as the principal's goal assigns some weight to profits. Concretely, we can think of the SD offered to I as an account that, besides possibly rewarding contributions and punishing withdrawals, also sets limits on these transactions. Such limits appear, for example, in some individual-development accounts (a form of matched savings accounts) and Christmas-club accounts offered by some financial institutions in the U.S. (see also Ashraf, Gons, Karlan, and Yin (2003)).
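The three properties can be encoded directly. The curve below is a hand-drawn stand-in for aI in the spirit of panel (b); its shape is purely illustrative, not derived from the model, and the assertions mirror properties (1)-(3):

```python
import numpy as np

# Illustrative only: a*(s) = s - a0 on S = [1, 2], and a curtailed, flatter a_I
# that is clipped (bunched) at both ends of its range.
a0 = 1.0
s = np.linspace(1.0, 2.0, 1001)
a_star = s - a0                                    # efficient plan, range [0, 1]
a_I = np.clip(0.8 * (s - 1.0) - 0.05, 0.1, 0.6)    # hand-drawn second-best curve

# (1) Fewer options: the range of a_I is a strict subset of that of a*.
assert a_I.min() > a_star.min() and a_I.max() < a_star.max()
# (2) Too little withdrawal in low states, too little contribution in high states.
assert a_I[0] > a_star[0] and a_I[-1] < a_star[-1]
# (3) Bunching: a_I is constant on sets of extreme states at both ends.
assert np.isclose(a_I[s >= 1.85], 0.6).all()
assert np.isclose(a_I[s <= 1.15], 0.1).all()
```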
Consider now the SD designed for C. Figure 2 compares the savings options that this SD offers C, the function denoted by xC, with the efficient plan a*. We see that C's device commits him to the efficient savings plan a* on the relevant range [s̲, s̄]. But it also includes one option that involves an inefficiently large withdrawal (a̲) in period 2, and that C never uses because he deems it reasonable only for s below the smallest possible state s̲. I shall later discuss how the presence of unused options depends on the primitives of the model.

[Figure 2: Efficient Outcome vs. Second Best with type C.]
2 The Model
A principal interacts with an agent over two periods. In period 1 (the contracting stage), the principal offers the agent a device that gives him the opportunity to choose an action a in period 2 (the consumption stage). The contractible action a belongs to a feasible set A := [a̲, ā] ⊂ R with −∞ < a̲ < ā < +∞. For each a, a device specifies a payment p from the agent to the principal. The principal incurs a production cost in period 2, which is given by the twice differentiable function c : A → R with c′ ≥ 0 and c′′ ≥ 0. The principal can perfectly commit to a device, which, if accepted by the agent, is then binding for both parties. I also assume that there is no spot market for a in period 2.
Agents may have time-inconsistent preferences, but they are always sophisticated. To model time-inconsistent agents, I follow Strotz (1956). For each agent, I call self-1 the self who chooses a device and self-2 the self who chooses an option from the resulting menu. Both selves' preferences depend on a state s, which occurs in period 2 and reflects shocks to taste, income, or health. These shocks induce self-1 to desire flexibility; moreover, they are not contractible, for example because only the agent observes them.12 Conditional on s, the direct utilities of self-1 and self-2 from action a are

u1(a, s) := sU(a) − a   and   u2(a, s, β) := sβU(a) − a,

where the function U : A → R is twice differentiable with U′ > 0 and U′′ < 0. Finally, agents' total utility is u1(a, s) − p for self-1 and u2(a, s, β) − p for self-2.
The positive parameter β determines the preferences' degree of time inconsistency and can lead self-1 to desire commitment. When β ≠ 1, self-1 foresees that, in each state, his self-2 trades off the benefit and the cost of any action a in a systematically different manner. This modeling assumption is based on the idea, proposed by cognitive psychologists, of 'salience:' Decision-makers seem to perceive the cost of their actions (−a) as more (or less) salient than their benefit (sU(a)) depending on whether the decision they face is immediate or occurs in the future (see, e.g., Akerlof (1991) and the references therein).

12 If s were contractible, the perfect ability of the principal to commit would make the solution for the agents' time inconsistency trivial.
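The wedge between the two selves shows up in a two-line computation. Under an illustrative specification U(a) = ln(a + 1) (my choice, not the paper's), self-2 with β < 1 picks a strictly smaller action than self-1 would in every state:

```python
import numpy as np

# Illustrative specification satisfying U' > 0 > U'': U(a) = ln(a + 1).
U = lambda a: np.log(a + 1.0)
a_grid = np.linspace(-0.9, 4.0, 200_000)

for s in [1.2, 1.6, 2.0]:
    a1 = a_grid[np.argmax(s * U(a_grid) - a_grid)]             # self-1's favorite action
    for beta in [0.5, 0.8]:
        a2 = a_grid[np.argmax(s * beta * U(a_grid) - a_grid)]  # self-2's actual choice
        assert a2 < a1   # beta < 1: self-2 underweights the benefit sU(a)
```

With β > 1 the inequality flips, which is the gym case of Example 1 below.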
The illustrative example in Section 1.1 gives one interpretation of the model for the case in which the principal offers savings devices; clearly, adding y to both selves' utilities does not affect their preferences. The model can, however, capture other situations. For instance, the principal may offer gym memberships as in the next example.13

Example 1 In period 1, a gym offers membership contracts. Becoming a member allows the agent to exercise in period 2 (e.g., the following month) at the gym; the time spent there determines a fee p. Exercising d minutes causes an immediate discomfort, but it improves future health by Û(d) (with Û′ > 0 and Û′′ < 0). The agent's self-1 knows that his disutility from d will depend, for example, on whether he has a bad day at work (s). Self-1 may also foresee that self-2 will always overweigh the discomfort of d and thus will tend to exercise less than self-1 wants ex ante. To model the agent's preferences for exercising, we can use the functions Û(d) − sd for self-1 and Û(d) − sβd for self-2, where now β > 1. Letting a = Û(d) and assuming quasi-linearity in p, we can rewrite the agent's preferences over (a, p) as u1(a, s) − p and u2(a, s, β) − p with the properties that I assumed above.
As far as period-2 states are concerned, s belongs to the set S := [s̲, s̄] ⊂ R with 0 < s̲ < s̄ < +∞. This assumption simply says that, in each state, the agent assigns a weight to the benefit and the cost of his actions which is bounded away from zero. States have distribution F, which has continuous and positive density f and is commonly known in period 1.

For clarity's sake, in most of the analysis I focus on the case with only two types: agents of type C are time consistent with β_C = 1, and agents of type I are all time inconsistent to the same degree β_I. I also assume that 0 < β_I < 1, as in the savings example.14 Because the agents are sophisticated, they know their degree of time inconsistency before contracting occurs. In contrast, the principal cannot observe β; she, however, knows the possible types and the share μ ∈ (0, 1) of type C in the market.

To evaluate any device in period 1, type i has to take into account what his self-2 does in period 2. Therefore, I restrict attention to devices for which self-2's optimal decisions are always well-defined, independently of the agent's type. So, the payoff to self-1 of any device is just the expected utility of the resulting decisions of self-2, computed using the period-1 preference. Finally, if an agent rejects all the principal's offers, he receives the outside option, whose value is normalized to zero.15
I adopt the following definition of efficiency, which adheres to the standard interpretation of the self-1 preference as the long-run preference of the agent (O'Donoghue and Rabin (1999)).

Definition 1 (Efficiency) For each state s, the ex-ante social surplus of action a is u1(a, s) − c(a), and the efficient outcome is

a*(s) := argmax_{a ∈ A} [u1(a, s) − c(a)].

13 The interpretation of some gym memberships as examples of commitment devices already appears in DellaVigna and Malmendier's (2004, 2006) study of profit-maximizing contracts designed for time-inconsistent consumers.
14 Later in the paper, I explain how the analysis changes when β_I > 1, as in the gym example (Section 6.1), and I extend my results to the case with more than two types (Section 4.4).
15 I discuss the case in which the outside option has type-dependent values in Section 6.1.
I also assume that it is never efficient to induce the smallest and the largest action, and that the maximum ex-ante social surplus is positive in every state.

Assumption 1 For every s, a*(s) is interior and u1(a*(s), s) − c(a*(s)) > 0.

By standard arguments, the outcome a* is already non-decreasing; by Assumption 1, a* becomes increasing. Thus, a* involves choosing a different action in each state and defines the efficient, benchmark level of flexibility.
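A quick numerical sanity check of this monotonicity, under illustrative primitives chosen to satisfy the stated conditions on U (increasing, strictly concave) and c (nondecreasing, convex); none of the functional forms come from the paper:

```python
import numpy as np

# Illustrative primitives on A = [-0.5, 3]: U(a) = ln(a + 1), c convex and
# nondecreasing on A (c' >= 0 because a - a_lo >= 0 everywhere on A).
a_lo, a_hi = -0.5, 3.0
a = np.linspace(a_lo, a_hi, 100_000)
U = np.log(a + 1.0)
c = 0.05 * (a - a_lo) ** 2

states = np.linspace(1.0, 2.5, 50)
a_star = [a[np.argmax(s * U - a - c)] for s in states]

# Interior maximizers, and the efficient outcome is increasing in the state.
assert a_lo < min(a_star) and max(a_star) < a_hi
assert all(x < y for x, y in zip(a_star, a_star[1:]))
```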
Finally, I assume that, when designing her devices in period 1, the principal maximizes a weighted sum of her expected profits and the expected ex-ante social surplus, with weight α ∈ [0, 1] on profits and 1 − α on welfare. This assumption is simply a convenient way to cover the case of a monopolist (α = 1), as well as the case of a paternalistic planner who may also have to worry about the profitability of her devices (α < 1). Intuitively, this situation may arise when the planner is the exclusive provider of commitment devices and has limited resources to finance them, or when she only regulates the market of such devices and has to ensure that third-party providers expect enough profits to enter the market. In these cases, I could alternatively let the planner maximize the expected ex-ante social surplus subject to the constraint of generating some minimum (possibly negative) level of profits. As I show in Appendix C, this alternative specification would only make α endogenous, without affecting my results.
3 Symmetric Information: A Benchmark

In this section I show that, if the agents' degree of time inconsistency were observable, then the principal would offer each type a device that sustains the efficient outcome, independently of how much she cares about profits. In so doing, she would provide a perfect mix of commitment and flexibility.
When the agent's type is observable, by the Revelation Principle, I can characterize optimal devices using direct mechanisms (DMs) that induce the agent to report truthfully the state s in period 2. Formally, a DM is a pair of functions (a, p), where a : S → A and p : S → R. DMs must satisfy the constraints

∫_S [u1(a(s), s) − p(s)] f(s) ds ≥ 0   (IR)

and

u2(a(s), s, β) − p(s) ≥ u2(a(s′), s, β) − p(s′) for all s, s′.   (IC_β)

Note that the constraint (IR) uses the preference of self-1, whereas the constraints (IC_β) use the preference of self-2. Subject to these constraints, the principal then chooses (a, p) to maximize

(1 − α) ∫_S [u1(a(s), s) − c(a(s))] f(s) ds + α ∫_S [p(s) − c(a(s))] f(s) ds.
Although changing α modifies the principal's goal, it turns out that she always helps time-inconsistent agents fully solve their self-control problems.
Lemma 1 (First Best) Suppose that the agents' degree of time inconsistency is observable. Then, for any β, the principal offers a device that sustains the efficient outcome a* for all α ∈ [0, 1].
The intuition for Lemma 1 is as follows. Since the agent has no private information at the contracting stage, he has no advantage over the principal. Thus, the principal becomes the residual claimant of the utility that the agent expects when choosing a device, i.e., of the expected utility of the agent's self-1. Hence, independently of α, the principal's goal becomes to maximize the expected ex-ante surplus. There is, however, a twist here, compared to a model with only time-consistent agents: When 0 < β < 1, the principal has to make sure that the agent's self-2 is willing to go along with her efficient plan. Doing so is feasible, despite time inconsistency, because both self-1 and self-2 prefer higher actions in higher states. Lemma 1 generalizes a similar result of DellaVigna and Malmendier (2004), which relies on the same intuition.
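The residual-claimant logic can be made explicit by rearranging the principal's objective from the display above: for any α,

```latex
(1-\alpha)\int_S \bigl[u_1(a(s),s)-c(a(s))\bigr]f(s)\,ds
  + \alpha\int_S \bigl[p(s)-c(a(s))\bigr]f(s)\,ds
\;=\;
\int_S \bigl[u_1(a(s),s)-c(a(s))\bigr]f(s)\,ds
  \;-\; \alpha\int_S \bigl[u_1(a(s),s)-p(s)\bigr]f(s)\,ds .
```

The last integral is nonnegative by (IR), and the principal raises payments until it equals zero; what remains is the expected ex-ante surplus, whatever the value of α. This is why the first best survives any weight on profits, as long as payments exist that keep self-2 on the efficient plan.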
The goal of the present paper is to explain why and how the agent's private information about his time inconsistency affects the principal's supply of flexible commitment devices. As I show in Appendix A, if the number of states is finite, then for β_I large enough the principal can (and will) sustain the efficient outcome a* without worrying about the agent's private information. Intuitively, with discrete states there are many incentive schemes that sustain a* with each type. Moreover, if β_I and β_C are close enough, there is a single incentive scheme that sustains a* with both types; so private information does not matter. However, this is never the case with a continuum of states.
4 Asymmetric Information

4.1 The Screening Problem
In this section, I set up the principal's problem of screening the agents' degree of time inconsistency as a dynamic mechanism-design problem. I then show that asymmetric information grants the time-consistent agent an advantage over the time-inconsistent agent, which depends on the flexibility of the latter's device. Finally, I derive necessary and sufficient conditions that will allow me to characterize the optimal screening devices. Hereafter, I will call i's device the i-menu.
The principal has to design two devices, one for each type, while fulfilling two requirements. The first requirement is that type i self-selects the i-menu; the second requirement is that, given the i-menu, in each state i's self-2 chooses the option that the principal wants him to choose in that state. To formally describe this design problem, we can use direct mechanisms (DMs) that induce the agents to reveal sequentially their period-1 and period-2 private information.
In a model with time-inconsistent agents, one needs to be careful when defining the class of DMs to be used in the analysis of the principal's problem. In contrast to models with only time-consistent agents (Myerson (1986)), restricting attention to DMs that let self-1 report β and self-2 report only his new information s entails a loss of generality. To characterize the optimal screening menus, DMs must allow self-2 to report how his preference depends on the combination of β and s. The need for allowing these richer reports will be clarified below (see Proposition 3). Note that, in each state, self-2's preference is pinned down by the single number βs. So, call this number the self-2 'valuation' of U(a), and for each type i define

Θ^i := {θ : θ = β^i s, s ∈ S} = [β^i s̲, β^i s̄].
It is then without loss of generality to focus on DMs that satisfy two properties. First, they assign a pair (a, p) to the agent's sequential reports of β and θ; these reports correspond, respectively, to choosing a menu in period 1 and one of its options in period 2. To track period-2 choices after any period-1 report, I consider DMs that assign pairs (a, p) to coherent as well as incoherent reports; for example, if self-1 reports that he is time consistent (i.e., β^C), each DM still allows self-2 to report a valuation that only a time-inconsistent agent can have (i.e., θ < β^C s̲). The second property that each DM must satisfy is that it ensures that truthfully reporting β is optimal in period 1 and that truthfully reporting θ is optimal in period 2 for any period-1 report about β.16
I can now formally state the principal's screening problem; this requires introducing some notation. Define Θ := [θ̲, θ̄], where θ̲ := β^I s̲ and θ̄ := β^C s̄ (i.e., Θ = co(Θ^C ∪ Θ^I)). A DM is a collection of functions {X, T} = (x^i, t^i)_{i=C,I}, where x^i : Θ → A is an allocation rule and t^i : Θ → R is a payment rule. With a slight abuse of notation, let u₂(a, θ) := u₂(a, s, β) for θ = βs. Also, let U^i(x^j, t^j) be the utility that type i expects in period 1 from choosing the j-menu; let Π^i(x^i, t^i) be the expected profits that the i-menu yields with i; and let W^i(x^i) be the expected surplus that x^i induces with i's self-1; recall that the ex-ante surplus of action a in state s = θ/β^i is u₁(a, θ/β^i) − c(a). The principal's problem is then
P := max_{X,T} (1 − α)W(X) + αΠ(X, T)

s.t., for i = C, I,

u₂(x^i(θ), θ) − t^i(θ) ≥ u₂(x^i(θ′), θ) − t^i(θ′) for θ, θ′ ∈ Θ,   (θ-IC^i)

U^i(x^i, t^i) ≥ U^i(x^{−i}, t^{−i}),   (IC^i)

U^i(x^i, t^i) ≥ 0,   (IR^i)

where

W(X) = μW^C(x^C) + (1 − μ)W^I(x^I)

and

Π(X, T) = μΠ^C(x^C, t^C) + (1 − μ)Π^I(x^I, t^I).
The goal for the rest of this section will be to reduce problem P to a set of simple optimality
conditions that characterize each screening menu.
Starting from the constraints involving θ, it follows from standard arguments that a DM {X, T} satisfies (θ-IC^i) if and only if, for i = C, I,

x^i ∈ M := {x : Θ → A | x non-decreasing},   (MON^i)

and

t^i(θ) = u₂(x^i(θ), θ) − ∫_{θ̲}^{θ} U(x^i(y)) dy − k^i for each θ,   (ENV^i)

where k^i := u₂(x^i(θ̲), θ̲) − t^i(θ̲). Hereafter I will consider only DMs that satisfy the constraint (θ-IC), which I will denote by {X, k} = (x^i, k^i)_{i=C,I} and will simply call DMs. Similarly, I will denote the i-menu by (x^i, k^i).
16 This second property is stronger than requiring that truthfully reporting θ be optimal only conditional on a truthful report about β. However, it entails no loss of generality. See, e.g., Pavan (2007).
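The characterization in (MON^i) and (ENV^i) can be checked numerically. The sketch below is illustrative only: the functional forms are my assumptions, not the paper's. It posits the quasi-linear specification u₂(a, θ) = θU(a) − a with U(a) = log(a + 2), picks a made-up non-decreasing allocation rule on a θ-grid, builds the payments from the envelope formula, and verifies that no valuation prefers another valuation's option.

```python
import numpy as np

# Numerical check of (MON) and (ENV). Illustrative assumptions, not the
# paper's primitives: u2(a, theta) = theta*U(a) - a with U(a) = log(a + 2).

U = lambda a: np.log(a + 2.0)

theta = np.linspace(0.5, 2.0, 301)       # grid over the valuation space
x = np.clip(theta - 1.0, -0.8, 0.9)      # a candidate allocation rule
assert np.all(np.diff(x) >= 0)           # (MON): x is non-decreasing

k = 0.0                                  # the scalar k^i
# (ENV): t(theta) = u2(x(theta), theta) - int_{theta_min}^{theta} U(x(y)) dy - k
u2 = theta * U(x) - x
integral = np.concatenate(
    ([0.0], np.cumsum(0.5 * (U(x[1:]) + U(x[:-1])) * np.diff(theta))))
t = u2 - integral - k

# theta-IC on the grid: V[i, j] = payoff of valuation theta_i from option j.
V = theta[:, None] * U(x)[None, :] - x[None, :] - t[None, :]
assert np.all(np.diag(V) >= V.max(axis=1) - 1e-9)   # truth-telling is optimal
```

Because U(x(·)) is non-decreasing, each trapezoid of the discrete envelope integral is bounded by the local value of U, which is exactly why the discrete incentive constraints hold without further adjustment.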
Consider now the constraints (IC^i). Using condition (ENV^j), we can explicitly write the expected utility to type i from the j-menu as

U^i(x^j, k^j) = E[u₁(x^j(θ̃), θ̃/β^i) − t^j(θ̃) | β = β^i]
             = ∫_{Θ^i} [u₁(x^j(θ), θ/β^i) − u₂(x^j(θ), θ) + ((1 − F^i(θ))/f^i(θ)) U(x^j(θ))] f^i(θ) dθ + ∫_{θ̲}^{θ̲^i} U(x^j(θ)) dθ + k^j,   (EU)

where F^i is the distribution that F induces on Θ^i conditional on the agent being type i. Using expression (EU), we can now study i's incentives to mimic −i. To do so, define

R^i(x^{−i}) := U^i(x^{−i}, k^{−i}) − U^{−i}(x^{−i}, k^{−i})

and rewrite (IC^i) as

U^i(x^i, k^i) ≥ U^{−i}(x^{−i}, k^{−i}) + R^i(x^{−i}).   (R-IC^i)
We see that i’s willingness to mimic i depends on two things: the …rst, U i (x i ; k i ), is the
overall expected payo¤ to i; the second, Ri (x i ), is how much i expects to gain or lose, relative
to i, by facing the menu of i while being of type i. The function Ri (x i ) is crucial, because it
determines whether type i will enjoy information rents due to the allocation sustained with type
i. By analogy, recall that in the screening model of Mussa and Rosen (1978), a high-valuation
buyer has an incentive to misreport his type because he values more any quality o¤ered to lowvaluation buyers (the analog of x i ) than they do. This advantage ultimately determines his
information rents.
In my model, it is type C who enjoys an information advantage at the contracting stage, provided that the I-menu involves some flexibility.

Proposition 1 Time-consistent agents enjoy an information advantage over time-inconsistent agents: For any direct revelation mechanism {X, T} that satisfies (θ-IC), R^C(x^I) ≥ 0 and R^I(x^C) ≤ 0. Furthermore, the (dis)advantage of one type disappears if and only if the menu of the other type involves no flexibility: R^i(x^{−i}) = 0 if and only if x^{−i} is constant over (θ̲, θ̄).17
The intuition for Proposition 1 is simple. Given the menu (x^j, k^j), C's and I's future choices reflect different preferences: u₂(a, s, β^C) and u₂(a, s, β^I). On the other hand, today both C and I use the same preference to evaluate their future choices, and this preference coincides with that governing C's choices. Therefore, today both C and I must like C's future choices better than I's. This explains the first part of the proposition.18 Moreover, today C must be strictly better off than I if their future choices differ with positive probability. As shown in the proof, C's and I's choices will differ with positive probability unless x^j corresponds to a singleton menu. In summary, the cause of i's (dis)advantage is, and can only be, the different behavior of i and of j given the j-menu, a difference that can arise only if the j-menu features some flexibility.
17 The allocation x^{−i} can jump at θ̲ and θ̄ without making R^i(x^{−i}) ≠ 0 simply because the distribution F is atomless. Since this indeterminacy has no economic content, hereafter I focus on the extension of x^{−i} to [θ̲, θ̄] by continuity, whenever possible.
18 In the proof, I only assume that 0 < β^I < β^C ≤ 1. So, what really matters is that C's future preference is 'closer' to today's preference than I's future preference.
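Proposition 1's inequalities can be illustrated numerically. The sketch below uses made-up primitives (the function U, the state grid, and the values of β are my assumptions, not the paper's): both types evaluate a menu with the same period-1 criterion, but self-2's choice discounts U(a) by β, so the type whose self-2 maximizes the period-1 criterion weakly prefers any menu ex ante, with equality for a singleton menu.

```python
import numpy as np

# Illustration of Proposition 1 with made-up primitives: period-2 payoff
# beta*s*U(a) - a - p, period-1 payoff s*U(a) - a - p, U(a) = log(a + 2),
# s uniform on a grid, beta_C = 1 (time consistent), beta_I = 0.5.

U = lambda a: np.log(a + 2.0)
S = np.linspace(1.0, 4.0, 200)
beta_C, beta_I = 1.0, 0.5

def expected_payoff(menu, beta):
    """Period-1 expected payoff when self-2 discounts U(a) by beta."""
    total = 0.0
    for s in S:
        a, p = max(menu, key=lambda ap: beta * s * U(ap[0]) - ap[0] - ap[1])
        total += s * U(a) - a - p        # evaluated with period-1 preferences
    return total / len(S)

flexible = [(-1.0, 0.0), (2.0, 1.0)]     # a flexible (action, payment) menu
singleton = [(0.5, 0.2)]                 # a menu with no flexibility

R_C = expected_payoff(flexible, beta_C) - expected_payoff(flexible, beta_I)
assert R_C > 0                           # C's advantage under a flexible menu
# ...and the advantage vanishes when the menu is a singleton:
assert expected_payoff(singleton, beta_C) == expected_payoff(singleton, beta_I)
```

With these parameters the two types' self-2 choices differ in high states, which is exactly where C's ex-ante advantage comes from; collapsing the menu to one option removes it.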
To further understand Proposition 1, it is useful to think about what would change if the agents were heterogeneous, but all time consistent. Specifically, consider a model that is identical to mine except that the agents' utility function is u₂(a, s, β) − p in both periods, and suppose that U(a) > 0.19 Call I_c the time-consistent agent with β = β^I, and C the one with β = β^C. Note that C expects to have a systematically higher valuation than I_c, as she did before relative to I. But now C has an advantage relative to I_c because, even if their future choices coincide, C enjoys a more than I_c already today. In particular, C's advantage is at work even if the menu of I_c is a singleton.
Using Proposition 1, we can now simplify problem P as follows.
Corollary 1 The DM {X, T} solves problem P if and only if X solves

P′ := max_X μ[W^C(x^C) − αR^C(x^I)] + (1 − μ)W^I(x^I)

s.t. x^C, x^I ∈ M,

−R^I(x^C) ≥ R^C(x^I).   (RR)
By Corollary 1, x^C and x^I solve the principal's problem if and only if they are non-decreasing and satisfy condition (RR), which requires that C's advantage under the I-menu not exceed I's disadvantage under the C-menu. That (RR) is necessary comes from combining the constraints (R-IC^C) and (R-IC^I); that it is also sufficient depends on having only two types.
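The necessity step amounts to adding the two period-1 incentive constraints, in the notation of (R-IC^i):

```latex
% (R-IC^C):  U^C(x^C,k^C) \ge U^I(x^I,k^I) + R^C(x^I)
% (R-IC^I):  U^I(x^I,k^I) \ge U^C(x^C,k^C) + R^I(x^C)
\begin{align*}
U^C(x^C,k^C) + U^I(x^I,k^I)
  &\ge U^C(x^C,k^C) + U^I(x^I,k^I) + R^C(x^I) + R^I(x^C)\\
\Longrightarrow\quad 0 &\ge R^C(x^I) + R^I(x^C)
\quad\Longleftrightarrow\quad -R^I(x^C) \ge R^C(x^I).
\end{align*}
```

The last inequality is exactly condition (RR).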
Finally, the next lemma gives necessary and sufficient conditions for x^C and x^I to solve problem P′ (and hence P). It relies on infinite-dimensional Lagrangian techniques that do not require assuming any property of x^C and x^I beyond the necessary monotonicity condition.

Lemma 2 (Optimality) The allocation rules x^C and x^I solve problem P′ if and only if, for some real number γ ≥ 0, (x^C, x^I, γ) satisfies

x^i ∈ arg max_{x ∈ M} W^i(x) − λ^{−i}R^{−i}(x) for i = C, I,

R^C(x^I) + R^I(x^C) ≤ 0, and γ[R^C(x^I) + R^I(x^C)] = 0,

where λ^C := αμ/(1 − μ) + γ and λ^I := γ(1 − μ)/μ.
Note that, by Lemma 2, we can essentially characterize the optimal x^C and x^I by solving two distinct maximizations, with monotonicity as the only remaining constraint. Moreover, the form of these maximizations suggests deriving each x^i and its properties directly as a function of the weight on the rent term. For this reason, it is worth observing at this point that λ^C is increasing in how much the principal cares about profits (α), how much type C matters for her profits (μ), and how much condition (RR) binds (γ). On the other hand, λ^I is increasing in γ but decreasing in μ. I consider x^I first.
19 A similar model appears in Courty and Li (2000).
4.2 The Optimal Screening Device for Time-Inconsistent Agents
In this section I characterize the properties of the screening device that the principal designs for type I. I show that she distorts the I-menu to limit the rents to type C, and that these distortions always involve curtailing flexibility, as described in the initial example about savings devices. I also show how the distortions change as a function of the market share of each type and the weight that the principal assigns to profits.
By Lemma 2, when designing the I-menu, in general the principal faces a trade-off between maximizing the surplus with I and extracting rents from C. On the one hand, if she offers a flexible I-menu that sustains the efficient outcome with I, then she maximizes not only the surplus with I's self-1, but also her profits from this menu. On the other hand, by Proposition 1, a flexible I-menu triggers C's advantage, thereby boosting his information rents. Since these rents reduce the profits from any C-menu, the principal faces a trade-off. Similarly, in Mussa and Rosen (1978), serving low-valuation buyers efficiently requires granting a high-valuation buyer rents that make trading with him less profitable. However, such rents never lead low-valuation buyers to choose the quality designed for the high-valuation buyer. In the present model, instead, the principal also has to worry about the rents that the I-menu creates for C because they may jeopardize I's incentives for choosing the I-menu (i.e., they may lead to a violation of the condition (RR)).
To gain some intuition for why the trade-off between surplus with I and rents to C leads to distorting the flexibility of the I-menu, consider first an analogy with Mussa and Rosen (1978). In their model, to curb the rents to high-valuation buyers, the monopolist distorts the quality for low-valuation buyers below the efficient level. This is because lowering such a quality curbs the advantage of the high-valuation buyers. Similarly, in the present model, the principal should distort the feature of the I-menu that causes C's advantage, namely, its flexibility. To further help intuition, consider a simple example with two states: s_d > s_w. The efficient choices are to deposit a_d > 0 in s_d and withdraw a_w < 0 in s_w. To commit I to these choices, the I-menu must provide his self-2 with appropriate incentives: Intuitively, the payment 'premium' p^I_w − p^I_d must be large enough for I's self-2 not to choose a_w in s_d. Suppose that, as a result, C would always choose (a_d, p^I_d) from the I-menu, thus expecting a strictly higher payoff from it than does I.20 Also, suppose that I's incentive to choose a_w over a_d decreases as a_w increases (i.e., it becomes less negative). Then, by allowing I to withdraw less than a_w, say â, the principal can decrease the premium (p̂^I − p^I_d) that makes I's self-2 withdraw only in s_w. Since I's choice in s_w is now closer to C's, C gains less from his stronger self-control. But since efficiency calls for withdrawing a_w in s_w and depositing a_d in s_d, having only the options â and a_d amounts to inefficiently low flexibility for I.
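The two-state example can be made concrete with a small numerical sketch. All functional forms below are illustrative assumptions, not the paper's primitives: U(a) = log(a + 2), self-2 payoff βsU(a) − a − p, and zero production cost, so that the efficient action is a*(s) = s − 2. The parameters are chosen so that C always deposits from the I-menu while I withdraws in s_w.

```python
import numpy as np

# Two-state sketch. Illustrative assumptions (not the paper's primitives):
# U(a) = log(a + 2), self-2 payoff beta*s*U(a) - a - p, zero production cost,
# so the efficient action solves s*U'(a) = 1, i.e., a*(s) = s - 2.

U = lambda a: np.log(a + 2.0)
s_w, s_d = 1.0, 4.0            # withdrawal state and deposit state
pi = 0.5                       # probability of s_w
beta_I = 0.1                   # I's self-2 heavily discounts U(a)
a_d = s_d - 2.0                # efficient deposit a_d = 2

def min_premium(a_hat):
    """Smallest premium p_w - p_d deterring I's self-2 from taking a_hat in s_d."""
    return (a_d - a_hat) - beta_I * s_d * (U(a_d) - U(a_hat))

def rent_C(a_hat):
    """C's ex-ante gain over I from the I-menu {(a_hat, p_w), (a_d, p_d)} when
    the premium is minimal, C deposits in both states, and I takes a_hat in s_w."""
    dp = min_premium(a_hat)
    return pi * (s_w * (U(a_d) - U(a_hat)) - (a_d - a_hat) + dp)

# Raising a_hat from the efficient withdrawal a_w = -1 toward a_d shrinks
# both the commitment premium and C's rent.
grid = np.linspace(-1.0, 1.5, 6)
premia = [min_premium(a) for a in grid]
rents = [rent_C(a) for a in grid]
assert all(np.diff(premia) < 0) and all(np.diff(rents) < 0)
assert min(premia) > 0 and min(rents) > 0
```

Under these assumptions C's rent reduces to π(s_w − β^I s_d)[U(a_d) − U(â)], which is strictly decreasing in â: shrinking the gap between I's and C's choices lowers the premium and, with it, C's advantage.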
The bottom line is then this: Reducing the gap between I's and C's choices from the I-menu curbs C's advantage, through the changes in the prizes and penalties (t) that help I commit.21 Therefore, we should reduce I's ability to respond to future information below the efficient level. The optimal way to do so, however, is less immediate than lowering quality as in Mussa and Rosen. We may curtail I's flexibility for intermediate states but leave it for extreme states, where it seems more valuable, or we could do the opposite.

20 By Lemma 5 in Appendix A, for such values of p^I_w and p^I_d, C's self-2 will always choose (a_d, p^I_d) from the I-menu, provided that β^I is small enough.
21 A further discussion of the role of payments appears in Section 6.2 in relation to the model of Amador et al. (2006), which rules them out.
To characterize the optimal allocation x^I, one of the paper's main results, I rewrite the objective W^I(x^I) − λ^C R^C(x^I) in Lemma 2 as an expected virtual surplus (see Appendix B for details). This surplus assigns an intricate virtual valuation to each choice x^I(θ) of I; moreover, it keeps track of how C's rents depend on options x^I(θ) that only C would choose from the I-menu. Before stating the result, let x^I* be the allocation rule that defines the symmetric-information I-menu (up to the scalar k^I): x^I*(θ) = a*(θ/β^I) for θ ∈ Θ^I and x^I*(θ) = x^I*(θ̄^I) otherwise. Also, let

a_nf := arg max_{a ∈ A} E(s)U(a) − a − c(a),

which is the ex-ante efficient action if the agent is not allowed to respond to future information at all; 'nf' stands for 'no flexibility'.
Proposition 2 (Optimal I-menu) For every λ^C ≥ 0, there exists an allocation rule x^I(λ^C) that is non-decreasing in θ and maximizes W^I(x^I) − λ^C R^C(x^I). The optimal x^I(λ^C) is unique and continuous in θ and λ^C. If λ^C > 0, x^I(λ^C) features:

(a) Range Reduction with 'Underconsumption' at the Bottom and 'Overconsumption' at the Top: x^I(θ, λ^C) > x^I*(θ) on a non-empty interval [θ̲^I, θ′), and x^I(θ, λ^C) < x^I*(θ) on a non-empty interval (θ″, θ̄^I];

(b) No Flexibility at the Top: x^I(λ^C) is constant over [θ^b, θ̄] with θ^b < θ̄^I.

In addition it may feature

(c) No Flexibility at the Bottom: x^I(λ^C) can also be constant over [θ̲^I, θ_b] with θ_b > θ̲^I. A sufficient condition for this to happen is that, for all s′ > s in [s̲, s†] ≠ ∅,

[F(s′)/f(s′) − F(s)/f(s)]/(s′ − s) ≥ (1 − β^I)/β^I + 1/(β^I λ^C).

Finally,

lim_{λ^C → +∞} max_θ |x^I(θ, λ^C) − a_nf| = 0 and lim_{λ^C → 0} x^I(θ, λ^C) = x^I*(θ) for θ ∈ Θ.
The proof builds on Toikka's (2011) generalization of Myerson's (1981) ironing technique to initially characterize x^I on path (i.e., over Θ^I), and then explicitly constructs the optimal extension of x^I off path. Uniqueness follows from strict concavity of the function U. A few comments on the properties of the optimal I-menu follow.
By Property (a), the screening I-menu always restricts I's choices to a strict subset of the efficient set of actions, which corresponds to the range of x^I*. The I-menu will in general let I act on his future information: x^I need not be constant at a_nf, and by continuity this flexibility extends over a range of states with a rich set of options. For example, the I-menu will feature some flexibility when the share of time-consistent agents is small (so that λ^C is small) and the utility U(a̲) of the smallest action is low enough (see Corollary 2).
The principal makes I 'overconsume' over large states and 'underconsume' over small states for the following reason. Recall that to curb C's rents, she has to shrink the gap between C's and I's choices implied by x^I, as doing so also shrinks the payment premia between them. Therefore, the key is understanding how distorting x^I affects t^I (through (ENV^I)). To do so, fix θ ∈ Θ^I. On the one hand, raising x^I(θ) has a positive effect on t^I(θ) and a negative effect on t^I(θ′) for every θ′ above θ. So, the difference in actions as well as in payments between θ and θ′ shrinks for every θ′ > θ, but rises for every θ′ < θ. On the other hand, reducing x^I(θ) has the opposite effects. Thus, curbing C's rents involves raising x^I(θ) for θ close to θ̲^I because, intuitively, the mass of θ′ > θ prevails over that of θ′ < θ. Instead, it involves reducing x^I(θ) for θ close to θ̄^I, because for such θs the mass of θ′ < θ prevails over that of θ′ > θ. Of course, the principal must weigh the beneficial effects on R^C of distorting x^I against its welfare costs. Nevertheless, the direction of the distortions does not change.
By Property (b), the screening I-menu always makes I not respond to information over a range of large states. This bunching destroys surplus with I but also curtails C's rents, for the same reason as before. To see why, fix an I-menu and identify the largest action chosen by I, x^I(θ̄^I). If any item has a > x^I(θ̄^I), then tossing it does not affect I's choices, but curbs C's rents: In the states with valuations above θ̄^I, a dishonest C would now choose the smaller action x^I(θ̄^I), which is closer to what I chooses. This argument applies a fortiori if, over those states, C would pick an even smaller action, say a_b < x^I(θ̄^I). This extra step, however, requires that also I choose a smaller action for some s below s̄. The smaller is the set of s affected by this change, the smaller is the surplus loss with I. So, the best strategy is to bunch at a_b every θ that was choosing a larger action (formally, every θ > θ^b) and not to change the allocation for all other θs. Finally, for every threshold θ^b < θ̄^I, a dishonest C would behave more similarly to an honest I for the states in [θ^b, s̄], whereas I must choose a smaller action only for the states in [θ^b/β^I, s̄]. Therefore, for θ^b close enough to θ̄^I = β^I s̄, the surplus with I falls less than do C's rents, and some bunching is always optimal.
By Property (c), the screening I-menu can also make type I not respond to information over a range of small states. By Property (a), the principal curbs C's rents by inefficiently raising x^I(θ) for small θs. However, recall that doing so increases the difference in actions and payments between θ and all θ′ < θ. So, intuitively, the larger is θ, the less the principal wants to raise x^I(θ). In particular, she may wish that a larger θ chose a smaller action than does a smaller θ′. Since this violates the monotonicity condition (MON^I), at most the principal can pool bottom self-2 valuations of type I at the same action. The effect of raising x^I(θ) on the payments of all θ′ < θ gains weight as the ratio F^I(θ)/f^I(θ), which equals β^I F(θ/β^I)/f(θ/β^I), grows. So, the principal will wish to implement a decreasing allocation for θ close to θ̲^I if, in this region, F(θ/β^I)/f(θ/β^I) grows fast enough, as formally stated in Proposition 2. Note that this condition on F is more likely to hold when I's degree of time inconsistency is weaker (i.e., β^I is closer to 1), and when the principal cares more about the rents to C (i.e., λ^C is higher).
Finally, the last part of Proposition 2 describes the distortions of the screening I-menu as the weight λ^C becomes arbitrarily large or small. As the principal becomes extremely concerned about C's rents, she tends to disregard I's desire for flexibility, in the limit offering I only the option a_nf, a radical reduction of flexibility vis-à-vis the first best. For example, whenever the principal cares about profits, she tends to induce such an outcome with I if C represents almost the entire market (if α > 0, λ^C → +∞ as μ → 1). If instead the principal cares very little about C's rents, she tends to offer an I-menu that sustains the efficient outcome.
By introducing further assumptions about the distribution F, it is possible to achieve a more detailed comparison of the screening I-menu (i.e., x^I(λ^C)) with the efficient one (i.e., x^I*). This is because, in general, the virtual valuation defining x^I has a complex form. For illustrative purposes, I consider here the case with uniform F. In this case, a simple relationship also emerges between the weight λ^C and the set of actions in the I-menu, as well as the range of states involving flexibility (i.e., [θ_b, θ^b]).
Lemma 3 Suppose that s is uniformly distributed and β^I > s̲/s̄. Then, the allocation rule x^I(λ^C) crosses x^I* only once and is increasing over [θ_b, θ^b]. Furthermore, as λ^C rises, x^I(λ^C) changes as follows: θ^b and x^I(θ^b, λ^C) decrease, and x^I(θ_b, λ^C) increases; when θ_b > θ̲^I, then θ_b increases as well.

In general, more complicated patterns can occur, including bunching at intermediate points for standard ironing reasons. The main point is that, in the present model, bunching at the bottom and at the top arises for new reasons, which hold for a general class of distributions F.
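The ironing step behind these results can be sketched in a few lines. The procedure, in the spirit of Myerson's technique, replaces the running integral of a non-monotone virtual valuation by its lower convex hull; the hull's slope is the ironed, non-decreasing valuation, and intervals where the hull lies strictly below the integral are bunching intervals. The virtual valuation used below is made up purely for illustration.

```python
import numpy as np

# Sketch of ironing: replace the running integral H of a non-monotone virtual
# valuation psi by its lower convex hull; the hull's slope is the ironed
# (non-decreasing) valuation. The virtual valuation here is made up.

theta = np.linspace(0.0, 1.0, 501)
psi = theta + 0.35 * np.sin(6 * np.pi * theta)   # non-monotone virtual valuation

h = theta[1] - theta[0]
H = np.concatenate(([0.0], np.cumsum(0.5 * (psi[1:] + psi[:-1]) * h)))

# Lower convex hull of the points (theta, H), computed by a monotone scan.
hull = [0, 1]
for i in range(2, len(theta)):
    while len(hull) >= 2:
        j, m = hull[-2], hull[-1]
        # drop m if it lies on or above the chord from j to i
        if (H[m] - H[j]) * (theta[i] - theta[j]) >= (H[i] - H[j]) * (theta[m] - theta[j]):
            hull.pop()
        else:
            break
    hull.append(i)

H_hull = np.interp(theta, theta[hull], H[hull])
psi_ironed = np.gradient(H_hull, theta)

assert np.any(np.diff(psi) < 0)                  # psi itself is not monotone
assert np.all(np.diff(psi_ironed) >= -1e-8)      # ironed valuation is monotone
assert np.all(H_hull <= H + 1e-12)               # hull lies below the integral
```

Flat segments of the hull are exactly the pooled regions; under this procedure an allocation responding to psi_ironed instead of psi satisfies the monotonicity constraint by construction.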
4.3 The Optimal Screening Device for Time-Consistent Agents
In this section I characterize the screening device that the principal designs for type C. I show that, to deter I from mimicking C, the principal may have to modify the efficient C-menu by adding to it unused options and, if these options are not deterring enough, by also distorting C's choices.
By Lemma 2, when designing the C-menu, the principal wants to maximize the surplus with C, but she may also have to worry about how the C-menu affects I's incentives for revealing his time inconsistency. By condition (RR), she can sustain the efficient outcome with C, given a certain I-menu, if and only if I's disadvantage when facing the C-menu (−R^I(x^C)) is at least as large as C's advantage when facing the I-menu (R^C(x^I)). A natural benchmark of a C-menu sustaining the efficient outcome with C is the symmetric-information menu. This menu is defined, up to the scalar k^C, by the allocation rule x^C* with x^C*(θ) = a*(θ/β^C) for θ ∈ Θ^C and x^C*(θ) = x^C*(θ̲^C) otherwise. The next lemma shows that x^C* need not satisfy condition (RR) for any allocation rule x^I.

Lemma 4 Let x^C* and x^I* be the allocation rules associated with the efficient outcome a*. There is a family of distributions F such that (x^C*, x^I*) violates condition (RR) and is therefore infeasible.
The intuition for Lemma 4 is as follows. Note that, under x^C*, in states close to s̲ type I and type C behave similarly, so I's lack of self-control hurts him only a little. Therefore, if such states are likely enough according to F, then ex ante I values the C-menu almost as much as does C, i.e., −R^I(x^C*) is small. On the other hand, since x^I* is flexible, the C-menu must grant C positive rents (R^C(x^I*) > 0), and if these rents are large enough, they will lure I to mimic C. One can show, for example, that if U(a) = √(2a) and F is uniform, then (x^C*, x^I*) is infeasible.
Although Lemma 4 looks at the extreme case of x^C* and x^I*, by the continuity properties of the principal's problem its conclusion extends to a larger set of cases. By Proposition 2, as the principal cares less about C's rents (i.e., λ^C → 0), she tends to design an I-menu that resembles the first-best one, i.e., x^I*. Therefore, if x^C* and x^I* are infeasible, then by continuity x^C* and x^I(λ^C) are also infeasible for λ^C small enough. On the other hand, as the principal cares more about C's rents (i.e., λ^C → +∞), she tends to reduce the I-menu to a single option. So, again by continuity, x^C* and x^I(λ^C) are always feasible for λ^C large enough. For the cases in which x^C* and x^I(λ^C) are infeasible, the question becomes how the principal modifies x^C to ensure that I will report truthfully.
The first strategy that the principal adopts to strengthen I's incentives to reveal his type is unconventional. In models with time-consistent agents, the principal has to distort her offer to one type, and hence his choices, whenever this offer causes mimicking by another type. Instead, in the present model with time-inconsistent agents, the principal may successfully deter I from mimicking C while still offering a C-menu that sustains the efficient outcome with C. To do that, she adds to the C-menu tempting items that exacerbate I's disadvantage when facing such a menu. Since these items are off path, there are many ways to usefully employ them. However, to maximize I's disadvantage under the efficient C-menu, the principal only needs to add one item to it, as shown in the next proposition.22
Proposition 3 (Usefulness of Unused Items) Let the allocation rule x^C_u satisfy x^C_u(θ) = x^C*(θ) for θ ∈ Θ^C and

x^C_u(θ) = x^C*(θ̲^C) if θ_d ≤ θ < θ̲^C, and x^C_u(θ) = a̲ if θ < θ_d.

Then, there exists θ_d > θ̲^I such that x^C_u minimizes R^I(x^C). Furthermore, the rule x^C that sustains the efficient outcome with C and minimizes R^I(x^C) is unique except possibly over [θ_d, θ_u], where θ_d and θ_u depend only on β^I and F.
To relax the constraint (R-IC^I), the principal adds to the C-menu one item that C never chooses, but I would choose in a range of small states. If the state is small enough and I faces the C-menu with the unused item, then I's choice will differ more from C's than if I faced the C-menu without the unused item; I is therefore less willing to mimic C. Although other strategies involving multiple unused items can deter I's mimicking, the most effective strategy requires just one unused item. This is because, from the viewpoint of self-1, the worst possible temptation is a̲, and since this option is off path, the principal does not have to worry about the cost of producing it. However, note that to make lying the least attractive for I, the principal may restrict the range of self-2 valuations for which I would choose a̲; i.e., it is possible that θ_u < θ̲^C. Intuitively, the payment for a̲ controls both the probability and the disappointment that I assigns ex ante to choosing it when C doesn't. A low payment, tailored to a θ_d close to θ̲^C, makes I expect that he would choose a̲ in a larger set of states, but with little disappointment. Instead, a high payment, tailored to a θ_d close to θ̲^I, makes I think that he would choose a̲ in a smaller set of states, but with much disappointment. Clearly, depending on the distribution F, charging the high payment for a̲ may deter mimicking more.
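The role of a single unused item can be illustrated numerically. The primitives below are my assumptions, not the paper's: with U(a) = log(a + 2) and self-2 payoff βsU(a) − a − p, a singleton C-menu is enlarged by one withdrawal option priced so that C's self-2 never takes it while I's self-2 takes it in low states, which strictly lowers I's ex-ante valuation of the C-menu.

```python
import numpy as np

# Illustration of Proposition 3's logic with made-up primitives:
# U(a) = log(a + 2), self-2 payoff beta*s*U(a) - a - p,
# period-1 payoff s*U(a) - a - p, s uniform on a grid.

U = lambda a: np.log(a + 2.0)
S = np.linspace(1.0, 4.0, 400)
beta_C, beta_I = 1.0, 0.5
a_d, p_d = 2.0, 0.0
a_low = -1.5                              # the tempting unused action

def value(menu, beta):
    """Period-1 expected payoff when self-2 discounts U(a) by beta."""
    vals = []
    for s in S:
        a, p = max(menu, key=lambda ap: beta * s * U(ap[0]) - ap[0] - ap[1])
        vals.append(s * U(a) - a - p)
    return float(np.mean(vals))

base = [(a_d, p_d)]
for p_low in (1.6, 2.2):                  # two candidate payments for the unused item
    menu = base + [(a_low, p_low)]
    # C's self-2 never takes the unused item...
    assert all(max(menu, key=lambda ap: s * U(ap[0]) - ap[0] - ap[1]) == (a_d, p_d)
               for s in S)
    # ...but its presence lowers I's ex-ante valuation of the C-menu.
    assert value(menu, beta_I) < value(base, beta_I)
```

Raising p_low shrinks the set of states in which I expects to take the unused item but raises the ex-ante disappointment in each of them; which payment deters mimicking more depends on the distribution of states.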
Although enlarging the C-menu as described in Proposition 3 maximally relaxes the constraint (R-IC^I), it may still fail to make a pair of allocations implementable. Indeed, the enlarged C-menu satisfies condition (RR) if and only if the unused action a̲ is tempting enough.

Corollary 2 Suppose that the allocations x^C* and x^I are not implementable. Then, there is a threshold Δ(x^C, x^I) > 0 such that x^C_u as in Proposition 3 and x^I are implementable if and only if U(a̲) ≤ U(x^C(θ̲^C)) − Δ(x^C, x^I).

22 I discuss how my results relate to Esteban and Miyagawa's (2006b) use of 'decorated' menus in Section 6.2.
Thus, if the principal can enlarge the C-menu with a tempting enough a̲, she can always sustain the efficient outcome with C. The lowest feasible action a̲ can be sufficiently tempting, relative to the lowest efficient action a*(s̲), for several reasons. The principal may be able to offer very unattractive actions, without any technological or legal constraint. For instance, if we view U(a) as the future consequences of some current action, e.g., shopping with credit cards, the worst consequence U(a̲) is likely far worse than the efficient one U(a*(s̲)), which takes into account the current utility of a and the cost c(a), e.g., the cost of a default.
Combining the insights of the previous results, the next proposition summarizes the possible configurations of the screening C-menu as a function of C's market share (μ) and the weight that the principal assigns to profits (α). Recall that λ^C = αμ/(1 − μ) + γ, where γ ≥ 0 is the Lagrangian multiplier associated with condition (RR).

Proposition 4 (Optimal C-menu) There exist thresholds λ^C_1 and λ^C_2, with 0 ≤ λ^C_1 ≤ λ^C_2 < +∞, such that the following holds:

(a) If λ^C ≥ λ^C_2, the C-menu sustains the efficient outcome a* and need not include unused items;

(b) If λ^C_1 < λ^C < λ^C_2, the C-menu sustains a* and includes unused items;

(c) If λ^C ≤ λ^C_1, the C-menu does not sustain a* and includes unused items.

Finally, there exist environments with 0 < λ^C_1 < λ^C_2.
A few remarks follow.
First, part (c) implies that the 'no distortion at the top' property, common in standard screening models, fails in the present model. In standard models, the agents of the 'strongest' type usually achieve an undistorted outcome, as if information were symmetric; for example, in Mussa and Rosen (1978) the buyers with the highest valuation always trade efficiently with the monopolist. In the present model, Proposition 1 suggests that C is the 'strong' type: Just as in Mussa and Rosen the buyers with the highest valuation value any (positive) quality strictly more than do low-valuation buyers, in the present model C values any (flexible) menu strictly more than does I. However, both C and I can receive a menu sustaining an inefficient outcome.
Second, the C-menu is more likely to feature unused items and (possibly) be inefficient when C represents a smaller share of the market (i.e., μ is lower) or the principal cares less about profits (i.e., α is lower). Intuitively, when either μ or α is lower, the principal is willing to grant larger rents to C. But these rents make I more willing to mimic C, so the principal has to balance them by weakening the degree of commitment that I expects from the C-menu. In particular, we have the following corollary.

Corollary 3 If the monopolist (α = 1) has to add unused items to the C-menu, so does the planner (α < 1). Similarly, if the monopolist's solution violates the 'no distortion at the top' property, so does the planner's.

In particular, offering C unused items may be necessary for the planner to be able to sustain the efficient outcome with both types.
Finally, when unused items alone cannot ensure I's truthfulness, the distorted x^C must maximize W^C(x) − λ^I R^I(x) with λ^I > 0. The weight on R^I calls for designing the C-menu so as to worsen I's disadvantage, which depends on the gap between I's and C's choices from the C-menu. To gain some intuition, consider again the savings example with two states. Suppose we start from the efficient C-menu, with only the options to withdraw a_w and deposit a_d, and we add to it the option to withdraw a̲. Recall that, to deter I from mimicking C, the payment for a̲ should be large enough relative to the others. Now suppose that I enjoys withdrawing a̲ more than a_w, but only by a little. To charge a relatively large payment for a̲, we must then raise a_w and (possibly) a_d above efficiency. When there are many states and self-2's valuations overlap across types, deriving the optimal x^C requires more work. The principle is, however, the same: make I's and C's choices from the C-menu more different, in terms of both actions and payments.
4.4 Extension: Many Degrees of Time Inconsistency
I now extend the analysis of the two-type model to a model with more realistic variety in self-control problems. Screening any two agents who have different degrees of time inconsistency raises the same issues as screening C and I. However, with more than two types, deriving the optimal supply of menus is more intricate. Nonetheless, I show that the menu of the least time-inconsistent agents, who represent the 'strongest' type, resembles the C-menu in the two-type model. The menus of the other agents, instead, may inefficiently curtail flexibility; if so, they again induce 'over-consumption' and no flexibility at the top, and 'under-consumption' and possibly no flexibility at the bottom. Finally, not only the menu of the least time-inconsistent agents may feature unused items, but also the menus of intermediate types.
To consider a population featuring more than two degrees of time inconsistency, I modify the model in Section 2 as follows. Let the set of types be {β¹, ..., β^N}, with N > 2 finite; for convenience, order the types so that a higher index coincides with a stronger degree of time inconsistency, i.e., 1 ≥ β¹ > β² > ... > β^N > 0. Finally, let the share of type i be μ^i > 0. All the other primitives remain as in Section 2.
As in Section 4.1, we can study the principal's problem using direct mechanisms. A DM is an array {X, T} = (x^i, t^i)^N_{i=1}, with x^i : Θ → A and t^i : Θ → R, where Θ = co(∪^N_{i=1} Θ^i). It is still without loss of generality to consider DMs such that each (x^i, t^i) satisfies the constraint (θ-IC) (see problem P), so that we can again describe each i-menu as a pair (x^i, k^i) with x^i ∈ M. Letting the payoff that type i expects from the j-menu be U^i(x^j, k^j) as in expression (EU), the period-1 incentive and participation constraints are

(R-IC^{i,j}): U^i(x^i, k^i) ≥ U^j(x^j, k^j) + R^i(x^j)   and   (IR^i): U^i(x^i, k^i) ≥ 0.

For convenience, I have already written the constraints (IC^{i,j}) in terms of the functions R^i(x^j), as in Section 4.1.
With regard to each type’s information (dis)advantage, Proposition 1 generalizes as follows.
A less time-inconsistent agent enjoys an information advantage over any more time-inconsistent
agent; i.e., Ri (xj ) 0 if i < j, and Ri (xj ) 0 if i > j. Moreover, i’s (dis)advantage disappears
if and only if the j-menu makes i and j behave (almost) always identically; i.e., Ri (xj ) = 0 if
i j
and only if xj is constant over (minf i ; j g; maxf ; g). For outside this range, neither i nor
j get to choose xj ( ), so Ri (xj ) cannot depend on it.
With regard to period-1 incentive compatibility, two other insights extend to the general model. First, the information rents granted to a less time-inconsistent agent can jeopardize the incentives of a more time-inconsistent agent to disclose his self-control problems. Second, adding unused items to the menu of a less time-inconsistent agent helps promote truthfulness by a more time-inconsistent agent. Moreover, the most effective way to do so is still given by Proposition 3, except that the thresholds d and u now depend on β^i, β^j, and F.
Consider now the problem of finding an optimal DM {X, k} = (x^i, k^i)_{i=1}^N. Recall that the expected profits from the i-menu are ∫[t^i(θ) − c(x^i(θ))]dF^i, or equivalently W^i(x^i) − U^i(x^i, k^i), where W^i(x^i) is the expected surplus that x^i induces with i's self-1 (see (EU)). We can then write the principal's problem as follows:
    P^N :=  max_{X,k} (1 − α) Σ_{i=1}^N μ^i W^i(x^i) + α Σ_{i=1}^N μ^i [W^i(x^i) − U^i(x^i, k^i)]

            s.t. x^i ∈ M, (R-IC^{i,j}), and (IR^i), for all i, j.
With more than two types, the problem P^N raises new difficulties, which I solve with a novel approach. First, in general, local necessary conditions for period-1 incentive compatibility do not imply global incentive compatibility. A similar issue appears in the literature on dynamic mechanism design with only time-consistent agents, which has developed a specific approach to it (Courty and Li (2000); Pavan, Segal, and Toikka (2011)). This approach involves introducing additional, potentially ad-hoc, restrictions on the primitives so that local conditions ensure global incentive compatibility. I find using this approach for studying a new screening problem unappealing; moreover, finding effective, let alone reasonable, restrictions is particularly hard in my context. Second, in my context, there is no ordering of types that allows one to identify, a priori, which constraints will bind. For these two reasons, to solve P^N, I take a different approach that does not introduce new restrictions on the primitives and deals with period-1 incentive constraints all at once. My approach builds on Lagrangian methods and relies on having a finite number of constraints.
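The logic of handling all incentive constraints at once, with finitely many types, can be illustrated with a deliberately simplified static analogue of P^N. The quadratic surplus W, the linear mimicking-rent function R, and the numerical type values below are assumptions of this sketch, not objects from the paper; the sketch only mirrors the structure "profit = Σ μ^i [W^i(x^i) − U^i]" and the fact that the binding constraints emerge from the computation rather than being fixed a priori.

```python
# Toy 3-type screening sketch (illustrative functional forms, not the
# paper's model): for each profile of allocations, compute the smallest
# rents U^i consistent with ALL incentive constraints simultaneously,
# then search for the profit-maximizing profile.

import itertools

types  = [1.0, 0.8, 0.6]          # "strength" parameters t^1 > t^2 > t^3
shares = [0.3, 0.4, 0.3]          # population shares mu^i

def W(t, x):
    """Expected surplus from allocation x with a type-t agent."""
    return t * x - 0.5 * x ** 2

def R(ti, tj, xj):
    """Rent type i earns by mimicking type j's menu (assumed linear form)."""
    return (ti - tj) * xj

def min_rents(xs, n_iter=100):
    """Smallest U^i with U^i >= U^j + R^{ij}(x^j) and U^i >= 0, by iteration."""
    U = [0.0] * len(types)
    for _ in range(n_iter):
        newU = [max([0.0] + [U[j] + R(types[i], types[j], xs[j])
                             for j in range(len(types)) if j != i])
                for i in range(len(types))]
        if newU == U:          # fixed point reached: all constraints hold
            break
        U = newU
    return U

grid = [k * 0.1 for k in range(11)]            # candidate allocations
best = None
for xs in itertools.product(grid, repeat=3):   # brute force over menus
    U = min_rents(xs)
    profit = sum(m * (W(t, x) - u) for m, t, x, u in zip(shares, types, xs, U))
    if best is None or profit > best[0]:
        best = (profit, xs, U)

profit, xs, U = best
print("allocations:", xs, "rents:", [round(u, 3) for u in U])
```

In this toy, the strongest type's allocation is undistorted while the weaker types' allocations are curtailed, loosely echoing parts (a)-(c) of Proposition 5; which constraints end up active is only revealed by the search itself.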
Applying these methods reveals that, when designing each j-menu, the principal trades off the surplus with j and the rents that the menu causes for (some of) the less time-inconsistent agents (see Appendix B). Such a trade-off generalizes the one highlighted in the two-type model. Furthermore, the principal has to ensure that each j-menu deters any type who is more time-inconsistent than j from mimicking j. Based on the results in Section 4.3, I focus on the case in which unused items suffice to guarantee this property.
Proposition 5 Suppose U(a) is small so that unused items suffice to satisfy (R-IC^{i,j}) for i > j. A solution {X, k} to P^N exists with x^i unique over (θ̲^i, θ̄^i), for every i. Furthermore, we have the following:
(a) The 1-menu always sustains the efficient outcome a*;
(b) If α = 0, then all menus sustain a*; otherwise, at least the N-menu sustains a distorted outcome;
(c) If the i-menu sustains a distorted outcome, it features properties (a), (b), and possibly (c), as in Proposition 2;
(d) For every i < N, the i-menu may have to include unused items with a < x^i(θ̲^i).
As long as the principal cares about profits (α > 0), she will distort the menu of the most time-inconsistent agents, and possibly those of other, intermediate types. Whether intermediate types obtain a distorted menu depends on the exact configuration of the active incentive constraints, which cannot be predicted a priori.
5 Normative Implications for Paternalistic Savings Devices
In this section, I discuss the implications of my analysis for how a paternalistic government should design special SDs to tackle the tendency of some people to save too little for retirement. I argue that, at a broad level, my policy implications are consistent with some features of the U.S. regulation of 'tax-shielded' and taxable investment accounts.23

People's tendency to undersave for retirement is a well-documented phenomenon that has been attributed to self-control problems (Diamond (1977), Thaler (1994), Laibson (1998), Bernheim (2002)) and that has attracted governments' attention. To help people with self-control problems save adequately, a paternalistic government may want to create special SDs that provide appropriate tax incentives. However, when trying to sustain efficient savings decisions with all individuals, the government faces two natural constraints. First, it can usually allocate only limited fiscal resources to finance its policies. Second, people value flexibility, because they are uncertain about their best savings decisions in the future, and they have different tendencies to undersave, because their self-control varies. Since the government cannot observe people's self-control, it has to induce each individual to select voluntarily the SD that best fits his desires for commitment and for flexibility. The question is then how the government should design its SDs, given these constraints.
My model addresses this type of question. To see how, recall our initial example about SDs. Again, each SD allows the agents to contribute to (a > 0) and withdraw from (a < 0) the balance. Now, however, each choice of a leads to a tax burden or benefit p; think of p as the fiscal consequences of a distribution from, or a contribution to, an SD. As in our initial example, it seems natural that efficiency calls for distributions (a < 0) when spending in period 2 has high enough priority (s < s_0), and for contributions (a > 0) otherwise. Finally, since the government can allocate only limited resources to its SDs, it seems natural that it also takes into account their overall profitability, i.e., 0 < α < 1.
The results of Section 4 have the following implications:

The unobservability of people's self-control offers one economic justification for government intervention in the market for SDs. Indeed, asymmetric information leads profit-maximizing firms (α = 1) to supply SDs inefficiently, especially to the agents who actually need commitment devices.
When designing its SDs, a financially constrained government faces a trade-off between corrective and redistributive taxation; its solution calls for curtailing the liquidity of the special SDs, relative to the standard SDs. By Proposition 1, special SDs, tailored to time-inconsistent agents, make time-consistent agents less willing to choose the standard SDs tailored to them. To make the time-consistent agents choose the standard SDs, it is necessary to grant a tax discount on them. Doing so drains fiscal resources, in addition to any tax incentives for the time-inconsistent agents. Curbing the discount on the standard SDs calls for curtailing the flexibility, or liquidity, of the special SDs.

23 Of course, I do not intend this section to be seen as providing an explanation for all the complicated rules governing such accounts in the U.S.
To optimally curtail the liquidity of the special SDs, the government should set limits on the owners' freedom to make contributions and distributions. At one end, the special SDs should induce inefficiently low contributions by setting a binding cap (Proposition 2, Properties (a) and (b); Proposition 5); at the other end, they should induce inefficiently low distributions, possibly again using a binding cap (Proposition 2, Properties (a) and (c); Proposition 5). It is important to remark that, according to the present analysis, the reason for setting limits on distributions from the special SDs is not to help the time-inconsistent agents avoid depleting their savings. Indeed, if the government could observe the agents' self-control, its special SDs could and should help time-inconsistent agents efficiently respond to all future states, through appropriate penalties and rewards and not through curtailed flexibility (recall Section 3).
The government should not take for granted that time-inconsistent agents will always prefer the special SDs to the standard SDs. The tax discount on the standard SDs may lure the time-inconsistent agents into choosing them. To avoid this, the standard SD should include the possibility of taking large distributions in period 2, but make it so costly that only time-inconsistent agents would use it (see Section 4.3). For example, a standard SD could allow unlimited borrowing against its balance, and also charge very high penalties for loans beyond a certain threshold.24
For the sake of completeness, and with the extreme caution that the subject deserves, I will briefly compare my highly stylized results with existing SDs available in the U.S. retirement market. In this market, federal laws regulate the special SDs, called 'tax-shielded' investment accounts (TSAs), as well as the standard SDs, called taxable investment accounts (TAs). As the names suggest, TSAs and TAs have different tax rules. Examples of TSAs include the individual retirement accounts (IRAs) and the 401(k) plans. TSAs account for significant tax expenditures in the federal budget; more generally, savings taxation contributes to financing other redistributive goals. Therefore, budget considerations are likely to play a role in how the U.S. Congress regulates the TSAs and the TAs.

At a broad level, the TSAs, in particular the IRAs and the 401(k) plans, differ from the TAs as follows. Both IRAs and 401(k) plans set maximal contribution limits; TAs don't. These limits are enforced with steep tax penalties for crossing them, and sometimes they bind: In 2007 (2008), about 59% (49%) of IRA owners contributed at the limit (Holden et al. (2010b)), and roughly 11% of all 401(k) participants did so in 2004 (Munnell and Sundén (2006)). Also, both IRAs and 401(k) plans limit distributions; again, TAs don't. Except for a list of qualified
cases (e.g., first-time home purchase for IRAs), any amount withdrawn before the age of 59½ incurs a tax penalty, which seems to actually limit access to these TSAs: According to Holden and Schrass (2008-2010a), the vast majority of IRA withdrawals are retirement related, and only about 5% occur before the owner turns 59½. Moreover, any IRA-backed loan is de facto prohibited; and although some 401(k) plans allow loans, these are capped and subject to quick repayment. Of course, there are many competing reasons that can explain these features. They are, however, consistent with the implications of the model.

24 In settings with naive time-inconsistent agents, the government should obviously be very careful when adding such borrowing options, because, in contrast to my model, it cannot be sure that no agent will choose them. For further discussion, see Section 6.1.
Finally, Amromin (2002, 2003) presents some evidence that is consistent with the principle that curtailing flexibility in the special SDs helps curb their appeal to less time-inconsistent agents. He shows that a significant share of U.S. savers do not take full advantage of the TSAs' tax benefits, and prefer to invest in the TAs because of the TSAs' liquidity constraints. These savers reveal that they care more about flexibility than about commitment, which may depend, among other things, on their being less time inconsistent.
6 Discussion and Literature Review

6.1 Discussion
Overconfidence. The empirical evidence suggests that people can underestimate their self-control problems, i.e., they can be overconfident (see, e.g., DellaVigna 2009 and the references therein). In my model, an agent is overconfident if in period 1 he believes that his type is β̂, but learns in period 2 that it is β < β̂. Overconfidence affects the analysis as follows. On the one hand, the characterization of the information and participation constraints does not change, because it only depends on the belief that each agent holds in period 1 about his type. On the other hand, the principal's problem of finding the optimal screening menus changes. Now the principal has to design each menu taking into account that both sophisticated and overconfident agents can choose it. Therefore, depending on her goal (i.e., on α), she may exploit or counteract the agents' overconfidence. This is especially relevant to the design of unused options. In the two-type model, we saw that in some cases the principal may have to enlarge the C-menu to screen the sophisticated I. However, if in period 1 some agent of type I wrongly believes that his type is C, then the principal may also want to enlarge the C-menu to deal with his overconfidence in period 2. Thus, she may face a trade-off between optimally dealing with the overconfident I and ensuring that the sophisticated I does not choose the C-menu. This trade-off may arise, in particular, in the case of a fully paternalistic planner who cares only about the agents' welfare (i.e., α = 0).25
Outside Option with Type-Dependent Values. Since only the principal can provide flexible commitment devices, we can interpret any period-1 outside option as the set of state-contingent choices that the agents would make in period 2 if left to their own means. Without loss of generality, we can describe these choices through a menu {x^0, k^0}, using the formalism of Section 4.1. For simplicity, consider the two-type model. By Proposition 1, U^C(x^0, k^0) ≥ U^I(x^0, k^0), with equality if and only if x^0 is constant over Θ. In other words, C and I value the outside option equally, as I have assumed throughout the paper, if and only if the menu {x^0, k^0} is a singleton. For example, the outside option may lead to the same default action in every state. Instead, if x^0 is flexible, then C and I value the outside option differently. Similarly, in Mussa and Rosen (1978) a high-valuation buyer would value an outside option featuring strictly positive consumption more than would a low-valuation buyer. When C values {x^0, k^0} more than does I, my analysis changes to the extent that the constraint (IR^C) may bind, together with or even before the constraint (IC^C). Obviously, if x^0 = x^I, then the principal will never offer a distorted I-menu, because she can never extract from type C more than U^C(x^I, k^0) = U^I(x^I, k^0) + R^C(x^I). Similarly, if in Mussa and Rosen the outside option features the efficient level of trade with low-valuation buyers, then the monopolist will never offer them a distorted level of trade. Finally, if (IC^C) binds before (IR^C), then the principal will distort the I-menu as discussed in Section 4.2, possibly stopping if (IR^C) starts to bind too.

25 A more detailed analysis of the model with overconfident agents is available upon request.
Different Period-1 Heterogeneity. In this paper, time-inconsistent agents overweight (β < 1) the immediate benefits (or costs) of their actions when they have to choose one. If all time-inconsistent agents have β > 1, the substance of the paper doesn't change. Significant differences appear, instead, if some agents have β < 1 and others have β > 1. For example, with two types and β^1 > 1 > β^2, neither may have an information advantage (see Appendix A). However, the approach that I introduced in Section 4.4 should be useful for analyzing this case as well.
6.2 Relation to the Previous Literature
As noted in the Introduction, this paper relates to the literature on behavioral contract theory and optimal paternalism. Several papers have already studied how to design incentive schemes to tackle people's self-control problems (e.g., O'Donoghue and Rabin (1999)). The novelty of the present paper consists in studying, in a market environment, the role of asymmetric information about people's conflicting desires for commitment and for flexibility. A more detailed comparison of my work to the closest papers in the literature follows.
Amador, Werning, and Angeletos (AWA) (2006). In AWA, a decision-maker faces a consumption-savings problem and lacks self-control. Anticipating this situation, ex ante he desires commitment, but because he still doesn't know how much he will value consumption, he also desires flexibility. AWA focus on commitment strategies that only entail removing options from the decision-maker's future budget set; they purposefully rule out payments across states to eliminate any form of insurance. AWA's main contribution is to identify necessary and sufficient conditions for the optimal (paternalistic) strategy to coincide with minimum-savings plans.

In contrast, I look at markets in which, to commit to a desired plan, agents obtain incentive schemes from a profit- or welfare-maximizing supplier; I also allow for payments across states. AWA's model is more appropriate for studying public pension plans that involve no payments, such as Social Security and defined-benefit plans. But to study defined-contribution plans and other forms of voluntary savings that receive tax incentives from the government, my model seems better suited.

Furthermore, AWA essentially assume that people's self-control is observable. However, with unobservable self-control, AWA's analysis does not change, because their commitment policies raise no incentive-compatibility issue. To see why, suppose AWA allowed for both time-inconsistent and time-consistent agents, type I and type C. AWA's optimal policies involve an effective minimum-savings constraint for I, but not for C. Given these policies, by definition, C doesn't want to mimic I ex ante, although if C did, ex post C would save at least as much as does I in every state and strictly more in some states. The flip side of AWA's policies is that I's ex-post choices are (almost) always inefficient, according to his ex-ante preference. By contrast, I show that allowing for payments would make it possible to fully solve the agent's self-control problems under symmetric information, but causes screening problems under asymmetric information.
DellaVigna and Malmendier (DM) (2004). DM's influential work studies how firms design two-part tariffs when selling to quasi-hyperbolic buyers goods that cause immediate costs (benefits) and deferred benefits (costs). DM's model helps explain why firms set per-usage prices below (above) their marginal costs, and introduce renewal or cancellation fees.

DM's model differs from mine as follows. First, in DM first- and second-period preferences differ for any given agent, but they coincide across agents within each period. Also, at the first-period contracting stage, some agents correctly forecast their preference at the second-period consumption stage, but others are partially naive in the sense of O'Donoghue and Rabin (2001). Second, DM restrict a priori the agent to a buy-or-not-buy decision, with no quantity variable, a restriction that precludes studying the trade-off between commitment and flexibility. Finally, DM assume that firms know the agents' degree of time inconsistency and of naivete. This symmetric-information assumption is relaxed by Jianye (2011), who takes DM's model and studies whether their predictions about per-usage prices are robust to asymmetric information.

DM also find that firms will provide sophisticated agents with a two-part tariff that perfectly solves their self-control problems. I confirm and extend this result in Section 3. Nonetheless, I cast doubt on DM's positive message in two directions. First, the result is not robust to asymmetric information, which leads a monopolist to supply commitment inefficiently. Second, the result relies on the property, which holds also in DM, that self-1 and self-2 rank any two states in the same order. To see this, suppose the utility function of self-1 is sU(a) − a − p, but that of self-2 is (1/s)U(a) − a − p; the inconsistency is now about which state makes a more valuable. In this case, to be period-2 incentive compatible, an allocation rule x must be non-increasing; so the efficient outcome a*, which is increasing, cannot be sustained with self-2.
Esteban and Miyagawa (EM) (2006b). EM study a standard model of non-linear pricing, except that buyers have Gul-Pesendorfer (2001) preferences and the monopolist can offer them non-singleton menus. EM find that the monopolist may fully extract buyers' surplus using "decorated" menus. EM have two types of buyers (H and L), who have different valuations of the monopolist's good, but also different self-control costs. Buyers dislike exerting self-control and desire commitment. So, intuitively, by adding to the singleton L-menu an item that makes a dishonest H incur high self-control costs, the monopolist makes the L-menu less appealing to H and can extract more surplus from H. In my model, the unused item added to the C-menu works similarly: By making I view the C-menu as offering even less commitment, it increases I's expected cost of mimicking C.

Except for this similarity, EM's paper is quite different from mine. In EM, agents value commitment but not flexibility, so no trade-off can arise between the two. In EM, unused items help screen 'strong' from 'weak' buyers; instead, in my model, they help screen 'weak' from 'strong' agents. In EM, maximizing welfare never requires adding unused items to menus, but maximizing profits may; instead, in my model, maximizing welfare may require unused items while maximizing profits may be possible without them.
Eliaz and Spiegler (ES) (2006). ES analyze a model in which agents sign contracts with a monopolist in period 1, to get access to a set of actions in period 2. All agents have the same utility function u in period 1, which changes to v in period 2 with probability q. The monopolist knows q; instead, in period 1, each agent has his own private belief q̂ about the chance of switching to v. Crucially, in ES the agents' private information affects what they are willing to pay for a given contract in period 1, but it doesn't affect what the monopolist expects them to choose in period 2. Instead, in my model, period-1 private information also matters in predicting the period-2 outcome of a (non-trivial) device, as in adverse-selection models. Furthermore, in ES, the agents have the same information about the environment in both periods, so they care about commitment (unless q̂ = 0), but not about flexibility.
7 Conclusion
I study how a monopolist and a paternalistic planner supply flexible commitment devices to agents who privately know their preference for commitment and for flexibility. If information were symmetric, both the monopolist and the planner would help each agent fully solve his self-control problems. This is generally not true, however, with asymmetric information: Agents with superior self-control enjoy an information advantage, which causes a screening problem. I derive the optimal screening devices using non-standard techniques for solving dynamic mechanism-design problems. To screen a more time-inconsistent from a less time-inconsistent agent, the monopolist and (possibly) the planner inefficiently curtail the flexibility of the device tailored to the first agent, and include unused items in the device tailored to the second agent. Finally, in my model the familiar 'no distortion at the top' property fails.

My results are relevant to the problem of providing commitment devices in several situations in which only time-inconsistent people should use such devices, but they cannot be distinguished a priori from time-consistent people. Governments may want to design savings devices that provide tax incentives for time-inconsistent people to adequately save for retirement. Gyms may want to design special memberships that provide monetary incentives for time-inconsistent people to exercise regularly. Finally, both private and public institutions may want to provide time-inconsistent people with incentives to commit to other goals, such as consuming less alcohol, cigarettes, or drugs.

The paper does not look at the effects of competition among many providers of flexible commitment devices. While a detailed study of this case is beyond the scope of the current paper, I expect that even perfectly competitive markets lead to inefficient outcomes, with distortions qualitatively similar to the monopolistic case. Intuitively, competition pushes firms to generate the largest possible surplus and payoff for each type of agent. However, the asymmetric information creates an adverse-selection problem for the firms, who have to screen the different types. Conditional on sustaining the efficient outcome with time-consistent agents, each firm has to balance the efficiency with time-inconsistent agents against the incentives of the time-consistent agents to pretend to be time-inconsistent. This trade-off is similar to that analyzed in the paper, and hence is likely to have similar implications.
8 Appendix A: Irrelevance of Asymmetric Information with Finitely Many States
In this appendix, I show that if the number of states is finite, then asymmetric information at the contracting stage may be immaterial as far as sustaining the efficient outcome is concerned.
To understand the role of the discreteness of the state space S, consider the case with two states, s2 > s1. If β is observable, the principal sustains a2 = a*(s2) > a*(s1) = a1 (Lemma 1), with payments p1 = p(s1) and p2 = p(s2) that satisfy the condition

    u2(a2, s2, β) − u2(a1, s2, β) ≥ p2 − p1 ≥ u2(a2, s1, β) − u2(a1, s1, β),    (1)

which follows by combining the (IC_β) constraints. Since u2(a, s, β) has strictly increasing differences in (a, s), the discreteness of S creates some slack in the period-2 incentive constraints evaluated at (a2, a1). Thus, for any β, condition (1) does not uniquely pin down p1 and p2.

Now suppose that type C and type I have about the same self-control, i.e., that β^I is almost equal to β^C. Intuitively, to implement the same outcome with both types, it should be possible to use incentive devices that are sufficiently alike. Furthermore, if the discreteness of S leaves some leeway in the choice of payments, it may even be possible to find one device that works for both types. On the other hand, if β^I is very different from β^C, then sustaining the efficient outcome with both types requires different devices. Since β^I < β^C, type I is more tempted than type C to pick a1 also in state s2, and the more so, the lower β^I is relative to β^C. Thus, for I to choose a1 only in state s1, a1 must be sufficiently more expensive than a2, and this price premium must increase as β^I falls. Therefore, it must eventually exceed C's willingness to pay for switching from a2 to a1 in s1.
The following lemma formalizes this intuition. Since the result does not depend on having two types with 0 < β^I < β^C = 1, consider a finite set of types Φ, which may include both β > 1 and β < 1. Let β̄ := max Φ and β̲ := min Φ.

Lemma 5 Suppose S is finite with sN > sN−1 > ... > s1. There exists a single commitment device that sustains the efficient outcome a* with each β ∈ Φ if and only if β̄/β̲ ≤ min_i s_{i+1}/s_i.
Proof. With N states, the period-2 incentive constraints become

    u2(a_i, s_i, β) − p_i ≥ u2(a_j, s_i, β) − p_j    (IC_{i,j})

for all i, j = 1, ..., N, where a_i := a(s_i) and p_i := p(s_i). By standard arguments, it is enough to focus on the adjacent constraints (IC_{i,i−1}) and (IC_{i−1,i}). For i = 2, ..., N, let Δ_i := p_i − p_{i−1}. If a(s_i) = a*(s_i) for all i, then a_N > a_{N−1} > ... > a_1 (Assumption 1). To implement a* with β, we must have, for i = 2, ..., N,

    u2(a_i, s_i, β) − u2(a_{i−1}, s_i, β) ≥ Δ_i ≥ u2(a_i, s_{i−1}, β) − u2(a_{i−1}, s_{i−1}, β).    (CIC_{i,i−1})

Recall that for any s and β, u2(a′, s, β) − u2(a, s, β) = βs(U(a′) − U(a)) + d(a′) − d(a). Let s_k/s_{k−1} = min_i s_i/s_{i−1} and suppose β̄ s_{k−1} > β̲ s_k. Then

    u2(a_k, s_{k−1}, β̄) − u2(a_{k−1}, s_{k−1}, β̄) > u2(a_k, s_k, β̲) − u2(a_{k−1}, s_k, β̲),

and no Δ_k satisfies (CIC_{k,k−1}) for both β̄ and β̲. If instead β̄ s_{i−1} ≤ β̲ s_i for every i = 2, ..., N, then for each β and i,

    u2(a_i, s_i, β) − u2(a_{i−1}, s_i, β) ≥ u2(a_i, s_{i−1}, β̄) − u2(a_{i−1}, s_{i−1}, β̄) ≥ u2(a_i, s_{i−1}, β) − u2(a_{i−1}, s_{i−1}, β).

Set Δ_i := u2(a_i, s_{i−1}, β̄) − u2(a_{i−1}, s_{i−1}, β̄) for each i. Then {Δ_i}_{i=2}^N satisfies all the (CIC_{i,i−1}) conditions for each β. And the payment rule defined by p_i = p_1 + Σ_{j=2}^i Δ_j, with p_1 small enough to ensure participation, sustains a* with each β.
Therefore, if the heterogeneity across agents, measured by β̄/β̲, is small enough, the principal can sustain the efficient outcome without worrying about period-1 incentive constraints.
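The Lemma 5 condition can be checked numerically in the two-state case. The functional forms below (U(a) = a and d(a) = −a²/2, so that u2(a, s, β) = βsa − a²/2) and all parameter values are illustrative assumptions of this sketch, not objects from the paper; the point is only that a single price gap Δ = p2 − p1 exists exactly when the intervals of admissible gaps, one per type, overlap.

```python
# Numerical check of the Lemma 5 condition with two states, under assumed
# forms u2(a, s, b) = b*s*U(a) + d(a), U(a) = a, d(a) = -a**2/2.

def bounds(b, a1, a2, s_lo, s_hi):
    """Interval of price gaps Delta = p2 - p1 implementing (a1, a2) for type b."""
    U = lambda a: a
    d = lambda a: -0.5 * a ** 2
    dU = U(a2) - U(a1)
    dd = d(a2) - d(a1)
    lo = b * s_lo * dU + dd   # self-2 must not take a2 in the low state
    hi = b * s_hi * dU + dd   # self-2 must take a2 in the high state
    return lo, hi

def common_device_exists(betas, s_lo, s_hi, a1, a2):
    """A single Delta works for all types iff all the intervals overlap."""
    los, his = zip(*(bounds(b, a1, a2, s_lo, s_hi) for b in betas))
    return max(los) <= min(his)

s_lo, s_hi = 1.0, 1.5
# heterogeneity within the bound: max(b)/min(b) = 1.0/0.7 <= s_hi/s_lo = 1.5
print(common_device_exists([0.7, 0.9, 1.0], s_lo, s_hi, 0.5, 1.0))  # → True
# heterogeneity beyond the bound: 1.0/0.6 > 1.5
print(common_device_exists([0.6, 1.0], s_lo, s_hi, 0.5, 1.0))       # → False
```

The overlap test max(lo) ≤ min(hi) is exactly the comparison between the lower bound at β̄ and the upper bound at β̲ in the proof.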
The condition in Lemma 5, however, is not necessary for asymmetric information to be irrelevant when sustaining the efficient outcome a*. Even if β̄/β̲ is large, it may be possible to design different devices such that each device sustains a* with one type of agent, and each agent optimally chooses the device designed for his type. To see why, consider a simple example with two types with β_h > β_l, and two states, s2 > s1. Suppose that β_h > 1 > β_l, β_h s1 > β_l s2, but s2 > s1 β_h and s2 β_l > s1. Consider all payments (p1, p2) that satisfy (1) and the (IR) constraint with equality, i.e.,

    (1 − f) p2 + f p1 = (1 − f) u1(a2, s2) + f u1(a1, s1),    (2)

where f := F(s1). Finally, choose (p_1^h, p_2^h) so that h's self-1 strictly prefers a2 in state s2, i.e., u1(a2, s2) − p_2^h > u1(a1, s2) − p_1^h, and (p_1^l, p_2^l) so that l's self-1 strictly prefers a1 in state s1, i.e., u1(a1, s1) − p_1^l > u1(a2, s1) − p_2^l. Then, the device defined by (p_1^l, p_2^l) (respectively (p_1^h, p_2^h)) sustains a* and generates zero expected payoff to the agent if and only if type l (h) chooses it. Moreover, I claim that type l (h) strictly prefers the device with payments (p_1^l, p_2^l) (respectively (p_1^h, p_2^h)). Note that, if the self-1 of either type had to choose in period 2, under either device he would strictly prefer to behave exactly as does his self-2. Therefore, by choosing the 'wrong' device, either type can only decrease his payoff below zero.
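This construction can be verified numerically under assumed functional forms (u1(a, s) = sa − a²/2 and u2(a, s, β) = βsa − a²/2, with parameter values chosen only to satisfy the inequalities above; none of these specifics come from the paper):

```python
# Numerical illustration of the two-device construction: each type gets
# zero expected payoff from 'his' device and a negative payoff from the
# other. All functional forms and parameters are illustrative assumptions.

a1, a2 = 0.5, 1.0
s1, s2 = 1.0, 2.0
f = 0.5                      # f = F(s1)
bh, bl = 1.5, 0.7            # bh*s1 > bl*s2, s2 > s1*bh, s2*bl > s1

u1 = lambda a, s: s * a - 0.5 * a ** 2
u2 = lambda a, s, b: b * s * a - 0.5 * a ** 2

def device(delta):
    """Payments (p1, p2) with gap delta on the zero-expected-payoff line (2)."""
    p1 = (1 - f) * u1(a2, s2) + f * u1(a1, s1) - (1 - f) * delta
    return p1, p1 + delta

def payoff(b, p):
    """Type-b self-1 expected payoff, given self-2's choices under prices p."""
    total = 0.0
    for s, w in [(s1, f), (s2, 1 - f)]:
        # self-2 picks the action maximizing u2 net of its payment
        a, price = max([(a1, p[0]), (a2, p[1])],
                       key=lambda ap: u2(ap[0], s, b) - ap[1])
        total += w * (u1(a, s) - price)
    return total

dU, dd = a2 - a1, -0.5 * (a2 ** 2 - a1 ** 2)
dev_h = device(bh * s1 * dU + dd + 0.01)   # inside h's interval, below s2*dU+dd
dev_l = device(s1 * dU + dd + 0.01)        # above s1*dU+dd, inside l's interval
print(payoff(bh, dev_h), payoff(bl, dev_l))   # both zero (up to rounding)
print(payoff(bh, dev_l), payoff(bl, dev_h))   # both strictly negative
```

Here β_h/β_l ≈ 2.14 > s2/s1 = 2, so Lemma 5's condition fails, yet the pair of devices still sustains a* with truthful self-selection.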
The next lemma provides a necessary condition for asymmetric information to be irrelevant when sustaining a*. Let Φ^1 := Φ ∩ [0, 1] and Φ^2 := Φ ∩ [1, +∞). For k = 1, 2, let β̄^k := max Φ^k and β̲^k := min Φ^k, and set r_k := β̄^k/β̲^k.

Lemma 6 Suppose that S is finite with sN > sN−1 > ... > s1. If max{r_1, r_2} > min_i s_{i+1}/s_i, there is no collection of commitment devices, each tailored to a specific type β ∈ Φ, such that (I) type β chooses 'his' device, (II) each device sustains the efficient outcome a* with the corresponding type β, and (III) all types enjoy the same expected payoff.
Proof. Suppose max{r_1, r_2} = r_1; the other case follows similarly. Suppose that devices that satisfy (I)-(III) exist. Let {p^β}_{β∈Φ} be the corresponding payment schedules, and normalize each type's expected payoff to zero. Consider p^{β̲^1} and denote it by p. I claim that if β̄^1 selects p rather than p^{β̄^1}, he enjoys a positive expected payoff. Given p, let a_i(β) be an optimal choice of β ∈ Φ^1 in s_i (this is well defined by assumption). For β̲^1, a_i(β̲^1) = a*_i for every i. Let S' := {i : s_{i+1}/s_i < r_1} ≠ ∅. We have: (a) for every i, β̄^1 s_i > β̲^1 s_i, and hence a_i(β̄^1) ≥ a*_i; (b) for i ∈ S', β̄^1 s_i > β̲^1 s_{i+1}, and hence a_i(β̄^1) ≥ a*_{i+1} > a*_i. Observations (a) and (b), together with β̄^1 ≤ 1, imply that, for every i,

    u2(a_i(β̄^1), s_i, β̄^1) − p(a_i(β̄^1)) ≥ u2(a*_i, s_i, β̄^1) − p(a*_i)  and hence  u1(a_i(β̄^1), s_i) − p(a_i(β̄^1)) ≥ u1(a*_i, s_i) − p(a*_i),

where the first inequality is strict for i ∈ S'. The expected payoff to β̄^1 from p is then

    Σ_{i=1}^N [u1(a_i(β̄^1), s_i) − p(a_i(β̄^1))] f_i > Σ_{i=1}^N [u1(a*_i, s_i) − p(a*_i)] f_i = 0,

where f_i := F(s_i) − F(s_{i−1}) for i = 2, ..., N and f_1 := F(s_1).
Note that, if either Φ^1 \ {1} = ∅ or Φ^2 \ {1} = ∅, then the sufficient condition in Lemma 5 becomes also necessary for sustaining the efficient outcome without worrying about the agent's private information.
9 Appendix B: Omitted Proofs from Sections 3-4

All proofs allow for β^C ≤ 1, unless the specific result assumes otherwise.

9.1 Proof of Lemma 1
Proof of Lemma 1
The principal’s problem is
R
max (1
) S [u1 (a (s) ; s)
fa;pg
R
c(a(s))]dF +
S
[p (s)
c (a (s))] dF
s.t. (IR) and (IC ).
If > 0, (IR) must bind; if = 0, assume w.l.o.g. that (IR) holds with equality. Thus, the
problem reduces to
R
s.t. (IC ).
max S [u1 (a (s) ; s) c(a(s))]dF
a
Ignoring (IC ), the resulting relaxed problem admits a a as its unique solution (up to the set
of zero measure fs; sg). It remains to show that there exist p that sustains a with self-2. For
any , standard arguments imply that the necessary and su¢ cient condition for the existence
of such a p is that a is non-decreasing in s, a property satis…ed by a .
9.2 Proof of Proposition 1

Using expression (EU) and changing variables, re-write $U^i(x^j, k^j)$ as
$$U^i(x^j, k^j) = \int_{\underline s}^{\bar s} \Big[s\,U(x^j(s\beta^i)) - \int^{s\beta^i} U(x^j(y))\,dy\Big]\,dF + k^j. \qquad (3)$$
Now define for each $s$
$$\Phi(s|x^i) := s(1-\beta^C)\,U(x^i(s\beta^C)) - s(1-\beta^I)\,U(x^i(s\beta^I)) + \int_{s\beta^I}^{s\beta^C} U(x^i(y))\,dy \qquad (4)$$
$$\phantom{\Phi(s|x^i) :} = s(1-\beta^C)\big[U(x^i(s\beta^C)) - U(x^i(s\beta^I))\big] + \int_{s\beta^I}^{s\beta^C} \big[U(x^i(y)) - U(x^i(s\beta^I))\big]\,dy.$$
Note that $x^i \in M$ and $\beta^I < \beta^C \le 1$ imply that $\Phi(s|x^i) \ge 0$ for every $s$. Also, if $x^i(\theta) = a$ for every $\theta \in (\underline\theta, \bar\theta)$, then $\Phi(s|x^i) = 0$ for every $s \in (\underline s, \bar s)$.

Consider first type $C$. Using (3) and (4), we have
$$R^C(x^I) = U^C(x^I, k^I) - U^I(x^I, k^I) = \int_{\underline s}^{\bar s} \Phi(s|x^I)\,dF \ge 0, \qquad (5)$$
with equality if $x^I(\theta) = a$ over $(\underline\theta, \bar\theta)$. Now suppose $x^I(\cdot)$ is not constant over $(\underline\theta, \bar\theta)$. Since $x^I$ is non-decreasing, there exists $\tilde\theta \in (\underline\theta, \bar\theta)$ such that $x^I(\theta) < x^I(\theta')$ for every $\theta, \theta'$ such that $\theta < \tilde\theta < \theta'$. Let $\tilde s_1 := \tilde\theta/\beta^C$ and $\tilde s_2 := \tilde\theta/\beta^I$, and consider the interval $I := (\tilde s_1, \tilde s_2) \cap [\underline s, \bar s] \neq \emptyset$. For $s \in I$, $s\beta^I < \tilde\theta < s\beta^C$; hence $x^I(s\beta^I) < x^I(s\beta^C)$. To prove that $R^C(x^I) > 0$, it is enough to show that
$$\int_I \Big[\int_{s\beta^I}^{s\beta^C} \big[U(x^I(y)) - U(x^I(s\beta^I))\big]\,dy\Big]\,dF > 0. \qquad (6)$$
For $s \in I$, the integrand in (6) satisfies
$$\int_{\tilde\theta}^{s\beta^C} \big[U(x^I(y)) - U(x^I(s\beta^I))\big]\,dy + \int_{s\beta^I}^{\tilde\theta} \big[U(x^I(y)) - U(x^I(s\beta^I))\big]\,dy \;\ge\; \int_{\tilde\theta}^{s\beta^C} \big[U(x^I(y)) - U(x^I(s\beta^I))\big]\,dy > 0,$$
where the first inequality follows from (MON$^I$) and the last from $x^I(y) > x^I(s\beta^I)$ for every $y \in (\tilde\theta, s\beta^C)$. Since $I$ has positive measure, (6) follows.

Now consider type $I$. Using again (3) and (4), we have $R^I(x^C) = -\int_{\underline s}^{\bar s} \Phi(s|x^C)\,dF$, and the properties of $R^I(x^C)$ follow from the same type of arguments used for $R^C(x^I)$.
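The two forms of $\Phi(s|x^i)$ in (4) are algebraically equivalent, and $\Phi \ge 0$ for a non-decreasing rule. A quick numerical check of this identity (all primitives here — the utility `U`, the rule `x`, and the parameter values — are hypothetical stand-ins, not the model's calibration):

```python
import numpy as np

U = np.log                      # illustrative concave utility
x = lambda t: 1.0 + t ** 2      # illustrative non-decreasing allocation rule
bI, bC, s = 0.6, 0.9, 1.3       # beta^I < beta^C <= 1, and one state s

def integral(f, lo, hi, n=20001):
    # simple trapezoid quadrature on a fixed grid
    y = np.linspace(lo, hi, n)
    v = f(y)
    return np.sum(0.5 * (v[1:] + v[:-1])) * (hi - lo) / (n - 1)

Ux = lambda y: U(x(y))

# First form of Phi(s | x) in (4):
phi1 = (s * (1 - bC) * Ux(s * bC) - s * (1 - bI) * Ux(s * bI)
        + integral(Ux, s * bI, s * bC))
# Second form of Phi(s | x) in (4):
phi2 = (s * (1 - bC) * (Ux(s * bC) - Ux(s * bI))
        + integral(lambda y: Ux(y) - Ux(s * bI), s * bI, s * bC))

assert abs(phi1 - phi2) < 1e-8   # the two forms coincide
assert phi1 > 0                  # Phi is positive for a strictly increasing x
```

The equivalence is exact (collect the coefficients on $U(x^i(s\beta^I))$), so the quadrature error cancels between the two forms.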
9.3 Proof of Corollary 1

Using condition (ENV$^j$), the expected profits equal
$$\Pi(X, T) = \mu \int \big[t^C(\theta) - c(x^C(\theta))\big] f^C(\theta)\,d\theta + (1-\mu) \int \big[t^I(\theta) - c(x^I(\theta))\big] f^I(\theta)\,d\theta$$
$$\phantom{\Pi(X, T)} = \mu \big[W^C(x^C) - U^C(x^C, k^C)\big] + (1-\mu) \big[W^I(x^I) - U^I(x^I, k^I)\big].$$
By Proposition 1, (IR$^I$) and (R-IC$^C$) imply (IR$^C$). Now, recall that $U^i(x^i, k^i)$ is decreasing in $k^i$ (see (EU)). So, if $\mu > 0$, (R-IC$^C$) must bind, and (R-IC$^I$) becomes $R^I(x^C) + R^C(x^I) \le 0$. Similarly, if $1-\mu > 0$, (IR$^I$) must also bind. If $\mu = 0$, we can safely focus on DMs that make (R-IC$^C$) and (IR$^I$) hold with equality, since we are ultimately interested in the optimal $X$. Given $X$, we can then uniquely pin down $k$ using the conditions $U^I(x^I, k^I) = 0$ and $U^C(x^C, k^C) = R^C(x^I)$. Substituting these quantities in $\Pi(X, T)$ and rearranging the principal's objective deliver problem $P'$.
9.4 Proof of Lemma 2

Let $\mathcal U := \{u : \Theta \to [U(\underline a), U(\bar a)] \mid u \text{ is non-decreasing}\}$. Then, for every $x \in M$, $u(\cdot) = U(x(\cdot)) \in \mathcal U$; similarly, for every $u \in \mathcal U$, $x(\cdot) = U^{-1}(u(\cdot)) \in M$. Also, let $\widetilde W^i(u) := W^i(U^{-1}(u))$ and $\widetilde R^i(u^j) := R^i(U^{-1}(u^j))$. It is easy to see that problem $P'$ is equivalent to the following problem:
$$P_u = \begin{cases} \max & \mu\,\widetilde W^C(u^C) + (1-\mu)\big[\widetilde W^I(u^I) - \widetilde R^C(u^I)\big] \\ \text{s.t.} & u^C, u^I \in \mathcal U \text{ and } \widetilde R^I(u^C) + \widetilde R^C(u^I) \le 0. \end{cases}$$
The space $\mathcal X := \{(u^C, u^I) \mid u^i : \Theta \to \mathbb R,\; i = C, I\}$ is linear and $\mathcal Y := \mathcal U \times \mathcal U$ is a convex subset of $\mathcal X$. The constraint functional $\widetilde R^I(\cdot) + \widetilde R^C(\cdot)$ maps $\mathcal X$ into $\mathbb R$, which has nonempty, positive, and closed cone $\mathbb R_+$. The objective is concave because $\widetilde W^i(u)$ is so due to the convexity of $U^{-1}$ and $c$. The constraint $\widetilde R^I(\cdot) + \widetilde R^C(\cdot)$ is linear, and there exists $(u^C, u^I) \in \mathcal Y$ such that $\widetilde R^I(u^C) + \widetilde R^C(u^I) < 0$: e.g., $u^C = U(x^{C*})$ and $u^I$ constant.

Let $\lambda \ge 0$ and define the Lagrangian
$$\mathcal L(u^C, u^I; \lambda) := \mu\,\widetilde W^C(u^C) + (1-\mu)\big[\widetilde W^I(u^I) - \widetilde R^C(u^I)\big] - \lambda\big[\widetilde R^I(u^C) + \widetilde R^C(u^I)\big]$$
$$\phantom{\mathcal L(u^C, u^I; \lambda) :} = \mu\big[\widetilde W^C(u^C) - \lambda^I \widetilde R^I(u^C)\big] + (1-\mu)\big[\widetilde W^I(u^I) - \lambda^C \widetilde R^C(u^I)\big], \qquad (7)$$
where $\lambda^I := \lambda/\mu$ and $\lambda^C := 1 + \lambda/(1-\mu)$.

By Corollary 1, p. 219, and Theorem 2, p. 221, of Luenberger (1969), $(u^C, u^I)$ solve $P_u$ if and only if there exists $\lambda \ge 0$ such that, for all $(u_0^C, u_0^I) \in \mathcal Y$ and $\lambda_0 \ge 0$,
$$\mathcal L(u^C, u^I; \lambda_0) \ge \mathcal L(u^C, u^I; \lambda) \ge \mathcal L(u_0^C, u_0^I; \lambda).$$
Given $\lambda \ge 0$, $u^C$ and $u^I$ maximize the first and the second term in brackets in (7) within $\mathcal U$ if and only if $\mathcal L(u^C, u^I; \lambda) \ge \mathcal L(u_0^C, u_0^I; \lambda)$ for every $(u_0^C, u_0^I) \in \mathcal Y$. Moreover, if $(u^C, u^I, \lambda)$ satisfies $\widetilde R^C(u^I) + \widetilde R^I(u^C) \le 0$ and $\lambda\big[\widetilde R^C(u^I) + \widetilde R^I(u^C)\big] = 0$, then $\mathcal L(u^C, u^I; \lambda_0) \ge \mathcal L(u^C, u^I; \lambda)$ for every $\lambda_0 \ge 0$. Finally, if $(u^C, u^I)$ solves $P_u$, then $\widetilde R^C(u^I) + \widetilde R^I(u^C) \le 0$. And if $\widetilde R^C(u^I) + \widetilde R^I(u^C) < 0$, then $\lambda$ must equal zero: otherwise there exists $\lambda_0 \in [0, \lambda)$ such that
$$\mathcal L(u^C, u^I; \lambda_0) - \mathcal L(u^C, u^I; \lambda) = (\lambda - \lambda_0)\big[\widetilde R^I(u^C) + \widetilde R^C(u^I)\big] < 0.$$

Finally, if $(u^C, u^I, \lambda)$ are such that $u^i \in \arg\max_{u \in \mathcal U}\{\widetilde W^i(u) - \lambda^j \widetilde R^j(u)\}$ (for $j \ne i$), $\lambda \ge 0$, $\widetilde R^C(u^I) + \widetilde R^I(u^C) \le 0$, and $\lambda\big[\widetilde R^C(u^I) + \widetilde R^I(u^C)\big] = 0$, then $x^i = U^{-1}(u^i) \in \arg\max_{x \in M}\{W^i(x) - \lambda^j R^j(x)\}$, $R^C(x^I) + R^I(x^C) \le 0$, and $\lambda\big[R^C(x^I) + R^I(x^C)\big] = 0$. Similarly, if $(x^C, x^I, \lambda)$ are such that $x^i \in \arg\max_{x \in M}\{W^i(x) - \lambda^j R^j(x)\}$, $\lambda \ge 0$, $R^C(x^I) + R^I(x^C) \le 0$, and $\lambda\big[R^C(x^I) + R^I(x^C)\big] = 0$, then $u^i = U(x^i) \in \arg\max_{u \in \mathcal U}\{\widetilde W^i(u) - \lambda^j \widetilde R^j(u)\}$, $\widetilde R^C(u^I) + \widetilde R^I(u^C) \le 0$, and $\lambda\big[\widetilde R^C(u^I) + \widetilde R^I(u^C)\big] = 0$.
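The saddle-point logic invoked from Luenberger (1969) — maximize a concave objective, minimize the Lagrangian over the multiplier, and use complementary slackness — can be illustrated on a toy one-dimensional problem. Everything below is a hypothetical stand-in, not the model's objective or constraint:

```python
# Toy problem: max f(u) = -(u - 2)^2  subject to  g(u) = 1 - u >= 0.
f = lambda u: -(u - 2.0) ** 2
g = lambda u: 1.0 - u

def argmax_L(lam):
    # For fixed lam >= 0, maximize L(u) = f(u) + lam * g(u):
    # L'(u) = -2(u - 2) - lam = 0  =>  u = 2 - lam / 2.
    return 2.0 - lam / 2.0

# Bisect on the multiplier until the constraint is enforced.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if g(argmax_L(mid)) < 0:
        lo = mid          # multiplier too small: constraint violated
    else:
        hi = mid          # feasible: try a smaller multiplier
lam = hi
u = argmax_L(lam)

assert abs(u - 1.0) < 1e-6       # constrained optimum
assert abs(lam * g(u)) < 1e-6    # complementary slackness: lam * g = 0
assert lam > 0                   # constraint binds, so the multiplier is positive
```

As in the proof above, a slack constraint would force the multiplier to zero, and a binding one makes it positive.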
9.5 Proof of Proposition 2

Preliminary Step: First, I rewrite the objective $W^I(x^I) - \lambda^C R^C(x^I)$ in Lemma 2 as an expected virtual surplus. To do so, note that
$$W^i(x^i) = \int \big[u_1(x^i(\theta); \theta/\beta^i) - c(x^i(\theta))\big]\,dF^i(\theta). \qquad (8)$$
Next, using (EU), we have
$$R^C(x^I) = \int \Big[\frac{1-\beta^C}{\beta^C}\,\theta\,U(x^I(\theta)) - \int U(x^I(y))\,dy\Big]\,dF^C - \int \Big[\frac{1-\beta^I}{\beta^I}\,\theta\,U(x^I(\theta)) - \int U(x^I(y))\,dy\Big]\,dF^I.$$
Changing the order of integration and rearranging, we can express $R^C(x^I)$ as
$$R^C(x^I) = -\int_{\bar\theta^I}^{\bar\theta^C} U(x^I(\theta))\,g^C(\theta)\,d\theta - \int_{\underline\theta^I}^{\bar\theta^I} U(x^I(\theta))\,V^I(\theta)\,dF^I(\theta), \qquad (9)$$
where $g^C : (\bar\theta^I, \bar\theta^C] \to \mathbb R$ and $V^I : [\underline\theta^I, \bar\theta^I] \to \mathbb R$ are given by
$$g^C(\theta) := \frac{1-\beta^C}{\beta^C}\,\theta\,f^C(\theta) - \big(1 - F^C(\theta)\big) \quad \text{and} \quad V^I(\theta) := v^I(\theta) - \frac{f^C(\theta)}{f^I(\theta)}\,v^C(\theta),$$
with $v^i(\theta) := \dfrac{1-\beta^i}{\beta^i}\,\theta - \dfrac{1 - F^i(\theta)}{f^i(\theta)}$. Using (8) and (9), we get that $W^I(x^I) - \lambda^C R^C(x^I)$ equals the expected virtual surplus
$$VS^I(x^I; \lambda^C) := \int_{\underline\theta^I}^{\bar\theta^I} \big[U(x^I(\theta))\,w^I(\theta; \lambda^C) - c(x^I(\theta))\big]\,dF^I(\theta) + \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} U(x^I(\theta))\,g^C(\theta)\,d\theta, \qquad (11)$$
where, for $\theta \in \Theta^I$,
$$w^I(\theta; \lambda^C) := \theta/\beta^I + \lambda^C\,V^I(\theta), \qquad (10)$$
and $V^I(\theta)$ is the virtual valuation of $U(x^I(\theta))$.

Finally, as in the proof of Lemma 2, it is convenient to work in terms of the functions $u^I \in \mathcal U$. The properties of the corresponding allocation rule follow by letting $x^I = U^{-1}(u^I)$.
Part 1: Existence and Uniqueness.

Step 1: Construction of the generalized version of $VS^I$ using Toikka's (2011) technique over $[\underline\theta^I, \bar\theta^I]$.

Since the density $f^I$ is positive, the inverse function $(F^I)^{-1} : [0,1] \to [\underline\theta^I, \bar\theta^I]$ is well-defined, increasing, and continuous. Fix $\lambda^C$ and define, for $q \in [0,1]$,
$$z(q; \lambda^C) := w^I\big((F^I)^{-1}(q); \lambda^C\big) \quad \text{and} \quad Z(q; \lambda^C) := \int_0^q z(y; \lambda^C)\,dy.$$
The function $z$ is continuous in $q$, except possibly at $q^m := F^I(\theta^m)$, where $\theta^m := \min\{\bar\theta^I, \bar\theta^C\}$: if $\beta^C < 1$ and $\bar\theta^C < \bar\theta^I$, so that $\theta^m = \bar\theta^C$, we have
$$\lim_{\theta \uparrow \theta^m} w^I(\theta; \lambda^C) \;>\; \lim_{\theta \downarrow \theta^m} w^I(\theta; \lambda^C), \qquad (12)$$
because the term $\lambda^C (f^C/f^I)\,v^C$ in $w^I$ drops out above $\bar\theta^C$. Let $\Phi := \text{conv}\,Z$ be the convex hull of $Z$ on $[0,1]$. Let $\omega : [0,1] \to \mathbb R$ be defined as $\omega(q; \lambda^C) := \Phi'(q; \lambda^C)$, whenever $\Phi'(q; \lambda^C)$ exists. Also, w.l.o.g., extend $\omega(q; \lambda^C)$ by right-continuity to all $[0,1)$ and by left-continuity at 1.

Lemma 7 The function $\omega$ is continuous in $q$ and $\lambda^C$.
Proof. (Continuity in $q$). Throughout this part of the proof suppress $\lambda^C$. Continuity at 0 and 1 holds by construction. Since $z$ is continuous at every $q \in (0,1) \setminus \{q^m\}$, so is $\omega$. To see this, note that continuity of $z$ at $q$ implies that $Z'(q) = z(q)$. There are two cases to consider. First, suppose $\Phi(q) < Z(q)$. By definition, $\omega(\cdot)$ must be constant at $\omega(q)$ in a neighborhood of $q$. It follows that $\omega$ is continuous at $q$. Second, suppose $\Phi(q) = Z(q)$. Since $\Phi$ is convex and $Z(y) \ge \Phi(y)$ for every $y$, we have:
$$\Phi'_+(q) = \lim_{y \downarrow q} \frac{\Phi(y) - \Phi(q)}{y - q} \le \lim_{y \downarrow q} \frac{Z(y) - Z(q)}{y - q} = Z'_+(q),$$
$$\Phi'_-(q) = \lim_{y \uparrow q} \frac{\Phi(q) - \Phi(y)}{q - y} \ge \lim_{y \uparrow q} \frac{Z(q) - Z(y)}{q - y} = Z'_-(q).$$
Since $\Phi'_-(q) \le \Phi'_+(q)$ and $Z$ is differentiable at $q$, we have that $\Phi'_-(q) = \Phi'_+(q)$, and $\omega$ is continuous at $q$.

Consider now $q^m$. If $\theta^m = \bar\theta^I$, then $q^m = 1$ and we are done. To prove that $\omega$ is continuous if $q^m \in (0,1)$, it is enough to show that if $z$ jumps at $q^m$, then $\Phi(q^m) < Z(q^m)$. Recall that $z(q^m-) := \lim_{q \uparrow q^m} z(q) = \lim_{\theta \uparrow \theta^m} w^I(\theta; \lambda^C)$ and $z(q^m+) := \lim_{q \downarrow q^m} z(q) = \lim_{\theta \downarrow \theta^m} w^I(\theta; \lambda^C)$. Hence $z$ can at most jump down at $q^m$ (see (12)), in which case $z(q^m-) > z(q^m+)$. Also recall that $z(q^m+) = z(q^m)$. Suppose $\Phi(q^m) = Z(q^m)$. By the same steps as above, $\Phi'_+(q^m) \le Z'_+(q^m) = z(q^m)$. By convexity of $\Phi$, $\omega(q) \le \Phi'_+(q^m)$ for all $q \le q^m$. Therefore, for $q$ close enough to $q^m$ from the left,
$$\Phi(q) = \Phi(q^m) - \int_q^{q^m} \omega(s)\,ds > Z(q^m) - \int_q^{q^m} z(s)\,ds = Z(q).$$
A contradiction.

(Continuity in $\lambda^C$). The function $Z(q; \lambda^C)$ is continuous in $\lambda^C$ for every $q$. So is $\Phi$ if $q \in \{0,1\}$, since $\Phi(0; \lambda^C) = Z(0; \lambda^C)$ and $\Phi(1; \lambda^C) = Z(1; \lambda^C)$. Consider $q \in (0,1)$. For every $q$ and $\lambda^C \ge 0$, $\Phi$ is defined as
$$\Phi(q; \lambda^C) := \min\big\{\alpha Z(q_1; \lambda^C) + (1-\alpha) Z(q_2; \lambda^C) \;\big|\; (\alpha, q_1, q_2) \in [0,1]^3,\; q = \alpha q_1 + (1-\alpha) q_2\big\}.$$
By continuity of $Z(q; \lambda^C)$ and the Maximum Theorem, $\Phi(q; \cdot)$ is continuous in $\lambda^C$ for every $q$. Furthermore, $\Phi(\cdot; \lambda^C)$ is differentiable in $q$ with derivative $\omega(\cdot; \lambda^C)$. Fix $q \in (0,1)$ and any sequence $\{\lambda_n^C\}$ such that $\lim_{n\to\infty} \lambda_n^C = \lambda^C$. Since $\lim_{n\to\infty} \Phi(q; \lambda_n^C) = \Phi(q; \lambda^C)$, it follows from Theorem 25.7, p. 248, of Rockafellar (1970) that $\lim_{n\to\infty} \omega(q; \lambda_n^C) = \omega(q; \lambda^C)$.
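The convex-hull ("ironing") construction of Step 1 is easy to sketch numerically. The following is a minimal illustration with a hypothetical non-monotone virtual valuation (the function `z` below is a stand-in, not the model's $w^I$): it builds $Z$, its convex hull $\Phi$, and the ironed derivative $\omega = \Phi'$, which is non-decreasing by construction.

```python
import numpy as np

# Hypothetical non-monotone "virtual valuation" on quantiles q in [0, 1],
# a stand-in for z(q) = w^I((F^I)^{-1}(q)).
def z(q):
    return q + 0.4 * np.cos(3 * np.pi * q)

q = np.linspace(0.0, 1.0, 1001)
zq = z(q)
# Z(q) := integral_0^q z(y) dy, by the trapezoid rule.
Z = np.concatenate(([0.0], np.cumsum(0.5 * (zq[1:] + zq[:-1]) * np.diff(q))))

# Greatest convex minorant (conv Z): lower hull of the points (q_i, Z_i)
# via a monotone slope scan: pop the middle point whenever slopes fail
# to be increasing.
hull = [0]
for i in range(1, len(q)):
    while len(hull) >= 2:
        a, b = hull[-2], hull[-1]
        if (Z[b] - Z[a]) * (q[i] - q[b]) >= (Z[i] - Z[b]) * (q[b] - q[a]):
            hull.pop()   # b lies on or above the chord from a to i
        else:
            break
    hull.append(i)

Phi = np.interp(q, q[hull], Z[hull])              # piecewise-linear hull
omega = np.diff(Z[hull]) / np.diff(q[hull])       # ironed slopes

assert np.all(Phi <= Z + 1e-9)                    # Phi = conv Z lies below Z
assert np.all(np.diff(omega) >= -1e-9)            # omega is non-decreasing
```

On intervals where $\Phi < Z$, `omega` is constant — exactly the pooling regions used throughout the proof.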
Now, for $\theta \in \Theta^I$, define the generalized virtual valuation
$$\bar w^I(\theta; \lambda^C) := \omega\big(F^I(\theta); \lambda^C\big),$$
which is non-decreasing by construction and continuous by Lemma 7. Let $\Lambda(\cdot) := -U^{-1}(\cdot) - c(U^{-1}(\cdot))$ and replace $w^I$ with $\bar w^I$ and $x = U^{-1}(u)$ in $VS^I$ to get:
$$\overline{VS}^I(u; \lambda^C) := \int_{\underline\theta^I}^{\bar\theta^I} \big[u(\theta)\,\bar w^I(\theta; \lambda^C) + \Lambda(u(\theta))\big]\,dF^I + \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} u(\theta)\,g^C(\theta)\,d\theta.$$
Step 2: Derivation of a candidate solution, which maximizes $\overline{VS}^I$, in two steps.

For $\theta \in \Theta^I$, define $\varphi(y; \theta; \lambda^C) := y\,\bar w^I(\theta; \lambda^C) + \Lambda(y)$ and let
$$\bar u^I(\theta; \lambda^C) := \arg\max_{y \in [U(\underline a), U(\bar a)]} \varphi(y; \theta; \lambda^C) \quad \text{for } \theta \in \Theta^I, \qquad (13)$$
and $\bar u^I(\theta; \lambda^C) = U(\underline a)$ for $\theta \in (\bar\theta^I, \bar\theta^C]$; $\bar u^I$ is the unique pointwise maximizer of $\overline{VS}^I$. Although $\bar u^I(\lambda^C)$ is non-decreasing on $\Theta^I$, it may not be non-decreasing over $\Theta$. The next lemma shows that any monotone maximizer of $\overline{VS}^I$, if it exists, must belong to a certain subclass of $\mathcal U$.
Lemma 8 Suppose $u^I \in \mathcal U$ and $\overline{VS}^I(u^I; \lambda^C) = \max_{u \in \mathcal U} \overline{VS}^I(u; \lambda^C)$. Then, $u^I$ must satisfy
$$u^I(\theta; \lambda^C) = \begin{cases} \bar u^I(\theta; \lambda^C) & \text{if } \theta < \theta^b \\ y^b(\lambda^C) & \text{if } \theta^b \le \theta \le \bar\theta^C, \end{cases}$$
where $\theta^b \in [\underline\theta^I, \bar\theta^I]$ and $y^b(\lambda^C) \ge \bar u^I(\theta^b; \lambda^C)$. Furthermore, if $\theta^b > \underline\theta^I$, then $y^b(\lambda^C) = \bar u^I(\theta^b; \lambda^C)$.
Proof. Suppress $\lambda^C$ and suppose that $u^I \in \mathcal U$ maximizes $\overline{VS}^I$. I claim that $u^I(\theta) = u^I(\bar\theta^I)$ for $\theta \in (\bar\theta^I, \bar\theta^C]$. Otherwise, there exists $\theta_0 \in (\bar\theta^I, \bar\theta^C)$ such that $u^I(\theta) > u^I(\bar\theta^I)$ for every $\theta > \theta_0$. But then $u^I$ can't be optimal within $\mathcal U$: because $g^C < 0$,
$$\int_{\bar\theta^I}^{\bar\theta^C} \big[u^I(\theta) - u^I(\bar\theta^I)\big]\,g^C(\theta)\,d\theta \le \int_{\theta_0}^{\bar\theta^C} \big[u^I(\theta) - u^I(\bar\theta^I)\big]\,g^C(\theta)\,d\theta < 0,$$
so replacing $u^I$ with the constant $u^I(\bar\theta^I)$ on $(\bar\theta^I, \bar\theta^C]$ strictly improves $\overline{VS}^I$.

Now consider $u^I(\theta)$ for $\theta \in \Theta^I$. Recall that $\varphi(y; \theta)$ in (13) is strictly concave in $y$ and continuous in $\theta$. Because $\bar u^I(\theta)$ is continuous and non-decreasing over $\Theta^I$, only two cases can arise.

Case 1: $\bar u^I(\bar\theta^I) \le u^I(\underline\theta^I)$. I claim that $u^I(\theta) = u^I(\underline\theta^I)$ for $\theta \in (\underline\theta^I, \bar\theta^I]$. If not, there exists $\theta_0 > \underline\theta^I$ such that $\bar u^I(\theta) \le u^I(\underline\theta^I) < u^I(\theta)$ for every $\theta \ge \theta_0$. By strict concavity, for $\theta \in (\underline\theta^I, \bar\theta^I]$, $\varphi(u^I(\underline\theta^I); \theta) \ge \varphi(u^I(\theta); \theta)$, with strict inequality at least for every $\theta \ge \theta_0$; hence $\int_{\Theta^I} \varphi(u^I(\underline\theta^I); \theta)\,dF^I > \int_{\Theta^I} \varphi(u^I(\theta); \theta)\,dF^I$, contradicting the optimality of $u^I$.

Case 2: $\bar u^I(\theta^b) > u^I(\underline\theta^I)$ for some $\theta^b \in (\underline\theta^I, \bar\theta^I]$. I claim that $u^I(\theta) = \min\{u_b^I, \bar u^I(\theta)\}$, where $u_b^I := u^I(\bar\theta^I)$. Suppose not. First, consider $(\theta^b, \bar\theta^I]$ and suppose that $u^I(\theta) < u_b^I$ for some $\theta \in (\theta^b, \bar\theta^I]$. Then, by the same argument as in case 1, setting $u^I(\theta) = u_b^I$ for every such $\theta$ is a strict improvement on $u^I$: the resulting function is in $\mathcal U$ and $\int_{\theta^b}^{\bar\theta^I} \varphi(u_b^I; \theta)\,dF^I > \int_{\theta^b}^{\bar\theta^I} \varphi(u^I(\theta); \theta)\,dF^I$. Second, consider $(\underline\theta^I, \theta^b]$ and suppose $u^I(\theta_0) \ne \bar u^I(\theta_0)$ for some $\theta_0$. If $u^I(\theta_0) > \bar u^I(\theta_0)$, then by continuity of $\bar u^I$ and monotonicity of $u^I$ there exists a $\theta_{00} > \theta_0$ such that $u^I(\theta) > \bar u^I(\theta)$ for every $\theta \in (\theta_0, \theta_{00})$. Similarly, if $u^I(\theta_0) < \bar u^I(\theta_0)$, then there exists $\theta_{000} < \theta_0$ such that $u^I(\theta) < \bar u^I(\theta)$ for every $\theta \in (\theta_{000}, \theta_0)$. Finally, since $\bar u^I(\theta)$ is the unique maximizer of $\varphi(\cdot; \theta)$, for $\theta \in (\underline\theta^I, \theta^b]$, $\varphi(\bar u^I(\theta); \theta) \ge \varphi(u^I(\theta); \theta)$, with strict inequality at least for every $\theta \in (\theta_{000}, \theta_0) \cup (\theta_0, \theta_{00})$; hence $\int_{\Theta^I} \varphi(\bar u^I(\theta); \theta)\,dF^I > \int_{\Theta^I} \varphi(u^I(\theta); \theta)\,dF^I$, which contradicts the optimality of $u^I$.

It remains to show that $u^I(\bar\theta^I) > \bar u^I(\bar\theta^I)$ is impossible. Suppose the opposite is true. By the same argument as in case 2, $u^I(\theta) = \bar u^I(\theta)$ for $\theta \in (\underline\theta^I, \bar\theta^I)$. Then setting $u^I(\bar\theta^I) > \bar u^I(\bar\theta^I)$ can't be optimal for the following reason: since $u^I(\theta) = u^I(\bar\theta^I)$ for $\theta \in (\bar\theta^I, \bar\theta^C)$, and $g^C(\theta)$ is negative, reducing $u^I(\bar\theta^I)$ to $\bar u^I(\bar\theta^I)$ satisfies monotonicity and strictly improves $\overline{VS}^I$.

By Lemma 8, $u^I$ must be continuous on $(\underline\theta^I, \bar\theta^C)$. Although Lemma 8 doesn't pin down $u^I$ at $\underline\theta^I$ and $\bar\theta^C$, it is w.l.o.g. to extend $u^I$ at $\underline\theta^I$ and $\bar\theta^C$ by continuity. The next lemma proves that a maximizer of $\overline{VS}^I$ exists, and shows that it is unique over $(\underline\theta^I, \bar\theta^C)$ and, therefore, also over $[\underline\theta^I, \bar\theta^C]$ w.l.o.g..

Lemma 9 There exists $u^I$ such that $\overline{VS}^I(u^I; \lambda^C) = \max_{u \in \mathcal U} \overline{VS}^I(u; \lambda^C)$; such an $u^I$ is unique.
Proof. Suppress C . By Lemma 8, if a solution uI exists, it can take only two forms: (1)
C
C
uI equals a constant y
uI ( I ) over [ I ; ], (2) uI is constant at uI b over [ b ; ], with
b
I
b
, and equals uI ( ) for every
. So we only need to look for a solution within this
subclass of U.
C
I
f I (uI ): the …rst
Subclass 1: If uI constant at y over [ I ; ], then VS (uI ) =VS I (uI ) = W
equality follows from
R1
R I I
wI ( )]dF I = 0 [z (q) ! (q)] dq = 0;
I [w ( )
the second from Proposition 1. Moreover, if uI is constant at y over [ I ;
R I
f I (uI ) = y I ( = I )dF I + (y).
W
C
]
(14)
By continuity and strictly concavity, there exists a unique constant function that maximizes
I
VS (uI ). Call it uI1 .
I
Subclass 2: Using (y; ) in (13), VS equals to the following function of b :
R I
R b
b
= I (uI ( ) ; )dF I + b (uI b ; )dF I + uI b K,
(15)
where K :=
C
R
C
I
g C ( ) d . Continuity of uI implies that
b
I
b
is continuous, therefore there
I
exists an optimal 2 [ ; ] that completely identi…es a maximizer within the second subclass
of functions. Since uI can be locally ‡at, there can be multiple solutions b . Nonetheless, I
claim that there can’t be two solutions 1b and 2b such that uI ( 1b ) 6= uI ( 2b ). Suppose to the
b
contrary that 1b < 2b both maximize
, and uI ( 1b ) < uI ( 2b ). W.l.o.g. assume that 1b
is the largest such that uI ( ) = uI ( 1b ). Let u1 and u2 be the functions corresponding to 1b
C
e := u1 + (1
and 2b , and for 2 (0; 1) let u
) u2 2 U. For 2 ( 1b ; ], u2 ( ) 6= u1 ( ),
whereas for 2 [ I ; 1b ], u2 ( ) = u1 ( ) = uI ( ). By the strict concavity of (y; ), we have
R 1b
R I
e( )K >
(e
u ( ) ; )dF I + 1b (e
u ( ) ; )dF I + u
( 1b ) + (1
) ( 2b ).
I
b
b
C
e is constant on [ 2b ; ] at some uI (~ ), with ~ 2 ( 1b ; 2b ). Therefore, the function
Note that u
b
uI ( ) = minfuI (~ ); uI ( )g satis…es property (2) and, by the same argument as in Lemma 8
(Case 2),
R I
R 1b
b
e ( ) K > ( 2b ).
(e
u ( ) ; )dF I ( ) + 1b (e
u ( ) ; )dF I ( ) + u
(~ )
I
b
The claim follows. Hence any maximizer of
yields a unique uI that satis…es property (2).
I
Call it u2 .
I
I
By an argument similar to that for the uniqueness of uI2 , VS (uI2 ) = VS (uI1 ) if and only if
C
I
uI2
uI1 (over ( I ; )). Therefore, the overall maximizer of VS is unique, and equals uI1 if
I
I
VS (uI1 ) VS (uI2 ), and uI2 otherwise.
Step 3: The unique maximizer of $\overline{VS}^I$, denoted $u^I(\lambda^C)$, is also the unique maximizer of $VS^I(U^{-1}(u))$. The argument modifies Toikka's (2011) proof of Theorem 3.7 and Corollary 3.9 to account for $(\bar\theta^I, \bar\theta^C]$.

Lemma 10 The function $u^I(\lambda^C)$ is the unique maximizer of $VS^I(U^{-1}(u))$.

Proof. Suppress $\lambda^C$. Integrating by parts and recalling that $u \in \mathcal U$, we have
$$\int_{\underline\theta^I}^{\bar\theta^I} u(\theta)\,\big[w^I(\theta) - \bar w^I(\theta)\big]\,dF^I = \int_{\underline\theta^I}^{\bar\theta^I} u(\theta)\,\big[z(F^I(\theta)) - \omega(F^I(\theta))\big]\,dF^I$$
$$= u(\theta)\,\big[Z(F^I(\theta)) - \Phi(F^I(\theta))\big]\Big|_{\underline\theta^I}^{\bar\theta^I} - \int_{\underline\theta^I}^{\bar\theta^I} \big[Z(F^I(\theta)) - \Phi(F^I(\theta))\big]\,du(\theta)$$
$$= \int_{\underline\theta^I}^{\bar\theta^I} \big[\Phi(F^I(\theta)) - Z(F^I(\theta))\big]\,du(\theta) \le 0.$$
The last equality follows from $Z(0) = \Phi(0)$ and $Z(1) = \Phi(1)$; the inequality follows from $u \in \mathcal U$ and $\Phi(q) \le Z(q)$ for $q \in [0,1]$. Re-writing $VS^I(U^{-1}(u))$, we have
$$\sup_{u \in \mathcal U} VS^I(U^{-1}(u)) = \sup_{u \in \mathcal U} \Big\{\overline{VS}^I(u) + \int_{\underline\theta^I}^{\bar\theta^I} \big[\Phi(F^I(\theta)) - Z(F^I(\theta))\big]\,du(\theta)\Big\}.$$
We know that $u^I \in \mathcal U$ and achieves the supremum of $\overline{VS}^I(u)$. We only have to show that
$$\int_{\underline\theta^I}^{\bar\theta^I} \big[\Phi(F^I(\theta)) - Z(F^I(\theta))\big]\,du^I(\theta) = 0. \qquad (16)$$
If $u^I$ is constant over $[\underline\theta^I, \bar\theta^C]$, then $du^I \equiv 0$ and we are done. Otherwise, consider the pointwise solution $\bar u^I$ over $[\underline\theta^I, \bar\theta^I]$ as defined in (13), and a $\theta$ such that $\Phi(F^I(\theta)) < Z(F^I(\theta))$. For some open interval $N$ around $\theta$, $\omega(F^I(\cdot))$ — and hence $\bar w^I$ — is constant, and $\bar u^I$ is constant over $N$. Therefore, $d\bar u^I(\theta)$ assigns zero measure to any such $N$, and satisfies (16). I claim that also $du^I$ does so to any such $N$. Consider $[\theta^b, \bar\theta^C]$, over which $u^I$ is constant at $\bar u^I(\theta^b)$. If $N \subseteq [\theta^b, \bar\theta^C]$, the claim is immediate. The same holds if $N \cap [\theta^b, \bar\theta^C] = \emptyset$, because then $u^I(\theta) = \bar u^I(\theta)$ for $\theta \in N$. Finally, if both $N \cap [\theta^b, \bar\theta^C] \ne \emptyset$ and $N \cap [\underline\theta^I, \theta^b) \ne \emptyset$ (therefore, $\theta^b > \underline\theta^I$), then $u^I$ equals $\bar u^I(\theta^b)$ for every $\theta \in [\theta^b, \bar\theta^C] \cup N$, which implies the claim. Hence (16) holds also for a non-constant $u^I$.

By Lemma 9, for any $\tilde u \in \mathcal U$ that differs from $u^I$ over $(\underline\theta^I, \bar\theta^C)$, $\overline{VS}^I(\tilde u) < \overline{VS}^I(u^I)$. Uniqueness follows on $(\underline\theta^I, \bar\theta^C)$, and extending it to $[\underline\theta^I, \bar\theta^C]$ is w.l.o.g..
Part 2: Continuity and Limit Behavior of $u^I$

I proved continuity of $u^I$ in $\theta$ in Part 1; I now prove continuity in $\lambda^C$. By the definition of $\bar u^I(\theta; \cdot)$ in (13) and the Maximum Theorem, $\bar u^I(\theta; \cdot)$ is continuous in $\lambda^C$ for $\theta \in [\underline\theta^I, \bar\theta^I]$. Now consider $\Sigma(\theta^b; \lambda^C)$ in (15). Pointwise continuity of $\bar w^I(\theta; \cdot)$ and $\bar u^I(\theta; \cdot)$ implies that $\Sigma(\theta^b; \lambda^C)$ is continuous in $\lambda^C$, hence $\theta^b(\lambda^C) := \arg\max_{\theta \in \Theta^I} \Sigma(\theta; \lambda^C)$ is u.h.c.. Furthermore, recall that for every $\theta, \theta' \in \theta^b(\lambda^C)$, $\bar u^I(\theta; \lambda^C) = \bar u^I(\theta'; \lambda^C)$. Take any sequence $\{\lambda_n^C\}$ with $\lim_{n\to\infty} \lambda_n^C = \lambda^C$. Then, $\lim_{n\to\infty} \theta^b(\lambda_n^C) = \theta^b(\lambda^C)$. Recall that the candidate $u_2^I(\lambda^C)$ maximizing $\Sigma(\cdot; \lambda^C)$ satisfies $u_2^I(\theta; \lambda^C) = \min\{\bar u^I(\theta^b(\lambda^C); \lambda^C), \bar u^I(\theta; \lambda^C)\}$ for $\theta \in [\underline\theta^I, \bar\theta^I]$, and $u_2^I(\theta; \lambda^C) = \bar u^I(\theta^b(\lambda^C); \lambda^C)$ for $\theta > \bar\theta^I$. Hence, by continuity of $\bar u^I$, $\lim_{n\to\infty} u_2^I(\theta; \lambda_n^C) = u_2^I(\theta; \lambda^C)$ for every $\theta \in [\underline\theta^I, \bar\theta^C]$. Finally, recall that the constant solution $u_1^I$ in the proof of Lemma 9, as well as (14), is independent of $\lambda^C$. It remains to show that the actual solution $u^I(\lambda_n^C)$ converges pointwise to the actual solution $u^I(\lambda^C)$. Suppose first that $VS^I(U^{-1}(u_1^I)) > VS^I(U^{-1}(u_2^I(\lambda^C))) = \Sigma(\theta^b(\lambda^C); \lambda^C)$. By continuity of $\Sigma$, there exists $N$ such that $n \ge N$ implies $VS^I(U^{-1}(u_1^I)) > VS^I(U^{-1}(u_2^I(\lambda_n^C)))$. Therefore, for $n \ge N$, $u^I(\theta; \lambda_n^C) = u_1^I$ for every $\theta \in [\underline\theta^I, \bar\theta^C]$. Now, suppose that $VS^I(U^{-1}(u_1^I)) < VS^I(U^{-1}(u_2^I(\lambda^C)))$. Similarly, for $n$ large enough $u^I(\lambda_n^C) = u_2^I(\lambda_n^C)$, which converges pointwise to $u^I(\lambda^C)$. Finally, if $VS^I(U^{-1}(u_1^I)) = VS^I(U^{-1}(u_2^I(\lambda^C)))$, then $u_1^I \equiv u_2^I(\lambda^C)$. Hence
$$\big|u_1^I - u^I(\theta; \lambda_n^C)\big| \le \max\big\{0,\; \big|u_1^I - u_2^I(\theta; \lambda_n^C)\big|\big\} \to 0 \text{ as } n \to \infty.$$

To prove that $u^I(\lambda^C) \to u^I = U(x^I)$ pointwise as $\lambda^C \to 0$, it is enough to observe that $VS^I(U^{-1}(u); 0) = \widetilde W^I(u)$. Therefore, $u^I(\theta; 0) = u^I(\theta)$ for $\theta \in (\underline\theta^I, \bar\theta^I)$, which can be extended to $[\underline\theta^I, \bar\theta^C]$ by letting $u^I(\underline\theta^I; 0) = u^I(\underline\theta^I)$ and $u^I(\theta; 0) = u^I(\bar\theta^I)$ for $\theta \ge \bar\theta^I$. Finally, I prove that $\max_\theta \big|u^I(\theta; \lambda^C) - U(a^{nf})\big| \to 0$ as $\lambda^C \to +\infty$ by first showing pointwise convergence. Recall that $u^I(\lambda^C)$ maximizes $VS^I(U^{-1}(u); \lambda^C) = \widetilde W^I(u) - \lambda^C \widetilde R^C(u)$ and that, by Proposition 1, $\widetilde R^C(u) > 0$ for any $u \in \mathcal U$ that is not constant on $(\underline\theta^I, \bar\theta^C)$. Clearly, $u^I(\lambda^C)$ cannot converge to a constant function that takes value $y_0 \ne U(a^{nf})$, because $U(a^{nf})$ is the unique maximizer of $\widetilde W^I(\cdot)$ in (14). Now, suppose that $u^I(\lambda^C)$ converges pointwise to a function, denoted $u_\infty^I$, that is not constant on $(\underline\theta^I, \bar\theta^C)$. Then, there exists $\hat\lambda^C$ large enough so that for every $\lambda^C > \hat\lambda^C$, $u_\infty^I$ is strictly dominated by $U(a^{nf})$. This is because $\widetilde W^I(u_\infty^I)$ is bounded and $\widetilde R^C(u_\infty^I) > 0$; hence there exists $\hat\lambda^C \ge 0$ so that $\widetilde W^I(u_\infty^I) - \hat\lambda^C \widetilde R^C(u_\infty^I) \le \widetilde W^I(U(a^{nf}))$. Now, consider the unique extension of $u^I(\lambda^C)$ by continuity. By monotonicity, we have
$$\max_\theta \big|u^I(\theta; \lambda^C) - U(a^{nf})\big| = \max\big\{\big|u^I(\underline\theta^I; \lambda^C) - U(a^{nf})\big|,\; \big|u^I(\bar\theta^C; \lambda^C) - U(a^{nf})\big|\big\},$$
which completes the argument.
Part 3: Properties (a)-(c) of $u^I$

Property (b): Suppress $\lambda^C$. Recall: (1) $u^I$ satisfies Lemma 8; (2) for $\theta \in \Theta^I$, $\bar u^I$ is defined by (13) and is continuous; (3) $\Sigma(\cdot)$ is as in (15). Suppose $\theta^b > \underline\theta^I$. For $\theta < \theta^b$, $u^I(\theta) = \bar u^I(\theta) < \bar u^I(\theta^b) = u^I(\theta^b)$ by Lemma 8, and $\Sigma(\theta^b) \ge \Sigma(\theta)$ by construction; hence
$$\frac{\Sigma(\theta^b) - \Sigma(\theta)}{\bar u^I(\theta^b) - \bar u^I(\theta)} = \int_\theta^{\theta^b} \frac{\varphi(\bar u^I(y); y) - \varphi(\bar u^I(\theta); y)}{\bar u^I(\theta^b) - \bar u^I(\theta)}\,dF^I + \int_{\theta^b}^{\bar\theta^I} \bar w^I(y)\,dF^I + \frac{\Lambda(\bar u^I(\theta^b)) - \Lambda(\bar u^I(\theta))}{\bar u^I(\theta^b) - \bar u^I(\theta)}\big(1 - F^I(\theta^b)\big) + K \ge 0.$$
Since $\bar u^I$ and $\bar w^I$ are non-decreasing,
$$\int_\theta^{\theta^b} \frac{\varphi(\bar u^I(y); y) - \varphi(\bar u^I(\theta); y)}{\bar u^I(\theta^b) - \bar u^I(\theta)}\,dF^I \le \max\{|\bar w^I(\underline\theta^I)|, |\bar w^I(\theta^b)|\}\,\big(F^I(\theta^b) - F^I(\theta)\big).$$
Since $\bar u^I$ is continuous, there exists $\tilde\theta \in [\underline\theta^I, \theta^b)$ such that $\bar u^I(\theta)$ is interior for $\theta \in (\tilde\theta, \theta^b)$. By the Mean Value Theorem, for every $\theta \in (\tilde\theta, \theta^b)$, there exists $\xi \in (\theta, \theta^b)$ such that
$$\frac{\Lambda(\bar u^I(\theta^b)) - \Lambda(\bar u^I(\theta))}{\bar u^I(\theta^b) - \bar u^I(\theta)} = \Lambda'(\bar u^I(\xi)) = -\bar w^I(\xi).$$
Taking the limit $\theta \uparrow \theta^b$, both bounding terms vanish, and therefore
$$\lim_{\theta \uparrow \theta^b} \frac{\Sigma(\theta^b) - \Sigma(\theta)}{\bar u^I(\theta^b) - \bar u^I(\theta)} = \int_{\theta^b}^{\bar\theta^I} \bar w^I(y)\,dF^I(y) + K + \Lambda'(\bar u^I(\theta^b))\,\big(1 - F^I(\theta^b)\big) \ge 0. \qquad (17)$$
It follows that $\theta^b < \bar\theta^I$ because $K < 0$ and $\Lambda'(\cdot) < 0$.

I claim that there exists $\theta \in (\theta^b, \bar\theta^I]$ such that $\bar u^I(\theta) > \bar u^I(\theta^b)$. Suppose not. If $\bar u^I(\theta^b)$ is interior, $\bar w^I(\theta) = -\Lambda'(\bar u^I(\theta^b))$ for every $\theta \ge \theta^b$, and (17) is violated. If $\bar u^I(\theta^b) = U(\bar a)$ and the set $\Theta^a := \{\theta \in \Theta^I \mid \bar w^I(\theta) > -\Lambda'(U(\bar a))\}$ is non-empty — if $\Theta^a = \emptyset$, we are back to the previous case — then (17) is violated again. To see this, note that since $\theta^b$ is the smallest $\theta$ for which $\bar w^I(\theta) = -\Lambda'(U(\bar a))$, $Z(F^I(\theta^b)) = \Phi(F^I(\theta^b))$, which implies $\int_{\theta^b}^{\bar\theta^I} \bar w^I(y)\,dF^I = \int_{\theta^b}^{\bar\theta^I} w^I(y)\,dF^I$; hence
$$\int_{\theta^b}^{\bar\theta^I} \big[w^I(y) + \Lambda'(U(\bar a))\big]\,dF^I + K = \int_{\theta^b}^{\bar\theta^I} \big(y/\beta^I + \Lambda'(U(\bar a))\big)\,dF^I + \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy + \lambda^C \int_{\theta^b}^{\bar\theta^I} V^I(y)\,dF^I.$$
By Assumption 1 and concavity, $\bar\theta^I/\beta^I + \Lambda'(U(\bar a)) < 0$, so the first integral is negative. The sum of the remaining terms is too, contradicting (17). To see this, integrating by parts and changing variables yields
$$\lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy + \lambda^C \int_{\theta^b}^{\bar\theta^I} V^I(y)\,dF^I < 0, \qquad (18)$$
where the inequality follows from $\bar\theta^I > \theta^b > \underline\theta^I > 0$ and $0 < \beta^I < \beta^C \le 1$.

Now define $\theta^1 := \max\{\theta \mid \bar u^I(\theta) = \bar u^I(\theta^b)\} < \bar\theta^I$. For $\theta > \theta^1$, $\bar u^I(\theta) > \bar u^I(\theta^1)$ and $\Sigma(\theta) \le \Sigma(\theta^1) = \Sigma(\theta^b)$. The same steps that led to (17) yield
$$\lim_{\theta \downarrow \theta^1} \frac{\Sigma(\theta^1) - \Sigma(\theta)}{\bar u^I(\theta^1) - \bar u^I(\theta)} = \int_{\theta^1}^{\bar\theta^I} \bar w^I(y)\,dF^I + K + \Lambda'(\bar u^I(\theta^1))\,\big(1 - F^I(\theta^1)\big) \le 0.$$
As $\bar u^I(y)$ is interior and constant over $[\theta^b, \theta^1]$, $\Lambda'(\bar u^I(y)) = -\bar w^I(y) = -\bar w^I(\theta^b)$, and
$$\int_{\theta^1}^{\bar\theta^I} \big[\bar w^I(y) - \bar w^I(\theta^1)\big]\,dF^I + K = \int_{\theta^b}^{\bar\theta^I} \big[\bar w^I(y) - \bar w^I(\theta^b)\big]\,dF^I + K \le 0.$$
Finally, since $Z(F^I(\theta^b)) = \Phi(F^I(\theta^b))$, the same argument as in Lemma 7 yields $\bar w^I(\theta^b) \ge w^I(\theta^b)$. Therefore,
$$\int_{\theta^b}^{\bar\theta^I} \big[w^I(y) - w^I(\theta^b)\big]\,dF^I + K \le \int_{\theta^b}^{\bar\theta^I} \big[\bar w^I(y) - \bar w^I(\theta^b)\big]\,dF^I + K \le 0,$$
which gives, by rearranging $w^I(y)$,
$$\int_{\theta^b}^{\bar\theta^I} \big[w^I(\theta^b; \lambda^C) - \theta/\beta^I\big]\,dF^I(\theta) \ge \lambda^C \int_{\theta^b}^{\bar\theta^I} V^I(\theta)\,dF^I(\theta) - \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} \big(1 - F^C(\theta)\big)\,d\theta. \qquad (19)$$
Property (a): Since both $u^I(\lambda^C)$ and $u^I = U(x^I)$ are continuous and non-decreasing, it is enough to prove that $u^I(\underline\theta^I; \lambda^C) > u^I(\underline\theta^I)$ and $u^I(\bar\theta^I; \lambda^C) < u^I(\bar\theta^I)$.

Case 1: $u^I$ not constant. This implies that $\theta^b > \underline\theta^I$, and by Lemma 8 $u^I(\bar\theta^I; \lambda^C) = \bar u^I(\theta^b; \lambda^C) = u^I(\theta^b; \lambda^C)$. To show that $u^I(\bar\theta^I; \lambda^C) < u^I(\bar\theta^I)$ for $\lambda^C > 0$, it is enough to prove that $w^I(\theta^b; \lambda^C) < \bar\theta^I/\beta^I$. This inequality holds because $\theta^b$ and $w^I(\theta^b; \lambda^C)$ must satisfy (see the proof of Property (b))
$$\int_{\theta^b}^{\bar\theta^I} \big(w^I(\theta^b; \lambda^C) - y/\beta^I\big)\,dF^I = \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy + \lambda^C \int_{\theta^b}^{\bar\theta^I} V^I(y)\,dF^I < 0.$$
I now show that $u^I(\underline\theta^I; \lambda^C) > u^I(\underline\theta^I)$. Given $\lambda^C > 0$, let $\theta_b := \max\{\theta \mid \bar w^I(\theta; \lambda^C) = \bar w^I(\underline\theta^I; \lambda^C)\}$.

Lemma 11 $\bar w^I(\underline\theta^I; \lambda^C) \le w^I(\underline\theta^I; \lambda^C)$. Moreover, if the inequality is strict, then $\theta_b > \underline\theta^I$.

Proof. Suppress $\lambda^C$ and recall that $\bar w^I(\underline\theta^I) = \omega(0)$ and $w^I(\underline\theta^I) = z(0)$. If $\omega(0) > z(0)$, the same argument as in Lemma 13 leads to a contradiction. Suppose that $\omega(0) < z(0)$ and let $\hat q := \sup\{q \mid \forall q' < q,\; \omega(q') < z(q')\}$; $\hat q > 0$ by continuity. Then, for every $0 < q < \hat q$,
$$Z(q) = Z(0) + \int_0^q z(y)\,dy > \Phi(0) + \int_0^q \omega(y)\,dy = \Phi(q).$$
It follows that $\theta_b \ge (F^I)^{-1}(\hat q) > \underline\theta^I$.

So if $\bar w^I(\underline\theta^I; \lambda^C) = w^I(\underline\theta^I; \lambda^C)$, it equals $(\underline\theta^I/\beta^I)\big(1 + \lambda^C(1-\beta^I)\big) + \lambda^C\,(f^I(\underline\theta^I))^{-1} > \underline\theta^I/\beta^I$. If instead $\bar w^I(\underline\theta^I; \lambda^C) < w^I(\underline\theta^I; \lambda^C)$, then $\bar w^I(\cdot; \lambda^C)$ is constant over $[\underline\theta^I, \theta_b]$ at $\bar w^I(\theta_b; \lambda^C)$. Since at $\theta_b$ it must be that $Z(F^I(\theta_b); \lambda^C) = \Phi(F^I(\theta_b); \lambda^C)$, $\theta_b$ must satisfy
$$\int_{\underline\theta^I}^{\theta_b} \big[w^I(y; \lambda^C) - w^I(\theta_b; \lambda^C)\big]\,dF^I = 0 \qquad (20)$$
or equivalently,
$$\int_{\underline\theta^I}^{\theta_b} \big(y/\beta^I - w^I(\theta_b; \lambda^C)\big)\,dF^I = -\lambda^C \int_{\underline\theta^I}^{\theta_b} V^I(y)\,dF^I. \qquad (21)$$
Integrating by parts, we have
$$\int_{\underline\theta^I}^{\theta_b} V^I(y)\,dF^I = \int_{\underline\theta^I}^{\theta_b} v^I(y)\,dF^I - \int_{\underline\theta^I}^{\theta_b} v^C(y)\,dF^C = \int_{\underline\theta^I}^{\theta_b} (y/\beta^I - \theta_b)\,dF^I - \int_{\underline\theta^I}^{\theta_b} (y/\beta^C - \theta_b)\,dF^C = \int_{\theta_b/\beta^C}^{\theta_b/\beta^I} (s - \theta_b)\,dF > 0,$$
where the last equality follows from a change of variables and the inequality from $\theta_b > \underline\theta^I > 0$ and $0 < \beta^I < \beta^C \le 1$. Therefore the left hand side of (21) must be positive, and $w^I(\theta_b; \lambda^C) > \underline\theta^I/\beta^I$. In either case, $u^I(\underline\theta^I; \lambda^C)$ must be interior and strictly greater than $u^I(\underline\theta^I)$.

Case 2: $u^I$ constant. From the proof of Lemma 9, $u^I(\theta; \lambda^C)$ equals $U(a^{nf})$ for every $\theta$. Since $\underline\theta^I/\beta^I < E(s) < \bar\theta^I/\beta^I$, Assumption 1 implies $u^I(\underline\theta^I) < U(a^{nf}) < u^I(\bar\theta^I)$.

Property (c): Let $\underline\theta^I < \theta_0 \le \theta^m$ (recall $\theta^m = \min\{\bar\theta^I, \bar\theta^C\}$) and consider
$$w^I(\theta_0; \lambda^C) - w^I(\underline\theta^I; \lambda^C) = \frac{\theta_0 - \underline\theta^I}{\beta^I}\big(1 + \lambda^C(1-\beta^I)\big) - \lambda^C \Big[\frac{F^I(\theta_0)}{f^I(\theta_0)} - \frac{F^I(\underline\theta^I)}{f^I(\underline\theta^I)}\Big]. \qquad (22)$$
Although the first part of (22) is always positive since $0 < \beta^I < 1$, the second part of (22) can be negative. Therefore, $w^I(\cdot; \lambda^C)$ can be non-increasing in a neighborhood of $\underline\theta^I$. If so, then $\bar w^I(\cdot; \lambda^C)$ and $u^I(\lambda^C)$ must be constant over $[\underline\theta^I, \theta_b] \ne \emptyset$.

Now note that $\dfrac{F^I(\theta_0)}{f^I(\theta_0)} - \dfrac{F^I(\underline\theta^I)}{f^I(\underline\theta^I)} = \beta^I\Big[\dfrac{F(\theta_0/\beta^I)}{f(\theta_0/\beta^I)} - \dfrac{F(\underline\theta^I/\beta^I)}{f(\underline\theta^I/\beta^I)}\Big]$. Therefore, the condition stated in part (c) of Proposition 2 implies that, for all $\theta_0$ in $[\underline\theta^I, \min\{\beta^I s^y, \theta^m\}]$,
$$\frac{w^I(\theta_0; \lambda^C) - w^I(\underline\theta^I; \lambda^C)}{\lambda^C(\theta_0 - \underline\theta^I)} = \frac{1 + \lambda^C(1-\beta^I)}{\lambda^C \beta^I} - \beta^I\,\frac{F(\theta_0/\beta^I)/f(\theta_0/\beta^I) - F(\underline\theta^I/\beta^I)/f(\underline\theta^I/\beta^I)}{\theta_0 - \underline\theta^I} \le 0,$$
which in turn implies that $\bar w^I(\cdot; \lambda^C)$ and $u^I(\lambda^C)$ must be constant over $[\underline\theta^I, \theta_b] \ne \emptyset$.

9.6 Proof of Lemma 3
For the uniform distribution $F^i$, using expression (11), we have
$$w^I(\theta; \lambda^C) = \begin{cases} (\theta/\beta^I)\big(1 + \lambda^C(1 - 2\beta^I)\big) + \lambda^C \beta^I \bar\theta^C & \text{if } \theta \in [\underline\theta^I, \bar\theta^C) \\ (\theta/\beta^I)\big(1 + \lambda^C(\beta^I - 1)^2\big) & \text{if } \theta \in [\bar\theta^C, \bar\theta^I]. \end{cases}$$
The function $w^I$ is continuous at $\bar\theta^C$. It is increasing and greater than $\theta/\beta^I$ over $[\bar\theta^C, \bar\theta^I]$, because $\lambda^C > 0$ and $\beta^I < 1$; $w^I$ is increasing over $[\underline\theta^I, \bar\theta^C)$ if and only if $\beta^I \le 1/2$ or $\lambda^C < (2\beta^I - 1)^{-1} =: \bar\lambda^C$.

Consider first the behavior of $\theta^b$ and $\theta_b$, when $\theta^b > \theta_b$ and $x^I$ is not constant. If $\beta^I \le 1/2$ or $\lambda^C < \bar\lambda^C$, then $w^I$ is increasing and coincides with $\bar w^I$ (see the proof of Proposition 2); hence $\bar u^I$ (see (13)) is increasing over $[\underline\theta^I, \bar\theta^C]$, which implies $\theta_b = \underline\theta^I$. Otherwise, $\theta_b > \underline\theta^I$ and $\theta_b$ is characterized by (20), which for the uniform distribution boils down to an explicit equation, (23), relating $(\theta_b)^2$, $(\underline\theta^I)^2$, and $\lambda^C$.

Since $w^I$ is always increasing over $[\theta_b, \bar\theta^I]$, $\bar w^I$ coincides with $w^I$ there. Using (19), $\theta^b$ must satisfy
$$\int_{\theta^b}^{\bar\theta^I} \big[w^I(y; \lambda^C) - w^I(\theta^b; \lambda^C)\big]\,dF^I = -\lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy. \qquad (24)$$
Since
$$\frac{\partial}{\partial \theta^b} \int_{\theta^b}^{\bar\theta^I} \big[w^I(y; \lambda^C) - w^I(\theta^b; \lambda^C)\big]\,dF^I = -\frac{\partial w^I(\theta^b; \lambda^C)}{\partial \theta^b}\,\big(F^I(\bar\theta^I) - F^I(\theta^b)\big) < 0$$
for $\lambda^C > 0$, there is a unique $\theta^b > \theta_b$ that satisfies (24). Letting $K = \lambda^C \int_{\bar\theta^I}^{\bar\theta^C} g^C(y)\,dy < 0$, condition (24) becomes an explicit piecewise equation in $\theta^b$, with one branch for $\theta^b \ge \bar\theta^C$ and one for $\theta^b < \bar\theta^C$.

If $\beta^I > 1/2$, the function $\theta_b(\lambda^C)$ is constant at $\underline\theta^I$ for $\lambda^C < \bar\lambda^C$, and at $\bar\lambda^C$ it jumps from $\underline\theta^I$ to $\bar\theta^C$. Monotonicity for $\lambda^C > \bar\lambda^C$ follows by applying the Implicit Function Theorem to (23): $d\theta_b/d\lambda^C > 0$. Similarly, for $\theta^b(\lambda^C)$ we have $d\theta^b/d\lambda^C < 0$ in both branches; for the second branch, recall that $\theta_b < \theta^b < \bar\theta^C$ if and only if $\beta^I \le 1/2$ or $\lambda^C < \bar\lambda^C$.

Consider now the behavior of $u^I(\lambda^C) = U(x^I(\lambda^C))$, which matches that of $x^I(\lambda^C)$. By Proposition 2 and Assumption 1, $u^I(\theta; \lambda^C) \in (U(\underline a), U(\bar a))$. Also, $\bar u^I(\theta; \lambda^C) = \arg\max_{y \in [U(\underline a), U(\bar a)]} \{y\,\bar w^I(\theta; \lambda^C) + \Lambda(y)\}$. By strict concavity of $y\,w + \Lambda(y)$, it is enough to consider the properties of $\bar w^I(\cdot; \lambda^C)$ in relation with the function $\theta/\beta^I$. The function $\bar w^I(\cdot; \lambda^C)$ crosses $\theta/\beta^I$ only once, at some $\underline\theta^I < \theta < \bar\theta^I$. Furthermore, for $\theta \in [\theta_b, \theta^b]$, $\bar w^I(\theta; \lambda^C) = w^I(\theta; \lambda^C)$. Hence, it is enough to show that, as $\lambda^C$ rises, $w^I(\theta^b(\lambda^C); \lambda^C)$ falls and $w^I(\theta_b(\lambda^C); \lambda^C)$ rises.

Lemma 12 Suppose $\theta^b$ and $\theta_b$ are characterized by (19) and (20). If $w^I(\theta^b; \lambda^C) > 0$ and $w^I(\theta_b; \lambda^C) > 0$, then $\frac{d}{d\lambda^C} w^I(\theta^b(\lambda^C); \lambda^C) < 0$ and $\frac{d}{d\lambda^C} w^I(\theta_b(\lambda^C); \lambda^C) > 0$.

Proof. It follows by applying the Implicit Function Theorem to (19) and (20).

Consider $w^I(\theta_b(\lambda^C); \lambda^C)$. If $\beta^I \le 1/2$ or $\lambda^C < \bar\lambda^C$, then $\theta_b(\lambda^C) = \underline\theta^I$ and, by the expression above, $w^I(\underline\theta^I; \lambda^C) > 0$. Consider now the case with $\beta^I > 1/2$. As $\lambda^C \uparrow \bar\lambda^C$, $w^I(\underline\theta^I; \lambda^C) \uparrow w^I(\underline\theta^I; \bar\lambda^C) = w^I(\bar\theta^C; \bar\lambda^C)$. By Lemma 12, $w^I(\theta_b(\lambda^C); \lambda^C)$ increases in $\lambda^C$ for $\lambda^C > \bar\lambda^C$, because here $w^I(\theta_b(\lambda^C); \lambda^C)$ is positive when $\theta_b > \bar\theta^C$. Hence, $\lambda^C \downarrow \bar\lambda^C$ implies $w^I(\theta_b(\lambda^C); \lambda^C) \downarrow w^I(\bar\theta^C; \bar\lambda^C)$. By the same argument, $w^I(\theta^b(\lambda^C); \lambda^C)$ decreases in $\lambda^C$, again because here $w^I(\theta^b(\lambda^C); \lambda^C)$ is positive when $\theta_b < \theta^b$.
9.7 Proof of Lemma 4

Using (4) and splitting the integral at the state $s^*$ such that $s^*\beta^I = \underline s\,\beta^C$, we have
$$R^I(x^C) = -\int_{\underline s}^{s^*} \Phi(s|x^C)\,dF - \int_{s^*}^{\bar s} \Phi(s|x^C)\,dF.$$
For every $s \le s^*$, since $x^C(s\beta^I) = x^C(\underline s\,\beta^C)$,
$$\Phi(s|x^C) = s(1-\beta^C)\,U(x^C(s\beta^C)) - s(1-\beta^I)\,U(x^C(\underline s\,\beta^C)) + \int_{s\beta^I}^{s\beta^C} U(x^C(y))\,dy,$$
and because $x^C$ is continuous, $\Phi(s|x^C) \to 0$ as $s \to \underline s$. Now consider $R^C(x^I)$. Recall that, since $\underline s\,\beta^I < \underline s\,\beta^C$, $\Phi(s|x^I) > 0$ for $s < \bar s$. For convenience, fix $s_0 := \frac{1}{2}(\underline s + \bar s)$. By continuity, $\min_{[\underline s, s_0]} \Phi(s|x^I) = \delta > 0$. Choose $s_{\delta/2} > \underline s$ so that $\Phi(s|x^C) \le \delta/2$ for $s \in [\underline s, s_{\delta/2}]$. Finally, let $s_1 := \min\{s_0, s_{\delta/2}\}$. We have:
$$R^C(x^I) \ge \int_{\underline s}^{s_1} \Phi(s|x^I)\,dF \ge \delta\,F(s_1), \qquad \big|R^I(x^C)\big| \le \sup_{s > s_1} \Phi(s|x^C)\,\big(1 - F(s_1)\big) + \frac{\delta}{2}\,F(s_1).$$
Therefore, $R^C(x^I) > \big|R^I(x^C)\big|$ if $F(s_1)\big/\big(1 - F(s_1)\big) > \dfrac{2}{\delta}\,\sup_{s > s_1} \Phi(s|x^C)$.

9.8 Proof of Proposition 3
Using EU, we have
R
RI (xC ) =
I
1
U (xC ( ))
I
R
C
1
U (xC ( ))
C
R
U (xC (y))dy dF I
R
U (xC (y))dy dF C .
Changing the order of integration and rearranging, we can express RI (xC ) as
RI (xC ) =
where the function g I : [ I ;
R
C
C
I
U (xC ( ))g I ( ) d +
] ! R is given by
I
g I ( ) :=
and the function V C : [
C
;
C
C
V C ( ) :=
C
C
U (xC ( ))V C ( ) dF C ,
(25)
fI ( ) + FI ( ) ,
] ! R is given by
1
C
1
I
R
1
FC ( )
fC ( )
fI ( )
fC ( )
I
1
I
1
FI ( )
.
fI ( )
Then, minimizing RI (xC ) over M while ensuring that xC C = xC C ( j C means restrictR I
ing xC to C ) corresponds to maximizing C U (x( ))g I ( ) d by choosing a non-decreasing
function x : [ I ; C ) ! [a; xC ( C )]. Although g I ( I ) < 0 and by continuity also for close to
I
, nothing ensures that g I ( ) is always positive, or at least non-decreasing. I therefore apply
Toikka’s (2011).
43
))],
Let Fe be the uniform distribution on [ I ; C ] and Fe
increasing inverse function. For q 2 [0; 1], de…ne
z (q) := g I (Fe
1
: [0; 1] ! [ I ;
1
(q)) and Z (q) :=
Rq
0
C
] its continuous and
z (y) dy.
Let
:= convZ be the convex hull of Z on [0; 1] (see, e.g., Rockafellar (1970), p.36), i.e.,
the highest convex function on [0; 1] that satis…es
Z. By de…nition,
is continuously
di¤erentiable except at possibly countably many points, so de…ne ! : [0; 1] ! R as
! (q) :=
0
(q) ,
whenever 0 (q) exists. Also, w.l.o.g. extend ! (q) by right-continuity to all [0; 1) and by leftcontinuity at 1. For 2 [ I ; C ], let g I ( ) := !(Fe ( )); so g I is non-decreasing.
I
I
Let m := minf ; C g > I and q m := Fe( m ) > 0. Since g I is continuous over [ I ; ], by
the argument in Lemma 7 ! is continuous over [0; q m ], hence g I is continuous over [ I ; m ].
Lemma 13 g I ( I )
g I ( I ).
Proof. If the opposite holds, then ! (0) > z (0). Since z is continuous over [0; q m ] and ! is
non-decreasing, there exists q > 0 such that ! (y) > z (y) for every y q. Since Z (0) = (0),
it follows that
Rq
Rq
Z (q) = 0 z (y) dy Z (0) < 0 ! (y) dy
(0) = (q) .
A contradiction.
It follows that $\theta^d := \sup\{\theta \in [\underline\theta^I, \underline\theta^C] \mid \bar g^I(\theta) < 0\} > \underline\theta^I$. Similarly, define $\theta^u := \inf\{\theta \in [\underline\theta^I, \underline\theta^C] \mid \bar g^I(\theta) > 0\}$, if the set is non-empty, otherwise $\theta^u := \underline\theta^C$. By Theorem 3.7 of Toikka (2011), any best extension $\bar x^C$ of $x^C$ must satisfy $\bar x^C(\theta) = a$ for $\theta \in (\underline\theta^I, \theta^d)$ and $\bar x^C(\theta) = x^C(\underline\theta^C)$ for $\theta \in (\theta^u, \underline\theta^C)$, if any. Letting $\bar x^C(\theta^u) = x^C(\underline\theta^C)$ is w.l.o.g. Finally, over $[\theta^d, \theta^u)$, $\bar x^C$ can be any non-decreasing function mapping to $[a, x^C(\underline\theta^C)]$, so long as it satisfies the necessary pooling property described by Toikka (see Definition 3.5). By Corollary 3.8 of Toikka (2011), it is then w.l.o.g. to set $\bar x^C(\theta) = x^C(\underline\theta^C)$ for $\theta \in [\theta^d, \theta^u)$.
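The construction above — reparametrize by quantiles, integrate, take the convex hull, and differentiate — is mechanical, and can be illustrated numerically. The sketch below is only an illustration of Toikka-style ironing on a discretized grid; the sample function `g` is a hypothetical stand-in for $g^I$, not an object from the model.

```python
import numpy as np

def iron(g_vals, q):
    """Toikka-style ironing on a grid: return the derivative of the convex
    hull of Z(q) = integral_0^q z(y) dy, a non-decreasing version of g_vals."""
    # Running integral Z by the trapezoid rule (Z[0] = 0).
    Z = np.concatenate(([0.0], np.cumsum((g_vals[1:] + g_vals[:-1]) / 2 * np.diff(q))))
    # Lower convex hull of the points (q, Z): keep indices where the slope
    # sequence is strictly increasing (monotone-chain scan).
    hull = [0]
    for i in range(1, len(q)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            # Drop i1 if it lies on or above the chord from i0 to i.
            if (Z[i1] - Z[i0]) * (q[i] - q[i1]) >= (Z[i] - Z[i1]) * (q[i1] - q[i0]):
                hull.pop()
            else:
                break
        hull.append(i)
    # omega = hull slope, a right-continuous step function in q.
    omega = np.empty_like(g_vals)
    for a, b in zip(hull[:-1], hull[1:]):
        omega[a:b] = (Z[b] - Z[a]) / (q[b] - q[a])
    omega[-1] = omega[-2]
    return omega

q = np.linspace(0.0, 1.0, 2001)
g = np.sin(6 * q) - 0.5              # hypothetical non-monotone virtual value
omega = iron(g, q)
assert np.all(np.diff(omega) >= -1e-9)  # ironed version is non-decreasing
```

The hull scan keeps exactly the quantiles at which the convexified integral touches the original one, so the resulting step function is non-decreasing by construction, mirroring the monotonicity claimed for $\bar g^I$ in the text.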
9.9 Proof of Corollary 2
Fix $x^I_0 = x^I$ and recall the expression for $R^C(x^I)$ in (9). Observe that $g^C(\theta) < 0$ for $\theta \in (\bar\theta^I, \bar\theta^C)$. Therefore, the rule $x^I \in \mathcal M$ that coincides with $x^I_0$ on $[\underline\theta^I, \bar\theta^I]$ and minimizes $R^C(x^I)$ must satisfy $x^I(\theta) = x^I_0(\bar\theta^I)$ for $\theta \in (\bar\theta^I, \bar\theta^C)$. So, let $\hat x^I(\theta) = x^I_0(\bar\theta^I)$ for $\theta \in (\bar\theta^I, \bar\theta^C]$, and $\hat x^I(\theta) = x^I_0(\theta)$ otherwise.
Using (25) and the best extension $x^C_u$ of $x^C$ in Lemma 3, condition (RR) becomes

$R^C(\hat x^I) + \int_{\underline\theta^C}^{\bar\theta^C} U(x^C(\theta))\,V^C(\theta)\,dF^C + U(x^C(\underline\theta^C))\int_{\underline\theta^I}^{\underline\theta^C}\bar g^I(\theta)\,d\theta \ge \big[U(x^C(\underline\theta^C)) - U(a)\big]\int_{\underline\theta^I}^{\theta^d}\bar g^I(\theta)\,d\theta$.

By assumption, the left-hand side is positive; by construction, $R^C(\hat x^I)$ has been minimized, and $\int_{\underline\theta^I}^{\theta^u}\bar g^I(\theta)\,d\theta < 0$. The result follows.
9.10 Proof of Proposition 4
Preliminary Observations.
First, I show that the screening C-menu sustains the efficient outcome with C if and only if the constraint (RR) does not bind. Recall that $\lambda^I = \mu/\alpha$, where $\mu$ is the Lagrangian multiplier associated to (RR).
Lemma 14 There exists $x \in \mathcal M$ with $x(\theta) = x^C(\theta)$ for every $\theta$ that maximizes $W^C(x) - \lambda^I R^I(x)$ if and only if $\lambda^I = 0$.
Proof. Once more, I will work with the functions $u \in \mathcal U$. Suppose $\lambda^I > 0$. Using $\tilde R^I(u) = R^I(U^{-1}(u))$ in (25), rewrite $\tilde W^C(u) - \lambda^I\tilde R^I(u)$ as

$VS^C(U^{-1}(u); \lambda^I) = \int_{\underline\theta^C}^{\bar\theta^C}\big[u(\theta)\,w^C(\theta;\lambda^I) + \Phi(u(\theta))\big]\,dF^C + \lambda^I\int_{\underline\theta^I}^{\underline\theta^C} u(\theta)\,\bar g^I(\theta)\,d\theta$,

where $w^C(\theta;\lambda^I) = \theta/\beta^C - \lambda^I V^C(\theta)$. Let $u^C_u = U(x^C_u)$, where $x^C_u$ is the best enlargement of $x^C$ as in Lemma 3, and let $\mathcal U^C$ be the set of $u \in \mathcal U$ that coincide with $u^C_u$ on $[\underline\theta^C, \bar\theta^C]$. By construction, $VS^C(U^{-1}(u^C_u);\lambda^I) = \max_{u\in\mathcal U^C} VS^C(U^{-1}(u);\lambda^I)$. I claim that there exists $\hat u^C \in \mathcal U\setminus\mathcal U^C$ such that $VS^C(U^{-1}(\hat u^C);\lambda^I) > VS^C(U^{-1}(u^C_u);\lambda^I)$. Focus on $[\theta^m, \bar\theta^C]$, with $\theta^m := \max\{\bar\theta^I, \underline\theta^C\}$, and let $\hat w^C$ be the generalized version of $w^C$ over this interval, obtained with the same techniques as in the proof of Proposition 2. Since $w^C$ is continuous on $[\theta^m, \bar\theta^C]$, so is $\hat w^C$ (Lemma 7). Because $\lambda^I > 0$, $V^C(\theta) < 0$ implies $w^C(\theta;\lambda^I) > \theta/\beta^C$ for $\theta \in [\theta^m, \bar\theta^C)$. I claim that $\hat w^C(\theta^m;\lambda^I) > \theta^m/\beta^C$. By the same argument as in Lemma 11, $\hat w^C(\theta^m;\lambda^I) \le w^C(\theta^m;\lambda^I)$ and, if the inequality is strict, then there exists $\theta_0 > \theta^m$ such that $\hat w^C(\theta;\lambda^I) = w^C(\theta_0;\lambda^I)$ for $\theta \in [\theta^m, \theta_0]$. If $\hat w^C(\theta^m;\lambda^I) = w^C(\theta^m;\lambda^I)$, then the claim follows. If $\hat w^C(\theta^m;\lambda^I) = w^C(\theta_0;\lambda^I)$, then $\hat w^C(\theta^m;\lambda^I) > \theta_0/\beta^C > \theta^m/\beta^C$. Finally, because $\hat w^C$ is continuous and non-decreasing, in either case there exists $\theta_1 > \theta^m$ such that $\hat w^C(\theta;\lambda^I) > \theta/\beta^C$ for $\theta \in [\theta^m, \theta_1]$. Construct $\hat u^C$ by letting $\hat u^C(\theta) = \arg\max_{y\in[U(a),U(\bar a)]}\{y\,\hat w^C(\theta;\lambda^I) + \Phi(y)\}$ if $\theta \in [\theta^m, \theta_1]$, and $\hat u^C(\theta) = u^C_u(\theta)$ if $\theta \in [\underline\theta^I, \theta^m)$. Then $\hat u^C \in \mathcal U$ and $\hat u^C(\theta) \ge u^C_u(\theta)$ for $\theta \in [\theta^m, \bar\theta^C]$, with $\hat u^C(\theta) > u^C_u(\theta)$ for every $\theta \in [\theta^m, \theta_1]$; so $\hat u^C \notin \mathcal U^C$. Finally,

$VS^C(U^{-1}(\hat u^C);\lambda^I) - VS^C(U^{-1}(u^C_u);\lambda^I) = \int_{\theta^m}^{\bar\theta^C}\big[\hat u^C(\theta)\,\hat w^C(\theta;\lambda^I) + \Phi(\hat u^C(\theta))\big]\,dF^C - \int_{\theta^m}^{\bar\theta^C}\big[u^C_u(\theta)\,\hat w^C(\theta;\lambda^I) + \Phi(u^C_u(\theta))\big]\,dF^C > 0$.

It follows that the solution to $\max_{x\in\mathcal M}\{W^C(x) - \lambda^I R^I(x)\}$ differs from $x^C(\theta)$ at least for $\theta \in [\theta^m, \theta_1)$.
Recall that $x^I(\lambda^C)$ maximizes $W^I(x^I) - \lambda^C R^C(x^I)$ over the set $\mathcal M$. By revealed optimality, $\hat\lambda^C > \lambda^C$ implies that $R^C(x^I(\hat\lambda^C)) \le R^C(x^I(\lambda^C))$. Uniqueness of $x^I(\lambda^C)$ (Proposition 2) implies that $R^C(x^I(\hat\lambda^C)) = R^C(x^I(\lambda^C))$ only if $\hat\lambda^C = \lambda^C$. It follows that, for every $\lambda^C \ge 0$, $R^C(x^I(\lambda^C))$ is a decreasing function of $\lambda^C$, $R^C(x^I(\lambda^C)) \le R^C(x^I)$, and $\lim_{\lambda^C\to+\infty} R^C(x^I(\lambda^C)) = 0$ by Propositions 1 and 2.
Part (a): By Proposition 1, $R^I(x^C) < 0$. Define $\lambda^C_2 := \min\{\lambda^C \ge 0 \mid R^C(x^I(\lambda^C)) + R^I(x^C) \le 0\}$. The result follows from Lemma 14. Lemma 4 and continuity of $R^C(x^I(\lambda^C))$ imply that $\lambda^C_2 > 0$ in some environments.
Part (b): Let $x^C_u$ be as in Proposition 3, which satisfies $R^I(x^C_u) < R^I(x^C) < 0$. Define $\lambda^C_1 := \min\{\lambda^C \ge 0 \mid R^C(x^I(\lambda^C)) + R^I(x^C_u) \le 0\}$. Again, the result follows from Lemma 14. Lemma 4 and continuity of $R^C(x^I(\lambda^C))$ imply that $\lambda^C_1 < \lambda^C_2$ in some environments.
Part (c): Lemma 4, Corollary 2, and continuity of $R^C(x^I(\lambda^C))$ imply that $\lambda^C_1 > 0$ in some environments. Again, the result follows from Lemma 14.
9.11 Proof of Proposition 5
Preliminary Observations: Recall that, given $x^j$, there is a one-to-one relationship between $U^j(x^j; k^j)$ and $k^j$. Therefore, define $h^j := U^j(x^j; k^j)$, and write

(R-IC$^{i,j}$): $h^i \ge h^j + R^i(x^j)$, and (IR$^i$): $h^i \ge 0$.

Moreover, as in the proofs of Lemma 2 and Proposition 2, it is convenient to work with the functions $u \in \mathcal U$. Recall that $\tilde W^i(u^i) := W^i(U^{-1}(u^i))$ and $\tilde R^i(u^j) := R^i(U^{-1}(u^j))$.
Step 1: There exists $U(a)$ small enough so that unused items suffice to satisfy (R-IC$^{i,j}$) for $i > j$.

Recall that (R-IC$^{i,j}$) is $h^i \ge h^j + \tilde R^i(u^j)$. If $i < j$, then $\underline\theta^i > \underline\theta^j$ and

$\tilde R^i(u^j) = \int_{\underline\theta^j}^{\underline\theta^i} u^j(\theta)\,G^i(\theta)\,d\theta + \int_{\underline\theta^i}^{\bar\theta^j} u^j(\theta)\,V^{j,i}(\theta)\,dF^j$,

where

$G^i(\theta) = \Big(\frac{1}{\beta^i}-1\Big)\theta f^i(\theta) - (1 - F^i(\theta))$ and $V^{j,i}(\theta) = v^j(\theta) - \frac{f^i(\theta)}{f^j(\theta)}\,v^i(\theta)$;

and if $i > j$,

$\tilde R^i(u^j) = \int_{\underline\theta^i}^{\underline\theta^j} u^j(\theta)\,G^i(\theta)\,d\theta + \int_{\underline\theta^j}^{\bar\theta^j} u^j(\theta)\,V^{j,i}(\theta)\,dF^j$,

where

$G^i(\theta) = \Big(1-\frac{1}{\beta^i}\Big)\theta f^i(\theta) + F^i(\theta)$ and $V^{j,i}(\theta) = \Big(1-\frac{1}{\beta^j}\Big)\theta + \frac{F^j(\theta)}{f^j(\theta)} - \frac{f^i(\theta)}{f^j(\theta)}\Big[\Big(1-\frac{1}{\beta^i}\Big)\theta + \frac{F^i(\theta)}{f^i(\theta)}\Big]$.
Take $j > i$. Suppose that (R-IC$^{j,i}$) fails (while all the other constraints are satisfied), i.e., $h^j < h^i + \tilde R^j(u^i)$. Fix $u^i(\theta)$ for every $\theta \ge \underline\theta^i$, and let $u^i(\theta) = U(a)$ for every $\theta < \underline\theta^i$. Then,

$\tilde R^j(u^i) = U(a)\int_{\underline\theta^j}^{\underline\theta^i} G^j(\theta)\,d\theta + \int_{\underline\theta^i}^{\bar\theta^i} u^i(\theta)\,V^{i,j}(\theta)\,dF^i$.
Lemma 15 $\int_{\underline\theta^j}^{\underline\theta^i} G^j(\theta)\,d\theta < 0$.

Proof. Rewrite the integral as

$\int_{\underline\theta^j}^{\underline\theta^i} (\beta^j - 1)\frac{\theta}{\beta^j}\,f^j(\theta)\,d\theta + \int_{\underline\theta^j}^{\underline\theta^i} F^j(\theta)\,d\theta$.

Integrate the second term by parts, to get

$\int_{\underline\theta^j}^{\underline\theta^i} (\beta^j - 1)\frac{\theta}{\beta^j}\,f^j(\theta)\,d\theta + F^j(\underline\theta^i)\,\underline\theta^i - \int_{\underline\theta^j}^{\underline\theta^i} \theta f^j(\theta)\,d\theta = \int_{\underline\theta^j}^{\underline\theta^i}\Big(\underline\theta^i - \frac{\theta}{\beta^j}\Big)\,f^j(\theta)\,d\theta$.

Finally, note that $\underline\theta^i \le \theta/\beta^j$, with strict inequality for $\theta \in (\underline\theta^j, \underline\theta^i)$.
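The integration by parts in Lemma 15 can be sanity-checked numerically. The distribution and parameters below are hypothetical (uniform $F^j$ on $[1,3]$, $\beta^j = 0.5$, $\underline\theta^i = 2$), and $G^j$ is taken in the form reconstructed here, $(1 - 1/\beta^j)\theta f^j(\theta) + F^j(\theta)$:

```python
import numpy as np

# Hypothetical primitives: uniform F^j on [1, 3], beta^j = 0.5, and
# underline-theta^i = 2 (= underline-theta^j / beta^j).
beta_j, lo_j, hi_j, lo_i = 0.5, 1.0, 3.0, 2.0
f = 1.0 / (hi_j - lo_j)                     # uniform density f^j
t = np.linspace(lo_j, lo_i, 100001)
F = (t - lo_j) / (hi_j - lo_j)              # uniform cdf F^j on the grid

trap = lambda y: np.sum((y[1:] + y[:-1]) / 2 * np.diff(t))

lhs = trap((1 - 1 / beta_j) * t * f + F)    # integral of G^j before the rewrite
rhs = trap((lo_i - t / beta_j) * f)         # integral after integrating by parts

assert abs(lhs - rhs) < 1e-9   # the rewrite is an identity here
assert lhs < 0                 # Lemma 15: the integral is strictly negative
```

Both expressions evaluate to $-0.5$ in this example, and the sign agrees with the lemma.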
It follows that there exists $U(a)$ small enough so that the $\tilde u^i$ so constructed satisfies $h^j \ge h^i + \tilde R^j(\tilde u^i)$. Now, we need to check the other constraints. For all $j' < i$, the values that $u^i$ takes for $\theta < \underline\theta^i$ are irrelevant; so, all (R-IC$^{j',i}$) remain unchanged. For $\hat\jmath > i$ and $\hat\jmath \neq j$, it could be that $\tilde R^{\hat\jmath}(\tilde u^i) > \tilde R^{\hat\jmath}(u^i)$, and $\tilde u^i$ may violate (R-IC$^{\hat\jmath,i}$) while $u^i$ did not. However, since Lemma 15 holds for every $j > i$ and $N$ is finite, clearly we can find $U(a)$ small enough so that no $j > i$ wants to mimic $i$. Thus, for the rest of the proof I will assume that (R-IC$^{j,i}$) never binds for $j > i$.
Step 2: Reformulation of the principal's problem.

As usual, (IR$^N$) and (R-IC$^{i,N}$) imply (IR$^i$) for all $i < N$. Now, let $Y := (\mathcal U \times \mathbb R)^N$ be the subspace of $(X \times \mathbb R)^N$, where $X = \{u \mid u : \Theta \to \mathbb R\}$. Let $\tilde e(\mathbf U, \mathbf h) = \sum_{i=1}^N \pi_i[\tilde W^i(u^i) - h^i]$ and $\tilde W(\mathbf U) = \sum_{i=1}^N \pi_i \tilde W^i(u^i)$. Then, the problem $P^N$ is equivalent to

$\tilde P^N := \max_{\{\mathbf U, \mathbf h\}\in Y}\ (1-\alpha)\tilde W(\mathbf U) + \alpha\,\tilde e(\mathbf U, \mathbf h)$ s.t. $\Psi(\mathbf U, \mathbf h) \le 0$,

where the functional $\Psi : (X \times \mathbb R)^N \to \mathbb R^r$, $r = 1 + \frac{N(N-1)}{2}$, is given by $\Psi_1(\mathbf U, \mathbf h) = -h^N$ and, for $n = 2, \ldots, r$, $\Psi_n(\mathbf U, \mathbf h) = \tilde R^i(u^j) + h^j - h^i$ for some $i \in N$, $j > i$.

Step 3: Existence of interior points.

Lemma 16 In $\tilde P^N$, there exists $\{\mathbf U, \mathbf h\} \in Y$ such that $\Psi(\mathbf U, \mathbf h) < 0$.
Proof. The condition $\Psi(\mathbf U, \mathbf h) < 0$ is equivalent to $h^N > 0$ and $h^i > h^j + \tilde R^i(u^j)$ for every $i \in N$, $j > i$. For every $i \in N$, let $u^i = \bar u^i = U(x^i)$ over $[\underline\theta^i, \bar\theta^i]$, and possibly extend it over $[\underline\theta^N, \underline\theta^i)$ to include appropriate unused items. Note that these extensions are irrelevant for $\tilde R^j(u^i)$ if $j < i$. Recall that $\tilde R^j(u^i) \le 0$ for all $j < i$, and it can be easily shown that $\tilde R^1(u^i) \le \tilde R^j(u^i)$ for all $1 < j < i$. Thus, let $h^N = 1$ and, for every $i < N$, let $h^i = h^{i+1} + \tilde R^1(u^{i+1}) + 1$. Now, fix $i < N$ and consider any $j > i$. We have

$h^i = h^j + \sum_{n=1}^{j-i}\tilde R^1(u^{i+n}) + (j-i) \ge h^j + \tilde R^i(u^j) + (j-i) > h^j + \tilde R^i(u^j)$.

Note that, since the $\tilde R^i(u^j)$ are bounded and $N$ is finite, the constructed vector $\mathbf h$ is well defined.
Step 4: Lagrangian characterization of the solution to $\tilde P^N$.

To characterize the solutions to $\tilde P^N$, I use Corollary 1, p. 219, and Theorem 2, p. 221, of Luenberger (1969). Note that $(X \times \mathbb R)^N$ is a linear vector space, and $\mathbb R^r$ is a normed space with the usual Euclidean norm, whose positive closed cone contains an interior point. $Y$ is a convex subset of $(X \times \mathbb R)^N$. By Lemma 16, $\Psi$ admits interior points. Finally, $\tilde e$ and $\tilde W$ are concave by the assumptions on $U(\cdot)$ and $c(\cdot)$. Hence, the objective is concave and $\Psi(\mathbf U, \mathbf h)$ is convex.

Now, for $\boldsymbol\lambda \in \mathbb R^r_+$, define the Lagrangian

$L(\mathbf U, \mathbf h; \boldsymbol\lambda) = (1-\alpha)\tilde W(\mathbf U) + \alpha\,\tilde e(\mathbf U, \mathbf h) + \lambda^N h^N - \sum_{i=1}^N\sum_{j<i}\lambda^{j,i}\big[\tilde R^j(u^i) + h^i - h^j\big]$
$\qquad = \sum_{i=1}^N \pi_i\Big[\tilde W^i(u^i) - \sum_{j<i}\lambda^{j,i}\tilde R^j(u^i)\Big] + \sum_{i=1}^N h^i\,\eta^i(\alpha, \boldsymbol\pi, \boldsymbol\lambda)$,
where

$\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = \begin{cases} -\alpha\pi_i + \sum_{j>i}\lambda^{i,j} - \sum_{j<i}\lambda^{j,i} & \text{if } i < N, \\ -\alpha\pi_N + \lambda^N - \sum_{j<N}\lambda^{j,N} & \text{if } i = N. \end{cases}$

Then $\{\mathbf U, \mathbf h\}$ solves $\tilde P^N$ if and only if there exists $\boldsymbol\lambda \ge 0$ such that

$L(\mathbf U, \mathbf h; \boldsymbol\lambda') \ge L(\mathbf U, \mathbf h; \boldsymbol\lambda) \ge L(\mathbf U', \mathbf h'; \boldsymbol\lambda)$

for all $\{\mathbf U', \mathbf h'\} \in Y$ and $\boldsymbol\lambda' \ge 0$. The second inequality is equivalent to

$u^i \in \arg\max_{u\in\mathcal U}\ \tilde W^i(u) - \sum_{j<i}\lambda^{j,i}\tilde R^j(u)$ (26)

and

$h^i \in \arg\max_{h\in\mathbb R}\ \eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda)\,h$. (27)

The first inequality is equivalent to the complementary slackness conditions

$h^N \ge 0$ and $\lambda^N h^N = 0$, (28)

and, for every $i \in N$, $j > i$,

$\tilde R^i(u^j) + h^j - h^i \le 0$ and $\lambda^{i,j}\big[\tilde R^i(u^j) + h^j - h^i\big] = 0$. (29)

Lemma 17 If $(\mathbf U, \mathbf h, \boldsymbol\lambda)$ satisfies the conditions (26)-(29), then $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$ for all $i \in N$.
Proof. Recall that $h^i \ge 0$ for all $i \in N$, by combining (IR$^N$) and (R-IC$^{i,N}$). Therefore, for (27) to admit a solution, $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) \le 0$ for all $i \in N$. On the other hand, $(1-\alpha)\tilde W(\mathbf U) + \alpha\,\tilde e(\mathbf U, \mathbf h)$ is bounded below by $U(a^{nf})\,\mathbb E(s) - a^{nf} - c(a^{nf}) > 0$, hence $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) \ge 0$.
This lemma has two immediate implications about the binding constraints.

Corollary 4 If $\alpha = 0$, then $\boldsymbol\lambda = 0$. If $\alpha > 0$, (IR$^N$) binds and, for every $i < N$, there is $j > i$ such that (R-IC$^{i,j}$) binds.
Proof. The second part follows from the last lemma. For the first part, note that, since $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$ for all $i = 1,\ldots,N$,

$0 = \sum_{i=1}^N \eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = \sum_{i=1}^{N-1}\Big[\sum_{j>i}\lambda^{i,j} - \sum_{j<i}\lambda^{j,i}\Big] + \lambda^N - \sum_{j<N}\lambda^{j,N} - \alpha = \lambda^N - \alpha$,

where the last equality follows from rearranging $\sum_{i=1}^{N-1}\sum_{j>i}\lambda^{i,j}$. So, if $\alpha = 0 = \lambda^N$, then $\eta^N(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$ implies $\sum_{j<N}\lambda^{j,N} = 0$. Hence, $\lambda^{j,N} = 0$ for all $j < N$. Suppose that, for all $j \ge i+1$, $\lambda^{n,j} = 0$ for all $n < j$. Then, by $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$, we have $\sum_{j<i}\lambda^{j,i} = \sum_{j>i}\lambda^{i,j} = 0$. Hence, $\lambda^{j,i} = 0$ for all $j < i$.
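The rearrangement used in the first part of the proof — the cross multipliers cancel pairwise, so the $\eta^i$ sum telescopes to $\lambda^N - \alpha$ — can be verified mechanically. This sketch assumes $\eta^i = -\alpha\pi_i + \sum_{j>i}\lambda^{i,j} - \sum_{j<i}\lambda^{j,i}$, with $\lambda^N$ added for $i = N$; all numbers are hypothetical:

```python
import numpy as np

# Check the telescoping identity sum_i eta^i = lam^N - alpha for the
# assumed form of eta^i; the probabilities and multipliers are random.
rng = np.random.default_rng(0)
N, alpha = 5, 0.7
pi = rng.random(N); pi /= pi.sum()         # type probabilities, sum to 1
lam = np.triu(rng.random((N, N)), k=1)     # lam[i, j] = lambda^{i,j}, j > i
lam_N = rng.random()                       # multiplier on (IR^N)

eta = -alpha * pi + lam.sum(axis=1) - lam.sum(axis=0)
eta[-1] += lam_N

# Each lambda^{i,j} enters eta^i with + and eta^j with -, so the cross
# terms cancel and only lam^N - alpha survives.
assert abs(eta.sum() - (lam_N - alpha)) < 1e-12
```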
Therefore, although $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$ makes any $h^i \in \mathbb R$ solve (27), we can use the upward binding constraints to pin down $\mathbf h$ once $\mathbf U$ has been chosen.
Thus, a solution to $\tilde P^N$ exists if we can find $(\mathbf U, \boldsymbol\lambda)$ so that, for every $i = 1,\ldots,N$, $u^i$ solves (26), $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$, and (28) and (29) hold. By replicating the arguments in the proof of Proposition 2 (see Step 5 below), we have that, for every $\boldsymbol\lambda \ge 0$, a solution $u^i$ to (26) always exists, is unique over $(\underline\theta^i, \bar\theta^i)$, and is pointwise continuous in $\boldsymbol\lambda$. Furthermore, if $\lambda^{j,i} \to +\infty$ for some $j < i$, then $u^i \to U(a^{nf})$ over $(\underline\theta^j, \bar\theta^i)$, i.e., $\tilde R^j(u^i) \to 0$. Moreover, since $\eta^i(\alpha,\boldsymbol\pi,\boldsymbol\lambda) = 0$, $\lambda^{i,j'} \to +\infty$ for some $j' > i$, so that $\tilde R^i(u^{j'}) \to 0$ and $h^i \to 0$ (using the binding (R-IC$^{i,j'}$)). Therefore, there exist $\lambda^{j,i}$ large enough to make (29) hold. Finally, (28) always holds with $h^N = 0$.
Step 5: Characterization of expected virtual surplus and existence of a solution.

Fix $i > 1$. Then, by (26), $u^i$ must solve

$\max_{u\in\mathcal U}\ \tilde W^i(u) - \sum_{n=1}^{i-1}\lambda^{n,i}\tilde R^n(u)$,

where $\boldsymbol\lambda^i \in \mathbb R^{i-1}_+$. Now, using the expression of $\tilde R^n(x^i)$, and letting $\Phi(\upsilon) := -c(U^{-1}(\upsilon))$, we have

$\tilde W^i(u^i) - \sum_{n=1}^{i-1}\lambda^{n,i}\tilde R^n(u^i) = -\sum_{n=1}^{i-1}\lambda^{n,i}\int_{\underline\theta^i}^{\underline\theta^n} u^i(\theta)\,G^n(\theta)\,d\theta + \int_{\underline\theta^i}^{\bar\theta^i}\big[u^i(\theta)\,w^i(\theta;\boldsymbol\lambda^i) + \Phi(u^i(\theta))\big]\,dF^i =: VS^i(u^i;\boldsymbol\lambda^i)$,

where

$w^i(\theta;\boldsymbol\lambda^i) := \frac{\theta}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}V^{i,n}(\theta) = \frac{\theta}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}v^i(\theta) - \sum_{n=1}^{i-1}\lambda^{n,i}\frac{f^n(\theta)}{f^i(\theta)}\,v^n(\theta)$.

We can now apply to $VS^i(u^i;\boldsymbol\lambda^i)$ the technique used in the two-type case to characterize $u^I$ (Proposition 2). Clearly, if $\boldsymbol\lambda^i = 0$, then $VS^i(u^i; 0) = \tilde W^i(u^i)$ and $u^i = \bar u^i = U(x^i)$ on $\Theta^i$. For $\theta > \bar\theta^i$, we can let $u^i(\theta) = u^i(\bar\theta^i)$. For $\theta < \underline\theta^i$, $u^i(\theta)$ may be strictly smaller than $u^i(\underline\theta^i)$, to ensure that the downward constraints involving $i$ as the mimicked type are satisfied.
Now, suppose $\lambda^{n,i} > 0$ for some $n < i$. Apply the Myerson-Toikka ironing technique on $\Theta^i$, by letting

$z^i(q;\boldsymbol\lambda^i) = w^i((F^i)^{-1}(q);\boldsymbol\lambda^i)$ and $Z^i(q;\boldsymbol\lambda^i) = \int_0^q z^i(y;\boldsymbol\lambda^i)\,dy$.

Let $\bar Z^i(q;\boldsymbol\lambda^i) = \operatorname{conv} Z^i(q;\boldsymbol\lambda^i)$, and $\omega^i(q;\boldsymbol\lambda^i) = \bar Z^i_q(q;\boldsymbol\lambda^i)$ wherever defined. Extend $\omega^i$ by right-continuity, and at $1$ by left-continuity. For $\omega^i$ to be continuous, it is sufficient to show that, if $z^i$ is discontinuous at $q$, then $z^i$ jumps down at $q$. To see this, note that $w^i$ can be discontinuous only at points of the form $\underline\theta^j$ for $j < i$ and such that $\underline\theta^j \in (\underline\theta^i, \bar\theta^i)$. At such a point, we have

$w^i(\underline\theta^j+;\boldsymbol\lambda^i) = \lim_{\theta\downarrow\underline\theta^j} w^i(\theta;\boldsymbol\lambda^i) = \frac{\underline\theta^j}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}\,v^i(\underline\theta^j) - \sum_{n=1}^{i-1}\lambda^{n,i}\,\frac{f^n(\underline\theta^j)}{f^i(\underline\theta^j)}\,v^n(\underline\theta^j)$.

Note that (1) for $n < j$, $\underline\theta^n > \underline\theta^j$, which implies that $f^n(\underline\theta^j) = 0$, and (2) $v^j(\underline\theta^j) = (1-\beta^j)(\underline\theta^j/\beta^j) \ge 0$. So,

$w^i(\underline\theta^j+;\boldsymbol\lambda^i) = \frac{\underline\theta^j}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}\,v^i(\underline\theta^j) - \sum_{n=j}^{i-1}\lambda^{n,i}\,\frac{f^n(\underline\theta^j)}{f^i(\underline\theta^j)}\,v^n(\underline\theta^j)$.

On the other hand,

$w^i(\underline\theta^j-;\boldsymbol\lambda^i) = \lim_{\theta\uparrow\underline\theta^j} w^i(\theta;\boldsymbol\lambda^i) = \frac{\underline\theta^j}{\beta^i} + \sum_{n=1}^{i-1}\lambda^{n,i}\,v^i(\underline\theta^j) - \sum_{n=j+1}^{i-1}\lambda^{n,i}\,\frac{f^n(\underline\theta^j)}{f^i(\underline\theta^j)}\,v^n(\underline\theta^j)$.

Therefore,

$w^i(\underline\theta^j+;\boldsymbol\lambda^i) - w^i(\underline\theta^j-;\boldsymbol\lambda^i) = -\lambda^{j,i}\,\frac{f^j(\underline\theta^j)}{f^i(\underline\theta^j)}\,v^j(\underline\theta^j) \le 0$.

This proves that $z^i$ can at most jump down and, hence, $\omega^i$ is continuous. Letting $\bar w^i(\theta;\boldsymbol\lambda^i) = \omega^i(F^i(\theta);\boldsymbol\lambda^i)$ for $\theta \in \Theta^i$, construct the generalized virtual surplus $\overline{VS}^i$, as we did in the proof of Proposition 2.
Now, note that $G^n(\theta) < 0$ for $\theta \in (\underline\theta^i, \underline\theta^n)$. Therefore, since $\lambda^{n,i} > 0$ for some $n < i$, the first term in $VS^i$ is strictly negative. Let $\underline n := \min\{n : \lambda^{n,i} > 0\}$. Then, over $(\underline\theta^i, \bar\theta^i)$, the characterization in Lemma 8 of any maximizer of $\overline{VS}^I$ extends to $\overline{VS}^i$. In particular, $u^i$ must be constant at $\bar y^i_b$ over $(\bar\theta^i_b, \bar\theta^i)$, where $\bar\theta^i_b \ge \underline\theta^{\underline n}$ and $\bar y^i_b \le \bar u^i(\bar\theta^i)$. Furthermore, $\bar y^i_b = \bar u^i(\bar\theta^i_b)$ if $\bar\theta^i_b > \underline\theta^i$; and $u^i(\theta) = \bar u^i(\theta)$ for $\theta \in [\underline\theta^i, \bar\theta^i_b]$. Next, the same argument as in Lemma 9 for $\overline{VS}^I$ yields that a (unique) maximizer of $\overline{VS}^i$ exists as well. Finally, the same argument as in Lemma 10 implies that the (unique) maximizer of $\overline{VS}^i$ is also the (unique) maximizer of $VS^i$.
Step 6: Properties of the solutions to (26).

Suppose that $\lambda^{n,i} > 0$ for some $n < i$, and let $\underline n$ be defined as before. Letting

$K^i = \sum_{n=\underline n}^{i-1}\lambda^{n,i}\int_{\underline\theta^i}^{\underline\theta^n} G^n(\theta)\,d\theta < 0$,

the analog of the ironing condition applies to $\bar w^i(\bar\theta^i_b;\boldsymbol\lambda^i)$:

$\int_{\bar\theta^i_b}^{\bar\theta^i}\big[w^i(y;\boldsymbol\lambda^i) - \bar w^i(\bar\theta^i_b;\boldsymbol\lambda^i)\big]\,dF^i + K^i = 0$,

and can be written as

$\int_{\bar\theta^i_b}^{\bar\theta^i}\big[\bar w^i(\bar\theta^i_b;\boldsymbol\lambda^i) - (\theta/\beta^i)\big]\,dF^i = \sum_{n=1}^{i-1}\lambda^{n,i}\int_{\bar\theta^i_b}^{\bar\theta^i} V^{i,n}(\theta)\,dF^i + K^i$.

To prove that $\bar w^i(\bar\theta^i_b;\boldsymbol\lambda^i) < \bar\theta^i/\beta^i$, it is enough to show that the right-hand side is negative:

$\sum_{n=\underline n}^{i-1}\lambda^{n,i}\int_{\bar\theta^i_b}^{\bar\theta^i} V^{i,n}(\theta)\,dF^i + \sum_{n=\underline n}^{i-1}\lambda^{n,i}\int_{\underline\theta^i}^{\underline\theta^n} G^n(\theta)\,d\theta < 0$,

which follows by (18). Therefore, $u^i$ exhibits bunching over $[\bar\theta^i_b, \bar\theta^i]$ with $\bar\theta^i_b < \bar\theta^i$, at value $\bar y^i_b < \bar u^i(\bar\theta^i)$.
i
Now, consider the bottom of [ i ; ]. By the same argument as in Lemma 11, wi ( i ; i )
wi ( i ; i ) with strict inequality if ib > i . Furthermore, for < i 1 , wi ; i = = i +
50
Pi
i Pi 1 n;i
v ( ) and wi ( i ; i ) = ( i = i )[1 + 1
] > i = i . So if wi ( i ; i ) =
n=1
w ( ; i ), then ui ( i ; i ) > ui ( i ). Otherwise, there is ironing at the bottom over [ i ; ib ] 6= ?
and the following must hold
1
n=1
i i
n;i i
R
which corresponds to
Now, for each n < i,
R
i
b
i
R
i
b
i
R
V i;n (y) dF i =
R
=
R
=
i
b
i
i
i
y=
i
i
b=
s
i
b
i
b
s
)
i
wi ( ib ;
) dF i = 0,
iP1
) dF i =
n;i
n=1
v i (y) dF i
i
b
i
i
wi (y;
wi ( ib ;
i
y=
i
b
i
R
R
i
b
i
V i;n (y) dF i .
i
b
i
v n (y) dF n
R ib
dF i
y=
i
R ib = n
dF
s
s
n
i
b
i
b
dF n
R
dF =
i
i
b=
i
n
b=
i
b
s
dF > 0.
Therefore, wi ( ib ; i ) > i = i , and ui ( i ; i ) > ui ( i ).
Finally, for bunching at the bottom, note that, for $\theta < \underline\theta^{i-1}$,

$w^i(\theta;\boldsymbol\lambda^i) = \theta/\beta^i + \sum_{n=1}^{i-1}\lambda^{n,i}\big[(1-\beta^i)(\theta/\beta^i) + F^i(\theta)/f^i(\theta)\big]$.

Hence, for $\theta' < \theta < \underline\theta^{i-1}$,

$w^i(\theta';\boldsymbol\lambda^i) < w^i(\theta;\boldsymbol\lambda^i) \iff \frac{\theta-\theta'}{\beta^i}\Big[1 + (1-\beta^i)\sum_{n=1}^{i-1}\lambda^{n,i}\Big] > \sum_{n=1}^{i-1}\lambda^{n,i}\Big[\frac{F^i(\theta')}{f^i(\theta')} - \frac{F^i(\theta)}{f^i(\theta)}\Big]$.

So, a sufficient condition for $w^i(\,\cdot\,;\boldsymbol\lambda^i)$ to be non-increasing in a neighborhood of $\underline\theta^i$ is that, for every $s' > s$ in $[\underline s, s^y]$,

$\frac{F(s')}{f(s')} - \frac{F(s)}{f(s)} \ge (s'-s)\,\frac{1}{\beta^i}\Big[(1-\beta^i) + \sum_{n=1}^{i-1}\frac{1}{\lambda^{n,i}}\Big]$.

Hence, bunching at the bottom is more likely if $\beta^i$ is closer to 1 and $\sum_{n=1}^{i-1}\lambda^{n,i}$ is large enough, i.e., if the principal assigns a high enough shadow value to not increasing the rents of the types below type $i$.
10 Appendix C: The Planner's Constrained Problem

I consider the alternative specification of the planner's problem as that of maximizing the expected ex-ante social surplus subject to a budget constraint. For simplicity, I do so in the two-type model. I suggest looking at Section 4 before reading this part.
Let $B$ be the planner's budget. Assume that $B$ is less than the maximal expected surplus: $B < \mu W^C(x^C) + (1-\mu)W^I(x^I)$. Using (8) and $(X, k)$, the planner's constrained problem is

$PB := \max_{\{X,k\}}\ \mu W^C(x^C) + (1-\mu)W^I(x^I)$ s.t. $x^i \in \mathcal M$, (IR$^i$), and (R-IC$^i$) for $i = I, C$, and $\Pi(X, k) \ge B$. (B)

In $PB$, given $X$, multiple vectors $k$ may be optimal. However, we can safely focus on DMs that make (R-IC$^C$) and (IR$^I$) hold with equality. Then, let

$\Pi(x^C, x^I; B) := \mu W^C(x^C) + (1-\mu)W^I(x^I) - \mu R^C(x^I) - B$,

to reduce $PB$ to

$PB' := \max_X\ \mu W^C(x^C) + (1-\mu)W^I(x^I)$ s.t. $x^C, x^I \in \mathcal M$, (RR), and $\Pi(x^C, x^I; B) \ge 0$. (B)
By the same argument as in the proof of Lemma 2, $PB'$ is equivalent to

$PB^u := \max_{\mathbf U}\ \mu\tilde W^C(u^C) + (1-\mu)\tilde W^I(u^I)$ s.t. $u^C, u^I \in \mathcal U$, $\tilde R^C(u^I) + \tilde R^I(u^C) \le 0$, and $\tilde\Pi(u^C, u^I; B) \ge 0$, (B)

where $\tilde\Pi(u^C, u^I; B) = \Pi(U^{-1}(u^C), U^{-1}(u^I); B)$.
Let $X$ and $Y$ be as in the proof of Lemma 2. By concavity of $U$ and $c$, $\mu\tilde W^C(u^C) + (1-\mu)\tilde W^I(u^I)$ and $\tilde\Pi(u^C, u^I; B)$ are concave; $\tilde R^C(u^I) + \tilde R^I(u^C)$ is linear. The constraint function $\Phi(u^C, u^I) := (\tilde R^C(u^I) + \tilde R^I(u^C),\ -\tilde\Pi(u^C, u^I; B))$ maps to $\mathbb R^2$, which has the non-empty, positive, and closed cone $\mathbb R^2_+$. Let $\lambda \ge 0$ and $\gamma \ge 0$, and define the Lagrangian

$L(u^C, u^I; \lambda, \gamma) := \mu\tilde W^C(u^C) + (1-\mu)\tilde W^I(u^I) + \gamma\,\tilde\Pi(u^C, u^I; B) - \lambda\big[\tilde R^C(u^I) + \tilde R^I(u^C)\big]$
$\qquad = (1+\gamma)\Big\{\mu\Big[\tilde W^C(u^C) - \tfrac{\lambda}{\mu(1+\gamma)}\tilde R^I(u^C)\Big] + (1-\mu)\Big[\tilde W^I(u^I) - \tfrac{\lambda+\gamma\mu}{(1-\mu)(1+\gamma)}\tilde R^C(u^I)\Big]\Big\} - \gamma B$. (30)
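The factoring in display (30) is pure algebra and can be checked numerically. This sketch assumes the profit functional $\tilde\Pi = \mu\tilde W^C + (1-\mu)\tilde W^I - \mu\tilde R^C - B$ and the normalized rent weights $\lambda/(\mu(1+\gamma))$ and $(\lambda+\gamma\mu)/((1-\mu)(1+\gamma))$; both are reconstructions, so treat this only as a check of internal consistency:

```python
import random

# Verify that the direct and factored forms of the Lagrangian agree
# for many random values of the primitives (all hypothetical).
random.seed(1)
for _ in range(100):
    mu = random.uniform(0.01, 0.99)
    lam, gam, B = (random.uniform(0.0, 5.0) for _ in range(3))
    WC, WI, RC, RI = (random.uniform(-5.0, 5.0) for _ in range(4))
    Pi = mu * WC + (1 - mu) * WI - mu * RC - B
    L1 = mu * WC + (1 - mu) * WI + gam * Pi - lam * (RC + RI)
    L2 = (1 + gam) * (mu * (WC - lam / (mu * (1 + gam)) * RI)
                      + (1 - mu) * (WI - (lam + gam * mu) / ((1 - mu) * (1 + gam)) * RC)) - gam * B
    assert abs(L1 - L2) < 1e-9
```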
Now, comparing (30) with (7), we see that here the weights on profits, and therefore on C's rents, depend endogenously on the multiplier $\gamma \ge 0$, rather than on $\alpha$. However, the same fundamental trade-offs arise in finding the optimal $u^C$ and $u^I$ (hence, $x^C$ and $x^I$).
Specifically, by Theorem 2, p. 221, of Luenberger (1969), if $(u^C, u^I, \lambda, \gamma)$ is such that, for all $(u^C_0, u^I_0) \in Y$, $\lambda' \ge 0$, $\gamma' \ge 0$,

$L(u^C, u^I; \lambda', \gamma') \ge L(u^C, u^I; \lambda, \gamma) \ge L(u^C_0, u^I_0; \lambda, \gamma)$, (31)

then $(u^C, u^I)$ solves $PB^u$. Condition (31) essentially implies the same optimality conditions as in Lemma 2. Also, if there exists $(\hat u^C, \hat u^I) \in Y$ such that $\Phi(\hat u^C, \hat u^I) < 0$, then, by Corollary 1, p. 219, of Luenberger (1969), condition (31) is also necessary for $(u^C, u^I)$ to solve $PB^u$.
In particular, let $B_1$ be the profit that the optimal DM — derived in Section 4 — yields in the monopolist's case ($\alpha = 1$).

Lemma 18 There exists $(u^C, u^I) \in \mathcal U^2$ such that $\Phi(u^C, u^I) < 0$ if and only if $B < B_1$.
Proof. ($\Rightarrow$) By definition of $B_1$, if $u^C, u^I \in \mathcal U$ and $\tilde R^C(u^I) + \tilde R^I(u^C) \le 0$ holds, then $\tilde\Pi(u^C, u^I; B_1) \le 0$.

($\Leftarrow$) Consider a solution $(u^C_1, u^I_1)$ to the problem $P^u$ with $\alpha = 1$. Since $B < B_1$, $\tilde\Pi(u^C_1, u^I_1; B) > 0$. If $\tilde R^C(u^I_1) + \tilde R^I(u^C_1) < 0$, we are done. Suppose $\tilde R^C(u^I_1) + \tilde R^I(u^C_1) = 0$. Recall that $u^I_1 = u^I(\hat\lambda^C) = \arg\max_{u\in\mathcal U}\tilde W^I(u) - \hat\lambda^C\,\tilde R^C(u)$ with $\hat\lambda^C > 0$ (Lemma 2). By revealed optimality, $\tilde R^C(u^I(\hat\lambda^C)) \ge \tilde R^C(u^I(\lambda^C))$ for every $\lambda^C > \hat\lambda^C$. Because $\tilde R^C(u^I_1) + \tilde R^I(u^C_1) = 0$, $\tilde R^C(u^I(\hat\lambda^C)) > 0$ and $u^I(\hat\lambda^C)$ is not constant (Proposition 1). I claim that there exists $\tilde\lambda^C > \hat\lambda^C$ such that $\tilde R^C(u^I(\hat\lambda^C)) > \tilde R^C(u^I(\tilde\lambda^C))$: by the proof of Proposition 2, $\tilde R^C(u^I(\lambda^C))$ is continuous in $\lambda^C$, and $\lim_{\lambda^C\to+\infty}\tilde R^C(u^I(\lambda^C)) = 0$ (Proposition 1). Finally, since $\tilde\Pi$ is also continuous in $\lambda^C$, it remains to choose $\lambda^C_\varepsilon > \hat\lambda^C$ so that $\tilde R^C(u^I(\lambda^C_\varepsilon)) = \tilde R^C(u^I(\hat\lambda^C)) - \varepsilon$ and $\tilde\Pi(u^C_1, u^I(\lambda^C_\varepsilon); B) > 0$.
References
[1] Akerlof, G. A. (1991): "Procrastination and Obedience", American Economic Review (Papers and Proceedings), 81, 2, 1-19.
[2] Amador, M., Werning, I. and Angeletos, G.M. (2006): "Commitment vs. Flexibility",
Econometrica, 74, 2, 365-396.
[3] Amromin, G. (2002): "Portfolio Allocation Choices in Taxable and Tax-Deferred Accounts: An Empirical Analysis of Tax Efficiency", Mimeo.
[4] Amromin, G. (2003): "Household Portfolio Choices in Taxable and Tax-Deferred Accounts:
Another Puzzle?", European Finance Review, 7, 547-582.
[5] Ashraf, N., N. Gons, D. S. Karlan and W. Yin, (2003): "A Review of Commitment Savings
Products in Developing Countries", ERD Working Paper No. 45.
[6] Ashraf, N., D. S. Karlan and W. Yin, (2006): "Tying Odysseus to the Mast: Evidence From
a Commitment Savings Product in the Philippines", Quarterly Journal of Economics, 121,
2, 635-672.
[7] Battaglini, M. (2005): "Long-Term Contracting with Markovian Consumers", American Economic Review, 95, 3, 637-658.
[8] Bernheim, D. B. (2002): "Taxation and Savings", in Handbook of Public Economics, Volume 3, Edited by A. J. Auerbach and M. Feldstein.
[9] Bryan, G. and Karlan, D. and Nelson, S. (2010): "Commitment Devices", Annual Review
of Economics.
[10] Courty, P. and Li, H. (2000): "Sequential Screening", Review of Economic Studies, 67, 4,
697–717.
[11] DellaVigna, S. (2009): "Psychology and Economics: Evidence from the Field", Journal of Economic Literature, 47, 2, 315-372.
[12] DellaVigna, S. and Malmendier, U., (2004): "Contract Design And Self-Control: Theory
And Evidence", Quarterly Journal of Economics, 119, 2, 353-402.
[13] DellaVigna, S. and Malmendier, U., (2006): "Paying not to go to the gym", American
Economic Review, 96, 3, 694-719.
[14] Diamond, P. A. (1977): "A Framework for Social Security Analysis", Journal of Public
Economics, 8, 275-298.
[15] Eliaz, K. and Spiegler, R., (2006): "Contracting with diversely naive agents", Review of
Economic Studies, 73, 3, 689-714.
[16] Eliaz, K. and Spiegler, R., (2008): "Consumer optimism and price discrimination", Theoretical Economics, 3, 4, 459–497.
[17] Esteban, S. and Miyagawa, E., (2006a): "Temptation, self-control, and competitive nonlinear pricing", Economics Letters, 90, 3, 348-355.
[18] Esteban, S. and Miyagawa, E., (2006b): "Optimal menu of menus with self-control preferences", Mimeo, Universitat Autònoma de Barcelona.
[19] Esteban, S., Miyagawa, E. and Shum, M. (2007): "Nonlinear pricing with self-control
preferences", Journal of Economic Theory, 135, 1, 306-338.
[20] Gul, F. and W. Pesendorfer (2001): "Temptation and Self-Control", Econometrica, 69, 1403-1435.
[21] Heidhues, P. and Koszegi, B. (2010): "Exploiting naivete about self-control in the credit
market", American Economic Review, Forthcoming.
[22] Holden, S., Ireland, K., Leonard-Chambers, V., and Bodgan, M. (2005): "The Individual
Retirement Account at Age 30: A Retrospective", Investment Company Institute, 11, 1.
[23] Holden, S., Schrass, D. (2008): "The Role of IRAs in U.S. Households' Saving for Retirement, 2008", Investment Company Institute, 18, 1.
[24] Holden, S., Schrass, D. (2009): "The Role of IRAs in U.S. Households' Saving for Retirement, 2009", Investment Company Institute, 19, 1.
[25] Holden, S., Schrass, D. (2010a): "The Role of IRAs in U.S. Households' Saving for Retirement, 2010", Investment Company Institute, 19, 8.
[26] Holden, S., Sabelhaus, J., and Bass, S. (2010b): "The IRA Investor Profile: Traditional IRA Investors' Contribution Activity, 2007 and 2008", Investment Company Institute.
[27] Investment Company Institute (2012): “The U.S. Retirement Market, Second Quarter
2012”(September). http://www.ici.org/info/ret_12_q2_data.xls
[28] Jianye, Yan (2011): "Contracting with Heterogeneous Quasi-Hyperbolic Agents: Investment Good Pricing under Asymmetric Information", Mimeo, Toulouse School of Economics.
[29] Kreps, D. (1979): "A Representation Theorem for Preference for Flexibility", Econometrica,
47, 565-577.
[30] Laibson, D. (1998): "Life-Cycle Consumption and Hyperbolic Discount Functions", European Economic Review, 42, 861-871.
[31] Luenberger, D. G. (1969): "Optimization by Vector Space Methods". New York: Wiley.
[32] Munnell, A. H., Sundén, A. E. (2006): "401(k) Plans Are Still Coming Up Short", Brookings
Institute Press.
[33] Mussa, M., and Rosen, S. (1978): "Monopoly and Product Quality", Journal of Economic
Theory, 18, 2, 301-317.
[34] Myerson, R. B. (1981): "Optimal Auction Design", Mathematics of Operations Research, 6, 1, 58-73.
[35] Myerson, R. B. (1986): "Multistage Games with Communication", Econometrica, 54, 2, 323-358.
[36] O’Donoghue, T. and Rabin, M. (1999): "Doing It Now or Later", American Economic
Review, 89, 1, 103-124.
[37] O’Donoghue, T. and Rabin, M. (1999): "Incentives for Procrastinators", Quarterly Journal
of Economics, 114, 3, 769-816.
[38] O’Donoghue, T. and Rabin, M. (2001): "Choice and Procrastination", Quarterly Journal
of Economics, 116, 1, 121-160.
[39] O’Donoghue, T. and Rabin, M. (2003): "Studying Optimal Paternalism, Illustrated by a
Model of Sin Taxes", American Economic Review, 93, 2, 186-191.
[40] O’Donoghue, T. and Rabin, M. (2007): "Incentives and Self Control", in Richard Blundell,
Whitney Newey, and Torsten Persson, eds., Advances in Economics and Econometrics:
Theory and Applications (Ninth World Congress), Cambridge University Press, August
2007.
[41] Pavan, A. (2007): "Long-Term Contracting in a Changing World", Mimeo. Northwestern
University.
[42] Pavan, A., Segal, I., and J. Toikka: "Dynamic Mechanism Design: Incentive Compatibility, Profit Maximization, and Information Disclosure", Mimeo, Northwestern University.
[43] Rockafellar, R. T. (1970): "Convex Analysis", Princeton University Press.
[44] Royden, H. L. (1988): "Real Analysis", Third Edition, Prentice Hall.
[45] Sandroni, A. and F. Squintani (2007): "Overconfidence, Insurance, and Paternalism", American Economic Review, 97, 5, 1994-2004.
[46] Spiegler, R. (2011): "Bounded Rationality and Industrial Organization," Oxford University
Press.
[47] Strotz, R. H. (1956): "Myopia and Inconsistency in Dynamic Utility Maximization," Review
of Economic Studies, 23, 165-180.
[48] Thaler, R. H. (1994): "Psychology and Savings Policies", American Economic Review, 84,
2, 186-192.
[49] Toikka, J. (2011): "Ironing Without Control", Journal of Economic Theory, 146, 6, 2510-2526.
[50] U.S. government, (2010): "Estimates of The Federal Tax Expenditures for the Fiscal Years
2010-2014", Washington.