Nonlinear Capital Taxation Iván Werning, MIT September 19, 2011 (12:24 Noon)

Abstract
In the presence of private information, constrained optima may call for a distortion
or wedge in the intertemporal Euler equation for consumption. However, without
explicitly modeling a savings choice subject to taxation, one cannot immediately interpret this as a tax on savings. In this paper I offer a construction that delivers
this interpretation: I take any given mechanism and augment it to allow a choice over
savings subject to nonlinear taxation. The tax is required to be independent of the
current shock, so that the returns to saving are risk free. I show that the augmented
mechanism implements the original allocation. The tax schedule is differentiable under natural conditions and its derivative, the marginal tax, coincides with the wedge
in the worker’s intertemporal Euler equation. The implementation favors a progressive tax in the sense that the schedule can always be made convex, but not necessarily
concave. However, I show that in some cases a linear tax suffices. Finally, I show how
to make the savings tax independent of the history of shocks.
1 Introduction
When insurance is imperfect, individuals smooth consumption by saving and dissaving.
However, if insurance is limited due to the presence of asymmetric information, then it
may be best to restrict free access to savings. Indeed, constrained optima are typically
incompatible with free savings and require distortions in the intertemporal path for consumption [Rogerson, 1985]. In dynamic Mirrlees models, where productivity is private
information, this has been interpreted as a justification for capital taxation (Diamond and
Mirrlees, 1978; Golosov et al., 2003; Kocherlakota, 2005). The question I address here is
the precise form of such a tax system.
One possibility is to ban private savings altogether. The government can then engage
in public savings and control private consumption directly, through taxes and transfers.
This essentially replicates the direct mechanism underlying the revelation principle. Although theoretically possible, tax systems that are less extreme seem of greater interest.
First, an outright ban on private savings is far from current policy discussions and is
likely to be impractical. Second, tax policies that allow for savings are more revealing, as
when marginal tax rates make explicit the distortions that are otherwise implicit in the
allocation.
Kocherlakota [2005] and Albanesi and Sleet [2006] provide tax systems that are less extreme than a direct mechanism. Kocherlakota proposes an implementation where wealth
is taxed linearly, at a rate that depends on the agent’s current and past report for productivity. In his implementation, the dependence on the current report is crucial. It ensures
that zero saving is optimal for the agent regardless of the reporting strategy. In Albanesi
and Sleet’s implementation wealth plays a more active role, with taxes set according to a
nonlinear function of wealth and the current shock. However, the implementation only
works for the optimal allocation in a setting with i.i.d. shocks. In both implementations,
the dependence of the wealth tax on the current shock makes the rate of return on savings
risky. Both papers also require utility from consumption to be additively separable from
the disutility of labor.
The goal of this paper is to offer a new implementation with some desirable features.
First, I seek to implement any incentive compatible allocation, not just optimal ones. Second, I consider persistent shocks, modeling productivity as a general Markov process.
Third, I do not require utility from consumption to be additively separable from that of
labor. Finally, I consider nonlinear savings taxes that are restricted to be independent of
the current shock, so that the after-tax rate of return on savings is risk free.
When the agent can save, the planner must induce the agent to follow some prescribed
savings plan. The precise savings plan is indeterminate, since the planner can reshuffle
the agent’s net income across periods, using labor taxes and transfers. The proof that it
is possible to induce some prescribed savings plan using the tax instruments allowed in
my proposed implementation is constructive and proceeds by a dynamic programming
argument.
I first show that there exists a lowest tax schedule for the next to last period that makes
it optimal not to deviate at that stage, provided the agent has not deviated in prior periods. This lower bound is defined by the agent’s indifference to all saving levels. Any
other tax schedule that coincides with it at the prescribed level of saving and lies weakly above
it elsewhere, induces the prescribed level of saving. Once the prescribed level of saving
is induced, truth-telling follows because the original allocation is incentive compatible.
Note that the tax schedule constructed may depend on the history of reports up to the
next to last period, but, by definition, not on the last period’s report.
The construction then continues by backward induction. Taking as given the tax
schedules imposed in future periods, there exists a tax in the current period that induces
an agent, who has not deviated up to this point, to be indifferent to all saving levels. Any
schedule that is above this lower bound and coincides with it at the prescribed level induces the agent not to deviate from this point forward. We can then construct the entire
sequence of tax schedules that induces the prescribed savings plan.
Having constructed a sequence of tax schedules that implements the allocation, I turn
my attention to deriving some properties of these schedules. I first investigate differentiability and show that under plausible conditions, there are differentiable tax schedules that
implement the allocation. Furthermore, whenever the tax schedule is differentiable, its
derivative, the marginal tax, coincides with the wedge in the agent’s intertemporal Euler
equation.1 Thus, under my implementation, wedges equal marginal taxes.
Next I study the shape of these tax schedules. As explained above, in any period,
the construction implies a lower bound for the allowable tax schedules. It follows that
one can always choose a convex schedule, but not necessarily a concave one, unless the
lower bound is concave. In this sense, progressive taxation, with rising marginal taxes,
is favored by my implementation. Despite this general property, I show that concave or
linear tax schedules are possible in many cases. Indeed, whenever a concave schedule
is possible, it follows that a linear tax that is tangent to it also works. I examine a wide
range of examples numerically and find a concave lower bound schedule in all cases.

1 The intertemporal wedge is defined, for any given allocation, as the proportional adjustment in the
rate of return that is required for the standard consumption Euler equation to hold; i.e., letting $U_{c,t}$ denote
marginal utility at time $t$, the wedge $\tau_t$ is such that $U_{c,t} = \beta(1 + r_t)(1 - \tau_t)E_t[U_{c,t+1}]$.

Finally, I explore the possibility of making the savings tax history independent, so
that agents face the same tax schedule in all periods, regardless of their past history. It is
important to understand that most allocations, including optimal ones, feature intertemporal wedges that depend on the history. Thus, the challenge is to construct a savings tax
that is history independent, yet delivers the appropriate history-dependent wedges.2
The key idea is to ensure that wealth is a sufficient statistic for the desired intertemporal wedge. Agents with different histories choose different saving levels, ending up
on different positions of the same tax schedule, precisely where the marginal tax rate
equals the intended intertemporal wedge for them. This can be done by exploiting the
aforementioned indeterminacy in the prescribed savings plan.
The shape of the history-independent schedule is restricted because it must be the upper envelope, appropriately transposed, of all history-dependent tax schedules. This
implies that it must be sufficiently convex. In this way, history independence favors progressive taxation.
An attractive feature of my implementation is that it avoids unnecessary taxation.
Suppose we confront agents with the simplest of tax systems: labor income is taxed, but
agents can borrow and save freely, with no savings tax. The equilibrium under this system
produces a particular incentive-compatible allocation, where the intertemporal wedge
is zero, by definition. Taking this allocation as the starting point, one can then request
different implementation schemes to support it. With my proposed implementation one
recovers the original tax system, with no tax on savings. In other words, the original
simple tax system is one of the supporting implementations. In this sense it is a fixed
point of sorts. In contrast, the implementation constructed in Kocherlakota [2005] calls
for non-zero and stochastic taxes on savings. For this implementation, the original simple
tax system is not a fixed point of this thought experiment. The implementation demands
a tax on savings, to support an allocation that was generated with no tax on savings.3
Some well-known examples in the literature seem directly at odds with my results.
These examples motivated the need for state-dependent savings taxes, that is, for conditioning on current productivity or labor income, and also suggested a distinction between wedges and marginal taxes. In particular, Kocherlakota [2004, 2005] shows that
if the savings tax is differentiable and state-independent, then the agent may plan on a
“double deviation”: saving some positive amount and reporting a lower shock in the
next period. In the example, productivity types are discrete and the optimal allocation
has the agent indifferent between truth telling and underreporting productivity. When
the agent is allowed to save, any small amount of wealth breaks indifference in favor of
underreporting. As a result of this double deviation the intended implementation fails.
These examples allow one to conclude that optimal allocations, discrete shocks, differentiable tax functions and state-independent taxes are incompatible. Kocherlakota relaxes
the latter assumption and proceeds to characterize state-dependent taxes.

2 Studying the subset of allocations and welfare that obtain using relatively simple and history-independent tax systems (including the tax on savings and income) is an interesting different question
that is not attempted here. The goal here, as stated at the outset of the paper, remains to implement any
incentive compatible allocation, not a subset. The goal now is to attempt this with a history-independent
savings tax, but allowing the labor-income tax to remain history dependent.
3 One cannot ask the same question for the implementation in Albanesi and Sleet [2006], since their
results only apply to optimal allocations (which feature savings distortions).

This paper takes a different route. I show that a nonlinear tax can substitute for state
dependence. Although I do not rule out kinks a priori, I show that kinks do not arise
under plausible conditions when productivity is continuously distributed. On the other
hand, kinks are natural with discrete shocks. One cannot expect a model to generate
differentiable tax schedules, if primitives, such as the distribution of productivity, are
not continuous. Indeed, with discrete shocks, kinks also arise in the labor-income tax
schedule, even in the static Mirrlees [1971] model.4 It seems inconsistent to insist on a
differentiable savings tax, and allow kinks in the labor-income tax. Thus, I do not rule out
kinks a priori.
2 Non-Linear Tax Implementation
I begin with a two-period horizon version of the model. Time periods are denoted $t = 0, 1$.
Productivity is given by $\theta_t$. Initial productivity, $\theta_0$, is known, while $\theta_1$ is uncertain. It is
revealed privately to the agent at $t = 1$. An allocation specifies $c_0$, $y_0$, $c_1(\theta_1)$, $y_1(\theta_1)$ and
delivers utility
\[ v_0^* = U^0(c_0, y_0, \theta_0) + \beta E\left[ U^1(c_1(\theta_1), y_1(\theta_1), \theta_1) \right], \]
where $U^t(c, y, \theta)$ represents the utility in period $t = 0, 1$, assumed to be increasing in $c$ and
$\theta$, concave in $(c, y)$, and to satisfy the standard single-crossing condition that the marginal
rate of substitution $U^t_c(c, y, \theta)/U^t_y(c, y, \theta)$ is increasing in $\theta$, for any given $(c, y)$.
Unlike Kocherlakota [2005] and Albanesi and Sleet [2006], the implementation I develop does not require additively separable utility of the form $U(c, y, \theta) = u(c) - h(y, \theta)$.
Theoretically, the additively separable case is an important benchmark since separability is required for the Inverse Euler equations to characterize constrained optimal allocations. Empirically, Aguiar and Hurst [2005] provide evidence supporting departures from
separability in the direction of assuming that consumption (expenditures) $c$ and labor $y$
are Hicksian complements, so that $U_{cy} > 0$.

4 In Mirrlees (1971), kinks may be required in the income-tax schedule even when the distribution of
productivity is continuous if bunching is optimal over some interval of productivities. These kinks have no
counterpart in the savings tax schedule under my implementation.

Incentive compatibility requires truth telling to be optimal:
\[ U^1(c_1(\theta_1), y_1(\theta_1), \theta_1) \geq U^1(c_1(r_1), y_1(r_1), \theta_1) \]
for all $\theta_1 \in \Theta$ and all $r_1 \in \Theta$; here, $r_1$ represents the report made by the agent regarding
the true shock $\theta_1$.
This concludes the description of the model’s environment. The main goal of this
paper is to extend a given mechanism so that it allows for saving and attains the same
equilibrium allocation. For that purpose, it is unnecessary to introduce technology, or to
set up a planning problem and solve for optimal allocations.
For any given incentive compatible allocation $(c_0, y_0, c_1(\theta_1), y_1(\theta_1))$, consider an implementation that confronts the agent with the following budget constraints:
\[ \tilde{c}_0 + a_1 \leq c_0, \]
\[ \tilde{c}_1 \leq R(a_1) + c_1(r_1). \]
Here $R(\cdot)$ represents the retention function, with $R(0) = 0$. Without loss of generality, one can take $R$ to be increasing, since the agent will never save at a level where $R$ is
locally decreasing. If the net interest rate is i then R(a) − (1 + i )a represents the nonlinear
tax on wealth at t = 1.
Equivalently, consider the budget constraints
\[ \tilde{c}_0 + M(x_1) \leq c_0, \]
\[ \tilde{c}_1 \leq x_1 + c_1(r_1), \]
for $M = R^{-1}$, with $M(0) = 0$. If the net interest rate is $r$ then $M(x) - x/(1 + r)$ represents
a nonlinear savings tax at t = 0. Clearly, one can go back and forth between R and M. I
shall use the second formulation and refer to M as the savings tax function.
These budget constraints strictly augment the original direct mechanism by adding a
saving choice x1 ; the constraints effectively specialize to the direct mechanism when x1 =
0. The agent takes M as given and optimizes by choosing a saving and reporting strategy.
The tax function M is said to implement the proposed allocation (c0 , y0 , c1 (θ1 ), y1 (θ1 )) if
the agent finds it optimal to save zero, x1 = 0. The original incentive compatibility of the
allocation then ensures that truth telling is optimal.
Consider the agent’s problem in two stages. In the second period, the agent enters
holding wealth x1 , then productivity θ1 is realized and the agent makes a report r1 . The
utility obtained is
\[ V_1(x_1, \theta_1) \equiv \max_{r_1} U^1(c_1(r_1) + x_1, y_1(r_1), \theta_1). \]
Define the set of maximizers to be
\[ R^*(x_1, \theta_1) \equiv \arg\max_{r_1} U^1(c_1(r_1) + x_1, y_1(r_1), \theta_1). \]
Given a choice of $x_1$, expected utility is
\[ W_0(x_1; M) \equiv U^0(c_0 - M(x_1), y_0, \theta_0) + \beta E[V_1(x_1, \theta_1)]. \]
If $M(0) = 0$ and the allocation is incentive compatible then $W_0(0; M) = v_0^*$. Now define
$M^*(x)$ so that
\[ W_0(x_1; M^*) = v_0^* \quad \forall x_1, \]
implying
\[ M^*(x_1) \equiv c_0 - \Psi^0\left( v_0^* - \beta E[V_1(x_1, \theta_1)], y_0, \theta_0 \right), \tag{1} \]
where $\Psi^t(\cdot, y, \theta)$ denotes the inverse of $U^t(\cdot, y, \theta)$.
The tax schedule M∗ is the lowest possible schedule to implement the desired allocation. It makes the agent indifferent to all saving levels. Higher schedules, with M(0) = 0,
also work by discouraging deviations from x1 = 0 still further. Lower schedules, on the
other hand, with M( x1 ) < M∗ ( x1 ) for some x1 , cannot work because they offer higher
utility than v0∗ . I summarize these arguments in the following proposition.
Proposition 1. Any incentive compatible allocation (c0 , y0 , c1 (θ1 ), y1 (θ1 )) can be implemented
with a savings tax schedule $M(x_1)$ as long as
\[ M(x_1) \geq M^*(x_1) \quad \forall x_1 \]
with M(0) = M∗ (0) = 0 and M∗ ( x1 ) defined by equation (1).
In the next two sections I examine the shape of tax functions satisfying Proposition 1.
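The construction in equation (1) can be sketched numerically. The example below is only an illustration and is not taken from the paper: it assumes the separable utility $U(c, y, \theta) = \log c - \frac{1}{2}(y/\theta)^2$, two equally likely types, and a hypothetical incentive compatible allocation. All numbers and function names are assumptions.

```python
import math

# Hypothetical primitives (assumptions chosen for illustration only).
BETA = 0.95
C0, Y0, THETA0 = 1.0, 1.0, 1.0
TYPES = (1.0, 2.0)            # possible values of theta_1
PROBS = (0.5, 0.5)
C1 = {1.0: 0.9, 2.0: 1.1}     # c_1(r_1), indexed by the report
Y1 = {1.0: 0.8, 2.0: 1.2}     # y_1(r_1), indexed by the report


def utility(c, y, theta):
    """Assumed separable utility: log(c) - 0.5 * (y / theta)^2."""
    return math.log(c) - 0.5 * (y / theta) ** 2


def inverse_utility(u, y, theta):
    """Psi(u, y, theta): the consumption c with utility(c, y, theta) == u."""
    return math.exp(u + 0.5 * (y / theta) ** 2)


def v1(x, theta):
    """Second-period value V_1: best report given wealth x and true type."""
    return max(utility(C1[r] + x, Y1[r], theta) for r in TYPES)


def expected_v1(x):
    """E[V_1(x, theta_1)] over the two types."""
    return sum(p * v1(x, th) for p, th in zip(PROBS, TYPES))


# Utility delivered by the allocation under truth telling and zero saving.
V0_STAR = utility(C0, Y0, THETA0) + BETA * sum(
    p * utility(C1[th], Y1[th], th) for p, th in zip(PROBS, TYPES)
)


def lower_bound_tax(x):
    """M*(x) from equation (1): leaves the agent indifferent to all x."""
    return C0 - inverse_utility(V0_STAR - BETA * expected_v1(x), Y0, THETA0)


def w0(x, tax):
    """W_0(x; M): expected utility from saving x under schedule `tax`."""
    return utility(C0 - tax(x), Y0, THETA0) + BETA * expected_v1(x)


if __name__ == "__main__":
    for x in (-0.3, 0.0, 0.3):
        print(f"x={x:+.1f}  M*(x)={lower_bound_tax(x):+.4f}  "
              f"W0={w0(x, lower_bound_tax):.6f}  v0*={V0_STAR:.6f}")
```

Under these assumptions, any schedule that lies above `lower_bound_tax` and vanishes at zero (for instance, adding a small quadratic term) lowers `w0` away from zero savings and leaves it unchanged at zero, which is the content of Proposition 1.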
3 Differentiability: Marginal Taxes and Wedges
In this section, I investigate the differentiability of tax schedules. For any allocation
$(c_0, y_0, c_1(\theta_1), y_1(\theta_1))$, define $m^*$ by
\[ U^0_c(c_0, y_0, \theta_0)\, m^* \equiv \beta E\left[ U^1_c(c_1(\theta_1), y_1(\theta_1), \theta_1) \right]. \]
To interpret $m^*$, suppose technology offers a marginal net rate of return of $\rho$. Then
\[ m^* - \frac{1}{1+\rho} \]
is a measure of the intertemporal distortion implicit in the allocation, sometimes referred
to as the intertemporal wedge or implicit marginal tax rate.
When $M(x_1)$ is differentiable at $x_1 = 0$, the first-order condition for the agent at zero
savings and truth telling is
\[ U^0_c(c_0, y_0, \theta_0)\, M'(0) = \beta E\left[ U^1_c(c_1(\theta_1), y_1(\theta_1), \theta_1) \right]. \]
It follows that the marginal tax equals the intertemporal wedge:
\[ M'(0) = m^*. \]
In other words, implicit and explicit marginal tax rates coincide. Next, I seek conditions
for M′ (0) to exist.
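The same slope can be obtained for the lower bound $M^*$ by differentiating equation (1) directly. The derivation below is a sketch under two assumptions: that $E[V_1(\cdot, \theta_1)]$ is differentiable at $x_1 = 0$, and that the envelope formula $\partial V_1(0, \theta_1)/\partial x_1 = U^1_c(c_1(\theta_1), y_1(\theta_1), \theta_1)$ holds for almost every $\theta_1$. Here $\Psi^0_u$ denotes the derivative of $\Psi^0$ with respect to its first argument, a notation introduced only for this sketch.

```latex
\begin{align*}
M^{*\prime}(0)
  &= \Psi^0_u\bigl(v_0^* - \beta E[V_1(0,\theta_1)],\, y_0,\, \theta_0\bigr)\,
     \beta\, E\!\left[\frac{\partial}{\partial x_1} V_1(0,\theta_1)\right] \\
  &= \frac{1}{U^0_c(c_0, y_0, \theta_0)}\,
     \beta\, E\!\left[U^1_c\bigl(c_1(\theta_1), y_1(\theta_1), \theta_1\bigr)\right]
   = m^* .
\end{align*}
```

The second line uses $v_0^* - \beta E[V_1(0, \theta_1)] = U^0(c_0, y_0, \theta_0)$, so that the inverse function rule gives $\Psi^0_u = 1/U^0_c(c_0, y_0, \theta_0)$ at that point.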
Since M∗ acts as a lower envelope for M at x = 0, it follows that M inherits certain
properties of M∗ . As we shall see, M∗ can only have convex kinks. Thus, if M is differentiable at x = 0, then its derivative, the marginal tax, must coincide with that of M∗ .
Moreover, M can be differentiable at x = 0 only if M∗ is differentiable at x = 0.
To establish differentiability of $M^*$ one can verify the differentiability of $V_1(\cdot, \theta_1)$.
Since $V_1$ is defined as a maximization, I appeal to an envelope theorem. In particular,
Corollary 4 in Milgrom and Segal [2002] applies. It implies that $V_1(\cdot, \theta_1)$ has left and right
derivatives given by
\[ \frac{\partial}{\partial x_1} V_1(x_1+, \theta_1) = \max_{r_1 \in R^*(x_1, \theta_1)} U^1_c(c_1(r_1) + x_1, y_1(r_1), \theta_1), \]
\[ \frac{\partial}{\partial x_1} V_1(x_1-, \theta_1) = \min_{r_1 \in R^*(x_1, \theta_1)} U^1_c(c_1(r_1) + x_1, y_1(r_1), \theta_1). \]
If the agent’s optimal reporting strategy is unique, so that $R^*(x_1, \theta_1)$ is a singleton, the left
and right derivatives coincide. Kinks only occur if the agent is indifferent between two or more
reports $r_1$. In addition, we always have
\[ \frac{\partial}{\partial x_1} V_1(x_1-, \theta_1) \leq \frac{\partial}{\partial x_1} V_1(x_1+, \theta_1) \]
so that
\[ M^{*\prime}(x_1-) \leq m^* \leq M^{*\prime}(x_1+). \]
Thus, the implicit marginal tax given by $m^*$ is sandwiched between the two explicit marginal
taxes $M^{*\prime}(x_1-)$ and $M^{*\prime}(x_1+)$.
I now show that the tax schedule M∗ is differentiable if the distribution of productivity is continuous. Intuitively, indifference is “knife edge” and occurs with zero probability. Since V1 (·, θ1 ) is differentiable, except on a set of probability zero, it follows that
E [V1 (·, θ1 )] is differentiable.
Proposition 2. If θ1 has a continuous distribution then M∗ ( x1 ) defined by equation (1) is differentiable and M∗′ (0) = m∗ .
Proof. Incentive compatibility implies that $c_1(\theta_1)$ and $y_1(\theta_1)$ are non-decreasing functions
of $\theta_1$. It follows that there are at most countably many points of discontinuity. Define $\hat{\Theta}$ to
be the set of points of discontinuity in $c_1(\cdot)$. By Theorem 3 in Milgrom and Segal (2002),
the function $V_1(x_1, \theta_1)$ is differentiable with respect to $x_1$ at $x_1 = 0$ and $\theta_1 \notin \hat{\Theta}$, with
derivative given by the envelope formula: $V_x(0, \theta_1) = U^1_c(c_1(\theta_1), y_1(\theta_1), \theta_1)$ (since $r_1 = \theta_1$
is optimal by incentive compatibility). The function of interest to us is
\[ \bar{V}_1(x_1) \equiv E[V_1(x_1, \theta_1)]. \]
Thus, $\bar{V}_1$ is the integral of a function that is differentiable at $x_1 = 0$, except on a
countable set of points. With a continuous distribution for $\theta_1$ the set of points of non-differentiability
has probability zero. It then follows that $\bar{V}_1(x_1)$ is differentiable at $x_1 = 0$ with derivative
\[ \bar{V}_1'(0) = E[V_x(0, \theta_1)] = E\left[ U^1_c(c_1(\theta_1), y_1(\theta_1), \theta_1) \right]. \]
If productivity types are finite, so that $\Theta = \{\theta^1, \theta^2, \ldots, \theta^N\}$, indifference may occur
with positive probability, producing a convex kink. This happens when there are binding
incentive constraints, a common feature of optimal allocations. However, this source for
kinks should not be a concern. First, with finite types, kinks are inherent to the Mirrlees
[1971] model and are also present in the labor-income tax schedule. Second, although
implementing an optimum with finite types requires kinks, one can get arbitrarily close
to this optimum using differentiable schedules. Finally, in the limit as the number of finite
types increases and we approach a continuous distribution, kinks disappear, in that the
left and right derivatives converge to each other.
4 Linear and Progressive Capital Taxation
Although a key element of my implementation is to allow for nonlinear taxation, in some
cases a linear tax will do. By Proposition 1, any tax function M with M( x1 ) ≥ M∗ ( x1 ) and
M(0) = 0 implements the desired allocation. It follows that, if M∗ ( x1 ) is concave, one
can let M( x1 ) be the linear tangent at x1 = 0.
When can we expect M∗ to be concave? Its definition in equation (1) shows that a sufficient condition for concavity of M∗ is for E [V1 (·, θ1 )] to be concave.5 In turn, a sufficient
condition for concavity of E [V1 (·, θ1 )] is for each V1 (·, θ1 ) function to be concave, for all
θ1 ∈ Θ.6
Concavity of V1 ( x1 , θ1 ) is not a foregone conclusion. A sufficient condition for concavity is that the agent faces a convex tax schedule in the second period, so that the
locus of points (y, c) available to him is in a convex relation. This will be the case if
labor-income taxation is progressive.
Proposition 3. (a) If E [V1 (·, θ1 )] is concave a linear schedule M( x1 ) = m∗ x1 implements the
optimal allocation with zero savings; (b) If labor-income taxation is progressive in period t = 1,
then V1 ( x1 , θ1 ) is concave.
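Part (a) can be checked numerically in a simple case. The sketch below is illustrative only and not from the paper: it assumes log utility of consumption that is separable from labor, two equally likely types, and a hypothetical incentive compatible allocation. The savings grid is kept in a neighborhood of zero where, for these assumed numbers, truthful reporting stays optimal and $E[V_1(\cdot, \theta_1)]$ is concave. It compares the linear schedule $m^* x_1$ with the lower bound $M^*(x_1)$ from equation (1).

```python
import math

# Hypothetical primitives (assumptions chosen for illustration only).
BETA = 0.95
C0 = 1.0
H0 = 0.5                      # assumed period-0 labor disutility h(y_0, theta_0)
TYPES = (1.0, 2.0)
PROBS = (0.5, 0.5)
C1 = {1.0: 0.9, 2.0: 1.1}     # c_1(r_1)
Y1 = {1.0: 0.8, 2.0: 1.2}     # y_1(r_1)


def disutility(y, theta):
    """Assumed labor disutility h(y, theta) = 0.5 * (y / theta)^2."""
    return 0.5 * (y / theta) ** 2


def v1(x, theta):
    """V_1(x, theta): best report given wealth x and true type theta."""
    return max(math.log(C1[r] + x) - disutility(Y1[r], theta) for r in TYPES)


def expected_v1(x):
    return sum(p * v1(x, th) for p, th in zip(PROBS, TYPES))


# Value of truth telling with zero saving (truth is optimal at x = 0 here).
V0_STAR = math.log(C0) - H0 + BETA * sum(
    p * (math.log(C1[th]) - disutility(Y1[th], th))
    for p, th in zip(PROBS, TYPES)
)


def lower_bound_tax(x):
    """M*(x) from equation (1), using Psi(u) = exp(u + H0) for log utility."""
    return C0 - math.exp(V0_STAR - BETA * expected_v1(x) + H0)


def wedge():
    """m* defined by U_c^0 m* = beta E[U_c^1], with U_c = 1/c for log utility."""
    return BETA * C0 * sum(p / C1[th] for p, th in zip(PROBS, TYPES))


def linear_tax_dominates(grid, tol=1e-12):
    """Check M(x) = m* x >= M*(x) at every grid point."""
    m = wedge()
    return all(m * x >= lower_bound_tax(x) - tol for x in grid)


if __name__ == "__main__":
    grid = [i / 100 for i in range(-40, 91)]
    print("m* =", round(wedge(), 6))
    print("linear tax dominates on grid:", linear_tax_dominates(grid))
```

The check is a grid test, not a proof. Outside the chosen range the optimal report can switch in this example, which creates convex kinks in $V_1$, so the result should not be read as a global statement.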
5 Arbitrary Finite Horizon
In this section we extend the previous results to a longer horizon, with periods $t = 0, 1, \ldots, T$. The agent has utility function
\[ \sum_{t=0}^{T} \beta^t E[U^t(c_t, y_t, \theta_t)]. \tag{2} \]
We assume $\{\theta_t\}$ is a Markov process with $\theta_t \in \Theta$.
Consider any incentive compatible allocation $(c(\theta^t), y(\theta^t))$ with equilibrium values
$v(\theta^t)$ satisfying
\[ v(\theta^t) = U^t(c(\theta^t), y(\theta^t), \theta_t) + \beta E[v(\theta^{t+1}) \mid \theta_t]. \tag{3} \]
5 This follows from the following fact. Suppose $f : \mathbb{R} \to \mathbb{R}$ is decreasing and concave and that $g : \mathbb{R} \to \mathbb{R}$
is decreasing and convex. Then $f(g(\cdot))$ is concave.
6 Of course, these conditions are not necessary. For example, $E[V_1(\cdot, \theta_1)]$ may be concave even if $V_1(\cdot, \theta_1)$
is only concave for a subset of values of $\theta_1$.

The agent’s optimization problem is captured by the associated Bellman equation
\[ w(r^{t-1}, \theta_t) = \max_{r_t} \left\{ U^t(c(r^t), y(r^t), \theta_t) + \beta E[w(r^t, \theta_{t+1}) \mid \theta_t] \right\}. \tag{4} \]
An agent with current shock θt enters the period with a history r t−1 which affects the
contract. Thus, the state variable is (r t−1 , θt ). The agent chooses the report rt to maximize
utility. Note that the Markov assumption allows us to ignore the history of shocks θ t−1 .
For an incentive compatible allocation, truth telling is optimal even if the agent has made
false reports in the past.
Incentive compatibility is equivalent to
\[ w(r^{t-1}, \theta_t) = v(r^{t-1}, \theta_t), \]
so that v defined by the recursion (3) solves the Bellman equation (4).
5.1 Nonlinear Tax Implementation
For any given incentive compatible allocation $(c(\theta^t), y(\theta^t))$, consider the budget constraints
\[ \tilde{c}_t + M(x_{t+1}, r^t) \leq x_t + c(r^t), \tag{5} \]
with initial condition x0 = 0 and terminal condition x T +1 ≥ 0. If technology offers a net
marginal rate of return ρt , then M( xt+1 , r t ) − xt+1 /(1 + ρt ) represents a tax on savings.
The goal of the implementation is to find a sequence of functions { M( xt+1 , r t )} to
ensure that the agent finds it optimal not to save, so that xt+1 = 0 for all t = 0, 1, . . . , T.
To simplify, I refer to { M( xt+1 , r t )} as tax functions. The agent’s dynamic programming
problem has state variables $x_t$, $r^{t-1}$ and $\theta_t$ with Bellman equation
\[ V(x_t, r^{t-1}, \theta_t) = \max_{r_t, x_{t+1}} \left\{ U^t(c(r^t) + x_t - M(x_{t+1}, r^t), y(r^t), \theta_t) + \beta E[V(x_{t+1}, r^t, \theta_{t+1}) \mid \theta_t] \right\} \tag{6} \]
with $V(x_{T+1}, r^T, \theta_{T+1}) = 0$ so that
\[ V(x_T, r^{T-1}, \theta_T) = \max_{r_T} U^T(c(r^T) + x_T, y(r^T), \theta_T). \]
It is useful to define $W(x_t, x_{t+1}, r^t, \theta_t)$ as the right-hand side of the Bellman equation:
\[ W(x_t, x_{t+1}, r^t, \theta_t) \equiv U^t(c(r^t) + x_t - M(x_{t+1}, r^t), y(r^t), \theta_t) + \beta E[V(x_{t+1}, r^t, \theta_{t+1}) \mid \theta_t]. \tag{7} \]
I proceed recursively, starting in the last period and working backwards.
5.2 Last Two Periods
I start at T − 1 when only two periods remain. The argument is similar to the two period
case studied previously, but now there are shocks and reporting in both periods.
At $t = T - 1$ impose
\[ W(0, x_T, r^{T-1}, \theta_{T-1}) \leq v(r^{T-2}, \theta_{T-1}) \quad \forall x_T, r^{T-1}, \theta_{T-1}, \tag{8} \]
\[ W(0, 0, (r^{T-2}, \theta_{T-1}), \theta_{T-1}) = v(r^{T-2}, \theta_{T-1}) \quad \forall r^{T-2}, \theta_{T-1}. \tag{9} \]
The inequality ensures that if the agent arrives in period T − 1 with zero savings, x T −1 =
0, then it is optimal to report the current shock truthfully, r T −1 = θT −1 , and not save, x T =
0. The equality ensures that this delivers the same utility as originally, or, equivalently
that M(0, r T −1 ) = 0. These conditions are both necessary and sufficient to implement
the desired allocation. If the inequality is violated the agent can do better by deviating
to another savings level $x_T \neq 0$. Similarly, if the equality is violated it must be that the
original allocation is not available.
These conditions do not ignore double deviations. Indeed, if the agent were to deviate
and save $x_T \neq 0$, then misreporting $r_T \neq \theta_T$ may be optimal. Relatedly, if the agent misreports $r_{T-1} \neq \theta_{T-1}$, then saving $x_T \neq 0$ may be optimal. Thus, double deviations may be
better than single deviations. In this implementation, there is no need to ensure that double deviations are not strictly preferred to single deviations. Instead, the conditions ensure
that not deviating at all is preferable to any set of deviations. This is in contrast to the
implementation in Kocherlakota [2005] which rules out double deviations by ensuring
that zero saving is optimal for any reporting strategy.
The inequalities are equivalent to imposing that
\[ M(x_T, r^{T-1}) \geq M^*(x_T, r^{T-1}), \tag{10} \]
where
\[ M^*(x_T, r^{T-1}) \equiv \max_{\theta_{T-1}} \tilde{M}^*(x_T, r^{T-1}, \theta_{T-1}), \tag{11} \]
\[ \tilde{M}^*(x_T, r^{T-1}, \theta_{T-1}) \equiv c(r^{T-1}) - \Psi^{T-1}\left[ v(r^{T-2}, \theta_{T-1}) - \beta E[V(x_T, r^{T-1}, \theta_T) \mid \theta_{T-1}], y(r^{T-1}), \theta_{T-1} \right]. \tag{12} \]
Here, M̃∗ represents a fictitious tax function that ensures that each agent θT −1 is indifferent to any savings and report, that is, that the inequality (8) holds with equality for all x T ,
r T −1 and θT −1 . Such a tax function is not really feasible, since it would have to depend on
the true type $\theta_{T-1}$, in addition to the report $r^{T-1}$. Taking the upper envelope over true types
$\theta_{T-1}$ to define $M^*$ takes care of this. When an agent $\theta_{T-1}$ faces $M^*$ instead of $\tilde{M}^*$,
they will generally not be indifferent to various savings and reports.
Note that, for each r T −1 and θT −1 , the fictitious tax M̃∗ is the analog of the tax function
defined in equation (1), where θT −1 = θ0 was known. The innovation here is to take the
upper envelope over θT −1 and to condition on the history r T −1 . This rules out misreporting in period T − 1 as well as in period T.
The tax function M∗ is defined as the lowest possible tax that prevents a deviation.
Any tax function that is lower necessarily violates inequality (8) for some agent θT −1 . Feasible tax functions M must lie above M∗ , satisfying inequality (10), and have M(0, r T −1 ) =
0.
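Equations (11) and (12) can be illustrated with a small numerical sketch. The example is hypothetical and not from the paper: it assumes $T = 1$ (so that $T - 1 = 0$ and there is no prior history), i.i.d. shocks with two equally likely types, the separable utility $\log c - \frac{1}{2}(y/\theta)^2$, and an allocation that offers the same static incentive compatible menu in both periods. Under these assumptions the continuation value does not depend on the first report or on the current true type. All names and numbers are assumptions.

```python
import math

# Hypothetical primitives (assumptions chosen for illustration only).
BETA = 0.95
TYPES = (1.0, 2.0)
PROBS = (0.5, 0.5)
C = {1.0: 0.9, 2.0: 1.1}      # consumption assigned to each report
Y = {1.0: 0.8, 2.0: 1.2}      # output assigned to each report


def utility(c, y, theta):
    return math.log(c) - 0.5 * (y / theta) ** 2


def inverse_utility(u, y, theta):
    """Psi(u, y, theta): consumption c with utility(c, y, theta) == u."""
    return math.exp(u + 0.5 * (y / theta) ** 2)


def value_last(x, theta):
    """V(x_T, theta_T): last-period value with wealth x, best report."""
    return max(utility(C[r] + x, Y[r], theta) for r in TYPES)


def expected_value_last(x):
    return sum(p * value_last(x, th) for p, th in zip(PROBS, TYPES))


def v_truth(theta):
    """v(theta_{T-1}): value of truth telling and zero saving from T-1 on."""
    return utility(C[theta], Y[theta], theta) + BETA * expected_value_last(0.0)


def fictitious_tax(x, r, theta):
    """M~*(x, r, theta) as in equation (12): true type theta is indifferent."""
    target = v_truth(theta) - BETA * expected_value_last(x)
    return C[r] - inverse_utility(target, Y[r], theta)


def lower_bound_tax(x, r):
    """M*(x, r) as in equation (11): upper envelope over true types."""
    return max(fictitious_tax(x, r, theta) for theta in TYPES)


def w(x, r, theta, tax):
    """W(0, x, r, theta): report r and save x, starting with zero wealth."""
    return utility(C[r] - tax(x, r), Y[r], theta) + BETA * expected_value_last(x)


if __name__ == "__main__":
    for r in TYPES:
        for x in (-0.3, 0.0, 0.3):
            print(f"r={r}  x={x:+.1f}  M*(x, r)={lower_bound_tax(x, r):+.4f}")
```

Taking the maximum over true types is what makes the schedule depend only on the report: no type can gain from any combination of saving and misreporting, although a given type is then generally not indifferent across savings levels.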
5.3 Earlier Periods
I now work backwards, defining a tax schedule for all periods t = T − 2, T − 3, . . . , 0.
Suppose tax schedules for periods $s = t + 1, t + 2, \ldots, T - 1$ have already been constructed. Associated with these tax schedules $\{M(x_{s+1}, r^s)\}_{s=t+1}^{T-1}$ are value functions
$\{V(x_s, r^{s-1}, \theta_s)\}_{s=t+1}^{T-1}$. I seek to construct a tax schedule $M(x_{t+1}, r^t)$ and value function
$V(x_t, r^{t-1}, \theta_t)$ for period $t$.
Recall that the function $W(x_t, x_{t+1}, r^t, \theta_t)$ is given by (7) using next period’s value
function $V(x_{t+1}, r^t, \theta_{t+1})$ and the current tax schedule $M(x_{t+1}, r^t)$, which must be constructed. I impose that, whatever the value function $V(x_{t+1}, r^t, \theta_{t+1})$, the tax function
$M(x_{t+1}, r^t)$ be such that the implied $W$ satisfies
\[ W(0, x_{t+1}, r^t, \theta_t) \leq v(r^{t-1}, \theta_t) \quad \forall x_{t+1}, r^t, \theta_t, \tag{13} \]
\[ W(0, 0, (r^{t-1}, \theta_t), \theta_t) = v(r^{t-1}, \theta_t) \quad \forall r^{t-1}, \theta_t. \tag{14} \]
The previous conditions (8)–(9) are the special case of conditions (13)–(14) for t = T − 1.
The same arguments lead us to define
\[ M^*(x_{t+1}, r^t) \equiv \max_{\theta_t} \tilde{M}^*(x_{t+1}, r^t, \theta_t), \tag{15} \]
where
\[ \tilde{M}^*(x_{t+1}, r^t, \theta_t) \equiv c(r^t) - \Psi^t\left[ v(r^{t-1}, \theta_t) - \beta E[V(x_{t+1}, r^t, \theta_{t+1}) \mid \theta_t], y(r^t), \theta_t \right], \]
and impose:
\[ M(x_{t+1}, r^t) \geq M^*(x_{t+1}, r^t). \tag{16} \]
Given a choice for M( xt+1 , r t ) this defines a value function V ( xt , r t−1 , θt ) using the Bellman equation (6).
Continuing this way gives a tax schedule M( xt+1 , r t ) and value function V ( xt , r t−1, θt )
for t = 0, 1, . . . , T − 1. This construction ensures that the Bellman equation (6) holds
and that the maximum is attained by truth-telling and no savings in all periods t =
0, 1, . . . , T − 1. By the principle of optimality this implies that truth telling and no savings is optimal among all possible reporting and savings strategies satisfying the budget
constraint (5).
Proposition 4. Any incentive-compatible allocation $\{c(\theta^t), y(\theta^t)\}_{t=0}^{T}$ can be implemented using the budget constraints (5) by any sequence of tax functions on savings $\{M(x_{t+1}, r^t)\}_{t=0}^{T-1}$
satisfying the inequalities (16), where $M^*(x_{t+1}, r^t)$ is defined implicitly from $\{M(x_{t+1}, r^t)\}_{t=0}^{T-1}$
using conditions (15) and the Bellman equation (6). Conversely, if a sequence of tax functions
$\{M(x_{t+1}, r^t)\}_{t=0}^{T-1}$ implements the incentive-compatible allocation $\{c(\theta^t), y(\theta^t)\}_{t=0}^{T}$, it must satisfy inequalities (16).
The characterization provided by (15)–(16) is exhaustive, providing all the tax schedules that implement the allocation. Note that, according to equation (15), the tax schedules $\{M(x_{s+1}, r^s)\}_{s=t+1}^{T-1}$ affect the schedule $M^*(x_{t+1}, r^t)$ in period $t$. Indeed, higher tax
functions $\{M(x_{s+1}, r^s)\}_{s=t+1}^{T-1}$ lead to lower values of $V(x_{t+1}, r^t, \theta_{t+1})$, resulting in lower
values for $M^*(x_{t+1}, r^t)$. In this way, there is a trade-off between current and future taxes.
We conclude that, unlike the two-period horizon case, it is incorrect to interpret the sequence generated by setting M( xt+1 , r t ) = M∗ ( xt+1 , r t ) for t = 0, . . . , T − 1 as producing
lowest possible tax schedules.
6 History Independent Taxation
I now explore the possibility of making the savings tax history independent. It is important to understand that most allocations, including optimal ones, have history-dependent
intertemporal wedges. Thus, the challenge is to create a savings tax that is history independent, but still manages to deliver the appropriate history-dependent wedges. I show
two ways to do this.

Figure 1: A kinked upper envelope $\hat{M}(x)$ function constructed from $M(x; r)$ functions
that have not been transposed. The figure plots $M(x; r)$, $M(x; r')$, $M(x; r'')$ and their upper envelope $\hat{M}(x)$ against $x$.

The first implementation takes the history independent schedule as the upper envelope of the history dependent ones:
\[ \hat{M}(x) \equiv \sup_{r^t} M(x, r^t), \]
where each function $M(\cdot, r^t)$ is constructed as in Section 5. Since $\hat{M}(x) \geq M(x, r^t)$ and
M̂ (0) = M(0, r t ) by definition, this history independent tax continues to implement zero
savings. However, because the slope Mx (0, r t ) equals the intertemporal wedge, if the
latter varies across histories r t , the upper envelope M̂ defined in this way will necessarily
have a kink at x = 0. Indeed, the left and right derivatives are the smallest and largest
intertemporal wedge, respectively. This situation is depicted in Figure 1.
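As a purely numerical illustration of the kink, the following Python sketch builds the upper envelope of three schedules and computes its one-sided derivatives at zero. The quadratic functional form and the wedge values are hypothetical choices made only for this example; they are not taken from the paper.

```python
# Hypothetical history-dependent schedules M(x; r) = tau_r * x - 0.5 * x**2.
# Each satisfies M(0; r) = 0 and has slope tau_r (the wedge) at x = 0.
wedges = {"r": 0.1, "r'": 0.25, "r''": 0.4}


def M(x, tau):
    """One history-dependent savings tax schedule (assumed form)."""
    return tau * x - 0.5 * x**2


def M_hat(x):
    """Upper envelope: pointwise supremum over the histories."""
    return max(M(x, tau) for tau in wedges.values())


def one_sided_slopes(f, h=1e-6):
    """Finite-difference left and right derivatives of f at x = 0."""
    left = (f(0.0) - f(-h)) / h
    right = (f(h) - f(0.0)) / h
    return left, right


left, right = one_sided_slopes(M_hat)
print(f"left derivative ~ {left:.4f}, right derivative ~ {right:.4f}")
```

In this example the left derivative is approximately the smallest wedge and the right derivative approximately the largest, so the envelope is kinked at zero whenever the wedges differ.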
The second way of achieving history independence attempts to avoid creating kinks.
Up to now, the implementations have zero savings. In contrast, the next implementation
relies on agents making non-trivial saving decisions. The idea is to have savings act as a
sufficient statistic for the intertemporal wedge. Different savings place agents at different
points along the same history-independent schedule, precisely where the marginal tax
rate equals the desired intertemporal wedge.
The construction relies on Ricardian equivalence arguments, which ensure that there are many tax functions that implement the same allocation, each with a different savings choice. As I show below, this degree of freedom can be exploited to construct a history independent tax schedule.
Agents face budget constraints
$$c_t + M(x_{t+1}, r^t) \le x_t + y(r^t) - T(r^t). \tag{17}$$
Here T represents an income tax on labor income. We now separate more explicitly the
income and savings taxes. Previously, we worked with a particular income tax, given by
$$T(r^t) = y(r^t) - c(r^t).$$
Suppose $\{M(\cdot, r^t), T(r^t)\}$ implements $\{c(r^t), y(r^t)\}$ with zero savings, $x(r^t) = 0$ for all $r^t$. First, consider the set of perturbations defined by:
$$\tilde M(x, r^t) = M(x, r^t) + \Delta_v(r^t)$$
$$\tilde T(r^t) = T(r^t) - \Delta_v(r^t).$$
Clearly this perturbation does not affect the budget constraint, since only the sum of taxes paid matters and $M + T = \tilde M + \tilde T$. As a result, this perturbation continues to implement $\{c(r^t), y(r^t)\}$ with zero savings, $x(r^t) = 0$. Next, consider the set of perturbations defined by
$$\tilde M(x, r^t) = M(x - \Delta_h(r^t), r^t)$$
$$\tilde T(r^{t+1}) = T(r^{t+1}) + \Delta_h(r^{t+1}).$$
This perturbation increases labor income taxes at $t + 1$ but provides a deduction in the savings tax at $t$. By saving $x(r^t) = \Delta_h(r^t)$ in period $t$ the agent is able to pay for the increase in taxes at $t + 1$. Indeed, the set of feasible sequences $\{c(r^t), y(r^t)\}$ is unchanged. Thus, this perturbation implements $\{c(r^t), y(r^t)\}$ with savings given by $x(r^t) = \Delta_h(r^t)$ for all $r^t$.
Geometrically, the first perturbation shifts the savings tax schedule vertically, while the second does so horizontally; indeed, the subscripts $v$ and $h$ in $\Delta$ stand for vertical and horizontal, respectively. Combining both perturbations we can reposition each function $M(\cdot, r^t)$ both vertically and horizontally, for any history $r^t$.
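The geometry of the two shifts can be sketched in a few lines of Python for a single fixed history. The schedule, the labor tax level, and the shift sizes below are hypothetical; the horizontal shift is shown on the savings-tax side only, and the sketch is an illustration rather than part of the formal argument.

```python
def shift_vertical(M, T, dv):
    """First perturbation: raise the savings tax by dv, cut the labor tax by dv."""
    return (lambda x: M(x) + dv), T - dv


def shift_horizontal(M, dh):
    """Second perturbation, savings-tax side only: M_tilde(x) = M(x - dh)."""
    return lambda x: M(x - dh)


# Hypothetical savings tax schedule and labor tax for one fixed history r^t.
M = lambda x: 0.2 * x - 0.5 * x**2
T = 1.0

M_v, T_v = shift_vertical(M, T, dv=0.3)
M_h = shift_horizontal(M, dh=0.7)

# The vertical shift leaves total taxes paid unchanged at every savings level.
total_gap = max(abs((M_v(x) + T_v) - (M(x) + T)) for x in (-1.0, 0.0, 0.5, 2.0))

# Under the horizontal shift, saving dh is taxed as zero savings was before.
same_point = abs(M_h(0.7) - M(0.0))

print(total_gap, same_point)
```

Both printed quantities are zero up to floating-point error, which is the sense in which the perturbations only reposition the schedule without changing what the agent can afford at the relevant savings level.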
The goal is now as follows. One is given some sequence of functions $\{M(\cdot, r^t), T(r^t)\}$ that implements some allocation. The idea is to reposition each tax schedule $M(\cdot, r^t)$ so that $\tilde M(\cdot, r^t)$ is tangent to a common, differentiable, upper-envelope function $\hat M(x)$. If this turns out to be possible, then, since $\hat M$ lies above each $\tilde M(\cdot, r^t)$ and is equal to it at the proposed equilibrium saving points, the history-independent tax schedule $\hat M$, together with the labor tax $\{\tilde T(r^t)\}$, implements the same allocation as $\{\tilde M(\cdot, r^t), \tilde T(r^t)\}$, which, in turn, implements the original allocation.
[Figure 2: A smooth upper envelope $\hat M(x)$ constructed from transposed functions $M(x; r)$, $M(x; r')$, $M(x; r'')$.]
To see how finding a differentiable upper envelope may be possible, consider first the case where the function $M(\cdot, r^t)$ is concave for every history of reports $r^t$. Now take any convex and differentiable function $\hat M(x)$ such that $\lim_{x \to -\infty} \hat M'(x) = -\infty$ and $\lim_{x \to \infty} \hat M'(x) = \infty$. Fix a history of reports $r^t$ and reposition $M(\cdot, r^t)$ so that the perturbed schedule $\tilde M(\cdot, r^t)$ is equal to and tangent to $\hat M$ at $x = \Delta_h$.[7] Since $\hat M$ is convex and $M(\cdot, r^t)$ is concave, it then follows that $\tilde M(x, r^t) \le \hat M(x)$ for all $x$, with equality at $x = \Delta_h$. Repeating this for all histories $r^t$, one constructs a perturbed sequence of tax schedules $\{\tilde M(\cdot, r^t)\}$ for which $\hat M$ can play the role of an upper envelope. This situation is illustrated in Figure 2.
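The repositioning step can be checked numerically. The Python sketch below follows the recipe in footnote 7 for a hypothetical convex envelope, $\hat M(x) = x^2/2$, and hypothetical concave schedules of the form $\tau x - x^2$; these functional forms and the wedge values are assumptions made only for the illustration.

```python
def M_hat(x):
    """Hypothetical convex envelope; its derivative, x, ranges over all reals."""
    return 0.5 * x**2


def M_hat_prime_inverse(slope):
    """Inverse of M_hat'(x) = x for the assumed envelope."""
    return slope


def make_schedule(tau):
    """Hypothetical concave schedule with M(0) = 0 and wedge M_x(0) = tau."""
    return lambda x: tau * x - x**2


def reposition(M, wedge):
    """Return (M_tilde, dh, dv) chosen as in footnote 7."""
    dh = M_hat_prime_inverse(wedge)  # tangency: M_hat'(dh) = M_x(0)
    dv = M_hat(dh) - M(0.0)          # equality: M_tilde(dh) = M_hat(dh)
    return (lambda x: M(x - dh) + dv), dh, dv


grid = [i / 10 for i in range(-50, 51)]
results = {}
for tau in (0.1, 0.25, 0.4):
    M_tilde, dh, dv = reposition(make_schedule(tau), tau)
    min_gap = min(M_hat(x) - M_tilde(x) for x in grid)
    gap_at_dh = M_hat(dh) - M_tilde(dh)
    results[tau] = (dh, min_gap, gap_at_dh)

for tau, (dh, min_gap, gap_at_dh) in results.items():
    print(f"wedge {tau}: savings {dh}, min gap {min_gap:.2e}, gap at dh {gap_at_dh:.2e}")
```

In this example each repositioned schedule lies weakly below the common envelope on the grid and touches it at the assigned savings level, and agents with a higher wedge are assigned higher savings.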
The upper envelope $\hat M$ is not unique. Each $\hat M$ requires a different sequence for the taxes on labor $\{\hat T(r^t)\}$. Moreover, the assumption that $\hat M$ is convex in this case provides a sufficient condition for $\tilde M(x, r^t) \le \hat M(x)$, but convexity of $\hat M$ is not necessary. There are cases where a concave $\hat M$ can play the role of an upper envelope. What is required is that $\hat M$ be “less concave” than each $M(\cdot, r^t)$. In this way, the curvature in the tax schedules constrains the curvature of the upper envelope.
Similarly, as illustrated in Figure 3, when all the tax functions $M(\cdot, r^t)$ are convex we can find an upper envelope $\hat M$ that is “more convex” than each one of them. This is possible if the convexity of the functions $\{M(\cdot, r^t)\}$ is bounded in a sense made precise by the next proposition.
[7] That is, setting $\hat M'(\Delta_h) = \tilde M_x(\Delta_h, r^t) = M_x(0, r^t)$ and $\Delta_v = \hat M(\Delta_h) - M(0, r^t)$.
[Figure 3: A smooth upper envelope $\hat M(x)$ constructed from transposed functions $M(x; r)$, $M(x; r')$, $M(x; r'')$.]
Proposition 5. Suppose the allocation $\{c(\theta^t), y(\theta^t)\}_{t=0}^{T}$ is implemented by facing agents with budget constraints (17) for some sequence of tax functions on savings $\{M(x_{t+1}, r^t), T(r^t)\}_{t=0}^{T-1}$ with $M(\cdot, r^t)$ differentiable at $x_{t+1} = 0$ for all $r^t$. Suppose further that there exists a scalar $A \in \mathbb{R}$ such that
$$M(x, r^t) \le A x^2 + M_x(0, r^t)\, x + M(0, r^t) \tag{18}$$
for every history of reports $r^t$. Take any twice differentiable function $\hat M(x)$ with $\hat M''(x) \ge A$ for all $x$ and inf.
Then the same allocation can be implemented with a history independent tax on savings $\hat M(x_{t+1})$ and some alternative tax on labor $\{\hat T(r^t)\}_{t=0}^{T-1}$.
The proposition provides a sufficient condition for the existence of a smooth upper envelope. The key to condition (18) is that it bounds the convexity of the functions $\{M(\cdot, r^t)\}$. To see this, note that taken together, the last two terms, $M_x(0, r^t)\, x + M(0, r^t)$, represent the linear Taylor expansion of $M(\cdot, r^t)$ around $x = 0$. The quadratic term, $A x^2$, contributes strict convexity to the right-hand side of the condition. Thus, if the functions $\{M(\cdot, r^t)\}$ are all concave, the condition holds with $A = 0$. If the functions $\{M(\cdot, r^t)\}$ are all twice differentiable, a sufficient condition for inequality (18) is that their second derivative be bounded above by some scalar $A$. The inequality (18) is weaker than this requirement, for two reasons. First, it does not require the additional differentiability: the second derivative need not exist, and the first derivative is only required at $x = 0$. Second, even if $M(\cdot, r^t)$ is twice differentiable, the inequality (18) is weaker than a bounded second derivative.[8]
Note that $\hat M$ is not unique. Indeed, if $\hat M$ can serve as an upper envelope, so too can any more convex function. Thus, this implementation favors convexity of the $\hat M$ function. In this sense, progressive taxation of savings, with marginal tax rates that rise with savings, emerges as a desirable feature. Of course, even if $\hat M$ is convex, this does not say anything about whether the overall tax system is “progressive”. First, this should hardly be expected, because the implementation is meant for any incentive compatible allocation, regardless of its redistributive properties. Second, it is unclear whether agents that will be saving more and, thus, facing higher marginal tax rates, are “richer”. Given the allocation, each period agents are ordered by their intertemporal wedge. This is used in this implementation as a sufficient statistic for their equilibrium savings: those with a higher wedge must save more. But it is unclear whether agents with a higher wedge are “richer” in the sense of having high productivity or enjoying higher consumption.
Finally, it is worth pointing out that while this procedure removes the history dependence in the savings tax, the labor income tax remains generally history dependent. Again, this should hardly be surprising in the present context, where we aim to implement any incentive compatible allocation. In the abstract, it is unclear whether the labor income tax becomes more or less sophisticated when we move from $T(r^t)$ to $\tilde T(r^t)$. The perturbations to $T$ do have a simple interpretation in terms of lump-sum tax credits, for $\Delta_v$, or allowances and deductions on savings taxes, for $\Delta_h$. Social security benefits and sheltered savings accounts that depend on the history of labor income are real-world counterparts of the kind of instrument that is required.
References
Mark Aguiar and Erik Hurst. Consumption versus expenditure. Journal of Political Economy, 113(5):919–948, October 2005.
Stefania Albanesi and Christopher Sleet. Dynamic optimal taxation with private information. Review of Economic Studies, 2006. forthcoming.
[8] In particular, the second derivative could become unbounded away from $x = 0$ without violating the inequality (18). Note that if, instead, we limit ourselves to a discussion of the second derivative at $x = 0$ then, if $M(\cdot, r^t)$ is differentiable there, inequality (18) implies that this second derivative is less than $A$ for all $r^t$.
Peter A. Diamond and James A. Mirrlees. A model of social insurance with variable
retirement. Journal of Public Economics, 10(3), 1978.
Mikhail Golosov, Narayana Kocherlakota, and Aleh Tsyvinski. Optimal indirect and capital taxation. Review of Economic Studies, 70(3):569–587, 2003.
Narayana R. Kocherlakota. Figuring out the impact of hidden savings on optimal unemployment insurance. Review of Economic Dynamics, 7:541–554, 2004.
Narayana R. Kocherlakota. Zero expected wealth taxes: A Mirrlees approach to dynamic optimal taxation. Econometrica, 73(5):1587–1621, 2005.
Paul Milgrom and Ilya Segal. Envelope theorems for arbitrary choice sets. Econometrica,
70(2):583–601, March 2002.
James A. Mirrlees. An exploration in the theory of optimum income taxation. Review of
Economic Studies, 38(2):175–208, 1971.
William P. Rogerson. Repeated moral hazard. Econometrica, 53(1):69–76, 1985.