An Exploration in the Theory of Optimum Income Taxation
Author(s): J. A. Mirrlees
Source: The Review of Economic Studies, Vol. 38, No. 2 (Apr., 1971), pp. 175-208
Published by: The Review of Economic Studies Ltd.
Stable URL: http://www.jstor.org/stable/2296779

An Exploration in the Theory of Optimum Income Taxation ¹ ²

J. A. MIRRLEES
Nuffield College, Oxford

1. INTRODUCTION

One would suppose that in any economic system where equality is valued, progressive income taxation would be an important instrument of policy. Even in a highly socialist economy, where all who work are employed by the State, the shadow price of highly skilled labour should surely be considerably greater than the disposable income actually available to the labourer.
In Western Europe and America, tax rates on both high and low incomes are widely and lengthily discussed:³ but there is virtually no relevant economic theory to appeal to, despite the importance of the tax.

Redistributive progressive taxation is usually related to a man's income (or, rather, his estimated income). One might obtain information about a man's income-earning potential from his apparent I.Q., the number of his degrees, his address, age or colour: but the natural, and one would suppose the most reliable, indicator of his income-earning potential is his income. As a result of using men's economic performance as evidence of their economic potentialities, complete equality of social marginal utilities of income ceases to be desirable, for the tax system that would bring about that result would completely discourage unpleasant work. The questions therefore arise what principles should govern an optimum income tax; what such a tax schedule would look like; and what degree of inequality would remain once it was established.

The problem seems to be a rather difficult one even in the simplest cases. In this paper, I make the following simplifying assumptions:

(1) Intertemporal problems are ignored. It is usual to levy income tax upon each year's income, with only limited possibilities of transferring one year's income to another for tax purposes. In an optimum system, one would no doubt wish to relate tax payments to the whole life pattern of income,⁴ and to initial wealth; and in scheduling payments one would wish to pay attention to imperfect personal capital markets and imperfect foresight. The economy discussed below is timeless. Thus the effects of taxation on saving are ignored. One might perhaps regard the theory presented as a theory of "earned income" taxation (i.e. non-property income).

(2) Differences in tastes, in family size and composition, and in voluntary transfers, are ignored. These raise rather different kinds of problems, and it is natural to assume them away.

¹ First version received Aug. 1970; final version received October 1970 (Eds.).
² Work on this paper and its continuation was begun during a stimulating and pleasurable visit to the Department of Economics, M.I.T. The influence of Peter Diamond is particularly great, and his comments have been very useful. Earlier versions were presented at the Cowles Foundation, to the Economic Study Society, at the London School of Economics, and to CORE. I am grateful to the members of these seminars and to A. B. Atkinson for valuable comments. I am also greatly indebted to P. G. Hare and J. R. Broome for the computations.

³ Discussions on (usually) orthodox lines, including many important points neglected in the present paper, can be found in [7], [1], [5, Chapters 5, 7, 8], and [6, Chapters 11 and 12]. [2] is close in spirit to what is attempted here.

⁴ Cf. [7, Chapter 6].

(3) Individuals are supposed to determine the quantity and kind of labour they provide by rational calculation, corresponding to the maximization of a utility function, and social welfare is supposed to be a function of individual utility levels. It is also supposed that the quantity of labour a man offers may be varied within wide limits without affecting the price paid for it. The first assumption may well be seriously unrealistic, especially at higher income levels, where it does sometimes appear that there is consumption satiation and that work is done for reasons barely connected with the income it provides to the "labourer".

(4) Migration is supposed to be impossible. Since the threat of migration is a major influence on the degree of progression in actual tax systems, at any rate outside the United States, this is another assumption one would rather not make.¹

(5) The State is supposed to have perfect information about the individuals in the economy, their utilities and, consequently, their actions. In practice, this is certainly not the case for certain kinds of income from self-employment, in particular work done for the worker himself and his family; and in some countries, the extent of uncertainty about incomes is very great.
Yet it seems doubtful whether the neglect of this uncertainty is a simplification of much significance.

(6) Various formal simplifications are made to render the mathematics more manageable: there is supposed to be one kind of labour (in a special sense to be explained below); there is one consumer good; welfare is separable in terms of the different individuals of the economy, and symmetric, i.e. it can be expressed as the sum of the utilities of individuals when the individual utility function (the same for all) is suitably chosen.

(7) The costs of administering the optimum tax schedule are assumed to be negligible.

In sections 2-5, the more general properties of the optimum income-tax schedule, and the rules governing it, are discussed. The treatment is not rigorous. Nevertheless a reader who wants to avoid mathematical details can omit the last page or two of section 3, and will probably want to glance through section 4 rather rapidly. In section 6, I begin the discussion of special cases. The mathematical arguments in sections 6-8 are frequently complicated. If the reader goes straight to section 9, where numerical results are presented and discussed, he should not find the omission of the previous sections any handicap. He may, nevertheless, find it interesting to look at the results and conjectures presented at the beginning of section 7, and at the diagrams for the two cases discussed in section 8. Rigorous proofs of the main theorems will be given in a subsequent paper, [4].

2. MODEL AND PROBLEM

Individuals have identical preferences. We shall suppose that consumption and working time enter the individual's utility function. When consumption is $x$ and the time worked $y$, utility is $u(x, y)$. $x$ and $y$ both have to be non-negative, and there is an upper limit to $y$, which is taken to be 1. In fact, it is assumed that: $u$ is a strictly concave, continuously differentiable, function (strictly) increasing in $x$, (strictly) decreasing in $y$, defined for $x > 0$ and $0 \le y < 1$. $u$ tends to $-\infty$ as $x$ tends to 0 from above or $y$ tends to 1 from below.
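As a purely illustrative sketch of these assumptions, the snippet below takes one concrete utility function, $u(x, y) = \log x + \log(1 - y)$, and spot-checks the stated properties numerically. The functional form and all sample points are assumptions chosen for the illustration; the assumptions above do not single out any particular $u$.

```python
import math

def u(x, y):
    """Illustrative utility: log consumption plus log leisure (an assumed example)."""
    if x <= 0 or not (0 <= y < 1):
        raise ValueError("u is defined only for x > 0 and 0 <= y < 1")
    return math.log(x) + math.log(1.0 - y)

# Monotonicity: more consumption raises utility, more work lowers it.
increasing_in_x = u(2.0, 0.3) > u(1.0, 0.3)
decreasing_in_y = u(1.0, 0.6) < u(1.0, 0.3)

# Strict concavity, spot-checked at one pair of points via the midpoint inequality.
a, b = (0.5, 0.1), (3.0, 0.8)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
midpoint_concave = u(*mid) > 0.5 * (u(*a) + u(*b))

# Utility becomes very negative near x = 0 and near y = 1.
near_zero_consumption = u(1e-12, 0.3)
near_full_time_work = u(1.0, 1 - 1e-12)
```

These are spot checks at a handful of points, not a proof; for this particular form the properties also follow directly from the properties of the logarithm.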
The usefulness of a man's time, from the point of view of production, is assumed to vary from person to person. To each individual corresponds a number $n$ such that the quantity of labour provided, per unit of his time, is $n$. If he works for time $y$, he provides a quantity of labour $ny$. There is a known distribution of skills, measured by the parameter $n$, in the population. The number of persons with labour parameter $n$ or less is $F(n)$. It will be assumed that $F$ is differentiable, so that there is a density function for ability, $f(n) = F'(n)$. Call an individual whose ability-parameter is $n$ an $n$-man.

¹ The relation of optimum tax schedules to propensities to migrate is discussed in another paper under preparation.

The consumption choice of an $n$-man is denoted by $(x_n, y_n)$. Write $z_n = n y_n$ for the labour he provides. Then the total labour available for use in production in the economy is

$$Z = \int_0^\infty z_n f(n)\,dn, \tag{1}$$

and the aggregate demand for consumer goods is

$$X = \int_0^\infty x_n f(n)\,dn. \tag{2}$$

In order to avoid the possibility of infinite labour supply, I assume that

$$\int_0^\infty n f(n)\,dn < \infty. \tag{3}$$

Each individual makes his choice of $(x_n, y_n)$ in the light of his budget constraint. Using an income tax, the government can arrange that a man who supplies a quantity of labour $z$ can consume no more than $c(z)$ after tax: the government can choose the function $c$ arbitrarily. It makes sense to impose the restriction on the government's choice of $c$, that $c$ be upper semi-continuous, for then all individuals have available to them consumption choices that maximize their utility, subject to the budget constraint:¹

$$(x_n, y_n) \text{ maximizes } u(x, y) \text{ subject to } x \le c(ny). \tag{4}$$

Notice that $(x_n, y_n)$ may not be uniquely determined for every $n$.² I write:

$$u_n = u(x_n, y_n). \tag{5}$$

Proposition 1. There exists a number $n_0 \ge 0$ such that

$$y_n = 0 \quad (n \le n_0), \qquad y_n > 0 \quad (n > n_0). \tag{6}$$

Proof. If $m < n$, and $y_m > 0$,

$$u[c(m y_m), y_m] < u\left[c\left(n \cdot \frac{m}{n} y_m\right), \frac{m}{n} y_m\right] \le u_n.$$

Consequently, $y_m = 0$ if $y_n = 0$, since then $y_m = 0$ gives the utility $u_n$ to $m$-man.
Thus

$$n_0 = \inf\,[\,n \mid y_n > 0\,]$$

has the desired properties. ∥

Proposition 2. Any function³ of $n$, $(x_n, y_n)$, that satisfies (4) for some upper semi-continuous function $c$ also satisfies (4) for some non-decreasing, right-continuous function $c'$.

¹ To say that $c$ is upper semi-continuous means that $\limsup_{i \to \infty} c(z_i) \le c(z)$ when $\lim z_i = z$. If $u_n = \sup\{u(x, y) \mid x \le c(ny)\}$, we can suppose that $u(x_i, y_i) \to u_n$, $x_i \le c(n y_i)$, $x_i \to x$ and $y_i \to y$ (since $\{y_i\}$ and therefore $\{x_i\}$ is bounded). By the upper semi-continuity of $c$, $x \le \limsup c(n y_i) \le c(ny)$; and by the continuity of $u$, $u(x, y) = \lim u(x_i, y_i) = u_n$. Therefore the supremum is attained.

² In other words, we have a correspondence, providing a set of utility maximizing choices for $n$-men. It arises when the consumption function $c$ coincides with the indifference curve for part of its length. It is convenient nevertheless to use the notation of the text, despite its suggestion that we are dealing with a function.

³ It is easy to see that the result is true for a correspondence also.

Proof. Define $c'(z) = \sup_{z' \le z} c(z')$. If $x' \le c'(ny')$, then, for any $\varepsilon > 0$, there exists $y'' \le y'$ such that $x' - \varepsilon \le c(ny'')$. Thus $u(x' - \varepsilon, y'') \le u_n$, which implies, since $u$ is a decreasing function in $y$, that $u(x' - \varepsilon, y') \le u_n$. Letting $\varepsilon \to 0$, $u(x', y') \le u_n$. It follows that $(x_n, y_n)$ maximizes $u$ subject to $x \le c'(ny)$.

$c'$ is clearly a non-decreasing function of $z$. To prove that it is right-continuous, take a decreasing sequence $z_i \to z$. $c'(z_i)$ is a non-increasing sequence, and therefore tends to a limit, which is not less than $c'(z)$. If it is equal to $c'(z)$, there is no more to prove. Suppose it is greater. Then for some $\varepsilon > 0$ each $c'(z_i) > c'(z) + \varepsilon$. Therefore, there exists a sequence $(z'_i)$ such that $z'_i \le z_i$ and $c'(z_i) \ge c(z'_i) > c'(z) + \varepsilon$. The second inequality implies that $z'_i > z$. Thus $z'_i \to z$. Yet $\limsup c(z'_i) > c(z)$, which contradicts upper semi-continuity. Thus in fact, $c'$ is right-continuous. ∥

This proposition says that the marginal tax rate may as well be not greater than 100 per cent.
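On a finite grid, the construction $c'(z) = \sup_{z' \le z} c(z')$ used in the proof of Proposition 2 reduces to a running maximum. The sketch below illustrates only that construction; the grid and the schedule values are hypothetical numbers invented for the example, not taken from the paper.

```python
from itertools import accumulate

def nondecreasing_envelope(c_values):
    """Running maximum: c'(z_k) = max of c(z_j) over j <= k on an increasing grid."""
    return list(accumulate(c_values, max))

# Hypothetical schedule sampled on an increasing grid of labour supplies z.
# It dips after z = 2, i.e. the marginal tax rate exceeds 100 per cent there.
z_grid = [0, 1, 2, 3, 4, 5]
c = [1.0, 1.6, 2.1, 1.9, 2.0, 2.6]

c_env = nondecreasing_envelope(c)  # flat where c dipped, equal to c elsewhere
```

The envelope never lies below the original schedule and coincides with it wherever the original is already at its running maximum, which is why, as the proposition states, utility-maximizing choices are unaffected by the replacement.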
We shall consider later whether the marginal tax rate should be positive.

The government chooses the function $c$ so as to maximize a welfare function

$$W = \int_0^\infty G(u_n) f(n)\,dn. \tag{7}$$

I use the function $G$ here, rather than writing $u_n$ alone, because I shall later want to devote special attention to the case $u_{xy} = 0$ (when $u$ can be written as the sum of a function depending only on $x$ and a function depending only on $y$). In maximizing welfare, the government is constrained by production possibilities: it must be possible to produce the consumption demands, $X$, arising from its choice of $c$, with labour input no greater than $Z$. The production constraint is written

$$X \le H(Z). \tag{8}$$

We have not yet fully specified the possibilities available to the government, since, if $(x_n, y_n)$ is not uniquely defined, it is not clear whether the government or the consumer is allowed to choose the particular utility-maximizing point. Perhaps it is reasonable to suppose that the government can choose, and that the necessity for market-clearing will make its choices actual. But it will turn out that the issue is of no significance when we make the following assumption, as we shall:

(A) $y_n$ is uniquely defined for all $n$ except for a set of measure 0.

Thus the class of functions $c$ from which the government chooses is further restricted by the requirement that the function lead to choices satisfying (A). It will appear in due course that (A) is satisfied for all functions $c$ in the particular cases we shall be most concerned with.

3. NECESSARY CONDITIONS FOR THE OPTIMUM

On the assumption that an optimum for our problem exists, we shall now obtain conditions that it must satisfy. The mathematical argument will not be rigorous. To do the analysis properly, one must attend to a number of rather tricky points. Since these technical details tend to obscure the main lines of the argument, rigorous proofs will be presented separately, in the continuation of this paper. The nature of these neglected difficulties will be discussed briefly in the next section.
The key to a reasonably neat solution of the problem is to find a convenient expression of the condition that each man maximizes his utility subject to the imposed "consumption function" $c$. If we suppose that $c$ is differentiable, the derivative of $u[c(ny), y]$ with respect to $y$ must be zero. Denoting the derivative of $u$ with respect to its first and second arguments by $u_1$ and $u_2$, respectively, we have

$$u_1 n c'(ny) + u_2 = 0. \tag{9}$$

Recollect that $u_n$ is the utility of $n$-man. Then a straightforward calculation, using the first-order condition (9), yields

$$\frac{du_n}{dn} = u_1 y_n c' = -\frac{y_n u_2}{n}. \tag{10}$$

(The expressions on the right are, of course, alternative expressions for the partial derivative of $u$ with respect to $n$, evaluated at the maximum. The case where $n$ enters $u$ in a more general manner can be analyzed by using this more general equation. We shall return to this point later.)

Our problem is to maximize $W$ subject to the constraint of the production function, $X \le H(Z)$, the differential equation (10), and the definition $u_n = u(x_n, y_n)$. Those who are familiar with the Pontryagin Maximum Principle will see that this is a form of problem fairly suitable for treatment by it. Shadow prices $p$ and $w$ have to be introduced for $X$ and $Z$. Then we would like to maximize

$$W - pX + wZ = \int_0^\infty [G(u_n) - p x_n + w y_n n] f(n)\,dn \tag{11}$$

subject to (10). $u_n$ is to be regarded as the state variable, $y_n$ (say) as the control variable, while $x_n$ is determined as a function of $u_n$ and $y_n$ from the equation $u_n = u(x_n, y_n)$. The Hamiltonian is

$$M = [G(u_n) - p x_n + w y_n n] f(n) - \phi_n \frac{y_n u_2}{n},$$

where $\phi_n$ is a function of $n$ satisfying the differential equation

$$\frac{d\phi_n}{dn} = -\frac{\partial M}{\partial u} = -\left[G'(u_n) - \frac{p}{u_1}\right] f(n) + \phi_n \frac{y_n u_{12}}{n u_1}. \tag{12}$$

$y_n$ should then be chosen so as to maximize $M$:

$$\left[wn + p\frac{u_2}{u_1}\right] f(n) + \phi_n \frac{\psi_y}{n} = 0, \tag{13}$$

where the function $\psi(u, y)$ is defined by

$$\psi(u, y) = -y u_2(x, y), \qquad u = u(x, y), \tag{14}$$

and $\psi_y$ is its partial derivative with respect to $y$. (Notice, at the same time, that $\psi_u = -y u_{12}/u_1$.)
Equation (12) can now be integrated to obtain an expression for $\phi_n$; which, when substituted in (13), provides us with an equation to be satisfied by the optimum we seek. Before going on to use this equation, however, we shall derive it in a different way, by a more explicit use of the methods of the calculus of variations. The use of the Maximum Principle has a number of serious disadvantages. It does not show us how to obtain certain important supplementary conditions on the optimum. The analysis provides no hint as to how it could be made rigorous. It does not provide any insight into the kind of maximization that is going on. When we have done a more explicit variational analysis, we shall be better able to see where the logical holes are, and to understand why things come out the way they do.

For this purpose, I prefer to write (10) in integrated form:

$$u_n = -\int_0^n y_m u_2(x_m, y_m)\,\frac{dm}{m} + u(c(0), 0) = \int_0^n \psi(u_m, y_m)\,\frac{dm}{m} + u_0, \tag{15}$$

using the notation $\psi$ introduced above, and denoting the utility allowed to a man who does no work by $u_0$.

Suppose first that $\psi$ is independent of $u$ (corresponding to the special case $u_{12} = 0$). If we consider a variation from the optimum which changes the functions $u_n$ and $y_n$ by "small" variations $\delta u_n$ and $\delta y_n$, we deduce from (15) that these variations must be related by

$$\delta u_n = \int_0^n \psi_y\,\delta y_m\,\frac{dm}{m} + \delta u_0. \tag{16}$$

This variation will bring about changes in $W$, $X$, and $Z$. As before, introduce shadow prices (in terms of welfare) for $X$ and $Z$. Then the variation must leave (11) stationary:

$$0 = \delta \int_0^\infty [G(u_n) - p x_n + w y_n n] f(n)\,dn = \int_0^\infty \left\{ G'(u_n)\,\delta u_n - p\left(\frac{\delta u_n}{u_1} - \frac{u_2}{u_1}\,\delta y_n\right) + w n\,\delta y_n \right\} f(n)\,dn, \tag{17}$$

where the variation in $x$ is calculated as follows:

$$\delta u_n = \delta u(x_n, y_n) = u_1\,\delta x_n + u_2\,\delta y_n. \tag{18}$$

It remains to substitute (16) in (17), yielding

$$0 = \int_0^\infty \left\{ \left[G'(u_n) - \frac{p}{u_1}\right]\left[\int_0^n \psi_y\,\delta y_m\,\frac{dm}{m} + \delta u_0\right] + \left[wn + p\frac{u_2}{u_1}\right]\delta y_n \right\} f(n)\,dn$$

$$\phantom{0} = \int_0^\infty \left\{ \frac{\psi_y}{n}\int_n^\infty \left[G'(u_m) - \frac{p}{u_1}\right] f(m)\,dm + \left(wn + p\frac{u_2}{u_1}\right) f(n) \right\}\delta y_n\,dn + \int_0^\infty \left[G'(u_n) - \frac{p}{u_1}\right] f(n)\,dn \cdot \delta u_0. \tag{19}$$

The second equation is obtained by inverting the order of integration in the double integral.¹
(19) is to be satisfied for all possible variations of the function $y_n$, and the number $u_0$. Since $u_0$ can be either increased or decreased at the optimum (if, as is to be expected in general, some people will do no work at the optimum),

$$\int_0^\infty \left[G'(u_n) - \frac{p}{u_1}\right] f(n)\,dn = 0 \tag{20}$$

at the optimum.

¹ The double integral is

$$\int_0^\infty \int_0^n \left[G'(u_n) - \frac{p}{u_1}\right] f(n)\,\psi_y\,\delta y_m\,\frac{dm}{m}\,dn.$$

The region over which the integration takes place is defined by $0 \le m \le n$. Thus, when the order of integration is inverted, $n$ ranges between $m$ and $\infty$ for given $m$. The integral can therefore be written

$$\int_0^\infty \frac{\psi_y\,\delta y_m}{m} \int_m^\infty \left[G' - \frac{p}{u_1}\right] f(n)\,dn\,dm,$$

which is seen to justify (19) on permuting the symbols $m$ and $n$.

If all variations in $y_n$ were possible (and this is a question we shall take up shortly) we could also claim that the expression within curly brackets ought to be zero:

$$\left(wn + p\frac{u_2}{u_1}\right) f(n) = \frac{\psi_y}{n}\int_n^\infty \left[\frac{p}{u_1} - G'(u_m)\right] f(m)\,dm. \tag{21}$$

It should be noticed that this equation will only be valid for $n \ge n_0$: it does not apply to $n$ for which $y_n = 0$ (except $n_0$) because, there, not all variations of the function $y_n$ are possible, since $y_n$ cannot be negative.

Finally we know that the marginal product of labour should be equal to the shadow wage:

$$p H'(Z) = w. \tag{22}$$

These equations, (20) and (21), have been worked out under the special assumption that $\psi$ is independent of $u$. In the more general case, we have to replace (16) by

$$\delta u_n = \int_0^n T_{mn}\,\psi_y\,\delta y_m\,\frac{dm}{m} + T_{0n}\,\delta u_0, \tag{23}$$

where

$$T_{mn} = \exp\left[\int_m^n \psi_u(u_{m'}, y_{m'})\,\frac{dm'}{m'}\right]. \tag{24}$$

To show this, we can go back to the differential equation (10). Applying the variation, we obtain from it,

$$\frac{d}{dn}\,\delta u_n = \frac{1}{n}\psi_u\,\delta u_n + \frac{1}{n}\psi_y\,\delta y_n. \tag{25}$$

This is a first order linear equation, and can therefore be solved by the standard method to give the solution (23).

Having replaced (16) by (23), we can now go through the rest of the calculation as before. We find that (20) is generalized into

$$\int_0^\infty \left[G'(u_n) - \frac{p}{u_1}\right] T_{0n} f(n)\,dn = 0; \tag{26}$$

while (21) becomes

$$\left(wn + p\frac{u_2}{u_1}\right) f(n) = \frac{\psi_y}{n}\int_n^\infty \left[\frac{p}{u_1} - G'(u_m)\right] T_{nm} f(m)\,dm. \tag{27}$$

Notice that we have $T_{nm}$ here, although it was $T_{mn}$ that appeared in (23).
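The "standard method" invoked for the first order linear equation (25) is presumably the integrating factor. A brief sketch of that step follows, using the shorthand $a(n) = \psi_u/n$ and $b(n) = \psi_y\,\delta y_n/n$; both symbols are introduced here only for this illustration, and the integral of $a$ from 0 is assumed to converge.

$$\frac{d}{dn}\,\delta u_n = a(n)\,\delta u_n + b(n)
\;\;\Longrightarrow\;\;
\frac{d}{dn}\left[e^{-\int_0^n a(s)\,ds}\,\delta u_n\right] = e^{-\int_0^n a(s)\,ds}\,b(n),$$

$$\delta u_n = \int_0^n e^{\int_m^n a(s)\,ds}\,b(m)\,dm + e^{\int_0^n a(s)\,ds}\,\delta u_0 .$$

Identifying $e^{\int_m^n a(s)\,ds}$ with the factor $T_{mn}$ of (24) gives the form (23). When $\psi_u = 0$, the factor is identically 1 and the expression collapses to (16).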
If these equations are correct, the two integral equations, (15) and (27), may be thought of as determining the two functions $u_n$ and $y_n$, given the three parameters $u_0$, $w$, and $p$. The values of these parameters are fixed by the three equations (26), (22), and (8). We have enough relations to determine the optimum tax schedule, since the function $c$ can be determined once we know $u_n$ and $y_n$.

4. NECESSARY CONDITIONS: A COMPLETE STATEMENT

The argument used to derive these conditions for the optimum tax schedule had a number of weak points. It is indeed unlikely that the relationships derived above hold in general. Among the weak points of the argument, notice that

(i) the existence of the shadow prices $p$ and $w$ was assumed without proof;

(ii) the optimum tax schedule, and the resulting functions $x_n$, $y_n$, and $u_n$ were assumed to be differentiable;

(iii) the application of the variation was quite heuristic; and

(iv) no justification was provided for assuming that the function $y_n$ could be varied arbitrarily (for $n > n_0$).

I shall not comment on (i) and (iii), which, though important, are technical matters: they can be justified. (ii) is not satisfied in general: there was no reason to suppose that it would be. When (ii) is not satisfied, the first-order condition, (9), for maximization of utility ceases to be meaningful. Finally, (iv) is never justified. The function $y_n$ is derived from the imposition of the consumption function $c$, and we have no a priori information about it. We must expect that some conceivable functions $y_n$ can never arise from the imposition of a consumption function. The class of possible $y$-functions is no doubt quite complicated in certain cases. Fortunately it is possible to specify that class quite simply in the realistic cases, and it is then possible to use the variational argument rigorously.

Problem (ii) is dealt with in the rigorous analysis by depending on equation (15) instead of the differential first-order condition (9).
It is a remarkable fact that this condition holds if and only if the various functions arise from utility-maximization under an imposed consumption function, even when that function is not differentiable. For proof, the reader is referred to [4].

To deal with problem (iv), we have to restrict the class of utility functions considered. We assume that

(B) $V(x, y) = -y u_2/u_1$ is an increasing function of $y$ for each $x > 0$ (and bounded in $0 < x \le \bar{x}$, $0 \le y \le \bar{y}$ for any $\bar{x} < \infty$ and $\bar{y} < 1$).

It will be noticed that this is an assumption about preferences, not just about the form of the utility function used to represent preferences. The second part of the assumption is readily acceptable. The first, and main part of the assumption holds if and only if, for a given level of consumption $x$, a one per cent increase in the amount of work done requires a larger increase in consumption to maintain the same utility level, the greater is the amount of work being done. It is equivalent to assuming that (in the absence of taxation) the consumer's demand for goods is an increasing function of the real wage rate (at any given non-wage income).¹ Few individuals appear to have preferences violating (B), and intuitively it is rather plausible. We shall later use the fact that (B) holds if preferences can be represented by an additive utility function. (It will be noticed that, as $y \to 1$, $V \to +\infty$, so that the assumption must hold for some ranges of $y$.) If the assumption does not hold, the theory of optimum taxation is more complicated. The point of the assumption is indicated in

Theorem 1. Under Assumption (B), $z_n = n y_n$ maximizes utility for every $n$ under some consumption function $c$ if and only if

(i) $z_n$ is a non-decreasing function defined for $n > 0$;

(ii) $0 \le z_n < n$ for all $n > 0$.

¹ This equivalence is fairly obvious from an indifference curve diagram. For a formal proof that (B) implies that consumption is an increasing function of the wage rate, let $w$ be the wage rate, and $m$ non-labour income (both measured in terms of goods).
(B) states that $wy$, regarded as a function of $x$ and $y$, is an increasing function of $y$. Write $x$ and $y$ as functions of $w$ and $m$, putting $x = x(w, m)$, $y = y(w, m)$ and $x' = x(w', m)$, $y' = y(w', m)$ where $w' > w$. I shall show that $x' > x$. To do this, choose $w''$ and $m''$ such that $x'' = x(w'', m'') = x$, and $y'' = y(w'', m'') = \frac{w}{w'}\,y$. Since $x'' - w'y'' = m$, $(x', y')$ is preferred to $(x'', y'')$; and therefore

$$x' - x > w''(y' - y'') = \frac{w''}{w'}(w'y' - w'y'') = \frac{w''}{w'}(w'y' - wy) = \frac{w''}{w'}(x' - x),$$

since $x' - w'y' = m = x - wy$. This implies, since our assumption gives $w'' < w'$, that $x' > x$. The converse proposition can be proved by reversing the steps.

For a rigorous proof of this theorem, the reader is referred to [4]. For a heuristic justification, suppose that $z_n$ is differentiable, and that $c$ is twice differentiable. The first order condition, (9), can be written

$$\frac{\partial}{\partial z}\,u(c(z), z/n) = \frac{u_1}{z}\,[z c'(z) - V(c(z), z/n)] = 0. \tag{28}$$

Furthermore, we have the second-order condition, that the derivative is non-increasing at $z_n$. Since it is zero there, this is also true when we drop the positive factor $u_1/z$. In other words,

$$\frac{\partial}{\partial z}\,[z c'(z) - V(c(z), z/n)] \le 0, \quad \text{at } z = z_n. \tag{29}$$

Now differentiate the equation $z_n c'(z_n) - V(c(z_n), z_n/n) = 0$ with respect to $n$:

$$\frac{\partial}{\partial z}\,[z c' - V]\Big|_{z = z_n} \cdot \frac{dz_n}{dn} = -V_2(c(z_n), z_n/n)\,\frac{z_n}{n^2}. \tag{30}$$

It follows from (29) and assumption (B) that

$$\frac{dz_n}{dn} > 0 \tag{31}$$

unless $z_n = 0$. In fact $z_n$ is strictly increasing when $n > n_0$ and $c$ is differentiable; a corner in $c$ causes $z_n$ to be constant for a range of values of $n$. (An indifference curve diagram makes this clear.) Condition (ii) of the theorem clearly has to be satisfied by the utility maximizing choice.

To prove that a suitable consumption function exists for a given $z$-function satisfying the two conditions, one defines $c$ by the first-order condition (28). (30) then shows (nearly) that the second-order condition for a maximum is satisfied. This does not yet prove global maximization of utility, but that also is true.
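The two conditions of Theorem 1 can be illustrated numerically. The sketch below assumes the utility function $u = \log x + \log(1 - y)$, for which $V = xy/(1 - y)$ is increasing in $y$ so that (B) holds, and a hypothetical linear consumption function $c(z) = a + bz$. The functional forms, the values of `a` and `b`, and the grid size are all assumptions made for this example. Each $n$-man's labour supply $z_n = n y_n$ is found by brute-force search over a grid of working times.

```python
import math

def utility(x, y):
    """Assumed example utility: log consumption plus log leisure."""
    return math.log(x) + math.log(1.0 - y)

def best_labour(n, a=0.2, b=0.7, grid=2000):
    """Labour z = n*y chosen by an n-man facing the hypothetical schedule c(z) = a + b*z."""
    best_y, best_u = 0.0, utility(a, 0.0)      # not working is always available
    for i in range(1, grid):
        y = i / grid                           # working time on a grid in (0, 1)
        value = utility(a + b * n * y, y)
        if value > best_u:                     # keep the smallest maximizer on ties
            best_y, best_u = y, value
    return n * best_y

skills = [0.05 * k for k in range(1, 41)]      # skill levels n from 0.05 to 2.0
labour = [best_labour(n) for n in skills]
```

For these particular assumptions the interior solution is available in closed form, $z_n = n/2 - a/(2b)$ whenever this is positive, so the least skilled (here $n \le a/b$) do no work, which also illustrates the threshold $n_0$ of Proposition 1.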
It should be noticed that, as a corollary of Theorem 1, condition (A) holds when condition (B) holds, for $z_n$ is shown to be non-decreasing even if it is a correspondence. It therefore takes a single value for all but a countable set of values of $n$. A fortiori, condition (A) is satisfied in this case.

Theorem 1 at once implies that $z_n$ and therefore also $x_n$ are non-decreasing functions when the optimum tax schedule is imposed. Furthermore, it shows us quite straightforwardly what changes in the function $y_n$ we are allowed to contemplate when applying the variational argument that allowable small changes should make only a second-order difference to the maximand. The rigorous argument is still complicated, in part because one has to allow for the possibility that $z_n$ is constant over some intervals, and discontinuous at some values of $n$. The full statement of the result, which is proved in [4], is as follows:

Theorem 2. If preferences satisfy assumption (B) and $(u_n, x_n, y_n)$ arise from optimum income taxation, then

(i) $z_n = n y_n$ is a non-decreasing function of $n$;

(ii)
$$u_n = u_0 - \int_0^n [y_m u_2(x_m, y_m)/m]\,dm \quad (n \ge 0); \tag{32}$$

(iii) at all points of increase of $z_n$ (i.e., where $z_n > z_{n'}$ for all $n' < n$, or $z_n < z_{n'}$ for all $n' > n$)

$$A_n \equiv \left[w + \frac{u_2^{(n)}}{n u_1^{(n)}}\right] f(n) - \frac{\psi_y^{(n)}}{n^2}\int_n^\infty \left[\frac{1}{u_1^{(m)}} - \lambda G'(u_m)\right] T_{nm} f(m)\,dm = 0, \tag{33}$$

where superscripts "$(n)$", etc. indicate that the function is evaluated at $n$-man (etc.)'s utility-maximizing choice, and

$$\psi_y^{(n)} = -u_2^{(n)} - y_n u_{22}^{(n)} + y_n u_2^{(n)} u_{12}^{(n)}/u_1^{(n)}, \qquad T_{nm} = \exp\left[-\int_n^m \frac{y_{m'} u_{12}(x_{m'}, y_{m'})}{u_1(x_{m'}, y_{m'})}\,\frac{dm'}{m'}\right]; \tag{34}$$

(iv) If $n \in [n_1, n_2]$, where $z$ is constant on $[n_1, n_2]$, and $[n_1, n_2]$ is a maximal interval of constancy for $z$,

$$\int_{n_1}^{n} A_m\,dm \ge 0, \tag{35}$$

$$\int_{n}^{n_2} A_m\,dm \le 0; \tag{36}$$

(v) If $z$ is discontinuous at $n$, $\bar{y}_n$ is defined to be $\lim_{m \to n-} y_m$, $\bar{x}_n$ is defined by $u(\bar{x}_n, \bar{y}_n) = u_n = u(x_n, y_n)$, and $\bar{u}_1$, etc., denote $u_1$ evaluated at $\bar{x}_n, \bar{y}_n$, while $u_1$, etc., denote evaluation at $x_n, y_n$,

$$\frac{(w y_n - x_n/n) - (w \bar{y}_n - \bar{x}_n/n)}{\bar{y}_n \bar{u}_2 - y_n u_2} = \frac{w + u_2/(n u_1)}{\psi_y} = \frac{w + \bar{u}_2/(n \bar{u}_1)}{\bar{\psi}_y}. \tag{37}$$

If $\psi_y$ is a non-decreasing function of $y$ for constant $u$, $z_n$ is continuous for all $n$.
(vi)
$$\int_0^\infty \left[\frac{1}{u_1^{(m)}} - \lambda G'(u_m)\right] T_{0m} f(m)\,dm = 0, \tag{38}$$

(vii)
$$X = H(Z), \tag{39}$$

$$w = H'(Z). \tag{40}$$

It will be noticed that in this statement $w$ is the commodity shadow wage rate ($w/p$ in the earlier notation), while $\lambda$ ($1/p$ in the previous notation) is the inverse of the marginal social utility of commodities (national income). The second part of (v) should be particularly noted, since we are quite likely to be willing to assume that $\psi_y$ is a non-decreasing function of $y$, and it is a great advantage not to have to worry about possible discontinuities in $z_n$. It does not seem possible, unfortunately, to delimit a class of cases in which one can be sure that $[0, n_0]$ will be the only interval of constancy for $z$. It should be mentioned that, when $\psi_y$ is not non-decreasing, and the equations (37) may possibly apply, the conditions of Theorem 1 may define more than one candidate for optimality, and then only direct comparison of the welfare generated by the alternative paths so defined will solve the problem.

5. INTERPRETATION

If $n$ is not in an interval of constancy for $z$, and $c(\cdot)$ is therefore a differentiable function at $z_n$, the first-order condition (9) applies. It can be written

$$-u_2/(n u_1) = c'(z). \tag{41}$$

If we denote the marginal tax rate, $\frac{d}{d(wz)}[wz - c(z)]$, by $\theta$, we have

$$w\theta = \frac{d}{dz}[wz - c(z)] = w + \frac{u_2}{n u_1} = \frac{\psi_y}{n^2 f(n)}\int_n^\infty \left[\frac{1}{u_1} - \lambda G'\right] T_{nm} f(m)\,dm, \tag{42}$$

by (33). (42) suggests the considerations that should influence the magnitude of the marginal tax rate. First, it can tell us something about the sign of $\theta$: we already know that $\theta$ will not be greater than 1, but we were not previously able to say anything about its sign. Of course, we expect that it will not usually be negative. Using (42) and the conditions in Theorem 1, we can establish this rigorously.

Note first that $1 - \lambda G' u_1$ is a non-decreasing function of $n$, since $x_n$ is a non-decreasing function of $n$, and $\partial G/\partial x = G' u_1$ a decreasing function of $x$. If $1 - \lambda G' u_1$ were always positive or always negative, Equation (38) could not be satisfied.
Therefore

$$\int_n^\infty \frac{1}{u_1}\,(1 - \lambda G' u_1)\,T_{nm} f(m)\,dm$$

is increasing in $n$ for $n$ less than some $\bar{n}$, and decreasing for $n > \bar{n}$; but in any case positive for $n > \bar{n}$. (Here we use the properties $u_1 > 0$, $T_{nm} > 0$.) Since the integral is zero when $n = 0$, it is non-negative for all $n$. Consequently the marginal tax rate is non-negative at all points of increase of $z$.

If $n$ is not a point of increase of $z$, $c$ is not differentiable at $z_n$. It is easily seen that, if $[n_1, n_2]$ is a maximal interval of constancy of $z$, $-u_2/(n u_1)$ is equal to the left derivative of $c$ at $n_1$, and the right derivative at $n_2$. Thus both the "right" and "left" marginal tax rates are non-negative in this case. Summarizing:

Proposition 3.¹ If assumption (B) is satisfied, $wz - c(z)$ (the "tax function") is a non-decreasing function for all $z$ that actually occur (and may therefore be taken to be a non-decreasing function for all $z$).

Having established that the integral in Equation (42) is non-negative for all $n$, we can see that the marginal tax rate will be greater if there are relatively few $n$-men than otherwise; or if the utility-value of work, $-y u_y$, is more sensitive to work done (utility being held constant); or if $n$ is closer to $\bar{n}$, the value of $n$ at which $1 = \lambda G' u_1$ (and the integral is therefore a maximum). If $f$ is a single-peaked distribution, the first consideration suggests that marginal tax rates should be greatest for the richest and the poorest; but the last consideration tells the other way.

In any case, it is important to note that $n_0$, the largest $n$ for which $y_n = 0$, may be quite large: if the number who do not work in the optimum regime is large, the marginal tax rate may not be high at zero income. Explicitly, we can rewrite Equation (38) in the form

$$\left[\frac{1}{u_1(x_0, 0)} - \lambda G'(u_0)\right] F(n_0) + \int_{n_0}^\infty \left[\frac{1}{u_1} - \lambda G'\right] T_{n_0 m} f(m)\,dm = 0 \tag{43}$$

which, when combined with Equation (33) (for $n = n_0$) gives

$$w + \frac{u_2(x_0, 0)}{n_0 u_1(x_0, 0)} = \frac{\psi_y(u_0, 0)\,F(n_0)}{n_0^2 f(n_0)}\left[\lambda G'(u_0) - \frac{1}{u_1(x_0, 0)}\right]. \tag{44}$$

Unfortunately, one cannot get much information from these "local" conditions, at least for small $n$.
For any detail, and in particular for numerical results, one must examine the whole system of equations. It is easier to do that for particular examples of the general problem, and that is what we shall do in succeeding sections. It may be noted, however, that Equation (44) does provide us with some information about $n_0$ and $x_0$. For example, it is clear that $n_0$ can be zero only if $F/(nf)$ tends to 0 as $n$ tends to 0; indeed, since the left hand side of Equation (44) is bounded, $n_0 = 0$ only if $x_0 = 0$, and therefore $1/u_1 = 0$. It follows that $n_0 = 0$ only if $F/(n^2 f)$ is bounded as $n \to 0$, which means that $F$ tends to zero faster than $\exp(-1/n)$. This excludes the cases usually considered by economists. We may conclude at this stage that it will be optimal, in the most interesting cases, to encourage some of the population to be idle.

¹ The analysis and result can be generalized to the utility function $u(x, z, n)$ where the parameter $n$ can indicate variations in tastes as well as skill. The extension is fairly routine and will not be discussed here.

A number of conclusions have been obtained, but they are fairly weak: the marginal tax rate lies between zero and one; in a large class of cases, consumption and labour supply vary continuously with the skill of the individual; there will usually be a group of people who ought to work only if they enjoy it. The main feature of the results is that the optimum tax schedule depends upon the distribution of skills within the population, and the labour-consumption preferences of the population, in such a complicated way that it is not possible to say in general whether marginal tax rates should be higher for high-income, low-income, or intermediate-income groups. The two integral equations that characterise the optimum tax schedule are, however, of a reasonably manageable form. One expects to be able to calculate the schedule in particular cases without great difficulty.
In the next sections of the paper, we shall show how this can be done in certain special cases, and obtain further properties of the optimum tax in these cases.

6. ADDITIVE UTILITY

An interesting case arises when, for all x and y,
$$u_{12} = 0.$$ ...(45)
Thus $u_1$ depends only on x, and $u_2$ only on y.

Proposition 4. If assumption (45) is satisfied, V(x, y) is an increasing function of y, bounded for small x and y.

Proof. $V = -yu_2(y)/u_1(x)$, and $V_2 = (-u_2 - yu_{22})/u_1 > 0$. Boundedness is obvious. ||

Corollary. Under assumption (45), Theorem 1 applies.

In particular we know, from statement (v) of that Theorem, that $y_n$ is continuous provided that $\psi_y$ is non-decreasing. In the present case, this condition is equivalent to the requirement that
$$-yu_2(y) \text{ is convex.}$$ ...(46)
There is no reason why this assumption should hold in general, but it is easily checked for any particular case. We shall now restrict attention to cases for which (46) holds.¹ If we restrict attention also to cases where z is strictly increasing when $n > n_0$, the optimum situation will be a solution of the equations
$$w + \frac{u_2}{nu_1} = \frac{\psi_y}{n^2 f(n)} \int_n^\infty \left(\frac{1}{u_1} - \lambda G'\right) f(m)\,dm,$$ ...(47)
$$u_n = u_0 - \int_{n_0}^n \frac{y_m u_2}{m}\,dm.$$ ...(48)
We shall further assume that f is continuously differentiable. Since $x_n$, $y_n$ are continuous in this case, it follows that $u_n$ and $(w + u_2/(nu_1))/\psi_y$ are differentiable functions of n. Write
$$v = \frac{w + u_2/(nu_1)}{\psi_y}.$$ ...(49)

¹ In [4] a theorem is proved which states that the conditions of Theorem 2 are in fact sufficient (as well as necessary) for an optimum in the special case now being considered.

u and v are continuously differentiable functions of x and y. Since $\partial u/\partial x > 0$, $\partial u/\partial y < 0$, and, as can easily be seen, $\partial v/\partial x < 0$, $\partial v/\partial y < 0$, the Jacobian $\partial(u, v)/\partial(x, y)$ is always negative. Consequently x and y can be expressed as continuously differentiable functions of u and v, and are therefore themselves differentiable functions of n. We can now write Equations (47) and (48) as differential equations:
$$\frac{dv}{dn} = -\frac{v}{n}\left(2 + \frac{nf'}{f}\right) - \frac{1}{n^2 u_1} + \frac{\lambda G'}{n^2},$$ ...(50)
$$\frac{du}{dn} = -\frac{yu_2}{n},$$ ...(51)
which, as we have just shown, can be thought of as equations in u and v. The particular solution we seek, and the particular value of $\lambda$, are defined by the boundary conditions, Equations (39), (40),
$$v_{n_0} = \frac{F(n_0)}{n_0^2 f(n_0)} \left[\lambda G'(u_{n_0}) - \frac{1}{u_1(x_{n_0})}\right]$$ ...(52)
which is the form (38) takes here, and
$$v_n n^2 f(n) \to 0 \quad (n \to \infty),$$ ...(53)
which is apparent from Equation (47). Provided that $z_n$ is strictly increasing for $n > n_0$, a solution that satisfies all those conditions will, by Theorem 2 of [4], provide the optimum.

Equations (39) and (40), the production function and the marginal productivity equation, may be ignored in the calculations. Corresponding to the particular values of w and $\lambda$ used in the calculation, one obtains values for X and Z. Thus we know the optimum tax schedule when the marginal product is w and the average product is X/Z. In this way one could obtain a range of tax schedules corresponding to different average products and marginal products, which is what one wants. Of course, it is desirable to choose $\lambda$ so that the average product will be related to the marginal product, w, in a reasonable way. This should not present any great difficulty.

To determine the sign of
$$\frac{dz}{dn} = y_n + n\frac{dy}{dn}$$
we calculate, from Equation (49),
$$\psi_y \frac{dv}{dn} = \left(\frac{u_{22}}{nu_1} - v\psi_{yy}\right)\frac{dy}{dn} - \frac{u_2 u_{11}}{nu_1^2}\frac{dx}{dn} - \frac{u_2}{n^2 u_1} = \frac{1}{n}\left(\frac{u_{22}}{nu_1} - v\psi_{yy} + \frac{u_2^2 u_{11}}{nu_1^3}\right)\frac{dz}{dn} - \frac{y}{n}\left(\frac{u_{22}}{nu_1} - v\psi_{yy}\right) - \frac{u_2}{n^2 u_1},$$ ...(54)
substituting from (51). Therefore, using (50),
$$\left[\frac{u_{22}}{nu_1} - v\psi_{yy} + \frac{u_2^2 u_{11}}{nu_1^3}\right]\frac{dz}{dn} = \frac{yu_{22}}{nu_1} - yv\psi_{yy} + \frac{u_2}{nu_1} + n\psi_y\frac{dv}{dn} = -\psi_y\left[v\left(2 + \frac{nf'}{f} + \frac{y\psi_{yy}}{\psi_y}\right) + \frac{2}{nu_1} - \frac{\lambda G'}{n}\right].$$ ...(55)
We may therefore check the assumption $dz/dn > 0$ by examining the solution to see whether
$$v\left(2 + \frac{nf'}{f} + \frac{y\psi_{yy}}{\psi_y}\right) + \frac{2}{nu_1} - \frac{\lambda G'}{n} > 0.$$ ...(56)
Equation (56) is equivalent to $dz/dn > 0$ because the expression in square brackets in Equation (55) is negative, term by term.

In computation, one can proceed as follows:
[1] A value of $\lambda$ is chosen. To get the right order of magnitude, one can calculate $\int (1/u_1) f\,dn \big/ \int G' f\,dn$ (cf. (38)) for some particular feasible, and a priori plausible, allocation of consumption and labour.
[2] A trial value of $n_0 > 0$ is chosen. (It should be borne in mind that the inequality $v_{n_0} > 0$ may, with (52), restrict the range of possible $n_0$.)
[3] Bearing in mind that $y_{n_0} = 0$, the values of $v_{n_0}$ and $u_{n_0}$ are obtained from (49) and (52).
[4] The solution of equations (50) and (51) is calculated for increasing n until either (56) fails to be satisfied, or it becomes apparent that (53) will not be satisfied (see [6] below).
[5] If (56) fails to be satisfied, $z_n$ is kept constant, $u_n$ (and $v_n$) being calculated from (49) until (56) is satisfied again, when $z_n$ is allowed to increase and the solution pursued as in [4].
[6] The attempted solution should be stopped if $u_n$ or $x_n$ begins to decrease, or $v_n$ or $y_n$ fall to zero, or $x_n$, $y_n$ cannot be calculated (e.g. because $u_n$ exceeds the upper bound of u, if there is one). Other stopping rules can be given for particular examples, depending on the structure of the solutions of the equations.
[7] A range of trial values of $n_0$ must be used to find the one that most nearly provides a solution satisfying (53). Efficient rules for iteration might be obtained in particular cases.

7. FEATURES OF SOLUTIONS

Solutions may, for all I know, be very diverse in their characteristics; but examination of the equations suggests a number of comments. First we note that $v_n$ will always lie between 0 and $1/\psi_y(0)$, since
$$0 < \frac{1 + u_2/(nu_1)}{\psi_y} \le \frac{1 + u_2/(nu_1)}{\psi_y(0)} < \frac{1}{\psi_y(0)}.$$ ...(57)
We are therefore led to expect that v tends to a limit as $n \to \infty$. (It might cycle for certain forms of f, of a kind one would perhaps be unlikely to use.) y is also bounded, by 0 and 1, and is therefore likely to tend to a limit. One is then led to certain conjectures about the limits, which ought to hold for sufficiently regular f and u.

Let
$$-\frac{nf'}{f} \to \gamma + 2 < \infty.$$ ...(58)
(Since $\int_0^\infty nf\,dn < \infty$, $\gamma > 0$: otherwise $n^2 f$ is increasing for large n, therefore bounded below.) Further, suppose
$$u_1 \sim \alpha x^{-\mu}$$ ...(59)
as $x \to \infty$.
Then there appear to be three cases, in each of which one expects the following results to hold.

(i) $\mu < 1$. As $n \to \infty$,
$$y_n \to 1$$ ...(60)
and
$$v \to 0.$$ ...(61)
The marginal tax rate,
$$\theta \to 1.$$ ...(62)

(ii) $\mu = 1$. As $n \to \infty$,
$$y_n \to \bar y,$$ ...(63)
where $\bar y$ is defined (uniquely) by
$$\bar y u_2(\bar y) = -\alpha,$$ ...(64)
and
$$v_n \to [-(1 + \gamma)u_2(\bar y) - \bar y u_{22}(\bar y)]^{-1}.$$ ...(65)
Furthermore,
$$\theta \to \frac{1 + \nu}{1 + \gamma + \nu},$$ ...(66)
where
$$\nu = \frac{\bar y u_{22}(\bar y)}{u_2(\bar y)}.$$ ...(67)

(iii) $\mu > 1$. As $n \to \infty$,
$$y \to 0$$ ...(68)
and
$$v \to [-(1 + \gamma)u_2(0)]^{-1},$$ ...(69)
$$\theta \to \frac{1}{1 + \gamma}.$$ ...(70)
(It may be noted that, in a natural sense, (66) holds for all cases.)

Before indicating the reasons for these conjectures, a few words of interpretation may be in place. On the whole, the distribution of income from employment appears to be of Paretian form at the upper tail¹: Equation (58) holds with $\gamma$ between 1 and 2, roughly speaking. It is not improbable, however, that marginal productivity per working year is distributed differently from actual incomes: the lognormal distribution is the most plausible simple distribution. For this, $\gamma = \infty$, and
$$\frac{nf'}{f} \sim -\frac{\log n}{\sigma^2}$$ ...(71)
for large n; ($\sigma^2$ is the variance of the distribution of logarithm of incomes).

¹ See the general assessment by Lydall [3].

The realism of alternative assumptions about utility may be assessed by calculating the response of the consumer to a linear budget constraint, x = wy + a. It is easy to see that utility-maximization requires (since $u_{12} = 0$)
$$\frac{u_1(x)}{u_2(y)} = -\frac{1}{w}, \quad x = wy + a.$$ ...(72)
If $u_1 = \alpha x^{-\mu}$, we have to solve
$$\alpha w = -(a + wy)^{\mu} u_2(y).$$ ...(73)
(If $\alpha w < -a^{\mu} u_2(0)$, y = 0.) Clearly the solution has the following properties:
$$y \to 1 \text{ as } w \to \infty \text{ if } \mu < 1, \qquad y \to 0 \text{ as } w \to \infty \text{ if } \mu > 1.$$ ...(74)
(Cf. (61) and (68).) Also
$$x \sim a + w \quad (\mu < 1), \qquad x \sim \left(\frac{\alpha w}{-u_2(0)}\right)^{1/\mu} \quad (\mu > 1).$$ ...(75)
These asymptotic properties suggest that the case $\mu = 1$ is particularly interesting. When $\mu = 1$, since, by (73),
$$\frac{a}{w} = -\frac{\alpha}{u_2} - y,$$
$-yu_2 \to \alpha$ as $w \to \infty$; i.e.
$$y \to \bar y,$$ ...(76)
where $\bar y$ is defined by (64). (Cf. (63).) If in addition,
$$u_2(y) = -(1 - y)^{-\delta} \quad (\delta > 0),$$ ...(77)
we have
$$\bar y (1 - \bar y)^{-\delta} = \alpha, \qquad \nu = \frac{\delta \bar y}{1 - \bar y}.$$
The choice of $\alpha$ may be influenced by considering that y = 0 when $w/a < 1/\alpha$.
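The conjectured limits for case (ii) can be evaluated numerically. The sketch below assumes the utility family (77), $u_2(y) = -(1-y)^{-\delta}$, so that $\bar y$ solves $\bar y(1-\bar y)^{-\delta} = \alpha$, $\nu = \delta\bar y/(1-\bar y)$, and the limiting marginal tax rate is $(1+\nu)/(1+\gamma+\nu)$ as in (66). The function names and the bisection tolerance are illustrative choices and do not come from the paper.

```python
def y_bar(alpha, delta, tol=1e-12):
    """Solve y * (1 - y)**(-delta) = alpha for y in (0, 1) by bisection.

    The left-hand side rises monotonically from 0 to infinity on (0, 1),
    so the root is unique (hypothetical helper, not from the paper).
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * (1.0 - mid) ** (-delta) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def asymptotic_marginal_rate(alpha, delta, gamma):
    """Conjectured limit (1 + nu) / (1 + gamma + nu) of the marginal tax rate."""
    y = y_bar(alpha, delta)
    nu = delta * y / (1.0 - y)
    return (1.0 + nu) / (1.0 + gamma + nu)


if __name__ == "__main__":
    # alpha = 2, delta = 1, Paretian tail with gamma = 2
    print(round(y_bar(2.0, 1.0), 4))
    print(round(asymptotic_marginal_rate(2.0, 1.0, 2.0), 4))
```

With $\alpha = 2$, $\delta = 1$ and $\gamma = 2$ this gives $\bar y = 2/3$ and a limiting rate of 0.6. Varying `gamma` shows how strongly the limit depends on the thickness of the upper tail.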
It is interesting to note that, if ac=2, 3 = 1, y =2, v=2, y=2/3, and, if our conjecturesare correct, 0-+60 per cent. Thiscaseis perhapsnot completelyunrealistic;but it shouldbe remembered thatthe homogeneousformfor u meansthat the decisionnot to workdependsonly on the ratio of earned to unearnedincome,whichis not a veryrealisticassumption. It will be noticed that, in this case, the asymptoticmarginaltax rate is very sensitive to the value of ,u(in the neighbourhoodof 1). The reasonsfor the conjecturesEquations(60)-(70)(in fact, I can providea proof of (iii) and will do so below)are as follows. One expectsthat, as n-* oo,the relevantsolution of the differentialequationswill tend towardsa singularityof the equations: not only will y and v tend to limits,but n dy and n dvwill tend to zero. Denote the postulatedlimit of dn dn y,, by y. Considerfirstthe case ul = oax-(4u< 1). MIRRLEES OPTIMUM INCOME TAXATION 191 In this case utilityis unbounded. I shall show that y = 1. If not, u2 and ify tend to finitelimits,and, from (51), we have nu dx Therefore,since u1 dx= o dn dy\ - _u(y+n [ 1 x -Ja x X1 U2(- ...(78) dn I1- y - =-9u2(y log n[1+o(1)]. ...(79) I This impliesthat nu1 = 0[n(log n)f 1i] ...(80) o00. ..(81) Therefore 1 ('x n2f(n)J [i --A jf(m)dm= 1>0, IF 1+ [1 + 2] ...(82) which is readilyseen to be inconsistentwith (80) if the distributionis either Paretianor lognormal. We must thereforeexpect that y = 1. Supposenow that 1+ 2, the marginaltax flu rate, tends to a limit t < 1. Then dx dn _ U2 (y nu, dy\ dn ...(83) +nI->1+1,1 and consequently X ...(84) n This impliesthat -= U1 (1- )0n'1 + o(1)], - ...(85) OC from whichwe can deducethe behaviourof I { oo. In the Paretiancase,f- n-2-y, as n-> U(I- )f(m)dm ...(86) it is easily seen that ...(87) I--(2+y-u)-1>O. Since 1-I = limn-v u2 . u2 J-, and y nu, 1_ d logfu2 tends to -oo as y-*1 (if it tends dy u2 to a limit at all), we musthave U2 -+O,whichis inconsistentwith the assumption < 1. 
In nu, the lognormalcase, one obtains 1-l = lim u2 constant u2 If 'y u2 nu1 log n 1 tendedto a finitelimit, since log n log I u2 # log(1-l) +log (nu,) #(1-ji) log n, Y ...(88) 192 REVIEWOF ECONOMICSTUDIES 1 d log U2 u I would tend to a finite limit as y-+1; which is clearly impossible. log I U21 dy _ Thus in the lognormalcase too, we expect that t = 1. This explainsthe conjecturesin the case u< 1. If,u= 1, dx ocd(log x) = nu, = dn d(log n) yn( dy *dy(89) dn which thereforecannot tend to oo, since in that case uj1 = finiteM, so that 21 f 1 >nM n eventuallyfor any f(m)dmbecomesunboundedas n- oo. < 1 and We can expect,therefore,that y- logx ...(90) Y-yU2(y). log n It is easily seen that the only plausiblevalue of y is that for whichlog x/log n- 1, i.e. Then if 1+ U2 nu1 ... (91) -a. YU2(Y) = -4, we shall have nocx --1 -+ -U2(Y) U-, 1-t and VIy f( m)Pfdm-+ n2f(n) Ju (I)'r -U2(Y) Y) Y' whichsuggeststhat t=frt) G') -U2(Y) Y ...(92) (1 - t)(1+ v)/y, in the notation(57). This is equivalentto (56). In particular,we expectthat I = 0 in the lognormalcase. When u> 1, the utility functionis boundedabove, and a more generaland rigorous treatmentis easy. un is an increasingfunction,and beingnow boundedtendsto a finiteii. We shall write u(x,y) = x(x)+p(y). ...(93) Since x is an increasingfunction,X(x)also tends to a finite limit j. Thus p(y) tends to a limit, and so does y. The limit of y must be zero, since otherwise(32) implies u-+co, whichis now impossible. Now = qy nu, ip nul ... (94) -+ -U2(0) OPTIMUM INCOME TAXATION MIRRLEES in this case (since -, being < , nu, 193 is bounded . Therefore Equation (50) becomes U2 n dv1 dn(Y+1+o(1))v+ +o(l) ~~~~U2(0) ...(95) in the Paretiancase. From (95) one deduces,by the usual methodof solvinga first-order lineardifferentialequation,that 1 _ U2(0)(Y ...(96) + 1) from which it follows at once that the marginaltax rate tends to (y+ 1)-. It is easily checkedthat in the lognormalcase the marginaltax rate tends to zero. 
In the next section,a particularcase is examinedin detail, and providesconfirmation for some of our conjectures. 8. AN EXAMPLE CaseI. Let us, by way of illustration,analyzethe followingcase: u = oclog x+]og(1-y) G(u)= f(n) 1. (lPog = 1exp n+1)2 (log [ J 2 L (The last assumesa lognormaldistributionof skills: the averageof n is 0_= O607...). We put w = 1. With these assumptions,Equations(50) and (51) become V dv dn logn n du dn y n(l-y)' x A n2 n2 where x 1- V= 01+ /V/ nu, ],*= - y) -y)2 1/cc(l n( = (1 Y)( 1y x e and eu= x (1-y). For simplicity,we considerthe case ,B= 0 first,and put 1-y, s= t = log n. The equations become, since u = a log (an) + a log s- + log s, dv = v t+ 1 -s+2e-t, ds = [1-a_ dt -(1 1 (1 + a)s](s2-v)+ocs(vt+2e-t) +a)s2-(1-a)v In the case of 9 = 0, we define G = u. ...(98) (99) REVIEWOF ECONOMICSTUDIES 194 Solutionsof these equationsare depictedin Fig. 1. We now establishtheirproperties. We rememberthat, in the optimumsolution,0< v<s2 (for the marginaltax rate, v/s2, is between0 and 1). Using this fact, we can deducefrom the firstequationthat v-+O (to so). Supposethat, for some t, vt > 1. Then d- v dt >vt+V v-s2 s >vt-1_O,0 'I v~~~~~~~~~~~~~~~~ 2-t~~ 1 5 FIGuVR 1 since v>0, and s < 1. Therefore v is increasing at an increasingrate, contradicting v<s2 < 1. Thisshows that, in fact, 0<v< l/t. ...(100) The two equationstogetherimply that I1-(1 +oc)s I-a2 d '-a _ ...(101) s a (S -V), [S 'a(S2 V)] s dt as one may see if one multipliesthe firstby as, and the secondby [(1 +oc)s2-(1- cx)v],and subtracts. Write - r =sa(s2_v). ... (102) MIRRLEES OPTIMUM INCOME TAXATION 195 so that dr 1-(1 +ex)s r _ dt When s < , ...(103) s r increases; when s> , r decreases. For this reason s cannot tend to a limit otherthan l/( +oc): we shall show more, that s-?1/(1 +oc). (Cf. Fig. 1.) Since v-+0, given s>0, thereexists to such that 0<v,<e for all t _ to. Then 1+a s If rt>(l+o)- a a - 1-a cs a 1+a <r<s rS ...(104) (t_ to). , the right hand inequality implies that 1 St>-~~. 
l+oc Thereforer is decreasing. If 1++a rt<(l +oa)- ... (105) 1-a max[l, (1 + oc) ccJ. a -e . ..(106) we obtain from the left hand inequality(104), St a <(1+x)- a -e{max[1, (l+ O)~ a } <(1O)- ]-St ... (107) a if, either oc? 1 (in which case {...} ? 0 since s _ 1), or a> I and st> . Thus, in fact St <1+cx and, by (98), rt is increasing. Combiningthese two results,we deducethat ...(108) l+cc a -+<(l+o() whichin turnimplies,since v>0, that St-1+ . ...(109) 1+c Our demonstrationthat v and s tend to limits 0 and , respectively,confirmsthe conjecturesfor the specialcase. It is readilycheckedthatexactlythe sameargumentsapply to the case /3>0. As we have noted previously,the marginaltax rate is v/s2. Thus, as t ->oo o0-. ... (1 10) It is a strikingresult; but we shouldnote at once that 0 is a poor approximationto V/S2 even for larget. This becomesapparentwhenwe demonstratethat vt-* 1 I1+o >s > 0 for an unboundedset of values of t. Suppose the contrary,that vtIf Vt> 1+e + s, and t is large enough to imply that st< dv > ... 1 1+ (Ili) REVIEW OF ECONOMICSTUDIES 196 Thus vt continuesgreaterthan 1 +6, and - > is for all largert: but this impliesthat dt 1+0e v-* c, which we have alreadyshown to be false. If on the otherhand vt< l+cx and -?, t is greaterthan 2 and is largeenoughto imply vt< 4 A 2tet + < 4 1 Stx - ~~~~1+0e then d-(vt) = v + t(Vt-S) + Vt + Ate-t s dt -8 I__ 1+ < 1+ 2 + + 1 This impliesthat vt becomesnegative,which is impossible. Therefore vt- I for <6 all largeenought: Vt-* I . (113) I+o Thus o= 2_ v/s=... 14) =..(1 t Only 1 per cent of our populationhave t > 1P7(one in a thousandhave t > 2*4). Since one mightwant to have acas low as 1, the above approximationis clearlyratherbad even at t = 2., 2 How bad will becomeapparentin the next section. CaseII. It is also of interestto examinethe case of a skill-distribution with Paretian tail: nf' f ... (115) y+2,y>? 
The equationsfor the optimumbecome(with,B= 0), dv_ dt - vy(t)+ v -s++ ...(116) et, dtt ds =[I1-oec-(I + O)S](S2-v) + os(vy(t)+ et)..(17 dt (I + Oe)S2 Oe-)V 1 In this example, r2= 1: that is, the standarddeviationof log n is 1. This is done merely for convenience in manipulations. A preciselysimilartheory holds for a generallognormaldistribution. It can be shown, by continuing the methods of the text, that vt- l I -t while s = I +0 ( The fact that the optimum path is tangential to the vertical at (s, v) = (11 , O)implies that s< for large t, since otherwiser would be decreasing,and that, as can be seen from the diagram,is inconsistent Thus we have the situationportrayedin Fig. 1. with dv ?? 2 The case /> 0 can be treatedin a preciselysimilarway, to obtain the same qualitativeresults. MIRRLEES OPTIMUM INCOME TAXATION 197 and, exactlyas before,one has the equation dr _1-(1+ao)s dt s where r = s ( (S2-v). (118) The situation is portrayed in Fig. 2. The broken curves have equations s a(s _v) ri i 2,2 3) v= s ...(119) V~~~~~~~ v 1/ / // S2~~~~~~~~~~~~S ~ ~ ~~~~~ vs +1 FIGuRE 2 with 0 < r1< r2< r3. It will be noted that such a curve,with equation v= s2-rs ac(r constant), ...(120) alwayscuts from below the curve dVI v2= = sv - rs3+ p_ constant) (p r cntant),> ... ...(121) (120) that passesthroughthe samepoint. This follows from the calculation, ds dN1 N _ ds dv2 _ ds \ys+ d vs3 ~rs >0. ...(122) 198 REVIEW OF ECONOMIC STUDIES This remark will prove very useful; but first we want to establish that, for large t, the sign of dt is nearly the same as the sign of v_-YS dt vs +1 Let e' be a positive number, and let t1 be so large that Ivy(t)+Ae-t - vy I>6' when t ? t1. Since s = 1 at to = log no, s< 1 at t only if s = 1 for some previous t1; 1+o 1+o if (for the given t) t1 is the greatest such, we have from Equation (118) rt> t =(1 -a)- a(nf + 2-Vtl t St | {(1+a)2 1+oc dt since as t- oo, 0> d- dt 5 ...(123) =A/\>0, implies -t Vt< s2 + o(1) VSt + 1 < S2_ ... (124) yS t3+o(1). 
Therefore st is positively bounded below, say st > A'>0. ... (125) Hence, when_t _ t1, dt = vy(t)+ v s -s+Ae-t >, ... (126) if 2s' s2 S+ + ys+ ... (127) 11 y Similarly, we can show that dv<, dt ... (128) if v< s YS+1 Now write ?= 8'/( + 1). 2 Y+ ...(129) 1 It is clear that, if, for some t v s-I +s YSl and s > =1+aoc t, MIRRLEES 199 OPTIMUM INCOME TAXATION then dv >0 and also dr <0. Therefore,by the propertiesof the two sets of curves(cf. dt dt Fig. 3), v s2_is increasing. Thusfor all subsequentt,~~~~~~~~dv >e', and v-?oo. Sucha path dt Ys+1 cannot be optimum. Consequentlyon the optimumpath, if t _ tl, 1 or v? either s< l+oc S2 ys+1 +C. ...(130) -?. ...(131) Similarly, for t > tl, either s> 1 or v> s 1+oc -ys+1 X~~~~~~ + pi /~~ FIGuRE 3 Suppose that at t1, s> 1 (An exactly similar argumentapplies if s< 1*) Then r is decreasing,and continuesto do so until 2( ) N r =r' =(1+Soc)~ aE ( 1+oe)2 1 (1+oe+y) +6,. (Cf. Fig. 4.) Once s < Only then can s becomeless than 1 1+x ~ ~ ~~+ Thereforeat no time is r<r" = (1+Oc) 1, r increasesagain. a G +a)2(1 +a Nor can we have r>r' at any later time. Thus we have found t2 such that, when t > t2, LMPQ in Fig. 4, which containsX, and can be (st, v) lies in the curvilinearparallelogram made as small as we pleaseby suitablechoice of 6'. Thereforeas t-+oo, 1+o (t oc)(l+oc+y)32) REVIEWOF ECONOMICSTUDIES 200 The optimumpath is indicatedby XZ in Fig. 2. On it, the marginaltax rate, v2 .(133) ' ~~~~~.. l +a+Y s2 1+ocy whichconfirmsour conjecturein this specialcase.' 2 It shouldbe noted that we have not shown,in eitherof these cases, that s diminishes (nor even that z = ny = et(1 - s) increases)all along the path: the possibilitythat z is constantfor somerangeof n, in the optimumregime,remainsin both the exampleswe have discussed. Calculationof specificcases is requiredto settlethis issue. Suchcalculationis not difficultwith the informationabout the solutionthat we now have. / / sZ / / // ~~/ / X~~ -M+T w~~~~~~~~~~~~~~ Ls FIGu1 4 9. 
A NUMERICAL ILLUSTRATION

The computations whose results are presented in the tables below were carried out for the first case examined above, with $\alpha = 1$, but with a more realistic value for $\sigma^2$. Computations have also been carried out for the case $\sigma^2 = 1$, and these provide an interesting contrast to the main set of calculations. In all cases, we take w = 1; and for computational convenience, the average of log n is -1. This means that the average marginal product of a full day's work is $e^{\sigma^2/2 - 1}$, but it amounts only to a choice of units for the consumption good. The results show, for particular values of the average product of labour, X/Z, what is the optimum tax schedule, and what is the distribution of consumption and labour in the population.

¹ The case $\beta > 0$ can be treated in a precisely similar way, to obtain the same qualitative results.
² It is possible to calculate optimum tax schedules explicitly for a uniform (rectangular) distribution of skills; but since that distribution is of no great interest in the present context, the analysis is omitted.

For purposes of comparison, one naturally wants to know what would have been the optimum position if it had been possible to use lump-sum taxation (or, equivalently, direction of labour). Let us consider this first for the case $\beta = 0$. We shall assume a linear production function
$$X = Z + a$$ ...(134)
(which one thinks of as applying only over a certain range of values of Z, including all those that are to be considered). In the full optimum, we maximize
$$\int [\log x + \log(1 - y)] f(n)\,dn$$
subject to
$$\int x f(n)\,dn = \int ny f(n)\,dn + a.$$ ...(135)
It is clear that x will be the same for everyone:
$$x = x^0,$$ ...(136)
and that $y_n$ must maximize
$$\log(1 - y) + ny/x^0,$$ ...(137)
for otherwise we could improve matters by changing $y_n$ (for a set of n of positive measure, of course) and changing the constant x correspondingly. Maximization of (137) yields
$$y = [1 - x^0/n]_+,$$ ...(138)
where the notation $[\ldots]_+$ means max (0, ...).
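The step from (137) to (138) is a one-line first-order condition. The worked version below is a sketch that uses only the objective in (137) and the constraint $0 \le y < 1$.

```latex
\[
\frac{d}{dy}\left[\log(1-y) + \frac{ny}{x^0}\right]
  = -\frac{1}{1-y} + \frac{n}{x^0} = 0
\quad\Longrightarrow\quad
y = 1 - \frac{x^0}{n} \qquad (n \ge x^0).
\]
% For n < x^0 the derivative at y = 0 equals n/x^0 - 1 < 0, and the
% objective is concave in y, so the constrained maximum is y = 0.
\[
y_n = \max\!\left(0,\; 1 - \frac{x^0}{n}\right).
\]
```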
It is worth noticing that in the full optimum, only men for whom $n > x^0$ actually work, and an interesting curiosity that, with the particular welfare function specified in (135), utility will be less for more highly skilled individuals. This is, as we have seen, impossible under the income-tax. The value of $x^0$ is determined by the production constraint:
$$x^0 = \int_{x^0}^\infty (n - x^0) f(n)\,dn + a,$$ ...(139)
where, for convenience, we have taken $\int f(n)\,dn = 1$. In the case of the special lognormal distribution used here, it can be shown that this equation reduces to
$$2x^0 - x^0 F(x^0) - e^{\sigma^2/2 - 1}\left[1 - F(e^{-\sigma^2} x^0)\right] = a.$$ ...(140)
Solution of this equation gives the consumption level in the full optimum, and also the skill-level below which no work is required of a man, namely that at which a full day's labour would provide a wage equal to the consumption level.

When $\beta > 0$, a similar theory holds. In that case, $x > x^0$ for men with $n > x^0$, but it is still the case that such men are made to have a lower utility level than their less skilled neighbours. The equation corresponding to (140) is a little more complicated and will not be reproduced. For $n > x^0$, consumption and labour are
$$x_n = (x^0)^{(1+\beta)/(1+2\beta)}\, n^{\beta/(1+2\beta)}, \qquad y_n = 1 - (x^0/n)^{(1+\beta)/(1+2\beta)}.$$ ...(141)

In the tables, certain features of the optimal regime under income taxation are given, along with $x^0$ for the full optimum for the same linear production function. In Tables I-X, the lognormal distribution has parameter $\sigma = 0.39$. This figure is derived from Lydall's figures for the distribution of income from employment for various countries ([3], p. 153). It is intended to represent a realistic distribution of skills within the population. In each case, $x_0$, $n_0$, and the values of x, y and x(1 - y) (which measures utility) at the 10 per cent, 50 per cent, 90 per cent and 99 per cent points of the skill-distribution are given.

TABLE I (Case 1)
$\alpha = 1$, $\beta = 0$, $\sigma = 0.39$, mean n = 0.40, X/Z = 0.93.
Full optimum for X = Z - 0.013: $x^0$ = 0.19, $F(x^0)$ = 0.045.
Partial optimum (income-tax): $x_0$ = 0.03, $n_0$ = 0.04, $F(n_0)$ = 0.000.

F(n)                  x      y      x(1-y)   z      Full optimum x
0                     0.03   0      0.03     0      0.19
0.10                  0.10   0.42   0.05     0.09   0.19
0.50                  0.16   0.45   0.08     0.17   0.19
0.90                  0.25   0.48   0.13     0.29   0.19
0.99                  0.38   0.49   0.19     0.45   0.19
Population average    0.17                   0.18   0.19

TABLE II
Same case as Table I.

z      x      Average tax rate (per cent)   Marginal tax rate (per cent)
0      0.03   --                            23
0.05   0.07   -34                           26
0.10   0.10   -5                            24
0.20   0.18   9                             21
0.30   0.26   13                            19
0.40   0.34   14                            18
0.50   0.43   15                            16

TABLE III (Case 2)
$\alpha = 1$, $\beta = 0$, $\sigma = 0.39$, mean n = 0.40, X/Z = 1.10.
Full optimum for X = Z + 0.017: $x^0$ = 0.21, $F(x^0)$ = 0.075.
Partial optimum (income-tax): $x_0$ = 0.05, $n_0$ = 0.06, $F(n_0)$ = 0.000.

F(n)                  x      y      x(1-y)   z      Full optimum x
0                     0.05   0      0.05     0      0.21
0.10                  0.11   0.36   0.07     0.08   0.21
0.50                  0.17   0.42   0.10     0.15   0.21
0.90                  0.27   0.45   0.15     0.28   0.21
0.99                  0.40   0.47   0.21     0.43   0.21
Population average    0.18                   0.17   0.21

TABLE IV
Same case as Table III.

z      x      Average tax rate (per cent)
0      0.05   --
0.05   0.09   -80
0.10   0.13   -30
0.20   0.21   -5
0.30   0.29   3
0.40   0.37   6
0.50   0.46   8

Marginal tax rate (per cent): 21, 20, 19, 17, 16, 15.

TABLE V (Case 3)
$\alpha = 1$, $\beta = 1$, $\sigma = 0.39$, mean n = 0.40, X/Z = 1.20.
Full optimum for X = Z + 0.030: $x^0$ = 0.16, $F(x^0)$ = 0.016.
Partial optimum (income-tax): $x_0$ = 0.07, $n_0$ = 0.09, $F(n_0)$ = 0.000.

F(n)                  x      y      x(1-y)   z      Full optimum x
0                     0.07   0      0.07     0      0.16
0.10                  0.12   0.28   0.08     0.07   0.18
0.50                  0.17   0.37   0.11     0.14   0.21
0.90                  0.26   0.43   0.15     0.26   0.25
0.99                  0.39   0.46   0.21     0.42   0.29
Population average    0.18                   0.15   0.21

TABLE VI
Same case as Table V.

z      x      Average tax rate (per cent)   Marginal tax rate (per cent)
0      0.07   --                            23
0.05   0.11   -113                          28
0.10   0.14   -42                           27
0.20   0.22   -8                            25
0.30   0.29   2                             23
0.40   0.37   7                             21
0.50   0.45   10                            19

TABLE VII (Case 4)
$\alpha = 1$, $\beta = 1$, $\sigma = 0.39$, mean n = 0.40, X/Z = 0.98.
Full optimum for X = Z - 0.003: $x^0$ = 0.14, $F(x^0)$ = 0.007.
Partial optimum (income-tax): $x_0$ = 0.05, $n_0$ = 0.07, $F(n_0)$ = 0.000.

F(n)                  x      y      x(1-y)   z      Full optimum x
0                     0.05   0      0.05     0      0.14
0.10                  0.10   0.33   0.07     0.08   0.17
0.50                  0.15   0.41   0.09     0.15   0.20
0.90                  0.24   0.46   0.13     0.28   0.23
0.99                  0.37   0.48   0.19     0.44   0.26
Population average    0.16                   0.17   0.19

TABLE VIII
Same case as Table VII.

z      x      Average tax rate (per cent)   Marginal tax rate (per cent)
0      0.05   --                            30
0.05   0.08   -66                           34
0.10   0.12   -34                           32
0.20   0.19   7                             28
0.30   0.26   13                            25
0.40   0.34   16                            22
0.50   0.41   17                            20

TABLE IX (Case 5)
$\alpha = 1$, $\beta = 1$, $\sigma = 0.39$, mean n = 0.40, X/Z = 0.88.
Full optimum for X = Z - 0.021: $x^0$ = 0.13, $F(x^0)$ = 0.004.
Partial optimum (income-tax): $x_0$ = 0.04, $n_0$ = 0.06, $F(n_0)$ = 0.000.

F(n)                  x      y      x(1-y)   z      Full optimum x
0                     0.04   0      0.04     0      0.13
0.10                  0.09   0.36   0.06     0.08   0.15
0.50                  0.14   0.43   0.08     0.16   0.18
0.90                  0.23   0.48   0.12     0.29   0.22
0.99                  0.36   0.50   0.18     0.45   0.25
Population average    0.15                   0.17   0.19

TABLE X
Same case as Table IX.

z      x      Average tax rate (per cent)   Marginal tax rate (per cent)
0      0.04   --                            35
0.05   0.07   -43                           39
0.10   0.10   -3                            36
0.20   0.17   15                            31
0.30   0.24   20                            27
0.40   0.31   22                            24
0.50   0.39   21                            21

TABLE XI (Case 6)
$\alpha = 1$, $\beta = 1$, $\sigma = 1$, mean n = 0.61, X/Z = 0.93.
Full optimum for X = Z - 0.013: $x^0$ = 0.25, $F(x^0)$ = 0.35.
Partial optimum (income-tax): $x_0$ = 0.10, $n_0$ = 0.20, $F(n_0)$ = 0.27.

F(n)                  x      y      x(1-y)   z      Full optimum x
0                     0.10   0      0.10     0      0.25
0.10                  0.10   0      0.10     0      0.25
0.50                  0.14   0.15   0.11     0.06   0.28
0.90                  0.32   0.41   0.19     0.54   0.44
0.99                  0.90   0.49   0.46     1.84   0.62
Population average    0.18                   0.20   0.32

TABLE XII
Same case as Table XI.

z      x      Average tax rate (per cent)   Marginal tax rate (per cent)
0      0.10   --                            50
0.10   0.15   -50                           58
0.25   0.20   20                            60
0.50   0.30   40                            59
1.00   0.52   48                            57
1.50   0.73   51                            54
2.00   0.97   51                            52
3.00   1.47   51                            49

[FIGURE 5. Horizontal axis: z.]

[FIGURE 6. Case 5: $\sigma = 0.39$, X/Z = 0.88, mean n = 0.40. Curves shown: optimum consumption function; distribution of consumption after tax; distribution of income before tax. Horizontal axis: z.]
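The full-optimum consumption level for the $\beta = 0$ cases can be recomputed from the production constraint. The sketch below assumes that Equation (140) reads $2x^0 - x^0F(x^0) - e^{\sigma^2/2-1}[1 - F(e^{-\sigma^2}x^0)] = a$, which is what (139) gives for a lognormal skill distribution whose mean of log n is -1. The function names, bracket and tolerance are illustrative and not taken from the paper.

```python
import math


def lognormal_cdf(n, sigma, mean_log=-1.0):
    """F(n) for a lognormal skill distribution (mean and s.d. of log n given)."""
    if n <= 0.0:
        return 0.0
    z = (math.log(n) - mean_log) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def full_optimum_gap(x0, a, sigma):
    """Assumed left-hand side of (140) minus a, for beta = 0."""
    mean_n = math.exp(0.5 * sigma ** 2 - 1.0)  # average of n
    tail = 1.0 - lognormal_cdf(math.exp(-sigma ** 2) * x0, sigma)
    return 2.0 * x0 - x0 * lognormal_cdf(x0, sigma) - mean_n * tail - a


def solve_full_optimum(a, sigma, lo=1e-9, hi=10.0, tol=1e-10):
    """Bisection for x0; the gap is increasing in x0 (its slope is 2 - F(x0))."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if full_optimum_gap(mid, a, sigma) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Case 1 (a = -0.013) and Case 2 (a = 0.017), both with sigma = 0.39
    for a in (-0.013, 0.017):
        x0 = solve_full_optimum(a, 0.39)
        print(a, round(x0, 3), round(lognormal_cdf(x0, 0.39), 3))
```

Under these assumptions the roots should fall close to the tabulated values of 0.19 (Table I) and 0.21 (Table III). The sketch does not cover the $\beta > 0$ cases, for which the paper notes that the corresponding equation is more complicated.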
[FIGURE 7. Case 6: $\alpha = 1$. Axes: z (horizontal), x (vertical).]

In separate tables, the average and marginal tax rates are given for a representative range of values of z. Graphs of the optimal consumption schedule (x = c(z)) are given in Figs. 1 and 2. In Fig. 2, the distributions of $x_n$ and $z_n$ are displayed in case 5.

It will be noticed at once that, under the optimum regime, practically the whole population chooses to work in each of these cases: this contrasts, in some cases, with the full optimum, where sometimes a substantial proportion of the population is allowed to be idle. In most cases, a significant number work for less than a third of the time. It is also somewhat surprising that tax rates are so low. This means, in effect, that the income tax is not as effective a weapon for redistributing income, under the assumptions we have made, as one might have expected. It is not surprising that tax rates are higher when $\beta = 1$. When objectives are more egalitarian, more output is sacrificed for the sake of the poorer groups. Nevertheless, the difference between the optimum when only an income tax is available, and the full optimum, is rather large.

The examples have been chosen for X/Z fairly large: this corresponds to economies in which the requirements of government expenditure are largely met from the profits of public production, or taxation of private profits and commodity transactions. Tax rates are, as one might expect, fairly sensitive to changes in X/Z (i.e. to the production possibilities in the economy, and the extent to which income taxation is used to finance government expenditure as well as for "redistribution"). Tax rates are mildly sensitive to the choice of $\beta$. (When $\alpha = \tfrac{1}{2}$, the main features are unchanged.)

Perhaps the most striking feature of the results is the closeness to linearity of the tax schedules.
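The closeness to linearity can be gauged with an ordinary least-squares line through a tabulated consumption schedule. The sketch below uses the (z, x) pairs transcribed from Table II (Case 1). The helper names are illustrative, and the fit is only a rough diagnostic, not a calculation made in the paper.

```python
# (z, x) pairs transcribed from Table II (Case 1)
TABLE_II_Z = [0.0, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50]
TABLE_II_X = [0.03, 0.07, 0.10, 0.18, 0.26, 0.34, 0.43]


def least_squares_line(zs, xs):
    """Return (intercept, slope) of the least-squares line x = intercept + slope * z."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_x = sum(xs) / n
    szz = sum((z - mean_z) ** 2 for z in zs)
    szx = sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, xs))
    slope = szx / szz
    return mean_x - slope * mean_z, slope


def max_abs_residual(zs, xs):
    """Largest absolute deviation of the tabulated schedule from the fitted line."""
    intercept, slope = least_squares_line(zs, xs)
    return max(abs(x - (intercept + slope * z)) for z, x in zip(zs, xs))


if __name__ == "__main__":
    intercept, slope = least_squares_line(TABLE_II_Z, TABLE_II_X)
    print(round(intercept, 4), round(slope, 4))
    print(round(max_abs_residual(TABLE_II_Z, TABLE_II_X), 4))
```

For these seven points the fitted slope is about 0.79, which with w = 1 corresponds to a flat marginal rate of roughly 21 per cent, and no tabulated point lies more than 0.01 from the line.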
Since a linear tax schedule, which may be regarded as a proportional income tax in association with a poll subsidy, is particularly easy to administer, it cannot be said that the neglect of administrative costs in the analysis is of any importance, except that considerations of administration might well lead an optimizing government to choose a perfectly linear tax schedule. The optimum tax schedule is certainly not exactly linear, however, and we have not explored the welfare loss that would arise from restriction to linear schedules: nevertheless, one may conjecture that the loss would be quite small. It is interesting, though, that in the cases for which we have calculated optimum schedules, the maximum marginal tax rate occurs at a rather low income level, and falls steadily thereafter.

This conclusion would not necessarily hold if the distribution of skills in the population had a substantially greater variance. The sixth case presented has $\sigma = 1$. So great a dispersion of known labouring ability does not seem to be at all realistic at present, but it is just conceivable if a great deal more were known to employers about the abilities of individual members of the population. The optimum is in almost all respects very different. Tax rates are high: a large proportion of the population is allowed to abstain from productive labour. The results seem to say that, in an economy where there is more intrinsic inequality in economic skill, the income tax is a more important weapon of public control than it is in an economy where the dispersion of innate skills is less. The reason is, presumably, that the labour-discouraging effects of the tax are more important, relative to the redistributive benefits, in the latter case.

10.
CONCLUSIONS The examples discussed confirm, as one would expect, that the shape of the optimum earned-income tax schedule is rather sensitive to the distribution of skills within the population, and to the income-leisure preferences postulated. Neither is easy to estimate for real economies. The simple consumption-leisure utility function is a heroic abstraction from a much more complicated situation, so that it is quite hard to guess what a satisfactory method of estimating it would be. Many objections to using observed income distributions as a means of estimating the distribution of skills will spring to mind. Yet the assumptions used in the numerical illustrations seem to fit observation fairly well, and are not in themselves implausible. It is not probable that work decisions are entirely, or even, in the long run, mainly, determined by social convention, psychological need, or the imperatives of cooperative behaviour: an analysis of the kind presented is therefore likely to be relevant to the construction and reform of actual income taxes. Being aware that many of the arguments used to argue in favour of low marginal tax rates for the rich are, at best, premissed on the odd assumption that any means of raising the national income is good, even if it diverts part of that income from poor to rich, I must confess that I had expected the rigorous analysis of income-taxation in the utilitarian manner to provide an argument for high tax rates. It has not done so. I had also expected to be able to show that there was no great need to strive for low marginal tax rates on low incomes when constructing negative-income-tax proposals. This feeling has been to some extent confirmed. But my expectation that the minimum consumption level would be rather high has not been confirmed. Instead, virtually everyone is brought into the workforce. 
Since this conclusion is based on the analysis of an economy in which a man who chooses to work can work, I should not wish to see it applied in real economies. So long as there are periods when employment offered is less than the labour force available, one would perhaps wish to see the minimum income-level, assured to those who are not working, set at such a level that the number who choose not to work is as great as the excess of the labour force over the employment available. A rigorous analysis of this situation has still to be attempted. The results above do at least suggest that we should allow the least skilled to work for a substantially shorter period than the highly skilled.

I would also hesitate to apply the conclusions regarding individuals of high skill: for many of them, their work is, up to a point, quite attractive, and the supply of their labour may be rather inelastic (apart from the possibilities of migration). There is scope for further theoretical work on this problem too.

I conclude, for the present, that:

(1) An approximately linear income-tax schedule, with all the administrative advantages it would bring, is desirable (unless the supply of highly skilled labour is much more inelastic than our utility function assumed); and in particular (optimal!) negative income-tax proposals are strongly supported.¹

(2) The income-tax is a much less effective tool for reducing inequalities than has often been thought; and therefore

(3) It would be good to devise taxes complementary to the income-tax, designed to avoid the difficulties that tax is faced with.
In the model we have been studying, this could be achieved by introducing a tax schedule that depends upon time worked (y) as well as upon labour-income (z): with such a schedule, one can obtain the full optimum, since one can, in effect, construct a different z-schedule for each n.² Such a tax would not be fully practicable, but we have other means of estimating a man's skill-level, such as the notorious I.Q. test: high values of skill-indexes may be sought after so much for prestige that they would not often be misrepresented. With any such method of taxation, the risks of evasion are, of course, quite great: but if it is true, as our results suggest, that the income tax is not a very satisfactory alternative, this objection must be weighed against the great desirability of finding some effective method of offsetting the unmerited favours that some of us receive from our genes and family advantages.

REFERENCES

[1] Blum, W. J. and Kalven, H. Jr. The Uneasy Case for Progressive Taxation (University of Chicago Press, 1953).
[2] Diamond, P. A. "Negative Income Taxes and the Poverty Problem - a Review Article", National Tax Journal (September 1968).
[3] Lydall, H. F. The Structure of Earnings (Oxford, 1968).
[4] Mirrlees, J. A. "Characterization of the Optimum Income Tax" (unpublished).
[5] Musgrave, R. A. The Theory of Public Finance (McGraw-Hill, 1959).
[6] Shoup, C. S. Public Finance (Weidenfeld and Nicolson, 1969).
[7] Vickrey, W. Agenda for Progressive Taxation (Ronald Press, N.Y., 1947).

1 The essential point of these proposals is that the marginal tax rate (as represented by rules for deductions from social security benefits) should be significantly less than 100 per cent. Proposals of this kind have sometimes been put forward in terms that suggest, quite wrongly of course, that any plausible-sounding negative income-tax proposal is better than a system in which all earnings are deducted from social security benefits.
It was a major intention of the present study to provide methods for estimating desirable tax rates at the lowest income levels, and a surprise that these tax rates are the most difficult to determine, in a sense. They cannot be determined without at the same time determining the whole optimum income-tax schedule. To put things another way, no such proposal can be valid out of the context of the rest of the income-tax schedule.

2 I am indebted to Frank Hahn for pointing this out. It would seem to be true that lump-sum taxation is possible in any formal model where uncertainty is not introduced explicitly.
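
As an illustration only, and not part of the analysis itself, the remark that a schedule depending on both time worked (y) and labour-income (z) amounts to a different z-schedule for each n can be sketched numerically. The sketch assumes the model's earnings relation z = ny, so that skill can be read off as n = z/y whenever y > 0. The function names, the rate of 0.25, and the subsidy of 0.5 are hypothetical, chosen purely for the example. The first function also shows the sense in which a linear schedule is a proportional tax combined with a poll subsidy.

```python
def linear_tax(z, rate, subsidy):
    """Linear schedule: proportional tax on income z less a poll subsidy."""
    return rate * z - subsidy


def implied_skill(y, z):
    """Skill n implied by time worked y and labour income z.

    Assumption (from the model, not derived here): z = n * y.
    """
    if y <= 0:
        raise ValueError("skill cannot be inferred when no time is worked")
    return z / y


def skill_dependent_tax(y, z, schedule_for_skill):
    """Tax on the pair (y, z): apply the z-schedule chosen for the implied n."""
    n = implied_skill(y, z)
    return schedule_for_skill(n)(z)


def lump_sum_by_skill(n):
    """Hypothetical schedule: the tax depends on skill alone, not on income."""
    return lambda z: 0.25 * n - 0.5


if __name__ == "__main__":
    # Under the linear schedule, a low income receives a net subsidy.
    print(linear_tax(1.0, rate=0.25, subsidy=0.5))
    print(linear_tax(4.0, rate=0.25, subsidy=0.5))
    # Two people of equal skill who work different hours owe the same amount,
    # so the tax does not depend on how much either chooses to work.
    print(skill_dependent_tax(0.5, 2.0, lump_sum_by_skill))
    print(skill_dependent_tax(1.0, 4.0, lump_sum_by_skill))
```

Because the payment in the last two calls is fixed by n alone, working more or less does not alter it, which is the sense in which such a schedule behaves like the lump-sum taxation mentioned in footnote 2. The sketch ignores misreporting of time worked, which the text identifies as the practical difficulty.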