Exact nonlinear budget constraints determined by systems of equations and inequalities ∗

Diderik Lund
Department of Economics, University of Oslo
P.O. Box 1095, Blindern, NO-0317 Oslo, Norway
E-mail: diderik.lund@econ.uio.no
Web page: http://folk.uio.no/dilund/

April 5, 1999, slightly revised January 2003

Abstract

When the slopes and kinks of a piecewise-linear budget set are determined by some subset of hundreds of alternative systems of equations and inequalities, the computation of such budget sets is not quite straightforward. An exact method is desirable to avoid unnecessary measurement errors in econometric analysis. A workable method using analytical solutions is developed, implemented by the computer software Mathematica for the analytical part and Gauss for the numerical part. The application is to the choice between owner’s wages and dividends for corporations with single owners in Norway 1991, with an extremely complex system of taxes and corporate regulations.

Keywords: Nonlinear budget set, computation, Mathematica, Gauss

∗ Thanks to Erik Fjærli, Jeffrey MacKie-Mason, and Arne Strøm for helpful discussions.

1 Introduction

In many choice situations economic agents face non-linear budget constraints. One well-known source of non-linearity is taxes, in particular the progressive income tax. Sophisticated econometric techniques have been developed to analyze behavior under such budget constraints, in particular labor supply. While the computation of the budget set (the set delimited by the budget constraint) for each observation may be tedious, it is often straightforward if individual tax return data are known. Then other sources of income and deductions are known, so one may hypothetically vary labor income, and calculate the levels of this variable at which the individual reaches a new tax bracket.
Even in this simple case the budget sets (in consumption-leisure space) differ between individuals, because individuals have different levels of other income and of various deductions. Those are examples of “exogenous” individual-specific variables which enter the computation of the budget sets. The variables are often considered exogenous by the labor supply econometrician, although they are clearly affected by the individual’s lifetime choices.

This paper considers a budget set computation which is more complex for two reasons. First, taxes are solutions to simultaneous equation systems, because some deductions are formulated in terms of variables which themselves are tax-dependent. Second, each equation in the system is only piecewise linear, i.e., it can be formulated as a simple or more complex expression of maximum and/or minimum functions. As a consequence, the total tax system can be broken down into sets of simultaneous linear equations, each of which is valid only within some interval. An example of a non-linearity is a tax on net corporate or personal income, imposed only if the income is positive after deduction for, e.g., interest expenses. An example of a simultaneity is a deduction of dividend payments in the tax base, when (as is often the case) the optimal amount of dividends depends on total taxes paid.

Section 2 briefly describes the study by Fjærli and Lund (2001) of the Norwegian 1991 tax system. Section 3 discusses alternative solution methods for determining the budget set for each observation, and describes the chosen method. Section 4 gives a Mathematica program for the first part of the solution, while section 5 sketches a Gauss program for the second part. Just as in Brown (1993), the Gauss program uses Gauss procedures written by the Mathematica program. Section 6 concludes.
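The idea that a piecewise-linear rule breaks down into linear variants, each valid on its own interval, can be illustrated with the tax on positive net income mentioned in the introduction. The following is a minimal sketch in Python, used here only for illustration; it is not one of the paper's Mathematica or Gauss programs, and the function names and the numbers in the usage note are hypothetical.

```python
from fractions import Fraction


def tax(income, interest, rate):
    """Tax on net income, imposed only if income after interest is positive."""
    return rate * max(income - interest, 0)


def linear_regimes(interest, rate):
    """Linear variants of tax() as (slope, intercept, lower, upper).

    Each variant is valid for lower <= income <= upper; None means unbounded.
    """
    return [
        # Tax base non-positive: no tax is paid.
        (Fraction(0), Fraction(0), None, interest),
        # Tax base positive: tax is linear in income.
        (rate, -rate * interest, interest, None),
    ]


def tax_from_regimes(income, interest, rate):
    """Evaluate the tax by picking a linear variant whose interval holds."""
    for slope, intercept, lower, upper in linear_regimes(interest, rate):
        above = lower is None or income >= lower
        below = upper is None or income <= upper
        if above and below:
            return slope * income + intercept
    raise ValueError("no linear variant is valid for this income")
```

With a hypothetical rate of 3/10 and interest expenses of 100, `tax_from_regimes` agrees with `tax` at every income level, and the kink sits at income 100, where both variants are valid and give the same value.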
2 Owner’s wages versus dividends

The specific problem to be used as an example is studied in Fjærli and Lund (2001).¹ The question is what determines the choice between owner’s wages and dividends in a corporation with a single owner. The choice is clearly affected by the tax system, but the study shows that other motives also have some weight. For a more complete discussion of the tax system and behavioral assumptions, the reader is referred to Fjærli and Lund (2001). For the purpose of the present paper, a brief overview is sufficient.

An observation is a pair of a corporation and its owner, who holds all shares in it. The budget set is a region in the (W, D) plane, where W is the 1991 wage income the owner receives from the corporation, net of corporate and personal taxes, and D is net dividend, similarly defined. The north-east boundary of the set, called the budget constraint, is defined as the maximum D for each positive value of W, given the observed values of the corporation’s 1991 pre-tax profits, Π, and the retained post-tax part of these, R. Those observed values, (Π, R), are considered exogenous. Π is defined as profits before payment of wage income to the owner.

The personal dividend tax at the rate m_d is proportional, so that D = G(1 − m_d), where G is gross dividends. Net wage income, W, is a strictly increasing function of gross wage income, W_g, defined through the personal tax system, which is progressive. Book profits before taxes are Y = Π − W_g(1 − a), where a is the rate of a payroll tax. This means that W is a strictly decreasing function of Y when Π, a and the personal tax system are given. Disregarding the personal taxes, the budget constraint may thus essentially be defined in the (Y, G) plane as the maximum G for each Y.

¹ More material on the study is available at the web address http://folk.uio.no/dilund/wagediv/.

In addition to tax rules, there were some accounting rules which affected the budget constraint.
In order to promote corporate retention of earnings, there were two funds under the equity section of the balance sheet. The reserve fund, V, was mandated, while the consolidation fund, F, carried tax incentives. The allocation of retained earnings to these funds, ∆V and ∆F, and to free equity, ∆E, is done simultaneously with the determination of corporate taxes, T_c, and gross dividends, G.

In order to trace out the budget constraint, we need the solution to five equations in the five variables, ∆V, ∆F, ∆E, T_c, G, for each value of W. Below this is equivalently stated as a solution for each value of Y, since W is a strictly decreasing, continuous function of Y. As will be seen below, the first equation is an identity, the second given by tax rules, and the third by accounting rules. Only the fourth and fifth equations describe a maximizing (or tax minimizing) choice by the owner of the corporation. It is the simplicity of this choice which allows the easy analytical solution to the maximization problem, cf. Fjærli and Lund (2001).

Many of the equations are only piecewise-linear. The solution method is to solve the various linear variants of the system, and then sort out for which Y interval each solution is valid. For instance, a corporate income tax is only effective when its base is positive. Thus there is an inequality, the base being positive, which determines which linear expression is valid for the tax.

The endogenous variables are listed in table 1, while the exogenous ones are listed in table 2.² Observe that Y is listed as an exogenous variable. A system of five linear equations thus defines the five endogenous variables as functions of Y, given the values of the other exogenous variables. The variables and equations are explained in the appendix of Fjærli and Lund (2001). Below the equations are presented in their Mathematica form, cf. Wolfram (1991).

² Unfortunately, we have changed notation for a few of the variables during the project. The published version, Fjærli and Lund (2001), defines two tax rates on corporate income, a municipal rate, c_m, and a national rate, c_n. Originally we wanted to use the more general definitions of marginal corporate income tax rates found in King (1977), but it turns out that the relation between the two statutory tax rates and King’s rates varies from case to case (e.g., being in or out of tax position), so we decided that we needed the statutory rates as our basic variables. In the computer programs, and in the (older) presentation in this paper, we use instead the rates c_u and c_d, which here should be taken to be defined simply as c_u = c_n + c_m and c_d = c_m. (See also Lund (1986).) There is also an inconsistency in the name of the computer program variable f, which should rather have been ddf to be consistent with dde and ddv. This is due to the fact that ∆F was named F in previous versions of the paper. Finally, there is an inconsistency in the use of the variable name χ, which is a tax parameter in this paper, and in the budget set computation programs, but a general name for indicator variables in the econometric parts.

Table 1: Endogenous variables

    Symbol  Name in program  Definition in text
    ∆E      dde              Net allocation to free equity
    ∆F      f                Net allocation to consolidation fund
    G       g                Gross (before personal taxes) dividends
    T_c     tc               Corporate taxes
    ∆V      ddv              Net allocation to reserve fund

Table 2: Exogenous variables in equations and inequalities

    Symbol  Name in program  Definition
    A_b     ab               Appreciation fund, beginning-of-year
    A_e     ae               Appreciation fund, end-of-year
    B_e     be               Corporate debt, end-of-year
    C_b     cb               Book value of shares, beginning-of-year
    c_d     cd               Corporate tax rate on distributed profits (= 0.23)
    C_e     ce               Book value of shares, end-of-year
    χ       chi              Dividend trigger ratio for reserve fund (= 0.1)
    c_u     cu               Corporate tax rate on undistributed profits (= 0.508)
    E_b     eb               Free equity, beginning-of-year
    E_b^-   ebm              min(E_b, 0)
    η       eta              Reserve fund withdrawal rate (= 0.2)
    G_r     gr               Dividends received by corporation
    κ       kap              Reserve fund allocation rate (= 0.1)
    L_m     lm               Loss carried-forward (into 1991) for municipal tax
    L_n     lnat             Loss carried-forward (into 1991) for national tax
    ϕ       phi              Consolidation fund allocation rate (= 0.23)
    R∗      rst              Observed retained earnings in corporation
    T_e     te               Corporate wealth taxes
    V_b     vb               Reserve fund, beginning-of-year
    Y       y                Corporate pre-tax book income
    Y_∆     yd               Unexplained difference, corporate book income minus taxable income
    ζ       zeta             Reserve fund target ratio (= 0.2)

2.1 First equation

There is only one version of this, an accounting identity.

    eq1 = y-tc-g==f+ddv+dde

2.2 Second equation

This gives the total corporate taxes, the sum of a municipal corporate income tax, a national corporate income tax, and wealth taxes (te). There are twelve linear versions of this. The two linear versions of the municipal tax can be combined with each of the six linear versions of the national tax. This is represented as a 2 × 6 matrix of equations.

    eq2[0,0] = tc==te
    eq2[1,0] = tc==cd*(y-yd-lm-f)+te
    eq2[0,1] = tc==(cu-cd)*(-yd-lnat)+te
    eq2[1,1] = tc==cd*(y-yd-lm-f)+(cu-cd)*(-yd-lnat)+te
    eq2[0,2] = tc==(cu-cd)*(-yd-lnat+tc)+te
    eq2[1,2] = tc==cd*(y-yd-lm-f)+(cu-cd)*(-yd-lnat+tc)+te
    eq2[0,3] = tc==(cu-cd)*(-yd-lnat+tc+ddv)+te
    eq2[1,3] = tc==cd*(y-yd-lm-f)+(cu-cd)*(-yd-lnat+tc+ddv)+te
    eq2[0,4] = tc==(cu-cd)*(-yd-lnat+tc+ddv-eb)+te
    eq2[1,4] = tc==cd*(y-yd-lm-f)+(cu-cd)*(-yd-lnat+tc+ddv-eb)+te
    eq2[0,5] = tc==(cu-cd)*(y-yd-lnat-f-g+gr)+te
    eq2[1,5] = tc==cd*(y-yd-lm-f)+(cu-cd)*(y-yd-lnat-f-g+gr)+te

2.3 Third equation

There are eight versions of the third equation, corresponding to eight different linear alternatives for the net allocations to the reserve fund. There is no need to arrange these in a matrix.

    eq3v0 = ddv==0
    eq3v1 = ddv==eta*(zeta*ce-vb)
    eq3v2 = ddv==eta*(be+g+tc-ce-ae-vb)
    eq3v3 = ddv==kap*(y-f-tc+ebm)
    eq3v4 = ddv==g-chi*(cb+vb+ab)
    eq3v5 = ddv==kap*(y-f-tc+ebm)+g-chi*(cb+vb+ab)
    eq3v6 = ddv==zeta*ce-vb
    eq3v7 = ddv==be+g+tc-ce-ae-vb

2.4 Fourth equation

There are two linear versions of this.

    eq4v0 = f==0
    eq4v1 = f+ddv+dde==rst

2.5 Fifth equation

There are three linear versions of this, but the first two are classified as one version, as the first one should be used when the first index of eq2 is zero, and the second when the first index of eq2 is one. This means that when the total number of equation systems is counted, the distinction between eq5v1[0] and eq5v1[1] does not add to this total number.

    eq5v1[0] = f==0
    eq5v1[1] = f==phi*(y-yd-lm)
    eq5v2 = dde==-eb

2.6 Total number of equation systems

As explained in Fjærli and Lund (2001), there are really only three different combinations which are allowed from the variants of the fourth and fifth equations. It has, however, been impossible a priori to rule out any combinations of any of these three with any of the eight variants of the third equation. It has also been impossible to rule out any combination of any of these 24 sets of three equations with any of the twelve variants of the second equation. Thus we are left with 24 × 12 = 288 linear variants of the system of five equations.

2.7 Inequalities delimiting validity of equation systems

The 34 inequalities are presented in Mathematica form below. More precisely, each inequality is arranged so that it has zero on one side, and the expressions which follow represent the other side of the inequality.

    ie[1] = y-yd-lm-f
    ie[2] = ddv
    ie[3] = eb
    ie[4] = -tc-ddv+eb
    ie[5] = eb-ddv
    ie[6] = g-gr-y+f+tc-eb+ddv
    ie[7] = g-gr-y+f
    ie[8] = g-gr-y+f+tc
    ie[9] = g-gr-y+f+tc+ddv
    ie[10] = -yd-lnat+tc-eb+ddv
    ie[11] = -yd-lnat
    ie[12] = -yd-lnat+tc
    ie[13] = -yd-lnat+tc+ddv
    ie[14] = y-yd-lnat-f-g+gr
    ie[15] = dde+eb
    ie[16] = g
    ie[17] = vb-zeta*ce
    ie[18] = vb-be-g-tc+ce+ae
    ie[19] = zeta*ce-be-g-tc+ce+ae
    ie[20] = -y+f+tc-ebm
    ie[21] = -g+chi*(cb+vb+ab)
    ie[22] = -vb-ddv+zeta*ce
    ie[23] = -vb-ddv+be+g+tc-ce-ae
    ie[24] = -zeta*ce+vb+kap*(y-f-tc+ebm)
    ie[25] = -be-g-tc+ce+ae+vb+kap*(y-f-tc+ebm)
    ie[26] = -zeta*ce+vb+g-chi*(cb+vb+ab)
    ie[27] = -be-g-tc+ce+ae+vb+g-chi*(cb+vb+ab)
    ie[28] = -zeta*ce+vb+kap*(y-f-tc+ebm)+g-chi*(cb+vb+ab)
    ie[29] = -be-g-tc+ce+ae+vb+kap*(y-f-tc+ebm)+g-chi*(cb+vb+ab)
    ie[30] = -rst+ddv-eb
    ie[31] = f
    ie[32] = y-yd-lm
    ie33[0] = -rst+ddv-eb
    ie33[1] = -rst+ddv+phi*(y-yd-lm)-eb

The last two inequalities in the list are grouped, as they relate to the two different values of the first index of eq2, just as the case for eq5v1, discussed above.

3 Solution method

As discussed in Fjærli and Lund (2001), the solution has the following form: For each of the 288 sets of five linear equations, there is a corresponding set of inequalities. When, for an observation’s given values of the exogenous variables, a set of five equations and their corresponding inequalities are simultaneously satisfied for an interval of Y values, then that set of equations gives the unique solution for that interval of Y values.

One could think of various ways to implement this solution on a computer. The budget constraint is defined as a maximum G for each value of Y. This is not a linear programming problem, however, since the objective function, G, is only piecewise linear in Y. Instead, one might consider using some more general numerical maximization software.
While this might work for any particular value of Y, there is the drawback that we need to trace the constraint for an interval of positive Y values. An approximate solution could be found by using a grid, but the exact kink points would remain unknown. Another drawback would be a lack of control of the correctness of the solution.

An advantage of the method which was implemented is that any misspecification is likely to be detected. Only if all logical possibilities for linear solutions are specified in the equation systems will the solution method yield a unique sequence of connected (at the kink points), but non-overlapping intervals of Y values for each observation.

The solution method is tedious, but in principle not very complicated: The first steps are analytical, and can be done once, irrespective of numerical values for each observation. These were programmed in Mathematica.

1. For each of the 288 sets of linear equations, find the analytical solution, i.e., each of the five endogenous variables expressed as functions of the exogenous ones, which include Y.

2. For each set, enter this solution into the inequalities. This gives the 34 ie functions listed above expressed in terms of only exogenous variables. The 34 functions are linear in Y.

3. For each of the 288 sets of equations, find the analytical derivative of each inequality expression with respect to Y, i.e., the constant slope coefficient with respect to Y.

4. For each set, find also the analytical derivative dG/dY.

For each set of equations, steps 2 and 3 might have been done only for those inequalities which correspond to that set. However, it turned out to be more practical to do the steps for all 34 inequalities.

The next steps are numerical, and were programmed in Gauss (see Aptech (1994)). The steps 5–9 must be done once for each observation, using the observed values of the exogenous variables as inputs.

5. For each of the 288 sets of equations, calculate the signs of the slopes of each of its corresponding inequality expressions (using the slope expressions from step 3), thus determining whether that inequality becomes an upper bound or a lower bound on Y values for which the equation set may be valid. There is also the possibility that the slope is zero, in which case the inequality is either globally true (for all values of Y) or globally false.

6. For each set of equations, calculate the upper or lower bound for Y imposed by each of the inequalities (using the functions from step 2) corresponding to that set, except when the slope is zero.

7. For each set of equations, determine whether the numerical bounds imposed by the corresponding inequalities leave a non-empty interval when taken together. When this happens, that set of equations is valid for that interval of Y values for that observation. (If one (or more) of the inequalities is globally false, there is no non-empty interval.)

8. Check to see that this procedure leads to a connected sequence of adjacent non-overlapping intervals of Y values for each observation.

9. Record the solution for dG/dY for each of these intervals, together with the interval boundaries.

In fact, step 8 must be modified slightly. For some values of the exogenous variables, e.g., G_r = 0, some of the equations become identical, and more than one of the 288 equation systems is valid for some Y interval. Thus one must provide for deletion of identical solution candidates within identical intervals.

4 The Mathematica program

There are 24 different Mathematica programs, one corresponding to each of the 24 versions of the set of the third, fourth and fifth equations. The programs can be distinguished by the case name variable, casnam. The case names are a1–a3, b1–b3, c1–c5, d1–d5, e1–e5, and f1–f3. For simplicity, each program starts with the definition of the 26 equations and of the 34 inequality expressions as given above.
Then the lists of endogenous variables are defined. The variable y is in some connections regarded as endogenous, so there are two versions of the list of endogenous variables. There is also a “list” of exogenous variables, not a list in the Mathematica sense, but a string to be written to the Gauss files.

    endo5={ddv,dde,f,g,tc}
    endo6=Append[endo5,y]
    exost="ab,ae,be,cb,cd,ce,chi,cu,eb,ebm,eta,"<>
      "gr,kap,lm,lnat,phi,rst,te,vb,y,yd,zeta"

The following is the beginning of the program for the a1 case. The only lines which are specific to this case are the line giving value to casnam, and the two lines defining the equation systems eqaloc[0] and eqaloc[1]. These are equation systems for allocations (the third, fourth, and fifth equations), to be combined with the 12 different linear versions of the tax equation. Since eqaloc[0] goes with six of those versions and eqaloc[1] goes with the other six, the case a1 has 12 subcases according to which tax equation is being used. For the remaining cases a2, ..., f3, the three lines below need to be changed, with the relevant equations entered in the eqaloc expressions.

    casnam = "a1"
    eqaloc[0] = eq1 && eq3v1 && eq4v1 && eq5v1[0]
    eqaloc[1] = eq1 && eq3v1 && eq4v1 && eq5v1[1]

Three output files are opened. Mathematica will write Gauss procedures in the first two files. The tya1.src file will contain the expressions from steps 2 and 3, while tda1.src will contain the expressions from step 4. The third file will contain error messages.

    gadfil = "c:\\gauss\\tax\\td" <> casnam <> ".src"
    gayfil = "c:\\gauss\\tax\\ty" <> casnam <> ".src"
    nulfil = "c:\\gauss\\tax\\t" <> casnam <> ".nlf"
    gadh = OpenWrite[gadfil]
    gayh = OpenWrite[gayfil]
    nulh = OpenWrite[nulfil]

The beginnings of the Gauss procedure definitions are written to the files.³ (Parts of the resulting files are shown at the end of this section.)

    WriteString[gadh,"PROC dgdy" <> casnam <> "(" <> exost <>
      ",it,jt);\n\tLOCAL dgdy;\n\t"]
    WriteString[gayh,"PROC (3) = y" <> casnam <> "(" <> exost <>
      ",it,jt,nt);\n" <> "\tLOCAL diedy,ief,yli;\n\t"]

³ Observe that the dgdy procedures have two indices, it and jt, as arguments, while the y procedures have one additional index, nt, which refers to the inequality number.

Then follow two function definitions, solieqs and solve5, which do most of the calculations and writing to files. solieqs is called repeatedly by solve5, which is called repeatedly by iter5j further down.

The function solieqs does step 2 and step 3. It is a function of three indices. i (= 0, 1) is the index for the rows of the tax equation matrix, indicating the expression for the municipal corporate income tax. j (= 0, ..., 5) is the index for the columns of the tax equation matrix, indicating the expression for the national corporate income tax. n (= 1, ..., 33) is the index for the inequalities.

The list solu5 is the result of solving the equation system, which takes place further down in the function solve5. If a solution exists (i.e., the length of the list exceeds zero), the first solution is used. This is plugged into the relevant inequality expression, and the result is defined as the function ieq. Its derivative with respect to y is defined as the function dieqdy. If there is no solution to the equation system, the inequality is kept unchanged, and the derivative is set to zero.

    solieqs[i_,j_,n_] := (
      Print["Inequality " <> ToString[n]];
      If[Length[solu5] > 0,
        ieq[y_] = ie[n]/.First[solu5];
        dieqdy[y_] = D[ieq[y],y];
      ,
        ieq[y_] = ie[n];
        dieqdy[y_] = 0;
      ];

The function solieqs continues: The critical value of y, for which the inequality holds with equality, is assigned to the variable ywri. This is either an upper or a lower bound on the validity of this equation system. The critical value does not exist when the derivative of the inequality expression with respect to y is zero. In that case a dummy value of -88 is assigned instead. The PolynomialQ is included for robustness, so that a non-polynomial expression for ieq[y] can be detected by the dummy value -99. Similarly the exponent of y in ieq[y] could be 0 or greater than 1, which would be detected by the dummy values of -77 or -11 times the exponent. Barring these unexpected occurrences, the exponent is unity, and the linear equation ieq[y] = 0 is solved by using the first and second coefficient of the polynomial. This turned out to be less time-consuming than other commands for equation solutions.

      If[Simplify[dieqdy[y]] === 0,
        ywri = -88;
      ,
        If[PolynomialQ[ieq[y],y],
          If[Exponent[ieq[y],y] == 1,
            coefs = CoefficientList[ieq[y],y];
            ywri = -coefs[[1]]/coefs[[2]];
          ,
            If[Exponent[ieq[y],y] == 0 || ieq[y] === 0,
              ywri = -77;,
              ywri = -11*Exponent[ieq[y],y];
            ];
          ];
        ,
          ywri = -99;
        ];
      ];

The final part of the function solieqs writes the results to the Gauss files. The CForm in Mathematica is used. This has the advantage that the expressions are directly Gauss compatible, so there is no need to edit the files before they are used by Gauss. This is different from using the FortranForm, as in Brown (1993), p. 297. The only additional function which must be defined in Gauss to accommodate the CForm for our needs⁴ is

    fn power(x,y) = x^y;

⁴ Also when the FortranForm is used, it is exponentiation which causes problems. But the ** operator in Fortran cannot be read by Gauss at all, which means that FortranForm files from Mathematica must be edited before Gauss reads them.

      WriteString[gayh,
        "IF nt == " <> ToString[n] <> ";\n" <>
        "\t\t\t\tdiedy = " <>
        ToString[CForm[Simplify[dieqdy[y]]]] <>
        ";\n\t\t\t\tief = "
        <> ToString[CForm[ieq[y]]] <>
        ";\n\t\t\t\tyli = "
        <> ToString[CForm[ywri]] <> ";\n"];
      If[n < 33,
        WriteString[gayh,"\t\t\tELSE"],
        WriteString[gayh,"\t\t\tENDIF;\n"]];
    )

This concludes the function solieqs.

The function solve5 solves the system of five equations.
If a solution exists, the (first) solution for g is assigned as the function gfn[y]. If not, an error message is written. The derivative of gfn is assigned as the function dgdy. The CForm of this is written to file.

    solve5[i_,j_] := (
      Print["Solving for tax alternative ",i,j];
      sy5 = eqaloc[i] && eq2[i,j];
      solu5 = Solve[sy5,endo5];
      If[Length[solu5] > 0,
        gfn[y_] = g/.First[solu5],
        (gfn[y_] = -99y;WriteString[nulh,
          "No solution to equation system\n" <>
          ToString[i] <> ToString[j] <> casnam <> "\n"])
      ];
      dgdy[y_] = D[gfn[y],y];
      WriteString[gadh,
        "IF jt == " <> ToString[j] <> ";\n\t\t\tdgdy = " <>
        ToString[CForm[Simplify[dgdy[y]]]] <> ";\n"];

The final lines of solve5 write a few extra strings to the Gauss files, and call upon the function solieqs, presented above, to do the relevant inequality calculations for this equation system. This means an iteration over n values 1–33.

      WriteString[gayh,
        "IF jt == " <> ToString[j] <> ";\n\t\t\t"];
      Do[(solieqs[i,j,n];
        Clear[ieq,dieqdy,soly,ywri]),
        {n,33}];
      If[j < 5,
        WriteString[{gadh,gayh},"\t\tELSE"],
        WriteString[{gadh,gayh},"\t\tENDIF;\n"]];
    )

This concludes the function solve5.

The function iter5j does the iteration over j values 0–5 for each value of i. A few extra strings are written to the Gauss files, and for each i the correct version of ie33 is assigned as ie[33].

    iter5j[i_] := (
      WriteString[{gadh,gayh},
        "IF it == " <> ToString[i] <> ";\n\t\t"];
      ie[33] = ie33[i];
      Do[solve5[i,j],{j,0,5}];
      If[i < 1,
        WriteString[{gadh,gayh},"\tELSE"],
        WriteString[{gadh,gayh},"\tENDIF;\n"]];
    )

Finally, the main program does the iteration over i values 0–1, and then writes the final lines to the Gauss files, and closes all files.

    Do[iter5j[i],{i,0,1}]
    WriteString[gadh, "\tRETP(dgdy);\nENDP;\n"]
    WriteString[gayh, "\tRETP(diedy,ief,yli);\nENDP;\n"]
    Close[gadh]
    Close[gayh]
    Close[nulh]

The resulting Gauss file tya1.src begins like this:

    PROC (3) = ya1(ab,ae,be,cb,cd,ce,chi,cu,eb,ebm,eta,gr,kap,
        lm,lnat,phi,rst,te,vb,y,yd,zeta,it,jt,nt);
        LOCAL diedy,ief,yli;
        IF it == 0;
            IF jt == 0;
                IF nt == 1;
                    diedy = 1;
                    ief = -lm + y - yd;
                    yli = lm + yd;
                ELSEIF nt == 2;
                    diedy = 0;
                    ief = -(eta*(vb - ce*zeta));
                    yli = -88;
                ELSEIF nt == 3;
                    diedy = 0;
                    ief = eb;
                    yli = -88;
                ELSEIF nt == 4;
                    diedy = 0;
                    ief = eb - te + eta*(vb - ce*zeta);
                    yli = -88;

Observe that ief may contain the variable y, while yli never does. This follows since yli is the value of y which makes ief equal to zero (when possible), as can be verified. Moreover, diedy is the derivative of ief with respect to y (as can also be verified). This will never depend on y, since ief is linear in y (or constant).

The file continues, and contains 12 · 33 = 396 alternative definitions of diedy, ief, and yli, for the 33 inequalities for each of the 12 tax cases. The file ends like this:

                ELSEIF nt == 31;
                    diedy = phi;
                    ief = -(phi*(lm - y + yd));
                    yli = -((-(lm*phi) - phi*yd)/phi);
                ELSEIF nt == 32;
                    diedy = 1;
                    ief = -lm + y - yd;
                    yli = lm + yd;
                ELSEIF nt == 33;
                    diedy = phi;
                    ief = -eb - rst + phi*(-lm + y - yd) - eta*(vb - ce*zeta);
                    yli = -((-eb - lm*phi - rst - phi*yd - eta*(vb - ce*zeta))/phi);
                ENDIF;
            ENDIF;
        ENDIF;
        RETP(diedy,ief,yli);
    ENDP;

Similar procedures are created for the other cases, a2, ....⁵

⁵ When the function ya1 is called in the Gauss program, and similarly when ya2, ... are called, the observed value of y, yst, is used as the y argument because some argument value is needed. This has no significance for the calculations of diedy or yli, since these do not depend on y. It also has no significance via the calculation of ief, since the returned value of ief is only used when diedy is zero, i.e., when ief is independent of y.
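The two properties of the generated expressions stated above, that yli makes ief equal to zero and that diedy is the slope of ief in y, can be checked numerically for one case. The following is a small Python sketch (illustrative only, not part of the paper's programs) that transcribes the nt == 33 expressions from the end of the tya1.src excerpt. The values of phi, eta and zeta are those given in table 2; all other input values are made up for the check.

```python
from fractions import Fraction as F


def ief33(y, eb, rst, phi, lm, yd, eta, vb, ce, zeta):
    """ief for nt == 33, transcribed from the end of the tya1.src excerpt."""
    return -eb - rst + phi*(-lm + y - yd) - eta*(vb - ce*zeta)


def yli33(eb, rst, phi, lm, yd, eta, vb, ce, zeta):
    """yli for nt == 33, transcribed from the same excerpt."""
    return -((-eb - lm*phi - rst - phi*yd - eta*(vb - ce*zeta))/phi)


# phi, eta and zeta follow table 2; the remaining values are hypothetical.
params = dict(eb=F(50), rst=F(30), phi=F(23, 100), lm=F(10), yd=F(5),
              eta=F(1, 5), vb=F(20), ce=F(200), zeta=F(1, 5))

y_crit = yli33(**params)
# Value of the inequality expression at the critical y (expected: exactly zero).
value_at_crit = ief33(y_crit, **params)
# Change in ief for a unit change in y (expected: diedy = phi).
slope = ief33(y_crit + 1, **params) - value_at_crit
```

Exact rational arithmetic is used so that the zero is exact and does not depend on floating-point rounding.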
19 The file tda1.src looks like this: PROC dgdya1(ab,ae,be,cb,cd,ce,chi,cu,eb,ebm,eta,gr,kap, lm,lnat,phi,rst,te,vb,y,yd,zeta,it,jt); LOCAL dgdy; IF it == 0; IF jt == 0; dgdy = 1; ELSEIF jt == 1; dgdy = 1; ELSEIF jt == 2; dgdy = 1; ELSEIF jt == 3; dgdy = 1; ELSEIF jt == 4; dgdy = 1; ELSEIF jt == 5; dgdy = 1; ENDIF; ELSEIF it == 1; IF jt == 0; dgdy = 1 - cd + cd*phi; ELSEIF jt == 1; dgdy = 1 - cd + cd*phi; ELSEIF jt == 2; dgdy = (-1 + cu - cd*phi)/(-1 - cd + cu); ELSEIF jt == 3; dgdy = (-1 + cu - cd*phi)/(-1 - cd + cu); ELSEIF jt == 4; dgdy = (-1 + cu - cd*phi)/(-1 - cd + cu); ELSEIF jt == 5; 20 dgdy = (1 - cu + cu*phi)/(1 + cd - cu); ENDIF; ENDIF; RETP(dgdy); ENDP; Similar procedures are created for the other cases, a2,. . . .6 5 The Gauss program Gauss is used to calculate the coordinates of the budget constraint for each observation. Only a brief overview is given here.7 The calculation has two parts. The first part finds the maximum pre-personal-tax dividend, G, for all relevant values of the pre-personal-tax wage income, Wg . Due to the one-to-one relationship between Wg and Y , this amounts to finding the maximum G for each relevant Y . This results in a constraint which is increasing and piecewise-linear in the (Y, G) plane, defined by a sequence of coordinates for its kink points. The second part, calculating effects of the personal tax system, will be left out of the present discussion. Even the system of corporate taxes and regulations creates a piecewiselinear budget set, but the personal tax system introduces additional kinks because personal taxes were progressive. In order to make the procedures accessible, they are collected in a Gauss library.8 In the beginning of the program, pointers to their names are assigned to two character vectors, pys and pds. pys = &ya1 |&ya2 | ... 6 Again, while y appears as an argument in the dgdya1 function, it really ha no significance. 
In fact, only a few tax rates and other tax parameters appear in the expressions for dgdy. 7 The full programs are available from the author, diderik.lund@econ.uio.no, on request, and will be made available at the web page http://folk.uio.no/dilund/wadiv/. 8 The power function mentioned above should be included in the same library. 21 ; and pds = &dgdya1 |&dgdya2 | ... ; (where ... indicates the other 22 cases). When the procedures are called in the program, the generic names ypro and dpro are used, and an additional argument, ip, is added. This is an index which runs from 1 to 24, so that the relevant one of the 24 procedures can be called. The definition of the two generic procedures are as follows:9 PROC (3) = ypro(ab,ae,be,cb,cd,ce,chi,cu,eb,ebm,eta,gr,kap,lm, lnat,phi,rst,te,vb,yst,yd,zeta,it,jt,nt,ip); LOCAL fy; fy = pys[ip]; LOCAL fy:proc; RETP(fy(ab,ae,be,cb,cd,ce,chi,cu,eb,ebm,eta,gr,kap,lm, lnat,phi,rst,te,vb,yst,yd,zeta,it,jt,nt)); ENDP; PROC dpro(ab,ae,be,cb,cd,ce,chi,cu,eb,ebm,eta,gr,kap,lm, lnat,phi,rst,te,vb,yst,yd,zeta,it,jt,ip); LOCAL fd; fd = pds[ip]; LOCAL fd:proc; RETP(fd(ab,ae,be,cb,cd,ce,chi,cu,eb,ebm,eta,gr,kap,lm, lnat,phi,rst,te,vb,yst,yd,zeta,it,jt)); ENDP; 9 The indexing of procedures is explained in section 6.5 of Aptech (1994). 22 The main program has an outer loop, which is repeated once for each observation. First, the values of all exogenous variables for that observation are assigned. Then there are nested inner loops over the indices it, jt, and ip, within which it is checked whether a particular equation system is valid for some Y interval for that observation. For each of the 24 systems of the third, fourth and fifth equations (a1,. . . ,f3), there is a set of inequalities which must be satisfied in order for the set to be valid. For instance, for a1 to be valid, one must have ie[2] ≤ 0 ie[19] ≥ 0 ie[17] ≥ 0 ie[16] ≥ 0 ie[15] ≥ 0 There is thus a vector aim[2,.] 
specifying how many inequalities (between 5 and 8) are relevant for each of the 24 sets, and a matrix aim[3:10,.] containing two columns for each of the 24 sets: one with the inequality numbers and one with an indicator for a positive or negative value of each ie expression.¹⁰ The first two pairs of columns, for cases a1 and a2, are as follows:

     2  0   2  0  ...
    19  1  19  0  ...
    17  1  18  1  ...
    16  1  16  1  ...
    15  1  15  1  ...
     ⋮

which indicates that the two cases a1 and a2 are distinguished by the sign of ie[19].

¹⁰ Not only are the equations non-linear in Y, so that several linear variants must be considered; some of the inequalities which determine which linear variant of the equations should be used are also non-linear in Y. Since the solution method is based on linear inequalities, this creates the need to distinguish between subcases. Some of the 24 cases are divided into 2 or 3 subcases because of this. There are altogether 48 sets of linear inequalities, each of which implies the validity of one of the 24 sets of linear equations. We shall not go into more detail here.

A similar matrix includes the numbers of the inequalities which are used to determine which version of the second equation (the tax equation) is valid. Altogether there are 1920 inequality subcases to check.¹¹ For each of the 1920 subcases, an 8 × 1 vector ldvec is created and then updated through the checking, with elements shown in table 3.

Table 3: Elements of the ldvec vector

    Element no.  Data type  Initial value  Description
    1            logical          0        subcase rejected?
    2            logical          0        lower bound candidate identified?
    3            logical          0        upper bound candidate identified?
    4            real       -999999        most recent lower bound candidate
    5            real        999999        most recent upper bound candidate
    6            real             0        dG/dY between bounds
    7            integer          0        ie number of most recent lower bound candidate
    8            integer          0        ie number of most recent upper bound candidate
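The record of table 3 and the sign pattern that defines a case can be mirrored in a few lines of code. The sketch below is a Python transcription for illustration only, not part of the Gauss programs. The names new_ldvec, A1_PATTERN and pattern_holds are hypothetical, and the ie values are made up. It encodes the five conditions for case a1 listed above as (inequality number, indicator) pairs, with indicator 1 meaning ie ≥ 0 and indicator 0 meaning ie ≤ 0, and evaluates them at one fixed point.

```python
# Illustrative Python transcription; all names here are hypothetical and
# do not come from the Gauss programs described in the paper.

def new_ldvec():
    """Return the 8-element record of table 3 with its initial values."""
    return [0, 0, 0, -999999.0, 999999.0, 0.0, 0, 0]

# Case a1 as (inequality number, indicator) pairs.
# Indicator 1 requires ie[n] >= 0, indicator 0 requires ie[n] <= 0.
A1_PATTERN = [(2, 0), (19, 1), (17, 1), (16, 1), (15, 1)]

def pattern_holds(pattern, ie):
    """Check a sign pattern against ie values given as {number: value}."""
    return all((ie[n] >= 0) if ind else (ie[n] <= 0) for n, ind in pattern)

# Made-up values of the ie expressions at some fixed Y.
ie_values = {2: -1.5, 19: 0.3, 17: 2.0, 16: 0.0, 15: 4.1}

print(new_ldvec())
print(pattern_holds(A1_PATTERN, ie_values))                    # True
print(pattern_holds(A1_PATTERN, {**ie_values, 19: -0.3}))      # False
```

The program itself does not evaluate the pattern at a single point. Each condition is instead turned into a bound on Y, which is what elements 4 and 5 of ldvec keep track of.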
For each of the 1920 subcases, there is a specific list of inequalities to check, between 5 and 8 relating to the allocation equations and between 3 and 5 relating to the tax equation. For each of these inequalities, the following two statements are executed:

    {die,ief,yso} = ypro(ab,ae,be,cb,cd,ce,chi,cu,eb,ebm,eta,
                         gr,kap,lm,lnat,phi,rst,te,vb,yst,yd,zeta,it,jt,nt,ic);
    ldvec = crichk(die,ief,yso,ti,nt,ldvec);

Here, nt is the number of the inequality, i.e., the same as the index of the ie functions shown above. ti is the truth value for that inequality in the particular subcase, taken from the matrix shown above.

¹¹ Again, there are non-linearities in the inequalities which imply that the twelve linear variants of the tax equation are subdivided into 40 subcases, i.e., there are 40 alternative sets of linear inequalities, each of which implies the validity of one of the 12 linear tax equations. Together with the 48 sets of linear inequalities mentioned in footnote 10, this creates a total of 48 · 40 = 1920 subcases to check.

The procedure crichk determines whether the inequality depends on Y or not. If not, it is either globally true or globally false. If it depends on Y, it determines an upper bound or a lower bound along the Y axis, but whether this is a candidate for an effective bound depends on the previously recorded candidate for an upper or lower bound. The procedure is as follows:

    PROC crichk(die,ief,yso,critru,iqn,ldvec);
      /* Check whether criterion depends on Y. */
      IF die == 0;
        /* This criterion is globally true or false, independent of Y. */
        IF (FGE(ief,0) AND critru) OR (FLE(ief,0) AND NOT critru);
          /* Criterion is globally true, do nothing. */
        ELSE;
          /* Criterion is globally false, assign indicator. */
          ldvec[1] = 1;
        ENDIF;
      ELSE;
        /* Criterion depends on Y, possibly update ldvec bounds. */
        IF die < 0 EQV critru;
          /* This is an upper bound. */
          /* If greater than previous upper, do nothing, else update. */
          IF NOT ldvec[3] OR yso < ldvec[5];
            ldvec[3] = 1;
            ldvec[5] = yso;
            ldvec[8] = iqn;
          ENDIF;
        ELSEIF die > 0 EQV critru;
          /* This is a lower bound. */
          /* If less than previous lower, do nothing, else update. */
          IF NOT ldvec[2] OR yso > ldvec[4];
            ldvec[2] = 1;
            ldvec[4] = yso;
            ldvec[7] = iqn;
          ENDIF;
        ENDIF;
      ENDIF;
      IF FGE(ldvec[4],ldvec[5]) AND ldvec[3] AND ldvec[2];
        /* Lower reaches or exceeds upper, case ruled out, assign indicator. */
        ldvec[1] = 1;
      ENDIF;
      RETP(ldvec);
    ENDP;

After the subcase's inequalities have been checked, if there remains a non-empty interval within which the equation system is valid, the dG/dY value for that equation system is calculated (with the dpro procedure), and the interval and the dG/dY value are saved in a matrix.

Finally, after all 1920 subcases are checked (for this observation), it is checked whether one is left with a connected sequence of non-overlapping intervals. In Fjærli and Lund (2001) this was the case for all observations, confirming that all logically possible solutions to the maximization problem had been found.

6 Conclusion

A workable method has been found for calculating non-linear budget sets when these are solutions to simultaneous equation systems, and the equations contain maximum and/or minimum expressions which make them only piecewise linear.

References

Aptech Systems, Inc. (1994), Gauss System and Graphics Manual, Maple Valley, Wash., U.S.A., revision July 18.
Brown, Stephen J. (1993), “Nonlinear Systems Estimation: Asset Pricing Model Application,” in Hal R. Varian (ed.), Economic and Financial Modeling with Mathematica, Santa Barbara, Calif.: Telos, pp. 286–299.
Fjærli, Erik, and Diderik Lund (2001), “The choice between owner’s wages and dividends under the dual income tax,” Finnish Economic Papers, 14(2), 104–119.
King, Mervyn A. (1977), Public Policy and the Corporation, London: Chapman and Hall.
Lund, Diderik (1986), “Less than single dividend taxation: A note,” Journal of Public Economics, 29(2), 255–261.
Wolfram, Stephen (1991), Mathematica, A System for Doing Mathematics by Computer, second edition, Reading, Mass.: Addison-Wesley.
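As a supplement to section 5, the interval logic of crichk and the final check for a connected sequence of non-overlapping intervals can be tried out on made-up numbers. The following is an illustrative Python transcription, not the Gauss code used in Fjærli and Lund (2001). The argument names of crichk follow the Gauss procedure; the reading of die as the slope of an ie expression in Y, ief as its value when the slope is zero, and yso as the Y at which it equals zero is an interpretation made here. The tolerance TOL, the function is_connected_sequence, the tuple layout (lower bound, upper bound, dG/dY) and all numerical values are assumptions.

```python
# Illustrative Python transcription; not the paper's Gauss code.
# Python lists are 0-based, so ldvec[k-1] here is element k of table 3.
TOL = 1e-12  # assumed tolerance, standing in for Gauss's fuzzy FGE/FLE

def crichk(die, ief, yso, critru, iqn, ldvec):
    """Update the bounds record for one condition (ie >= 0 if critru, else ie <= 0)."""
    ldvec = list(ldvec)
    if die == 0:
        holds = (ief >= -TOL) if critru else (ief <= TOL)
        if not holds:
            ldvec[0] = 1                      # globally false: reject subcase
    elif (die < 0) == bool(critru):           # condition holds for Y <= yso
        if not ldvec[2] or yso < ldvec[4]:
            ldvec[2], ldvec[4], ldvec[7] = 1, yso, iqn
    else:                                     # condition holds for Y >= yso
        if not ldvec[1] or yso > ldvec[3]:
            ldvec[1], ldvec[3], ldvec[6] = 1, yso, iqn
    if ldvec[1] and ldvec[2] and ldvec[3] >= ldvec[4] - TOL:
        ldvec[0] = 1                          # empty interval: reject subcase
    return ldvec

def is_connected_sequence(intervals, tol=1e-9):
    """True if the (lower, upper, slope) intervals tile one stretch of the Y axis."""
    if not intervals:
        return False
    ordered = sorted(intervals)
    if any(upper <= lower for lower, upper, _ in ordered):
        return False
    return all(abs(prev[1] - nxt[0]) <= tol
               for prev, nxt in zip(ordered, ordered[1:]))

# Made-up conditions: (number, truth value, slope, value if slope is 0, root).
checks = [(2, 0, 0.0, -10.0, 0.0), (19, 1, 1.0, 0.0, 20.0),
          (17, 1, -1.0, 0.0, 100.0), (16, 1, 2.0, 0.0, 30.0),
          (15, 1, -2.0, 0.0, 200.0)]
ldvec = [0, 0, 0, -999999.0, 999999.0, 0.0, 0, 0]
for iqn, critru, die, ief, yso in checks:
    ldvec = crichk(die, ief, yso, critru, iqn, ldvec)
print(ldvec)  # [0, 1, 1, 30.0, 100.0, 0.0, 16, 17]

# Made-up intervals saved for one observation, in arbitrary order.
segments = [(50.0, 120.0, 0.72), (0.0, 50.0, 1.0), (120.0, 300.0, 0.55)]
print(is_connected_sequence(segments))                          # True
print(is_connected_sequence(segments + [(100.0, 130.0, 0.6)]))  # False
```

In this example the subcase survives with the interval from 30 to 100, bounded below by inequality 16 and above by inequality 17. Unlike the Gauss procedure, the Python function returns a modified copy of the record rather than relying on Gauss's argument passing.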