TimeSeriesCEQ, September 19, 2011

The Time Series Behavior of Consumption in the CEQ PIH Model

1 Theory

Consider the Permanent Income Hypothesis model of consumption à la Deaton (1992) in which the utility function is quadratic,
\begin{align}
u(c) &= -\frac{\alpha}{2}(c - \bar{c})^2 \tag{1} \\
u'(c) &= -\alpha(c - \bar{c}), \tag{2}
\end{align}
where $\bar{c}$ is the ‘bliss point’ of maximum utility and the budget constraint guarantees that $c < \bar{c}$. If there is no uncertainty about the return factor $R$, the first order condition at period $t-1$ is
\begin{align}
u'(c_{t-1}) &= R\beta E_{t-1}[u'(c_t)] \tag{3} \\
-\alpha(c_{t-1} - \bar{c}) &= -R\beta\alpha E_{t-1}[(c_t - \bar{c})], \tag{4}
\end{align}
and if we assume for convenience that $R\beta = 1$, this reduces to
\begin{align}
c_{t-1} &= E_{t-1}[c_t] \tag{5} \\
c_t &= c_{t-1} + \epsilon_t \tag{6} \\
\Delta c_t &= \epsilon_t, \tag{7}
\end{align}
where $\epsilon_t$ is a standard expectational error with the usual property that $E_s[\epsilon_t] = 0 \;\forall\; s < t$.

2 Hall and Flavin

The implication that consumption follows a random walk was first tested by Hall (1978) with a regression of the form
\[ c_t = \gamma_0 + \gamma_1 c_{t-1} + \gamma_2 z_{t-1} \tag{8} \]
where the theory implies that the coefficient $\gamma_2$ should be equal to zero for all variables $z_{t-1}$ whose value was known at the time that consumption in period $t-1$ was decided. Hall tried a variety of plausible $z$’s and generally found a coefficient on lagged consumption $\gamma_1$ close to one, while for most $z$’s he found a coefficient not significantly different from zero, just as the random walk theory would predict. However, when he chose a measure of stock market performance as his $z_{t-1}$, he did find a statistically significant coefficient $\gamma_2$.

Flavin (1981) extended the theory in Hall, and was the first person to estimate the consumption Euler equation in the form we use today:
\[ \Delta c_t = \mu_0 + \mu_1 z_{t-1}. \tag{9} \]
That is, she imposed the theory’s restriction that the coefficient $\gamma_1$ should be equal to one. For deep and general econometric reasons this provides a more powerful test of the theory, and Flavin found that several variables in addition to the lagged stock market could predict consumption growth, including lagged income growth. Deaton (p.
94) reports results analogous to Flavin’s:
\[ \Delta c_t = \underset{(9.7)}{11.39} + \underset{(3.20)}{0.121}\,\Delta y_{t-1}. \]
Thus, contrary to theory, consumption growth is predictable by lagged income growth: it is ‘excessively sensitive’ to this lagged variable, since the correct sensitivity would be zero.

3 Time Aggregation

Suppose that consumption data are collected annually, with years indexed by $t$, but agents make their consumption decisions once every six months, in periods indexed by $\tau$. For a specific year $t$, refer to the first six-month period as $\tau$ and the second six-month period as $\tau+1$. The recorded change in measured annual consumption $c^*_t$ is related to the underlying semi-annual, unmeasured consumption changes by
\begin{align}
\Delta c^*_t &= c_\tau + c_{\tau+1} - c_{\tau-1} - c_{\tau-2} \tag{10} \\
&= \Delta c_\tau + \underbrace{c_{\tau+1}}_{= c_{\tau+1} - c_\tau + c_\tau - c_{\tau-1} + c_{\tau-1} - c_{\tau-2} + c_{\tau-2}} - c_{\tau-2} \tag{11} \\
&= \Delta c_\tau + (c_{\tau-2} + \Delta c_{\tau-1} + \Delta c_\tau + \Delta c_{\tau+1}) - c_{\tau-2} \tag{12} \\
&= \Delta c_{\tau+1} + 2\Delta c_\tau + \Delta c_{\tau-1}, \tag{13}
\end{align}
and the previous year’s measured income change is
\[ \Delta y^*_{t-1} = \Delta y_{\tau-1} + 2\Delta y_{\tau-2} + \Delta y_{\tau-3}, \tag{14} \]
since if $\tau$ is the first half of year $t$, $\tau-2$ must be the first half of year $t-1$.

The crucial point is that the half-year period $\tau-1$ appears in both (13) and (14). Thus the ‘news’ about income in half-year $\tau-1$ will affect both $\Delta c^*_t$ and $\Delta y^*_{t-1}$, so that measured consumption growth will be predictable from measured lagged income growth even though in each subperiod $\tau$ ‘true’ consumption follows a random walk.

Working (1960) was the first to make the point that time aggregation can cause serial correlation in the aggregated data even if the unaggregated data follow a random walk. Though the algebra is more complex, the same insight holds for quarterly data if the decision period is a month, and more generally the point holds whenever one examines data that are averaged over a time period longer than the period over which the series follows a random walk.

However, consider measured data from period $t-2$:
\[ \Delta y^*_{t-2} = \Delta y_{\tau-3} + 2\Delta y_{\tau-4} + \Delta y_{\tau-5}. \tag{15} \]
Note that since there is no overlap between the time periods represented in equations (15) and (13), $\Delta c^*_t$ should be truly orthogonal to $\Delta y^*_{t-2}$ (and to any other information dated $t-2$).

We can then see whether the ‘excess sensitivity’ finding of Flavin and others merely reflects time aggregation problems either by directly estimating $\Delta c^*_t = \zeta_0 + \zeta_1 \Delta y^*_{t-2}$ and testing whether $\zeta_1$ is statistically significant, or by estimating $\Delta c^*_t = b_0 + b_1 \Delta y^*_{t-1}$ but instrumenting for $\Delta y^*_{t-1}$ using instruments from periods $t-2$ and earlier. Deaton reports the results of the latter experiment as
\[ \Delta c_t = \underset{(6.83)}{10.63} + \underset{(2.18)}{0.174}\,E_{t-2}[\Delta y_{t-1}]. \]
Thus, the ‘excess sensitivity’ results cannot be explained as resulting from time aggregation.

4 Campbell and Mankiw

Campbell and Mankiw (1989) proposed a simple explanation for the importance of lagged variables in predicting current consumption growth. They suggest that some fraction $\lambda$ of income goes to consumers who simply set their consumption equal to their income in every period, called ‘rule of thumb’ or ‘Keynesian’ consumers. The existence of such people can potentially explain the excess sensitivity findings, because the lagged variables may be statistically significant for current consumption growth merely because they are good at predicting current income growth.

Campbell and Mankiw also show that if we assume a perfect foresight CRRA model then for PIH consumers the relationship between consumption growth and income growth can be rewritten in terms of changes in the logs rather than the levels of consumption and income,
\[ \Delta \log c^{PIH}_t = \rho^{-1}(E_{t-1}[r_t] - \delta) \tag{16} \]
and that if aggregate consumption is done partly by such PIH consumers and partly by rule-of-thumb consumers the equation for aggregate consumption growth is approximately¹
\[ \Delta \log c_t \approx (1-\lambda)\rho^{-1}(E_{t-1}[r_t] - \delta) + \lambda E_{t-1}[\Delta \log y_t]. \tag{17} \]

¹ The approximation requires a problematic assumption that the ratio of human wealth to nonhuman wealth is stationary, which is not in fact true in the partial equilibrium framework under consideration; this technical difficulty is not so easy to fix, but is not usually viewed as an important problem.

They therefore estimate an equation of the form
\[ \Delta \log c_t = \gamma_0 + \gamma_1 E_{t-2}[r_t] + \gamma_2 E_{t-2}[\Delta \log y_t], \tag{18} \]
where the fact that the expectations are dated time $t-2$ signifies that only information from periods $t-2$ and earlier is used to forecast income growth and the interest rate. They find an estimate of $\gamma_2 \approx 0.5$, and interpret this result as an indication that roughly half of aggregate income goes to ‘rule of thumb’ consumers.

5 Flavin and the ‘Warranted’ Change in Consumption

The intertemporal budget constraint (IBC) says that the present discounted value of consumption must be less than or equal to total current assets plus the present discounted value of income,
\begin{align}
\sum_{s=t}^{\infty} c_s/R^{s-t} &\leq b_t + \sum_{s=t}^{\infty} y_s/R^{s-t} \tag{19} \\
c_t + c_{t+1}/R + c_{t+2}/R^2 + \dots &\leq b_t + \sum_{s=t}^{\infty} y_s/R^{s-t}. \tag{20}
\end{align}
Define human wealth $h_t$ as
\[ h_t = E_t\left[\sum_{s=t}^{\infty} R^{t-s} y_s\right]. \tag{21} \]
Since the IBC must hold for all possible future realizations of income, it must hold in expectation. Furthermore, if the marginal utility of consumption is always positive, the IBC will hold with equality. Imposing expectations and using (6) gives
\begin{align}
c_t + E_t[c_{t+1}/R] + E_t[c_{t+2}/R^2] + \dots &= b_t + h_t \notag \\
c_t + E_t[(c_t + \epsilon_{t+1})/R] + E_t[(c_{t+1} + \epsilon_{t+2})/R^2] + \dots &= b_t + h_t \notag \\
c_t + E_t[c_t/R] + E_t[c_t/R^2] + \dots &= b_t + h_t \notag \\
c_t + c_t/R + c_t/R^2 + \dots &= b_t + h_t \notag \\
c_t\left(1 + 1/R + 1/R^2 + \dots\right) &= b_t + h_t \tag{22} \\
c_t\left(\frac{1}{1 - 1/R}\right) &= b_t + h_t \notag \\
c_t\left(\frac{R}{R-1}\right) &= b_t + h_t \notag \\
c_t &= \frac{r}{R}(b_t + h_t). \tag{23}
\end{align}
Thus, consumption will be equal to the interest income on total wealth, human and nonhuman $(b_t + h_t)$.
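As a rough numerical illustration of equation (23), the sketch below applies the rule $c_t = (r/R)(b_t + h_t)$ along a known (perfect-foresight) income path and updates assets with $b_{t+1} = R(b_t + y_t - c_t)$. The income path, the starting assets, the value of $R$, and the function names are all hypothetical choices made for this example, and the infinite horizon is approximated by a long finite list.

```python
def human_wealth(incomes, R):
    """PDV of a known income path y_t, y_{t+1}, ... (eq. 21 with no uncertainty)."""
    return sum(y / R**k for k, y in enumerate(incomes))


def ceq_consumption(b, incomes, R):
    """Consumption rule of eq. (23): c_t = (r/R) * (b_t + h_t), with r = R - 1."""
    r = R - 1.0
    return (r / R) * (b + human_wealth(incomes, R))


R = 1.04    # assumed gross interest factor
b = 10.0    # assumed initial nonhuman wealth
# Hypothetical income path: uneven for three periods, then flat.
# The long flat tail stands in for the infinite horizon.
incomes = [2.0, 0.5, 1.5] + [1.0] * 2000

c_path = []
for t in range(5):
    c = ceq_consumption(b, incomes[t:], R)
    c_path.append(c)
    b = R * (b + incomes[t] - c)   # dynamic budget constraint

print([round(c, 6) for c in c_path])
```

With no news arriving, the rule delivers the same consumption level in every period even though income moves around, which is the certainty-equivalent counterpart of the random walk result in (7).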
Deaton defines the expression $(r/R)(b_t + h_t)$ as permanent income.²

² Deaton, here, is codifying practice in much of the immediately preceding literature; however, this definition of ‘permanent income’ differs importantly from the verbal definitions originally provided by Friedman (1957); for a discussion, and an argument that Friedman’s original definition was actually more appropriate, see Carroll (1997).

Use the dynamic budget constraint $b_t = R(b_{t-1} + y_{t-1} - c_{t-1})$ to substitute for $b_t$ in equation (23) to get
\[ c_t = r(b_{t-1} + y_{t-1} - c_{t-1}) + (r/R)\,E_t\left[\sum_{s=t}^{\infty} y_s R^{t-s}\right]. \tag{24} \]
Now lag (23) by one period and multiply both sides by $R$,
\begin{align}
(1+r)c_{t-1} &= r\left[b_{t-1} + E_{t-1}\left[\sum_{s=t-1}^{\infty} R^{(t-1)-s} y_s\right]\right] \tag{25} \\
(1+r)c_{t-1} &= r\left[b_{t-1} + y_{t-1} + E_{t-1}\left[\sum_{s=t}^{\infty} R^{t-s-1} y_s\right]\right] \tag{26} \\
(1+r)c_{t-1} &= r b_{t-1} + r y_{t-1} + (r/R)\,E_{t-1}\left[\sum_{s=t}^{\infty} R^{t-s} y_s\right]. \tag{27}
\end{align}
Now subtract (27) from (24) to obtain:
\begin{align}
c_t - c_{t-1} - r c_{t-1} &= r(b_{t-1} + y_{t-1} - c_{t-1}) + (r/R)\,E_t\left[\sum_{s=t}^{\infty} R^{t-s} y_s\right] \notag \\
&\quad - \left(r b_{t-1} + r y_{t-1} + (r/R)\,E_{t-1}\left[\sum_{s=t}^{\infty} R^{t-s} y_s\right]\right) \tag{28} \\
c_t - c_{t-1} &= (r/R)\left[\sum_{s=t}^{\infty} R^{t-s}\left(E_t[y_s] - E_{t-1}[y_s]\right)\right]. \tag{29}
\end{align}
Note that we also had an earlier expression for $c_t - c_{t-1}$: equation (7). Thus we can conclude that
\[ \epsilon_t = (r/R)\left[\sum_{s=t}^{\infty} R^{t-s}\left(E_t[y_s] - E_{t-1}[y_s]\right)\right]. \tag{30} \]
Not only is consumption a random walk, it is a very specific random walk: the change in consumption must be equal to the change in the expectation of permanent income.

Consider now a specific assumption about the stochastic process for income,
\[ (y_t - P) = \eta_t + \beta\eta_{t-1}, \tag{31} \]
where $\eta_t$ denotes the innovation to income (written $\eta_t$ here to keep it distinct from the consumption innovation $\epsilon_t$). Given this equation, we have
\[ (E_t - E_{t-1})\,y_{t+k} = \begin{cases} \eta_t & \text{if } k = 0 \\ \beta\eta_t & \text{if } k = 1 \\ 0 & \text{if } k > 1. \end{cases} \tag{32} \]
Substituting this into (29) gives
\[ \Delta c_t = (r/R)(1 + \beta/R)\,\eta_t. \tag{33} \]
Now suppose that we estimate a value of $\beta = 0.5 \times 1.04$ (so that $\beta/R = 0.5$ when $R = 1.04$), and for simplicity let’s set $r = 0.04$. The standard deviation of changes in consumption will be given by
\begin{align}
\sigma_{\Delta c} &= (0.04/1.04)(1 + 0.5)\,\sigma_\eta \tag{34} \\
&\approx 0.06\,\sigma_\eta. \tag{35}
\end{align}
Thus, consumption will be much smoother than income, as it is in the data.
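The arithmetic in (33) to (35) is easy to check directly. The following minimal sketch evaluates the MA(1) response coefficient $(r/R)(1 + \beta/R)$ at the values used in the text; the function name is a hypothetical label chosen for this example.

```python
def ma1_consumption_response(beta, r):
    """Coefficient on the income innovation in eq. (33): (r/R) * (1 + beta/R)."""
    R = 1.0 + r
    return (r / R) * (1.0 + beta / R)


r = 0.04
beta = 0.5 * 1.04          # chosen so that beta / R = 0.5, as in the text
ratio = ma1_consumption_response(beta, r)

# ratio is sigma_{Delta c} / sigma_eta, the relative volatility of consumption changes
print(round(ratio, 4))
```

Because $\Delta c_t$ is proportional to the income innovation, this coefficient is also the ratio of the two standard deviations, so no simulation is needed to reproduce (35).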
The MA(1) example can be extended to any moving average representation of income. If the income process is
\[ y_t = P + \eta_t + \sum_{k=1}^{\infty} \beta_k \eta_{t-k}, \tag{36} \]
with $\eta_t$ the innovation to income, the change in consumption warranted by a current innovation $\eta_t$ is
\begin{align}
\Delta c_t &= (r/R)\left(1 + \beta_1/R + \beta_2/R^2 + \dots\right)\eta_t \tag{37} \\
&= (r/R)\,\beta[1/R]\,\eta_t \tag{38}
\end{align}
where $\beta[1/R]$ is the lag polynomial $\beta[L] = 1 + \beta_1 L + \beta_2 L^2 + \dots$ that defines the moving average process, evaluated at the discount factor $R^{-1}$.

If income is a general ARMA process and we write $z_t = y_t - P$, we have
\[ z_t + \alpha_1 z_{t-1} + \alpha_2 z_{t-2} + \dots = \eta_t + \beta_1\eta_{t-1} + \beta_2\eta_{t-2} + \dots \tag{39} \]
which using the lag polynomial notation is:
\[ \alpha[L]\,z_t = \beta[L]\,\eta_t. \tag{40} \]
If this is an ‘invertible’ lag polynomial, it can be rewritten in the form $z_t = \alpha^{-1}[L]\beta[L]\eta_t$, so that if income is generated by (39) the warranted change in consumption upon receiving a shock $\eta_t$ is given by
\[ \Delta c_t = \underbrace{(r/R)\,\beta[R^{-1}]/\alpha[R^{-1}]}_{\equiv\,\theta}\;\eta_t. \tag{41} \]
Thus, if we know the ‘true’ ARMA process for income, the change in consumption upon receipt of a shock $\eta_t$ can be predicted and compared with the outcome.

Flavin (1981) was the first to consider using these time series formulas to test whether the change in consumption is related to the ‘warranted’ change. She proposed estimating equations like
\[ \Delta c_t = \mu + \theta\eta_t \tag{42} \]
where the parameter $\theta$ is the warranted change in consumption implied by equation (41).

So far we have assumed a stationary income process (that is, a process in which income always eventually returns to its mean level $P$). The truth is that income has grown steadily over time, so interpreted strictly the assumption of stationarity is nonsense.

There are several ways of improving upon the implausible stationarity assumption. The first (pursued by Flavin) is to simply assume that there is a deterministic trend to income,
\[ y_t = \mu_0 + \mu_1 y_{t-1} + \mu_2 t. \tag{43} \]
This is not a plausible process, however, because it implies that income in the far distant future can be predicted with a high degree of accuracy simply by extending the trend into the future.
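To make the ARMA formula (41) concrete, here is a small sketch that evaluates $\theta = (r/R)\,\beta[R^{-1}]/\alpha[R^{-1}]$ for a stationary process from lists of coefficients. The function names and the example coefficients are hypothetical; the sign convention follows (39), so an AR(1) process $z_t = \phi z_{t-1} + \eta_t$ is entered as $\alpha_1 = -\phi$.

```python
def lag_poly_at(coeffs, x):
    """Evaluate 1 + c_1 x + c_2 x^2 + ... at the point x."""
    return 1.0 + sum(c * x**k for k, c in enumerate(coeffs, start=1))


def warranted_theta(alphas, betas, r):
    """theta from eq. (41): (r/R) * beta[1/R] / alpha[1/R]."""
    R = 1.0 + r
    x = 1.0 / R
    return (r / R) * lag_poly_at(betas, x) / lag_poly_at(alphas, x)


r = 0.04
R = 1.0 + r

# MA(1) case: reproduces the coefficient in eq. (33)
theta_ma1 = warranted_theta([], [0.5 * 1.04], r)

# AR(1) case with an assumed persistence phi = 0.9, i.e. alpha_1 = -0.9
phi = 0.9
theta_ar1 = warranted_theta([-phi], [], r)

# Cross-check the AR(1) value by discounting the revisions phi**k directly, as in eq. (29)
theta_ar1_direct = (r / R) * sum((phi / R)**k for k in range(5000))

print(theta_ma1, theta_ar1, theta_ar1_direct)
```

The AR(1) cross-check sums the news about each future income level, so agreement between the two numbers is a check on the lag-polynomial shortcut rather than on any estimated income process.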
A vast literature on time-series econometrics has arisen to examine how trending series such as income should be modeled, and its conclusion is that it is much more plausible to represent the process for income as difference stationary,
\[ \Delta y_t = \mu_0 + \mu_1 \Delta y_{t-1} + \eta_t, \tag{44} \]
where $\eta_t$ is the innovation to income.

Consider first the simplest possible difference stationary process for income,
\[ \Delta y_t = \eta_t. \tag{45} \]
What implications does this process have for the behavior of consumption changes? Recall from (29) that
\[ \Delta c_t = (r/R)\left[\sum_{s=t}^{\infty} R^{t-s}\left(E_t[y_s] - E_{t-1}[y_s]\right)\right]. \tag{46} \]
Consider the expectation as of time $t-1$ of $y_t$:
\begin{align}
E_{t-1}[y_t] &= y_{t-1} + E_{t-1}[y_t - y_{t-1}] \tag{47} \\
&= y_{t-1} + E_{t-1}[\eta_t] \tag{48} \\
&= y_{t-1}. \tag{49}
\end{align}
Similar logic shows that $E_{t-1}[y_{t+1}] = E_{t-1}[y_{t+2}] = \dots = y_{t-1}$. Furthermore, the same logic shows that $E_t[y_{t+1}] = E_t[y_{t+2}] = \dots = y_t$. Substituting these expressions into (46),
\begin{align}
\Delta c_t &= (r/R)\left[\sum_{s=t}^{\infty} R^{t-s}(y_t - y_{t-1})\right] \tag{50} \\
&= (r/R)\left[\sum_{s=t}^{\infty} R^{t-s}\eta_t\right] \tag{51} \\
&= (r/R)\left(\frac{1}{1 - (1/R)}\right)\eta_t \tag{52} \\
&= r\left(\frac{1}{R-1}\right)\eta_t \tag{53} \\
&= r\left(\frac{1}{r}\right)\eta_t \tag{54} \\
&= \eta_t. \tag{55}
\end{align}
This makes perfect intuitive sense: when income increases by $\eta_t$, the expectation is that it will stay at this new higher level forever, so permanent income has gone up by exactly as much as current income and therefore consumption should react to income changes one-for-one.

When an equation like (44) is estimated on US quarterly data, the coefficient on the lagged income change is typically in the neighborhood of 0.26. Consider now what this implies for the change in permanent income that is associated with a given change in current income. Suppose for simplicity that the change in income in the previous period was $\Delta y_{t-1} = 0$, and suppose that $\mu_0 = 0$. Then
\begin{align}
E_t[y_{t+1}] &= E_t[y_t + \Delta y_{t+1}] \tag{56} \\
&= E_t[y_t + \mu_1\eta_t + \eta_{t+1}] \tag{57} \\
&= E_t[y_t + \mu_1\eta_t] \tag{58} \\
E_t[y_{t+2}] &= E_t[y_{t+1} + \Delta y_{t+2}] \tag{59} \\
&= E_t[y_t + \mu_1\eta_t + \mu_1^2\eta_t] \tag{60}
\end{align}
and so on, so that the change in expected permanent income is
\begin{align}
(E_t - E_{t-1})(r/R)h_t &= (r/R)\Big(\underbrace{\eta_t}_{E_t[y_t] - E_{t-1}[y_t]} + \underbrace{(\eta_t + \mu_1\eta_t)}_{E_t[y_{t+1}] - E_{t-1}[y_{t+1}]}/R + \underbrace{\eta_t(1 + \mu_1 + \mu_1^2)}_{E_t[y_{t+2}] - E_{t-1}[y_{t+2}]}/R^2 + \dots\Big) \tag{61} \\
&= (r/R)\left(\eta_t\,\frac{1}{1 - (1/R)} + \eta_t(\mu_1/R)\,\frac{1}{1 - (1/R)} + \eta_t(\mu_1^2/R^2)\,\frac{1}{1 - (1/R)} + \dots\right) \notag \\
&= \eta_t\,\frac{1}{1 - \mu_1/R} \tag{62} \\
&\approx \eta_t\,\frac{1}{1 - 0.25} \approx 1.33\,\eta_t, \tag{63}
\end{align}
where the second line collects, for each power of $\mu_1$, the geometric sum of the discount factors that multiply it, and the approximation takes $\mu_1/R \approx 0.25$.

Now consider what this means for the standard deviation of changes in consumption in the PIH model. Since consumption is equal to permanent income, the standard deviation of changes in consumption should be equal to the standard deviation of changes in permanent income,
\[ \sigma_{\Delta c} = 1.33\,\sigma_\eta. \tag{64} \]
But the standard deviation of the income innovation is only $\sigma_\eta$ (and that of income changes themselves, $\sigma_\eta/\sqrt{1 - \mu_1^2} \approx 1.04\,\sigma_\eta$, is only slightly larger). Thus if income changes are positively serially correlated, the permanent income hypothesis implies that the magnitude of consumption variability should actually be greater than the variability in income, exactly the opposite of the pattern the PIH was invented to explain!

6 Campbell and Deaton

Campbell (1987) and Campbell and Deaton (1989) showed how to carry out the test that Flavin (1981) originally proposed: whether consumption responds the ‘right amount’ to news about income $\eta_t$. As suggested by the simple example above, their main finding was that consumption is ‘excessively smooth.’

To see how they reached that conclusion, begin with the formula for consumption in the CEQ PIH model:
\[ c_t = \frac{r}{R}\left[b_t + \sum_{k=0}^{\infty} R^{-k} E_t[y_{t+k}]\right]. \tag{65} \]
Define saving in period $t$ as total income (discounted interest income plus labor income) minus consumption,
\[ s_t = (r/R)b_t + y_t - c_t. \tag{66} \]
Substitute (65) into (66) to get
\begin{align}
s_t &= (r/R)b_t + y_t - (r/R)\left[b_t + \sum_{k=0}^{\infty} R^{-k} E_t[y_{t+k}]\right] \tag{67} \\
&= y_t\underbrace{[1 - (r/R)]}_{=(R-r)/R = 1/R} - (r/R)\sum_{k=1}^{\infty} R^{-k} E_t[y_{t+k}] \tag{68} \\
&= R^{-1}y_t - (r/R^2)E_t[y_{t+1}] - (r/R)\sum_{k=2}^{\infty} R^{-k} E_t[y_{t+k}] \tag{69} \\
&= R^{-1}y_t \underbrace{- R^{-1}E_t[y_{t+1}] + R^{-1}E_t[y_{t+1}]}_{=0} - (r/R^2)E_t[y_{t+1}] - (r/R)\sum_{k=2}^{\infty} R^{-k} E_t[y_{t+k}] \notag \\
&= -R^{-1}E_t[y_{t+1} - y_t] + \underbrace{(1+r)E_t[y_{t+1}]/R^2 - (r/R^2)E_t[y_{t+1}]}_{R^{-2}E_t[y_{t+1}]} - (r/R)\sum_{k=2}^{\infty} R^{-k} E_t[y_{t+k}] \notag \\
&= -R^{-1}E_t[\Delta y_{t+1}] + R^{-2}E_t[y_{t+1}] - (r/R)\sum_{k=2}^{\infty} R^{-k} E_t[y_{t+k}] \notag \\
&= -R^{-1}E_t[\Delta y_{t+1}] + R^{-2}E_t[y_{t+1}] - (r/R)\left(R^{-2}E_t[y_{t+2}] + \sum_{k=3}^{\infty} R^{-k} E_t[y_{t+k}]\right) \notag \\
&= -R^{-1}E_t[\Delta y_{t+1}] + R^{-2}E_t[y_{t+1}] - (r/R)\left(R^{-2}E_t[y_{t+2}] + \sum_{k=2}^{\infty} R^{-k-1} E_t[y_{t+k+1}]\right) \notag \\
&= R^{-1}\Big(-E_t[\Delta y_{t+1}] + \underbrace{R^{-1}E_t[y_{t+1}] - (r/R^2)E_t[y_{t+2}] - (r/R)\sum_{k=2}^{\infty} R^{-k} E_t[y_{t+k+1}]}_{= E_t \text{ of equation (69) at } t+1}\Big). \notag
\end{align}
The last line says that $s_t = R^{-1}\left(-E_t[\Delta y_{t+1}] + E_t[s_{t+1}]\right)$. Repeated substitution leads to the result that
\[ s_t = -\sum_{k=1}^{\infty} R^{-k} E_t[\Delta y_{t+k}]. \tag{70} \]
Saving today is equal to the negative of the PDV of expected future changes in labor income; equivalently, it is the PDV of expected future declines in labor income. The intuition is straightforward. If income in the future were expected to be the same as today, consumption would be equal to income. If income in the future will be higher than income today (positive values of $\Delta y_{t+n}$) then consumption should be higher than income today and saving will be negative. Writing $y^d_t \equiv (r/R)b_t + y_t$ for disposable income and rewriting (70) as
\begin{align}
y^d_t - c_t &= -\sum_{k=1}^{\infty} R^{-k} E_t[\Delta y_{t+k}] \tag{71} \\
c_t &= y^d_t + \sum_{k=1}^{\infty} R^{-k} E_t[\Delta y_{t+k}] \tag{72}
\end{align}
it is clear that, when income changes are stationary, consumption and disposable income are cointegrated.

Now consider the possibility that individual households know more about their future income prospects than the econometrician. Call the information set of the individuals $I_t$ and that of the econometrician $\Omega_t$, where $\Omega_t \subset I_t$.
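Before bringing information sets into the picture, the saving identity (70) can be checked numerically. The sketch below assumes perfect foresight over a made-up income path that is flat after a few periods, computes consumption from (65) and saving from (66), and compares the result with the discounted sum in (70). The path, the asset level, $R$, and the helper name are illustrative assumptions, and the infinite sums are truncated at a long finite horizon.

```python
def pdv(values, R, start=0):
    """Sum of values[j] / R**(start + j): a discounted sum whose first term is dated `start`."""
    return sum(v / R**k for k, v in enumerate(values, start=start))


R = 1.04
r = R - 1.0
b = 5.0                                     # assumed nonhuman wealth b_t
# Hypothetical known income path y_t, y_{t+1}, ...; flat from the fourth entry onward.
y = [1.0, 1.2, 0.9, 1.5] + [1.5] * 3000

c = (r / R) * (b + pdv(y, R))               # eq. (65) under perfect foresight
s_direct = (r / R) * b + y[0] - c           # eq. (66)

dy = [y[k] - y[k - 1] for k in range(1, len(y))]   # Delta y_{t+1}, Delta y_{t+2}, ...
s_formula = -pdv(dy, R, start=1)            # eq. (70)

print(s_direct, s_formula)
```

In this example income is expected to rise on balance, so both calculations give negative saving: the household consumes more than its current disposable income in anticipation of higher income later.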
Consumers will of course make their decisions based upon their own information set $I_t$:
\[ s_t = -\sum_{k=1}^{\infty} R^{-k} E_t[\Delta y_{t+k} \mid I_t]. \tag{73} \]
Notice that the econometrician with information set $\Omega_t$ would calculate that saving should be
\[ \tilde{s}_t = -\sum_{k=1}^{\infty} R^{-k} E_t[\Delta y_{t+k} \mid \Omega_t] \tag{74} \]
and thus the difference between what the econometrician would calculate saving should be and what saving actually is is:
\[ \tilde{s}_t - s_t = -\sum_{k=1}^{\infty} R^{-k}\left(E_t[\Delta y_{t+k} \mid \Omega_t] - E_t[\Delta y_{t+k} \mid I_t]\right) \tag{75} \]
which reveals to us all of the relevant information that individuals have about the PDV of their future incomes that we cannot know from the information directly available to us!

This implies that, whatever variables we are using to forecast future income growth, saving is likely to provide additional forecasting power. Thus, the Flavin approach of using only lagged values of income to forecast future values of income is inefficient, and we should at a minimum add saving to the forecasting equation for income. This leads us to the following VAR:
\[ \begin{pmatrix} \Delta y_t \\ s_t \end{pmatrix} = \begin{pmatrix} \zeta_{11} & \zeta_{12} \\ \zeta_{21} & \zeta_{22} \end{pmatrix} \begin{pmatrix} \Delta y_{t-1} \\ s_{t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} \tag{76} \]
which can be rewritten as
\[ x_t = A x_{t-1} + u_t. \tag{77} \]
Forecasts of $x$ are formed using
\[ E_t[x_{t+i}] = A^i x_t. \tag{78} \]
Define the vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, which picks out the income change element of $x$, and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, which picks out the saving component of $x$, so that
\begin{align}
E_t[\Delta y_{t+i}] &= e_1' A^i x_t \tag{79} \\
-s_t &= \sum_{k=1}^{\infty} R^{-k} E_t[\Delta y_{t+k}] \tag{80} \\
-e_2' x_t &= \sum_{k=1}^{\infty} e_1' A^k R^{-k} x_t. \tag{81}
\end{align}
Since (81) must hold for every possible value of $x_t$, the row vectors multiplying $x_t$ on the two sides must be equal, which gives
\[ e_2' = -e_1' \sum_{k=1}^{\infty} A^k R^{-k}. \tag{82} \]
Recall the fact for scalars that if $|\mu| < 1$ then $\sum_{k=0}^{\infty} \mu^k = 1/(1-\mu) = (1-\mu)^{-1}$.
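There is an equivalent formula for matrices (valid when all eigenvalues of $A/R$ lie inside the unit circle):
\begin{align}
\sum_{k=0}^{\infty} (A/R)^k &= (I - A/R)^{-1} \tag{83} \\
\sum_{k=1}^{\infty} (A/R)^k &= (I - A/R)^{-1} - I \tag{84}
\end{align}
which can be substituted into (82) to obtain
\begin{align}
e_2' &= -e_1'\left[(I - A/R)^{-1} - I\right] \tag{85} \\
e_2'(I - A/R) &= -e_1'\left[I - (I - A/R)\right] \tag{86} \\
e_2' &= (e_2' - e_1')(A/R) \tag{87} \\
\begin{pmatrix} 0 & 1 \end{pmatrix} &= \left[\begin{pmatrix} 0 & 1 \end{pmatrix} - \begin{pmatrix} 1 & 0 \end{pmatrix}\right] A/R \tag{88} \\
\begin{pmatrix} 0 & 1 \end{pmatrix} &= \begin{pmatrix} -1 & 1 \end{pmatrix} \begin{pmatrix} \zeta_{11} & \zeta_{12} \\ \zeta_{21} & \zeta_{22} \end{pmatrix} / R \tag{89} \\
0 &= (-\zeta_{11} + \zeta_{21})/R \;\Longrightarrow\; \zeta_{21} = \zeta_{11} \tag{90} \\
1 &= (-\zeta_{12} + \zeta_{22})/R \;\Longrightarrow\; \zeta_{22} = \zeta_{12} + R. \tag{91}
\end{align}
So, to test the theory one estimates the equations of the VAR one by one,
\begin{align}
\Delta y_t &= \zeta_{11}\Delta y_{t-1} + \zeta_{12}s_{t-1} + u_{1t} \notag \\
s_t &= \zeta_{21}\Delta y_{t-1} + \zeta_{22}s_{t-1} + u_{2t}, \notag
\end{align}
and then tests the econometric restrictions that $\zeta_{21} = \zeta_{11}$ and $\zeta_{22} = \zeta_{12} + R$.

To understand these restrictions, start with the definition of saving $s_t = (r/R)b_t + y_t - c_t$, lag one period and multiply through by $R$ to get
\[ R s_{t-1} = r b_{t-1} + R(y_{t-1} - c_{t-1}). \tag{92} \]
Now recall that the evolution of assets is given by
\begin{align}
b_t &= R(b_{t-1} + y_{t-1} - c_{t-1}) \tag{93} \\
R(y_{t-1} - c_{t-1}) &= b_t - R b_{t-1}. \tag{94}
\end{align}
Use (94) to substitute into (92):
\begin{align}
R s_{t-1} &= r b_{t-1} + b_t - R b_{t-1} \tag{95} \\
&= b_t - b_{t-1} \tag{96} \\
&= \Delta b_t. \tag{97}
\end{align}
But the first difference of the definition of saving says
\begin{align}
\Delta s_t &= (r/R)\Delta b_t + \Delta y_t - \Delta c_t \tag{98} \\
\Delta c_t &= r s_{t-1} + \Delta y_t - \Delta s_t \tag{99} \\
&= \Delta y_t - s_t + R s_{t-1}. \tag{100}
\end{align}
Now let’s suppose that the CEQ PIH is true so that the restrictions above hold true. Substituting in equation (100) for $\Delta y_t$ and $s_t$ we have
\begin{align}
\Delta c_t &= \underbrace{\zeta_{11}\Delta y_{t-1} + \zeta_{12}s_{t-1} + u_{1t}}_{\Delta y_t} - \underbrace{\big(\overbrace{\zeta_{11}}^{=\zeta_{21}}\Delta y_{t-1} + \overbrace{(\zeta_{12} + R)}^{=\zeta_{22}}s_{t-1} + u_{2t}\big)}_{=s_t} + R s_{t-1} \tag{101} \\
&= u_{1t} - u_{2t}. \tag{102}
\end{align}
Thus we find again that consumption follows a random walk, and the change in consumption is predictable neither by lagged saving nor the lagged income change.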
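The restrictions (90) and (91) and the random walk result (102) can be verified with a few lines of arithmetic. The sketch below builds a $2 \times 2$ matrix $A$ that satisfies the restrictions, checks equation (85) with a hand-coded matrix inverse, and then simulates the VAR to confirm that $\Delta c_t$ computed from (100) equals $u_{1t} - u_{2t}$. The coefficient values, the value of $R$, the shock distribution, and the helper name are all assumptions made for illustration, not estimates.

```python
import random

R = 1.01
z11, z12 = 0.26, -0.30            # hypothetical income-equation coefficients
A = [[z11, z12],
     [z11, z12 + R]]              # second row imposes zeta21 = zeta11, zeta22 = zeta12 + R


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Check eq. (85): e2' = -e1' [ (I - A/R)^{-1} - I ]
I_minus_AR = [[1 - A[0][0] / R, -A[0][1] / R],
              [-A[1][0] / R, 1 - A[1][1] / R]]
M = inv2(I_minus_AR)
rhs_85 = [-(M[0][0] - 1.0), -M[0][1]]      # should equal e2' = (0, 1)

# Simulate the VAR and compare Delta c_t from eq. (100) with u1 - u2
rng = random.Random(0)
dy_prev, s_prev = 0.0, 0.0
max_gap = 0.0
for _ in range(200):
    u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)
    dy = A[0][0] * dy_prev + A[0][1] * s_prev + u1
    s = A[1][0] * dy_prev + A[1][1] * s_prev + u2
    dc = dy - s + R * s_prev               # eq. (100)
    max_gap = max(max_gap, abs(dc - (u1 - u2)))
    dy_prev, s_prev = dy, s

print(rhs_85, max_gap)
```

The negative value chosen for $\zeta_{12}$ reflects the idea that high saving signals expected income declines; it also keeps the eigenvalues of $A/R$ inside the unit circle for these particular numbers, so the matrix geometric sum in (83) converges.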
Now consider the change in permanent income:
\begin{align}
\Delta y^p_t &= \sum_{k=0}^{\infty} R^{-k}(E_t - E_{t-1})\Delta y_{t+k} \tag{103} \\
&= e_1' \sum_{k=0}^{\infty} (A/R)^k u_t \tag{104} \\
&= e_1'(I - A/R)^{-1}u_t. \tag{105}
\end{align}
But from equation (85) we have
\begin{align}
e_2' &= -e_1'\left[(I - A/R)^{-1} - I\right] \tag{106} \\
e_1' - e_2' &= e_1'(I - A/R)^{-1}. \tag{107}
\end{align}
Substituting (107) into (105),
\begin{align}
\Delta y^p_t &= (e_1' - e_2')u_t \tag{108} \\
&= \left(\begin{pmatrix} 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \end{pmatrix}\right)u_t \tag{109} \\
&= \begin{pmatrix} 1 & -1 \end{pmatrix}u_t \tag{110} \\
&= u_{1t} - u_{2t} \tag{111}
\end{align}
and thus the change in permanent income is equal to the change in consumption.

Suppose now that the restrictions imposed by the CEQ PIH model are not true. In particular, suppose that in the saving equation the coefficient on lagged income growth $\zeta_{21}$ is not $\zeta_{11}$ as it should be but instead $\zeta_{11} - \mu$. Then the same substitutions used to derive $\Delta c_t$ above now lead to the result that
\[ \Delta c_t = \mu\,\Delta y_{t-1} + u_{1t} - u_{2t}. \tag{112} \]
Thus, if $\zeta_{21}$ differs at all from the value it should take, it must be true that consumption today exhibits ‘excess sensitivity’ to lagged income growth. Excess sensitivity and excess smoothness are therefore not different phenomena, but two ways of looking at the same phenomenon.

References

Campbell, John Y. (1987): “Does Saving Anticipate Declining Labor Income? An Alternative Test of the Permanent Income Hypothesis,” Econometrica, 55, 1249–73.

Campbell, John Y., and Angus S. Deaton (1989): “Why Is Consumption So Smooth?,” Review of Economic Studies, 56, 357–74, Available at http://ideas.repec.org/a/bla/restud/v56y1989i3p357-73.html.

Campbell, John Y., and N. Gregory Mankiw (1989): “Consumption, Income, and Interest Rates: Reinterpreting the Time-Series Evidence,” in NBER Macroeconomics Annual, 1989, ed. by Olivier J. Blanchard, and Stanley Fischer, pp. 185–216. MIT Press, Cambridge, MA, Available at http://ideas.repec.org/p/nbr/nberwo/2924.html.

Carroll, Christopher D. (1997): “Buffer Stock Saving and the Life Cycle/Permanent Income Hypothesis,” Quarterly Journal of Economics, CXII(1), 1–56, Available at http://econ.jhu.edu/people/ccarroll/BSLCPIH.zip.

Deaton, Angus S. (1992): Understanding Consumption. Oxford University Press, New York.

Flavin, Marjorie B. (1981): “The Adjustment of Consumption to Changing Expectations About Future Income,” Journal of Political Economy, 89, 974–1009, Available at http://ideas.repec.org/a/ucp/jpolec/v89y1981i5p974-1009.html.

Friedman, Milton A. (1957): A Theory of the Consumption Function. Princeton University Press.

Hall, Robert E. (1978): “Stochastic Implications of the Life Cycle-Permanent Income Hypothesis: Theory and Evidence,” Journal of Political Economy, 86, 971–87, Available at http://www.stanford.edu/~rehall/Stochastic-JPE-Dec-1978.pdf.

Working, Holbrook (1960): “Note on the Correlation of First Differences of Averages in a Random Chain,” Econometrica, 28(4), 916–18.

\[ \Delta c_t = \underset{(6.83)}{10.63} + \underset{(2.18)}{0.174}\,E_{t-2}[\Delta y_{t-1}]. \]