Financial Econometrics
Topic 5: Cointegration
Dr. Aidil Rizal Shahrin
May 1, 2021
Department of Banking & Finance, Faculty of Business and Accountancy

Contents
1. Present Value Model
   1.1 Econometric Model
2. Cointegration
   2.1 Two-step Engle-Granger
3. Cointegration Tests for Structural Change
   3.1 Gregory-Hansen Tests
4. Error Correction Models (ECM)
   4.1 The Long-Run Compositions
   4.2 The Short-Run Compositions
5. Vector ECMs
   5.1 Granger Representation Theorem
   5.2 Relationship between VAR and VECM
   5.3 Estimating the VECM
   5.4 Johansen Tests for Cointegration

Present Value Model

i. Let $P_t$ be the price of a share measured at the end of the period (ex-dividend price), i.e., the purchase of the stock this period entitles you to a claim to next period's dividend per share $D_{t+1}$.
ii. The net simple stock return (end-of-period observations) of the stock held from time $t$ to time $t+1$ is:
$$R_{t+1} = \frac{P_{t+1} + D_{t+1}}{P_t} - 1 \tag{1}$$
iii. Assume that the expected return is constant:
$$E_t[R_{t+1}] = r \tag{2}$$
iv. Rearrange Eq.1 to get a model of the stock price:
$$P_t = E_t\left[\frac{P_{t+1} + D_{t+1}}{1+r}\right] \tag{3}$$
v. Using the law of iterated expectations, i.e.,
$$E_t[E_{t+1}[X]] = E_t[X] \tag{4}$$
we can solve Eq.3 by repeatedly substituting out future prices.
vi. Solving $K$ periods forward we have:
$$P_t = E_t\left[\sum_{i=1}^{K}\left(\frac{1}{1+r}\right)^{i} D_{t+i}\right] + E_t\left[\left(\frac{1}{1+r}\right)^{K} P_{t+K}\right] \tag{5}$$
vii. The first term on the RHS of Eq.5 is the expected discounted present value (DPV) of dividends, while the second term is the expected discounted value of the stock price $K$ periods from now.
viii. Assuming that the second term goes to zero as $K$ increases (i.e., the transversality condition)
$$\lim_{K\to\infty} E_t\left[\left(\frac{1}{1+r}\right)^{K} P_{t+K}\right] = 0 \tag{6}$$
we obtain a model which defines the stock price as the expected present value of future dividends out to infinity, discounted at a constant rate:
$$P_t = P_{Dt} = E_t\left[\sum_{i=1}^{\infty}\left(\frac{1}{1+r}\right)^{i} D_{t+i}\right] \tag{7}$$
ix.
If dividends are assumed to grow at a constant rate $g$, then
$$E_t[D_{t+i}] = (1+g)E_t[D_{t+i-1}] = (1+g)^{i} D_t \tag{8}$$
x. Hence the stock price model with constant discount rate $r$ and dividend growth rate $g$, where $g < r$ (for a finite stock price), is
$$P_t = \frac{E_t[D_{t+1}]}{r-g} = \frac{(1+g)D_t}{r-g} \tag{9}$$
which is also known as the Gordon growth model.

Econometric Model

i. To test a model of the type described by Eq.8 and Eq.9, consider a system of equations which includes a linear equation between price and dividends,
$$P_t = \beta_1 D_t + u_t \tag{10}$$
and a data-generating AR(1) process for dividends,
$$D_t = \beta_2 D_{t-1} + w_t \tag{11}$$
where the errors $u_t$ and $w_t$ are white noise processes.
ii. Tests of the time-series properties of $D_t$ and $P_t$ (see Topic 2) show that they are nonstationary variables, so Eq.10 should be tested and estimated using the econometric techniques of cointegration.

Cointegration

i. Cointegration captures the idea of a long-run dynamic economic model such that
   a. Each variable is nonstationary
   b. But they do not drift too far apart from each other
ii. Reasons for understanding cointegration
   a. There is strong evidence that financial time series are nonstationary.
   b. For noncointegrated systems, it can be shown that t-tests and F-tests are oversized. This means that the null hypothesis that a parameter is zero is rejected too often even when the null hypothesis is true (Type I error). For this reason, a regression in a noncointegrated system is referred to as a spurious regression.
   c. A cointegrated system can be decomposed into a long-run (fundamental) part and a short-run (dynamic) part.
iii. Variables $(Y_t, X_t)$ are said to be cointegrated if
   a. $Y_t \sim I(1)$ and $X_t \sim I(1)$, and
   b. The residuals $u_t$ from the model $Y_t = \beta X_t + u_t$ are stationary, i.e., $u_t \sim I(0)$
   c. The cointegrating vector is $(1, -\beta)$.
   d.
More generally, if $Y_t$ and $X_t$ are $I(d)$ and $Y_t - \beta X_t$ is $I(d-b)$ with $d \ge b > 0$, then $Y_t$ and $X_t$ are cointegrated.
   e. The cointegrating equation can be decomposed into the
      • Long-run: $\beta X_t$
      • Short-run: $u_t$

Two-step Engle-Granger

i. According to the present value model above, the relationship between share price ($P_t$) and dividends ($D_t$) is
$$P_t = \beta D_t + u_t$$
where $u_t$ is a disturbance term.
ii. We have established that the two series are nonstationary variables, i.e., they are $I(1)$ variables.
iii. However, if the theory holds, then the two series are not expected to drift too far apart. That is, the error term $u_t$ is $I(0)$.
iv. If this holds, then $P_t$ and $D_t$ are said to be cointegrated.
v. A natural way to test this is to apply the two-step approach of Engle-Granger (1987):
   • Step 1: Perform an OLS regression of the cointegrating equation. Note that although $\hat{\beta}$ is superconsistent, the standard errors (and hence the t-statistics) are not correct except when the residuals are white noise.
   • Step 2: Test the stationarity of the regression residuals (always regress $\Delta\hat{u}_t$ on $\hat{u}_{t-1}$, plus lags of $\Delta\hat{u}_t$, without a constant term). The critical values for the Dickey-Fuller t-statistics when applied to residuals from a spurious regression are attached. (Note: Case 1 is when the cointegrating equation (CE) contains no constant term, Case 2 is when the CE includes a constant term, and Case 3 is when the CE includes a constant and a time trend.)
   If the residuals are $I(1)$, $(P_t, D_t)$ are said to be spuriously related.
   If the residuals are $I(0)$, $(P_t, D_t)$ are said to be cointegrated.

Cointegration Tests for Structural Change

Gregory-Hansen Tests

i. The cointegrating relationship between two variables $\{Y, X\}$ with no structural change is typically
$$Y_t = \mu + \beta X_t + \varepsilon_t$$
The standard residual-based test for cointegration is an ADF test on the OLS residuals.
ii.
Cointegration tests can also be conducted in the presence of deterministic structural breaks.
iii. This is achieved by augmenting the standard cointegrating equation between $Y$ and $X$ with dummy variables.
iv. Both intercept and slope dummies can be included.
v. The Gregory-Hansen (GH) cointegration tests are also ADF tests on the residuals, but they are associated with cointegrating models with a structural break.
vi. They are designed to test the null of no cointegration against the alternative of cointegration in the presence of a structural change:
   $H_0$: no cointegration
   $H_1$: cointegration with a one-time regime shift
vii. There are three tests. For convenience define a dummy variable (note that in this section $D_t$ denotes the break dummy, not dividends)
$$D_t = \begin{cases} 0, & \text{if } t \le \lambda T \\ 1, & \text{otherwise} \end{cases}$$
   • GH(1): the cointegrating equation allows for a mean shift:
$$Y_t = \mu_0 + \mu_1 D_t + \beta_1 X_t + \varepsilon_{1t}$$
   • GH(2): the cointegrating equation allows for a mean shift with trend:
$$Y_t = \mu_0 + \mu_2 D_t + \beta_1 X_t + \delta t + \varepsilon_{2t}$$
   • GH(3): the cointegrating equation allows for a regime shift, i.e., a shift in the mean and slope coefficients:
$$Y_t = \mu_0 + \mu_3 D_t + \beta_1 X_t + \beta_2 D_t X_t + \varepsilon_{3t}$$
viii. The approach is to compute the usual t-statistic associated with an ADF regression on the residuals obtained from estimating the augmented cointegrating equation over all possible breakpoints, and to choose the smallest value.
ix. As with the unit root case, new tables of critical values are needed for conducting cointegration tests. These are given in Gregory, A. W. and Hansen, B. E. (1996).
x. The critical values are given in the table below ($m$ is the number of regressors excluding constant and/or trend terms).

[Table: Gregory-Hansen critical values]

Error Correction Models (ECM)

i. The cointegrating relationship between the variables $Y_t$ and $X_t$ is a long-run relationship.
ii. However, unless adjustment is instantaneous, $Y_t$ and $X_t$ are more likely to be related dynamically.
iii.
A general dynamic economic model (where $u_t$ is a disturbance term) is as follows:
$$Y_t = \rho Y_{t-1} + \phi_0 X_t + \phi_1 X_{t-1} + u_t \tag{12}$$
iv. Eq.12 can be rearranged into an error correction representation (ECM hereafter) and shown to contain the long-run cointegrated relationship between $Y_t$ and $X_t$ as well as other short-run components.
v. To show the relationship between an ECM and cointegration, subtract $Y_{t-1}$ from both sides of Eq.12:
$$Y_t - Y_{t-1} = (\rho - 1)Y_{t-1} + \phi_0 X_t + \phi_1 X_{t-1} + u_t$$
Add and subtract $\phi_0 X_{t-1}$ on the right-hand side:
$$Y_t - Y_{t-1} = (\rho - 1)Y_{t-1} + \phi_0 (X_t - X_{t-1}) + (\phi_0 + \phi_1)X_{t-1} + u_t$$
Rearrange the RHS:
$$\Delta Y_t = \alpha(Y_{t-1} - \beta X_{t-1}) + \phi_0 \Delta X_t + u_t \tag{13}$$
where
$$\beta = \frac{\phi_0 + \phi_1}{1 - \rho} \tag{14}$$
$$\alpha = \rho - 1 \tag{15}$$
vi. Eq.13 is a single-equation error-correction model with long-run parameter $\beta$ and error-correction parameter $\alpha$.

The Long-Run Compositions

i. The long-run is given by setting
   a. $Y_t = Y_{t-1} = Y$
   b. $X_t = X_{t-1} = X$
   c. $u_t = 0$
ii. Substituting these restrictions into the dynamic model Eq.12 and rearranging gives the long-run relationship between $Y_t$ and $X_t$ as
$$Y = \frac{\phi_0 + \phi_1}{1 - \rho} X$$
iii. Notice that the coefficient on $X$, namely the long-run multiplier, is equal to $\beta$ in Eq.13.

The Short-Run Compositions

i. The short-run movements in $Y_t$ are represented by $Y_t - Y_{t-1}$.
ii. These movements can be decomposed into two types of adjustments:
   a. $\phi_0 (X_t - X_{t-1})$, which occurs because of (short-run) movements in $X_t$
   b. $\alpha(Y_{t-1} - \beta X_{t-1})$, which occurs when the variables are not in equilibrium
iii. This decomposition explains the name given to the model (ECM).
iv.
In particular, the second type of adjustment is known as the error correction term: if $Y_t$ drifts above its long-run value, thus making $Y_{t-1} - \beta X_{t-1}$ positive, then provided that $\alpha$ is negative and $0 < |\alpha| < 1$, the overall effect is to slow down, or even decrease, $Y_t$.
v. This forces $Y_t$ back to its long-run position, that is, it error-corrects.

Vector ECMs

i. The analysis above is based on a single-equation representation of the ECM.
ii. But that is only appropriate if dividends are (strongly) exogenous (uni-directional causality).
iii. In general, for modelling economic systems which exhibit long-run relationships, it is more appropriate to estimate a multivariate ECM (that is, to assume feedback causality).
iv. To move from the univariate to the multivariate case, we need to understand
   • the Granger Representation Theorem
   • the relationship between a VAR with nonstationary variables and its VECM representation
   • Johansen's test for the number of cointegrating relationship(s)
   • estimating a VECM

Granger Representation Theorem

i. Suppose that $Y_t$ and $X_t$ are $I(1)$ and cointegrated with vector $(1, -\beta)$:
$$u_t = Y_t - \beta X_t$$
ii. The Granger representation theorem states that $Y_t$ and $X_t$ can be expressed as an error-correction model (ECM) such that each equation contains the "common" long-run component $(Y_{t-1} - \beta X_{t-1})$ plus other short-run terms:
$$\Delta Y_t = \alpha_1 (Y_{t-1} - \beta X_{t-1}) + \text{lags}(\Delta Y_t, \Delta X_t) + \varepsilon_{1,t}$$
$$\Delta X_t = \alpha_2 (Y_{t-1} - \beta X_{t-1}) + \text{lags}(\Delta Y_t, \Delta X_t) + \varepsilon_{2,t}$$
   • Constants, time trends and dummies can be added as required.
   • Cointegration requires that at least one of the $\alpha$'s be non-zero. If $\alpha_1 = \alpha_2 = 0$ then the system is not cointegrated.
   • The signs of the $\alpha$'s are important to capture adjustments in the right direction (i.e., mean reversion).
   • The size gives the speed of adjustment; $|\alpha|$ close to zero (one) implies slow (fast) adjustment.
iii.
Consider again the econometric specification of the PV model:
$$P_t = \beta_1 D_t + u_t \tag{16}$$
$$D_t = \beta_2 D_{t-1} + \omega_t \tag{17}$$
This set of equations implies that:
   • $D$ has a data-generating process (DGP) exogenous to $P$
   • $D$ causes $P$, and
   • $P$ and $D$ are cointegrated with cointegrating vector $(1, -\beta_1)$
iv. To write Eq.16 and Eq.17 in the ECM form (ignoring other lags) but assuming bilateral causality, manipulate the equations as follows:
$$\Delta P_t = -P_{t-1} + \beta_1 D_{t-1} + \beta_1 (D_t - D_{t-1}) + u_t$$
$$\Delta D_t = \beta_2 D_{t-1} - D_{t-1} + \omega_t$$
Now rearrange terms to factor out the common cointegration component:
$$\Delta P_t = \alpha_1 (P_{t-1} - \beta D_{t-1}) + \delta_1 \Delta D_t + \varepsilon_{1t}$$
$$\Delta D_t = \alpha_2 (P_{t-1} - \beta D_{t-1}) + \delta_2 \Delta P_t + \varepsilon_{2t}$$
v. Note that the significance of the error-correction coefficients $(\alpha_1, \alpha_2)$ is informative about causality and (weak) exogeneity.
vi. For our case study, if Eq.16 and Eq.17 are "true" representations of behaviour in the stock market (that is, $D$ is weakly exogenous), we expect to find $\alpha_1$ to be negative and significant and $\alpha_2$ to be not significantly different from zero.
vii. If, on the other hand, $P$ and $D$ are both endogenously determined, we expect to find $\alpha_1$ to be negative and significant and $\alpha_2$ to be positive and significant.
viii. Both $\alpha$'s must be between 0 and 1 in absolute terms.

Relationship between VAR and VECM

i. To derive the multivariate ECM, it is convenient to start with a VAR:
$$y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + \varepsilon_t$$
   • $y_t$ is a vector of endogenous variables
   • $p$ is the lag length
   • $A_1, \ldots, A_p$ are matrices of coefficients to be estimated
   • $\varepsilon_t$ is a vector of disturbance terms that may be contemporaneously correlated, but are not
      ♦ serially correlated
      ♦ or correlated with $y_{t-1}, \ldots, y_{t-p}$
ii.
The VAR can be written as a vector error correction model:
$$y_t - y_{t-1} = \Pi y_{t-1} + C_1 (y_{t-1} - y_{t-2}) + \cdots + C_{p-1} (y_{t-p+1} - y_{t-p}) + \varepsilon_t$$
$$\Delta y_t = \Pi y_{t-1} + \sum_{i=1}^{p-1} C_i \Delta y_{t-i} + \varepsilon_t$$
   • $C_1, \ldots, C_{p-1}$ are matrices which are functions of the $A_i$ matrices of the VAR
   • $\Pi$ is a matrix which contains the
      ♦ cointegrating vectors
      ♦ error correction coefficients
   • Note that the VECM has one less lag in the $\Delta y_t$'s than there were in the $y_t$'s from the original VAR.
iii. Consider a bivariate set $\{Y_1, Y_2\}$ of nonstationary variables. Assume that the true model contains data-generating processes as follows:
$$Y_{1,t} = 0.4 Y_{1,t-1} + 0.3 Y_{2,t-1} + 0.24 Y_{2,t-2} + \varepsilon_{1t}$$
$$Y_{2,t} = Y_{2,t-1} + \varepsilon_{2t}$$
iv. The VAR form (note, here with nonstationary variables) is:
$$\begin{bmatrix} Y_{1,t} \\ Y_{2,t} \end{bmatrix} = \begin{bmatrix} 0.4 & 0.3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} Y_{1,t-1} \\ Y_{2,t-1} \end{bmatrix} + \begin{bmatrix} 0 & 0.24 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} Y_{1,t-2} \\ Y_{2,t-2} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \tag{18}$$
Note also that this is a VAR with lag order 2.
v. The VAR of nonstationary variables can be transformed into a VECM as follows (note that we are assuming cointegration between $\{Y_1, Y_2\}$):
   • First subtract the first-order lag from both sides:
$$\Delta Y_{1,t} = (0.4 - 1) Y_{1,t-1} + 0.3 Y_{2,t-1} + 0.24 Y_{2,t-2} + \varepsilon_{1t}$$
$$\Delta Y_{2,t} = (1 - 1) Y_{2,t-1} + \varepsilon_{2t}$$
   • Add and subtract $0.24 Y_{2,t-1}$ in the first equation:
$$\Delta Y_{1,t} = -0.6 Y_{1,t-1} + (0.3 + 0.24) Y_{2,t-1} - 0.24 (Y_{2,t-1} - Y_{2,t-2}) + \varepsilon_{1t}$$
$$\Delta Y_{2,t} = \varepsilon_{2t}$$
   • Rearrange terms in ECM form:
$$\Delta Y_{1,t} = -0.6 (Y_{1,t-1} - 0.9 Y_{2,t-1}) - 0.24 \Delta Y_{2,t-1} + \varepsilon_{1t}$$
$$\Delta Y_{2,t} = \varepsilon_{2t}$$
   • Write in matrix form as:
$$\begin{bmatrix} \Delta Y_{1,t} \\ \Delta Y_{2,t} \end{bmatrix} = \begin{bmatrix} -0.6 & -0.6 \times (-0.9) \\ 0 & 0 \end{bmatrix} \begin{bmatrix} Y_{1,t-1} \\ Y_{2,t-1} \end{bmatrix} + \begin{bmatrix} 0 & -0.24 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \Delta Y_{1,t-1} \\ \Delta Y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \tag{19}$$
   Notice that the VAR in Eq.18 has lag order 2 while the VECM in Eq.19 has one lag in $\Delta$.
   • Factorise out the "common" long-run relationship:
$$\begin{bmatrix} \Delta Y_{1,t} \\ \Delta Y_{2,t} \end{bmatrix} = \begin{bmatrix} -0.6 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & -0.9 \end{bmatrix} \begin{bmatrix} Y_{1,t-1} \\ Y_{2,t-1} \end{bmatrix} + \begin{bmatrix} 0 & -0.24 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \Delta Y_{1,t-1} \\ \Delta Y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}$$
   • Note that this is a VECM with long-run vector $\begin{bmatrix} 1 & -0.9 \end{bmatrix}$ and error-correction vector $\begin{bmatrix} -0.6 \\ 0 \end{bmatrix}$. Note too that the VAR of the stationary variables $\begin{bmatrix} \Delta Y_{1,t} \\ \Delta Y_{2,t} \end{bmatrix}$ is now of order ONE.
vi. EViews allows for five different specifications of the vector error correction model, depending upon whether or not intercepts and/or time trends are included in the
   • Cointegrating equation (CE)
   • ECM, or what EViews calls the test vector autoregression (VAR)
   • In EViews, in the lag interval, if you type "1 2" in the edit field, the test VAR regresses $\Delta y_t$ on $\Delta y_{t-1}, \Delta y_{t-2}$.
   • In Gretl, the Johansen cointegration test asks for the lag order $p$ of the VAR($p$) in levels. Thus, if the VECM has 3 lags, set the lag order to 4 in the lag order window.
   a. Case 1
      • No intercept and no trend in CE
      • No intercept and no trend in test VAR
$$Y_{1,t} - Y_{1,t-1} = \alpha_1 (Y_{1,t-1} - \beta Y_{2,t-1}) + \varepsilon_{1,t}$$
$$Y_{2,t} - Y_{2,t-1} = \alpha_2 (Y_{1,t-1} - \beta Y_{2,t-1}) + \varepsilon_{2,t}$$
      In this example, the number of lags in the test VAR is $N = 0$, and
$$\Pi = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} \begin{pmatrix} 1 & -\beta \end{pmatrix}$$
   b. Case 2
      • Intercept and no trend in CE
      • No intercept and no trend in test VAR
$$Y_{1,t} - Y_{1,t-1} = \alpha_1 (Y_{1,t-1} - \beta Y_{2,t-1} - \mu) + \varepsilon_{1,t}$$
$$Y_{2,t} - Y_{2,t-1} = \alpha_2 (Y_{1,t-1} - \beta Y_{2,t-1} - \mu) + \varepsilon_{2,t}$$
   c. Case 3
      • Intercept and no trend in CE
      • Intercept and no trend in test VAR
$$Y_{1,t} - Y_{1,t-1} = \delta_1 + \alpha_1 (Y_{1,t-1} - \beta Y_{2,t-1} - \mu) + \varepsilon_{1,t}$$
$$Y_{2,t} - Y_{2,t-1} = \delta_2 + \alpha_2 (Y_{1,t-1} - \beta Y_{2,t-1} - \mu) + \varepsilon_{2,t}$$
   d. Case 4
      • Intercept and trend in CE
      • Intercept and no trend in test VAR
$$Y_{1,t} - Y_{1,t-1} = \delta_1 + \alpha_1 (Y_{1,t-1} - \beta Y_{2,t-1} - \mu - \phi t) + \varepsilon_{1,t}$$
$$Y_{2,t} - Y_{2,t-1} = \delta_2 + \alpha_2 (Y_{1,t-1} - \beta Y_{2,t-1} - \mu - \phi t) + \varepsilon_{2,t}$$
   e. Case 5
      • Intercept and trend in CE
      • Intercept and trend in test VAR
$$Y_{1,t} - Y_{1,t-1} = \delta_1 + \theta_1 t + \alpha_1 (Y_{1,t-1} - \beta Y_{2,t-1} - \mu - \phi t) + \varepsilon_{1,t}$$
$$Y_{2,t} - Y_{2,t-1} = \delta_2 + \theta_2 t + \alpha_2 (Y_{1,t-1} - \beta Y_{2,t-1} - \mu - \phi t) + \varepsilon_{2,t}$$

Estimating the VECM

i. The above vector error correction models can be easily estimated as a system of equations with a common long-run relationship.
ii.
If the residuals are "well-behaved", the procedure also generates t-statistics which are distributed asymptotically as $N(0, 1)$.

Johansen Tests for Cointegration

i. As seen above, there is a close association between an ECM and a VAR with nonstationary time series.
ii. This has been exploited by Johansen to test for the number of cointegrating relationships for systems with more than 2 variables.
iii. To illustrate, consider a bivariate study of $(Y_t, X_t)$, both $I(1)$ variables and cointegrated with vector $(1, -\beta)$. The bivariate ECM is:
$$\Delta Y_t = \alpha_1 (Y_{t-1} - \beta X_{t-1}) + \varepsilon_{1,t}$$
$$\Delta X_t = \alpha_2 (Y_{t-1} - \beta X_{t-1}) + \varepsilon_{2,t}$$
which can be expressed as
$$\Delta y_t = \Pi y_{t-1} + \varepsilon_t$$
with
$$y_t = \begin{pmatrix} Y_t \\ X_t \end{pmatrix}, \quad \varepsilon_t = \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix}, \quad \Pi = \begin{pmatrix} \alpha_1 & -\alpha_1 \beta \\ \alpha_2 & -\alpha_2 \beta \end{pmatrix}$$
iv. The system can also be rearranged as:
$$Y_t = (1 + \alpha_1) Y_{t-1} - \alpha_1 \beta X_{t-1} + \varepsilon_{1,t}$$
$$X_t = \alpha_2 Y_{t-1} + (1 - \alpha_2 \beta) X_{t-1} + \varepsilon_{2,t}$$
v. As a bivariate VAR:
$$y_t = A y_{t-1} + \varepsilon_t, \quad A = \begin{pmatrix} 1 + \alpha_1 & -\alpha_1 \beta \\ \alpha_2 & 1 - \alpha_2 \beta \end{pmatrix}$$
Thus we see that an ECM can be viewed as a VAR with the restrictions
$$\Pi = A - I_2$$
vi. In general, a VAR($p$) can be transformed into an ECM with $(p - 1)$ lags of $\Delta y_t$:
$$y_t = \sum_{i=1}^{p} A_i y_{t-i} + \varepsilon_t$$
$$\Delta y_t = \Pi y_{t-1} + \sum_{i=1}^{p-1} C_i \Delta y_{t-i} + \varepsilon_t$$
Notice that the VAR has lag order $p$ while the VECM has $p - 1$ lags.
vii. Notice that $\Pi$ has reduced rank. It can be shown that it is the reduced rank property that generates the nonstationarity and cointegration. In general, for $k$ variables, the rank of $\Pi$ determines how many cointegrating vectors there are:
   • rank($\Pi$) $= 0$: then $y_t \sim I(1)$ but not cointegrated
   • rank($\Pi$) $= r$, $0 < r < k$: then $y_t \sim I(1)$ with $r$ linearly independent cointegrating vectors
   • rank($\Pi$) $= k$: then $y_t \sim I(0)$
viii. If $\Pi$ has reduced rank, then $\Pi$ is the product of the error correction coefficients and the cointegrating vectors:
$$\Pi = \alpha \beta'$$
ix. There exist a number of cointegration tests.
The cointegration test presented in EViews is known as the Johansen likelihood ratio cointegration test.
x. Large values of the likelihood ratio relative to the critical value result in rejection of the null hypothesis. The cointegration test proceeds sequentially.
   • Stage 1
      $H_0$: No cointegrating vectors
      $H_1$: At least one cointegrating vector
      If the null is rejected, proceed to the next stage; otherwise stop and do not reject the null hypothesis.
   • Stage 2
      $H_0$: One cointegrating vector
      $H_1$: At least two cointegrating vectors
      If the null is rejected, proceed to the next stage; otherwise stop and do not reject the null hypothesis of one cointegrating vector.
   If the number of cointegrating vectors chosen equals the number of variables, then the variables are all $I(0)$.
xi. Although the Johansen test is the most popular test reported in the literature, many Monte Carlo studies have shown the Johansen test to be very sensitive to misspecification of the lag length and to the sample size.
xii. The Engle-Granger single-equation test is more robust to misspecification. In practice, one should report a number of tests.
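
As a numerical check on the VAR-to-VECM mapping and the rank condition on $\Pi$ discussed above, the following is a minimal sketch using only the Python standard library. It takes the coefficient matrices of the bivariate VAR(2) example (Eq.18), forms $\Pi = A_1 + A_2 - I$ and $C_1 = -A_2$ (the standard mapping for a VAR(2)), and verifies that $\Pi$ has rank 1 and factorises as $\alpha\beta'$. The helper names (`var2_to_vecm`, `rank_2x2`, `outer`) are illustrative assumptions, not part of EViews, Gretl or any library, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction


def mat_add(a, b):
    """Element-wise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_neg(a):
    """Element-wise negation of a matrix."""
    return [[-x for x in row] for row in a]


def identity(n):
    """n x n identity matrix with exact Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def var2_to_vecm(a1, a2):
    """Map VAR(2) coefficients to VECM form: Pi = A1 + A2 - I, C1 = -A2."""
    pi = mat_add(mat_add(a1, a2), mat_neg(identity(len(a1))))
    c1 = mat_neg(a2)
    return pi, c1


def rank_2x2(m):
    """Rank of a 2x2 matrix with exact entries (hypothetical helper)."""
    if all(x == 0 for row in m for x in row):
        return 0
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return 1 if det == 0 else 2


def outer(alpha, beta):
    """Outer product alpha * beta' of two vectors."""
    return [[a * b for b in beta] for a in alpha]


# Coefficient matrices of the VAR(2) in Eq.18.
A1 = [[Fraction("0.4"), Fraction("0.3")], [Fraction(0), Fraction(1)]]
A2 = [[Fraction(0), Fraction("0.24")], [Fraction(0), Fraction(0)]]

PI, C1 = var2_to_vecm(A1, A2)

# Error-correction vector and long-run (cointegrating) vector from the slides.
ALPHA = [Fraction("-0.6"), Fraction(0)]
BETA = [Fraction(1), Fraction("-0.9")]

print("Pi =", [[float(x) for x in row] for row in PI])
print("C1 =", [[float(x) for x in row] for row in C1])
print("rank(Pi) =", rank_2x2(PI))
print("Pi equals alpha * beta':", outer(ALPHA, BETA) == PI)
```

A rank of 1 with $k = 2$ variables corresponds to the case $0 < r < k$ above, i.e., one cointegrating vector. With estimated rather than known coefficients the rank cannot be read off exactly, which is why a formal test such as Johansen's is needed.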