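VAR Models

For a set of n time series variables $y_t = (y_{1t}, y_{2t}, \dots, y_{nt})'$, a VAR model of order p (VAR(p)) can be written as:

$$y_t = A_1 y_{t-1} + A_2 y_{t-2} + \dots + A_p y_{t-p} + u_t \qquad (1)$$

where the $A_i$'s are (n x n) coefficient matrices and $u_t = (u_{1t}, u_{2t}, \dots, u_{nt})'$ is an unobservable i.i.d. zero-mean error term.

VAR Analysis (Enders Chapter 5)

Consider a two-variable VAR(1) with k=2.

$$y_t = b_{10} - b_{12} z_t + c_{11} y_{t-1} + c_{12} z_{t-1} + \varepsilon_{yt} \qquad (1)$$
$$z_t = b_{20} - b_{21} y_t + c_{21} y_{t-1} + c_{22} z_{t-1} + \varepsilon_{zt} \qquad (2)$$

with $\varepsilon_{it} \sim$ i.i.d. $(0, \sigma_i^2)$ for $i = y, z$, and $cov(\varepsilon_y, \varepsilon_z) = 0$.

In matrix form:

$$\begin{pmatrix} 1 & b_{12} \\ b_{21} & 1 \end{pmatrix} \begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} b_{10} \\ b_{20} \end{pmatrix} + \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \begin{pmatrix} y_{t-1} \\ z_{t-1} \end{pmatrix} + \begin{pmatrix} \varepsilon_{yt} \\ \varepsilon_{zt} \end{pmatrix} \qquad (3)$$

More simply:

$$B X_t = \Gamma_0 + \Gamma_1 X_{t-1} + \varepsilon_t \qquad (4)$$

This is the Structural VAR (SVAR), or the Primitive System.

To normalize the LHS vector, we premultiply the equation by the inverse of B:

$$B^{-1} B X_t = B^{-1}\Gamma_0 + B^{-1}\Gamma_1 X_{t-1} + B^{-1}\varepsilon_t$$

thus:

$$X_t = A_0 + A_1 X_{t-1} + e_t \qquad (5)$$

This is the VAR in standard form (unstructured VAR = UVAR), or:

$$\begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} a_{10} \\ a_{20} \end{pmatrix} + \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_{t-1} \\ z_{t-1} \end{pmatrix} + \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} \qquad (6)$$

These error terms are composites of the structural innovations from the primitive system. What are their characteristics/moments?

$$e_t = B^{-1}\varepsilon_t \quad \text{where} \quad B^{-1} = \frac{1}{|B|}(B^*)^T = \frac{1}{1 - b_{21}b_{12}} \begin{pmatrix} 1 & -b_{12} \\ -b_{21} & 1 \end{pmatrix}$$

$B^*$ = cofactor matrix of B and $(B^*)^T$ = its transpose.

Thus

$$\begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} = \frac{1}{1 - b_{21}b_{12}} \begin{pmatrix} 1 & -b_{12} \\ -b_{21} & 1 \end{pmatrix} \begin{pmatrix} \varepsilon_{yt} \\ \varepsilon_{zt} \end{pmatrix} \qquad (7)$$

Or

$$e_{1t} = \frac{\varepsilon_{yt} - b_{12}\varepsilon_{zt}}{\Delta}, \qquad e_{2t} = \frac{\varepsilon_{zt} - b_{21}\varepsilon_{yt}}{\Delta}, \qquad \text{where } \Delta = 1 - b_{21}b_{12}.$$

The $\varepsilon$'s are white noise, thus the e's are $(0, \sigma_i^2)$, $i = 1, 2$:

$$E(e_{it}) = 0$$
$$Var(e_{1t}) = E(e_{1t}^2) = \frac{E(\varepsilon_{yt}^2 + b_{12}^2 \varepsilon_{zt}^2)}{\Delta^2} = \frac{\sigma_y^2 + b_{12}^2 \sigma_z^2}{\Delta^2}$$

which is time independent, and the same is true for $Var(e_{2t})$.

But the covariances are not zero:

$$Covar(e_{1t}, e_{2t}) = E(e_{1t}e_{2t}) = \frac{E[(\varepsilon_{yt} - b_{12}\varepsilon_{zt})(\varepsilon_{zt} - b_{21}\varepsilon_{yt})]}{\Delta^2} = -\frac{b_{12}\sigma_z^2 + b_{21}\sigma_y^2}{\Delta^2} \neq 0.$$

So the shocks in a standard VAR are correlated. In general, the only way to remove the correlation and make the covariance zero is to assume that the contemporaneous effects are zero: $b_{12} = b_{21} = 0$.

The var/covar matrix of the VAR shocks:

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix}$$

Identification

We can estimate (6) with OLS, since the RHS consists of predetermined variables and the error terms are white noise. The errors are serially uncorrelated but correlated across equations.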
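Although SUR could be used in these cases, here we do not need it: all the RHS variables are identical across equations, so there is no efficiency gain in using SUR over OLS.

But we cannot use OLS to estimate the SVAR because of the contemporaneous effects, which are correlated with the $\varepsilon$'s (structural innovations).

Our goal: to see how a structural innovation $\varepsilon_{it}$ affects the dependent variables in our original model. We estimate the reduced form (standard VAR), so how can we recover the parameters of the primitive system from the estimated system?

VAR: 9 parameters (= 6 coefficient estimates + 2 variance estimates + 1 covariance estimate).
SVAR: 10 parameters (= 8 coefficients + 2 variances).
The SVAR is underidentified.

Sims (1980) suggested using a recursive system. For this we need to restrict some of the parameters of the structural VAR. Ex: assume y is contemporaneously affected by z but not vice versa. Thus we assume that $b_{21} = 0$. In other words, y is affected by the structural innovations of both y and z, while z is affected only by its own structural innovation. This is a triangular decomposition, also called Cholesky decomposition. Then we have 9 parameter estimates and 9 unknown structural parameters, and the SVAR is exactly identified.

Now the SVAR system becomes:

$$\begin{pmatrix} 1 & b_{12} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} b_{10} \\ b_{20} \end{pmatrix} + \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \begin{pmatrix} y_{t-1} \\ z_{t-1} \end{pmatrix} + \begin{pmatrix} \varepsilon_{yt} \\ \varepsilon_{zt} \end{pmatrix} \qquad (8)$$

$$B^{-1} = \frac{1}{1 - b_{21}b_{12}} \begin{pmatrix} 1 & -b_{12} \\ -b_{21} & 1 \end{pmatrix} = \begin{pmatrix} 1 & -b_{12} \\ 0 & 1 \end{pmatrix}$$

Hence the VAR system in standard form can be written:

$$\begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} b_{10} - b_{12}b_{20} \\ b_{20} \end{pmatrix} + \begin{pmatrix} c_{11} - b_{12}c_{21} & c_{12} - b_{12}c_{22} \\ c_{21} & c_{22} \end{pmatrix} \begin{pmatrix} y_{t-1} \\ z_{t-1} \end{pmatrix} + \begin{pmatrix} \varepsilon_{yt} - b_{12}\varepsilon_{zt} \\ \varepsilon_{zt} \end{pmatrix} \qquad (8')$$

If we match the coefficients in (8') with the estimates in (6),

$$\begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} a_{10} \\ a_{20} \end{pmatrix} + \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_{t-1} \\ z_{t-1} \end{pmatrix} + \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix},$$

we can extract the coefficients of the SVAR:

$$a_{10} = b_{10} - b_{12}b_{20} \qquad a_{20} = b_{20} \qquad e_1 = \varepsilon_y - b_{12}\varepsilon_z$$
$$a_{11} = c_{11} - b_{12}c_{21} \qquad a_{21} = c_{21} \qquad e_2 = \varepsilon_z$$
$$a_{12} = c_{12} - b_{12}c_{22} \qquad a_{22} = c_{22}$$
$$Cov_{12} = -\frac{b_{12}\sigma_z^2 + b_{21}\sigma_y^2}{\Delta^2} = -b_{12}\sigma_z^2$$

Impulse response functions

We want to trace out the time path of the effect of structural shocks on the dependent variables of the model. For this, we first need to transform the VAR into a VMA representation.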
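Rewrite the UVAR (5) more compactly, using the lag operator L:

$$X_t = A_0 + A_1 X_{t-1} + e_t \;\Rightarrow\; X_t = (I - A_1 L)^{-1} A_0 + (I - A_1 L)^{-1} e_t$$

First, consider the first component on the RHS. Since $A_0$ is a constant, $L A_0 = A_0$ and

$$(I - A_1 L)^{-1} A_0 = (I - A_1)^{-1} A_0 = \begin{pmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{pmatrix}^{-1} \begin{pmatrix} a_{10} \\ a_{20} \end{pmatrix} = \frac{1}{(1 - a_{11})(1 - a_{22}) - a_{21}a_{12}} \begin{pmatrix} (1 - a_{22})a_{10} + a_{12}a_{20} \\ a_{21}a_{10} + (1 - a_{11})a_{20} \end{pmatrix} = \begin{pmatrix} \bar{y} \\ \bar{z} \end{pmatrix}$$

Stability requires that the roots of $|I - A_1 L| = 0$ lie outside the unit circle. We will assume that this is the case. Then we can write the second component as:

$$(I - A_1 L)^{-1} e_t = \sum_{i=0}^{\infty} A_1^i e_{t-i} = \sum_{i=0}^{\infty} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}^i \begin{pmatrix} e_{1,t-i} \\ e_{2,t-i} \end{pmatrix}$$

We can thus write the VAR as a VMA in the standard VAR's error terms:

$$\begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} \bar{y} \\ \bar{z} \end{pmatrix} + \sum_{i=0}^{\infty} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}^i \begin{pmatrix} e_{1,t-i} \\ e_{2,t-i} \end{pmatrix} \qquad (9)$$

But these are composite errors consisting of the structural innovations. We must thus replace the e's with the $\varepsilon$'s from (7):

$$e_t = \frac{1}{1 - b_{12}b_{21}} \begin{pmatrix} 1 & -b_{12} \\ -b_{21} & 1 \end{pmatrix} \varepsilon_t$$

$$\begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} \bar{y} \\ \bar{z} \end{pmatrix} + \sum_{i=0}^{\infty} \frac{A_1^i}{1 - b_{12}b_{21}} \begin{pmatrix} 1 & -b_{12} \\ -b_{21} & 1 \end{pmatrix} \begin{pmatrix} \varepsilon_{y,t-i} \\ \varepsilon_{z,t-i} \end{pmatrix} = \begin{pmatrix} \bar{y} \\ \bar{z} \end{pmatrix} + \sum_{i=0}^{\infty} \begin{pmatrix} \phi_{11}(i) & \phi_{12}(i) \\ \phi_{21}(i) & \phi_{22}(i) \end{pmatrix} \begin{pmatrix} \varepsilon_{y,t-i} \\ \varepsilon_{z,t-i} \end{pmatrix} \qquad (9a)$$

or, compactly, $X_t = \bar{X} + \sum_{i=0}^{\infty} \phi_i \varepsilon_{t-i}$.

Impact multipliers

They trace the impact effect of a one-unit change in a structural innovation. Ex: find the impact effect of $\varepsilon_{z,t}$ on $y_t$ and $z_t$:

$$\frac{dy_t}{d\varepsilon_{z,t}} = \phi_{12}(0) \qquad \frac{dz_t}{d\varepsilon_{z,t}} = \phi_{22}(0)$$

Let's trace the effect one period ahead, on $y_{t+1}$ and $z_{t+1}$:

$$\frac{dy_{t+1}}{d\varepsilon_{z,t}} = \phi_{12}(1) \qquad \frac{dz_{t+1}}{d\varepsilon_{z,t}} = \phi_{22}(1)$$

Note that this is the same as the effect on $y_t$ and $z_t$ of a structural innovation one period ago:

$$\frac{dy_t}{d\varepsilon_{z,t-1}} = \phi_{12}(1) \qquad \frac{dz_t}{d\varepsilon_{z,t-1}} = \phi_{22}(1)$$

Impulse response functions are the plots of the effect of $\varepsilon_{z,t}$ on current and all future y and z. IRs show how $\{y_t\}$ or $\{z_t\}$ react to different shocks.

Ex: the impulse response function of y to a one-unit change in the shock to z is $\phi_{12}(0), \phi_{12}(1), \phi_{12}(2), \dots$

Cumulated effect is the sum over the IR function: $\sum_{i=0}^{n} \phi_{12}(i)$.

Long-run cumulated effect: $\lim_{n \to \infty} \sum_{i=0}^{n} \phi_{12}(i)$.

In practice we cannot calculate these effects, since the SVAR is underidentified. So we must impose additional restrictions on the VAR to identify the impulse responses.

If we use the Cholesky decomposition and assume that y does not have a contemporaneous effect on z, then $b_{21} = 0$.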
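Thus the error structure becomes triangular:

$$\begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} = \begin{pmatrix} 1 & -b_{12} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \varepsilon_{yt} \\ \varepsilon_{zt} \end{pmatrix} \qquad (10)$$

The y shock doesn't affect z contemporaneously, but it affects it indirectly through its lagged effect in the VAR.

Causal ordering: if the z shock affects both e1 and e2, while the y shock affects e1 but not e2, then z is causally prior to y.

Example: calculate the impulse response functions of $\{y_t\}$ and $\{z_t\}$ to a unit change in the z shock ($\varepsilon_{zt}$) from an estimate of a two-variable VAR(1):

$$y_t = 0.7 y_{t-1} + 0.2 z_{t-1} + e_{1t}$$
$$z_t = 0.2 y_{t-1} + 0.7 z_{t-1} + e_{2t}$$

with $\sigma_1^2 = \sigma_2^2$ and $\rho_{12} = 0.8$.

For this, we must get the estimates of the primitive system (SVAR) from the estimated coefficients. Assume the Cholesky decomposition $b_{21} = 0$, so that $\sigma_2^2 = \sigma_z^2$ and

$$\rho_{12} = \frac{Cov_{12}}{SE_1 SE_2} = \frac{-b_{12}\sigma_z^2}{\sigma_2^2} = 0.8 \;\Rightarrow\; b_{12} = -0.8$$

Although this information is sufficient to calculate the impulse responses in this simple model, we can extract all of the coefficients of the primitive system as follows:

$a_{10} = a_{20} = 0 \Rightarrow b_{20} = 0$, and $b_{10} - b_{12}b_{20} = 0 \Rightarrow b_{10} = 0$.
$a_{22} = c_{22} = 0.7$ and $a_{21} = c_{21} = 0.2$.
From $a_{11} = 0.7 = c_{11} - b_{12}c_{21} = c_{11} + 0.8(0.2)$ we get $c_{11} = 0.54$.
From $a_{12} = 0.2 = c_{12} - b_{12}c_{22} = c_{12} + 0.8(0.7)$ we get $c_{12} = -0.36$.

Substitute $b_{12}$ into (10) to get:

$$e_{1t} = \varepsilon_{yt} + 0.8\varepsilon_{zt}$$
$$e_{2t} = \varepsilon_{zt}$$

A 1-unit $\varepsilon_{zt}$ shock is instantaneously absorbed 100% by z and 80% by y.

Impact multipliers:

$$y_t = 0.7 y_{t-1} + 0.2 z_{t-1} + \varepsilon_{yt} + 0.8\varepsilon_{zt}$$
$$z_t = 0.2 y_{t-1} + 0.7 z_{t-1} + \varepsilon_{zt} \qquad (11)$$

At t=0:

$$\frac{dy_t}{d\varepsilon_{z,t}} = 0.8 \qquad \frac{dz_t}{d\varepsilon_{z,t}} = 1$$

At t=1, forward (11) by one period:

$$\frac{dy_{t+1}}{d\varepsilon_{z,t}} = 0.7\frac{dy_t}{d\varepsilon_{z,t}} + 0.2\frac{dz_t}{d\varepsilon_{z,t}} = 0.7(0.8) + 0.2(1) = 0.76$$
$$\frac{dz_{t+1}}{d\varepsilon_{z,t}} = 0.2\frac{dy_t}{d\varepsilon_{z,t}} + 0.7\frac{dz_t}{d\varepsilon_{z,t}} = 0.2(0.8) + 0.7(1) = 0.86$$

At t=2, forward (11) by two periods:

$$\frac{dy_{t+2}}{d\varepsilon_{z,t}} = 0.7\frac{dy_{t+1}}{d\varepsilon_{z,t}} + 0.2\frac{dz_{t+1}}{d\varepsilon_{z,t}} = 0.7(0.76) + 0.2(0.86) \approx 0.70$$
$$\frac{dz_{t+2}}{d\varepsilon_{z,t}} = 0.2\frac{dy_{t+1}}{d\varepsilon_{z,t}} + 0.7\frac{dz_{t+1}}{d\varepsilon_{z,t}} = 0.2(0.76) + 0.7(0.86) \approx 0.75$$

Long-run multipliers: both variables go back to zero.

Cumulative multipliers:

$$\sum_{i=0}^{n} \frac{dy_{t+i}}{d\varepsilon_{z,t}} = 0.8 + 0.76 + 0.70 + \dots \qquad \sum_{i=0}^{n} \frac{dz_{t+i}}{d\varepsilon_{z,t}} = 1 + 0.86 + 0.75 + \dots$$

Results are ordering-dependent. If you choose the decomposition such that $b_{12} = 0$ instead of $b_{21} = 0$, you can get quite different results. One robustness check is therefore to change the ordering.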
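The recursion used in the worked example can be checked numerically. The following is a minimal sketch in plain Python, not part of the original notes: the function names (`mat_vec`, `impulse_response`, `long_run_cumulated`) are illustrative, and it assumes the example's estimates ($a_{11} = a_{22} = 0.7$, $a_{12} = a_{21} = 0.2$, $\rho_{12} = 0.8$, $\sigma_1^2 = \sigma_2^2$). It computes the responses to a unit $\varepsilon_z$ shock under the ordering used above ($b_{21} = 0$) and under the alternative ordering ($b_{12} = 0$), where the impact of the z shock on y is zero.

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def impulse_response(A1, impact, horizon):
    """Responses (dy, dz) at horizons 0..horizon to a unit structural shock.

    `impact` is the column of B^{-1} that belongs to the shock, so that
    phi(0) = impact and phi(i + 1) = A1 * phi(i) for a VAR(1).
    """
    path = [list(impact)]
    for _ in range(horizon):
        path.append(mat_vec(A1, path[-1]))
    return path


def long_run_cumulated(A1, impact):
    """Sum of all responses, (I - A1)^{-1} * impact, for a stable 2x2 VAR(1)."""
    (a11, a12), (a21, a22) = A1
    det = (1 - a11) * (1 - a22) - a12 * a21
    inverse = [[(1 - a22) / det, a12 / det],
               [a21 / det, (1 - a11) / det]]
    return mat_vec(inverse, impact)


# Estimated VAR(1) coefficients from the example in the notes.
A1 = [[0.7, 0.2],
      [0.2, 0.7]]
rho = 0.8  # residual correlation; with equal variances, b12 = -rho when b21 = 0

# Ordering used in the notes (b21 = 0): a unit z shock moves y by 0.8 and z by 1.
irf_b21_zero = impulse_response(A1, [rho, 1.0], horizon=2)

# Alternative ordering (b12 = 0): a unit z shock has no impact effect on y.
irf_b12_zero = impulse_response(A1, [0.0, 1.0], horizon=2)

long_run = long_run_cumulated(A1, [rho, 1.0])

for h, (dy, dz) in enumerate(irf_b21_zero):
    print(f"b21 = 0, horizon {h}: dy = {dy:.3f}, dz = {dz:.3f}")
for h, (dy, dz) in enumerate(irf_b12_zero):
    print(f"b12 = 0, horizon {h}: dy = {dy:.3f}, dz = {dz:.3f}")
print("long-run cumulated effect (b21 = 0):", [round(v, 3) for v in long_run])
```

Under these assumptions the first ordering gives the responses 0.8, 0.76, 0.70 for y and 1, 0.86, 0.75 for z, as in the notes. Comparing the two printed paths shows how much the ordering matters when the residual correlation is as high as 0.8.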
If the results don't change, then the estimates are robust to ordering. If the correlation between the errors is low ($\rho_{12}$ small), then changing the ordering does not make a big difference.

Eviews specification
-Residual: ignores the correlations in the VAR residuals; gives the MA coefficients of the infinite MA representation of the VAR.
-Cholesky (with and without degree-of-freedom adjustment for small-sample correction).
-Generalized impulses: Pesaran and Shin (1998) methodology. Independent of the VAR ordering. Applies a Cholesky factorization to each variable with the j-th variable at the top of the ordering.

Confidence Intervals

They help to see the degree of precision in the coefficient estimates. Eviews provides two types of calculation of the standard errors for the confidence intervals: Monte Carlo and Analytic. For Monte Carlo you need to provide the number of draws. Eviews then gives the 2-standard-error bands around the impulse responses. Note that for a VECM these confidence intervals are not available in Eviews. For those interested in programming them themselves, instructions to generate confidence bounds for SVARs are available at:
http://www.eviews.com/support/examples/docs/svar.htm#blanquah3

Variance Decomposition

It tells how much of a change in a variable is due to its own shock and how much is due to shocks to other variables. In the short run most of the variation is due to the own shock. But as the lagged variables' effect starts kicking in, the percentage of the effect of the other shocks increases over time.

To see this, consider the VMA representation of the VAR in (9a):

$$x_t = \begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} \bar{y} \\ \bar{z} \end{pmatrix} + \sum_{i=0}^{\infty} \frac{A_1^i}{1 - b_{12}b_{21}} \begin{pmatrix} 1 & -b_{12} \\ -b_{21} & 1 \end{pmatrix} \begin{pmatrix} \varepsilon_{y,t-i} \\ \varepsilon_{z,t-i} \end{pmatrix} = \begin{pmatrix} \bar{y} \\ \bar{z} \end{pmatrix} + \sum_{i=0}^{\infty} \begin{pmatrix} \phi_{11}(i) & \phi_{12}(i) \\ \phi_{21}(i) & \phi_{22}(i) \end{pmatrix} \begin{pmatrix} \varepsilon_{y,t-i} \\ \varepsilon_{z,t-i} \end{pmatrix}$$

or $x_t = \bar{X} + \sum_{i=0}^{\infty} \phi_i \varepsilon_{t-i}$.

We want to calculate the n-period forecast error of x in order to find that of, say, y. Start from 1 period:

$$x_{t+1} = \bar{X} + \phi_0 \varepsilon_{t+1} + \phi_1 \varepsilon_t + \phi_2 \varepsilon_{t-1} + \dots$$
$$E_t x_{t+1} = \bar{X} + \phi_1 \varepsilon_t + \phi_2 \varepsilon_{t-1} + \dots$$
1-period forecast error:

$$x_{t+1} - E_t x_{t+1} = \phi_0 \varepsilon_{t+1}$$

Proceed in the same way to get the 2-period forecast error:

$$x_{t+2} - E_t x_{t+2} = \phi_0 \varepsilon_{t+2} + \phi_1 \varepsilon_{t+1}$$

3-period forecast error:

$$x_{t+3} - E_t x_{t+3} = \phi_0 \varepsilon_{t+3} + \phi_1 \varepsilon_{t+2} + \phi_2 \varepsilon_{t+1}$$

n-period forecast error:

$$x_{t+n} - E_t x_{t+n} = \phi_0 \varepsilon_{t+n} + \phi_1 \varepsilon_{t+n-1} + \phi_2 \varepsilon_{t+n-2} + \dots + \phi_{n-1} \varepsilon_{t+1} = \sum_{i=0}^{n-1} \phi_i \varepsilon_{t+n-i}$$

Now consider y, the first element of the x vector. Its n-step-ahead forecast error is:

$$y_{t+n} - E_t y_{t+n} = (\phi_{11,0}\varepsilon_{y,t+n} + \phi_{11,1}\varepsilon_{y,t+n-1} + \dots + \phi_{11,n-1}\varepsilon_{y,t+1}) + (\phi_{12,0}\varepsilon_{z,t+n} + \phi_{12,1}\varepsilon_{z,t+n-1} + \dots + \phi_{12,n-1}\varepsilon_{z,t+1})$$

The variance of its n-step-ahead forecast error is:

$$\sigma_{y,n}^2 = \sigma_y^2(\phi_{11,0}^2 + \phi_{11,1}^2 + \dots + \phi_{11,n-1}^2) + \sigma_z^2(\phi_{12,0}^2 + \phi_{12,1}^2 + \dots + \phi_{12,n-1}^2)$$

Dividing each term by $\sigma_{y,n}^2$ gives the decomposition. The first term is the proportion of variance due to the own shock, which decreases over time. The second term is the proportion of variance due to a z shock, which grows over time.

If z explains none of the forecast error variance of the sequence $\{y_t\}$ at all forecast horizons (the z proportion is 0), then $\{y_t\}$ is exogenous.
If z explains most of the forecast error variance of the sequence $\{y_t\}$ at all forecast horizons (a z proportion of 0.9, for example), then $\{y_t\}$ is endogenous.

Note that exogeneity is not the same as Granger-causality. It is a concept involving the contemporaneous value of an endogenous variable and the contemporaneous error term of another variable.

The variance decomposition has the same identification problem as the impulse response functions. But if the cross-correlation of the residuals is not significant, then the ordering will not matter.

Impulse responses + Variance decomposition = innovation accounting.

Hypothesis Testing

1. Specification of the VAR model

Decide on the variables that enter the VAR: a model is needed for this. If the VAR is misspecified because of missing variables, it will create an omitted variable(s) problem, which will be reflected in serially correlated error terms.

Number of lags. We need to include the optimal number of lags. Note that increasing the number of lags does not solve the residual correlation if there are omitted variables.

Even if there are no omitted variables and we include the optimal (or a reasonable) number of lags, the residuals can still reflect a problem caused by structural breaks.
At this stage we will control for them by determining the break dates exogenously.

Determination of optimal lag length

a. LR tests

$$LR = (T - m)(\ln|\Sigma_r| - \ln|\Sigma_u|) \sim \chi^2(q) \qquad (10)$$

T = # observations (after accounting for lags).
m = # parameters estimated in each equation of the unrestricted system, including the constant.
$\ln|\Sigma_r|$ = natural log of the determinant of the covariance matrix of the residuals of the restricted system ($\ln|\Sigma_u|$ is the same for the unrestricted system).
q = total number of restrictions in the system (= # excluded lags times $n^2$), with n = # variables (or equations).

If the LR statistic > critical value, reject the null of the restricted system; otherwise do not reject it.

Eviews follows the Lutkepohl (1991) methodology in conducting a sequential LR test (adjusting for m). You start with the maximum # lags following your prior. Suppose you decide on k lags. Then you compare the determinant of the covariance matrix with k lags (the largest) with that with k-1 lags. If the LR statistic > critical value, reject the null of k-1 lags in favor of k lags. The LR test statistic then becomes:

$$LR = (T - m)(\ln|\Sigma_{k-1}| - \ln|\Sigma_k|) \sim \chi^2(q), \qquad q = n^2 \qquad (11)$$

However, if you want to compare, say, 12 lags with 8 lags, you have to calculate the test statistic yourself, using the formula in (10).

b. Information criteria

$$AIC = T \ln|\Sigma| + 2N$$
$$SBC = T \ln|\Sigma| + N \ln T$$

where N is the total number of parameters estimated in all equations. Choose the # lags that minimizes the criteria. Note that these criteria are not tests; they mainly indicate the goodness of fit of the alternatives. So they should be used as complements to the LR tests. You can also use the information criteria to compare non-sequential lag lengths.

3. Diagnostic tests of the residuals (in Eviews)

Portmanteau Autocorrelation Test (Box-Pierce / Ljung-Box Q statistics) for residual correlation.
Null hypothesis: no serial correlation up to the chosen lag.
The Q statistic is distributed $\chi^2$ with dof = $n^2(h - p)$, where n = # variables, h = max # lags chosen, p = order of the VAR.
It is not a good statistic to use if there is a quasi-unit root (it requires the high-order MA coefficients to be 0).

Autocorrelation LM Test.
Null hypothesis: no autocorrelation at lag order h.
The LM statistic is distributed $\chi^2$ with dof = $n^2$.
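The test statistics in this section all reduce to a chi-square comparison with a known number of degrees of freedom. As a minimal sketch (plain Python, not part of the original notes; all function names are illustrative), the block below evaluates formula (10), the restriction count q, the portmanteau degrees of freedom, and the two information criteria as written above. The inputs are the residual covariance determinants reported in the application section of these notes for the 12-lag and 8-lag VARs on the common sample, and the critical value 34 is the chi-square(16) value the notes use there.

```python
import math


def lr_statistic(T, m, det_restricted, det_unrestricted):
    """LR = (T - m) * (ln|Sigma_r| - ln|Sigma_u|), formula (10) in the notes."""
    return (T - m) * (math.log(det_restricted) - math.log(det_unrestricted))


def lr_restrictions(n, lags_dropped):
    """Number of restrictions q: excluded lags times n^2."""
    return lags_dropped * n ** 2


def portmanteau_df(n, h, p):
    """Degrees of freedom of the portmanteau Q statistic: n^2 * (h - p)."""
    return n ** 2 * (h - p)


def information_criteria(T, det_sigma, N):
    """AIC and SBC in the form given in the notes; N = total parameters."""
    aic = T * math.log(det_sigma) + 2 * N
    sbc = T * math.log(det_sigma) + N * math.log(T)
    return aic, sbc


# Inputs taken from the application section of the notes (common sample).
n, T = 2, 156
deterministic_terms = 4            # constant + 3 seasonal dummies per equation
m_unrestricted = deterministic_terms + n * 12
det_u = 7.41e-09                   # |Sigma| with 12 lags
det_r = 8.65e-09                   # |Sigma| with 8 lags

lr = lr_statistic(T, m_unrestricted, det_r, det_u)
q = lr_restrictions(n, lags_dropped=12 - 8)
critical_value = 34.0              # chi-square(16) value quoted in the notes
reject_restricted = lr > critical_value

aic_12, sbc_12 = information_criteria(T, det_u, n * (deterministic_terms + n * 12))
aic_8, sbc_8 = information_criteria(T, det_r, n * (deterministic_terms + n * 8))

print(f"LR = {lr:.2f} with q = {q}; reject 8 lags: {reject_restricted}")
print(f"AIC: 12 lags {aic_12:.2f}, 8 lags {aic_8:.2f}")
print(f"SBC: 12 lags {sbc_12:.2f}, 8 lags {sbc_8:.2f}")
print("portmanteau df for n=2, h=9, p=8:", portmanteau_df(2, 9, 8))
```

With these inputs the LR statistic is about 19.8, below the critical value, so the 8-lag restriction is not rejected. The criteria computed here follow the formulas above and will not match the per-observation values that Eviews prints, although the ranking of the two lag lengths can still be compared.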
Normality tests

Multivariate version of the Jarque-Bera test. It compares the 3rd and 4th moments (skewness and kurtosis) of the residuals to those of a normal distribution. You must specify a factorization of the residuals. Choices in Eviews:

Cholesky: the statistics will depend on the ordering of the variables.
Doornik and Hansen (94), inverse square root of the residual correlation matrix: invariant to the ordering of the variables and to the scale of the variables in the system.
Urzua (97), inverse square root of the residual covariance matrix: same advantages as Doornik and Hansen, and considered an improvement on it.
Factorization from SVAR (later: need to have estimated an SVAR).

4. Granger Causality

In a two-variable VAR(p), the process $\{z_t\}$ does not G-cause $\{y_t\}$ if all coefficients in $A_{12}(L)$ are zero (that is, if a joint test of $a_{12}(1) = a_{12}(2) = \dots = a_{12}(p) = 0$ is not rejected).

This concept involves the effect of past values of z on the current value of y. So it answers the question whether past and current values of z help predict the future value of y. It is different from exogeneity tests, which look at whether the current values of z explain current and future values of y.

In an n-variable VAR(p), the block-exogeneity (= block-G-causality) test looks at whether the lags of any variable G-cause any other variable in the system. You can test this using the LR test in (10).

Application

Create a bivariate VAR(1) and apply the tests to get the best specification of the model.
Workfile: ENDERSQUARTERLY.wf

-Generate the rate of growth of the money supply: m = log(M1NSA_t) - log(M1NSA_{t-1})
-Generate the rate of PPI inflation: inf = log(PPI_t) - log(PPI_{t-1})
-Generate seasonal dummies for each quarter of the year: Di = @seas(i), where i = 1, 2, 3

or run the program endersquartdummies.prg:

    ' use the full workfile sample
    smpl @all
    ' PPI inflation and M1 growth as log differences
    series inf=log(ppi)-log(ppi(-1))
    series m=log(m1nsa)-log(m1nsa(-1))
    ' one seasonal dummy per quarter (d1 to d4)
    for !j=1 to 4
      series d{!j}=@seas({!j})
    next

-Check whether m and inf are I(0).

Now we can create our bivariate VAR(1):
Endogenous variables: m, inf
Exogenous variables: constant, 3 seasonal dummies (D1, D2, D3)

Estimate an unrestricted VAR

1. Test the lag length

The sequential LR statistic indicates 5 lags. This is confirmed by the FPE and AIC criteria, while SC selects 1 lag and HQ selects 4.

View - Lag Structure - Lag Length Criteria - lags to include [8] - OK

VAR Lag Order Selection Criteria
Endogenous variables: INF M
Exogenous variables: C D1 D2 D3
Sample: 1960Q1 2002Q4
Included observations: 160

    Lag   LogL        LR           FPE          AIC           SC            HQ
    0     927.0090    NA           3.52e-08     -11.48761     -11.33385     -11.42518
    1     993.6745    128.3309     1.61e-08     -12.27093     -12.04029*    -12.17728
    2     1000.381    12.74192     1.55e-08     -12.30476     -11.99724     -12.17989
    3     1010.920    19.76184     1.43e-08     -12.38650     -12.00211     -12.23041
    4     1018.108    13.29702     1.38e-08     -12.42635     -11.96507     -12.23904*
    5     1023.710    10.22334*    1.35e-08*    -12.44637*    -11.90822     -12.22785
    6     1025.908    3.956218     1.38e-08     -12.42385     -11.80881     -12.17410
    7     1028.766    5.072764     1.40e-08     -12.40957     -11.71766     -12.12861
    8     1033.418    8.142326     1.39e-08     -12.41773     -11.64894     -12.10555

* indicates lag order selected by the criterion
LR: sequential modified LR test statistic (each test at 5% level)
FPE: Final prediction error
AIC: Akaike information criterion
SC: Schwarz information criterion
HQ: Hannan-Quinn information criterion

You may have priors and want to test for the lag length yourself using an LR test. Suppose we start with 12 lags and compare it with 8 lags. Calculate the determinant of the residual covariance matrix (Eviews gives it at the bottom of the estimation output):

$$|\hat{\Sigma}| = \det\left( \frac{1}{T - p} \sum_t \hat{\varepsilon}_t \hat{\varepsilon}_t' \right)$$

with p = # parameters per equation in the VAR.
The unadjusted determinant ignores p.

Estimate the unrestricted VAR with 12 lags:

Vector Autoregression Estimates
Date: 04/12/07 Time: 12:12
Sample (adjusted): 1963Q2 2002Q1
Included observations: 156 after adjustments
Standard errors in ( ) & t-statistics in [ ]

    Determinant resid covariance (dof adj.)    1.10E-08
    Determinant resid covariance               7.41E-09
    Log likelihood                             1017.509
    Akaike information criterion               -12.32704
    Schwarz criterion                          -11.23222

Estimate the same VAR with 8 lags:

    Determinant resid covariance (dof adj.)    1.10E-08
    Determinant resid covariance               8.41E-09
    Log likelihood                             1033.418
    Akaike information criterion               -12.41773
    Schwarz criterion                          -11.64894

To do the comparison properly, we must estimate the 8-lag VAR over the same sample as the 12-lag VAR (1963q2 2002q1):

    Determinant resid covariance (dof adj.)    1.14E-08
    Determinant resid covariance               8.65E-09
    Log likelihood                             1005.422
    Akaike information criterion               -12.37721
    Schwarz criterion                          -11.59520

Form the LR test statistic:

$$LR = (T - m)(\ln|\Sigma_r| - \ln|\Sigma_u|) \sim \chi^2(q)$$

(m = # parameters in each equation of the unrestricted system, including the constant and dummies; q = # excluded lags times $n^2$; n = # variables)

LR = [156 - (4 + (2 x 12))] x [ln(8.65E-09) - ln(7.41E-09)] = 19.80 < chisqr((12-8) x 4) = chisqr(16) = 34 (the 0.5% critical value)

Do not reject the null of 8 lags.

2. Test the significance of the dummies using the same LR test.

3. Diagnostic tests of the residuals

View - Residual Tests

Portmanteau test:

VAR Residual Portmanteau Tests for Autocorrelations
H0: no residual autocorrelations up to lag h
Sample: 1963Q2 2002Q4
Included observations: 156

    Lags   Q-Stat      Prob.     Adj Q-Stat   Prob.     df
    1      0.033361    NA*       0.033577     NA*       NA*
    2      0.316277    NA*       0.320167     NA*       NA*
    3      1.186238    NA*       1.207185     NA*       NA*
    4      1.759066    NA*       1.795088     NA*       NA*
    5      3.215159    NA*       3.299396     NA*       NA*
    6      3.736958    NA*       3.842067     NA*       NA*
    7      4.377997    NA*       4.513222     NA*       NA*
    8      5.299534    NA*       5.484572     NA*       NA*
    9      5.875070    0.2087    6.095345     0.1921    4
    10     9.854399    0.2754    10.34723     0.2415    8
    11     17.23996    0.1408    18.29308     0.1071    12
    12     18.81050    0.2786    19.99450     0.2205    16

*The test is valid only for lags larger than the VAR lag order.
df is degrees of freedom for (approximate) chi-square distribution
Critical values for comparison: chisqr(4) = 14.86, chisqr(8) = 21.9.

We do not reject the null of no residual autocorrelation.

LM test

VAR Residual Serial Correlation LM Tests
H0: no serial correlation at lag order h
Sample: 1963Q2 2002Q4
Included observations: 156

    Lags   LM-Stat     Prob
    1      2.327028    0.6759
    2      4.861899    0.3018
    3      15.30102    0.0041
    4      5.459386    0.2433
    5      9.271766    0.0547
    6      2.422662    0.6585
    7      3.174393    0.5291
    8      2.091522    0.7189
    9      0.727926    0.9478
    10     4.659113    0.3241
    11     8.771122    0.0671
    12     1.905281    0.7532

Probs from chi-square with 4 df. Chisqr(4) = 14.86.

The null is not rejected at most lags; the exception is lag 3 (LM = 15.30, prob. 0.0041).

Normality Test

VAR Residual Normality Tests
Orthogonalization: Residual Correlation (Doornik-Hansen)
H0: residuals are multivariate normal
Sample: 1963Q2 2002Q4
Included observations: 156

    Component   Skewness     Chi-sq      df   Prob.
    1           -0.071003    0.142376    1    0.7059
    2           0.087734     0.217132    1    0.6412
    Joint                    0.359509    2    0.8355

    Component   Kurtosis     Chi-sq      df   Prob.
    1           3.606088     3.832680    1    0.0503
    2           1.695635     21.40130    1    0.0000
    Joint                    25.23398    2    0.0000

    Component   Jarque-Bera  df   Prob.
    1           3.975057     2    0.1370
    2           21.61843     2    0.0000
    Joint       25.59349     4    0.0000

The Jarque-Bera null is a joint test of both the skewness and the kurtosis. Normality is not rejected for inf but is rejected for m, due to a kurtosis problem. Is this something we should worry about? In principle, rejection of the normal distribution invalidates the test statistics. But measures of skewness are found to be not informative in small samples (Bai, Ng, Boston College WP 115, 2001).

4. Granger causality

View - Lag Structure - Granger Causality/Block Exogeneity Tests

VAR Granger Causality/Block Exogeneity Wald Tests
Sample: 1963Q2 2002Q4
Included observations: 156

    Dependent variable: INF
    Excluded   Chi-sq      df   Prob.
    M          7.107555    5    0.2128
    All        7.107555    5    0.2128

    Dependent variable: M
    Excluded   Chi-sq      df   Prob.
    INF        17.95420    5    0.0030
    All        17.95420    5    0.0030

Chisqr(5) = 16.75

The test checks bilaterally whether the lags of the excluded variable affect the endogenous variable. The null: the coefficients on the lags of the excluded variable are jointly zero (the excluded variable does not G-cause the dependent variable).
All: joint test of whether the lags of all the other variables affect the endogenous variable.
Ex: in the top panel, the first row tests whether the lags of M are jointly zero in the INF equation; the second row tests whether the lags of all variables other than INF are zero (in our case both tests are identical, since we only have two variables).

In the INF equation the null is not rejected (prob. 0.2128): there is no evidence that m G-causes inf. In the M equation the null is rejected (17.95 > 16.75, prob. 0.0030): the lags of inf help predict m.
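The probabilities in the Granger causality table are upper-tail chi-square probabilities of the Wald statistics. As a minimal sketch (plain Python, not part of the original notes; `chi2_upper_tail` is an illustrative helper), the block below uses the closed-form chi-square tail for integer degrees of freedom and applies it to the statistics reported above. The 5% significance level in the decision rule is an assumption made for the illustration.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variable with integer df (closed forms)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite sum weighted by the normal density.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2))   # equals 2 * P(Z > sqrt(x))
    term, total = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        term = root if r == 1 else term * x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return tail + 2 * density * total


# Wald statistics from the Granger causality table: (dependent, excluded, chi-sq, df)
wald_tests = [
    ("INF", "M", 7.107555, 5),
    ("M", "INF", 17.95420, 5),
]

level = 0.05  # assumed significance level for the illustration
p_values = {}
for dependent, excluded, stat, df in wald_tests:
    p = chi2_upper_tail(stat, df)
    p_values[dependent] = p
    decision = "reject" if p < level else "do not reject"
    print(f"{excluded} -> {dependent}: p = {p:.4f}, {decision} "
          f"H0 that lags of {excluded} are jointly zero")

# The same helper covers the LM test (4 df), e.g. the lag-3 statistic.
p_lm_lag3 = chi2_upper_tail(15.30102, 4)
print(f"LM test, lag 3: p = {p_lm_lag3:.4f}")
```

Rejecting the null here means that the lags of the excluded variable help predict the dependent variable, so the two panels can lead to different conclusions.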