VAR Models

Gloria González-Rivera, University of California, Riverside
and
Jesús Gonzalo, U. Carlos III de Madrid

Some References
• Hamilton, chapter 11
• Enders, chapter 5
• Palgrave Handbook of Econometrics, chapter 12 by Lütkepohl
• Any of the books of Lütkepohl on Multiple Time Series

Multivariate Models
• VARMAX models as a multivariate generalization of the univariate ARMA models:
$$\sum_{s=0}^{p} \Phi_s L^s Y_t = \sum_{i=0}^{r} G_i L^i X_t + \sum_{j=0}^{q} \Theta_j L^j \varepsilon_t$$
where $\Phi_s$ is $n \times n$, $Y_t$ is $n \times 1$, $G_i$ is $n \times k$, $X_t$ is $k \times 1$ and $\Theta_j$ is $n \times n$.
• Structural VAR models:
$$B Y_t = \Gamma_1 Y_{t-1} + \dots + \Gamma_p Y_{t-p} + \varepsilon_t$$
• VAR models (reduced form):
$$Y_t = \Phi_1 Y_{t-1} + \dots + \Phi_p Y_{t-p} + a_t$$

Multivariate Models (cont)
where the error term is a vector white noise:
$$E(a_t a_s') = \begin{cases} \Omega & \text{if } s = t \\ 0 & \text{otherwise} \end{cases}$$
To avoid parameter redundancy, we need to impose some structure on $\Phi_0$ and $\Theta_0$. This is similar to univariate models.

A Structural VAR(1)
Consider a bivariate $Y_t = (y_t, x_t)'$, first-order VAR model:
$$y_t = b_{10} - b_{12} x_t + \gamma_{11} y_{t-1} + \gamma_{12} x_{t-1} + \varepsilon_{yt}$$
$$x_t = b_{20} - b_{21} y_t + \gamma_{21} y_{t-1} + \gamma_{22} x_{t-1} + \varepsilon_{xt}$$
• The error terms (structural shocks) $\varepsilon_{yt}$ and $\varepsilon_{xt}$ are white noise innovations with standard deviations $\sigma_y$ and $\sigma_x$ and zero covariance.
• The two variables y and x are endogenous (Why?)
• Note that the shock $\varepsilon_{yt}$ affects y directly and x indirectly.
• There are 10 parameters to estimate.

From a Structural VAR to a Standard VAR
• The structural VAR is not a reduced form.
• In a reduced form representation y and x are just functions of lagged y and x.
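• To solve for a reduced form, write the structural VAR in matrix form as:
$$\begin{bmatrix} 1 & b_{12} \\ b_{21} & 1 \end{bmatrix} \begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$
$$B Y_t = \Gamma_0 + \Gamma_1 Y_{t-1} + \varepsilon_t$$

From a Structural VAR to a Standard VAR (cont)
• Premultiplication by $B^{-1}$ allows us to obtain a standard VAR(1):
$$B Y_t = \Gamma_0 + \Gamma_1 Y_{t-1} + \varepsilon_t$$
$$Y_t = B^{-1}\Gamma_0 + B^{-1}\Gamma_1 Y_{t-1} + B^{-1}\varepsilon_t$$
$$Y_t = \Phi_0 + \Phi_1 Y_{t-1} + a_t$$
• This is the reduced form we are going to estimate (by OLS, equation by equation).
• Before estimating it, we will present the stability conditions for a VAR(p) (the roots of a characteristic polynomial have to lie outside the unit circle).
• After estimating the reduced form, we will discuss what information the estimates provide (Granger causality, Impulse Response Function) and how the structural parameters can be recovered (notice that we now have only 9 parameters).

A bit of history .... Once Upon a Time
Sims (1980), "Macroeconomics and Reality", Econometrica, 48.
Generalization of univariate analysis to an array of random variables, e.g. $Z_t$ = money supply, $X_t$ = interest rate, $V_t$ = income:
$$Y_t = \begin{bmatrix} Z_t \\ X_t \\ V_t \end{bmatrix}$$
VAR(p):
$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + a_t$$
$$E(a_t) = 0, \qquad E(a_t a_\tau') = \begin{cases} \Omega & t = \tau \\ 0 & t \neq \tau \end{cases}$$
The $\Phi_i$ are matrices, e.g.
$$\Phi_1 = \begin{bmatrix} \phi_{11}^{(1)} & \phi_{12}^{(1)} & \phi_{13}^{(1)} \\ \phi_{21}^{(1)} & \phi_{22}^{(1)} & \phi_{23}^{(1)} \\ \phi_{31}^{(1)} & \phi_{32}^{(1)} & \phi_{33}^{(1)} \end{bmatrix}$$
A typical equation of the system is
$$Z_t = c_1 + \phi_{11}^{(1)} Z_{t-1} + \phi_{12}^{(1)} X_{t-1} + \phi_{13}^{(1)} V_{t-1} + \dots + \phi_{11}^{(p)} Z_{t-p} + \phi_{12}^{(p)} X_{t-p} + \phi_{13}^{(p)} V_{t-p} + a_{1t}$$
Each equation has the same regressors.

Stability Conditions
$$Y_t = \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + c + a_t$$
$$(I - \Phi_1 L - \Phi_2 L^2 - \dots - \Phi_p L^p) Y_t = c + a_t$$
$$\Phi(L) Y_t = c + a_t$$
$\Phi(L)$ is an $n \times n$ matrix polynomial in the lag operator L. The ij element of $\Phi(L)$ is
$$[\delta_{ij} - \phi_{ij}^{(1)} L - \phi_{ij}^{(2)} L^2 - \dots - \phi_{ij}^{(p)} L^p], \qquad \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
A VAR(p) for $Y_t$ is STABLE if the $np$ roots of the characteristic polynomial
$$|I_n - \Phi_1 x - \Phi_2 x^2 - \dots - \Phi_p x^p| = 0$$
are outside the unit circle. In that case the mean is
$$\mu = (I_n - \Phi_1 - \Phi_2 - \dots - \Phi_p)^{-1} c$$
If the VAR is stable then an MA($\infty$) representation exists:
$$Y_t = \mu + a_t + \Psi_1 a_{t-1} + \Psi_2 a_{t-2} + \dots = \mu + \Psi(L) a_t$$
$$\Psi(L) = [I_n + \Psi_1 L + \Psi_2 L^2 + \dots]$$
This representation will be the "key" to study the impulse response function of a given shock.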
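
The stability condition and the MA coefficients can be checked numerically. The following is a minimal sketch in plain Python for a bivariate VAR(1) only, where the characteristic polynomial reduces to $1 - \mathrm{tr}(\Phi_1)x + \det(\Phi_1)x^2$ and $\Psi_s = \Phi_1^s$. The function names and the coefficient matrix `phi_example` are hypothetical and chosen for illustration; they do not come from the slides.

```python
import cmath


def char_roots_var1(phi):
    """Roots x of det(I - phi*x) = 1 - tr(phi)*x + det(phi)*x**2 for a 2x2 phi."""
    (a, b), (c, d) = phi
    tr, det = a + d, a * d - b * c
    if abs(det) < 1e-12:
        # The polynomial degenerates to 1 - tr*x (or to the constant 1).
        return [1.0 / tr] if abs(tr) > 1e-12 else []
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return [(tr + disc) / (2.0 * det), (tr - disc) / (2.0 * det)]


def is_stable(phi):
    """Stable if every root of the characteristic polynomial is outside the unit circle."""
    return all(abs(x) > 1.0 for x in char_roots_var1(phi))


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def ma_coefficients(phi, horizon):
    """Return [Psi_0, Psi_1, ..., Psi_horizon] with Psi_s = phi**s (VAR(1) case)."""
    psi = [[1.0, 0.0], [0.0, 1.0]]  # Psi_0 = I
    out = [psi]
    for _ in range(horizon):
        psi = matmul(phi, psi)
        out.append(psi)
    return out


# Hypothetical coefficient matrix, for illustration only.
phi_example = [[0.5, 0.1], [0.4, 0.5]]
roots = char_roots_var1(phi_example)
stable = is_stable(phi_example)
psi = ma_coefficients(phi_example, 3)
print("roots:", roots, "stable:", stable)
print("Psi_2:", psi[2])
```

For a general VAR(p) the same check is usually done on the eigenvalues of the stacked (companion) matrix rather than on polynomial roots, which needs a numerical linear algebra library.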
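VAR(p) as a VAR(1)
Rewriting the system in deviations from its mean:
$$Y_t - \mu = \Phi_1 (Y_{t-1} - \mu) + \Phi_2 (Y_{t-2} - \mu) + \dots + \Phi_p (Y_{t-p} - \mu) + a_t$$
Stack the vectors as
$$\xi_t = \begin{bmatrix} Y_t - \mu \\ Y_{t-1} - \mu \\ \vdots \\ Y_{t-p+1} - \mu \end{bmatrix}, \qquad F = \begin{bmatrix} \Phi_1 & \Phi_2 & \cdots & \Phi_{p-1} & \Phi_p \\ I_n & 0 & \cdots & 0 & 0 \\ 0 & I_n & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I_n & 0 \end{bmatrix}, \qquad v_t = \begin{bmatrix} a_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
where $\xi_t$ and $v_t$ are $(np \times 1)$ and $F$ is $(np \times np)$. Then
$$\xi_t = F \xi_{t-1} + v_t, \qquad E(v_t v_\tau') = \begin{cases} H & t = \tau \\ 0 & t \neq \tau \end{cases}, \qquad H = \begin{bmatrix} \Omega & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \; (np \times np)$$
STABLE: the eigenvalues of F lie inside the unit circle (WHY?).

Estimation of VAR models
Estimation: Conditional MLE
$$f(Y_T, Y_{T-1}, \dots, Y_1 \mid Y_0, Y_{-1}, \dots, Y_{-p+1}; \theta) = \prod_{t=1}^{T} f(Y_t \mid Y_{t-1}, Y_{t-2}, \dots, Y_{-p+1}; \theta)$$
$$Y_t \mid Y_{t-1}, Y_{t-2}, \dots \sim N(c + \Phi_1 Y_{t-1} + \dots + \Phi_p Y_{t-p}, \; \Omega)$$
$$\Pi' = [c \;\; \Phi_1 \;\; \Phi_2 \;\; \dots \;\; \Phi_p] \quad (n \times (np+1)), \qquad X_t = [1 \;\; Y_{t-1}' \;\; Y_{t-2}' \;\; \dots \;\; Y_{t-p}']' \quad ((np+1) \times 1)$$
$$Y_t = \Pi' X_t + a_t$$
$$\mathcal{L}(\theta) = \sum_{t=1}^{T} \log f(Y_t \mid \text{past}; \theta) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\Omega^{-1}| - \frac{1}{2}\sum_{t=1}^{T} (Y_t - \Pi' X_t)' \Omega^{-1} (Y_t - \Pi' X_t)$$
Claim: OLS estimates equation by equation are good!!!
$$\hat{\Pi}'_{ols} = \left[\sum_{t=1}^{T} Y_t X_t'\right] \left[\sum_{t=1}^{T} X_t X_t'\right]^{-1}, \qquad \hat{\Pi}_{mle} = \hat{\Pi}_{ols}$$
Proof: write $\hat{a}_t = Y_t - \hat{\Pi}'_{ols} X_t$ for the OLS residuals. Then
$$\sum_{t=1}^{T} (Y_t - \Pi' X_t)' \Omega^{-1} (Y_t - \Pi' X_t) = \sum_{t=1}^{T} (Y_t - \hat{\Pi}'_{ols} X_t + \hat{\Pi}'_{ols} X_t - \Pi' X_t)' \Omega^{-1} (Y_t - \hat{\Pi}'_{ols} X_t + \hat{\Pi}'_{ols} X_t - \Pi' X_t)$$
$$= \sum_{t=1}^{T} [\hat{a}_t + (\hat{\Pi}_{ols} - \Pi)' X_t]' \Omega^{-1} [\hat{a}_t + (\hat{\Pi}_{ols} - \Pi)' X_t]$$
$$= \sum_t \hat{a}_t' \Omega^{-1} \hat{a}_t + 2 \sum_t \hat{a}_t' \Omega^{-1} (\hat{\Pi}_{ols} - \Pi)' X_t + \sum_t X_t' (\hat{\Pi}_{ols} - \Pi) \Omega^{-1} (\hat{\Pi}_{ols} - \Pi)' X_t$$
(*) The cross term vanishes because OLS residuals are orthogonal to the regressors ($\sum_t X_t \hat{a}_t' = 0$):
$$\sum_t \hat{a}_t' \Omega^{-1} (\hat{\Pi}_{ols} - \Pi)' X_t = \mathrm{tr}\left[\sum_t \hat{a}_t' \Omega^{-1} (\hat{\Pi}_{ols} - \Pi)' X_t\right] = \mathrm{tr}\left[\sum_t \Omega^{-1} (\hat{\Pi}_{ols} - \Pi)' X_t \hat{a}_t'\right] = \mathrm{tr}\left[\Omega^{-1} (\hat{\Pi}_{ols} - \Pi)' \sum_t X_t \hat{a}_t'\right] = 0$$
Hence
$$\min_{\Pi} \sum_{t=1}^{T} (Y_t - \Pi' X_t)' \Omega^{-1} (Y_t - \Pi' X_t) = \min_{\Pi} \left[\sum_t \hat{a}_t' \Omega^{-1} \hat{a}_t + \sum_t X_t' (\hat{\Pi}_{ols} - \Pi) \Omega^{-1} (\hat{\Pi}_{ols} - \Pi)' X_t\right]$$
Because $\Omega$ is a positive definite matrix, $\Omega^{-1}$ is positive definite, so the smallest value is achieved when $\Pi = \hat{\Pi}_{ols}$.

Maximum Likelihood of $\Omega$
Evaluate the log-likelihood at $\hat{\Pi}$:
$$\mathcal{L}(\Omega, \hat{\Pi}) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\Omega^{-1}| - \frac{1}{2}\sum_{t=1}^{T} \hat{a}_t' \Omega^{-1} \hat{a}_t$$
$$\frac{\partial \mathcal{L}(\Omega, \hat{\Pi})}{\partial \Omega^{-1}} = \frac{T}{2}\Omega' - \frac{1}{2}\sum_{t=1}^{T} \hat{a}_t \hat{a}_t' = 0 \quad \Rightarrow \quad \hat{\Omega} = \frac{1}{T}\sum_{t=1}^{T} \hat{a}_t \hat{a}_t'$$
Diagonal element: $\hat{\sigma}_i^2 = \frac{1}{T}\sum_{t=1}^{T} \hat{a}_{it}^2$. Off-diagonal element: $\hat{\sigma}_{ij} = \frac{1}{T}\sum_{t=1}^{T} \hat{a}_{it} \hat{a}_{jt}$.

Testing Hypotheses in a VAR model
Likelihood ratio test in VAR
$$\mathcal{L}(\hat{\Omega}, \hat{\Pi}) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat{\Omega}^{-1}| - \frac{1}{2}\sum_{t=1}^{T} \hat{a}_t' \hat{\Omega}^{-1} \hat{a}_t$$
The last term simplifies:
$$\frac{1}{2}\sum_{t=1}^{T} \hat{a}_t' \hat{\Omega}^{-1} \hat{a}_t = \frac{1}{2}\mathrm{trace}\left[\sum_{t=1}^{T} \hat{a}_t' \hat{\Omega}^{-1} \hat{a}_t\right] = \frac{1}{2}\mathrm{trace}\left[\sum_{t=1}^{T} \hat{\Omega}^{-1} \hat{a}_t \hat{a}_t'\right] = \frac{1}{2}\mathrm{trace}[\hat{\Omega}^{-1} T \hat{\Omega}] = \frac{1}{2}\mathrm{trace}(T I_n) = \frac{Tn}{2}$$
so that
$$\mathcal{L}(\hat{\Omega}, \hat{\Pi}) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat{\Omega}^{-1}| - \frac{Tn}{2}$$

Testing the number of lags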
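Let $p_1 > p_0$ and test $H_0$: VAR($p_0$) against $H_1$: VAR($p_1$).
Under $H_0$, perform n OLS regressions of each variable on a constant and $p_0$ lags to obtain $\hat{\Pi}_0$, $\hat{\Omega}_0$:
$$\mathcal{L}_0^* = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat{\Omega}_0^{-1}| - \frac{Tn}{2}$$
Under $H_1$, do the same with $p_1$ lags:
$$\mathcal{L}_1^* = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat{\Omega}_1^{-1}| - \frac{Tn}{2}$$
$$LR = 2(\mathcal{L}_1^* - \mathcal{L}_0^*) = T\left[\log|\hat{\Omega}_1^{-1}| - \log|\hat{\Omega}_0^{-1}|\right] = T\left[\log|\hat{\Omega}_0| - \log|\hat{\Omega}_1|\right]$$
$$LR \to \chi^2_m, \qquad m = \text{number of restrictions} = n^2 (p_1 - p_0)$$
Each equation has $p_1 - p_0$ restrictions on each variable, that is, $n(p_1 - p_0)$ restrictions in each equation.

In general, linear hypotheses can be tested directly as usual, and their asymptotic distribution follows from the next result.
Let $\hat{\pi}_T = \mathrm{vec}(\hat{\Pi}_T)$ denote the $(nk \times 1)$ vector of coefficients (with $k = 1 + np$ the number of parameters estimated per equation) resulting from OLS regressions of each of the elements of $y_t$ on $x_t$ for a sample of size T:
$$\hat{\pi}_T = \begin{bmatrix} \hat{\pi}_{1.T} \\ \vdots \\ \hat{\pi}_{n.T} \end{bmatrix}, \qquad \text{where } \hat{\pi}_{iT} = \left[\sum_{t=1}^{T} x_t x_t'\right]^{-1} \left[\sum_{t=1}^{T} x_t y_{it}\right]$$
The asymptotic distribution of $\hat{\pi}_T$ is
$$\sqrt{T}(\hat{\pi}_T - \pi) \to N(0, \; \Omega \otimes M^{-1})$$
and for the coefficients of regression i
$$\sqrt{T}(\hat{\pi}_{iT} - \pi_i) \to N(0, \; \sigma_i^2 M^{-1}), \qquad \text{with } M = \mathrm{plim}\,\frac{1}{T}\sum_t X_t X_t'$$

Information Criterion in a Standard VAR(p)
• In the same way as in the univariate AR(p) models, Information Criteria (IC) can be used to choose the "right" number of lags in a VAR(p): $\hat{p}$ minimizes IC(p) for p = 1, ..., P.
$$AIC = \ln|\hat{\Omega}| + \frac{2(n^2 p + n)}{T}, \qquad SBC = \ln|\hat{\Omega}| + \frac{(n^2 p + n)\ln(T)}{T}$$
• Consistency results similar to the ones obtained in the univariate world hold in the multivariate world. The only difference is that, as the number of variables gets bigger, the AIC becomes less likely to overparametrize (see Gonzalo and Pitarakis (2002), Journal of Time Series Analysis).

Granger Causality
Granger (1969): "Investigating Causal Relations by Econometric Models and Cross-Spectral Methods", Econometrica, 37.
Consider two random variables $X_t$, $Y_t$ and two forecasts of $X_t$, s periods ahead:
$$\hat{X}_t^{(1)}(s) = E(X_{t+s} \mid X_t, X_{t-1}, \dots)$$
$$\hat{X}_t^{(2)}(s) = E(X_{t+s} \mid X_t, X_{t-1}, \dots, Y_t, Y_{t-1}, \dots)$$
$$MSE(\hat{X}_t^{(i)}(s)) = E(X_{t+s} - \hat{X}_t^{(i)}(s))^2$$
If $MSE(\hat{X}_t^{(1)}(s)) = MSE(\hat{X}_t^{(2)}(s))$ for all $s > 0$, then $Y_t$ does not Granger-cause $X_t$: $Y_t$ is not linearly informative to forecast $X_t$.

Test for Granger-causality
Assume a lag length of p:
$$X_t = c_1 + \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \dots + \alpha_p X_{t-p} + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + a_t$$
Estimate by OLS and test the hypothesis
$$H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0 \quad (Y_t \text{ does not Granger-cause } X_t), \qquad H_1: \text{any } \beta_i \neq 0$$
Unrestricted sum of squared residuals: $RSS_1 = \sum_t \hat{a}_t^2$.
Restricted sum of squared residuals: $RSS_2 = \sum_t \tilde{a}_t^2$, where $\tilde{a}_t$ are the residuals of the regression with the $\beta_i$ set to zero.
$$F = \frac{RSS_2 - RSS_1}{RSS_1 / (T - 2p - 1)}$$
• Under general conditions $F \to \chi^2(p)$. (As written, this statistic is p times the usual F ratio, which is why its limit is $\chi^2(p)$.)

Impulse Response Function (IRF)
Objective: the reaction of the system to a shock.
$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + a_t$$
If the system is stable,
$$Y_t = \mu + \Psi(L) a_t = \mu + a_t + \Psi_1 a_{t-1} + \Psi_2 a_{t-2} + \dots, \qquad \Psi(L) = [\Phi(L)]^{-1}$$
Redating at time $t + s$:
$$Y_{t+s} = \mu + a_{t+s} + \Psi_1 a_{t+s-1} + \Psi_2 a_{t+s-2} + \dots + \Psi_s a_t + \Psi_{s+1} a_{t-1} + \dots$$
$$\frac{\partial Y_{t+s}}{\partial a_t'} = \Psi_s = [\psi_{ij}^{(s)}] \quad (n \times n), \qquad \frac{\partial y_{i,t+s}}{\partial a_{jt}} = \psi_{ij}^{(s)} \quad \text{(multipliers)}$$
$\psi_{ij}^{(s)}$ is the reaction of the i-th variable to a unit change in innovation j.

Impulse Response Function (cont)
Impulse-response function: the response of $y_{i,t+s}$ to a one-time impulse in $y_{jt}$ with all other variables dated t or earlier held constant.
$$\frac{\partial y_{i,t+s}}{\partial a_{jt}} = \psi_{ij}^{(s)}$$
[Figure: $\psi_{ij}^{(s)}$ plotted against the horizon s = 1, 2, 3, ...]

Example: IRF for a VAR(1)
$$\begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} a_{1t} \\ a_{2t} \end{bmatrix}, \qquad \Omega = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}$$
• $t < 0$: $y_{1t} = y_{2t} = 0$
• $t = 0$: $a_{10} = 0$, $a_{20} = 1$ ($y_{2t}$ increases by 1 unit)
• $t > 0$: $a_{1t} = a_{2t} = 0$ (no more shocks occur)
Reaction of the system:
$$\begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \quad \text{(impulse)}$$
$$\begin{bmatrix} y_{11} \\ y_{21} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} \phi_{12} \\ \phi_{22} \end{bmatrix}$$
$$\begin{bmatrix} y_{12} \\ y_{22} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{11} \\ y_{21} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix}^2 \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} y_{1s} \\ y_{2s} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix}^s \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
If you work with the MA representation:
$$\Psi(L) = \Phi(L)^{-1}, \qquad \Psi_1 = \Phi_1, \quad \Psi_2 = \Phi_1^2, \quad \dots, \quad \Psi_s = \Phi_1^s$$
In this example, the variance-covariance matrix of the innovations is not diagonal, i.e. $\sigma_{12} \neq 0$. There is contemporaneous correlation between the shocks, so the impulse
$$\begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
is not very realistic. To avoid this problem, the variance-covariance matrix has to be diagonalized (the shocks have to be orthogonal), and this is where serious problems appear.
Reminder: $\Omega$ is a positive definite (symmetric) matrix.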
Hence there exists a non-singular Q such that $Q \Omega Q' = I$.
Then, the MA representation is
$$Y_t = \mu + \sum_{i=0}^{\infty} \Psi_i a_{t-i}, \qquad \Psi_0 = I_n$$
$$Y_t = \mu + \sum_{i=0}^{\infty} \Psi_i Q^{-1} Q a_{t-i}$$
Let us call $M_i = \Psi_i Q^{-1}$ and $w_t = Q a_t$:
$$Y_t = \mu + \sum_{i=0}^{\infty} M_i w_{t-i}$$
$$E[w_t w_t'] = E[Q a_t a_t' Q'] = Q E[a_t a_t'] Q' = Q \Omega Q' = I_n$$
$w_t$ has components that are all uncorrelated and have unit variance.
$$\frac{\partial Y_{t+s}}{\partial w_t'} = M_s = \Psi_s Q^{-1}$$
This is the orthogonalized impulse-response function.
Problem: Q is not unique.

Variance decomposition
Contribution of the j-th orthogonalized innovation to the MSE of the s-period-ahead forecast.
$$MSE(\hat{Y}_t(s)) = E[(Y_{t+s} - \hat{Y}_t(s))(Y_{t+s} - \hat{Y}_t(s))']$$
$$e_t(s) = Y_{t+s} - \hat{Y}_t(s) = a_{t+s} + \Psi_1 a_{t+s-1} + \dots + \Psi_{s-1} a_{t+1}$$
$$E[e_t(s) e_t(s)'] = \Omega + \Psi_1 \Omega \Psi_1' + \dots + \Psi_{s-1} \Omega \Psi_{s-1}'$$
$$MSE(s) = Q^{-1} Q \Omega Q' Q^{-1\prime} + \Psi_1 Q^{-1} Q \Omega Q' Q^{-1\prime} \Psi_1' + \dots + \Psi_{s-1} Q^{-1} Q \Omega Q' Q^{-1\prime} \Psi_{s-1}'$$
$$= Q^{-1} Q^{-1\prime} + \Psi_1 Q^{-1} Q^{-1\prime} \Psi_1' + \dots + \Psi_{s-1} Q^{-1} Q^{-1\prime} \Psi_{s-1}'$$
$$= M_0 M_0' + M_1 M_1' + \dots + M_{s-1} M_{s-1}'$$
Recall that $M_i = \Psi_i Q^{-1}$ and $M_0 = Q^{-1}$, since $\Psi_0 = I$.
Exercise: find the contribution of the first orthogonalized innovation to the MSE (do it for a two-variable VAR model).

Example: Variance decomposition in a two-variable (y, x) VAR
• The s-step-ahead forecast error for variable y is:
$$y_{t+s} - E_t y_{t+s} = M_0(1,1)\varepsilon_{y,t+s} + M_1(1,1)\varepsilon_{y,t+s-1} + \dots + M_{s-1}(1,1)\varepsilon_{y,t+1} + M_0(1,2)\varepsilon_{x,t+s} + M_1(1,2)\varepsilon_{x,t+s-1} + \dots + M_{s-1}(1,2)\varepsilon_{x,t+1}$$
• Denote the s-step-ahead forecast error variance of $y_{t+s}$ by $\sigma_y(s)^2$:
$$\sigma_y(s)^2 = \sigma_y^2 [M_0(1,1)^2 + M_1(1,1)^2 + \dots + M_{s-1}(1,1)^2] + \sigma_x^2 [M_0(1,2)^2 + M_1(1,2)^2 + \dots + M_{s-1}(1,2)^2]$$
• The forecast error variance decompositions are proportions of $\sigma_y(s)^2$:
due to shocks to y: $\sigma_y^2 [M_0(1,1)^2 + M_1(1,1)^2 + \dots + M_{s-1}(1,1)^2] / \sigma_y(s)^2$
due to shocks to x: $\sigma_x^2 [M_0(1,2)^2 + M_1(1,2)^2 + \dots + M_{s-1}(1,2)^2] / \sigma_y(s)^2$

Identification in a Standard VAR(1)
• Remember that we started with a structural VAR model, and jumped to the reduced form or standard VAR for estimation purposes.
• Is it possible to recover the parameters of the structural VAR from the estimated parameters of the standard VAR? No!!
• There are 10 parameters in the bivariate structural VAR(1) and only 9 estimated parameters in the standard VAR(1).
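• The VAR is underidentified.
• If one parameter in the structural VAR is restricted, the structural VAR becomes exactly identified.
• Sims (1980) suggests a recursive system to identify the model, letting $b_{21} = 0$:
$$\begin{bmatrix} 1 & b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$

Identification in a Standard VAR(1) (cont.)
• $b_{21} = 0$ implies
$$\begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$
$$\begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} \phi_{10} \\ \phi_{20} \end{bmatrix} + \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix}$$
• The parameters of the structural VAR can now be identified from the following 9 equations:
$$\phi_{10} = b_{10} - b_{12} b_{20}, \qquad \phi_{20} = b_{20}$$
$$\phi_{11} = \gamma_{11} - b_{12}\gamma_{21}, \qquad \phi_{21} = \gamma_{21}$$
$$\phi_{12} = \gamma_{12} - b_{12}\gamma_{22}, \qquad \phi_{22} = \gamma_{22}$$
$$\mathrm{var}(e_1) = \sigma_y^2 + b_{12}^2 \sigma_x^2, \qquad \mathrm{var}(e_2) = \sigma_x^2, \qquad \mathrm{cov}(e_1, e_2) = -b_{12}\sigma_x^2$$

Identification in a Standard VAR(1) (cont.)
• Note that both structural shocks can now be identified from the residuals of the standard VAR.
• $b_{21} = 0$ implies that y does not have a contemporaneous effect on x.
• This restriction means that both $\varepsilon_{yt}$ and $\varepsilon_{xt}$ affect y contemporaneously, but only $\varepsilon_{xt}$ affects x contemporaneously.
• The residuals $e_{2t}$ are due to pure shocks to x.
• Decomposing the residuals of the standard VAR in this triangular fashion is called the Choleski decomposition.
• There are other methods used to identify models, like the Blanchard and Quah (1989) decomposition (it will be covered on the blackboard).

Criticisms of VAR
• A VAR model can be a good forecasting model, but in a sense it is an atheoretical model (as all reduced form models are).
• To calculate the IRF, the order matters: remember that "Q" is not unique.
• Sensitive to the lag selection.
• Dimensionality problem.
• THINK of TWO MORE weak points of VAR modelling.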
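
The point that the order matters can be illustrated with a small numerical sketch. The plain Python code below computes the impact matrix $M_0 = Q^{-1}$ from a triangular (Choleski) factorization of a bivariate $\Omega$ under the two possible recursive orderings. The covariance matrix `omega` contains made-up numbers, and the function names are hypothetical helpers introduced only for this illustration.

```python
import math


def cholesky_2x2(omega):
    """Lower-triangular P with P P' = omega, for a 2x2 positive definite omega."""
    (s11, s12), (_, s22) = omega
    p11 = math.sqrt(s11)
    p21 = s12 / p11
    p22 = math.sqrt(s22 - p21 * p21)
    return [[p11, 0.0], [p21, p22]]


def impact_matrix(omega, order):
    """M_0 = Q^{-1} for a recursive ordering.

    order is a permutation of (0, 1); order[0] is the variable placed first.
    Rows are returned in the original variable order; column k refers to the
    k-th orthogonalized shock of the chosen ordering.
    """
    permuted = [[omega[i][j] for j in order] for i in order]
    p = cholesky_2x2(permuted)
    m0 = [[0.0, 0.0], [0.0, 0.0]]
    for position, variable in enumerate(order):
        m0[variable] = list(p[position])
    return m0


def implied_cov(m0):
    """M_0 M_0', which should reproduce omega for any valid Q."""
    return [[sum(m0[i][k] * m0[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical innovation covariance matrix with sigma_12 != 0.
omega = [[1.0, 0.5], [0.5, 2.0]]
m0_first = impact_matrix(omega, (0, 1))   # variable 0 ordered first
m0_second = impact_matrix(omega, (1, 0))  # variable 1 ordered first
print("ordering (0, 1):", m0_first)
print("ordering (1, 0):", m0_second)
```

Both impact matrices satisfy $M_0 M_0' = \Omega$, yet they imply different contemporaneous responses, so the orthogonalized impulse responses $M_s = \Psi_s Q^{-1}$ differ across orderings whenever $\sigma_{12} \neq 0$.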