Week 6.1: VAR

VAR Models
Gloria González-Rivera
University of California, Riverside
and
Jesús Gonzalo, Universidad Carlos III de Madrid
Some References
• Hamilton, Time Series Analysis, Chapter 11
• Enders, Applied Econometric Time Series, Chapter 5
• Lütkepohl, Chapter 12 in the Palgrave Handbook of Econometrics
• Any of Lütkepohl's books on multiple time series
Multivariate Models
• VARMAX models as a multivariate generalization of the univariate ARMA models:

$$\sum_{s=0}^{p} \Phi_s L^s \, Y_t = \sum_{i=0}^{q} G_i L^i \, X_t + \sum_{j=0}^{r} \Theta_j L^j \, \varepsilon_t$$

where the $\Phi_s$ and $\Theta_j$ are $n \times n$, $Y_t$ and $\varepsilon_t$ are $n \times 1$, the $G_i$ are $n \times k$, and $X_t$ is $k \times 1$.
• Structural VAR models:

$$B Y_t = \Gamma_1 Y_{t-1} + \dots + \Gamma_p Y_{t-p} + \varepsilon_t$$

• VAR models (reduced form):

$$Y_t = \Phi_1 Y_{t-1} + \dots + \Phi_p Y_{t-p} + a_t$$
Multivariate Models (cont.)
where the error term is a vector white noise:

$$E(a_t a_s') = \begin{cases} \Sigma & \text{if } s = t \\ 0 & \text{otherwise} \end{cases}$$

To avoid parameter redundancy, we need to assume a certain structure on $\Phi_0$ and $\Theta_0$ (typically $\Phi_0 = \Theta_0 = I_n$). This is similar to the univariate case.
A Structural VAR(1)
Consider a bivariate $Y_t = (y_t, x_t)'$, first-order VAR model:

$$y_t = b_{10} - b_{12} x_t + \gamma_{11} y_{t-1} + \gamma_{12} x_{t-1} + \varepsilon_{yt}$$
$$x_t = b_{20} - b_{21} y_t + \gamma_{21} y_{t-1} + \gamma_{22} x_{t-1} + \varepsilon_{xt}$$

• The error terms (structural shocks) $\varepsilon_{yt}$ and $\varepsilon_{xt}$ are white noise innovations with standard deviations $\sigma_y$ and $\sigma_x$ and a zero covariance.
• The two variables y and x are endogenous. (Why?)
• Note that the shock $\varepsilon_{yt}$ affects y directly and x indirectly.
• There are 10 parameters to estimate.
From a Structural VAR to a Standard VAR
• The structural VAR is not a reduced form.
• In a reduced form representation, y and x are just functions of lagged y and x.
• To solve for the reduced form, write the structural VAR in matrix form as:

$$\begin{bmatrix} 1 & b_{12} \\ b_{21} & 1 \end{bmatrix} \begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$

$$B Y_t = \Gamma_0 + \Gamma_1 Y_{t-1} + \varepsilon_t$$
From a Structural VAR to a Standard VAR (cont.)
• Premultiplication by $B^{-1}$ allows us to obtain a standard VAR(1), as illustrated in the sketch below:

$$B Y_t = \Gamma_0 + \Gamma_1 Y_{t-1} + \varepsilon_t$$
$$Y_t = B^{-1}\Gamma_0 + B^{-1}\Gamma_1 Y_{t-1} + B^{-1}\varepsilon_t$$
$$Y_t = \Phi_0 + \Phi_1 Y_{t-1} + a_t$$

• This is the reduced form we are going to estimate (by OLS, equation by equation).
• Before estimating it, we will present the stability conditions for a VAR(p): the roots of a certain characteristic polynomial have to lie outside the unit circle.
• After estimating the reduced form, we will discuss what information we get from the estimates (Granger causality, impulse response function) and how we can recover the structural parameters (notice that we have only 9 parameters now).
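As a quick numerical illustration, here is a minimal sketch (all coefficient values invented) of how premultiplying by $B^{-1}$ maps the structural parameters into the reduced-form ones:

```python
import numpy as np

# Hypothetical structural parameters for the bivariate VAR(1)
B      = np.array([[1.0, 0.5],     # contemporaneous matrix [1 b12; b21 1]
                   [0.2, 1.0]])
Gamma0 = np.array([0.3, 0.1])      # structural intercepts (b10, b20)
Gamma1 = np.array([[0.7, 0.2],     # structural lag matrix
                   [0.1, 0.6]])
D      = np.diag([1.0, 0.5])       # diag(sigma_y^2, sigma_x^2): orthogonal shocks

# Reduced form: premultiply every term by B^{-1}
B_inv   = np.linalg.inv(B)
Phi0    = B_inv @ Gamma0           # reduced-form intercept
Phi1    = B_inv @ Gamma1           # reduced-form lag matrix
Sigma_a = B_inv @ D @ B_inv.T      # cov of a_t = B^{-1} eps_t: now correlated

print("Phi0 =", Phi0)
print("Phi1 =\n", Phi1)
print("Sigma_a =\n", Sigma_a)
```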
A bit of history... Once Upon a Time
Sims (1980), "Macroeconomics and Reality," Econometrica, 48.
Generalization of univariate analysis to a vector of random variables, e.g. $Z_t$ = money supply, $X_t$ = interest rate, $V_t$ = income:

$$Y_t = \begin{bmatrix} Z_t \\ X_t \\ V_t \end{bmatrix} \sim \text{VAR}(p): \qquad Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + a_t$$

$$E(a_t) = 0, \qquad E(a_t a_s') = \begin{cases} \Sigma & s = t \\ 0 & s \neq t \end{cases}$$

The $\Phi_i$ are $n \times n$ matrices, e.g.

$$\Phi_1 = \begin{bmatrix} \phi_{11}^{(1)} & \phi_{12}^{(1)} & \phi_{13}^{(1)} \\ \phi_{21}^{(1)} & \phi_{22}^{(1)} & \phi_{23}^{(1)} \\ \phi_{31}^{(1)} & \phi_{32}^{(1)} & \phi_{33}^{(1)} \end{bmatrix}$$

A typical equation of the system is

$$Z_t = c_1 + \phi_{11}^{(1)} Z_{t-1} + \phi_{12}^{(1)} X_{t-1} + \phi_{13}^{(1)} V_{t-1} + \dots + \phi_{11}^{(p)} Z_{t-p} + \phi_{12}^{(p)} X_{t-p} + \phi_{13}^{(p)} V_{t-p} + a_{1t}$$

Each equation has the same regressors.
Stability Conditions

$$Y_t = \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + c + a_t$$

$$(I - \Phi_1 L - \Phi_2 L^2 - \dots - \Phi_p L^p) Y_t = c + a_t$$

$$\Phi(L) Y_t = c + a_t$$

$\Phi(L)$ is an $n \times n$ matrix polynomial in the lag operator L; the $ij$-th element of $\Phi(L)$ is

$$\delta_{ij} - \phi_{ij}^{(1)} L - \phi_{ij}^{(2)} L^2 - \dots - \phi_{ij}^{(p)} L^p, \qquad \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$

A VAR(p) for $Y_t$ is STABLE if the $p \times n$ roots of the characteristic polynomial

$$\left| I_n - \Phi_1 z - \Phi_2 z^2 - \dots - \Phi_p z^p \right| = 0$$

are outside the unit circle; in that case the mean is

$$\mu = (I_n - \Phi_1 - \Phi_2 - \dots - \Phi_p)^{-1} c$$

If the VAR is stable, then an MA(∞) representation exists:

$$Y_t = \mu + a_t + \Psi_1 a_{t-1} + \Psi_2 a_{t-2} + \dots = \mu + \Psi(L) a_t, \qquad \Psi(L) = I_n + \Psi_1 L + \Psi_2 L^2 + \dots$$

This representation will be the "key" to study the impulse response function of a given shock.
From VAR(p) to VAR(1) (companion form)
Re-writing the system in deviations from its mean,

$$Y_t - \mu = \Phi_1 (Y_{t-1} - \mu) + \Phi_2 (Y_{t-2} - \mu) + \dots + \Phi_p (Y_{t-p} - \mu) + a_t$$

stack the vectors as

$$\xi_t = \begin{bmatrix} Y_t - \mu \\ Y_{t-1} - \mu \\ \vdots \\ Y_{t-p+1} - \mu \end{bmatrix}, \qquad F = \begin{bmatrix} \Phi_1 & \Phi_2 & \cdots & \Phi_{p-1} & \Phi_p \\ I_n & 0 & \cdots & 0 & 0 \\ 0 & I_n & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I_n & 0 \end{bmatrix}, \qquad v_t = \begin{bmatrix} a_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

where $\xi_t$ and $v_t$ are $(np) \times 1$ and F is $(np) \times (np)$. Then

$$\xi_t = F \xi_{t-1} + v_t, \qquad E(v_t v_s') = \begin{cases} H & t = s \\ 0 & t \neq s \end{cases}, \qquad H = \begin{bmatrix} \Sigma & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \quad (np) \times (np)$$

STABLE: the eigenvalues of F lie inside the unit circle (WHY?).
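A minimal sketch of the stability check, assuming the coefficient matrices $\Phi_i$ are known (here invented for a bivariate VAR(2)): build the companion matrix F and verify that all of its eigenvalues lie inside the unit circle.

```python
import numpy as np

def companion_matrix(Phis):
    """Stack VAR(p) coefficient matrices into the (np x np) companion matrix F."""
    n, p = Phis[0].shape[0], len(Phis)
    F = np.zeros((n * p, n * p))
    F[:n, :] = np.hstack(Phis)         # top block row: [Phi_1 ... Phi_p]
    F[n:, :-n] = np.eye(n * (p - 1))   # identity blocks below the top row
    return F

# Hypothetical coefficients for a bivariate VAR(2)
Phi1 = np.array([[0.5, 0.1], [0.0, 0.4]])
Phi2 = np.array([[0.2, 0.0], [0.1, 0.2]])

eigvals = np.linalg.eigvals(companion_matrix([Phi1, Phi2]))
print("moduli of the eigenvalues of F:", np.abs(eigvals))
print("stable VAR:", bool(np.all(np.abs(eigvals) < 1)))
```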
Estimation of VAR Models
Estimation: conditional MLE.

$$f(Y_T, Y_{T-1}, \dots, Y_1 \mid Y_0, Y_{-1}, \dots, Y_{-p+1}; \theta) = \prod_{t=1}^{T} f(Y_t \mid Y_{t-1}, Y_{t-2}, \dots, Y_{t-p}; \theta)$$

$$Y_t \mid Y_{t-1}, Y_{t-2}, \dots \sim N\left(c + \Phi_1 Y_{t-1} + \dots + \Phi_p Y_{t-p}, \ \Sigma\right)$$

Collect the parameters and the regressors as

$$\Pi' = [c \ \ \Phi_1 \ \ \Phi_2 \ \cdots \ \Phi_p] \quad (n \times (np+1)), \qquad X_t = [1 \ \ Y_{t-1}' \ \ Y_{t-2}' \ \cdots \ Y_{t-p}']' \quad ((np+1) \times 1)$$

so that $Y_t = \Pi' X_t + a_t$. The log-likelihood is

$$\ell(\theta) = \sum_{t=1}^{T} \log f(Y_t \mid \text{past}; \theta) = -\frac{Tn}{2}\log(2\pi) - \frac{T}{2}\log|\Sigma| - \frac{1}{2}\sum_{t=1}^{T}(Y_t - \Pi'X_t)'\,\Sigma^{-1}(Y_t - \Pi'X_t)$$

Claim: OLS estimates equation by equation are good!!!

$$\hat\Pi'_{OLS} = \left[\sum_{t=1}^{T} Y_t X_t'\right]\left[\sum_{t=1}^{T} X_t X_t'\right]^{-1}, \qquad \hat\Pi_{MLE} = \hat\Pi_{OLS}$$
Proof:

$$\sum_{t=1}^{T}(Y_t - \Pi'X_t)'\,\Sigma^{-1}(Y_t - \Pi'X_t) = \sum_{t=1}^{T}\left(Y_t - \hat\Pi'_{OLS}X_t + (\hat\Pi_{OLS} - \Pi)'X_t\right)'\Sigma^{-1}\left(Y_t - \hat\Pi'_{OLS}X_t + (\hat\Pi_{OLS} - \Pi)'X_t\right)$$

$$= \sum_{t=1}^{T}\hat a_t'\,\Sigma^{-1}\hat a_t + 2\sum_{t}\hat a_t'\,\Sigma^{-1}(\hat\Pi_{OLS} - \Pi)'X_t + \sum_{t}X_t'(\hat\Pi_{OLS} - \Pi)\,\Sigma^{-1}(\hat\Pi_{OLS} - \Pi)'X_t$$

The cross term vanishes because the OLS residuals are orthogonal to the regressors, $\sum_t X_t \hat a_t' = 0$:

$$(*) \quad \sum_{t}\hat a_t'\,\Sigma^{-1}(\hat\Pi_{OLS} - \Pi)'X_t = \operatorname{tr}\left[\sum_{t}\hat a_t'\,\Sigma^{-1}(\hat\Pi_{OLS} - \Pi)'X_t\right] = \operatorname{tr}\left[\Sigma^{-1}(\hat\Pi_{OLS} - \Pi)'\sum_{t}X_t\hat a_t'\right] = 0$$

Therefore

$$\min_{\Pi}\sum_{t=1}^{T}(Y_t - \Pi'X_t)'\,\Sigma^{-1}(Y_t - \Pi'X_t) = \min_{\Pi}\left[\sum_{t}\hat a_t'\,\Sigma^{-1}\hat a_t + \sum_{t}X_t'(\hat\Pi_{OLS} - \Pi)\,\Sigma^{-1}(\hat\Pi_{OLS} - \Pi)'X_t\right]$$

Because $\Sigma$ is a positive definite matrix, $\Sigma^{-1}$ is positive definite, so the smallest value is achieved when $\Pi = \hat\Pi_{OLS}$. ∎
Maximum Likelihood of Σ
Evaluate the log-likelihood at $\hat\Pi$; then

$$\ell(\Sigma, \hat\Pi) = -\frac{Tn}{2}\log(2\pi) - \frac{T}{2}\log|\Sigma| - \frac{1}{2}\sum_{t=1}^{T}\hat a_t'\,\Sigma^{-1}\hat a_t$$

$$\frac{\partial \ell(\Sigma, \hat\Pi)}{\partial \Sigma^{-1}} = \frac{T}{2}\Sigma - \frac{1}{2}\sum_{t=1}^{T}\hat a_t \hat a_t' = 0 \ \Rightarrow\ \hat\Sigma = \frac{1}{T}\sum_{t=1}^{T}\hat a_t \hat a_t'$$

Diagonal elements: $\hat\sigma_i^2 = \frac{1}{T}\sum_{t=1}^{T}\hat a_{it}^2$. Off-diagonal elements: $\hat\sigma_{ij} = \frac{1}{T}\sum_{t=1}^{T}\hat a_{it}\hat a_{jt}$.
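A minimal sketch of equation-by-equation OLS together with the MLE of Σ, assuming the data sit in a (T × n) array `Y` (simulated here just to exercise the code):

```python
import numpy as np

def estimate_var_ols(Y, p):
    """OLS estimation of a VAR(p): regress Y_t on a constant and p lags.
    Returns Pi (shape (np+1, n)), the residuals, and the MLE of Sigma."""
    T, n = Y.shape
    # Regressor X_t = [1, Y_{t-1}', ..., Y_{t-p}'], stacked over t = p..T-1
    X = np.hstack([np.ones((T - p, 1))] +
                  [Y[p - k - 1 : T - k - 1] for k in range(p)])
    Yeff = Y[p:]                                    # effective sample
    Pi, *_ = np.linalg.lstsq(X, Yeff, rcond=None)   # = (X'X)^{-1} X'Y, per equation
    resid = Yeff - X @ Pi
    Sigma_hat = resid.T @ resid / resid.shape[0]    # (1/T) sum a_t a_t'
    return Pi, resid, Sigma_hat

Y = np.random.default_rng(0).standard_normal((200, 2))   # placeholder data
Pi, resid, Sigma_hat = estimate_var_ols(Y, p=2)
print("Pi:\n", Pi, "\nSigma_hat:\n", Sigma_hat)
```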
Testing Hypotheses in a VAR Model
Likelihood ratio test in a VAR. Evaluating the log-likelihood at the MLEs,

$$\ell(\hat\Sigma, \hat\Pi) = -\frac{Tn}{2}\log(2\pi) - \frac{T}{2}\log|\hat\Sigma| - \frac{1}{2}\sum_{t=1}^{T}\hat a_t'\,\hat\Sigma^{-1}\hat a_t$$

$$\frac{1}{2}\sum_{t=1}^{T}\hat a_t'\,\hat\Sigma^{-1}\hat a_t = \frac{1}{2}\sum_{t=1}^{T}\operatorname{trace}\left[\hat a_t'\,\hat\Sigma^{-1}\hat a_t\right] = \frac{1}{2}\operatorname{trace}\left[\hat\Sigma^{-1}\sum_{t=1}^{T}\hat a_t\hat a_t'\right] = \frac{1}{2}\operatorname{trace}\left[\hat\Sigma^{-1}\,T\hat\Sigma\right] = \frac{1}{2}\operatorname{trace}(T I_n) = \frac{Tn}{2}$$

$$\ell(\hat\Sigma, \hat\Pi) = -\frac{Tn}{2}\log(2\pi) - \frac{T}{2}\log|\hat\Sigma| - \frac{Tn}{2}$$

Testing the number of lags, $p_1 > p_0$:

$$H_0: \text{VAR}(p_0) \qquad H_1: \text{VAR}(p_1)$$

Under $H_0$, perform n OLS regressions of each variable on a constant and $p_0$ lags, obtaining $\hat\Pi_0$, $\hat\Sigma_0$ and

$$\ell_0^* = -\frac{Tn}{2}\log(2\pi) - \frac{T}{2}\log|\hat\Sigma_0| - \frac{Tn}{2}$$

Under $H_1$,

$$\ell_1^* = -\frac{Tn}{2}\log(2\pi) - \frac{T}{2}\log|\hat\Sigma_1| - \frac{Tn}{2}$$

$$LR = 2(\ell_1^* - \ell_0^*) = T\left(\log|\hat\Sigma_0| - \log|\hat\Sigma_1|\right)$$

$$LR \xrightarrow{d} \chi^2_m, \qquad m = \text{number of restrictions} = n^2(p_1 - p_0)$$

(each equation has $p_1 - p_0$ restrictions on each of the n variables, i.e. $n(p_1 - p_0)$ restrictions per equation).
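A minimal sketch of this lag-length test, reusing the hypothetical `estimate_var_ols` helper from the estimation sketch above (both models are fit on the same effective sample so the two likelihoods are comparable):

```python
import numpy as np
from scipy import stats

def lr_lag_test(Y, p0, p1):
    """LR test of H0: VAR(p0) against H1: VAR(p1), with p1 > p0."""
    n = Y.shape[1]
    _, _, Sigma1 = estimate_var_ols(Y, p1)              # H1 on the full sample
    _, _, Sigma0 = estimate_var_ols(Y[p1 - p0:], p0)    # H0, same effective sample
    T = Y.shape[0] - p1                                 # common effective sample size
    LR = T * (np.log(np.linalg.det(Sigma0)) - np.log(np.linalg.det(Sigma1)))
    m = n**2 * (p1 - p0)                                # number of restrictions
    return LR, m, stats.chi2.sf(LR, df=m)

Y = np.random.default_rng(0).standard_normal((200, 2))  # placeholder data
LR, m, pval = lr_lag_test(Y, p0=1, p1=4)
print(f"LR = {LR:.2f}, df = {m}, p-value = {pval:.3f}")
```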
In general, linear hypotheses can be tested directly as usual, and their asymptotic distribution follows from the next asymptotic result.
Let $\hat\pi_T = \operatorname{vec}(\hat\Pi_T)$ denote the $(nk \times 1)$ vector of coefficients (with $k = 1 + np$ parameters estimated per equation) resulting from OLS regressions of each of the elements of $Y_t$ on $X_t$ for a sample of size T:

$$\hat\pi_T = \begin{bmatrix} \hat\pi_{1T} \\ \vdots \\ \hat\pi_{nT} \end{bmatrix}, \qquad \text{where } \hat\pi_{iT} = \left[\sum_{t=1}^{T} X_t X_t'\right]^{-1}\left[\sum_{t=1}^{T} X_t y_{it}\right]$$

The asymptotic distribution of $\hat\pi$ is

$$\sqrt{T}\,(\hat\pi_T - \pi) \xrightarrow{d} N\left(0, \ \Sigma \otimes M^{-1}\right),$$

and for the coefficients of regression i,

$$\sqrt{T}\,(\hat\pi_{iT} - \pi_i) \xrightarrow{d} N\left(0, \ \sigma_i^2 M^{-1}\right), \qquad M = \operatorname{plim}\frac{1}{T}\sum_{t} X_t X_t'.$$
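As a sketch of how this result is used in practice, the standard errors for the coefficients of equation i can be estimated from $\hat\sigma_i^2 \left(\sum_t X_t X_t'\right)^{-1}$, again reusing the hypothetical `estimate_var_ols` helper:

```python
import numpy as np

def var_std_errors(Y, p):
    """Asymptotic standard errors, equation by equation:
    se(pi_i) = sqrt(diag(sigma_i^2 (X'X)^{-1}))."""
    T, n = Y.shape
    X = np.hstack([np.ones((T - p, 1))] +
                  [Y[p - k - 1 : T - k - 1] for k in range(p)])
    Pi, resid, Sigma_hat = estimate_var_ols(Y, p)
    XtX_inv = np.linalg.inv(X.T @ X)
    se = np.sqrt(np.outer(np.diag(XtX_inv), np.diag(Sigma_hat)))  # (k x n)
    return Pi, se

Y = np.random.default_rng(0).standard_normal((200, 2))  # placeholder data
Pi, se = var_std_errors(Y, p=2)
print("t-ratios:\n", Pi / se)
```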
Information Criteria in a Standard VAR(p)
• In the same way as in the univariate AR(p) models, information criteria (IC) can be used to choose the "right" number of lags in a VAR(p): choose the $\hat p$ that minimizes IC(p) for p = 1, ..., P (see the sketch below).

$$AIC = \ln|\hat\Sigma| + \frac{2(n^2 p + n)}{T}$$

$$SBC = \ln|\hat\Sigma| + \frac{(n^2 p + n)\ln(T)}{T}$$

• Consistency results similar to the ones obtained in the univariate world hold in the multivariate world. The only difference is that as the number of variables gets bigger, it becomes less likely that the AIC ends up overparametrizing (see Gonzalo and Pitarakis (2002), Journal of Time Series Analysis).
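A minimal sketch of IC-based lag selection, once more reusing the hypothetical `estimate_var_ols` helper; every lag order is fit on the same effective sample so the criteria are comparable:

```python
import numpy as np

def select_lag_ic(Y, max_p):
    """AIC and SBC for p = 1..max_p, computed on a common effective sample."""
    n = Y.shape[1]
    T = Y.shape[0] - max_p
    results = {}
    for p in range(1, max_p + 1):
        _, _, Sigma = estimate_var_ols(Y[max_p - p:], p)  # same sample for all p
        logdet = np.log(np.linalg.det(Sigma))
        aic = logdet + 2 * (n**2 * p + n) / T
        sbc = logdet + (n**2 * p + n) * np.log(T) / T
        results[p] = (aic, sbc)
    return results

Y = np.random.default_rng(0).standard_normal((200, 2))    # placeholder data
for p, (aic, sbc) in select_lag_ic(Y, max_p=6).items():
    print(f"p={p}: AIC={aic:.3f}  SBC={sbc:.3f}")
```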
Granger Causality
Granger (1969): "Investigating Causal Relations by Econometric Models and Cross-Spectral Methods," Econometrica, 37.
Consider two random variables $X_t$, $Y_t$ and two forecasts of $X_t$, s periods ahead:

$$\hat X_t^{(1)}(s) = E(X_{t+s} \mid X_t, X_{t-1}, \dots) \qquad \hat X_t^{(2)}(s) = E(X_{t+s} \mid X_t, X_{t-1}, \dots, Y_t, Y_{t-1}, \dots)$$

$$MSE(\hat X_t(s)) = E\left(X_{t+s} - \hat X_t(s)\right)^2$$

If $MSE(\hat X_t^{(1)}(s)) = MSE(\hat X_t^{(2)}(s))$ for all $s > 0$, then $Y_t$ does not Granger-cause $X_t$:
$Y_t$ is not linearly informative for forecasting $X_t$.
Test for Granger Causality
Assume a lag length of p:

$$X_t = c_1 + \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \dots + \alpha_p X_{t-p} + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + a_t$$

Estimate by OLS and test the following hypothesis:

$$H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0 \quad (Y_t \text{ does not Granger-cause } X_t)$$
$$H_1: \text{any } \beta_i \neq 0$$

With the unrestricted sum of squared residuals $RSS_1 = \sum_t \hat a_t^2$ and the restricted sum of squared residuals $RSS_2 = \sum_t \hat{\hat a}_t^2$, form

$$F = \frac{RSS_2 - RSS_1}{RSS_1 / (T - 2p - 1)}$$

• Under general conditions, $F \xrightarrow{d} \chi^2(p)$.
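A minimal self-contained sketch of the test, with simulated data in which y Granger-causes x by construction (so H0 should be rejected):

```python
import numpy as np
from scipy import stats

def granger_test(x, y, p):
    """Test H0: y does not Granger-cause x, using p lags of each variable."""
    T = len(x)
    lags = lambda z: np.column_stack([z[p - k - 1 : T - k - 1] for k in range(p)])
    const = np.ones((T - p, 1))
    X_unres = np.hstack([const, lags(x), lags(y)])   # unrestricted regressors
    X_res   = np.hstack([const, lags(x)])            # restricted: own lags only
    target = x[p:]
    rss = lambda X: np.sum((target - X @ np.linalg.lstsq(X, target, rcond=None)[0]) ** 2)
    RSS1, RSS2 = rss(X_unres), rss(X_res)
    stat = (RSS2 - RSS1) / (RSS1 / (T - 2 * p - 1))  # the slide's statistic
    return stat, stats.chi2.sf(stat, df=p)           # chi2(p) under H0

rng = np.random.default_rng(1)
y = rng.standard_normal(300)
x = np.zeros(300)
for t in range(1, 300):            # x depends on lagged y by construction
    x[t] = 0.3 * x[t - 1] + 0.5 * y[t - 1] + rng.standard_normal()
stat, pval = granger_test(x, y, p=2)
print(f"stat = {stat:.2f}, p-value = {pval:.4f}")    # expect a clear rejection
```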
Impulse Response Function (IRF)
Objective: the reaction of the system to a shock.

$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + a_t$$

If the system is stable,

$$Y_t = \mu + \Psi(L) a_t = \mu + a_t + \Psi_1 a_{t-1} + \Psi_2 a_{t-2} + \dots, \qquad \Psi(L) = [\Phi(L)]^{-1}$$

Redating at time $t + s$:

$$Y_{t+s} = \mu + a_{t+s} + \Psi_1 a_{t+s-1} + \Psi_2 a_{t+s-2} + \dots + \Psi_s a_t + \Psi_{s+1} a_{t-1} + \dots$$

$$\frac{\partial Y_{t+s}}{\partial a_t'} = \Psi_s = \left[\psi_{ij}^{(s)}\right]_{n \times n}, \qquad \frac{\partial y_{i,t+s}}{\partial a_{jt}} = \psi_{ij}^{(s)} \quad \text{(multipliers)}$$

Reaction of the i-th variable to a unit change in innovation j.
Impulse Response Function (cont.)
Impulse-response function: the response of $y_{i,t+s}$ to a one-time impulse in $y_{jt}$, with all other variables dated t or earlier held constant:

$$\frac{\partial y_{i,t+s}}{\partial a_{jt}} = \psi_{ij}^{(s)}$$

[Figure: the sequence of multipliers $\psi_{ij}^{(s)}$ plotted against the horizon s = 1, 2, 3, ...]
Example: IRF for a VAR(1)

$$\begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} a_{1t} \\ a_{2t} \end{bmatrix}, \qquad \Sigma_a = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}$$

Suppose the system is at rest, $y_{1t} = y_{2t} = 0$ for $t < 0$, and at $t = 0$ a shock $a_{10} = 0$, $a_{20} = 1$ hits ($y_{2t}$ increases by 1 unit); no more shocks occur afterwards. The reaction of the system (the impulse) is:

$$\begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

$$\begin{bmatrix} y_{11} \\ y_{21} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} \phi_{12} \\ \phi_{22} \end{bmatrix}$$

$$\begin{bmatrix} y_{12} \\ y_{22} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{11} \\ y_{21} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix}^2 \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

$$\vdots$$

$$\begin{bmatrix} y_{1s} \\ y_{2s} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix}^s \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \Phi_1^s \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
If you work with the MA representation, $\Psi(L) = [\Phi(L)]^{-1}$, the same multipliers appear directly:

$$\Psi_1 = \Phi_1, \quad \Psi_2 = \Phi_1^2, \quad \dots, \quad \Psi_s = \Phi_1^s$$
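For a general VAR(p), the $\Psi_s$ follow the recursion $\Psi_s = \Phi_1 \Psi_{s-1} + \dots + \Phi_p \Psi_{s-p}$ with $\Psi_0 = I_n$ and $\Psi_s = 0$ for $s < 0$. A minimal sketch (coefficients invented):

```python
import numpy as np

def ma_coefficients(Phis, horizon):
    """Psi_0..Psi_horizon from the VAR coefficient matrices, via the
    recursion Psi_s = sum_{k=1}^{min(s,p)} Phi_k @ Psi_{s-k}, Psi_0 = I."""
    n = Phis[0].shape[0]
    Psis = [np.eye(n)]
    for s in range(1, horizon + 1):
        Psis.append(sum(Phis[k - 1] @ Psis[s - k]
                        for k in range(1, min(s, len(Phis)) + 1)))
    return Psis

Phi1 = np.array([[0.5, 0.1], [0.0, 0.4]])   # hypothetical VAR(1)
Psis = ma_coefficients([Phi1], horizon=5)
# Psi_s[i, j] = response of variable i at horizon s to a unit innovation in j
print("response of y1 to a unit shock in a2:", [P[0, 1] for P in Psis])
```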
In this example, the variance-covariance matrix of the innovations is not diagonal, i.e. $\sigma_{12} \neq 0$: there is contemporaneous correlation between the shocks. The impulse

$$\begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

is then not very realistic, since a shock to $a_2$ would typically be accompanied by a nonzero shock to $a_1$. To avoid this problem, the variance-covariance matrix has to be diagonalized (the shocks have to be orthogonal), and here is where a serious problem appears.
Reminder: $\Sigma$ is a positive definite (symmetric) matrix, so there exists a non-singular Q such that $Q \Sigma Q' = I$.
Then, starting from the MA representation,

$$Y_t = \mu + \sum_{i=0}^{\infty} \Psi_i a_{t-i}, \qquad \Psi_0 = I_n$$

$$Y_t = \mu + \sum_{i=0}^{\infty} \Psi_i Q^{-1} Q a_{t-i}$$

Let us call $M_i = \Psi_i Q^{-1}$ and $w_t = Q a_t$, so that

$$Y_t = \mu + \sum_{i=0}^{\infty} M_i w_{t-i}$$

$$E[w_t w_t'] = E[Q a_t a_t' Q'] = Q E[a_t a_t'] Q' = Q \Sigma Q' = I_n$$

$w_t$ has components that are all uncorrelated and of unit variance.

$$\frac{\partial Y_{t+s}}{\partial w_t'} = M_s = \Psi_s Q^{-1} \qquad \text{(orthogonalized impulse-response function)}$$

Problem: Q is not unique.
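One common choice (the Choleski route discussed below) takes the lower-triangular factor $P$ with $\Sigma = PP'$ and sets $Q = P^{-1}$, so that $M_i = \Psi_i P$. A minimal sketch, reusing `ma_coefficients` from the IRF sketch above:

```python
import numpy as np

def orthogonalized_irf(Phis, Sigma, horizon):
    """Orthogonalized IRFs M_s = Psi_s @ P, with P the lower-triangular
    Cholesky factor of Sigma (one particular, non-unique choice of Q^{-1})."""
    P = np.linalg.cholesky(Sigma)        # Sigma = P P', so Q = P^{-1}
    return [Psi @ P for Psi in ma_coefficients(Phis, horizon)]

Phi1  = np.array([[0.5, 0.1], [0.0, 0.4]])   # hypothetical VAR(1)
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])   # correlated innovations
Ms = orthogonalized_irf([Phi1], Sigma, horizon=5)
# M_s[i, j] = response of variable i at horizon s to orthogonalized shock j
print("response of y1 to orthogonal shock w2:", [M[0, 1] for M in Ms])
```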
Variance Decomposition
Contribution of the j-th orthogonalized innovation to the MSE of the s-period-ahead forecast:

$$MSE(\hat Y_t(s)) = E\left[(Y_{t+s} - \hat Y_t(s))(Y_{t+s} - \hat Y_t(s))'\right]$$

$$e_t(s) = Y_{t+s} - \hat Y_t(s) = a_{t+s} + \Psi_1 a_{t+s-1} + \dots + \Psi_{s-1} a_{t+1}$$

$$E[e_t(s) e_t(s)'] = \Sigma_a + \Psi_1 \Sigma_a \Psi_1' + \dots + \Psi_{s-1} \Sigma_a \Psi_{s-1}'$$

Inserting $Q^{-1}Q$ and using $Q \Sigma_a Q' = I_n$:

$$MSE(s) = Q^{-1} Q \Sigma_a Q' Q^{-1\prime} + \Psi_1 Q^{-1} Q \Sigma_a Q' Q^{-1\prime} \Psi_1' + \dots + \Psi_{s-1} Q^{-1} Q \Sigma_a Q' Q^{-1\prime} \Psi_{s-1}'$$

$$= Q^{-1} Q^{-1\prime} + \Psi_1 Q^{-1} Q^{-1\prime} \Psi_1' + \dots + \Psi_{s-1} Q^{-1} Q^{-1\prime} \Psi_{s-1}'$$

$$= M_0 M_0' + M_1 M_1' + \dots + M_{s-1} M_{s-1}'$$

recalling that $M_i = \Psi_i Q^{-1}$, $M_0 = Q^{-1}$, and $\Psi_0 = I$. Summing the terms that involve column j of the $M_i$ gives the contribution of the j-th orthogonalized innovation to the MSE (do it for a two-variable VAR model).
Example: Variance Decomposition in a Two-Variable (y, x) VAR
• The s-step-ahead forecast error for variable y is:

$$y_{t+s} - E_t y_{t+s} = M_0(1,1)\,\varepsilon_{y,t+s} + M_1(1,1)\,\varepsilon_{y,t+s-1} + \dots + M_{s-1}(1,1)\,\varepsilon_{y,t+1} + M_0(1,2)\,\varepsilon_{x,t+s} + M_1(1,2)\,\varepsilon_{x,t+s-1} + \dots + M_{s-1}(1,2)\,\varepsilon_{x,t+1}$$

• Denote the variance of the s-step-ahead forecast error of $y_{t+s}$ by $\sigma_y(s)^2$:

$$\sigma_y(s)^2 = \sigma_y^2\left[M_0(1,1)^2 + M_1(1,1)^2 + \dots + M_{s-1}(1,1)^2\right] + \sigma_x^2\left[M_0(1,2)^2 + M_1(1,2)^2 + \dots + M_{s-1}(1,2)^2\right]$$

• The forecast error variance decompositions are proportions of $\sigma_y(s)^2$:

$$\text{due to shocks to } y: \quad \sigma_y^2\left[M_0(1,1)^2 + M_1(1,1)^2 + \dots + M_{s-1}(1,1)^2\right] / \sigma_y(s)^2$$

$$\text{due to shocks to } x: \quad \sigma_x^2\left[M_0(1,2)^2 + M_1(1,2)^2 + \dots + M_{s-1}(1,2)^2\right] / \sigma_y(s)^2$$
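A minimal sketch of the decomposition with unit-variance orthogonalized shocks (the $\sigma^2$ factors are then absorbed into the $M_i$ matrices), reusing `orthogonalized_irf` from above:

```python
import numpy as np

def fevd(Phis, Sigma, horizon):
    """Forecast error variance decomposition: share[s-1, i, j] is the fraction
    of variable i's s-step forecast error variance due to orthogonal shock j."""
    Ms = orthogonalized_irf(Phis, Sigma, horizon)
    n = Sigma.shape[0]
    shares = np.zeros((horizon, n, n))
    for s in range(1, horizon + 1):
        contrib = sum(M ** 2 for M in Ms[:s])       # elementwise squares, summed
        shares[s - 1] = contrib / contrib.sum(axis=1, keepdims=True)
    return shares

Phi1  = np.array([[0.5, 0.1], [0.0, 0.4]])
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
shares = fevd([Phi1], Sigma, horizon=8)
print("shares of y1's 8-step variance due to (w1, w2):", shares[-1, 0])
```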
Identification in a Standard VAR(1)
• Remember that we started with a structural VAR model and jumped to the reduced form or standard VAR for estimation purposes.
• Is it possible to recover the parameters of the structural VAR from the estimated parameters of the standard VAR? No!!
• There are 10 parameters in the bivariate structural VAR(1) and only 9 estimated parameters in the standard VAR(1).
• The VAR is underidentified.
• If one parameter in the structural VAR is restricted, the standard VAR is exactly identified.
• Sims (1980) suggests a recursive system to identify the model, setting $b_{21} = 0$:

$$\begin{bmatrix} 1 & b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$
Identification in a Standard VAR(1) (cont.)
• $b_{21} = 0$ implies

$$\begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$

$$\begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} \phi_{10} \\ \phi_{20} \end{bmatrix} + \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix}$$

• The parameters of the structural VAR can now be identified from the following 9 equations:

$$\phi_{10} = b_{10} - b_{12} b_{20} \qquad \phi_{20} = b_{20} \qquad \operatorname{var}(e_1) = \sigma_y^2 + b_{12}^2 \sigma_x^2$$

$$\phi_{11} = \gamma_{11} - b_{12}\gamma_{21} \qquad \phi_{21} = \gamma_{21} \qquad \operatorname{var}(e_2) = \sigma_x^2$$

$$\phi_{12} = \gamma_{12} - b_{12}\gamma_{22} \qquad \phi_{22} = \gamma_{22} \qquad \operatorname{cov}(e_1, e_2) = -b_{12}\sigma_x^2$$
Identification in a Standard VAR(1) (cont.)
• Note that both structural shocks can now be identified from the residuals of the standard VAR.
• $b_{21} = 0$ implies that y does not have a contemporaneous effect on x.
• This restriction manifests itself in that both $\varepsilon_{yt}$ and $\varepsilon_{xt}$ affect y contemporaneously, but only $\varepsilon_{xt}$ affects x contemporaneously.
• The residuals $e_{2t}$ are due to pure shocks to x.
• Decomposing the residuals of the standard VAR in this triangular fashion is called the Choleski decomposition.
• There are other methods used to identify models, like the Blanchard and Quah (1989) decomposition (it will be covered on the blackboard).
Criticisms of VAR
• A VAR model can be a good forecasting model, but in a sense it is an atheoretical model (as all reduced-form models are).
• To calculate the IRF, the ordering of the variables matters: remember that Q is not unique.
• Sensitive to the lag selection.
• Dimensionality problem.
• THINK of TWO MORE weak points of VAR modelling.