STAT 497 LECTURE NOTE 12

COINTEGRATION
Multivariate Unit Root Processes
• For many time series we generally cannot reject the null
hypothesis of a unit root. For example, log consumption and
log output are both non-stationary, but
log consumption - log output is stationary. This situation is
called cointegration. The practical problem is that under
cointegration the asymptotics change completely. Furthermore,
we rarely have enough data to tell definitively whether or not
the series are cointegrated.
• A univariate nonstationary time series $Y_t$ is said to be
integrated of order d, I(d), if its (d-1)th difference is
nonstationary but its d-th difference is stationary.
• If $Y_t$ is nonstationary but $\Delta Y_t = (1 - B)Y_t$ is
stationary, then $Y_t$ is integrated of order 1:
$Y_t \sim I(1)$ but $\Delta Y_t \sim I(0)$
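As a brief worked illustration (a standard textbook case, not taken from the slides), consider a random walk driven by white noise $a_t$ with variance $\sigma^2$ and starting value $Y_0 = 0$; both are assumptions made here for simplicity:

```latex
Y_t = Y_{t-1} + a_t = \sum_{s=1}^{t} a_s, \qquad
\operatorname{Var}(Y_t) = t\,\sigma^2, \qquad
\Delta Y_t = (1 - B)\,Y_t = a_t .
```

The variance of the level grows with $t$, so $Y_t$ is nonstationary, while its first difference is white noise. Under these assumptions $Y_t \sim I(1)$.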
• Many integrated time series are considered together
because they form equilibrium relationships.
– Short-term and long-term interest rates
– Income and consumption
• This leads to the concept of cointegration.
• The idea behind cointegration is that although a
multivariate time series is integrated, certain linear
transformations of the series may be stationary.
Granger Causality Tests
• According to Granger, causality can be further subdivided into long-run and short-run causality.
• This requires the use of error correction models or
VECMs, depending on the approach for determining
causality.
• Long-run causality is determined by the error
correction term: if it is significant, this is
evidence of long-run causality from the
explanatory variable to the dependent variable.
• Short-run causality is determined as before, with a test
on the joint significance of the lagged explanatory
variables, using an F-test or Wald test.
• Before the ECM can be formed, there first has to
be evidence of cointegration. Because cointegration
implies a significant error correction term, a
cointegration test can be viewed as an indirect test
of long-run causality.
• It is possible to have evidence of long-run
causality but not short-run causality, and vice
versa.
• In multivariate causality tests, testing long-run
causality between two variables is more problematic,
because it is impossible to tell which explanatory
variable is causing the causality through the error
correction term.
SPURIOUS REGRESSION
• If we regress a series y with a unit root on regressors
that also have unit roots, the usual t tests on the regression
coefficients show statistically significant relationships
even when none exists.
• The spurious regression problem can also appear with I(0)
series (see Granger, Hyung and Jeon (1998)). This tells us
that the problem is generated by using the
WRONG CRITICAL VALUES!
• In a spurious regression the errors are correlated
and the standard t-statistic is wrongly calculated
because the variance of the errors is not consistently
estimated. In the I(0) case the solution is:
$t_{\hat\beta} = \hat\beta / \hat\omega \to t$-distribution, where $\hat\omega = (\text{long-run variance of } \hat\beta)^{1/2}$
• How do we detect a spurious regression (between I(1)
series)?
By looking at the correlogram of the residuals and by testing
them for a unit root.
• How do we convert a spurious regression into a valid
regression?
By taking differences.
• Does this solve the spurious regression problem?
It solves the statistical problems but not the economic
interpretation of the regression. By taking differences we
lose information, and a regression involving growth rates
does not contain the same information as a regression
involving the levels of the variables.
Typical symptom: "High $R^2$, t-values, F-value, but low DW"
1. Egyptian infant mortality rate (Y), 1971-1990, annual data, on gross
aggregate income of American farmers (I) and total Honduran
money supply (M)
$\hat{Y}$ = 179.9 - .2952 I - .0439 M, $R^2$ = .918, DW = .4752, F = 95.17
(16.63) (-2.32) (-4.26)
Corr = .8858, -.9113, -.9445
2. US Export Index (Y), 1960-1990, annual data, on Australian males'
life expectancy (X)
$\hat{Y}$ = -2943. + 45.7974 X, $R^2$ = .916, DW = .3599, F = 315.2
(-16.70) (17.76)
Corr = .9570
3. US Defense Expenditure (Y), 1971-1990, annual data, on population
of South Africa (X)
$\hat{Y}$ = -368.99 + .0179 X, $R^2$ = .940, DW = .4069, F = 280.69
(-11.34) (16.75)
Corr = .9694
4. Total crime rates in the US (Y), 1971-1991, annual data, on life
expectancy in South Africa (X)
$\hat{Y}$ = -24569 + 628.9 X, $R^2$ = .811, DW = .5061, F = 81.72
(-6.03) (9.04)
Corr = .9008
5. Population of South Africa (Y), 1971-1990, annual data, on total
R&D expenditure in the US (X)
$\hat{Y}$ = 21698.7 + 111.58 X, $R^2$ = .974, DW = .3037, F = 696.96
(59.44) (26.40)
Corr = .9873
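The typical symptom in these examples (significant t-values with a low DW) can be reproduced by simulation. The following is a minimal sketch, assuming two independent Gaussian random walks of length 100 and 500 replications; the sample size, seed, and helper names (`ols_simple`, `random_walk`, `spurious_rejection_rate`) are illustrative choices, not part of the lecture.

```python
import math
import random


def ols_simple(x, y):
    """OLS of y on a constant and x; returns slope, its t-ratio, R^2 and DW."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    ssr = sum(e * e for e in resid)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    se_slope = math.sqrt(ssr / (n - 2) / sxx)
    # Durbin-Watson statistic of the residuals
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / ssr
    return slope, slope / se_slope, 1.0 - ssr / sst, dw


def random_walk(n_obs, rng):
    """Pure random walk with standard normal increments."""
    level, path = 0.0, []
    for _ in range(n_obs):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def spurious_rejection_rate(n_obs=100, n_rep=500, seed=0):
    """Share of replications with |t| > 1.96 and the average DW."""
    rng = random.Random(seed)
    rejections = 0
    dw_sum = 0.0
    for _ in range(n_rep):
        x = random_walk(n_obs, rng)
        y = random_walk(n_obs, rng)  # independent of x by construction
        _, t_stat, _, dw = ols_simple(x, y)
        rejections += abs(t_stat) > 1.96
        dw_sum += dw
    return rejections / n_rep, dw_sum / n_rep


rate, mean_dw = spurious_rejection_rate()
print(f"share of |t| > 1.96: {rate:.2f}, average DW: {mean_dw:.2f}")
```

Although the two series are unrelated by construction, the nominal 5% t-test rejects far more often than 5% of the time, and the Durbin-Watson statistic is typically well below 2.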
• Does a regression between two I(1) variables make sense?
Yes, if the regression errors are I(0).
• Is this possible?
David Hendry put the same question to Clive Granger some time
ago. Clive answered NO WAY! but also said that he would think
about it. On the plane trip back home to San Diego, Clive
thought about it and concluded that YES, IT IS POSSIBLE.
It is possible when both variables share the same source of
I(1)-ness (co-I(1)), when both variables move together in
the long run (co-move), ... when both variables are
COINTEGRATED!
COINTEGRATION
• An m×1 vector time series $Y_t$ is said to be
cointegrated of order (d, b), CI(d, b), where
$0 < b \le d$, if each of its component series $Y_{it}$ is
I(d) but some linear combination of the series
$\beta' Y_t$ is I(d-b) for some nonzero constant
vector $\beta'$.
• $\beta'$ is the cointegrating vector, or the long-run
parameter, and it is not unique.
• The most common case is d = b = 1.
• More generally, if the m×1 vector series $Y_t$
contains more than two components, each being
I(1), then there may exist k (< m) linearly
independent 1×m vectors $\beta_1', \beta_2', \ldots, \beta_k'$ such
that $\beta' Y_t$ is a stationary k×1 vector process,
where
$\beta' = [\beta_1, \ldots, \beta_k]'$
is a k×m cointegrating matrix.
• The number of linearly independent cointegrating
vectors is called the cointegrating rank.
⇒ $Y_t$ is cointegrated of rank k.
EXAMPLE
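• Consider the following system of processes
$$\begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix} = \begin{bmatrix} 0.5 & -1.0 \\ -0.25 & 0.5 \end{bmatrix} \begin{bmatrix} x_{1,t-1} \\ x_{2,t-1} \end{bmatrix} + \begin{bmatrix} a_{1,t} \\ a_{2,t} \end{bmatrix} - \begin{bmatrix} 0.2 & -0.4 \\ -0.1 & 0.2 \end{bmatrix} \begin{bmatrix} a_{1,t-1} \\ a_{2,t-1} \end{bmatrix}$$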
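A short derivation can make the cointegration in this example explicit. It is a sketch that assumes the system is $x_t = \Phi x_{t-1} + a_t - \Theta a_{t-1}$ with the coefficient signs written below; those signs are an assumption about the original slide, and the symbols $y_t$ and $b_t$ are introduced here only for illustration.

```latex
\Phi = \begin{bmatrix} 0.5 & -1.0 \\ -0.25 & 0.5 \end{bmatrix}, \qquad
\Theta = \begin{bmatrix} 0.2 & -0.4 \\ -0.1 & 0.2 \end{bmatrix}
% Stationary combination: the row vector (0.5, 1) annihilates both matrices
(0.5,\ 1)\,\Phi = (0,\ 0), \qquad (0.5,\ 1)\,\Theta = (0,\ 0)
\;\Longrightarrow\;
0.5\,x_{1,t} + x_{2,t} = 0.5\,a_{1,t} + a_{2,t}
% Nonstationary combination: the row vector (1, -2) carries the unit root
(1,\ -2)\,\Phi = (1,\ -2), \qquad (1,\ -2)\,\Theta = 0.4\,(1,\ -2)
\;\Longrightarrow\;
y_t = y_{t-1} + b_t - 0.4\,b_{t-1},
\quad y_t = x_{1,t} - 2x_{2,t},\quad b_t = a_{1,t} - 2a_{2,t}
```

Under these assumptions both components are I(1), driven by the single common trend $y_t$, while $0.5x_{1,t} + x_{2,t}$ is white noise. The vector $(0.5, 1)$ is therefore a cointegrating vector and the cointegrating rank is 1.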
VAR with Cointegration
• Let $Y_t$ be m×1. Suppose we estimate a VAR(p)
$Y_t = \Phi_1 Y_{t-1} + \cdots + \Phi_p Y_{t-p} + a_t$
or $Y_t = \Phi(B) Y_{t-1} + a_t$.
• Suppose we have a unit root. Then we can
write
$\Phi(B) = \Phi(1) + (1 - B)\Phi^*(B)$
• This is like a multivariate version of the
augmented Dickey-Fuller test
$Y_t = \rho Y_{t-1} + \sum_{i=1}^{p} \phi_i \Delta Y_{t-i} + a_t$.
• Rearranging the equation
$\Delta Y_t = (\Phi(1) - I) Y_{t-1} + \Phi^*(B) \Delta Y_{t-1} + a_t$
where rank$(\Phi(1) - I) < m$. There are two cases:
1. $\Phi(1) = I$: then we have m independent unit roots, so
there is no cointegration, and we should run the
VAR in differences.
2. $0 < \text{rank}(\Phi(1) - I) = k < m$: then we can write
$\Phi(1) - I = \alpha\beta'$, where $\alpha$ and $\beta$ are m×k. The equation
becomes:
$\Delta Y_t = \alpha\beta' Y_{t-1} + \Phi^*(B) \Delta Y_{t-1} + a_t$.
This is called a vector error correction model (VECM).
• Note that if you run OLS in differences when the series are
cointegrated, the model is misspecified and the results will
be biased. What can you do?
(a) If you know the location of the unit roots and the
cointegration relations, then you can run the VECM by
doing OLS of $\Delta Y_t$ on lags of $\Delta Y$ and $\beta' Y_{t-1}$.
(b) If you know nothing, then you can either (i) run OLS in
levels, or (ii) test (many times) to estimate the cointegrating
relations, and then run the VECM. The problem with approach
(ii) is that you are testing many times and estimating the
cointegrating relationships. This leads to poor finite-sample
properties.
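Option (a) can be sketched in a few lines. The example below assumes a bivariate system with a known cointegrating vector $\beta = (1, -1)'$, adjustment coefficients $\alpha = (-0.5, 0)'$, and no lagged differences; these values, the seed, and the helper names are illustrative assumptions rather than anything specified in the lecture.

```python
import random


def simulate_vecm(n_obs=500, alpha1=-0.5, seed=1):
    """Simulate dY_t = alpha * beta'Y_{t-1} + a_t with beta = (1, -1)', alpha = (alpha1, 0)'."""
    rng = random.Random(seed)
    y1, y2 = [0.0], [0.0]
    for _ in range(n_obs):
        z_prev = y1[-1] - y2[-1]  # error correction term beta'Y_{t-1}
        new_y1 = y1[-1] + alpha1 * z_prev + rng.gauss(0.0, 1.0)
        new_y2 = y2[-1] + rng.gauss(0.0, 1.0)  # y2 does not adjust (alpha2 = 0)
        y1.append(new_y1)
        y2.append(new_y2)
    return y1, y2


def ols_no_intercept(x, y):
    """Slope of the regression of y on x without a constant."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


y1, y2 = simulate_vecm()
z_lag = [y1[t - 1] - y2[t - 1] for t in range(1, len(y1))]
dy1 = [y1[t] - y1[t - 1] for t in range(1, len(y1))]
dy2 = [y2[t] - y2[t - 1] for t in range(1, len(y2))]

# Equation-by-equation OLS of each differenced series on the known beta'Y_{t-1}
alpha_hat = (ols_no_intercept(z_lag, dy1), ols_no_intercept(z_lag, dy2))
print("estimated adjustment coefficients:", alpha_hat)
```

Because the cointegrating vector is treated as known, the error correction term is an ordinary stationary regressor and each equation can be estimated by OLS separately.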
Residual Based Tests of the Null of No
Cointegration
• Procedures designed to distinguish a system
without cointegration from a system with at least
one cointegrating relationship; they do not
estimate the number of cointegrating vectors (the
k). Tests are conditional on a pretest for unit roots
in each of the variables.
• When the cointegration vector is known:
construct the hypothesized linear combination
that is stationary, treat it as data, and apply a
Dickey-Fuller unit root test to that linear
combination. The null hypothesis is that there is a
unit root, or no cointegration.
• When the cointegration vector is not known: Assume
that, if there exists a cointegrating relation, the
coefficient on Y1t is nonzero, allowing us to express the
“static regression equation” as
$Y_{1t} = \beta' Y_{2t} + u_t$
• You can apply a unit root test to the estimated OLS
residual from estimation of the above equation, but
– Include a constant in the static regression if the alternative
allows for a nonzero mean in ut
– Include a trend in the static regression if the alternative is
stochastic cointegration, i.e., a nonzero trend for A’Yt.
• The first step in testing cointegration is to test
the null hypothesis of a unit root in each
component series Yit individually using the
univariate unit root tests.
• If the hypothesis is not rejected, then the next
step is to test cointegration among the
components, i.e., to test whether $\beta' Y_t$ is
stationary.
• In practice, the cointegration vector is
unknown. One way to test for the existence of
cointegration is the regression method
(Engle & Granger, 1986, 1987).
• If $Y_t = (Y_{1t}, Y_{2t}, \ldots, Y_{mt})'$ is cointegrated, $\beta' Y_t$ is
stationary, where $\beta' = (c_1, c_2, \ldots, c_m)$. Then
$(1/c_1)\beta'$ is also a cointegrating vector, where
$c_1 \ne 0$.
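• Consider the regression model for $Y_{1t}$
$Y_{1t} = \phi_1 Y_{2t} + \cdots + \phi_{m-1} Y_{mt} + \varepsilon_t$
and check whether $\varepsilon_t$ is I(1) or I(0).
• If $\varepsilon_t \sim I(1)$, then $Y_t$ is not cointegrated.
• If $\varepsilon_t \sim I(0)$, then $Y_t$ is cointegrated with the
normalized cointegrating vector
$\beta' = (1, -\phi_1, \ldots, -\phi_{m-1})$.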
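• In testing the error series for nonstationarity:
– Calculate the OLS estimate $\hat\beta' = (1, -\hat\phi_1, \ldots, -\hat\phi_{m-1})$.
– Use the residual series $\hat\varepsilon_t$ for the test, using the
standard ADF or PP test.
– Asymptotically, $\hat\phi_i \sim$ t-distribution if $\varepsilon_t \sim I(0)$.
– $H_0: \phi = 1$ vs $H_1: \phi < 1$ for the model
$\varepsilon_t = \phi\,\varepsilon_{t-1} + \sum_{j=1}^{p-1} \varphi_j \Delta\varepsilon_{t-j} + a_t$
– Equivalently, $H_0: \lambda = 0$ vs $H_1: \lambda < 0$ for the model
$\Delta\varepsilon_t = \lambda\,\varepsilon_{t-1} + \sum_{j=1}^{p-1} \varphi_j \Delta\varepsilon_{t-j} + a_t$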
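• t-statistic:
$T = \hat\lambda / s_{\hat\lambda}$
where $\hat\lambda$ is the estimated coefficient on $\varepsilon_{t-1}$ in the test regression and $s_{\hat\lambda}$ is its standard error.
• The critical values are obtained by simulation
(Engle & Granger, 1987).

Level of significance     1%       5%
p = 1                    -4.07    -3.37
p > 1                    -3.73    -3.17

• If T < critical value, reject $H_0$ ⇒ cointegration exists.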
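The residual-based procedure can be sketched end to end for the bivariate case. The example assumes simulated data in which $Y_{2t}$ is a random walk and $Y_{1t} = 2Y_{2t}$ plus white noise, a static regression with a constant, a test regression with p = 1 (no lagged differences), and the 5% critical value for p = 1 taken as -3.37. The data-generating process, seed, and function names are illustrative assumptions.

```python
import math
import random


def static_regression_residuals(y1, y2):
    """OLS of y1 on a constant and y2; returns the slope and the residuals."""
    n = len(y1)
    mean1, mean2 = sum(y1) / n, sum(y2) / n
    slope = (sum((b - mean2) * (a - mean1) for a, b in zip(y1, y2))
             / sum((b - mean2) ** 2 for b in y2))
    intercept = mean1 - slope * mean2
    return slope, [a - intercept - slope * b for a, b in zip(y1, y2)]


def df_t_statistic(resid):
    """t-ratio of lambda in  d e_t = lambda * e_{t-1} + a_t  (p = 1 case)."""
    lagged = resid[:-1]
    diffs = [resid[t] - resid[t - 1] for t in range(1, len(resid))]
    sxx = sum(e * e for e in lagged)
    lam = sum(e * d for e, d in zip(lagged, diffs)) / sxx
    ssr = sum((d - lam * e) ** 2 for e, d in zip(lagged, diffs))
    s2 = ssr / (len(diffs) - 1)  # residual variance with one estimated parameter
    return lam / math.sqrt(s2 / sxx)


rng = random.Random(42)
y2, level = [], 0.0
for _ in range(200):
    level += rng.gauss(0.0, 1.0)
    y2.append(level)                                # I(1) regressor
y1 = [2.0 * v + rng.gauss(0.0, 1.0) for v in y2]    # cointegrated with y2

slope, resid = static_regression_residuals(y1, y2)
t_stat = df_t_statistic(resid)
CRITICAL_5PCT = -3.37  # p = 1 value, assumed from the table of critical values
print(f"static slope: {slope:.3f}, T = {t_stat:.2f}, "
      f"reject H0 of no cointegration: {t_stat < CRITICAL_5PCT}")
```

The statistic is compared with the simulated Engle-Granger values rather than the usual Dickey-Fuller table, because the residuals come from an estimated regression.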
The Johansen Trace and Maximal
Eigenvalue Tests
• One of the best-known tests of whether the variables
are cointegrated is the Johansen trace test. The
Johansen test is used to test for the existence of
cointegration and is based on estimating the ECM by
maximum likelihood, under various assumptions about
the trend or intercept parameters and the number k
of cointegrating vectors, and then conducting
likelihood ratio tests.
• Assuming that the ECM errors are independent $N_m[0, \Sigma]$
distributed, and given the cointegrating restrictions on the
trend or intercept parameters, the maximum likelihood $L_{max}(k)$
is a function of the cointegration rank k.
• The trace test is based on the log-likelihood ratio
$\ln[L_{max}(k)/L_{max}(m)]$, and is conducted sequentially for
k = m-1, ..., 1, 0. The name comes from the fact that the test
statistics involved are the trace (the sum of the diagonal
elements) of a diagonal matrix of generalized eigenvalues.
This test examines the null hypothesis that the cointegration
rank is less than or equal to k, against the alternative that the
cointegration rank is greater than k. If the trace statistic is
greater than the critical value for a certain rank, then the null
hypothesis that the cointegration rank is at most k is rejected.
• Consider a non-stationary cointegrated VAR(p)
model
$(I - \Phi_1 B - \cdots - \Phi_p B^p)\, x_t = a_t$
where the $a_t$ are normally distributed with mean 0
and covariance matrix $\Sigma$. In a series of influential
papers, Johansen (1988, 1991) and Johansen and
Juselius (1990) proposed practical full maximum
likelihood estimation and testing approaches
based on the error correction representation
(ECM).
• Consider the ECM
$\Delta x_t = \Gamma_0 d_t + \sum_{j=1}^{p-1} \Gamma_j \Delta x_{t-j} + \Pi x_{t-p} + a_t$
where $\Delta x_t = x_t - x_{t-1}$, $d_t$ is a vector of deterministic
variables, such as a constant and seasonal dummy
variables, $\Gamma_j = -I + \Phi_1 + \cdots + \Phi_j$, j = 1, ..., p-1,
are m×m, $\Pi = \gamma A'$, A and $\gamma$ are m×k parameter
matrices, the $a_t$ are i.i.d. $N_m(0, \Sigma)$ errors, and
$\det(I - \sum_{j=1}^{p-1} \Gamma_j B^j)$ has all of its roots outside the unit
circle.
• This ECM is based on the Engle-Granger (1987) error
correction representation theorem for cointegrated
systems, and the asymptotic inference involved is related to
the work of Sims, Stock, and Watson (1990).
• By step-wise concentrating all the parameter matrices in
the likelihood function out except for the matrix A,
Johansen shows that the maximum likelihood estimator of
A can be derived as the solution of a generalized
eigenvalue problem. Likelihood ratio tests of hypotheses
about the number of cointegrating vectors can then be
based on these eigenvalues. Moreover, Johansen (1988)
also proposes likelihood ratio tests for linear restrictions on
these cointegrating vectors.
• The Johansen test for the existence of cointegration is
based on maximum likelihood estimation of the above ECM
and is used to test the hypothesis
$H_0: \text{Rank}(\Pi) \le k$, where k is less than m. This formulation
shows that the I(1) models form a nested sequence of models
$H(0) \subset \cdots \subset H(k) \subset \cdots \subset H(m)$
where H(m) is the unrestricted VAR model or I(0) model,
and H(0) corresponds to the restriction $\Pi = 0$, which is the
VAR model in differences. Since $\Pi = \gamma A'$, it is equivalent
to test that A and $\gamma$ are of full column rank k, the number of
independent cointegrating vectors that form the matrix A.
The test has been named the Johansen trace test because
the likelihood ratio test statistic is the trace of a diagonal
matrix of generalized eigenvalues.
Sequential tests:
i. $H_0$: k = 0 (at most zero cointegrating vectors)
cannot be rejected → stop → k = 0
rejected → next test
ii. $H_0$: k ≤ 1 (at most one cointegrating vector)
cannot be rejected → stop → k = 1
rejected → next test
iii. $H_0$: k ≤ 2 (at most two cointegrating vectors)
cannot be rejected → stop → k = 2
rejected → next test
(i) Rank k = m: all variables in x are I(0); not an
interesting case to start with.
(ii) Rank k = 0: there are no linear combinations of
x that are I(0), no cointegration exists, and $\Pi$ is
full of zeros. Model the differenced series.
(iii) Rank k ≤ (m-1): up to (m-1) cointegration
relationships $A'x_{t-p}$,
i.e. k ≤ (m-1) rows of $\Pi$ form k linearly
independent combinations of the variables in x, each
of which is I(0); alternatively, there are (m-k)
nonstationary vectors forming I(1) stochastic trends.
• Under some regularity conditions, we can
write the cointegrated process as an Error
Correction Model (ECM):
$\Delta x_t = \Gamma_1 \Delta x_{t-1} + \cdots + \Gamma_{p-1} \Delta x_{t-p+1} + \Pi x_{t-p} + a_t$
where $\Delta$ is the difference operator and the $a_t$'s
are i.i.d. $N(0, \Sigma)$.
• We can write this ECM as
$Z_{0t} = \Gamma Z_{1t} + \Pi Z_{pt} + a_t$
where $Z_{0t} = \Delta x_t$, $Z_{1t} = (\Delta x_{t-1}', \ldots, \Delta x_{t-p+1}')'$, $Z_{pt} = x_{t-p}$, $\Gamma = (\Gamma_1, \ldots, \Gamma_{p-1})$.
• The likelihood ratio statistic for the hypothesis
$H_0: \Pi = \gamma A'$ is given by
$-2\ln Q = -n \sum_{i=k+1}^{m} \ln(1 - \hat\lambda_i)$
where $\hat\lambda_i$ denotes the eigenvalues of $S_{p0} S_{00}^{-1} S_{0p}$ with respect to $S_{pp}$,
ordered as $\hat\lambda_1 > \cdots > \hat\lambda_m > 0$.
where
$S_{ij} = M_{ij} - M_{i1} M_{11}^{-1} M_{1j}$,
$M_{ij} = n^{-1} \sum_{t=1}^{n} Z_{it} Z_{jt}'$; i, j = 0, 1, p.
• If the test statistic is greater than the critical
value for rank k, then the null hypothesis that
the cointegration rank is at most k is rejected.
• The statistic $-2\ln Q$ has the following limiting
distribution, which can be expressed in terms
of an (m-k)-dimensional Brownian motion Y as
$\operatorname{tr}\left\{ \int_0^1 (dY)\,Y' \left[ \int_0^1 Y Y'\,dt \right]^{-1} \int_0^1 Y\,(dY)' \right\}$
• The percentiles of the asymptotic distribution
for the trace statistic are tabulated in
Johansen (1988, Table 1) using simulation
analysis.
• An alternative LR statistic, given by
$-2\ln Q = -n \ln(1 - \hat\lambda_{k+1})$
and called the maximal eigenvalue statistic,
examines the null hypothesis of k
cointegrating vectors versus the alternative of
k+1 cointegrating vectors. The asymptotic
distribution of this statistic is given by the
maximum eigenvalue of the stochastic matrix
$\int_0^1 (dY)\,Y' \left[ \int_0^1 Y Y'\,dt \right]^{-1} \int_0^1 Y\,(dY)'$
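Once the ordered eigenvalues $\hat\lambda_1 > \cdots > \hat\lambda_m$ are available, both statistics are simple to compute: the trace statistic is $-n \sum_{i=k+1}^{m} \ln(1 - \hat\lambda_i)$ and the maximal eigenvalue statistic is $-n \ln(1 - \hat\lambda_{k+1})$. The sketch below uses hypothetical eigenvalues and a hypothetical sample size purely for illustration; the function names are not from any library.

```python
import math


def trace_statistic(eigenvalues, n_obs, k):
    """-n * sum_{i=k+1}^{m} ln(1 - lambda_i) for H0: rank <= k."""
    ordered = sorted(eigenvalues, reverse=True)
    return -n_obs * sum(math.log(1.0 - lam) for lam in ordered[k:])


def max_eigen_statistic(eigenvalues, n_obs, k):
    """-n * ln(1 - lambda_{k+1}) for H0: rank = k vs H1: rank = k + 1."""
    ordered = sorted(eigenvalues, reverse=True)
    return -n_obs * math.log(1.0 - ordered[k])


# Hypothetical eigenvalues and sample size, for illustration only
eigs = [0.30, 0.10, 0.02]
n = 100
for k in range(len(eigs)):
    print(k,
          round(trace_statistic(eigs, n, k), 3),
          round(max_eigen_statistic(eigs, n, k), 3))
```

By construction the trace statistic for rank k is the sum of the maximal eigenvalue statistics for ranks k, k+1, ..., m-1, so the two coincide when k = m-1.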
Analysis of U.S. Economic
Variables (From SAS Online Doc)
• Consider the following four-dimensional
system of U.S. economic variables. Quarterly
data for the years 1954 to 1987 are used
(Lütkepohl 1993, Table E.3.). The following
statements plot the series and proceed with
the VARMAX procedure.
SAS Code
symbol1 v=none height=1 c=black;
symbol2 v=none height=1 c=black;
title 'Analysis of U.S. Economic Variables';
data us_money;
date=intnx( 'qtr', '01jan54'd, _n_-1 );
format date yyq. ;
input y1 y2 y3 y4 @@;
y1=log(y1);
y2=log(y2);
label y1='log(real money stock M1)' y2='log(GNP in bil.
of 1982 dollars)' y3='Discount rate on 91-day T-bills'
y4='Yield on 20-year Treasury bonds';
datalines;
... data lines omitted ... ;
legend1 across=1 frame label=none;
proc gplot data=us_money;
symbol1 i = join l = 1;
symbol2 i = join l = 2;
axis2 label = (a=-90 r=90 " ");
plot y1 * date = 1 y2 * date = 2 / overlay vaxis=axis2
legend=legend1;
run;
proc gplot data=us_money;
symbol1 i = join l = 1;
symbol2 i = join l = 2;
axis2 label = (a=-90 r=90 " ");
plot y3 * date = 1 y4 * date = 2 / overlay vaxis=axis2
legend=legend1;
run;
proc varmax data=us_money;
id date interval=qtr;
model y1-y4 / p=2 lagmax=6 dftest print=(iarr(3))
cointtest=(johansen=(iorder=2)) ecm=(rank=1 normalize=y1);
cointeg rank=1 normalize=y1 exogeneity;
run;
SAS Output
• This example performs the Dickey-Fuller test
for stationarity, the Johansen cointegration rank
test for integrated order 2, and the exogeneity test.
A VECM(2) is fitted to the data. From the outputs
shown below, you can see that the series have
unit roots and are cointegrated of rank 1 with
integrated order 1. The fitted VECM(2) is given
as
[Equation: fitted VECM(2), shown on the original slide]
Dickey-Fuller Unit Root Tests

Variable  Type            Rho   Pr < Rho     Tau   Pr < Tau
y1        Zero Mean      0.05     0.6934    1.14     0.9343
          Single Mean   -2.97     0.6572   -0.76     0.8260
          Trend         -5.91     0.7454   -1.34     0.8725
y2        Zero Mean      0.13     0.7124    5.14     0.9999
          Single Mean   -0.43     0.9309   -0.79     0.8176
          Trend         -9.21     0.4787   -2.16     0.5063
y3        Zero Mean     -1.28     0.4255   -0.69     0.4182
          Single Mean   -8.86     0.1700   -2.27     0.1842
          Trend        -18.97     0.0742   -2.86     0.1803
y4        Zero Mean      0.40     0.7803    0.45     0.8100
          Single Mean   -2.79     0.6790   -1.29     0.6328
          Trend        -12.12     0.2923   -2.33     0.4170
Cointegration Rank Test for I(2)

r\k-r-s             4          3          2         1   Trace of I(1)   5% CV of I(1)
0           384.60903  214.37904  107.93782  37.02523         55.9633           47.21
1                      219.62395   89.21508  27.32609         20.6542           29.38
2                                  73.61779  22.13279          2.6477           15.34
3                                            38.29435          0.0149            3.84
5% CV I(2)   47.21000   29.38000   15.34000   3.84000
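The sequential testing rule can be applied directly to the "Trace of I(1)" values and 5% critical values reported in this output. A minimal sketch follows; `select_rank` is a hypothetical helper written for illustration, not a SAS or library function.

```python
def select_rank(trace_stats, critical_values):
    """Return the first rank k whose null (rank <= k) is not rejected."""
    for k, (stat, cv) in enumerate(zip(trace_stats, critical_values)):
        if stat <= cv:
            return k  # cannot reject H0: rank <= k, so stop here
    return len(trace_stats)  # every null rejected: full rank


# Values read from the SAS output for r = 0, 1, 2, 3
trace_i1 = [55.9633, 20.6542, 2.6477, 0.0149]
cv_5pct = [47.21, 29.38, 15.34, 3.84]

rank = select_rank(trace_i1, cv_5pct)
print("selected cointegration rank:", rank)
```

The null of rank 0 is rejected (55.9633 > 47.21), while the null of rank at most 1 is not (20.6542 < 29.38), so the procedure stops at rank 1. This matches the rank=1 setting used in the VARMAX statements.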
Long-Run Parameter Beta Estimates When RANK=1

Variable          1
y1          1.00000
y2         -0.46458
y3         14.51619
y4         -9.35520

Adjustment Coefficient Alpha Estimates When RANK=1

Variable          1
y1         -0.01396
y2         -0.02811
y3         -0.00215
y4          0.00510
Diagnostic Checks
Schematic Representation of Cross Correlations of Residuals

Variable/Lag   0     1     2     3     4     5     6
y1             ++..  ....  ++..  ....  +...  ..--  ....
y2             ++++  ....  ....  ....  ....  ....  ....
y3             .+++  ....  +.-.  ..++  -...  ....  ....
y4             .+++  ....  ....  ..+.  ....  ....  ....
+ is > 2*std error, - is < -2*std error, . is between

Portmanteau Test for Cross Correlations of Residuals

Up To Lag   DF   Chi-Square   Pr > ChiSq
3           16        53.90       <.0001
4           32        74.03       <.0001
5           48       103.08       <.0001
6           64       116.94       <.0001
Univariate Model ANOVA Diagnostics

Variable   R-Square   Standard Deviation   F Value   Pr > F
y1           0.6754              0.00712     32.51   <.0001
y2           0.3070              0.00843      6.92   <.0001
y3           0.1328              0.00807      2.39   0.0196
y4           0.0831              0.00403      1.42   0.1963

Univariate Model White Noise Diagnostics

                            Normality                  ARCH
Variable   Durbin Watson   Chi-Square   Pr > ChiSq   F Value   Pr > F
y1               2.13418         7.19       0.0275      1.62   0.2053
y2               2.04003         1.20       0.5483      1.23   0.2697
y3               1.86892       253.76       <.0001      1.78   0.1847
y4               1.98440       105.21       <.0001     21.01   <.0001
Univariate Model AR Diagnostics

                 AR1                AR2                AR3                AR4
Variable   F Value   Pr > F   F Value   Pr > F   F Value   Pr > F   F Value   Pr > F
y1            0.68   0.4126      2.98   0.0542      2.01   0.1154      2.48   0.0473
y2            0.05   0.8185      0.12   0.8842      0.41   0.7453      0.30   0.8762
y3            0.56   0.4547      2.86   0.0610      4.83   0.0032      3.71   0.0069
y4            0.01   0.9340      0.16   0.8559      1.21   0.3103      0.95   0.4358
Testing Weak Exogeneity of Each Variable

Variable   DF   Chi-Square   Pr > ChiSq
y1          1         6.55       0.0105
y2          1        12.54       0.0004
y3          1         0.09       0.7695
y4          1         1.81       0.1786

The test asks whether each variable is weakly exogenous with respect to
the other variables. At the 5% level, weak exogeneity is rejected for y1
(with respect to y2, y3, and y4) and for y2 (with respect to y1, y3, and
y4); it is not rejected for y3 or y4.
If a variable can be taken as "given" without losing information
for the purpose of statistical inference, it is called weakly exogenous.
Weak exogeneity ⇒ long-run noncausality