# VECM

• First we test to see if the variables are stationary, I(0). If not, they are assumed to have a unit root and to be I(1).
• If a set of variables are all I(1) they should not be
estimated using ordinary regression analysis, but
between them there may be one or more
equilibrium relationships. We can both estimate
how many and what they are (called
cointegrating vectors) using Johansen’s
technique.
• If a set of variables are found to have one or
more cointegrating vectors then a suitable
estimation technique is a VECM (Vector Error
Correction Model) which adjusts to both short
run changes in variables and deviations from
equilibrium.
• In what follows we work back to front: starting with the VECMs, then Johansen’s technique, then stationarity.
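For reference, a VECM in K variables with r cointegrating vectors can be written in standard notation (this is the general form, not output from Stata):

```latex
\Delta y_t = \alpha \beta' y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i} + v + \epsilon_t
```

Here β′y(t−1) measures the deviations from the long-run equilibria (the cointegrating vectors found by Johansen’s technique), α contains the speed-of-adjustment coefficients, and the Γi capture the short-run dynamics.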
• We have data on monthly unemployment rates in Indiana, Illinois, Kentucky, and Missouri from January 1978 through December 2003. We suspect that factor mobility will keep the unemployment rates in equilibrium. The following graph plots the data.
use http://www.stata-press.com/data/r11/urates, clear
line missouri indiana kentucky illinois t

Note the form of the command that draws the line graph: first line, then the variables to be plotted, and finally t, the time variable against which they are all plotted.
For further information, type help line.
[Figure: line graph of the missouri, indiana, kentucky, and illinois unemployment rates (vertical axis 2–12) plotted against t, 1980m1–2005m1]
• The graph shows that although the series do appear to move together, the relationship is not constant: there are periods when Indiana has the highest rate and others when Indiana has the lowest rate.
• Although the Kentucky rate moves closely with the other series for
most of the sample, there is a period in the mid-1980s when the
unemployment rate in Kentucky does not fall at the same rate as
the other series.
• We will model the series with two cointegrating equations and no
linear or quadratic time trends in the original series.
• For now we use the noetable option to suppress displaying the
short-run estimation table.
vec missouri indiana kentucky illinois, trend(rconstant)
rank(2) lags(4) noetable
    Vector error-correction model

    Sample:  1978m5 - 2003m12                No. of obs      =       308
                                             AIC             = -2.306048
    Log likelihood =  417.1314               HQIC            = -2.005818
    Det(Sigma_ml)  =  7.83e-07               SBIC            = -1.555184

    Cointegrating equations
    Equation           Parms    chi2       P>chi2
    _ce1                   2    133.3885   0.0000
    _ce2                   2    195.6324   0.0000

    Johansen normalization restrictions imposed
              beta |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    _ce1
          missouri |         1          .        .       .            .           .
           indiana | (omitted)
          kentucky |  .3493902   .2005537     1.74   0.081    -.0436879    .7424683
          illinois | -1.135152   .2069063    -5.49   0.000    -1.540681   -.7296235
             _cons | -.3880707   .4974323    -0.78   0.435     -1.36302    .5868787
    _ce2
          missouri | -1.11e-16          .        .       .            .           .
           indiana |         1          .        .       .            .           .
          kentucky |  .2059473   .2718678     0.76   0.449    -.3269038    .7387985
          illinois |  -1.51962   .2804792    -5.42   0.000    -2.069349   -.9698907
             _cons |   2.92857   .6743122     4.34   0.000     1.606942    4.250197
• Except for the coefficients on kentucky in the
two cointegrating equations and the constant
term in the first, all the parameters are
significant at the 5% level.
• We can refit the model with the Johansen
normalization and the overidentifying
constraint that the coefficient on kentucky in
the second cointegrating equation is zero.
constraint 1 [_ce1]missouri = 1
constraint 2 [_ce1]indiana = 0
constraint 3 [_ce2]missouri = 0
constraint 4 [_ce2]indiana = 1
constraint 5 [_ce2]kentucky = 0
vec missouri indiana kentucky illinois,
trend(rconstant) rank(2) lags(4) noetable
bconstraints(1/5)
constraint 1 [_ce1]missouri = 1
• In constraint 1, [_ce1] tells us which equation the constraint applies to, and missouri = 1 sets the coefficient on missouri in that equation to 1.
              beta |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    _ce1
          missouri |         1          .        .       .            .           .
           indiana | (omitted)
          kentucky |  .2521685   .1649653     1.53   0.126    -.0711576    .5754946
          illinois | -1.037453   .1734165    -5.98   0.000    -1.377343   -.6975626
             _cons | -.3891102   .4726968    -0.82   0.410    -1.315579    .5373586
    _ce2
          missouri | (omitted)
           indiana |         1          .        .       .            .           .
          kentucky | (omitted)
          illinois | -1.314265   .0907071   -14.49   0.000    -1.492048   -1.136483
             _cons |  2.937016   .6448924     4.55   0.000      1.67305    4.200982
    LR test of identifying restrictions:  chi2(1) = .3139    Prob > chi2 = 0.575
The test of the overidentifying restriction does not reject the null hypothesis that the restriction
is valid, and the p-value on the coefficient on kentucky in the first cointegrating equation
indicates that it is not significant. We will leave the variable in the model and attribute the lack
of significance to whatever caused the kentucky series to temporarily rise above the others from
1985 until 1990, though we could instead consider removing kentucky from the model.
• Next, we look at the estimates of the
adjustment parameters. In the output below,
we replay the previous results.
• vec missouri indiana kentucky illinois,
trend(rconstant) rank(2) lags(4)
bconstraints(1/5)
Results for D_missouri

    D_missouri |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    _ce1   L1. | -.0683152   .0185763    -3.68   0.000    -.1047242   -.0319063
    _ce2   L1. |  .0405613   .0112417     3.61   0.000      .018528    .0625946
    missouri
           LD. |  .2391442   .0597768     4.00   0.000     .1219839    .3563045
          L2D. |  .0463021    .061306     0.76   0.450    -.0738555    .1664596
          L3D. |  .1996762   .0604606     3.30   0.001     .0811755    .3181768
    indiana
           LD. |   .000313   .0526959     0.01   0.995     -.102969    .1035951
          L2D. | -.0071074   .0530023    -0.13   0.893      -.11099    .0967752
          L3D. |   .024743   .0536092     0.46   0.644    -.0803291    .1298151
    kentucky
           LD. |  .0169935   .0475225     0.36   0.721    -.0761489    .1101359
          L2D. |  .0611493   .0473822     1.29   0.197    -.0317182    .1540168
          L3D. |  .0212794   .0470264     0.45   0.651    -.0708907    .1134495
    illinois
           LD. |   .050437   .0491142     1.03   0.304    -.0458251    .1466992
          L2D. |  .0086696   .0493593     0.18   0.861    -.0880728    .1054119
          L3D. | -.0323928   .0490934    -0.66   0.509    -.1286141    .0638285
Interpretation

    D_missouri |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    _ce1   L1. | -.0683152   .0185763    -3.68   0.000    -.1047242   -.0319063
    _ce2   L1. |  .0405613   .0112417     3.61   0.000      .018528    .0625946
If the error term in the first cointegrating relation is positive, unemployment in Missouri FALLS.
If the error term in the second cointegrating regression is positive, then unemployment in Missouri INCREASES.
The first cointegrating regression is: Missouri + 0.252Kentucky − 1.037Illinois − 0.389 = Error
• Viewed in this context, if the error term is positive then unemployment in Missouri can be viewed as being above equilibrium, and the same for Kentucky; but for Illinois it is below equilibrium (because if we increase Illinois the error term falls).
• To get back to equilibrium we need
unemployment to fall in Missouri.
    D_missouri |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    _ce1   L1. | -.0683152   .0185763    -3.68   0.000    -.1047242   -.0319063
• As we can see from the regression this is what
we get.
• D_missouri is the change in unemployment in Missouri, i.e. ΔUMt = UMt − UMt-1
• The coefficient on _ce1 L1 (_ce1 : the error
term from the first cointegrating regression;
L1 lagged one period) is -0.068 and significant
at the 1% level.
• Thus if in period t-1 the error term in _ce1 was positive, which can be seen as unemployment in Missouri being too high compared to the equilibrium relationship with the other two states, then it will fall.
• The bigger the (negative) coefficient on _ce1
L1 the more rapid is the correction. If it = -1
then the entire error is corrected for in the
following period.
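The speed-of-adjustment point can be illustrated with a toy calculation. Only the two adjustment coefficients (−0.068 as estimated, and −1) come from the discussion above; the unit starting deviation is made up:

```python
# Sketch: how the coefficient on the lagged error term governs how fast
# a deviation from equilibrium dies out. Starting error of 1 is arbitrary.

def adjust(alpha, error0=1.0, periods=5):
    """Each period the error changes by alpha * (last period's error)."""
    error = error0
    path = [error]
    for _ in range(periods):
        error += alpha * error
        path.append(error)
    return path

slow = adjust(-0.068)  # estimated coefficient on L1._ce1 for D_missouri
full = adjust(-1.0)    # alpha = -1: the whole error corrected next period

print([round(e, 3) for e in slow])  # decays by 6.8% each period
print([round(e, 3) for e in full])  # drops to zero immediately
```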
Let us look at the second cointegrating
regression
• This can be written as:
• Error = Indiana − 1.314Illinois + 2.937
    _ce2       |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      missouri | (omitted)
       indiana |         1          .        .       .            .           .
      kentucky | (omitted)
      illinois | -1.314265   .0907071   -14.49   0.000    -1.492048   -1.136483
         _cons |  2.937016   .6448924     4.55   0.000      1.67305    4.200982
And its impact in the VECM (Vector Error Correction Model)

    D_missouri |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    _ce2   L1. |  .0405613   .0112417     3.61   0.000      .018528    .0625946

• We can see it is positive and significant. Unemployment in Missouri increases if this error term is positive. But why? Missouri does not enter the second cointegrating vector, so why does unemployment in Missouri respond to it?
Indiana = 1.314Illinois − 2.937 + Error
• Well, it's a little convoluted, but if the error term is positive it suggests that unemployment in Illinois is below equilibrium (and may increase as a consequence). Now from the first cointegrating vector:
• Missouri = −0.252Kentucky + 1.037Illinois + 0.389 (in equilibrium): if Illinois unemployment is to increase, then the error term in the first cointegrating vector will fall (perhaps going negative).
Let us look at the second equation, for Indiana

    D_indiana  |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    _ce1   L1. | -.0342096   .0220955    -1.55   0.122    -.0775159    .0090967
    _ce2   L1. |  .0325804   .0133713     2.44   0.015     .0063732    .0587877

• The error term from _ce1 is not significant, but that from _ce2 is, and it is positive. _ce2 is:
• Error = Indiana − 1.314Illinois + 2.937
• Now this does not make much sense: if the error term is positive, unemployment in Indiana needs to fall to restore equilibrium. Yet the coefficient on it is positive, indicating the opposite.
Another View
vec missouri indiana kentucky illinois,
trend(rconstant) rank(2) lags(4) bconstraints(1/5)
matrix cerr=e(beta)
display cerr[1,1]
display cerr[1,3]
display cerr[1,5]
display cerr[1,9]
drop cerr1 cerr2
matrix cerr=e(beta) saves the coefficients from the two cointegrating regressions in a row vector cerr.
cerr[1,1] is the first coefficient; cerr[1,9] is the penultimate coefficient in the second equation.
Thus display cerr[1,9] gives -1.3142654, the coefficient on Illinois in _ce2.
Generate the error terms for the two
equations
generate cerr1= cerr[1,5]+ cerr[1,1]*missouri +
cerr[1,2]*indiana + cerr[1,3]*kentucky +
cerr[1,4]*illinois
generate cerr2= cerr[1,10]+ cerr[1,6]*missouri +
cerr[1,7]*indiana + cerr[1,8]*kentucky +
cerr[1,9]*illinois
Now this:
regress D.missouri LD.missouri LD.indiana
LD.kentucky LD.illinois L2D.missouri L2D.indiana
L2D.kentucky L2D.illinois L3D.missouri
L3D.indiana L3D.kentucky L3D.illinois L.cerr1
L.cerr2
is almost equivalent to this:
vec missouri indiana kentucky illinois,
trend(rconstant) rank(2) lags(4) bconstraints(1/5)
• I say almost because vec estimates the equations jointly, so the two sets of estimates differ, but only very slightly.
• Note too that if we specify a slightly different short-run structure then the cointegrating vectors change, which is a little unsatisfactory.
• For example compare:
vec missouri indiana kentucky illinois,
trend(rconstant) rank(2) lags(2)
bconstraints(1/5)
vec missouri indiana kentucky illinois,
trend(rconstant) rank(2) lags(4)
bconstraints(1/5)
Short Run dynamics
• Let's look at the rest of the equation; below is the part for D.missouri.
• The one-period lag is significant, as is the three-period lag. That is, it responds to its own lagged values.
    missouri |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
         LD. |  .2391442   .0597768     4.00   0.000     .1219839    .3563045
        L2D. |  .0463021    .061306     0.76   0.450    -.0738555    .1664596
        L3D. |  .1996762   .0604606     3.30   0.001     .0811755    .3181768
• But not to those of Indiana.
    indiana  |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
         LD. |   .000313   .0526959     0.01   0.995     -.102969    .1035951
        L2D. | -.0071074   .0530023    -0.13   0.893      -.11099    .0967752
        L3D. |   .024743   .0536092     0.46   0.644    -.0803291    .1298151
This has been based on an example in the Stata manual, but…..
• There are more variables. Let's try the regression in full:
• vec missouri indiana kentucky illinois arkansas ten, trend(rconstant) rank(2) lags(3)
Tennessee appears related to nothing,
so..
              beta |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    _ce1
          missouri |         1          .        .       .            .           .
           indiana | (dropped)
          kentucky |  1.399639   .4071156     3.44   0.001     .6017075    2.197571
          illinois | -.5534946   .3499487    -1.58   0.114    -1.239382    .1323923
          arkansas | -1.463609   .3881522    -3.77   0.000    -2.224373   -.7028443
              tenn | -.2175613   .3220395    -0.68   0.499    -.8487471    .4136245
             _cons | -.2406195   .9257524    -0.26   0.795    -2.055061    1.573822
    _ce2
          missouri | -1.11e-16          .        .       .            .           .
           indiana |         1          .        .       .            .           .
          kentucky |  .6949351   .3402238     2.04   0.041     .0281088    1.361762
          illinois | -1.304683   .2924498    -4.46   0.000    -1.877874   -.7314915
          arkansas |  -.522436   .3243762    -1.61   0.107    -1.158202    .1133296
              tenn | -.1928471   .2691262    -0.72   0.474    -.7203248    .3346307
             _cons |  2.832045   .7736451     3.66   0.000     1.315728    4.348361
vec missouri indiana kentucky illinois arkansas
ten, trend(rconstant) rank(3) lags(3)
• But Tennessee still remains unrelated to anything.
• We can see from the map that Tennessee is on the southeast fringe of this group, and it would be interesting to bring in North Carolina, Alabama, and Georgia.
Johansen’s methodology
• vecrank implements three types of methods for
determining r, the number of cointegrating
equations in a VECM. The first is Johansen’s
“trace” statistic method. The second is his
“maximum eigenvalue” statistic method. The
third method chooses r to minimize an
information criterion.
• All three methods are based on Johansen’s
maximum likelihood (ML) estimator of the
parameters of a cointegrating VECM.
• webuse balance2
• We have quarterly data on the natural logs of
aggregate consumption, investment, and GDP
in the United States from the first quarter of 1959
through the fourth quarter of 1982. As discussed
in King et al. (1991), the balanced-growth
hypothesis in economics implies that we would
expect to find two cointegrating equations among
these three variables.
describe

    variable name   storage type   display format   value label   variable label
    gdp             float          %9.0g
    t               int            %tq
    inv             float          %9.0g
    consump         float          %9.0g
    y               double         %10.0g                         ln(gdp)
    i               double         %10.0g                         ln(investment)
    c               double         %10.0g                         ln(consumption)
• In this example, because the trace statistic at r = 0 of 46.1492
exceeds its critical value of 29.68, we reject the null hypothesis of
no cointegrating equations.
• Similarly, because the trace statistic at r = 1 of 17.581 exceeds its
critical value of 15.41, we reject the null hypothesis that there is
one or fewer cointegrating equation.
• In contrast, because the trace statistic at r = 2 of 3.3465 is less than
its critical value of 3.76, we cannot reject the null hypothesis that
there are two or fewer cointegrating equations.
    . vecrank y i c, lags(5)

    Johansen tests for cointegration
    Trend: constant                              Number of obs =      91
    Sample: 1960q2 - 1982q4                      Lags          =       5

                                                          5%
    maximum                                   trace    critical
       rank   parms        LL    eigenvalue statistic    value
          0      39  1231.1041        .       46.1492    29.68
          1      44  1245.3882    0.26943     17.5810    15.41
          2      47  1252.5055    0.14480      3.3465*    3.76
          3      48  1254.1787    0.03611
• Because Johansen’s method for estimating r is
to accept as the actual r the first r for which
the null hypothesis is not rejected, we accept r
= 2 as our estimate of the number of
cointegrating equations between these three
variables.
• The “*” by the trace statistic at r = 2 indicates
that this is the value of r selected by
Johansen’s multiple-trace test procedure
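For reference, the trace statistic in this table is computed from the estimated eigenvalues as (standard formula):

```latex
\lambda_{\mathrm{trace}}(r) = -T \sum_{i=r+1}^{K} \ln\bigl(1 - \hat{\lambda}_i\bigr)
```

where T is the number of observations (91 here) and the λ̂i are the eigenvalues in the output, in descending order; the null hypothesis is that there are at most r cointegrating equations. For example, at r = 2 this gives −91 × ln(1 − 0.03611) ≈ 3.35, matching the table.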
In the previous example, we used the default 5% critical values. We can estimate r with 1% critical values instead by specifying the level99 option.

    . vecrank y i c, lags(5) level99

    Johansen tests for cointegration
    Trend: constant                              Number of obs =      91
    Sample: 1960q2 - 1982q4                      Lags          =       5

                                                          1%
    maximum                                   trace    critical
       rank   parms        LL    eigenvalue statistic    value
          0      39  1231.1041        .       46.1492    35.65
          1      44  1245.3882    0.26943     17.5810*   20.04
          2      47  1252.5055    0.14480      3.3465     6.65
          3      48  1254.1787    0.03611
• The output indicates that switching from the 5% to
the 1% level changes the resulting estimate from r =
2 to r = 1.
The maximum eigenvalue
statistic
A second test assumes a given r under the null hypothesis and tests this against the alternative that there are r + 1 cointegrating equations. Johansen (1995, chap. 6, 11, and 12) derives an LR test of the null of r cointegrating relations against the alternative of r + 1 cointegrating relations.
• This method is used less often than the trace statistic method, but often both test statistics are reported.
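In the same notation as the trace statistic, the maximum-eigenvalue statistic is:

```latex
\lambda_{\max}(r) = -T \ln\bigl(1 - \hat{\lambda}_{r+1}\bigr)
```

so it uses only the (r + 1)-th largest eigenvalue, testing the null of exactly r cointegrating equations against the alternative of r + 1.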
vecrank y i c, lags(5) max levela
• The levela option obtains both the 5% and 1% critical values.
    Johansen tests for cointegration
    Trend: constant                              Number of obs =      91
    Sample: 1960q2 - 1982q4                      Lags          =       5

    maximum                                   trace    5% critical  1% critical
       rank   parms        LL    eigenvalue statistic     value        value
          0      39  1231.1041        .       46.1492     29.68        35.65
          1      44  1245.3882    0.26943     17.5810*1   15.41        20.04
          2      47  1252.5055    0.14480      3.3465*5    3.76         6.65
          3      48  1254.1787    0.03611

    maximum                                    max     5% critical  1% critical
       rank   parms        LL    eigenvalue statistic     value        value
          0      39  1231.1041        .       28.5682     20.97        25.52
          1      44  1245.3882    0.26943     14.2346     14.07        18.63
          2      47  1252.5055    0.14480      3.3465      3.76         6.65
          3      48  1254.1787    0.03611
The test statistics are often referred to as lambda trace
and lambda max respectively
• We print out both tests in this table; the maximum-eigenvalue results are in the second half of the table.
• That test is of r versus r + 1 cointegrating vectors.
• The trace results are as before: the statistics at r = 0 (46.1492) and r = 1 (17.581) exceed their 5% critical values, so we reject those nulls, while at r = 2 (3.3465 < 3.76) we cannot reject.
• The maximum-eigenvalue statistics point the same way: 28.5682 and 14.2346 exceed their 5% critical values (20.97 and 14.07), while 3.3465 does not (3.76).
• The net result is we conclude there are 2 cointegrating vectors.
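As a check, both statistics can be recomputed directly from the eigenvalues in the vecrank output (T = 91). This is a sketch using the standard formulas, not Stata code:

```python
import math

T = 91
eigenvalues = [0.26943, 0.14480, 0.03611]  # from the vecrank output above

def trace_stat(r):
    """lambda_trace(r) = -T * sum over i > r of ln(1 - lambda_i)."""
    return -T * sum(math.log(1 - lam) for lam in eigenvalues[r:])

def max_stat(r):
    """lambda_max(r) = -T * ln(1 - lambda_{r+1})."""
    return -T * math.log(1 - eigenvalues[r])

print(round(trace_stat(0), 2))  # cf. 46.1492 in the table
print(round(max_stat(0), 2))    # cf. 28.5682 in the table
print(round(trace_stat(2), 2))  # cf. 3.3465: below the 5% critical value 3.76
```

The small discrepancies in the last digits come from the eigenvalues being reported to only five decimal places.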
Stationarity
• Intuitively a variable is stationary (I(0), integrated of order zero) if its characteristics do not change over time: its mean, variance, and covariances are unchanging.
• Another way of looking at it is that ρ < 1 in the following equation for a variable Y:
• Yt = ρYt-1 + εt
• We do not estimate the above, but subtract Yt-1 from both sides:
• Yt − Yt-1 = ρYt-1 − Yt-1 + εt = (ρ − 1)Yt-1 + εt
• In this regression we test whether (ρ − 1) is significantly negative, which implies ρ < 1. If we cannot reject (ρ − 1) = 0, we say Y is I(1): it has a unit root. Running ordinary time-series OLS on variables that are I(1) can give biased, spurious results; the Johansen method is a suitable alternative.
• The above is the Dickey–Fuller (DF) test; add lagged values of Yt − Yt-1 to get rid of serial correlation and we have the augmented Dickey–Fuller test.
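Written out as a regression (standard notation), the augmented Dickey–Fuller test with k lags and a trend estimates:

```latex
\Delta Y_t = \alpha + \beta t + (\rho - 1) Y_{t-1} + \sum_{i=1}^{k} \gamma_i \, \Delta Y_{t-i} + \epsilon_t
```

and tests whether the coefficient on Yt-1 is significantly negative, using Dickey–Fuller critical values rather than the standard t tables (with k = 0 and no trend, this is the plain DF test).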
Stationarity
• Among the earliest tests proposed is the one
by Dickey and Fuller (1979), though most
researchers now use an improved variant
called the augmented Dickey–Fuller test
instead of the original version.
• Other common unit-root tests implemented in
Stata include the DF–GLS test of Elliott,
Rothenberg, and Stock (1996) and the
Phillips–Perron (1988) test.
webuse air2
dfuller air
The test statistic is less negative than any of the critical values, and hence we cannot reject the null hypothesis that the variable has a unit root and is thus not stationary.
    Dickey-Fuller test for unit root                  Number of obs =   143

                     ---------- Interpolated Dickey-Fuller ----------
               Test      1% Critical    5% Critical    10% Critical
             Statistic      Value          Value           Value
    Z(t)      -1.748        -3.496         -2.887          -2.577

    MacKinnon approximate p-value for Z(t) = 0.4065
dfuller air, lags(3) trend
• This is a similar regression, but includes 3 lagged difference terms and a trend term. The null of a unit root is now rejected. What has made the difference?
    Augmented Dickey-Fuller test for unit root        Number of obs =   140

               Test      1% Critical    5% Critical    10% Critical
             Statistic      Value          Value           Value
    Z(t)      -6.936        -4.027         -3.445          -3.145

    MacKinnon approximate p-value for Z(t) = 0.0000
The inclusion of the trend term

    . dfuller air, trend

    Dickey-Fuller test for unit root                  Number of obs =   143

               Test      1% Critical    5% Critical    10% Critical
             Statistic      Value          Value           Value
    Z(t)      -4.639        -4.026         -3.444          -3.144

    MacKinnon approximate p-value for Z(t) = 0.0009
    . dfuller air, lags(3)

    Augmented Dickey-Fuller test for unit root        Number of obs =   140

               Test      1% Critical    5% Critical    10% Critical
             Statistic      Value          Value           Value
    Z(t)      -1.536        -3.497         -2.887          -2.577

    MacKinnon approximate p-value for Z(t) = 0.5158
dfuller air, lags(3) trend regress

    Augmented Dickey-Fuller test for unit root        Number of obs =   140

               Test      1% Critical    5% Critical    10% Critical
             Statistic      Value          Value           Value
    Z(t)      -6.936        -4.027         -3.445          -3.145

    MacKinnon approximate p-value for Z(t) = 0.0000

         D.air |     Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    air
           L1. | -.5217089   .0752195    -6.94   0.000      -.67048   -.3729379
           LD. |  .5572871   .0799894     6.97   0.000      .399082    .7154923
          L2D. |   .095912   .0876692     1.09   0.276    -.0774825    .2693065
          L3D. |    .14511   .0879922     1.65   0.101    -.0289232    .3191433
        _trend |  1.407534   .2098378     6.71   0.000     .9925118    1.822557
         _cons |  44.49164    7.78335     5.72   0.000     29.09753    59.88575

The test statistic is the t statistic on the coefficient on air(t-1) = L1.air; the LD., L2D., and L3D. rows are the lagged values of D.air.
• The regression basically regresses the change in the variable (D.air) on lagged changes and the lagged value of air, plus a constant and a time trend.
• The inclusion of the lagged D.air values makes this the augmented Dickey–Fuller test, i.e. it is what differentiates it from the Dickey–Fuller test.
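The same mechanics can be sketched in pure Python on simulated data (a toy illustration, not the air series): we generate a stationary AR(1) process with ρ = 0.5 and run the simple Dickey–Fuller regression of the change on the lagged level, with no constant or trend. The estimate of ρ − 1 should come out near −0.5:

```python
import random

random.seed(1)

# Simulate a stationary AR(1): y_t = 0.5 * y_{t-1} + e_t
y = [0.0]
for _ in range(500):
    y.append(0.5 * y[-1] + random.gauss(0, 1))

# Dickey-Fuller regression (no constant, no trend):
# D.y_t = (rho - 1) * y_{t-1} + error
lagged = y[:-1]
dy = [y[t] - y[t - 1] for t in range(1, len(y))]
slope = sum(x * d for x, d in zip(lagged, dy)) / sum(x * x for x in lagged)

print(round(slope, 2))  # close to 0.5 - 1 = -0.5 for this stationary series
```

For a true unit-root series (ρ = 1) the estimated coefficient would instead hover near zero, which is why the test looks for a significantly negative value.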
pperron air
• Phillips and Perron’s test statistics can be viewed as Dickey–
Fuller statistics that have been made robust to serial
correlation by using the Newey–West (1987)
heteroskedasticity- and autocorrelation-consistent covariance
matrix estimator.
    Phillips-Perron test for unit root                Number of obs   =   143
                                                      Newey-West lags =     4

                 Test      1% Critical    5% Critical    10% Critical
               Statistic      Value          Value           Value
    Z(rho)     -6.564        -19.943        -13.786        -11.057
    Z(t)       -1.844         -3.496         -2.887         -2.577

    MacKinnon approximate p-value for Z(t) = 0.3588
• Z(t) is the statistic comparable to the (augmented) Dickey–Fuller test statistic; here it is less negative than the critical values, so we again cannot reject the null of a unit root.
DFGLS Test
webuse lutkepohl2
dfgls dln_inv
• dfgls tests for a unit root in a time series. It performs the modified
Dickey–Fuller t test (known as the DF-GLS test) proposed by Elliott,
Rothenberg, and Stock (1996). Essentially, the test is an augmented
Dickey–Fuller test, similar to the test performed by Stata’s dfuller
command, except that the time series is transformed via a generalized
least squares (GLS) regression before performing the test.
• Elliott, Rothenberg, and Stock and later studies have shown that this test
has significantly greater power than the previous versions of the
augmented Dickey–Fuller test.