Unit Roots, Structural Breaks and
Cointegration Analysis:
A Review of the Available Processes
and Procedures and an Application
(Presented at the Macroeconomics and Financial Economics Workshop:
Recent Developments in Theory and Empirical Modeling, held
from October 8 to October 9 at Eastern Mediterranean University)
Asst. Prof. Dr. Mete Feridun
Department of Banking and Finance
Faculty of Business and Economics
Eastern Mediterranean University
OUTLINE

Testing for stationarity
Testing for structural breaks
Dealing with structural breaks in cointegration analysis
Tests for parameter stability
Time-series Econometrics



Prior to any time-series econometric analysis, it is
necessary to investigate the stationarity properties of
the variables.
A stationary series fluctuates around a constant long-run mean, and this implies that the series has a finite
variance which does not depend on time.
On the other hand, non-stationary series have no
tendency to return to a long-run deterministic path
and the variances of the series are time-dependent.



Non-stationary series suffer permanent effects from
random shocks and thus the series follow a random
walk.
The problems caused by non-stationary variables in
standard regression analysis have been well
documented in the time-series literature.
The standard classical estimation methods are based
on the assumption that the means and variances of
the stochastic series are finite, constant and time-invariant.




Given that economic time series are typically described as non-stationary processes, estimates involving such
variables will lead to spurious regression and their economic interpretation will not be
meaningful.
A spurious regression occurs when a pair of independent series with strong temporal properties
appears to be related according to standard inference in an OLS regression.
If the unit root tests find that a series contains one unit root, the
appropriate route in this case is to transform the data by differencing
the variables prior to their inclusion in the regression model, but this
incurs a loss of important long-run information.
Alternatively, if the variables are cointegrated, that is, if a long-run
relationship exists among the set of variables that share similar non-stationarity properties, regression involving the levels of the variables
can proceed without generating spurious results.



In this case, a “balanced” regression leads to
meaningful interpretations and evades the spurious
regression problem.
In binary choice models too, the estimation methods are based on the
assumption that the means and variances of the variables being tested
are constant over time, i.e. stationary.
Incorporating non-stationary, or unit root, variables when estimating
the regression equations using these models gives misleading inferences.
Unit Root Tests



Traditionally, Augmented Dickey–Fuller (ADF) and
Phillips–Perron (PP) tests are used to assess the
order of integration of the variables.
Consistent outcomes from both tests are needed before reaching a final
conclusion about the stationarity properties of each series.
Usually, each variable is tested with an intercept, with an intercept
and a linear trend, or with neither.
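
As an illustration, the following is a minimal sketch of how the ADF and PP tests might be run in Python, assuming the statsmodels and arch packages are available; the data file and series name are hypothetical.

import pandas as pd
from statsmodels.tsa.stattools import adfuller
from arch.unitroot import PhillipsPerron

# Hypothetical monthly series; replace with the variable under study.
y = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)["gdp"]

# ADF test with an intercept ('c'); use regression='ct' for intercept plus linear trend.
adf_stat, adf_p, used_lag, nobs, crit, _ = adfuller(y, regression="c", autolag="AIC")
print(f"ADF statistic: {adf_stat:.3f}, p-value: {adf_p:.3f}, lags used: {used_lag}")

# Phillips-Perron test with an intercept; trend='ct' adds a linear trend.
pp = PhillipsPerron(y, trend="c")
print(pp.summary())

In both tests the null hypothesis is a unit root, so rejection at the chosen significance level indicates stationarity.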
Structural Breaks



A well-known weakness of the ADF and PP unit root
tests is that they may mistake a structural break in a
series for evidence of non-stationarity.
In other words, they may fail to reject the unit root
hypothesis if the series has a structural break.
That is, a series found to be I(1) may in fact be
stationary around its structural break(s), i.e. I(0), but
be erroneously classified as I(1).



Perron (1989) shows that failure to allow for an
existing break leads to a bias that reduces the ability
to reject a false unit root null hypothesis.
To overcome this, the author proposes allowing for a
known or exogenous structural break in the
Augmented Dickey-Fuller (ADF) tests.
Following this development, many authors, including
Zivot and Andrews (1992) and Perron (1997),
proposed determining the break point ‘endogenously’
from the data.
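
As a sketch, the Zivot-Andrews endogenous-break test can be run with the implementation shipped in recent statsmodels releases; the data file and series name are hypothetical.

import pandas as pd
from statsmodels.tsa.stattools import zivot_andrews

y = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)["gdp"]  # hypothetical data

# regression='c' allows a break in the intercept, 't' in the trend, 'ct' in both.
za_stat, za_p, za_crit, base_lag, bp_idx = zivot_andrews(y, regression="c", autolag="AIC")
print(f"Zivot-Andrews statistic: {za_stat:.3f}, p-value: {za_p:.3f}")
print(f"Estimated break at observation index {bp_idx}")

The null hypothesis is a unit root without a break; rejection suggests the series is stationary around a single endogenously determined break.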




Enders (2004) argues that Perron-Vogelsang (1992) unit root tests
are more appropriate “if the date of the break is uncertain”.
Shrestha and Chowdhury (2005) argue that, in the case of a
structural break, the testing power of the Perron-Vogelsang unit root
test is superior to that of the Zivot-Andrews test.
Applying unit root tests that allow for the possible presence of
a structural break avoids obtaining a test result that is
biased towards non-rejection, as suspected by Perron (1989).
Also, since this procedure can identify the date of the structural
break, it facilitates the analysis of whether a structural break in a
certain variable is associated with a particular event such as a
change in government policy, a currency crisis, a war and so forth.
Multiple Structural Breaks


The Zivot-Andrews and Perron-Vogelsang
(1992) unit root tests allow for one structural
break, whereas the Clemente-Montanes-Reyes (1998)
unit root test allows for two structural
breaks in the mean of the series.
Clemente et al (1998) base their approach on
Perron and Vogelsang (1992), allowing for the
possibility of two structural breaks in
the mean of the series.



In these tests, the null hypothesis is that the series has a
unit root with structural break(s) against the alternative
hypothesis that they are stationary with break(s).
The advantage of these tests is that they do not require
an a priori knowledge of the structural break dates.
Ben-David et al (2003) caution that “just as failure to
allow one break can cause non-rejection of the unit root
null by the Augmented Dickey-Fuller test, failure to allow
for two breaks, if they exist, can cause non-rejection of
the unit root null by the tests which only incorporate one
break” (Ben-David et al, 2003: 304).


Lumsdaine and Papell (1997) extended the
Zivot and Andrews (1992) model to
accommodate two structural breaks.
However, this test was criticized for the
absence of breaks under the null hypothesis
of a unit root, as this could result in a
tendency for the test to suggest
evidence of stationarity with breaks (see
Glynn et al, 2007).




Hence, the Perron-Vogelsang and Clemente-Montanes-Reyes unit root tests are preferable.
Both of these tests offer two models:
(1) an additive outliers (AO) model, which captures a
sudden change in the mean of a series; and
(2) an innovational outliers (IO) model, which allows for
a gradual shift in the mean of the series.


According to Baum (2004), if the estimates of the
Perron-Vogelsang and Clemente-Montanes-Reyes unit
root tests provide evidence of significant additive or
innovational outliers in the time series, the results
derived from ADF and PP tests are doubtful, as this is
evidence that the model excluding structural breaks
is misspecified.
Therefore, in applying unit root tests in time series
that exhibit structural breaks, only the results from
the Clemente-Montanes-Reyes unit root tests should
be considered if the two structural breaks indicated
by the respective tests are statistically significant (at
the 5% level as used by STATA).


On the other hand, if the results of the
Perron-Vogelsang and Clemente-Montanes-Reyes
unit root tests show no evidence of
two significant breaks in the series, the
results from the Perron-Vogelsang unit root
tests are considered.
If these tests show no evidence of a
structural break, the ADF and PP tests can be
considered.
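
The decision sequence above can be summarised in a short sketch. The one- and two-break tests are not available in the common Python packages (they exist as user-written Stata routines), so the helper functions below are hypothetical placeholders, assumed to return the break date(s) together with a flag for their significance at the 5% level.

def choose_unit_root_result(y):
    # Hypothetical two-break Clemente-Montanes-Reyes test.
    cmr = clemente_montanes_reyes(y)
    if cmr.breaks_significant:                 # both breaks significant at the 5% level
        return "Clemente-Montanes-Reyes", cmr
    # Hypothetical one-break Perron-Vogelsang test.
    pv = perron_vogelsang(y)
    if pv.break_significant:                   # single break significant at the 5% level
        return "Perron-Vogelsang", pv
    # No significant break: fall back to the conventional ADF and PP results.
    return "ADF/PP", run_adf_and_pp(y)         # hypothetical wrapper around the standard tests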
Cointegration Analysis



If it is certain that the underlying series are all I(1), the
conventional Johansen cointegration technique can be safely
used.
However, in the case where the presence of structural breaks
introduces uncertainty as to the true order of integration of the
variables, the autoregressive distributed lag (ARDL) bounds
testing procedure introduced by Pesaran and Pesaran (1997),
Pesaran and Shin (1999), and Pesaran et al (2001) should be
preferred.
The advantage of this methodology is that it yields valid results
regardless of whether the underlying variables are I(1) or I(0),
or a combination of both.
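
When all series can be treated as I(1), a minimal Johansen test sketch in Python might look as follows, assuming statsmodels is available; the data file and column names are hypothetical.

import pandas as pd
from statsmodels.tsa.vector_ar.vecm import coint_johansen

data = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)[["gdp", "cpi", "m2"]]  # hypothetical

# det_order=0: constant term; k_ar_diff: number of lagged differences in the VECM.
result = coint_johansen(data, det_order=0, k_ar_diff=2)

# Trace statistics for H0: cointegration rank <= r, against the 90/95/99% critical values.
for r, (stat, cv) in enumerate(zip(result.lr1, result.cvt)):
    print(f"rank <= {r}: trace statistic {stat:.2f}, 95% critical value {cv[1]:.2f}")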



Conventionally, in the case where the unit root tests reveal that a
series contains one unit root, the appropriate method is to transform
the data by differencing the variables prior to their inclusion in the
regression model to avoid the risk of spurious regression.
Nonetheless, this incurs a loss of important long-run information.
Alternatively, if the variables are cointegrated, that is, if a long-run
relationship exists among the set of variables that share similar non-stationarity properties, regression involving the levels of the
variables can proceed without generating spurious results.
In this case, a “balanced” regression leads to meaningful
interpretations and evades the spurious regression problem.

Cointegration vectors are of considerable interest since they determine I(0)
relations that hold between variables which are individually non-stationary.




Essentially, variables are cointegrated when a long-run linear relationship is
obtained from a set of variables that share the same non-stationary
properties.
Hence, the intuition behind cointegration is that it allows capturing the
equilibrium relationships dictated by economic theory between
non-stationary variables within a stationary model.
A search is made for a linear combination of such variables such that the
combination is stationary. If such a stationary combination exists, then the
variables are said to be cointegrated, meaning even though they are
individually not stationary, they are bound by an equilibrium relationship.
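
As a sketch, a stationary linear combination of two I(1) series can be checked with the Engle-Granger residual-based test in statsmodels; the data file and variable names are hypothetical.

import pandas as pd
from statsmodels.tsa.stattools import coint

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical data

# Null hypothesis: no cointegration, i.e. the residual from regressing one series on the other has a unit root.
t_stat, p_value, crit_values = coint(df["gdp"], df["m2"], trend="c")
print(f"Engle-Granger t-statistic: {t_stat:.3f}, p-value: {p_value:.3f}")
if p_value < 0.05:
    print("Reject the null: the two series appear to be cointegrated.")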




In this case, the application of traditional econometric modelling
to non-stationary time series data generates meaningful results.
An advantage of the cointegration approach is that it provides a
direct test of the economic theory and enables utilization of the
estimated long-run parameters in the estimation of the short-run disequilibrium relationships.
Although Engle and Granger’s (1987) original definition of
cointegration refers to variables that are integrated of the same
order, Enders (2004) argues that:
“It is possible to find equilibrium relationships among groups of
variables that are integrated of different orders”. Asteriou
and Hall (2007) also explain that in cases where a mix of I(0)
and I(1) variables is present in the model, cointegrating
relationships might exist.


Similarly, Lütkepohl (2004) explains:
“Occasionally it is convenient to consider
systems with both I(1) and I(0) variables.
Thereby the concept of cointegration is
extended by calling any linear combination
that is I(0) a cointegration relation, although
this terminology is not in the spirit of the
original definition because it can happen that
a linear combination of I(0) variables is called
a cointegration relation”



Therefore, even in the presence of a set of variables which contains
both I(1) and I(0) variables, cointegration analysis is applicable and
the presence of a long-run linear combination denotes the existence
of cointegrated variables.
Hence, it is possible to find long-run equilibrium relationships among
a set of I(0) and I(1) variables if their linear combination reveals a
cointegrating relationship. In the multivariate case, it is possible to
have series with different orders of integration.
In this case, a subset of the higher-order series must cointegrate down
to the order of the lower-order series. A long-run relationship among
the variables is achieved if the low-frequency, or stochastic trend,
components of the set of variables offset each other
to yield a stationary linear combination of the variables.
Autoregressive Distributed Lag
(ARDL) Bounds Tests


The choice of the ARDL bounds testing procedure as a tool for
investigating the existence of a long-run relationship is based on
the following considerations:
First and foremost, both the dependent and the independent
variables can be introduced in the model with lags.
“Autoregressive” refers to lags in the dependent variable.

Therefore, the past values of a variable are allowed to
determine its present value.

On the other hand, “distributed lag” refers to the lags of the
explanatory variables.



This is a highly plausible feature:
Conceptually, the dependence of the
dependent variable on the independent
variables may or may not be instantaneous
depending on the theoretical considerations.
A change in one economic variable may not
necessarily lead to an immediate change in
another variable.




The reaction to a change in each variable may be different depending on
various factors.
Hence, in some cases, they may respond to the economic developments
with a lag and there is usually no reason to assume that all regressors
should have the same lags.
Hence, the ARDL bounds testing approach is appropriate as it allows flexibility
in the structure of the lags of the regressors in the ARDL model, as
opposed to cointegrating VAR models, where different lags for different
variables are not permitted (Pesaran et al, 2001).
It goes without saying that the correct choice of the order of the ARDL
model is very important in the long-run analysis. In this respect, the ARDL
approach has the advantage that it takes a sufficient number of lags to
capture the data-generating process in a general-to-specific modelling
framework (see the sketch below).
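
As a sketch, the flexible lag structure can be illustrated with the ARDL implementation in recent statsmodels releases; the data, variable names and maximum lag lengths are hypothetical.

import pandas as pd
from statsmodels.tsa.ardl import ARDL, ardl_select_order

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical data
y = df["gdp"]
X = df[["cpi", "m2"]]

# General-to-specific: search up to 4 lags of y and of each regressor, choosing by AIC.
sel = ardl_select_order(y, maxlag=4, exog=X, maxorder=4, trend="c", ic="aic")
res = sel.model.fit()
print(sel.model.ardl_order)   # selected lags; each regressor is allowed its own lag length
print(res.summary())

# Alternatively, impose a specific lag structure directly, e.g. ARDL(2; 1, 3):
res2 = ARDL(y, lags=2, exog=X, order={"cpi": 1, "m2": 3}, trend="c").fit()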



Furthermore, the lag orders can be selected based on four
different selection criteria taking into consideration the results of
the diagnostic tests for residual serial correlation, functional
form misspecification, non-normality, and heteroscedasticity.
Also, as shown by Pesaran et al (2001), the ARDL models yield
consistent estimates of the long-run coefficients that are
asymptotically normal irrespective of whether the underlying
regressors are purely I(0), I(1) or mutually cointegrated.
In other words, this procedure allows making inferences in the
absence of any a priori information about the order of
integration of the series under investigation.




As demonstrated by Pesaran and Shin (1999), the small sample
properties of the bounds testing approach are superior to those
of the traditional Johansen cointegration approach, which
typically requires a large sample size for the results to be valid.
Since the ARDL approach draws on the unrestricted error
correction model, it is likely to have better statistical properties
than the traditional cointegration techniques.
In particular, Pesaran and Shin (1999) show that the ARDL
approach has better properties in sample sizes up to 150
observations.
On the other hand, Narayan and Smyth (2005) provide exact
critical values for up to 80 observations.



The ARDL approach is particularly applicable given the
disequilibrium nature of time series data stemming from
the presence of possible structural breaks, as happens
with most economic variables.
With the ARDL approach, it can be conveniently tested whether
the underlying structural breaks have affected the long-run
stability of the estimated coefficients.
As suggested by Pesaran (1997), the cumulative sum of
recursive residuals (CUSUM) and the CUSUM of squares
(CUSUMSQ) tests proposed by Brown et al (1975) can be
applied to the residuals of the estimated error correction models
to test parameter constancy.



Having established the existence of a long-run relationship
based on F-tests, the second step of the ARDL analysis is to
estimate the long-run and the associated short-run coefficients.
The long-run relationship is regarded as a steady-state
equilibrium, whereas the short-run relationship is evaluated by
the magnitude of the deviation from the equilibrium.
The order of the lags in the ARDL model is selected using
appropriate selection criteria such as the Akaike Information
Criterion (AIC), the Schwarz Bayesian Criterion (SBC), the Hannan-Quinn
Criterion (HQC) and R2, ensuring that there is no evidence
of residual serial correlation, functional form misspecification,
non-normality or heteroscedasticity (a diagnostic sketch follows below).
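
The diagnostics mentioned above can be sketched as follows for a fitted statsmodels regression, assuming a recent statsmodels release that exposes the Ramsey RESET test; the data and specification are hypothetical.

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey, het_breuschpagan, linear_reset
from statsmodels.stats.stattools import jarque_bera

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical data
X = sm.add_constant(df[["cpi", "m2"]])
res = sm.OLS(df["gdp"], X).fit()

lm, lm_p, _, _ = acorr_breusch_godfrey(res, nlags=4)           # residual serial correlation
jb, jb_p, skew, kurt = jarque_bera(res.resid)                  # non-normality
bp, bp_p, _, _ = het_breuschpagan(res.resid, res.model.exog)   # heteroscedasticity
reset = linear_reset(res, power=3, use_f=True)                 # functional form (Ramsey RESET)

print(f"Breusch-Godfrey p={lm_p:.3f}, Jarque-Bera p={jb_p:.3f}, Breusch-Pagan p={bp_p:.3f}")
print(reset)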



In the presence of structural breaks, the diagnostic
tests of the selected models will most likely suggest
that the estimated model suffers from a non-normality problem.
In this case, the short-run and long-run coefficients
of the estimated models will not be valid.
The non-normality problem can be attributed to the
presence of outliers over the sample period, which result
from non-recurring, exogenous shocks (such as wars,
terrorist attacks, oil price shocks, financial crises, etc.)
rather than the normal evolution of the economic data.



Let us consider an example: the diagnostic test results for
Turkish macroeconomic data, which exhibit severe structural
breaks due to currency crises (see Table 1).
In econometric modelling, the presence of extreme
residuals, i.e. outliers, may lead to a rejection of the
normality assumption, as can be seen in Table 1.
The outliers can individually or collectively be
responsible for the residual non-normality problem.
Indeed, this is not surprising in most economic cases,
given that most of the series are characterized by
frequent fluctuations.
Pulse Dummy Variables



One way to improve the chances of error normality is to use
pulse dummy variables to capture those one-off abnormal
observations, i.e. the estimated models should be re-estimated
by augmenting the cointegrating equations with pulse dummy
variables.
Accordingly, separate dummy variables should be introduced for
each of the outliers. Following the existing econometrics
literature, an outlier is defined operationally as any data point
for which the residual exceeds 2 standard deviations in the
fitted model.
The dummy variables are set equal to zero for all observations
except the month in which the observation goes beyond the
threshold of two standard errors. In these months, the dummy
variable takes on the value of 1 (see the construction sketch below).

For example, Figure I plots the residuals of several econometric
models with structural breaks, together with the 2 standard error bands.

The horizontal lines in the figures represent the 2 standard error
bands.

In these models the identified outliers correspond to the Turkish
currency crises of 1994 and 2000-2001.

In this case, pulse dummy variables may be justifiably used to
remove observations corresponding to these one-off events that
are highly unlikely to be repeated.
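
A construction sketch for such pulse dummies, using the 2-standard-error criterion described above; the data and model specification are hypothetical.

import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical monthly data
X = sm.add_constant(df[["cpi", "m2"]])
res = sm.OLS(df["gdp"], X).fit()

# Flag observations whose residual lies outside the +/- 2 standard error band.
threshold = 2 * res.resid.std()
outlier_dates = res.resid.index[np.abs(res.resid) > threshold]

# One pulse dummy per outlier: 1 in that month, 0 in every other month.
X_aug = X.copy()
for date in outlier_dates:
    X_aug[f"pulse_{date:%Y_%m}"] = (df.index == date).astype(int)

res_aug = sm.OLS(df["gdp"], X_aug).fit()
print(res_aug.summary())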


When the results of the models that are re-estimated using
the pulse dummy variables to account for the presence of
the outliers are examined, it can be seen that the use of the
intervention dummy variables ensures normality of
the probability distribution of the residuals, which
permits hypothesis testing on the results of the
model (see Table 2).
The results also confirm that the re-estimated models
do not suffer from autocorrelation, heteroscedasticity,
or model misspecification problems.
Testing for Parameter Stability



The existence of cointegration does not necessarily imply that
the estimated coefficients are stable. If the coefficients are
unstable, the results will be unreliable.
In order to test for long-run parameter stability, Pesaran and
Pesaran (1997) suggest applying the cumulative sum of
recursive residuals (CUSUM) and the cumulative sum of
squares of recursive residuals (CUSUMSQ) tests proposed by
Brown et al (1975) to the residuals of the estimated ECMs to
test for parameter constancy.
The advantage of these tests is that, unlike the alternative
Chow test that requires break point(s) to be specified, they can
be used without the requirement of a priori knowledge of the
exact date of the structural break(s).




Hansen and Johansen (1992) also suggest a parameter constancy test,
but it requires the variables to be I(1). Moreover, it does not incorporate
the short-run dynamics of the model into the testing, unlike the CUSUM and
CUSUMSQ tests.
In both CUSUM and CUSUMSQ, the related null hypothesis is that all
coefficients are stable.
The CUSUM test uses the cumulative sum of recursive residuals based
on the first set of observations; it is updated recursively and plotted against
the break points. The test is more suitable for detecting systematic changes
in the regression coefficients.
The CUSUMSQ test makes use of the squared recursive residuals and
follows the same procedure. However, it is more useful in situations
where the departure from constancy of the regression coefficients
is haphazard and sudden.
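
For illustration, the CUSUM and CUSUMSQ statistics can be computed from one-step-ahead recursive residuals as in the following manual sketch (statsmodels also offers recursive-residual utilities); the data and specification are hypothetical and the 5 percent bounds are omitted for brevity.

import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical data
y = df["gdp"].to_numpy()
X = sm.add_constant(df[["cpi", "m2"]]).to_numpy()
n, k = X.shape

# Standardized one-step-ahead (recursive) residuals.
w = []
for t in range(k, n):
    beta = np.linalg.lstsq(X[:t], y[:t], rcond=None)[0]                  # fit on the first t observations
    fcast_err = y[t] - X[t] @ beta                                       # one-step-ahead forecast error
    scale = np.sqrt(1.0 + X[t] @ np.linalg.inv(X[:t].T @ X[:t]) @ X[t])
    w.append(fcast_err / scale)
w = np.array(w)

sigma = w.std(ddof=1)
cusum = np.cumsum(w) / sigma                # plotted against straight-line 5 percent bounds
cusumsq = np.cumsum(w**2) / np.sum(w**2)    # plotted against its own 5 percent bounds
print(cusum[-3:], cusumsq[-3:])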




If the plots of the CUSUM and CUSUMSQ statistics stay within the 5 percent
critical bounds, the null hypothesis that all coefficients are stable
cannot be rejected.
If, however, either of the parallel lines is crossed, then the null
hypothesis of parameter stability is rejected at the 5 percent
significance level.
For example, in Figures II and III, which plot the CUSUM and
CUSUMSQ tests, the plots of the CUSUM and CUSUMSQ statistics are
generally confined within the 5 percent critical value bounds,
indicating the absence of any instability of the coefficients.
Thus, the parameters of the model do not suffer from any structural
instability.
Thank you!