A Comparison of Autoregressive Distributed Lag and
Dynamic OLS Cointegration Estimators in the Case
of a Serially Correlated Cointegration Error
Nikitas Pittis∗
University of Piraeus
Ekaterini Panopoulou
University of Piraeus
March 17, 2004
Abstract
This paper deals with a family of parametric, single-equation cointegration estimators that
arise in the context of the Autoregressive Distributed Lag (ADL) models. We particularly focus
on a subclass of the ADL models, those that do not involve lagged values of the dependent
variable, referred to as Augmented Static (AS) models. The general ADL and the restricted AS
models give rise to the ADL and Dynamic OLS (DOLS) estimators, respectively. The relative
performance of these estimators is assessed by means of Monte Carlo simulations in the context
of a triangular Data Generation Process (DGP) where the cointegration error and the error that
drives the regressor follow a VAR(1) process. The results suggest that ADL fares consistently
better than DOLS, both in terms of estimation precision and reliability of statistical inferences.
This is due to the fact that DOLS, as opposed to ADL, does not fully correct for the second-order
asymptotic bias effects of cointegration, since a “truncation bias” always remains. As a result,
the performance of DOLS approaches that of ADL, as the number of lagged values of the first
difference of the regressor in the AS model increases. Another set of Monte Carlo simulations
suggests that the commonly used information criteria select the correct order of the ADL model
quite frequently, thus making the employment of ADL over DOLS quite appealing and feasible.
Additional results suggest that ADL re-emerges as the optimal estimator within a wider class
of asymptotically efficient estimators including, apart from DOLS, the semiparametric Fully
Modified Least Squares (FMLS) estimator of Phillips and Hansen (1990, Review of Economic
Studies, 57, 99-125), the non-linear parametric estimator (PL) of Phillips and Loretan (1991,
Review of Economic Studies, 58, 407-436) and the system-based maximum likelihood estimator
(JOH) of Johansen (1991, Econometrica, 59, 1551-1580). All the aforementioned results are
robust to alternative models for the error term, such as Vector Autoregressions of higher order,
or Vector Moving Average processes.
JEL classification: C12, C13, C22
Acknowledgements: We acknowledge financial support from the Greek Ministry of Education
and the European Union under “Hrakleitos” grant. We are grateful to Stéphane Grégoir, an
anonymous referee and participants in the XXVIII Simposio de Analisis Economico, Universidad
Pablo de Olavide, Sevilla, Spain, December 11-13, 2003 and the XXIX Conference on Stochastic
Processes and their Applications, IMPA, Rio de Janeiro, Brasil, August 3-9, 2003 for helpful
suggestions and comments. The usual disclaimer applies.
∗ Correspondence to: Nikitas Pittis, Department of Banking and Financial Management, University
of Piraeus, 80 M.Karaoli and A. Dimitriou str. 18534 Piraeus, Greece. E-mail: npittis@unipi.gr
1 Introduction
The concept of cointegration has evolved into a fully developed statistical theory that
covers regressions with integrated variables. Efficient estimators, either in a single or in a
system-of-equations framework are now available with well known asymptotic properties.
An interesting aspect of cointegration is that single equation methods are immune to
the classical problem of the endogeneity of the regressor(s). That is, the OLS estimator
converges at rate T, where T is the sample size, regardless of the correlation structure
between the cointegration error and the regressor (see Stock 1987). However, “long-run
correlation” and/or “endogeneity” problems are still encountered when statistical inference on the cointegration vector is conducted. In the presence of contemporaneous and/or
temporal correlation between the cointegration error and the regressor, the asymptotic
distribution for the OLS estimator does not belong to the Local Asymptotic Mixtures
of Normal (LAMN) family and depends on nuisance parameters (see Phillips 1988, Park
and Phillips 1988, Sims, Stock and Watson 1990, Phillips and Loretan 1991).
Various single-equation estimation methods dealing with the second-order effects, either parametrically or non-parametrically have been suggested in the literature (see, for
example, Johansen 1988, 1991, Phillips and Hansen 1990, Stock and Watson 1993). The
parametric methods attempt to estimate the long-run parameters in the context of a dynamic model, in which the regression error forms a martingale difference sequence with
respect to a selected information set. The resulting models fall into the category of the
Hendry-style Autoregressive Distributed Lag (ADL) models, which encompass the Error
Correction models (ECM) as a special case (see Hendry et al. 1984, Banerjee et al.
1993, Pesaran and Shin 1999). In empirical applications, however, the ADL class of models is rarely employed. Instead, applied researchers seem to favor a subclass of the ADL
family, namely those models that do not involve lagged values of the dependent variable,
say yt . These models can be thought of as arising from the static equation of yt on xt ,
augmented by current and past values of the first difference of the regressor.1 We shall
refer to these models as the Augmented Static (AS) models. Estimation of the cointegration vector in the context of the AS models by means of least squares is asymptotically
optimal and the resulting estimator is usually referred to as the Dynamic Ordinary Least
Squares (DOLS) estimator (see Stock and Watson 1993). In other words, for optimal
parametric inference we do not have to employ the full dynamic ADL model; instead the
AS model suffices. This is due to the fact that the AS model is based on the projection of the cointegration error on the current, and past values of the error that drives
the regressor (say, set A), that is it involves all the necessary parametric corrections for
removing the second-order effects.2 On the other hand, the ADL model is based on the
projection of the cointegration error on the full information set, (say, set B) that is on set
A, plus the past values of the cointegration error. This in turn implies that the AS and
ADL models differ in two respects: First, the error in the AS model, as opposed to the
error in the ADL model is, in general, serially correlated. This is not a major problem,
provided that the long-run variance of the error in the AS model is consistently estimated
(see Kramer 1986, Park and Phillips 1988). Second, and more importantly, in the case
where the cointegration error and the error that drives the regressor follow a Vector Autoregressive process of order m (VAR(m)), the projection of the cointegration error on
set B is summarized in terms of a small number of variables. On the other hand, the
projection of the cointegration error on set A results in an infinite weighted sum of current
and past values of the error that drives the regressor. In practice, of course, this infinite
sum is truncated at a specific lag, say p, so there is always a truncation remainder, which
represents the second-order effects that have not been taken into account. Therefore, the
ADL model, utilizing the exact projection of the cointegration error on set B, offers a
better framework for estimating the cointegration vector than the AS model that utilizes
an approximate projection of the cointegration error on set A.
1 The discussion refers to the case that there are no feedbacks from the cointegration error to the error
that drives the regressor. In the case that the cointegration error Granger-causes the regressor’s error,
the generating mechanism for the latter is not fully estimated. In such a case, further augmentation of
the ADL model by the leads of the regressor restores strong exogeneity and removes the second-order
asymptotic bias (see Phillips and Loretan 1991, Saikkonen 1991, Stock and Watson 1993, Pesaran and
Shin 1999).
2 This is true under the assumption that the cointegration error does not Granger cause the error that
drives the regressor. We relax this assumption in the third section of the paper.
The preceding discussion implies that the relative performance of ADL against DOLS
is likely to depend on the specific parametric model that generates the errors. For example,
if the error generating mechanism is a Vector Moving Average (VMA) process, then
the performance of the ADL estimator in finite samples is likely to be comparable to
that of DOLS. A direct implication of the VMA assumption is that the memory of the
cointegration error is designed to be extremely short. This, however, does not seem
to be the case, when actual data is used. In most macroeconomic applications, the
equilibrium error seems to exhibit a rather long memory. In fact, sometimes it is difficult
to distinguish between such a highly persistent error and a nonstationary one. In view
of this, it is natural to compare ADL and DOLS within a framework that is capable of
reproducing the observed behaviour of the cointegration error. Stock and Watson (1993)
(SW, henceforth) specify a VAR(1) model for the errors, which does give rise to a highly
persistent cointegration error. Their designs, however, are such that the truncation bias
of DOLS is zero, thus favoring the DOLS estimator against its competitors.3
In this paper we follow SW and employ a triangular Data Generation Process (DGP)
assuming that the cointegration error and the error that drives the regressor follow a
VAR(1) process with normal innovations. The purpose of this paper is to compare the
performance of the ADL and DOLS estimators when the cointegration error exhibits
various degrees of persistence. The parameter that controls the persistence of the cointegration error also controls the truncation bias of the DOLS estimator. The performance
of the estimators under consideration is assessed via Monte Carlo simulations. The results confirm the superiority of the ADL estimator over DOLS for all possible scenarios
on the persistence of the cointegration error and the Granger causality structure between
the cointegration error and the error that drives the regressor. In fact, in most cases, the
limiting performance of DOLS, as the number of lagged values of the first difference of
the regressor in the AS model increases, seems to be that of ADL. These results strongly
suggest the employment of the ADL estimator, provided that the correct order of the
model is selected. In this respect, additional Monte Carlo simulations suggest that the
commonly used information criteria are capable of delivering the correct order of ADL
at a satisfactory frequency. Another set of simulations suggests that all the aforementioned results favoring the ADL estimator are robust to alternative error processes, such
as VAR(2) or even VMA(1) processes.
The paper is organized as follows. Section 2 introduces the DGP and derives the
ADL and AS models, as well as the conditions that render them equivalent. Section 3
reports the Monte Carlo results. For completeness, we also report simulation evidence
on the performance of some other commonly used estimators, such as the semiparametric
Fully Modified Least Squares (FMLS) estimator of Phillips and Hansen (1990), the nonlinear-in-parameters estimator of Phillips and Loretan (1991), henceforth (PL), which
utilizes the same dynamic structure as that of ADL, and the system-based estimator
of Johansen (1991), henceforth (JOH). Within this broader set of alternative estimators,
ADL re-emerges as the optimal estimator, closely followed by the PL estimator. Section
4 concludes the paper by briefly summarizing our main results.
3 SW consider parameter settings such that the truncation effect is zero (cases A and B in pp. 795-799).
However, these authors are not interested in comparing the DOLS estimator with the more general ADL
estimator. Their concern lies in examining the performance of the DOLS estimator against that of some
other commonly used estimators.
2 Models and Estimators
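Let $z_t$ and $u_t$ be two bivariate processes, with $z_t = [y_t, x_t]^\top$ and $u_t = [u_{1t}, u_{2t}]^\top$. We
further assume that $u_t$ is a VAR(1) process, driven by $e_t = [e_{1t}, e_{2t}]^\top$, and the generating
mechanism for $y_t$ is given by the system
$$y_t = \theta x_t + u_{1t} \tag{1}$$
$$\Delta x_t = u_{2t} \tag{2}$$
and
$$\begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} u_{1t-1} \\ u_{2t-1} \end{pmatrix} + \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix}, \quad a_{21} = 0 \tag{3}$$
$$\begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} \sim NIID \left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix} \right] \tag{4}$$
for $t = 1, 2, \ldots, T$.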
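The triangular system (1)-(4) can be simulated in a few lines. The following is a minimal sketch, assuming NumPy is available; the function name `simulate_triangular_dgp`, the seed handling and the default burn-in of 50 observations are illustrative choices, not part of the paper's own code.

```python
import numpy as np

def simulate_triangular_dgp(T, theta, A, Sigma, burn=50, seed=0):
    """Simulate y_t = theta*x_t + u_1t and dx_t = u_2t with VAR(1) errors
    u_t = A u_{t-1} + e_t, e_t ~ NIID(0, Sigma). Names are hypothetical."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    # Both eigenvalues of A must lie inside the unit circle for u_t to be I(0).
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1.0:
        raise ValueError("eigenvalues of A must be less than one in modulus")
    n = T + burn
    e = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
    u = np.zeros((n + 1, 2))          # row 0 holds the start-up value u_0 = 0
    for t in range(1, n + 1):
        u[t] = A @ u[t - 1] + e[t - 1]
    u = u[1:]
    x = np.cumsum(u[:, 1])            # x_t is the partial sum of u_2t
    y = theta * x + u[:, 0]
    # Drop the burn-in observations, keeping T observations.
    return y[burn:], x[burn:], u[burn:]

# Example call with an illustrative parameter setting (a21 = a22 = 0).
A = [[0.6, 0.5], [0.0, 0.0]]
Sigma = [[1.0, 0.7], [0.7, 1.0]]
y, x, u = simulate_triangular_dgp(T=100, theta=1.0, A=A, Sigma=Sigma)
```

By construction, y − θx reproduces the cointegration error and the first difference of x reproduces the second error, which is a convenient sanity check on any implementation.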
Both eigenvalues of the matrix A = [aij ], i, j = 1, 2 are assumed to be less than one
in modulus, in order for yt and xt to be I(1) variables, and the cointegration error to be
an I(0) process. The long-run covariance matrix Ω and the one-sided covariance matrix
∆, needed to define the asymptotic nuisance parameters, are given by equations (5) and
(6), respectively
$$\Omega = (I - A)^{-1} \Sigma (I - A^\top)^{-1} \tag{5}$$
$$\Delta = G (I - A^\top)^{-1} \tag{6}$$
where Σ denotes the innovations covariance matrix of the VAR and G is the unconditional
covariance matrix of ut given by
$$\operatorname{vec} G = (I - A \otimes A)^{-1} \operatorname{vec} \Sigma \tag{7}$$
An early result by Stock (1987) shows that the OLS estimator of θ obtained from
(1) is super-consistent, regardless of the presence of temporal and/or contemporaneous
correlation between the regression error, u1t , and the error that drives the regressor,
u2t . On the other hand, in general, the asymptotic distribution of the OLS estimator of
θ falls outside the Local Asymptotic Mixture of Normals (LAMN) family and contains
nuisance parameters. The reason for the presence of non-standard asymptotics is that in
the presence of contemporaneous and temporal correlation between the elements of ut ,
two types of second-order asymptotic effects are present in the limiting distribution of
the OLS estimator (see Phillips and Loretan 1991): The first is the nuisance parameter,
ω12/ω22, which describes the “long-run correlation” effect, due to non-diagonality of the
long-run covariance matrix Ω = [ωij], i, j = 1, 2. The second is the nuisance parameter
$\delta_{21} = \sum_{k=0}^{\infty} E(u_{20} u_{1k})$, which describes the “endogeneity” effect. In the present case, where
there are no feedbacks from the cointegration error to the error that drives the regressor
(a21 = 0), both nuisance parameters have the same source, namely the contemporaneous
correlation between u1t and u2t and the temporal correlation between u2t−i , i = 1, 2, ...
and u1t .
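The two nuisance parameters can be computed directly from equations (5)-(7). The sketch below assumes NumPy; the function name `long_run_matrices` is hypothetical, and the parameter values in the example are one illustrative setting with a21 = a22 = 0.

```python
import numpy as np

def long_run_matrices(A, Sigma):
    """Return (Omega, Delta, G) from equations (5), (6) and (7)."""
    A = np.asarray(A, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    k = A.shape[0]
    I = np.eye(k)
    inv = np.linalg.inv(I - A)
    Omega = inv @ Sigma @ inv.T                       # eq. (5)
    # eq. (7): vec G = (I - A kron A)^{-1} vec Sigma, with column-major vec
    vecG = np.linalg.solve(np.eye(k * k) - np.kron(A, A),
                           Sigma.reshape(-1, order="F"))
    G = vecG.reshape(k, k, order="F")
    Delta = G @ np.linalg.inv(I - A.T)                # eq. (6)
    return Omega, Delta, G

A = [[0.6, 0.5], [0.0, 0.0]]
Sigma = [[1.0, 0.7], [0.7, 1.0]]
Omega, Delta, G = long_run_matrices(A, Sigma)
long_run_corr = Omega[0, 1] / Omega[1, 1]   # omega_12 / omega_22
endogeneity = Delta[1, 0]                   # delta_21
print(long_run_corr, endogeneity)
```

For this setting both quantities equal (a12 + σ12)/(1 − a11) = 3 up to rounding, so the two second-order effects coincide when there is no feedback from the cointegration error to the regressor's error.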
In order to remove the second order effects parametrically, we must employ a new
regression model whose error term is orthogonal to u2t and u2t−i , i = 1, 2, .... This can
be done by employing the conditional expectation of u1t either on the current and past
values of u2t (set A) or on the current and past values of u2t plus the past values of u1t
(set B). As mentioned in the introduction, the first and second conditioning information
sets result in the AS and ADL models, respectively. Next, we show how the AS and ADL
models are actually derived, starting from the latter.
2.1 The ADL estimator based on the ADL model.
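The full system (1) and (2) with errors specified by (3)-(4) implies the following conditional density of yt, for the most general case with a21 ≠ 0:
$$D(y_t \mid x_t, z^{0}_{t-1}, \lambda_1) = N(\theta_1 x_t + c_1 y_{t-1} + c_2 x_{t-1} + c_3 x_{t-2}, \sigma_\nu^2) \tag{8}$$
where $\lambda_1 \equiv (\theta_1, c_1, c_2, c_3, \sigma_\nu^2)$ and
$$\theta_1 = \theta + \frac{\sigma_{12}}{\sigma_{22}} \tag{9}$$
$$c_1 = a_{11} - a_{21} \frac{\sigma_{12}}{\sigma_{22}} \tag{10}$$
$$c_2 = a_{12} - \frac{\sigma_{12}}{\sigma_{22}} (a_{22} + 1 - a_{21}\theta) - a_{11}\theta \tag{11}$$
$$c_3 = a_{22} \frac{\sigma_{12}}{\sigma_{22}} - a_{12} \tag{12}$$
$$\sigma_\nu^2 = \sigma_{11} - \frac{\sigma_{12}^2}{\sigma_{22}} \tag{13}$$
This conditional model can be written as the ADL(q,r) regression, with orders (q,r)=(1,2):
$$y_t = \theta_1 x_t + c_1 y_{t-1} + c_2 x_{t-1} + c_3 x_{t-2} + \nu_t \tag{14}$$
The new error term, νt, is now orthogonal to u2t, ut−1, ut−2, ... and its variance is equal
to
$$\sigma_\nu^2 = \sigma_{11} - \frac{\sigma_{12}^2}{\sigma_{22}} \tag{15}$$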
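The mapping (9)-(13) from the DGP parameters to the ADL(1,2) coefficients is easy to evaluate numerically. The sketch below is plain Python; the function name `adl_coefficients` and the argument names are hypothetical. As a sanity check, the ratio (θ1 + c2 + c3)/(1 − c1) should return the cointegration parameter θ for any admissible parameter values.

```python
def adl_coefficients(theta, a11, a12, a21, a22, s11, s12, s22):
    """Coefficients of the ADL(1,2) conditional model, equations (9)-(13)."""
    r = s12 / s22
    theta1 = theta + r                                    # eq. (9)
    c1 = a11 - a21 * r                                    # eq. (10)
    c2 = a12 - r * (a22 + 1.0 - a21 * theta) - a11 * theta  # eq. (11)
    c3 = a22 * r - a12                                    # eq. (12)
    var_nu = s11 - s12 ** 2 / s22                         # eq. (13)
    return theta1, c1, c2, c3, var_nu

# Illustrative setting with a21 = a22 = 0 and unit innovation variances.
theta1, c1, c2, c3, var_nu = adl_coefficients(
    theta=1.0, a11=0.6, a12=0.5, a21=0.0, a22=0.0, s11=1.0, s12=0.7, s22=1.0)
long_run = (theta1 + c2 + c3) / (1.0 - c1)
print(theta1, c1, c2, c3, var_nu, long_run)
```

The same check also goes through with a21 ≠ 0, which is the general case covered by (8).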
In the context of the ADL(1,2) model the cointegration parameter θ is equal to the long-run multiplier of yt with respect to xt, that is
$$\theta = \frac{\theta_1 + c_2 + c_3}{1 - c_1} \tag{16}$$
This is a relationship between the parameter of interest and the parameters of the conditional model alone, suggesting that it meets the first condition for xt to be weakly
exogenous for θ, in the sense of Engle et al. (1983).4 This means that we can always
estimate (14) by OLS and then use (16) to obtain an efficient estimate of θ. However,
additional computations are required to obtain the variance of this estimate (see Banerjee
et al. 1993). A more convenient approach, proposed by Bewley (1979), transforms the
model (14) in such a way that a point estimate of θ and its variance can be obtained
directly. After some algebraic manipulation, model (14) can be equivalently written as:
$$y_t = \delta_0 \Delta y_t + \theta x_t + \lambda_0 \Delta x_t + \lambda_1 \Delta x_{t-1} + \eta_t \tag{22}$$
where
$$\delta_0 = -\frac{c_1}{1 - c_1}, \quad \lambda_0 = -\frac{c_2 + c_3}{1 - c_1}, \quad \lambda_1 = -\frac{c_3}{1 - c_1}, \quad \eta_t = \frac{1}{1 - c_1} \nu_t$$
Estimates of the coefficients and their standard errors can be obtained by using the Instrumental Variables (IV) estimator, with the original matrix of regressors being the
instrumental variables (see Wickens and Breusch 1988). This means that the ADL estimator of θ is very easy to apply since it involves only IV estimation techniques.
4 The second condition for weak exogeneity requires the parameters of the conditional model and
those of the marginal model to be variation-free (see Engle et al. 1983). In the present case, the marginal
density of xt is given by
$$D(x_t \mid z^{0}_{t-1}, \lambda_2) = N(\phi_1 x_{t-1} + \phi_2 x_{t-2} + \phi_3 y_{t-1}, \sigma_{22}) \tag{17}$$
where $\lambda_2 \equiv (\phi_1, \phi_2, \phi_3, \sigma_{22})$ and
$$\phi_1 = 1 - a_{21}\theta + a_{22} \tag{18}$$
$$\phi_2 = -a_{22} \tag{19}$$
$$\phi_3 = a_{21} \tag{20}$$
The variation-free condition between λ1 and λ2 is achieved in the case that a21 = 0. This is because,
in general, λ1 and λ2 are not variation free, due to the following cross restriction between the elements
of λ1 and λ2,
$$(\theta_1 + c_2 + c_3) \phi_3 = (1 - c_1)(1 - \phi_2 - \phi_1) \tag{21}$$
On the other hand, if a21 = 0, variation freeness is restored, xt becomes weakly exogenous for θ and
OLS on (14) will give a (super) consistent and asymptotically mixed normal estimate of θ.
2.2 The DOLS estimator based on the AS model.
The ADL model, derived above, may be thought of as arising from projecting u1t on the
full information set B = (u2t, ut−1, ut−2, ...), that is
$$E(u_{1t} \mid B) = \frac{\sigma_{12}}{\sigma_{22}} e_{2t} + a_{11} u_{1t-1} + a_{12} u_{2t-1} \tag{23}$$
As already mentioned, the second-order effects can be dealt with by projecting u1t on a
subset of this set, namely A = (u2t, u2t−1, u2t−2, ...), A ⊂ B. The resulting conditional
expectation involves an infinite sum,
$$E(u_{1t} \mid A) = \sum_{i=0}^{\infty} \beta_i u_{2t-i} \tag{24}$$
where βi are functions of the parameters in (3)-(4).5 This conditional expectation does
not admit a parsimonious representation analogous to (23). On the other hand, it allows
for direct substitution of this expression into (1), thus yielding the AS model
$$y_t = \theta x_t + \sum_{i=0}^{\infty} \beta_i \Delta x_{t-i} + \upsilon_t \tag{25}$$
where υt is, in general, a serially correlated error term. In particular, υt follows the AR(1)
model
$$\upsilon_t = \gamma_2 \upsilon_{t-1} + \varepsilon_t \tag{26}$$
where γ2 is the MA coefficient in the ARMA(2,1) representation of u2t. Specifically, the
univariate representation for u2t with a21 = 0 is
$$u_{2t} - (a_{11} + a_{22}) u_{2t-1} + a_{11} a_{22} u_{2t-2} = \xi_{2t} + \gamma_2 \xi_{2t-1} \tag{27}$$
where γ2 solves
$$\sigma_{22} a_{11} \gamma_2^2 + \sigma_{22} (1 + a_{11}^2) \gamma_2 + a_{11} \sigma_{22} = 0 \tag{28}$$
The last three relationships suggest that the degree of serial correlation in the error of
the AS model is controlled by a11. This is because in the case of a11 = 0, the coefficient
γ2 in (26) is zero, thus yielding a serially uncorrelated error in the AS model.6 The serial
correlation of υt does not raise any serious problems in the estimation of θ, provided
that a consistent estimator of the long-run variance of υt is employed, such as the one
proposed by Newey and West (1987). Alternatively, the application of Generalized Least
Squares (GLS) on (25) ensures valid asymptotic inferences on θ.7 In practice, however,
the second term on the right-hand side of (25) has to be replaced by an approximation in
which the infinite sum is truncated at i = p. The resulting AS(p) model accommodates
a truncation remainder that is likely to increase the bias of the DOLS estimator of θ.
This bias grows with the parameter a11, which mainly controls the persistence of the
cointegration error. Increasing the truncation point reduces the DOLS bias, but increases
its variance. Moreover, estimating (25) by OLS is not feasible if p is too large compared
to the sample size. Saikkonen (1991) specifies an upper bound for the rate at which p is
allowed to increase with the sample size T, which is given by the condition $p^3/T \to 0$.
Nevertheless, this condition cannot be used to define the optimal value of p for any given
sample size.
5 It is easy to show that $\beta_0 = \frac{\sigma_{12}}{\sigma_{22}}$, $\beta_1 = a_{11} \frac{\sigma_{12}}{\sigma_{22}} + a_{12}$, $\beta_2 = a_{11}^2 \frac{\sigma_{12}}{\sigma_{22}} + a_{11} a_{12} + a_{12} a_{22}$, ..., when
a21 = 0.
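To see how the truncation remainder depends on a11, the projection weights can be written out for the special case a21 = a22 = 0, in which u2t is white noise. Under that assumption (ours, chosen for tractability) the weights are β0 = σ12/σ22 and βi = a11^(i−1)(a11 σ12/σ22 + a12) for i ≥ 1, which agrees with the leading terms reported for the βi when a22 = 0. The sketch below is plain Python; the function names and the use of the omitted-weight sum as a rough measure of the remainder are illustrative choices.

```python
def as_weights(a11, a12, s12, s22, n):
    """Weights beta_0..beta_n of E(u_1t | u_2t, u_2t-1, ...) when a21 = a22 = 0."""
    r = s12 / s22
    betas = [r]
    for i in range(1, n + 1):
        betas.append(a11 ** (i - 1) * (a11 * r + a12))
    return betas

def omitted_tail(a11, a12, s12, s22, p):
    """Sum of the weights beta_i for i > p, i.e. what an AS(p) model leaves out."""
    r = s12 / s22
    return (a11 * r + a12) * a11 ** p / (1.0 - a11)

# The omitted weight mass decays geometrically in p at rate a11.
for a11 in (0.3, 0.6, 0.9, 0.95):
    tails = [round(omitted_tail(a11, 0.5, 0.7, 1.0, p), 4) for p in (1, 5, 10, 20)]
    print(a11, tails)
```

Because the tail shrinks like a11^p, a highly persistent cointegration error requires a much larger p before the omitted terms become negligible.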
Finally, it is easy to show that when a11 = 0, the ADL model reduces to the AS
model. In this case, the ADL(q,r) and AS(p) models, implied by this specific DGP, are
the ADL(0,2) and AS(1) models, respectively.
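The two estimators defined in this section can be sketched on a single simulated sample. The code below assumes NumPy and a design with a21 = a22 = 0 and unit innovation variances; all function names are hypothetical. The ADL(1,2) estimate is computed twice, once from OLS on (14) combined with (16) and once from the Bewley form (22) by IV with the regressors of (14) as instruments; the standard error uses the textbook just-identified IV covariance formula, which is our assumption rather than a formula stated in the text. DOLS(p) is OLS on the AS model truncated at lag p, with no leads.

```python
import numpy as np

def simulate(T=100, theta=1.0, a11=0.6, a12=0.5, s12=0.7, burn=50, seed=0):
    """One sample from the triangular DGP with a21 = a22 = 0."""
    rng = np.random.default_rng(seed)
    n = T + burn
    e = rng.multivariate_normal([0.0, 0.0], [[1.0, s12], [s12, 1.0]], size=n)
    u1 = np.zeros(n)
    for t in range(1, n):
        u1[t] = a11 * u1[t - 1] + a12 * e[t - 1, 1] + e[t, 0]
    x = np.cumsum(e[:, 1])            # u_2t = e_2t in this design
    y = theta * x + u1
    return y[burn:], x[burn:]

def adl12_ols(y, x):
    """OLS on (14), then the long-run multiplier (16)."""
    Z = np.column_stack([x[2:], y[1:-1], x[1:-1], x[:-2]])
    b, *_ = np.linalg.lstsq(Z, y[2:], rcond=None)
    theta1, c1, c2, c3 = b
    return (theta1 + c2 + c3) / (1.0 - c1)

def adl12_bewley_iv(y, x):
    """IV on the Bewley form (22); instruments are the regressors of (14)."""
    Y = y[2:]
    W = np.column_stack([y[2:] - y[1:-1], x[2:], x[2:] - x[1:-1], x[1:-1] - x[:-2]])
    Z = np.column_stack([x[2:], y[1:-1], x[1:-1], x[:-2]])
    ZW_inv = np.linalg.inv(Z.T @ W)
    b = ZW_inv @ (Z.T @ Y)
    resid = Y - W @ b
    s2 = resid @ resid / (len(Y) - W.shape[1])
    cov = s2 * ZW_inv @ (Z.T @ Z) @ ZW_inv.T   # conventional IV covariance
    return b[1], np.sqrt(cov[1, 1])

def dols(y, x, p):
    """OLS on y_t = theta*x_t + sum_{i=0}^{p} beta_i dx_{t-i} + error."""
    dx = np.diff(x)                    # dx[j] is the first difference at time j+1
    rows = np.arange(p + 1, len(y))    # observations with all p lags available
    X = np.column_stack([x[rows]] + [dx[rows - 1 - i] for i in range(p + 1)])
    b, *_ = np.linalg.lstsq(X, y[rows], rcond=None)
    return b[0]

y, x = simulate(seed=1)
theta_ols = adl12_ols(y, x)
theta_iv, se_iv = adl12_bewley_iv(y, x)
theta_dols = dols(y, x, p=4)
print(theta_ols, theta_iv, se_iv, theta_dols)
```

The OLS-based and IV-based ADL point estimates coincide up to rounding error, since the Bewley form is an exact reparametrization of (14); the practical gain from the IV route is the directly available standard error for θ.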
3 Simulation Results
In this section, we attempt to quantify the cost of employing the AS(p) instead of the
ADL(q,r) model for the estimation of θ, by means of Monte Carlo simulations. The
OLS and IV estimators applied to the AS(p) and ADL(1,2) models (22), respectively,
are referred to as the DOLS(p) and ADL(1,2) estimators. The serial correlation effect
on the DOLS(p) estimator is taken into account by means of the autocorrelation consistent covariance matrix estimator of Newey and West (1987). The bandwidth parameter
is estimated non-parametrically, according to Newey and West (1994). Alternatively,
we assume an AR(1) model for υt and employ the feasible generalized least squares
estimator, referred to as the DGLS(p). The truncation parameter, p, takes values in
the interval [1, 20], by steps of 1. As mentioned in the introduction, the comparison
is extended to include some other commonly used estimators, such as the FMLS, the
PL(s,l) and the JOH(z) estimators.8 The mean bias, median bias and average root mean
squared error (MSE) are used to assess the estimators. The associated t-tests are assessed
by comparing the 2.5% (t0.025) and the 97.5% (t0.975) points in the empirical distributions of the relevant t-statistics with those from the standard N(0,1). Moreover, for
nominal sizes of 5%, the empirical sizes of the t-tests for testing the hypothesis θ = 1
are computed. We generate 2000 series of length 150, starting with u10 = u20 = 0,
and then discard the initial 50 observations, thus generating a sample size of 100. Although many other parameter settings were run, we only report the results for the leading
case {a12 = 0.5, σ12 = 0.7, a21 = a22 = 0, σ11 = σ22 = 1, θ = 1 and 0 < a11 < 1}, referred
to as DGP1, because this summarizes the main differences between the ADL(1,2) and
DOLS(p)/DGLS(p) estimators.
6 See also Stock and Watson 1993, p. 798, for a similar discussion on this issue, for the general case
with a21 ≠ 0.
7 Note that in the case of a linear regression which involves an I(1) strictly exogenous regressor, the
OLS is asymptotically equivalent to the GLS estimator (see Kramer 1986, Park and Phillips 1988).
8 The FMLS estimator is based on consistent estimation of the matrices Ω and ∆, which in turn
requires the selection of a kernel and the determination of the bandwidth. We employ the Quadratic
Spectral kernel and determine the bandwidth by means of the Andrews (1991) data-dependent procedure.
Moreover, the "prewhitened" version of FMLS (PW-FMLS), which filters the error vector $\tilde{u}_t$ prior to
estimating Ω and ∆, is also employed (see Christou and Pittis 2002, for a discussion on the performance
of the various versions of the FMLS estimator). Regarding the PL(s,l) estimator, the orders s and l refer
to the lags and leads of ∆xt, respectively. Finally, the order z in the JOH(z) estimator corresponds to
the lag-order of the Vector Autoregressive Model on which this estimator is based.
In this case, the regressor xt is a random walk and weakly exogenous for θ, in the
context of the conditional model (14). The asymptotic nuisance parameters, ω12/ω22 and
δ21, reduce to:
$$\frac{\omega_{12}}{\omega_{22}} = \delta_{21} = \frac{a_{12} + \sigma_{12}}{1 - a_{11}} \tag{29}$$
It is easy to show that when a11 → 1, then
$$\frac{\omega_{12}}{\omega_{22}} = \delta_{21} \to \begin{cases} +\infty & \text{if } a_{12} + \sigma_{12} > 0 \\ -\infty & \text{if } a_{12} + \sigma_{12} < 0 \end{cases} \tag{30}$$
This means that the magnitude of the nuisance parameters increases with the persistence of the cointegration error, thus amplifying the truncation effect on the DOLS(p)
and DGLS(p) estimators. The key parameter a11 takes the values 0.3, 0.6 and 0.9. A
near-to-unit root case is also examined by setting a11 = 0.95.9 First, we focus solely on
comparing ADL(1,2) with DOLS(p) and DGLS(p). The results concerning the mean bias,
median bias and MSE of these estimators are reported in Figures 1A-1D, 2A-2D and
3A-3D, respectively, and are summarized as follows:
(i) The mean (or median) bias for all the estimators, namely ADL(1,2), DOLS(p) and
DGLS(p), increases with the degree of persistence of the cointegration error.
(ii) DOLS(p) and DGLS(p) perform far worse than ADL(1,2) in bias and MSE
for small values of the truncation parameter, p. When p increases, the DOLS(p) and
DGLS(p) bias converges to that of ADL(1,2). However, the lag length necessary to reduce the bias of DOLS(p) and DGLS(p) towards the bias of ADL(1,2) increases with the
persistence of the cointegration error. For example, when a11 is equal to 0.3, 0.6 and 0.9,
the number of lags necessary to bring the bias of DOLS(p) down to the level of ADL(1,2)
is 4, 7 and 20, respectively. In the near-to-unit root case, a11 = 0.95, the performance of
DOLS(20) and DGLS(20) in bias is still much worse than that of ADL(1,2).
(iii) For small values of p, DOLS(p) fares much better than DGLS(p). When p becomes
sufficiently large, DOLS(p) and DGLS(p) become equivalent in bias and MSE.
(iv) When p increases, the rate of decrease of the bias of DOLS(p) and DGLS(p) is
much higher than the rate at which the standard deviation of these estimators increases,
for all the values of a11, except for a11 = 0.3. This explains why the MSE is a decreasing
function of p for all the values of a11, except for a11 = 0.3.
(v) When we increase the sample size to 300, the overall picture regarding the relative
performance of the ADL(1,2) and DOLS(p)/DGLS(p) estimators remains the same.
9 Given the values of a12, a21 and a22 in this design, a value of a11 as large as 0.95 still satisfies the
eigenvalue stability condition for the VAR model of the errors.
Next, we compare the ADL(1,2) estimator, which so far has emerged as the best
estimator, with the rest of the estimators under scrutiny. For the DGP under study,
the optimal orders s, l, and z for the PL(s,l) and JOH(z) estimators are 1, 0 and 2,
respectively.10 The results are reported in Table 1 and summarized below:
(i) As expected, the performance of the PL(1,0) estimator is comparable to that
of ADL(1,2), since both estimators utilize the same dynamic structure. The JOH(2)
estimator also fares well, especially for the most persistent cases of a11 = 0.9 and a11 =
0.95.
(ii) The standard FMLS and, to a lesser extent, the prewhitened FMLS estimators
underperform ADL(1,2), PL(1,0) and JOH(2) for all the values of a11 . For example, for
a11 = 0.6 the bias of the FMLS, the PW-FMLS and the ADL(1,2) estimators is equal to
0.066, 0.0202 and 0.0017, respectively.
(iii) Comparing DOLS(p) with the PW-FMLS estimator yields ambiguous results. For
a11 = 0.3 and a11 = 0.6, DOLS(p) dominates PW-FMLS in terms of bias for all but very
small values of p. For a11 = 0.95, however, the opposite is true; the PW-FMLS estimator
is less biased than DOLS(p) for all the values of p that are less than or equal to 13.
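The bias and MSE comparison between ADL(1,2) and DOLS(p) can be outlined as a small simulation loop. This is a sketch under stated assumptions, not the code behind the reported tables and figures: it assumes NumPy, uses the leading design with a21 = a22 = 0, estimates ADL(1,2) by OLS followed by the long-run multiplier, omits DGLS and the other estimators, and defaults to far fewer replications than the 2000 used in the paper. All function names are hypothetical.

```python
import numpy as np

def simulate(rng, T=100, theta=1.0, a11=0.6, a12=0.5, s12=0.7, burn=50):
    """One draw from the design with a21 = a22 = 0 and unit innovation variances."""
    n = T + burn
    e = rng.multivariate_normal([0.0, 0.0], [[1.0, s12], [s12, 1.0]], size=n)
    u1 = np.zeros(n)
    for t in range(1, n):
        u1[t] = a11 * u1[t - 1] + a12 * e[t - 1, 1] + e[t, 0]
    x = np.cumsum(e[:, 1])
    return (theta * x + u1)[burn:], x[burn:]

def adl12(y, x):
    """OLS on the ADL(1,2) regression, then the implied long-run multiplier."""
    Z = np.column_stack([x[2:], y[1:-1], x[1:-1], x[:-2]])
    b, *_ = np.linalg.lstsq(Z, y[2:], rcond=None)
    return (b[0] + b[2] + b[3]) / (1.0 - b[1])

def dols(y, x, p):
    """OLS on the AS model truncated at lag p (current and lagged dx only)."""
    dx = np.diff(x)
    rows = np.arange(p + 1, len(y))
    X = np.column_stack([x[rows]] + [dx[rows - 1 - i] for i in range(p + 1)])
    b, *_ = np.linalg.lstsq(X, y[rows], rcond=None)
    return b[0]

def monte_carlo(a11, p_values=(1, 4, 8), reps=200, T=100, theta=1.0, seed=0):
    """Mean bias, median bias and root MSE of each estimator over `reps` draws."""
    rng = np.random.default_rng(seed)
    draws = {"ADL(1,2)": []}
    draws.update({f"DOLS({p})": [] for p in p_values})
    for _ in range(reps):
        y, x = simulate(rng, T=T, theta=theta, a11=a11)
        draws["ADL(1,2)"].append(adl12(y, x))
        for p in p_values:
            draws[f"DOLS({p})"].append(dols(y, x, p))
    summary = {}
    for name, values in draws.items():
        err = np.asarray(values) - theta
        summary[name] = {
            "mean_bias": float(err.mean()),
            "median_bias": float(np.median(err)),
            "rmse": float(np.sqrt((err ** 2).mean())),
        }
    return summary

results = monte_carlo(a11=0.6, reps=100)
for name, stats in results.items():
    print(name, stats)
```

Raising `reps` and looping over a11 and p reproduces the structure of the experiment; the numbers from such a short run are noisy and should not be read as replications of the reported results.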
We now turn to the problem of inference by examining the empirical distribution
of the estimators’ t-statistics as well as the corresponding empirical sizes for testing the
hypothesis θ = 1. Table 2 reports the 2.5% (t0.025 ) and 97.5% (t0.975 ) points of the
empirical distribution of the t-statistics for all the estimators under consideration and
for the four values of a11 . Again we start the comparisons by focusing on the ADL(1,2),
DOLS(p) and DGLS(p) estimators. The results suggest that the DOLS(p) and DGLS(p)
t-statistics are not, in general, well approximated by a standard N(0,1), even when a
sufficiently large value of p is employed. On the other hand, the ADL(1,2) t-statistic is
much better approximated by the standard N(0,1), especially when the persistence of the
cointegration error is not particularly high. Moreover, the value of p that minimizes the
bias of DOLS(p) and DGLS(p) does not always coincide with the value of p that minimizes
the distributional divergence of the corresponding t-statistics from the standard N(0,1).
For example, for a moderately persistent cointegration error, that is for a11 = 0.6, the bias
of DOLS(p) reaches the level of the ADL(1,2) for p=7. For this value of p the 2.5% and
97.5% points of the corresponding t-statistic distribution are -2.9 and 2.9, respectively.
The situation deteriorates for higher values of a11 . For a11 = 0.9, the biases of both
DOLS(p) and DGLS(p) are minimized for p=20, a value for which the t0.025 and t0.975
points are equal to -4.7 and 6.2, respectively for DOLS(p), and -2.3 and 3.8, respectively
for DGLS(p). More dramatic effects occur when the cointegration error is nearly nonstationary, that is when a11 = 0.95. On the other hand, for a11 = 0.9, the t0.025 and
t0.975 points for ADL(1,2) are -1.7 and 3.8, respectively, thus ensuring much more reliable
inferences on θ. These distributional characteristics of the t-statistics are reflected in
the empirical sizes of the t-tests for testing the hypothesis θ = 1. The results, reported
in Figures 4A-4D, reveal large size distortions for both DOLS(p) and DGLS(p) in the
following two cases: First, when a11 = 0.6 and the value of p is relatively small. Second,
when the cointegration error is highly persistent, that is when a11 = 0.9 and even worse
when a11 = 0.95. In the second case, the size distortions are present regardless of the value
of p, and yield totally unreliable inferences. For example, for a11 = 0.9 the empirical size
of DOLS(p) ranges from 72 percent for p=1 to 43 percent for p=20. At the same time,
the empirical size of ADL(1,2) is at the reasonable level of 15 percent. Increasing the
sample size to 300 yields qualitatively similar results. For example, for a11 = 0.9, the
empirical size of DOLS(p) is 67 percent for p=1 and reduces to 36 percent for p=20, while
the size of the ADL(1,2) is at the level of 7 percent.
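The two inference diagnostics used above, the empirical 2.5% and 97.5% points of a t-statistic and the empirical size of a nominal 5% two-sided test, are simple functions of the simulated t-statistics. A minimal sketch, assuming NumPy and a hypothetical helper name `t_test_summary`; the input in the example is a toy vector rather than simulation output.

```python
import numpy as np

def t_test_summary(t_stats):
    """Empirical 2.5% and 97.5% points and rejection rate at the 5% N(0,1) level."""
    t = np.asarray(t_stats, dtype=float)
    crit = 1.959963984540054          # two-sided 5% critical value of N(0,1)
    q_low, q_high = np.quantile(t, [0.025, 0.975])
    size = float(np.mean(np.abs(t) > crit))
    return float(q_low), float(q_high), size

# Toy input: two of the five statistics exceed 1.96 in absolute value.
q_low, q_high, size = t_test_summary([-3.0, -1.0, 0.0, 1.0, 2.5])
print(q_low, q_high, size)
```

If the t-statistic were exactly standard normal, the two quantiles would be close to −1.96 and 1.96 and the rejection rate close to 0.05; departures in either direction indicate size distortion.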
We now examine the issue of statistical inference in the context of the PL(1,0), the
JOH(2), the FMLS and the PW-FMLS estimators. The t0.025 and t0.975 values for these
10 For the prewhitened version (PW) of FMLS, a VAR(1) model is used as the filter for prewhitening
residuals. That is, the VAR-filter coincides with the true model for ut , thus creating the best case
environment for the performance of the PW-FMLS estimator.
9
estimators are also reported in Table 2, whereas the corresponding empirical sizes are
reported in the last column of Table 1. The FMLS bias, reported above, is accompanied
by size distortions which become more severe as the value of a11 increases. For example,
for a11 = 0.9, the empirical size of FMLS and PW-FMLS is 63 percent and 29 percent
respectively, whereas the corresponding size for the ADL(1,2) is as low as 15 percent.
These distortions are due to the large divergence of the FMLS t-statistic from the standard
normal, occuring when the persistence of cointegration error is high. For example, for
a11 = 0.95, the value of t0.975 is 38.78 and 8.73 for FMLS and PW-FMLS, respectively.
The empirical size of PL(1,0) is almost identical to that of ADL(1,2) for all the values of
a11 . This, however, does not seem to be the case for the JOH-based t-test, which appears
to be under-sized for low and moderate degrees of persistence of the cointegration error.
Specifically, for a11 = 0.3 and a11 = 0.6 the value of t0.025 is -0.952 and -1.211 respectively,
resulting in empirical sizes that are substantially smaller than the nominal ones.
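
The statistics discussed above (mean bias, median bias, MSE, the 2.5 and 97.5 percent points of the t-statistic and the empirical size) can be computed from the Monte Carlo draws along the following lines. This is a minimal sketch: the function name `mc_summary`, the signed definition of bias and the two-sided critical value of 1.96 are our assumptions, not definitions taken from the paper.

```python
import numpy as np

def mc_summary(theta_hat, t_stat, theta_true=1.0, crit=1.96):
    """Summarize Monte Carlo draws of an estimator and of its t-statistic.

    theta_hat, t_stat: one entry per replication (hypothetical inputs).
    crit: two-sided critical value; 1.96 corresponds to a nominal 5 percent test
          under a standard normal reference distribution (an assumption here).
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    t_stat = np.asarray(t_stat, dtype=float)
    err = theta_hat - theta_true
    return {
        "mean_bias": err.mean(),
        "median_bias": np.median(err),
        "mse": np.mean(err ** 2),
        "t_0.025": np.quantile(t_stat, 0.025),
        "t_0.975": np.quantile(t_stat, 0.975),
        # share of replications in which the true null is rejected, in percent
        "size_pct": 100.0 * np.mean(np.abs(t_stat) > crit),
    }

# Usage: if the t-statistic really were N(0,1), the empirical size should be
# close to the nominal 5 percent and the quantiles close to -1.96 and 1.96.
rng = np.random.default_rng(0)
summary = mc_summary(1.0 + 0.01 * rng.standard_normal(5000),
                     rng.standard_normal(5000))
print(summary)
```

An under-sized test, such as the JOH-based one for low persistence, shows up here as quantiles that lie inside the interval (-1.96, 1.96), whereas a right-shifted t-distribution yields a large t_0.975 and an inflated size.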
As far as alternative parametrizations are concerned, we run the following simulations:
(i) The second-order effects arise solely from the contemporaneous correlation between
the innovations of the error, that is a12 = 0, σ12 = 0.7, a21 = a22 = 0, σ 11 = σ 22 =
1, θ = 1 and 0 < a11 < 1. (ii) The error that drives the regressor Granger causes the
cointegration error, but the contemporaneous correlation between the two errors is zero,
that is a12 = 0.5, σ 12 = 0, a21 = a22 = 0, σ 11 = σ 22 = 1, θ = 1 and 0 < a11 < 1. In both
of these cases, the results are qualitatively similar to those of the leading case. DOLS(p)
and DGLS(p) are generally beaten by ADL(1,2) in bias and MSE for all values of p
under consideration. When p reaches a sufficiently large value, say p∗ , the performance
of DOLS(p∗ ) and DGLS(p∗ ) approaches that of ADL(1,2). As in the leading case, p∗
increases with the degree of persistence of the cointegration error. The problems of
statistical inferences on θ, in both cases are very similar to those reported for the leading
case. Finally, we briefly discuss the case, where the key parameter a11 is set equal to zero,
that is a12 = 0.5, σ 12 = 0.7, a21 = a22 = 0, σ 11 = σ 22 = 1, θ = 1 and a11 = 0. This is a
case where the AS model utilizes an exact rather than an approximate projection of u1t
on the current and past values of u2t , which in turn implies that the DOLS(1) estimator
utilizes the correct model, whereas the ADL(1,2) estimator is based on a slightly overspecified model.11 The simulation results seem to confirm the theoretical predictions. The
bias and standard deviation of DOLS(1) are slightly smaller than those of ADL(1,2). Of
course, the addition of more lags of ∆xt in the AS model increases the variability (and the
MSE) of DOLS(p), but this is something that occurs in the case of an over-parametrized
ADL(q,r) model as well.
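
For readers who wish to replicate the alternative parametrizations, the following sketch generates data from a triangular system with VAR(1) errors. It assumes that the DGP takes the form yt = θxt + u1t, ∆xt = u2t, ut = Aut−1 + et with Gaussian innovations; the function name, the burn-in length and the initialization x0 = 0 are our own illustrative choices.

```python
import numpy as np

def simulate_triangular(T, theta, A, Sigma, burn=200, seed=None):
    """Simulate y_t = theta*x_t + u_1t, dx_t = u_2t, u_t = A u_{t-1} + e_t.

    A: 2x2 transition matrix [[a11, a12], [a21, a22]].
    Sigma: 2x2 innovation covariance [[s11, s12], [s12, s22]].
    burn: number of discarded initial error draws (an assumption).
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    chol = np.linalg.cholesky(np.asarray(Sigma, dtype=float))
    e = rng.standard_normal((T + burn, 2)) @ chol.T   # e_t ~ N(0, Sigma)
    u = np.zeros((T + burn, 2))
    for t in range(1, T + burn):
        u[t] = A @ u[t - 1] + e[t]
    u = u[burn:]
    x = np.cumsum(u[:, 1])          # I(1) regressor driven by u_2t
    y = theta * x + u[:, 0]         # cointegrating relation
    return y, x, u

Sigma = [[1.0, 0.7], [0.7, 1.0]]
# Parameter values quoted in the text, with a11 = 0.9 chosen for illustration
y, x, u = simulate_triangular(100, 1.0, [[0.9, 0.5], [0.0, 0.0]], Sigma, seed=1)
# case (i): contemporaneous correlation only (a12 = 0)
y1, x1, _ = simulate_triangular(100, 1.0, [[0.9, 0.0], [0.0, 0.0]], Sigma, seed=1)
# case (ii): Granger causality only (sigma12 = 0)
y2, x2, _ = simulate_triangular(100, 1.0, [[0.9, 0.5], [0.0, 0.0]], np.eye(2), seed=1)
```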
3.1 Information Criteria
The analysis, so far, seems to favor the ADL(q,r) over the DOLS(p)/DGLS(p) estimation
method for conducting inferences on θ. In fact, this estimator dominates, in some or all
the aspects of statistical inference, not only the DOLS(p)/DGLS(p) estimator but also
the rest of the estimators presently under study. Throughout the analysis, we assumed
that the ADL(q,r) estimator utilizes the correct dynamic model, implied by the DGP (1)
- (4), that is q=1, r=2. In such a case, the performance of the ADL(1,2) estimator may be
thought of as the limiting performance of the DOLS(p) or DGLS(p) estimators. Does this
clearly suggest that in empirical applications, researchers should always employ ADL(q,r)
for estimating θ? The answer seems to be in the affirmative, conditional, however, on the
ability of researchers to determine the correct dynamic model for each particular case,
that is to select the correct values for q and r. A more realistic experiment for measuring
11 In this case, υ t is serially uncorrelated, which in turn implies that neither non-parametric nor
GLS-type corrections are necessary.
the benefits from employing ADL(q,r) over DOLS(p)/DGLS(p) should incorporate the
issue of selecting the lag orders (q,r) and p in the corresponding estimators. To address
this issue, we design the following experiment, in the context of the DGP1: We consider
the family of ADL(q,r) estimators that arise from allowing q and r to take integer values
in the interval [0,4], thus obtaining fourteen ADL(q,r) estimators. In this class, and for
the specific DGP under study, ADL(0,0), ADL(0,1), ADL(1,0) and ADL(1,1) are underspecified, ADL(1,2) is correctly specified, and the rest are over-specified. We also consider
twenty one DOLS(p) estimators and another twenty one DGLS(p) estimators, by allowing
p to take integer values in the interval [0,20]. As far as the PL(s,l) and JOH(z) estimators
are concerned, we allow s and z to take integer values in the interval [1,4].12 Since no leads
of the regressor are required for this particular DGP, we set l equal to zero. To select the
orders (q,r), p, s and z, we use the three most commonly used information criteria for
model selection, namely the Akaike (1974), the Schwarz (1978) and the Hannan and Quinn
(1979) criteria, denoted by AIC, SIC and HQ, respectively. In each replication, we select
the orders of ADL(q,r), DOLS(p)/DGLS(p), PL(s,0) and JOH(z) by each of the three
criteria and calculate the statistics, defined in the previous section. The average values of
the statistics concerning the estimation precision are reported in Table 3, whereas those
on hypothesis testing are reported in Table 4. We also report the frequencies at which
each criterion selects the orders (q, r) and p, in Figures 5A-5D, 6A-6D and 7A-7D for the
ADL(q,r), DOLS(p) and DGLS(p) estimators, respectively. For brevity, we do not report
the selection frequencies of the orders s and z in PL(s,0) and JOH(z), respectively, but
we briefly discuss them in the text. We consider sample sizes of 100 and 300, but report
the results only from the former.
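
The order-selection step of this experiment can be sketched as follows. The code assumes that an ADL(q,r) regression contains xt, q lags of yt and r lags of xt without an intercept, that all candidate models are fitted on a common sample, and that the criteria take their usual log-variance-plus-penalty form. It searches the full grid of q and r in [0,4]; the restricted set of fourteen models used in the paper is not reproduced here. All function names are hypothetical.

```python
import numpy as np

def adl_design(y, x, q, r, max_lag):
    """Regressand and regressors of an ADL(q,r): x_t, y_{t-1..t-q}, x_{t-1..t-r}."""
    rows = np.arange(max_lag, len(y))          # common estimation sample
    cols = [x[rows]]
    cols += [y[rows - j] for j in range(1, q + 1)]
    cols += [x[rows - j] for j in range(1, r + 1)]
    return y[rows], np.column_stack(cols)

def info_criteria(resid, k):
    """AIC, SIC and HQ for a regression with k parameters (textbook forms)."""
    n = len(resid)
    log_s2 = np.log(resid @ resid / n)
    return {
        "AIC": log_s2 + 2.0 * k / n,
        "SIC": log_s2 + k * np.log(n) / n,
        "HQ": log_s2 + 2.0 * k * np.log(np.log(n)) / n,
    }

def select_adl(y, x, max_q=4, max_r=4):
    """Return the (q, r) pair minimizing each criterion over the full grid."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    max_lag = max(max_q, max_r)
    best = {}
    for q in range(max_q + 1):
        for r in range(max_r + 1):
            yy, X = adl_design(y, x, q, r, max_lag)
            beta, *_ = np.linalg.lstsq(X, yy, rcond=None)
            resid = yy - X @ beta
            for name, value in info_criteria(resid, X.shape[1]).items():
                if name not in best or value < best[name][0]:
                    best[name] = (value, (q, r))
    return {name: order for name, (value, order) in best.items()}
```

Repeating `select_adl` over the replications and tabulating the selected pairs gives selection frequencies of the kind reported in the figures.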
First, we confine our discussion to the comparison between the ADL(q,r) and DOLS(p)/
DGLS(p) estimators. The main results may be summarized as follows:
(i) Irrespective of the value of a11 , the SIC and HQ criteria select the correct specification of the dynamic model, i.e. the ADL(1,2), in 85 percent of the cases, whereas the
respective figure for the AIC is 60 percent. Increasing the sample size to 300 increases
the frequency at which the correct ADL model is selected to 65 percent for AIC and to
95 percent for SIC and HQ.
(ii) In the context of the DOLS(p)/DGLS(p) estimators, SIC and HQ fail to select
a sufficiently large p, especially for large values of a11 . In the context of DOLS(p), the
best performing criterion is by far AIC, which tends to point towards large values of p as
the persistence of the cointegration error increases. The performance of AIC, however,
is greatly reduced in the context of the DGLS(p) estimator, where AIC is still the best
criterion but only by a slight margin over SIC and HQ.
(iii) The behavior of the information criteria has the following consequences: the mean,
median bias and MSE are much lower in the context of ADL(q,r) than DOLS(p)/DGLS(p),
especially as a11 tends to unity. For example, when a11 = 0.95, the average bias of the
AIC-based ADL(q,r), is four and thirteen times lower than the bias of the AIC-based
DOLS(p) and DGLS(p), respectively. The picture is similar as regards hypothesis testing. The distributions of the ADL(q,r), DOLS(p) and DGLS(p) t-statistics shift to the
right as the persistence of the cointegration error increases. This is due to the fact that
the nuisance parameters ω 12 /ω 22 and δ 21 tend to +∞ as a11 approaches unity. This shift
is profound in the case of the “partly corrected” DOLS(p) and DGLS(p) estimators, thus
yielding empirical sizes of 67 percent and 99 percent, respectively, for a11 = 0.95. For
the same degree of persistence, the empirical size of the ADL(q,r) procedure is around 23
12 Obviously, the problem of selecting the correct lag order is not relevant for FMLS or PW-FMLS, due
to their non-parametric nature. A comparable issue concerns the selection of the optimal bandwidth by
means of optimality criteria, such as the ones suggested by Andrews (1991) or Newey and West (1994).
We do not attempt to deal with this issue in detail, since it is clearly outside the scope of the paper.
percent, regardless of the information criterion employed.
(iv) Turning to the relative performance of the DOLS(p) versus DGLS(p) estimators,
the superiority of the DOLS(p) estimator is evident in both estimation and hypothesis
testing. This is due to the fact that all the criteria fail to select a sufficiently large p for
the DGLS(p) estimator. When a11 = 0.9, the mean biases of the AIC-based DOLS(p)
and DGLS(p) are 0.08 and 0.44, respectively, whereas for a11 = 0.95 the corresponding
biases climb to 0.27 and 0.85, respectively.
(v) There is a simple reason why AIC is the best performing criterion in the context
of DOLS(p), whereas it does worse than SIC and HQ in the context of the ADL(q,r)
estimator: AIC is an asymptotically efficient criterion, that is it selects the model that
best fits the data without assuming that the correct model belongs in the set of candidate
models. This situation obviously applies to the class of the AS(p) models under consideration,
since the correct model requires p=∞ and therefore lies outside the candidate set. On the other hand, when the class of the ADL(q,r)
models is considered, the correct model, ADL(1,2), belongs to the set of candidate models.
In such a case, consistent selection criteria, such as SIC and HQ work well in selecting
the correct model for reasonable sample sizes.
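
The different behavior of the criteria can be traced to their penalty terms. Assuming the usual textbook forms (the exact definitions used in the simulations are not restated in this section), with k the number of estimated parameters, T the sample size and \hat\sigma_k^2 the residual variance of the candidate model:

```latex
\begin{align*}
\mathrm{AIC}(k) &= \ln\hat\sigma_k^{2} + \frac{2k}{T}, \\
\mathrm{SIC}(k) &= \ln\hat\sigma_k^{2} + \frac{k\ln T}{T}, \\
\mathrm{HQ}(k)  &= \ln\hat\sigma_k^{2} + \frac{2k\ln\ln T}{T}.
\end{align*}
% Per-parameter penalties at T = 100:
\[
\frac{2}{T} = 0.0200
\;<\;
\frac{2\ln\ln T}{T} \approx 0.0305
\;<\;
\frac{\ln T}{T} \approx 0.0461 .
\]
```

Under these forms, AIC charges the smallest price for an additional lag of ∆xt, which is consistent with its tendency to select larger values of p than SIC and HQ.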
Now we examine the performance of the PL(s,0), the JOH(z) and the PW-FMLS
estimators. The main results are summarized below:
(i) The frequencies at which the criteria select the correct PL(1,0) model are almost
equal to the corresponding ones for the ADL(1,2) model. As a result, the performance of
PL(s,0) is comparable to that of ADL(q,r) as far as estimation precision and reliability of
statistical inferences are concerned. Similar results are obtained for the JOH(z) estimator.
Therefore, the ADL(q,r), PL(s,0) and JOH(z) estimators may be thought of as forming
a class of parametric estimators, say Class A, with similar characteristics.
(ii) The PW-FMLS estimator, with the bandwidth parameter selected by the Andrews
(1991) data-dependent procedure, and the DOLS(p) estimator, with p selected by any
of the three criteria under study seem to form a second class of estimators, say Class B.
Any estimator of Class A seems to dominate any estimator of Class B in any aspect of
statistical inference. Finally, the standard FMLS and the DGLS(p) estimators seem to
form a third class, say Class C, consisting of the worst-performing estimators.
3.2 Further extensions
So far, regarding statistical inferences on θ, the Monte Carlo evidence strongly suggests
the use of an estimator from Class A (in particular, ADL(q,r)) over estimators from Class
B or, even more so, over estimators from Class C. Moreover, attention has focused on the
case that the cointegration error and the first difference of the regressor are generated
by a VAR(1) process, and the cointegration error does not Granger-cause the error that
drives the regressor. The ADL(1,2) model is the correct model implied by this specific
DGP and its order is successfully selected by the three most commonly used information
criteria, especially the consistent ones, i.e. the SIC and HQ criteria. Next, we investigate
the extent to which the relative performance of the estimators remains unchanged, when
alternative specifications of the error dynamics are considered. In particular, we extend
our simulations to include the following cases:
(i) the cointegration error Granger-causes the error that drives the regressor, that is
a21 ≠ 0.
(ii) the cointegration error and the error that drives the regressor follow a VAR(2)
process.
(iii) the cointegration error and the error that drives the regressor follow a first-order
vector moving average, VMA(1), process.
3.2.1 The cointegration error Granger-causes the error that drives the regressor.
In this set of simulations, the error vector, ut , is still a VAR(1) process, but the transition
matrix A does not contain any zero elements, except for a22 . In this case, as opposed to
the ones analyzed in the previous section, there are feedbacks from the cointegration error
to the error that drives the regressor. This, in turn, implies that further augmentation
of the ADL(q,r) and AS(p) models by g leads of xt and t leads of ∆xt , respectively is
required for asymptotic optimality. The resulting estimators, referred to as ADL(q,r,g)
and DOLS(p,t)/DGLS(p,t), aim at removing the second-order asymptotic bias effects
that arise from contemporaneous and temporal correlation between the elements of ut
(see Phillips and Loretan 1991, Saikkonen 1991, Stock and Watson 1993).13
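
A minimal sketch of the point estimator from the AS model augmented with leads may help fix ideas. It assumes the regression yt = θxt + Σ cj ∆xt−j + vt, with j running from minus the number of leads up to p, and no intercept. The function name `dols` and the argument name `leads` (the order denoted t in the text) are ours, and the sketch returns only the estimate of θ, not its t-statistic.

```python
import numpy as np

def dols(y, x, p, leads=0):
    """OLS estimate of theta in y_t = theta*x_t + sum_j c_j dx_{t-j} + v_t,
    with j = -leads, ..., p (current, lagged and led first differences of x)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    dx = np.diff(x, prepend=np.nan)            # dx[t] = x[t] - x[t-1]; dx[0] is undefined
    rows = np.arange(p + 1, len(y) - leads)    # observations with all terms available
    cols = [x[rows]] + [dx[rows - j] for j in range(-leads, p + 1)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y[rows], rcond=None)
    return beta[0]                             # coefficient on x_t

# Usage on artificial data (illustrative values only)
rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(100))
y = 1.0 * x + rng.standard_normal(100)
print(dols(y, x, p=2, leads=1))
```

Setting `leads=0` gives the DOLS(p) estimator of the previous section under the same assumptions.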
In this respect, we consider the family of ADL(q,r,g) estimators, that arise from allowing q, r and g to take integer values in the interval [0,4], thus obtaining fourteen
ADL(q,r,g) estimators. We also consider nine DOLS(p,t) estimators and another nine
DGLS(p,t) estimators, by allowing p and t to take integer values in the interval [0,4]. Finally, we consider fourteen PL(s,l) estimators by allowing s and l to take integer values in
the intervals [1,4] and [0,4], respectively. To select the orders of these estimators, we use
the three criteria mentioned above. The design of this set of simulations is the same as
that described in the previous section. The DGP under consideration is the following:
{a12 = 0.5, σ 12 = 0.7, a21 = 0.5, a22 = 0, σ 11 = σ 22 = 1, θ = 1 and 0 < a11 < 1}. The
parameter a11 is set equal to 0.3 and 0.7, in order for the eigenvalue stability condition
to be satisfied. The average values of the statistics concerning the estimation precision
and hypothesis testing are tabulated in Tables 5 and 6, respectively. We also report the
frequencies at which each criterion selects the orders of the ADL(q,r,g), DOLS(p,t) and
DGLS(p,t) estimators in Figures 8A to 8B, 9A to 9B and 10A to 10B, respectively.
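
The restriction of a11 to 0.3 and 0.7 can be checked directly. With a12 = a21 = 0.5 and a22 = 0, the characteristic equation of A is λ² − a11 λ − 0.25 = 0, whose largest root reaches unity at a11 = 0.75. The following sketch (function names are ours) verifies this numerically.

```python
import numpy as np

def transition(a11, a12=0.5, a21=0.5, a22=0.0):
    """Transition matrix A of the VAR(1) error process for this design."""
    return np.array([[a11, a12], [a21, a22]])

def is_stable(A):
    """True if all eigenvalues of A lie strictly inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(A))) < 1.0)

for a11 in (0.3, 0.7, 0.9, 0.95):
    roots = np.linalg.eigvals(transition(a11))
    print(a11, np.round(np.max(np.abs(roots)), 3), is_stable(transition(a11)))
```

The values 0.9 and 0.95 used in the leading case therefore violate the stability condition once a21 = 0.5.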
First, we compare the ADL(q,r,g), DOLS(p,t) and DGLS(p,t) estimators. This set of
simulations provides further evidence on the dominance of the ADL class of estimators
over the DOLS/DGLS one, in terms of both estimation precision and reliability of statistical inference. However, the difference in the performance between ADL(q,r,g) and
DOLS(p,t)/DGLS(p,t) is less prominent in this case than it was in the leading case,
DGP1. This is due to the fact that when a21 ≠ 0, the “long-run correlation” parameter ω 12 /ω 22 converges to a well-defined limit as a11 → 1. On the other hand, when
a21 = 0, the nuisance parameter ω 12 /ω 22 tends to infinity, as a11 → 1. As a consequence
of the limiting behavior of ω 12 /ω 22 , the average biases of the ADL(q,r,g) estimators for
a11 = 0.7 lie between 0.0008 and 0.0015 depending on the selection criterion, whereas the
corresponding biases of the DOLS(p,t) and DGLS(p,t) estimators lie between 0.0009 and
0.0019, and 0.0014 and 0.0032, respectively. Similarly, statistical inferences on θ are much
more reliable in the context of the ADL(q,r,g) estimator, as suggested by the ADL(q,r,g)
empirical sizes, which hardly exceed 10 percent, irrespective of the selection criterion
used and the value of a11 . On the other hand, the empirical sizes for the DOLS(p,t) and
DGLS(p,t) t-tests range from 18.6 percent to 21.9 percent and from 11.7 percent to 19.5
percent, respectively.
As far as the rest of the estimators are concerned, the discussion is confined solely
to hypothesis testing, for reasons of space. The PL(s,l) t-statistic is distributed approximately as N (0, 1), thus resulting in very reliable statistical inferences on θ. On the other
hand, the JOH(z)-based t-test is under-sized, especially for small values of a11 , despite
the fact that the information criteria select the correct lag order, z=2, at a frequency that
ranges from 90 to 98 percent. Interestingly, the PW-FMLS procedure allows for statistical
inferences of reasonable accuracy. In particular, the empirical size of the associated t-test
is 6 percent and 10.6 percent for a11 = 0.3 and a11 = 0.7, respectively. This means that
13 In a similar vein, the order, l, in the PL(s,l) estimator is assumed to be greater than zero.
the performance of the PW-FMLS estimator is comparable to that of DGLS(p) and, for
small values of a11 , even to that of ADL(q,r,g). However, the relatively good properties
of PW-FMLS are not shared by the standard FMLS, which remains the worst-performing
estimator, producing empirical sizes of 10.9 percent and 49.6 percent for a11 = 0.3 and
a11 = 0.7, respectively.
3.2.2 VAR(2) errors
In this set of simulations we investigate the extent to which the ADL(q,r) estimator
outperforms the DOLS/DGLS(p) one in the case that the errors are generated by a
bivariate VAR(2) process. First, we obtain the ADL(q,r) model implied by this DGP,
and second, we derive the conditions under which the ADL(q,r) model reduces to the
AS(p) model. Specifically, we assume that the errors are generated by the following
process:
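$$
\begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} u_{1t-1} \\ u_{2t-1} \end{pmatrix}
+
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
\begin{pmatrix} u_{1t-2} \\ u_{2t-2} \end{pmatrix}
+
\begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix},
\qquad
\begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix}
\sim NIID
\left[
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}
\right]
\tag{31}
$$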
with a21 = 0, b21 = 0, that is there is no Granger-causality running from the cointegration
error to the error that drives the regressor. The VAR(2) structure of ut implies that the
conditional expectation of u1t on the full information set can be summarized as follows:
$$
E(u_{1t} \mid u_{2t}, u_{t-1}, u_{t-2}, \ldots)
= \frac{\sigma_{12}}{\sigma_{22}} e_{2t} + a_{11} u_{1t-1} + a_{12} u_{2t-1} + b_{11} u_{1t-2} + b_{12} u_{2t-2}
\tag{32}
$$
This conditional expectation gives rise to the following ADL(2,3) model
$$
y_t = \theta x_t + d_1 y_{t-1} + d_2 y_{t-2} + d_3 x_{t-1} + d_4 x_{t-2} + d_5 x_{t-3} + \nu_t
\tag{33}
$$
where
$$
d_1 = a_{11} - a_{21} \frac{\sigma_{12}}{\sigma_{22}}
\tag{34}
$$
$$
d_2 = b_{11} - b_{21} \frac{\sigma_{12}}{\sigma_{22}}
\tag{35}
$$
$$
d_3 = a_{12} - a_{11}\theta + (a_{21}\theta - a_{22} - 1) \frac{\sigma_{12}}{\sigma_{22}}
\tag{36}
$$
$$
d_4 = b_{12} - b_{11}\theta - a_{12} + (b_{21}\theta + a_{22} - b_{22}) \frac{\sigma_{12}}{\sigma_{22}}
\tag{37}
$$
$$
d_5 = b_{22} \frac{\sigma_{12}}{\sigma_{22}} - b_{12}
\tag{38}
$$
It is easy to show that the ADL(2,3) model reduces to the AS(2) model when a11 =
b11 = 0. Similarly to the previous experiments, we consider the family of ADL(q,r) estimators that arise from allowing q and r to take integer values in the interval [0,4], thus
obtaining fourteen ADL(q,r) estimators. Regarding the AS(p) model, we allow p to take
integer values in the interval [0,20], thus obtaining twenty one DOLS(p)/ DGLS(p) estimators.14 To select the q, r and p orders, we employ the information criteria employed in
our previous simulations. In each replication, we determine the orders of ADL(q,r) and
DOLS(p)/DGLS(p) by each of the three criteria, and then we use the resulting estimators
to calculate the statistics, defined in the previous section. The DGP under consideration
is the following: {a12 = b12 = 0.5, σ 12 = 0.7, a21 = a22 = b21 = b22 = 0, σ 11 = σ22 = 1,
θ = 1 and 0 ≤ a11 < 1, 0 ≤ b11 < 1}. The parameter a11 is set equal to 0, 0.3 and 0.6,
while b11 takes the values of 0 and 0.3, in order for the eigenvalue stability condition to
be satisfied.15 To conserve space, we do not report the results from these experiments,
but we briefly discuss them below:
14 For the DGLS(p) estimator, we assume an AR(2) model for υ t .
All in all, the simulation results continue to provide strong evidence in favor of the
ADL(q,r) models, similar to our leading case, DGP1. The SIC and HQ criteria select
the correct order of the ADL(q,r) estimator in 90 percent of the cases, whereas the
performance of AIC falls to the level of 60 percent. On the other hand, the consistent
criteria fail to select a sufficiently large p for the DOLS(p)/DGLS(p) estimators. The
efficient AIC criterion fares better: in the case of a highly persistent cointegration
error, that is when a11 + b11 = 0.9, it selects p=20 in 40 percent of the cases. As a result, the
ADL(q,r) estimator has substantially lower mean and median biases than the DOLS(p)
or DGLS(p) estimators. For example, when a11 = 0.6 and b11 = 0.3, the bias of the
ADL(q,r) estimator ranges from 0.049 when HQ is used to 0.055 when AIC is used. For
the same values of a11 and b11 , the bias of DOLS(p) ranges from 0.145 when AIC is used
to 0.175 when SIC is used. DGLS(p) is by far the worst estimator in both bias and MSE.
For example, for a11 + b11 = 0.9 the bias and MSE of the AIC-based DGLS(p) are 0.588
and 0.517, respectively. The bias and MSE of DGLS(p) increase dramatically when SIC
is used, reaching the values of 1.033 and 1.121, respectively. The differences in the degree
of biases and MSEs between the ADL(q,r) and DOLS(p)/DGLS(p) estimators are also
reflected in the size performance of the corresponding test procedures. In particular, the
DOLS(p)/DGLS(p) t-tests suffer from severe size distortions, especially in the cases of a
highly persistent cointegration error. For example, for a11 + b11 = 0.9, the empirical size
of DOLS(p) and DGLS(p), when AIC is used, is 58.8 and 75.5 percent, respectively. On
the other hand, for the same degree of persistence and by means of the same information
criterion, the empirical size of the ADL(q,r)-based t-test is only 15.6 percent.16
Turning to the rest of the estimators, the PL(s,l) t-statistic fares slightly better than
the ADL(q,r) one, producing an empirical size of approximately 14 percent in the case of
a highly persistent cointegration error. The empirical distribution of the JOH(z)-based
t-statistic is skewed to the right producing empirical sizes considerably greater than those
associated with the ADL(q,r) or PL(s,l) procedures. Nevertheless, the size distortions of
JOH(z) are significantly smaller than those produced by DOLS(p) or DGLS(p). Finally,
the behavior of the semiparametric estimators imitates that of the leading case, DGP1.
In particular, the PW-FMLS, and especially the FMLS procedures fail to account for
the second-order asymptotic bias effects, thus resulting in t-statistics whose empirical
distributions are located away from zero. The more persistent the cointegration error is,
the more pronounced these effects appear to be. For example, when a11 + b11 = 0.9, the
mean value of the FMLS and PW-FMLS t-statistic is 4.6 and 2.3, respectively, producing
empirical sizes of 69.7 percent and 50.6 percent, respectively.
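
The mapping from the VAR(2) parameters to the coefficients d1 to d5 in (34)-(38), together with the stability condition of this design, can be checked numerically. The sketch below transcribes the formulas as stated in the text; the function names and the companion-form stability check are our own additions.

```python
import numpy as np

def adl23_coefficients(theta, A, B, Sigma):
    """Coefficients d1..d5 of the ADL(2,3) model, following (34)-(38)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    s = Sigma[0][1] / Sigma[1][1]              # sigma12 / sigma22
    d1 = a11 - a21 * s
    d2 = b11 - b21 * s
    d3 = a12 - a11 * theta + (a21 * theta - a22 - 1.0) * s
    d4 = b12 - b11 * theta - a12 + (b21 * theta + a22 - b22) * s
    d5 = b22 * s - b12
    return d1, d2, d3, d4, d5

def var2_is_stable(A, B):
    """True if all eigenvalues of the VAR(2) companion matrix are inside the unit circle."""
    companion = np.block([
        [np.asarray(A, dtype=float), np.asarray(B, dtype=float)],
        [np.eye(2), np.zeros((2, 2))],
    ])
    return bool(np.max(np.abs(np.linalg.eigvals(companion))) < 1.0)

Sigma = [[1.0, 0.7], [0.7, 1.0]]
A = [[0.6, 0.5], [0.0, 0.0]]                   # a11 = 0.6, a12 = 0.5
B = [[0.3, 0.5], [0.0, 0.0]]                   # b11 = 0.3, b12 = 0.5
print(adl23_coefficients(1.0, A, B, Sigma), var2_is_stable(A, B))
```

With a11 = b11 = 0 the function returns d1 = d2 = 0, so that no lagged values of yt remain in the model, in line with the reduction to the AS(2) model noted above.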
15 Given the values of a12, a21, a22 and b12, b21, b22 in this design, the eigenvalue stability condition
reduces to: a11 + b11 < 1.
16 The only case where the performance of the DOLS(p)/DGLS(p) estimators is comparable to that
of the ADL(q,r) estimator is when a11 = b11 = 0. This is the case where the ADL(0,3) model reduces
to the AS(2) model and the DOLS(2)/DGLS(2) estimators do not suffer from a truncation bias. The
consistent criteria identify the correct order in the context of both specifications in more than 90 percent
of the cases. As a result, the performance of the DOLS(p)/DGLS(p) estimators is almost equal to that
of the ADL(q,r) estimator.
3.2.3 VMA(1) errors
In this set of simulations, we use a first-order bivariate moving average, VMA(1), process
to generate the errors, u1t and u2t . The moving average assumption implies that the
memory of the cointegration error is designed to be extremely short. Such a case rarely
occurs in macroeconomic applications, where a highly persistent cointegration error is
often detected. Specifically,
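$$
\begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} e_{1t-1} \\ e_{2t-1} \end{pmatrix}
+
\begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix}
\tag{39}
$$
and
$$
\begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix}
\sim NIID
\left[
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}
\right]
\tag{40}
$$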
for t = 1, 2, ...T . This DGP does not produce a finite-order ADL(q,r) model, as the
VMA(1) process has a VAR(∞) representation. In this set of simulations, we consider the
set of ADL(q,r) and DOLS(p)/DGLS(p) estimators, employed in our previous simulations,
where the orders q, r, p are selected by the AIC, SIC and HQ criteria. The parameter
settings for the DGP under consideration are the following: {a12 = 0.5, σ 12 = 0.7,
a21 = a22 = 0, σ 11 = σ 22 = 1, θ = 1 and 0 < a11 < 1}. As in the case with the
VAR(1) errors, the parameter a11 takes the values 0.3, 0.6, 0.9 and 0.95. The results (not
reported) may be summarized as follows:
(i) Irrespective of the value of a11 , the SIC and HQ criteria choose the DOLS(1)/
DGLS(1) estimator in more than 90 percent of the cases. The respective figure for AIC
is only 58 percent.
(ii) The order of the ADL(q,r) estimator, selected by the criteria, is an increasing
function of the parameter a11 . For example, when a11 = 0.3, the SIC and HQ criteria
select the ADL(1,2) in more than 50 percent of the cases, whereas for a11 = 0.9 or 0.95,
they select the ADL(2,4) model most frequently. On the other hand, when a11 = 0.9 or
0.95, the efficient AIC criterion selects almost evenly among the ADL(2,4), ADL(3,4) and
ADL(4,4) estimators.
(iii) When a11 = 0.9 or 0.95, the AIC-based DOLS(p) estimator is consistently the
best but only by a slight margin over the AIC-based ADL(q,r).
(iv) The distribution of the t-statistic of both ADL(q,r) and DOLS(p)/ DGLS(p) estimators is properly centered around zero, while slightly negatively skewed and leptokurtic.
On the other hand, the ADL(q,r)-based t-tests marginally outperform the DOLS(p)-based
ones in minimizing size distortions. For a nominal size of 5 percent, the empirical size of
the SIC-based ADL(q,r) estimator is 9.7 percent when a11 = 0.9, whereas the respective
figure for the SIC-based DOLS(p) estimator is 10.2 percent. Moreover, the size of the
DGLS estimators is a decreasing function of the parameter a11 , ranging from around 5
percent for a11 = 0 to 3 percent for a11 = 0.95.
(v) The behavior of the JOH(z) and PL(s,l) estimators is almost identical to that of
ADL(q,r) in terms of bias, MSE and percent rejections of the null hypothesis. Interestingly, the semiparametric methods fare reasonably well in this case. The PW-FMLS
estimator, in particular, seems to account fully for the second-order endogeneity effects,
thus providing a reasonable alternative to parametric procedures in conducting statistical
inferences on θ.
The overall picture suggests that when the errors are generated by a VMA(1) process,
the DOLS(p)/DGLS(p) and PW-FMLS estimators fare no worse than the ADL(q,r),
PL(s,l) and JOH(z) estimators, in terms of estimation precision and reliability of statistical inferences.
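
The finding that the selected ADL order rises with a11 is in line with the VAR(∞) representation of the VMA(1) process. Writing (39) as ut = et + Aet−1 and inverting gives et = Σ (−A)^j ut−j, so the autoregressive weights decay at a rate governed by the eigenvalues of A. The sketch below counts how many lags are needed before all weights fall below a tolerance; the tolerance of 0.05 and the function name are arbitrary illustrative choices of ours.

```python
import numpy as np

def var_lags_needed(A, tol=0.05, max_lag=500):
    """Smallest j such that every entry of (-A)^j is below tol in absolute value.

    Returns None if no such j is found up to max_lag (e.g. a non-invertible VMA).
    """
    A = np.asarray(A, dtype=float)
    power = np.eye(A.shape[0])
    for j in range(1, max_lag + 1):
        power = power @ (-A)                   # (-A)^j, the weight on u_{t-j}
        if np.max(np.abs(power)) < tol:
            return j
    return None

for a11 in (0.3, 0.6, 0.9, 0.95):
    A = [[a11, 0.5], [0.0, 0.0]]               # a12 = 0.5, a21 = a22 = 0 as in the text
    print(a11, var_lags_needed(A))
```

The count grows quickly as a11 approaches unity, which suggests why low-order ADL(q,r) models are adequate for small a11 but not for a11 = 0.9 or 0.95.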
4 Conclusions
The simulation experiments reported in this paper highlight the potential pitfalls of employing the DOLS(p)/DGLS(p) estimators or the class of FMLS estimators for the estimation of a cointegration vector in a single-equation framework. These pitfalls are easily
addressed by using the ADL(q,r) or PL(s,l) estimators instead. The results of this paper
are summarized as follows:
(i) In general, the performance of the ADL(q,r) (or PL(s,l)) estimators is superior
to that of the DOLS(p)/ DGLS(p) estimators. This is due to the fact that the latter
estimators, as opposed to the former, suffer from truncation bias. A large value of p is
usually required for the DOLS(p)/ DGLS(p) bias to approach the levels of the ADL(q,r)
bias. However, the 2.5% and 97.5% points of the empirical distribution of the DOLS(p)/
DGLS(p) t-statistics do not approach the corresponding points of the N (0, 1), even for
large values of p. As a consequence, the sizes of the tests based on the DOLS(p)/ DGLS(p)
estimators, as opposed to those based on the ADL(q,r) estimators, are far off their nominal
size of 5 percent.
(ii) The truncation bias of the DOLS(p)/ DGLS(p) estimators depends on the asymptotic long-run correlation and endogeneity nuisance parameters, both of which depend
on the Granger causality structure of the errors in the model and the persistence of the
cointegration error. As a result, the difference between the performances of ADL(q,r)
and DOLS(p)/ DGLS(p) increases with the persistence of the cointegration error. This
effect is milder in the presence of Granger causality running from the cointegration error
to the error that drives the regressor, because in this case, the nuisance parameters do
not explode as the persistence of the cointegration error increases.
(iii) The benefits from employing the ADL(q,r) estimators, instead of the DOLS(p)/
DGLS(p) estimators, remain substantial when the orders (q, r) and p are selected via the
usual order-selection criteria. The use of the consistent SIC and HQ criteria in the context
of the ADL(q,r) model, leads to selection of the correct order in more than 90 percent of
the cases. On the other hand, these criteria are totally unable to move away from low
orders in the context of the DOLS(p)/ DGLS(p) estimation method, thus producing a
very large truncation bias. The efficient AIC criterion is by far the best performing one
in the context of the DOLS(p) estimator, since it selects a sufficiently large p in the cases
that the truncation bias is likely to be large.
(iv) The simulation results provide strong evidence against the employment of the
standard FMLS estimator. In fact, this estimator is inferior even to DOLS(p)/DGLS(p)
for most values of p and for all the DGPs under study. If the applied researcher insists on
using FMLS, then at least he/she must utilize the “prewhitened” version of this estimator,
in order to achieve performance comparable to that of DOLS(p).
(v) The above-mentioned results mainly refer to the cases of autoregressive errors.
When the errors follow a bivariate moving average process, where the persistence of the
cointegration error is low and the truncation bias of the DOLS(p)/ DGLS(p) estimators
is negligible, the two methods under study are almost equivalent.
References
[1]Akaike, H. (1974), A New Look at the Statistical Model Identification, IEEE Transactions on Automatic Control, AC-19, 667-673.
[2]Andrews, D.W.K. (1991), Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation. Econometrica, 59, 817-858.
[3]Banerjee, A., Dolado, J.J., Galbraith, J.W. and D.F. Hendry (1993), Cointegration,
Error Correction and the Econometric Analysis of Non-Stationary Data, Oxford, Oxford University Press.
[4]Bewley, R.A (1979), The Direct Estimation of the Equilibrium Response in a Linear
Model, Economics Letters, 3, 357-61.
[5]Christou, C. and N. Pittis (2002), Kernel and Bandwidth Selection, Prewhitening, and
the Performance of the Fully Modified Least Squares Estimation Method. Econometric
Theory, 18, 948-961.
[6]Engle, R.F., D.F. Hendry and J.F. Richard (1983), Exogeneity, Econometrica, 51,
277-304.
[7]Hannan, E.J. and Quinn, B.G. (1979), The Determination of the Order of an Autoregression, Journal of the Royal Statistical Society, B41, 190-195.
[8]Hendry, D.F., A.R. Pagan and J.D. Sargan (1984), Dynamic Specification, in Z.
Griliches and M.D. Intriligator (eds.) Handbook of Econometrics, vol II, ch.18, 1023-1100.
[9]Johansen, S. (1988), Statistical Analysis of Cointegrating Vectors, Journal of Economic Dynamics and Control, 12, 231-254.
[10]Johansen, S. (1991), Estimation and hypothesis testing of cointegration vectors in
Gaussian vector autoregressive models, Econometrica, 59, 1551-1580.
[11]Kramer, W. (1986), Least-squares regression when the independent variable follows
an ARIMA process, Journal of the American Statistical Association, 81, 150-154.
[12]Newey, W.K. and K.D. West (1987), A simple Positive, Semi-definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix, Econometrica, 55, 703-708.
[13]Newey, W.K. and K.D. West (1994), Automatic lag selection in covariance matrix
estimation, Review of Economic Studies, 61, 4, 631-653.
[14]Park, J.Y. and P.C.B. Phillips (1988), Statistical Inference in Regressions with Integrated Processes: Part 1, Econometric Theory, 4, 468-498.
[15]Pesaran, H.M. and Y. Shin (1999), An autoregressive distributed lag modelling approach to cointegration analysis, in S. Strom (ed.), Econometrics and Economic Theory in the Twentieth Century: The Ragnar Frisch Centennial Symposium, Cambridge
University Press, Cambridge, UK.
[16]Phillips, P.C.B. (1988), Reflections on Econometric Methodology, Economic Record,
64, 344-359.
[17]Phillips, P.C.B. and B.E. Hansen (1990), Statistical Inference in Instrumental Regressions with I(1) processes, Review of Economic Studies, 57, 99-125.
[18]Phillips, P.C.B. and M. Loretan (1991), Estimating Long-run Economic Equilibria,
Review of Economic Studies, 58, 407-436.
[19]Saikkonen, P. (1991), Asymptotically Efficient Estimation of the Cointegration Regressions, Econometric Theory, 7,1, 1-27.
[20]Schwarz, G. (1978), Estimating the Dimension of a Model, Annals of Statistics, 6,
461-464.
[21]Sims, C.A., J.H. Stock and M.W. Watson (1990), Inference in Linear Time Series
with Some Unit Roots, Econometrica, 58, 113-144.
[22]Stock, J.H. (1987), Asymptotic Properties of Least Squares Estimators of Cointegrating Vectors, Econometrica, 55, 1035-1056.
[23]Stock, J.H. and M.W. Watson (1993), A Simple Estimator of Cointegrating Vectors
in Higher-order Integrated Systems, Econometrica, 61, 783-820.
[24]Wickens, M.R. and T.S. Breusch (1988), Dynamic Specification, the Long Run and
the Estimation of Transformed Regression Models, Economic Journal, 98, (Conference
1988), 189-205.
Table 1

Panel A: a11=0.3
Estimator    Mean bias   Median Bias      MSE     Size
ADL(1,2)        0.0012        0.0006   0.0012     5.75
PL(1,0)         0.0012        0.0006   0.0012     5.75
JOH(2)          0.0012        0.0059   0.0009     0.75
FMLS            0.0269        0.0179   0.0032    14.75
PW-FMLS         0.0099        0.0066   0.0017     8.15

Panel B: a11=0.6
Estimator    Mean bias   Median Bias      MSE     Size
ADL(1,2)        0.0017        0.0016   0.0041     7.10
PL(1,0)         0.0018        0.0017   0.0041     7.05
JOH(2)          0.0077        0.0044   0.0062     1.55
FMLS            0.0660        0.0460   0.0140    26.25
PW-FMLS         0.0202        0.0142   0.0059    11.35

Panel C: a11=0.9
Estimator    Mean bias   Median Bias      MSE     Size
ADL(1,2)        0.0372        0.0399   0.0882    15.10
PL(1,0)         0.0397        0.0447   0.0890    15.35
JOH(2)          0.0336        0.0244   0.1724    14.10
FMLS            0.3202        0.2972   0.1981    63.20
PW-FMLS         0.1409        0.1188   0.1457    29.30

Panel D: a11=0.95
Estimator    Mean bias   Median Bias      MSE     Size
ADL(1,2)        0.1037        0.1574   0.3659    27.90
PL(1,0)         0.1132        0.1670   0.3580    27.80
JOH(2)          0.0965        0.1082   0.4100    30.05
FMLS            0.5170        0.5168   0.4711    74.60
PW-FMLS         0.3315        0.3199   0.7436    46.15
Table 2

                    t0.025, by a11                      t0.975, by a11
Estimator       0.3      0.6      0.9     0.95      0.3      0.6      0.9     0.95
OLS          -0.655   -0.765   -1.074   -1.131    4.199    4.975    9.983   15.363
ADL(1,2)     -2.020   -1.926   -1.676   -1.484    2.084    2.196    3.826    6.048
PL(1,0)      -2.020   -1.966   -1.738   -4.186    2.084    2.299    3.491    5.513
JOH(2)       -0.952   -1.211   -1.908   -2.483    1.571    1.658    3.555    5.798
FMLS         -1.541   -1.629   -2.108   -2.006    3.162    4.369   15.099   38.781
PW-FMLS      -1.855   -1.919   -2.313   -2.792    2.499    2.757    5.398    8.732

DOLS(1)      -1.936   -1.484   -1.387   -1.459    2.947    4.242    9.717   15.052
DOLS(2)      -2.271   -1.882   -1.613   -1.617    2.577    3.815    9.712   14.800
DOLS(3)      -2.361   -2.249   -1.738   -1.764    2.436    3.520    9.575   14.815
DOLS(4)      -2.357   -2.500   -2.040   -2.098    2.420    3.322    9.459   14.626
DOLS(5)      -2.340   -2.675   -2.234   -2.226    2.389    3.085    9.257   14.604
DOLS(6)      -2.454   -2.784   -2.449   -2.420    2.429    2.999    9.015   14.441
DOLS(7)      -2.513   -2.872   -2.759   -2.585    2.375    2.886    8.699   14.601
DOLS(8)      -2.533   -2.934   -2.867   -2.723    2.430    2.904    8.524   14.093
DOLS(9)      -2.513   -2.895   -3.012   -2.804    2.401    2.927    8.133   13.885
DOLS(10)     -2.514   -2.871   -3.068   -2.992    2.318    2.851    7.762   13.723
DOLS(11)     -2.497   -2.933   -3.289   -3.016    2.363    2.806    7.599   13.462
DOLS(12)     -2.527   -2.961   -3.465   -3.117    2.384    2.799    7.389   13.037
DOLS(13)     -2.527   -3.005   -3.662   -3.356    2.423    2.857    7.184   12.671
DOLS(14)     -2.574   -3.042   -3.745   -3.565    2.461    2.894    6.936   12.367
DOLS(15)     -2.583   -3.032   -3.959   -3.624    2.390    2.906    6.699   12.207
DOLS(16)     -2.563   -3.076   -4.124   -3.793    2.341    2.824    6.547   12.148
DOLS(17)     -2.507   -2.991   -4.315   -3.980    2.348    2.873    6.396   11.788
DOLS(18)     -2.533   -3.047   -4.535   -4.147    2.388    2.855    6.304   11.601
DOLS(19)     -2.454   -2.985   -4.646   -4.316    2.467    2.876    6.253   11.401
DOLS(20)     -2.469   -2.962   -4.724   -4.422    2.493    2.908    6.236   11.117

DGLS(1)      -1.467   -0.116    5.141    5.974    2.585    5.345   10.632   12.159
DGLS(2)      -1.899   -0.984    3.659    4.602    2.230    3.382    8.989   10.715
DGLS(3)      -2.049   -1.424    2.641    3.688    2.086    2.777    7.605    9.578
DGLS(4)      -2.083   -1.758    1.662    2.868    2.088    2.548    6.740    8.730
DGLS(5)      -2.073   -1.930    0.885    2.316    2.073    2.442    6.131    8.208
DGLS(6)      -2.098   -2.053    0.288    1.777    2.071    2.384    5.722    7.902
DGLS(7)      -2.115   -2.118   -0.255    1.159    2.051    2.293    5.395    7.605
DGLS(8)      -2.166   -2.184   -0.567    0.715    2.031    2.256    5.155    7.415
DGLS(9)      -2.143   -2.171   -0.856    0.494    2.045    2.173    4.931    7.172
DGLS(10)     -2.145   -2.193   -1.023    0.077    1.997    2.152    4.816    7.181
DGLS(11)     -2.161   -2.237   -1.241   -0.206    2.049    2.196    4.626    6.966
DGLS(12)     -2.156   -2.258   -1.366   -0.394    2.074    2.228    4.537    6.933
DGLS(13)     -2.140   -2.266   -1.481   -0.611    2.045    2.250    4.419    6.845
DGLS(14)     -2.158   -2.307   -1.656   -0.839    2.082    2.250    4.344    6.812
DGLS(15)     -2.171   -2.279   -1.850   -1.003    2.071    2.219    4.284    6.671
DGLS(16)     -2.166   -2.286   -1.909   -1.201    2.052    2.207    4.124    6.416
DGLS(17)     -2.128   -2.272   -2.050   -1.269    2.036    2.224    4.038    6.322
DGLS(18)     -2.186   -2.324   -2.167   -1.287    2.092    2.211    3.969    6.347
DGLS(19)     -2.131   -2.251   -2.245   -1.481    2.070    2.176    3.910    6.290
DGLS(20)     -2.121   -2.266   -2.324   -1.690    2.070    2.194    3.844    6.121
Table 3

                      Mean Bias, by a11              Median Bias, by a11                MSE, by a11
Estimator Crit.     0.3     0.6     0.9    0.95     0.3     0.6     0.9    0.95     0.3     0.6     0.9    0.95
ADL       AIC     0.001   0.004   0.042   0.062   0.001   0.003   0.046   0.173   0.001   0.004   0.101   0.211
ADL       SIC     0.002   0.005   0.048   0.067   0.001   0.003   0.048   0.172   0.001   0.004   0.094   0.199
ADL       HQ      0.001   0.004   0.045   0.062   0.001   0.003   0.046   0.174   0.001   0.004   0.098   0.187
DOLS      AIC     0.002   0.004   0.081   0.270   0.001   0.003   0.065   0.257   0.002   0.005   0.077   0.261
DOLS      SIC     0.006   0.015   0.113   0.289   0.004   0.011   0.097   0.274   0.001   0.005   0.080   0.276
DOLS      HQ      0.004   0.009   0.097   0.280   0.003   0.007   0.080   0.269   0.001   0.005   0.082   0.269
DGLS      AIC     0.003   0.013   0.440   0.852   0.001   0.009   0.389   0.967   0.002   0.005   0.369   0.882
DGLS      SIC     0.008   0.038   0.915   1.098   0.006   0.026   0.970   1.115   0.002   0.008   0.910   1.237
DGLS      HQ      0.006   0.026   0.726   1.039   0.004   0.018   0.816   1.083   0.001   0.006   0.672   1.141
PL        AIC     0.001   0.003   0.038   0.131   0.001   0.002   0.039   0.147   0.001   0.005   0.077   0.193
PL        SIC     0.001   0.003   0.037   0.132   0.001   0.002   0.041   0.153   0.001   0.005   0.077   0.192
PL        HQ      0.001   0.003   0.038   0.131   0.001   0.003   0.041   0.151   0.001   0.005   0.077   0.193
JOH       AIC     0.010   0.009   0.040   0.125   0.006   0.005   0.028   0.109   0.001   0.023   0.058   0.138
JOH       SIC     0.010   0.009   0.040   0.125   0.006   0.005   0.028   0.109   0.001   0.023   0.058   0.138
JOH       HQ      0.010   0.009   0.040   0.125   0.006   0.005   0.028   0.109   0.001   0.023   0.058   0.138

Table 4

                        t0.025, by a11                  t0.975, by a11                   Size, by a11
Estimator Crit.     0.3     0.6     0.9    0.95     0.3     0.6     0.9    0.95     0.3     0.6     0.9    0.95
ADL       AIC    -2.062  -2.029  -1.754  -1.523   2.316   2.505   4.022   6.150    7.05    8.00   14.35   23.40
ADL       SIC    -2.049  -1.970  -1.745  -1.505   2.185   2.312   4.017   6.106    6.65    7.30   14.05   23.10
ADL       HQ     -2.062  -1.970  -1.748  -1.514   2.214   2.331   4.022   6.087    6.95    7.45   14.05   23.25
DOLS      AIC    -2.608  -3.130  -4.803  -4.525   2.618   3.252   8.388  14.059   13.90   22.15   49.85   66.15
DOLS      SIC    -2.393  -2.894  -4.613  -4.345   2.753   3.470   8.773  14.412   11.85   21.10   53.20   66.70
DOLS      HQ     -2.445  -2.977  -4.779  -4.443   2.683   3.371   8.606  14.382   12.15   21.05   52.60   67.00
DGLS      AIC    -2.155  -2.303  -1.911  -0.290   2.229   2.577   9.289  11.510    7.90   11.35   62.70   88.50
DGLS      SIC    -1.977  -1.866   1.803   4.658   2.335   3.062  10.498  12.072    7.85   14.35   96.95   99.75
DGLS      HQ     -2.036  -2.072  -0.477   2.406   2.298   2.739  10.123  11.927    7.05   12.20   86.05   98.05
PL        AIC    -2.133  -2.053  -1.922  -2.256   2.299   2.538   3.713   5.109    7.60    8.55   14.15   22.65
PL        SIC    -2.143  -2.077  -1.945  -2.393   2.287   2.538   3.698   5.208    7.70    8.55   14.15   22.90
PL        HQ     -2.130  -2.036  -1.916  -2.337   2.273   2.504   3.713   5.208    7.65    8.40   14.00   22.80
JOH       AIC    -0.956  -1.196  -1.563  -1.657   1.553   1.652   3.276   5.233    1.05    2.10   10.35   21.80
JOH       SIC    -0.956  -1.196  -1.563  -1.657   1.553   1.652   3.276   5.233    1.05    2.10   10.35   21.80
JOH       HQ     -0.956  -1.196  -1.563  -1.657   1.553   1.652   3.276   5.233    1.05    2.10   10.35   21.80
Table 5

                     Mean Bias          Median Bias              MSE
Estimator Crit.      0.3      0.7      0.3      0.7      0.3      0.7
ADL       AIC     0.0003   0.0008   0.0003   0.0007   0.0001   0.0000
ADL       SIC     0.0001   0.0015   0.0000   0.0012   0.0001   0.0000
ADL       HQ      0.0004   0.0010   0.0003   0.0008   0.0001   0.0000
DOLS      AIC     0.0000   0.0009   0.0001   0.0008   0.0001   0.0000
DOLS      SIC    -0.0003   0.0019  -0.0002   0.0016   0.0001   0.0000
DOLS      HQ     -0.0001   0.0013   0.0000   0.0011   0.0001   0.0000
DGLS      AIC     0.0001   0.0014   0.0001   0.0011   0.0001   0.0000
DGLS      SIC    -0.0003   0.0032  -0.0002   0.0024   0.0001   0.0000
DGLS      HQ     -0.0001   0.0021   0.0000   0.0017   0.0001   0.0000
PL        AIC     0.0003  -0.0009   0.0002  -0.0006   0.0001   0.0000
PL        SIC     0.0003  -0.0009   0.0002  -0.0006   0.0001   0.0000
PL        HQ      0.0003  -0.0009   0.0002  -0.0006   0.0001   0.0000
JOH       AIC     0.0033   0.0008    0.002   0.0003   0.0001   0.0000
JOH       SIC     0.0033   0.0008    0.002   0.0003   0.0001   0.0000
JOH       HQ      0.0033   0.0008    0.002   0.0003   0.0001   0.0000
FMLS      -       0.0049   0.0027   0.0031   0.0021   0.0002    0.001
PW-FMLS   -       0.0015  -0.0014   0.0010  -0.0001   0.0001   0.0002

(Column headings 0.3 and 0.7 are values of a11.)
Table 6

                     t0.025           t0.975             Size
Estimator Crit.     0.3     0.7     0.3     0.7     0.3     0.7
ADL       AIC    -2.040  -1.873   2.305   2.503    6.90    9.55
ADL       SIC    -2.016  -1.706   2.154   2.735    6.15   10.05
ADL       HQ     -2.009  -1.840   2.246   2.493    6.45    9.15
DOLS      AIC    -2.249  -2.387   2.300   3.286    8.90   18.60
DOLS      SIC    -2.177  -2.165   2.232   3.736    8.15   21.90
DOLS      HQ     -2.229  -2.307   2.278   3.467    8.65   19.75
DGLS      AIC    -2.118  -1.917   2.167   2.763    6.75   11.65
DGLS      SIC    -2.093  -1.615   2.073   3.572    6.25   19.45
DGLS      HQ     -2.099  -1.799   2.143   3.163    6.60   14.45
PL        AIC    -2.006  -2.274   2.058   2.282    5.65    8.35
PL        SIC    -2.006  -2.274   2.058   2.282    5.65    8.35
PL        HQ     -2.006  -2.274   2.058   2.282    5.65    8.35
JOH       AIC    -1.001  -1.454   1.663   1.809    1.05    2.55
JOH       SIC    -1.001  -1.454   1.663   1.809    1.05    2.55
JOH       HQ     -1.001  -1.454   1.663   1.809    1.05    2.55
FMLS      -      -1.918  -7.411   2.739   6.766    10.9   49.55
PW-FMLS   -      -1.873  -2.998   2.188   2.218    6.00   10.60

(Column headings 0.3 and 0.7 are values of a11.)
[Figures 1A-1D: Mean bias (T=100), one panel each for a11=0.3, 0.6, 0.9 and 0.95. Horizontal axis: 1 to 20. Series: DOLS(p), ADL(1,2), DGLS(p).]
[Figures 2A-2D: Median bias (T=100), one panel each for a11=0.3, 0.6, 0.9 and 0.95. Horizontal axis: 1 to 20. Series: DOLS(p), ADL(1,2), DGLS(p).]
[Figures 3A-3D: MSE (T=100), one panel each for a11=0.3, 0.6, 0.9 and 0.95. Horizontal axis: 1 to 20. Series: DOLS(p), ADL(1,2), DGLS(p).]
[Figures 4A-4D: Empirical size (T=100), one panel each for a11=0.3, 0.6, 0.9 and 0.95. Horizontal axis: 1 to 20. Series: DOLS(p), ADL(1,2), DGLS(p).]
[Figures 5A-5D: ADL model selection, one panel each for a11=0.3, 0.6, 0.9 and 0.95. Series: AIC, SIC, HQ. Categories: ADL(1,1), ADL(1,2), ADL(2,2), ADL(1,3), ADL(2,3), ADL(3,3), ADL(1,4), ADL(2,4), ADL(3,4), ADL(4,4), ADL(0,0), ADL(0,1), ADL(1,0), ADL(2,1).]
[Figures 6A-6D: DOLS model selection, one panel each for a11=0.3, 0.6, 0.9 and 0.95. Series: AIC, SIC, HQ. Categories: OLS, DOLS2, DOLS4, DOLS6, DOLS8, DOLS10, DOLS12, DOLS14, DOLS16, DOLS18, DOLS20.]
[Figures 7A-7D: DGLS model selection, one panel each for a11=0.3, 0.6, 0.9 and 0.95. Series: AIC, SIC, HQ. Categories: OLS, DGLS2, DGLS4, DGLS6, DGLS8, DGLS10, DGLS12, DGLS14, DGLS16, DGLS18, DGLS20.]
[Figures 8A-8B: ADL model selection, one panel each for a11=0.3 and 0.7. Series: AIC, SIC, HQ. Categories: ADL(1,1,1), ADL(1,2,1), ADL(1,2,2), ADL(2,2,2), ADL(1,3,3), ADL(2,3,3), ADL(3,3,3), ADL(1,4,4), ADL(2,4,4), ADL(3,4,4), ADL(4,4,4), ADL(1,1,0), ADL(1,2,0), ADL(4,4,0).]
[Figures 9A-9B: DOLS model selection, one panel each for a11=0.3 and 0.7. Series: AIC, SIC, HQ. Categories: OLS, DOLS(1,1), DOLS(2,2), DOLS(3,3), DOLS(4,4), DOLS(1,0), DOLS(2,0), DOLS(3,0), DOLS(4,0).]
[Figures 10A-10B: DGLS model selection, one panel each for a11=0.3 and 0.7. Series: AIC, SIC, HQ. Categories: OLS, DGLS(1,1), DGLS(2,2), DGLS(3,3), DGLS(4,4), DGLS(1,0), DGLS(2,0), DGLS(3,0), DGLS(4,0).]