Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries
http://www.archive.org/details/improvedratefornOOquah
Working Paper
Department of Economics

AN IMPROVED RATE FOR NONNEGATIVE DEFINITE
CONSISTENT COVARIANCE MATRIX ESTIMATION WITH
HETEROGENEOUS DEPENDENT DATA
Danny Quah
No. 529
July 1989

Massachusetts Institute of Technology
50 Memorial Drive
Cambridge, Mass. 02139
An Improved Rate for Nonnegative Definite Consistent
Covariance Matrix Estimation With Heterogeneous Dependent Data
by
Danny Quah *
July 1989.

* Department of Economics, MIT and NBER. (dquah@dolphin.mit.edu). I thank Jeffrey Wooldridge for discussions, and the MIT Statistics Center for its hospitality. Wooldridge also kindly pointed out a serious mistake in a first draft. All errors and misinterpretations are mine.
Abstract
This paper improves on previous rates at which lag lengths are allowed to grow for consistent covariance matrix estimation with heterogeneous dependent data. Using a new WLLN, we give a consistency result for growth rates of $o(n^{1/3})$; the previous rate was $o(n^{1/4})$. This rate equals that of Berk's autoregressive spectral density estimator for well-behaved stationary contexts, and thus may be best possible outside of very special cases.

1. Introduction
Estimating consistent covariance matrices is one of the most common problems confronting the applied researcher. The need to do this arises in econometric work ranging from Euler equation estimation by GMM methods (Hansen and Singleton [1982]) to tests for integration and cointegration (Phillips [1987], Phillips and Ouliaris [1988], Phillips and Perron [1988], and Stock [1988]). Thus the results of Hansen [1982] (for stationary data), and White [1984], White and Domowitz [1984], and Newey and West [1987] (for heterogeneous dependent data) on consistent covariance matrix estimation have been very widely applied.
Newey and West [1987] adapted results in White [1984] and White and Domowitz [1984] to obtain a class of non-negative definite consistent covariance matrix estimators for dependent non-iid data. Their correction to arguments in White [1984] 6.19 led to a $o(n^{1/4})$ rate for increasing lag length to preserve consistency. The same rate appears in Gallant and White [1988] 6.18, and has been used quite generally (see for instance Phillips [1987] and Phillips and Perron [1988]).

This paper improves that rate to $o(n^{1/3})$, the same as that for Berk's autoregressive (time domain) spectral density estimator for strictly stationary data. Except for very special cases, this new rate of $o(n^{1/3})$ may be the best possible. Since the choice of lag length is often one of the most troubling (and seemingly arbitrary) to applied researchers, this rate improvement should permit greater flexibility without risking the loss of consistent inference.
Note that this $o(n^{1/3})$ rate is exactly that originally in the conclusion of White's [1984] Theorem 6.20. As Newey and West [1987] have pointed out however, this result did not follow from White's proof. This paper therefore uses a different argument to re-establish that result, under essentially the same regularity assumptions. The proof is remarkably straightforward.
2. Notation

The p-norm of a random variable (rv) $X$ that is defined on a given probability space $(\Omega, \mathcal{F}, \Pr)$ is denoted $\|X\|_p = (E|X|^p)^{1/p}$; the absolute value of $X$ is denoted $|X|$. Recall that $\{X_t, t \ge 1\}$ is said to be $\alpha$-mixing of size $-q$ if the $\alpha$-mixing coefficients tend to zero and satisfy $\alpha_m = O(m^{\lambda})$ for some $\lambda < -q$, so that $\sum_m \alpha_m^{1/q} < \infty$. Similarly $X$ is said to be $\phi$-mixing of size $-q$ if its $\phi$-mixing coefficients satisfy analogous conditions. See for example Gallant and White [1988]. Let $\phi^R$ denote the $\phi$-mixing coefficients when $X_t$ is reversed in time; let $\phi^+_m = \max(\phi_m, \phi^R_m)$. When $X$ is Gaussian and covariance stationary, $\phi^+_m = \phi_m = \phi^R_m$ for all $m$. It will be convenient below to place restrictions on $\phi^+$.

We will need the following:

Lemma 2.1 (Davydov's Inequality): For all $p, q$ such that $\frac{1}{p} + \frac{1}{q} < 1$,

$$|EX_t X_{t-j} - EX_t\,EX_{t-j}| \le 15\,\alpha_j^{1 - \frac{1}{p} - \frac{1}{q}}\,\|X_t\|_p\,\|X_{t-j}\|_q .$$

See for example Philipp [1986, p. 241] Lemma 3.1 for this form of Davydov's result.

Lemma 2.2 (Peligrad's Inequality): For all $p, q$ such that $\frac{1}{p} + \frac{1}{q} = 1$,

$$|EX_t X_{t-j} - EX_t\,EX_{t-j}| \le 2\,(\phi_j)^{1/p}\,(\phi^R_j)^{1/q}\,\|X_t\|_p\,\|X_{t-j}\|_q .$$

This was first obtained in Peligrad [1983], and improves by $(\phi^R_j)^{1/q}$ on the earlier long-standing inequality. White [1984] 6.16 is a special case of 2.1; when $X$ is covariance stationary, 2.2 is a strict improvement on White [1984] 6.16.
3. Results

The first result is a weak law of large numbers (WLLN) for a process that fails the usual weak dependence assumptions for mixingales (and thus for mixing sequences as well). Further the process will have growing absolute moments so that it is not an $L^1$-mixingale (Andrews [1988]). We give the regularity assumptions in two sets, one set of assumptions on the process itself, and the other on a set of weights.

Assumption 3.1: Suppose $\{X_t, t \ge 1\}$ on $(\Omega, \mathcal{F}, \Pr)$ satisfies $EX_t = 0$ for all $t$, and assume further that (i.) $\sup_t \|X_t\|_{4r} < \infty$ for some $r > 1$; and (ii.) either (a.) $X_t$ is $\alpha$-mixing of size $-2r/(r-1)$ or (b.) $X_t$ is $\phi^+$-mixing of size $-2$.

For convenience in notation, let $X_t = 0$ for all $t \le 0$.

Assumption 3.2: Suppose $w_n(j)$, with $n \ge 1$, $j \ge 0$, is a double array of uniformly bounded non-negative weights such that as $n \to \infty$, we have $w_n(j) \to 1$ for each $j$.

These assumptions are essentially those in Newey and West [1987] Theorem 2, or White [1984] Theorem 6.20 where applicable.
The first result is a WLLN for dependent double arrays that will be used below.

Theorem 3.3: Assume (3.1) and (3.2), and define the double array of rv's:

$$Z_{nt} = \sum_{j=0}^{l(n)} w_n(j)\,\bigl(X_t X_{t-j} - EX_t X_{t-j}\bigr)$$

for some sequence of nonnegative integers $l(n)$, with $l(n) = o(n^{1/3})$. Then the double array $\{Z_{nt}\}$ satisfies $n^{-1}\sum_{t=1}^{n} Z_{nt} \to 0$ in probability as $n \to \infty$.
Remarks
1. Notice that if $l(n) \uparrow \infty$, $Z_{nt}$ has stronger long term dependence than does a mixingale. Further, $Z_{nt}$ will have growing moments for $l(n)$ increasing with $n$.
2. Clearly the conclusion remains true if $l(n)$ is fixed, or if $Z_{nt}$ is defined to exclude the $j = 0$ term.
3. Our improved rate derives from using this WLLN in place of the implication rule as in White's [1984] proof of his Theorem 6.20.
4. By first giving this WLLN, it should be clear that our proof differs from that in White [1984] Chapter 6, by a change in the order of summation.
The principal result is convergence in probability for a weighted estimator of $\mathrm{Var}\bigl(n^{-1/2}\sum_{t=1}^{n} X_t\bigr)$. We state this as follows:

Theorem 3.4: Assume (3.1) and (3.2), and let $l(n)$ be a sequence of positive integers such that as $n \to \infty$, $l(n) \uparrow \infty$, and $l(n) = o(n^{1/3})$. Then as $n \to \infty$,

$$n^{-1}\sum_{t=1}^{n} X_t^2 + 2\,n^{-1}\sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} X_t X_{t-j} - \mathrm{Var}\Bigl(n^{-1/2}\sum_{t=1}^{n} X_t\Bigr) \to 0 \quad \text{in probability.}$$
Remarks
1. Apply Theorem 3.4 to the proof of Theorem 2 in Newey and West [1987] to argue convergence of the second and third terms in their expression (9). The other terms similarly converge to zero by $l(n) \uparrow \infty$ and $l(n) = o(n^{1/3})$. The result here therefore implies the same conclusion as their Theorem 2, but with greater flexibility in choice of lag length ($o(n^{1/3})$ instead of $o(n^{1/4})$).
2. Similarly, the results in Chapter 6 of Gallant and White [1988] remain intact with their Assumption TL changed to $m_n = o(n^{1/3})$ from the typographically incorrect $O(n^{1/4})$ on p. 101.
3. The Newey-West result is widely used in applications: see for example Phillips [1987] Theorem 4.2, Phillips and Perron [1988], Phillips and Ouliaris [1988], and elsewhere. Those results therefore all hold with an even more flexible choice for the lag length.
4. The rate $o(n^{1/3})$ is also that used in autoregressive spectral density estimation under assumptions that include strict stationarity, absolute summability of the Wold moving average coefficients, and finite fourth moments on the iid innovations (e.g. Berk [1974] Theorem 1).
5. Fuller [1976, Theorem 7.2.3] and Anderson [1971, Chapter 9] imply that a $o(n)$ rate can be used for the strictly stationary case. It may in fact be possible to adapt the "unraveling" method used there for the nonstationary mixing situation considered here.
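The weighted estimator in Theorem 3.4 can be sketched in a few lines for a scalar, mean-zero series. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the Bartlett weights $w_n(j) = 1 - j/(l(n)+1)$ are one choice that is bounded, non-negative, and tends to 1 for each fixed $j$ as $l(n) \uparrow \infty$, and the lag rule $l(n) = \lfloor n^{0.3} \rfloor$ is just one sequence that is $o(n^{1/3})$.

```python
import math


def bartlett_weights(lag):
    """Bartlett weights w(j) = 1 - j / (lag + 1) for j = 1, ..., lag (an assumed choice)."""
    return [1.0 - j / (lag + 1.0) for j in range(1, lag + 1)]


def long_run_variance(x, lag, weights=None):
    """Weighted estimator n^{-1} sum X_t^2 + 2 n^{-1} sum_j w(j) sum_t X_t X_{t-j}.

    Assumes the series x already has mean zero, as in Assumption 3.1,
    so no demeaning is performed here.
    """
    n = len(x)
    if weights is None:
        weights = bartlett_weights(lag)
    estimate = sum(v * v for v in x) / n
    for j in range(1, lag + 1):
        # Cross products X_t X_{t-j} for t = j+1, ..., n (0-based indices j..n-1).
        cross = sum(x[t] * x[t - j] for t in range(j, n))
        estimate += 2.0 * weights[j - 1] * cross / n
    return estimate


def lag_length(n, exponent=0.3):
    """Hypothetical lag rule l(n) = floor(n ** exponent); any exponent below 1/3 is o(n^{1/3})."""
    return int(math.floor(n ** exponent))
```

For a sample `x` of length `n`, a call such as `long_run_variance(x, lag_length(len(x)))` gives the estimate; other weight sequences satisfying Assumption 3.2 can be passed through `weights`.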
4. Proofs

In the sequel, the symbols $K$ and $K'$ will denote arbitrary finite constants, not necessarily the same throughout.

Our first result is a WLLN. It is convenient to give the proof in two parts; the first part is a variance bound which may be useful in other applications. This bound is also that part of the results that gives the binding restriction on the mixing and moment conditions (3.1); thus if further improvement is forthcoming, it is likely to obtain by giving a sharper inequality here.

Proof of Theorem 3.3: First bound $\mathrm{Var}\bigl(\sum_{t=1}^{n} Z_{nt}\bigr)$. Write $\mathrm{Var}\bigl(\sum_{t=1}^{n} Z_{nt}\bigr) = \bigl|\sum_{t=1}^{n}\sum_{s=1}^{n} EZ_{nt}Z_{ns}\bigr|$. Decompose this double sum of products into products close together, and products far apart. By the triangle inequality,

$$\mathrm{Var}\Bigl(\sum_{t=1}^{n} Z_{nt}\Bigr) \le \sum_{|s-t| \le 2l(n)} |EZ_{nt}Z_{ns}| + \sum_{|s-t| > 2l(n)} |EZ_{nt}Z_{ns}| .$$

Consider the first summand. For $s, t$ between 1 and $n$, the Cauchy-Schwarz inequality implies:

$$|EZ_{nt}Z_{ns}| \le \|Z_{nt}\|_2\,\|Z_{ns}\|_2 \le \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

For each $t$, there are at most $4l(n) + 1$ points $s$ for which $|s - t| \le 2l(n)$. Thus the first summand satisfies:

$$\sum_{|s-t| \le 2l(n)} |EZ_{nt}Z_{ns}| \le n \cdot (4l(n) + 1) \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

Next consider the second summand. For $t$ and $s$ such that $|s - t| > 2l(n)$, Davydov's Inequality implies:

$$|EZ_{nt}Z_{ns}| \le 15\,\alpha_{l(n)}^{1-\frac{1}{r}} \cdot \|Z_{nt}\|_{2r}\,\|Z_{ns}\|_{2r} \le 15\,\alpha_{l(n)}^{1-\frac{1}{r}} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 ,$$

while Peligrad's Inequality implies:

$$|EZ_{nt}Z_{ns}| \le 2\,\phi^+_{l(n)} \cdot \|Z_{nt}\|_2\,\|Z_{ns}\|_2 \le 2\,\phi^+_{l(n)} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

There are at most $n - 4l(n) - 1$ points $s$ such that $|s - t| > 2l(n)$. Thus the second summand obeys:

$$\sum_{|s-t| > 2l(n)} |EZ_{nt}Z_{ns}| \le n\,(n - 4l(n) - 1) \cdot 15\,\alpha_{l(n)}^{1-\frac{1}{r}} \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 \le n^2 \cdot 15\,\alpha_{l(n)}^{1-\frac{1}{r}} \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 .$$

Similarly, the second summand also satisfies:

$$\sum_{|s-t| > 2l(n)} |EZ_{nt}Z_{ns}| \le n^2 \cdot 2\,\phi^+_{l(n)} \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

For any $p$ such that $2 \le p \le 2r$, Minkowski's inequality gives $\|Z_{nt}\|_p \le \sum_{j=0}^{l(n)} w_n(j)\,\|X_t X_{t-j} - EX_t X_{t-j}\|_p$. By 3.1.i, $\sup_t \sup_j \|X_t X_{t-j} - EX_t X_{t-j}\|_p < \infty$. Further, since $w_n(j)$ is uniformly bounded, we have that for some finite constant $K$,

$$\|Z_{nt}\|_p \le K \cdot (l(n) + 1) \;\Longrightarrow\; \|Z_{nt}\|_p^2 \le K' \cdot (l(n) + 1)^2 .$$

Using this for $p = 2$ and $2r$ in the first and second summands, conclude that there must exist some finite constant $K'$ such that:

$$\mathrm{Var}\Bigl(\sum_{t=1}^{n} Z_{nt}\Bigr) \le K'\Bigl\{ n \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + n^2\,(l(n) + 1)^2 \cdot \alpha_{l(n)}^{1-\frac{1}{r}} \Bigr\}$$

and

$$\mathrm{Var}\Bigl(\sum_{t=1}^{n} Z_{nt}\Bigr) \le K'\Bigl\{ n \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + n^2\,(l(n) + 1)^2 \cdot \phi^+_{l(n)} \Bigr\} .$$

If 3.1.ii.a, then $\alpha_{l(n)} = O(l(n)^{\lambda})$ for some $\lambda < -2r/(r-1)$, so that $l(n)^2\,\alpha_{l(n)}^{1-\frac{1}{r}} = o(1)$. Similarly, if 3.1.ii.b, then $\phi^+_{l(n)} = O(l(n)^{\lambda'})$ for some $\lambda' < -2$, or $l(n)^2\,\phi^+_{l(n)} = o(1)$. Then, using Chebyshev's inequality, for any $\epsilon > 0$,

$$\Pr\Bigl( \Bigl| n^{-1}\sum_{t=1}^{n} Z_{nt} \Bigr| \ge \epsilon \Bigr) \le \frac{K'}{\epsilon^2} \Bigl\{ n^{-1} \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + (l(n) + 1)^2 \cdot \alpha_{l(n)}^{1-\frac{1}{r}} \Bigr\}$$

and

$$\Pr\Bigl( \Bigl| n^{-1}\sum_{t=1}^{n} Z_{nt} \Bigr| \ge \epsilon \Bigr) \le \frac{K'}{\epsilon^2} \Bigl\{ n^{-1} \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + (l(n) + 1)^2 \cdot \phi^+_{l(n)} \Bigr\} .$$

Given $l(n) = o(n^{1/3})$, and 3.1.ii, one or the other of the right hand sides above tend to zero as $n \to \infty$. Therefore, as $n \to \infty$, $n^{-1}\sum_{t=1}^{n} Z_{nt} \to 0$ in probability.
Q.E.D.

Next, turn to the main result (3.4). The proof of this is an abbreviation and modification of ideas in Newey and West [1987] and White [1984] Chapter 6. The crucial difference is in replacing White's (corrected) Lemma 6.19, the implication rule, and an early use of Chebyshev's Inequality with our WLLN for double arrays (3.3). With this WLLN, (3.4) follows in a remarkably straightforward way.

Proof of Theorem 3.4: Proceed in two steps. First, argue the expected version with truncation and weighting differs from $\mathrm{Var}\bigl(n^{-1/2}\sum_{t=1}^{n} X_t\bigr)$ by a quantity that vanishes as $n \to \infty$. Then show the feasible estimator converges in probability to its expectation. Begin with the expected version:
$$n^{-1}\sum_{t=1}^{n} EX_t^2 + 2\,n^{-1}\sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} EX_t X_{t-j} - \mathrm{Var}\Bigl(n^{-1/2}\sum_{t=1}^{n} X_t\Bigr) = 2\,n^{-1}\sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} EX_t X_{t-j} \;-\; 2\,n^{-1}\sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} EX_t X_{t-j} .$$

By Davydov's Inequality,

$$|EX_t X_{t-j}| \le 15\,\alpha_j^{1-\frac{1}{2r}}\,\|X_t\|_{4r} \cdot \|X_{t-j}\|_{4r} ,$$

and by Peligrad's Inequality,

$$|EX_t X_{t-j}| \le 2\,\phi^+_j\,\|X_t\|_2 \cdot \|X_{t-j}\|_2 .$$

By 3.1, $\sup_t \|X_t\|_{4r} < \infty$ and $\sup_t \|X_t\|_2 < \infty$, so that:

$$\Bigl| n^{-1}\sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} EX_t X_{t-j} \Bigr| \le K \sum_{j=1}^{l(n)} |w_n(j) - 1| \cdot \alpha_j^{1-\frac{1}{2r}} ,$$

and similarly,

$$\Bigl| n^{-1}\sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} EX_t X_{t-j} \Bigr| \le K \sum_{j=1}^{l(n)} |w_n(j) - 1| \cdot \phi^+_j .$$

From 3.1.ii, we have that 3.1.ii.a and 3.1.ii.b imply either $\sum_{j=1}^{\infty} \alpha_j^{1-\frac{1}{2r}} < \infty$ or $\sum_{j=1}^{\infty} \phi^+_j < \infty$, respectively. Since $w_n(j) \to 1$ for each $j$, the dominated convergence theorem then implies that $\sum_{j=1}^{l(n)} |w_n(j) - 1| \cdot \alpha_j^{1-\frac{1}{2r}} \to 0$ or $\sum_{j=1}^{l(n)} |w_n(j) - 1| \cdot \phi^+_j \to 0$ as $n \to \infty$. By a similar argument, we have that:

$$\Bigl| n^{-1}\sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} EX_t X_{t-j} \Bigr| \le K \sum_{j=l(n)+1}^{n-1} \alpha_j^{1-\frac{1}{2r}}$$

and

$$\Bigl| n^{-1}\sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} EX_t X_{t-j} \Bigr| \le K \sum_{j=l(n)+1}^{n-1} \phi^+_j .$$

Again, if 3.1.ii.a, $\sum_{j=1}^{\infty} \alpha_j^{1-\frac{1}{2r}}$ converges to a finite quantity. Similarly, if 3.1.ii.b, $\sum_{j=1}^{\infty} \phi^+_j$ converges to a finite quantity. In either case, this implies the respective right hand side above converges to 0, provided that $l(n) \uparrow \infty$ as $n \to \infty$. This completes the first part of the proof.

Second, we show the required convergence in probability. The estimator

$$n^{-1}\sum_{t=1}^{n} X_t^2 + 2\,n^{-1}\sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} X_t X_{t-j}$$

differs from its expectation by:

$$n^{-1}\sum_{t=1}^{n} (X_t^2 - EX_t^2) + 2\,n^{-1}\sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} (X_t X_{t-j} - EX_t X_{t-j}) .$$

For ease of notation, define $X_t = 0$ for all $t \le 0$, and rearrange orders of summation in the second term:

$$2\,n^{-1}\sum_{t=1}^{n} \sum_{j=1}^{l(n)} w_n(j)\,(X_t X_{t-j} - EX_t X_{t-j}) .$$

Define the double array of rv's $Z_{nt} = \sum_{j=1}^{l(n)} w_n(j)\,(X_t X_{t-j} - EX_t X_{t-j})$, and apply Theorem 3.3 to $Z_{nt}$; the second term therefore converges in probability to zero. Similarly apply Theorem 3.3 (with $l(n) = 0$) to $X_t^2 - EX_t^2$, so that the first term $n^{-1}\sum_{t=1}^{n} (X_t^2 - EX_t^2)$ converges in probability to zero as well. This completes the proof.
Q.E.D.

Notice that the first part of the proof uses considerably weaker mixing-moment conditions than necessary for the second part of the proof, which is essentially a repeated application of Theorem 3.3. Thus an improvement in the lag growth rate, if forthcoming, might most easily be found by obtaining a sharper inequality than that available in the proof of Theorem 3.3.
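The role of the $o(n^{1/3})$ restriction can be checked numerically. The sketch below is illustrative only and assumes that the term contributed by nearby products in the Chebyshev step for Theorem 3.3 is proportional to $n^{-1}(4l(n)+1)(l(n)+1)^2$, with the constant $K'/\epsilon^2$ dropped. The function names and the two lag rules $\lfloor n^{0.3} \rfloor$ and $\lfloor n^{0.4} \rfloor$ are hypothetical choices, one below and one above the $n^{1/3}$ threshold.

```python
def near_term(n, lag):
    """Term n^{-1} (4 l + 1) (l + 1)^2 from the Chebyshev bound, constants dropped."""
    return (4 * lag + 1) * (lag + 1) ** 2 / n


def lag_rule(n, exponent):
    """Hypothetical lag rule l(n) = floor(n ** exponent)."""
    return int(n ** exponent)


sample_sizes = [10**3, 10**6, 10**9]

# Exponent 0.3 is below 1/3, so the term should shrink as n grows.
slow = [near_term(n, lag_rule(n, 0.3)) for n in sample_sizes]

# Exponent 0.4 is above 1/3, so the term should grow as n grows.
fast = [near_term(n, lag_rule(n, 0.4)) for n in sample_sizes]

print(slow)
print(fast)
```

With the slower rule the term falls as the sample size rises, while with the faster rule it rises, which is why this part of the bound binds at $n^{1/3}$.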
References
Andrews, Donald W.K. (1988): "Laws of Large Numbers for Dependent Non-Identically Distributed Random Variables," Econometric Theory, 4, no. 3, December, 458-467.
Anderson, T.W. (1971): The Statistical Analysis of Time Series, New York: John Wiley and Sons.
Berk, K.N. (1974): "Consistent Autoregressive Spectral Estimates," Annals of Statistics, 2, no. 3, 489-502.
Fuller, W.A. (1976): Introduction to Statistical Time Series, New York: John Wiley and Sons.
Gallant, A.R. and H. White (1988): A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models, New York: Basil Blackwell.
Hansen, L.P. (1982): "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50, no. 4, July, 1029-1054.
Hansen, L.P. and K.J. Singleton (1982): "Generalized Instrumental Variables Estimation of Nonlinear Rational Expectations Models," Econometrica, 50, no. 5, September, 1269-1286.
McLeish, D.L. (1975): "A Maximal Inequality and Dependent Strong Laws," Annals of Probability, 3, no. 5, 829-839.
Newey, W.K. and K.D. West (1987): "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 55, no. 3, May, 703-708.
Peligrad, M. (1983): "A Note on Two Measures of Dependence and Mixing Sequences," Adv. Appl. Prob., 15, 461-464.
Philipp, W. (1986): "Invariance Principles for Independent and Weakly Dependent Random Variables," pp. 225-268 in E. Eberlein and M.S. Taqqu (eds.), Dependence in Probability and Statistics, Boston: Birkhauser.
Phillips, P.C.B. (1987): "Time Series Regression with a Unit Root," Econometrica, 55, no. 2, March, 277-301.
Phillips, P.C.B. and S. Ouliaris (1986): "Testing for Cointegration," Cowles Foundation Discussion Paper no. 890, Yale University.
Phillips, P.C.B. and P. Perron (1988): "Testing for a Unit Root in Time Series Regression," Biometrika, 75, no. 2, 335-346.
Stock, J.H. (1988): "A Class of Tests for Integration and Cointegration," Kennedy School of Government, Harvard University mimeo, March.
White, H. (1984): Asymptotic Theory for Econometricians, New York: Academic Press.
White, H., and I. Domowitz (1984): "Nonlinear Regression with Dependent Observations," Econometrica, 52, 143-161.