An Improved Rate for Nonnegative Definite Consistent Covariance Matrix Estimation With Heterogeneous Dependent Data

by Danny Quah *

Working Paper No. 529, Department of Economics, Massachusetts Institute of Technology, 50 Memorial Drive, Cambridge, Mass. 02139. July 1989.

* Department of Economics, MIT and NBER (dquah@dolphin.mit.edu). I thank Jeffrey Wooldridge for discussions, and the MIT Statistics Center for its hospitality. Wooldridge also kindly pointed out a serious mistake in a first draft. All errors and misinterpretations are mine.

Abstract

This paper improves on previous rates at which lag lengths are allowed to grow for consistent covariance matrix estimation with heterogeneous dependent data. Using a WLLN, we give a consistency result for growth rates of $o(n^{1/3})$; the previous rate was $o(n^{1/4})$. This new rate equals that of Berk's autoregressive spectral density estimator for well-behaved stationary contexts, and thus may be best possible outside of very special cases.

1. Introduction

Estimating consistent covariance matrices is one of the most common problems confronting the applied researcher.
The need to do this arises in econometric work ranging from Euler equation estimation by GMM methods (Hansen and Singleton [1982]) to tests for integration and cointegration (Phillips [1987], Phillips and Perron [1988], Phillips and Ouliaris [1988], and Stock [1988]). Thus the results of White [1984], White and Domowitz [1984], Hansen [1982] (for stationary data), and Newey and West [1987] (for heterogeneous dependent data) on consistent covariance matrix estimation have been very widely applied.

Newey and West [1987] adapted results in White [1984] and White and Domowitz [1984] to obtain a class of non-negative definite consistent covariance matrix estimators for dependent non-iid data. Their correction to arguments in White [1984] 6.19 led to a $o(n^{1/4})$ rate for increasing lag length to preserve consistency. The same rate appears in Gallant and White [1988] 6.18, and has been used quite generally (see for instance Phillips [1987] and Phillips and Perron [1988]).

This paper improves that rate to $o(n^{1/3})$, the same as that for Berk's autoregressive (time domain) spectral density estimator for strictly stationary data. Except for very special cases, this may be the best possible. Since the choice of lag length is often one of the most troubling (and seemingly arbitrary) to applied researchers, the new rate of $o(n^{1/3})$ should permit greater flexibility without risking the loss of consistent inference.

Note that this $o(n^{1/3})$ rate is exactly that originally in the conclusion of White's [1984] Theorem 6.20. As Newey and West [1987] have pointed out however, this result did not follow from White's proof. This paper therefore uses a different argument to re-establish that result, under essentially the same regularity assumptions. The proof is remarkably straightforward.

2. Notation
A random variable (rv) $X$ is defined on a given probability space $(\Omega, \mathcal{F}, \Pr)$; the absolute value of $X$ is denoted $|X|$. The $p$-norm of a random variable $X$ is denoted $\|X\|_p$, defined to be $(E|X|^p)^{1/p}$. Recall that $X$ is said to be $\alpha$-mixing of size $-q$ if its $\alpha$-mixing coefficients $\alpha_m$ tend to zero as $m \to \infty$ and satisfy $\alpha_m = O(m^{\lambda})$ for some $\lambda < -q$. Similarly $X$ is said to be $\phi$-mixing of size $-q$ if its $\phi$-mixing coefficients $\phi_m$ satisfy analogous conditions. See for example Gallant and White [1988]. Let $\phi^R_m$ denote the $\phi$-mixing coefficients when $X_t$ is reversed in time; let $\phi^+_m = \max(\phi_m, \phi^R_m)$. When $\{X_t, t \ge 1\}$ is Gaussian and covariance stationary, $\phi_m = \phi^R_m$ for all $m$. It will be convenient below to place restrictions on $\phi^+$.

We will need the following:

Lemma 2.1 (Davydov's Inequality): For all $p$, $q$ such that $\frac{1}{p} + \frac{1}{q} < 1$,

$$|E X_t X_{t-j} - E X_t \, E X_{t-j}| \le 15 \, \alpha_j^{1 - \frac{1}{p} - \frac{1}{q}} \, \|X_t\|_p \, \|X_{t-j}\|_q .$$

See for example Philipp [1986, p. 241] Lemma 3.1 for this form of Davydov's result.

Lemma 2.2 (Peligrad's Inequality): For all $p$, $q$ such that $\frac{1}{p} + \frac{1}{q} = 1$,

$$|E X_t X_{t-j} - E X_t \, E X_{t-j}| \le 2 \, (\phi_j)^{1/p} (\phi^R_j)^{1/q} \, \|X_t\|_p \, \|X_{t-j}\|_q .$$

This was first obtained in Peligrad [1983], and improves by the factor $(\phi^R_j)^{1/q}$ on the earlier long-standing inequality. White [1984] 6.16 is a special case of 2.1; when $X$ is covariance stationary, 2.2 is a strict improvement on White [1984] 6.16.

3. Results

The first result is a weak law of large numbers (WLLN) for a process that fails the usual weak dependence assumptions for mixingales (and thus for mixing sequences as well). Further, the process will have growing absolute moments, so that it is not an $L^1$-mixingale (Andrews [1988]). We first give the regularity assumptions in two sets, one set of assumptions on the process itself, and the other on a set of weights.

Assumption 3.1: Suppose $\{X_t, t \ge 1\}$ on $(\Omega, \mathcal{F}, \Pr)$ satisfies $E X_t = 0$ for all $t$, and assume further that (i.) for some $r > 1$, $\sup_t \|X_t\|_{4r} < \infty$; and (ii.) $X_t$ is either (a.) $\alpha$-mixing of size $-2r/(r-1)$ or (b.) $\phi^+$-mixing of size $-2$.
For convenience in notation, let $X_t = 0$ for all $t \le 0$.

Assumption 3.2: Suppose $w_n(j)$, indexed by $n \ge 1$ and the lag $j$, is a double array of uniformly bounded non-negative weights such that as $n \to \infty$, we have $w_n(j) \to 1$ for each $j$.

These assumptions are essentially those in Newey and West [1987] Theorem 2, or White [1984] Theorem 6.20 where applicable.

The first result is a WLLN for dependent double arrays and will be used below.

Theorem 3.3: Assume (3.1) and (3.2), and define the double array of rv's:

$$Z_{nt} = \sum_{j=0}^{l(n)} w_n(j) \, (X_t X_{t-j} - E X_t X_{t-j})$$

for some sequence of nonnegative integers $l(n)$, with $l(n) = o(n^{1/3})$. Then the double array $\{Z_{nt}\}$ satisfies

$$n^{-1} \sum_{t=1}^{n} Z_{nt} \xrightarrow{p} 0 \quad \text{as } n \to \infty .$$

Remarks

1. Notice that if $l(n) \uparrow \infty$, $Z_{nt}$ has stronger long term dependence than does a mixingale. Further, $Z_{nt}$ will have growing moments for $l(n)$ increasing with $n$.

2. Clearly the conclusion remains true if $l(n)$ is fixed, or if $Z_{nt}$ is defined to exclude the $j = 0$ term.

3. Our improved rate derives from using this WLLN in place of the implication rule as in White's [1984] proof of his Theorem 6.20.

4. By first giving this WLLN, it should be clear that our proof differs from that in White [1984] Chapter 6, by a change in the order of summation.

The principal result is convergence in probability for a weighted estimator of $\mathrm{Var}\left(n^{-1/2} \sum_{t=1}^{n} X_t\right)$. We state this as follows:

Theorem 3.4: Assume (3.1) and (3.2), and let $l(n)$ be a sequence of positive integers such that as $n \to \infty$, $l(n) \uparrow \infty$ and $l(n) = o(n^{1/3})$. Then as $n \to \infty$,

$$n^{-1} \left[ \sum_{t=1}^{n} X_t^2 + 2 \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} X_t X_{t-j} \right] - \mathrm{Var}\left( n^{-1/2} \sum_{t=1}^{n} X_t \right) \xrightarrow{p} 0 .$$
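A minimal sketch of the scalar estimator appearing in Theorem 3.4, assuming a mean-zero series, may help fix ideas. The Bartlett weights $w_n(j) = 1 - j/(l(n)+1)$ and the lag rule $l(n) = \lfloor n^{0.3} \rfloor$ are illustrative assumptions, not choices made in this paper; they are one way to satisfy Assumption 3.2 together with $l(n) \uparrow \infty$ and $l(n) = o(n^{1/3})$. All function names are hypothetical.

```python
import math
import random


def bartlett_weight(j, lag):
    """Bartlett weight 1 - j/(lag+1): an assumed, illustrative choice of w_n(j)."""
    return 1.0 - j / (lag + 1.0)


def lag_length(n, exponent=0.3):
    """Hypothetical lag rule l(n) = floor(n**exponent), with exponent below 1/3."""
    if not 0.0 < exponent < 1.0 / 3.0:
        raise ValueError("exponent must lie strictly between 0 and 1/3")
    return int(math.floor(n ** exponent))


def weighted_variance_estimate(x, lag, weight=bartlett_weight):
    """Compute n^{-1} [ sum_t x_t^2 + 2 sum_{j=1}^{lag} w(j) sum_{t=j+1}^{n} x_t x_{t-j} ].

    The series x is assumed to have mean zero, as in Assumption 3.1.
    """
    n = len(x)
    total = sum(v * v for v in x)
    for j in range(1, lag + 1):
        # Zero-based indices t = j, ..., n-1 match t = j+1, ..., n in the text.
        cross = sum(x[t] * x[t - j] for t in range(j, n))
        total += 2.0 * weight(j, lag) * cross
    return total / n


if __name__ == "__main__":
    # Simulated MA(1) series x_t = e_t + 0.5 e_{t-1} with unit-variance innovations;
    # its long-run variance is (1 + 0.5)^2 = 2.25.
    random.seed(0)
    n = 5000
    e = [random.gauss(0.0, 1.0) for _ in range(n + 1)]
    x = [e[t + 1] + 0.5 * e[t] for t in range(n)]
    print(lag_length(n), weighted_variance_estimate(x, lag_length(n)))
```

With Bartlett weights each $w_n(j)$ lies in $[0, 1]$ and tends to 1 for fixed $j$ as the lag grows, which is why this particular choice fits Assumption 3.2; other bounded non-negative weight schemes with the same limit could be passed through the `weight` argument.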
Remarks

1. Apply Theorem 3.4 to the proof of Theorem 2 in Newey and West [1987], to argue convergence of the second and third terms in their expression (9). The other terms similarly converge to zero by $l(n) \uparrow \infty$ and $l(n) = o(n^{1/3})$. The result here therefore implies the same conclusion as their Theorem 2, but with greater flexibility in choice of lag length ($o(n^{1/3})$ instead of $o(n^{1/4})$).

2. Similarly, the results in Chapter 6 of Gallant and White [1988] remain intact with Assumption TL changed from the typographically incorrect $O(n^{1/4})$ on p. 101 to $o(n^{1/3})$.

3. The Newey-West result is widely used in applications: see for example Phillips [1987] Theorem 4.2, Phillips and Perron [1988], Phillips and Ouliaris [1988], and elsewhere. Those results therefore all hold with an even more flexible choice for the lag length.

4. The rate $o(n^{1/3})$ is also that used in autoregressive spectral density estimation under assumptions that include strict stationarity, absolute summability of the Wold moving average coefficients, and finite fourth moments on the iid innovations (e.g. Berk [1974] Theorem 1).

5. Fuller [1976, Theorem 7.2.3] and Anderson [1971, Chapter 9] imply that a $o(n)$ rate can be used for the strictly stationary case. It may in fact be possible to adapt the "unraveling" method used there for the nonstationary mixing situation considered here.

4. Proofs

In the sequel, the symbols $K$ and $K'$ denote arbitrary finite constants, which will not necessarily be the same throughout.

Our first result is a WLLN. It is convenient to give the proof in two parts; the first part is a variance bound which may be useful in other applications. This is also that part of the results that gives the binding restriction on the mixing and moment conditions (3.1); thus if further improvement is forthcoming, it is likely to obtain by giving a sharper inequality here.

Proof of Theorem 3.3: First bound $\mathrm{Var}\left(\sum_{t=1}^{n} Z_{nt}\right)$. Write $\mathrm{Var}\left(\sum_{t=1}^{n} Z_{nt}\right) = \left| \sum_{t=1}^{n} \sum_{s=1}^{n} E Z_{nt} Z_{ns} \right|$. Decompose this double sum of products into products close together, and products far apart. By the triangle inequality,

$$\mathrm{Var}\left( \sum_{t=1}^{n} Z_{nt} \right) \le \sum_{t=1}^{n} \sum_{|s-t| \le 2l(n)} |E Z_{nt} Z_{ns}| + \sum_{t=1}^{n} \sum_{|s-t| > 2l(n)} |E Z_{nt} Z_{ns}| .$$

Consider the first summand.
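For $s$, $t$ between 1 and $n$, the Cauchy-Schwarz inequality implies:

$$|E Z_{nt} Z_{ns}| \le \|Z_{nt}\|_2 \, \|Z_{ns}\|_2 \le \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

For each $t$, there are at most $4l(n) + 1$ points $s$ for which $|s - t| \le 2l(n)$. Thus the first summand satisfies:

$$\sum_{t} \sum_{|s-t| \le 2l(n)} |E Z_{nt} Z_{ns}| \le n \cdot (4l(n) + 1) \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

Next consider the second summand. For $t$ and $s$ with $|s - t| > 2l(n)$, the variables entering $Z_{nt}$ and $Z_{ns}$ are separated by more than $l(n)$ periods, so Davydov's Inequality implies:

$$|E Z_{nt} Z_{ns}| \le 15 \, \alpha_{l(n)}^{1 - \frac{1}{r}} \cdot \|Z_{nt}\|_{2r} \, \|Z_{ns}\|_{2r} \le 15 \, \alpha_{l(n)}^{1 - \frac{1}{r}} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 ,$$

while Peligrad's Inequality implies:

$$|E Z_{nt} Z_{ns}| \le 2 \, \phi^+_{l(n)} \cdot \|Z_{nt}\|_2 \, \|Z_{ns}\|_2 \le 2 \, \phi^+_{l(n)} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

There are at most $n - 4l(n) - 1$ points $s$ such that $|s - t| > 2l(n)$. Thus the second summand obeys:

$$\sum_{t} \sum_{|s-t| > 2l(n)} |E Z_{nt} Z_{ns}| \le n \, (n - 4l(n) - 1) \cdot 15 \, \alpha_{l(n)}^{1 - \frac{1}{r}} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 \le n^2 \cdot 15 \, \alpha_{l(n)}^{1 - \frac{1}{r}} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 .$$

Similarly, the second summand also satisfies:

$$\sum_{t} \sum_{|s-t| > 2l(n)} |E Z_{nt} Z_{ns}| \le n^2 \cdot 2 \, \phi^+_{l(n)} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

For any $p$ such that $2 \le p \le 2r$, Minkowski's inequality gives

$$\|Z_{nt}\|_p \le \sum_{j=0}^{l(n)} w_n(j) \, \|X_t X_{t-j} - E X_t X_{t-j}\|_p .$$

By 3.1.i, $\sup_t \sup_j \|X_t X_{t-j} - E X_t X_{t-j}\|_p < \infty$. Further, since $w_n(j)$ is uniformly bounded, we have that for some finite constant $K$,

$$\|Z_{nt}\|_p \le K \, (l(n) + 1) \implies \sup_{1 \le t \le n} \|Z_{nt}\|_p^2 \le K^2 \, (l(n) + 1)^2 .$$

Using this for $p = 2$ and $p = 2r$ in the first and second summands, conclude that there must exist some finite constant $K'$ such that:

$$\mathrm{Var}\left( \sum_{t=1}^{n} Z_{nt} \right) \le K' \left\{ n \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + n^2 \, (l(n) + 1)^2 \cdot \alpha_{l(n)}^{1 - \frac{1}{r}} \right\}$$

and

$$\mathrm{Var}\left( \sum_{t=1}^{n} Z_{nt} \right) \le K' \left\{ n \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + n^2 \, (l(n) + 1)^2 \cdot \phi^+_{l(n)} \right\} .$$

If 3.1.ii.a, then $\alpha_j = O(j^{\lambda})$ for some $\lambda < -2r/(r-1)$, so that $l(n)^2 \, \alpha_{l(n)}^{1 - \frac{1}{r}} = o(1)$. Similarly, if 3.1.ii.b, then $\phi^+_j = O(j^{\lambda'})$ for some $\lambda' < -2$, so that $l(n)^2 \, \phi^+_{l(n)} = o(1)$.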
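The exponent arithmetic behind these two claims, and the place where the $n^{1/3}$ threshold enters, can be checked directly. What follows is a reader's verification under Assumption 3.1(ii) with $l(n) \uparrow \infty$; it is a sketch and not part of the original argument.

```latex
% Case (a): \alpha_j = O(j^{\lambda}) with \lambda < -2r/(r-1) and r > 1.
\[
  l(n)^2 \, \alpha_{l(n)}^{1 - 1/r}
  = O\bigl( l(n)^{\,2 + \lambda (r-1)/r} \bigr),
  \qquad
  \lambda \, \frac{r-1}{r} < -\frac{2r}{r-1} \cdot \frac{r-1}{r} = -2 .
\]
% Case (b): \phi^+_j = O(j^{\lambda'}) with \lambda' < -2.
\[
  l(n)^2 \, \phi^+_{l(n)} = O\bigl( l(n)^{\,2 + \lambda'} \bigr),
  \qquad 2 + \lambda' < 0 .
\]
% In both cases the exponent on l(n) is negative, so the product is o(1).
% Dividing the first term of either variance bound by n^2 gives
\[
  n^{-2} \cdot n \, (4 l(n) + 1) \, (l(n) + 1)^2
  = O\bigl( l(n)^3 / n \bigr) \to 0
  \quad \text{if } l(n) = o(n^{1/3}) .
\]
```

The last display is where the lag-growth restriction binds: the "close together" products contribute a term of order $l(n)^3 / n$, while the mixing conditions only need to control the "far apart" products.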
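Then, using Chebyshev's inequality, for any $\epsilon > 0$,

$$\Pr\left( \left| n^{-1} \sum_{t=1}^{n} Z_{nt} \right| \ge \epsilon \right) \le \frac{K'}{\epsilon^2} \left\{ n^{-1} \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + (l(n) + 1)^2 \cdot \alpha_{l(n)}^{1 - \frac{1}{r}} \right\}$$

and

$$\Pr\left( \left| n^{-1} \sum_{t=1}^{n} Z_{nt} \right| \ge \epsilon \right) \le \frac{K'}{\epsilon^2} \left\{ n^{-1} \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + (l(n) + 1)^2 \cdot \phi^+_{l(n)} \right\} .$$

Given $l(n) = o(n^{1/3})$, and 3.1.ii, one or the other of the right hand sides above tends to zero as $n \to \infty$. Therefore, as $n \to \infty$, $n^{-1} \sum_{t=1}^{n} Z_{nt} \xrightarrow{p} 0$. Q.E.D.

Next, turn to the main result (3.4). The proof of this is an abbreviation and modification of ideas in Newey and West [1987] and White [1984] Chapter 6. The crucial difference is in replacing White's (corrected) Lemma 6.19, the implication rule, and an early use of Chebyshev's Inequality with our WLLN for double arrays (3.3). With this WLLN, (3.4) follows in a remarkably straightforward way.

Proof of Theorem 3.4: Proceed in two steps. First, argue the expected version with truncation and weighting differs from $\mathrm{Var}\left(n^{-1/2} \sum_{t=1}^{n} X_t\right)$ by a quantity that vanishes as $n \to \infty$. Then show the feasible estimator converges in probability to its expectation. Begin with the expected version:

$$n^{-1} \left[ \sum_{t=1}^{n} E X_t^2 + 2 \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} E X_t X_{t-j} \right] - \mathrm{Var}\left( n^{-1/2} \sum_{t=1}^{n} X_t \right)$$

$$= 2 n^{-1} \sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} E X_t X_{t-j} \; - \; 2 n^{-1} \sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} E X_t X_{t-j} .$$

By Davydov's Inequality,

$$|E X_t X_{t-j}| \le 15 \, \alpha_j^{1 - \frac{1}{2r}} \cdot \|X_t\|_{4r} \, \|X_{t-j}\|_{4r} ,$$

and by Peligrad's Inequality,

$$|E X_t X_{t-j}| \le 2 \, \phi^+_j \cdot \|X_t\|_2 \, \|X_{t-j}\|_2 .$$

By 3.1, $\sup_t \|X_t\|_{4r} < \infty$ and $\sup_t \|X_t\|_2 < \infty$, so that:

$$\left| n^{-1} \sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} E X_t X_{t-j} \right| \le K \sum_{j=1}^{l(n)} |w_n(j) - 1| \, \alpha_j^{1 - \frac{1}{2r}}$$

and similarly,

$$\left| n^{-1} \sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} E X_t X_{t-j} \right| \le K \sum_{j=1}^{l(n)} |w_n(j) - 1| \, \phi^+_j .$$

From 3.1.ii, we have that either $\sum_{j=1}^{\infty} \alpha_j^{1 - \frac{1}{2r}} < \infty$ or $\sum_{j=1}^{\infty} \phi^+_j < \infty$, respectively. Since $w_n(j) \to 1$ for each $j$, the dominated convergence theorem then implies that either $\sum_{j=1}^{l(n)} |w_n(j) - 1| \, \alpha_j^{1 - \frac{1}{2r}} \to 0$ or $\sum_{j=1}^{l(n)} |w_n(j) - 1| \, \phi^+_j \to 0$ as $n \to \infty$. By a similar argument, we have that:

$$\left| n^{-1} \sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} E X_t X_{t-j} \right| \le K \sum_{j=l(n)+1}^{n-1} \alpha_j^{1 - \frac{1}{2r}}$$

and

$$\left| n^{-1} \sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} E X_t X_{t-j} \right| \le K \sum_{j=l(n)+1}^{n-1} \phi^+_j .$$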
Again, if 3.1.ii.a, $\sum_{j=1}^{\infty} \alpha_j^{1 - \frac{1}{2r}}$ converges to a finite quantity. Similarly, if 3.1.ii.b, $\sum_{j=1}^{\infty} \phi^+_j$ converges to a finite quantity. In either case, this implies the respective right hand side above converges to 0, provided that $l(n) \uparrow \infty$ as $n \to \infty$. This completes the first part of the proof.

Second, we show the required convergence in probability. The estimator

$$n^{-1} \left[ \sum_{t=1}^{n} X_t^2 + 2 \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} X_t X_{t-j} \right]$$

differs from its expectation by:

$$n^{-1} \sum_{t=1}^{n} (X_t^2 - E X_t^2) + 2 n^{-1} \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} (X_t X_{t-j} - E X_t X_{t-j}) .$$

For ease of notation, define $X_t = 0$ for all $t \le 0$, and rearrange orders of summation in the second term:

$$2 n^{-1} \sum_{t=1}^{n} \sum_{j=1}^{l(n)} w_n(j) \, (X_t X_{t-j} - E X_t X_{t-j}) .$$

Define the double array of rv's $Z_{nt} = \sum_{j=1}^{l(n)} w_n(j) \, (X_t X_{t-j} - E X_t X_{t-j})$, and apply Theorem 3.3 to it. The second term therefore converges in probability to zero. Similarly apply Theorem 3.3 (with $l(n) = 0$) to $X_t^2 - E X_t^2$, so that the first term $n^{-1} \sum_{t=1}^{n} (X_t^2 - E X_t^2)$ converges in probability to zero as well. This completes the proof. Q.E.D.

Notice that the first part of the proof uses considerably weaker mixing-moment conditions than necessary for the second part of the proof, which is essentially a repeated application of Theorem 3.3. Thus an improvement in the lag growth rate, if forthcoming, might most easily be found by obtaining a sharper inequality than that available in the proof of Theorem 3.3.

References

Andrews, Donald W.K. (1988): "Laws of Large Numbers for Dependent Non-Identically Distributed Random Variables," Econometric Theory, 4, no. 3, December, 458-467.

Anderson, T.W. (1971): The Statistical Analysis of Time Series, New York: John Wiley and Sons.

Berk, K.N. (1974): "Consistent Autoregressive Spectral Estimates," Annals of Statistics, 2, no. 3, 489-502.

Fuller, W.A. (1976): Introduction to Statistical Time Series, New York: John Wiley and Sons.
Gallant, A.R. and H. White (1988): A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models, New York: Basil Blackwell.

Hansen, L.P. (1982): "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50, no. 4, July, 1029-1054.

Hansen, L.P. and K.J. Singleton (1982): "Generalized Instrumental Variables Estimation of Nonlinear Rational Expectations Models," Econometrica, 50, no. 5, September, 1269-1286.

McLeish, D.L. (1975): "A Maximal Inequality and Dependent Strong Laws," Annals of Probability, 3, no. 5, 829-839.

Newey, W.K. and K.D. West (1987): "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 55, no. 3, May, 703-708.

Peligrad, M. (1983): "A Note on Two Measures of Dependence and Mixing Sequences," Adv. Appl. Prob., 15, 461-464.

Philipp, W. (1986): "Invariance Principles for Independent and Weakly Dependent Random Variables," pp. 225-268 in E. Eberlein and M.S. Taqqu (eds.), Dependence in Probability and Statistics, Boston: Birkhauser.

Phillips, P.C.B. (1987): "Time Series Regression with a Unit Root," Econometrica, 55, no. 2, March, 277-301.

Phillips, P.C.B. and S. Ouliaris (1986): "Testing for Cointegration," Cowles Foundation Discussion Paper no. 890, Yale University.

Phillips, P.C.B. and P. Perron (1988): "Testing for a Unit Root in Time Series Regression," Biometrika, 75, no. 2, 335-346.

Stock, J.H. (1988): "A Class of Tests for Integration and Cointegration," Kennedy School of Government, Harvard University mimeo, March.

White, H. (1984): Asymptotic Theory for Econometricians, New York: Academic Press.

White, H., and I. Domowitz (1984): "Nonlinear Regression with Dependent Observations," Econometrica, 52, 143-161.