Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries
http://www.archive.org/details/optimalinstrumenOOkuer

Working Paper
Department of Economics

OPTIMAL INSTRUMENTAL VARIABLES ESTIMATION FOR ARMA MODELS

Guido Kuersteiner

No. 99-07, March 1999

Massachusetts Institute of Technology
50 Memorial Drive
Cambridge, Mass. 02139

Optimal Instrumental Variables Estimation for ARMA Models

By Guido M. Kuersteiner¹

In this paper a new class of Instrumental Variables estimators for linear processes and in particular ARMA models is developed. Previously, IV estimators based on lagged observations as instruments have been used to account for unmodelled MA(q) errors in the estimation of the AR parameters. Here it is shown that these IV methods can be used to improve efficiency of linear time series estimators in the presence of unmodelled conditional heteroskedasticity. Moreover an IV estimator for both the AR and MA parts is developed. One consequence of these results is that Gaussian estimators for linear time series models are inefficient members of this IV class. A leading example of an inefficient member is the OLS estimator for AR(p) models which is known to be efficient under homoskedasticity.

Keywords: ARMA, conditional heteroskedasticity, instrumental variables, efficiency lower bound, frequency domain.

¹ Address: MIT Dept. of Economics, 50 Memorial Drive, Cambridge, MA 02142, USA. Email: gkuerste@mit.edu.

1. Introduction²

This paper considers instrumental variables (IV) estimators for linear time series models. Efficient estimation in this framework has been studied by Hayashi and Sims (1983), Stoica, Soderstrom and Friedlander (1985) and Hansen and Singleton (1991, 1996).
In these papers efficient estimation of autoregressive roots under the presence of moving average errors has been analyzed. The moving average part of the model is not estimated but rather treated as a nuisance parameter. The class of instruments is restricted to linear functions of past observations. It is also assumed in this literature that the innovations are conditionally homoskedastic.

Here it is shown that the same class of IV estimators based on linear functions of past observations can be used to improve efficiency of estimators for linear time series models in the presence of unmodelled conditional heteroskedasticity. A consequence of the results of this paper is that standard estimators of linear process models based on Gaussian Pseudo Likelihood functions are inefficient GMM estimators if the innovations are conditionally heteroskedastic. This means in particular that OLS estimators for AR(p) models are inefficient GMM estimators if the innovations are heteroskedastic.

In addition the paper extends the current literature in two directions. First, an IV estimator for general linear models, including MA(q) parts of ARMA models, is introduced under the assumption of conditionally heteroskedastic innovations. Second, for the class of IV estimators with linear instruments the paper derives exact functional forms of optimal filters of the type developed in Hansen and Singleton (1991) for a simpler estimation problem. It is shown how the filters depend on fourth order cumulants of the innovation distribution and the impulse response function of the underlying process. This formulation makes it possible to give exact conditions on the distribution of the error process under which optimal instrumental variables estimators are feasible. A detailed analysis of the properties of the optimal weight matrix is provided.

The results in this paper are presented for the case of martingale difference innovations driving the linear process. Alternatively similar formulas with the same efficiency implications could be obtained under the weaker assumption of white noise innovations. In this case the space of permissible instruments is generated by all linear combinations of past observations and the efficiency bounds developed here are identical to the bounds of Hansen (1985) and Hansen, Heaton and Ogaki (1988). In the case of martingale difference innovations Hansen's bounds are based on a larger class of instruments and are therefore tighter than the bounds obtained here. A detailed analysis of the linear class of instruments is justified by the fact that the Gaussian estimators are a member of this class. Any IV procedure dominating the Gaussian estimators therefore has to contain these linear instruments in the set of all instruments used.

The main technical difficulty in extending previous procedures to the estimation of the moving average case lies in the consistency proof. We give a general characterization of instrument processes that lead to consistent estimators. We then establish that the optimal instrument satisfies these criteria.

² This paper is partly based on results in my Ph.D. dissertation at Yale University, 1997. I wish to thank Peter C.B. Phillips for continued encouragement and support. I have also benefited from comments by Donald Andrews, Jinyong Hahn, Jerry Hausman, Whitney Newey, Oliver Linton, Chris Sims and participants of the econometrics workshop at Yale and the University of Pennsylvania. I benefited from helpful comments of three anonymous referees on an earlier paper that inspired the current extensions. All remaining errors are my own. Financial support from an Alfred P. Sloan Doctoral Dissertation Fellowship is gratefully acknowledged.

In this paper we do not focus on implementation issues.
For most parts of the analysis it is assumed that the optimal instrument is known a priori. It is clear that in practice a procedure for estimation of the weight matrix is needed. In Kuersteiner (1997) such a feasible procedure is developed under stronger assumptions about the joint distribution of the error process. If these assumptions are satisfied then the procedures developed in Kuersteiner (1997) can be directly applied to the present context. Explicit formulas are provided for this case. We also give an exact formula for a feasible version of the optimal procedure under the more general conditions analyzed in this paper. In this case the feasible estimator depends on a bandwidth parameter. A maximal rate of expansion of this parameter for the estimator to maintain its first order asymptotic properties is provided. However, optimal bandwidth selection procedures are beyond the scope of the paper.

The paper is organized as follows. Section 2 introduces the assumptions about the innovation sequence and specifies the inference problem. Section 3 develops an instrumental variables estimator for estimation of linear process models and proves consistency and asymptotic normality of estimators for the ARMA class. In Section 4 it is shown how to factorize the asymptotic covariance matrix of this class of instrumental variables estimators in a way to obtain a lower bound. Section 5 uses the lower bound to obtain an explicit formulation of the optimal IV estimator depending on the data periodogram and an optimal frequency domain filter. Proofs of some important lemmas are contained in Appendix A while the proofs of the results in the paper are contained in Appendix B.

2. Model Specification

The econometrician observes a finite stretch of data $\{y_t\}_{t=1}^{n}$ which is generated by the following mechanism

$$y_t = \sum_{j=0}^{\infty} c(\beta_0, j)\,\varepsilon_{t-j} \qquad (2.1)$$

for a given $\beta_0 \in \mathbb{R}^d$.
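The parameter $\beta_0$ is unknown but the functions $c(\cdot,j)$, with $c: \mathbb{R}^d \times \mathbb{N} \to \mathbb{R}$, are known. We impose the identifying restriction $c(\beta,0) = 1$ and define the lag polynomial $C(\beta,z) = \sum_{j=0}^{\infty} c(\beta,j) z^j$. The innovations $\varepsilon_t$ are assumed to be a martingale difference sequence. The martingale difference property imposes restrictions on the fourth order cumulants. These restrictions can be conveniently summarized by defining the following function

$$\sigma(s,r) = \begin{cases} E(\varepsilon_t^2 \varepsilon_{t-s}\varepsilon_{t-r}) - \sigma^4 & \text{if } s = r \\ E(\varepsilon_t^2 \varepsilon_{t-s}\varepsilon_{t-r}) & \text{if } s \neq r \end{cases} \qquad (2.2)$$

where $\sigma^2 = E(\varepsilon_t^2)$. It should be emphasized that $\sigma(s,r)$ is equal to the fourth order cumulant for $s, r \geq 1$. We define

$$\alpha(s,r) = \begin{cases} \sigma(s,r) + \sigma^4 & \text{if } s = r \\ \sigma(s,r) & \text{if } s \neq r. \end{cases} \qquad (2.3)$$

Let us assume that we have a probability space $(\Omega, \mathcal{F}, P)$ with a filtration $\mathcal{F}_t$ of increasing $\sigma$-fields such that $\mathcal{F}_t \subset \mathcal{F}_{t+1} \subset \mathcal{F}$ for all $t$. The doubly infinite sequence of random variables $\{\varepsilon_t\}_{t=-\infty}^{\infty}$ generates the filtration $\mathcal{F}_t$ such that $\mathcal{F}_t = \sigma(\varepsilon_t, \varepsilon_{t-1}, \ldots)$. The assumptions on $\{\varepsilon_t\}_{t=-\infty}^{\infty}$ are summarized as follows:

Assumption A1. (i) $\varepsilon_t$ is strictly stationary and ergodic, (ii) $E(\varepsilon_t \mid \mathcal{F}_{t-1}) = 0$ almost surely, (iii) $E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma_t^2$ almost surely where $\sigma_t^2$ is not constant, (iv) $E(\varepsilon_t^4) < \infty$, (v) $\sum_{s=1}^{\infty}\sum_{r=1}^{\infty} |\sigma(s,r)| = B < \infty$, (vi) $E(\varepsilon_t^2 \varepsilon_{t-s}^2) > \alpha$ for some $\alpha > 0$ and for all $s$.

Remark 1. Assumption A1(ii) could be relaxed slightly to $E\varepsilon_t \varepsilon_s = 0$ for all $t \neq s$ at the cost of more complicated expressions for the optimal instruments. Assumption A1(iii) states that the second moments are conditionally heterogeneous. A consequence of A1(iii) is that terms of the form $E(\varepsilon_t^2 \varepsilon_{t-s}\varepsilon_{t-r})$ are in general nonzero for $s, r \neq 0$ and depend on $s$ and $r$. Assumption (v) limits the dependence in higher moments by imposing a summability condition on the fourth cumulants. The assumption is needed to prove invertibility of the infinite dimensional weight matrix of the optimal estimator. Assumption (vi) is not restrictive. Its only purpose is to guarantee that the innovation distribution does not have all its mass concentrated at zero.

Remark 2.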
It can be checked that processes in the ARCH, GARCH, EGARCH and stochastic volatility class satisfy the assumptions, provided the parametrization implies that $E\varepsilon_t^4 < \infty$. It is well known from Milhoj (1985) or Nelson (1990) that this condition is satisfied only if additional restrictions limiting the temporal dependence of conditional variances and/or the innovation distribution are imposed on the parameter space.

By definition of the conditional expectation operator, $\sigma_t^2$ is $\mathcal{F}_{t-1}$ measurable. Assumption (A1) implies that $\varepsilon_t^2$ is strictly stationary and ergodic and therefore covariance stationary. It should be emphasized that no assumptions about third moments are made. In particular this allows for skewness in the error process.

For the special case of an ARMA(p,q) process, the lag polynomial has the familiar rational form

$$C(\beta,z) = \frac{\theta(z)}{\phi(z)} \qquad (2.4)$$

with $\theta(z) = 1 - \theta_1 z - \ldots - \theta_q z^q$, $\phi(z) = 1 - \phi_1 z - \ldots - \phi_p z^p$ and $\beta' = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q)$. Let $g_{yy}(\beta,\lambda) = |C(\beta, e^{i\lambda})|^2$ where $|z| = (z z^*)^{1/2}$ for $z \in \mathbb{C}$ and $z^*$ is the complex conjugate of $z$. Under Assumption (A1), the spectrum of $y_t$ is given by $f_{yy}(\beta,\lambda) = \frac{\sigma^2}{2\pi} g_{yy}(\beta,\lambda)$.

Further restrictions on $C(\beta, e^{i\lambda})$ are needed to ensure identification of the model and for consistency and asymptotic normality of the estimators. The necessary assumptions are discussed in Hannan (1973), Dunsmuir and Hannan (1976), and Deistler, Dunsmuir and Hannan (1978). As shown in these articles, a careful distinction between convergence of the parameters in $c(\beta,j)$ and the structural form parameters is needed. Consistency proofs typically establish convergence in the pointwise topology. An identification condition is then needed to obtain convergence in the quotient topology.

Some of the results of this paper are presented for the general formulation $C(\beta,z)$. At some points however a specialization to the ARMA case is made in order to obtain sharper results. This is especially the case for the consistency proof.
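As an illustration of the rational form (2.4), the following minimal sketch computes the coefficients $c(\beta,j)$ of an ARMA lag polynomial by matching powers of $z$ in $\phi(z)C(\beta,z) = \theta(z)$, and evaluates $g_{yy}(\beta,\lambda)$ directly. The function names `arma_impulse_response` and `g_yy` are illustrative choices, not notation from the paper, and the sign convention for $\theta(z)$ and $\phi(z)$ is the one assumed in (2.4).

```python
import numpy as np


def arma_impulse_response(phi, theta, n_terms):
    """Return c(beta, j), j = 0..n_terms-1, for C(z) = theta(z) / phi(z).

    Assumed sign convention (as in (2.4)):
        theta(z) = 1 - theta_1 z - ... - theta_q z^q
        phi(z)   = 1 - phi_1 z   - ... - phi_p z^p
    """
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    c = np.zeros(n_terms)
    for j in range(n_terms):
        # Coefficient of z^j in theta(z).
        if j == 0:
            value = 1.0
        elif j <= len(theta):
            value = -theta[j - 1]
        else:
            value = 0.0
        # phi(z) C(z) = theta(z)  =>  c_j = [theta]_j + sum_k phi_k c_{j-k}
        for k in range(1, min(j, len(phi)) + 1):
            value += phi[k - 1] * c[j - k]
        c[j] = value
    return c


def g_yy(phi, theta, lam):
    """Evaluate g_yy(beta, lambda) = |C(beta, e^{i lambda})|^2."""
    z = np.exp(1j * np.asarray(lam, dtype=float))
    theta_z = 1.0 - sum(t * z ** (k + 1) for k, t in enumerate(theta))
    phi_z = 1.0 - sum(p * z ** (k + 1) for k, p in enumerate(phi))
    return np.abs(theta_z / phi_z) ** 2


if __name__ == "__main__":
    # ARMA(1,1) with phi_1 = 0.5 and theta_1 = 0.3 (hypothetical values).
    print(arma_impulse_response([0.5], [0.3], 5))
    print(g_yy([0.5], [0.3], [0.0, np.pi / 2]))
```

The recursion only requires the AR and MA coefficient vectors, so the same routine covers pure AR and pure MA models by passing an empty list for the missing part.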
In the ARMA case abstract high level assumptions can be made precise for the specific functional form of the model. In the general case the functions $c(\beta,j) \in C([\mathbb{R}^d \times \mathbb{N}], \mathbb{R})$ are restricted to satisfy the following additional constraints.

Assumption B1. Let $C(\beta,z) = \sum_{j=0}^{\infty} c(\beta,j) z^j$. The parameter space $\Theta$ is a subset of $\mathbb{R}^d$ defined by

$$\Theta = \{\beta \in \mathbb{R}^d \mid |C(\beta,z)|^{-2} \neq 0 \text{ for } |z| \leq 1,\ |C(\beta,z)|^{2} \neq 0 \text{ for } |z| \leq 1\}.$$

Assume that $\Theta$ is open in $\mathbb{R}^d$. Let the compact closure of $\Theta$ in $\mathbb{R}^d$ be denoted by $\bar{\Theta}$. Assume that $\beta_0 \in \Theta$. The coefficients $c(\beta,j)$ are twice continuously differentiable in $\beta$. We require $c(\beta,0) = 1$ and, for $\beta \in \Theta$, that $\sum_{j=0}^{\infty} |j|\,|c(\beta,j)| < \infty$ and $\sum_{j=0}^{\infty} |j|\,|\frac{\partial}{\partial\beta} c(\beta,j)| < \infty$.

Assumption B2. For all $\beta \in \Theta$, $g_{yy}(\beta_0,\lambda) \neq g_{yy}(\beta,\lambda)$ whenever $\beta \neq \beta_0$ for some subsets $L \subset [-\pi,\pi]$ with nonzero Lebesgue measure. Let $\partial\Theta = \bar{\Theta} \setminus \Theta$ and consider any convergent sequence $\beta_n \in \Theta$, $\beta_n \to \beta \in \partial\Theta$. Then

$$\liminf_n \left| \int_{-\pi}^{\pi} C^{-1}(\beta_n, e^{-i\lambda}) f(e^{i\lambda})\, d\lambda \right| > 0$$

for some complex valued $f(z)$ such that $f(z) = \sum_{k=-\infty}^{\infty} f_k z^k$ with $\sum_{k=-\infty}^{\infty} |f_k| < \infty$.

Assumption B3. For a neighborhood $U$ of $\beta_0$, $U \subset \Theta$, $\partial^2 g_{yy}(\beta,\lambda)/\partial\beta\,\partial\beta'$ is continuous in $\lambda \in [-\pi,\pi]$ and $\beta \in U$.

Remark 3. Assumption (B1) implies that the functions $g_{yy}(\beta,\lambda)$ and $\partial g_{yy}(\beta,\lambda)/\partial\beta$ are Lipschitz continuous. The Lipschitz condition also implies that $g_{yy}^{-1}(\beta,\lambda)$ is Lipschitz continuous on closed subsets of $\Theta$ and therefore that $\frac{\partial}{\partial\beta} \ln g_{yy}(\beta,\lambda)$ is Lipschitz continuous on closed subsets of $\Theta$.

Remark 4. Assumption (B1) is stronger than C2.2 in Dunsmuir (1979) where on the other hand conditional homoskedasticity is assumed. The stronger summability restrictions are needed to justify approximations based on the innovation sequence.

The assumptions specified here are sufficient to identify the parameters $\beta$ in $C(\beta, e^{i\lambda})$. For specific functional forms of $C(\beta, e^{i\lambda})$ the assumptions can be made more explicit. A leading example is the ARMA model where the identifiable subset of $\mathbb{R}^d$ can be described more accurately. The following Assumption is equivalent to the previous assumptions for the case of an ARMA model.

Assumption B4.
Let $C(\beta,z) = \theta(z)/\phi(z)$. The parameter space $\Theta$ is a subset of $\mathbb{R}^d$ defined by

$$\Theta = \{\beta \in \mathbb{R}^d \mid \phi(z) \neq 0 \text{ for } |z| \leq 1,\ \theta(z) \neq 0 \text{ for } |z| \leq 1,\ \theta(z), \phi(z) \text{ have no common zeros},\ \theta_q \neq 0,\ \phi_p \neq 0\}.$$

Let the compact closure of $\Theta$ in $\mathbb{R}^d$ be denoted by $\bar{\Theta}$.

Remark 5. Deistler, Dunsmuir and Hannan (1978) show that $\Theta$ and $\bar{\Theta}$ defined in Assumption (B4) satisfy the topological properties required in Assumption (B1). It is easy to show that ARMA models in $\Theta$ all satisfy the summability and differentiability requirements of (B1). The only new condition is that $\liminf_n |\int C^{-1}(\beta_n, e^{-i\lambda}) f(e^{i\lambda})\, d\lambda| > 0$. Since $C(\beta, e^{i\lambda})$ can be zero on the boundary of the parameter space we cannot expect the integral to be defined on the boundary in general. The condition requires that the behavior of $C^{-1}(\beta_n, e^{i\lambda})$ is not too irregular as $\beta_n \to \partial\Theta$. For the ARMA class this condition is satisfied. It is enough to consider the MA(q) case. The integral $\int_{-\pi}^{\pi} \theta_n(e^{-i\lambda})^{-1} f(e^{i\lambda})\, d\lambda$ diverges to infinity as more than one of the roots of $\theta_n(e^{i\lambda})$ approach unity and converges to a constant if one root approaches unity. To see this let $z_{jn}$, $j = 1, \ldots, q$, denote the roots of $\theta_n(e^{i\lambda})$ such that $\theta_n(e^{i\lambda}) = \prod_{j=1}^{q} (1 - z_{jn} e^{i\lambda})$. Expanding $\theta_n(e^{-i\lambda})^{-1} = \sum_{k=0}^{\infty} \psi_{k,n} e^{-i\lambda k}$ and $f(e^{i\lambda}) = \sum_{k=-\infty}^{\infty} f_k e^{i\lambda k}$ leads to $\int_{-\pi}^{\pi} \theta_n(e^{-i\lambda})^{-1} f(e^{i\lambda})\, d\lambda = 2\pi \sum_{k=0}^{\infty} \psi_{k,n} f_k$.

In the following analysis IV estimator results will first be obtained for the general linear process case. It will then be shown that high level assumptions needed for these results are satisfied for the case when Assumptions (B1-B3) are specialized to (B4).

3. Instrumental Variables Estimators

In this section a class of instrumental variables estimators is introduced. The instruments are constructed from linear filters of lagged innovations $\varepsilon_t$. An alternative, equivalent formulation would be to allow for linear filters of the observable process $y_t$.
Estimators of this form have been proposed by Hayashi and Sims (1983), Stoica, Soderstrom and Friedlander (1985) and Hansen and Singleton (1991). Restricting the instruments to the linear class has implications for the efficiency properties of the estimators. It rules out conditional GLS transformations and ML estimators for parametric cases. Linearity, on the other hand, leads to a tractable theory.

Introduce the space of absolutely summable sequences $l^1$ such that $x \in l^1$ if $\sum_j |x_j| < \infty$ for $x = \{x_j\}_{j=1}^{\infty}$. Define the set $A$ of sequences of vectors $a_j \in \mathbb{R}^d$ such that

$$A = \left\{ a = \{a_j\}_{j=1}^{\infty} : a_j \in \mathbb{R}^d,\ \{[a_j]_k\}_{j=1}^{\infty} \in l^1 \text{ for all } 1 \leq k \leq d \right\}$$

where $[\cdot]_k$ denotes the k-th element of a vector. We define $z_t \in \mathbb{R}^d$ as

$$z_t = \sum_{k=1}^{\infty} a_k \varepsilon_{t-k} \quad \text{a.s.}$$

for $a \in A$, $a$ fixed. The instruments satisfy the orthogonality condition

$$E\left[ \left( C^{-1}(\beta_0, L)\, y_t \right) z_t \right] = 0 \qquad (3.1)$$

since $C^{-1}(\beta_0, L) y_t = \varepsilon_t$ from (2.1). The estimator based on this condition is constructed in the time domain. If $C^{-1}(\beta_0, L)$ is of infinite order, as is the case for MA(q) models, a sample analog to (3.1) needs to be based on an approximation. Such an approximation can be conveniently analyzed in the frequency domain. It should be stressed however that the estimator is set up in the time domain. Let the expansion of the polynomial $C^{-1}(\beta,z)$ be $C^{-1}(\beta,z) = \sum_{j=0}^{\infty} c_j^{(-1)} z^j$. The sample analog of the moment restriction is then given by

$$G_n(\beta,a) = \frac{1}{n} \sum_{t=1}^{n} z_t \sum_{j=0}^{t-1} c_j^{(-1)} y_{t-j} \qquad (3.2)$$

for all $a \in A$. From (3.1) we see that $z_t$ has to be approximated as well. Discussion of this issue will be delayed to Section 5 where an optimal instrument is considered. For the time being it is therefore assumed that $z_t$ is known.

In the frequency domain the analog of (3.1) is

$$\int_{-\pi}^{\pi} C^{-1}(\beta_0, e^{-i\lambda}) f_{yz}(\lambda)\, d\lambda = 0$$

where $f_{yz}(\lambda) = (2\pi)^{-1} \sum_{j=-\infty}^{\infty} \gamma_{yz}(j) e^{-i\lambda j}$ and $\gamma_{yz}(j) = E\, y_t z_{t-j}$. We set $G(\beta,a) = \int_{-\pi}^{\pi} C^{-1}(\beta, e^{-i\lambda}) f_{yz}(\lambda)\, d\lambda$.
J —IT Note that fyz (X) typically lX a complex vector valued function fyz (X) (X)dX is real valued. )fy Z is note that f*n C~ ((3,e We introduce discrete Fourier transforms of the data defined as 1 and for the instrument as = oJ n ,z(X) ^>n,y(X)co n ,z(—X). It is easy to -4= Y^t=i Zte~ G n {(3,a) check that G n ((3,a) = (27T)" r— Cr 1 J We follow Hansen (1982) - The cross u nt y(X) = lX {f3,e- is — Cd > -4= periodogram defined in (3.2) l [— 7r,7r] is . Also ^"=i Vte~ In ,yz(X) ltX = identical to )In yz {X)dX. , IT in defining the /3 n ltX : estimator f3 n =argmin||G n (M|| as the solution to 2 (3.3) . /3e0 Consistency arguments are complicated by the fact that the parameter space for linear time series models usually is only locally compact. Standard consistency proofs relying on compactness can therefore not be applied. Hosoya and Taniguchi (1982), Kabaila (1980), Taniguchi (1983) are assuming compactness of the parameter space to avoid consistency problems. Such an assumption imply that is an open subset is not valid in the in M. d as ARMA was shown by Stationarity restrictions case. Deistler, Dunsmuir and Hannan (1978). Huber (1967) probably when the parameter space level is is the first reference to discuss consistency of A/-estimators not compact. The formulation there is in terms of low assumptions on the criterion function and the data generating process which are not readily adaptable to the present situation. Hannan's (1973) original paper provides a ARM A consistency proof for the estimators of an model without assuming compactness. Unfortunately, his technique for the Gaussian estimators does not readily generalize to General consistency results are obtained by the current context. Zaman Pollard (1989) and (1989). The Wu and (1981), Pakes stochastic equicontinuity arguments underlying these proofs are not applicable in our context due to the discontinuities of the criterion function on the boundary of the parameter space. 
One of the problems is that the criterion function does not necessarily converge on the compactification $\bar{\Theta}$. The consistency proof used here therefore proceeds by establishing almost sure bounds for the criterion function along convergent sequences in $\Theta$. It is then possible to circumvent uncertainty by analyzing convergent subsequences on an outcome by outcome basis. This method was used by Brockwell and Davis (1987, p. 384) to prove consistency for estimators for the ARMA model based on quadratic criterion functions. The details of their proof rely heavily on nonnegativity properties of quadratic forms. For the IV estimators considered here such arguments are not available and a new proof is presented.

We start by making the following assumptions. Unless otherwise stated all conditions are for $a \in A$, $a$ fixed.

Assumption C1. The sequence of estimators $\hat{\beta}_n \in \mathbb{R}^d$ is defined by (3.3).

Assumption C2. Let the sets $B_k(\beta_0)$ for $k = 1, 2, \ldots$ form a countable local base³ around $\beta_0$. The sets $B_k(\beta_0)$ can be taken as the set of balls with rational radius centered at $\beta_0$. Let $z_t = \sum_{k=1}^{\infty} a_k \varepsilon_{t-k}$ a.s. where $\varepsilon_{t-k}$ satisfies Assumption (A1). Let $A'$ be the set of all sequences $\{a_k\}_{k=1}^{\infty}$ such that

$$A' = \left\{ a \in A \;\middle|\; \liminf_{n} \inf_{\beta_n \in B_k^c(\beta_0)} \|G(\beta_n, a)\| > 0 \text{ for } k = 1, 2, \ldots \right\}$$

where $B_k^c(\beta_0)$ are the complements of $B_k(\beta_0)$. Assume that $A' \neq \emptyset$.

Remark 6. Assumption (C1) is the definition of the estimator. We show in the consistency proof that $\|G_n(\hat{\beta}_n, a)\|$ vanishes almost surely, which is implied by the assumptions on $G_n$. (C2) is a familiar identification condition which makes sure that the expectation of the criterion function is bounded away from zero outside a neighborhood of the true parameter. However this condition does not hold for all $a \in A$. We therefore define the subset $A'$ of instruments that satisfy the identification condition. We require that this set be nonempty.
Condition (C2) strengthens Assumption (B2) by requiring that $\liminf \|G(\beta_n, a)\| > 0$ holds. Condition (C2) requires in addition that the identification condition holds on the entire parameter space. This imposes restrictions on $z_t$ or $a$. A complete description of the set $A'$ is possible for a given parametric class $C(\beta,z)$. A characterization will be given for the ARMA case.

³ A collection $\mathcal{B}$ of open subsets of a space $X$ is called a base if for each open set $O \subset X$ and each $x \in O$ there is a set $B \in \mathcal{B}$ such that $x \in B \subset O$. A collection $\mathcal{B}_x$ of open sets containing a point $x$ is called a local base at $x$ if for each open set $O$ containing $x$ there is a $B \in \mathcal{B}_x$ such that $x \in B \subset O$. Every metric space has a countable base at each point (see Royden (1988), p. 175).

Lemma 3.1. Assume (A1), (B1-B3), (C1-C2). Let $z_t = \lim_{m\to\infty} A_m' \varepsilon_t^m$ a.s. with $A_m' = [a_1, \ldots, a_m]$, $\{a_k\}_{k=1}^{\infty} \in A'$ and $\varepsilon_t^m = [\varepsilon_{t-1}, \ldots, \varepsilon_{t-m}]'$. Then the estimator defined by $\hat{\beta}_n = \arg\min \|G_n(\beta, a)\|$ is consistent, $\hat{\beta}_n \to \beta_0$ almost surely.

Consistency of the IV estimator depends both on restrictions on the parameter space and the instruments $z_t$. Assumption (C2) restricts the class of allowable instruments. The conditions given are necessarily high level without further restrictions on the function $C(\beta,L)$. For practical purposes it is however important to characterize the set of instruments $A'$ leading to consistent estimators. In the case of an ARMA(p,q) model it is possible to give conditions on the sequences $a \in A'$. This is done in the next proposition.
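Proposition 3.2. Assume $C(\beta,L) = \theta(L)/\phi(L)$ is an ARMA(p,q) lag operator and the parameter space $\Theta$ satisfies Assumption (B4). Let $S = \mathrm{sp}\{x \in l^1 : \phi_0(L)x = 0\}$ be the span of linearly independent solutions to the difference equation $\phi_0(L)x = 0$. Define $A^{\perp} = \{x \in l^1 : A'x = 0\}$ for $A = [a_1, \ldots]'$ and $a \in A$. Let $d = p + q$. Then the following conditions are sufficient for $a \in A'$, distinguishing the cases $q < p$ and $q > p$: $A_d = [a_1, \ldots, a_d]'$ nonsingular and $\sum_{k=1}^{\infty} a_k \neq 0$ in one case; $A = [a_1, \ldots]$ of full row rank, $A^{\perp} \cap S = \{0\}$ and $\sum_{k=1}^{\infty} a_k \neq 0$ in the other [the assignment of the two sets of conditions to the two cases is illegible in the source].

Remark 7. Proposition (3.2) shows that ARMA models can be consistently estimated by instrumental variables techniques provided that the instruments satisfy the specified restrictions. The condition $\sum_{k=1}^{\infty} a_k \neq 0$ is only needed to avoid problems at the boundary of the parameter space and can be ignored if $\Theta$ is restricted to a compact subset of $\mathbb{R}^d$.

We now state additional assumptions that are sufficient to establish a result for the limiting distribution of $\sqrt{n}(\hat{\beta}_n - \beta_0)$. Introduce the notation $\eta(\beta,\lambda) = \partial \ln C(\beta, e^{-i\lambda})/\partial\beta$ and $b_k = (2\pi)^{-1} \int_{-\pi}^{\pi} \eta(\beta_0,\lambda) e^{i\lambda k}\, d\lambda$. It follows immediately that $b_{-k} = 0$ and $b_0 = 0$. Let $l_a(\lambda) = \sum_{k=1}^{\infty} a_k e^{-i\lambda k}$ and define the matrices $P_m' = [b_1, \ldots, b_m]$, $A_m' = [a_1, \ldots, a_m]$ and

$$\Omega_m = \begin{bmatrix} \sigma(1,1) + \sigma^4 & \cdots & \sigma(1,m) \\ \vdots & \ddots & \vdots \\ \sigma(m,1) & \cdots & \sigma(m,m) + \sigma^4 \end{bmatrix}. \qquad (3.4)$$

It is easy to check that $\lim_m P_m' A_m = (2\pi)^{-1} \int_{-\pi}^{\pi} \eta(\beta_0,\lambda)\, l_a(-\lambda)'\, d\lambda$. The following conditions are needed to prove the existence of a limiting distribution of $\hat{\beta}_n$.

Assumption D1. $\sqrt{n}\, G_n(\hat{\beta}_n, a) = o_p(1)$.

Assumption D2. Define $A'' \subset A$ as $A'' = \{a \in A \mid \det \int_{-\pi}^{\pi} \eta(\beta_0,\lambda)\, l_a(-\lambda)'\, d\lambda \neq 0\}$. Assume that $A' \cap A'' \neq \emptyset$.

The limiting distribution of the instrumental variables estimator is stated in the next theorem. For notational efficiency define

$$\lim_m \sigma^{-4} (P_m' A_m)^{-1} A_m' \Omega_m A_m (A_m' P_m)^{-1} = \sigma^{-4} (P'A)^{-1} A' \Omega A (A'P)^{-1}.$$

This notation will be justified in the next section in terms of operators on infinite dimensional spaces.

Theorem 3.3. Assume (A1), (B1-B3), (C1, C2) and (D1, D2). Let $z_t = \lim_{m\to\infty} A_m' \varepsilon_t^m$ with $A_m' = [a_1, \ldots, a_m]$, $\{a_k\}_{k=1}^{\infty} \in A' \cap A''$ and $\varepsilon_t^m = [\varepsilon_{t-1}, \ldots, \varepsilon_{t-m}]'$. Then the estimator defined by $\hat{\beta}_n = \arg\min \|G_n(\beta, a)\|$ has a limiting distribution given by

$$\sqrt{n}(\hat{\beta}_n - \beta_0) \xrightarrow{d} N\left(0,\ \sigma^{-4} (P'A)^{-1} A' \Omega A (A'P)^{-1}\right).$$

Proof. See Appendix B.

Remark 8.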
If $\hat{\beta}_n$ is obtained from minimizing a Gaussian PML criterion function then the asymptotic covariance matrix is $\sigma^{-4} (P'P)^{-1} P' \Omega P (P'P)^{-1}$. Such an estimator therefore corresponds to an IV estimator where $A = P$. This shows that Gaussian estimators have the interpretation of inefficient IV or GMM estimators when the innovations are conditionally heteroskedastic.

The main result of the paper will now be developed in two steps. We first obtain a lower bound for the covariance matrix

$$\sigma^{-4} (P'A)^{-1} A' \Omega A (A'P)^{-1} \qquad (3.5)$$

in the next section. This lower bound is then used to construct an optimal instrumental variables estimator.

4. Covariance Matrix Lowerbound

Finding a lower bound for (3.5) poses certain technical difficulties having to do with the infinite dimensional nature of the instrument space. We investigate the properties of the fourth order cumulant matrix $\Omega_m$, first by holding $m$ fixed and then by looking at a related infinite dimensional problem. In particular we establish that the infinite dimensional operator $\Omega$, associated with $\Omega_m$ in a way to be defined, has a well behaved inverse. We first briefly discuss the properties of $\Omega_m$ for all fixed $m$. This is done in the next Lemma.

Lemma 4.1. Let $\Omega_m$ be defined as in (3.4). Then, $\Omega_m^{-1}$ exists for all $m$.

Proof. See Appendix B.

Invertibility of $\Omega_m$ for all $m$ however is not enough to show that $\Omega$ is invertible. We review the theory of invertible operators (see Gohberg and Goldberg (1980), p. 65). For two Banach spaces $B_1$ and $B_2$ denote the set of bounded linear operators mapping $B_1$ into $B_2$ by $L(B_1, B_2)$. Then $A \in L(B_1, B_2)$ is invertible if there exists an operator $A^{-1} \in L(B_2, B_1)$ such that $A^{-1}Ax = x$ for all $x \in B_1$ and $AA^{-1}y = y$ for all $y \in B_2$. Let $\mathrm{Ker}\,A = \{x \in B_1 : Ax = 0\}$ and $\mathrm{Im}\,A = \{Ax : x \in B_1\}$. Then $A$ is invertible if $\mathrm{Ker}\,A = \{0\}$ and $\mathrm{Im}\,A = B_2$.
Following Hanani, Netanyahu and Reichaw (1968) we now choose $B_1, B_2$ as linear spaces whose points are sequences of real numbers denoted by $x = \{x_1, x_2, \ldots\}$ and $y = \{y_1, y_2, \ldots\}$. Define the norm $\|x\|_2 = (\sum_{i=1}^{\infty} |x_i|^2)^{1/2}$. Then $B_1 = B_2$ is the space of all sequences that are bounded under the norm $\|\cdot\|_2$ and is denoted by $l^2$. An operator $A: l^2 \to l^2$ is defined by the infinite dimensional matrix $A = (a_{ij})$, $i, j = 1, 2, \ldots$, such that $y = Ax \in l^2$ for all $x \in l^2$. This can be written element by element as $y_i = \sum_{j=1}^{\infty} a_{i,j} x_j$ for all $i$. The operator $A$ is invertible if $x = \{0, 0, \ldots\}$ is the only solution to $Ax = 0$ and $\mathrm{Im}\,A = l^2$. Note that $l^2$ is a Hilbert space with inner product $(x, y) = \sum_{j=1}^{\infty} x_j y_j$. From Theorem 11.4 in Gohberg and Goldberg (1980) it follows that $\mathrm{Ker}\,A^{\perp} = \mathrm{Im}\,A$ for a self adjoint operator $A$. It is thus enough to show $\mathrm{Ker}\,A = \{0\}$ for $A$ self adjoint.

Consider now the following infinite dimensional operator associated with $\Omega_m$. Define the operator $\Omega$ component-wise by its image $b = \Omega x$ for all $x \in l^2$ by $b_i = \lim_{m\to\infty} \sum_{j=1}^{m} \alpha_{i,j} x_j$ where $\alpha_{i,j}$ is defined in (2.3). In other words $\Omega$ is the infinite dimensional matrix such that any left upper corner sub matrix of dimension $m \times m$ has the same elements as $\Omega_m$. We use arguments similar to the ones in the proof of Lemma (4.1) to establish invertibility.

Lemma 4.2. Let $\Omega_m$ be defined as in (3.4). Then $\Omega \in L(l^2, l^2)$ and $\Omega^{-1}$ exists.

Proof. See Appendix B.

Remark 9. The fact that the image of $\Omega$ is square summable, i.e. $\Omega x \in l^2$, depends on the summability properties of $\sigma(k,l)$. The interpretation of the summability condition is that the instruments $\varepsilon_{t-k}$ become unrelated in their fourth moments as the time spread between them increases.

By the Closed Graph Theorem (Gohberg and Goldberg (1980), Theorem X.4.2) it also follows that $\Omega^{-1}$ is bounded, i.e., $\|\Omega^{-1}\| = \sup_{\|x\| \leq 1} \|\Omega^{-1} x\|_2 < \infty$. Thus $\sup_{i,j} |\omega_{ij}| < \infty$ where $[\Omega^{-1}]_{ij} = \omega_{ij}$. Next, we need to establish properties of the matrix $\Omega_m^{-1}$ as $m$ tends to infinity. In particular we want to establish that the inverse $\Omega_m^{-1}$ approximates $\Omega^{-1}$ as $m \to \infty$.
Lemma 4.3. Let $\Omega_m$ be as defined in (3.4). Define $\Omega_m^{-1}$ such that $\Omega_m^{-1}\Omega_m = I_m$ for all $m$. Let

$$\Omega_m^{*} = \begin{bmatrix} \Omega_m^{-1} & 0 \\ 0 & \sigma^{-4} I \end{bmatrix} \qquad (4.1)$$

where $I$ stands for an infinite dimensional identity matrix. Then $\Omega_m^{*-1}$ exists and $\|\Omega_m^{*} - \Omega^{-1}\| \to 0$ as $m \to \infty$.

Proof. See Appendix B.

Remark 10. Lemma (4.3) provides an algorithm to approximate the infinite dimensional inverse $\Omega^{-1}$.

We define the $d$ dimensional product of sequence spaces $l_d^2 = l^2 \times \ldots \times l^2$. Define the infinite dimensional matrix $P = [b_1, \ldots]'$ by stacking elements of the sequence $\{b_k\}_{k=1}^{\infty} \in l_d^2$. Introduce notation for the reverse operation of extracting a sequence from the rows of a matrix by defining $b(P) := \{b_k\}_{k=1}^{\infty}$. Define the matrix $\Sigma = (P'\Omega^{-1}P)^{-1}$. Using this notation we can state our next theorem which establishes a lower bound for the covariance matrix.

Theorem 4.4. For any $a \in A''$ let $A = [a_1, \ldots]'$ and $P$ as previously defined. Then the matrix $(P'A)^{-1} A' \Omega A (A'P)^{-1}$ satisfies

$$(P'A)^{-1} A' \Omega A (A'P)^{-1} - (P'\Omega^{-1}P)^{-1} \geq 0$$

where $\geq 0$ stands for positive semi-definite.

Proof. See Appendix B.

Remark 11. If $a \in A' \cap A''$ then $\sigma^{-4}(P'A)^{-1} A' \Omega A (A'P)^{-1}$ is the asymptotic covariance matrix of an estimator based on $a$. However, it is important to point out that the lower bound is for IV estimators in the class of all instruments which are linear functions of the innovation process and have an innovation filter in $A''$. The construction of the lower bound does not involve consistency restrictions for the instruments. In order to construct an efficient estimator in practice it has to be established that the optimal instrument does in fact satisfy consistency restrictions.

5. Optimal Instrumental Variables Estimators

Theorem (4.4) immediately leads to the construction of an optimal IV estimator. The optimal instrument is determined by the linear filter defined by the rows of $A' = P'\Omega^{-1}$. It is not a priori true that the optimal filter also results in a consistent estimator. However for important parametric examples such as the ARMA class this is indeed the case.
Theorem 5.1. Assume $C(\beta,L) = \theta(L)/\phi(L)$ and the parameter space $\Theta$ satisfies Assumption (B4). If $A' = P'\Omega^{-1}$ then the sequence $a = b(\sigma^4 P'\Omega^{-1})$ satisfies $a \in A' \cap A''$.

Theorem (5.1) together with Theorem (3.3) and Theorem (4.4) establish that the IV estimator for the ARMA model constructed with instruments satisfying $A' = P'\Omega^{-1}$ achieves a lower bound of the same type as in Hansen and Singleton (1991) but under the weaker martingale difference sequence assumptions on $\varepsilon_t$ detailed in Assumption (A1).

Feasible versions of the optimal IV procedure have to be based on approximations of the optimal instrument $z_t$. Such approximations replace unobserved $\varepsilon_t$ by observed residuals $\hat{\varepsilon}_t = y_t - \sum_{j=1}^{t-1} c(\beta_0, j)\, \hat{\varepsilon}_{t-j}$ for $t = 1, \ldots, n$ where $\hat{\varepsilon}_1 = y_1$. Feasible versions of $\hat{\varepsilon}_t$ are obtained by substituting for $\beta_0$ a first stage consistent estimator $\tilde{\beta}$. Gaussian PMLE procedures which are consistent but inefficient in our context can be used to generate first stage estimators. Instruments are then given by $\hat{z}_t = \sum_{j=1}^{t-1} a_j \hat{\varepsilon}_{t-j}$. The empirical analog of the moment restriction now becomes

$$\tilde{G}_n(\beta,a) = \frac{1}{n} \sum_{t=1}^{n} \hat{z}_t \sum_{j=0}^{t-1} c_j^{(-1)} y_{t-j} \qquad (5.1)$$

where $c_j^{(-1)}$ are the coefficients in the expansion of $C^{-1}(\beta,z)$. An algebraically equivalent formulation of (5.1) is given by

$$\tilde{G}_n(\beta,a) = (2\pi)^{-1} \int_{-\pi}^{\pi} C^{-1}(\beta, e^{-i\lambda})\, h(\beta_0,\lambda)\, I_{n,yy}(\lambda)\, d\lambda$$

where $I_{n,yy}(\lambda)$ is the data periodogram and the filter $h(\beta_0,\lambda): [-\pi,\pi] \to \mathbb{C}^d$ is defined as $h(\beta_0,\lambda) = l_a(-\lambda)\, C^{-1}(\beta_0, e^{i\lambda})$ with

$$l_a(\lambda) = \sum_{j=1}^{\infty} a_j e^{-i\lambda j}.$$

The coefficients of the optimal instrument are given by

$$a_j = \sum_{k=1}^{\infty} b_k\, \omega_{kj}$$

where $b_k$ is the Fourier coefficient of the derivative of the log spectral density of $y_t$ and $\omega_{kj}$ is the kj-th entry of the inverse $\Omega^{-1}$. The $b_k$ coefficients have simple interpretations in special parametric models. In the case of an AR(p) model for example they are equivalent to the impulse response function and can therefore be computed easily.
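To make the $b_k$ coefficients concrete, the following sketch computes them for an ARMA model as Fourier coefficients of $\eta(\beta,\lambda) = \partial \ln C(\beta, e^{-i\lambda})/\partial\beta$. It is an illustration under stated assumptions rather than the paper's procedure: the names `eta` and `b_coefficients` are hypothetical, the derivative is written out analytically using $\partial \ln C/\partial\phi_k = z^k/\phi(z)$ and $\partial \ln C/\partial\theta_k = -z^k/\theta(z)$ under the sign convention of (2.4), and the integral is approximated by an average over an equally spaced frequency grid.

```python
import numpy as np


def eta(phi, theta, lam):
    """eta(beta, lam) = d ln C(beta, e^{-i lam}) / d beta for C = theta(z)/phi(z).

    Rows are ordered as beta' = (phi_1..phi_p, theta_1..theta_q).
    """
    z = np.exp(-1j * np.asarray(lam, dtype=float))
    phi_z = 1.0 - sum(c * z ** (k + 1) for k, c in enumerate(phi))
    theta_z = 1.0 - sum(c * z ** (k + 1) for k, c in enumerate(theta))
    rows = [z ** (k + 1) / phi_z for k in range(len(phi))]
    rows += [-(z ** (k + 1)) / theta_z for k in range(len(theta))]
    return np.array(rows)                      # shape (d, len(lam))


def b_coefficients(phi, theta, m, n_grid=2048):
    """Return the m x d matrix whose k-th row is b_k' for k = 1..m.

    b_k = (2 pi)^{-1} int eta(beta, lam) e^{i lam k} d lam, approximated on a grid.
    """
    lam = 2.0 * np.pi * np.arange(n_grid) / n_grid
    h = eta(phi, theta, lam)                   # (d, n_grid)
    k = np.arange(1, m + 1)
    e_mat = np.exp(1j * np.outer(lam, k))      # (n_grid, m)
    return (h @ e_mat / n_grid).real.T         # (m, d)


if __name__ == "__main__":
    # AR(1) with hypothetical phi_1 = 0.5: b_k equals the impulse response lagged once.
    print(b_coefficients([0.5], [], 5).ravel())
```

For an AR(1) model the rows reproduce $\phi_1^{k-1}$, which is the impulse response function shifted by one lag, in line with the remark above.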
It can also be noted that the Gaussian estimators are obtained by setting a,j = bj It is shown in Kuersteiner (1997) that a sufficient condition for the validity of the special parametric models. In the case of to the impulse response function approximation that the coefficients of the instruments satisfy is oo X^'lNfcl <°ofor fc = l,...,d. (5.2) 3=1 The following lemma shows that under strengthened summability restrictions on the fourth order cumulants Condition (5.2) estimator of the ARMA(p,q) Theorem Assume C(/3,L) 5.2. sumption (B4). symmetry is optimal instrumental variables satisfied for the model. = 6(L)/4>(L) and the parameter space Strengthen Assumption (Alv) to ^2r=i ^27=1 this implies £~ 1 ]T~ 1 r \a (s, r)\ = B < oo. If A = a s \ satisfies As- = B < oo. By then a = a^'fT s r )l ( > P'9.~ l 1 ) satisfies (5.2). G G Feasible versions of the optimal estimator are then obtained by replacing n (fi,a) by _1 lX n ((3,a) where in (/9, e ) and f3 is a consistent n (0,a) we replace h(P ,X) by ^(A)C G first stage estimate. The challenging part moments additional restrictions on the In that particular case it is is to estimate l^(X) consistently. For a case with of et this has been done in Kuersteiner (1997). possible to estimate l^(\) consistently without the need to introduce bandwidth or truncation parameters. that in that particular case In the fi _1 is The diagonal such that simplification a,j = comes from the fact bj/ctjj. more general case the elements u^j can be estimated from a sample analog of the approximation matrix $1^ defined in (4.1). Using similar arguments as in Kuersteiner (1997) it can be shown that the elements of estimated as long as m/y/n — the truncated estimate of the The development » 0. a,j this matrix can be uniformly consistently From Cl^ we can obtain coefficients by setting a,j estimates of = is beyond the scope future research. 
13 We then form 5Zfc=i ^k^kj- bandwidth selection paper and will be left for of a fully feasible estimator requires an optimal procedure for the parameter m. This u>kj. of this A. Appendix The following Lemma Lemmas - Lemmas are used to derive the asymptotic distribution of the A.l. Under Assumption (Al) for each mG {1, 2...} m fixed, , IV estimators. the vector n 1 E JV(0,fi) £ t £t-l, ...,£*££- i=l with We Proof. a(m, cr(m,m) 1) note that individually Now ferences. a(l,rri) — 'm & ct(1,1)+ct 4 = define Y{ all show that y =Q _1 / 2 t y s and Rm G for all I t 'm — ] = - 1 Then t are martingale dif- \ we have -j^YL^ Yt =* ^(0, !) where now be easily evaluated to is E{Y also efet-iej- a(l,l)+a 4 cr(l,m) £*£t-i£t- 2_2 els -t -m aim, cr(m,m) -E' Next we note that t for any £ G Mm such that linearity of the conditional expectation therefore apply a martingale the m 2 ere c t-i t -.2 ' This 1 so that Yt is Tt-i) = To show that -4= Yl Yt => N(0,Q) it is enough such that tl Q = EY Y(. > the terms etSt-k with k [etet-i, ...,etet- a vector martingale difference sequence. to + a4 sum ^2 t Ynt = -4= Yl t Y t . CLT I i = 1 1, ^ fixed, £ and the fact that and Heyde (see Hall ] m , is t fixed is CLT is <r 4 a martingale by and We finite. Theorem 1980, Checking the conditions of the F + 3.2, can p. 52) to straightforward and therefore omitted. Lemma Assume A. 2. Let In yz (A) = u} n ^(\)u> niZ (-X). In<eE (A) is the periodogram of{e\, ,e n } Assumption (A\) and that y = Y2T=o' l; 3 £ t-3 w^h ^(A) = Sjio V j e_lAj . . . ^ e t satisfy ; x t ^ such that YITLq \J\ l^jl < °°so ^ et z t = Y^T=o ai e t-j with a € A. Let q (.) be a function on [— 7r,7r] —» C with absolutely summable Fourier coefficients {c/j,— oo < k < oo} such that q (A) P asn-t = Ejt-oo Cje~ iXj Then v^(27r) ^ In,yz (A) c (A) n n,e>0 any r /n dA - , £2 (A) C(/? , AX (A) dX >V <e oo. Proof. First an expression -i/2 for ^™ y t e~ lXt for i?n (A) = In yz ^ (X)—In ,ez (A) ip(X) is obtained. 
Let u^j, (A) = be the discrete Fourier transform of the data. Then u n>y (A) = zKAK, £ (A) + n-^J^iPj^Unj 14 (A) (a.i; . Unj where = ET=l-j £te~ iXt ~ ET=1 £ (A) « e_tAt such that oo Rn (A) := - In ,yz (A) = u> 2 </>(A)In EZ (A) , (-A) n" 1 /2 ^^e~ 2Aj ^ (A) j=0 Note that (2tt) -1 J' Rn Then using the Markov E\fn -1 Lemma term is aj-et-j l it is YlkLi enough to consider / < J2fc=i \°k\ <E A Then any for 0. It (27T)- also follows that etZt oo oo fc=l /=0 m=-oo V ^^ °° and £'Jn e2 (A) is eR d A TV ( r* / ? dX , •'— T Proof. First note that oo 1/2 0, j\ In>EZ (A) dA = < l^/l e t satisfy ^izm\\l\^0 °° " Assumption (j4l)and such that l'i=\, °° °° J^afc/afcai l_i /=1 \ \a k J2 H^o Kl = w n ,e(A)u; n]Z (— A). Assume (A) with a n 1/2 (27T)" 1 E£o Et=i Em=-oo aki>ic™.£t-k (er-l - £n-M-r) <2supa^ dA bounded from A. 3. Let 7ni£z = J]~ let 2t inequality i?„ (A) ^ (A) (2ir) since the last = n~ dA (A) ? (A) ; fc =l 1 ^ 1 /2 (2n)~ 1 [*n Jn ez (A) =1 e t z t such that En a martingale difference sequence. However z t = Y2kL\ a k&t-k 1 n" 1 , Lemma (A.l) is not possible. zj = YT=i a k£t-k such that lo™ z {\) = n" such that a direct application of For a fixed and I™£z (\) that for all e m we introduce = u> n ^(X)u>™ > z 1 (— A). From Billingsley (1968, Theorem 4.2) 1/2 it is J*=1 z enough m e - lAfc t to show 0, lim limsup P » 1/2 / nC 2 (A)-/n £2 (A))dA > e , where /7f Tl £'(i^ (A) - /„, e2 (A))dA = n- 1 ' 2 J2 OO Et a#tet-k t=l fc>m Since Ea^etSt-k — it is enough to consider n n ~ ljE E E^' afce ' £t - fc ) t=l fc>m Applying The Lemma 2 - n_1 oo oo EEE i=l takafaw -»Oasm-4 0o. j>mk>m (A.l) then gives the result. Lemmas are used in the consistency proof to show that the criterion funcnon zero almost surely when evaluated along any convergent sequence of parameter estimates that do not converge to the true parameter value. tion following is Lemma A. 4. Assumption (Bl) implies that c(@,j) Zj\c(P,j)\ <coforall(3ee. 15 = (27r) -1 JC~ 1 (P, A)e tAj dA satisfies d, Proof. 
Since C' ^^) = C~ 1 l \c((3,j)\=r From dC~ l {t3, X)/d\ = C~ 2 ((3, oo it follows that Lemma u ...,a 1 follows that it (27T)- fdC- 1 (0,X)/dXeiX^dX 1 X)dC{(3, X)/dX and the fact that C{0, X) satisfies £j \c((3,j)\ has absolutely summable Fourier coefficients. Rearrangover j then gives the result. A.5. Assume (Al), (B1-B3), (Cl, C2). e? = {a k }^=l € A' and m }, (A.2) dC~ (P,X)/dX and summing ing (A.2) {a 1 -it) {(3, [e t -l,- limm _*oo Then ,£t-m]' - = Let zt any for Xmef with A'm = G 0, Gn (P,a) -» (3 almost surely. G(/3, a) Proof. Without loss of generality assume that zt € K. Let Ey z s = j yz (t — s), and cum(y t ,zs ,y q ,z r ) = K yyzz (t — s,t — q,t — r). Then, form Assumption (Al) and the proof of Theorem 2.8.1 in Brillinger (1981) it follows that ]T] |7 y2 (j)| < oo and J^sqr K yyzz( s cl> r )\ < oo. Let Xn = G n ((3,a) — EG n {j3,a). Since EG n ((3,a) — G([3,a) as n — oo it is enough to show that Xn — almost surely. This follows from verifying the conditions of Lemma 3 in Gaposhkin (1980). Using the short hand notation Cj = c(/3,j) we have t • > \ > > > n n n n "-'£££$: EXl = [l y2 {t-r) lyz { q -s) t=l s=t+l <j=l r=t+l +"fyy(t K < n~ 2 ~ <lhzz( r ~ S )+ K yyzz {t - Y— * ' 2 = d _t K ' V— n s,t (1 * - q,t - S,t - ^)cJ2 < n j=—n r)] C s -tC - r q Km' 1 2 with Kq = = and K\ (J2T=-oc K Ej=-oo UyzOn) + Next consider °)- n\ Xn Xni = - nni X ni E(Xn - 2 ) for n/2 < n a _1 V^ Y^ z— z— ' y c s - t {y t z s - Ey + zs ) t n Ew < I «!»«(*» 9 n. First n n ni V" z— z— + lZT=-oo hyyU)\ET=~oo hzzU)\ c s _ t {y t z s - £:y t z s Km- {n - m] ' s=n\ t=s+\ 5=1 t=s+\ with E n_1 n n Yl 2 5 «-*(Vt^ - ^2/^) s=ni t=s+l \ Kon~ J2 J2 2 < I 2 ° s-t < 2 t=n\ s=t+l / Since E I \ \ it ^^ V— V h??,iL Z ' ^— c s _ t (y t z, - 5=1 t=s+l (ni /<i ~ n) n 2l n\ < o(l) almost surely Ki < Xni by 2 ) oo such that < Lemma 16 K 2 - m)n" 2 (n 3 of 2 A'!(7i- (n / E{Xn - Xn = < ) / follows that there exists a constant such that £j/ t z s ) Gaposhkin (1980) - m)) r) < Lemma A. 6. 
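As a numerical companion to Lemmas (A.4) and (A.5), the following Python sketch computes the inverse filter coefficients $c(\beta, j)$ for an ARMA(1,1) model, checks the weighted summability $\sum_j j\,|c(\beta,j)| < \infty$ in closed form, and evaluates a time domain sample moment of the form $n^{-1}\sum_t z_t \varepsilon_t(\beta)$. It is an illustration under stated assumptions, not the procedure of the paper: the sign convention $C(\beta, L) = (1 + \theta L)/(1 - \phi L)$, the residual convention $\varepsilon_t(\beta) = \sum_{j=0}^{t-1} c(\beta,j) y_{t-j}$ with $c(\beta,0) = 1$, the function names, the geometric instrument weights and the i.i.d. Gaussian innovations are all hypothetical choices made for the example.

```python
import numpy as np

def inverse_filter_coeffs(phi, theta, n_lags):
    """c_j in C^{-1}(L) = (1 - phi L) / (1 + theta L) = sum_j c_j L^j.

    Assumed (hypothetical) convention: C(L) = (1 + theta L) / (1 - phi L).
    """
    c = np.empty(n_lags)
    c[0] = 1.0
    if n_lags > 1:
        c[1] = -(phi + theta)
    for j in range(2, n_lags):
        # from (1 + theta L) c(L) = 1 - phi L: c_j + theta * c_{j-1} = 0
        c[j] = -theta * c[j - 1]
    return c

def residuals(y, phi, theta):
    """eps_t(beta) = sum_{j=0}^{t-1} c_j y_{t-j}, truncated at the sample start."""
    n = len(y)
    c = inverse_filter_coeffs(phi, theta, n)
    return np.convolve(y, c)[:n]

def sample_moment(y, beta, beta0, a):
    """n^{-1} sum_t z_t eps_t(beta) with z_t = sum_{j>=1} a_j eps_{t-j}(beta0)."""
    n = len(y)
    eps0 = residuals(y, *beta0)
    eps = residuals(y, *beta)
    # leading zero so that a_j multiplies eps_{t-j} for j >= 1 only
    z = np.convolve(eps0, np.concatenate(([0.0], a)))[:n]
    return float(np.mean(z * eps))

def simulate_arma11(n, phi, theta, rng, burn=200):
    """y_t = phi y_{t-1} + e_t + theta e_{t-1} with i.i.d. N(0,1) innovations."""
    e = rng.standard_normal(n + burn)
    y = np.zeros(n + burn)
    for t in range(1, n + burn):
        y[t] = phi * y[t - 1] + e[t] + theta * e[t - 1]
    return y[burn:]

rng = np.random.default_rng(0)
beta0 = (0.5, 0.3)                       # (phi, theta), illustrative values
y = simulate_arma11(5000, *beta0, rng)
a = 0.5 ** np.arange(1, 21)              # illustrative instrument weights a_1..a_20

# Weighted summability of the inverse filter coefficients (cf. Lemma A.4)
c = inverse_filter_coeffs(*beta0, 200)
weighted_sum = float(np.sum(np.arange(200) * np.abs(c)))

g_true = sample_moment(y, beta0, beta0, a)       # close to zero at beta0
g_off = sample_moment(y, (0.1, 0.3), beta0, a)   # bounded away from zero
print(weighted_sum, g_true, g_off)
```

In this setup the moment evaluated at the true parameter is small, while the moment at the misspecified autoregressive coefficient settles near a nonzero population value, which is the behaviour the consistency argument relies on. The instrument weights here are arbitrary; the optimal weights of Section 5 would require the fourth order cumulant structure.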
Assume (Al), (B1-B3), (C1-C2). ande™ = Let z = lim™-^ A m ef t Then Am = with any convergent sequence/3n 6 [et-i, ,£t-m] m {ofc}^! G 9 with p n —> /3 € 9 there exists an event E with probability one such that i) if (3 £ 9 then for all outcomes in E, G n (/3 n ,a) — G(P,a) and ii) if (3 € 9 then liming ||G„(/? n ,a)|| > for all outcomes in E. -4' [oi, ...,a , ] • for > Remark sequence (5 n G The behavior 12. It is . of n (f3 n ,a) as P n approaches the boundary depends on the therefore not possible to describe the limit of n (P n ,a) for all con- G vergent sequences. Possible behavior includes convergence to a constant, divergence as a random variable which can have unit root random behavior. All we need to show however is that \\G n (P n ,a)\\ stays away from zero for large enough n with probability one. The idea is therefore to distinguish between random and nonrandom limits and to show that nonrandom limits involve constants that are bounded away from zero. nonrandom function ofn, convergence to a limit or near unit root asymptotics or explosive Proof, \\l3 n For each i) > e -P\\<6 by continuity of will {j3' , z) C -1 (/3, A) at f5 Xn (p) = G n (p,a)-EG n (P,a) € 9. For such that /3 c' = c(/3',j). and define ||/3' — letting Xn < = sup,, < such that Yl'jLiJ loss of generality Xn = sup,, \Xn {0)\ ,, , . | c(/?', j ) < I We oo. assume z £ K. Let t Since EG n (p',a) Xn — show that to polynomial 6 the lag c((3',j) - G(P,a)\ < e f \fyz (X)\ d\ it is enough Xn (j) = %%'{ y zt+j - Jyz (-j) > -» almost t ^ n" 1 sup ' IgJI \Xn (j)\ < Kon- ,_,,(£~ ^j" 1 2 \X n (j)\ 2 [ j=0 ||/3'-/3|<« where if P\\ Without G(P',a) and |G(/?',a) Thus uq A has an expansion with coefficients use the short hand notation surely. > such that for n , |/3'-/3|<5 C~ > oo and 8 Bup|C~ 1 G9 ,A)-C- 1 09,A)|<€ sup l < there exists an no and \j=0 j\. From lE'ATnl < (EX%) 1/2 we consider EXl<K*n-*Y:r 2 (EXn (tf). 
3=0 Since EXn {j) 2 <nJ2\lyy(kh z M+lyAkh yz (k)\+nY,\K4(3,k,l)\ j,k,l for all j there is consider Xnni = a Xisuch that sup \\P -p\\<6 EX£ < \Xn (p') K2T1 where - Xni (P')\ K-i = ^K\Kq. such that 1/2 ni Xn ni , K (n - ni ) (nn,)- 2 1 \ J^J' \Xni {j)\' J=0 17 For n/2 < n\ < n n—j \.t=max(n 1 -j,l) ii=0 Now ^(n-ni) 2 (nni) 2 E^Tj 2 \Xni 2 (j)\ < K 2 (n -n )n x 2 3=0 and 2 n— 3=0 together with now ni). It y=max(ni— j',1) £ (K + Z) 2 < £F 2 +2 (EY EZ 2 Fatou's Lemma follows from For part e was Gaposhkin Let c™ \\G n (p n ,a) > || almost surely where lim := lim inf„ . By 0). > and lim V ar(G n (0 n limVar(G„(0 n ,a)) > implies , l ((3 n X)e = i the Toeplitz lX] , > E"=o(^i/V") Xn < hmP(||G„(/? n ,a)|| 2 = 0) cases that need to be considered are lim V'ar(G n .(/? n a)) = JC- 2 that for By > Lemma The only two that +EZ 2 implies that £X 2 ni < K n~ 2 (nalmost surely. The (1980) that X n — 2 ) arbitrary. we show that lim ii) 3 in < P(lim||G n (/? n ,a)|| 2 = 0. 1/2 2 then follows since result / d\. For the first b y the second inequality in (A. 3). Let to? E, £S=i+j utffoi-j* Lemma it then follows ££;«? £E=i+;W-j*t = Op (l). leads to F(||G n (/? n ,a)|| that lim \\EG n (/3 n ,a)\\ 1 P(!|G n (/3 n ,a)|| 2 case note that < e) — > 0, since > < e) 7 y,C?')) we that Ylj=i w]'^yzU) This implies that for for any any e e < > < P(\\Gn ((3n ,a) - EXn have * such c™/J]"=0 and £X = O^" 2 1 ) Z}j=i7yz(i) sucn that G n (f3 n ,a) = Op (Z]= oo. For the EG n = ~ = .vhich second case one needs to show 2 {(3 n ,a)\\ > 2 ||£G*„(/? n ,a)|| - e) such that the result follows from the Markov inequality after taking lim on both sides. To see this we first show that = ]mVar(G n (l3 n ,a)) <£» \\mY™=Q (cy/y/n) 2 = 0. Now <(= follows immediately from n— 1 Ti—l I . n—1 < Var(G n (P n ,a)) = £(- £c?X„(j)) 2 < ^(^/V^) 2 -Ej^Xn (j) 2 II j=0 - 0. (A.3) 3=0 3=0 < iim^"=0 (c"/ v/n) 2 < oo. 
Then there exists a subsequence n k with n k < n k+ and k —> oo such that < YJj=o(^j k \/^k) 2 < °°> V ar (G nk ((3 nk a )) only if 3a 6 2 V/c. Note that Var(G nk (0nk ,a)) = exists and Var(G nic (p nk ,a)) > such that y = Ej=i ajUt-j almost surely. But this is impossible since there is no To show assume =4> i I' , I t x € 2 I such that e t = Yji=i xj £ t-j almost surely (see the Proof of 18 Lemma 4.2). This , a) hmVar(G n ((3n ,a)) = 0. lim inf \\EG n (0 n a) - G(/3n a) f = Therefore liminf ^IjLo^j/v^) 2 contradicts implying that since , , = 1/2 2 \EG n ((3 n ,a)-G(P n ,a)\\ < (^jr\%\ 2 j>n \ / 1 / \ I 1/2 \ V~^ £j KzH M .\||2 / \j>n J At the same time lim ||G(/3„,a)|| > by the identification assumptions such that lim 2 = 0) = 0. 0. This shows that hjn.P(||G n (/? n ,a)|| Appendix B. Proof of Proofs - Lemma <liminf n From Lemma From 3.1 2 (A. 5) it n If E the definition of <limsup ||G n (/? n ,a)|| n = ||G n (/? n ,a)|| ' follows that (3 n it 2 2 <limsup ||G n (/? n ,a)|| G n (Po,a) follows that limsup Let ||£'G T ,.(/3 T1 \\G n (0 o ,a)\\ n = —> G(/3 ,a) almost surely. Thus = lim ||G n (/? n ,a)|| n (B.l) . almost surely. (B.2) be the probability one event in Lemma (A. 6). Now consider the sequence /3 n G 0. there exists a subsequence /3 n (3 Q then by compactness of Pn does not converge to such that — (3 nk > (B.2). Therefore j3 n /3 Lemma G O. By fyz (X) = nfc (/?nfc > ,a)|| a.s. contradicting ->P .* Proof of Proposition 3.2 We that G (A. 6) liminffc only prove that Assumption (C2) holds. = lX C(f3 ,e- )l a (-\) where l a (X) G(0,a) = {2*y r l J— We first note € such zXk such that T,kLi a k^ lX i>((3,e- (\)d\. )l a TV Letting V(/?o,eit is clear that ip((3 ,e~ We lX ) need to show that that G(P, a) = 0. Now for = a = G- C(0,e~ any (/?,e- a )G(/3 = such that G((3 Q ,a) 1 for 1 ) /3 and denominator degrees equal ) = 9(e' iX lA ) 0. )/<p(e~ iX ) there lX the polynomial tp(0, e~ ) € to iX ,e- p + q. 
The is is no other j3 rational with nominator orthogonality conditions can now be written as (27T)- 1 p {i>{(3, e-' A ) - l)/ a (-A)dA = (B.3) 0. J — 7r We want to show that the only function tp(., e~ %x — 1 ) : [—it, it] — C satisfying this condition > is ^(.,eIf the assumptions of is /? Lemma (3.2) iA )-l = hold then the only value - 19 (B.4) 0. for which ip((3, e~ lX ) — 1 = ,a)||'' Now showing that ip((3, e~ lX — l = is equivalent to showing (p(e~ lX )9o(e~ lX / (e~ tX )~ lX is not zero for any A G [— 7r,7r] for (3 G 0. Here 9(e~ lX = since this polynomial 6{e~ ~ tA ~* A -tA polynomial of an ARMA(p,p + q) process with parae e s tne a </>(e )#o( S )/^o( ) ) ) ^ ) meters <f) 01 , ...^>o, P Yl'jLo i)jZ~ j %X: * ) ^ $ii ••> ^p+9- ^ the coefficients For denote the impulse response function of this process by lX where dependence of >p+ q we denote the 9{e~ lX ) by 9 We such that (p{e~ + 1 = )9o(e~ lX / _zA (e '</> ) ) cp is ]' ^o,i^'j-i _- _ [ip [ai,...,a Q ], = implies B= be written as [ipi,...,xp and x where [a q+ i,...] Rx = = q] = xp 9q — l ...] , [ip by ip = restriction (B.5) o. and the vector of coefficients in (B.6) 0. 1 — ,...,%l> q ] 9 and [V' 9 +i> A = where Define Aq = ] by the identification conditions. such that Equations (B.5) and (B.6) can [9',0,...]' R and R22 contains the known ^o.pV'j-p -,a„ [ai, Establishing (B.4) amounts to showing that •••] suppressed for notational efficiency. then Condition (B.3) has a matrix representation a'xp [V'g+i, has a one sided Fourier expansion satisfy the well ipj dimensional vector infinite [9i,...,9 q - on xjij the coefficients i>i If '4> ) B q R22 coefficients of (B.5). The result now follows if R is of full rank. then R22 = I and R is of full rank if A q is of p = with x = tp. The then the system reduces to Bx = and #22^ = full rank. If q = = condition ^22^ is satisfied for any of the p linearly independent solutions to (B.5). 
It 1 n N(R22) = is therefore necessary that B is of full row rank and B 0, where ./V(i?22) is the p-dimensional null space of #22 and B is the orthogonal complement of the linear and span of the row vectors of B. Thus the only solution is x = 0. Finally if both q ^ > > between and consider Define we need to distinguish First ^then < q. p. q p p q p We can distinguish three cases. If y 1 D= and y 0,p-l -<k0,1 0,p C J 1 0,P J such that R where Ad D is = [ai,...,a d ], B= invertible such that \R\ R [a d +i,.-.}, = l> Ad B = C b c - [(0,C)' ,0, A d -C'D- B of full rank. On l It ...] and D conformingly. Note that can be checked that C'D~ B = l the other hand if q < p then we need if Ad is L such again that [A q ,B} D N{C,D) = 0. Then there is no element x G N(C,D), = = = required. 0as implies x such that Rx that [A q ,B]x The previous arguments can break down on the boundary of 9 because it is possible lX iX that xjj(P, e~ ) has an expansion with constant coefficients. In that case /^ %p(@, e~ )l a (-\)d\ such that is of full rank i/O 2U Kl a (0) for lim inf n some constant K. /?» C- l (Pn, therefore need to require / a (0) > e-^)fyz (X)dX Proof of Theorem 3.3 A = op (I) We familiar for mean G e /3 n and {§pG n {f3 n )]^iG n {(3n Here -MfG n ((3) 2 (27r)- the multivariate is Gn(/£) , e - lA v^(/3 n -/3 )]- <9A )/d/3/ a (A)dA and (/^-/? the matrix with rows jSfGn(P)' for mean value expansion can be made at a different point First, /^ dlnC(/3 1 e 99. ) ) M=a /? value expansion leads to = (M + op (l))[VnG„(/3 + where - n in order to insure that 7^ = i ) = op (l) for 1, ...,d. It is well i = l,...,d. known that exact by evaluating each row -gjfG n (l3) l j3 n . convergence of to gf-Gn (/?«)' M can be shown by the same arguments as con- vergence of G((3, a) noting that both y t and z t are strictly stationary and d In C{(3, e~ lX )/d(3 is uniformly continuous on [— n, n] X U for U C 0, U compact, (5 Q € 0. 
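A set $U$ with these properties exists by local compactness of the parameter space. The details are omitted. Next, turn to

$$\sqrt{n}\, G_n(\beta_0) = \sqrt{n}\,(2\pi)^{-1} \int_{-\pi}^{\pi} C^{-1}(\beta_0, e^{-i\lambda})\, I_{n,yz}(\lambda)\, d\lambda.$$

From Lemma (A.2) it follows that

$$\sqrt{n} \int_{-\pi}^{\pi} C^{-1}(\beta_0, e^{-i\lambda})\, I_{n,yz}(\lambda)\, d\lambda - \sqrt{n} \int_{-\pi}^{\pi} I_{n,\varepsilon z}(\lambda)\, d\lambda = o_p(1).$$

Using Lemma (A.3) then shows that $\sqrt{n}\, G_n(\beta_0) \Rightarrow N\left(0, \sum_{l=1}^{\infty}\sum_{k=1}^{\infty} \sigma_{k,l}\, a_k a_l'\right)$, where it should be noted that

$$\sum_{l=1}^{\infty}\sum_{k=1}^{\infty} \sigma_{k,l}\, a_k a_l' = A'\Omega A.$$

The result now follows from

$$\partial \ln C(\beta_0, e^{-i\lambda})/\partial\beta = \sum_{k=1}^{\infty} b_k e^{-i\lambda k}$$

such that $(2\pi)^{-1} \int_{-\pi}^{\pi} \partial \ln C(\beta_0, e^{-i\lambda})/\partial\beta\; l_a(-\lambda)'\, d\lambda = \sum_{k=1}^{\infty} b_k a_k'$.

Proof of Lemma 4.1

First, note that $\Omega_m$ is symmetric since $\sigma(k,l) = E(\varepsilon_t^2 \varepsilon_{t-k}\varepsilon_{t-l}) = \sigma(l,k)$ for $k, l > 0$, $k \neq l$. Then, by the Schur decomposition (see Magnus and Neudecker, 1988) there exists an orthogonal matrix $S_m$ such that $S_m'\Omega_m S_m = \Lambda_m$ for all $m$, where $\Lambda_m$ is diagonal with elements $\lambda_j^m$, $j = 1, \ldots, m$. Now, for any $x_m \in \mathbb{R}^m$, $x_m \neq 0$, we have

$$x_m'\Omega_m x_m = E\left(\sum_{i=1}^{m} x_{i,m}\, \varepsilon_t\varepsilon_{t-i}\right)^2 > 0$$

where the inequality is strict by Assumption (A1). So $\Omega_m$ is positive definite such that $\lambda_j^m > 0$ for all $j, m$. This shows that $\Omega_m$ has full rank.

Proof of Lemma 4.2

From Assumption (A1) it is clear that $\Omega x \in l^2$ for all $x \in l^2$. It remains to show that $\ker \Omega = 0$. Assume there is $x \in l^2$, $x \neq \{0, 0, \ldots\}$ such that $\Omega x = 0$. Then also $x'\Omega x = 0$, which can be written as $E\left(\sum_{i=1}^{\infty} x_i \varepsilon_t\varepsilon_{t-i}\right)^2 = 0$. But this is only possible if $\sum_i x_i \varepsilon_t\varepsilon_{t-i} = 0$ with probability one. Now $\sum_i x_i \varepsilon_t\varepsilon_{t-i} = 0$ a.s. if $\varepsilon_t\varepsilon_{t-i} = 0$ a.s. for all $i$ or the functions $\varepsilon_{t-i}$ are linearly dependent a.s. If $\varepsilon_{t-i}$ are linearly dependent then there exists $a \in l^2$ such that $\sum_i a_i \varepsilon_{t-i} = 0$ a.s. Without loss of generality $a_1 \neq 0$. If $a_i = 0$ for all $i = 2, 3, \ldots$ then $\sum_i a_i \varepsilon_{t-i} = 0$ a.s. is trivially contradicted. Now assume $a_i \neq 0$ for at least one $i = 2, 3, \ldots$. Then $\varepsilon_{t-1} = -a_1^{-1} \sum_{i=2}^{\infty} a_i \varepsilon_{t-i}$ a.s. But then $E(\varepsilon_{t-1} \mid \mathcal{F}_{t-2}) = -a_1^{-1} \sum_{i=2}^{\infty} a_i \varepsilon_{t-i}$ a.s., so that $E(\varepsilon_{t-1} \mid \mathcal{F}_{t-2}) \neq 0$ with positive probability. This contradicts the martingale difference assumption. On the other hand, if $\varepsilon_t\varepsilon_{t-i} = 0$ a.s.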
for all then e 2 e 2_ z z I Y2k a (^)OI < as an y ' Therefore the roles of k and if A; By Assumption 4.3 -^ f° r \ also — (Al) we any fixed for therefore / Also E^ are reversed. / ^ know that E^ a (k,l) — as I, > _1 Q if exists ) = = 0. and is x |c (/c,/)| /c — > oo. < ^ thus This holds < B such that a(k,k) — 5^, 5^ and 5^ according to the |cr (/c, fc oo. Define the infinite dimensional matrices > 2 can only hold . Lemma Proof of = But then E(e 2 e 2_ l a.s. Qx = which contradicts Assumption (Al). Therefore Thus kerfi = 0. Symmetry of f} now implies that Imfi bounded on 2 for all = l k)\ > following partition Cm, = fi Then ir(5^5f2 <j 4 J) — as > Eoo ! m 00. m+ i£ZLik(M)l Then j '771 °12 °21 °22 cm ir(52"}52- s 0, and tr(5g- ct 4 /) (5^- define the infinite dimensional approximation matrix Q. r Clearly l m Vt* exists Mm by Lemma (fi- 1 and the partitioned inverse formula. (4.1) -0^- 1 ) = fi^- 1 We now have (fi-^)Sl- 1 such that -1 llfi - Q* _1 ll 777 _1 < -— llfT 771 ll where ||.|| is the matrix norm defined by \\A\\ = is bounded. By the partitioned inverse formula \\Q - Q* 777 supi| x |i llQ II _1 ll ||j4x|| <x 2 . ' First show that Hf^" 1 !] a -i a~ 4 I such that eigenvalue 1/A™ < 00 l|Q _1 || < *-H Wl < + <t~ 4 We have shown in Lemma (4.1) that the smallest \™ of Q m is nonzero. Then by a familiar inequality for all x G R m x Q^x/x x < \\Q Vm 00 by such that Lemma . H^m (4.2) 1 !] < 00 since for finite m all norms are equivalent. and 1/2 00 lift - ft; = sup y^ llxIKl 1 ,n <j(k,i)xi + 00 E E fc=m+l z=i 1 22 a (k, I) xi Also oo < sup Y^ y~" INI<1 fc= i J=m+ i \^ \^ \a (k, l)\ — 2 (=m+l Thus H^- 1 - Q*^ 1 ll*ll<i -> I \ > as 7n — > oo. fc=l m -> oo as 1| /_. a (k,l)\ 1 / 2_\ fc=m +i (=1 oo oo < sup \cr{k, l)\ \xi\ 4- Proof of Theorem 4.4 For all m fixed it follows from standard results that m for all it follows that lim inf m x m > since for any sequence {x m } such that x m > holds in the limit. 
Since both p(P) above inequality also the G I 1 and a(A) G i 1 it follows from a bounded convergence argument that lim m P^Am exists and is finite. If a G A" then But The same arguments can be used the inverse exists as well. exists and to show that lim m j4'm f2 mJ4 m is finite. _1 = (^m — S^S^ S'Ji) -1 where Uij are the elements Let w™ be the elements of the approximating finite of the infinite dimensional inverse Q~ dimensional inverse f^if i,j < m and otherwise. Since uj™ — Wjj for all i,jasm-»oo Finally define [il ] m= [wi,j]i,j<m l . > and sup < |k>ij| oo Let Z?^ for ail i,j. it = follows that for df sup m 3 - . e > uj™ BtJ Then > is finite oo lim The function f which m (i,j) = \u)ij\ +e only for a for all i,j. finite number of m Now oo P'A Pm = lim Y" V btb'pQ l bib-ui^j integrable for is all m and dominated by 6;6- B lt j on the counting measure. Therefore by dominated convergence YaL\ Z)j=i WjUiij as had to be shown. also integrable is lim m P'm Q,^Prn = show that a(A) G A for A = P'Q.~ l From Assumption (Al) it follows that Q maps l into l l To see this write Q — E + a 4 I where the matrix E 1 4 1 consists of elements a(k,l). For if! we have Qx = Ex + <7 :r with Ex € because of the summability restrictions on a(k,l). From Lemma 4.2 we know that for x G I 1 C / 2 we have fr 1 ^ G 2 Assume n _1 x g 1 Then a; = ftfr 1 ^ = EQ _1 x + o- 4 f2 -1 a:. But Efr 1 ^ € i 1 Thus ||cr 4 Sl — a:|| 1 = ||x — Ef2 -1 :c|| < + IJSQ - rcj| 1 and \\x\1-l becomes unbounded 1 _1 because of ff4 fi x L But this contradicts the assumption that x G I 1 It follows that Proof of Theorem 5.1 We first . l . Z ; Z . . . 1 1 .-r 1 . . the image of l l under fi" 1 also in is 1 I which < in turn implies that ]>Zfc=i \^ik\ °° f° r _1 f2 of the Z-th unit vector. Since P This can be seen by considering the image under now l an ' E A follows that P'fl~ G A. Next we show that a(A) G A'. 
Recall that the optimal instrument is defined by _1 2 The row rank of A' = P'17 or A'VL = P' Interpret P as a set of d vectors in = A is therefore the same as the column rank of P. Remember that P [61,62,...] with it I . 23 . = bk (27T)- j^d\nC{(3 ,e-^)/d(3e^ k d\. 1 C(p ,e~^) = For ^ - gg dlnC(0o ,e-*)/dP=[^ we have o (A)/</> o (A) gg .-. ' Define the expansions of ^q (2) = Yl^Lo' P<p,j z an d #o H 2 ) = YlJLo^ej^ The coefficients — ^o.i^Vj-i - •••- (t)o,p' P<f>,j-p = which in the expansion satisfy the difference equation 1 '' t A has p linearly independent solutions. j < , similar expression for 4>gj. Set t/^ ^ = = ip e for Then 0. = &fc Any ^j set of = d p + •'• Vv,fc-i [ Vv,fc- P g vectors &fcn&fc 2 >t>k d ]'• 4>e,k- q independent because of the linear linearly is > •• V'e.jfc-i = and 6q{L)x = together with the requirement independence of the solutions to 4> (L)x and 6 q / 0. Thus P has full that 4> (L) and # (L) have no common zeros and that 4> p ^ column rank. But this means that A has full column rank as well, thus establishing Ad = that [ai,...,ad\ For the case where q nonsingular. is = we note P that is a matrix of p linearly independent vectors. The space P 1 is spanned by vectors of the form Vj = [0, ---,£jT — 1,0, ...]' where £ is a root of (p (L). This can be seen from x e - , fc = °A,--P ~ 1- Thus for a11 Jt has t0 nold that Y^jLo^^j-^j = ° for a11 1 <Pq\L)L x = J]jfc(l - H L)~ L x = «*• (1 - ^ L)- L x = for at least one f" Now _1 assume that 3x ^ with Q (L)x = and fi x € P It follows that x £ by stationarity _1 -1 x and Q x G 2 by invertibility of Since Q~ x € P there must exist constants Cj such that Q~ x = ^2jCjVj. Recursive solution for Cj shows that Cj — oo as j — oo. Thus _1 2 which contradicts Q x = 53, 9j vj- Thus for any x ^ 0, (L)x = it follows 53 CjUj ^ that A'x ^ 0. P1 <=> - l l 1 l * l l l l . -1 1 I . cf> x fi- I . l > > / • < </> < p then the matrix P' contains q rows which are determined by 6q (L). 
As argued before P' has full row rank. By the previous argument it follows that for at least If one row all pi of x^0, We also (f> The sum P'l q 7^ 0. P' and any x {L)x = ^ such that l we have P'9r x ^ cf> = (L)x it follows that pjfi -1 2 ^ Thus 0. ^ where i is an infinite dimensional vector of ones. _1 l such that of the Fourier coefficients P'l is proportional to (f> (l)~ and 6>o(l) -1 is not Since A = P'fi is contained in the linear span of the vectors of and need to establish A'l P orthogonal to l it follows that A also not orthogonal to is P l. l Finally we show that a{A) 6 A". First, ff)(0, X)l a (-X)'dX = P'Q~ P and P d 2 such that PE^ztz'A *> 3^ G R £ 7^ E(e z t z't ). Now, det P'Q^P = , eE{e1ztz' )l t a.s. or x 2 = of A is = = Then x 2 rank for 0. 0. t < £(e x , 1 a.s. full implies x t so that £'z t = = Awe can write P e where e^ is the fc-th = £'z t a.s. Proof of Theorem 5.2 We need Since [e p' unit vector. = to is 2 ^E [e —id IS n~ P = 1 = <=> \Tt -\ = a.s. is ruled out by Assumption (^41). |^t-i] = a.s. But we have shown before that the column 2 Then for x := £'zt a.s. Now, clearly E , = Ex E 2 2 ) 2 [e t \F -\ t ) = 2 } impossible. show that p'Vl~ 1 VL ^Zfcli 1^1 \ a k\j = P 'Sl -1 ^ 4 / + £). ' f° r 3 — 1j Define the vector bounded. 4 = fce fc Then p'4 = p'n- 1 24 4 (^ / + £)4- (B.7) Now, the sequence a(P'f2 From _1 6 ) (B.7) A < P Ik € \ A and x € M is d - a vector Summing Note that Therefore, by the fact that for all k. l of Lemma (5.2), < P n _1 E^fc > e A we have p' < |.| l € and the summability assumption p'n-^kJ where T,£ k P Q. l (.k norm on M. over k gives = k \Y^iZi d . <? p. Without 4 YlkLi bitoik\ . - k l p' ar Y£k p'n-^ik loss of generality P Qr l lu <zr=i we use ph This establishes the result 25 = sup^Xjl iP'fT^J < |x| for oo. References Time D. R. (1981): Brillinger, Holden-Day, Brockwell, P. J. Data Analysis and Theory, Expanded Edition. and R. A. Davis (1987): Time Verlag-New York, Deistler, M., Series, Inc. Series: Theory and Methods. 
Springer Inc.

Deistler, M., W. Dunsmuir, and E.J. Hannan (1987): "Vector Linear Time Series Models: Corrections and Extensions." Adv. Appl. Prob., 10:360-372.

Dunsmuir, W. (1979): "A Central Limit Theorem for Parameter Estimation in Stationary Vector Time Series and its Application to Models for a Signal Observed with Noise." The Annals of Statistics, 490-506.

Dunsmuir, W. and E.J. Hannan (1976): "Vector Linear Time Series Models." Adv. Appl. Prob., 8:339-364.

Gaposhkin, V.F. (1980): "Almost Sure Convergence of Estimates for the Spectral Density of a Stationary Process." Theory of Probability and its Applications, 169-176.

Gohberg, I. and S. Goldberg (1991): Basic Operator Theory. Birkhauser.

Hanani, H., E. Netanyahu, and M. Reichaw (1968): "Eigenvalues of Infinite Matrices." Colloquium Mathematicum, 19:89-101.

Hannan, E.J. (1973): "The Asymptotic Theory of Linear Time Series Models." Journal of Applied Probability, 10:130-145.

Hansen, L.P. (1982): "Large Sample Properties of Generalized Method of Moments Estimators." Econometrica, 50(4):1029-1053.

Hansen, L.P. (1985): "A Method for Calculating Bounds on the Asymptotic Covariance Matrices of Generalized Method of Moments Estimators." Journal of Econometrics, 30:203-238.

Hansen, L.P., J.C. Heaton, and M. Ogaki (1988): "Efficiency Bounds Implied by Multiperiod Conditional Moment Restrictions." Journal of the American Statistical Association, 83:863-871.

Hansen, L.P. and K.J. Singleton (1991): "Computing Semiparametric Efficiency Bounds for Linear Time Series Models." In William A. Barnett, James Powell, and George Tauchen, editors, Nonparametric and Semiparametric Methods in Econometrics and Statistics, pages 387-412.

Hansen, L.P. and K.J. Singleton (1996): "Efficient Estimation of Linear Asset-Pricing Models with Moving Average Errors." Journal of Business and Economic Statistics, pages 53-68.

Hayashi, F. and C. Sims (1983): "Nearly Efficient Estimation of Time Series Models with Predetermined, but not Exogenous, Instruments." Econometrica, 51(3):783-798.

Hosoya, Y. and M. Taniguchi (1982): "A Central Limit Theorem for Stationary Processes and the Parameter Estimation of Linear Processes." The Annals of Statistics, 10:132-153.

Huber, P.J. (1967): "The Behaviour of Maximum Likelihood Estimates under Nonstandard Conditions." In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 221-233. University of California Press.

Kabaila, P.V. (1980): "An Optimality Property of the Least Squares Estimate of the Parameter of the Spectrum of a Purely Nondeterministic Time Series." The Annals of Statistics, 8:1082-1092.

Kuersteiner, G.M. (1997): "Efficient IV Estimation for Autoregressive Models with Conditional Heterogeneity." Manuscript.

Magnus, J.R. and H. Neudecker (1988): Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley and Sons.

Milhoj, A. (1985): "The Moment Structure of ARCH Processes." Scandinavian Journal of Statistics, 12:281-292.

Nelson, D.B. (1990): "Stationarity and Persistence in the GARCH(1,1) model." Econometric Theory, 6:318-334.

Pakes, A. and D. Pollard (1989): "Simulation and the Asymptotics of Optimization Estimators." Econometrica, 57(5):1027-1057.

Royden, H.L. (1988): Real Analysis. Macmillan Publishing Company.

Stoica, P., T. Soderstrom, and B. Friedlander (1985): "Optimal Instrumental Variable Estimates of the AR Parameters of an ARMA process." IEEE Transactions on Automatic Control, 30:1066-1074.

Taniguchi, M. (1983): "On the Second Order Asymptotic Efficiency of Estimators of Gaussian ARMA Processes." The Annals of Statistics, 11:157-169.

Wu, C. (1981): "Asymptotic Theory of Nonlinear Least Squares Estimation." The Annals of Statistics, pages 501-513.

Zaman, A. (1989): "Consistency via Type 2 Inequalities: A Generalization of Wu's Theorem." Econometric Theory, 272-286.