— I LIBRARIES J sN*. Of /a c& MAT LIBRARIES - DEWEY Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries http://www.archive.org/details/inequalityforvecOObaij ® working paper department of economics AN INEQUALITY FOR VECTOR-VALUED MARTINGALES AND ITS APPLICATIONS Jushan 96-16 Bai June 1996 massachusetts institute of technology 50 memorial drive Cambridge, mass. 02139 AN INEQUALITY FOR VECTOR-VALUED MARTINGALES AND ITS APPLICA TIONS Jushan 96-16 Bai June 1996 MASSACHUSETTS INSTITUTE OF TECHNOLOGY :JUL18 1996 UBRAR!£5 96-16 An Inequality for Vector- Valued Martingales and its Applications* Jushan Bai Massachusetts Institute of Technology June, 1996 Abstract In this paper, we derive a general Hajek-Renyi type inequality for vector- valued martingales. Several well general inequality. We We known inequalities are shown to be special cases of this also derive a similar inequality for dependent sequences. then apply the inequality to the problem of strong consistency of least squares estimators for multiple regressions. AMS 1991 Subject Classifications. Primary 60E15, 60G42; secondary 60F15 62J05. Key words and phrases. Hajek-Renyi inequality, martingales, mixingales, multiple re- gressions, strong consistency. Running head: Vector- Valued Martingale "Partial financial support 9414083. is Inequality. acknowledged from the National Science Foundation under Grant SBR- Introduction 1 mean and For a sequence of independent random variables j:;,}^! with zero ance, Hajek and Renyi (1955) proved the following inequality: for any positive integers n where Ck (k = 1.2, N and ...) is The Hajek-Renyi erf. numbers. More e > finite vari- and for any < N), (n a sequence of non-increasing and positive inequality is numbers and Ezf = very useful in establishing strong laws of large interestingly, this type of inequality plays an important role in the change-point problem for both parametric and nonparametric estimations [see. e.g., This inequality has been extended in the literature into various settings; see, e.g., Bhattacharya (1987), Dumbgen (1991), and Bai (1994,1995)]. Chow (1960) for martingales, tingales, and Sen (1971) Birnbaum and Marshall (1961) for vector- valued Hajek-Renyi type inequality this general inequality are continuous time mar- random sequences. More and Serfiing (1990) extend the inequality to random eral for fields. In this recently, Christofides note we derive a gen- for vector-valued martingales. Several special cases of then discussed. One of them concerns a sequence of quadratic forms of martingales constructed using a decreasing sequence of positive-semidefinite matrices (Corollary 1), as opposed to a decreasing sequence of This quadratic form inequality is scalars. used to prove the strong consistency of least squares estimators for multiple regressions. This latter problem has received considerable attention in the literature, e.g., Drygas (1976), ity, we Anderson and Taylor (1976), Chen, Lai and Wei (1981), Lai, Robbins, and Wei (1978,1979) and Solo (1981). Using our inequal- give an alternative and simpler proof of strong consistency. Furthermore, we derive a similar quadratic form inequality In particular, we for dependent sequences. allow the dependence sequence to be a mixingale, which includes linear processes and strong mixing sequences as special cases. This inequality is useful in deriving the rate of convergence of change-point estimators for multiple regression models with dependent disturbances. The 2 Inequality Let (Q,Jr P) be a probability space and Ti be an increasing sequence of , C T. that Ti Zi,Z2 ,... be Let cr-fields such a sequence of p-dimensional martingale differences relative to {J-{] such that E{Zn \Tn -,) = = Let 5^ Rv on E^=i > > /^(z) nonincreasing with respect to k. e > and for any P (miK Rp h k (Sk ) > = 1 (for = < - (E[h n (Sn /i^, > and 1), let Z^ be a N (n < A )] + Yl r ), if sequence - E[hi(Si) we have hiiSi-J]] i=n+l 2 ck x 2 , then we (1) specializes to let h(x) . J = x 2 , we obtain the Kolmogorov is obtained. For p Then hk(x) is > 1, let let hk(x) hk{x) = = CfcX + = sup A ^ (A'/lA) _1/ ' Ck 2 |A'x|, k. where A For this Sen (1971). consider another special case of the theorem, which M and nonincreasing positive-semidefinite. Let hk(x) we c,tmax{0,x}, then convex and nonincreasing with respect to (1) yields the inequality of If the inequality of Hajek and Renyi (1955) Further, of positive-semidefinite matrices (p x p) (2) nonnegative and given in the following theorem: (i the strong consistency of least squares estimators. Let is is Let c k be a sequence of nonincreasing and positive numbers. positive-definite. We now /i^ l e) independent random variables). sequence of oo n and \ Chow's inequality is is Thus . an increasing sequence of a -fields {Tic}- to positive integers (the one dimensional case), type inequality. choose h(x) < with Ehi(Si) - - p /? p x £ general inequality martingale differences relative Then for any When for all Let hk(x) be a sequence of convex, nonnegative, and nonincreasing (with 1 R p -valued (1) > ••• The respect to k) functions defined on of S,-. Let /it(^) be a sequence of real-valued convex functions defined Z,- such that h\[x) Theorem = and E{ZiZ{) a.s = x'Mk x. k (k is = useful in establishing 1,2,...) be a sequence in the sense that Mk — Mk+\ convex, nonnegative, and nonincreasing with respect to Clearly, hk(x) is Corollary Let matrices, 1 and P let ( M Theorem be given in Zk max S'k MS k N > /) < k \ £ HM&i I ltr(Mn J2^) + t = n+l tr(-) is the trace function. Proof of Corollary particular h k . We 1: only need to compute the right hand side of (1) for this Now E[h n (Sn )] = E{S'n Mn Sn = tr{Mn E[Sn S'n }) = tr{Mn £?=1 - hk(Sk-x)} = - S'^MkSk-^} E[S'k MkSk = E(Z'k MkZk)-2E[S'k _ MkZ k 1 = tr(MkE[ZkZk }) = MZ have used the fact that E[S'k _ x k k] and £,-) ) E[hk(Sk ) We have then 1, t=l where We of p x p positive-semidefinite and nonincreasing be a sequence k k. } tr(MkXk). = E[S k -\Mk E{Zk \Tk - = l )) 0. This proves the x) l l corollary. Remark 1. Because h k (x) positive-definite, we = ck sup A ^ (A'y4A) _1,/2 |A'2;| see that Corollary 1 , 1 event {sup n < k < N h k (Sk) 1: > are disjoint, The proof t). It Define is is 2 , where A is is The due to the nonincreasingness of in (2) is due to the this latter feature that makes the similar to that of Ak = = N N h n (sn )dp fc=n + f < t Chow for n (1960). < r Let A f h k (Sk )dP JA * h n+1 (sn+1 )dp + jrf be the < k;h k {Sk) > A. Thus = eJ2P(A k )<^2 k =n < Eh n (sn - / {h T (ST ) Ak G Tk and U^=n Ak eP(A) ) l suitable for the least squares problem mentioned in the Introduction. Proof of Theorem Then the Ak A~ whereas the nonincreasingness of h k defined nonincreasingness of the matrix sequence Mkresult of Corollary c k {x' extends the inequality of Sen (1971). nonincreasingness of Sen's h k (with respect to k) the scalar sequence c k = h k (s k )dp e}. < Eh n (Sn + I ) (Sn+1 ) [h n+1 - h n+1 (Sn ))dP JCl-An N r The fixed k, the sequence {.F,-} / Jn-(A n UA n + because h k (-) £, = h k (Si) Si /i„+i(Sn ). 1,2,...) is 4* A ' key observation is that for each a real-valued submartingale relative to a martingale [Doob (1953), p. 295]. is [h n+ i(Sn+1 / (3) = (i convex and is > h(Sk )dP / n+2'7 1 ) from h n (Sn ) last inequality follows r h n+1 {Sn+1 )dP+Yl Thus )-h n+1 (Sn )]dP>0 J An and more generally / U---uA JA n k (4) It [h k+1 (Sk+1 )-h k+1 (Sk )]dP>0 (k>n). follows that Eh n (Sn + E{h n+1 (Sn+1 - eP(A) < + jQ-{A„uA n+ i) [h n+2 - h n+2 {Sn+ i)]dP N h n+2 (sn+2 ) / Jn-(A n uA n +iUA n+ 2) The theorem {Sn+2 ) r - h n+1 (Sn )} ) ) r + J2 h k (sk )dp. / n+3^^* follows by repeating the above argument and making use . of (4). Next, we generalize the inequality to dependent sequences. Let {Jr,}^_ 00 be a quence of increasing Heyde (Hall and and {%l) (i) : m > 1980, p. That 21). 0} such that t/> m is, as 1 that {Z,, F,} forms an 2 Z. -mixingale sequence there exist nonnegative constants m— > oo and for all i > and m > {6, 0, : i > 1} and E\\E(Z \^. m )f < bUt, x B||Z,- B(Z,|^ +m )|| 2 (ii) where m Assume cr-fields. se- J J ||i|| (iii) = Em=i <6 2 (Y%=\ X 1Y^ f° r x £ rn 1+8 tp m < oo, for 2 t to satisfy conditions +1 , R p We - some .6 > Throughout the remainder of assumed ^ assume 0. this paper, (i)-(iii). in addition that any L 2 mixingale {Z ,Ti} The sequence t 6, is called is implicitly mixing norms and ^>,- is called mixing Mixingales include martingale differences, linear processes, coefficient. and strong mixing processes as special cases. Andrews special cases of mixingales; see < with J2JL-oo d] < field{?7j; j Theorem °° an<^ Then n). M 2 Let {Z,, Ti] increasing matrices, and P (5) Zk let m^ M S S'k ( k 2 > k — a2 e = ij>f ^jVn-j Tn — o- E|i|>» d\- p x p non-random, positive-semidefinite and non- < ) Let . and (for all i) an L 2 mixingale. Then, for every be = E^-oo Z„ let zero and finite variance a 2 a mixingale with b 2 is be a sequence of k other dependent structures are also For example, (1988). mean i-i-d.- Vj Many £ tr(MB f ) £ 6 2 + £ 2 6 > e 0, tr(JU-) • J uAere C= 4(^ 2 + 2E~ 2+2 Proof of Theorem V We 2: E = Zfc j 1 2 + 2E°°=1 J- 2-26 < )(l oo. ) can write (Hall and Heyde 1981, p20), Zik, with Zj k = E{Z k \Jrk ^ J ) - E(Zk \Jrk -i-i)- j= — oo /2 Ml S k = Thus Ef=1 1/2 Z, £f=1 fc Note that for E£_oo P (6) ( each aJ /2 max ||Mfc * \n<k<N" = E n E°l_oc Af 1/2 fc £?=1 1/2 = ££_«, Zji Mfc where Sjk 5,fc, = m« IIM^S.fcU < — P V max z_^ „<fc<7v i < N] ^ = E{Zlt Z'- i ) > < e / oo -oo j , y is 1/2 ||Mfc " * S,fc|| m J > - a martingale. Let a ; 2 \ P U - - t \ . i > K'%11 N n *r(Mn °7 £ j = -oo / 2 R p we e) < 1 - - <T =E Ej,- > — for all j such Then 1 where 5fc|| m {5"^,^-^; j, 1. \j = -oo i £ = Z„. Thus P that M £E ,-)+ E i j=l > \ :=n+l (positive-semidefinite), Substitute Zj, for x and take expectation to obtain a,e MM.-E,-.-) / and the second inequality follows from Corollary have x'xl — xx' > where / .E||Zj,|| 2 is 1. apxp I — Ej, > 0. For any vector identity matrix. Furthermore, by Z the mixingale property and the definition of Thus for all j. C > - £* > 4&?^j£|2 ;I easy to show that it is 2 < 46 2 B>A for hand follows that the right It £]|Zj,-|| < tr(CB) Using the fact that tr(CA) 0. we have tr{MiL JX ) < 4b 2 tpf^tr(Me). 0, , 2 V> J i and side of (5) is bounded by I a tA 7 2 oo \ t n / A' £ *M M sVS, )[tr{Mn )Y,bU 1 =1 t=n+l " It remains to choose appropriate a^s such that £j a~ Vj = j~ l ~ s > for j 2E =iJ- where 1), Then £j 0. c J > (j 2 -2 a, = > as given in By assumption 1. < semidefinite E,, then from tr(MtEji) P (7) which is where i, is {e ,Jr } z We t ( max S'k \n<fc<W e, = k > 2 e < ) / bounded. Let is = i/ (V»o 2+2 +2E,"i J and 1 i/j/(l+2X3fc=i "*) an ^ «-j Ej ajVjj| = see that if tr(MeT,i)tp?j, £ (tr{Mn % y £ . £,, < ^ 2 = Qj )(l + positive- we have (6), £,-) + f) =1 some for S,-^»jl-|, and tr(M,-S.-) 1=n+ l This situation arises 1. . ) / in the following setup: Z, nonrandom constants and is Hj=-oo e j<i so * nat = Xix'iE^A < EiZjiZ'ji) setup k vector of 1 = Let a ; . -, e, is random = x,£,-, variable such that forms an L 2 mixingale with mixing norms {b 2 } and mixing coefficients {V'™}- can write £* = MS analogous to Corollary a p x (iii), we Inspecting the proof 2. (iii). ij} D ')<oo- Remark 6 2 ,/,2 2 %i = XT^-co (xix'^bfyfa. ^i« We w ith Z = ;1 i,£j,. can choose E, related to regression models, where x, are regressors and = £, It follows that (46?)x,-x;. This are disturbances. Inequality (7) will be used in proving strong consistency of least squares estimators in regression models with mixingale disturbances. Corollary 2 For sup, b 2 < b < oo. Z, in Theorem Then, for some 2, 'UsM'i?* Let M = k~ Then 2 pESn+i some k 2 ' & /1 2 M < oo. = 0(l/nj. . that the mixing norms b 2 are such that M < oo, 1 Proof: assume I. Thus the -')- M for all tr(Mn ) £?=! right hand 6 2 < N>n> 2 p6 /n and side of (5) is 1. ££ B+1 & 2 *r(M) < bounded by M/(e 2 n) for The corollary is Note that since the probability useful in change-point problems. bound does not depend on N. the corollary holds when we replace A r by oo. This yields the strong law of large numbers for mixingales as well as certain rate of convergence for the strong law. Application 3 The application concerns the strong consistency of least squares estimator of the param- eter Bj in the following multiple regression: yt (8) where the vector is + XflA = • • + • (x tl ...,x tp )' , x ip BP > (i the regression coefficient, and {e,} sequence of (j/i, ..., cr-fields y n )' and and hence for is x,- = all {Ti}. Denote by Un = (e\, .. .,£„)'. n > m. The + £i = xtf + £{ (z known 1) consists of = 1,2,...) constants, B = (/5 l5 ..., Bp )' an L 2 mixingale with respect to an increasing is Xn = (xi, ...,x n )' the design Assume (X'n Xn ) positive-definite for is least squares estimator b n , Y„ = some n = m matrix and based on the first let n observations, given by bn Thus strong consistency = (X'n is {X'n (9) When Xn )- l X'n Yn = 6+ {X'n Xn )-'X'n Un . equivalent to X n )- l X'n Un -» with probability 1. the regressors are non-random, Lai, Robbins, and Wei (1977, 1979) establish the strong consistency under the minimal condition (x'n (io) together with the assumption that an increasing sequence of (10) is significantly £, is 7 cr-fields {.T ,} xn y l o a martingale-difference sequence with respect to such that sup, E{e 2 |^i-i) < weaker than the conventional assumption that to a positive definite matrix. Under the latter °° (a.s.). Condition (.X'n Xn jn) converges assumption, the condition number of Xn (X' (the ratio of the largest and smallest eigenvalues) will be uniformly bounded, ) and the strong consistency and (1979). For be Condition trivial to prove. when the sufficient condition et al. will are e,- i.i.d. with zero (9) is mean and dynamic models, strong consistency is known as a necessary finite variance; see Lai also studied many by authors Anderson and Taylor, 1979, Chri Christopeit and Helmes, 1980, Lai and Wei, 1982). (cf. Using our inequality, we give an alternative and simpler proof of strong consistency. We do impose a stronger condition than correlation structures. minimum the <j)(x) Y^ (f> — X'n Xn a > log(x)[loglog(x)] 1+l5 > (S This condition is members and weakest possible (j-fields {J~i}- = We of $. <f>(X max (n)) x for some x = x5 <f>(x) , = is rjini-i, 2 < Theorem Then where 77, Ee 2 Note . (a.s.), as is are i.i.d. 00, 3 Let We 1+6 and = 0(Amin (n)). is close to be < b 2 < oo. that sup, assumed normal. in Let When Ee 2 < £,- is a martingale-difference oo can be significantly weaker the previous literature. For example, = J- x (T-fieldj^; j < i) < . 00 Then is e, is a equivalent which does not hold. {ei,J-i} be an L 2 mixingale with sup, the least squares estimator b n Proof. [\og(x)] series an L 2 mixingale with respect to an increasing sequence of Further assume that sup, b 2 E(e 2 \J-,-i) < oo T] and the assume, for some </>£$, sequence of martingale differences. The condition that sup, E(e 2 \!F,-i) to sup, such that each for strong consistency. sequence, b 2 can be taken as let £, <f>(x) <f>(x) A^^n) and weaker than (2.24) of Lai and Wei (1982) but slightly stronger than Next, we assume {e,} sup,- > x Xn For dynamic models, Lai and Wei show that condition (11) their (1.17). than set of functions For example 0. 0) are be the in the interval A mjn (n) -> oo (11) $ Let . and non-decreasing converges for every r^rr — Let A max (n) denote the largest eigenvalue of X'n eigenvalue of positive is but we allow the disturbances to have broad (9), is -0\\ 2 - \\{X'k 2 < b 2 strongly consistent for observe that \\b k b X k Y'X'k Uk 2 \\ < /?. 00. Assume (11) holds. < (12) X \\(X'k )-^f\\{X'k X k )-^X k Uk k 2 \\ = (Xninm-'iU'MX'M-'XiUk) = Let Zi H!=i x £ i' t Then x,e,. = i? p -valued a is P( — ||6^ 2 -/3|| \\b k S'k >e 2 ) 2 sup < 8\\ MS k £ where E, = 1 with c* S[M,S, > sup (b 2 We A min (A;). Thus by Remark (12). tr(Mn 2 = \ x Let }. see that Sk = M k is 2, 2 e ) J2^)+b £ 2 *r(M,-£,)) i=n+l 1=1 / Note that x,x^. tr(Mn (14) by k < P( < {T mixingale with respect to = ^(XlXk)' X'kPk, and let Mfc non-increasing, and (13) Z, £ = Si) c^ 1 ir((X:Xn )- 1 ^X = n) pc^ 1 = P /A mn (n). - \A,^\)/\A,\ t=i Furthermore, ir(M,£ where A = x equivalently, : = ) {X[X{) and 1 l^l ^ < = c-'x^XlX^x, = x'iMiXi we have A max (n), -l (\A <K|/ln| < 1/p ) t \ Because \A n the determinant of A,. is |j4,-| c <£(A m ax(n)) < \ A max (n) p = 0(A mjn (n)) by , or (11). This implies that .-.,, 1 1 ,w, , r (l^,|-|^-l| c- (|A,-|-|A_ 1 |)/[A,|<l, , , ^ \Ai\<f>(\Ai\Vr) for some L < Thus oo. 00 £ (15) Because J2 test. V £ (r(M-E,)= ' l/( n </ ( nl/ p )) ) Combining < (13)- (15), oc K iS: £ " - ,i)/l 4 ^'(KI-IA-iD/IAISI r (l 4 ' ' ' l ' ' 1 (|4 I — I 4 , " + ,M.wi-y"»)- the above series converges by the integral comparison i and letting N— oo, we have ' p( The right hand III «!!>-, A<r C /_^!_ side above converges to zero as n 10 i-tff — > oo. V- (Mi| -|A;-i|) MIT LIBRARIES 3 9080 00988 2652 References [1] Anderson, T.W. and Taylor J.B. (1976). Strong consistency of mates [2] normal linear regression. Ann. Statist. 4 788-790. Anderson, T.W. and Taylor J.B. (1979). Strong consistency of mates [3] in in dynamic models. Ann. least squares esti- least squares esti- Statist. 7 484-489. Bai, J. (1994). Least squares estimation of a shift in linear processes. J. Time Series Anal. 15 453-472. [4] Bai, J. (1995). Least absolute deviation estimation of a shift. Econometric Theory 11 403-436. [5] Bhattacharya, P.K. (1987). Maximum the distribution of independent likelihood estimation of a change-point in random variables: General Multiparameter case. Journal of Multivariate Analysis 23 183-208. [6] Birnbaum, Z.W. and Marshall A.W. (1961). Some multivariate Chebyshev inequalities [7] with extensions to continuous parameter processes. Ann. Math. Stat. 32 687-703. Chen, G.J., Lai, T.L., and Wei, C.Z. (1981). Convergence systems and strong con- sistency of least squares estimates in regression models. J. Multivariate Anal. 11 319-333. [8] Chow, Y.S. Amer. Math. (1960). [9] martingale inequality and the law of large numbers. Proc. 11 107-111. Christopeit, N. and Helmes, N.C. (1980). Strong consistency of least squares esti- mators [10] Soc, A in linear regression Christofides, T.C. and models. Ann. Statist. 8 778-788. Serfling, R.J. (1990). sionally indexed submartingale arrays. Maximal Ann. Probab. 11 inequalities for multidimen- 18 630-641. [11] Drygas, H. (1976). Weak and strong consistency of the least squares estimators in regression models. Z. Wahrschein. verw. Gebiete. 34 119-127. [12] Dumbgen, L. (1991). The asymptotic behavior of some nonparametric change-point estimators. Ann. Statist. 19 1471-1495. [13] Hajek, J. and Renyi A. (1955). Generalization of an inequality of Kolmogorov. Acta. Math. Acad. [14] Lai, T.L., Sci. Hungar. 6 281-283. Robbins H. and Wei C.Z. (1978). Strong consistency of estimates in multiple regression. Proc. Nat. Acad. Sci. [15] Lai, T.L., Lai, T.L. 75 3034-3036. Robbins H. and Wei C.Z. (1979). Strong consistency of estimates in multiple regression [16] USA II. J. least squares least squares Multi. Analysis 9 343-361. and Wei C.Z. (1982). Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. Ann. Statist. 10 154-166. [17] Sen, P.K. (1971). A Hajek-Renyi type inequality for stochastic vectors with appli- cations to simultaneous confidence regions. Ann. Math. Statist. 42 1132-1134. [18] Solo, V. (1981). Strong consistency of least squares estimators in regression with correlated disturbances. Ann. Statist. 9 689-693. 12 Date Due Lib-26-67