Smoothed quantile regression for panel data*

Kengo Kato†  Antonio F. Galvao, Jr.‡

August 20, 2011

Abstract

This paper studies fixed effects estimation of quantile regression (QR) models with panel data. Previous studies show that there are two difficulties with the standard QR estimation. First, the fixed effects estimator is subject to the well-known incidental parameters problem. Second, the non-smoothness of the objective function significantly complicates the asymptotic analysis of the estimator, especially when n and T jointly go to infinity. We overcome the latter problem by smoothing the objective function. Under an asymptotic framework where both the number of individuals and the number of time periods grow at the same rate, we show that the fixed effects estimator based on the smoothed objective function has a limiting normal distribution with a bias in the mean, and provide the analytic form of the asymptotic bias. We propose a one-step bias correction to the fixed effects estimator based on the bias formula obtained from our asymptotic analysis. Importantly, we allow the case that the observations are dependent in the time dimension. A Monte Carlo study illustrates the effect of the bias correction to the estimator.

Key words: bias correction, incidental parameters problem, panel data, quantile regression, smoothing.

JEL classification codes: C13, C23.

*The authors are grateful to the seminar participants at the Conference on Panel Data, the New York Camp Econometrics, and the Summer Meetings of the Econometric Society. We also would like to thank the editor and two anonymous referees for their constructive comments. This research was supported by the Grant-in-Aid for Scientific Research (KAKENHI).

†Department of Mathematics, Graduate School of Science, Hiroshima University, Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8526, Japan.

‡Department of Economics, University of Iowa, W210 Pappajohn Business Building, Iowa City, IA 52242. E-mail: antonio-galvao@uiowa.edu

1 Introduction

Since the seminal work of Koenker and Bassett (1978), quantile regression (QR) has attracted considerable interest in statistics and econometrics. It offers an easy-to-implement method to estimate conditional quantiles, and is known as a more flexible tool to capture the effects of explanatory variables on the response than mean regression. The theoretical properties of the QR method are well established for cross sectional models; see Koenker (2005) for references and discussion.
Recently, panel data sets have become widely available and very popular, because they provide a large number of data points and allow researchers to study dynamics of adjustment as well as to control for individual specific heterogeneity. In this regard, it is natural to consider applying QR to panel data. In fact, there is a rapidly expanding empirical literature on QR for panel data in important areas of economics and biostatistics,1 which makes a persuasive case for the rigorous study of estimation and inference properties of the QR method for panel data. In contrast to mean regression, however, little is known about the ability of the QR method to analyze panel models with individual effects. An approach in this direction is to treat each individual effect as a parameter to be estimated and apply the standard QR estimation to the individual and common parameters altogether (Koenker, 2004).2 Unfortunately, the resulting estimator of the common parameters will be generally inconsistent when the number of individuals n goes to infinity while the number of time periods T is fixed. This is a version of the incidental parameters problem (Neyman and Scott, 1948), which has been extensively studied in the recent econometrics literature (see Lancaster, 2000, for a review). Arellano and Bonhomme (2009, p.490) stated that the incidental parameters problem is "one of the main challenges in modern econometrics". A recent attempt to cope with the incidental parameters problem is to introduce an asymptotic framework in which n and T jointly go to infinity. Li, Lindsay, and Waterman (2003) considered the maximum likelihood (ML) estimation for smooth likelihood functions and showed that the ML estimator (MLE) of the common parameters has an order O(T^{-1}) bias, so its limiting normal distribution has a bias in the mean even when n and T grow at the same rate.
An important implication of their work is that properly bias corrected estimators enjoy mean-zero asymptotic normality when T grows at a slower rate than n. Since then, several bias correction methods for the ML estimation (or more generally, M-estimation) have been proposed in the literature; see Hahn and Newey (2004), Woutersen (2002), Hahn and Kuersteiner (2011), Arellano and Hahn (2005), Bester and Hansen (2009), Arellano and Bonhomme (2009) and Dhaene and Jochmans (2009), to name only a few.

A distinctive feature of QR is that the objective function, often referred to as the check function, is not differentiable. This means that the asymptotic analysis of the previous nonlinear panel data literature is not directly applicable to the QR case since it substantially depends on the smoothness of objective functions. Kato, Galvao, and Montes-Rojas (2011) formally established the asymptotic properties of the standard fixed effects QR estimator of the common parameters. However, they required the restrictive condition that T grows faster than n² to show asymptotic normality of the estimator, and did not succeed in deriving the bias.

1 For example, in applied economics, Abrevaya and Dahl (2008), Silva, Kosmopoulou, and Lamarche (2009), and Gamper-Rabindran, Khan, and Timmins (2010); in biostatistics, Lipsitz, Fitzmaurice, Molenberghs, and Zhao (1997), Wei and He (2006) and Wang and Fygenson (2009), among others.

2 Koenker (2004) indeed proposed a penalized estimation method where the individual parameters are subject to the l1 penalty. Other approaches are found in Abrevaya and Dahl (2008), Canay (2008), and Rosen (2009).
The difficulty of handling the standard QR estimator in panel models is partly explained by the fact that the higher order stochastic expansion of the scores is an essential technical tool in the asymptotic analysis of Hahn and Newey (2004) and Hahn and Kuersteiner (2011), while such an expansion is difficult to implement for the QR objective function because the Taylor series method cannot be applied to it. It is also important to note that the higher order asymptotic behavior of QR estimators is non-standard and rather complicated (Arcones, 1998; Knight, 1998). See also the discussion at the end of Section 5 of Kato, Galvao, and Montes-Rojas (2011). To our knowledge, there is no paper that formally studies the bias correction for panel QR models under the large n and T asymptotics.

This paper overcomes the above problem by smoothing the objective function. The idea of smoothing non-differentiable objective functions goes back to Amemiya (1982) and Horowitz (1992, 1998). We refer to the resulting estimator as the fixed effects smoothed quantile regression (FE-SQR) estimator. We show that under suitable regularity conditions, the FE-SQR estimator has an order O(T^{-1}) bias and hence its limiting normal distribution has a bias in the mean even when n and T grow at the same rate. Importantly, we allow the observations to be dependent in the time dimension. As naturally expected, the bias depends on the conditional and unconditional densities and their first derivatives. We propose a one-step bias correction based on the analytic form of the asymptotic bias. We also apply the half-panel jackknife method originally proposed by Dhaene and Jochmans (2009) to the FE-SQR estimator. We theoretically show that both methods eliminate the bias of the limiting distribution.
It is important to note that these asymptotic results are not covered by the previous literature, since the objective function involves a bandwidth tending to zero as the sample size increases and the standard stochastic expansion does not apply to such a case (see Horowitz, 1998). In the present case, we have to control the smoothing effect and at the same time handle the problem of a diverging number of parameters, which we believe is challenging from a technical point of view. An additional complication arises since the observations are dependent in the time dimension. To cope with this complication, we derive new empirical process inequalities.

We conduct a small Monte Carlo study to illustrate the effect of the bias correction to the FE-SQR estimator. The results show that, as expected, the standard QR and the FE-SQR estimators are moderately biased except for some special cases. In addition, the one-step bias correction is able to substantially reduce the bias in many cases, but unfortunately it is observed that the analytic bias correction increases the variability in small samples, partly because of the nonparametric estimation of the bias term.

The organization of this paper is as follows. In Section 2, we introduce a panel QR model with individual effects and formally define the FE-SQR estimator. In Section 3, we present the theorems about the limiting distributions of the FE-SQR estimator and the bias corrected one when n and T grow at the same rate. In Section 4, we report a Monte Carlo study to assess the finite sample performance of the estimator and the bias correction. In Section 5, we offer some concluding remarks. We relegate the proofs of the theorems to the Appendix.
2 Model and estimation method

We consider a panel QR model with individual effects:

Q_τ(y_it | x_it, α_i0) = α_i0 + x_it'β_0,  (2.1)

where τ ∈ (0, 1) is a quantile of interest, y_it is a dependent variable, x_it is a p-dimensional vector of explanatory variables, α_i0 is a scalar individual effect for each i, and Q_τ(y_it | x_it, α_i0) is the conditional τ-quantile of y_it given (x_it, α_i0). In general, each α_i0 and β_0 can depend on τ, but we assume that τ is fixed throughout the paper and suppress such a dependence for notational simplicity.3 The model is semiparametric in the sense that the conditional distribution of y_it given (x_it, α_i0) is left unspecified and no parametric assumption is made on the relation between α_i0 and x_it. The number of individuals is denoted by n and the number of time periods is denoted by T = T_n, which may depend on n. In what follows, we write T instead of T_n. It should be noted that although the model (2.1) resembles a conventional linear panel data model at first sight, the quantile is a nonlinear functional of the random variable, so in general the conventional differencing-out strategy is not appropriate for the QR case.

We consider the fixed effects estimation of β_0, which is implemented by treating each individual effect as a parameter to be estimated. Throughout the paper, as in Hahn and Newey (2004), Hahn and Kuersteiner (2011) and Fernandez-Val (2005), we treat the α_i0 as fixed by conditioning on them. Then, the standard QR estimation solves

min_{α ∈ R^n, β ∈ R^p} (1/(nT)) Σ_{i=1}^n Σ_{t=1}^T (y_it − α_i − x_it'β){τ − I(y_it ≤ α_i + x_it'β)},  (2.2)

where I(·) is the indicator function. Note that α = (α_1, ..., α_n)' implicitly depends on n. As in general nonlinear panel models, except for some special cases, the estimator of β_0 defined by (2.2) is inconsistent as n → ∞ while T is fixed.4

3 Some readers might wonder whether the individual effects should be independent of the quantile. However, in our model, the individual effects include the usual intercept term, which may depend on the quantile. Thus, it would be natural to allow the individual effects to depend on the quantile as well.
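For concreteness, the objective in (2.2) can be evaluated directly. The sketch below is a minimal illustration with hypothetical helper names (`check_loss`, `fe_qr_objective`) and simulated data; it is not code from the paper.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - I(u <= 0))."""
    return u * (tau - (u <= 0))

def fe_qr_objective(alpha, beta, y, x, tau):
    """Fixed effects QR objective (2.2): average check loss over all (i, t).

    y: (n, T) responses; x: (n, T, p) regressors;
    alpha: (n,) individual effects; beta: (p,) common parameters.
    """
    u = y - alpha[:, None] - x @ beta   # residuals y_it - alpha_i - x_it'beta
    return check_loss(u, tau).mean()

# Toy example: n = 3 individuals, T = 4 periods, p = 1 regressor.
rng = np.random.default_rng(0)
y = rng.normal(size=(3, 4))
x = rng.normal(size=(3, 4, 1))
val = fe_qr_objective(np.zeros(3), np.zeros(1), y, x, tau=0.5)
```

At τ = 0.5 the check loss is half the absolute loss, so with all parameters set to zero the objective equals half the mean absolute value of y; minimizing over (α, β) for fixed T and growing n is exactly the setting in which the incidental parameters problem arises.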
In fact, as Graham, Hahn, and Powell (2009) noted, when T = 2, the standard fixed effects QR estimator of the common parameter does not depend on τ. However, the common parameter does depend on τ unless the model is of the linear location form that Graham, Hahn, and Powell (2009) presumed, so the standard fixed effects QR estimator will be generally inconsistent.5

A distinctive feature of the standard QR estimation is that the objective function is not differentiable. Therefore, the basic smoothness assumption imposed in the recent nonlinear panel data literature is not satisfied for the standard QR estimator. Kato, Galvao, and Montes-Rojas (2011) rigorously investigated the asymptotic properties of the standard QR estimator, say β̂_KB, defined by (2.2) when n and T jointly go to infinity. The essential content of their theoretical results is that while β̂_KB is consistent under mild regularity conditions, justifying (mean-zero) asymptotic normality of β̂_KB is a delicate issue because of the non-smoothness of the scores when n and T jointly go to infinity. They require the restrictive condition that T grows faster than n² to show asymptotic normality of β̂_KB, and it is not known whether asymptotic results analogous to those of Hahn and Newey (2004) hold for β̂_KB even when the observations are independent in both the cross section and time dimensions.

4 A more explicit example in which the standard fixed effects QR estimator is inconsistent is available upon request.

5 Kato, Galvao, and Montes-Rojas (2011), however, did not show that this condition is necessary for (mean-zero) asymptotic normality of β̂_KB. Whether one can substantially improve this condition is still an open question.

In this paper, instead of the standard QR estimator, we study the asymptotic properties of the estimator defined by a minimizer of a smoothed version of the QR
objective function.6,7 Smoothing the QR objective function is employed in Horowitz (1998) to study the bootstrap refinement for inference in conditional quantile models. The basic insight of Horowitz (1998) is to smooth out the indicator I(y_it ≤ α_i + x_it'β) by using a kernel function. Let K(·) be a kernel function and let G(·) denote the survival function of K(·), i.e., G(u) := ∫_u^∞ K(v)dv. We do not require K(·) to be nonnegative; precise conditions on K(·) are given in Section 3. Let {h_n} be a sequence of positive bandwidths such that h_n → 0 as n → ∞, and write G_{h_n}(·) := G(·/h_n) and K_{h_n}(·) := h_n^{-1}K(·/h_n). Then, we consider the estimator

(α̂, β̂) := arg min_{(α,β) ∈ A^n × B} (1/(nT)) Σ_{i=1}^n Σ_{t=1}^T (y_it − α_i − x_it'β){τ − G_{h_n}(y_it − α_i − x_it'β)},  (2.3)

where A is a compact subset of R, A^n is the product of n copies of A, and B is a compact subset of R^p. The compactness of the set A^n × B is required to ensure the existence of (α̂, β̂). We call β̂ the fixed effects smoothed quantile regression (FE-SQR) estimator of β_0. Put K̃(u) := uK(u) and K̃_{h_n}(u) := h_n^{-1}K̃(u/h_n). We use the notation ∂_α = ∂/∂α and ∂_β = ∂/∂β. Since the objective function in (2.3) is smooth with respect to (α, β), if (α̂, β̂) is an interior point of A^n × B, it satisfies

H_{ni}^{(1)}(α̂_i, β̂) = 0, i = 1, ..., n;  H_n^{(2)}(α̂, β̂) = 0,  (2.4)

where

H_{ni}^{(1)}(α_i, β) := T^{-1} Σ_{t=1}^T [{τ − G_{h_n}(y_it − α_i − x_it'β)} + h_n K̃_{h_n}(y_it − α_i − x_it'β)],

H_n^{(2)}(α, β) := (nT)^{-1} Σ_{i=1}^n Σ_{t=1}^T [{τ − G_{h_n}(y_it − α_i − x_it'β)} + h_n K̃_{h_n}(y_it − α_i − x_it'β)] x_it.

6 In fact, it turns out that using smoothing is crucial to develop the formal asymptotic analysis here, since, as Hahn and Newey (2004) and Hahn and Kuersteiner (2011) observed, in a general nonlinear panel model with a smooth objective function, the bias of the fixed effects estimator of the common parameter depends on the second order biases of the estimators of the individual parameters.
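To make the smoothed objective (2.3) concrete, the sketch below implements it with a standard normal density playing the role of K(·), so that G is the normal survival function. This kernel choice is for illustration only (the theory in Section 3 requires a higher-order kernel), and all function names are our own.

```python
import math
import numpy as np

def G(u):
    """Survival function of the standard normal kernel: G(u) = P(Z > u)."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))

G_vec = np.vectorize(G)

def smoothed_qr_loss(alpha, beta, y, x, tau, h):
    """Smoothed FE-QR objective (2.3): mean of u * (tau - G(u / h))."""
    u = y - alpha[:, None] - x @ beta
    return float(np.mean(u * (tau - G_vec(u / h))))

# As h -> 0, G(u/h) -> I(u <= 0) for every u != 0, so the smoothed loss
# approaches the unsmoothed check-function loss in (2.2).
rng = np.random.default_rng(1)
y = rng.normal(size=(5, 20))
x = rng.normal(size=(5, 20, 2))
a0, b0 = np.zeros(5), np.zeros(2)
check = float(np.mean(y * (0.5 - (y <= 0))))
smooth = smoothed_qr_loss(a0, b0, y, x, tau=0.5, h=1e-4)
```

With a small bandwidth the two losses are numerically very close, while the smoothed version is differentiable in (α, β), which is what makes the first order conditions (2.4) available.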
However, as Kato, Galvao, and Montes-Rojas (2011) discussed, the second order bias of the general QR estimator (in the cross section case) is not uniquely determined, so there is a conceptual difficulty in deriving a bias result for the unsmoothed fixed effects QR estimator.

7 A motivation for using the smoothed QR estimator in Horowitz (1998) is that the higher order asymptotic theory of the unsmoothed QR estimator is non-standard, which significantly complicates the study of the bootstrap refinement for inference based on it (see also Horowitz, 2001, Section 4.3).

As we will see in the proof of Theorem 3.1, the terms concerning K̃(·) do not affect the first order asymptotic distribution of β̂. This can be understood by regarding K̃(·) as a "kernel", although the integral of K̃(·) over the real line is not one. For later purposes, we refer to the partial derivative with respect to α_i as the α_i-derivative and the partial derivative with respect to β as the β-derivative. H_{ni}^{(1)}(α_i, β) and H_n^{(2)}(α, β) are the α_i-derivative and the β-derivative of the negative of the objective function in (2.3), respectively. From an analogy to the ML estimation, we call H_{ni}^{(1)}(α_i, β) the α_i-score and H_n^{(2)}(α, β) the β-score.

3 Main results

In this section, we investigate the asymptotic properties of the FE-SQR estimator defined by (2.3). Put α_0 := (α_10, ..., α_n0)'. In what follows, we treat the α_i0 as fixed by conditioning on them, and consider the joint asymptotics in which T = T_n → ∞ as n → ∞. The limit is always taken as n → ∞.

3.1 Assumptions

This section provides conditions under which the FE-SQR estimator has a limiting normal distribution with a bias in the mean when n and T grow at the same rate. The conditions below are stronger than needed if the main concern is to prove only consistency, or asymptotic normality of the estimator when n/T → 0.
However, we believe that the conditions are, to some extent, standard in the literature.

(A1) For each i ≥ 1, the process {(y_it, x_it), t = 0, ±1, ±2, ...} is stationary and β-mixing. Let β_i(j) denote the β-mixing coefficient of the process {(y_it, x_it), t = 0, ±1, ±2, ...}.8 Assume that there exist constants a ∈ (0, 1) and B ≥ 0 such that sup_{i≥1} β_i(j) ≤ B a^j. Assume further that the processes {(y_it, x_it), t = 0, ±1, ±2, ...} are independent across i.

(A2) There exists a constant M such that sup_{i≥1} ∥x_i1∥ ≤ M (a.s.).

(A3) The parameter sets A ⊂ R and B ⊂ R^p are compact. For each i ≥ 1, α_i0 is an interior point of A, and β_0 is an interior point of B.

The distribution of (y_it, x_it) is allowed to depend on i. Define u_it := y_it − α_i0 − x_it'β_0. Condition (A1) implies that the process {(u_it, x_it), t = 0, ±1, ±2, ...} is stationary and β-mixing for each i with β-mixing coefficient β_i(·), and independent across i. Let F_i(u|x) denote the conditional distribution function of u_i1 given x_i1 = x. We assume that F_i(u|x) has a density f_i(u|x). Let f_i(u) denote the marginal density of u_i1. To cope with the time series dependence, we further assume that for each i ≥ 1 and j = ±1, ±2, ..., there exists a joint conditional density f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j}) of (u_i1, u_{i,1+j}) given (x_i1, x_{i,1+j}) = (x_1, x_{1+j}). Let f_{i,j}(u_1, u_{1+j}) denote the joint (unconditional) density of (u_i1, u_{i,1+j}). For two sequences of positive numbers {a_n} and {b_n}, we write a_n ≍ b_n when there exists a positive constant C such that C^{-1} b_n ≤ a_n ≤ C b_n for every n.

(A4) The minimum eigenvalues of E[f_i(0|x_i1) x_i1 x_i1'] are bounded away from zero uniformly over i ≥ 1.

(A5) (a) For each i ≥ 1, f_i(u|x) is r times continuously differentiable with respect to u for each fixed x, where r ≥ 4 is an even integer. Let f_i^{(j)}(u|x) := (∂/∂u)^j f_i(u|x) for j = 0, 1, ..., r, where f_i^{(0)}(u|x) stands for f_i(u|x). (b) There exists a constant C_f such that |f_i^{(j)}(u|x)| ≤ C_f uniformly over (u, x) for all j = 0, 1, ..., r and i ≥ 1. (c) inf_{i≥1} f_i(0) > 0. (d) For each i ≥ 1 and j = ±1, ±2, ..., f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j}) is r times continuously differentiable with respect to (u_1, u_{1+j}) for each fixed (x_1, x_{1+j}). Let f_{i,j}^{(k,l)}(u_1, u_{1+j}|x_1, x_{1+j}) := ∂^{k+l} f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j})/∂u_1^k ∂u_{1+j}^l, where f_{i,j}^{(0,0)}(u_1, u_{1+j}|x_1, x_{1+j}) stands for f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j}). (e) There exists a constant C_f' such that |f_{i,j}^{(k,l)}(u_1, u_{1+j}|x_1, x_{1+j})| ≤ C_f' uniformly over (u_1, u_{1+j}, x_1, x_{1+j}) for all 0 ≤ k + l ≤ r, i ≥ 1 and j = ±1, ±2, .... (f) There exists an integrable function D: R → [0, ∞) such that |f_{i,j}^{(0,l)}(u_1, u_{1+j}|x_1, x_{1+j})| ≤ D(u_{1+j}) uniformly over (u_1, x_1, x_{1+j}) for all 0 ≤ l ≤ r.

(A6) (a) K(·) is symmetric about the origin and continuously differentiable. (b) K(·) is an r-th order kernel: ∫_{−∞}^{∞} K(u)du = 1, ∫_{−∞}^{∞} u^j K(u)du = 0 for j = 1, ..., r − 1, and ∫_{−∞}^{∞} |u^r K(u)|du < ∞, where r is given in condition (A5).

(A7) h_n ≍ T^{−c}, where 1/r < c < 1/3.

Put s_i := 1/f_i(0), γ_i := s_i E[f_i(0|x_i1) x_i1] and ν_i := f_i^{(1)}(0)γ_i − E[f_i^{(1)}(0|x_i1) x_i1].

(A8) (a) G_n := n^{-1} Σ_{i=1}^n E[f_i(0|x_i1) x_i1 (x_i1' − γ_i')] is nonsingular for each n, and the limit G := lim_{n→∞} G_n exists and is nonsingular. (b) Let V_ni denote the covariance matrix of the term T^{-1/2} Σ_{t=1}^T {τ − I(u_it ≤ 0)}(x_it − γ_i). Assume the limit V := lim_{n→∞} n^{-1} Σ_{i=1}^n V_ni exists and is nonsingular.

(A9) Define

ω_ni^{(1)} := Σ_{1≤|j|≤T−1} (1 − |j|/T){τ f_i(0) − ∫_{−∞}^0 f_{i,j}(0, u)du},

ω_ni^{(2)} := Σ_{1≤|j|≤T−1} (1 − |j|/T){τ E[f_i(0|x_i1) x_i1] − E[x_i1 ∫_{−∞}^0 f_{i,j}(0, u|x_i1, x_{i,1+j})du]},

ω_ni^{(3)} := Σ_{|j|≤T−1} (1 − |j|/T) Cov{I(u_i1 ≤ 0), I(u_{i,1+j} ≤ 0)}.

Assume that max_{1≤i≤n} |ω_ni^{(1)}| = O(1), max_{1≤i≤n} ∥ω_ni^{(2)}∥ = O(1), and the limit lim_{n→∞} n^{-1} Σ_{i=1}^n s_i(ω_ni^{(1)} γ_i − ω_ni^{(2)} + s_i ω_ni^{(3)} ν_i/2) exists.

8 For any stationary process {X_t, t = 0, ±1, ±2, ...}, the β-mixing coefficient is defined as β(k) := (1/2) sup{Σ_{i=1}^I Σ_{j=1}^J |P(A_i ∩ B_j) − P(A_i)P(B_j)| : {A_i}_{i=1}^I is any finite partition in A_{−∞}^0 and {B_j}_{j=1}^J is any finite partition in A_k^∞}, where A_i^j is the σ-field generated by X_i, ..., X_j for i < j.

Condition (A1) assumes that the observations are independent in the cross section dimension, but allows for weak dependence in the time dimension. Condition (A1) is basically the same as Condition 1 of Hahn and Kuersteiner (2011). The only difference is that they assumed a uniform exponential α-mixing condition in the time dimension. Generally, β-mixing is a slightly stronger requirement than α-mixing. The reason to assume a β-mixing condition here is to establish convergence rates of kernel type statistics. For that purpose, β-mixing is more tractable than α-mixing. We refer to Section 2.6 of Fan and Yao (2003) for some basic materials on mixing processes. The condition that x_it is uniformly bounded can be relaxed at the expense of lengthier proofs. Condition (A3) is a conventional assumption in the literature. As Kato, Galvao, and Montes-Rojas (2011) noted, condition (A4) guarantees identification of each α_i0 and β_0. Conditions (A5) (a)-(c) state some restrictions on the conditional densities, and are standard in the QR literature (see Assumption 4 of Horowitz (1998) and condition (ii) of Angrist, Chernozhukov, and Fernandez-Val (2006, Theorem 3)). We require no less than four times differentiability of the conditional densities to make the smoothing bias negligible. Conditions (A5) (d)-(f) are mainly used to derive an explicit bias expression (see the proof of Lemma A.6). Condition (A6) corresponds to Assumption 5 of Horowitz (1998). Condition (A6) (b) requires K(·) to be a higher order kernel. The requirement that c < 1/3 is explained in Horowitz (1998). The proof of Theorem 3.
1 below shows that the asymptotic bias of β̂ is max{O(T^{-1}), O(h_n^r)}, where the O(T^{-1}) term corresponds to the estimation error of each individual effect and the O(h_n^r) term corresponds to the smoothing bias. To make h_n^r = o(T^{-1}), we need cr > 1. A higher order kernel is needed to ensure condition (A7). Several higher order kernels are introduced in Muller (1984). An implicit assumption behind this condition is that T grows as fast as n. In practice, it is recommended to set h_n = o{(nT)^{-1/(2r)}} in order to make the smoothing bias o{(nT)^{-1/2}}. Conditions (A8) and (A9) are concerned with the asymptotic covariance matrix and the asymptotic bias, respectively. The exact form of the term V_ni is given by

V_ni = Σ_{|j|≤T−1} (1 − |j|/T) E[{τ − I(u_i1 ≤ 0)}{τ − I(u_{i,1+j} ≤ 0)}(x_i1 − γ_i)(x_{i,1+j} − γ_i)'].

If there is no time series dependence, i.e., for each i, the process {(y_it, x_it), t = 0, ±1, ±2, ...} is i.i.d., then V_ni = τ(1 − τ)E[(x_i1 − γ_i)(x_i1 − γ_i)'], ω_ni^{(1)} = 0, ω_ni^{(2)} = 0 and ω_ni^{(3)} = τ(1 − τ).

3.2 Limiting distribution

We first show consistency of the estimator. Although the main concern is the distributional result, the consistency of the estimator is an important prerequisite since it guarantees that (α̂, β̂) is an interior point of A^n × B and thus satisfies the first order condition (2.4) with probability approaching one. As in Kato, Galvao, and Montes-Rojas (2011), we say that (α̂, β̂) is weakly consistent if max_{1≤i≤n} |α̂_i − α_i0| →p 0 and β̂ →p β_0. The proof of Proposition 3.1 is given in Appendix A.1.

Proposition 3.1. Assume that (log n)/√T → 0 and h_n → 0.
Then, under conditions (A1)-(A6), (α̂, β̂) is weakly consistent.

We now present the limiting distribution of the FE-SQR estimator when n and T grow at the same rate. The proof of Theorem 3.1 is given in Appendix A.2.

Theorem 3.1. Assume that n/T → ρ for some ρ > 0. Then, under conditions (A1)-(A9), we have

√(nT)(β̂ − β_0) →d N(√ρ b, G^{-1} V G^{-1}),  (3.1)

where

b := G^{-1} [lim_{n→∞} {(1/n) Σ_{i=1}^n s_i(ω_ni^{(1)} γ_i − ω_ni^{(2)} + s_i ω_ni^{(3)} ν_i/2)}].  (3.2)

The bias (3.2) is of a complicated form. In the next subsection, to gain some insight into the bias expression, we compare our bias formula with the Hahn-Newey formula in the simplest case where the observations are independent in the time dimension. By the theorem, one can see that the bias consists of three terms. As the proof shows, the first one comes from the correlation between α̂_i and the α_i-derivative of the α_i-score evaluated at the truth; the second one comes from the correlation between α̂_i and the α_i-derivative of the β-score evaluated at the truth; the third one comes from the variance of α̂_i. See the expression (A.16) and Lemmas A.5-A.6 in Appendix A.

Our proof strategy for Theorem 3.1 is substantially different from the functional expansion method of Li, Lindsay, and Waterman (2003) that many papers have exploited. The reason is that their technique is difficult to adapt to the situation where kernel smoothing is used. The proof strategy is perhaps closer in spirit to the iterative method of stochastic expansion displayed in Rilstone, Srivastava, and Ullah (1996), but still quite different from theirs. It should be pointed out that although the present objective function is smooth, the proof is still non-trivial since we have to control the smoothing effect and at the same time handle the problem of a diverging number of nuisance parameters. An additional complication arises since the observations are dependent in the time dimension.
To cope with this complication, we derive new empirical process inequalities applicable to β-mixing processes, which are presented in Appendix C.

3.3 A heuristic comparison with the Hahn-Newey formula

In this subsection, we provide a heuristic comparison of our bias formula with the Hahn-Newey formula in the case that the observations are independent in the time dimension.9 Hahn and Newey (2004) gave the bias formula for the general fixed effects M-estimator with smooth objective functions in the case that the observations are independent in both the cross section and time dimensions (Hahn and Newey (2004) focused on the likelihood setting, but as they stated, the bias expression presented in Hahn and Newey (2004, p.1302) does not depend on the likelihood setting). Observe that when the observations are independent in the time dimension, the bias term b reduces to the simpler form

b = G^{-1} [lim_{n→∞} {(1/n) Σ_{i=1}^n (τ(1 − τ)/2) s_i² ν_i}].

In what follows, with a heuristic approach, we show that our bias formula is identical to the Hahn-Newey formula applied to the (unsmoothed) QR objective function. In the present context, "heuristic" means that there is no formal justification in the mechanical application of the Hahn-Newey formula to the QR objective function as demonstrated below. To begin with, we briefly review the Hahn-Newey formula. Let f(z_it; α, β) denote some smooth objective function to be maximized, where z_it is a random vector of conformal dimension, α is a scalar constant and β is a constant vector of conformal dimension.

9 It is possible to compare our result with the bias expression derived in Hahn and Kuersteiner (2011) for the case that the observations are dependent in the time dimension. However, the purpose of this section is to provide an intuition behind our bias formula, and to make the argument simpler, we restrict our attention to the case that the observations are independent in the time dimension.
Let α_i0 and β_0 denote the true parameter values. Define

v_it(α, β) := (∂/∂α) f(z_it; α, β),  w_it(α, β) := (∂/∂β) f(z_it; α, β).

As in Hahn and Newey (2004), we use the notation v_it = v_it(α_i0, β_0) and w_it = w_it(α_i0, β_0). The Hahn-Newey bias formula is given by

Bias = √ρ × (lim_{n→∞} (1/n) Σ_{i=1}^n I_i)^{-1} (lim_{n→∞} (1/n) Σ_{i=1}^n b_i),

where ρ := lim_{n→∞}(n/T), and I_i and b_i =: b_{1i} − b_{2i} + b_{3i} are defined as

I_i := −E[w_{itβ}] + E[w_{itα}] E[v_{itβ}']/E[v_{itα}],

b_{1i} := E[w_{itα}] E[v_{itα} v_it]/E[v_{itα}]²,  b_{2i} := E[w_{itα} v_it]/E[v_{itα}],

b_{3i} := (E[v_it²]/(2 E[v_{itα}]²)) {E[w_{itαα}] − E[w_{itα}] E[v_{itαα}]/E[v_{itα}]}.

The terms such as v_{itα} are defined as v_{itα} = ∂v_it(α, β)/∂α|_{(α,β)=(α_i0,β_0)}, and so on. Note that we have changed the original order of terms in b_i to make the comparison more comprehensible. Here, b_{1i}, b_{2i} and b_{3i} correspond to what we have called the sources of the bias in the previous subsection. Also, we have slightly changed the original notation to avoid notational multiplication.

We now turn to the QR case. Assume without loss of generality α_i0 = 0 and β_0 = 0. The objective function is f(z_it; α, β) = −ρ_τ(u_it − α − x_it'β) with z_it = (u_it, x_it), where ρ_τ(u) = u{τ − I(u ≤ 0)} is the check function. The scores would be

v_it(α, β) = τ − I(u_it ≤ α + x_it'β),  w_it(α, β) = {τ − I(u_it ≤ α + x_it'β)} x_it.

The scores for the QR objective function are not uniquely determined since the check function ρ_τ(u) is not differentiable at u = 0. However, we shall intuitively calculate the Hahn-Newey bias formula for the present scores based on some heuristic rule. The scores include the indicator function, so we cannot compute, for instance, E[v_{itα}] as it is. Instead, we shall compute E[v_{itα}] as ∂E[v_it(α, β)]/∂α|_{(α,β)=(α_i0,β_0)}.10 Using this heuristic rule, we can obtain:

E[v_{itα}] = −f_i(0),  E[v_{itβ}] = E[w_{itα}] = −E[f_i(0|x_it) x_it],  E[v_{itαα}] = −f_i^{(1)}(0),

E[w_{itβ}] = −E[f_i(0|x_it) x_it x_it'],  E[w_{itαα}] = −E[f_i^{(1)}(0|x_it) x_it].
The calculation of the terms E[v_{itα} v_it] and E[w_{itα} v_it] is not straightforward. We make use of the fact that E[v_{itα} v_it] = E[∂(v_it²)/∂α]/2 and E[w_{itα} v_it] = E[{∂(v_it²)/∂α} x_it]/2. Exchanging the derivative and the expectation, we can obtain

E[v_{itα} v_it] = ((1 − 2τ)/2) f_i(0),  E[w_{itα} v_it] = ((1 − 2τ)/2) E[f_i(0|x_it) x_it].

Substituting these results into the Hahn-Newey formula, we can obtain

I_i = E[f_i(0|x_it) x_it x_it'] − E[f_i(0|x_it) x_it] γ_i' = E[f_i(0|x_it) x_it (x_it' − γ_i')],

b_{1i} = ((2τ − 1)/2) γ_i,  b_{2i} = ((2τ − 1)/2) γ_i,

b_{3i} = (τ(1 − τ)/2){−E[f_i^{(1)}(0|x_it) x_it] + f_i^{(1)}(0) γ_i}/f_i²(0) = (τ(1 − τ)/2) s_i² ν_i.

Therefore, the Hahn-Newey formula reduces to our formula. Note that in the case that the observations are independent in the time dimension, b_{1i} and b_{2i} are canceled out, so that the resulting bias is of a simple form. Of course, the above calculation is not formal because of the non-differentiability of the QR objective function, or, more precisely, because the Hahn-Newey bias formula depends on the higher order stochastic expansion of the scores, while that for the QR objective function is non-standard, as discussed in the Introduction. In the present paper, the smoothing is employed as a device to make the above heuristic calculation rigorous.

3.4 Bias correction

As stated in the literature, the problem with the limiting distribution of √(nT)(β̂ − β_0) not being centered at zero is that the usual confidence intervals based on the asymptotic approximation will be incorrect. In particular, even if b is small, the asymptotic bias can be of moderate size when the ratio n/T is large. In this subsection, we shall consider the bias correction to the FE-SQR estimator.
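The heuristic rule above replaces E[v_{itα}] by the derivative of the expected score, ∂E[v_it(α, β)]/∂α, which is smooth even though the score itself is not. As a quick sanity check (our own toy example with standard normal u_it, not taken from the paper), this derivative can be computed by a finite difference and compared with −f_i(0):

```python
import math

def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def normal_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

tau = 0.25

def mean_alpha_score(alpha):
    # E[v_it(alpha, 0)] = tau - P(u_it <= alpha): smooth in alpha even
    # though v_it itself contains an indicator function.
    return tau - normal_cdf(alpha)

# Central finite difference for dE[v_it(alpha, 0)]/d(alpha) at alpha = 0.
eps = 1e-6
deriv = (mean_alpha_score(eps) - mean_alpha_score(-eps)) / (2.0 * eps)

# The heuristic rule asserts E[v_{it,alpha}] = -f_i(0) = -1/sqrt(2*pi).
target = -normal_pdf(0.0)
```

The two quantities agree to high accuracy, which is exactly the property that the smoothing in (2.3) exploits: expectations of the non-smooth scores are smooth functions of the parameters.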
10 Another way to implement this heuristic calculation is to use "generalized functions" as in Phillips (1991).

We first consider a one-step bias correction based on the analytic form of the asymptotic bias. Put û_it := y_it − α̂_i − x_it'β̂. The terms f_i := f_i(0), s_i, γ_i, ν_i and G can be estimated by

f̂_i := (1/T) Σ_{t=1}^T K_{h_n}(û_it),  ŝ_i := 1/f̂_i,  γ̂_i := (ŝ_i/T) Σ_{t=1}^T K_{h_n}(û_it) x_it,

ν̂_i := (1/(T h_n²)) Σ_{t=1}^T K^{(1)}(û_it/h_n)(x_it − γ̂_i),  Ĝ_n := (1/(nT)) Σ_{i=1}^n Σ_{t=1}^T K_{h_n}(û_it) x_it (x_it' − γ̂_i'),

where K^{(1)}(u) = dK(u)/du. The estimation of the terms ω_ni^{(1)}, ω_ni^{(2)} and ω_ni^{(3)} is a more delicate issue, since it reduces to the estimation of quantities involving the time series dependence. As in Hahn and Kuersteiner (2011), we make use of truncation. Put

f_i(j) := ∫_{−∞}^0 f_{i,j}(0, u)du,  φ_i(j) := E[x_i1 ∫_{−∞}^0 f_{i,j}(0, u|x_i1, x_{i,1+j})du],  ϱ_i(j) := E[I(u_i1 ≤ 0)I(u_{i,1+j} ≤ 0)].

Since φ_i(j) ≈ E[K_{h_n}(u_i1) I(u_{i,1+j} ≤ 0) x_i1], it can be estimated by

φ̂_i(j) := (1/T) Σ_{t=max{1,−j+1}}^{min{T,T−j}} K_{h_n}(û_it) I(û_{i,t+j} ≤ 0) x_it.

Similarly, f_i(j) can be estimated by

f̂_i(j) := (1/T) Σ_{t=max{1,−j+1}}^{min{T,T−j}} K_{h_n}(û_it) I(û_{i,t+j} ≤ 0).

The term ϱ_i(j) can be estimated by its sample analogue:

ϱ̂_i(j) := (1/T) Σ_{t=max{1,−j+1}}^{min{T,T−j}} I(û_it ≤ 0) I(û_{i,t+j} ≤ 0).

Take a sequence m_n → ∞ sufficiently slowly. Then, ω_ni^{(1)}, ω_ni^{(2)} and ω_ni^{(3)} can be estimated by

ω̂_ni^{(1)} := Σ_{1≤|j|≤m_n} (1 − |j|/T){τ f̂_i − f̂_i(j)},

ω̂_ni^{(2)} := Σ_{1≤|j|≤m_n} (1 − |j|/T){τ f̂_i γ̂_i − φ̂_i(j)},

ω̂_ni^{(3)} := τ(1 − τ) + Σ_{1≤|j|≤m_n} (1 − |j|/T){−τ² + ϱ̂_i(j)}.

The bias term b is thus estimated by

b̂ := Ĝ_n^{-1} {(1/n) Σ_{i=1}^n ŝ_i(ω̂_ni^{(1)} γ̂_i − ω̂_ni^{(2)} + ŝ_i ω̂_ni^{(3)} ν̂_i/2)}.

In practice, there is no need to compute the terms τ f̂_i and τ f̂_i γ̂_i in ω̂_ni^{(1)} and ω̂_ni^{(2)}, respectively, as they are canceled out in the difference ω̂_ni^{(1)} γ̂_i − ω̂_ni^{(2)}. Additionally, there is no need to use the same kernel and the same bandwidth to estimate β_0 and b.
In particular, the kernel used in the estimation of the bias term need not be of a higher order (see the remark after Theorem 3.2 below). To keep the notation simpler, we do not introduce a new kernel and a new bandwidth. We define the one-step bias corrected estimator by $\hat\beta^1 := \hat\beta - \hat b/T$. The next theorem shows that under the same conditions as Theorem 3.1, $\hat\beta^1$ has the limiting normal distribution with mean zero and the same covariance matrix. The proof of the theorem is given in Appendix A.3.

Theorem 3.2. Let $m_n \to \infty$ be such that $m_n(\log n)/(Th_n^2) \to 0$. Under the same conditions as Theorem 3.1, we have
\[
\sqrt{nT}(\hat\beta^1 - \beta_0) \stackrel{d}{\to} N(0, G^{-1}VG^{-1}).
\]

Remark 3.1. As discussed in Section 3.1, the order of the kernel used to construct the FE-SQR estimator has to be at least 4. However, the order of the kernel used to construct the bias estimator can be 2. In view of the proof of Theorem 3.2, to make $\hat b$ consistent, it is sufficient that $h_n \asymp T^{-c}$ with $1/(2r+1) \le c < 1/3$, where $r$ is the order of the kernel. In particular, $r$ can be 2 in the estimation of the bias term alone.

Remark 3.2 (Estimation of the asymptotic covariance matrix). As a byproduct of Theorem 3.2, we can obtain a consistent estimator of the asymptotic covariance matrix. The proof of Theorem 3.2 shows that $\hat G_n \stackrel{p}{\to} G$. The matrix $V$ can be consistently estimated by $\hat V_n := n^{-1}\sum_{i=1}^n \hat V_{ni}$, where
\[
\hat V_{ni} := \frac{\tau(1-\tau)}{T}\sum_{t=1}^T (x_{it}-\hat\gamma_i)(x_{it}-\hat\gamma_i)' + \sum_{1\le|j|\le m_n}\Big(1-\frac{|j|}{T}\Big)\frac{1}{T}\sum_{t=\max\{1,-j+1\}}^{\min\{T,T-j\}}\{\tau - I(\hat u_{it}\le 0)\}\{\tau - I(\hat u_{i,t+j}\le 0)\}(x_{it}-\hat\gamma_i)(x_{i,t+j}-\hat\gamma_i)'.
\]
The proof of the consistency of $\hat V_n$ is similar to the proof of the consistency of $\hat b$ given in Appendix A.3. Under the same conditions as Theorem 3.2, it is shown that $\hat G_n^{-1}\hat V_n\hat G_n^{-1}$ is consistent for the asymptotic covariance matrix $G^{-1}VG^{-1}$.
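To make the structure of $\hat V_{ni}$ concrete, the following sketch computes it for a scalar regressor. It is an illustrative transcription of the display above under our reading of it (function and variable names are ours), not code from the paper.

```python
def v_hat_i(u_hat, x, gamma_i, tau, m):
    # Scalar-regressor sketch of V_ni: tau(1-tau) times the average of
    # (x_t - gamma_i)^2, plus Bartlett-weighted truncated cross terms in
    # psi_t = tau - I(u_t <= 0).
    T = len(u_hat)
    psi = [tau - (1.0 if u <= 0 else 0.0) for u in u_hat]
    v = tau * (1.0 - tau) * sum((xt - gamma_i) ** 2 for xt in x) / T
    for j in range(-m, m + 1):
        if j == 0:
            continue
        lo, hi = max(1, -j + 1), min(T, T - j)
        acc = sum(psi[t - 1] * psi[t - 1 + j]
                  * (x[t - 1] - gamma_i) * (x[t - 1 + j] - gamma_i)
                  for t in range(lo, hi + 1))
        v += (1.0 - abs(j) / T) * acc / T
    return v
```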
Here, the term $(1-|j|/T)$ in the definition of $\hat V_{ni}$ can be replaced by 1 or $(1-|j|/m_n)$, and such a replacement does not affect the consistency property.

Remark 3.3. As discussed in Hahn and Kuersteiner (2011), for small $T$, a default choice of $m_n$ would be 1. For relatively large $T$, the choice of $m_n$ could be significant. However, a standard theory of autocorrelation consistent covariance matrix estimation, such as that given in Andrews (1991), cannot be applied here because Andrews (1991) restricted the moment functions to be smooth, while the moment function of quantile regression is not smooth; the optimal choice of $m_n$ is beyond the scope of the paper.³

³In the case that the observations are independent in the time dimension, Hahn and Newey (2004, Theorem 2) showed that under suitable regularity conditions, the mean-zero asymptotic normality of the bias corrected MLE for smooth likelihood functions holds when $n/T^3 \to 0$. In the present case, although we do not exclude the possibility that Theorem 3.2 holds under $n/T^3 \to 0$, a formal justification would be quite involved and would require even more stringent conditions, so we do not pursue this direction here. To be precise, consider a fixed effects estimator $\hat\theta$ of the common parameter $\theta_0$ for some "standard" panel model (say, a panel probit model) with individual effects. If the objective function is smooth enough, and some moment condition is satisfied, then, as Hahn and Newey (2004) and the subsequent studies suggested, $\hat\theta$ admits the expansion
\[
\hat\theta - \theta_0 = \frac{1}{nT}\sum_{i=1}^n\sum_{t=1}^T U_{it} + \frac{B}{T} + O_p(T^{-2}) + o_p\{(nT)^{-1/2}\}, \tag{3.3}
\]
where the $U_{it}$ are mean-zero random vectors and $B$ is some constant which eventually contributes to the bias. Importantly, this expansion is valid without $n/T \to \text{const.}$ (though a suitable growth condition to ensure the consistency of the estimator is required). The condition that $n/T \to \text{const.}$ comes in so that the $(nT)^{-1/2}$ and $T^{-1}$ rates are balanced.
If we "successfully" remove the bias $B$, which means that $B$ itself can be estimated with bias of order $T^{-1}$, then the asymptotic normality (with mean zero) holds under the condition that $T^{-2}$ vanishes faster than $(nT)^{-1/2}$, i.e., $n = o(T^3)$ (this discussion is informal but explains the main point of what Hahn and Newey (2004) did). To summarize, Hahn and Newey's Theorem 2 crucially depends on two observations: (i) the fixed effects estimator admits the expansion of the form (3.3); (ii) the bias term itself can be estimated with bias of order $T^{-1}$, i.e., $\hat B = B + O_p\{T^{-1} + (nT)^{-1/2}\}$.

In contrast, such a "clean" expansion does not hold for the FE-SQR estimator because of the presence of the smoothing bias, and even higher order expansions of the FE-SQR objective function would be quite involved. Additionally, guaranteeing an expansion of the form (3.3) in the present case requires considerably stringent conditions. For example, there is a bias term of order $h_n^r$ in the present case, and to make this term of order $T^{-2}$, $r$ (the order of the kernel) should be at least 8. Implementing (ii) in the present case is also challenging, since the bias now depends on long-run covariances. In fact, even for standard smooth objective functions, Hahn and Kuersteiner (2011) only proved the asymptotic normality of the bias corrected estimator under the condition that $n/T \to \text{const}$. However, it is still true that the FE-SQR estimator $\hat\beta$ admits the expansion of the form (3.3) with the $O_p(T^{-2})$ term replaced by $o_p(T^{-1})$. A careful inspection of the proof shows that this expansion holds under a slightly weaker growth condition on $T$ than $n/T \to \text{const.}$, although $T$ should not grow too slowly relative to $n$, to ensure sufficient convergence rates of some kernel type statistics (and the consistency of the FE-SQR estimator).
Therefore, if we can consistently estimate the bias term $b$, the asymptotic normality (with mean zero) holds under a slightly weaker growth condition on $T$ than $n/T \to \text{const}$.

An undesirable feature of the analytic bias correction is that it depends on the inverse of the estimated densities $\hat f_i$. When $T$ is relatively small, on rare occasions, $\hat f_i$ may be very close to zero for some $i$, which results in an unreasonable value of $\hat\beta^1$. To avoid such a situation, a reasonable solution is to use trimming. Let $\hat I_i := I(\hat f_i > \kappa_n)$ for some sequence $\kappa_n$ such that $\kappa_n \to 0$ slowly. We propose to modify $\hat G_n$ and $\hat b$ as
\[
\tilde G_n = \frac{1}{nT}\sum_{i=1}^n \hat I_i\sum_{t=1}^T K_{h_n}(\hat u_{it})x_{it}(x_{it}-\hat\gamma_i)', \qquad \tilde b = \tilde G_n^{-1}\Big\{\frac{1}{n}\sum_{i=1}^n \hat I_i\hat s_i\Big(\hat\omega_{ni}^{(1)}\hat\gamma_i - \hat\omega_{ni}^{(2)} + \frac{\hat s_i\hat\omega_{ni}^{(3)}\hat\nu_i}{2}\Big)\Big\}.
\]
The validity of such a modification is simple: by the proof of Theorem 3.2, $\max_{1\le i\le n}|\hat f_i - f_i| \stackrel{p}{\to} 0$, and since the $f_i$ are bounded away from zero, with probability approaching one we have $\hat f_i > \kappa_n$ for all $1 \le i \le n$. Trimming is frequently used in the literature. We refer to Section 4 for the choice of $\kappa_n$ in our implementation.

We next consider the half-panel jackknife proposed by Dhaene and Jochmans (2009) to correct the bias of $\hat\beta$. Suppose for a moment that $T$ is even. Partition $\{1,\dots,T\}$ into two subsets $S_1 := \{1,\dots,T/2\}$ and $S_2 := \{T/2+1,\dots,T\}$. Let $\hat\beta_{S_l}$ be the FE-SQR estimate based on the data $\{(y_{it},x_{it}),\ 1\le i\le n,\ t\in S_l\}$ for $l=1,2$. The half-panel jackknife estimator is defined as $\hat\beta^{1/2} := 2\hat\beta - \bar\beta^{1/2}$, where $\bar\beta^{1/2} := (\hat\beta_{S_1}+\hat\beta_{S_2})/2$. For simplicity, suppose for a moment that we use the same bandwidth to construct $\hat\beta$ and $\hat\beta_{S_l}$ ($l=1,2$). Then, from the asymptotic representation of the FE-SQR estimator that appears in the proof of Theorem 3.1 (see (A.16) in Appendix A), it can be shown that under the same conditions as Theorem 3.1 (including $n/T \to \rho$), we have $\sqrt{nT}(\hat\beta^{1/2}-\beta_0) \stackrel{d}{\to} N(0, G^{-1}VG^{-1})$. The half-panel jackknife estimator would be attractive in both theoretical and practical senses. In fact, its validity does not require any extra condition.
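The half-panel jackknife is straightforward to implement for any panel estimator. The sketch below is generic: `estimate` is a placeholder for a fitting routine such as FE-SQR (which we do not implement here), and the toy estimator in the usage note, with an exact $c/T$ bias, shows why the combination removes the first-order bias.

```python
def half_panel_jackknife(estimate, y, x):
    # beta_half = 2 * beta_full - (beta_S1 + beta_S2) / 2, where S1 and S2
    # are the first and second halves of the time dimension (T even).
    T = len(y[0])
    half = T // 2
    full = estimate(y, x)
    b1 = estimate([yi[:half] for yi in y], [xi[:half] for xi in x])
    b2 = estimate([yi[half:] for yi in y], [xi[half:] for xi in x])
    return 2.0 * full - 0.5 * (b1 + b2)
```

If the estimator's bias is exactly $c/T$, each half panel has bias $2c/T$, and $2(\beta_0 + c/T) - (\beta_0 + 2c/T) = \beta_0$: the $O(T^{-1})$ term cancels without estimating any nonparametric object.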
It would also be preferable from a practical point of view, since it does not require the nonparametric estimation of the bias term and at the same time is easy to implement. One can find some other options to correct the bias of the FE-SQR estimator in the literature (see, for instance, Hahn and Newey, 2004; Arellano and Hahn, 2005; Bester and Hansen, 2009; Dhaene and Jochmans, 2009). Although an exhaustive comparative study of such methods is an interesting topic, it is beyond the scope of this paper, and we restrict our attention to the one-step bias correction and the half-panel jackknife method.

4 A Monte Carlo study

4.1 Design

In this section, we report a small Monte Carlo study to investigate the finite sample performance of the bias correction. A simple version of model (2.1) is considered in this study:
\[
y_{it} = \eta_i + x_{it} + (1 + 0.5x_{it})\epsilon_{it}, \qquad \epsilon_{it} = e_{it} + \theta e_{i,t-1}, \qquad \eta_i \sim \text{i.i.d. } U[0,1], \tag{4.1}
\]
where $x_{it} = 0.3\eta_i + z_{it}$, $z_{it} \sim$ i.i.d. $\chi_3^2$, and $e_{it} \sim$ i.i.d. $F$ with $F = N(0,1)$ or $\chi_3^2$.¹² In this model, $a_{i0} = a_{i0}(\tau) = \eta_i + F_\epsilon^{-1}(\tau)$ and $\beta_0 = \beta_0(\tau) = 1 + 0.5F_\epsilon^{-1}(\tau)$, where $F_\epsilon^{-1}$ is the quantile function of $\epsilon_{it}$. We consider here the cases where $n \in \{50,100,500\}$, $T \in \{10,20,50,100\}$ and $\tau \in \{0.25,0.50,0.75\}$. We use two different parameter values $\theta \in \{0.5,1\}$ to control the autocorrelation in $\epsilon_{it}$. As in Horowitz (1998), we use the fourth order kernel
\[
K(u) := \frac{105}{64}(1 - 5u^2 + 7u^4 - 3u^6)I(|u| \le 1).
\]

¹²The boundedness restriction on the regressors required by condition (A2) is not satisfied by the present specification. However, we believe that condition (A2) is mainly a technical condition and the simulation results should not be affected by the use of this distribution. In fact, we also examined the cases with a different degree of heteroscedasticity and a different error distribution, and the results are qualitatively similar.

The number of repetitions for each Monte Carlo experiment is 1,000.
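One replication of design (4.1) can be drawn as follows. This is a sketch under our reading of the design, restricted to the normal-innovation case: $z_{it} \sim \chi^2_3$ is generated as a sum of three squared standard normals, and $e_{i,0}$ is initialized from the innovation distribution; all names are ours.

```python
import random

def simulate_panel(n, T, theta, seed=0):
    # One panel from (4.1): y_it = eta_i + x_it + (1 + 0.5 x_it) eps_it,
    # with eps_it = e_it + theta * e_{i,t-1} (MA(1)) and x_it = 0.3 eta_i + z_it.
    rng = random.Random(seed)
    y, x = [], []
    for _ in range(n):
        eta = rng.random()                 # eta_i ~ U[0, 1]
        e_prev = rng.gauss(0.0, 1.0)       # e_{i,0}
        yi, xi = [], []
        for _ in range(T):
            z = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))  # chi^2_3
            xit = 0.3 * eta + z
            e = rng.gauss(0.0, 1.0)
            eps = e + theta * e_prev
            e_prev = e
            yi.append(eta + xit + (1.0 + 0.5 * xit) * eps)
            xi.append(xit)
        y.append(yi)
        x.append(xi)
    return y, x
```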
The FE-SQR objective function is not necessarily convex and may have several local optima. Thus, the choice of the starting value is important when computing the estimate. We choose the standard QR estimate as a reasonable starting value. We use the quantreg package (Koenker, 2009) to compute the standard QR estimates. The estimators under consideration are the standard QR estimator $\hat\beta_{KB}$ defined by (2.2); the FE-SQR estimator $\hat\beta$; the one-step bias corrected FE-SQR estimator $\hat\beta^1$; and the half-panel jackknife estimator $\hat\beta^{1/2}$ described in Section 3. Additionally, we apply the proposed one-step bias correction to the standard QR estimator. We denote this estimator by $\hat\beta^1_{KB}$. It is important to note that there is no theoretical justification for this estimator, but it is interesting to analyze its performance in terms of computation. Indeed, computing the FE-SQR estimator takes considerably longer than computing the standard QR estimator. Nonetheless, smoothing is crucial to formally characterize the bias expression for panel QR models, so we adopt the smoothing approach here to tackle this important problem.

The choice of the bandwidth is a common, sometimes difficult problem when kernel smoothing is used. In this experiment, we use $h_{1n} = c_1 s_1 (nT)^{-1/7}$ to construct the FE-SQR estimate, where $c_1$ is some constant and $s_1$ is the sample standard deviation of $\tilde u_{it} = y_{it} - \hat a_{i,KB} - x_{it}'\hat\beta_{KB}$,¹³ and $h_{2n} = c_2 s_2 T^{-1/5}$ to construct the bias estimate, where $c_2$ is some constant and $s_2$ is the sample standard deviation of $\hat u_{it} = y_{it} - \hat a_i - x_{it}'\hat\beta$. These choices are only for simplicity. We leave the choice of the bandwidth as a future topic. Some intuition behind these choices, however, may be explained as follows. According to condition (A7) and the discussion in Section 3.1, the bandwidth used to construct the FE-SQR estimate should be of the same order as $(nT)^{-c/2}$ with $1/4 < c < 1/3$ when $n$ and $T$ grow at the same rate.
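The two rule-of-thumb bandwidths can be computed directly from the residuals. The sketch below mirrors the formulas above; the helper name and the use of the population standard deviation are our own choices.

```python
import statistics

def bandwidths(u_tilde, u_hat, n, T, c1=1.0, c2=2.0):
    # h1 = c1 * s1 * (nT)^(-1/7) for the FE-SQR estimate, with s1 the
    # standard deviation of the standard-QR residuals u_tilde;
    # h2 = c2 * s2 * T^(-1/5) for the bias estimate, with s2 the
    # standard deviation of the FE-SQR residuals u_hat.
    s1 = statistics.pstdev(u_tilde)
    s2 = statistics.pstdev(u_hat)
    return c1 * s1 * (n * T) ** (-1.0 / 7.0), c2 * s2 * T ** (-1.0 / 5.0)
```

Both bandwidths shrink as the sample grows, but the second depends on $T$ alone, which keeps the bias estimate stable when $n$ is much larger than $T$.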
Our choice $h_{1n}$ satisfies this restriction except for the fact that it depends on the data.¹⁴ On the other hand, by the remark after Theorem 3.2, the bandwidth used to construct $\hat b$, say $h_{2n}$, should be such that $h_{2n} \asymp T^{-c}$ with $1/9 \le c < 1/3$ to make $\hat b$ consistent for $b$. Our choice $h_{2n}$ satisfies this restriction. The reason for the use of different bandwidths is that, in practical situations where $n$ is much larger than $T$, the choice $h_{1n}$ is inclined to be rather small for $\hat b$, which results in less stability of the analytic bias correction. Regarding the constants in $h_{1n}$ and $h_{2n}$, we set $c_1 = 1$ and $c_2 = 2$. As discussed in the previous section, when $T$ is small, on rare occasions $\hat f_i$ may be very close to zero for some $i$, which results in an unreasonable value of $\hat\beta^1$. To avoid this, we use the trimming in our implementation, and set $\kappa_n = 0.01$. For the truncation level, we use $m_n = 1$ as discussed in Remark 3.3.

¹³Here $(\hat a_{1,KB},\dots,\hat a_{n,KB},\hat\beta_{KB})$ is defined as a solution to (2.2). Recall that in the present context we used the standard QR estimate as a starting value to compute the FE-SQR estimate.

¹⁴We do not further discuss the validity of the data dependent bandwidth.

4.2 Results

Before looking at the Monte Carlo results, it is worthwhile to summarize the objectives of this experiment. The objectives are twofold. The first one is to examine whether the bias correction methods work in finite samples. In view of the theoretical results, the biases of $\hat\beta^1$ and $\hat\beta^{1/2}$ should be smaller than that of $\hat\beta$, at least for moderate $T$. The second objective is to examine the effect of the bias correction on the precision of the estimator. The theoretical results suggest that the asymptotic variance of $\hat\beta^1$ or $\hat\beta^{1/2}$ is the same as that of $\hat\beta$. To this end, we present the results for the standard deviation (SD) and the root mean squared error (RMSE) of the estimators.
The results for $\hat\beta_{KB}$ and $\hat\beta^1_{KB}$ are reported only for reference.¹⁵

Table 1 presents the results for the normal innovation with $\theta = 0.5$. For $\tau \in \{0.25, 0.75\}$, the biases of the estimators decrease as $T$ increases while $n$ is fixed. Additionally, it can be seen that $\hat\beta^1$ is able to reduce the bias of $\hat\beta$, while the results for the half-panel jackknife show that it only works for a limited number of cases. The results regarding SD show that the bias correction inflates the variability of the estimator in small samples; the inflation, however, decreases (in levels) as either $n$ or $T$ increases, so that the RMSE of the one-step bias corrected estimator can exceed that of $\hat\beta$ in small samples. These two observations may reflect the difficulty, in finite samples (especially for small $T$), of the nonparametric estimation of the densities and their first derivatives. However, for a large $n/T$ ratio, which is common in many applications, the RMSEs of the bias corrected estimator are smaller than those of $\hat\beta$: for $T \in \{10,20\}$ and $\tau \in \{0.25,0.75\}$, the RMSEs of $\hat\beta^1$ are smaller than those of $\hat\beta$. This is due to the fact that the bias term dominates the variance in these cases.¹⁶

¹⁵Recall that it is not known whether a result analogous to Theorem 3.1 holds for $\hat\beta_{KB}$.

¹⁶In additional simulation experiments, not reported here but available upon request, we have examined the cases where $n = 1000$ and $T \in \{10,20\}$. In such cases, the RMSE of the one-step bias corrected estimator $\hat\beta^1$ is even smaller than that of $\hat\beta$.

Table 2 presents the results for bias, SD, and RMSE for the $\chi_3^2$ innovation with $\theta = 0.5$. In this case, for $\tau = 0.75$ and for relatively small $T$, $\hat\beta_{KB}$ and $\hat\beta$ are considerably biased. However, the one-step bias correction is able to substantially reduce the bias. The results also show that the half-panel jackknife estimator $\hat\beta^{1/2}$ does not work well in bias reduction except for a few cases. The results for SD and RMSE for the $\chi_3^2$ case are presented in the middle and lower parts of Table 2, respectively. They show that both SD and RMSE of $\hat\beta^1$ increase substantially for small panel data.
However, the variance inflation decreases as either $n$ or $T$ increases. Also, as discussed previously, for a large $n/T$ ratio the bias term dominates the RMSE, and we can observe a reduction in RMSE in such cases; for example, for $\tau = 0.75$, $n = 500$ and $T \in \{10,20\}$, the RMSEs of $\hat\beta^1$ are smaller than those of $\hat\beta$. The results for bias, SD, and RMSE for the $\theta = 1$ case are presented in Tables 3 and 4 for the normal and $\chi_3^2$ innovations, respectively. Basically, they are parallel to those for $\theta = 0.5$. Overall, in our limited examples, we may conclude that the one-step bias correction is able to substantially reduce the bias in many cases, although it slightly increases the variability in small samples, partly because of the nonparametric estimation of the bias term. As a final remark, we note that, in our simulation experiments, the analytic bias correction presented in Section 3.3 somehow works for the unsmoothed estimator. This is not totally surprising because the two estimators $\hat\beta$ and $\hat\beta_{KB}$ naturally share a common structure. In practice, it might be an option to apply the bias formula given in Section 3.3 to the unsmoothed estimator. However, it is again noted that there is no formal justification for doing so.

5 Concluding Remarks

In this paper, we have shown that the fixed effects smoothed quantile regression estimator for panel QR models has a limiting normal distribution with a bias in the mean when $n$ and $T$ grow at the same rate. We have further considered two methods of correcting the bias of the estimator, namely an analytic bias correction and the half-panel jackknife method of Dhaene and Jochmans (2009), and theoretically shown that they eliminate the bias in the limiting distribution. Importantly, our results allow for dependence in the time dimension and cover practical situations.
The contribution of this paper is to open a new door to the bias correction of estimators of common parameters in panel QR models, which we believe is an important research area. Many issues remain to be investigated. Substantial differences from smooth nonlinear panel models are the facts that the FE-SQR estimator requires the choice of a bandwidth and that its asymptotic bias consists of nonparametric objects. In particular, the estimation of the bias term is more complicated than in smooth nonlinear panel models, and the choice of the bandwidth is a topic to be further investigated in future work. Moreover, in this paper, we have suggested a consistent estimator for the asymptotic covariance matrix. However, as observed in the simulation experiments, the inflation of the empirical variance of the analytically bias corrected estimator in small samples might be problematic for practical inference, which we conjecture could be overcome by making suitable modifications to stabilize the analytic bias correction. Extensive numerical studies, such as those of Buchinsky (1995) for the cross section case, will be helpful to determine a good way of making practical inference in panel quantile regression models, which will be pursued elsewhere.

A Proofs

For any triangular sequence $\{a_{ni}\}_{i=1}^n$ and any sequence $\epsilon_n \to 0$, we use the notation $a_{ni} = \bar o(\epsilon_n)$ if $\max_{1\le i\le n}|a_{ni}| = o(\epsilon_n)$. We also define $\bar O(\cdot)$, $\bar o_p(\cdot)$ and $\bar O_p(\cdot)$ in a similar fashion. For $x,y\in\mathbb R$, we use the notation $x\vee y = \max\{x,y\}$. For $x\in\mathbb R$, $[x]$ denotes the greatest integer not exceeding $x$.

A.1 Proof of Proposition 3.1

Put
\[
M_{ni}(a,\beta) := \frac{1}{T}\sum_{t=1}^T (y_{it}-a-x_{it}'\beta)\{\tau - G_{h_n}(y_{it}-a-x_{it}'\beta)\}, \quad \tilde M_{ni}(a,\beta) := \frac{1}{T}\sum_{t=1}^T (y_{it}-a-x_{it}'\beta)\{\tau - I(y_{it}\le a+x_{it}'\beta)\},
\]
and $\Delta_i(a,\beta) := E[\tilde M_{ni}(a,\beta) - \tilde M_{ni}(a_{i0},\beta_0)]$. Without loss of generality, we may assume that $a_{i0}=0$ and $\beta_0=0$. Let $\|\cdot\|_1$ denote the $\ell_1$ norm. For $d>0$, define $N_d := \{(a,\beta): |a|+\|\beta\|_1\le d\}$ and denote by $N_d^c$ its complement in $\mathbb R^{p+1}$. We first show that for any $d>0$,
\[
\inf_{(a,\beta)\in N_d^c}\min_{i\ge 1}\Delta_i(a,\beta) > 0. \tag{A.1}
\]
To see this, use the identity of Knight (1998) to obtain
\[
\Delta_i(a,\beta) = E\Big[\int_0^{a+x_{i1}'\beta}\{F_i(s|x_{i1})-\tau\}\,ds\Big],
\]
from which we can see, by differentiation, that each map $(a,\beta)\mapsto\Delta_i(a,\beta)$ is convex. By condition (A5), we can expand $\Delta_i(a,\beta)$ uniformly over $(a,\beta)$ such that $|a|+\|\beta\|_1\le d$ and $i\ge 1$ as
\[
\Delta_i(a,\beta) = \frac{1}{2}(a,\beta')E[f_i(0|x_{i1})(1,x_{i1}')'(1,x_{i1}')](a,\beta')' + o(d^2), \quad d\to 0.
\]
By condition (A4), there exists a positive constant $c$ independent of $i$ such that
\[
(a,\beta')E[f_i(0|x_{i1})(1,x_{i1}')'(1,x_{i1}')](a,\beta')' \ge c(|a|+\|\beta\|_1)^2, \quad \forall i\ge 1.
\]
Thus, there exists a positive constant $d_0$ such that for any $0<d\le d_0$,
\[
\inf_{|a|+\|\beta\|_1=d}\min_{i\ge 1}\Delta_i(a,\beta) > 0. \tag{A.2}
\]
Now, pick any $0<d\le d_0$ and any $(a,\beta)\in N_d^c$. Take $r = d/(|a|+\|\beta\|_1)$, $\tilde a = ra$ and $\tilde\beta = r\beta$, so that $|\tilde a|+\|\tilde\beta\|_1 = d$. Because of the convexity of the map $(a,\beta)\mapsto\Delta_i(a,\beta)$, we have $\Delta_i(a,\beta) \ge r^{-1}\Delta_i(\tilde a,\tilde\beta)$. Thus, by (A.2), (A.1) holds for $0<d\le d_0$. It is obvious that (A.1) holds for any $d>0$ since the left side of (A.1) is nondecreasing in $d>0$.

Next, we shall show that
\[
\max_{1\le i\le n}\sup_{(a,\beta)\in\mathcal A\times\mathcal B}|M_{ni}(a,\beta) - M_{ni}(0,0) - \Delta_i(a,\beta)| \stackrel{p}{\to} 0. \tag{A.3}
\]
Observe that
\[
M_{ni}(a,\beta) - M_{ni}(0,0) - \Delta_i(a,\beta) = \{M_{ni}(a,\beta) - \tilde M_{ni}(a,\beta)\} - \{M_{ni}(0,0) - \tilde M_{ni}(0,0)\} + \{\tilde M_{ni}(a,\beta) - \tilde M_{ni}(0,0) - \Delta_i(a,\beta)\}.
\]
By the proof of Horowitz (1998, Lemma 1), the first and second terms are $\bar O(h_n)$ uniformly over $(a,\beta)\in\mathcal A\times\mathcal B$ and $1\le i\le n$. It remains to prove that the third term is $\bar o_p(1)$ uniformly over $(a,\beta)\in\mathcal A\times\mathcal B$ and $1\le i\le n$. This can be shown in a similar way to the proof of (A.2) in Kato, Galvao, and Montes-Rojas (2011) (see also their Remark A.1), with a suitable change to account for the fact that the observations are now dependent in the time dimension. Since most of the arguments are similar to the proof of their (A.2), we only state the point to be changed.
The only point that should be changed is the place where they apply the Marcinkiewicz-Zygmund inequality to evaluate both terms in their (A.4). Instead of the Marcinkiewicz-Zygmund inequality, we now apply a Bernstein type inequality for $\beta$-mixing processes (see Corollary C.1 below), with the evaluation of the variance term by Lemma C.2. Because of the exponential $\beta$-mixing property (condition (A1)) and the uniform boundedness of $x_{it}$ (condition (A2)), taking $s = 2\log n$ and $q = [\sqrt T]$ in Corollary C.1 and using the fact that $(\log n)/\sqrt T \to 0$, we have, for each $\epsilon>0$,
\[
\max_{1\le i\le n} P\Big\{\sup_{(a,\beta)\in\mathcal A\times\mathcal B}|\tilde M_{ni}(a,\beta)-\tilde M_{ni}(0,0)-\Delta_i(a,\beta)| > \epsilon\Big\} \le 2\{n^{-2} + \sqrt T\,\beta([\sqrt T])\},
\]
where $\beta(\cdot)$ denotes the $\beta$-mixing coefficient. Since the right side is $o(n^{-1})$, by the union bound, we have
\[
\max_{1\le i\le n}\sup_{(a,\beta)\in\mathcal A\times\mathcal B}|\tilde M_{ni}(a,\beta)-\tilde M_{ni}(0,0)-\Delta_i(a,\beta)| \stackrel{p}{\to} 0.
\]
Therefore, we obtain (A.3).

We shall now show $\hat\beta \stackrel{p}{\to} \beta_0 = 0$. Pick any $d>0$ and put $\epsilon := \inf_{(a,\beta)\in N_d^c}\min_{i\ge 1}\Delta_i(a,\beta) > 0$. By definition, $n^{-1}\sum_{i=1}^n M_{ni}(\hat a_i,\hat\beta) \le n^{-1}\sum_{i=1}^n M_{ni}(0,0)$. Observe that when $\|\hat\beta\|_1 > d$,
\[
0 \ge \frac{1}{n}\sum_{i=1}^n\{M_{ni}(\hat a_i,\hat\beta)-M_{ni}(0,0)\} \ge \frac{1}{n}\sum_{i=1}^n\min_{(a,\beta)\in N_d^c\cap(\mathcal A\times\mathcal B)}\{M_{ni}(a,\beta)-M_{ni}(0,0)\}
\]
\[
\ge -\max_{1\le i\le n}\sup_{(a,\beta)\in\mathcal A\times\mathcal B}|M_{ni}(a,\beta)-M_{ni}(0,0)-\Delta_i(a,\beta)| + \epsilon.
\]
Thus,
\[
P(\|\hat\beta\|_1 > d) \le P\Big\{\max_{1\le i\le n}\sup_{(a,\beta)\in\mathcal A\times\mathcal B}|M_{ni}(a,\beta)-M_{ni}(0,0)-\Delta_i(a,\beta)| \ge \epsilon\Big\} \to 0,
\]
by (A.3), which implies that $\hat\beta \stackrel{p}{\to} 0$. We next prove $\max_{1\le i\le n}|\hat a_i| \stackrel{p}{\to} 0$. Pick any $d>0$ and put $\epsilon$ as before. Recall that $\hat a_i = \arg\min_{a\in\mathcal A} M_{ni}(a,\hat\beta)$. When $|\hat a_i| > d$ for some $1\le i\le n$, we have
\[
0 \ge \min_{|a|>d,\ a\in\mathcal A}\{M_{ni}(a,\hat\beta)-M_{ni}(0,\hat\beta)\} = \min_{|a|>d,\ a\in\mathcal A}\{M_{ni}(a,\hat\beta)-M_{ni}(0,0)\} - \{M_{ni}(0,\hat\beta)-M_{ni}(0,0)\}
\]
\[
\ge -2\max_{1\le j\le n}\sup_{(a,\beta)\in\mathcal A\times\mathcal B}|M_{nj}(a,\beta)-M_{nj}(0,0)-\Delta_j(a,\beta)| + \epsilon - \Delta_i(0,\hat\beta).
\]
Thus, we obtain the inclusion relation
\[
\{|\hat a_i|>d,\ \exists\,1\le i\le n\} \subset \Big\{\max_{1\le i\le n}\sup_{(a,\beta)\in\mathcal A\times\mathcal B}|M_{ni}(a,\beta)-M_{ni}(0,0)-\Delta_i(a,\beta)| \ge \frac{\epsilon}{3}\Big\}\cup\Big\{\max_{1\le i\le n}\Delta_i(0,\hat\beta) \ge \frac{\epsilon}{3}\Big\} =: A_{1n}\cup A_{2n}.
\]
By (A.3), $P(A_{1n})\to 0$. On the other hand, since $\Delta_i(0,\hat\beta) \le 2M\|\hat\beta\|_1$ for some constant $M$ independent of $i$, the consistency of $\hat\beta$ implies that $P(A_{2n})\to 0$. Therefore, we complete the proof. $\square$

A.2 Proof of Theorem 3.1

Throughout the proof, we assume all the conditions of Theorem 3.1. Recall the definitions of $H_{ni}^{(1)}(a_i,\beta)$ and $H_n^{(2)}(a,\beta)$ in Section 2. Put $\bar H_{ni}^{(1)}(a_i,\beta) := E[H_{ni}^{(1)}(a_i,\beta)]$ and $\bar H_n^{(2)}(a,\beta) := E[H_n^{(2)}(a,\beta)]$. The proof of Theorem 3.1 consists of a series of lemmas. Lemmas A.1-A.3 below are used to derive a convenient expansion of $\hat\beta$.

Lemma A.1. Take $d_{1n}\to 0$ and $d_{2n}\to 0$ such that $\max_{1\le i\le n}|\hat a_i-a_{i0}| = O_p(d_{1n})$ and $\|\hat\beta-\beta_0\| = O_p(d_{2n})$. Put $d_n := (d_{1n}+d_{2n})h_n^r + h_n^{2r} + d_{1n}^3 + d_{1n}^2d_{2n} + d_{1n}d_{2n}^2 + d_{2n}^3$. Then, we have
\[
\hat a_i - a_{i0} = s_i H_{ni}^{(1)}(a_{i0},\beta_0) - \gamma_i'(\hat\beta-\beta_0) - 2^{-1}s_if_i^{(1)}(0)(\hat a_i-a_{i0})^2 + s_i[\{H_{ni}^{(1)}(\hat a_i,\hat\beta)-H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)\}] + \bar o_p(d_n), \tag{A.4}
\]
\[
\hat\beta - \beta_0 = G_n^{-1}\Big\{-\frac{1}{n}\sum_{i=1}^n H_{ni}^{(1)}(a_{i0},\beta_0)\gamma_i + H_n^{(2)}(a_0,\beta_0)\Big\} - G_n^{-1}\frac{1}{n}\sum_{i=1}^n\gamma_i[\{H_{ni}^{(1)}(\hat a_i,\hat\beta)-H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)\}]
\]
\[
\qquad + G_n^{-1}[\{H_n^{(2)}(\hat a,\hat\beta)-H_n^{(2)}(a_0,\beta_0)\} - \{\bar H_n^{(2)}(\hat a,\hat\beta)-\bar H_n^{(2)}(a_0,\beta_0)\}] + G_n^{-1}\frac{1}{2n}\sum_{i=1}^n\nu_i(\hat a_i-a_{i0})^2 + O_p(d_n). \tag{A.5}
\]

Proof of Lemma A.1. The consistency result shows that $(\hat a,\hat\beta)$ satisfies the first order condition (2.4) with probability approaching one.
The first order condition for $a_i$ implies that
\[
0 = H_{ni}^{(1)}(a_{i0},\beta_0) + \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)\} + [\{H_{ni}^{(1)}(\hat a_i,\hat\beta)-H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)\}].
\]
Expanding $\bar H_{ni}^{(1)}(\hat a_i,\hat\beta)$ around $(a_{i0},\beta_0)$ and using Lemma B.1, we obtain (A.4). On the other hand, the first order condition for $\beta$ implies that
\[
0 = H_n^{(2)}(a_0,\beta_0) + \{\bar H_n^{(2)}(\hat a,\hat\beta)-\bar H_n^{(2)}(a_0,\beta_0)\} + [\{H_n^{(2)}(\hat a,\hat\beta)-H_n^{(2)}(a_0,\beta_0)\} - \{\bar H_n^{(2)}(\hat a,\hat\beta)-\bar H_n^{(2)}(a_0,\beta_0)\}]. \tag{A.6}
\]
Expanding $\bar H_n^{(2)}(\hat a,\hat\beta)$ around $(a_0,\beta_0)$ and using Lemma B.1, we obtain
\[
\bar H_n^{(2)}(\hat a,\hat\beta)-\bar H_n^{(2)}(a_0,\beta_0) = -\frac{1}{n}\sum_{i=1}^n E[f_i(0|x_{i1})x_{i1}](\hat a_i-a_{i0}) - \Big\{\frac{1}{n}\sum_{i=1}^n E[f_i(0|x_{i1})x_{i1}x_{i1}']\Big\}(\hat\beta-\beta_0) - \frac{1}{2n}\sum_{i=1}^n E[f_i^{(1)}(0|x_{i1})x_{i1}](\hat a_i-a_{i0})^2 + O_p(d_n). \tag{A.7}
\]
Plugging (A.4) into (A.7) yields that $\bar H_n^{(2)}(\hat a,\hat\beta)-\bar H_n^{(2)}(a_0,\beta_0)$ is expanded as
\[
-\frac{1}{n}\sum_{i=1}^n H_{ni}^{(1)}(a_{i0},\beta_0)\gamma_i - G_n(\hat\beta-\beta_0) + \frac{1}{2n}\sum_{i=1}^n\nu_i(\hat a_i-a_{i0})^2 - \frac{1}{n}\sum_{i=1}^n\gamma_i I_{ni}^{(1)} + O_p(d_n), \tag{A.8}
\]
where $I_{ni}^{(1)} := \{H_{ni}^{(1)}(\hat a_i,\hat\beta)-H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)\}$. Combining (A.8) and (A.6) leads to (A.5). $\square$

Put
\[
I_n^{(2)} := \{H_n^{(2)}(\hat a,\hat\beta)-H_n^{(2)}(a_0,\beta_0)\} - \{\bar H_n^{(2)}(\hat a,\hat\beta)-\bar H_n^{(2)}(a_0,\beta_0)\},
\]
\[
\hat f_i := \frac{1}{T}\sum_{t=1}^T K_{h_n}(u_{it}), \quad \hat g_i := \frac{1}{T}\sum_{t=1}^T K_{h_n}(u_{it})x_{it}, \quad \tilde f_i^{(1)} := -\frac{1}{Th_n^2}\sum_{t=1}^T K^{(1)}(u_{it}/h_n), \quad \tilde g_i^{(1)} := -\frac{1}{Th_n^2}\sum_{t=1}^T K^{(1)}(u_{it}/h_n)x_{it}.
\]

Lemma A.2. We have
\[
I_{ni}^{(1)} = -\{(\hat f_i-E[\hat f_i]) - h_n(\tilde f_i^{(1)}-E[\tilde f_i^{(1)}])\}(\hat a_i-a_{i0}) + \bar o_p[\{(\log n)/(Th_n)\}^{1/2}d_{2n} + \{(\log n)/(Th_n^3)\}^{1/2}d_{1n}^2], \tag{A.9}
\]
\[
I_n^{(2)} = -\frac{1}{n}\sum_{i=1}^n\{(\hat g_i-E[\hat g_i]) - h_n(\tilde g_i^{(1)}-E[\tilde g_i^{(1)}])\}(\hat a_i-a_{i0}) + o_p[\{(\log n)/(Th_n)\}^{1/2}d_{2n} + \{(\log n)/(Th_n^3)\}^{1/2}d_{1n}^2], \tag{A.10}
\]
where $d_{1n}$ and $d_{2n}$ are given in Lemma A.1.

Proof of Lemma A.2. We only prove (A.9), since the proof of (A.10) is analogous. Observe that $I_{ni}^{(1)}$ is decomposed as $J_{i1}+J_{i2}$, where
\[
J_{i1} := \{H_{ni}^{(1)}(\hat a_i,\hat\beta)-H_{ni}^{(1)}(\hat a_i,\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta)-\bar H_{ni}^{(1)}(\hat a_i,\beta_0)\}, \quad J_{i2} := \{H_{ni}^{(1)}(\hat a_i,\beta_0)-H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\beta_0)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)\}.
\]
By Taylor's theorem, $J_{i1} = \{\partial_\beta H_{ni}^{(1)}(\hat a_i,\tilde\beta_i)-\partial_\beta\bar H_{ni}^{(1)}(\hat a_i,\tilde\beta_i)\}'(\hat\beta-\beta_0)$, where for each $i$, $\tilde\beta_i$ is on the line segment between $\hat\beta$ and $\beta_0$. Now, by Lemma B.2, $\partial_\beta H_{ni}^{(1)}(\hat a_i,\tilde\beta_i)-\partial_\beta\bar H_{ni}^{(1)}(\hat a_i,\tilde\beta_i) = \bar o_p[\{(\log n)/(Th_n)\}^{1/2}]$, implying that $J_{i1} = \bar o_p[\{(\log n)/(Th_n)\}^{1/2}d_{2n}]$. Similarly, use Taylor's theorem to obtain
\[
J_{i2} = -\{(\hat f_i-E[\hat f_i]) - h_n(\tilde f_i^{(1)}-E[\tilde f_i^{(1)}])\}(\hat a_i-a_{i0}) + \frac{1}{2}\{\partial_a^2 H_{ni}^{(1)}(\tilde a_i,\beta_0)-\partial_a^2\bar H_{ni}^{(1)}(\tilde a_i,\beta_0)\}(\hat a_i-a_{i0})^2, \tag{A.11}
\]
where for each $i$, $\tilde a_i$ is on the line segment between $\hat a_i$ and $a_{i0}$. By Lemma B.2, we have $\partial_a^2 H_{ni}^{(1)}(\tilde a_i,\beta_0)-\partial_a^2\bar H_{ni}^{(1)}(\tilde a_i,\beta_0) = \bar o_p[\{(\log n)/(Th_n^3)\}^{1/2}]$, implying that the second term in (A.11) is $\bar o_p[\{(\log n)/(Th_n^3)\}^{1/2}d_{1n}^2]$. Therefore, we complete the proof. $\square$

Lemma A.3. $\max_{1\le i\le n}|\hat a_i-a_{i0}| = O_p\{(T/\log n)^{-1/2}\}$.

Proof of Lemma A.3. By Lemmas A.2 and B.2, both $n^{-1}\sum_{i=1}^n\gamma_i I_{ni}^{(1)}$ and $I_n^{(2)}$ are of order $o_p\{\max_{1\le i\le n}|\hat a_i-a_{i0}| + \|\hat\beta-\beta_0\|\}$, so that by (A.5),
\[
\hat\beta-\beta_0 = G_n^{-1}\Big\{-\frac{1}{n}\sum_{i=1}^n H_{ni}^{(1)}(a_{i0},\beta_0)\gamma_i + H_n^{(2)}(a_0,\beta_0)\Big\} + o_p\Big\{\max_{1\le i\le n}|\hat a_i-a_{i0}| + \|\hat\beta-\beta_0\|\Big\}.
\]
By the exponential $\beta$-mixing property (condition (A1)), the uniform boundedness of $x_{it}$, and the discussion following Corollary C.2 below, it is shown that the variance of the term $-n^{-1}\sum_{i=1}^n H_{ni}^{(1)}(a_{i0},\beta_0)\gamma_i + H_n^{(2)}(a_0,\beta_0)$ is $O\{(nT)^{-1}\}$. Thus, combining the fact that the expectation of that term is $O(h_n^r)$ by Lemma B.1, we have
\[
\|\hat\beta-\beta_0\| = O_p\{(nT)^{-1/2} + h_n^r\} + o_p\Big\{\max_{1\le i\le n}|\hat a_i-a_{i0}|\Big\} = o_p\Big\{T^{-1/2} + \max_{1\le i\le n}|\hat a_i-a_{i0}|\Big\}.
\]
Using this bound, together with Lemma A.2, we obtain by (A.4)
\[
\hat a_i - a_{i0} = s_i H_{ni}^{(1)}(a_{i0},\beta_0) + \bar o_p\Big\{T^{-1/2} + \max_{1\le i\le n}|\hat a_i-a_{i0}|\Big\}.
\]
We now bound the term $\max_{1\le i\le n}|H_{ni}^{(1)}(a_{i0},\beta_0)|$. Observe that
\[
H_{ni}^{(1)}(a_{i0},\beta_0) = \{H_{ni}^{(1)}(a_{i0},\beta_0)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)\} + \bar H_{ni}^{(1)}(a_{i0},\beta_0) = \{H_{ni}^{(1)}(a_{i0},\beta_0)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)\} + \bar O(h_n^r).
\]
We wish to show that
\[
\max_{1\le i\le n}|H_{ni}^{(1)}(a_{i0},\beta_0)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)| = O_p\{(T/\log n)^{-1/2}\}. \tag{A.12}
\]
Because each term in the sum defining $H_{ni}^{(1)}(a_{i0},\beta_0)$ is uniformly bounded by some constant independent of $i$ and $n$, applying Corollary C.1 to that sum with $s = 2\log n$ and $q = [(T/\log n)^{1/2}]$, and using Lemma C.2 to evaluate the variance term, we have
\[
P\{|H_{ni}^{(1)}(a_{i0},\beta_0)-\bar H_{ni}^{(1)}(a_{i0},\beta_0)| \ge \text{const.}\times(T/\log n)^{-1/2}\} \le 2\{n^{-2} + (T\log n)^{1/2}\beta([(T/\log n)^{1/2}])\},
\]
where the constant is independent of $i$ and $n$. Since the right side is $\bar o(n^{-1})$, we obtain (A.12) by the union bound. Therefore, we obtain the desired result. $\square$

By Lemmas A.1-A.3,
\[
\hat a_i - a_{i0} = s_i H_{ni}^{(1)}(a_{i0},\beta_0) - \gamma_i'(\hat\beta-\beta_0) - 2^{-1}s_if_i^{(1)}(0)(\hat a_i-a_{i0})^2 - s_i\{(\hat f_i-E[\hat f_i]) - h_n(\tilde f_i^{(1)}-E[\tilde f_i^{(1)}])\}(\hat a_i-a_{i0}) + \bar o_p(T^{-1}+\|\hat\beta-\beta_0\|)
\]
\[
= s_i H_{ni}^{(1)}(a_{i0},\beta_0) + \bar O_p\{\|\hat\beta-\beta_0\| + (\log n)/(Th_n^{1/2})\}, \tag{A.13}
\]
where we have used Lemma B.2 to obtain the last equality. Put
\[
B_1 := \frac{1}{n}\sum_{i=1}^n s_i\{(\hat g_i-E[\hat g_i]) - h_n(\tilde g_i^{(1)}-E[\tilde g_i^{(1)}])\}H_{ni}^{(1)}(a_{i0},\beta_0), \quad B_2 := \frac{1}{n}\sum_{i=1}^n s_i\gamma_i\{(\hat f_i-E[\hat f_i]) - h_n(\tilde f_i^{(1)}-E[\tilde f_i^{(1)}])\}H_{ni}^{(1)}(a_{i0},\beta_0),
\]
\[
B_3 := \frac{1}{2n}\sum_{i=1}^n s_i^2\nu_i\{H_{ni}^{(1)}(a_{i0},\beta_0)\}^2.
\]
Plugging (A.13) into (A.9) and (A.10), and using Lemma B.2, we obtain
\[
\frac{1}{n}\sum_{i=1}^n\gamma_i I_{ni}^{(1)} = -B_2 + o_p(T^{-1}+\|\hat\beta-\beta_0\|), \qquad I_n^{(2)} = -B_1 + o_p(T^{-1}+\|\hat\beta-\beta_0\|). \tag{A.14}
\]
On the other hand, since $\max_{1\le i\le n}|H_{ni}^{(1)}(a_{i0},\beta_0)| = O_p\{h_n^r + (T/\log n)^{-1/2}\}$,
\[
\frac{1}{2n}\sum_{i=1}^n\nu_i(\hat a_i-a_{i0})^2 = B_3 + o_p(T^{-1}+\|\hat\beta-\beta_0\|). \tag{A.15}
\]
Plugging (A.14)-(A.15) into (A.5), we obtain

0 = (nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T {τ - G_{h_n}(u_it)}(x_it - γ_i) - G_n (nT)^{1/2}(β̂ - β_0) + (nT)^{1/2}(B_1 - B_2 + B_3) + o_p(1). (A.16)

The first term on the right side of (A.16) is expected to follow a central limit theorem. We will establish this fact in Lemma A.4. On the other hand, B_1, B_2 and B_3 are expected to contribute to the bias. We will establish the limiting behavior of these terms in Lemmas A.5 and A.6.

Lemma A.4. Recall the definition of V given in condition (A8). We have

(nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T {τ - G_{h_n}(u_it)}(x_it - γ_i) →_d N(0, V).

Proof of Lemma A.4. For any fixed c ∈ R^p, put z_it := {τ - G_{h_n}(u_it)} c'(x_it - γ_i). By the Cramér-Wold device, it suffices to show that (nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T z_it →_d N(0, c'Vc). Observe that by Lemma B.1,

(nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T z_it = (nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T (z_it - E[z_i1]) + O{(nT)^{1/2} h_n^r},

and the second term is o(1) because of the fact that h_n^r = o{(nT)^{-1/2}}. Put z̄_it := z_it - E[z_i1]; note that E[z_i1] depends on n. We wish to show a central limit theorem for the term (nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T z̄_it.

We first evaluate the variance of this term. By a standard calculation, E[z̄_i1^2] = E[{τ - G_{h_n}(u_i1)}^2 {c'(x_i1 - γ_i)}^2] + Ō(h_n^r), and a direct calculation using the fact that K(·) is an r-th order kernel shows that, uniformly over both x and i,

E[{τ - G_{h_n}(u_i1)}^2 | x_i1 = x] = τ^2 - 2τ E[G_{h_n}(u_i1) | x_i1 = x] + E[G_{h_n}(u_i1)^2 | x_i1 = x] = τ(1 - τ) + O(h_n). (A.17)

On the other hand, uniformly over (i, j) (j ≥ 1),

Cov(z_i1, z_i,1+j) = E[{τ - G_{h_n}(u_i1)}{τ - G_{h_n}(u_i,1+j)} c'(x_i1 - γ_i) c'(x_i,1+j - γ_i)] + O(h_n^r),

and uniformly over both (x_1, x_{1+j}) and (i, j) (j ≥ 1),

E[{τ - G_{h_n}(u_i1)}{τ - G_{h_n}(u_i,1+j)} | x_i1 = x_1, x_i,1+j = x_{1+j}]
= -τ^2 + E[G_{h_n}(u_i1) G_{h_n}(u_i,1+j) | x_i1 = x_1, x_i,1+j = x_{1+j}]
= -τ^2 + F_{i,j}(0, 0 | x_1, x_{1+j}) + O(h_n^r)
= E[{τ - I(u_i1 ≤ 0)}{τ - I(u_i,1+j ≤ 0)} | x_i1 = x_1, x_i,1+j = x_{1+j}] + O(h_n^r), (A.18)

where the second equality uses the fact that K(·) is an r-th order kernel.

Put z̃_it := {τ - I(u_it ≤ 0)} c'(x_it - γ_i). Then, by (A.17) and (A.18), Var(z_i1) = Var(z̃_i1) + O(h_n) and Cov(z_i1, z_i,1+j) = Cov(z̃_i1, z̃_i,1+j) + O(h_n^r) uniformly over (i, j) (j ≥ 1). Thus,

Var(T^{-1/2} Σ_{t=1}^T z_it) = Var(T^{-1/2} Σ_{t=1}^T z̃_it) + O(h_n + T h_n^r),

and h_n + T h_n^r → 0. Therefore, we have

Var{(nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T z_it} = n^{-1} Σ_{i=1}^n Var(T^{-1/2} Σ_{t=1}^T z_it) = n^{-1} Σ_{i=1}^n Var(T^{-1/2} Σ_{t=1}^T z̃_it) + o(1) = c'(n^{-1} Σ_{i=1}^n V_ni) c + o(1) → c'Vc.

Put z̄_{i+} := T^{-1/2} Σ_{t=1}^T z̄_it. Observe that z̄_{1+}, ..., z̄_{n+} are independent. Viewing (nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T z̄_it = n^{-1/2} Σ_{i=1}^n z̄_{i+}, we check the Lyapunov condition for the right sum. By the previous result, it suffices to show that Σ_{i=1}^n E[|z̄_{i+}|^3] = o(n^{3/2}). By conditions (A2) and (A6), z̄_it is uniformly bounded. By the exponential β-mixing property (condition (A1)) and Theorem 3 of Yoshihara (1978), it is shown that E[|z̄_{i+}|^3] = Ō(1), which implies that Σ_{i=1}^n E[|z̄_{i+}|^3] = O(n) = o(n^{3/2}).^17 Therefore, by the Lyapunov central limit theorem, we have n^{-1/2} Σ_{i=1}^n z̄_{i+} →_d N(0, c'Vc).

Lemma A.5. Recall the definition of ω^(3)_ni given in condition (A9). B_3 is expanded as

B_3 = (1/T) · (1/(2n)) Σ_{i=1}^n s_i^2 ω^(3)_ni ν_i + o_p(T^{-1}).

Proof of Lemma A.5. Put z_it := τ - G_{h_n}(u_it) + h_n K̃_{h_n}(u_it) and z̄_it := z_it - E[z_i1]. Then, by Lemma B.1 and the proof of the previous lemma, we have

E[{H^(1)_ni(α_i0, β_0)}^2] = (1/T) Σ_{|j|≤T-1} (1 - |j|/T) Cov(z_i1, z_i,1+j) = (1/T) Σ_{|j|≤T-1} (1 - |j|/T) Cov{I(u_i1 ≤ 0), I(u_i,1+j ≤ 0)} + ō(T^{-1}) = ω^(3)_ni/T + ō(T^{-1}).

Thus, E[B_3] = {n^{-1} Σ_{i=1}^n s_i^2 ω^(3)_ni ν_i}/(2T) + o(T^{-1}).

We next evaluate the variance of any linear combination of B_3. For any fixed c ∈ R^p, put a_i := s_i^2 c'ν_i. Then,

Var{n^{-1} Σ_{i=1}^n a_i (T^{-1} Σ_{t=1}^T z_it)^2} ≤ n^{-2} Σ_{i=1}^n a_i^2 E[(T^{-1} Σ_{t=1}^T z_it)^4] ≤ 8 n^{-2} Σ_{i=1}^n a_i^2 E[(T^{-1} Σ_{t=1}^T z̄_it)^4] + O(n^{-1} h_n^{4r}). (A.19)

Since z̄_it is uniformly bounded, by the exponential β-mixing property (condition (A1)) and Theorem 3 of Yoshihara (1978), it is shown that the first term on the right side of (A.19) is O(n^{-1} T^{-2}). Therefore, we obtain the desired result.

^17 Theorem 3 of Yoshihara (1978) is stated in terms of α-mixing processes. However, since each α-mixing coefficient is bounded by the corresponding β-mixing coefficient, Theorem 3 of Yoshihara (1978) holds with α-mixing coefficients replaced by β-mixing coefficients.
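The order-T^{-1} expectation driving Lemma A.5 can be checked numerically in a stripped-down setting. The sketch below is our own construction, not the paper's: it uses i.i.d. scores (rather than the β-mixing arrays of the lemma), so the long-run variance ω^(3)_ni collapses to τ(1 - τ), and E[(T^{-1} Σ_t z_t)^2] should equal τ(1 - τ)/T exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
tau, T, R = 0.5, 50, 20000

# z_t = tau - I(u_t <= 0) with u_t ~ U(-tau, 1-tau), so P(u_t <= 0) = tau:
# the score has mean zero and variance tau*(1-tau).
u = rng.uniform(-tau, 1.0 - tau, size=(R, T))
z = tau - (u <= 0)

# For i.i.d. scores the analogue of E[{H^(1)_ni}^2] = omega^(3)_ni/T + o(1/T)
# is exactly E[(T^{-1} sum_t z_t)^2] = tau*(1-tau)/T.
second_moment = float(np.mean(z.mean(axis=1) ** 2))
target = tau * (1.0 - tau) / T
print(second_moment, target)
```

With serial dependence, the constant τ(1 - τ) is replaced by the Bartlett-weighted sum of indicator autocovariances, which is exactly what ω^(3)_ni collects.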
Lemma A.6. Recall the definitions of ω^(1)_ni and ω^(2)_ni given in condition (A9). B_1 and B_2 are expanded as

B_1 = (1/T) · {(1/(2n)) Σ_{i=1}^n (2τ - 1)γ_i + (1/n) Σ_{i=1}^n s_i ω^(1)_ni γ_i} + o_p(T^{-1}),
B_2 = (1/T) · {(1/(2n)) Σ_{i=1}^n (2τ - 1)γ_i + (1/n) Σ_{i=1}^n s_i ω^(2)_ni} + o_p(T^{-1}).

Proof of Lemma A.6. We only prove the expansion for B_1; the expansion for B_2 can be shown in a similar way. Put z_it := τ - G_{h_n}(u_it) + h_n K̃_{h_n}(u_it) and z̄_it := z_it - E[z_i1]. Recall that E[z_i1] = Ō(h_n^r) by Lemma B.1. Observe that

E[(f̂_i - E[f̂_i])(T^{-1} Σ_{t=1}^T z_it)] = T^{-2} Σ_{s=1}^T Σ_{t=1}^T E[K_{h_n}(u_is) z_it] + Ō(h_n^r).

We evaluate the term E[K_{h_n}(u_is) z_it]. First, for s = t,

E[K_{h_n}(u_i1) z_i1] = ∫ K(u){τ - G(u)} f_i(u h_n) du + ∫ u K(u)^2 f_i(u h_n) du
= f_i(0) {τ ∫ K(u) du - ∫ K(u) G(u) du + ∫ u K(u)^2 du} + Ō(h_n).

Since K(·) is symmetric about the origin, the third term inside the brace is zero, and ∫ K(u) G(u) du = [G(u)^2/2]_{-∞}^{∞} = 1/2. Thus, we have E[K_{h_n}(u_i1) z_i1] = (τ - 1/2) f_i(0) + Ō(h_n). Second, a similar calculation shows that for |j| ≥ 1,

E[K_{h_n}(u_i1) z_i,1+j] = τ f_i(0) - ∫_{-∞}^0 f_{i,j}(0, u) du + O(h_n^r),

where the O(h_n^r) term is independent of i and j. Therefore, we have

E[(f̂_i - E[f̂_i])(T^{-1} Σ_{t=1}^T z_it)] = T^{-1}(τ - 1/2) f_i(0) + T^{-1} ω^(1)_ni + ō(T^{-1}). (A.20)

Similarly, observe that

E[(f̃^(1)_i - E[f̃^(1)_i])(T^{-1} Σ_{t=1}^T z_it)] = -T^{-2} h_n^{-2} Σ_{s=1}^T Σ_{t=1}^T E[K̃^(1)(u_is/h_n) z_it] + Ō(h_n^r).

We evaluate the term E[K̃^(1)(u_is/h_n) z_it]. First, for s = t,

h_n^{-2} E[K̃^(1)(u_i1/h_n) z_i1] = h_n^{-1} ∫ K̃^(1)(u){τ - G(u)} f_i(u h_n) du + h_n^{-1} ∫ K̃^(1)(u) K̃(u) f_i(u h_n) du = Ō(1),

where the second equality uses integration by parts; in particular, ∫ K̃^(1)(u) K̃(u) f_i(u h_n) du = -(h_n/2) ∫ K̃(u)^2 f_i^(1)(u h_n) du = Ō(h_n). Second, for |j| ≥ 1, we have h_n^{-2} E[K̃^(1)(u_i1/h_n) z_i,1+j] = O(h_n^{r-1}), where the O(h_n^{r-1}) term is independent of i and j. Therefore, we have

h_n E[(f̃^(1)_i - E[f̃^(1)_i])(T^{-1} Σ_{t=1}^T z_it)] = Ō(T^{-1} h_n + h_n^r) = ō(T^{-1}). (A.21)

Combining (A.20) and (A.21), we obtain

E[B_1] = (1/T) · {(1/(2n)) Σ_{i=1}^n (2τ - 1)γ_i + (1/n) Σ_{i=1}^n s_i ω^(1)_ni γ_i} + o(T^{-1}). (A.22)

It remains to evaluate the variance of any linear combination of B_1. For any fixed c ∈ R^p, put a_i := s_i c'γ_i. Then,

Var{n^{-1} Σ_{i=1}^n a_i (f̂_i - E[f̂_i])(T^{-1} Σ_{t=1}^T z_it)} = n^{-2} Σ_{i=1}^n a_i^2 Var{(f̂_i - E[f̂_i])(T^{-1} Σ_{t=1}^T z_it)}
≤ n^{-2} Σ_{i=1}^n a_i^2 E[(f̂_i - E[f̂_i])^2 (T^{-1} Σ_{t=1}^T z_it)^2]
≤ 2 n^{-2} Σ_{i=1}^n a_i^2 E[(f̂_i - E[f̂_i])^2 {T^{-1} Σ_{t=1}^T (z_it - E[z_it])}^2] + 2 n^{-2} Σ_{i=1}^n a_i^2 (E[z_i1])^2 E[(f̂_i - E[f̂_i])^2]. (A.23)

By the proof of Lemma B.2 below, E[(f̂_i - E[f̂_i])^2] = Ō{(T h_n)^{-1}}. Thus, by Lemma B.1, the second term on the right side of (A.23) is O(n^{-1} T^{-1} h_n^{2r-1}). Put w_it := K(u_it/h_n), w̄_it := w_it - E[w_i1] and z̄_it := z_it - E[z_i1]. Observe that

E[(f̂_i - E[f̂_i])^2 {T^{-1} Σ_{t=1}^T (z_it - E[z_it])}^2] = T^{-4} h_n^{-2} Σ_{s=1}^T Σ_{t=1}^T Σ_{u=1}^T Σ_{v=1}^T E[w̄_is w̄_it z̄_iu z̄_iv].
We wish to show that

Σ_{s≠t} E[w̄_is^2 z̄_it^2] = Ō(T^2 h_n), Σ_{s≠t} E[w̄_is z̄_is w̄_it z̄_it] = Ō(T^2 h_n),
Σ_{s≠t} E[w̄_is^2 z̄_is z̄_it] = Ō(T^2 h_n), Σ_{s≠t} E[w̄_is w̄_it z̄_it^2] = Ō(T^2 h_n),
Σ_{s,t,u:distinct} E[w̄_is^2 z̄_it z̄_iu] = Ō(T^2 h_n), Σ_{s,t,u:distinct} E[w̄_is w̄_it z̄_it z̄_iu] = Ō(T^2 h_n),
Σ_{s,t,u:distinct} E[w̄_is w̄_it z̄_iu^2] = Ō(T^2 h_n), Σ_{s,t,u,v:distinct} E[w̄_is w̄_it z̄_iu z̄_iv] = Ō(T^2 h_n). (A.24)

The first four equations are relatively standard to derive. Thus, we focus on deriving the last four equations. The evaluation below is similar to that in Yoshihara (1978, Theorem 1), but the situation is slightly different; for the sake of completeness, we provide a proof of the last four equations. In what follows, C denotes some constant independent of s, t, u, v and i, whose value may change from line to line. It is standard to show that E[w̄_i1^2 z̄_i1^2] = Ō(h_n).

Evaluation of Σ_{s,t,u:distinct} E[w̄_is^2 z̄_it z̄_iu]: We first show that for any δ > 0,

Σ_{s,t,u:distinct} |E[w̄_is^2 z̄_it z̄_iu]| = Ō{(T h_n^{1/(1+δ)}) ∨ (T^2 h_n)}.

Consider the case that s < t < u and u - t > t - s. In that case, E[|w̄_is^2 z̄_it|^{1+δ}] ≤ C h_n, so that by Lemma 1 of Yoshihara (1976), |E[w̄_is^2 z̄_it z̄_iu]| ≤ C h_n^{1/(1+δ)} β_i^{δ/(1+δ)}(u - t). Thus,

Σ_{s<t<u, u-t>t-s} |E[w̄_is^2 z̄_it z̄_iu]| ≤ C h_n^{1/(1+δ)} Σ_{s=1}^{T-2} Σ_{j=1}^{T-s-1} Σ_{k=1}^{j} β_i(j)^{δ/(1+δ)} ≤ C T h_n^{1/(1+δ)} (j = u - t; k = t - s).

Similarly, we have Σ_{s<t<u, t-s>u-t} |E[w̄_is^2 z̄_it z̄_iu]| ≤ C T h_n^{1/(1+δ)}. It remains to consider the case t - s = u - t. In that case, by Lemma 1 of Yoshihara (1976),

|E[w̄_is^2 z̄_it z̄_iu]| ≤ |E[(w̄_is^2 - E[w̄_i1^2]) z̄_it z̄_iu]| + E[w̄_i1^2] |E[z̄_it z̄_iu]| ≤ C h_n^{1/(1+δ)} β_i^{δ/(1+δ)}(t - s) + C h_n β_i^{δ/(1+δ)}(u - t),

so that

Σ_{s<t<u, t-s=u-t} |E[w̄_is^2 z̄_it z̄_iu]| ≤ C T^2 h_n + C T h_n^{1/(1+δ)}.

Therefore, we obtain the first claim. Since δ > 0 is arbitrary, taking δ sufficiently small, we have T h_n^{1/(1+δ)} = O(T^2 h_n), so that Σ_{s,t,u:distinct} E[w̄_is^2 z̄_it z̄_iu] = Ō(T^2 h_n).

Evaluation of Σ_{s,t,u:distinct} E[w̄_is w̄_it z̄_it z̄_iu] and Σ_{s,t,u:distinct} E[w̄_is w̄_it z̄_iu^2]: By using the same argument as in the previous case, we have Σ_{s,t,u:distinct} E[w̄_is w̄_it z̄_it z̄_iu] = Ō(T^2 h_n) and Σ_{s,t,u:distinct} E[w̄_is w̄_it z̄_iu^2] = Ō(T^2 h_n).

Evaluation of Σ_{s,t,u,v:distinct} E[w̄_is w̄_it z̄_iu z̄_iv]: Consider the case that s < t < u < v and t - s = max{u - t, v - u}. In that case, for any δ > 0, E[|w̄_is|^{1+δ}] ≤ C h_n and E[|w̄_it z̄_iu z̄_iv|^{1+δ}] ≤ C h_n. Thus, by Lemma 1 of Yoshihara (1976),

|E[w̄_is w̄_it z̄_iu z̄_iv]| ≤ C h_n^{2/(1+δ)} β_i^{δ/(1+δ)}(t - s),

so that

Σ_{s<t<u<v, t-s=max{u-t,v-u}} |E[w̄_is w̄_it z̄_iu z̄_iv]| ≤ C h_n^{2/(1+δ)} Σ_{s=1}^{T-3} Σ_{j=1}^{T-s-2} Σ_{k=1}^{j} Σ_{l=1}^{j} β_i(j)^{δ/(1+δ)} ≤ C h_n^{2/(1+δ)} T Σ_{j=1}^{∞} j^2 β_i(j)^{δ/(1+δ)} = C T h_n^{2/(1+δ)}.

Similarly, we have Σ_{s<t<u<v, v-u=max{t-s,u-t}} |E[w̄_is w̄_it z̄_iu z̄_iv]| ≤ C T h_n^{2/(1+δ)}. Consider now the case that s < t < u < v and u - t = max{t - s, v - u}. In that case, by Lemma 1 of Yoshihara (1976), for any δ > 0,

|E[w̄_is w̄_it z̄_iu z̄_iv]| ≤ |Cov(w̄_is w̄_it, z̄_iu z̄_iv)| + |E[w̄_is w̄_it] E[z̄_iu z̄_iv]| ≤ C h_n^{2/(1+δ)} β_i^{δ/(1+δ)}(u - t) + C h_n^{2/(1+δ)} β_i^{δ/(1+δ)}(t - s) β_i^{δ/(1+δ)}(v - u).

Thus,

Σ_{s<t<u<v, u-t=max{t-s,v-u}} |E[w̄_is w̄_it z̄_iu z̄_iv]| ≤ C T h_n^{2/(1+δ)} + C T^2 h_n^{2/(1+δ)}.

Taking δ ∈ (0, 1], we have Σ_{s<t<u<v} E[w̄_is w̄_it z̄_iu z̄_iv] = Ō(T^2 h_n). By repeating the same argument over the remaining orderings, we have Σ_{s,t,u,v:distinct} E[w̄_is w̄_it z̄_iu z̄_iv] = Ō(T^2 h_n).

We have shown that the first term on the right side of (A.23) is O(n^{-1} T^{-4} h_n^{-2} × T^2 h_n) = O(n^{-1} T^{-2} h_n^{-1}). Therefore,

Var(c'B_1) = O(n^{-1} T^{-2} h_n^{-1}) = o{(nT)^{-1}},

so that (nT)^{1/2} c'(B_1 - E[B_1]) →_p 0. Combining (A.22), we obtain the desired result.
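The kernel-smoothing expansions in Lemmas A.4-A.6 all rest on approximations like (A.17): the smoothed score τ - G_{h_n}(u) behaves like the indicator score up to an O(h_n) term. A small deterministic sketch checks this; the logistic smoother and standard normal errors are our own choices for illustration (the paper only requires an r-th order kernel).

```python
import numpy as np

tau = 0.5                      # with u ~ N(0,1), the tau-quantile of u is 0
grid = np.linspace(-8.0, 8.0, 200001)
du = grid[1] - grid[0]
phi = np.exp(-grid**2 / 2.0) / np.sqrt(2.0 * np.pi)   # N(0,1) density

def score_var(h):
    # Logistic smoothed indicator: G_h(u) = 1/(1+exp(u/h)) -> I(u <= 0) as h -> 0,
    # written via tanh to avoid overflow for tiny h.
    G = 0.5 * (1.0 - np.tanh(grid / (2.0 * h)))
    m1 = np.sum(G * phi) * du        # E[G_h(u)] -> tau
    m2 = np.sum(G**2 * phi) * du     # E[G_h(u)^2] -> tau
    return m2 - m1**2                # Var(tau - G_h(u)) = Var(G_h(u))

v_coarse, v_fine = score_var(0.5), score_var(0.01)
print(v_coarse, v_fine, tau * (1.0 - tau))
```

As h shrinks, the variance approaches τ(1 - τ) = 0.25 at rate O(h), in line with (A.17).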
By (A.16) and Lemmas A.4-A.6, we have

(nT)^{1/2}(β̂ - β_0) = G_n^{-1} (nT)^{-1/2} Σ_{i=1}^n Σ_{t=1}^T {τ - G_{h_n}(u_it)}(x_it - γ_i) + (nT)^{1/2} G_n^{-1}(B_1 - B_2 + B_3) + o_p(1),

and

(nT)^{1/2} G_n^{-1}(B_1 - B_2 + B_3) = (n/T)^{1/2} · G_n^{-1} {n^{-1} Σ_{i=1}^n s_i (ω^(1)_ni γ_i - ω^(2)_ni + s_i ω^(3)_ni ν_i/2)} + o_p(1) → ρ^{1/2} b,

where b is given in (3.2). Therefore, we obtain (3.1).

A.3 Proof of Theorem 3.2

It suffices to show that b̂ →_p b. To this end, we first show that

max_{1≤i≤n} |ŝ_i - s_i| →_p 0, max_{1≤i≤n} ∥ν̂_i - ν_i∥ →_p 0, Ĝ_n^{-1} = G_n^{-1} + o_p(1). (A.25)

We use the notation in Appendix B below. Without loss of generality, we may assume that α_i0 = 0 and β_0 = 0. Then, ŝ_i can be written as ŝ_i = 1/f̂_i(α̂_i, β̂), where f̂_i(a, β) stands for f̂^(0)_i(a, β). By Lemma B.2, we have f̂_i(α̂_i, β̂) = E[f̂_i(a, β)]|_{(a,β)=(α̂_i,β̂)} + ō_p[{(log n)/(T h_n)}^{1/2}]. Observe that E[f̂_i(a, β)] = {E[f̂_i(a, β)] - E[f̂_i(0, 0)]} + E[f̂_i(0, 0)],

E[f̂_i(0, 0)] = f_i(0) + Ō(h_n^r) = f_i(0) + ō[{(log n)/(T h_n)}^{1/2}],

and

|E[f̂_i(a, β)] - E[f̂_i(0, 0)]| = |E[∫ K(u){f_i(u h_n + a + x_i1'β | x_i1) - f_i(u h_n | x_i1)} du]| ≤ C_f (|a| + M ∥β∥) ∫ |K(u)| du. (A.26)

By the proof of Theorem 3.1, max_{1≤i≤n} |α̂_i| = O_p[{(log n)/T}^{1/2}] and ∥β̂∥ = o_p(T^{-1/2}), so that f̂_i(α̂_i, β̂) = f_i(0) + ō_p[{(log n)/(T h_n)}^{1/2}] = f_i(0) + ō_p(1), which, by condition (A5)(c), implies that ŝ_i = s_i + ō_p(1). Similarly, we can show that γ̂_i = γ_i + ō_p(1). To prove the second assertion of (A.25), observe that

ν̂_i = Ĝ_n^{-1} {-ĝ^(1)_i(α̂_i, β̂) + γ̂_i f̂^(1)_i(α̂_i, β̂)}
= Ĝ_n^{-1} [-E[ĝ^(1)_i(a, β)]|_{(a,β)=(α̂_i,β̂)} + {γ_i + ō_p(1)} E[f̂^(1)_i(a, β)]|_{(a,β)=(α̂_i,β̂)} + ō_p(1)]
= ν_i + ō_p(1),

where the second equality is due to Lemma B.2. The third equality can be deduced from the evaluation analogous to (A.26). Similarly, we can show that Ĝ_n = G_n + ō_p(1), so that Ĝ_n^{-1} = G_n^{-1} + o_p(1). We next show that

ω̂^(1)_ni = ω^(1)_ni + ō_p(1), ω̂^(2)_ni = ω^(2)_ni + ō_p(1), ω̂^(3)_ni = ω^(3)_ni + ō_p(1). (A.27)

Step 1: proof of ω̂^(3)_ni = ω^(3)_ni + ō_p(1). The proof is similar in spirit to that of Hahn and Kuersteiner (2011, Lemma 6), but we need a different technique because of the non-differentiability of the indicator function. Observe that

ω^(3)_ni = Σ_{|j|≤m_n} (1 - |j|/T) Cov{I(u_i1 ≤ 0), I(u_i,1+j ≤ 0)} + Σ_{m_n+1≤|j|≤T-1} (1 - |j|/T) Cov{I(u_i1 ≤ 0), I(u_i,1+j ≤ 0)},

and the latter sum is shown to be ō(1) by condition (A1) and Lemma C.2. Therefore, it suffices to show that

max_{1≤i≤n} max_{1≤|j|≤m_n} |ϱ̂_i(j) - ϱ_i(j)| = o_p(m_n^{-1}). (A.28)

By the proof of Theorem 3.1, max_{1≤i≤n} |α̂_i| = O_p[{(log n)/T}^{1/2}] and ∥β̂∥ = o_p(T^{-1/2}), so that by condition (A5),

max_{1≤i≤n} max_{1≤|j|≤m_n} |E[I(u_i1 ≤ a + x_i1'β) I(u_i,1+j ≤ a + x_i,1+j'β)]|_{(a,β)=(α̂_i,β̂)} - ϱ_i(j)| = O_p[{(log n)/T}^{1/2}].

Thus, taking into account that m_n {(log n)/T}^{1/2} = o(m_n^{-1}), it suffices to show that

max_{1≤i≤n} max_{1≤|j|≤m_n} sup_{(a,β')'∈R^{p+1}} |T^{-1} Σ_{t=max{1,-j+1}}^{min{T,T-j}} {g_{a,β}(u_it, x_it; u_i,t+j, x_i,t+j) - E[g_{a,β}(u_it, x_it; u_i,t+j, x_i,t+j)]}| = o_p(m_n^{-1}), (A.29)

where g_{a,β}(u_it, x_it; u_i,t+j, x_i,t+j) := I(u_it ≤ a + x_it'β) I(u_i,t+j ≤ a + x_i,t+j'β). We make use of Talagrand's inequality extended to β-mixing processes (see Appendix C).

Fix any 1 ≤ i ≤ n and 1 ≤ |j| ≤ m_n. In what follows, terms const. and o(·) are to be interpreted as terms independent of i and j. Put ξ_t := (u_it, x_it', u_i,t+j, x_i,t+j')'. Observe that {ξ_t, t = 0, ±1, ±2, ...} is β-mixing, and letting β̃(·) denote its β-mixing coefficients, we have β̃(k) ≤ β_i(k - |j|) for k > |j|.

Write g_{a,β}(ξ_t) = g_{a,β}(u_it, x_it; u_i,t+j, x_i,t+j) and define the class of functions G := {g_{a,β} : a ∈ R, β ∈ R^p}. The class G is uniformly bounded by 1, and by Lemma 2.6.15 and Theorem 2.6.7 of van der Vaart and Wellner (1996), there exist constants A ≥ 5e and v ≥ 1 (independent of i and j) such that N(G, L^1(Q), ϵ) ≤ (A/ϵ)^v for every 0 < ϵ < 1 and every probability measure Q on R^{2p+2}. Take q = [(T/log n)^{1/2}]. By using Lemma 1 of Andrews (1991) and the classical result of Parzen (1957) on the variance evaluation of sample autocovariances, it is shown that sup_{(a,β')'∈R^{p+1}} Var{Σ_{t=1}^q g_{a,β}(ξ_t)/√q} ≤ const. By the exponential β-mixing property (condition (A1)), r β̃(q) = o(n^{-2}) with r = [T/(2q)]. Therefore, by Propositions C.1 and C.2, with probability at least 1 - o(n^{-2}) (take s = 3 log n when applying those propositions),

sup_{(a,β')'∈R^{p+1}} |T^{-1} Σ_t {g_{a,β}(ξ_t) - E[g_{a,β}(ξ_1)]}| ≤ const. × {(log n)/T}^{1/2}.

The right side is o(m_n^{-1}), so that by the union bound, we obtain (A.29).

Step 2: proof of ω̂^(1)_ni = ω^(1)_ni + ō_p(1). To begin with, since ω^(1)_ni = Σ_{1≤|j|≤m_n} (1 - |j|/T) {τ f_i(0) - ϕ_i(j)} + ō(1) and ω^(1)_ni = Ō(1) by condition (A9), we have

|ω̂^(1)_ni - ω^(1)_ni| ≤ m_n |f̂_i(α̂_i, β̂) - f_i(0)| + m_n max_{1≤|j|≤m_n} |ϕ̂_i(j) - ϕ_i(j)| + ō(1).

Because f̂_i(α̂_i, β̂) = f_i(0) + ō_p[{(log n)/(T h_n)}^{1/2}], the first term on the right side is ō_p(1). It remains to show that max_{1≤i≤n} max_{1≤|j|≤m_n} |ϕ̂_i(j) - ϕ_i(j)| = o_p(m_n^{-1}), which can be shown in a similar way as (A.28).
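The truncated, Bartlett-weighted autocovariance sum underlying ω^(3)_ni in Step 1 can be illustrated on simulated data. The sketch below is our own illustration (a Gaussian AR(1) error process and the lag cutoff m are our choices): it estimates Σ_{|j|≤m} (1 - |j|/T) Cov{I(u_1 ≤ 0), I(u_{1+j} ≤ 0)} from one long series and compares it with the closed form available in the Gaussian case.

```python
import numpy as np

rng = np.random.default_rng(1)
T, phi, m = 200_000, 0.5, 50

# Stationary Gaussian AR(1): u_t = phi*u_{t-1} + e_t, with median zero.
e = rng.standard_normal(T)
u = np.empty(T)
u[0] = e[0] / np.sqrt(1 - phi**2)
for t in range(1, T):
    u[t] = phi * u[t - 1] + e[t]

# Bartlett-weighted, truncated long-run variance of the indicator I(u_t <= 0),
# the same form as the omega^(3)_ni sum with lag cutoff m_n.
d = (u <= 0).astype(float) - np.mean(u <= 0)
omega_hat = float(np.mean(d * d))
for j in range(1, m + 1):
    omega_hat += 2.0 * (1.0 - j / T) * float(np.mean(d[:-j] * d[j:]))

# Closed form via the Gaussian orthant probability: Cov = arcsin(phi^j)/(2*pi).
omega_true = 0.25 + (1.0 / np.pi) * float(sum(np.arcsin(phi**j) for j in range(1, 200)))
print(omega_hat, omega_true)
```

The lag cutoff plays the role of m_n: it must grow slowly enough that the estimation error of each autocovariance, of order {(log n)/T}^{1/2} here, stays below m_n^{-1}.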
In fact, the problem reduces to showing that

max_{1≤i≤n} max_{1≤|j|≤m_n} sup_{(a,β')'∈R^{p+1}} |T^{-1} Σ_{t=max{1,-j+1}}^{min{T,T-j}} {g_{a,β,h_n}(u_it, x_it; u_i,t+j, x_i,t+j) - E[g_{a,β,h_n}(u_it, x_it; u_i,t+j, x_i,t+j)]}| = o_p(m_n^{-1}), (A.30)

where g_{a,β,h}(u_it, x_it; u_i,t+j, x_i,t+j) := K((u_it - a - x_it'β)/h) I(u_i,t+j ≤ a + x_i,t+j'β). By the same argument as in the previous case, it is shown that the left side of (A.30) is o_p(m_n^{-1}) under the present assumptions.^18

Step 3: proof of ω̂^(2)_ni = ω^(2)_ni + ō_p(1). This step is completely analogous to the previous case.

We have now shown (A.25) and (A.27), so that

b̂ = Ĝ_n^{-1} {n^{-1} Σ_{i=1}^n ŝ_i (ω̂^(1)_ni γ̂_i - ω̂^(2)_ni + ŝ_i ω̂^(3)_ni ν̂_i/2)} = G_n^{-1} {n^{-1} Σ_{i=1}^n s_i (ω^(1)_ni γ_i - ω^(2)_ni + s_i ω^(3)_ni ν_i/2)} + o_p(1) →_p b.

Therefore, we obtain the desired conclusion.

^18 We conjecture that the left side of (A.30) is O_p[{(log n)/(T h_n)}^{1/2}], but we do not pursue this small improvement here.

B Miscellaneous lemmas

In this section, we summarize some miscellaneous results used in the proof of Theorem 3.1. Lemma B.2 is a modification of Lemma 3 of Horowitz (1998). Throughout the section, we assume the conditions of Theorem 3.1.

Lemma B.1. We have:
(a) H^(1)_ni(α_i0, β_0) = Ō(h_n^r), H^(2)_n(α_0, β_0) = O(h_n^r);
(b) ∂_{a_i} H^(1)_ni(α_i0, β_0) = -f_i(0) + Ō(h_n^r);
(c) ∂^2_{a_i} H^(1)_ni(α_i0, β_0) = -f^(1)_i(0) + Ō(h_n^{r-1});
(d) ∂_β H^(1)_ni(α_i0, β_0) = -E[f_i(0|x_i1) x_i1] + Ō(h_n^r);
(e) ∂_{β'} H^(2)_n(α_0, β_0) = -n^{-1} Σ_{i=1}^n E[f_i(0|x_i1) x_i1 x_i1'] + O(h_n^r);
(f) ∂^3_{a_i} H^(1)_ni(a_i, β), ∂^2_{a_i} ∂_{β_j} H^(1)_ni(a_i, β), ∂_{a_i} ∂_{β_j} ∂_{β_k} H^(1)_ni(a_i, β) and ∂_{β_j} ∂_{β_k} ∂_{β_l} H^(1)_ni(a_i, β) are O(1) uniformly over both (a_i, β) ∈ A × B and 1 ≤ i ≤ n for 1 ≤ j, k, l ≤ p, i.e., for instance, max_{1≤i≤n} sup_{(a_i,β)∈A×B} |∂^3_{a_i} H^(1)_ni(a_i, β)| = O(1) as n → ∞;
(g) ∂^3_{a_i} H^(2)_n(a, β), ∂^2_{a_i} ∂_{β_j} H^(2)_n(a, β), ∂_{a_i} ∂_{β_j} ∂_{β_k} H^(2)_n(a, β) and ∂_{β_j} ∂_{β_k} ∂_{β_l} H^(2)_n(a, β) are O(1) uniformly over both (a, β) ∈ A × B and 1 ≤ i ≤ n for 1 ≤ j, k, l ≤ p.

Proof. The proof is immediate from conditions (A2), (A5) and (A6).

Put

f̂^(j)_i(a, β) := (-1)^j (T h_n^{j+1})^{-1} Σ_{t=1}^T K^(j)((u_it - a - x_it'β)/h_n), j = 0, 1,
f̃^(j)_i(a, β) := (-1)^j (T h_n^{j+1})^{-1} Σ_{t=1}^T K̃^(j)((u_it - a - x_it'β)/h_n), j = 1, 2,
ĝ^(j)_i(a, β) := (-1)^j (T h_n^{j+1})^{-1} Σ_{t=1}^T K^(j)((u_it - a - x_it'β)/h_n) x_it, j = 0, 1,
g̃^(j)_i(a, β) := (-1)^j (T h_n^{j+1})^{-1} Σ_{t=1}^T K̃^(j)((u_it - a - x_it'β)/h_n) x_it, j = 1, 2,
Ĵ_i(a, β) := (T h_n)^{-1} Σ_{t=1}^T K((u_it - a - x_it'β)/h_n) x_it x_it',
J̃^(1)_i(a, β) := -(T h_n^2)^{-1} Σ_{t=1}^T K̃^(1)((u_it - a - x_it'β)/h_n) x_it x_it',

where K^(j)(u) = d^j K(u)/du^j and K^(0)(u) stands for K(u). The same rule applies to K̃^(j)(u).

Lemma B.2. Uniformly over both (a, β) ∈ R^{p+1} and 1 ≤ i ≤ n,
(a) f̂^(j)_i(a, β) - E[f̂^(j)_i(a, β)] = ō_p[{(log n)/(T h_n^{2j+1})}^{1/2}], j = 0, 1;
(b) f̃^(j)_i(a, β) - E[f̃^(j)_i(a, β)] = ō_p[{(log n)/(T h_n^{2j+1})}^{1/2}], j = 1, 2;
(c) ĝ^(j)_i(a, β) - E[ĝ^(j)_i(a, β)] = ō_p[{(log n)/(T h_n^{2j+1})}^{1/2}], j = 0, 1;
(d) g̃^(j)_i(a, β) - E[g̃^(j)_i(a, β)] = ō_p[{(log n)/(T h_n^{2j+1})}^{1/2}], j = 1, 2;
(e) Ĵ_i(a, β) - E[Ĵ_i(a, β)] = ō_p[{(log n)/(T h_n)}^{1/2}];
(f) J̃^(1)_i(a, β) - E[J̃^(1)_i(a, β)] = ō_p[{(log n)/(T h_n^3)}^{1/2}].

As in Horowitz (1998), we employ empirical process theory to prove the lemma, but since the observations are dependent in the time dimension, we use a concentration inequality for β-mixing processes, which is developed in Appendix C. Before the proof, we introduce some notation. Let F be a class of measurable functions on a measurable space (S, S). For a functional Z(f) defined on F, we use the notation ∥Z(f)∥_F := sup_{f∈F} |Z(f)|. For a probability measure Q on (S, S) and a constant ϵ > 0, let N(F, L^k(Q), ϵ) denote the ϵ-covering number of F with respect to the L^k(Q) norm ∥·∥_{L^k(Q)}, where k ≥ 1.

Proof of Lemma B.2. We only prove (a) with j = 0. The other cases can be shown in a similar way. Put ξ_it := (u_it, x_it), g_{a,β,h}(u, x) := K((u - a - x'β)/h) for (a, β) ∈ R^{p+1} and h > 0, and G_ni := {g_{a,β,h_n} - E[g_{a,β,h_n}(ξ_i1)] : (a, β) ∈ R^{p+1}}. We apply Propositions C.1 and C.2 to the class G_ni. The pointwise measurability of G_ni is guaranteed by the continuity of K(·). Because of the boundedness of K(·), the class G_ni is uniformly bounded by some constant U (say) independent of i and n. Since condition (A6) guarantees that K(·) is of bounded variation, by Lemma 22 of Nolan and Pollard (1987), there exist constants A ≥ 5e and v ≥ 1 independent of i and n such that N(G_ni, L^1(Q), Uϵ) ≤ (A/ϵ)^v for every 0 < ϵ < 1 and every probability measure Q on R^{p+1}.

Let q ∈ [1, T/2]. We wish to bound the variance of the sum Σ_{t=1}^q g_{a,β,h_n}(ξ_it)/√q. It is standard to show that E[g_{a,β,h_n}(ξ_i1)^2] ≤ const. × h_n and E[|g_{a,β,h_n}(ξ_i1) g_{a,β,h_n}(ξ_i,1+j)|^2] ≤ const. × h_n^2, where the constants are independent of a, β, i, j and n (the latter assertion uses the fact that the joint densities f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j}) are uniformly bounded, which is assumed in condition (A5)(e)). Thus, by Lemma 1 of Yoshihara (1976) (see Lemma C.2 below), we have |Cov(g_{a,β,h_n}(ξ_i1), g_{a,β,h_n}(ξ_i,1+j))| ≤ const. × h_n β_i^{1/2}(j), which implies that

sup_{g∈G_ni} Var{Σ_{t=1}^q g(ξ_it)/√q} ≤ const. × {1 + Σ_{j=1}^∞ β_i^{1/2}(j)} h_n ≤ const. × h_n,

where the constant on the last expression is independent of i, n and q (recall that condition (A1) ensures that Σ_{j=1}^∞ sup_{i≥1} β_i^{1/2}(j) < ∞). Take q = q_n = [T h_n^{1/2}/log n]. By the exponential β-mixing property (condition (A1)), r β_i(q) = ō(n^{-2}) with r = [T/(2q)]. Therefore, by Propositions C.1 and C.2, for all s > 0, with probability at least 1 - e^{-s} - ō(n^{-2}), we have

∥Σ_{t=1}^T g(ξ_it)∥_{G_ni} ≤ const. × [{T h_n (s ∨ log n)}^{1/2} + s (T h_n)^{1/2}/log n],

where the constant is independent of i, n and s. Taking s = 2 log n and using the union bound, we obtain the desired result.

C Some stochastic inequalities for β-mixing processes

We introduce some stochastic inequalities that can be applied to β-mixing processes. These inequalities are used in the proofs of the theorems above, and are of independent interest. The first object is to extend Talagrand's (1996) concentration inequality to β-mixing processes. We first recall Talagrand's inequality for i.i.d. random variables.

Theorem C.1 (Talagrand (1996); in this form, Massart (2000)). Let ξ_i, i = 1, 2, ..., be i.i.d. random variables taking values in some measurable space S. Let F be a pointwise measurable class of functions on S such that E[f(ξ_1)] = 0 and sup_{x∈S} |f(x)| ≤ U for all f ∈ F (see van der Vaart and Wellner, 1996, Section 2.3, for pointwise measurability). Let σ^2 be any positive constant such that σ^2 ≥ sup_{f∈F} E[f(ξ_1)^2]. Put Z := sup_{f∈F} |Σ_{i=1}^n f(ξ_i)|. Then, for all s > 0, we have

P{Z ≥ 2E[Z] + Cσ(ns)^{1/2} + sCU} ≤ e^{-s},

where C is a universal constant.

In what follows, let {ξ_t, t ∈ Z} be a stationary process taking values in a measurable space (S, S). We assume that S is a Polish space and S is its Borel σ-field. For a function f on S and a positive integer q, define

σ_q^2(f) := Var{f(ξ_1)} + 2 Σ_{j=1}^{q-1} (1 - j/q) Cov{f(ξ_1), f(ξ_{1+j})},

which is the variance of the sum Σ_{t=1}^q f(ξ_t)/√q. Let β(j) denote the β-mixing coefficients of {ξ_t}. The main technical device we use is a blocking technique developed in Yu (1993, 1994) and Arcones and Yu (1994). The blocking technique enables us to employ maximal inequalities and the symmetrization technique available in the i.i.d. case. We introduce some notation and a key lemma related to the blocking technique.
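The blocking construction used throughout Appendix C can be made concrete in a few lines. The sketch below (block length q and sample size T are arbitrary choices of ours) builds the alternating blocks H_k and T_k and checks the bookkeeping r = [T/(2q)] and the leftover of T - 2qr points.

```python
# Blocking construction from Appendix C: split {1,...,T} into alternating
# "head" blocks H_k and "tail" blocks T_k of common length q, r = floor(T/(2q)).
def make_blocks(T, q):
    r = T // (2 * q)
    H = [list(range(2 * (k - 1) * q + 1, (2 * k - 1) * q + 1)) for k in range(1, r + 1)]
    Tl = [list(range((2 * k - 1) * q + 1, 2 * k * q + 1)) for k in range(1, r + 1)]
    return H, Tl, r

H, Tl, r = make_blocks(T=103, q=10)
covered = sum(len(b) for b in H) + sum(len(b) for b in Tl)
print(r, covered, 103 - covered)  # r blocks of each type; T - 2qr points left over
```

The point of the alternation is that the H_k blocks are separated by gaps of length q, so by the coupling lemma they can be replaced by independent copies at a cost of rβ(q) in probability.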
Divide the sequence {1, ..., T} into blocks of length q with q ∈ [1, T/2], one after the other:

H_k := {t : 2(k-1)q + 1 ≤ t ≤ (2k-1)q}, T_k := {t : (2k-1)q + 1 ≤ t ≤ 2kq},

where k = 1, ..., r := [T/(2q)]. Put Ξ_k := (ξ_t, t ∈ H_k). With a slight abuse of notation, for a function f : S → R, we write f(Ξ_k) = Σ_{t∈H_k} f(ξ_t). Let Ξ̃_k = (ξ̃_t, t ∈ H_k), k = 1, ..., r, be independent blocks such that each Ξ̃_k has the same distribution as Ξ_k. The next lemma, which is a key to the blocking technique, is due to Eberlein (1984).

Lemma C.1. Work with the same notation as above. For every Borel measurable subset A of S^{qr}, we have

|P{(Ξ_1, ..., Ξ_r) ∈ A} - P{(Ξ̃_1, ..., Ξ̃_r) ∈ A}| ≤ rβ(q).

Suppose that we have a pointwise measurable class of functions F on S such that sup_{x∈S} |f(x)| ≤ U for some constant U. For each f ∈ F, observe that

∥Σ_{t=1}^T f(ξ_t)∥_F ≤ ∥Σ_{k=1}^r Σ_{t∈H_k} f(ξ_t)∥_F + ∥Σ_{k=1}^r Σ_{t∈T_k} f(ξ_t)∥_F + (T - 2qr)U.

Since the first and second terms on the right side have the same distribution, for all s > 0, we have

P{∥Σ_{t=1}^T f(ξ_t)∥_F ≥ 2s + (T - 2qr)U} ≤ 2P{∥Σ_{k=1}^r f(Ξ_k)∥_F ≥ s}.

By Lemma C.1, the last expression is bounded by 2P{∥Σ_{k=1}^r f(Ξ̃_k)∥_F ≥ s} + 2rβ(q). Recall that Ξ̃_k, k = 1, ..., r, are i.i.d. blocks. By Talagrand's inequality, for all s > 0, we have

P{∥Σ_{k=1}^r f(Ξ̃_k)∥_F ≥ 2E[∥Σ_{k=1}^r f(Ξ̃_k)∥_F] + Cσ_q(sT)^{1/2} + sqCU} ≤ e^{-s},

where σ_q^2 is any positive constant such that σ_q^2 ≥ sup_{f∈F} σ_q^2(f). Therefore, we obtain the next proposition:

Proposition C.1. Work with the same notation as above. Then, for all s > 0, we have

P{∥Σ_{t=1}^T f(ξ_t)∥_F ≥ 4E[∥Σ_{k=1}^r f(Ξ̃_k)∥_F] + 2Cσ_q(sT)^{1/2} + 2sqCU + (T - 2qr)U} ≤ 2e^{-s} + 2rβ(q).

For the evaluation of the term E[∥Σ_{k=1}^r f(Ξ̃_k)∥_F], the next proposition is useful. Recall that for a probability measure Q on (S, S) and a constant ϵ > 0, N(F, L^k(Q), ϵ) denotes the ϵ-covering number of F with respect to the L^k(Q) norm ∥·∥_{L^k(Q)}, where k ≥ 1, and for a functional Z(f) on F, ∥Z(f)∥_F = sup_{f∈F} |Z(f)|.

Proposition C.2. Work with the same notation as above. Assume that there exist constants A ≥ 5e and v ≥ 1 such that N(F, L^1(Q), Uϵ) ≤ (A/ϵ)^v for every 0 < ϵ < 1 and every probability measure Q on S. Then, we have

E[∥Σ_{k=1}^r f(Ξ̃_k)∥_F] ≤ C[qUv log(√q A'U/σ_q) + √v √T σ_q {log(√q A'U/σ_q)}^{1/2}], (C.1)

where C is a universal constant and A' := (2A)^{1/2}.

Proof. The proof is based on that of Giné and Guillou (2001, Proposition 2.1), but some modifications are needed. Recall that Ξ̃_k, k = 1, ..., r, are i.i.d. Let ϵ_1, ..., ϵ_r be i.i.d. Rademacher random variables independent of Ξ̃_1, ..., Ξ̃_r. By the symmetrization inequality (Lemma 2.3.1 of van der Vaart and Wellner (1996)), we have

E[∥Σ_{k=1}^r f(Ξ̃_k)∥_F] ≤ 2E[∥Σ_{k=1}^r ϵ_k f(Ξ̃_k)∥_F] = 2qU E[∥Σ_{k=1}^r ϵ_k h(Ξ̃_k)∥_H], (C.2)

where H := {h : h(ξ_1, ..., ξ_q) = Σ_{t=1}^q f(ξ_t)/(qU), f ∈ F}. We shall bound the right side of (C.2). Without loss of generality, we may assume that 0 ∈ H. By Hoeffding's inequality, conditionally on {Ξ̃_1, ..., Ξ̃_r}, the process Σ_{k=1}^r ϵ_k h(Ξ̃_k)/√r is sub-Gaussian with respect to the seminorm ∥h_1 - h_2∥_{2,r} := {r^{-1} Σ_{k=1}^r (h_1(Ξ̃_k) - h_2(Ξ̃_k))^2}^{1/2}, so that, by the standard chaining argument (Corollary 2.2.8 of van der Vaart and Wellner (1996)),

E_ϵ[∥Σ_{k=1}^r ϵ_k h(Ξ̃_k)/√r∥_H] ≤ C ∫_0^θ {log N(H, ∥·∥_{2,r}, ϵ)}^{1/2} dϵ, θ := ∥Σ_{k=1}^r h(Ξ̃_k)^2/r∥_H^{1/2},

where E_ϵ stands for the expectation with respect to the ϵ_k's and C is a universal constant. Let Q̃_r denote the empirical distribution that assigns probability 1/(qr) to each ξ̃_t, t ∈ ∪_{k=1}^r H_k. Since for h_1, h_2 ∈ H with corresponding f_1, f_2 ∈ F,

r^{-1} Σ_{k=1}^r {h_1(Ξ̃_k) - h_2(Ξ̃_k)}^2 ≤ 2 Q̃_r|f_1 - f_2|/U,

we have N(H, ∥·∥_{2,r}, ϵ) ≤ N(F, L^1(Q̃_r), Uϵ^2/2) ≤ (2A/ϵ^2)^v. Thus,

E_ϵ[∥Σ_{k=1}^r ϵ_k h(Ξ̃_k)/√r∥_H] ≤ C√(2v) ∫_0^θ {log(A'/ϵ)}^{1/2} dϵ,

and an elementary calculation (integration by parts) shows that ∫_0^θ {log(A'/ϵ)}^{1/2} dϵ ≤ 2θ{log(A'/θ)}^{1/2} for 0 < θ ≤ A'/e. Hence,

E[∥Σ_{k=1}^r ϵ_k h(Ξ̃_k)/√r∥_H] ≤ C√v E[θ{log(A'/θ)}^{1/2}] ≤ C√v (E[θ^2])^{1/2} {log(A'/(E[θ^2])^{1/2})}^{1/2},

where the second inequality is due to Hölder's inequality, the concavity of the map x ↦ x log(a/x) and Jensen's inequality. Now, by Corollary 3.4 of Talagrand (1994),

E[∥Σ_{k=1}^r h(Ξ̃_k)^2∥_H] ≤ r σ_q^2/(qU^2) + 8E[∥Σ_{k=1}^r ϵ_k h(Ξ̃_k)∥_H],

and the right side is bounded by 10r because σ_q^2 ≤ qU^2 and E[∥Σ_{k=1}^r ϵ_k h(Ξ̃_k)∥_H] ≤ r. Put Z := E[∥Σ_{k=1}^r ϵ_k h(Ξ̃_k)/√r∥_H]. Since E[θ^2] ≤ σ_q^2/(qU^2) + 8Z/√r and the map x ↦ x log(a/x) is non-decreasing for 0 ≤ x ≤ a/e (together with A ≥ 5e), Z satisfies

Z^2 ≤ C^2 v {σ_q^2/(qU^2) + 8Z/√r} log(√q A'U/σ_q).

Solving this quadratic inequality in Z gives

Z ≤ C'[v log(√q A'U/σ_q)/√r + (σ_q/(q^{1/2} U)){v log(√q A'U/σ_q)}^{1/2}], (C.3)

where the solution uses √(a + b) ≤ √a + √b for a, b > 0, and C' is another universal constant. Combining (C.2) and (C.3) yields the desired inequality.
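The quantity σ_q^2(f) appearing in Propositions C.1-C.2 is a Bartlett-tapered long-run variance. A small deterministic sketch (the AR(1) autocovariance function is our own choice of example) checks that σ_q^2 increases to the long-run variance as the block length q grows, which is what makes the choice of q a bias-variance trade-off in the applications above.

```python
phi = 0.6

def gamma(j):
    # Autocovariances of a stationary AR(1) with unit innovation variance.
    return phi**j / (1.0 - phi**2)

def sigma_q_sq(q):
    # sigma_q^2(f) = Var{f(xi_1)} + 2*sum_{j=1}^{q-1} (1 - j/q) Cov{f(xi_1), f(xi_{1+j})},
    # i.e. the variance of sum_{t<=q} f(xi_t)/sqrt(q), here with f the identity map.
    return gamma(0) + 2.0 * sum((1.0 - j / q) * gamma(j) for j in range(1, q))

lrv = (1.0 + phi) / ((1.0 - phi) * (1.0 - phi**2))  # limit: sum of all autocovariances
s_small, s_big = sigma_q_sq(5), sigma_q_sq(2000)
print(s_small, s_big, lrv)
```

For positively autocorrelated processes σ_q^2 is monotonically increasing in q and bounded by the long-run variance, so taking q too small understates the variance proxy fed into Talagrand's inequality.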
In the case that F is a singleton, i.e., F = {f}, Proposition C.2 is not needed, since then

E[|Σ_{k=1}^r f(Ξ̃_k)|] ≤ {E[(Σ_{k=1}^r f(Ξ̃_k))^2]}^{1/2} = (rq)^{1/2} σ_q(f).

Although such inequalities are known in the literature (cf. Fan and Yao, 2003, Theorem 2.18), we present the special case of F being a singleton as a corollary, since its form is more convenient in the present context.

Corollary C.1. Let f be a function on S such that sup_{x∈S} |f(x)| ≤ U and E[f(ξ_1)] = 0. Then, for all s > 0, we have

P{|Σ_{t=1}^T f(ξ_t)| ≥ C[{(s ∨ 1)T}^{1/2} σ_q(f) + sqU]} ≤ 2e^{-s} + 2rβ(q),

where C is a universal constant.

In applying those inequalities, the evaluation of the variance term σ_q^2(f) is essential. For β-mixing processes, Yoshihara's (1976) Lemma 1 is particularly useful for that purpose. Since it is repeatedly used in the proofs of the theorems above, we describe a special case of that lemma.

Lemma C.2. Work with the same notation as above. Let j be a fixed positive integer. Let f and g be functions on S such that E[f(ξ_1)] = E[g(ξ_{1+j})] = 0, and for some positive constants δ and M,

E[|f(ξ_1)|^{1+δ}] ≤ M, E[|g(ξ_{1+j})|^{1+δ}] ≤ M, E[|f(ξ_1) g(ξ_{1+j})|^{1+δ}] ≤ M. (C.4)

Then, we have |Cov(f(ξ_1), g(ξ_{1+j}))| ≤ 4M^{1/(1+δ)} β(j)^{δ/(1+δ)}.

A direct consequence of Lemma C.2 is that if there exist positive constants δ and M such that (C.4) holds with f = g for any positive integer j, then the sum Σ_{j=1}^∞ Cov{f(ξ_1), f(ξ_{1+j})} is absolutely convergent, and for any positive integer q,

Var{Σ_{t=1}^q f(ξ_t)/√q} ≤ 4M^{1/(1+δ)} {1 + 2 Σ_{j=1}^∞ β(j)^{δ/(1+δ)}}.

Note that if β(j) decays exponentially fast as j → ∞, i.e., for some constants a ∈ (0, 1) and B > 0, β(j) ≤ B a^j, then Σ_{j=1}^∞ β(j)^{δ/(1+δ)} < ∞ for any δ > 0.

References

Abrevaya, J., and C. M. Dahl (2008): “The Effects of Birth Inputs on Birthweight: Evidence From Quantile Estimation on Panel Data,” Journal of Business and Economic Statistics, 26, 379-397.

Amemiya, T. (1982): “Two Stage Least Absolute Deviations Estimators,” Econometrica, 50, 689-711.

Andrews, D. W. K. (1991): “Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation,” Econometrica, 59, 817-858.

Angrist, J., V. Chernozhukov, and I. Fernández-Val (2006): “Quantile Regression under Misspecification, with an Application to the U.S. Wage Structure,” Econometrica, 74, 539-563.

Arcones, M. A. (1998): “Second Order Representations of the Least Absolute Deviation Regression Estimator,” Annals of the Institute of Statistical Mathematics, 50, 87-117.

Arcones, M. A., and B. Yu (1994): “Central Limit Theorems for Empirical and U-processes of Stationary Mixing Sequences,” Journal of Theoretical Probability, 7, 47-71.

Arellano, M., and S. Bonhomme (2009): “Robust Priors in Nonlinear Panel Data Models,” Econometrica, 77, 489-536.

Arellano, M., and J. Hahn (2005): “Understanding Bias in Nonlinear Panel Models: Some Recent Developments,” Mimeo, CEMFI.

Bester, C. A., and C. Hansen (2009): “A Penalty Function Approach to Bias Reduction in Non-Linear Panel Models with Fixed Effects,” Journal of Business and Economic Statistics, 27, 131-148.

Buchinsky, M. (1995): “Estimating the Asymptotic Covariance Matrix for Quantile Regression Models: A Monte Carlo Study,” Journal of Econometrics, 68, 303-338.

Canay, I. A. (2008): “A Note on Quantile Regression for Panel Data Models,” Mimeo, Northwestern University.

Dhaene, G., and K. Jochmans (2009): “Split-Panel Jackknife Estimation of Fixed-Effect Models,” Mimeo.

Eberlein, E.
(1984): “Weak Convergence of Partial Sums of Absolutely Regular Sequences,” Statistics and Probability Letters, 2, 291–293.
Fan, J., and Q. Yao (2003): Nonlinear Time Series. Springer-Verlag.
Fernández-Val, I. (2005): “Bias Correction in Panel Data Models with Individual Specific Parameters,” Mimeo.
Gamper-Rabindran, S., S. Khan, and C. Timmins (2010): “The Impact of Piped Water Provision on Infant Mortality in Brazil: A Quantile Panel Data Approach,” Journal of Development Economics, 92, 188–200.
Giné, E., and A. Guillou (2001): “On Consistency of Kernel Density Estimators for Randomly Censored Data: Rates Holding Uniformly over Adaptive Intervals,” Annales de l’Institut Henri Poincaré, 37, 503–522.
Graham, B. S., J. Hahn, and J. L. Powell (2009): “The Incidental Parameter Problem in a Non-Differentiable Panel Data Model,” Economics Letters, 105, 181–182.
Hahn, J., and G. M. Kuersteiner (2011): “Asymptotically Unbiased Inference for a Dynamic Panel Model with Fixed Effects When Both n and T Are Large,” Econometric Theory, forthcoming.
Hahn, J., and W. Newey (2004): “Jackknife and Analytical Bias Reduction for Nonlinear Panel Models,” Econometrica, 72, 1295–1319.
Horowitz, J. L. (1992): “A Smoothed Maximum Score Estimator for the Binary Response Model,” Econometrica, 60, 505–531.
(1998): “Bootstrap Methods for Median Regression Models,” Econometrica, 66, 1327–1351.
(2001): “The Bootstrap in Econometrics,” in Handbook of Econometrics, Vol. 5, ed. by J. J. Heckman and E. E. Leamer, pp. 3159–3228. Elsevier Science B.V.
Ichimura, H., and P. Todd (2007): “Implementing Nonparametric and Semiparametric Estimators,” in Handbook of Econometrics, Vol. 6, ed. by J. J. Heckman and E. E. Leamer, pp. 5369–5468. Elsevier Science B.V.
Kato, K., A. F. Galvao, Jr., and G. V. Montes-Rojas (2011): “Asymptotics for Panel Quantile Regression Models with Individual Effects,” Mimeo.
Knight, K.
(1998): “Limiting Distributions for L1 Regression Estimators Under General Conditions,” Annals of Statistics, 26, 755–770.
Koenker, R. (2004): “Quantile Regression for Longitudinal Data,” Journal of Multivariate Analysis, 91, 74–89.
(2005): Quantile Regression. Cambridge University Press, Cambridge.
(2009): “Quantile Regression in R: A Vignette,” Available from: http://cran.r-project.org/web/packages/quantreg/index.html.
Koenker, R., and G. W. Bassett (1978): “Regression Quantiles,” Econometrica, 46, 33–49.
Lancaster, T. (2000): “The Incidental Parameter Problem since 1948,” Journal of Econometrics, 95, 391–413.
Li, H., B. G. Lindsay, and R. P. Waterman (2003): “Efficiency of Projected Score Methods in Rectangular Array Asymptotics,” Journal of the Royal Statistical Society, Series B, 65, 191–208.
Lipsitz, S. R., G. M. Fitzmaurice, G. Molenberghs, and L. P. Zhao (1997): “Quantile Regression Methods for Longitudinal Data with Drop-outs: Application to CD4 Cell Counts of Patients Infected with the Human Immunodeficiency Virus,” Journal of the Royal Statistical Society, Series C, 46, 463–476.
Massart, P. (2000): “About the Constants in Talagrand’s Concentration Inequalities for Empirical Processes,” Annals of Probability, 28, 863–884.
Müller, H. G. (1984): “Smooth Optimal Kernel Estimators of Densities, Regression Curves and Modes,” Annals of Statistics, 12, 766–774.
Neyman, J., and E. L. Scott (1948): “Consistent Estimates Based on Partially Consistent Observations,” Econometrica, 16, 1–32.
Nolan, D., and D. Pollard (1987): “U-processes: Rates of Convergence,” Annals of Statistics, 15, 780–799.
Parzen, E. (1957): “On Consistent Estimates of the Spectrum of a Stationary Time Series,” Annals of Mathematical Statistics, 28, 329–348.
Phillips, P. C. (1991): “A Shortcut to LAD Estimator Asymptotics,” Econometric Theory, 7, 450–463.
Rilstone, P., V. Srivastava, and A.
Ullah (1996): “The Second-Order Bias and Mean Squared Error of Nonlinear Estimators,” Journal of Econometrics, 75, 369–395.
Rosen, A. (2009): “Set Identification via Quantile Restrictions in Short Panels,” CEMMAP Working Paper CWP26/09.
Silva, D. G. D., G. Kosmopoulou, and C. Lamarche (2009): “The Effect of Information on the Bidding and Survival of Entrants in Procurement Auctions,” Journal of Public Economics, 93, 56–72.
Talagrand, M. (1994): “Sharper Bounds for Gaussian and Empirical Processes,” Annals of Probability, 22, 28–76.
(1996): “New Concentration Inequalities in Product Spaces,” Inventiones Mathematicae, 126, 505–563.
van der Vaart, A., and J. A. Wellner (1996): Weak Convergence and Empirical Processes. Springer-Verlag, New York.
Wang, H. J., and M. Fygenson (2009): “Inference for Censored Quantile Regression Models in Longitudinal Studies,” Annals of Statistics, 37, 756–781.
Wei, Y., and X. He (2006): “Conditional Growth Charts,” Annals of Statistics, 34, 2069–2097.
Woutersen, T. M. (2002): “Robustness Against Incidental Parameters,” Mimeo, University of Western Ontario.
Yoshihara, K. (1976): “Limiting Behavior of U-Statistics for Stationary, Absolutely Regular Processes,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 35, 237–252.
(1978): “Moment Inequalities for Mixing Sequences,” Kodai Mathematical Journal, 1, 316–328.
Yu, B. (1993): “Density Estimation in the L∞ Norm for Dependent Data with Applications to the Gibbs Sampler,” Annals of Statistics, 21, 711–735.
(1994): “Rates of Convergence for Empirical Processes of Stationary Mixing Sequences,” Annals of Probability, 22, 94–116.
[Table: Bias, Standard Deviation, and Root Mean Squared Error of the estimators β̂1, β̂1^{1/2}, and β̂1^{KB} at τ = 0.25, 0.50, 0.75, for T ∈ {10, 20, 50, 100} and n ∈ {50, 100, 500}. Standard normal innovation model.]

[Table: Bias, Standard Deviation, and Root Mean Squared Error of the estimators β̂1, β̂1^{1/2}, and β̂1^{KB} at τ = 0.25, 0.50, 0.75, for T ∈ {10, 20, 50, 100} and n ∈ {50, 100, 500}. Chi-square innovation model.]

[Table: Bias, Standard Deviation, and Root Mean Squared Error of the estimators β̂1, β̂1^{1/2}, and β̂1^{KB} at τ = 0.25, 0.50, 0.75, for T ∈ {10, 20, 50, 100} and n ∈ {50, 100, 500}. Standard normal innovation model, θ = 1.]

[Table: Bias, Standard Deviation, and Root Mean Squared Error of the estimators β̂1, β̂1^{1/2}, and β̂1^{KB} at τ = 0.25, 0.50, 0.75, for T ∈ {10, 20, 50, 100} and n ∈ {50, 100, 500}. Chi-square innovation model.]