Smoothed quantile regression for panel data*

Kengo Kato†
Antonio F. Galvao, Jr.‡

August 20, 2011

Abstract
This paper studies fixed effects estimation of quantile regression (QR) models with panel data. Previous studies show that there are two important difficulties with the standard QR estimation. First, the estimator can be inconsistent because of the well-known incidental parameters problem. Second, the non-smoothness of the objective function significantly complicates the asymptotic analysis of the estimator, especially when the observations are dependent in the time dimension. We overcome the latter problem by smoothing the objective function. Under an asymptotic framework where both the number of individuals and the number of time periods grow at the same rate, we show that the fixed effects estimator of the smoothed objective function has a limiting normal distribution with a bias in the mean, and provide the analytic form of the asymptotic bias. We propose a one-step bias correction to the fixed effects estimator based on the formula obtained from our asymptotic analysis. Importantly, we allow for the case that the observations are dependent in the time dimension. A Monte Carlo study is conducted to illustrate the effect of the bias correction to the estimator.

Key words: bias correction, incidental parameters problem, panel data, quantile regression, smoothing.

JEL classification codes: C13, C23.
*The authors are grateful to the seminar participants at the Conference on Panel Data, New York Camp Econometrics, and the Summer Meetings of the Econometric Society. We also would like to thank the editor and two anonymous referees for their constructive comments. This research was supported by the Grant-in-Aid for Scientific Research (KAKENHI).
†Department of Mathematics, Graduate School of Science, Hiroshima University, Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8526, Japan.
‡Department of Economics, University of Iowa, W210 Pappajohn Business Building, 21 E. Market Street, Iowa City, IA 52242. E-mail: antonio-galvao@uiowa.edu
1 Introduction

Since the seminal work of Koenker and Bassett (1978), quantile regression (QR) has attracted considerable interest in statistics and econometrics. It offers an easy-to-implement method to estimate conditional quantiles, and is known as a more flexible tool than mean regression for capturing the effects of explanatory variables on the response. The theoretical properties of the QR method are well established for cross sectional models; see Koenker (2005) for references and discussion.
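As a small numerical illustration of the check-function loss underlying QR (a sketch we add here; it is not part of the paper), the τ-th sample quantile minimizes the sum of check-function losses ρ_τ(u) = u{τ − I(u < 0)} over constants:

```python
import numpy as np

# rho_tau(u) = u * (tau - I(u < 0)): the Koenker-Bassett "check function"
def rho(u, tau):
    return u * (tau - (u < 0))

rng = np.random.default_rng(0)
y = rng.normal(size=1001)
tau = 0.3

# A minimizer of c -> sum_t rho(y_t - c, tau) can be found among the
# order statistics, and it is the usual tau-th sample quantile.
grid = np.sort(y)
losses = np.array([np.sum(rho(y - c, tau)) for c in grid])
minimizer = float(grid[int(np.argmin(losses))])
sample_q = float(np.quantile(y, tau))
```

With 1001 draws, `minimizer` and `sample_q` agree up to interpolation in `np.quantile`.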
Recently, panel data sets have become widely available and very popular, because they provide a large number of data points and allow researchers to study dynamics of adjustment as well as to control for individual specific heterogeneity. In this regard, it is natural to consider applying QR to panel data. In fact, there is a rapidly expanding empirical literature on QR for panel data in important areas of economics and biostatistics,¹ which makes a persuasive case for the rigorous study of estimation and inference properties of the QR method for panel data.² In contrast to mean regression, however, little is known about the ability of the QR method to analyze panel models with individual effects. An approach in this direction is to treat each individual effect as a parameter to be estimated and apply the standard QR estimation to the individual and common parameters altogether (Koenker, 2004). Unfortunately, the resulting estimator of the common parameters will be generally inconsistent when the number of individuals n goes to infinity while the number of time periods T is fixed. This is a version of the incidental parameters problem (Neyman and Scott, 1948), which has been extensively studied in the recent econometrics literature (see Lancaster, 2000, for a review). Arellano and Bonhomme (2009, p.490) stated that the incidental parameters problem is "one of the main challenges in modern econometrics".
A recent attempt to cope with the incidental parameters problem is to introduce an asymptotic framework in which n and T jointly go to infinity. Li, Lindsay, and Waterman (2003) considered the maximum likelihood (ML) estimation for smooth likelihood functions and showed that the ML estimator (MLE) of the common parameters has an order O(T^{-1}) bias, so its limiting normal distribution has a bias in the mean even when n and T grow at the same rate. An important implication of their work is that properly bias corrected estimators enjoy mean-zero asymptotic normality when T grows at a slower rate than n. Since then, several bias correction methods for the ML estimation (or more generally, M-estimation) have been proposed in the literature; see Hahn and Newey (2004), Woutersen (2002), Hahn and Kuersteiner (2011), Arellano and Hahn (2005), Bester and Hansen (2009), Arellano and Bonhomme (2009) and Dhaene and Jochmans (2009), to name only a few.

¹For example, in applied economics, Abrevaya and Dahl (2008), Silva, Kosmopoulou, and Lamarche (2009), and Gamper-Rabindran, Khan, and Timmins (2010); in biostatistics, Lipsitz, Fitzmaurice, Molenberghs, and Zhao (1997), Wei and He (2006) and Wang and Fygenson (2009), among others.

²Koenker (2004) indeed proposed a penalized estimation method where the individual parameters are subject to the l₁ penalty. Other approaches are found in Abrevaya and Dahl (2008), Canay (2008), and Rosen (2009).
A distinctive feature of QR is that the objective function, often referred to as the check function, is not differentiable. This means that the asymptotic analysis of the previous nonlinear panel data literature is not directly applicable to the QR case, since it substantially depends on the smoothness of objective functions. Kato, Galvao, and Montes-Rojas (2011) formally established the asymptotic properties of the standard fixed effects QR estimator of the common parameters. However, they required the restrictive condition that T grows faster than n² to show asymptotic normality of the estimator, and did not succeed in deriving the bias. The difficulty in handling the standard QR estimator in panel models is partly explained by the fact that the higher order stochastic expansion of the scores is an essential technical tool in the asymptotic analysis of Hahn and Newey (2004) and Hahn and Kuersteiner (2011), while such an expansion is difficult to implement for the QR objective function because the Taylor series method cannot be applied to it. It is also important to note that the higher order asymptotic behavior of QR estimators is non-standard and rather complicated (Arcones, 1998; Knight, 1998). See also the discussion at the end of Section 5 of Kato, Galvao, and Montes-Rojas (2011). To our knowledge, there is no paper that formally studies the bias correction for panel QR models under the large n and T asymptotics.
This paper overcomes the above problem by smoothing the objective function. The idea of smoothing non-differentiable objective functions goes back to Amemiya (1982) and Horowitz (1992, 1998). We refer to the resulting estimator as the fixed effects smoothed quantile regression (FE-SQR) estimator. We show that under suitable regularity conditions, the FE-SQR estimator has an order O(T^{-1}) bias and hence its limiting normal distribution has a bias in the mean even when n and T grow at the same rate. Importantly, we allow the observations to be dependent in the time dimension. As naturally expected, the bias depends on the conditional and unconditional densities and their first derivatives. We propose a one-step bias correction based on the analytic form of the asymptotic bias. We also apply the half-panel jackknife method originally proposed by Dhaene and Jochmans (2009) to the FE-SQR estimator. We theoretically show that both methods eliminate the bias of the limiting distribution. It is important to note that these asymptotic results are not covered by the previous literature, since a bandwidth tending to zero as the sample size increases is involved in the objective function and the standard stochastic expansion does not apply to such a case (see Horowitz, 1998). In the present case, we have to control the smoothing effect and at the same time handle the problem of a diverging number of parameters, which we believe is challenging from a technical point of view. An additional complication arises since the observations are dependent in the time dimension. To cope with this complication, we derive new empirical process inequalities.
We conduct a small Monte Carlo study to illustrate the effect of the bias correction on the FE-SQR estimator. The results show that, as expected, the standard QR and the FE-SQR estimators are moderately biased except in some special cases. In addition, the one-step bias correction is able to substantially reduce the bias in many cases, but it is observed that the analytic bias correction increases the variability in small samples, partly because of the nonparametric estimation of the bias term.
The organization of this paper is as follows. In Section 2, we introduce a panel QR model with individual effects and formally define the FE-SQR estimator. In Section 3, we present the theorems about the limiting distributions of the FE-SQR estimator and the bias corrected one when n and T grow at the same rate. In Section 4, we report a Monte Carlo study to assess the finite sample performance of the estimator and the bias correction. In Section 5, we leave some concluding remarks. We relegate the proofs of the theorems to the Appendix.
2 Model and estimation method

We consider a panel QR model with individual effects

Q_τ(y_it | x_it, α_i0) = α_i0 + x_it′β_0,   (2.1)

where τ ∈ (0, 1) is a quantile of interest, y_it is a dependent variable, x_it is a p dimensional vector of explanatory variables, α_i0 is a scalar individual effect for each i, and Q_τ(y_it | x_it, α_i0) is the τ-th conditional quantile of y_it given (x_it, α_i0). In general, each α_i0 and β_0 can depend on τ, but we keep τ fixed throughout the paper and suppress such a dependence for notational simplicity.³ The model is semiparametric in the sense that the conditional distribution of y_it given (x_it, α_i0) is left unspecified and no parametric assumption is made on the relation between α_i0 and x_it. The number of individuals is denoted by n and the number of time periods is denoted by T = T_n, which may depend on n. In what follows, we omit the subscript n of T_n. It should be noted that although the model (2.1) may look like a linear panel data model at first sight, the quantile is a nonlinear functional of the random variable, so in general the conventional differencing out strategy is not appropriate for the QR case.

³Some readers might wonder whether the individual effects should be independent of the quantile. However, in our model, the individual effects play the role of intercept terms that may depend on the quantile. Thus, it would be natural to allow the individual effects to depend on the quantile.
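The failure of differencing can be checked in one line (a toy illustration we add, with two independent standard normal samples; it is not from the paper): the 0.75-quantile of a difference is about √2 · 0.674 ≈ 0.95, while the difference of the 0.75-quantiles is about 0.

```python
import numpy as np

# Quantiles, unlike means, do not pass through differences:
# Q_tau(y1 - y2) != Q_tau(y1) - Q_tau(y2) in general.
rng = np.random.default_rng(0)
y1 = rng.normal(size=200_000)
y2 = rng.normal(size=200_000)
tau = 0.75

q_of_diff = float(np.quantile(y1 - y2, tau))                    # ~ sqrt(2) * 0.674
diff_of_q = float(np.quantile(y1, tau) - np.quantile(y2, tau))  # ~ 0
```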
We consider the fixed effects estimation of β_0, which is implemented by treating each individual effect as a parameter to be estimated. Throughout the paper, as in Hahn and Newey (2004), Hahn and Kuersteiner (2011) and Fernandez-Val (2005), we treat α_i0 as fixed by conditioning on them. Then, the standard QR estimation solves

min_{α ∈ R^n, β ∈ R^p} (1/(nT)) ∑_{i=1}^n ∑_{t=1}^T (y_it − α_i − x_it′β){τ − I(y_it ≤ α_i + x_it′β)},   (2.2)

where I(·) is the indicator function. Note that α implicitly depends on n. As in general nonlinear panel models, except for some special cases, the estimator of β_0 defined by (2.2) is inconsistent as n → ∞ while T is fixed.⁴ In fact, as Graham, Hahn, and Powell (2009) noted, when T = 2, the standard fixed effects QR estimator of the common parameter does not depend on τ for panel quantile regression models with individual effects; however, the common parameter does depend on τ unless the model is of the linear location form that Graham, Hahn, and Powell (2009) presumed, so the standard fixed effects QR estimator will be generally inconsistent.
A distinctive feature of the standard QR estimation is that the objective function is not differentiable. Therefore, the basic smoothness assumption imposed in the recent nonlinear panel data literature is not satisfied for the standard QR estimator. Kato, Galvao, and Montes-Rojas (2011) rigorously investigated the asymptotic properties of the standard QR estimator, say β̂_KB, defined by (2.2) when n and T jointly go to infinity. The essential content of their theoretical results is that while β̂_KB is consistent under mild regularity conditions, justifying (mean-zero) asymptotic normality of β̂_KB requires the restrictive condition that T grows faster than n² when n and T jointly go to infinity.⁵ They remarked that the non-smoothness of the objective function (and hence the non-smoothness of the scores) significantly complicates the asymptotic analysis of β̂_KB. In particular, it is not known whether asymptotic results analogous to those of Hahn and Newey (2004) hold for β̂_KB even when the observations are independent in both cross section and time dimensions. In this paper, instead of the standard QR estimator, we study the asymptotic properties of the estimator defined by a minimizer of a smoothed version of the QR objective function.⁶ Smoothing the QR objective function was employed in Horowitz (1998) to study the bootstrap refinement for inference in conditional quantile models.⁷

⁴A more explicit example in which the standard fixed effects QR estimator is inconsistent is available upon request.

⁵Kato, Galvao, and Montes-Rojas (2011), however, did not show that this condition is necessary for (mean-zero) asymptotic normality of β̂_KB. Whether one can substantially improve this condition is still an open question.
The basic insight of Horowitz (1998) is to smooth over I(y_it ≤ α_i + x_it′β) by using a kernel function. Let K(·) be a kernel function and let G(·) be the survival function of K(·), i.e., G(u) := ∫_u^∞ K(v)dv. We do not require K(·) to be nonnegative; we state precise conditions on K(·) in Section 3. Let {h_n} be a sequence of positive bandwidths such that h_n → 0 as n → ∞ and write G_hn(·) = G(·/h_n). Note that G_hn(y_it − α_i − x_it′β) is a smoothed version of I(y_it ≤ α_i + x_it′β). Then, we consider the estimator

(α̂, β̂) := arg min_{(α,β) ∈ A^n × B} (1/(nT)) ∑_{i=1}^n ∑_{t=1}^T (y_it − α_i − x_it′β){τ − G_hn(y_it − α_i − x_it′β)},   (2.3)

where A is a compact subset of R, A^n is the product of n copies of A and B is a compact subset of R^p. The compactness of the set A^n × B is required to ensure the existence of (α̂, β̂). We call β̂ the fixed effects smoothed quantile regression (FE-SQR) estimator of β_0.

Put K̃(u) := uK(u). We use the notation K_hn(u) := h_n^{-1}K(u/h_n) and K̃_hn(u) := h_n^{-1}K̃(u/h_n). Let ∂_α = ∂/∂α and ∂_β = ∂/∂β. Since the objective function in (2.3) is smooth with respect to (α, β), if (α̂, β̂) is an interior point of A^n × B, it satisfies

H_ni^{(1)}(α̂_i, β̂) = 0, i = 1, . . . , n;  H_n^{(2)}(α̂, β̂) = 0,   (2.4)

where

H_ni^{(1)}(α_i, β) := T^{-1} ∑_{t=1}^T [{τ − G_hn(y_it − α_i − x_it′β)} + h_n K̃_hn(y_it − α_i − x_it′β)],

H_n^{(2)}(α, β) := (nT)^{-1} ∑_{i=1}^n ∑_{t=1}^T [{τ − G_hn(y_it − α_i − x_it′β)} + h_n K̃_hn(y_it − α_i − x_it′β)] x_it.

⁶In fact, it turns out that using smoothing is crucial to develop the formal asymptotic analysis here, since, as Hahn and Newey (2004) and Hahn and Kuersteiner (2011) observed, in a general nonlinear panel model with a smooth objective function, the bias of the fixed effects estimator of the common parameter depends on the second order biases of the estimators of the individual parameters. However, as Kato, Galvao, and Montes-Rojas (2011) discussed, the second order bias of the general QR estimator (in the cross section case) is not uniquely determined, so there is a conceptual difficulty in deriving a bias result for the unsmoothed fixed effects QR estimator.

⁷A motivation to use the smoothed QR estimator in Horowitz (1998) is that higher order asymptotic theory of the unsmoothed QR estimator is non-standard, which significantly complicates the study of the bootstrap refinement for inference based on it (see also Horowitz, 2001, Section 4.3).

As we will see in the proof of Theorem 3.1, the terms concerning K̃(·) do not affect the first order asymptotic distribution of β̂. This can be understood by regarding K̃(·) as a "kernel", although the integral of K̃(·) over the real line is not one. For a later purpose, we refer to the partial derivative with respect to α_i as the α_i-derivative and the partial derivative with respect to β as the β-derivative. H_ni^{(1)}(α_i, β) and H_n^{(2)}(α, β) are the α_i-derivative and the β-derivative of minus the objective function in (2.3), respectively. From an analogy to the ML estimation, we call H_ni^{(1)}(α_i, β) the α_i-score and H_n^{(2)}(α, β) the β-score.

3 Main results

In what follows, we treat α_i0 as fixed by conditioning on them, and consider the joint asymptotics in which T = T_n → ∞ as n → ∞. The limit is always taken as n → ∞. In this section, we investigate the asymptotic properties of the FE-SQR estimator defined by (2.3). Put α_0 := (α_10, . . . , α_n0)′.
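As a numerical sanity check of the definitions in Section 2 (a sketch we add; the Epanechnikov kernel is used purely for illustration and, not being a higher-order kernel, would not satisfy the kernel conditions of Section 3), the code below evaluates the smoothed objective (2.3) and verifies that the α_i-score in (2.4) matches a numerical derivative of minus the objective:

```python
import numpy as np

# Epanechnikov kernel K, its survival function G(u) = int_u^inf K(v)dv,
# and K~(u) = u*K(u), as in Section 2.
def K(v):
    return 0.75 * (1.0 - v**2) * (np.abs(v) <= 1.0)

def G(u):
    inner = 0.5 - 0.75 * u + 0.25 * u**3
    return np.where(u <= -1.0, 1.0, np.where(u >= 1.0, 0.0, inner))

def Ktilde(v):
    return v * K(v)

def objective(alpha, beta, y, x, tau, h):
    # (1/nT) sum_i sum_t (y_it - a_i - x_it'b) * {tau - G((y_it - a_i - x_it'b)/h)}
    u = y - alpha[:, None] - x @ beta
    return float(np.mean(u * (tau - G(u / h))))

def alpha_score(i, alpha, beta, y, x, tau, h):
    # H_ni^(1)(a_i, b) = T^{-1} sum_t [{tau - G(u_it/h)} + K~(u_it/h)]
    u = y[i] - alpha[i] - x[i] @ beta
    return float(np.mean((tau - G(u / h)) + Ktilde(u / h)))

rng = np.random.default_rng(1)
n, T, tau, h = 5, 50, 0.5, 0.5
x = rng.normal(size=(n, T, 2))
alpha = rng.normal(size=n)
beta = np.array([1.0, -0.5])
y = alpha[:, None] + x @ beta + rng.normal(size=(n, T))

# The alpha_0-score equals n times the alpha_0-derivative of minus the objective.
eps = 1e-6
a_hi, a_lo = alpha.copy(), alpha.copy()
a_hi[0] += eps
a_lo[0] -= eps
numeric = -(objective(a_hi, beta, y, x, tau, h)
            - objective(a_lo, beta, y, x, tau, h)) / (2.0 * eps)
analytic = alpha_score(0, alpha, beta, y, x, tau, h) / n
```

The agreement of `numeric` and `analytic` reflects that the K̃ term in the score comes from differentiating u{τ − G(u/h)} with respect to u.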
3.1 Assumptions

This section provides conditions under which the FE-SQR estimator has a limiting normal distribution with a bias in the mean when n and T grow at the same rate. The conditions below are stronger than needed if the main concern is to prove consistency only, or asymptotic normality of the estimator when n/T → 0. However, we believe that the conditions are, to some extent, standard in the literature.

(A1) For each i ≥ 1, the process {(y_it, x_it), t = 0, ±1, ±2, . . . } is stationary and β-mixing. Let β_i(j) denote the β-mixing coefficient of the process {(y_it, x_it), t = 0, ±1, ±2, . . . }.⁸ Assume that there exist constants a ∈ (0, 1) and B ≥ 0 such that sup_{i≥1} β_i(j) ≤ Ba^j. Assume further that the processes {(y_it, x_it), t = 0, ±1, ±2, . . . } are independent across i.

(A2) There exists a constant M such that sup_{i≥1} ∥x_i1∥ ≤ M (a.s.).
⁸For any stationary process {X_t, t = 0, ±1, ±2, . . . }, the β-mixing coefficient is defined as

β(k) := (1/2) sup { ∑_{i=1}^I ∑_{j=1}^J |P(A_i ∩ B_j) − P(A_i)P(B_j)| : {A_i}_{i=1}^I is any finite partition in A_{−∞}^0, {B_j}_{j=1}^J is any finite partition in A_k^∞ },

where A_i^j is the σ-field generated by X_i, . . . , X_j for i < j.
(A3) The parameter sets A ⊂ R and B ⊂ R^p are compact. For each i ≥ 1, α_i0 is an interior point of A, and β_0 is an interior point of B.

The distribution of (y_it, x_it) is allowed to depend on i. Define u_it := y_it − α_i0 − x_it′β_0. Condition (A1) implies that the process {(u_it, x_it), t = 0, ±1, ±2, . . . } is stationary and β-mixing for each i with β-mixing coefficient β_i(·), and independent across i. Let F_i(u|x) denote the conditional distribution function of u_it given x_it = x. We assume that F_i(u|x) has a density f_i(u|x). Let f_i(u) denote the marginal density of u_it. To cope with the time series dependence, we further assume that for each i ≥ 1 and j = ±1, ±2, . . . , there exists a joint conditional density f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j}) of (u_i1, u_{i,1+j}) given (x_i1, x_{i,1+j}) = (x_1, x_{1+j}). Let f_{i,j}(u_1, u_{1+j}) denote the joint (unconditional) density of (u_i1, u_{i,1+j}). For two sequences of positive numbers {a_n} and {b_n}, we write a_n ≍ b_n when there exists a positive constant C such that C^{-1}b_n ≤ a_n ≤ Cb_n for every n.

(A4) The minimum eigenvalues of E[f_i(0|x_i1)(1, x_i1′)′(1, x_i1′)] are bounded away from zero uniformly over i ≥ 1.

(A5) (a) For each i ≥ 1, f_i(u|x) is r times continuously differentiable with respect to u for each fixed x, where r ≥ 4 is an even integer. Let f_i^{(j)}(u|x) := (∂^j/∂u^j) f_i(u|x) for j = 0, 1, . . . , r, where f_i^{(0)}(u|x) stands for f_i(u|x). (b) There exists a constant C_f such that |f_i^{(j)}(u|x)| ≤ C_f uniformly over (u, x) for all j = 0, 1, . . . , r and i ≥ 1. (c) inf_{i≥1} f_i(0) > 0. (d) For each i ≥ 1 and j = ±1, ±2, . . . , f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j}) is r times continuously differentiable with respect to (u_1, u_{1+j}) for each fixed (x_1, x_{1+j}). Let

f_{i,j}^{(k,l)}(u_1, u_{1+j}|x_1, x_{1+j}) := ∂^{k+l} f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j})/∂u_1^k ∂u_{1+j}^l,

where f_{i,j}^{(0,0)}(u_1, u_{1+j}|x_1, x_{1+j}) stands for f_{i,j}(u_1, u_{1+j}|x_1, x_{1+j}). (e) There exists a constant C_f′ such that |f_{i,j}^{(k,l)}(u_1, u_{1+j}|x_1, x_{1+j})| ≤ C_f′ uniformly over (u_1, u_{1+j}, x_1, x_{1+j}) for all 0 ≤ k, l ≤ r, i ≥ 1 and j = ±1, ±2, . . . . (f) There exists an integrable function D : R → [0, ∞) such that |f_{i,j}^{(0,l)}(u_1, u_{1+j}|x_1, x_{1+j})| ≤ D(u_{1+j}) uniformly over (u_1, u_{1+j}, x_1, x_{1+j}) for all 0 ≤ l ≤ r, i ≥ 1 and j = ±1, ±2, . . . .

(A6) (a) K(·) is symmetric about the origin and continuously differentiable. (b) K(·) is an r-th order kernel:

∫_{−∞}^∞ K(u)du = 1,  ∫_{−∞}^∞ u^j K(u)du = 0, j = 1, . . . , r − 1,  ∫_{−∞}^∞ |u^r K(u)|du < ∞,

where r is given in condition (A5).

(A7) h_n ≍ T^{−c}, where 1/r < c < 1/3.
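To make the rate restriction in (A7) concrete (a small arithmetic check we add; it is not in the paper): with the minimal even order r = 4, the admissible exponents c lie strictly between 1/4 and 1/3, and any such c gives cr > 1, i.e., h_n^r = o(T^{-1}).

```python
# Admissible bandwidth exponents under (A7): h_n ~ T^{-c} with 1/r < c < 1/3.
def admissible(c, r):
    return 1.0 / r < c < 1.0 / 3.0

r = 4
checks = {c: admissible(c, r) for c in (0.20, 0.26, 0.30, 0.35)}
# any admissible c makes the smoothing bias h_n^r = T^{-cr} of smaller
# order than the O(T^{-1}) incidental-parameter bias, since c*r > 1
smoothing_bias_negligible = all(c * r > 1.0 for c in (0.26, 0.30))
```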
Put s_i := 1/f_i(0), γ_i := s_i E[f_i(0|x_i1)x_i1] and ν_i := f_i^{(1)}(0)γ_i − E[f_i^{(1)}(0|x_i1)x_i1].

(A8) (a) G_n := n^{-1} ∑_{i=1}^n E[f_i(0|x_i1)x_i1(x_i1′ − γ_i′)] is nonsingular for each n, and the limit G := lim_{n→∞} G_n exists and is nonsingular. (b) Let V_ni denote the covariance matrix of the term T^{-1/2} ∑_{t=1}^T {τ − I(u_it ≤ 0)}(x_it − γ_i). Assume that the limit V := lim_{n→∞} n^{-1} ∑_{i=1}^n V_ni exists and is nonsingular.

(A9) Define

ω_ni^{(1)} := ∑_{1≤|j|≤T−1} (1 − |j|/T) {τ f_i(0) − ∫_{−∞}^0 f_{i,j}(0, u)du},

ω_ni^{(2)} := ∑_{1≤|j|≤T−1} (1 − |j|/T) {τ E[f_i(0|x_i1)x_i1] − E[x_i1 ∫_{−∞}^0 f_{i,j}(0, u|x_i1, x_{i,1+j})du]},

ω_ni^{(3)} := ∑_{|j|≤T−1} (1 − |j|/T) Cov{I(u_i1 ≤ 0), I(u_{i,1+j} ≤ 0)}.

Assume that max_{1≤i≤n} |ω_ni^{(1)}| = O(1), max_{1≤i≤n} ∥ω_ni^{(2)}∥ = O(1), and the limit lim_{n→∞} n^{-1} ∑_{i=1}^n s_i(ω_ni^{(1)}γ_i − ω_ni^{(2)} + s_i ω_ni^{(3)}ν_i/2) exists.
Condition (A1) assumes that the observations are independent in the cross section dimension, but allows for weak dependence in the time dimension. Condition (A1) is basically the same as Condition 1 of Hahn and Kuersteiner (2011). The only difference is that they assumed a uniform exponential α-mixing condition in the time dimension. Generally, β-mixing is a slightly stronger requirement than α-mixing. The reason to assume a β-mixing condition here is to establish convergence rates of kernel type statistics. For that purpose, β-mixing is more tractable than α-mixing. We refer to Section 2.6 of Fan and Yao (2003) for some basic materials on mixing processes. The condition in (A2) that x_it is uniformly bounded can be relaxed at the expense of lengthier proofs. Condition (A3) is a conventional assumption in the literature. As Kato, Galvao, and Montes-Rojas (2011) noted, condition (A4) guarantees identification of each α_i0 and β_0. Conditions (A5) (a)-(c) state some restrictions on the conditional densities, and are standard in the QR literature (see Assumption 4 of Horowitz (1998) and condition (ii) of Angrist, Chernozhukov, and Fernandez-Val (2006, Theorem 3)). We require no less than four times differentiability of the conditional densities to make the smoothing bias negligible. Conditions (A5) (d)-(f) are mainly used to derive an explicit bias expression (see the proof of Lemma A.6). Condition (A6) corresponds to Assumption 5 of Horowitz (1998). Condition (A6) (b) requires K(·) to be a higher order kernel. The requirement that c < 1/3 is explained in Horowitz (1998). The proof of Theorem 3.1 below shows that the asymptotic bias of β̂ is max{O(T^{-1}), O(h_n^r)}, where the O(T^{-1}) term corresponds to the estimation error of each individual effect and the O(h_n^r) term corresponds to the smoothing bias. To make h_n^r = o(T^{-1}), we need cr > 1. A higher order kernel is needed to ensure condition (A7). Several higher order kernels are introduced in Muller (1984). An implicit assumption behind this condition is that T grows as fast as n. In practice, it is recommended to set h_n = o{(nT)^{-1/(2r)}} so that the smoothing bias h_n^r is o{(nT)^{-1/2}}. Conditions (A8) and (A9) are concerned with the asymptotic covariance matrix and the asymptotic bias, respectively. The exact form of the term V_ni is given by

V_ni = ∑_{|j|≤T−1} (1 − |j|/T) E[{τ − I(u_i1 ≤ 0)}{τ − I(u_{i,1+j} ≤ 0)}(x_i1 − γ_i)(x_{i,1+j} − γ_i)′].

If there is no time series dependence, i.e., for each i, the process {(y_it, x_it), t = 0, ±1, ±2, . . . } is i.i.d., then V_ni = τ(1 − τ)E[(x_i1 − γ_i)(x_i1 − γ_i)′], ω_ni^{(1)} = 0, ω_ni^{(2)} = 0 and ω_ni^{(3)} = τ(1 − τ).
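As an example of such a higher order kernel (our illustration; Muller (1984) contains others), K(u) = ½(3 − u²)φ(u), with φ the standard normal density, is a fourth-order kernel: it integrates to one, its first three moments vanish, and its fourth moment is −3 ≠ 0. Note that it takes negative values for |u| > √3, which is why (A6) does not require K(·) ≥ 0.

```python
import numpy as np

def K(u):
    # fourth-order Gaussian-based kernel: K(u) = 0.5 * (3 - u^2) * phi(u)
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return 0.5 * (3.0 - u**2) * phi

# check the moment conditions of (A6)(b) with r = 4 by numerical integration
u = np.linspace(-10.0, 10.0, 200_001)
du = u[1] - u[0]
moments = [float(np.sum(u**j * K(u)) * du) for j in range(5)]
```

Here `moments` comes out approximately as [1, 0, 0, 0, −3].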
3.2 Limiting distribution

We first show consistency of the estimator. Although the main concern is the distributional result, the consistency of the estimator is an important prerequisite since it guarantees that (α̂, β̂) is an interior point of A^n × B and thus satisfies the first order condition (2.4) with probability approaching one. As in Kato, Galvao, and Montes-Rojas (2011), we say that (α̂, β̂) is weakly consistent if max_{1≤i≤n} |α̂_i − α_i0| →_p 0 and β̂ →_p β_0. The proof of Proposition 3.1 is given in Appendix A.1.

Proposition 3.1. Assume that (log n)/√T → 0 and h_n → 0. Then, under conditions (A1)-(A6), (α̂, β̂) is weakly consistent.

We now present the limiting distribution of the FE-SQR estimator when n and T grow at the same rate. The proof of Theorem 3.1 is given in Appendix A.2.

Theorem 3.1. Assume that n/T → ρ for some ρ > 0. Then, under conditions (A1)-(A9), we have

√(nT)(β̂ − β_0) →_d N(√ρ b, G^{-1}V G^{-1}),   (3.1)

where

b := G^{-1} [ lim_{n→∞} { (1/n) ∑_{i=1}^n s_i (ω_ni^{(1)}γ_i − ω_ni^{(2)} + s_i ω_ni^{(3)}ν_i/2) } ].   (3.2)
The bias (3.2) is of a complicated form. In the next subsection, to gain some insight into the bias expression, we compare our bias formula with the Hahn-Newey formula in the simplest case where the observations are independent in the time dimension. By the theorem, one can see that the bias consists of three terms. As the proof shows, the first one comes from the correlation between α̂_i and the α_i-derivative of the α_i-score evaluated at the truth; the second one comes from the correlation between α̂_i and the α_i-derivative of the β-score evaluated at the truth; the third one comes from the variance of α̂_i. See the expression (A.16) and Lemmas A.5-A.6 in Appendix A.

Our proof strategy for Theorem 3.1 is substantially different from the functional expansion method of Li, Lindsay, and Waterman (2003) that many papers have exploited. The reason is that their technique is difficult to adapt to the situation where kernel smoothing is used. The proof strategy is perhaps closer in spirit to the iterative method of stochastic expansion displayed in Rilstone, Srivastava, and Ullah (1996), but still quite different from theirs. It should be pointed out that although the present objective function is smooth, the proof is still non-trivial since we have to control the smoothing effect and at the same time handle the problem of a diverging number of nuisance parameters. An additional complication arises since the observations are dependent in the time dimension. To cope with this complication, we derive new empirical process inequalities applicable to β-mixing processes, which are presented in Appendix C.
3.3 A heuristic comparison with the Hahn-Newey formula

In this subsection, we provide a heuristic comparison of our bias formula with the Hahn-Newey formula in the case that the observations are independent in the time dimension.⁹ Hahn and Newey (2004) gave the bias formula for the general fixed effects M-estimator with smooth objective functions in the case that the observations are independent in both the cross section and time dimensions (Hahn and Newey (2004) focused on the likelihood setting, but as they stated, the bias expression presented in Hahn and Newey (2004, p.1302) does not depend on the likelihood setting). Observe that when the observations are independent in the time dimension, the bias term b reduces to the simpler form

b = G^{-1} [ lim_{n→∞} { (1/n) ∑_{i=1}^n (τ(1 − τ)/2) s_i² ν_i } ].

⁹It is possible to compare our result with the bias expression derived in Hahn and Kuersteiner (2011) for the case that the observations are dependent in the time dimension. However, the purpose of this section is to provide an intuition behind our bias formula, and to make the argument simpler, we restrict our attention to the case that the observations are independent in the time dimension.

In what follows, with a heuristic approach, we show that our bias formula is identical to the Hahn-Newey formula applied to the (unsmoothed) QR objective function. In the present context, "heuristic" means that there is no formal justification in the mechanical application of the Hahn-Newey formula to the QR objective function as demonstrated below.
To begin with, we briefly review the Hahn-Newey formula. Let f(z_it; α, β) denote some smooth objective function to be maximized, where z_it is a random vector of conformal dimension, α is a scalar constant and β is a constant vector of conformal dimension. Let α_i0 and β_0 denote the true parameter values. Define

v_it(α, β) := (∂/∂α) f(z_it; α, β),  w_it(α, β) := (∂/∂β) f(z_it; α, β).

As in Hahn and Newey (2004), we use the notation v_it = v_it(α_i0, β_0) and w_it = w_it(α_i0, β_0). The Hahn-Newey bias formula is given by

Bias = ρ × ( lim_{n→∞} (1/n) ∑_{i=1}^n I_i )^{-1} ( lim_{n→∞} (1/n) ∑_{i=1}^n b_i ),

where ρ := lim_{n→∞}(n/T), and I_i and b_i are defined as

I_i := −E[w_itβ] + E[w_itα]E[v_itβ]/E[v_itα],

b_i := E[w_itα v_it]/E[v_itα] − E[w_itα]E[v_itα v_it]/E[v_itα]² + (E[v_it²]/(2E[v_itα]²)){E[w_itαα] − E[w_itα]E[v_itαα]/E[v_itα]} =: b_1i + b_2i + b_3i.

The terms such as v_itα are defined as v_itα = ∂v_it(α, β)/∂α|_{(α,β)=(α_i0,β_0)}, and so on. Note that we have changed the original order of terms in b_i to make the comparison more comprehensible. Here, b_1i, b_2i and b_3i correspond to what we have called the sources of the bias in the previous subsection. Also, we have slightly changed the original notation to avoid overloading the notation.
We now turn to the QR case. Assume without loss of generality that α_i0 = 0 and β_0 = 0. The objective function is f(z_it; α, β) = −ρ_τ(u_it − α − x_it′β) with z_it = (u_it, x_it). The scores would be

v_it(α, β) = τ − I(u_it ≤ α + x_it′β),  w_it(α, β) = {τ − I(u_it ≤ α + x_it′β)}x_it.

The scores for the QR objective function are not uniquely determined since the check function ρ_τ(u) is not differentiable at u = 0. However, we shall intuitively calculate the Hahn-Newey bias formula for the present scores based on some heuristic rule.¹⁰ The scores include the indicator function, so we cannot compute, for instance, E[v_itα] as it is. Instead, we shall compute E[v_itα] as ∂E[v_it(α, β)]/∂α|_{(α,β)=(α_i0,β_0)}. Using this heuristic rule, we can obtain

E[v_itα] = −f_i(0),  E[v_itβ] = E[w_itα] = −E[f_i(0|x_it)x_it],  E[v_itαα] = −f_i^{(1)}(0),  E[w_itβ] = −E[f_i(0|x_it)x_it x_it′],  E[w_itαα] = −E[f_i^{(1)}(0|x_it)x_it],

and E[v_it²] = τ(1 − τ). The calculation of the terms E[v_itα v_it] and E[w_itα v_it] is not straightforward. We make use of the fact that E[v_itα v_it] = E[∂v_it²/∂α]/2 and E[w_itα v_it] = E[(∂v_it²/∂α)x_it]/2. Exchanging the derivative and the expectation, we can obtain

E[v_itα v_it] = ((1 − 2τ)/2) f_i(0),  E[w_itα v_it] = ((1 − 2τ)/2) E[f_i(0|x_it)x_it].

Substituting these results into the Hahn-Newey formula, we can obtain

I_i = E[f_i(0|x_it)x_it x_it′] − E[f_i(0|x_it)x_it]E[f_i(0|x_it)x_it′]/f_i(0) = E[f_i(0|x_it)x_it(x_it′ − γ_i′)],

b_1i = ((1 − 2τ)/2)·E[f_i(0|x_it)x_it]/(−f_i(0)) = ((2τ − 1)/2)·γ_i,

b_2i = E[f_i(0|x_it)x_it]·((1 − 2τ)/2)·f_i(0)/f_i(0)² = ((1 − 2τ)/2)·γ_i,

b_3i = (τ(1 − τ)/2)·{−E[f_i^{(1)}(0|x_it)x_it] + E[f_i(0|x_it)x_it]f_i^{(1)}(0)/f_i(0)}/f_i(0)² = (τ(1 − τ)/2)·s_i²ν_i.

Therefore, the Hahn-Newey formula reduces to our formula. Note that in the case that the observations are independent in the time dimension, b_1i and b_2i cancel out, so that the resulting bias is of a simple form.
Of course, the above calculation is not formal because of the non-differentiability of the QR objective function, or more precisely, because the Hahn-Newey bias formula depends on the higher order stochastic expansion of the scores, while that for the QR objective function is non-standard, as discussed in the Introduction. In the present paper, the smoothing is employed as a device to make the above heuristic calculation rigorous.
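The key heuristic step above, E[v_itα v_it] = ((1 − 2τ)/2)f_i(0), can be checked directly in a simple special case (our check, not from the paper, assuming u_it ~ N(0, 1), for which E[(τ − I(u ≤ a))²] = τ² − 2τΦ(a) + Φ(a) in closed form):

```python
import math

def Phi(a):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def Ev2(a, tau):
    # E[(tau - I(u <= a))^2] for u ~ N(0,1)
    return tau**2 - 2.0 * tau * Phi(a) + Phi(a)

tau = 0.25
f0 = 1.0 / math.sqrt(2.0 * math.pi)   # f(0) for the standard normal
eps = 1e-6

# heuristic rule: E[v_a v] = (d/da) E[v(a)^2] / 2, evaluated at the truth a = 0
heuristic = (Ev2(eps, tau) - Ev2(-eps, tau)) / (2.0 * eps) / 2.0
closed_form = (1.0 - 2.0 * tau) / 2.0 * f0
```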
3.4 Bias correction

As stated in the literature, the problem with the limiting distribution of √(nT)(β̂ − β_0) not being centered at zero is that the usual confidence intervals based on the asymptotic approximation will be incorrect. In particular, even if b is small, the asymptotic bias can be of moderate size when the ratio n/T is large. In this subsection, we shall consider the bias correction to the FE-SQR estimator.
¹⁰Another way to implement this heuristic calculation is to use "generalized functions" as in Phillips (1991).
We first consider a one-step bias correction based on the analytic form of the asymptotic bias. Put û_it := y_it − α̂_i − x_it′β̂. The terms f_i := f_i(0), s_i, γ_i, ν_i and G can be estimated by

f̂_i := (1/T) ∑_{t=1}^T K_hn(û_it),  ŝ_i := 1/f̂_i,  γ̂_i := (ŝ_i/T) ∑_{t=1}^T K_hn(û_it)x_it,

ν̂_i := (1/(T h_n²)) ∑_{t=1}^T K^{(1)}(û_it/h_n)(x_it − γ̂_i),  Ĝ_n := (1/(nT)) ∑_{i=1}^n ∑_{t=1}^T K_hn(û_it)x_it(x_it′ − γ̂_i′),

where K^{(1)}(u) = dK(u)/du. The estimation of the terms ω_ni^{(1)}, ω_ni^{(2)} and ω_ni^{(3)} is a more delicate issue, since these terms involve sums over time lags. As in Hahn and Kuersteiner (2011), we make use of truncation. Put

f_i(j) := ∫_{−∞}^0 f_{i,j}(0, u)du,  ϕ_i(j) := E[x_i1 ∫_{−∞}^0 f_{i,j}(0, u|x_i1, x_{i,1+j})du],  ϱ_i(j) := E[I(u_i1 ≤ 0)I(u_{i,1+j} ≤ 0)].

Since ϕ_i(j) ≈ E[K_hn(u_i1)I(u_{i,1+j} ≤ 0)x_i1], it can be estimated by

ϕ̂_i(j) := (1/T) ∑_{t=max{1,−j+1}}^{min{T,T−j}} K_hn(û_it)I(û_{i,t+j} ≤ 0)x_it.

Similarly, f_i(j) can be estimated by

f̂_i(j) := (1/T) ∑_{t=max{1,−j+1}}^{min{T,T−j}} K_hn(û_it)I(û_{i,t+j} ≤ 0).

The term ϱ_i(j) can be estimated by its sample analogue:

ϱ̂_i(j) := (1/T) ∑_{t=max{1,−j+1}}^{min{T,T−j}} I(û_it ≤ 0)I(û_{i,t+j} ≤ 0).

Take a sequence m_n → ∞ sufficiently slowly. Then, ω_ni^{(1)}, ω_ni^{(2)} and ω_ni^{(3)} can be estimated by

ω̂_ni^{(1)} := ∑_{1≤|j|≤m_n} (1 − |j|/T) {τ f̂_i − f̂_i(j)},

ω̂_ni^{(2)} := ∑_{1≤|j|≤m_n} (1 − |j|/T) {τ f̂_i γ̂_i − ϕ̂_i(j)},

ω̂_ni^{(3)} := τ(1 − τ) + ∑_{1≤|j|≤m_n} (1 − |j|/T) {ϱ̂_i(j) − τ²}.

The bias term b is thus estimated by

b̂ := Ĝ_n^{-1} { (1/n) ∑_{i=1}^n ŝ_i (ω̂_ni^{(1)}γ̂_i − ω̂_ni^{(2)} + ŝ_i ω̂_ni^{(3)}ν̂_i/2) }.

Note that there is no need to compute the terms τf̂_i and τf̂_iγ̂_i in ω̂_ni^{(1)} and ω̂_ni^{(2)}, respectively, as they are canceled out in the difference ω̂_ni^{(1)}γ̂_i − ω̂_ni^{(2)}. Additionally, there is no need to use the same kernel and the same bandwidth to estimate β_0 and b. In particular, the kernel used in the estimation of b need not be of a higher order (see the remark after Theorem 3.2 below). However, to make the notation simpler, we do not introduce a new kernel and a new bandwidth. The next theorem shows that, under the same conditions as Theorem 3.1, the bias corrected estimator has a limiting normal distribution with mean zero and the same covariance matrix as in Theorem 3.1. The proof of the theorem is given in Appendix A.3.
2 n→ 8 such that
(log n)/(T h2
n
m
v nT (ˆ - ß0) d→ N (0, G-1V
ß1
G-1) .
:= ˆ ß -in We define the one-step bias corrected estimator by
ˆ b /T . In practice,
ˆω(1) ni ˆ ß1
there
Theorem 3.2. Let mn ) → 0 . Under the same conditions as Theorem 3.1, we have
R
e
m
ar
k
3.
1.
A
s
di
sc
u
ss
e
d
in
S
e
cti
o
n
3.
1,
th
e
or
d
er
of
th
e
k
er
n
el
u
s
e
d
to
c
o
n
st
ru
ct
th
e
F
ES
Q
R
e
sti
m
at
or
h
a
s
to
b
e
at
le
a
st
4.
H
o
w
e
v
er
,
th
e
or
d
er
of
th
e
k
er
n
el
u
s
e
d
to
c
o
n
st
ru
ct
th
e
bi
a
s
e
sti
m
at
or
c
a
n
b
e
2.
In
vi
e
w
of
th
e
pr
o
of
of
T
h
e
or
e
m
3.
2,
to
m
a
k
e
is of order o{ (log n)/( T h1/n 2)
ni
n
ˆ Vni
ˆ Gn p
)
} , which is satisfied if h≍ T- cwith
1/(2r + 1) = c < 1/3. In particular, r can b e 2 in
ˆ
of the bias term alone.
b consistent, it is sufficient that hestimation
r
n
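As an illustration, the pointwise ingredients of the correction can be computed directly from residuals and regressors. The following is a minimal numerical sketch, not the authors' code: the variable names, the second-order Epanechnikov kernel, and the synthetic data are our own assumptions, standing in for the paper's K, û_it and x_it.

```python
import numpy as np

def kernel_K(u):
    # second-order Epanechnikov kernel (an assumption; the paper also
    # allows higher-order kernels for the main estimator)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def ingredient_estimates(uhat, x, h):
    """fhat_i, shat_i, gammahat_i and Ghat_n from uhat[i,t], x[i,t,:]."""
    n, T, p = x.shape
    Kh = kernel_K(uhat / h) / h                       # K_h(uhat_it)
    f_hat = Kh.mean(axis=1)                           # fhat_i = T^-1 sum_t K_h
    s_hat = 1.0 / f_hat                               # shat_i = 1 / fhat_i
    gamma_hat = s_hat[:, None] * np.einsum('it,itp->ip', Kh, x) / T
    # Ghat_n = (nT)^-1 sum_i sum_t K_h(uhat_it) x_it (x_it - gammahat_i)'
    xc = x - gamma_hat[:, None, :]
    G_hat = np.einsum('it,itp,itq->pq', Kh, x, xc) / (n * T)
    return f_hat, s_hat, gamma_hat, G_hat

rng = np.random.default_rng(0)
n, T, p = 20, 50, 2
x = rng.normal(size=(n, T, p))                        # toy regressors
uhat = rng.normal(size=(n, T))                        # toy residuals
f_hat, s_hat, gamma_hat, G_hat = ingredient_estimates(uhat, x, h=0.5)
```

In a real application, `uhat` would be the FE-SQR residuals û_it and the full one-step correction would additionally require the truncated long-run terms ω̂_ni above.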
Remark 3.2 (Estimation of the asymptotic covariance matrix). As a byproduct of Theorem 3.2, we can obtain a consistent estimator of the asymptotic covariance matrix. The proof of Theorem 3.2 shows that Ĝ_n →p G. The matrix V can be consistently estimated by V̂_n := n^{-1}Σ_{i=1}^n V̂_ni, where

\[ \hat V_{ni} := \frac{\tau(1-\tau)}{T}\sum_{t=1}^T (x_{it}-\hat\gamma_i)(x_{it}-\hat\gamma_i)' + \sum_{1\le|j|\le m_n}\Big(1-\frac{|j|}{T}\Big)\frac{1}{T}\sum_{t=\max\{1,-j+1\}}^{\min\{T,T-j\}}\{\tau - I(\hat u_{it}\le 0)\}\{\tau - I(\hat u_{i,t+j}\le 0)\}(x_{it}-\hat\gamma_i)(x_{i,t+j}-\hat\gamma_i)'. \]

The proof of the consistency of V̂_n is similar to the proof of the consistency of b̂ given in Appendix A.3. Under the same conditions as Theorem 3.2, it is shown that Ĝ_n^{-1}V̂_nĜ_n^{-1} is consistent for the asymptotic covariance matrix G^{-1}VG^{-1}. Here, the term (1 − |j|/T) in the definition of V̂_ni can be replaced by 1 or (1 − |j|/m_n), and such a replacement does not affect the consistency property.
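The algebra of V̂_ni can be sketched for a single individual as follows; this is a hedged illustration with our own variable names and toy data, where `psi` stands for τ − I(û_it ≤ 0) and `xc` for x_it − γ̂_i.

```python
import numpy as np

def V_hat_i(uhat_i, xc_i, tau, m):
    """Per-individual covariance estimate: tau(1-tau) variance term plus
    truncated autocovariance terms for lags 1 <= |j| <= m."""
    T, p = xc_i.shape
    psi = tau - (uhat_i <= 0).astype(float)          # tau - I(uhat_it <= 0)
    V = tau * (1.0 - tau) * (xc_i.T @ xc_i) / T      # j = 0 term
    for j in range(1, m + 1):
        w = 1.0 - j / T                              # (1 - |j|/T) weight
        S = (psi[:-j, None] * xc_i[:-j]).T @ (psi[j:, None] * xc_i[j:]) / T
        V += w * (S + S.T)                           # lags +j and -j
    return V

rng = np.random.default_rng(1)
T, p, tau = 200, 2, 0.5
xc = rng.normal(size=(T, p))                         # toy x_it - gammahat_i
u = rng.normal(size=T)                               # toy residuals
V = V_hat_i(u, xc, tau, m=1)
```

Averaging `V_hat_i` over i gives V̂_n, and sandwiching with Ĝ_n^{-1} yields the covariance estimate of the remark.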
Remark 3.3. As discussed in Hahn and Kuersteiner (2011), for small T a default choice of m_n would be 1. For relatively large T, the choice of m_n could be significant. However, a standard theory of autocorrelation consistent covariance matrix estimation, such as that given in Andrews (1991), cannot be applied here, because Andrews (1991) restricted the moment functions to be smooth while the moment function of quantile regression is not smooth; the optimal choice of m_n is beyond the scope of the paper.
In the case that the observations are independent in the time dimension, Hahn and Newey (2004, Theorem 2) showed that, under suitable regularity conditions, the mean-zero asymptotic normality of the bias corrected MLE for smooth likelihood functions holds when n/T³ → 0. In the present case, although we do not exclude the possibility that Theorem 3.2 holds under n/T³ → 0, a formal justification would be quite involved and would require even more stringent conditions, so we do not pursue this direction here.
To be precise, consider a fixed effects estimator θ̂ of the common parameter θ₀ for some "standard" panel model (say, the panel probit model) with individual effects. If the objective function is smooth enough, and some moment condition is satisfied, then, as Hahn and Newey (2004) and subsequent studies suggested, θ̂ admits the expansion

\[ \hat\theta - \theta_0 = \frac{1}{nT}\sum_{i=1}^n\sum_{t=1}^T U_{it} + \frac{B}{T} + O_p(T^{-2}) + o_p\{(nT)^{-1/2}\}, \tag{3.3} \]

where the U_it are mean-zero random vectors and B is some constant which eventually contributes to the bias. Importantly, this expansion is valid without n/T → const. (but a suitable growth condition to ensure the consistency of the estimator is required). The condition that n/T → const. comes in so that the (nT)^{-1/2} and T^{-1} rates are balanced. If we "successfully" remove the bias B, which means that B itself can be estimated with bias of order T^{-1}, then the asymptotic normality (with mean zero) holds under the condition that T^{-2} is faster than (nT)^{-1/2}, i.e., n = o(T³) (this discussion is less formal, but it explains the main point of what Hahn and Newey (2004) did). To summarize, Hahn and Newey's Theorem 2 crucially depends on two observations:

(i) the fixed effects estimator admits the expansion of the form (3.3);

(ii) the bias term itself can be estimated with bias of order T^{-1}, i.e., B̂ = B + O_p{T^{-1} + (nT)^{-1/2}}.
In contrast, such a "clean" expansion does not hold for the FE-SQR estimator, because of the presence of the smoothing bias, and even higher order expansions of the FE-SQR objective function would be quite involved. Additionally, guaranteeing an expansion of the form (3.3) in the present case requires considerably stringent conditions. For example, there is a bias term of order h_n^r in the present case, and to make this term of order T^{-2}, r (the order of the kernel) should be at least 8. Implementing (ii) in the present case is also challenging, since the bias now depends on long-run covariances. In fact, even for standard smooth objective functions, Hahn and Kuersteiner (2011) only proved the asymptotic normality of the bias corrected estimator under the condition that n/T → const.

However, it is still true that the FE-SQR estimator β̂ admits the expansion of the form (3.3) with the O_p(T^{-2}) term replaced by o_p(T^{-1}). A careful inspection of the proof shows that this expansion holds under a slightly weaker growth condition on T than n/T → const., although T should not grow too slowly, in order to ensure sufficient convergence rates of some kernel type statistics (and the consistency of the FE-SQR estimator). Therefore, if we can estimate the bias term b consistently, the asymptotic normality (with mean zero) holds under a slightly weaker growth condition on T than n/T → const.
An undesirable feature of the analytic bias correction is that it depends on the inverse of the estimated densities f̂_i. When T is relatively small, on rare occasions f̂_i may be very close to zero for some i, which results in an unreasonable value of β̂¹. To avoid such a situation, a reasonable solution is to use trimming. Let Î_i := I(f̂_i > κ_n) for some sequence κ_n such that κ_n → 0 slowly. We propose to modify Ĝ_n and b̂ as

\[ \tilde G_n := \frac{1}{nT}\sum_{i=1}^n \hat I_i\sum_{t=1}^T K_{h_n}(\hat u_{it})x_{it}(x_{it}-\hat\gamma_i)', \qquad \tilde b := \tilde G_n^{-1}\Big\{\frac{1}{n}\sum_{i=1}^n \hat I_i\hat s_i\Big(\hat\omega_{ni}^{(1)} - \hat\gamma_i\hat\omega_{ni}^{(2)} + \frac{\hat s_i\hat\nu_i}{2}\hat\omega_{ni}^{(3)}\Big)\Big\}. \]

The validity of such a modification is simple: by the proof of Theorem 3.2, max_{1≤i≤n}|f̂_i − f_i| →p 0, and f_i > κ_n for all 1 ≤ i ≤ n eventually, so that the trimming is asymptotically negligible. Trimming is frequently employed in the literature; we refer to Section 4 for the choice of κ_n used in our implementation.

We next consider the half-panel jackknife method of Dhaene and Jochmans (2009) to correct the bias of β̂. Suppose for a moment that T is even. Partition {1, ..., T} into two subsets, S₁ := {1, ..., T/2} and S₂ := {T/2 + 1, ..., T}. Let β̂_{S_l} be the FE-SQR estimate based on the data {(y_it, x_it), 1 ≤ i ≤ n, t ∈ S_l} for l = 1, 2. The half-panel jackknife estimator is defined as β̂^{1/2} := 2β̂ − β̄^{1/2}, where β̄^{1/2} := (β̂_{S₁} + β̂_{S₂})/2. For simplicity, suppose for a moment that we use the same bandwidth to construct β̂ and β̂_{S_l} (l = 1, 2). Then, from the asymptotic representation of the FE-SQR estimator that appears in the proof of Theorem 3.1 (see (A.16) in the Appendix), it can be shown that, under the same conditions as Theorem 3.1 (including n/T → ρ), we have √(nT)(β̂^{1/2} − β₀) →d N(0, G^{-1}VG^{-1}).

The half-panel jackknife estimator would be attractive in both theoretical and practical senses. In fact, its validity does not require any extra condition. It would also be preferable from a practical point of view, since it does not require the nonparametric estimation of the bias term and at the same time is easy to implement.

One can find some other options to correct the bias of the FE-SQR estimator in the literature (see, for instance, Hahn and Newey, 2004; Arellano and Hahn, 2005; Bester and Hansen, 2009; Dhaene and Jochmans, 2009). Although an exhaustive comparative study of such methods is an interesting topic, it is beyond the scope of this paper, and we restrict our attention to the one-step bias correction and the half-panel jackknife method.
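The bias-cancellation mechanism behind the half-panel jackknife can be seen in a toy calculation. This is purely illustrative: the deterministic "estimator" below, with an exact b/T bias, is our own assumption and is not the FE-SQR estimator; it only shows why 2β̂ − (β̂_{S₁} + β̂_{S₂})/2 removes a T^{-1} bias term.

```python
def toy_estimate(T, b):
    # stand-in estimator around beta0 = 1 with a first-order bias b/T
    return 1.0 + b / T

def half_panel_jackknife(T, b):
    full = toy_estimate(T, b)          # estimate on {1, ..., T}
    s1 = toy_estimate(T // 2, b)       # estimate on {1, ..., T/2}
    s2 = toy_estimate(T // 2, b)       # estimate on {T/2 + 1, ..., T}
    return 2.0 * full - 0.5 * (s1 + s2)

# the half panels each carry bias 2b/T, so 2*(b/T) - (2b/T) cancels exactly
beta_half = half_panel_jackknife(100, 3.0)
```

In the stochastic setting the cancellation is of the leading T^{-1} bias term only, which is why the method needs the expansion discussed above.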
4 A Monte Carlo study

4.1 Design

In this section, we report a small Monte Carlo study to investigate the finite sample performance of the bias correction. A simple version of model (2.1) is considered in this study:

\[ y_{it} = \eta_i + x_{it} + (1 + 0.5x_{it})\epsilon_{it}, \qquad \epsilon_{it} = e_{it} + \theta e_{i,t-1}, \tag{4.1} \]

where x_it = 0.3η_i + z_it, η_i ~ i.i.d. U[0,1], z_it ~ i.i.d. χ²₃, and e_it ~ i.i.d. F with F = N(0,1) or χ²₃.¹¹ In this model, a_i0 = a_i0(τ) = η_i + F_ε^{-1}(τ) and β₀ = β₀(τ) = 1 + 0.5F_ε^{-1}(τ), where F_ε^{-1} is the quantile function of ε_it. We consider here the cases where n ∈ {50, 100, 500}, T ∈ {10, 20, 50, 100} and τ ∈ {0.25, 0.50, 0.75}. We use two different parameter values θ ∈ {0.5, 1} to control the autocorrelation in ε_it.¹²

As in Horowitz (1998), we use the fourth order kernel

\[ K(u) := \frac{105}{64}(1 - 5u^2 + 7u^4 - 3u^6)I(|u| \le 1). \]

¹¹The boundedness restriction on the regressors required by condition (A2) is not satisfied by the present specification. However, we believe that condition (A2) is mainly a technical condition and the simulation results should not be affected by the use of this distribution.

¹²In fact, we also examined cases with a different degree of heteroscedasticity and a different error distribution, but the results are qualitatively similar.
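The design and the kernel can be sketched in a few lines. This is a hedged illustration: the function names and seed are our own, and the exact form of model (4.1) is our reading of the text, so treat it as an assumption rather than a verbatim reproduction of the authors' experiment.

```python
import numpy as np

def K(u):
    # fourth-order kernel of Horowitz (1998): (105/64)(1 - 5u^2 + 7u^4 - 3u^6) on |u| <= 1
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0,
                    (105.0 / 64.0) * (1 - 5 * u**2 + 7 * u**4 - 3 * u**6), 0.0)

def simulate(n, T, theta, dist, rng):
    """Draw (y, x) from our reading of design (4.1)."""
    eta = rng.uniform(0.0, 1.0, size=n)              # eta_i ~ i.i.d. U[0,1]
    z = rng.chisquare(3, size=(n, T))                # z_it ~ i.i.d. chi2(3)
    x = 0.3 * eta[:, None] + z                       # x_it = 0.3*eta_i + z_it
    e = (rng.normal(size=(n, T + 1)) if dist == "normal"
         else rng.chisquare(3, size=(n, T + 1)))     # e_it ~ N(0,1) or chi2(3)
    eps = e[:, 1:] + theta * e[:, :-1]               # MA(1) time dependence
    y = eta[:, None] + x + (1.0 + 0.5 * x) * eps     # location-scale shift model
    return y, x

rng = np.random.default_rng(0)
y, x = simulate(50, 10, theta=0.5, dist="normal", rng=rng)
```

Note that K integrates to one over [−1, 1] and, being of fourth order, its second and third moments vanish.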
The number of repetitions for each Monte Carlo experiment is 1,000. The FE-SQR objective function is not necessarily convex and may have several local optima. Thus, the choice of the starting value is important when computing the estimate. We choose the standard QR estimate as a reasonable starting value, and use the quantreg package (Koenker, 2009) to compute the standard QR estimates. The estimators under consideration are the standard QR estimator β̂_KB defined by (2.2); the FE-SQR estimator β̂; the one-step bias corrected FE-SQR estimator β̂¹; and the half-panel jackknife estimator β̂^{1/2} described in Section 3. Additionally, we apply the proposed one-step bias correction to the standard QR estimator; we denote this estimator by β̂¹_KB. It is important to note that there is no theoretical justification for this estimator, but it is interesting to analyze its performance in terms of computation. Indeed, computation of the FE-SQR estimator is considerably slower than that of the standard QR estimator. Nonetheless, smoothing is crucial to formally characterize the bias expression for panel QR models, so we adopt the smoothing approach here to tackle this important problem.
The choice of the bandwidth is a common, and sometimes difficult, problem when kernel smoothing is used. In this experiment, we use h_{1n} = c₁s₁(nT)^{-1/7} to construct the FE-SQR estimate, where c₁ is some constant and s₁ is the sample standard deviation of ũ_it = y_it − â_{i,KB} − x_it'β̂_KB, and h_{2n} = c₂s₂T^{-1/5} to construct the bias estimate, where c₂ is some constant and s₂ is the sample standard deviation of û_it = y_it − â_i − x_it'β̂.¹³ These choices are only for simplicity; we leave the choice of the bandwidth as a future topic. Some intuition behind these choices, however, may be explained as follows. According to condition (A7) and the discussion in Section 3.1, the bandwidth used to construct the FE-SQR estimate should be of the same order as (nT)^{-c/2} with 1/4 < c < 1/3 when n and T grow at the same rate. Our choice h_{1n} satisfies this restriction, except for the fact that it depends on the data.¹⁴ On the other hand, by the remark after Theorem 3.2, the bandwidth used to construct b̂, say h_{2n}, should be such that h_{2n} ≍ T^{-c} with 1/9 ≤ c < 1/3 to make b̂ consistent for b. Our choice h_{2n} satisfies this restriction. The reason for the use of different bandwidths is that, in practical situations where n is much larger than T, the choice h_{1n} is inclined to be rather small for b̂, which results in less stability of the analytic bias correction.
Regarding the constants in h_{1n} and h_{2n}, we set c₁ = 1 and c₂ = 2. As discussed in the previous section, when T is small, on rare occasions f̂_i may be very close to zero for some i, which results in an unreasonable value of β̂¹. To avoid this, we use the trimming in our implementation, and set κ_n = 0.01. For the truncation level, we use m_n = 1, as discussed in Remark 3.3.

¹³(â_{1,KB}, ..., â_{n,KB}, β̂_KB) is defined as a solution to (2.2) in the present context. We also used the standard QR estimate as a starting value to compute the FE-SQR estimates.

¹⁴We do not further discuss the validity of the data dependent bandwidth.
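The two rule-of-thumb bandwidths can be written as a short helper. This is a hedged sketch with our own function names; only the formulas h_{1n} = c₁s₁(nT)^{-1/7} and h_{2n} = c₂s₂T^{-1/5} with c₁ = 1, c₂ = 2 are taken from the text.

```python
import numpy as np

def bandwidths(resid_kb, resid_fe, n, T, c1=1.0, c2=2.0):
    """Rule-of-thumb bandwidths: h1 for the FE-SQR estimate, h2 for the
    bias estimate, scaled by residual standard deviations s1 and s2."""
    s1 = resid_kb.std(ddof=1)              # sd of standard-QR residuals
    s2 = resid_fe.std(ddof=1)              # sd of FE-SQR residuals
    h1 = c1 * s1 * (n * T) ** (-1.0 / 7.0)  # h_1n = c1*s1*(nT)^(-1/7)
    h2 = c2 * s2 * T ** (-1.0 / 5.0)        # h_2n = c2*s2*T^(-1/5)
    return h1, h2

rng = np.random.default_rng(3)
h1, h2 = bandwidths(rng.normal(size=5000), rng.normal(size=5000), n=100, T=50)
```

Because h_{1n} shrinks in nT while h_{2n} shrinks only in T, h_{2n} stays larger when n is much bigger than T, which is the stability motivation given above.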
4.2 Results

Before looking at the Monte Carlo results, it is worthwhile to summarize our objectives in this experiment. The objectives are twofold. The first is to examine whether the bias correction methods work in finite samples; in view of the theoretical results, the biases of β̂¹ and β̂^{1/2} should be smaller than that of β̂, at least for moderate T. The second objective is to examine the effect of the bias correction on the precision of the estimator; the theoretical results suggest that the asymptotic variance of β̂¹ or β̂^{1/2} is the same as that of β̂. To this end, we present the results for the standard deviation (SD) and the root mean squared error (RMSE) of the estimators. The results for β̂_KB and β̂¹_KB are only for reference.¹⁵

Table 1 presents the results for the bias of the estimators for the normal innovation with θ = 0.5. For τ ∈ {0.25, 0.75} the estimators show noticeable biases, and their biases decrease as T increases while n is fixed. Additionally, it can be seen that β̂¹ is able to reduce the bias of β̂. The results for the half-panel jackknife estimator show that it only works for a few cases. The results for SD and RMSE for the normal distribution are reported in the middle and lower parts of Table 1. The results regarding SD show some inflation of the variability of the one-step bias corrected estimator in small samples; the inflation, however, decreases (in level) as n or T increases, and the same applies to the RMSE of the one-step bias corrected estimator. These two observations may reflect the additional variability, in the finite sample (especially for small T), caused by the nonparametric estimation of the bias term, which involves first derivatives of the densities. However, for a large n/T ratio, which is common in many applications, the RMSE of the bias corrected estimator is smaller than that of β̂: for T ∈ {10, 20} and τ ∈ {0.25, 0.75}, the RMSEs of β̂¹ are smaller than those of β̂.¹⁶ This is due to the fact that the bias term dominates the variance in these cases.

¹⁵Recall that it is not known whether a result analogous to Theorem 3.1 holds for β̂_KB.

¹⁶In additional simulation experiments, not reported here but available upon request, we have examined the cases where n = 1000 and T ∈ {10, 20}. In such cases, the RMSE of the one-step bias corrected estimator β̂¹ is even smaller than that of β̂.

Table 2 presents the results for bias, SD, and RMSE for the χ²₃ innovation with
θ = 0.5. In this case, for τ = 0.75 and for relatively small T, β̂_KB and β̂ are considerably biased. However, the one-step bias correction is able to substantially reduce the bias. The results also show that the half-panel jackknife estimator β̂^{1/2} does not work well in bias reduction, except for a few cases. The results for SD and RMSE for the χ²₃ case are presented in the middle and lower parts of Table 2, respectively. They show that both the SD and RMSE of β̂¹ increase substantially for small panel data. However, the variance inflation decreases as either n or T increases; moreover, as discussed previously, for a large n/T ratio the bias term dominates the RMSE, and we can observe a reduction in RMSE in such cases: for example, for τ = 0.75, n = 500 and T ∈ {10, 20}, the RMSEs of β̂¹ are smaller than those of β̂.

The results for bias, SD, and RMSE for the θ = 1 case are presented in Tables 3 and 4 for the normal and χ²₃ innovations, respectively. Basically, they parallel those for θ = 0.5. Overall, in our limited examples, we may conclude that the one-step bias correction is able to substantially reduce the bias in many cases, although it slightly increases the variability in small samples, partly because of the nonparametric estimation of the bias term.

As a final remark, we note that, in our simulation experiments, the analytic bias correction presented in Section 3.3 somehow works for the unsmoothed estimator. This is not totally surprising, because the two estimators β̂ and β̂_KB naturally share a common structure. In practice, it might be an option to apply the bias formula given in Section 3.3 to the unsmoothed estimator. However, it is again noted that there is no formal justification for doing so.
5 Concluding Remarks

In this paper, we have shown that the fixed effects smoothed quantile regression estimator for panel QR models has a limiting normal distribution with a bias in the mean when n and T grow at the same rate. We have further considered two methods of correcting the bias of the estimator, namely an analytic bias correction and the half-panel jackknife method of Dhaene and Jochmans (2009), and theoretically shown that they eliminate the bias in the limiting distribution. Importantly, our results allow for dependence in the time dimension and cover practical situations. The contribution of this paper is to open a new door to the bias correction of estimators of common parameters in panel QR models, which we believe is an important research area.

Many issues remain to be investigated. Substantial differences from smooth nonlinear panel models are the facts that the FE-SQR estimator requires the choice of a bandwidth and that its asymptotic bias consists of nonparametric objects. In particular, the estimation of the bias term is more complicated than in smooth nonlinear panel models, and the choice of the bandwidth is a topic to be further investigated in future work. Moreover, in this paper, we have suggested a consistent estimator for the asymptotic covariance matrix. However, as observed in the simulation experiments, the inflation of the empirical variance of the analytically bias corrected estimator in small samples might be problematic for practical inference, which we conjecture could be overcome by making suitable modifications to stabilize the analytic bias correction. Extensive numerical studies, such as those of Buchinsky (1995) for the cross section case, will be helpful to determine a good way of making practical inference in panel quantile regression models, which will be pursued elsewhere.
A Proofs

For any triangular sequence {a_ni} and any sequence ε_n → 0, we use the notation a_ni = ō(ε_n) if max_{1≤i≤n}|a_ni| = o(ε_n). We also define Ō(·), ō_p(·) and Ō_p(·) in a similar fashion. For x, y ∈ ℝ, we use the notation x ∨ y = max{x, y}. For x ∈ ℝ, [x] denotes the greatest integer not exceeding x.
A.1 Proof of Proposition 3.1

Put

\[ M_{ni}(a, \beta) := T^{-1}\sum_{t=1}^T (y_{it} - a - x_{it}'\beta)\{\tau - G_{h_n}(y_{it} - a - x_{it}'\beta)\}, \]

\[ \tilde M_{ni}(a, \beta) := T^{-1}\sum_{t=1}^T (y_{it} - a - x_{it}'\beta)\{\tau - I(y_{it} \le a + x_{it}'\beta)\}, \]

and Δ_i(a, β) := E[M̃_ni(a, β) − M̃_ni(a_i0, β₀)]. Without loss of generality, we may assume that a_i0 = 0 and β₀ = 0. Let ∥·∥₁ denote the ℓ₁ norm. For δ > 0, define N_δ := {(a, β) : |a| + ∥β∥₁ ≤ δ} and let N_δ^c be its complement in ℝ^{p+1}. We first show that for any δ > 0,

\[ \inf_{(a,\beta)\in N_\delta^c}\,\min_{i\ge 1}\, \Delta_i(a, \beta) > 0. \tag{A.1} \]

To see this, use the identity of Knight (1998) to obtain

\[ \Delta_i(a, \beta) = E\Big[\int_0^{a + x_{i1}'\beta}\{F_i(s|x_{i1}) - \tau\}\,ds\Big], \]

from which we can see, by differentiation, that each map (a, β) ↦ Δ_i(a, β) is convex. By condition (A5), we can expand Δ_i(a, β) uniformly over (a, β) such that |a| + ∥β∥₁ ≤ δ and over i ≥ 1 as

\[ \Delta_i(a, \beta) = (a, \beta')E[f_i(0|x_{i1})(1, x_{i1}')'(1, x_{i1}')](a, \beta')' + o(\delta^2), \quad \delta \to 0. \]

By condition (A4), there exists a positive constant c independent of i such that

\[ (a, \beta')E[f_i(0|x_{i1})(1, x_{i1}')'(1, x_{i1}')](a, \beta')' \ge c(|a| + \|\beta\|_1)^2, \quad \forall i \ge 1. \]

Thus, there exists a positive constant δ₀ such that for any 0 < δ ≤ δ₀,

\[ \inf_{|a| + \|\beta\|_1 = \delta}\,\min_{i\ge 1}\,\Delta_i(a, \beta) > 0. \tag{A.2} \]

Now, pick any 0 < δ ≤ δ₀ and any (a, β) ∈ N_δ^c. Take r = δ/(|a| + ∥β∥₁), ã = ra and β̃ = rβ, so that |ã| + ∥β̃∥₁ = δ. Because of the convexity of the map r ↦ Δ_i(ra, rβ), we have Δ_i(a, β) ≥ r^{-1}Δ_i(ã, β̃). Thus, by (A.2), (A.1) holds for 0 < δ ≤ δ₀. It is obvious that (A.1) holds for any δ > 0, since the left side of (A.1) is nondecreasing in δ > 0.
Next, we shall show that

\[ \max_{1\le i\le n}\,\sup_{(a,\beta)\in\mathcal{A}\times\mathcal{B}}\, |M_{ni}(a,\beta) - M_{ni}(0,0) - \Delta_i(a,\beta)| \stackrel{p}{\to} 0. \tag{A.3} \]

Observe that

\[ M_{ni}(a,\beta) - M_{ni}(0,0) - \Delta_i(a,\beta) = \{M_{ni}(a,\beta) - \tilde M_{ni}(a,\beta)\} - \{M_{ni}(0,0) - \tilde M_{ni}(0,0)\} + \{\tilde M_{ni}(a,\beta) - \tilde M_{ni}(0,0) - \Delta_i(a,\beta)\}. \]

By the proof of Horowitz (1998, Lemma 1), the first and second terms are O(h_n) uniformly over (a, β) ∈ A × B and 1 ≤ i ≤ n. It remains to prove that the third term is o_p(1) uniformly over (a, β) ∈ A × B and 1 ≤ i ≤ n. This can be shown in a similar way to the proof of (A.2) in Kato, Galvao, and Montes-Rojas (2011) (see also their Remark A.1), with a suitable change according to the fact that the observations are now dependent in the time dimension. Since most of the arguments are similar to the proof of their (A.2), we only state the point to be changed: the place where they apply the Marcinkiewicz-Zygmund inequality to evaluate both terms of their (A.4). Instead of the Marcinkiewicz-Zygmund inequality, we now apply a Bernstein type inequality for β-mixing processes (see Corollary C.1 below), with the evaluation of the variance term by Lemma C.2. Because of the exponential β-mixing property (condition (A1)) and the uniform boundedness of x_it (condition (A2)), taking s = 2 log n and q = [√T] in Corollary C.1 and using the fact that (log n)/√T → 0, we have, for each fixed ε > 0,

\[ P\Big\{\sup_{(a,\beta)\in\mathcal{A}\times\mathcal{B}} |\tilde M_{ni}(a,\beta) - \tilde M_{ni}(0,0) - \Delta_i(a,\beta)| > \epsilon\Big\} \le 2\big\{n^{-2} + \sqrt{T}\,\beta([\sqrt{T}])\big\}, \]

where β(·) denotes the β-mixing coefficient. The right side is ō(n^{-1}), so by the union bound we have

\[ \max_{1\le i\le n}\,\sup_{(a,\beta)\in\mathcal{A}\times\mathcal{B}}\, |\tilde M_{ni}(a,\beta) - \tilde M_{ni}(0,0) - \Delta_i(a,\beta)| \stackrel{p}{\to} 0. \]
Therefore, we obtain (A.3). We shall now show that β̂ →p β₀ = 0. Pick any δ > 0 and put ε := inf_{(a,β)∈N_δ^c} min_{i≥1} Δ_i(a, β) > 0. Observe that when ∥β̂∥₁ > δ,

\[ n^{-1}\sum_{i=1}^n\{M_{ni}(\hat a_i, \hat\beta) - M_{ni}(0,0)\} \ge n^{-1}\sum_{i=1}^n \min_{(a,\beta)\in N_\delta^c\cap(\mathcal{A}\times\mathcal{B})}\{M_{ni}(a,\beta) - M_{ni}(0,0)\} \]
\[ \ge n^{-1}\sum_{i=1}^n \min_{(a,\beta)\in N_\delta^c\cap(\mathcal{A}\times\mathcal{B})}\{M_{ni}(a,\beta) - M_{ni}(0,0) - \Delta_i(a,\beta)\} + \inf_{(a,\beta)\in N_\delta^c}\min_{i\ge 1}\Delta_i(a,\beta) \]
\[ \ge -\max_{1\le i\le n}\sup_{(a,\beta)\in\mathcal{A}\times\mathcal{B}}|M_{ni}(a,\beta) - M_{ni}(0,0) - \Delta_i(a,\beta)| + \epsilon. \]

By definition, n^{-1}Σ_{i=1}^n M_ni(â_i, β̂) ≤ n^{-1}Σ_{i=1}^n M_ni(0, 0). Thus,

\[ P(\|\hat\beta\|_1 > \delta) \le P\Big\{\max_{1\le i\le n}\sup_{(a,\beta)\in\mathcal{A}\times\mathcal{B}}|M_{ni}(a,\beta) - M_{ni}(0,0) - \Delta_i(a,\beta)| \ge \epsilon\Big\} \to 0, \]

which implies that β̂ →p 0. We next prove max_{1≤i≤n}|â_i| →p 0. Pick any δ > 0 and put ε as before. Recall that â_i = arg min_{a∈A} M_ni(a, β̂). When |â_i| > δ for some 1 ≤ i ≤ n, for that i,

\[ 0 \ge M_{ni}(\hat a_i, \hat\beta) - M_{ni}(0, \hat\beta) = \min_{|a|>\delta,\,a\in\mathcal{A}}\{M_{ni}(a,\hat\beta) - M_{ni}(0,\hat\beta)\} \]
\[ = \min_{|a|>\delta,\,a\in\mathcal{A}}\big[\{M_{ni}(a,\hat\beta) - M_{ni}(0,0)\} - \{M_{ni}(0,\hat\beta) - M_{ni}(0,0)\}\big] \]
\[ \ge \min_{|a|>\delta,\,a\in\mathcal{A}}\{M_{ni}(a,\hat\beta) - M_{ni}(0,0) - \Delta_i(a,\hat\beta)\} + \epsilon - \{M_{ni}(0,\hat\beta) - M_{ni}(0,0) - \Delta_i(0,\hat\beta)\} - \Delta_i(0,\hat\beta) \]
\[ \ge -2\max_{1\le j\le n}\sup_{(a,\beta)\in\mathcal{A}\times\mathcal{B}}|M_{nj}(a,\beta) - M_{nj}(0,0) - \Delta_j(a,\beta)| - \max_{1\le j\le n}\Delta_j(0,\hat\beta) + \epsilon. \]

Thus, we obtain the inclusion relation

\[ \{|\hat a_i| > \delta \ \text{for some}\ 1\le i\le n\} \subset \Big\{\max_{1\le i\le n}\sup_{(a,\beta)\in\mathcal{A}\times\mathcal{B}}|M_{ni}(a,\beta) - M_{ni}(0,0) - \Delta_i(a,\beta)| \ge \frac{\epsilon}{3}\Big\} \cup \Big\{\max_{1\le i\le n}\Delta_i(0,\hat\beta) \ge \frac{\epsilon}{3}\Big\} =: A_{1n}\cup A_{2n}. \]

By (A.3), P(A_{1n}) → 0. On the other hand, since Δ_i(0, β̂) ≤ 2M∥β̂∥₁ for some constant M independent of i, the consistency of β̂ implies that P(A_{2n}) → 0. Therefore, we complete the proof.
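The identity of Knight (1998) invoked at the start of this proof can be checked numerically. The following is an illustrative sketch (the particular values of u, v and τ are arbitrary; a positive v is assumed for the simple integration grid):

```python
import numpy as np

# Knight's (1998) identity:
# rho_tau(u - v) - rho_tau(u) = -v*psi_tau(u) + int_0^v {I(u <= s) - I(u <= 0)} ds,
# where rho_tau(u) = u*(tau - I(u <= 0)) and psi_tau(u) = tau - I(u <= 0).

def rho(u, tau):
    # quantile-regression check function written in the "score" form
    return u * (tau - float(u <= 0))

def knight_rhs(u, v, tau, grid=200000):
    # midpoint Riemann sum for the integral term over [0, v], v > 0
    s = (np.arange(grid) + 0.5) * (v / grid)
    integral = float(((s >= u).astype(float) - float(u <= 0)).sum()) * (v / grid)
    return -v * (tau - float(u <= 0)) + integral

u, v, tau = 0.7, 1.3, 0.25
lhs = rho(u - v, tau) - rho(u, tau)
rhs = knight_rhs(u, v, tau)
```

The identity is what turns the expected objective difference Δ_i(a, β) into the integral of F_i(s|x_{i1}) − τ used above.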
A.2 Proof of Theorem 3.1

Throughout the proof, we assume all the conditions of Theorem 3.1. Recall the definitions of H_ni^{(1)}(a_i, β) and H_n^{(2)}(a, β) in Section 2. Put H̄_ni^{(1)}(a_i, β) := E[H_ni^{(1)}(a_i, β)] and H̄_n^{(2)}(a, β) := E[H_n^{(2)}(a, β)]. The proof of Theorem 3.1 consists of a series of lemmas. Lemmas A.1-A.3 below are used to derive a convenient expansion of β̂.

Lemma A.1. Take δ_{1n} → 0 and δ_{2n} → 0 such that max_{1≤i≤n}|â_i − a_i0| = O_p(δ_{1n}) and ∥β̂ − β₀∥ = O_p(δ_{2n}). Put δ_n := (δ_{1n} + δ_{2n})h_n^r + δ_{1n}³ + δ_{1n}δ_{2n} + δ_{2n}². Then, we have

\[ \hat a_i - a_{i0} = s_i H_{ni}^{(1)}(a_{i0}, \beta_0) - \gamma_i'(\hat\beta - \beta_0) - 2^{-1}s_i f_i^{(1)}(0)(\hat a_i - a_{i0})^2 - s_i\big[\{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{H_{ni}^{(1)}(\hat a_i,\hat\beta) - H_{ni}^{(1)}(a_{i0},\beta_0)\}\big] + \bar o_p(\delta_n), \tag{A.4} \]

and

\[ \hat\beta - \beta_0 = G_n^{-1}\Big\{-n^{-1}\sum_{i=1}^n H_{ni}^{(1)}(a_{i0},\beta_0)\gamma_i + H_n^{(2)}(a_0,\beta_0)\Big\} + G_n^{-1}\,n^{-1}\sum_{i=1}^n \gamma_i\big[\{H_{ni}^{(1)}(\hat a_i,\hat\beta) - H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\}\big] \]
\[ \qquad + G_n^{-1}\big[\{H_n^{(2)}(\hat a,\hat\beta) - H_n^{(2)}(a_0,\beta_0)\} - \{\bar H_n^{(2)}(\hat a,\hat\beta) - \bar H_n^{(2)}(a_0,\beta_0)\}\big] + G_n^{-1}\Big\{(2n)^{-1}\sum_{i=1}^n \nu_i(\hat a_i - a_{i0})^2\Big\} + O_p(\delta_n). \tag{A.5} \]

Proof of Lemma A.1. The consistency result shows that (â, β̂) satisfies the first order condition (2.4) with probability approaching one. The first order condition implies

\[ 0 = H_{ni}^{(1)}(a_{i0},\beta_0) + \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\} + \big[\{H_{ni}^{(1)}(\hat a_i,\hat\beta) - H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\}\big]. \]

Expanding H̄_ni^{(1)}(â_i, β̂) around (a_i0, β₀) and using Lemma B.1, we obtain (A.4). On the other hand, the first order condition also implies that

\[ 0 = H_n^{(2)}(a_0,\beta_0) + \{\bar H_n^{(2)}(\hat a,\hat\beta) - \bar H_n^{(2)}(a_0,\beta_0)\} + \big[\{H_n^{(2)}(\hat a,\hat\beta) - H_n^{(2)}(a_0,\beta_0)\} - \{\bar H_n^{(2)}(\hat a,\hat\beta) - \bar H_n^{(2)}(a_0,\beta_0)\}\big]. \tag{A.6} \]

Expanding H̄_n^{(2)}(â, β̂) around (a₀, β₀) and using Lemma B.1, we obtain

\[ \bar H_n^{(2)}(\hat a,\hat\beta) - \bar H_n^{(2)}(a_0,\beta_0) = -\frac{1}{n}\sum_{i=1}^n E[f_i(0|x_{i1})x_{i1}](\hat a_i - a_{i0}) - \Big\{\frac{1}{n}\sum_{i=1}^n E[f_i(0|x_{i1})x_{i1}x_{i1}']\Big\}(\hat\beta - \beta_0) - \frac{1}{2n}\sum_{i=1}^n E[f_i^{(1)}(0|x_{i1})x_{i1}](\hat a_i - a_{i0})^2 + O_p(\delta_n). \tag{A.7} \]

Plugging (A.4) into (A.7) yields that H̄_n^{(2)}(â, β̂) − H̄_n^{(2)}(a₀, β₀) is expanded as
\[ \bar H_n^{(2)}(\hat a,\hat\beta) - \bar H_n^{(2)}(a_0,\beta_0) = -\frac{1}{n}\sum_{i=1}^n H_{ni}^{(1)}(a_{i0},\beta_0)\gamma_i - G_n(\hat\beta - \beta_0) + \frac{1}{2n}\sum_{i=1}^n \nu_i(\hat a_i - a_{i0})^2 \]
\[ \qquad + \frac{1}{n}\sum_{i=1}^n \gamma_i\big[\{H_{ni}^{(1)}(\hat a_i,\hat\beta) - H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\}\big] + O_p(\delta_n). \tag{A.8} \]

Combining (A.8) and (A.6) leads to (A.5).
Put

\[ \hat f_i := T^{-1}\sum_{t=1}^T K_{h_n}(u_{it}), \qquad \tilde f_i^{(1)} := -(Th_n^2)^{-1}\sum_{t=1}^T K^{(1)}(u_{it}/h_n), \]
\[ \hat g_i := T^{-1}\sum_{t=1}^T K_{h_n}(u_{it})x_{it}, \qquad \tilde g_i^{(1)} := -(Th_n^2)^{-1}\sum_{t=1}^T K^{(1)}(u_{it}/h_n)x_{it}, \]

and define

\[ I_{ni}^{(1)} := \{H_{ni}^{(1)}(\hat a_i,\hat\beta) - H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\}, \]
\[ I_n^{(2)} := \{H_n^{(2)}(\hat a,\hat\beta) - H_n^{(2)}(a_0,\beta_0)\} - \{\bar H_n^{(2)}(\hat a,\hat\beta) - \bar H_n^{(2)}(a_0,\beta_0)\}. \]

Lemma A.2. We have

\[ I_{ni}^{(1)} = -\{(\hat f_i - E[\hat f_i]) - h_n(\tilde f_i^{(1)} - E[\tilde f_i^{(1)}])\}(\hat a_i - a_{i0}) + \bar o_p\big[\{(\log n)/(Th_n^{1/2})\}^{1/2}\delta_{2n} + \{(\log n)/(Th_n^3)\}^{1/2}\delta_{1n}^2\big], \tag{A.9} \]

and

\[ I_n^{(2)} = -n^{-1}\sum_{i=1}^n\{(\hat g_i - E[\hat g_i]) - h_n(\tilde g_i^{(1)} - E[\tilde g_i^{(1)}])\}(\hat a_i - a_{i0}) + \bar o_p\big[\{(\log n)/(Th_n^{1/2})\}^{1/2}\delta_{2n} + \{(\log n)/(Th_n^3)\}^{1/2}\delta_{1n}^2\big], \tag{A.10} \]

where δ_{1n} and δ_{2n} are given in Lemma A.1.

Proof of Lemma A.2. We only prove (A.9), since the proof of (A.10) is analogous. Observe that I_ni^{(1)} is decomposed as J_i1 + J_i2, where

\[ J_{i1} := \{H_{ni}^{(1)}(\hat a_i,\hat\beta) - H_{ni}^{(1)}(\hat a_i,\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\hat\beta) - \bar H_{ni}^{(1)}(\hat a_i,\beta_0)\}, \]
\[ J_{i2} := \{H_{ni}^{(1)}(\hat a_i,\beta_0) - H_{ni}^{(1)}(a_{i0},\beta_0)\} - \{\bar H_{ni}^{(1)}(\hat a_i,\beta_0) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\}. \]

By Taylor's theorem, J_i1 = {∂_β H_ni^{(1)}(â_i, β̃_i) − ∂_β H̄_ni^{(1)}(â_i, β̃_i)}'(β̂ − β₀), where for each i, β̃_i is on the line segment between β̂ and β₀. Now, by Lemma B.2, ∂_β H_ni^{(1)}(â_i, β̃_i) − ∂_β H̄_ni^{(1)}(â_i, β̃_i) = ō_p[{(log n)/(Th_n^{1/2})}^{1/2}], implying that J_i1 = ō_p[{(log n)/(Th_n^{1/2})}^{1/2}δ_{2n}]. Similarly, use Taylor's theorem to obtain

\[ J_{i2} = -\{(\hat f_i - E[\hat f_i]) - h_n(\tilde f_i^{(1)} - E[\tilde f_i^{(1)}])\}(\hat a_i - a_{i0}) + 2^{-1}\{\partial_a^2 H_{ni}^{(1)}(\tilde a_i,\beta_0) - \partial_a^2\bar H_{ni}^{(1)}(\tilde a_i,\beta_0)\}(\hat a_i - a_{i0})^2, \tag{A.11} \]

where for each i, ã_i is on the line segment between â_i and a_i0. By Lemma B.2, we have ∂_a²H_ni^{(1)}(ã_i, β₀) − ∂_a²H̄_ni^{(1)}(ã_i, β₀) = ō_p[{(log n)/(Th_n³)}^{1/2}], implying that the second term in (A.11) is ō_p[{(log n)/(Th_n³)}^{1/2}δ_{1n}²]. Therefore, we complete the proof.
Lemma A.3. max_{1≤i≤n}|â_i − a_i0| = O_p{(T/log n)^{-1/2}}.

Proof of Lemma A.3. By Lemmas A.1, A.2 and B.2, both I_ni^{(1)} and I_n^{(2)} are of order o_p{max_{1≤i≤n}|â_i − a_i0| + ∥β̂ − β₀∥}, so that by (A.5),

\[ \hat\beta - \beta_0 = G_n^{-1}\Big\{-n^{-1}\sum_{i=1}^n H_{ni}^{(1)}(a_{i0},\beta_0)\gamma_i + H_n^{(2)}(a_0,\beta_0)\Big\} + o_p\Big\{\max_{1\le i\le n}|\hat a_i - a_{i0}| + \|\hat\beta - \beta_0\|\Big\}. \]

By the exponential β-mixing property (condition (A1)), the uniform boundedness of x_it (condition (A2)), and the discussion following Corollary C.2 below, the variance of the term −n^{-1}Σ_{i=1}^n H_ni^{(1)}(a_i0, β₀)γ_i + H_n^{(2)}(a₀, β₀) is O{(nT)^{-1}}. Combining this with the fact that the expectation of that term is O(h_n^r) by Lemma B.1, we have

\[ \|\hat\beta - \beta_0\| = O_p\{(nT)^{-1/2} + h_n^r\} + o_p\Big\{\max_{1\le i\le n}|\hat a_i - a_{i0}|\Big\} = o_p\Big\{T^{-1/2} + \max_{1\le i\le n}|\hat a_i - a_{i0}|\Big\}. \]

Using this bound, together with Lemma A.2, we obtain by (A.4)

\[ \hat a_i - a_{i0} = s_i H_{ni}^{(1)}(a_{i0},\beta_0) + \bar o_p\Big\{T^{-1/2} + \max_{1\le i\le n}|\hat a_i - a_{i0}|\Big\}. \]

We now bound the term max_{1≤i≤n}|H_ni^{(1)}(a_i0, β₀)|. Observe that

\[ H_{ni}^{(1)}(a_{i0},\beta_0) = \{H_{ni}^{(1)}(a_{i0},\beta_0) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\} + \bar H_{ni}^{(1)}(a_{i0},\beta_0) = \{H_{ni}^{(1)}(a_{i0},\beta_0) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)\} + \bar O(h_n^r). \]

We wish to show that

\[ \max_{1\le i\le n}|H_{ni}^{(1)}(a_{i0},\beta_0) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)| = O_p\{(T/\log n)^{-1/2}\}. \tag{A.12} \]

Because each term in the sum H_ni^{(1)}(a_i0, β₀) is uniformly bounded by some constant independent of i and n, applying Corollary C.1 to that sum with s = 2 log n and q = [(T/log n)^{1/2}], and using Lemma C.2 to evaluate the variance term, we have

\[ P\{|H_{ni}^{(1)}(a_{i0},\beta_0) - \bar H_{ni}^{(1)}(a_{i0},\beta_0)| \ge \mathrm{const.}\times(T/\log n)^{-1/2}\} \le 2\big\{n^{-2} + (T\log n)^{1/2}\beta([(T/\log n)^{1/2}])\big\}, \]

where the constant is independent of i and n. Since the right side is ō(n^{-1}), we obtain (A.12) by the union bound. Therefore, we obtain the desired result.
By Lemmas A.1-A.3,

\[ \hat a_i - a_{i0} = s_i H_{ni}^{(1)}(a_{i0},\beta_0) - \gamma_i'(\hat\beta - \beta_0) - 2^{-1}s_i f_i^{(1)}(0)(\hat a_i - a_{i0})^2 - s_i\{(\hat f_i - E[\hat f_i]) - h_n(\tilde f_i^{(1)} - E[\tilde f_i^{(1)}])\}(\hat a_i - a_{i0}) + \bar o_p(T^{-1} + \|\hat\beta - \beta_0\|) \]
\[ = s_i H_{ni}^{(1)}(a_{i0},\beta_0) + \bar O_p\big\{\|\hat\beta - \beta_0\| + (\log n)/(Th_n^{1/2}) + (T/\log n)^{-3/2}\big\}, \tag{A.13} \]

where we have used Lemma B.2 to obtain the last equality. Put

\[ B_1 := n^{-1}\sum_{i=1}^n \gamma_i s_i\{(\hat f_i - E[\hat f_i]) - h_n(\tilde f_i^{(1)} - E[\tilde f_i^{(1)}])\}H_{ni}^{(1)}(a_{i0},\beta_0), \]
\[ B_2 := n^{-1}\sum_{i=1}^n s_i\{(\hat g_i - E[\hat g_i]) - h_n(\tilde g_i^{(1)} - E[\tilde g_i^{(1)}])\}H_{ni}^{(1)}(a_{i0},\beta_0), \qquad B_3 := (2n)^{-1}\sum_{i=1}^n s_i^2\nu_i\{H_{ni}^{(1)}(a_{i0},\beta_0)\}^2. \]

Plugging (A.13) into (A.9) and (A.10), and using Lemma B.2, we obtain

\[ n^{-1}\sum_{i=1}^n \gamma_i I_{ni}^{(1)} = -B_1 + o_p(T^{-1} + \|\hat\beta - \beta_0\|), \qquad I_n^{(2)} = -B_2 + o_p(T^{-1} + \|\hat\beta - \beta_0\|). \tag{A.14} \]

On the other hand, since max_{1≤i≤n}|H_ni^{(1)}(a_i0, β₀)| = O_p{(T/log n)^{-1/2}},

\[ (2n)^{-1}\sum_{i=1}^n \nu_i(\hat a_i - a_{i0})^2 = B_3 + o_p(T^{-1} + \|\hat\beta - \beta_0\|). \tag{A.15} \]

Plugging (A.14)-(A.15) into (A.5), we obtain

\[ \hat\beta - \beta_0 = G_n^{-1}\Big\{-n^{-1}\sum_{i=1}^n H_{ni}^{(1)}(a_{i0},\beta_0)\gamma_i + H_n^{(2)}(a_0,\beta_0) - B_1 - B_2 + B_3\Big\} + o_p(T^{-1} + \|\hat\beta - \beta_0\|). \tag{A.16} \]

The first term on the right side of (A.16) is expected to follow a central limit theorem; we establish this fact in Lemma A.4. On the other hand, B₁, B₂ and B₃ are expected to contribute to the bias; we establish the limiting behavior of these terms in Lemmas A.5 and A.6.
Lemma A.4. Recal l the de nition of V given in condition (A8). We have vn i=1
ß
it
, put zit := { t -Ghn( uit)+h ˜
Kh
Proof of Lemma A.4. For any fixed c ∈ R
i). Note that zit
Tt=1∑zi = 1v nT ni=1 Tt=1∑(zzit d→ N (0, c'V c ). Observe that by Lemma
hr n}
- 1/ 2 ∑n i=11
∑ it
t
vnTT t=1n
B.1,
,
- E[zi1]) + O { (nT )1/ 2
γdep ends on n. By the Cram´er-Wold
device, it suffices to show that (nT )∑i =1∑
and
the second term is o(1) b ecause of the fact that h=r o( - 1 - 1 /2) = o{ (nT )}
n
T
.
Put z̄_{it} := z_{it} − E[z_{i1}]. We first evaluate the variance of the term (nT)^{−1/2} ∑_{i=1}^n ∑_{t=1}^T z̄_{it}. By a standard calculation, uniformly over both x and i,

E[{τ − G_{h_n}(u_{i1})}² | x_{i1} = x] = τ² − 2τ E[G_{h_n}(u_{i1}) | x_{i1} = x] + E[G_{h_n}(u_{i1})² | x_{i1} = x] = τ(1 − τ) + O(h_n),   (A.17)

where the first conditional expectation is τ + O(h_n^r) because K(·) is an r-th order kernel, and the second is evaluated by the change of variables u = h_n v, which yields τ + O(h_n). In particular,

E[z̄²_{i1}] = E[{τ − G_{h_n}(u_{i1})}²{c'(x_{i1} − γ_i)}²] + Ō(h_n).

On the other hand, uniformly over (i, j) (j ≥ 1),

Cov(z_{i1}, z_{i,1+j}) = E[{τ − G_{h_n}(u_{i1})}{τ − G_{h_n}(u_{i,1+j})} c'(x_{i1} − γ_i) c'(x_{i,1+j} − γ_i)] + O(h_n^r),

and uniformly over both (x₁, x_{1+j}) and (i, j) (j ≥ 1),

E[{τ − G_{h_n}(u_{i1})}{τ − G_{h_n}(u_{i,1+j})} | x_{i1} = x₁, x_{i,1+j} = x_{1+j}]
 = −τ² + E[G_{h_n}(u_{i1}) G_{h_n}(u_{i,1+j}) | x_{i1} = x₁, x_{i,1+j} = x_{1+j}] + O(h_n^r)
 = −τ² + ∫∫ F_{i,j}(h_n v₁, h_n v₂ | x₁, x_{1+j}) K(v₁) K(v₂) dv₁ dv₂ + O(h_n^r)
 = −τ² + F_{i,j}(0, 0 | x₁, x_{1+j}) + O(h_n^r)
 = E[{τ − I(u_{i1} ≤ 0)}{τ − I(u_{i,1+j} ≤ 0)} | x_{i1} = x₁, x_{i,1+j} = x_{1+j}] + O(h_n^r),   (A.18)

where the fourth equality is due to the fact that K(·) is an r-th order kernel. Put z̃_{it} := {τ − I(u_{it} ≤ 0)} c'(x_{it} − γ_i). Then, by (A.17) and (A.18), Var(z_{i1}) = Var(z̃_{i1}) + Ō(h_n) and Cov(z_{i1}, z_{i,1+j}) = Cov(z̃_{i1}, z̃_{i,1+j}) + Ō(h_n^r) uniformly over (i, j) (j ≥ 1). Thus,

Var(T^{−1/2} ∑_{t=1}^T z_{it}) = ∑_{|j|≤T−1}(1 − |j|/T) Cov(z_{i1}, z_{i,1+|j|}) = Var(T^{−1/2} ∑_{t=1}^T z̃_{it}) + Ō(h_n + T h_n^r),

and h_n + T h_n^r → 0. Therefore, we have

Var{(nT)^{−1/2} ∑_{i=1}^n ∑_{t=1}^T z̄_{it}} = n^{−1} ∑_{i=1}^n Var(T^{−1/2} ∑_{t=1}^T z_{it}) = n^{−1} ∑_{i=1}^n Var(T^{−1/2} ∑_{t=1}^T z̃_{it}) + o(1) = c'(n^{−1} ∑_{i=1}^n V_{ni}) c + o(1) → c'Vc.
Put z̄_i^+ := T^{−1/2} ∑_{t=1}^T z̄_{it}. Observe that z̄₁^+, ..., z̄_n^+ are independent. Viewing that (nT)^{−1/2} ∑_{i=1}^n ∑_{t=1}^T z̄_{it} = n^{−1/2} ∑_{i=1}^n z̄_i^+, we check the Lyapunov condition for the right sum. By the previous result, it suffices to show that ∑_{i=1}^n E[|z̄_i^+|³] = o(n^{3/2}). By conditions (A2) and (A6), z̄_{it} is uniformly bounded. By the exponential β-mixing property (condition (A1)) and Theorem 3 of Yoshihara (1978),¹⁷ it is shown that E[|z̄_i^+|³] = Ō(1), which implies that ∑_{i=1}^n E[|z̄_i^+|³] = O(n) = o(n^{3/2}). Therefore, by the Lyapunov central limit theorem, we have n^{−1/2} ∑_{i=1}^n z̄_i^+ →_d N(0, c'Vc).
Lemma A.5. Recall the definition of ω^{(3)}_{ni} given in condition (A9). B₃ is expanded as

B₃ = T^{−1}·{(2n)^{−1} ∑_{i=1}^n s_i² ω^{(3)}_{ni} ν_i} + o_p(T^{−1}).

Proof of Lemma A.5. Put z_{it} := τ − G_{h_n}(u_{it}) + h_n K̃_{h_n}(u_{it}) and z̄_{it} := z_{it} − E[z_{i1}]. Then, by Lemma B.1 and the proof of the previous lemma, we have

E[{H^{(1)}_{ni}(a_{i0}, β₀)}²] = T^{−1} ∑_{|j|≤T−1}(1 − |j|/T) Cov(z_{i1}, z_{i,1+|j|}) + Ō(h_n^{2r})
 = T^{−1} ∑_{|j|≤T−1}(1 − |j|/T) Cov{I(u_{i1} ≤ 0), I(u_{i,1+|j|} ≤ 0)} + ō(T^{−1})
 = T^{−1} ω^{(3)}_{ni} + ō(T^{−1}).

Thus, E[B₃] = (n^{−1} ∑_{i=1}^n s_i² ω^{(3)}_{ni} ν_i)/(2T) + o(T^{−1}).
17. Theorem 3 of Yoshihara (1978) is stated in terms of α-mixing processes. However, since each α-mixing coefficient is bounded by the corresponding β-mixing coefficient, Theorem 3 of Yoshihara (1978) holds with α-mixing coefficients replaced by β-mixing coefficients.
We next evaluate the variance of any linear combination of B₃. For any fixed c ∈ R^p, put a_i := s_i² c'ν_i. Then,

Var{n^{−1} ∑_{i=1}^n a_i(T^{−1} ∑_{t=1}^T z_{it})²} = n^{−2} ∑_{i=1}^n a_i² Var{(T^{−1} ∑_{t=1}^T z_{it})²} ≤ 8n^{−2} ∑_{i=1}^n a_i² E[(T^{−1} ∑_{t=1}^T z̄_{it})⁴] + O(n^{−1} h_n^{4r}).   (A.19)

Since z̄_{it} is uniformly bounded, by the exponential β-mixing property (condition (A1)) and Theorem 3 of Yoshihara (1978), it is shown that E[(T^{−1/2} ∑_{t=1}^T z̄_{it})⁴] = Ō(1), so that the first term on the right side of (A.19) is O(n^{−1}T^{−2}). Therefore, we obtain the desired result.
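The quantity ω^{(3)}_{ni} is a Bartlett-weighted sum of autocovariances of the indicators I(u_{it} ≤ 0); its truncated sample analogue (the kind of plug-in used later in the proof of Theorem 3.2) can be sketched as follows. The function name, the demeaning of the indicators, and the truncation argument `m` are our own illustrative choices, not the paper's code:

```python
def omega3_hat(u, m):
    # Truncated Bartlett-weighted sum of sample autocovariances of I(u_t <= 0).
    # u: residual series for one individual; m: truncation lag (m_n), m < len(u).
    T = len(u)
    d = [1.0 if ut <= 0 else 0.0 for ut in u]
    mean = sum(d) / T
    def gamma(j):
        # Sample autocovariance at lag j, normalized by T.
        return sum((d[t] - mean) * (d[t + j] - mean) for t in range(T - j)) / T
    return gamma(0) + 2.0 * sum((1.0 - j / T) * gamma(j) for j in range(1, m + 1))

# Alternating residuals: strong negative lag-1 dependence of the indicators.
assert abs(omega3_hat([-1, 1, -1, 1, -1, 1, -1, 1], 1) - (-0.1328125)) < 1e-12
# Constant indicator series has zero autocovariance at every lag.
assert omega3_hat([-1.0] * 6, 2) == 0.0
```

The (1 − j/T) weights are exactly the triangular weights that appear in the displays above.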
Lemma A.6. Recall the definitions of ω^{(1)}_{ni} and ω^{(2)}_{ni} given in condition (A9). B₁ and B₂ are expanded as

B₁ = T^{−1}·{(2n)^{−1}(2τ − 1) ∑_{i=1}^n γ_i + n^{−1} ∑_{i=1}^n s_i ω^{(1)}_{ni} γ_i} + o_p(T^{−1}),
B₂ = T^{−1}·{(2n)^{−1}(2τ − 1) ∑_{i=1}^n γ_i + n^{−1} ∑_{i=1}^n s_i ω^{(2)}_{ni}} + o_p(T^{−1}).
Proof of Lemma A.6. We only prove the expansion for B₁. The expansion for B₂ can be shown in a similar way. Put z_{it} := τ − G_{h_n}(u_{it}) + h_n K̃_{h_n}(u_{it}) and z̄_{it} := z_{it} − E[z_{i1}]. Recall that E[z_{i1}] = Ō(h_n^r) by Lemma B.1. Observe that

E[(f̂_i − E[f̂_i])(T^{−1} ∑_{t=1}^T z_{it})] = T^{−2} ∑_{s=1}^T ∑_{t=1}^T Cov{K_{h_n}(u_{is}), z_{it}},

where K_{h_n}(u) := h_n^{−1} K(u/h_n). We evaluate the term E[K_{h_n}(u_{is}) z_{it}]. First,

E[K_{h_n}(u_{i1}) z_{i1}] = ∫_{−∞}^∞ K(u){τ − G(u)} f_i(uh_n) du + ∫_{−∞}^∞ u K(u)² f_i(uh_n) du
 = f_i(0){τ − ∫_{−∞}^∞ K(u) G(u) du + ∫_{−∞}^∞ u K(u)² du} + Ō(h_n).

Since K(·) is symmetric about the origin, the third term inside the brace is zero, and

∫_{−∞}^∞ K(u) G(u) du = [2^{−1} G(u)²]_{−∞}^∞ = 1/2.
Thus, we have E[K_{h_n}(u_{i1}) z_{i1}] = (τ − 1/2) f_i(0) + Ō(h_n). Second, for |j| ≥ 1,

E[K_{h_n}(u_{i1}) z_{i,1+j}] = τ ∫_{−∞}^∞ K(u₁) f_i(u₁h_n) du₁ − ∫_{−∞}^∞ K(u₁){∫_{−∞}^∞ G_{h_n}(u_{1+j}) f_{i,j}(u₁h_n, u_{1+j}) du_{1+j}} du₁ + Ō(h_n^r)
 = τ f_i(0) − ∫_{−∞}^0 f_{i,j}(0, u) du + O(h_n^r),

where the O(h_n^r) term is independent of i and j. Therefore, we have

E[(f̂_i − E[f̂_i])(T^{−1} ∑_{t=1}^T z_{it})] = T^{−1}(τ − 1/2) f_i(0) + T^{−1} ω^{(1)}_{ni} + ō(T^{−1}).   (A.20)
Similarly, observe that

E[(f̃^{(1)}_i − E[f̃^{(1)}_i])(T^{−1} ∑_{t=1}^T z_{it})] = T^{−2} h_n^{−2} ∑_{s=1}^T ∑_{t=1}^T E[K̃^{(1)}(u_{is}/h_n) z̄_{it}] + Ō(h_n^r).

We evaluate the term E[K̃^{(1)}(u_{is}/h_n) z_{it}]. First,

h_n^{−2} E[K̃^{(1)}(u_{i1}/h_n) z_{i1}] = h_n^{−1} ∫_{−∞}^∞ K̃^{(1)}(u){τ − G(u)} f_i(uh_n) du + h_n^{−1} ∫_{−∞}^∞ K̃^{(1)}(u) K̃(u) f_i(uh_n) du
 = −h_n^{−1} f_i(0) ∫_{−∞}^∞ u K(u)² du + Ō(1) = Ō(1),

where the second equality is due to the fact that, by integration by parts, ∫_{−∞}^∞ K̃^{(1)}(u) K̃(u) f_i(uh_n) du = −2^{−1} h_n ∫_{−∞}^∞ K̃(u)² f^{(1)}_i(uh_n) du = Ō(h_n), and that ∫_{−∞}^∞ u K(u)² du = 0 by the symmetry of K(·). Second, for |j| ≥ 1, integrating by parts and changing variables in the same way, it is shown that

h_n^{−2} E[K̃^{(1)}(u_{i1}/h_n) z_{i,1+j}] = O(h_n^{r−1}),

where the O(h_n^{r−1}) term is independent of i and j. Therefore, we have

h_n E[(f̃^{(1)}_i − E[f̃^{(1)}_i])(T^{−1} ∑_{t=1}^T z_{it})] = Ō(T^{−1} h_n + h_n^r) = ō(T^{−1}).   (A.21)

Combining (A.20) and (A.21), we obtain

E[B₁] = T^{−1}·{(2n)^{−1}(2τ − 1) ∑_{i=1}^n γ_i + n^{−1} ∑_{i=1}^n s_i ω^{(1)}_{ni} γ_i} + o(T^{−1}).   (A.22)

It remains to evaluate the variance of any linear combination of B₁. For any fixed c ∈ R^p, put a_i := s_i c'γ_i. Then,
Var{n^{−1} ∑_{i=1}^n a_i(f̂_i − E[f̂_i])(T^{−1} ∑_{t=1}^T z_{it})} = n^{−2} ∑_{i=1}^n a_i² Var{(f̂_i − E[f̂_i])(T^{−1} ∑_{t=1}^T z_{it})}
 ≤ 2n^{−2} ∑_{i=1}^n a_i² E[(f̂_i − E[f̂_i])²{T^{−1} ∑_{t=1}^T (z_{it} − E[z_{it}])}²] + 2n^{−2} ∑_{i=1}^n a_i² (E[z_{i1}])² E[(f̂_i − E[f̂_i])²].   (A.23)

By the proof of Lemma B.2 below, E[(f̂_i − E[f̂_i])²] = Ō{(T h_n)^{−1}}. Thus, by Lemma B.1, the second term on the right side of (A.23) is O(n^{−1} T^{−1} h_n^{2r−1}). Put w_{it} := K(u_{it}/h_n), w̄_{it} := w_{it} − E[w_{i1}] and z̄_{it} := z_{it} − E[z_{i1}]. Observe that

E[(f̂_i − E[f̂_i])²{T^{−1} ∑_{t=1}^T (z_{it} − E[z_{it}])}²] = T^{−4} h_n^{−2} ∑_{s=1}^T ∑_{t=1}^T ∑_{u=1}^T ∑_{v=1}^T E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}].

It is standard to show that E[w̄²_{i1} z̄²_{i1}] = Ō(h_n), ∑_{s≠t} E[w̄²_{is} z̄²_{it}] = Ō(T² h_n²), ∑_{s≠t} E[w̄_{is} z̄_{is} w̄_{it} z̄_{it}] = Ō(T² h_n²) and ∑_{s≠t} E[w̄_{is} w̄_{it} z̄_{is} z̄_{it}] = Ō(T² h_n²). We wish to show that

∑_{s,t,u:distinct} E[w̄²_{is} z̄_{it} z̄_{iu}] = Ō(T² h_n), ∑_{s,t,u:distinct} E[w̄_{is} w̄_{it} z̄_{it} z̄_{iu}] = Ō(T² h_n), ∑_{s,t,u:distinct} E[w̄_{is} w̄_{it} z̄²_{iu}] = Ō(T² h_n), ∑_{s,t,u,v:distinct} E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}] = Ō(T² h_n).   (A.24)

The first four equations are relatively standard to derive. Thus, we concentrate on deriving the last four equations. The evaluation below is similar to Yoshihara (1978, Theorem 1), but the situation is slightly different; for the sake of completeness, we provide a proof of the last four equations. In what follows, C denotes some constant independent of s, t, u, v and i, which may change from line to line.

Evaluation of ∑_{s,t,u:distinct} E[w̄²_{is} z̄_{it} z̄_{iu}]: Consider the case that s < t < u and u − t > t − s. In that case, for any δ > 0, E[|w̄²_{is} z̄_{it}|^{1+δ}] ≤ C h_n and E[|z̄_{iu}|^{1+δ}] ≤ C, so that by Lemma 1 of Yoshihara (1976),

∑_{s<t<u, u−t>t−s} |E[w̄²_{is} z̄_{it} z̄_{iu}]| ≤ C h_n^{1/(1+δ)} ∑_{s<t<u, u−t>t−s} β_i^{δ/(1+δ)}(u − t)
 ≤ C h_n^{1/(1+δ)} ∑_{s=1}^{T−2} ∑_{j=1}^{T−s−1} j β_i^{δ/(1+δ)}(j)   (j = u − t; k = t − s)
 ≤ C T h_n^{1/(1+δ)} ∑_{j=1}^∞ j β_i^{δ/(1+δ)}(j) = C T h_n^{1/(1+δ)}.

Similarly, we have ∑_{s<t<u, t−s>u−t} |E[w̄²_{is} z̄_{it} z̄_{iu}]| ≤ C T h_n^{1/(1+δ)}. Consider the case that t − s = u − t. In that case, E[|z̄_{it} z̄_{iu}|^{1+δ}] ≤ C and E[|(w̄²_{is} − E[w̄²_{i1}]) z̄_{it}|^{1+δ}] ≤ C h_n, so that by Lemma 1 of Yoshihara (1976),

∑_{s<t<u, t−s=u−t} |E[w̄²_{is} z̄_{it} z̄_{iu}]| ≤ ∑_{s<t<u, t−s=u−t} |E[(w̄²_{is} − E[w̄²_{i1}]) z̄_{it} z̄_{iu}]| + E[w̄²_{i1}] ∑_{s<t<u, t−s=u−t} |E[z̄_{it} z̄_{iu}]|
 ≤ C T h_n^{1/(1+δ)} + C h_n ∑_{s=1}^{T−2} ∑_{k=1}^{T−s−1} β_i^{δ/(1+δ)}(k) = C T² h_n + C T h_n^{1/(1+δ)}.

Since δ > 0 is arbitrary, taking δ sufficiently small, we have T h_n^{1/(1+δ)} = O(T² h_n), so that ∑_{s,t,u:distinct} E[w̄²_{is} z̄_{it} z̄_{iu}] = Ō(T² h_n).

Evaluation of ∑_{s,t,u:distinct} E[w̄_{is} w̄_{it} z̄_{it} z̄_{iu}] and ∑_{s,t,u:distinct} E[w̄_{is} w̄_{it} z̄²_{iu}]: By using the same argument as in the previous case, we have ∑_{s,t,u:distinct} E[w̄_{is} w̄_{it} z̄_{it} z̄_{iu}] = Ō(T² h_n) and ∑_{s,t,u:distinct} E[w̄_{is} w̄_{it} z̄²_{iu}] = Ō(T² h_n).

Evaluation of ∑_{s,t,u,v:distinct} E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}]: Consider the case that s < t < u < v and t − s = max{u − t, v − u}. In that case, for any δ > 0, E[|w̄_{is}|^{1+δ}] ≤ C h_n and E[|w̄_{it} z̄_{iu} z̄_{iv}|^{1+δ}] ≤ C h_n. Thus, by Lemma 1 of Yoshihara (1976),

∑_{s<t<u<v, t−s=max{u−t,v−u}} |E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}]| ≤ C h_n^{2/(1+δ)} ∑_{s<t<u<v, t−s=max{u−t,v−u}} β_i^{δ/(1+δ)}(t − s)
 ≤ C h_n^{2/(1+δ)} ∑_{s=1}^{T−3} ∑_{j=1}^{T−s−2} j² β_i^{δ/(1+δ)}(j) = C T h_n^{2/(1+δ)}.

Similarly, we have ∑_{s<t<u<v, v−u=max{t−s,u−t}} |E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}]| = C T h_n^{2/(1+δ)}. Consider now the case that s < t < u < v and u − t = max{t − s, v − u}. In that case, by Lemma 1 of Yoshihara (1976), for any δ > 0,

|E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}]| ≤ |Cov(w̄_{is} w̄_{it}, z̄_{iu} z̄_{iv})| + |E[w̄_{is} w̄_{it}]||E[z̄_{iu} z̄_{iv}]|
 ≤ C h_n^{2/(1+δ)} β_i^{δ/(1+δ)}(u − t) + C h_n^{2/(1+δ)} β_i^{δ/(1+δ)}(t − s) β_i^{δ/(1+δ)}(v − u).

Thus,

∑_{s<t<u<v, u−t=max{t−s,v−u}} |E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}]| ≤ C T h_n^{2/(1+δ)} + C h_n^{2/(1+δ)} ∑_{s=1}^{T−3} ∑_{j=1}^{T−s−2} ∑_{k=1}^j ∑_{l=1}^j β_i^{δ/(1+δ)}(k) β_i^{δ/(1+δ)}(l) ≤ C T h_n^{2/(1+δ)} + C T² h_n^{2/(1+δ)}.

Taking δ ∈ (0, 1], we have ∑_{s<t<u<v} E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}] = Ō(T² h_n). By repeating the same argument, we have ∑_{s,t,u,v:distinct} E[w̄_{is} w̄_{it} z̄_{iu} z̄_{iv}] = Ō(T² h_n).

We have shown that the first term on the right side of (A.23) is O(n^{−1} T^{−4} h_n^{−2} × T² h_n) = O(n^{−1} T^{−2} h_n^{−1}). Therefore, Var(c'B₁) = O(n^{−1} T^{−2} h_n^{−1}) = o{(nT)^{−1}}. Combining (A.22), we obtain the desired result.
By (A.16) and Lemmas A.4–A.6, ∥β̂ − β₀∥ = O_p{(nT)^{−1/2} + T^{−1}}, and

√(nT) G_n^{−1}(B₁ − B₂ + B₃) = √(n/T)·G_n^{−1}{n^{−1} ∑_{i=1}^n s_i(ω^{(1)}_{ni} γ_i − ω^{(2)}_{ni} + 2^{−1} s_i ω^{(3)}_{ni} ν_i)} + o_p(1) →_p ρ b,

where b is given in (3.2). Therefore, we obtain (3.1).
A.3 Proof of Theorem 3.2
It suffices to show that b̂ →_p b. To this end, we first show that

max_{1≤i≤n} |ŝ_i − s_i| →_p 0,  max_{1≤i≤n} ∥ν̂_i − ν_i∥ →_p 0,  Ĝ_n^{−1} = G_n^{−1} + o_p(1).   (A.25)

We use the notation in Appendix B below. Without loss of generality, we may assume that a_{i0} = 0 and β₀ = 0. Then ŝ_i can be written as ŝ_i = 1/f̂_i(â_i, β̂), where f̂_i(a, β) stands for f̂^{(0)}_i(a, β). By Lemma B.2, we have f̂_i(â_i, β̂) = E[f̂_i(a, β)]|_{a=â_i, β=β̂} + ō_p{(log n)/(T h_n)^{1/2}}. Observe that E[f̂_i(a, β)] = {E[f̂_i(a, β)] − E[f̂_i(0, 0)]} + E[f̂_i(0, 0)], E[f̂_i(0, 0)] = f_i(0) + Ō(h_n^r) = f_i(0) + ō{(log n)/(T h_n)^{1/2}}, and

|E[f̂_i(a, β)] − E[f̂_i(0, 0)]| = |E[∫_{−∞}^∞ K(u){f_i(uh_n + a + x'_{i1}β | x_{i1}) − f_i(uh_n | x_{i1})} du]| ≤ C_f(|a| + M∥β∥) ∫_{−∞}^∞ |K(u)| du.   (A.26)

By the proof of Theorem 3.1, max_{1≤i≤n} |â_i| = O_p[{(log n)/T}^{1/2}] and ∥β̂∥ = o_p(T^{−1/2}), so that f̂_i(â_i, β̂) = f_i(0) + ō_p{(log n)/(T h_n)^{1/2}} + ō_p(1) = f_i(0) + ō_p(1), which, by condition (A5) (c), implies that ŝ_i = s_i + ō_p(1). Similarly, we can show that γ̂_i = γ_i + ō_p(1). To prove the second assertion of (A.25), observe that

ν̂_i = ĝ^{(1)}_i(â_i, β̂) + γ̂_i f̂^{(1)}_i(â_i, β̂)
 = E[ĝ^{(1)}_i(a, β)]|_{a=â_i, β=β̂} + ō_p(1) + {γ_i + ō_p(1)}{E[f̂^{(1)}_i(a, β)]|_{a=â_i, β=β̂} + ō_p(1)}
 = E[ĝ^{(1)}_i(0, 0)] + γ_i E[f̂^{(1)}_i(0, 0)] + ō_p(1) = ν_i + ō_p(1),

where the second equality is due to Lemma B.2, and the third equality can be deduced from the evaluation analogous to (A.26). Similarly, we can show that Ĝ_n = G_n + ō_p(1), so that Ĝ_n^{−1} = G_n^{−1} + o_p(1).
We next show that

ω̂^{(1)}_{ni} = ω^{(1)}_{ni} + ō_p(1),  ω̂^{(2)}_{ni} = ω^{(2)}_{ni} + ō_p(1),  ω̂^{(3)}_{ni} = ω^{(3)}_{ni} + ō_p(1).   (A.27)

Step 1: proof of ω̂^{(3)}_{ni} = ω^{(3)}_{ni} + ō_p(1). The proof is similar in spirit to the proof of Hahn and Kuersteiner (2011, Lemma 6), but we need a different technique because of the non-differentiability of the indicator function. Observe that

ω^{(3)}_{ni} = ∑_{1≤|j|≤m_n} (1 − |j|/T) Cov{I(u_{i1} ≤ 0), I(u_{i,1+j} ≤ 0)} + ∑_{m_n+1≤|j|≤T−1} (1 − |j|/T) Cov{I(u_{i1} ≤ 0), I(u_{i,1+j} ≤ 0)},

and the latter sum is shown to be ō(1) by condition (A1) and Lemma C.2. Therefore, it suffices to show that

max_{1≤i≤n} max_{1≤|j|≤m_n} |ϱ̂_i(j) − ϱ_i(j)| = o_p(m_n^{−1}).   (A.28)

By the proof of Theorem 3.1, max_{1≤i≤n} |â_i| = O_p[{(log n)/T}^{1/2}] and ∥β̂∥ = o_p(T^{−1/2}), so that by condition (A5),

max_{1≤i≤n} max_{1≤|j|≤m_n} |E[I(u_{i1} ≤ a + x'_{i1}β) I(u_{i,1+j} ≤ a + x'_{i,1+j}β)]|_{a=â_i, β=β̂} − ϱ_i(j)| = O_p[{(log n)/T}^{1/2}].

Thus, taking into account that {(log n)/T}^{1/2} = o(m_n^{−1}), it suffices to show that

max_{1≤i≤n} max_{1≤|j|≤m_n} sup_{(a,β')'∈R^{p+1}} |T^{−1} ∑_{t=max{1,−j+1}}^{min{T,T−j}} {g_{a,β}(u_{it}, x_{it}; u_{i,t+j}, x_{i,t+j}) − E[g_{a,β}(u_{it}, x_{it}; u_{i,t+j}, x_{i,t+j})]}| = o_p(m_n^{−1}),   (A.29)

where g_{a,β}(u_{it}, x_{it}; u_{i,t+j}, x_{i,t+j}) := I(u_{it} ≤ a + x'_{it}β) I(u_{i,t+j} ≤ a + x'_{i,t+j}β). We make use of Talagrand's inequality (see Appendix C).
Fix any 1 ≤ i ≤ n and 1 ≤ |j| ≤ m_n. In what follows, terms const. and o(·) are interpreted as terms independent of i and j. Put ξ_t := (u_{it}, x'_{it}, u_{i,t+j}, x'_{i,t+j})'. Observe that {ξ_t, t = 0, ±1, ±2, ...} is β-mixing, and letting β̃(·) denote its β-mixing coefficients, we have β̃(k) ≤ β_i(k − |j|) for k > |j|. Write g_{a,β}(ξ_t) = g_{a,β}(u_{it}, x_{it}; u_{i,t+j}, x_{i,t+j}) and define the class of functions G := {g_{a,β} : a ∈ R, β ∈ R^p}. The class G is uniformly bounded by 1, and by Lemma 2.6.15 and Theorem 2.6.7 of van der Vaart and Wellner (1996), there exist constants A ≥ 5e and v ≥ 1 (independent of i and j) such that N(G, L₁(Q), ϵ) ≤ (A/ϵ)^v for every 0 < ϵ < 1 and every probability measure Q on R^{2p+2}. Take q = [(T/ log n)^{1/2}]. By using Lemma 1 of Andrews (1991) and the classical result of Parzen (1957) on the variance evaluation of sample autocovariances, it is shown that sup_{(a,β')'∈R^{p+1}} Var{∑_{t=1}^q g_{a,β}(ξ_t)/√q} ≤ const. By the exponential β-mixing property (condition (A1)), rβ̃(q) = o(n^{−2}) with r = [T/(2q)]. Therefore, by Propositions C.1 and C.2, with probability at least 1 − o(n^{−2}) (take s = 3 log n when applying those propositions),

sup_{(a,β')'∈R^{p+1}} |T^{−1} ∑_{t=max{1,−j+1}}^{min{T,T−j}} {g_{a,β}(ξ_t) − E[g_{a,β}(ξ₁)]}| ≤ const. × √((log n)/T).

The right side is o(m_n^{−1}), so that by the union bound, we obtain (A.29).
Step 2: proof of ω̂^{(1)}_{ni} = ω^{(1)}_{ni} + ō_p(1). To begin with, since the tail sum over m_n < |j| ≤ T − 1 is ō(1) by condition (A9), we have

ω^{(1)}_{ni} = ∑_{1≤|j|≤m_n} (1 − |j|/T){τ f_i(0) − φ_i(j)} + ō(1),

so that

|ω̂^{(1)}_{ni} − ω^{(1)}_{ni}| ≤ m_n |f̂_i(â_i, β̂) − f_i(0)| + m_n max_{1≤|j|≤m_n} |φ̂_i(j) − φ_i(j)| + ō(1).

Because f̂_i(â_i, β̂) = f_i(0) + ō_p{(log n)/(T h_n)^{1/2}}, the first term on the right side is ō_p(1). It remains to show that

max_{1≤i≤n} max_{1≤|j|≤m_n} |φ̂_i(j) − φ_i(j)| = o_p(m_n^{−1}),

which can be shown in a similar way as (A.28). In fact, the problem reduces to showing that

max_{1≤i≤n} max_{1≤|j|≤m_n} sup_{(a,β')'∈R^{p+1}} |T^{−1} ∑_{t=max{1,−j+1}}^{min{T,T−j}} h_n^{−1}{g_{a,β,h_n}(u_{it}, x_{it}; u_{i,t+j}, x_{i,t+j}) − E[g_{a,β,h_n}(u_{it}, x_{it}; u_{i,t+j}, x_{i,t+j})]}| = o_p(m_n^{−1}),   (A.30)

where g_{a,β,h}(u_{it}, x_{it}; u_{i,t+j}, x_{i,t+j}) := K((u_{it} − a − x'_{it}β)/h) I(u_{i,t+j} ≤ a + x'_{i,t+j}β). By the same reasoning as in the previous case, it is shown that the left side of (A.30) is o_p(m_n^{−1})¹⁸ under the present assumptions.

Step 3: proof of ω̂^{(2)}_{ni} = ω^{(2)}_{ni} + ō_p(1). This step is completely analogous to the previous case.

We have now shown (A.25) and (A.27), so that

b̂ = Ĝ_n^{−1}{n^{−1} ∑_{i=1}^n ŝ_i(ω̂^{(1)}_{ni} γ̂_i − ω̂^{(2)}_{ni} + 2^{−1} ŝ_i ω̂^{(3)}_{ni} ν̂_i)} = G_n^{−1}{n^{−1} ∑_{i=1}^n s_i(ω^{(1)}_{ni} γ_i − ω^{(2)}_{ni} + 2^{−1} s_i ω^{(3)}_{ni} ν_i)} + o_p(1) →_p b.

Therefore, we obtain the desired conclusion.
B Miscellaneous lemmas
In this section, we summarize some miscellaneous results used in the proof of Theorem 3.1. Lemma B.2 is a modification of Lemma 3 of Horowitz (1998). Throughout the section, we assume the conditions of Theorem 3.1.
Lemma B.1. We have

(a) H^{(1)}_{ni}(a_{i0}, β₀) = Ō(h_n^r), H^{(2)}_n(a₀, β₀) = O(h_n^r);
(b) ∂_{a_i} H^{(1)}_{ni}(a_{i0}, β₀) = −f_i(0) + Ō(h_n^r);
(c) ∂²_{a_i} H^{(1)}_{ni}(a_{i0}, β₀) = −f^{(1)}_i(0) + Ō(h_n^{r−1});
(d) ∂_β H^{(1)}_{ni}(a_{i0}, β₀) = ∂_{a_i} H^{(2)}_n(a₀, β₀) = −E[f_i(0 | x_{i1}) x_{i1}] + Ō(h_n^r);
(e) ∂_{β'} H^{(2)}_n(a₀, β₀) = −n^{−1} ∑_{i=1}^n E[f_i(0 | x_{i1}) x_{i1} x'_{i1}] + Ō(h_n^r);
(f) ∂_{a_i} ∂_β H^{(1)}_{ni}(a_{i0}, β₀) = −E[f^{(1)}_i(0 | x_{i1}) x_{i1}] + Ō(h_n^{r−1});
(g) ∂³_{a_i} H^{(1)}_{ni}(a_i, β), ∂²_{a_i} ∂_{β_j} H^{(1)}_{ni}(a_i, β), ∂_{a_i} ∂_{β_j} ∂_{β_k} H^{(1)}_{ni}(a_i, β) and ∂_{β_j} ∂_{β_k} ∂_{β_l} H^{(1)}_{ni}(a_i, β) are O(1) uniformly over both (a_i, β) ∈ A × B and 1 ≤ i ≤ n for 1 ≤ j, k, l ≤ p, i.e., for instance, max_{1≤i≤n} sup_{(a_i,β)∈A×B} |∂_{a_i} ∂_{β_j} H^{(1)}_{ni}(a_i, β)| = O(1) as n → ∞ for 1 ≤ j ≤ p;
18. We conjecture that the left side of (A.30) is O_p{√((log n)/(T h_n))}, but we do not pursue this small improvement here.
(h) ∂³_{a_i} H^{(2)}_n(a, β), ∂²_{a_i} ∂_{β_j} H^{(2)}_n(a, β), ∂_{a_i} ∂_{β_j} ∂_{β_k} H^{(2)}_n(a, β) and ∂_{β_j} ∂_{β_k} ∂_{β_l} H^{(2)}_n(a, β) are O(1) uniformly over both (a, β) ∈ A × B for 1 ≤ j, k, l ≤ p.

Proof. The proof is immediate from conditions (A2), (A5) and (A6).
Put

f̂^{(j)}_i(a, β) := (T h_n^{j+1})^{−1} ∑_{t=1}^T K^{(j)}((u_{it} − a − x'_{it}β)/h_n), j = 0, 1,
f̃^{(j)}_i(a, β) := (−1)^j (T h_n^{j+1})^{−1} ∑_{t=1}^T K̃^{(j)}((u_{it} − a − x'_{it}β)/h_n), j = 1, 2,
ĝ^{(j)}_i(a, β) := (−1)^j (T h_n^{j+1})^{−1} ∑_{t=1}^T K^{(j)}((u_{it} − a − x'_{it}β)/h_n) x_{it}, j = 0, 1,
g̃^{(j)}_i(a, β) := (−1)^j (T h_n^{j+1})^{−1} ∑_{t=1}^T K̃^{(j)}((u_{it} − a − x'_{it}β)/h_n) x_{it}, j = 1, 2,
Ĵ_i(a, β) := (T h_n)^{−1} ∑_{t=1}^T K((u_{it} − a − x'_{it}β)/h_n) x_{it} x'_{it},
J̃^{(1)}_i(a, β) := −(T h_n²)^{−1} ∑_{t=1}^T K̃^{(1)}((u_{it} − a − x'_{it}β)/h_n) x_{it} x'_{it},

where K^{(j)}(u) = d^j K(u)/du^j and K^{(0)}(u) stands for K(u). The same rule applies to K̃^{(j)}(u).
Lemma B.2. Uniformly over both (a, β) ∈ R^{p+1} and 1 ≤ i ≤ n,

(a) f̂^{(j)}_i(a, β) − E[f̂^{(j)}_i(a, β)] = o_p{(log n)/(T h_n^{2j+1})^{1/2}}, j = 0, 1;
(b) f̃^{(j)}_i(a, β) − E[f̃^{(j)}_i(a, β)] = o_p{(log n)/(T h_n^{2j+1})^{1/2}}, j = 1, 2;
(c) ĝ^{(j)}_i(a, β) − E[ĝ^{(j)}_i(a, β)] = o_p{(log n)/(T h_n^{2j+1})^{1/2}}, j = 0, 1;
(d) g̃^{(j)}_i(a, β) − E[g̃^{(j)}_i(a, β)] = o_p{(log n)/(T h_n^{2j+1})^{1/2}}, j = 1, 2;
(e) Ĵ_i(a, β) − E[Ĵ_i(a, β)] = o_p{(log n)/(T h_n)^{1/2}};
(f) J̃^{(1)}_i(a, β) − E[J̃^{(1)}_i(a, β)] = o_p{(log n)/(T h_n³)^{1/2}}.
As in Horowitz (1998), we employ empirical process theory to prove the lemma, but since the observations are dependent in the time dimension, we use a concentration inequality for β-mixing processes, which is developed in Appendix C. Before the proof, we introduce some notation. Let F be a class of measurable functions on a measurable space (S, S). For a functional Z(f) defined on F, we use the notation ∥Z(f)∥_F := sup_{f∈F} |Z(f)|. For a probability measure Q on (S, S) and a constant ϵ > 0, let N(F, L_k(Q), ϵ) denote the ϵ-covering number of F with respect to the L_k(Q) norm ∥·∥_{L_k(Q)}, where k ≥ 1.
Proof of Lemma B.2. We only prove (a) with j = 0. The other cases can be shown in a similar way. Put ξ_{it} := (u_{it}, x_{it}), g_{a,β,h}(u, x) := K((u − a − x'β)/h) for (a, β) ∈ R^{p+1} and h > 0, and G_{ni} := {g_{a,β,h_n} − E[g_{a,β,h_n}(ξ_{i1})] : (a, β) ∈ R^{p+1}}. We apply Propositions C.1 and C.2 to the class G_{ni}. The pointwise measurability of G_{ni} is guaranteed by the continuity of K(·). Because of the boundedness of K(·), the class G_{ni} is uniformly bounded by some constant U (say) independent of i and n. Since condition (A6) guarantees that K(·) is of bounded variation, by Lemma 22 of Nolan and Pollard (1987), there exist constants A ≥ 5e and v ≥ 1 independent of i and n such that N(G_{ni}, L₁(Q), Uϵ) ≤ (A/ϵ)^v for every 0 < ϵ < 1 and every probability measure Q on R^{p+1}. Let q ∈ [1, T/2]. We wish to bound the variance of the sum ∑_{t=1}^q f(ξ_{it})/√q for f ∈ G_{ni}. It is standard to show that E[g_{a,β,h_n}(ξ_{i1})²] ≤ const. × h_n and E[|g_{a,β,h_n}(ξ_{i1}) g_{a,β,h_n}(ξ_{i,1+j})|] ≤ const. × h_n², where the constants are independent of a, β, i, j and n (the latter assertion uses the fact that the joint densities f_{i,j}(u₁, u_{1+j} | x₁, x_{1+j}) are uniformly bounded, which is assumed in condition (A5) (e)). Thus, by Lemma 1 of Yoshihara (1976) (see also Lemma C.2 below), we have |Cov(g_{a,β,h_n}(ξ_{i1}), g_{a,β,h_n}(ξ_{i,1+j}))| ≤ const. × h_n β_i^{1/2}(j), which implies that

sup_{g∈G_{ni}} Var{∑_{t=1}^q g(ξ_{it})/√q} ≤ const. × {1 + ∑_{j=1}^∞ β_i^{1/2}(j)} h_n = const. × h_n,

where the constant on the last expression is independent of i, n and q (recall that condition (A1) ensures that ∑_{j=1}^∞ sup_{i≥1} β_i^{1/2}(j) < ∞). Take q = q_n = [(T h_n^{1/2}/ log n)^{1/2}]. By the exponential β-mixing property (condition (A1)), rβ_i(q) = ō(n^{−1}) with r = [T/(2q)]. Therefore, by Propositions C.1 and C.2, for all s > 0, with probability at least 1 − e^{−s} − ō(n^{−1}), we have

∥∑_{t=1}^T g(ξ_{it})∥_{G_{ni}} ≤ const. × {√(T h_n (s ∨ log n)) + s √(T h_n)/ log n},

where the constant is independent of i, n and s. Taking s = 2 log n and using the union bound, we obtain the desired result.
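The object controlled by Lemma B.2 (a) is a kernel estimator of the error density at the quantile. A hedged sketch of f̂^{(0)}_i(a, β), with a Gaussian kernel substituted for the paper's generic K (an assumption) and hypothetical function names:

```python
import math

def K(u):
    # Standard normal density: a second-order kernel of bounded variation,
    # used here purely for illustration.
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def f_hat(u, x, a, beta, h):
    # f_hat_i(a, beta) = (T h)^{-1} sum_t K((u_t - a - x_t' beta) / h)
    T = len(u)
    total = 0.0
    for ut, xt in zip(u, x):
        resid = ut - a - sum(b * xk for b, xk in zip(beta, xt))
        total += K(resid / h)
    return total / (T * h)

# Single observation at zero residual: estimate equals K(0)/h.
assert abs(f_hat([0.0], [[0.0]], 0.0, [0.0], 1.0) - 0.3989422804014327) < 1e-12
assert abs(f_hat([0.0], [[0.0]], 0.0, [0.0], 2.0) - 0.19947114020071635) < 1e-12
```

In the proofs, 1/f̂_i(â_i, β̂) is exactly the plug-in ŝ_i used in the bias-correction formula.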
C Some stochastic inequalities for β-mixing processes

We introduce some stochastic inequalities that can be applied to β-mixing processes. These inequalities are used in the proofs of the theorems above, and are of independent interest.

The first object is to extend Talagrand's (1996) concentration inequality to β-mixing processes. We recall Talagrand's inequality for i.i.d. random variables.
Theorem C.1 (Talagrand (1996); in this form, Massart (2000)). Let ξ_i, i = 1, 2, ... be i.i.d. random variables taking values in some measurable space S. Let F be a pointwise measurable class of functions on S such that E[f(ξ₁)] = 0 and sup_{x∈S} |f(x)| ≤ U for all f ∈ F (see van der Vaart and Wellner, 1996, Section 2.3 for pointwise measurability). Let σ² be any positive constant such that σ² ≥ sup_{f∈F} E[f(ξ₁)²]. Put Z := sup_{f∈F} |∑_{i=1}^n f(ξ_i)|. Then, for all s > 0, we have

P{Z ≥ 2E[Z] + Cσ√(ns) + CUs} ≤ e^{−s},

where C is a universal constant.
In what follows, let {ξ_t, t ∈ Z} be a stationary process taking values in a measurable space (S, S). We assume that S is a Polish space and S is its Borel σ-field. For a function f on S and a positive integer q, define

σ²_q(f) := Var{f(ξ₁)} + 2 ∑_{j=1}^{q−1} (1 − j/q) Cov{f(ξ₁), f(ξ_{1+j})},

which is the variance of the sum ∑_{t=1}^q f(ξ_t)/√q. Let β(j) denote the β-mixing coefficients of {ξ_t}. The main technical device we use is a blocking technique developed in Yu (1993, 1994) and Arcones and Yu (1994). The blocking technique enables us to employ maximal inequalities and the symmetrization technique available in the i.i.d. case. We introduce some notation and a key lemma related to the blocking technique. Divide the T-sequence {1, ..., T} into blocks of length q with q ∈ [1, T/2], one after the other:

H_k := {t : 2(k − 1)q + 1 ≤ t ≤ (2k − 1)q}, T_k := {t : (2k − 1)q + 1 ≤ t ≤ 2kq},

where k = 1, ..., r := [T/(2q)]. Put Ξ_k := (ξ_t, t ∈ H_k). With a slight abuse of notation, for a function f : S → R, we write f(Ξ_k) = ∑_{t∈H_k} f(ξ_t). Let Ξ̃_k = (ξ̃_t, t ∈ H_k), k = 1, ..., r be independent blocks such that each Ξ̃_k has the same distribution as Ξ₁. The next lemma, which is a key to the blocking technique, is due to Eberlein (1984).

Lemma C.1. Work with the same notation as above. For every Borel measurable subset A of S^{qr}, we have

|P{(Ξ₁, ..., Ξ_r) ∈ A} − P{(Ξ̃₁, ..., Ξ̃_r) ∈ A}| ≤ rβ(q).
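The definition of σ²_q(f) can be checked numerically on an exactly computable example. The MA(1) process below is our own illustrative choice (any process with known autocovariances would do):

```python
def sigma_q_squared(gamma, q):
    # Triangular-weight long-run variance: gamma[j] = Cov(f(xi_1), f(xi_{1+j})).
    s = gamma[0]
    for j in range(1, q):
        if j < len(gamma):
            s += 2.0 * (1.0 - j / q) * gamma[j]
    return s

# MA(1): xi_t = e_t + theta * e_{t-1}, e_t i.i.d. with unit variance, so
# gamma(0) = 1 + theta^2, gamma(1) = theta, gamma(j) = 0 for j >= 2, and
# Var(sum_{t=1}^q xi_t) = q (1 + theta^2) + 2 (q - 1) theta exactly.
theta, q = 0.5, 6
gamma = [1.0 + theta ** 2, theta]
lhs = sigma_q_squared(gamma, q)
rhs = (q * (1.0 + theta ** 2) + 2.0 * (q - 1) * theta) / q
assert abs(lhs - rhs) < 1e-12
```

This identity is why σ²_q(f) is the right variance proxy for sums of q consecutive observations.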
Suppose that we have a pointwise measurable class of functions F on S such that sup_{x∈S} |f(x)| ≤ U for all f ∈ F. For each f ∈ F, observe that

∑_{t=1}^T f(ξ_t) = ∑_{k=1}^r ∑_{t∈H_k} f(ξ_t) + ∑_{k=1}^r ∑_{t∈T_k} f(ξ_t) + ∑_{t=2qr+1}^T f(ξ_t),

so that ∥∑_{t=1}^T f(ξ_t)∥_F ≤ ∥∑_{k=1}^r ∑_{t∈H_k} f(ξ_t)∥_F + ∥∑_{k=1}^r ∑_{t∈T_k} f(ξ_t)∥_F + (T − 2qr)U. Since the first and second terms on the right side have the same distribution, for all s > 0, we have

P{∥∑_{t=1}^T f(ξ_t)∥_F ≥ 2s + (T − 2qr)U} ≤ 2P{∥∑_{k=1}^r ∑_{t∈H_k} f(ξ_t)∥_F ≥ s} = 2P{∥∑_{k=1}^r f(Ξ_k)∥_F ≥ s}.

By Lemma C.1, the last expression is bounded by

2P{∥∑_{k=1}^r f(Ξ̃_k)∥_F ≥ s} + 2rβ(q).

Recall that Ξ̃_k, k = 1, ..., r are i.i.d. blocks. By Talagrand's inequality, for all s > 0, we have

P{∥∑_{k=1}^r f(Ξ̃_k)∥_F ≥ 2E[∥∑_{k=1}^r f(Ξ̃_k)∥_F] + Cσ_q√(sT) + CsqU} ≤ e^{−s},

where σ²_q is any positive constant such that σ²_q ≥ sup_{f∈F} σ²_q(f). Therefore, we obtain the next proposition:

Proposition C.1. Work with the same notation as above. Then, for all s > 0, we have

P{∥∑_{t=1}^T f(ξ_t)∥_F ≥ 4E[∥∑_{k=1}^r f(Ξ̃_k)∥_F] + 2Cσ_q√(sT) + 2CsqU + (T − 2qr)U} ≤ 2e^{−s} + 2rβ(q).
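The block construction behind Proposition C.1 is easy to make concrete. A hedged sketch with 1-based indices as in the text (function name is ours):

```python
def blocks(T, q):
    # Divide {1, ..., T} into alternating blocks H_k, T_k of length q; r = [T/(2q)].
    r = T // (2 * q)
    H = [list(range(2 * (k - 1) * q + 1, (2 * k - 1) * q + 1)) for k in range(1, r + 1)]
    Tb = [list(range((2 * k - 1) * q + 1, 2 * k * q + 1)) for k in range(1, r + 1)]
    return H, Tb, r

H, Tb, r = blocks(T=20, q=3)
assert r == 3
assert H[0] == [1, 2, 3] and Tb[0] == [4, 5, 6]
assert H[1] == [7, 8, 9] and Tb[1] == [10, 11, 12]
# The blocks cover 2qr = 18 of the 20 indices; the remaining T - 2qr = 2
# indices contribute the (T - 2qr)U remainder term in Proposition C.1.
covered = sorted(t for b in H + Tb for t in b)
assert covered == list(range(1, 19))
```

Only the odd-numbered blocks H_k are retained; replacing them by independent copies costs at most rβ(q) in total variation, which is exactly the 2rβ(q) term in the bound.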
For the evaluation of the term E[∥∑_{k=1}^r f(Ξ̃_k)∥_F], the next proposition is useful. Recall that for a probability measure Q on (S, S) and a constant ϵ > 0, N(F, L_k(Q), ϵ) denotes the ϵ-covering number of F with respect to the L_k(Q) norm ∥·∥_{L_k(Q)}, where k ≥ 1, and for a functional Z(f) on F, ∥Z(f)∥_F := sup_{f∈F} |Z(f)|.

Proposition C.2. Work with the same notation as above. Assume that there exist constants A ≥ 5e and v ≥ 1 such that N(F, L₁(Q), Uϵ) ≤ (A/ϵ)^v for every 0 < ϵ < 1 and every probability measure Q on S. Then, we have

E[∥∑_{k=1}^r f(Ξ̃_k)∥_F] ≤ C[qvU log(√q A'U/σ_q) + √(vT) σ_q √(log(√q A'U/σ_q))],   (C.1)

where C is a universal constant and A' := √(2A).

Proof. The proof is based on the proof of Giné and Guillou (2001, Proposition 2.1),
of Gine and Guillou (2001, Prop osition 2.1),
but some mo difications are needed. Recall that ˜ Ξk, k = 1 , . . . , r are i.i.d. , . . . ,
ϵr
Let ϵ
b e i.i.d. Rademacher random variables indep endent of ˜ Ξ1, . . . , ˜ Ξr
ϵ f(˜ ) ]
Ξ
r
k =1
∑
Us
E
q
Us
1
[
:
=
v
'
.
B
y
L
e
m
m
a
2.
3.
1
of
v
a
n
d
er
V
a
ar
t
a
n
d
W
el
ln
er
(1
9
9
6)
,
w
e
h
a
v
e
E[
k
f ( ˜ Ξk) ] = 2E [ ∑= 2 q U E k
ϵk
F
[
∑
rk
=1
k
] , (C.2)
F)
f( ˜ Ξ
H
=
1
∑
r
k
where H := { f (ξ1, . . . , q) = ∑q t=1 f (ξt
r k =1 ϵkf ( ˜ Ξk) / v r is
subξ
˜ Ξk)/( q U ) : f ∈ F } . We shall b ound the right side of (C.3). Without loss of
generality, we may assume that 0 ∈ H . By Ho effding’s inequality, given {
r
k =1
ϵ f ( ˜ Ξk)/ vr H] = C ∫
where
Eϵ
Gaussian for the L2 ( ˜ Qr) nor
distributio
2.2.8 of va
v log N (H , L2( ˜ ) , ϵ
Q
)dϵ,
stands for the exp ectation with resp ect to ϵk’s and C is a universal constant.
Letdenote the empirical distribution on S that assigns probability 1/( q r ) to each ˜tq
rr k =1, t ∈ ∪Hk. Since for fi(ξ1, . . . , ξq) = ∑q t=1rfi(ξt) /(q U ), fi∈ F , i = 1 , 2, 1
∥∑
f(
~
k
2/r ∥1= 2
H
k
1
2
k
E [ ∑
q2r
U
2
k =1
1
k
2
k
r k =1
k
ϵ
)
r
0
˜P
ξ
{ f ( ˜ ) - f ( ˜ )}2 = 1
( ˜ r∑{ f
) - f ( ˜ )}2
Ξ
Ξ
Ξ
Ξ
∑
k =1 | f1( ˜ k) - f
(˜
Ξ
Ξk)|
∑ r 2 L (~ 2
),
= 2q
P
rU
= 2U ∥ 1 - f ∥ 1 q r
f
we have N (H , 2
), ϵ ) = N (F , L1( ˜ Pq r), U ϵ2/2) = (2A/ϵ2)v.
L
Thus,
k
r
H] = Cv 2 )/ v ϵkf ( ˜ rΞ
Eϵ [ ∑
v= 2 C v
r
k
=1
k =1
Av ∫
44
2A/ ∥ ∑r k =1 f (
~
2
k
1=
2H
v log dϵ
ϵ
.
(˜
Qr
∫0
∥ ∑r k =1 f( ~
8v
k2)/r ∥1 = 2 H
/r
)
∥
ϵ v log (v 2A/ϵ
2 )dϵ
21
ϵ v log d
ϵ
ϵ
a8Integration
log ϵ2ϵ
dϵ = [ -=v log ϵϵ + 12 ∫8a8v log ϵ+ 1
v
]8a2 ∫
a
log
aa
dϵ, a = e,
by parts gives ∫v
ϵ2
v log ϵdϵ = 2v log aa , a =
2ϵ
e.
2/r
f( ˜ Ξ H/r
v log r k =12A ∥ k)2/r ∥H
∑r k =1
],
k =1∑
8
from which we have ∫a
Therefore, we have
E [ k =1∑
k =1
2
r
k
E
[
E[
wh
ere
the
sec
ond
ine
qua
ϵkf ( ˜ Ξk)/ vr
H] =
2Cv 2v E= 2C v
=
1
2
v
v
u
∑
E
[
[
r
k
=
r1
∑
k)
k)
∑
1 /2
H]
f(˜Ξ
f ( ˜ Ξk
log
u
t
ϵ
∑
k =1
2
rs
2
v
v
(
s
qU
f(˜
Ξ
2
2A E[ ∥
)
2/r
∥H
lity
is
due
to
H¨o
lder
’s
ine
qua
lity,
the
con
cavi
ty
of
the
ma
px
7→
x
log
(a/x
)
and
Jen
sen
’s
ine
qua
lity.
N
o
w,
b
y
C
or
ol
la
ry
3.
4
of
T
al
a
gr
a
n
d
(1
9
9
4)
,
r
f( ˜ Ξk
2 H]
)
=
2q
+ 8E
[
r
ϵkf ( ˜
Ξk
) H] ,
kf(
˜ Ξk
2
q
r
ϵ f ( ˜ Ξk) H]) log 2q AU2
.
∑=
2C v
k =1
uut
and the right side is b ounded by 10r b
ecause s
2
q
s
=2q
U
2
r
. Since the map x 7→ x log ( a/x ) is non-decreasing for 0 = x = a/e and A = 5e , we
have
k
2
[
k r + 8
q
k) H
=1∑∑ E
Put Z := E
[ r
].
ϵkf ( ˜ Ξ
Then, Z
satisfies
Z2 q U= C v log
r vq A'U + 8 C v Z log vq ,
s2 q2
sq
A'U sq
45
)/ v
qU
H]
r
where C is another universal constant and := v2A. This gives
A'
+ v16C2v2 ( log v q A'Uq )2 + C v r s2log vq A'
Z = 4C v log vq A'U
q
q
s
sq
+
v
r
q U2
v
v q A 'U s
= 8C v
C
l
s
log vq
o
v
A'U sq
v
g
v
q
v
U
wh
ere
the
sec
ond
ine
qua
lity
is
due
to
va
+b
=
va
+
vb
for
a, b
> 0,
and
C'is
ano
ther
uni
ver
sal
con
sta
nt.
Co
mbi
nin
g
(C.
2)
and
(C.
3)
yiel
ds
the
des
ired
ine
qua
lity.
In the case that F is singleton, i.e., F = {f}, Proposition C.2 gives a Bernstein-type inequality for β-mixing processes. In that case,

E[∥∑_{k=1}^r f(Ξ̃_k)∥_F] = E[|∑_{k=1}^r f(Ξ̃_k)|] ≤ √(E[(∑_{k=1}^r f(Ξ̃_k))²]) = √(rq) σ_q(f).

Although such inequalities are known in the literature (cf. Fan and Yao, 2003, Theorem 2.18), we present the special case of F being singleton as a corollary, since its form is more convenient in the present context.
Corollary C.1. Let f be a function on S such that sup_{x∈S} |f(x)| ≤ U and E[f(ξ_1)] = 0. Then, for all s > 0, we have

P{ | ∑_{t=1}^{T} f(ξ_t) | ≥ C[ √( (s ∨ 1)T ) s_q(f) + sqU ] } ≤ 2e^{−s} + 2⌊T/(2q)⌋ β(q),

where C is a universal constant.
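To see the content of Corollary C.1 concretely, the following Monte Carlo sketch takes an i.i.d. Uniform[−1, 1] sequence (so β(q) = 0, and with q = 1, s_q(f) is just the standard deviation) and f the identity. The corollary's universal constant C is unspecified, so √2 is an illustrative stand-in; this is a plausibility check of the tail scaling, not a verification of the corollary.

```python
import math
import random

# i.i.d. Uniform[-1, 1], f(x) = x, so U = 1 and s_q(f) = 1/sqrt(3) for q = 1.
random.seed(0)
T, s, U = 200, 2.0, 1.0
s_q = 1.0 / math.sqrt(3.0)
# Bernstein-style deviation level, with sqrt(2) standing in for the
# unspecified universal constant C (an assumption of this sketch).
threshold = math.sqrt(2.0 * s * T) * s_q + s * U
reps = 5000
hits = sum(
    1 for _ in range(reps)
    if abs(sum(random.uniform(-1.0, 1.0) for _ in range(T))) >= threshold
)
tail = hits / reps
print(tail, 2.0 * math.exp(-s))   # empirical tail vs. the 2e^{-s} envelope
```

With these choices the empirical tail sits far below the 2e^{−s} envelope, consistent with the sub-Gaussian behavior of bounded sums.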
In applying those inequalities, the evaluation of the variance term s_q(f) is essential. For β-mixing processes, Yoshihara's (1976) Lemma 1 is particularly useful for that purpose. Since it is repeatedly used in the proofs of the theorems above, we describe a special case of that lemma.
Lemma C.2. Work with the same notation as above. Let j be a fixed positive integer. Let f and g be functions on S such that E[f(ξ_1)] = E[g(ξ_{1+j})] = 0, and for some positive constants δ and M,

E[ |f(ξ_1)|^{1+δ} ] ≤ M,  E[ |g(ξ_{1+j})|^{1+δ} ] ≤ M,  E[ |f(ξ_1) g(ξ_{1+j})|^{1+δ} ] ≤ M.  (C.4)

Then, we have

| Cov( f(ξ_1), g(ξ_{1+j}) ) | ≤ 4M^{1/(1+δ)} β(j)^{δ/(1+δ)}.

A direct consequence of Lemma C.2 is that if there exist δ and M such that (C.4) holds for any positive integer j, then the sum ∑_{j=1}^{∞} Cov( f(ξ_1), g(ξ_{1+j}) ) is absolutely convergent, and for any positive integer q,

Var{ ∑_{t=1}^{q} f(ξ_t)/√q } ≤ 4M^{1/(1+δ)} { 1 + 2 ∑_{j=1}^{∞} β(j)^{δ/(1+δ)} }.
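As an illustration of this variance bound, consider a stationary Gaussian AR(1) process with f the identity; such a process is β-mixing with geometrically decaying coefficients, so the bound predicts that Var(∑_{t=1}^q f(ξ_t)/√q) stays bounded in q. The exact computation below uses the standard AR(1) autocovariances and long-run variance formula, which are textbook facts rather than results from this paper.

```python
# Exact Var(sum_{t=1}^q xi_t / sqrt(q)) for the stationary Gaussian AR(1)
# xi_t = rho*xi_{t-1} + e_t with unit innovation variance.
rho = 0.5
gamma0 = 1.0 / (1.0 - rho**2)                 # stationary variance

def block_variance(q):
    # Var(S_q)/q = gamma(0) + 2*sum_{j=1}^{q-1} (1 - j/q) gamma(j)
    return gamma0 * (1.0 + 2.0 * sum((1.0 - j / q) * rho**j
                                     for j in range(1, q)))

limit = gamma0 * (1.0 + rho) / (1.0 - rho)    # long-run variance (= 4 here)
vals = [block_variance(q) for q in (10, 100, 500)]
print(vals, limit)
```

The values increase toward, and never exceed, the long-run variance, which is exactly the uniform-in-q boundedness that the consequence above delivers for mixing processes.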
If β(j) decays exponentially fast as j → ∞, i.e., for some constants a ∈ (0, 1) and B > 0, β(j) ≤ B a^j, then ∑_{j=1}^{∞} β(j)^{δ/(1+δ)} < ∞ for any δ > 0.
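The last claim is a geometric-series computation: with β(j) ≤ B a^j, the series ∑_j β(j)^{δ/(1+δ)} is dominated by a geometric series with ratio a^{δ/(1+δ)} < 1. A quick numerical check with illustrative values B = 1, a = 0.5, δ = 1:

```python
# sum_j (B*a^j)^{d/(1+d)} is geometric with ratio t = a^{d/(1+d)} < 1.
B, a, d = 1.0, 0.5, 1.0
p = d / (1.0 + d)
t = a ** p                                    # = sqrt(0.5) here
closed_form = (B ** p) * t / (1.0 - t)        # sum of the dominating series
truncated = sum((B * a**j) ** p for j in range(1, 200))
print(truncated, closed_form)
```

The truncated sum matches the closed form to machine precision, since the omitted tail is geometrically small.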
References
Abrevaya, J., and C. M. Dahl (2008): “The Effects of Birth Inputs on Birthweight: Evidence From Quantile Estimation on Panel Data,” Journal of Business and Economic Statistics, 26, 379–397.
Amemiya, T. (1982): “Two Stage Least Absolute Deviations Estimators,”
Econometrica, 50, 689–711.
Andrews, D. W. K. (1991): “Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation,” Econometrica, 59, 817–858.
Angrist, J., V. Chernozhukov, and I. Fernandez-Val (2006): “Quantile Regression under Misspecification, with an Application to the U.S. Wage Structure,” Econometrica, 74, 539–563.
Arcones, M. A. (1998): “Second Order Representations of the Least Absolute Deviation Regression Estimator,” Annals of the Institute of Statistical Mathematics, 50, 87–117.
Arcones, M. A., and B. Yu (1994): “Central Limit Theorems for Empirical and U-processes of Stationary Mixing Sequences,” Journal of Theoretical Probability, 7, 47–71.
Arellano, M., and S. Bonhomme (2009): “Robust Priors in Nonlinear Panel Data Models,” Econometrica, 77, 489–536.
Arellano, M., and J. Hahn (2005): “Understanding Bias in Nonlinear Panel Models: Some Recent Developments,” Mimeo, CEMFI.
Bester, C. A., and C. Hansen (2009): “A Penalty Function Approach to Bias Reduction in Non-Linear Panel Models with Fixed Effects,” Journal of Business and Economic Statistics, 27, 131–148.
Buchinsky, M. (1995): “Estimating the Asymptotic Covariance Matrix for Quantile Regression Models: A Monte Carlo Study,” Journal of Econometrics, 68, 303–338.
Canay, I. A. (2008): “A Note on Quantile Regression for Panel Data Models,” Mimeo, Northwestern University.
Dhaene, G., and K. Jochmans (2009): “Split-Panel Jackknife Estimation of Fixed-Effect Models,” Mimeo.
Eberlein, E. (1984): “Weak Convergence of Partial Sums of Absolutely Regular Sequences,” Statistics and Probability Letters, 2, 291–293.
Fan, J., and Q. Yao (2003): Nonlinear Time Series. Springer-Verlag.
Fernandez-Val, I. (2005): “Bias Correction in Panel Data Models with Individual Specific Parameters,” Mimeo.
Gamper-Rabindran, S., S. Khan, and C. Timmins (2010): “The Impact of Piped Water Provision on Infant Mortality in Brazil: a Quantile Panel Data Approach,” Journal of Development Economics, 92, 188–200.
Gine, E., and A. Guillou (2001): “On Consistency of Kernel Density Estimators for Randomly Censored Data: Rates Holding Uniformly over Adaptive Intervals,” Annales de l’Institut Henri Poincare, 37, 503–522.
Graham, B. S., J. Hahn, and J. L. Powell (2009): “The Incidental Parameter Problem in a Non-Differentiable Panel Data Model,” Economics Letters, 105, 181–182.
Hahn, J., and G. M. Kuersteiner (2011): “Asymptotically Unbiased Inference for a Dynamic Panel Model with Fixed Effects When both n and T are Large,” Econometric Theory, forthcoming.
Hahn, J., and W. Newey (2004): “Jackknife and Analytical Bias Reduction for Nonlinear Panel Models,” Econometrica, 72, 1295–1319.
Horowitz, J. L. (1992): “A Smoothed Maximum Score Estimator for the Binary Response Model,” Econometrica, 60, 505–531.
(1998): “Bootstrap Methods for Median Regression Models,” Econometrica, 66, 1327–1351.
(2001): “The Bootstrap in Econometrics,” in Handbook of Econometrics, Vol. 5, ed. by J. J. Heckman, and E. E. Leamer, pp. 3159–3228. Elsevier Science B.V.
Ichimura, H., and P. Todd (2007): “Implementing Nonparametric and Semiparametric Estimators,” in Handbook of Econometrics, Vol. 6, ed. by J. J. Heckman, and E. E. Leamer, pp. 5369–5468. Elsevier Science B.V.
Kato, K., A. F. Galvao, Jr., and G. V. Montes-Rojas (2011): “Asymptotics for Panel Quantile Regression Models with Individual Effects,” Mimeo.
Knight, K. (1998): “Limiting Distributions for L1 Regression Estimators Under General Conditions,” Annals of Statistics, 26, 755–770.
Koenker, R. (2004): “Quantile Regression for Longitudinal Data,” Journal of Multivariate Analysis, 91, 74–89.
(2005): Quantile Regression. Cambridge University Press, Cambridge.
(2009): “Quantile Regression in R: A Vignette,” Available from: http://cran.r-project.org/web/packages/quantreg/index.html.
Koenker, R., and G. W. Bassett (1978): “Regression Quantiles,” Econometrica, 46, 33–49.
Lancaster, T. (2000): “The Incidental Parameter Problem since 1948,” Journal of Econometrics, 95, 391–413.
Li, H., B. G. Lindsay, and R. P. Waterman (2003): “Efficiency of Projected Score Methods in Rectangular Array Asymptotics,” Journal of the Royal Statistical Society, Series B, 65, 191–208.
Lipsitz, S. R., G. M. Fitzmaurice, G. Molenberghs, and L. P. Zhao (1997): “Quantile Regression Methods for Longitudinal Data with Drop-outs: Application to CD4 Cell Counts of Patients Infected with the Human Immunodeficiency Virus,” Journal of the Royal Statistical Society, Series C, 46, 463–476.
Massart, P. (2000): “About the Constants in Talagrand’s Concentration Inequalities for Empirical Processes,” Annals of Probability, 28, 863–884.
Muller, H. G. (1984): “Smooth Optimal Kernel Estimators of Densities, Regression Curves and Modes,” Annals of Statistics, 12, 766–774.
Neyman, J., and E. L. Scott (1948): “Consistent Estimates Based on Partially Consistent Observations,” Econometrica, 16, 1–32.
Nolan, D., and D. Pollard (1987): “U-processes: Rates of Convergence,” Annals of Statistics, 15, 780–799.
Parzen, E. (1957): “On Consistent Estimates of the Spectrum of a Stationary Time Series,” Annals of Mathematical Statistics, 28, 329–348.
Phillips, P. C. (1991): “A Shortcut to LAD Estimator Asymptotics,” Econometric Theory, 7, 450–463.
Rilstone, P., V. Srivastava, and A. Ullah (1996): “The Second-Order Bias and Mean Squared Error of Nonlinear Estimators,” Journal of Econometrics, 75, 369–395.
Rosen, A. (2009): “Set Identification via Quantile Restrictions in Short Panels,” CEMMAP working paper CWP26/09.
Silva, D. G. D., G. Kosmopoulou, and C. Lamarche (2009): “The Effect of Information on the Bidding and Survival of Entrants in Procurement Auctions,” Journal of Public Economics, 93, 56–72.
Talagrand, M. (1994): “Sharper Bounds for Gaussian and Empirical Processes,” Annals of Probability, 22, 28–76.
(1996): “New Concentration Inequalities in Product Spaces,” Inventiones Mathematicae, 126, 505–563.
van der Vaart, A., and J. A. Wellner (1996): Weak Convergence and Empirical Processes. Springer-Verlag, New York, New York.
Wang, H. J., and M. Fygenson (2009): “Inference for Censored Quantile Regression Models in Longitudinal Studies,” Annals of Statistics, 37, 756–781.
Wei, Y., and X. He (2006): “Conditional Growth Charts,” Annals of Statistics, 34, 2069–2097.
Woutersen, T. M. (2002): “Robustness Against Incidental Parameters,” Mimeo, University of Western Ontario.
Yoshihara, K. (1976): “Limiting Behavior of U-Statistics for Stationary, Absolutely Regular Processes,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 35, 237–252.
(1978): “Moment Inequalities for Mixing Sequences,” Kodai Mathematical Journal, 1, 316–328.
Yu, B. (1993): “Density Estimation in the L∞ Norm for Dependent Data with Applications to the Gibbs Sampler,” Annals of Statistics, 21, 711–735.
(1994): “Rates of Convergence for Empirical Processes of Stationary Mixing Sequences,” Annals of Probability, 22, 94–116.
Estimators: β̂, β̂^KB, β̂₁, β̂₁^{=2}, β̂₁^{KB} (for each, τ = 0.25, 0.50, 0.75)
BIAS T = 10 n = 50 0.047 0.001 -0.045 0.045 -0.004 -0.048 0.026 -0.007 -0.030 0.047 0.005 -0.036 0.028 -0.002 -0.027
n = 100 0.042 0.001 -0.049 0.042 0.000 -0.049 0.021 -0.001 -0.036 0.043 0.002 -0.047 0.021 0.000 -0.036
n = 500 0.046 -0.001 -0.046 0.046 -0.001 -0.046 0.027 -0.003 -0.030 0.046 -0.001 -0.046 0.027 -0.003 -0.030
T = 20 n = 50 0.027 0.004 -0.019 0.017 0.003 -0.018 0.002 0.003 -0.007 0.023 0.008 -0.011 0.012 0.004 -0.008
n = 100 0.021 0.001 -0.019 0.019 0.001 -0.018 0.005 0.001 -0.006 0.022 0.002 -0.019 0.007 0.001 -0.007
n = 500 0.021 -0.001 -0.021 0.021 -0.001 -0.021 0.008 -0.001 -0.011 0.021 -0.001 -0.021 0.008 -0.001 -0.011
T = 50 n = 50 0.006 0.000 -0.010 0.005 0.000 -0.012 -0.001 0.000 -0.006 0.008 0.003 -0.005 0.000 0.000 -0.004
n = 100 0.008 0.000 -0.008 0.008 0.000 -0.008 0.002 0.000 -0.002 0.009 0.000 -0.007 0.002 0.000 -0.002
n = 500 0.008 0.000 -0.009 0.008 0.000 -0.009 0.001 0.000 -0.003 0.008 0.000 -0.009 0.001 0.000 -0.003
T = 100 n = 50 0.004 -0.001 -0.003 0.001 -0.001 -0.003 -0.002 -0.001 0.000 0.005 0.001 0.002 0.001 -0.001 0.001
n = 100 0.004 -0.001 -0.004 0.003 -0.001 -0.004 0.000 -0.001 -0.001 0.005 -0.001 -0.004 0.001 -0.001 -0.001
n = 500 0.004 0.000 -0.005 0.004 0.000 -0.005 0.001 0.000 -0.002 0.004 0.000 -0.005 0.001 0.000 -0.002
SD T = 10 n = 50 0.107 0.099 0.107 0.107 0.098 0.106 0.136 0.117 0.131 0.109 0.101 0.108 0.136 0.119 0.131
n = 100 0.076 0.069 0.074 0.076 0.069 0.074 0.094 0.084 0.092 0.076 0.070 0.074 0.094 0.085 0.092
n = 500 0.034 0.032 0.033 0.034 0.032 0.033 0.043 0.041 0.042 0.034 0.032 0.033 0.043 0.041 0.042
T = 20 n = 50 0.077 0.072 0.076 0.076 0.071 0.075 0.085 0.076 0.081 0.078 0.074 0.077 0.086 0.077 0.081
n = 100 0.054 0.051 0.058 0.054 0.050 0.058 0.060 0.055 0.062 0.054 0.051 0.058 0.060 0.055 0.062
n = 500 0.024 0.022 0.024 0.024 0.022 0.024 0.027 0.023 0.027 0.024 0.022 0.024 0.027 0.023 0.027
T = 50 n = 50 0.046 0.043 0.049 0.046 0.043 0.049 0.048 0.044 0.050 0.048 0.044 0.050 0.048 0.044 0.050
n = 100 0.034 0.031 0.034 0.034 0.031 0.034 0.035 0.032 0.035 0.034 0.031 0.034 0.035 0.032 0.035
n = 500 0.015 0.014 0.015 0.015 0.014 0.015 0.016 0.014 0.016 0.015 0.014 0.015 0.016 0.014 0.016
T = 100 n = 50 0.034 0.031 0.032 0.034 0.031 0.032 0.035 0.031 0.033 0.035 0.031 0.033 0.035 0.031 0.033
n = 100 0.024 0.022 0.024 0.024 0.022 0.024 0.025 0.022 0.024 0.024 0.022 0.024 0.025 0.022 0.024
n = 500 0.011 0.010 0.011 0.011 0.010 0.011 0.011 0.010 0.011 0.011 0.010 0.011 0.011 0.010 0.011
RMSE T = 10 n = 50 0.117 0.099 0.116 0.116 0.098 0.117 0.139 0.117 0.135 0.119 0.101 0.114 0.139 0.119 0.134
n = 100 0.087 0.069 0.089 0.087 0.069 0.089 0.096 0.084 0.099 0.087 0.070 0.088 0.097 0.085 0.099
n = 500 0.057 0.032 0.057 0.057 0.032 0.057 0.050 0.041 0.052 0.057 0.032 0.057 0.050 0.041 0.052
T = 20 n = 50 0.081 0.072 0.078 0.077 0.071 0.077 0.085 0.076 0.081 0.081 0.075 0.078 0.087 0.077 0.082
n = 100 0.058 0.051 0.061 0.057 0.050 0.061 0.060 0.055 0.063 0.058 0.051 0.061 0.060 0.055 0.063
n = 500 0.032 0.022 0.032 0.032 0.022 0.032 0.029 0.023 0.029 0.032 0.022 0.032 0.029 0.023 0.029
T = 50 n = 50 0.047 0.043 0.050 0.047 0.043 0.050 0.048 0.044 0.050 0.048 0.044 0.051 0.048 0.044 0.050
n = 100 0.035 0.031 0.034 0.035 0.031 0.035 0.035 0.032 0.035 0.035 0.031 0.034 0.035 0.032 0.035
n = 500 0.017 0.014 0.018 0.017 0.014 0.018 0.016 0.014 0.016 0.017 0.014 0.018 0.016 0.014 0.016
T = 100 n = 50 0.035 0.031 0.033 0.034 0.031 0.033 0.035 0.031 0.033 0.036 0.031 0.033 0.035 0.031 0.033
n = 100 0.025 0.022 0.024 0.024 0.022 0.024 0.025 0.022 0.024 0.025 0.022 0.024 0.025 0.022 0.024
n = 500 0.011 0.010 0.012 0.011 0.010 0.012 0.011 0.010 0.011 0.011 0.010 0.012 0.011 0.010 0.011
Bias, Standard Deviation, and Root Mean Squared Error. Standard normal innovation model.
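As an internal consistency check on the table above, the reported RMSE should equal √(Bias² + SD²) up to rounding. The snippet below verifies this for two entries of the first column block (T = 10, τ = 0.25, n = 50 and n = 500), with the triples copied from the table:

```python
import math

# (bias, sd, rmse) triples copied from the table: T = 10, tau = 0.25,
# n = 50 and n = 500, first estimator column.
entries = [(0.047, 0.107, 0.117), (0.046, 0.034, 0.057)]
for bias, sd, rmse in entries:
    implied = math.sqrt(bias**2 + sd**2)
    print(round(implied, 3), rmse)   # implied RMSE vs. reported RMSE
```

Both implied values agree with the reported ones to the three decimals shown.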
Estimators: β̂, β̂^KB, β̂₁, β̂₁^{=2}, β̂₁^{KB} (for each, τ = 0.25, 0.50, 0.75)
BIAS T = 10 n = 50 0.034 -0.042 -0.205 0.028 -0.039 -0.210 0.024 0.031 -0.103 0.033 -0.032 -0.172 0.030 0.028 -0.098
n = 100 0.032 -0.046 -0.204 0.031 -0.043 -0.204 0.017 0.012 -0.081 0.033 -0.046 -0.201 0.017 0.010 -0.081
n = 500 0.030 -0.045 -0.200 0.030 -0.045 -0.200 0.024 0.022 -0.071 0.030 -0.045 -0.200 0.024 0.022 -0.071
T = 20 n = 50 0.010 -0.024 -0.095 0.012 -0.023 -0.093 0.002 0.017 -0.003 0.013 -0.014 -0.074 0.000 0.016 -0.005
n = 100 0.011 -0.026 -0.100 0.013 -0.024 -0.098 0.001 0.017 -0.012 0.011 -0.026 -0.099 -0.001 0.015 -0.015
n = 500 0.010 -0.026 -0.096 0.010 -0.026 -0.096 0.001 0.014 -0.011 0.010 -0.026 -0.096 0.001 0.014 -0.011
T = 50 n = 50 0.000 -0.017 -0.051 -0.004 -0.019 -0.056 -0.008 -0.003 -0.021 0.002 -0.009 -0.033 -0.004 -0.001 -0.016
n = 100 0.001 -0.015 -0.043 0.001 -0.015 -0.044 -0.002 0.000 -0.009 0.002 -0.014 -0.041 -0.001 0.000 -0.008
n = 500 0.004 -0.011 -0.040 0.004 -0.011 -0.040 0.002 0.005 -0.005 0.004 -0.011 -0.040 0.002 0.005 -0.005
T = 100 n = 50 -0.001 -0.008 -0.024 -0.002 -0.011 -0.026 -0.004 -0.005 -0.009 0.003 -0.002 -0.010 -0.002 -0.002 -0.007
n = 100 0.003 -0.006 -0.019 0.003 -0.006 -0.018 0.002 0.001 -0.001 0.004 -0.005 -0.016 0.002 0.001 -0.002
n = 500 0.002 -0.006 -0.019 0.002 -0.006 -0.019 0.002 0.001 -0.002 0.002 -0.006 -0.019 0.002 0.001 -0.002
SD T = 10 n = 50 0.168 0.223 0.304 0.169 0.222 0.304 0.273 0.325 0.514 0.172 0.229 0.310 0.272 0.325 0.515
n = 100 0.121 0.151 0.209 0.121 0.151 0.209 0.200 0.232 0.366 0.121 0.152 0.210 0.200 0.232 0.366
n = 500 0.055 0.070 0.096 0.055 0.070 0.096 0.088 0.105 0.165 0.055 0.070 0.096 0.088 0.105 0.165
T = 20 n = 50 0.126 0.156 0.220 0.126 0.156 0.220 0.157 0.184 0.285 0.130 0.159 0.228 0.157 0.184 0.285
n = 100 0.093 0.110 0.156 0.093 0.110 0.156 0.115 0.133 0.206 0.093 0.110 0.156 0.115 0.133 0.207
n = 500 0.041 0.050 0.069 0.041 0.050 0.069 0.051 0.060 0.090 0.041 0.050 0.069 0.051 0.060 0.090
T = 50 n = 50 0.077 0.096 0.130 0.078 0.096 0.131 0.083 0.101 0.142 0.080 0.100 0.135 0.082 0.101 0.142
n = 100 0.054 0.069 0.098 0.054 0.069 0.098 0.058 0.074 0.107 0.054 0.070 0.099 0.058 0.074 0.107
n = 500 0.024 0.031 0.044 0.024 0.031 0.044 0.026 0.032 0.048 0.024 0.031 0.044 0.026 0.032 0.048
T = 100 n = 50 0.053 0.069 0.102 0.053 0.069 0.102 0.055 0.071 0.106 0.055 0.071 0.105 0.055 0.071 0.107
n = 100 0.038 0.048 0.069 0.038 0.048 0.069 0.039 0.049 0.072 0.038 0.048 0.070 0.039 0.049 0.072
n = 500 0.017 0.021 0.031 0.017 0.021 0.031 0.017 0.022 0.032 0.017 0.021 0.031 0.017 0.022 0.032
RMSE T = 10 n = 50 0.172 0.226 0.367 0.171 0.226 0.369 0.274 0.327 0.524 0.175 0.231 0.355 0.274 0.326 0.524
n = 100 0.125 0.158 0.292 0.125 0.157 0.293 0.201 0.232 0.375 0.126 0.158 0.291 0.201 0.232 0.374
n = 500 0.063 0.083 0.222 0.063 0.083 0.222 0.092 0.107 0.180 0.063 0.083 0.222 0.092 0.107 0.180
T = 20 n = 50 0.127 0.158 0.239 0.126 0.157 0.239 0.157 0.184 0.285 0.130 0.160 0.239 0.157 0.185 0.285
n = 100 0.093 0.113 0.186 0.094 0.113 0.184 0.115 0.134 0.207 0.094 0.113 0.184 0.115 0.134 0.207
n = 500 0.042 0.057 0.118 0.042 0.057 0.118 0.051 0.062 0.090 0.042 0.057 0.118 0.051 0.062 0.090
T = 50 n = 50 0.077 0.097 0.140 0.078 0.098 0.142 0.083 0.101 0.143 0.080 0.100 0.139 0.082 0.101 0.142
n = 100 0.054 0.071 0.107 0.054 0.071 0.108 0.058 0.074 0.107 0.054 0.071 0.107 0.058 0.074 0.107
n = 500 0.025 0.032 0.060 0.025 0.032 0.060 0.026 0.033 0.048 0.025 0.032 0.060 0.026 0.033 0.048
T = 100 n = 50 0.053 0.070 0.105 0.053 0.070 0.106 0.055 0.071 0.107 0.055 0.071 0.106 0.055 0.071 0.107
n = 100 0.038 0.048 0.072 0.038 0.048 0.072 0.039 0.049 0.072 0.038 0.048 0.071 0.039 0.049 0.072
n = 500 0.017 0.022 0.036 0.017 0.022 0.036 0.018 0.022 0.032 0.017 0.022 0.036 0.018 0.022 0.032
Bias, Standard Deviation, and Root Mean Squared Error. Chi-square innovation model.
Estimators: β̂, β̂^KB, β̂₁, β̂₁^{=2}, β̂₁^{KB} (for each, τ = 0.25, 0.50, 0.75)
BIAS T = 10 n = 50 0.067 0.004 -0.060 0.064 -0.002 -0.063 0.042 -0.002 -0.045 0.067 0.009 -0.047 0.044 0.005 -0.042
n = 100 0.059 0.000 -0.066 0.058 -0.002 -0.066 0.035 -0.004 -0.046 0.059 0.001 -0.064 0.035 -0.003 -0.046
n = 500 0.064 0.000 -0.062 0.064 0.000 -0.062 0.041 -0.003 -0.040 0.064 0.000 -0.062 0.041 -0.003 -0.040
T = 20 n = 50 0.035 0.008 -0.028 0.022 0.005 -0.028 0.003 0.004 -0.011 0.030 0.013 -0.018 0.016 0.007 -0.012
n = 100 0.026 0.000 -0.026 0.023 0.000 -0.025 0.002 0.000 -0.009 0.027 0.001 -0.025 0.005 0.000 -0.010
n = 500 0.029 0.001 -0.030 0.029 0.001 -0.030 0.012 0.000 -0.015 0.029 0.001 -0.030 0.012 0.000 -0.015
T = 50 n = 50 0.011 -0.001 -0.012 0.010 -0.002 -0.014 0.001 -0.002 -0.006 0.013 0.003 -0.007 0.003 -0.002 -0.004
n = 100 0.012 0.001 -0.011 0.012 0.001 -0.011 0.003 0.001 -0.002 0.013 0.001 -0.010 0.003 0.001 -0.002
n = 500 0.012 0.001 -0.010 0.012 0.001 -0.010 0.003 0.001 -0.002 0.012 0.001 -0.010 0.003 0.001 -0.002
T = 100 n = 50 0.005 0.000 -0.004 0.001 -0.001 -0.005 -0.003 -0.001 0.000 0.005 0.002 0.002 0.001 0.000 0.000
n = 100 0.006 -0.001 -0.006 0.005 -0.001 -0.006 0.000 -0.001 -0.002 0.006 -0.001 -0.006 0.001 -0.001 -0.002
n = 500 0.005 0.000 -0.005 0.005 0.000 -0.005 0.001 0.000 -0.001 0.005 0.000 -0.005 0.001 0.000 -0.001
SD T = 10 n = 50 0.137 0.126 0.137 0.138 0.124 0.136 0.189 0.178 0.181 0.140 0.128 0.138 0.189 0.180 0.181
n = 100 0.093 0.088 0.093 0.093 0.088 0.093 0.126 0.113 0.124 0.093 0.089 0.093 0.126 0.113 0.124
n = 500 0.045 0.040 0.042 0.045 0.040 0.042 0.059 0.051 0.055 0.045 0.040 0.042 0.059 0.051 0.055
T = 20 n = 50 0.101 0.090 0.096 0.100 0.090 0.096 0.113 0.099 0.109 0.103 0.092 0.098 0.114 0.100 0.109
n = 100 0.068 0.067 0.070 0.068 0.066 0.070 0.078 0.074 0.079 0.069 0.067 0.071 0.078 0.074 0.079
n = 500 0.031 0.029 0.031 0.031 0.029 0.031 0.035 0.033 0.035 0.031 0.029 0.031 0.035 0.033 0.035
T = 50 n = 50 0.061 0.056 0.061 0.061 0.056 0.062 0.064 0.057 0.064 0.063 0.058 0.063 0.064 0.058 0.064
n = 100 0.042 0.040 0.042 0.042 0.040 0.042 0.044 0.041 0.044 0.043 0.040 0.042 0.044 0.041 0.044
n = 500 0.019 0.018 0.019 0.019 0.018 0.019 0.020 0.018 0.020 0.019 0.018 0.019 0.020 0.018 0.020
T = 100 n = 50 0.043 0.039 0.041 0.043 0.039 0.041 0.044 0.039 0.042 0.044 0.040 0.042 0.044 0.039 0.042
n = 100 0.030 0.028 0.030 0.030 0.028 0.030 0.031 0.028 0.031 0.030 0.028 0.030 0.031 0.028 0.031
n = 500 0.014 0.013 0.013 0.014 0.013 0.013 0.014 0.013 0.014 0.014 0.013 0.013 0.014 0.013 0.014
RMSE T = 10 n = 50 0.153 0.126 0.149 0.152 0.124 0.150 0.193 0.178 0.187 0.155 0.128 0.146 0.194 0.180 0.186
n = 100 0.110 0.088 0.114 0.110 0.088 0.114 0.131 0.113 0.132 0.111 0.089 0.113 0.131 0.113 0.132
n = 500 0.078 0.040 0.074 0.078 0.040 0.074 0.071 0.051 0.068 0.078 0.040 0.074 0.071 0.051 0.068
T = 20 n = 50 0.106 0.091 0.100 0.102 0.090 0.100 0.113 0.099 0.110 0.107 0.093 0.100 0.115 0.100 0.110
n = 100 0.073 0.067 0.075 0.072 0.066 0.075 0.078 0.074 0.079 0.074 0.067 0.075 0.078 0.074 0.079
n = 500 0.042 0.029 0.043 0.042 0.029 0.043 0.036 0.033 0.038 0.042 0.029 0.043 0.036 0.033 0.038
T = 50 n = 50 0.062 0.056 0.063 0.062 0.056 0.063 0.064 0.058 0.065 0.064 0.058 0.064 0.064 0.058 0.064
n = 100 0.044 0.040 0.043 0.044 0.040 0.043 0.045 0.041 0.044 0.044 0.040 0.043 0.045 0.042 0.044
n = 500 0.023 0.018 0.022 0.023 0.018 0.022 0.020 0.018 0.020 0.023 0.018 0.022 0.020 0.018 0.020
T = 100 n = 50 0.043 0.039 0.041 0.043 0.039 0.041 0.044 0.039 0.042 0.045 0.040 0.042 0.044 0.039 0.042
n = 100 0.031 0.028 0.031 0.031 0.028 0.031 0.031 0.028 0.031 0.031 0.028 0.031 0.031 0.028 0.031
n = 500 0.015 0.013 0.014 0.015 0.013 0.014 0.014 0.013 0.014 0.015 0.013 0.014 0.014 0.013 0.014
Bias, Standard Deviation, and Root Mean Squared Error. Standard normal innovation model. θ = 1.
Estimators: β̂, β̂^KB, β̂₁, β̂₁^{=2}, β̂₁^{KB} (for each, τ = 0.25, 0.50, 0.75)
BIAS T = 10 n = 50 0.062 -0.056 -0.267 0.054 -0.054 -0.273 -0.028 -0.051 -0.227 0.059 -0.045 -0.224 -0.020 -0.054 -0.221
n = 100 0.051 -0.064 -0.261 0.050 -0.061 -0.262 -0.053 -0.062 -0.242 0.053 -0.063 -0.257 -0.052 -0.065 -0.242
n = 500 0.050 -0.056 -0.262 0.050 -0.056 -0.262 -0.030 -0.049 -0.227 0.050 -0.056 -0.262 -0.030 -0.049 -0.227
T = 20 n = 50 0.020 -0.028 -0.123 0.022 -0.027 -0.121 -0.030 -0.012 -0.051 0.024 -0.015 -0.094 -0.032 -0.013 -0.052
n = 100 0.015 -0.036 -0.129 0.017 -0.034 -0.126 -0.039 -0.017 -0.041 0.015 -0.035 -0.126 -0.041 -0.019 -0.044
n = 500 0.016 -0.034 -0.130 0.016 -0.034 -0.130 -0.035 -0.014 -0.046 0.016 -0.034 -0.130 -0.035 -0.014 -0.046
T = 50 n = 50 -0.002 -0.019 -0.067 -0.008 -0.021 -0.073 -0.025 -0.009 -0.030 -0.001 -0.009 -0.044 -0.018 -0.006 -0.023
n = 100 0.005 -0.019 -0.054 0.004 -0.017 -0.054 -0.011 -0.005 -0.012 0.006 -0.016 -0.050 -0.011 -0.006 -0.011
n = 500 0.006 -0.012 -0.051 0.006 -0.012 -0.051 -0.010 0.001 -0.009 0.006 -0.012 -0.051 -0.010 0.001 -0.009
T = 100 n = 50 -0.001 -0.013 -0.033 -0.003 -0.016 -0.036 -0.010 -0.009 -0.015 0.004 -0.004 -0.015 -0.008 -0.006 -0.012
n = 100 0.004 -0.007 -0.026 0.004 -0.006 -0.025 -0.002 0.001 -0.002 0.005 -0.004 -0.022 -0.002 0.000 -0.003
n = 500 0.002 -0.007 -0.026 0.002 -0.007 -0.026 -0.004 0.000 -0.003 0.002 -0.007 -0.026 -0.004 0.000 -0.003
SD T = 10 n = 50 0.225 0.283 0.381 0.224 0.281 0.381 0.429 0.482 0.710 0.230 0.291 0.390 0.429 0.482 0.711
n = 100 0.161 0.202 0.261 0.161 0.201 0.261 0.313 0.353 0.530 0.161 0.202 0.262 0.313 0.353 0.530
n = 500 0.072 0.089 0.117 0.072 0.089 0.117 0.139 0.158 0.231 0.072 0.089 0.117 0.139 0.158 0.231
T = 20 n = 50 0.175 0.204 0.279 0.174 0.204 0.279 0.232 0.255 0.422 0.180 0.211 0.291 0.233 0.256 0.421
n = 100 0.121 0.147 0.199 0.121 0.147 0.199 0.162 0.193 0.295 0.122 0.148 0.200 0.162 0.193 0.295
n = 500 0.052 0.061 0.086 0.052 0.061 0.086 0.071 0.079 0.127 0.052 0.061 0.086 0.071 0.079 0.127
T = 50 n = 50 0.101 0.126 0.169 0.101 0.126 0.169 0.111 0.139 0.197 0.105 0.131 0.174 0.111 0.139 0.197
n = 100 0.073 0.091 0.125 0.073 0.091 0.125 0.081 0.100 0.146 0.073 0.091 0.125 0.081 0.100 0.146
n = 500 0.032 0.041 0.056 0.032 0.041 0.056 0.036 0.045 0.065 0.032 0.041 0.056 0.036 0.045 0.065
T = 100 n = 50 0.073 0.088 0.126 0.073 0.089 0.126 0.077 0.092 0.135 0.075 0.092 0.130 0.076 0.092 0.135
n = 100 0.052 0.063 0.087 0.052 0.063 0.087 0.054 0.066 0.093 0.052 0.063 0.087 0.054 0.066 0.093
n = 500 0.024 0.029 0.041 0.024 0.029 0.041 0.024 0.031 0.044 0.024 0.029 0.041 0.024 0.031 0.044
RMSE T = 10 n = 50 0.233 0.288 0.466 0.231 0.286 0.468 0.430 0.485 0.745 0.237 0.295 0.450 0.430 0.486 0.744
n = 100 0.169 0.212 0.369 0.168 0.210 0.370 0.317 0.358 0.583 0.169 0.212 0.367 0.317 0.359 0.583
n = 500 0.088 0.104 0.287 0.088 0.104 0.287 0.142 0.165 0.324 0.088 0.104 0.287 0.142 0.165 0.324
T = 20 n = 50 0.176 0.206 0.305 0.175 0.205 0.304 0.234 0.255 0.425 0.182 0.211 0.306 0.235 0.256 0.425
n = 100 0.122 0.152 0.238 0.122 0.151 0.236 0.167 0.194 0.298 0.123 0.152 0.236 0.167 0.194 0.298
n = 500 0.054 0.070 0.156 0.054 0.070 0.156 0.079 0.080 0.135 0.054 0.070 0.156 0.079 0.080 0.135
T = 50 n = 50 0.101 0.127 0.182 0.101 0.128 0.184 0.114 0.140 0.199 0.105 0.131 0.179 0.113 0.139 0.198
n = 100 0.073 0.093 0.136 0.073 0.092 0.136 0.082 0.100 0.146 0.074 0.092 0.135 0.082 0.100 0.146
n = 500 0.032 0.043 0.076 0.032 0.043 0.076 0.037 0.045 0.066 0.032 0.043 0.076 0.037 0.045 0.066
T = 100 n = 50 0.073 0.089 0.131 0.073 0.090 0.131 0.077 0.093 0.136 0.075 0.092 0.131 0.077 0.092 0.136
n = 100 0.052 0.063 0.090 0.052 0.063 0.090 0.054 0.066 0.093 0.052 0.063 0.090 0.054 0.066 0.093
n = 500 0.024 0.030 0.048 0.024 0.030 0.048 0.025 0.031 0.044 0.024 0.030 0.048 0.025 0.031 0.044
Bias, Standard Deviation, and Root Mean Squared Error. Chi-square innovation model.