Bootstrapping factor-augmented regression models

advertisement
Bootstrapping factor-augmented regression models
Sílvia Gonçalves and Benoit Perron
Département de sciences économiques, CIREQ and CIRANO, Université de Montréal
November 9, 2011
Abstract
The main contribution of this paper is to propose and theoretically justify bootstrap methods
for regressions where some of the regressors p
are factors estimated from a large panel of data. We
derive our results under the assumption that T =N ! c, where 0 c < 1 (N and T are the crosssectional and the time series dimensions, respectively), thus allowing for the possibility that factors
estimation error enters the limiting distribution of the OLS estimator. We consider general residualbased bootstrap methods and provide a set of high level conditions on the bootstrap residuals and
on the idiosyncratic errors such that the bootstrap distribution of the OLS estimator is consistent.
We subsequently verify these conditions for a simple wild bootstrap residual-based procedure.
Our main results can be summarized as follows. When c = 0, as in Bai and Ng (2006), the
crucial condition for bootstrap validity is the ability of the bootstrap regression scores to mimic
the serial dependence of the original regression scores. Mimicking the cross sectional and/or serial
dependence of the idiosyncratic errors in the panel factor model is asymptotically irrelevant in
this case since the limiting distribution of the original OLS estimator does not depend on these
dependencies. Instead, when c > 0, a two-step residual-based bootstrap is required to capture the
factors estimation uncertainty, which shows up as an asymptotic bias term (as we show here and
as was recently discussed by Ludvigson and Ng (2009b)). Because the bias depends on the cross
sectional dependence of the idiosyncratic error term, bootstrap validity depends crucially on the
ability of the bootstrap panel factor model to capture this cross sectional dependence.
Keywords: factor model, bootstrap, asymptotic bias.
1
Introduction
Factor-augmented regressions where some of the regressors, called factors, are estimated from a large
set of data are increasingly popular in empirical work. Inference in these models is complicated by
the fact that the regressors are estimated and thus measured with error. Recently, Bai and Ng (2006)
derived the asymptotic distribution of the OLS estimator in this case under a set of standard regularity
conditions. In particular, they show that the asymptotic distribution of the OLS estimator is una¤ected
We would like to thank participants at the Conference in honor of Halbert White (San Diego, May 2011), the
Conference in honor of Ron Gallant (Toulouse, May 2011), the Conference in honor of Hashem Pesaran (Cambridge,
July 2011), the Fifth CIREQ time series conference (Montreal, May 2011), the 17th International Panel Data conference
(Montreal, July 2011), and the NBER/NSF Time Series conference at MSU (East Lansing, September 2011) as well
as seminar participants at HEC Montreal, Syracuse University, Georgetown University, Tilburg University, Rochester
University, Cornell University, University of British Columbia and University of Victoria. We also thank Valentina
Corradi, Nikolay Gospodinov, Guido Kursteiner and Serena Ng for very useful comments and discussions. Gonçalves
acknowledges …nancial support from the NSERC and MITACS whereas Perron acknowledges …nancial support from the
SSHRCC and MITACS.
1
by the estimation of the factors when
p
T =N ! 0; where N and T are the cross-sectional and the
time series dimensions, respectively. While their simulation study does not consider inference on the
coe¢ cients themselves (they look at the conditional mean), they report noticeable size distortions in
some situations.
The main contribution of this paper is to propose and theoretically justify bootstrap methods for
inference in the context of the factor-augmented regression model. Shintani and Guo (2011) prove the
validity of the bootstrap to carry out inference on the persistence of factors, but not in the context of
factor-augmented regressions. Recent empirical applications of the bootstrap in this context include
Ludvigson and Ng (2007, 2009a,b) and Gospodinov and Ng (2011), where the bootstrap has been
used in the context of predictability tests based on factor-augmented regressions without theoretical
justi…cation. Our main contribution is to establish the …rst order asymptotic validity of the bootstrap
for factor-augmented regression models under assumptions similar to those of Bai and Ng (2006) but
p
p
without the condition that T =N ! 0: Speci…cally, we assume that T =N ! c, where 0 c < 1,
thus allowing for the possibility that factor estimation error a¤ects the limiting distribution of the
OLS estimator. As it turns out, when c > 0, an asymptotic bias term appears in the distribution,
re‡ecting the contribution of factors estimation uncertainty. This bias problem was recently discussed
by Ludvigson and Ng (2009b), who proposed an analytical bias correction procedure. Instead, here
we focus on the bootstrap and provide a set of conditions under which it can reproduce the whole
limiting distribution of the OLS estimator, including the bias term.
The bootstrap method we propose is made up of two main steps. In a …rst step, we obtain a
bootstrap panel data set from which we estimate the bootstrap factors by principal components.
The bootstrap panel observations are generated by adding the estimated common components from
the original panel and bootstrap idiosyncratic residuals. In a second step, we generate a bootstrap
version of the response variable by again relying on a residual-based bootstrap where the bootstrap
observations of the dependent variable are obtained by summing the estimated regression mean and a
bootstrap regression residual. To mimic the fact that in the original regression model the true factors
are latent and need to be estimated, we regress the bootstrap response variable on the estimated
bootstrap factors. This produces a bootstrap OLS estimator whose bootstrap distribution can be used
to replicate the distribution of the OLS estimator.
We …rst provide a set of high level conditions on the bootstrap residuals and idiosyncratic errors
that allow us to characterize the limiting distribution of the bootstrap OLS estimator under the
p
assumption that T =N ! c; where 0 c < 1. These high level conditions essentially require that
the bootstrap idiosyncratic errors be weakly dependent across individuals and over time and that the
bootstrap regression scores satisfy a central limit theorem. We then verify these high level conditions
for a residual-based wild bootstrap scheme, where the wild bootstrap is used to generate the bootstrap
idiosyncratic error term in the …rst step, and also in the second step when generating the regression
residuals. The two steps are performed independently of each other.
2
A crucial result in proving the …rst order asymptotic validity of the bootstrap in this context is the
consistency of the bootstrap principal component estimator. Given our residual-based bootstrap, the
“latent” factors underlying the bootstrap data generating process (DGP) are given by the estimated
factors. Nevertheless, these are not identi…ed by the bootstrap principal component estimator due to
the well-known identi…cation problem of factor models. By relying on results of Bai and Ng (2011) (see
also Stock and Watson (2002)), we show that the bootstrap estimated factors identify the estimated
factors up to a change of sign. Contrary to the rotation indeterminacy problem that a¤ects the
principal component estimator, this sign indetermination is easily resolved in the bootstrap world,
where the bootstrap rotation matrix depends on bootstrap population values that are functions of the
original data. As a consequence, to bootstrap the distribution of OLS estimator, our proposal is to
rotate the bootstrap OLS estimator using the feasible bootstrap rotation matrix. This amounts to
sign-adjusting the bootstrap OLS regression estimates asymptotically.
Our results can be summarized as follows. When c = 0, as in Bai and Ng (2006), the crucial
condition for bootstrap validity is the ability of the bootstrap regression scores to mimic the serial
dependence and heteroskedasticity of the original regression scores. For instance, a wild bootstrap
in the second-step of our residual-based bootstrap is appropriate if we assume the regression errors
to be a possibly heteroskedastic martingale di¤erence sequence (as in Bai and Ng (2006)). Under a
more general dependence assumption, the wild bootstrap is not appropriate and we should instead
consider a block bootstrap. We do not pursue this possibility here but note that our bootstrap high
level conditions would be useful in establishing the validity of the block bootstrap in this context as
well. Mimicking the cross sectional and/or serial dependence of the idiosyncratic errors in the panel
factor model is asymptotically irrelevant when c = 0 since the limiting distribution of the original OLS
estimator does not depend on these dependencies under this condition. Thus, a wild bootstrap in the
…rst step of the residual-based bootstrap is asymptotically valid under the general setup of Bai and
Ng (2006) that allows for weak time series and cross sectional dependence in the idiosyncratic error
term. In fact, a simple one-step residual-based bootstrap method that does not take into account the
factor estimation uncertainty in the bootstrap samples (i.e. a bootstrap method based only on the
second step of our proposed method) is asymptotically valid when c = 0.1 Instead, when c > 0, a twostep residual-based bootstrap is required to capture the asymptotic bias term that re‡ects the factors
estimation error. Because the bias depends on the cross sectional dependence of the idiosyncratic
error term, bootstrap validity depends crucially on the ability of the bootstrap panel factor model to
capture this cross sectional dependence. Since the wild bootstrap generates bootstrap idiosyncratic
errors that are heteroskedastic but independent along the time and cross sectional dimensions, this
method is consistent only under cross sectional independence of the idiosyncratic errors.
The rest of the paper is organized as follows. In Section 2, we …rst describe the setup and the
1
We do not consider this here because factor estimation uncertainty has an impact in …nite samples. For instance,
Yamamoto (2011) compares bootstrap methods with and without factor estimation for the factor-augmented vector
autoregression (FAVAR) model and concludes that the latter is worse than the …rst in terms of …nite sample accuracy.
3
assumptions, and then derive the asymptotic theory of the OLS estimator when
p
T =N ! c: In Section
3, we introduce the residual-based bootstrap method and characterize a set of conditions under which
the bootstrap distribution consistency follows. Section 4 proposes a wild bootstrap implementation of
the residual-based bootstrap and proves its consistency. Section 5 discusses the Monte Carlo results
and Section 6 concludes. Three mathematical appendices are included. Appendix A contains the
proofs of the results in Section 2, Appendix B the proofs of the results in Section 3, and Appendix C
the proofs of the results in Section 4.
A word on notation. As usual in the bootstrap literature, we use P
probability measure, conditional on a given sample.
to denote the bootstrap
For any bootstrap statistic TN T , we write
TN T = oP (1), in probability, or TN T !P 0, in probability, when for any
oP (1). We write TN T = OP (1), in probability, when for all
> 0, P (jTN T j > ) =
> 0 there exists M < 1 such that
limN;T !1 P [P (jTN T j > M ) > ] = 0: Finally, we write TN T !d D; in probability, if conditional on
a sample with probability that converges to one, TN T weakly converges to the distribution D under
P , i.e. E (f (TN T )) !P E (f (D)) for all bounded and uniformly continuous functions f .
2
Asymptotic theory when
p
T =N ! c; where 0
c<1
We consider the following regression model
yt+h =
where h
0
0
Ft +
Wt + "t+h ; t = 1; : : : ; T
h;
(1)
0: The q observed regressors are contained in Wt . The r unobserved regressors Ft are the
common factors in the following panel factor model,
Xit =
where the r
1 vector
0
i Ft
+ eit ;
i = 1; : : : ; N; t = 1; : : : ; T;
(2)
contains the factor loadings and eit is an idiosyncratic error term. In matrix
i
form, we can write (2) as
X=F
where X is a T
factors,
=(
0
+ e;
N matrix of stationary data, F = (F1 ; : : : ; FT )0 is T
1; : : : ;
0
N)
is N
r, and e is T
r, r is the number of common
N:
The factor-augmented regression model described in (1) and (2) has recently attracted a lot of
attention in econometrics. One of the …rst papers to discuss this model in the forecasting context
was Stock and Watson (2002): Recent empirical applications include Ludvigson and Ng (2007) who
consider predictive regressions of excess stock returns and augment the usual set of predictors by
including estimated factors from a large panel of macro and …nancial variables, Ludvigson and Ng
(2009a,b) who consider this approach in the context of predictive regressions of bond excess returns,
Gospodinov and Ng (2011) who study predictive regressions for in‡ation using principal components
from a panel of commodity convenience yields, and Eichengreen, Mody, Nedeljkovic, and Sarno (2009)
4
who use common factors extracted from credit default swap (CDS) spreads during the recent …nancial
crisis to look at spillovers across banks.
Estimation proceeds in two steps. Given X, we estimate F and
with the method of principal
0
r matrix F~ = F~1 : : : F~T
composed
components. In particular, F is estimated with the T
p
of T times the eigenvectors corresponding to the r largest eigenvalues of of XX 0 =T N (arranged in
decreasing order), where the normalization
loadings is then ~ = ~ 1 ; : : : ; ~ N
0
= X 0 F~
F~ 0 F~
T
~
F 0 F~
= Ir is used. The matrix containing the estimated
1
= X 0 F~ =T:
In the second step, we run an OLS regression of yt+h on z^t =
^
^
^
T
Xh
=
z^t z^t0
t=1
where ^ is p
!
1T h
X
F~t0 Wt0
0
, i.e. we compute
z^t yt+h ;
(3)
t=1
1 with p = r + q:
As is well known in this literature, the principal components F~t can only consistently estimate a
transformation of the true factors Ft , given by HFt ; where H is a rotation matrix de…ned as
H = V~
where V~ is the r
~0F
0
T
N
1F
;
(4)
r diagonal matrix containing on the main diagonal the r largest eigenvalues of
XX 0 =N T , in decreasing order.
One important implication is that ^ consistently estimates
0H 1
0
0
; and not
0
0
0
:
In particular, given (1), adding and subtracting appropriately yields
yt+h =
or, equivalently,
0H 1
|
{z
=
0
0
F~t
+
} Wt
| {z }
0
H
1
HFt
F~t + "t+h ;
=^
zt
yt+h = z^t0 +
0
H
1
F~t + "t+h ;
HFt
(5)
where the second term represents the contribution from estimating the factors.
p
Recently, Bai and Ng (2006) derived the asymptotic distribution of T ^
under a set of
p
regularity conditions and the assumption that T =N ! 0. Our goal in this section is to derive the
p
limiting distribution of ^ under the assumption that T =N ! c; where c is not necessarily zero. We
use the following assumptions, which are similar to Bai’s (2003) assumptions and slightly weaker than
Ft0 Wt0
the Bai and Ng (2006) assumptions. We let zt =
0
, where zt is p
1, with p = r + q.
Assumption 1 - Factors and factor loadings
(a) E kFt k4
M and
(b) The loadings
i
In either case,
1
T
PT
0
t=1 Ft Ft
!P
F
> 0; where
F
are either deterministic such that k i k
0
=N
!P
> 0, where
is a non-random r
M , or stochastic such that E k i k4
is a non-random matrix.
5
r matrix.
M:
(c) The eigenvalues of the r
r matrix (
F)
are distinct.
Assumption 2 - Time and cross section dependence and heteroskedasticity
(a) E (eit ) = 0, E jeit j8
M:
(b) E (eit ejs ) = ij;ts , j ij;ts j
ij for all (t; s) and j ij;ts j
P
1 PT
M; T t;s=1 ts M , and N1T t;s;i;j j ij;ts j M:
(c) For every (t; s), E N
1=2
PN
i=1 (eit eis
E (eit eis ))
4
ts
for all (i; j) such that N1
PN
ij
i;j=1
M:
Assumption 3 - Moments and weak dependence among fzt g, f i g and feit g.
(a) E
1
N
PN
p1
T
i=1
(d) E
p1
NT
1
T
PT
PT
t=1
p1
N
t
V ar
PT
s=1
2
0
t=1 zt et
(e) As N; T ! 1;
and
t=1 Ft eit
p1
TN
(b) For each t; E
(c) E
PT
PN
M , where E (Ft eit ) = 0 for all (i; t).
PN
i=1 zs (eit eis
2
E (eit eis ))
M; where zs =
Fs0 Ws0
0
:
M , where E zt 0i eit = 0 for all (i; t).
i eit
i=1
2
2
M; where E ( i eit ) = 0 for all (i; t).
1 PT PN PN
t=1
i=1
j=1
TN
PN
p1
e
:
i=1 i it
N
0
i j eit ejt
!P 0; where
limN;T !1
1
T
PT
t=1
t
> 0;
Assumption 4 - weak dependence between idiosyncratic errors and regression errors
(a) For each t and h
(b) E
p1
NT
0; E
h PN
i=1
t=1
PT
p1
TN
i eit "t+h
PT
2
h PN
s=1
i=1 "s+h (eit eis
E (eit eis ))
2
M:
M , where E ( i eit "t+h ) = 0 for all (i; t) :
Assumption 5 - moments and CLT for the score vector
(a) E ("t+h ) = 0 and E j"t+h j2 < M:
(b) E kzt k4
M and
(c) As T ! 1;
p1
T
1
T
PT
PT
0
t=1 zt zt
h
t=1 zt "t+h
limT !1 V ar
p1
T
PT
!P
zz
> 0:
!d N (0; ) ; where E
h
t=1 zt "t+h
> 0:
p1
T
PT
h
t=1 zt "t+h
2
< M , and
Assumption 1(a) imposes the assumption that factors are non-degenerate. Assumption 1(b) ensures
that each factor contributes non-trivially to the variance of Xt , i.e. the factors are pervasive and a¤ect
all cross sectional units. These assumptions ensure that there are r identi…able factors in the model.
Recently, Onatski (2011) considers a class of “weak” factor models, where the factor loadings are
modeled as local to zero. Under this assumption, the estimated factors are no longer consistent for
6
the unobserved (rotated) factors. In this paper, we do not consider this possibility. Assumption 1(c)
ensures that Q
p lim F~ 0 F=T is unique. Without this assumption, Q is only determined up to
orthogonal transformations. See Bai (2003, proof of his Proposition 1).
Assumption 2 imposes weak cross-sectional and serial dependence conditions in the idiosyncratic
error term eit . In particular, we allow for the possibility that eit is dependent across individual units
and over time, but we require that the degree of dependence decreases as the time and the cross
sectional distance (regardless of how it is de…ned) between observations increases. This assumption is
compatible with the approximate factor model of Chamberlain and Rothschild (1983) and Connor and
Korajczyk (1986, 1993), in which cross section units are weakly correlated. Assumption 2 allows for
heteroskedasticity in both dimensions and requires the idiosyncratic error term to have …nite eighth
moments.
Assumption 3 restricts the degree of dependence among the vector of regressors fzt g, the factor
loadings f i g and the idiosyncratic error terms feit g. If we assume that fzt g, f i g and feit g are
mutually independent (as in Bai and Ng (2006)), then Assumptions 3(a), 3(c) with zt = Ft and 3(d)
are implied by Assumptions 1 and 2. Similarly, Assumption 3(b) holds if we assume that fzt g and
feit g are independent and the following weak dependence condition on feit g holds
1 X X
jCov (eit eis ; ejl eju )j < < 1:
NT2
(6)
t;s;l;u i;j
Bai (2009) relies on a similar condition (part 1 of his Assumption C.4) to establish the asymptotic
properties of the interactive e¤ects estimator. As he explains, this condition is slightly stronger than
2
P
P
T 1=2 Tt=1 eit
the assumption that the second moment of N 1=2 N
i=1 i is bounded, where i
2
P
P
E T 1=2 Tt=1 eit (note that V ar N 1=2 N
i=1 i equals the left hand side of the above inequality
without the absolute value). It holds if eit is i.i.d. over i and t and E e4it < M: Assumptions 3(a)-3(c)
are equivalent to Assumptions D, F1 and F2 of Bai (2003) when zt = Ft .
To describe Assumption 3(e), for each t, let
t
N
1 X
p
N i=1
i eit ;
and
t
V ar ( t ) = E
0
t t
;
since we assume that E ( i eit ) = 0 for all (i; t). Assumption 3(e) requires that
1
T
PT
t=1
0
t t
E
0
t t
converges in probability to zero. This follows under weak dependence conditions on f i eit g over (i; t).
p
N F~t HFt , as shown by Bai (2003, cf. Theorem 1).
t is related to the asymptotic variance of
Assumption 4 imposes weak dependence between the idiosyncratic errors and the regression errors.
Part (a) holds if feit g is independent of f"t g and the weak dependence condition (6) holds. Similarly,
part (b) holds if f i g ; feis g and f"t g are three mutually independent groups of random variables and
Assumption 2 holds.
Assumption 5 imposes moment conditions on f"t+h g, on fzt g and on the score vector fzt "t+h g.
Part b) requires fzt zt0 g to satisfy a law of large numbers. Part c) requires the score to satisfy a
7
central limit theorem, where
denotes the limiting variance of the scaled average of the scores. The
dependence structure of the scores fzt "t+h g dictates the form of the covariance matrix estimator
to be used for inference on . For instance, Bai and Ng (2006) impose the assumption that "t+h
is a martingale di¤erence sequence with respect to fzt ; yt ; zt 1 ; yt 1 ; : : :g when h > 0: Under this
PT h
assumption,
= lim T1 t=1
E zt zt0 "2t+h ; and the appropriate covariance matrix estimator is an
heteroskedasticity robust variance estimator. As we will see later, the form of
will also dictate the
type of bootstrap we should use. In particular, under the martingale di¤erence sequence on "t+h , a
wild bootstrap method will be appropriate. Section 4 will consider this speci…c bootstrap scheme.
Given Assumptions 1-5, we can state our main result as follows. We introduce the following
notation:
H0
p lim H = V
Q
; Q
WF
=(
0) 1
0
1
zz
1
1
zz
0
W F~
If
WF
V
1
p lim
p
T =N ! c; with 0
!d N ( c
;
c < 1, then
);
zz
0
Q Q0 V
!
W 0 F~
T
1
0
0
+V
W F~ V
1
F~
1
F~ V
1
F~ V
H0 1
0
; with
;
=
0
W F H0 :
= 0, the asymptotic bias is equal to
c
^
=
c
F~
+V
F~ V
1
0
Theorem 2.1 gives the asymptotic distribution of
where 0
p lim V~ ; and
, and
=
F~
; V
= E (Wt Ft0 ).
Theorem 2.1 Let Assumptions 1-5 hold. If
p
T ^
where
p lim
!
diag (H0 ; Iq ) :
0
Additionally, we let
1
F~ 0 F
T
p
T ^
H0 1
0
:
under the condition that
p
T =N ! c;
c < 1: When c = 0, we obtain the same limiting distribution as in Bai and Ng (2006)
under a set of assumptions that is weaker than theirs, as we discussed above. As in Bai and Ng (2006),
factors estimation error does not contribute to the asymptotic distribution when c = 0. This is no
longer the case when c > 0. Under this alternative condition, an asymptotic bias appears, as was
recently discussed by Ludvigson and Ng (2009b) in the context of a simpler regression model without
observed regressors Wt . Our Theorem 2.1 complements their results by providing an expression of the
bias for ^ when the factor-augmented regression model includes also observed regressors in addition
to the unobserved factors Ft .
8
is proportional to H0 1
Several remarks follow. First, the expression for
0
= p lim ^ , implying
that when
= 0, no asymptotic bias exists independently of the value of c. Second, the asymptotic
bias for both ^ and ^ is a function of
V
F~
1
Q Q0 V
1
=
T
1X
V
N;T !1 T
lim
1
Q t Q0 V
1
;
t=1
where V
1Q
0
tQ V
1
is the asymptotic variance of
p
N F~t
HFt (see Bai (2003)). Thus, the bias
depends on the sampling variance-covariance matrix of the estimation error incurred by the principal
components estimator F~t , averaged over time. This variance matrix depends on the cross sectional
P
dependence of feit g via t V ar p1N N
i=1 i ei . As we will see next, the main implication for the
validity of the bootstrap is that it needs to reproduce this cross sectional dependence when c 6= 0
but not otherwise. Third, the existence of measurement error in F~t contaminates the estimators of
the remaining parameters , i.e. ^ is asymptotically biased due to the measurement error in F~t : The
asymptotic bias associated with ^ will be zero only in the special case when the observed regressors
and the factors are not correlated (i.e.
3
WF
= 0) (or when
= 0).
A general residual-based bootstrap
The main contribution of this section is to propose a general residual-based bootstrap method and
discuss its consistency for factor-augmented regression models under a set of su¢ cient high-level conditions on the bootstrap residuals. These high level conditions can be veri…ed for any bootstrap scheme
that resamples residuals. We verify these conditions for a two-step wild bootstrap scheme in Section
4.
3.1
Bootstrap data generating process and estimation
n
o
Let et = (e1t ; : : : ; eN t )0 denote a bootstrap sample from e~t = Xt ~ F~t and "t+h a bootstrap
n
o
0
sample from ^"t+h = yt+h ^ 0 F~t ^ Wt . We consider the following bootstrap DGP:
0
yt+h = ^ 0 F~t + ^ Wt + "t+h ; t = 1; : : : ; T
Xt
=
~ F~t + e ;
t
t = 1; : : : ; T:
h;
(7)
(8)
Estimation proceeds in two stages. First, we estimate the factors by the method of principal
components using the bootstrap panel data set fXt g. Second, we run a regression of yt+h on the
bootstrap estimated factors and on the …xed observed regressors Wt .
More speci…cally, given fXt g, we estimate the bootstrap factor loadings and the bootstrap factors
by minimizing the bootstrap objective function
T N
1 XX
V (F; ) =
Xit
TN
t=1 i=1
9
2
0
i Ft
subject to the normalization constraint that F 0 F=T = Ir : The T r matrix containing the estimated
0
bootstrap factors is denoted by F~ = F~1 ; : : : ; F~T and it is equal to the r eigenvectors of X X 0 =N T
p
(multiplied by T ) corresponding to the r largest eigenvalues. The N r matrix of estimated bootstrap
0
= X 0 F~ =T .
loadings is given by ~ = ~ ; : : : ; ~
N
1
Given the estimated bootstrap factors F~t , the second estimation stage is to regress yt+h on
0
z^t = F~t 0 Wt0 : Because the bootstrap scheme used to generate yt+h is residual-based, we …x
the observed regressors Wt in the bootstrap regression. We replace F~t with F~t to mimic the fact that
in the original regression model the factors Ft are latent and need to be estimated with F~t . This yields
the bootstrap OLS estimator
T h
1 X
z^t z^t 0
T
^ =
t=1
!
1
T h
1 X
z^t yt+h :
T
(9)
t=1
^ is the bootstrap analogue of ^, the OLS estimator based on the original sample.
3.2
Bootstrap high level conditions
In this section, we provide a set of high level conditions on feit g and
characterize the bootstrap distribution of ^ .
"t+h
that will allow us to
Condition A (weak time series and cross section dependence in eit )
(a) E (eit ) = 0 for all (i; t) :
(b)
1
T
(c)
1
T2
PT
t=1
PT
PT
2
s=1 j st j
t=1
PT
= OP (1), where
p1
N
s=1 E
PN
i=1 (eit eis
st
1
N
=E
E (eit eis ))
2
PN
i=1 eit eis
:
= OP (1) :
Condition B* (weak dependence among z^t ; ~ i ; and eit )
(a)
1
T
(b)
1
T
(c) E
(d)
1
T
(e)
1
T
PT
t=1
PT
PT
~ ~0
s=1 Fs Ft st
p1
TN
t=1 E
p1
TN
PT
PT
t=1
PT
t=1
s=1
PN
p1
N
t=1 E
PT
~ 0e
p t
N
= OP (1) :
PN
^s (eit eis
i=1 z
2
0
^t ~ i eit
i=1 z
PN ~
ie
i=1
et 0 ~
p
N
it
2
E (eit eis ))
2
= OP (1), where z^s =
1
T
(b) E
PT
t=1 E
p1
TN
p1
TN
PT
PT
= OP (1) :
= oP (1) ; in probability, where
h PN
s=1
i=1 "s+h (eit eis
h PN
t=1
i=1
:
= OP (1) :
1
T
Condition C* (weak dependence between eit and "t+h )
(a)
0
F~s0 Ws0
~ie "
it t+h
2
E (eit eis ))
2
PT
t=1 V
ar
= OP (1).
= OP (1), where E eit "t+h = 0 for all (i; t).
10
p1
N
~ 0e
t
> 0:
PT
h PT
~
t=1
s=1 Fs "t+h st
1
T
(c)
= OP (1), in probability.
Condition D* (bootstrap CLT)
(a) E
"t+h = 0 and
(b)
1=2 p1
T
PT
1
T
PT
h
t=1 E
h
^t "t+h
t=1 z
V ar
p1
T
PT
"t+h
2
= OP (1) :
!d N (0; Ip ), in probability, where E
h
^t "t+h
t=1 z
> 0:
p1
T
PT
h
^t "t+h
t=1 z
2
= OP (1), and
Conditions A*-D* are the bootstrap analogue of Assumptions 1 through 5. However, contrary
to Assumptions 1-5, which pertain to the data generating process and cannot be veri…ed in practice,
Conditions A*-D* can be veri…ed for any particular bootstrap algorithm used to generate the bootstrap
residuals and idiosyncratic error terms. More importantly, we can devise bootstrap schemes to verify
these conditions and hence ensure bootstrap validity. For instance, part (a) of Condition A* requires
the bootstrap mean of eit to be zero for all (i; t) whereas part (a) of Condition D* requires that the same
is true for "t . The practical implication is that we should make sure to construct bootstrap residuals
with mean zero, e.g. to recenter residuals before applying a nonparametric bootstrap method. Parts
b) and c) of Condition A* impose weak dependence conditions on feit g over (i; t). For instance, these
conditions are satis…ed if we resample feit g in an i.i.d. fashion over the two indices (i; t). Condition
B* imposes further restrictions on the dependence among z^t , ~ i and the idiosyncratic errors e . Since
it
z^t and ~ i are …xed in the bootstrap world, Condition B* is implied by appropriately restricting the
dependence of eit over (i; t). Similarly, Condition C* restricts the amount of dependence between
feis g and
"t+h . If these two sets of bootstrap innovations are independent of one another, then
weak dependence on feis g su¢ ces for Condition C* to hold. Finally, Condition D* requires the
bootstrap regression scores z^t "t+h to obey a central limit theorem in the bootstrap world.
3.3
Bootstrap results
Under Conditions A*-D*, we can show the consistency of the bootstrap principal component estimator
F~ for a rotated version of the true “latent” bootstrap factors F~ , a crucial result in proving the …rst
order asymptotic validity of the bootstrap in this context.
More speci…cally, according to (7)-(8), the common factors underlying the bootstrap panel data
fXt g are given by F~t (with ~ as factor loadings). Nevertheless, just as F~t estimates a rotation of Ft ,
the estimated bootstrap factors F~t estimate H F~t , where H is the bootstrap analogue of the rotation
matrix H de…ned in (4), i.e.
H = V~
where V~ is the r
X X
0 =N T ,
~ 0 F~ ~ 0 ~
;
T N
1F
(10)
r diagonal matrix containing on the main diagonal the r largest eigenvalues of
in decreasing order.
11
Lemma 3.1 Assume Assumptions 1-5 hold and suppose we generate bootstrap data
yt+h ; Xt
ac-
cording to the residual-based bootstrap DGP (7) and (8) by relying on bootstrap residuals "t+h and
fet g such that Conditions A*-D* are satis…ed. Then, as N; T ! 1,
T
1X ~
Ft
T
H F~t
2
= OP
2
NT
;
t=1
in probability, where
NT
= min
p
p
N; T :
According to Lemma 3.1, the time average of the squared deviations between the estimated bootstrap factors F~t and a rotation of the “latent”bootstrap factors given by H F~t vanishes in probability
under the bootstrap measure P
as N; T ! 1, conditional on a sample which lies in a set with
probability converging to one. Contrary to H, H does not depend on population values and can
be computed for any bootstrap sample, given the original sample. Hence, rotation indeterminacy is
not a problem in the bootstrap world. Because the bootstrap factor DGP (8) satis…es the constraints
that F~ 0 F~ =T = Ir and ~ 0 ~ is a diagonal matrix, we can actually show that H is asymptotically (as
N; T ! 1) equal to H0 = diag ( 1), a diagonal matrix with diagonal elements equal to 1, where the
sign of is determined by the sign of F~ 0 F~ =T (see Lemma B.1; the proof follows by arguments similar
to those used in Bai and Ng (2011) and Stock and Watson (2002)). Thus, the bootstrap factors are
identi…ed up to a change of sign.
The main implication from Lemma 3.1 is that the bootstrap OLS estimator that one obtains from
0
0
= 0 1 ^, where
regressing yt+h on z^t estimates a rotated version of ^, given by
^ 0H 1 ^
0 1^
= diag (H ; Iq ) : Asymptotically,
is equal to
, where
= diag (H ; Iq ), which can be
0
0
interpreted as a sign-adjusted version of ^.
Our next result characterizes the asymptotic bootstrap distribution of
c, with 0
p
c < 1. We add the two following conditions.
when
p
T =N !
0:
0
Condition E*. p lim
=
Condition F*. p lim
= Q Q0 :
0
T ^
is the bootstrap variance of the scaled average of the bootstrap regression scores z^t "t+h (as
de…ned in Condition D*(b)). Since F~t estimates a rotated version of the latent factors given by
H0 Ft , z^t estimates a rotated version of zt given by
0 zt ,
and therefore
is the sample analog of
0
0
provided we choose "t+h to mimic the time series dependence of "t+h . Condition E* imposes
PN ~
1 PT
p1
formally this condition. Similarly, by Condition B*(e),
t=1 V ar
i=1 i eit . Because
T
N
~ i estimates Q i ,
is the sample analogue of Q Q0 if we choose e to mimic the cross sectional
0
it
dependence of eit (interestingly enough, mimicking the time series dependence of eit is not relevant).
Condition F* formalizes this requirement.
12
Theorem 3.1 Let Assumptions 1-5 hold and consider any residual-based bootstrap scheme for which
p
Conditions A*-D* are veri…ed. Suppose T =N ! c, where 0
c < 1. If in addition the two
following conditions hold: (1) Condition E* is veri…ed, and (2) c = 0 or Condition F* is veri…ed; then
as N; T ! 1,
p
T ^
in probability, where
!d N
the main diagonal, and
H
1 0
; with
and
According to Theorem 3.1,
with mean equal to
1 0^
0
c(
p
0) 1
0
0
c
0
1
0
= p lim
;
0
1
0
1
;
0
= diag (H0 ; Iq ) a diagonal matrix with
1 in
are as de…ned in Theorem 2.1.
T ^
is asymptotically distributed as a normal random vector
p
0
: Just as the asymptotic bias of T ^
H 1
is proportional to
, the bootstrap asymptotic bias is proportional to H0
1 0
1 0
^ . Since ^ converges in probability
1
, the bootstrap bias of ^ converges to c (H0 0 )
provided we ensure that Condition F*
0~
is satis…ed. The bootstrap bias of ^ depends both on the bootstrap analogue of W F~ = p lim WT F =
to H
0
W F H0
W F~ H0
and on H0
0,
1 0
^ : Since the bootstrap analogue of
the rotation matrix H0 0 “cancels out” with H0
0
~
is p lim WTF , which converges to
^ , leaving the bootstrap bias of ^
W F~
1 0
una¤ected by the sign problem that arises for ^ (Lemma B.4 formalizes this argument). Similarly,
1
provided we choose "t+h
the asymptotic variance-covariance matrix of ^ is equal to ( 0 0 ) 1
0
so as to verify Condition E*.
For bootstrap consistency, we need the bootstrap bias and variance to match the bias and variance
of the limiting distribution of the original OLS estimator. Since H0 (hence
0)
is not necessarily
equal to the identity matrix, Theorem 3.1 shows that this is not the case. Hence, the bootstrap
p
p
distribution of T ^
is not a consistent estimator of the sampling distribution of T ^
in general. This is true even if we choose "t+h and eit so that Conditions E* and F* are satis…ed. The
reason is that the bootstrap factors are not identi…ed. In particular, because the bootstrap principal
components estimator does not necessarily identify the sign of the bootstrap factors, the mean of each
p
element of T ^
corresponding to the coe¢ cients associated with the latent factors may have
the “wrong” sign even asymptotically. The same “sign” problem will a¤ect the o¤-diagonal elements
of the bootstrap covariance matrix asymptotically (although not the main diagonal elements). As we
mentioned above, the coe¢ cients associated with Wt are correctly identi…ed in the bootstrap world as
well as in the original sample and therefore this sign problem does not a¤ect these coe¢ cients.
p
In order to obtain a consistent estimator of the distribution of T ^
, our proposal is to
p
p
consider the bootstrap distribution of the rotated version of T ^
given by T ( 0 ^
^ ).
This rotation is feasible because
does not depend on any population quantities and can be computed
for any bootstrap and original samples. Since
is asymptotically equal to 0 = diag ( 1; Iq ) =
diag sign F~ 0 F~ ; Iq , 0 ^ is asymptotically equal to a sign-adjusted version of ^ . The following
result is an immediate corollary to Theorems 2.1 and 3.1.
13
Corollary 3.1 Under the conditions of Theorem 3.1, if
p
p
0^
^
then supx2Rp P
T
T ^
x
P
p
T =N ! c, where 0
x
!P 0:
c < 1, as N; T ! 1;
Corollary 3.1 justi…es the use of a residual-based bootstrap method for constructing bootstrap
percentile con…dence intervals for the elements of . When c = 0, the crucial condition for bootstrap
validity is Condition E*, which requires f"t g to be chosen so as to mimic the dependence structure
0^
of the scores zt "t+h . This condition ensures that the bootstrap variance-covariance matrix of
is correct asymptotically. When c 6= 0, Condition F* is also crucial to ensure that the bootstrap
distribution correctly captures the bias. When both Conditions E* and F* are satis…ed, the bootstrap
contains a built-in bias correction term that is absent in the Bai and Ng (2006) asymptotic normal
distribution, and we might expect it to outperform the normal approximation. A bootstrap method
that does not involve factor estimation in the bootstrap world will not contain this bias correction and
will not be valid in this context.
4
Wild bootstrap
In this section we propose a particular bootstrap method for generating "t+h
its …rst-order asymptotic validity under a set of primitive conditions.
and feit g and show
Bootstrap algorithm
1. For t = 1; : : : ; T , let
Xt = ~ F~t + et ;
where fet = (e1t ; : : : ; eN t )g is such that
n
is a resampled version of e~it = Xit
random variables
it
eit = e~it it ;
o
~ 0 F~t obtained with the wild bootstrap. The external
i
are i.i.d. across (i; t) and have mean zero and variance one.
2. Estimate the bootstrap factors F~ and the bootstrap loadings ~ using X .
3. For t = 1; : : : ; T
h; let
0
yt+h = ^ 0 F~t + ^ Wt + "t+h ;
where the error term "t+h is a wild bootstrap resampled version of ^"t+h , i.e.
"t+h = ^"t+h vt+h ,
where the external random variable vt+h is i.i.d. (0; 1) and is independent of
14
it .
4. Regress yt+h generated in 3. on the estimated bootstrap factors and the …xed regressors z^t =
0
F~t 0 ; Wt0 . This yields the bootstrap OLS estimators
T
Xh
^ =
z^t z^t 0
t=1
!
1T h
X
z^t yt+h .
t=1
To prove the validity of this residual-based wild bootstrap, we add the following assumptions.
Assumption 6.
12
E k ik
i
are either deterministic such that k i k
M < 1 for all i; E kFt k
for some q > 1, E j"t+h j4q
Assumption 7. E ("t+h jyt ; Ft ; yt
12
M < 1; E jeit j12
M < 1; for all t; h:
1 ; Ft 1 ; : : :)
M < 1, or stochastic such that
M < 1; for all (i; t) ; and
= 0 for any h > 0, and Ft and "t are independent of
the idiosyncratic errors eis for all (i; s; t).
Assumption 8. E (eit ejs ) = 0 if i 6= j.
Assumption 6 strengthens the moment conditions assumed in Assumption 1.b), 2.a), and 5.a),
respectively. The moment conditions on
i,
Ft and eit su¢ ce to show that E
4
0
i Fs eit
< M (while
8
maintaining that E jeit j < M ). If we assume that the three groups of random variables fFt g, feit g
and f i g are mutually independent (as in Bai and Ng (2006)), then it su¢ ces that E k i k4
E kFt k
4
8
M < 1 (and E jeit j < M ).
M < 1,
Assumption 7 was used by Bai and Ng (2006). It imposes a martingale di¤erence condition on the
regression errors "t+h , implying that these are serially uncorrelated but possibly heteroskedastic. In
addition, "t is independent of eis for all (i; t; s). Under Assumption 7,
!
T h
T h
1 X
1 X
= lim V ar p
zt "t+h = lim
E zt zt0 "2t+h ;
T
T t=1
t=1
which motivates using a wild bootstrap to generate "t+h . For this bootstrap scheme,
T h
1 X
=
z^t z^t0 ^"2t+h ;
T
t=1
which corresponds to the estimator used by Bai and Ng (2006) (cf. their equation (3)) and is consistent for
0
0
0
under Assumptions 1-7. We assume independence between eit and "t+h and generate
"t+h independently of eit ; but we conjecture that our results will be valid under weak forms of correlation between the two sets of errors because the limiting distribution of the OLS estimator remains
unchanged under Assumptions 1-5, which allow for dependence between eit and "t+h ; as we proved in
Theorem 2.1.
Assumption 8 assumes cross section independence in feit g, but allows for serial correlation and
heteroskedasticity in both directions. This assumption motivates the use of a wild bootstrap to
15
generate feit g. For this bootstrap scheme, we can show that
=
T
N
1 X 1 X ~ ~0 2
~it
i ie
T
N
t=1
i=1
T
1 X~
t;
T
t=1
where ~ t corresponds to estimator (5a) in Bai and Ng (2006, p. 1140). As shown by Bai and Ng, this
estimator is consistent for Q Q0 under cross section independence (and potential heteroskedasticity).
Assumption 8 assumes this is the case and thus justi…es Condition F* in this context. As we discussed
in the previous section, Condition F* is not needed if c = 0. Thus, a wild bootstrap is still asympp
totically valid if the idiosyncratic errors are cross sectionally (and serially) dependent when T =N
converges to zero (as assumed in Bai and Ng (2006)).
Our main result is as follows.
Theorem 4.1 Suppose that a residual-based wild bootstrap is used to generate feit g and "t+h with
E j it j4 < C for all (i; t) and E jvt+h j4q < C for all t; for some q > 1: Under Assumptions 1-7, if
p
T =N ! c, where 0 c < 1, and either Assumption 8 holds or c = 0, the conclusions of Corollary
3.1 follow.
5
Monte Carlo results
In this section, we report results from a simulation experiment that documents the properties of our
bootstrap procedure in factor-augmented regressions.
The data-generating process (DGP) is similar to the one used in Ludvigson and Ng (2009b) to
analyze bias. We consider the single factor model:
yt = F t + "t ;
(11)
where Ft is drawn from a standard normal distribution independently over time. The regression error
"t will either be standard normal or heteroskedastic over time. The (T
N ) matrix of panel variables
is generated as:
Xit =
where
i
i Ft
+ eit ;
(12)
is drawn from a U [0; 1] distribution (independent across i) and the properties of eit will be
discussed below. The only di¤erence with Ludvigson and Ng (2009b) is that they draw the loadings
from a standard normal distribution. The use of a uniform distribution increases the cross-correlations
and leads to larger biases without having to set the idiosyncratic variance to large values (they set
it to 16 in one experiment). Note that this DGP satis…es the conditions PC1 in Bai and Ng (2011)
which implies that H0 =
1:
We consider two values for the coe¢ cient, either
outlined in the table below. When
= 0 or 1. We consider six di¤erent scenarios
= 0; the OLS estimator is unbiased, and the properties of the
idiosyncratic components do not matter asymptotically. This leads us to consider only one scenario
16
with
= 0: The other …ve scenarios have
= 1 but di¤er according to the properties of the regression
error, "t ; and of the idiosyncratic error, eit :
DGP
1 (homo-homo)
2 (homo-homo)
0
1
"t
N (0; 1)
N (0; 1)
3 (hetero-homo)
1
N 0;
4 (hetero-hetero)
1
N 0;
5 (hetero-AR)
1
N 0;
6 (hetero-CS)
1
N 0;
DGP 1 is the simplest case with
Ft2
3
Ft2
3
Ft2
3
Ft2
3
eit
N (0; 1)
N (0; 1)
N (0; 1)
N 0;
2
i
2
i
AR + N 0;
CS + N (0; 1)
= 0 and both error terms i.i.d. standard normal in both
dimensions. DGP 2 is the same but with
= 1: This will allow us to isolate the e¤ect of a non-
zero coe¢ cient on bias and inference while keeping everything else the same. The third experiment
introduces conditional heteroskedasticity in the regression error. We do so by making the variance of
"t depend on the factor and scale so that the asymptotic variance of ^ ;
a;
is 1 in all experiments.
The fourth DGP adds heteroskedasticity to the idiosyncratic error. The variance of eit is drawn
from U [:5; 1:5] so that the average variance is the same as the homoskedastic case. The …fth DGP
introduces serial correlation in the idiosyncratic error term with autoregressive parameter equal to 0.5.
:52
The innovations are scaled by 1
1=2
to preserve the variance of the idiosyncratic errors. Finally,
the last experiment introduces cross-sectional dependence among idiosyncratic errors. The design is
similar to the one in Bai and Ng (2006): the correlation between eit and ejt is 0:5ji
We concentrate on inference about the parameter
jj
if ji
jj
5:
in (11) : We consider asymptotic and bootstrap
symmetric percentile t con…dence intervals at a nominal level of 95%. We report experiments based
on 1000 replications with B = 399 bootstrap repetitions. We consider three values for N (50, 100,
and 200) and T (50, 100, and 200).
We tailor our inference procedures to the properties of the error terms. In other words, when "t is
homoskedastic (DGP 1 and 2), we use the variance estimator under homoskedasticity:
! 1
T
Xh
1
2
2
^ ^ = ^"
F~t
T
(13)
t=1
whereas we use the heteroskedastic-robust version for DGPs 3-6:
! 1
!
!
T
T
T
Xh
Xh
Xh
1
1
1
^ =
F~t2
F~t2^"2t+h
F~t2
T
T
T
t=1
t=1
Similarly, we use the homoskedastic estimator of
1
:
(14)
t=1
for cases 1-3, the heteroskedasticity-robust
estimator for cases 4-5, and the CS-HAC estimator of Bai and Ng (2006) in case 6 with the window
p p
N ; T . We consider the wild residual-based bootstrap described in Section 4
size n equal to min
with the two external variables
it
and vt both drawn from i.i.d. N (0; 1) :
17
We report two sets of results. The …rst set of results is the bias of the OLS estimator. Because
the OLS estimator does not converge to
but to H
10
(and H converges to +1 or -1) and because
its bias is proportional to this, the bias will be positive for some replications and negative for others.
Reporting the average bias over replications is therefore misleading in this situation. To circumvent
this, we report the bias of the rotated OLS coe¢ cient, H 0 ^ : This rotated coe¢ cient converges to
in
all replications. Note that this rotation is not possible in the real world since the matrix H depends
on population parameters. Note also that this rotation is possible in the bootstrap world (and indeed
necessary to obtain consistent inference of the entire coe¢ cient vector, see Corollary 3.1). For the
bootstrap world, we report the average of H 0 H 0 ^
H 0 ^ ; again to ensure that the sign of this bias
is always the same. Secondly, we present coverage rates of the associated con…dence intervals. For
comparison, we also include results for the case where factors do not need to be estimated and are
treated as observed (row labeled "true factor" in the tables). This quanti…es the loss from having to
estimate the factors.
Table 1 provides results for the …rst two DGPs and illustrates our results. For each DGP, the top
panel gives the bias associated with the OLS estimator as well as the plug-in and bootstrap estimates.
The second panel for each design provides the coverage rate of intervals based on asymptotic theory,
either using the OLS estimator or its bias-corrected version, based on OLS using true factors, and
based on the wild bootstrap. From table 1, we see that, as expected, bias is nil when
When
= 0 (case 1).
6= 0; a negative bias appears. DGP 2 shows that the magnitude of this bias is decreasing in
N (and T ): The sample estimate of this quantity provides a reasonable approximation to it. However,
the bootstrap captures the behavior of the bias well as predicted by theory and better than the sample
estimate.
Coverage rates are consistent with these …ndings. When
= 0 (DGP 1), asymptotic theory is
nearly perfect and matches closely the results based on the observed factors. In case 2, with
= 1;
OLS inference su¤ers from noticeable distortions for all sample sizes. This is because the estimator
is biased and the associated t-statistic is not centered at 0. Analytical bias correction corrects most
of these distortions. The bootstrap provides even better inference and is quite accurate for N
100:
This loss in accuracy in inference is due to the estimation of the factors as illustrated by the results
with the true factors.
Tables 2 and 3 provide results for the other DGPs and show the robustness of the results to the
presence of heteroskedasticity in both errors (DGPs 3 and 4), and serial correlation in the idiosyncratic
errors (DGP 5). The bias results in table 2 are very similar to those in table 1, although coverage
rates reported in table 3 deteriorate relative to the simpler homoskedastic case. The presence of crosssectional dependence (DGP 6) is interesting. Firstly the bias is larger here (because
also includes
cross-product terms). Secondly, the wild bootstrap is theoretically not valid since it does not replicate
the cross-sectional dependence. Indeed, we see that, contrary to the other cases, the sample estimate
of the bias is often better than the bootstrap, especially with N = 50:
18
6
Conclusion
The main contribution of this paper is to give a set of su¢ cient high-level conditions under which
any residual-based bootstrap method is valid in the context of the factor-augmented regression model
p
c < 1: Our results show that two crucial conditions for bootstrap
in cases where T =N ! c; 0
validity in this context are that the bootstrap regression scores replicate the time series dependence
of the true regression scores, and that either c = 0 or the bootstrap replicates also the cross-sectional
dependence of the idiosyncratic error terms.
Our high-level conditions can be checked for any implementation of the bootstrap in this context.
We verify them for a particular scheme based on a two-step application of the wild bootstrap. Although the wild bootstrap preserves heteroskedasticity, its validity depends on a martingale di¤erence
condition on the regression errors and on cross-sectional independence of the idiosyncratic errors when
c 6= 0.
The martingale di¤erence condition on the regression errors was used in Bai and Ng (2006), but it
is overly restrictive when the forecasting horizon h is larger than one. Thus, relaxing this assumption
is important. Although our general results in Sections 2 and 3 allow for serial correlation in the scores
(see in particular our Assumption 5(b)), the particular implementation of the two-step residual-based
wild bootstrap we consider in Section 4 is not robust to serial dependence. A block bootstrap based
method is required in this case. We plan on investigating the validity of such a method in future work.
Our high-level conditions will be useful in establishing this result.
A second important extension of the results in this paper is to propose a bootstrap scheme that
is able to replicate the cross-sectional dependence of the idiosyncratic error term. As our results
show, this is crucial for capturing the bias when c 6= 0. Our wild bootstrap based method does not
allow for cross-sectional dependence. Because there is no natural cross-sectional ordering, devising a
nonparametric bootstrap method that is robust to cross-sectional dependence of unknown form is a
challenging task.
Another important extension of the results in this paper is the construction of interval forecasts,
which we are currently investigating. Finally, the extension to factor-augmented vector autoregressions
(FAVAR) …rst suggested by Boivin and Bernanke (2003) is also important. This case has recently been
analyzed by Yamamoto (2011), who proposes a bootstrap scheme that exploits the VAR structure in
the factors and the panel variables.
19
A
Appendix A: Proofs of results in Section 2
We rely on the following lemmas to prove Theorem 2.1.
Lemma A.1 Let Assumptions 1-5 hold. Then,
p p
N; T :
N T = min
Lemma A.2 Let Assumptions 1-5 hold. Then, if
a)
p1
T
b)
p1
T
c)
p1
T
PT
h
t=1
F~t
PT
h
t=1 HFt
PT
h
t=1 Wt
d) Letting
F~t
F~t
V
F~
F~t
HFt
HFt
0
HFt
HFt
1Q
0
1,
T h
1 X ~ ~
p
Ft F t
T t=1
1Q
= cV
= cQ Q0 V
2
0
W F H0 Q
=c
Q0 V
0
= p lim
W 0 F~
T
h
t=1
F~t
HFt "t+h = OP
T =N ! c, where 0
1
HFt
0
0;
+ oP (1) :
Q0 V
H
HFt
=
0
H
0
W F H0 ,
By Assumption 5 and given the de…nition of
2
+ oP (1) :
1 0
=c
|
1 0
=c
|
with
WF
F~
+V
F~ V
1
p lim (^ ) + oP (1) ;
}
(15)
1
p lim (^ ) + oP (1) ;
}
(16)
{z
B
W F~ V
F~ V
{z
B
E (Wt Ft0 ).
H 0
0 Iq
0
= diag (p lim H; Iq ),
T h
1 X
p
zt "t+h !d N 0;
T t=1
0
0
0
:
By Lemma A.1, B !P 0 and by Lemma A.2d),
C=
, where
+ oP (1) :
C
T h
1 X
p
T t=1
T
c < 1; for any …xed h
Proof of Theorem 2.1. Write
8
T h
T h
>
1 X F~t HFt
1 X HFt
>
>
p
p
"
+
>
t+h
>
Wt
>
0
T t=1
T t=1
>
>
!
1
>
|
{z
}
{z
|
T
h
<
p
1 X
A
B
T ^
z^t z^t0
=
T
>
Xh
T
0
1
>
0
t=1
>
>
p
z^t F~t HFt
H 1
>
>
T t=1
>
>
>
|
{z
}
:
A=
1p
NT
we have that
T h
1 X
p
Wt F~t
T t=1
W F~
p
PT
Q0 V
and
where
1
T
F~t
Wt
F~t
HFt
0
20
H
1 0
=
c
B
B
+ oP (1) ;
9
>
>
"t+h >
>
>
>
>
>
} >
=
>
>
>
>
>
>
>
>
>
;
:
(17)
where B and B are as de…ned in (15) and (16). Given Assumptions 1-5,
!
T h
T h
1 X
1 X
0
0
0
0
z^t z^t = 0
zt z t
0 + oP (1) = 0 zz 0 + oP (1) :
T
T
t=1
This implies that if
p
t=1
T =N ! c;
p
T ^
!d N ( c
=
When
E (Wt Ft0 ) = 0, we have that
WF
H00
=
1
1
F
1
F
H0 1
F~
+V
0
1
F~ V
1
F~ V
F~
0 1
0
=
1
zz
1
zz
1
0
, and
p lim (^ ) :
= 0, implying that
W F~
0
+V
0 V
F~
1
W
0
=
since H00
1
) ; with
+V
W F~ V
1
0
0
zz
0
;
F~ V
1
F~ V
1
p lim (^ )
1
F~ V
p lim (^ ) ;
H0 1 = Ir given that we can show that H0
F
1
= Q = (H00 )
:
Proof of Lemma A.1. The proof is based on the following identity:
F~t
HFt = V~
1
T
1X~
Fs
T
T
1X~
Fs
st +
T
s=1
V~
1
s=1
(A1t + A2t + A3t + A4t ) ;
where
st
= E
N
1 X
eis eit
N
i=1
st
=
N
1 X
N
0
i Fs eit
!
,
= Fs0
i=1
=
st
T
1X~
Fs
st +
T
s=1
N
1 X
(eis eit
N
T
1X~
Fs
st +
T
st
s=1
!
(18)
E (eis eit )) ;
i=1
0e
t
N
and
st
= Ft0
0e
s
N
=
ts :
Using the identity (18), we have that
T h
1 X ~
Ft
T
HFt "t+h = V~
1
(I + II + III + IV ) ;
t=1
where
I =
III =
T h T
1 XX ~
Fs
T2
st "t+h ;
1
T2
st "t+h ;
t=1 s=1
T
T
Xh X
F~s
T h T
1 XX ~
II = 2
Fs
T
st "t+h ;
t=1 s=1
T
T
Xh X
and IV =
t=1 s=1
1
T2
F~s
st "t+h :
t=1 s=1
Since V~ 1 = OP (1) (see Lemma A.3 of Bai (2003), which shows that V~ !P V > 0), we can ignore
V~ 1 . Start with I. We can write
I=
T h T
1 XX ~
Fs
T2
HFs
st "t+h + H
t=1 s=1
T h T
1 XX
Fs
T2
t=1 s=1
21
st "t+h
I1 + I2 :
1
T
We can show that I2 = OP
have
T h T
T h T
1 XX
1 XX
E
kF
"
k
j
s st t+h
T2
T2
t=1 s=1
t=1 s=1
!
T
1
1 1X
j st j = O
;
T T t;s
T
E kI2 k
since E kFs k2
1
T
by showing that E kI2 k = O
M and E j"t+h j2
. Ignoring H (which is OP (1)), we
E kFs k2
st j
M for some constant M < 1, and
1=2
PT
1
T
1=2
E j"t+h j2
t;s j st j
< M by assumption.
For I1 , repeated application of the Cauchy-Schwartz inequality implies that
kI1 k =
T
1X ~
Fs
T
T h
1 X
T
HFs
s=1
1
p
T
Thus, I = OP
st "t+h
t=1
0
!
T
1X ~
Fs
T
s=1
11=2 0
B
B
B
B
B
@
C
T
C
2
1X ~
C
Fs HFs
C
C
T
s=1
A
|
{z
}
2
OP ( N T ) by Lemma A.1 of Bai (2003)
p 1
T NT
HFs
HFs
T
T h
X
1 X
@1
T
T
s=1
11=2
2
OP (1)
2
st "t+h
t=1
p
= OP
1
T
11=2
A
:
NT
OP (1)
T h T
1 XX
Fs
st "t+h + H 2
T
t=1 s=1
!1=2 0
C
T h
1 X 2 C
"t+h C
st j
C
T
A
}| t=1
{z }
B
T T
Xh
B1 X
B
j
BT
@ s=1 t=1
|
{z
. Next, consider II. We have that
T h T
1 XX ~
II = 2
Fs
T
2
st "t+h
II1 + II2 :
t=1 s=1
We can show that
kII1 k
Indeed,
|
T
T h
1 X
1X
E
T
T
s=1
T
1X ~
Fs
T
s=1
HFs
{z
OP (
1
NT
2
st "t+h
=
t=1
2
!1=2 0
)
T
T h
X
1 X
@1
T
T
s=1
t=1
}|
{z
OP
T
T h
1X
1 X
E
T
T
s=1
=
t=1
2
A
st "t+h
p1
NT
N
1 X
(eis eit
N
11=2
= OP
p
!
2
E (eis eit )) "t+h
i=1
T
T h N
1 XX
1 1X
E p
(eis eit
TN T
T N t=1 i=1
s=1
|
{z
2
E (eis eit )) "t+h
t=1 s=1
st "t+h
T N
1 XX
p
Fs (eit eis
T N s=1 i=1
|
{z
mt
22
=O
1
TN
:
}
For II2 , ignoring H (which is OP (1)), we have that
T h
1 1 X
=p
T N T t=1
:
NT
}
=O(1) by Assumption 4(a)
T h T
1 XX
II2 = 2
Fs
T
1
NT
!
E (eit eis )) "t+h
}
p
T h
1 1 X
mt "t+h :
T N T t=1
We can show that
1
T
PT
h
t=1 mt "t+h
p1
NT
= OP (1) ; implying that II2 = OP
. By Cauchy-Schwartz
inequality, we have that
T h
1 X
mt "t+h
T
T h
1 X
kmt k2
T
t=1
t=1
given that E j"t+h j2 < M < 1, provided
1
T
Markov’s inequality. But
T h
1 X
E kmt k2
T
p1
NT
by Assumption 3(b). Thus, II = OP
T h T
1 XX ~
Fs
T2
T h
1 X 2
"t+h
T
t=1
PT
h
2
t=1 kmt k
st "t+h =
t=1 s=1
!1=2
= OP (1), or
= OP (1) ;
PT
h
t=1 E
1
T
kmt k2 = O (1), by
2
T
T N
1X
1 XX
E p
Fs (eit eis
T
T N s=1 i=1
t=1
t=1
III =
!1=2
E (eit eis ))
= O (1)
: Next, consider
T h T
1 XX ~
Fs
T2
st "t+h + H
HFs
t=1 s=1
T h T
1 XX
Fs
T2
st "t+h
III1 + III2 :
t=1 s=1
Starting with III1 , we have that
kIII1 k
since
|
T
T h
1X
1 X
E
T
T
s=1
T
1X ~
Fs
T
s=1
HFs
2
{z
OP ( N1T )
2
st "t+h
=
t=1
!1=2 0
T
T h
X
1 X
@1
T
T
s=1
t=1
}|
{z
OP
T
T h
0e
1 X
1X
t
E
Fs0
T
T
N
s=1
T h T
1 XX
III2 = 2
Fs
T
t=1 s=1
st "t+h
p1
TN
2
"t+h
= OP
t=1 s=1
IV =
T h T
1 XX ~
Fs
T2
HFs
p1
TN
et "t+h
= OP
1
TN
t=1 s=1
:
!
!
T
T h
1X
1 X 0 et
0
=
Fs F s
"t+h = OP
T
T
N
s=1
t=1
|
{z
}|
{z
}
T h T
1 XX
Fs
T2
t=1 s=1
23
et "t+h
}
OP
p1
TN
: For IV; write
st "t+h + H
2
0
t=1
OP (1)
given Assumption 4(b). Thus, III = OP
NT
2
0
with its de…nition, we have that
"t+h
1
TN
T
T h
1 X
1X
E Fs0
=
T
TN
=O(1) by Assumption 4(b)
T h T
0e
1 XX
t
= 2
Fs Fs0
T
N
p
}
s=1
T
T h
1X
1
1 X
kFs k2
E p
T
TN
T N t=1
| s=1{z
}
{z
|
st
11=2
A
st "t+h
t=1
=OP (1)
Ignoring again H and replacing
2
st "t+h
IV1 + IV2 .
p
1
TN
;
For IV1 , we have that
kIV1 k
|
Indeed,
T
1X ~
Fs
T
s=1
HFs
{z
OP (
T
1X
T
s=1
=
!1=2 0
T
X
@1
T
s=1
}|
)
T h
1 X
T
st "t+h
t=1
T
X
1
=
T
1
NT
2
e0s
N
s=1
1
TN
1
T
T
X
T
Xh
|
1
T
2
! 0
@ p1
T
} |
=OP (1) by Assumption 3(d)
For IV2 , we have that
IV2 =
=
T h T
0e
1 XX
s
0
F
F
s
t
T2
N
1
T2
t=1 s=1
T
T
Xh X
Fs
t=1 s=1
A
st "t+h
{z
p1
NT
T
1X
=
T
s=1
!2
!2 11=2
t=1
OP
Ft "t+h
t=1
e0
ps
N
s=1
{z
1
T
!2
T
Xh
T h
1 X 0
Ft
T
1
T
s=1
T
Xh
"t+h
N
2
e0s
N
2
1
NT
t=1
{z
:
NT
!2
2
T h
1 X
Ft "t+h
T
t=1
1
TN
A = OP
Ft "t+h
1
p
}
0e
s
t=1
T
X
= OP
:
}
=OP (1) by Assumption 5(c)
"t+h
T
1 X e0s
Fs
T
N
e0s
Ft "t+h =
N
s=1
!
T h
1 X
Ft "t+h
T
t=1
!
= OP
p
1
NT
1
p
T
OP
given Assumptions 3(c) and 5(b). Thus, we have shown that under our assumptions,
1
p
I+II+III+IV = OP
NT
T
+OP
p
1
+OP
NT
p
1
+OP
NT
NT
1
p
NT
1
p
= OP
NT
Proof of Lemma A.2. Proof of part a) Using the identity (18), we can write
p 1 TXh
T
F~t
T
B
HFt
F~t
HFt
0
t=1
=
p
T V~
1
T h
1 X
(A1t + A2t + A3t + A4t ) (A1t + A2t + A3t + A4t )0 V~
T
1
;
t=1
where Ait (i = 1; : : : ; 4) are de…ned in (18). We analyze each term separately (ignoring V~
We can show that
1
T
PT
h
0
t=1 A1t A1t
= OP
p 1 TXh
T
A1t A01t = OP
T
t=1
1
T
+ OP
2
NT
p
1
T
24
2
NT
1
T
+ OP
: Thus,
1
p
T
= oP (1) :
1 ).
T
:
;
Indeed, by Cauchy-Schwartz
T h
1 X
A1t A01t
T
T h
T h
T
1 X
1 X 1X~
2
kA1t k =
Fs
T
T
T
t=1
t=1
t=1
t=1
st
s=1
2
T h
T
1 X 1X ~
Fs
T
T
2
HFs
st
s=1
T h
T
1 X 1X ~
=
Fs
T
T
t=1
HFs + HFs
st
s=1
2
T h
T
1 X
1X
+
H
Fs
T
T
t=1
2
a11:1 + a11:2 :
st
s=1
We have that
2
T h
T
1 X 1X ~
Fs
T
T
T h
T
T
2 1 X
1 X 1X ~
F
HF
j st j2
s
s
st
T
T
T
t=1
s=1
t=1
s=1
s=1
!
!
T
T
T
2
1 1X ~
1
1 XX
=
Fs HFs
j st j2 = OP
;
T T
T
T 2N T
s=1
t=1 s=1
|
{z
}|
{z
}
=OP (1)
=OP ( N2T )
P P
where we have use Assumption 2 (see Lemma 1(i) of Bai and Ng (2002)) to bound T1 Tt=1 Ts=1 j
a11:1
HFs
Next,
T h
T
1 X
1X
H
Fs
T
T
a11:2
t=1
1
kHk2
| {z } T
=OP (1)
We can show that
if
p
1
T
PT
h
0
t=1 A2t A2t
1
T
|
2
st
st
1
T
:
t=1
!
T
T
1 XX
kFs k
j
T
s=1
t=1 s=1
{z
}|
{z
2
=OP (1)
T h
1 X
p
A2t A02t = OP
T t=1
2
T h
T
1 X 1X
Fs
kHk
T
T
2
s=1
T
X
= OP
2
st j :
1
TN
2
st j
=OP (1)
1
+ OP
1
p
TN
N
+ OP
2
NT
s=1
= OP
}
; implying that
p
T 1
N 2N T
!
= oP (1) ;
T =N ! c < 1:
Indeed,
T h
1 X
A2t A02t
T
t=1
2
T h
T h
T
1 X
1 X 1X ~
2
kA2t k =
Fs
T
T
T
t=1
t=1
T h
T
1 X 1X ~
Fs
T
T
t=1
2
HFs
s=1
HFs + HFs
st
s=1
st
T h
T
1 X
1X
+
H
Fs
T
T
t=1
2
st
a22:1 + a22:2 :
s=1
We have that
a22:1
T h
T
1 X 1X ~
Fs
T
T
t=1
s=1
2
HFs
st
T
1X ~
Fs HFs
T
s=1
|
{z
=OP ( N2T )
25
2
!
T
T
1 XX
2
j st j
= OP
T2
t=1 s=1
} |
{z
}
1
=OP ( N
) by Assumption 2(c)
!
1
N
2
NT
:
Indeed,
2
N
T
T
1 1 XX
1 X
(eis eit
E p
st j =
N T2
N i=1
t=1 s=1
T
T
1 XX
Ej
T2
2
t=1 s=1
E (eis eit ))
=O
1
N
given Assumption 2(c). Next,
a22:2
T h
T
1 X 1X
kHk
Fs
T
T
2
2
t=1
T N
T h
1 1 X
1 XX
p
Fs (eis eit
= kHk
TN T
T N s=1 i=1
t=1
|
{z
2
st
s=1
2
E (eis eit ))
}
=OP (1) by Assumption 3(b)
implying that
T h
1 X
A2t A02t = OP
T
t=1
1
+ OP
2
NT
N
We can show that under Assumptions 1-5, if
p
1
TN
= OP
1
N
1
TN
= OP
:
2
NT
T =N ! c,
T h
1 X
p
A3t A03t = cQ Q0 + oP (1) :
T t=1
Proof.
T h
1 X
A3t A03t =
T
t=1
=
T h
1 X
T
t=1
T h
1 X
T
t=1
T
1X~
Fs
T
st
s=1
!
T
1X ~
Fs
T
T
1X~
Fs
T
st
s=1
!0
HFs + HFs
st
s=1
a33:1 + a33:2 + a033:2 + a33:3 :
!
T
1X ~
Fs
T
HFs + HFs
st
s=1
!0
It follows that
ka33:1 k
2
T h
T
1 X 1X ~
Fs
T
T
t=1
HFs
st
s=1
T
T h T
2 1 XX
1X ~
Fs HFs
j
T
T2
s=1
t=1 s=1
{z
}|
{z
|
1
=OP ( N
)
1
=OP
2
NT
2
st j
= OP
}
1
N
2
NT
since
T h T
1 XX
j
T2
t=1 s=1
2
st j =
T h T
1 X X 0 0 et
Fs
T2
N
2
t=1 s=1
T
1X
kFs k2
T
| s=1{z
}
=OP (1)
Thus,
p
p
T 1
N 2N T
T a33:1 = OP
26
T h
0e 2
1 X
t
= OP
T
N
t=1
|
{z
}
1
=OP ( N
) given Assumption 3(d)
!
= oP (1) ;
1
N
:
;
;
if
p
T =N ! c. Next we show that
ka33:2 k
0
T
T
Xh 1 X
@1
HFs
T
T
t=1
s=1
|
{z
=OP
p
2
st
p1
N
T a33:2 = OP
p
T
N
1
= oP (1) : Indeed,
NT
11=2 0
2
T
T
Xh 1 X
A @1
F~s HFs
T
T
t=1
s=1
}|
{z
st
1p
=OP
11=2
A
N
NT
1
= OP
N
;
NT
}
since by Cauchy-Schwartz and Assumption 3(d),
2
T h
T
1 X 1X
HFs
T
T
t=1
T h
T
1 X 1X
kHk
Fs
T
T
2
2
st
s=1
t=1
kHk2
st
s=1
For a33:3 , we have that
a33:3 =
T h
1 X
T
= H
t=1
T
Xh
1
T
= H
!
!0
T
T h
1X
1 X
HFs st = H
st
T
T
s=1
s=1
t=1
!
!
T
T
0
0
X
X
1
et
1
et
Fs Fs0
Fs Fs0 H 0
T
N
T
N
T
1X
HFs
T
t=1
F 0F
T
T
T h T
1X
1 XX
kFs k2 2
j
T
T
s=1
| s=1{z
}| t=1 {z
1
=OP (1)
=OP ( N
)
1
T
s=1
T
Xh
t=1
T
1X
Fs
T
st
s=1
!
2
st j
= OP
}
T
1X
T
0
st Fs
s=1
!
H0
s=1
0e
t
e0t
N
N
F 0F
T
given Assumption 3(e). When multiplied by
p
1
N
H 0 = OP
;
T , this will therefore be of order OP
p
T
N
and
therefore it will not go to zero in probability when c 6= 0. Its probability limit can be computed
as follows. First, notice that
HF 0 F
T
F~
=
0
F H 0 + F~
F~
F
=
T
0
F H0
F
+
T
F~ 0 F
= Q + oP (1)
T
given that by Lemma B.2 of Bai (2003),
F~
0
F H0
F
= OP
T
2
NT
= oP (1) ;
=
+ oP (1) :
~0
and p lim FTF = Q. Second, by Assumption 3(e),
T h
1 X
T
t=1
Thus, letting
p
T =N = c + o (1),
p
p
T
T a33:3 =
H
N
0e
t
p
e0
pt
N
N
F 0F
T
(
T h
1 X
T
t=1
0e
t
p
N
e0
pt
N
)
F 0F
T
= (c + o (1)) [Q + oP (1)] ( + oP (1)) Q0 + oP (1)
= cQ Q0 + oP (1) :
27
H0
1
N
:
1
T
We can show that
p
if T =N ! c;
PT
h
0
t=1 A4t A4t
p
T h
1 X
p
A4t A04t = OP
T t=1
1
= OP
N
T
2
NT
N
!
1
+ OP
2
NT
p
+ OP
N
N
T
NT
NT
!
1
TN
+ OP
+ OP
p
1
TN
; which implies that
= oP (1) :
Proof.
T h
1 X
A4t A04t =
T
T h
1 X
T
t=1
T
1X~
Fs
T
t=1
T h
1 X
T
=
st
s=1
!
T
1X ~
Fs
T
t=1
T
1X~
Fs
T
st
s=1
!0
HFs + HFs
st
s=1
!
T
1X ~
Fs
T
HFs + HFs
st
s=1
a44:1 + a44:2 + a044:2 + a44:3 :
!0
We have that
ka44:1 k
T h
T
1 X 1X ~
Fs
T
T
t=1
2
HFs
T
T h T
2 1 XX
1X ~
Fs HFs
j
T
T2
s=1
t=1 s=1
|
{z
}|
{z
1
=OP ( N
)
1
=OP
st
s=1
2
st j
1
= OP
}
2
NT
;
2
NT
N
since
T h T
1 XX
j
T2
t=1 s=1
T h T
1 X X 0 0 es
Ft
T2
N
2
st j =
2
t=1 s=1
T r
1 X
kFt k2
T
| t=1{z
}
=OP (1)
This implies that
ka44:2 k
0
@1
T
|
p
T
Xh
t=1
T a44:1 = OP
T
1X
HFs
T
s=1
{z
=OP
p
2
st
p1
N
T =N OP
11=2 0
A
@1
T
}|
1
2
NT
T
Xh
t=1
T
0e 2
1X
s
= OP
T
N
s=1
|
{z
}
1
=OP ( N
) given Assumption 3(d)
1
N
:
= oP (1). Next,
2
T
1X ~
Fs
T
s=1
{z
HFs
1p
NT N
=OP
st
11=2
A
= OP
1
N
;
NT
}
since by Cauchy-Schwartz and Assumption 3(d),
T h
T
1 X 1X
HFs
T
T
t=1
s=1
2
T h
T
1 X 1X
kHk
Fs
T
T
2
2
st
t=1
Thus,
p
T a44:2 = OP
kHk2
st
s=1
p !
T
OP
N
28
1
NT
T
T h T
1X
1 XX
kFs k2 2
j
T
T
s=1
t=1 s=1
|
{z
}|
{z
1
=OP (1)
=OP ( N
)
= oP (1) ;
2
st j
}
= OP
1
N
:
p
when
T =N ! c: For a44:3 , we have that
!
!0
T h
T
T
T h
1 X 1X
1X
1 X
=
HFs st
HFs st = H
T
T
T
T
t=1
s=1
s=1
t=1
!
!
T
h
T
T
0
0
1 X 1X
es
1 X es
= H
Fs Ft0
Ft Fs0 H 0
T
T
N
T
N
t=1
s=1
s=1
! T h
!
T
T
0
0e
X
X
X
1
es
1
1
s
Fs
Ft Ft0
F0
= H
T
N T
T
N s
}| t=1
{z
}
}
| s=1{z
| s=1{z
a44:3
Thus,
p
p1
TN
T a44:3 = OP
We can show that
oP (1) :
=OP (1)
p1
TN
=OP
1
T
p1
TN
=OP
T
1X
Fs
T
st
s=1
!
T
1X
T
s=1
1
TN
H 0 = OP
0
st Fs
!
H0
:
by Assumption 3(c)
= oP (1) :
PT
h
0
t=1 A1t A2t
= OP
NT
1
p
TN
; which implies that
p
T T1
PT
h
0
t=1 A1t A2t
=
Proof. Given the bounds we found above for the terms that depend on A1t and A2t , it is
immediate to see that
T h
1 X
A1t A02t
T
T h
1 X
kA1t k2
T
t=1
t=1
We can show that
OP
p1
N
1
T
h
0
t=1 A1t A3t
T h
1 X
kA2t k2
T
t=1
p1
TN
= OP
!1=2
= OP
, implying that
p
T
T
= oP (1) : Indeed,
T h
1 X
A1t A03t
T
T h
1 X
kA1t k2
T
t=1
Similarly,
PT
!1=2
1
T
t=1
PT
h
0
t=1 A1t A4t
= OP
T h
1 X
A1t A04t
T
!1=2
T h
1 X
kA3t k2
T
t=1
p1
TN
t=1
1
p
TN
= OP
1
T
where we can prove that
= OP
, which implies that
T h
1 X
kA1t k2
T
t=1
!1=2
!1=2
p
T h
1 X
kA4t k2
T
t=1
T T1
!1=2
1
p
T
1
p
OP
NT
PT
h
0
t=1 A1t A3t
1
p
T
OP
PT
h
0
t=1 A1t A4t
1
p
T
= OP
N
= OP
1
p
N
= OP
NT
p
p T
TN
= OP
1
p
=
p
1
TN
:
= oP (1) :
OP
1
p
N
;
PT
h
2
t=1 kA4t k
= OP
1
N
by an application of Cauchy-Schwartz.
Proof. Given the de…nition of A4t , we have that
T h
1 X
kA4t k2 =
T
t=1
T h
T
1 X 1X~
Fs
T
T
t=1
s=1
2
st
T h
T
1 X 1X ~
Fs
T
T
s=1
| t=1
{z
OP
= OP
1
N
:
29
1
N 2
NT
2
HFs
st
T h
T
1X
1 X
H
Fs
+
T
T
} | t=1
{zs=1
1
OP ( N
)
2
st
}
TN
By Cauchy-Schwartz,
2
T h
T
1 X 1X ~
Fs
T
T
t=1
HFs
st
s=1
since
T
T
1 XX
T2
2
st
t=1 s=1
T
T
0e
1 XX
s
= 2
Ft0
T
N
T
T
T
2 1 XX
1X ~
2
Fs HFs
st = OP
T
T2
s=1
t=1 s=1
|
{z
}|
{z
}
1
=OP ( N2T )
=OP ( N
)
T
T
X
1X
2 1
kFt k
T
T
2
t=1 s=1
t=1
0e 2
s
N
s=1
1
2
NT N
1
N
= OP (1) OP
;
by Assumptions 1(a) and 3(d). For the second term,
T h
T
1 X
1X
H
Fs
T
T
t=1
2
T h
T
1 X 1X
kHk
Fs
T
T
st
s=1
t=1
1
T
Similarly, we can show that
oP (1) :
PT
h
0
t=1 A2t A3t
11=2
1
N 2
NT
OP
PT
h
0
t=1 A2t A4t
1
= OP
N
NT
; implying that
0
t=1
1
T
PT
h
0
t=1 A3t A4t
= OP
T h
1 X
A3t A04t =
T
t=1
=
1
N
NT
T h
1 X
A3t
T
t=1
T
1X~
Fs
T
s=1
T h
T
1 X
1X ~
A3t
Fs
T
T
t=1
p
st
!0
NT
11=2
PT
h
0
t=1 A2t A4t
T h
1 X
=
A3t
T
HFs
s=1
t=1
0
11=2
h
0
t=1 A3t A4t
= OP
p
T
T
PT
h
0
t=1 A2t A3t
1
N
= OP
=
:
NT
1
N
:
NT
= oP (1) :
T
1X ~
Fs
T
HFs
s=1
T h
1 X
+
H
A3t
st
T2
t=1
30
= OP
s=1
2
st j
= oP (1) :
C
B
C
B 1 TXh
B
2C
kA4t k C
B
C
BT
@| t=1{z
}A
1
OP ( N
)
PT
s=1
; implying that
0
T
T
, implying that
N
t=1
C
B T h
C
B1 X
B
2C
kA
k
B
3t C
C
BT
@| t=1{z
}A
1
OP ( N
)
T
T
1
N 2
NT
OP
1
T h
T
T
1 X 1X
1X
kFs k2
j
T
T
T
0
p
11=2
B
C
B
C
B T h
C
B1 X
C
2C
B
kA
k
2t
BT
C
B t=1
C
B|
{z
}C
@
A
T h
1 X
A2t A04t
T
st
= OP
C
B
C
B
C
B T h
C
B1 X
2
B
kA2t k C
C
BT
C
B t=1
B|
{z
}C
A
@
t=1
kHk2
s=1
0
T h
1 X
A2t A03t
T
1
T
2
2
T
X
s=1
Fs
st
T
1X
+
H
Fs
st
T
!0
s=1
:
st
!0
1
N
:
The …rst term can be bounded by
!1=2 0 T h
T
h
T
X
1 X 1X ~
1
2
@
kA3t k
Fs
T
T
T
t=1
t=1
2
HFs
A
st
s=1
For the second term, note that
T
1X~
Fs
A3t =
T
st
s=1
T
1X ~
=
Fs
T
11=2
1
p
N
= OP
NT
T
1X
Fs
st + H
T
HFs
s=1
1
p
OP
N
= OP
st :
s=1
Thus, we get that
T h
T
X
1 X
A
Fs
3t
T2
t=1
=
st
s=1
T h
1 X
T
T
1X ~
Fs
T
t=1
s=1
T h
1 X
+H
T
T
1X
Fs
T
t=1
The …rst term is bounded by
0
T
T
Xh 1 X
@1
F~s
T
T
t=1
NT
N
OP
For the second term, we get that
!
T h
T
T
1 X 1X
1X
Fs st
Fs
T
T
T
t=1
=
=
s=1
T h
1 X F 0F
T
T
F 0F 1
T T
t=1
T
Xh
t=1
0e
t
N
0e
t
N
Ft0
s=1
2
HFs
st
s=1
1
p
= OP
HFs
1
p
N
= OP
11=2 0
st
!
st
!
T
1X
Fs
T
s=1
!0
T
1X
Fs
T
s=1
A
1
N
t=1
:
st
T
T
Xh 1 X
@1
Fs
T
T
st
!0
2
st
s=1
:
11=2
A
NT
!0
T h
T
0e
1 X 1X
t
0
F
F
=
s
st
s
T
T
N
s=1
t=1
s=1
!
0
T
T h
0e
0e
1X
F 0F 1 X
s
t
Fs Ft0
=
T
N
T T
N
s=1
t=1
!
T
1 X 0 es 0
1
F = OP (1) OP p
OP
T
N s
TN
s=1
!
T
0e
1X
s
Fs Ft0
T
N
s=1
!0
T
1 X e0s
Fs
Ft
T
N
!0
s=1
p
1
TN
= OP
1
TN
;
given Assumption 3(c).
Thus, it follows that the only term that contributes in a non-negligible way to
p
T B = V~
1
T h
1 X
p
(A1t + A2t + A3t + A4t ) (A1t + A2t + A3t + A4t )0 V~
T t=1
is the term that depends on
Bai (2003)), we have that
1
T
PT
h
0
t=1 A3t A3t :
p
T B = cV
1
1
More precisely, since V~ !P V (by Lemma A.3 of
Q Q0 V
which proves the result.
31
1
+ oP (1) ;
1
N
NT
:
Proof of part b). Consider now
T h
1 X
Hp
Ft F~t
T t=1
C
Replacing F~t
HFt = V~
1 (A
1t
0
HFt
:
+ A2t + A3t + A4t ) yields
T h
1 X
Ft (A1t + A2t + A3t + A4t )0 V~
C = Hp
T t=1
p
1
T H (bf 1 + bf 2 + bf 3 + bf 4 ) V~
1
;
where Ait are as de…ned previously. Again, we consider each term separately.
PT
h
0
t=1 Ft A1t
1
T
We can show that
1p
= OP
NT
, which implies that
T
p
T Hbf 1 V~
1
= oP (1)
under our assumptions.
Proof. Write
T h
T h
1 X
1 X
0
Ft A1t =
Ft
T
T
bf 1
1
T
=
t=1
T
Xh
T
1X ~
Fs
T
s=1
!
t=1
1
T
Ft
t=1
bf 1:1 + bf 1:2 :
T
X
F~s
HFs
s=1
0
HFs
+
st
1
T
T
Xh
T
1X 0
Fs
st +
T
0
Ft
t=1
1
T
st H
0
s=1
!
T
X
0
Fs st H 0
s=1
!
By Cauchy-Schwartz,
kbf 1:1 k
1
T
T
Xh
t=1
B
!1=2 B
B T h
T
B1 X 1 X
2
B
F~s
kFt k
BT
B t=1 T s=1
B|
{z
@
1
HFs
0
st
1
2 T
NT
OP
where the OP
11=2
0
C
C
C
C
C
C
}C
A
2C
= OP (1) OP
1
p
NT
T
;
term is equal to a11:1 in part a). Next, consider bf 1:2 :
2
NT T
T h
1 X
Ft
T
bf 1:2
t=1
=
1
T
1
T
|
T
1X 0
Fs
T
s=1
T
h
T
XX
t=1 s=1
{z
st
Ft Fs0
!
st
H0
!
H 0 = OP
1
T
:
}
OP (1) given Assumptions 1 and 2
Indeed,
T h T
1 XX
E
Ft Fs0
T
t=1 s=1
st
T
T
1 XX
E Ft Fs0
T
st
t=1 s=1
T
T
1 XX
j
T
t=1 s=1
32
st j
E kFt k2
1=2
E kFs k2
1=2
= O (1) ;
given the moment conditions on Ft and the weak dependence assumptions on eit . Thus,
bf 1
1
p
bf 1:1 + bf 1:2 = OP
NT
1
T
+ OP
T
1
p
= OP
NT
;
T
as we wanted to prove.
1 PT h
0
We can show that bf 2
t=1 Ft A2t = OP
T
p
implies that T times this object is oP (1).
NT
1
p
p1
TN
+ OP
TN
= OP
p1
TN
; which
Proof.
T h
T h
1 X
1 X
0
Ft A2t =
Ft
T
T
bf 2
1
T
=
t=1
T
Xh
T
1X~
Fs
T
s=1
!
t=1
T
1X ~
Fs
T
Ft
t=1
0
HFs
s=1
st
st
+
!0
T h
T
1 X 1X 0
Ft
Fs
T
T
t=1
st H
0
bf 2:1 + bf 2:2 :
s=1
We can write
kbf 2:1 k =
T
1X
T
s=1
= OP
NT
T h
1 X
Ft
T
st
t=1
1
p
!
F~s
B
T
T h
B1 X
1 X
B
Ft
B
BT
T
t=1
@| s=1
{z
OP ( T1N )
0
HFs
11=2
0
st
2C
C
C
C
C
}A
C
B
C
B
C
B X
2C
B1 T
B
F~s HFs C
C
BT
C
B s=1
B|
{z
}C
A
@
1
2
NT
OP
;
TN
11=2
0
given that
T
T h
1X 1 X
Ft
T
T
s=1
2
=
st
t=1
T
T h
N
1 X
1X 1 X
Ft
(eis eit
T
T
N
s=1
=
t=1
2
E (eis eit ))
i=1
T
T h N
1 1X
1 XX
p
Ft (eis eit
TN T
T
N
s=1
t=1 i=1
{z
|
2
= OP
E (eis eit ))
For bf 2:2 , we have that (ignoring H),
t=1
T
1X 0
Fs
T
st
s=1
!
T h
1 X
kFt k2
T
t=1
!1=2 0
using again Assumption 3(b) to bound the second term.
We can show that bf 3
oP (1).
1
T
PT
h
0
t=1 Ft A3t
= OP
33
T
T
Xh 1 X
@1
Fs
T
T
p1
TN
t=1
:
}
OP (1) by Assumption 3(b)
T h
1 X
Ft
kbf 2:2 k =
T
1
TN
2
ts
s=1
; implying that
p
11=2
A
= OP
T bf 3 = OP
p
p
p T
TN
1
TN
=
;
Proof.
T h
T h
1 X
1 X
0
Ft A3t =
Ft
T
T
bf 3
t=1
t=1
T h
T
1 X 1X ~
Ft
Fs
T
T
=
T
1X ~
Fs
T
t=1
0
HFs
st +
s=1
T h
1 X
Ft
T
t=1
Rewrite
T
1X
=
T
s=1
so that by Cauchy-Schwartz
0
T
T h
1X 1 X
Ft
kbf 3:1 k @
T
T
s=1
t=1
|
{z
OP
2
st
p1
TN
T h
1 X
Ft
T
2
s=1
1
T
T
Xh
t=1
st
t=1
Ft
e0t
N
t=1
11=2
A
Note that
T
T h
1X 1 X
Ft
T
T
1
T
T
X
s=1
st
!
T
1X ~
Fs
T
s=1
}|
=OP
F~s
{z
kFs k2 =
1
TN
|
p
1
TN
2
;
!1=2
= OP
NT
}
1
p
TN
T
T h
1 X 1 X e0t
Ft
Fs
=
T
T
N
t=1
T
Xh
0
HFs
1
NT
2
st
!0
s=1
HFs
T
T h
0e
1X 1 X
t
Ft Fs0
=
T
T
N
s=1
2
HFs
s=1
bf 3:1 + bf 3:2 :
bf 3:1
T
1X
Fs
st + H
T
s=1
!0
T
1X
H
Fs st
T
s=1
2
1
T
Ft e0t
t=1
{z
}
t=1
T
X
s=1
1
TN
kFs k2 = OP
OP (1) by Assumption 3(c)
Next consider (ignoring H),
bf 3:2
T h
T
1 X 1X 0
=
Ft
Fs
T
T
t=1
st
s=1
T
1X
=
T
T h
1 X
Ft
T
s=1
st
t=1
!
Fs0 ;
so that by Cauchy-Schwartz inequality and Assumption 3(c), we get that
0
1
!1=2
2 1=2
T
T
h
T
X
X
X
1
1
1
2
kbf 3:2 k @
Ft st A
kFs k
= OP
T
T
T
s=1
t=1
s=1
This concludes the proof that
bf 3
We can show that
p
T bf 4
p1
T
bf 3:1 + bf 3:2 = OP
PT
h
0
t=1 Ft A4t
=c
34
F
Q0 V
p
1
TN
1
2
:
+ oP (1) :
p
1
TN
:
:
Proof. Replacing A4t with its de…nition yields
T h
T h
1 X
1 X
Ft A04t =
Ft
T
T
bf 4
=
1
T
t=1
T
Xh
T
1 X ~0
Fs
T
s=1
!
t=1
Ft
t=1
T
1X ~
Fs
T
HFs
s=1
Starting with bf 4:2 , and given that
st
0
st
st
T h
1 X
+
Ft
T
= O
1
p
TN
st
s=1
!
bf 4:1 + bf 4:2 :
we have that
T h
T
1
1 X 1 X 0 0 es 0 0
Ft
Ft
F H =p
T
T
N s
TN
t=1
s=1
bf 4:2 =
T
1X
(HFs )0
T
t=1
0e
s
N ;
= Ft0
!
T h
1 X
Ft Ft0
T
t=1
p
OP (1) OP (1) OP (1) = OP
1
TN
!
T
1 X
p
T N s=1
0
es Fs0
!
H0
;
given Assumptions 1, 3(c) and the fact that H = OP (1) : Thus,
p
T b4f:2 = oP (1).
Next, consider bf 4:1 . We have that
bf 4:1
T h
T
1 X 1X ~
=
Fs
Ft
T
T
t=1
0
HFs
Ft0
s=1
0e
s
=
N
T h
1 X
Ft Ft0
T
t=1
!
T
1 X 0 es ~
Fs
T
N
HFs
s=1
0
!
;
where the …rst term is OP (1) and the second term can be shown to be OP N1 : Thus, we will
p
get a non negligible contribution from bf 4:1 when multiplying by T . Speci…cally, using the
usual decomposition for F~s HFs , we have that
T
1 X 0 es ~
Fs
T
N
HFs
0
s=1
=
T
1 X 0 es
(A1s + A2s + A3s + A4s )0 V~
T
N
1
:
(19)
s=1
We …rst show that the …rst, second and last terms are oP (1) when multiplied by
p
T . To end
the proof we will study the third term. Starting with the …rst term in (19) (and ignoring V~ 1 ),
!1=2
!1=2
T
T
T
0e 2
1X
1
1 X 0 es 0
1X
1
s
2
OP p
;
A
kA1s k
= OP p
T
N 1s
T
N
T
N
T
s=1
s=1
s=1
p
so that this term is oP (1) when multiplied by T . Similarly,
!1=2
!1=2
T
T
T
0e 2
1 X 0 es 0
1X
1X
1
1
s
2
A2s
kA2s k
= OP p
OP p
= OP
T
N
T
N
T
N
N
N
T
s=1
s=1
s=1
so that when we multiply by
p
T we get OP
p
T
N
OP
1
NT
= oP (1).
For the last term in (19), a more careful analysis is required. We have that
T
1 X 0 es 0
A
=
T
N 4s
s=1
=
T
1 X 0 es
T
N
1
T
s=1
T
X
s=1
T
1X ~
Ft
T
T
1X ~
Ft
N T
0e
s
HFt
t=1
t=1
35
HFt
0
T
1X
+
H
Ft
ts
T
1
ts + H
T
t=1
T
X 0 es
s=1
ts
!0
T
1X 0
Ft
N T
t=1
ts :
(20)
1
N
NT
;
For the …rst term in (20),
T
T
T
T
1 X 0 es 1 X ~
1X ~
1 X 0 es
Ft HFt ts =
Ft HFt
T
N T
T
T
N ts
s=1
t=1
t=1
s=1
1
!1=2 0
2 1=2
T
T
T
0e
X
X
2
1
1
1
1
1X ~
s
@
A = OP
Ft HFt
OP
= OP
ts
T
T
T
N
N
NT
t=1
t=1
s=1
1
N
;
NT
since
2
T
T
1 X 1 X 0 es
T
T
N
t=1
ts
s=1
2
T
T
1 X 1 X 0 es
=
T
T
N
t=1
T
1X
T
st
s=1
0e 2
s
N
s=1
T
T
1 XX
j
T2
t=1 s=1
2
st j
= OP
1
N2
For the second term in (20), given Assumption 3(c),
T
T
1 X 0 es 1 X 0
Ft
T
N T
s=1
T
T
0e
1 X 0 es 1 X
t
Fs0
Ft0
T
N T
N
s=1
t=1
!
T
T
1
1 X 0
1 X
0
p
p
es Fs
TN
T N s=1
T N t=1
=
ts
t=1
=
1
Thus, (20) is OP
N
NT
+ OP
1
NT
1
= OP
N
0
et Ft0
!
= OP
; which when multiplied by
NT
p
1
TN
:
T is oP (1).
Next, we analyze the dominant term in bf 4:1 which comes from the contribution involving A3s .
For this term, we have that
T
1 X 0 es 0 ~
A V
T
N 3s
1
=
s=1
given that
F 0 F~
T
T
1 X 0 es
T
N
s=1
T
X
T
1X~
Ft
T
t=1
F~ 0 F
0e
s
0e
s
=
1
T
=
1
( + oP (1)) Q0 V
N
s=1
ts
N
T
N
1
!0
!0
V~
1
T
1 X 0 es
=
T
N
s=1
1 1
NT
=
T
X
s=1
0e
s
p
t=1
F 0 F~ ~
V
T
e0s
p
N
!0
T
1 X ~ 0 0 es
Ft Ft
T
N
N
1
;
;
= Q0 + oP (1) by Proposition 1 of Bai (2003) and that by Assumption 3(e),
T h
1 X
T
t=1
0e
t
p
e0
pt
N
N
=
+ oP (1) :
Thus,
p
T bf 4:1 =
T h
1 X
Ft Ft0
T
t=1
!
p
T
( + oP (1)) Q0 V
N
1
;
!
=c
F
Q0 V
1
+ oP (1) :
Since
T h
1 X
C = Hp
Ft (A1t + A2t + A3t + A4t )0 V~
T t=1
36
1
=
p
T H (bf 1 + bf 2 + bf 3 + bf 4 ) V~
1
;
V~
1
:
it follows that
C = cH0
Q0 V
F
1
1
V
because H0 = p lim H is such that H0
F
+ oP (1) = cQ Q0 V
2
+ oP (1) ;
1Q
= Q: Indeed, note that H0 = p lim H = V
;
which implies that
H0
F
= V
1
V 1=2
= V
1
V 1=2 V
1=2
0
F
0
1=2
= V 1=2
given the de…nition of Q and the fact that
1
=V
V 1=2
1=2
0
is such that
0
1=2
1=2
1=2
F
= Q;
1=2
1=2
=
F
V:
Proof of part c). The proof follows closely the proof of part b) by relying on moment and
dependence conditions involving the extra regressors Wt . In particular, writing
T h
1 X
p
Wt F~t
T t=1
we verify
p
0
HFt
=
T h
1 X
p
Wt (A1t + A2t + A3t + A4t )0 V~
T t=1
p
T (d1 + d2 + d3 + d4 ) V~ 1 ;
1
T di = oP (1) for i = 1; 2; 3 by using the same arguments as for bf 1 , bf 2 and bf 3 . The only
term that has a nonzero contribution is d4 . Following the same arguments as for bf 4 ; we have that
!
!
T
T
0e
Xh
X
p
p
0
1
1
s
1
0
T d4 V~
=
T
W t Ft
F~s HFs
V~ 1
T
T
N
t=1
s=1
!
p
T
h
T 1 X
=
Wt Ft0
Q0 V~ 1 + oP (1) V~ 1
N
T
t=1
0
2
= c
WF
QV
= c
0
W F H0 V
= c
W F~ V
+ oP (1) = c
1
V
F~ V
1
Q Q0 V
1
0
W F H0 Q
V
+ oP (1) , since
1
Q0 V
2
+ oP (1) ; since H00 Q = Ir
+ oP (1)
W F~
=
0
W F H0
and
F~
=V
1
Q Q0 V
1
:
This ends the proof.
Proof of part d). This follows immediately from parts a), b) and c) of this Lemma.
B
Appendix B: Proofs of results in Section 3
This Appendix is organized as follows. First, we provide some auxiliary lemmas, then we prove the
results in Section 3, and …nally we prove the auxiliary lemmas.
Lemma B.1 Let H = V~
a) H H
0
2
NT
= Ir + OP
b) H = H0 +OP
2
NT
1 F~ 0 F~ ~ 0 ~ .
T
N
Under Conditions A*-D*, we have that if
NT
= min
p
N;
p
T ;
, in probability, i.e. H is asymptotically an orthogonal matrix.
, in probability, where H0 is a diagonal matrix with
37
1 on the main diagonal.
c) V~ = H V~ H 0 + OP
= V~ + OP
2
NT
2
NT
, in probability.
Lemma B.2 Assume Assumptions 1-5 hold and suppose we generate bootstrap data yt+h ; Xt
ac-
cording to the residual-based bootstrap DGP (7) and (8) by relying on bootstrap residuals "t+h and
fet g such that Conditions A*-D* are satis…ed. Then, as N; T ! 1,
T h
1 X ~
Ft
T
1
p
H F~t "t+h = OP
NT
t=1
in probability, for h
;
T
0.
Lemma B.3 Suppose conditions A*-D* hold. Then, as N; T ! 1;
a)
T h
1 X ~
Ft
T
H F~t
F~t
H F~t
0
1 ~
V
N
=
t=1
1
"
H
1
T
+OP
T h
1 X
T
~ 0e
p t
N
t=1
!
1
+ OP
et 0 ~
p
N
!#
p
+ OP
N
NT
~ 0e
p s
N
!
H 0 V~
1
1
NT
:
b)
T h
1 X
H F~t F~t
T
0
H F~t
t=1
1
p
+OP
NT
1
+ OP
T
T h
1 X ~ ~0
Ft F t
T
1
=H
N
N
t=1
!"
T
1X
T
s=1
es0 ~
p
N
!#
!
F~ 0 F~
T
V~
:
NT
c)
T h
1 X
Wt F~t
T
0
H F~t
t=1
1
p
+OP
NT
T
1
=
N
t=1
1
+ OP
N
T h
1 X
Wt F~t0
T
!"
T
1X
T
s=1
1
p
T
T
Xh
F~t
Wt F~t
H F~t
H F~t
0
0
H
H
t=1
in probability, where ~ W F~ =
0
0
1
1
PT
es0 ~
p
N
!#
F~ 0 F~
T
!
V~
:
^ = c H0 0
0; if
1
h
|
V~
h
^ = c ~ W F~ V~
{z
|
B
1
T
!
NT
Lemma B.4 Suppose conditions A*-D* hold. For any h
T h
1 X ~
p
Ft
T t=1
~ 0e
p s
N
h
~0
t=1 Wt Ft :
38
1
p
T =N ! c, 0
V~
1
+
{z
V~
B
2
i
^ + oP (1) ;
}
c < 1; then
2
i
^ + oP (1) ;
}
2
2
Proof of Lemma 3.1. The proof is based on the following identity:
0
B
T
B1 X
F~s
BT
@ s=1
|
{z
1B
H F~t = V~
F~t
where
C
T
T
T
C
1X~
1X~
1X~
Fs st +
Fs st +
Fs st C
st +
C;
T
T
T
A
s=1
s=1
s=1
} |
{z
} |
{z
} |
{z
}
A1t
A2t
!
N
1 X
eis eit
N
= E
st
i=1
st
,
st
A3t
N
1 X
=
(eis eit
N
st
=
i=1
Ignoring V~
1
2
H F~t
t=1
N
1 X ~0 ~
i Ft eis =
N
T
1X
kA1t k2
T
1
T
t=1
=
T
1X
kA1t k2 + kA2t k2 + kA3t k2 + kA4t k2 ;
T
PT
T
1X ~
Fs
T
s=1
|
{z
kF~ k2
T
=r because
T
1X
kA2t k2
T
st
2
t=1
s=1
!
}
~ 0F
~
F
T
T
1X ~
Fs
T
s=1
|
{z
PT
2
~
s=1 Fs
For the second term, we have that
2
=r
=Ir
T
T
1 XX
E j
T2
t=1 s=1
T
T
1 1 XX
j
=
E
st
N T2
t=1 s=1
1
N
which explains why the second term is OP
T
T
1X
1X
kA3t k2 =
T
T
T
t=1
OP
t=1
PT
~ ~0
s=1 Fs Fs
2
2
T
X
s=1
F~s F~s0
~ 0e
t
N
s=1
T
T
1 XX
T
t=1 s=1
|
{z
2
st
!
2
st
; implying that
1
T
= OP
:
}
=OP (1) by Condition A*(b).
!
T
T
1 XX
j
T2
t=1 s=1
}|
{z
1
OP ( N
)
2
st j
!
1
N
= OP
:
}
2
N
1 X
p
(eis eit
N i=1
2
PT
2
F~s
In particular, by Condition A*(c), we can show that
1
N
ts :
i=1
t=1
By the Cauchy-Schwartz inequality,
1
E (eis eit )) ;
(which is OP (1)), it follows that
T
1X ~
Ft
T
since T
A4t
i=1
N
~ 0 et
1 X ~0 ~
0
~
and
F
e
=
F
i s it
s
N
N
=
1
E (eis eit ))
= OP
1
N
;
: For the third term,
2
T
1 X ~ 0 et
T
N
t=1
2
T
1
T
X
2
F~s F~s0
s=1
r2 ; whereas by Condition B*(d) and Markov’s inequality,
: The fourth term in dt follows by the same arguments.
39
= OP
1
T
PT
t=1
1
N
~ 0e
t
N
;
2
=
Proof of Theorem 3.1. The proof follows the proof of Theorem 2.1. In particular, we can write the
bootstrap analogue of (17), viz
p
T ^
T h
1 X
z^t z^t 0
T
=
t=1
!
1
(A + B + C ) ;
where conditional on the original data, with probability converging to one, we have that
T h
1 X
p
z^t "t+h !d N 0;
T t=1
A =
given Conditions D*(b) and E*, and given that
p lim
0
0
0
;
0
;B =
p1
T
oP (1) (given Lemma B.2);
C =
0
T h
1 X
p
z^t F~t
T t=1
H F~t
0
1 0
H
^ !P
PT
h
t=1
0
c
F~t
1
H F~t "t+h =
;
0
is de…ned in Lemma B.4. Under Assumptions 1-5, p lim V~ = V and
p lim ^ = (H00 ) 1 , p lim ~ W F~ = W F~ , and p lim
= Q Q0 by Condition F*, which implies that
p
P
!d
!P
; and …nally T1 Tt=1h z^t z^t 0 = 0 0 zz 00 00 + oP (1). This implies that T ^
B 0; B
where
N
c(
0) 1
0
;(
0
0) 1
0
(
0)
1
, in probability.
p
Proof of Corollary 3.1. By Theorem 2.1, under Assumptions 1-5, and if T =N ! c, 0 c <
p
1, T ^
!d Z
N ( c ; ) : Thus, from a multivariate version of Polya’s Theorem (cf.
p
T ^
x
(x; c ; ) = o (1),
Battacharya and Rao (1986)), it follows that supx P
where
(x; c
;
) denotes the distribution function of Z. Then, the result follows if we show that
p
0^
^
sup P
T
x
(x; c ; ) = oP (1) :
(21)
x
Under the stated assumptions, from Theorem 3.1 we have that
p
T
0^
^ !d N ( c
;
), in
probability. By arguing along subsequences and applying Polya’s Theorem, this su¢ ces to show (21).
Proof of Lemma B.1. For part a), writing
1 ~
F~ 0 F~
=
F
T
T
and noting that
1 ~0 ~
TF F
F~ H
0
0
1
F~ + H F~ 0 F~ ;
T
= Ir , yields
H =
F~ 0 F~
+ OP
T
2
NT
;
(22)
where we used Lemma B.3 to bound the second term. Now right multiply by H
H H
0
F~ 0 F~ 0
=
H + OP
T
2
= Ir + OP
NT ;
2
NT
F~
=
0
F~ H
F~ + F~
0
T
40
+ OP
2
NT
=
0
to get
F~ 0 F~
+ OP
T
2
NT
where again we used Lemma B.3 and the fact that by construction
H
0
1
=H
+ OP
2
NT
F~ 0 F~
T
= Ir . This shows that
, i.e. H is asymptotically an orthogonal matrix. The eigenvalues of H are
therefore +1 or -1, for large N and T . For part b), by de…nition,
!
~0 ~
~ 0 F~ ~ 0 ~
F
1
H = V~
= V~ 1 H
+ OP
T
N
N
2
NT
;
given (22). Left multiplying both sides by V~ yields
V~ H = H
~0 ~
N
since V~ =
~0 ~
N
+ OP
2
NT
= H V~ + OP
2
NT
;
(23)
by construction of the principal components. By Lemma A.3 of Bai (2003), V~ !P V > 0
and therefore we can write that
V~ H = H V + oP (1) ;
or, transposing, that
VH
Thus, H
H
0
0
0
= H 0 V~ + oP (1) :
is (for large N and T ) the matrix of eigenvectors of V . Since V is a diagonal matrix,
is also diagonal, asymptotically. Moreover, because V has distinct eigenvalues (by assumption), it
follows that its eigenvectors have only one nonzero value and this is +1 or -1 (because H is orthogonal).
Therefore H
0
is for large N and T a diagonal matrix with 1 in the main diagonal (in particular,
H = diag(sign(F~ 0 F~ ))). Part c) follows from (23) by right multiplying by H 0 and using parts a)
and b).
Proof of Lemma B.2. The proof follows exactly as the proof of Lemma A.1, thus we only
highlight the di¤erences. Throughout, we denote with a star the bootstrap analogue of a given
formula in that proof. First, to bound I2 , the bootstrap analogue of I2 , we use directly Condition
1
T
C*(c) to conclude that I2 = OP
, in probability. Second, to bound I1 , we rely on Lemma 3.1 and
on Conditions A*(b) and D*(a). For II1 , we use Condition C*(a) instead of Assumption 4(a). For
P
II2 , we use Condition B*(b) instead of Assumption 3(b) to show that T1 Tt=1h E kmt k2 = OP (1),
P
2 =O
and Condition D*(a) instead of Assumption 5(a) to show that T1 Tt=1h "t+h
P (1), in probability.
To bound III1 and III2 ; we rely on Condition C*(b) instead of Assumption 4(b). Finally, to bound
IV , we rely on Conditions B*(d) and D*(b) instead of Assumptions 3(d) and 5(b), respectively.
Proof of Lemma B.3. Proof of part a). We follow closely the proof of Lemma A.2. Speci…cally,
we analyze each of the terms in
T h
1 X ~
Ft
T
H F~t
F~t
H F~t
t=1
0
= V~
1
T h
1 X
(A1t + A2t + A3t + A4t ) (A1t + A2t + A3t + A4t )0 V~
T
t=1
In particular, by Lemma 3.1, and the appropriate bootstrap high level conditions, we can show that:
1
T
PT
h
0
t=1 A1t A1t
= OP
1
T
, in probability, using Condition A*(b).
41
1
:
1
T
1
T
PT
h
0
t=1 A2t A2t
1
= OP
PT
h
0
t=1 A3t A3t
F~ 0 F~
T
1
NH
=
, in probability, using Conditions A*(c) and B*(b).
2
NT
N
1
T
using Condition B*(d).
Each of
1
T
PT
h
0 1
t=1 A4t A4t , T
PT
~ 0e
p t
N
h
t=1
PT
h
0 1
t=1 A2t A3t , T
in probability, using condition B*(d).
1
T
1
T
PT
h
0
t=1 A1t A2t
= OP
PT
h
0
t=1 A1t A3t
and
1
T
Thus, we conclude that
T h
1 X ~
Ft
T
H F~t
F~t
NT
1
p
h
0
t=1 A1t A4t
H F~t
F~ 0 F~
T
PT
h
0
t=1 A2t A4t
1
H 0 + OP
and
1
T
N
NT
PT
h
0
t=1 A3t A4t
, in probability,
1
is OP
N
NT
,
, in probability.
NT
PT
0
et 0 ~
p
N
p1
NT
are OP
1
1
H
N
+OP
1
T
= V~
t=1
, in probability.
F~ 0 F~
T
!"
+ OP
T h
1 X
T
t=1
1
N
~ 0e
p t
N
!
+ OP
NT
et 0 ~
p
N
p
!#
1
NT
F~ 0 F~
T
!
H 0 V~
:
Proof of part b). We analyze each of the terms in
T h
1 X ~
H
Ft (A1t + A2t + A3t + A4t )0 V~
T
1
=H
bf 1 + bf 2 + bf 3 + bf 4 V~
1
;
t=1
PT
h ~
0
t=1 Ft Ait ,
1
T
where we let bf i
for j = 1; : : : ; 4: Following exactly the same steps as in the proof of
part b) of Lemma A.2, we can show that:
1p
bf 1 = OP
NT
bf 2 = OP
p1
TN
, in probability, given Condition B*(b) and Lemma 3.1.
bf 3 = OP
p1
TN
, in probability, given Condition B*(c) and Lemma 3.1.
bf 4 =
1
N
1
T
, in probability, given Condition B*(a) and Lemma 3.1.
T
PT
h ~ ~0
t=1 Ft Ft
h P
T
1
T
~ 0e
p s
N
s=1
es 0 ~
p
N
i
F~ 0 F~
T
V~
1 +O
P
p1
TN
+ OP
1
N
NT
, in
probability, given in particular Condition B*(c) and Lemma 3.1. Thus, we can conclude that
!
!#
!
!"
T h
T
T
0~
Xh
X
~ 0e
~ 0 F~
0
1 X
1
e
F
1
1
p s
ps
H F~t F~t
H F~t = H
F~t F~t0
V~
T
N T
T
T
N
N
t=1
t=1
s=1
+OP
1
p
NT
T
+ OP
1
N
:
NT
Proof of part c). This follows by the same arguments used in the proof of part c) of Lemma A.1.
42
2
1
Proof of Lemma B.4. From prove part a) of Lemma B.3, and noting that
F~ 0 F~
T
= Ir , we have
that
T h
0
1 X ~
1
p
H 0
Ft
H F~t F~t
H F~t
^
T t=1
" T h
!
!#
p
1 X ~ 0 et
et 0 ~
T ~ 1
p
p
V
H
H 0 V~ 1 H
=
N
T
N
N
t=1
!
p
1
1
T 1
p
p
+OP
+ OP
+ OP
N NT
T
N
= cV~
1
H 0 V~
= c H
0
1
V~
1
= c H0 0
1
V~
1
H
1
1
H
V~
H
H
1
0
1
H
0
1
0
^
^ + oP (1)
H
0
1
V~
1
H
1
1
0
H
^ + oP (1)
^ + oP (1)
where the second equality uses Condition B*(e), and the third equality follows from Lemma B.1.c).
1 (H 0 ) 1
The last equality uses the fact that H0 = diag ( 1), which implies that H
^ = ^ + oP (1).
This proves part a). Similarly, from part b) of Lemma B.3 and given Condition B*(e), we get that
T h
0
1 X
1
p
H F~t F~t
H F~t
H 0
^
T t=1
!"
!
!#
p
T h
T
1 X ~ ~0
1 X ~ 0 es
T
es0 ~
p
p
Ft Ft
=
H
N
T
T
N
N
t=1
s=1
!
F~ 0 F~
1
V~ 2 H 0
^ + oP (1)
= cH
T
= cH
V~
1
H
0
H
0
1
V~
1
H
1
H
0
1
F~ 0 F~
T
!
V~
^ + oP (1) = cH0
2
H
V~
2
1
0
^ + oP (1)
^ + oP (1) ;
P
where the second equality uses the fact that T1 Tt=1h F~t F~t0 = Ir +oP (1) ; the third equality uses Lemma
~0 ~
B.1.c) and the fact that F TF V~ 1 = V~ 1 H 0 , and the last equality holds because H 1 (H 0 ) 1 =
Ir + oP (1), in probability.
To prove part c), we note that from part c) of Lemma B.3, we have that
T h
0
1 X
1
p
Wt F~t
H F~t
H 0
^
T t=1
!
!#
!"
p
T h
T
0~
X
~ 0 es
T 1 X
1
e
p
ps
=
Wt F~t0
N
T
T
N
N
t=1
s=1
!
F~ 0 F~
1
V~ 2 H 0
^ + oP (1)
= c ~ W F~
T
= c ~ W F~
V~
2
F~ 0 F~
T
^ + oP (1) ;
by relying on the same arguments as those used to prove part b).
43
!
V~
2
H
0
1
^ + oP (1)
C
Appendix C: Proofs of results in Section 4
First, we state an auxiliary result and its proof. Then we prove Theorem 4.1.
Lemma C.1 Suppose Assumptions 1-5 hold. If in addition either: (1) fFs g, f i g and feit g are
mutually independent and for some p
2, E jeit j3p
M < 1, or (2) for some p
follows that
(i)
PT
1
T
(ii)
(iii)
F~t
t=1
1
N
PN
~i
i=1
1
TN
PT
t=1
p
HFt
10
H
PN
~pit
i=1 e
M < 1; E k i kp
2, E jeit j2p
M < 1; E k i k3p
M < 1 and E kFt kp
M < 1 and E kFt k3p
M < 1, it
= OP (1) ;
p
= OP (1) ;
i
= OP (1) :
Proof of Lemma C.1. Proof of (i). We rely on the following identity (see Bai and Ng (2002),
proof of Theorem 1):
F~t
HFt = V~
1
T
1X~
Fs
T
T
1X~
+
Fs
st
T
s=1
where
st
=
1
N
PN
i=1 eis eit ;
T
1X ~
Ft
T
st
=
HFt
1
N
p
PN
0
i Fs eit ;
i=1
3p
s=1
V~
1
1
p
t=1
st
s=1
!
;
It follows that by the c
!
T
T
T
1X
1X
1X
at +
bt +
ct ;
T
T
T
and
st
=
T
1X~
+
Fs
st
T
ts :
t=1
t=1
r inequality,
t=1
where
T
1 X~
at = p
Fs
T
p
st
s=1
Let
st
denote either
st ,
T
X
s=1
st
s=1
or
p
F~s
st
T
1 X~
; bt = p
Fs
T
st .
We can write
0
1
2 p=2
T
X
=@
F~s st A
s=1
p
st
t=1
s=1
p
st
T
T
C
1 XB
B 1 X ~ 2C
Fs C
B
T
@ T s=1
A
t=1
|
{z
}
rp=2
1
T2
=r
T
T
XX
t=1 s=1
j
p
st j ;
44
st
:
s=1
T
X
F~s
s=1
T
1X
j
T
s=1
T
2X
s=1
where the inequality follows by Cauchy-Schwartz. It follows that
0
1p=2
T
T
1X 1 X~
Fs
T
Tp
p
T
1 X~
; and ct = p
Fs
T
2
st j
!p=2
j
2
st j
!p=2
;
T
1X
rp=2
T
t=1
T
1X
j
T
s=1
2
st j
!p=2
where the last inequality follows again by the c
r inequality. Thus it su¢ ces to show that E j
M < 1 to prove that the above term is OP (1). Starting with
p
N
1 X
E j st j = E
eit eis
N
N
1 X
E jeit eis jp
N
p
i=1
N
1 X
E j st j = E
N
st ,
1=2
E jeis j2p
i=1
M < 1: When
p
=
N
1 X
E jeit j2p
N
i=1
given that we assume E jeit j2p
st
=
st
p
st ,
i=1
M < 1;
we have that
N
1 X
E
N
0
i Fs eit
1=2
p
st j
p
0
i Fs eit
< M;
i=1
since
E
2=3
3p
p
0
i Fs eit
1=3
E kFs k3p
E k i eit k 2
given the assumptions that E k i k3p
E k i k3p E jeit j3p
M , E kFs k3p
1=3
E kFs k3p
M , and E jeit j3p
1=3
M
M in obtaining the last
inequality. Note that if we assume that f i g, fFs g and feit g are three mutually independent groups
of random variables, then it su¢ ces that E k i kp
E
M , E kFs kp
p
0
i Fs eit :
M , and E jeit jp
The term that depends on st = st can be dealt with similarly.
0~
~0
Proof of (ii). Note that ~ = XTF , which implies that ~ 0 = FTX . Since X = F
M to bound
0
+ e, it follows
that
~0
~0 = F F
T
thus implying that ~ i =
F~ 0 F
T
i
+
F~ 0 ei
T ;
1
F~ 0 F~
F H0
T
T
i
+
F~ 0 e
;
T
where ei = (ei1 ; : : : ; eiT )0 . We can write
0
F~ 0 F H
F~ 0 F~
=
T
T
from which it follows that
0
~0
~0
~ i = F F H H 0 1 i + F ei = H 0
T
T
0
1
F~ 0 F~
F~ 0 F~
= Ir
F H0 H0
F H0
T
1
1
i +T
F~
;
0
F H0
ei +T
1
0
F H 0 ei :
Thus,
1
N
N
X
~i
H
10
p
i
3p
i=1
0
1B
For the …rst term, we have that
N
1 X
T
N
1
F~ 0 F~
@
F H0 H0
1
N
+ N1
1
PN
T
i=1
p
T
i
PN
i=1
F~
1
1=2
F~
p
1F
~0
F~
F H0 H0 1 i
p
0
P
F H 0 ei + N1 N
i=1 T
T
T
1=2
F~
F H0
p
H0
i=1
p
1 (F H 0 )0 e p
i
1 p
T
1=2
F~
p
=
T
1
F~
2
p=2
=
T
1
T
X
t=1
45
F~t
2
!p=2
= rp=2 ;
C
A:
N
1 X
k i kp :
N
i=1
Now,
1
given that F~ 0 F~ =T = Ir . Similarly, under Assumptions 1-5,
F~
1=2
T
p
F H0
=
T
T
X
1
F~t
HFt
t=1
1 p
Since H 0
2
!p=2
p
NT
= OP
= OP (1) :
= OP (1), it follows that the …rst term is OP (1) provided E k i kp
holds by assumption.
M < 1, which
For the second term,
N
1 X
T
N
1
F~
0
F H0
p
ei
1=2
= T
F~
p
F H0
t=1
N
1 X
T
N
1=2
ei
p
;
t=1
where the …rst factor is OP (1) and the second factor is dominated by
N
1 X
N
T
1=2
ei
2
p=2
N
1 X
=
T
N
N
1 X
=
T
N
p=2
1 0
ei ei
t=1
i=1
t=1
p
which is OP (1) given the assumption that E jeit j
using in particular the fact that E kFt k2
T
X
e2it
t=1
!p=2
N T
1 XX p
eit ;
NT
t=1 t=1
M: The third term can be bounded similarly
M < 1:
Proof of (iii). We can write
F~t
0
1
iH
e~it = eit
1
~i
HFt
H
10
0
i
F~t ;
which implies that
T N
1 XX
j~
eit jp
NT
3p
1
t=1 i=1
+3p
T N
1 XX
jeit jp + 3p
NT
1
1
N
t=1 i=1
N
X
~i
H
10
1
p
i
i=1
N
1 X
k i kp H
N
1
T
i=1
T
X
F~t
p
1 p
T
1X ~
Ft
T
HFt
p
t=1
:
t=1
The …rst term is OP (1) given that E jeit jp = O (1); the second term is OP (1) since E k i kp = O (1)
and given part (i); and the third term is OP (1) given parts (ii) and (iii), since in particular
T
1X ~
Ft
T
t=1
p
T
1X
HFt + F~t
T
HFt
p
2p
T
T
1X
1X ~
kHkp
kFt kp +
Ft
T
T
1
t=1
t=1
t=1
HFt
p
!
= OP (1) :
Proof of Theorem 4.1. We verify Condition A*-F*. We start with Condition A*. Since eit =
e~it
it where i.i.d. (0; 1) across (i; t), part a) follows immediately. For part b), note that st =
2
P
PT
N
1
1 PT
1 PN
2 = 1
2
e
~
e
~
1
(t
=
s)
;
which
implies
that
e
~
: This expression is
it
is
i=1
t;s st
t=1 N
i=1 it
N
T
T
1 PT
1 PN
4
bounded by T t=1 N i=1 e~it ; which is OP (1) under our assumptions by an application of Lemma
C.1 (iii) with p = 4: For c), note that for any (t; s),
E
N
1 X
p
(eit eis
N i=1
2
E (eit eis ))
=
N N
1 XX
Cov
N
i=1 j=1
46
eit eis ; ejt ejs :
For the wild bootstrap where eit = e~it
Cov
it , with
eit eis ; ejt ejs = e~it e~is e~jt e~js Cov
i.i.d. across (i; t),
it
it is ; jt js
e~2it e~2is V ar ( it is ) if i = j
;
0
if i 6= j
=
which implies that
N
1 X
p
(eit eis
N i=1
E
2
E (eit eis ))
N
1 X 2 2
=
e~it e~is V ar (
N
it is ) :
i=1
Thus, condition A*(c) becomes
N
T
T
1 XX 1 X 2 2
e~it e~is V ar ( it
|
{z
T2
N
t=1 s=1
i=1
for some constants
is )
}
N
1 X
N
T
1X 2
e~it
T
t=1
i=1
!2
C
N T
1 1 XX 4
e~it = OP (1) ;
NT
i=1 t=1
and C, which holds given Lemma C.1 (iii) with p = 4: For Condition B*(a),
under the wild bootstrap, in particular the bootstrap time series independence, we have that
!
T
T
T
T
N
1 X X ~ ~0
1 X ~ ~0
1 X ~ ~0 1 X 2
Fs Ft st =
Ft Ft tt =
Ft Ft
e~it
T
T
T
N
t=1 s=1
t=1
t=1
i=1
2
31=2
!
!
!1=2 "
#1=2
1=2
2
T
T
N
T
T X
N
X
X
X
X
4
1
1 X ~ ~0 2
1
1
1
1
4
Ft Ft
e~2it 5
F~t
e~4it
;
T
T
N
T
TN
t=1
t=1
t=1
i=1
t=1 i=1
which is OP (1) under our assumptions. For Condition B*(b), we have that
"
2
T N
N
T
T
X
1X1
1 XX
1 X
p
z^s (eit eis E (eit eis )) =
(eit eis
E
z^s p
T
T
T N s=1 i=1
N i=1
t=1
t=1
s=1
1
0
N
N
T
T
T
X
X
1 X 1 XX 0
1
1
z^s z^l E @ p
(eit eis E (eit eis )) p
ejt ejl E ejt ejl A ;
T
T
N i=1
N j=1
t=1
s=1 l=1
T
1X
E
T
=
where
0
N
1 X
E @p
(eit eis
N i=1
N
1 X
E (eit eis )) p
ejt ejl
N j=1
E
1
N N
1 XX
ejt ejl A =
Cov
N
E (eit eis ))
eit eis ; ejt ejl :
i=1 j=1
Using the properties of the wild bootstrap, in particular the bootstrap cross sectional independence,
we can show that
Cov
eit eis ; ejt ejl = 0 when i 6= j for any t; s; l:
When i = j;
Cov
eit eis ; ejt ejl = Cov (eit eis ; eit eil ) =
0
V ar (eit eis ) = e~2it e~2is V ar (
47
it
if s 6= l
)
if
s = l:
is
#
2
It follows that
T
1X
E
T
=
=
1
T
t=1
T
X
t=1
2
E (eit eis ))
T
N
1X 0 1 X 2 2
z^s z^s
e~it e~is V ar (
T
N
N
1 X
N
i=1
"
T N
1 XX
p
z^s (eit eis
T N s=1 i=1
s=1
it is )
i=1
T
1X 2
e~it
T
t=1
N T
1 1 XX 4
e~it
NT
i=1 t=1
!
T
1X
k^
zs k2 e~2is
T
s=1
#1=2 "
2
!
N
1 X
N
i=1
N
X
41
N
i=1
T
T
N X
X
1X
4 1 1
k^
zs k
e~4is
T
NT
s=1
i=1 s=1
!
T
1X 0 2
z^s z^s e~is
T
t=1
s=1
!2 31=2 2
!2 31=2
T
N
T
X
X
X
1
1
1
e~2it 5 4
k^
zs k2 e~2is 5
T
N
T
T
1X 2
e~it
T
#1=2
!
t=1
i=1
s=1
:
This expression is OP (1) under our assumptions. Next consider Condition B*(c). We have that
!
2
2
T
N
T
X
X
1 X ~ 0 et 0
1
0
~
p z^t =
^t
E
E p
i eit z
TN
T t=1 N
t=1
i=1
! !
T
N
X
X
1
0
2
~ ~iE e
=
z^t
tr
z^t0
i
it
TN
t=1
i=1
!
T
N
T
N
X
X
X
X
2
1
0
2 1
~ ~ i e~2 = 1
~ i e~2
z^t0 z^t
k^
z
k
=
t
i
it
it
TN
T
N
t=1
t=1
i=1
i=1
0
!
!2 11=2
1=2
T
T
N
X
X
X
2
1
1
~ i e~2 A :
@1
k^
zt k 4
it
T
T
N
t=1
t=1
i=1
|
{z
}
=OP (1) under our assumptions
But by Cauchy-Schwartz,
T
1X
T
N
1 X ~
i
N
t=1
2
i=1
e~2it
!2
N
T
N
1 X ~ 41 X 1 X 4
e~it = OP (1) ;
i
N
T
N
t=1
}|
{z i=1 }
| i=1{z
OP (1)
under our conditions. In particular, note that
N
1 X ~
i
N
i=1
4
2
3
N
1 X
H
N
=OP (1)
4
10
i
i=1
N
1 X ~
+
i
N
H
i=1
10
4
i
!
and use Lemma C.1 (ii) with p = 4 to bound the second term. The …rst term is bounded by the
assumptions on f i g. For Condition B*(d), we have that
E
~ 0e
p t
N
2
=E
N
1 X~
p
i eit
N i=1
2
=
N N
1 X X ~0 ~
i jE
N
|
i=1 j=1
48
N
1 X ~0 ~ 2
eit ejt =
~it ;
i ie
{z } N i=1
=0 for i6=j
since Cov
it ; jt
= 0 and V ar (
it )
= 1: Thus, Condition B*(d) becomes
!
T
N
N
T
1 X 1 X ~0 ~ 2
1 X ~ 2 1X 2
=
e~it
~it =
i
i ie
T
N
N
T
t=1
t=1
t=1
i=1
i=1
!1=2 0
!2 11=2
!1=2
N
N
T
N
X
X
X
X
4
4
1
1
1
1
2
~i
~i
@
e~it A
N
N
T
N
T
1X
E
T
~ 0e
p t
N
2
i=1
t=1
i=1
i=1
N
T
1 X1X 4
e~it
N
T
t=1
i=1
!1=2
= OP (1) ;
under our assumptions, by an application of Lemma C.1. For Condition B*(e), we need to show that
T
1X
T
A
t
t
0
E
t
t
0
T
N N
1 X 1 X X ~ ~0
i j eit ejt
T
N
=
t=1
t=1
E
eit ejt
= oP (1) :
i=1 j=1
This expression has mean zero under the bootstrap measure by construction. So, it su¢ ces to show
that its variance tends to zero in probability. Take the case where the number of factors r is equal to
1, for simplicity. Then,
V ar (A ) = V ar
T
1X
T
t
2
E
t
!
2
t=1
T
T
N
1 XX 1 X ~ ~ ~ ~
= 2
i j l k Cov
T
N2
t=1 s=1
eit ejt ; els eks = 0 if t 6= s, for any
Using the properties of the wild bootstrap, we can show that Cov
(i; j; k; l) ; and
Cov
eit ejt ; elt ekt
8 4
2
< e~it V ar
it if i = j = k = l
2
2
e
~
e
~
,
if
i = k 6= j = l
=
:
it jt
:
0, otherwise
This implies that for some …nite constant ;
0
T
N
1 X 1 @X ~ 4 4
V ar (A ) =
~it V ar
ie
T2
N2
t=1
T
1 X
T2
t=1
= OP
provided
1
N
PN ~ 4
i=1 i = OP (1) and
1
T
1
NT
i=1
N
1 X ~2 2
~it
ie
N
i=1
!2
2
it
eit ejt ; els eks :
i;j;k;l
+
N
X
i6=j
1
~ i ~ j e~2 e~2 A
it jt
N
T N
1 1 X ~4 1 X X 4
e~it
i
TN
NT
i=1
t=1 i=1
= oP (1) ;
PN PT
~4it
t=1 e
= OP (1), which holds under our moment asP
P
~ 2 ~2 for the wild
sumptions by an application of lemma C.1 with p = 4: Thus,
= T1 Tt=1 N1 N
it
i=1 i e
i=1
bootstrap. Condition F* is satis…ed because by Bai and Ng (2006),
Next, we verify Condition C*. For t = 1; : : : ; T
!P Q Q0 .
h, let "t+h = ^"t+h vt+h , where vt+h
i.i.d.(0; 1).
Part a) follows as Condition B*(b), using the independence between "t+h and eit : For part b), we have
49
that
T h
1 X ~ 0 et
p
p "t+h
T t=1 N
E
2
=
=
1
E
TN
T
Xh
N
X
t=1
~ie
it
!
2
"t+h
80 i=1
< TXh TXh
1
@
"t+h "s+h
E
:
TN
t=1 s=1
N
X
i=1
By the independence between feit g and f"s g, we have that
1
0
N
N
T
h
T
h
XX 0
1
1 BX X
C
~ ~j E e e
E "t+h "s+h
=
@
i
it js A =
TN
TN
{z
} i=1 j=1
|
| {z }
t=1 s=1
=0 if t6=s
=
T h
N
1 X 2 1 X ~
^"t+h
i
T
N
t=1
2
e~2it
=0 if i6=j or t6=s
T h
1 X 4
^"t+h
T
t=1
i=1
!1=2 0
@1
T
T
Xh
119
!0 N
=
X
~0 e
~ j e AA :
@
i it
js
;
j=1
T
Xh
t=1
N
1 X ~
i
N
t=1
E
2
"t+h
2
N
X
~0 ~iE
i
2
eit
i=1
e~2it
i=1
!2 11=2
A
= OP (1)
!
OP (1) ;
under our assumptions and using the same arguments used above to show condition B*(b). Part c)
follows because
st
that
T h T
1 XX ~
Fs "t+h
T
= 0 for t 6= s and by repeated application of Cauchy-Schwartz inequality, we have
st
t=1 s=1
1
T
T
Xh
t=1
F~t "t+h
t=1
2
!
T h
N
1 X ~
1 X 2
Ft "t+h
e~it
tt =
T
N
t=1
i=1
11=4
0
3
!2 1=2 B
#1=2
C "
N
T
T h
T X
N
X
41 X
C
B1 X
1 X 2 5
1
1
4
4
B
F~t
"t+h C
;
e~it
e~it
C
BT
N
T
TN
A
@
t=1
t=1
t=1 i=1
i=1
{z
}|
{z
}
|
{z
}
|
T
1X~
=
Ft "t+h
T
!1=2 2
41
T
T
Xh
t=1
OP (1)
OP (1)
=OP (1)
P
4 = O
under our assumptions, provided in particular that T1 Tt=1h "t+h
P (1) in probability. For this,
P
T h
1
4
it su¢ ces that T t=1 E "t+h = OP (1) : But by the properties of the wild bootstrap on "t+h , we
have that
T h
1 X
E
T
t=1
4
"t+h
T h
1 X 4
=
^"t+h
T
4
vt+h
E
| {z
t=1
}
C<1 by assumption
T h
1 X 4
^"t+h ;
T
t=1
which is veri…ed under our conditions. Finally. we verify Condition D*. E "t+h = 0 by construction.
1=2
PT h
P
2
1 PT h 4
Moreover, we have that T1 t=1
E "t+h = T1 Tt=1h ^"2t+h
"t+h
= OP (1) ; under our
t=1 ^
T
assumptions. So, part a) is veri…ed. For part b) of Condition D*, we need to verify that the bootstrap
CLT result holds for the wild bootstrap. By the Cramer-Wold device, it su¢ ces to show that
T h
1 X 0
p
` (
T t=1 |
)
1=2
{z
wt
z^t "t+h !d N (0; 1) ;
}
in probability for any ` such that `0 ` = 1. Note that wt is an heterogeneous array of independent
random variables (given that "t+h is conditionally independent but heteroskedastic). Thus, we apply
50
a CLT for heterogeneous independent arrays. Note that E (wt ) = 0 and
!
!
T h
T h
1 X
1 X
1=2
0
p
p
V ar
wt
= ` ( )
V ar
z^t "t+h ( ) 1=2 `
T t=1
T t=1
!
T
h
X
1
= `0 ( ) 1=2
z^t z^t0 ^"2t+h ( ) 1=2 ` = 1:
T
t=1
Thus, it su¢ ces to verify Lyapunov’s condition, i.e. for some r > 1;
that wt = `0 (
)
1=2
T h
1 X
E jwt j2r =
Tr
t=1
z^t "t+h ; we have that
1
Tr
T h
1 X 0
` (
1T
Tr
C
)
1
k`k
1
Tr
1
2r
(
(
)
0
z^t
1=2
2r
2r
)
1=2
2r
h
t=1 E
T h
1 X
k^
zt k2r j^"t+h j2r E jvt+r j2r
| {z }
T
T h
1 X
k^
zt k4r
T
t=1
=
PT
1
T
jwt j2r !P 0: Noting
j^"t+h j2r
t=1
by an application of Lemma C.1. Since
to
1=2
t=1
1
0
0
1
Tr
PT
M <1
!1=2
h
^t z^t0 ^"2t+h ,
t=1 z
T h
1 X
j^"t+h j4r
T
t=1
!1=2
= OP
under our assumptions
1
Tr
1
= oP (1) ;
converges
> 0, by Bai and Ng (2006). Thus, Condition E* is satis…ed. Condition F* was veri…ed in
the main text.
51
References
[1] Bai, J., 2003. “Inferential theory for factor models of large dimensions,” Econometrica, 71, 135172.
[2] Bai, J., 2009. “Panel data models with interactive …xed e¤ects,” Econometrica, 77, 1229-1279.
[3] Bai, J. and S. Ng, 2002. “Determining the number of factors in approximate factor models,”
Econometrica, 70, 191-221.
[4] Bai, J. and S. Ng, 2006. “Con…dence intervals for di¤usion index forecasts and inference with
factor-augmented regressions,” Econometrica, 74, 1133-1150.
[5] Bai, J. and S. Ng, 2011. “Principal Components Estimation and Identi…cation of the Factors",
manuscript, Columbia University.
[6] Boivin, J. and B. S. Bernanke, 2003, "Monetary policy in a data-rich environment", Journal of
Monetary Economics, 50, 525-546.
[7] Chamberlain, G. and M. Rothschild, 1983. “Arbitrage, factor structure and mean-variance analysis in large asset markets,” Econometrica, 51, 1305-1324.
[8] Connor, G. and R. A. Korajczyk, 1986. “Performance measurement with the arbitrage pricing
theory,” Journal of Financial Economics, 15, 373-394.
[9] Connor, G. and R. A. Korajczyk, 1993. “A test for the number of factors in an approximate factor
model,” Journal of Finance, XLVIII, 1263-1291.
[10] Eichengreen, B. A. Mody, M. Nedeljkovic, and L. Sarno, 2009. “How the Subprime Crisis Went
Global: Evidence from Bank Credit Default Swap Spreads", NBER working paper 14904.
[11] Gospodinov, N. and S. Ng, 2011. “Commodity prices, convenience yields and in‡ation,”
emph Review of Economics and Statistics, forthcoming.
[12] Ludvigson, S. and S. Ng, 2007. “The empirical risk return relation: a factor analysis approach,”
Journal of Financial Economics, 83, 171-222.
[13] Ludvigson, S. and S. Ng, 2009a. “Macro factors in bond risk premia,”Review of Financial Studies,
22, 5027-5067.
[14] Ludvigson, S. and S. Ng, 2009b. “A factor analysis of bond risk premia,” Handbook of Applied
Econometrics, forthcoming.
[15] Onatski, A., 2011. “Asymptotics of the principal components estimator of large factor models
with weakly in‡uential factors,” manuscript, University of Cambridge.
52
[16] Shintani, M. and Z-Y Guo, 2011. “Finite sample performance of principal components estimators
for dynamic factor models: asymptotic vs. bootstrap approximations,” manuscript, Vanderbilt
University.
[17] Stock, J. H. and M. Watson, 2002. “Forecasting using principal components from a large number
of predictors,” Journal of the American Statistical Association, 97, 1167-1179.
[18] Yamamoto, Y., 2011. “Bootstrap inference for impulse response functions in factor-augmented
vector autoregressions,” manuscript, University of Alberta.
53
Table 1: Bias and coverage rate of 95% CIs for delta - Homoskedastic cases
T = 50
N = 50
T = 100
T = 200
T = 50
N = 100
T = 100
T = 200
T = 50
N = 200
T = 100
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
T = 200
Bias
DGP 1
bias
estimate
WB
-0.01
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.01
0.00
0.00
alpha = 0
homo, homo
0.00
0.00
0.00
Coverage rate
OLS
BC
True factor
WB
94.0
91.0
93.8
95.4
93.1
95.5
95.0
91.9
94.3
94.5
93.1
94.3
95.7
94.0
95.4
94.2
93.3
93.7
95.7
95.1
95.0
94.4
93.8
94.6
94.1
93.7
94.1
96.9
96.2
95.8
96.0
97.0
95.1
96.7
94.7
94.3
-0.08
-0.05
-0.06
-0.09
-0.03
-0.07
-0.06
-0.03
-0.05
-0.05
-0.03
-0.04
Bias
DGP 2
bias
estimate
WB
-0.17
-0.09
-0.12
-0.14
-0.09
-0.11
-0.13
-0.10
-0.10
-0.11
-0.05
-0.09
alpha = 1
homo, homo
-0.09
-0.05
-0.07
Coverage rate
OLS
BC
True factor
WB
71.1
83.0
93.8
66.0
88.1
95.5
50.7
86.5
94.3
84.7
88.6
94.3
83.1
90.4
95.4
79.3
90.2
93.7
88.3
90.1
94.6
89.8
92.1
94.6
88.1
92.7
94.1
90.9
92.7
90.7
93.8
93.7
92.0
93.8
94.5
94.3
Table 2. Bias in estimation of alpha - More general cases
DGP 3
alpha = 1
hetero, homo
DGP 4
alpha = 1
hetero, hetero
DGP 5
alpha = 1
hetero, AR+hetero
DGP 6
alpha = 1
hetero, CS+homo
T = 50
N = 50
T = 100
T = 50
N = 100
T = 100
T = 200
bias
estimate
WB
-0.16
-0.09
-0.12
-0.14
-0.09
-0.11
bias
estimate
WB
-0.17
-0.10
-0.13
bias
estimate
WB
bias
estimate
WB
T = 50
N = 200
T = 100
T = 200
T = 200
-0.13
-0.10
-0.10
-0.11
-0.05
-0.09
-0.09
-0.05
-0.07
-0.07
-0.05
-0.06
-0.09
-0.03
-0.07
-0.06
-0.03
-0.05
-0.05
-0.03
-0.04
-0.15
-0.10
-0.12
-0.14
-0.10
-0.11
-0.12
-0.05
-0.10
-0.10
-0.06
-0.08
-0.08
-0.06
-0.07
-0.10
-0.03
-0.08
-0.06
-0.03
-0.06
-0.05
-0.03
-0.04
-0.19
-0.09
-0.13
-0.16
-0.10
-0.11
-0.15
-0.10
-0.11
-0.13
-0.05
-0.09
-0.10
-0.06
-0.08
-0.08
-0.06
-0.07
-0.10
-0.03
-0.08
-0.07
-0.03
-0.06
-0.05
-0.03
-0.04
-0.23
-0.11
-0.10
-0.21
-0.12
-0.09
-0.20
-0.12
-0.09
-0.15
-0.07
-0.08
-0.13
-0.07
-0.07
-0.11
-0.08
-0.06
-0.11
-0.04
-0.07
-0.08
-0.04
-0.05
-0.07
-0.04
-0.04
Table 3: Coverage rate of 95% CIs for delta - More general cases
DGP 3
alpha = 1
hetero, homo
DGP 4
alpha = 1
hetero, hetero
DGP 5
alpha = 1
hetero, AR + hetero
DGP 6
alpha = 1
hetero, CS + homo
T = 50
N = 50
T = 100
T = 200
T = 50
N = 100
T = 100
T = 200
T = 50
N = 200
T = 100
T = 200
63.5
81.5
93.3
57.3
84.3
93.6
45.6
84.0
93.5
76.5
83.1
90.2
78.4
87.6
93.1
76.2
88.0
92.9
80.6
83.5
88.9
85.0
89.6
94.3
86.0
89.6
92.8
WB
93.1
92.3
91.1
93.5
93.4
92.1
92.6
94.0
92.5
OLS
BC
58.8
78.5
93.3
51.8
81.1
93.6
38.7
82.1
93.5
73.0
82.5
90.2
75.3
87.1
93.1
73.1
86.6
92.9
78.7
83.2
88.9
84.5
89.3
94.3
84.6
89.3
92.8
WB
93.7
92.8
91.8
92.9
93.8
92.3
92.6
93.6
93.1
OLS
BC
54.7
74.6
93.3
49.0
79.7
93.6
36.7
81.1
93.5
72.4
80.8
90.2
72.9
86.3
93.1
72.6
87.8
92.9
76.9
81.1
88.9
83.7
88.5
94.3
83.4
88.1
92.8
WB
91.2
90.8
91.1
91.8
92.8
92.4
91.3
93.1
92.3
OLS
BC
42.2
68.3
93.3
30.6
67.2
93.6
13.7
61.4
93.5
65.8
77.5
90.2
65.2
81.7
93.1
54.7
81.1
92.9
76.1
80.0
88.9
79.5
87.9
94.3
78.3
87.4
92.8
81.5
75.1
59.4
87.5
84.5
80.3
88.6
90.2
87.3
OLS
BC
True factor
True factor
True factor
True factor
WB
Download