Bootstrapping factor-augmented regression models

Sílvia Gonçalves and Benoit Perron
Département de sciences économiques, CIREQ and CIRANO, Université de Montréal

November 9, 2011

Abstract

The main contribution of this paper is to propose and theoretically justify bootstrap methods for regressions where some of the regressors are factors estimated from a large panel of data. We derive our results under the assumption that $\sqrt{T}/N \to c$, where $0 \le c < \infty$ ($N$ and $T$ are the cross-sectional and the time series dimensions, respectively), thus allowing for the possibility that factors estimation error enters the limiting distribution of the OLS estimator. We consider general residual-based bootstrap methods and provide a set of high level conditions on the bootstrap residuals and on the idiosyncratic errors such that the bootstrap distribution of the OLS estimator is consistent. We subsequently verify these conditions for a simple wild bootstrap residual-based procedure. Our main results can be summarized as follows. When $c = 0$, as in Bai and Ng (2006), the crucial condition for bootstrap validity is the ability of the bootstrap regression scores to mimic the serial dependence of the original regression scores. Mimicking the cross sectional and/or serial dependence of the idiosyncratic errors in the panel factor model is asymptotically irrelevant in this case since the limiting distribution of the original OLS estimator does not depend on these dependencies. Instead, when $c > 0$, a two-step residual-based bootstrap is required to capture the factors estimation uncertainty, which shows up as an asymptotic bias term (as we show here and as was recently discussed by Ludvigson and Ng (2009b)). Because the bias depends on the cross sectional dependence of the idiosyncratic error term, bootstrap validity depends crucially on the ability of the bootstrap panel factor model to capture this cross sectional dependence.

Keywords: factor model, bootstrap, asymptotic bias.

Acknowledgments. We would like to thank participants at the Conference in honor of Halbert White (San Diego, May 2011), the Conference in honor of Ron Gallant (Toulouse, May 2011), the Conference in honor of Hashem Pesaran (Cambridge, July 2011), the Fifth CIREQ time series conference (Montreal, May 2011), the 17th International Panel Data conference (Montreal, July 2011), and the NBER/NSF Time Series conference at MSU (East Lansing, September 2011), as well as seminar participants at HEC Montreal, Syracuse University, Georgetown University, Tilburg University, Rochester University, Cornell University, University of British Columbia and University of Victoria. We also thank Valentina Corradi, Nikolay Gospodinov, Guido Kursteiner and Serena Ng for very useful comments and discussions. Gonçalves acknowledges financial support from the NSERC and MITACS, whereas Perron acknowledges financial support from the SSHRC and MITACS.

1 Introduction

Factor-augmented regressions where some of the regressors, called factors, are estimated from a large set of data are increasingly popular in empirical work. Inference in these models is complicated by the fact that the regressors are estimated and thus measured with error. Recently, Bai and Ng (2006) derived the asymptotic distribution of the OLS estimator in this case under a set of standard regularity conditions. In particular, they show that the asymptotic distribution of the OLS estimator is unaffected
by the estimation of the factors when $\sqrt{T}/N \to 0$, where $N$ and $T$ are the cross-sectional and the time series dimensions, respectively. While their simulation study does not consider inference on the coefficients themselves (they look at the conditional mean), they report noticeable size distortions in some situations.

The main contribution of this paper is to propose and theoretically justify bootstrap methods for inference in the context of the factor-augmented regression model. Shintani and Guo (2011) prove the validity of the bootstrap for inference on the persistence of factors, but not in the context of factor-augmented regressions. Recent empirical applications of the bootstrap in this context include Ludvigson and Ng (2007, 2009a,b) and Gospodinov and Ng (2011), where the bootstrap has been used in predictability tests based on factor-augmented regressions without theoretical justification. Our main contribution is to establish the first order asymptotic validity of the bootstrap for factor-augmented regression models under assumptions similar to those of Bai and Ng (2006), but without the condition that $\sqrt{T}/N \to 0$. Specifically, we assume that $\sqrt{T}/N \to c$, where $0 \le c < \infty$, thus allowing for the possibility that factor estimation error affects the limiting distribution of the OLS estimator. As it turns out, when $c > 0$, an asymptotic bias term appears in the distribution, reflecting the contribution of factors estimation uncertainty. This bias problem was recently discussed by Ludvigson and Ng (2009b), who proposed an analytical bias correction procedure. Instead, here we focus on the bootstrap and provide a set of conditions under which it can reproduce the whole limiting distribution of the OLS estimator, including the bias term.

The bootstrap method we propose is made up of two main steps. In a first step, we obtain a bootstrap panel data set from which we estimate the bootstrap factors by principal components. The bootstrap panel observations are generated by adding bootstrap idiosyncratic residuals to the estimated common components from the original panel. In a second step, we generate a bootstrap version of the response variable, again relying on a residual-based bootstrap, where the bootstrap observations of the dependent variable are obtained by summing the estimated regression mean and a bootstrap regression residual. To mimic the fact that in the original regression model the true factors are latent and need to be estimated, we regress the bootstrap response variable on the estimated bootstrap factors. This produces a bootstrap OLS estimator whose bootstrap distribution can be used to replicate the distribution of the OLS estimator.

We first provide a set of high level conditions on the bootstrap residuals and idiosyncratic errors that allow us to characterize the limiting distribution of the bootstrap OLS estimator under the assumption that $\sqrt{T}/N \to c$, where $0 \le c < \infty$. These high level conditions essentially require that the bootstrap idiosyncratic errors be weakly dependent across individuals and over time, and that the bootstrap regression scores satisfy a central limit theorem. We then verify these high level conditions for a residual-based wild bootstrap scheme, where the wild bootstrap is used to generate the bootstrap idiosyncratic error term in the first step, and also in the second step when generating the regression residuals. The two steps are performed independently of each other.
A crucial result in proving the first order asymptotic validity of the bootstrap in this context is the consistency of the bootstrap principal component estimator. Given our residual-based bootstrap, the "latent" factors underlying the bootstrap data generating process (DGP) are given by the estimated factors. Nevertheless, these are not identified by the bootstrap principal component estimator due to the well-known identification problem of factor models. By relying on results of Bai and Ng (2011) (see also Stock and Watson (2002)), we show that the bootstrap estimated factors identify the estimated factors up to a change of sign. Contrary to the rotation indeterminacy problem that affects the principal component estimator, this sign indetermination is easily resolved in the bootstrap world, where the bootstrap rotation matrix depends on bootstrap population values that are functions of the original data. As a consequence, to bootstrap the distribution of the OLS estimator, our proposal is to rotate the bootstrap OLS estimator using the feasible bootstrap rotation matrix. Asymptotically, this amounts to sign-adjusting the bootstrap OLS regression estimates.

Our results can be summarized as follows. When $c = 0$, as in Bai and Ng (2006), the crucial condition for bootstrap validity is the ability of the bootstrap regression scores to mimic the serial dependence and heteroskedasticity of the original regression scores. For instance, a wild bootstrap in the second step of our residual-based bootstrap is appropriate if we assume the regression errors to be a possibly heteroskedastic martingale difference sequence (as in Bai and Ng (2006)). Under a more general dependence assumption, the wild bootstrap is not appropriate and we should instead consider a block bootstrap. We do not pursue this possibility here, but note that our bootstrap high level conditions would be useful in establishing the validity of the block bootstrap in this context as well. Mimicking the cross sectional and/or serial dependence of the idiosyncratic errors in the panel factor model is asymptotically irrelevant when $c = 0$, since the limiting distribution of the original OLS estimator does not depend on these dependencies under this condition. Thus, a wild bootstrap in the first step of the residual-based bootstrap is asymptotically valid under the general setup of Bai and Ng (2006), which allows for weak time series and cross sectional dependence in the idiosyncratic error term. In fact, a simple one-step residual-based bootstrap method that does not take into account the factor estimation uncertainty in the bootstrap samples (i.e. a bootstrap method based only on the second step of our proposed method) is asymptotically valid when $c = 0$.¹ Instead, when $c > 0$, a two-step residual-based bootstrap is required to capture the asymptotic bias term that reflects the factors estimation error. Because the bias depends on the cross sectional dependence of the idiosyncratic error term, bootstrap validity depends crucially on the ability of the bootstrap panel factor model to capture this cross sectional dependence. Since the wild bootstrap generates bootstrap idiosyncratic errors that are heteroskedastic but independent along the time and cross sectional dimensions, this method is consistent only under cross sectional independence of the idiosyncratic errors.

The rest of the paper is organized as follows.
¹ We do not consider this simpler method here because factor estimation uncertainty has an impact in finite samples. For instance, Yamamoto (2011) compares bootstrap methods with and without factor estimation for the factor-augmented vector autoregression (FAVAR) model and concludes that the method without factor estimation is worse in terms of finite sample accuracy.

In Section 2, we first describe the setup and the assumptions, and then derive the asymptotic theory of the OLS estimator when $\sqrt{T}/N \to c$. In Section 3, we introduce the residual-based bootstrap method and characterize a set of conditions under which the consistency of the bootstrap distribution follows. Section 4 proposes a wild bootstrap implementation of the residual-based bootstrap and proves its consistency. Section 5 discusses the Monte Carlo results and Section 6 concludes. Three mathematical appendices are included. Appendix A contains the proofs of the results in Section 2, Appendix B the proofs of the results in Section 3, and Appendix C the proofs of the results in Section 4.

A word on notation. As usual in the bootstrap literature, we use $P^*$ to denote the bootstrap probability measure, conditional on a given sample. For any bootstrap statistic $T^*_{NT}$, we write $T^*_{NT} = o_{P^*}(1)$, in probability, or $T^*_{NT} \to^{P^*} 0$, in probability, when for any $\varepsilon > 0$, $P^*(|T^*_{NT}| > \varepsilon) = o_P(1)$. We write $T^*_{NT} = O_{P^*}(1)$, in probability, when for all $\varepsilon > 0$ there exists $M_\varepsilon < \infty$ such that $\lim_{N,T\to\infty} P[P^*(|T^*_{NT}| > M_\varepsilon) > \varepsilon] = 0$. Finally, we write $T^*_{NT} \to^{d^*} D$, in probability, if, conditional on a sample that lies in a set with probability converging to one, $T^*_{NT}$ weakly converges to the distribution $D$ under $P^*$, i.e. $E^*(f(T^*_{NT})) \to^P E(f(D))$ for all bounded and uniformly continuous functions $f$.

2 Asymptotic theory when $\sqrt{T}/N \to c$, where $0 \le c < \infty$

We consider the following regression model,

$$y_{t+h} = \alpha' F_t + \beta' W_t + \varepsilon_{t+h}, \quad t = 1, \ldots, T-h, \quad (1)$$

where $h \ge 0$. The $q$ observed regressors are contained in $W_t$. The $r$ unobserved regressors $F_t$ are the common factors in the following panel factor model,

$$X_{it} = \lambda_i' F_t + e_{it}, \quad i = 1, \ldots, N, \; t = 1, \ldots, T, \quad (2)$$

where the $r \times 1$ vector $\lambda_i$ contains the factor loadings and $e_{it}$ is an idiosyncratic error term. In matrix form, we can write (2) as

$$X = F\Lambda' + e,$$

where $X$ is a $T \times N$ matrix of stationary data, $F = (F_1, \ldots, F_T)'$ is $T \times r$, $r$ is the number of common factors, $\Lambda = (\lambda_1, \ldots, \lambda_N)'$ is $N \times r$, and $e$ is $T \times N$.

The factor-augmented regression model described in (1) and (2) has recently attracted a lot of attention in econometrics. One of the first papers to discuss this model in the forecasting context was Stock and Watson (2002). Recent empirical applications include Ludvigson and Ng (2007), who consider predictive regressions of excess stock returns and augment the usual set of predictors by including estimated factors from a large panel of macro and financial variables, Ludvigson and Ng (2009a,b), who consider this approach in the context of predictive regressions of bond excess returns, Gospodinov and Ng (2011), who study predictive regressions for inflation using principal components from a panel of commodity convenience yields, and Eichengreen, Mody, Nedeljkovic, and Sarno (2009), who use common factors extracted from credit default swap (CDS) spreads during the recent financial crisis to look at spillovers across banks.

Estimation proceeds in two steps. Given $X$, we estimate $F$ and $\Lambda$ with the method of principal components.
In particular, $F$ is estimated with the $T \times r$ matrix $\tilde F = (\tilde F_1, \ldots, \tilde F_T)'$ composed of $\sqrt{T}$ times the eigenvectors corresponding to the $r$ largest eigenvalues of $XX'/(TN)$ (arranged in decreasing order), where the normalization $\tilde F'\tilde F/T = I_r$ is used. The matrix containing the estimated loadings is then $\tilde\Lambda = (\tilde\lambda_1, \ldots, \tilde\lambda_N)' = X'\tilde F(\tilde F'\tilde F)^{-1} = X'\tilde F/T$. In the second step, we run an OLS regression of $y_{t+h}$ on $\hat z_t = (\tilde F_t', W_t')'$, i.e. we compute

$$\hat\delta = \left( \sum_{t=1}^{T-h} \hat z_t \hat z_t' \right)^{-1} \sum_{t=1}^{T-h} \hat z_t y_{t+h}, \quad (3)$$

where $\hat\delta$ is $p \times 1$, with $p = r + q$. As is well known in this literature, the principal components $\tilde F_t$ can only consistently estimate a transformation of the true factors $F_t$, given by $HF_t$, where $H$ is a rotation matrix defined as

$$H = \tilde V^{-1} \frac{\tilde F' F}{T} \frac{\Lambda'\Lambda}{N}, \quad (4)$$

where $\tilde V$ is the $r \times r$ diagonal matrix containing on the main diagonal the $r$ largest eigenvalues of $XX'/(NT)$, in decreasing order. One important implication is that $\hat\delta$ consistently estimates $((\alpha' H^{-1})', \beta')'$, and not $\delta = (\alpha', \beta')'$. In particular, given (1), adding and subtracting appropriately yields

$$y_{t+h} = \alpha' H^{-1} \tilde F_t + \beta' W_t + \alpha' H^{-1}(HF_t - \tilde F_t) + \varepsilon_{t+h},$$

or, equivalently,

$$y_{t+h} = \hat z_t' \Phi\delta + \alpha' H^{-1}(HF_t - \tilde F_t) + \varepsilon_{t+h}, \quad (5)$$

where $\Phi \equiv \mathrm{diag}((H^{-1})', I_q)$ and where the second term represents the contribution from estimating the factors.

Recently, Bai and Ng (2006) derived the asymptotic distribution of $\sqrt{T}(\hat\delta - \Phi\delta)$ under a set of regularity conditions and the assumption that $\sqrt{T}/N \to 0$. Our goal in this section is to derive the limiting distribution of $\hat\delta$ under the assumption that $\sqrt{T}/N \to c$, where $c$ is not necessarily zero. We use the following assumptions, which are similar to Bai's (2003) assumptions and slightly weaker than the Bai and Ng (2006) assumptions. We let $z_t = (F_t', W_t')'$, where $z_t$ is $p \times 1$, with $p = r + q$.

Assumption 1 - Factors and factor loadings

(a) $E\|F_t\|^4 \le M$ and $T^{-1}\sum_{t=1}^T F_tF_t' \to^P \Sigma_F > 0$, where $\Sigma_F$ is a non-random $r \times r$ matrix.

(b) The loadings $\lambda_i$ are either deterministic such that $\|\lambda_i\| \le M$, or stochastic such that $E\|\lambda_i\|^4 \le M$. In either case, $\Lambda'\Lambda/N \to^P \Sigma_\Lambda > 0$, where $\Sigma_\Lambda$ is a non-random matrix.

(c) The eigenvalues of the $r \times r$ matrix $(\Sigma_\Lambda\Sigma_F)$ are distinct.

Assumption 2 - Time and cross section dependence and heteroskedasticity

(a) $E(e_{it}) = 0$, $E|e_{it}|^8 \le M$.

(b) $E(e_{it}e_{js}) = \sigma_{ij,ts}$, with $|\sigma_{ij,ts}| \le \bar\sigma_{ij}$ for all $(t,s)$ and $|\sigma_{ij,ts}| \le \tau_{ts}$ for all $(i,j)$, such that $N^{-1}\sum_{i,j=1}^N \bar\sigma_{ij} \le M$, $T^{-1}\sum_{t,s=1}^T \tau_{ts} \le M$, and $(NT)^{-1}\sum_{t,s,i,j}|\sigma_{ij,ts}| \le M$.

(c) For every $(t,s)$, $E\left| N^{-1/2}\sum_{i=1}^N (e_{it}e_{is} - E(e_{it}e_{is})) \right|^4 \le M$.

Assumption 3 - Moments and weak dependence among $\{z_t\}$, $\{\lambda_i\}$ and $\{e_{it}\}$

(a) $E\left[ N^{-1}\sum_{i=1}^N \left\| T^{-1/2}\sum_{t=1}^T F_te_{it} \right\|^2 \right] \le M$, where $E(F_te_{it}) = 0$ for all $(i,t)$.

(b) For each $t$, $E\left\| (TN)^{-1/2}\sum_{s=1}^T\sum_{i=1}^N z_s(e_{it}e_{is} - E(e_{it}e_{is})) \right\|^2 \le M$, where $z_s = (F_s', W_s')'$.

(c) $E\left\| (TN)^{-1/2}\sum_{t=1}^T\sum_{i=1}^N z_t\lambda_i'e_{it} \right\|^2 \le M$, where $E(z_t\lambda_i'e_{it}) = 0$ for all $(i,t)$.

(d) $E\left[ T^{-1}\sum_{t=1}^T \left\| N^{-1/2}\sum_{i=1}^N \lambda_ie_{it} \right\|^2 \right] \le M$, where $E(\lambda_ie_{it}) = 0$ for all $(i,t)$.

(e) As $N, T \to \infty$, $(TN)^{-1}\sum_{t=1}^T\sum_{i=1}^N\sum_{j=1}^N \lambda_i\lambda_j'(e_{it}e_{jt} - E(e_{it}e_{jt})) \to^P 0$, where $\Gamma \equiv \lim_{N,T\to\infty} T^{-1}\sum_{t=1}^T \Gamma_t > 0$, with $\Gamma_t \equiv \mathrm{Var}\left( N^{-1/2}\sum_{i=1}^N \lambda_ie_{it} \right)$.

Assumption 4 - Weak dependence between idiosyncratic errors and regression errors

(a) For each $t$ and each $h \ge 0$, $E\left| (TN)^{-1/2}\sum_{s=1}^{T-h}\sum_{i=1}^N \varepsilon_{s+h}(e_{it}e_{is} - E(e_{it}e_{is})) \right|^2 \le M$.

(b) $E\left\| (TN)^{-1/2}\sum_{t=1}^{T-h}\sum_{i=1}^N \lambda_ie_{it}\varepsilon_{t+h} \right\|^2 \le M$, where $E(\lambda_ie_{it}\varepsilon_{t+h}) = 0$ for all $(i,t)$.

Assumption 5 - Moments and CLT for the score vector

(a) $E(\varepsilon_{t+h}) = 0$ and $E|\varepsilon_{t+h}|^2 < M$.

(b) $E\|z_t\|^4 \le M$ and $T^{-1}\sum_{t=1}^T z_tz_t' \to^P \Sigma_{zz} > 0$.

(c) As $T \to \infty$, $T^{-1/2}\sum_{t=1}^{T-h} z_t\varepsilon_{t+h} \to^d N(0, \Omega)$, where $E\left\| T^{-1/2}\sum_{t=1}^{T-h} z_t\varepsilon_{t+h} \right\|^2 < M$ and $\Omega \equiv \lim_{T\to\infty} \mathrm{Var}\left( T^{-1/2}\sum_{t=1}^{T-h} z_t\varepsilon_{t+h} \right) > 0$.

Assumption 1(a) imposes the assumption that the factors are non-degenerate.
Assumption 1(b) ensures that each factor contributes non-trivially to the variance of $X_t$, i.e. the factors are pervasive and affect all cross sectional units. Together, these assumptions ensure that there are $r$ identifiable factors in the model. Recently, Onatski (2011) has considered a class of "weak" factor models, where the factor loadings are modeled as local to zero. Under this assumption, the estimated factors are no longer consistent for the unobserved (rotated) factors. In this paper, we do not consider this possibility. Assumption 1(c) ensures that $Q \equiv \operatorname{p\,lim} \tilde F'F/T$ is unique. Without this assumption, $Q$ is only determined up to orthogonal transformations. See Bai (2003, proof of his Proposition 1).

Assumption 2 imposes weak cross-sectional and serial dependence conditions on the idiosyncratic error term $e_{it}$. In particular, we allow for the possibility that $e_{it}$ is dependent across individual units and over time, but we require that the degree of dependence decrease as the time and the cross sectional distance (regardless of how the latter is defined) between observations increases. This assumption is compatible with the approximate factor model of Chamberlain and Rothschild (1983) and Connor and Korajczyk (1986, 1993), in which cross section units are weakly correlated. Assumption 2 allows for heteroskedasticity in both dimensions and requires the idiosyncratic error term to have finite eighth moments.

Assumption 3 restricts the degree of dependence among the vector of regressors $\{z_t\}$, the factor loadings $\{\lambda_i\}$ and the idiosyncratic error terms $\{e_{it}\}$. If we assume that $\{z_t\}$, $\{\lambda_i\}$ and $\{e_{it}\}$ are mutually independent (as in Bai and Ng (2006)), then Assumptions 3(a), 3(c) with $z_t = F_t$, and 3(d) are implied by Assumptions 1 and 2. Similarly, Assumption 3(b) holds if we assume that $\{z_t\}$ and $\{e_{it}\}$ are independent and the following weak dependence condition on $\{e_{it}\}$ holds:

$$\frac{1}{NT^2}\sum_{t,s,l,u}\sum_{i,j} \left| \mathrm{Cov}(e_{it}e_{is},\, e_{jl}e_{ju}) \right| \le \Delta < \infty. \quad (6)$$

Bai (2009) relies on a similar condition (part 1 of his Assumption C.4) to establish the asymptotic properties of the interactive effects estimator. As he explains, this condition is slightly stronger than the assumption that the second moment of $N^{-1/2}\sum_{i=1}^N \phi_i$ is bounded, where $\phi_i \equiv \left( T^{-1/2}\sum_{t=1}^T e_{it} \right)^2 - E\left( T^{-1/2}\sum_{t=1}^T e_{it} \right)^2$ (note that $\mathrm{Var}(N^{-1/2}\sum_{i=1}^N \phi_i)$ equals the left hand side of the above inequality without the absolute values). It holds if $e_{it}$ is i.i.d. over $i$ and $t$ with $E(e_{it}^4) < M$. Assumptions 3(a)-3(c) are equivalent to Assumptions D, F1 and F2 of Bai (2003) when $z_t = F_t$.

To describe Assumption 3(e), for each $t$, let

$$\varsigma_t \equiv \frac{1}{\sqrt N}\sum_{i=1}^N \lambda_ie_{it}, \quad \text{and} \quad \Gamma_t \equiv \mathrm{Var}(\varsigma_t) = E(\varsigma_t\varsigma_t'),$$

since we assume that $E(\lambda_ie_{it}) = 0$ for all $(i,t)$. Assumption 3(e) requires that $T^{-1}\sum_{t=1}^T \left( \varsigma_t\varsigma_t' - E(\varsigma_t\varsigma_t') \right)$ converge in probability to zero. This follows under weak dependence conditions on $\{\lambda_ie_{it}\}$ over $(i,t)$. $\Gamma_t$ is related to the asymptotic variance of $\sqrt N(\tilde F_t - HF_t)$, as shown by Bai (2003, cf. Theorem 1).

Assumption 4 imposes weak dependence between the idiosyncratic errors and the regression errors. Part (a) holds if $\{e_{it}\}$ is independent of $\{\varepsilon_t\}$ and the weak dependence condition (6) holds. Similarly, part (b) holds if $\{\lambda_i\}$, $\{e_{is}\}$ and $\{\varepsilon_t\}$ are three mutually independent groups of random variables and Assumption 2 holds.

Assumption 5 imposes moment conditions on $\{\varepsilon_{t+h}\}$, on $\{z_t\}$ and on the score vector $\{z_t\varepsilon_{t+h}\}$. Part (b) requires $\{z_tz_t'\}$ to satisfy a law of large numbers.
Part (c) requires the score to satisfy a central limit theorem, where $\Omega$ denotes the limiting variance of the scaled average of the scores. The dependence structure of the scores $\{z_t\varepsilon_{t+h}\}$ dictates the form of the covariance matrix estimator to be used for inference on $\delta$. For instance, Bai and Ng (2006) impose the assumption that $\varepsilon_{t+h}$ is a martingale difference sequence with respect to $\{z_t, y_t, z_{t-1}, y_{t-1}, \ldots\}$ when $h > 0$. Under this assumption, $\Omega = \lim T^{-1}\sum_{t=1}^{T-h} E(z_tz_t'\varepsilon_{t+h}^2)$, and the appropriate covariance matrix estimator is a heteroskedasticity robust variance estimator. As we will see later, the form of $\Omega$ will also dictate the type of bootstrap we should use. In particular, under the martingale difference assumption on $\varepsilon_{t+h}$, a wild bootstrap method will be appropriate. Section 4 will consider this specific bootstrap scheme.

Given Assumptions 1-5, we can state our main result as follows. We introduce the following notation:

$$H_0 \equiv \operatorname{p\,lim} H = V^{-1}Q\Sigma_\Lambda, \quad Q \equiv \operatorname{p\,lim} \frac{\tilde F'F}{T}, \quad V \equiv \operatorname{p\,lim} \tilde V, \quad \Psi_0 \equiv \mathrm{diag}(H_0, I_q), \quad \Phi_0 \equiv \operatorname{p\,lim}\Phi = \mathrm{diag}((H_0')^{-1}, I_q),$$

$$\Sigma_{\hat z\hat z} \equiv \operatorname{p\,lim} \frac{1}{T}\sum_{t=1}^{T-h} \hat z_t\hat z_t' = \Psi_0\Sigma_{zz}\Psi_0', \quad \Sigma_{W\tilde F} \equiv \operatorname{p\,lim} \frac{W'\tilde F}{T} = \Sigma_{WF}H_0', \quad \Sigma_{WF} \equiv E(W_tF_t'), \quad \Sigma_{WW} \equiv \operatorname{p\,lim} \frac{W'W}{T}.$$

Theorem 2.1 Let Assumptions 1-5 hold. If $\sqrt T/N \to c$, with $0 \le c < \infty$, then

$$\sqrt T\left( \hat\delta - \Phi_0\delta \right) \to^d N(-c\Delta_\delta, \Sigma_\delta),$$

where $\Sigma_\delta = \Phi_0\Sigma_{zz}^{-1}\Omega\Sigma_{zz}^{-1}\Phi_0'$ and $\Delta_\delta = (\Delta_\alpha', \Delta_\beta')'$, with

$$\Delta_\alpha = \left( \Sigma_{\tilde F\tilde F} - \Sigma_{\tilde FW}\Sigma_{WW}^{-1}\Sigma_{W\tilde F} \right)^{-1} V^{-1}Q\Gamma Q'V^{-1}(H_0')^{-1}\alpha, \quad \Delta_\beta = -\Sigma_{WW}^{-1}\Sigma_{W\tilde F}\Delta_\alpha,$$

where $\Sigma_{\tilde F\tilde F} \equiv \operatorname{p\,lim}\tilde F'\tilde F/T = I_r$ and $\Sigma_{\tilde FW} = \Sigma_{W\tilde F}'$. If $\Sigma_{WF} = 0$, the asymptotic bias of $\hat\beta$ is zero and that of $\hat\alpha$ equals $-c\Delta_\alpha = -cV^{-1}Q\Gamma Q'V^{-1}(H_0')^{-1}\alpha$.

Theorem 2.1 gives the asymptotic distribution of $\sqrt T(\hat\delta - \Phi_0\delta)$ under the condition that $\sqrt T/N \to c$, $0 \le c < \infty$. When $c = 0$, we obtain the same limiting distribution as in Bai and Ng (2006), under a set of assumptions that is weaker than theirs, as discussed above. As in Bai and Ng (2006), factors estimation error does not contribute to the asymptotic distribution when $c = 0$. This is no longer the case when $c > 0$. Under this alternative condition, an asymptotic bias appears, as was recently discussed by Ludvigson and Ng (2009b) in the context of a simpler regression model without observed regressors $W_t$. Our Theorem 2.1 complements their results by providing an expression of the bias for $\hat\delta$ when the factor-augmented regression model also includes observed regressors in addition to the unobserved factors $F_t$.

Several remarks follow. First, the expression for $\Delta_\delta$ is proportional to $(H_0')^{-1}\alpha = \operatorname{p\,lim}\hat\alpha$, implying that when $\alpha = 0$, no asymptotic bias exists, independently of the value of $c$. Second, the asymptotic bias for both $\hat\alpha$ and $\hat\beta$ is a function of

$$V^{-1}Q\Gamma Q'V^{-1} = \lim_{N,T\to\infty} \frac{1}{T}\sum_{t=1}^T V^{-1}Q\Gamma_tQ'V^{-1},$$

where $V^{-1}Q\Gamma_tQ'V^{-1}$ is the asymptotic variance of $\sqrt N(\tilde F_t - HF_t)$ (see Bai (2003)). Thus, the bias depends on the sampling variance-covariance matrix of the estimation error incurred by the principal components estimator $\tilde F_t$, averaged over time. This variance matrix depends on the cross sectional dependence of $\{e_{it}\}$ via $\Gamma_t = \mathrm{Var}\left( N^{-1/2}\sum_{i=1}^N \lambda_ie_{it} \right)$. As we will see next, the main implication for the validity of the bootstrap is that it needs to reproduce this cross sectional dependence when $c \ne 0$, but not otherwise. Third, the existence of measurement error in $\tilde F_t$ contaminates the estimators of the remaining parameters $\beta$, i.e. $\hat\beta$ is asymptotically biased due to the measurement error in $\tilde F_t$. The asymptotic bias associated with $\hat\beta$ will be zero only in the special case where the observed regressors and the factors are uncorrelated (i.e. $\Sigma_{WF} = 0$), or when $\alpha = 0$.
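Before turning to the bootstrap, it may help to see the two-step estimator of this section in code. The following NumPy sketch is our own minimal rendering of (3)-(4), not the authors' code; the function names are hypothetical. It computes $\tilde F$, $\tilde\Lambda$, $\tilde V$ and $\hat\delta$:

```python
import numpy as np

def estimate_factors(X, r):
    """Principal components: F_tilde is sqrt(T) times the eigenvectors of
    XX'/(TN) for the r largest eigenvalues (decreasing order), so that
    F_tilde'F_tilde/T = I_r; loadings are Lambda_tilde = X'F_tilde/T."""
    T, N = X.shape
    eigval, eigvec = np.linalg.eigh(X @ X.T / (T * N))  # ascending order
    F_tilde = np.sqrt(T) * eigvec[:, -r:][:, ::-1]      # T x r
    Lam_tilde = X.T @ F_tilde / T                       # N x r
    V_tilde = np.diag(eigval[-r:][::-1])                # r largest eigenvalues
    return F_tilde, Lam_tilde, V_tilde

def factor_augmented_ols(y, F_tilde, W, h):
    """Second step: OLS of y_{t+h} on z_hat_t = (F_tilde_t', W_t')', eq. (3)."""
    Z = np.hstack([F_tilde, W])[:len(y) - h]            # t = 1, ..., T - h
    delta_hat, *_ = np.linalg.lstsq(Z, y[h:], rcond=None)
    resid = y[h:] - Z @ delta_hat                       # epsilon_hat_{t+h}
    return delta_hat, resid
```

The bootstrap sketches later in the paper reuse these two functions.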
3 A general residual-based bootstrap

The main contribution of this section is to propose a general residual-based bootstrap method and discuss its consistency for factor-augmented regression models under a set of sufficient high-level conditions on the bootstrap residuals. These high level conditions can be verified for any bootstrap scheme that resamples residuals. We verify these conditions for a two-step wild bootstrap scheme in Section 4.

3.1 Bootstrap data generating process and estimation

Let $e_t^* = (e_{1t}^*, \ldots, e_{Nt}^*)'$ denote a bootstrap sample from $\{\tilde e_t = X_t - \tilde\Lambda\tilde F_t\}$ and $\varepsilon_{t+h}^*$ a bootstrap sample from $\{\hat\varepsilon_{t+h} = y_{t+h} - \hat\alpha'\tilde F_t - \hat\beta'W_t\}$. We consider the following bootstrap DGP:

$$y_{t+h}^* = \hat\alpha'\tilde F_t + \hat\beta'W_t + \varepsilon_{t+h}^*, \quad t = 1, \ldots, T-h, \quad (7)$$

$$X_t^* = \tilde\Lambda\tilde F_t + e_t^*, \quad t = 1, \ldots, T. \quad (8)$$

Estimation proceeds in two stages. First, we estimate the factors by the method of principal components using the bootstrap panel data set $\{X_t^*\}$. Second, we run a regression of $y_{t+h}^*$ on the bootstrap estimated factors and on the fixed observed regressors $W_t$. More specifically, given $\{X_t^*\}$, we estimate the bootstrap factor loadings and the bootstrap factors by minimizing the bootstrap objective function

$$V^*(F, \Lambda) = \frac{1}{TN}\sum_{t=1}^T\sum_{i=1}^N \left( X_{it}^* - \lambda_i'F_t \right)^2$$

subject to the normalization constraint that $F'F/T = I_r$. The $T \times r$ matrix containing the estimated bootstrap factors is denoted by $\tilde F^* = (\tilde F_1^*, \ldots, \tilde F_T^*)'$, and it is equal to the $r$ eigenvectors of $X^*X^{*\prime}/(NT)$ (multiplied by $\sqrt T$) corresponding to the $r$ largest eigenvalues. The $N \times r$ matrix of estimated bootstrap loadings is given by $\tilde\Lambda^* = (\tilde\lambda_1^*, \ldots, \tilde\lambda_N^*)' = X^{*\prime}\tilde F^*/T$.

Given the estimated bootstrap factors $\tilde F_t^*$, the second estimation stage is to regress $y_{t+h}^*$ on $\hat z_t^* = (\tilde F_t^{*\prime}, W_t')'$. Because the bootstrap scheme used to generate $y_{t+h}^*$ is residual-based, we fix the observed regressors $W_t$ in the bootstrap regression. We replace $\tilde F_t$ with $\tilde F_t^*$ to mimic the fact that in the original regression model the factors $F_t$ are latent and need to be estimated with $\tilde F_t$. This yields the bootstrap OLS estimator

$$\hat\delta^* = \left( \frac{1}{T}\sum_{t=1}^{T-h}\hat z_t^*\hat z_t^{*\prime} \right)^{-1} \frac{1}{T}\sum_{t=1}^{T-h}\hat z_t^*y_{t+h}^*. \quad (9)$$

$\hat\delta^*$ is the bootstrap analogue of $\hat\delta$, the OLS estimator based on the original sample.

3.2 Bootstrap high level conditions

In this section, we provide a set of high level conditions on $\{e_{it}^*\}$ and $\{\varepsilon_{t+h}^*\}$ that will allow us to characterize the bootstrap distribution of $\hat\delta^*$.

Condition A* (weak time series and cross section dependence in $e_{it}^*$)

(a) $E^*(e_{it}^*) = 0$ for all $(i,t)$.

(b) $T^{-1}\sum_{t=1}^T\sum_{s=1}^T |\gamma_{st}^*|^2 = O_P(1)$, where $\gamma_{st}^* \equiv E^*\left( N^{-1}\sum_{i=1}^N e_{it}^*e_{is}^* \right)$.

(c) $T^{-2}\sum_{t=1}^T\sum_{s=1}^T E^*\left| N^{-1/2}\sum_{i=1}^N \left( e_{it}^*e_{is}^* - E^*(e_{it}^*e_{is}^*) \right) \right|^2 = O_P(1)$.

Condition B* (weak dependence among $\hat z_t$, $\tilde\lambda_i$, and $e_{it}^*$)

(a) $T^{-1}\sum_{t=1}^T\sum_{s=1}^T \tilde F_s\tilde F_t'\gamma_{st}^* = O_P(1)$.

(b) $T^{-1}\sum_{t=1}^T E^*\left\| (TN)^{-1/2}\sum_{s=1}^T\sum_{i=1}^N \hat z_s\left( e_{it}^*e_{is}^* - E^*(e_{it}^*e_{is}^*) \right) \right\|^2 = O_P(1)$, where $\hat z_s = (\tilde F_s', W_s')'$.

(c) $E^*\left\| (TN)^{-1/2}\sum_{t=1}^T\sum_{i=1}^N \hat z_t\tilde\lambda_i'e_{it}^* \right\|^2 = O_P(1)$.

(d) $T^{-1}\sum_{t=1}^T E^*\left\| N^{-1/2}\tilde\Lambda'e_t^* \right\|^2 = O_P(1)$.

(e) $\Gamma^* \equiv T^{-1}\sum_{t=1}^T \mathrm{Var}^*\left( N^{-1/2}\tilde\Lambda'e_t^* \right) > 0$.

Condition C* (weak dependence between $e_{it}^*$ and $\varepsilon_{t+h}^*$)

(a) $T^{-1}\sum_{t=1}^T E^*\left| (TN)^{-1/2}\sum_{s=1}^{T-h}\sum_{i=1}^N \varepsilon_{s+h}^*\left( e_{it}^*e_{is}^* - E^*(e_{it}^*e_{is}^*) \right) \right|^2 = O_P(1)$.

(b) $E^*\left\| (TN)^{-1/2}\sum_{t=1}^{T-h}\sum_{i=1}^N \tilde\lambda_ie_{it}^*\varepsilon_{t+h}^* \right\|^2 = O_P(1)$, where $E^*(e_{it}^*\varepsilon_{t+h}^*) = 0$ for all $(i,t)$.

(c) $T^{-1}\sum_{t=1}^{T-h}\sum_{s=1}^T \tilde F_s\gamma_{st}^*\varepsilon_{t+h}^* = O_{P^*}(1)$, in probability.
Condition D* (bootstrap CLT)

(a) $E^*(\varepsilon_{t+h}^*) = 0$ and $T^{-1}\sum_{t=1}^{T-h} E^*|\varepsilon_{t+h}^*|^2 = O_P(1)$.

(b) $\Omega^{*-1/2}\, T^{-1/2}\sum_{t=1}^{T-h} \hat z_t\varepsilon_{t+h}^* \to^{d^*} N(0, I_p)$, in probability, where $E^*\left\| T^{-1/2}\sum_{t=1}^{T-h}\hat z_t\varepsilon_{t+h}^* \right\|^2 = O_P(1)$ and $\Omega^* \equiv \mathrm{Var}^*\left( T^{-1/2}\sum_{t=1}^{T-h}\hat z_t\varepsilon_{t+h}^* \right) > 0$.

Conditions A*-D* are the bootstrap analogues of Assumptions 1 through 5. However, contrary to Assumptions 1-5, which pertain to the data generating process and cannot be verified in practice, Conditions A*-D* can be verified for any particular bootstrap algorithm used to generate the bootstrap residuals and idiosyncratic error terms. More importantly, we can devise bootstrap schemes that verify these conditions and hence ensure bootstrap validity. For instance, part (a) of Condition A* requires the bootstrap mean of $e_{it}^*$ to be zero for all $(i,t)$, whereas part (a) of Condition D* requires the same of $\varepsilon_t^*$. The practical implication is that we should make sure to construct bootstrap residuals with mean zero, e.g. to recenter residuals before applying a nonparametric bootstrap method. Parts (b) and (c) of Condition A* impose weak dependence conditions on $\{e_{it}^*\}$ over $(i,t)$. For instance, these conditions are satisfied if we resample $\{\tilde e_{it}\}$ in an i.i.d. fashion over the two indices $(i,t)$. Condition B* imposes further restrictions on the dependence among $\hat z_t$, $\tilde\lambda_i$ and the idiosyncratic errors $e_{it}^*$. Since $\hat z_t$ and $\tilde\lambda_i$ are fixed in the bootstrap world, Condition B* is implied by appropriately restricting the dependence of $e_{it}^*$ over $(i,t)$. Similarly, Condition C* restricts the amount of dependence between $\{e_{is}^*\}$ and $\{\varepsilon_{t+h}^*\}$. If these two sets of bootstrap innovations are independent of one another, then weak dependence of $\{e_{is}^*\}$ suffices for Condition C* to hold. Finally, Condition D* requires the bootstrap regression scores $\hat z_t\varepsilon_{t+h}^*$ to obey a central limit theorem in the bootstrap world.

3.3 Bootstrap results

Under Conditions A*-D*, we can show the consistency of the bootstrap principal component estimator $\tilde F_t^*$ for a rotated version of the true "latent" bootstrap factors $\tilde F_t$, a crucial result in proving the first order asymptotic validity of the bootstrap in this context. More specifically, according to (7)-(8), the common factors underlying the bootstrap panel data $\{X_t^*\}$ are given by $\tilde F_t$ (with $\tilde\Lambda$ as factor loadings). Nevertheless, just as $\tilde F_t$ estimates a rotation of $F_t$, the estimated bootstrap factors $\tilde F_t^*$ estimate $H^*\tilde F_t$, where $H^*$ is the bootstrap analogue of the rotation matrix $H$ defined in (4), i.e.

$$H^* = \tilde V^{*-1} \frac{\tilde F^{*\prime}\tilde F}{T} \frac{\tilde\Lambda'\tilde\Lambda}{N}, \quad (10)$$

where $\tilde V^*$ is the $r \times r$ diagonal matrix containing on the main diagonal the $r$ largest eigenvalues of $X^*X^{*\prime}/(NT)$, in decreasing order.

Lemma 3.1 Assume Assumptions 1-5 hold and suppose we generate bootstrap data $\{y_{t+h}^*, X_t^*\}$ according to the residual-based bootstrap DGP (7) and (8), relying on bootstrap residuals $\{\varepsilon_{t+h}^*\}$ and $\{e_t^*\}$ such that Conditions A*-D* are satisfied. Then, as $N, T \to \infty$,

$$\frac{1}{T}\sum_{t=1}^T \left\| \tilde F_t^* - H^*\tilde F_t \right\|^2 = O_{P^*}\left( \delta_{NT}^{-2} \right), \text{ in probability, where } \delta_{NT} = \min\left( \sqrt N, \sqrt T \right).$$

According to Lemma 3.1, the time average of the squared deviations between the estimated bootstrap factors $\tilde F_t^*$ and a rotation of the "latent" bootstrap factors given by $H^*\tilde F_t$ vanishes in probability under the bootstrap measure $P^*$ as $N, T \to \infty$, conditional on a sample that lies in a set with probability converging to one. Contrary to $H$, $H^*$ does not depend on population values and can be computed for any bootstrap sample, given the original sample. Hence, rotation indeterminacy is not a problem in the bootstrap world.
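Because every quantity in (10) is an observable function of the original and bootstrap samples, the feasible rotation $H^*$ can be evaluated directly. A minimal sketch (our own illustration, assuming the arrays produced by the estimate_factors function given after Section 2):

```python
def bootstrap_rotation(F_star, V_star, F_tilde, Lam_tilde):
    """Feasible bootstrap rotation of eq. (10):
    H* = V*^{-1} (F*'F_tilde / T) (Lam_tilde'Lam_tilde / N)."""
    T, N = F_tilde.shape[0], Lam_tilde.shape[0]
    H_star = np.linalg.solve(V_star, (F_star.T @ F_tilde / T)
                             @ (Lam_tilde.T @ Lam_tilde / N))
    # Asymptotically H* is diagonal with +-1 entries, the signs being those
    # of the diagonal of F*'F_tilde/T (see the discussion that follows).
    return H_star
```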
Because the bootstrap factor DGP (8) satisfies the constraints that $\tilde F'\tilde F/T = I_r$ and that $\tilde\Lambda'\tilde\Lambda$ is a diagonal matrix, we can actually show that $H^*$ is asymptotically (as $N, T \to \infty$) equal to $H_0^* = \mathrm{diag}(\pm 1)$, a diagonal matrix with diagonal elements equal to $\pm 1$, where the sign of each element is determined by the sign of the corresponding diagonal element of $\tilde F^{*\prime}\tilde F/T$ (see Lemma B.1; the proof follows by arguments similar to those used in Bai and Ng (2011) and Stock and Watson (2002)). Thus, the bootstrap factors are identified up to a change of sign.

The main implication of Lemma 3.1 is that the bootstrap OLS estimator $\hat\delta^*$ that one obtains from regressing $y_{t+h}^*$ on $\hat z_t^*$ estimates a rotated version of $\hat\delta$, given by $\Phi^*\hat\delta = ((\hat\alpha'H^{*-1})', \hat\beta')'$, where $\Phi^* \equiv \mathrm{diag}((H^{*-1})', I_q)$. Asymptotically, $\Phi^*$ is equal to $\Phi_0^* \equiv \mathrm{diag}(H_0^*, I_q)$, so that $\Phi^*\hat\delta$ can be interpreted as a sign-adjusted version of $\hat\delta$.

Our next result characterizes the asymptotic bootstrap distribution of $\hat\delta^*$ when $\sqrt T/N \to c$, with $0 \le c < \infty$. We add the two following conditions.

Condition E*. $\operatorname{p\,lim} \Omega^* = \Psi_0\Omega\Psi_0'$.

Condition F*. $\operatorname{p\,lim} \Gamma^* = Q\Gamma Q'$.

$\Omega^*$ is the bootstrap variance of the scaled average of the bootstrap regression scores $\hat z_t\varepsilon_{t+h}^*$ (as defined in Condition D*(b)). Since $\tilde F_t$ estimates a rotated version of the latent factors given by $H_0F_t$, $\hat z_t$ estimates a rotated version of $z_t$ given by $\Psi_0z_t$, and therefore $\Omega^*$ is the sample analogue of $\Psi_0\Omega\Psi_0'$ provided we choose $\varepsilon_{t+h}^*$ to mimic the time series dependence of $\varepsilon_{t+h}$. Condition E* imposes this formally. Similarly, by Condition B*(e), $\Gamma^* = T^{-1}\sum_{t=1}^T \mathrm{Var}^*\left( N^{-1/2}\sum_{i=1}^N \tilde\lambda_ie_{it}^* \right)$. Because $\tilde\lambda_i$ estimates $Q\lambda_i$, $\Gamma^*$ is the sample analogue of $Q\Gamma Q'$ if we choose $e_{it}^*$ to mimic the cross sectional dependence of $e_{it}$ (interestingly enough, mimicking the time series dependence of $e_{it}$ is not relevant). Condition F* formalizes this requirement.

Theorem 3.1 Let Assumptions 1-5 hold and consider any residual-based bootstrap scheme for which Conditions A*-D* are verified. Suppose $\sqrt T/N \to c$, where $0 \le c < \infty$. If, in addition, the two following conditions hold: (1) Condition E* is verified, and (2) $c = 0$ or Condition F* is verified, then as $N, T \to \infty$,

$$\sqrt T\left( \hat\delta^* - \Phi^*\hat\delta \right) \to^{d^*} N\left( -c\,\Phi_0^*\Delta_\delta,\; \Phi_0^*\Sigma_\delta\Phi_0^{*\prime} \right), \text{ in probability},$$

where $\Phi_0^* = \mathrm{diag}(H_0^*, I_q)$ is a diagonal matrix with $\pm 1$ in the main diagonal, and $\Delta_\delta$ and $\Sigma_\delta$ are as defined in Theorem 2.1.

According to Theorem 3.1, $\sqrt T(\hat\delta^* - \Phi^*\hat\delta)$ is asymptotically distributed as a normal random vector with mean equal to $-c\,\Phi_0^*\Delta_\delta$. Just as the asymptotic bias of $\sqrt T(\hat\delta - \Phi_0\delta)$ is proportional to $(H_0')^{-1}\alpha = \operatorname{p\,lim}\hat\alpha$, the bootstrap asymptotic bias is proportional to $(H_0^{*\prime})^{-1}\hat\alpha = H_0^*\hat\alpha$. Since $\hat\alpha$ converges in probability to $(H_0')^{-1}\alpha$, the bootstrap bias of $\hat\alpha^*$ converges to $-cH_0^*\Delta_\alpha$, provided we ensure that Condition F* is satisfied. The bootstrap bias of $\hat\beta^*$ depends both on the bootstrap analogue of $\Sigma_{W\tilde F} = \operatorname{p\,lim} W'\tilde F/T = \Sigma_{WF}H_0'$ and on $(H_0^{*\prime})^{-1}\hat\alpha$. Since the bootstrap analogue of $\Sigma_{W\tilde F}$ is $\operatorname{p\,lim} W'\tilde F^*/T$, which converges to $\Sigma_{W\tilde F}H_0^{*\prime}$, the rotation $H_0^*$ "cancels out", leaving the bootstrap bias of $\hat\beta^*$ unaffected by the sign problem that arises for $\hat\alpha^*$ (Lemma B.4 formalizes this argument). Similarly, the asymptotic variance-covariance matrix of $\Phi^*\hat\delta^*$ is equal to $\Sigma_\delta = \Phi_0\Sigma_{zz}^{-1}\Omega\Sigma_{zz}^{-1}\Phi_0'$, provided we choose $\varepsilon_{t+h}^*$ so as to verify Condition E*.

For bootstrap consistency, we need the bootstrap bias and variance to match the bias and variance of the limiting distribution of the original OLS estimator. Since $H_0^*$ (hence $\Phi_0^*$) is not necessarily equal to the identity matrix, Theorem 3.1 shows that this is not the case. Hence, the bootstrap distribution of $\sqrt T(\hat\delta^* - \hat\delta)$ is not a consistent estimator of the sampling distribution of $\sqrt T(\hat\delta - \Phi_0\delta)$ in general.
This is true even if we choose $\varepsilon_{t+h}^*$ and $e_{it}^*$ so that Conditions E* and F* are satisfied. The reason is that the bootstrap factors are not identified. In particular, because the bootstrap principal components estimator does not necessarily identify the sign of the bootstrap factors, the mean of each element of $\sqrt T(\hat\delta^* - \hat\delta)$ corresponding to the coefficients associated with the latent factors may have the "wrong" sign even asymptotically. The same "sign" problem will affect the off-diagonal elements of the bootstrap covariance matrix asymptotically (although not the main diagonal elements). As we mentioned above, the coefficients associated with $W_t$ are correctly identified in the bootstrap world as well as in the original sample, and therefore this sign problem does not affect these coefficients.

In order to obtain a consistent estimator of the distribution of $\sqrt T(\hat\delta - \Phi_0\delta)$, our proposal is to consider the bootstrap distribution of the rotated version of $\sqrt T(\hat\delta^* - \Phi^*\hat\delta)$ given by $\sqrt T(\Phi^*\hat\delta^* - \hat\delta)$. This rotation is feasible because $\Phi^*$ does not depend on any population quantities and can be computed for any bootstrap and original samples. Since $\Phi^*$ is asymptotically equal to $\Phi_0^* = \mathrm{diag}(\pm 1, I_q) = \mathrm{diag}\left( \mathrm{sign}(\tilde F^{*\prime}\tilde F/T), I_q \right)$, $\Phi^*\hat\delta^*$ is asymptotically equal to a sign-adjusted version of $\hat\delta^*$. The following result is an immediate corollary of Theorems 2.1 and 3.1.

Corollary 3.1 Under the conditions of Theorem 3.1, if $\sqrt T/N \to c$, where $0 \le c < \infty$, then as $N, T \to \infty$,

$$\sup_{x\in\mathbb{R}^p} \left| P^*\left( \sqrt T\left( \Phi^*\hat\delta^* - \hat\delta \right) \le x \right) - P\left( \sqrt T\left( \hat\delta - \Phi_0\delta \right) \le x \right) \right| \to^P 0.$$

Corollary 3.1 justifies the use of a residual-based bootstrap method for constructing bootstrap percentile confidence intervals for the elements of $\delta$. When $c = 0$, the crucial condition for bootstrap validity is Condition E*, which requires $\{\varepsilon_t^*\}$ to be chosen so as to mimic the dependence structure of the scores $z_t\varepsilon_{t+h}$. This condition ensures that the bootstrap variance-covariance matrix of $\Phi^*\hat\delta^*$ is correct asymptotically. When $c \ne 0$, Condition F* is also crucial to ensure that the bootstrap distribution correctly captures the bias. When both Conditions E* and F* are satisfied, the bootstrap contains a built-in bias correction term that is absent in the Bai and Ng (2006) asymptotic normal distribution, and we might expect it to outperform the normal approximation. A bootstrap method that does not involve factor estimation in the bootstrap world will not contain this bias correction and will not be valid in this context.

4 Wild bootstrap

In this section we propose a particular bootstrap method for generating $\{\varepsilon_{t+h}^*\}$ and $\{e_{it}^*\}$ and show its first-order asymptotic validity under a set of primitive conditions.

Bootstrap algorithm

1. For $t = 1, \ldots, T$, let

$$X_t^* = \tilde\Lambda\tilde F_t + e_t^*,$$

where $\{e_t^* = (e_{1t}^*, \ldots, e_{Nt}^*)'\}$ is a resampled version of $\{\tilde e_{it} = X_{it} - \tilde\lambda_i'\tilde F_t\}$ obtained with the wild bootstrap,

$$e_{it}^* = \tilde e_{it}\eta_{it},$$

where the external random variables $\eta_{it}$ are i.i.d. across $(i,t)$ and have mean zero and variance one.

2. Estimate the bootstrap factors $\tilde F^*$ and the bootstrap loadings $\tilde\Lambda^*$ using $X^*$.

3. For $t = 1, \ldots, T-h$, let

$$y_{t+h}^* = \hat\alpha'\tilde F_t + \hat\beta'W_t + \varepsilon_{t+h}^*,$$

where the error term $\varepsilon_{t+h}^*$ is a wild bootstrap resampled version of $\hat\varepsilon_{t+h}$, i.e. $\varepsilon_{t+h}^* = \hat\varepsilon_{t+h}v_{t+h}$, where the external random variable $v_{t+h}$ is i.i.d. $(0,1)$ and is independent of $\{\eta_{it}\}$.

4. Regress $y_{t+h}^*$ generated in step 3 on the estimated bootstrap factors and the fixed regressors, $\hat z_t^* = (\tilde F_t^{*\prime}, W_t')'$. This yields the bootstrap OLS estimator

$$\hat\delta^* = \left( \sum_{t=1}^{T-h}\hat z_t^*\hat z_t^{*\prime} \right)^{-1} \sum_{t=1}^{T-h}\hat z_t^*y_{t+h}^*.$$
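A compact implementation of steps 1-4 might look as follows. This is a sketch under our own naming conventions, reusing estimate_factors and factor_augmented_ols from the earlier sketch; it is not the authors' code, and it applies the asymptotic sign version of $\Phi^*$ (Corollary 3.1) rather than the exact $(H^{*-1})'$ rotation:

```python
def wild_bootstrap(y, X, W, r, h, B=399, rng=None):
    """Two-step residual-based wild bootstrap (steps 1-4), returning the
    B sign-adjusted draws Phi* delta* used to build bootstrap intervals."""
    rng = np.random.default_rng() if rng is None else rng
    T, N = X.shape
    F_t, Lam_t, _ = estimate_factors(X, r)
    delta_hat, eps_hat = factor_augmented_ols(y, F_t, W, h)
    common = F_t @ Lam_t.T                     # estimated common component
    e_tilde = X - common                       # idiosyncratic residuals
    Z = np.hstack([F_t, W])[:T - h]
    draws = np.empty((B, r + W.shape[1]))
    for b in range(B):
        # Step 1: X*_t = Lam_tilde F_tilde_t + e*_t, e*_it = e_tilde_it eta_it
        X_star = common + e_tilde * rng.standard_normal((T, N))
        # Step 2: bootstrap factors from the bootstrap panel
        F_star, _, _ = estimate_factors(X_star, r)
        # Step 3: y*_{t+h} = a_hat'F_tilde_t + b_hat'W_t + eps_hat_{t+h} v_{t+h}
        y_star = np.zeros(T)
        y_star[h:] = Z @ delta_hat + eps_hat * rng.standard_normal(T - h)
        # Step 4: OLS of y* on (F*_t', W_t')'
        delta_star, _ = factor_augmented_ols(y_star, F_star, W, h)
        # Sign adjustment: Phi* ~ diag(sign(diag(F*'F_tilde/T)), I_q)
        signs = np.sign(np.diag(F_star.T @ F_t / T))
        delta_star[:r] = signs * delta_star[:r]
        draws[b] = delta_star
    return delta_hat, draws
```

Equal-tailed bootstrap percentile intervals for $\delta$ then take the relevant quantiles of the (sign-adjusted) draws around $\hat\delta$; a percentile-$t$ version, as used in the simulations below, would additionally studentize each draw.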
To prove the validity of this residual-based wild bootstrap, we add the following assumptions.

Assumption 6. The loadings $\lambda_i$ are either deterministic such that $\|\lambda_i\| \le M < \infty$ for all $i$, or stochastic such that $E\|\lambda_i\|^{12} \le M < \infty$; $E\|F_t\|^{12} \le M < \infty$ for all $t$; $E|e_{it}|^{12} \le M < \infty$ for all $(i,t)$; and, for some $q > 1$, $E|\varepsilon_{t+h}|^{4q} \le M < \infty$ for all $t, h$.

Assumption 7. $E(\varepsilon_{t+h} \mid y_t, F_t, y_{t-1}, F_{t-1}, \ldots) = 0$ for any $h > 0$, and $F_t$ and $\varepsilon_t$ are independent of the idiosyncratic errors $e_{is}$ for all $(i, s, t)$.

Assumption 8. $E(e_{it}e_{js}) = 0$ if $i \ne j$.

Assumption 6 strengthens the moment conditions imposed in Assumptions 1(b), 2(a) and 5(a), respectively. The moment conditions on $\lambda_i$, $F_t$ and $e_{it}$ suffice to show that $E|\lambda_i'F_se_{it}|^4 < M$ (while maintaining that $E|e_{it}|^8 < M$). If we assume that the three groups of random variables $\{F_t\}$, $\{e_{it}\}$ and $\{\lambda_i\}$ are mutually independent (as in Bai and Ng (2006)), then it suffices that $E\|\lambda_i\|^4 \le M < \infty$ and $E\|F_t\|^4 \le M < \infty$ (and $E|e_{it}|^8 < M$).

Assumption 7 was used by Bai and Ng (2006). It imposes a martingale difference condition on the regression errors $\varepsilon_{t+h}$, implying that these are serially uncorrelated but possibly heteroskedastic. In addition, $\varepsilon_t$ is independent of $e_{is}$ for all $(i, t, s)$. Under Assumption 7,

$$\Omega = \lim \mathrm{Var}\left( \frac{1}{\sqrt T}\sum_{t=1}^{T-h} z_t\varepsilon_{t+h} \right) = \lim \frac{1}{T}\sum_{t=1}^{T-h} E\left( z_tz_t'\varepsilon_{t+h}^2 \right),$$

which motivates using a wild bootstrap to generate $\varepsilon_{t+h}^*$. For this bootstrap scheme,

$$\Omega^* = \frac{1}{T}\sum_{t=1}^{T-h} \hat z_t\hat z_t'\hat\varepsilon_{t+h}^2,$$

which corresponds to the estimator used by Bai and Ng (2006) (cf. their equation (3)) and is consistent for $\Psi_0\Omega\Psi_0'$ under Assumptions 1-7. We assume independence between $e_{it}$ and $\varepsilon_{t+h}$ and generate $\varepsilon_{t+h}^*$ independently of $e_{it}^*$, but we conjecture that our results remain valid under weak forms of correlation between the two sets of errors, because the limiting distribution of the OLS estimator is unchanged under Assumptions 1-5, which allow for dependence between $e_{it}$ and $\varepsilon_{t+h}$, as we proved in Theorem 2.1.

Assumption 8 imposes cross section independence on $\{e_{it}\}$, but allows for serial correlation and for heteroskedasticity in both dimensions. This assumption motivates the use of a wild bootstrap to generate $\{e_{it}^*\}$. For this bootstrap scheme, we can show that

$$\Gamma^* = \frac{1}{T}\sum_{t=1}^T \frac{1}{N}\sum_{i=1}^N \tilde\lambda_i\tilde\lambda_i'\tilde e_{it}^2 \equiv \frac{1}{T}\sum_{t=1}^T \tilde\Gamma_t,$$

where $\tilde\Gamma_t$ corresponds to estimator (5a) in Bai and Ng (2006, p. 1140). As shown by Bai and Ng, this estimator is consistent for $Q\Gamma Q'$ under cross section independence (and potential heteroskedasticity). Assumption 8 assumes this is the case and thus justifies Condition F* in this context. As we discussed in the previous section, Condition F* is not needed if $c = 0$. Thus, a wild bootstrap is still asymptotically valid when the idiosyncratic errors are cross sectionally (and serially) dependent, provided $\sqrt T/N$ converges to zero (as assumed in Bai and Ng (2006)).

Our main result is as follows.

Theorem 4.1 Suppose that a residual-based wild bootstrap is used to generate $\{e_{it}^*\}$ and $\{\varepsilon_{t+h}^*\}$, with $E|\eta_{it}|^4 \le C$ for all $(i,t)$ and $E|v_{t+h}|^{4q} \le C$ for all $t$, for some $q > 1$. Under Assumptions 1-7, if $\sqrt T/N \to c$, where $0 \le c < \infty$, and either Assumption 8 holds or $c = 0$, the conclusions of Corollary 3.1 follow.

5 Monte Carlo results

In this section, we report results from a simulation experiment that documents the properties of our bootstrap procedure in factor-augmented regressions. The data-generating process (DGP) is similar to the one used in Ludvigson and Ng (2009b) to analyze bias. We consider the single factor model:

$$y_t = \alpha F_t + \varepsilon_t, \quad (11)$$

where $F_t$ is drawn from a standard normal distribution independently over time. The regression error $\varepsilon_t$ will either be standard normal or heteroskedastic over time.
The $T \times N$ matrix of panel variables is generated as:

$$X_{it} = \lambda_iF_t + e_{it}, \quad (12)$$

where $\lambda_i$ is drawn from a $U[0,1]$ distribution (independently across $i$) and the properties of $e_{it}$ are discussed below. The only difference with Ludvigson and Ng (2009b) is that they draw the loadings from a standard normal distribution. The use of a uniform distribution increases the cross-correlations and leads to larger biases without having to set the idiosyncratic variance to large values (they set it to 16 in one experiment). Note that this DGP satisfies the conditions PC1 in Bai and Ng (2011), which implies that $H_0 = 1$.

We consider two values for the coefficient $\alpha$, either $\alpha = 0$ or $\alpha = 1$, and six different scenarios outlined in the table below. When $\alpha = 0$, the OLS estimator is unbiased, and the properties of the idiosyncratic components do not matter asymptotically. This leads us to consider only one scenario with $\alpha = 0$. The other five scenarios have $\alpha = 1$ but differ according to the properties of the regression error $\varepsilon_t$ and of the idiosyncratic error $e_{it}$:

DGP                  $\alpha$    $\varepsilon_t$       $e_{it}$
1 (homo-homo)        0           $N(0,1)$              $N(0,1)$
2 (homo-homo)        1           $N(0,1)$              $N(0,1)$
3 (hetero-homo)      1           $N(0, F_t^2/3)$       $N(0,1)$
4 (hetero-hetero)    1           $N(0, F_t^2/3)$       $N(0, \sigma_i^2)$
5 (hetero-AR)        1           $N(0, F_t^2/3)$       AR(1) with $N(0, \sigma_i^2)$ innovations
6 (hetero-CS)        1           $N(0, F_t^2/3)$       cross-sectionally correlated $N(0,1)$

DGP 1 is the simplest case, with $\alpha = 0$ and both error terms i.i.d. standard normal in both dimensions. DGP 2 is the same but with $\alpha = 1$; this allows us to isolate the effect of a non-zero coefficient on bias and inference while keeping everything else the same. The third experiment introduces conditional heteroskedasticity in the regression error. We do so by making the variance of $\varepsilon_t$ depend on the factor, scaled so that the asymptotic variance of $\hat\alpha$ is 1 in all experiments. The fourth DGP adds heteroskedasticity to the idiosyncratic error: the variance of $e_{it}$, $\sigma_i^2$, is drawn from $U[0.5, 1.5]$, so that the average variance is the same as in the homoskedastic case. The fifth DGP introduces serial correlation in the idiosyncratic error term, with autoregressive parameter equal to 0.5; the innovations are scaled by $(1 - 0.5^2)^{1/2}$ to preserve the variance of the idiosyncratic errors. Finally, the last experiment introduces cross-sectional dependence among the idiosyncratic errors. The design is similar to the one in Bai and Ng (2006): the correlation between $e_{it}$ and $e_{jt}$ is $0.5^{|i-j|}$ if $|i-j| \le 5$. (A simulation sketch of these six designs is given at the end of this discussion.)

We concentrate on inference about the parameter $\alpha$ in (11). We consider asymptotic and bootstrap symmetric percentile-$t$ confidence intervals at a nominal level of 95%. We report experiments based on 1000 replications with $B = 399$ bootstrap repetitions. We consider three values for $N$ (50, 100, and 200) and for $T$ (50, 100, and 200). We tailor our inference procedures to the properties of the error terms. In other words, when $\varepsilon_t$ is homoskedastic (DGPs 1 and 2), we use the variance estimator under homoskedasticity,

$$\hat\Sigma_\alpha = \hat\sigma_\varepsilon^2 \left( \frac{1}{T}\sum_{t=1}^{T-h}\tilde F_t^2 \right)^{-1}, \quad (13)$$

whereas we use the heteroskedasticity-robust version for DGPs 3-6,

$$\hat\Sigma_\alpha = \left( \frac{1}{T}\sum_{t=1}^{T-h}\tilde F_t^2 \right)^{-1} \left( \frac{1}{T}\sum_{t=1}^{T-h}\tilde F_t^2\hat\varepsilon_{t+h}^2 \right) \left( \frac{1}{T}\sum_{t=1}^{T-h}\tilde F_t^2 \right)^{-1}. \quad (14)$$

Similarly, we use the homoskedastic estimator of $\Gamma$ for cases 1-3, the heteroskedasticity-robust estimator for cases 4-5, and the CS-HAC estimator of Bai and Ng (2006) in case 6, with the window size $n$ equal to $\min(\sqrt N, \sqrt T)$. We consider the wild residual-based bootstrap described in Section 4, with the two external variables $\eta_{it}$ and $v_t$ both drawn as i.i.d. $N(0,1)$.
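For concreteness, the six designs can be simulated as in the sketch below. This is our reading of the design; in particular, the variance scaling $F_t^2/3$, the $\sigma_i^2$ innovations for DGP 5, and the truncated cross-sectional correlation matrix (assumed positive definite here) are our rendering of the description above:

```python
def simulate_dgp(T, N, alpha, dgp, rng):
    """One Monte Carlo sample from designs 1-6; dgp in {1, ..., 6}."""
    F = rng.standard_normal(T)                 # factor, i.i.d. N(0,1)
    lam = rng.uniform(0.0, 1.0, N)             # loadings, U[0,1]
    if dgp <= 3:
        e = rng.standard_normal((T, N))
    elif dgp == 4:                             # heteroskedastic e_it
        e = rng.standard_normal((T, N)) * np.sqrt(rng.uniform(0.5, 1.5, N))
    elif dgp == 5:                             # AR(1), rho = .5, variance preserved
        sig2 = rng.uniform(0.5, 1.5, N)
        v = rng.standard_normal((T, N)) * np.sqrt(sig2 * (1 - 0.5 ** 2))
        e = np.zeros((T, N))
        e[0] = rng.standard_normal(N) * np.sqrt(sig2)
        for t in range(1, T):
            e[t] = 0.5 * e[t - 1] + v[t]
    else:                                      # corr(e_it, e_jt) = .5^{|i-j|}, |i-j| <= 5
        d = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
        Sig = np.where(d <= 5, 0.5 ** d.astype(float), 0.0)
        e = rng.standard_normal((T, N)) @ np.linalg.cholesky(Sig).T
    X = np.outer(F, lam) + e
    scale = np.sqrt(F ** 2 / 3) if dgp >= 3 else 1.0
    y = alpha * F + rng.standard_normal(T) * scale
    return y, X
```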
We report two sets of results. The first is the bias of the OLS estimator. Because the OLS estimator converges not to $\alpha$ but to $H^{-1}\alpha$ (and $H$ converges to $+1$ or $-1$), and because its bias is proportional to this limit, the bias will be positive for some replications and negative for others. Reporting the average bias over replications would therefore be misleading. To circumvent this, we report the bias of the rotated OLS coefficient, $H'\hat\alpha$. This rotated coefficient converges to $\alpha$ in all replications. Note that this rotation is not possible in the real world, since the matrix $H$ depends on population parameters. Note also that this rotation is possible in the bootstrap world (and is indeed necessary to obtain consistent inference on the entire coefficient vector; see Corollary 3.1). For the bootstrap, we report the average of $H'\left( H^{*\prime}\hat\alpha^* - \hat\alpha \right)$, again to ensure that the sign of this bias is always the same. Secondly, we present coverage rates of the associated confidence intervals. For comparison, we also include results for the case where the factors do not need to be estimated and are treated as observed (row labeled "true factor" in the tables). This quantifies the loss from having to estimate the factors.

Table 1 provides results for the first two DGPs and illustrates our theoretical results. For each DGP, the top panel gives the bias associated with the OLS estimator as well as the plug-in and bootstrap estimates of this bias. The second panel for each design provides the coverage rates of intervals based on asymptotic theory, either using the OLS estimator or its bias-corrected version, based on OLS using the true factors, and based on the wild bootstrap. From Table 1, we see that, as expected, the bias is nil when $\alpha = 0$ (case 1). When $\alpha \ne 0$, a negative bias appears. DGP 2 shows that the magnitude of this bias is decreasing in $N$ (and $T$). The sample estimate of this quantity provides a reasonable approximation to it. However, the bootstrap captures the behavior of the bias well, as predicted by theory, and better than the sample estimate. Coverage rates are consistent with these findings. When $\alpha = 0$ (DGP 1), asymptotic theory is nearly perfect and closely matches the results based on the observed factors. In case 2, with $\alpha = 1$, OLS inference suffers from noticeable distortions for all sample sizes. This is because the estimator is biased and the associated t-statistic is not centered at 0. Analytical bias correction removes most of these distortions. The bootstrap provides even better inference and is quite accurate for $N \ge 100$. The remaining loss in accuracy is due to the estimation of the factors, as illustrated by the results with the true factors.

Tables 2 and 3 provide results for the other DGPs and show the robustness of the results to the presence of heteroskedasticity in both errors (DGPs 3 and 4) and of serial correlation in the idiosyncratic errors (DGP 5). The bias results in Table 2 are very similar to those in Table 1, although the coverage rates reported in Table 3 deteriorate relative to the simpler homoskedastic case. The presence of cross-sectional dependence (DGP 6) is interesting. Firstly, the bias is larger here (because $\Gamma$ also includes cross-product terms). Secondly, the wild bootstrap is theoretically not valid, since it does not replicate the cross-sectional dependence.
Indeed, we see that, contrary to the other cases, the sample estimate of the bias is often better than the bootstrap, especially with $N = 50$.

6 Conclusion

The main contribution of this paper is to give a set of sufficient high-level conditions under which any residual-based bootstrap method is valid in the context of the factor-augmented regression model in cases where $\sqrt T/N \to c$, $0 \le c < \infty$. Our results show that two crucial conditions for bootstrap validity in this context are that the bootstrap regression scores replicate the time series dependence of the true regression scores, and that either $c = 0$ or the bootstrap also replicates the cross-sectional dependence of the idiosyncratic error terms.

Our high-level conditions can be checked for any implementation of the bootstrap in this context. We verify them for a particular scheme based on a two-step application of the wild bootstrap. Although the wild bootstrap preserves heteroskedasticity, its validity depends on a martingale difference condition on the regression errors and, when $c \ne 0$, on cross-sectional independence of the idiosyncratic errors. The martingale difference condition on the regression errors was used in Bai and Ng (2006), but it is overly restrictive when the forecasting horizon $h$ is larger than one. Thus, relaxing this assumption is important. Although our general results in Sections 2 and 3 allow for serial correlation in the scores (see in particular our Assumption 5(c)), the particular implementation of the two-step residual-based wild bootstrap we consider in Section 4 is not robust to serial dependence. A block bootstrap based method is required in this case. We plan on investigating the validity of such a method in future work; our high-level conditions will be useful in establishing this result.

A second important extension of the results in this paper is to propose a bootstrap scheme that is able to replicate the cross-sectional dependence of the idiosyncratic error term. As our results show, this is crucial for capturing the bias when $c \ne 0$. Our wild bootstrap based method does not allow for cross-sectional dependence. Because there is no natural cross-sectional ordering, devising a nonparametric bootstrap method that is robust to cross-sectional dependence of unknown form is a challenging task. Another important extension of the results in this paper is the construction of interval forecasts, which we are currently investigating. Finally, the extension to factor-augmented vector autoregressions (FAVAR), first suggested by Boivin and Bernanke (2003), is also important. This case has recently been analyzed by Yamamoto (2011), who proposes a bootstrap scheme that exploits the VAR structure in the factors and the panel variables.

A Appendix A: Proofs of results in Section 2

We rely on the following lemmas to prove Theorem 2.1. Throughout, $\delta_{NT} \equiv \min(\sqrt N, \sqrt T)$.

Lemma A.1 Let Assumptions 1-5 hold. Then, for any fixed $h \ge 0$,

$$\frac{1}{T}\sum_{t=1}^{T-h}\left( \tilde F_t - HF_t \right)\varepsilon_{t+h} = O_P\left( \delta_{NT}^{-2} \right).$$

Lemma A.2 Let Assumptions 1-5 hold. Then, if $\sqrt T/N \to c$, where $0 \le c < \infty$, for any fixed $h \ge 0$:

a) $\frac{1}{\sqrt T}\sum_{t=1}^{T-h}\left( \tilde F_t - HF_t \right)\left( \tilde F_t - HF_t \right)' = cV^{-1}Q\Gamma Q'V^{-1} + o_P(1)$;

b) $\frac{1}{\sqrt T}\sum_{t=1}^{T-h} HF_t\left( \tilde F_t - HF_t \right)' = o_P(1)$;

c) $\frac{1}{\sqrt T}\sum_{t=1}^{T-h} W_t\left( \tilde F_t - HF_t \right)' = o_P(1)$;

d) consequently, with $\hat z_t = (\tilde F_t', W_t')'$, $\frac{1}{\sqrt T}\sum_{t=1}^{T-h} \hat z_t\left( \tilde F_t - HF_t \right)'$ converges in probability to the $p \times r$ matrix whose first $r$ rows equal $cV^{-1}Q\Gamma Q'V^{-1}$ and whose last $q$ rows are zero.
H 0 0 Iq 0 = diag (p lim H; Iq ), T h 1 X p zt "t+h !d N 0; T t=1 0 0 0 : By Lemma A.1, B !P 0 and by Lemma A.2d), C= , where + oP (1) : C T h 1 X p T t=1 T c < 1; for any …xed h Proof of Theorem 2.1. Write 8 T h T h > 1 X F~t HFt 1 X HFt > > p p " + > t+h > Wt > 0 T t=1 T t=1 > > ! 1 > | {z } {z | T h < p 1 X A B T ^ z^t z^t0 = T > Xh T 0 1 > 0 t=1 > > p z^t F~t HFt H 1 > > T t=1 > > > | {z } : A= 1p NT we have that T h 1 X p Wt F~t T t=1 W F~ p PT Q0 V and where 1 T F~t Wt F~t HFt 0 20 H 1 0 = c B B + oP (1) ; 9 > > "t+h > > > > > > } > = > > > > > > > > > ; : (17) where B and B are as de…ned in (15) and (16). Given Assumptions 1-5, ! T h T h 1 X 1 X 0 0 0 0 z^t z^t = 0 zt z t 0 + oP (1) = 0 zz 0 + oP (1) : T T t=1 This implies that if p t=1 T =N ! c; p T ^ !d N ( c = When E (Wt Ft0 ) = 0, we have that WF H00 = 1 1 F 1 F H0 1 F~ +V 0 1 F~ V 1 F~ V F~ 0 1 0 = 1 zz 1 zz 1 0 , and p lim (^ ) : = 0, implying that W F~ 0 +V 0 V F~ 1 W 0 = since H00 1 ) ; with +V W F~ V 1 0 0 zz 0 ; F~ V 1 F~ V 1 p lim (^ ) 1 F~ V p lim (^ ) ; H0 1 = Ir given that we can show that H0 F 1 = Q = (H00 ) : Proof of Lemma A.1. The proof is based on the following identity: F~t HFt = V~ 1 T 1X~ Fs T T 1X~ Fs st + T s=1 V~ 1 s=1 (A1t + A2t + A3t + A4t ) ; where st = E N 1 X eis eit N i=1 st = N 1 X N 0 i Fs eit ! , = Fs0 i=1 = st T 1X~ Fs st + T s=1 N 1 X (eis eit N T 1X~ Fs st + T st s=1 ! (18) E (eis eit )) ; i=1 0e t N and st = Ft0 0e s N = ts : Using the identity (18), we have that T h 1 X ~ Ft T HFt "t+h = V~ 1 (I + II + III + IV ) ; t=1 where I = III = T h T 1 XX ~ Fs T2 st "t+h ; 1 T2 st "t+h ; t=1 s=1 T T Xh X F~s T h T 1 XX ~ II = 2 Fs T st "t+h ; t=1 s=1 T T Xh X and IV = t=1 s=1 1 T2 F~s st "t+h : t=1 s=1 Since V~ 1 = OP (1) (see Lemma A.3 of Bai (2003), which shows that V~ !P V > 0), we can ignore V~ 1 . Start with I. We can write I= T h T 1 XX ~ Fs T2 HFs st "t+h + H t=1 s=1 T h T 1 XX Fs T2 t=1 s=1 21 st "t+h I1 + I2 : 1 T We can show that I2 = OP have T h T T h T 1 XX 1 XX E kF " k j s st t+h T2 T2 t=1 s=1 t=1 s=1 ! T 1 1 1X j st j = O ; T T t;s T E kI2 k since E kFs k2 1 T by showing that E kI2 k = O M and E j"t+h j2 . Ignoring H (which is OP (1)), we E kFs k2 st j M for some constant M < 1, and 1=2 PT 1 T 1=2 E j"t+h j2 t;s j st j < M by assumption. For I1 , repeated application of the Cauchy-Schwartz inequality implies that kI1 k = T 1X ~ Fs T T h 1 X T HFs s=1 1 p T Thus, I = OP st "t+h t=1 0 ! T 1X ~ Fs T s=1 11=2 0 B B B B B @ C T C 2 1X ~ C Fs HFs C C T s=1 A | {z } 2 OP ( N T ) by Lemma A.1 of Bai (2003) p 1 T NT HFs HFs T T h X 1 X @1 T T s=1 11=2 2 OP (1) 2 st "t+h t=1 p = OP 1 T 11=2 A : NT OP (1) T h T 1 XX Fs st "t+h + H 2 T t=1 s=1 !1=2 0 C T h 1 X 2 C "t+h C st j C T A }| t=1 {z } B T T Xh B1 X B j BT @ s=1 t=1 | {z . Next, consider II. We have that T h T 1 XX ~ II = 2 Fs T 2 st "t+h II1 + II2 : t=1 s=1 We can show that kII1 k Indeed, | T T h 1 X 1X E T T s=1 T 1X ~ Fs T s=1 HFs {z OP ( 1 NT 2 st "t+h = t=1 2 !1=2 0 ) T T h X 1 X @1 T T s=1 t=1 }| {z OP T T h 1X 1 X E T T s=1 = t=1 2 A st "t+h p1 NT N 1 X (eis eit N 11=2 = OP p ! 2 E (eis eit )) "t+h i=1 T T h N 1 XX 1 1X E p (eis eit TN T T N t=1 i=1 s=1 | {z 2 E (eis eit )) "t+h t=1 s=1 st "t+h T N 1 XX p Fs (eit eis T N s=1 i=1 | {z mt 22 =O 1 TN : } For II2 , ignoring H (which is OP (1)), we have that T h 1 1 X =p T N T t=1 : NT } =O(1) by Assumption 4(a) T h T 1 XX II2 = 2 Fs T 1 NT ! E (eit eis )) "t+h } p T h 1 1 X mt "t+h : T N T t=1 We can show that 1 T PT h t=1 mt "t+h p1 NT = OP (1) ; implying that II2 = OP . 
By Cauchy-Schwartz inequality, we have that T h 1 X mt "t+h T T h 1 X kmt k2 T t=1 t=1 given that E j"t+h j2 < M < 1, provided 1 T Markov’s inequality. But T h 1 X E kmt k2 T p1 NT by Assumption 3(b). Thus, II = OP T h T 1 XX ~ Fs T2 T h 1 X 2 "t+h T t=1 PT h 2 t=1 kmt k st "t+h = t=1 s=1 !1=2 = OP (1), or = OP (1) ; PT h t=1 E 1 T kmt k2 = O (1), by 2 T T N 1X 1 XX E p Fs (eit eis T T N s=1 i=1 t=1 t=1 III = !1=2 E (eit eis )) = O (1) : Next, consider T h T 1 XX ~ Fs T2 st "t+h + H HFs t=1 s=1 T h T 1 XX Fs T2 st "t+h III1 + III2 : t=1 s=1 Starting with III1 , we have that kIII1 k since | T T h 1X 1 X E T T s=1 T 1X ~ Fs T s=1 HFs 2 {z OP ( N1T ) 2 st "t+h = t=1 !1=2 0 T T h X 1 X @1 T T s=1 t=1 }| {z OP T T h 0e 1 X 1X t E Fs0 T T N s=1 T h T 1 XX III2 = 2 Fs T t=1 s=1 st "t+h p1 TN 2 "t+h = OP t=1 s=1 IV = T h T 1 XX ~ Fs T2 HFs p1 TN et "t+h = OP 1 TN t=1 s=1 : ! ! T T h 1X 1 X 0 et 0 = Fs F s "t+h = OP T T N s=1 t=1 | {z }| {z } T h T 1 XX Fs T2 t=1 s=1 23 et "t+h } OP p1 TN : For IV; write st "t+h + H 2 0 t=1 OP (1) given Assumption 4(b). Thus, III = OP NT 2 0 with its de…nition, we have that "t+h 1 TN T T h 1 X 1X E Fs0 = T TN =O(1) by Assumption 4(b) T h T 0e 1 XX t = 2 Fs Fs0 T N p } s=1 T T h 1X 1 1 X kFs k2 E p T TN T N t=1 | s=1{z } {z | st 11=2 A st "t+h t=1 =OP (1) Ignoring again H and replacing 2 st "t+h IV1 + IV2 . p 1 TN ; For IV1 , we have that kIV1 k | Indeed, T 1X ~ Fs T s=1 HFs {z OP ( T 1X T s=1 = !1=2 0 T X @1 T s=1 }| ) T h 1 X T st "t+h t=1 T X 1 = T 1 NT 2 e0s N s=1 1 TN 1 T T X T Xh | 1 T 2 ! 0 @ p1 T } | =OP (1) by Assumption 3(d) For IV2 , we have that IV2 = = T h T 0e 1 XX s 0 F F s t T2 N 1 T2 t=1 s=1 T T Xh X Fs t=1 s=1 A st "t+h {z p1 NT T 1X = T s=1 !2 !2 11=2 t=1 OP Ft "t+h t=1 e0 ps N s=1 {z 1 T !2 T Xh T h 1 X 0 Ft T 1 T s=1 T Xh "t+h N 2 e0s N 2 1 NT t=1 {z : NT !2 2 T h 1 X Ft "t+h T t=1 1 TN A = OP Ft "t+h 1 p } 0e s t=1 T X = OP : } =OP (1) by Assumption 5(c) "t+h T 1 X e0s Fs T N e0s Ft "t+h = N s=1 ! T h 1 X Ft "t+h T t=1 ! = OP p 1 NT 1 p T OP given Assumptions 3(c) and 5(b). Thus, we have shown that under our assumptions, 1 p I+II+III+IV = OP NT T +OP p 1 +OP NT p 1 +OP NT NT 1 p NT 1 p = OP NT Proof of Lemma A.2. Proof of part a) Using the identity (18), we can write p 1 TXh T F~t T B HFt F~t HFt 0 t=1 = p T V~ 1 T h 1 X (A1t + A2t + A3t + A4t ) (A1t + A2t + A3t + A4t )0 V~ T 1 ; t=1 where Ait (i = 1; : : : ; 4) are de…ned in (18). We analyze each term separately (ignoring V~ We can show that 1 T PT h 0 t=1 A1t A1t = OP p 1 TXh T A1t A01t = OP T t=1 1 T + OP 2 NT p 1 T 24 2 NT 1 T + OP : Thus, 1 p T = oP (1) : 1 ). T : ; Indeed, by Cauchy-Schwartz T h 1 X A1t A01t T T h T h T 1 X 1 X 1X~ 2 kA1t k = Fs T T T t=1 t=1 t=1 t=1 st s=1 2 T h T 1 X 1X ~ Fs T T 2 HFs st s=1 T h T 1 X 1X ~ = Fs T T t=1 HFs + HFs st s=1 2 T h T 1 X 1X + H Fs T T t=1 2 a11:1 + a11:2 : st s=1 We have that 2 T h T 1 X 1X ~ Fs T T T h T T 2 1 X 1 X 1X ~ F HF j st j2 s s st T T T t=1 s=1 t=1 s=1 s=1 ! ! T T T 2 1 1X ~ 1 1 XX = Fs HFs j st j2 = OP ; T T T T 2N T s=1 t=1 s=1 | {z }| {z } =OP (1) =OP ( N2T ) P P where we have use Assumption 2 (see Lemma 1(i) of Bai and Ng (2002)) to bound T1 Tt=1 Ts=1 j a11:1 HFs Next, T h T 1 X 1X H Fs T T a11:2 t=1 1 kHk2 | {z } T =OP (1) We can show that if p 1 T PT h 0 t=1 A2t A2t 1 T | 2 st st 1 T : t=1 ! T T 1 XX kFs k j T s=1 t=1 s=1 {z }| {z 2 =OP (1) T h 1 X p A2t A02t = OP T t=1 2 T h T 1 X 1X Fs kHk T T 2 s=1 T X = OP 2 st j : 1 TN 2 st j =OP (1) 1 + OP 1 p TN N + OP 2 NT s=1 = OP } ; implying that p T 1 N 2N T ! = oP (1) ; T =N ! 
c < 1: Indeed, T h 1 X A2t A02t T t=1 2 T h T h T 1 X 1 X 1X ~ 2 kA2t k = Fs T T T t=1 t=1 T h T 1 X 1X ~ Fs T T t=1 2 HFs s=1 HFs + HFs st s=1 st T h T 1 X 1X + H Fs T T t=1 2 st a22:1 + a22:2 : s=1 We have that a22:1 T h T 1 X 1X ~ Fs T T t=1 s=1 2 HFs st T 1X ~ Fs HFs T s=1 | {z =OP ( N2T ) 25 2 ! T T 1 XX 2 j st j = OP T2 t=1 s=1 } | {z } 1 =OP ( N ) by Assumption 2(c) ! 1 N 2 NT : Indeed, 2 N T T 1 1 XX 1 X (eis eit E p st j = N T2 N i=1 t=1 s=1 T T 1 XX Ej T2 2 t=1 s=1 E (eis eit )) =O 1 N given Assumption 2(c). Next, a22:2 T h T 1 X 1X kHk Fs T T 2 2 t=1 T N T h 1 1 X 1 XX p Fs (eis eit = kHk TN T T N s=1 i=1 t=1 | {z 2 st s=1 2 E (eis eit )) } =OP (1) by Assumption 3(b) implying that T h 1 X A2t A02t = OP T t=1 1 + OP 2 NT N We can show that under Assumptions 1-5, if p 1 TN = OP 1 N 1 TN = OP : 2 NT T =N ! c, T h 1 X p A3t A03t = cQ Q0 + oP (1) : T t=1 Proof. T h 1 X A3t A03t = T t=1 = T h 1 X T t=1 T h 1 X T t=1 T 1X~ Fs T st s=1 ! T 1X ~ Fs T T 1X~ Fs T st s=1 !0 HFs + HFs st s=1 a33:1 + a33:2 + a033:2 + a33:3 : ! T 1X ~ Fs T HFs + HFs st s=1 !0 It follows that ka33:1 k 2 T h T 1 X 1X ~ Fs T T t=1 HFs st s=1 T T h T 2 1 XX 1X ~ Fs HFs j T T2 s=1 t=1 s=1 {z }| {z | 1 =OP ( N ) 1 =OP 2 NT 2 st j = OP } 1 N 2 NT since T h T 1 XX j T2 t=1 s=1 2 st j = T h T 1 X X 0 0 et Fs T2 N 2 t=1 s=1 T 1X kFs k2 T | s=1{z } =OP (1) Thus, p p T 1 N 2N T T a33:1 = OP 26 T h 0e 2 1 X t = OP T N t=1 | {z } 1 =OP ( N ) given Assumption 3(d) ! = oP (1) ; 1 N : ; ; if p T =N ! c. Next we show that ka33:2 k 0 T T Xh 1 X @1 HFs T T t=1 s=1 | {z =OP p 2 st p1 N T a33:2 = OP p T N 1 = oP (1) : Indeed, NT 11=2 0 2 T T Xh 1 X A @1 F~s HFs T T t=1 s=1 }| {z st 1p =OP 11=2 A N NT 1 = OP N ; NT } since by Cauchy-Schwartz and Assumption 3(d), 2 T h T 1 X 1X HFs T T t=1 T h T 1 X 1X kHk Fs T T 2 2 st s=1 t=1 kHk2 st s=1 For a33:3 , we have that a33:3 = T h 1 X T = H t=1 T Xh 1 T = H ! !0 T T h 1X 1 X HFs st = H st T T s=1 s=1 t=1 ! ! T T 0 0 X X 1 et 1 et Fs Fs0 Fs Fs0 H 0 T N T N T 1X HFs T t=1 F 0F T T T h T 1X 1 XX kFs k2 2 j T T s=1 | s=1{z }| t=1 {z 1 =OP (1) =OP ( N ) 1 T s=1 T Xh t=1 T 1X Fs T st s=1 ! 2 st j = OP } T 1X T 0 st Fs s=1 ! H0 s=1 0e t e0t N N F 0F T given Assumption 3(e). When multiplied by p 1 N H 0 = OP ; T , this will therefore be of order OP p T N and therefore it will not go to zero in probability when c 6= 0. Its probability limit can be computed as follows. First, notice that HF 0 F T F~ = 0 F H 0 + F~ F~ F = T 0 F H0 F + T F~ 0 F = Q + oP (1) T given that by Lemma B.2 of Bai (2003), F~ 0 F H0 F = OP T 2 NT = oP (1) ; = + oP (1) : ~0 and p lim FTF = Q. Second, by Assumption 3(e), T h 1 X T t=1 Thus, letting p T =N = c + o (1), p p T T a33:3 = H N 0e t p e0 pt N N F 0F T ( T h 1 X T t=1 0e t p N e0 pt N ) F 0F T = (c + o (1)) [Q + oP (1)] ( + oP (1)) Q0 + oP (1) = cQ Q0 + oP (1) : 27 H0 1 N : 1 T We can show that p if T =N ! c; PT h 0 t=1 A4t A4t p T h 1 X p A4t A04t = OP T t=1 1 = OP N T 2 NT N ! 1 + OP 2 NT p + OP N N T NT NT ! 1 TN + OP + OP p 1 TN ; which implies that = oP (1) : Proof. T h 1 X A4t A04t = T T h 1 X T t=1 T 1X~ Fs T t=1 T h 1 X T = st s=1 ! T 1X ~ Fs T t=1 T 1X~ Fs T st s=1 !0 HFs + HFs st s=1 ! 
T 1X ~ Fs T HFs + HFs st s=1 a44:1 + a44:2 + a044:2 + a44:3 : !0 We have that ka44:1 k T h T 1 X 1X ~ Fs T T t=1 2 HFs T T h T 2 1 XX 1X ~ Fs HFs j T T2 s=1 t=1 s=1 | {z }| {z 1 =OP ( N ) 1 =OP st s=1 2 st j 1 = OP } 2 NT ; 2 NT N since T h T 1 XX j T2 t=1 s=1 T h T 1 X X 0 0 es Ft T2 N 2 st j = 2 t=1 s=1 T r 1 X kFt k2 T | t=1{z } =OP (1) This implies that ka44:2 k 0 @1 T | p T Xh t=1 T a44:1 = OP T 1X HFs T s=1 {z =OP p 2 st p1 N T =N OP 11=2 0 A @1 T }| 1 2 NT T Xh t=1 T 0e 2 1X s = OP T N s=1 | {z } 1 =OP ( N ) given Assumption 3(d) 1 N : = oP (1). Next, 2 T 1X ~ Fs T s=1 {z HFs 1p NT N =OP st 11=2 A = OP 1 N ; NT } since by Cauchy-Schwartz and Assumption 3(d), T h T 1 X 1X HFs T T t=1 s=1 2 T h T 1 X 1X kHk Fs T T 2 2 st t=1 Thus, p T a44:2 = OP kHk2 st s=1 p ! T OP N 28 1 NT T T h T 1X 1 XX kFs k2 2 j T T s=1 t=1 s=1 | {z }| {z 1 =OP (1) =OP ( N ) = oP (1) ; 2 st j } = OP 1 N : p when T =N ! c: For a44:3 , we have that ! !0 T h T T T h 1 X 1X 1X 1 X = HFs st HFs st = H T T T T t=1 s=1 s=1 t=1 ! ! T h T T 0 0 1 X 1X es 1 X es = H Fs Ft0 Ft Fs0 H 0 T T N T N t=1 s=1 s=1 ! T h ! T T 0 0e X X X 1 es 1 1 s Fs Ft Ft0 F0 = H T N T T N s }| t=1 {z } } | s=1{z | s=1{z a44:3 Thus, p p1 TN T a44:3 = OP We can show that oP (1) : =OP (1) p1 TN =OP 1 T p1 TN =OP T 1X Fs T st s=1 ! T 1X T s=1 1 TN H 0 = OP 0 st Fs ! H0 : by Assumption 3(c) = oP (1) : PT h 0 t=1 A1t A2t = OP NT 1 p TN ; which implies that p T T1 PT h 0 t=1 A1t A2t = Proof. Given the bounds we found above for the terms that depend on A1t and A2t , it is immediate to see that T h 1 X A1t A02t T T h 1 X kA1t k2 T t=1 t=1 We can show that OP p1 N 1 T h 0 t=1 A1t A3t T h 1 X kA2t k2 T t=1 p1 TN = OP !1=2 = OP , implying that p T T = oP (1) : Indeed, T h 1 X A1t A03t T T h 1 X kA1t k2 T t=1 Similarly, PT !1=2 1 T t=1 PT h 0 t=1 A1t A4t = OP T h 1 X A1t A04t T !1=2 T h 1 X kA3t k2 T t=1 p1 TN t=1 1 p TN = OP 1 T where we can prove that = OP , which implies that T h 1 X kA1t k2 T t=1 !1=2 !1=2 p T h 1 X kA4t k2 T t=1 T T1 !1=2 1 p T 1 p OP NT PT h 0 t=1 A1t A3t 1 p T OP PT h 0 t=1 A1t A4t 1 p T = OP N = OP 1 p N = OP NT p p T TN = OP 1 p = p 1 TN : = oP (1) : OP 1 p N ; PT h 2 t=1 kA4t k = OP 1 N by an application of Cauchy-Schwartz. Proof. Given the de…nition of A4t , we have that T h 1 X kA4t k2 = T t=1 T h T 1 X 1X~ Fs T T t=1 s=1 2 st T h T 1 X 1X ~ Fs T T s=1 | t=1 {z OP = OP 1 N : 29 1 N 2 NT 2 HFs st T h T 1X 1 X H Fs + T T } | t=1 {zs=1 1 OP ( N ) 2 st } TN By Cauchy-Schwartz, 2 T h T 1 X 1X ~ Fs T T t=1 HFs st s=1 since T T 1 XX T2 2 st t=1 s=1 T T 0e 1 XX s = 2 Ft0 T N T T T 2 1 XX 1X ~ 2 Fs HFs st = OP T T2 s=1 t=1 s=1 | {z }| {z } 1 =OP ( N2T ) =OP ( N ) T T X 1X 2 1 kFt k T T 2 t=1 s=1 t=1 0e 2 s N s=1 1 2 NT N 1 N = OP (1) OP ; by Assumptions 1(a) and 3(d). 
For the second term, T h T 1 X 1X H Fs T T t=1 2 T h T 1 X 1X kHk Fs T T st s=1 t=1 1 T Similarly, we can show that oP (1) : PT h 0 t=1 A2t A3t 11=2 1 N 2 NT OP PT h 0 t=1 A2t A4t 1 = OP N NT ; implying that 0 t=1 1 T PT h 0 t=1 A3t A4t = OP T h 1 X A3t A04t = T t=1 = 1 N NT T h 1 X A3t T t=1 T 1X~ Fs T s=1 T h T 1 X 1X ~ A3t Fs T T t=1 p st !0 NT 11=2 PT h 0 t=1 A2t A4t T h 1 X = A3t T HFs s=1 t=1 0 11=2 h 0 t=1 A3t A4t = OP p T T PT h 0 t=1 A2t A3t 1 N = OP = : NT 1 N : NT = oP (1) : T 1X ~ Fs T HFs s=1 T h 1 X + H A3t st T2 t=1 30 = OP s=1 2 st j = oP (1) : C B C B 1 TXh B 2C kA4t k C B C BT @| t=1{z }A 1 OP ( N ) PT s=1 ; implying that 0 T T , implying that N t=1 C B T h C B1 X B 2C kA k B 3t C C BT @| t=1{z }A 1 OP ( N ) T T 1 N 2 NT OP 1 T h T T 1 X 1X 1X kFs k2 j T T T 0 p 11=2 B C B C B T h C B1 X C 2C B kA k 2t BT C B t=1 C B| {z }C @ A T h 1 X A2t A04t T st = OP C B C B C B T h C B1 X 2 B kA2t k C C BT C B t=1 B| {z }C A @ t=1 kHk2 s=1 0 T h 1 X A2t A03t T 1 T 2 2 T X s=1 Fs st T 1X + H Fs st T !0 s=1 : st !0 1 N : The …rst term can be bounded by !1=2 0 T h T h T X 1 X 1X ~ 1 2 @ kA3t k Fs T T T t=1 t=1 2 HFs A st s=1 For the second term, note that T 1X~ Fs A3t = T st s=1 T 1X ~ = Fs T 11=2 1 p N = OP NT T 1X Fs st + H T HFs s=1 1 p OP N = OP st : s=1 Thus, we get that T h T X 1 X A Fs 3t T2 t=1 = st s=1 T h 1 X T T 1X ~ Fs T t=1 s=1 T h 1 X +H T T 1X Fs T t=1 The …rst term is bounded by 0 T T Xh 1 X @1 F~s T T t=1 NT N OP For the second term, we get that ! T h T T 1 X 1X 1X Fs st Fs T T T t=1 = = s=1 T h 1 X F 0F T T F 0F 1 T T t=1 T Xh t=1 0e t N 0e t N Ft0 s=1 2 HFs st s=1 1 p = OP HFs 1 p N = OP 11=2 0 st ! st ! T 1X Fs T s=1 !0 T 1X Fs T s=1 A 1 N t=1 : st T T Xh 1 X @1 Fs T T st !0 2 st s=1 : 11=2 A NT !0 T h T 0e 1 X 1X t 0 F F = s st s T T N s=1 t=1 s=1 ! 0 T T h 0e 0e 1X F 0F 1 X s t Fs Ft0 = T N T T N s=1 t=1 ! T 1 X 0 es 0 1 F = OP (1) OP p OP T N s TN s=1 ! T 0e 1X s Fs Ft0 T N s=1 !0 T 1 X e0s Fs Ft T N !0 s=1 p 1 TN = OP 1 TN ; given Assumption 3(c). Thus, it follows that the only term that contributes in a non-negligible way to p T B = V~ 1 T h 1 X p (A1t + A2t + A3t + A4t ) (A1t + A2t + A3t + A4t )0 V~ T t=1 is the term that depends on Bai (2003)), we have that 1 T PT h 0 t=1 A3t A3t : p T B = cV 1 1 More precisely, since V~ !P V (by Lemma A.3 of Q Q0 V which proves the result. 31 1 + oP (1) ; 1 N NT : Proof of part b). Consider now T h 1 X Hp Ft F~t T t=1 C Replacing F~t HFt = V~ 1 (A 1t 0 HFt : + A2t + A3t + A4t ) yields T h 1 X Ft (A1t + A2t + A3t + A4t )0 V~ C = Hp T t=1 p 1 T H (bf 1 + bf 2 + bf 3 + bf 4 ) V~ 1 ; where Ait are as de…ned previously. Again, we consider each term separately. PT h 0 t=1 Ft A1t 1 T We can show that 1p = OP NT , which implies that T p T Hbf 1 V~ 1 = oP (1) under our assumptions. Proof. Write T h T h 1 X 1 X 0 Ft A1t = Ft T T bf 1 1 T = t=1 T Xh T 1X ~ Fs T s=1 ! t=1 1 T Ft t=1 bf 1:1 + bf 1:2 : T X F~s HFs s=1 0 HFs + st 1 T T Xh T 1X 0 Fs st + T 0 Ft t=1 1 T st H 0 s=1 ! T X 0 Fs st H 0 s=1 ! By Cauchy-Schwartz, kbf 1:1 k 1 T T Xh t=1 B !1=2 B B T h T B1 X 1 X 2 B F~s kFt k BT B t=1 T s=1 B| {z @ 1 HFs 0 st 1 2 T NT OP where the OP 11=2 0 C C C C C C }C A 2C = OP (1) OP 1 p NT T ; term is equal to a11:1 in part a). Next, consider bf 1:2 : 2 NT T T h 1 X Ft T bf 1:2 t=1 = 1 T 1 T | T 1X 0 Fs T s=1 T h T XX t=1 s=1 {z st Ft Fs0 ! st H0 ! 
H 0 = OP 1 T : } OP (1) given Assumptions 1 and 2 Indeed, T h T 1 XX E Ft Fs0 T t=1 s=1 st T T 1 XX E Ft Fs0 T st t=1 s=1 T T 1 XX j T t=1 s=1 32 st j E kFt k2 1=2 E kFs k2 1=2 = O (1) ; given the moment conditions on Ft and the weak dependence assumptions on eit . Thus, bf 1 1 p bf 1:1 + bf 1:2 = OP NT 1 T + OP T 1 p = OP NT ; T as we wanted to prove. 1 PT h 0 We can show that bf 2 t=1 Ft A2t = OP T p implies that T times this object is oP (1). NT 1 p p1 TN + OP TN = OP p1 TN ; which Proof. T h T h 1 X 1 X 0 Ft A2t = Ft T T bf 2 1 T = t=1 T Xh T 1X~ Fs T s=1 ! t=1 T 1X ~ Fs T Ft t=1 0 HFs s=1 st st + !0 T h T 1 X 1X 0 Ft Fs T T t=1 st H 0 bf 2:1 + bf 2:2 : s=1 We can write kbf 2:1 k = T 1X T s=1 = OP NT T h 1 X Ft T st t=1 1 p ! F~s B T T h B1 X 1 X B Ft B BT T t=1 @| s=1 {z OP ( T1N ) 0 HFs 11=2 0 st 2C C C C C }A C B C B C B X 2C B1 T B F~s HFs C C BT C B s=1 B| {z }C A @ 1 2 NT OP ; TN 11=2 0 given that T T h 1X 1 X Ft T T s=1 2 = st t=1 T T h N 1 X 1X 1 X Ft (eis eit T T N s=1 = t=1 2 E (eis eit )) i=1 T T h N 1 1X 1 XX p Ft (eis eit TN T T N s=1 t=1 i=1 {z | 2 = OP E (eis eit )) For bf 2:2 , we have that (ignoring H), t=1 T 1X 0 Fs T st s=1 ! T h 1 X kFt k2 T t=1 !1=2 0 using again Assumption 3(b) to bound the second term. We can show that bf 3 oP (1). 1 T PT h 0 t=1 Ft A3t = OP 33 T T Xh 1 X @1 Fs T T p1 TN t=1 : } OP (1) by Assumption 3(b) T h 1 X Ft kbf 2:2 k = T 1 TN 2 ts s=1 ; implying that p 11=2 A = OP T bf 3 = OP p p p T TN 1 TN = ; Proof. T h T h 1 X 1 X 0 Ft A3t = Ft T T bf 3 t=1 t=1 T h T 1 X 1X ~ Ft Fs T T = T 1X ~ Fs T t=1 0 HFs st + s=1 T h 1 X Ft T t=1 Rewrite T 1X = T s=1 so that by Cauchy-Schwartz 0 T T h 1X 1 X Ft kbf 3:1 k @ T T s=1 t=1 | {z OP 2 st p1 TN T h 1 X Ft T 2 s=1 1 T T Xh t=1 st t=1 Ft e0t N t=1 11=2 A Note that T T h 1X 1 X Ft T T 1 T T X s=1 st ! T 1X ~ Fs T s=1 }| =OP F~s {z kFs k2 = 1 TN | p 1 TN 2 ; !1=2 = OP NT } 1 p TN T T h 1 X 1 X e0t Ft Fs = T T N t=1 T Xh 0 HFs 1 NT 2 st !0 s=1 HFs T T h 0e 1X 1 X t Ft Fs0 = T T N s=1 2 HFs s=1 bf 3:1 + bf 3:2 : bf 3:1 T 1X Fs st + H T s=1 !0 T 1X H Fs st T s=1 2 1 T Ft e0t t=1 {z } t=1 T X s=1 1 TN kFs k2 = OP OP (1) by Assumption 3(c) Next consider (ignoring H), bf 3:2 T h T 1 X 1X 0 = Ft Fs T T t=1 st s=1 T 1X = T T h 1 X Ft T s=1 st t=1 ! Fs0 ; so that by Cauchy-Schwartz inequality and Assumption 3(c), we get that 0 1 !1=2 2 1=2 T T h T X X X 1 1 1 2 kbf 3:2 k @ Ft st A kFs k = OP T T T s=1 t=1 s=1 This concludes the proof that bf 3 We can show that p T bf 4 p1 T bf 3:1 + bf 3:2 = OP PT h 0 t=1 Ft A4t =c 34 F Q0 V p 1 TN 1 2 : + oP (1) : p 1 TN : : Proof. Replacing A4t with its de…nition yields T h T h 1 X 1 X Ft A04t = Ft T T bf 4 = 1 T t=1 T Xh T 1 X ~0 Fs T s=1 ! t=1 Ft t=1 T 1X ~ Fs T HFs s=1 Starting with bf 4:2 , and given that st 0 st st T h 1 X + Ft T = O 1 p TN st s=1 ! bf 4:1 + bf 4:2 : we have that T h T 1 1 X 1 X 0 0 es 0 0 Ft Ft F H =p T T N s TN t=1 s=1 bf 4:2 = T 1X (HFs )0 T t=1 0e s N ; = Ft0 ! T h 1 X Ft Ft0 T t=1 p OP (1) OP (1) OP (1) = OP 1 TN ! T 1 X p T N s=1 0 es Fs0 ! H0 ; given Assumptions 1, 3(c) and the fact that H = OP (1) : Thus, p T b4f:2 = oP (1). Next, consider bf 4:1 . We have that bf 4:1 T h T 1 X 1X ~ = Fs Ft T T t=1 0 HFs Ft0 s=1 0e s = N T h 1 X Ft Ft0 T t=1 ! T 1 X 0 es ~ Fs T N HFs s=1 0 ! ; where the …rst term is OP (1) and the second term can be shown to be OP N1 : Thus, we will p get a non negligible contribution from bf 4:1 when multiplying by T . 
Speci…cally, using the usual decomposition for F~s HFs , we have that T 1 X 0 es ~ Fs T N HFs 0 s=1 = T 1 X 0 es (A1s + A2s + A3s + A4s )0 V~ T N 1 : (19) s=1 We …rst show that the …rst, second and last terms are oP (1) when multiplied by p T . To end the proof we will study the third term. Starting with the …rst term in (19) (and ignoring V~ 1 ), !1=2 !1=2 T T T 0e 2 1X 1 1 X 0 es 0 1X 1 s 2 OP p ; A kA1s k = OP p T N 1s T N T N T s=1 s=1 s=1 p so that this term is oP (1) when multiplied by T . Similarly, !1=2 !1=2 T T T 0e 2 1 X 0 es 0 1X 1X 1 1 s 2 A2s kA2s k = OP p OP p = OP T N T N T N N N T s=1 s=1 s=1 so that when we multiply by p T we get OP p T N OP 1 NT = oP (1). For the last term in (19), a more careful analysis is required. We have that T 1 X 0 es 0 A = T N 4s s=1 = T 1 X 0 es T N 1 T s=1 T X s=1 T 1X ~ Ft T T 1X ~ Ft N T 0e s HFt t=1 t=1 35 HFt 0 T 1X + H Ft ts T 1 ts + H T t=1 T X 0 es s=1 ts !0 T 1X 0 Ft N T t=1 ts : (20) 1 N NT ; For the …rst term in (20), T T T T 1 X 0 es 1 X ~ 1X ~ 1 X 0 es Ft HFt ts = Ft HFt T N T T T N ts s=1 t=1 t=1 s=1 1 !1=2 0 2 1=2 T T T 0e X X 2 1 1 1 1 1X ~ s @ A = OP Ft HFt OP = OP ts T T T N N NT t=1 t=1 s=1 1 N ; NT since 2 T T 1 X 1 X 0 es T T N t=1 ts s=1 2 T T 1 X 1 X 0 es = T T N t=1 T 1X T st s=1 0e 2 s N s=1 T T 1 XX j T2 t=1 s=1 2 st j = OP 1 N2 For the second term in (20), given Assumption 3(c), T T 1 X 0 es 1 X 0 Ft T N T s=1 T T 0e 1 X 0 es 1 X t Fs0 Ft0 T N T N s=1 t=1 ! T T 1 1 X 0 1 X 0 p p es Fs TN T N s=1 T N t=1 = ts t=1 = 1 Thus, (20) is OP N NT + OP 1 NT 1 = OP N 0 et Ft0 ! = OP ; which when multiplied by NT p 1 TN : T is oP (1). Next, we analyze the dominant term in bf 4:1 which comes from the contribution involving A3s . For this term, we have that T 1 X 0 es 0 ~ A V T N 3s 1 = s=1 given that F 0 F~ T T 1 X 0 es T N s=1 T X T 1X~ Ft T t=1 F~ 0 F 0e s 0e s = 1 T = 1 ( + oP (1)) Q0 V N s=1 ts N T N 1 !0 !0 V~ 1 T 1 X 0 es = T N s=1 1 1 NT = T X s=1 0e s p t=1 F 0 F~ ~ V T e0s p N !0 T 1 X ~ 0 0 es Ft Ft T N N 1 ; ; = Q0 + oP (1) by Proposition 1 of Bai (2003) and that by Assumption 3(e), T h 1 X T t=1 0e t p e0 pt N N = + oP (1) : Thus, p T bf 4:1 = T h 1 X Ft Ft0 T t=1 ! p T ( + oP (1)) Q0 V N 1 ; ! =c F Q0 V 1 + oP (1) : Since T h 1 X C = Hp Ft (A1t + A2t + A3t + A4t )0 V~ T t=1 36 1 = p T H (bf 1 + bf 2 + bf 3 + bf 4 ) V~ 1 ; V~ 1 : it follows that C = cH0 Q0 V F 1 1 V because H0 = p lim H is such that H0 F + oP (1) = cQ Q0 V 2 + oP (1) ; 1Q = Q: Indeed, note that H0 = p lim H = V ; which implies that H0 F = V 1 V 1=2 = V 1 V 1=2 V 1=2 0 F 0 1=2 = V 1=2 given the de…nition of Q and the fact that 1 =V V 1=2 1=2 0 is such that 0 1=2 1=2 1=2 F = Q; 1=2 1=2 = F V: Proof of part c). The proof follows closely the proof of part b) by relying on moment and dependence conditions involving the extra regressors Wt . In particular, writing T h 1 X p Wt F~t T t=1 we verify p 0 HFt = T h 1 X p Wt (A1t + A2t + A3t + A4t )0 V~ T t=1 p T (d1 + d2 + d3 + d4 ) V~ 1 ; 1 T di = oP (1) for i = 1; 2; 3 by using the same arguments as for bf 1 , bf 2 and bf 3 . The only term that has a nonzero contribution is d4 . Following the same arguments as for bf 4 ; we have that ! ! T T 0e Xh X p p 0 1 1 s 1 0 T d4 V~ = T W t Ft F~s HFs V~ 1 T T N t=1 s=1 ! p T h T 1 X = Wt Ft0 Q0 V~ 1 + oP (1) V~ 1 N T t=1 0 2 = c WF QV = c 0 W F H0 V = c W F~ V + oP (1) = c 1 V F~ V 1 Q Q0 V 1 0 W F H0 Q V + oP (1) , since 1 Q0 V 2 + oP (1) ; since H00 Q = Ir + oP (1) W F~ = 0 W F H0 and F~ =V 1 Q Q0 V 1 : This ends the proof. Proof of part d). 
This follows immediately from parts a), b) and c) of this Lemma. B Appendix B: Proofs of results in Section 3 This Appendix is organized as follows. First, we provide some auxiliary lemmas, then we prove the results in Section 3, and …nally we prove the auxiliary lemmas. Lemma B.1 Let H = V~ a) H H 0 2 NT = Ir + OP b) H = H0 +OP 2 NT 1 F~ 0 F~ ~ 0 ~ . T N Under Conditions A*-D*, we have that if NT = min p N; p T ; , in probability, i.e. H is asymptotically an orthogonal matrix. , in probability, where H0 is a diagonal matrix with 37 1 on the main diagonal. c) V~ = H V~ H 0 + OP = V~ + OP 2 NT 2 NT , in probability. Lemma B.2 Assume Assumptions 1-5 hold and suppose we generate bootstrap data yt+h ; Xt ac- cording to the residual-based bootstrap DGP (7) and (8) by relying on bootstrap residuals "t+h and fet g such that Conditions A*-D* are satis…ed. Then, as N; T ! 1, T h 1 X ~ Ft T 1 p H F~t "t+h = OP NT t=1 in probability, for h ; T 0. Lemma B.3 Suppose conditions A*-D* hold. Then, as N; T ! 1; a) T h 1 X ~ Ft T H F~t F~t H F~t 0 1 ~ V N = t=1 1 " H 1 T +OP T h 1 X T ~ 0e p t N t=1 ! 1 + OP et 0 ~ p N !# p + OP N NT ~ 0e p s N ! H 0 V~ 1 1 NT : b) T h 1 X H F~t F~t T 0 H F~t t=1 1 p +OP NT 1 + OP T T h 1 X ~ ~0 Ft F t T 1 =H N N t=1 !" T 1X T s=1 es0 ~ p N !# ! F~ 0 F~ T V~ : NT c) T h 1 X Wt F~t T 0 H F~t t=1 1 p +OP NT T 1 = N t=1 1 + OP N T h 1 X Wt F~t0 T !" T 1X T s=1 1 p T T Xh F~t Wt F~t H F~t H F~t 0 0 H H t=1 in probability, where ~ W F~ = 0 0 1 1 PT es0 ~ p N !# F~ 0 F~ T ! V~ : ^ = c H0 0 0; if 1 h | V~ h ^ = c ~ W F~ V~ {z | B 1 T ! NT Lemma B.4 Suppose conditions A*-D* hold. For any h T h 1 X ~ p Ft T t=1 ~ 0e p s N h ~0 t=1 Wt Ft : 38 1 p T =N ! c, 0 V~ 1 + {z V~ B 2 i ^ + oP (1) ; } c < 1; then 2 i ^ + oP (1) ; } 2 2 Proof of Lemma 3.1. The proof is based on the following identity: 0 B T B1 X F~s BT @ s=1 | {z 1B H F~t = V~ F~t where C T T T C 1X~ 1X~ 1X~ Fs st + Fs st + Fs st C st + C; T T T A s=1 s=1 s=1 } | {z } | {z } | {z } A1t A2t ! N 1 X eis eit N = E st i=1 st , st A3t N 1 X = (eis eit N st = i=1 Ignoring V~ 1 2 H F~t t=1 N 1 X ~0 ~ i Ft eis = N T 1X kA1t k2 T 1 T t=1 = T 1X kA1t k2 + kA2t k2 + kA3t k2 + kA4t k2 ; T PT T 1X ~ Fs T s=1 | {z kF~ k2 T =r because T 1X kA2t k2 T st 2 t=1 s=1 ! } ~ 0F ~ F T T 1X ~ Fs T s=1 | {z PT 2 ~ s=1 Fs For the second term, we have that 2 =r =Ir T T 1 XX E j T2 t=1 s=1 T T 1 1 XX j = E st N T2 t=1 s=1 1 N which explains why the second term is OP T T 1X 1X kA3t k2 = T T T t=1 OP t=1 PT ~ ~0 s=1 Fs Fs 2 2 T X s=1 F~s F~s0 ~ 0e t N s=1 T T 1 XX T t=1 s=1 | {z 2 st ! 2 st ; implying that 1 T = OP : } =OP (1) by Condition A*(b). ! T T 1 XX j T2 t=1 s=1 }| {z 1 OP ( N ) 2 st j ! 1 N = OP : } 2 N 1 X p (eis eit N i=1 2 PT 2 F~s In particular, by Condition A*(c), we can show that 1 N ts : i=1 t=1 By the Cauchy-Schwartz inequality, 1 E (eis eit )) ; (which is OP (1)), it follows that T 1X ~ Ft T since T A4t i=1 N ~ 0 et 1 X ~0 ~ 0 ~ and F e = F i s it s N N = 1 E (eis eit )) = OP 1 N ; : For the third term, 2 T 1 X ~ 0 et T N t=1 2 T 1 T X 2 F~s F~s0 s=1 r2 ; whereas by Condition B*(d) and Markov’s inequality, : The fourth term in dt follows by the same arguments. 39 = OP 1 T PT t=1 1 N ~ 0e t N ; 2 = Proof of Theorem 3.1. The proof follows the proof of Theorem 2.1. In particular, we can write the bootstrap analogue of (17), viz p T ^ T h 1 X z^t z^t 0 T = t=1 ! 
1 (A + B + C ) ; where conditional on the original data, with probability converging to one, we have that T h 1 X p z^t "t+h !d N 0; T t=1 A = given Conditions D*(b) and E*, and given that p lim 0 0 0 ; 0 ;B = p1 T oP (1) (given Lemma B.2); C = 0 T h 1 X p z^t F~t T t=1 H F~t 0 1 0 H ^ !P PT h t=1 0 c F~t 1 H F~t "t+h = ; 0 is de…ned in Lemma B.4. Under Assumptions 1-5, p lim V~ = V and p lim ^ = (H00 ) 1 , p lim ~ W F~ = W F~ , and p lim = Q Q0 by Condition F*, which implies that p P !d !P ; and …nally T1 Tt=1h z^t z^t 0 = 0 0 zz 00 00 + oP (1). This implies that T ^ B 0; B where N c( 0) 1 0 ;( 0 0) 1 0 ( 0) 1 , in probability. p Proof of Corollary 3.1. By Theorem 2.1, under Assumptions 1-5, and if T =N ! c, 0 c < p 1, T ^ !d Z N ( c ; ) : Thus, from a multivariate version of Polya’s Theorem (cf. p T ^ x (x; c ; ) = o (1), Battacharya and Rao (1986)), it follows that supx P where (x; c ; ) denotes the distribution function of Z. Then, the result follows if we show that p 0^ ^ sup P T x (x; c ; ) = oP (1) : (21) x Under the stated assumptions, from Theorem 3.1 we have that p T 0^ ^ !d N ( c ; ), in probability. By arguing along subsequences and applying Polya’s Theorem, this su¢ ces to show (21). Proof of Lemma B.1. For part a), writing 1 ~ F~ 0 F~ = F T T and noting that 1 ~0 ~ TF F F~ H 0 0 1 F~ + H F~ 0 F~ ; T = Ir , yields H = F~ 0 F~ + OP T 2 NT ; (22) where we used Lemma B.3 to bound the second term. Now right multiply by H H H 0 F~ 0 F~ 0 = H + OP T 2 = Ir + OP NT ; 2 NT F~ = 0 F~ H F~ + F~ 0 T 40 + OP 2 NT = 0 to get F~ 0 F~ + OP T 2 NT where again we used Lemma B.3 and the fact that by construction H 0 1 =H + OP 2 NT F~ 0 F~ T = Ir . This shows that , i.e. H is asymptotically an orthogonal matrix. The eigenvalues of H are therefore +1 or -1, for large N and T . For part b), by de…nition, ! ~0 ~ ~ 0 F~ ~ 0 ~ F 1 H = V~ = V~ 1 H + OP T N N 2 NT ; given (22). Left multiplying both sides by V~ yields V~ H = H ~0 ~ N since V~ = ~0 ~ N + OP 2 NT = H V~ + OP 2 NT ; (23) by construction of the principal components. By Lemma A.3 of Bai (2003), V~ !P V > 0 and therefore we can write that V~ H = H V + oP (1) ; or, transposing, that VH Thus, H H 0 0 0 = H 0 V~ + oP (1) : is (for large N and T ) the matrix of eigenvectors of V . Since V is a diagonal matrix, is also diagonal, asymptotically. Moreover, because V has distinct eigenvalues (by assumption), it follows that its eigenvectors have only one nonzero value and this is +1 or -1 (because H is orthogonal). Therefore H 0 is for large N and T a diagonal matrix with 1 in the main diagonal (in particular, H = diag(sign(F~ 0 F~ ))). Part c) follows from (23) by right multiplying by H 0 and using parts a) and b). Proof of Lemma B.2. The proof follows exactly as the proof of Lemma A.1, thus we only highlight the di¤erences. Throughout, we denote with a star the bootstrap analogue of a given formula in that proof. First, to bound I2 , the bootstrap analogue of I2 , we use directly Condition 1 T C*(c) to conclude that I2 = OP , in probability. Second, to bound I1 , we rely on Lemma 3.1 and on Conditions A*(b) and D*(a). For II1 , we use Condition C*(a) instead of Assumption 4(a). For P II2 , we use Condition B*(b) instead of Assumption 3(b) to show that T1 Tt=1h E kmt k2 = OP (1), P 2 =O and Condition D*(a) instead of Assumption 5(a) to show that T1 Tt=1h "t+h P (1), in probability. To bound III1 and III2 ; we rely on Condition C*(b) instead of Assumption 4(b). 
Finally, to bound IV , we rely on Conditions B*(d) and D*(b) instead of Assumptions 3(d) and 5(b), respectively. Proof of Lemma B.3. Proof of part a). We follow closely the proof of Lemma A.2. Speci…cally, we analyze each of the terms in T h 1 X ~ Ft T H F~t F~t H F~t t=1 0 = V~ 1 T h 1 X (A1t + A2t + A3t + A4t ) (A1t + A2t + A3t + A4t )0 V~ T t=1 In particular, by Lemma 3.1, and the appropriate bootstrap high level conditions, we can show that: 1 T PT h 0 t=1 A1t A1t = OP 1 T , in probability, using Condition A*(b). 41 1 : 1 T 1 T PT h 0 t=1 A2t A2t 1 = OP PT h 0 t=1 A3t A3t F~ 0 F~ T 1 NH = , in probability, using Conditions A*(c) and B*(b). 2 NT N 1 T using Condition B*(d). Each of 1 T PT h 0 1 t=1 A4t A4t , T PT ~ 0e p t N h t=1 PT h 0 1 t=1 A2t A3t , T in probability, using condition B*(d). 1 T 1 T PT h 0 t=1 A1t A2t = OP PT h 0 t=1 A1t A3t and 1 T Thus, we conclude that T h 1 X ~ Ft T H F~t F~t NT 1 p h 0 t=1 A1t A4t H F~t F~ 0 F~ T PT h 0 t=1 A2t A4t 1 H 0 + OP and 1 T N NT PT h 0 t=1 A3t A4t , in probability, 1 is OP N NT , , in probability. NT PT 0 et 0 ~ p N p1 NT are OP 1 1 H N +OP 1 T = V~ t=1 , in probability. F~ 0 F~ T !" + OP T h 1 X T t=1 1 N ~ 0e p t N ! + OP NT et 0 ~ p N p !# 1 NT F~ 0 F~ T ! H 0 V~ : Proof of part b). We analyze each of the terms in T h 1 X ~ H Ft (A1t + A2t + A3t + A4t )0 V~ T 1 =H bf 1 + bf 2 + bf 3 + bf 4 V~ 1 ; t=1 PT h ~ 0 t=1 Ft Ait , 1 T where we let bf i for j = 1; : : : ; 4: Following exactly the same steps as in the proof of part b) of Lemma A.2, we can show that: 1p bf 1 = OP NT bf 2 = OP p1 TN , in probability, given Condition B*(b) and Lemma 3.1. bf 3 = OP p1 TN , in probability, given Condition B*(c) and Lemma 3.1. bf 4 = 1 N 1 T , in probability, given Condition B*(a) and Lemma 3.1. T PT h ~ ~0 t=1 Ft Ft h P T 1 T ~ 0e p s N s=1 es 0 ~ p N i F~ 0 F~ T V~ 1 +O P p1 TN + OP 1 N NT , in probability, given in particular Condition B*(c) and Lemma 3.1. Thus, we can conclude that ! !# ! !" T h T T 0~ Xh X ~ 0e ~ 0 F~ 0 1 X 1 e F 1 1 p s ps H F~t F~t H F~t = H F~t F~t0 V~ T N T T T N N t=1 t=1 s=1 +OP 1 p NT T + OP 1 N : NT Proof of part c). This follows by the same arguments used in the proof of part c) of Lemma A.1. 42 2 1 Proof of Lemma B.4. From prove part a) of Lemma B.3, and noting that F~ 0 F~ T = Ir , we have that T h 0 1 X ~ 1 p H 0 Ft H F~t F~t H F~t ^ T t=1 " T h ! !# p 1 X ~ 0 et et 0 ~ T ~ 1 p p V H H 0 V~ 1 H = N T N N t=1 ! p 1 1 T 1 p p +OP + OP + OP N NT T N = cV~ 1 H 0 V~ = c H 0 1 V~ 1 = c H0 0 1 V~ 1 H 1 1 H V~ H H 1 0 1 H 0 1 0 ^ ^ + oP (1) H 0 1 V~ 1 H 1 1 0 H ^ + oP (1) ^ + oP (1) where the second equality uses Condition B*(e), and the third equality follows from Lemma B.1.c). 1 (H 0 ) 1 The last equality uses the fact that H0 = diag ( 1), which implies that H ^ = ^ + oP (1). This proves part a). Similarly, from part b) of Lemma B.3 and given Condition B*(e), we get that T h 0 1 X 1 p H F~t F~t H F~t H 0 ^ T t=1 !" ! !# p T h T 1 X ~ ~0 1 X ~ 0 es T es0 ~ p p Ft Ft = H N T T N N t=1 s=1 ! F~ 0 F~ 1 V~ 2 H 0 ^ + oP (1) = cH T = cH V~ 1 H 0 H 0 1 V~ 1 H 1 H 0 1 F~ 0 F~ T ! V~ ^ + oP (1) = cH0 2 H V~ 2 1 0 ^ + oP (1) ^ + oP (1) ; P where the second equality uses the fact that T1 Tt=1h F~t F~t0 = Ir +oP (1) ; the third equality uses Lemma ~0 ~ B.1.c) and the fact that F TF V~ 1 = V~ 1 H 0 , and the last equality holds because H 1 (H 0 ) 1 = Ir + oP (1), in probability. To prove part c), we note that from part c) of Lemma B.3, we have that T h 0 1 X 1 p Wt F~t H F~t H 0 ^ T t=1 ! !# !" 
p T h T 0~ X ~ 0 es T 1 X 1 e p ps = Wt F~t0 N T T N N t=1 s=1 ! F~ 0 F~ 1 V~ 2 H 0 ^ + oP (1) = c ~ W F~ T = c ~ W F~ V~ 2 F~ 0 F~ T ^ + oP (1) ; by relying on the same arguments as those used to prove part b). 43 ! V~ 2 H 0 1 ^ + oP (1) C Appendix C: Proofs of results in Section 4 First, we state an auxiliary result and its proof. Then we prove Theorem 4.1. Lemma C.1 Suppose Assumptions 1-5 hold. If in addition either: (1) fFs g, f i g and feit g are mutually independent and for some p 2, E jeit j3p M < 1, or (2) for some p follows that (i) PT 1 T (ii) (iii) F~t t=1 1 N PN ~i i=1 1 TN PT t=1 p HFt 10 H PN ~pit i=1 e M < 1; E k i kp 2, E jeit j2p M < 1; E k i k3p M < 1 and E kFt kp M < 1 and E kFt k3p M < 1, it = OP (1) ; p = OP (1) ; i = OP (1) : Proof of Lemma C.1. Proof of (i). We rely on the following identity (see Bai and Ng (2002), proof of Theorem 1): F~t HFt = V~ 1 T 1X~ Fs T T 1X~ + Fs st T s=1 where st = 1 N PN i=1 eis eit ; T 1X ~ Ft T st = HFt 1 N p PN 0 i Fs eit ; i=1 3p s=1 V~ 1 1 p t=1 st s=1 ! ; It follows that by the c ! T T T 1X 1X 1X at + bt + ct ; T T T and st = T 1X~ + Fs st T ts : t=1 t=1 r inequality, t=1 where T 1 X~ at = p Fs T p st s=1 Let st denote either st , T X s=1 st s=1 or p F~s st T 1 X~ ; bt = p Fs T st . We can write 0 1 2 p=2 T X =@ F~s st A s=1 p st t=1 s=1 p st T T C 1 XB B 1 X ~ 2C Fs C B T @ T s=1 A t=1 | {z } rp=2 1 T2 =r T T XX t=1 s=1 j p st j ; 44 st : s=1 T X F~s s=1 T 1X j T s=1 T 2X s=1 where the inequality follows by Cauchy-Schwartz. It follows that 0 1p=2 T T 1X 1 X~ Fs T Tp p T 1 X~ ; and ct = p Fs T 2 st j !p=2 j 2 st j !p=2 ; T 1X rp=2 T t=1 T 1X j T s=1 2 st j !p=2 where the last inequality follows again by the c r inequality. Thus it su¢ ces to show that E j M < 1 to prove that the above term is OP (1). Starting with p N 1 X E j st j = E eit eis N N 1 X E jeit eis jp N p i=1 N 1 X E j st j = E N st , 1=2 E jeis j2p i=1 M < 1: When p = N 1 X E jeit j2p N i=1 given that we assume E jeit j2p st = st p st , i=1 M < 1; we have that N 1 X E N 0 i Fs eit 1=2 p st j p 0 i Fs eit < M; i=1 since E 2=3 3p p 0 i Fs eit 1=3 E kFs k3p E k i eit k 2 given the assumptions that E k i k3p E k i k3p E jeit j3p M , E kFs k3p 1=3 E kFs k3p M , and E jeit j3p 1=3 M M in obtaining the last inequality. Note that if we assume that f i g, fFs g and feit g are three mutually independent groups of random variables, then it su¢ ces that E k i kp E M , E kFs kp p 0 i Fs eit : M , and E jeit jp The term that depends on st = st can be dealt with similarly. 0~ ~0 Proof of (ii). Note that ~ = XTF , which implies that ~ 0 = FTX . Since X = F M to bound 0 + e, it follows that ~0 ~0 = F F T thus implying that ~ i = F~ 0 F T i + F~ 0 ei T ; 1 F~ 0 F~ F H0 T T i + F~ 0 e ; T where ei = (ei1 ; : : : ; eiT )0 . We can write 0 F~ 0 F H F~ 0 F~ = T T from which it follows that 0 ~0 ~0 ~ i = F F H H 0 1 i + F ei = H 0 T T 0 1 F~ 0 F~ F~ 0 F~ = Ir F H0 H0 F H0 T 1 1 i +T F~ ; 0 F H0 ei +T 1 0 F H 0 ei : Thus, 1 N N X ~i H 10 p i 3p i=1 0 1B For the …rst term, we have that N 1 X T N 1 F~ 0 F~ @ F H0 H0 1 N + N1 1 PN T i=1 p T i PN i=1 F~ 1 1=2 F~ p 1F ~0 F~ F H0 H0 1 i p 0 P F H 0 ei + N1 N i=1 T T T 1=2 F~ F H0 p H0 i=1 p 1 (F H 0 )0 e p i 1 p T 1=2 F~ p = T 1 F~ 2 p=2 = T 1 T X t=1 45 F~t 2 !p=2 = rp=2 ; C A: N 1 X k i kp : N i=1 Now, 1 given that F~ 0 F~ =T = Ir . Similarly, under Assumptions 1-5, F~ 1=2 T p F H0 = T T X 1 F~t HFt t=1 1 p Since H 0 2 !p=2 p NT = OP = OP (1) : = OP (1), it follows that the …rst term is OP (1) provided E k i kp holds by assumption. 
M < 1, which For the second term, N 1 X T N 1 F~ 0 F H0 p ei 1=2 = T F~ p F H0 t=1 N 1 X T N 1=2 ei p ; t=1 where the …rst factor is OP (1) and the second factor is dominated by N 1 X N T 1=2 ei 2 p=2 N 1 X = T N N 1 X = T N p=2 1 0 ei ei t=1 i=1 t=1 p which is OP (1) given the assumption that E jeit j using in particular the fact that E kFt k2 T X e2it t=1 !p=2 N T 1 XX p eit ; NT t=1 t=1 M: The third term can be bounded similarly M < 1: Proof of (iii). We can write F~t 0 1 iH e~it = eit 1 ~i HFt H 10 0 i F~t ; which implies that T N 1 XX j~ eit jp NT 3p 1 t=1 i=1 +3p T N 1 XX jeit jp + 3p NT 1 1 N t=1 i=1 N X ~i H 10 1 p i i=1 N 1 X k i kp H N 1 T i=1 T X F~t p 1 p T 1X ~ Ft T HFt p t=1 : t=1 The …rst term is OP (1) given that E jeit jp = O (1); the second term is OP (1) since E k i kp = O (1) and given part (i); and the third term is OP (1) given parts (ii) and (iii), since in particular T 1X ~ Ft T t=1 p T 1X HFt + F~t T HFt p 2p T T 1X 1X ~ kHkp kFt kp + Ft T T 1 t=1 t=1 t=1 HFt p ! = OP (1) : Proof of Theorem 4.1. We verify Condition A*-F*. We start with Condition A*. Since eit = e~it it where i.i.d. (0; 1) across (i; t), part a) follows immediately. For part b), note that st = 2 P PT N 1 1 PT 1 PN 2 = 1 2 e ~ e ~ 1 (t = s) ; which implies that e ~ : This expression is it is i=1 t;s st t=1 N i=1 it N T T 1 PT 1 PN 4 bounded by T t=1 N i=1 e~it ; which is OP (1) under our assumptions by an application of Lemma C.1 (iii) with p = 4: For c), note that for any (t; s), E N 1 X p (eit eis N i=1 2 E (eit eis )) = N N 1 XX Cov N i=1 j=1 46 eit eis ; ejt ejs : For the wild bootstrap where eit = e~it Cov it , with eit eis ; ejt ejs = e~it e~is e~jt e~js Cov i.i.d. across (i; t), it it is ; jt js e~2it e~2is V ar ( it is ) if i = j ; 0 if i 6= j = which implies that N 1 X p (eit eis N i=1 E 2 E (eit eis )) N 1 X 2 2 = e~it e~is V ar ( N it is ) : i=1 Thus, condition A*(c) becomes N T T 1 XX 1 X 2 2 e~it e~is V ar ( it | {z T2 N t=1 s=1 i=1 for some constants is ) } N 1 X N T 1X 2 e~it T t=1 i=1 !2 C N T 1 1 XX 4 e~it = OP (1) ; NT i=1 t=1 and C, which holds given Lemma C.1 (iii) with p = 4: For Condition B*(a), under the wild bootstrap, in particular the bootstrap time series independence, we have that ! T T T T N 1 X X ~ ~0 1 X ~ ~0 1 X ~ ~0 1 X 2 Fs Ft st = Ft Ft tt = Ft Ft e~it T T T N t=1 s=1 t=1 t=1 i=1 2 31=2 ! ! !1=2 " #1=2 1=2 2 T T N T T X N X X X X 4 1 1 X ~ ~0 2 1 1 1 1 4 Ft Ft e~2it 5 F~t e~4it ; T T N T TN t=1 t=1 t=1 i=1 t=1 i=1 which is OP (1) under our assumptions. For Condition B*(b), we have that " 2 T N N T T X 1X1 1 XX 1 X p z^s (eit eis E (eit eis )) = (eit eis E z^s p T T T N s=1 i=1 N i=1 t=1 t=1 s=1 1 0 N N T T T X X 1 X 1 XX 0 1 1 z^s z^l E @ p (eit eis E (eit eis )) p ejt ejl E ejt ejl A ; T T N i=1 N j=1 t=1 s=1 l=1 T 1X E T = where 0 N 1 X E @p (eit eis N i=1 N 1 X E (eit eis )) p ejt ejl N j=1 E 1 N N 1 XX ejt ejl A = Cov N E (eit eis )) eit eis ; ejt ejl : i=1 j=1 Using the properties of the wild bootstrap, in particular the bootstrap cross sectional independence, we can show that Cov eit eis ; ejt ejl = 0 when i 6= j for any t; s; l: When i = j; Cov eit eis ; ejt ejl = Cov (eit eis ; eit eil ) = 0 V ar (eit eis ) = e~2it e~2is V ar ( 47 it if s 6= l ) if s = l: is # 2 It follows that T 1X E T = = 1 T t=1 T X t=1 2 E (eit eis )) T N 1X 0 1 X 2 2 z^s z^s e~it e~is V ar ( T N N 1 X N i=1 " T N 1 XX p z^s (eit eis T N s=1 i=1 s=1 it is ) i=1 T 1X 2 e~it T t=1 N T 1 1 XX 4 e~it NT i=1 t=1 ! T 1X k^ zs k2 e~2is T s=1 #1=2 " 2 ! 
N 1 X N i=1 N X 41 N i=1 T T N X X 1X 4 1 1 k^ zs k e~4is T NT s=1 i=1 s=1 ! T 1X 0 2 z^s z^s e~is T t=1 s=1 !2 31=2 2 !2 31=2 T N T X X X 1 1 1 e~2it 5 4 k^ zs k2 e~2is 5 T N T T 1X 2 e~it T #1=2 ! t=1 i=1 s=1 : This expression is OP (1) under our assumptions. Next consider Condition B*(c). We have that ! 2 2 T N T X X 1 X ~ 0 et 0 1 0 ~ p z^t = ^t E E p i eit z TN T t=1 N t=1 i=1 ! ! T N X X 1 0 2 ~ ~iE e = z^t tr z^t0 i it TN t=1 i=1 ! T N T N X X X X 2 1 0 2 1 ~ ~ i e~2 = 1 ~ i e~2 z^t0 z^t k^ z k = t i it it TN T N t=1 t=1 i=1 i=1 0 ! !2 11=2 1=2 T T N X X X 2 1 1 ~ i e~2 A : @1 k^ zt k 4 it T T N t=1 t=1 i=1 | {z } =OP (1) under our assumptions But by Cauchy-Schwartz, T 1X T N 1 X ~ i N t=1 2 i=1 e~2it !2 N T N 1 X ~ 41 X 1 X 4 e~it = OP (1) ; i N T N t=1 }| {z i=1 } | i=1{z OP (1) under our conditions. In particular, note that N 1 X ~ i N i=1 4 2 3 N 1 X H N =OP (1) 4 10 i i=1 N 1 X ~ + i N H i=1 10 4 i ! and use Lemma C.1 (ii) with p = 4 to bound the second term. The …rst term is bounded by the assumptions on f i g. For Condition B*(d), we have that E ~ 0e p t N 2 =E N 1 X~ p i eit N i=1 2 = N N 1 X X ~0 ~ i jE N | i=1 j=1 48 N 1 X ~0 ~ 2 eit ejt = ~it ; i ie {z } N i=1 =0 for i6=j since Cov it ; jt = 0 and V ar ( it ) = 1: Thus, Condition B*(d) becomes ! T N N T 1 X 1 X ~0 ~ 2 1 X ~ 2 1X 2 = e~it ~it = i i ie T N N T t=1 t=1 t=1 i=1 i=1 !1=2 0 !2 11=2 !1=2 N N T N X X X X 4 4 1 1 1 1 2 ~i ~i @ e~it A N N T N T 1X E T ~ 0e p t N 2 i=1 t=1 i=1 i=1 N T 1 X1X 4 e~it N T t=1 i=1 !1=2 = OP (1) ; under our assumptions, by an application of Lemma C.1. For Condition B*(e), we need to show that T 1X T A t t 0 E t t 0 T N N 1 X 1 X X ~ ~0 i j eit ejt T N = t=1 t=1 E eit ejt = oP (1) : i=1 j=1 This expression has mean zero under the bootstrap measure by construction. So, it su¢ ces to show that its variance tends to zero in probability. Take the case where the number of factors r is equal to 1, for simplicity. Then, V ar (A ) = V ar T 1X T t 2 E t ! 2 t=1 T T N 1 XX 1 X ~ ~ ~ ~ = 2 i j l k Cov T N2 t=1 s=1 eit ejt ; els eks = 0 if t 6= s, for any Using the properties of the wild bootstrap, we can show that Cov (i; j; k; l) ; and Cov eit ejt ; elt ekt 8 4 2 < e~it V ar it if i = j = k = l 2 2 e ~ e ~ , if i = k 6= j = l = : it jt : 0, otherwise This implies that for some …nite constant ; 0 T N 1 X 1 @X ~ 4 4 V ar (A ) = ~it V ar ie T2 N2 t=1 T 1 X T2 t=1 = OP provided 1 N PN ~ 4 i=1 i = OP (1) and 1 T 1 NT i=1 N 1 X ~2 2 ~it ie N i=1 !2 2 it eit ejt ; els eks : i;j;k;l + N X i6=j 1 ~ i ~ j e~2 e~2 A it jt N T N 1 1 X ~4 1 X X 4 e~it i TN NT i=1 t=1 i=1 = oP (1) ; PN PT ~4it t=1 e = OP (1), which holds under our moment asP P ~ 2 ~2 for the wild sumptions by an application of lemma C.1 with p = 4: Thus, = T1 Tt=1 N1 N it i=1 i e i=1 bootstrap. Condition F* is satis…ed because by Bai and Ng (2006), Next, we verify Condition C*. For t = 1; : : : ; T !P Q Q0 . h, let "t+h = ^"t+h vt+h , where vt+h i.i.d.(0; 1). Part a) follows as Condition B*(b), using the independence between "t+h and eit : For part b), we have 49 that T h 1 X ~ 0 et p p "t+h T t=1 N E 2 = = 1 E TN T Xh N X t=1 ~ie it ! 
2 "t+h 80 i=1 < TXh TXh 1 @ "t+h "s+h E : TN t=1 s=1 N X i=1 By the independence between feit g and f"s g, we have that 1 0 N N T h T h XX 0 1 1 BX X C ~ ~j E e e E "t+h "s+h = @ i it js A = TN TN {z } i=1 j=1 | | {z } t=1 s=1 =0 if t6=s = T h N 1 X 2 1 X ~ ^"t+h i T N t=1 2 e~2it =0 if i6=j or t6=s T h 1 X 4 ^"t+h T t=1 i=1 !1=2 0 @1 T T Xh 119 !0 N = X ~0 e ~ j e AA : @ i it js ; j=1 T Xh t=1 N 1 X ~ i N t=1 E 2 "t+h 2 N X ~0 ~iE i 2 eit i=1 e~2it i=1 !2 11=2 A = OP (1) ! OP (1) ; under our assumptions and using the same arguments used above to show condition B*(b). Part c) follows because st that T h T 1 XX ~ Fs "t+h T = 0 for t 6= s and by repeated application of Cauchy-Schwartz inequality, we have st t=1 s=1 1 T T Xh t=1 F~t "t+h t=1 2 ! T h N 1 X ~ 1 X 2 Ft "t+h e~it tt = T N t=1 i=1 11=4 0 3 !2 1=2 B #1=2 C " N T T h T X N X 41 X C B1 X 1 X 2 5 1 1 4 4 B F~t "t+h C ; e~it e~it C BT N T TN A @ t=1 t=1 t=1 i=1 i=1 {z }| {z } | {z } | T 1X~ = Ft "t+h T !1=2 2 41 T T Xh t=1 OP (1) OP (1) =OP (1) P 4 = O under our assumptions, provided in particular that T1 Tt=1h "t+h P (1) in probability. For this, P T h 1 4 it su¢ ces that T t=1 E "t+h = OP (1) : But by the properties of the wild bootstrap on "t+h , we have that T h 1 X E T t=1 4 "t+h T h 1 X 4 = ^"t+h T 4 vt+h E | {z t=1 } C<1 by assumption T h 1 X 4 ^"t+h ; T t=1 which is veri…ed under our conditions. Finally. we verify Condition D*. E "t+h = 0 by construction. 1=2 PT h P 2 1 PT h 4 Moreover, we have that T1 t=1 E "t+h = T1 Tt=1h ^"2t+h "t+h = OP (1) ; under our t=1 ^ T assumptions. So, part a) is veri…ed. For part b) of Condition D*, we need to verify that the bootstrap CLT result holds for the wild bootstrap. By the Cramer-Wold device, it su¢ ces to show that T h 1 X 0 p ` ( T t=1 | ) 1=2 {z wt z^t "t+h !d N (0; 1) ; } in probability for any ` such that `0 ` = 1. Note that wt is an heterogeneous array of independent random variables (given that "t+h is conditionally independent but heteroskedastic). Thus, we apply 50 a CLT for heterogeneous independent arrays. Note that E (wt ) = 0 and ! ! T h T h 1 X 1 X 1=2 0 p p V ar wt = ` ( ) V ar z^t "t+h ( ) 1=2 ` T t=1 T t=1 ! T h X 1 = `0 ( ) 1=2 z^t z^t0 ^"2t+h ( ) 1=2 ` = 1: T t=1 Thus, it su¢ ces to verify Lyapunov’s condition, i.e. for some r > 1; that wt = `0 ( ) 1=2 T h 1 X E jwt j2r = Tr t=1 z^t "t+h ; we have that 1 Tr T h 1 X 0 ` ( 1T Tr C ) 1 k`k 1 Tr 1 2r ( ( ) 0 z^t 1=2 2r 2r ) 1=2 2r h t=1 E T h 1 X k^ zt k2r j^"t+h j2r E jvt+r j2r | {z } T T h 1 X k^ zt k4r T t=1 = PT 1 T jwt j2r !P 0: Noting j^"t+h j2r t=1 by an application of Lemma C.1. Since to 1=2 t=1 1 0 0 1 Tr PT M <1 !1=2 h ^t z^t0 ^"2t+h , t=1 z T h 1 X j^"t+h j4r T t=1 !1=2 = OP under our assumptions 1 Tr 1 = oP (1) ; converges > 0, by Bai and Ng (2006). Thus, Condition E* is satis…ed. Condition F* was veri…ed in the main text. 51 References [1] Bai, J., 2003. “Inferential theory for factor models of large dimensions,” Econometrica, 71, 135172. [2] Bai, J., 2009. “Panel data models with interactive …xed e¤ects,” Econometrica, 77, 1229-1279. [3] Bai, J. and S. Ng, 2002. “Determining the number of factors in approximate factor models,” Econometrica, 70, 191-221. [4] Bai, J. and S. Ng, 2006. “Con…dence intervals for di¤usion index forecasts and inference with factor-augmented regressions,” Econometrica, 74, 1133-1150. [5] Bai, J. and S. Ng, 2011. “Principal Components Estimation and Identi…cation of the Factors", manuscript, Columbia University. [6] Boivin, J. and B. S. 
Bernanke, 2003, "Monetary policy in a data-rich environment", Journal of Monetary Economics, 50, 525-546. [7] Chamberlain, G. and M. Rothschild, 1983. “Arbitrage, factor structure and mean-variance analysis in large asset markets,” Econometrica, 51, 1305-1324. [8] Connor, G. and R. A. Korajczyk, 1986. “Performance measurement with the arbitrage pricing theory,” Journal of Financial Economics, 15, 373-394. [9] Connor, G. and R. A. Korajczyk, 1993. “A test for the number of factors in an approximate factor model,” Journal of Finance, XLVIII, 1263-1291. [10] Eichengreen, B. A. Mody, M. Nedeljkovic, and L. Sarno, 2009. “How the Subprime Crisis Went Global: Evidence from Bank Credit Default Swap Spreads", NBER working paper 14904. [11] Gospodinov, N. and S. Ng, 2011. “Commodity prices, convenience yields and in‡ation,” emph Review of Economics and Statistics, forthcoming. [12] Ludvigson, S. and S. Ng, 2007. “The empirical risk return relation: a factor analysis approach,” Journal of Financial Economics, 83, 171-222. [13] Ludvigson, S. and S. Ng, 2009a. “Macro factors in bond risk premia,”Review of Financial Studies, 22, 5027-5067. [14] Ludvigson, S. and S. Ng, 2009b. “A factor analysis of bond risk premia,” Handbook of Applied Econometrics, forthcoming. [15] Onatski, A., 2011. “Asymptotics of the principal components estimator of large factor models with weakly in‡uential factors,” manuscript, University of Cambridge. 52 [16] Shintani, M. and Z-Y Guo, 2011. “Finite sample performance of principal components estimators for dynamic factor models: asymptotic vs. bootstrap approximations,” manuscript, Vanderbilt University. [17] Stock, J. H. and M. Watson, 2002. “Forecasting using principal components from a large number of predictors,” Journal of the American Statistical Association, 97, 1167-1179. [18] Yamamoto, Y., 2011. “Bootstrap inference for impulse response functions in factor-augmented vector autoregressions,” manuscript, University of Alberta. 53 Table 1: Bias and coverage rate of 95% CIs for delta - Homoskedastic cases T = 50 N = 50 T = 100 T = 200 T = 50 N = 100 T = 100 T = 200 T = 50 N = 200 T = 100 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T = 200 Bias DGP 1 bias estimate WB -0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 alpha = 0 homo, homo 0.00 0.00 0.00 Coverage rate OLS BC True factor WB 94.0 91.0 93.8 95.4 93.1 95.5 95.0 91.9 94.3 94.5 93.1 94.3 95.7 94.0 95.4 94.2 93.3 93.7 95.7 95.1 95.0 94.4 93.8 94.6 94.1 93.7 94.1 96.9 96.2 95.8 96.0 97.0 95.1 96.7 94.7 94.3 -0.08 -0.05 -0.06 -0.09 -0.03 -0.07 -0.06 -0.03 -0.05 -0.05 -0.03 -0.04 Bias DGP 2 bias estimate WB -0.17 -0.09 -0.12 -0.14 -0.09 -0.11 -0.13 -0.10 -0.10 -0.11 -0.05 -0.09 alpha = 1 homo, homo -0.09 -0.05 -0.07 Coverage rate OLS BC True factor WB 71.1 83.0 93.8 66.0 88.1 95.5 50.7 86.5 94.3 84.7 88.6 94.3 83.1 90.4 95.4 79.3 90.2 93.7 88.3 90.1 94.6 89.8 92.1 94.6 88.1 92.7 94.1 90.9 92.7 90.7 93.8 93.7 92.0 93.8 94.5 94.3 Table 2. 
Bias in estimation of alpha - More general cases DGP 3 alpha = 1 hetero, homo DGP 4 alpha = 1 hetero, hetero DGP 5 alpha = 1 hetero, AR+hetero DGP 6 alpha = 1 hetero, CS+homo T = 50 N = 50 T = 100 T = 50 N = 100 T = 100 T = 200 bias estimate WB -0.16 -0.09 -0.12 -0.14 -0.09 -0.11 bias estimate WB -0.17 -0.10 -0.13 bias estimate WB bias estimate WB T = 50 N = 200 T = 100 T = 200 T = 200 -0.13 -0.10 -0.10 -0.11 -0.05 -0.09 -0.09 -0.05 -0.07 -0.07 -0.05 -0.06 -0.09 -0.03 -0.07 -0.06 -0.03 -0.05 -0.05 -0.03 -0.04 -0.15 -0.10 -0.12 -0.14 -0.10 -0.11 -0.12 -0.05 -0.10 -0.10 -0.06 -0.08 -0.08 -0.06 -0.07 -0.10 -0.03 -0.08 -0.06 -0.03 -0.06 -0.05 -0.03 -0.04 -0.19 -0.09 -0.13 -0.16 -0.10 -0.11 -0.15 -0.10 -0.11 -0.13 -0.05 -0.09 -0.10 -0.06 -0.08 -0.08 -0.06 -0.07 -0.10 -0.03 -0.08 -0.07 -0.03 -0.06 -0.05 -0.03 -0.04 -0.23 -0.11 -0.10 -0.21 -0.12 -0.09 -0.20 -0.12 -0.09 -0.15 -0.07 -0.08 -0.13 -0.07 -0.07 -0.11 -0.08 -0.06 -0.11 -0.04 -0.07 -0.08 -0.04 -0.05 -0.07 -0.04 -0.04 Table 3: Coverage rate of 95% CIs for delta - More general cases DGP 3 alpha = 1 hetero, homo DGP 4 alpha = 1 hetero, hetero DGP 5 alpha = 1 hetero, AR + hetero DGP 6 alpha = 1 hetero, CS + homo T = 50 N = 50 T = 100 T = 200 T = 50 N = 100 T = 100 T = 200 T = 50 N = 200 T = 100 T = 200 63.5 81.5 93.3 57.3 84.3 93.6 45.6 84.0 93.5 76.5 83.1 90.2 78.4 87.6 93.1 76.2 88.0 92.9 80.6 83.5 88.9 85.0 89.6 94.3 86.0 89.6 92.8 WB 93.1 92.3 91.1 93.5 93.4 92.1 92.6 94.0 92.5 OLS BC 58.8 78.5 93.3 51.8 81.1 93.6 38.7 82.1 93.5 73.0 82.5 90.2 75.3 87.1 93.1 73.1 86.6 92.9 78.7 83.2 88.9 84.5 89.3 94.3 84.6 89.3 92.8 WB 93.7 92.8 91.8 92.9 93.8 92.3 92.6 93.6 93.1 OLS BC 54.7 74.6 93.3 49.0 79.7 93.6 36.7 81.1 93.5 72.4 80.8 90.2 72.9 86.3 93.1 72.6 87.8 92.9 76.9 81.1 88.9 83.7 88.5 94.3 83.4 88.1 92.8 WB 91.2 90.8 91.1 91.8 92.8 92.4 91.3 93.1 92.3 OLS BC 42.2 68.3 93.3 30.6 67.2 93.6 13.7 61.4 93.5 65.8 77.5 90.2 65.2 81.7 93.1 54.7 81.1 92.9 76.1 80.0 88.9 79.5 87.9 94.3 78.3 87.4 92.8 81.5 75.1 59.4 87.5 84.5 80.3 88.6 90.2 87.3 OLS BC True factor True factor True factor True factor WB