SERIES PAPER DISCUSSION IZA DP No. 6318 Exponent of Cross-sectional Dependence: Estimation and Inference Natalia Bailey George Kapetanios M. Hashem Pesaran January 2012 Forschungsinstitut zur Zukunft der Arbeit Institute for the Study of Labor Exponent of Cross-sectional Dependence: Estimation and Inference Natalia Bailey University of Cambridge George Kapetanios Queen Mary, University of London M. Hashem Pesaran University of Cambridge, University of Southern California and IZA Discussion Paper No. 6318 January 2012 IZA P.O. Box 7240 53072 Bonn Germany Phone: +49-228-3894-0 Fax: +49-228-3894-180 E-mail: iza@iza.org Any opinions expressed here are those of the author(s) and not those of IZA. Research published in this series may include views on policy, but the institute itself takes no institutional policy positions. The Institute for the Study of Labor (IZA) in Bonn is a local and virtual international research center and a place of communication between science, politics and business. IZA is an independent nonprofit organization supported by Deutsche Post Foundation. The center is associated with the University of Bonn and offers a stimulating research environment through its international network, workshops and conferences, data service, project support, research visits and doctoral program. IZA engages in (i) original and internationally competitive research in all fields of labor economics, (ii) development of policy concepts, and (iii) dissemination of research results and concepts to the interested public. IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available directly from the author. IZA Discussion Paper No. 6318 January 2012 ABSTRACT Exponent of Cross-sectional Dependence: Estimation and Inference* An important issue in the analysis of cross-sectional dependence which has received renewed interest in the past few years is the need for a better understanding of the extent and nature of such cross dependencies. In this paper we focus on measures of crosssectional dependence and how such measures are related to the behaviour of the aggregates defined as cross-sectional averages. We endeavour to determine the rate at which the cross-sectional weighted average of a set of variables appropriately demeaned, tends to zero. One parameterisation sets this to be O(N 2α-2), for 1/2 < α ≤ 1. Given the fashion in which it arises, we refer to as the exponent of cross-sectional dependence. We derive an estimator of from the estimated variance of the cross-sectional average of the variables under consideration. We propose bias corrected estimators, derive their asymptotic properties and consider a number of extensions. We include a detailed Monte Carlo study supporting the theoretical results. Finally, we undertake an empirical investigation of using the S&P 500 data-set, and a large number of macroeconomic variables across and within countries. JEL Classification: Keywords: C21, C32 cross correlations, cross-sectional dependence, cross-sectional averages, weak and strong factor models, Capital Asset Pricing Model Corresponding author: M. Hashem Pesaran Trinity College University of Cambridge CB2 1TQ Cambridge United Kingdom E-mail: mhp1@cam.ac.uk * We are grateful to Jean-Marie Dufour and Oliver Linton for helpful comments and discussions. Natalia Bailey and Hashem Pesaran acknowledge financial support under ESRC Grant ES/I031626/1. 1 Introduction Over the past decade there has been a resurgence of interest in the analysis of cross-sectional dependence applied to households, firms, markets, regional and national economies. Researchers in many fields have turned to network theory, spatial and factor models to obtain a better understanding of the extent and nature of such cross dependencies. There are many issues to be considered: how to test for the presence of cross-sectional dependence, how to measure the degree of cross-sectional dependence, how to model cross-sectional dependence, and how to carry out counterfactual exercises under alternative network formations or market inter-connections. Many of these topics are the subject of ongoing research. In this paper we focus on measures of cross-sectional dependence and how such measures are related to the behaviour of cross-sectional averages or aggregates. Perhaps, the simplest and most concise way to motivate the need for determining the extent of cross-sectional dependence is to view the matter from a simple statistical viewpoint. Let xit denote a double array of random variables indexed by i = 1, 2, ..., N and t = 1, 2, ..., T, over space and time, respectively. Then, a variety of analyses focus on weighted averages of xit over i. Examples include the construction of portfolios, when xit are asset returns, or aggregate macroeconomic variables, when xit are firm or consumer level data, such as firm sales, or individual consumption. Weighted averages P 1 take the form x̄wt = N i=1 wN i xit , where the weights wN i are granular in the sense that wN i = O N . For simplicity, we set wN i = N1 and x̄wt = x̄t . Then, it is of considerable interest to determine the behaviour of x̄t and, in particular, the rate at which x̄t , when appropriately standardised, tends to zero. In the case of asset returns this determines the extent to which risk, associated with investing in particular portfolios of assets, is diversifiable. In the case of firm sales this is of interest in relation to the effect of idiosyncratic, firm level, shocks onto aggregate macroeconomic variables such as GDP. In the case where xit are cross-sectionally independent, using standard law of large numbers, one obtains the result that x̄t = Op N −1/2 . However, in the more general and realistic case where xit are cross-sectionally correlated we have1 V ar (x̄t ) = N N N 1 X X 1 X 2 2 σij,x = N −1 σ̄xN + τN , σ + i,x N2 N2 i=1 (1) i=1 j=1,i6=j 2 = V ar(x ) and σ where σi,x it ij,x = Cov(xit , xjt ), assuming, for simplicity, stationarity, over time, for xit . The above result suggests two important conditions under which standard law of large numbers may fail to apply to the cross-sectional averages. One relates to the absence of finite variances for individual P PN xit ,2 while the other to the rate at which the remainder term, τN = N12 N i=1 j=1,i6=j σij,x , grows with N . Determining the extent of cross-sectional dependence, relates directly to the second way, in which standard law of large numbers may not be applicable, and lead to phenomena of considerable interest as discussed above. It is, therefore, of interest to investigate the rate at which τN declines with N . We may parametrise this by letting τN = O N 2α−2 , with α measuring the degree of cross-sectional 2 σ̄xN dependence. We note that V ar (x̄t ) cannot decline at a rate lower than N −1 since N = O N −1 . P PN Consider now the estimator of τN , based on sample covariances, σ̂ij,x , namely τ̂N = N12 N i=1 j=1,i6=j σ̂ij,x . 2 P P T T It is easily seen that τ̂N = σ̂x̄2 + Op N −1 , where σ̂x̄2 = T1 t=1 x̄t − T1 t=1 x̄t . Therefore, the rate at which V ar (x̄t ) tends to zero with N , is governed by the rate at which σ̂x̄2 tends to zero. But 1 (1) can be equivalently stated in terms of correlations, but since correlations are more complex to handle we choose to carry out the analysis in terms of covariances. 2 This way has been recently explored, in the case of firm sales data, by Gabaix (2011), who noted that the failure of such data to have finite variances can lead to idiosyncratic firm shocks determining, to a considerable extent, GDP growth volatility. 2 this rate can not be faster than N −1 , and hence the range of interest for α must lie in the range −1 < 2α − 2 ≤ 0, or 1/2 < α ≤ 1, This paper focusses on the problem of identification, estimation and inference regarding α, which we refer to as the exponent of cross-sectional dependence, bearing in mind that α is defined by τN = O N 2α−2 . However, it is important to note that different problems may require the determination of the exponent of different summaries of the covariance matrix. For example, a summary focused upon by Chudik, Pesaran, and Tosetti (2011) is the column sum norm of the covariance matrix. The exponent P α of that is given by the parametrisation maxi N j=1 |σij,x | = O (N ). It is clear that this exponent need not be the same as the one relating to τN . Therefore, while the exponent of cross-sectional dependence is of great interest for the phenomena outlined above, one needs to be clear about the motivation for using this measure of cross-sectional dependence. Also, other measures of cross-sectional dependence can be considered. An important example is the measure based on the upper bound for V ar (x̄t ), which is N −1 λmax (ΣN ), where ΣN = E (xt x0t ), xt = (x1t , x2t , ..., xN t )0 and λmax (ΣN ) denotes the maximum eigenvalue of ΣN . λmax (ΣN ) is an object of considerable interest in the statistical literature on large data sets. However, work in the area (see, e.g., Yin, Bai, and Krishnaiah (1988), Bai and Silverstein (1998), Hachem, Loubaton, and Najim (2005a) and Hachem, Loubaton, and Najim (2005b)) suggests that as a statistical measure of cross-sectional dependence λmax (ΣN ) could be difficult to analyse especially for temporally and cross-sectionally dependent data. This is partly due to the fact that estimates of λmax (ΣN ) based on sample estimates of ΣN , could be very poor when N is large relative to T , which is the type of data sets often encountered in macroeconomics and finance. Therefore, we do not pursue its analysis in this paper, although we acknowledge the need for further investigations in this area, comparing the performance of estimates of cross-sectional dependence based on α and on N −1 λmax (ΣN ). The above measures of cross-sectional dependence are related to the degree of pervasiveness of factors in unobserved factor models often used in the literature to model cross-sectional dependence.3 Consider the following canonical factor model xit = ai + β 0i f t + uit , where f t is the m × 1 vector of unobserved factors (m being fixed), and β i = (βi1 , βi2 , ..., βim )0 is the associated vector of factor loadings, and Cov (uit , ujt ) = ωij . The extent of cross-sectional dependence in xit crucially depends on the nature of factor loadings. The degree of cross-sectional dependence will be strong if β i is bounded away from 0 and the average value of β i is different from zero. In such a case P PN PN −2 supi N −1 N j=1 |σij,x | = O (1) and τN = N i=1 j=1,i6=j σij,x = O (1), for all i, which yields α = 1. However, other configurations of factor loadings can also be entertained, that yield values of α in the range (1/2, 1]. Since both ft and β i are unobserved, taking a strong stand on a particular value of α might not be justified empirically. Accordingly, Chudik, Pesaran, and Tosetti (2011), Kapetanios and Marcellino (2010) and Onatski (2011) have considered an extension of the above factor model which allows for a wider spectrum of cross-sectional dependence behaviours by specifying that the factor loadings β i may be functions of N and decline in magnitude as N → ∞ allowing for the possibility of factors having a weaker effect than is the case for standard factor models.4 One such formulation α−1 ), for any α < 1. This specification assumes that factor loadings decline with N , and β i = O(N 0 P PN PN P 0 −1 2α−2 ), so long has the same order for N −2 N N −1 N i=1 j=1 β i β j = N i=1 β i j=1 β j = O(N 3 Factor models have a long pedigree both as a conceptual device for summarising multivariate data sets as well as an empirical framework with sound theoretical underpinnings both in finance and economics. Recent econometric research on factor models include Bai and Ng (2002), Bai (2003), Stock and Watson (2002), and Pesaran (2006). 4 While Chudik, Pesaran, and Tosetti (2011) and Kapetanios and Marcellino (2010) have considered the case where 1/2 < α < 1, Onatski (2011) has focused on the case where a ≤ 1/2. 3 P PN as N −2 N i=1 j=1,i6=j ωij is of smaller order of magnitude than τN . The latter condition is satisfied under an approximate factor model, so long as 2(α − 1) > −1, or if α > 1/2. Although mathematically convenient, the assumption that all factor loadings vary with N (almost uniformly) is rather restrictive in many economic applications. Therefore, we will not consider it in detail, but only briefly as an alternative formulation. In this paper we consider a baseline formulation where we allow β i to be fixed in N , but assume that only N α of the N factor loadings are individually important, in the sense that they are bounded away from zero in absolute value.5 More specifically, we consider βik = vik for i = 1, 2, ..., [N αk ] , βik = i−[N αk ]+1 , ck ρk for i = [N αk (2) ] + 1, [N αk ] + 2, ..., N , for k = 1, 2, ..., m, where [N αk ] is the integer part of N αk , 1/2 < αk ≤ 1, |ρk | < 1, ck is a finite constant, and vik ∼ iid(µvk , σv2k ), with µvk 6= 0 and σv2k > 0. In effect, the factor loadings are grouped into two categories, a strong category with effects that are bounded away from zero, and a weak category with transitory effects that tend to zero exponentially. The focus of our analysis is on α = maxk (αk ), which is the cross-sectional exponent as defined above. As we shall see, since we are interested in the behaviour of cross-sectional averages, our proposed estimator of α will be invariant to the ordering of the factor loadings within each group. Also the exponential decay assumed for the second group of factor loadings can be relaxed and replaced by the absolute summability condition, PN i=[N αk ]+1 |βik | < ∞, which is again invariant to the ordering of the units with weak dependence on factors. It is important to recognize that while this formulation is the one we focus on, alternative formulations can give the same exponent. While some details of our inferential theory does depend on the nature of the specific formulation adopted, it is clear that very similar inferential procedure with similar asymptotic and small sample properties can be easily developed for alternative formulations. In cases where the common factors are observed the strength of the factors as measured by α can be estimated directly in terms of the number of statistically significant βi coefficients. Denoting the number of such statistically significant estimates of the loadings associated with the ith factor by M̂i , the estimates of αi can then be obtained as ln(M̂i )/ ln(N ). We shall consider such a ”direct” estimate of α in our empirical application to Capital Asset Pricing Model. Following the theoretical line of reasoning advanced above, in this paper we propose the use of the variance of the cross-sectional average of the observed data, x̄t , to estimate and carry out inference on α. Focusing, for simplicity, on a single factor representation, we show that6 V ar(x̄t ) = κ2 N 2α−2 + N −1 cN + O(N α−2 ), where κ2 = σf2 µ2v , σf2 is the variance of the factor process, µv is the mean of the factor loadings, and cN is a bias term that is analysed in detail in the main body of the paper. Using this relationship, we can provide a feasible estimator for α and derive inferential theory for it. The property of our proposed estimator depends on the choice of κ. For an arbitrary but bounded value of κ, our estimator is consistent but its rate of convergence at 1/ ln(N ) is rather slow. However, given the identified nature of σf2 in factor models it seems more sensible to consider estimating α for a given value of σf2 . In the factor literature the factor loadings are typically identified by setting σf2 = 1, assuming that the factors are strong, and the idiosyncratic components are weakly cross correlated. However, since the aim here is to estimate α, it is more sensible to fix κ2 . The rationale behind this alternative normalization, and its empirical implications will be discussed in detail. Further, we propose a second order bias corrected 5 6 Note that α in our baseline formulation is not directly comparable to the α of the almost uniform loadings formulation. We consider a general multi-factor representation, in detail, in Section 3.2. 4 estimator that addresses the term cN , and derive the asymptotic distribution of both estimators for a given value of κ2 . We consider extensions that relate to the presence of multiple factors, potentially with multiple distinct exponents of cross-sectional dependence, temporal dependence in ft or uit , and weak cross-sectional dependence in uit . It is worth noting that our baseline estimator is equivalent to one obtained by setting up a regression framework whereby the logarithm of the partial sum process of estimated factor loadings is regressed upon the logarithm of the cross-sectional dimension of the partial sum. To illustrate the properties of the proposed estimators of α and their asymptotic distributions, we carry out a detailed Monte Carlo study that considers a battery of robustness checks. Finally, we provide a number of empirical applications investigating the degree of inter-linkages in real and financial variables in the global economy, the extent to which macroeconomic variables are interconnected across and within countries, with special reference to the US and UK economies in the second case, and present recursive estimates of α applied to excess returns on securities included in the Standard & Poor 500 index. In these applications to ensure that our estimates of α are invariant to the scales of measurement of the observations, xit , we base our analysis on the standardized variates, (xit − x̄i )/si , where x̄i and si are sample mean and standard deviation of xit , respectively, over the sample under consideration. The rest of the paper is organised as follows: Section 2 provides a formal characterisation of α in the context of a single factor model, and discusses potential estimation strategies. This section also presents the rudiments of the analysis of the variance of the cross-sectional average and motivates the baseline estimator and bias corrected versions of it. Sections 3-3.3 present the theoretical results of the paper. Section 3 provides the full inferential theory and discusses feasible estimation, including estimators for the variance of the estimator of cross-sectional dependence. Section 3.2 presents an extension of the theoretical analysis to a multi-factor setting and briefly touches upon an alternative specification of factor loadings. Section 3.3 considers the normalization σf2 µ2v = κ2 , and Section 4 discusses the conditions under which estimation of α and κ2 can be carried out jointly. Section 5 presents a detailed Monte Carlo study. The empirical applications are discussed in Section 6. Finally, Section 7 concludes. Proofs of all theoretical results are relegated to the Appendix. 2 Preliminaries and Motivations As noted in the Introduction, to characterize the degree of cross-sectional dependence in xit we use a possibly weak factor model. We begin with the following single factor specification xit = ai + βi ft + uit for i = 1, 2, ..., N ; t = 1, 2, ..., T, (3) where ft is an unobserved factor, βi are the associated factor loadings and ai are bounded constants such that supi |ai | < K < ∞. We make the following assumptions. Assumption 1 The factor loadings are given by βi = vi for i = 1, 2, ..., [N α ] , i−[N α ]+1 βi = cρ α (4) α , for i = [N ] + 1, [N ] + 2, ..., N , [N α ] where 1/2 < α ≤ 1, [N α ] is the integer part of N α , |ρ| < 1, and {vi }i=1 is an identically, independently distributed (IID) sequence of random variables with mean µv 6= 0, and variance σv2 < ∞. 5 Assumption 2 The factor, ft , follows a linear stationary process given by ft = ∞ X ψf,j νf,t−j , (5) j=0 where νf t is an IID sequence of random variables with mean zero and finite variance and uniformly finite ϕ-th moments for some ϕ > 4. We assume that ∞ X j ζ |ψf j | < ∞, j=0 such that {ζ(ϕ − 2)}/{2(ϕ − 1)} ≥ 1/2. ft is distributed independently of the idiosyncratic errors, uit0 , and the factor loadings, βi , for all i, t and t0 . Assumption 3 For each i, uit follows a linear stationary process given by ! ∞ ∞ X X uit = ψij ξjs νj,t−s , (6) s=−∞ j=0 where νit , i = ..., −1, 0, ..., t = 0, ..., is a double sequence of IID random variables with mean zero and uniformly finite variances, σν2i and uniformly finite ϕ-th moments for some ϕ > 4. We assume that sup i and sup i ∞ X j ζ |ψij | < ∞, (7) j=0 ∞ X |s|ζ |ξis | < ∞ (8) s=−∞ such that {ζ(ϕ − 2)}/{2(ϕ − 1)} ≥ 1/2. It is worth briefly commenting on these assumptions. Assumption 1 has been motivated in Section P 2 1, and implies that N −1 N N α−1 , which is more general than the standard assumption in i=1 βi = Op P 2 the factor literature that requires N −1 N i=1 βi to have a strictly positive limit (see, e.g., Assumption B of Bai and Ng (2002)). It is easy to see that the standard assumption is satisfied only if α = 1. Assumptions 2 and 3 are mostly straightforward specifications of the factor and error processes assuming a linear structure with sufficient restrictions to enable the use of central limit theorems. The only noteworthy part of these assumptions relates to the cross-sectional dependence of the error terms. Here, cross-sectional dependence is structured in a flexible linear way so as to mirror, to the extent possible, the conditions assumed for stationary (weak) time series dependence. From Assumption 3 it P∞ 2 P∞ 2 follows that E(u2it ) = σi2 = σν2i s=−∞ ξis s=0 ψis < ∞. A generalization of the factor model and the related Assumptions, 1 and 2, will be considered in Section 3.2. In the rest of this Section, we motivate our proposed estimator for α. We write (3) as xt = a + β ft + ut , where xt = (x1t , x2t , ..., xN t )0 , a = (a1 , a2 , ..., aN )0 , β= (β1 , β2 , ..., βN )0 and ut = (u1t , u2t , ..., uN t )0 . We also note that under the above assumptions, Σβ = E(ββ 0 ) − E(β)E(β 0 ), with λmax (Σβ ) < K < ∞; 6 Σu = E(ut u0t ), with λmax (Σu ) < K < ∞, E(ft ) = 0, E(ft2 ) = σf2 > 0, and ft and β are distributed independently. Hence xt − E(xt ) = βft + ut , Cov(xt ) = E (βft + ut ) (βft + ut )0 = E(ββ 0 )E(ft2 ) + E ut u0t = Σβ + E(β)E(β 0 ) σf2 + Σu . Consider now the cross-sectional averages of the observables defined by x̄t = τ 0N xt /N , where τ N is an N × 1 vector of ones. Therefore 2 0 2 −2 0 −2 0 2 τ N E(β) . (9) V ar(x̄t ) = N τ N Cov(xt )τ N = N τ N σf Σβ + Σu τ N +σf N But under (4), it follows that [N α ] N X X cρ α 1 vi + βi = [N ] [N α ] [N α ] i=1 1− α ρ(N −[N ]) 1−ρ i=1 ! = v̄N + O N −α [N α ] P[N α ] where v̄N = [N1α ] i=1 vi is Op (1) and for simplicity, we set vi = 0, for i > [N α ]. Note that the specification of βi for i > [N α ] need not be of the form given in (4). Any sequence of loadings, for P which N i=[N α ]+1 βi = Op (1) is acceptable. Hence N −1 τ 0N E(β) = µv N α−1 + O(N −1 ). Also N −2 τ 0N Σβ τ N =N −2 τ 01N Σβ (1) τ 1N ≤ N α−2 λmax (Σβ ) , where τ 1N is an [N α ] × 1 vector of ones and Σβ (1) is the upper [N α ] × [N α ] sub-matrix of Σβ . Using the above results in (9) we now have V ar(x̄t ) ≤ N α−2 σf2 λmax (Σβ ) + N −1 cN + κ2 N 2α−2 . where τ 0N Σu τ N = cN < K < ∞. N But, by assumption λmax (Σβ ) < K < ∞, and hence under 1 ≥ α > 1/2 we have V ar(x̄t ) = κ2 N 2α−2 + N −1 cN + O(N α−2 ). (10) (11) Depending on how σf2 is normalized, different estimators of α can be envisaged. Since, by assumption, µv 6= 0, a natural normalization would be σf2 = 1/µ2v or κ2 = 1. Then, a simple manipulation of (11) yields N −1 cN 2 2(α − 1) ln(N ) ≈ ln(σx̄ ) + ln 1 − σx̄2 N −1 cN ≈ ln(σx̄2 ) − , σx̄2 or α≈1+ 1 ln(σx̄2 ) cN − . 2 ln(N ) 2 [N ln(N )] σx̄2 7 (12) Note that the third term on the RHS of (12) is of smaller order of magnitude than the other two terms. In cases where α ≤ 1/2, the second term in the RHS of (11), that arises from the contribution of the idiosyncratic components, will be at least as important as the contribution of a weak factor, and, in consequence, α cannot be identified. However, for values of α > 1/2, α can be identified from (12) using a consistent estimator of V ar(x̄t ) = σx̄2 , given by σ̂x̄2 = T 1X (x̄t − x̄)2 , T (13) t=1 where x̄ = T PT −1 t=1 x̄t . A simple consistent estimator of α is given by α̂ = 1 + 1 ln(σ̂x̄2 ) . 2 ln(N ) (14) Further, in the case of exact factor models where Σu is a diagonal matrix, the third term in (12) can be estimated by N X −1 2 ĉN = N σ̂j2 = σ̄c N, j=1 where σj2 is the j th diagonal term of Σu and σ̂j2 is its estimator. This suggests the following modified estimator of α 2 σ̄c 1 ln(σ̂x̄2 ) N − . (15) α̃ = 1 + 2 ln(N ) 2 [N ln(N )] σ̂x̄2 Note that while ĉN , as an estimator for cN , is motivated by appealing to an exact factor model, it is also valid for mild deviations from this model as discussed in the next Section. The above estimators of α form the basis of the formal analysis that is carried out in Section 3. 3 Theoretical Derivations In this section we provide a formal analysis of our proposed estimators. 3.1 Main Results Our first set of theoretical results characterise the asymptotic behaviour of α̂. Introducing the ad 2 PN 2 2 1 PT 1 PT 2 = N −1 ditional notations, σ̄N σ , s = f − f , µf = E(ft ), and σf2 = t t i=1 i t=1 t=1 f T T E(ft − µf )2 , we have: Theorem 1 Let Assumptions 1-3 hold, m = 1 and α > 1/2. Then, ! 2 p σ̄ ∗ ∗ N min(N α , T ) 2 ln(N ) (α̂ − α ) − 2α−1 2 2 →d N (0, ω) N v̄N sf where ∗ α ≡ ∗ αN ln κ2 =α+ , 2 ln (N ) min(N α , T ) min(N α , T ) 4σv2 ω = lim Vf 2 + , N,T →∞ T Nα µ2v ∞ X 2 Cov f˜t2 , f˜t−i , Vf 2 = V ar f˜t2 + 2 (16) i=1 and f˜t = (ft − µf )/σf . 8 (17) This theorem shows that α̂, as an estimator of α, is subject to two sources of bias. The first relates to the term ln κ2 / ln (N ) in α∗ , which arises due to the way identification of the factor loadings depends on scaling of the factors as defined by σf2 . For example, this term vanishes if σf2 is normalised to 1/µ2v (assuming µv is non-zero). Under this normalization κ = 1, and α∗ = α. We will discuss the implications of this restriction, and potential ways to circumvent it under certain conditions, in Section 3.3.7 Clearly, κ2 is an important determinant of cross-sectional dependence in small samples, and while its value is irrelevant for the probability limit of α̂, as shown in Theorem 1, one may wish to focus on α∗ as a parameter of interest in small samples. The following corollary provides the asymptotic properties of α̂ when κ2 = 1. Corollary 1 Suppose Assumptions 1 to 3 hold, m = 1 and α > 1/2. Under the normalisation restriction κ2 = 1, ! 2 p σ̄ min(N α , T ) 2 ln(N ) (α̂ − α) − 2α−1N 2 2 →d N (0, ω) . N v̄N sf The second source of bias is the term estimator of this term is given by 2 σ̄c N N σ̂x̄2 2 σ̄N 2 s2 2α−1 N v̄N f which is unobserved. A first order accurate where N T 1 X 2 2 1X 2 c 2 σ̂i , σ̂i = ûit , σ̄N = N T t=1 i=1 ûit = xit − δ̂i x̃t , x̃t = x̄tσ̂−x̄ , and δ̂i denotes the OLS estimator of the regression coefficient of xit on x̃t . x̄ This suggests the following bias corrected estimator α̃ = α̂ − 2 σ̄c N . 2 ln(N )N σ̂x̄2 (18) The following theorem presents the asymptotic properties of α̃. Theorem 2 Let Assumptions 1-3 hold, m = 1 and α > 1/2. Then, as long as either α > 4/7, p min(N α∗ , T )2 ln(N ) (α̃ − α∗ ) →d N (0, ω) , T 1/2 N 4α−2 → 0 or where α∗ and ω are defined in (16) and (17), respectively. As noted above this estimator is only first order accurate since Theorem 2 only holds under the specified assumptions concerning α and the assumed relative rate of growth of N and T . Fortunately, a sec ond order bias correction is available. This amounts to estimating 2 σ̄N 2 s2 N 2α−1 v̄N f by 2 σ̄c N ln(N )N σ̂x̄2 1+ 2 σ̄c N N σ̂x̄2 . Accordingly, we define a second bias corrected estimator 2 σ̄c N α̌ = α̂ − 2 ln(N )N σ̂x̄2 2 σ̄c 1 + N2 N σ̂x̄ ! , (19) and give its asymptotic distribution in the following theorem. 7 Note that as shown in (51) the rate of convergence of α̂ to α without this restriction is a somewhat slow ln(N ). 9 Theorem 3 Suppose Assumptions 1 to 3 hold, m = 1 and α > 1/2. Then, p min(N α∗ , T )2 ln(N ) (α̌ − α∗ ) →d N (0, ω) , where α∗ and ω are defined in (16) and (17), respectively. We consider α̃, even though α̌ provides a more comprehensive bias correction because small sample evidence suggests that α̃ may outperform α̌ for values of α in the middle of the admissible range (1/2, 1]. Obviously, equivalent Corollaries to Corollary 1 hold for α̃ and α̌. The final result of this subsection relates to providing a consistent estimator for ω. Let v̂i denote the OLS estimator of the regression n oN n o (s) (s) N denote the sequences of δ̂i and v̂i , sorted according coefficient of xit on x̄t . Let δ̂i and v̂i i=1 i=1 N Pα̇ N Pα̇ to their absolute values in a descending order, and define the following estimators of σv2 /µ2v (s) !2 (s) 2α̇−2 v̂i − N1α̇ v̂i N \ 2 i=1 i=1 σv = µ2v N α̇ − 1 , for any finite value of κ, and N Pα̇ \ σv2 = µ2v i=1 (s) δ̂i − 1 N α̇ N Pα̇ (20) !2 (s) δ̂i i=1 N α̇ − 1 , when κ = 1. (21) Theorem 4 Suppose Assumptions 1 to 3 hold and m = 1. Let 2 min(N α̇ , T ) σv 4 min(N α̇ , T ) \ ω̇ = V̂f 2 + , α̇ T N µ2v where α̇ = α̂, α̃, α̌, and V̂f 2 [ σv2 µ2v is given by (20) or (21) depending on the value of κ, T T 1X 1X = st − st T T t=1 st = (x̃t − x̃)2 , x̃ = T −1 !2 t=1 PT t=1 x̃t , + l X j=1 T X 1 T st−j t=j+1 T 1X − st T ! t=1 ! T 1X st − st , T (22) t=1 and l → ∞. Then, ω̇ − ω = op (1). where ω is defined in (17), so long as l → ∞, l = o (T ) and l = o N α−1/2 T 1/2 . 3.2 Extensions Consider now the following multiple factor extension of our basic setup: xit = β 0i f t + uit = m X βij fjt + uit , i = 1, 2, ..., N, j=1 where f t = (f1t , f2t , ..., fmt )0 is an m × 1 vector of factors and β i is the associated vector of factor loadings (m is fixed). We make the following assumptions that generalise straightforwardly Assumptions 1 and 2. 10 Assumption 4 We assume that loadings are given by βij = vij for i = 1, 2, ..., [N αj ] , i−[N αj ]+1 βij = cj ρj , for i = [N αj ] + 1, [N αj ] + 2, ..., N , (23) αj where 1/2 < αj ≤ 1, |ρj | < 1, and {vij }N i=1 is an IID sequence of random variables with mean µvj 6= 0, and variance σv2j < ∞, for all j = 1, 2, ..., m. Assumption 5 The m × 1 vector of factors, f t , follows a linear stationary process given by ft = ∞ X ψ f j ν f,t−j , (24) j=0 where ν f t is a sequence of IID random variables with mean zero and a finite variance matrix, Σνf , and uniformly finite ϕ-th moments for some ϕ > 4. The matrix coefficients, ψ f j , satisfy the absolute summability condition ∞ X j ζ ψ f,j < ∞, j=0 such that {ζ(ϕ − 2)}/{2(ϕ − 1)} ≥ 1/2. f t is distributed independently of the idiosyncratic errors, uit0 , for all i, t and t0 . Without loss of generality we assume that α = α1 ≥ αj , j = 2, ..., m and that factors are orthogonal. We have ! PNj (N −Nj ) N X 1 − ρ Nj cj ρj j i=1 vji = N αj −1 v̄j + N −1 Kρj β̄jN = N −1 βji = + (25) N Nj N 1 − ρj i=1 and V ar(β̄jN ) = N −2 τ 0 Σβj τ = O(N aj −2 ). Note that x̄t − E(x̄t ) = β̄1N f1t + β̄2N f2t + ...β̄mN fmt + ūt , and 2 V ar(x̄t ) = E β̄1N f1t + β̄2N f2t + ...β̄mN fmt + ūt m m m X X X 2 2 2 2 2 E(β̄1N )σjf + E(ū2t ) = + V ar(β̄1N )σjf + E(ū2t ). = E(β̄1N ) σjf j=1 Let β̄ N = β̄1N , ..., β̄mN j=1 0 j=1 and v̄ N = (v̄1N , ..., v̄mN )0 . Then, β̄ N = N α−1 DN v̄ N + N −1 K ρ . (26) where DN is an m×m diagonal matrix with diagonal elements given by N αj −α1 and K ρ = (Kρ1 , ..., Kρm )0 . It is important to stress that (23) can be generalised. In particular, the results below hold so long as PN PN i=Nj +1 βji < ∞, in which case, Kρj = Kj = i=Nj +1 βji . Further, let −1/2 −1/2 dT = v̄ 0N S f f f¯T − µ0v Σf f µf (27) h 0 i = Σf f [σij,f ] = E f t − µf f t − µf , 0 f t − f¯T f t − f¯T , Σf f PT 0 σif = σii,f , µf = E (f t ), t=1 f t , and µv = (µ1v , ..., µmv ) = E (v i ). Then we have the following theorem which distinguishes between a number of different cases depending on the relative values of the different exponents, αj , for j = 1, 2, .., m. where S f f = S f f [sijf ] = 1 T PT t=1 f¯T = T1 11 Theorem 5 Suppose Assumptions 4 to 5 hold, α = α1 = α2 = ... = αm , and α > 1/2. Then, 2 p σ̄N min(N α∗ , T ) 2 ln(N ) (α̂ − α∗ ) − 2α−1 0 →d N (0, ωm ) N v̄ N D N S f f D N v̄ N (28) where ωm = lim min (N α , T ) V ar(d2T ), N,T →∞ dT is defined by (27), ∗ α ≡ and κ2 = ∗ αN ln κ2 =α+ , 2 ln (N ) Pm 2 2 i=1 µiv σif . Continue to assuming that α = α1 = α2 = ... = αm , and suppose that either α > 4/7 or then p min(N α∗ , T )2 ln(N ) (α̃ − α∗ ) →d N (0, ωm ) , T 1/2 N 4α−2 → 0, (29) and p min(N α∗ , T )2 ln(N ) (α̌ − α∗ ) →d N (0, ωm ) . (30) α1 = α > α2 + 1/4, (31) Further, if either or if α2 < 3α/4, T b = N, b> 1 , 4(α − α2 ) (32) and α2 ≥ α3 ≥ ... ≥ αm , (28), (29) and (30) hold with ω replacing ωm , where ω is defined in (17) and α∗ is now defined by (16). Finally, if α > α2 ≥ α3 ... ≥ αm but neither (29) nor (30) hold, then (28), (29) and (30) hold with ω replacing ωm , and P m 2(α1 −αi ) µ2 σ 2 N ln iv if i=1 ∗ α∗ ≡ αN =α+ . 2 ln (N ) Up to now we have analysed estimators of the exponent of cross-sectional dependence assuming that factor loadings take the form given in Assumption 1. We briefly examine an alternative formulation for the loadings. As we noted in the Introduction this alternative specification is mathematically convenient, yet economically restrictive. More specifically consider the following formulation of the factor loadings: Assumption 6 Suppose that the factor loadings are given by βik = N α−1 vik , 1/2 < α ≤ 1 (33) 2 where {vik }N i=1 is an i.i.d. sequence of random variables with mean µvk 6= 0, and variance σvk < ∞. It is easy to see that even in the case of this alternative specification of the loadings we have τN = N N 1 X X σij,x = O(N 2α ), N2 i=1 j=1,i6=j and the following Corollary follows easily from the proofs of Theorems 1-3, and Lemma 6. 12 Corollary 2 Let Assumptions 6 and 2-3 hold, m = 1 and α > 1/2. Let α̂, α̃ and α̌ be defined as in (14), (18) and (19) respectively. Then, ! 2 p σ̄ min(N, T ) 2 ln(N ) (α̂ − α∗ ) − 2α−1N 2 2 →d N (0, ω) , N v̄N sf where α∗ and ω are defined in (16) and (17), respectively. If µ2v σf2 = 1,then, p min(N, T ) 2 ln(N ) (α̂ − α) − As long as either α > 5/8 or T 1/2 N 4α−2 2 σ̄N 2 s2 N 2α−1 v̄N f ! →d N (0, ω) . → 0, p min(N, T )2 ln(N ) (α̃ − α∗ ) →d N (0, ω) and p min(N, T )2 ln(N ) (α̌ − α∗ ) →d N (0, ω) . Remark 1 It is of interest to consider circumstances where Assumption 6 fails but the above result still holds. In particular, let βi = N α−1 vi , 1/2 < α ≤ 1 (34) where vi = vN i = ṽi + cN i and {ṽi }N i=1 is an i.i.d. sequence of random variables with mean µv 6= 0, and variance σv2 < ∞. Lemma 14 provides general conditions for this Assumption, under which, our theoretical results hold. In this remark we explore a leading case of departure from Assumption 6 that is covered by Lemma 14. Without loss of generality, we order units, such that cN i = N 1−α ηi for i = 1, 2, ..., M where {ηi }N i=1 is an i.i.d. sequence of random variables with mean µη 6= 0, and variance 2 ση < ∞. This implies that M units have loadings that are bounded away from zero. Then, using Lemma 14, it is easy to see that the theorems relating to the asymptotic distribution of the estimators continue to hold as long as M = o N α−1/2 . 3.3 Is the normalization restriction κ2 = 1 justified? Since the cross-sectional exponent, α, is identified in terms of the factor loadings, and factor loadings are in turn identified only up to a non-singular rotation matrix, it is clear that some restriction on κ2 = µ2v σf2 is needed before a sufficiently accurate estimator of α can be obtained. Although, as we shall see below, it might be possible to jointly estimate α and κ under our preferred formulation of the factor loadings, this is not the case in general. For example, joint estimation of κ and α does not seem possible if the factor loadings are generated according to (33). It is, therefore, interesting to investigate the extent to which the imposition of the restriction κ2 = 1 restrict the data generating process of xit . A useful way to do this is to consider the extent to which setting κ2 = 1 imposes restrictions on the second moment structure of xit . The restriction κ2 = 1 has no implications for the second moments of the individual series, xi = (xi1 , xi2 , ..., xiT )0 , and can only affect the cross-sectional average of the second moments. We have investigated these effects in a not-for-publication appendix that is available upon request, and conclude that they are reasonably modest. 13 4 Joint estimation of α and κ In this section we show that it is in fact possible to jointly estimate α and κ, if the factor loadings follow the set up given under Assumption 1. To see this, without loss of generality, suppose that βi are ordered such that |β1 | ≥ |β2 | ≥ .... β[N α ] , with βi = 0 for i = [N α ] + 1, ..., N . Let β n = (β1 , β2 , ...βn )0 , P β̄n = n1 ni=1 βi , and note that E(β̄n ) = n−1 τ 0n E(β n ) = µv , if n ≤ [N α ] = n−1 [N α ] µv , if n > [N α ] . Similarly, let Σβ,n = Cov(β n ) for n ≤ [N α ] and bear in mind that Σβ,[N α ]×[N α ] 0 Σβ,N = 0 0(N −[N α ])×(N −[N α ]) and note that V ar(β̄n ) = n−2 τ 0n Cov(β n )τ 0n ≤ n−1 λmax (Σβ,n ) = O(n−1 ), if n ≤ [N α ] α [N α ] [N ] −2 0 0 = n τ n Cov(β n )τ n ≤ 2 λmax (Σβ,[N α ]×[N α ] ) = O , if n > [N α ] . n n2 Now let x̄nt be the cross-sectional average of xit over i = 1, 2, ..., n < N , where without loss of generality we assume that xit is ordered by |βi |. Then, using the above results V ar(x̄nt ) = σf2 µ2v +n−1 cn + O(n−1 ), if n ≤ [N α ] , α 2α 2 2 [N ] −1 −2 =n N σf µv + n cn + O , if n > [N α ] . n2 where cn = n−1 τ 0n Σu,n τ n , E(un u0n ), with un = (u1 , u2 , ..., un )0 . This result can be written more compactly as V ar(x̄nt ) − n−1 cn = I([N α ] − n)κ2 + I([N α ] − n)O(n−1 ) α [N ] , + [1 − I([N α ] − n)] n−2 N 2α κ2 + O n2 where I(A) is an indicator function that takes the value of 1 if A > 0 and zero otherwise. After some simplifications we have 2 V ar(x̄nt ) − n−1 cn = n−2 N 2α + I([N α ] − n) 1 − n−2 N 2α κ + α α [N ] [N ] +O I([N α ] − n) O(n−1 ) − O 2 n n2 It is easily seen that for n = N , we obtain our earlier result (note that N α ≤ N since α ≤ 1), namely V ar(x̄N t ) − N −1 cN = κ2 N 2α−2 + O N α−2 . But other values of n that are sufficiently large can also be used to provide information on κ and α. First we need a consistent estimator of V ar(x̄nt ) − n−1 cn , when n is sufficiently large. To this end, consider the OLS estimator of the slope coefficient in the regression of xit − x̄i on x̄N t − x̄, given by N P P P v̂i = Tt=1 (xit − x̄i ) (x̄N t − x̄) / Tt=1 (x̄N t − x̄)2 , where x̄ = N −1 x̄i . Order the observations, xit , i=1 as x(i)t , so that x(1)t is associated with v̂ (1) , x(2)t is associated with v̂ (2) , and so on, where v̂ (1) , v̂ (2) , ... 14 are the values of v̂1 , v̂2 , ..., v̂N ordered in a descending manner. Consider the following estimator of V ar(x̄nt ), PT (x̄nt − x̄nT )2 2 σ̂x̄n = t=1 , T P P P 2 , where x̄nt = n−1 ni=1 x(i)t , and x̄nT = T −1 Tt=1 x̄nt . Similarly, estimate cn by ĉn = n−1 nj=1 σ̂(j) 2 is the estimator of σ 2 , the standard error of the idiosyncratic component of x where σ̂(j) (j)t for (j) t = 1, 2, ..., T . Thus n T 1 XX 2 û(j)t , ĉn = nT j=1 t=1 (s) where û(j)t = x(j)t − x̄(j) − v̂j (x̄N t − x̄N T ). Based on σ̂x̄2n , we obtain the following estimating equation 2 σ̂x̄2n − n−1 ĉn = n−2 N 2α + I([N α ] − n) 1 − n−2 N 2α κ + α α [N ] [N ] I([N α ] − n) O(n−1 ) − O +O + ξn , n2 n2 where ξn is the estimation error. It is clear that only the estimates that are based on n > [N α ] are informative for α. This is because for values of n < [N α ] we have σ̂x̄2n − n−1 ĉn = κ2 + O(n−1 ) + ξn . The above results suggest that joint estimation of α and κ can be based on the minimization of the following quadratic form in terms of α and κ: X X 2 2 Q(α, κ2 ) = σ̂x̄2n − n−1 ĉn − κ2 + σ̂x̄2n − n−1 ĉn − n−2 N 2α κ2 , (35) n≤[N α ] n>[N α ] The first order condition for κ2 is X X −2 2 σ̂x̄2n − n−1 ĉn − κ̂2 (α) + N 2α n σ̂x̄n − n−1 ĉn − n−2 N 2α κ̂2 (α) = 0, n≤[N α ] n>[N α ] which yields P κ̂2 (α) = n≤[N α ] PN −2 σ̂ 2 − n−1 ĉ σ̂x̄2n − n−1 ĉn + N 2α n x̄n n>[N α ] n . P −4 [N α ] + [N 4α ] N n>[N α ] n Using this result we have X 2 σ̂x̄2n − n−1 ĉn − κ̂2 (α) + X Q(α) = Q(α, κ̂2 (α)) = n≤[N α ] 2 σ̂x̄2n − n−1 ĉn − n−2 N 2α κ̂2 (α) . n>[N α ] The above expressions can be simplified. Let [N α ] Q1 = Q2 = X [N α ] σ̂x̄2n n=1 N X −n −1 ĉn 2 , q1 = σ̂x̄2n − n−1 ĉn n=1 σ̂x̄2n −n −1 ĉn 2 , q2 = n=[N α ]+1 α X N (α) = [N ] + N 4α N X n=[N α ]+1 N X n−4 n=[N α ]+1 15 n−2 σ̂x̄2n − n−1 ĉn which yields q1 + N 2α q2 κ̂ (α) = . N (α) 2 (36) Then, 4 α Q(α) = Q1 + Q2 + κ̂ (α) [N ] + N 4α 4 κ̂ (α) N X n−4 − 2κ̂2 (α)q1 − 2κ̂2 (α) N 2α q2 n=[N α ]+1 4 2 = Q1 + Q2 + κ̂ (α)N (α) − 2κ̂ (α) q1 + N 2α q2 " " #2 # 2α q1 + N 2α q2 q1 + N 2α q2 = Q1 + Q2 + N (α) − 2 q1 + N q2 N (α) N (α) 2 q1 + N 2α q2 = Q1 + Q2 − . N (α) 2 P [q1 +[N 2α ]q2 ] 2 2 −1 Further, note that Q1 + Q2 = N which does not depend on α. Then, n=1 σ̂x̄n − n ĉn N (α) can be maximised straightforwardly by grid search, which gives an estimator of α (denoted by α̈). κ2 can then be estimated using (36). The consistency of α̈ is established in Appendix IV, although we have not been able to establish the rate at which this estimator is consistent. But judging by the results obtained in the structural break literature we conjecture that the rate at which α̈ converges to α is ln(N ), and to obtain a better convergence rate some a priori restriction on κ might be needed. 5 Monte Carlo Study We investigate the small sample properties of the proposed estimators of α, through a detailed Monte Carlo simulation study. We consider the following two factor model xit = β1i f1t + β2i f2t + uit , (37) for i = 1, 2, ..., N , and t = 1, 2, ..., T. The factors are generated as q fjt = ρj fj,t−1 + 1 − ρ2j ζjt , j = 1, 2, for t = −49, −48, ..., 0, 1, ..., T, with fj,−50 = 0, for j = 1, 2. The shocks are generated as q uit = φi ui,t−1 + 1 − φ2i εit , for i = 1, 2, ..., N and t = −49, −48, ..., 0, 1, ..., T, with ui,−50 = 0, ζjt ∼ IIDN (0, 1), εit ∼ IIDN (0, σi2 ), where σi2 ∼ IID χ2 , i = 1, 2, ..., N. Therefore, by construction σf2j = 1, for j = 1, 2. In the first instance we set the loadings of the second factor equal to zero, β2i = 0, and focus on the properties of β1i . We consider the following generating mechanism: β1i = v1i , for i = 1, 2, ..., M (N ) β1i = ρi−M , for i = M + 1, M + 2, ..., N l where v1i ∼ IIDU (µv1 − 0.5, µv1 + 0.5), M = [N a ] and ρl = 0.5. The above parametrization ensures that µ2v1 σf21 = 1, as discussed in the development of the theory. Initially, where m = 1, we consider the following experiments. 16 Experiment A A basic design, where the factor, f1t , is serially uncorrelated and the errors, uit , are serially uncorrelated and cross-sectionally independent, namely when ρ1 = 0, φi = 0, i = 1, 2, ..., N and εit ∼ IIDN (0, σi2 ), for all i and t. Experiment B This design is as in Experiment A, but allows for temporal dependence in the factor, so that ρ1 = 0.5, φi = 0, i = 1, 2, ..., N and εit ∼ IIDN (0, σi2 ), for all i and t. Experiment C This design is as in Experiment A, where we continue to set ρ1 = 0, but allow the idiosyncratic errors, uit , to be cross-sectionally dependent according to a first order spatial autoregressive model. Let ut = (u1t , u2t , ..., uN t )0 , and set ut as ut = Qεt , εt = σε ηt ; ηt ∼ IIDN (0, IN ), where Q = (IN − θS)−1 , and S= 0 1 0 ... 0 1/2 0 1/2 . . . 0 .. .. .. .. . . . . 0 0 . . . 0 1/2 0 0 ... 1 0 , We set θ = 0.2, and set σε2 = N/T r(QQ0 ) which ensures that N −1 PN i=1 var(uit ) = 1. Experiment D Next, we take into account the second factor as well and generate its loadings as β2i = v2i , for i = 1, 2, ..., M2 (N ) 2 β2i = ρi−M , for i = M2 + 1, M2 + 1, ..., N, l where v2i ∼ IIDU (µv2 − 0.5, µv2 + 0.5), M2 = N a2 and ρl = 0.5. We examine the case where α2 = a √ and set σf1 = σf2 = 1 and µv1 = µv2 = 0.5. The rest of the parameters are se as in Experiment A, namely α2 = a, ρj = 0, φi = 0, for i = 1, 2, ..., N, j = 1, 2 and uit ∼ IIDN (0, 1), for all i and t. For all experiments we consider values of α = 0.70, 0.75, ..., 0.90, 0.95, 1.00, N = 100, 200, 500, 1000 and T = 100, 200, 500. All experiments are based on 2000 replications. For each replication, the values of α, ρj , ρl , φi and S are given as set out above. These parameters are fixed across all replications. The values of vji , j = 1, 2 are drawn randomly (N of them) for each replication. In the case where the leading factor (f1t ) is serially uncorrelated, the statistic for making inference about α (when κ = 1) is given by (see theorems 1, 3 and 4 ) ! c2 −1/2 1 4 σ v V̂ 2 + α̇ 2 2 ln(N ) (α̇ − α) →d N (0, 1), T f1 N µv 17 \ 4 4 for α̇ = α̃ or α̌. Note that when the leading factor is serially uncorrelated then V̂f 2 = E(f 1t )/σf1 − 1, 1 \ 4 4 where E(f 1t )/σf is consistently estimated by PT t=1 (x̃t \ 4 4 E(f 1t )/σf1 = − x̃)4 T , P 2 2 \ 2 2 where x̃t = N −1 N i=1 xit /σ̂x̄ . Also σv /µv , the estimator of σv /µv , is given by N Pα̇ c2 σ v = µ2v where i=1 (s) δ̂i − 1 N α̇ N Pα̇ (s) δ̂i i=1 N α̇ − 1 !2 , with α̇ = α̃ or α̌, n o (s) δ̂i denotes the sequence of δ̂i sorted according to their absolute values in a descending order, and δ̂i is the OLS estimator of the regression coefficient of xit on x̃t = (x̄t − x̄)/σ̂x̄ . Also see Theorem 4 and the discussions that proceed it. The above expressions apply irrespective of whether the model contains one or two factors. Size of the tests is computed under H0 : α = α0 , using a two-sided test where α0 takes values in the range [0.70, 1.00], as indicated above. Power is computed under the alternatives Ha : αa = α0 + 0.05 (power+), and Ha : αa = α0 − 0.05 (power-). All results are scaled by 100. Finally, when ρj 6= 0, the case of serially correlated factors, we use a corrected variance estimator of ft . The relevant formula for the test statistic is given by " # i c2 −1/2 1h 4 σ v V̂ 2 (q) + α̇ 2 2 ln(N ) (α̇ − α) →d N (0, 1), (38) T f N µv for α̇ = α̃ or α̌. V̂f 2 (q) is computed by first estimating an AR(q) process for z̃t = zt − z̄, where P P P x /σ̂x̄ , x̃ = T −1 Tt=1 x̃t and z̄ = T −1 Tt=1 zt ,and then V̂f 2 (q) = zt = (x̃t − x̃)2 , x̃t = N1 N it i=1 σ̂z2 /(1 − γ b1 − γ b2 − ... − γ bq )2 , where σ̂z is the regression standard error and γ bi is the ith estimated AR 2 /µ2 is computed as before. Note that coefficient fitted to z̃t . The lag order is set to q = T 1/3 , and σ\ v v this correction is not the standard Newey-West one but based on AR approximations. We found that this correction has better finite samples properties and hence we use this in both the Monte Carlo study and the empirical applications of Section 6. 5.1 Results The results for experiment A are summarized in Table A1 giving the bias, Root Mean Square Error (RMSE), size and power when α̃ is used as the estimator of α. Table A2 presents the same set of summary statistics for experiment A when the further bias-corrected estimator, α̌, is considered. We only report results for values of α over the range [0.70, 1.0]. Recall that α is identified only if α > 1/2, and for asymptotically valid inference on α it is further required that α > 4/7, unless T 1/2 /N (4α−2) → 0, as N and T → ∞. (See Theorem 2). Both estimators perform reasonably well with α̌ having a slightly smaller bias for values of α in the range [0.70 − 0.85], but overall there is little to choose between the two estimators, and therefore in what follows we focus on α̃ as the estimator of α. As predicted by the theory, the bias and RMSE of α̃ decline with both N and T , and tend to be smaller for larger values of α. A similar pattern can also be seen in size and power of the tests based on α̃. There is evidence of some size distortion when α is below 0.75, but it tends towards the nominal 5% level as α is increased. The size distortion is also 18 reduced as N and T are increased. The power of the test also rises in α, N and T , and approaches unity quite rapidly. However, the power function seems to be asymmetric with the power tending to be higher for alternatives below the null (denoted by Power-) as compared to the alternatives above the null (denoted by Power+). This asymmetry is particularly marked for low values of α and disappears as α is increased. The results for Experiment B where the factor is allowed to be serially correlated are summarized in Table B. As compared to the baseline case, we see some deterioration in the results, particularly for relatively small values of N and T . The RMSEs are slightly higher, the size distortions slightly larger, and the power slightly lower. But these differences tend to vanish as N and T are increased. The effects of allowing for weak cross-sectional dependence in the idiosyncratic errors, uit , on estimation of α are summarized in Table C for Experiment C. Considering the moderate nature of the spatial dependence introduced into the errors (with the spatial parameter, θ, set to 0.2), the results are not that different from the ones reported in Table A2, for the baseline experiments. However, one would expect greater distortions as θ is increased, although the effects of introducing weak dependence in the idiosyncratic errors are likely to be less pronounced if higher values of α are considered. For values of α near the borderline value of 1/2, it will become particularly difficult to distinguish between factor and spatial dependent structures. Finally, the results of Experiment D where one additional factor is included in the baseline case are summarized in Table D. As can be seen, the results are hardly affected by the addition of the new factor to the data generating process. Consistent with the one-factor case of Experiment A, both the bias and RMSE of α̃ fall gradually as N and α are increased, while tests of the null hypothesis based on α̃, are correctly sized for α > 0.7 in this case as well. Similar observations can also be made with respect to the power. The above Monte Carlo experiments, although limited in scope, clearly illustrate the potential utility of the estimation and inferential procedure proposed in the paper for the analysis of crosssectional dependence. The results are broadly in agreement with the theory and are reasonably robust to departures from the basic model. Although, the results tend to deteriorate somewhat when we consider serially correlated factors or weak cross-sectional dependence in the idiosyncratic errors, the estimated values of α tend to retain a high degree of accuracy even for moderate sample sizes. It is also worth bearing in mind that in most empirical applications the interest will be on estimates of α that are close to unity, as it is for these values that a factor structure makes sense as compared to spatial or other network models of cross-sectional dependence. It is, therefore, helpful that the quality of the small sample results tend to improve when values of α close to unity are considered. 6 Empirical Applications In this section we provide estimates of the exponent of cross-sectional dependence, α, for a number of panel data sets used extensively in economics and finance. Specifically, we consider three types of data sets: quarterly cross-country data used in global modelling, large quarterly data sets used in empirical factor model literature, and monthly stock returns on the constituents of Standard and Poor 500 index. 6.1 Cross-country dependence of macro-variables Table 1 presents estimates of α for real output growth, inflation, rate of change of real equity prices, and interest rates (short and long) computed over 33 countries (when available). The data is from Cesa-Bianchi, Pesaran, Rebucci, and Xu (2012), and covers the period 1979Q2-2009Q4, which updates 19 the earlier GVAR (global vector autoregressive) data sets used in Pesaran, Schuermann, and Weiner (2004), and Dees, di Mauro, Pesaran, and Smith (2007).8 We provide both bias corrected estimates, α̃ and α̌, computed using available cross-country time series, yit , over the full sample period. The observations were standardized as xit = (ỹit − ỹ i )/si , where ỹ i is the sample mean of each time series, and si is the corresponding standard deviations. Table 1 also reports the 95% confidence bands, the cross section dimension (N ) and the time series dimension (T ) for each of the variables. Although, there are 33 countries in the GVAR data set, not all variables are available for all the 33 countries. For example, the short term (3 months) interest rate data is not available for Saudi Arabia, and real equity prices and long term interest rate data (10 year government bond) are available only for some of the countries. We first note that the two different estimates of α provided in Table 1 are very close, which are in line with the Monte Carlo results reported in Tables A1 and A2. Focusing on α̃, we observe that the point estimates range between 0.754 for cross dependence of GDP growth rates, to 0.968 for long term interest rates. The exponent of cross-sectional dependence for short term interest rates and real equity prices at 0.907 and 0.881 are also quite high, indicating that financial variables are more strongly correlated as compared to the real variables. The reported confidence bands all lie above 0.5, but none cover unity either apart from the case of long term interest rates (marginally), suggesting that whilst a factor structure might be a good approximation for modelling global dependencies, the value of α = 1 typically assumed in the empirical factor literature might be exaggerating the importance of the common factors for modelling cross-sectional dependence at the expense of other forms of dependencies that originate from trade or financial inter-linkages that are more local or regional rather than global in nature. Table 1: Exponent N Real GDP growth, q/q 33 Inflation, q/q 33 Real equity price change, q/q 26 Short-term interest rates 32 Long-term interest rates 18 of cross-country ∗ T α̃0.025 122 0.691 123 0.778 122 0.797 123 0.831 123 0.864 dependence of macro-variables ∗ ∗ α̃ α̃0.975 α̌0.025 α̌ 0.754 0.818 0.689 0.752 0.851 0.924 0.777 0.850 0.881 0.966 0.796 0.881 0.907 0.983 0.831 0.907 0.968 1.072 0.864 0.968 ∗ α̌0.975 0.816 0.924 0.966 0.983 1.072 *95% level confidence bands 6.2 Within-country dependence of macro-variables An important strand in the empirical factor literature, promoted through the work of Forni, Hallin, Lippi, and Reichlin (2000), Forni and Lippi (2001) and Stock and Watson (2002), uses factor models to forecast a few key macro variables such as output growth, inflation or unemployment rate with a large number of macro-variables, that could exceed the number of available time periods. It is typically assumed that the macro variables satisfy a strong factor model with α = 1. We estimated α using the quarterly data sets used in Eklund, Kapetanios, and Price (2010). For the US the data set comprises 95 variables over the period 1960Q2 to 2008Q3. For the UK the data set covers 94 variables spanning from 1977Q1 to 2008Q2. The estimates of α, computed from the standardized data sets as explained in the previous subsection, together with their 95% confidence bands are summarized in Table 2. For both data sets the estimates, α̃ and α̌, are identical up to two decimal places. For the US data set the point estimate (α̃) of α, at 0.739, is slightly larger than the estimate obtained for the UK at 8 This version of GVAR data set can be downloaded from http://www-cfap.jbs.cam.ac.uk/research/gvartoolbox/download.html 20 Table 2: Exponent of within-country dependence of macro-variables US UK 1960Q2-2008Q3 1977Q1-2008Q2 N=95, T=194 N=94, T=126 ∗ ∗ ∗ ∗ α̃0.025 α̃ α̃0.975 α̃0.025 α̃ α̃0.975 0.689 0.739 0.788 0.636 0.715 0.793 ∗ ∗ ∗ ∗ α̌0.025 α̌ α̌0.975 α̌0.025 α̌ α̌0.975 0.689 0.738 0.788 0.635 0.713 0.792 *95% level confidence bands 0.715. The 95% confidence bands for both data sets are well above the threshold value of 0.50, but are well short of 1.0 at the upper end of the band. Once again there is some evidence of a common factor dependence, but the evidence is not as strong as it is assumed in the literature. 6.3 Cross-sectional exponent of stock returns One of the important considerations in the analysis of financial markets is the extent to which asset returns are interconnected. This is encapsulated in the capital asset pricing model (CAPM) of Sharpe (1964) and Lintner (1965), and the arbitrage pricing theory (APT) of Ross (1976). Both theories have factor representations with at least one strong common factor and an idiosyncratic component that could be weakly correlated (see, for example, Chamberlain (1983)). The strength of the factors in these asset pricing models is measured by the exponent of the cross-sectional dependence, α. When α = 1, as it is typically assumed in the literature, all individual stock returns are significantly affected by the factor(s), but there is no reason to believe that this will be the case for all assets and at all times. The disconnect between some asset returns and the market factor(s) could occur particularly at times of stock market booms and busts where some asset returns could be driven by non-fundamentals. Therefore, it would be of interest to investigate possible time variations in the exponent α for stock returns. Note that under our methodology the market factor associated with the CAPM specification is implied by the data rather than imposed by use of a specific market portfolio composition which can be limiting, as explained in Roll (1977). We base our empirical analysis on monthly excess returns of the securities included in the Standard & Poor 500 (S&P 500) index of large cap U.S. equities market, and estimate α recursively using rolling samples of size 120 months (10 years) and 60 months ( 5 years). Due to the way the composition of S&P 500 changes over time, we compiled returns on all 500 securities at the end of each month over the period from September 1989 to September 2011, and included in the rolling samples only those securities that had a sufficiently long history in the month under consideration. On average we ended up with 439 securities at the end of each month for the rolling samples of size 10 years, and 476 securities when we used a rolling sample of size 5 years. The one-month US treasury bill rate was chosen as the risk free rate (rf t ), and excess returns computed as r̃it = rit − rf t , where rit is the monthly return on the ith security in the sample inclusive of dividend payments (if any).9 Recursive estimates of α were then computed using the standardized observations xit = (r̃it − r̃i )/si , where r̃i is the sample mean of the excess returns over the selected rolling sample, and si is the corresponding standard deviations. The recursive estimates of α based on 10 years and 5 years rolling windows are given in Figure 1. We also computed rolling standard errors for the estimates, α̃t , using the serial correlation correction discussed in Section 5. Based on these standard errors, the 95% confidence bands of the recursive 9 For further details of data sources and definitions see Pesaran and Yamagata (2012). 21 estimates were on average ±0.02 around the point estimates for both rolling sample sizes considered. These bands are not shown in Figure 1, since the bands are relatively narrow and we aim to highlight the time variations in the estimates of α. Figure 1: α̃t associated with S&P 500 securities’ excess returns - 5-yr and 10-yr rolling samples 0.94 0.93 0.92 0.91 0.9 0.89 0.88 0.87 0.86 0.85 0.84 Sep-89 Jul-90 May-91 Mar-92 Jan-93 Nov-93 Sep-94 Jul-95 May-96 Mar-97 Jan-98 Nov-98 Sep-99 Jul-00 May-01 Mar-02 Jan-03 Nov-03 Sep-04 Jul-05 May-06 Mar-07 Jan-08 Nov-08 Sep-09 Jul-10 May-11 0.83 5-year estimates of α (ᾶ) 10-year estimates of α (ᾶ) The figure covers 23 years of monthly recursive estimates of α, and yet fall in a relatively narrow range of 0.854 − 0.917 in the case of the 10 years rolling samples, and in a slightly wider range of 0.838 − 0.932 in the case of 5 years rolling samples. These estimates clearly show a high degree of inter-linkages across individual securities, although the null hypothesis that α = 1 is clearly rejected. More importantly, there are clear trends in the estimates of α. The estimates based on the 10 years rolling samples fall from a high of 0.92 in 1990 to a low of 0.85 just before the burst of the dot-com bubble in 1999-2000. The estimates of α then stabilize around 0.86 over the period 2000 − 2008, then fall quite dramatically towards the end of 2008 at the time of the market crash, before starting to rise again to its present level of 0.90 in September 2011. The factors behind these fluctuations are complex and reflect the relative importance of micro and macro fundamentals prevailing in financial markets. A standard factor model does not seem able to fully account for the changing nature of the dependencies in securities market over the 1989-2011 period. A similar result is also obtained when α is estimated using 5-year rolling samples, although as to be expected the estimates are variable. Indeed, the upward movements in the estimates based on 5-year windows are more pronounced both in the latest crisis as well as in the period of 1997-2000 that saw smaller crises caused by the Asian economic turmoil, LTCM and the bursting of the dot-com bubble. The patterns observed in the above estimates of α are in line with changes in the degree of correlations in equity markets. It is generally believed that correlations of returns in equity markets rise at times of financial crises, and it would be of interest to see how our estimates of α relate to return correlations. To this end in Figure 2 we compare the estimates of α to average pair-wise correlation coefficients of excess returns (ρ̂N ) on securities included in S&P 500 index, using 10-year and 5-year rolling windows.10 As the plots in these figures show, our estimates of α closely follow the rolling 10 Denote the correlation of excess returns and j securities by ρ̂ij , the pair-wise average correlation of the market Pon−1i P N is then computed as ρ̂N = (1/N (N − 1)) N i=1 j=i+1 ρ̂ij , where N is the number of securities under consideration. 22 estimates of ρ̄N . Figure 2: Average pair-wise correlations of excess returns for securities in the S&P 500 index and the associated α̃t estimate computed using 10-year and 5-year rolling samples 0.93 0.40 0.91 0.35 0.95 0.45 0.93 0.40 0.35 0.91 0.90 0.30 0.30 0.89 0.88 0.25 0.25 0.87 0.20 0.87 0.20 0.15 0.15 0.83 0.10 Sep-89 Jul-90 May-91 Mar-92 Jan-93 Nov-93 Sep-94 Jul-95 May-96 Mar-97 Jan-98 Nov-98 Sep-99 Jul-00 May-01 Mar-02 Jan-03 Nov-03 Sep-04 Jul-05 May-06 Mar-07 Jan-08 Nov-08 Sep-09 Jul-10 May-11 Sep-89 Aug-90 Jul-91 Jun-92 May-93 Apr-94 Mar-95 Feb-96 Jan-97 Dec-97 Nov-98 Oct-99 Sep-00 Aug-01 Jul-02 Jun-03 May-04 Apr-05 Mar-06 Feb-07 Jan-08 Dec-08 Nov-09 Oct-10 Sep-11 0.85 0.85 10-year estimates of α (ᾶ) based on excess returns (lhs) 5-year estimates of α (ᾶ) based on excess returns (lhs) 10-year average pairwise correlations of excess returns (rhs) 5-year average pairwise correlations of excess returns (rhs) Further, it would be of interest to see how our estimates of α compare with estimates obtained using excess returns on market portfolio as a measure of the unobserved factor. This approach starts with capital asset pricing model (CAPM) and assumes that the single factor in CAPM regressions can be approximated by a stock market index. Under these assumptions, as noted in the Introduction, a direct estimate of α is given by α̂d = ln(M̂ )/ ln(N ), where M̂ denotes the estimated number of non-zero betas, and N is the total number of securities under consideration.11 M̂ can be consistently estimated (as N and T → ∞) by the number of t-tests of βi = 0 in the CAPM regressions rit − rf t = ai + βi (rmt − rf t ) + uit , for i = 1, 2, ..., N, (39) that end up in rejection of the null hypothesis at a chosen significance level, where rmt is a broadly defined stock market index. In our application we choose the value-weighted return on all NYSE, AMEX, and NASDAQ stocks to measure rmt ,12 and select 1% as the significance level of the tests. Such estimates of α obtained recursively using 10-year and 5-year rolling windows are shown in the two plots in Figure 3. For ease of comparison, these plots also include our (indirect) estimates of α based on the same data sets (except for the marker return, rmt , which is not used). The two sets of estimates co-move over most of the period and tend to become closer in the aftermath of the dot-com bubble and during the recent financial crisis. The correlation coefficient of the two sets of estimates is 0.923 for the ones based on 10-year rolling samples, and 0.739 for the ones based on the 5-year rolling samples. The two sets of estimates, however, differ in scale, with the direct estimates being closer to unity. The scale of the direct estimates clearly depends on the measure of market return, the level of significance chosen, and the assumption that the model contains only one single factor with α > 1/2, Almost identical estimates are also obtained if we use returns instead of excess returns. 11 Recall that M = [N α ], where M is the true number of non-zero betas. 12 The return data on market index was obtained from Ken http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data library.html 23 French’s data library. and in consequence is subject to a high degree of uncertainty.13 Nevertheless, it is reassuring that the direct and indirect estimates of α in this application tend to move together closely. Figure 3: Direct (α̂d ) and indirect (α̃) estimates of cross-sectional exponent of the market factor (using excess returns on S&P 500 securities) based on 10-year and 5-year rolling samples 1.01 1.01 0.99 0.99 0.97 0.97 0.95 0.95 0.93 0.93 0.91 0.91 0.89 0.89 0.87 0.87 0.85 0.85 0.83 10-year indirect α estimates (ᾶ ) Sep-89 Aug-90 Jul-91 Jun-92 May-93 Apr-94 Mar-95 Feb-96 Jan-97 Dec-97 Nov-98 Oct-99 Sep-00 Aug-01 Jul-02 Jun-03 May-04 Apr-05 Mar-06 Feb-07 Jan-08 Dec-08 Nov-09 Oct-10 Sep-11 Sep-89 Jun-90 Mar-91 Dec-91 Sep-92 Jun-93 Mar-94 Dec-94 Sep-95 Jun-96 Mar-97 Dec-97 Sep-98 Jun-99 Mar-00 Dec-00 Sep-01 Jun-02 Mar-03 Dec-03 Sep-04 Jun-05 Mar-06 Dec-06 Sep-07 Jun-08 Mar-09 Dec-09 Sep-10 Jun-11 0.83 10-year direct α estimates (1% significance level) 5-year indirect α estimates (ᾶ) 5-year direct α estimates (1% significance level) There is also a further consideration when comparing the estimates of α and αd . Under CAPM the errors, uit , in (39) are assumed to be cross-sectionally weakly correlated, namely that the cross-sectional exponent of the errors, say αu , must be ≤ 1/2. But this need not be the case in reality. Although we do not observe uit , under CAPM the OLS residuals from regressions of rit − rf t on rmt − rf t , denoted by ûit , provide an accurate estimate of uit up to Op (T −1/2 ), and can be used to compute consistent estimates of αu .14 The bias-adjusted estimates of αu , denoted by α̃u , and computed using standardized residuals over 5-year and 10-year rolling samples, are displayed in Figure 4. Interestingly enough, these estimates, although much smaller than those estimated using excess returns, nevertheless tend to be larger than the threshold value of 1/2, suggesting the presence factors other than the market factor influencing individual security returns. The influence of residual factor(s) is rather weak initially (around 0.60), but starts to rise in the years leading to the dot-com bubble and reaches the pick of 0.80 in the middle of 2000 and stays at around that level for the period up to 2006 − 2008 (depending on whether the 10-year or 5-year rolling samples are used), then begins to fall significantly after the start of the recent financial crisis, and currently stands at around 0.63. Although, special care must be exercised when interpreting these estimates (both because αu is estimated using residuals and the fact that α̃ tends to be biased upward particularly when α < 0.75), nevertheless their patterns over time are indicative of some departures from CAPM during the period 1999 − 2006. Also, it is interesting that the rolling estimates of αu tend to move in opposite directions to the estimates of α computed over the same rolling samples. Weakening of the market factor tends to coincide with strengthening of the residual factor(s), thus suggesting that correlations across returns could remain high even during 13 The distribution theory of the direct estimator of α is complicated by the cross dependence of the errors in the underlying CAPM regressions and its consideration is outside the scope of the present paper. 14 A formal proof and analysis when α is estimated from regression residuals is beyond the scope of the present paper. 24 periods where the cross-sectional exponent of the dominant factor is relatively low, once the presence of multiple factors with exponents exceeding 0.5 is acknowledged. Figure 4: Estimates of cross-sectional exponent of residuals (α̃u ) from CAPM regressions using 5-year and 10-year rolling samples Sep-89 Jul-90 May-91 Mar-92 Jan-93 Nov-93 Sep-94 Jul-95 May-96 Mar-97 Jan-98 Nov-98 Sep-99 Jul-00 May-01 Mar-02 Jan-03 Nov-03 Sep-04 Jul-05 May-06 Mar-07 Jan-08 Nov-08 Sep-09 Jul-10 May-11 0.81 0.79 0.77 0.75 0.73 0.71 0.69 0.67 0.65 0.63 0.61 0.59 0.57 0.55 5-year estimates of α associated with residuals of market factor 10-year estimates of α associated with residuals of market factor 7 Conclusions Cross-sectional dependence and the extent to which it occurs in large multivariate data sets is of great interest for a variety of economic, econometric and financial analyses. Such analyses vary widely. Examples include the effects of idiosyncratic shocks on aggregate macroeconomic variables, the extent to which financial risk can be diversified by investing in disparate assets or asset classes and the performance of standard estimators such as principal components when applied to data sets with unknown collinearity structures. A common characteristic of such analyses is the need to quantify cross-sectional dependence especially when it is prevalent enough to materially affect the outcome of the analysis. In this paper we propose a relatively simple method of measuring the extent of inter-connections in large panel data sets in terms of a single parameter that we refer to as the exponent of cross-sectional dependence. We find that this exponent can accommodate a wide range of cross-sectional dependence manifestations while retaining its simple and tractable form. We propose consistent estimators of the cross-sectional exponent and derive their asymptotic distribution under plausible conditions. The inference problem is complex, as it involves handling a variety of bias terms and, from an econometric point of view, has noteworthy characteristics such as nonstandard rates of convergence. We provide a feasible and relatively straightforward estimation and inference implementation strategy. A detailed Monte Carlo study suggests that the estimated measure has desirable small sample properties. We apply our measure to three widely analysed classes of data sets. In all cases, we find that the results of the empirical analysis accord with prior intuition. For example, in the case of cross country applications we obtain larger estimates for the cross-sectional exponent of equity returns as compared to those estimated for cross country output growths and inflation. For individual securities 25 in S&P 500 index, the estimates of cross-sectional exponents are systematically high but not equal to unity, a widely maintained assumption in the theoretical multi-factor literature. We conclude by pointing out some of the implications of our analysis for large N factor models of the type analysed by Bai and Ng (2002), Bai (2003), and Stock and Watson (2002). This literature assumes that all factors have the same cross-section exponent of α = 1, which, as our empirical applications suggest, may be too restrictive, and it is important that implications of this assumption’s failure are investigated. Chudik, Pesaran, and Tosetti (2011), Kapetanios and Marcellino (2010) and Onatski (2011) discuss some of these implications, namely that when 1/2 < α < 1, factor estimates are consistent but their rates of convergence are different (slower) as compared to the case where α = 1, and in particular their asymptotic distributions may need to be modified. Methods used to determine the number of factors in large data sets, discussed in, e.g., Bai and Ng (2002), Onatski (2009) and Kapetanios (2010), are invalid and will select the wrong number of factors, even asymptotically.15 Finally, the use of estimated factors in regressions for forecasting or other modelling purposes might not be justified under the conditions discussed in Bai and Ng (2006). References Bai, J. (2003): “Inferential Theory for Factor Models of Large Dimensions,” Econometrica, 71(1), 135–171. Bai, J., and S. Ng (2002): “Determining the Number of Factors in Approximate Factor Models,” Econometrica, 70(1), 191–221. (2006): “Confidence Intervals for Diffusion Index Forecasts and Inference for Factor-Augmented Regressions,” Econometrica, 74(4), 1133–1150. Bai, Z. D., and J. W. Silverstein (1998): “No Eigenvalues Outside the Support of the Limiting Spectral Distribution of Large Dimensional Sample Covariance Matrices,” Annals of Probability, 26(1), 316–345. Cesa-Bianchi, A., M. H. Pesaran, A. Rebucci, and T. Xu (2012): “China’s Emergence in the World Economy and Business Cycles in Latin America,” Forthcoming in Economia, Journal of the Latin American and Caribbean Economic Association. Chamberlain, G. (1983): “Funds, Factors and Diversification in Arbitrage Pricing Theory,” Econometrica, 51(5), 1305–1323. Chudik, A., M. H. Pesaran, and E. Tosetti (2011): “Weak and Strong Cross Section Dependence and Estimation of Large Panels,” Econometrics Journal, 14(1), C45–C90. Davidson, J. (1994): Stochastic Limit Theory. Oxford University Press. Dees, S., F. di Mauro, M. H. Pesaran, and L. V. Smith (2007): “Exploring the International Linkages of the Euro Area: a Global VAR Analysis,” Journal of Applied Econometrics, 22(1), 1–38. Eklund, J., G. Kapetanios, and S. Price (2010): “Forecasting in the Presence of Recent Structural Change,” Working Paper 406, Bank of England. Forni, M., M. Hallin, M. Lippi, and L. Reichlin (2000): “The Generalised Dynamic Factor Model: Identification and Estimation,” Review of Economics and Statistics, 82(4), 540–554. Forni, M., and M. Lippi (2001): “The Generalised Dynamic Factor Model: Representation Theory,” Econometric Theory, 17(6), 1113–1141. Gabaix, X. (2011): “The Granular Origins of Aggregate Fluctuations,” Econometrica, 79(3), 733–772. Hachem, W., P. Loubaton, and J. Najim (2005a): “The Empirical Eigenvalue Distribution of a Gram Matrix: From Independence to Stationarity,” Markov Processes and Related Fields, 11(4), 629–648. 15 It is interesting to note that another contribution to this literature (Onatski (2010)) does not assume strong factors and, therefore, the suggested method will be valid in our framework. Also, Kapetanios and Marcellino (2010) suggest modifications to the methods of Bai and Ng (2002) that enable their use in the presence of weak factors. 26 (2005b): “The Empirical Eigenvalue Distribution of a Gram Matrix with a Given Variance Profile,” Annales de l’Institut Henri Poincare, 42(6), 649–670. Kapetanios, G. (2010): “A Testing Procedure for Determining the Number of Factors in Approximate Factor Models with Large Datasets,” Journal of Business and Economic Statistics, 28(3), 397–409. Kapetanios, G., and M. Marcellino (2010): “Factor-GMM Estimation with Large Sets of Possibly Weak Instruments,” Computational Statistics and Data Analysis, 54(11), 2655–2675. Lintner, J. (1965): “The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets,” Review of Economics and Statistics, 47(1), 13–37. Onatski, A. (2009): “Testing Hypotheses about the Number of Factors in Large Factor Models,” Econometrica, 77(5), 1447–1479. (2010): “Determining the Number of Factors Form Empirical Distribution of Eigenvalues,” Review of Economics and Statistics, 92(4), 1004–1016. (2011): “Asymptotics of the Principal Components Estimator of Large Factor Models with Weakly Influential Factors,” Working paper, University of Cambridge. Pesaran, M. H. (2006): “Estimation and Inference in Large Heterogeneous panels with a Multifactor Error Structure,” Econometrica, 74(4), 967–1012. Pesaran, M. H., T. Schuermann, and S. Weiner (2004): “Modeling Regional Interdependencies Using a Global Error-Correcting Macroeconometric Model,” Journal of Business and Economic Statistics, 22(2), 129–162. Pesaran, M. H., and T. Yamagata (2012): “Testing CAPM with a Large Number of Assets,” Forthcoming Working Paper, University of Cambridge. Roll, R. (1977): “A Critique of the Asset Pricing Theory’s Tests Part I: On Past and Potential Testability of the Theory,” Journal of Financial Economics, 4(2), 129–176. Ross, S. A. (1976): “The Arbitrage Theory of Capital Asset Pricing,” Journal of Economic Theory, 13, 341–360. Sharpe, W. (1964): “Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk,” Journal of Finance, 19(3), 425–442. Stock, J. H., and M. W. Watson (2002): “Macroeconomic Forecasting Using Diffusion Indexes,” Journal of Business and Economic Statistics, 20(1), 147–162. Stout, W. F. (1974): Almost Sure Convergence. Academic Press. Yin, Y. Q., Z. D. Bai, and P. R. Krishnaiah (1988): “On the Limit of the Largest Eigenvalue of the Large Dimensional Sample Covariance Matrix,” Probability Theory and Related Fields, 78(4), 509–521. Appendix I: Statement of Lemmas ∞ Lemma 1 Under Assumptions 2 and 3, ft , {uit }∞ t=1 , and {uit }i=1 are Lr -bounded, L2 -NED processes of size −ζ, for ∞ some r > 2. This result holds uniformly over i, in the case of {uit }∞ t=1 , and over t, in the case of {uit }i=1 . Lemma 2 Under Assumptions 2 and 3, T √ 1 X 2 √ N ūt →d N (0, σ̄∞ ), ft − f¯ T t=1 where 2 σ̄∞ PN P∞ 2 i=1 σi + j=1,j6=i σij , = lim N →∞ N σi2 = E(u2it ), and σij = E(uit uit−j ). 27 (40) Lemma 3 Under Assumption 3, T 2 √ 2 1 X √ √ N ūt − E N ūt →d N (0, V ), 2T t=1 where V = lim N →∞ ! ∞ 2 X 2 √ 2 √ √ . V ar N ūt + N ūt , N ūt−j Cov j=1 2 −1 2 Lemma 4 Under Assumptions 1-3, σ̄c + Op (N T )−1/2 . N − σ̄∞ = Op T p 2 min(N α , T ) ln(s2f v̄N ) − ln(σf2 µ2v ) →d N (0, ω) . p 2 Lemma 6 Under Assumption 6 and Assumptions 2-3, min(N, T ) ln(s2f v̄N ) − ln(σf2 µ2v ) →d N (0, ω) . Lemma 5 Under Assumptions 1-2, Lemma 7 Under Assumptions 1-3, and as long as α > 4/7, p min(N α , T ) 2 2 σ̄c σ̄N − N2 2 2 2α−1 N v̄N sf N σ̂x̄ ! = Op p min(N α , T )N 2−4α . Lemma 8 Under Assumptions 1-2, and as long as α > 1/2, 2 2 σ̄c σ̄N − N2 2 2 2α−1 N v̄N sf N σ̂x̄ p min(N α , T ) ln(N ) 2 σ̄c 1 + N2 N σ̂x̄ !! = op (1) . oN n oN oN n n (s) (s) (s) (s) (s) be the reorderings of {v̂i }N δ̂i where v̂i ≥ v̂i+1 , ∀i and δ̂i ≥ and δ̂i Lemma 9 Let v̂i i=1 and i=1 i=1 i=1 (s) δ̂i+1 , ∀i. Under Assumptions 1-3, and assuming that limT,N →∞ T −1 N α < ∞, we have N 2α−2 N Pα (s) v̂i − i=1 Nα N Pα 1 Nα (s) 2 v̂i i=1 − −1 σv2 = op (1) . µ2v (41) If σf2 µ2v = 1, then N Pα i=1 (s) δ̂i − 1 Nα N Pα (s) 2 δ̂i i=1 − Nα − 1 σv2 = op (1) . µ2v Lemma 10 Under Assumptions 1-2, we have V̂f 2 − Vf 2 = op (1), as long as l → ∞, l = o (T ) and l = o N α−1/2 T 1/2 . Lemma 11 Under Assumptions 1-2, and assuming αj = α, for all j = 1, ..., m, p min(N α , T ) ln(v̄ 0N S f f v̄ N ) − ln(µ0v Σf f µv ) →d N (0, ωm ) , 0 where µv = E (v i ), Σf f = E f t − µf f t − µf , 2 2 2 ωm = lim min (N α , T ) E v̄ 0N f¯T − µ0v µf − E v̄ 0N f¯T − µ0v µf , N,T →∞ µf = E (f t ) and f¯T = 1 T PT t=1 f t. Lemma 12 Under Assumptions 1-2, and assuming α > α2 > ... > αm , p min(N α , T ) ln(v̄ 0N S f f v̄ N ) − ln(µ0v Σf f µv ) →d N (0, ω) . Lemma 13 Under either Assumption 1 or 6 and Assumptions 2-3, and α > α2 ≥ α3 ≥ ... ≥ αm , p 2 min (N α , T ) ln (N ) ln(µ0v D N Σf f D N µv ) − ln µ21v σ1f = o (1) , if either α2 − α < −0.25 or, if T b = N , α2 < 3α/4 and b> 1 . 4 (α − α2 ) (42) Lemma 14 Let βi = N α−1 vi , 1/2 < α ≤ 1, where vi = vN i = ṽi + cN i and {ṽi }N i=1 is an i.i.d. sequence of random PN variables with mean µv 6= 0, and variance σv2 < ∞. Let c̄N = N1 c . Under Assumptions 2-3, (a) α̂, α̃ and α̌ are N i i=1 √ consistent estimators of α, if c̄N = op (N c ) for all c > 0, (b) Corollary 2 holds, if N c̄N = op (1). 28 Appendix II: Proofs of Theorems Proof of Theorem 1 We start by noting that !2 T T T 1 X 1 X 1 X 2 = x̄t − x̄t = x̄t − x̄2 , T t=1 T t=1 T t=1 P where x̄t = β̄N ft + ūt , and x̄ = T −1 Tt=1 x̄t = β̄N f¯ + ū. Further, let σ̂x̄2 N X Kρ = KN ρ = βi = ρ j=M +1 1 − ρ(N −M ) 1−ρ ; M = [N α ]. (43) Then we have " σ̂x̄2 = s2f = 2 2 β̄N sf + 2β̄N # " # T T 1 X 1 X 2 2 ¯ ft − f ūt + ūt − ū , T t=1 T t=1 T 2 1 X ft − f¯ →p σf2 > 0, as T → ∞. T t=1 Therefore i i h P ft − f¯ ūt + T1 Tt=1 ū2t − ū2 . ln = + ln 1 + 2 2 sf β̄N P But under Assumption 1, β̄N = (M/N )v̄N + N −1 Kρ , where v̄N = M −1 M i=1 vi , we have 2 Kρ sf Kρ 2 2 2 ln(β̄N sf ) = ln N α−1 v̄N sf + = 2(α − 1) ln(N ) + ln(s2f v̄N ) + 2 ln 1 + α N N v̄N σ̂x̄2 2β̄N 2 2 ln(β̄N sf ) h 1 T PT t=1 2 = 2(α − 1) ln(N ) + ln(s2f v̄N ) + Op (N −α ). Hence, recalling from (14) that α̂ = 1+ ln(σ̂x̄2 )/2 ln (N ),we have 2 2 ln(N ) (α̂ − α) − ln(s2f v̄N ) = ln 1 + 2β̄N h 1 T PT t=1 i i h P ft − f¯ ūt + T1 Tt=1 ū2t − ū2 + Op (N −α ). 2 2 β̄N sf (44) However, h P i i h P i h P i h P 2β̄N T1 Tt=1 ft − f¯ ūt + T1 Tt=1 ū2t − ū2 2β̄N T1 Tt=1 ft − f¯ ūt + T1 Tt=1 ū2t − ū2 = + op (aT N ) , ln 1 + 2 2 2 2 sf sf β̄N β̄N (45) where 2β̄N aT N = h 1 T i h P ft − f¯ ūt + 1 T PT t=1 T t=1 ū2t − ū2 2 2 β̄N sf Consider the first term of the RHS of (45). We have, h P i 2β̄N T1 Tt=1 ft − f¯ ūt = 2 2 β̄N sf √2 TN h σf 1 √ PT T t=1 i . i √ ft − f¯ N ūt sf β̄N (sf /σf ) . We note that sf /σf = 1 + Op (T −1/2 ). But, by Lemma 2 (as N and T → ∞) T √ 1 X 2 √ ft − f¯ N ūt →p N (0, σ̄∞ ), σf T t=1 where σ 2∞ is as in (40). We need to determine the probability order of 1/β̄N . We note that 1 1 1 − α−1 = α−1 N v̄N β̄N N v̄N + Kρ N −1 = −N 1−2α Kρ v̄N −Kρ N −1 2 N α−1 v̄N N 2α−2 v̄N + Kρ N α−2 v̄N −1 v̄N + Kρ N −α = Op N 1−2α , − 29 1 = (46) and hence 2β̄N h 1 T PT t=1 i ft − f¯ ūt = 2 2 β̄N sf √2 TN h σf 1 √ T PT t=1 i √ N ūt ft − f¯ sf β̄N (sf /σf ) −1/2 1/2−α = Op T N + Op T −1/2 N 1/2−2α . (47) Consider now the second term on the RHS of (45). We have −N α−2 Kρ v̄N − N −2 Kρ2 1 1 − = = 2 4 3 2 2 N 2α−2 v̄N N 4α−4 v̄N + N 3α−4 Kρ v̄N + N 2α−4 Kρ2 v̄N β̄N 3 2 2 − N 2−3α Kρ v̄N + N 2−4α Kρ2 v̄N v̄N + N −α Kρ v̄N + N −2α Kρ2 = Op N 2−3α . √ Note that since, by Lemma 1 and Theorems 17.5 and 19.11 of (Davidson, 1994, Theorem 15.18), N T ū = Op (1), then, 2 2 −1/2 2 since sf /σf = 1 + Op (T ) and 0 < σf < ∞, √ 2 N T ū ū2 N α−1 v̄N + Kρ N = 2 s2f NT N α−1 v̄ N + Kρ N 2 = Op T −1 N 1−2α . (48) s2f Similarly, √ 2 √ 2 i √ o h√ PT σ̄N N ūt 1 2 2 2 √ √ − 1 + T 2 1 T σ̄N t=1 σ̄N N T T t=1 ( N ūt ) − σ̄N + N T t=1 ūt T = = 2 2 2 K K K N α−1 v̄N + Nρ s2f N α−1 v̄N + Nρ s2f N α−1 v̄N + Nρ s2f √ 2 2 PT σ̄N N ūt √ √1 − 1 2 t=1 σ̄N N T T σ̄N = + 2 2 . K K N α−1 v̄N + Nρ s2f N N α−1 v̄N + Nρ s2f 1 √ PT n √1 T PT Note that 2 σ̄N N N α−1 v̄ But, by Lemma 3, N + Kρ N 2 − s2f 2 σ̄N 2 2 2α−1 N v̄N sf = Op (N 1−3α ). (49) " √ # 2 T 1 X N ūt √ − 1 →d N (0, 1), σ̄N 2T t=1 and 2 σ̄N √ N T √1 T PT √ N ūt σ̄N t=1 N α−1 v̄N + Kρ N 2 2 −1 = Op (T −1/2 N 1−2α ) + Op (T −1/2 N 1−3α ). (50) s2f Therefore, collecting all results derived above, and keeping the highest order terms of the RHS of (44), (47), (48), (49) and (50), we have 2 2 ln(N ) (α̂ − α) − ln(s2f v̄N )− 2 σ̄N 2 2 2α−1 N v̄N sf = Op max T −1/2 N 1/2−α , T −1 N 1−2α , T −1/2 N 1−2α , N 1−3α , N −α . Since α > 1/2, in the first instance this implies that α̂ − α = Op 1 ln(N ) , which establishes the consistency of α̂ as an estimate of α as N and T → ∞, in any order. Consider now the derivation of the asymptotic distribution of α̂. We have h i √ PT 1 √2 √ ft − f¯ N ūt 2 t=1 T N σ T σ̄N f 2 ln(N ) (α̂ − α) − 2α−1 2 2 = ln(s2f v̄N )+ + N v̄N sf sf N α−1 v̄N (sf /σf ) √ 2 2 √ 2 PT σ̄N N ūt √ √1 − 1 N T ū t=1 σ̄N N T T + + Op (N −α ). 2 2 K K ρ ρ 2 2 N T N α−1 v̄N + N sf N α−1 v̄N + N sf 30 (51) 2 We first examine ln(s2f v̄N ). By Lemma 5 we have p 2 min(N α , T ) ln(s2f v̄N ) − ln(σf2 µ2v ) →d N (0, ω) . Further, since α > 1/2, p min(N α , T ) h √2 TN i √ ¯ N ū f − f p t t t=1 α , T )T −1/2 N 1/2−α = o (1). = O min(N p p sf N α−1 v̄N (sf /σf ) 1 √ σf T Similarly, √ p min(N α , T ) PT N T ū N T N α−1 v̄N + and p min(N α , T ) 2 2 σ̄N √ √1 N T T PT Kρ N √ t=1 N α−1 v̄N 2 s2f p min(N α , T )T −1 N 1−2α = op (1), = Op −1 p = Op min(N α , T )T −1/2 N 1−2α = op (1). 2 K + Nρ s2f N ūt σ̄N 2 Thus, p where min(N α , T ) ln(N ) (α̂ − ∗ αN ) σ̄ 2 − 2α−1N 2 2 N v̄N sf ! →d N (0, ω) , ∗ = α+ ln (σ 2 µ2 )/ ln (N ). αN f v Proof of Theorem 2 We need to show that as long as either α > 4/7 or T 1/2 /N 4α−2 p → 0, min(N α , T ) ln(N ) 2 σ̄N 2 s2 N 2α−1 v̄N f − d 2 σ̄ N 2 N σ̂x̄ = op (1) . The result follows immediately by Lemma 7. Proof of Theorem 3 The result follows immediately by Lemma 8. Proof of Theorem 4 The proof follows immediately from Lemmas 9 and 10. Proof of Theorem 5 Under the general factor model, we have 0 " 0 σ̂x̄2 = β̄ N S f f β̄ N + 2β̄ N # " # T T 1 X 1 X 2 f t − f¯ ūt + ūt − ū2 , T t=1 T t=1 where Sf f = T 0 1 X f t − f¯ f t − f¯ →p Σf f > 0, as T → ∞. T t=1 We have that 0 β̄N Sf f β̄N = N 2α−2 v̄ 0N DN Sf f DN v̄ N + 2N α−2 v̄ 0N DN Sf f K ρ + N −2 K 0ρ Sf f K ρ = N 2α−2 v̄ 0N DN Sf f DN v̄ N + O N α−2 . So 0 ln β̄N Sf f β̄N = ln N 2α−2 v̄ 0N DN Sf f DN v̄ N + 2N α−2 v̄ 0N DN Sf f K ρ + N −2 K 0ρ Sf f K ρ = 2N −α v̄ 0N DN Sf f K ρ + N −2α K 0ρ Sf f K ρ 2(α − 1) ln (N ) + ln v̄ 0N DN Sf f DN v̄ N + ln 1 + . v̄ 0N DN Sf f DN v̄ N Then, 31 h 0 0 2 ln σ̂x̄ = ln(β̄ N S f f β̄ N ) + ln 1 + 2β̄ N 1 T i i h P f t − f¯ ūt + T1 Tt=1 ū2t − ū2 + Op N −α , 0 β̄ N S f f β̄ N PT t=1 ln σ̂x̄2 = 2(α − 1) ln(N ) + ln(v̄ 0N D N S f f D N v̄ N ) i h P i h P 0 2β̄ N T1 Tt=1 f t − f¯ ūt + T1 Tt=1 ū2t − ū2 + Op N −α . + ln 1 + 0 β̄ N S f f β̄ N So 0 2 ln(N ) (α̂ − α) − ln(v̄ 0N D N S f f D N v̄ N ) 2β̄ N = ln 1 + h 1 T PT t=1 i f t − f¯ ūt 0 h + 1 T PT t=1 ū2t − ū2 0 β̄ N S f f β̄ N β̄ N S f f β̄ N i + Op N −α , or 0 2 ln(N ) (α̂ − α) − ln(v̄ 0N D N S f f D N v̄ N ) = where 0 BN,T = 2β̄ N 2β̄ N h 1 T PT t=1 i f t − f¯ ūt 0 h + PT 2 2 t=1 ūt − ū 1 T 0 β̄ N S f f β̄ N h 1 T PT t=1 i + Op N −α + op (BN,T ), β̄ N S f f β̄ N i f t − f¯ ūt 0 h + β̄ N S f f β̄ N 1 T PT t=1 ū2t − ū2 i . 0 β̄ N S f f β̄ N Similarly to (47), 0 2β̄ N h 1 T PT t=1 i f t − f¯ ūt 0 = Op T −1/2 N 1/2−α + Op T −1/2 N 1/2−2α . β̄ N S f f β̄ N Also h 1 T PT 2 2 t=1 ūt − ū i 0 β̄ N S f f β̄ N where √ 2 N ūt 2 √1 PT σ̄N − 1 2 t=1 σ̄ T N σ̄N + Op (N 1−2α T −1 ), = √ + 0 N 2α−1 v̄ N D N S f f D N v̄ N T N 2α−1 v̄ 0N D N S f f D N v̄ N √ 2 N ūt 2 √1 PT σ̄N − 1 t=1 σ̄ T N −1/2 1−2α √ = O N . p T T N 2α−1 v̄ 0N D N S f f D N v̄ N So, 2 ln(N ) (α̂ − α)−ln(v̄ 0N D N S f f D N v̄ N )− 2 σ̄N 0 2α−1 N v̄ N D N S f f D N v̄ N = Op max T −1/2 N 1/2−α , T −1 N 1−2α , T −1/2 N 1−2α , N 1−3α , N −α Using the above derivations gives straightforward extensions of Lemmas 7 and 8. Using these, we get ! p 2 2 p σ̄c σ̄N N α min(N , T ) − = Op min(N α , T )N 2−4α 0 2 2α−1 N v̄ N D N S f f D N v̄ N N σ̂x̄ and p min(N α , T ) ln(N ) 2 2 σ̄c σ̄N − N2 N 2α−1 v̄ 0N D N S f f D N v̄ N N σ̂x̄ 2 σ̄c 1 + N2 N σ̂x̄ !! = op (1) , which together with Lemmas 11, 12 and 13 prove the theorem. Appendix III: Proofs of Lemmas Proof of Lemma 1 By the Marcinkiewicz–Zygmund inequality (see, e.g., (Stout, 1974, Theorem 3.3.6)), r sup E(|uit | ) = sup E i i (∞ X l=0 ψil ∞ X !)r ! ξis vst−l ≤ c sup i s=−∞ 32 ∞ X l=0 ! |ψil | 2 sup i ∞ X s=−∞ |ξis | 2 !!r/2 sup E(|vit |r ) , i,t so uit is Lr -bounded if supi supt E(|vt |r ) < ∞ which holds by Assumption 2. Moreover, writing k·kr for the Lr -norm, we have, by Minkowski’s inequality, ! X ∞ ∞ X X X vi ) = sup sup uit − E(uit |Ft,|m| ψij ξis νst kvit k2 sup |ψij | sup |ξis | , ≤ sup 2 i i i,t i i j=m+1 j=m+1 |s|≥m |s|≥m 2 (52) νi is the σfield generated by {vis ; i, s ≤ t − m} ∪ {vis ; i, s ≥ t + m}. But, Assumption 2 for any integer m > 0 where Ft,|m| P ∞ ζ ζ P ∞ implies that supi limm→∞ m j=m+1 |ψij | = O (1) and supi limm→∞ m |s|≥m |ξis | = O (1) . Consequently {uit }t=1 ∞ and {uit }i=1 and Lr -bounded, L2 -NED processes of size −ζ, uniformly over i and t. Similarly, we can show that ft is an Lr -bounded (r ≥ 2) L2 -NED processes of size −ζ. Proof of Lemma 2 √ P We have √1T Tt=1 ft − f¯ N ūt = √1 T PT t=1 √ zt , where zt = ft − f¯ N ūt . We have that zt is a station- ary process such that E (zt ) = 0. We note that by Lemma 1 and Theorem 24.6 of Davidson (1994), we have that 2 √ PN 2 = N1 Further, by Theorem 17.8 of Davidson (1994), we have that sums of L2 -bounded, N ūt E i=1 σi < ∞. L2 -NED triangular √ arrays of size −ζ are L2 -bounded, L2 -NED triangular arrays of size −ζ as well, implying, given Lemma 1, that N ūt is an L2 -bounded, L2 -NED triangular arrays of size −ζ. Further, by the Marcinkiewicz–Zygmund inequality, !r ! ! !!r/2 ∞ ∞ N ∞ ∞ N r √ X X 1 X X 1 XX 2 2 ψil ξis vst−l ≤c |ψil | |ξis | sup E(|vit |r ) ≤ E( N ūt ) = E √ N N i=1 i,t s=−∞ s=−∞ i=1 l=0 l=0 (53) ! !!r/2 ∞ ∞ X X |ψil |2 sup |ξis |2 sup E(|vit |r ) < ∞. c sup i l=0 i i,t s=−∞ √ As a result, N ūt is a Lr -bounded, L2 -NED triangular arrays of size −ζ. √ Finally, since { N ūt } and {ft } are Lr -bounded (r ≥ 2) L2 -NED processes of size −ζ on a φ-mixing process of size −η (η > 1), then, by Example 17.17 of Davidson (1994), {zt } is L2 -NED of size −{ζ(ϕ − 2)}/{2(ϕ − 1)} ≤ −1/2 on a φ-mixing process of size −η. Since νit and νf t are i.i.d. processes they are also φ-mixing processes of any size. In view of Theorem 17.5(ii) of Davidson (1994), this in turn implies that {zt } is an L2 -mixingale of size −1/2, if 2η > ζ, which automatically holds by the i.i.d. property of νit and νf t . This, implies the result of the Lemma by Theorem 24.6 of (Davidson, 1994, Theorem 15.18). Proof of Lemma 3 √ By Lemma 2, N ūt is a Lr -bounded, L2 -NED triangular arrays of size −ζ. By Example 17.17 of Davidson (1994), and √ 2 (53), N ūt is Lr -NED of size −{ζ(ϕ − 2)}/{2(ϕ − 1)} ≤ −1/2, r > 4. Then, by Theorem 24.6 of (Davidson, 1994, Theorem 15.18), the result follows. Proof of Lemma 4 PN PT 2 −1 2 1 2 2 We need to show that σ̄c + Op (N T )−1/2 . We have that σ̄c N − σ̄∞ = Op T N = NT i=1 t=1 ûit , where PN PT PN PT PN PT 2 2 2 2 1 1 1 ûit = xit − δ̂i x̃t . Then, N T i=1 t=1 ûit = N T i=1 t=1 uit + N T i=1 t=1 ûit − uit , where uit = xit − PN PT 2 2 1 δi x̃t . Following similar lines to those of the proof of Lemma 3 wePhavePthat N T i=1 t=1 uit →p σ̄∞ . Further, P P N T −1/2 N T 2 2 2 2 1 1 . Next, we examine N T i=1 t=1 ûit − uit . It is sufficient to consider i=1 t=1 uit − σ̄∞ = O (N T ) NT PN PT x̃t 1 i=1 t=1 uit (ûit − uit ). Noting that N 1−α x̄t = Op (1), and denoting equality in order of probability by =p , we NT have N T N T X 1 XX 1 X uit (ûit − uit ) = δi − δ̂i x̃t uit =p N T i=1 t=1 N T i=1 t=1 ! N !2 !2 T T X X X 1 1 1 1 √ N 1−α x̄t uit −E √ N 1−α x̄t uit + PT 1 1−α x̄ )2 NT T t=1 T t=1 t t=1 (N i=1 T 33 !2 ! N T X X 1 1 1−α E √ N x̄t uit . PT 1 1−α x̄ )2 T t=1 t t=1 (N i=1 T 2 2 P N 1−α x̄t uit < ∞ and T1 Tt=1 N 1−α x̄t = Op (1), which implies that 1 + NT But E PT √1 T t=1 1 NT Further, 1 NT PT √1 T 1 T t=1 1 T 1 PT 1−α x̄ )2 (N t t=1 ! N X T X E √1 N 1−α x̄t uit T t=1 i=1 !2 = Op 1 T . 2 2 PT 1−α √1 N 1−α x̄t uit − E N x̄ u is a NED process over i, which implies that t it t=1 T 1 PT 1−α x̄ )2 (N t t=1 ! N X T X √1 N 1−α x̄t uit T s= i=1 !2 T 1 X −E √ N 1−α x̄t uit T s= !2 1 = Op √ , T N proving the required result. Proof of Lemma 5 We have that 2 ln(s2f v̄N ) − 2 ln(σf2 v∞ ) 2 s2f v̄N 2 2 σf v∞ = ln Op ! s2f σf2 = ln s2f − σf2 2 ! + ln + Op 2 v̄N 2 v∞ s2f − σf2 σf2 = 2 v̄N − µ2v 2 ! + 2 − µ2v v̄N µ2v + . But, under Assumption 2, √ T where f¯ = 1 T PT t=1 s2f − σf2 σf2 ! T o 2 1 X n = √ (ft − f¯)/σf − 1 →d N 0, Vf 2 , T t=1 ft , and Vf 2 = E 2 + √ Nα [(ft − µf )/σf ]2 − 1 ∞ X [(ft − µf )/σf ]2 − 1 Cov [(ft−i − µf )/σf ]2 − 1 . i=1 Further, recalling that v̄N = 1 Nα PN α i=1 vi , √ Nα 2 v̄N −µ2 v µ2 v = v̄N − µv µv √ v̄N +µv v N α v̄Nµ−µ . But µ v v →p 2, and 2 →d N v̄N +µv µv 0, σv µ2v . (54) 2 2 p 2 sf −σf v̄N −µ2 2 2 v ) →d N (0, ω) , where ω = Further, E = 0, implying that ) − ln(σf2 v∞ min(N α , T ) ln(s2f v̄N 2 µ2 σf v h i 2 α ,T ) min(N α ,T ) 4σv V limN,T →∞ min(N + . T Nα µ2 f2 v Proof of Lemma 6 The proof follows easily along the same lines as that of Lemma 5. In the present case under Assumption 6 we have √ v̄2 −µ2 √ v̄ −µv v̄ +µv √ v̄ −µv P v̄N +µv N N v̄N = N −1 N N Nµ2 v = N Nµv , and → →d p 2. Therefore, N i=1 vi , and thus µ µ µv v v v 2 σv N 0, µ2 . v Proof of Lemma 7 We need to show that p min(N α , T ) 2 2 σ̄c σ̄N − N2 2 2 2α−1 N v̄N sf N σ̂x̄ ! = op (1) . We have 2 2 2 2 2 2 σ̄c σ̄c σ̄c σ̄c σ̄N σ̄N N − = − 2α−1N 2 2 + 2α−1N 2 2 − N2 . 2 2 2 2 N 2α−1 v̄N sf N σ̂x̄2 N 2α−1 v̄N sf N v̄N sf N v̄N sf N σ̂x̄ 34 (55) But 2 2 σ̄c σ̄N −1/2 1−2α N − = O T N , p 2 2 2 2 N 2α−1 v̄N sf N 2α−1 v̄N sf which is negligible as a bias. Next, ! ! 2 2 2 σ̄c σ̄c σ̄c 1 1 2 2 N N N − = σ̄ − = σ̄N N 2 2 2 2 2 N 2α−1 v̄N sf N σ̂x̄2 σ̄N N 2α−1 v̄N sf N σ̂x̄2 2 σ̄c N 2 σ̄N ! ! 1 2 2 N 2α−1 v̄N sf 2 2 N 2α−1 N 2−2α σ̂x̄2 − v̄N sf 1 N σ̂x̄2 But by the proof of Theorem 1, we have 2 2 v̄N sf − N 2−2α σ̂x̄2 = Op T −1/2 N 1/2−α + Op T −1 N 1−2α + Op (N 1−3α ) + Op (N −α ) + Op N 1−2α . So 2 σ̄N 2 σ̄c N 2 σ̄N ! ! 1 2 2 N 2α−1 v̄N sf 2 2 N 2α−1 N 2−2α σ̂x̄2 − v̄N sf 1 N σ̂x̄2 = Op T −1/2 N 1/2−α N 1−2α + Op T −1 N 1−2α N 1−2α + Op (N 1−3α N 1−2α ) + Op (N −α N 1−2α ) + Op N 1−2α N 1−2α = Op T −1/2 N 3/2−3α + Op T −1 N 2−4α + Op (N 2−5α ) + Op (N 1−3α ) + Op N 2−4α . Therefore, as long as α > 4/7, (55) holds, proving the Lemma. Proof of Lemma 8 We have that 2 σ̄N 2 σ̄c N 2 σ̄N ! 1 2 2 sf N 2α−1 v̄N ! N 2α−1 N 1−2α T 1 X√ 2 ( N ūt )2 − σ̄c N T t=1 !! 1 N σ̂x̄2 = Op N 2−4α T −1/2 , which is negligible as long as α > 1/2. To see the above result note the following. We have !2 T N 1 X 1 X 2 √ uit − σ̄c N = T t=1 N i=1 !2 N T X X 1 1 2 √ uit − σ̄∞ + T t=1 N t=1 ! ! N T N T N T 1 XX 2 1 XX 2 1 XX 2 uit − ûit − uit . N T i=1 t=1 N T i=1 t=1 N T i=1 t=1 2 P PN P PT 2 2 2 √1 − σ̄∞ = Op (T −1/2 ) and σ̄∞ − N1T N But it is straightforward to show that T1 Tt=1 i=1 uit i=1 t=1 uit = N 2 P PN PT PT PN PT 2 2 −1/2 1 1 √1 u − û − u = o (T ). So, Op (T −1/2 ). Finally by Lemma 4, N1T N it p it it i=1 t=1 i=1 t=1 t=1 t=1 NT T N 2 σ̄∞ − −1/2 2 σ̄c ). N = Op (T Proof of Lemma 9 The first step in the proof is to show that the number of cross-sectional units that are misclassified, i.e., that are included in the variance calculation when their loading is not a function of any vi , is op (N α ). The first thing to note is that we abstract from the possibility that any vi = 0. By the fact that Pr (vi = 0) = 0, it follows that the number of units that have vi = 0 is op (N α ). Without loss of generality, we further assume that units whose loadings do not depend on any vi have zero loadings. The probability that op (N α ) units are misclassified is bounded from above by [N α ] C1 N −α+c Pr ∪i=1 {|v̂i − vi | > ε} ∪ ∪N , i=[N α ]+1 {|v̂i | > ε} [N α ] is an upper for some C1 > 0, some ε > 0, and any c > 0, where Pr ∪i=1 {|v̂i − vi | > ε} ∪ ∪N i=[N α ]+1 {|v̂i | > ε} bound for the probability that one unit is misclassified. But α [N X] [N α ] N Pr ∪i=1 {|v̂i − vi | > ε} ∪ ∪i=[N α ]+1 {|v̂i | > ε} ≤ Pr (|v̂i − vi | > ε) + i=1 N X Pr (|v̂i | > ε) . i=[N α ]+1 E (|v̂i −vi |2 ) E (|v̂i |2 ) But, by Markov’s inequality Pr (|v̂i − vi | > ε) ≤ = O T −1 , i = 1, ..., [N α ] and Pr (|v̂i | > ε) ≤ ε2 = ε2 O T −1 , i = [N α ] + 1, ..., N. Therefore, the probability that op (N α ) units are misclassified is o N 1−α+c T −1 for all 35 . c > 0. Since α > 1/2, this probability goes to zero as long as limT,N →∞ T −1 N α < ∞, proving the first step in the proof. Next, we show the first part of the Lemma assuming that we observe which units have non zero loadings. Recall that, assuming that units whose loadings do not depend on any vi have zero loadings, xit = v̄vNi N 1−α x̄t + uit . We analyse v̂i by defining it to be the estimated regression coefficient of the regression of xit on N 1−α x̄t rather than xit on x̄t . This (1) is clearly the case once the normalisation N 2α−2 is taken into account in (41). Let vi = µvvi and vN i = v̄vNi . We need to α N P P α 2 σv2 − µ2 = op (1) . We have v̂i − N1α N show that N α1−1 i=1 v̂i v i=1 α N N Pα 1 1 X v̂ − v̂i i N α − 1 i=1 N α i=1 !2 α N N Pα σ2 1 X 1 − v2 = α v̂i − α v̂i µv N − 1 i=1 N i=1 !2 − 2 N Pα Pα 1 1 N v − v + N i N i N α − 1 i=1 N α i=1 2 2 2 N N N Pα Pα Pα Pα Pα (1) Pα (1) 1 1 N 1 1 N 1 1 N σ2 (1) (1) v − v − v v v v − + − − v2 . N i N i N α − 1 i=1 N α i=1 N α − 1 i=1 i N α i=1 i N α − 1 i=1 i N α i=1 i µv But by the law of large numbers for i.i.d. random variables with finite variance 2 N Pα Pα (1) 1 1 N σ2 (1) vi − α vi − v2 = Op N −1/2 . α N − 1 i=1 N i=1 µv It is sufficient to show that α N N Pα 1 1 X v̂i v̂ − i N α − 1 i=1 N α i=1 and 2 N Pα Pα 1 1 N v − v = op (1) Ni Ni N α − 1 i=1 N α i=1 − (56) 2 2 N N Pα Pα Pα Pα (1) 1 N 1 1 N 1 (1) − = op (1) . v − v − v v N i N i N α − 1 i=1 N α i=1 N α − 1 i=1 i N α i=1 i For (56), it is sufficient that 1 N α −1 N Pα (v̂i − vN i ) = op (1) . Recall that xit = i=1 1 α N P 1 (v̂i − vN i ) = α N − 1 i=1 But !2 PN α PT i=1 t=1 N α −1 N Pα P T t=1 i=1 PT t=1 x̄t uit vi v̄N N 1−α x̄t + uit . So α N X T X 1 = x̄2t (57) (N α − 1) v̄N P T t=1 ft2 ft uit . i=1 t=1 P ft uit = Op (N α T )1/2 and Tt=1 ft2 = Op (T ) . So N Pα P T x̄ u t it t=1 i=1 = Op T −1 N −α (N α T )1/2 = Op T −1/2 N −α/2 = op (1) . P T 2 Nα − 1 t=1 x̄t For (57), it is sufficient that 1 N α −1 N Pα (1) vN i − vi = op (1) . We have, i=1 N Pα 1 − 1 i=1 Nα But 1 N α −1 N Pα vi = Op (1) . Also v̄1N − µ1v = i=1 1 v̄N µv (1) vN i − vi = 1 v̄N − 1 µv Nα N Pα vi i=1 −1 . (v̄N − µv ) = Op N −α/2 . So, overall 1 N −1 N Pα v̂i − i=1 1 N PN i=1 op (1) , proving the result. Proof of Lemma 10 We need to show V̂f 2 − Vf 2 = op (1). We have V̂f 2 − Vf 2 = V̂f 2 − V̄f 2 + V̄f 2 − Vf 2 where V̄f 2 T T 1 X 1 X qt = qt − T t=1 T t=1 !2 + l X j=1 T T 1 X 1 X qt−j − qt T t=j+1 T t=1 36 ! T 1 X qt − qt T t=1 !! , v̂i 2 σ2 − µv2 = v and qt = 2 ft −f¯ sf . But, by Theorem 25.3 of Davidson (1994) and Assumption 3, we have that V̄f 2 − Vf 2 = op (1), as long as l → ∞ and l = o(T ). Then, it is sufficient to examine T T T 1 X ft 1 X ft x̄t 1 X x̄t − x̃t = − = = ft − α−1 T t=1 sf T t=1 sf σ̂x̄ sf T t=1 N v̄N ! ! PN T T T T N N N v̄N N α−1 ft + N1 1 X N −α X X N −α X X N −α X X i=1 uit = ft − uit = uit + op uit . sf T t=1 N α−1 v̄N sf T t=1 i=1 σf T t=1 i=1 σf T t=1 i=1 P P P 1/2 . So, T1 Tt=1 sfft − x̃t = Op N 1/2−α T −1/2 . Thus, V̂f 2 −Vf 2 = Op lN 1/2−α T −1/2 , But, Tt=1 N i=1 uit = Op (N T ) proving the Lemma. Proof of Lemma 11 Without loss of generality we consider the case of two factors. The result extends straightforwardly to m factors. We further assume, for simplicity, that factors are independent from each other. Then, ! 2 2 v̄1N s21f + 2v̄1N v̄2N s12f + v̄2N s22f 2 2 2 2 ln v̄1N s21f + 2v̄1N v̄2N s12f + v̄2N s22f − ln σ1f µ21v + σ2f µ22v = ln . 2 2 σ1f µ21v + σ2f µ22v Then, ! 2 2 2 2 s22f s22f s21f + 2v̄1N v̄2N s12f + v̄2N s21f + 2v̄1N v̄2N s12f + v̄2N v̄1N v̄1N = ln −1= 2 2 2 2 2 2 2 2 σ1f µ1v + σ2f µ2v σ1f µ1v + σ2f µ2v 2 2 2 2 s22f − σ2f µ22v + 2v̄1N v̄2N s12f s21f − σ1f µ21v + v̄2N v̄1N = 2 2 µ22v σ1f µ21v + σ2f 2 2 2 2 2 2 2 2 2 2 2 2 − σ2f µ22v + 2µ1v µ2v s12f σ2f + v̄2N σ2f s22f − v̄2N µ21v + v̄2N − σ1f σ1f + v̄1N σ1f s21f − v̄1N v̄1N = 2 2 σ1f µ21v + σ2f µ22v 2 2 2 2 2 2 − µ21v − µ22v σ1f v̄1N µ22v s22f − σ2f σ2f v̄2N µ21v s21f − σ1f 2µ1v µ2v s12f + 2 2 + 2 2 + 2 2 + 2 2 . 2 2 2 2 2 2 µ22v µ22v µ22v µ22v µ22v σ1f µ21v + σ2f σ1f µ1v + σ2f σ1f µ1v + σ2f σ1f µ1v + σ2f σ1f µ1v + σ2f Note that v̄1N v̄2N s12f = v̄1N v̄2N s12f − v̄1N µ2v s12f + v̄1N µ2v s12f − v̄1N µ2v σ12f + v̄1N µ2v σ12f − 2µ1v µ2v σ12f = s12f v̄1N (v̄2N − µ2v ) + v̄1N µ2v (s12f − σ12f ) + σ12f µ2v (v̄1N − 2µ1v ) = (s12f − σ12f ) v̄1N (v̄2N − µ2v ) + σ12f v̄1N (v̄2N − µ2v ) + v̄1N µ2v (s12f − σ12f ) + σ12f µ2v (v̄1N − 2µ1v ) . But (s12f − σ12f ) v̄1N (v̄2N − µ2v ) = op (T −1/2 ), and σ12f = 0, and so (s12f − σ12f ) v̄1N (v̄2N − µ2v ) + σ12f v̄1N (v̄2N − µ2v ) + v̄1N µ2v (s12f − σ12f ) + σ12f µ2v (v̄1N − 2µ1v ) v̄1N = v̄1N µ2v s12f = µ1v µ2v s12f + op T −1/2 . µ1v Then, 2 2 2 s2if − σif µ2iv s2if − σif µ2iv σif = 2 2 , i = 1, 2, 2 2 2 2 σ1f µ21v + σ2f µ22v σ1f µ1v + σ2f µ22v σif 2 2 2 2 σif v̄iN − µ2iv v̄iN − µ2iv µ2iv σif = , i = 1, 2, 2 2 2 2 σ1f µ21v + σ2f µ22v σ1f µ21v + σ2f µ22v µ2iv 2 2 or if σ1f µ21v + σ2f µ22v = 1, 2 2 2 2 sif − σif µ2iv s2if − σif = µ2iv σif , i = 1, 2, 2 σif 2 2 2 2 2 2 2 v̄iN − µiv σif v̄iN − µiv = µiv σif , i = 1, 2. µ2iv Assuming loadings of factors and factors are independent of each other and across factors, gives ! ! T 2 o √ s2if − σif 2 1 X n 2 2 2 2 2 2 (4) µiv σif T = µiv σif √ (fit − f¯i )/σif − 1 →d N 0, µ2iv σif µi , i = 1, 2, 2 σif T t=1 37 (58) 2 µ2iv σif √ Nα 2 v̄iN − µ2iv µ2iv 2 = µ2iv σif √ Nα v̄iN − µiv µiv v̄iN + µiv µiv 2 2 2 →d N 0, 2σiv µiv σif , i = 1, 2. Further, T √ σ1f σ2f X f1t − f¯1 f2t − f¯2 2 2 . µ1v µ2v T s12f = µ1v µ2v √ →d N 0, µ21v σ1f µ22v σ2f σ σ 1f 2f T t=1 Further, by factor independence E T o 2 1 X n √ (fit − f¯i )/σif − 1 T t=1 ! T f2t − f¯2 1 X f1t − f¯1 √ σ1f σ2f T t=1 !! = 0, i = 1, 2. So r ! ! ! 2 2 √ s21f − σ1f √ s22f − σ2f √ min (N α , T ) 2 2 2 2 µ1v σ1f + µ2v σ2f + 2µ1v µ2v T s12f + T T 2 2 T σ1f σ2f r 2 2 √ √ min (N α , T ) v̄1N − µ21v v̄2N − µ22v 2 2 2 2 α α µ σ N + µ σ N →d 1v 1f 2v 2f Nα µ21v µ22v ! α ,T ) 2 2 2 2 (4) 2 2 (4) µ22v σ2f µ2 + 2µ21v σ1f 0, min(N µ21v σ1f µ1 + µ22v σ2f T N . α ,T ) 2 2 2 2 + min(N 2σ1v µ21v σ1f + 2σ2v µ22v σ2f Nα Proof of Lemma 12 Again, without loss of generality we look at the case of two factors. The result again extends straightforwardly. We further assume, for simplicity, that factors are independent from each other. Then, 2 2 2 2 µ21v + N 2(α2 −α) σ2f µ22v = ln v̄1N s21f + 2N α2 −α v̄1N v̄2N s12f + N 2(α2 −α) v̄2N s22f − ln σ1f ln 2 2 s22f s21f + 2N α2 −α v̄1N v̄2N s12f + N 2(α2 −α) v̄2N v̄1N 2 2 σ1f µ21v + N 2(α2 −α) σ2f µ22v ! . Then, similarly to the proof of Lemma 11 ln 2 2 s22f s21f + 2N α2 −α v̄1N v̄2N s12f + N 2(α2 −α) v̄2N v̄1N 2 2 2 2 2(α −α) 2 σ1f µ1v + N σ2f µ2v ! = 2 2 2 − µ21v µ21v s21f − σ1f σ1f v̄1N + + 2 2 2 2 σ1f µ21v + N 2(α2 −α) σ2f µ22v σ1f µ21v + N 2(α2 −α) σ2f µ22v 2 2 2 N 2(α2 −α) σ2f v̄2N − µ22v N 2(α2 −α) µ22v s22f − σ2f 2N α2 −α µ1v µ2v s12f + + 2 2 . 2 2 2 2 2 2 2 2 2 2(α −α) 2(α −α) 2 2 σ1f µ1v + N σ2f µ2v σ1f µ1v + N σ2f µ2v σ1f µ1v + N 2(α2 −α) σ2f µ22v Then, 2 2 2 s21f − σ1f µ21v s21f − σ1f µ21v σ1f = 2 2 , 2 2 2 2 σ1f σ1f µ21v + N 2(α2 −α) σ2f µ22v σ1f µ1v + N 2(α2 −α) σ2f µ22v 2 2 2 2 σ1f v̄1N − µ21v v̄1N − µ21v µ21v σ1f = , 2 2 2 2 µ21v σ1f µ21v + N 2(α2 −α) σ2f µ22v σ1f µ21v + N 2(α2 −α) σ2f µ22v 2 2 2 N 2(α2 −α) µ22v s22f − σ2f N 2(α2 −α) s22f − σ2f µ22v σ2f = , 2 2 2 2 2 σ2f σ1f µ21v + N 2(α2 −α) σ2f µ22v σ1f µ21v + N 2(α2 −α) σ2f µ22v 2 2 2 2 N 2(α2 −α) σ2f v̄2N − µ22v N 2(α2 −α) v̄2N − µ22v µ22v σ2f . = 2 2 2 2 2 µ22v σ1f µ21v + N 2(α2 −α) σ2f σ1f µ1v + N 2(α2 −α) σ2f µ22v µ22v (59) (60) √ √ But, then it is obvious that the Lemma holds since (59) and (60) are op (1) , when multiplied by min T , N α √ √ respectively, as well as min T , N α N α2 −α µ1v µ2v s12f . 38 Proof of Lemma 13 We analyse the population counterpart of ln (v̄ 0N DN Sf f DN v̄ N ) assuming for simplicity that Σf f is diagonal and α > α2 ≥ α3 ≥ ... ≥ αm . We have ! m X 2 2 ln(µ0v DN Σf f DN µv ) = ln µ21v σ1f + N 2(α2 −α) N 2(αj −α2 ) µ2jv σjf . j=2 Then, 2 ln(µ0v DN Σf f DN µv )−ln µ21v σ1f = ln 1 + N 2(α2 −α) Pm 2(αj −α2 ) 2 2 Pm 2(αj −α2 ) 2 2 µjv σjf µjv σjf j=2 N j=2 N = N 2(α2 −α) . 2 2 µ21v σ1f µ21v σ1f So p min(N α , T ) ln (N ) ln(v̄ 0N DN Sf f DN v̄ N ) − ln(µ0v DN Σf f DN µv ) = Pm 2 N 2(αj −α2 ) µ2jv σjf p p j=2 2 . min(N α , T ) ln (N ) ln(v̄ 0N DN Sf f DN v̄ N ) − ln(µ21v σ1f ) − min(N α , T ) ln (N ) N 2(α2 −α) 2 µ21v σ1f We need Pm N 2(α2 −α) j=2 2 N 2(αj −α2 ) µ2jv σjf 2 µ21v σ1f = o min (N α , T )−1/2 ln (N )−1 . p This holds if min(N a , T )N 2(α2 −α) = o(1). If T < N α then a sufficient condition for the above to hold is α2 − α < −0.25. Otherwise, the sufficient condition is α2 < 3α/4. But, this condition is implied by α2 − α < −0.25 as long as α ≤ 1. An alternative condition that relates to the relative rate of growth of N and T is that α2 < 3α/4 and T b = N and 1/(4b) + α2 − α < 0 or b > 1 4(α−α2 ) Proof of Lemma 14 We note that the first part of the Lemma holds if 2 ) ln(s2f v̄N = op (1) . ln(N ) (61) We have 2 ln(s2f v̄N ) = ln s2f + 2 ln (v̄N ) = ln s2f + 2 ln N 1 X ṽi + c̄N N i=1 ! . PN ṽi + c̄N = op (N c ) for all c > 0, which holds if c̄N = op (N c ) for all c > 0, proving √ the first part of the Lemma. For the second part of the Lemma we reconsider (54). We have N (v̄N − µv ) = √ PN √ √ 1 PN N N i=1 ṽi + c̄N − µv . But, N N1 →d N 0, σv2 . Therefore, N c̄N = op (1) is sufficient for i=1 ṽi − µv So (61) holds if 1 N i=1 the second part of the Lemma to hold. Appendix IV: Proof of consistency of joint estimation of α and κ Without loss of generality, we consider a simplified version of the estimation problem. For simplicity, we assume vi = 0 for n > [N α ]. By the first part of the proof of Lemma 9, we can disregard the matter of misclassification, when ordering the estimated factor loadings. We start by noting that !2 T T T 1 X 2 1 X 1 X 2 σ̂x̄n = x̄nt − x̄nt = x̄nt − x̄2n , T t=1 T t=1 T t=1 where x̄nt = β̄n ft + ūnt , x̄n = T 1 X x̄nt = β̄n f¯ + ūn . T t=1 We have that 39 " 2 σ̂x̄n = β̄n2 s2f + 2β̄n # " # T T 1 X 1 X 2 2 ¯ ft − f ūnt + ūnt − ūn , T t=1 T t=1 or 2 σ̂x̄n i i h P h P κ2 + v̄n2 s2f − κ2 + 2β̄n T1 Tt=1 ft − f¯ ūnt + T1 Tt=1 ū2nt − ū2n , if n ≤ [N α ] i i h h P , = α α T 2 2 2 1 ¯ ūnt + 1 PT ū2nt − ū2n , if n > [N α ] [N ] κ2 + [N ] v̄[N + 2 β̄ f − f α ] sf − κ t n t=1 t=1 n n T T since n 1X vi = n i=1 β̄n = v̄n , if [N α ] v̄[N α ] , n n ≤ [N α ] , if n > [N α ] since vi = 0 for n > [N α ]. Let i i h P h P v̄n2 s2f − κ2 + 2β̄n T1 Tt=1 ft − f¯ ūnt + T1 Tt=1 ū2nt − ū2n , if n ≤ [N α ] i i h P h P . ξn = α 2 2 2 [N ] v̄[N + 2β̄n T1 Tt=1 ft − f¯ ūnt + T1 Tt=1 ū2nt − ū2n , if n > [N α ] α ] sf − κ n Then, κ2 + ξn , if n ≤ [N α ] . (62) + ξn , if n > [N α ] P P But T1 Tt=1 ft − f¯ ūnt = Op n−1/2 T −1/2 , uniformly over n; T1 Tt=1 ū2nt = Op n−1 , uniformly over n; ū2n = 2 2 2 min N −α/2 , T −1/2 . Op n−1 , uniformly over n; v̄n2 s2f −κ2 = Op min n−1/2 , T −1/2 , uniformly over n.; and v̄[N α ] sf −κ = Op Therefore, ξn = op (1) , uniformly over n. (63) 2 σ̂x̄n = [N α ] 2 κ n We estimate ( 62) by NLLS. Define ( ξˆn = 2 σ̂x̄n − κ̂2 , if n ≤ [N α ] , [N α̂ ] − n κ̂2 , if n > [N α ] 2 σ̂x̄n where κ̂2 and α̂ are the NLLS estimators of κ2 and α, and ( κ2 − κ̂2 , if n ≤ [N α ] ˆ dn = ξn − ξn = . α [N α̂ ] [N ] 2 κ − n κ̂2 , if n > [N α ] n So, N N N N 1 X 2 1 X 2 1 X 1 X ˆ2 ξn = ξn + dn + ξn dn . N n=1 N n=1 N n=1 N n=1 By the definition of the NLLS estimator lim Pr n,T →∞ N N 1 X ˆ2 1 X 2 ξn ≤ ξn N n=1 N n=1 ! = 1. (64) If either p lim κ2 − κ̂2 6= 0 (65) p lim α̂ − α 6= 0, (66) n,T →∞ or n,T →∞ then 1 N PN n=1 d2n = Op (1). Further, by (63), N 1 X ξn dn = op (1) . N n=1 Therefore, if either p limn→∞ κ2 − κ̂2 6= 0 or p limn→∞ α̂ − α 6= 0, then lim Pr n,T →∞ N N 1 X ˆ2 1 X 2 ξn − ξn − C > 0 N n=1 N n=1 ! > ε, for some C > 0 and ε > 0. But this contradicts (64). Therefore, neither (65) nor (66) can hold, proving consistency. 40 41 Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE 100 200 500 1000 100 200 500 1000 0.12 0.64 0.25 0.80 0.54 1.21 1.08 1.77 0.09 0.90 0.18 1.07 0.58 1.52 1.02 2.02 0.10 1.20 0.06 0.58 0.16 0.72 0.35 1.06 0.72 1.48 0.03 0.84 0.08 0.99 0.37 1.36 0.67 1.78 0.04 1.13 0.04 1.35 0.04 0.54 0.10 0.66 0.23 0.96 0.52 1.31 -0.00 0.80 0.04 0.94 0.26 1.25 0.50 1.62 0.01 1.09 -0.00 1.30 0.02 0.51 0.07 0.61 0.16 0.88 0.40 1.19 500 -0.02 0.78 0.02 0.90 0.19 1.17 0.37 1.51 200 0.00 1.06 -0.03 1.25 0.10 1.54 1000 0.14 1.44 0.19 1.62 Bias RMSE 0.31 1.73 500 0.51 1.90 Bias RMSE 0.45 2.04 200 0.63 2.21 0.35 1.92 0.99 2.48 0.85 Bias RMSE 0.80 100 0.75 100 0.70 N\T α 0.01 0.49 0.05 0.58 0.12 0.82 0.31 1.09 -0.03 0.76 -0.00 0.87 0.13 1.11 0.27 1.41 -0.01 1.04 -0.05 1.23 0.06 1.48 0.25 1.83 0.90 0.00 0.48 0.04 0.56 0.08 0.77 0.23 1.01 -0.04 0.76 -0.01 0.85 0.09 1.07 0.18 1.33 -0.01 1.03 -0.07 1.22 0.02 1.45 0.16 1.77 0.95 -0.02 0.48 -0.00 0.55 -0.03 0.73 -0.03 0.93 -0.06 0.75 -0.05 0.84 -0.02 1.04 -0.08 1.28 -0.03 1.03 -0.11 1.21 -0.09 1.43 -0.07 1.72 1.00 1000 500 200 100 1000 500 200 100 1000 500 200 100 N\T Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- α 9.80 78.00 99.35 8.05 98.35 99.95 6.60 100.00 100.00 6.80 100.00 100.00 9.00 65.15 95.35 8.85 89.55 99.00 7.05 99.60 99.90 7.30 100.00 100.00 11.10 52.75 84.70 9.15 76.10 91.50 9.60 94.50 96.25 7.85 98.95 99.20 0.70 6.05 87.95 99.40 5.80 99.50 100.00 5.25 100.00 100.00 5.65 100.00 100.00 6.75 76.00 95.70 6.85 93.70 99.40 6.05 99.95 99.90 6.65 100.00 100.00 8.25 61.30 83.60 7.55 82.30 92.20 8.05 96.75 97.20 6.45 99.40 99.35 0.75 4.70 93.70 99.65 4.60 99.75 100.00 4.90 100.00 100.00 5.65 100.00 100.00 5.15 82.80 96.15 5.75 96.20 99.50 5.85 100.00 100.00 5.60 100.00 100.00 6.40 66.50 84.20 6.10 86.40 93.15 7.65 97.65 97.65 5.70 99.55 99.55 0.80 4.15 96.65 99.65 4.65 99.95 100.00 4.45 100.00 100.00 5.00 100.00 100.00 500 4.45 86.90 96.90 5.10 98.20 99.65 5.75 99.95 100.00 5.65 100.00 100.00 200 6.10 70.45 85.40 5.70 90.00 93.15 6.95 98.45 97.95 5.65 99.75 99.65 100 0.85 3.35 98.60 99.80 4.60 100.00 100.00 4.05 100.00 100.00 5.30 100.00 100.00 4.25 90.00 97.60 5.05 98.85 99.85 5.85 100.00 100.00 5.50 100.00 100.00 5.65 75.00 85.65 5.00 92.10 93.85 6.85 98.90 98.25 5.25 99.70 99.85 0.90 3.30 99.50 99.95 4.60 100.00 100.00 4.25 100.00 100.00 5.30 100.00 100.00 3.90 93.45 98.20 5.15 99.65 99.85 5.70 100.00 100.00 5.55 100.00 100.00 5.75 78.35 86.35 5.30 93.65 94.10 6.80 99.40 98.35 5.35 99.75 99.85 0.95 case of a serially uncorrelated factor and cross-sectionally independent idiosyncratic errors (β2i = 0, f1t ∼ IIDN (0, 1), uit ∼ IIDN (0, σi2 ), v1i ∼ IIDU (0.5, 1.5), N=100,200,500,1000 and T=100,200,500) Table A1: Bias, RMSE, size and power (×100) for the α̃ estimate of the cross-sectional exponent - 5.25 99.95 99.95 5.70 100.00 100.00 4.85 100.00 100.00 5.45 100.00 100.00 5.55 98.00 98.00 5.95 99.85 99.85 6.25 100.00 100.00 6.05 100.00 100.00 7.05 84.50 84.50 6.45 93.25 93.25 7.10 98.40 98.40 5.55 99.85 99.85 1.00 42 Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE 100 200 500 1000 100 200 500 1000 0.03 0.64 0.09 0.80 0.19 1.17 0.49 1.60 0.00 0.91 0.02 1.09 0.23 1.51 0.42 1.95 0.01 1.23 0.04 0.58 0.10 0.72 0.21 1.04 0.44 1.41 0.00 0.85 0.02 1.00 0.23 1.36 0.38 1.76 0.01 1.13 -0.02 1.37 0.03 0.54 0.08 0.66 0.18 0.95 0.39 1.28 -0.01 0.80 0.02 0.94 0.20 1.26 0.36 1.62 0.01 1.09 -0.02 1.30 0.02 0.51 0.07 0.61 0.14 0.88 0.34 1.19 500 -0.02 0.78 0.01 0.90 0.17 1.17 0.31 1.51 200 -0.00 1.06 -0.03 1.26 0.08 1.54 1000 -0.03 1.49 0.13 1.63 Bias RMSE 0.17 1.75 500 0.16 1.95 Bias RMSE 0.32 2.05 200 0.33 2.24 0.29 1.93 0.38 2.51 0.85 Bias RMSE 0.80 100 0.75 100 0.70 N\T α 0.01 0.49 0.05 0.58 0.11 0.82 0.29 1.09 -0.03 0.76 -0.00 0.87 0.13 1.12 0.24 1.41 -0.01 1.04 -0.05 1.24 0.05 1.49 0.22 1.84 0.90 0.00 0.48 0.04 0.56 0.08 0.77 0.22 1.01 -0.04 0.76 -0.02 0.85 0.09 1.07 0.17 1.33 -0.01 1.03 -0.07 1.22 0.02 1.45 0.15 1.77 0.95 -0.02 0.48 -0.00 0.55 -0.03 0.73 -0.03 0.93 -0.06 0.75 -0.05 0.84 -0.02 1.04 -0.09 1.28 -0.03 1.03 -0.11 1.21 -0.09 1.43 -0.08 1.73 1.00 1000 500 200 100 1000 500 200 100 1000 500 200 100 N\T Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- α 7.55 86.05 97.35 7.85 98.80 99.75 7.00 100.00 100.00 7.20 100.00 100.00 8.90 73.90 89.05 10.00 92.35 97.00 8.45 99.65 99.70 8.15 100.00 100.00 11.90 61.65 75.60 10.50 80.20 87.45 10.60 95.30 94.40 8.05 99.10 98.85 0.70 5.35 91.00 98.70 5.95 99.60 99.90 5.05 100.00 100.00 5.95 100.00 100.00 6.90 79.80 92.85 7.60 94.65 98.55 6.45 100.00 99.90 6.70 100.00 100.00 9.50 65.55 78.35 7.70 83.60 90.45 8.45 96.80 96.65 6.65 99.40 99.30 0.75 4.75 94.80 99.50 4.90 99.80 100.00 4.90 100.00 100.00 5.65 100.00 100.00 5.55 84.20 94.90 6.30 96.35 99.40 5.80 100.00 100.00 5.65 100.00 100.00 7.00 68.20 81.25 6.50 87.10 92.45 7.75 97.70 97.60 5.80 99.60 99.55 0.80 4.25 97.00 99.65 4.65 99.95 100.00 4.40 100.00 100.00 5.10 100.00 100.00 500 4.55 87.55 96.55 5.40 98.20 99.65 5.80 99.95 100.00 5.70 100.00 100.00 200 6.20 71.05 84.25 5.90 90.15 93.05 6.85 98.45 97.90 5.65 99.75 99.65 100 0.85 3.50 98.65 99.80 4.60 100.00 100.00 4.05 100.00 100.00 5.30 100.00 100.00 4.40 90.40 97.45 5.10 98.85 99.85 5.80 100.00 100.00 5.50 100.00 100.00 5.75 75.40 85.35 5.05 92.10 93.65 6.90 98.90 98.25 5.25 99.70 99.85 0.90 case of a serially uncorrelated factor and cross-sectionally independent idiosyncratic errors (β2i = 0, f1t ∼ IIDN (0, 1), uit ∼ IIDN (0, σi2 ), v1i ∼ IIDU (0.5, 1.5), N=100,200,500,1000 and T=100,200,500) 3.35 99.50 99.95 4.60 100.00 100.00 4.25 100.00 100.00 5.30 100.00 100.00 3.90 93.50 98.10 5.10 99.65 99.85 5.70 100.00 100.00 5.60 100.00 100.00 5.80 78.50 86.15 5.30 93.65 94.10 6.80 99.40 98.35 5.35 99.75 99.85 0.95 Table A2: Bias, RMSE, size and power (×100) for the α̌ estimate of the cross-sectional exponent - 5.30 99.95 99.95 5.70 100.00 100.00 4.85 100.00 100.00 5.45 100.00 100.00 5.55 97.95 97.95 5.95 99.85 99.85 6.25 100.00 100.00 6.05 100.00 100.00 7.10 84.50 84.50 6.45 93.25 93.25 7.10 98.40 98.40 5.55 99.85 99.85 1.00 43 Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE 100 200 500 1000 100 200 500 1000 0.10 0.74 0.19 0.90 0.54 1.27 1.03 1.85 0.04 1.03 0.12 1.23 0.37 1.60 0.91 2.17 -0.14 1.48 0.04 0.69 0.09 0.82 0.33 1.13 0.67 1.60 -0.02 0.98 0.01 1.17 0.17 1.49 0.53 1.95 -0.19 1.43 -0.17 1.59 0.01 0.66 0.03 0.78 0.20 1.04 0.49 1.43 -0.05 0.96 -0.04 1.13 0.05 1.40 0.34 1.82 -0.21 1.39 -0.23 1.56 -0.01 0.64 -0.01 0.74 0.14 0.99 0.37 1.33 500 -0.07 0.95 -0.06 1.11 -0.03 1.35 0.22 1.71 200 -0.22 1.38 -0.26 1.53 -0.11 1.86 1000 -0.07 1.66 -0.05 1.91 Bias RMSE 0.06 1.99 500 0.25 2.11 Bias RMSE 0.09 2.28 200 0.28 2.39 -0.04 2.21 0.66 2.56 0.85 Bias RMSE 0.80 100 0.75 100 0.70 N\T α -0.02 0.63 -0.03 0.72 0.08 0.94 0.26 1.24 -0.08 0.94 -0.09 1.10 -0.06 1.32 0.12 1.64 -0.22 1.36 -0.28 1.51 -0.16 1.82 -0.15 2.16 0.90 -0.03 0.61 -0.04 0.71 0.04 0.91 0.18 1.16 -0.09 0.93 -0.10 1.09 -0.10 1.29 0.02 1.59 -0.23 1.35 -0.29 1.50 -0.20 1.79 -0.23 2.12 0.95 -0.05 0.61 -0.08 0.70 -0.07 0.88 -0.07 1.10 -0.11 0.93 -0.15 1.09 -0.20 1.27 -0.23 1.56 -0.25 1.35 -0.34 1.50 -0.31 1.78 -0.48 2.14 1.00 1000 500 200 100 1000 500 200 100 1000 500 200 100 N\T Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- α 9.80 74.30 98.20 7.80 96.50 99.80 7.70 99.95 100.00 7.55 100.00 100.00 10.75 63.10 92.10 8.90 86.70 96.75 9.10 98.05 99.65 8.95 99.45 99.90 11.90 55.15 76.20 11.70 72.75 84.15 12.30 89.40 91.15 13.65 95.30 95.35 0.70 7.25 84.60 97.90 6.75 98.30 99.95 6.35 100.00 100.00 6.75 100.00 100.00 8.20 70.55 90.15 7.70 91.05 96.75 8.45 98.75 99.70 7.60 99.55 100.00 9.75 60.75 71.80 10.65 76.50 83.40 11.80 90.80 90.60 12.60 95.50 95.80 0.75 5.90 89.65 98.85 5.70 99.15 99.95 6.55 100.00 100.00 6.00 100.00 100.00 7.20 75.90 90.35 7.05 93.05 96.95 8.40 99.00 99.75 7.70 99.80 100.00 9.25 64.95 70.70 10.35 79.30 83.55 11.70 91.85 90.90 12.70 96.25 95.95 0.80 5.00 93.25 99.05 5.45 99.75 99.90 5.80 100.00 100.00 6.50 100.00 100.00 500 6.10 80.80 90.85 6.75 94.05 97.70 8.60 99.35 99.65 7.55 99.80 99.95 200 8.35 67.20 71.05 10.90 81.00 83.25 11.65 92.10 90.80 12.65 96.35 96.05 100 0.85 4.55 95.35 99.30 5.65 100.00 100.00 6.25 100.00 100.00 6.35 100.00 100.00 5.50 84.25 91.50 8.20 95.15 97.80 8.50 99.25 99.70 7.90 99.80 99.95 8.90 69.70 70.20 11.10 82.80 83.45 11.80 92.55 91.25 12.85 96.60 96.10 0.90 4.75 97.30 99.65 5.40 99.95 100.00 6.45 100.00 100.00 6.75 100.00 100.00 6.40 87.00 91.10 8.25 96.25 98.20 8.70 99.35 99.55 7.70 99.80 99.95 10.25 72.25 70.95 11.10 83.90 83.40 11.90 92.95 91.55 12.60 96.95 96.05 0.95 case of a serially correlated factor and cross-sectionally independent idiosyncratic errors (β2i = 0, ρ1 = 0.5, ζ1t ∼ IIDN (0, 1), uit ∼ IIDN (0, σi2 ), v1i ∼ IIDU (0.5, 1.5), N=100,200,500,1000 and T=100,200,500) Table B: Bias, RMSE, size and power (×100) for the α̃ estimate of the cross-sectional exponent - 6.90 99.50 99.50 7.00 99.95 99.95 7.40 100.00 100.00 6.95 100.00 100.00 7.80 89.90 89.90 8.75 97.95 97.95 9.15 99.55 99.55 8.25 99.95 99.95 12.15 67.15 67.15 11.80 81.75 81.75 12.40 91.55 91.55 13.35 96.30 96.30 1.00 44 Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE 100 200 500 1000 100 200 500 1000 0.31 0.68 0.46 0.88 0.96 1.40 1.59 2.08 0.29 0.91 0.45 1.13 0.87 1.58 1.42 2.19 0.24 1.19 0.17 0.59 0.27 0.75 0.66 1.14 1.09 1.67 0.15 0.84 0.26 1.00 0.56 1.38 0.95 1.84 0.11 1.14 0.20 1.26 0.09 0.54 0.16 0.68 0.44 0.98 0.80 1.42 0.08 0.80 0.16 0.93 0.36 1.26 0.66 1.64 0.04 1.10 0.10 1.21 0.05 0.51 0.10 0.63 0.30 0.87 0.60 1.25 500 0.03 0.78 0.09 0.89 0.23 1.18 0.47 1.50 200 -0.01 1.08 0.04 1.18 0.12 1.45 1000 0.40 1.37 0.25 1.52 Bias RMSE 0.46 1.64 500 0.75 1.82 Bias RMSE 0.73 2.02 200 1.03 2.23 0.53 1.89 1.49 2.55 0.85 Bias RMSE 0.80 100 0.75 100 0.70 N\T α 0.02 0.49 0.05 0.60 0.22 0.80 0.44 1.13 0.01 0.77 0.05 0.87 0.14 1.12 0.33 1.39 -0.03 1.08 -0.00 1.17 0.04 1.40 0.37 1.79 0.90 0.01 0.48 0.02 0.57 0.15 0.74 0.32 1.04 -0.00 0.77 0.02 0.85 0.08 1.08 0.20 1.32 -0.04 1.07 -0.03 1.15 -0.01 1.38 0.24 1.74 0.95 -0.01 0.46 -0.02 0.55 0.03 0.70 0.04 0.94 -0.02 0.76 -0.03 0.84 -0.04 1.05 -0.09 1.27 -0.07 1.06 -0.07 1.14 -0.14 1.37 -0.04 1.69 1.00 1000 500 200 100 1000 500 200 100 1000 500 200 100 N\T Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- α 12.05 62.65 99.95 9.00 95.85 100.00 6.80 100.00 100.00 6.35 100.00 100.00 9.35 56.65 97.80 7.30 85.70 99.70 7.65 99.05 99.95 6.80 99.95 100.00 10.40 44.10 91.40 8.05 74.35 94.50 6.15 94.50 98.70 7.10 98.90 99.40 0.70 7.00 79.70 99.90 5.45 98.50 100.00 5.15 100.00 100.00 5.45 100.00 100.00 6.00 70.05 96.95 5.30 92.15 99.65 5.80 99.85 100.00 6.30 100.00 100.00 7.10 54.50 89.90 5.95 81.15 94.60 5.70 97.45 98.70 6.75 99.30 99.65 0.75 4.90 89.50 99.90 5.10 99.65 100.00 5.30 100.00 100.00 4.90 100.00 100.00 4.45 80.15 96.85 5.10 96.45 99.55 5.20 99.95 100.00 6.25 100.00 100.00 6.00 62.65 88.60 5.50 86.70 94.75 5.85 98.40 98.95 6.30 99.55 99.60 0.80 4.05 94.80 99.95 4.20 99.95 100.00 4.90 100.00 100.00 4.30 100.00 100.00 500 4.25 86.30 97.35 5.40 97.95 99.65 5.40 100.00 100.00 6.00 100.00 100.00 200 5.20 69.50 88.70 5.25 90.65 94.65 5.45 98.80 99.05 6.20 99.70 99.55 100 0.85 3.50 97.85 100.00 4.55 100.00 100.00 5.10 100.00 100.00 4.10 100.00 100.00 3.95 91.15 98.10 5.15 98.85 99.60 5.40 100.00 100.00 6.15 100.00 100.00 5.45 73.75 88.05 4.95 93.35 95.35 6.50 99.20 98.90 6.35 99.75 99.55 0.90 case of a serially uncorrelated factor and cross-sectionally dependent idiosyncratic errors (β2i = 0, f1t ∼ IIDN (0, 1), uit - θ = 0.2, v1i ∼ IIDU (0.5, 1.5), N=100,200,500,1000 and T=100,200,500) 3.85 99.20 100.00 3.90 100.00 100.00 4.45 100.00 100.00 4.50 100.00 100.00 4.10 94.15 98.00 5.15 99.50 99.65 5.15 100.00 100.00 6.20 100.00 100.00 5.25 77.55 87.75 5.35 94.30 95.50 6.15 99.40 98.90 6.40 99.80 99.55 0.95 Table C: Bias, RMSE, size and power (×100) for the α̃ estimate of the cross-sectional exponent - 5.15 100.00 100.00 4.95 100.00 100.00 5.00 100.00 100.00 4.90 100.00 100.00 5.95 97.40 97.40 5.95 99.65 99.65 6.10 100.00 100.00 6.45 100.00 100.00 6.95 85.60 85.60 6.05 94.80 94.80 6.45 98.85 98.85 6.70 99.60 99.60 1.00 45 Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE Bias RMSE 100 200 500 1000 100 200 500 1000 0.20 0.64 0.31 0.81 0.78 1.32 1.20 1.84 0.17 0.87 0.25 1.04 0.74 1.51 1.17 2.02 0.12 1.16 0.14 0.58 0.24 0.72 0.51 1.10 1.02 1.64 0.12 0.82 0.18 0.97 0.46 1.33 0.98 1.83 0.05 1.11 0.15 1.31 0.09 0.54 0.14 0.66 0.39 0.97 0.86 1.45 0.06 0.79 0.10 0.92 0.34 1.22 0.83 1.67 -0.01 1.08 0.06 1.26 0.07 0.52 0.12 0.62 0.29 0.88 0.57 1.25 500 0.04 0.77 0.08 0.88 0.25 1.15 0.55 1.50 200 -0.02 1.06 0.04 1.22 0.21 1.50 1000 0.23 1.38 0.30 1.56 Bias RMSE 0.41 1.65 500 0.67 1.80 Bias RMSE 0.72 2.00 200 0.89 2.15 0.43 1.85 1.06 2.33 0.85 Bias RMSE 0.80 100 0.75 100 0.70 N\T α 0.04 0.51 0.07 0.59 0.25 0.81 0.44 1.12 0.01 0.76 0.03 0.86 0.20 1.09 0.41 1.40 -0.05 1.06 -0.01 1.20 0.17 1.46 0.31 1.77 0.90 0.03 0.50 0.04 0.57 0.18 0.76 0.37 1.04 0.01 0.75 0.01 0.86 0.13 1.06 0.32 1.33 -0.05 1.05 -0.04 1.19 0.09 1.42 0.22 1.71 0.95 -0.00 0.49 -0.02 0.56 -0.01 0.71 -0.02 0.92 -0.03 0.75 -0.06 0.84 -0.05 1.03 -0.08 1.25 -0.09 1.05 -0.11 1.18 -0.09 1.40 -0.18 1.68 1.00 1000 500 200 100 1000 500 200 100 1000 500 200 100 N\T Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- Size Power+ PowerSize Power+ PowerSize Power+ PowerSize Power+ Power- α 9.40 74.00 99.35 8.10 95.70 100.00 5.30 100.00 100.00 4.95 100.00 100.00 7.00 63.80 97.15 7.00 87.05 99.50 5.30 99.60 100.00 6.00 100.00 100.00 6.90 51.95 88.00 6.80 75.00 94.70 7.30 95.15 98.20 6.75 99.15 99.40 0.70 7.55 83.00 99.55 5.40 98.85 100.00 4.25 100.00 100.00 4.95 100.00 100.00 5.85 70.95 98.10 5.70 93.65 99.40 5.15 99.85 100.00 5.75 100.00 100.00 6.40 57.25 88.75 5.60 81.30 94.40 6.55 96.60 98.25 6.70 99.45 99.45 0.75 6.50 88.95 99.95 4.70 99.65 100.00 4.55 100.00 100.00 5.10 100.00 100.00 4.45 77.15 98.35 4.65 96.70 99.55 5.10 100.00 100.00 5.40 100.00 100.00 5.75 63.45 89.00 5.30 85.40 94.75 6.70 97.90 98.05 6.45 99.75 99.40 0.80 5.20 94.05 99.95 4.30 99.90 100.00 4.60 100.00 100.00 4.95 100.00 100.00 500 4.30 85.10 98.00 4.85 98.25 99.85 4.95 100.00 100.00 6.20 100.00 100.00 200 4.60 70.85 87.50 4.80 88.15 94.60 6.65 98.30 98.35 6.20 99.85 99.55 100 0.85 4.60 97.10 100.00 4.30 100.00 100.00 5.00 100.00 100.00 5.00 100.00 100.00 4.45 89.00 98.40 4.80 98.90 99.85 5.00 100.00 100.00 5.75 100.00 100.00 5.20 75.90 88.25 5.10 91.05 95.15 6.55 98.90 98.25 6.30 99.85 99.60 0.90 case of two serially uncorrelated factors and cross-sectionally independent idiosyncratic errors (α2 = α, fjt and uit ∼ IIDN (0, 1), vji ∼ IIDU (0.5, 1.5), j = 1, 2, N=100,200,500,1000 and T=100,200,500) 4.55 98.90 100.00 4.30 100.00 100.00 4.95 100.00 100.00 5.15 100.00 100.00 4.40 93.40 98.55 5.20 99.40 99.85 5.80 100.00 100.00 6.10 100.00 100.00 5.40 79.20 88.30 5.00 92.45 95.05 6.70 99.10 98.30 6.30 99.90 99.65 0.95 Table D: Bias, RMSE, size and power (×100) for the α̃ estimate of the cross-sectional exponent - 6.35 100.00 100.00 5.30 100.00 100.00 5.85 100.00 100.00 5.20 100.00 100.00 5.50 97.90 97.90 6.10 99.80 99.80 5.75 100.00 100.00 6.25 100.00 100.00 7.00 84.00 84.00 5.80 94.65 94.65 7.10 98.35 98.35 6.60 99.45 99.45 1.00