Adaptive Estimation of the Dynamic Linear Model with Fixed Effects

By Tiemen Woutersen and Marcel Voia

May 2002

* We gratefully acknowledge stimulating suggestions from Jan Kiviet. The CIBC Human Capital and Productivity Program provided financial support. All errors are ours. Correspondence addresses: Brown University, Box B, RI, 02912, USA, and University of Western Ontario, Department of Economics, Social Science Centre, London, Ontario, N6A 5C2, Canada. Email: Tony_Lancaster@brown.edu, mvoia@uwo.ca, and twouters@uwo.ca.

1 Introduction

The analysis of the dynamic linear model with fixed effects has received attention in econometrics for almost two decades. The popularity of this linear model may be due to the fact that it is the simplest model in which Heckman's (1981a, 1981b) spurious correlation and state dependence can be studied. In practice, growth models such as the dynamic version of Solow's (1956) model also fit naturally into the class of dynamic linear models with fixed effects. Solow's model gained renewed prominence when Mankiw, Romer and Weil (1992) used it to fit data on GDP, savings, and labor force for 121 countries over 1960-1985; they found that approximately 80 percent of the variation in these variables is consistent with the model's predictions.

Turning to estimation, Nickell (1981) showed that maximum likelihood suffers from the incidental parameter problem of Neyman and Scott (1948). In particular, Nickell shows that the inconsistency of the maximum likelihood estimator is O(T⁻¹). Econometricians have subsequently developed moment estimators; examples include Anderson and Hsiao (1982), Holtz-Eakin, Newey and Rosen (1988), Arellano and Bond (1991), and Ahn and Schmidt (1995). Combining moment restrictions can be difficult, especially when some moments become uninformative for particular regions of the parameter space. Newey and Smith (2001) also give a general discussion of how combining moments usually causes a higher-order bias. The small-sample bias of these moment estimators motivates the bias-corrected least squares estimator of Kiviet (1995). Alvarez and Arellano (1998) developed an alternative asymptotics in which the number of individuals, N, increases as well as the number of observations per individual, T. Using this alternative asymptotics, Hahn, Hausman and Kuersteiner (2001) derive an expression for the bias for the case in which N ∝ T and develop an estimator for the dynamic linear model without regressors. Hahn and Kuersteiner (2002) developed a bias-corrected MLE that is asymptotically unbiased and efficient.

Lancaster (1997, 2000) proposes to approximately separate the parameters of interest from the nuisance parameters. In particular, Lancaster uses a parametrization of the likelihood that has a block-diagonal information matrix; that is, the cross derivative of the log likelihood with respect to the nuisance parameters and the parameters of interest is zero in expectation. Lancaster (1997, 2000) then integrates out all fixed effects. Woutersen (2001) shows that information-orthogonality reduces the bias of this integrated likelihood estimator to O(T⁻²) and that the integrated likelihood estimator is asymptotically unbiased and adaptive if T ∝ N^α where α > 1/3. That is, the integrated likelihood estimator is as efficient as the infeasible maximum likelihood estimator that assumes the values of the nuisance parameters to be known.
This paper presents a new adaptiveness result for the integrated likelihood estimator that integrates out individual-specific effects. We show that the integrated likelihood estimator is adaptive for any asymptotics in which T increases, as long as the integrated likelihood estimator is consistent for N increasing with T fixed. This theorem implies adaptiveness of Lancaster's (1997) estimator of the dynamic linear model without regressors. We also derive an information-orthogonal parametrization for the dynamic model with regressors. For this model, the integrated likelihood estimator is asymptotically unbiased and adaptive as long as T ∝ N^α where α > 1/3.

{Lancaster (1997) gives a parametrization of the dynamic linear model that approximates an information-orthogonal parametrization as N → ∞. However, information-orthogonality is a finite-sample property. Simulations show that an implicit parametrization works better; this was somewhat surprising to me (tw) since N was larger than T in all simulations.}

This paper is organized as follows. Section 2 reviews the integrated likelihood estimator and information-orthogonality. Section 3 derives an information-orthogonal parametrization for the dynamic linear model with fixed effects and regressors. Section 4 gives adaptiveness results for the integrated likelihood estimator (case without regressors). Section 5 gives simulation results and Section 6 concludes.

2 The Integrated Likelihood Estimator and Orthogonality

Suppose we observe N individuals for T periods. Let the log likelihood contribution of the t-th spell of individual i be denoted by L^{it}. Summing over the contributions of individual i yields the log likelihood contribution

L^i(β, λ_i) = Σ_t L^{it}(β, λ_i),

where β is the common parameter and λ_i is the individual-specific effect. Suppose that the parameter β is of interest and that the fixed effect λ_i is a nuisance parameter that controls for heterogeneity.

This paper considers elimination of nuisance parameters by integration. This Bayesian treatment of nuisance parameters is straightforward: formulate a prior on all the nuisance parameters and then integrate the likelihood with respect to that prior distribution of the nuisance parameters (see Gelman et al. (1995) for an overview). For a panel data model with fixed effects this means that we have to specify priors on the common parameters and all the fixed effects. Berger et al. (1999) review integrated likelihood methods in which flat priors are used for both the parameter of interest and the nuisance parameters. The individual-specific nuisance parameters are then eliminated by integration. We denote the logarithm of the integrated likelihood contribution by L^{i,I}, i.e.

L^{i,I}(β) = ln ∫ e^{L^i} dλ_i.

Summing over i yields the logarithm of the integrated likelihood,

L^I(β) = Σ_i L^{i,I}(β) = Σ_i ln ∫ e^{L^i} dλ_i.   (1)

After integrating out the fixed effects, the mode of the integrated likelihood can be used as an estimator. (As N → ∞, using the marginal posterior is asymptotically equivalent; considering the mode, however, simplifies the algebra.) We thus define the integrated likelihood estimator β̂ to be the mode of L^I(β):

β̂ = arg max_β L^I(β).
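As a concrete illustration of equation (1), the following minimal sketch (ours, not the paper's code) integrates the λ_i out numerically over a wide grid under a flat prior for a toy static normal panel and maximizes the resulting integrated log likelihood over the common parameter. All design values (the model, N, T, the grid bounds) are illustrative assumptions.

```python
# Minimal numerical sketch of the integrated likelihood estimator, eq. (1),
# for an illustrative static panel y_it = beta*x_it + lambda_i + eps_it with
# flat priors on the lambda_i.  All design values are assumptions.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
N, T, beta0, sigma = 200, 5, 1.0, 1.0
lam = rng.normal(size=(N, 1))                    # individual-specific effects
x = rng.normal(size=(N, T))
y = beta0 * x + lam + sigma * rng.normal(size=(N, T))

lam_grid = np.linspace(-6.0, 6.0, 241)           # flat prior over a wide grid
dlam = lam_grid[1] - lam_grid[0]

def integrated_loglik(beta):
    # log likelihood of individual i at each grid value of lambda_i
    resid = y[:, None, :] - beta * x[:, None, :] - lam_grid[None, :, None]
    logf = -0.5 * np.sum(resid**2, axis=2) / sigma**2   # shape (N, grid)
    # L^{i,I} = log of the integral over lambda_i (log-sum-exp for stability)
    m = logf.max(axis=1, keepdims=True)
    LiI = m[:, 0] + np.log(np.exp(logf - m).sum(axis=1) * dlam)
    return LiI.sum()

bhat = minimize_scalar(lambda b: -integrated_loglik(b),
                       bounds=(0.0, 2.0), method="bounded").x
print("integrated likelihood estimate of beta:", bhat)
```

For the dynamic model studied below, the integral over the fixed effect is available in closed form, so no numerical integration is needed.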
A parametrization of the likelihood is information-orthogonal if the information matrix is block diagonal. That is,

E L_{βλ}(β₀, λ₀) = 0, i.e. ∫_{t_min}^{t_max} L_{βλ}(β₀, λ₀) e^{L(β₀,λ₀)} dt = 0,

where t denotes the dependent variable, t ∈ [t_min, t_max], and β₀, λ₀ denote the true values of the parameters. Cox and Reid (1987) and Jeffreys (1961) use this concept and refer to it as 'orthogonality'. We prefer the term information-orthogonality to distinguish it from other orthogonality concepts and to stress that it is defined in terms of the properties of the information matrix; see Tibshirani and Wasserman (1994) and Woutersen (2000) for an overview of orthogonality concepts.

Consider the log likelihood L(β, f(β, λ)) where the nuisance parameter f is written as a function of β and the orthogonal nuisance parameter λ. Differentiating L(β, f(β, λ)) with respect to β and λ yields

∂L(β, f(β, λ))/∂β = L_β + L_f ∂f/∂β

∂²L(β, f(β, λ))/∂λ∂β = L_{fβ} ∂f/∂λ + L_{ff} (∂f/∂λ)(∂f/∂β) + L_f ∂²f/∂λ∂β,

where L_f is a score and therefore E L_f = 0. Information-orthogonality requires that the cross derivative ∂²L(β, f(β, λ))/∂λ∂β be zero in expectation, i.e.

E L_{βλ} = E L_{fβ} ∂f/∂λ + E L_{ff} (∂f/∂λ)(∂f/∂β) = 0.

This condition implies the following differential equation,

E L_{fβ} + E L_{ff} ∂f/∂β = 0.   (2)

If equation (2) has an analytical solution then L(β, f(β, λ)) is an explicit function of {β, λ} and we refer to such a parametrization as an explicit parametrization. In most cases, however, equation (2) has an implicit solution and we have to recover the Jacobian ∂λ/∂f from this implicit solution. In this case, L(β, λ) has an implicit parametrization. The information-orthogonal parametrization of the dynamic linear model without regressors is explicit and given by Lancaster (1997). We need an implicit parametrization to deal with regressors and will derive that parametrization in the next section. For both implicit and explicit parametrizations, Woutersen (2001) shows that the integrated likelihood estimator is adaptive if T ∝ N^α with α > 1/3, and that β̂ − β is O_p(T⁻²) if α ≤ 1/3.

3 Orthogonality in the Dynamic Linear Model

Consider the dynamic linear model with fixed effects and without regressors,

y_{is} = y_{i,s−1} ρ + f_i + ε_{is},

where E ε_{is} = 0, E ε²_{is} = σ², E ε_{is} ε_{it} = 0 for s ≠ t, and s = 1, ..., T. Lancaster (2000) conditions on y_{i0} and suggests the following parametrization (Appendix 5 of Woutersen (2001) gives information-orthogonal parametrizations for linear models with more than one autoregressive term):

f_i = y_{i0}(1 − ρ) + λ_i e^{−b(ρ)}, where b(ρ) = (1/T) Σ_{s=1}^T ((T−s)/s) ρ^s.

By analogy with quasi-maximum likelihood estimators, normality of the error terms is assumed in order to derive the integrated likelihood estimator. The estimator depends only on the first two moments of y_{is} and is given by Lancaster (1997). In particular, the score of the integrated likelihood has the following form (see Appendix 1 for the derivation; throughout, ȳ_i = (1/T) Σ_s y_{is} and ȳ_{i,−1} = (1/T) Σ_s y_{i,s−1}, with the i subscript suppressed where no confusion arises):

L^{i,I}_ρ = b'(ρ) + (1/σ²) Σ_s (y_s − y_{s−1}ρ) y_{s−1} − (T/σ²)(ȳ − ȳ_{−1}ρ) ȳ_{−1}

L^{i,I}_{σ²} = (1/σ²) { −(T−1)/2 + (1/(2σ²)) Σ_s (y_s − y_{s−1}ρ)² − (T/(2σ²))(ȳ − ȳ_{−1}ρ)² },

where b'(ρ) = (1/T) Σ_{s=1}^T (T−s) ρ^{s−1}.
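The sketch below (a toy implementation under stated assumptions, not the authors' code) makes this operational: it simulates the model, profiles σ² out of L^{i,I}_{σ²} = 0, and solves Σ_i L^{i,I}_ρ = 0 for ρ by bracketing. The design (normal fixed effects, stationary initial condition) follows the simulation section.

```python
# Sketch of Lancaster's (1997) integrated-likelihood score for the AR(1)
# panel with fixed effects: profile out sigma^2, then solve the score in rho.
# The data-generating design below is an illustrative assumption.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
N, T, rho0 = 500, 5, 0.5
f = rng.normal(size=N)                               # fixed effects
y = np.empty((N, T + 1))
y[:, 0] = rng.normal(f / (1 - rho0), np.sqrt(1 / (1 - rho0**2)))  # stationary start
for s in range(1, T + 1):
    y[:, s] = rho0 * y[:, s - 1] + f + rng.normal(size=N)

def b_prime(rho, T):
    s = np.arange(1, T + 1)
    return np.sum((T - s) * rho ** (s - 1)) / T      # b'(rho)

def score(rho):
    w = y[:, 1:] - rho * y[:, :-1]                   # y_s - rho*y_{s-1}
    wbar, ylbar = w.mean(axis=1), y[:, :-1].mean(axis=1)
    # profiled sigma^2 from L^{i,I}_{sigma^2} = 0
    s2 = np.mean(np.sum(w**2, axis=1) - T * wbar**2) / (T - 1)
    inner = np.sum(w * y[:, :-1], axis=1) - T * wbar * ylbar
    return N * b_prime(rho, T) + inner.sum() / s2    # sum_i L^{i,I}_rho

rho_hat = brentq(score, -0.9, 0.99)                  # root of the score
print("integrated likelihood estimate of rho:", rho_hat)
```

The term N·b'(ρ) is exactly the penalty that offsets the within-group (Nickell) bias of the bracketed sum.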
3.1 Dynamic linear model with regressors

Now consider the dynamic linear model with fixed effects and regressors,

y_{is} = y_{i,s−1}ρ + x_{is}β + f_i + ε_{is},

where E ε_{is} = 0, E ε²_{is} = σ², E ε_{is} ε_{it} = 0 for s ≠ t, and s = 1, ..., T. We assume normality in order to derive the following likelihood contribution for individual i:

L^i = −T ln σ − (1/(2σ²)) Σ_s (y_{is} − y_{i,s−1}ρ − x_{is}β − f_i)².

It follows from Woutersen (2001, Appendix 5) that the information-orthogonal nuisance parameter has the following expression,

λ = (1/T) Σ_s E ∫_{−∞}^{y_{i,s−1}ρ + x_{is}β + f_i} dμ.

Thus, λ is expressed as an unbounded integral. Fortunately, its derivatives are finite:

∂λ/∂f = 1,
∂λ/∂ρ = (1/T) Σ_s E y_{i,s−1},
∂λ/∂σ² = 0.

Normalizing the regressors to have mean zero, Σ_s x_{is} = 0, gives

∂λ/∂β = (1/T) Σ_s x_{is} = 0,

and

df/dρ = −(∂λ/∂ρ)/(∂λ/∂f) = −(1/T) Σ_s E y_{i,s−1}.

The scores of the integrated likelihood are then

L^{i,I}_ρ(β, ρ, σ²) = d ln ∫ e^{L^i} dλ / dρ = ∫ (L^i_ρ + L^i_f df/dρ) e^{L^i} df / ∫ e^{L^i} df   (3)

L^{i,I}_β(β, ρ, σ²) = d ln ∫ e^{L^i} dλ / dβ = ∫ L^i_β e^{L^i} df / ∫ e^{L^i} df   (4)

L^{i,I}_{σ²}(β, ρ, σ²) = (1/σ²) { −(T−1)/2 + (1/(2σ²)) Σ_s (y_{is} − y_{i,s−1}ρ − x_{is}β)² − (T/(2σ²))(ȳ_i − ȳ_{i,−1}ρ − x̄_i β)² }.

Thus, the estimate of the variance is

σ̂² = (1/(N(T−1))) Σ_i { Σ_s (y_{is} − y_{i,s−1}ρ − x_{is}β)² − T (ȳ_i − ȳ_{i,−1}ρ − x̄_i β)² }.

4 Adaptiveness

Woutersen (2001) shows that the integrated likelihood estimator is adaptive in the sense that it is equivalent to the infeasible maximum likelihood estimator that assumes the nuisance parameters to be known. The conditions for this result are a regularity condition, that the integrated likelihood can be approximated by a Laplace formula, and the substantive condition that T ∝ N^α where α > 1/3. Suppose that T is rather small, so that an asymptotics in which T ∝ N^α with α > 1/3 is not satisfactory. Woutersen (2001) gives sufficient conditions for the integrated likelihood estimator to be consistent for T fixed and N → ∞. We now show that consistency for fixed T implies that the integrated likelihood estimator is adaptive for an asymptotics with T increasing.

Proposition (case without regressors). Suppose that the asymptotic variance of ρ̂_ML = arg max_ρ L(ρ, λ₀) equals Ψ = [(1/(NT)) E{L_ρ L_ρ'}]⁻¹ and that T ∝ N^α where α > 0. Then the integrated likelihood estimator ρ̂_I is an adaptive estimator and

√(NT) (ρ̂_I − ρ₀) →_d N(0, Ψ).

Proof: see Appendix 2.
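A small Monte Carlo sketch of the Proposition's content follows: it compares the integrated likelihood estimator with an infeasible benchmark that treats the fixed effects f_i as known. For simplicity the benchmark is infeasible OLS of y_s − f on y_{s−1}, not the known-λ maximum likelihood of the Proposition, and all design values are illustrative assumptions.

```python
# Monte Carlo sketch: integrated likelihood estimator vs. an infeasible
# benchmark that knows the fixed effects (a simplification of the known-lambda
# ML in the Proposition).  Design values are illustrative.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(2)
N, T, rho0, R = 200, 10, 0.5, 500

def b_prime(rho, T):
    s = np.arange(1, T + 1)
    return np.sum((T - s) * rho ** (s - 1)) / T

def rho_integrated(y):
    def score(rho):
        w = y[:, 1:] - rho * y[:, :-1]
        wbar, ylbar = w.mean(axis=1), y[:, :-1].mean(axis=1)
        s2 = np.mean(np.sum(w**2, axis=1) - T * wbar**2) / (T - 1)
        inner = np.sum(w * y[:, :-1], axis=1) - T * wbar * ylbar
        return y.shape[0] * b_prime(rho, T) + inner.sum() / s2
    return brentq(score, -0.9, 0.99)

est_I, est_ML = [], []
for _ in range(R):
    f = rng.normal(size=N)
    y = np.empty((N, T + 1))
    y[:, 0] = rng.normal(f / (1 - rho0), np.sqrt(1 / (1 - rho0**2)))
    for s in range(1, T + 1):
        y[:, s] = rho0 * y[:, s - 1] + f + rng.normal(size=N)
    est_I.append(rho_integrated(y))
    # infeasible benchmark: OLS of y_s - f on y_{s-1}, with f known
    ylag, dep = y[:, :-1].ravel(), (y[:, 1:] - f[:, None]).ravel()
    est_ML.append(dep @ ylag / (ylag @ ylag))

for name, e in [("integrated", est_I), ("infeasible", est_ML)]:
    e = np.array(e)
    print(name, "bias %.4f  sd %.4f" % (e.mean() - rho0, e.std()))
```

Under this design the two sampling standard deviations are close already for moderate T, which is the practical content of adaptiveness.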
5 Simulation Results

We use the simulation designs of Hahn, Hausman and Kuersteiner (2001) (HHK), Hahn and Kuersteiner (2002) (HK), and Kiviet (1995). When we compare the performance of the integrated likelihood estimator with that of the HHK estimator, we consider the case without regressors and look only at the estimator of the autoregressive parameter ρ. We designed the Monte Carlo experiment around seven values of the autoregressive parameter, ρ ∈ {0.1, 0.3, 0.5, 0.8, 0.9, 0.95, 0.99}, for samples of size n ∈ {100, 500} and T ∈ {5, 10}, with σ² unknown. We thus extend the analysis of HHK by including values of ρ close to the unit root. The number of simulation replications was set at 5000. The simulation results are presented in Table 1. Looking at the RMSE of the estimates, we observe that the integrated likelihood estimator performs better than the other estimators in all situations. For the small values ρ ∈ {0.3, 0.5} with n = 500 and T = 10, the bias of the integrated likelihood estimator is slightly higher than the bias of the HHK estimator; in all other cases its bias is lower. The integrated likelihood estimator performs very well in the vicinity of the unit root. In Table 3 we present its performance for the high values ρ ∈ {0.75, 0.8, 0.85, 0.9, 0.95} with n = 100 and T = 5. The results show that our estimator performs better in terms of mean and median bias (with the exception of the mean bias at ρ = 0.9) and much better in terms of MSE than the other available estimators. Our results thus confirm the prediction about the MSE relative to the other available estimators.

We also ran Monte Carlo simulations using the design of HK (2002), with ρ ∈ {0.0, 0.3, 0.6, 0.9}, n ∈ {100, 200} and T ∈ {5, 10, 20}. These results are presented in Table 2. In all situations our results show lower bias and lower RMSE; even in the cases with equal RMSE (ρ ∈ {0.0, 0.3} for n = 200 and T = 20), our results are better because of the lower bias.

When we compare the performance of the integrated likelihood estimator with that of the Kiviet estimator, we consider the case with covariates. We also compare the results of the integrated likelihood estimator using Lancaster's (1997) approximation with Kiviet's results; see Table 4. We consider 10 of the 14 cases that Kiviet considered because, unlike Kiviet, we do not assume a distribution for the fixed effect. We therefore present results for 10 parameter combinations, matching Kiviet's combinations for μ = 1. The results for 5000 replications are presented in Table 4. Comparing the results obtained using Lancaster's (1997) approximation with Kiviet's results, we observe that for ρ = 0 Kiviet's estimator performs better than the integrated likelihood estimator and that the reverse holds as ρ increases; the exceptions are at ρ = 0.4, where the bias of the integrated likelihood estimator of ρ is higher than Kiviet's, and at ρ = 0.8, where the RMSE of the integrated likelihood estimator of β is larger than Kiviet's. Looking at the integrated likelihood results obtained using the implicit parametrization of Woutersen (2001), see Table 5, we observe improvements: we do not find any case in which Kiviet's estimator does better than the integrated likelihood estimator. Overall, the simulations show that the integrated likelihood estimator performs very well in terms of MSE and bias; see Section 9 for the tables.

6 Conclusion

This paper derived a new adaptiveness result for the integrated likelihood estimator: the integrated likelihood estimator is adaptive for an asymptotics with T increasing. Simulations show the relevance of this adaptiveness result, since the MSE of the integrated likelihood estimator was smaller than the MSE of competing estimators for several versions of the dynamic linear model with fixed effects. Our simulation results also showed very good performance of the estimator for slower adjustment processes, i.e. ρ close to the unit root. We also derived a new estimator for the dynamic linear model with fixed effects and regressors. The new estimator performed very well in simulations, in terms of both MSE and bias.

7 Appendices

Appendix 1.

We assume normality of the error term, ε_{is} ~ N(0, σ²), to derive the moment conditions.
Defining the integrated likelihood by e^{L^{i,I}} and writing ȳ = (1/T) Σ_s y_s, ȳ_{−1} = (1/T) Σ_s y_{s−1} (individual subscripts suppressed),

e^{L^{i,I}} = ∫ (1/σ^T) e^{b(ρ)} e^{−(1/(2σ²)) Σ_s (y_s − y_{s−1}ρ − f)²} df
           = (1/σ^T) e^{b(ρ) − (1/(2σ²)) Σ_s (y_s − y_{s−1}ρ)²} ∫ e^{−(T/(2σ²)) { f² − 2f(ȳ − ȳ_{−1}ρ) }} df
           ∝ (1/σ^{T−1}) e^{b(ρ) − (1/(2σ²)) Σ_s (y_s − y_{s−1}ρ)² + (T/(2σ²))(ȳ − ȳ_{−1}ρ)²}.

This implies that the log of the integrated likelihood for individual i, and the moments we use to estimate the parameters, are

L^{i,I}      = −((T−1)/2) ln(σ²) + b(ρ) − (1/(2σ²)) Σ_s (y_s − y_{s−1}ρ)² + (T/(2σ²))(ȳ − ȳ_{−1}ρ)²
L^{i,I}_ρ    = b'(ρ) + (1/σ²) { Σ_s (y_s − y_{s−1}ρ) y_{s−1} − T(ȳ − ȳ_{−1}ρ) ȳ_{−1} }
L^{i,I}_{ρρ} = b''(ρ) − (1/σ²) Σ_s y²_{s−1} + (T/σ²) ȳ²_{−1}
L^{i,I}_{σ²} = (1/σ²) { −(T−1)/2 + (1/(2σ²)) Σ_s (y_s − y_{s−1}ρ)² − (T/(2σ²))(ȳ − ȳ_{−1}ρ)² }.

The functions L^{i,I}_ρ and L^{i,I}_{σ²} can be used as moment functions. We can derive the same result by integrating the score function:

L^{i,I}_ρ = ∫ L_ρ e^L dλ / ∫ e^L dλ = ∫ L_ρ e^L df / ∫ e^L df, since ∂λ/∂f = e^{b(ρ)} does not depend on λ.

Completing the square in f,

constant · e^L = (2π)^{−T/2} σ^{−T} e^{−(1/(2σ²)) { Σ_s (y_s − y_{s−1}ρ)² − T(ȳ − ȳ_{−1}ρ)² } − (T/(2σ²)) (f − (ȳ − ȳ_{−1}ρ))²}.

Note that ∫ f e^L df / ∫ e^L df = ȳ − ȳ_{−1}ρ and ∫ (f − (ȳ − ȳ_{−1}ρ))² e^L df / ∫ e^L df = σ²/T. It then follows again that

L^{i,I}_ρ    = b'(ρ) + (1/σ²) { Σ_s (y_s − y_{s−1}ρ) y_{s−1} − T(ȳ − ȳ_{−1}ρ) ȳ_{−1} }
L^{i,I}_{ρρ} = b''(ρ) − (1/σ²) Σ_s y²_{s−1} + (T/σ²) ȳ²_{−1}.
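The Gaussian integral used above can be checked numerically. The sketch below compares log ∫ e^L df computed by quadrature with the closed form, including the constant √(2πσ²/T) that the proportionality sign absorbs; the parameter values are arbitrary test values.

```python
# Numerical check of the Gaussian integral in Appendix 1: integrating exp(L)
# over the fixed effect f reproduces the closed form up to the constant
# sqrt(2*pi*sigma^2/T).  rho, sigma, T and the data vector are test values.
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(3)
T, rho, sigma = 6, 0.4, 1.3
y = rng.normal(size=T + 1)                       # any data vector works here

def b(rho, T):
    s = np.arange(1, T + 1)
    return np.sum((T - s) / s * rho**s) / T      # b(rho)

w = y[1:] - rho * y[:-1]                          # y_s - rho*y_{s-1}
integrand = lambda f: np.exp(b(rho, T) - np.sum((w - f)**2) / (2 * sigma**2)) / sigma**T
lhs = np.log(quad(integrand, -30, 30)[0])         # log of the numerical integral

wbar = w.mean()
rhs = (b(rho, T) - T * np.log(sigma)
       - (np.sum(w**2) - T * wbar**2) / (2 * sigma**2)
       + 0.5 * np.log(2 * np.pi * sigma**2 / T))  # closed form with constant
print(lhs, rhs)                                   # agree to quadrature accuracy
```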
Appendix 2. Proposition

To be shown: √(NT)(ρ̂_I − ρ₀) →_d N(0, Ψ), where Ψ is the asymptotic variance of the ML estimator.

Proof: We consider the case without regressors.

a) The asymptotic variance of the ML estimator with known orthogonal fixed effects.

We assume normality of the error term, ε_{is} ~ N(0, σ²), to derive the moment conditions. After deriving the moment conditions, we show that the Proposition holds under E ε_{is} = 0, E ε²_{is} = σ² and E ε_{is} ε_{ij} = 0 for j ≠ s. Denote the known orthogonal fixed effects by λ_i, with the original fixed effects given by f_i = (1 − ρ) y_{i0} + λ_i e^{−b(ρ)}. If we define the log likelihood with known orthogonal fixed effects for individual i by L^i, we have (writing ỹ_s = y_s − y_0)

L^i = −(T/2) ln(σ²) − (1/(2σ²)) Σ_s (ỹ_s − ỹ_{s−1}ρ − λ e^{−b(ρ)})².

The first and second derivatives needed for the asymptotic variance of the ML estimator are

L^i_ρ = (1/σ²) Σ_s (ỹ_s − ỹ_{s−1}ρ − λ e^{−b(ρ)})(ỹ_{s−1} − λ b'(ρ) e^{−b(ρ)})

L^i_{ρρ} = −(1/σ²) Σ_s (ỹ_{s−1} − λ b'(ρ) e^{−b(ρ)})² + (λ/σ²) Σ_s (ỹ_s − ỹ_{s−1}ρ − λ e^{−b(ρ)}) (b'(ρ)² − b''(ρ)) e^{−b(ρ)},

where b(ρ) = (1/T) Σ_{s=1}^T ((T−s)/s) ρ^s, b'(ρ) = (1/T) Σ_{s=1}^T (T−s) ρ^{s−1} and b''(ρ) = (1/T) Σ_{s=1}^T (T−s)(s−1) ρ^{s−2}. Defining ∂f/∂ρ = −y_0 − λ b'(ρ) e^{−b(ρ)}, so that ỹ_{s−1} − λ b'(ρ) e^{−b(ρ)} = y_{s−1} + ∂f/∂ρ, we can rewrite

L^i_{ρρ} = −(1/σ²) Σ_s (y_{s−1} + ∂f/∂ρ)² + (λ/σ²) Σ_s ε_s (b'(ρ)² − b''(ρ)) e^{−b(ρ)}
         = −(1/σ²) Σ_s y²_{s−1} − (2T/σ²) ȳ_{−1} (∂f/∂ρ) − (T/σ²)(∂f/∂ρ)² + (λ/σ²) Σ_s ε_s (b'(ρ)² − b''(ρ)) e^{−b(ρ)}.

Taking expectations and using E ε_s = 0,

E L^i_{ρρ} = −(1/σ²) Σ_s E y²_{s−1} − (2T/σ²)(∂f/∂ρ) E ȳ_{−1} − (T/σ²)(∂f/∂ρ)².

Similarly,

L^i_ρ = (1/σ²) Σ_s (y_{is} − y_{i,s−1}ρ − f_i)(y_{i,s−1} + ∂f_i/∂ρ) = (1/σ²) Σ_s ε_{is} (y_{i,s−1} + ∂f_i/∂ρ)

and

(L^i_ρ)(L^i_ρ)' = (1/σ⁴) Σ_s ε²_{is} (y_{i,s−1} + ∂f_i/∂ρ)² + (2/σ⁴) Σ_s Σ_{j≠s} ε_{is} ε_{ij} (y_{i,s−1} + ∂f_i/∂ρ)(y_{i,j−1} + ∂f_i/∂ρ).

Given E ε_{is} = 0, E ε²_{is} = σ² and E ε_{is} ε_{ij} = 0 for j ≠ s, we have

E (L_ρ)(L_ρ)' = (1/σ⁴) Σ_i Σ_s E ε²_{is} (y_{i,s−1} + ∂f_i/∂ρ)² = (1/σ²) Σ_i Σ_s E (y_{i,s−1} + ∂f_i/∂ρ)²
              = (1/σ²) Σ_i Σ_s E y²_{i,s−1} + (2T/σ²) Σ_i (∂f_i/∂ρ) E ȳ_{i,−1} + (T/σ²) Σ_i (∂f_i/∂ρ)².

Thus E (L_ρ)(L_ρ)' = −E L_{ρρ}, and the asymptotic variance of the ML estimator with known orthogonal fixed effects is

Ψ = Asy.var(ρ̂_ML) = [(1/(NT)) E L_{ρρ}]⁻¹ [(1/(NT)) E (L_ρ)(L_ρ)'] [(1/(NT)) E L_{ρρ}]⁻¹
  = [(1/(NT)) E (L_ρ)(L_ρ)']⁻¹ = [−(1/(NT)) E L_{ρρ}]⁻¹,

where

E L_{ρρ} = −(1/σ²) Σ_i Σ_s E y²_{i,s−1} − (2T/σ²) Σ_i (∂f_i/∂ρ) E ȳ_{i,−1} − (T/σ²) Σ_i (∂f_i/∂ρ)².
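The information identity E (L_ρ)(L_ρ)' = −E L_{ρρ} can be checked by simulation for a single individual; the sketch below does so under the stated normality assumption, with arbitrary test values for (ρ, σ, λ, y₀) and with ∂f/∂ρ = −y₀ − λ b'(ρ) e^{−b(ρ)} as derived above.

```python
# Monte Carlo sketch of the identity E[(L_rho)^2] = -E[L_rhorho] for one
# individual with known orthogonal fixed effect lambda (Appendix 2a).
# rho, sigma, lam, y0 are arbitrary test values.
import numpy as np

rng = np.random.default_rng(4)
T, rho, sigma, lam, y0 = 5, 0.6, 1.0, 0.7, 0.2
s = np.arange(1, T + 1)
b   = np.sum((T - s) / s * rho**s) / T
bp  = np.sum((T - s) * rho ** (s - 1)) / T
bpp = np.sum((T - s) * (s - 1) * rho ** (s - 2)) / T
f = (1 - rho) * y0 + lam * np.exp(-b)            # implied fixed effect
dfdrho = -y0 - lam * bp * np.exp(-b)             # df/drho (see text)

R = 200000
eps = sigma * rng.normal(size=(R, T))
y = np.empty((R, T + 1)); y[:, 0] = y0
for t in range(1, T + 1):
    y[:, t] = rho * y[:, t - 1] + f + eps[:, t - 1]
z = y[:, :-1] + dfdrho                           # y_{s-1} + df/drho
L_r = (eps * z).sum(axis=1) / sigma**2           # score in rho
L_rr = (-(z**2).sum(axis=1)
        + lam * (bp**2 - bpp) * np.exp(-b) * eps.sum(axis=1)) / sigma**2
print((L_r**2).mean(), -L_rr.mean())             # approximately equal
```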
b) The asymptotic variance of the integrated likelihood estimator.

Assuming again ε_{is} ~ N(0, σ²), we derive the moment conditions for the integrated likelihood estimator. Defining the integrated likelihood for individual i by e^{L^{i,I}}, we have

e^{L^{i,I}} ∝ (1/σ^{T−1}) e^{b(ρ) − (1/(2σ²)) Σ_s (y_s − y_{s−1}ρ)² + (T/(2σ²))(ȳ − ȳ_{−1}ρ)²},

which implies that the log of the integrated likelihood is

L^{i,I} = −((T−1)/2) ln(σ²) + b(ρ) − (1/(2σ²)) Σ_s (y_s − y_{s−1}ρ)² + (T/(2σ²))(ȳ − ȳ_{−1}ρ)²,

with first and second moments

L^{i,I}_ρ = b'(ρ) + (1/σ²) { Σ_s (y_s − y_{s−1}ρ) y_{s−1} − T(ȳ − ȳ_{−1}ρ) ȳ_{−1} }
L^{i,I}_{ρρ} = b''(ρ) − (1/σ²) Σ_s y²_{s−1} + (T/σ²) ȳ²_{−1},

and, taking expectations,

E L^{i,I}_{ρρ} = b''(ρ) − (1/σ²) Σ_s E y²_{s−1} + (T/σ²) E ȳ²_{−1}.

Relying on the results of Newey and McFadden (1994) on the asymptotic variance of moment estimators, we can express the asymptotic variance of the integrated likelihood estimator as

Asy.var(ρ̂_I) = [(1/(NT)) E L^I_{ρρ}]⁻¹ [(1/(NT)) E (L^I_ρ)(L^I_ρ)'] [(1/(NT)) E L^I_{ρρ}]⁻¹,

where

E L^I_{ρρ} = N b''(ρ) − (1/σ²) Σ_i Σ_s E y²_{i,s−1} + (T/σ²) Σ_i E ȳ²_{i,−1}

and

E L^I_ρ = N b'(ρ) + (1/σ²) Σ_i E { Σ_s (y_{is} − y_{i,s−1}ρ) y_{i,s−1} − T(ȳ_i − ȳ_{i,−1}ρ) ȳ_{i,−1} } = 0.

Since E L^I_ρ = 0, we first compute (1/(NT)) E (L^I_ρ)(L^I_ρ)' = Σ_i (1/(NT)) E (L^{i,I}_ρ)(L^{i,I}_ρ)'. Substituting y_s − y_{s−1}ρ = f + ε_s, the terms involving f cancel, so that

(L^{i,I}_ρ)(L^{i,I}_ρ)' = b'(ρ)² + (2b'(ρ)/σ²) { Σ_s ε_s y_{s−1} − T ε̄ ȳ_{−1} } + (1/σ⁴) { Σ_s ε_s y_{s−1} − T ε̄ ȳ_{−1} }²

and

E (L^I_ρ)(L^I_ρ)' = N b'(ρ)² + (2b'(ρ)/σ²) Σ_i E { Σ_s ε_{is} y_{i,s−1} − T ε̄_i ȳ_{i,−1} } + (1/σ⁴) E Σ_i { Σ_s ε_{is} y_{i,s−1} − T ε̄_i ȳ_{i,−1} }².

For the limits of b'(ρ) and b''(ρ) we have

lim_{T→∞} b'(ρ) = lim_{T→∞} (1/T) Σ_{s=1}^T (T−s) ρ^{s−1} = lim_{T→∞} { Σ_{s=1}^T ρ^{s−1} − (1/T) Σ_{s=1}^T s ρ^{s−1} } = 1/(1−ρ) + O(T⁻¹),

so that b'(ρ)² = 1/(1−ρ)² + O(T⁻¹), and

lim_{T→∞} b''(ρ) = lim_{T→∞} (1/T) Σ_{s=1}^T (T−s)(s−1) ρ^{s−2} = lim_{T→∞} { Σ_{s=1}^T (s−1) ρ^{s−2} − (1/T) Σ_{s=1}^T s(s−1) ρ^{s−2} } = 1/(1−ρ)² + O(T⁻¹).

Adding up the two moments and writing Σ_s ε_s y_{s−1} − T ε̄ ȳ_{−1} = Σ_s ε_s (y_{s−1} − ȳ_{−1}),

(1/(NT)) E (L^I_ρ)(L^I_ρ)' + (1/(NT)) E L^I_{ρρ}
= b'(ρ)²/T + b''(ρ)/T + (2b'(ρ)/(NTσ²)) Σ_i E Σ_s ε_{is}(y_{i,s−1} − ȳ_{i,−1})
  + (1/(NTσ⁴)) E Σ_i { Σ_s ε_{is}(y_{i,s−1} − ȳ_{i,−1}) }² − (1/(NTσ²)) Σ_i Σ_s E y²_{i,s−1} + (1/(Nσ²)) Σ_i E ȳ²_{i,−1}
= O(T⁻¹) + (1/(NTσ²)) Σ_i Σ_s E (y_{i,s−1} − ȳ_{i,−1})² − (1/(NTσ²)) Σ_i Σ_s E y²_{i,s−1} + (1/(Nσ²)) Σ_i E ȳ²_{i,−1}
= O(T⁻¹),

where the middle step uses E ε²_{is} = σ² and E ε_{is} ε_{ij} = 0 for j ≠ s (up to remainder terms of order O(T⁻¹) arising because ȳ_{i,−1} depends on ε_{is}), and the last three terms cancel exactly since Σ_s E (y_{i,s−1} − ȳ_{i,−1})² = Σ_s E y²_{i,s−1} − T E ȳ²_{i,−1}. This means that, as T → ∞,

(1/(NT)) E L^I_{ρρ} = −(1/(NT)) E (L^I_ρ)(L^I_ρ)' + O(T⁻¹),

and therefore the asymptotic variance of the integrated likelihood estimator is given by

Asy.var(ρ̂_I) = [(1/(NT)) E (L^I_ρ)(L^I_ρ)']⁻¹ = [−(1/(NT)) E L^I_{ρρ}]⁻¹.

Comparing the two expectations, (1/(NT)) E L^I_{ρρ} and (1/(NT)) E L_{ρρ}, we can establish the rate at which the two asymptotic variances converge. Taking the difference,

(1/(NT)) E L^I_{ρρ} − (1/(NT)) E L_{ρρ}
= b''(ρ)/T + (1/(Nσ²)) Σ_i { E ȳ²_{i,−1} + 2(∂f_i/∂ρ) E ȳ_{i,−1} + (∂f_i/∂ρ)² }
= b''(ρ)/T + (1/(Nσ²)) Σ_i (E ȳ_{i,−1} + ∂f_i/∂ρ)² + (1/(Nσ²)) Σ_i Var(ȳ_{i,−1}).

Now consider T increasing. Since y_{s−1} = ρ^{s−1} y_{i0} + ((1 − ρ^{s−1})/(1 − ρ)) f_i + Σ_{j=1}^{s−1} ρ^{s−1−j} ε_{ij}, we have, as T → ∞,

E ȳ_{i,−1} = (1/T) Σ_s E y_{i,s−1} = f_i/(1−ρ) + O(T⁻¹),

because the y_{i0} term and the Σ_s ρ^{s−1} correction are both O(T⁻¹). Moreover, using λ_i e^{−b(ρ)} = f_i − (1−ρ) y_{i0} and lim_{T→∞} b'(ρ) = 1/(1−ρ) + O(T⁻¹),

∂f_i/∂ρ = −y_{i0} − λ_i b'(ρ) e^{−b(ρ)} → −y_{i0} − (f_i − (1−ρ) y_{i0})/(1−ρ) = −f_i/(1−ρ) as T → ∞.

Then E ȳ_{i,−1} + ∂f_i/∂ρ is O(T⁻¹), and hence (E ȳ_{i,−1} + ∂f_i/∂ρ)² is O(T⁻²).
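The limits lim_T b'(ρ) = 1/(1−ρ) and lim_T b''(ρ) = 1/(1−ρ)², with O(T⁻¹) errors, are easy to verify numerically; the following is a small sketch:

```python
# Numeric sketch of the limits used in Appendix 2b:
# b'(rho) -> 1/(1-rho) and b''(rho) -> 1/(1-rho)^2 as T -> infinity,
# with errors of order 1/T.
import numpy as np

rho = 0.5
for T in (10, 100, 1000, 10000):
    s = np.arange(1, T + 1)
    bp  = np.sum((T - s) * rho ** (s - 1)) / T
    bpp = np.sum((T - s) * (s - 1) * rho ** (s - 2)) / T
    print(T, bp - 1 / (1 - rho), bpp - 1 / (1 - rho) ** 2)   # both O(1/T)
```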
Also, looking at Var(ȳ_{i,−1}) as T → ∞, only the innovation terms contribute, so

Var(ȳ_{i,−1}) = Var( (1/T) Σ_s Σ_{j=1}^{s−1} ρ^{s−1−j} ε_{ij} ) = (σ²/T²) Σ_{j=1}^{T−1} ( Σ_{s=j+1}^{T} ρ^{s−1−j} )²,

and since each inner sum is bounded by 1/(1−ρ), (1/(Nσ²)) Σ_i Var(ȳ_{i,−1}) is O(T⁻¹).

Using these results, (1/(NT)) E L^I_{ρρ} − (1/(NT)) E L_{ρρ} is O(T⁻¹). Since the difference between the asymptotic variances,

Asy.var(ρ̂_I) − Asy.var(ρ̂_ML) = [−(1/(NT)) E L^I_{ρρ}]⁻¹ − [−(1/(NT)) E L_{ρρ}]⁻¹
= ((1/(NT)) E L^I_{ρρ} − (1/(NT)) E L_{ρρ}) / ( ((1/(NT)) E L^I_{ρρ}) ((1/(NT)) E L_{ρρ}) ),

is of order O(T⁻¹), i.e. o(1) as N → ∞ and T ∝ N^α, where the only condition on α is α > 0, we conclude that the two asymptotic variances are equal:

Asy.var(ρ̂_I) = Asy.var(ρ̂_ML) = Ψ.

8 References

Ahn, S. C., and P. Schmidt (1995): "Efficient Estimation of Models for Dynamic Panel Data," Journal of Econometrics, 68, 5-27.

Alvarez, J., and M. Arellano (1998): "The Time Series and Cross-Section Asymptotics of Dynamic Panel Data Estimators," unpublished manuscript.

Anderson, T. W., and C. Hsiao (1982): "Formulation and Estimation of Dynamic Models Using Panel Data," Journal of Econometrics, 18, 47-82.

Arellano, M., and S. Bond (1991): "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations," Review of Economic Studies, 58, 277-297.

Arellano, M., and B. E. Honoré (2001): "Panel Data Models: Some Recent Developments," in Handbook of Econometrics, Vol. 5, ed. by J. Heckman and E. Leamer. Amsterdam: North-Holland.

Baltagi, B. H. (1995): Econometric Analysis of Panel Data. New York: John Wiley and Sons.

Basu, D. (1977): "On the Elimination of Nuisance Parameters," Journal of the American Statistical Association, 72, 355-366.

Berger, J. O., B. Liseo, and R. L. Wolpert (1999): "Integrated Likelihood Methods for Eliminating Nuisance Parameters," Statistical Science, 14, 1-28.

Bickel, P. J. (1982): "On Adaptive Estimation," Annals of Statistics, 10, 647-671.

Bickel, P. J., C. A. J. Klaassen, Y. Ritov, and J. A. Wellner (1993): Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Series in the Mathematical Sciences.

Chamberlain, G. (1980): "Analysis of Covariance with Qualitative Data," Review of Economic Studies, 47, 225-238.

———— (1982): "Multivariate Regression for Panel Data," Journal of Econometrics, 18, 5-46.

———— (1984): "Panel Data," in Handbook of Econometrics, Vol. 2, ed. by Z. Griliches and M. D. Intriligator. Amsterdam: North-Holland.

———— (1985): "Heterogeneity, Omitted Variable Bias, and Duration Dependence," in Longitudinal Analysis of Labor Market Data, ed. by J. J. Heckman and B. Singer. Cambridge: Cambridge University Press.

Cox, D. R., and N. Reid (1987): "Parameter Orthogonality and Approximate Conditional Inference (with Discussion)," Journal of the Royal Statistical Society, Series B, 49, 1-39.

———— (1993): "A Note on the Calculation of Adjusted Profile Likelihood," Journal of the Royal Statistical Society, Series B, 45, 467-471.

Critchley, F. (1987): "Discussion of 'Parameter Orthogonality and Approximate Conditional Inference' (by D. R. Cox and N. Reid)," Journal of the Royal Statistical Society, Series B, 49, 25-26.

Ferguson, H. (1991): "Asymptotic Properties of a Conditional Maximum-Likelihood Estimator," The Canadian Journal of Statistics, 20, 63-75.

Ferguson, H., N. Reid, and D. R. Cox (1989): "Estimating Equations from Modified Profile Likelihood," in Estimating Functions, ed. by V. P. Godambe. Oxford: Oxford University Press.

Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin (1995): Bayesian Data Analysis. London: Chapman and Hall.

Hahn, J., J. Hausman, and G. Kuersteiner (2001): "Bias Corrected Instrumental Variables Estimation for Dynamic Panel Models with Fixed Effects," MIT working paper.
Hahn, J., and G. Kuersteiner (2002): "Asymptotically Unbiased Inference for a Dynamic Panel Model with Fixed Effects When Both n and T Are Large," forthcoming, Econometrica.

Heckman, J. J. (1981a): "Statistical Models for Discrete Panel Data," in Structural Analysis of Discrete Data with Econometric Applications, ed. by C. F. Manski and D. McFadden, pp. 114-178. Cambridge: MIT Press.

———— (1981b): "The Incidental Parameter Problem and the Problem of Initial Conditions in Estimating a Discrete Time-Discrete Data Stochastic Process," in Structural Analysis of Discrete Data with Econometric Applications, ed. by C. F. Manski and D. McFadden, pp. 179-195. Cambridge: MIT Press.

Heckman, J., H. Ichimura, J. Smith, and P. Todd (1998): "Characterizing Selection Bias Using Experimental Data," Econometrica, 66, 1017-1098.

Hills, S. E. (1987): "Discussion of 'Parameter Orthogonality and Approximate Conditional Inference' (by D. R. Cox and N. Reid)," Journal of the Royal Statistical Society, Series B, 49, 23-24.

Holtz-Eakin, D., W. Newey, and H. S. Rosen (1988): "Estimating Vector Autoregressions with Panel Data," Econometrica, 56, 1371-1395.

Honoré, B. E., and E. Kyriazidou (2000): "Panel Data Discrete Choice Models with Lagged Dependent Variables," Econometrica, 68, 839-874.

Hsiao, C. (1986): Analysis of Panel Data. New York: Cambridge University Press.

Hsiao, C., M. H. Pesaran, and A. K. Tahmiscioglu (1998): "Maximum Likelihood Estimation of Fixed Effects Dynamic Panel Data Models Covering Short Time Periods," USC working paper.

Jeffreys, H. (1961): Theory of Probability, 3rd ed. Oxford: Clarendon Press.

Kass, R. E., L. Tierney, and J. B. Kadane (1990): "The Validity of Posterior Expansions Based on Laplace's Method," in Essays in Honor of George Barnard, ed. by S. Geisser, S. J. Press, and A. Zellner. Amsterdam: North-Holland.

Kiviet, J. F. (1995): "On Bias, Inconsistency, and Efficiency of Various Estimators in Dynamic Panel Data Models," Journal of Econometrics, 68, 53-78.

Lancaster, T. (1997): "Orthogonal Parameters and Panel Data," working paper, Brown University.

———— (1999): "Econometric Analysis of Dynamic Models: A Growth Theory Example," working paper, Brown University.

———— (1999): "Some Econometrics of Scarring," manuscript, Brown University.

———— (2000): "The Incidental Parameter Problem since 1948," Journal of Econometrics, 95, 391-413.

Liang, K. (1987): "Estimating Functions and Approximate Conditional Likelihood," Biometrika, 74, 695-702.

Mankiw, N. G., D. Romer, and D. Weil (1992): "A Contribution to the Empirics of Economic Growth," Quarterly Journal of Economics, 107, 407-437.

Mundlak, Y. (1961): "Empirical Production Function Free of Management Bias," Journal of Farm Economics, 43, 44-56.

Nerlove, M. (2000): "The Future of Panel Data Econometrics," working paper, University of Maryland.

Newey, W. K. (1990): "Semiparametric Efficiency Bounds," Journal of Applied Econometrics, 5, 99-135.

Newey, W. K., and D. McFadden (1994): "Large Sample Estimation and Hypothesis Testing," in Handbook of Econometrics, Vol. 4, ed. by R. F. Engle and D. McFadden. Amsterdam: North-Holland.

Newey, W. K., and R. J. Smith (2001): "Higher Order Properties of GMM and Generalized Empirical Likelihood Estimators," working paper, MIT.

Neyman, J., and E. L. Scott (1948): "Consistent Estimates Based on Partially Consistent Observations," Econometrica, 16, 1-32.

Nickell, S. (1981): "Biases in Dynamic Models with Fixed Effects," Econometrica, 49, 1417-1426.

Reid, N. (1995): "The Roles of Conditioning in Inference," Statistical Science, 10, 138-157.

———— (1996): "Likelihood and Bayesian Approximation Methods," in Bayesian Statistics 5, ed. by J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith. Oxford: Oxford Science Publications.
Solow, R. M. (1956): "A Contribution to the Theory of Economic Growth," Quarterly Journal of Economics, 70, 65-94.

Stuart, A., and J. K. Ord (1991): Kendall's Advanced Theory of Statistics, Vol. 2. New York: Oxford University Press.

Tibshirani, R., and L. Wasserman (1994): "Some Aspects of Reparametrization of Statistical Models," The Canadian Journal of Statistics, 22, 163-173.

Tierney, L., and J. B. Kadane (1986): "Accurate Approximations for Posterior Moments and Marginal Densities," Journal of the American Statistical Association, 81, 82-86.

Tierney, L., R. E. Kass, and J. B. Kadane (1989): "Fully Exponential Laplace Approximations to Expectations and Variances of Nonpositive Functions," Journal of the American Statistical Association, 84, 710-716.

———— (1989): "Approximate Marginal Densities of Nonlinear Functions," Biometrika, 76, 425-433.

Trognon, A. (2000): "Panel Data Econometrics: A Successful Past and a Promising Future," working paper, Genes (INSEE).

Van der Vaart, A. W. (1998): Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press.

Woutersen, T. M. (2000): "Consistent Estimation and Orthogonality," UWO working paper.

Woutersen, T. M. (2001): "Robustness against Incidental Parameters and Mixing Distributions," UWO working paper.

9 Tables

Table 1. Performance of the integrated likelihood estimator (ρ̂_I) in the case without exogenous regressors, compared with the estimators reported in Hahn, Hausman and Kuersteiner (2001) (σ² unknown; 5000 data sets). Cells marked "n.a." are values of ρ close to the unit root for which HHK report no results.

RMSE:

  T    n     ρ      ρ̂_GMM    ρ̂_BC2    ρ̂_LIML   ρ̂_I
  5   100   0.10    0.08     0.08     0.082    0.056
 10   100   0.10    0.05     0.05     0.045    0.035
  5   500   0.10    0.04     0.04     0.036    0.025
 10   500   0.10    0.02     0.02     0.020    0.016
  5   100   0.30    0.10     0.10     0.099    0.061
 10   100   0.30    0.05     0.05     0.050    0.036
  5   500   0.30    0.04     0.04     0.044    0.027
 10   500   0.30    0.02     0.02     0.023    0.016
  5   100   0.50    0.13     0.13     0.130    0.067
 10   100   0.50    0.06     0.06     0.058    0.037
  5   500   0.50    0.06     0.06     0.057    0.031
 10   500   0.50    0.03     0.03     0.026    0.016
  5   100   0.80    0.32     0.34     0.327    0.089
 10   100   0.80    0.14     0.11     0.109    0.045
  5   500   0.80    0.13     0.13     0.127    0.036
 10   500   0.80    0.05     0.04     0.044    0.018
  5   100   0.90    0.55     0.78     0.604    0.099
 10   100   0.90    0.25     0.23     0.229    0.057
  5   500   0.90    0.28     0.30     0.277    0.038
 10   500   0.90    0.10     0.08     0.080    0.023
  5   100   0.95    n.a.     n.a.     n.a.     0.111
 10   100   0.95    n.a.     n.a.     n.a.     0.059
  5   500   0.95    n.a.     n.a.     n.a.     0.042
 10   500   0.95    n.a.     n.a.     n.a.     0.026
  5   100   0.99    n.a.     n.a.     n.a.     0.118
 10   100   0.99    n.a.     n.a.     n.a.     0.058
  5   500   0.99    n.a.     n.a.     n.a.     0.044
 10   500   0.99    n.a.     n.a.     n.a.     0.028

% bias:

  T    n     ρ      ρ̂_GMM     ρ̂_BC2     ρ̂_LIML   ρ̂_I
  5   100   0.10   -14.96     0.25     -3        2.66
 10   100   0.10   -14.06    -0.77     -1        0.45
  5   500   0.10    -3.68    -0.77     -1       -0.20
 10   500   0.10    -3.15    -0.16     -1       -0.74
  5   100   0.30    -8.86    -0.47     -3        1.04
 10   100   0.30    -7.06    -0.66     -1       -0.25
  5   500   0.30    -2.03    -0.16     -1       -0.37
 10   500   0.30    -1.58    -0.10      0       -0.14
  5   100   0.50   -10.05    -1.14     -3        0.38
 10   100   0.50    -6.76    -0.93     -1       -0.15
  5   500   0.50    -2.25    -0.15     -1       -0.29
 10   500   0.50    -1.53    -0.11      0       -0.19
  5   100   0.80   -27.65   -11.33    -15        0.80
 10   100   0.80   -13.45    -4.55     -5        0.25
  5   500   0.80    -6.98    -0.72     -3       -0.12
 10   500   0.80    -3.48    -0.37     -1       -0.06
  5   100   0.90   -50.22   -42.10    -41        1.36
 10   100   0.90   -24.27   -15.82    -15        0.83
  5   500   0.90   -20.50    -6.23    -10       -0.07
 10   500   0.90    -8.74    -2.02     -2       -0.07
  5   100   0.95    n.a.     n.a.     n.a.       1.45
 10   100   0.95    n.a.     n.a.     n.a.       0.78
  5   500   0.95    n.a.     n.a.     n.a.      -0.05
 10   500   0.95    n.a.     n.a.     n.a.       0.02
  5   100   0.99    n.a.     n.a.     n.a.       1.69
 10   100   0.99    n.a.     n.a.     n.a.       0.44
  5   500   0.99    n.a.     n.a.     n.a.       0.15
 10   500   0.99    n.a.     n.a.     n.a.       0.22

Note: The fixed effects α_i and the innovations ε_it have independent standard normal distributions. Initial observations y_i0 are generated from the stationary distribution N(α_i/(1−ρ₀), 1/(1−ρ₀²)).
Table 2. Performance of the integrated likelihood estimator (ρ̂_I) in the case without exogenous regressors, compared with the GMM and bias-corrected ML (BCML) estimators of Hahn and Kuersteiner (2002) (σ² unknown; 5000 data sets).

  T    n     ρ     bias ρ̂_GMM  bias ρ̂_BCML  bias ρ̂_I   RMSE ρ̂_GMM  RMSE ρ̂_BCML  RMSE ρ̂_I
  5   100   0.0    -0.011      -0.039       -0.0004     0.074       0.065        0.054
  5   100   0.3    -0.027      -0.069        0.003      0.099       0.089        0.061
  5   100   0.6    -0.074      -0.115        0.002      0.160       0.129        0.070
  5   100   0.9    -0.452      -0.178        0.012      0.552       0.187        0.099
  5   200   0.0    -0.006      -0.041       -0.001      0.053       0.055        0.038
  5   200   0.3    -0.014      -0.071        0.001      0.070       0.081        0.042
  5   200   0.6    -0.038      -0.116        0.003      0.111       0.124        0.048
  5   200   0.9    -0.337      -0.178        0.006      0.443       0.183        0.067
 10   100   0.0    -0.011      -0.010       -0.001      0.044       0.036        0.035
 10   100   0.3    -0.021      -0.019       -0.0007     0.053       0.040        0.036
 10   100   0.6    -0.045      -0.038        0.001      0.075       0.051        0.037
 10   100   0.9    -0.218      -0.079        0.007      0.248       0.085        0.058
 10   200   0.0    -0.006      -0.011       -0.0007     0.031       0.027        0.025
 10   200   0.3    -0.011      -0.019       -0.001      0.038       0.032        0.026
 10   200   0.6    -0.025      -0.037       -0.0001     0.051       0.045        0.027
 10   200   0.9    -0.152      -0.079        0.004      0.181       0.082        0.041
 20   100   0.0    -0.011      -0.003       -0.000      0.029       0.024        0.023
 20   100   0.3    -0.017      -0.005        0.000      0.033       0.024        0.023
 20   100   0.6    -0.029      -0.011        0.000      0.042       0.024        0.022
 20   100   0.9    -0.100      -0.032        0.0005     0.109       0.037        0.026
 20   200   0.0    -0.006      -0.003       -0.0002     0.020       0.017        0.017
 20   200   0.3    -0.009      -0.005       -0.0002     0.022       0.017        0.017
 20   200   0.6    -0.016      -0.010       -0.0004     0.027       0.018        0.015
 20   200   0.9    -0.065      -0.031        0.0008     0.074       0.034        0.018

Note: The fixed effects α_i and the innovations ε_it have independent standard normal distributions. Initial observations y_i0 are generated from the stationary distribution N(α_i/(1−ρ₀), 1/(1−ρ₀²)).
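For completeness, a short sketch of the data-generating process described in the table notes (the sizes passed in the example call are illustrative):

```python
# Sketch of the simulation DGP in the table notes: fixed effects alpha_i and
# innovations eps_it standard normal, and initial observations drawn from the
# stationary distribution N(alpha_i/(1-rho0), 1/(1-rho0^2)).
import numpy as np

def simulate_panel(N, T, rho0, rng):
    alpha = rng.normal(size=N)                       # fixed effects
    y = np.empty((N, T + 1))
    y[:, 0] = rng.normal(alpha / (1 - rho0), np.sqrt(1 / (1 - rho0**2)))
    for s in range(1, T + 1):
        y[:, s] = rho0 * y[:, s - 1] + alpha + rng.normal(size=N)
    return y

y = simulate_panel(100, 5, 0.9, np.random.default_rng(5))
```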
Table 3. Performance of the integrated likelihood estimator for high values of ρ (T = 5, N = 100).

  ρ      Statistic         ρ̂_LIML,1   ρ̂_2SLS,LD   ρ̂_CUE,LD   ρ̂_I
  0.75   mean % bias        1.297      5.533      11.553     1.158
  0.75   median % bias     -3.087      1.381       7.470     0.475
  0.75   RMSE               0.181      0.176       0.213     0.084
  0.80   mean % bias       -0.112      4.304      10.413     0.802
  0.80   median % bias     -5.725      1.457       8.651    -0.112
  0.80   RMSE               0.213      0.173       0.205     0.089
  0.85   mean % bias       -3.899      1.966       7.983     0.995
  0.85   median % bias    -10.117      0.065       7.558    -0.022
  0.85   RMSE               0.233      0.160       0.194     0.093
  0.90   mean % bias       -9.757     -0.771       6.138     1.363
  0.90   median % bias    -15.389     -2.346       6.114     0.288
  0.90   RMSE               0.246      0.153       0.180     0.099
  0.95   mean % bias      -15.203     -3.367       3.124     1.455
  0.95   median % bias    -19.637     -4.776       3.136    -0.045
  0.95   RMSE               0.252      0.149       0.165     0.111

Table 4. Integrated likelihood estimator in the case with exogenous regressors, using Lancaster's (1997) approximation (σ² unknown; 5000 data sets; σ_ξ² is the variance of the process generating the exogenous variable).

  Case   bias ρ̂   bias β̂   RMSE ρ̂   RMSE β̂    T    ρ     β     α     σ_ξ
  I      0.065   -0.044    0.079    0.156     6   0.0   1.0   0.80   0.85
  II     0.034   -0.018    0.053    0.135     6   0.4   0.6   0.80   0.88
  III    0.006   -0.001    0.033    0.174     6   0.8   0.2   0.80   0.40
  IV     0.050   -0.042    0.064    0.063     6   0.0   1.0   0.99   0.20
  V      0.027   -0.019    0.046    0.045     6   0.4   0.6   0.99   0.19
  VI     0.005   -0.002    0.032    0.033     6   0.8   0.2   0.99   0.07
  VII    0.028   -0.010    0.072    0.186     3   0.4   0.6   0.80   0.88
  VIII   0.027   -0.007    0.071    0.136     3   0.4   0.6   0.80   1.84
  XI     0.020   -0.008    0.062    0.051     3   0.4   0.6   0.99   0.19
  XII    0.023   -0.010    0.063    0.052     3   0.4   0.6   0.99   0.40

Kiviet's results for the LSDVc estimator in the case with exogenous regressors:

  Case   bias ρ̂   bias β̂   RMSE ρ̂   RMSE β̂    T    ρ     β     α     σ_ξ
  I     -0.019   -0.018    0.043    0.057     6   0.0   1.0   0.80   0.85
  II    -0.038   -0.002    0.059    0.052     6   0.4   0.6   0.80   0.88
  III   -0.125   -0.011    0.135    0.114     6   0.8   0.2   0.80   0.40
  IV    -0.017    0.005    0.050    0.217     6   0.0   1.0   0.99   0.20
  V     -0.042    0.019    0.067    0.231     6   0.4   0.6   0.99   0.19
  VI    -0.127    0.050    0.136    0.643     6   0.8   0.2   0.99   0.07
  VII   -0.205    0.004    0.220    0.093     3   0.4   0.6   0.80   0.88
  VIII  -0.033    0.061    0.077    0.077     3   0.4   0.6   0.80   1.84
  XI    -0.249    0.055    0.263    0.433     3   0.4   0.6   0.99   0.19
  XII   -0.248    0.046    0.263    0.212     3   0.4   0.6   0.99   0.40

Table 5. Integrated likelihood estimator in the case with exogenous regressors, using the implicit parametrization of Woutersen (2001) (σ² unknown; 5000 data sets).

  Case   bias ρ̂    bias β̂    RMSE ρ̂   RMSE β̂    T    ρ     β     α     σ_ξ
  I     -0.0003    0.0004   0.0288   0.0288    6   0.0   1.0   0.80   0.85
  II     0.0002    0.0006   0.0289   0.0283    6   0.4   0.6   0.80   0.88
  III    0.0002    0.0005   0.0290   0.0288    6   0.8   0.2   0.80   0.40
  IV     0.0004   -0.0004   0.0289   0.0290    6   0.0   1.0   0.99   0.20
  V     -0.0001    0.0004   0.0287   0.0289    6   0.4   0.6   0.99   0.19
  VI    -0.0004   -0.0007   0.0290   0.0289    6   0.8   0.2   0.99   0.07
  VII   -0.0009    0.0010   0.0287   0.0287    3   0.4   0.6   0.80   0.88
  VIII  -0.0001   -0.0010   0.0290   0.0288    3   0.4   0.6   0.80   1.84
  XI    -0.0002   -0.0008   0.0287   0.0289    3   0.4   0.6   0.99   0.19
  XII   -0.0020    0.0003   0.0283   0.0289    3   0.4   0.6   0.99   0.40