ECON 6002 Econometrics I
Memorial University of Newfoundland

Panel Data Models
Adapted from Vera Tabakova’s notes

15.1 Grunfeld’s Investment Data
15.2 Sets of Regression Equations
15.3 Seemingly Unrelated Regressions
15.4 The Fixed Effects Model
15.5 The Random Effects Model
Extensions: RCM, dealing with endogeneity when we have static variables

Principles of Econometrics, 3rd Edition

The different types of panel data sets can be described as:
- “long and narrow,” with a “long” time dimension and a “narrow” cross section (few cross-sectional units);
- “short and wide,” with many units observed over a short period of time;
- “long and wide,” indicating that both N and T are relatively large.

\[ INV_{it} = f(V_{it}, K_{it}) \tag{15.1} \]

The data consist of T = 20 years of data (1935-1954) for N = 10 large firms.
- $V_{it}$: value of the firm's stock, a proxy for expected profits
- $K_{it}$: capital stock, a proxy for the desired permanent capital stock

Let $y_{it} = INV_{it}$, $x_{2it} = V_{it}$ and $x_{3it} = K_{it}$:

\[ y_{it} = \beta_{1it} + \beta_{2it} x_{2it} + \beta_{3it} x_{3it} + e_{it} \tag{15.2} \]

Notice the subindices! Every coefficient is allowed to differ across firms (i) and across time (t).

If the coefficients are the same for both firms and for all periods:

\[
\begin{aligned}
INV_{GE,t} &= \beta_1 + \beta_2 V_{GE,t} + \beta_3 K_{GE,t} + e_{GE,t}, \qquad t = 1, \ldots, 20 \\
INV_{WE,t} &= \beta_1 + \beta_2 V_{WE,t} + \beta_3 K_{WE,t} + e_{WE,t}, \qquad t = 1, \ldots, 20
\end{aligned}
\tag{15.3a}
\]

\[ y_{it} = \beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}, \qquad i = 1, 2; \; t = 1, \ldots, 20 \tag{15.3b} \]

For simplicity we focus on only two firms, General Electric (GE) and Westinghouse (WE). In Stata:

    keep if (i==3 | i==8)

If the coefficients differ across the two firms:

\[
\begin{aligned}
INV_{GE,t} &= \beta_{1,GE} + \beta_{2,GE} V_{GE,t} + \beta_{3,GE} K_{GE,t} + e_{GE,t}, \qquad t = 1, \ldots, 20 \\
INV_{WE,t} &= \beta_{1,WE} + \beta_{2,WE} V_{WE,t} + \beta_{3,WE} K_{WE,t} + e_{WE,t}, \qquad t = 1, \ldots, 20
\end{aligned}
\tag{15.4a}
\]

\[ y_{it} = \beta_{1i} + \beta_{2i} x_{2it} + \beta_{3i} x_{3it} + e_{it}, \qquad i = 1, 2; \; t = 1, \ldots, 20 \tag{15.4b} \]

\[
\begin{aligned}
E(e_{GE,t}) &= 0, & \operatorname{var}(e_{GE,t}) &= \sigma^2_{GE}, & \operatorname{cov}(e_{GE,t}, e_{GE,s}) &= 0 \\
E(e_{WE,t}) &= 0, & \operatorname{var}(e_{WE,t}) &= \sigma^2_{WE}, & \operatorname{cov}(e_{WE,t}, e_{WE,s}) &= 0
\end{aligned}
\tag{15.5}
\]

Assumption (15.5), with $t \neq s$, says that the errors in both investment functions (i) have zero mean, (ii) are homoskedastic with constant variance, and (iii) are not correlated over time; autocorrelation does not exist.

The two equations do have different error variances, $\sigma^2_{GE}$ and $\sigma^2_{WE}$.
Separate least squares regressions for each firm, in Stata:

    * General Electric (i == 3): estimate and store the sum of squared residuals
    reg inv v k if i==3
    scalar sse_ge = e(rss)
    * Westinghouse (i == 8): estimate and store the sum of squared residuals
    reg inv v k if i==8
    scalar sse_we = e(rss)

Let $D_i$ be a dummy variable equal to 1 for the Westinghouse observations and 0 for the General Electric observations.

\[ INV_{it} = \beta_{1,GE} + \delta_1 D_i + \beta_{2,GE} V_{it} + \delta_2 (D_i \times V_{it}) + \beta_{3,GE} K_{it} + \delta_3 (D_i \times K_{it}) + e_{it} \tag{15.6} \]

    * Create dummy variable
    gen d = (i == 8)
    gen dv = d*v
    gen dk = d*k
    * Estimate dummy variable model
    reg inv d v dv k dk
    test d dv dk

\[ \operatorname{cov}(e_{GE,t}, e_{WE,t}) = \sigma_{GE,WE} \tag{15.7} \]

This assumption says that the error terms in the two equations, at the same point in time, are correlated. This kind of correlation is called a contemporaneous correlation.

Econometric software includes commands for SUR (or SURE) that carry out the following steps:
(i) Estimate the equations separately using least squares;
(ii) Use the least squares residuals from step (i) to estimate $\sigma^2_{GE}$, $\sigma^2_{WE}$ and $\sigma_{GE,WE}$;
(iii) Use the estimates from step (ii) to estimate the two equations jointly within a generalized least squares framework.

    * Open and summarize data
    use grunfeld2, clear
    summarize
    * SUR
    sureg ( inv_ge v_ge k_ge) ( inv_we v_we k_we), corr
    test ([inv_ge]_cons = [inv_we]_cons) ([inv_ge]_b[v_ge] = [inv_we]_b[v_we]) ([inv_ge]_b[k_ge] = [inv_we]_b[k_we])

There are two situations where separate least squares estimation is just as good as the SUR technique:
(i) when the equation errors are not contemporaneously correlated;
(ii) when the same explanatory variables appear in each equation.
If the explanatory variables in each equation are different, then a test to see if the correlation between the errors is significantly different from zero is of interest.

\[ r^2_{GE,WE} = \frac{\hat\sigma^2_{GE,WE}}{\hat\sigma^2_{GE}\,\hat\sigma^2_{WE}} = \frac{(207.5871)^2}{(777.4463)(104.3079)} = 0.53139 \]

\[ \hat\sigma_{GE,WE} = \frac{1}{\sqrt{T - K_{GE}}\,\sqrt{T - K_{WE}}} \sum_{t=1}^{20} \hat e_{GE,t}\,\hat e_{WE,t} = \frac{1}{T - 3} \sum_{t=1}^{20} \hat e_{GE,t}\,\hat e_{WE,t} \]

In this case we have 3 parameters in each equation, so $K_{GE} = K_{WE} = 3$.

Testing for correlated errors for two equations:

\[ H_0: \sigma_{GE,WE} = 0 \]

\[ LM = T\, r^2_{GE,WE} \sim \chi^2_{(1)} \quad \text{under } H_0 \]

LM = 10.628 > 3.84. Hence we reject the null hypothesis of no correlation between the errors and conclude that there are potential efficiency gains from estimating the two investment equations jointly using SUR.

Testing for correlated errors for three equations:

\[ H_0: \sigma_{12} = \sigma_{13} = \sigma_{23} = 0 \]

\[ LM = T \left( r^2_{12} + r^2_{13} + r^2_{23} \right) \sim \chi^2_{(3)} \]

Testing for correlated errors for M equations:

\[ LM = T \sum_{i=2}^{M} \sum_{j=1}^{i-1} r^2_{ij} \]

Under the null hypothesis that there are no contemporaneous correlations, this LM statistic has a $\chi^2$-distribution with M(M-1)/2 degrees of freedom, in large samples.

\[ H_0: \beta_{1,GE} = \beta_{1,WE}, \quad \beta_{2,GE} = \beta_{2,WE}, \quad \beta_{3,GE} = \beta_{3,WE} \tag{15.8} \]

Most econometric software will perform an F-test and/or a Wald $\chi^2$-test; in the context of SUR equations both tests are large sample approximate tests.

The F-statistic has J numerator degrees of freedom and (MT - K) denominator degrees of freedom, where J is the number of hypotheses, M is the number of equations, K is the total number of coefficients in the whole system, and T is the number of time series observations per equation.

The $\chi^2$-statistic has J degrees of freedom.

\[ y_{it} = \beta_{1it} + \beta_{2it} x_{2it} + \beta_{3it} x_{3it} + e_{it} \tag{15.9} \]

We cannot consistently estimate the 3×N×T parameters in (15.9) with only NT total observations.
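A quick count for the Grunfeld data (N = 10 firms, T = 20 years) shows the scale of the problem in (15.9). The last line is only a point of comparison and assumes a model with one intercept per firm and two common slopes:

```latex
\begin{align*}
\text{parameters in (15.9):}\quad & 3 \times N \times T = 3 \times 10 \times 20 = 600 \\
\text{observations:}\quad & N \times T = 10 \times 20 = 200 \\
\text{firm intercepts plus two common slopes:}\quad & N + 2 = 12
\end{align*}
```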
But we can impose some more structure…

\[ \beta_{1it} = \beta_{1i}, \qquad \beta_{2it} = \beta_2, \qquad \beta_{3it} = \beta_3 \tag{15.10} \]

We consider only one-way effects and assume common slope parameters across cross-sectional units.

All behavioral differences between individual firms and over time are captured by the intercept. Individual intercepts are included to “control” for these firm-specific differences.

\[ y_{it} = \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it} \tag{15.11} \]

\[
D_{1i} = \begin{cases} 1 & i = 1 \\ 0 & \text{otherwise} \end{cases}, \qquad
D_{2i} = \begin{cases} 1 & i = 2 \\ 0 & \text{otherwise} \end{cases}, \qquad
D_{3i} = \begin{cases} 1 & i = 3 \\ 0 & \text{otherwise} \end{cases}, \quad \text{etc.}
\]

\[ INV_{it} = \beta_{11} D_{1i} + \beta_{12} D_{2i} + \cdots + \beta_{1,10} D_{10i} + \beta_2 V_{it} + \beta_3 K_{it} + e_{it} \tag{15.12} \]

This specification is sometimes called the least squares dummy variable model, or the fixed effects model.

\[
\begin{aligned}
H_0 &: \beta_{11} = \beta_{12} = \cdots = \beta_{1N} \\
H_1 &: \text{the } \beta_{1i} \text{ are not all equal}
\end{aligned}
\tag{15.13}
\]

These N - 1 = 9 joint null hypotheses are tested using the usual F-test statistic. In the restricted model all the intercept parameters are equal. If we call their common value $\beta_1$, then the restricted model is:

\[ INV_{it} = \beta_1 + \beta_2 V_{it} + \beta_3 K_{it} + e_{it} \]

\[ F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(NT - K)} = \frac{(1749128 - 522855)/9}{522855/(200 - 12)} = 48.99 \]

We reject the null hypothesis that the intercept parameters for all firms are equal. We conclude that there are differences in firm intercepts, and that the data should not be pooled into a single model with a common intercept parameter.
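The F computation above can be checked with a few lines of arithmetic. This is a minimal Python sketch rather than Stata output; the function name `f_statistic` and its argument names are illustrative, and the sums of squared errors are the ones reported on the slide.

```python
def f_statistic(sse_r, sse_u, j, n_obs, k_unrestricted):
    """F statistic for J joint restrictions: ((SSE_R - SSE_U)/J) / (SSE_U/(NT - K))."""
    return ((sse_r - sse_u) / j) / (sse_u / (n_obs - k_unrestricted))


N, T = 10, 20      # firms and years in the Grunfeld data
K = N + 2          # unrestricted model: 10 firm intercepts plus 2 slopes
J = N - 1          # restrictions: all 10 intercepts equal

F = f_statistic(sse_r=1749128, sse_u=522855, j=J, n_obs=N * T, k_unrestricted=K)
print(round(F, 2))  # 48.99
```

The statistic is compared with the critical value of an F distribution with J = 9 and NT - K = 188 degrees of freedom.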
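\[ y_{it} = \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}, \qquad t = 1, \ldots, T \tag{15.14} \]

Averaging (15.14) over time for each individual:

\[
\begin{aligned}
\bar y_i = \frac{1}{T} \sum_{t=1}^{T} y_{it} &= \beta_{1i} + \beta_2 \frac{1}{T} \sum_{t=1}^{T} x_{2it} + \beta_3 \frac{1}{T} \sum_{t=1}^{T} x_{3it} + \frac{1}{T} \sum_{t=1}^{T} e_{it} \\
&= \beta_{1i} + \beta_2 \bar x_{2i} + \beta_3 \bar x_{3i} + \bar e_i
\end{aligned}
\tag{15.15}
\]

Subtracting (15.15) from (15.14):

\[
\begin{aligned}
y_{it} &= \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it} \\
-\,(\bar y_i &= \beta_{1i} + \beta_2 \bar x_{2i} + \beta_3 \bar x_{3i} + \bar e_i) \\
y_{it} - \bar y_i &= \beta_2 (x_{2it} - \bar x_{2i}) + \beta_3 (x_{3it} - \bar x_{3i}) + (e_{it} - \bar e_i)
\end{aligned}
\tag{15.16}
\]

\[ \tilde y_{it} = \beta_2 \tilde x_{2it} + \beta_3 \tilde x_{3it} + \tilde e_{it} \tag{15.17} \]

\[
\begin{aligned}
\widetilde{INV}_{it} &= .1098\, \tilde V_{it} + .3106\, \tilde K_{it} \\
(se^*) &\quad\; (.0116) \qquad (.0169)
\end{aligned}
\tag{15.18}
\]

Least squares on the demeaned data computes the error variance as $\hat\sigma^2_{e^*} = SSE/(NT - 2)$, ignoring the N firm means that were estimated. The correct divisor is NT - N - 2, so the reported standard errors (se*) must be multiplied by the correction factor

\[ \sqrt{\frac{NT - 2}{NT - N - 2}} = \sqrt{\frac{198}{188}} = 1.02625 \]

The firm intercepts are recovered from the fitted firm means:

\[ \bar y_i = b_{1i} + b_2 \bar x_{2i} + b_3 \bar x_{3i} \quad \Rightarrow \quad b_{1i} = \bar y_i - b_2 \bar x_{2i} - b_3 \bar x_{3i}, \qquad i = 1, \ldots, N \tag{15.19} \]

ONE PROBLEM: Even with the trick of using the within estimator, we still implicitly (even if no longer explicitly) include the firm dummy variables in our model, so we use up N - 1 degrees of freedom relative to a model with a single common intercept. It might then not be the most efficient way to estimate the common slopes.

ANOTHER ONE: By using deviations from the means, the procedure wipes out all the static (time-invariant) variables, whose effects might be of interest.

In order to overcome these problems, we can consider the random effects, or error components, model.

\[ \beta_{1i} = \bar\beta_1 + u_i \tag{15.20} \]

\[ E(u_i) = 0, \qquad \operatorname{cov}(u_i, u_j) = 0 \;\; (i \neq j), \qquad \operatorname{var}(u_i) = \sigma^2_u \tag{15.21} \]

\[
\begin{aligned}
y_{it} &= \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it} \\
&= (\bar\beta_1 + u_i) + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}
\end{aligned}
\tag{15.22}
\]

Here $u_i$ captures the randomness of the intercept and $e_{it}$ is the usual error.

\[
\begin{aligned}
y_{it} &= \bar\beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + (e_{it} + u_i) \\
&= \bar\beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + v_{it}
\end{aligned}
\tag{15.23}
\]

where $(e_{it} + u_i)$ is a composite error:

\[ v_{it} = u_i + e_{it} \tag{15.24} \]

Because the random effects regression error has two components, one for the individual and one for the regression, the random effects model is often called an error components model.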
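\[ E(v_{it}) = E(u_i + e_{it}) = E(u_i) + E(e_{it}) = 0 + 0 = 0 \]

\[
\begin{aligned}
\sigma^2_v = \operatorname{var}(v_{it}) &= \operatorname{var}(u_i + e_{it}) \\
&= \operatorname{var}(u_i) + \operatorname{var}(e_{it}) + 2\operatorname{cov}(u_i, e_{it}) \\
&= \sigma^2_u + \sigma^2_e
\end{aligned}
\tag{15.25}
\]

There are several correlations that can be considered.

The correlation between two individuals, i and j, at the same point in time, t. The covariance for this case is given by

\[
\begin{aligned}
\operatorname{cov}(v_{it}, v_{jt}) = E(v_{it} v_{jt}) &= E[(u_i + e_{it})(u_j + e_{jt})] \\
&= E(u_i u_j) + E(u_i e_{jt}) + E(e_{it} u_j) + E(e_{it} e_{jt}) \\
&= 0 + 0 + 0 + 0 = 0
\end{aligned}
\]

The correlation between errors on the same individual (i) at different points in time, t and s. The covariance for this case is given by

\[
\begin{aligned}
\operatorname{cov}(v_{it}, v_{is}) = E(v_{it} v_{is}) &= E[(u_i + e_{it})(u_i + e_{is})] \\
&= E(u_i^2) + E(u_i e_{is}) + E(e_{it} u_i) + E(e_{it} e_{is}) \\
&= \sigma^2_u + 0 + 0 + 0 = \sigma^2_u
\end{aligned}
\tag{15.26}
\]

The correlation between errors for different individuals in different time periods. The covariance for this case is

\[
\begin{aligned}
\operatorname{cov}(v_{it}, v_{js}) = E(v_{it} v_{js}) &= E[(u_i + e_{it})(u_j + e_{js})] \\
&= E(u_i u_j) + E(u_i e_{js}) + E(e_{it} u_j) + E(e_{it} e_{js}) \\
&= 0 + 0 + 0 + 0 = 0
\end{aligned}
\]

\[ \rho = \operatorname{corr}(v_{it}, v_{is}) = \frac{\operatorname{cov}(v_{it}, v_{is})}{\sqrt{\operatorname{var}(v_{it}) \operatorname{var}(v_{is})}} = \frac{\sigma^2_u}{\sigma^2_u + \sigma^2_e} \tag{15.27} \]

To test for the presence of random effects, $H_0: \sigma^2_u = 0$, use the least squares residuals from the pooled model:

\[ y_{it} = \bar\beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}, \qquad \hat e_{it} = y_{it} - b_1 - b_2 x_{2it} - b_3 x_{3it} \]

\[ LM = \sqrt{\frac{NT}{2(T - 1)}} \left\{ \frac{\sum_{i=1}^{N} \left( \sum_{t=1}^{T} \hat e_{it} \right)^2}{\sum_{i=1}^{N} \sum_{t=1}^{T} \hat e_{it}^2} - 1 \right\} \tag{15.28} \]

The random effects (GLS) estimator is least squares applied to the transformed model:

\[ y^*_{it} = \bar\beta_1 x^*_{1it} + \beta_2 x^*_{2it} + \beta_3 x^*_{3it} + v^*_{it} \tag{15.29} \]

\[ y^*_{it} = y_{it} - \alpha \bar y_i, \qquad x^*_{1it} = 1 - \alpha, \qquad x^*_{2it} = x_{2it} - \alpha \bar x_{2i}, \qquad x^*_{3it} = x_{3it} - \alpha \bar x_{3i} \tag{15.30} \]

\[ \alpha = 1 - \frac{\sigma_e}{\sqrt{T \sigma^2_u + \sigma^2_e}} \tag{15.31} \]

\[ \hat\alpha = 1 - \frac{\hat\sigma_e}{\sqrt{T \hat\sigma^2_u + \hat\sigma^2_e}} = 1 - \frac{.1951}{\sqrt{5(.1083) + .0381}} = .7437 \]

Which model?
- Pooled OLS vs. different intercepts: test (use a Chow-type test after FE, or run RE and test whether the variance of the intercept component of the error is zero).
- You cannot pool into OLS? Then FE vs. RE: test (Hausman type).
- Different slopes too, perhaps?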
  => use SURE or RCM and test for equality of slopes across units.

Note that there is within variation versus between variation:
- The OLS is an unweighted average of the between estimator and the within estimator.
- The RE is a weighted average of the between estimator and the within estimator.
- The FE is also a weighted average of the between estimator and the within estimator, with zero as the weight for the between part.

So now you see where the extra efficiency of RE comes from!
- The RE uses information from both the cross-sectional variation in the panel and the time series variation, so it mixes long-run (LR) and short-run (SR) effects.
- The FE uses only information from the time series variation, so it estimates SR effects.

With a panel, we can learn about dynamic effects from a short panel, while we need a long time series on a single cross-sectional unit to learn about dynamics from a time series data set.

If the random error $v_{it} = u_i + e_{it}$ is correlated with any of the right-hand side explanatory variables in a random effects model, then the least squares and GLS estimators of the parameters are biased and inconsistent.

\[ y_{it} = \bar\beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + (u_i + e_{it}) \tag{15.32} \]

\[
\begin{aligned}
\bar y_i = \frac{1}{T} \sum_{t=1}^{T} y_{it} &= \bar\beta_1 + \beta_2 \frac{1}{T} \sum_{t=1}^{T} x_{2it} + \beta_3 \frac{1}{T} \sum_{t=1}^{T} x_{3it} + \frac{1}{T} \sum_{t=1}^{T} u_i + \frac{1}{T} \sum_{t=1}^{T} e_{it} \\
&= \bar\beta_1 + \beta_2 \bar x_{2i} + \beta_3 \bar x_{3i} + u_i + \bar e_i
\end{aligned}
\tag{15.33}
\]

\[
\begin{aligned}
y_{it} &= \bar\beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + u_i + e_{it} \\
-\,(\bar y_i &= \bar\beta_1 + \beta_2 \bar x_{2i} + \beta_3 \bar x_{3i} + u_i + \bar e_i) \\
y_{it} - \bar y_i &= \beta_2 (x_{2it} - \bar x_{2i}) + \beta_3 (x_{3it} - \bar x_{3i}) + (e_{it} - \bar e_i)
\end{aligned}
\tag{15.34}
\]

The within transformation removes $u_i$, so the fixed effects estimator remains consistent even when $u_i$ is correlated with the explanatory variables. This is the basis of the Hausman test, which compares the two sets of estimates coefficient by coefficient:

\[ t = \frac{b_{FE,k} - b_{RE,k}}{\left[ \widehat{\operatorname{var}}(b_{FE,k}) - \widehat{\operatorname{var}}(b_{RE,k}) \right]^{1/2}} = \frac{b_{FE,k} - b_{RE,k}}{\left[ \operatorname{se}(b_{FE,k})^2 - \operatorname{se}(b_{RE,k})^2 \right]^{1/2}} \tag{15.35} \]

We expect to find $\widehat{\operatorname{var}}(b_{FE,k}) - \widehat{\operatorname{var}}(b_{RE,k}) > 0$.
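The statistic in (15.35) is simple enough to sketch as a small function. This is an illustrative Python version, not Stata's `hausman` command; the function name is hypothetical. The inputs are the SOUTH coefficient values reported later in these slides, and the minus signs on both coefficients are an assumption, chosen because they reproduce the reported statistic of 2.3137.

```python
import math


def hausman_t(b_fe, b_re, se_fe, se_re):
    """t statistic (15.35) contrasting one FE coefficient with its RE counterpart."""
    var_diff = se_fe ** 2 - se_re ** 2
    if var_diff <= 0:
        # In finite samples the estimated difference can be non-positive,
        # in which case the statistic cannot be computed this way.
        raise ValueError("se_fe**2 - se_re**2 must be positive")
    return (b_fe - b_re) / math.sqrt(var_diff)


# SOUTH coefficient: FE and RE estimates with their standard errors
t_south = hausman_t(b_fe=-0.0163, b_re=-0.0818, se_fe=0.0361, se_re=0.0224)
print(round(t_south, 2))  # 2.31
```

A value above 1.96 in absolute terms rejects, at the 5% level, the hypothesis that the two estimators are estimating the same coefficient.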
The variance of the difference between the two estimators is

\[
\begin{aligned}
\operatorname{var}(b_{FE,k} - b_{RE,k}) &= \operatorname{var}(b_{FE,k}) + \operatorname{var}(b_{RE,k}) - 2\operatorname{cov}(b_{FE,k}, b_{RE,k}) \\
&= \operatorname{var}(b_{FE,k}) - \operatorname{var}(b_{RE,k})
\end{aligned}
\]

because Hausman proved that $\operatorname{cov}(b_{FE,k}, b_{RE,k}) = \operatorname{var}(b_{RE,k})$.

The test statistic for the coefficient of SOUTH is:

\[ t = \frac{b_{FE,k} - b_{RE,k}}{\left[ \operatorname{se}(b_{FE,k})^2 - \operatorname{se}(b_{RE,k})^2 \right]^{1/2}} = \frac{-.0163 - (-.0818)}{\left[ (.0361)^2 - (.0224)^2 \right]^{1/2}} = 2.3137 \]

Using the standard 5% large sample critical value of 1.96, we reject the hypothesis that the estimators yield identical results. Our conclusion is that the random effects estimator is inconsistent, and we should use the fixed effects estimator, or we should attempt to improve the model specification.

If the random error $v_{it} = u_i + e_{it}$ is correlated with any of the right-hand side explanatory variables in a random effects model, then the least squares and GLS estimators of the parameters are biased and inconsistent.
- Then we would have to use the FE model.
- But with FE we lose the static variables?
- Solutions? HT (Hausman-Taylor), AM (Amemiya-MaCurdy), BMS (Breusch-Mizon-Schmidt) and other instrumental variables models could help.

Further issues
We can generalise the random effects idea and allow for different slopes too: the Random Coefficients Model (RCM).
Now it is the slope parameters that differ but, as in the RE model, they are drawn from a common distribution.
The RCM is, in a way, to the RE model what the SURE model is to the FE model.

Further issues
- Unit root tests and cointegration in panels
- Dynamics in panels

Further issues
Of course it is not necessary that one of the dimensions of the panel is time as such.
Example: i indexes students and t indexes each quiz they take.
Of course we could have a one-way effect model on the time dimension instead.
Or a two-way model.
Or a three-way model!
But things get a bit more complicated there…

Further issues
Another way to have more fun with panel data is to consider dependent variables that are not continuous.
Logit, probit and count data models can be considered.
STATA has commands for these.
They are based on maximum likelihood and other estimation techniques we have not yet considered.

Further issues
You can understand the use of the FE model as a solution to omitted variable bias.
- If the unmeasured variables left in the error term are not correlated with the ones in the model, we would not have a bias in OLS, so we can safely use RE.
- If the unmeasured variables left in the error term are correlated with the ones in the model, we would have a bias in OLS, so we cannot use RE. We should not leave them out, and we should use FE, which bundles them together in each cross-sectional dummy.

Further issues
Another criterion to choose between FE and RE: if the panel includes all the relevant cross-sectional units, use FE; if it is only a random sample from a population, RE is more appropriate (as long as it is valid).

Readings
- Wooldridge’s book on panel data
- Baltagi’s book on panel data
- Greene’s coverage is also good

Keywords
- Balanced panel
- Breusch-Pagan test
- Cluster corrected standard errors
- Contemporaneous correlation
- Endogeneity
- Error components model
- Fixed effects estimator
- Fixed effects model
- Hausman test
- Heterogeneity
- Least squares dummy variable model
- LM test
- Panel corrected standard errors
- Pooled panel data regression
- Pooled regression
- Random effects estimator
- Random effects model
- Seemingly unrelated regressions
- Unbalanced panel

Appendix 15A

\[ y_{it} = \bar\beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + (u_i + e_{it}) \tag{15A.1} \]

\[ y_{it} - \bar y_i = \beta_2 (x_{2it} - \bar x_{2i}) + \beta_3 (x_{3it} - \bar x_{3i}) + (e_{it} - \bar e_i) \tag{15A.2} \]

\[ \hat\sigma^2_e = \frac{SSE_{DV}}{NT - N - K_{slopes}} \tag{15A.3} \]

\[ \bar y_i = \bar\beta_1 + \beta_2 \bar x_{2i} + \beta_3 \bar x_{3i} + u_i + \bar e_i, \qquad i = 1, \ldots, N \tag{15A.4} \]

\[
\begin{aligned}
\operatorname{var}(u_i + \bar e_i) &= \operatorname{var}(u_i) + \operatorname{var}(\bar e_i) = \operatorname{var}(u_i) + \operatorname{var}\left( \frac{1}{T} \sum_{t=1}^{T} e_{it} \right) \\
&= \sigma^2_u + \frac{1}{T^2} \operatorname{var}\left( \sum_{t=1}^{T} e_{it} \right) = \sigma^2_u + \frac{T \sigma^2_e}{T^2} \\
&= \sigma^2_u + \frac{\sigma^2_e}{T}
\end{aligned}
\tag{15A.5}
\]

\[ \widehat{\sigma^2_u + \frac{\sigma^2_e}{T}} = \frac{SSE_{BE}}{N - K_{BE}} \tag{15A.6} \]

\[ \hat\sigma^2_u = \widehat{\sigma^2_u + \frac{\sigma^2_e}{T}} - \frac{\hat\sigma^2_e}{T} = \frac{SSE_{BE}}{N - K_{BE}} - \frac{SSE_{DV}}{T (NT - N - K_{slopes})} \tag{15A.7} \]
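The variance component estimators (15A.3) and (15A.7), together with the transformation parameter in (15.31), can be sketched as three small functions. This is a minimal Python illustration with hypothetical function names, not the routine any particular package uses. The inputs in the usage lines are the values reported in the slides: the dummy variable SSE for the Grunfeld data, and the α̂ example with T = 5, where σ̂²_e is taken as .1951 squared.

```python
import math


def sigma_e2_hat(sse_dv, n, t, k_slopes):
    """(15A.3): error variance from the dummy variable (within) regression."""
    return sse_dv / (n * t - n - k_slopes)


def sigma_u2_hat(sse_be, n, t, k_be, sigma_e2):
    """(15A.7): individual-effect variance from the between regression.

    The result can be negative in small samples; this sketch returns it as is.
    """
    return sse_be / (n - k_be) - sigma_e2 / t


def gls_alpha(sigma_e2, sigma_u2, t):
    """(15.31): weight used to partially demean the data for RE estimation."""
    return 1.0 - math.sqrt(sigma_e2) / math.sqrt(t * sigma_u2 + sigma_e2)


# Grunfeld data: SSE of the dummy variable model, N = 10, T = 20, two slopes
s2_e_grunfeld = sigma_e2_hat(sse_dv=522855, n=10, t=20, k_slopes=2)

# Slide values for the alpha example: sigma_e = .1951, sigma_u^2 = .1083, T = 5
alpha = gls_alpha(sigma_e2=0.1951 ** 2, sigma_u2=0.1083, t=5)
print(round(alpha, 4))  # 0.7437
```

When σ²_u is zero, α is 0 and the transformed model is the pooled regression; as Tσ²_u grows relative to σ²_e, α approaches 1 and the transformation approaches the within (fixed effects) transformation.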