Fixed and Random Effects

IX. Fixed Effects

A. Unmodeled Heterogeneity. We hope that our independent variables explain much of what is different about an observation, a unit, or a year, but there is probably some unmodeled heterogeneity left over. Since we haven’t modeled it, it goes into e_{i,t}. The real problem comes when all of the observations on a unit (or, less commonly, in a time period) share some unmodeled heterogeneity. We’d love to be able to explain everything that makes Luxembourg different, but usually we can’t, so we need to violate the prohibition on using proper names as independent variables and do something to remove this shared, and thus systematic, heterogeneity from the error term.

B. The Fixed Effects Model in Concept. One way to do this is to estimate a “fixed effects” model that gives Luxembourg and every other unit in our study its own intercept. The most intuitive way to do this is to include a dummy variable for N-1 of the units. We still assume that the betas pool across units, so in essence we have N parallel regression lines. Observations across time in each unit vary around a baseline level specific to that unit. Note that any substantive explanatory variable that does not vary across time within each unit will be perfectly collinear with the fixed effects, so we cannot include it in the model (or estimate its effect).

    y_{i,t} = α_i + x_{i,t}β + e_{i,t}

C. Sketch of a Test for Fixed Effects. The null hypothesis is that our simple, restrictive model was appropriate: all of the units share the same intercept. The alternative is that the intercepts vary across units, so the way to test this is to run both models and compare their sums of squared residuals in a joint F-test.

D. Estimating the Fixed Effects Model. We could just include dummy variables for all but one of the units. If we have panel data, though, this sacrifices a lot of degrees of freedom.
And with so many units and very few time periods, each intercept is estimated from only a handful of observations, so it picks up a lot of random error and is estimated inconsistently (adding more units does not help, because each new unit brings a new intercept). We’re not going to learn much of substance from these “incidental” or “nuisance” parameters. This frees us to estimate our substantive coefficients in a slightly different way that preserves the substantive story of fixed effects without costing us so many degrees of freedom.

We convert the xs and y for each observation into deviations from the mean in that unit. This “sweeps out the unit effects” because once the variables are mean-deviated, you no longer need to include an intercept term. So the model regresses y_{i,t} - mean(y_i) on x_{i,t} - mean(x_i). This is often called the “within” estimator because it looks at how changes in the explanatory variables cause y to vary around a mean within the unit. Stata has a canned procedure that (I believe) transforms your variables in this way and then corrects the standard errors to reflect the fact that N of your observations bring no new information (since they are determined by the mean and the other observations for each unit).

    xtreg DiscretionarySpending salary totalday staffper init ideo leg_dd gov_dem docs_g senc_g medi_g noreast, fe

    Fixed-effects (within) regression               Number of obs      =       616
    Group variable (i): alpha                       Number of groups   =        44

    R-sq:  within  = 0.0126                         Obs per group: min =        14
           between = 0.0232                                        avg =      14.0
           overall = 0.0053                                        max =        14

                                                    F(3,569)           =      2.42
    corr(u_i, Xb)  = -0.3025                        Prob > F           =    0.0648

    ------------------------------------------------------------------------------
    Discretion~g |      Coef.   Std. Err.       t   P>|t|   [95% Conf.   Interval]
    -------------+----------------------------------------------------------------
          salary |  -2.94e-06    5.26e+07   -0.00   1.000    -1.03e+08    1.03e+08
        totalday |  (dropped)
        staffper |  (dropped)
            init |  (dropped)
           ideol |  (dropped)
          leg_dd |  -.1616329    .0701156   -2.31   0.022      -.29935   -.0239159
         gov_dem |  -.0682768    .0426327   -1.60   0.110    -.1520135    .0154599
          docs_g |  (dropped)
          senc_g |  (dropped)
          medi_g |  (dropped)
         noreast |  (dropped)
           _cons |   4.374255    1.33e+12    0.00   1.000    -2.62e+12    2.62e+12
    -------------+----------------------------------------------------------------
         sigma_u |  .51059043
         sigma_e |  .42354445
             rho |  .59238137   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0:     F(43, 569) =     7.86             Prob > F = 0.0000

X. Random Effects

A. The Random Effects Model in Concept. Instead of thinking of each unit as having its own systematic baseline, we think of each intercept as the result of a random deviation from some mean intercept. The intercept is a draw from some distribution for each unit, and it is independent of the error for a particular observation. Instead of trying to estimate N parameters as in fixed effects, we just need to estimate the parameters describing the distribution from which each unit’s intercept is drawn. If we have a large N (panel data), we will be able to do this, and random effects will be more efficient than fixed effects. It has roughly N more degrees of freedom, and it also uses information from the “between” estimator (which averages observations over a unit and regresses average y on average x to look at differences across units). Another nice property is that you can still include explanatory variables that don’t change over time for a unit. If we have a big T (TS-CS data), then the difference between fixed effects and random effects goes away.

    y_{i,t} = μ + α_i + x_{i,t}β + e_{i,t}

B. Sketch of a Test for Random Effects. A small assumption is that Cov(α_i, e_{i,t}) = 0.
A huge assumption is that Cov(α_i, x_{i,t}) = 0, which means that the things that make a unit’s intercept different are unrelated to that unit’s xs. In concept, a test regresses the errors on the xs.

C. Estimating the Random Effects Model.

    xtreg DiscretionarySpending salary totalday staffper init ideo leg_dd gov_dem docs_g senc_g medi_g noreast, re

    Random-effects GLS regression                   Number of obs      =       616
    Group variable (i): alpha                       Number of groups   =        44

    R-sq:  within  = 0.0125                         Obs per group: min =        14
           between = 0.5560                                        avg =      14.0
           overall = 0.3272                                        max =        14

    Random effects u_i ~ Gaussian                   Wald chi2(11)      =     53.09
    corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
    Discretion~g |      Coef.   Std. Err.       z   P>|z|   [95% Conf.   Interval]
    -------------+----------------------------------------------------------------
          salary |   8.34e-06    5.39e-06    1.55   0.122    -2.23e-06    .0000189
        totalday |   .0000968    .0008137    0.12   0.905    -.0014979    .0016916
        staffper |  -.0223183    .0209193   -1.07   0.286    -.0633193    .0186828
            init |  -.2080793    .1182186   -1.76   0.078    -.4397836    .0236249
           ideol |   .6509317    .8078395    0.81   0.420    -.9324047    2.234268
          leg_dd |  -.1153163     .064981   -1.77   0.076    -.2426768    .0120442
         gov_dem |  -.0570731     .042057   -1.36   0.175    -.1395033    .0253572
          docs_g |   .0700836    .0764945    0.92   0.360    -.0798429    .2200101
          senc_g |    .056224    .1389907    0.40   0.686    -.2161928    .3286409
          medi_g |   .1813995    .0828276    2.19   0.029     .0190605    .3437385
         noreast |   .4777465     .126015    3.79   0.000     .2307616    .7247314
           _cons |    3.74571    .2060277   18.18   0.000     3.341903    4.149517
    -------------+----------------------------------------------------------------
         sigma_u |  .32985269
         sigma_e |  .42354445
             rho |  .37753489   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
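
The Python sketch below works through the within, between, and pooled estimators by hand on a tiny panel, and then checks the "rho" lines of the two xtreg outputs. It is only an illustration under stated assumptions: the three units, their x values, the time-invariant variable z, and the helper names (mean, ols_slope, rho) are all made up here and are not part of the legislative data set. The toy data have no noise and are built so that each unit's intercept rises with its average x, which violates the assumption Cov(α_i, x_{i,t}) = 0.

```python
# Toy balanced panel: 3 hypothetical units, 3 periods each, no noise.
# Assumed true model: y = alpha_i + 2 * x, with alpha_i rising with mean(x_i).
units = ["A", "B", "C"]
alpha = {"A": 1.0, "B": 4.0, "C": 7.0}
x = {"A": [1.0, 2.0, 3.0], "B": [2.0, 3.0, 4.0], "C": [3.0, 4.0, 5.0]}
TRUE_BETA = 2.0
y = {u: [alpha[u] + TRUE_BETA * v for v in x[u]] for u in units}


def mean(vals):
    return sum(vals) / len(vals)


def ols_slope(xs, ys):
    """Slope from a bivariate OLS regression of ys on xs (with an intercept)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


# Pooled OLS: one intercept for everyone, ignoring the unit effects.
beta_pooled = ols_slope([v for u in units for v in x[u]],
                        [v for u in units for v in y[u]])

# Within estimator: regress deviations from unit means on deviations.
dx = [v - mean(x[u]) for u in units for v in x[u]]
dy = [v - mean(y[u]) for u in units for v in y[u]]
beta_within = ols_slope(dx, dy)

# Between estimator: regress unit-average y on unit-average x.
beta_between = ols_slope([mean(x[u]) for u in units],
                         [mean(y[u]) for u in units])

# Unit intercepts can be recovered afterwards from the unit means.
intercepts = {u: mean(y[u]) - beta_within * mean(x[u]) for u in units}

# A variable that never changes within a unit is all zeros after demeaning,
# so the within estimator has nothing to work with (Stata reports "dropped").
z = {"A": [1.0] * 3, "B": [0.0] * 3, "C": [1.0] * 3}
dz = [v - mean(z[u]) for u in units for v in z[u]]


def rho(sigma_u, sigma_e):
    """Fraction of variance due to u_i: sigma_u^2 / (sigma_u^2 + sigma_e^2)."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)


# sigma_u and sigma_e copied from the fixed- and random-effects outputs above.
rho_fe = rho(0.51059043, 0.42354445)
rho_re = rho(0.32985269, 0.42354445)

print("pooled:", beta_pooled, "within:", beta_within, "between:", beta_between)
print("recovered intercepts:", intercepts)
print("rho (fe):", round(rho_fe, 4), "rho (re):", round(rho_re, 4))
```

On these made-up data the within slope equals the assumed beta of 2, the between slope is 5, and the pooled slope of 3.5 falls in between, because pooled OLS mixes the within and between variation. That is the sense in which using between information is only safe when the unit effects are unrelated to the xs. The two rho values reproduce the figures Stata reports (about 0.592 and 0.378).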