Fixed and Random Effects

IX. Fixed Effects.
A. Unmodeled Heterogeneity. We hope that our independent variables have
explained much of what is different about an observation, a unit, or a year, but there is
probably some unmodeled heterogeneity. Since we haven’t modeled it, it goes into ei,t. The
real problem comes when some units (or, less commonly, time periods) share some
unmodeled heterogeneity. We’d love to be able to explain everything that makes
Luxembourg different, but usually we can’t, so we need to violate the prohibition on using
proper names as independent variables and do something to remove this shared and thus
systematic heterogeneity from the error term.
B. The Fixed Effects Model in Concept. One way to do this is to estimate a
“fixed effects” model that gives Luxembourg and every other unit in our study its own
intercept. The most intuitive way to do this would be by including a dummy variable for
N-1 units. We still assume that the betas pool across units, so in essence we have N parallel
regression lines. Observations across time in each unit vary around a baseline level specific
to that unit. Note that any substantive explanatory variables that do not vary across time in
each unit will be perfectly collinear with the fixed effects, and so we cannot include them in
the model (or estimate their effects).
yi,t = αi + xi,tβ + ei,t
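A minimal Stata sketch of the dummy-variable version of this model follows. It borrows variable names from the xtreg commands later in these notes and assumes, as the output there suggests, that alpha is the unit identifier. Only two time-varying regressors are included, since regressors that never change within a unit would be collinear with the dummies.

```stata
* Fixed effects via unit dummies (least-squares dummy variables).
* Assumption: alpha identifies the unit, as in "Group variable (i): alpha".
* xi: expands i.alpha into one dummy per unit and omits one as the baseline.
xi: regress DiscretionarySpending leg_dd gov_dem i.alpha
```

The slopes on leg_dd and gov_dem are common to all units; the coefficients on the generated _Ialpha_* dummies shift the intercept unit by unit.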
C. Sketch of a Test for Fixed Effects. The null hypothesis is that our simple,
restrictive model was appropriate, that all of the units share the same intercept. The
alternative is that they vary across units, so the way to test this is by running both models
and then comparing their sums of squared residuals in a joint F-test.
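As a worked form of this comparison, the statistic can be written as below. The symbols are labels introduced here rather than notation from the notes: SSR_R and SSR_U are the sums of squared residuals from the pooled (restricted) and fixed effects (unrestricted) models, N is the number of units, NT the number of observations in a balanced panel, and K the number of estimated slopes. The F distribution holds under the null and the usual classical assumptions.

```latex
F = \frac{(SSR_R - SSR_U)/(N-1)}{SSR_U/(NT - N - K)} \sim F(N-1,\; NT-N-K)
```

With N = 44 units, NT = 616 observations, and K = 3 estimated slopes, the degrees of freedom are 43 and 616 - 44 - 3 = 569, which matches the F(43, 569) reported at the bottom of the fixed effects output in section D.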
D. Estimating the Fixed Effects Model. We could just include dummy variables
for all but one of the units. If we have panel data, though, this sacrifices a lot of degrees of
freedom. And with so many units and very few time periods, these intercepts may be
picking up on a lot of random error and thus be inconsistently estimated. We’re not going to learn
much of substance from these “incidental” or “nuisance” parameters. So this frees us to
estimate the effect of our substantive coefficients in a slightly different way that preserves
the substantive story of fixed effects without costing us so many degrees of freedom. We
convert our xs and y for each observation into a deviation from the mean in that unit. This
“sweeps out the unit effects” because when you mean deviate variables, you no longer need
to include an intercept term. So the model regresses yi,t – mean(yi) on xi,t – mean(xi). This is
often called the “within” estimator because it looks at how changes in the explanatory
variables cause y to vary around a mean within the unit. Stata has a canned procedure that (I
believe) transforms your variables in this way and then corrects the standard errors to reflect
the fact that N of your observations bring no new information (since they are determined by
the mean and the other observations for each unit).
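The mean-deviation step can be written out from the model in section B. The bar notation is introduced here: it denotes an average over time within unit i.

```latex
\begin{aligned}
y_{i,t} &= \alpha_i + x_{i,t}\beta + e_{i,t} \\
\bar{y}_i &= \alpha_i + \bar{x}_i\beta + \bar{e}_i \\
y_{i,t} - \bar{y}_i &= (x_{i,t} - \bar{x}_i)\beta + (e_{i,t} - \bar{e}_i)
\end{aligned}
```

Subtracting the second line from the first removes the unit intercept. It also shows why a regressor that is constant within a unit drops out: its deviation from the unit mean is zero for every observation.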
xtreg DiscretionarySpending salary totalday staffper init ideo leg_dd gov_dem
docs_g senc_g medi_g noreast, fe
Fixed-effects (within) regression               Number of obs      =       616
Group variable (i): alpha                       Number of groups   =        44

R-sq:  within  = 0.0126                         Obs per group: min =        14
       between = 0.0232                                        avg =      14.0
       overall = 0.0053                                        max =        14

                                                F(3,569)           =      2.42
corr(u_i, Xb)  = -0.3025                        Prob > F           =    0.0648

------------------------------------------------------------------------------
Discretion~g |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      salary |  -2.94e-06   5.26e+07    -0.00   1.000    -1.03e+08    1.03e+08
    totalday |  (dropped)
    staffper |  (dropped)
        init |  (dropped)
       ideol |  (dropped)
      leg_dd |  -.1616329   .0701156    -2.31   0.022      -.29935   -.0239159
     gov_dem |  -.0682768   .0426327    -1.60   0.110    -.1520135    .0154599
      docs_g |  (dropped)
      senc_g |  (dropped)
      medi_g |  (dropped)
     noreast |  (dropped)
       _cons |   4.374255   1.33e+12     0.00   1.000    -2.62e+12    2.62e+12
-------------+----------------------------------------------------------------
     sigma_u |  .51059043
     sigma_e |  .42354445
         rho |  .59238137   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(43, 569) =     7.86             Prob > F = 0.0000
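The within transformation can also be reproduced by hand, which makes it easier to see what the canned procedure is doing. This is a sketch under assumptions: it reuses variable names from the command above, treats alpha as the unit identifier, and keeps only two time-varying regressors. The generated names ending in _bar and _dm are made up for illustration.

```stata
* Unit means of the outcome and of each regressor (new variable names are hypothetical).
egen ds_bar  = mean(DiscretionarySpending), by(alpha)
egen ldd_bar = mean(leg_dd), by(alpha)
egen gd_bar  = mean(gov_dem), by(alpha)

* Deviations from the unit means.
generate ds_dm  = DiscretionarySpending - ds_bar
generate ldd_dm = leg_dd - ldd_bar
generate gd_dm  = gov_dem - gd_bar

* OLS on the mean-deviated data, without an intercept.
regress ds_dm ldd_dm gd_dm, noconstant

* Canned within estimator on the same regressors, for comparison.
xtreg DiscretionarySpending leg_dd gov_dem, fe i(alpha)
```

The two sets of slope estimates should coincide. The standard errors from the hand-rolled regression will be somewhat too small, because regress does not know that N degrees of freedom were used up in computing the unit means; xtreg with the fe option makes that correction.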
X. Random Effects.
A. The Random Effects Model in Concept. Instead of thinking of each unit as
having its own systematic baseline, we think of each intercept as the result of a random
deviation from some mean intercept. The intercept is a draw from some distribution for
each unit, and it is independent of the error for a particular observation. Instead of trying to
estimate N parameters as in fixed effects, we just need to estimate parameters describing the
distribution from which each unit’s intercept is drawn. If we have a large N (panel data), we
will be able to do this, and random effects will be more efficient than fixed effects. It has N
more degrees of freedom, and it also uses information from the “between” estimator (which
averages observations over a unit and regresses average y on average x to look at differences
across units). Another nice property is that you can still have explanatory variables that
don’t change over time for a unit. If we have a big T (TS-CS data), then the difference
between fixed effects and random effects goes away.
yi,t = μ + αi + xi,tβ + ei,t
B. Sketch of a Test for Random Effects. A small assumption is that Cov(αi , ei,t) =
0. A huge assumption is that Cov(αi , xi,t) = 0, which means that the things that make a unit’s
intercept different are unrelated to that unit’s xs. In concept, a test regresses the errors
on the xs to see whether the two are related.
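One common way to carry out a test of this kind in Stata is the Hausman test, which these notes do not name. It compares the fixed and random effects estimates and asks whether they differ by more than sampling error would allow. A hedged sketch, again assuming alpha is the unit identifier and using only time-varying regressors so that both models estimate the same coefficients (the stored names fe_est and re_est are arbitrary):

```stata
* Fit both models on the same regressors and store the results.
xtreg DiscretionarySpending leg_dd gov_dem, fe i(alpha)
estimates store fe_est
xtreg DiscretionarySpending leg_dd gov_dem, re i(alpha)
estimates store re_est

* Consistent estimator (fixed effects) first, efficient one (random effects) second.
hausman fe_est re_est
```

A small p-value is evidence that the unit effects are correlated with the xs, and so evidence against the random effects model.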
C. Estimating the Random Effects Model.
xtreg DiscretionarySpending salary totalday staffper init ideo leg_dd gov_dem
docs_g senc_g medi_g noreast, re
Random-effects GLS regression                   Number of obs      =       616
Group variable (i): alpha                       Number of groups   =        44

R-sq:  within  = 0.0125                         Obs per group: min =        14
       between = 0.5560                                        avg =      14.0
       overall = 0.3272                                        max =        14

Random effects u_i ~ Gaussian                   Wald chi2(11)      =     53.09
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
Discretion~g |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      salary |   8.34e-06   5.39e-06     1.55   0.122    -2.23e-06    .0000189
    totalday |   .0000968   .0008137     0.12   0.905    -.0014979    .0016916
    staffper |  -.0223183   .0209193    -1.07   0.286    -.0633193    .0186828
        init |  -.2080793   .1182186    -1.76   0.078    -.4397836    .0236249
       ideol |   .6509317   .8078395     0.81   0.420    -.9324047    2.234268
      leg_dd |  -.1153163    .064981    -1.77   0.076    -.2426768    .0120442
     gov_dem |  -.0570731    .042057    -1.36   0.175    -.1395033    .0253572
      docs_g |   .0700836   .0764945     0.92   0.360    -.0798429    .2200101
      senc_g |    .056224   .1389907     0.40   0.686    -.2161928    .3286409
      medi_g |   .1813995   .0828276     2.19   0.029     .0190605    .3437385
     noreast |   .4777465    .126015     3.79   0.000     .2307616    .7247314
       _cons |    3.74571   .2060277    18.18   0.000     3.341903    4.149517
-------------+----------------------------------------------------------------
     sigma_u |  .32985269
     sigma_e |  .42354445
         rho |  .37753489   (fraction of variance due to u_i)
------------------------------------------------------------------------------
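The rho line in both sets of output follows from the two standard deviations printed above it. Writing sigma_u for the standard deviation of the unit effects (αi in these notes, u_i in Stata's labels) and sigma_e for that of the observation-level error, and plugging in the random effects values:

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{0.32985269^2}{0.32985269^2 + 0.42354445^2}
     \approx \frac{0.1088}{0.1088 + 0.1794}
     \approx 0.3775
```

The same calculation with the fixed effects values (.51059043 and .42354445) gives about 0.5924, matching the rho reported there.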