Fixed and Random Effects: Addressing Panel Data
The Problem
Omitted variable bias: unobserved characteristics bias the estimates of your parameters, and the error term is no longer random.
For example, in this basic regression equation:
$$Y_{it} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + e_{it}$$
Where:
Y is the dependent variable (DV) and i = entity and t = time.
X represents one independent variable
β is the coefficient for that variable
e is the error term, which includes $d_i$, the individual error term that is correlated with the regressors, and the purely random error term
The individual, unobserved effects ($d_i$) are not controlled for; instead they are absorbed into the error term and the parameter estimates.
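The decomposition described above can be written out as a short sketch. The symbol $\varepsilon_{it}$ for the purely random part of the error, and the entity index $i$ on $d$, are assumptions introduced here for illustration.

```latex
e_{it} = d_i + \varepsilon_{it}
\quad\Longrightarrow\quad
Y_{it} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + d_i + \varepsilon_{it},
\qquad \mathrm{Cov}(X_1, d_i) \neq 0 \ \text{or}\ \mathrm{Cov}(X_2, d_i) \neq 0
```

Because $d_i$ moves together with the regressors but is left in the error term, part of its influence on Y is attributed to X.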
How this relates to panel data
Datasets with multiple observations, or panel data (see Figure 1), provide an opportunity to pull out the unobserved characteristics from the parameter estimates and the error term.
Figure 1: Panel Data Output
Panel Data includes multiple observations per entity over a specific period of time
Solution One: Fixed Effects
A fixed-effects model (FE) takes into account the repeated observations of each entity (individuals, states, counties, companies, etc.) and controls for individual-level (or entity-level) effects through the use of dummy variables. Use FE when you are interested only in the impact of variables that vary over time and when you assume that something within the individual may affect or bias the predictor or outcome variables.
The “fixed effects” refer to the time-invariant characteristics, both observed and unobserved, that are unique to each individual.
Assumptions for this model:
1. Each entity has its own individual characteristics that may influence the predictor variables
2. Entity-level effects are correlated with the predictor variables
3. Each entity's effects are not correlated with other entities' individual characteristics
The Equation:
$$Y_{it} = \beta_0 + \beta_k X_{k,it} + \delta_t T_t + \gamma_n E_n + u_{it}$$
Where:
Y is the dependent variable (DV), where i = entity and t = time.
X represents one independent variable
$\beta_k$ is the coefficient for that variable
$u_{it}$ is the error term
$T_t$ is time as a binary variable
$\delta_t$ is the coefficient for a binary time regressor
$E_n$ is the entity n; binary (dummies)
$\gamma_n$ is the coefficient for the binary regressors (entities)
In other words, you add a dummy variable for each entity (n-1 dummies, leaving one entity out as the reference group), which absorbs the effects particular to each entity. By adding the dummies, you control for time-invariant unobserved heterogeneity and can then estimate the effect of your predictor variables net of those entity-level effects.
The coefficients on the regressors measure the change within individual entities over time.
They are now interpreted as “as X varies over time by one unit, Y increases or decreases by β units.”
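The within-entity interpretation can be illustrated with a minimal sketch. It assumes a made-up toy panel (three entities, four periods, no random noise) in which the entity effect is deliberately correlated with X and the true slope is 2. Instead of building entity dummies, it subtracts each entity's mean from X and Y, which is a standard way of obtaining the same slope as the dummy-variable regression. The data and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy panel: (entity, x, y) rows, 3 entities x 4 periods.
# Built as y = 1 + 2*x + d, where the entity effect d (0, 10, 20) rises
# with x. No random noise is added, so the true slope is exactly 2.
PANEL = [
    ("A", 1, 3), ("A", 2, 5), ("A", 3, 7), ("A", 4, 9),
    ("B", 7, 25), ("B", 6, 23), ("B", 9, 29), ("B", 8, 27),
    ("C", 13, 47), ("C", 14, 49), ("C", 11, 43), ("C", 12, 45),
]


def ols_slope(xs, ys):
    """Slope from a one-regressor OLS fit that includes an intercept."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def demean_by_entity(rows):
    """Subtract each entity's own mean from its x and y values."""
    groups = defaultdict(list)
    for entity, x, y in rows:
        groups[entity].append((x, y))
    xs, ys = [], []
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        xs.extend(x - x_bar for x, _ in obs)
        ys.extend(y - y_bar for _, y in obs)
    return xs, ys


# Pooled OLS ignores the entity effect; the within estimator removes it.
pooled = ols_slope([x for _, x, _ in PANEL], [y for _, _, y in PANEL])
within = ols_slope(*demean_by_entity(PANEL))

print(f"pooled OLS slope:  {pooled:.2f}")
print(f"within (FE) slope: {within:.2f}")
```

On this toy panel the pooled slope comes out at about 3.86, well above 2, because it absorbs the entity effect, while the within slope is 2.00.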
Solution Two: Random Effects
The Random Effects model is similar to the Fixed Effects model and addresses the same problem (how to control for unobservable characteristics that do not change over time in an entity).
However, in a Random Effects model, we assume that the unobservable characteristics are not correlated with the explanatory variables. This allows us to include observable, time-invariant variables (such as race or gender) as explanatory variables.
The individual effect of a specific entity is considered “random,” meaning the effect is measured by comparing differences between entities, not within entities.
By controlling for a random effect, we can make our model more efficient (meaning the estimates have a tighter distribution), but, unlike a Fixed Effects model, we won’t address possibly biased coefficients.
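The trade-off above can be sketched numerically. One common way to compute a random-effects estimate on a balanced panel is to subtract only a fraction theta of each entity's mean from X and Y before running OLS; theta depends on the relative size of the two error variances and the number of periods. The sketch below treats theta as given rather than estimating it, and uses a made-up toy panel in which the entity effect is correlated with X (so the Random Effects assumption is violated on purpose) and the true slope is 2. The data, the theta values, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy panel: (entity, x, y) rows, 3 entities x 4 periods.
# Built as y = 1 + 2*x + d, with the entity effect d (0, 10, 20)
# correlated with x and no random noise. True slope: 2.
PANEL = [
    ("A", 1, 3), ("A", 2, 5), ("A", 3, 7), ("A", 4, 9),
    ("B", 7, 25), ("B", 6, 23), ("B", 9, 29), ("B", 8, 27),
    ("C", 13, 47), ("C", 14, 49), ("C", 11, 43), ("C", 12, 45),
]


def quasi_demeaned_slope(rows, theta):
    """OLS slope after subtracting theta * (entity mean) from x and y.

    theta = 0 leaves the data untouched (pooled OLS); theta = 1 removes
    the entity means entirely (the within, or fixed-effects, slope).
    Assumes a balanced panel, so a plain intercept can be used.
    """
    groups = defaultdict(list)
    for entity, x, y in rows:
        groups[entity].append((x, y))
    xs, ys = [], []
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        xs.extend(x - theta * x_bar for x, _ in obs)
        ys.extend(y - theta * y_bar for _, y in obs)
    # One-regressor OLS with an intercept on the transformed data.
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# theta values are illustrative, not estimated from the data.
slopes = {theta: quasi_demeaned_slope(PANEL, theta) for theta in (0.0, 0.5, 1.0)}
for theta, slope in slopes.items():
    print(f"theta = {theta:.1f}: slope = {slope:.2f}")
```

With theta = 0.5 the slope (about 3.54) sits between the pooled value (about 3.86) and the within value (2.00): when the entity effect really is correlated with X, partial demeaning removes only part of the bias.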
When to Use Random Effects:
Random Effects models are often used when the data are not complete enough for every entity in your sample to capture the unobservable characteristics with a dummy variable, or when you cannot be sure that the unobservable characteristics are correlated with the regressors.
Since random effects models treat differences between individual entities as a random draw from [...] theory as to why the unobservable characteristics are not correlated to the regressors, than a [...] unobservable characteristics) or the alternative (regressors are correlated to the unobservable characteristics).

    xtreg y x1, fe
    estimates store fixed
    xtreg y x1, re
    estimates store random
    hausman fixed random

If the p-value of the chi-squared test (Prob>chi2) is less than .05 (significant), then use Fixed Effects.
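With a single coefficient, the test statistic can be checked by hand. The sketch below plugs in the (b-B) difference and its standard error from the x1 row of the hausman output in this handout, and uses the fact that a chi-squared tail probability with 1 degree of freedom can be written with the complementary error function. The variable names are hypothetical, and the inputs are the rounded figures as displayed, so the results match the reported ones only approximately.

```python
import math

# Values read off the x1 row of the hausman output in this handout.
difference = 1.23e9   # (b-B): fixed-effects minus random-effects coefficient
std_error = 6.41e8    # sqrt(diag(V_b-V_B))

# With one coefficient, (b-B)'[(V_b-V_B)^(-1)](b-B) reduces to a squared ratio.
chi2_stat = (difference / std_error) ** 2

# Upper-tail probability of a chi-squared variable with 1 degree of freedom:
# P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2_stat / 2))

# Decision rule from the text: significant at .05 means fixed effects.
decision = "fixed effects" if p_value < 0.05 else "random effects"

print(f"chi2(1)   = {chi2_stat:.2f}")
print(f"Prob>chi2 = {p_value:.4f}")
print(f"decision  = {decision}")
```

This gives a statistic of about 3.68 and a p-value of about 0.055, close to the reported 3.67 and 0.0553. Since that is just above .05, the rule does not select Fixed Effects in this example.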
                      Coefficients
          (b)         (B)         (b-B)       sqrt(diag(V_b-V_B))
    x1    2.48e+09    1.25e+09    1.23e+09    6.41e+08

    b = consistent under Ho and Ha; obtained from xtreg
    B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test: Ho: difference in coefficients not systematic
        chi2(1)   = (b-B)'[(V_b-V_B)^(-1)](b-B)
                  = 3.67
        Prob>chi2 = 0.0553    (if this is < 0.05, i.e. significant, use fixed effects)

Helpful Links:
An easy to digest discussion on fixed effects:
Discussion of both models: www.upa.pdx.edu/IOA/newsom/mlrclass/ho_randfixd.doc
A specific overview of panel data and how fixed and random effects are used: http://dss.princeton.edu/training/Panel101.pdf