Common Errors:
How to (and Not to) Control
for Unobserved Heterogeneity
Lecture slides
by Todd Gormley
What are these slides?
• The following slides are a combination of lecture slides used by Todd Gormley in his Ph.D. course on “Empirical Methods in Corporate Finance” at The Wharton School
• For more details about the issues discussed in these slides, please see the article below
  - Gormley, T. and D. Matsa, 2014, “Common Errors: How to (and Not to) Control for Unobserved Heterogeneity,” Review of Financial Studies 27(2): 617-61.
Slides by Gormley
Panel Data & Common Errors
Motivation [Part 1]
• Controlling for unobserved heterogeneity is a fundamental challenge in empirical finance
  - Unobservable factors affect corporate policies and prices
  - These factors may be correlated with variables of interest
• Important sources of unobserved heterogeneity are often common across groups of observations
  - Demand shocks across firms in an industry, differences in local economic environments, etc.

Motivation [Part 2]
• E.g., consider the firm-level estimation
$$\text{leverage}_{i,j,t} = \beta_0 + \beta_1\,\text{profit}_{i,j,t-1} + u_{i,j,t}$$
where leverage is debt/assets for firm i, operating in industry j in year t, and profit is the firm’s net income/assets
What might be some unobservable omitted variables in this estimation?

Motivation [Part 3]
• Oh, there are so, so many…
  - Managerial talent and/or risk aversion
  - Cost of capital
  - Industry supply and/or demand shock
  - Regional demand shocks
  - And so on…
• Easy to think of ways these might affect leverage and be correlated with profits
[Sadly, this is easy to do with other dependent or independent variables…]

Panel data to the rescue…
• Thankfully, panel data can help us with a particular type of unobserved variable…
  - What type of unobserved variable does panel data help us with, and why?
  - Answer = It helps with any unobserved variable that doesn’t vary within groups of observations

Outline for lecture
• Panel data and fixed effects (FE)
• How not to control for unobserved heterogeneity
• General implications
• Benefits and limitations of FE model
• Estimating high-dimensional FE models

Panel data
• Panel data = whenever you have multiple observations per unit of observation i (e.g. you observe each firm over multiple years)
  - Let’s assume N units i
  - And J observations per unit i [i.e. balanced panel]
  - E.g., you observe 5,000 firms in Compustat over a twenty-year period [i.e. N=5,000, J=20]

The underlying model [Part 1]
• When unobserved heterogeneity is thought to be present, the researcher implicitly assumes the following:
$$y_{i,j} = \beta X_{i,j} + f_i + \varepsilon_{i,j}$$
• i indexes groups of observations (e.g. industry); j indexes observations within each group (e.g. firm)
  - $y_{i,j}$ = dependent variable
  - $X_{i,j}$ = independent variable of interest
  - $f_i$ = unobserved group heterogeneity
  - $\varepsilon_{i,j}$ = error term

The underlying model [Part 2]
• The following standard assumptions are made:
  - N groups, J observations per group, where J is small and N is large
  - X and ε are i.i.d. across groups, but not necessarily i.i.d. within groups
  - $\mathrm{var}(f) = \sigma_f^2$, $\mu_f = 0$
  - $\mathrm{var}(X) = \sigma_X^2$, $\mu_X = 0$
  - $\mathrm{var}(\varepsilon) = \sigma_\varepsilon^2$, $\mu_\varepsilon = 0$
  [Simplifies some expressions, but doesn’t change any results]

The underlying model [Part 3]
• Finally, the following assumptions are made:
  - $\mathrm{cov}(f_i, \varepsilon_{i,j}) = 0$
  - $\mathrm{cov}(X_{i,j}, \varepsilon_{i,j}) = \mathrm{cov}(X_{i,j}, \varepsilon_{i,-j}) = 0$
  - $\mathrm{cov}(X_{i,j}, f_i) = \sigma_{Xf} \neq 0$ [source of identification concern]
• What do these imply?
  - Answer = Model is correct in that if we can control for f, we’ll properly identify effect of X; but if we don’t control for f there will be omitted variable bias

OLS estimate of β is inconsistent
• True model is:
$$y_{i,j} = \beta X_{i,j} + f_i + \varepsilon_{i,j}$$
• But OLS estimates:
$$y_{i,j} = \beta^{OLS} X_{i,j} + u^{OLS}_{i,j}$$
• By failing to control for group effect, $f_i$, OLS suffers from omitted variable bias
$$\hat\beta^{OLS} = \beta + \frac{\sigma_{Xf}}{\sigma_X^2}$$
Alternative estimation strategies are required…

Can solve this by transforming data
• First, notice that if you take the population mean of the dependent variable for each unit of observation, i, you get…
$$\bar y_i = \alpha + \beta \bar x_i + f_i + \bar\varepsilon_i$$
where
$$\bar y_i = \frac{1}{J}\sum_j y_{i,j}, \qquad \bar x_i = \frac{1}{J}\sum_j x_{i,j}, \qquad \bar\varepsilon_i = \frac{1}{J}\sum_j \varepsilon_{i,j}$$
[Again, I assumed there are J obs. per unit i]

Transforming data [Part 2]
• Now, if we subtract $\bar y_i$ from $y_{i,t}$, we have
$$y_{i,t} - \bar y_i = \beta\,(x_{i,t} - \bar x_i) + (\varepsilon_{i,t} - \bar\varepsilon_i)$$
  - And look! The unobserved variable, $f_i$, is gone (as is the constant) because it is group-invariant
  - With our earlier assumptions, easy to see that $(x_{i,t} - \bar x_i)$ is uncorrelated with the new disturbance, $(\varepsilon_{i,t} - \bar\varepsilon_i)$, which means…?

Fixed effects (or within) estimator
• Answer: OLS estimation of transformed model will yield a consistent estimate of β
• The prior transformation is called the “within transformation” because it demeans all variables within their group
• This is also called the FE estimator

Least Squares Dummy Variable (LSDV)
• Another way to do the FE estimation is by adding indicator (dummy) variables
  - I.e. create a dummy variable for each group i, and add it to the regression
  - This is the least squares dummy variable model
  - Now, our estimation equation exactly matches the true underlying model
$$y_{i,j} = \alpha + \beta x_{i,j} + f_i + u_{i,j}$$

LSDV versus FE [Part 1]
• Why do both approaches work? Well…
• Frisch-Waugh-Lovell Theorem shows us there are two ways to estimate the below $\beta_1$…
$$y = \beta_0 + \beta_1 x + \beta_2 z + \varepsilon$$
  - Estimate directly; i.e. regress y onto both x and z
  - OR we can just partial z out from both y and x before regressing y on x (i.e. regress residuals from regression of y on z onto residuals from regression of x on z)

LSDV versus FE [Part 2]
• Can show that LSDV and within-transformation of FE are identical because demeaned variables of within regression are the residuals from a regression onto group dummies!
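
The contrast between pooled OLS and the within (FE) estimator can be checked with a small simulation. The sketch below is illustrative only: the data-generating process (β = 1, cov(X, f) = 0.5, var(X) = 1.25, 2,000 groups of 10) and the helper names `slope` and `demean_within` are assumptions made here, not taken from the slides. Under those assumptions OLS should settle near β + σ_Xf/σ_X² = 1.4, while the within estimator should settle near 1.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x with an intercept: cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_within(values, groups):
    """Within transformation: subtract each observation's group mean."""
    total, count = {}, {}
    for v, g in zip(values, groups):
        total[g] = total.get(g, 0.0) + v
        count[g] = count.get(g, 0) + 1
    return [v - total[g] / count[g] for v, g in zip(values, groups)]

# Assumed data-generating process (illustrative values, not from the slides)
rng = random.Random(0)
N, J, beta = 2000, 10, 1.0
X, y, group = [], [], []
for i in range(N):
    f = rng.gauss(0, 1)                    # unobserved group effect f_i
    for _ in range(J):
        x = 0.5 * f + rng.gauss(0, 1)      # cov(X, f) = 0.5, var(X) = 1.25
        X.append(x)
        y.append(beta * x + f + rng.gauss(0, 1))
        group.append(i)

b_ols = slope(X, y)                                              # plim 1.4
b_fe = slope(demean_within(X, group), demean_within(y, group))   # plim 1.0
print(f"OLS: {b_ols:.3f}   FE (within): {b_fe:.3f}")
```

Because the demeaned variables are the residuals from a regression on group dummies, an LSDV regression on the same data would give the same slope as `b_fe`.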

Other approaches…
• Gormley and Matsa (RFS 2014) note that the existing literature uses various other strategies to control for unobserved group-level heterogeneity…
  - Their questions: How does each of the approaches differ? And when are they consistent?
  - Their answer: Some popular strategies can distort inferences and should not be used; FE estimator should be used instead

They focus on two popular strategies
• “Adjusted-Y” (AdjY): dependent variable is demeaned within groups [e.g. ‘industry-adjust’]
• “Average effects” (AvgE): uses group mean of dependent variable as control [e.g. ‘state-year’ control]

AdjY & AvgE are widely used
• In Journal of Finance, Journal of Financial Economics, and Review of Financial Studies
  - Used since at least the late 1980s
  - Still used, 60+ papers published in 2008-2010
  - Variety of subfields; asset pricing, banking, capital structure, governance, M&A, etc.
• Also been used in papers published in the American Economic Review, Journal of Political Economy, and Quarterly Journal of Economics

But, AdjY and AvgE are inconsistent
• As Gormley and Matsa (RFS 2014) show…
  - Both can be more biased than OLS
  - Both can get opposite sign as true coefficient
  - In practice, bias is likely, and trying to predict its sign or magnitude will typically be impractical

More implications of GM (RFS 2014)
• Other, related strategies should also not be used
  - “Characteristically-adjusted” stock returns in AP
  - “Adjusted” stock returns when trying to estimate firms’ internal value of cash
  - Simple comparisons of benchmark-adjusted outcomes before & after events (like M&A)
  - “Diversification discount”
  - Using group average of an independent variable as instrumental variable
• Now, let’s see why…

Adjusted-Y (AdjY)
• Tries to remove unobserved group heterogeneity by demeaning the dependent variable within groups
  - AdjY estimates:
$$y_{i,j} - \bar y_i = \beta^{AdjY} X_{i,j} + u^{AdjY}_{i,j}$$
where
$$\bar y_i = \frac{1}{J}\sum_{k \in \text{group } i} \left(\beta X_{i,k} + f_i + \varepsilon_{i,k}\right)$$
Note: Researchers often exclude observation at hand when calculating group mean or use a group median, but both modifications will yield similarly inconsistent estimates

Example AdjY estimation
• One example: firm value regression
$$Q_{i,j,t} - \bar Q_{i,t} = \alpha + \beta' X_{i,j,t} + \varepsilon_{i,j,t}$$
  - $Q_{i,j,t}$ = Tobin’s Q for firm j, industry i, year t
  - $\bar Q_{i,t}$ = mean of Tobin’s Q for industry i in year t
  - $X_{i,j,t}$ = vector of variables thought to affect value
  - Researchers might also include firm & year FE
Anyone know why AdjY is going to be inconsistent?

Here is why…
• Rewriting the group mean, we have:
$$\bar y_i = f_i + \beta \bar X_i + \bar\varepsilon_i$$
• Therefore, AdjY transforms the true data to:
$$y_{i,j} - \bar y_i = \beta X_{i,j} - \beta \bar X_i + \varepsilon_{i,j} - \bar\varepsilon_i$$
What is the AdjY estimation forgetting?

AdjY has omitted variable bias
• $\hat\beta^{AdjY}$ can be inconsistent when $\beta \neq 0$
  - True model: $y_{i,j} - \bar y_i = \beta X_{i,j} - \beta \bar X_i + \varepsilon_{i,j} - \bar\varepsilon_i$
  - But, AdjY estimates: $y_{i,j} - \bar y_i = \beta^{AdjY} X_{i,j} + u^{AdjY}_{i,j}$
• By failing to control for $\bar X_i$, AdjY suffers from omitted variable bias when $\sigma_{X\bar X} \neq 0$
$$\hat\beta^{AdjY} = \beta - \beta\,\frac{\sigma_{X\bar X}}{\sigma_X^2}$$
[In practice, a positive covariance between $X$ and $\bar X$ will be very common]

Further analysis of AdjY estimate
$$\hat\beta^{AdjY} = \beta - \beta\,\frac{\sigma_{X\bar X}}{\sigma_X^2}$$
• Bias doesn’t disappear as group size J increases
• Can be inconsistent even when OLS is not; this happens when $\sigma_{Xf} = 0$ and $\sigma_{X\bar X} \neq 0$
Bias is more complicated with two variables…
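
The single-regressor AdjY bias can be illustrated with a small simulation in which OLS is consistent but AdjY is not. This is a sketch under assumed values that are not from the slides: β = 1, X has a group component with variance 1 and an idiosyncratic component with variance 1, f is independent of X (so σ_Xf = 0), J = 10, and the group mean includes the observation at hand. Under those assumptions σ_XX̄ = 1 + 1/J = 1.1 and σ_X² = 2, so the AdjY slope should be close to 1 − 1.1/2 = 0.45. The helper names are hypothetical.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def demean_within(values, groups):
    """Subtract the group mean (own observation included) from each value."""
    total, count = {}, {}
    for v, g in zip(values, groups):
        total[g] = total.get(g, 0.0) + v
        count[g] = count.get(g, 0) + 1
    return [v - total[g] / count[g] for v, g in zip(values, groups)]

# Assumed data-generating process (illustrative values, not from the slides)
rng = random.Random(1)
N, J, beta = 2000, 10, 1.0
X, y, group = [], [], []
for i in range(N):
    f = rng.gauss(0, 1)        # group effect, independent of X here
    x_i = rng.gauss(0, 1)      # group component of X
    for _ in range(J):
        x = x_i + rng.gauss(0, 1)
        X.append(x)
        y.append(beta * x + f + rng.gauss(0, 1))
        group.append(i)

b_ols = slope(X, y)                           # consistent here: sigma_Xf = 0
b_adjy = slope(X, demean_within(y, group))    # only y is demeaned
b_fe = slope(demean_within(X, group), demean_within(y, group))
print(f"OLS: {b_ols:.3f}  AdjY: {b_adjy:.3f}  FE: {b_fe:.3f}")
```

The only difference between `b_adjy` and `b_fe` is whether X is also demeaned, which is exactly the omitted $\bar X_i$ term.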

AdjY estimates with 2 variables
• Suppose, there are instead two RHS variables
  - True model:
$$y_{i,j} = \beta X_{i,j} + \gamma Z_{i,j} + f_i + \varepsilon_{i,j}$$
• Use same assumptions as before, but add:
  - $\mathrm{cov}(Z_{i,j}, \varepsilon_{i,j}) = \mathrm{cov}(Z_{i,j}, \varepsilon_{i,-j}) = 0$
  - $\mathrm{var}(Z) = \sigma_Z^2$, $\mu_Z = 0$
  - $\mathrm{cov}(X_{i,j}, Z_{i,j}) = \sigma_{XZ}$
  - $\mathrm{cov}(Z_{i,j}, f_i) = \sigma_{Zf}$

AdjY estimates with 2 variables [Part 2]
• With a bit of algebra, it is shown that:
$$\begin{pmatrix}\hat\beta^{AdjY}\\[1ex] \hat\gamma^{AdjY}\end{pmatrix} = \begin{pmatrix}\beta + \dfrac{\beta\left(\sigma_{XZ}\sigma_{Z\bar X} - \sigma_Z^2\sigma_{X\bar X}\right) + \gamma\left(\sigma_{XZ}\sigma_{Z\bar Z} - \sigma_Z^2\sigma_{X\bar Z}\right)}{\sigma_Z^2\sigma_X^2 - \sigma_{XZ}^2}\\[3ex] \gamma + \dfrac{\beta\left(\sigma_{XZ}\sigma_{X\bar X} - \sigma_X^2\sigma_{Z\bar X}\right) + \gamma\left(\sigma_{XZ}\sigma_{X\bar Z} - \sigma_X^2\sigma_{Z\bar Z}\right)}{\sigma_Z^2\sigma_X^2 - \sigma_{XZ}^2}\end{pmatrix}$$
  - Estimates of both β and γ can be inconsistent
  - Determining sign and magnitude of bias will typically be difficult

Average effects (AvgE)
• AvgE uses group mean of dependent variable as control for unobserved heterogeneity
  - AvgE estimates:
$$y_{i,j} = \beta^{AvgE} X_{i,j} + \gamma^{AvgE}\,\bar y_i + u^{AvgE}_{i,j}$$

Example AvgE estimation
• Following profit regression is an AvgE example:
$$ROA_{i,s,t} = \alpha + \beta' X_{i,s,t} + \gamma\,\overline{ROA}_{s,t} + \varepsilon_{i,s,t}$$
  - $\overline{ROA}_{s,t}$ = mean of ROA for state s in year t
  - $X_{i,s,t}$ = vector of variables thought to affect profits
  - Researchers might also include firm & year FE
Anyone know why AvgE is going to be inconsistent?
• Recall, true model: $y_{i,j} = \beta X_{i,j} + f_i + \varepsilon_{i,j}$
• Problem is that $\bar y_i$ measures $f_i$ with error

AvgE has measurement error bias
• Recall that group mean is given by
$$\bar y_i = f_i + \beta \bar X_i + \bar\varepsilon_i$$
  - Therefore, $\bar y_i$ measures $f_i$ with error $\beta \bar X_i + \bar\varepsilon_i$
  - As is well known, even classical measurement error causes all estimated coefficients to be inconsistent
• Bias here is complicated because error can be correlated with both mismeasured variable, $f_i$, and with $X_{i,j}$ when $\sigma_{X\bar X} \neq 0$

AvgE estimate of β with one variable
• With a bit of algebra, it is shown that:
$$\hat\beta^{AvgE} = \beta + \frac{\sigma_{Xf}\left(\beta\sigma_{f\bar X} + \beta^2\sigma_{\bar X}^2 + \sigma_{\bar\varepsilon}^2 - \sigma_{\varepsilon\bar\varepsilon}\right) - \beta\,\sigma_{X\bar X}\left(\sigma_f^2 + \beta\sigma_{f\bar X} + \sigma_{\varepsilon\bar\varepsilon}\right)}{\sigma_X^2\left(\sigma_f^2 + 2\beta\sigma_{f\bar X} + \beta^2\sigma_{\bar X}^2 + \sigma_{\bar\varepsilon}^2\right) - \left(\sigma_{Xf} + \beta\sigma_{X\bar X}\right)^2}$$
  - Determining magnitude and direction of bias is difficult
  - Covariance between $X$ and $\bar X$ again problematic, but not needed for AvgE estimate to be inconsistent
  - Even non-i.i.d. nature of errors can affect bias!
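
The AvgE estimator can be simulated in the same way. The sketch below again uses assumed values that are not from the slides: β = 1, X has a group component and an idiosyncratic component with variance 1 each, f is independent of X, errors are i.i.d., J = 10, and the group mean of y includes the observation at hand. Plugging the implied moments into the two-regressor OLS formula gives a coefficient on X of about 1.98/3.19 ≈ 0.62, even though plain OLS is consistent in this setup. `ols2` and `group_mean` are hypothetical helper names.

```python
import random

def centered(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def ols2(x1, x2, y):
    """Coefficients (b1, b2) from regressing y on x1, x2 and an intercept."""
    a, b, c = centered(x1), centered(x2), centered(y)
    s11 = sum(p * p for p in a)
    s22 = sum(q * q for q in b)
    s12 = sum(p * q for p, q in zip(a, b))
    s1y = sum(p * r for p, r in zip(a, c))
    s2y = sum(q * r for q, r in zip(b, c))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def group_mean(values, groups):
    """Group mean of `values` (own observation included), one entry per row."""
    total, count = {}, {}
    for v, g in zip(values, groups):
        total[g] = total.get(g, 0.0) + v
        count[g] = count.get(g, 0) + 1
    return [total[g] / count[g] for g in groups]

# Assumed data-generating process (illustrative values, not from the slides)
rng = random.Random(2)
N, J, beta = 2000, 10, 1.0
X, y, group = [], [], []
for i in range(N):
    f = rng.gauss(0, 1)        # group effect, independent of X here
    x_i = rng.gauss(0, 1)      # group component of X
    for _ in range(J):
        x = x_i + rng.gauss(0, 1)
        X.append(x)
        y.append(beta * x + f + rng.gauss(0, 1))
        group.append(i)

ybar = group_mean(y, group)
b_avge, g_avge = ols2(X, ybar, y)      # AvgE: control for the group mean of y
xc, yc = centered(X), centered(y)
b_ols = sum(p * q for p, q in zip(xc, yc)) / sum(p * p for p in xc)
print(f"OLS: {b_ols:.3f}  AvgE: {b_avge:.3f}")
```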

How common will the bias be?
• First, we look at when $\sigma_{X\bar X} \neq 0$ by separating $X_{i,j}$ into its group and idiosyncratic components
$$X_{i,j} = x_i + w_{i,j}$$
  - Assume group means $x_i$ are i.i.d. with mean zero and variance $\sigma_x^2$
  - Idiosyncratic component $w_{i,j}$ distributed with mean 0 and variance $\sigma_w^2$
  - And, assume $\mathrm{cov}(x_i, w_{i,j}) = 0$

AdjY and AvgE bias very common
• Both AdjY and AvgE biased when $\sigma_{X\bar X} \neq 0$
• But with prior setup, we can show that…
$$\sigma_{X\bar X} = \sigma_x^2 + \rho_{w_{i,j},\,w_{i,-j}}\,\sigma_w^2$$
  - First term: bias whenever different means across groups!
  - Second term: or, bias whenever observations within groups are not independent!
* Solved excluding observation at hand (most common approach)
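
The expression for $\sigma_{X\bar X}$ follows in two lines from the decomposition $X_{i,j} = x_i + w_{i,j}$. The derivation below is a sketch that assumes the group mean excludes the observation at hand and that every pair of idiosyncratic components within a group has the same correlation, written $\rho_{w_{i,j},\,w_{i,-j}}$; the constant pairwise correlation is an assumption made here for illustration.

$$
\begin{aligned}
\bar X_{i,-j} &= x_i + \frac{1}{J-1}\sum_{k \neq j} w_{i,k} \\
\sigma_{X\bar X} = \mathrm{cov}\left(x_i + w_{i,j},\ \bar X_{i,-j}\right)
&= \sigma_x^2 + \frac{1}{J-1}\sum_{k \neq j}\mathrm{cov}\left(w_{i,j}, w_{i,k}\right) \\
&= \sigma_x^2 + \rho_{w_{i,j},\,w_{i,-j}}\,\sigma_w^2
\end{aligned}
$$

The cross terms vanish because $\mathrm{cov}(x_i, w_{i,j}) = 0$. The covariance is zero only if group means do not differ ($\sigma_x^2 = 0$) and observations within a group are uncorrelated.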

Analytical comparisons
• Next, we use analytical solutions to compare relative performance of OLS, AdjY, and AvgE
• To do this, we re-express solutions…
  - We use correlations (e.g. solve bias in terms of correlation between X and f, $\rho_{Xf}$, instead of $\sigma_{Xf}$)
  - We also assume i.i.d. errors [just makes bias of AvgE less complicated]

ρXf has large effect on performance (from Figure 1A)
[Figure 1A: estimate $\hat\beta$ (vertical axis) against $\rho_{Xf}$ (horizontal axis) for OLS, AdjY, and AvgE; true β = 1]
• AdjY more biased than OLS, except for large values for ρXf
• AvgE worst for low correlations, best for high
• Other parameters held constant: $\sigma_f/\sigma_X = \sigma_\varepsilon/\sigma_X = 1$, $\sigma_x/\sigma_w = 0.25$, $\rho_{w_{i,j},\,w_{i,-j}} = 0.5$, $J = 10$

Relative variation across groups key (from Figure 1B)
[Figure 1B: estimate $\hat\beta$ (vertical axis) against $\sigma_x/\sigma_w$ (horizontal axis) for OLS, AdjY, and AvgE]
• Other parameters held constant: $\sigma_f/\sigma_X = \sigma_\varepsilon/\sigma_X = 1$, $\rho_{Xf} = 0.25$, $\rho_{w_{i,j},\,w_{i,-j}} = 0.5$, $J = 10$

More observations need not help! (from Figure 1F)
[Figure 1F: estimate $\hat\beta$ (vertical axis) against group size J (horizontal axis) for OLS, AdjY, and AvgE]
• Other parameters held constant: $\sigma_f/\sigma_X = \sigma_\varepsilon/\sigma_X = 1$, $\sigma_x/\sigma_w = \rho_{Xf} = 0.25$, $\rho_{w_{i,j},\,w_{i,-j}} = 0.5$

Summary of OLS, AdjY, and AvgE
• In general, all three estimators are inconsistent in presence of unobserved group heterogeneity
  - AdjY and AvgE may not be an improvement over OLS; depends on various parameter values
  - AdjY and AvgE can yield estimates with opposite sign of the true coefficient

Comparing FE, AdjY, and AvgE
• To estimate effect of X on Y controlling for Z
  - One could regress Y onto both X and Z… [add group FE]
  - Or, regress residuals from regression of Y on Z onto residuals from regression of X on Z [within-group transformation!]
• AdjY and AvgE aren’t the same as finding the effect of X on Y controlling for Z because...
  - AdjY only partials Z out from Y
  - AvgE uses fitted values of Y on Z as control

The differences matter! Example #1
• Consider the following capital structure regression:
$$(D/A)_{i,t} = \alpha + \beta X_{i,t} + f_i + \varepsilon_{i,t}$$
  - $(D/A)_{i,t}$ = book leverage for firm i, year t
  - $X_{i,t}$ = vector of variables thought to affect leverage
  - $f_i$ = firm fixed effect
• We now run this regression for each approach to deal with firm fixed effects, using 1950-2010 data, winsorizing at 1% tails…

Estimates vary considerably (from Table 2)
Dependent variable = book leverage

                               OLS          AdjY         AvgE         FE
Fixed Assets / Total Assets    0.270***     0.066***     0.103***     0.248***
                               (0.008)      (0.004)      (0.004)      (0.014)
Ln(sales)                      0.011***     0.011***     0.011***     0.017***
                               (0.001)      (0.000)      (0.000)      (0.001)
Return on Assets               -0.015***    0.051***     0.039***     -0.028***
                               (0.005)      (0.004)      (0.004)      (0.005)
Z-score                        -0.017***    -0.010***    -0.011***    -0.017***
                               (0.000)      (0.000)      (0.000)      (0.001)
Market-to-book Ratio           -0.006***    -0.004***    -0.004***    -0.003***
                               (0.000)      (0.000)      (0.000)      (0.000)
Observations                   166,974      166,974      166,974      166,974
R-squared                      0.29         0.14         0.56         0.66

The differences matter! Example #2
• Consider the following firm value regression:
$$Q_{i,j,t} = \alpha + \beta' X_{i,j,t} + f_{j,t} + \varepsilon_{i,j,t}$$
  - Q = Tobin’s Q for firm i, industry j, year t
  - $X_{i,j,t}$ = vector of variables thought to affect value
  - $f_{j,t}$ = industry-year fixed effect
• We now run this regression for each approach to deal with industry-year fixed effects…

Estimates vary considerably (from Table 4)
Dependent variable = Tobin’s Q

                               OLS          AdjY         AvgE         FE
Delaware Incorporation         0.100***     0.019        0.040        0.086**
                               (0.036)      (0.032)      (0.032)      (0.039)
Ln(sales)                      -0.125***    -0.054***    -0.072***    -0.131***
                               (0.009)      (0.008)      (0.008)      (0.011)
R&D Expenses / Assets          6.724***     3.022***     3.968***     5.541***
                               (0.260)      (0.242)      (0.256)      (0.318)
Return on Assets               -0.559***    -0.526***    -0.535***    -0.436***
                               (0.108)      (0.095)      (0.097)      (0.117)
Observations                   55,792       55,792       55,792       55,792
R-squared                      0.22         0.08         0.34         0.37

The differences matter! Example #3
• It also matters in literature on antitakeover laws
  - Past papers used AvgE to control for unobserved, time-varying differences across states & industries
  - Gormley and Matsa (2014) show that properly using industry-year, state-year, and firm FE estimator changes estimates considerably
• E.g., using this framework, they show that managers have an underlying preference to “Play it Safe”
  - For details, see http://ssrn.com/abstract=2465632

General implications
• With this framework, easy to see that other commonly used estimators will be biased
  - AdjY-type estimators in M&A, asset pricing, etc.
  - AvgE-type instrumental variables

Other AdjY estimators are problematic
• Same problem arises with other AdjY estimators
  - Subtracting off median or value-weighted mean
  - Subtracting off mean of matched control sample [as is customary in studies of diversification “discount”]
  - Comparing industry-adjusted means for treated firms pre- versus post-event [as often done in M&A studies]
  - Characteristically adjusted returns [as used in asset pricing]

AdjY-type estimators in asset pricing
• Common to sort and compare stock returns across portfolios based on a variable thought to affect returns
• But, returns are often first “characteristically adjusted”
  - I.e. researcher subtracts the average return of a benchmark portfolio containing stocks of similar characteristics
  - This is equivalent to AdjY, where “adjusted returns” are regressed onto indicators for each portfolio
• Approach fails to control for how avg. independent variable varies across benchmark portfolios

Asset Pricing AdjY – Example
• Asset pricing example; sorting returns based on R&D expenses / market value of equity

Characteristically adjusted returns by R&D Quintile (i.e., AdjY)

Missing      Q1           Q2           Q3         Q4         Q5
-0.012***    -0.033***    -0.023***    -0.002     0.008      0.020***
(0.003)      (0.009)      (0.008)      (0.007)    (0.013)    (0.006)

  - We use industry-size benchmark portfolios and sort using R&D/market value
  - Difference between Q5 and Q1 is 5.3 percentage points

Estimates vary considerably (from Table 5)
Dependent variable = yearly stock return

                   AdjY         FE
R&D Missing        0.021**      0.030***
                   (0.009)      (0.010)
R&D Quintile 2     0.010        0.019
                   (0.013)      (0.014)
R&D Quintile 3     0.032***     0.051***
                   (0.012)      (0.018)
R&D Quintile 4     0.041***     0.068***
                   (0.015)      (0.020)
R&D Quintile 5     0.053***     0.094***
                   (0.011)      (0.019)
Observations       144,592      144,592
R-squared          0.00         0.47

  - AdjY column: same AdjY result, but in regression format; quintile 1 is excluded
  - FE column: use benchmark-period FE to transform both returns and R&D; this is equivalent to double sort

AvgE IV estimators also problematic
• Many researchers try to instrument problematic $X_{i,j}$ with group mean, $\bar X_i$, excluding observation j
  - Argument is that $\bar X_i$ is correlated with $X_{i,j}$ but not error
• But, this is typically going to be problematic
  - Any correlation between $X_{i,j}$ and an unobserved heterogeneity, $f_i$, causes exclusion restriction to not hold
  - Can’t add FE to fix this since IV only varies at group level
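
A short simulation illustrates why a leave-one-out group mean fails as an instrument when X is correlated with the group effect. This is a sketch under assumed values that are not from the slides: β = 1, X_{i,j} = f_i + w_{i,j} with unit variances, and the instrument is the mean of X over the other members of the group. Under those assumptions the instrument covaries with f as strongly as with X, so the IV estimate should be near 2, further from the truth than OLS (near 1.5). Helper names are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n

# Assumed data-generating process (illustrative values, not from the slides)
rng = random.Random(3)
N, J, beta = 2000, 10, 1.0
X, y, Z = [], [], []
for i in range(N):
    f = rng.gauss(0, 1)                               # unobserved group effect
    xs = [f + rng.gauss(0, 1) for _ in range(J)]      # X correlated with f
    total = sum(xs)
    for x in xs:
        X.append(x)
        y.append(beta * x + f + rng.gauss(0, 1))
        Z.append((total - x) / (J - 1))               # leave-one-out group mean of X

b_ols = cov(X, y) / cov(X, X)   # plim: 1 + 1/2 = 1.5
b_iv = cov(Z, y) / cov(Z, X)    # plim: (1 + 1)/1 = 2.0, exclusion restriction fails
print(f"OLS: {b_ols:.3f}  IV with group-mean instrument: {b_iv:.3f}")
```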

What if AdjY or AvgE is true model?
• If data exhibits structure of AvgE estimator, this would be a peer effects model [i.e. group mean affects outcome of other members]
  - In this case, none of the estimators (OLS, AdjY, AvgE, or FE) reveal the true β [Manski 1993; Leary and Roberts 2010]
• Even if interested in studying $y_{i,j} - \bar y_i$, AdjY only consistent if $X_{i,j}$ does not affect $y_{i,j}$!

FE Estimator – Benefits [Part 1]
• There are many benefits of FE estimator
  - Allows for arbitrary correlation between each fixed effect, $f_i$, and each x within group i
    - I.e. it is very general and not imposing much structure on what the underlying data must look like
  - Very intuitive interpretation; coefficient is identified using only changes within cross-sections

FE Estimator – Benefits [Part 2]
• It is also very flexible and can help us control for many types of unobserved heterogeneities
  - Can add year FE if worried about unobserved heterogeneity across time [e.g. macroeconomic shocks]
  - Can add CEO FE if worried about unobserved heterogeneity across CEOs [e.g. talent, risk aversion]
  - Add industry-by-year FE if worried about unobserved heterogeneity across industries over time [e.g. investment opportunities, demand shocks]

FE Estimator – Limitations
• But, FE estimator also has its limitations
  - Can’t identify variables that don’t vary within group
  - Subject to potentially large measurement error bias
  - Can be hard to estimate in some cases

Limitation #1 – Can’t est. some var.
• If no within-group variation in the independent var., x, of interest, can’t disentangle it from group FE
  - It is collinear with group FE, and will be dropped by computer or swept out in the within transformation
• In some cases, IV can be used to obtain estimates for variables that do not vary within groups [see Hausman and Taylor 1981]

Limitation #2 – Noisy ind. variables
• If some within-group variation is noise, then the share of the exploited variation that is noise rises with FE
  - Think of there being two types of variation
    - Good (meaningful) variation
    - Noise variation because we don’t perfectly measure the underlying variable of interest
  - Adding FE can sweep out a lot of the good variation; fraction of remaining variation coming from noise goes up [What will this do?]

Noisy independent variables [Part 2]
• Answer: Attenuation bias on mismeasured (i.e. noisy) independent variable will go up!
• Practical advice: Be careful in interpreting ‘zero’ coefficients on potentially mismeasured regressors; might just be attenuation bias!
  - Note… sign of bias on other coefficients will be generally difficult to know
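
The rise in attenuation bias under FE can be seen in a small simulation. The sketch assumes values that are not from the slides: β = 1, the true regressor has a group component with variance 1 and a within-group component with variance 0.25, and the observed regressor adds classical noise with variance 0.25. The group effect is independent of the regressor, so any shortfall from 1 is attenuation. Under those assumptions the OLS slope should be near 1.25/1.5 ≈ 0.83 and the FE slope near 0.25/0.5 = 0.5, because the within transformation removes most of the meaningful variation but none of the noise.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def demean_within(values, groups):
    """Within transformation: subtract each observation's group mean."""
    total, count = {}, {}
    for v, g in zip(values, groups):
        total[g] = total.get(g, 0.0) + v
        count[g] = count.get(g, 0) + 1
    return [v - total[g] / count[g] for v, g in zip(values, groups)]

# Assumed data-generating process (illustrative values, not from the slides)
rng = random.Random(4)
N, J, beta = 2000, 10, 1.0
x_obs, y, group = [], [], []
for i in range(N):
    f = rng.gauss(0, 1)                           # group effect, independent of x
    x_i = rng.gauss(0, 1)                         # group-level part of true x
    for _ in range(J):
        x_true = x_i + rng.gauss(0, 0.5)          # within-group variance 0.25
        x_obs.append(x_true + rng.gauss(0, 0.5))  # measurement noise variance 0.25
        y.append(beta * x_true + f + rng.gauss(0, 1))
        group.append(i)

b_ols = slope(x_obs, y)
b_fe = slope(demean_within(x_obs, group), demean_within(y, group))
print(f"OLS: {b_ols:.3f}  FE: {b_fe:.3f}  (true beta = {beta})")
```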

Noisy independent variables [Part 3]
• Problem can also apply even when all variables are perfectly measured [How?]
• Answer: Adding FE might throw out relevant variation; e.g. y in firm FE model might respond to sustained changes in x, rather than transitory changes [see McKinnish 2008 for more details]
  - With FE you’d only have the transitory variation left over; might find x uncorrelated with y in FE estimation even though sustained changes in x are the most important determinant of y

Possible solutions for Limitation #2
• Standard solutions for measurement error apply (e.g. IV), but in practice, hard to fix
• For examples on how to deal with measurement error, see following papers
  - Griliches and Hausman (JoE 1986)
  - Biorn (Econometric Reviews 2000)
  - Erickson and Whited (JPE 2000, RFS 2012)
  - Almeida, Campello, and Galvao (RFS 2010)

Limitation #3 – Computation issues
Researchers occasionally motivate using AdjY and AvgE because FE estimator is computationally difficult to do when there is more than one high-dimensional FE
Now, let’s see why this is (and isn’t) a problem…

Computational issues [Part 1]
• Estimating a model with multiple types of FE can be computationally difficult
  - When more than one type of FE, you cannot remove both using within-transformation
    - Generally, you can only sweep one away with within-transformation; other FE dealt with by adding dummy variables to model
    - E.g. firm and year fixed effects [See next slide]

Computational issues [Part 2]
• Consider below model:
$$y_{i,t} = \alpha + \beta x_{i,t} + \delta_t + f_i + u_{i,t}$$
where $\delta_t$ is the year FE and $f_i$ is the firm FE
  - To estimate this in Stata, we’d use a command something like the following…

    xtset firm
    xi: xtreg y x i.year, fe

  - xtset firm: tells Stata that panel dimension is given by firm variable
  - i.year: tells Stata to create and add dummy variables for year variable
  - fe: tells Stata to remove FE for panels (i.e. firms) by doing within-transformation

Computational issues [Part 3]
• Dummies not swept away in within-transformation are actually estimated
  - With year FE, this isn’t a problem because there aren’t that many years of data
  - If we had to estimate 1,000s of firm FE, however, it might be a problem…

Why is this a problem?
• Estimating FE model with many dummies can require a lot of computer memory
  - E.g., estimation with both firm and 4-digit industry-year FE requires ≈ 40 GB of memory
  - Most researchers don’t have this much memory; hence, we don’t see these regressions being used

This is a growing problem
• Multiple unobserved heterogeneities increasingly argued to be important
  - Manager and firm fixed effects in executive compensation and other CF applications [Graham, Li, and Qiu 2011, Coles and Li 2011]
  - Firm, industry×year, state×year FE to control for industry- and state-level shocks [Gormley and Matsa 2014]

But, there are solutions!
• There exist two techniques that can be used to arrive at consistent FE estimates without requiring as much memory
  - #1: Interacted fixed effects
  - #2: Memory-saving procedures

#1 – Interacted fixed effects
• Combine multiple fixed effects into a one-dimensional set of fixed effects, and remove using within transformation
  - E.g. firm and industry-year FE could be replaced with firm-industry-year FE
• But, there are limitations…
  - Can severely limit parameters you can estimate
  - Could have serious attenuation bias

#2 – Memory-saving procedures
• Use properties of sparse matrices to reduce required memory, e.g. Cornelissen (2008)
• Or, instead iterate to a solution, which eliminates memory issue entirely, e.g. Guimaraes and Portugal (2010)
  - See Gormley and Matsa (RFS 2014) for details of how each method works
  - Both can be done in Stata using user-written commands FELSDVREG and REGHDFE
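
To give a feel for how an iterative approach avoids building thousands of dummy variables, the sketch below uses a related idea, the method of alternating projections: demean by firm, then by year, and repeat until nothing changes. It is not the exact algorithm of Guimaraes and Portugal (2010), nor the implementation behind FELSDVREG or REGHDFE. The unbalanced panel, parameter values, and function names are assumptions made for illustration. By the Frisch-Waugh-Lovell logic, regressing the swept y on the swept x then gives the two-way FE slope while storing only the data vectors.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def demean_by(values, keys):
    """Subtract the mean of `values` within each level of `keys`."""
    total, count = {}, {}
    for v, k in zip(values, keys):
        total[k] = total.get(k, 0.0) + v
        count[k] = count.get(k, 0) + 1
    return [v - total[k] / count[k] for v, k in zip(values, keys)]

def sweep_two_fe(values, firm, year, tol=1e-10, max_iter=1000):
    """Alternately demean by firm and by year until the values stop changing."""
    cur = list(values)
    for _ in range(max_iter):
        nxt = demean_by(demean_by(cur, firm), year)
        if max(abs(a - b) for a, b in zip(nxt, cur)) < tol:
            return nxt
        cur = nxt
    return cur

# Assumed unbalanced panel (illustrative values, not from the slides)
rng = random.Random(5)
n_firms, n_years, beta = 300, 15, 1.0
f = [rng.gauss(0, 1) for _ in range(n_firms)]    # firm effects
d = [rng.gauss(0, 1) for _ in range(n_years)]    # year effects
x, y, firm, year = [], [], [], []
for i in range(n_firms):
    for t in range(n_years):
        if rng.random() < 0.3:                   # drop about 30% of firm-years
            continue
        xi = f[i] + d[t] + rng.gauss(0, 1)       # x correlated with both effects
        x.append(xi)
        y.append(beta * xi + f[i] + d[t] + rng.gauss(0, 0.5))
        firm.append(i)
        year.append(t)

b_ols = slope(x, y)
b_fe = slope(sweep_two_fe(x, firm, year), sweep_two_fe(y, firm, year))
print(f"OLS: {b_ols:.3f}  two-way FE via iteration: {b_fe:.3f}")
```

In a balanced panel a single pass of demeaning by each dimension would suffice; the iteration matters because the panel here is unbalanced.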

These latter techniques work…
• Estimated typical capital structure regression with firm and 4-digit industry×year dummies
  - Standard FE approach would not work; my computer did not have enough memory…
  - Sparse matrix procedure took 8 hours…
  - Iterative procedure took 5 minutes
• See new Gormley and Matsa “Playing it Safe” working paper for example application: http://ssrn.com/abstract=2465632

See website for more details…
• For examples of SAS, Stata, and R code one can use to estimate these high-dimensional FE models, please see our website
  - http://finance.wharton.upenn.edu/~tgormley/papers/fe.html

Concluding remarks
• Unobserved heterogeneity across groups is a common identification concern in empirical finance
• Despite heavy use, AdjY and AvgE are typically biased
  - Can lead to very misleading inferences, including estimates with opposite sign of true effect
  - Problem also applies to other, ad hoc transformations of dep. var. used in literature
• FE is best way to account for unobserved heterogeneity; limitations can easily be overcome

Practical advice… the punch lines
• Don’t use AdjY or AvgE!
• Don’t use group averages as instruments!
• But, do use fixed effects
  - Should use benchmark portfolio-period FE in asset pricing rather than char-adjusted returns
  - Use iteration techniques to estimate models with multiple high-dimensional FE

Additional sources
• In addition to Gormley and Matsa (RFS 2014), other sources used to construct these slides are…
  - Chapter 10 of Wooldridge, Jeffrey M., 2010, Econometric Analysis of Cross-Section and Panel Data, MIT Press, Massachusetts, Second Edition
  - Chapter 11 of Greene, William H., 2011, Econometric Analysis, Prentice Hall, N.J., Seventh Edition
  - Section 5.1 of Angrist, Joshua D., and Jorn-Steffen Pischke, 2009, Mostly Harmless Econometrics, Princeton University Press, New Jersey