Common Errors: How to (and Not to) Control for Unobserved Heterogeneity
Lecture slides by Todd Gormley

What are these slides?
The following slides are a combination of lecture slides used by Todd Gormley in his Ph.D. course on “Empirical Methods in Corporate Finance” at The Wharton School.
For more details about the issues discussed in these slides, please see the article below:
Gormley, T. and D. Matsa, 2014, “Common Errors: How to (and Not to) Control for Unobserved Heterogeneity,” Review of Financial Studies 27(2): 617-61.
Slides by Gormley Panel Data & Common Errors

Motivation [Part 1]
Controlling for unobserved heterogeneity is a fundamental challenge in empirical finance
  Unobservable factors affect corporate policies and prices
  These factors may be correlated with variables of interest
Important sources of unobserved heterogeneity are often common across groups of observations
  Demand shocks across firms in an industry, differences in local economic environments, etc.

Motivation [Part 2]
E.g., consider the firm-level estimation
$leverage_{i,j,t} = \beta_0 + \beta_1 \, profit_{i,j,t-1} + u_{i,j,t}$
where leverage is debt/assets for firm i, operating in industry j in year t, and profit is the firm’s net income/assets
What might be some unobservable omitted variables in this estimation?

Motivation [Part 3]
Oh, there are so, so many…
  Managerial talent and/or risk aversion
  Cost of capital
  Industry supply and/or demand shock
  Regional demand shocks
  And so on…
Easy to think of ways these might affect leverage and be correlated with profits
Sadly, this is easy to do with other dependent or independent variables…

Panel data to the rescue…
Thankfully, panel data can help us with a particular type of unobserved variable…
What type of unobserved variable does panel data help us with, and why?
Answer = It helps with any unobserved variable that doesn’t vary within groups of observations

Outline for lecture
Panel data and fixed effects (FE)
How not to control for unobserved heterogeneity
General implications
Benefits and limitations of FE model
Estimating high-dimensional FE models

Panel data
Panel data = whenever you have multiple observations per unit of observation i (e.g. you observe each firm over multiple years)
Let’s assume N units i
And, J observations per unit i [i.e. balanced panel]
E.g., you observe 5,000 firms in Compustat over a twenty-year period [i.e. N=5,000, J=20]

The underlying model [Part 1]
When unobserved heterogeneity is thought to be present, the researcher implicitly assumes the following:
$y_{i,j} = \beta X_{i,j} + f_i + \varepsilon_{i,j}$
i indexes groups of observations (e.g. industry); j indexes observations within each group (e.g. firm)
  $y_{i,j}$ = dependent variable
  $X_{i,j}$ = independent variable of interest
  $f_i$ = unobserved group heterogeneity
  $\varepsilon_{i,j}$ = error term

The underlying model [Part 2]
The following standard assumptions are made:
N groups, J observations per group, where J is small and N is large
X and ε are i.i.d. across groups, but not necessarily i.i.d. within groups
$var(f) = \sigma_f^2$, $\mu_f = 0$
$var(X) = \sigma_X^2$, $\mu_X = 0$
$var(\varepsilon) = \sigma_\varepsilon^2$, $\mu_\varepsilon = 0$
The zero-mean assumption simplifies some expressions, but doesn’t change any results

The underlying model [Part 3]
Finally, the following assumptions are made:
$cov(f_i, \varepsilon_{i,j}) = 0$
$cov(X_{i,j}, \varepsilon_{i,j}) = cov(X_{i,j}, \varepsilon_{i,-j}) = 0$
$cov(X_{i,j}, f_i) = \sigma_{Xf} \neq 0$ [source of identification concern]
What do these imply?
Answer = Model is correct in that if we can control for f, we’ll properly identify the effect of X; but if we don’t control for f, there will be omitted variable bias

OLS estimate of β is inconsistent
True model is: $y_{i,j} = \beta X_{i,j} + f_i + \varepsilon_{i,j}$
But OLS estimates: $y_{i,j} = \beta^{OLS} X_{i,j} + u^{OLS}_{i,j}$
By failing to control for the group effect, $f_i$, OLS suffers from omitted variable bias
$\hat\beta^{OLS} = \beta + \frac{\sigma_{Xf}}{\sigma_X^2}$
Alternative estimation strategies are required…

Can solve this by transforming data
First, notice that if you take the mean of the dependent variable for each unit of observation, i, you get…
$\bar y_i = \alpha + \beta \bar x_i + f_i + \bar\varepsilon_i$
where $\bar y_i = \frac{1}{J}\sum_j y_{i,j}$, $\bar x_i = \frac{1}{J}\sum_j x_{i,j}$, $\bar\varepsilon_i = \frac{1}{J}\sum_j \varepsilon_{i,j}$
Again, I assumed there are J obs. per unit i

Transforming data [Part 2]
Now, if we subtract $\bar y_i$ from $y_{i,j}$, we have
$y_{i,j} - \bar y_i = \beta (x_{i,j} - \bar x_i) + (\varepsilon_{i,j} - \bar\varepsilon_i)$
And look! The unobserved variable, $f_i$, is gone (as is the constant) because it is group-invariant
With our earlier assumptions, easy to see that $(x_{i,j} - \bar x_i)$ is uncorrelated with the new disturbance, $(\varepsilon_{i,j} - \bar\varepsilon_i)$, which means…?

Fixed effects (or within) estimator
Answer: OLS estimation of the transformed model will yield a consistent estimate of β
The prior transformation is called the “within transformation” because it demeans all variables within their group
This is also called the FE estimator

Least Squares Dummy Variable (LSDV)
Another way to do the FE estimation is by adding indicator (dummy) variables
I.e. create a dummy variable for each group i, and add it to the regression
This is the least squares dummy variable model
Now, our estimation equation exactly matches the true underlying model
$y_{i,j} = \alpha + \beta x_{i,j} + f_i + u_{i,j}$

LSDV versus FE [Part 1]
Why do both approaches work?
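One way to build intuition for this question is a quick numerical check. The sketch below uses plain Python (standard library only) on a tiny made-up panel and computes the slope two ways: with the within transformation and with least squares on one dummy per group. The data values and the helper names `solve`, `ols`, and `demean` are illustrative assumptions, not part of the lecture.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][c] * z[c] for c in range(i + 1, n))) / m[i][i]
    return z


def ols(rows, y):
    """OLS coefficients from the normal equations; rows is the design matrix."""
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    return solve(xtx, xty)


def demean(values, group):
    """Within transformation: subtract each group's mean from its observations."""
    total, count = {}, {}
    for g, v in zip(group, values):
        total[g] = total.get(g, 0.0) + v
        count[g] = count.get(g, 0) + 1
    return [v - total[g] / count[g] for g, v in zip(group, values)]


# Hypothetical panel: 3 groups, 3 observations each.
group = [0, 0, 0, 1, 1, 1, 2, 2, 2]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0, 2.0, 7.0, 8.0]
y = [2.1, 3.9, 8.2, 11.0, 14.8, 17.1, -1.0, 9.2, 10.9]

# Within (FE) estimator: regress demeaned y on demeaned x.
xd, yd = demean(x, group), demean(y, group)
beta_within = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# LSDV: regress y on x plus one dummy per group (no separate constant).
design = [[xi] + [1.0 if g == k else 0.0 for k in range(3)] for xi, g in zip(x, group)]
beta_lsdv = ols(design, y)[0]

print(beta_within, beta_lsdv)  # both approximately 2.011
```

The two slopes agree up to floating-point error on this example, which is the equivalence the following slides explain.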
Well… the Frisch-Waugh-Lovell theorem shows us there are two ways to estimate the below $\beta_1$…
$y = \beta_0 + \beta_1 x + \beta_2 z + \varepsilon$
Estimate directly; i.e. regress y onto both x and z
OR we can just partial z out from both y and x before regressing y on x (i.e. regress residuals from regression of y on z onto residuals from regression of x on z)

LSDV versus FE [Part 2]
Can show that LSDV and the within-transformation of FE are identical because the demeaned variables of the within regression are the residuals from a regression onto group dummies!

Outline for lecture: How not to control for unobserved heterogeneity

Other approaches…
Gormley and Matsa (RFS 2014) note that the existing literature uses various other strategies to control for unobserved group-level heterogeneity…
Their questions – How does each of the approaches differ? And, when are they consistent?
Their answer – Some popular strategies can distort inferences and should not be used; the FE estimator should be used instead

They focus on two popular strategies
“Adjusted-Y” (AdjY) – dependent variable is demeaned within groups [e.g. ‘industry-adjust’]
“Average effects” (AvgE) – uses group mean of dependent variable as control [e.g. ‘state-year’ control]

AdjY & AvgE are widely used
In Journal of Finance, Journal of Financial Economics, and Review of Financial Studies
  Used since at least the late 1980s
  Still used, 60+ papers published in 2008-2010
  Variety of subfields; asset pricing, banking, capital structure, governance, M&A, etc.
Also been used in papers published in the American Economic Review, Journal of Political Economy, and Quarterly Journal of Economics

But, AdjY and AvgE are inconsistent
As Gormley and Matsa (RFS 2014) show…
  Both can be more biased than OLS
  Both can get the opposite sign of the true coefficient
  In practice, bias is likely, and trying to predict its sign or magnitude will typically be impractical

More implications of GM (RFS 2014)
Other, related strategies should also not be used
  “Characteristically-adjusted” stock returns in asset pricing (AP)
  “Adjusted” stock returns when trying to estimate firms’ internal value of cash
  Simple comparisons of benchmark-adjusted outcomes before & after events (like M&A)
  “Diversification discount”
  Using group average of an independent variable as instrumental variable
Now, let’s see why…

Adjusted-Y (AdjY)
Tries to remove unobserved group heterogeneity by demeaning the dependent variable within groups
AdjY estimates: $y_{i,j} - \bar y_i = \beta^{AdjY} X_{i,j} + u^{AdjY}_{i,j}$
where $\bar y_i = \frac{1}{J}\sum_{k \in \text{group } i} (\beta X_{i,k} + f_i + \varepsilon_{i,k})$
Note: Researchers often exclude the observation at hand when calculating the group mean or use a group median, but both modifications will yield similarly inconsistent estimates

Example AdjY estimation
One example – firm value regression:
$Q_{i,j,t} - \bar Q_{i,t} = \beta' X_{i,j,t} + \varepsilon_{i,j,t}$
  $Q_{i,j,t}$ = Tobin’s Q for firm j, industry i, year t
  $\bar Q_{i,t}$ = mean of Tobin’s Q for industry i in year t
  $X_{i,j,t}$ = vector of variables thought to affect value
Researchers might also include firm & year FE
Anyone know why AdjY is going to be inconsistent?

Here is why…
Rewriting the group mean, we have: $\bar y_i = f_i + \beta \bar X_i + \bar\varepsilon_i$
Therefore, AdjY transforms the true data to: $y_{i,j} - \bar y_i = \beta X_{i,j} - \beta \bar X_i + \varepsilon_{i,j} - \bar\varepsilon_i$
What is the AdjY estimation forgetting?
AdjY has omitted variable bias
$\hat\beta^{AdjY}$ can be inconsistent when $\beta \neq 0$
True model: $y_{i,j} - \bar y_i = \beta X_{i,j} - \beta \bar X_i + \varepsilon_{i,j} - \bar\varepsilon_i$
But, AdjY estimates: $y_{i,j} - \bar y_i = \beta^{AdjY} X_{i,j} + u^{AdjY}_{i,j}$
By failing to control for $\bar X_i$, AdjY suffers from omitted variable bias when $\sigma_{X\bar X} \neq 0$, where $\sigma_{X\bar X} = cov(X_{i,j}, \bar X_i)$
$\hat\beta^{AdjY} = \beta - \beta \frac{\sigma_{X\bar X}}{\sigma_X^2}$
In practice, a positive covariance between X and $\bar X$ will be very common

Further analysis of AdjY estimate
$\hat\beta^{AdjY} = \beta - \beta \frac{\sigma_{X\bar X}}{\sigma_X^2}$
Bias doesn’t disappear as group size J increases
Can be inconsistent even when OLS is not; this happens when $\sigma_{Xf} = 0$ and $\sigma_{X\bar X} \neq 0$
Bias is more complicated with two variables…

AdjY estimates with 2 variables
Suppose there are instead two RHS variables
True model: $y_{i,j} = \beta X_{i,j} + \gamma Z_{i,j} + f_i + \varepsilon_{i,j}$
Use same assumptions as before, but add:
$cov(Z_{i,j}, \varepsilon_{i,j}) = cov(Z_{i,j}, \varepsilon_{i,-j}) = 0$
$var(Z) = \sigma_Z^2$, $\mu_Z = 0$
$cov(X_{i,j}, Z_{i,j}) = \sigma_{XZ}$
$cov(Z_{i,j}, f_i) = \sigma_{Zf}$

AdjY estimates with 2 variables [Part 2]
With a bit of algebra, it is shown that:
$\hat\beta^{AdjY} = \beta + \frac{\beta(\sigma_{XZ}\sigma_{Z\bar X} - \sigma_Z^2\sigma_{X\bar X}) + \gamma(\sigma_{XZ}\sigma_{Z\bar Z} - \sigma_Z^2\sigma_{X\bar Z})}{\sigma_Z^2\sigma_X^2 - \sigma_{XZ}^2}$
$\hat\gamma^{AdjY} = \gamma + \frac{\beta(\sigma_{XZ}\sigma_{X\bar X} - \sigma_X^2\sigma_{Z\bar X}) + \gamma(\sigma_{XZ}\sigma_{X\bar Z} - \sigma_X^2\sigma_{Z\bar Z})}{\sigma_Z^2\sigma_X^2 - \sigma_{XZ}^2}$
Estimates of both β and γ can be inconsistent
Determining sign and magnitude of bias will typically be difficult

Average effects (AvgE)
AvgE uses group mean of dependent variable as control for unobserved heterogeneity
AvgE estimates: $y_{i,j} = \beta^{AvgE} X_{i,j} + \gamma^{AvgE} \bar y_i + u^{AvgE}_{i,j}$

Example AvgE estimation
Following profit regression is an AvgE example:
$ROA_{i,s,t} = \beta' X_{i,s,t} + \gamma \overline{ROA}_{s,t} + \varepsilon_{i,s,t}$
  $\overline{ROA}_{s,t}$ = mean of ROA for state s in year t
  $X_{i,s,t}$ = vector of variables thought to affect profits
Researchers might also include firm & year FE
Anyone know why AvgE is going to be inconsistent?
Average effects (AvgE)
AvgE estimates: $y_{i,j} = \beta^{AvgE} X_{i,j} + \gamma^{AvgE} \bar y_i + u^{AvgE}_{i,j}$
Recall, true model: $y_{i,j} = \beta X_{i,j} + f_i + \varepsilon_{i,j}$
Problem is that $\bar y_i$ measures $f_i$ with error

AvgE has measurement error bias
Recall that the group mean is given by $\bar y_i = f_i + \beta \bar X_i + \bar\varepsilon_i$
Therefore, $\bar y_i$ measures $f_i$ with error $\beta \bar X_i + \bar\varepsilon_i$
As is well known, even classical measurement error causes all estimated coefficients to be inconsistent
Bias here is complicated because the error can be correlated with both the mismeasured variable, $f_i$, and with $X_{i,j}$ when $\sigma_{X\bar X} \neq 0$

AvgE estimate of β with one variable
With a bit of algebra, it is shown that:
$\hat\beta^{AvgE} = \beta + \frac{\sigma_{Xf}(\beta\sigma_{f\bar X} + \beta^2\sigma_{\bar X}^2 + \sigma_{\bar\varepsilon}^2 - \sigma_{\varepsilon\bar\varepsilon}) - \beta\sigma_{X\bar X}(\sigma_f^2 + \beta\sigma_{f\bar X} + \sigma_{\varepsilon\bar\varepsilon})}{\sigma_X^2(\sigma_f^2 + 2\beta\sigma_{f\bar X} + \beta^2\sigma_{\bar X}^2 + \sigma_{\bar\varepsilon}^2) - (\sigma_{Xf} + \beta\sigma_{X\bar X})^2}$
Determining magnitude and direction of bias is difficult
Covariance between X and $\bar X$ again problematic, but not needed for AvgE estimate to be inconsistent
Even non-i.i.d. nature of errors can affect bias!

How common will the bias be?
First, we look at when $\sigma_{X\bar X} \neq 0$ by separating $X_{i,j}$ into its group and idiosyncratic components
$X_{i,j} = x_i + w_{i,j}$
Assume group means $x_i$ are i.i.d. with mean zero and variance $\sigma_x^2$
Idiosyncratic component $w_{i,j}$ distributed with mean 0 and variance $\sigma_w^2$
And, assume $cov(x_i, w_{i,j}) = 0$

AdjY and AvgE bias very common
Both AdjY and AvgE biased when $\sigma_{X\bar X} \neq 0$
But with prior setup, we can show that…
$\sigma_{X\bar X} = \sigma_x^2 + \rho_{w_{i,j}, w_{i,-j}} \sigma_w^2$ *
First term: bias whenever different means across groups!
Second term: or, bias whenever observations within groups are not independent!
* Solved excluding observation at hand (most common approach)

Analytical comparisons
Next, we use analytical solutions to compare relative performance of OLS, AdjY, and AvgE
To do this, we re-express solutions…
We use correlations (e.g. solve bias in terms of correlation between X and f, $\rho_{Xf}$, instead of $\sigma_{Xf}$)
We also assume i.i.d. errors [just makes bias of AvgE less complicated]

ρXf has large effect on performance (from Figure 1A)
[Figure 1A: estimate $\hat\beta$ for OLS, AdjY, and AvgE plotted against $\rho_{Xf}$; true β = 1. Other parameters held constant: $\sigma_f/\sigma_X = \sigma_\varepsilon/\sigma_X = 1$, $\sigma_x/\sigma_w = 0.25$, $\rho_{w_{i,j}, w_{i,-j}} = 0.5$, J = 10.]
AdjY more biased than OLS, except for large values of $\rho_{Xf}$
AvgE worst for low correlations, best for high

Relative variation across groups key (from Figure 1B)
[Figure 1B: estimate $\hat\beta$ for OLS, AdjY, and AvgE plotted against $\sigma_x/\sigma_w$. Other parameters: $\sigma_f/\sigma_X = \sigma_\varepsilon/\sigma_X = 1$, $\rho_{Xf} = 0.25$, $\rho_{w_{i,j}, w_{i,-j}} = 0.5$, J = 10.]

More observations need not help! (from Figure 1F)
[Figure 1F: estimate $\hat\beta$ for OLS, AdjY, and AvgE plotted against group size J. Other parameters: $\sigma_f/\sigma_X = \sigma_\varepsilon/\sigma_X = 1$, $\sigma_x/\sigma_w = \rho_{Xf} = 0.25$, $\rho_{w_{i,j}, w_{i,-j}} = 0.5$.]

Summary of OLS, AdjY, and AvgE
In general, all three estimators are inconsistent in presence of unobserved group heterogeneity
AdjY and AvgE may not be an improvement over OLS; depends on various parameter values
AdjY and AvgE can yield estimates with opposite sign of the true coefficient

Comparing FE, AdjY, and AvgE
To estimate effect of X on Y controlling for Z
  One could regress Y onto both X and Z… [add group FE]
  Or, regress residuals from regression of Y on Z onto residuals from regression of X on Z [within-group transformation!]
AdjY and AvgE aren’t the same as finding the effect of X on Y controlling for Z because...
  AdjY only partials Z out from Y
  AvgE uses fitted values of Y on Z as control

The differences matter!
Example #1
Consider the following capital structure regression:
$(D/A)_{i,t} = \beta X_{i,t} + f_i + \varepsilon_{i,t}$
  $(D/A)_{i,t}$ = book leverage for firm i, year t
  $X_{i,t}$ = vector of variables thought to affect leverage
  $f_i$ = firm fixed effect
We now run this regression for each approach to deal with firm fixed effects, using 1950-2010 data, winsorizing at 1% tails…

Estimates vary considerably (from Table 2)
Dependent variable = book leverage (standard errors in parentheses)
                               OLS                 AdjY                AvgE                FE
Fixed Assets / Total Assets    0.270*** (0.008)    0.066*** (0.004)    0.103*** (0.004)    0.248*** (0.014)
Ln(sales)                      0.011*** (0.001)    0.011*** (0.000)    0.011*** (0.000)    0.017*** (0.001)
Return on Assets               -0.015*** (0.005)   0.051*** (0.004)    0.039*** (0.004)    -0.028*** (0.005)
Z-score                        -0.017*** (0.000)   -0.010*** (0.000)   -0.011*** (0.000)   -0.017*** (0.001)
Market-to-book Ratio           -0.006*** (0.000)   -0.004*** (0.000)   -0.004*** (0.000)   -0.003*** (0.000)
Observations                   166,974             166,974             166,974             166,974
R-squared                      0.29                0.14                0.56                0.66

The differences matter! Example #2
Consider the following firm value regression:
$Q_{i,j,t} = \beta' X_{i,j,t} + f_{j,t} + \varepsilon_{i,j,t}$
  $Q_{i,j,t}$ = Tobin’s Q for firm i, industry j, year t
  $X_{i,j,t}$ = vector of variables thought to affect value
  $f_{j,t}$ = industry-year fixed effect
We now run this regression for each approach to deal with industry-year fixed effects…

Estimates vary considerably (from Table 4)
Dependent variable = Tobin's Q (standard errors in parentheses)
                          OLS                 AdjY                AvgE                FE
Delaware Incorporation    0.100*** (0.036)    0.019 (0.032)       0.040 (0.032)       0.086** (0.039)
Ln(sales)                 -0.125*** (0.009)   -0.054*** (0.008)   -0.072*** (0.008)   -0.131*** (0.011)
R&D Expenses / Assets     6.724*** (0.260)    3.022*** (0.242)    3.968*** (0.256)    5.541*** (0.318)
Return on Assets          -0.559*** (0.108)   -0.526*** (0.095)   -0.535*** (0.097)   -0.436*** (0.117)
Observations              55,792              55,792              55,792              55,792
R-squared                 0.22                0.08                0.34                0.37

The differences matter!
Example #3
It also matters in the literature on antitakeover laws
Past papers used AvgE to control for unobserved, time-varying differences across states & industries
Gormley and Matsa (2014) show that properly using industry-year, state-year, and firm FE estimator changes estimates considerably
E.g., using this framework, they show that managers have an underlying preference to “Play it Safe”
For details, see http://ssrn.com/abstract=2465632

Outline for lecture: General implications

General implications
With this framework, easy to see that other commonly used estimators will be biased
  AdjY-type estimators in M&A, asset pricing, etc.
  AvgE-type instrumental variables

Other AdjY estimators are problematic
Same problem arises with other AdjY estimators
  Subtracting off median or value-weighted mean
  Subtracting off mean of matched control sample [as is customary in studies of diversification “discount”]
  Comparing industry-adjusted means for treated firms pre- versus post-event [as often done in M&A studies]
  Characteristically adjusted returns [as used in asset pricing]

AdjY-type estimators in asset pricing
Common to sort and compare stock returns across portfolios based on a variable thought to affect returns
But, returns are often first “characteristically adjusted”
  I.e. researcher subtracts the average return of a benchmark portfolio containing stocks of similar characteristics
  This is equivalent to AdjY, where “adjusted returns” are regressed onto indicators for each portfolio
Approach fails to control for how the average independent variable varies across benchmark portfolios

Asset Pricing AdjY – Example
Asset pricing example; sorting returns based on R&D expenses / market value of equity
Characteristically adjusted returns by R&D quintile (i.e., AdjY); standard errors in parentheses
Missing             Q1                  Q2                  Q3               Q4              Q5
-0.012*** (0.003)   -0.033*** (0.009)   -0.023*** (0.008)   -0.002 (0.007)   0.008 (0.013)   0.020*** (0.006)
We use industry-size benchmark portfolios and sort using R&D/market value
Difference between Q5 and Q1 is 5.3 percentage points

Estimates vary considerably (from Table 5)
Dependent variable = yearly stock return (standard errors in parentheses)
                  AdjY               FE
R&D Missing       0.021** (0.009)    0.030*** (0.010)
R&D Quintile 2    0.010 (0.013)      0.019 (0.014)
R&D Quintile 3    0.032*** (0.012)   0.051*** (0.018)
R&D Quintile 4    0.041*** (0.015)   0.068*** (0.020)
R&D Quintile 5    0.053*** (0.011)   0.094*** (0.019)
Observations      144,592            144,592
R-squared         0.00               0.47
AdjY column: same AdjY result, but in regression format; quintile 1 is excluded
FE column: use benchmark-period FE to transform both returns and R&D; this is equivalent to a double sort

AvgE IV estimators also problematic
Many researchers try to instrument a problematic $X_{i,j}$ with the group mean, $\bar X_i$, excluding observation j
  Argument is that $\bar X_i$ is correlated with $X_{i,j}$ but not the error
But, this is typically going to be problematic
  Any correlation between $X_{i,j}$ and an unobserved heterogeneity, $f_i$, causes the exclusion restriction to not hold
  Can’t add FE to fix this since the IV only varies at the group level

What if AdjY or AvgE is true model?
If data exhibit the structure of the AvgE estimator, this would be a peer effects model [i.e. group mean affects outcome of other members]
  In this case, none of the estimators (OLS, AdjY, AvgE, or FE) reveal the true β [Manski 1993; Leary and Roberts 2010]
Even if interested in studying $y_{i,j} - \bar y_i$, AdjY only consistent if $X_{i,j}$ does not affect $y_{i,j}$!
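The comparison of OLS, AdjY, AvgE, and FE made in this part of the lecture can be illustrated with a small simulation. The sketch below draws data from the underlying model $y_{i,j} = \beta X_{i,j} + f_i + \varepsilon_{i,j}$ with β = 1 and applies all four estimators. Every parameter value (2,000 groups, J = 10, unit variances, $f_i = 0.5 x_i$ plus noise) and every function name is an assumption chosen for illustration; group means here include the observation at hand, and nothing below reproduces the paper's tables or figures.

```python
import random


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][c] * z[c] for c in range(i + 1, n))) / m[i][i]
    return z


def ols(columns, y):
    """OLS of y on a constant plus the given regressor columns; returns [const, b1, ...]."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    return solve(xtx, xty)


def group_mean(values, gid):
    """Return, for every observation, the mean of its group (own observation included)."""
    total, count = {}, {}
    for g, v in zip(gid, values):
        total[g] = total.get(g, 0.0) + v
        count[g] = count.get(g, 0) + 1
    return [total[g] / count[g] for g in gid]


def simulate(n_groups=2000, j=10, beta=1.0, seed=0):
    """Draw a balanced panel in which the group effect f_i is correlated with X."""
    rng = random.Random(seed)
    x, y, gid = [], [], []
    for i in range(n_groups):
        x_i = rng.gauss(0, 1)                          # group component of X
        f_i = 0.5 * x_i + rng.gauss(0, 0.75 ** 0.5)    # unobserved heterogeneity, var(f) = 1
        for _ in range(j):
            x_ij = x_i + rng.gauss(0, 1)               # add idiosyncratic component w
            x.append(x_ij)
            y.append(beta * x_ij + f_i + rng.gauss(0, 1))
            gid.append(i)
    return x, y, gid


x, y, gid = simulate()
xbar, ybar = group_mean(x, gid), group_mean(y, gid)
y_adj = [yi - m for yi, m in zip(y, ybar)]
x_within = [xi - m for xi, m in zip(x, xbar)]

b_ols = ols([x], y)[1]             # ignores f_i entirely
b_adjy = ols([x], y_adj)[1]        # demeans only the dependent variable
b_avge = ols([x, ybar], y)[1]      # adds the group mean of y as a control
b_fe = ols([x_within], y_adj)[1]   # demeans both y and X (within estimator)

print(f"OLS {b_ols:.3f}  AdjY {b_adjy:.3f}  AvgE {b_avge:.3f}  FE {b_fe:.3f}")
```

Under these assumed parameters, the formulas from the earlier slides imply probability limits of roughly 1.25 for OLS, 0.45 for AdjY, and 0.75 for AvgE, while FE should land close to the true value of 1; in this particular configuration AdjY and AvgE are both further from the truth than OLS.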
Outline for lecture: Benefits and limitations of FE model

FE Estimator – Benefits [Part 1]
There are many benefits of the FE estimator
Allows for arbitrary correlation between each fixed effect, $f_i$, and each x within group i
  I.e. it is very general and does not impose much structure on what the underlying data must look like
Very intuitive interpretation; coefficient is identified using only changes within cross-sections

FE Estimator – Benefits [Part 2]
It is also very flexible and can help us control for many types of unobserved heterogeneities
  Can add year FE if worried about unobserved heterogeneity across time [e.g. macroeconomic shocks]
  Can add CEO FE if worried about unobserved heterogeneity across CEOs [e.g. talent, risk aversion]
  Add industry-by-year FE if worried about unobserved heterogeneity across industries over time [e.g. investment opportunities, demand shocks]

FE Estimator – Limitations
But, the FE estimator also has its limitations
  Can’t identify variables that don’t vary within group
  Subject to potentially large measurement error bias
  Can be hard to estimate in some cases

Limitation #1 – Can’t estimate some variables
If there is no within-group variation in the independent variable, x, of interest, can’t disentangle it from the group FE
  It is collinear with the group FE, and will be dropped by the computer or swept out in the within transformation
In some cases, IV can be used to obtain estimates for variables that do not vary within groups [see Hausman and Taylor 1981]

Limitation #2 – Noisy independent variables
If some within-group variation is noise, then the share of the exploited variation that is noise rises with FE
Think of there being two types of variation
  Good (meaningful) variation
  Noise variation because we don’t perfectly measure the underlying variable of interest
Adding FE can sweep out a lot of the good variation; fraction of remaining variation coming from noise goes up [What will this do?]

Noisy independent variables [Part 2]
Answer: Attenuation bias on the mismeasured (i.e. noisy) independent variable will go up!
Practical advice: Be careful in interpreting ‘zero’ coefficients on potentially mismeasured regressors; might just be attenuation bias!
Note… sign of bias on other coefficients will generally be difficult to know

Noisy independent variables [Part 3]
Problem can also apply even when all variables are perfectly measured [How?]
Answer: Adding FE might throw out relevant variation; e.g. y in a firm FE model might respond to sustained changes in x, rather than transitory changes [see McKinnish 2008 for more details]
With FE you’d only have the transitory variation left over; might find x uncorrelated with y in FE estimation even though sustained changes in x are the most important determinant of y

Possible solutions for Limitation #2
Standard solutions for measurement error apply (e.g. IV), but in practice, hard to fix
For examples on how to deal with measurement error, see the following papers
  Griliches and Hausman (JoE 1986)
  Biorn (Econometric Reviews 2000)
  Erickson and Whited (JPE 2000, RFS 2012)
  Almeida, Campello, and Galvao (RFS 2010)

Limitation #3 – Computation issues
Researchers occasionally motivate using AdjY and AvgE because the FE estimator is computationally difficult when there is more than one high-dimensional FE
Now, let’s see why this is (and isn’t) a problem…

Computational issues [Part 1]
Estimating a model with multiple types of FE can be computationally difficult
When there is more than one type of FE, you cannot remove both using the within-transformation
Generally, you can only sweep one away with the within-transformation; the other FE are dealt with by adding dummy variables to the model
  E.g. firm and year fixed effects [See next slide]

Computational issues [Part 2]
Consider below model:
$y_{i,t} = \beta x_{i,t} + \delta_t + f_i + u_{i,t}$, where $\delta_t$ is the year FE and $f_i$ is the firm FE
To estimate this in Stata, we’d use a command something like the following…
    xtset firm
    xi: xtreg y x i.year, fe
xtset firm: tells Stata that the panel dimension is given by the firm variable
i.year: tells Stata to create and add dummy variables for the year variable
fe: tells Stata to remove FE for panels (i.e. firms) by doing the within-transformation

Computational issues [Part 3]
Dummies not swept away in the within-transformation are actually estimated
With year FE, this isn’t a problem because there aren’t that many years of data
If we had to estimate 1,000s of firm FE, however, it might be a problem…

Why is this a problem?
Estimating an FE model with many dummies can require a lot of computer memory
E.g., estimation with both firm and 4-digit industry-year FE requires ≈ 40 GB of memory
Most researchers don’t have this much memory; hence, we don’t see these regressions being used

This is a growing problem
Multiple unobserved heterogeneities are increasingly argued to be important
  Manager and firm fixed effects in executive compensation and other CF applications [Graham, Li, and Qiu 2011, Coles and Li 2011]
  Firm, industry×year, state×year FE to control for industry- and state-level shocks [Gormley and Matsa 2014]

But, there are solutions!
There exist two techniques that can be used to arrive at consistent FE estimates without requiring as much memory
  #1 – Interacted fixed effects
  #2 – Memory-saving procedures

Outline for lecture: Estimating high-dimensional FE models

#1 – Interacted fixed effects
Combine multiple fixed effects into a one-dimensional set of fixed effects, and remove using the within transformation
  E.g. firm and industry-year FE could be replaced with firm-industry-year FE
But, there are limitations…
  Can severely limit the parameters you can estimate
  Could have serious attenuation bias

#2 – Memory-saving procedures
Use properties of sparse matrices to reduce required memory, e.g. Cornelissen (2008)
Or, instead iterate to a solution, which eliminates the memory issue entirely, e.g. Guimaraes and Portugal (2010)
See Gormley and Matsa (RFS 2014) for details of how each method works
Both can be done in Stata using user-written commands FELSDVREG and REGHDFE

These latter techniques work…
Estimated typical capital structure regression with firm and 4-digit industry×year dummies
  Standard FE approach would not work; my computer did not have enough memory…
  Sparse matrix procedure took 8 hours…
  Iterative procedure took 5 minutes
See new Gormley and Matsa “Playing it Safe” working paper for example application: http://ssrn.com/abstract=2465632

See website for more details…
For examples of SAS, STATA, and R code one can use to estimate these high-dimensional FE estimations, please see our website
http://finance.wharton.upenn.edu/~tgormley/papers/fe.html

Concluding remarks
Unobserved heterogeneity across groups is a common identification concern in empirical finance
Despite heavy use, AdjY and AvgE are typically biased
  Can lead to very misleading inferences, including estimates with opposite sign of true effect
  Problem also applies to other, ad hoc transformations of the dependent variable used in the literature
FE is the best way to account for unobserved heterogeneity; limitations can easily be overcome

Practical advice… the punch lines
Don’t use AdjY or AvgE!
Don’t use group averages as instruments!
But, do use fixed effects
  Should use benchmark portfolio-period FE in asset pricing rather than characteristically adjusted returns
  Use iteration techniques to estimate models with multiple high-dimensional FE

Additional sources
In addition to Gormley and Matsa (RFS 2014), other sources used to construct these slides are…
Chapter 10 of Wooldridge, Jeffrey M., 2010, Econometric Analysis of Cross-Section and Panel Data, MIT Press, Massachusetts, Second Edition.
Chapter 11 of Greene, William H., 2011, Econometric Analysis, Prentice Hall, N.J., Seventh Edition.
Section 5.1 of Angrist, Joshua D., and Jorn-Steffen Pischke, 2009, Mostly Harmless Econometrics, Princeton University Press, New Jersey.
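To make the advice about iteration techniques concrete, here is a minimal sketch of one simple iterative idea: alternately demeaning by firm and by year until nothing changes, which sweeps out both sets of fixed effects without ever building a dummy-variable matrix. It is a simplified stand-in written in plain Python, not the algorithm of Guimaraes and Portugal (2010) and not the REGHDFE implementation. The unbalanced panel, the function names, and the true coefficient of 2 are assumptions, and the outcome is generated without noise so that the recovered slope can be checked exactly.

```python
import random


def demean_by(values, keys):
    """Subtract from each value the mean of its group (groups are given by keys)."""
    total, count = {}, {}
    for k, v in zip(keys, values):
        total[k] = total.get(k, 0.0) + v
        count[k] = count.get(k, 0) + 1
    return [v - total[k] / count[k] for k, v in zip(keys, values)]


def sweep_two_way(values, firm, year, tol=1e-12, max_iter=10_000):
    """Alternate firm- and year-demeaning until the values stop changing."""
    current = list(values)
    for _ in range(max_iter):
        updated = demean_by(demean_by(current, firm), year)
        change = max(abs(a - b) for a, b in zip(updated, current))
        current = updated
        if change < tol:
            break
    return current


# Hypothetical unbalanced firm-year panel with both firm and year effects.
rng = random.Random(1)
firm_fe = {i: rng.gauss(0, 2) for i in range(30)}
year_fe = {t: rng.gauss(0, 2) for t in range(8)}
firm, year, x, y = [], [], [], []
for i in range(30):
    for t in range(8):
        if rng.random() < 0.2:      # drop some firm-years so the panel is unbalanced
            continue
        x_it = rng.gauss(0, 1) + 0.5 * firm_fe[i] + 0.5 * year_fe[t]
        firm.append(i)
        year.append(t)
        x.append(x_it)
        y.append(2.0 * x_it + firm_fe[i] + year_fe[t])   # true beta = 2, no noise

# Sweep both sets of fixed effects out of x and y, then run a simple regression.
xd = sweep_two_way(x, firm, year)
yd = sweep_two_way(y, firm, year)
beta_fe = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(beta_fe)  # close to 2.0
```

Memory use here grows with the number of observations rather than with the number of fixed effects, which is the appeal of iterating instead of estimating thousands of dummies.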