Part 4: Fixed Effects [ 1/96]
Econometric Analysis of Panel Data
William Greene
Department of Economics
Stern School of Business

[ 2/96] Estimation with Fixed Effects
The fixed effects model
\( y_{it} = x_{it}'\beta + c_i + \varepsilon_{it} \), observation for person i at time t
\( y_i = X_i\beta + c_i\mathbf{i} + \varepsilon_i \), \(T_i\) observations in group i
\( \phantom{y_i} = X_i\beta + \mathbf{c}_i + \varepsilon_i \), note \( \mathbf{c}_i = (c_i, c_i, \ldots, c_i)' \)
\( y = X\beta + c + \varepsilon \), \( \sum_{i=1}^N T_i \) observations in the sample
\( c = (\mathbf{c}_1', \mathbf{c}_2', \ldots, \mathbf{c}_N')' \), a \( \sum_{i=1}^N T_i \) by 1 vector
\(c_i\) is arbitrarily correlated with \(x_{it}\), but \( E[\varepsilon_{it} | X_i, c_i] = 0 \).
Dummy variable representation
\( y_{it} = x_{it}'\beta + \sum_{j=1}^N \alpha_j d_{ijt} + \varepsilon_{it} \), \( d_{ijt} = 1(i=j) \)

[ 3/96] The Fixed Effects Model
\( y_i = X_i\beta + d_i\alpha_i + \varepsilon_i \), for each individual
\[ \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix}\beta + \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_N \end{bmatrix}\alpha + \varepsilon = [X, D]\begin{bmatrix} \beta \\ \alpha \end{bmatrix} + \varepsilon = Z\delta + \varepsilon \]
\( E[c_i | X_i] = g(X_i) \); effects are correlated with included variables. \( \text{Cov}[x_{it}, c_i] \neq 0 \)

[ 4/96] Useful Analysis of Variance Notation
Decomposition of total variation:
\[ \sum_{i=1}^N \sum_{t=1}^{T_i} (z_{it} - \bar z)^2 = \sum_{i=1}^N \sum_{t=1}^{T_i} (z_{it} - \bar z_{i.})^2 + \sum_{i=1}^N T_i (\bar z_{i.} - \bar z)^2 \]
Total variation = Within groups variation + Between groups variation

[ 5/96] Baltagi and Griffin's Gasoline Data
World Gasoline Demand Data, 18 OECD Countries, 19 years
Variables in the file are
COUNTRY = name of country
YEAR = year, 1960-1978
LGASPCAR = log of consumption per car
LINCOMEP = log of per capita income
LRPMG = log of real price of gasoline
LCARPCAP = log of per capita number of cars
See Baltagi (2001, p. 24) for analysis of these data. The article on which the analysis is based is Baltagi, B. and Griffin, J., "Gasoline Demand in the OECD: An Application of Pooling and Testing Procedures," European Economic Review, 22, 1983, pp. 117-137. The data were downloaded from the website for Baltagi's text.
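The analysis of variance decomposition on slide 4 can be checked numerically. The following is a minimal sketch in Python with NumPy; the function name `variance_decomposition` and the synthetic unbalanced panel are assumptions made for illustration and are not the gasoline data.

```python
import numpy as np

def variance_decomposition(z, groups):
    """Return (total, within, between) sums of squares of z by group."""
    z = np.asarray(z, dtype=float)
    groups = np.asarray(groups)
    grand_mean = z.mean()
    total = np.sum((z - grand_mean) ** 2)
    within = 0.0
    between = 0.0
    for g in np.unique(groups):
        z_g = z[groups == g]
        # deviations from the group mean z-bar_i.
        within += np.sum((z_g - z_g.mean()) ** 2)
        # T_i times squared deviation of the group mean from the grand mean
        between += z_g.size * (z_g.mean() - grand_mean) ** 2
    return total, within, between

# Hypothetical unbalanced panel: 5 groups with T_i = 3, 4, 2, 5, 4
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(5), [3, 4, 2, 5, 4])
z = rng.normal(size=groups.size) + 0.5 * groups

total, within, between = variance_decomposition(z, groups)
print(total, within + between)  # the two numbers should agree
```

The identity holds for unbalanced panels as well, because the between term weights each group by its own \(T_i\).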
[ 6/96] Analysis of Variance
[Figure: Per Capita Gasoline Use for 18 OECD Countries; LGASPCAR plotted against COUNTRY]

[ 7/96] Analysis of Variance
SETPANEL ; Group = country $
ANOVA ; Lhs = lgaspcar $

[ 8/96]
\[ [X, D] = \begin{bmatrix} X_1 & d_1 & 0 & 0 & \cdots & 0 \\ X_2 & 0 & d_2 & 0 & \cdots & 0 \\ X_3 & 0 & 0 & d_3 & \cdots & 0 \\ \vdots & & & & \ddots & \vdots \\ X_N & 0 & 0 & 0 & \cdots & d_N \end{bmatrix} \begin{matrix} (T_1 \text{ rows}) \\ (T_2 \text{ rows}) \\ (T_3 \text{ rows}) \\ \vdots \\ (T_N \text{ rows}) \end{matrix} \]
The matrix has \( \sum_{i=1}^N T_i \) rows.

[ 9/96] Estimating the Fixed Effects Model
The FEM is a plain vanilla regression model but with many independent variables.
Least squares is unbiased, consistent, efficient, but inconvenient if N is large.
\[ \begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} X'X & X'D \\ D'X & D'D \end{bmatrix}^{-1} \begin{bmatrix} X'y \\ D'y \end{bmatrix} \]
Using the Frisch-Waugh theorem
\( b = [X'M_D X]^{-1} X'M_D y \)

[ 10/96] Fixed Effects Estimator (cont.)
\[ M_D = \begin{bmatrix} M_D^1 & 0 & \cdots & 0 \\ 0 & M_D^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & M_D^N \end{bmatrix} \quad \text{(The dummy variables are orthogonal)} \]
\( M_D^i = I_{T_i} - d_i(d_i'd_i)^{-1}d_i' = I_{T_i} - (1/T_i)d_id_i' \)
\( X'M_DX = \sum_{i=1}^N X_i'M_D^iX_i \), \( X'M_Dy = \sum_{i=1}^N X_i'M_D^iy_i \)
\( [X_i'M_D^iX_i]_{k,l} = \sum_{t=1}^{T_i} (x_{it,k} - \bar x_{i.,k})(x_{it,l} - \bar x_{i.,l}) \)
\( [X_i'M_D^iy_i]_{k} = \sum_{t=1}^{T_i} (x_{it,k} - \bar x_{i.,k})(y_{it} - \bar y_{i.}) \)
If all groups have the same \(T_i\), then with the data stacked by group \( M_D = I_N \otimes M^0 \) where \( M^0 = I_T - (1/T)dd' \), so
\( X'M_DX = X'[I_N \otimes M^0]X \) and \( b = \{X'[I_N \otimes M^0]X\}^{-1} X'[I_N \otimes M^0]y \).

[ 11/96] The Within Transformation Removes the Effects
\( y_{it} = x_{it}'\beta + c_i + \varepsilon_{it} \)
\( \bar y_i = \bar x_i'\beta + c_i + \bar\varepsilon_i \)
\( y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i) \)
\( \ddot y_{it} = \ddot x_{it}'\beta + \ddot\varepsilon_{it} \)
The last line uses Wooldridge's notation for data in deviations from group means.

[ 12/96] Least Squares Dummy Variable Estimator
b is obtained by 'within' groups least squares (group mean deviations).
Normal equations for a are \( D'Xb + D'Da = D'y \)
\( a = (D'D)^{-1}D'(y - Xb) \)
\( a_i = (1/T_i)\sum_{t=1}^{T_i} (y_{it} - x_{it}'b) = \bar e_i \)
Notes:
This is simple algebra: the estimator is just OLS.
Least squares is an estimator, not a model. (Repeat twice.)
Note what \(a_i\) is when \(T_i = 1\). Follow this with \( y_{it} - a_i - x_{it}'b = 0 \) if \(T_i = 1\).

[ 13/96] Inference About OLS
Assume strict exogeneity: \( \text{Cov}[\varepsilon_{it}, (x_{js}, c_j)] = 0 \).
Every disturbance in every period for each person is uncorrelated with variables and effects for every person and across periods.
Now, it's just least squares in a classical linear regression model.
\[ \text{Asy.Var}[b] = \frac{\sigma^2}{\sum_{i=1}^N T_i}\, \text{plim}\left[\frac{1}{\sum_{i=1}^N T_i} \sum_{i=1}^N X_i'M_D^iX_i\right]^{-1} \]
which is the usual estimator for OLS, with
\[ \hat\sigma^2 = \frac{\sum_{i=1}^N \sum_{t=1}^{T_i} (y_{it} - a_i - x_{it}'b)^2}{\sum_{i=1}^N T_i - N - K} \]
(Note the degrees of freedom correction.)

[ 14/96] Application: Cornwell and Rupert

[ 15/96] LSDV Results
Note huge changes in the coefficients. SMSA and MS change signs. Significance changes completely!
Comparison panel: Pooled OLS

[ 16/96] The Effect of the Effects

[ 17/96] The Estimated Fixed Effects
[Figure: histogram, "Fixed Effects from Cornwell and Rupert Wage Model"; Frequency against AI]

[ 18/96] A Kernel Density Estimator
\[ \hat f(x_m^*) = \frac{1}{n}\sum_{i=1}^n \frac{1}{h} K\left[\frac{x_i - x_m^*}{h}\right], \text{ for a set of points } x_m^* \]
h = "bandwidth" chosen by the analyst. A common choice is Silverman's rule of thumb, \( h = 1.06\,\hat\sigma_x / n^{1/5} \).
K = the kernel function, such as the normal or logistic density (or one of several others).
x* = the point at which the density is approximated.

[ 19/96] Examining the Effects with a KDE
[Figure: kernel density estimate for AI, "Fixed Effects from Cornwell and Rupert Wage Model"; Density against AI]
Mean = 4.819, standard deviation = 1.054.

[ 20/96] Histogram vs. KDE
[Figure: the histogram and the kernel density estimate for AI, side by side, both titled "Fixed Effects from Cornwell and Rupert Wage Model"]
CREATE ; ID=TRN(7,0) $
SETPANEL ; GROUP=ID $
REGRESS ; lhs=lwage ; rhs=occ,smsa,ms,exp ; panel ; fixed $ ?
Creates 595 by 1 matrix named ALPHAFE
HISTOGRAM ; rhs=alphafe ; title=Fixed Effects from Cornwell and Rupert Wage Model $
KERNEL ; rhs=alphafe ; title=Fixed Effects from Cornwell and Rupert Wage Model $

[ 26/96] A Caution About Stata and R²
\[ R^2 = 1 - \frac{\text{Residual Sum of Squares}}{\text{Total Sum of Squares}} \]
Or is it? What is the total sum of squares?
For the FE model above,
Conventional: Total Sum of Squares \( = \sum_{i=1}^N \sum_{t=1}^{T_i} (y_{it} - \bar y)^2 \), giving R² = 0.90542 (areg)
"Within Sum of Squares" \( = \sum_{i=1}^N \sum_{t=1}^{T_i} (y_{it} - \bar y_i)^2 \), giving R² = 0.65142 (xtreg, fe)
Which should appear in the denominator of R²?
The coefficient estimates and standard errors are the same. The calculation of the R² is different.
In the areg procedure, you are estimating coefficients for each of your covariates plus each dummy variable for your groups.
In the xtreg, fe procedure the R² reported is obtained by only fitting a mean deviated model where the effects of the groups (all of the dummy variables) are assumed to be fixed quantities. So, all of the effects for the groups are simply subtracted out of the model and no attempt is made to quantify their overall effect on the fit of the model.
Since the SSE is the same, the R² = 1 − SSE/SST is very different. The difference is real in that we are making different assumptions with the two approaches. In the xtreg, fe approach, the effects of the groups are fixed and unestimated quantities are subtracted out of the model before the fit is performed.
In the areg approach, the group effects are estimated and affect the total sum of squares of the model under consideration.

[ 27/96] Robustness of the LSDV Estimator
Under the full Gauss-Markov assumptions, b is unbiased and consistent (and even efficient).
If \( \text{Var}[\varepsilon_i] = \Omega_i \neq \sigma_\varepsilon^2 I_{T_i} \) then b is consistent but inefficient. (We'll return to robust estimation below.)
Under all assumptions, \( \text{Var}[a_i] \) is \( O(1/T_i) \). \(a_i\) is unbiased but inconsistent. It is inconsistent not because it estimates the wrong parameter, but because it converges to a random variable, not a constant: \(T_i\) is not increasing.

[ 28/96] Robust Counterpart to White Estimator?
Assumes \( \text{Var}[\varepsilon_i] = \Omega_i \neq \sigma^2 I_{T_i} \)
\( e_i = y_i - a_i i_{T_i} - X_ib = M_D^iy_i - M_D^iX_ib \) (\(T_i \times 1\) vector of group residuals)
\[ \text{Est.Asy.Var}[b] = \left[\sum_{i=1}^N X_i'M_D^iX_i\right]^{-1} \left[\sum_{i=1}^N (X_i'M_D^ie_i)(e_i'M_D^iX_i)\right] \left[\sum_{i=1}^N X_i'M_D^iX_i\right]^{-1} \]
\[ = H^{-1}\left[\sum_{i=1}^N \left(\sum_{t=1}^{T_i} (x_{it} - \bar x_i)e_{it}\right)\left(\sum_{t=1}^{T_i} (x_{it} - \bar x_i)e_{it}\right)'\right] H^{-1}, \quad H = \sum_{i=1}^N \sum_{t=1}^{T_i} (x_{it} - \bar x_i)(x_{it} - \bar x_i)' \]
Resembles (and is based on) White, but treats a full vector of disturbances at a time. Robust to heteroscedasticity and autocorrelation (within the groups).
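The robust covariance formula on slide 28 can be sketched as follows in Python with NumPy. This is an illustration under stated assumptions, not the implementation of any particular package: the function name `within_cluster_cov` and the simulated data are hypothetical, and no finite-sample degrees of freedom correction is applied, so the numbers will generally differ slightly from software that applies one.

```python
import numpy as np

def within_cluster_cov(y, X, groups):
    """Within (LSDV) slopes and the group-clustered sandwich H^-1 [sum s_i s_i'] H^-1."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    yd = np.empty_like(y)
    Xd = np.empty_like(X)
    for g in np.unique(groups):
        m = groups == g
        yd[m] = y[m] - y[m].mean()            # M_D^i y_i
        Xd[m] = X[m] - X[m].mean(axis=0)      # M_D^i X_i
    H = Xd.T @ Xd
    b = np.linalg.solve(H, Xd.T @ yd)
    e = yd - Xd @ b                           # e_it = y_it - a_i - x_it'b
    K = X.shape[1]
    meat = np.zeros((K, K))
    for g in np.unique(groups):
        m = groups == g
        s = Xd[m].T @ e[m]                    # sum_t (x_it - xbar_i) e_it
        meat += np.outer(s, s)
    H_inv = np.linalg.inv(H)
    return b, H_inv @ meat @ H_inv

# Simulated panel (made-up numbers): effects c_i are correlated with X
rng = np.random.default_rng(0)
N, T = 200, 5
groups = np.repeat(np.arange(N), T)
c = rng.normal(size=N)
X = rng.normal(size=(N * T, 2)) + 0.5 * c[groups][:, None]
beta = np.array([1.0, -0.5])
y = X @ beta + c[groups] + rng.normal(size=N * T)

b, V = within_cluster_cov(y, X, groups)
print(b, np.sqrt(np.diag(V)))
```

Each group contributes one outer product to the middle matrix, which is what makes the estimator robust to arbitrary correlation within a group.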
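[ 29/96] Robust Covariance Matrix for LSDV
Cluster Estimator for Within Estimator

[ 30/96] Asymptotics for \(a_i\)
\( a_i = (d_i'd_i)^{-1}d_i'(y_i - X_ib) \) from the LS normal equations
\( \phantom{a_i} = \bar y_i - \bar x_i'b = (\bar y_i - \bar x_i'\beta) - \bar x_i'(b - \beta) = \alpha_i + \bar\varepsilon_i - \bar x_i'(b - \beta) \)
\( E[a_i | X] = \alpha_i + 0 - 0 = \alpha_i \) (b is unbiased)
\( \text{Var}[a_i | X] = 0 + \sigma_\varepsilon^2/T_i + \bar x_i'\,\text{Var}[(b - \beta) | X]\,\bar x_i \)
\( \lim_{N\to\infty} E[a_i | X] = \alpha_i \)
\( \lim_{N\to\infty} \text{Var}[a_i | X] = \sigma_\varepsilon^2/T_i + 0 \) (b is consistent so \( \text{Var}[(b - \beta) | X] \to 0 \))
(See slide 13, \( \lim_{N\to\infty} \sigma_\varepsilon^2 / \sum_{i=1}^N T_i = 0 \))

[ 31/96] LSDV is an IV Estimator
\( y_{it} = x_{it}'\beta + c_i + \varepsilon_{it} = x_{it}'\beta + (c_i + \varepsilon_{it}) = x_{it}'\beta + w_{it} \)
\( \text{Cov}[x_{it}, w_{it}] = \text{Cov}[x_{it}, (c_i + \varepsilon_{it})] = g(x_{it}) \neq 0 \)
\(x_{it}\) is correlated with the FEs embedded in \(w_{it}\).

[ 32/96]
\( y_{it} = x_{it}'\beta + c_i + \varepsilon_{it} \) (1 observation)
\( y_i = X_i\beta + c_id_i + \varepsilon_i \) (\(T_i\) observations)
\( y_i = X_i\beta + w_i \)
\( y = X\beta + w \) (\( \sum_i T_i \) observations)
\[ \text{plim}(b) = \text{plim}\,(X'X)^{-1}X'y \]
\[ = \beta + \text{plim}\left[\frac{X'X}{\sum_{i=1}^N T_i}\right]^{-1} \text{plim}\,\frac{1}{\sum_{i=1}^N T_i}\sum_{i=1}^N \sum_{t=1}^{T_i} x_{it}c_i \]
\[ = \beta + \text{plim}\left[\frac{X'X}{\sum_{i=1}^N T_i}\right]^{-1} \text{plim}\,\sum_{i=1}^N \frac{T_i}{\sum_{i=1}^N T_i}\, c_i\, \frac{1}{T_i}\sum_{t=1}^{T_i} x_{it} \]
\[ = \beta + \text{plim}\left[\frac{X'X}{\sum_{i=1}^N T_i}\right]^{-1} \text{plim}\,\sum_{i=1}^N \frac{T_i}{\sum_{i=1}^N T_i}\, c_i \bar x_{i.} \]
\[ = \beta + \text{plim}\left[\frac{X'X}{\sum_{i=1}^N T_i}\right]^{-1} \text{plim}\,\sum_{i=1}^N f_i c_i \bar x_{i.}, \quad 0 < f_i < 1, \; \sum_{i=1}^N f_i = 1 \]
Note \( T_i / \sum_{i=1}^N T_i = 1/N \) if balanced panel.

[ 33/96]
\( M_D = \text{diag}(M_D^1, M_D^2, \ldots, M_D^N) \), block diagonal (The dummy variables are orthogonal)
\( M_D^i = I_{T_i} - d_i(d_i'd_i)^{-1}d_i' = I_{T_i} - (1/T_i)d_id_i' \)
\( X'M_DX = \sum_{i=1}^N X_i'M_D^iX_i \), \( X'M_Dy = \sum_{i=1}^N X_i'M_D^iy_i \)
\( [X_i'M_D^iX_i]_{k,l} = \sum_{t=1}^{T_i} (x_{it,k} - \bar x_{i.,k})(x_{it,l} - \bar x_{i.,l}) \)
\( [X_i'M_D^iy_i]_{k} = \sum_{t=1}^{T_i} (x_{it,k} - \bar x_{i.,k})(y_{it} - \bar y_{i.}) \)

[ 34/96]
\( b_{LSDV} = [X'M_DX]^{-1}X'M_Dy \)
Define \( Z = M_DX \). Then \( b_{LSDV} = [Z'X]^{-1}Z'y \). (Looks like an IV estimator.)
(1) \( \text{Plim}\, Z'X / \sum_i T_i \neq 0 \)?
\[ \left[\text{Plim}\,\frac{1}{\sum_i T_i}\sum_i X_i'M_D^iX_i\right]_{k,l} = \text{Plim}\,\frac{1}{\sum_i T_i}\sum_i \sum_{t=1}^{T_i} (x_{it,k} - \bar x_{i.,k})(x_{it,l} - \bar x_{i.,l}) \]
Nonsingular PD matrix if there is no multicollinearity and if every column of X has within group variation.
(2) \( \text{Plim}\, Z'w / \sum_i T_i = 0 \)?
\[ \text{Plim}\,\frac{Z'w}{\sum_i T_i} = \text{Plim}\,\frac{1}{\sum_i T_i}\sum_i X_i'M_D^i(c_i\mathbf{i} + \varepsilon_i) \]
\( c_iM_D^i\mathbf{i} = 0 \) because \(\mathbf{i}\) has no within group variation.
\( \text{Plim}\,\frac{1}{\sum_i T_i}\sum_i X_i'M_D^i\varepsilon_i = 0 \) by the assumption of the model.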
[ 35/96] LSDV is a Control Function Estimator
\( y_{it} = x_{it}'\beta + c_i + \varepsilon_{it} = x_{it}'\beta + (c_i + \varepsilon_{it}) = x_{it}'\beta + w_{it} \)
\( \text{Cov}[x_{it}, w_{it}] = \text{Cov}[x_{it}, (c_i + \varepsilon_{it})] = g(x_{it}) \neq 0 \)
\(x_{it}\) is correlated with the FEs embedded in \(w_{it}\).
LS regression of y on X is inconsistent because X is correlated with w. We seek a control function h(.) such that X|h(.) is uncorrelated with w. (In the presence of h(.), X is not correlated with w.)
Using the Frisch-Waugh theorem, \( b = [X'M_DX]^{-1}X'M_Dy \).
Consider regression of y on \( [X, \bar X] \). I.e., add group means to the regression.

[ 36/96] LSDV is a Control Function Estimator
Consider regression of y on \( [X, \bar X] \). I.e., add group means to the regression.
\[ [X, \bar X] = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1K} & \bar x_{11}i_1 & \bar x_{12}i_1 & \cdots & \bar x_{1K}i_1 \\ x_{21} & x_{22} & \cdots & x_{2K} & \bar x_{21}i_2 & \bar x_{22}i_2 & \cdots & \bar x_{2K}i_2 \\ \vdots & & & & & & & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{NK} & \bar x_{N1}i_N & \bar x_{N2}i_N & \cdots & \bar x_{NK}i_N \end{bmatrix} \]
\( = [X, (I - M_D)X] = [X, P_DX] = [X, F] \)

[ 37/96] LSDV is a Control Function Estimator
Using the Frisch-Waugh theorem, \( b_{ControlFunction} = [X'M_FX]^{-1}X'M_Fy \).
\( X'M_FX = X'[I - F(F'F)^{-1}F']X = X'[I - P_DX(X'P_D'P_DX)^{-1}X'P_D]X \)
\(P_D\) is symmetric and idempotent, and \( P_D = I - M_D \), so
\( X'M_FX = X'[I - (I - M_D)X(X'(I - M_D)X)^{-1}X'(I - M_D)]X \)
Multiply this out in full and collect some terms:
\( = X'IX - X'(I - M_D)X\,[X'(I - M_D)X]^{-1}\,X'(I - M_D)X \)
The two large matrices cancel. One more step:
\( = X'X - X'(I - M_D)X = X'X - X'X + X'M_DX = X'M_DX \).
Likewise, \( X'M_Fy = X'M_Dy \). Therefore, \( b_{ControlFunction} = b_{LSDV} \).

[ 38/96]
Note the usual problem with control function estimators. The standard errors need to be corrected.

[ 39/96]
The problem here is the estimator of the disturbance variance. The matrix is OK. Note, for example, .01374007/.01950085 (top panel) = .16510/.23432 (bottom panel).

[ 41/96] Generalized Least Squares?
If \( \text{Var}[\varepsilon_i] = \Omega_i \neq \sigma_\varepsilon^2 I_{T_i} \) then b is consistent but inefficient.
GLS: \( \hat\beta = [X'\Omega^{-1}X]^{-1}[X'\Omega^{-1}y] = [\sum_{i=1}^N X_i'\Omega_i^{-1}X_i]^{-1}[\sum_{i=1}^N X_i'\Omega_i^{-1}y_i] \)
Estimate Ω?
(1) Balanced panel case: \( \hat\Omega = (1/N)\sum_{i=1}^N e_ie_i' \) from fixed effects
(2) Unbalanced case? Put zeros in \(e_i\) in appropriate places? Elements of \(\hat\Omega\) are now based on different \(T_i\).
Note \(\hat\Omega\) is T×T with rank at most N. If T > N, \(\hat\Omega\) is singular and GLS cannot be computed. N will be >> T.

[ 42/96] Maximum Likelihood Estimation
With normally distributed disturbances, the FE model is the ordinary classical normal linear regression model. OLS is the maximum likelihood estimator of β. The maximum likelihood estimator of σ² is
\[ \hat\sigma^2 = \frac{\sum_{i=1}^N \sum_{t=1}^{T_i} e_{it}^2}{\sum_{i=1}^N T_i}, \]
the usual mean squared residual, with no correction for degrees of freedom. From standard results for the linear model (e.g., Greene, p. 51), the exact expectation is
\[ E[\hat\sigma^2] = \frac{(\sum_{i=1}^N T_i) - N - K}{\sum_{i=1}^N T_i}\,\sigma^2 = \left[1 - \frac{N}{\sum_{i=1}^N T_i} - \frac{K}{\sum_{i=1}^N T_i}\right]\sigma^2 = \left[1 - \frac{1}{\bar T} - \frac{K}{N\bar T}\right]\sigma^2, \quad \bar T = \frac{1}{N}\sum_{i=1}^N T_i \]

[ 43/96] ML Estimation (cont.)
\[ E[\hat\sigma^2] = \frac{(\sum_{i=1}^N T_i) - N - K}{\sum_{i=1}^N T_i}\,\sigma^2 = \left[1 - \frac{1}{\bar T} - \frac{K}{N\bar T}\right]\sigma^2 \]
This is a 'regular' problem, so \(\hat\sigma^2\) converges to a probability limit: it is consistent for something. Note, as N increases, \(\hat\sigma^2\) converges to \( \sigma^2[1 - 1/\bar T] \). T (or \(T_i\)) is fixed in this model. So, \(\hat\sigma^2\) is not a consistent estimator of σ² unless T increases. Suppose \(T_i = 2\). Then \( \text{plim}\,\hat\sigma^2 = \sigma^2/2 \). The inconsistency does not go away as N increases. This is THE example of the Incidental Parameters Problem (Neyman and Scott (1948)). It occurs because the number of parameters being estimated is growing as N grows.

[ 44/96] Between Groups Estimator
Inconsistency of the group means estimator
\( \bar y_i = \bar x_i'\beta + c_i + \bar\varepsilon_i = \bar x_i'\beta + w_i \)
\( \text{Cov}[w_i, \bar x_i] = \text{Cov}[c_i + \bar\varepsilon_i, \bar x_i] \neq 0 \)

[ 45/96] Time Invariant Regressors
A time invariant \(x_{it,k}\) is one that does not vary over t for any i. E.g., SEX dummy variable, ED (education in the Cornwell/Rupert data).
If \(x_{it,k}\) is time invariant for all i, then \( x_{it,k} = \sum_i h_id_i \) for the set of dummy variables and some set of \(h_i\)s.
If \(x_{it,k}\) is time invariant for all i, then the group mean deviations are all 0.
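The two statements on slide 45 can be illustrated with a small numerical sketch in Python with NumPy. The values in `h` and the variable names are hypothetical, chosen only to mimic a time invariant variable such as ED.

```python
import numpy as np

N, T = 4, 3
groups = np.repeat(np.arange(N), T)

# D: the NT x N matrix of group dummy variables
D = (groups[:, None] == np.arange(N)[None, :]).astype(float)

# A time invariant variable is a linear combination of the dummies: x_k = sum_i h_i d_i
h = np.array([12.0, 16.0, 9.0, 12.0])   # hypothetical per-person values
ed = D @ h

# M_D produces deviations from group means, which are all zero for ed
M_D = np.eye(N * T) - D @ np.linalg.inv(D.T @ D) @ D.T
ed_dev = M_D @ ed

# Adding ed to [x, D] therefore creates perfect collinearity
rng = np.random.default_rng(1)
x_varying = rng.normal(size=N * T)       # a regressor with within group variation
Z = np.column_stack([x_varying, ed, D])
rank = np.linalg.matrix_rank(Z)

print(np.allclose(ed_dev, 0.0), rank, Z.shape[1])
```

The rank of the stacked matrix is one less than its number of columns, which is why the software output on the following slides reports the coefficients on FEM and ED as fixed at zero.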
[ 46/96] FE With Time Invariant Variables
+----------------------------------------------------+
| There are 2 vars. with no within group variation.  |
| FEM      ED                                        |
+----------------------------------------------------+
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
 EXP     |    .09671227       .00119137     81.177   .0000  19.8537815
 WKS     |    .00118483       .00060357      1.963   .0496  46.8115246
 OCC     |   -.02145609       .01375327     -1.560   .1187   .51116447
 SMSA    |   -.04454343       .01946544     -2.288   .0221   .65378151
 FEM     |    .000000    ......(Fixed Parameter).......
 ED      |    .000000    ......(Fixed Parameter).......
+--------------------------------------------------------------------+
|             Test Statistics for the Classical Model                |
+--------------------------------------------------------------------+
|      Model               Log-Likelihood  Sum of Squares  R-squared |
|(1) Constant term only      -2688.80597       886.90494     .00000  |
|(2) Group effects only         27.58464       240.65119     .72866  |
|(3) X - variables only      -1688.12010       548.51596     .38154  |
|(4) X and group effects      2223.20087        83.85013     .90546  |
+--------------------------------------------------------------------+

[ 47/96] Drop The Time Invariant Variables
Same Results
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
 EXP     |    .09671227       .00119087     81.211   .0000  19.8537815
 WKS     |    .00118483       .00060332      1.964   .0495  46.8115246
 OCC     |   -.02145609       .01374749     -1.561   .1186   .51116447
 SMSA    |   -.04454343       .01945725     -2.289   .0221   .65378151
+--------------------------------------------------------------------+
|             Test Statistics for the Classical Model                |
+--------------------------------------------------------------------+
|      Model               Log-Likelihood  Sum of Squares  R-squared |
|(1)
Constant term only           -2688.80597       886.90494     .00000  |
|(2) Group effects only         27.58464       240.65119     .72866  |
|(3) X - variables only      -1688.12010       548.51596     .38154  |
|(4) X and group effects      2223.20087        83.85013     .90546  |
+--------------------------------------------------------------------+
No change in the sum of squared residuals

Efficient Estimation of Time Invariant and Rarely Changing Variables in Finite Sample Panel Analyses with Unit Fixed Effects
Thomas Plümper and Vera Troeger
Political Analysis, 2007

[ 50/96] Introduction: The Pledge
[T]he FE model … does not allow the estimation of time invariant variables. A second drawback of the FE model … results from its inefficiency in estimating the effect of variables that have very little within variance.
This article discusses a remedy to the related problems of estimating time invariant and rarely changing variables in FE models with unit effects

[ 51/96] The Model
\[ y_{it} = \alpha_i + \sum_{k=1}^K \beta_kx_{kit} + \sum_{m=1}^M \gamma_mz_{mi} + \varepsilon_{it} \]
where \(\alpha_i\) denote the N unit effects.

[ 52/96] Fixed Effects Vector Decomposition
Step 1: Compute the fixed effects regression to get the "estimated unit effects." "We run this FE model with the sole intention to obtain estimates of the unit effects, \(\alpha_i\)."
\[ \hat\alpha_i = \bar y_i - \sum_{k=1}^K b_k^{FE}\bar x_{ki} \]

[ 53/96] Step 2
Regress \(a_i\) on \(z_i\) and compute residuals
\[ a_i = \sum_{m=1}^M \gamma_mz_{im} + h_i \]
\(h_i\) is orthogonal to \(z_i\) (since it is a residual).
Vector h is expanded so each element \(h_i\) is replicated \(T_i\) times: h is the length of the full sample.

[ 54/96] Step 3
Regress \(y_{it}\) on a constant, X, Z and h using ordinary least squares to estimate α, β, γ, δ.
\[ y_{it} = \alpha + \sum_{k=1}^K \beta_kx_{kit} + \sum_{m=1}^M \gamma_mz_{mi} + \delta h_i + \varepsilon_{it} \]
Notice that \(\alpha_i\) in the original model has become \(\alpha + h_i\) in the revised model.

[ 55/96] The Turn: Based on Cornwell and Rupert
namelist ; x = exp,wks,occ,ind,south,smsa,union ; z = fem,ed $
(1) Step 1.
regress ; lhs=lwage ; rhs=x,z ; panel ; fixed ; pds=7 $
create ; uhi = alphafe(_stratum) $
(2) Step 2
regress ; lhs = uhi ; rhs = one,z ; res = hi $
(3) Step 3.
regress ; lhs = lwage ; rhs = one,x,z,hi $

[ 56/96] Step 1 (Based on full sample)
These 2 variables have no within group variation.
FEM ED
F.E. estimates are based on a generalized inverse.
--------+---------------------------------------------------------
        |               Standard            Prob.       Mean
   LWAGE| Coefficient     Error       z    z>|Z|        of X
--------+---------------------------------------------------------
     EXP|   .09663***    .00119    81.13   .0000     19.8538
     WKS|   .00114*      .00060     1.88   .0600     46.8115
     OCC|  -.02496*      .01390    -1.80   .0724      .51116
     IND|   .02042       .01558     1.31   .1899      .39544
   SOUTH|  -.00091       .03457     -.03   .9791      .29028
    SMSA|  -.04581**     .01955    -2.34   .0191      .65378
   UNION|   .03411**     .01505     2.27   .0234      .36399
     FEM|   .000      .....(Fixed Parameter).....     .11261
      ED|   .000      .....(Fixed Parameter).....    12.8454
--------+---------------------------------------------------------

[ 57/96] Step 2 (Based on 595 observations)
--------+---------------------------------------------------------
        |               Standard            Prob.       Mean
     UHI| Coefficient     Error       z    z>|Z|        of X
--------+---------------------------------------------------------
Constant|  2.88090***    .07172    40.17   .0000
     FEM|  -.09963**     .04842    -2.06   .0396      .11261
      ED|   .14616***    .00541    27.02   .0000     12.8454
--------+---------------------------------------------------------

[ 58/96] Step 3!
--------+---------------------------------------------------------
        |               Standard            Prob.
Mean
   LWAGE| Coefficient     Error       z    z>|Z|        of X
--------+---------------------------------------------------------
Constant|  2.88090***    .03282    87.78   .0000
     EXP|   .09663***    .00061   157.53   .0000     19.8538
     WKS|   .00114***    .00044     2.58   .0098     46.8115
     OCC|  -.02496***    .00601    -4.16   .0000      .51116
     IND|   .02042***    .00479     4.26   .0000      .39544
   SOUTH|  -.00091       .00510     -.18   .8590      .29028
    SMSA|  -.04581***    .00506    -9.06   .0000      .65378
   UNION|   .03411***    .00521     6.55   .0000      .36399
     FEM|  -.09963***    .00767   -13.00   .0000      .11261
      ED|   .14616***    .00122   120.19   .0000     12.8454
      HI|  1.00000***    .00670   149.26   .0000   -.103D-13
--------+---------------------------------------------------------

[ 61/96]
http://davegiles.blogspot.com/2012/06/fixed-effects-vector-decomposition.html

[ 62/96]
Paul Allison, 2005

[ 63/96]
http://people.stern.nyu.edu/wgreene/Econometrics/Bell-Jones-Fixed-vs-Random-Sept-2013.pdf

[ 64/96] What happened here?
\[ y_{it} = \alpha_i + \sum_{k=1}^K \beta_kx_{kit} + \sum_{m=1}^M \gamma_mz_{mi} + \varepsilon_{it} \]
where \(\alpha_i\) denote the N unit effects.
An assumption is added along the way: \( \text{Cov}(\alpha_i, Z_i) = 0 \). This is exactly the number of orthogonality assumptions needed to identify γ. It is not part of the original model.

[ 65/96] Two Way Fixed Effects
A two way FE model. Individual dummy variables and time dummy variables.
\( y_{it} = \alpha_i + \delta_t + x_{it}'\beta + \varepsilon_{it} \)
Normalization needed as the individual and time dummies both sum to one.
Reformulate model: \( y_{it} = \mu + \alpha_i^* + \delta_t^* + x_{it}'\beta + \varepsilon_{it} \) with \( \sum_i \alpha_i^* = 0 \), \( \sum_t \delta_t^* = 0 \)
Full estimation: \( \tilde y_{it} = y_{it} - \bar y_{i.} - \bar y_{.t} + \bar y \)
Practical estimation: Add T-1 dummies.
Complication: Unbalanced panels are complicated.
Complication in recent applications: Very large N and very large T.

[ 66/96] Fixed Effects Estimators
Slope estimators, as usual with transformed data
\( \hat\mu = \bar y - \bar x'b \)
\( \hat\alpha_i^* = (\bar y_{i.} - \bar y) - (\bar x_{i.} - \)
\( \bar x)'b \)
\( \hat\delta_t^* = (\bar y_{.t} - \bar y) - (\bar x_{.t} - \bar x)'b \)

[ 67/96] Two Way Fixed Effects Application
Spanish Dairy Farms; N=247, T=6
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
 No Effects
 Constant|  11.5774868        .00364586   3175.515   .0000
 X1      |    .59517558       .01958331     30.392   .0000      0
 X2      |    .02305014       .01122274      2.054   .0400      0
 X3      |    .02319244       .01303099      1.780   .0751      0
 X4      |    .45175783       .01078465     41.889   .0000      0
 Firm Dummies
 X1      |    .66200103       .02467845     26.825   .0000      0
 X2      |    .03735244       .01613309      2.315   .0206      0
 X3      |    .03039947       .02320776      1.310   .1902      0
 X4      |    .38251038       .01201690     31.831   .0000      0
 Firm and Time Dummies
 X1      |    .63796531       .02379854     26.807   .0000      0
 X2      |    .04127557       .01544463      2.672   .0075      0
 X3      |    .02819226       .02217322      1.271   .2036      0
 X4      |    .30816028       .01322571     23.300   .0000      0
REGRESS ; Lhs = yit ; Rhs = one,x1,x2,x3,x4 ; pds=6 ; period=t $
Marginal changes in the estimates. Why?

[ 68/96] Analysis of Variance (FIT)
+--------------------------------------------------------------------+
|             Test Statistics for the Classical Model                |
+--------------------------------------------------------------------+
|      Model               Log-Likelihood   Sum of Squares  R-squared |
|(1) Constant term only      -1448.90832   .6131518321D+03  .0000000 |
|(2) Group effects only        412.25944   .4974526192D+02  .9188696 |
|(3) X - variables only        809.67611   .2909570093D+02  .9525473 |
|(4) X and group effects      1751.64437   .8161093811D+01  .9866899 |
|(5) X ind.&time effects      1826.23878   .7379537558D+01  .9879646 |
+--------------------------------------------------------------------+
|                        Hypothesis Tests                            |
|          Likelihood Ratio Test              F Tests                |
|          Chi-squared   d.f.   Prob.         F     num.  denom.
P value |
|(2) vs (1)   3722.336    246   .00000      56.859   246   1235   .00000 |
|(3) vs (1)   4517.169      4   .00000    7412.185     4   1477   .00000 |
|(4) vs (1)   6401.105    250   .00000     365.021   250   1231   .00000 |
|(4) vs (2)   2678.770      4   .00000    1568.114     4   1231   .00000 |
|(4) vs (3)   1883.937    246   .00000      12.836   246   1231   .00000 |
|(5) vs (4)    149.189      5   .00000      25.969     5   1226   .00000 |
|(5) vs (3)   2033.125    252   .00000      14.317   252   1226   .00000 |
+--------------------------------------------------------------------+

[ 69/96] Unbalanced Panel Data
(First 10 households in healthcare data)
\( \bar z_{i.} = \frac{1}{T_i}\sum_{t=1}^{T_i} z_{it} \), \( \bar z_{.t} = \frac{1}{N_t}\sum_{i=1}^{N_t} z_{it} \)

[ 70/96] Two Way FE with Unbalanced Data
This computation is not appropriate in two way FE models with unbalanced panels:
\[ b = \left[\sum_{i=1}^N \sum_{t=1}^{T_i} (x_{it} - \bar x_{i.} - \bar x_{.t} + \bar x)(x_{it} - \bar x_{i.} - \bar x_{.t} + \bar x)'\right]^{-1} \left[\sum_{i=1}^N \sum_{t=1}^{T_i} (x_{it} - \bar x_{i.} - \bar x_{.t} + \bar x)(y_{it} - \bar y_{i.} - \bar y_{.t} + \bar y)\right] \]
\( \hat\mu = \bar y - \bar x'b \)
\( \hat\alpha_i^* = (\bar y_{i.} - \bar y) - (\bar x_{i.} - \bar x)'b \)
\( \hat\delta_t^* = (\bar y_{.t} - \bar y) - (\bar x_{.t} - \bar x)'b \)
The model must be fit as a one way FEM with time dummy variables.

[ 71/96]
\( \tilde y_{it} = y_{it} - \bar y_{i.} - \bar y_{.t} + \bar y \) and likewise for \( \tilde x_{it} \).
Does not work correctly for unbalanced panels.
Fit two way models as one way with time dummies.

[ 72/96]
Textbook formula application. This is incorrect.
Two way fixed effects as one way with time dummies

[ 73/96] Different Normalizations
Separate constants: using D
Overall constant and N-1 contrasts
Overall constant, N constants, \( \sum_i \alpha_i = 0 \)
\( y = X\beta + D\alpha + \varepsilon = X\beta + C\alpha^* + \varepsilon \)
With \( C = DP \) and \( \alpha^* = P^{-1}\alpha \), \( C\alpha^* = D\alpha = (DP)(P^{-1}\alpha) \)

[ 74/96] Renormalizing Fixed Effects
N Dummy Variables vs.
a Constant and N-1 Dummy Variables
Use 4 groups for example
\[ D = \begin{bmatrix} i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & i & 0 \\ 0 & 0 & 0 & i \end{bmatrix}, \quad C = \begin{bmatrix} i & 0 & 0 & 0 \\ i & i & 0 & 0 \\ i & 0 & i & 0 \\ i & 0 & 0 & i \end{bmatrix} = DP, \quad P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}, \quad P^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \end{bmatrix} \]
\( P^{-1}\alpha = (\alpha_1, \alpha_2 - \alpha_1, \alpha_3 - \alpha_1, \ldots, \alpha_N - \alpha_1)' \)
Implication: No change in other coefficients, no change in sum of squares or R²

[ 75/96] A "Hierarchical" Model
Lower level structural model: \( y_{it} = x_{it}'\beta + c_i + \varepsilon_{it} \)
Upper level model for effects: \( c_i = z_i'\delta + w_i \)
How does this affect the fixed effects model? \( y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it} \)
No change in the model, but it invites a second step.

[ 76/96] Estimating a Hierarchical Model
Classical assumptions at both levels
\( y_{it} = x_{it}'\beta + c_i + \varepsilon_{it} \), \( E[\varepsilon_{it} | X_i, c_i] = 0 \), \( \text{Var}[\varepsilon_{it} | X_i, c_i] = \sigma_\varepsilon^2 \), etc.
\( c_i = z_i'\delta + w_i \), \( E[w_i | z_i] = 0 \), \( \text{Var}[w_i | z_i] = \sigma_w^2 \)
Two step estimation
Fixed effects, dummy variables at top level
Regress \(a_i\) on \(z_i\) to estimate δ at the 2nd level.
The regression is heteroscedastic. Use OLS/White or Weighted LS with
\( a_i = c_i + (\hat c_i - c_i) = c_i + v_i = z_i'\delta + (w_i + v_i) \)
\( \text{Asy.Var}[a_i | X_i, z_i] = [\sigma_\varepsilon^2 / T_i] + \bar x_i'\,\text{Asy.Var}[b]\,\bar x_i \), where \( \bar x_i \) is the vector of group means, a row of \( (I - M_D^i)X_i \).

[ 77/96] A Two Step Regression
Sample ; all $
Create ; person=trn(7,0) ; year=trn(-7,0) $
Namelist ; varyingX=occ,smsa,ms,exp $
Namelist ; fixedX=one,fem,ed $
? FE regression to compute dummy variable coefficients
Regress ; lhs=lwage ; rhs=varyingX ; panel ; fixed ; pds=7 $
Create ; ai=alphafe(person) $
Create ; occb=GroupMean(occ,pds=7) $
Create ; msb=GroupMean(ms,pds=7) $
Create ; smsab=GroupMean(smsa,pds=7) $
Create ; expb=GroupMean(exp,pds=7) $
? Standard errors for dummy variable coefficient estimates
Namelist ; means=occb,smsab,msb,expb $
Create ; varai=ssqrd/_Groupti + qfr(means,varb) ; wt=1/varai $
? Weighted least squares regression of dummy variable coefficients
? on time invariant variables.
Regress ; if[year = 7] ; lhs=ai ; rhs=FixedX ; wts=wt $
Regress ; if[year = 7] ; lhs=ai ; rhs=FixedX ; Het $

[ 78/96] First Stage Fixed Effects Model

[ 79/96] Second Stage Regressions
Weighted Least Squares
OLS with White Estimator

[ 80/96] Hierarchical Linear Model as REM
+--------------------------------------------------+
| Random Effects Model: v(i,t) = e(i,t) + u(i)     |
| Estimates: Var[e]              = .235368D-01     |
|            Var[u]              = .110254D+00     |
|            Corr[v(i,t),v(i,s)] = .824078         |
|            Sigma(u)            = 0.3303          |
+--------------------------------------------------+
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
 OCC     |   -.03908144       .01298962     -3.009   .0026   .51116447
 SMSA    |   -.03881553       .01645862     -2.358   .0184   .65378151
 MS      |   -.06557030       .01815465     -3.612   .0003   .81440576
 EXP     |    .05737298       .00088467     64.852   .0000  19.8537815
 FEM     |   -.34715010       .04681514     -7.415   .0000   .11260504
 ED      |    .11120152       .00525209     21.173   .0000  12.8453782
 Constant|   4.24669585       .07763394     54.702   .0000

[ 81/96] Hierarchical Linear Model
\( y_{it} = x_{it}'\beta + c_i + \varepsilon_{it} \), (x does not contain a constant)
\( E[\varepsilon_{it} | X_i, c_i] = 0 \), \( \text{Var}[\varepsilon_{it} | X_i, c_i] = \sigma_\varepsilon^2 \)
\( c_i = \alpha + z_i'\delta + w_i \), \( E[w_i | z_i] = 0 \), \( \text{Var}[w_i | z_i] = \sigma_w^2 \)
\( y_{it} = x_{it}'\beta + [\alpha + z_i'\delta + w_i] + \varepsilon_{it} \)

[ 82/96] HLM (Simulation Estimator) vs. REM
---------+ Nonrandom parameters
 OCC     |   -.02461285       .00566374     -4.346   .0000   .51116447
 SMSA    |   -.06076787       .00490494    -12.389   .0000   .65378151
 MS      |   -.04446541       .00850068     -5.231   .0000   .81440576
 EXP     |    .08508257       .00046901    181.409   .0000  19.8537815
---------+ Means for random parameters
 Constant|   2.89358963       .02426391    119.255   .0000
---------+ Scale parameters for dists.
of random parameters
 Constant|    .86092728       .00448368    192.014   .0000
---------+ Heterogeneity in the means of random parameters
 cONE_FEM|   -.54972521       .01030773    -53.331   .0000
 cONE_ED |    .16915125       .00122320    138.286   .0000
========================================================================
---------+ Variance parameter given is sigma
 Std.Dev.|    .15681703       .00074231    211.256   .0000

(REM Estimated by two step FGLS) Sigma(u) = 0.3303
 OCC     |   -.03908144       .01298962     -3.009   .0026   .51116447
 SMSA    |   -.03881553       .01645862     -2.358   .0184   .65378151
 MS      |   -.06557030       .01815465     -3.612   .0003   .81440576
 EXP     |    .05737298       .00088467     64.852   .0000  19.8537815
 FEM     |   -.34715010       .04681514     -7.415   .0000   .11260504
 ED      |    .11120152       .00525209     21.173   .0000  12.8453782
 Constant|   4.24669585       .07763394     54.702   .0000

[ 83/96] Mundlak's Approach
\( y_{it} = x_{it}'\beta + z_i'\delta + c_i + \varepsilon_{it} \), (x does not contain a constant)
\( E[\varepsilon_{it} | X_i, c_i] = 0 \), \( \text{Var}[\varepsilon_{it} | X_i, c_i] = \sigma_\varepsilon^2 \)
\( c_i = \alpha + \bar x_i'\theta + w_i \), \( E[w_i | X_i, z_i] = 0 \), \( \text{Var}[w_i | X_i, z_i] = \sigma_w^2 \)
\( y_{it} = x_{it}'\beta + z_i'\delta + \alpha + \bar x_i'\theta + w_i + \varepsilon_{it} \)
= random effects model including group means.
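The role of the group means can be seen in a small simulation. The sketch below, in Python with NumPy, uses pooled OLS on \([1, X, \bar X]\) rather than the random effects FGLS estimator named on the slide, which is a simplifying assumption; it reproduces the algebraic result from slide 37 that adding group means yields the within (LSDV) slopes. All data and names (`group_means`, `b_within`, `b_with_means`) are hypothetical.

```python
import numpy as np

def group_means(A, groups):
    """Replace each row of A with the mean of its group (A may be 1-D or 2-D)."""
    A = np.asarray(A, dtype=float)
    out = np.empty_like(A)
    for g in np.unique(groups):
        m = groups == g
        out[m] = A[m].mean(axis=0)
    return out

# Simulated balanced panel (made-up numbers); c_i is correlated with x_it
rng = np.random.default_rng(2)
N, T, K = 50, 4, 2
groups = np.repeat(np.arange(N), T)
c = rng.normal(size=N)
X = rng.normal(size=(N * T, K)) + c[groups][:, None]
beta = np.array([0.5, -1.0])
y = X @ beta + c[groups] + rng.normal(scale=0.3, size=N * T)

Xbar = group_means(X, groups)
ybar = group_means(y, groups)

# Within estimator: least squares on deviations from group means
b_within = np.linalg.lstsq(X - Xbar, y - ybar, rcond=None)[0]

# Pooled OLS of y on a constant, X, and the group means of X
W = np.column_stack([np.ones(N * T), X, Xbar])
coef = np.linalg.lstsq(W, y, rcond=None)[0]
b_with_means = coef[1:1 + K]

print(b_within, b_with_means)  # the two slope vectors should coincide
```

The coefficients on the group means (the last K elements of `coef`) play the role of θ; only the slopes on X are compared here.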
[ 84/96] Mundlak Form of FE Model
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
 x(i,t)
 OCC     |   -.02021384       .01375165     -1.470   .1416   .51116447
 SMSA    |   -.04250645       .01951727     -2.178   .0294   .65378151
 MS      |   -.02946444       .01915264     -1.538   .1240   .81440576
 EXP     |    .09665711       .00119262     81.046   .0000  19.8537815
 z(i)
 FEM     |   -.34322129       .05725632     -5.994   .0000   .11260504
 ED      |    .05099781       .00575551      8.861   .0000  12.8453782
 Means of x(i,t) and constant
 Constant|   5.72655261       .10300460     55.595   .0000
 OCCB    |   -.10850252       .03635921     -2.984   .0028   .51116447
 SMSAB   |    .22934020       .03282197      6.987   .0000   .65378151
 MSB     |    .20453332       .05329948      3.837   .0001   .81440576
 EXPB    |   -.08988632       .00165025    -54.468   .0000  19.8537815
Estimates: Var[e] = .0235632
           Var[u] = .0773825

[ 85/96] Application
Passmore, W. et al., "The Effect of Housing Government Sponsored Enterprises on Mortgage Rates," Federal Reserve Board, Division of Research & Statistics and Monetary Affairs, 2004, rev. 1/2005

[ 86/96] First Stage – Rate Difference
MortgageRate(i,t) = \(\beta_0\)
  + "loan to value ratio terms"
  + "new home" (dummy variable)
  + "small loan" (dummy variable)
  + "up front fees paid" (dummy variable)
  + "mortgage bank" vs. depository inst. (dummy variable)
  + \(\alpha_{1,i}\) JumboLoan(i,t) (dummy variable for loan > $317,000)
  + \(\varepsilon_{i,t}\)
"i" = state,year grouping
"t" = individual loan in specified state,year
Nearly all "conforming" loans (under $317,000) are held by Fannie Mae. Expect \(\alpha_{1,i}\) to be > 0 as Fannie Mae is able to finance at lower cost than other institutions, and Fannie Mae does not finance Jumbo loans. Interest is in "pass through" of the cost advantage.

[ 87/96] An Algebraic Aspect
\(J_i\) is not quite a group dummy variable. For the group, \(J_i\) is one for some members of the group – those with a "jumbo" mortgage.
\( M_J^i = I - J_i(J_i'J_i)^{-1}J_i' \)
\( M_J^iy_i = y_i - J_i \times (\text{mean of those with jumbo loans}) \)
\( y_{it} \to y_{it} - \bar y_{i,jumbo} \) if jumbo loan
\( y_{it} \to y_{it} \) if not a jumbo loan
Otherwise, this could be treated like a fixed effects model.

[ 88/96] Second Stage – Pass Through
\( a_{1,i} = \gamma_0 + \gamma_1 \) "Estimated Capital Cost Advantage" + "market characteristics" + "state" and "quarter" dummy variables + \(w_i\)
Primary interest is in \(\gamma_1\), which is the amount of the capital cost advantage that is passed through to mortgagees.
Result: Less than half of cost advantage was passed through to borrowers.

[ 89/96] Time Varying Fixed Effects
911 Rescue

[ 90/96] Need for Clarification

[ 91/96] Time Varying Fixed Effects

[ 92/96] Munnell State Production Model

[ 93/96] No Effects

[ 94/96] Quadratic Fixed Effects
Correct DF: 816-6-3(48)=666
Multiply standard errors by sqr(810/666) = 1.103

[ 95/96] Appendix II. Fixed Effects Algebra

[ 96/96] Panel Data Algebra
\( M_D = \text{diag}(M_D^1, M_D^2, \ldots, M_D^N) \), block diagonal (The dummy variables are orthogonal)
\( M_D^i = I_{T_i} - d_i(d_i'd_i)^{-1}d_i' = I_{T_i} - (1/T_i)d_id_i' \)
\( X'M_DX = \sum_{i=1}^N X_i'M_D^iX_i \), \( X'M_Dy = \sum_{i=1}^N X_i'M_D^iy_i \)
\( b_{LSDV} = [X'M_DX]^{-1}X'M_Dy \)
\( [X_i'M_D^iX_i]_{k,l} = \sum_{t=1}^{T_i} (x_{it,k} - \bar x_{i.,k})(x_{it,l} - \bar x_{i.,l}) \)
\( [X_i'M_D^iy_i]_{k} = \sum_{t=1}^{T_i} (x_{it,k} - \bar x_{i.,k})(y_{it} - \bar y_{i.}) \)

[ 97/96] Balanced Panel Data Algebra
\[ M_D = \begin{bmatrix} M_{D,T} & 0 & \cdots & 0 \\ 0 & M_{D,T} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & M_{D,T} \end{bmatrix} \quad \text{(Each matrix is } T \times T) \]
\( M_{D,T} = I_T - d(d'd)^{-1}d' = I_T - (1/T)dd' \)
\( M_D = I_N \otimes M_{D,T} \)
Note: \(dd'\) = a matrix of ones is Baltagi's \(J_T\); \((1/T)dd'\) = a matrix of 1/T is his \(\bar J_T\).

[ 98/96] Balanced Panel
\( M_{D,T} = I_T - d(d'd)^{-1}d' = I_T - (1/T)dd' = I_T - \bar J_T \)
\[ P = \begin{bmatrix} \bar J_T & 0 & \cdots & 0 \\ 0 & \bar J_T & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \bar J_T \end{bmatrix} = I_N \otimes \bar J_T; \quad PX \text{ creates group means} \]
\[ PX = \begin{bmatrix} \bar J_TX_1 \\ \bar J_TX_2 \\ \vdots \\ \bar J_TX_N \end{bmatrix} = \begin{bmatrix} \bar X_1 \\ \bar X_2 \\ \vdots \\ \bar X_N \end{bmatrix} \]
Each block \(\bar X_i\) is T rows, each with the means repeated.
[ 99/96] Balanced Panel
\( I_{NT} = I_N \otimes I_T \)
\( Q = I_{NT} - P = I_N \otimes I_T - I_N \otimes \bar J_T = I_N \otimes (I_T - \bar J_T) \)
\[ Q = \begin{bmatrix} I_T - \bar J_T & 0 & \cdots & 0 \\ 0 & I_T - \bar J_T & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & I_T - \bar J_T \end{bmatrix} \]

[ 100/96] Balanced Panel
\[ PX = (I_N \otimes \bar J_T)X = \begin{bmatrix} \bar X_1 \\ \bar X_2 \\ \vdots \\ \bar X_N \end{bmatrix} \quad \text{(T rows each with means repeated)} \]
\[ QX = X - (I_N \otimes \bar J_T)X = \begin{bmatrix} X_1 - \bar X_1 \\ X_2 - \bar X_2 \\ \vdots \\ X_N - \bar X_N \end{bmatrix} \quad \text{(T rows each with mean deviations)} \]

[ 101/96] Balanced Panel
\( QX = X - (I_N \otimes \bar J_T)X = I_{NT}X - (I_N \otimes \bar J_T)X = (I_N \otimes I_T)X - (I_N \otimes \bar J_T)X \)
The blocks of QX are \( X_i - \bar X_i \): T rows each with mean deviations.
\( b_{LSDV} = [X'QX]^{-1}X'Qy \)

[ 102/96] Balanced Panel
\( P = \text{diag}(\bar J_T, \bar J_T, \ldots, \bar J_T) \); PX creates group means.
\( Q = \text{diag}(I - \bar J_T, \ldots, I - \bar J_T) = I_N \otimes [I - \tfrac{1}{T}dd'] \)
Homework:
(1) Verify that both P and Q are idempotent
(2) Show that PQ = 0
(3) What is the trace of Q?
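The balanced panel algebra above can be explored numerically. The following sketch in Python with NumPy builds P and Q with Kronecker products for a small hypothetical panel (N = 3, T = 4, simulated data) and checks that PX and QX are the group means and the deviations from group means, and that \( [X'QX]^{-1}X'Qy \) matches the slopes from a regression on X and the full set of dummy variables. It is an illustration only and does not replace the algebraic arguments asked for in the homework.

```python
import numpy as np

N, T, K = 3, 4, 2

# Baltagi's J-bar_T: a T x T matrix with every element equal to 1/T
J_bar = np.full((T, T), 1.0 / T)
P = np.kron(np.eye(N), J_bar)            # P = I_N (x) J-bar_T
Q = np.eye(N * T) - P                    # Q = I_N (x) (I_T - J-bar_T)

# Hypothetical data, stacked by group (all T rows of group 1 first, and so on)
rng = np.random.default_rng(3)
X = rng.normal(size=(N * T, K))
y = rng.normal(size=N * T)

# Group means and deviations computed directly, for comparison
X_blocks = X.reshape(N, T, K)
means = X_blocks.mean(axis=1, keepdims=True)
PX_manual = np.broadcast_to(means, (N, T, K)).reshape(N * T, K)
QX_manual = (X_blocks - means).reshape(N * T, K)

# LSDV slopes from the Q form and from the explicit dummy variable regression
b_lsdv = np.linalg.solve(X.T @ Q @ X, X.T @ Q @ y)
D = np.kron(np.eye(N), np.ones((T, 1)))  # NT x N dummy variable matrix
coef = np.linalg.lstsq(np.column_stack([X, D]), y, rcond=None)[0]

print(np.allclose(P @ X, PX_manual), np.allclose(Q @ X, QX_manual))
print(b_lsdv, coef[:K])
```

The Kronecker form is convenient for checking identities on small examples; for large N the NT x NT matrices should not be formed explicitly, and group demeaning does the same job.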