First Difference and Fixed Effects
Moulton-UNC Chapel Hill

This log file provides an example of Fixed Effects and First Differences. We are investigating the determinants of the probability that someone will leave a bequest. Please email me (moulton@email.unc.edu) if you see any errors or have questions.

First, increase the maximum matrix size. This is important since the Fixed Effects model that is estimated by including all the individual fixed effects will not run if the matrix size is not large enough to accommodate all the fixed effects.

. set matsize 11000

First open the Health and Retirement Study data, but load only the variables that are necessary.

. use hhidpn r*cenreg r*beqany h*atota ragender raedyrs r*shlt h*hhres r*sayret h*hhres using rndhrs_p.dta, clear

I convert all the extended missing value flags (.w, .u, etc.) to instead just be “.”. I only do this for waves 3 to 8 (using a loop) since I will drop all other waves later.

. replace ragender = . if ragender > 2
(7 real changes made, 7 to missing)

. replace raedyrs = . if raedyrs > 17
(131 real changes made, 131 to missing)

. forvalues w = 3/8 {
      * make retirement 1 if fully retired and 0 otherwise
      replace r`w'sayret = (r`w'sayret == 1) if r`w'sayret <= 3
      replace r`w'sayret = . if r`w'sayret > 1
      * remove any oddly coded missing values
      replace r`w'beqany = . if r`w'beqany > 100
      replace h`w'atota = log(h`w'atota)
      su h`w'hhres
      replace h`w'hhres = . if h`w'hhres > r(max)
      replace r`w'shlt = . if r`w'shlt > 5
      replace r`w'cenreg = . if r`w'cenreg > 5
  }

Note that the output from this loop has been omitted from the log file.

Next use a loop to create a balanced panel, where individuals with non-missing values for all relevant variables in waves 3 to 8 are retained.

. forvalues w = 3/8 {
      foreach x in r`w'beqany h`w'hhres h`w'atota r`w'sayret r`w'shlt r`w'cenreg {
          keep if `x' != .
      }
  }
(25,349 observations deleted)
(0 observations deleted)
(818 observations deleted)
(2,651 observations deleted)
(2 observations deleted)
(0 observations deleted)
(3,466 observations deleted)
(0 observations deleted)
(170 observations deleted)
(5 observations deleted)
(2 observations deleted)
(0 observations deleted)
(1,619 observations deleted)
(0 observations deleted)
(61 observations deleted)
(8 observations deleted)
(0 observations deleted)
(1 observation deleted)
(935 observations deleted)
(0 observations deleted)
(45 observations deleted)
(19 observations deleted)
(0 observations deleted)
(1 observation deleted)
(614 observations deleted)
(0 observations deleted)
(21 observations deleted)
(1 observation deleted)
(1 observation deleted)
(0 observations deleted)
(331 observations deleted)
(0 observations deleted)
(12 observations deleted)
(3 observations deleted)
(1 observation deleted)
(0 observations deleted)

Then save the data that we used in class.

. save FE-FD-updated.dta, replace
file FE-FD-updated.dta saved

Then use that data.

. use "/Users/jgmoulton/Downloads/FE-FD-updated.dta", clear

We will estimate the First Difference model by hand by differencing several of the variables between waves 3 and 4.

. gen beqdiff = r4beqany - r3beqany
. gen hhresdiff = h4hhres - h3hhres
. gen wealthdiff = h4atota - h3atota
. gen retireddiff = r4sayret - r3sayret

Then reshape the wide data to be long.

. reshape long h@hhres r@shlt r@cenreg r@beqany h@atota r@sayret, i(hhidpn) j(wave)
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12)
(note: r1beqany not found)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1359   ->   16308
Number of variables                  78   ->      14
j variable (12 values)                    ->   wave
xij variables:
             h1hhres h2hhres ... h12hhres   ->   hhhres
                r1shlt r2shlt ... r12shlt   ->   rshlt
          r1cenreg r2cenreg ... r12cenreg   ->   rcenreg
          r1beqany r2beqany ... r12beqany   ->   rbeqany
             h1atota h2atota ... h12atota   ->   hatota
          r1sayret r2sayret ... r12sayret   ->   rsayret
-----------------------------------------------------------------------------

Xtset the data so that we can use the xtreg command to estimate a fixed effects model later.

. xtset hhidpn wave
       panel variable:  hhidpn (strongly balanced)
        time variable:  wave, 1 to 12
                delta:  1 unit

Keep only waves 3 to 8.

. keep if wave >= 3 & wave <= 8
(8,154 observations deleted)

Before running regressions, we first check that we have within variation for the variables included in our regression. It looks like we have a good amount for most of the variables, although variables like education do not have any within variation in this dataset.

. xtsum

Variable         |       Mean  Std. Dev.        Min        Max |    Observations
-----------------+---------------------------------------------+----------------
hhidpn   overall |   7.76e+07   6.61e+07   1.00e+07   2.09e+08 |     N =    8154
         between |              6.62e+07   1.00e+07   2.09e+08 |     n =    1359
         within  |                     0   7.76e+07   7.76e+07 |     T =       6
wave     overall |        5.5    1.70793          3          8 |     N =    8154
         between |                     0        5.5        5.5 |     n =    1359
         within  |               1.70793          3          8 |     T =       6
rcenreg  overall |   2.660657   .9895839          1          5 |     N =    8154
         between |              .9710592          1          5 |     n =    1359
         within  |              .1920898  -.5060093   4.660657 |     T =       6
ragender overall |   1.499632   .5000305          1          2 |     N =    8154
         between |              .5001839          1          2 |     n =    1359
         within  |                     0   1.499632   1.499632 |     T =       6
raedyrs  overall |   13.35173   2.846221          0         17 |     N =    8154
         between |              2.847094          0         17 |     n =    1359
         within  |                     0   13.35173   13.35173 |     T =       6
rshlt    overall |   2.429605   1.052416          1          5 |     N =    8154
         between |               .869697          1          5 |     n =    1359
         within  |              .5930178  -.2370616   5.596272 |     T =       6
hatota   overall |   12.73514   1.520786   .6931472   17.78957 |     N =    8154
         between |              1.412563   5.408678   17.07953 |     n =    1359
         within  |              .5645154   3.577643   16.13795 |     T =       6
rsayret  overall |   .4540103    .497911          0          1 |     N =    8154
         between |              .3717857          0          1 |     n =    1359
         within  |              .3313239   -.379323   1.287344 |     T =       6
rbeqany  overall |   92.97768   24.62895          0        100 |     N =    8154
         between |              21.54188          0        100 |     n =    1359
         within  |              11.95061   9.644346    176.311 |     T =       6
hhhres   overall |   2.126686   .8770996          1         12 |     N =    8154
         between |              .7058009          1        6.5 |     n =    1359
         within  |              .5210127  -1.206647   10.46002 |     T =       6
beqdiff  overall |   4.025018   20.12526       -100        100 |     N =    8154
         between |              20.13144       -100        100 |     n =    1359
         within  |                     0   4.025018   4.025018 |     T =       6
hhresd~f overall |  -.1103753   .7670273        -10          6 |     N =    8154
         between |              .7672627        -10          6 |     n =    1359
         within  |                     0  -.1103753  -.1103753 |     T =       6
wealth~f overall |   .0836954    .677269  -6.339477   3.128419 |     N =    8154
         between |              .6774768  -6.339477   3.128419 |     n =    1359
         within  |                     0   .0836954   .0836954 |     T =       6
retire~f overall |  -.0036792   .4331829         -1          1 |     N =    8154
         between |              .4333158         -1          1 |     n =    1359
         within  |                     0  -.0036792  -.0036792 |     T =       6

One way to create indicator variables is to use tab with the gen() option.

. tab wave, gen(TTT)

       wave |      Freq.     Percent        Cum.
------------+-----------------------------------
          3 |      1,359       16.67       16.67
          4 |      1,359       16.67       33.33
          5 |      1,359       16.67       50.00
          6 |      1,359       16.67       66.67
          7 |      1,359       16.67       83.33
          8 |      1,359       16.67      100.00
------------+-----------------------------------
      Total |      8,154      100.00

Estimate the two-period first difference model using the manually differenced variables. Note that I am only including wave 4 since we only created the difference between waves 4 and 3. We are also omitting the constant since we differenced it out.

.
reg beqdiff hhresdiff wealthdiff retireddiff TTT* if wave == 4, nocons
note: TTT1 omitted because of collinearity
note: TTT3 omitted because of collinearity
note: TTT4 omitted because of collinearity
note: TTT5 omitted because of collinearity
note: TTT6 omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =     1,359
-------------+----------------------------------   F(4, 1355)      =     14.31
       Model |  23207.0244         4  5801.75611   Prob > F        =    0.0000
    Residual |  549172.976     1,355  405.293709   R-squared       =    0.0405
-------------+----------------------------------   Adj R-squared   =    0.0377
       Total |      572380     1,359  421.177336   Root MSE        =    20.132

------------------------------------------------------------------------------
     beqdiff |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   hhresdiff |   .2287623   .7136789     0.32   0.749    -1.171273    1.628798
  wealthdiff |   1.066715   .8078159     1.32   0.187    -.5179905    2.651421
 retireddiff |  -1.396982   1.262236    -1.11   0.269    -3.873131    1.079166
        TTT1 |          0  (omitted)
        TTT2 |   3.955849   .5553654     7.12   0.000      2.86638    5.045319
        TTT3 |          0  (omitted)
        TTT4 |          0  (omitted)
        TTT5 |          0  (omitted)
        TTT6 |          0  (omitted)
------------------------------------------------------------------------------

The coefficients are the same if we use the D. operator, which differences the data across one time period (you need to xtset the data first). The time-indicator coefficient has the opposite sign only because D.TTT1 equals -1 in wave 4, whereas TTT2 equals 1.

. reg D.(rbeqany hhhres hatota rsayret TTT*) if wave == 4, nocons
note: D.TTT2 omitted because of collinearity
note: D.TTT3 omitted because of collinearity
note: D.TTT4 omitted because of collinearity
note: D.TTT5 omitted because of collinearity
note: D.TTT6 omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =     1,359
-------------+----------------------------------   F(4, 1355)      =     14.31
       Model |  23207.0244         4  5801.75611   Prob > F        =    0.0000
    Residual |  549172.976     1,355  405.293709   R-squared       =    0.0405
-------------+----------------------------------   Adj R-squared   =    0.0377
       Total |      572380     1,359  421.177336   Root MSE        =    20.132

------------------------------------------------------------------------------
   D.rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |
         D1. |   .2287623   .7136789     0.32   0.749    -1.171273    1.628798
             |
      hatota |
         D1. |   1.066715   .8078159     1.32   0.187    -.5179906    2.651421
             |
     rsayret |
         D1. |  -1.396982   1.262236    -1.11   0.269    -3.873131    1.079166
             |
        TTT1 |
         D1. |  -3.955849   .5553654    -7.12   0.000    -5.045319    -2.86638
             |
        TTT2 |
         D1. |          0  (omitted)
             |
        TTT3 |
         D1. |          0  (omitted)
             |
        TTT4 |
         D1. |          0  (omitted)
             |
        TTT5 |
         D1. |          0  (omitted)
             |
        TTT6 |
         D1. |          0  (omitted)
------------------------------------------------------------------------------

The coefficients are also the same if we estimate a Fixed Effects model using only waves 3 and 4. First Difference and Fixed Effects provide the same estimates when using two time periods. Note that with xtreg you do not specify noconstant since we are estimating all the different intercepts.

. xtreg rbeqany hhhres hatota rsayret TTT* if wave == 4 | wave == 3, fe
note: TTT2 omitted because of collinearity
note: TTT3 omitted because of collinearity
note: TTT4 omitted because of collinearity
note: TTT5 omitted because of collinearity
note: TTT6 omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      2,718
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0405                                         min =          2
     between = 0.3288                                         avg =        2.0
     overall = 0.1508                                         max =          2

                                                F(4,1355)         =      14.31
corr(u_i, Xb)  = 0.3236                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .2287623   .7136789     0.32   0.749    -1.171273    1.628798
      hatota |   1.066715   .8078159     1.32   0.187    -.5179906    2.651421
     rsayret |  -1.396982   1.262236    -1.11   0.269    -3.873131    1.079166
        TTT1 |  -3.955849   .5553654    -7.12   0.000    -5.045319    -2.86638
        TTT2 |          0  (omitted)
        TTT3 |          0  (omitted)
        TTT4 |          0  (omitted)
        TTT5 |          0  (omitted)
        TTT6 |          0  (omitted)
       _cons |   80.53884   10.37666     7.76   0.000     60.18278    100.8949
-------------+----------------------------------------------------------------
     sigma_u |   22.26177
     sigma_e |  14.235408
         rho |  .70977198   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(1358, 1355) = 3.49                  Prob > F = 0.0000

Now estimate the First Difference model using all the time periods.

.
reg D.(rbeqany hhhres hatota rsayret TTT*), nocons
note: D.TTT6 omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =     6,795
-------------+----------------------------------   F(8, 6787)      =     13.54
       Model |   30619.435         8  3827.42938   Prob > F        =    0.0000
    Residual |  1918271.56     6,787  282.639099   R-squared       =    0.0157
-------------+----------------------------------   Adj R-squared   =    0.0146
       Total |     1948891     6,795  286.812509   Root MSE        =    16.812

------------------------------------------------------------------------------
   D.rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |
         D1. |  -.0198774   .3279133    -0.06   0.952    -.6626902    .6229354
             |
      hatota |
         D1. |   1.456645   .2958633     4.92   0.000     .8766604     2.03663
             |
     rsayret |
         D1. |  -.9030089   .4722086    -1.91   0.056    -1.828686    .0226679
             |
        TTT1 |
         D1. |  -2.469052   1.043154    -2.37   0.018    -4.513962   -.4241425
             |
        TTT2 |
         D1. |   1.428535    .931412     1.53   0.125    -.3973244    3.254395
             |
        TTT3 |
         D1. |   1.065853   .8029547     1.33   0.184    -.5081904    2.639896
             |
        TTT4 |
         D1. |    1.12836   .6513059     1.73   0.083    -.1484037    2.405124
             |
        TTT5 |
         D1. |   .6005762   .4574299     1.31   0.189    -.2961298    1.497282
             |
        TTT6 |
         D1. |          0  (omitted)
------------------------------------------------------------------------------

Note that the coefficients on the other variables are the same whether you difference the time indicators or include them as is.

. reg D.(rbeqany hhhres hatota rsayret) TTT*, nocons
note: TTT1 omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =     6,795
-------------+----------------------------------   F(8, 6787)      =     13.54
       Model |   30619.435         8  3827.42938   Prob > F        =    0.0000
    Residual |  1918271.56     6,787  282.639099   R-squared       =    0.0157
-------------+----------------------------------   Adj R-squared   =    0.0146
       Total |     1948891     6,795  286.812509   Root MSE        =    16.812

------------------------------------------------------------------------------
   D.rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |
         D1. |  -.0198774   .3279133    -0.06   0.952    -.6626902    .6229354
             |
      hatota |
         D1. |   1.456645   .2958633     4.92   0.000     .8766604     2.03663
             |
     rsayret |
         D1. |  -.9030089   .4722086    -1.91   0.056    -1.828686    .0226679
             |
        TTT1 |          0  (omitted)
        TTT2 |   3.897588   .4581274     8.51   0.000     2.999514    4.795661
        TTT3 |  -.3626826   .4582761    -0.79   0.429    -1.261048    .5356823
        TTT4 |   .0625076   .4592435     0.14   0.892    -.8377537     .962769
        TTT5 |   -.527784   .4596993    -1.15   0.251    -1.428939    .3733708
        TTT6 |  -.6005762   .4574299    -1.31   0.189    -1.497282    .2961298
------------------------------------------------------------------------------

The coefficients for the Fixed Effects model using all the time periods are similar to the First Difference model, but not exactly the same.

. xtreg rbeqany hhhres hatota rsayret TTT*, fe
note: TTT6 omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0220                                         min =          6
     between = 0.4381                                         avg =        6.0
     overall = 0.2768                                         max =          6

                                                F(8,6787)         =      19.05
corr(u_i, Xb)  = 0.4883                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |  -.1103689   .2828504    -0.39   0.696    -.6648443    .4441066
      hatota |   1.937789   .2655226     7.30   0.000     1.417281    2.458296
     rsayret |  -1.031235   .4576656    -2.25   0.024    -1.928403   -.1340665
        TTT1 |   -2.24378   .5352282    -4.19   0.000    -3.292995   -1.194565
        TTT2 |   1.603078   .5263248     3.05   0.002     .5713163     2.63484
        TTT3 |   1.191522    .514469     2.32   0.021     .1830017    2.200043
        TTT4 |    1.21655   .5039036     2.41   0.016     .2287408    2.204359
        TTT5 |   .6473157   .4979766     1.30   0.194    -.3288745    1.623506
        TTT6 |          0  (omitted)
       _cons |   68.60014   3.534376    19.41   0.000     61.67166    75.52863
-------------+----------------------------------------------------------------
     sigma_u |  19.819517
     sigma_e |   12.95355
         rho |  .70069183   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(1358, 6787) = 9.81                  Prob > F = 0.0000

But both Fixed Effects and First Difference are quite different from the Pooled OLS model.

. reg rbeqany hhhres hatota rsayret TTT*
note: TTT6 omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(8, 8145)      =    474.22
       Model |  1571509.72         8  196438.715   Prob > F        =    0.0000
    Residual |  3373978.22     8,145  414.239192   R-squared       =    0.3178
-------------+----------------------------------   Adj R-squared   =    0.3171
       Total |  4945487.94     8,153  606.585053   Root MSE        =    20.353

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .5956975   .2603045     2.29   0.022     .0854342    1.105961
      hatota |   9.131058   .1491015    61.24   0.000     8.838781    9.423335
     rsayret |  -.7810734   .4652072    -1.68   0.093    -1.692998    .1308515
        TTT1 |   1.001606   .7975332     1.26   0.209    -.5617631    2.564974
        TTT2 |   4.325273   .7945231     5.44   0.000     2.767805    5.882741
        TTT3 |   3.201472   .7891959     4.06   0.000     1.654447    4.748498
        TTT4 |   2.614182   .7839107     3.33   0.001     1.077517    4.150847
        TTT5 |   1.371202   .7811275     1.76   0.079    -.1600075    2.902411
        TTT6 |          0  (omitted)
       _cons |  -26.30545   2.112393   -12.45   0.000    -30.44628   -22.16462
------------------------------------------------------------------------------

We can observe the different intercepts for each individual if we run the Fixed Effects model by including all the indicators for the identifier (i.hhidpn in this case). You will need to use the command “set matsize 11000” if you have not done so already since there are a lot of individuals in the dataset and the default matsize is not large enough to estimate the regression.

. reg rbeqany hhhres hatota rsayret TTT* i.hhidpn
note: TTT6 omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(1366, 6787)   =     16.61
       Model |  3806666.92     1,366  2786.72542   Prob > F        =    0.0000
    Residual |  1138821.02     6,787  167.794463   R-squared       =    0.7697
-------------+----------------------------------   Adj R-squared   =    0.7234
       Total |  4945487.94     8,153  606.585053   Root MSE        =    12.954

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |  -.1103689   .2828504    -0.39   0.696    -.6648443    .4441066
      hatota |   1.937789   .2655226     7.30   0.000     1.417281    2.458296
     rsayret |  -1.031235   .4576656    -2.25   0.024    -1.928403   -.1340665
        TTT1 |   -2.24378   .5352282    -4.19   0.000    -3.292995   -1.194565
        TTT2 |   1.603078   .5263248     3.05   0.002     .5713163     2.63484
        TTT3 |   1.191522    .514469     2.32   0.021     .1830017    2.200043
        TTT4 |    1.21655   .5039036     2.41   0.016     .2287408    2.204359
        TTT5 |   .6473157   .4979766     1.30   0.194    -.3288745    1.623506
        TTT6 |          0  (omitted)
             |
      hhidpn |
   10038040  |  -4.91e-12   7.478736    -0.00   1.000    -14.66067    14.66067
   10059020  |  -3.406031   7.493284    -0.45   0.649    -18.09522    11.28316
   10097040  |  -18.37875   7.482767    -2.46   0.014    -33.04732   -3.710178
   10394010  |   1.938642   7.503271     0.26   0.796    -12.77012    16.64741

30 pages later…

  208504010  |  -86.01498   7.709259   -11.16   0.000    -101.1275   -70.90242
  208675010  |   .3376332     7.5091     0.04   0.964    -14.38256    15.05782
  208725020  |   -34.3129   7.543641    -4.55   0.000     -49.1008     -19.525
  208773020  |  -4.518654   7.505793    -0.60   0.547    -19.23236    10.19505
  208827020  |    -79.631   7.728287   -10.30   0.000    -94.78086   -64.48113
  208896010  |  -24.63045   7.566187    -3.26   0.001    -39.46255   -9.798353
             |
       _cons |   72.39976   6.562587    11.03   0.000     59.53504    85.26449
------------------------------------------------------------------------------

Stata has stored all of those intercepts temporarily.

. return list

scalars:
              r(level) =  95

macros:
            r(label10) : "(base)"
             r(label9) : "(omitted)"

matrices:
              r(table) :  9 x 1369

And we can test if they are different. The easiest way to do that is to conduct a joint F-test with a null that the coefficients on all the individual indicators are equal to 0, meaning that each individual’s intercept is not different from the omitted individual’s intercept. Since the p-value is smaller than 0.05 in this case, we would reject the null that they are all equal to 0 and conclude that at least some of the intercepts are different.
This suggests that Fixed Effects may be preferred to Pooled OLS.

. testparm i.*hhidpn

 (   1)  10038040.hhidpn = 0
 (   2)  10059020.hhidpn = 0
 (   3)  10097040.hhidpn = 0

Several pages later…

 (1357)  208827020.hhidpn = 0
 (1358)  208896010.hhidpn = 0

       F(1358,  6787) =    9.81
            Prob > F =    0.0000

But there is a much easier way to conduct this test: it shows up at the bottom of the xtreg Fixed Effects output.

. xtreg rbeqany hhhres hatota rsayret TTT*, fe
note: TTT6 omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0220                                         min =          6
     between = 0.4381                                         avg =        6.0
     overall = 0.2768                                         max =          6

                                                F(8,6787)         =      19.05
corr(u_i, Xb)  = 0.4883                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |  -.1103689   .2828504    -0.39   0.696    -.6648443    .4441066
      hatota |   1.937789   .2655226     7.30   0.000     1.417281    2.458296
     rsayret |  -1.031235   .4576656    -2.25   0.024    -1.928403   -.1340665
        TTT1 |   -2.24378   .5352282    -4.19   0.000    -3.292995   -1.194565
        TTT2 |   1.603078   .5263248     3.05   0.002     .5713163     2.63484
        TTT3 |   1.191522    .514469     2.32   0.021     .1830017    2.200043
        TTT4 |    1.21655   .5039036     2.41   0.016     .2287408    2.204359
        TTT5 |   .6473157   .4979766     1.30   0.194    -.3288745    1.623506
        TTT6 |          0  (omitted)
       _cons |   68.60014   3.534376    19.41   0.000     61.67166    75.52863
-------------+----------------------------------------------------------------
     sigma_u |  19.819517
     sigma_e |   12.95355
         rho |  .70069183   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(1358, 6787) = 9.81                  Prob > F = 0.0000

We estimated the First Difference model by manually differencing the data, but we can also manually mean difference the data to estimate the Fixed Effects model.
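To see why mean differencing removes the individual intercepts, it may help to write the model out. The following is only a sketch under assumed notation that does not appear in the log: $y_{it}$ is the outcome (rbeqany), $x_{it}$ the time-varying regressors, $a_i$ the individual intercept, $u_{it}$ the error, and $T$ the number of waves per individual.

```latex
\begin{align}
y_{it} &= x_{it}'\beta + a_i + u_{it} \\
\bar{y}_i &= \bar{x}_i'\beta + a_i + \bar{u}_i,
  \qquad \bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it} \\
y_{it} - \bar{y}_i &= (x_{it} - \bar{x}_i)'\beta + (u_{it} - \bar{u}_i)
\end{align}
```

Because $a_i$ is constant over time, it equals its own individual mean and drops out of the third line, which is the mean-differenced regression.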
We start by generating the mean value of a variable for each individual.

. egen MEANbeqany = mean(rbeqany), by(hhidpn)

Then create another variable that is the difference between the value of the variable and the individual’s mean.

. gen MDbeqany = rbeqany - MEANbeqany

Check to make sure that it worked.

. br hhidpn wave rbeqany MEANbeqany MDbeqany

Then use a loop to do it for every variable.

. foreach x in hhhres hatota rsayret {
      egen MEAN`x' = mean(`x'), by(hhidpn)
      gen MD`x' = `x' - MEAN`x'
  }

Then regress the individual mean differenced variables. Note that we are omitting the constant since we differenced it out (just like in First Differences).

. reg MDbeqany MDhhhres MDhatota MDrsayret, nocons

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(3, 8151)      =     28.94
       Model |  12270.6371         3  4090.21236   Prob > F        =    0.0000
    Residual |  1152117.71     8,151  141.346793   R-squared       =    0.0105
-------------+----------------------------------   Adj R-squared   =    0.0102
       Total |  1164388.34     8,154   142.79965   Root MSE        =    11.889

------------------------------------------------------------------------------
    MDbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    MDhhhres |  -.3409778   .2541839    -1.34   0.180     -.839243    .1572874
    MDhatota |   2.108028   .2351886     8.96   0.000     1.646998    2.569057
   MDrsayret |  -.9354061   .3998503    -2.34   0.019    -1.719215   -.1515975
------------------------------------------------------------------------------

And rerun our xtreg, fe to make sure that the results are the same. The coefficients are the same; however, the standard errors are too small in the manually estimated version since the degrees of freedom are not adjusted for the individual indicator variables (the manual regression uses 8,151 residual degrees of freedom, while xtreg uses 6,792 because it also accounts for the 1,359 individual means).

. xtreg rbeqany hhhres hatota rsayret, fe

Fixed-effects (within) regression               Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0105                                         min =          6
     between = 0.4364                                         avg =        6.0
     overall = 0.3065                                         max =          6

                                                F(3,6792)         =      24.11
corr(u_i, Xb)  = 0.5278                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |  -.3409778   .2784547    -1.22   0.221    -.8868363    .2048807
      hatota |   2.108028   .2576457     8.18   0.000     1.602961    2.613094
     rsayret |  -.9354061   .4380302    -2.14   0.033    -1.794083   -.0767297
       _cons |    67.2815   3.377482    19.92   0.000     60.66057    73.90242
-------------+----------------------------------------------------------------
     sigma_u |  19.684013
     sigma_e |  13.024156
         rho |  .69550892   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(1358, 6792) = 9.72                  Prob > F = 0.0000

We can predict the different intercepts for each individual.

. predict MU, u

And regress them on the time-invariant variables in the data to get an idea of how much of the variation in the intercepts (or time-invariant unobservables) we can explain with our observable time-invariant variables. In this case, about 10% (see R-squared).

. gen female = (ragender == 2)

. reg MU raedyrs female

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(2, 8151)      =    449.76
       Model |  313772.768         2  156886.384   Prob > F        =    0.0000
    Residual |  2843254.39     8,151  348.822769   R-squared       =    0.0994
-------------+----------------------------------   Adj R-squared   =    0.0992
       Total |  3157027.16     8,153  387.222759   Root MSE        =    18.677

------------------------------------------------------------------------------
          MU |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     raedyrs |   1.802196   .0730644    24.67   0.000     1.658971    1.945421
      female |  -5.998004   .4158896   -14.42   0.000    -6.813253   -5.182754
       _cons |  -21.06563   1.038999   -20.27   0.000    -23.10234   -19.02893
------------------------------------------------------------------------------

With Fixed Effects models we cannot include time-invariant variables; however, we can interact them with time-varying variables. This allows us to estimate whether the effect varies across groups. In this case we are using the ## operator to include the female indicator interacted with the wealth variable.

. xtreg rbeqany hhhres i.female##c.hatota rsayret, fe
note: 1.female omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0110                                         min =          6
     between = 0.0318                                         avg =        6.0
     overall = 0.0265                                         max =          6

                                                F(4,6791)         =      18.84
corr(u_i, Xb)  = -0.1099                        Prob > F          =     0.0000

---------------------------------------------------------------------------------
        rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
         hhhres |  -.3438924   .2784191    -1.24   0.217    -.8896812    .2018964
       1.female |          0  (omitted)
         hatota |   1.570832   .4037838     3.89   0.000     .7792891    2.362375
                |
female#c.hatota |
             1  |   .8996671   .5207344     1.73   0.084    -.1211355     1.92047
                |
        rsayret |  -.9205838   .4380502    -2.10   0.036      -1.7793   -.0618681
          _cons |   68.48225   3.447765    19.86   0.000     61.72355    75.24095
----------------+----------------------------------------------------------------
        sigma_u |  21.331345
        sigma_e |  13.022253
            rho |  .72850204   (fraction of variance due to u_i)
---------------------------------------------------------------------------------
F test that all u_i=0: F(1358, 6791) = 9.62                  Prob > F = 0.0000

Note that if you run the model separately for men…

.
xtreg rbeqany hhhres hatota rsayret if female == 0, fe

Fixed-effects (within) regression               Number of obs     =      4,080
Group variable: hhidpn                          Number of groups  =        680

R-sq:                                           Obs per group:
     within  = 0.0080                                         min =          6
     between = 0.2380                                         avg =        6.0
     overall = 0.1471                                         max =          6

                                                F(3,3397)         =       9.18
corr(u_i, Xb)  = 0.3311                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |  -.3191085   .3364081    -0.95   0.343    -.9786912    .3404742
      hatota |    1.59187   .3324218     4.79   0.000      .940103    2.243637
     rsayret |  -1.187565   .5279356    -2.25   0.025    -2.222668   -.1524612
       _cons |   77.55832   4.406477    17.60   0.000     68.91871    86.19793
-------------+----------------------------------------------------------------
     sigma_u |  12.630731
     sigma_e |  10.663882
         rho |  .58383597   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(679, 3397) = 7.22                   Prob > F = 0.0000

and women, you get different coefficients because we only interacted one of the variables in the combined model, but when we run them separately it is like we interacted every variable.

. xtreg rbeqany hhhres hatota rsayret if female == 1, fe

Fixed-effects (within) regression               Number of obs     =      4,074
Group variable: hhidpn                          Number of groups  =        679

R-sq:                                           Obs per group:
     within  = 0.0125                                         min =          6
     between = 0.5441                                         avg =        6.0
     overall = 0.3939                                         max =          6

                                                F(3,3392)         =      14.33
corr(u_i, Xb)  = 0.6129                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |  -.3678956   .4367785    -0.84   0.400    -1.224271    .4884802
      hatota |   2.457914     .38449     6.39   0.000     1.704059     3.21177
     rsayret |  -.6905625   .6887097    -1.00   0.316    -2.040891    .6597656
       _cons |   59.31754   4.979433    11.91   0.000     49.55455    69.08053
-------------+----------------------------------------------------------------
     sigma_u |   24.08111
     sigma_e |  15.020166
         rho |  .71992064   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(678, 3392) = 9.59                   Prob > F = 0.0000

In fact, we get the same coefficients as the stratified models (where we ran them separately), though not the same standard errors, if we interact every variable in the model.

. xtreg rbeqany i.female##c.hhhres i.female##c.hatota i.female##i.rsayret, fe
note: 1.female omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0110                                         min =          6
     between = 0.0344                                         avg =        6.0
     overall = 0.0286                                         max =          6

                                                F(6,6789)         =      12.61
corr(u_i, Xb)  = -0.0969                        Prob > F          =     0.0000

--------------------------------------------------------------------------------------
             rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
            1.female |          0  (omitted)
              hhhres |  -.3191085   .4108568    -0.78   0.437    -1.124517    .4862996
                     |
     female#c.hhhres |
                  1  |  -.0487871   .5587819    -0.09   0.930    -1.144175    1.046601
                     |
              hatota |    1.59187   .4059883     3.92   0.000     .7960055    2.387734
                     |
     female#c.hatota |
                  1  |   .8660443   .5253322     1.65   0.099    -.1637716     1.89586
                     |
             rsayret |
 1.completely reti.. |  -1.187565   .6447703    -1.84   0.066    -2.451517    .0763871
                     |
      female#rsayret |
                  1 #|
 1.completely reti.. |   .4970022   .8788319     0.57   0.572    -1.225784    2.219788
                     |
               _cons |   68.44464   3.450335    19.84   0.000      61.6809    75.20838
---------------------+----------------------------------------------------------------
             sigma_u |   21.27283
             sigma_e |  13.023851
                 rho |  .72736543   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------
F test that all u_i=0: F(1358, 6789) = 9.42                  Prob > F = 0.0000

Note that the coefficient for the individual being retired is -1.19 for males and

. di -1.187565 + .4970022
-.6905628

for females, which is what we see in the stratified results above.

We can also include self-reported health using i. (correctly this time, unlike the example from a prior lecture).

. reg rbeqany hhhres hatota rsayret i.rshlt

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(7, 8146)      =    552.24
       Model |  1591584.52         7  227369.218   Prob > F        =    0.0000
    Residual |  3353903.41     8,146  411.723964   R-squared       =    0.3218
-------------+----------------------------------   Adj R-squared   =    0.3212
       Total |  4945487.94     8,153  606.585053   Root MSE        =    20.291

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .6540828    .257867     2.54   0.011     .1485976    1.159568
      hatota |   8.790527   .1526396    57.59   0.000     8.491315     9.08974
     rsayret |  -.7791816   .4576924    -1.70   0.089    -1.676375    .1180122
             |
       rshlt |
 2.very good |  -.3837923   .6246604    -0.61   0.539    -1.608286    .8407015
      3.good |  -.1905096   .6543037    -0.29   0.771    -1.473112    1.092093
      4.fair |  -3.576621   .8139137    -4.39   0.000      -5.1721   -1.981143
      5.poor |  -11.38068   1.390215    -8.19   0.000    -14.10586   -8.655506
             |
       _cons |  -19.00389   2.156946    -8.81   0.000    -23.23206   -14.77573
------------------------------------------------------------------------------

We can also change the reference category using b3.
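Changing the base level does not change the fitted model; it only re-expresses each health coefficient relative to a different category. As a rough check (the symbol $\beta_k^{(b)}$, meaning the coefficient on health category $k$ when category $b$ is the base, is notation introduced here and not part of the log), the coefficients relative to base 3 can be computed by hand from the base-1 estimates in the i.rshlt output above:

```latex
\begin{align}
\beta_k^{(3)} &= \beta_k^{(1)} - \beta_3^{(1)} \\
\beta_1^{(3)} &= 0 - (-0.1905096) = 0.1905096 \\
\beta_2^{(3)} &= -0.3837923 - (-0.1905096) = -0.1932827 \\
\beta_4^{(3)} &= -3.576621 - (-0.1905096) \approx -3.38611 \\
\beta_5^{(3)} &= -11.38068 - (-0.1905096) \approx -11.19017
\end{align}
```

Only the health-category coefficients, their standard errors, and the constant depend on the choice of base; the other coefficients and the overall fit do not.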
The reference category changes from the first category (excellent health) to 3 (good health), and all the coefficients are now relative to category 3.

. reg rbeqany hhhres hatota rsayret b3.rshlt

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(7, 8146)      =    552.24
       Model |  1591584.52         7  227369.218   Prob > F        =    0.0000
    Residual |  3353903.41     8,146  411.723964   R-squared       =    0.3218
-------------+----------------------------------   Adj R-squared   =    0.3212
       Total |  4945487.94     8,153  606.585053   Root MSE        =    20.291

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .6540828    .257867     2.54   0.011     .1485976    1.159568
      hatota |   8.790527   .1526396    57.59   0.000     8.491315     9.08974
     rsayret |  -.7791816   .4576924    -1.70   0.089    -1.676375    .1180122
             |
       rshlt |
 1.excellent |   .1905096   .6543037     0.29   0.771    -1.092093    1.473112
 2.very good |  -.1932827   .5709417    -0.34   0.735    -1.312474    .9259088
      4.fair |  -3.386112   .7572668    -4.47   0.000    -4.870548   -1.901676
      5.poor |  -11.19017   1.351554    -8.28   0.000    -13.83956    -8.54078
             |
       _cons |   -19.1944   2.069981    -9.27   0.000    -23.25209   -15.13671
------------------------------------------------------------------------------

This matters: when I change the reference category to 5 (poor health), the coefficients appear to change quite a bit and become more statistically significant. This is because they are all now relative to category 5 (poor health).

. reg rbeqany hhhres hatota rsayret b5.rshlt

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(7, 8146)      =    552.24
       Model |  1591584.52         7  227369.218   Prob > F        =    0.0000
    Residual |  3353903.41     8,146  411.723964   R-squared       =    0.3218
-------------+----------------------------------   Adj R-squared   =    0.3212
       Total |  4945487.94     8,153  606.585053   Root MSE        =    20.291

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .6540828    .257867     2.54   0.011     .1485976    1.159568
      hatota |   8.790527   .1526396    57.59   0.000     8.491315     9.08974
     rsayret |  -.7791816   .4576924    -1.70   0.089    -1.676375    .1180122
             |
       rshlt |
 1.excellent |   11.38068   1.390215     8.19   0.000     8.655506    14.10586
 2.very good |   10.99689   1.348262     8.16   0.000     8.353952    13.63983
      3.good |   11.19017   1.351554     8.28   0.000      8.54078    13.83956
      4.fair |    7.80406   1.421072     5.49   0.000     5.018396    10.58972
             |
       _cons |  -30.38457   2.275688   -13.35   0.000     -34.8455   -25.92365
------------------------------------------------------------------------------

If you use the xi: prefix to estimate a model with indicators or fixed effects, Stata will create a variable for each category (minus the reference category).

. xi: reg rbeqany hhhres hatota rsayret i.rshlt
i.rshlt           _Irshlt_1-5         (naturally coded; _Irshlt_1 omitted)

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(7, 8146)      =    552.24
       Model |  1591584.52         7  227369.218   Prob > F        =    0.0000
    Residual |  3353903.41     8,146  411.723964   R-squared       =    0.3218
-------------+----------------------------------   Adj R-squared   =    0.3212
       Total |  4945487.94     8,153  606.585053   Root MSE        =    20.291

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .6540828    .257867     2.54   0.011     .1485976    1.159568
      hatota |   8.790527   .1526396    57.59   0.000     8.491315     9.08974
     rsayret |  -.7791816   .4576924    -1.70   0.089    -1.676375    .1180122
   _Irshlt_2 |  -.3837923   .6246604    -0.61   0.539    -1.608286    .8407015
   _Irshlt_3 |  -.1905096   .6543037    -0.29   0.771    -1.473112    1.092093
   _Irshlt_4 |  -3.576621   .8139137    -4.39   0.000      -5.1721   -1.981143
   _Irshlt_5 |  -11.38068   1.390215    -8.19   0.000    -14.10586   -8.655506
       _cons |  -19.00389   2.156946    -8.81   0.000    -23.23206   -14.77573
------------------------------------------------------------------------------

We can see the new _Irshlt_ variables at the bottom of the summary output below.

. su

    Variable |        Obs        Mean   Std. Dev.        Min        Max
-------------+---------------------------------------------------------
      hhidpn |      8,154    7.76e+07    6.61e+07   1.00e+07   2.09e+08
        wave |      8,154         5.5     1.70793          3          8
     rcenreg |      8,154    2.660657    .9895839          1          5
    ragender |      8,154    1.499632    .5000305          1          2
     raedyrs |      8,154    13.35173    2.846221          0         17
-------------+---------------------------------------------------------
       rshlt |      8,154    2.429605    1.052416          1          5
      hatota |      8,154    12.73514    1.520786   .6931472   17.78957
     rsayret |      8,154    .4540103     .497911          0          1
     rbeqany |      8,154    92.97768    24.62895          0        100
      hhhres |      8,154    2.126686    .8770996          1         12
-------------+---------------------------------------------------------
     beqdiff |      8,154    4.025018    20.12526       -100        100
   hhresdiff |      8,154   -.1103753    .7670273        -10          6
  wealthdiff |      8,154    .0836954     .677269  -6.339477   3.128419
 retireddiff |      8,154   -.0036792    .4331829         -1          1
        TTT1 |      8,154    .1666667    .3727009          0          1
-------------+---------------------------------------------------------
        TTT2 |      8,154    .1666667    .3727009          0          1
        TTT3 |      8,154    .1666667    .3727009          0          1
        TTT4 |      8,154    .1666667    .3727009          0          1
        TTT5 |      8,154    .1666667    .3727009          0          1
        TTT6 |      8,154    .1666667    .3727009          0          1
-------------+---------------------------------------------------------
  MEANbeqany |      8,154    92.97768    21.53527          0        100
    MDbeqany |      8,154    1.98e-08    11.95061  -83.33334   83.33334
  MEANhhhres |      8,154    2.126686    .7055845          1        6.5
    MDhhhres |      8,154   -1.04e-09    .5210127  -3.333333   8.333333
  MEANhatota |      8,154    12.73514     1.41213   5.408678   17.07953
-------------+---------------------------------------------------------
    MDhatota |      8,154   -1.27e-09    .5645154  -9.157493   3.402811
 MEANrsayret |      8,154    .4540103    .3716716          0          1
   MDrsayret |      8,154   -3.01e-09    .3313239  -.8333333   .8333333
          MU |      8,154    2.63e-08    19.67798  -91.83736   13.45221
      female |      8,154    .4996321    .5000305          0          1
-------------+---------------------------------------------------------
   _Irshlt_2 |      8,154     .345352    .4755121          0          1
   _Irshlt_3 |      8,154    .2857493    .4517983          0          1
   _Irshlt_4 |      8,154    .1293844    .3356454          0          1
   _Irshlt_5 |      8,154    .0311504    .1737346          0          1

Lastly, we will conduct the Hausman test between Fixed Effects and Pooled OLS. To do this, we need to remove any time indicators from both models and any time-invariant controls from Pooled OLS.

Fixed Effects

. xtreg rbeqany hhhres hatota rsayret i.rshlt, fe

Fixed-effects (within) regression               Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0113                                         min =          6
     between = 0.4260                                         avg =        6.0
     overall = 0.2963                                         max =          6

                                                F(7,6788)         =      11.09
corr(u_i, Xb)  = 0.5169                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |    -.33284   .2785446    -1.19   0.232    -.8788747    .2131947
      hatota |   2.100239   .2578482     8.15   0.000     1.594776    2.605702
     rsayret |  -.9805525   .4392543    -2.23   0.026    -1.841629   -.1194764
             |
       rshlt |
 2.very good |   -.733578   .5591741    -1.31   0.190    -1.829735    .3625785
      3.good |   .2267821   .6529866     0.35   0.728    -1.053276    1.506841
      4.fair |   .3095366   .8211299     0.38   0.706    -1.300135    1.919209
      5.poor |   .4336902   1.332549     0.33   0.745    -2.178524    3.045904
             |
       _cons |   67.51886   3.398302    19.87   0.000     60.85712     74.1806
-------------+----------------------------------------------------------------
     sigma_u |  19.748141
     sigma_e |  13.022953
         rho |  .69692381   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(1358, 6788) = 9.56                  Prob > F = 0.0000

We need to store the estimates.

. estimates store my_fe

Pooled OLS

. reg rbeqany hhhres hatota rsayret i.rshlt

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(7, 8146)      =    552.24
       Model |  1591584.52         7  227369.218   Prob > F        =    0.0000
    Residual |  3353903.41     8,146  411.723964   R-squared       =    0.3218
-------------+----------------------------------   Adj R-squared   =    0.3212
       Total |  4945487.94     8,153  606.585053   Root MSE        =    20.291

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .6540828    .257867     2.54   0.011     .1485976    1.159568
      hatota |   8.790527   .1526396    57.59   0.000     8.491315     9.08974
     rsayret |  -.7791816   .4576924    -1.70   0.089    -1.676375    .1180122
             |
       rshlt |
 2.very good |  -.3837923   .6246604    -0.61   0.539    -1.608286    .8407015
      3.good |  -.1905096   .6543037    -0.29   0.771    -1.473112    1.092093
      4.fair |  -3.576621   .8139137    -4.39   0.000      -5.1721   -1.981143
      5.poor |  -11.38068   1.390215    -8.19   0.000    -14.10586   -8.655506
             |
       _cons |  -19.00389   2.156946    -8.81   0.000    -23.23206   -14.77573
------------------------------------------------------------------------------

. estimates store my_ols

To calculate the Hausman test you need a consistent estimator (Fixed Effects) and another estimator that is more efficient but might be biased if the time-invariant unobservables are correlated with X (Pooled OLS). The test tells you whether the coefficients differ enough that you should use the Fixed Effects model rather than the more efficient but possibly biased OLS model.

Type hausman followed by the fixed effects estimates, then the OLS estimates, followed by the sigmamore option. Cameron & Trivedi state on page 267 of their book that it is better to use the sigmamore option. Sigmamore specifies that both covariance matrices are based on the same estimated disturbance variance from the efficient estimator (Pooled OLS). In this case, we reject the null hypothesis that the coefficients are the same, so we should use the consistent estimator (Fixed Effects).

. hausman my_fe my_ols, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     my_fe        my_ols       Difference          S.E.
-------------+----------------------------------------------------------------
      hhhres |     -.33284     .6540828       -.9869228        .3490836
      hatota |    2.100239     8.790527       -6.690288        .3716256
     rsayret |   -.9805525    -.7791816       -.2013709        .5088421
       rshlt |
          2  |    -.733578    -.3837923       -.3497857        .6073454
          3  |    .2267821    -.1905096        .4172917         .779114
          4  |    .3095366    -3.576621        3.886158        .9871182
          5  |    .4336902    -11.38068        11.81437        1.542096
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
          B = inconsistent under Ha, efficient under Ho; obtained from regress

    Test:  Ho:  difference in coefficients not systematic

                  chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      397.37
                Prob>chi2 =      0.0000

Estimating a Random Effects model is as easy as changing , fe to , re. Random Effects allows the inclusion of time-invariant variables, such as female and education.

. xtreg rbeqany hhhres hatota rsayret female raedyrs, re

Random-effects GLS regression                   Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0102                                         min =          6
     between = 0.4109                                         avg =        6.0
     overall = 0.3052                                         max =          6

                                                Wald chi2(5)      =     853.22
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .0712011   .2586166     0.28   0.783    -.4356781    .5780802
      hatota |   4.954339   .2075139    23.87   0.000     4.547619    5.361059
     rsayret |  -1.397019   .4181785    -3.34   0.001    -2.216634   -.5774039
      female |  -5.245325   .8875721    -5.91   0.000    -6.984934   -3.505716
     raedyrs |   1.277051   .1601563     7.97   0.000     .9631506    1.590952
       _cons |   15.93623   3.176974     5.02   0.000     9.709478    22.16299
-------------+----------------------------------------------------------------
     sigma_u |  14.983751
     sigma_e |  13.024156
         rho |  .56962494   (fraction of variance due to u_i)
------------------------------------------------------------------------------

I am using a slimmed-down version of the regression equation and will conduct the test triangle. First, we can see the lambda (or theta in Stata) of the Random Effects model using the theta option. Theta is included in the output, but you can also…

. xtreg rbeqany hhhres hatota rsayret, re theta

Random-effects GLS regression                   Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0102                                         min =          6
     between = 0.4448                                         avg =        6.0
     overall = 0.3137                                         max =          6

                                                Wald chi2(3)      =     723.69
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
theta          = .66861607

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .0908525   .2599972     0.35   0.727    -.4187327    .6004377
      hatota |   5.429998   .2023974    26.83   0.000     5.033306     5.82669
     rsayret |  -1.349901   .4200215    -3.21   0.001    -2.173128   -.5266739
       _cons |   24.24557   2.705589     8.96   0.000     18.94272    29.54843
-------------+----------------------------------------------------------------
     sigma_u |   15.13849
     sigma_e |  13.024156
         rho |  .57465506   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Use the formula for lambda from class with sigma_u (the standard deviation of the time-invariant, individual-specific portion of the error, MU) and sigma_e (the standard deviation of the time-varying portion of the error, NU). The 6 in the formula is the number of waves observed per individual.

. di 1 - (e(sigma_e)^2/(e(sigma_e)^2+6*e(sigma_u)^2))^.5
.66861607

Start with the OLS model and save the output.

. reg rbeqany hhhres hatota rsayret

      Source |       SS           df       MS      Number of obs   =     8,154
-------------+----------------------------------   F(3, 8150)      =   1245.74
       Model |  1554814.13         3  518271.378   Prob > F        =    0.0000
    Residual |   3390673.8     8,150  416.033596   R-squared       =    0.3144
-------------+----------------------------------   Adj R-squared   =    0.3141
       Total |  4945487.94     8,153  606.585053   Root MSE        =    20.397

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .6481612   .2591728     2.50   0.012     .1401164    1.156206
      hatota |   9.083049   .1486387    61.11   0.000     8.791679    9.374419
     rsayret |  -1.173936   .4563964    -2.57   0.010    -2.068589   -.2792826
       _cons |  -23.54163   2.014055   -11.69   0.000     -27.4897   -19.59357
------------------------------------------------------------------------------

. est store my_OLS

Estimate the fixed effects model and check the outcome of Specification Test 1 (the F test reported below the coefficient table). The null is that all the individual intercepts (u_i) are equal to 0.
We would reject the null at greater than 99% confidence. This leads us to think that fixed effects might be preferred to OLS.

. xtreg rbeqany hhhres hatota rsayret, fe

Fixed-effects (within) regression               Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0105                                         min =          6
     between = 0.4364                                         avg =        6.0
     overall = 0.3065                                         max =          6

                                                F(3,6792)         =      24.11
corr(u_i, Xb)  = 0.5278                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |  -.3409778   .2784547    -1.22   0.221    -.8868363    .2048807
      hatota |   2.108028   .2576457     8.18   0.000     1.602961    2.613094
     rsayret |  -.9354061   .4380302    -2.14   0.033    -1.794083   -.0767297
       _cons |    67.2815   3.377482    19.92   0.000     60.66057    73.90242
-------------+----------------------------------------------------------------
     sigma_u |  19.684013
     sigma_e |  13.024156
         rho |  .69550892   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(1358, 6792) = 9.72                  Prob > F = 0.0000

. est store my_FE

Then run the random effects model and store the output.

. xtreg rbeqany hhhres hatota rsayret, re

Random-effects GLS regression                   Number of obs     =      8,154
Group variable: hhidpn                          Number of groups  =      1,359

R-sq:                                           Obs per group:
     within  = 0.0102                                         min =          6
     between = 0.4448                                         avg =        6.0
     overall = 0.3137                                         max =          6

                                                Wald chi2(3)      =     723.69
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     rbeqany |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hhhres |   .0908525   .2599972     0.35   0.727    -.4187327    .6004377
      hatota |   5.429998   .2023974    26.83   0.000     5.033306     5.82669
     rsayret |  -1.349901   .4200215    -3.21   0.001    -2.173128   -.5266739
       _cons |   24.24557   2.705589     8.96   0.000     18.94272    29.54843
-------------+----------------------------------------------------------------
     sigma_u |   15.13849
     sigma_e |  13.024156
         rho |  .57465506   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. est store my_RE

This command calculates the Breusch-Pagan test, with a null that the variance of the time-invariant, individual-specific errors is 0. We would reject the null at greater than 99% confidence, which indicates that Random Effects might be preferred to OLS.

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects

        rbeqany[hhidpn,t] = Xb + u[hhidpn] + e[hhidpn,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                 rbeqany |   606.5851       24.62895
                       e |   169.6286       13.02416
                       u |   229.1739       15.13849

        Test:   Var(u) = 0
                             chibar2(01) =  6120.85
                          Prob > chibar2 =   0.0000

Here are the estimates for each of the different models.

. est table my_OLS my_RE my_FE

-----------------------------------------------------
    Variable |   my_OLS        my_RE        my_FE
-------------+---------------------------------------
      hhhres |  .64816117    .09085249   -.34097782
      hatota |  9.0830489    5.4299981    2.1080277
     rsayret | -1.1739359    -1.349901    -.9354061
       _cons | -23.541634    24.245572    67.281498
-----------------------------------------------------

We are calculating the Hausman test by comparing an estimator (Fixed Effects) that is consistent under both the null (that mu and X are uncorrelated) and the alternative hypothesis (that mu and X are correlated) to an estimator (OLS) that is more efficient under the null but might be biased under the alternative.
We reject the null that the slopes are not different at the 99% confidence level or better. This indicates that Fixed Effects might be preferred to OLS.

. hausman my_FE my_OLS, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     my_FE        my_OLS       Difference          S.E.
-------------+----------------------------------------------------------------
      hhhres |   -.3409778     .6481612        -.989139        .3507106
      hatota |    2.108028     9.083049       -6.975021         .375119
     rsayret |   -.9354061    -1.173936        .2385298        .5121392
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
          B = inconsistent under Ha, efficient under Ho; obtained from regress

    Test:  Ho:  difference in coefficients not systematic

                  chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      349.79
                Prob>chi2 =      0.0000

Just as above, we are calculating the Hausman test by comparing an estimator (Fixed Effects) that is consistent under both the null (that mu and X are uncorrelated) and the alternative hypothesis (that mu and X are correlated) to an estimator (Random Effects) that is more efficient under the null but might be biased under the alternative. We reject the null that the slopes are not different at the 99% confidence level or better. This indicates that Fixed Effects might be preferred to Random Effects.

. hausman my_FE my_RE, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     my_FE        my_RE        Difference          S.E.
-------------+----------------------------------------------------------------
      hhhres |   -.3409778     .0908525       -.4318303        .1173452
      hatota |    2.108028     5.429998        -3.32197        .1694012
     rsayret |   -.9354061    -1.349901        .4144949        .1579034
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      386.62
                Prob>chi2 =      0.0000

The tests all lead to using the Fixed Effects model.
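
For reference, the test triangle can be collected into one short do-file sketch. It uses only commands that already appear in this log. The xtset line is an assumption (the panel has to be declared before xtreg will run, and hhidpn and wave are the identifiers used above), as are the quietly prefixes and the two di lines that print the p-value hausman leaves behind in r(p).

```stata
* Sketch of the test triangle; assumes the long-form panel from this log is in memory.
* Assumption: hhidpn is the panel identifier and wave is the time variable.
xtset hhidpn wave

* Pooled OLS
quietly reg rbeqany hhhres hatota rsayret
est store my_OLS

* Fixed Effects: the F test that all u_i=0 is printed below the coefficient table
xtreg rbeqany hhhres hatota rsayret, fe
est store my_FE

* Random Effects, followed immediately by the Breusch-Pagan LM test
quietly xtreg rbeqany hhhres hatota rsayret, re
est store my_RE
xttest0

* Hausman tests: consistent estimator first, efficient estimator second
hausman my_FE my_OLS, sigmamore
di "FE vs. OLS p-value: " r(p)
hausman my_FE my_RE, sigmamore
di "FE vs. RE p-value: " r(p)
```

Note that xttest0 has to follow the random effects estimation, since it reads the results xtreg, re leaves in memory; est store does not clear those results.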