Heteroskedasticity worksheet

For these exercises use mid14gss.dta, which contains the variables below:

age: respondent's age
male: 1 if male, 0 if female
black: 1 if black, 0 if not
househ: number of people in the household
childs: number of children (both residential and non-residential)
educ: educational attainment in years of schooling
inc06: family income, in thousands of dollars
tvhours: hours per day watching TV
rel: a standardized scale of religiosity from five items (mean = 0, s.d. = 1), higher values = more religious*
conserv: a standardized scale of political conservatism (mean = 0, s.d. = 1), higher values = more conservative*

Predict conservatism using age, education, income, gender, race, and religiosity.

1) Check for heteroskedasticity by plotting residuals against fitted values. Does it look like you have heteroskedasticity?

[Figure: residuals plotted against fitted values]

2) Add a smoothed regression line to the scatter plot using the lowess command. Does this change your assessment?

[Figure: residuals plotted against fitted values, with lowess smoother (lowess r conhat)]

3) Based on a plot of residuals against each of your included covariates, do you think one variable in particular is causing heteroskedasticity?

Education looks problematic:

[Figure: residuals plotted against highest year of school completed, with lowess smoother (lowess r educ)]

And maybe religiosity:

[Figure: residuals plotted against standardized values of (r1+r2+r3+r4+r5), with lowess smoother (lowess r rel)]

4) Can we fix the heteroskedasticity by adding a squared term for the education covariate(s)?

No

[Figure: residuals plotted against fitted values from the model with squared education, with lowess smoother (lowess r2 conhat2)]

5) Test the original regression for heteroskedasticity using the Breusch-Pagan test. Do you reject or fail to reject the null hypothesis of homoskedasticity?

Reject:

. ivhettest, nr2
OLS heteroskedasticity test(s) using levels of IVs only
Ho: Disturbance is homoskedastic
    White/Koenker nR2 test statistic    :  52.454  Chi-sq(6) P-value = 0.0000

6) Based on the auxiliary regression used in the Breusch-Pagan tests, which variables are the most likely culprits for causing heteroskedasticity?

. reg r2 age c.educ inc06 male black rel

      Source |       SS       df       MS              Number of obs =    1074
-------------+------------------------------           F(  6,  1067) =    9.13
       Model |  47.9831203     6  7.99718671           Prob > F      =  0.0000
    Residual |  934.472991  1067  .875794744           R-squared     =  0.0488
-------------+------------------------------           Adj R-squared =  0.0435
       Total |  982.456112  1073  .915616134           Root MSE      =  .93584

------------------------------------------------------------------------------
          r2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0016935   .0017864     0.95   0.343    -.0018119    .0051988
        educ |   .0484885   .0106529     4.55   0.000     .0275855    .0693916
       inc06 |   .0071345    .006197     1.15   0.250    -.0050252    .0192941
        male |  -.0012675   .0597298    -0.02   0.983    -.1184687    .1159337
       black |   -.313183    .085517    -3.66   0.000    -.4809836   -.1453824
         rel |   .0551263   .0295905     1.86   0.063    -.0029359    .1131886
       _cons |   .1040123   .1717598     0.61   0.545     -.233013    .4410377
------------------------------------------------------------------------------

7) Does the White test for heteroskedasticity (simple version) give the same result?

Yes

. reg r2 conhat conhat2

      Source |       SS       df       MS              Number of obs =    1074
-------------+------------------------------           F(  2,  1071) =   12.12
       Model |    21.74384     2    10.87192           Prob > F      =  0.0000
    Residual |  960.712272  1071  .897023596           R-squared     =  0.0221
-------------+------------------------------           Adj R-squared =  0.0203
       Total |  982.456112  1073  .915616134           Root MSE      =  .94711

------------------------------------------------------------------------------
          r2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      conhat |   .2647934   .0752187     3.52   0.000     .1172007    .4123861
     conhat2 |  -.2943126   .1325408    -2.22   0.027    -.5543816   -.0342435
       _cons |   .8844795   .0360335    24.55   0.000     .8137752    .9551837
------------------------------------------------------------------------------

. di e(r2)*e(N)
23.769901

. di chi2tail(2,23.77)
6.893e-06

8) Re-estimate the original regression model with standard errors robust to heteroskedasticity. What changed?

Not much

9) Assuming heteroskedasticity is related to education, estimate a weighted least squares regression. Does this change your conclusions at all?

No

10) Making no assumptions about the form of heteroskedasticity, estimate a feasible generalized least squares model. Again, do any of your conclusions change?

No

. estimates table ols robust wls fgls, stat(r2 rmse) b(%7.3g) se(%6.3g) t(%7.3g)

------------------------------------------------------
    Variable |    ols      robust      wls       fgls
-------------+----------------------------------------
         age |  -.00024   -.00024    .00012   -.00059
             |    .0018     .0017     .0017     .0017
             |    -.138     -.139     .0731     -.352
        educ |   -.0181    -.0181   -.00143    -.0132
             |    .0104      .011     .0121       .01
             |    -1.73     -1.65     -.118     -1.32
       inc06 |    .0254     .0254      .024     .0223
             |    .0061     .0062     .0064     .0069
             |     4.19       4.1      3.75      3.26
        male |     .145      .145      .118       .11
             |    .0586     .0585     .0578     .0553
             |     2.48      2.48      2.05      1.99
       black |    -.771     -.771     -.722     -.665
             |    .0839     .0695     .0736     .0644
             |     -9.2     -11.1      -9.8     -10.3
         rel |     .282      .282       .28      .231
             |     .029     .0296     .0292     .0283
             |      9.7       9.5      9.58      8.16
       _cons |     .124      .124    -.0999     .0878
             |     .168      .168      .175      .157
             |     .733      .733     -.569      .558
-------------+----------------------------------------
          r2 |     .163      .163      .158      .154
        rmse |     .918      .918      .898      .867
------------------------------------------------------
                                          legend: b/se/t
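
The steps in this worksheet can be sketched as a short do-file. This is one possible sequence and may not match the commands behind the output above. The names r, conhat, r2, conhat2 and the stored-estimate names follow that output. The names r_q and conhat_q for the squared-education model are hypothetical. The WLS weights assume the error variance is proportional to educ, and the FGLS step assumes a variance function estimated from the log of the squared residuals; the worksheet does not state either choice, so the wls and fgls columns may not reproduce exactly. ivhettest is a user-written command and has to be installed first (ssc install ivhettest).

```stata
* Sketch only: weights and variance function below are assumptions
use mid14gss.dta, clear

* Original model: conservatism on age, education, income, gender, race, religiosity
regress conserv age educ inc06 male black rel
estimates store ols
predict conhat, xb          // fitted values
predict r, residuals        // residuals

* 1-3) Residuals against fitted values and covariates, with lowess smoothers
twoway (scatter r conhat) (lowess r conhat)
twoway (scatter r educ)   (lowess r educ)
twoway (scatter r rel)    (lowess r rel)

* 5) Breusch-Pagan (Koenker nR2 version); must follow the original regress
ivhettest, nr2

* 6) Auxiliary regression: squared residuals on the covariates
generate r2 = r^2
regress r2 age educ inc06 male black rel

* 7) Simple White test: squared residuals on fitted values and their square
generate conhat2 = conhat^2
regress r2 conhat conhat2
display e(r2)*e(N)                    // nR2 test statistic
display chi2tail(2, e(r2)*e(N))       // p-value, chi-squared with 2 df

* 4) Add a squared education term and re-plot (r_q, conhat_q are hypothetical names)
regress conserv age c.educ##c.educ inc06 male black rel
predict conhat_q, xb
predict r_q, residuals
twoway (scatter r_q conhat_q) (lowess r_q conhat_q)

* 8) Heteroskedasticity-robust standard errors
regress conserv age educ inc06 male black rel, vce(robust)
estimates store robust

* 9) WLS, ASSUMING Var(e) is proportional to educ
*    (cases with educ == 0 get a missing weight and are dropped)
regress conserv age educ inc06 male black rel [aweight = 1/educ]
estimates store wls

* 10) FGLS, ASSUMING an exponential variance function fit to ln(r^2)
generate lnr2 = ln(r2)
regress lnr2 age educ inc06 male black rel
predict lnh, xb
generate h = exp(lnh)                 // estimated variance function
regress conserv age educ inc06 male black rel [aweight = 1/h]
estimates store fgls

* Side-by-side comparison of the four sets of estimates
estimates table ols robust wls fgls, stats(r2 rmse) b(%7.3g) se(%6.3g) t(%7.3g)
```

The residuals and fitted values are predicted immediately after the original regression, because predict and ivhettest both act on the most recently estimated model.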