Heteroskedasticity worksheet
For these exercises use mid14gss.dta, which contains the variables below:
age: respondent’s age
male: 1 if male, 0 if female
black: 1 if black, 0 if not
househ: number of people in the household
childs: number of children (both residential and non-residential)
educ: educational attainment in years of schooling
inc06: family income, in thousands of dollars
tvhours: hours per day watching tv
rel: a standardized scale of religiosity from five items (mean=0, s.d.=1), higher values=more religious*
conserv: a standardized scale of political conservatism (mean=0, s.d.=1), higher values=more conservative*
Predict conservatism using age, education, income, gender, race, and religiosity
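A minimal sketch of the baseline model that the rest of the worksheet builds on. The variable names r and conhat match those in the worksheet's lowess commands; the assumption that mid14gss.dta sits in the working directory and the stored-estimate name ols are illustrative.

```stata
* Load the worksheet data (assumes mid14gss.dta is in the working directory)
use mid14gss.dta, clear

* Baseline OLS model of conservatism
regress conserv age educ inc06 male black rel
estimates store ols

* Fitted values and residuals, named as in the worksheet's plots
predict conhat, xb
predict r, residuals
```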
1) Check for heteroskedasticity by plotting residuals against fitted values. Does it look like you have heteroskedasticity?
[Figure: scatter plot of residuals (y-axis) against fitted values (x-axis)]
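One way the residual-versus-fitted plot could be produced is sketched here. The worksheet does not show which command was used, so both variants are assumptions; rvfplot is Stata's built-in postestimation plot, and the scatter version does the same thing by hand.

```stata
* Assumes mid14gss.dta is in the working directory
use mid14gss.dta, clear
regress conserv age educ inc06 male black rel

* Built-in residual-versus-fitted plot with a reference line at zero
rvfplot, yline(0)

* Equivalent plot built by hand from saved predictions
predict conhat, xb
predict r, residuals
scatter r conhat, yline(0)
```

A fan or funnel shape around the zero line would suggest that the residual variance changes with the fitted values.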
2) Add a smoothed regression line to the scatter plot using the lowess command. Does this
change your assessment?
lowess r conhat
[Figure: lowess smoother of residuals (y-axis) against fitted values (x-axis)]
3) Based on a plot of residuals against each of your included covariates, do you think one
variable in particular is causing heteroskedasticity?
Education looks problematic:
lowess r educ
[Figure: lowess smoother of residuals (y-axis) against highest year of school completed (x-axis)]
And maybe religiosity:
lowess r rel
[Figure: lowess smoother of residuals (y-axis) against standardized values of (r1+r2+r3+r4+r5) (x-axis)]
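The covariate-by-covariate check can be sketched as a loop. The graph names (res_age and so on) are hypothetical, and the box plots for the two binary covariates are a suggested substitute, since a smoother over a 0/1 variable shows little.

```stata
* Assumes mid14gss.dta is in the working directory
use mid14gss.dta, clear
regress conserv age educ inc06 male black rel
predict r, residuals

* Lowess of residuals against each continuous covariate,
* one named graph per variable
foreach v of varlist age educ inc06 rel {
    lowess r `v', name(res_`v', replace)
}

* For the binary covariates, compare the residual spread by group
graph box r, over(male) name(res_male, replace)
graph box r, over(black) name(res_black, replace)
```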
4) Can we fix the heteroskedasticity by adding a squared term for the education covariate(s)?
No
lowess r2 conhat2
[Figure: lowess smoother of residuals (y-axis) against fitted values (x-axis) from the model with squared education]
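A sketch of how the model with a squared education term might be fit and re-checked. The worksheet's plot uses the names r2 and conhat2 for this model's residuals and fitted values; the names r_quad and conhat_quad below are hypothetical, chosen to avoid confusion with the squared residuals and squared fitted values that appear in later questions.

```stata
* Assumes mid14gss.dta is in the working directory
use mid14gss.dta, clear

* c.educ##c.educ enters education and its square
regress conserv age c.educ##c.educ inc06 male black rel

* Residuals and fitted values from the quadratic model
predict conhat_quad, xb
predict r_quad, residuals

* Same diagnostic plot as before
lowess r_quad conhat_quad
```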
5) Test the original regression for heteroskedasticity using the Breusch-Pagan test. Do you reject or fail to reject the null hypothesis of homoskedasticity?
Reject:
. ivhettest, nr2
OLS heteroskedasticity test(s) using levels of IVs only
Ho: Disturbance is homoskedastic
White/Koenker nR2 test statistic    :  52.454  Chi-sq(6) P-value = 0.0000
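Since ivhettest is a user-written command, the same kind of test can be sketched with built-in tools. The built-in estat hettest with the rhs and iid options requests the Koenker nR2 version using the model's right-hand-side variables, and the manual steps compute nR2 from an auxiliary regression of squared residuals on the covariates. The name r2 for the squared residuals follows the worksheet; everything else is an illustrative assumption.

```stata
* Assumes mid14gss.dta is in the working directory
use mid14gss.dta, clear
regress conserv age educ inc06 male black rel

* Built-in Breusch-Pagan test, Koenker (nR2) version,
* using the right-hand-side variables
estat hettest, rhs iid

* The same statistic by hand: regress squared residuals on the covariates
predict r, residuals
generate r2 = r^2
regress r2 age educ inc06 male black rel

* nR2 and its chi-squared p-value (df = number of covariates)
display "nR2 = " e(N)*e(r2)
display "p   = " chi2tail(e(df_m), e(N)*e(r2))
```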
6) Based on the auxiliary regression used in the Breusch-Pagan tests, which variables are the
most likely culprits for causing heteroskedasticity?
. reg r2 age c.educ inc06 male black rel
      Source |       SS       df       MS              Number of obs =    1074
-------------+------------------------------           F( 6, 1067)   =    9.13
       Model |  47.9831203     6  7.99718671           Prob > F      =  0.0000
    Residual |  934.472991  1067  .875794744           R-squared     =  0.0488
-------------+------------------------------           Adj R-squared =  0.0435
       Total |  982.456112  1073  .915616134           Root MSE      =  .93584

------------------------------------------------------------------------------
          r2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0016935   .0017864     0.95   0.343    -.0018119    .0051988
        educ |   .0484885   .0106529     4.55   0.000     .0275855    .0693916
       inc06 |   .0071345    .006197     1.15   0.250    -.0050252    .0192941
        male |  -.0012675   .0597298    -0.02   0.983    -.1184687    .1159337
       black |   -.313183    .085517    -3.66   0.000    -.4809836   -.1453824
         rel |   .0551263   .0295905     1.86   0.063    -.0029359    .1131886
       _cons |   .1040123   .1717598     0.61   0.545     -.233013    .4410377
------------------------------------------------------------------------------
7) Does the White test for heteroskedasticity (simple version) give the same result?
Yes
. reg r2 conhat conhat2
      Source |       SS       df       MS              Number of obs =    1074
-------------+------------------------------           F( 2, 1071)   =   12.12
       Model |    21.74384     2    10.87192           Prob > F      =  0.0000
    Residual |  960.712272  1071  .897023596           R-squared     =  0.0221
-------------+------------------------------           Adj R-squared =  0.0203
       Total |  982.456112  1073  .915616134           Root MSE      =  .94711

------------------------------------------------------------------------------
          r2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      conhat |   .2647934   .0752187     3.52   0.000     .1172007    .4123861
     conhat2 |  -.2943126   .1325408    -2.22   0.027    -.5543816   -.0342435
       _cons |   .8844795   .0360335    24.55   0.000     .8137752    .9551837
------------------------------------------------------------------------------

. di e(r2)*e(N)
23.769901

. di chi2tail(2,23.77)
6.893e-06
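The output above does not show how r2 and conhat2 were built. The sketch below assumes they are the squared residuals and squared fitted values from the original regression. It also runs estat imtest with the white option, Stata's built-in full White test, which is an added comparison rather than part of the worksheet.

```stata
* Assumes mid14gss.dta is in the working directory
use mid14gss.dta, clear
regress conserv age educ inc06 male black rel

* Full White test (built in), run while the original model is active
estat imtest, white

* Simple White test: squared residuals on fitted values and their square
predict conhat, xb
predict r, residuals
generate r2 = r^2
generate conhat2 = conhat^2
regress r2 conhat conhat2

* nR2 statistic and its chi-squared(2) p-value
display e(N)*e(r2)
display chi2tail(2, e(N)*e(r2))
```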
8) Re-estimate the original regression model with standard errors robust to heteroskedasticity.
What changed?
Not much
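A sketch of the robust re-estimation, storing both sets of results under the names used in the comparison table (ols and robust). The vce(robust) option leaves the coefficients unchanged and replaces only the standard errors.

```stata
* Assumes mid14gss.dta is in the working directory
use mid14gss.dta, clear

* Conventional OLS standard errors
regress conserv age educ inc06 male black rel
estimates store ols

* Same coefficients, heteroskedasticity-robust standard errors
regress conserv age educ inc06 male black rel, vce(robust)
estimates store robust

* Side-by-side comparison of coefficients and standard errors
estimates table ols robust, b(%7.3g) se(%6.3g)
```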
9) Assuming heteroskedasticity is related to education, estimate a weighted least squares
regression. Does this change your conclusions at all?
No
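The worksheet does not state which variance function was assumed for the weighted regression. The sketch below assumes, purely for illustration, that the error variance is proportional to educ, so each observation is weighted by 1/educ. Respondents with educ equal to 0 get a missing weight and drop out of the estimation. The variable name w is hypothetical.

```stata
* Assumes mid14gss.dta is in the working directory
use mid14gss.dta, clear

* ASSUMPTION: Var(error) is proportional to educ, so weight by 1/educ;
* observations with educ == 0 receive a missing weight and are dropped
generate w = 1/educ if educ > 0

* Weighted least squares via analytic weights
regress conserv age educ inc06 male black rel [aweight = w]
estimates store wls
```

A different assumed variance function (for example, proportional to the square of educ) would change the weights and possibly the estimates.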
10) Making no assumptions about the form of heteroskedasticity, estimate a feasible generalized least squares model. Again, do any of your conclusions change?
No
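One common textbook FGLS procedure is sketched below. It models the log of the squared residuals as a linear function of the covariates, then uses the exponentiated fitted values as estimated variances. Whether the worksheet's fgls column was produced exactly this way is an assumption; the names lnr2, ghat, and hhat are hypothetical.

```stata
* Assumes mid14gss.dta is in the working directory
use mid14gss.dta, clear

* Step 1: OLS residuals
regress conserv age educ inc06 male black rel
predict r, residuals

* Step 2: model the log squared residuals with the same covariates
generate lnr2 = ln(r^2)
regress lnr2 age educ inc06 male black rel
predict ghat, xb

* Step 3: estimated variance function, always positive by construction
generate hhat = exp(ghat)

* Step 4: weighted least squares with weights 1/hhat
regress conserv age educ inc06 male black rel [aweight = 1/hhat]
estimates store fgls
```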
. estimates table ols robust wls fgls, stat(r2 rmse) b(%7.3g) se(%6.3g) t(%7.3g)
------------------------------------------------------
    Variable |       ols    robust       wls      fgls
-------------+----------------------------------------
         age |   -.00024   -.00024    .00012   -.00059
             |     .0018     .0017     .0017     .0017
             |     -.138     -.139     .0731     -.352
        educ |    -.0181    -.0181   -.00143    -.0132
             |     .0104      .011     .0121       .01
             |     -1.73     -1.65     -.118     -1.32
       inc06 |     .0254     .0254      .024     .0223
             |     .0061     .0062     .0064     .0069
             |      4.19       4.1      3.75      3.26
        male |      .145      .145      .118       .11
             |     .0586     .0585     .0578     .0553
             |      2.48      2.48      2.05      1.99
       black |     -.771     -.771     -.722     -.665
             |     .0839     .0695     .0736     .0644
             |      -9.2     -11.1      -9.8     -10.3
         rel |      .282      .282       .28      .231
             |      .029     .0296     .0292     .0283
             |       9.7       9.5      9.58      8.16
       _cons |      .124      .124    -.0999     .0878
             |      .168      .168      .175      .157
             |      .733      .733     -.569      .558
-------------+----------------------------------------
          r2 |      .163      .163      .158      .154
        rmse |      .918      .918      .898      .867
------------------------------------------------------
                                        legend: b/se/t