ECON 4550 Econometrics Memorial University of Newfoundland Heteroskedasticity Adapted from Vera Tabakova’s notes 8.1 The Nature of Heteroskedasticity 8.2 Using the Least Squares Estimator 8.3 The Generalized Least Squares Estimator 8.4 Detecting Heteroskedasticity Principles of Econometrics, 3rd Edition Slide 8-2 E ( y ) 1 2 x ei yi E ( yi ) yi 1 2 xi yi 1 2 xi ei Principles of Econometrics, 3rd Edition (8.1) (8.2) (8.3) Slide 8-3 Figure 8.1 Heteroskedastic Errors Principles of Econometrics, 3rd Edition Slide 8-4 E (ei ) 0 var(ei ) 2 cov(ei , e j ) 0 var( yi ) var(ei ) h( xi ) (8.4) Food expenditure example: yˆi 83.42 10.21 xi eˆi yi 83.42 10.21xi Principles of Econometrics, 3rd Edition Slide 8-5 Figure 8.2 Least Squares Estimated Expenditure Function and Observed Data Points Principles of Econometrics, 3rd Edition Slide 8-6 The existence of heteroskedasticity implies: The least squares estimator is still a linear and unbiased estimator, but it is no longer best. There is another estimator with a smaller variance. The standard errors usually computed for the least squares estimator are incorrect. Confidence intervals and hypothesis tests that use these standard errors may be misleading. Principles of Econometrics, 3rd Edition Slide 8-7 yi 1 2 xi ei var(b2 ) var(ei ) 2 (8.5) 2 N ( xi x ) 2 (8.6) i 1 yi 1 2 xi ei Principles of Econometrics, 3rd Edition var(ei ) i2 (8.7) Slide 8-8 N N var(b2 ) wi2i2 i 1 2 2 ( x x ) i i i 1 2 ( x x ) i i 1 N 2 (8.8) N N var(b2 ) wi2eˆi2 i 1 Principles of Econometrics, 3rd Edition 2 2 ( x x ) eˆi i i 1 2 ( x x ) i i 1 N 2 (8.9) Slide 8-9 We can use a robust estimator: regress food_exp income, robust yˆi 83.42 10.21xi (27.46) (1.81) (43.41) (2.09) (White se) (incorrect se) White: b2 tcse(b2 ) 10.21 2.024 1.81 [6.55, 13.87] Incorrect: b2 tcse(b2 ) 10.21 2.024 2.09 [5.97, 14.45] Principles of Econometrics, 3rd Edition Slide 8-10 In SHAZAM the HETCOV option on the OLS command reports the White's heteroskedasticity-consistent standard errors OLS FOOD INCOME / HETCOV Confidence interval estimate using the White standard errors CONFID INCOME / TCRIT=2.024 But the White standard errors reported by SHAZAM have numeric differences compared to our textbook results. There is a correction we could apply, but we will not worry about it for now Principles of Econometrics, 3rd Edition Slide 8-11 yi 1 2 xi ei (8.10) E (ei ) 0 Principles of Econometrics, 3rd Edition var(ei ) i2 cov(ei , e j ) 0 Slide 8-12 yi y xi i var ei i2 2 xi (8.11) 1 xi ei yi 1 2 x x xi xi i i (8.12) 1 x xi i1 Principles of Econometrics, 3rd Edition xi ei x xi ei xi xi i2 (8.13) Slide 8-13 yi 1xi1 2 xi2 ei (8.14) ei 1 1 2 var(e ) var var(ei ) xi 2 x xi xi i (8.15) i Principles of Econometrics, 3rd Edition Slide 8-14 To obtain the best linear unbiased estimator for a model with heteroskedasticity of the type specified in equation (8.11): 1. Calculate the transformed variables given in (8.13). 2. Use least squares to estimate the transformed model given in (8.14). Principles of Econometrics, 3rd Edition Slide 8-15 The generalized least squares estimator is as a weighted least squares estimator. Minimizing the sum of squared transformed errors that is given by: 2 N e 2 i ei x ( xi 1/2ei )2 i 1 i 1 i i 1 N When N xi is small, the data contain more information about the regression function and the observations are weighted heavily. When xi is large, the data contain less information and the observations are weighted lightly. Principles of Econometrics, 3rd Edition Slide 8-16 Food example again, where was the problem coming from? regress food_exp income [aweight = 1/income] yˆi 78.68 10.45 xi (8.16) (se) (23.79) (1.39) ˆ 2 tcse(ˆ 2 ) 10.451 2.024 1.386 [7.65,13.26] Principles of Econometrics, 3rd Edition Slide 8-17 Food example again, where was the problem coming from? Specify a weight variable (SHAZAM works with the inverse) GENR W=1/INCOME OLS FOOD INCOME / WEIGHT=W 95% confidence interval, p. 205 CONFID INCOME / TCRIT=2.024 yˆi 78.68 10.45 xi (se) (23.79) (1.39) (8.16) ˆ 2 tcse(ˆ 2 ) 10.451 2.024 1.386 [7.65,13.26] Principles of Econometrics, 3rd Edition Slide 8-18 The residual statistics reported in a WLS regression (SIGMA**2, STANDARD ERROR OF THE ESTIMATE-SIGMA, and SUM OF SQUARED ERRORS-SSE) are all based on the transformed (weighted) residuals You should remember when making model comparisons, that the highvariance observations are systematically underweighted by this procedure This may be a good thing, if you want to avoid having these observations dominate the model selection comparisons. But if you want model selection to be based on how well the alternative models fit the original (untransformed) data, you must base the model selection tests on the untransformed residuals. var(ei ) i2 2 xi (8.17) ln(i2 ) ln(2 ) ln( xi ) i2 exp ln( 2 ) ln( xi ) (8.18) exp(1 2 zi ) Principles of Econometrics, 3rd Edition Slide 8-20 i2 exp(1 2 zi 2 s ziS ) ln(i2 ) 1 2 zi (8.19) (8.20) yi E ( yi ) ei 1 2 xi ei Principles of Econometrics, 3rd Edition Slide 8-21 ln(eˆi2 ) ln(i2 ) vi 1 2 zi vi (8.21) ln(ˆ i2 ) .9378 2.329 zi ˆ1 ˆ 1zi ) ˆ i2 exp( yi 1 xi ei 1 2 i i i i Principles of Econometrics, 3rd Edition Slide 8-22 ei 1 1 2 var 2 var(ei ) 2 i 1 i i i yi y ˆ i i 1 x ˆ i i1 xi x ˆ i yi 1xi1 2 xi2 ei Principles of Econometrics, 3rd Edition i2 (8.22) (8.23) (8.24) Slide 8-23 yi 1 2 xi 2 k xiK ei var(ei ) i2 exp(1 2 zi 2 Principles of Econometrics, 3rd Edition s ziS ) (8.25) (8.26) Slide 8-24 The steps for obtaining a feasible generalized least squares estimator for 1 , 2 , , K are: 1. Estimate (8.25) by least squares and compute the squares of the least squares residuals eˆi2 . 2. Estimate 1 , 2 , , S by applying least squares to the equation ln eˆi2 1 2 zi 2 Principles of Econometrics, 3rd Edition S ziS vi Slide 8-25 3. Compute variance estimates ˆ i2 exp(ˆ 1 ˆ 2 zi 2 ˆ S ziS. ) 4. Compute the transformed observations defined by (8.23), including xi3 , , xiK if K 2. 5. Apply least squares to (8.24), or to an extended version of (8.24) if K 2 . yˆi 76.05 10.63x (se) (9.71) Principles of Econometrics, 3rd Edition (.97) (8.27) Slide 8-26 For our food expenditure example: gen z = log(income) regress food_exp income predict ehat, residual gen lnehat2 = log(ehat*ehat) regress lnehat2 z * -------------------------------------------* Feasible GLS * -------------------------------------------predict sig2, xb gen wt = exp(sig2) regress food_exp income [aweight = 1/wt] Principles of Econometrics, 3rd Edition Slide 8-27 The HET command can be used for Maximum Likelihood Estimation of the model given in Equations (8.25) and (8.26), p. 207. This method is an alternative estimation method to the GLS method discussed in the text (so the results will also be different): HET FOOD INCOME (INCOME) / MODEL=MULT Principles of Econometrics, 3rd Edition Slide 8-28 Using our wage data (cps2.dta): WAGE 9.914 1.234EDUC .133EXPER 1.524METRO (se) (1.08) (.070) (.015) WAGEMi M 1 2 EDUCMi 3 EXPERMi eMi WAGERi R1 2 EDUCRi 3 EXPERRi eRi (.431) i 1, 2, i 1, 2, bM 1 9.914 1.524 8.39 Principles of Econometrics, 3rd Edition , NM , NR (8.28) (8.29a) (8.29b) ??? Slide 8-29 var(eMi ) 2M ˆ 2M 31.824 var(eRi ) 2R ˆ 2R 15.243 bM 1 9.052 bM 2 1.282 bM 3 .1346 bR1 6.166 bR 2 .956 bR 3 .1260 Principles of Econometrics, 3rd Edition (8.30) Slide 8-30 WAGEMi 1 M 1 M M EDUCMi EXPERMi eMi 2 3 M M M i 1, 2, , NM WAGERi 1 EDUCRi EXPERRi eRi 2 3 R1 R R R R R i 1, 2, Principles of Econometrics, 3rd Edition (8.31a) (8.31b) , NR Slide 8-31 Feasible generalized least squares: 1. Obtain estimated ˆ M and ˆ R by applying least squares separately to the metropolitan and rural observations. ˆ M when METROi 1 2. ˆ i ˆ R when METROi 0 3. Apply least squares to the transformed model WAGEi 1 EDUCi EXPERi METROi ei 2 3 R1 ˆ ˆ ˆ ˆ ˆ i i i i i ˆ i Principles of Econometrics, 3rd Edition (8.32) Slide 8-32 WAGE 9.398 1.196 EDUC .132 EXPER 1.539METRO (se) (1.02) (.069) (.015) (.346) (8.33) . regress wage educ exper metro [aweight = 1/wt] (sum of wgt is 3.7986e+01) Source SS df MS Model Residual 9797.0667 26284.1488 3 996 3265.6889 26.3897076 Total 36081.2155 999 36.1173328 wage Coef. educ exper metro _cons 1.195721 .1322088 1.538803 -9.398362 Principles of Econometrics, 3rd Edition Std. Err. .068508 .0145485 .3462856 1.019673 t 17.45 9.09 4.44 -9.22 Number of obs F( 3, 996) Prob > F R-squared Adj R-squared Root MSE P>|t| 0.000 0.000 0.000 0.000 = = = = = = 1000 123.75 0.0000 0.2715 0.2693 5.1371 [95% Conf. Interval] 1.061284 .1036595 .8592702 -11.39931 1.330157 .160758 2.218336 -7.397408 Slide 8-33 STATA Commands: Principles of Econometrics, 3rd Edition * -------------------------------------------* Rural subsample regression * -------------------------------------------regress wage educ exper if metro == 0 scalar rmse_r = e(rmse) scalar df_r = e(df_r) * -------------------------------------------* Urban subsample regression * -------------------------------------------regress wage educ exper if metro == 1 scalar rmse_m = e(rmse) scalar df_m = e(df_r) * -------------------------------------------* Groupwise heteroskedastic regression using FGLS * -------------------------------------------gen rural = 1 - metro gen wt=(rmse_r^2*rural) + (rmse_m^2*metro) regress wage educ exper metro [aweight = 1/wt] Slide 8-34 Remark: To implement the generalized least squares estimators described in this Section for three alternative heteroskedastic specifications, an assumption about the form of the heteroskedasticity is required. Using least squares with White standard errors avoids the need to make an assumption about the form of heteroskedasticity, but does not realize the potential efficiency gains from generalized least squares. Principles of Econometrics, 3rd Edition Slide 8-35 8.4.1 Residual Plots Estimate the model using least squares and plot the least squares residuals. With more than one explanatory variable, plot the least squares residuals against each explanatory variable, or against yˆ i , to see if those residuals vary in a systematic way relative to the specified variable. Principles of Econometrics, 3rd Edition Slide 8-36 8.4.2 The Goldfeld-Quandt Test ˆ 2M 2M F 2 2 ˆ R R F( NM KM , N R K R ) H0 : 2M 2R against H0 : 2M 2R (8.34) (8.35) ˆ 2M 31.824 F 2 2.09 ˆ R 15.243 Principles of Econometrics, 3rd Edition Slide 8-37 8.4.2 The Goldfeld-Quandt Test * -------------------------------------------* Goldfeld Quandt test * -------------------------------------------scalar GQ = rmse_m^2/rmse_r^2 scalar crit = invFtail(df_m,df_r,.05) scalar pvalue = Ftail(df_m,df_r,GQ) scalar list GQ pvalue crit ˆ 2M 31.824 F 2 2.09 ˆ R 15.243 Principles of Econometrics, 3rd Edition Slide 8-38 8.4.2 The Goldfeld-Quandt Test ˆ 12 3574.8 ˆ 22 12,921.9 For the food expenditure data You should now be able to obtain this test statistic And check whether it exceeds the critical value ˆ 22 12,921.9 F 2 3.61 ˆ 1 3574.8 Principles of Econometrics, 3rd Edition Slide 8-39 8.4.2 The Goldfeld-Quandt Test Sort the data by income: SORT INCOME FOOD / DESC OLS FOOD INCOME On the DIAGNOS command the CHOWONE= option reports the Goldfeld-Quandt test for heteroskedasticity (bottom of page 212) with a p-value for a one-sided test. The HET option reports the tests for heteroskedasticity reported on page 215. DIAGNOS / CHOWONE=20 HET Principles of Econometrics, 3rd Edition Slide 8-40 8.4.2 The Goldfeld-Quandt Test SORT INCOME FOOD / DESC OLS FOOD INCOME On the DIAGNOS command the CHOWONE= option reports the Goldfeld-Quandt test for heteroskedasticity (bottom of page 212) with a p-value for a one-sided test. The HET option reports the tests for heteroskedasticity reported on page 215. DIAGNOS / CHOWONE=20 HET Of course, this option also computes the Chow test statistic for structural change (that is, tests the null of parameter stability in the two subsamples). Principles of Econometrics, 3rd Edition Slide 8-41 8.4.2 The Goldfeld-Quandt Test SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS N1 N2 SSE1 SSE2 CHOW PVALUE G-Q DF1 DF2 PVALUE 20 20 0.23259E+06 64346. 0.45855 0.636 3.615 18 18 0.005 CHOW TEST - F DISTRIBUTION WITH DF1= 2 AND DF2= 36 Principles of Econometrics, 3rd Edition Slide 8-42 8.4.2 The Goldfeld-Quandt Test SHAZAM considers that the alternative hypothesis is smaller error variance in the second subset relative to the first subset. Some authors present the alternative as larger variance in the second subset. Goldfeld and Quandt recommend ordering the observations by the values of one of the explanatory variables. This can be done with the SORT command in SHAZAM. The DESC option on the SORT command should be used if it is assumed that the variance is positively related to the value of the sort variable. Principles of Econometrics, 3rd Edition Slide 8-43 8.4.3 Testing the Variance Function yi E ( yi ) ei 1 2 xi 2 K xiK ei var( yi ) i2 E(ei2 ) h(1 2 zi 2 h(1 2 zi 2 (8.36) S ziS ) S ziS ) exp(1 2 zi 2 (8.37) S ziS ) h(1 2 zi ) exp ln(2 ) ln( xi ) Principles of Econometrics, 3rd Edition Slide 8-44 8.4.3 Testing the Variance Function h(1 2 zi 2 S ziS ) 1 2 zi 2 h(1 2 zi 2 H 0 : 2 3 S ziS (8.38) S ziS ) h(1 ) S 0 (8.39) H1 : not all the s in H 0 are zero Principles of Econometrics, 3rd Edition Slide 8-45 8.4.3 Testing the Variance Function var( yi ) i2 E(ei2 ) 1 2 zi 2 ei2 E(ei2 ) vi 1 2 zi 2 eˆi2 1 2 zi 2 2 N R 2 Principles of Econometrics, 3rd Edition S ziS (8.40) S ziS vi (8.41) S ziS vi (8.42) (2S 1) (8.43) Slide 8-46 8.4.3a The White Test E ( yi ) 1 2 xi 2 3 xi 3 z2 x2 Principles of Econometrics, 3rd Edition z3 x3 z4 x22 z5 x32 Slide 8-47 8.4.3b Testing the Food Expenditure Example SST 4,610,749,441 SSE R 1 .1846 SST whitetst 2 Or estat imtest, white SSE 3,759,556,169 2 N R 2 40 .1846 7.38 2 N R 2 40 .18888 7.555 Principles of Econometrics, 3rd Edition p-value .023 Slide 8-48 DIAGNOS / HET Will yield a battery of heteroskedasticity tests using different specifications DIAGNOS\HET REQUIRED MEMORY IS PAR= 7 CURRENT PAR= 22480 DEPENDENT VARIABLE = FOOD 40 OBSERVATIONS REGRESSION COEFFICIENTS Same in SLR 10.2096426868 83.4160065402 HETEROSKEDASTICITY TESTS CHI-SQUARE D.F. P-VALUE TEST STATISTIC E**2 ON YHAT: 7.384 1 0.00658 E**2 ON YHAT**2: 7.549 1 0.00600 E**2 ON LOG(YHAT**2): 6.516 1 0.01069 E**2 ON LAG(E**2) ARCH TEST: 0.089 1 0.76544 LOG(E**2) ON X (HARVEY) TEST: 10.654 1 0.00110 ABS(E) ON X (GLEJSER) TEST: 11.466 1 0.00071 E**2 ON X TEST: KOENKER(R2): 7.384 1 0.00658 B-P-G (SSR) : 7.344 1 0.00673 E**2 ON X X**2 (WHITE) TEST: KOENKER(R2): 7.555 2 0.02288 B-P-G (SSR) : 7.514 2 0.02336 Breusch-Pagan test generalized least squares Goldfeld-Quandt test heteroskedastic partition heteroskedasticity heteroskedasticity-consistent standard errors homoskedasticity Lagrange multiplier test mean function residual plot transformed model variance function weighted least squares White test Principles of Econometrics, 3rd Edition Slide 8-51 Principles of Econometrics, 3rd Edition Slide 8-52 yi 1 2 xi ei E (ei ) 0 var(ei ) i2 cov(ei , e j ) 0 (i j ) b2 2 wi ei wi Principles of Econometrics, 3rd Edition (8A.1) xi x xi x 2 Slide 8-53 E b2 E 2 E wi ei 2 wi E ei 2 Principles of Econometrics, 3rd Edition Slide 8-54 var b2 var wi ei wi2 var ei wi w j cov ei , e j i j w 2 i 2 i (8A.2) 2 2 ( x x ) i i 2 2 ( xi x ) Principles of Econometrics, 3rd Edition Slide 8-55 var(b2 ) Principles of Econometrics, 3rd Edition 2 xi x 2 (8A.3) Slide 8-56 eˆi2 1 2 zi 2 S ziS vi (8B.1) ( SST SSE ) / ( S 1) F SSE / ( N S ) N SST eˆi2 eˆ2 i 1 Principles of Econometrics, 3rd Edition 2 (8B.2) N and SSE vˆi2 i 1 Slide 8-57 SST SSE ( S 1) F SSE / ( N S ) 2 SSE var(e ) var(vi ) N S 2 i 2 SST SSE 2 i (2S 1) (8B.3) (8B.4) (8B.5) var(e ) Principles of Econometrics, 3rd Edition Slide 8-58 SST SSE 2ˆ e4 2 ei2 var 2 2 e 1 2 var( e i )2 4 e 1 N 2 2 2 SST var(e ) (eˆi eˆ ) N i 1 N 2 i Principles of Econometrics, 3rd Edition (8B.6) var(ei2 ) 2e4 (8B.7) Slide 8-59 SST SSE SST / N 2 SSE N 1 SST (8B.8) N R2 Principles of Econometrics, 3rd Edition Slide 8-60