Heteroskedasticity

The Nature of Heteroskedasticity

Heteroskedasticity is a systematic pattern in the errors in which the variances of the errors are not constant. Ordinary least squares assumes that all observations are equally reliable (constant variance). For efficiency (accurate estimation and prediction), the observations are re-weighted to ensure equal error variance.

Regression Model

$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$

zero mean: $E(\varepsilon_t) = 0$
homoskedasticity: $\mathrm{var}(\varepsilon_t) = \sigma^2$
nonautocorrelation: $\mathrm{cov}(\varepsilon_t, \varepsilon_s) = 0$ for $t \neq s$
heteroskedasticity: $\mathrm{var}(\varepsilon_t) = \sigma_t^2$

[Figure: homoskedastic pattern of errors — consumption $y_t$ plotted against income $x_t$; the scatter has constant spread. The homoskedastic case: the conditional densities $f(y_t)$ at $x_1, x_2, x_3, x_4$ all have the same variance.]

[Figure: heteroskedastic pattern of errors — consumption $y_t$ plotted against income $x_t$; the scatter fans out as income grows. The heteroskedastic case: the conditional density $f(y_t)$ is more dispersed for rich people than for poor people.]

Properties of Least Squares

1. Least squares is still linear and unbiased.
2. Least squares is NOT efficient.
3. Hence, it is no longer B.L.U.E.
4. The usual formulas give incorrect standard errors for least squares.
5. Confidence intervals and hypothesis tests based on the usual standard errors are wrong.

For the model $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$ under heteroskedasticity ($E(\varepsilon_t) = 0$, $\mathrm{var}(\varepsilon_t) = \sigma_t^2$, $\mathrm{cov}(\varepsilon_t, \varepsilon_s) = 0$ for $t \neq s$):

$b_2 = \sum w_t y_t = \beta_2 + \sum w_t \varepsilon_t$, where $w_t = \dfrac{x_t - \bar{x}}{\sum (x_t - \bar{x})^2}$  (linear)

$E(b_2) = \beta_2 + \sum w_t E(\varepsilon_t) = \beta_2$  (unbiased)

Incorrect formula for the least squares variance (it assumes homoskedasticity):

$\mathrm{var}(b_2) = \dfrac{\sigma^2}{\sum (x_t - \bar{x})^2}$

Correct formula for the least squares variance:

$\mathrm{var}(b_2) = \dfrac{\sum \sigma_t^2 (x_t - \bar{x})^2}{\left[ \sum (x_t - \bar{x})^2 \right]^2}$

Halbert White Standard Errors

White estimator of the least squares variance:

$\widehat{\mathrm{var}}(b_2) = \dfrac{\sum \hat{\varepsilon}_t^2 (x_t - \bar{x})^2}{\left[ \sum (x_t - \bar{x})^2 \right]^2}$

where $\hat{\varepsilon}_t^2$ are the squares of the least squares residuals. In large samples, the White standard error (the square root of the estimated variance) is a consistent measure.
Because the squared residuals are used to approximate the variances, White's estimator is strictly appropriate only in large samples.

Two Types of Heteroskedasticity

1. Proportional heteroskedasticity (a continuous function of $x_t$, for example). For example, income is less important as an explanatory variable for the food expenditure of high-income families: it is harder to guess their food expenditure.
2. Partitioned heteroskedasticity (discrete categories or groups). For instance, exchange rates were more volatile after the Asian Financial Crisis.

Proportional Heteroskedasticity

$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$

where $E(\varepsilon_t) = 0$, $\mathrm{var}(\varepsilon_t) = \sigma_t^2 = \sigma^2 x_t$, and $\mathrm{cov}(\varepsilon_t, \varepsilon_s) = 0$ for $t \neq s$.

The variance is assumed to be proportional to the value of $x_t$, so the standard deviation is proportional to $\sqrt{x_t}$:

variance: $\mathrm{var}(\varepsilon_t) = \sigma_t^2 = \sigma^2 x_t$
standard deviation: $\sigma_t = \sigma \sqrt{x_t}$

To correct for heteroskedasticity, divide the model by $\sqrt{x_t}$:

$\dfrac{y_t}{\sqrt{x_t}} = \beta_1 \dfrac{1}{\sqrt{x_t}} + \beta_2 \dfrac{x_t}{\sqrt{x_t}} + \dfrac{\varepsilon_t}{\sqrt{x_t}}$

It is important to recognize that $\beta_1$ and $\beta_2$ are the same in the transformed model as in the untransformed model. Writing $y_t^* = y_t/\sqrt{x_t}$, $x_{t1}^* = 1/\sqrt{x_t}$, $x_{t2}^* = \sqrt{x_t}$, and $\varepsilon_t^* = \varepsilon_t/\sqrt{x_t}$, the transformed model is

$y_t^* = \beta_1 x_{t1}^* + \beta_2 x_{t2}^* + \varepsilon_t^*$

with

$\mathrm{var}(\varepsilon_t^*) = \mathrm{var}\!\left( \dfrac{\varepsilon_t}{\sqrt{x_t}} \right) = \dfrac{1}{x_t} \mathrm{var}(\varepsilon_t) = \dfrac{1}{x_t} \sigma^2 x_t = \sigma^2$

$\varepsilon_t$ is heteroskedastic, but $\varepsilon_t^*$ is homoskedastic.

Generalized Least Squares

These steps describe weighted least squares:

1. Decide which variable is proportional to the heteroskedasticity ($x_t$ in the previous example).
2. Divide all terms in the original model by the square root of that variable (divide by $\sqrt{x_t}$).
3. Run least squares on the transformed model, which has the new variables $y_t^*$, $x_{t1}^*$, and $x_{t2}^*$ but no intercept.

The errors are weighted by the reciprocal of $\sqrt{x_t}$. When $x_t$ is small, the data contain more information about the regression function and the observations are weighted heavily. When $x_t$ is large, the data contain less information and the observations are weighted lightly. In this way we take advantage of the heteroskedasticity to improve parameter estimation (efficiency).
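The three weighted least squares steps above can be sketched in pure Python for the proportional case. This is a minimal illustration under the assumption $\mathrm{var}(\varepsilon_t) = \sigma^2 x_t$; the function name and the toy data are mine, not from the notes.

```python
import math

def wls_proportional(x, y):
    """Weighted least squares for var(e_t) = sigma^2 * x_t."""
    # Step 2: divide every term in y = b1 + b2*x + e by sqrt(x_t).
    ystar = [yi / math.sqrt(xi) for xi, yi in zip(x, y)]
    x1 = [1 / math.sqrt(xi) for xi in x]  # transformed intercept column
    x2 = [math.sqrt(xi) for xi in x]      # transformed slope column
    # Step 3: least squares on the two transformed regressors, no intercept.
    # Solve the 2x2 normal equations by Cramer's rule.
    a11 = sum(v * v for v in x1)
    a12 = sum(u * v for u, v in zip(x1, x2))
    a22 = sum(v * v for v in x2)
    c1 = sum(u * v for u, v in zip(x1, ystar))
    c2 = sum(u * v for u, v in zip(x2, ystar))
    det = a11 * a22 - a12 * a12
    b1 = (c1 * a22 - c2 * a12) / det
    b2 = (a11 * c2 - a12 * c1) / det
    return b1, b2
```

Because $\beta_1$ and $\beta_2$ are unchanged by the transformation, fitting the transformed model returns estimates of the original intercept and slope.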
Partitioned Heteroskedasticity

$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t, \quad t = 1, \dots, 100$

where $y_t$ = bushels per acre of corn and $x_t$ = gallons of water per acre (rain or other).

Error variance of field corn: $\mathrm{var}(\varepsilon_t) = \sigma_1^2$, $t = 1, \dots, 80$
Error variance of sweet corn: $\mathrm{var}(\varepsilon_t) = \sigma_2^2$, $t = 81, \dots, 100$

Re-weighting Each Group's Observations

Field corn ($\mathrm{var}(\varepsilon_t) = \sigma_1^2$, $t = 1, \dots, 80$):

$\dfrac{y_t}{\sigma_1} = \beta_1 \dfrac{1}{\sigma_1} + \beta_2 \dfrac{x_t}{\sigma_1} + \dfrac{\varepsilon_t}{\sigma_1}$

Sweet corn ($\mathrm{var}(\varepsilon_t) = \sigma_2^2$, $t = 81, \dots, 100$):

$\dfrac{y_t}{\sigma_2} = \beta_1 \dfrac{1}{\sigma_2} + \beta_2 \dfrac{x_t}{\sigma_2} + \dfrac{\varepsilon_t}{\sigma_2}$

Combining the groups, with $\sigma_i$ the standard deviation of observation $t$'s group:

$\dfrac{y_t}{\sigma_i} = \beta_1 \dfrac{1}{\sigma_i} + \beta_2 \dfrac{x_t}{\sigma_i} + \dfrac{\varepsilon_t}{\sigma_i}, \quad t = 1, \dots, 100$

$y_t^* = \beta_1 x_{t1}^* + \beta_2 x_{t2}^* + \varepsilon_t^*, \quad t = 1, \dots, 100$

$\mathrm{var}(\varepsilon_t^*) = \mathrm{var}\!\left( \dfrac{\varepsilon_t}{\sigma_i} \right) = \dfrac{1}{\sigma_i^2} \mathrm{var}(\varepsilon_t) = \dfrac{1}{\sigma_i^2} \sigma_i^2 = 1$

$\varepsilon_t$ is heteroskedastic, but $\varepsilon_t^*$ is homoskedastic.

Apply Generalized Least Squares

Run least squares separately on the data for each group.

$\hat{\sigma}_1^2$ (MSE$_1$) provides an estimator of $\sigma_1^2$ using the 80 observations on field corn.
$\hat{\sigma}_2^2$ (MSE$_2$) provides an estimator of $\sigma_2^2$ using the 20 observations on sweet corn.

Detecting Heteroskedasticity

To determine the existence and nature of heteroskedasticity:

1. Residual plots provide information on the exact nature of the heteroskedasticity (partitioned or proportional) to aid in correcting for it.
2. The Goldfeld-Quandt test checks for the presence of heteroskedasticity.

Residual Plots

Plot the residuals against one variable at a time, after sorting the data by that variable, to look for a heteroskedastic pattern.

[Figure: residuals $e_t$ plotted against $x_t$, fanning out around zero as $x_t$ increases.]

Goldfeld-Quandt Test

The Goldfeld-Quandt test can be used to detect heteroskedasticity in either the proportional case or for comparing two groups in the discrete case. For proportional heteroskedasticity, it is first necessary to determine which variable, such as $x_t$, is proportional to the error variance. Then sort the data from the largest to the smallest values of that variable.
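The partitioned re-weighting described earlier depends on $\hat{\sigma}_1^2$ and $\hat{\sigma}_2^2$ from separate regressions on each group. A minimal pure-Python sketch (the helper names and the toy data in the accompanying example are illustrative, not from the notes): each group's estimated variance is $\mathrm{SSE}_i / (T_i - 2)$.

```python
def ols_residuals(x, y):
    # Simple least squares fit y = b1 + b2*x; returns the residuals.
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
        / sum((xi - xbar) ** 2 for xi in x)
    b1 = ybar - b2 * xbar
    return [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]

def group_variances(x, y, split):
    # Separate regression per group: sigma_hat_i^2 = SSE_i / (T_i - 2).
    variances = []
    for xs, ys in [(x[:split], y[:split]), (x[split:], y[split:])]:
        e = ols_residuals(xs, ys)
        variances.append(sum(ei ** 2 for ei in e) / (len(xs) - 2))
    return variances
```

Dividing each group's $y_t$, intercept column, and $x_t$ by its $\hat{\sigma}_i$ then yields the (approximately) homoskedastic pooled model.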
In the proportional case, drop the middle $r$ observations (where $r \approx T/6$) and run separate least squares regressions on the first $T_1$ observations and the last $T_2$ observations.

$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 > \sigma_2^2$

Goldfeld-Quandt test statistic (use the F table):

$GQ = \dfrac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2} \sim F_{[T_1 - K_1,\, T_2 - K_2]}$

We assume that $\hat{\sigma}_1^2 > \hat{\sigma}_2^2$. (If not, reverse the subscripts.) Small values of GQ support $H_0$, while large values support $H_1$.
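The full test can be sketched in pure Python for the simple-regression case (function names and toy data are mine): sort by the chosen variable from largest to smallest, drop the middle $r$ observations, fit each part separately, and form the ratio of estimated variances.

```python
def sigma_hat_sq(x, y):
    # sigma_hat^2 = SSE / (T - K) for a simple regression (K = 2).
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
        / sum((xi - xbar) ** 2 for xi in x)
    b1 = ybar - b2 * xbar
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)

def goldfeld_quandt(x, y, r):
    # Sort from largest to smallest x, drop the middle r observations,
    # fit each part separately, and form GQ = sigma_hat_1^2 / sigma_hat_2^2.
    pairs = sorted(zip(x, y), reverse=True)
    t1 = (len(pairs) - r) // 2
    part1, part2 = pairs[:t1], pairs[t1 + r:]
    return sigma_hat_sq(*zip(*part1)) / sigma_hat_sq(*zip(*part2))
```

The resulting GQ value would then be compared with the $F_{[T_1-2,\,T_2-2]}$ critical value.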