HETEROSCEDASTICITY

The assumption of equal variance, $\mathrm{Var}(u_i) = \sigma^2$ for all $i$, is called homoscedasticity, which means “equal scatter” (of the error terms $u_i$ around their mean 0).

Equivalently, the dispersion of the observed values of Y around the regression line is the same across all observations.

If the assumption of homoscedasticity does not hold, we have heteroscedasticity (unequal scatter).

Consequences of ignoring heteroscedasticity during the OLS procedure

• The estimates, and the forecasts based on them, will still be unbiased and consistent.
• However, the OLS estimates are no longer the best (the B in BLUE) and are therefore inefficient. Forecasts will also be inefficient.
• The estimated variances and covariances of the regression coefficients will be biased and inconsistent, and hence the t- and F-tests will be invalid.

Testing for heteroscedasticity

1. Before any formal tests, visually examine the model’s residuals $\hat u_i$. Graph the $\hat u_i$ or $\hat u_i^2$ separately against each explanatory variable $X_j$, or against $\hat Y_i$, the fitted values of the dependent variable.

[Figure: Example of heteroscedasticity. Residuals plotted against Income (X), ordered by size.]

2. The Goldfeld-Quandt test

Step 1. Arrange the data from small to large values of the independent variable $X_j$.
Step 2. Run two separate regressions, one for small values of $X_j$ and one for large values of $X_j$, omitting the d middle observations (approximately 20%), and record the residual sum of squares (RSS) for each regression: $RSS_1$ for the small values of $X_j$ and $RSS_2$ for the large values.
Step 3. Calculate the ratio $F = RSS_2 / RSS_1$, which has an F distribution with $[n - d - 2(k+1)]/2$ d.f. in both the numerator and the denominator, where n is the total number of observations, d is the number of omitted observations, and k is the number of explanatory variables.
Step 4. Reject $H_0$: all the variances $\sigma_i^2$ are equal (i.e., homoscedastic) if $F > F_{cr}$, where $F_{cr}$ is found in the table of the F distribution for $[n - d - 2(k+1)]/2$ d.f.
and for a predetermined level of significance α, typically 5%.

Drawbacks of the Goldfeld-Quandt test

• It cannot accommodate situations where several variables jointly cause heteroscedasticity.
• The middle d observations are lost.

3. Lagrange Multiplier (LM) tests (for large n > 30)

The Breusch-Pagan test

Step 1. Run the regression of $\hat u_i^2$ on all the explanatory variables. In our example (CN p. 37) there is only one explanatory variable, $X_1$, so the model for the OLS estimation has the form:
$\hat u_i^2 = \alpha_0 + \alpha_1 X_{1i} + v_i$
Step 2. Keep the $R^2$ from this regression; call it $R^2_{\hat u^2}$. Calculate either
(a) $F = \dfrac{R^2_{\hat u^2}/k}{(1 - R^2_{\hat u^2})/[n-(k+1)]}$, where k is the number of explanatory variables; the F statistic has an F distribution with $[k,\ n-(k+1)]$ d.f. Reject $H_0$: all the variances $\sigma_i^2$ are equal (i.e., homoscedastic) if $F > F_{cr}$;
or
(b) $LM = n R^2_{\hat u^2}$, where LM is called the Lagrange multiplier (LM) statistic and has an asymptotic chi-square ($\chi^2$) distribution with k d.f. Reject $H_0$: all the variances $\sigma_i^2$ are equal (i.e., homoscedastic) if $LM > \chi^2_{cr}$.

Drawbacks of the Breusch-Pagan test

• It has been shown to be sensitive to any violation of the normality assumption.
• Three other popular LM tests, the Glejser test, the Harvey-Godfrey test, and the Park test, are also sensitive to such violations (they won’t be covered in this course).

One LM test, the White test, does not depend on the normality assumption; therefore it is recommended over all the other tests.

The White test

Step 1. The test is based on the regression of $\hat u^2$ on all the explanatory variables ($X_j$), their squares ($X_j^2$), and all their cross products. E.g., when the model contains k = 2 explanatory variables, the test is based on an estimation of the model:
$\hat u^2 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1^2 + \beta_4 X_2^2 + \beta_5 X_1 X_2 + v$
Step 2. Compute the statistic $\chi^2 = n R^2_{\hat u^2}$, where n is the sample size and $R^2_{\hat u^2}$ is the unadjusted R-squared from the OLS regression in Step 1. This statistic has an asymptotic chi-square ($\chi^2$) distribution with d.f.
= k, where k here is the number of ALL explanatory variables in the AUXILIARY model (5 in the two-variable example), not the number in the original model. Reject $H_0$: all the variances $\sigma_i^2$ are equal (i.e., homoscedastic) if $\chi^2 > \chi^2_{cr}$.

Estimation Procedures when H0 is rejected

• 1. Heteroscedasticity with a known proportional factor

If it can be assumed that the error variance is proportional to the square of an independent variable, say $\mathrm{Var}(u_i) = \sigma^2 X_{1i}^2$, we can correct for heteroscedasticity by dividing every term of the regression by $X_{1i}$ and then reestimating the model using the transformed variables. In the two-variable case, we reestimate the following model (CN, p. 39):
$Y_i/X_{1i} = \beta_0/X_{1i} + \beta_1 + u_i/X_{1i}$
The transformed error $u_i/X_{1i}$ then has the constant variance $\sigma^2$.

• 2. Heteroscedasticity consistent covariance matrix (HCCM)

As we know, the usual OLS inference is faulty in the presence of heteroscedasticity because the estimators of the variances $\mathrm{Var}(b_j)$ are then biased. New ways have therefore been developed for estimating heteroscedasticity-robust variances. The most popular is the HCCM procedure proposed by White.

The heteroscedasticity consistent covariance matrix (HCCM) procedure

Let’s consider the model:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$

• Step 1. Estimate the initial model by the OLS method. Let $\hat u_i$ denote the OLS residuals from the initial regression of Y on $X_1, X_2, \dots, X_k$.
• Step 2. Run the OLS regression of $X_j$ (each time for a different j) on all the other independent variables. Let $\hat w_{ij}$ denote the ith residual from regressing $X_j$ on all the other independent variables.
• Step 3. Let $RSS_j$ be the residual sum of squares from this regression: $RSS_j = S_{X_j X_j}(1 - R_j^2)$, where $S_{X_j X_j}$ is the total variation of $X_j$ about its mean and $R_j^2$ is the R-squared of this auxiliary regression. $RSS_j$ can also be calculated as (d.f.) × $SER^2$, where SER is the standard error of this auxiliary regression, which can easily be found in Excel’s regression output, and d.f. is its residual degrees of freedom, i.e. n minus the number of coefficients estimated in it (n − k when $X_j$ is regressed on a constant and the other k − 1 variables).
• Step 4. The heteroscedasticity-robust variance $\mathrm{Var}(b_j)$ can be calculated as follows:
$\mathrm{Var}(b_j) = \dfrac{\sum_i \hat w_{ij}^2 \hat u_i^2}{RSS_j^2}$
The square root of $\mathrm{Var}(b_j)$ is called the heteroscedasticity-robust standard error for $b_j$. Example: CN, p. 44.

• 3.
Feasible Generalized Least Squares (FGLS) method

Step 1. Compute the residuals $\hat u_i$ from the OLS estimation of the initial regression model.
Step 2. Regress $\hat u_i^2$ against a constant term and all the explanatory variables from either the Breusch-Pagan test for heteroscedasticity (e.g., when k = 2: $\hat u_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + v_i$) or the White test for heteroscedasticity: $\hat u_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \alpha_3 X_{1i}^2 + \alpha_4 X_{2i}^2 + \alpha_5 X_{1i} X_{2i} + v_i$
Step 3. Estimate the original model by OLS using the weights $z_i = 1/\hat\sigma_i$, where the $\hat\sigma_i^2$ are the predicted values of the dependent variable (the $\hat u_i^2$) in the Breusch-Pagan (or White) model. Every variable, including the column of ones for the intercept, is multiplied by $z_i$, so the transformed model must be estimated without a separate constant term. This OLS procedure is called WLS (weighted least squares).

The predicted values $\hat\sigma_i^2$ may not be positive for some observations, in which case the corresponding weights $z_i = 1/\hat\sigma_i$ cannot be calculated. For those observations we can use the original $\hat u_i^2$ instead and take their positive square roots.
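
The three FGLS steps can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the procedure from the course notes: the function names (ols, fgls_weights, fgls) are hypothetical, Step 2 uses the Breusch-Pagan form of the auxiliary regression, the small floor eps is an added safeguard against division by zero, and the data are made up so that the error spread grows with X.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients for y = X b + u; X must already contain a column of ones."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def fgls_weights(sigma2_hat, u_hat):
    """Weights z_i = 1 / sigma_i built from the predicted variances.

    Where a predicted variance is not positive, fall back to |u_hat_i|,
    the positive square root of the squared residual.
    """
    eps = 1e-12  # assumption: added floor, not part of the notes
    sigma = np.where(sigma2_hat > 0, np.sqrt(np.abs(sigma2_hat)), np.abs(u_hat))
    return 1.0 / np.maximum(sigma, eps)


def fgls(X, y):
    """FGLS with a Breusch-Pagan-type model for the error variance."""
    # Step 1: OLS on the original model; keep the residuals.
    b_ols = ols(X, y)
    u_hat = y - X @ b_ols
    # Step 2: regress the squared residuals on a constant and all regressors.
    alpha = ols(X, u_hat ** 2)
    sigma2_hat = X @ alpha
    # Step 3: multiply every column (including the ones) and y by z_i,
    # then run OLS on the transformed data with no extra constant.
    z = fgls_weights(sigma2_hat, u_hat)
    b_fgls = ols(X * z[:, None], y * z)
    return b_ols, b_fgls, z


# Made-up illustrative data: true model Y = 2 + 3X + u, error spread grows with X.
x = np.linspace(1.0, 10.0, 50)
signs = np.where(np.arange(x.size) % 2 == 0, 1.0, -1.0)
y = 2.0 + 3.0 * x + signs * np.sqrt(x)
X = np.column_stack([np.ones_like(x), x])

b_ols, b_fgls, z = fgls(X, y)
print("OLS :", b_ols)
print("FGLS:", b_fgls)
```

To use the White form of Step 2 instead, the auxiliary regression would take a design matrix that also contains the squares and cross products of the regressors, while Step 3 would still weight the columns of the original model.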