8. Heteroskedasticity

We have already seen that homoskedasticity holds when the error term's variance, conditional on all x variables, is constant:

\mathrm{Var}(u \mid X) = \sigma^2

Homoskedasticity fails if the variance of the error term varies across the sample (i.e., varies with the x variables).
-We used homoskedasticity for t tests, F tests, and confidence intervals, even with large samples

8.1 Consequences of Heteroskedasticity for OLS
8.2 Heteroskedasticity-Robust Inference after OLS Estimation
8.3 Testing for Heteroskedasticity
8.4 Weighted Least Squares Estimation
8.5 The Linear Probability Model Revisited

8.1 Consequences of Heteroskedasticity

We have already seen that heteroskedasticity:
1) Does not cause bias or inconsistency (these properties depend only on MLR.1 through MLR.4)
2) Does not affect R^2 or adjusted R^2 (since these consistently estimate the population R^2, which involves the unconditional variances of u and y)

Heteroskedasticity does:
1) Bias the usual estimators of \mathrm{Var}(\hat{\beta}_j), and therefore invalidate the usual OLS standard errors (and the tests based on them)
2) Make OLS no longer BLUE (a better estimator may exist)

8.2 Heteroskedasticity-Robust Inference after OLS Estimation

-Because testing hypotheses is a key element of econometrics, we need to obtain accurate standard errors in the presence of heteroskedasticity
-In the last few decades, econometricians have learned how to adjust standard errors when HETEROSKEDASTICITY OF UNKNOWN FORM exists
-These heteroskedasticity-robust procedures are valid (in large samples) regardless of the form of the error variance

8.2 Het Fixing 1

-Given a typical single-independent-variable model, heteroskedasticity implies a varying error variance:

y_i = \beta_0 + \beta_1 x_i + u_i, \qquad \mathrm{Var}(u_i \mid x_i) = \sigma_i^2

-Rewriting the OLS slope estimator, we can obtain a formula for its variance:

\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x}) u_i}{\sum_i (x_i - \bar{x})^2}

\mathrm{Var}(\hat{\beta}_1) = \frac{\sum_i (x_i - \bar{x})^2 \sigma_i^2}{SST_x^2}

-Recall that SST_x = \sum_i (x_i - \bar{x})^2
-Also notice that given homoskedasticity (\sigma_i^2 = \sigma^2), this collapses to the usual \mathrm{Var}(\hat{\beta}_1) = \sigma^2 / SST_x
-While we don't know \sigma_i^2, White (1980) showed that a valid estimator is:

\widehat{\mathrm{Var}}(\hat{\beta}_1) = \frac{\sum_i (x_i - \bar{x})^2 \hat{u}_i^2}{SST_x^2}

-Given a multiple-independent-variable model:

y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u

-the valid estimator of \mathrm{Var}(\hat{\beta}_j) becomes:

\widehat{\mathrm{Var}}(\hat{\beta}_j) = \frac{\sum_i \hat{r}_{ij}^2 \hat{u}_i^2}{SSR_j^2} \qquad (8.4)

-where \hat{r}_{ij} is the ith residual from a regression of x_j on all other x variables
-where SSR_j is the sum of squared residuals from that regression

-The square root of this variance estimate is commonly called the HETEROSKEDASTICITY-ROBUST STANDARD ERROR; these are also called White, Huber, or Eicker standard errors, after the researchers who developed them
-There are a variety of slight adjustments to this standard error, but economists generally use the values reported by their software
-This standard error adjustment gives us HETEROSKEDASTICITY-ROBUST T STATISTICS:

t = \frac{\text{estimate} - \text{hypothesized value}}{\text{standard error}}

8.2 Why Bother with Normal Errors?

-One may ask why we bother with the usual OLS standard errors when heteroskedasticity-robust standard errors are valid more often:
1) Under the classical assumptions (including normal errors), the usual OLS t statistics have exact t distributions, regardless of sample size
2) Robust t statistics are valid only for large sample sizes

Note that HETEROSKEDASTICITY-ROBUST F STATISTICS also exist, often called HETEROSKEDASTICITY-ROBUST WALD STATISTICS, and are reported by most econometrics programs.
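To make the robust variance formula concrete, here is a minimal sketch in Python, assuming numpy and statsmodels are available; the simulated data-generating process, coefficient values, and variable names are invented for illustration, not taken from the notes.

```python
import numpy as np
import statsmodels.api as sm

# Invented data: error variance grows with x, so errors are heteroskedastic
rng = np.random.default_rng(42)
n = 500
x = rng.uniform(1, 10, size=n)
u = rng.normal(0, x)                    # Var(u_i | x_i) = x_i^2
y = 2.0 + 0.5 * x + u

X = sm.add_constant(x)                  # regressor matrix with an intercept column
ols = sm.OLS(y, X).fit()
print(ols.bse)                          # usual (homoskedasticity-only) standard errors

robust = ols.get_robustcov_results(cov_type="HC0")   # White's (1980) estimator
print(robust.bse)                       # heteroskedasticity-robust standard errors

# Manual check of White's formula for the slope:
#   Vhat(b1) = sum_i (x_i - xbar)^2 * uhat_i^2 / SST_x^2
uhat = ols.resid
sst_x = np.sum((x - x.mean()) ** 2)
se_b1 = np.sqrt(np.sum((x - x.mean()) ** 2 * uhat ** 2) / sst_x ** 2)
print(se_b1)                            # equals the HC0 slope standard error above
```

The "slight adjustments" mentioned above correspond to the HC1, HC2, and HC3 small-sample variants that most software reports instead of raw HC0; in statsmodels the choice is made through the cov_type argument.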
8.3 Testing for Heteroskedasticity

-In this chapter we will cover a variety of modern tests for heteroskedasticity
-It is important to know whether heteroskedasticity exists, as its existence means OLS is no longer the BEST estimator
-Note that while other tests for heteroskedasticity exist, the tests presented here are preferred because they test for heteroskedasticity more DIRECTLY

8.3 Testing for Het

-Consider our typical linear model and a null hypothesis of homoskedasticity:

y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u

H_0: \mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) = \sigma^2

-Since we know that \mathrm{Var}(u \mid X) = \mathrm{E}(u^2 \mid X) when \mathrm{E}(u \mid X) = 0, we can rewrite the null hypothesis as:

H_0: \mathrm{E}(u^2 \mid x_1, x_2, \ldots, x_k) = \mathrm{E}(u^2) = \sigma^2

-As we are testing whether u^2 is related to any explanatory variables, we can use the linear model:

u^2 = \delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k + v

-where v is an error term with mean zero given the x's
-note that the dependent variable is the SQUARED error
-this changes our null hypothesis to:

H_0: \delta_1 = \delta_2 = \cdots = \delta_k = 0

-Since we don't know the true errors of the regression, only the residuals, our estimation becomes:

\hat{u}^2 = \delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k + \text{error}

-which is valid in large samples
-The R^2 from the above regression, R^2_{\hat{u}^2}, is used to construct an F statistic:

F = \frac{R^2_{\hat{u}^2} / k}{(1 - R^2_{\hat{u}^2}) / (n - k - 1)} \qquad (8.15)

-This test F statistic is compared to a critical F* with k and n-k-1 degrees of freedom
-If the null hypothesis is rejected, there is evidence to conclude that heteroskedasticity exists at a given \alpha
-If the null hypothesis is not rejected, there is insufficient evidence to conclude that heteroskedasticity exists at a given \alpha
-This is sometimes called the BREUSCH-PAGAN TEST FOR HETEROSKEDASTICITY (BP TEST)

8.3 BP HET TEST

In order to conduct a BP test for het (a sketch follows this list):
1) Run a normal OLS regression (y on the x's) and obtain the squared residuals, \hat{u}^2
2) Run a regression of \hat{u}^2 on all independent variables and save the R^2
3) Obtain the test F statistic and compare it to the critical F*
4) If F > F*, reject the null hypothesis of homoskedasticity and start correcting for heteroskedasticity

-If we suspect that our model's heteroskedasticity depends on only certain x variables, regress \hat{u}^2 on those variables only
-Keep in mind that the k in the R^2 formula and in the degrees of freedom comes from the number of independent variables in the \hat{u}^2 regression
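Here is a minimal sketch of the BP test steps, again in Python with statsmodels and invented simulated data; the F statistic is computed by hand to mirror (8.15) and then compared with the packaged het_breuschpagan routine.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Invented data: error variance depends on x1
rng = np.random.default_rng(42)
n = 500
x1 = rng.uniform(1, 10, size=n)
x2 = rng.normal(size=n)
u = rng.normal(0, x1)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
ols = sm.OLS(y, X).fit()                # step 1: regress y on the x's...
uhat2 = ols.resid ** 2                  # ...and square the residuals

aux = sm.OLS(uhat2, X).fit()            # step 2: regress uhat^2 on all the x's
k = X.shape[1] - 1                      # number of independent variables
F = (aux.rsquared / k) / ((1 - aux.rsquared) / (n - k - 1))   # step 3: eq. (8.15)
print(F, aux.fvalue)                    # identical to the auxiliary regression's overall F

# Steps 1-3 in one call; returns LM and F versions with p-values
lm, lm_pval, fstat, f_pval = het_breuschpagan(ols.resid, X)
print(fstat, f_pval)                    # step 4: reject H0 (homoskedasticity) if f_pval < alpha
```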
8.3 White Test for Het

-An alternative test for het is the White test
-Drawing on the large-sample results covered in Chapter 5, White (1980) proposed another test for heteroskedasticity
-With 3 independent variables, White proposed a linear regression with 9 regressors (levels, squares, and cross products):

\hat{u}^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \delta_3 x_3 + \delta_4 x_1^2 + \delta_5 x_2^2 + \delta_6 x_3^2 + \delta_7 x_1 x_2 + \delta_8 x_1 x_3 + \delta_9 x_2 x_3 + \text{error}

-The null hypothesis (homoskedasticity) now sets all the \delta's (except the intercept) equal to zero
-Unfortunately this test involves MANY regressors (27 regressors for 6 x variables) and as such may have degrees-of-freedom problems
-One special case of the White test is to estimate the regression:

\hat{u}^2 = \delta_0 + \delta_1 \hat{y} + \delta_2 \hat{y}^2 + \text{error}

-since this preserves the "squared" concept of the White test and is particularly useful when het is suspected to be connected to the level of the expected value \mathrm{E}(y \mid X)
-this test has an F distribution with 2 and n-3 df

8.3 Special White HET TEST

In order to conduct a special White test for het (a sketch appears at the end of this section):
1) Run a normal OLS regression (y on the x's) and obtain the squared residuals, \hat{u}^2, and the fitted values, \hat{y}
2) Run the regression of \hat{u}^2 on both \hat{y} and \hat{y}^2 (including an intercept) and record the R^2
3) Using this R^2, compute a test F statistic as in the BP test
4) If F > F*, reject the null hypothesis (homoskedasticity)

8.3 Heteroskedasticity Note

-Our decision to REJECT the null hypothesis and suspect heteroskedasticity is only valid if MLR.4 is valid
-If MLR.4 is violated (i.e., bad functional form or omitted variables), one can reject the null hypothesis even if het doesn't actually exist
-Therefore always choose the functional form and all variables before testing for heteroskedasticity
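The sketch below, under the same assumptions as the earlier ones (Python, statsmodels, invented simulated data), walks through the special White test steps by hand and then, for comparison, calls statsmodels' het_white, which runs the full squares-and-cross-products version.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.diagnostic import het_white

# Invented data, as in the earlier sketches
rng = np.random.default_rng(42)
n = 500
x1 = rng.uniform(1, 10, size=n)
x2 = rng.normal(size=n)
u = rng.normal(0, x1)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
ols = sm.OLS(y, X).fit()                 # step 1: OLS of y on the x's
uhat2 = ols.resid ** 2                   # squared residuals
yhat = ols.fittedvalues                  # fitted values

Z = sm.add_constant(np.column_stack([yhat, yhat ** 2]))
aux = sm.OLS(uhat2, Z).fit()             # step 2: uhat^2 on yhat and yhat^2

F = (aux.rsquared / 2) / ((1 - aux.rsquared) / (n - 3))   # step 3
p_val = stats.f.sf(F, 2, n - 3)          # F distribution with 2 and n-3 df
print(F, p_val)                          # step 4: reject H0 if p_val < alpha

# Full White test (all squares and cross products) for comparison
lm, lm_pval, fstat, f_pval = het_white(ols.resid, X)
print(fstat, f_pval)
```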