University of East London
Business School - Finance, Economics and Accounting
FE 3003 Applied Econometrics
Dr. Derick Boyd
d.a.c.boyd@uel.ac.uk

HETEROSKEDASTICITY

Reference: Any econometric text from the reading list or otherwise.

Contents
1. Definition of heteroskedasticity
2. Plausibility ~ When is heteroskedasticity likely to arise?
3. Consequences of heteroskedasticity for the properties of least squares estimators
4. The White test for heteroskedasticity
5. Weighted Least Squares ~ Transforming the original model to obtain homoskedastic errors

1. Definition of Heteroskedasticity

Heteroskedasticity arises when the homoskedasticity assumption

$Var(\epsilon_i) = Var(Y_i) = \sigma^2$

is violated, giving rise to

$Var(\epsilon_i) = Var(Y_i) = \sigma_i^2$.

Note that it is the subscript $i$ on the $\sigma^2$ that makes all the difference: it indicates that the variance is different for each observation $i = 1, 2, \ldots, N$.

[Diagrams of the homoskedastic and heteroskedastic cases]

Heteroskedasticity suggests that $Var(\epsilon_i) = Var(Y_i)$ is a function of $X_i$, the independent variable. We can see this from the diagrams: in the homoskedastic case the variance of $Y_i$ is fairly constant across the range of values of $X_i$, while in the heteroskedastic cases the variance of $Y_i$ changes (the spread of the observations systematically widens or narrows). This means that in the heteroskedastic cases we can write, for example, $Var(\epsilon_i) = \sigma_i^2 = f(X_i) = k^2 X_i$.

2. Plausibility ~ When is heteroskedasticity likely to arise?

It is not usual to find heteroskedasticity when we use time series data in our estimations, but it can be a problem when we use cross-sectional data (see Gujarati's savings function example).

3. Consequences of heteroskedasticity

Heteroskedasticity affects the properties of the least squares estimators: OLS estimators will no longer be BLUE. They will remain linear and unbiased, but they will not have the minimum variance property; consequently they will not be efficient and so will not be the best estimators.
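The widening spread described above can be illustrated with a short simulation. The sketch below is in Python rather than EViews, and all of its values ($k = 2$, the range of $X$, the sample size) are assumed for illustration: errors are drawn with $Var(\epsilon_i) = k^2 X_i$, and the sample variance of the errors for large $X_i$ clearly exceeds that for small $X_i$, while the homoskedastic errors show no such pattern.

```python
import random
import statistics

random.seed(42)

N = 2000
X = [random.uniform(1.0, 10.0) for _ in range(N)]

# Homoskedastic errors: the same standard deviation for every observation.
e_homo = [random.gauss(0.0, 2.0) for _ in range(N)]

# Heteroskedastic errors: Var(e_i) = k^2 * X_i, so the spread grows with X_i.
k = 2.0
e_hetero = [random.gauss(0.0, k * x ** 0.5) for x in X]

# Compare error variances for small-X versus large-X observations.
low = [e for x, e in zip(X, e_hetero) if x < 5.5]
high = [e for x, e in zip(X, e_hetero) if x >= 5.5]
print("heteroskedastic:", statistics.variance(low), statistics.variance(high))

low_h = [e for x, e in zip(X, e_homo) if x < 5.5]
high_h = [e for x, e in zip(X, e_homo) if x >= 5.5]
print("homoskedastic:  ", statistics.variance(low_h), statistics.variance(high_h))
```

In the heteroskedastic case the two printed variances differ sharply (roughly $k^2$ times the average $X_i$ in each half), whereas in the homoskedastic case they are close to the common value $\sigma^2 = 4$.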
Proof of linearity

$\hat{B} = (X'X)^{-1}X'Y$, and it remains the case that $\hat{B}$ is a function of $Y$ weighted by the non-stochastic $X$s, $(X'X)^{-1}X'$.

Proof of unbiasedness

$\hat{B} = (X'X)^{-1}X'Y = (X'X)^{-1}X'(XB + U)$, given that $Y = XB + U$.

$\hat{B} = (X'X)^{-1}X'XB + (X'X)^{-1}X'U$

Taking expectations, we get

$E(\hat{B}) = (X'X)^{-1}X'XB + (X'X)^{-1}X'E(U) = B$,

since $(X'X)^{-1}X'X = I$ and $E(U) = 0$ by the zero-mean assumption. So $E(\hat{B}) = B$ and the OLS estimators remain unbiased.

Proof that heteroskedasticity affects the minimum variance property of OLS estimators

We have seen that $Var(\hat{B}) = (X'X)^{-1}X'E(UU')X(X'X)^{-1}$. Under the homoskedasticity assumption $E(UU') = \sigma^2 I$, so that $Var(\hat{B})$ may be written

$Var(\hat{B}) = (X'X)^{-1}X'\sigma^2 I X(X'X)^{-1} = \sigma^2 (X'X)^{-1}X'X(X'X)^{-1} = \sigma^2 (X'X)^{-1}$.

Without the homoskedasticity assumption we could not write $Var(\hat{B}) = (X'X)^{-1}X'\sigma^2 I X(X'X)^{-1}$, and so could not obtain the $Var(\hat{B}) = \sigma^2 (X'X)^{-1}$ result. Since we use the homoskedasticity assumption in deriving this result, the minimum variance property is lost in the absence of homoskedasticity.

Note that we use $Var(\hat{B})$ in $t$ tests to carry out tests of significance:

$t = \dfrac{\hat{\beta} - 0}{se(\hat{\beta})}$, where $se(\hat{\beta}) = \sqrt{Var(\hat{\beta})}$.

In the presence of heteroskedasticity such tests will be unreliable, since the $Var(\hat{B})$ used will not be the correct one.

4. The White Test for heteroskedasticity

The White test uses the null hypothesis

H0: the variance of the disturbance term is homoskedastic, i.e., $Var(\epsilon_i) = Var(Y_i) = \sigma^2$,

and the alternative hypothesis

H1: the variance of the disturbance term is heteroskedastic, of an unknown form.

We will use a multiple regression model to illustrate this test. Suppose the model

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \epsilon_i$

under the usual assumptions.
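The $Var(\hat{B}) = \sigma^2(X'X)^{-1}$ result can be checked numerically for the simple-regression case, where the slope variance reduces to $\sigma^2 / \sum_i (X_i - \bar{X})^2$. The following Python sketch (all parameter values are illustrative, not from the notes) simulates many samples with homoskedastic errors and compares the empirical variance of the OLS slope estimates with the formula.

```python
import random
import statistics

random.seed(0)

# Fixed regressors and true parameters (illustrative values).
x = [float(i) for i in range(1, 21)]
b0, b1, sigma = 2.0, 0.5, 1.5

xbar = statistics.mean(x)
Sxx = sum((xi - xbar) ** 2 for xi in x)

def ols_slope(y):
    """OLS slope estimate for a simple regression of y on x."""
    ybar = statistics.mean(y)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / Sxx

# Monte Carlo: with homoskedastic errors, Var(b1_hat) should approach sigma^2 / Sxx.
slopes = []
for _ in range(20000):
    y = [b0 + b1 * xi + random.gauss(0.0, sigma) for xi in x]
    slopes.append(ols_slope(y))

print("empirical:  ", statistics.variance(slopes))
print("theoretical:", sigma ** 2 / Sxx)
```

The two printed numbers agree closely; repeating the exercise with heteroskedastic errors would show the empirical variance drifting away from the $\sigma^2(X'X)^{-1}$ formula, which is why the usual $t$ tests become unreliable.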
To carry out this test we:

(i) Run the original model and save the residuals $\hat{\epsilon}_i$.

(ii) Define an auxiliary equation to carry out the test, for example:

$\hat{\epsilon}_i^2 = \alpha_0 + \alpha_1 X_i + \alpha_2 X_i^2 + \alpha_3 Z_i + \alpha_4 Z_i^2 + \alpha_5 X_i Z_i + v_i$

Common specifications of the auxiliary equation are: linear; quadratic; or quadratic with cross-products, the form illustrated here.

(iii) Heteroskedasticity implies that the variance term $\hat{\epsilon}_i^2$ is related to the regressors in some way (linearly, quadratically or multiplicatively), and so the test is essentially to see whether the specification in (ii) significantly explains the $\hat{\epsilon}_i^2$ dependent variable. We can use a $\chi_h^2$ test in a fairly straightforward procedure, where $h = k - 1$ from the auxiliary equation (in this case $h = 6 - 1 = 5$). We obtain critical values for the test from the $\chi^2$ tables. For instance, doing the test at the 5% significance level with $h = 5$ degrees of freedom would imply a critical value of $\chi^2_{5;5\%} = 11.07$.

(iv) With the critical value available we then calculate the $\chi^2$ test statistic, and this is very easily done: it is $NR^2$ from the auxiliary equation. So if we ran the auxiliary equation with $N = 125$ observations and obtained $R^2 = 0.35788$, the test statistic would be $\chi^2 = 125 \times 0.35788 = 44.735$.

(v) If the test statistic exceeds the critical value, $\chi^2 > \chi^2_{5;5\%}$ (as in this example), we reject the null hypothesis of homoskedasticity. If the test statistic is below the critical value, we do not reject the null hypothesis of homoskedasticity and we can estimate the original model as it is.

5. Weighted Least Squares

If we have information about the heteroskedastic relationship, we may be able to use that information to transform the original model so as to obtain a homoskedastic disturbance term. For instance, suppose that the heteroskedasticity is of the form

$Var(\epsilon_i) = \sigma_i^2 = k^2 X_i Z_i$,

where $k^2$ is a constant (note that $\sigma_i^2$ is not a constant). We may note that

$Var\left(\dfrac{\epsilon_i}{\sqrt{X_i Z_i}}\right) = \dfrac{Var(\epsilon_i)}{X_i Z_i} = \dfrac{k^2 X_i Z_i}{X_i Z_i} = k^2$.

This implies that if the error term in our model were $\dfrac{\epsilon_i}{\sqrt{X_i Z_i}}$ instead of $\epsilon_i$, then the error would be homoskedastic.
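The result just derived, that dividing the error by $\sqrt{X_i Z_i}$ yields a constant variance $k^2$, can be verified with a short simulation. In this Python sketch all values ($k = 2$, the ranges of $X$ and $Z$) are assumed for illustration: the raw errors have variance proportional to $X_i Z_i$, while the transformed errors have roughly the same variance whether $X_i Z_i$ is small or large.

```python
import math
import random
import statistics

random.seed(7)

# Illustrative draws: Var(e_i) = k^2 * X_i * Z_i.
k = 2.0
N = 4000
X = [random.uniform(1.0, 20.0) for _ in range(N)]
Z = [random.uniform(1.0, 20.0) for _ in range(N)]
e = [random.gauss(0.0, k * math.sqrt(x * z)) for x, z in zip(X, Z)]

# Transformed errors e_i / sqrt(X_i * Z_i): variance should be k^2 everywhere.
e_star = [ei / math.sqrt(x * z) for ei, x, z in zip(e, X, Z)]

small = [u for x, z, u in zip(X, Z, e_star) if x * z < 100.0]
large = [u for x, z, u in zip(X, Z, e_star) if x * z >= 100.0]
print(statistics.variance(small), statistics.variance(large))
```

Both printed variances come out close to $k^2 = 4$, confirming that the scaled disturbance is homoskedastic.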
So we transform our model by dividing it through by $\sqrt{X_i Z_i}$, obtaining:

$\dfrac{Y_i}{\sqrt{X_i Z_i}} = \beta_0 \dfrac{1}{\sqrt{X_i Z_i}} + \beta_1 \dfrac{X_i}{\sqrt{X_i Z_i}} + \beta_2 \dfrac{Z_i}{\sqrt{X_i Z_i}} + \dfrac{\epsilon_i}{\sqrt{X_i Z_i}}$

We can re-write this to make it look less intimidating as

$Y_i^* = \beta_0 X_{0i}^* + \beta_1 X_i^* + \beta_2 Z_i^* + \epsilon_i^*$,

where the variables are defined with respect to the equation above. Estimation of this equation is called Weighted Least Squares; in this case each observation is weighted by the inverse of $\sqrt{X_i Z_i}$.

We can apply the Weighted Least Squares transformation to various forms of heteroskedasticity. Suppose

$Var(\epsilon_i) = k^2 X_i$; then $Var\left(\dfrac{\epsilon_i}{\sqrt{X_i}}\right) = \dfrac{Var(\epsilon_i)}{X_i} = k^2$,

and the transformed equation would be

$\dfrac{Y_i}{\sqrt{X_i}} = \beta_0 \dfrac{1}{\sqrt{X_i}} + \beta_1 \dfrac{X_i}{\sqrt{X_i}} + \beta_2 \dfrac{Z_i}{\sqrt{X_i}} + \dfrac{\epsilon_i}{\sqrt{X_i}}$

In this case each observation is weighted by the inverse of $\sqrt{X_i}$.

Another popular form of heteroskedasticity is

$Var(\epsilon_i) = k^2 X_i^2$; then $Var\left(\dfrac{\epsilon_i}{X_i}\right) = \dfrac{Var(\epsilon_i)}{X_i^2} = k^2$.

In this case the transformed equation would be

$\dfrac{Y_i}{X_i} = \beta_0 \dfrac{1}{X_i} + \beta_1 + \beta_2 \dfrac{Z_i}{X_i} + \dfrac{\epsilon_i}{X_i}$,

which we can re-write as

$Y_i^* = \beta_0 X_{0i}^* + \beta_1 + \beta_2 Z_i^* + \epsilon_i^*$.

Here each observation is weighted by the inverse of $X_i$. Note that in this case $\beta_1$ will be the intercept of the transformed specification, but it should still be interpreted as the slope parameter from the original model, while $\beta_0$ remains, in reality, the intercept. Essentially, we are just redefining the variables used in the estimation to get rid of the heteroskedastic error problem; the interpretation of the estimated parameters should be based on the original model.

TUTORIAL QUESTION

(a) Use EViews to carry out the White test using your country data and print the results.
(b) Explain the White test for heteroskedasticity, making clear what type of test is being carried out.
(c) Carry out the test using your country study in order to confirm the EViews results, and explain the conclusions you arrive at.

****
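For tutorial part (c), the by-hand decision rule mirrors the worked example in Section 4. A minimal Python sketch of the $NR^2$ calculation; the $N$ and $R^2$ shown are the illustrative numbers from these notes, so substitute the values from your own country-study auxiliary regression:

```python
# White test decision rule, using the worked numbers from Section 4.
N = 125                  # observations in the auxiliary regression
r_squared = 0.35788      # R^2 from the auxiliary regression
critical_value = 11.07   # chi-squared critical value, 5 df, 5% level

test_statistic = N * r_squared   # NR^2 = 44.735
if test_statistic > critical_value:
    print("Reject H0: evidence of heteroskedasticity")
else:
    print("Do not reject H0: homoskedasticity not rejected")
```

With these numbers the statistic (44.735) exceeds the critical value (11.07), so the null of homoskedasticity is rejected, matching the conclusion in Section 4.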