Heteroskedasticity: Nature and Detection

11.1 Aims and Learning Objectives

By the end of this session students should be able to:
• Explain the nature of heteroskedasticity
• Understand the causes and consequences of heteroskedasticity
• Perform tests to determine whether a regression model has heteroskedastic errors

11.2 Nature of Heteroskedasticity

Heteroskedasticity is a systematic pattern in the errors where the variances of the errors are not constant.

Ordinary least squares assumes that all observations are equally reliable.

11.3 Regression Model

$Y_i = \beta_1 + \beta_2 X_i + U_i$

Homoskedasticity: $\mathrm{Var}(U_i) = \sigma^2$, or $E(U_i^2) = \sigma^2$
Heteroskedasticity: $\mathrm{Var}(U_i) = \sigma_i^2$, or $E(U_i^2) = \sigma_i^2$

11.4 Homoskedastic pattern of errors

[Figure: scatter plot of consumption $Y_i$ against income $X_i$.]

11.5 The Homoskedastic Case

[Figure: densities $f(Y_i)$ at income levels $X_1$, $X_2$, $X_3$, $X_4$; horizontal axis: income $X_i$.]

11.6 Heteroskedastic pattern of errors

[Figure: scatter plot of consumption $Y_i$ against income $X_i$.]

11.7 The Heteroskedastic Case

[Figure: densities $f(Y_i)$ at income levels $X_1$, $X_2$, $X_3$, labelled "poor people" and "rich people"; horizontal axis: income $X_i$.]

11.8 Causes of Heteroskedasticity

Common Causes

Direct:
• Scale Effects
• Structural Shift
• Learning Effects

Indirect:
• Omitted Variables
• Outliers
• Parameter Variation

11.9 Consequences of Heteroskedasticity

1. Ordinary least squares estimators are still linear and unbiased.
2. Ordinary least squares estimators are not efficient.
3. The usual formulas give incorrect standard errors for least squares.
4. Confidence intervals and hypothesis tests based on the usual standard errors are wrong.

11.10 Ordinary Least Squares Variance Formulas

$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + e_i$

heteroskedasticity: $\mathrm{Var}(e_i) = \sigma_i^2$

Formula for ordinary least squares variance (homoskedastic disturbances):

$\mathrm{Var}(\hat{\beta}_2) = \dfrac{\sigma^2}{\sum x_i^2}$

Formula for ordinary least squares variance (heteroskedastic disturbances):

$\mathrm{Var}(\hat{\beta}_2) = \dfrac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$

Here $x_i = X_i - \bar{X}$ denotes the deviation of $X_i$ from its sample mean.

Therefore when errors are heteroskedastic, ordinary least squares estimators are inefficient (i.e. not “best”).

11.11 Detecting Heteroskedasticity

$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} + e_i$

$e_i^2$: squared residuals provide proxies for $U_i^2$

Preliminary Analysis
• Data: heteroskedasticity often occurs in cross-sectional data (exceptions: ARCH, panel data)
• Graphical examination of residuals: plot $e_i$ or $e_i^2$ against each explanatory variable or against predicted $Y$

11.12 Residual Plots

Plot residuals against one variable at a time, after sorting the data by that variable, to try to find a heteroskedastic pattern in the data.

[Figure: residuals $e_i$ plotted around zero against $X_i$.]

11.13 Formal Tests for Heteroskedasticity

The Goldfeld-Quandt Test

1. Sort the data according to the size of a potential proportionality factor $d$ (largest to smallest).
2. Omit the middle $r$ observations.
3. Run separate regressions on the first $n_1$ observations and the last $n_2$ observations.
4. If the disturbances are homoskedastic then $\mathrm{Var}(U_i)$ should be the same for both samples.

11.14 The Goldfeld-Quandt Test (continued)

5. Specify the null and alternative hypotheses:
   $H_0: \sigma_1^2 = \sigma_2^2$
   $H_1: \sigma_1^2 > \sigma_2^2$
6. Test statistic:
   $GQ = \dfrac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2} \sim F(n_1 - k_1,\ n_2 - k_2)$
   Compare the test statistic value with the critical value from the F-distribution table.

11.15 White’s Test

1. Estimate $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i}$ and obtain the residuals.
2. Run the following auxiliary regression:
   $e_i^2 = A_0 + A_1 X_{2i} + A_2 X_{3i} + A_3 X_{2i}^2 + A_4 X_{3i}^2 + A_5 X_{2i} X_{3i} + V_i$
3. Calculate the White test statistic from the auxiliary regression:
   $nR^2 \sim \chi^2_{d.f.}$
4. Obtain the critical value from the $\chi^2$ distribution (d.f. = number of explanatory variables in the auxiliary regression).
5. Decision rule: if the test statistic exceeds the critical $\chi^2$ value, reject the null hypothesis of no heteroskedasticity.

11.16 Summary

In this lecture we have:
1. Analysed the theoretical causes and consequences of heteroskedasticity
2. Outlined a number of tests which can be used to detect the presence of heteroskedastic errors
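The two variance formulas on slide 11.10 can be linked by a short derivation. This is a sketch that assumes the regressor is treated as fixed and the disturbances are mutually uncorrelated, neither of which the slides state explicitly; $x_i = X_i - \bar{X}$ is the deviation from the sample mean.

```latex
\begin{align*}
\hat{\beta}_2
  &= \frac{\sum x_i Y_i}{\sum x_i^2}
   = \beta_2 + \frac{\sum x_i U_i}{\sum x_i^2}, \\[4pt]
\mathrm{Var}(\hat{\beta}_2)
  &= \frac{\sum x_i^2 \,\mathrm{Var}(U_i)}{\left(\sum x_i^2\right)^2}
   = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}, \\[4pt]
\sigma_i^2 = \sigma^2 \ \text{for all } i
  &\;\Longrightarrow\;
  \mathrm{Var}(\hat{\beta}_2)
   = \frac{\sigma^2 \sum x_i^2}{\left(\sum x_i^2\right)^2}
   = \frac{\sigma^2}{\sum x_i^2}.
\end{align*}
```

The last line shows that the usual formula is the special case of constant variance, which is why standard errors based on it are incorrect when $\sigma_i^2$ varies across observations.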
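The Goldfeld-Quandt steps on slides 11.13 and 11.14 can be sketched in Python with NumPy. This is an illustrative sketch, not part of the lecture: the helper names `ols_rss` and `goldfeld_quandt`, the simulated data, and the choice of r = 40 omitted observations are all assumptions made for the example. The variance in each subsample is estimated as the residual sum of squares divided by its degrees of freedom.

```python
import numpy as np

def ols_rss(X, y):
    """Fit OLS with an intercept; return (residual sum of squares, no. of coefficients)."""
    Xc = np.column_stack([np.ones(len(y)), X])       # prepend a constant column
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    return float(resid @ resid), Xc.shape[1]

def goldfeld_quandt(X, y, d, r):
    """Hypothetical helper: GQ statistic after sorting by d and dropping the middle r rows."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    order = np.argsort(d)[::-1]                      # step 1: largest to smallest d
    X, y = X[order], y[order]
    n = len(y)
    n1 = (n - r) // 2                                # step 2: omit the middle r observations
    n2 = n - r - n1
    rss1, k1 = ols_rss(X[:n1], y[:n1])               # step 3: first n1 observations
    rss2, k2 = ols_rss(X[n - n2:], y[n - n2:])       #         last n2 observations
    df1, df2 = n1 - k1, n2 - k2
    gq = (rss1 / df1) / (rss2 / df2)                 # step 6: ratio of variance estimates
    return gq, df1, df2

# Simulated data (assumption): error standard deviation proportional to x.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
y = 2 + 0.5 * x + rng.normal(0, 1, n) * x

gq, df1, df2 = goldfeld_quandt(x, y, d=x, r=40)
print(f"GQ = {gq:.2f} with F({df1}, {df2}) degrees of freedom")
```

The printed statistic would then be compared with the critical value from an F table with the reported degrees of freedom, as in step 6.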
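White's test on slide 11.15 can be sketched in the same way. Again this is only an illustration under stated assumptions: the helper names `ols` and `white_test` and the simulated data are hypothetical, and the model has exactly two explanatory variables as on the slide.

```python
import numpy as np

def ols(X, y):
    """Fit OLS with an intercept; return (coefficients, residuals)."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return beta, y - Xc @ beta

def white_test(x2, x3, y):
    """Hypothetical helper: return (n * R^2 from the auxiliary regression, degrees of freedom)."""
    x2, x3, y = (np.asarray(a, dtype=float) for a in (x2, x3, y))
    _, e = ols(np.column_stack([x2, x3]), y)         # step 1: residuals of the main model
    e2 = e ** 2
    # step 2: auxiliary regressors are levels, squares and the cross product
    Z = np.column_stack([x2, x3, x2 ** 2, x3 ** 2, x2 * x3])
    _, v = ols(Z, e2)
    r2 = 1.0 - (v @ v) / np.sum((e2 - e2.mean()) ** 2)
    stat = len(y) * r2                               # step 3: n * R^2
    df = Z.shape[1]                                  # step 4: no. of auxiliary regressors
    return stat, df

# Simulated data (assumption): error standard deviation proportional to x2.
rng = np.random.default_rng(1)
n = 500
x2 = rng.uniform(1, 10, n)
x3 = rng.uniform(1, 10, n)
y = 1 + 2 * x2 - x3 + rng.normal(0, 1, n) * x2

stat, df = white_test(x2, x3, y)
print(f"White statistic = {stat:.2f}, d.f. = {df}")
```

With five auxiliary regressors the 5% critical value of the chi-squared distribution is about 11.07, so under the decision rule in step 5 a statistic above that value leads to rejecting the null hypothesis of no heteroskedasticity.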