Lesson #7

Chapter 9: Heteroskedasticity

Copyright © 2014 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.

Learning Objectives
• Understand methods for detecting heteroskedasticity
• Correct for heteroskedasticity

What is Heteroskedasticity?
Heteroskedasticity is when the error term has a nonconstant variance, or $\mathrm{Var}(\varepsilon_i) = \sigma_i^2$.
Homoskedasticity is when the error term has a constant variance, or $\mathrm{Var}(\varepsilon_i) = \sigma^2$.
Notice that for homoskedasticity there is no $i$ subscript on $\sigma^2$, so the variance is constant, while for heteroskedasticity the $i$ subscript denotes that the variance can change from observation to observation.

[Figure: A Picture of Homoskedasticity Versus Heteroskedasticity]

The Issues and Consequences Associated with Heteroskedastic Data
Problem: Heteroskedasticity violates assumption M6, which states that the error term must have constant variance.
Consequences: Under heteroskedasticity,
• Parameter estimates are unbiased.
• Parameter estimates are not minimum variance among all unbiased estimators.
• Estimated standard errors are incorrect, and all measures of precision based on the estimated standard errors are also incorrect.

Goals of this Chapter

An Important Caveat before Continuing
• With more advanced statistical packages, many researchers include a very simple command asking their chosen statistical program to provide standard error estimates that automatically correct for heteroskedasticity (White’s heteroskedastic consistent standard errors).
• Even though correcting for heteroskedasticity is straightforward, it is important to first work through the more “old-school” examples below before learning how to calculate White’s heteroskedastic consistent standard errors.

Understand Methods for Detecting Heteroskedasticity
Informal methods
- Graphs
Formal methods using statistical tests
- Breusch-Pagan Test
- General White’s Test
- Modified White’s Test
- Goldfeld-Quandt Test

Informal Method
Graph one of the following:
(1) The dependent variable against each independent variable
(2) The residuals against each independent variable
(3) The residuals squared against each independent variable
(4) The standardized residuals against each independent variable
and look for a pattern in the dispersion of the observations. If a pattern exists, that is evidence of heteroskedasticity.

[Figure: Regression of Number of Olympic Medals on per capita GDP by Country]

[Figure: scatter plot with fitted line] Notice how the variance increases as the independent variable increases. This is evidence of heteroskedasticity.

[Figure: residual plot] This residual plot is obtained by checking the residual plot option in Excel when running a regression. As in the previous slide, notice how the variance increases as the independent variable (GDP per capita) increases. This is evidence of heteroskedasticity.

The primary drawback of the informal method is that it is not clear how much of a pattern needs to exist before we conclude that the model is heteroskedastic. This leads to the need for formal tests of heteroskedasticity.

Formal Methods for Detecting Heteroskedasticity
The formal methods that we consider are all based on statistical tests of the following general null and alternative hypotheses:
$H_0$: the error term is homoskedastic
$H_1$: the error term is heteroskedastic

Testing for Heteroskedasticity
(1) Breusch-Pagan Test
(2) Modified White’s Test
(3) Goldfeld-Quandt Test

Breusch-Pagan Test
How to do it:
(1) Estimate the population regression model $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \varepsilon_i$ and obtain the residuals, $e_i$.
(2) Square the residuals to obtain $e_i^2$.
(3) Estimate the population regression model $e_i^2 = \gamma_0 + \gamma_1 x_{1i} + \gamma_2 x_{2i} + \cdots + \gamma_k x_{ki} + \varphi_i$.
(4) Perform an F-test for overall significance to see if the squared residuals are statistically related to any of the independent variables.

Breusch-Pagan Test
Why It Works: If the squared residuals are found to be statistically related to the independent variables, then we conclude that the data are heteroskedastic and we should take the appropriate steps to correct for the problem.

Breusch-Pagan Test for Olympic Medal vs GDP per Capita Data
[Table: Excel regression output; dependent variable is residuals squared]
The Significance F is much less than 0.05 (or 0.01 for that matter), so we reject the null hypothesis of homoskedasticity and conclude that the model is heteroskedastic.

Modified White’s Test
How to do it:
(1) Estimate the population regression model $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \varepsilon_i$ and obtain the residuals, $e_i$, and the predicted values, $\hat{y}_i$.
(2) Square the residuals to obtain $e_i^2$.
(3) Estimate the population regression model $e_i^2 = \delta_0 + \delta_1 \hat{y}_i + \delta_2 \hat{y}_i^2 + u_i$.
(4) Perform an F-test for overall significance to see if the squared residuals are statistically related to the $\hat{y}_i$ and $\hat{y}_i^2$ variables.

Modified White’s Test
Why It Works: This test works for the same reason that the Breusch-Pagan test works. The primary difference is that $\hat{y}_i$ and $\hat{y}_i^2$ are functions of the independent variables, the independent variables squared, and the cross-products of the independent variables. Including them in the squared-residual regression therefore tests whether the squared residuals are a function of all of those terms rather than of the independent variables alone.

Modified White’s Test for Olympic Medal vs GDP per Capita Data
[Table: Excel regression output; dependent variable is residuals squared]
The Significance F is much less than 0.05 (or 0.01 for that matter), so we reject the null hypothesis of homoskedasticity and conclude that the model is heteroskedastic.
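
The Breusch-Pagan and Modified White's steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated (not the Olympic medal data), the error standard deviation is assumed to grow with x, and the helper names (ols, overall_f, breusch_pagan_f, modified_white_f) are hypothetical. The code returns the F-statistic only; it still has to be compared with an F critical value or converted to a p-value.

```python
import random

def ols(X, y):
    """OLS via the normal equations. Each row of X must start with a 1."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    beta = [A[c][p] / A[c][c] for c in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, fitted, resid

def overall_f(X, y):
    """F-statistic for overall significance of a regression with intercept."""
    n, k = len(X), len(X[0]) - 1
    _, _, resid = ols(X, y)
    ybar = sum(y) / n
    tss = sum((v - ybar) ** 2 for v in y)   # total sum of squares
    uss = sum(e ** 2 for e in resid)        # unexplained sum of squares
    return ((tss - uss) / k) / (uss / (n - k - 1))

def breusch_pagan_f(X, y):
    """Steps (1)-(4): regress squared residuals on the original regressors."""
    _, _, e = ols(X, y)
    return overall_f(X, [v * v for v in e])

def modified_white_f(X, y):
    """Steps (1)-(4): regress squared residuals on y-hat and y-hat squared."""
    _, yhat, e = ols(X, y)
    Z = [[1.0, f, f * f] for f in yhat]
    return overall_f(Z, [v * v for v in e])

# Simulated data (an assumption for illustration): error sd proportional to x
random.seed(0)
x = [random.uniform(1, 10) for _ in range(400)]
y = [2 + 3 * xi + xi * random.gauss(0, 1) for xi in x]
X = [[1.0, xi] for xi in x]

bp_f = breusch_pagan_f(X, y)
mw_f = modified_white_f(X, y)
print("Breusch-Pagan F:", round(bp_f, 2))
print("Modified White F:", round(mw_f, 2))
```

A large F-statistic relative to the critical value leads to rejecting the null hypothesis of homoskedasticity, which is what the Significance F in the Excel output reports.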

Goldfeld-Quandt Test
How to do it:
(1) Identify which independent variable is suspected of contributing to heteroskedasticity and sort the data from smallest to largest on that variable.
(2) Omit the middle $c$ observations.
(3) Run two regressions with the remaining $(n - c)$ observations, one on the smallest values and one on the largest values, and record the unexplained sum of squares from each, $USS_1$ and $USS_2$.
(4) Form the test statistic $GQ = \frac{USS_2}{USS_1}$, where $USS_2$ is the larger value (because the F-statistic must be greater than or equal to 1).
(5) Reject the null hypothesis of homoskedasticity if $GQ > F_{n_1 - k_1,\, n_2 - k_2,\, 0.05}$.

Goldfeld-Quandt Test
Why It Works: This test works when the suspected heteroskedasticity is of the type in which the error variances increase (or decrease) with the value of a given independent variable. If the unexplained sum of squares for the largest values is “large” relative to the unexplained sum of squares for the smallest values, then we conclude that the error variance changes significantly with the value of the independent variable, suggesting that the data are heteroskedastic.

Goldfeld-Quandt Test
How to do it: For the Olympic medal data, there are 408 observations. Dividing the data into thirds, the first regression should contain the 136 (408/3) observations with the smallest GDP per capita, and the second regression should contain the 136 observations with the largest GDP per capita.

[Table: Excel regression output for the smallest third, giving $USS_1$]

[Table: Excel regression output for the largest third, giving $USS_2$]

Goldfeld-Quandt Test Example
$GQ = \frac{USS_2}{USS_1} = \frac{63{,}534.37}{18{,}545.19} = 3.4259$
Critical value $= F_{\infty,\infty,0.05} = 1$. This uses the large-sample limit; with 136 observations in each regression the exact 5% critical value is somewhat larger (roughly 1.3), which does not change the conclusion.
Because $3.4259 > 1$, we reject the null hypothesis of homoskedasticity and conclude that the model is heteroskedastic.

Correcting for Heteroskedasticity
(1) Weighted least squares
(2) White’s heteroskedastic consistent standard errors

Weighted Least Squares
How to do it:
(1) Assume the form of heteroskedasticity, say $\mathrm{Var}(\varepsilon_i) = \sigma^2 h(x_i)$.
(2) Create new variables by dividing through by the square root of $h(x_i)$: $y_i^* = y_i / \sqrt{h(x_i)}$, $x_0^* = 1 / \sqrt{h(x_i)}$, $x_{1i}^* = x_{1i} / \sqrt{h(x_i)}$, $x_{2i}^* = x_{2i} / \sqrt{h(x_i)}$, …, $x_{ki}^* = x_{ki} / \sqrt{h(x_i)}$.
(3) Estimate the population regression model $y_i^* = \beta_0 x_0^* + \beta_1 x_{1i}^* + \cdots + \beta_k x_{ki}^* + \varepsilon_i^*$.

Weighted Least Squares
Why It Works: Weighted least squares changes the model from one that was initially heteroskedastic into one that is homoskedastic. The new error term $\varepsilon_i^* = \varepsilon_i / \sqrt{h(x_i)}$ has variance $\mathrm{Var}(\varepsilon_i^*) = \sigma^2 h(x_i) / \left(\sqrt{h(x_i)}\right)^2 = \sigma^2$. This only works as long as the assumed form of heteroskedasticity is correct.

Weighted Least Squares Example
Assume that the form of heteroskedasticity is $\mathrm{Var}(\varepsilon_i) = \sigma^2 \, GDPperCapita_i$, so that $h(x_i) = GDPperCapita_i$ and $\sqrt{h(x_i)} = \sqrt{GDPperCapita_i}$.
The transformed variables are
$Medals_i^* = Medals_i / \sqrt{GDPperCapita_i}$
$intercept_i^* = 1 / \sqrt{GDPperCapita_i}$
$GDPperCapita_i^* = GDPperCapita_i / \sqrt{GDPperCapita_i}$

Weighted Least Squares Example
[Table: Excel results for the transformed regression]

Breusch-Pagan Test of Transformed Weighted Least Squares Data
[Table: Excel regression output]
Unfortunately, even after the transformation, this model still suffers from heteroskedasticity.

Robust Standard Errors
The preferred method to correct for heteroskedasticity is to use White’s heteroskedastic consistent standard errors. The coefficient estimates are still unbiased, so the only things that need to be corrected are the standard errors.
In STATA, the command is
    reg y x1 x2 x3, robust
The `, robust` option (or even `, r`) is the portion of the command that corrects the standard errors.

[Table: STATA Results with Original Standard Errors]
[Table: STATA Results with Robust Standard Errors]
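
The two corrections discussed above can be sketched in Python. This is a hedged illustration, not the chapter's Excel or STATA workflow: the data are simulated under the assumption $\mathrm{Var}(\varepsilon_i) = \sigma^2 x_i$, the helper names (invert, ols, conventional_se, white_se) are hypothetical, and white_se implements the basic White sandwich formula with no finite-sample scaling. Statistical packages may apply such a scaling, so their reported robust standard errors can differ slightly.

```python
import math
import random

def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    p = len(M)
    A = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
         for i, row in enumerate(M)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        d = A[c][c]
        A[c] = [v / d for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [row[p:] for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def ols(X, y):
    """Return coefficients, residuals and (X'X)^-1."""
    p = len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    return beta, resid, xtx_inv

def conventional_se(X, resid, xtx_inv):
    """Usual OLS standard errors, valid only under homoskedasticity."""
    n, p = len(X), len(X[0])
    s2 = sum(e * e for e in resid) / (n - p)
    return [math.sqrt(s2 * xtx_inv[j][j]) for j in range(p)]

def white_se(X, resid, xtx_inv):
    """White's estimator: (X'X)^-1 [sum e_i^2 x_i x_i'] (X'X)^-1."""
    p = len(X[0])
    meat = [[sum(e * e * r[a] * r[b] for r, e in zip(X, resid))
             for b in range(p)] for a in range(p)]
    V = matmul(matmul(xtx_inv, meat), xtx_inv)
    return [math.sqrt(V[j][j]) for j in range(p)]

# Simulated data (an assumption for illustration): Var(error) = x
random.seed(1)
x = [random.uniform(1, 10) for _ in range(400)]
y = [2 + 3 * xi + math.sqrt(xi) * random.gauss(0, 1) for xi in x]
X = [[1.0, xi] for xi in x]

# Weighted least squares with h(x) = x: divide every variable, including
# the column of ones, by sqrt(h(x)), then run OLS on the transformed data.
w = [1 / math.sqrt(xi) for xi in x]
X_star = [[v * wi for v in row] for row, wi in zip(X, w)]
y_star = [yi * wi for yi, wi in zip(y, w)]
beta_wls, _, _ = ols(X_star, y_star)

# Robust standard errors: keep the OLS coefficients, replace the SEs.
beta_ols, resid, xtx_inv = ols(X, y)
se_usual = conventional_se(X, resid, xtx_inv)
se_robust = white_se(X, resid, xtx_inv)

print("WLS coefficients:", beta_wls)
print("OLS coefficients:", beta_ols)
print("Conventional SEs:", se_usual)
print("White robust SEs:", se_robust)
```

Note that the transformed regression has no separate intercept: the transformed column of ones plays the role of $x_0^*$, so its coefficient is still $\beta_0$.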
