Chapter 5 Heteroskedasticity

What is in this Chapter?
• How do we detect this problem?
• What are the consequences of this problem?
• What are the solutions?
• First, we discuss tests for detecting heteroskedasticity: tests based on the OLS residuals, the likelihood ratio test, the Goldfeld-Quandt (G-Q) test, and the Breusch-Pagan (B-P) test. The last one is a Lagrange multiplier (LM) test.
• Regarding consequences, we show that the OLS estimators are unbiased but inefficient, and that the standard errors are also biased, thus invalidating tests of significance.
• Regarding solutions, we discuss solutions that depend on particular assumptions about the error variance, as well as general solutions.
• We also discuss the transformation of variables to logs and the problems associated with deflators, both of which are commonly used as solutions to the heteroskedasticity problem.

5.1 Introduction
• Homoskedasticity: the variance of the error terms is constant.
• Heteroskedasticity: the variance of the error terms is non-constant.
• Illustrative Example
– Table 5.1 presents consumption expenditures (y) and income (x) for 20 families. Suppose that we estimate the equation by ordinary least squares. We get (figures in parentheses are standard errors):
– The residuals from this equation are presented in Table 5.3.
– In this situation there is no perceptible increase in the magnitudes of the residuals as the value of x increases.
– Thus there does not appear to be a heteroskedasticity problem.

5.2 Detection of Heteroskedasticity
• Tests based on the OLS residuals, such as Ramsey's test, Glejser's tests, and White's test, are discussed first.
• Some Other Tests
– Likelihood Ratio Test
– Goldfeld-Quandt Test
– Breusch-Pagan Test
• Goldfeld-Quandt Test
– If we do not have large samples, we can use the Goldfeld-Quandt test.
– In this test we split the observations into two groups, one corresponding to large values of x and the other corresponding to small values of x.
– We fit separate regressions for each group and then apply an F-test to test the equality of the error variances.
– Goldfeld and Quandt suggest omitting some observations in the middle to increase our ability to discriminate between the two error variances.
• Breusch-Pagan Test
– This is an LM test based on an auxiliary regression of the squared OLS residuals on the variables thought to drive the error variance.
• Illustrative Example

5.3 Consequences of Heteroskedasticity
• The least squares estimators remain unbiased, but they are inefficient.
• The estimated variances of the coefficients are themselves biased, so the usual tests of significance are invalidated.
• A sketch illustrating the detection tests and this standard-error bias follows.
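As a concrete illustration of both points, here is a minimal sketch (not from the chapter; it assumes the NumPy and statsmodels APIs and uses simulated data rather than the Table 5.1 data) that applies the Breusch-Pagan and Goldfeld-Quandt tests and then contrasts the conventional OLS standard errors with White's heteroskedasticity-consistent ones:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan, het_goldfeldquandt

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 0.5 * x)  # error sd grows with x

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()

# Breusch-Pagan LM test: an auxiliary regression of the squared residuals
# on the regressors; a small p-value signals heteroskedasticity.
lm_stat, lm_pval, _, _ = het_breuschpagan(ols.resid, X)
print(f"Breusch-Pagan LM = {lm_stat:.2f}, p = {lm_pval:.4f}")

# Goldfeld-Quandt test: sort by x (column 1), drop the middle 20% of the
# sample as Goldfeld and Quandt suggest, F-test the two error variances.
gq_f, gq_pval, _ = het_goldfeldquandt(y, X, idx=1, drop=0.2)
print(f"Goldfeld-Quandt F = {gq_f:.2f}, p = {gq_pval:.4f}")

# Consequence (Section 5.3): the conventional OLS standard errors are
# biased here; White's heteroskedasticity-consistent ones differ visibly.
print("conventional s.e.:", ols.bse)
print("White (HC1) s.e.: ", ols.HC1_se)
```

The drop=0.2 argument implements the middle-observation omission described above; with this data-generating process both tests should reject homoskedasticity decisively.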
5.4 Solutions to the Heteroskedasticity Problem
• There are two types of solutions that have been suggested in the literature for the problem of heteroskedasticity:
– Solutions that depend on particular assumptions about σ_i.
– General solutions.
• We first discuss the first category. Here we have two methods of estimation: weighted least squares (WLS) and maximum likelihood (ML).
• WLS
– Each observation is weighted by the inverse of the standard deviation of its error, so that the transformed errors all have the same variance.
– For instance, if σ_i^2 = σ^2 x_i^2 in the model y_i = α + β x_i + u_i, we divide through by x_i to get y_i/x_i = β + α(1/x_i) + u_i/x_i, whose error u_i/x_i has constant variance σ^2.
– Thus the constant term in this equation is the slope coefficient in the original equation.
• If we make some specific assumptions about the errors, say that they are normal, we can use the maximum likelihood method, which is more efficient than WLS if the errors are normal.
• Illustrative Example
• A sketch of WLS under one such assumption follows.
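Here is a minimal sketch of WLS, assuming (for illustration only) that the error variance is proportional to x_i^2; it also checks the point above that the constant term of the transformed equation is the slope of the original one. The data and variable names are illustrative, not the text's example:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 0.5 * x)  # var(u_i) proportional to x_i^2

X = sm.add_constant(x)

# WLS with weights equal to the inverse of the assumed error variance x_i^2
wls = sm.WLS(y, X, weights=1.0 / x**2).fit()
print("WLS (const, slope):        ", wls.params)

# Equivalent transformed regression: divide through by x_i,
#   y_i/x_i = beta + alpha*(1/x_i) + v_i,  with var(v_i) constant.
# The constant term here is the slope of the original equation.
Z = sm.add_constant(1.0 / x)
ols_t = sm.OLS(y / x, Z).fit()
print("transformed (slope, const):", ols_t.params)
```

The two fits are the same estimator written two ways, so the printed coefficients agree up to their reversed order.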
5.5 Heteroskedasticity and the Use of Deflators
• There are two remedies often suggested and used for solving the heteroskedasticity problem:
– Transforming the data to logs.
– Deflating the variables by some measure of "size."
• One important thing to note is that the purpose of all these deflation procedures is to get more efficient estimates of the parameters.
• But once those estimates have been obtained, one should make all inferences (calculation of the residuals, prediction of future values, calculation of elasticities at the means, etc.) from the original equation, not the equation in the deflated variables.
• Another point to note is that since the purpose of deflation is to get more efficient estimates, it is tempting to judge the merits of the different procedures by looking at the standard errors of the coefficients.
• However, this is not correct, because in the presence of heteroskedasticity the standard errors themselves are biased, as we showed earlier.
• For instance, among the five equations presented above, the second and third are comparable, and so are the fourth and fifth.
• In both cases, if we look at the standard errors of the coefficient of X, the coefficient in the undeflated equation has a smaller standard error than the corresponding coefficient in the deflated equation.
• However, if the standard errors are biased, we have to be careful not to make too much of these differences.
• In the preceding example we considered miles M both as a deflator and as an explanatory variable.
• In this context we should mention some discussion in the literature on "spurious correlation" between ratios.
• The argument is simply that even if two variables X and Y are uncorrelated, deflating both by another variable Z can produce a strong correlation between X/Z and Y/Z because of the common denominator Z.
• It is wrong to infer from this correlation that there exists a close relationship between X and Y.
• Of course, if our interest is in fact the relationship between X/Z and Y/Z, there is no reason why this correlation need be called "spurious."
• As Kuh and Meyer point out, "The question of spurious correlation quite obviously does not arise when the hypothesis to be tested has initially been formulated in terms of ratios, for instance, in problems involving relative prices. Similarly, when a series such as money value of output is divided by a price index to obtain a 'constant dollar' estimate of output, no question of spurious correlation need arise. Thus, spurious correlation can only exist when a hypothesis pertains to undeflated variables and the data have been divided through by another series for reasons extraneous to but not in conflict with the hypothesis framed as an exact, i.e., nonstochastic, relation."
• In summary, deflated or ratio variables are often used in econometric work to solve the heteroskedasticity problem.
• Deflation can sometimes be justified on purely economic grounds, as in the use of "real" quantities and relative prices; in this case all the inferences from the estimated equation will be based on the equation in the deflated variables.
• However, if deflation is used to solve the heteroskedasticity problem, any inferences we make have to be based on the original equation, not the equation in the deflated variables.
• In any case, deflation may increase or decrease the resulting correlations, but this is beside the point: since the correlations are not comparable anyway, one should not draw any inferences from them.
• Illustrative Example

5.6 Testing the Linear Versus Log-Linear Functional Form
• When comparing the linear and log-linear forms, we cannot compare the R²'s, because R² is the ratio of explained variance to total variance and the variances of y and log y are different.
• Comparing R²'s in this case is like comparing two individuals A and B, where A eats 65% of a carrot cake and B eats 70% of a strawberry cake: the comparison does not make sense because there are two different cakes.
• The Box-Cox Test
– One solution to this problem is to consider a more general model of which both the linear and log-linear forms are special cases.
– Box and Cox consider the transformation y(λ) = (y^λ − 1)/λ for λ ≠ 0, with y(λ) = log y as the limiting case λ = 0.
– The linear form corresponds to λ = 1 and the log-linear form to λ = 0, so λ can be estimated by maximum likelihood and each special case tested; a sketch of this procedure follows.
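Here is a sketch of the Box-Cox procedure on simulated data: it profiles the concentrated log-likelihood L(λ) = −(n/2) log σ̂²(λ) + (λ − 1) Σ log y_i over a grid of λ values and forms likelihood ratio statistics for the two special cases. The grid and data-generating process are illustrative choices, not the text's example:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
y = np.exp(0.3 + 0.2 * x + rng.normal(0, 0.1, n))  # positive y, log-linear truth

X = sm.add_constant(x)

def boxcox_loglik(lam):
    # Box-Cox transform of y, with log y as the lam -> 0 limiting case
    z = np.log(y) if abs(lam) < 1e-8 else (y**lam - 1.0) / lam
    sigma2 = sm.OLS(z, X).fit().ssr / n
    # concentrated log-likelihood, including the Jacobian term
    return -0.5 * n * np.log(sigma2) + (lam - 1.0) * np.log(y).sum()

grid = np.linspace(-1.0, 2.0, 121)
ll = np.array([boxcox_loglik(lam) for lam in grid])
lam_hat = grid[ll.argmax()]
print(f"ML estimate of lambda: {lam_hat:.2f}")

# Likelihood ratio tests of the two special cases, each ~ chi-square(1)
for lam0, label in [(1.0, "linear"), (0.0, "log-linear")]:
    lr = 2.0 * (ll.max() - boxcox_loglik(lam0))
    print(f"LR against {label} (lambda = {lam0:.0f}): {lr:.2f}")
```

With this data-generating process the estimated λ should be near 0, the LR statistic against the linear form large, and the one against the log-linear form small.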
Summary
• 1. If the error variance is not constant across all observations, we have the heteroskedasticity problem. The problem is informally illustrated with an example in Section 5.1.
• 2. First, we would like to know whether the problem exists. For this purpose some tests have been suggested. We have discussed the following tests:
– (a) Ramsey's test.
– (b) Glejser's tests.
– (c) Breusch and Pagan's test.
– (d) White's test.
– (e) Goldfeld and Quandt's test.
– (f) Likelihood ratio test.
• 3. The consequences of the heteroskedasticity problem are:
– (a) The least squares estimators are unbiased but inefficient.
– (b) The estimated variances are themselves biased.
– If the heteroskedasticity problem is detected, we can try to solve it by the use of weighted least squares.
– Otherwise, we can at least try to correct the estimated error variances.
• 4. There are three solutions commonly suggested for the heteroskedasticity problem:
– (a) Use of weighted least squares.
– (b) Deflating the data by some measure of "size."
– (c) Transforming the data to the logarithmic form.
– In weighted least squares, the particular weighting scheme used will depend on the nature of the heteroskedasticity.
• 5. The use of deflators is similar to the weighted least squares method, although it is done in a more ad hoc fashion. Some problems with the use of deflators are discussed in Section 5.5.
• 6. The question of estimation in the linear versus the logarithmic form has received considerable attention in recent years. Several statistical tests have been suggested for choosing between them. In Section 5.6 we discuss three of these tests: the Box-Cox test, the BM test, and the PE test. All are easy to implement with standard regression packages. The chapter does not illustrate their use, but an illustrative sketch of the PE test is given below.
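As a supplement (the chapter itself does not work through these tests), here is a sketch of the PE test of MacKinnon, White, and Davidson in its usual two-step form: fit both functional forms, construct the cross-model discrepancy terms, and t-test their coefficients. The simulated data are illustrative only:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
y = np.exp(0.3 + 0.2 * x + rng.normal(0, 0.1, n))  # log-linear truth

X = sm.add_constant(x)
lin = sm.OLS(y, X).fit()             # linear form: y on x
loglin = sm.OLS(np.log(y), X).fit()  # log-linear form: log y on x

# Test H0: linear. Augment the linear model with
# z0 = log(yhat_linear) - (fitted log y); a significant t on z0 rejects it.
# (The linear fitted values must be positive for the log to exist.)
z0 = np.log(lin.fittedvalues) - loglin.fittedvalues
t0 = sm.OLS(y, np.column_stack([X, z0])).fit().tvalues[-1]

# Test H0: log-linear. Augment the log-linear model with
# z1 = yhat_linear - exp(fitted log y); a significant t on z1 rejects it.
z1 = lin.fittedvalues - np.exp(loglin.fittedvalues)
t1 = sm.OLS(np.log(y), np.column_stack([X, z1])).fit().tvalues[-1]

print(f"t on z0 (H0: linear):     {t0:.2f}")
print(f"t on z1 (H0: log-linear): {t1:.2f}")
```

Because the simulated relationship is log-linear, the t-statistic on z0 should be large (rejecting the linear form) while the one on z1 should be insignificant.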