Chapter 8
Heteroskedasticity

Introduction

The linear regression equation is:

$$ y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + \dots + \beta_K x_{iK} + e_i \quad \text{for } i = 1, 2, \dots, N $$

The adequacy of the least squares (OLS) estimation results depends on
a number of assumptions:

Correct model specification – all relevant variables are
included, correct functional form.
Specification error gives biased estimators.

The random error term satisfies:

$$ E(e_i) = 0 \quad \text{for all } i $$

$$ \mathrm{var}(e_i) = \sigma^2 \ \text{for all } i, \qquad \mathrm{cov}(e_i, e_j) = 0 \ \text{for } i \ne j \qquad (*) $$

The assumptions (*) state that the errors are homoskedastic (equal error
variance) and uncorrelated.

What are the consequences of using the least squares method if the
assumptions (*) are violated?

• The least squares estimator is still unbiased.
• The standard errors calculated for the least squares estimators
are incorrect. Therefore, hypothesis testing exercises may lead
to unreliable conclusions.
• The least squares estimator does not have minimum variance in
the class of linear unbiased estimators.

This suggests that methods are needed for:

• Tests for detecting heteroskedasticity and correlation patterns
in the residuals.
• Calculation formula for correct standard errors for the least
squares estimators.
• Estimation procedures that will give minimum variance
estimators.

Chapter 8 looks at models with heteroskedastic errors.
Chapter 9 looks at models where the uncorrelated error assumption
does not describe the economic behaviour.
Econ 326 - Chapter 8
Heteroskedasticity

Heteroskedastic errors can be stated as:

$$ \mathrm{var}(e_i) = \sigma_i^2 \quad \text{for } i = 1, 2, \dots, N $$

The error variance can be different for each observation.

Example
With cross-section survey data, a model that explains household
expenditure as a function of household income may feature
heteroskedastic errors.
Households with relatively high income will have more discretionary
income and, therefore, more variability in expenditure habits.
Higher income households will have larger error variance compared
to households in a lower income group.

Detecting Heteroskedasticity

The Breusch-Pagan Test

The linear regression equation with heteroskedastic errors is:

$$ y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + \dots + \beta_K x_{iK} + e_i \quad \text{with } \mathrm{var}(e_i) = \sigma_i^2 $$

A proposal is that the error variance is a function of a set of
explanatory variables. This leads to a general functional form for the
error variance stated as:

$$ \sigma_i^2 = h(\alpha_1 + \alpha_2 x_{i2} + \alpha_3 x_{i3} + \dots + \alpha_K x_{iK}) $$

When $\alpha_2 = \alpha_3 = \dots = \alpha_K = 0$ the equation errors are
homoskedastic.
A test of interest is:

$$ H_0: \alpha_2 = \alpha_3 = \dots = \alpha_K = 0 \quad \text{(homoskedasticity)} $$

$$ H_1: \text{not all } \alpha \text{ in } H_0 \text{ are zero} \quad \text{(heteroskedasticity)} $$
A test method is now presented.
Estimate the regression equation by least squares and obtain the least
squares residuals $\hat{e}_i$.
Then run the artificial regression:

$$ \hat{e}_i^2 = \alpha_1 + \alpha_2 x_{i2} + \alpha_3 x_{i3} + \dots + \alpha_K x_{iK} + v_i $$

Using the $R^2$ goodness-of-fit statistic from the artificial regression, a
test statistic for the overall significance of the artificial regression is:

$$ N \cdot R^2 $$

This statistic, named the Breusch-Pagan test for heteroskedasticity,
can be compared with a chi-square ($\chi^2$) distribution with K–1
degrees of freedom.
With Stata the estat hettest command can be used to report the
Breusch-Pagan test for heteroskedasticity.

Example
The data set introduced in Chapter 2 contains observations on weekly
food expenditure (y) and income (x), in dollars, for a sample of 40
households with three family members.
The linear regression equation is:

$$ y_i = \beta_1 + \beta_2 x_i + e_i $$

Consider that the error variance is potentially a function of income.
The Breusch-Pagan test statistic for heteroskedasticity is calculated as
7.38.
The p-value for the test statistic is:

$$ p = P(\chi^2_{(1)} > 7.38) = 0.007 $$

The small p-value gives evidence to reject the null hypothesis of
homoskedasticity.
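The Breusch-Pagan steps described above can be sketched in a few lines of Python for the single-regressor case (K = 2, so one degree of freedom). This is a minimal illustration, not the Stata implementation: the function names and the generated data are hypothetical, and the chi-square(1) tail probability uses the identity P(χ²(1) > s) = erfc(√(s/2)), which holds only for one degree of freedom.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope

def r_squared(x, y):
    """R-squared of the least squares regression of y on x."""
    a, b = simple_ols(x, y)
    ybar = sum(y) / len(y)
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - sse / sst

def chi2_1_tail(stat):
    """P(chi-square(1) > stat); valid for one degree of freedom only."""
    return math.erfc(math.sqrt(stat / 2.0))

def breusch_pagan(x, y):
    """N * R^2 from the artificial regression of squared residuals on x."""
    a, b = simple_ols(x, y)
    e2 = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    stat = max(0.0, len(x) * r_squared(x, e2))
    return stat, chi2_1_tail(stat)

# Hypothetical data (NOT the food expenditure data set): the error spread
# grows with x, so the errors are heteroskedastic by construction.
x = [float(i) for i in range(1, 41)]
y = [10 + 2 * xi + 0.3 * xi * math.sin(7 * i) for i, xi in enumerate(x)]
stat, p = breusch_pagan(x, y)
print("N*R^2 =", round(stat, 2), " p-value =", round(p, 4))

# The p-value reported in the notes for the test statistic 7.38:
print(round(chi2_1_tail(7.38), 3))
```

With more than one explanatory variable in the artificial regression, the degrees of freedom are K–1 and a general chi-square tail function would be needed in place of chi2_1_tail.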
Log-transformed variables have useful application.
For the food expenditure data consider the log-log model:

$$ \ln(y_i) = \beta_1 + \beta_2 \ln(x_i) + e_i $$

For this model the Breusch-Pagan test statistic for heteroskedasticity
has a calculated value of 2.30 with an accompanying p-value of 0.13.
The conclusion now is that, for this data set, the homoskedasticity
assumption suits the log-log model.
The log transformation rescales the data and therefore may correct
for heteroskedasticity that is observed in the linear model.
In particular, the observations in the upper quartile are compressed
so that the difference with the other observations is less extreme.

The Goldfeld-Quandt Test

For the household expenditure function recognize that higher
variability may be associated with higher household income.
Sort the data set in ascending order of household income (x).
A heteroskedastic error assumption is:

$$ \mathrm{var}(e_i) = \begin{cases} \sigma_1^2 & \text{for group 1 (the 'low' income group)} \\ \sigma_2^2 & \text{for group 2 (the 'high' income group)} \end{cases} \qquad \text{with } \sigma_1^2 < \sigma_2^2 $$

Another way of stating the error variance assumption that explicitly
recognizes higher error variance for higher household income (x) is:

$$ \mathrm{var}(e_i) = \sigma^2 x_i $$

To test for heteroskedasticity of the two-group form consider the one-sided test:

$$ H_0: \sigma_1^2 = \sigma_2^2 \quad \text{against} \quad H_1: \sigma_1^2 < \sigma_2^2 $$

The test method is called the Goldfeld-Quandt test.
The test statistic is calculated as follows.
Split the sample into two groups, with N1 observations in group 1
and N2 observations in group 2.
Fit separate least squares (OLS) regressions to each group of
observations and estimate an error variance for each group as:

$$ \hat{\sigma}_1^2 = \frac{1}{N_1 - K}\, SSE_1 \quad \text{and} \quad \hat{\sigma}_2^2 = \frac{1}{N_2 - K}\, SSE_2 $$

where SSE1 and SSE2 are the sum of the squared residuals for
each group.
The Goldfeld-Quandt test statistic is:

$$ GQ = \frac{\hat{\sigma}_2^2}{\hat{\sigma}_1^2} $$

This can be compared with an F distribution with (N2 − K, N1 − K)
degrees of freedom.
A p-value is calculated as:

$$ p = P(F_{(N_2 - K,\ N_1 - K)} > GQ) $$

An application may suggest a higher error variance in the first group.
The one-tail test of interest is now:

$$ H_0: \sigma_1^2 = \sigma_2^2 \quad \text{against} \quad H_1: \sigma_1^2 > \sigma_2^2 $$

The Goldfeld-Quandt test statistic and p-value are computed as:

$$ GQ = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2}, \qquad p = P(F_{(N_1 - K,\ N_2 - K)} > GQ) $$

A two-tail test can also be considered:

$$ H_0: \sigma_1^2 = \sigma_2^2 \quad \text{against} \quad H_1: \sigma_1^2 \ne \sigma_2^2 $$

With $\hat{\sigma}_1^2 > \hat{\sigma}_2^2$ the Goldfeld-Quandt test statistic and p-value are:

$$ GQ = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2} > 1, \qquad p = 2 \cdot P(F_{(N_1 - K,\ N_2 - K)} > GQ) $$
Example
For the household food expenditure data set sort the observations in
ascending order of household income (x) and then split the sample
into two groups each with 20 observations.
Estimation results for the linear regression equation show:

SSE1 = 64346    ('low' income group)
SSE2 = 232595   ('high' income group)

The Goldfeld-Quandt test statistic is:

$$ GQ = \frac{\hat{\sigma}_2^2}{\hat{\sigma}_1^2} = \frac{SSE_2 / (20 - 2)}{SSE_1 / (20 - 2)} = 3.61 $$

For the one-tail test:

$$ H_0: \sigma_1^2 = \sigma_2^2 \quad \text{against} \quad H_1: \sigma_1^2 < \sigma_2^2 $$

the p-value is calculated as:

$$ p = P(F_{(18,\ 18)} > 3.61) $$

The p-value calculation is illustrated in the figure.

[Figure: probability density function of F(18,18), with the tail area p = 0.005 to the right of GQ = 3.61.]

With Microsoft Excel, the function F.DIST.RT(3.61, 18, 18)
gives the p-value 0.005 (this calculation was also confirmed by the
Stata results).
The calculated p-value for the Goldfeld-Quandt test is less than any
standard significance level (such as 0.05 or 0.01) and therefore the
null hypothesis of homoskedasticity is rejected.
This result agrees with the finding of the Breusch-Pagan test statistic
presented earlier.
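As a rough check on the Goldfeld-Quandt arithmetic in the food expenditure example, the statistic and its upper-tail F probability can be computed with the Python standard library alone. This is a sketch under stated assumptions: the function names are hypothetical, and the tail probability is obtained by integrating the F density with Simpson's rule, a substitute for the Excel F.DIST.RT function quoted in the notes. The density is set to zero at x = 0, which is appropriate only when the numerator degrees of freedom exceed 2.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0.0:
        return 0.0  # correct at x = 0 only when d1 > 2
    log_pdf = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
               - math.lgamma(d2 / 2)
               + (d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2))
    return math.exp(log_pdf)

def f_upper_tail(stat, d1, d2, steps=2000):
    """P(F(d1, d2) > stat) as 1 minus a Simpson's-rule integral on [0, stat]."""
    h = stat / steps  # steps must be even for Simpson's rule
    total = f_pdf(0.0, d1, d2) + f_pdf(stat, d1, d2)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f_pdf(j * h, d1, d2)
    return 1.0 - total * h / 3.0

def goldfeld_quandt(sse_num, n_num, sse_den, n_den, k):
    """GQ ratio of estimated error variances and its one-tail p-value.

    The group expected to have the larger variance goes in the numerator.
    """
    gq = (sse_num / (n_num - k)) / (sse_den / (n_den - k))
    return gq, f_upper_tail(gq, n_num - k, n_den - k)

# Values reported in the notes: SSE2 = 232595 ('high' income group),
# SSE1 = 64346 ('low' income group), 20 observations per group, K = 2.
gq, p = goldfeld_quandt(232595.0, 20, 64346.0, 20, 2)
print("GQ =", round(gq, 2), " p-value =", round(p, 3))
```

For the two-tail version, the same tail probability would be doubled after placing the larger variance estimate in the numerator.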
For the log-log version of the food expenditure model a Goldfeld-Quandt
test statistic was calculated as 2.93.
With the one-sided alternative of higher error variance in the 'high'
income group the calculated p-value of 0.014 suggests that, at a 1%
significance level, the homoskedasticity hypothesis is not rejected
(it would be rejected at a 5% significance level).

Heteroskedasticity-Consistent Standard Errors

Consider a simple model with heteroskedastic errors:

$$ y_i = \beta_1 + \beta_2 x_i + e_i \quad \text{for } i = 1, 2, \dots, N $$

with $E(e_i) = 0$, $\mathrm{var}(e_i) = \sigma_i^2$ and $\mathrm{cov}(e_i, e_j) = 0$ for $i \ne j$.

The least squares principle gives an unbiased estimation rule for the
parameters.

With heteroskedastic errors, it can be shown that the variance of the
slope estimator b2 is:

$$ \mathrm{var}(b_2) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2 \sigma_i^2}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^2} \qquad (1) $$

In the case of homoskedasticity $\sigma_i^2 = \sigma^2$ for all i and (1) is
simplified to:

$$ \mathrm{var}(b_2) = \sigma^2 \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^2} = \sigma^2 \frac{1}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \qquad (2) $$
Equation (2) is the calculation formula used for obtaining the
variances of the least squares estimators that are routinely reported
on the least squares (OLS) estimation computer output. This may
overestimate or underestimate the correct calculation formula stated
in Equation (1).
However Equation (1) is not operational as stated since the error
variances $\sigma_i^2$ are unknown.
To obtain an operational formula, the White variance estimator
(proposed by Halbert White of the University of California at San
Diego) approximates Equation (1) by:

$$ \widehat{\mathrm{var}}(b_2) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2 \hat{e}_i^2}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^2} $$

where the $\hat{e}_i$ for i = 1, . . . , N are the least squares residuals.

With Stata the robust option on the regress command will
report estimates of the variances and covariances of the parameter
estimators that are adjusted for general heteroskedasticity.
As a technical note, for a bias adjustment, Stata scales the calculations
by multiplying the variances and covariances by N/(N−K).
The robust option still reports the least squares parameter
estimates – only the variances (and therefore, the standard errors) are
adjusted. This is intended to permit more reliable hypothesis testing.
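The White formula for the slope variance in the simple regression model can be written out directly. The sketch below is illustrative only: the function names and the small data set are hypothetical, and the N/(N−K) scaling that the notes attribute to Stata is left out so that the code matches the unscaled formula.

```python
def white_variance(x, resid):
    """White estimate of var(b2): sum((x - xbar)^2 * e^2) / (sum((x - xbar)^2))^2."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    weighted = sum((xi - xbar) ** 2 * e ** 2 for xi, e in zip(x, resid))
    return weighted / sxx ** 2

def conventional_variance(x, resid, k=2):
    """Usual OLS estimate of var(b2), Equation (2) with sigma^2 replaced by SSE/(N-K)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sigma2_hat = sum(e ** 2 for e in resid) / (n - k)
    return sigma2_hat / sxx

# Hypothetical regressor values and least squares residuals.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
resid = [1.0, -1.0, 1.0, -1.0, 1.0]

print(white_variance(x, resid))         # White (robust) variance of b2
print(conventional_variance(x, resid))  # conventional variance of b2
```

Taking square roots of the two values gives the corresponding standard errors, which are the quantities compared in hypothesis tests.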
Generalized Least Squares

Suppose that a linear regression equation is estimated by the least
squares principle (OLS) and diagnostic testing shows that
heteroskedastic errors are an important feature.
How can this information be used in model estimation?

Consider the linear regression equation:

$$ y_i = \beta_1 + \beta_2 x_i + e_i \quad \text{for } i = 1, 2, \dots, N $$

with $E(e_i) = 0$, $\mathrm{var}(e_i) = \sigma_i^2$ and $\mathrm{cov}(e_i, e_j) = 0$ for $i \ne j$.

To make use of the information about the heteroskedastic errors, a
proposal is to find estimators of $\beta_1$ and $\beta_2$ (the intercept and slope
coefficients) that minimize the 'weighted' sum of squared errors:

$$ \sum_{i=1}^{N} \left( \frac{e_i}{\sigma_i} \right)^2 $$

This method is known as weighted least squares (WLS).
It is a special case of generalized least squares (GLS).

The generalized (weighted) least squares estimator can be obtained as
follows. Transform the regression equation by dividing the
observations by $\sigma_i$. The transformed model is:

$$ \left( \frac{y_i}{\sigma_i} \right) = \beta_1 \left( \frac{1}{\sigma_i} \right) + \beta_2 \left( \frac{x_i}{\sigma_i} \right) + v_i $$

The error of the transformed model is:

$$ v_i = \frac{e_i}{\sigma_i} $$

The statistical properties of the transformed error are:

$$ E(v_i) = E\left( \frac{e_i}{\sigma_i} \right) = \frac{1}{\sigma_i} E(e_i) = 0 \quad \text{(zero mean)} $$

$$ \mathrm{var}(v_i) = \mathrm{var}\left( \frac{e_i}{\sigma_i} \right) = \frac{1}{\sigma_i^2} \mathrm{var}(e_i) = 1 \quad \text{(unit variance, homoskedastic errors)} $$

$$ E(v_i v_j) = \frac{1}{\sigma_i \sigma_j} E(e_i e_j) = 0 \ \text{for } i \ne j \quad \text{(uncorrelated errors)} $$

Therefore, the transformed error satisfies the standard assumptions
of the Gauss-Markov theorem.
Least squares (OLS) estimation of the transformed model gives the
WLS or GLS estimator.
A practical problem is that $\sigma_i^2$ is unknown.
To make this operational a form for $\sigma_i^2$ must be specified.

Example
For modelling household food expenditure (y) as a function of
income (x) a reasonable assumption may be:

$$ \mathrm{var}(e_i) = \sigma_i^2 = \sigma^2 x_i $$

The error variance increases as income increases (this assumes $x_i > 0$
for all i since non-positive variance is not allowed).
It can be noted that

$$ \sigma_i = \sigma \sqrt{x_i} $$

The transformed model is:

$$ \left( \frac{y_i}{\sqrt{x_i}} \right) = \beta_1 \left( \frac{1}{\sqrt{x_i}} \right) + \beta_2 \left( \frac{x_i}{\sqrt{x_i}} \right) + v_i \quad \text{where } v_i = \frac{e_i}{\sqrt{x_i}} $$

For this model:

$$ \mathrm{var}(v_i) = \frac{1}{x_i} \mathrm{var}(e_i) = \sigma^2 \quad \text{(homoskedastic errors)} $$

Note that the transformed model has no intercept coefficient.

Least squares (OLS) estimation can be applied to the transformed
model to get the weighted least squares (WLS) estimates.
A problem is that the specification of the error variance equation may
not be clear-cut.
For example, another variance form is:

$$ \mathrm{var}(e_i) = \sigma^2 x_i^2 $$

The transformed model is now:

$$ \left( \frac{y_i}{x_i} \right) = \beta_1 \left( \frac{1}{x_i} \right) + \beta_2 \left( \frac{x_i}{x_i} \right) + v_i = \beta_1 \left( \frac{1}{x_i} \right) + \beta_2 + v_i $$
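The weighted least squares transformation for the variance form var(e_i) = σ² x_i can be sketched as follows. This is a minimal illustration with a hypothetical function name and made-up data: each observation is divided by √x_i, and least squares is applied to the transformed model without an intercept by solving the two normal equations directly.

```python
import math

def wls_sqrt_x(x, y):
    """WLS estimates (b1, b2) assuming var(e_i) = sigma^2 * x_i with x_i > 0.

    Regresses y/sqrt(x) on the two transformed regressors 1/sqrt(x) and
    x/sqrt(x) = sqrt(x), with no separate intercept term.
    """
    ys = [yi / math.sqrt(xi) for xi, yi in zip(x, y)]
    z1 = [1.0 / math.sqrt(xi) for xi in x]  # transformed constant
    z2 = [math.sqrt(xi) for xi in x]        # transformed x

    # Normal equations: [[s11, s12], [s12, s22]] [b1, b2]' = [t1, t2]'
    s11 = sum(a * a for a in z1)
    s12 = sum(a * b for a, b in zip(z1, z2))
    s22 = sum(b * b for b in z2)
    t1 = sum(a * c for a, c in zip(z1, ys))
    t2 = sum(b * c for b, c in zip(z2, ys))

    det = s11 * s22 - s12 ** 2
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2

# Hypothetical data lying exactly on the line y = 2 + 3x, so the
# estimates should reproduce the intercept 2 and the slope 3.
x = [1.0, 4.0, 9.0, 16.0]
y = [2.0 + 3.0 * xi for xi in x]
b1, b2 = wls_sqrt_x(x, y)
print(b1, b2)
```

For the alternative form var(e_i) = σ² x_i², the same approach applies with division by x_i in place of √x_i.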
Grouped Data

Example – Wheat Production in Australia
A data set, from an Australian wheat-growing district, contains
26 years of time-series data.
A linear regression equation is specified as:

$$ y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + e_i \quad \text{for } i = 1, 2, \dots, 26 $$

where
$y_i$ is quantity of wheat produced in year i,
$x_{i2}$ is the price of wheat guaranteed for year i,
$x_{i3}$ = 1, 2, . . . , 26 is a time trend variable that serves as a proxy
for technological improvements, and
$e_i$ is a random error.

The influence of weather is reflected in the error term.
New wheat varieties were introduced after year 13.
Their yield was less dependent on weather conditions and therefore
lower error variance is suggested for years 14 to 26.

The error assumptions for the model are:

$$ E(e_i) = 0 \quad \text{for all } i $$

$$ \mathrm{var}(e_i) = E(e_i^2) = \begin{cases} \sigma_1^2 & \text{for } i = 1, 2, \dots, 13 \\ \sigma_2^2 & \text{for } i = 14, 15, \dots, 26 \end{cases} $$

$$ \mathrm{cov}(e_i, e_j) = 0 \ \text{for all } i \ne j \quad \text{(uncorrelated errors)} $$

It is expected that $\sigma_1^2 > \sigma_2^2$.
This is an example of a model with heteroskedastic errors.

The equation can be estimated by least squares (OLS) and the
heteroskedasticity assumption can be tested with the
Goldfeld-Quandt test. The one-sided test of interest is:

$$ H_0: \sigma_1^2 = \sigma_2^2 \quad \text{against} \quad H_1: \sigma_1^2 > \sigma_2^2 $$

For the Australian wheat data set, the estimation results reported a
Goldfeld-Quandt test statistic of 11.11. The p-value for the test was
calculated as less than 0.0005 to give the conclusion that the null
hypothesis of homoskedasticity is rejected (at any reasonable
significance level such as 0.01 or 0.05) in favour of the alternative of
lower variance in the second half of the sample period.
The presence of heteroskedasticity means that the least squares
standard errors will be unreliable for confidence interval estimation
and hypothesis testing.
The White standard errors make adjustments for general
heteroskedasticity (see the earlier lecture notes).
Least squares estimation results are:

$$ \hat{y}_i = 139.9 + 19.54\, x_{i2} + 3.64\, x_{i3} $$

                        intercept    x_i2      x_i3
t-statistics – OLS      (6.03)       (1.12)    (2.57)
p-values                (<0.0005)    (0.27)    (0.02)
t-statistics – White    (5.35)       (0.94)    (2.13)
p-values                (<0.0005)    (0.36)    (0.04)

In this case, the t-statistics for individual tests of significance, based
on the least squares standard errors that ignored the
heteroskedasticity, were bigger than the t-statistics that used the
White standard errors that made use of the information about the
heteroskedasticity.
That is, the least squares standard errors were smaller than the White
standard errors and, therefore, overstated the precision of the
estimation.
It can be noted that the results show that the coefficient on the price
of wheat ($x_2$) is not significantly different from zero.

In this application, there is some useful information about the source
of heteroskedasticity – there are two subsets of observations, each
with a different variance. By including this information in the
estimation better estimates may be obtained.
Generalized least squares (GLS) estimates can be obtained by
working with the transformed model:

$$ \frac{y_i}{\sigma_1} = \beta_1 \frac{1}{\sigma_1} + \beta_2 \frac{x_{i2}}{\sigma_1} + \beta_3 \frac{x_{i3}}{\sigma_1} + \frac{e_i}{\sigma_1} \quad \text{for } i = 1, 2, \dots, 13 $$

$$ \frac{y_i}{\sigma_2} = \beta_1 \frac{1}{\sigma_2} + \beta_2 \frac{x_{i2}}{\sigma_2} + \beta_3 \frac{x_{i3}}{\sigma_2} + \frac{e_i}{\sigma_2} \quad \text{for } i = 14, 15, \dots, 26 $$

The important feature of the transformed model is that the error term
is homoskedastic. That is,

$$ \mathrm{var}\left( \frac{e_i}{\sigma_1} \right) = \frac{1}{\sigma_1^2} \mathrm{var}(e_i) = 1 \quad \text{for } i = 1, 2, \dots, 13 $$

$$ \mathrm{var}\left( \frac{e_i}{\sigma_2} \right) = \frac{1}{\sigma_2^2} \mathrm{var}(e_i) = 1 \quad \text{for } i = 14, 15, \dots, 26 $$

A practical problem is that the error variances $\sigma_1^2$ and $\sigma_2^2$ are
unknown.
A feasible estimator can be obtained by a two-step estimation
method.

STEP 1
Apply separate least squares (OLS) estimation to each
subset of observations. With N1 observations in the first group and
N2 observations in the second group the error variances are
estimated from the least squares residuals as:

$$ \hat{\sigma}_1^2 = \frac{1}{N_1 - K}\, SSE_1 = \frac{1}{N_1 - K} \sum_{i=1}^{N_1} \hat{e}_i^2 \quad \text{and} \quad \hat{\sigma}_2^2 = \frac{1}{N_2 - K}\, SSE_2 = \frac{1}{N_2 - K} \sum_{i=N_1+1}^{N} \hat{e}_i^2 $$

STEP 2
Obtain the feasible GLS (generalized least squares)
estimator by applying least squares (OLS) to the transformed model:

$$ y_i^* = \beta_1 x_{i1}^* + \beta_2 x_{i2}^* + \beta_3 x_{i3}^* + v_i \quad \text{for } i = 1, 2, \dots, N $$

where the transformed observations are constructed as:

$$ y_i^* = \begin{cases} y_i / \hat{\sigma}_1 & \text{for } i = 1, 2, \dots, 13 \\ y_i / \hat{\sigma}_2 & \text{for } i = 14, 15, \dots, 26 \end{cases} \qquad x_{i1}^* = \begin{cases} 1 / \hat{\sigma}_1 & \text{for } i = 1, 2, \dots, 13 \\ 1 / \hat{\sigma}_2 & \text{for } i = 14, 15, \dots, 26 \end{cases} $$

$$ x_{i2}^* = \begin{cases} x_{i2} / \hat{\sigma}_1 & \text{for } i = 1, 2, \dots, 13 \\ x_{i2} / \hat{\sigma}_2 & \text{for } i = 14, 15, \dots, 26 \end{cases} \qquad x_{i3}^* = \begin{cases} x_{i3} / \hat{\sigma}_1 & \text{for } i = 1, 2, \dots, 13 \\ x_{i3} / \hat{\sigma}_2 & \text{for } i = 14, 15, \dots, 26 \end{cases} $$

Note: The GLS estimator is no longer a linear function of $y_i$ because
$\hat{\sigma}_1$ and $\hat{\sigma}_2$ depend on $y_i$.
The usual interval estimates and hypothesis tests are now only
approximate tests in 'small' samples.

For the Australian wheat production equation, the GLS estimation
results are:

$$ \hat{y}_i = 138.1 + 21.72\, x_{i2} + 3.28\, x_{i3} $$

                    intercept    x_i2       x_i3
standard errors     (12.8)       (8.92)     (0.82)
t-statistics        (10.77)      (2.43)     (3.99)
p-values            (<0.0005)    (0.023)    (0.001)

These results can be compared with the least squares (OLS)
estimation results reported earlier:

$$ \hat{y}_i = 139.9 + 19.54\, x_{i2} + 3.64\, x_{i3} $$

                           intercept    x_i2       x_i3
standard errors - White    (26.13)      (20.74)    (1.71)
t-statistics – White       (5.35)       (0.94)     (2.13)
p-values                   (<0.0005)    (0.36)     (0.04)

The GLS estimation gives t-statistics for individual tests of
significance that are all significant at a 5% significance level.
The results show that the GLS standard errors are smaller than the
White standard errors that accompany the least squares estimation.
That is, the GLS method gives increased precision for the estimation.
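The two-step feasible GLS method for two variance groups can be sketched in plain Python. Everything here is an illustrative assumption rather than the method used to produce the reported wheat results: the function names are hypothetical, the data are generated artificially (they are not the Australian wheat data), and the least squares estimates are obtained by Gauss-Jordan elimination on the normal equations.

```python
import math

def ols(X, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def sse(X, y, b):
    """Sum of squared residuals for coefficients b."""
    return sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
               for r, yi in zip(X, y))

def fgls_two_groups(X, y, n1):
    """Feasible GLS when the first n1 observations form variance group 1."""
    k = len(X[0])
    sig = []
    # STEP 1: separate OLS on each group, sigma_hat^2 = SSE / (N_g - K).
    for Xg, yg in ((X[:n1], y[:n1]), (X[n1:], y[n1:])):
        bg = ols(Xg, yg)
        sig.append(math.sqrt(sse(Xg, yg, bg) / (len(yg) - k)))
    # STEP 2: divide every variable (including the constant) by its
    # group's sigma_hat, then apply OLS to the transformed data.
    scale = [sig[0]] * n1 + [sig[1]] * (len(y) - n1)
    Xs = [[xj / s for xj in r] for r, s in zip(X, scale)]
    ys = [yi / s for yi, s in zip(y, scale)]
    return ols(Xs, ys), sig

# Artificial data: 26 periods, noisier errors in the first 13 periods.
t = list(range(1, 27))
price = [(7 * i) % 5 + 1 for i in t]
X = [[1.0, float(p), float(i)] for p, i in zip(price, t)]
noise = [(1.0 if i <= 13 else 0.1) * math.sin(i) for i in t]
y = [10 + 2 * p + 0.5 * i + e for p, i, e in zip(price, t, noise)]

b, sig = fgls_two_groups(X, y, 13)
print("FGLS coefficients:", [round(v, 3) for v in b])
print("sigma_hat by group:", [round(s, 3) for s in sig])
```

Because the constant column is also divided by the estimated standard deviation, the transformed regression has no separate intercept, matching the x*_i1 variable in STEP 2.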
A 95% confidence interval estimate for the coefficient on the price of
wheat is calculated as:

$$ b_2^{GLS} \pm t_c\, \mathrm{se}(b_2^{GLS}) = 21.72 \pm 2.069\,(8.92) = [3.3,\ 40.2] $$

Although the interval estimate appears to be relatively wide, the
results can be compared with the interval estimate from least squares
estimation:

$$ b_2 \pm t_c\, \mathrm{se}(b_2)_{White} = 19.54 \pm 2.069\,(20.74) = [-23.4,\ 62.4] $$

Least squares has poor ability to estimate the price coefficient with
any precision.

Conclusions So Far

Economic theory is used to specify a linear regression equation.
The intercept and slope parameters can be estimated by the least
squares principle (OLS).
Following model estimation a variety of diagnostic tests can be
inspected. Examples are:
• the Jarque-Bera test for normality of the residuals (Chapter 4)
• the Ramsey RESET test for model misspecification (Chapter 6)
• the Chow test for structural change (Chapter 7)
• the Breusch-Pagan test for heteroskedastic errors (Chapter 8)
• the Goldfeld-Quandt test for heteroskedastic errors (Chapter 8)
Other tests are also available, but not presented here.
Suppose a test shows evidence of heteroskedasticity.
How should this be interpreted?
Three alternative approaches can be considered.

• The model may be misspecified, for example through an incorrect
functional form. Log transformations of the
variables may transform the heteroskedastic errors to
homoskedastic errors.

• With a correctly specified model, heteroskedastic errors
lead to least squares estimators that are unbiased.
But the least squares standard errors are incorrect.
Therefore, report the least squares parameter estimates
and use the White standard errors that are adjusted for
general heteroskedasticity for confidence interval
estimation and hypothesis testing.

• The above approach is inefficient.
That is, it does not give a minimum variance estimator.
To get an efficient estimator use generalized (weighted)
least squares – WLS or GLS.
This method requires the specification of a variance
function for the error variances $\sigma_i^2$.
In practice, this may not be clear-cut.