Lecture 13
Heteroskedasticity
(Chapter 10.1–10.5)
Copyright © 2006 Pearson Addison-Wesley. All rights reserved.
Agenda for Today
• Review Standard Errors (Chapter 5.2)
• Heteroskedasticity (Chapter 10.1)
• OLS and Heteroskedasticity (Chapter 10.2)
• Tests for Heteroskedasticity (Chapter 10.3)
• Generalized Least Squares (Chapter 10.4)
• GLS: An Example (Chapter 10.5)
Review of Standard Errors (Chapter 5.2)
• Our DGP describes an underlying process we
wish to study.
• The model includes a stochastic component,
the error term.
• There are many possible realizations of the
error term, but we get to observe only one.
• The underlying process plus the particular
realization of the error term give us our sample.
Review of Standard Errors (cont.)
• We observe only one sample.
• Our estimator (OLS) provides a rule for going
from one sample to one best guess of the
underlying parameters.
• If we could observe many samples, we could
estimate many best guesses.
• If we could observe many samples, we would
have a distribution of estimates.
Review of Standard Errors (cont.)
• Using our model and our sample, we
can estimate NOT ONLY the underlying
parameters BUT ALSO the cross-sample
distribution of our estimates.
• Estimating the cross-sample distribution
lets us judge how likely our best
guess is to be close to the real
underlying parameters.
Review of Standard Errors (cont.)
Gauss–Markov DGP
$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$
$E(\varepsilon_i) = 0$
$\mathrm{Var}(\varepsilon_i) = \sigma^2$
$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0, \text{ if } i \neq j$
$X$'s fixed across samples.
Review of Standard Errors (cont.)
Gauss–Markov DGP
$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$
$E(\varepsilon_i) = 0$
$X$'s fixed across samples.
These assumptions suffice to derive the
expectation of a linear estimator:
$E\left(\sum w_i Y_i\right) = \beta_0 \sum w_i + \beta_1 \sum w_i X_i$
Review of Standard Errors (cont.)
• To derive the variance of a linear
estimator, we need to add the two
assumptions about the variance
and covariances of the error term:
$\mathrm{Var}(\varepsilon_i) = \sigma^2$
$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0, \text{ if } i \neq j$
Review of Standard Errors (cont.)
Variance of a Linear Estimator
$\mathrm{Var}(\hat\beta) = \mathrm{Var}\left(\sum w_i Y_i\right)$
$= \sum_{i=1}^{n} \mathrm{Var}(w_i Y_i) + \sum_i \sum_{j \neq i} \mathrm{Cov}(w_i Y_i, w_j Y_j)$
$= \sum w_i^2 \mathrm{Var}(Y_i) = \sum w_i^2 \mathrm{Var}(\beta_0 + \beta_1 X_{1i} + \varepsilon_i)$
$= \sigma^2 \sum w_i^2$
Review of Standard Errors (cont.)
Variance of OLS with a Single Explanator
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$\mathrm{Var}(\hat\beta_1) = \sigma^2 \sum w_{\hat\beta_1 i}^2$
$w_{\hat\beta_1 i} = \dfrac{X_i - \bar X}{\sum_j (X_j - \bar X)^2}$
Review of Standard Errors (cont.)
Variance of OLS with a Single Explanator
$\mathrm{Var}(\hat\beta_1) = \sigma^2 \sum w_i^2 = \sigma^2 \sum \left[ \dfrac{X_i - \bar X}{\sum_j (X_j - \bar X)^2} \right]^2 = \dfrac{\sigma^2}{\sum_j (X_j - \bar X)^2} = \dfrac{\sigma^2}{\sum x_i^2}$
where $x_i = X_i - \bar X$ denotes deviations from the mean.
Review of Standard Errors (cont.)
• Problem: we do not know $\sigma^2$.
• Solution: estimate $\sigma^2$.
• We do not observe the ACTUAL error
terms, $\varepsilon_i$.
• We DO observe the residuals, $e_i$:
$s^2 = \dfrac{\sum e_i^2}{n - k - 1}$
Review of Standard Errors (cont.)
• We can estimate the STANDARD
ERROR of our OLS estimator (the
cross-sample standard deviation).
$\mathrm{e.s.e.}(\hat\beta_1) = \sqrt{\dfrac{\sum e_i^2 / (n - k - 1)}{\sum (X_i - \bar X)^2}}$
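To make the formula concrete, here is a minimal sketch in Python (the language used for all code examples below) that simulates one sample from a homoskedastic DGP, computes $s^2$ and the e.s.e. of the slope by hand, and checks them against the packaged OLS output. All DGP parameters are illustrative assumptions, not values from the text.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, k = 100, 1                          # one explanator
X = rng.uniform(0, 10, n)
eps = rng.normal(0, 2.0, n)            # homoskedastic: sigma = 2 (illustrative)
Y = 1.0 + 0.5 * X + eps

fit = sm.OLS(Y, sm.add_constant(X)).fit()
e = fit.resid

s2 = np.sum(e**2) / (n - k - 1)        # s^2 = sum(e_i^2) / (n - k - 1)
ese_b1 = np.sqrt(s2 / np.sum((X - X.mean())**2))

print(ese_b1, fit.bse[1])              # hand-rolled e.s.e. matches the package
```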
Review of Standard Errors (cont.)
• Estimated standard errors are used to
calculate t-statistics, used for testing
single hypotheses.
$t_{\hat\beta} = \dfrac{\hat\beta - \beta^*}{\mathrm{e.s.e.}(\hat\beta)}$
Review of Standard Errors (cont.)
• Estimated Standard Errors are also
used to calculate Confidence Intervals.
In $\alpha\%$ of samples, the C.I. will contain
the true parameter value, $\beta$:
$\mathrm{C.I.}_{\alpha\%} = \hat\beta \pm t_{(1-\alpha)/2}^{\,n-k-1} \cdot \mathrm{e.s.e.}(\hat\beta)$
Review of Standard Errors (cont.)
• Our formula for Estimated Standard
Errors relied on ALL the Gauss–Markov
DGP assumptions.
• For this lecture, we will focus on the
assumption of homoskedasticity.
• What happens if we relax the
assumption that $\mathrm{Var}(\varepsilon_i) = \sigma^2$?
Heteroskedasticity (Chapter 10.1)
• HETEROSKEDASTICITY
– The variance of $\varepsilon_i$ is NOT a constant $\sigma^2$.
– The variance of $\varepsilon_i$ is greater for some
observations than for others:
$\mathrm{Var}(\varepsilon_i) = \sigma_i^2$
Heteroskedasticity (cont.)
• For example, consider a regression of
housing expenditures on income.
$\mathrm{Rent}_i = \beta_0 + \beta_1 \mathrm{Income}_i + \varepsilon_i$
• Consumers with low incomes have little
scope for varying their rent expenditures:
$\mathrm{Var}(\varepsilon_i)$ is low.
• Wealthy consumers can choose to spend a
lot of money on rent, or to spend less,
depending on tastes: $\mathrm{Var}(\varepsilon_i)$ is high.
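A minimal simulation sketch of this pattern (all numbers are illustrative assumptions): the error standard deviation is made proportional to income, so the scatter of rent around the regression line fans out as income rises.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
income = rng.uniform(10, 100, n)        # illustrative income levels

# Heteroskedastic errors: sd proportional to income, so
# Var(eps_i) = (0.2 * income_i)^2 grows with income.
eps = rng.normal(0, 0.2 * income)
rent = 2.0 + 0.3 * income + eps         # illustrative beta0, beta1

# Low-income observations hug the line; high-income ones scatter widely.
```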
Figure 10.1 Rents and Incomes for a
Sample of New Yorkers
OLS and Heteroskedasticity
(Chapter 10.2)
• What are the implications of
heteroskedasticity for OLS?
• Under the Gauss–Markov assumptions
(including homoskedasticity), OLS was
the Best Linear Unbiased Estimator.
• Under heteroskedasticity, is OLS
still Unbiased?
• Is OLS still Best?
OLS and Heteroskedasticity (cont.)
• A DGP with Heteroskedasticity
$Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki} + \varepsilon_i$
$E(\varepsilon_i) = 0$
$\mathrm{Var}(\varepsilon_i) = \sigma_i^2$
$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0 \text{ for } i \neq j$
$X$'s fixed across samples
OLS and Heteroskedasticity (cont.)
• Is OLS still unbiased under this DGP?
E(wiYi )  E(wiYi )  wi E(Yi )
 wi ( 0  1 X1i  .. k X ki )
E(wiYi )  1 if wi  0, wi X1i  1,
wi X 2i  0...wi X ki  0
• The unbiasedness conditions are the same
as under the Gauss–Markov DGP.
• OLS is still unbiased!
OLS and Heteroskedasticity (cont.)
• To determine whether OLS is “Best” (i.e.
the unbiased linear estimator with the
lowest variance), we need to calculate
the variance of a linear estimator under
heteroskedasticity.
OLS and Heteroskedasticity (cont.)
• The variance of a linear estimator:
$\mathrm{Var}\left(\sum w_i Y_i\right) = \sum_{i=1}^{n} \mathrm{Var}(w_i Y_i) + \sum_i \sum_{j \neq i} \mathrm{Cov}(w_i Y_i, w_j Y_j)$
$= \sum w_i^2 \mathrm{Var}(Y_i) + \sum_i \sum_{j \neq i} w_i w_j \mathrm{Cov}(Y_i, Y_j)$
$= \sum w_i^2 \mathrm{Var}(Y_i) + 0 = \sum w_i^2 \sigma_i^2$
• How does this differ from the formula
under homoskedasticity?
OLS and Heteroskedasticity (cont.)
• The variance of a linear estimator is $\sum w_i^2 \sigma_i^2$
• OLS minimizes $\sigma^2 \sum w_i^2$
• OLS is no longer efficient!
OLS and Heteroskedasticity (cont.)
• Under heteroskedasticity, OLS is
unbiased but inefficient.
• OLS does not have the smallest
possible variance, but its variance may
be acceptable. And the estimates are
still unbiased.
• However, we do have one very serious
problem: our estimated standard error
formulas are wrong!
OLS and Heteroskedasticity (cont.)
• For the k = 1 case, the OLS weight to
estimate $\beta_1$ is
$w_i = \dfrac{x_i}{\sum x_j^2}$
• The variance of the estimator is
$\mathrm{Var}(\hat\beta_1) = \dfrac{\sum x_i^2 \sigma_i^2}{\left( \sum x_j^2 \right)^2}$
OLS and Heteroskedasticity (cont.)
• For the k = 1 case, the variance of OLS is
$\mathrm{Var}(\hat\beta_1) = \dfrac{\sum x_i^2 \sigma_i^2}{\left( \sum x_j^2 \right)^2}$
• However, our old e.s.e. formula is
$\mathrm{e.s.e.}(\hat\beta_1) = \sqrt{\dfrac{s^2}{\sum x_j^2}} = \sqrt{\dfrac{\sum e_i^2}{(n-2)\sum x_j^2}}$
• This formula is no longer valid. Our C.I.'s and
hypothesis tests will NOT be correct!
OLS and Heteroskedasticity (cont.)
• Implications of Heteroskedasticity:
– OLS is still unbiased.
– OLS is no longer efficient; some other
linear estimator will have a lower variance.
– Estimated Standard Errors will be incorrect;
C.I.’s and hypothesis tests (both t- and
F- tests) will be incorrect.
OLS and Heteroskedasticity (cont.)
• Implications of Heteroskedasticity
– OLS is no longer efficient; some other
linear estimator will have a lower variance.
• Can we use a better estimator?
– Estimated Standard Errors will be incorrect;
C.I.’s and hypothesis tests (both t- and
F- tests) will be incorrect.
• If we keep using OLS, can we calculate
correct e.s.e.’s?
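A Monte Carlo sketch of both claims, under an assumed heteroskedastic DGP: across many samples the mean of $\hat\beta_1$ sits at the true value (unbiasedness), while the conventional e.s.e. systematically misstates the true cross-sample standard deviation.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, reps = 200, 2000
X = rng.uniform(1, 10, n)               # X's fixed across samples, per the DGP
XX = sm.add_constant(X)

betas, eses = [], []
for _ in range(reps):
    eps = rng.normal(0, 0.5 * X)        # heteroskedastic: sigma_i rises with X_i
    y = 1.0 + 2.0 * X + eps
    fit = sm.OLS(y, XX).fit()
    betas.append(fit.params[1])
    eses.append(fit.bse[1])

print(np.mean(betas))                   # close to 2.0: OLS is still unbiased
print(np.std(betas), np.mean(eses))     # true sampling SD vs. conventional e.s.e.
```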
Tests for Heteroskedasticity
(Chapter 10.3)
• Before we turn to remedies for
heteroskedasticity, let us first consider tests
for the complication.
• There are two types of tests:
1. Tests for continuous changes in variance:
White and Breusch–Pagan tests
2. Tests for discrete (lumpy) changes in variance:
the Goldfeld–Quandt test
The White Test
• The White test for heteroskedasticity has
a basic premise: if disturbances are
homoskedastic, then squared errors are
on average roughly constant.
• Explanators should NOT be able to
predict squared errors, or their proxy,
squared residuals.
• The White test is the most general test for
heteroskedasticity.
The White Test (cont.)
• Five Steps of the White Test:
1. Regress Y against your various
explanators using OLS
2. Compute the OLS residuals, $e_1, \dots, e_n$
3. Regress $e_i^2$ against a constant, all of
the explanators, the squares of the
explanators, and all possible interactions
between the explanators ($p$ slopes total)
The White Test (cont.)
• Five Steps of the White Test (cont.)
4. Compute $R^2$ from the "auxiliary
equation" in step 3
5. Compare $nR^2$ to the critical value
from the Chi-squared distribution with
$p$ degrees of freedom.
The White Test: Example
(1) Estimate $\mathrm{Wage}_i = \beta_0 + \beta_1 \mathrm{ed}_i + \beta_2 \mathrm{exp}_i + \beta_3 \mathrm{IQ}_i + \varepsilon_i$
(2) Calculate $e_i = \mathrm{Wage}_i - \hat\beta_0 - \hat\beta_1 \mathrm{ed}_i - \hat\beta_2 \mathrm{exp}_i - \hat\beta_3 \mathrm{IQ}_i$
(3) Regress $e_i^2 = \gamma_0 + \gamma_1 \mathrm{ed}_i + \gamma_2 \mathrm{ed}_i^2 + \gamma_3 \mathrm{exp}_i + \gamma_4 \mathrm{exp}_i^2 + \gamma_5 \mathrm{IQ}_i + \gamma_6 \mathrm{IQ}_i^2$
$\qquad + \gamma_7 \mathrm{ed}_i \mathrm{exp}_i + \gamma_8 \mathrm{ed}_i \mathrm{IQ}_i + \gamma_9 \mathrm{exp}_i \mathrm{IQ}_i + v_i$
(4) Compute $nR^2$ from (3)
(5) Reject homoskedasticity if $nR^2 >$ the Chi-squared critical
value with 9 degrees of freedom (16.92, if the
significance level is 0.05)
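In practice the auxiliary regression need not be built by hand: statsmodels provides a canned White test. A sketch on simulated stand-in data (the wage/ed/exp/IQ numbers below are invented for illustration, not the text's dataset):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(3)
n = 1000
ed = rng.normal(13, 2, n)
exp_ = rng.uniform(0, 30, n)
iq = rng.normal(100, 15, n)
eps = rng.normal(0, 0.5 + 0.05 * exp_)          # heteroskedastic by construction
wage = 1 + 0.8 * ed + 0.2 * exp_ + 0.1 * iq + eps

X = sm.add_constant(np.column_stack([ed, exp_, iq]))
resid = sm.OLS(wage, X).fit().resid             # steps 1-2

# Steps 3-5: het_white runs the auxiliary regression (levels, squares,
# cross-terms) and returns nR^2 with its chi-squared p-value.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(resid, X)
print(lm_stat, lm_pvalue)
```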
The White Test
• The White test is very general, and
provides very explicit directions. The
econometrician has no judgment calls
to make.
• The White test also burns through
degrees of freedom very, very rapidly.
• The White test is appropriate only for
“large” sample sizes.
The Breusch–Pagan Test
• The Breusch–Pagan test is very similar to the
White test.
• The White test specifies exactly which
explanators to include in the auxiliary
equation. Because the test includes
cross-terms, the number of slopes ($p$)
increases very quickly.
• In the Breusch–Pagan test, the
econometrician selects which explanators to
include. Otherwise, the tests are the same.
The Breusch–Pagan Test (cont.)
• In the Breusch–Pagan test, the
econometrician selects $m$ explanators to
include in the auxiliary equation.
• Which explanators to include is a
judgment call.
• A good judgment call leads to a more
powerful test than the White test.
• A poor judgment call leads to a poor test.
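A sketch of that judgment call using statsmodels' het_breuschpagan, on simulated data where (by construction) experience drives the variance; all data here are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(4)
n = 1000
ed = rng.normal(13, 2, n)
exp_ = rng.uniform(0, 30, n)
eps = rng.normal(0, 0.5 + 0.05 * exp_)   # variance driven by experience
wage = 1 + 0.8 * ed + 0.2 * exp_ + eps

X = sm.add_constant(np.column_stack([ed, exp_]))
resid = sm.OLS(wage, X).fit().resid

# Judgment call: we suspect experience drives the variance, so the
# auxiliary regression uses only a constant and experience (m = 1).
Z = sm.add_constant(exp_)
lm_stat, lm_pvalue, _, _ = het_breuschpagan(resid, Z)
print(lm_stat, lm_pvalue)
```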
The Goldfeld–Quandt Test
• Both the White test and the Breusch–Pagan
test focus on smoothly changing variances
for the disturbances.
• The Goldfeld–Quandt test compares
the variance of error terms across
discrete subgroups.
• Under homoskedasticity, all subgroups
should have the same estimated variances.
The Goldfeld–Quandt Test (cont.)
• The Goldfeld–Quandt test compares
the variance of error terms across
discrete subgroups.
• The econometrician must divide the
data into h discrete subgroups.
The Goldfeld–Quandt Test (cont.)
• If the Goldfeld–Quandt test is
appropriate, it will generally be clear
which subgroups to use.
The Goldfeld–Quandt Test (cont.)
• For example, the econometrician
might ask whether men's and women's
incomes vary similarly around their
predicted means, given education
and experience.
• To conduct a Goldfeld–Quandt test,
divide the data into h = 2 groups, one for
men and one for women.
The Goldfeld–Quandt Test (cont.)
(1) Divide the $n$ observations into $h$ groups, of sizes $n_1, \dots, n_h$
(2) Choose two groups, say 1 and 2. Test
$H_0: \sigma_1^2 = \sigma_2^2$ against $H_a: \sigma_1^2 \neq \sigma_2^2$
(3) Regress Y against the explanators for group 1.
(4) Regress Y against the explanators for group 2.
Goldfeld–Quandt Test (cont.)
(5) Relabel the groups as $L$ and $S$, such that
$\dfrac{SSR_L}{n_L - k} \geq \dfrac{SSR_S}{n_S - k}$
Compute $G = \dfrac{SSR_L / (n_L - k)}{SSR_S / (n_S - k)}$
(6) Compare $G$ to the critical value for an F-statistic
with $(n_L - k)$ and $(n_S - k)$ degrees of freedom.
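A minimal sketch implementing steps (3)–(6) directly (statsmodels also ships a canned het_goldfeldquandt; the hand-rolled version below mirrors the slides' notation, and the two groups are simulated with deliberately unequal error variances):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import f as f_dist

def goldfeld_quandt(y1, X1, y2, X2):
    """Steps (3)-(6): fit each group by OLS, relabel so the larger s^2
    is on top, and return G with its one-sided F p-value."""
    fits = [sm.OLS(y, sm.add_constant(X)).fit()
            for y, X in ((y1, X1), (y2, X2))]
    s2 = [ft.ssr / ft.df_resid for ft in fits]     # SSR / (n - k)
    L, S = (0, 1) if s2[0] >= s2[1] else (1, 0)    # relabel groups as L and S
    G = s2[L] / s2[S]
    pval = f_dist.sf(G, fits[L].df_resid, fits[S].df_resid)
    return G, pval

# Illustrative use: two simulated groups with unequal error variances.
rng = np.random.default_rng(5)
X1, X2 = rng.uniform(0, 10, 300), rng.uniform(0, 10, 300)
y1 = 1 + 2 * X1 + rng.normal(0, 1.0, 300)
y2 = 1 + 2 * X2 + rng.normal(0, 2.0, 300)
print(goldfeld_quandt(y1, X1, y2, X2))
```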
Goldfeld–Quandt Test: An Example
• Do men's and women's incomes vary
similarly about their respective means,
given education and experience?
• That is, do the error terms for an income
equation have different variances for
men and women?
• We have a sample with 3,394 men and
3,146 women.
Goldfeld–Quandt Test:
An Example (cont.)
(1) Divide the $n$ observations into men and women,
of sizes $n_m$ and $n_w$.
(2) We have only two groups, so choose both of them. Test
$H_0: \sigma_m^2 = \sigma_w^2$ against $H_a: \sigma_m^2 \neq \sigma_w^2$
(3) For the men, regress
$\log(\mathrm{income})_i = \beta_0 + \beta_1 \mathrm{ed}_i + \beta_2 \mathrm{exp}_i + \beta_3 \mathrm{exp}_i^2 + \varepsilon_i$
(4) For the women, regress
$\log(\mathrm{income})_i = \beta_0 + \beta_1 \mathrm{ed}_i + \beta_2 \mathrm{exp}_i + \beta_3 \mathrm{exp}_i^2 + v_i$
Goldfeld–Quandt Test:
An Example (cont.)
(5) $s_m^2 = \dfrac{SSR_m}{n_m - k} = \dfrac{1736.64}{3394 - 4} = 0.5123$
$s_w^2 = \dfrac{SSR_w}{n_w - k} = \dfrac{1851.52}{3146 - 4} = 0.5893$
Compute $G = \dfrac{0.5893}{0.5123} = 1.15$
(6) Compare $G$ to the critical value for an F-statistic
with 3142 and 3390 degrees of freedom, which is
approximately 1.06 at the 5% significance level.
Since $G = 1.15$ exceeds it, we reject the null hypothesis at the 5% level.
Generalized Least Squares
(Chapter 10.4)
• OLS is unbiased, but not efficient.
• The OLS weights are not optimal.
• Suppose we are estimating a straight line
through the origin: $Y = \beta X + \varepsilon$
• Under homoskedasticity, observations with
higher X values are relatively less distorted by
the error term.
• OLS places greater weight on observations
with high X values.
Figure 10.2 Homoskedastic Disturbances
More Misleading at Smaller X ’s
Generalized Least Squares
• Suppose observations with higher $X$ values
have error terms with much higher variances.
• Under this DGP, observations with high $X$'s
(and high variances of $\varepsilon$) may be more
misleading than observations with low $X$'s
(and low variances of $\varepsilon$).
• In general, we want to put more weight on
observations with smaller $\sigma_i^2$
Figure 10.3 Heteroskedasticity with
Smaller Disturbances at Smaller X ’s
Generalized Least Squares
• To construct the BLUE estimator for $\beta_s$, we
follow the same steps as before, but with our
new variance formula. The resulting estimator
is "Generalized Least Squares."
Start with a linear estimator, $\sum w_i Y_i$
Impose the unbiasedness conditions,
$\sum w_i X_{ri} = 0$ for $r \neq s$, $\sum w_i X_{si} = 1$
Find $w_i$ to minimize $\sum w_i^2 \sigma_i^2$
Generalized Least Squares (cont.)
• For an example, consider the DGP
$Y_i = \beta X_i + \varepsilon_i$
$E(\varepsilon_i) = 0$
$\mathrm{Var}(\varepsilon_i) = \sigma^2 d_i^2$
$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0 \text{ for } i \neq j$
$X_i, d_i$ fixed across samples
Generalized Least Squares (cont.)
$\min_{w_i} \; \sigma^2 \sum w_i^2 d_i^2 \quad \text{such that} \quad \sum_j w_j X_j = 1$
Solution: $w_i = \dfrac{X_i / d_i^2}{\sum_j \left( X_j / d_j \right)^2}$
Generalized Least Squares (cont.)
$\hat\beta^{GLS} = \dfrac{\sum_i \dfrac{Y_i}{d_i} \dfrac{X_i}{d_i}}{\sum_j \left( \dfrac{X_j}{d_j} \right)^2}$
Generalized Least Squares (cont.)
• In practice, econometricians choose a
different method for implementing GLS.
• Historically, it was computationally
difficult to program a new estimator
(with its own weights) for every
different dataset.
• It was easier to re-weight the data first,
and THEN apply the OLS estimator.
Generalized Least Squares (cont.)
• We want to transform the data so that
it is homoskedastic. Then we can
apply OLS.
• It is convenient to rewrite the variance
term of the heteroskedastic DGP as
$\mathrm{Var}(\varepsilon_i) = \sigma^2 d_i^2$
Generalized Least Squares (cont.)
• If we know the $d_i$ factor for each
observation, we can transform the data
by dividing through by $d_i$.
• Once we divide all variables by $d_i$, we
obtain a new dataset that meets the
Gauss–Markov conditions.
GLS: DGP for Transformed Data
$\dfrac{Y_i}{d_i} = \beta_0 \dfrac{1}{d_i} + \beta_1 \dfrac{X_i}{d_i} + \dfrac{\varepsilon_i}{d_i}$
$E\left( \dfrac{\varepsilon_i}{d_i} \right) = 0$
$\mathrm{Var}\left( \dfrac{\varepsilon_i}{d_i} \right) = \dfrac{1}{d_i^2} \mathrm{Var}(\varepsilon_i) = \dfrac{1}{d_i^2} \sigma^2 d_i^2 = \sigma^2$
$\mathrm{Cov}\left( \dfrac{\varepsilon_i}{d_i}, \dfrac{\varepsilon_j}{d_j} \right) = \dfrac{1}{d_i d_j} \mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$
$\dfrac{X_i}{d_i}$ fixed across samples.
Generalized Least Squares
• This procedure, Generalized Least
Squares, has two steps:
1. Divide all variables by $d_i$
2. Apply OLS to the transformed variables
• This procedure optimally weights down
observations with high $d_i$'s
• GLS is unbiased and efficient
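A sketch of the two-step procedure on simulated data with a known $d_i$ (here $d_i = X_i$ is an assumption made for illustration). Dividing through by $d_i$ and running OLS is numerically identical to weighted least squares with weights proportional to $1/d_i^2$, which is how statsmodels' WLS expects them:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 400
X = rng.uniform(1, 10, n)
d = X                                 # assumed: Var(eps_i) = sigma^2 d_i^2, d_i = X_i
eps = rng.normal(0, 0.5 * d)
y = 1.0 + 2.0 * X + eps
XX = sm.add_constant(X)

# Steps 1-2 by hand: divide every variable (constant included) by d_i,
# then run plain OLS on the transformed data.
gls_by_hand = sm.OLS(y / d, XX / d[:, None]).fit()

# Equivalent canned route: WLS with weights proportional to 1/d_i^2.
wls = sm.WLS(y, XX, weights=1.0 / d**2).fit()

print(gls_by_hand.params, wls.params)  # identical coefficient estimates
```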
Generalized Least Squares (cont.)
• Example: a straight line through the origin:
1. First, divide $Y_i, X_i$ by $d_i$
2. Apply OLS to $\dfrac{Y_i}{d_i}, \dfrac{X_i}{d_i}$:
$\hat\beta = \dfrac{\sum_i \dfrac{Y_i}{d_i} \dfrac{X_i}{d_i}}{\sum_j \left( \dfrac{X_j}{d_j} \right)^2}$
Generalized Least Squares (cont.)
• Note: we derive the same BLUE
Estimator (Generalized Least Squares)
whether we:
1. Find the optimal weights for
heteroskedastic data, or
2. Transform the data to be
homoskedastic, then use OLS weights
GLS: An Example (Chapter 10.5)
• We can correct for heteroskedasticity by
dividing our variables through by $d_i$.
• The DGP with the transformed data is
Gauss–Markov.
• The catch: we don't observe $d_i$.
How can we implement this strategy
in practice?
GLS: An Example (cont.)
• We want to estimate the relationship
$\mathrm{rent}_i = \beta_0 + \beta_1 \mathrm{income}_i + \varepsilon_i$
• We are concerned that higher-income
individuals are less constrained in how much
of their income they spend on rent. Lower-income
individuals cram into what housing they can
afford; higher-income individuals find housing
to suit their needs/tastes.
• That is, $\mathrm{Var}(\varepsilon_i)$ may vary with income.
GLS: An Example (cont.)
• An initial guess: $\mathrm{Var}(\varepsilon_i) = \sigma^2 \cdot \mathrm{income}_i^2$
• $d_i = \mathrm{income}_i$
• If we have modeled heteroskedasticity
correctly, then the BLUE estimator is:
$\dfrac{\mathrm{rent}_i}{\mathrm{income}_i} = \beta_0 \dfrac{1}{\mathrm{income}_i} + \beta_1 + v_i$
TABLE 10.1
Rent and Income in New York
TABLE 10.5 Estimating a Transformed
Rent–Income Relationship, $\mathrm{var}(\varepsilon_i) = \sigma^2 X_i^2$
Checking Understanding
• An initial guess: $\mathrm{Var}(\varepsilon_i) = \sigma^2 \cdot \mathrm{income}_i^2$
• $d_i = \mathrm{income}_i$
$\dfrac{\mathrm{rent}_i}{\mathrm{income}_i} = \beta_0 \dfrac{1}{\mathrm{income}_i} + \beta_1 + v_i$
• How can we test to see if we have
correctly modeled the heteroskedasticity?
Checking Understanding
• If we have the correct model of
heteroskedasticity, then OLS with the
transformed data should be homoskedastic.
$\dfrac{\mathrm{rent}_i}{\mathrm{income}_i} = \beta_0 \dfrac{1}{\mathrm{income}_i} + \beta_1 + v_i$
• We can apply either a White test or a
Breusch–Pagan test for heteroskedasticity to
the model with the transformed data.
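A sketch of that check on simulated stand-in data (Table 10.1's actual rent–income sample is not reproduced here): transform under the guess $d_i = \mathrm{income}_i$, then run a Breusch–Pagan test on the transformed equation's residuals.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(7)
n = 300
income = rng.uniform(10, 100, n)
rent = 2 + 0.3 * income + rng.normal(0, 0.1 * income)  # simulated stand-in

# Transform under the guess d_i = income_i:
# regress rent/income on a constant (which estimates beta1) and
# 1/income (which estimates beta0).
y_t = rent / income
X_t = sm.add_constant(1.0 / income)
resid = sm.OLS(y_t, X_t).fit().resid

# If the model of the heteroskedasticity is right, the transformed
# errors should look homoskedastic; test that with Breusch-Pagan.
lm_stat, lm_pvalue, _, _ = het_breuschpagan(resid, X_t)
print(lm_stat, lm_pvalue)
```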
Checking Understanding (cont.)
• To run the White test, we regress
$e_i^2 = \gamma_0 + \gamma_1 \dfrac{1}{\mathrm{income}_i} + \gamma_2 \dfrac{1}{\mathrm{income}_i^2} + u_i$
• $nR^2 = 7.17$
• The critical value at the 0.05 significance level
for a Chi-squared statistic with 2 degrees of
freedom is 5.99
• We reject the null hypothesis.
GLS: An Example
• Our initial guess: $\mathrm{Var}(\varepsilon_i) = \sigma^2 \cdot \mathrm{income}_i^2$
• This guess didn't do very well. Can we
do better?
• Instead of blindly guessing, let's try
looking at the data first.
Figure 10.4 The Rent–Income Ratio
Plotted Against the Inverse of Income
GLS: An Example
• We seem to have overcorrected
for heteroskedasticity.
• Let's try $\mathrm{Var}(\varepsilon_i) = \sigma^2 \cdot \mathrm{income}_i$
$\dfrac{\mathrm{rent}_i}{\sqrt{\mathrm{income}_i}} = \beta_0 \dfrac{1}{\sqrt{\mathrm{income}_i}} + \beta_1 \sqrt{\mathrm{income}_i} + v_i$
TABLE 10.6 Estimating a Second Transformed
Rent–Income Relationship, $\mathrm{var}(\varepsilon_i) = \sigma^2 X_i$
GLS: An Example
• Unthinking application of the White test
procedure to the transformed data leads to
$e_i^2 = \gamma_0 + \gamma_1 \dfrac{1}{\sqrt{\mathrm{income}_i}} + \gamma_2 \dfrac{1}{\mathrm{income}_i} + \gamma_3 \sqrt{\mathrm{income}_i} + \gamma_4 \mathrm{income}_i + \gamma_5 \dfrac{1}{\sqrt{\mathrm{income}_i}} \sqrt{\mathrm{income}_i} + u_i$
• The interaction term reduces to a constant,
which we already have in the auxiliary
equation, so we omit it and use only the first
4 explanators.
GLS: An Example (cont.)
• $nR^2 = 6.16$
• The critical value at the 0.05 significance level
for a Chi-squared statistic with 4 degrees of
freedom is 9.49
• We fail to reject the null hypothesis that the
transformed data are homoskedastic.
• Warning: failing to reject a null hypothesis
does NOT mean we can “accept” it.
GLS: An Example (cont.)
• Generalized Least Squares is not trivial
to apply in practice.
• Figuring out a reasonable $d_i$ can be
quite difficult.
• Next time we will learn another
approach to constructing $d_i$: Feasible
Generalized Least Squares.
Review
• In this lecture, we began relaxing the Gauss–
Markov assumptions, starting with the
assumption of homoskedasticity.
• Under heteroskedasticity, $\mathrm{Var}(\varepsilon_i) = \sigma^2 d_i^2$
– OLS is still unbiased
– OLS is no longer efficient
– OLS e.s.e.’s are incorrect, so C.I., t-, and
F- statistics are incorrect
Review (cont.)
• Under heteroskedasticity,
$\mathrm{Var}(\hat\beta) = \sigma^2 \sum w_i^2 d_i^2$
• For a straight line through the origin,
$\mathrm{Var}(\hat\beta^{OLS}) = \dfrac{\sigma^2 \sum X_i^2 d_i^2}{\left( \sum X_i^2 \right)^2}$
Review (cont.)
• We can use squared residuals to test for
heteroskedasticity.
• In the White test, we regress the
squared residuals against all
explanators, squares of explanators,
and interactions of explanators. The
$nR^2$ of the auxiliary equation is
distributed Chi-squared.
Review (cont.)
• The Breusch–Pagan test is similar, but
the econometrician chooses the
explanators for the auxiliary equation.
Review (cont.)
• In the Goldfeld–Quandt test, we first
divide the data into distinct groups, and
conduct our OLS regression on each
group separately.
• We then estimate $s^2$ for each group.
• The ratio of two $s^2$ estimates is
distributed as an F-statistic.
Review (cont.)
• Under heteroskedasticity, the BLUE Estimator
is Generalized Least Squares
• To implement GLS:
1. Divide all variables by $d_i$
2. Perform OLS on the transformed variables
• If we have used the correct $d_i$, the
transformed data are homoskedastic.
We can test this property.