Lecture 17: Serial correlation
BUEC 333
Professor David Jacks
Three assumptions necessary for unbiasedness:
1.) correct specification;
2.) zero mean error term;
3.) exogeneity of independent variables.
Three assumptions necessary for efficiency:
4.) no perfect collinearity;
5.) no serial correlation;
6.) no heteroskedasticity.
Violatin’ the classical assumptions
Since serial correlation (SC) violates 5.), which
implies that OLS is not BLUE, we want to know:
1.) What is the nature of the problem?
2.) What are the consequences of the problem?
3.) How is the problem diagnosed?
4.) What remedies for the problem are available?
We now consider these in turn…
Violatin’ the classical assumptions
SC occurs when an observation’s error term (εi) is
correlated with another observation’s error term
(εj), or Cov(εi, εj) ≠ 0.
Usually happens because there is an important
relationship—economic or otherwise—between
observations which we are failing to control for.
Serial correlation
SC can also arise from cluster sampling where
observations are of the same variables on
systematically related subjects.
Example: firms operating in the same market;
consumption with sample data from families with
one observation for each family member.
Serial correlation
There are two basic types of serial correlation:
pure and impure.
Pure serial correlation arises if the model is
correctly specified but the errors are serially
correlated (that is, all other assumptions hold).
Example: the DGP is Yt = β0 + β1X1t + εt where
εt = ρεt-1 + ut and ut is a "classical" error term.
Pure serial correlation
Note: here, we use the subscript t (for time,
instead of i) to denote the observation number;
this is standard for models of time series data
where SC arises most frequently.
Further note: this kind of serial correlation is also
called first-order autocorrelation or first-order
autoregression—or AR(1) for short—and ρ is
called the autocorrelation coefficient.
Pure serial correlation
First-order autocorrelation: εt = ρεt-1 + ut.
Requires –1 < ρ < 1. Why? If |ρ| ≥ 1, the error
process explodes (or never dies out) over time;
and if ρ = 0, there is no serial correlation at all.
ρ < 0 is an example of negative serial correlation
where εt and εt-1 tend to have opposite signs.
Pure serial correlation
ρ > 0 is an example of positive serial correlation
where εt and εt-1 tend to have the same sign.
This case is fairly easy to interpret and very
common in economic data, especially time series.
For time series, macroeconomic shocks take time
to work their way fully through the economy.
Pure serial correlation
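These sign patterns are easy to verify by simulation. A minimal numpy sketch (the sample size and ρ values are purely illustrative):

```python
import numpy as np

def simulate_ar1_errors(rho, T=200, seed=0):
    """Simulate eps_t = rho * eps_{t-1} + u_t with a classical error u_t."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T)
    eps = np.zeros(T)
    for t in range(1, T):
        eps[t] = rho * eps[t - 1] + u[t]
    return eps

def sign_agreement(eps):
    """Fraction of adjacent periods in which eps_t and eps_t-1 share a sign."""
    return (np.sign(eps[1:]) == np.sign(eps[:-1])).mean()

pos = sign_agreement(simulate_ar1_errors(0.9))    # positive SC: signs persist
neg = sign_agreement(simulate_ar1_errors(-0.9))   # negative SC: signs alternate
none = sign_agreement(simulate_ar1_errors(0.0))   # no SC: roughly half and half
```

With ρ = 0.9 the agreement fraction is well above one half; with ρ = –0.9 it is well below.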
Example: modeling the price of oranges.
They can only be grown in warm climates but are
consumed almost everywhere.
Dispersion of production and consumption means
they have to be transported by container ships,
trains, and trucks before being sold to consumers.
An example of pure serial correlation
An unexpected shock to the supply of oil leads to an
increase in the price of oil that lasts several months.
A positive shock to oil prices is likely to filter into a
series of positive shocks to the price of oranges.
Here, modeling prices of oranges at daily
frequency virtually guarantees that SC will be a problem.
An example of pure serial correlation
We can also have autocorrelation at higher orders:
1.) εt = ρ1εt-1 + ρ2εt-2 + ut (second-order)
2.) εt = ρ1εt-1 + ρ2εt-2 + ρ3εt-3 + ut (third-order)
3.) …
Autocorrelation can also arise in non-adjacent periods;
e.g., with quarterly data on real estate prices, we
might have εt = ρεt-4 + ut.
Further cases of pure serial correlation
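The quarterly case εt = ρεt-4 + ut can be simulated in the same way; a minimal numpy sketch (ρ = 0.8 and the sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
T, rho = 2000, 0.8
u = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(4, T):
    eps[t] = rho * eps[t - 4] + u[t]   # shocks echo four quarters later

def acorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = x - x.mean()
    return (x[lag:] * x[:-lag]).sum() / (x * x).sum()

r1, r4 = acorr(eps, 1), acorr(eps, 4)  # lag-4 correlation large, lag-1 near zero
```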
Impure serial correlation arises if the model is
mis-specified due to an omitted variable and the
specification error induces SC.
For example, suppose the DGP is
Yt = β0 + β1X1t + β2X2t + εt
Instead, we estimate Yt = β0 + β1X1t + εt* where:
1.) εt* = β2X2t + εt
2.) X2t = X2t-1 + ut
3.) εt and ut are classical error terms.
Impure serial correlation via omitted variables
Because of the mis-specification error, the error term is:
εt* = β2X2t + εt
    = β2(X2t-1 + ut) + εt
    = β2X2t-1 + β2ut + εt
    = (εt-1* − εt-1) + β2ut + εt
The error term of observation t is, therefore,
correlated with the error term of observation t-1.
Impure serial correlation via omitted variables
Even though the "true" errors satisfy assumptions
of CLRM, leaving out X2 induces SC in the error
term because X2 is serially correlated.
This omitted variables problem does not cause
bias if and only if the omitted variable is uncorrelated
with the included independent variables.
Impure serial correlation via omitted variables
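The mechanism above can be reproduced in a few lines. A minimal numpy sketch, assuming (purely for illustration) that the omitted X2 follows a random walk while X1 and the true error are classical:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500
x1 = rng.standard_normal(T)
x2 = np.cumsum(rng.standard_normal(T))   # random walk: X2t = X2t-1 + ut
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.standard_normal(T)  # true DGP includes X2

# Mis-specified model: regress Y on X1 alone, omitting X2
X = np.column_stack([np.ones(T), x1])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

# The residuals inherit X2's serial correlation
ec = e - e.mean()
r1 = (ec[1:] * ec[:-1]).sum() / (ec * ec).sum()   # strongly positive
```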
Imagine your first job is to model consumer
demand for LCD TVs for Samsung.
You know from ECON 201 that relative prices
matter, specifically the price of Samsung TVs and
that of their competitors or PSAM/PSONY.
Being hungover, you forget another lesson from
ECON 201:
Impure serial correlation via omitted variables
Impure serial correlation also arises if the model
is mis-specified due to incorrect functional form.
For example, suppose the DGP is
Yt = β0 + β1X1t + β2X1t² + εt
Instead, we estimate Yt = β0 + β1X1t + εt* where
εt* = β2X1t² + εt.
Impure serial correlation via functional form
Use of the incorrect functional form in this case
tends to group positive and negative residuals.
Impure serial correlation via functional form
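A short simulation shows this grouping of residual signs. A minimal numpy sketch, fitting a straight line when the DGP is quadratic (all coefficient values illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 200
x = np.linspace(0.0, 10.0, T)                             # smoothly evolving regressor
y = 1.0 + 0.5 * x + 0.5 * x**2 + rng.standard_normal(T)   # true DGP is quadratic

# Mis-specified model: fit a straight line
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

# Neighboring residuals share signs: positive at the ends, negative in the middle
same_sign = (np.sign(e[1:]) == np.sign(e[:-1])).mean()
```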
Both forms of SC violate Assumption 5 of the
CLRM, and hence OLS is not BLUE.
What more can we say?
1.) OLS estimates remain unbiased…but only if
the problem is with pure SC.
Consequences of serial correlation
Suppose we have the simple linear regression
Yi = β0 + β1Xi + εi with mean Ȳ = β0 + β1X̄ + ε̄.
Then
β̂1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
   = β1 + Σ(Xi − X̄)(εi − ε̄) / Σ(Xi − X̄)²
⇒ E(β̂1) = β1 since Cov(Xi, εi) = 0.
Consequences of serial correlation
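The unbiasedness claim is easy to check by Monte Carlo. A minimal numpy sketch with pure AR(1) errors and an exogenous regressor (all parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
T, reps, rho, beta1 = 100, 2000, 0.8, 2.0
x = rng.standard_normal(T)                  # fixed, exogenous regressor

estimates = np.empty(reps)
for r in range(reps):
    u = rng.standard_normal(T)
    eps = np.zeros(T)
    for t in range(1, T):
        eps[t] = rho * eps[t - 1] + u[t]    # pure AR(1) serial correlation
    y = 1.0 + beta1 * x + eps
    xc, yc = x - x.mean(), y - y.mean()
    estimates[r] = (xc * yc).sum() / (xc * xc).sum()   # OLS slope

bias = estimates.mean() - beta1             # close to zero: OLS still unbiased
```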
OLS estimates, however, will be biased if the
problem is with impure SC brought about by
correlated omitted variables.
Impure SC represents violation of Assumption 1.
In this case, the SC problem is of secondary
importance next to the bias potentially induced by
specification error.
Consequences of serial correlation
2.) Even if unbiased, OLS is no longer best;
that is, no longer exhibits minimum variance.
SC implies that errors are partly predictable: with
positive SC, a positive error today implies
tomorrow’s error is likely to be positive as well.
But OLS ignores this autocorrelation.
Consequences of serial correlation
3.) The formulas derived for the standard errors of
OLS estimates are now incorrect.
These formulas all assume that errors are not
serially correlated.
Relaxing this assumption changes the formulas;
computers can be programmed to handle this.
Consequences of serial correlation
In this case, the "true" standard errors will
typically be larger than those which OLS reports
(when it assumes there is no SC)…this implies that
OLS' standard errors are biased.
And since this standard error is typically larger
than what OLS says, the "true" t-statistic will
typically be smaller than the one OLS reports.
Consequences of serial correlation
There are a number of formal tests available.
However, the simplest way forward and another
good habit to get into is simply looking at a plot of
the residuals from a regression model as before.
If any red flags are set off by this exercise in
ocular econometrics, proceed to formal tests, being
very mindful of the potential problem.
Testing for serial correlation
Most common test is the Durbin-Watson (DW)
Test (sometimes referred to as the d-test).
Some caveats to be aware of:
1.) the model needs to have an intercept term;
2.) the model cannot include a lagged dependent
variable (that is, Yt-1 cannot be one of the
independent variables).
Testing for serial correlation
If we write the error term as εt = ρεt-1 + ut, the
DW test has the following null and alternative hypotheses:
H0 : ρ ≤ 0 (no positive autocorrelation)
H1 : ρ > 0 (positive autocorrelation)
This test is so common that almost every software
package automatically calculates the value of the
DW statistic whenever you estimate a regression.
Testing for serial correlation
Test statistic is based on residuals from OLS,
{e1, e2, …, eT} where T is the sample size:
e  e 
 e
t 2
t 1
t 1 t
One way to think about d when (+) SC is present:
the numerator will tend to be small samples that
The Durbin-Watson test
e  e 
 e
t 2
t 1
t 1 t
where et   et 1  ut
Now, consider these extremes:
1.) if ρ = 1, then et − et-1 = ut, so d ≈ 0
2.) if ρ = −1, then et − et-1 ≈ −2et-1, so d ≈ 4
3.) if ρ = 0, then et and et-1 are uncorrelated, so d ≈ 2
The Durbin-Watson test
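These extremes can be confirmed numerically. A minimal numpy sketch of the d statistic (sample size and ρ values illustrative):

```python
import numpy as np

def durbin_watson(e):
    """d = sum_{t=2..T} (e_t - e_{t-1})^2 / sum_{t=1..T} e_t^2."""
    return (np.diff(e) ** 2).sum() / (e ** 2).sum()

rng = np.random.default_rng(5)

def ar1(rho, T=5000):
    u = rng.standard_normal(T)
    e = np.zeros(T)
    for t in range(1, T):
        e[t] = rho * e[t - 1] + u[t]
    return e

d_pos = durbin_watson(ar1(0.95))    # strong positive SC: d near 0
d_neg = durbin_watson(ar1(-0.95))   # strong negative SC: d near 4
d_none = durbin_watson(ar1(0.0))    # no SC: d near 2
```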
Hence, values of the test statistic "far" from 2
indicate that serial correlation is likely present.
Unfortunately, distribution theory for d is wonky
…for some values of d, the test is inconclusive.
For a given significance level, there are
consequently two critical values, 0 < dL < dU < 2.
The Durbin-Watson test
For a one-sided test,
H0 : ρ ≤ 0 (no positive autocorrelation)
H1 : ρ > 0 (positive autocorrelation)
a.) Reject H0 if d < dL.
b.) Do not reject H0 if d > dU.
c.) The test is inconclusive if dL ≤ d ≤ dU.
Decision rules for the DW test
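The decision rule translates directly into code. A minimal sketch; the critical values in the comment are illustrative of the magnitudes found in standard DW tables, not exact table entries:

```python
def dw_decision(d, d_L, d_U):
    """One-sided DW test for positive autocorrelation (0 < d_L < d_U < 2)."""
    if d < d_L:
        return "reject H0"          # evidence of positive serial correlation
    if d > d_U:
        return "do not reject H0"
    return "inconclusive"           # d_L <= d <= d_U

# With, say, d_L = 1.50 and d_U = 1.59 (illustrative 5% values):
# d = 0.80 -> reject H0; d = 1.90 -> do not reject H0; d = 1.55 -> inconclusive
```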
If evidence of pure serial correlation—whether
through a formal test or just by looking at residual
plots—you have several options available to you:
1.) Use OLS and "fix" the standard errors.
We know OLS is unbiased if SC is pure…
but the usual formulas for the standard errors are
wrong (and hence our tests can be misleading).
Remedies for serially correlated errors
This is the approach followed with Newey-West
standard errors which provide consistent
estimates of the standard errors.
What consistency means: estimators get arbitrarily
close to their true value (in a probabilistic sense)
when the sample size goes to infinity.
In Stata, use the "newey" command (as in "newey
salary points, lag(4)"); note that the "robust" option
on "reg" corrects only for heteroskedasticity, not
for serial correlation.
Remedies for serially correlated errors
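The Newey-West correction itself is short enough to sketch by hand. A minimal numpy implementation with Bartlett weights (the lag length and the DGP in the demo are illustrative assumptions, not recommendations):

```python
import numpy as np

def newey_west_se(X, e, L):
    """HAC (Newey-West) standard errors with Bartlett weights and L lags.

    V = (X'X)^-1 Omega (X'X)^-1, where Omega accumulates outer products
    of x_t * e_t, down-weighting covariances at higher lags.
    """
    xe = X * e[:, None]                  # row t is x_t * e_t
    omega = xe.T @ xe                    # lag-0 term
    for l in range(1, L + 1):
        w = 1.0 - l / (L + 1.0)          # Bartlett kernel weight
        gamma = xe[l:].T @ xe[:-l]
        omega += w * (gamma + gamma.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    return np.sqrt(np.diag(XtX_inv @ omega @ XtX_inv))

# Demo: persistent regressor plus AR(1) errors inflate the true sampling variance
rng = np.random.default_rng(6)
T = 400
u, v = rng.standard_normal(T), rng.standard_normal(T)
x, eps = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = 0.8 * x[t - 1] + u[t]
    eps[t] = 0.8 * eps[t - 1] + v[t]
y = 1.0 + 2.0 * x + eps
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
ols_se = np.sqrt((e @ e) / (T - 2) * np.diag(np.linalg.inv(X.T @ X)))
hac_se = newey_west_se(X, e, L=8)        # typically larger than ols_se here
```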
2.) Other times, you may want to try a more
efficient estimator.
OLS is not BLUE in this case, but what is?
The BLUE is now a generalization of OLS called
Generalized Least Squares (GLS).
Suppose we want to estimate the regression:
Yt = β0 + β1Xt + εt
Remedies for serially correlated errors
Then, since εt = ρεt-1 + ut, we could write the model as:
1.) Yt = β0 + β1Xt + ρεt-1 + ut
Multiply by ρ and lag this by one period:
2.) ρYt-1 = ρβ0 + ρβ1Xt-1 + ρ²εt-2 + ρut-1
Subtracting 2.) from 1.) and using εt-1 = ρεt-2 + ut-1,
3.) Yt − ρYt-1 = β0(1 − ρ) + β1(Xt − ρXt-1) + ut
Remedies for serially correlated errors
3.) Yt − ρYt-1 = β0(1 − ρ) + β1(Xt − ρXt-1) + ut
4.) Yt* = β0* + β1Xt* + ut
where Yt* = Yt − ρYt-1, β0* = β0(1 − ρ), and
Xt* = Xt − ρXt-1.
4.) is a Generalized Least Squares (GLS) equation.
Remedies for serially correlated errors
Note that:
1.) The error term is now not serially correlated;
OLS estimation of 4.) will be minimum variance if
we know ρ or can accurately estimate it.
2.) The slope coefficient β1 is the same as the slope
coefficient of the original serially correlated
equation 1.) above.
Remedies for serially correlated errors
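When ρ must be estimated, the quasi-differencing steps above become a short feasible-GLS (Cochrane-Orcutt style) routine. A minimal numpy sketch with illustrative true values β0 = 3, β1 = 2, ρ = 0.7:

```python
import numpy as np

rng = np.random.default_rng(7)
T, rho = 400, 0.7
x = rng.standard_normal(T)
u = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]
y = 3.0 + 2.0 * x + eps

# Step 1: OLS, then estimate rho from the lag-1 relation of the residuals
X = np.column_stack([np.ones(T), x])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b_ols
rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])

# Step 2: quasi-difference and re-run OLS on the transformed model
y_star = y[1:] - rho_hat * y[:-1]
x_star = x[1:] - rho_hat * x[:-1]
X_star = np.column_stack([np.ones(T - 1), x_star])
b_gls, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
beta1_hat = b_gls[1]                       # slope is unchanged by the transform
beta0_hat = b_gls[0] / (1.0 - rho_hat)     # intercept is beta0 * (1 - rho)
```

Note that the first observation is dropped by the quasi-differencing; some textbook variants (Prais-Winsten) rescale it instead of discarding it.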
3.) The dependent variable has changed compared
to that in original equation; this means that GLS is
not directly comparable to OLS with respect to R2.
4.) GLS is a method of simultaneously estimating
β0 , β1, and ρ (while being the BLUE of β0 and β1);
different ways of calculating the GLS estimator are
discussed in the text and are fairly involved.
Remedies for serially correlated errors
SC is a very common problem in econometrics.
At best, SC presents problems related to the
efficiency of OLS estimators.
At worst, SC presents problems related to both the
bias and efficiency of OLS estimators.