Serial Autocorrelation:
Why it’s there, what it does, and how to get rid of it.
By Ronald U. Mendoza
Fordham Department of Economics
What is serial autocorrelation?
When using time series data in regressions, one must always check that the assumptions of the
classical linear regression model are satisfied. One of these assumptions is that the disturbance term for
any observation is not influenced by the disturbance term for any other observation. Put simply, the error
terms of the Ordinary Least Squares (OLS) regression must be distributed independently of one another,
so the covariance between any pair of error (residual) terms must be zero. If this covariance is non-zero,
the residuals are said to be autocorrelated, and a relationship between present and past values can be
observed. Serial autocorrelation therefore refers to the existence of a linear relationship between the
current and lagged residuals of the regression.
In a typical regression of Y on X, as in equation 1, the presence of first-order serial
autocorrelation in the residuals is expressed by equation 2. Notice that because v is not independently
distributed, a necessary assumption of the OLS procedure has been violated.
Y_t = \alpha + \beta X_t + v_t    (1)

v_t = \rho v_{t-1} + \varepsilon_t    (2)

where ε_t is a white-noise (independently and identically distributed) error term and ρ is the first-order
autocorrelation coefficient.
What causes serial autocorrelation?
There are many causes of serial autocorrelation in regressions involving time series data. The
most significant is that most time series have momentum built into them. For instance, GNP data can show
high interdependence between successive values because total national output tends to follow the so-called
business cycle. As a result, a regression involving GNP can produce error terms that are also highly
interdependent. Other examples of time series that exhibit high interdependence include the Consumer
Price Index (CPI), production, employment, unemployment, exports, and imports. [1]
[1] If you are interested in an intuitive explanation of other causes of serial autocorrelation, Gujarati's
Basic Econometrics, 3rd Edition (1995) is a good reference.
What does serial autocorrelation do to the OLS regression?
The presence of autocorrelation in the error terms of an OLS regression still yields unbiased
coefficient estimates. However, these estimates are not the most efficient in the class of all linear unbiased
estimators; in other words, another unbiased and more efficient estimator can be found.

This inefficiency manifests itself in the t-statistics computed for the coefficients. Because the
t-statistic is simply the coefficient estimate divided by the square root of the estimated variance of that
estimate, and because serial autocorrelation distorts this variance estimate (with positive autocorrelation
the usual OLS formula typically understates it in time-series applications), the resulting t-statistics are
unreliable. Consequently, reliable inferences cannot be made from the regression results.
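As a concrete illustration (not part of the original note, and using hypothetical names sim, x, v, and y),
the following minimal SAS sketch simulates a regression with a persistent regressor and AR(1) errors with
ρ = 0.8. The naive PROC REG fit reports t-statistics based on the usual OLS variance formula, which is
unreliable here, while PROC AUTOREG provides AR(1)-corrected estimates for comparison.

data sim;
   call streaminit(12345);               /* fix the random-number seed */
   x = 0; v = 0;
   do t = 1 to 100;
      x = 0.8*x + rand('normal');        /* persistent regressor */
      v = 0.8*v + rand('normal');        /* AR(1) disturbance with rho = 0.8 */
      y = 1 + 0.5*x + v;                 /* true alpha = 1, beta = 0.5 */
      output;
   end;
run;

proc reg data=sim;
   model y = x / dw;                     /* naive OLS; the DW statistic flags the autocorrelation */
run;

proc autoreg data=sim;
   model y = x / nlag=1;                 /* AR(1)-corrected estimates and standard errors */
run;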
How does one test for serial autocorrelation?
There is a large body of literature on tests for serial autocorrelation. However, the standard test
included in most software packages is the Durbin-Watson test. It is defined simply as:
d = \frac{\sum_{t=2}^{n} (v_t - v_{t-1})^2}{\sum_{t=1}^{n} v_t^2}
where v_t is the estimated residual from the OLS regression and n is the sample size. Once d has been
calculated from the residuals of a typical OLS regression (this can be done easily in Excel; a SAS sketch is
given after the decision-rule table below), one can use the following rules to test for the presence of serial
autocorrelation:
Durbin-Watson d test: Decision Rules

Null Hypothesis                               Condition*                             Decision
No positive autocorrelation                   0 < d < d(lower)                       Reject null (autocorrelation!)
No positive autocorrelation                   d(lower) <= d <= d(upper)              No decision
No negative autocorrelation                   4 - d(lower) < d < 4                   Reject null (autocorrelation!)
No negative autocorrelation                   4 - d(upper) <= d <= 4 - d(lower)      No decision
No autocorrelation (positive or negative)     d(upper) < d < 4 - d(upper)            Do not reject null

*The upper-bound critical value is d(upper) and the lower-bound critical value is d(lower). Also, "<=" is
read as "less than or equal to."
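As a sketch of the d calculation in SAS (using the same data set and variable names, temp, lnrx, lnrr, and
lngdp, as the export demand program at the end of this note), the residuals can be saved from PROC REG and
d computed directly from the formula above:

proc reg data=temp;
   model lnrx = lnrr lngdp;
   output out=resids residual=vhat;      /* save the estimated residuals v(t) */
run;

data dwparts;
   set resids;
   vlag = lag(vhat);                     /* v(t-1) */
   if vlag ne . then num = (vhat - vlag)**2;   /* (v(t) - v(t-1))^2 for t = 2, ..., n */
   den = vhat**2;                        /* v(t)^2 for t = 1, ..., n */
run;

proc means data=dwparts sum noprint;
   var num den;
   output out=dwsums sum=sumnum sumden;  /* sums of the numerator and denominator terms */
run;

data _null_;
   set dwsums;
   d = sumnum / sumden;                  /* the Durbin-Watson d statistic */
   put 'Durbin-Watson d = ' d;
run;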
Attached are the Durbin-Watson critical values for models with up to five regressors, taken from the
appendix of Greene's Econometric Analysis, 3rd Edition (1997). Note that the appropriate sample size must
be used to identify the relevant upper- and lower-bound critical values of the DW statistic.
How does one correct for serial autocorrelation?
Most statistical and econometric software packages now enable the user to correct for serial
autocorrelation automatically. One such package, SAS, allows for the correction using a simple two-step
procedure. First, an estimate of ρ in equation 2 is made using a maximum likelihood procedure. The
intuition behind this first step is that ρ is estimated so that the resulting error terms are independently
distributed. The second step incorporates this estimate of ρ into the estimation of equation 1. The objective
of this augmentation is for the modified equation 1 to exhibit independently and identically distributed
residuals.
The derivation of the corrected model is relatively straightforward. First, we take the one-period lag
of equation 1, shown below as equation 3.
Y_{t-1} = \alpha + \beta X_{t-1} + v_{t-1}    (3)
Both sides are then multiplied by ρ to obtain equation 4.
\rho Y_{t-1} = \alpha\rho + \beta\rho X_{t-1} + \rho v_{t-1}    (4)
Equation 4 is then subtracted from equation 1 to arrive at a white-noise, independently and identically
distributed residual ε_t.

Equation 5 represents equation 1 corrected for first-order serial autocorrelation. Note that by renaming
terms, equation 5 can be expressed as equation 6, which provides best linear unbiased estimates (BLUE)
when estimated by OLS.
Y_t - \rho Y_{t-1} = \alpha(1 - \rho) + \beta(X_t - \rho X_{t-1}) + \varepsilon_t    (5)

Y_t^* = \alpha^* + \beta^* X_t^* + \varepsilon_t    (6)
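For reference, the renaming implied by comparing equations 5 and 6 is

Y_t^* = Y_t - \rho Y_{t-1}, \quad X_t^* = X_t - \rho X_{t-1}, \quad \alpha^* = \alpha(1 - \rho), \quad \beta^* = \beta

so the slope coefficient is unchanged, while the original intercept can be recovered as α = α*/(1 − ρ).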
The complete SAS program for a typical export demand function [2] in Dr. Schwalbenberg's
International Economic Policy class is written below:
SAS PROGRAM (with the explanations written as comments):

data temp;                                   /* Gives the entire data set a name. */
infile 'a:\data.prn';                        /* Reads the raw data file, which is saved on a diskette;
                                                note that it was saved from Excel as a space-delimited
                                                file. */
input lnrx lnrr lngdp;                       /* Gives each column of data a name. Be sure to remember
                                                which one is which in Excel! */
proc reg;                                    /* Calls for the regression procedure. */
model lnrx=lnrr lngdp;                       /* Models the regression. At this point SAS gives you
                                                results which are possibly autocorrelated; check the
                                                Durbin-Watson statistic to test for the presence of
                                                serial autocorrelation. */
proc autoreg data=temp;                      /* Calls for the autoregressive correction procedure. */
model lnrx=lnrr lngdp /nlag=1 dw=1 dwprob;   /* Models the regression with the information that the
                                                residuals have first-order autocorrelation. [3] The
                                                results include both the uncorrected and the corrected
                                                versions of the regression. The DW statistic of the
                                                corrected regression, as well as its probability, is
                                                also shown. */
run;
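One caveat (not from the original note): the two-step description above refers to a maximum likelihood
estimate of ρ, but by default PROC AUTOREG estimates the autoregressive parameters by the Yule-Walker
method. If maximum likelihood estimation is wanted explicitly, the METHOD= option can be added to the
MODEL statement, for example:

model lnrx=lnrr lngdp /nlag=1 method=ml dw=1 dwprob;

Check the SAS/ETS documentation for the options available in your version.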
References:
Greene, William H. Econometric Analysis, 3rd Edition. Prentice Hall, Upper Saddle River, NJ, 1997.
Gujarati, Damodar. Basic Econometrics, 3rd Edition. McGraw-Hill, New York, 1995.
SAS/ETS Software Applications Guide 2: Econometric Modeling, Simulation, and Forecasting, Version 6,
1st Edition. SAS Institute, Cary, NC, 1993.
[2] The export demand function is the regression of the country's real exports (in log form) on the log
transform of the real exchange rate and the log transform of the real GDP of the country's trading partner.
[3] Note that first-order serial autocorrelation refers to a relationship between the present value and only
the first lagged value. The possibility of higher-order serial autocorrelation, though less common, must
still be considered. The DW statistic of the corrected regression should no longer indicate autocorrelation;
if it does, a further correction (e.g., a higher-order AR term via a larger NLAG= value) is necessary.