Supplement 13B: Durbin-Watson Test for Autocorrelation

Modeling Autocorrelation
Because autocorrelation is primarily a phenomenon of time series data, it is convenient to
represent the linear regression model using t as a subscript to represent time:
(13B.1)  $Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \dots + \beta_k X_{kt} + \varepsilon_t$
where t = 1, 2, …, n
We assume that there are observations covering n periods of time. Autocorrelation (also called
serial correlation) exists when the error terms $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n$ are not independent of one another.
There are many ways we might envision non-independence among the errors. The first-order
autoregressive model (sometimes called the AR1 model) is a common way of thinking about
correlated errors:
(13B.2)  $\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t$
where –1 ≤ ρ ≤ +1
Here ρ is the autocorrelation parameter and $u_t$ is a well-behaved (i.e., normally distributed,
homoscedastic, non-autocorrelated) random disturbance with mean zero and constant variance
σ². As you can see from equation (13B.2), if ρ = 0, then $\varepsilon_t$ is also well-behaved because $\varepsilon_t = u_t$.
In other words, if ρ = 0 there is no “carry-over” from period t–1 to period t, so $\varepsilon_t$ is
unaffected by $\varepsilon_{t-1}$.
However, if ρ is not zero, then error $\varepsilon_t$ is affected by error $\varepsilon_{t-1}$. In fact, it is fairly easy to show
that $\varepsilon_t$ is affected by all of the prior errors. Because the same relationship holds between every
error and its predecessor, we can substitute $\varepsilon_{t-1} = \rho\,\varepsilon_{t-2} + u_{t-1}$ into equation (13B.2) to obtain
$\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t = \rho(\rho\,\varepsilon_{t-2} + u_{t-1}) + u_t = \rho^2 \varepsilon_{t-2} + \rho\,u_{t-1} + u_t$
Continuing with substitutions in this fashion, we can show that
$\varepsilon_t = u_t + \rho\,u_{t-1} + \rho^2 u_{t-2} + \dots + \rho^t u_0$
This says that the error in period t is affected by all the prior random errors. Only if ρ = 0 will
autocorrelation vanish so that $\varepsilon_t = u_t$. Serial correlation is a violation of the assumption of
independence, creating problems with the t-tests and confidence intervals that are reported in
regression software.
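The substitution argument can be checked numerically. The following is a minimal Python sketch, not part of the original supplement: the function names, the value ρ = 0.7, the seed, and the starting condition $\varepsilon_0 = u_0$ are illustrative assumptions. It builds the errors recursively from equation (13B.2) and compares them with the expanded sum of past disturbances.

```python
import random


def simulate_ar1(rho, n, seed=0):
    """Build n errors recursively: eps_t = rho * eps_{t-1} + u_t, with eps_0 = u_0 (assumed)."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]  # well-behaved disturbances u_t
    eps = [u[0]]
    for t in range(1, n):
        eps.append(rho * eps[t - 1] + u[t])
    return eps, u


def expanded_error(rho, u, t):
    """Expanded form: eps_t = u_t + rho*u_{t-1} + rho^2*u_{t-2} + ... + rho^t*u_0."""
    return sum(rho ** j * u[t - j] for j in range(t + 1))


rho = 0.7  # hypothetical autocorrelation parameter
eps, u = simulate_ar1(rho, n=50)
largest_gap = max(abs(eps[t] - expanded_error(rho, u, t)) for t in range(50))
print("largest gap between recursive and expanded forms:", largest_gap)
```

Setting rho to 0 in the same sketch returns errors identical to the disturbances, which is the no-carry-over case described above.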
More specifically, if ρ > 0 (a common occurrence in time series data) the reported MSE tends to
underestimate the variance of the errors. The ANOVA test statistic for overall significance may
thus be inflated, as well as the t-test statistics for individual predictor significance. While the
least-squares estimates remain unbiased, they are no longer minimum variance unbiased
estimators (see MVUE in Chapter 8). Therefore, a test for autocorrelation is desirable when
autocorrelation is suspected.
Durbin-Watson Test
Because the true errors $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n$ are unobservable, we use the regression residuals $e_1, e_2, \dots, e_n$
in our test for autocorrelation. The Durbin-Watson statistic is calculated as
(13.xx)  $DW = \dfrac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
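As an illustration of the formula, here is a minimal Python sketch that computes DW from a list of residuals in time order. The function name and the two toy residual series are assumptions made for this example; they are not data from the supplement.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Toy series (hypothetical): alternating signs push DW above 2,
# while runs of the same sign pull it below 2.
print(durbin_watson([1, -1, 1, -1]))  # 12 / 4 = 3.0
print(durbin_watson([1, 1, -1, -1]))  # 4 / 4 = 1.0
```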
The range of DW is 0 ≤ DW ≤ 4. An approximate relationship exists between DW and ρ:
DW ≈ 2(1–ρ)
Thus:
ρ = +1    DW ≈ 2(1–1) = 0         Perfect positive autocorrelation
ρ = 0     DW ≈ 2(1–0) = 2         No autocorrelation
ρ = –1    DW ≈ 2[1–(–1)] = 4      Perfect negative autocorrelation
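The approximation can be motivated by expanding the numerator of the DW statistic. The sketch below is a standard argument rather than a derivation given in this supplement. The symbol r, denoting the sample lag-1 autocorrelation of the residuals (an estimate of ρ), is notation introduced here.

```latex
\begin{align*}
\sum_{t=2}^{n}(e_t - e_{t-1})^2
  &= \sum_{t=2}^{n} e_t^2 + \sum_{t=2}^{n} e_{t-1}^2 - 2\sum_{t=2}^{n} e_t e_{t-1} \\
  &\approx 2\sum_{t=1}^{n} e_t^2 - 2\sum_{t=2}^{n} e_t e_{t-1}
     \qquad \text{(each sum of squares omits only one term)} \\[4pt]
DW &\approx 2\left(1 - \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}\right) = 2(1 - r)
\end{align*}
```

Because r estimates ρ, this gives DW ≈ 2(1–ρ) when n is reasonably large.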
When DW is much less than 2, we suspect positive serial correlation (a common condition in
time series data). Conversely, when DW is much greater than 2, we suspect negative serial
correlation (a less common condition in time series data). When DW = 2 we have no sample
evidence of autocorrelation. Some practitioners (e.g., http://help.sap.com) suggest a rule of
thumb that within the range 1.5 to 2.5 there is little cause for alarm. However, more precise tests
are desirable.
Critical Values for Durbin-Watson Test
To test for autocorrelation, the test statistic is compared to lower and upper critical values (dL
and dU) for a specified level of significance α. The critical values depend on the sample size (n)
and the number of predictors (k). Critical values for α = .05 are shown in Table 13B.1 for various
sample sizes and numbers of predictors. This table only goes up to k = 5 predictors and only
shows selected sample sizes. This suffices to illustrate the DW test, and should cover the types of
problems you will encounter as an introductory statistics student. However, you can easily find
more complete tables if you need them (see table footnote, chapter references, or Google).
Within Table 13B.1 you can interpolate between sample sizes if you find it necessary.
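Interpolating between sample sizes can be sketched as simple linear interpolation. The Python snippet below is illustrative only: the function name and the choice of n = 35 with k = 2 are assumptions, while the bracketing critical values (n = 30 and n = 40) are taken from Table 13B.1.

```python
def interpolate_critical(n, n_lo, d_lo, n_hi, d_hi):
    """Linearly interpolate a critical value between two tabulated sample sizes."""
    if n_lo >= n_hi or not (n_lo <= n <= n_hi):
        raise ValueError("n must lie between two distinct tabulated sample sizes")
    weight = (n - n_lo) / (n_hi - n_lo)
    return d_lo + weight * (d_hi - d_lo)


# k = 2 predictors: Table 13B.1 lists n = 30 (dL = 1.284, dU = 1.567)
# and n = 40 (dL = 1.391, dU = 1.600). The value n = 35 is a made-up case.
d_lower = interpolate_critical(35, 30, 1.284, 40, 1.391)
d_upper = interpolate_critical(35, 30, 1.567, 40, 1.600)
print(round(d_lower, 4), round(d_upper, 4))
```

Interpolated values are approximations; a more complete table will give the exact critical values for a sample size that is not listed here.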
Table 13B.1 Durbin-Watson Critical 5% Values

           k = 1           k = 2           k = 3           k = 4           k = 5
   n     dL      dU      dL      dU      dL      dU      dL      dU      dL      dU
  10    0.879   1.320   0.697   1.641   0.525   2.016   0.376   2.414   0.243   2.822
  15    1.077   1.361   0.946   1.543   0.814   1.750   0.685   1.977   0.562   2.220
  20    1.201   1.411   1.100   1.537   0.998   1.676   0.894   1.828   0.792   1.991
  25    1.288   1.454   1.206   1.550   1.123   1.654   1.038   1.767   0.953   1.886
  30    1.352   1.489   1.284   1.567   1.214   1.650   1.143   1.739   1.071   1.833
  40    1.442   1.544   1.391   1.600   1.338   1.659   1.285   1.721   1.230   1.786
  50    1.503   1.585   1.462   1.628   1.421   1.674   1.378   1.721   1.335   1.771
  60    1.549   1.616   1.514   1.652   1.480   1.689   1.444   1.727   1.408   1.767
  70    1.583   1.641   1.554   1.672   1.525   1.703   1.494   1.735   1.464   1.768
  80    1.611   1.662   1.586   1.688   1.560   1.715   1.534   1.743   1.507   1.772
  90    1.635   1.679   1.612   1.703   1.589   1.726   1.566   1.751   1.542   1.776
 100    1.654   1.694   1.634   1.715   1.613   1.736   1.592   1.758   1.571   1.780
 150    1.720   1.746   1.706   1.760   1.693   1.774   1.679   1.788   1.665   1.802
 200    1.758   1.778   1.748   1.789   1.738   1.799   1.728   1.810   1.718   1.820

Source: Excerpts from N. E. Savin and Kenneth J. White, “The Durbin-Watson Test for Serial Correlation with
Extreme Sample Sizes or Many Regressors,” Econometrica, Vol. 45, No. 8 (Nov., 1977), pp. 1989-1996. Used with
permission of the Econometric Society.
To test for positive autocorrelation, the hypotheses are:
H0: ρ = 0 (errors are not autocorrelated)
H1: ρ > 0 (errors are positively autocorrelated)
The interpretation of the test for positive autocorrelation is:
If DW < dL conclude H1 (errors are positively autocorrelated)
If DW > dU conclude H0 (errors are not positively autocorrelated)
If dL ≤ DW ≤ dU the test is inconclusive
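The three-way decision rule can be written as a small function. This is a minimal Python sketch; the function name, the returned wording, and the DW values tried are illustrative assumptions, and the critical values used are the Table 13B.1 entries for n = 20 and k = 1.

```python
def dw_positive_test(dw, d_lower, d_upper):
    """Apply the decision rule for H0: rho = 0 versus H1: rho > 0."""
    if not 0 <= dw <= 4:
        raise ValueError("DW must lie between 0 and 4")
    if dw < d_lower:
        return "conclude H1: errors are positively autocorrelated"
    if dw > d_upper:
        return "conclude H0: errors are not positively autocorrelated"
    return "inconclusive"


# Table 13B.1, n = 20 and k = 1: dL = 1.201, dU = 1.411
for dw in (0.90, 1.30, 1.90):  # hypothetical test statistics
    print(dw, "->", dw_positive_test(dw, 1.201, 1.411))
```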
Example: Changes in Consumer Prices
Are changes in consumer prices over time related to changes in manufacturing capacity
utilization, changes in the money supply, and unemployment rates? Table 13B.2 shows data for
40 recent years. The variables to be investigated are:
ChCPI = change in the Consumer Price Index (all items)
CapUtil = change in the manufacturing capacity utilization rate
ChgM1 = change in the M1 component of the money supply
ChgM2 = change in the M2 component of the money supply
Unem = unemployment rate (percent)
Table 13B.2 Selected U.S. Economic Variables, 1971-2010

Year    ChCPI   CapUtil   ChgM1   ChgM2   Unem
1971     3.3     78.0      6.5    13.4     5.9
1972     3.4     83.4      9.2    13.0     5.6
1973     8.7     87.6      5.5     6.6     4.9
…        …       …         …       …       …
2008     0.1     75.0     16.7    10.0     5.8
2009     2.7     67.2      5.7     3.4     9.3
2010     1.5     71.7      8.2     3.4     9.6

Source: Economic Report of the President, February, 2011. Only the first three and last three
observations are shown.
Regression Analysis

R²             0.310        n           40
Adjusted R²    0.231        k           4
R              0.556        Dep. Var.   ChCPI
Std. Error     2.751

ANOVA table
Source         SS          df    MS         F       p-value
Regression     118.7398     4    29.6850    3.92    .0098
Residual       264.8899    35     7.5683
Total          383.6298    39

Regression output
Variables    Coefficients   Std. error   t (df=35)   p-value   VIF
Intercept     -39.4880       11.4973      -3.435      .0015
CapUtil         0.4647        0.1249       3.720      .0007    1.566
ChgM1          -0.0896        0.1082      -0.828      .4131    1.427
ChgM2           0.2102        0.1389       1.514      .1391    1.101
Unem            0.9857        0.4000       2.464      .0188    1.896

Durbin-Watson = 0.83
The fitted regression is shown above. Overall, the regression is significant at α = .01. The best
predictor is CapUtil (p = .0007) followed by Unem (p = .0188). The money supply predictor
ChgM2 is weak (p = .1391) and ChgM1 is not significant. The intercept differs significantly from
zero, but is not of interest here.
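The summary statistics in the printout can be reproduced from the ANOVA sums of squares. The following Python sketch is only an arithmetic check: the variable names are assumptions, and the inputs are the rounded values reported in the printout, so small rounding differences are possible.

```python
import math

# Values copied from the ANOVA table in the printout
ss_regression = 118.7398
ss_residual = 264.8899
n, k = 40, 4

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
ms_regression = ss_regression / k
ms_residual = ss_residual / (n - k - 1)  # this is the MSE
f_statistic = ms_regression / ms_residual
std_error = math.sqrt(ms_residual)

print(round(r_squared, 3), round(adj_r_squared, 3))
print(round(f_statistic, 2), round(std_error, 3))
```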
Some of the predictor signs are in line with a priori expectations, but this naïve model is of little
economic interest. We will not analyze it in detail, as our objective here is only to examine the
pattern of residuals. To test for positive autocorrelation, the hypotheses are:
H0: ρ = 0 (errors are not autocorrelated)
H1: ρ > 0 (errors are positively autocorrelated)
The Durbin-Watson test statistic (shown above in the computer printout) is DW = 0.83. There are
k = 4 predictors and n = 40, so dL = 1.285 and dU = 1.721. The decision rule is:
If DW < dL conclude H1 (errors are positively autocorrelated)
If DW > dU conclude H0 (errors are not positively autocorrelated)
If dL ≤ DW ≤ dU the test is inconclusive
Because DW < dL, we conclude that positive autocorrelation exists in the errors. This pattern can
be seen in Figure 13B.1 as a cyclic pattern, i.e., a series of runs of residuals of the same sign
(-+++--++++ etc.). The residuals have a zero mean (as they must) but the series of 40 residuals
crosses the zero axis line only 12 times. Chance alone would suggest more sign changes (i.e., more
centerline crossings).
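Counting centerline crossings is easy to automate. In the Python sketch below, the function name is an assumption and the input is just the ten-sign fragment quoted above, coded as +1 and -1; the full residual series is not reproduced here. As a rough benchmark (an assumption of this sketch, not a formal test), independent errors that are symmetric about zero would change sign at about half of the n–1 successive steps, i.e., roughly 19 or 20 times when n = 40.

```python
def count_sign_changes(residuals):
    """Count how often consecutive nonzero residuals have opposite signs."""
    signs = [1 if e > 0 else -1 for e in residuals if e != 0]
    return sum(1 for prev, curr in zip(signs, signs[1:]) if prev != curr)


# The sign fragment -+++--++++ quoted in the text, coded as -1 / +1
fragment = [-1, 1, 1, 1, -1, -1, 1, 1, 1, 1]
print(count_sign_changes(fragment))  # 3 crossings in 9 steps

n = 40
print("rough benchmark under independence:", (n - 1) / 2)
```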
FIGURE 13B.1 Residual Plot Over Time
[Figure: residuals plotted against observation number; vertical axis is Residual (gridlines = std. error)]
Negative Autocorrelation
In our example (and in most economic time series models) we would want to test for positive
autocorrelation. However, to test for negative autocorrelation, the hypotheses would be:
H0: ρ = 0 (errors are not autocorrelated)
H1: ρ < 0 (errors are negatively autocorrelated)
The test is similar to the test for positive autocorrelation, except that the test statistic is 4–DW.
For negative autocorrelation, the test is:
If 4–DW < dL conclude H1 (errors are negatively autocorrelated)
If 4–DW > dU conclude H0 (errors are not negatively autocorrelated)
If dL ≤ 4–DW ≤ dU the test is inconclusive
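The negative-autocorrelation rule can be sketched the same way, replacing DW with 4–DW. In this Python sketch the function name and return strings are assumptions. The first call uses the example's values (DW = 0.83, n = 40, k = 4, so dL = 1.285 and dU = 1.721), and the second uses a made-up DW with the n = 20, k = 1 critical values from Table 13B.1.

```python
def dw_negative_test(dw, d_lower, d_upper):
    """Apply the decision rule for H0: rho = 0 versus H1: rho < 0, using 4 - DW."""
    if not 0 <= dw <= 4:
        raise ValueError("DW must lie between 0 and 4")
    statistic = 4 - dw
    if statistic < d_lower:
        return "conclude H1: errors are negatively autocorrelated"
    if statistic > d_upper:
        return "conclude H0: errors are not negatively autocorrelated"
    return "inconclusive"


# Example from the text: 4 - 0.83 = 3.17, which exceeds dU = 1.721
print(dw_negative_test(0.83, 1.285, 1.721))

# Hypothetical DW = 3.2 with n = 20, k = 1: 4 - 3.2 = 0.8, below dL = 1.201
print(dw_negative_test(3.2, 1.201, 1.411))
```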
Caveat for Durbin-Watson Test
The D-W tables do not apply if you have lagged values of the response variable (e.g., $Y_{t-1}$ or $Y_{t-2}$)
among the list of predictors. At this stage of your training, it is best to avoid such predictors.
Exercise
Note: * indicates optional portions for those who want a greater challenge.
13B.1 Below are data on several economic variables that might help predict per capita consumer
spending (annual data covering 1964-2010). The response variable in the proposed model
is ConsCap and the three predictors are YdCap, Unem, and r3-mo, where:
ConsCap = per capita consumption expenditures (current dollars)
YdCap = per capita disposable personal income (current dollars)
Unem = unemployment rate (percent)
r3-mo = three-month U.S. Treasury bill rate (percent)
TABLE 13B.3 U.S. Economic Data, 1964-2010

Year    ConsCap   YdCap    Unem   r3-mo
1964      2144     2408    5.2    3.56
1965      2284     2562    4.5    3.95
1966      2446     2733    3.8    4.88
…         …        …       …      …
2008     33148    35931    5.8    1.48
2009     32526    35888    9.3    0.16
2010     33382    36691    9.6    0.14

Source: Economic Report of the President, February, 2011.
Instructions: (a) Estimate the regression. Include a table of residuals and the Durbin-Watson
test (in Minitab, look under Options, while in MegaStat you must check a box if you want
the DW test statistic). (b) Discuss the overall significance of the model, and tell which
predictors are significant at α = .05. (c) Find the 5% critical values in Table 13B.1 and
state the decision rule for the DW test for positive autocorrelation. Hint: To be
conservative, use the next lower sample size if your n is not in the table (or interpolate the
table values). (d) What is your conclusion about autocorrelation? (e) Plot the residuals in
time order. Describe the pattern. (f) How many times do the residuals change sign (count
the crossings of the zero axis)? What does this tell you? (g*) Based on what you know
about economics, are the signs of the predictors logical a priori (ignoring those that are
insignificant)? (h*) Re-estimate the model using variables that have been transformed
using first differences (see the data spreadsheet). Does this reduce autocorrelation?
RELATED READINGS
Durbin, J.; and Watson, G. S. “Testing for Serial Correlation in Least Squares Regression, I.”
Biometrika 37, 1950, 409–428.
Durbin, J.; and Watson, G. S. “Testing for Serial Correlation in Least Squares Regression, II.”
Biometrika 38, 1951, 159–179.
Gujarati, Damodar; and Dawn Porter. Basic Econometrics. 5th ed. McGraw-Hill, 2009.
Kutner, Michael H.; Christopher J. Nachtsheim; John Neter; and William Li. Applied Linear
Statistical Models. 5th ed. McGraw-Hill, 2005.
Savin, N. E.; and Kenneth J. White. “The Durbin-Watson Test for Serial Correlation with
Extreme Sample Sizes or Many Regressors.” Econometrica 45, 1977, 1989–1996.