ECON 107 Applied Econometrics
Topic 11: Time-Series Models
(Studenmund, Chapter 12)
I. The Trend Model
Suppose we have a simple 2-variable, time-series regression:
$$\log(Y_t) = \beta_0 + \beta_1 t + \varepsilon_t$$
then $\beta_1$ represents the proportional change in $Y_t$ per period (i.e., $100 \cdot \beta_1$ is approximately the percentage growth rate of $Y_t$). If $\varepsilon_t$ is a classical error term, the OLS estimator of $\beta_1$ is unbiased, consistent and efficient.
Even when the $\varepsilon_t$ are serially correlated, the OLS estimator of $\beta_1$ is still asymptotically efficient.
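As a rough illustration of the trend model, the sketch below fits $\log(Y_t)$ on $t$ by OLS for a made-up series that grows at exactly 3% per period. The helper `ols_simple` and the data are assumptions for this example only. It also shows that $\beta_1$ is only approximately the growth rate; the exact rate is $e^{\beta_1} - 1$.

```python
import math

def ols_simple(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical series: starts near 100 and grows at exactly 3% per period.
t = list(range(1, 21))
y = [100 * 1.03 ** ti for ti in t]

b0_hat, b1_hat = ols_simple(t, [math.log(v) for v in y])
approx_growth = b1_hat                # the slope read directly as a growth rate
exact_growth = math.exp(b1_hat) - 1   # exact per-period growth implied by the slope

print(round(approx_growth, 4), round(exact_growth, 4))
```

For small growth rates the two numbers are close, which is why the slope is usually read directly as the growth rate.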
II. The Distributed-Lag Model
Suppose we have a simple 2-variable, time-series regression:
$$Y_t = \beta_0 + \beta_1 X_t + \varepsilon_t$$
This implies a 'contemporaneous effect' (i.e., the current value of $Y_t$ depends on the current value of $X_t$).
An alternative specification would be:
$$Y_t = \beta_0 + \beta_1 X_t + \beta_2 X_{t-1} + \beta_3 X_{t-2} + \varepsilon_t$$
This allows for an 'intertemporal effect' (i.e., $Y_t$ depends on the current and lagged values of $X_t$). This is known as a Distributed-Lag Model.
A couple of things to note:

Change in notation for coefficients. β0 is the intercept. β1 is the
contemporaneous effect of X on Y. Also known as the ‘short-run’ impact.
β2 is the effect lagged one period. β3 is the effect lagged two periods.

The 'long-run' effect is the summation of the 3 slope coefficients.
$$\sum_{i=1}^{3} \beta_i = \beta_1 + \beta_2 + \beta_3$$

There is no reason why we'd have to stop at 2 lagged values. In general, we could specify a distributed-lag regression with up to k lags:
$$Y_t = \beta_0 + \beta_1 X_t + \beta_2 X_{t-1} + \beta_3 X_{t-2} + \dots + \beta_{k+1} X_{t-k} + \varepsilon_t$$
EXAMPLE: Estimating the effects of the Employment Contracts Act (ECA) on
aggregate employment.
Suppose we’re interested in the ultimate effects of the ECA on employment.
These would occur slowly over a few periods as this legislation sets in motion a
number of changes in the labour market.
Suppose we use the following:
$$\ln E_t = \beta_0 + \gamma Z_t + \beta_1 ECA_t + \beta_2 ECA_{t-1} + \beta_3 ECA_{t-2} + \beta_4 ECA_{t-3} + \varepsilon_t$$
where
Et = Employment
Zt = Other regressors.
ECAt = Dummy variable equals 1 after 15 May 1991; zero earlier.
Imagine we have annual data between 1980 and 2005. The dummy for the ECA
is set equal to zero between 1980 and 1990, and one in 1991 to 2005.
We get these results (standard errors in parentheses):
$$\widehat{\ln E}_t = \dots - \underset{(.010)}{.025}\, ECA_t - \underset{(.015)}{.007}\, ECA_{t-1} + \underset{(.014)}{.011}\, ECA_{t-2} + \underset{(.012)}{.036}\, ECA_{t-3}$$
How would you interpret these hypothetical results?
The short-run impact is:
$$\hat{\beta}_1 = -.025$$
The immediate impact of the ECA was a 2.5% decline in employment.
The long-run impact is:
$$\sum_{i=1}^{4} \hat{\beta}_i = .015$$
Eventually the ECA may have had a positive impact on employment. Reduced wages, increased productivity through a more flexible work environment, or reduced costs of taking on new staff (e.g., reduced requirements for redundancy provisions) may all ultimately have led to higher employment.
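The short-run and long-run impacts in the ECA example are simple sums of the estimated coefficients. The sketch below redoes that arithmetic with the hypothetical estimates quoted above and also tracks the cumulative effect after each lag. The variable names are illustrative only.

```python
# Hypothetical estimates from the ECA example: coefficients on
# ECA_t, ECA_{t-1}, ECA_{t-2} and ECA_{t-3}, in that order.
coefs = [-0.025, -0.007, 0.011, 0.036]

short_run = coefs[0]   # contemporaneous ('short-run') impact
long_run = sum(coefs)  # 'long-run' impact: sum of all slope coefficients

# Cumulative effect after including 0, 1, 2 and 3 lags.
cumulative = []
running = 0.0
for c in coefs:
    running += c
    cumulative.append(round(running, 3))

print(short_run, round(long_run, 3), cumulative)
```

The cumulative effect stays negative until the third lag is included, so the sign of the long-run impact depends on how many lags are in the model.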
III. The Koyck Transformation
The previous example raises the issue of the specification of the lag structure. How many ‘lags’ should be included? The assumption is that the effects of the ECA on employment are complete after 3 lags (i.e., by 1994). But the ECA could be transforming the labour market for many years to come.
Theory might suggest the length of the lagged response, but this is unlikely.
This is important because the time series is finite: we have a limited number of observations. If we allow for up to 10 years of lagged effects from the ECA, we won’t be able to estimate the regression model until at least 2001.
Another problem is multicollinearity. The longer the lag structure, the higher the
collinearity among these regressors.
An ad hoc procedure could be used: keep adding lagged values until the estimated coefficient on the last variable included is insignificant. We said earlier that this ‘step-wise regression procedure’ is not a good idea. Practically, in this example, we might never have got to the 3rd lagged term. If we had stopped too early, we would have erroneously concluded that the ECA had a negative impact on employment.
The Koyck Transformation can be thought of as an alternative to this ad hoc
procedure for choosing the structure of your distributed lag model.
Suppose we start with an 'infinite', distributed-lag model:
$$Y_t = \beta_0 + \beta_1 X_t + \beta_2 X_{t-1} + \beta_3 X_{t-2} + \dots + \varepsilon_t$$
where the coefficients are assumed to follow a geometric 'rate of decay' λ.
$$\beta_k = \beta_1 \lambda^{k-1}, \qquad k = 1, 2, \dots, \qquad 0 < \lambda < 1$$
Note that without this ‘constraint’ on the β coefficients, this regression couldn’t be estimated: the number of regressors would always exceed the number of observations.
For example:
if $\lambda = .5$ then $\beta_2 = .5\,\beta_1$, $\beta_3 = .25\,\beta_1$, . . . , $\beta_6 = .03\,\beta_1$, . . . , $\beta_{11} = .001\,\beta_1$.
The impact of the current value of X on Y is about 1000 times greater than the impact of X lagged 10 periods on Y (since $.5^{10} \approx .001$).
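The geometric decay of the lag coefficients is easy to tabulate. The following is a minimal sketch, with `koyck_weights` as a hypothetical helper name and $\beta_1$ normalised to 1, that reproduces the λ = .5 figures above.

```python
def koyck_weights(beta1, lam, k_max):
    """Return [beta_1, ..., beta_{k_max}] where beta_k = beta1 * lam**(k - 1)."""
    return [beta1 * lam ** (k - 1) for k in range(1, k_max + 1)]

# beta_1 normalised to 1 so each weight is expressed as a multiple of beta_1.
weights = koyck_weights(beta1=1.0, lam=0.5, k_max=11)

# Ratio of the contemporaneous effect to the effect of X lagged 10 periods.
ratio = weights[0] / weights[10]

for k, w in enumerate(weights, start=1):
    print(k, round(w, 4))
print(ratio)
```

The ratio is $2^{10} = 1024$, which is where the 'about 1000 times' figure comes from.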
ESTIMATION: Begin with the original equation and substitute in the ‘restrictions’ on the slope coefficients:
$$Y_t = \beta_0 + \beta_1 X_t + \lambda \beta_1 X_{t-1} + \lambda^2 \beta_1 X_{t-2} + \dots + \varepsilon_t$$
The same functional form holds in the previous period, and it can be multiplied by λ:
$$\lambda Y_{t-1} = \lambda \beta_0 + \lambda \beta_1 X_{t-1} + \lambda^2 \beta_1 X_{t-2} + \lambda^3 \beta_1 X_{t-3} + \dots + \lambda \varepsilon_{t-1}$$
Subtract the 2nd expression from the 1st. The trick is that all of the terms with lagged values of X cancel out:
$$Y_t - \lambda Y_{t-1} = \beta_0 (1 - \lambda) + \beta_1 X_t + (\varepsilon_t - \lambda \varepsilon_{t-1})$$
or rearranging terms:
$$Y_t = \beta_0 (1 - \lambda) + \beta_1 X_t + \lambda Y_{t-1} + v_t$$
where $v_t = \varepsilon_t - \lambda \varepsilon_{t-1}$ and $v_{t-1} = \varepsilon_{t-1} - \lambda \varepsilon_{t-2}$.
Since adjacent composite disturbance terms share a common element ($\varepsilon_{t-1}$ appears in both $v_t$ and $v_{t-1}$), the Koyck transformation model suffers from serial correlation.
But we’ve turned an equation with an infinite number of coefficients into a 3-variable regression with only 3 coefficients to be estimated (β0, λ and β1).
We started with a distributed lag model, and ended up with an Autoregressive
Model. The lagged dependent variable is one of the explanatory variables.
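The algebra behind the Koyck transformation can be checked numerically. The sketch below is an illustration under stated assumptions: the parameter values are made up, the error term is switched off, and X is taken to be zero before the first observation so that the 'infinite' lag sum is exact. It builds $Y_t$ from the geometric distributed lag and confirms that it satisfies the autoregressive form.

```python
import random

# Illustrative (made-up) parameter values.
beta0, beta1, lam = 2.0, 0.75, 0.5

rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(50)]  # X is assumed to be 0 before the sample

def y_distributed_lag(t):
    """Y_t = beta0 + sum over j of beta1 * lam**j * X_{t-j}, with no error term."""
    return beta0 + sum(beta1 * lam ** j * x[t - j] for j in range(t + 1))

y = [y_distributed_lag(t) for t in range(len(x))]

# Largest gap between Y_t and the Koyck form
# beta0*(1 - lam) + beta1*X_t + lam*Y_{t-1}.
max_gap = max(
    abs(y[t] - (beta0 * (1 - lam) + beta1 * x[t] + lam * y[t - 1]))
    for t in range(1, len(x))
)
print(max_gap)
```

The gap is zero up to floating-point rounding, so the two specifications describe the same relationship between Y and X.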
The 'long term' effect in these models is analogous to a multiplier:
$$\sum_{k=1}^{\infty} \beta_k = \beta_1 \cdot \frac{1}{1 - \lambda}$$
e.g., if λ=.5 and β1=.75, then the long term effect is 1.5. If λ=.8 and β1=.75, then
the long term effect is 3.75.
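A small sketch of the multiplier formula follows, with `long_run_effect` as a hypothetical helper name. It also compares the closed form with a long partial sum of the geometric coefficients as a sanity check.

```python
def long_run_effect(beta1, lam):
    """Long-run effect beta1 / (1 - lam) for a geometric lag with 0 < lam < 1."""
    if not 0 < lam < 1:
        raise ValueError("lam must lie strictly between 0 and 1")
    return beta1 / (1 - lam)

effect_a = long_run_effect(0.75, 0.5)
effect_b = long_run_effect(0.75, 0.8)

# Sanity check: add up beta_k = beta1 * lam**(k-1) over many lags.
partial_sum = sum(0.75 * 0.8 ** (k - 1) for k in range(1, 201))

print(round(effect_a, 4), round(effect_b, 4), round(partial_sum, 4))
```

A larger λ means slower decay, so more of the total effect arrives in later periods and the multiplier is larger.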
The Durbin h Test.
This is an alternative to the Durbin-Watson d statistic in testing for first-order
autocorrelation in a regression with a lagged dependent variable as a regressor.
Recall that we said the d statistic is inappropriate when lagged dependent variables
are included as regressors.
The problem is that the d statistic is biased toward 2 in an autoregressive model, indicating an absence of autocorrelation. This means that you might not reject the null of no autocorrelation when you should.
This h statistic can be written:
$$h \approx \left(1 - \frac{d}{2}\right) \sqrt{\frac{n}{1 - n[\mathrm{Var}(\hat{\lambda})]}}$$
where ‘n’ is the sample size, and $\hat{\lambda}$ is the estimated coefficient on the lagged dependent variable.
Note that if d=2 (indicating an absence of autocorrelation), h=0. Under the null of no autocorrelation, h has unit variance and follows a standardised normal distribution in large samples.
EXAMPLE: Autoregressive consumption function.
$$\hat{C}_t = -55.29 + \underset{(.171)}{.645}\, DI_t + \underset{(.180)}{.339}\, C_{t-1}$$
$$R^2 = .996 \qquad N = 25 \qquad d = .69$$
The idea is that there is some ‘inertia’ behind consumption spending.
Compute h:
$$h \approx \left(1 - \frac{.69}{2}\right) \sqrt{\frac{25}{1 - 25[.0324]}} \approx 7.51$$
Since h is normally distributed, a 5% 2-tailed test gives us a critical value of 1.96.
Since 7.51>1.96, we reject the null hypothesis of no first order autocorrelation in
favour of positive autocorrelation. If the h statistic is less than -1.96, we’d reject
the null in favour of negative autocorrelation.
One problem with this test procedure is that the h statistic is undefined if the denominator within the square root sign is negative (you can’t take the square root of a negative number!). This occurs if:
$$n\,\mathrm{Var}(\hat{\lambda}) > 1$$
In practice, this doesn’t happen very often.
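The h calculation can be written as a short function. The sketch below uses the figures from the consumption example (d = .69, n = 25, standard error .180 on the lagged dependent variable). The name `durbin_h` is hypothetical, and the function returns `None` when the statistic is undefined.

```python
import math

def durbin_h(d, n, var_lambda):
    """Durbin h: (1 - d/2) * sqrt(n / (1 - n * Var(lambda_hat))).

    Returns None when 1 - n * Var(lambda_hat) is not positive,
    in which case the statistic is undefined.
    """
    denom = 1 - n * var_lambda
    if denom <= 0:
        return None
    return (1 - d / 2) * math.sqrt(n / denom)

# Consumption example: the variance is the squared standard error, .180**2 = .0324.
h = durbin_h(d=0.69, n=25, var_lambda=0.180 ** 2)
reject_no_autocorrelation = abs(h) > 1.96  # 5% two-tailed normal critical value

# An illustrative case where the statistic cannot be computed: n * Var = 1.25.
h_undefined = durbin_h(d=0.69, n=25, var_lambda=0.05)

print(round(h, 2), reject_no_autocorrelation, h_undefined)
```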
IV. Autoregressive Models
The Koyck Transformation Model generates an autoregressive specification.
How do we interpret such a regression model? Does it raise any estimation
issues?
The AR(1) model is:
$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \varepsilon_t$$
where Yt-1 is the regressor. This is known as a first-order autoregressive process,
because only one lagged value of the dependent variable appears on the right-hand
side.
A covariance stationary time series is one whose 3 basic properties do not change over time. The three properties are:
1. The mean is constant over time.
2. The variance is constant over time.
3. The correlation between the variable and its lagged value depends only on the lag length.
If one of these properties is violated, the time series is non-stationary.
Stationary Time Series Model: In the AR(1) model, $Y_t$ is stationary if and only if $|\beta_1| < 1$. In this case $\beta_0$ and $\beta_1$ can be estimated by OLS. The OLS estimators of $\beta_0$ and $\beta_1$ are consistent and follow the normal distribution when the sample size is large enough. Hence the t-test is (ONLY) justified asymptotically.
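To illustrate consistency in the stationary case, the sketch below simulates an AR(1) series with made-up parameters ($\beta_0 = 1$, $\beta_1 = .5$, standard normal errors) and estimates both coefficients by OLS. `fit_ar1` is a hypothetical helper and the seed is arbitrary.

```python
import random

def fit_ar1(y):
    """OLS of Y_t on a constant and Y_{t-1}; returns (b0_hat, b1_hat)."""
    lagged, current = y[:-1], y[1:]
    n = len(lagged)
    l_bar = sum(lagged) / n
    c_bar = sum(current) / n
    sxx = sum((l - l_bar) ** 2 for l in lagged)
    sxy = sum((l - l_bar) * (c - c_bar) for l, c in zip(lagged, current))
    b1 = sxy / sxx
    return c_bar - b1 * l_bar, b1

rng = random.Random(7)
beta0, beta1 = 1.0, 0.5        # illustrative values with |beta1| < 1
y = [beta0 / (1 - beta1)]      # start at the unconditional mean of the process
for _ in range(5000):
    y.append(beta0 + beta1 * y[-1] + rng.gauss(0, 1))

b0_hat, b1_hat = fit_ar1(y)
print(round(b0_hat, 3), round(b1_hat, 3))
```

With a sample this large the estimates should land close to the true values; with only a few dozen observations they can be noticeably off, which is why the inference is only justified asymptotically.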
Non-stationary Time Series Model: If $\beta_1 = 1$, $Y_t$ is called a unit root process, which is non-stationary. A special case of a unit root process is the pure random walk, where
$$Y_t = Y_{t-1} + \varepsilon_t$$
with $\varepsilon_t$ being a classical error term.
When a unit root process is regressed on another unit root process, the independent variable can appear to be more significant than it actually is. Even in the case where the two series are not related, the classical t test tends to find a significant relationship between them. Such a problem is called Spurious Regression.
Phillips (1986, Journal of Econometrics) shows that in the spurious regression: (1)
the OLS estimator is inconsistent, in fact it does not converge to a constant as the
sample size increases; (2) the usual t ratio from the spurious regression diverges
as the sample size increases, and hence tends to reject the null hypothesis of the
slope being zero.
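A small Monte Carlo experiment makes the spurious regression problem concrete. The sketch below is an illustration with arbitrary settings (100 observations, 500 replications, fixed seed). It regresses one simulated random walk on another, independent one, and counts how often the usual t test rejects a zero slope at the nominal 5% level.

```python
import math
import random

def slope_t_stat(x, y):
    """Classical t ratio for the slope in an OLS regression of y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return slope / se

def random_walk(n, rng):
    """Pure random walk Y_t = Y_{t-1} + e_t with standard normal e_t."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

rng = random.Random(42)
n_obs, n_reps = 100, 500
rejections = 0
for _ in range(n_reps):
    x = random_walk(n_obs, rng)
    y = random_walk(n_obs, rng)  # generated independently of x
    if abs(slope_t_stat(x, y)) > 1.96:
        rejections += 1

rejection_rate = rejections / n_reps
print(rejection_rate)
```

If the t test were valid here, the rejection rate would be near 5%. In this kind of experiment it is typically far higher, even though the two series are unrelated by construction.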
Test for non-stationarity:
$$H_0: \beta_1 = 1,$$
$$H_1: \beta_1 < 1.$$
Under the null, the OLS estimator of $\beta_1$ does not follow a normal distribution, even when the sample size is large. As a result, the t statistic does not follow a t distribution.
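One can use the Dickey-Fuller test to test for non-stationarity. Subtracting $Y_{t-1}$ from both sides of the AR(1) model gives the following model, where $\delta = \beta_1 - 1$. The test estimates $\delta$ from this model via OLS
$$Y_t - Y_{t-1} = \beta_0 + \delta Y_{t-1} + \varepsilon_t$$
and tests
$$H_0: \delta = 0,$$
$$H_1: \delta < 0.$$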
The Dickey-Fuller t-statistic has the same expression as the familiar t-statistic. That is
$$t = \frac{\hat{\delta}}{se(\hat{\delta})}$$
However, the standard t-table does not apply here.
The critical values for the Dickey-Fuller t-statistic are:
Sign. Level    Crit. Value
0.01           -3.43
0.025          -3.12
0.05           -2.86
0.10           -2.57
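When the Dickey-Fuller t-statistic is larger than the critical value (do not use the absolute value), we cannot reject the null of a unit root. For example, at the 5% level a statistic of -1.50 is larger than -2.86, so the null is not rejected.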
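The test procedure can be sketched in a few lines. The example below is illustrative only. It simulates a stationary AR(1) series and a pure random walk with arbitrary seed and parameters, computes the Dickey-Fuller t-statistic for each with the hypothetical helper `dickey_fuller_t`, and compares it with the 5% critical value from the table above.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic on delta in  Y_t - Y_{t-1} = beta0 + delta * Y_{t-1} + e_t."""
    lagged = y[:-1]
    diff = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(diff)
    l_bar = sum(lagged) / n
    d_bar = sum(diff) / n
    sxx = sum((l - l_bar) ** 2 for l in lagged)
    delta_hat = sum((l - l_bar) * (d - d_bar) for l, d in zip(lagged, diff)) / sxx
    beta0_hat = d_bar - delta_hat * l_bar
    ssr = sum((d - beta0_hat - delta_hat * l) ** 2 for l, d in zip(lagged, diff))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return delta_hat / se

CRITICAL_5PCT = -2.86  # 5% critical value from the table above

rng = random.Random(1)
stationary = [0.0]  # AR(1) with beta1 = 0.5
walk = [0.0]        # pure random walk (beta1 = 1)
for _ in range(499):
    stationary.append(0.5 * stationary[-1] + rng.gauss(0, 1))
    walk.append(walk[-1] + rng.gauss(0, 1))

t_stationary = dickey_fuller_t(stationary)
t_walk = dickey_fuller_t(walk)
reject_stationary = t_stationary < CRITICAL_5PCT  # reject the unit root null?
reject_walk = t_walk < CRITICAL_5PCT

print(round(t_stationary, 2), reject_stationary)
print(round(t_walk, 2), reject_walk)
```

For the stationary series the statistic should fall well below -2.86, so the unit root null is rejected. For a true random walk the null should be rejected in only about 5% of samples.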
Phillips (1987, Econometrica) derived an analytical expression for the limiting distribution of the Dickey-Fuller t-statistic as the sample size goes to infinity.
V. Questions for Discussion: Q12.3
VI. Computing Exercise: Johnson, Ch12