MSc Further Time Series Analysis

5 Transfer function modelling

5.1 The model
Consider the construction of a model for a time series (Yt ) whose values are influenced
by the earlier values of a series (Xt ). Thus the process (Yt ) is dynamically related to the
process (Xt ). We may think of (Xt ) as the input to a system and of (Yt ) as the output,
or of (Xt ) as a process of explanatory variables and of (Yt ) as a process of dependent
variables. In general, we shall model (Yt ) as a linearly filtered version of (Xt ), where the
filter used is one-sided, i.e., causal, and also includes a constant term µ. Thus, at the
heart of the model, we have a relationship of the form
    Yt = µ + ∑_{j=0}^∞ vj Xt−j,
which represents the systematic dynamics of the model. The effect of the input process
takes time to work through to the output process. Furthermore, there may be a time-delay before the input starts to influence the output, i.e., a positive integer k such that
vj = 0 for 0 ≤ j ≤ k − 1. (In comparison with Section 3 on linear filters, note the reversal
of the roles of the processes (Xt ) and (Yt ) as input and output.) In practice, this model
needs to be developed further, because the relationship between (Xt ) and (Yt ) will not
be exact but will be subject to disturbance. We introduce a disturbance or noise term Ut
to arrive at the following equation for what is known as the transfer function model or a
distributed lag model.
    Yt = µ + ∑_{j=0}^∞ vj Xt−j + Ut    (−∞ < t < ∞).    (1)
The disturbance process (Ut ) in Equation (1) is unobservable and is not necessarily a
white noise process. We make the following two assumptions:
1. (Ut ) is a zero-mean stationary process.
2. (Ut ) is uncorrelated with the input process (Xt ).
If we further assume that (Xt ) is a stationary process then it follows that the output
process (Yt ) is also a stationary process. If the data being modelled is non-stationary then
it may be differenced to reduce it to stationarity before fitting the model of Equation (1).
In such cases it may well be appropriate, as is the case for univariate models, to assume
that the differenced data, both input and output, have zero mean, which implies that the
constant µ in the model equation (1) is taken to be zero.
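The systematic part of this relationship is just a one-sided moving average of the input. As a small numerical illustration (a Python sketch with invented filter weights, not anything fitted in these notes), the snippet below applies a causal filter with time-delay k = 2 to a unit pulse in the input, so the pulse first reaches the output two periods later:

```python
# Illustrative sketch of the systematic part Yt = mu + sum_{j>=0} vj*X[t-j],
# with invented weights and a time-delay of k = 2 (v0 = v1 = 0).
mu = 1.0
v = [0.0, 0.0, 0.8, 0.4, 0.2]   # vj = 0 for j < k; filter truncated at j = 4

def systematic_part(x, t):
    """Causal filtered value mu + sum_j vj * x[t-j] at time t."""
    return mu + sum(vj * x[t - j] for j, vj in enumerate(v) if t - j >= 0)

x = [0.0] * 10
x[3] = 1.0                      # unit pulse in the input at t = 3
y = [systematic_part(x, t) for t in range(10)]

print(y[4], y[5], y[6])         # 1.0 1.8 1.4: the pulse shows up at t = 3 + k
```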
Example
Consider a manufacturer who decides on the amount of a certain product that he will
start to produce at time t, basing his decision upon the predicted selling price. Suppose
that it takes k time periods from start to completion of the product. Let Yt denote the
quantity of the product ready for supply at time t and Xt the market price at time t.
If the manufacturer uses simple exponential smoothing then the predicted price x̂t any
number of steps ahead at time t may be written in the form
    x̂t = (1 − α) ∑_{j=0}^∞ α^j Xt−j.
Assuming a simple linear relationship between planned production and predicted price,
together with the added disturbance term, we may write
    Yt+k = µ + δx̂t + Ut+k
         = µ + δ(1 − α) ∑_{j=0}^∞ α^j Xt−j + Ut+k.
Equivalently, writing δ(1 − α) = β,
    Yt = µ + β ∑_{j=0}^∞ α^j Xt−k−j + Ut
       = µ + β ∑_{j=k}^∞ α^{j−k} Xt−j + Ut.    (2)
Note the presence of a time-delay k. Alternatively, using the lag operator, we may write
    Yt = µ + βL^k ∑_{j=0}^∞ (αL)^j Xt + Ut
       = µ + [βL^k / (1 − αL)] Xt + Ut.    (3)
The processes (Xt ) and (Yt ) as described will not be stationary in general, but they
may be differenced to transform them to stationarity. The transformed series will still
satisfy essentially the same Equations (2) and (3) with µ = 0.
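The two forms of the exponential smoothing predictor above (the infinite sum and the familiar one-step recursion x̂t = (1 − α)Xt + αx̂t−1) can be checked against each other numerically. A minimal Python sketch with invented data, starting the recursion from zero so that it matches the sum truncated at the start of the record:

```python
# Check that xhat_t = (1 - alpha) * sum_{j>=0} alpha^j * X[t-j] agrees with
# the recursion xhat_t = (1 - alpha)*X[t] + alpha*xhat_{t-1} (illustrative data).
alpha = 0.6
X = [5.0, 7.0, 6.0, 8.0, 7.5, 9.0]

# Truncated sum at the last time point (terms with t - j < 0 are dropped).
t = len(X) - 1
xhat_sum = (1 - alpha) * sum(alpha ** j * X[t - j] for j in range(t + 1))

# Recursion started from xhat = 0, which matches the truncation above exactly.
xhat = 0.0
for x in X:
    xhat = (1 - alpha) * x + alpha * xhat

print(abs(xhat - xhat_sum) < 1e-9)   # True: the two forms agree
```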
Returning to the general case, we assume that the disturbance process (Ut ) is an
ARMA process with infinite moving average representation
    Ut = ψ(L)εt,
where (εt) is a white noise process with variance σ², uncorrelated with the input
process (Xt). Equation (1) may then be written as
    Yt = µ + v(L)Xt + ψ(L)εt,    (4)
where v(z) is the generating function of the coefficients of the filter. In the present setting
v(z) is also referred to as the transfer function of the filter. The first two terms on the
right-hand side of Equation (4) represent the systematic dynamics of the model and the
third term the disturbance dynamics.
We may write the ARMA model for (Ut ) more explicitly in the form
    φ(L)Ut = θ(L)εt.
Thus φ(z) is the autoregressive characteristic polynomial and θ(z) is the moving average characteristic polynomial for the disturbance process. We assume that the transfer
function v(z) may be expressed as a rational function, a ratio of polynomials,
    v(z) = ω(z) z^k / δ(z),    (5)
where k is the time-delay, the denominator (autoregressive) polynomial δ is given by
    δ(z) = 1 − δ1 z − δ2 z² − · · · − δp z^p,
for some p, and the numerator (moving average) polynomial ω by
    ω(z) = ω0 − ω1 z − ω2 z² − · · · − ωq z^q,
for some q. The corresponding recursive filter is δ(L)Yt = ω(L)L^k Xt. Equation (4)
becomes
    Yt = µ + [ω(L)L^k / δ(L)] Xt + [θ(L) / φ(L)] εt.    (6)
• The polynomial ω(z) has ω0 ≠ 1 in general, because a multiplicative constant has
been absorbed into it. The minus sign in front of the subsequent ωi reflects the SAS
usage.
• To have a well-defined model, we assume that all the roots of the characteristic
equations δ(z) = 0 and φ(z) = 0 lie outside the unit circle in the complex plane.
• The model of Equation (1)/(4)/(6) may be rewritten as
    δ(L)Yt = µ† + ω(L)L^k Xt + Ut†,
where µ† = δ(1)µ and Ut† = δ(L)Ut , to exhibit explicitly a recursive, autoregressive
aspect of the model for (Yt ).
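The recursive form makes the model easy to simulate and to reason about. As a numerical check with invented coefficients (a sketch, not the model fitted later in this section), take δ(L) = 1 − δ1 L and ω(L) = ω0, ignoring the delay, which does not affect the long-run behaviour. With a constant input Xt ≡ x̄ and the disturbance switched off, iterating the recursion converges to the steady state (µ† + ω0 x̄)/δ(1) = µ + (ω0/δ(1)) x̄, in line with the total-multiplier relation of Equation (7):

```python
# Steady-state check of the recursive form delta(L)*Yt = mu_dag + omega0*Xt
# with delta(L) = 1 - delta1*L and mu_dag = delta(1)*mu; constant input, no
# disturbance. (Invented coefficients, purely illustrative.)
mu, delta1, omega0, xbar = 1.5, 0.7, 2.0, 3.0
mu_dag = (1 - delta1) * mu            # delta(1) = 1 - delta1

y = 0.0
for _ in range(200):                  # iterate the recursion to convergence
    y = delta1 * y + mu_dag + omega0 * xbar

steady = (mu_dag + omega0 * xbar) / (1 - delta1)
print(round(y, 6), round(steady, 6))  # 21.5 21.5 = mu + (omega0/delta(1))*xbar
```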
Assuming that the processes (Xt ) and (Yt ) are stationary, let µX and µY denote their
respective means. Taking expectations in Equation (1),
    µY = µ + (∑_{j=0}^∞ vj) µX.    (7)
The quantity ∑_{j=0}^∞ vj is sometimes referred to as the total multiplier: the change in µY
per unit change in µX, i.e., the long-term effect on (Yt) of a unit change in µX. Note that
the total multiplier may also be written as v(1) ≡ ω(1)/δ(1).
5.2 The cross-correlation function and model identification
Taking (Xt ) to be stationary, we assume temporarily that (Xt ) has zero mean, which does
not alter the second-order moments of the model but simplifies the notation in deriving the
results of this sub-section. Recall that in Equation (1) (Ut ) is also zero-mean stationary
and that (Xt ) and (Ut ) are uncorrelated with each other. For any τ , multiplying through
by Xt−τ and taking expectations, we obtain
    γ^{21}_τ = ∑_{j=0}^∞ vj γ^{11}_{τ−j},    (8)
where γ^{21}_τ is the cross-covariance, γ^{21}_τ = E(Yt Xt−τ), and γ^{11}_τ is the autocovariance,
γ^{11}_τ = E(Xt Xt−τ). Equation (8) simplifies in the special case when (Xt) is a white noise
process, in which case
    γ^{21}_τ = vτ γ^{11}_0,
which we may rewrite as
    vτ = ρ^{21}_τ √(γ^{22}_0 / γ^{11}_0) ∝ ρ^{21}_τ.    (9)
The result of Equation (9) shows that, if the input process is white noise, we have
a straightforward method of estimating the transfer function from the sample cross-correlation
function and the sample variances of the input and output processes:
    v̂τ = r^{21}_τ √(c^{22}_0 / c^{11}_0) ∝ r^{21}_τ.    (10)
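Equation (10) is easy to see in action by simulation (a Python sketch; the filter weights, delay and noise level are all invented). With a white-noise input, v̂τ = r^{21}_τ √(c^{22}_0/c^{11}_0) reduces to c^{21}_τ / c^{11}_0, and the scaled sample cross-correlations recover the filter weights:

```python
import random

# Simulation of Equation (10) with a white-noise input (invented true filter:
# delay k = 2, weights 1.5 and 0.9, plus a small added disturbance).
random.seed(7)
n = 20000
v_true = [0.0, 0.0, 1.5, 0.9]

X = [random.gauss(0, 1) for _ in range(n)]
Y = [sum(vj * X[t - j] for j, vj in enumerate(v_true) if t >= j)
     + 0.2 * random.gauss(0, 1)
     for t in range(n)]

def cross_cov(y, x, tau):
    """Sample cross-covariance c21_tau, the average of y[t]*x[t - tau]."""
    return sum(y[t] * x[t - tau] for t in range(tau, n)) / n

# vhat_tau = r21_tau * sqrt(c22_0/c11_0), which equals c21_tau / c11_0
c11_0 = sum(xi * xi for xi in X) / n
v_hat = [cross_cov(Y, X, tau) / c11_0 for tau in range(6)]
print([round(vh, 2) for vh in v_hat])   # roughly [0, 0, 1.5, 0.9, 0, 0]
```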
Now, in general, (Xt ) is not a white noise process but is some other ARMA process.
Hence there exists some filter with generating function w(z) such that
w(L)Xt = ηt ,
where (ηt ) is a white noise process, uncorrelated with (Ut ). Applying this filter to Equation
(1) we obtain
    Yt∗ = µ∗ + ∑_{j=0}^∞ vj ηt−j + Ut∗,    (11)
where Yt∗ = w(L)Yt , µ∗ = w(1)µ and Ut∗ = w(L)Ut . Equation (11) is similar in form to
the model Equation (1), with the same filter (vj ), but with the input process white noise.
Hence the results of Equations (9) and (10) may be applied to the filtered processes (Yt∗ )
and (ηt ) to estimate (vj ).
In attempting to find an appropriate transfer function model, a standard approach
involves first filtering the input and output processes, using the same filter for both, so as
to convert the input process to white noise. Such a procedure is commonly known as “pre-whitening”. It presupposes that the processes (Xt) and (Yt) have, if necessary, already
been transformed to stationarity by differencing. By examining the cross-correlation
function of the pre-whitened process data, we may be able to identify a suitable transfer
function model.
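A minimal sketch of the pre-whitening step itself (Python, with an invented θ; the input is assumed here to follow an MA(1) model, which is also what is fitted for ind in the next sub-section). Inverting Xt = ηt − θηt−1 gives the recursion ηt = Xt + θηt−1, and the same filter is then applied to the output series before the cross-correlations are computed:

```python
import random

# Pre-whitening an MA(1) input X_t = eta_t - theta*eta_{t-1} by the inverse
# filter w(L) = (1 - theta*L)^(-1), i.e. eta_t = X_t + theta*eta_{t-1}.
# (Invented theta and simulated data, purely illustrative.)
random.seed(1)
theta, n = 0.45, 500

eta = [random.gauss(0, 1) for _ in range(n)]
X = [eta[t] - (theta * eta[t - 1] if t >= 1 else 0.0) for t in range(n)]

def prewhiten(series, theta):
    """Apply w(L) = (1 - theta*L)^(-1): out_t = series_t + theta*out_{t-1}."""
    out, prev = [], 0.0
    for value in series:
        prev = value + theta * prev
        out.append(prev)
    return out

eta_hat = prewhiten(X, theta)          # the same call would be made on Y
print(max(abs(a - b) for a, b in zip(eta_hat, eta)))   # ~0: noise recovered
```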
5.3 Example: sales
Using the example introduced in Section 4.2, recall that it is the first differences of the
series ind and sales that appear to be stationary. The main reason for investigating
the series would appear to be to predict sales from the leading indicator. Hence we
shall attempt to construct a transfer function model with first differences of the sales
as the output (Yt ) and the first differences of the leading indicator as the input (Xt ).
The following SAS program fits an appropriate model, chosen from among models of the
general form of Equation (6).
proc arima data=IndSales;
   identify var=ind(1) nlag=20;
   estimate q=1 noint;
   identify var=sales(1) crosscor=ind(1) nlag=20;
   estimate q=1 input=(3 $ / (1) ind) noint;
   forecast lead=5 out=results;
run;
In fact, this program would have to be developed iteratively, step by step. The first
identify statement specifies what is going to be the input variable, here the first differences of the variable ind, and produces the autocorrelation and other functions. The
autocorrelation function output on page 7 suggests that we should fit an MA(1) model
to the differences, i.e., an ARIMA(0,1,1) model to ind. The first estimate statement
fits this model, and the noint option indicates that a zero mean is being assumed. The
output on page 8 exhibits the fitted model, and the p-values of the portmanteau statistics
show that the model fits well.
The fitted model for the input process is used
1. for pre-whitening the input and output variables before calculation of their cross-correlation function and
2. for calculating forecast values of the input variable which are in turn used in the
calculation of forecast values of the output variable.
After the input process has been modelled, the second identify statement, with the
crosscor option, produces (i) the autocorrelation and other functions for what is going
to be the output variable, the first differences of sales, and (ii) the cross-correlation
function of the first differences of sales and ind, automatically pre-whitened using the
model fitted to ind by the previous estimate statement. Examination of the cross-correlation function on page 10 indicates that there is a time-delay of 3 units. Thereafter,
the ccf appears to die away geometrically. This suggests a transfer function of the form
    v(z) = ω0 z³ / (1 − δ1 z).
• We are fortunate in having a clear-cut structure to the cross-correlation function
here!
• Note that this cross-correlation function differs from the one in Section 4.2, where
the variables had not been pre-whitened.
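Expanding this v(z) by the geometric series gives vj = 0 for j < 3 and vj = ω0 δ1^{j−3} for j ≥ 3: zero weights during the delay, then geometric decay, which is exactly the pattern seen in the cross-correlation function. A short Python illustration using round figures close to the rough estimates obtained below (ω0 ≈ 4.71, δ1 ≈ 0.67):

```python
# Impulse-response weights implied by v(z) = omega0*z^3 / (1 - delta1*z):
# v_j = 0 for j < 3 and v_j = omega0 * delta1**(j - 3) for j >= 3.
omega0, delta1, k = 4.71, 0.67, 3

v = [0.0 if j < k else omega0 * delta1 ** (j - k) for j in range(8)]
print([round(vj, 3) for vj in v])
# [0.0, 0.0, 0.0, 4.71, 3.156, 2.114, 1.417, 0.949]: delay, then geometric decay
```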
As part of the process of model identification, we also have to specify a model for the
disturbance process (Ut ). We might just try fitting the transfer function model, assuming
a few simple models for the disturbance process, to find the simplest one that works.
Another, more systematic approach involves first finding a rough estimate v̂ of the transfer
function, using Equation (10). An estimated disturbance process is then given by ût =
yt − v̂(L)xt . At least in simple cases, we may readily calculate the values of this estimated
disturbance process and fit an ARMA model to them.
In the present case, using Equation (10), first estimates of the parameters of our
proposed form of transfer function are given by
    ω̂0 = v̂3 = 0.67523 √(3.794675 / 0.078036) = 4.71
and
    δ̂1 = v̂4 / v̂3 = r^{21}_4 / r^{21}_3 = 0.45227 / 0.67523 = 0.670.
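The arithmetic behind these first estimates is easy to reproduce (a Python check rather than SAS; the inputs are the cross-correlations at lags 3 and 4 and the two prewhitened-series variances read from the SAS output):

```python
from math import sqrt

# Rough first estimates from Equation (10): omega0_hat = vhat_3 and
# delta1_hat = vhat_4 / vhat_3, using values read from the SAS output.
r3, r4 = 0.67523, 0.45227                 # cross-correlations at lags 3 and 4
var_sales, var_ind = 3.794675, 0.078036   # prewhitened-series variances

omega0_hat = r3 * sqrt(var_sales / var_ind)   # = vhat_3
delta1_hat = r4 / r3                          # = vhat_4 / vhat_3
print(round(omega0_hat, 2), round(delta1_hat, 3))   # 4.71 0.67
```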
From analysis of the estimated disturbance process or by some trial and error, it turns
out that an MA(1) model for (Ut ) is appropriate here.
The second estimate statement in the SAS program fits our chosen transfer function
model to the data with the first difference of sales as the output variable, as specified in
the previous identify statement. The q=1 option specifies that the disturbance process
is to be modelled as an MA(1) process. The input option is used to specify the input
variable and the form of the rational transfer function of Equation (5). In general, the
input option takes the form
input = ( k $ ("numerator lags") / ("denominator lags") x )
where k is the time-delay (Shift in the terminology of the SAS output), "numerator lags" specifies the numerator polynomial, "denominator lags" specifies the denominator polynomial, and x represents the input variable. In the present case, the time-delay
is 3. The term / (1) after the dollar sign specifies that the numerator polynomial is
a constant and that the denominator polynomial is of order 1, i.e., 1 − δ1 z. The input
variable will be the first difference of ind. The noint option specifies that there is to be
no constant term in the transfer function model.
The SAS output on pages 11 and 12 gives the fitted model as
    Yt = [4.71790 / (1 − 0.72484 L)] Xt−3 + εt − 0.29561 εt−1.
Apart from one value that is significant at the 5% level, the p-values of the diagnostic
statistics on page 11 indicate that the fitted model is satisfactory in that
1. the autocorrelations of the residuals are consistent with being from a white noise
process and
2. the cross-correlations are consistent with the residuals, and hence the disturbance
process, being uncorrelated with the input process.
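One further quantity worth reading off the fitted model is its total multiplier v(1) = ω(1)/δ(1) from Equation (7): the long-run effect on (differenced) sales of a sustained unit change in the (differenced) leading indicator. A one-line Python computation:

```python
# Total multiplier v(1) = omega(1)/delta(1) of the fitted transfer function
# 4.71790 / (1 - 0.72484*L).
omega0, delta1 = 4.71790, 0.72484

total_multiplier = omega0 / (1 - delta1)
print(round(total_multiplier, 2))   # 17.15
```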
Finally, the forecast statement in the SAS program produces forecasts of sales for
the next five time-points. We might note that the forecast values are fairly similar to the
ones obtained at the end of Section 4 using the VAR(5) model, and would be even more
so if constant terms had not been included in the VAR(5) model.
[SAS output, page 7]

The ARIMA Procedure

Name of Variable = ind

Period(s) of Differencing                          1
Mean of Working Series                      0.022752
Standard Deviation                          0.315162
Number of Observations                           149
Observation(s) eliminated by differencing          1

Autocorrelations

Lag    Covariance    Correlation    Std Error
  0      0.099327        1.00000            0
  1     -0.044402        -.44703     0.081923
  2     0.0084831        0.08541     0.096921
  3    -0.0069778        -.07025     0.097425
  4      0.012869        0.12956     0.097764
  5    -0.0090300        -.09091     0.098910
  6     0.0077101        0.07762     0.099469
  7    -0.0077688        -.07821     0.099875
  8      0.011913        0.11994     0.100285
  9    -0.0051837        -.05219     0.101243
 10     -0.012395        -.12479     0.101424
 11      0.018569        0.18695     0.102449
 12    -0.0085629        -.08621     0.104714
 13     0.0040351        0.04062     0.105189
 14    -0.0037875        -.03813     0.105294
 15    -0.0017806        -.01793     0.105387
 16    -0.0001773        -.00179     0.105407
 17     0.0050849        0.05119     0.105407
 18    -0.0093822        -.09446     0.105574
 19     0.0019400        0.01953     0.106140
 20      0.011636        0.11715     0.106164

[ASCII plot of the autocorrelation function omitted; "." marks two standard errors.]

...
[SAS output, page 8]

Conditional Least Squares Estimation

Parameter    Estimate    Standard Error    t Value    Approx Pr > |t|    Lag
MA1,1         0.44920           0.07347       6.11             <.0001      1

Variance Estimate         0.080382
Std Error Estimate        0.283517
AIC                       48.21641
SBC                       51.22036
Number of Residuals            149
* AIC and SBC do not include log determinant.

Autocorrelation Check of Residuals

To Lag    Chi-Square    DF    Pr > ChiSq
     6          5.93     5        0.3128
    12         13.82    11        0.2431
    18         15.07    17        0.5905
    24         19.45    23        0.6749

[Individual residual autocorrelations omitted.]

Model for variable ind

Period(s) of Differencing    1
No mean term in this model.

Moving Average Factors
Factor 1:  1 - 0.4492 B**(1)
[SAS output, page 9]

Name of Variable = sales

Period(s) of Differencing                          1
Mean of Working Series                      0.420134
Standard Deviation                          1.439145
Number of Observations                           149
Observation(s) eliminated by differencing          1

Autocorrelations

Lag    Covariance    Correlation    Std Error
  0      2.071138        1.00000            0
  1      0.645779        0.31180     0.081923
  2      0.576179        0.27819     0.089534
  3      0.468885        0.22639     0.095159
  4      0.522143        0.25210     0.098707
  5      0.309832        0.14960     0.102938
  6      0.276728        0.13361     0.104387
  7      0.130231        0.06288     0.105528
  8      0.274203        0.13239     0.105780
  9     -0.039048        -.01885     0.106886
 10    -0.0077359        -.00374     0.106908
 11      0.220275        0.10635     0.106909
 12     -0.027219        -.01314     0.107617
 13     -0.036294        -.01752     0.107628
 14     -0.154498        -.07460     0.107647
 15     -0.139372        -.06729     0.107993
 16      0.017822        0.00860     0.108274
 17     -0.031430        -.01518     0.108279
 18     -0.148697        -.07179     0.108293
 19      0.057475        0.02775     0.108612
 20      0.146291        0.07063     0.108660

[ASCII plot of the autocorrelation function omitted; "." marks two standard errors.]

...

Variable ind has been differenced.

Correlation of sales and ind

Period(s) of Differencing                          1
Number of Observations                           149
Observation(s) eliminated by differencing          1
Variance of transformed series sales        3.794675
Variance of transformed series ind          0.078036

Both series have been prewhitened.
[SAS output, page 10]

Crosscorrelations

Lag    Covariance    Correlation
-20     -0.020535        -.03774
-19     -0.015761        -.02896
-18     -0.026376        -.04847
-17      0.026874        0.04939
-16      0.022502        0.04135
-15    -0.0088874        -.01633
-14    -0.0024431        -.00449
-13     0.0087292        0.01604
-12     -0.015148        -.02784
-11     -0.038093        -.07000
-10     -0.028313        -.05203
 -9     -0.026054        -.04788
 -8      0.026927        0.04948
 -7    -0.0013061        -.00240
 -6     -0.034684        -.06374
 -5      0.013016        0.02392
 -4     0.0012583        0.00231
 -3      0.022045        0.04051
 -2     0.0054125        0.00995
 -1      0.051478        0.09460
  0      0.034232        0.06291
  1      0.043060        0.07913
  2      0.010062        0.01849
  3      0.367442        0.67523
  4      0.246112        0.45227
  5      0.185447        0.34079
  6      0.140160        0.25757
  7      0.145861        0.26804
  8      0.107803        0.19811
  9      0.094235        0.17317
 10      0.053115        0.09761
 11      0.078822        0.14485
 12      0.038038        0.06990
 13    -0.0078178        -.01437
 14      0.059361        0.10909
 15      0.025895        0.04759
 16      0.023753        0.04365
 17    -0.0041726        -.00767
 18     -0.021109        -.03879
 19     -0.013316        -.02447
 20    -0.0015524        -.00285

[ASCII plot of the cross-correlation function omitted; "." marks two standard errors.]

...

Both variables have been prewhitened by the following filter:

Prewhitening Filter

Moving Average Factors
Factor 1:  1 - 0.4492 B**(1)
[SAS output, page 11]

Conditional Least Squares Estimation

Parameter    Estimate    Standard Error    t Value    Approx Pr > |t|    Lag    Variable    Shift
MA1,1         0.29561           0.08026       3.68             0.0003      1       sales        0
NUM1          4.71790           0.07109      66.36             <.0001      0         ind        3
DEN1,1        0.72484         0.0054965     131.87             <.0001      1         ind        3

Variance Estimate         0.064117
Std Error Estimate        0.253213
AIC                       16.13843
SBC                       25.06864
Number of Residuals            145
* AIC and SBC do not include log determinant.
Correlations of Parameter Estimates

Variable                 sales      ind      ind
Parameter                MA1,1     NUM1   DEN1,1

sales    MA1,1           1.000    0.054   -0.035
ind      NUM1            0.054    1.000   -0.627
ind      DEN1,1         -0.035   -0.627    1.000
Autocorrelation Check of Residuals

To Lag    Chi-Square    DF    Pr > ChiSq
     6          7.36     5        0.1950
    12         11.27    11        0.4213
    18         15.23    17        0.5791
    24         19.55    23        0.6686

[Individual residual autocorrelations omitted.]
Crosscorrelation Check of Residuals with Input ind

To Lag    Chi-Square    DF    Pr > ChiSq
     5         10.47     4        0.0332
    11         12.60    10        0.2466
    17         14.11    16        0.5905
    23         20.85    22        0.5302

[Individual residual cross-correlations omitted.]
[SAS output, page 12]

Model for variable sales

Period(s) of Differencing    1
No mean term in this model.

Moving Average Factors
Factor 1:  1 - 0.29561 B**(1)

Input Number 1

Input Variable                     ind
Shift                                3
Period(s) of Differencing            1
Overall Regression Factor     4.717904

Denominator Factors
Factor 1:  1 - 0.72484 B**(1)
Forecasts for variable sales

Obs    Forecast    Std Error    95% Confidence Limits
151    262.8490       0.2532    262.3527    263.3452
152    264.1643       0.3097    263.5573    264.7714
153    263.3722       0.3574    262.6717    264.0727
154    263.3373       1.3960    260.6013    266.0734
155    263.3120       2.2118    258.9770    267.6471