Chapter 4: Transfer Function Analysis

4.1 Introduction

• Previously we have only considered modelling a single time series $Y_t$, i.e., $Y_t$ is linearly related to its own past values.
• Transfer function analysis is concerned with the situation where $Y_t$ is predicted using current and past values of other variables.
• For example, the monthly sales of a company may be related to its monthly advertising expenditure.
  - Our concern here is to predict future monthly sales.
  - If we think that monthly sales are affected by advertising expenditure, then by incorporating the information on advertising we may be able to produce better sales forecasts.

Transfer function analysis is concerned with situations similar to the above example. The time series being forecast ($Y_t$) is called the output series, while the time series ($X_t$) which helps explain the movement of $Y_t, Y_{t-1}, Y_{t-2}, \ldots$ is called the input series.

4.2 Nature of transfer function

Assumptions and terminology:
• Observations for the various series occur at equally spaced intervals.
• The input series is not affected by the output series; i.e., we are limited to single-equation models.
• Both the input and output series are stationary.
• The model may be written as

  $$Y_t = V_0 X_t + V_1 X_{t-1} + V_2 X_{t-2} + \cdots + V_k X_{t-k} + N_t = V(B) X_t + N_t \qquad (4.1)$$

  where the $V$'s are the unknown parameters and $N_t$ is the noise series, which represents the combined effects of all other factors affecting the output series.
• $N_t$ may be autocorrelated, but is assumed to be independent of $X_t$.
• The parameters $V_0, V_1, \ldots$ are called impulse response weights.
• $V_0$ states how $Y_t$ responds to a change in $X_t$, $V_1$ states how $Y_t$ responds to a change in $X_{t-1}$, and so on.
• The larger the absolute value of any weight $V_k$, the larger the response of $Y_t$ to a change in $X_{t-k}$. In theory, this distributed lag response could be of infinite length.
• $Y_t$ often does not respond immediately to a change in $X_t$; that is, some initial weights may be zero.
  For example, the company's advertising expense may not affect sales immediately. It may take up to a month for the market to digest the advertising effects. Hence $V_0 = 0$.
  - The number of initial $V$ weights equal to zero is called the dead time, denoted by $b$.
  - Also, the order ($k$) of the transfer function can be rather large. To avoid an infinite number of weights, (4.1) is often written as

    $$Y_t = \frac{W(B)\,B^b}{\delta(B)} X_t + N_t$$

    where

    $$W(B) = W_0 - W_1 B - W_2 B^2 - \cdots - W_s B^s$$
    $$\delta(B) = 1 - \delta_1 B - \delta_2 B^2 - \cdots - \delta_r B^r$$

This effectively expresses the infinite-order transfer function as the ratio of two finite-order polynomials. For example, consider $W(B) = 1.2 - 0.5B$ and $\delta(B) = 1 - 0.8B$. Then

$$V(B) = (1.2 - 0.5B)(1 + 0.8B + 0.8^2 B^2 + \cdots) = 1.2 + 0.46B + 0.368B^2 + 0.2944B^3 + \cdots$$

Over time, the full change in $Y_t$ in response to a unit change in $X_t$ is given by

$$g = V_0 + V_1 + V_2 + \cdots = \frac{W_0 - W_1 - \cdots - W_s}{1 - \delta_1 - \delta_2 - \cdots - \delta_r}$$

where $g$ is called the steady-state gain of the transfer function.

4.3 Nature of impulse response function

• In ARIMA modelling, model identification consists of matching the sample autocorrelation and partial autocorrelation functions to the theoretical autocorrelation and partial autocorrelation functions of specific ARIMA models.
• In transfer function models, the same approach is used. We match the sample cross-correlation function to the theoretical impulse response functions. It is therefore necessary to investigate the impulse response function and determine what various patterns in the weights $V_0, V_1, V_2, \ldots$ imply about $b$, $r$ and $s$. Very often, we find that the values of $r$ and $s$ do not exceed 2.

Case 1: r = 0, s = 0 and b = 0

Since the impulse response function is defined as

$$V(B) = \frac{W(B)\,B^b}{\delta(B)},$$

this implies

$$V_0 + V_1 B + V_2 B^2 + \cdots = W_0$$

Equating the coefficients of $B$, we get
  $B^0$: $V_0 = W_0$
  $B^k$: $V_k = 0$, $k > 0$

Therefore, the transfer function model is $Y_t = W_0 X_t$, i.e., $Y_t$ changes by $W_0$ units immediately for every unit change in $X_t$.
Impulse response function graph: [plot of $V_k$ against lag, lags 0 to 3]

Case 2: r = 0, s = 0 and b = 1

The transfer function is

$$V_0 + V_1 B + V_2 B^2 + \cdots = W_0 B$$

Equating the coefficients of $B$, we get
  $B^0$: $V_0 = 0$
  $B^1$: $V_1 = W_0$
  $B^k$: $V_k = 0$, $k \ge 2$

Therefore, the transfer function model is $Y_t = W_0 B X_t = W_0 X_{t-1}$, i.e., $Y_t$ changes by $W_0$ units for every unit change in $X_{t-1}$, one period after the change in the input.

Impulse response function graph: [plot of $V_k$ against lag, lags 0 to 3]

For a non-zero $b$ in general, $V_k = 0$ for all $k \ne b$, and $Y_t = W_0 B^b X_t = W_0 X_{t-b}$.

Case 3: r = 0, s = 1 and b = 0

$$V(B) = W_0 - W_1 B$$

  $B^0$: $V_0 = W_0$
  $B^1$: $V_1 = -W_1$
  $B^k$: $V_k = 0$, $k \ge 2$

Therefore, the transfer function model is $Y_t = W_0 X_t - W_1 X_{t-1}$, i.e., changes in $X_t$ lead to changes in $Y_t$ and $Y_{t+1}$.

Impulse response function graph: [plot of $V_k$ against lag, lags 0 and 1]

Depending on the values of the coefficients, the impulse response function has positive spikes, negative spikes, or a combination of both at lags 0 and 1.

Case 4: r = 0, s = 1 and b = 1

$$V(B) = W_0 B - W_1 B^2$$

  $B^0$: $V_0 = 0$
  $B^1$: $V_1 = W_0$
  $B^2$: $V_2 = -W_1$
  $B^k$: $V_k = 0$, $k \ge 3$

Therefore, the transfer function model is $Y_t = W_0 X_{t-1} - W_1 X_{t-2}$, i.e., changes in $X_t$ lead to changes in $Y_{t+1}$ and $Y_{t+2}$.

Impulse response function graph: [plot of $V_k$ against lag, lags 0 to 3]

Depending on the values of the coefficients, the impulse response function has positive spikes, negative spikes, or a combination of both at lags 1 and 2.

Case 5: r = 0, s = 2 and b = 0

$$V(B) = W_0 - W_1 B - W_2 B^2$$

  $B^0$: $V_0 = W_0$
  $B^1$: $V_1 = -W_1$
  $B^2$: $V_2 = -W_2$
  $B^k$: $V_k = 0$, $k \ge 3$

Therefore, the transfer function model is $Y_t = W_0 X_t - W_1 X_{t-1} - W_2 X_{t-2}$, i.e., changes in $X_t$ lead to changes in $Y_t$, $Y_{t+1}$ and $Y_{t+2}$.
Impulse response function graph: [plot of $V_k$ against lag, lags 0 to 3]

Case 6: r = 1, s = 0 and b = 0

The transfer function is

$$V(B) = \frac{W_0}{1 - \delta_1 B}$$

with $(V_0 + V_1 B + V_2 B^2 + \cdots)(1 - \delta_1 B) = W_0$.

  $B^0$: $V_0 = W_0$
  $B^1$: $V_1 - V_0\delta_1 = 0 \Rightarrow V_1 = V_0\delta_1 = W_0\delta_1$
  $B^2$: $V_2 - V_1\delta_1 = 0 \Rightarrow V_2 = V_1\delta_1 = W_0\delta_1^2$
  $B^k$: $V_k = W_0\delta_1^k$ for $k \ge 0$

Since $|\delta_1| < 1$, $\delta_1^k \to 0$ as $k \to \infty$.

The transfer function model is

$$Y_t = W_0 X_t + W_0\delta_1 X_{t-1} + W_0\delta_1^2 X_{t-2} + \cdots$$

that is, changes in $X_t$ lead to changes in $Y_t, Y_{t+1}, Y_{t+2}, \ldots$ with the effect dying out geometrically.

Impulse response function graph: [plot of $V_k$ against lag, lags 0 to 5]

Case 7: r = 1, s = 1 and b = 0

The transfer function is

$$V(B) = \frac{W_0 - W_1 B}{1 - \delta_1 B}$$

with $(V_0 + V_1 B + V_2 B^2 + \cdots)(1 - \delta_1 B) = W_0 - W_1 B$.

  $B^0$: $V_0 = W_0$
  $B^1$: $V_1 - V_0\delta_1 = -W_1 \Rightarrow V_1 = V_0\delta_1 - W_1 = W_0\delta_1 - W_1$
  $B^2$: $V_2 - V_1\delta_1 = 0 \Rightarrow V_2 = V_1\delta_1 = W_0\delta_1^2 - W_1\delta_1$
  $B^k$: $V_k = V_{k-1}\delta_1$ for $k \ge 2$, so that $V_k = W_0\delta_1^k - W_1\delta_1^{k-1}$ for $k \ge 1$

The impulse response function decays geometrically from lag 1, but the impulse at lag 0 does not follow any decay pattern. Changes in $X_t$ lead to changes in $Y_t, Y_{t+1}, Y_{t+2}, \ldots$ with the effects dying out geometrically from lag 1.

Impulse response function graph: [plot of $V_k$ against lag, lags 0 to 3]

Case 8: r = 1, s = 2 and b = 0

The transfer function is

$$V(B) = \frac{W_0 - W_1 B - W_2 B^2}{1 - \delta_1 B}$$

with $(V_0 + V_1 B + V_2 B^2 + \cdots)(1 - \delta_1 B) = W_0 - W_1 B - W_2 B^2$.

  $B^0$: $V_0 = W_0$
  $B^1$: $V_1 - V_0\delta_1 = -W_1 \Rightarrow V_1 = V_0\delta_1 - W_1 = W_0\delta_1 - W_1$
  $B^2$: $V_2 - V_1\delta_1 = -W_2 \Rightarrow V_2 = V_1\delta_1 - W_2 = W_0\delta_1^2 - W_1\delta_1 - W_2$
  $B^k$: $V_k = V_{k-1}\delta_1$ for $k \ge 3$, so that $V_k = \delta_1^{k-2}(W_0\delta_1^2 - W_1\delta_1 - W_2)$ for $k \ge 2$

The impulse response function decays geometrically from lag 2, but the impulses at lags 0 and 1 do not follow any decay pattern. Changes in $X_t$ lead to changes in $Y_t, Y_{t+1}, Y_{t+2}, \ldots$ with the effects dying out from lag 2.

Impulse response function graph: [plot of $V_k$ against lag, lags 0 to 4]

Case 9: r = 2, s = 0 and b = 0

The transfer function is

$$V(B) = \frac{W_0}{1 - \delta_1 B - \delta_2 B^2}$$

with $(V_0 + V_1 B + V_2 B^2 + \cdots)(1 - \delta_1 B - \delta_2 B^2) = W_0$.

  $B^0$: $V_0 = W_0$
  $B^k$: $V_k = \delta_1 V_{k-1} + \delta_2 V_{k-2}$ for $k \ge 1$ (taking $V_{-1} = 0$)

Depending on the values of $\delta_1$ and $\delta_2$, $V_k$ decays slowly or follows a damped sine wave from lag 1, with $V_0$ not following any decay pattern.
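All of the cases in this section come from equating coefficients in $\delta(B)V(B) = W(B)B^b$, which gives one recursion for the weights. The Python sketch below is an illustration of that recursion only; the function name `impulse_response` and its argument layout are assumptions made for this example and do not come from the notes or from SAS.

```python
def impulse_response(omega, delta, b=0, n_weights=10):
    """Return [V_0, ..., V_{n_weights-1}] for V(B) = W(B) B^b / delta(B).

    omega = [W0, W1, ..., Ws]  with W(B) = W0 - W1 B - ... - Ws B^s
    delta = [d1, ..., dr]      with delta(B) = 1 - d1 B - ... - dr B^r
    (hypothetical argument layout, chosen to mirror the notation above)
    """
    # Coefficients of the numerator W(B) B^b in increasing powers of B.
    numer = [0.0] * b + [omega[0]] + [-w for w in omega[1:]]
    weights = []
    for k in range(n_weights):
        v_k = numer[k] if k < len(numer) else 0.0
        # delta(B) V(B) = numerator  =>  V_k = numer_k + sum_j d_j V_{k-j}
        for j, d_j in enumerate(delta, start=1):
            if k - j >= 0:
                v_k += d_j * weights[k - j]
        weights.append(v_k)
    return weights


# Case 7 pattern (r = 1, s = 1, b = 0) with W(B) = 1.2 - 0.5B, delta(B) = 1 - 0.8B
print([round(v, 4) for v in impulse_response([1.2, 0.5], [0.8], n_weights=4)])
# Case 2 pattern (r = 0, s = 0, b = 1): a single spike at lag 1
print(impulse_response([2.0], [], b=1, n_weights=3))
```

Passing an empty `delta` list reproduces the r = 0 cases (a finite group of spikes), while a one- or two-element list produces the geometric or second-order decay patterns.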
Case 10: r = 2, s = 1 and b = 0

The transfer function is

$$V(B) = \frac{W_0 - W_1 B}{1 - \delta_1 B - \delta_2 B^2}$$

with $(V_0 + V_1 B + V_2 B^2 + \cdots)(1 - \delta_1 B - \delta_2 B^2) = W_0 - W_1 B$.

  $B^0$: $V_0 = W_0$
  $B^1$: $V_1 = \delta_1 W_0 - W_1$
  $B^k$: $V_k = \delta_1 V_{k-1} + \delta_2 V_{k-2}$ for $k \ge 2$

Depending on the values of $\delta_1$ and $\delta_2$, $V_k$ decays slowly or follows a damped sine wave from lag 2, with $V_0$ and $V_1$ not following any decay pattern.

In summary, $\delta(B)$ represents the decay pattern, with
• r = 0 (no decay)
• r = 1 (geometric decay)
• r = 2 (exponential decay or damped sine wave)

$W(B)$ captures the unpatterned spikes (those not part of the decay pattern):
• r = 0, 2 (number of unpatterned spikes minus 1 equals the value of s)
• r = 1 (number of unpatterned spikes before the decay equals the value of s)

4.4 Building a transfer function model

Our goals are
1. To find a parsimonious expression for the polynomial $V(B)$ as a ratio of other polynomials of lower order.
2. To find a parsimonious expression for the time structure of the disturbance series $N_t$ in the form of an ARIMA model.

The ARIMA model for the $N_t$ series is given by

$$\phi_p(B)\,\Phi_P(B^L)\,N_t = \theta_q(B)\,\Theta_Q(B^L)\,\varepsilon_t$$

where $\varepsilon_t$ is white noise. So the transfer function model becomes

$$Y_t = \frac{W(B)}{\delta(B)} B^b X_t + \frac{\theta_q(B)\,\Theta_Q(B^L)}{\phi_p(B)\,\Phi_P(B^L)}\,\varepsilon_t$$

The procedure for building a transfer function model involves three steps:
Step 1: Identification of a model describing $X_t$ and "pre-whitening" of both the input and output series.
Step 2: Calculation of the sample cross-correlations between the pre-whitened series and identification of a preliminary transfer function model.
Step 3: Identification of an ARIMA model for the noise series.

Step 1:
For purposes of illustration, we consider the data given in Table 14.1 of Bowerman and O'Connell (1993) on monthly advertising ($X_t$) and sales ($Y_t$), as depicted in the following graphs.
Figure 4.1: Monthly advertising expenditure (in thousands of dollars) [plot of Advertising against month, months 0 to 100]

Figure 4.2: Monthly sales volume (in millions of dollars) [plot of Sales against month, months 0 to 100]

• It is identified that $X_t$ is best described by an ARIMA(1, 1, 1) model with $\hat\phi_1 = -0.3003$ and $\hat\theta_1 = -0.7871$, and $Y_t$ by an ARIMA(2, 1, 1) model with $\hat\phi_1 = 0.9817$, $\hat\phi_2 = 0.6903$ and $\hat\theta_1 = 0.74$.
• It is easier to construct a transfer function using only stationary input and output series. For convenience, we denote the appropriately differenced and transformed series of $X_t$ and $Y_t$ by $x_t$ and $y_t$ respectively.
• Furthermore, experience has shown that the identification process is simplified if the input series is white noise. We achieve this by pre-whitening the input series. Suppose that we have identified an ARMA$(p, q)(P, Q)_L$ model for $x_t$, i.e.

  $$\phi_p(B)\,\Phi_P(B^L)\,x_t = \theta_q(B)\,\Theta_Q(B^L)\,\alpha_t$$

  where $\alpha_t$ is white noise. Then the series $\alpha_t$ is called the pre-whitened series, with

  $$\alpha_t = \frac{\phi_p(B)\,\Phi_P(B^L)}{\theta_q(B)\,\Theta_Q(B^L)}\,x_t$$

• In practice, $\alpha_t$ is unknown and is estimated by $\hat\alpha_t$, the residual of the ARMA model for $x_t$. For example, the first difference of the advertising expenditure series can be written as

  $$x_t = -0.3003\,x_{t-1} + \alpha_t + 0.7871\,\alpha_{t-1}
  \quad\Longrightarrow\quad
  \hat\alpha_t = \frac{1 + 0.3003B}{1 + 0.7871B}\,x_t$$

  In most cases, however, we simply write $\alpha_t$ instead of $\hat\alpha_t$.
• Recall that the transfer function model is written as $y_t = V(B)x_t + N_t$. Pre-whitening the input series means that we have to multiply the entire transfer function model by $\dfrac{\phi_p(B)\,\Phi_P(B^L)}{\theta_q(B)\,\Theta_Q(B^L)}$, i.e.,

  $$\frac{\phi_p(B)\,\Phi_P(B^L)}{\theta_q(B)\,\Theta_Q(B^L)}\,y_t
  = V(B)\,\frac{\phi_p(B)\,\Phi_P(B^L)}{\theta_q(B)\,\Theta_Q(B^L)}\,x_t + \nu_t$$

  where $\nu_t$ denotes the noise series $N_t$ after the same filter has been applied. Define

  $$\beta_t = \frac{\phi_p(B)\,\Phi_P(B^L)}{\theta_q(B)\,\Theta_Q(B^L)}\,y_t$$

• This operation is called "pre-whitening" the input and output series. But note that $\beta_t$ is not necessarily white noise. Why?
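• The "pre-whitened" transfer function model is

  $$\beta_t = V(B)\,\alpha_t + \nu_t$$

  where $\alpha_t$ and $\beta_t$ are the pre-whitened input and output series and $\nu_t$ is the filtered noise series. In our example,

  $$\beta_t = \frac{1 + 0.3003B}{1 + 0.7871B}\,y_t$$

Step 2:
• To identify the transfer function, we must estimate the impulse response function $V(B)$ from the pre-whitened input and output series and then match the pattern of the estimated function to that of the theoretical impulse response function.
• We first establish the relationship between the impulse response weights and the series. Multiplying the pre-whitened model $\beta_t = V(B)\alpha_t + \nu_t$ by $\alpha_{t-k}$ and taking expectations, we obtain the cross-covariance of $\beta_t$ and $\alpha_t$ at lag $k$:

  $$\gamma_{\alpha\beta}(k) = E[V(B)\alpha_t\,\alpha_{t-k}] + E[\nu_t\,\alpha_{t-k}]
  = E[V_0\alpha_t\alpha_{t-k}] + E[V_1\alpha_{t-1}\alpha_{t-k}] + E[V_2\alpha_{t-2}\alpha_{t-k}] + \cdots + E[\nu_t\alpha_{t-k}]$$

  As $\alpha_t$ and $\nu_t$ are independent (why?) and $\alpha_t$ is white noise with variance $\sigma_\alpha^2$, we have
  for $k = 0$: $\gamma_{\alpha\beta}(0) = V_0\sigma_\alpha^2$
  for $k = 1$: $\gamma_{\alpha\beta}(1) = V_1\sigma_\alpha^2$
  in general: $\gamma_{\alpha\beta}(k) = V_k\sigma_\alpha^2$, $k = 0, 1, 2, \ldots$

• Therefore, the impulse response weights are proportional to the cross-covariances. Now, the cross-correlation coefficient of $\alpha_t$ and $\beta_t$ at lag $k$ is

  $$\rho_{\alpha\beta}(k) = \frac{E[\beta_t\,\alpha_{t-k}]}{\sigma_\alpha\sigma_\beta} = \frac{V_k\sigma_\alpha^2}{\sigma_\alpha\sigma_\beta}$$

  Then we see that

  $$V_k = \rho_{\alpha\beta}(k)\,\frac{\sigma_\beta}{\sigma_\alpha}$$

• Thus the cross-correlation coefficients between the pre-whitened input and output series are directly proportional to the impulse response weights.
• $\rho_{\alpha\beta}(k)$ cannot be calculated in practice, so we compute its estimate using

  $$r_{\alpha\beta}(k) = \frac{\sum_{t=1}^{n-k}(\alpha_t - \bar\alpha)(\beta_{t+k} - \bar\beta)}
  {\left[\sum_{t=1}^{n}(\alpha_t - \bar\alpha)^2 \sum_{t=1}^{n}(\beta_t - \bar\beta)^2\right]^{1/2}}$$

  where $\bar\alpha$ and $\bar\beta$ are the sample means of $\alpha_t$ and $\beta_t$ respectively. The sample cross-correlation coefficients are computed for $k = 0, 1, 2$ and so on. Since we assume that $Y_t$ is led by $X_t$, we expect $\rho_{\alpha\beta}(k) = 0$ for $k < 0$.
• Since the sample cross-correlation values are sample statistics, we need their standard errors to determine whether any of them is significantly different from zero.
• Bartlett approximates the standard error of $r_{\alpha\beta}(k)$ by $1/\sqrt{n-k}$. The cross-correlation coefficient at lag $k$ is insignificant at the 5% level if $|r_{\alpha\beta}(k)| < 2/\sqrt{n-k}$.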
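The pre-whitening filter and the sample cross-correlation check described in Steps 1 and 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAS implementation: the helper names (`prewhiten`, `cross_correlation`, `bartlett_bound`) are hypothetical, the filter handles only an ARMA(1,1) model written as $(1 - \phi_1 B)x_t = (1 - \theta_1 B)\alpha_t$, pre-sample values are set to zero, and the demonstration data are simulated rather than taken from the advertising example.

```python
import math
import random


def prewhiten(series, phi1, theta1):
    """Apply the filter (1 - phi1 B) / (1 - theta1 B) to a series.

    Assumption: pre-sample values of the input and output are zero.
    """
    out = []
    prev_x, prev_out = 0.0, 0.0
    for x in series:
        # (1 - theta1 B) out_t = (1 - phi1 B) x_t
        value = x - phi1 * prev_x + theta1 * prev_out
        out.append(value)
        prev_x, prev_out = x, value
    return out


def cross_correlation(alpha, beta, k):
    """Sample cross-correlation between alpha_t and beta_{t+k}, for k >= 0."""
    n = len(alpha)
    a_bar = sum(alpha) / n
    b_bar = sum(beta) / n
    num = sum((alpha[t] - a_bar) * (beta[t + k] - b_bar) for t in range(n - k))
    den = math.sqrt(sum((a - a_bar) ** 2 for a in alpha)
                    * sum((b - b_bar) ** 2 for b in beta))
    return num / den


def bartlett_bound(n, k):
    """Two approximate standard errors of r(k): 2 / sqrt(n - k)."""
    return 2.0 / math.sqrt(n - k)


# Simulated illustration: the output responds to the input two periods later.
random.seed(0)
n = 200
alpha = [random.gauss(0.0, 1.0) for _ in range(n)]
beta = [0.0, 0.0] + [1.5 * a for a in alpha[:-2]]
for k in range(5):
    r_k = cross_correlation(alpha, beta, k)
    print(k, round(r_k, 3), abs(r_k) > bartlett_bound(n, k))
```

With these conventions, the advertising filter quoted in the SAS output would correspond to calling `prewhiten(x, -0.3003, -0.7871)` on the differenced series, which is an interpretation of the printed factors rather than something the notes state in this form.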
The following SAS program computes the pre-whitened input and output series and also the sample cross-correlation coefficients:

    data sales;
      infile 'd:\teaching\ms4221\advertising.txt';
      input @6 X Y;                               /* read X and Y starting at column 6 */
    proc arima;
      identify var=X(1) noprint;                  /* first difference of the input series */
      estimate p=1 q=1 noconstant;                /* ARMA(1,1) for differenced X: the prewhitening filter */
      identify var=y(1) crosscor=(x(1)) nlag=10;  /* cross-correlations of the prewhitened series */
    run;

Output:

    Correlation of Y and X
    Variable X has been differenced.
    Period(s) of Differencing = 1.
    Both series have been prewhitened.
    Variance of transformed series = 271.7644 and 21.91755
    Number of observations = 99
    NOTE: The first observation was eliminated by differencing.

    Crosscorrelations

    Lag    Covariance    Correlation
    -10      4.762407       0.06171
     -9      1.901889       0.02464
     -8      1.366826       0.01771
     -7     -4.748175      -0.06152
     -6     -4.222908      -0.05472
     -5     -7.902604      -0.10239
     -4     -6.045174      -0.07833
     -3      4.666570       0.06047
     -2      6.975240       0.09038
     -1     -3.357529      -0.04350
      0     -9.157557      -0.11866
      1     -7.390464      -0.09576
      2     26.271432       0.34040
      3     49.041253       0.63543
      4     22.236550       0.28812
      5    -14.682423      -0.19024
      6    -35.866691      -0.46473
      7    -31.123249      -0.40327
      8     -2.799674      -0.03628
      9     10.285329       0.13327
     10      8.423270       0.10914

    Crosscorrelation Check Between Series

    To Lag   Chi Square   DF   Prob    Crosscorrelations
      5         65.55      6   0.000   -0.119  -0.096  0.340  0.635  0.288  -0.190

    ARIMA Procedure
    Both variables have been prewhitened by the following filter:

    Prewhitening Filter
    Autoregressive Factors   Factor 1: 1 + 0.30027 B**(1)
    Moving Average Factors   Factor 1: 1 + 0.78714 B**(1)

Some identification rules

1) Determining b:
• It is equal to the number of $V$ weights that are equal to zero, starting from $V_0$.
• In practice, we look for a set of zero-valued $V$ weights at the start of the sample cross-correlation function.
  For the advertising and sales example, $b = 2$ appears to be appropriate.

2) Determining r:
• If there is no decay pattern at all, but rather a group of spikes followed by a cut-off to zero, then r = 0.
• If $r(k)$ shows a geometric decay pattern after some lags, then r = 1.
• If $r(k)$ decays slowly or shows a damped sine wave pattern, then r = 2.
• For the above example, we choose r = 2.

3) Determining s:
• If r has been chosen to be 0 or 2, then s = number of significant unpatterned spikes minus 1.
• If r has been chosen to be 1, then s = number of significant unpatterned spikes.
• For the above example, the decay pattern starts from lag 3 and there is one significant spike preceding it. We have chosen r = 2, which implies that s = 1 - 1 = 0.

So we are led to choose the model

$$y_t = \frac{W_0 B^2}{1 - \delta_1 B - \delta_2 B^2}\,x_t + \eta_t$$

where $\eta_t$ denotes the noise series, whose structure has not yet been modelled.

Step 3:
• After the transfer function model has been tentatively identified, an ARIMA process for the noise series must be considered. We do so by first estimating the tentatively identified transfer function model by maximum likelihood. The following SAS program shows an example.
    data sales;
      infile 'd:\teaching\ms4221\advertising.txt';
      input @6 X Y;
    proc arima;
      identify var=X(1) noprint;
      estimate p=1 q=1 noconstant;
      identify var=y(1) crosscor=(x(1)) nlag=10;
      estimate input=(2$/(1,2)x) noconstant method=ml plot;  /* b = 2, s = 0, r = 2 */
    run;

Output:

    The ARIMA Procedure
    Maximum Likelihood Estimation

    Parameter   Estimate    Standard Error   t Value   Approx Pr > |t|   Lag   Variable   Shift
    NUM1         1.69655      0.03958          42.86      <.0001           0       X         2
    DEN1,1       1.16305      0.0093579       124.29      <.0001           1       X         2
    DEN1,2      -0.73685      0.01013         -72.72      <.0001           2       X         2

    Variance Estimate       15.49461
    Std Error Estimate       3.93632
    AIC                    532.8967
    SBC                    540.5583
    Number of Residuals          95

    Autocorrelation Check of Residuals

    To Lag   ChiSquare   DF   Pr > ChiSq   Autocorrelations
       6       31.79      6     <.0001      0.456   0.040  -0.086  -0.273  -0.155  -0.066
      12       33.83     12     0.0007     -0.102  -0.034   0.059   0.019   0.045  -0.042
      18       36.19     18     0.0067     -0.082  -0.012  -0.066  -0.064  -0.017   0.069
      24       38.84     24     0.0284      0.119   0.050  -0.046  -0.046   0.028  -0.005

    Autocorrelation Plot of Residuals

    Lag   Covariance   Correlation   Std Error
      0    15.494612     1.00000      0
      1     7.070411     0.45631      0.102598
      2     0.621509     0.04011      0.122106
      3    -1.333995    -0.08609      0.122245
      4    -4.237373    -0.27347      0.122882
      5    -2.395656    -0.15461      0.129129
      6    -1.025432    -0.06618      0.131063
      7    -1.580902    -0.10203      0.131415
      8    -0.531162    -0.03428      0.132246
      9     0.907621     0.05858      0.132339
     10     0.300735     0.01941      0.132612

    Partial Autocorrelations

    Lag   Correlation
      1     0.45631
      2    -0.21232
      3    -0.01508
      4    -0.27921
      5     0.13492
      6    -0.12407
      7    -0.06528
      8    -0.03186
      9     0.08863
     10    -0.10187

• The SAC and SPAC of the residuals indicate the strong possibility of an MA(1) process for the noise series.
  So we re-estimate the transfer function model as

  $$y_t = \frac{W_0 B^2}{1 - \delta_1 B - \delta_2 B^2}\,x_t + (1 - \theta_1 B)\,\varepsilon_t$$

  where $\varepsilon_t$ is white noise, using the following SAS program:

    data sales;
      infile 'd:\teaching\ms4221\advertising.txt';
      input @6 X Y;
    proc arima;
      identify var=X(1) noprint;
      estimate p=1 q=1 noconstant noprint;
      identify var=y(1) crosscor=(x(1)) nlag=10 noprint;
      estimate q=1 input=(2$/(1,2)x) noconstant method=ml plot;  /* adds an MA(1) noise term */
    run;

Output:

    The ARIMA Procedure
    Maximum Likelihood Estimation

    Parameter   Estimate    Standard Error   t Value   Approx Pr > |t|   Lag   Variable   Shift
    MA1,1       -0.68954      0.07721          -8.93      <.0001           1       Y         0
    NUM1         1.66928      0.04954          33.69      <.0001           0       X         2
    DEN1,1       1.16758      0.01150         101.50      <.0001           1       X         2
    DEN1,2      -0.74202      0.01278         -58.05      <.0001           2       X         2

    Variance Estimate       11.15887
    Std Error Estimate       3.34049
    AIC                    503.3192
    SBC                    513.5347
    Number of Residuals          95

    Autocorrelation Check of Residuals

    To Lag   ChiSquare   DF   Pr > ChiSq   Autocorrelations
       6        7.10      5     0.2131     -0.056   0.038  -0.015  -0.255   0.003  -0.023
      12       10.92     11     0.4498     -0.062  -0.061   0.124  -0.058   0.089  -0.037
      18       14.17     17     0.6550     -0.108   0.085  -0.091  -0.007  -0.018   0.029
      24       16.78     23     0.8200      0.091   0.011  -0.030  -0.073   0.070  -0.036

• This model appears to be adequate. The residual autocorrelation checks are not significant at any of the reported lags, and all coefficients are significant.
• Ex ante forecasts of $Y_t$ can be generated using the statement "forecast lead = number of periods" immediately after the estimation statements.
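As a rough check on the fitted transfer function, the steady-state gain formula from Section 4.2 and the second-order recursion $V_k = \delta_1 V_{k-1} + \delta_2 V_{k-2}$ can be evaluated at the final estimates. The Python sketch below assumes that the SAS parameters NUM1, DEN1,1 and DEN1,2 correspond to $W_0$, $\delta_1$ and $\delta_2$ in the sign convention $\delta(B) = 1 - \delta_1 B - \delta_2 B^2$; the function names are hypothetical and are used for illustration only.

```python
def steady_state_gain(omega, delta):
    """g = (W0 - W1 - ... - Ws) / (1 - d1 - ... - dr)."""
    numerator = omega[0] - sum(omega[1:])
    denominator = 1.0 - sum(delta)
    return numerator / denominator


def decay_type(delta1, delta2):
    """Classify the decay implied by V_k = delta1 V_{k-1} + delta2 V_{k-2}.

    The characteristic equation m^2 - delta1 m - delta2 = 0 has complex
    roots (a damped sine wave, provided the roots lie inside the unit
    circle) when its discriminant delta1^2 + 4 delta2 is negative.
    """
    discriminant = delta1 ** 2 + 4.0 * delta2
    return "damped sine wave" if discriminant < 0 else "exponential decay"


# Final estimates, read as W0, delta1, delta2 (an assumption about the
# mapping from the SAS parameter names to the notation of these notes).
w0, d1, d2 = 1.66928, 1.16758, -0.74202

gain = steady_state_gain([w0], [d1, d2])
print(round(gain, 3))      # steady-state gain of the differenced model
print(decay_type(d1, d2))  # pattern of the impulse response weights
```

Under these assumptions the gain is roughly 2.9 and the discriminant is negative, which agrees with the damped sine wave pattern seen in the sample cross-correlations and with the choice r = 2.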