Autoregressive Integrated Moving Average (ARIMA) models

- Forecasting techniques based on exponential smoothing
- General assumption for the above models: time series data are represented as the sum of two distinct components (deterministic & random)
- Random noise: generated through independent shocks to the process
- In practice: successive observations show serial dependence

ARIMA Models

- ARIMA models are also known as the Box-Jenkins methodology
- Very popular: suitable for almost all time series & often generate more accurate forecasts than other methods
- Limitations:
  - If there is not enough data, they may not forecast better than the decomposition or exponential smoothing techniques. Recommended number of observations: at least 30-50
  - Weak stationarity is required
  - Observations must be equally spaced in time

Linear Models for Time Series

Linear Filter

$$y_t = L(x_t) = \sum_{i=-\infty}^{\infty} \psi_i x_{t-i}, \quad t = \ldots, -1, 0, 1, \ldots$$

- It is a process that converts the input $x_t$ into the output $y_t$
- The conversion involves past, current and future values of the input in the form of a summation with different weights
- Time invariant: the weights $\psi_i$ do not depend on time
- Physically realizable: the output is a linear function of the current and past values of the input only
- Stable if $\sum_{i=-\infty}^{\infty} |\psi_i| < \infty$

In linear filters, stationarity of the input time series is also reflected in the output.

Stationarity

A time series is weakly stationary if its mean $E(y_t) = \mu$ is constant over time and its autocovariance $\mathrm{Cov}(y_t, y_{t+k}) = \gamma_y(k)$ depends only on the lag $k$, so that the variance $\gamma_y(0)$ is constant and finite.

A time series that fulfills these conditions tends to return to its mean and fluctuate around this mean with constant variance.

Note: Strict stationarity requires, in addition to the conditions of weak stationarity, that the time series fulfill further conditions on its distribution, including skewness, kurtosis, etc.
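As a rough illustration of the linear filter defined above, the following Python sketch applies a physically realizable filter with a finite set of weights to an input series. The function name `causal_linear_filter`, the geometric weights $0.5^i$ and the truncation at 20 terms are assumptions chosen for the example and do not come from the slides; input values before the start of the series are treated as zero.

```python
import numpy as np


def causal_linear_filter(x, psi):
    """Return y_t = sum_i psi[i] * x[t-i] for i = 0..len(psi)-1.

    Only current and past inputs are used (physically realizable);
    values of x before the start of the series are treated as 0.
    """
    x = np.asarray(x, dtype=float)
    psi = np.asarray(psi, dtype=float)
    # Full convolution truncated to the input length keeps only causal terms.
    return np.convolve(x, psi)[: len(x)]


# Hypothetical weights psi_i = 0.5**i, truncated at 20 terms.
psi = 0.5 ** np.arange(20)
abs_sum = np.abs(psi).sum()  # finite, so this filter is stable

# A unit impulse as input returns the leading weights themselves.
impulse = np.array([1.0, 0.0, 0.0, 0.0])
response = causal_linear_filter(impulse, psi)

# Time invariance: delaying the input by one step delays the output by one step.
delayed = causal_linear_filter(np.array([0.0, 1.0, 0.0, 0.0]), psi)
```

Feeding a white noise series into the same function would give a truncated version of the linear process discussed in the slides that follow.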
Determine stationarity

- Take snapshots of the process at different time points & observe its behavior: if it is similar over time, the time series is stationary
- A strong & slowly dying ACF suggests deviations from stationarity

Infinite Moving Average

If the input $x_t$ is stationary, then the output $y_t$ is also stationary, with

$$y_t = \sum_{i=-\infty}^{\infty} \psi_i x_{t-i}$$

$$E(y_t) = \mu_y = \sum_{i=-\infty}^{\infty} \psi_i \mu_x$$

$$\mathrm{Cov}(y_t, y_{t+k}) = \gamma_y(k) = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} \psi_i \psi_j \gamma_x(i - j + k)$$

THEN the linear process with white noise time series $\varepsilon_t$,

$$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i},$$

is stationary. Here $\varepsilon_t$ are independent random shocks with $E(\varepsilon_t) = 0$ and

$$\gamma_\varepsilon(h) = \begin{cases} \sigma^2, & h = 0 \\ 0, & h \neq 0 \end{cases}$$

Autocovariance function

$$\gamma_y(k) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \psi_i \psi_j \gamma_\varepsilon(i - j + k) = \sigma^2 \sum_{i=0}^{\infty} \psi_i \psi_{i+k}$$

Linear Process (infinite moving average)

$$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i} = \mu + \psi_0 \varepsilon_t + \psi_1 \varepsilon_{t-1} + \psi_2 \varepsilon_{t-2} + \cdots = \mu + \left( \sum_{i=0}^{\infty} \psi_i B^i \right) \varepsilon_t = \mu + \Psi(B)\varepsilon_t$$

The infinite moving average serves as a general class of models for any stationary time series.

THEOREM (Wold, 1938): Any nondeterministic weakly stationary time series $y_t$ can be represented as

$$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i}, \quad \text{where } \sum_{i=0}^{\infty} \psi_i^2 < \infty$$

INTERPRETATION: A stationary time series can be seen as the weighted sum of the present and past disturbances.

Infinite moving average:
- Impractical to estimate the infinitely many weights
- Useless in practice except for special cases:
  i. Finite order moving average (MA) models: weights set to 0, except for a finite number of weights
  ii. Finite order autoregressive (AR) models: weights are generated using only a finite number of parameters
  iii.
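A mixture of finite order autoregressive & moving average models (ARMA)

Finite Order Moving Average (MA) process

Moving average process of order q, MA(q): $\psi_0 = 1$ and only q further weights are not set to 0.

$$y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}, \quad \varepsilon_t \text{ white noise}$$

MA(q): always stationary regardless of the values of the weights.

$$y_t = \mu + (1 - \theta_1 B - \cdots - \theta_q B^q)\varepsilon_t = \mu + \left(1 - \sum_{i=1}^{q} \theta_i B^i\right)\varepsilon_t = \mu + \Theta(B)\varepsilon_t, \quad \Theta(B) = 1 - \sum_{i=1}^{q} \theta_i B^i$$

Expected value of MA(q)

$$E(y_t) = E(\mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}) = \mu$$

Variance of MA(q)

$$\mathrm{Var}(y_t) = \gamma_y(0) = \mathrm{Var}(\mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}) = \sigma^2 (1 + \theta_1^2 + \cdots + \theta_q^2)$$

Autocovariance of MA(q)

$$\gamma_y(k) = E\left[(\varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q})(\varepsilon_{t+k} - \theta_1 \varepsilon_{t+k-1} - \cdots - \theta_q \varepsilon_{t+k-q})\right] = \begin{cases} \sigma^2(-\theta_k + \theta_1 \theta_{k+1} + \cdots + \theta_{q-k}\theta_q), & k = 1, 2, \ldots, q \\ 0, & k > q \end{cases}$$

Autocorrelation of MA(q)

$$\rho_y(k) = \frac{\gamma_y(k)}{\gamma_y(0)} = \begin{cases} \dfrac{-\theta_k + \theta_1 \theta_{k+1} + \cdots + \theta_{q-k}\theta_q}{1 + \theta_1^2 + \cdots + \theta_q^2}, & k = 1, 2, \ldots, q \\ 0, & k > q \end{cases}$$

ACF: helps in identifying the MA model & its appropriate order, as it cuts off after lag q.

Real applications: the sample ACF r(k) is not always exactly zero after lag q; it becomes very small in absolute value after lag q.

First Order Moving Average Process MA(1), q = 1

$$y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1}, \quad \varepsilon_t \text{ white noise}$$

Autocovariance of MA(1)

$$\gamma_y(0) = \sigma^2(1 + \theta_1^2), \quad \gamma_y(1) = -\theta_1 \sigma^2, \quad \gamma_y(k) = 0, \; k > 1$$

Autocorrelation of MA(1)

$$\rho_y(1) = \frac{-\theta_1}{1 + \theta_1^2}, \quad \rho_y(k) = 0, \; k > 1, \quad |\rho_y(1)| = \frac{|\theta_1|}{1 + \theta_1^2} \le \frac{1}{2}$$

[Figure: realizations of MA(1) processes]
- Mean & variance: stable
- Short runs where successive observations tend to follow each other: positive autocorrelation
- Observations oscillate successively: negative autocorrelation

Second Order Moving Average MA(2) process

$$y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} = \mu + (1 - \theta_1 B - \theta_2 B^2)\varepsilon_t$$

Autocovariance of MA(2)

$$\gamma_y(0) = \sigma^2(1 + \theta_1^2 + \theta_2^2), \quad \gamma_y(1) = \sigma^2(-\theta_1 + \theta_1\theta_2), \quad \gamma_y(2) = \sigma^2(-\theta_2), \quad \gamma_y(k) = 0, \; k > 2$$

Autocorrelation of MA(2)

$$\rho_y(1) = \frac{-\theta_1 + \theta_1\theta_2}{1 + \theta_1^2 + \theta_2^2}, \quad \rho_y(2) = \frac{-\theta_2}{1 + \theta_1^2 + \theta_2^2}, \quad \rho_y(k) = 0, \; k > 2$$

The sample ACF cuts off after lag 2.

Finite Order Autoregressive Process

- Wold's theorem: infinite number of weights, not helpful in modeling & forecasting
- Finite order MA process: estimate a finite number of weights and set the others equal to zero. The oldest disturbance is obsolete for the next observation; only a finite number of disturbances contribute to the current value of the time series
- Take into account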
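all the disturbances of the past: use autoregressive models; estimate infinitely many weights that follow a distinct pattern with a small number of parameters

First Order Autoregressive Process, AR(1)

$$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i} = \mu + \left(\sum_{i=0}^{\infty} \psi_i B^i\right)\varepsilon_t = \mu + \Psi(B)\varepsilon_t, \quad \Psi(B) = \sum_{i=0}^{\infty} \psi_i B^i$$

Assume: the contributions of disturbances far in the past are small compared to those of the more recent disturbances that the process has experienced.

Reflect the diminishing magnitudes of the contributions of past disturbances through a set of infinitely many weights in descending magnitudes, such as

$$\psi_i = \phi^i, \quad |\phi| < 1 \quad \text{(exponential decay pattern)}$$

The weights on the disturbances, starting from the current disturbance and going back in the past: $1, \phi, \phi^2, \phi^3, \ldots$

$$y_t = \mu + \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \cdots = \mu + \sum_{i=0}^{\infty} \phi^i \varepsilon_{t-i}$$

$$y_{t-1} = \mu + \varepsilon_{t-1} + \phi\varepsilon_{t-2} + \phi^2\varepsilon_{t-3} + \cdots$$

THEN

$$y_t = \mu + \varepsilon_t + \phi(\varepsilon_{t-1} + \phi\varepsilon_{t-2} + \cdots) = \mu + \phi(y_{t-1} - \mu) + \varepsilon_t = \delta + \phi y_{t-1} + \varepsilon_t, \quad \text{where } \delta = (1 - \phi)\mu$$

This is the first order autoregressive process AR(1).

WHY AUTOREGRESSIVE? Because $y_t$ is regressed on its own previous value $y_{t-1}$.

AR(1) is stationary if $|\phi| < 1 \Rightarrow \sum_{i=0}^{\infty} |\psi_i| < \infty$

Mean of AR(1)

$$E(y_t) = \mu = \frac{\delta}{1 - \phi}$$

Autocovariance function of AR(1)

$$\gamma(k) = \sigma^2 \frac{\phi^k}{1 - \phi^2}, \quad k = 0, 1, 2, \ldots$$

Autocorrelation function of AR(1)

$$\rho(k) = \frac{\gamma(k)}{\gamma(0)} = \phi^k, \quad k = 0, 1, 2, \ldots$$

The ACF for a stationary AR(1) process has an exponential decay form.

[Figure: realizations of AR(1) processes]
Observe:
- The observations exhibit up/down movements

Second Order Autoregressive Process, AR(2)

$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$$

This model can be represented in the infinite MA form, which provides the conditions of stationarity for $y_t$ in terms of $\phi_1$ & $\phi_2$.

WHY?
1.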
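Infinite MA

$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$$
$$y_t = \delta + \phi_1 B y_t + \phi_2 B^2 y_t + \varepsilon_t$$
$$(1 - \phi_1 B - \phi_2 B^2) y_t = \delta + \varepsilon_t$$
$$\Phi(B) y_t = \delta + \varepsilon_t$$

Apply $\Phi(B)^{-1}$:

$$y_t = \Phi(B)^{-1}\delta + \Phi(B)^{-1}\varepsilon_t = \mu + \Psi(B)\varepsilon_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i} = \mu + \sum_{i=0}^{\infty} \psi_i B^i \varepsilon_t$$

where $\mu = \Phi(B)^{-1}\delta$ and $\Phi(B)^{-1} = \sum_{i=0}^{\infty} \psi_i B^i = \Psi(B)$.

Calculate the weights $\psi_i$ from $\Phi(B)\Psi(B) = 1$:

$$(1 - \phi_1 B - \phi_2 B^2)(\psi_0 + \psi_1 B + \psi_2 B^2 + \cdots) = 1$$
$$\psi_0 + (\psi_1 - \phi_1\psi_0)B + (\psi_2 - \phi_1\psi_1 - \phi_2\psi_0)B^2 + \cdots + (\psi_j - \phi_1\psi_{j-1} - \phi_2\psi_{j-2})B^j + \cdots = 1$$

We need

$$\psi_0 = 1, \quad \psi_1 - \phi_1\psi_0 = 0, \quad \psi_j - \phi_1\psi_{j-1} - \phi_2\psi_{j-2} = 0 \; \text{ for all } j = 2, 3, \ldots$$

Solutions

The $\psi_j$ satisfy the second-order linear difference equation. The solution is expressed in terms of the 2 roots $m_1$ and $m_2$ of

$$m^2 - \phi_1 m - \phi_2 = 0, \quad m_1, m_2 = \frac{\phi_1 \pm \sqrt{\phi_1^2 + 4\phi_2}}{2}$$

- AR(2) stationary: if $|m_1|, |m_2| < 1$, then $\sum_{i=0}^{\infty} |\psi_i| < \infty$
- Condition of stationarity for complex conjugates $a \pm ib$: $\sqrt{a^2 + b^2} < 1$
- AR(2) has an infinite MA representation if $|m_1|, |m_2| < 1$

Mean

$$E(y_t) = \delta + \phi_1 E(y_{t-1}) + \phi_2 E(y_{t-2}) \;\Rightarrow\; \mu = \delta + \phi_1\mu + \phi_2\mu \;\Rightarrow\; \mu = \frac{\delta}{1 - \phi_1 - \phi_2}$$

For $1 - \phi_1 - \phi_2 = 0$, $m = 1$ is a root: nonstationarity.

Autocovariance function

$$\gamma(k) = \mathrm{cov}(y_t, y_{t-k}) = \mathrm{cov}(\delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t, \; y_{t-k})$$
$$= \phi_1 \mathrm{cov}(y_{t-1}, y_{t-k}) + \phi_2 \mathrm{cov}(y_{t-2}, y_{t-k}) + \mathrm{cov}(\varepsilon_t, y_{t-k})$$
$$= \phi_1\gamma(k-1) + \phi_2\gamma(k-2) + \begin{cases} \sigma^2, & k = 0 \\ 0, & k > 0 \end{cases}$$

For k = 0: $\gamma(0) = \phi_1\gamma(1) + \phi_2\gamma(2) + \sigma^2$

For k > 0: $\gamma(k) = \phi_1\gamma(k-1) + \phi_2\gamma(k-2), \; k = 1, 2, \ldots$ (Yule-Walker equations)

Autocorrelation function

$$\rho(k) = \phi_1\rho(k-1) + \phi_2\rho(k-2), \quad k = 1, 2, \ldots$$

Solutions

A. Solve the Yule-Walker equations recursively:

$$\rho(1) = \phi_1\rho(0) + \phi_2\rho(1) \;\Rightarrow\; \rho(1) = \frac{\phi_1}{1 - \phi_2}$$
$$\rho(2) = \phi_1\rho(1) + \phi_2$$
$$\rho(3) = \phi_1\rho(2) + \phi_2\rho(1)$$

B. General solution: obtain it through the roots $m_1$ & $m_2$ associated with the polynomial $m^2 - \phi_1 m - \phi_2 = 0$.

Case I: $m_1, m_2$ distinct real roots

$$\rho(k) = c_1 m_1^k + c_2 m_2^k, \quad k = 0, 1, 2, \ldots$$

- $c_1, c_2$ constants: can be obtained from $\rho(0)$, $\rho(1)$
- Stationarity: $|m_1|, |m_2| < 1$
- ACF form: mixture of 2 exponential decay terms, e.g.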
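AR(2) model. It can be seen as an adjusted AR(1) model for which a single exponential decay expression, as in the AR(1), is not enough to describe the pattern in the ACF; an additional decay expression is therefore added by introducing the second lag term $y_{t-2}$.

Case II: $m_1, m_2$ complex conjugates of the form $a \pm ib$

$$\rho(k) = R^k\left[c_1\cos(\lambda k) + c_2\sin(\lambda k)\right], \quad k = 0, 1, 2, \ldots$$

$$R = |m_i| = \sqrt{a^2 + b^2}, \quad \cos(\lambda) = a/R, \quad \sin(\lambda) = b/R, \quad a \pm ib = R\left[\cos(\lambda) \pm i\sin(\lambda)\right]$$

- $c_1, c_2$: particular constants
- ACF form: damped sinusoid; damping factor $R$; frequency $\lambda$; period $2\pi/\lambda$

Case III: one real root $m_0$; $m_1 = m_2 = m_0$

$$\rho(k) = (c_1 + c_2 k)\, m_0^k, \quad k = 0, 1, 2, \ldots$$

- ACF form: exponential decay pattern

[Figure] AR(2) process: $y_t = 4 + 0.4 y_{t-1} + 0.5 y_{t-2} + \varepsilon_t$
- Roots of the polynomial: real
- ACF form: mixture of 2 exponential decay terms

[Figure] AR(2) process: $y_t = 4 + 0.8 y_{t-1} - 0.5 y_{t-2} + \varepsilon_t$
- Roots of the polynomial: complex conjugates
- ACF form: damped sinusoid behavior

General Autoregressive Process, AR(p)

Consider a pth order AR model

$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t, \quad \varepsilon_t \text{ white noise}$$

or

$$\Phi(B) y_t = \delta + \varepsilon_t, \quad \text{where } \Phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$$

AR(p) is stationary if the roots of the polynomial

$$m^p - \phi_1 m^{p-1} - \phi_2 m^{p-2} - \cdots - \phi_p = 0$$

are less than 1 in absolute value.

Under the previous condition, AR(p) has an absolutely summable infinite MA representation:

$$y_t = \mu + \Psi(B)\varepsilon_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i}, \quad \Psi(B) = \Phi(B)^{-1} \;\; \& \;\; \sum_{i=0}^{\infty} |\psi_i| < \infty$$

Weights of the random shocks, from $\Phi(B)\Psi(B) = 1$:

$$\psi_j = 0, \; j < 0; \quad \psi_0 = 1; \quad \psi_j - \phi_1\psi_{j-1} - \phi_2\psi_{j-2} - \cdots - \phi_p\psi_{j-p} = 0 \; \text{ for all } j = 1, 2, \ldots$$

For stationary AR(p)

$$E(y_t) = \mu = \frac{\delta}{1 - \phi_1 - \phi_2 - \cdots - \phi_p}$$

$$\gamma(k) = \mathrm{Cov}(y_t, y_{t-k}) = \mathrm{Cov}(\delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t, \; y_{t-k}) = \sum_{i=1}^{p} \phi_i\gamma(k-i) + \begin{cases} \sigma^2, & k = 0 \\ 0, & k > 0 \end{cases}$$

$$\gamma(0) = \sum_{i=1}^{p} \phi_i\gamma(i) + \sigma^2 \;\Rightarrow\; \gamma(0)\left[1 - \sum_{i=1}^{p} \phi_i\rho(i)\right] = \sigma^2$$

ACF

$$\rho(k) = \sum_{i=1}^{p} \phi_i\rho(k-i), \quad k = 1, 2, \ldots \quad \text{(pth order linear difference equations)}$$

AR(p):
- satisfies the Yule-Walker equations
- ACF can be found from the p roots of the associated polynomial, e.g.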
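distinct & real roots: $\rho(k) = c_1 m_1^k + c_2 m_2^k + \cdots + c_p m_p^k$
- In general the roots will not all be real. ACF: mixture of exponential decay and damped sinusoid

ACF
- MA(q) process: useful tool for identifying the order of the process, since it cuts off after lag q
- AR(p) process: mixture of exponential decay & damped sinusoid expressions; fails to provide information about the order of the AR process

Partial Autocorrelation Function

Consider:
- three random variables X, Y, Z &
- simple regressions of X on Z & of Y on Z

$$\hat{X} = a_1 + b_1 Z, \quad \text{where } b_1 = \frac{\mathrm{Cov}(Z, X)}{\mathrm{Var}(Z)}$$
$$\hat{Y} = a_2 + b_2 Z, \quad \text{where } b_2 = \frac{\mathrm{Cov}(Z, Y)}{\mathrm{Var}(Z)}$$

The errors are obtained from

$$X^* = X - \hat{X} = X - (a_1 + b_1 Z), \quad Y^* = Y - \hat{Y} = Y - (a_2 + b_2 Z)$$

Partial correlation between X & Y after adjusting for Z: the correlation between $X^*$ & $Y^*$,

$$\mathrm{corr}(X^*, Y^*) = \mathrm{corr}(X - \hat{X}, \; Y - \hat{Y})$$

Partial correlation can be seen as the correlation between two variables after being adjusted for a common factor that affects them.

Partial autocorrelation function (PACF) between $y_t$ & $y_{t-k}$: the autocorrelation between $y_t$ & $y_{t-k}$ after adjusting for the intermediate values $y_{t-1}, y_{t-2}, \ldots, y_{t-k+1}$.

AR(p) process: the PACF between $y_t$ & $y_{t-k}$ for k > p should equal zero.

Consider
- a stationary time series $y_t$, not necessarily an AR process
- for any fixed value k, the Yule-Walker equations for the ACF of an AR process of order k:

$$\rho(j) = \sum_{i=1}^{k} \phi_{ik}\,\rho(j - i), \quad j = 1, 2, \ldots, k$$

$$\rho(1) = \phi_{1k} + \phi_{2k}\rho(1) + \cdots + \phi_{kk}\rho(k-1)$$
$$\rho(2) = \phi_{1k}\rho(1) + \phi_{2k} + \cdots + \phi_{kk}\rho(k-2)$$
$$\vdots$$
$$\rho(k) = \phi_{1k}\rho(k-1) + \phi_{2k}\rho(k-2) + \cdots + \phi_{kk}$$

Matrix notation

$$\begin{bmatrix} 1 & \rho(1) & \rho(2) & \cdots & \rho(k-1) \\ \rho(1) & 1 & \rho(1) & \cdots & \rho(k-2) \\ \rho(2) & \rho(1) & 1 & \cdots & \rho(k-3) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho(k-1) & \rho(k-2) & \rho(k-3) & \cdots & 1 \end{bmatrix} \begin{bmatrix} \phi_{1k} \\ \phi_{2k} \\ \phi_{3k} \\ \vdots \\ \phi_{kk} \end{bmatrix} = \begin{bmatrix} \rho(1) \\ \rho(2) \\ \rho(3) \\ \vdots \\ \rho(k) \end{bmatrix}, \quad \text{i.e. } P_k\phi_k = \rho_k$$

Solutions

$$\phi_k = P_k^{-1}\rho_k$$

For any given k, k = 1, 2, ..., the last coefficient $\phi_{kk}$ is called the partial autocorrelation coefficient of the process at lag k.

AR(p) process: $\phi_{kk} = 0, \; k > p$

Identify the order of an AR process by using the PACF.

[Figure: sample ACF and PACF of simulated series. Panels: MA(1) with constant 40 and coefficient of magnitude 0.8; MA(2) with constant 40 and coefficients of magnitude 0.7 and 0.28; two AR(1) examples with constant 8 and coefficient of magnitude 0.8; AR(2) with constant 8 and coefficients of magnitude 0.8 and 0.5]
- MA(1), MA(2): PACF shows a decay pattern
- AR(1): PACF cuts off after the 1st lag
- AR(2): PACF cuts off after the 2nd lag

Invertibility of MA models

Invertible moving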
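average process: The MA(q) process $y_t = \mu + \Theta(B)\varepsilon_t$ is invertible if it has an absolutely summable infinite AR representation.

It can be shown that the infinite AR representation for MA(q) is

$$y_t = \delta + \sum_{i=1}^{\infty} \pi_i y_{t-i} + \varepsilon_t, \quad \sum_{i=1}^{\infty} |\pi_i| < \infty$$

Obtain $\pi_i$ from $\Pi(B)\Theta(B) = 1$:

$$(1 - \pi_1 B - \pi_2 B^2 - \cdots)(1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) = 1$$

We need

$$\pi_1 + \theta_1 = 0$$
$$\pi_2 - \theta_1\pi_1 + \theta_2 = 0$$
$$\vdots$$
$$\pi_j - \theta_1\pi_{j-1} - \cdots - \theta_q\pi_{j-q} = 0$$

with $\pi_0 = -1$ & $\pi_j = 0, \; j < 0$.

Condition of invertibility: the roots of the associated polynomial

$$m^q - \theta_1 m^{q-1} - \theta_2 m^{q-2} - \cdots - \theta_q = 0$$

must be less than 1 in absolute value. An invertible MA(q) process can then be written as an infinite AR process.

- PACF possibly never cuts off
- PACF of an MA(q) process is a mixture of exponential decay & damped sinusoid expressions
- In model identification, use both the sample ACF & the sample PACF

Mixed Autoregressive-Moving Average (ARMA) Process

ARMA(p,q) model

$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1\varepsilon_{t-1} - \theta_2\varepsilon_{t-2} - \cdots - \theta_q\varepsilon_{t-q}$$
$$= \delta + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t - \sum_{i=1}^{q} \theta_i\varepsilon_{t-i}$$

$$\Phi(B) y_t = \delta + \Theta(B)\varepsilon_t, \quad \varepsilon_t \text{ white noise}$$

Adjust the exponential decay pattern by adding a few terms.

Stationarity of ARMA(p,q) process

Related to the AR component: ARMA(p,q) is stationary if the roots of the polynomial

$$m^p - \phi_1 m^{p-1} - \phi_2 m^{p-2} - \cdots - \phi_p = 0$$

are less than one in absolute value. ARMA(p,q) then has an infinite MA representation

$$y_t = \mu + \sum_{i=0}^{\infty} \psi_i\varepsilon_{t-i} = \mu + \Psi(B)\varepsilon_t, \quad \Psi(B) = \Phi(B)^{-1}\Theta(B)$$

Invertibility of ARMA(p,q) process

Invertibility of an ARMA process is related to the MA component. Check through the roots of the polynomial

$$m^q - \theta_1 m^{q-1} - \theta_2 m^{q-2} - \cdots - \theta_q = 0$$

If the roots are less than 1 in absolute value, then ARMA(p,q) is invertible & has an infinite AR representation

$$\Pi(B) y_t = \alpha + \varepsilon_t, \quad \alpha = \Theta(B)^{-1}\delta \;\; \& \;\; \Pi(B) = \Theta(B)^{-1}\Phi(B)$$

Coefficients:

$$\pi_i - \theta_1\pi_{i-1} - \theta_2\pi_{i-2} - \cdots - \theta_q\pi_{i-q} = \begin{cases} \phi_i, & i = 1, \ldots, p \\ 0, & i > p \end{cases}$$

[Figure] ARMA(1,1): sample ACF & PACF show exponential decay behavior.

Non Stationary Process

No constant level, but exhibits homogeneous behavior over time.

$y_t$ is homogeneous, non-stationary if
- it is not stationary
- its first difference, $w_t = y_t - y_{t-1} = (1 - B)y_t$, or higher order differences, $w_t = (1 - B)^d y_t$, produce a stationary time series

$y_t$ is an autoregressive integrated moving average process of order p, d, q, ARIMA(p,d,q), if its dth difference $w_t = (1 - B)^d y_t$ produces a stationary ARMA(p,q) process.

ARIMA(p,d,q):

$$\Phi(B)(1 - B)^d y_t = \delta + \Theta(B)\varepsilon_t$$

The random walk process ARIMA(0,1,0)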
Simplest non-stationary model

$$(1 - B) y_t = \delta + \varepsilon_t$$

First differencing eliminates serial dependence & yields a white noise process.

[Figure] $y_t = 20 + y_{t-1} + \varepsilon_t$

Evidence of a non-stationary process
- Sample ACF: dies out slowly
- Sample PACF: significant at the first lag
- Sample PACF value at lag 1 close to 1

First difference
- Time series plot of $w_t$: stationary
- Sample ACF & PACF: do not show any significant value
- Use ARIMA(0,1,0)

The integrated moving average process ARIMA(0,1,1)

$$(1 - B) y_t = \delta + (1 - \theta B)\varepsilon_t$$

Infinite AR representation, derived from

$$\pi_i - \theta\pi_{i-1} = \begin{cases} 1, & i = 1 \\ 0, & i > 1 \end{cases} \quad \text{with } \pi_0 = -1, \quad \text{so that } \pi_i = (1 - \theta)\theta^{i-1}$$

$$y_t = \alpha + \sum_{i=1}^{\infty} \pi_i y_{t-i} + \varepsilon_t = \alpha + (1 - \theta)(y_{t-1} + \theta y_{t-2} + \cdots) + \varepsilon_t$$

ARIMA(0,1,1), also written IMA(1,1): expressed as an exponentially weighted moving average (EWMA) of all past values.

[Figure] ARIMA(0,1,1)
- The mean of the process is moving upwards in time
- Sample ACF: dies out relatively slowly
- Sample PACF: 2 significant values at lags 1 & 2. Possible model: AR(2). Check the roots
- First difference looks stationary
- Sample ACF & PACF of the first difference: an MA(1) model would be appropriate, since its ACF cuts off after the first lag & its PACF shows a decay pattern

[Figure: realization of an AR(1) process with constant 2 and lag-1 coefficient of magnitude 0.95]
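The recursion for the $\pi$ weights of the ARIMA(0,1,1) process can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the function name `ima11_pi_weights` and the value $\theta = 0.5$ are hypothetical choices for the example, and the recursion is started from the convention $\pi_0 = -1$.

```python
def ima11_pi_weights(theta, n):
    """First n weights pi_1..pi_n of the infinite AR form of ARIMA(0,1,1)."""
    weights = []
    pi_prev = -1.0  # convention pi_0 = -1
    for i in range(1, n + 1):
        rhs = 1.0 if i == 1 else 0.0
        # Recursion: pi_i - theta * pi_{i-1} = 1 for i = 1, and 0 for i > 1.
        pi_i = theta * pi_prev + rhs
        weights.append(pi_i)
        pi_prev = pi_i
    return weights


theta = 0.5  # hypothetical MA parameter with |theta| < 1
weights = ima11_pi_weights(theta, 10)

# Closed form (1 - theta) * theta**(i - 1), i.e. EWMA weights on past values.
closed_form = [(1 - theta) * theta ** (i - 1) for i in range(1, 11)]

# The partial sum equals 1 - theta**10 and approaches 1 as more terms are added.
total = sum(weights)
```

The weights decay geometrically and sum to one in the limit, which is why the one-step-ahead structure of IMA(1,1) coincides with an exponentially weighted moving average of past observations.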