Decomposition Method

Types of Data

Time series data: a sequence of observations measured over time, usually at equally spaced intervals (e.g., weekly, monthly and annually). Examples of time series data include: Gross Domestic Product each quarter; annual rainfall; daily stock market index.

Cross-sectional data: data on one or more variables collected at the same point in time.

Time Series vs Causal Modeling

Causal (regression) models: the investigator specifies some behavioural relationship and estimates the parameters using regression techniques.

Time series models: the investigator uses the past data of the target variable to forecast the present and future values of the variable.

There are many cases when one cannot, or one prefers not to, build causal models:
1. insufficient information is known about the behavioural relationship;
2. lack of, or conflicting, theories;
3. insufficient data on explanatory variables;
4. expertise may be unavailable;
5. time series models may be more accurate.

Direct benefits of using time series models:
1. little storage capacity is needed;
2. some time series models are automatic in that user intervention is not required to update the forecasts each period;
3. some time series models are evolutionary in that the models adapt as new information is received.

Classical Decomposition of Time Series

Trend: does not necessarily imply a monotonically increasing or decreasing series but simply a lack of constant mean, though in practice we often use a linear or quadratic function to predict the trend.

Cycle: refers to patterns or waves in the data that are repeated after approximately equal intervals with approximately equal intensity.
For example, some economists believe that “business cycles” repeat themselves every 4 or 5 years.

Seasonal: refers to a cycle of one year's duration.

Random (irregular): refers to the (unpredictable) variation not covered by the above.

Decomposition Method

Multiplicative model: $Y_t = TR_t \times SN_t \times CL_t \times IR_t$

Additive model: $Y_t = TR_t + SN_t + CL_t + IR_t$

The task is to find estimates of these four components.

Multiplicative Decomposition

Examples:
(1) US Retail and Food Services Sales from 1996 Q1 to 2008 Q1 (Figure 2.1)
(2) Quarterly Number of Visitor Arrivals in Hong Kong from 2002 Q1 to 2008 Q1 (Figure 2.2)

[Figure 2.1: US Retail & Food Services Sales. Vertical axis: Sales Y(t) (in MN US$); horizontal axis: Time (quarters).]

[Figure 2.2: Number of Visitor Arrivals in Hong Kong. Vertical axis: Number of Visitors Y(t); horizontal axis: Time (quarters).]

Cycles are often difficult to identify with a short time series.
Classical decomposition typically combines cycles and trend as one entity: $Y_t = TC_t \times SN_t \times IR_t$

Illustration: Consider the following 4-year quarterly time series on sales volume:

Period (t)  Year  Quarter  Sales
1           1     1        72
2           1     2        110
3           1     3        117
4           1     4        172
5           2     1        76
6           2     2        112
7           2     3        130
8           2     4        194
9           3     1        78
10          3     2        119
11          3     3        128
12          3     4        201
13          4     1        81
14          4     2        134
15          4     3        141
16          4     4        216

[Figure 2.3]

Step 1: Estimation of seasonal component ($SN_t$)

Since $Y_t = TC_t \times SN_t \times IR_t$, the seasonal component is $\widehat{SN}_t = \dfrac{Y_t}{TC_t \times IR_t}$.

Moving average for periods 1-4: (72 + 110 + 117 + 172)/4 = 117.75
Moving average for periods 2-5: (110 + 117 + 172 + 76)/4 = 118.75

Periods  MA
1-4      117.75
2-5      118.75
3-6      119.25
4-7      122.5
5-8      128
6-9      128.5
7-10     130.25
8-11     129.75
9-12     131.5
10-13    132.25
11-14    136
12-15    139.25
13-16    143

Assuming the average of the observations is also the median of the observations, the MAs for periods 1-4, 2-5 and 3-6 are centered at positions 2.5, 3.5 and 4.5 respectively.

To get an average centered at periods 3, 4, 5, etc.,
the means of two consecutive moving averages are calculated:

Centered moving average for period 3: (117.75 + 118.75)/2 = 118.25
Centered moving average for period 4: (118.75 + 119.25)/2 = 119

Because $CMA_t$ contains no seasonality and irregularity, the seasonal component may be estimated by $\widetilde{SN}_t = Y_t / CMA_t$.

For example, $\widetilde{SN}_3 = 117/118.25 = 0.989$ and $\widetilde{SN}_4 = 172/119 = 1.445$.

Period (t)  Year  Quarter  Sales  CMA(t)    Initial SN(t)
1           1     1        72
2           1     2        110
3           1     3        117    118.25    0.989429175
4           1     4        172    119       1.445378151
5           2     1        76     120.875   0.628748707
6           2     2        112    125.25    0.894211577
7           2     3        130    128.25    1.013645224
8           2     4        194    129.375   1.499516908
9           3     1        78     130       0.6
10          3     2        119    130.625   0.911004785
11          3     3        128    131.875   0.970616114
12          3     4        201    134.125   1.49860205
13          4     1        81     137.625   0.588555858
14          4     2        134    141.125   0.949512843
15          4     3        141
16          4     4        216

After all $\widetilde{SN}_t$'s have been computed, they are further averaged to eliminate irregularities in the series. We also adjust the seasonal indices so that they sum to the number of seasons in a year (i.e., 4 for quarterly data, 12 for monthly data). (Why?)

Quarter  Average                                             Adjusted index
1        (0.628748707 + 0.6 + 0.588555858)/3 = 0.6058         0.6063
2        (0.894211577 + 0.911004785 + 0.949512843)/3 = 0.9182 0.9191
3        (0.989429175 + 1.013645224 + 0.970616114)/3 = 0.9912 0.9921
4        (1.445378151 + 1.499516908 + 1.49860205)/3 = 1.4812  1.4825
Sum      3.9964                                              4

Each adjusted index is the quarter average multiplied by 4/3.9964.

Step 2: Estimation of Trend/Cycle

Define the deseasonalized (or seasonally adjusted) series as $D_t = Y_t / \widehat{SN}_t$; for example, $D_1 = 72/0.6063 = 118.7506$.

$TC_t$ may be estimated by regression using a linear trend:

$D_t = \beta_0 + \beta_1 t + \varepsilon_t, \quad t = 1, 2, 3, \ldots$

$\widehat{TC}_t = \hat{D}_t = b_0 + b_1 t$,

where $b_0$ and $b_1$ are least squares estimates of $\beta_0$ and $\beta_1$ respectively.
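Steps 1 and 2 can be traced numerically. The following is a minimal Python sketch, not the procedure used to prepare the slides: it recomputes the centered moving averages, the adjusted seasonal indices, the deseasonalized series and a least squares trend line for the 16 sales figures of the illustration. The function names (`centered_moving_average`, `seasonal_indices`, `linear_trend`) are hypothetical, and the sketch assumes that the series starts in quarter 1 and that the number of seasons per year is even.

```python
# Hypothetical helper names; a sketch of Steps 1 and 2 under the stated assumptions.

def centered_moving_average(y, period=4):
    """Return {t: CMA_t} with 1-based t; assumes an even period."""
    # Simple moving averages of `period` consecutive observations.
    ma = [sum(y[i:i + period]) / period for i in range(len(y) - period + 1)]
    half = period // 2
    # Averaging two consecutive MAs centers the result on a whole period.
    return {i + half + 1: (ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)}


def seasonal_indices(y, period=4):
    """Average the ratios Y_t / CMA_t by season, then rescale to sum to `period`."""
    ratios = {}
    for t, cma in centered_moving_average(y, period).items():
        season = (t - 1) % period  # 0 = first quarter (assumes t = 1 is Q1)
        ratios.setdefault(season, []).append(y[t - 1] / cma)
    means = [sum(ratios[s]) / len(ratios[s]) for s in range(period)]
    scale = period / sum(means)
    return [m * scale for m in means]


def linear_trend(d):
    """Least squares fit of d_t = b0 + b1 * t for t = 1, ..., n."""
    n = len(d)
    t_bar = (n + 1) / 2
    d_bar = sum(d) / n
    sxy = sum((t - t_bar) * (v - d_bar) for t, v in enumerate(d, start=1))
    sxx = sum((t - t_bar) ** 2 for t in range(1, n + 1))
    b1 = sxy / sxx
    return d_bar - b1 * t_bar, b1


sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]

sn = seasonal_indices(sales)
deseasonalized = [v / sn[i % 4] for i, v in enumerate(sales)]
b0, b1 = linear_trend(deseasonalized)

print("seasonal indices:", [round(s, 4) for s in sn])
print("D_1:", round(deseasonalized[0], 4))
print("b0, b1:", round(b0, 4), round(b1, 4))
```

Rounded to four decimals, the indices agree with the values 0.6063, 0.9191 and 1.4825 used in the worked example, and the first deseasonalized value agrees with 118.7506.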
EXCEL regression output:

So, $\widehat{TC}_t = 113.6997914 + 1.854638009\,t$

For example,
$\widehat{TC}_1 = 113.6997914 + 1.854638009(1) = 115.5544294$
$\widehat{TC}_2 = 113.6997914 + 1.854638009(2) = 117.4090674$

Step 3: Computation of fitted values and out-of-sample forecasts

$\hat{Y}_t = \widehat{TC}_t \times \widehat{SN}_t$

In-sample fit:
$\hat{Y}_1 = 115.5544 \times 0.6063 = 70.0621$
$\hat{Y}_{16} = 143.3740 \times 1.4825 = 212.5516$

Out-of-sample forecast:
$\hat{Y}_{17} = \widehat{TC}_{17} \times \widehat{SN}_{17} = (113.6997914 + 1.854638009 \times 17) \times 0.6063 = 145.2286 \times 0.6063 = 88.054$
$\hat{Y}_{18} = \widehat{TC}_{18} \times \widehat{SN}_{18} = (113.6997914 + 1.854638009 \times 18) \times 0.9191 = 147.0833 \times 0.9191 = 135.1796$

[Figure 2.4]

Measuring Forecast Accuracy

Let $e_t = Y_t - \hat{Y}_t$ be the errors of forecast.

1) Mean Squared Error: $MSE = \dfrac{\sum_{t=1}^{n} e_t^2}{n}$, $RMSE = \sqrt{MSE}$

2) Mean Absolute Deviation: $MAD = \dfrac{\sum_{t=1}^{n} |e_t|}{n}$, $RMAD = \sqrt{MAD}$

e_t
Method A   -2    1.5   -1    2.1   0.7
Method B   -4    0.7   0.5   1.4   0.1

Method A: MSE = 2.43, MAD = 1.46
Method B: MSE = 3.742, MAD = 1.34

Naive prediction: $\hat{Y}_t = Y_{t-1}$

Theil's U statistic:

$U = \dfrac{\sqrt{\sum (Y_t - \hat{Y}_t)^2 / n}}{\sqrt{\sum (Y_t - Y_{t-1})^2 / n}}$

If U = 1, the forecasts produced are no better than the naive forecast; if U = 0, the forecasts produce a perfect fit. The smaller the value of U, the better the forecasts.

For the multiplicative decomposition of the sales illustration: MSE = 11.932, MAD = 2.892, Theil's U = 0.0546

Out-of-Sample Forecasts

1) Ex-post forecast: prediction for a period in which actual observations are available.
2) Ex-ante forecast: prediction for a period in which actual observations are not available.
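The accuracy measures defined above are simple to compute directly. The following Python sketch is illustrative only; the function names (`mse`, `mad`, `theil_u`) are hypothetical. It applies MSE and MAD to the Method A and Method B errors, and it assumes that both sums in Theil's U run over t = 2, ..., n, because the naive forecast $Y_{t-1}$ does not exist for the first observation.

```python
import math

# Hypothetical helper names; a sketch of the accuracy measures, not a library API.

def mse(errors):
    """Mean squared error of a list of forecast errors e_t."""
    return sum(e ** 2 for e in errors) / len(errors)


def mad(errors):
    """Mean absolute deviation of a list of forecast errors e_t."""
    return sum(abs(e) for e in errors) / len(errors)


def theil_u(actual, fitted):
    """Forecast error relative to the naive forecast Y_{t-1}.

    Assumption: both sums run over t = 2, ..., n, so the common
    divisor cancels and only the two sums of squares are needed.
    """
    num = sum((a - f) ** 2 for a, f in zip(actual[1:], fitted[1:]))
    den = sum((actual[t] - actual[t - 1]) ** 2 for t in range(1, len(actual)))
    return math.sqrt(num / den)


errors_a = [-2, 1.5, -1, 2.1, 0.7]
errors_b = [-4, 0.7, 0.5, 1.4, 0.1]

for name, errors in (("A", errors_a), ("B", errors_b)):
    print(name, "MSE:", round(mse(errors), 3),
          "RMSE:", round(math.sqrt(mse(errors)), 3),
          "MAD:", round(mad(errors), 3))
```

Method A has the smaller MSE while Method B has the smaller MAD, so the two criteria can rank forecasting methods differently: squaring penalizes Method B's single large error of -4 more heavily.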
[Figure: timeline with markers T1, T2 and T3 (today), showing “back”casting, the estimation period (in-sample simulation), the ex-post forecast and the ex-ante forecast.]

Additive Decomposition

$Y_t = TC_t + SN_t + IR_t$

[Figure: two panels plotting Y_t against Time around a trend line, one with multiplicative seasonality and one with additive seasonality.]

Multiplicative decomposition is used when the time series exhibits increasing or decreasing seasonal variation ($Y_t = TC_t \times SN_t \times IR_t$).

Year  Quarter  TC_t   SN_t  Y_t     Y_t - Y_{t-1}
1     Q1       11.5   1.5   17.25
1     Q2       13     0.5   6.5     -10.75
1     Q3       14.5   0.8   11.6    5.1
1     Q4       16     1.2   19.2    7.6
2     Q1       17.5   1.5   26.25
2     Q2       19     0.5   9.5     -16.75
2     Q3       20.5   0.8   16.4    6.9
2     Q4       22     1.2   26.4    10

Additive decomposition is used when the time series exhibits constant seasonal variation ($Y_t = TC_t + SN_t + IR_t$).

Year  Quarter  TC_t   SN_t  Y_t     Y_t - Y_{t-1}
1     Q1       11.5   1.8   13.3
1     Q2       13     -1    12      -1.3
1     Q3       14.5   -1.5  13      1
1     Q4       16     0.7   16.7    3.7
2     Q1       17.5   1.8   19.3
2     Q2       19     -1    18      -1.3
2     Q3       20.5   -1.5  19      1
2     Q4       22     0.7   22.7    3.7

Step 1: Estimation of seasonal component ($SN_t$)

Calculation of $MA_t$ and $CMA_t$ is the same as per multiplicative decomposition.

The initial seasonal component may be estimated by $\widetilde{SN}_t = Y_t - CMA_t$.

For example, $\widetilde{SN}_3 = 117 - 118.25 = -1.25$ and $\widetilde{SN}_4 = 172 - 119 = 53$.

Seasonal indices are averaged and adjusted so that they sum to zero. (Why?)

Step 2: Estimation of Trend/Cycle

The deseasonalized series is defined as $D_t = Y_t - \widehat{SN}_t$.

$TC_t$ may be estimated by regression as per multiplicative decomposition, i.e., $D_t = \beta_0 + \beta_1 t + \varepsilon_t$ and $\widehat{TC}_t = \hat{D}_t = b_0 + b_1 t$.

So, $\widehat{TC}_t = 113.2270833 + 1.980637255\,t$ and $\hat{Y}_t = \widehat{TC}_t + \widehat{SN}_t$.

For example, $\widehat{TC}_1 = 113.2270833 + 1.980637255(1) = 115.2077206$ and $\hat{Y}_1 = 115.2077206 - 50.80208333 = 64.40563725$.

MSE = 27.911, MAD = 4.477
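The additive steps can be checked against the same 16 sales figures. The following Python sketch is a hedged illustration rather than the calculation behind the slides; the function names (`additive_seasonal_indices`, `linear_trend`) are hypothetical, and it assumes the series starts in quarter 1 with an even number of seasons per year. It averages $Y_t - CMA_t$ by quarter, shifts the indices so they sum to zero, fits the trend to the deseasonalized series and forms the fitted values.

```python
# Hypothetical helper names; a sketch of additive decomposition under the stated assumptions.

def additive_seasonal_indices(y, period=4):
    """Average Y_t - CMA_t by season, then shift so the indices sum to zero."""
    ma = [sum(y[i:i + period]) / period for i in range(len(y) - period + 1)]
    half = period // 2
    diffs = {}
    for i in range(len(ma) - 1):
        cma = (ma[i] + ma[i + 1]) / 2   # centered on period t = i + half + 1
        idx = i + half                  # 0-based position of that period
        diffs.setdefault(idx % period, []).append(y[idx] - cma)
    means = [sum(diffs[s]) / len(diffs[s]) for s in range(period)]
    shift = sum(means) / period
    return [m - shift for m in means]


def linear_trend(d):
    """Least squares fit of d_t = b0 + b1 * t for t = 1, ..., n."""
    n = len(d)
    t_bar = (n + 1) / 2
    d_bar = sum(d) / n
    sxy = sum((t - t_bar) * (v - d_bar) for t, v in enumerate(d, start=1))
    sxx = sum((t - t_bar) ** 2 for t in range(1, n + 1))
    b1 = sxy / sxx
    return d_bar - b1 * t_bar, b1


sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]

sn_add = additive_seasonal_indices(sales)
d_add = [v - sn_add[i % 4] for i, v in enumerate(sales)]
b0_add, b1_add = linear_trend(d_add)
# Fitted value = trend + seasonal index (additive model).
fitted_add = [b0_add + b1_add * (i + 1) + sn_add[i % 4] for i in range(len(sales))]

print("seasonal indices:", [round(s, 4) for s in sn_add])
print("b0, b1:", round(b0_add, 4), round(b1_add, 4))
print("fitted value for period 1:", round(fitted_add[0], 4))
```

The quarter 1 index, the trend coefficients and the first fitted value agree with -50.80208333, 113.2270833, 1.980637255 and 64.40563725 in the worked example.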