Decomposition Method - City University of Hong Kong

Types of Data
- Time series data: a sequence of observations measured over time (usually at equally spaced intervals, e.g., weekly, monthly or annually). Examples include Gross Domestic Product each quarter, annual rainfall, and a daily stock market index.
- Cross-sectional data: data on one or more variables collected at the same point in time.

Time Series vs Causal Modeling
- Causal (regression) models: the investigator specifies some behavioural relationship and estimates the parameters using regression techniques.
- Time series models: the investigator uses past data on the target variable to forecast the present and future values of that variable.

On the other hand, there are many cases when one cannot, or prefers not to, build causal models:
1. insufficient information is known about the behavioural relationship;
2. lack of, or conflicting, theories;
3. insufficient data on explanatory variables;
4. expertise may be unavailable;
5. time series models may be more accurate.

Direct benefits of using time series models:
1. little storage capacity is needed;
2. some time series models are automatic, in that user intervention is not required to update the forecasts each period;
3. some time series models are evolutionary, in that the models adapt as new information is received.

Classical Decomposition of Time Series
- Trend: does not necessarily imply a monotonically increasing or decreasing series, but simply a lack of constant mean; in practice, we often use a linear or quadratic function to predict the trend.
- Cycle: refers to patterns or waves in the data that are repeated after approximately equal intervals with approximately equal intensity.
  For example, some economists believe that “business cycles” repeat themselves every 4 or 5 years.
- Seasonal: refers to a cycle of one year's duration.
- Random (irregular): refers to the (unpredictable) variation not covered by the above.

Decomposition Method
- Multiplicative model: $Y_t = TR_t \times SN_t \times CL_t \times IR_t$
- Additive model: $Y_t = TR_t + SN_t + CL_t + IR_t$
Find the estimates of these four components.

Multiplicative Decomposition
Examples:
(1) US Retail and Food Services Sales from 1996 Q1 to 2008 Q1 (Figure 2.1)
(2) Quarterly Number of Visitor Arrivals in Hong Kong from 2002 Q1 to 2008 Q1 (Figure 2.2)

[Figure 2.1: US Retail & Food Services Sales. Vertical axis: Sales Y(t) (in MN US$); horizontal axis: Time, 1996 Q1 to 2008 Q1.]

[Figure 2.2: Number of Visitor Arrivals in Hong Kong. Vertical axis: Number of Visitors Y(t); horizontal axis: Time, 2002 Q1 to 2008 Q1.]

- Cycles are often difficult to identify with a short time series.
- Classical decomposition typically combines cycles and trend as one entity:
  $Y_t = TC_t \times SN_t \times IR_t$

Illustration: Consider the following 4-year quarterly time series on sales volume:

Period (t)  Year  Quarter  Sales
 1          1     1         72
 2          1     2        110
 3          1     3        117
 4          1     4        172
 5          2     1         76
 6          2     2        112
 7          2     3        130
 8          2     4        194
 9          3     1         78
10          3     2        119
11          3     3        128
12          3     4        201
13          4     1         81
14          4     2        134
15          4     3        141
16          4     4        216

[Figure 2.3]

Step 1: Estimation of seasonal component ($SN_t$)
- $Y_t = TC_t \times SN_t \times IR_t \;\Rightarrow\; \widehat{SN}_t = \dfrac{Y_t}{TC_t \times IR_t}$
- Moving average for periods 1 to 4 $= \dfrac{72 + 110 + 117 + 172}{4} = 117.75$
- Moving average for periods 2 to 5 $= \dfrac{110 + 117 + 172 + 76}{4} = 118.75$

Continuing in the same way gives the four-quarter moving averages MA(t):

Periods   MA(t)
 1 to 4   117.75
 2 to 5   118.75
 3 to 6   119.25
 4 to 7   122.5
 5 to 8   128
 6 to 9   128.5
 7 to 10  130.25
 8 to 11  129.75
 9 to 12  131.5
10 to 13  132.25
11 to 14  136
12 to 15  139.25
13 to 16  143

- Assuming the average of the observations is also the median of the observations, the MAs for periods 1 to 4, 2 to 5 and 3 to 6 are centered at positions 2.5, 3.5 and 4.5 respectively.
- To get an average centered at periods 3, 4, 5, etc., the means of two consecutive moving averages are calculated:
  Centered moving average for period 3 $= \dfrac{117.75 + 118.75}{2} = 118.25$
  Centered moving average for period 4 $= \dfrac{118.75 + 119.25}{2} = 119$
- Because $CMA_t$ contains no seasonality and irregularity, the seasonal component may be estimated by
  $\widetilde{SN}_t = \dfrac{Y_t}{CMA_t}$
  For example, $\widetilde{SN}_3 = \dfrac{117}{118.25} = 0.989$ and $\widetilde{SN}_4 = \dfrac{172}{119} = 1.445$.

Period (t)  Year  Quarter  Sales  CMA(t)    SN~(t)
 1          1     1         72
 2          1     2        110
 3          1     3        117    118.25    0.989429175
 4          1     4        172    119       1.445378151
 5          2     1         76    120.875   0.628748707
 6          2     2        112    125.25    0.894211577
 7          2     3        130    128.25    1.013645224
 8          2     4        194    129.375   1.499516908
 9          3     1         78    130       0.6
10          3     2        119    130.625   0.911004785
11          3     3        128    131.875   0.970616114
12          3     4        201    134.125   1.49860205
13          4     1         81    137.625   0.588555858
14          4     2        134    141.125   0.949512843
15          4     3        141
16          4     4        216

- After all the $\widetilde{SN}_t$'s have been computed, they are further averaged to eliminate irregularities in the series. We also adjust the seasonal indices so that they sum to the number of seasons in a year (i.e., 4 for quarterly data, 12 for monthly data). (Why?)
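The Step 1 calculations can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the spreadsheet used in the lecture: the variable names (`ma`, `cma`, `ratios`, `seasonal_index`) are chosen here for illustration, and the final rescaling assumes the adjustment is done by multiplying each quarterly average by 4 divided by the sum of the averages.

```python
# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]
SEASONS = 4  # quarters per year

# Four-quarter moving averages: ma[i] covers periods i+1 .. i+4.
ma = [sum(sales[i:i + SEASONS]) / SEASONS
      for i in range(len(sales) - SEASONS + 1)]

# Centered moving averages: cma[i] is centered on period i+3.
cma = [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]

# Initial seasonal ratios Y_t / CMA_t for periods 3..14, keyed by period.
ratios = {t: sales[t - 1] / cma[t - 3] for t in range(3, 3 + len(cma))}

# Average the ratios quarter by quarter (quarter of period t is (t-1) % 4 + 1).
by_quarter = {q: [] for q in range(1, SEASONS + 1)}
for t, r in ratios.items():
    by_quarter[(t - 1) % SEASONS + 1].append(r)
averages = {q: sum(v) / len(v) for q, v in by_quarter.items()}

# Assumed adjustment: rescale so the indices sum to the number of seasons.
scale = SEASONS / sum(averages.values())
seasonal_index = {q: a * scale for q, a in averages.items()}

for q in sorted(seasonal_index):
    print(q, round(averages[q], 4), round(seasonal_index[q], 4))
```

Under these assumptions the adjusted indices for quarters 1, 2 and 4 round to 0.6063, 0.9191 and 1.4825, the values used in the later steps of the illustration.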
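Quarter  Average of the initial seasonal ratios            Average  Adjusted index
1        (0.628748707 + 0.6 + 0.588555858)/3               0.6058   0.6063
2        (0.894211577 + 0.911004785 + 0.949512843)/3       0.9182   0.9191
3        (0.989429175 + 1.013645224 + 0.970616114)/3       0.9912   0.9921
4        (1.445378151 + 1.499516908 + 1.49860205)/3        1.4812   1.4825
Sum                                                        3.9964   4.0000

Each adjusted index is the quarterly average multiplied by 4/3.9964, so that the four indices sum to 4.

Step 2: Estimation of Trend/Cycle
- Define the deseasonalized (or seasonally adjusted) series as
  $D_t = \dfrac{Y_t}{\widehat{SN}_t}$
  For example, $D_1 = 72/0.6063 = 118.7506$ (computed with the unrounded index).
- $TC_t$ may be estimated by regression using a linear trend:
  $D_t = \beta_0 + \beta_1 t + \varepsilon_t, \quad t = 1, 2, 3, \ldots$
  $\widehat{TC}_t = \hat{D}_t = b_0 + b_1 t$,
  where $b_0$ and $b_1$ are the least squares estimates of $\beta_0$ and $\beta_1$ respectively.

EXCEL regression output: So, $\widehat{TC}_t = 113.6997914 + 1.854638009\,t$

For example,
$\widehat{TC}_1 = 113.6997914 + 1.854638009(1) = 115.5544294$
$\widehat{TC}_2 = 113.6997914 + 1.854638009(2) = 117.4090674$

Step 3: Computation of fitted values and out-of-sample forecasts
$\hat{Y}_t = \widehat{TC}_t \times \widehat{SN}_t$

In-sample fit:
$\hat{Y}_1 = 115.5544 \times 0.6063 = 70.0621$
$\;\vdots$
$\hat{Y}_{16} = 143.3740 \times 1.4825 = 212.5516$

Out-of-sample forecast:
$\hat{Y}_{17} = \widehat{TC}_{17} \times \widehat{SN}_{17} = (113.700 + 1.855(17)) \times 0.6063 = 145.2286 \times 0.6063 = 88.054$
$\hat{Y}_{18} = \widehat{TC}_{18} \times \widehat{SN}_{18} = (113.700 + 1.855(18)) \times 0.9191 = 147.0833 \times 0.9191 = 135.1796$

[Figure 2.4]

Measuring Forecast Accuracy
Let $e_t = Y_t - \hat{Y}_t$ be the errors of forecast.
1) Mean Squared Error
   $MSE = \dfrac{1}{n}\sum_{t=1}^{n} e_t^2, \qquad RMSE = \sqrt{MSE}$
2) Mean Absolute Deviation
   $MAD = \dfrac{1}{n}\sum_{t=1}^{n} |e_t|, \qquad RMAD = \sqrt{MAD}$

Example:

            e_1   e_2   e_3   e_4   e_5    MSE     MAD
Method A    –2    1.5   –1    2.1   0.7    2.43    1.46
Method B    –4    0.7   0.5   1.4   0.1    3.742   1.34

Method A has the smaller MAD-based error only if MAD is ignored in favour of MSE: by MSE, Method A is better (2.43 vs 3.742); by MAD, Method B is better (1.34 vs 1.46).

Naive Prediction: $\hat{Y}_t = Y_{t-1}$

Theil's U Statistic
$U = \dfrac{\sqrt{\sum_{t=2}^{n} (Y_t - \hat{Y}_t)^2 / n}}{\sqrt{\sum_{t=2}^{n} (Y_t - Y_{t-1})^2 / n}}$
(the sums run over t = 2, …, n, the periods for which the naive forecast exists)
- U = 1: the forecasts produced are no better than the naive forecast.
- U = 0: the forecasts produce a perfect fit.
The smaller the value of U, the better the forecasts.

For the multiplicative decomposition of the sales series: MSE = 11.932, MAD = 2.892, Theil's U = 0.0546.

Out-of-Sample Forecasts
1) Ex-post forecast: prediction for a period in which actual observations are available.
2) Ex-ante forecast: prediction for a period in which actual observations are not available.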
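Steps 2 and 3 and the accuracy measures can be checked with a short plain-Python sketch. It is an illustration under stated assumptions, not the lecture's Excel workbook: the seasonal indices are recomputed from the sales data so that the block is self-contained, the helper names (`forecast`, `mse`, `mad`, `theil_u`) are hypothetical, and Theil's U is assumed to sum over periods 2 to n in both numerator and denominator.

```python
from math import sqrt

sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]
n = len(sales)

# Step 1 (recomputed): seasonal indices from ratios to centered moving averages.
ma = [sum(sales[i:i + 4]) / 4 for i in range(n - 3)]
cma = [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]
avg = []
for q in range(4):  # q = 0..3 stands for quarters 1..4
    # cma[i] is centered on 0-based index i + 2 of the sales list.
    r = [sales[i] / cma[i - 2] for i in range(2, 2 + len(cma)) if i % 4 == q]
    avg.append(sum(r) / len(r))
sn = [a * 4 / sum(avg) for a in avg]  # rescaled to sum to 4

# Step 2: deseasonalize, then fit D_t = b0 + b1 * t by least squares.
periods = list(range(1, n + 1))
d = [y / sn[(t - 1) % 4] for t, y in zip(periods, sales)]
t_bar = sum(periods) / n
d_bar = sum(d) / n
b1 = (sum((t - t_bar) * (x - d_bar) for t, x in zip(periods, d))
      / sum((t - t_bar) ** 2 for t in periods))
b0 = d_bar - b1 * t_bar

def forecast(t):
    """Step 3: trend-cycle estimate times the seasonal index of period t."""
    return (b0 + b1 * t) * sn[(t - 1) % 4]

def mse(e):
    return sum(x * x for x in e) / len(e)

def mad(e):
    return sum(abs(x) for x in e) / len(e)

def theil_u(actual, predicted):
    """Assumed form: both sums run over t = 2..n, so the 1/n factors cancel."""
    num = sum((a - p) ** 2 for a, p in zip(actual[1:], predicted[1:]))
    den = sum((actual[i] - actual[i - 1]) ** 2 for i in range(1, len(actual)))
    return sqrt(num / den)

fitted = [forecast(t) for t in periods]
errors = [y - f for y, f in zip(sales, fitted)]

print(round(b0, 4), round(b1, 4))
print(round(forecast(17), 3), round(forecast(18), 3))
print(round(mse(errors), 3), round(mad(errors), 3),
      round(theil_u(sales, fitted), 4))
```

The same `mse` and `mad` helpers can be applied to the Method A and Method B error lists, e.g. `mse([-2, 1.5, -1, 2.1, 0.7])` and `mad([-4, 0.7, 0.5, 1.4, 0.1])`.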
[Timeline diagram: the time axis is divided at T1, T2 and T3 (today). Before T1: “back”casting; T1 to T2: estimation period (in-sample simulation); T2 to T3: ex-post forecast; after T3: ex-ante forecast.]

Additive Decomposition
$Y_t = TC_t + SN_t + IR_t$

[Figure: two panels plotting $Y_t$ against Time around a trend line, one labelled “Multiplicative Seasonality” and the other “Additive Seasonality”.]

Multiplicative decomposition is used when the time series exhibits increasing or decreasing seasonal variation ($Y_t = TC_t \times SN_t \times IR_t$):

Year  Quarter  TC_t   SN_t   Y_t     Y_t – Y_{t-1}
1     Q1       11.5   1.5    17.25
1     Q2       13     0.5     6.5    –10.75
1     Q3       14.5   0.8    11.6      5.1
1     Q4       16     1.2    19.2      7.6
2     Q1       17.5   1.5    26.25
2     Q2       19     0.5     9.5    –16.75
2     Q3       20.5   0.8    16.4      6.9
2     Q4       22     1.2    26.4     10

Additive decomposition is used when the time series exhibits constant seasonal variation ($Y_t = TC_t + SN_t + IR_t$):

Year  Quarter  TC_t   SN_t   Y_t    Y_t – Y_{t-1}
1     Q1       11.5    1.8   13.3
1     Q2       13     –1     12     –1.3
1     Q3       14.5   –1.5   13      1
1     Q4       16      0.7   16.7    3.7
2     Q1       17.5    1.8   19.3
2     Q2       19     –1     18     –1.3
2     Q3       20.5   –1.5   19      1
2     Q4       22      0.7   22.7    3.7

Step 1: Estimation of seasonal component ($SN_t$)
- Calculation of $MA_t$ and $CMA_t$ is the same as for multiplicative decomposition.
- The initial seasonal component may be estimated by
  $\widetilde{SN}_t = Y_t - CMA_t$
  For example, $\widetilde{SN}_3 = 117 - 118.25 = -1.25$ and $\widetilde{SN}_4 = 172 - 119 = 53$.
- Seasonal indices are averaged and adjusted so that they sum to zero. (Why?)

Step 2: Estimation of Trend/Cycle
- The deseasonalized series is defined as
  $D_t = Y_t - \widehat{SN}_t$
- $TC_t$ may be estimated by regression as for multiplicative decomposition, i.e.,
  $D_t = \beta_0 + \beta_1 t + \varepsilon_t$ and $\widehat{TC}_t = \hat{D}_t = b_0 + b_1 t$.

So, $\widehat{TC}_t = 113.2270833 + 1.980637255\,t$ and $\hat{Y}_t = \widehat{TC}_t + \widehat{SN}_t$.

For example,
$\widehat{TC}_1 = 113.2270833 + 1.980637255(1) = 115.2077206$
and
$\hat{Y}_1 = 115.2077206 - 50.80208333 = 64.40563725$

For the additive decomposition of the sales series: MSE = 27.911, MAD = 4.477.
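The additive steps can be sketched in the same way as the multiplicative ones. The block below is a minimal illustration in plain Python, with hypothetical variable names, and it assumes the zero-sum adjustment is made by subtracting the mean of the four quarterly averages from each of them.

```python
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]
n = len(sales)

# Moving averages and centered moving averages, as in the multiplicative case.
ma = [sum(sales[i:i + 4]) / 4 for i in range(n - 3)]
cma = [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]

# Step 1: initial seasonal components Y_t - CMA_t, averaged by quarter.
avg = []
for q in range(4):  # q = 0..3 stands for quarters 1..4
    # cma[i] is centered on 0-based index i + 2 of the sales list.
    diffs = [sales[i] - cma[i - 2] for i in range(2, 2 + len(cma)) if i % 4 == q]
    avg.append(sum(diffs) / len(diffs))

# Assumed adjustment: subtract the mean so the four indices sum to zero.
shift = sum(avg) / 4
sn = [a - shift for a in avg]

# Step 2: deseasonalize by subtraction, then fit D_t = b0 + b1 * t.
periods = list(range(1, n + 1))
d = [y - sn[(t - 1) % 4] for t, y in zip(periods, sales)]
t_bar = sum(periods) / n
d_bar = sum(d) / n
b1 = (sum((t - t_bar) * (x - d_bar) for t, x in zip(periods, d))
      / sum((t - t_bar) ** 2 for t in periods))
b0 = d_bar - b1 * t_bar

# Fitted values: trend-cycle estimate plus the seasonal index.
fitted = [b0 + b1 * t + sn[(t - 1) % 4] for t in periods]
errors = [y - f for y, f in zip(sales, fitted)]
mse = sum(e * e for e in errors) / n
mad = sum(abs(e) for e in errors) / n

print([round(s, 4) for s in sn])
print(round(b0, 4), round(b1, 4), round(fitted[0], 4))
print(round(mse, 3), round(mad, 3))
```

Comparing `mse` and `mad` here with the multiplicative results (11.932 and 2.892) gives one way to choose between the two models for this series.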
