Validity of forecast variances from exponential smoothing methods. Kimberly Fernandes April 19, 2006 For Dr. Harry Joe, STAT 541 1.0 Introduction The SAS Time series forecasting system (TSFS) uses ARIMA models for forecast variances for exponential smoothing methods. However, it is not obvious whether it is always appropriate to do so. For example, it is not clear whether ARIMA forecasts make sense under circumstances where the SAS TSFS finds that a certain exponential smoothing model is a better fit than its ARIMA equivalent. This report explores the validity of using ARIMA forecast variances for exponential smoothing methods under various conditions by using the TSFS for three of the time series datasets available from the STAT 541 course website and the SAS help library. In particular, the exponential smoothing methods and their corresponding ARIMA models are compared under combinations of two characteristics: whether the exponential smoothing model is picked by the TSFS as the best-fitting model or not, and whether the RMSE scores are similar between the exponential models and their ARIMA equivalents. The relationship between the similarity of RMSE scores and the similarity of forecasts between models is also investigated, as well as the possibility of using other summary statistics provided by the TSFS to distinguish between forecasting methods. 2.0 Datasets presented While many data sets from the SAS help and SAS user libraries were examined, three datasets are presented in this report: the Tbill monthly interest rate dataset (from the course website), the Citibase monthly data ‘conq’ dataset, and the steel dataset (both from the SAS help library). These datasets were chosen because the best model chosen by the TSFS was an exponential smoothing model (Damped Trend Exponential Smoothing, Linear Holt Exponential Smoothing, and Simple Exponential Smoothing respectively), and thus provided three conditions under which the forecast variances of these models, as well as models that were not the best fit, could be explored. Two models not considered in this report are the Seasonal Exponential Smoothing and Winters Additive models. This is because the ARIMA(0,1,d+1)(0,1,0)d is the corresponding ARIMA equivalent for these two models, but it was impossible to fit this ARIMA model in the TSFS for all the datasets I tried, because the number of iterations and the convergence parameters cannot be changed via the TSFS. Although it is possible 1 to change these parameters via the corresponding proc arima code and use this output to compare to the fit, forecasts, and forecast variances of the TSFS exponential smoothing models, it is unclear whether the ARIMA estimates produced by this version of proc arima would be the same as those used to produce the forecast variances for the corresponding exponential smoothing models, as the parameter estimates generated by ARIMA do not always completely match up to the exponential smoothing estimates. Thus, this report focused on the exponential smoothing models where the corresponding ARIMA models could be easily fit via the TSFS. 3.0 Comparing the fit of Exponential Smoothing models to their ARIMA equivalents By default, the SAS TSFS automatically finds a ‘best’ fitting model for a given data set by finding the model with the smallest Root Mean Square Error (RMSE) for a subset of models chosen corresponding to series diagnostics [1]. In this section, exponential smoothing methods and their corresponding ARIMA models are compared under combinations of two characteristics: whether the exponential smoothing model is picked by the TSFS as the best-fitting model or not, and whether the RMSE scores are similar between the exponential models and their ARIMA equivalents. 3.1. The Citimon (conq) dataset from the SAS help library For the Citimon (conq) dataset, the Linear (Holt) Exponential Smoothing model (LHES) was chosen by the SAS TSFS as the best-fitting model, with an RMSE of 2488.8 (Table 1). Table 1 also shows the RMSE values for the corresponding ARIMA model, and for the Double (Brown) exponential smoothing (DBES) model which uses the same ARIMA model with different parameters for its forecast variances. RMSEs for other models and their corresponding ARIMA equivalents are also shown. It is unclear from this table whether the RMSE values between LHES and its corresponding ARIMA model are very similar; the table shows, for instance, that the RMSE for LHES is closer to the Damped Trend Exponential Smoothing (DTES) than to its ARIMA equivalent. When the LHES, ARIMA(0,2,2), and DBES models are plotted on the same graph (Figure 1), the 2 three models generate very similar forecasts and forecasts variances. The numeric forecasts and forecasts variances for the three models are given in Appendix A1. The parameter estimates for the LHES, DBES, and ARIMA(0,2,2) models were examined to see whether they satisfied the equations relating the smoothing weights to the ARIMA parameters. The parameter estimates did not match up very well; the LHES parameters were 0.537 and 0.001, but should have been 1.215 and 0.445 according to the ARIMA(0,2,2) parameter estimates, and the DBES parameters did not match up either. I found this surprising given the similarity of the forecasts and forecast variances (Figure 1). Model RMSE Linear (Holt) Exponential Smoothing 2488.8 ARIMA(0, 2,2) 2548.3 Double (Brown) Exponential Smoothing 2579.8 Simple Exponential Smoothing 2547.8 ARIMA(0,1,1) 2558.0 Damped Trend Exponential Smoothing 2490.0 ARIMA(1,1,2) 2545.6 Table 1: RMSE values for models fit to the Citimon (conq) dataset. Figure 1: Forecasts and 95% Forecast Confidence Intervals for the Citimon (conq) dataset: LHES (green), ARIMA(02,2) (red), DBES (blue). 3 Figure 2 shows a plot of the simple exponential smoothing (SES) and corresponding ARIMA (0,1,1) models, and Figure 3 shows a plot of the DTES and corresponding ARIMA(1,1,2) models. Figure 2 shows that the SES and the ARIMA (0,1,1) models produce very similar forecasts and forecast variances, as the green and red lines often look like one single line. Figure 3, in contrast, shows that the forecast and forecast variances for the unknown data points for the DTES and ARIMA (1,1,2) models are quite different. Figure 2: Forecasts and 95% Forecast Confidence Intervals for the Citimon (conq) dataset: SES (green), ARIMA(0,1,1) (red). Figure 3: Forecasts and 95% Forecast Confidence Intervals for the Citimon (conq) dataset: DTES (green), ARIMA(1,1,2) (red). 4 This analysis on the citimon data set shows the following: in this instance, the best fitting model and the corresponding ARIMA model produce very similar forecasts and forecast variances, even though the RMSE scores are not extremely close. For the exponential smoothing models that were not chosen as the best-fitting model, the forecast and forecast variances are extremely similar in one instance (the SES and the ARIMA(0,1,1) models), and so are the RMSE scores. In the case of the DTES/ARIMA(1,1,2) models, however, the forecast and forecast variances are quite different, as are the RMSE scores. In the next two sections, we will see if similar results are obtained for the steel dataset and the T-bill monthly interest dataset. 3.2. The steel dataset from the SAS help library For the steel dataset, the SES model was chosen by the SAS TSFS as the bestfitting model, with an RMSE of 1.699 (Table 3). Table 3 also shows the RMSE values for the corresponding ARIMA(01,1,) model. RMSEs for the DTES and its corresponding ARIMA(1,1,2) are shown, but the ARIMA(0,2,2) and its corresponding exponential smoothing models are not shown as the ARIMA(0,2,2) model did not converge after setting the number of iterations in proc arima to over 1,000,000, and thus was excluded from this section. The RMSE values between SES and its corresponding ARIMA model are quite similar, while there is a greater difference between the RMSE values for DTES and its corresponding ARIMA equivalent. Figure 4 shows the SES and the ARIMA(0,1,1) models plotted on the same graph, and the forecasts and intervals are very similar, with the ARIMA(0,1,1) confidence intervals being slightly wider than the SES confidence intervals. The parameter estimates for the SES and ARIMA(0,1,1) models were examined to see whether they satisfied the equations relating the smoothing weights to the ARIMA parameters. It was found that the parameter estimates matched up relatively well; the SES parameter was 0.416 and according to the parameter estimates from the ARIMA model, the SES parameter should have been 0.471. This intuitively makes sense, since the forecasts and forecast variances are very similar (Figure 4). 5 Model RMSE Simple Exponential Smoothing 1.699 ARIMA(0,1,1) 1.726 Damped Trend Exponential Smoothing 1.699 ARIMA(1,1,2) 1.598 Table 3: RMSE values for models fit to the Steel dataset. Figure 4: Forecasts and 95% Forecast Confidence Intervals for the Steel dataset: SES (green), ARIMA(0,1,1) (red). Figure 5 shows a plot of the DTES model and the corresponding ARIMA(1,1,2) model. While the forecasts for the known data appear to be quite different for the two methods, the forecasts and forecast variances for the unknown data points appear to be quite similar, with the ARIMA forecasts and confidence intervals being a bit higher in value than the DTES model at first, and then leveling off, with the DTES forecast confidence intervals being slightly larger than the ARIMA forecast confidence intervals toward the end. 6 Figure 5: Forecasts and 95% Forecast Confidence Intervals for the Steel dataset: damped trend exponential smoothing (green), ARIMA(1,1,2) (red). For the steel data set, the best fitting exponential model and its ARIMA equivalent appear to have relatively more similar RMSE values than in the citimon data set. Like with the citimon data set, the best fitting model and the corresponding ARIMA model produce very similar forecasts and forecast variances. For the exponential smoothing model that was not chosen as the best-fitting model, the forecast values for the known data are quite different between the two models, which may be reflected in the relatively different RMSE scores. The forecasted values and confidence intervals for the unknown data, however, are quite similar. We will now consider our final data set, the Tbill monthly interest data set, in the next section. 3.3. The T-bill dataset from the STAT 541 Course Website For the T-bill dataset, the Damped Trend Exponential Smoothing model was chosen by the SAS TSFS as the best-fitting model, with an RMSE of 0.277 (Table 4). Note that for this dataset, the RMSE scores for all the models fit seem to be quite similar. The forecasts and forecast variances for the DTES and its corresponding ARIMA model are almost identical (Figure 6). It was also verified that the parameter estimates for the DTES and ARIMA(1,1,2) models satisfied the equations relating the two models; the DTES parameters were 1.487 and -0.515, and from the ARIMA 7 parameter estimates the DTES parameters should have been 1.489 and -0.491. Thus, it intuitively makes sense that the forecasts and forecast intervals are quite similar. Model RMSE Damped Trend Exponential Smoothing 0.27727 ARIMA(1,1,2) 0.27939 Linear (Holt) Exponential Smoothing 0.28462 ARIMA(0, 2,2) 0.28861 Double (Brown) Exponential Smoothing 0.29842 Simple Exponential Smoothing 0.28636 ARIMA(0,1,1) 0.28188 Table 4: RMSE values for models fit to T-bill monthly interest rate dataset. Figure 6: Forecasts and 95% Forecast Confidence Intervals for the T-bill monthly interest rate dataset: DTES (green), ARIMA(1,1,2) (red). Figure 7 shows a plot of the SES model and the corresponding ARIMA(0,1,1) model, and Figure 8 shows a plot of the DBES model and the corresponding ARIMA(0,2,2) model. Figure 7 shows that the SES and the ARIMA(0,1,1) models produce very similar forecasts and forecast variances, with the ARIMA(0,1,1) confidence interval being slightly larger than that of the SES model even though its RMSE value is slightly smaller than that of the SES model. 8 Figure 7: Forecasts and 95% Forecast Confidence Intervals for the T-bill dataset: SES (green), ARIMA(0,1,1) (red). Figure 8 is more interesting. Even though the RMSE values for the LHES, ARIMA(0,2,2), and DBES models are quite similar, along with the forecasts for the known data points, the forecasts for the unknown data points are quite different. The LHES and ARIMA(0,2,2) models have similar forecast and forecast variances for the unknown data points, although it was verified that the parameter estimates relating the two models do not match up very well. The DBES forecast is quite different from the other two models; it estimates an increasing trend rather than a decreasing trend. This is surprising given that the RMSE scores are quite similar, although it should be noted that the LHES and the ARIMA(0,2,2) models have more similar RMSE scores than the DBES model (Table 4). 9 Figure 8: Forecasts and 95% Forecast Confidence Intervals for the T-bill dataset: DBES (green), LHES (red), ARIMA(0,2,2) (blue). This analysis on the t-bill data set shows that in this instance, the best fitting model and the corresponding ARIMA model produce very similar forecasts and forecast variances, and the RMSE scores are quite close. For the exponential smoothing models that were not chosen as the best-fitting model, the forecast and forecast variances are extremely similar in one instance (the SES model and the ARIMA(0,1,1) model), and so are the RMSE scores. In the case of the DBES/LHES/ARIMA(0,2,2) models, however, the forecast and forecast variances are similar for the LHES and ARIMA(0,2,2) models where the RMSE scores are very similar, but differ for the DBES model where the RMSE score is a bit larger. 3.0 Choice of Summary Statistics considered A note should be made regarding the focus on the relationship between RMSE scores and forecasts/forecasting variances in this paper, as opposed to examining other summary statistics. By default, the SAS TSFS uses the RMSE scores as the criterion to choose the best-fitting model, but other criteria exist. I tried to find the best fitting model for the three datasets considered using different criteria (Mean Absolute Percent Error, AIC, BIC, Random Walk R-Square, Amemiya’s Predicted Criterion), and there was only one circumstance where the best-fitting model chosen was different. This occurred for the citibase data; the log DTES model was chosen over the LHES model under the Mean Absolute Percent criterion. Thus, it appears that the choice of which summary statistic to use to find the best-fitting model does not greatly affect which model is chosen. 10 It also appears that the similarity of RMSEs between models, or similarities of any other of the summary statistics provided by the TSFS, is not guaranteed to elucidate whether particular models will have similar forecasts. Appendix B shows an example of this for the T-bill dataset. Although the SES, LHES, and ARIMA(0,1,1)(1,0,0) models have similar RMSEs, they produce very different forecasts for unknown data points. Comparing other summary statistics (R-squared, AIC, BIC, APC) does not shed light on how different forecasts from various methods will be, since similar values are yielded under these different summary statistics even though the forecasts themselves are quite different. 5.0 Conclusions Table 5 summarizes the results from the three datasets presented in Section 3. While the results are limited as only three datasets are considered, which cannot be representative of all possible datasets, the results are still interesting. We see that in all three instances, the best exponential model chosen by SAS yielded similar forecasts and forecast variances as its corresponding ARIMA model. The RMSE scores between the best model and its ARIMA equivalent were similar in the case of the Steel and T-bill datasets, which makes sense since the parameter equations matched up well between the ARIMA and smoothing exponential models. The RMSE scores were not very similar for the best model in the Citimon dataset, and thus it is not surprising that the parameter equations did not match up for this dataset. When the exponential smoothing model was not chosen by SAS as the best underlying model to represent the data, it was not guaranteed that the smoothing models and their corresponding ARIMA equivalents produced similar forecasts or forecast variances (although this was the case for the SES model). It was also not always guaranteed that similar RMSE scores, or other similar summary statistics, would be produced. From this small investigation, it appears that when the exponential smoothing model is found to be the best model to represent the underlying pattern of the data, its ARIMA equivalent produces similar forecasts and forecast variances, and thus it is most likely appropriate to use the ARIMA equivalent to make confidence intervals for the exponential smoothing model. It is less clear as to whether the ARIMA equivalent 11 should be used to make confidence intervals when the exponential smoothing method in question is not the best fit for the data, as the exponential smoothing models and the optimized-weighted ARIMA models can be quite different. Overall, however, it appears that the forecast methods used in SAS TSFS appear to be valid and useful. This report also finds that models with similar RMSE or other summary statistic values may still yield quite different forecasts, so it is important that models be visually inspected and compared rather than relying on summary statistics alone to reveal similarities between different forecasting models. Best model: - Similar forecasts/forecast variances? - Similar RMSE scores? - Did parameter equations match up? Not found as Best model: - Similar forecasts/forecast variances? - Similar RMSE scores? Citimon Steel T-bill Yes Yes Yes Not very similar No Yes Yes Yes Yes Not always Yes, but only one other exponential model was considered Not always Not always No, but only one other exponential model was considered Yes Table 5: Summary of Results from the three datasets considered. 12 Appendix A: Numeric Comparisons of the Forecasts and Forecast variances for the best fitting exponential smoothing models and their ARIMA equivalents. DATE Linear Holt Exponential Smoothing PREDICT UPPER LOWER ARIMA (0,2,2) PREDICT UPPER LOWER Double Brown Exponential Smoothing PREDICT UPPER LOWER 1/1/1992 2/1/1992 3/1/1992 4/1/1992 112957.4 113336.8 113716.2 114095.5 117869.7 118915 119890 120813.5 108045.2 107758.6 107542.3 107377.6 112997.1 113398.9 113800.6 114202.4 117964.1 119078.5 120138.2 121159.2 108030.2 107719.3 107463 107245.5 112520.7 112894.9 113269.2 113643.4 117594.6 118358.9 119170.7 120026.6 107446.7 107430.9 107367.7 107260.3 5/1/1992 6/1/1992 7/1/1992 8/1/1992 114474.9 114854.3 115233.6 115613 121697 122548.5 123373.5 124176.2 107252.8 107160 107093.8 107049.9 114604.1 115005.8 115407.6 115809.3 122151.5 123121.7 124074.6 125013.6 107056.7 106890 106740.5 106605 114017.7 114392 114766.2 115140.5 120923.4 121857.9 122827.3 123829.3 107112 106926.1 106705.1 106451.6 9/1/1992 10/1/1992 11/1/1992 12/1/1992 115992.4 116371.8 116751.1 117130.5 124959.7 125726.4 126478.5 127217.6 107025.1 107017.1 107023.7 107043.4 116211 116612.8 117014.5 117416.3 125941.4 126860 127771.1 128675.9 106480.7 106365.5 106258 106156.6 115514.7 115889 116263.3 116637.5 124861.6 125922.4 127010 128123.1 106167.8 105855.6 105516.5 105152 A1: Forecasts and 95% Forecast Confidence Intervals for the Citimon (conq) dataset. DATE Simple Exponential Smoothing ARIMA (0,1,1) 1/1/1981 1/1/1982 1/1/1983 PREDICT 4.331373 4.331373 4.331373 UPPER 7.698977 7.978444 8.23797 LOWER 0.963769 0.684302 0.424776 PREDICT 4.380705 4.380705 4.380705 UPPER 7.789771 8.149582 8.477916 LOWER 0.971638 0.611827 0.283493 1/1/1984 1/1/1985 1/1/1986 1/1/1987 4.331373 4.331373 4.331373 4.331373 8.481298 8.711127 8.929484 9.13793 0.181448 -0.04838 -0.26674 -0.47518 4.380705 4.380705 4.380705 4.380705 8.781824 9.06606 9.334013 9.588196 -0.02041 -0.30465 -0.5726 -0.82679 1/1/1988 1/1/1989 1/1/1990 1/1/1991 4.331373 4.331373 4.331373 4.331373 9.337706 9.529809 9.715063 9.89415 -0.67496 -0.86706 -1.05232 -1.2314 4.380705 4.380705 4.380705 4.380705 9.830537 10.06255 10.28546 10.50025 -1.06913 -1.30114 -1.52405 -1.73884 1/1/1992 4.331373 10.06765 -1.4049 4.380705 10.70775 -1.94634 A2: Forecasts and 95% Forecast Confidence Intervals for the Steel dataset. DATE 3/1/2002 4/1/2002 Simple Exponential Smoothing PREDICT UPPER LOWER 2.597898 3.152702 2.043094 2.616405 3.480589 1.752221 ARIMA (0,1,1) PREDICT UPPER 2.590001 3.148995 2.601149 3.472205 LOWER 2.031007 1.730093 5/1/2002 6/1/2002 7/1/2002 8/1/2002 2.628627 2.636699 2.64203 2.64555 3.762533 4.013751 4.240947 4.448524 1.494721 1.259647 1.043112 0.842576 2.608878 2.614237 2.617953 2.620529 3.746398 3.991038 4.213428 4.417778 1.471358 1.237436 1.022478 0.82328 9/1/2002 10/1/2002 11/1/2002 12/1/2002 2.647875 2.649411 2.650425 2.651094 4.639798 4.817382 4.98337 5.139456 0.655952 0.48144 0.317479 0.162733 2.622315 2.623554 2.624412 2.625008 4.607053 4.783535 4.94905 5.105088 0.637577 0.463572 0.299775 0.144927 1/1/2003 2/1/2003 2.651537 2.651829 5.287014 5.427168 0.01606 -0.12351 2.62542 2.625707 5.252878 5.393442 -0.00204 -0.14203 A3: Forecasts and 95% Forecast Confidence Intervals for the Tbill dataset. 13 Appendix B: Comparing Summary Statistics for the T-Bill Dataset The summary statistics for these three models are very similar, but the forecasts and forecast variances for the unknown data points are quite different. AIC BIC Amemiya’s Adjusted R-Squared 0.286 Mean Absolute Percent Error 5.293 -183 -181 0.908 0.285 5.336 -182 -177 0.906 0.285 5.243 -181 -179 0.909 RMSE Simple Exponential Smoothing Linear Holt Exponential Smoothing ARIMA(0,1,1)(1,0,0) 14 R-Code To make plots: data leadout1LU; set leadout1(keep=date upper lower rename=(upper=upper1 lower=lower1)); where date >= '01mar02'd; run; data leadout11; set leadout1(keep=date actual predict rename=(actual=actual1 predict=predict1)); run; data leadout2LU; set leadout2(keep=date upper lower rename=(upper=upper2 lower=lower2)); where date >= '01mar02'd; run; data leadout22; set leadout2(keep=date actual predict rename=(actual=actual2 predict=predict2)); run; data leadout3LU; set leadout3(keep=date upper lower rename=(upper=upper3 lower=lower3)); where date >= '01mar02'd; run; data leadout33; set leadout3(keep=date actual predict rename=(actual=actual3 predict=predict3)); run; data final; merge leadout1LU leadout11 leadout2LU leadout22 leadout3LU leadout33; by date; run; /* ------- Graphics Output ------- */ goptions cback=white ctitle=bl ctext=bl border ftitle=centx ftext=centx; title 'TBill monthly interest data'; symbol1 h=2 pct v=star c=black; /* for actual */ symbol2 i=spline width=1 v=none c=red l = 20; symbol3 i=spline width=2 v=none c=red; symbol4 i=spline width=1 v=none c=red l = 20; symbol5 i=spline width=1 v=none c=green l = 20; symbol6 i=spline width=2 v=none c=green; symbol7 i=spline width=1 v=none c=green l=20; symbol8 i=spline width=1 v=none c=blue l = 20; symbol9 i=spline width=2 v=none c=blue; symbol10 i=spline width=1 v=none c=blue l=20; axis1 offset=(1 cm) label=('Year') minor=none order=('01jan96'd to '01jul03'd by year); axis2 label=(angle=90 'Tbill Monthly Interest Data') order=(0 to 7 by 1); proc gplot data=final; format date year4.; plot actual1 * date = 1 Lower2 * date = 2 predict2 * date =3 upper2 * date = 4 15 lower1 * date = 5 predict1 * date = 6 upper1 * date = 7 lower3 * date = 8 predict3 * date = 9 upper3 * date = 10 / overlay noframe href='01mar02'd chref=red vaxis=axis2 vminor=1 haxis=axis1; footnote1 c=black f=centx ' * Actual' c=g f=centx ' --- linear trend (Holt) exponential smoothing' c=r f=centx ' --- Linear Holt Exponential Smoothing' c=blue f=centx ' --- ARIMA(0,1,1)(1,0,0)'; run; quit; footnote; To export datasets into excel for editing: PROC EXPORT DATA= Work.Leadout1 OUTFILE= "C:\Documents and Settings\Owner\STAT 541\Final\Data\tbill_dtes.xls" DBMS=EXCEL2000 REPLACE; RUN; PROC EXPORT DATA= Work.Leadout2 OUTFILE= "C:\Documents and Settings\Owner\STAT 541\Final\Data\tbill_arima112.xls" DBMS=EXCEL2000 REPLACE; RUN; PROC EXPORT DATA= Work.Leadout3 OUTFILE= "C:\Documents and Settings\Owner\STAT 541\Final\Data\citimon_conq_dblbrown.xls" DBMS=EXCEL2000 REPLACE; RUN; 16 References 1. Time Series Forecasting System. Windows Reference: Develop Models Window. http://www.uni.edu/sasdoc/ets/chap29/sect10.htm. 17