Exponential smoothing: The state of the art – Part II
Everette S. Gardner, Jr.

Outline
- History
- Methods
- Properties
- Method selection
- Model-fitting
- Inventory control
- Conclusions

Timeline of Operations Research (Gass, 2002)
- 1654: Expected value, B. Pascal
- 1733: Normal distribution, A. de Moivre
- 1763: Bayes Rule, T. Bayes
- 1788: Lagrangian multipliers, J. Lagrange
- 1795: Method of Least Squares, C. Gauss, A. Legendre
- 1826: Solution of linear equations, C. Gauss
- 1907: Markov chains, A. Markov
- 1909: Queuing theory, A. Erlang
- 1936: The term OR first used in British military applications
- 1941: Transportation model, F. Hitchcock
- 1942: U.K. Naval Operational Research, P. Blackett
- 1943: Neural networks, W. McCulloch, W. Pitts
- 1944: Game theory, J. von Neumann, O. Morgenstern
- 1944: Exponential smoothing, R. Brown

Exponential smoothing at work
“A depth charge has a magnificent laxative effect on a submariner.”
Lt. Sheldon H. Kinney, Commander, USS Bronstein (DE 189)

Forecast Profiles
- Trend
  - N: None
  - A: Additive
  - DA: Damped Additive
  - M: Multiplicative
  - DM: Damped Multiplicative
- Seasonality
  - N: None
  - A: Additive
  - M: Multiplicative

Damped multiplicative trends (Taylor, 2002)
[Figure: forecast profiles for damping parameters 1.00, 0.95, 0.90, and 0.85; vertical axis 1,000 to 4,000]

Variations on the standard methods
- Multivariate series (Pfeffermann & Allon, 1989)
- Missing or irregular observations (Wright, 1986)
- Irregular update intervals (Johnston, 1993)
- Planned discontinuities (Williams & Miller, 1999)
- Combined level/seasonal component (Snyder & Shami, 2001)
- Multiple seasonal cycles (Taylor, 2003)
- Fixed drift (Hyndman & Billah, 2003)
- Smooth transition exponential smoothing (Taylor, 2004)
- Renormalized seasonals (Archibald & Koehler, 2003)
- SSOE state-space equivalent methods (Hyndman et al., 2002)

Smoothing with a fixed drift (Hyndman & Billah, 2003)
- Equivalent to the “Theta method”?
  (Assimakopoulos and Nikolopoulos, 2000)
- How to do it
  - Set the drift equal to half the slope of a regression on time
  - Then add the fixed drift to simple smoothing, or set the trend parameter to zero in Holt’s linear trend
- When to do it: Unknown

Adaptive simple smoothing (Taylor, 2004)
- Smooth transition exponential smoothing (STES) is the only adaptive method to demonstrate credible improvements in forecast accuracy
- The adaptive parameter changes according to a logistic function of the errors
- Model-fitting is necessary

Renormalization of seasonals
- Additive (Lawton, 1998)
  - Without renormalization: level and seasonals are biased; trend and forecasts are unbiased
  - Renormalization of seasonals alone: forecasts are biased unless renormalization is done every period
- Multiplicative (Archibald & Koehler, 2003)
  - Competing renormalization methods give forecasts that differ from each other and from unnormalized forecasts

Archibald & Koehler (2003) solution
- Additive and multiplicative renormalization equations that give the same forecasts as the standard equations
- Cumulative renormalization correction factors for those who wish to keep the standard equations

Continental Airlines Domestic Yields
[Figure: time series, Jan-00 to Jan-05; vertical axis 0.10 to 0.15; annotation “Model Restarted”]

Standard vs. state-space methods
- Trend damping
  - Standard: Immediate
  - State-space: Starting at 2 steps ahead
- Multiplicative seasonality
  - Standard: Seasonal component depends on level
  - State-space: Independent components
- Model fitting
  - Standard: Minimize squared errors
  - State-space: Minimize squared relative errors if multiplicative errors are assumed
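The two trend-damping conventions listed above can be illustrated with a short Python sketch. It assumes the usual damped additive trend forecast function: with immediate damping, the h-step-ahead forecast is level + (φ + φ² + ... + φ^h) × trend; with damping that starts at 2 steps ahead, it is level + (1 + φ + ... + φ^(h-1)) × trend. The function name, arguments, and example values are hypothetical and do not come from the slides.

```python
def damped_trend_forecasts(level, trend, phi, horizon, immediate=True):
    """Forecasts for horizons 1..horizon from a damped additive trend.

    immediate=True  -> trend multiplier phi + phi^2 + ... + phi^h
    immediate=False -> trend multiplier 1 + phi + ... + phi^(h-1),
                       so damping first shows up at 2 steps ahead.
    """
    forecasts = []
    multiplier = 0.0
    for h in range(1, horizon + 1):
        # Add the next term of the geometric sum for this horizon.
        multiplier += phi ** (h if immediate else h - 1)
        forecasts.append(level + multiplier * trend)
    return forecasts


# Hypothetical final state: level 100, trend 10, damping parameter 0.9.
standard = damped_trend_forecasts(100.0, 10.0, 0.9, 3, immediate=True)
state_space = damped_trend_forecasts(100.0, 10.0, 0.9, 3, immediate=False)
```

With these inputs the immediate form gives roughly 109, 117.1, and 124.39, while the delayed form gives roughly 110, 119, and 127.1. The one-step-ahead forecast is where the two conventions differ most visibly, and with φ = 1 both reduce to an undamped linear trend.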
Properties
- Equivalent models
- Prediction intervals
- Robustness

Equivalent models
- Linear methods
  - ARIMA
  - DLS regression
  - Kernel regression (Gijbels et al., 1999; Taylor, 2004)
  - MSOE state-space models (Harvey, 1984)
- All methods
  - SSOE state-space models (Ord et al., 1997)

Analytical prediction intervals
- Options
  - SSOE models (Hyndman et al., 2005)
  - Model-free (Chatfield & Yar, 1991)
- Empirical evidence: None

Empirical prediction intervals
- Options
  - Chebyshev distribution (fitted errors) (Gardner, 1988)
  - Quantile regression (fitted errors) (Taylor & Bunn, 1999)
  - Parametric bootstrap (Snyder et al., 2002)
  - Simulation from assumed model (Bowerman, O’Connell, & Koehler, 2005)
- Empirical evidence: Limited, but encouraging

Robustness
- Many equivalent models for each method (Chatfield et al., 2001; Koehler et al., 2001)
- Simple ES performs well in many series that are not ARIMA (0,1,1) (Cogger, 1973)
- Aggregated series can often be approximated by ARIMA (0,1,1) (Rossana & Seater, 1995)

Robustness (continued)
- Exponentially declining weights are robust (Muth, 1960; Satchell & Timmermann, 1995)
- Additive seasonal methods are not sensitive to the generating process (Chen, 1997)
- The damped trend includes numerous special cases (Gardner & McKenzie, 1988)

Automatic forecasting with the damped additive trend
[Figure: fitted parameter values .84, .38, and 1.00; the parameter symbols are missing in the source]

Summary of 66 empirical studies, 1985-2005
- Seasonal methods rarely used
- Damped trend rarely used
- Multiplicative trend never used
- Little attention to method selection
- But exponential smoothing was robust, performing well in at least 58 studies

Method selection
- Benchmarking
- Time series characteristics
- Expert systems
- Information criteria
- Operational benefits
- Identification vs.
  selection

Benchmarking in method selection
- Methods should be compared to reasonable alternatives
- Competing methods should use exactly the same information
- Forecast comparisons should be genuinely out of sample

Method selection: Time series characteristics
- Variances of differences (Gardner & McKenzie, 1988)
  - Seemed a good idea at the time
- Discriminant analysis (Shah, 1997)
  - Considered only simple smoothing and a linear trend
  - Should be tested with an exponential smoothing framework
- Regression-based performance index (Meade, 2000)
  - Considered every feasible time series model
  - Should be tested with an exponential smoothing framework

Method selection: Expert systems
- Rule-based forecasting
  - Original version (Collopy & Armstrong, 1992)
  - Automatic version (Vokurka et al., 1996)
  - Streamlined version (Adya et al., 2001)
- Other rule-induction systems (Arinze, 1994; Flores & Pearce, 2000)
- Expert systems are no better than aggregate selection of the damped trend alone (Gardner, 1999)

Method selection: AIC
[Figure: damped trend vs. state-space models selected by AIC, average of all forecast horizons; data sets 111, 1,001, M3 Ann., M3 Qtr., and M3 Mon.; error measures MAPE and asymmetric MAPE]

Method selection: Empirical information criteria (EIC)
- Strategy: Penalize the likelihood by linear and nonlinear functions of the number of parameters (Billah et al., 2005)
- Evaluation: EIC is superior to other information criteria, but results are not benchmarked

Method selection: Operational benefits
- Forecasting determines inventory costs, service levels, and scheduling and staffing efficiency.
- Research is limited because a model of the operating system is needed to project performance measures.

Method selection: Operational benefits (cont.)
- Manufacturing (Adshead & Price, 1987)
  - Producer of industrial fasteners (£4 million annual sales)
  - Costs: holding, stockout, overtime
- U.S. Navy repair parts (Gardner, 1990)
  - 50,000 inventory items
  - Tradeoffs: Backorder delays vs.
    investment
  - Savings: $30 million (7%) in investment

Average delay in filling backorders
[Figure: backorder days (25 to 50) vs. inventory investment (370 to 430, in millions) for the random walk, linear trend, simple smoothing, and damped trend]

Inventory analysis: Packaging materials for snack-food manufacturer
[Figure: by month, $0 to $2,500,000; series: monthly usage, actual inventory from subjective forecasts, and target maximum inventory based on damped trend]

Method selection: Operational benefits (cont.)
- Electronics components (Flores et al., 1993)
  - 967 inventory items
  - Costs: holding cost vs. margin on lost sales
- RAF repair parts (Eaves & Kingsman, 2004)
  - 11,203 inventory items
  - Tradeoffs: inventory investment vs. stockouts
  - Savings: £285 million (14%) in investment

Forecasting for inventory control: Cumulative lead-time demand
- SSOE models yield standard deviations of cumulative lead-time demand (Snyder et al., 2004)
- Differences from traditional expressions (such as σ√(lead time)) are significant

Standard deviation multipliers, α = 0.30
[Figure: traditional vs. correct multipliers (0 to 5) for lead times 2 to 6]

Forecasting for inventory control: Cumulative lead-time demand (cont.)
- The parametric bootstrap (Snyder et al., 2002) can estimate variances for:
  - Any seasonal model
  - Non-normal demands
  - Intermittent demands
  - Stochastic lead times

Forecasting for inventory control: Intermittent demand
- Croston’s method (Croston, 1972)
  - Mean demand = Smoothed nonzero demand / Smoothed inter-arrival time
- Bias correction (Eaves & Kingsman, 2004; Syntetos & Boylan, 2001, 2005)
  - Mean demand × (1 - α/2)

Forecasting for inventory control: Intermittent demand (continued)
- There is no stochastic model for Croston’s method (Shenstone & Hyndman, 2005)
  - Many questionable variance expressions in the literature
- The state-space model for intermittent series requires a constant mean inter-arrival time (Snyder, 2002)
  - Why not aggregate the data to eliminate zeroes?
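The gap between the traditional and SSOE-based standard deviations of cumulative lead-time demand, noted in the lead-time slides above, can be sketched for one simple case. The sketch assumes simple exponential smoothing with additive errors in single-source-of-error form. Summing the error weights over a lead time L then gives a variance of σ² × L × [1 + α(L-1) + α²(L-1)(2L-1)/6], against σ² × L for the traditional expression. This is an illustration under that assumption and may not be the exact expression plotted in the slides; the function name is hypothetical.

```python
import math


def lead_time_sd_multipliers(alpha, lead_time):
    """Return (traditional, ssoe) multipliers of the one-step error
    standard deviation for cumulative demand over `lead_time` periods.

    Assumes simple exponential smoothing with additive errors (SSOE form).
    """
    L = lead_time
    # Traditional expression: errors treated as independent across periods.
    traditional = math.sqrt(L)
    # SSOE expression: each error also shifts the level for later periods,
    # so the error at step j carries weight 1 + alpha * (L - j).
    ssoe = math.sqrt(
        L * (1 + alpha * (L - 1) + alpha ** 2 * (L - 1) * (2 * L - 1) / 6)
    )
    return traditional, ssoe


# Multipliers for alpha = 0.30 and lead times 2 to 6.
table = {L: lead_time_sd_multipliers(0.30, L) for L in range(2, 7)}
```

At a lead time of 1 the two multipliers coincide, and the gap widens as the lead time or α grows, which is the qualitative pattern the slides describe as significant.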
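Croston’s method and the bias correction described above can be sketched in a few lines of Python. The sketch smooths the nonzero demand sizes and the inter-arrival times separately, updating both only in periods with demand, and optionally multiplies the ratio by (1 - α/2). The initialization (first nonzero demand and the interval leading up to it), the shared smoothing parameter, and the function name are assumptions made for illustration, not details taken from the slides.

```python
def croston(demand, alpha, bias_correct=False):
    """Mean demand per period by Croston's method.

    demand       -- sequence of per-period demands, mostly zeros
    alpha        -- smoothing parameter for both sizes and intervals
    bias_correct -- if True, apply the (1 - alpha / 2) correction factor
    """
    size = None        # smoothed nonzero demand
    interval = None    # smoothed inter-arrival time
    periods_since = 1  # periods since the last nonzero demand
    for y in demand:
        if y > 0:
            if size is None:
                # Assumed initialization from the first observed demand.
                size, interval = float(y), float(periods_since)
            else:
                size = alpha * y + (1 - alpha) * size
                interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0  # no demand observed at all
    forecast = size / interval
    if bias_correct:
        forecast *= 1 - alpha / 2
    return forecast


history = [0, 3, 0, 0, 6, 0]
plain = croston(history, alpha=0.5)
corrected = croston(history, alpha=0.5, bias_correct=True)
```

For this hypothetical history the smoothed size is 4.5 and the smoothed interval is 2.5, so the uncorrected estimate is about 1.8 units per period and the corrected one about 1.35.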
Progress in the state of the art, 1985-2005
- Analytical variances are available for most methods through SSOE models.
- Robust methods are available for multiplicative trends and adaptive simple smoothing.
- Croston’s method has been corrected for bias.
- Confusion about renormalization of seasonals has finally been resolved.
- There has been little progress in method selection.
- Much empirical work remains to be done.

Suggestions for research
- Refine the state-space framework
  - Add the damped multiplicative trend
  - Damp all trends immediately
  - Test alternative method selection procedures
- Validate and compare method selection procedures
  - Information criteria: benchmark the EIC
  - Discriminant analysis
  - Regression-based performance index

Suggestions for research (continued)
- Develop guidelines for the following choices:
  - Damped additive vs. damped multiplicative trend
  - Fixed vs. adaptive parameters in simple smoothing
  - Fixed vs. smoothed trend in additive trend model
  - Standard vs. state-space seasonal components
  - Additive vs. multiplicative errors
  - Analytical vs. empirical prediction intervals

Conclusion
“The challenge for future research is to establish some basis for choosing among these and other approaches to time series forecasting.” (Gardner, 1985)