Why the damped trend works
Everette S. Gardner, Jr.
Eddie McKenzie

Empirical performance of the damped trend
“The damped trend can reasonably claim to be a benchmark forecasting method for all others to beat.” (Fildes et al., JORS, 2008)
“The damped trend is a well established forecasting method that should improve accuracy in practical applications.” (Armstrong, IJF, 2006)

Why the damped trend works
Practice: Optimal parameters are often found at the boundaries of the [0, 1] interval. Thus fitting the damped trend is a means of automatic method selection from numerous special cases.
Theory: The damped trend and each of its special cases have an underlying random coefficient state space (RCSS) model that adapts to changes in trend.

The damped trend method
Recurrence form:
$\ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + \phi b_{t-1})$
$b_t = \beta(\ell_t - \ell_{t-1}) + (1-\beta)\phi b_{t-1}$
$\hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + \dots + \phi^h) b_t$
Error-correction form:
$\ell_t = \ell_{t-1} + \phi b_{t-1} + \alpha e_t$
$b_t = \phi b_{t-1} + \alpha\beta e_t$

Special case when φ = 1
1. Holt method
$\ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1})$
$b_t = \beta(\ell_t - \ell_{t-1}) + (1-\beta) b_{t-1}$
$\hat{y}_{t+h|t} = \ell_t + h b_t$

Special cases when β = 0
2. SES
3. SES with drift (Hyndman and Billah, IJF, 2003)
$0 < \alpha < 1,\ \phi = 1$:
$\ell_t = \alpha y_t + (1-\alpha)\ell_{t-1} + b$
$\hat{y}_{t+h|t} = \ell_t + h b$
4. SES with damped drift
$0 < \alpha < 1,\ 0 < \phi < 1$:
$\ell_t = \alpha y_t + (1-\alpha)\ell_{t-1} + b$
$\hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + \dots + \phi^h) b$

[Figure: Fit periods for M3 Annual series # YB067. Horizontal axis: Year (1 to 14). Vertical axis: 0 to 5,000.]

Special cases when β = 0, continued
5. Random walk
6. Random walk with drift
$\alpha = 1,\ \beta = 0$:
$\ell_t = y_t + b$
$\hat{y}_{t+h|t} = \ell_t + h b$
7. Random walk with damped drift
$\alpha = 1,\ 0 < \phi < 1$:
$\ell_t = y_t + b$
$\hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + \dots + \phi^h) b$

Special cases when α = β = 0
8. $\phi = 1$: Linear trend
9. $0 < \phi < 1$: Modified exponential trend
10. $\phi = 0$: Simple average

Fitting the damped trend to the M3 series
Multiplicative seasonal adjustment
Initial values for level and trend
  Local: Regression on first 5 observations
  Global: Regression on all fit data
Optimization (Minimum SSE)
  Parameters only
  Parameters and initial values (no significant difference from parameters only)

M3 mean symmetric APE (Horizons 1-18)
Makridakis & Hibon (2000), with backcasted initial values: 13.6%
Gardner & McKenzie (2010), with local initial values: 13.5%
Gardner & McKenzie (2010), with global initial values: 13.8%

Methods identified in the M3 time series
                         Initial values
Method                   Local     Global
Damped trend             43.0%     27.8%
Holt                     10.0       1.8
SES w/ damped drift      24.8      23.5
SES w/ drift              2.4      11.6
SES                       0.8       0.6
RW w/ damped drift        7.8       9.6
RW w/ drift               2.5       8.4
RW                        0.0       0.0
Modified exp. trend       8.3       8.7
Linear trend              0.1       7.9
Simple average            0.3       0.0

Components identified in the M3 time series
                         Initial values
Component                Local     Global
Damped trend             51.3%     36.5%
Damped drift             32.6      33.6
Trend                    10.1       9.7
Drift                     4.9      20.0
Constant level            1.2       0.6

Methods identified by type of data (Local initial values)
Method                   Ann.      Qtr.      Mon.
Damped trend             25.9%     47.1%     47.5%
Holt                     17.4      14.2       3.6
SES w/ damped drift      17.7      16.7      33.6
SES w/ drift              3.6       3.7       1.1
SES                       0.2       0.4       1.5
RW w/ damped drift       18.3       9.0       1.9
RW w/ drift               7.8       2.1       0.4
RW                        0.0       0.0       0.0
Modified exp. trend       9.1       6.0      10.1
Linear trend              0.2       0.1       0.1
Simple average            0.0       0.8       0.2

Rationale for the damped trend
Brown’s (1963) original thinking:
  Parameters are constant only within local segments of the time series
  Parameters often change from one segment to the next
  Change may be sudden or smooth
Such behavior can be captured by a random coefficient state space (RCSS) model
There is an underlying RCSS model for the damped trend and each of its special cases

SSOE state space models for the damped trend
Constant coefficient:
$y_t = \ell_{t-1} + \phi b_{t-1} + \varepsilon_t$
$\ell_t = \ell_{t-1} + \phi b_{t-1} + h_1 \varepsilon_t$
$b_t = \phi b_{t-1} + h_2 \varepsilon_t$
Random coefficient:
$y_t = \ell_{t-1} + A_t b_{t-1} + v_t$
$\ell_t = \ell_{t-1} + A_t b_{t-1} + h_1^* v_t$
$b_t = A_t b_{t-1} + h_2^* v_t$
$\{A_t\}$ are i.i.d. binary random variates
White noise innovation processes $\varepsilon$ and $v$ are different
Parameters $h$ and $h^*$ are related but usually different

Runs of linear trends in the RCSS model
$b_t = A_t b_{t-1} + h_2^* v_t$
With a strong linear trend, $\{A_t\}$ will consist of long runs of 1s with occasional 0s.
With a weak linear trend, $\{A_t\}$ will consist of long runs of 0s with occasional 1s.
In between, we get a mixture of models on shorter time scales, i.e. damping.

Advantages of the RCSS model
Allows both smooth and sudden changes in trend.
$P(A_t = 1) = \phi$ is a measure of the persistence of the linear trend. The mean run length is thus $\phi/(1-\phi)$.
RCSS prediction intervals are much wider than those of constant coefficient models.

Conclusions
Fitting the damped trend is actually a means of automatic method selection.
There is an underlying RCSS model for the damped trend and each of its special cases.
SES with damped drift was frequently identified in the M3 series and should receive some consideration in empirical research.

References
Paper and presentation available at: www.bauer.uh.edu/gardner
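
As a rough illustration of the recurrence and error-correction forms of the damped trend method described earlier, the sketch below runs both updates over a short series and computes the h-step forecast. It is a minimal sketch, not the authors' implementation. The function names are hypothetical, and the one-step error is assumed to be e_t = y_t - (l_{t-1} + phi * b_{t-1}). Initial level and trend are passed in by the caller rather than estimated by regression, and no parameter optimization is attempted.

```python
from typing import Sequence, Tuple


def damped_trend_recurrence(
    y: Sequence[float], alpha: float, beta: float, phi: float,
    level0: float, trend0: float,
) -> Tuple[float, float]:
    """Apply the recurrence form and return the final (level, trend)."""
    level, trend = level0, trend0
    for obs in y:
        prev_level = level
        # l_t = alpha*y_t + (1 - alpha)*(l_{t-1} + phi*b_{t-1})
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        # b_t = beta*(l_t - l_{t-1}) + (1 - beta)*phi*b_{t-1}
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    return level, trend


def damped_trend_error_correction(
    y: Sequence[float], alpha: float, beta: float, phi: float,
    level0: float, trend0: float,
) -> Tuple[float, float]:
    """Apply the error-correction form and return the final (level, trend)."""
    level, trend = level0, trend0
    for obs in y:
        # Assumed definition of the one-step-ahead error e_t.
        error = obs - (level + phi * trend)
        # Both right-hand sides use the previous period's level and trend.
        level, trend = (
            level + phi * trend + alpha * error,
            phi * trend + alpha * beta * error,
        )
    return level, trend


def damped_trend_forecast(level: float, trend: float, phi: float, h: int) -> float:
    """h-step forecast: l_t + (phi + phi^2 + ... + phi^h) * b_t."""
    damping_sum = sum(phi ** i for i in range(1, h + 1))
    return level + damping_sum * trend


if __name__ == "__main__":
    series = [10.0, 12.0, 13.0, 15.0, 16.0]  # made-up illustrative data
    level, trend = damped_trend_recurrence(series, 0.3, 0.2, 0.9, 9.0, 1.0)
    for horizon in (1, 6, 18):
        print(horizon, damped_trend_forecast(level, trend, 0.9, horizon))
```

Setting phi = 1 in the forecast function gives the Holt forecast l_t + h * b_t, and phi = 0 gives a flat forecast at the current level, which is how boundary parameter values select the special cases.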