Why the Damped Trend Works

Everette S. Gardner, Jr.
Eddie McKenzie
Empirical performance of the damped trend
• “The damped trend can reasonably claim to be a benchmark forecasting method for all others to beat.” (Fildes et al., JORS, 2008)
• “The damped trend is a well established forecasting method that should improve accuracy in practical applications.” (Armstrong, IJF, 2006)
Why the damped trend works
• Practice: Optimal parameters are often found at the boundaries of the [0, 1] interval. Fitting the damped trend is thus a means of automatic method selection from numerous special cases.
• Theory: The damped trend and each of its special cases have an underlying random coefficient state space (RCSS) model that adapts to changes in trend.
The damped trend method
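Recurrence form
$$\ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1} + \phi b_{t-1})$$
$$b_t = \beta(\ell_t - \ell_{t-1}) + (1 - \beta)\phi b_{t-1}$$
$$\hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + \cdots + \phi^h) b_t$$
Error-correction form
$$\ell_t = \ell_{t-1} + \phi b_{t-1} + \alpha e_t$$
$$b_t = \phi b_{t-1} + \alpha\beta e_t$$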
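The recurrence form can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `damped_trend` and the arguments `level0` and `trend0` (the starting level and trend) are hypothetical, and the one-step error is taken to be the observation minus the forecast made one period earlier.

```python
def damped_trend(y, alpha, beta, phi, level0, trend0, horizon):
    """Run the damped trend recurrences over y, then forecast `horizon` steps.

    Returns (final level, final trend, one-step errors, forecasts).
    All names are illustrative; they do not come from the slides.
    """
    level, trend = level0, trend0
    errors = []
    for obs in y:
        one_step = level + phi * trend            # forecast made at t-1 for t
        errors.append(obs - one_step)             # e_t in the error-correction form
        new_level = alpha * obs + (1 - alpha) * (level + phi * trend)
        trend = beta * (new_level - level) + (1 - beta) * phi * trend
        level = new_level

    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h                      # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return level, trend, errors, forecasts
```

With `phi = 1` the accumulated sum equals `h`, so the forecasts reduce to the Holt form; with `phi < 1` the sum converges and the forecast path flattens out.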
Special case when φ = 1
1. Holt method
$$\ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1} + b_{t-1})$$
$$b_t = \beta(\ell_t - \ell_{t-1}) + (1 - \beta) b_{t-1}$$
$$\hat{y}_{t+h|t} = \ell_t + h b_t$$
Special cases when β = 0
2. SES
3. SES with drift (Hyndman and Billah, IJF, 2003)
$0 < \alpha < 1,\ \phi = 1$:
$$\ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1} + b$$
$$\hat{y}_{t+h|t} = \ell_t + h b$$
4. SES with damped drift
$0 < \alpha < 1,\ 0 < \phi < 1$:
$$\ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1} + \phi b$$
$$\hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + \cdots + \phi^h) b$$
[Figure: Fit periods for M3 Annual series # YB067. Horizontal axis: Year (1 to 14); vertical axis: 0 to 5,000.]
Special cases when β = 0, continued
5. Random walk
6. Random walk with drift
$\alpha = 1,\ \beta = 0$:
$$\ell_t = y_t + b$$
$$\hat{y}_{t+h|t} = \ell_t + h b$$
7. Random walk with damped drift
$\alpha = 1,\ 0 < \phi < 1$:
$$\ell_t = y_t + \phi b$$
$$\hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + \cdots + \phi^h) b$$
Special cases when α = β = 0
8. Linear trend: $\phi = 1$
9. Modified exponential trend: $0 < \phi < 1$
10. Simple average: $\phi = 0$
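The list of special cases amounts to a lookup on the fitted parameters, which can be sketched as below. The function `classify_method` and the tolerance `tol` are hypothetical. The slides give no explicit condition for plain SES or the plain random walk, so the sketch assumes they correspond to φ = 0 (no drift term in the forecast). Parameter combinations the slides do not list, such as β > 0 with φ = 0, are mapped by the same rules, which is also an assumption.

```python
def classify_method(alpha, beta, phi, tol=1e-6):
    """Name the special case implied by fitted (alpha, beta, phi).

    Illustrative only: boundary values are detected within `tol`.
    """
    def at(value, target):
        return abs(value - target) <= tol

    if at(phi, 0.0):
        damping = "none"        # trend/drift never reaches the forecast
    elif at(phi, 1.0):
        damping = "undamped"
    else:
        damping = "damped"

    if at(alpha, 0.0) and at(beta, 0.0):
        names = {
            "undamped": "Linear trend",
            "damped": "Modified exponential trend",
            "none": "Simple average",
        }
        return names[damping]

    if at(beta, 0.0) or damping == "none":
        base = "Random walk" if at(alpha, 1.0) else "SES"
        suffix = {
            "undamped": " with drift",
            "damped": " with damped drift",
            "none": "",
        }
        return base + suffix[damping]

    return "Holt" if damping == "undamped" else "Damped trend"
```

For example, `classify_method(0.3, 0.0, 0.8)` returns `"SES with damped drift"`.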
Fitting the damped trend to the M3 series
• Multiplicative seasonal adjustment
• Initial values for level and trend
  • Local: Regression on first 5 observations
  • Global: Regression on all fit data
• Optimization (Minimum SSE)
  • Parameters only
  • Parameters and initial values (no significant difference from parameters only)
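The "local initial values, parameters only" variant of this procedure might look like the sketch below. Several choices here are assumptions rather than details from the slides: seasonal adjustment is omitted, the regression uses time indices 1 to 5 with the intercept (fitted value at time 0) as the initial level, and a coarse grid search stands in for whatever optimizer the authors used. All function names are hypothetical.

```python
from itertools import product


def local_initial_values(y, n_obs=5):
    """OLS line through the first n_obs points (time index 1..n_obs).

    Returns (intercept, slope), used as the initial level and trend.
    """
    n = min(n_obs, len(y))
    t = range(1, n + 1)
    t_mean = sum(t) / n
    y_mean = sum(y[:n]) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def sse(y, alpha, beta, phi, level0, trend0):
    """Sum of squared one-step errors, using the error-correction form."""
    level, trend = level0, trend0
    total = 0.0
    for obs in y:
        forecast = level + phi * trend
        error = obs - forecast
        total += error * error
        level = forecast + alpha * error
        trend = phi * trend + alpha * beta * error
    return total


def fit_damped_trend(y, steps=10):
    """Grid search over [0, 1]^3 for (alpha, beta, phi) minimizing SSE."""
    level0, trend0 = local_initial_values(y)
    grid = [i / steps for i in range(steps + 1)]
    best = min(
        product(grid, repeat=3),
        key=lambda p: sse(y, p[0], p[1], p[2], level0, trend0),
    )
    return best, sse(y, *best, level0, trend0)
```

Because the grid includes the endpoints 0 and 1, a boundary optimum is returned exactly, which is what makes the fitted parameters usable for identifying a special case.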
M3 mean symmetric APE (Horizons 1-18)
Study                        Initial values    Mean symmetric APE
Makridakis & Hibon (2000)    Backcasted        13.6%
Gardner & McKenzie (2010)    Local             13.5%
Gardner & McKenzie (2010)    Global            13.8%
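For readers unfamiliar with the error measure, the sketch below computes a mean symmetric APE. It assumes the definition usually associated with the M3 competition, in which each absolute error is divided by the average of the actual and the forecast and expressed in percent. The slides do not restate the formula, so treat this as an assumption. The function name is hypothetical.

```python
def mean_symmetric_ape(actuals, forecasts):
    """Mean of 200 * |y - f| / (y + f), in percent.

    Assumes positive actuals and forecasts, as the denominator is y + f.
    """
    terms = [
        200.0 * abs(y - f) / (y + f)
        for y, f in zip(actuals, forecasts)
    ]
    return sum(terms) / len(terms)
```

Under this definition an over-forecast and an under-forecast of the same absolute size do not contribute equally, since the denominator changes with the forecast.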
Methods identified in the M3 time series
                         Initial values
Method                   Local     Global
Damped trend             43.0%     27.8%
Holt                     10.0       1.8
SES w/ damped drift      24.8      23.5
SES w/ drift              2.4      11.6
SES                       0.8       0.6
RW w/ damped drift        7.8       9.6
RW w/ drift               2.5       8.4
RW                        0.0       0.0
Modified exp. trend       8.3       8.7
Linear trend              0.1       7.9
Simple average            0.3       0.0
Components identified in the M3 time series
                         Initial values
Component                Local     Global
Damped trend             51.3%     36.5%
Damped drift             32.6      33.6
Trend                    10.1       9.7
Drift                     4.9      20.0
Constant level            1.2       0.6
Methods identified by type of data
(Local initial values)
Method                   Ann.      Qtr.      Mon.
Damped trend             25.9%     47.1%     47.5%
Holt                     17.4      14.2       3.6
SES w/ damped drift      17.7      16.7      33.6
SES w/ drift              3.6       3.7       1.1
SES                       0.2       0.4       1.5
RW w/ damped drift       18.3       9.0       1.9
RW w/ drift               7.8       2.1       0.4
RW                        0.0       0.0       0.0
Modified exp. trend       9.1       6.0      10.1
Linear trend              0.2       0.1       0.1
Simple average            0.0       0.8       0.2
Rationale for the damped trend
• Brown’s (1963) original thinking:
  • Parameters are constant only within local segments of the time series
  • Parameters often change from one segment to the next
  • Change may be sudden or smooth
• Such behavior can be captured by a random coefficient state space (RCSS) model
• There is an underlying RCSS model for the damped trend and each of its special cases
SSOE state space models for the damped trend
Constant coefficient
$$y_t = \ell_{t-1} + \phi b_{t-1} + \varepsilon_t$$
$$\ell_t = \ell_{t-1} + \phi b_{t-1} + h_1 \varepsilon_t$$
$$b_t = \phi b_{t-1} + h_2 \varepsilon_t$$
Random coefficient
$$y_t = \ell_{t-1} + A_t b_{t-1} + v_t$$
$$\ell_t = \ell_{t-1} + A_t b_{t-1} + h_1^{*} v_t$$
$$b_t = A_t b_{t-1} + h_2^{*} v_t$$
• $\{A_t\}$ are i.i.d. binary random variates
• White noise innovation processes $\varepsilon$ and $v$ are different
• Parameters $h$ and $h^{*}$ are related but usually different
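A simulation can make the random coefficient model concrete. The sketch below draws a path from the three random coefficient equations under stated assumptions: A_t equals 1 with probability `phi`, the innovations v_t are Gaussian (the slides only say white noise), and `h1` and `h2` play the roles of h*_1 and h*_2. The function name, arguments, and seed handling are hypothetical.

```python
import random


def simulate_rcss(n, phi, h1, h2, level0, trend0, sigma=1.0, seed=0):
    """Simulate n observations from the random coefficient model.

    Returns (observations, switches), where switches holds the A_t draws.
    """
    rng = random.Random(seed)
    level, trend = level0, trend0
    ys, switches = [], []
    for _ in range(n):
        a = 1 if rng.random() < phi else 0   # A_t: trend on (1) or off (0)
        v = rng.gauss(0.0, sigma)            # innovation v_t (Gaussian assumed)
        y = level + a * trend + v
        new_level = level + a * trend + h1 * v
        trend = a * trend + h2 * v           # a == 0 resets the trend to h2 * v
        level = new_level
        ys.append(y)
        switches.append(a)
    return ys, switches
```

Setting `phi=1.0` keeps the trend switched on in every period, while `phi=0.0` removes it entirely. Intermediate values mix the two regimes.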

Runs of linear trends in the RCSS model
$$b_t = A_t b_{t-1} + h_2^{*} v_t$$
• With a strong linear trend, $\{A_t\}$ will consist of long runs of 1s with occasional 0s.
• With a weak linear trend, $\{A_t\}$ will consist of long runs of 0s with occasional 1s.
• In between, we get a mixture of models on shorter time scales, i.e. damping.
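The contrast between strong and weak trends can be seen by extracting runs of 1s from simulated switch sequences. This is an illustrative sketch: the helper names, the sequence length, and the probabilities 0.95 and 0.05 are arbitrary choices, not values from the slides.

```python
import random


def runs_of_ones(switches):
    """Lengths of the maximal runs of consecutive 1s in a 0/1 sequence."""
    runs, current = [], 0
    for a in switches:
        if a == 1:
            current += 1
        elif current > 0:
            runs.append(current)
            current = 0
    if current > 0:
        runs.append(current)
    return runs


def draw_switches(n, p_one, seed=0):
    """Draw n i.i.d. binary values that equal 1 with probability p_one."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_one else 0 for _ in range(n)]


strong = runs_of_ones(draw_switches(2000, 0.95))  # mostly 1s: long trend runs
weak = runs_of_ones(draw_switches(2000, 0.05))    # mostly 0s: brief trend runs
print(sum(strong) / len(strong), sum(weak) / len(weak))
```

The first printed average should be much larger than the second, reflecting long stretches of linear trend in the strong case and isolated single-period trends in the weak case.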
Advantages of the RCSS model
• Allows both smooth and sudden changes in trend.
• $\phi$ is a measure of the persistence of the linear trend. The mean run length is thus $\phi/(1 - \phi)$ and $P(A_t = 1) = \phi$.
• RCSS prediction intervals are much wider than those of constant coefficient models.
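The mean run length can be checked with a short calculation. Assume the $A_t$ are i.i.d. with $P(A_t = 1) = \phi$, and let $N$ be the number of consecutive 1s observed before the first 0. This counting convention is an assumption, chosen because it reproduces the stated value.

$$P(N = k) = \phi^{k}(1 - \phi), \qquad k = 0, 1, 2, \ldots$$

$$E[N] = \sum_{k=0}^{\infty} k\,\phi^{k}(1 - \phi) = (1 - \phi)\,\frac{\phi}{(1 - \phi)^{2}} = \frac{\phi}{1 - \phi}$$

For example, $\phi = 0.9$ gives a mean run of 9 periods, while $\phi = 0.5$ gives 1 period. Because $A_t$ is independent of $b_{t-1}$, we also have $E[A_t b_{t-1} \mid b_{t-1}] = \phi\, b_{t-1}$, which is the damped term $\phi b_{t-1}$ of the constant coefficient model.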
Conclusions
• Fitting the damped trend is actually a means of automatic method selection.
• There is an underlying RCSS model for the damped trend and each of its special cases.
• SES with damped drift was frequently identified in the M3 series and should receive some consideration in empirical research.
References
Paper and presentation available at:
www.bauer.uh.edu/gardner