Automatic algorithms for time series forecasting
Rob J Hyndman

Follow along using R
Requirements: install the fpp package and its dependencies.
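A minimal setup sketch (assuming R is installed and CRAN is reachable; fpp also loads the forecast package and the data sets used below):

# install once, then load
install.packages("fpp")
library(fpp)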
Motivation
1. Common in business to have over 1000 products that need forecasting at least monthly.
2. Forecasts are often required by people who are untrained in time series analysis.

Specifications
Automatic forecasting algorithms must:
- determine an appropriate time series model;
- estimate the parameters;
- compute the forecasts with prediction intervals.
Example: Asian sheep

[Figure: Numbers of sheep in Asia, 1960-2010; y-axis: millions of sheep. A second panel shows automatic ETS forecasts for the same series.]

library(fpp)
fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)
Example: Corticosteroid sales

[Figure: Monthly corticosteroid drug sales in Australia, 1995-2010; y-axis: total scripts (millions). A second panel shows forecasts from ARIMA(3,1,3)(0,1,1)[12].]

fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)
Outline
1 Forecasting competitions
2 Exponential smoothing
3 ARIMA modelling
4 Automatic nonlinear forecasting?
5 Time series with complex seasonality
6 Recent developments
Forecasting competitions
Makridakis and Hibon (1979)
This was the first large-scale empirical evaluation of time series forecasting methods. It was highly controversial at the time.

Difficulties:
- How to measure forecast accuracy?
- How to apply methods consistently and objectively?
- How to explain unexpected results?

Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If results showed that ARIMA models were not best, it must be because the analyst was unskilled.
Consequences of M&H (1979)
As a result of this paper, researchers started to:
- consider how to automate forecasting methods;
- study what methods give the best forecasts;
- be aware of the dangers of over-fitting;
- treat forecasting as a different problem from time series analysis.

Makridakis & Hibon followed up with a new competition in 1982:
- 1001 series
- Anyone could submit forecasts (avoiding the charge of incompetence)
- Multiple forecast measures used.
M-competition
Main findings:
1. Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones.
2. The relative ranking of the performance of the various methods varies according to the accuracy measure being used.
3. The accuracy when various methods are being combined outperforms, on average, the individual methods being combined and does very well in comparison to other methods.
4. The accuracy of the various methods depends upon the length of the forecasting horizon involved.
M3 competition
Makridakis and Hibon (2000)
"The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods. . . The extension involves the inclusion of more methods/researchers (in particular in the areas of neural networks and expert systems) and more series."

- 3003 series
- All data from business, demography, finance and economics.
- Series length between 14 and 126.
- Either non-seasonal, monthly or quarterly.
- All time series positive.

M&H claimed that the M3-competition supported the findings of their earlier work. However, the best performing methods were far from "simple".
Makridakis and Hibon (2000)
Best methods:

Theta
- A very confusing explanation.
- Shown by Hyndman and Billah (2003) to be the average of linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data.
- Later, the original authors claimed that their explanation was incorrect.

Forecast Pro
- A commercial software package with an unknown algorithm.
- Known to fit either exponential smoothing or ARIMA models using BIC.
M3 results (recalculated)

Method          MAPE   sMAPE  MASE
Theta           17.42  12.76  1.39
ForecastPro     18.00  13.06  1.47
ForecastX       17.35  13.09  1.42
Automatic ANN   17.18  13.98  1.53
B-J automatic   19.13  13.72  1.54

Notes:
- Calculations do not match the published paper.
- Some contestants apparently submitted multiple entries but only the best ones were published.
Exponential smoothing
Exponential smoothing methods

                               Seasonal Component
Trend Component                N (None)    A (Additive)    M (Multiplicative)
N  (None)                      N,N         N,A             N,M
A  (Additive)                  A,N         A,A             A,M
Ad (Additive damped)           Ad,N        Ad,A            Ad,M
M  (Multiplicative)            M,N         M,A             M,M
Md (Multiplicative damped)     Md,N        Md,A            Md,M

N,N:   Simple exponential smoothing
A,N:   Holt's linear method
Ad,N:  Additive damped trend method
M,N:   Exponential trend method
Md,N:  Multiplicative damped trend method
A,A:   Additive Holt-Winters' method
A,M:   Multiplicative Holt-Winters' method

- There are 15 separate exponential smoothing methods.
- Each can have an additive or multiplicative error, giving 30 separate models.
- Only 19 models are numerically stable.
- Multiplicative trend models give poor forecasts, leaving 15 models.

General notation: ETS(Error, Trend, Seasonal), for ExponenTial Smoothing.

Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt's linear method with additive errors
M,A,M: Multiplicative Holt-Winters' method with multiplicative errors

Innovations state space models
- All ETS models can be written in innovations state space form (IJF, 2002).
- Additive and multiplicative versions give the same point forecasts but different prediction intervals.
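The three-letter codes map directly onto the ets() function's model argument (a minimal sketch; the "AAN" and default "ZZZ" codes are from the forecast package's interface):

# fit a specific model: ETS(A,A,N), i.e. Holt's linear method
# with additive errors
fit <- ets(livestock, model = "AAN")
# model = "ZZZ" (the default) selects each component automatically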
ETS state space model

[Diagram: the state vector evolves as x_{t-1} -> x_t -> x_{t+1} -> ..., with each observation y_t generated from the previous state x_{t-1} and an innovation error ε_t.]

State space model: x_t = (level, slope, seasonal)

Estimation
- Compute the likelihood L from ε_1, ε_2, ..., ε_T.
- Optimize L with respect to the model parameters.

Q: How to choose between the 15 useful ETS models?
Cross-validation

[Diagram: three evaluation schemes. Traditional evaluation: one split into training data and test data. Standard cross-validation: observations held out at random as test sets. Time series cross-validation: a sequence of training sets, each extended by one observation, with forecasts evaluated at the time points immediately following each training set.]

Time series cross-validation is also known as "evaluation on a rolling forecast origin".
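A minimal sketch of evaluation on a rolling forecast origin (tsCV() is from recent versions of the forecast package, not from the original talk):

library(fpp)
# one-step-ahead errors, refitting simple exponential smoothing
# at each forecast origin
e <- tsCV(livestock, function(y, h) forecast(ets(y, model = "ANN"), h = h), h = 1)
mean(e^2, na.rm = TRUE)   # cross-validated one-step MSE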
Akaike's Information Criterion

AIC = -2 log(L) + 2k

where L is the likelihood and k is the number of estimated parameters in the model.

- This is a penalized likelihood approach.
- If L is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a constant, MSE is from one-step forecasts on the training set, and T is the length of the series.
- Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the MSE from one-step forecasts on the test set via time series cross-validation.
Corrected AIC
For small T, the AIC tends to over-fit. The bias-corrected version is

AICc = AIC + 2(k+1)(k+2)/(T-k).

- The CV-MSE is too time-consuming for most automatic forecasting purposes, and also requires large T.
- The AICc is asymptotically equivalent, can be used on small samples, and is very fast to compute.
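The correction is simple enough to compute by hand (a sketch; ets() and auto.arima() already report the AICc, so this helper is purely illustrative):

# aic = AIC value, k = number of estimated parameters,
# T = length of the series
aicc <- function(aic, k, T) aic + 2 * (k + 1) * (k + 2) / (T - k)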
ets algorithm in R
Based on Hyndman, Koehler, Snyder & Grose (IJF 2002):
- Apply each of the 15 models that are appropriate to the data. Optimize parameters and initial values using MLE.
- Select the best method using the AICc.
- Produce forecasts using the best method.
- Obtain prediction intervals using the underlying state space model.
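The outcome of the selection step can be inspected on the fitted object (a sketch; the method and aicc component names are assumptions about the forecast package's ets object):

fit <- ets(livestock)
fit$method   # the selected model, e.g. "ETS(M,A,N)"
fit$aicc     # the AICc value used to select it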
Exponential smoothing

[Figure: Forecasts from ETS(M,A,N) for the Asian sheep series, 1960-2010; y-axis: millions of sheep.]

fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ETS(M,N,M) for monthly corticosteroid drug sales, 1995-2010; y-axis: total scripts (millions).]

fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)
> fit
ETS(M,N,M)

Smoothing parameters:
  alpha = 0.4597
  gamma = 1e-04

Initial states:
  l = 0.4501
  s = 0.8628 0.8193 0.7648 0.7675 0.6946 1.2921
      1.3327 1.1833 1.1617 1.0899 1.0377 0.9937

sigma: 0.0675

       AIC       AICc        BIC
-115.69960 -113.47738  -69.24592
M3 comparisons

Method          MAPE   sMAPE  MASE
Theta           17.42  12.76  1.39
ForecastPro     18.00  13.06  1.47
ForecastX       17.35  13.09  1.42
Automatic ANN   17.18  13.98  1.53
B-J automatic   19.13  13.72  1.54
ETS             17.38  13.13  1.43
Further reading: www.OTexts.org/fpp
Exercise
1. Use ets to find the best ETS models for the following series: ibmclose, eggs, bricksq, hsales.
2. Try ets with cangas and lynx. What do you learn?
3. Can you find another series for which ets gives bad forecasts?
ARIMA modelling
ARIMA models

[Diagram: inputs y_{t-1}, y_{t-2}, y_{t-3} feed the output y_t, an autoregression (AR) model. Adding the lagged errors ε_{t-1}, ε_{t-2} as inputs gives an autoregression moving average (ARMA) model.]

ARIMA model
An autoregression moving average (ARMA) model applied to differences.

Estimation
- Compute the likelihood L from ε_1, ε_2, ..., ε_T.
- Use an optimization algorithm to maximize L.
Auto ARIMA

[Figure: Forecasts from ARIMA(0,1,0) with drift for the Asian sheep series, 1960-2010; y-axis: millions of sheep.]

fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)
[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12] for monthly corticosteroid drug sales, 1995-2010; y-axis: total scripts (millions).]

fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)
> fit
Series: h02
ARIMA(3,1,3)(0,1,1)[12]

Coefficients:
          ar1      ar2     ar3      ma1     ma2     ma3     sma1
      -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651

sigma^2 estimated as 0.002706: log likelihood=290.25
AIC=-564.5   AICc=-563.71   BIC=-538.48
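For comparison, the selected model can be refitted with its orders fixed (a sketch using the forecast package's Arima() wrapper; the seasonal period is taken from the ts frequency):

# fit the model auto.arima() chose for h02, orders fixed
fit2 <- Arima(h02, order = c(3, 1, 3), seasonal = c(0, 1, 1))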
How does auto.arima() work?

A non-seasonal ARIMA process:
φ(B)(1 − B)^d y_t = c + θ(B)ε_t

Need to select appropriate orders p, q, d, and whether to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select p, q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.

Algorithm choices are driven by forecast accuracy.

A seasonal ARIMA process:
Φ(B^m)φ(B)(1 − B)^d (1 − B^m)^D y_t = c + Θ(B^m)θ(B)ε_t

Need to select appropriate orders p, q, d, P, Q, D, and whether to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select D using the OCSB unit root test.
- Select p, q, P, Q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
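The unit root steps can be run directly (a sketch; ndiffs() and nsdiffs() are the forecast package functions that expose these tests):

ndiffs(h02, test = "kpss")    # number of first differences d
nsdiffs(h02, test = "ocsb")   # number of seasonal differences D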
M3 comparisons

Method          MAPE   sMAPE  MASE
Theta           17.42  12.76  1.39
ForecastPro     18.00  13.06  1.47
B-J automatic   19.13  13.72  1.54
ETS             17.38  13.13  1.43
AutoARIMA       19.12  13.85  1.47
Exercise
1. Use auto.arima to find the best ARIMA models for the following series: ibmclose, eggs, bricksq, hsales.
2. Try auto.arima with cangas and lynx. What do you learn?
3. Can you find a series for which auto.arima gives bad forecasts?
4. How would you compare the ETS and ARIMA results?
Automatic nonlinear forecasting?
Automatic nonlinear forecasting
- Automatic ANN in the M3 competition did poorly.
- Linear methods did best in the NN3 competition!
- Very few machine learning methods get published in the IJF because authors cannot demonstrate that their methods give better forecasts than linear benchmark methods, even on supposedly nonlinear data.
- Some good recent work by Kourentzes and Crone on automated ANN for time series.
- Watch this space!
Time series with complex seasonality
Examples

[Figure: US finished motor gasoline products, weekly, 1992-2004; y-axis: thousands of barrels per day.]

[Figure: Number of calls to a large American bank (7am-9pm), 5-minute intervals, 3 March to 12 May; y-axis: number of call arrivals.]

[Figure: Turkish electricity demand, daily, 2000-2008; y-axis: electricity demand (GW).]
TBATS model

TBATS:
- Trigonometric terms for seasonality
- Box-Cox transformations for heterogeneity
- ARMA errors for short-term dynamics
- Trend (possibly damped)
- Seasonal (including multiple and non-integer periods)

The automatic algorithm is described in A.M. De Livera, R.J. Hyndman and R.D. Snyder (2011), "Forecasting time series with complex seasonal patterns using exponential smoothing", Journal of the American Statistical Association 106(496), 1513–1527.
TBATS model

y_t = observation at time t

Box-Cox transformation:
y_t^(ω) = (y_t^ω − 1)/ω   if ω ≠ 0;
y_t^(ω) = log y_t          if ω = 0.

Measurement equation (M seasonal periods):
y_t^(ω) = ℓ_{t−1} + φ b_{t−1} + Σ_{i=1..M} s^(i)_{t−m_i} + d_t

Global and local trend:
ℓ_t = ℓ_{t−1} + φ b_{t−1} + α d_t
b_t = (1 − φ) b + φ b_{t−1} + β d_t

ARMA(p,q) error:
d_t = Σ_{i=1..p} φ_i d_{t−i} + Σ_{j=1..q} θ_j ε_{t−j} + ε_t

Fourier-like seasonal terms:
s^(i)_t = Σ_{j=1..k_i} s^(i)_{j,t}
s^(i)_{j,t}  =  s^(i)_{j,t−1} cos λ^(i)_j + s*^(i)_{j,t−1} sin λ^(i)_j + γ_1^(i) d_t
s*^(i)_{j,t} = −s^(i)_{j,t−1} sin λ^(i)_j + s*^(i)_{j,t−1} cos λ^(i)_j + γ_2^(i) d_t
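Multiple (and non-integer) seasonal periods are declared by building a multiple-seasonal time series first (a sketch with simulated data; msts() and tbats() are from the forecast package, and the periods 169 and 845 match the call centre example below):

# hypothetical 5-minute call arrivals: 169 observations per day
# (7am-9pm), 845 per 5-day week
calls <- msts(rpois(5 * 845, lambda = 200), seasonal.periods = c(169, 845))
fit <- tbats(calls)    # selects Box-Cox, trend, ARMA and Fourier orders
plot(forecast(fit))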
Examples

[Figure: Forecasts from TBATS(0.999, {2,2}, 1, {<52.1785714285714,8>}) for US finished motor gasoline products; y-axis: thousands of barrels per day.]

fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from TBATS(1, {3,1}, 0.987, {<169,5>, <845,3>}) for the call centre data; y-axis: number of call arrivals.]

fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from TBATS(0, {5,3}, 0.997, {<7,3>, <354.37,12>, <365.25,4>}) for Turkish electricity demand; y-axis: electricity demand (GW).]

fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)
Recent developments
Further competitions
1. 2011 tourism forecasting competition.
2. Kaggle and other forecasting platforms.
3. GEFCom 2012: point forecasting of electricity load and wind power.
4. GEFCom 2014: probabilistic forecasting of electricity load, electricity price, wind energy and solar energy.
Forecasts about forecasting
1. Automatic algorithms will become more general, handling a wide variety of time series.
2. Model selection methods will take account of multi-step forecast accuracy as well as one-step forecast accuracy.
3. Automatic forecasting algorithms for multivariate time series will be developed.
4. Automatic forecasting algorithms that include covariate information will be developed.