Time Series Forecasting: The Case for the
Single Source of Error State Space Model
J. Keith Ord, Georgetown University
Ralph D. Snyder, Monash University
Anne B. Koehler, Miami University
Rob J. Hyndman, Monash University
Mark Leeds, The Kellogg Group
http://www.buseco.monash.edu.au/depts/ebs/pubs/wpapers/2005
Outline of Talk
• Background
• General SSOE model
– Linear and nonlinear examples
– Estimation and model selection
• General linear state space model
– MSOE and SSOE forms
– Parameter spaces
– Convergence
– Equivalent models
– Explanatory variables
– ARCH and GARCH models
• Advantages of SSOE
Review Paper
A New Look at Models for Exponential
Smoothing (2001).
JRSS, Series D (The Statistician), 50, 147-159.
Chris Chatfield, Anne Koehler, Keith Ord
& Ralph Snyder
Framework Paper
A State Space Framework for Automatic
Forecasting Using Exponential
Smoothing (2002).
International Journal of Forecasting, 18, 439-454.
Rob Hyndman, Anne Koehler, Ralph Snyder
& Simone Grose
Some background
• The Kalman filter: Kalman (1960), Kalman &
Bucy (1961)
• Engineering: Jazwinski (1970), Anderson &
Moore (1979)
• Regression approach: Duncan and Horn (JASA,
1972)
• Bayesian Forecasting & Dynamic Linear Model:
Harrison & Stevens (1976, JRSS B); West &
Harrison (1997)
• Structural models: Harvey (1989)
• State Space Methods: Durbin & Koopman
(2001)
Single Source of Error (SSOE)
State Space Model
• Developed by Snyder (1985) among
others
• Also known as the Innovations
Representation
• Any Gaussian time series has an
innovations representation [SSOE looks
restrictive but it is not!]
Why a structural model?
• Structural models enable us to formulate
a model in terms of unobserved components
and to decompose the model in terms of
those components
• Structural models will enable us to
formulate schemes with non-linear error
structures, yet familiar forecast functions
General Framework: Notation
$y_t$: the observable process of interest; we set $I_t = \{y_t, y_{t-1}, \ldots, y_1\}$
$\mathbf{x}_t$: vector of unobservable state variables
$\varepsilon_t$: the unobservable random errors with mean 0 and variance $\sigma^2$
$\mathbf{m}_t$: vector of estimators for the state variables
Single Source of Error (SSOE)
State Space Model
$y_t = h(\mathbf{x}_{t-1}) + k(\mathbf{x}_{t-1})\,\varepsilon_t$
$\mathbf{x}_t = f(\mathbf{x}_{t-1}) + g(\mathbf{x}_{t-1}, \boldsymbol{\alpha})\,\varepsilon_t$
$\varepsilon_t \sim \mathrm{NID}(0, \sigma^2)$
$\mathbf{x}_t$ is a $k \times 1$ state vector
and $\boldsymbol{\alpha}$ is a $k \times 1$ vector of parameters
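As a concrete illustration (not part of the original slides), here is a minimal Python sketch of these recursions; the function names and defaults are my own, and $\boldsymbol{\alpha}$ is folded into g via a closure.

```python
import numpy as np

def simulate_ssoe(h, k, f, g, x0, n, sigma=1.0, seed=0):
    """Simulate y_t = h(x_{t-1}) + k(x_{t-1}) e_t,
    x_t = f(x_{t-1}) + g(x_{t-1}) e_t, with e_t ~ NID(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = x0
    y = np.empty(n)
    for t in range(n):
        e = rng.normal(0.0, sigma)   # the single source of error
        y[t] = h(x) + k(x) * e       # measurement equation
        x = f(x) + g(x) * e          # state transition equation
    return y

# Simple exponential smoothing as a special case:
# h(x) = x, k(x) = 1, f(x) = x, g(x) = alpha
y = simulate_ssoe(h=lambda x: x, k=lambda x: 1.0,
                  f=lambda x: x, g=lambda x: 0.3, x0=10.0, n=100)
```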
Simple Exponential Smoothing
(SES)
Measurement Equation
$y_t = \ell_{t-1} + \varepsilon_t$
State Equation
$\ell_t = \ell_{t-1} + \alpha\,\varepsilon_t$
$\ell_t$ is the level at time $t$
Another Form for State Equation
Measurement Equation
$y_t = \ell_{t-1} + \varepsilon_t$
State Equation
$\ell_t = \ell_{t-1} + \alpha\,(y_t - \ell_{t-1})$
or
$\ell_t = \alpha\,y_t + (1 - \alpha)\,\ell_{t-1}$
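A short Python sketch of this recursion (illustrative, not from the slides), returning the one-step errors and the level path:

```python
import numpy as np

def ses_filter(y, alpha, level0):
    """Apply l_t = l_{t-1} + alpha * (y_t - l_{t-1}) and collect
    the one-step-ahead errors e_t = y_t - l_{t-1}."""
    level = level0
    errors, levels = [], []
    for obs in y:
        e = obs - level               # one-step-ahead forecast error
        level = level + alpha * e     # same as alpha*obs + (1-alpha)*level
        errors.append(e)
        levels.append(level)
    return np.array(errors), np.array(levels)
```

All h-step-ahead point forecasts are then simply the final value of `levels` (cf. the point-forecast slide below).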
Reduced ARIMA Form
ARIMA(0,1,1):
$y_t = y_{t-1} + \varepsilon_t - (1 - \alpha)\,\varepsilon_{t-1}$
Another SES Model
Measurement Equation
$y_t = \ell_{t-1} + \ell_{t-1}\,\varepsilon_t$
State Equation
$\ell_t = \ell_{t-1} + \alpha\,\ell_{t-1}\,\varepsilon_t$
Same State Equation
for Second Model
$\varepsilon_t = \dfrac{y_t - \ell_{t-1}}{\ell_{t-1}}$
$\ell_t = \ell_{t-1} + \alpha\,\ell_{t-1}\,\dfrac{y_t - \ell_{t-1}}{\ell_{t-1}} = \ell_{t-1} + \alpha\,(y_t - \ell_{t-1})$
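A sketch of the multiplicative-error version (my own illustration): the level recursion is identical to the additive case, only the error definition changes.

```python
import numpy as np

def ses_multiplicative_filter(y, alpha, level0):
    """Multiplicative-error SES: e_t = (y_t - l_{t-1}) / l_{t-1},
    l_t = l_{t-1} + alpha * (y_t - l_{t-1})."""
    level = level0
    errors = []
    for obs in y:
        e = (obs - level) / level              # relative innovation
        level = level + alpha * (obs - level)  # same level recursion as before
        errors.append(e)
    return np.array(errors), level
```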
Reduced ARIMA Model for Second
SES Model
NONE
Point Forecasts for Both Models
$\hat{y}_{t+h} = \hat{\ell}_t$
$\hat{\ell}_t = \hat{\ell}_{t-1} + \hat{\alpha}\,(y_t - \hat{\ell}_{t-1})$
or
$\hat{\ell}_t = \hat{\alpha}\,y_t + (1 - \hat{\alpha})\,\hat{\ell}_{t-1}$
SSOE Model for Holt-Winters
Method
$y_t = (\ell_{t-1} + b_{t-1})\,s_{t-m} + (\ell_{t-1} + b_{t-1})\,s_{t-m}\,\varepsilon_t$
$\ell_t = (\ell_{t-1} + b_{t-1}) + \alpha\,(\ell_{t-1} + b_{t-1})\,\varepsilon_t$
$b_t = b_{t-1} + \beta\,(\ell_{t-1} + b_{t-1})\,\varepsilon_t$
$s_t = s_{t-m} + \gamma\,s_{t-m}\,\varepsilon_t$
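A Python sketch of these four recursions (illustrative; the initial-state handling is simplified and the names are my own):

```python
import numpy as np

def holt_winters_ssoe(y, m, alpha, beta, gamma, level0, trend0, seasonal0):
    """Filter the SSOE Holt-Winters recursions; seasonal0 holds the
    m initial seasonal indices s_{1-m}, ..., s_0."""
    level, trend = level0, trend0
    seasonals = list(seasonal0)
    errors = []
    for t, obs in enumerate(y):
        lb = level + trend            # l_{t-1} + b_{t-1}
        s = seasonals[t % m]          # s_{t-m}
        mu = lb * s                   # one-step-ahead forecast
        e = (obs - mu) / mu           # relative innovation epsilon_t
        level = lb + alpha * lb * e
        trend = trend + beta * lb * e
        seasonals[t % m] = s + gamma * s * e
        errors.append(e)
    return np.array(errors), level, trend, seasonals
```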
Likelihood, Exponential Smoothing,
and Estimation
Likelihood with fixed $\mathbf{x}_0$:
$L^*(\boldsymbol{\alpha}, \mathbf{x}_0) = n \log\!\left( \sum_{t=1}^{n} \varepsilon_t^2 \right) + 2 \sum_{t=1}^{n} \log\left| k(\mathbf{x}_{t-1}) \right|$
$\varepsilon_t = \dfrac{y_t - h(\mathbf{x}_{t-1})}{k(\mathbf{x}_{t-1})}$
$\mathbf{x}_t = f(\mathbf{x}_{t-1}) + g(\mathbf{x}_{t-1}) \left( \dfrac{y_t - h(\mathbf{x}_{t-1})}{k(\mathbf{x}_{t-1})} \right)$
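For SES, $k(\cdot) = 1$, so the criterion reduces to $n \log(\sum \varepsilon_t^2)$. A sketch of how one might minimize it over $(\alpha, \ell_0)$; the optimizer setup is my own assumption, not the authors' code:

```python
import numpy as np
from scipy.optimize import minimize

def ses_criterion(params, y):
    """Concentrated criterion L*(alpha, x0) = n*log(sum e_t^2) for SES;
    the 2*sum log|k| term vanishes because k(.) = 1."""
    alpha, level = params
    sse = 0.0
    for obs in y:
        e = obs - level
        sse += e * e
        level += alpha * e
    return len(y) * np.log(sse)

# Illustrative fit: alpha kept inside the SSOE/ARIMA region (0, 2)
y = 10 + np.cumsum(np.random.default_rng(0).normal(size=200))
res = minimize(ses_criterion, x0=[0.3, y[0]], args=(y,),
               bounds=[(1e-4, 2 - 1e-4), (None, None)])
```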
Model Selection
Akaike Information Criterion:
$\mathrm{AIC} = L^*(\hat{\boldsymbol{\alpha}}, \hat{\mathbf{x}}_0) + 2p$
$p$ is the number of free states plus
the number of parameters
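Continuing the estimation sketch above (illustrative): for SES, $p = 2$, one free state $\ell_0$ plus one parameter $\alpha$.

```python
def ssoe_aic(criterion_value, num_free_states, num_parameters):
    """AIC on the slide's scale: the minimized criterion L* plus 2p."""
    p = num_free_states + num_parameters
    return criterion_value + 2 * p

# e.g. aic = ssoe_aic(res.fun, num_free_states=1, num_parameters=1)
```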
General Linear State Space Model
$y_t = \mathbf{h}'\mathbf{x}_{t-1} + \varepsilon_t$
$\mathbf{x}_t = \mathbf{F}\mathbf{x}_{t-1} + \boldsymbol{\eta}_t$
$\begin{pmatrix} \varepsilon_t \\ \boldsymbol{\eta}_t \end{pmatrix} \sim \mathrm{NID}\!\left( \begin{pmatrix} 0 \\ \mathbf{0} \end{pmatrix}\!,\; \begin{pmatrix} \sigma^2 & \mathbf{V}_{\varepsilon\eta}' \\ \mathbf{V}_{\varepsilon\eta} & \mathbf{V}_{\eta} \end{pmatrix} \right)$
Special Cases
MSOE Model
$\mathbf{V}_{\varepsilon\eta} = \mathrm{Cov}(\varepsilon_t, \boldsymbol{\eta}_t) = \mathbf{0}$
$\mathbf{V}_{\eta}$ is diagonal, that is, $\mathrm{Cov}(\eta_{it}, \eta_{jt}) = 0$ for $i \neq j$
SSOE Model
$\boldsymbol{\eta}_t = \boldsymbol{\alpha}\,\varepsilon_t$
$\mathbf{V}_{\varepsilon\eta} = \mathrm{Cov}(\varepsilon_t, \boldsymbol{\eta}_t) = \boldsymbol{\alpha}\,\sigma^2$
$\mathbf{V}_{\eta} = \mathrm{Cov}(\boldsymbol{\eta}_t) = \boldsymbol{\alpha}\boldsymbol{\alpha}'\,\sigma^2$
Linear SSOE Model
$y_t = \mathbf{h}'\mathbf{x}_{t-1} + \varepsilon_t$
$\mathbf{x}_t = \mathbf{F}\mathbf{x}_{t-1} + \boldsymbol{\alpha}\,\varepsilon_t$
$\mathbf{h}$ is a $k \times 1$ vector
$\mathbf{F}$ is a $k \times k$ matrix
$\boldsymbol{\alpha}$ is a $k \times 1$ vector
SSOE for Holt’s Linear Trend
Exponential Smoothing
$y_t = \begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} \ell_{t-1} \\ b_{t-1} \end{pmatrix} + \varepsilon_t$
$\mathbf{x}_t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \ell_{t-1} \\ b_{t-1} \end{pmatrix} + \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \varepsilon_t$
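In matrix form this simulates directly; a minimal sketch (my own names and parameter values):

```python
import numpy as np

F = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # transition matrix F
h = np.array([1.0, 1.0])        # measurement vector h

def simulate_holt_ssoe(n, alpha, beta, x0, sigma=1.0, seed=0):
    """y_t = h'x_{t-1} + e_t,  x_t = F x_{t-1} + (alpha, beta)' e_t."""
    rng = np.random.default_rng(seed)
    a = np.array([alpha, beta])
    x = np.asarray(x0, dtype=float)
    y = np.empty(n)
    for t in range(n):
        e = rng.normal(0.0, sigma)
        y[t] = h @ x + e        # measurement equation
        x = F @ x + a * e       # state equation
    return y

y = simulate_holt_ssoe(n=100, alpha=0.5, beta=0.1, x0=[10.0, 0.5])
```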
MSOE Model for Holt’s Linear Trend
Exponential Smoothing
$y_t = \ell_{t-1} + b_{t-1} + \varepsilon_t$
$\ell_t = \ell_{t-1} + b_{t-1} + \eta_{1t}$
$b_t = b_{t-1} + \eta_{2t}$
Parameter Space 1
• Both correspond to the same ARIMA
model in the steady state BUT parameter
spaces differ
– SSOE has same space as ARIMA
– MSOE space is subset of ARIMA
• Example: for ARIMA(0,1,1), $\theta = 1 - \alpha$
– MSOE has $0 < \alpha < 1$
– SSOE has $0 < \alpha < 2$, equivalent to $-1 < \theta < 1$
Parameter space 2
• In general, ρ = 1 (SSOE), where ρ is the
correlation between the measurement and state
disturbances, yields the same parameter space
as ARIMA; ρ = 0 (MSOE) yields a smaller space.
• No other value of ρ yields a larger
parameter space than does ρ = 1
[Theorems 5.1 and 5.2]
• Restricted parameter spaces may lead to
poor model choices [e.g. Morley et al.,
2002]
Convergence of the
Covariance Matrix for Linear SSOE
In the Kalman filter, $\mathbf{C}_t \to \mathbf{0}$ as $t \to \infty$, where
$\mathbf{m}_t = E(\mathbf{x}_t \mid y_1, y_2, \ldots, y_t)$
$\mathbf{C}_t = \mathrm{Cov}(\mathbf{x}_t \mid y_1, y_2, \ldots, y_t) = E[(\mathbf{x}_t - \mathbf{m}_t)(\mathbf{x}_t - \mathbf{m}_t)' \mid y_1, y_2, \ldots, y_t]$
$\mathbf{m}_t = \mathbf{F}\mathbf{m}_{t-1} + \mathbf{a}_t\,(y_t - \mathbf{h}'\mathbf{m}_{t-1})$
Kalman gain:
$\mathbf{a}_t = (\boldsymbol{\alpha}\sigma^2 + \mathbf{F}\mathbf{C}_{t-1}\mathbf{h})(\sigma^2 + \mathbf{h}'\mathbf{C}_{t-1}\mathbf{h})^{-1}$
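To see the convergence numerically, here is a small Python sketch (my own illustration; the covariance update $\mathbf{C}_t = \mathrm{Var}(\mathbf{x}_t \mid I_{t-1}) - v_t\,\mathbf{a}_t\mathbf{a}_t'$ follows from the usual conditionally Gaussian argument, with the gain from the slide):

```python
import numpy as np

def kalman_cov_norms(F, h, a, sigma2, C0, n):
    """Iterate C_t for the linear SSOE model and return ||C_t|| at each step."""
    C = np.asarray(C0, dtype=float)
    norms = []
    for _ in range(n):
        v = sigma2 + h @ C @ h                     # innovation variance
        gain = (sigma2 * a + F @ C @ h) / v        # Kalman gain a_t
        P = F @ C @ F.T + sigma2 * np.outer(a, a)  # Var(x_t | I_{t-1})
        C = P - v * np.outer(gain, gain)           # Var(x_t | I_t)
        norms.append(np.linalg.norm(C))
    return np.array(norms)

# Holt's linear trend with invertible parameters: the norm shrinks to zero
F = np.array([[1.0, 1.0], [0.0, 1.0]])
h = np.array([1.0, 1.0])
print(kalman_cov_norms(F, h, np.array([0.5, 0.1]), 1.0, np.eye(2), 100)[-1])
```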
Convergence 2
• The practical import of this result is that,
provided t is not too small, we can
approximate the state variable by its
estimate
• That is, heuristic forecasting procedures,
such as exponential smoothing, that
generate forecast updates in a form like
the state equations, are validated.
Equivalence
• Equivalent linear state space models
(West and Harrison) will give rise to the
same forecast distribution.
• For the MSOE model the equivalence
transformation H of the state vector
typically produces a non-diagonal
covariance matrix.
• For the SSOE model the equivalence
transformation H preserves the perfect
correlation of the state vectors.
Explanatory Variables
$y_t = \mathbf{h}'\mathbf{x}_{t-1} + \mathbf{z}_t'\boldsymbol{\gamma} + \varepsilon_t$
$\mathbf{x}_t = \mathbf{F}\mathbf{x}_{t-1} + \boldsymbol{\alpha}\,\varepsilon_t$
SSOE can be put into a regression framework:
$\tilde{y}_t = \tilde{\mathbf{z}}_t' \begin{pmatrix} \mathbf{x}_0 \\ \boldsymbol{\gamma} \end{pmatrix} + \varepsilon_t$
$\tilde{y}_t$ is a function of $y_t$ and $\boldsymbol{\alpha}$
$\tilde{\mathbf{z}}_t$ is an augmented function of $\mathbf{z}_t$ and $\boldsymbol{\alpha}$
ARCH Effects
SSOE version of the ARCH(1) model:
$y_t = \mathbf{h}'\mathbf{x}_{t-1} + \varepsilon_t$
$\mathbf{x}_t = \mathbf{F}\mathbf{x}_{t-1} + \boldsymbol{\alpha}\,\varepsilon_t$
$\varepsilon_t = h_t^{1/2}\,\xi_t, \qquad \xi_t \sim N(0, 1)$
$h_t = \alpha_0 + \alpha_1\,\varepsilon_{t-1}^2$
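A simulation sketch (illustrative, not from the talk) combining the linear SSOE recursions with the ARCH(1) variance:

```python
import numpy as np

def simulate_ssoe_arch1(n, F, h, a, x0, a0, a1, seed=0):
    """Linear SSOE with ARCH(1) innovations:
    e_t = sqrt(h_t) xi_t,  h_t = a0 + a1 * e_{t-1}^2,  xi_t ~ N(0,1)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    y = np.empty(n)
    e_prev = 0.0
    for t in range(n):
        ht = a0 + a1 * e_prev ** 2      # conditional variance h_t
        e = np.sqrt(ht) * rng.normal()  # single innovation, ARCH variance
        y[t] = h @ x + e                # measurement equation
        x = F @ x + a * e               # state equation
        e_prev = e
    return y
```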
Advantages of SSOE Models
• Mapping from model to forecasting
equations is direct and easy to see
• ML estimation can be applied directly
without need for the Kalman updating
procedure
• Nonlinear models are readily incorporated
into the model framework
Further Advantages of
SSOE Models
• Akaike and Schwarz information criteria
can be used to choose models, including
choices among models with different
numbers of unit roots in the reduced form
• Largest parameter space among state
space models.
• In Kalman filter, the covariance matrix of
the state vector converges to 0.