Linear stochastic processes

Time series, Part 2
Linear stochastic processes

Linear time series (stochastic process)
A linear time series {X_t} is defined as

    X_t = μ + Σ_{i=−∞}^{∞} ψ_i Z_{t−i},   Z_t ~ WN(0, σ_Z²)

- {X_t} is stationary
- E[X_t] = μ
- γ_X(τ) = σ_Z² Σ_{i=−∞}^{∞} ψ_i ψ_{i+τ}
- We assume μ = 0
- If ψ_i = 0 for i < 0, X_t depends only on the present and past noise terms
X t  1 X t 1   2 X t 2 
| 
i 0
i
Zt
Xt

 Zt   i Z t i

autoregressive process AR(∞)
i 1
 X t  is invertible
1
Zt
 ( B)
moving average process MA(∞)
i 1
 Z t    i X t i  Z t
|
 ( B) X t  Z t  X t 
 ( B)
i
i
A linear time series is expressed as …
X t  Zt  1Zt 1  2 Zt 2 

Linear filter

Considering the lag operator B (B X_t = X_{t−1}):

    X_t = ψ(B) Z_t,   ψ(B) = Σ_i ψ_i B^i

With π(B) = ψ^{−1}(B), the stochastic component Z_t can be expressed in terms of
the current and past observations X_t:

    Z_t = π(B) X_t
Autoregressive processes

    X_t = π_1 X_{t−1} + π_2 X_{t−2} + … + Z_t = Σ_{i=1}^{∞} π_i X_{t−i} + Z_t   autoregression AR(∞)

We restrict the autoregression to the p most recent terms:

Autoregressive process of order p, AR(p)

    X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + … + φ_p X_{t−p} + Z_t,   Z_t ~ WN(0, σ_Z²)

    (1 − φ_1 B − φ_2 B² − … − φ_p B^p) X_t = Z_t,   i.e.   φ(B) X_t = Z_t

    φ(B) = 1 − φ_1 B − φ_2 B² − … − φ_p B^p   characteristic polynomial

Condition of stationarity:
The roots of φ(λ) = 0 must be outside the unit circle, or equivalently
the roots of λ^p − φ_1 λ^{p−1} − … − φ_{p−1} λ − φ_p = 0 must be inside the unit circle.
Autoregressive process of order one, AR(1)

    X_t = φ X_{t−1} + Z_t,   Z_t ~ WN(0, σ_Z²)

Stationarity condition: |φ| < 1

Successive backward substitutions:

    X_t = Z_t + φ Z_{t−1} + φ² Z_{t−2} + … = Σ_{i=0}^{∞} φ^i Z_{t−i}

    Var[X_t] = σ_Z² (1 + φ² + φ⁴ + …) = σ_Z² Σ_{i=0}^{∞} φ^{2i} = σ_Z² / (1 − φ²)

Autocorrelation (we assume stationarity):

    X_{t−1} X_t = X_{t−1} (φ X_{t−1} + Z_t)  →  E[X_{t−1} X_t] = φ E[X_{t−1} X_{t−1}] + E[X_{t−1} Z_t]
    →  γ_X(1) = φ σ_X²  →  ρ_X(1) = φ

    X_{t−τ} X_t = X_{t−τ} (φ X_{t−1} + Z_t)  →  γ_X(τ) = φ γ_X(τ−1)  →  ρ_X(τ) = φ^τ
()
1
0.5
0.5
()
()
()
1
0
-0.5
-1
0
0
-0.5
2
4

6
  0.8
8
10
-1
0
2
4

6
  0.8
8
10
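As a quick illustration (added here, not part of the slides), a minimal numpy sketch that simulates an AR(1) with Gaussian white noise and compares the sample autocorrelation r(τ) against the theoretical ρ(τ) = φ^τ; the function and parameter names are my own.

```python
import numpy as np

def simulate_ar1(phi, n, sigma_z=1.0, seed=0, burn_in=500):
    """Simulate X_t = phi * X_{t-1} + Z_t with Gaussian white noise Z_t."""
    rng = np.random.default_rng(seed)
    z = rng.normal(0.0, sigma_z, n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        x[t] = phi * x[t - 1] + z[t]
    return x[burn_in:]  # drop the transient so the sample is ~stationary

def sample_acf(x, max_lag):
    """Sample autocorrelation r(tau) = c(tau) / c(0)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    c0 = np.dot(x, x) / n
    return np.array([np.dot(x[: n - tau], x[tau:]) / n / c0
                     for tau in range(max_lag + 1)])

phi = 0.8
x = simulate_ar1(phi, n=20000)
r = sample_acf(x, max_lag=5)
theory = phi ** np.arange(6)  # rho(tau) = phi**tau
```

With a long series the sample ACF should track the geometric decay φ^τ closely.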
Autoregressive process of order two, AR(2)

    X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + Z_t,   Z_t ~ WN(0, σ_Z²)

Stationarity condition:
The roots of φ(B) = 1 − φ_1 B − φ_2 B² must be outside the unit circle, or
the roots of λ² − φ_1 λ − φ_2 = 0 must be inside the unit circle.

Roots:   λ_{1,2} = (φ_1 ± √(φ_1² + 4φ_2)) / 2

    |λ_{1,2}| < 1   ⇔   φ_2 + φ_1 < 1,   φ_2 − φ_1 < 1,   −1 < φ_2 < 1
[Figure: stationarity region of AR(2) in the (φ_1, φ_2) plane (a triangle over −2 < φ_1 < 2, −1 < φ_2 < 1):
 φ_1² + 4φ_2 > 0: two real distinct roots;  φ_1² + 4φ_2 = 0: one double real root;
 φ_1² + 4φ_2 < 0: complex conjugate roots]
Autocorrelation

[Figure: autocorrelation ρ_X(τ), τ = 0, …, 20, of AR(2) for eight root configurations:
 (α) λ_{1,2} = 0.8 ± 0.5i   (φ_1 = 1.6,  φ_2 = −0.89)
 (β) λ_{1,2} = −0.8 ± 0.5i  (φ_1 = −1.6, φ_2 = −0.89)
 (γ) λ_1 = λ_2 = 0.8        (φ_1 = 1.6,  φ_2 = −0.64)
 (δ) λ_1 = λ_2 = −0.8       (φ_1 = −1.6, φ_2 = −0.64)
 (ε) λ_1 = 0.8,  λ_2 = 0.95   (φ_1 = 1.75,  φ_2 = −0.76)
 (στ) λ_1 = 0.8, λ_2 = −0.95  (φ_1 = −0.15, φ_2 = 0.76)
 (ζ) λ_1 = −0.8, λ_2 = 0.95   (φ_1 = 0.15,  φ_2 = 0.76)
 (η) λ_1 = −0.8, λ_2 = −0.95  (φ_1 = −1.75, φ_2 = −0.76)]
Autoregressive process of order two, AR(2)

Autocorrelation (we assume stationarity):

    X_{t−1} X_t = X_{t−1} (φ_1 X_{t−1} + φ_2 X_{t−2} + Z_t)  →
    E[X_{t−1} X_t] = φ_1 E[X_{t−1} X_{t−1}] + φ_2 E[X_{t−1} X_{t−2}] + E[X_{t−1} Z_t]  →
    γ_X(1) = φ_1 σ_X² + φ_2 γ_X(1)  →  ρ_1 = φ_1 + φ_2 ρ_1

    X_{t−2} X_t = X_{t−2} (φ_1 X_{t−1} + φ_2 X_{t−2} + Z_t)  →
    E[X_{t−2} X_t] = φ_1 E[X_{t−2} X_{t−1}] + φ_2 E[X_{t−2} X_{t−2}] + E[X_{t−2} Z_t]  →
    γ_X(2) = φ_1 γ_X(1) + φ_2 σ_X²  →  ρ_2 = φ_1 ρ_1 + φ_2

For lag τ:

    γ_X(τ) = φ_1 γ_X(τ−1) + φ_2 γ_X(τ−2)  →  ρ_τ = φ_1 ρ_{τ−1} + φ_2 ρ_{τ−2}

so ρ_τ can be computed recursively, starting from

    ρ_1 = φ_1 / (1 − φ_2),    ρ_2 = φ_2 + φ_1² / (1 − φ_2)

Since (1 − φ_1 B − φ_2 B²) ρ_τ = 0 (the characteristic polynomial applied to ρ_τ):
real roots: exponential decay; complex roots: decaying harmonic function.

Variance:

    X_t X_t = X_t (φ_1 X_{t−1} + φ_2 X_{t−2} + Z_t)  →  σ_X² = φ_1 γ_X(1) + φ_2 γ_X(2) + σ_Z²  →

    σ_X² = σ_Z² / (1 − φ_1 ρ_1 − φ_2 ρ_2)
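The recursion above is easy to run numerically. A small sketch (my own helper, not from the slides) that computes the AR(2) autocorrelation from ρ_1 = φ_1/(1−φ_2) and ρ_τ = φ_1 ρ_{τ−1} + φ_2 ρ_{τ−2}:

```python
def ar2_acf(phi1, phi2, max_lag):
    """Autocorrelation of a stationary AR(2), computed recursively:
    rho_tau = phi1*rho_{tau-1} + phi2*rho_{tau-2}, rho_0 = 1, rho_1 = phi1/(1-phi2)."""
    rho = [1.0, phi1 / (1.0 - phi2)]
    for tau in range(2, max_lag + 1):
        rho.append(phi1 * rho[tau - 1] + phi2 * rho[tau - 2])
    return rho[: max_lag + 1]

# e.g. panel (a) of the figure: complex roots 0.8 +/- 0.5i -> decaying oscillation
rho_complex = ar2_acf(phi1=1.6, phi2=-0.89, max_lag=20)
```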
Autoregressive process of order p, AR(p)

    X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + … + φ_p X_{t−p} + Z_t,   Z_t ~ WN(0, σ_Z²)

    (1 − φ_1 B − φ_2 B² − … − φ_p B^p) X_t = Z_t

Stationarity condition:
The roots of φ(B) = 1 − φ_1 B − φ_2 B² − … − φ_p B^p must be outside the unit circle.

Autocorrelation (we assume stationarity). For lag τ:

    X_{t−τ} X_t = X_{t−τ} (φ_1 X_{t−1} + φ_2 X_{t−2} + … + φ_p X_{t−p} + Z_t)  →
    E[X_{t−τ} X_t] = φ_1 E[X_{t−τ} X_{t−1}] + … + φ_p E[X_{t−τ} X_{t−p}] + E[X_{t−τ} Z_t]  →
    γ_X(τ) = φ_1 γ_X(τ−1) + φ_2 γ_X(τ−2) + … + φ_p γ_X(τ−p)  →
    ρ_τ = φ_1 ρ_{τ−1} + φ_2 ρ_{τ−2} + … + φ_p ρ_{τ−p},   i.e.   φ(B) ρ_τ = 0

real roots: exponential decay; complex roots: decaying harmonic function.
Autoregressive process of order p, AR(p)

For τ = 1, 2, …, p the recursion ρ_τ = φ_1 ρ_{τ−1} + φ_2 ρ_{τ−2} + … + φ_p ρ_{τ−p} gives

    ρ_1 = φ_1 + φ_2 ρ_1 + … + φ_p ρ_{p−1}
    ρ_2 = φ_1 ρ_1 + φ_2 + … + φ_p ρ_{p−2}
    …
    ρ_p = φ_1 ρ_{p−1} + φ_2 ρ_{p−2} + … + φ_p

or in matrix form (the Yule-Walker equations):

    [ρ_1]   [1       ρ_1     ρ_2     …  ρ_{p−1}] [φ_1]
    [ρ_2] = [ρ_1     1       ρ_1     …  ρ_{p−2}] [φ_2]
    [ ⋮ ]   [ ⋮                           ⋮   ] [ ⋮ ]
    [ρ_p]   [ρ_{p−1} ρ_{p−2} ρ_{p−3} …  1     ] [φ_p]

    ρ_p = R_p φ
Variance

    X_t X_t = X_t (φ_1 X_{t−1} + φ_2 X_{t−2} + … + φ_p X_{t−p} + Z_t)  →
    σ_X² = φ_1 γ_X(1) + φ_2 γ_X(2) + … + φ_p γ_X(p) + σ_Z²  →

    σ_X² = σ_Z² / (1 − φ_1 ρ_1 − φ_2 ρ_2 − … − φ_p ρ_p)
Partial autocorrelation

For each k we solve the Yule-Walker equations of order k,

    [ρ_1]   [1       ρ_1     …  ρ_{k−1}] [φ_{k,1}]
    [ ⋮ ] = [ ⋮                   ⋮   ] [  ⋮   ]
    [ρ_k]   [ρ_{k−1} ρ_{k−2} …  1     ] [φ_{k,k}]

and keep the last coefficient φ_k = φ_{k,k}, the partial autocorrelation for lag (order) k.
It is given by a ratio of determinants, φ_{k,k} = det(R_k*) / det(R_k), where R_k* is R_k
with its last column replaced by (ρ_1, …, ρ_k)ᵀ:

    φ_{1,1} = ρ_1

    φ_{2,2} = (ρ_2 − ρ_1²) / (1 − ρ_1²)

    φ_{3,3} = det[ 1  ρ_1  ρ_1 ; ρ_1  1  ρ_2 ; ρ_2  ρ_1  ρ_3 ] / det[ 1  ρ_1  ρ_2 ; ρ_1  1  ρ_1 ; ρ_2  ρ_1  1 ]

Recursive algorithm of Durbin-Levinson:
the coefficients of AR(p), φ_{p,1}, φ_{p,2}, …, φ_{p,p}, are computed recursively, and for
each order k the coefficients are computed from the coefficients of order k−1.
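The Durbin-Levinson recursion can be sketched in a few lines (a minimal illustration, assuming the standard recursion φ_{k,k} = (ρ_k − Σ_j φ_{k−1,j} ρ_{k−j}) / (1 − Σ_j φ_{k−1,j} ρ_j) and φ_{k,j} = φ_{k−1,j} − φ_{k,k} φ_{k−1,k−j}):

```python
def durbin_levinson(rho):
    """Partial autocorrelations phi_{k,k}, k = 1..K, from rho_1..rho_K."""
    K = len(rho)
    pacf = []
    phi_prev = []                      # coefficients phi_{k-1,1..k-1} of the previous order
    for k in range(1, K + 1):
        if k == 1:
            phi_kk = rho[0]
        else:
            num = rho[k - 1] - sum(phi_prev[j] * rho[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j] for j in range(k - 1))
            phi_kk = num / den
        # update all coefficients of order k from those of order k-1
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf
```

For an AR(1) with ρ_τ = 0.8^τ the recursion gives φ_{1,1} = 0.8 and φ_{k,k} = 0 for k > 1, as expected.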
Partial autocorrelation

[Figure: partial autocorrelation φ_{τ,τ}, τ = 0, …, 20, of AR(2) for eight root configurations:
 (α) λ_{1,2} = 0.8 ± 0.5i, (β) λ_{1,2} = −0.8 ± 0.5i, (γ) λ_1 = λ_2 = 0.8, (δ) λ_1 = λ_2 = −0.8,
 (ε) (λ_1, λ_2) = (0.8, 0.95), (στ) (0.8, −0.95), (ζ) (−0.8, 0.95), (η) (−0.8, −0.95);
 in all panels φ_{τ,τ} = 0 for τ > 2]
Moving average processes

    X_t = Z_t + ψ_1 Z_{t−1} + ψ_2 Z_{t−2} + … = Z_t + Σ_{i=1}^{∞} ψ_i Z_{t−i}   moving average MA(∞)

We constrain the white noise terms to the q most recent terms (ψ_i → θ_i):

Moving average process of order q, MA(q)

    X_t = Z_t + θ_1 Z_{t−1} + θ_2 Z_{t−2} + … + θ_q Z_{t−q},   Z_t ~ WN(0, σ_Z²)

    X_t = (1 + θ_1 B + θ_2 B² + … + θ_q B^q) Z_t,   i.e.   X_t = θ(B) Z_t

    θ(B) = 1 + θ_1 B + θ_2 B² + … + θ_q B^q   characteristic polynomial

MA(q) is always stationary.
MA(q) is invertible if Z_t = θ^{−1}(B) X_t exists.
Invertibility condition: the roots of θ(λ) = 0 must be outside the unit circle.
Moving average process of order one, MA(1)

    X_t = Z_t + θ Z_{t−1},   Z_t ~ WN(0, σ_Z²)

Invertibility condition: |θ| < 1

    X_t X_t = (Z_t + θ Z_{t−1})(Z_t + θ Z_{t−1})  →  σ_X² = (1 + θ²) σ_Z²
    X_{t−1} X_t = (Z_{t−1} + θ Z_{t−2})(Z_t + θ Z_{t−1})  →  γ_X(1) = θ σ_Z²
    X_{t−2} X_t = (Z_{t−2} + θ Z_{t−3})(Z_t + θ Z_{t−1})  →  γ_X(2) = 0

    ρ_τ = θ / (1 + θ²)   for τ = 1,    ρ_τ = 0   for τ ≥ 2      (so |ρ_1| ≤ 1/2)

Example:  X_t = Z_t + 0.4 Z_{t−1}  and  X_t = Z_t + 2.5 Z_{t−1}  both give  ρ_1 = 1/2.9

For one ρ_1 there are two solutions for θ, and only one satisfies the invertibility condition:
X_t = Z_t + θ Z_{t−1} and X_t = Z_t + (1/θ) Z_{t−1} have the same autocorrelation.
If the root of 1 + θB = 0 is outside the unit circle, the root of 1 + (1/θ)B = 0 is inside
the unit circle.
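A short sketch of this (my own helper, not from the slides): solving ρ_1 θ² − θ + ρ_1 = 0 gives the two candidate θ values, whose product is 1, and exactly one of them is invertible.

```python
import math

def ma1_theta_from_rho1(rho1):
    """Both roots of rho1*theta^2 - theta + rho1 = 0 and the invertible one (|theta| < 1).

    Requires |rho1| < 0.5, the attainable range for an MA(1) lag-1 autocorrelation."""
    if abs(rho1) >= 0.5:
        raise ValueError("MA(1) requires |rho1| < 0.5 for two distinct real solutions")
    disc = math.sqrt(1.0 - 4.0 * rho1 ** 2)
    t1 = (1.0 - disc) / (2.0 * rho1)
    t2 = (1.0 + disc) / (2.0 * rho1)
    invertible = t1 if abs(t1) < 1 else t2
    return t1, t2, invertible

# rho_1 = 1/2.9 (the example above) gives theta = 0.4 and theta = 2.5
t1, t2, theta = ma1_theta_from_rho1(10.0 / 29.0)
```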
Moving average process of order one, MA(1)

Partial autocorrelation:

    φ_{1,1} = ρ_1 = θ / (1 + θ²)

    φ_{2,2} = −θ² / (1 + θ² + θ⁴)

    φ_{3,3} = θ³ / (1 + θ² + θ⁴ + θ⁶)

    φ_{τ,τ} = −(−θ)^τ (1 − θ²) / (1 − θ^{2(τ+1)})

[Figure: ρ_τ and φ_{τ,τ} of MA(1) for θ = 0.8 and θ = −0.8, τ = 0, …, 10]

- ρ_τ of MA(1) cuts off (for τ > 1) the same way as φ_{τ,τ} of AR(1) does
- φ_{τ,τ} of MA(1) decays the same way as ρ_τ of AR(1)
- … but for MA(1), |ρ_τ| and |φ_{τ,τ}| are always ≤ 0.5
Moving average process of order two, MA(2)

    X_t = θ(B) Z_t = Z_t + θ_1 Z_{t−1} + θ_2 Z_{t−2},   θ(B) = 1 + θ_1 B + θ_2 B²,   Z_t ~ WN(0, σ_Z²)

θ(B): characteristic polynomial.
MA(2) is always stationary.
MA(2) is invertible if the roots of θ(B) are outside the unit circle.

Variance:

    σ_X² = (1 + θ_1² + θ_2²) σ_Z²

Autocorrelation:

    ρ_1 = θ_1 (1 + θ_2) / (1 + θ_1² + θ_2²),   ρ_2 = θ_2 / (1 + θ_1² + θ_2²),   ρ_τ = 0 for τ > 2

Partial autocorrelation:

    φ_{1,1} = ρ_1

    φ_{2,2} = (ρ_2 − ρ_1²) / (1 − ρ_1²)

    φ_{3,3} = (ρ_1³ − ρ_1 ρ_2 (2 − ρ_2)) / (1 − ρ_2² − 2ρ_1² (1 − ρ_2))

    φ_{4,4}, …: complicated expressions
[Figure: autocorrelation ρ_τ and partial autocorrelation φ_{τ,τ}, τ = 0, …, 20, of MA(2)
for the root configurations λ_{1,2} = 0.8 ± 0.5i, λ_{1,2} = −0.8 ± 0.5i,
(λ_1, λ_2) = (0.8, 0.95) and (λ_1, λ_2) = (0.8, −0.95)]

- ρ_τ of MA(2) cuts off (for τ > 2) the same way as φ_{τ,τ} of AR(2) does
- φ_{τ,τ} of MA(2) decays the same way as ρ_τ of AR(2)
- … but for MA(2), |ρ_τ| and |φ_{τ,τ}| are always ≤ 0.5
Moving average process of order q, MA(q)

    X_t = θ(B) Z_t = Z_t + θ_1 Z_{t−1} + θ_2 Z_{t−2} + … + θ_q Z_{t−q},   Z_t ~ WN(0, σ_Z²)

    θ(B) = 1 + θ_1 B + θ_2 B² + … + θ_q B^q   characteristic polynomial

Variance:

    σ_X² = (1 + θ_1² + … + θ_q²) σ_Z²

Autocovariance:

    γ_τ = σ_Z² (θ_τ + θ_1 θ_{τ+1} + … + θ_{q−τ} θ_q)   for τ = 1, 2, …, q;    γ_τ = 0   for τ > q

Autocorrelation:

    ρ_τ = (θ_τ + θ_1 θ_{τ+1} + … + θ_{q−τ} θ_q) / (1 + θ_1² + θ_2² + … + θ_q²)   for τ = 1, 2, …, q;    ρ_τ = 0   for τ > q

The partial autocorrelation decays in a way that is determined by the roots of the
characteristic polynomial. The expressions of φ_{τ,τ} in terms of the coefficients
θ_1, θ_2, …, θ_q are complicated.
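The MA(q) autocovariance sum above translates directly into code. A minimal sketch (my own helper) with θ_0 = 1:

```python
def maq_acf(theta, max_lag):
    """Autocorrelation of MA(q): X_t = Z_t + theta_1 Z_{t-1} + ... + theta_q Z_{t-q}."""
    psi = [1.0] + list(theta)               # psi_0 = 1, psi_i = theta_i
    q = len(theta)
    gamma0 = sum(c * c for c in psi)        # gamma_0 / sigma_Z^2
    rho = [1.0]
    for tau in range(1, max_lag + 1):
        if tau > q:
            rho.append(0.0)                 # the ACF cuts off after lag q
        else:
            rho.append(sum(psi[i] * psi[i + tau] for i in range(q - tau + 1)) / gamma0)
    return rho
```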
Relation between AR and MA processes

Autoregressive process of order p, AR(p):

    X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + … + φ_p X_{t−p} + Z_t,   φ(B) X_t = Z_t,   Z_t ~ WN(0, σ_Z²)

Moving average process of order q, MA(q):

    X_t = Z_t + θ_1 Z_{t−1} + θ_2 Z_{t−2} + … + θ_q Z_{t−q},   X_t = θ(B) Z_t

AR(p) stationary:  X_t = φ^{−1}(B) Z_t = ψ(B) Z_t with ψ(B) = 1 + ψ_1 B + ψ_2 B² + …
such that φ(B) ψ(B) = 1, i.e. an MA(∞) process:   AR(p) ↔ MA(∞)

Wold's decomposition: every covariance-stationary time series can be written as an
infinite moving average (MA(∞)) process of its innovation process.

MA(q) invertible:  θ^{−1}(B) X_t = π(B) X_t = Z_t with π(B) = 1 + π_1 B + π_2 B² + …
such that θ(B) π(B) = 1, i.e. an AR(∞) process:   MA(q) ↔ AR(∞)

AR(p) and MA(q) have a dual relation.
The autocorrelation and partial autocorrelation of AR(p) and MA(q) also have a dual relation:
AR(p): ρ_τ decays exponentially to 0, φ_{τ,τ} becomes zero for τ > p
MA(q): φ_{τ,τ} decays exponentially to 0, ρ_τ becomes zero for τ > q
Autoregressive moving average process ARMA(p,q)

Combining an autoregressive part of order p with a moving average part of order q:

    X_t = φ_1 X_{t−1} + … + φ_p X_{t−p} + Z_t + θ_1 Z_{t−1} + … + θ_q Z_{t−q},   Z_t ~ WN(0, σ_Z²)

    φ(B) X_t = θ(B) Z_t,   X_t = (θ(B)/φ(B)) Z_t,   (φ(B)/θ(B)) X_t = Z_t

Stationarity is determined by the AR part.
Invertibility is determined by the MA part.

Autocorrelation:

    X_{t−τ} X_t = X_{t−τ} (φ_1 X_{t−1} + … + φ_p X_{t−p} + Z_t + θ_1 Z_{t−1} + … + θ_q Z_{t−q})  →
    γ_X(τ) = φ_1 γ_X(τ−1) + … + φ_p γ_X(τ−p) + E[X_{t−τ} Z_t] + θ_1 E[X_{t−τ} Z_{t−1}] + … + θ_q E[X_{t−τ} Z_{t−q}]

For τ ≤ q:  ρ_τ mixes the autocorrelation patterns of AR(p) and MA(q).
For τ > q:  ρ_τ = φ_1 ρ_{τ−1} + … + φ_p ρ_{τ−p},  as for AR(p).
Process ARMA(1,1)

    X_t = φ X_{t−1} + Z_t + θ Z_{t−1},   (1 − φB) X_t = (1 + θB) Z_t,   X_t = ((1 + θB)/(1 − φB)) Z_t

Stationarity condition: |φ| < 1.   Invertibility condition: |θ| < 1.

Autocorrelation:

    X_{t−τ} X_t = X_{t−τ} (φ X_{t−1} + Z_t + θ Z_{t−1})  →  γ_τ = φ γ_{τ−1} + E[X_{t−τ} Z_t] + θ E[X_{t−τ} Z_{t−1}]

    τ = 0:  σ_X² = γ_0 = φ γ_1 + σ_Z² + θ (φ + θ) σ_Z²  →  σ_X² = ((1 + θ² + 2φθ)/(1 − φ²)) σ_Z²

    τ = 1:  γ_1 = φ γ_0 + θ σ_Z²  →  ρ_1 = (φ + θ)(1 + φθ) / (1 + θ² + 2φθ)

    τ ≥ 2:  ρ_τ = φ ρ_{τ−1},  as for AR(1)

Partial autocorrelation φ_{τ,τ}: decays with the lag, as for MA(1).

An ARMA(p,q) process with small p, q exhibits a correlation pattern (ρ_τ and φ_{τ,τ})
that can be attained by AR(p) only for large order p, or by MA(q) only for large order q.
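The ARMA(1,1) autocorrelation above can be sketched as follows (a small illustration with my own helper name):

```python
def arma11_acf(phi, theta, max_lag):
    """Autocorrelation of ARMA(1,1): X_t = phi X_{t-1} + Z_t + theta Z_{t-1}, |phi| < 1."""
    denom = 1.0 + theta ** 2 + 2.0 * phi * theta
    rho1 = (phi + theta) * (1.0 + phi * theta) / denom
    rho = [1.0, rho1]
    for _ in range(2, max_lag + 1):
        rho.append(phi * rho[-1])       # rho_tau = phi * rho_{tau-1}, as for AR(1)
    return rho[: max_lag + 1]
```

Setting θ = 0 recovers the AR(1) autocorrelation ρ_τ = φ^τ, a quick sanity check on the formula.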
Estimation of AR, MA, ARMA models

(stationary) time series (stochastic process) {X_t}:
    mean value  μ = E[X_t]
    autocovariance  γ_τ = γ(τ) = E[(X_t − μ)(X_{t+τ} − μ)] = E[X_t X_{t+τ}] − μ²
    autocorrelation  ρ_τ = ρ(τ) = γ(τ) / γ(0)

(stationary) time series of n observations {x_1, x_2, …, x_n}:
    sample mean value  x̄ = (1/n) Σ_{t=1}^{n} x_t
    sample autocovariance  c_τ = c(τ) = (1/n) Σ_{t=1}^{n−τ} (x_t x_{t+τ} − x̄²),   τ = 0, 1, …, n−1
    sample autocorrelation  r_τ = r(τ) = c(τ) / c(0)

Estimation of the process (model):
● AR, MA or ARMA? other model?
    stochastic process AR(p):  X_t = φ_1 X_{t−1} + … + φ_p X_{t−p} + Z_t,   Z_t ~ WN(0, σ_Z²)
    stochastic process MA(q):  X_t = Z_t + θ_1 Z_{t−1} + … + θ_q Z_{t−q}
    stochastic process ARMA(p,q):  X_t = φ_1 X_{t−1} + … + φ_p X_{t−p} + Z_t + θ_1 Z_{t−1} + … + θ_q Z_{t−q}
● order p and/or q?
● estimation of model parameters?
    AR(p): φ_1, …, φ_p, σ_Z²;   MA(q): θ_1, …, θ_q, σ_Z²;   ARMA(p,q): φ_1, …, φ_p, θ_1, …, θ_q, σ_Z²
Estimation of an AR(p) model

We assume a stochastic process AR(p) generated the time series {x_1, x_2, …, x_n}.
Fit of a model AR(p): estimation of the parameters φ_1, φ_2, …, φ_p, σ_Z².

Method of moments or method of Yule-Walker (YW):
estimation of the parameters from the sample autocorrelations:

    r_1, r_2, …, r_p, s_X²   →   φ̂_1, φ̂_2, …, φ̂_p, s_Z²

In the Yule-Walker equations, replace ρ_τ by r_τ:

    [r_1]   [1       r_1     r_2     …  r_{p−1}] [φ̂_1]
    [r_2] = [r_1     1       r_1     …  r_{p−2}] [φ̂_2]
    [ ⋮ ]   [ ⋮                          ⋮    ] [ ⋮ ]
    [r_p]   [r_{p−1} r_{p−2} r_{p−3} …  1     ] [φ̂_p]

    R_p φ̂ = r_p   →   φ̂ = R_p^{−1} r_p

and then, substituting in σ_X² = σ_Z² / (1 − φ_1 ρ_1 − φ_2 ρ_2 − … − φ_p ρ_p):

    s_Z² = s_X² (1 − φ̂_1 r_1 − φ̂_2 r_2 − … − φ̂_p r_p)
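Putting the two steps together, a minimal numpy sketch (my own helper names) that estimates an AR(p) from data by Yule-Walker and checks it on a simulated AR(2):

```python
import numpy as np

def fit_ar_yule_walker(x, p):
    """Yule-Walker fit of AR(p): phi_hat = R_p^{-1} r_p and s_Z^2 = s_X^2 (1 - sum phi_hat_i r_i)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    c = np.array([np.dot(x[: n - tau], x[tau:]) / n for tau in range(p + 1)])
    r = c[1:] / c[0]                                   # sample autocorrelations r_1..r_p
    R = np.array([[([1.0, *r])[abs(i - j)] for j in range(p)] for i in range(p)])
    phi_hat = np.linalg.solve(R, r)
    s2_z = c[0] * (1.0 - phi_hat @ r)
    return phi_hat, s2_z

# simulate an AR(2) with phi = (0.5, -0.3) and recover the coefficients
rng = np.random.default_rng(1)
z = rng.normal(size=30000)
x = np.zeros_like(z)
for t in range(2, len(z)):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + z[t]
phi_hat, s2_z = fit_ar_yule_walker(x, p=2)
```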
General form of AR(p) for a time series {x_1, x_2, …, x_n} with mean μ:

    X_t − μ = φ_1 (X_{t−1} − μ) + φ_2 (X_{t−2} − μ) + … + φ_p (X_{t−p} − μ) + Z_t

The estimation method of ordinary least squares (OLS):
fit of the model AR(p) to the data by minimization of the sum of squares of the fitting errors,

    min S(μ, φ_1, …, φ_p) = min Σ_{t=p+1}^{n} [x_t − μ − φ_1 (x_{t−1} − μ) − … − φ_p (x_{t−p} − μ)]²

w.r.t. μ, φ_1, φ_2, …, φ_p. The fitting errors (residuals) are

    ẑ_t = (x_t − μ̂) − φ̂_1 (x_{t−1} − μ̂) − … − φ̂_p (x_{t−p} − μ̂),   t = p+1, …, n

    s_Z² = (1/(n−p)) Σ_{t=p+1}^{n} ẑ_t²

AR(1):   X_t − μ = φ_1 (X_{t−1} − μ) + Z_t

    S(μ, φ_1) = Σ_{t=2}^{n} [x_t − μ − φ_1 (x_{t−1} − μ)]²

Setting the derivatives to zero gives

    μ̂ = (x̄(2) − φ̂ x̄(1)) / (1 − φ̂),    φ̂ = Σ_{t=2}^{n} (x_t − μ̂)(x_{t−1} − μ̂) / Σ_{t=2}^{n} (x_{t−1} − μ̂)²

with  x̄(1) = (1/(n−1)) Σ_{t=2}^{n} x_{t−1}  and  x̄(2) = (1/(n−1)) Σ_{t=2}^{n} x_t.

Approximately, μ̂ ≈ x̄ = (1/n) Σ_{t=1}^{n} x_t and

    φ̂ ≈ Σ_{t=2}^{n} (x_t − x̄)(x_{t−1} − x̄) / Σ_{t=1}^{n} (x_t − x̄)² = c_1 / c_0 = r_1

with  c_1 = (1/n) Σ_{t=2}^{n} (x_t − x̄)(x_{t−1} − x̄)  and  c_0 = (1/n) Σ_{t=1}^{n} (x_t − x̄)².
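A minimal OLS sketch for AR(1) (my own helper; it solves the regression of x_t on x_{t−1} with an intercept c = μ(1 − φ) instead of iterating the coupled equations):

```python
import numpy as np

def fit_ar1_ols(x):
    """OLS fit of X_t = c + phi X_{t-1} + Z_t; returns (mu_hat, phi_hat, s2_z).

    mu_hat is recovered from the intercept via c = mu (1 - phi)."""
    x = np.asarray(x, dtype=float)
    y, xlag = x[1:], x[:-1]
    A = np.column_stack([np.ones_like(xlag), xlag])
    (c, phi), *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - c - phi * xlag
    s2_z = np.sum(resid ** 2) / (len(x) - 1)
    return c / (1.0 - phi), phi, s2_z

# AR(1) with mu = 2 and phi = 0.6: X_t = 0.8 + 0.6 X_{t-1} + Z_t
rng = np.random.default_rng(2)
z = rng.normal(size=20000)
x = np.full(20000, 2.0)
for t in range(1, len(x)):
    x[t] = 0.8 + 0.6 * x[t - 1] + z[t]
mu_hat, phi_hat, s2_z = fit_ar1_ols(x)
```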
Other methods for estimation of AR(p)
● forward-backward approach (FB)
● Burg's algorithm (Burg)
● maximum likelihood (ML): conditional or unconditional
The ML estimation is optimal; the other methods approximate it.
ML reduces to OLS when the time series comes from a Gaussian process.
Asymptotically (for large n) all methods converge to the same (ML) estimates.
YW has the slowest convergence rate to ML.
Determination of the order p of an AR model

The criterion of partial autocorrelation:
the partial autocorrelation for lag τ is the correlation of x_t and x_{t−τ}, accounting for
the correlation with the intermediate terms x_{t−1}, …, x_{t−τ+1}. Estimate φ_{τ,τ} by
fitting a model AR(τ) for increasing τ:

    x_t = φ_{1,1} x_{t−1} + z_t
    x_t = φ_{1,2} x_{t−1} + φ_{2,2} x_{t−2} + z_t
    x_t = φ_{1,3} x_{t−1} + φ_{2,3} x_{t−2} + φ_{3,3} x_{t−3} + z_t
    …

The order is p if φ̂_{p,p} ≠ 0 and φ̂_{τ,τ} ≈ 0 for τ > p
(a fall from non-zero to zero partial autocorrelation).

Criteria based on the fitting errors:
● Akaike information criterion (AIC):    AIC(p) = ln(s_z²) + 2p/n
● Bayesian information criterion (BIC):  BIC(p) = ln(s_z²) + p ln(n)/n
● Final prediction error (FPE):          FPE(p) = s_z² (n + p)/(n − p)
Example: growth rate of the gross national product (GNP) of the USA
(quarterly observations, 2nd quarter 1947 – 1st quarter 1991).
The time series is corrected for seasonality.

[Figure: GNP of USA, increments x_t (stationary? correlation?) and sample
autocorrelation r(τ), τ = 1, …, 20]

[Figure: partial autocorrelation φ_{τ,τ} and AIC(p) of AR models, p = 1, …, 20:
order of the AR model? AR(3)?]

Parameter estimation (OLS, t = 4, …, 176):

    x̄ = 0.0077,   φ̂_1 = 0.35,   φ̂_2 = 0.18,   φ̂_3 = −0.14

    φ̂_0 = x̄ (1 − φ̂_1 − φ̂_2 − φ̂_3) = 0.0047

fitted AR(3):   x_t = 0.0047 + 0.35 x_{t−1} + 0.18 x_{t−2} − 0.14 x_{t−3} + z_t

estimation:   x̂_t = 0.0047 + 0.35 x_{t−1} + 0.18 x_{t−2} − 0.14 x_{t−3}

errors or residuals of the fit:   ẑ_t = x_t − x̂_t,   s_z² = σ̂_z² = 0.0000989,   s_z = σ̂_z = 0.0098

[Figure: AR(3) fit to the GNP increments, full record and detail]

Diagnostic check for model adequacy:
is the residual time series {ẑ_t}, t = p+1, …, n, independent?  →  test for independence
n
Fit of the model MA(q)
stochastic process AR(p)
Estimation of the process (model)
X t  1 X t 1  2 X t 2 
● AR, MA or ARMA ? other model ?
  p X t  p  Zt
Zt ~ WN(0,  Z2 )
stochastic process MA(q)
X t  Zt  1Zt 1  2 Zt 2   q Zt q
stochastic process ARMA(p,q)
X t  1 X t 1  2 X t 2    p X t  p  Zt
 1Zt 1   2 Zt 2 
  q Zt q
● order p or/and q ?
● estimation of model parameters ?
AR( p) :1 , 2 ,
,  p ,  2
ΜΑ(q) :1 ,2 ,
,q ,  2
ARΜΑ( p, q) :1 , 2 ,
We assume a stochastic process MA(q) for the time series
Fit of the process (model) MA(q)
x1 , x2 ,
?
,  p ,1 ,2 ,
, xn 
parameter estimation 1 ,2 ,
,q ,  2
,q ,  2
MA(q) X t  Zt  1Zt 1  2 Zt 2 
Method of moments
Variance
 X2  (1  12  q2 ) Z2
Nonlinear equation system
w.r.t. the parameters 1 ,2 ,
Autocorrelation
   1 1    q   q

   1  12   22    q2

0

Estimation of 1 , 2 ,
  1, 2, , q
,q
 q
, q ,  X2
r1 , r2 ,
, rq , s X2
Innovation algorithm
Method of ordinary least squares
Fit of model MA(q) to the data
Minimization of sum of squares of fitting errors
min S (  ,1 ,  q )  min
n
 (x    z
t  q 1
t
1 t 1

  q zt q ) 2
Numerical optimization method
ˆ1 ,ˆ2 , ,ˆq
w.r.t.
 ,1 ,2 , ,q
  q Zt  q
X t    Zt   Zt 1
MA(1)
Method of moments
 

  1   2

 0
 1
 2
  (1     )
2
X
2
q
2
Z
r
| r1 | 0.5  ˆ  1
| r1 |
2

1

1

4
r
1
2
| r1 | 0.5  r1ˆ  ˆ  r1  0  ˆ1,2 
2r1
We choose the solution | ˆ | 1 that gives rise to invertibility
s X2
s 
1  ˆ 2
2
Z
Method of ordinary least squares
2n-2 solutions, we select the
solution | ˆ | 1
We assume z0  0 (and   0 ) zt  xt   zt 1
z1  x1
z2  x2   z1  x2   x1
computational algorithm: least
squares with constraints for
invertibility
z3  x3   z2  x3   ( x2   x1 )  x3   x2   2 x1
zn  xn   zn1  xn   xn1   2 xn2 
n
  n2 x2   n1 x1

min  zt2  min x12  ( x2  x1 )2  ( x3  x2  x1 2 )2
t 1

min a0  a1 
 a2 n2 2n2

 ( xn  xn1 
 x1 n1 )2

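The recursive residual computation makes the least-squares criterion easy to evaluate for any trial θ; a minimal sketch (my own helpers; a crude grid search stands in for the constrained numerical optimizer mentioned above):

```python
import numpy as np

def ma1_sse(x, theta):
    """Sum of squared errors for MA(1) with z_0 = 0: z_t = x_t - theta * z_{t-1}."""
    z_prev, sse = 0.0, 0.0
    for xt in x:
        z = xt - theta * z_prev
        sse += z * z
        z_prev = z
    return sse

def fit_ma1_cls(x, grid=None):
    """Conditional least squares for MA(1) via a grid search over |theta| < 1."""
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 199)   # step 0.01, inside the invertibility region
    return min(grid, key=lambda th: ma1_sse(x, th))

# simulate MA(1) with theta = 0.4 and recover it
rng = np.random.default_rng(3)
z = rng.normal(size=5000)
x = z.copy()
x[1:] += 0.4 * z[:-1]
theta_hat = fit_ma1_cls(x)
```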
Example: growth rate of the gross national product (GNP) of the USA
(quarterly observations, 2nd quarter 1947 – 1st quarter 1991; the same series as before).
The time series is corrected for seasonality.

[Figure: GNP increments x_t and sample autocorrelation r(τ): order of the MA model?]

[Figure: AIC(q) of MA models, q = 0, …, 10: MA(2)?]

Parameter estimation:  x̄ = 0.0077;   OLS → θ̂_1 = 0.312, θ̂_2 = 0.272

variance of the errors (residuals):  s_z² = 0.000097,  s_z = 0.00983

fitted MA(2):   x_t = 0.0077 + z_t + 0.312 z_{t−1} + 0.272 z_{t−2},   t = 1, …, 176

[Figure: MA(2) fit and, for comparison, AR(3) fit to the GNP increments, full record and detail]

Diagnostic check for model adequacy:
is the residual time series {ẑ_t} independent?  →  test for independence
Estimation of the ARMA(p,q) model

We assume a stochastic process ARMA(p,q) for the time series {x_1, x_2, …, x_n}:

    X_t = φ_1 X_{t−1} + … + φ_p X_{t−p} + Z_t + θ_1 Z_{t−1} + … + θ_q Z_{t−q},   Z_t ~ WN(0, σ_Z²)

Fit of the process (model) ARMA(p,q): estimation of the parameters φ_1, …, φ_p, θ_1, …, θ_q, σ_Z².
The methods of moments and least squares apply as for MA(q).

ARMA(1,1):   X_t = μ + φ (X_{t−1} − μ) + Z_t + θ Z_{t−1}

Method of moments:

    ρ_1 = (φ + θ)(1 + φθ) / (1 + θ² + 2φθ),   ρ_2 = φ ρ_1,   σ_X² = ((1 + θ² + 2φθ)/(1 − φ²)) σ_Z²

Estimation from r_1, r_2, s_X²: solution of the equation system w.r.t. φ̂, θ̂, and then

    s_Z² = s_X² (1 − φ̂²) / (1 + θ̂² + 2φ̂θ̂)

Method of ordinary least squares:
we assume z_0 = 0 (and x_0 = μ = 0), so z_t = x_t − φ x_{t−1} − θ z_{t−1}:

    z_1 = x_1
    z_2 = x_2 − φ x_1 − θ z_1 = x_2 − (φ + θ) x_1
    z_3 = x_3 − φ x_2 − θ z_2 = x_3 − (φ + θ) x_2 + θ (φ + θ) x_1
    …
    z_n = x_n − (φ + θ) x_{n−1} + θ (φ + θ) x_{n−2} − … − (−θ)^{n−2} (φ + θ) x_1

    min Σ_{t=1}^{n} z_t²

computational algorithm of least squares with constraints for invertibility and stationarity.
Example: growth rate of the gross national product (GNP) of the USA
(quarterly observations, 2nd quarter 1947 – 1st quarter 1991; the same series as before).
The time series is corrected for seasonality.

[Figure: GNP increments x_t and sample autocorrelation r(τ)]

[Figure: partial autocorrelation φ_{p,p} and AIC(p,q) of ARMA models, q = 0, …, 5:
order of the ARMA model? ARMA(2,2)?]

Parameter estimation:  x̄ = 0.0077;   OLS → φ̂_1 = 0.614, φ̂_2 = −0.455, θ̂_1 = 0.301, θ̂_2 = 0.600

variance of the errors (residuals):  s_z² = 0.000097,  s_z = 0.00983

fitted ARMA(2,2):   x̂_t = 0.0065 + 0.614 x_{t−1} − 0.455 x_{t−2} + z_t + 0.301 z_{t−1} + 0.600 z_{t−2},   t = 1, …, 176

[Figure: ARMA(2,2) fit and, for comparison, MA(2) and AR(3) fits to the GNP increments,
full record and detail]
Model for time series with trends (ARIMA)

Random walk:  Y_t = Y_{t−1} + X_t = X_1 + X_2 + … + X_t,  with {X_t} iid, E[X_t] = 0, E[X_t²] = σ².
{Y_t} is a non-stationary process.
First differences:  X_t = (1 − B) Y_t = Y_t − Y_{t−1}  →  iid process.

For a non-stationary process {Y_t} that exhibits trends:
    first differences  X_t = Y_t − Y_{t−1}:  stationary process?
    YES → model the differences;  NO → second order differences
    X_t − X_{t−1} = Y_t − 2Y_{t−1} + Y_{t−2},  and so on.

{X_t} stationary after d-order differences:  X_t = ∇^d Y_t = (1 − B)^d Y_t.  Usually d = 1.

If the differenced series follows an ARMA(p,q), φ(B) X_t = θ(B) Z_t, then {Y_t} is a
non-stationary ARIMA(p,d,q) process:

    φ(B) ∇^d Y_t = θ(B) Z_t   ⇔   φ(B) (1 − B)^d Y_t = θ(B) Z_t

The polynomial φ(B)(1 − B)^d has a unit root (of multiplicity d) and all its other roots
are outside the unit circle.
Fit of an ARIMA model (the Box-Jenkins approach)

● time series {y_1, y_2, …, y_n}: inspect the time history diagram and the autocorrelation.
  Strong and slowly decaying autocorrelation is an indication of trend; if the autocorrelation
  decays to zero, the time series is stationary; if the autocorrelation is statistically not
  significant, the series is iid and no model is needed.
● if there is trend: d-order differences x_t = (1 − B)^d y_t → stationary time series {x_1, x_2, …, x_n}
● choose the model order and estimate the model parameters: fit of a model AR(p), MA(q) or ARMA(p,q)
● diagnostic test of model adequacy: test for independence on the residuals; are they iid?
  YES → STOP: model ARMA(p,q) for {x_1, …, x_n}; using the inverse of the transform
  x_t = (1 − B)^d y_t we get the model ARIMA(p,d,q) for {y_1, …, y_n}
  NO → another model (a nonlinear model?); prediction?
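The differencing step and its inverse (needed to map the fitted ARMA back to a model for {y_t}) can be sketched as follows (my own helpers; `heads` records the initial values consumed by each differencing pass):

```python
import numpy as np

def difference(y, d=1):
    """d-order differences x_t = (1 - B)^d y_t."""
    x = np.asarray(y, dtype=float)
    for _ in range(d):
        x = x[1:] - x[:-1]
    return x

def undifference(x, heads):
    """Invert one differencing step per recorded initial value in `heads`."""
    y = np.asarray(x, dtype=float)
    for y0 in reversed(heads):
        y = np.concatenate(([y0], y0 + np.cumsum(y)))
    return y
```

For instance, the second differences of a quadratic trend are constant, and `undifference` rebuilds the original series from the differences plus the two stored initial values.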
Example: annual index of global temperature (anomaly of the surface temperature of the
northern hemisphere on a 5° × 5° grid), time period 1850-2011.
{y_1, y_2, …, y_n} real observations. Source: http://www.cru.uea.ac.uk/cru/data/temperature

[Figure: annual land air temperature anomalies and their autocorrelation r(τ):
stationary time series? NO]

{x_1, x_2, …, x_n} first differences:

[Figure: first differences of the annual land air temperature anomalies and their
autocorrelation r(τ): stationary time series? YES]

Model for the time series {x_1, x_2, …, x_n}?
[Figure: autocorrelation r(τ) and partial autocorrelation φ_{τ,τ} of the first differences,
and AIC(p,q) of ARMA models, q = 0, …, 5: the most appropriate model by the AIC criterion?]

Model for the time series {y_1, y_2, …, y_n}:   ARIMA(0,1,4),   (1 − B) Y_t = θ_4(B) Z_t

fit of MA(4) to the first differences (x̄ = 0.008):

    x_t = 0.008 + z_t + 0.758 z_{t−1} + 0.022 z_{t−2} + 0.219 z_{t−3} + 0.275 z_{t−4}

    s_z² = 0.0414,   s_z = 0.2035

[Figure: ARIMA(0,1,4) fit to the differenced global temperature, full record and detail
1930-1960]
Model of time series with seasonality (ARMAs)

Given the time series {y_1, y_2, …, y_n} without trend and with seasonality (periodicity)
of period s, let k = ⌊n/s⌋. Removal of the seasonality:

● estimation of the periodic components s_i, i = 1, …, s (average over the k cycles):

    s_i = (1/k) Σ_{j=0}^{k−1} y_{i+js},    x_t = y_t − s_t

● symmetric moving average of order s:

    x_t = (1/s) (0.5 y_{t−s/2} + y_{t−s/2+1} + … + y_{t+s/2−1} + 0.5 y_{t+s/2})   (s even)

    x_t = (1/s) Σ_{i=−(s−1)/2}^{(s−1)/2} y_{t+i}   (s odd)

● s-differences (difference of lag s):

    X_t = ∇_s Y_t = (1 − B^s) Y_t = Y_t − Y_{t−s}

Given the time series {x_1, x_2, …, x_n} without trend and with seasonality s,
hypothesis: there are correlations only between the same components of each period
(the dependence occurs at time steps s):

    x_1, x_2, …, x_s, x_{s+1}, …, x_{2s}, x_{2s+1}, …, x_{3s}, …   (k cycles of period s)

model ARMA(P,Q)s for {x_i, x_{i+s}, x_{i+2s}, …}, the same for i = 1, 2, …, s:

    X_t = Φ_1 X_{t−s} + … + Φ_P X_{t−Ps} + Z_t + Θ_1 Z_{t−s} + … + Θ_Q Z_{t−Qs},   t = Ps+1, …, n

    Φ(B^s) X_t = Θ(B^s) Z_t

i.e. a model ARMA(P,Q)s for {x_1, x_2, …, x_n}.

Model of time series with seasonality (ARIMAs)

This is an extension of ARMA(P,Q)s when the time series has a "seasonal trend",
meaning trend at the time points t, t+s, t+2s, …
Given the time series {y_1, y_2, …, y_n} with seasonal trend, and given that the
correlations are between components of the same periodic order:

s-differences (difference of lag s), x_t = ∇_s y_t = (1 − B^s) y_t, give {x_{s+1}, …, x_n},
and an ARMA(P,Q)s for the differenced series is an ARIMA(P,1,Q)s for {y_t}:

    Φ(B^s) X_t = Θ(B^s) Z_t   ⇔   Φ(B^s)(1 − B^s) Y_t = Θ(B^s) Z_t

In general, ARIMA(P,D,Q)s:   Φ(B^s)(1 − B^s)^D Y_t = Θ(B^s) Z_t
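The two seasonality-removal operations used above can be sketched in a few lines (my own helpers; `periodic_component` assumes the length is a whole number of periods):

```python
import numpy as np

def seasonal_difference(y, s):
    """s-differences: X_t = (1 - B^s) Y_t = Y_t - Y_{t-s}."""
    y = np.asarray(y, dtype=float)
    return y[s:] - y[:-s]

def periodic_component(y, s):
    """Seasonal averages s_i = mean of {y_i, y_{i+s}, ...}, i = 1..s (len(y) % s == 0)."""
    y = np.asarray(y, dtype=float).reshape(-1, s)   # one row per cycle
    return y.mean(axis=0)
```

A purely periodic series has zero s-differences, and subtracting the periodic component removes its seasonality entirely.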
Example: mean monthly temperature at the Thessaloniki station, period 1930-2000

{x_1, x_2, …, x_n},   n = 71 × 12 = 852;   strong seasonality (periodicity)

[Table: part of the record: monthly mean temperatures (°C) per year, columns JAN-DEC;
e.g. 1930: 6.7, 6.7, 11.3, 15.7, 19, 22.6, 26.2, 26, 22.8, 17.5, 12.1, 8.9]

[Figure: the temperature time series 1/1930-12/2000 and its autocorrelation r(τ)]

Removal of the seasonality:
● estimation of the seasonal component and subtraction of the average period  [Figure]
● moving average of order 12  [Figure]
● 12-order difference  [Figure]

model ARMA(P,Q)12: the same model for each month

[Figure: AIC(p,q) of ARMA12 models, q = 0, …, 5]

Fit of ARMA(1,1)12:   x̄ = 15.928

    x_t = 0.0075 + 0.9995 x_{t−12} + z_t − 0.5242 z_{t−12}

    s_z² = 3.733,   s_z = 1.932   (standard deviation of the residual time series)

[Figure: ARMA(1,1)12 fit]

Fit of ARMA(1,1)12:  s_z = 1.603
Fit with the estimation of the periodic component:  s_z = 1.427
Model of time series with trend and seasonality (SARIMA)
Given that the time series {y_1, y_2, …, y_n} has
• trend → removal of trend
• seasonality s → removal of seasonality
giving the stationary series {x_1, x_2, …, x_n} with
• dependence between successive observations (time step 1):
  x_{t−2}, x_{t−1}, x_t, x_{t+1}, x_{t+2} → ARIMA(p,1,q): φ(B)(1 − B)^d Y_t = θ(B) Z_t
• dependence between seasonal components of the same seasonal order (time step s):
  x_{t−2s}, x_{t−s}, x_t, x_{t+s}, x_{t+2s} → ARIMA(P,1,Q)_s: Φ(B^s)(1 − B^s)^D Y_t = Θ(B^s) Z_t
Seasonal multiplicative model SARIMA(p,d,q)×(P,D,Q)_s:
φ(B) Φ(B^s) (1 − B)^d (1 − B^s)^D Y_t = θ(B) Θ(B^s) Z_t
For d = 0 and D = 0: SARMA(p,q)×(P,Q)_s
most often d = 1, D = 0
Example
Monthly index of global temperature (anomaly of the surface temperature of the north hemisphere at a 5° × 5° grid), time period 1850-2011
{y_1, y_2, …, y_n}: real observations
Source: http://www.cru.uea.ac.uk/cru/data/temperature
(figures: land air temperature anomalies, period 1/1850-12/2011, and their autocorrelation r(τ), τ = 0, …, 100)
(figure: land air temperature anomalies, period 1/1850-12/2011, shown per month: Jan, May, Sep)

removal of trend? → first differences
removal of seasonality / periodicity? → differences of lag 12
dependence between successive observations (time step 1)?
dependence between seasonal components of the same seasonal order (time step s)?
(figures: first differences and differences of lag 12 of the monthly global temperature, period Jan 1950 - Jan 1962)
(figures: autocorrelation r(τ), τ = 0, …, 100: the first differences keep significant autocorrelations for τ = 12, 24, …; the 12-differences keep significant autocorrelations for τ = 1, 2, …)
(figures: partial autocorrelation of the differenced series; AIC(p,q), p = 0, …, 6 and q = 0, …, 4, of SARMA(p,q)×(P,Q)_12 models for (P,Q) = (0,0), (1,0), (0,1), (1,1), (2,0), (0,2), (1,2), (2,3))

min(AIC) = −1.622 for SARMA(3,3)×(1,2)_12
SARMA(1,2)×(1,1)_12: AIC = −1.618
SARMA(3,3)×(1,2)_12
x_t = 1.12 x_{t−1} − 0.70 x_{t−2} + 0.22 x_{t−3} + 0.95 x_{t−12} − 1.11 x_{t−13} + 0.70 x_{t−14} − 0.18 x_{t−15}
    + z_t − 0.42 z_{t−1} + 0.22 z_{t−2} − 0.95 z_{t−3} − 1.01 z_{t−12} + 0.48 z_{t−13} − 0.23 z_{t−14} + 0.93 z_{t−15}
    + 0.13 z_{t−24} − 0.08 z_{t−25} + 0.05 z_{t−26} − 0.08 z_{t−27}
x̄ = 0.0013
s_z = 0.445
(figures: diff global temp, ARMA(3,3)×(1,2)_12 fit, full period and detail)
SARMA(1,2)×(1,1)_12
x_t = 0.35 x_{t−1} + 0.98 x_{t−12} − 0.34 x_{t−13} + z_t − 1.04 z_{t−1} + 0.1 z_{t−2} − 0.93 z_{t−12} + 0.99 z_{t−13} − 0.12 z_{t−14}
s_z = 0.446
(figures: diff global temp, ARMA(1,2)×(1,1)_12 fit, full period and detail)
Prediction of time series
Models for time series (AR, MA, ARMA, ARIMA, SARIMA) → prediction
Many applications:
Index and volume of the Athens Stock Exchange (ASE)
Can we predict the index or the volume for the first day(s) of May 2002, given the observations until the end of April 2002?
General Index of Consumer Prices (GICP)
(figure: General Index of Consumer Prices, period Jan 2001 - Aug 2005)
At what level is the GICP expected to move in the next months?
Sunspots
(figures: annual sunspots, periods 1700-2001, 1900-2001 and 1960-2001)
Given the number of sunspots up to the current date, how many sunspots will there be in the next year(s)?
Heart rate
What will the next heart rate(s) be?
The problem of time series prediction
• We are given the time series {x_1, x_2, …, x_n} up to time n
• We want to estimate x_{n+k}
prediction: x_n(k)
prediction error: e_n(k) = x_{n+k} − x_n(k)
For the stochastic process {X_n}, the prediction X_n(k) is the estimation of the observation X_{n+k} of {X_n}
Best prediction: X_n(k) = E[X_{n+k} | X_n, X_{n−1}, …]
Properties of a good prediction:
• unbiasedness: E[X_n(k)] = E[X_{n+k}]
• efficiency, meaning a small prediction error: Var[e_n(k)] = Var[X_{n+k} − X_n(k)]
Optimizing both unbiasedness and efficiency → minimization of the mean square prediction error E[(X_{n+k} − X_n(k))²]
Evaluation of a prediction model:
Given {x_1, x_2, …, x_n}: learning / training set
given also {x_{n+1}, x_{n+2}, …, x_{n+l}}: test / validation set
prediction model → predictions k time steps ahead: x_n(k), x_{n+1}(k), …, x_{n+l−k}(k)
prediction errors: e_n(k), e_{n+1}(k), …, e_{n+l−k}(k)
e_j(k) = x_{j+k} − x_j(k), j = n, n+1, …, n+l−k
Statistical measures of error
Estimation of the mean square error (mse):
mse(k) = (1/(l−k+1)) Σ_{j=n}^{n+l−k} e_j(k)² = (1/(l−k+1)) Σ_{j=n}^{n+l−k} (x_{j+k} − x_j(k))²
root mean square error (rmse):
rmse(k) = √[ (1/(l−k+1)) Σ_{j=n}^{n+l−k} e_j(k)² ] = √[ (1/(l−k+1)) Σ_{j=n}^{n+l−k} (x_{j+k} − x_j(k))² ]
normalized root mean square error (nrmse):
nrmse(k) = √[ Σ_{j=n}^{n+l−k} (x_{j+k} − x_j(k))² / Σ_{j=n}^{n+l−k} (x_{j+k} − x̄)² ]
nrmse ≈ 0: very good prediction
nrmse ≈ 1: prediction at the level of the mean value prediction
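The nrmse defined above can be computed directly from target/prediction pairs (a minimal sketch; the function name is ours):

```python
import math

def nrmse(pred_pairs):
    """nrmse from a list of (target x_{j+k}, prediction x_j(k)) pairs."""
    targets = [t for t, _ in pred_pairs]
    mean = sum(targets) / len(targets)
    sse_pred = sum((t - p) ** 2 for t, p in pred_pairs)   # prediction errors
    sse_mean = sum((t - mean) ** 2 for t in targets)       # mean-value predictor
    return math.sqrt(sse_pred / sse_mean)

# the mean-value predictor itself gives nrmse = 1 by construction:
pairs = [(1.0, 2.5), (2.0, 2.5), (3.0, 2.5), (4.0, 2.5)]
```

A model is only worth keeping if its nrmse is clearly below 1, i.e. if it beats predicting the sample mean.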
Predictions:
• Prediction many steps ahead for a given current time:
  given {x_1, x_2, …, x_n}, we predict x_n(1), x_n(2), …, x_n(k)
• Prediction at a given time step ahead for different current times:
  given {x_1, x_2, …, x_n, x_{n+1}, …, x_{n+l}}, we want to evaluate the predictability of a prediction model
  1. We estimate the model parameters based on the time series {x_1, x_2, …, x_n}
  2. We compute predictions for some time step ahead k: x_n(k), x_{n+1}(k), …, x_{n+l−k}(k)
  3. We compute a statistic of the prediction errors, e.g.
     rmse(k) = √[ (1/(l−k+1)) Σ_{j=n}^{n+l−k} e_j(k)² ] = √[ (1/(l−k+1)) Σ_{j=n}^{n+l−k} (x_{j+k} − x_j(k))² ]
prediction limits: x_n(k) ± c_{1−α/2} √Var[e_n(k)]; for z_t ~ N(0, σ_z²), c_{1−α/2} = z_{1−α/2}
Simple prediction techniques
Deterministic trend (revisited)
x_t = μ_t + z_t, z_t ~ WN(0, σ_z²) white noise, μ_t: trend, a slowly varying function of time
x_{n+k} = μ_{n+k} + z_{n+k}
Prediction: x_n(k) = E[μ_{n+k} + z_{n+k} | x_n, x_{n−1}, …, x_1] = μ_{n+k}
Prediction error: e_n(k) = z_{n+k}
Solution: extrapolation of the function μ_t for times > n
μ_t = ?
• known → simple substitution
• unknown → estimation, e.g. a polynomial p_m(t) = c_0 + c_1 t + … + c_m t^m
  – global (fit to {x_1, x_2, …, x_n})
  – local (fit only to the last observations {x_{n−m+1}, x_{n−m+2}, …, x_n})
Index and volume of ASE, prediction with trend extrapolation (polynomial fit of trend)
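A global linear fit (the degree-1 polynomial case) and its extrapolation can be sketched as follows (pure Python; function names are ours, and a real application would also consider higher-degree or local fits):

```python
def fit_linear_trend(x):
    """Least-squares fit of mu_t = c0 + c1*t to x_1, ..., x_n (t = 1, ..., n)."""
    n = len(x)
    ts = list(range(1, n + 1))
    tbar = sum(ts) / n
    xbar = sum(x) / n
    c1 = (sum((t - tbar) * (xi - xbar) for t, xi in zip(ts, x))
          / sum((t - tbar) ** 2 for t in ts))
    c0 = xbar - c1 * tbar
    return c0, c1

def predict_trend(c0, c1, n, k):
    """Prediction x_n(k) = mu_{n+k} by extrapolating the fitted trend."""
    return c0 + c1 * (n + k)

# noise-free check: x_t = 2 + 0.5 t is recovered and extrapolated exactly
x = [2 + 0.5 * t for t in range(1, 11)]
c0, c1 = fit_linear_trend(x)
```

With noise present, the prediction error e_n(k) = z_{n+k} has the white-noise variance, so the extrapolated line is the best one can do under this model.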
Deterministic seasonal term: x_t = s_t + z_t
Deterministic seasonal term and deterministic trend: x_t = μ_t + s_t + z_t
Same approach: estimation of the deterministic terms
Example: GICP, January 2001 - August 2005, {x_t}_{t=1}^{56}
x_t = μ_t + s_t + z_t
estimated linear trend: μ_t = 103.9 + 0.31 t
detrended series: x_t − μ_t
year cycle of the GICP: {s_t}_{t=1}^{12}
residuals: z_t = x_t − μ_t − s_t
(figures: GICP; GICP with the linear trend subtracted; year cycle; GICP with trend and periodic component subtracted)
Prediction: x_n(k) = μ_{n+k} + s_{n+k}
Prediction of Sept 2005: μ_57 = 103.9 + 0.31·57 ≈ 121.70, s_9 = 0.16
x_56(1) = 121.86
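The seasonal component used in the GICP example can be estimated by averaging the detrended series over each seasonal position, as the year cycle in the slides suggests (a sketch under that assumption; function names are ours):

```python
def seasonal_component(x, s):
    """Estimate s_1, ..., s_s as the average of x over each seasonal position."""
    return [sum(x[m::s]) / len(x[m::s]) for m in range(s)]

def predict_seasonal(mu_trend, st, n, k):
    """x_n(k) = mu_{n+k} + s_{n+k}; mu_trend maps time t to the trend value.

    Times are counted t = 1, 2, ..., so position (t - 1) % s indexes st."""
    s = len(st)
    return mu_trend(n + k) + st[(n + k - 1) % s]

# pure cycle of period 4 with no trend: the estimated component reproduces it
x = [1.0, -1.0, 2.0, -2.0] * 5
st = seasonal_component(x, 4)
```

Plugging in the fitted GICP trend for `mu_trend` gives the x_56(1) prediction of the slides.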
Exponential smoothing
Estimation of x_{n+k} as a weighted sum of former observations:
x_n(k) = c_0 x_n + c_1 x_{n−1} + … + c_{n−1} x_1 = Σ_{j=0}^{n−1} c_j x_{n−j}
Desired conditions on the weights: c_0 ≥ c_1 ≥ … ≥ c_{n−1} and Σ_{j=0}^{n−1} c_j = 1
Determination of the weights with a single parameter α:
c_j = α(1 − α)^j, j = 0, 1, …, n−1, 0 < α < 1
recursive relation: x_n(k) = α x_n + (1 − α) x_{n−1}(k)
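The recursive relation gives a one-pass implementation (a minimal sketch; initializing with the first observation is one common convention, not specified in the slides):

```python
def exp_smooth_prediction(x, alpha):
    """Recursive exponential smoothing: pred = alpha*x_t + (1 - alpha)*prev_pred.

    The final value serves as the prediction x_n(k) for every step k."""
    pred = x[0]                # initialize with the first observation
    for xt in x[1:]:
        pred = alpha * xt + (1 - alpha) * pred
    return pred

# alpha = 1 weights only the most recent observation (naive prediction):
assert exp_smooth_prediction([3.0, 7.0, 5.0], 1.0) == 5.0
```

Small α averages over a long history; large α tracks the most recent observations, which is the regime that worked best for the ASE data below.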
Prediction with exponential smoothing: examples
Index and volume of ASE: prediction at one time step ahead for all days in May 2002; comparison of the prediction performance of exponential smoothing for different α. A large α (weighting most the most recent observations) gives the best prediction.
Sunspots
Heart rate
Prediction of stationary time series with linear models
Predictions with AR, MA and ARMA models
Prediction with autoregressive models (AR)
Given the time series {x_1, x_2, …, x_n} and an AR(1) model x_t = φ x_{t−1} + z_t, z_t ~ WN(0, σ_z²):
t = n+1: x_{n+1} = φ x_n + z_{n+1}
  Optimal prediction at time step 1: x_n(1) = φ x_n
  Prediction error: e_n(1) = z_{n+1}, Var[e_n(1)] = σ_z²
t = n+2: x_{n+2} = φ x_{n+1} + z_{n+2}
  Optimal prediction at time step 2: x_n(2) = φ x_n(1) = φ² x_n
  Prediction error: e_n(2) = φ z_{n+1} + z_{n+2}, Var[e_n(2)] = (φ² + 1) σ_z²
t = n+k:
  Optimal prediction at time step k: x_n(k) = φ^k x_n
  Prediction error: e_n(k) = φ^{k−1} z_{n+1} + … + φ z_{n+k−1} + z_{n+k}
  Var[e_n(k)] = σ_z² (1 − φ^{2k}) / (1 − φ²)
Given the time series {x_1, x_2, …, x_n} and an AR(p) model x_t = φ_1 x_{t−1} + … + φ_p x_{t−p} + z_t:
t = n+1: x_{n+1} = φ_1 x_n + … + φ_p x_{n−p+1} + z_{n+1}
  Optimal prediction at time step 1: x_n(1) = φ_1 x_n + … + φ_p x_{n−p+1}
  Prediction error: e_n(1) = z_{n+1}, Var[e_n(1)] = σ_z²
t = n+2: x_{n+2} = φ_1 x_{n+1} + … + φ_p x_{n−p+2} + z_{n+2}
  Optimal prediction at time step 2: x_n(2) = φ_1 x_n(1) + φ_2 x_n + … + φ_p x_{n−p+2}
  Prediction error: e_n(2) = φ_1 e_n(1) + z_{n+2} = φ_1 z_{n+1} + z_{n+2}, Var[e_n(2)] = (φ_1² + 1) σ_z²
t = n+k: x_{n+k} = φ_1 x_{n+k−1} + … + φ_p x_{n+k−p} + z_{n+k}
  Optimal prediction at time step k: x_n(k) = φ_1 x_n(k−1) + … + φ_p x_n(k−p)
  where x_n(j) is the prediction x_n(j) for j > 0 and the observation x_{n+j} for j ≤ 0
  Prediction error: e_n(k) = φ_1 e_n(k−1) + … + φ_k e_n(1) + z_{n+k}, so that
  e_n(k) = Σ_{j=0}^{k−1} b_j z_{n+k−j} and Var[e_n(k)] = σ_z² Σ_{j=0}^{k−1} b_j²
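The recursive k-step AR prediction above, with observations used for j ≤ 0 and predictions for j > 0, can be sketched as (function name is ours):

```python
def ar_predict(x, phi, k):
    """k-step predictions x_n(1..k) from AR(p) coefficients phi = [phi_1..phi_p].

    Past values are the observations; future values are replaced by the
    predictions already computed, exactly as in the recursion above."""
    hist = list(x)                    # x_n(j) for j <= 0 lives at the end
    preds = []
    for _ in range(k):
        p = sum(phi[i] * hist[-1 - i] for i in range(len(phi)))
        preds.append(p)
        hist.append(p)                # becomes x_n(j) for the next steps
    return preds

# AR(1) with phi = 0.5 and x_n = 8: x_n(k) = 0.5^k * 8
assert ar_predict([8.0], [0.5], 3) == [4.0, 2.0, 1.0]
```

As k grows the predictions shrink toward the (zero) mean, which is why multi-step AR predictions of the examples below flatten out quickly.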
Index ASE, multi-step prediction for May 2002
Estimated AR coefficients φ_1, …, φ_p:
AR(1): 0.9995
AR(6): 1.1535, −0.2126, 0.0944, −0.0655, 0.0103, 0.0194
AR(11): 1.1523, −0.2131, 0.0961, −0.0663, 0.0135, −0.0031, −0.0058, 0.0190, −0.0175, 0.0458, −0.0213
x_n(k), n = 30.04.2002, k = 1, …, 20
Index ASE, one-step-ahead prediction in May 2002: x_n(1), n = 2.05.2002 - 31.05.2002
Volume ASE, multi-step prediction for May 2002
Estimated AR coefficients φ_1, …, φ_p:
AR(1): 0.9097
AR(6): 0.3412, 0.2092, 0.1557, 0.1369, 0.0773, 0.0528
AR(11): 0.3251, 0.1955, 0.1380, 0.1138, 0.0455, 0.0009, 0.0350, 0.0068, 0.0249, 0.0420, 0.0527
x_n(k), n = 30.04.2002, k = 1, …, 20
Volume ASE, one-step-ahead prediction in May 2002: x_n(1), n = 2.05.2002 - 31.05.2002
Sunspots, multi-step prediction from 1991 to 2001
Estimated AR coefficients φ_1, …, φ_p:
AR(1): 0.8205
AR(6): 1.3231, −0.5297, −0.1655, 0.1895, −0.2576, 0.1702
AR(11): 1.1848, −0.4385, −0.1718, 0.1933, −0.1324, 0.0311, 0.0157, −0.0203, 0.1993, −0.0186, 0.0352
x_n(k), n = 1990, k = 1, …, 11
Sunspots, prediction one year ahead in the period 1991-2001: x_n(1), n = 1991 - 2001
Heart rate, prediction of the next 21 heart rates
Estimated AR coefficients φ_1, …, φ_p:
AR(1): 0.8065
AR(6): 0.7850, −0.1205, 0.1983, 0.1438, −0.1407, −0.0465
AR(11): 0.7803, −0.0736, 0.1759, 0.0858, −0.1239, −0.1899, 0.1413, 0.0761, 0.0073, −0.0463, 0.0347
x_n(k), n = 1060, k = 1, …, 21
Heart rate, prediction of the next heart rates: x_n(1), n = 1061 - 1081
Growth rate of GNP of USA
The observations are quarterly, from the second quarter of 1947 till the first quarter of 1991 (n = 176)
(figures: growth rate of GNP of USA; autocorrelation r(τ); partial autocorrelation φ_{τ,τ}; AIC(p), p = 1, …, 10)
AR(3): x_t = φ_0 + φ_1 x_{t−1} + φ_2 x_{t−2} + φ_3 x_{t−3} + z_t
μ̂ = 0.0077, φ̂_1 = 0.35, φ̂_2 = 0.18, φ̂_3 = −0.14
φ̂_0 = μ̂ (1 − φ̂_1 − φ̂_2 − φ̂_3) = 0.0047
x_t = 0.0047 + 0.35 x_{t−1} + 0.18 x_{t−2} − 0.14 x_{t−3} + z_t, s_z = σ̂_z = 0.0098
AR(1): x_t = 0.0047 + 0.38 x_{t−1} + z_t, s_z = σ̂_z = 0.0099
(figures: prediction of the growth rate with AR(1) and AR(3), x_n(k), n = 170, k = 1, …, 6)
Growth rate of GNP of USA
Predictability for k steps ahead with AR(p), p = 1, …, 10
x_n(1), n = 126 - 176 and x_n(1), n = 146 - 176
(figures: nrmse(p) for k = 1 and k = 2, computed on the last 50 and on the last 30 data)
Prediction with moving average models (MA)
MA(1) model: x_t = z_t + θ z_{t−1}, z_t ~ WN(0, σ_z²)
E[z_{n+j} | x_n, x_{n−1}, …] = z_{n+j} if j ≤ 0, and 0 if j > 0
t = n+1: x_{n+1} = z_{n+1} + θ z_n
  Optimal prediction at time step 1: x_n(1) = θ z_n
  Prediction error: e_n(1) = z_{n+1}, Var[e_n(1)] = σ_z²
t = n+2: x_{n+2} = z_{n+2} + θ z_{n+1}
  Optimal prediction at time step 2: x_n(2) = 0
  Prediction error: e_n(2) = x_{n+2}, Var[e_n(2)] = Var[x_{n+2}] = σ_x²
For time step k:
  x_n(k) = θ z_n for k = 1, and 0 for k > 1
  e_n(k) = z_{n+1} for k = 1, and x_{n+k} for k > 1
MA(q) model: x_t = z_t + θ_1 z_{t−1} + … + θ_q z_{t−q}
t = n+1: x_{n+1} = z_{n+1} + θ_1 z_n + … + θ_q z_{n−q+1}
  Optimal prediction at time step 1: x_n(1) = θ_1 z_n + … + θ_q z_{n−q+1}
  Prediction error: e_n(1) = z_{n+1}, Var[e_n(1)] = σ_z²
t = n+2: x_{n+2} = z_{n+2} + θ_1 z_{n+1} + θ_2 z_n + … + θ_q z_{n−q+2}
  Optimal prediction at time step 2: x_n(2) = θ_2 z_n + … + θ_q z_{n−q+2}
  Prediction error: e_n(2) = z_{n+2} + θ_1 z_{n+1}, Var[e_n(2)] = (θ_1² + 1) σ_z²
t = n+k: x_{n+k} = z_{n+k} + θ_1 z_{n+k−1} + … + θ_q z_{n+k−q}
  Optimal prediction at time step k:
  x_n(k) = θ_k z_n + θ_{k+1} z_{n−1} + … + θ_q z_{n−q+k} if k ≤ q, and 0 if k > q
  Prediction error: e_n(k) = z_{n+k} + θ_1 z_{n+k−1} + … + θ_{k−1} z_{n+1}
  e_n(k) = Σ_{j=0}^{k−1} θ_j z_{n+k−j} (with θ_0 = 1), Var[e_n(k)] = σ_z² Σ_{j=0}^{k−1} θ_j²
Growth rate of GNP of USA
MA(2): x_t = 0.0077 + z_t + 0.41 z_{t−1} + 0.40 z_{t−2}, s_z = σ̂_z = 0.0109
x_n(k), n = 170, k = 1, …, 6 and x_n(1), n = 146 - 176
(figures: prediction of the growth rate with MA(2); nrmse(q) for k = 1, 2 with MA(q), q = 0, …, 10, on the last 30 data)
ARMA(p,q) model: x_t = φ_1 x_{t−1} + … + φ_p x_{t−p} + z_t + θ_1 z_{t−1} + … + θ_q z_{t−q}
t = n+1: x_{n+1} = φ_1 x_n + … + φ_p x_{n−p+1} + z_{n+1} + θ_1 z_n + … + θ_q z_{n−q+1}
  Optimal prediction at time step 1: x_n(1) = φ_1 x_n + … + φ_p x_{n−p+1} + θ_1 z_n + … + θ_q z_{n−q+1}
  Prediction error: e_n(1) = z_{n+1}, Var[e_n(1)] = σ_z²
Optimal prediction at time step k:
  x_n(k) = φ_1 x_n(k−1) + … + φ_p x_n(k−p) + θ_k z_n + … + θ_q z_{n−q+k} if k ≤ q
  x_n(k) = φ_1 x_n(k−1) + … + φ_p x_n(k−p) if k > q
Prediction with ARMA: merging of the predictions with AR and MA
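The merged AR/MA recursion can be sketched by keeping the past residuals and replacing future shocks with their zero mean (function name is ours; in practice the residuals z come from the model fit):

```python
def arma_predict(x, z, phi, theta, k):
    """k-step predictions for ARMA(p,q).

    x: observations, z: model residuals aligned with x.
    Future z's are set to E[z | past] = 0, so MA terms drop out once k > q."""
    xs, zs = list(x), list(z)
    preds = []
    for _ in range(k):
        p = sum(phi[i] * xs[-1 - i] for i in range(len(phi)))
        p += sum(theta[j] * zs[-1 - j] for j in range(len(theta)))
        preds.append(p)
        xs.append(p)          # prediction feeds the AR part of later steps
        zs.append(0.0)        # future shock replaced by its mean
    return preds

# pure MA(1) with theta = 0.5 and last residual z_n = 2:
# x_n(1) = theta * z_n = 1.0 and x_n(2) = 0, as derived above
assert arma_predict([0.0], [2.0], [], [0.5], 2) == [1.0, 0.0]
```

Setting `theta = []` recovers the AR recursion, and `phi = []` the MA one, matching the "merging" remark above.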
Growth rate of GNP of USA
ARMA(3,2): x_t = 0.0034 + 0.15 x_{t−1} + 0.29 x_{t−2} + 0.12 x_{t−3} + z_t + 0.33 z_{t−1} + 0.13 z_{t−2}, s_z = σ̂_z = 0.0105
x_n(k), n = 170, k = 1, …, 6 and x_n(1), n = 146 - 176
(figures: prediction of the growth rate with ARMA(3,2); nrmse(p) for k = 1, 2 with ARMA(p,1), p = 0, …, 10, on the last 30 data)
Prediction of non-stationary time series
Given a non-stationary time series {y_1, y_2, …, y_n}
Stages of prediction:
1. transformation to a stationary time series: {y_1, y_2, …, y_n} → {x_1, x_2, …, x_n}
2. prediction of x_{n+k} with some model: x_n(k)
3. inverse transform on the prediction: y_n(k)
standard decomposition model for y_t: y_t = μ_t + s_t + x_t, so y_{n+k} = μ_{n+k} + s_{n+k} + x_{n+k}
Estimation of μ_t and s_t as functions of time t:
1. x_t = y_t − μ_t − s_t
2. x_n(k): prediction (of ARMA type) of x_{n+k}
3. y_n(k) = μ_{n+k} + s_{n+k} + x_n(k)
Removal of μ_t and s_t (using differences) → prediction with ARIMA or SARIMA models
ARIMA(p,1,q)
Stages of the prediction of y_n(1):
1. transformation: x_t = y_t − y_{t−1}, {y_1, y_2, …, y_n} → {x_2, x_3, …, x_n} stationary
2. prediction of x_{n+1} with ARMA(p,q) → x_n(1)
3. inverse transform: y_n(1) = y_n + x_n(1)
prediction error: e_n(1) = ẽ_n(1), the prediction error of x_n(1)
For the prediction at k steps ahead: y_n(k) = y_n(k−1) + x_n(k)
  y_n(k−1): known from the prediction of y_{n+k−1}
  x_n(k): ARMA(p,q) prediction of x_{n+k}
Similar procedure for the prediction with ARIMA(p,d,q) or SARIMA(p,d,q)×(P,D,Q)_s models
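The three stages for ARIMA(p,1,q) (difference, predict the stationary series, integrate back) can be sketched as follows (function names are ours; any ARMA-type predictor can be plugged in for the stationary series):

```python
def arima_1_predict(y, predict_x, k):
    """Predictions y_n(1..k) for an ARIMA(p,1,q)-style model.

    predict_x(x, k) must return k-step predictions for the differenced series."""
    x = [y[t] - y[t - 1] for t in range(1, len(y))]   # x_t = y_t - y_{t-1}
    x_preds = predict_x(x, k)
    y_preds, level = [], y[-1]
    for xp in x_preds:
        level = level + xp        # y_n(k) = y_n(k-1) + x_n(k)
        y_preds.append(level)
    return y_preds

# random-walk-with-drift check: a constant predicted difference accumulates
y = [10.0, 12.0, 14.0]
pred = arima_1_predict(y, lambda x, k: [x[-1]] * k, 2)
```

For d > 1 or a seasonal difference, the same pattern applies with the corresponding inverse differencing at stage 3.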
Example: ASE index, period from January 2002 to September 2005
(figures: ASE General Index, Jan 2002 - Sep 2005, and its autocorrelation r(τ), τ = 0, …, 50)
Returns: x_t = (y_t − y_{t−1}) / y_{t−1}
(figures: returns of the ASE General Index and their autocorrelation r(τ), τ = 0, …, 20)
ASE index, period from January 2002 to September 2005
Order of the AR model for the returns x_t = (y_t − y_{t−1}) / y_{t−1}
(figures: partial autocorrelation φ_{τ,τ} of the returns of the general index, τ = 0, …, 20; AIC(p), p = 0, …, 20)
Prediction of many steps ahead, all for current time 20/9/2005
returns of ASE: x_n(k) of the index return, n = 20.9.2005, with AR(7)
inverse transform: y_t = y_{t−1}(1 + x_t), so y_{n+1} = y_n(1 + x_{n+1})
Prediction: y_n(1) = y_n(1 + x_n(1)), y_n(k) = y_n(k−1)(1 + x_n(k))
(figures: x_n(k) of the index return and y_n(k) of the general index, n = 20.9.2005, AR(7))
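The inverse transform from predicted returns back to index levels is a running product (a minimal sketch; function name is ours):

```python
def index_from_returns(y_n, return_preds):
    """y_n(k) = y_n(k-1) * (1 + x_n(k)), starting from the last observed level."""
    levels, level = [], y_n
    for r in return_preds:
        level = level * (1 + r)
        levels.append(level)
    return levels

# two predicted returns of +1% and -1%, starting from level 100:
levels = index_from_returns(100.0, [0.01, -0.01])   # 101.0, then 99.99
```

This mirrors the additive case of ARIMA: there the differences are summed back, here the returns are compounded.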
ASE index, period from January 2002 to September 2005
One-step-ahead prediction for the period 20/9/2005 - 12/10/2005
(figure: x_n(1) of the general index, n = 20.9.2005 to 12.10.2005, with AR(1) and AR(7))
Estimation of the prediction error with AR(p) models for the period 20/9/2005 - 12/10/2005
(figure: nrmse(p) of AR for the general index, k = 1, 2, 5, p = 0, …, 20)