Rob J Hyndman
State space models 2: Structural models

Outline
1 Simple structural models
2 Linear Gaussian state space models
3 Kalman filter
4 Kalman smoothing
5 Time varying parameter models

1 Simple structural models

State space models

ETS state vector: $x_t = (\ell_t, b_t, s_t, s_{t-1}, \dots, s_{t-m+1})$

[Diagram: a chain of states $x_{t-1}, x_t, x_{t+1}, \dots$ with the observations $y_t, y_{t+1}, \dots$ attached to it.]

ETS models:
$y_t$ depends on $x_{t-1}$.
The same error process affects $x_t \mid x_{t-1}$ and $y_t \mid x_{t-1}$.

Structural models:
$y_t$ depends on $x_t$.
A different error process affects $x_t \mid x_{t-1}$ and $y_t \mid x_t$.

Local level model

Stochastically varying level (random walk) observed with noise:
$$y_t = \ell_t + \varepsilon_t$$
$$\ell_t = \ell_{t-1} + \xi_t$$
$\varepsilon_t$ and $\xi_t$ are independent Gaussian white noise processes.
Compare ETS(A,N,N), where $\xi_t = \alpha\varepsilon_{t-1}$.
Parameters to estimate: $\sigma_\varepsilon^2$ and $\sigma_\xi^2$.
If $\sigma_\xi^2 = 0$, then $y_t \sim \text{NID}(\ell_0, \sigma_\varepsilon^2)$.
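As a rough illustration of the local level model above, the following sketch simulates a random-walk level observed with noise and fits it with `StructTS(type = "level")` from base R's stats package. The seed, series length and standard deviations are arbitrary assumptions made for this example; they do not come from the slides.

```r
# Simulate a local level model: y_t = l_t + eps_t, l_t = l_{t-1} + xi_t
set.seed(1)                                  # arbitrary seed (assumption)
n <- 200                                     # arbitrary series length (assumption)
sigma_eps <- 1.0                             # sd of observation noise (assumption)
sigma_xi  <- 0.5                             # sd of level noise (assumption)
level <- cumsum(rnorm(n, sd = sigma_xi))     # random-walk level
y <- ts(level + rnorm(n, sd = sigma_eps))    # level observed with noise

# Fit the local level model; the estimated parameters are the two variances
fit_level <- StructTS(y, type = "level")
print(fit_level$coef)
```

Both estimated parameters are variances, so they are constrained to be non-negative, in line with the parameter list on the slide.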
Local linear trend model

Dynamic trend observed with noise:
$$y_t = \ell_t + \varepsilon_t$$
$$\ell_t = \ell_{t-1} + b_{t-1} + \xi_t$$
$$b_t = b_{t-1} + \zeta_t$$
$\varepsilon_t$, $\xi_t$ and $\zeta_t$ are independent Gaussian white noise processes.
Compare ETS(A,A,N), where $\xi_t = (\alpha + \beta)\varepsilon_{t-1}$ and $\zeta_t = \beta\varepsilon_{t-1}$.
Parameters to estimate: $\sigma_\varepsilon^2$, $\sigma_\xi^2$ and $\sigma_\zeta^2$.
If $\sigma_\zeta^2 = \sigma_\xi^2 = 0$, then $y_t = \ell_0 + t b_0 + \varepsilon_t$.
Model is a time-varying linear regression.
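The deterministic special case of the local linear trend model can be checked by substituting zero state noise into the state equations. The short derivation below is only a worked restatement of the slide's claim and introduces no new symbols.

```latex
\begin{align*}
\sigma_\zeta^2 = 0 &\;\Rightarrow\; b_t = b_{t-1} = \dots = b_0, \\
\sigma_\xi^2 = 0 &\;\Rightarrow\; \ell_t = \ell_{t-1} + b_0 = \ell_0 + t\,b_0, \\
&\;\Rightarrow\; y_t = \ell_t + \varepsilon_t = \ell_0 + t\,b_0 + \varepsilon_t .
\end{align*}
```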
Basic structural model

$$y_t = \ell_t + s_{1,t} + \varepsilon_t$$
$$\ell_t = \ell_{t-1} + b_{t-1} + \xi_t$$
$$b_t = b_{t-1} + \zeta_t$$
$$s_{1,t} = -\sum_{j=1}^{m-1} s_{j,t-1} + \eta_t$$
$$s_{j,t} = s_{j-1,t-1}, \quad j = 2, \dots, m-1$$
$\varepsilon_t$, $\xi_t$, $\zeta_t$ and $\eta_t$ are independent Gaussian white noise processes.
Compare ETS(A,A,A).
Parameters to estimate: $\sigma_\varepsilon^2$, $\sigma_\xi^2$, $\sigma_\zeta^2$ and $\sigma_\eta^2$.
Deterministic seasonality if $\sigma_\eta^2 = 0$.
Trigonometric models

$$y_t = \ell_t + \sum_{j=1}^{J} s_{j,t} + \varepsilon_t$$
$$\ell_t = \ell_{t-1} + b_{t-1} + \xi_t$$
$$b_t = b_{t-1} + \zeta_t$$
$$s_{j,t} = \cos\lambda_j \, s_{j,t-1} + \sin\lambda_j \, s^*_{j,t-1} + \omega_{j,t}$$
$$s^*_{j,t} = -\sin\lambda_j \, s_{j,t-1} + \cos\lambda_j \, s^*_{j,t-1} + \omega^*_{j,t}$$
$$\lambda_j = 2\pi j/m$$
$\varepsilon_t$, $\xi_t$, $\zeta_t$, $\omega_{j,t}$ and $\omega^*_{j,t}$ are independent Gaussian white noise processes.
$\omega_{j,t}$ and $\omega^*_{j,t}$ have the same variance $\sigma_{\omega,j}^2$.
Equivalent to BSM when $\sigma_{\omega,j}^2 = \sigma_\omega^2$ and $J = m/2$.
Choose $J < m/2$ for fewer degrees of freedom.
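The seasonal recursion in the trigonometric model is a rotation of the pair $(s_{j,t}, s^*_{j,t})$ by the angle $\lambda_j$. The sketch below, with the noise terms set to zero, checks that the first harmonic repeats after $m$ steps. The values of `m`, `j` and the starting state are arbitrary assumptions for illustration.

```r
m <- 12                                  # seasonal period (assumption)
j <- 1                                   # harmonic index (assumption)
lambda <- 2 * pi * j / m                 # lambda_j = 2*pi*j/m

# Rotation matrix for (s_{j,t}, s*_{j,t}); filled column by column, so
# row 1 is (cos, sin) and row 2 is (-sin, cos), as in the state equations
rot <- matrix(c(cos(lambda), -sin(lambda),
                sin(lambda),  cos(lambda)), nrow = 2)

state0 <- c(1, 0.5)                      # arbitrary starting (s, s*) (assumption)
state <- state0
for (t in seq_len(m)) {
  state <- drop(rot %*% state)           # noise-free update
}
print(rbind(start = state0, after_m_steps = state))
```

With zero noise the component is exactly periodic; the $\omega$ terms let the amplitude and phase of the cycle drift over time.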
ETS vs Structural models

ETS models are much more general as they allow non-linear (multiplicative) components.
ETS allows automatic forecasting due to its larger model space.
Additive ETS models are almost equivalent to the corresponding structural models.
ETS models have a larger parameter space. Structural model parameters are always non-negative (variances).
Structural models are much easier to generalize (e.g., add covariates).
It is easier to handle missing values with structural models.
Structural models in R

    library(fpp)  # assumed source of the oil, ausair and austourists data sets

    StructTS(oil, type = "level")
    StructTS(ausair, type = "trend")
    StructTS(austourists, type = "BSM")

    # Decompose the series into the filtered level, slope and seasonal states
    fit <- StructTS(austourists, type = "BSM")
    decomp <- cbind(austourists, fitted(fit))
    colnames(decomp) <- c("data", "level", "slope", "seasonal")
    plot(decomp, main = "Decomposition of International visitor nights")

[Figure: Decomposition of International visitor nights, with panels data, level, slope and seasonal plotted against Time, 2000 to 2010.]

ETS decomposition

[Figure: Decomposition by ETS(A,A,A) method, with panels observed, level, slope and season plotted against Time, 2000 to 2010.]

2 Linear Gaussian state space models

Linear Gaussian SS models

Observation equation: $y_t = f' x_t + \varepsilon_t$
State equation: $x_t = G x_{t-1} + w_t$

State vector $x_t$ of length $p$.
$G$ a $p \times p$ matrix, $f$ a vector of length $p$.
$\varepsilon_t \sim \text{NID}(0, \sigma^2)$, $w_t \sim \text{NID}(0, W)$.

Local level model: $f = G = 1$, $x_t = \ell_t$.

Local linear trend model: $f' = [1\ 0]$,
$$x_t = \begin{bmatrix} \ell_t \\ b_t \end{bmatrix}, \quad G = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad W = \begin{bmatrix} \sigma_\xi^2 & 0 \\ 0 & \sigma_\zeta^2 \end{bmatrix}$$
Basic structural model

Linear Gaussian state space model:
$$y_t = f' x_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$$
$$x_t = G x_{t-1} + w_t, \quad w_t \sim N(0, W)$$
$$f' = [1\ 0\ 1\ 0\ \cdots\ 0], \quad W = \text{diagonal}(\sigma_\xi^2, \sigma_\zeta^2, \sigma_\eta^2, 0, \dots, 0)$$
$$x_t = \begin{bmatrix} \ell_t \\ b_t \\ s_{1,t} \\ s_{2,t} \\ s_{3,t} \\ \vdots \\ s_{m-1,t} \end{bmatrix}, \quad
G = \begin{bmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & -1 & -1 & \cdots & -1 & -1 \\
0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & 1 & \ddots & \vdots & \vdots \\
\vdots & \vdots & \vdots & \ddots & \ddots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 1 & 0
\end{bmatrix}$$
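To make the matrix form of the basic structural model concrete, here is a minimal sketch that builds $f$, $G$ and $W$ for a given seasonal period. The function name `bsm_matrices` and its argument names are hypothetical, introduced only for this example, and the numbers in the usage lines are arbitrary.

```r
# Build f, G and W for the basic structural model with seasonal period m.
# bsm_matrices is a hypothetical helper, not part of any package.
bsm_matrices <- function(m, sigma2_xi, sigma2_zeta, sigma2_eta) {
  p <- m + 1                              # level, slope and m - 1 seasonal states
  f <- c(1, 0, 1, rep(0, m - 2))          # y_t = l_t + s_{1,t} + eps_t
  G <- matrix(0, p, p)
  G[1, 1:2] <- 1                          # l_t = l_{t-1} + b_{t-1} + xi_t
  G[2, 2] <- 1                            # b_t = b_{t-1} + zeta_t
  G[3, 3:p] <- -1                         # s_{1,t} = -sum_j s_{j,t-1} + eta_t
  if (m > 2) {
    for (i in 4:p) G[i, i - 1] <- 1       # s_{j,t} = s_{j-1,t-1}
  }
  W <- diag(c(sigma2_xi, sigma2_zeta, sigma2_eta, rep(0, m - 2)))
  list(f = f, G = G, W = W)
}

# Quarterly example (m = 4) with arbitrary variances (assumption)
mats <- bsm_matrices(m = 4, sigma2_xi = 1, sigma2_zeta = 0.1, sigma2_eta = 0.5)
x_prev <- c(10, 1, 3, -1, -2)             # (level, slope, s1, s2, s3) at t - 1
x_next <- drop(mats$G %*% x_prev)         # noise-free state transition
print(x_next)
```

In the usage lines the level advances by the slope, the new first seasonal state is minus the sum of the previous seasonal states, and the remaining seasonal states shift down by one position.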
3 Kalman filter

Kalman filter

Notation:
$$\hat{x}_{t|t} = E[x_t \mid y_1, \dots, y_t], \quad \hat{P}_{t|t} = V[x_t \mid y_1, \dots, y_t]$$
$$\hat{x}_{t|t-1} = E[x_t \mid y_1, \dots, y_{t-1}], \quad \hat{P}_{t|t-1} = V[x_t \mid y_1, \dots, y_{t-1}]$$
$$\hat{y}_{t|t-1} = E[y_t \mid y_1, \dots, y_{t-1}], \quad \hat{v}_{t|t-1} = V[y_t \mid y_1, \dots, y_{t-1}]$$

Forecasting:
$$\hat{y}_{t|t-1} = f' \hat{x}_{t|t-1}$$
$$\hat{v}_{t|t-1} = f' \hat{P}_{t|t-1} f + \sigma^2$$

Updating or State Filtering:
$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + \hat{P}_{t|t-1} f \hat{v}_{t|t-1}^{-1} (y_t - \hat{y}_{t|t-1})$$
$$\hat{P}_{t|t} = \hat{P}_{t|t-1} - \hat{P}_{t|t-1} f \hat{v}_{t|t-1}^{-1} f' \hat{P}_{t|t-1}$$

State Prediction:
$$\hat{x}_{t+1|t} = G \hat{x}_{t|t}$$
$$\hat{P}_{t+1|t} = G \hat{P}_{t|t} G' + W$$

Iterate for $t = 1, \dots, T$.
Assume we know $\hat{x}_{1|0}$ and $\hat{P}_{1|0}$.
These are just conditional expectations, so this gives minimum MSE estimates.

Kalman recursions

[Diagram: Kalman recursions. 1. State prediction: from the filtered state at time t-1 to the predicted state at time t. 2. Forecasting: from the predicted state to the forecast observation. 3. State filtering: the observation at time t updates the predicted state to the filtered state at time t.]

Initializing Kalman filter

Need $\hat{x}_{1|0}$ and $\hat{P}_{1|0}$ to get started.
Common approach for structural models: set $\hat{x}_{1|0} = 0$ and $\hat{P}_{1|0} = kI$ for a very large $k$.
Lots of research papers on optimal initialization choices for Kalman recursions.
ETS approach was to estimate $\hat{x}_{1|0}$ and avoid $\hat{P}_{1|0}$ by assuming error processes identical.
A random $\hat{x}_{1|0}$ could be used with ETS models, and then a form of Kalman filter would be required for estimation and forecasting.
This gives more realistic prediction intervals.
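The forecasting, updating and state prediction steps above can be sketched as a short loop. This is a minimal illustration for a scalar observation with fixed $f$, $G$, $W$ and $\sigma^2$, not a production filter; the function name `kalman_filter` and its argument names are hypothetical, and the usage example assumes a local level model with a large-$k$ initialization as described above.

```r
# Minimal Kalman filter for y_t = f'x_t + eps_t, x_t = G x_{t-1} + w_t.
# kalman_filter is a hypothetical helper written for this example.
kalman_filter <- function(y, f, G, W, sigma2, x10, P10) {
  n <- length(y)
  p <- length(f)
  x_pred <- x10                            # x_{1|0}
  P_pred <- P10                            # P_{1|0}
  yhat <- v <- numeric(n)
  x_filt <- matrix(NA_real_, n, p)
  for (t in seq_len(n)) {
    # Forecasting
    yhat[t] <- sum(f * x_pred)
    v[t] <- drop(t(f) %*% P_pred %*% f) + sigma2
    # Updating (state filtering)
    K <- (P_pred %*% f) / v[t]             # P_{t|t-1} f / v_{t|t-1}
    x_upd <- x_pred + drop(K) * (y[t] - yhat[t])
    P_upd <- P_pred - K %*% t(f) %*% P_pred
    x_filt[t, ] <- x_upd
    # State prediction
    x_pred <- drop(G %*% x_upd)
    P_pred <- G %*% P_upd %*% t(G) + W
  }
  list(yhat = yhat, v = v, x_filt = x_filt)
}

# Usage: local level model (f = G = 1) with x_{1|0} = 0 and P_{1|0} = k, k large.
# The data and variances are arbitrary assumptions.
kf <- kalman_filter(y = c(1, 2, 3), f = 1, G = matrix(1), W = matrix(1),
                    sigma2 = 1, x10 = 0, P10 = matrix(1e7))
print(kf$x_filt)
```

With a very large initial variance the first filtered state is pulled almost entirely to the first observation, which is the practical effect of the $kI$ initialization.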
Local level model

$$y_t = \ell_t + \varepsilon_t, \quad \varepsilon_t \sim \text{NID}(0, \sigma^2)$$
$$\ell_t = \ell_{t-1} + u_t, \quad u_t \sim \text{NID}(0, q^2)$$

Kalman recursions:
$$\hat{y}_{t|t-1} = \hat{\ell}_{t-1|t-1}$$
$$\hat{v}_{t|t-1} = \hat{p}_{t|t-1} + \sigma^2$$
$$\hat{\ell}_{t|t} = \hat{\ell}_{t-1|t-1} + \hat{p}_{t|t-1} \hat{v}_{t|t-1}^{-1} (y_t - \hat{y}_{t|t-1})$$
$$\hat{p}_{t+1|t} = \hat{p}_{t|t-1} (1 - \hat{v}_{t|t-1}^{-1} \hat{p}_{t|t-1}) + q^2$$

Handling missing values

Iterate for $t = 1, \dots, T$ starting with $\hat{x}_{1|0}$ and $\hat{P}_{1|0}$.

Forecasting:
$$\hat{y}_{t|t-1} = f' \hat{x}_{t|t-1}$$
$$\hat{v}_{t|t-1} = f' \hat{P}_{t|t-1} f + \sigma^2$$

Updating or State Filtering:
$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + \hat{P}_{t|t-1} f \hat{v}_{t|t-1}^{-1} (y_t - \hat{y}_{t|t-1})$$
$$\hat{P}_{t|t} = \hat{P}_{t|t-1} - \hat{P}_{t|t-1} f \hat{v}_{t|t-1}^{-1} f' \hat{P}_{t|t-1}$$

State Prediction:
$$\hat{x}_{t|t-1} = G \hat{x}_{t-1|t-1}$$
$$\hat{P}_{t|t-1} = G \hat{P}_{t-1|t-1} G' + W$$

If $y_t$ is missing, ignore the correction terms in the updating step (greyed out on the slide), so that $\hat{x}_{t|t} = \hat{x}_{t|t-1}$ and $\hat{P}_{t|t} = \hat{P}_{t|t-1}$.
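The scalar recursions for the local level model, together with the missing value rule, can be sketched as follows. The function name `local_level_filter`, its defaults and the example data are assumptions made for illustration; a missing observation simply skips the correction so that only the state prediction step is applied.

```r
# Local level Kalman filter that skips the update when y_t is missing.
# local_level_filter is a hypothetical helper written for this example.
local_level_filter <- function(y, sigma2, q2, l0 = 0, p0 = 1e7) {
  n <- length(y)
  l_filt <- numeric(n)
  l_prev <- l0                  # filtered level from the previous step
  p_pred <- p0                  # p_{t|t-1}
  for (t in seq_len(n)) {
    yhat <- l_prev              # one-step forecast
    v <- p_pred + sigma2        # one-step forecast variance
    if (is.na(y[t])) {
      l_filt[t] <- l_prev       # no correction: filtered = predicted
      p_pred <- p_pred + q2
    } else {
      l_filt[t] <- l_prev + p_pred / v * (y[t] - yhat)
      p_pred <- p_pred * (1 - p_pred / v) + q2
    }
    l_prev <- l_filt[t]
  }
  l_filt
}

# Usage with one missing value (arbitrary data and variances)
lev <- local_level_filter(c(1, NA, 3), sigma2 = 1, q2 = 1)
print(lev)
```

The filtered level is carried forward unchanged through the gap, while the prediction variance keeps growing by $q^2$, so the next available observation receives more weight.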
Multi-step forecasting

Iterate for $t = T+1, \dots, T+h$ starting with $\hat{x}_{T|T}$ and $\hat{P}_{T|T}$.

Forecasting:
$$\hat{y}_{t|t-1} = f' \hat{x}_{t|t-1}$$
$$\hat{v}_{t|t-1} = f' \hat{P}_{t|t-1} f + \sigma^2$$

Updating or State Filtering:
$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + \hat{P}_{t|t-1} f \hat{v}_{t|t-1}^{-1} (y_t - \hat{y}_{t|t-1})$$
$$\hat{P}_{t|t} = \hat{P}_{t|t-1} - \hat{P}_{t|t-1} f \hat{v}_{t|t-1}^{-1} f' \hat{P}_{t|t-1}$$

State Prediction:
$$\hat{x}_{t|t-1} = G \hat{x}_{t-1|t-1}$$
$$\hat{P}_{t|t-1} = G \hat{P}_{t-1|t-1} G' + W$$

Treat future values as missing.

Kalman filter

What's so special about the Kalman filter?
Very general equations for any model in state space format.
Any model in state space format can easily be generalized.
Optimal MSE forecasts.
Easy to handle missing values.
Easy to compute likelihood.
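Treating future values as missing means that multi-step forecasting reduces to repeating the state prediction and forecasting steps with no updating. A minimal sketch follows, assuming fixed $f$, $G$, $W$ and $\sigma^2$; the function name `multi_step_forecast` and the numbers in the usage example are hypothetical.

```r
# h-step forecasts by iterating state prediction without any updating.
# multi_step_forecast is a hypothetical helper written for this example.
multi_step_forecast <- function(x_TT, P_TT, f, G, W, sigma2, h) {
  x <- x_TT
  P <- P_TT
  fc_mean <- fc_var <- numeric(h)
  for (j in seq_len(h)) {
    x <- drop(G %*% x)                              # state prediction
    P <- G %*% P %*% t(G) + W
    fc_mean[j] <- sum(f * x)                        # forecast mean
    fc_var[j] <- drop(t(f) %*% P %*% f) + sigma2    # forecast variance
  }
  list(mean = fc_mean, var = fc_var)
}

# Usage: local level model with filtered level 5 and filtered variance 2
# (arbitrary values chosen for illustration)
fc <- multi_step_forecast(x_TT = 5, P_TT = matrix(2), f = 1, G = matrix(1),
                          W = matrix(0.5), sigma2 = 1, h = 3)
print(fc)
```

For the local level model the point forecasts are flat at the last filtered level, and the forecast variance grows by the state variance at each step.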
Likelihood calculation

$\theta$ = all unknown parameters
$f_\theta(y_t \mid y_1, y_2, \dots, y_{t-1})$ = one-step forecast density.

Likelihood:
$$L(y_1, \dots, y_T; \theta) = \prod_{t=1}^{T} f_\theta(y_t \mid y_1, \dots, y_{t-1})$$

Gaussian log likelihood:
$$\log L = -\frac{T}{2} \log(2\pi) - \frac{1}{2} \sum_{t=1}^{T} \log \hat{v}_{t|t-1} - \frac{1}{2} \sum_{t=1}^{T} e_t^2 / \hat{v}_{t|t-1}$$
where $e_t = y_t - \hat{y}_{t|t-1}$.
All terms obtained from Kalman filter equations.
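As a sketch of how the Gaussian log likelihood falls out of the filter, the function below accumulates the one-step forecast errors and variances for the local level model. The function name `local_level_loglik`, its default initial values and the choice of the `Nile` series (from base R's datasets package) are assumptions for illustration, not part of the slides.

```r
# Gaussian log likelihood of the local level model via the Kalman filter.
# local_level_loglik is a hypothetical helper written for this example.
local_level_loglik <- function(y, sigma2, q2, l0 = y[1], p0 = 1e7) {
  l_prev <- l0
  p_pred <- p0
  loglik <- 0
  for (t in seq_along(y)) {
    e <- y[t] - l_prev                       # e_t = y_t - yhat_{t|t-1}
    v <- p_pred + sigma2                     # v_{t|t-1}
    loglik <- loglik - 0.5 * (log(2 * pi) + log(v) + e^2 / v)
    l_prev <- l_prev + p_pred / v * e        # filtered level
    p_pred <- p_pred * (1 - p_pred / v) + q2 # next prediction variance
  }
  loglik
}

# Usage: maximize over log-variances so both variances stay positive.
# Starting values are an arbitrary choice (assumption).
start <- rep(log(var(Nile)), 2)
opt <- optim(start, function(par) -local_level_loglik(Nile, exp(par[1]), exp(par[2])))
print(exp(opt$par))                          # estimated (sigma^2, q^2)
```

Optimizing on the log scale is one simple way to respect the constraint that the parameters are variances; other parameterizations would work equally well.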
Structural models in R

    library(fpp)  # assumed source of austourists; it also loads the forecast package

    fit <- StructTS(austourists, type = "BSM")
    fc <- forecast(fit)
    plot(fc)

[Figure: Forecasts from Basic structural model, 2000 to 2012.]

4 Kalman smoothing

Kalman smoothing

Want estimate of $x_t \mid y_1, \dots, y_T$ where $t < T$. That is, $\hat{x}_{t|T}$.
$$\hat{x}_{t|T} = \hat{x}_{t|t} + A_t \left( \hat{x}_{t+1|T} - \hat{x}_{t+1|t} \right)$$
$$\hat{P}_{t|T} = \hat{P}_{t|t} + A_t \left( \hat{P}_{t+1|T} - \hat{P}_{t+1|t} \right) A_t'$$
where $A_t = \hat{P}_{t|t} G' \hat{P}_{t+1|t}^{-1}$.

Uses all data, not just previous data.
Useful for estimating missing values: $\hat{y}_{t|T} = f' \hat{x}_{t|T}$.
Useful for seasonal adjustment when one of the states is a seasonal component.
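The smoothing recursion runs backwards from $t = T-1$ after a forward filtering pass. The sketch below applies it to the local level model, where $G = 1$ and so $A_t = \hat{p}_{t|t} / \hat{p}_{t+1|t}$. The function name `local_level_smoother`, its defaults and the example data are hypothetical choices for illustration.

```r
# Kalman filter plus backward smoothing pass for the local level model.
# local_level_smoother is a hypothetical helper written for this example.
local_level_smoother <- function(y, sigma2, q2, l0 = 0, p0 = 1e7) {
  n <- length(y)
  l_filt <- p_filt <- l_pred <- p_pred <- numeric(n)
  lp <- l0                                   # x_{1|0}
  pp <- p0                                   # P_{1|0}
  for (t in seq_len(n)) {                    # forward pass: Kalman filter
    l_pred[t] <- lp
    p_pred[t] <- pp
    v <- pp + sigma2
    l_filt[t] <- lp + pp / v * (y[t] - lp)
    p_filt[t] <- pp - pp^2 / v
    lp <- l_filt[t]                          # state prediction with G = 1
    pp <- p_filt[t] + q2
  }
  l_sm <- l_filt                             # at t = T, smoothed = filtered
  p_sm <- p_filt
  for (t in rev(seq_len(n - 1))) {           # backward pass: smoothing
    A <- p_filt[t] / p_pred[t + 1]           # A_t = P_{t|t} G' P_{t+1|t}^{-1}
    l_sm[t] <- l_filt[t] + A * (l_sm[t + 1] - l_pred[t + 1])
    p_sm[t] <- p_filt[t] + A * (p_sm[t + 1] - p_pred[t + 1]) * A
  }
  list(filtered = l_filt, smoothed = l_sm, p_filt = p_filt, p_smooth = p_sm)
}

# Usage with arbitrary data and variances (assumption)
res <- local_level_smoother(c(1, 2, 4, 3), sigma2 = 1, q2 = 0.5)
print(cbind(filtered = res$filtered, smoothed = res$smoothed))
```

Because the smoother uses all the data, its variances are never larger than the filtered variances, and the two estimates coincide at the final time point.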
Kalman smoothing in R

    fit <- StructTS(austourists, type = "BSM")
    sm <- tsSmooth(fit)

    plot(austourists)
    lines(sm[, 1], col = 'blue')            # smoothed level
    lines(fitted(fit)[, 1], col = 'red')    # filtered level
    legend("topleft", col = c('blue', 'red'), lty = 1,
      legend = c("Smoothed level", "Filtered level"))

[Figure: austourists with the filtered and smoothed level, 2000 to 2010.]

Kalman smoothing in R

    fit <- StructTS(austourists, type = "BSM")
    sm <- tsSmooth(fit)

    plot(austourists)

    # Seasonally adjusted data
    aus.sa <- austourists - sm[, 3]
    lines(aus.sa, col = 'blue')

[Figure: austourists with the seasonally adjusted series, 2000 to 2010.]

Kalman smoothing in R

    x <- austourists
    miss <- sample(1:length(x), 5)
    x[miss] <- NA
    fit <- StructTS(x, type = "BSM")
    sm <- tsSmooth(fit)
    estim <- sm[, 1] + sm[, 3]              # smoothed level + seasonal

    plot(x, ylim = range(austourists))
    points(time(x)[miss], estim[miss], col = 'red', pch = 1)
    points(time(x)[miss], austourists[miss], col = 'black', pch = 1)
    legend("topleft", pch = 1, col = c(2, 1), legend = c("Estimate", "Actual"))

[Figure: x with the estimated and actual values at the missing time points, 2000 to 2010.]

5 Time varying parameter models

Time varying parameter models

Linear Gaussian state space model:
$$y_t = f_t' x_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_t^2)$$
$$x_t = G_t x_{t-1} + w_t, \quad w_t \sim N(0, W_t)$$

Kalman recursions:
$$\hat{y}_{t|t-1} = f_t' \hat{x}_{t|t-1}$$
$$\hat{v}_{t|t-1} = f_t' \hat{P}_{t|t-1} f_t + \sigma_t^2$$
$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + \hat{P}_{t|t-1} f_t \hat{v}_{t|t-1}^{-1} (y_t - \hat{y}_{t|t-1})$$
$$\hat{P}_{t|t} = \hat{P}_{t|t-1} - \hat{P}_{t|t-1} f_t \hat{v}_{t|t-1}^{-1} f_t' \hat{P}_{t|t-1}$$
$$\hat{x}_{t|t-1} = G_t \hat{x}_{t-1|t-1}$$
$$\hat{P}_{t|t-1} = G_t \hat{P}_{t-1|t-1} G_t' + W_t$$
Structural models with covariates

Local level with covariate

$$y_t = \ell_t + \beta z_t + \varepsilon_t$$
$$\ell_t = \ell_{t-1} + \xi_t$$
$$f_t' = [1 \;\; z_t], \qquad x_t = \begin{bmatrix} \ell_t \\ \beta \end{bmatrix}, \qquad G = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad W_t = \begin{bmatrix} \sigma_\xi^2 & 0 \\ 0 & 0 \end{bmatrix}$$

Assumes $z_t$ is fixed and known (as in regression).
Estimate of $\beta$ is given by the second element of $\hat{x}_{T|T}$.
Equivalent to simple linear regression with time varying intercept.
Easy to extend to multiple regression with additional terms.
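As a rough sketch of how the Kalman recursions apply to the local level model with a covariate, the code below filters simulated data with state $(\ell_t, \beta)'$ and reads the estimate of $\beta$ from the last filtered state. The function name `kalman_filter`, the simulated series, the variances and the near-diffuse prior are assumptions for illustration; in practice the variances would be estimated, for example by maximum likelihood.

```r
# Kalman filter for y_t = f_t' x_t + eps_t,  x_t = G x_{t-1} + w_t,
# with the time-varying f_t supplied as the rows of Fmat.
# Hypothetical helper written for this example only.
kalman_filter <- function(y, Fmat, G, W, sigma2, x0, P0) {
  n <- length(y)
  x_filt <- matrix(NA_real_, n, ncol(Fmat))
  x <- x0
  P <- P0
  for (i in seq_len(n)) {
    f <- Fmat[i, ]
    # Prediction step
    x <- as.vector(G %*% x)
    P <- G %*% P %*% t(G) + W
    # Update step
    Pf <- as.vector(P %*% f)
    v <- sum(f * Pf) + sigma2              # one-step forecast variance
    x <- x + Pf / v * (y[i] - sum(f * x))
    P <- P - outer(Pf, Pf) / v
    x_filt[i, ] <- x
  }
  list(x_filt = x_filt, P = P)
}

# Simulated data: slowly wandering level plus a fixed slope of 2 on z
set.seed(123)
n <- 200
z <- rnorm(n)
level <- 5 + cumsum(rnorm(n, sd = 0.1))
y <- level + 2 * z + rnorm(n, sd = 0.5)

fit <- kalman_filter(y, Fmat = cbind(1, z), G = diag(2),
                     W = diag(c(0.01, 0)),   # only the level has state noise
                     sigma2 = 0.25, x0 = c(0, 0), P0 = diag(1e6, 2))
beta_hat <- fit$x_filt[n, 2]   # second element of the final filtered state
```

Because the second diagonal entry of `W` is zero, the slope is treated as a constant whose estimate is refined with each observation, while the level is free to wander.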
Time varying regression

Simple linear regression with time varying parameters

$$y_t = \ell_t + \beta_t z_t + \varepsilon_t$$
$$\ell_t = \ell_{t-1} + \xi_t$$
$$\beta_t = \beta_{t-1} + \zeta_t$$
$$f_t' = [1 \;\; z_t], \qquad x_t = \begin{bmatrix} \ell_t \\ \beta_t \end{bmatrix}, \qquad G = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad W_t = \begin{bmatrix} \sigma_\xi^2 & 0 \\ 0 & \sigma_\zeta^2 \end{bmatrix}$$

Allows for a linear regression with parameters that change slowly over time.
Parameters follow independent random walks.
Estimates of parameters given by $\hat{x}_{t|t}$ or $\hat{x}_{t|T}$.
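To illustrate the time varying regression, the sketch below runs the filter with $G = I$ and a diagonal $W_t$ on simulated data whose slope drifts slowly, and records the filtered slope $\hat{\beta}_{t|t}$ at each step. The slope path, the variances in `W` and the near-diffuse starting values are assumptions chosen for the example, not estimates.

```r
# Filtered estimates of a slowly changing regression slope.
set.seed(42)
n <- 300
z <- rnorm(n)
beta_true <- 1 + sin(2 * pi * seq_len(n) / n)   # slope drifting slowly over time
y <- 0.5 + beta_true * z + rnorm(n, sd = 0.5)

W <- diag(c(0.01, 0.01))   # assumed variances of xi_t and zeta_t
sigma2 <- 0.25             # assumed observation variance
x <- c(0, 0)               # starting state (level, slope)
P <- diag(1e6, 2)          # near-diffuse starting variance
beta_filt <- numeric(n)

for (i in seq_len(n)) {
  f <- c(1, z[i])
  P <- P + W                               # prediction step with G = I
  Pf <- as.vector(P %*% f)
  v <- sum(f * Pf) + sigma2
  x <- x + Pf / v * (y[i] - sum(f * x))    # update step
  P <- P - outer(Pf, Pf) / v
  beta_filt[i] <- x[2]
}

# Average tracking error after a short burn-in
mae <- mean(abs(beta_filt - beta_true)[-(1:20)])
```

Larger values on the diagonal of `W` let the parameters adapt faster at the cost of noisier estimates; setting them to zero gives a regression with constant parameters.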
Updating ("online") regression

Same idea can be used to estimate a regression iteratively as new data arrives.

Simple linear regression with updating parameters

$$y_t = \ell_t + \beta_t z_t + \varepsilon_t$$
$$\ell_t = \ell_{t-1} + \xi_t$$
$$\beta_t = \beta_{t-1} + \zeta_t$$
$$f_t' = [1 \;\; z_t], \qquad x_t = \begin{bmatrix} \ell_t \\ \beta_t \end{bmatrix}, \qquad G = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad W_t = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

With $W_t = 0$, $\xi_t = \zeta_t = 0$, so the parameters are constant and only their estimates change over time.
Updated parameter estimates given by $\hat{x}_{t|t}$.
Recursive residuals given by $y_t - \hat{y}_{t|t-1}$.
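The updating regression can be sketched by running the Kalman update with $G = I$ and $W_t = 0$, so the prediction step leaves the state and its variance unchanged. The simulated data, the observation variance and the near-diffuse prior below are assumptions for illustration. With a very large prior variance, the final estimate should be close to the ordinary least squares fit from `lm()`, which the last line computes for comparison.

```r
# Online (recursive) estimation of a simple linear regression.
set.seed(7)
n <- 100
z <- rnorm(n)
y <- 1 + 2 * z + rnorm(n)

x <- c(0, 0)          # prior mean for (intercept, slope)
P <- diag(1e6, 2)     # near-diffuse prior variance (assumption)
sigma2 <- 1           # assumed observation variance
coefs <- matrix(NA_real_, n, 2)
rec_resid <- numeric(n)

for (i in seq_len(n)) {
  f <- c(1, z[i])
  # G = I and W = 0: the predicted state and variance equal the previous ones
  rec_resid[i] <- y[i] - sum(f * x)        # recursive residual y_t - yhat_{t|t-1}
  Pf <- as.vector(P %*% f)
  v <- sum(f * Pf) + sigma2
  x <- x + Pf / v * rec_resid[i]
  P <- P - outer(Pf, Pf) / v
  coefs[i, ] <- x                          # estimates after observation i
}

ols <- unname(coef(lm(y ~ z)))             # batch estimate for comparison
```

Each row of `coefs` holds the parameter estimates using data up to that time, so new observations can be absorbed without refitting on the full sample.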