Uploaded by Aftab Hassan (Aftab)

Mathematics behind Forecasting Models

ARIMA:
The ARIMA model is generally denoted as ARIMA(p, d, q), where:
p represents the order of the autoregressive (AR) component.
d represents the order of differencing.
q represents the order of the moving average (MA) component.
Now, let's break down the mathematical formulation of the ARIMA model:
Autoregressive (AR) Component:
The AR component captures the linear relationship between the current value of the
time series and its lagged values. The order of the AR component, denoted by p,
determines the number of lagged terms included in the model. Mathematically, the
AR component is defined as:
X_t = c + Φ₁ X_{t-1} + Φ₂ X_{t-2} + ... + Φ_p X_{t-p} + ε_t
Here, X_t represents the value of the time series at time t.
c is a constant term.
Φ₁, Φ₂, ..., Φ_p are the coefficients of the lagged terms.
ε_t is the error term (random shock) at time t, assumed to be white noise.
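As a small illustration of the AR equation, the sketch below computes the one-step prediction c + Φ₁ X_{t-1} + ... + Φ_p X_{t-p}, leaving out the unknown error term. The function name `ar_predict` and the AR(2) coefficients are made up for the example and are not fitted to any data.

```python
import numpy as np

def ar_predict(history, c, phi):
    """One-step AR(p) prediction: c + phi[0]*X_{t-1} + ... + phi[p-1]*X_{t-p}."""
    p = len(phi)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    # Reverse the last p values so that lags[i] is X_{t-1-i}.
    lags = np.asarray(history[-p:], dtype=float)[::-1]
    return c + float(np.dot(phi, lags))

# Hypothetical AR(2): X_t = 1.0 + 0.5*X_{t-1} + 0.2*X_{t-2} + eps_t
prediction = ar_predict([2.0, 3.0, 4.0], c=1.0, phi=[0.5, 0.2])
print(prediction)  # 1.0 + 0.5*4.0 + 0.2*3.0, i.e. about 3.6
```

Only the last p values of the history are used, which is what "order p" means in practice.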
Differencing (Integrated) Component:
Differencing is performed to make the time series stationary, which here means
removing trends so that the mean of the series no longer changes over time.
(Seasonal patterns are handled by seasonal differencing, covered under SARIMA.)
The order of differencing, denoted by d, is the number of times the series is
differenced. The differenced series is defined as:
X'_t = (1 - B)^d X_t
Here, B is the backshift operator, which shifts the series back by one time
period: B X_t = X_{t-1}. For d = 1 this gives X'_t = X_t - X_{t-1}.
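The effect of the differencing operator can be checked numerically. The sketch below applies (1 - B) repeatedly to a made-up series that follows t², so one pass leaves a linear trend and two passes leave a constant. The helper name `difference` is illustrative; NumPy's `np.diff(x, n=d)` computes the same thing.

```python
import numpy as np

def difference(x, d=1):
    """Apply (1 - B)^d to x; each pass shortens the series by one value."""
    x = np.asarray(x, dtype=float)
    for _ in range(d):
        x = x[1:] - x[:-1]  # X_t - X_{t-1}
    return x

quadratic = np.array([1.0, 4.0, 9.0, 16.0, 25.0])  # t^2, a curved trend
first = difference(quadratic, d=1)   # [3, 5, 7, 9]: still trending
second = difference(quadratic, d=2)  # [2, 2, 2]: constant, trend removed
print(first, second)
```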
Moving Average (MA) Component:
The MA component accounts for the dependency between the current value of the
time series and the error terms from previous time periods. The order of the
MA component, denoted by q, determines the number of lagged error terms
included in the model.
Mathematically, the MA component is defined as:
X_t = μ + ε_t + θ₁ ε_{t-1} + θ₂ ε_{t-2} + ... + θ_q ε_{t-q}
Here, μ is the mean of the series.
ε_t, ε_{t-1}, ..., ε_{t-q} are the error terms at the current and the previous q time periods.
θ₁, θ₂, ..., θ_q are the coefficients of the lagged error terms.
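To make the MA equation concrete, the sketch below builds the series values from a given sequence of error terms. The function name `ma_series`, the input numbers, and the choice to treat errors before the start of the sample as zero are all assumptions made for the illustration.

```python
import numpy as np

def ma_series(eps, mu, theta):
    """Build X_t = mu + eps_t + theta[0]*eps_{t-1} + ... + theta[q-1]*eps_{t-q}.

    Errors before the start of `eps` are treated as zero (an assumption).
    """
    eps = np.asarray(eps, dtype=float)
    x = mu + eps  # mu + eps_t
    for j, th in enumerate(theta, start=1):
        x[j:] += th * eps[:-j]  # add theta_j * eps_{t-j} wherever lag j exists
    return x

# Hypothetical MA(1) with mu = 10 and theta_1 = 0.5
ma_values = ma_series([1.0, -1.0, 2.0], mu=10.0, theta=[0.5])
print(ma_values)  # [11.0, 9.5, 11.5]
```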
SARIMA:
SARIMA (Seasonal Auto Regressive Integrated Moving Average) is an extension of
the ARIMA model that incorporates seasonal components to capture periodic
patterns in time series data. The SARIMA model is denoted as
SARIMA (p, d, q)(P, D, Q, s)
Where:
- p: The order of the autoregressive (AR) component.
- d: The order of differencing.
- q: The order of the moving average (MA) component.
- P: The order of the seasonal autoregressive (SAR) component.
- D: The order of seasonal differencing.
- Q: The order of the seasonal moving average (SMA) component.
- s: The length of the seasonal period.
Now, let's break down the mathematical formulation of the SARIMA model:
1. Autoregressive (AR) Component:
The AR component captures the linear relationship between the current value of the
time series and its lagged values, as in the ARIMA model.
The AR component is defined using the same equation:
X_t = c + Φ₁ X_{t-1} + Φ₂ X_{t-2} + ... + Φ_p X_{t-p} + ε_t
2. Differencing:
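Non-seasonal differencing removes trends, as in the ARIMA model, and seasonal
differencing removes patterns that repeat every s periods.
The differenced series equation extends the ARIMA one with a seasonal factor:
X'_t = (1 - B)^d (1 - B^s)^D X_t
Here, (1 - B^s) X_t = X_t - X_{t-s}, the change from the same point in the previous season.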
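A minimal sketch of the combined differencing step follows. The helper name `sarima_difference` and the toy series (a linear trend plus a pattern repeating every 4 steps) are invented for the example.

```python
import numpy as np

def sarima_difference(x, d=0, D=0, s=1):
    """Apply (1 - B)^d (1 - B^s)^D to x."""
    x = np.asarray(x, dtype=float)
    for _ in range(D):
        x = x[s:] - x[:-s]  # (1 - B^s): X_t - X_{t-s}
    for _ in range(d):
        x = x[1:] - x[:-1]  # (1 - B): X_t - X_{t-1}
    return x

# Hypothetical series: slope 2 per step plus a period-4 seasonal pattern.
t = np.arange(12)
series = 2.0 * t + np.tile([5.0, 0.0, -5.0, 0.0], 3)

seasonal_diff = sarima_difference(series, D=1, s=4)  # all 8.0: pattern gone, slope*s left
both = sarima_difference(series, d=1, D=1, s=4)      # all 0.0: trend gone as well
print(seasonal_diff, both)
```

Each seasonal pass drops s values from the start of the series and each ordinary pass drops one, so the differenced series is shorter than the original.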
3. Moving Average (MA) Component:
The MA component accounts for the dependency between the current value of the
time series and the error terms from previous time periods, as in the ARIMA
model.
The MA component equation is the same:
X_t = μ + ε_t + θ₁ ε_{t-1} + θ₂ ε_{t-2} + ... + θ_q ε_{t-q}
4. Seasonal Autoregressive (SAR) Component:
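The SAR component captures the seasonal relationship between the current value
of the time series and its values at the same point in previous seasons.
The seasonal AR component is defined as:
X_t = c + Φ₁ X_{t-s} + Φ₂ X_{t-2s} + ... + Φ_P X_{t-Ps} + ε_t
The seasonal AR component considers lagged terms that are multiples of the
seasonal period s. The coefficients Φ₁, ..., Φ_P here are seasonal AR coefficients:
they are separate parameters from the non-seasonal AR coefficients, even though
the same symbol is used.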
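The seasonal lags can be illustrated with a one-step prediction that looks back s, 2s, ..., Ps periods. This is a sketch of the seasonal AR term on its own, with the error term left out. The function name, the quarterly period s = 4, and the coefficient values are assumptions for the example.

```python
def seasonal_ar_predict(history, c, seasonal_phi, s):
    """One-step seasonal AR prediction: c + sum_k seasonal_phi[k-1] * X_{t-k*s}.

    `history` must end at X_{t-1}, so history[-k*s] is X_{t-k*s}.
    """
    P = len(seasonal_phi)
    if len(history) < P * s:
        raise ValueError("need at least P*s past observations")
    return c + sum(phi_k * history[-k * s]
                   for k, phi_k in enumerate(seasonal_phi, start=1))

# Two hypothetical years of quarterly data; predict the next first quarter.
history = [10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 32.0, 42.0]
sar_prediction = seasonal_ar_predict(history, c=1.0, seasonal_phi=[0.6, 0.3], s=4)
print(sar_prediction)  # 1.0 + 0.6*12.0 + 0.3*10.0, i.e. about 11.2
```

Only the two earlier first-quarter values (12.0 and 10.0) enter the prediction. The values from the other quarters are ignored by this term.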
5. Seasonal Moving Average (SMA) Component:
The SMA component accounts for the dependency between the current value of the
time series and the error terms from previous seasonal time periods.
The seasonal MA component equation is:
X_t = μ + ε_t + θ₁ ε_{t-s} + θ₂ ε_{t-2s} + ... + θ_Q ε_{t-Qs}
The seasonal MA component considers lagged error terms that are multiples of
the seasonal period s. The coefficients θ₁, ..., θ_Q here are seasonal MA
coefficients, separate from the non-seasonal MA coefficients.
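By combining the AR, differencing, MA, SAR, and SMA components, the SARIMA
model captures both the non-seasonal and seasonal patterns in time series data.
The seasonal period s is usually known from the sampling frequency of the data
(for example, s = 12 for monthly data with a yearly cycle). The orders
(p, d, q, P, D, Q) are typically chosen by inspecting the autocorrelation function
(ACF) and partial autocorrelation function (PACF), together with information
criteria such as AIC or BIC.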
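The component equations can be collected into the usual single operator form of the model, in which the seasonal and non-seasonal parts multiply. The notation below is an assumption made to keep the two sets of coefficients apart: lowercase φ_i and θ_j stand for the non-seasonal AR and MA coefficients, uppercase Φ_k and Θ_k for the seasonal ones. The constant term is omitted for brevity.

```latex
\phi(B)\,\Phi(B^{s})\,(1 - B)^{d}\,(1 - B^{s})^{D}\, X_t
  = \theta(B)\,\Theta(B^{s})\,\varepsilon_t
\\[4pt]
\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
\Phi(B^{s}) = 1 - \Phi_1 B^{s} - \dots - \Phi_P B^{Ps}
\\[4pt]
\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}, \qquad
\Theta(B^{s}) = 1 + \Theta_1 B^{s} + \dots + \Theta_Q B^{Qs}
```

Setting P = D = Q = 0 makes the seasonal factors equal to 1 and recovers the ARIMA(p, d, q) model.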
The Holt-Winters additive model:
The Holt-Winters additive model, also known as the triple exponential smoothing
model, is used to forecast time series data with trend and seasonality.
The mathematical formula for the Holt-Winters additive model is as follows:
1. Level equation:
The level equation updates the current level estimate based on the previous level
estimate, the previous trend estimate, and the current observation with the
seasonal component from one season earlier removed.
It is represented as:
Lₜ = α(Yₜ - Sₜ₋ₛ) + (1 - α)(Lₜ₋₁ + Tₜ₋₁)
Here,
- Lₜ represents the level estimate at time t.
- α is the smoothing parameter for the level, which determines the weight given to
the current observation.
- Yₜ is the actual value of the time series at time t.
- Sₜ₋ₛ is the seasonal component at time t-s (where s is the length of the
seasonal period).
- Lₜ₋₁ represents the level estimate at the previous time period.
- Tₜ₋₁ represents the trend estimate at the previous time period.
2. Trend equation:
The trend equation updates the current trend estimate based on the change in the
level estimate and the previous trend estimate.
It is represented as:
Tₜ = β(Lₜ - Lₜ₋₁) + (1 - β)Tₜ₋₁
Here,
- Tₜ represents the trend estimate at time t.
- β is the smoothing parameter for the trend, which determines the weight given to
the most recent change in level, Lₜ - Lₜ₋₁.
- Lₜ is the level estimate at time t.
- Lₜ₋₁ represents the level estimate at the previous time period.
- Tₜ₋₁ represents the trend estimate at the previous time period.
3. Seasonal equation:
The seasonal equation updates the current seasonal component estimate based on
the current level estimate and the seasonal component from one season earlier.
It is represented as:
Sₜ = γ(Yₜ - Lₜ) + (1 - γ)Sₜ₋ₛ
Here,
- Sₜ represents the seasonal component estimate at time t.
- γ is the smoothing parameter for the seasonal component, which determines the
weight given to the seasonal effect implied by the current observation, Yₜ - Lₜ.
- Yₜ is the actual value of the time series at time t.
- Lₜ is the level estimate at time t.
- Sₜ₋ₛ represents the seasonal component estimate at the previous
seasonal period.
4. Forecast equation:
The forecast equation combines the level estimate, trend estimate, and seasonal
component to forecast future values of the time series.
It is represented as:
Fₜ₊ₖ = Lₜ + kTₜ + Sₜ₊ₖ₋ₛ
Here,
- Fₜ₊ₖ represents the forecasted value at time t+k (k periods ahead).
- Lₜ is the level estimate at time t.
- Tₜ is the trend estimate at time t.
- Sₜ₊ₖ₋ₛ represents the seasonal component estimate at time t+k-s, the most recent
estimate for the same point in the season. This form holds for k ≤ s; for longer
horizons, the latest available estimate for that point in the season is reused.
The initial values for L, T, and S need to be specified or estimated to start the
forecasting process. The smoothing parameters α, β, and γ each lie between 0 and 1:
values near 1 weight recent observations heavily, while values near 0 give smoother,
slower-changing estimates. They are commonly chosen by minimizing the one-step-ahead
forecast errors on historical data.
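The four additive equations can be run as a loop over the data. The sketch below is a plain-Python illustration, not a reference implementation. The initialisation is one simple assumption among several possible ones: the level is the mean of the first season, the trend is the average per-step change between the first two seasons, and the seasonal components are the first-season deviations from the level.

```python
def holt_winters_additive(y, s, alpha, beta, gamma, horizon):
    """Run the additive recursions over y and forecast `horizon` steps ahead."""
    if len(y) < 2 * s:
        raise ValueError("need at least two full seasons of data")
    # Assumed initial values (see the note above).
    level = sum(y[:s]) / s
    trend = (sum(y[s:2 * s]) - sum(y[:s])) / (s * s)
    seasonal = [y[i] - level for i in range(s)]  # latest S for each season slot

    for t in range(s, len(y)):
        prev_level, prev_seasonal = level, seasonal[t % s]  # L_{t-1}, S_{t-s}
        level = alpha * (y[t] - prev_seasonal) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % s] = gamma * (y[t] - level) + (1 - gamma) * prev_seasonal

    n = len(y)
    # F_{t+k} = L_t + k*T_t + latest seasonal estimate for the slot of time t+k
    return [level + k * trend + seasonal[(n - 1 + k) % s]
            for k in range(1, horizon + 1)]

# Made-up series with a pure period-3 pattern and no trend.
forecast = holt_winters_additive([10, 20, 30, 10, 20, 30], s=3,
                                 alpha=0.5, beta=0.5, gamma=0.5, horizon=3)
print(forecast)
```

On this toy series the recursions leave the level at 20 and the trend at 0, so the forecast repeats the seasonal pattern.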
The Holt-Winters multiplicative model:
The Holt-Winters multiplicative model is the variant of triple exponential
smoothing in which the seasonal component multiplies the level and trend instead
of being added to them. It suits series whose seasonal swings grow or shrink in
proportion to the level.
The mathematical formula for the Holt-Winters multiplicative model is as follows:
1. Level equation:
The level equation updates the current level estimate based on the previous level
estimate, the previous trend estimate, and the current observation divided by the
seasonal component from one season earlier.
It is represented as:
Lₜ = α(Yₜ / Sₜ₋ₛ) + (1 - α)(Lₜ₋₁ + Tₜ₋₁)
Here,
- Lₜ represents the level estimate at time t.
- α is the smoothing parameter for the level, which determines the weight given to
the current observation.
- Yₜ is the actual value of the time series at time t.
- Sₜ₋ₛ is the seasonal component at time t-s (where s is the length of the seasonal
period). In this model it is a factor around 1 rather than an additive offset.
- Lₜ₋₁ represents the level estimate at the previous time period.
- Tₜ₋₁ represents the trend estimate at the previous time period.
2. Trend equation:
The trend equation updates the current trend estimate based on the change in the
level estimate and the previous trend estimate.
It is represented as:
Tₜ = β(Lₜ - Lₜ₋₁) + (1 - β)Tₜ₋₁
Here,
- Tₜ represents the trend estimate at time t.
- β is the smoothing parameter for the trend, which determines the weight given to
the most recent change in level, Lₜ - Lₜ₋₁.
- Lₜ is the level estimate at time t.
- Lₜ₋₁ represents the level estimate at the previous time period.
- Tₜ₋₁ represents the trend estimate at the previous time period.
3. Seasonal equation:
The seasonal equation updates the current seasonal component estimate based on
the current level estimate and the seasonal component from one season earlier.
It is represented as:
Sₜ = γ(Yₜ / Lₜ) + (1 - γ)Sₜ₋ₛ
Here,
- Sₜ represents the seasonal component estimate at time t.
- γ is the smoothing parameter for the seasonal component, which determines the
weight given to the seasonal factor implied by the current observation, Yₜ / Lₜ.
- Yₜ is the actual value of the time series at time t.
- Lₜ is the level estimate at time t.
- Sₜ₋ₛ represents the seasonal component estimate at the previous seasonal
period.
4. Forecast equation:
The forecast equation combines the level estimate, trend estimate, and seasonal
component to forecast future values of the time series.
It is represented as:
Fₜ₊ₖ = (Lₜ + kTₜ) × Sₜ₊ₖ₋ₛ
Here,
- Fₜ₊ₖ represents the forecasted value at time t+k (k periods ahead).
- Lₜ is the level estimate at time t.
- Tₜ is the trend estimate at time t.
- Sₜ₊ₖ₋ₛ represents the seasonal component estimate at time t+k-s, the most recent
estimate for the same point in the season. This form holds for k ≤ s; for longer
horizons, the latest available estimate for that point in the season is reused.
As in the additive model, the initial values for L, T, and S need to be specified or
estimated to start the forecasting process. The smoothing parameters α, β, and γ
each lie between 0 and 1 and are commonly chosen by minimizing the one-step-ahead
forecast errors on historical data.
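A single update of the multiplicative equations can be worked through numerically. The sketch below is illustrative only: the function names and all input values (an observation of 120, a previous level of 100, a previous trend of 2, seasonal factors of 1.2 and 0.9, and smoothing parameters of 0.5) are invented for the example.

```python
def hw_multiplicative_step(y_t, level, trend, seasonal_prev, alpha, beta, gamma):
    """One update of level, trend, and seasonal factor given observation y_t.

    `level` and `trend` are L_{t-1} and T_{t-1}; `seasonal_prev` is S_{t-s}.
    """
    new_level = alpha * (y_t / seasonal_prev) + (1 - alpha) * (level + trend)
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    new_seasonal = gamma * (y_t / new_level) + (1 - gamma) * seasonal_prev
    return new_level, new_trend, new_seasonal

def hw_multiplicative_forecast(level, trend, seasonal_factor, k):
    """F_{t+k} = (L_t + k*T_t) * S_{t+k-s}."""
    return (level + k * trend) * seasonal_factor

new_level, new_trend, new_seasonal = hw_multiplicative_step(
    y_t=120.0, level=100.0, trend=2.0, seasonal_prev=1.2,
    alpha=0.5, beta=0.5, gamma=0.5)
# Level:    0.5 * (120 / 1.2) + 0.5 * (100 + 2)  -> about 101
# Trend:    0.5 * (101 - 100) + 0.5 * 2          -> about 1.5
# Seasonal: 0.5 * (120 / 101) + 0.5 * 1.2        -> about 1.194

# Forecast one step ahead, assuming the next season slot has factor 0.9.
forecast = hw_multiplicative_forecast(new_level, new_trend, seasonal_factor=0.9, k=1)
print(new_level, new_trend, new_seasonal, forecast)
```

Because the observation is divided by the seasonal factor, the series must not contain seasonal factors of zero; in practice this model is applied to strictly positive data.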