Mony-Sak-Presentation - Clayton

MA and ARCH Time Series Model Inference using Minimum Message Length
Mony Sak - 13080512
Assoc. Prof. David Dowe, Dr Sid Ray
“The Problem”
Time Series Concepts
Minimum Message Length (MML)
MML applied to Time Series
My Project
Conclusion & Future Work
1. “The Problem”
 Which model fits the data best?
2. Time Series Concepts
 What is a Time Series (TS)?
 Observations over time
 Some examples (1 of 4):
Light curve of Beta Persei, also known as Algol or the “demon star” [1]
 Some examples (2 of 4):
Closing stock price of Apple Computer Inc. (AAPL), 1984-2005 [2]
 Some examples (3 of 4):
Global temperature difference vs. year [3]
 Some examples (4 of 4):
Average monthly bus ridership (weekdays) in Iowa City, 1971-1982 [4]
 Why study Time Series?
 Description
 The best method of conveying information
 Explanation
 A good model = good understanding of the underlying
process generating that data
 Prediction
 Predict future observation values
 Control
 If we can predict future values, we are able to ‘control’
the time series to our benefit
 Some TS models (1 of 3)
 Autoregressive, order p = AR(p)
 Current observation value is a weighted sum of past observation values + random error [5]
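As a rough illustration of the AR(p) description above, the following sketch simulates such a series with the Python standard library. The function name `simulate_ar`, the Gaussian errors, the zero start-up values, and the burn-in length are assumptions made for this example, not details taken from the slides.

```python
import random


def simulate_ar(phi, n, sigma=1.0, burn_in=200, seed=0):
    """Simulate n observations from an AR(p) process (illustrative sketch).

    y[t] = phi[0]*y[t-1] + ... + phi[p-1]*y[t-p] + e[t],  e[t] ~ N(0, sigma^2)
    """
    rng = random.Random(seed)
    p = len(phi)
    y = [0.0] * p  # assumed zero start-up values, discarded with the burn-in
    for _ in range(n + burn_in):
        # Weighted sum of the p most recent observations.
        past = sum(phi[i] * y[-1 - i] for i in range(p))
        y.append(past + rng.gauss(0.0, sigma))
    return y[-n:]


series = simulate_ar([0.5, -0.3], n=100)
```

The coefficients `[0.5, -0.3]` are arbitrary; the sketch does not check that they define a stationary process.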
 Some TS models (2 of 3)
 Moving Average, order q = MA(q)
 Current observation value is a weighted sum of past error values + random error [5]
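The MA(q) description above can be sketched in the same way. The name `simulate_ma`, the Gaussian errors, and the plus-sign convention on the coefficients are assumptions for illustration (some texts write the MA terms with minus signs).

```python
import random


def simulate_ma(theta, n, sigma=1.0, seed=0):
    """Simulate n observations from an MA(q) process (illustrative sketch).

    y[t] = e[t] + theta[0]*e[t-1] + ... + theta[q-1]*e[t-q],  e[t] ~ N(0, sigma^2)
    """
    rng = random.Random(seed)
    q = len(theta)
    # Draw q extra errors so the first observation has a full error history.
    e = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    return [
        e[t] + sum(theta[i] * e[t - 1 - i] for i in range(q))
        for t in range(q, n + q)
    ]


series = simulate_ma([0.6, 0.2], n=100)
```

Unlike the AR case, each observation depends only on the last q errors, so no burn-in is needed.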
 Some TS models (3 of 3)
 Autoregressive Conditional Heteroskedastic, order q = ARCH(q)
 Current conditional variance is a constant plus a weighted sum of past squared errors
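A minimal sketch of an ARCH(q) error process follows, assuming the common textbook form with a constant term `alpha0` and standard normal shocks. The name `simulate_arch`, the zero start-up errors, and the burn-in length are hypothetical choices for this example.

```python
import math
import random


def simulate_arch(alpha0, alpha, n, burn_in=200, seed=0):
    """Simulate n errors from an ARCH(q) process (illustrative sketch).

    h[t] = alpha0 + alpha[0]*e[t-1]^2 + ... + alpha[q-1]*e[t-q]^2
    e[t] = sqrt(h[t]) * z[t],  z[t] ~ N(0, 1)
    Assumes alpha0 > 0 and all alpha[i] >= 0 so the variance stays positive.
    """
    rng = random.Random(seed)
    q = len(alpha)
    e = [0.0] * q  # assumed zero start-up errors, discarded with the burn-in
    h_all = []
    for _ in range(n + burn_in):
        # Conditional variance from the q most recent squared errors.
        h = alpha0 + sum(alpha[i] * e[-1 - i] ** 2 for i in range(q))
        e.append(math.sqrt(h) * rng.gauss(0.0, 1.0))
        h_all.append(h)
    return e[-n:], h_all[-n:]


errors, variances = simulate_arch(0.5, [0.3, 0.2], n=100)
```

Here the mean of the series is not modelled at all; only the variance changes over time, which is what separates ARCH from the AR and MA models.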
 One set of data… and many, many candidate models
 Partial solution to “The Problem”
 The Model Selection Criterion (MSC)
 An equation, based on parsimony
 Objective scoring of different models
 Some popular Model Selection Criteria:
 Akaike’s Information Criterion (AIC) [6]
 Bayesian Information Criterion (BIC) [7]
 …many more incl. HQ [8], RCL [9], MML [10]
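To make the idea of a parsimony-based score concrete, here is a sketch of AIC and BIC in their usual textbook forms, written in terms of the negative log-likelihood. The helper `select_order` and the candidate values are made up for illustration and are not results from the project.

```python
import math


def aic(neg_log_lik, k):
    """AIC = 2k - 2 ln L, with neg_log_lik = -ln L and k free parameters."""
    return 2.0 * neg_log_lik + 2.0 * k


def bic(neg_log_lik, k, n):
    """BIC = k ln n - 2 ln L, with n the number of observations."""
    return 2.0 * neg_log_lik + k * math.log(n)


def select_order(neg_log_liks, n, criterion):
    """Return the order with the lowest score (hypothetical helper).

    neg_log_liks maps model order -> negative log-likelihood of the fit.
    Each model is assumed to have order + 1 parameters (coefficients + variance).
    """
    if criterion == "aic":
        scores = {q: aic(nll, q + 1) for q, nll in neg_log_liks.items()}
    else:
        scores = {q: bic(nll, q + 1, n) for q, nll in neg_log_liks.items()}
    return min(scores, key=scores.get)


# Made-up fits: the likelihood improves with order, but by less each time.
example_fits = {1: 152.0, 2: 148.5, 3: 148.1}
chosen_by_aic = select_order(example_fits, n=100, criterion="aic")
chosen_by_bic = select_order(example_fits, n=100, criterion="bic")
```

Both criteria add a penalty that grows with the number of parameters, so a higher-order model is only preferred when its fit improves enough to pay for the extra parameters.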
3. Minimum Message Length
 What it is & History
 Information-theoretic criterion for model
selection and point estimation
 Developed here at Monash University by Wallace & Boulton in 1968 [11]
 Has been applied to mixture modelling (“snob”),
decision tree/graph induction, generalized
Bayesian networks, and more…
 Theory
 A “message” can be encoded in 2 parts:
 Part 1: Model,
 Part 2: Data (given the Model in Part 1)
 Combined Message Length = Part 1 + Part 2
 We choose the model that yields the
smallest Combined Message Length
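The two-part selection rule above can be sketched in a few lines. The function name and the message lengths are invented for illustration; in practice each part would come from an actual coding scheme or approximation.

```python
def best_by_message_length(candidates):
    """Pick the candidate with the smallest two-part message length.

    candidates maps a model name to (part1, part2), where part1 is the length
    of the model statement and part2 the length of the data given that model,
    both in the same units (e.g. nits or bits).
    """
    totals = {name: part1 + part2 for name, (part1, part2) in candidates.items()}
    best = min(totals, key=totals.get)
    return best, totals


# Made-up lengths: a simple model is cheap to state but fits poorly,
# a complex model fits well but is expensive to state.
example = {
    "simple": (5.0, 120.0),
    "medium": (12.0, 100.0),
    "complex": (30.0, 95.0),
}
best, totals = best_by_message_length(example)
```

With these numbers the "medium" model has the shortest total, which shows how the first part of the message acts as a built-in penalty on model complexity.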
 Theory (example)
[Figure: four candidate models, each drawn as a two-part message: model n followed by data|model n, for n = 1 to 4]
 MML87 Approximation:
 Developed by Wallace & Freeman in 1987 [13]
 Part 1 (model):
 Part 2 (data|model):
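The equations for the two parts did not survive in this text. For reference, the Wallace-Freeman approximation is usually quoted in the following form; the notation is an assumption and may differ from the symbols on the original slide. Here \theta is the parameter vector with k components, h(\theta) the prior density, F(\theta) the expected Fisher information matrix, f(x \mid \theta) the likelihood, and \kappa_k a lattice quantisation constant.

```latex
\begin{align*}
\text{Part 1 (model):} \quad
  & -\log h(\theta) + \tfrac{1}{2}\log\lvert F(\theta)\rvert + \tfrac{k}{2}\log\kappa_k \\
\text{Part 2 (data}\mid\text{model):} \quad
  & -\log f(x \mid \theta) + \tfrac{k}{2} \\
\text{Message length:} \quad
  & I(x,\theta) = -\log h(\theta) + \tfrac{1}{2}\log\lvert F(\theta)\rvert
    - \log f(x \mid \theta) + \tfrac{k}{2}\left(1 + \log\kappa_k\right)
\end{align*}
```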
4. MML87-based MSC
 Past Research
 MML87-based MSC for:
 AR model inference [10]
 Stock market simulation of AR traders [14]
 ARMAX models [15]
 …Results:
 MML does very well when compared to the
other Model Selection Criteria
4. MML-based MSC
 Results from Fitzgibbon, Dowe, Vahid (2004) [10]
 Motivation for my project
 How well do MML-based MSCs perform with other models?
5. My Project
 How well does an MML-based MSC perform with:
 Moving Average (MA) models?
 Autoregressive Conditional Heteroskedastic
(ARCH) models?
 We need to derive 2 MSCs, 1 for each model
 MA is a conditional mean model, whereas ARCH is
a conditional variance model - quite different
 The Fisher information matrix is mathematically complex for these models, so we resort to approximations
 MML87 equation we will be using
6. Results
 Results (simulations)
 Moving Average (MA) models
(Results from Sak, Dowe, Ray (2005). Accepted for inclusion in proceedings of Advanced Computing in Financial Markets ‘05. Istanbul, Turkey. Dec 15-17, 2005.) [16]
6. Results (MA simulations)
6. Results (ARCH simulations)
7. Conclusion & Future Work
 Conclusion
 MML-based MSC for MA models performs very well
 MML-based MSC for ARCH models….
 Future Work
 Try other MML approximations such as MMLD [17]
 Other Time Series models: Generalized ARCH (GARCH) [18], Generalized/Indexed AR (GAR) [18]
 Other parameter estimation methods: Maximum Likelihood Estimation (MLE) is very slow
Negative Log Likelihood
 Takes into account the estimated variance
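As a sketch of why the negative log-likelihood reflects the estimated variance, the following assumes independent Gaussian errors with zero mean. The function name and the residual values are illustrative, not taken from the project.

```python
import math


def gaussian_nll(residuals, variance):
    """Negative log-likelihood of residuals under N(0, variance).

    -ln L = (n/2) * ln(2*pi*variance) + sum(r^2) / (2*variance)
    """
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    return 0.5 * n * math.log(2.0 * math.pi * variance) + sse / (2.0 * variance)


# Made-up residuals; the maximum likelihood variance estimate is sse / n.
residuals = [1.0, -1.0, 2.0, -2.0]
mle_variance = sum(r * r for r in residuals) / len(residuals)
nll_at_mle = gaussian_nll(residuals, mle_variance)
```

A model is penalised both for large residuals and for a variance estimate that does not match them: for the same residuals, a variance that is too small or too large gives a larger negative log-likelihood than the maximum likelihood estimate.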
6. Results
Empirical Comparison
Simulate data sets for 200 models for each model order (i.e.
MA(1) - MA(8)) for a total of 1,600 MA data sets
Estimate model parameters using Maximum Likelihood (MLE)
Pass to each Model Selection Criterion (MSC) the same 1,600
data sets and parameter estimates (for each data set), and let
them choose the model they think best represents the data
Assessment is on correct model order selection accuracy and
negative log likelihood
Repeat experiment for ARCH models (again 1,600 data sets)
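The procedure above can be outlined as a small harness. This is only a sketch of the experimental loop: `simulate`, `fit`, and the entries of `criteria` are hypothetical callables supplied by the caller, and the stand-ins used below are toy stubs rather than real MA or ARCH code.

```python
def run_comparison(orders, sets_per_order, simulate, fit, criteria):
    """Return {criterion name: fraction of data sets with the true order chosen}.

    simulate(order, rep) -> data set generated from a model of that order
    fit(data, order)     -> fitted result (e.g. estimates and log-likelihood)
    criteria[name](fitted, order, n) -> score, where lower is better
    """
    hits = {name: 0 for name in criteria}
    total = 0
    for true_order in orders:
        for rep in range(sets_per_order):
            data = simulate(true_order, rep)
            # Fit each candidate order once so every criterion sees the same
            # data set and the same parameter estimates.
            fits = {q: fit(data, q) for q in orders}
            for name, score in criteria.items():
                chosen = min(orders, key=lambda q: score(fits[q], q, len(data)))
                hits[name] += int(chosen == true_order)
            total += 1
    return {name: hits[name] / total for name in criteria}


# Toy stubs only: the "data" records its true order, and the "fit" is the
# distance between the candidate order and that true order.
toy_accuracy = run_comparison(
    orders=range(1, 9),
    sets_per_order=2,
    simulate=lambda order, rep: [order] * 10,
    fit=lambda data, q: abs(q - data[0]),
    criteria={
        "oracle": lambda fitted, q, n: fitted,    # always finds the true order
        "always_smallest": lambda fitted, q, n: q,  # always picks order 1
    },
)
```

With orders 1 to 8 and 200 data sets per order, the loop would cover the 1,600 data sets described above; the toy call uses 2 per order to stay small.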