Mony-Sak-Presentation - Clayton

MA and ARCH Time Series Model Inference using Minimum Message Length
By: Mony Sak - 13080512
Supervisors: Assoc. Prof. David Dowe, Dr Sid Ray
Contents
1. “The Problem”
2. Time Series Concepts
3. Minimum Message Length (MML)
4. MML applied to Time Series
5. My Project
6. Results
7. Conclusion & Future Work
1. “The Problem”
 Which model fits the data best?
2. Time Series Concepts
 What is a Time Series (TS)?
 Observations over time
[Figure: observation value plotted against time]
 Some examples:
 Light curve of Beta Persei, also known as Algol or the “demon star” [1]
 Closing stock price of Apple Computer Inc. (AAPL), 1984-2005 [2]
 Global temperature difference by year [3]
 Average monthly weekday bus ridership in Iowa City, 1971-1982 [4]
2. Time Series Concepts
 Why study Time Series?
 Description
 The best method of conveying information
 Explanation
 A good model = good understanding of the underlying
process generating that data
 Prediction
 Predict future observation values
 Control
 If we can predict future values, we are able to ‘control’
the time series to our benefit
2. Time Series Concepts
 Some TS models (1 of 3)
 Autoregressive model of order p: AR(p)
 The current observation is a weighted sum of the past p observations plus a random error [5]
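The AR(p) description above can be sketched as a short simulation. This is a minimal illustration, not the code used in the project: the function name `simulate_ar`, the weights, the zero start-up values and the Gaussian error are all assumptions made for the example.

```python
import random

def simulate_ar(phi, n, sigma=1.0, seed=0):
    """Simulate n observations from a zero-mean AR(p) process.

    phi   -- list of p autoregressive weights (hypothetical values)
    sigma -- standard deviation of the Gaussian random error (assumed)
    """
    rng = random.Random(seed)
    p = len(phi)
    y = [0.0] * p  # start-up values standing in for the unseen past
    for _ in range(n):
        # weighted sum of the p most recent observations, newest first
        past = sum(phi[i] * y[-1 - i] for i in range(p))
        y.append(past + rng.gauss(0.0, sigma))
    return y[p:]  # drop the start-up values

series = simulate_ar([0.6, -0.2], n=200)  # an AR(2) series of 200 points
```

Fixing the seed keeps the sketch reproducible; in practice an initial stretch of the series is often discarded so the start-up values do not matter.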
2. Time Series Concepts
 Some TS models (2 of 3)
 Moving Average model of order q: MA(q)
 The current observation is a weighted sum of the past q error values plus a random error [5]
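As a rough illustration of the MA(q) description above, the sketch below draws the random errors first and then forms each observation from the current error plus weighted past errors. The name `simulate_ma`, the weights and the Gaussian error are assumptions for the example only.

```python
import random

def simulate_ma(theta, n, sigma=1.0, seed=0):
    """Simulate n observations from a zero-mean MA(q) process.

    theta -- list of q moving-average weights (hypothetical values)
    sigma -- standard deviation of the Gaussian random error (assumed)
    """
    rng = random.Random(seed)
    q = len(theta)
    # q extra errors so the first observation already has a full past
    errors = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    y = []
    for t in range(q, n + q):
        # weighted sum of the q most recent past errors, newest first
        past = sum(theta[j] * errors[t - 1 - j] for j in range(q))
        y.append(errors[t] + past)
    return y

series = simulate_ma([0.5, 0.3], n=200)  # an MA(2) series of 200 points
```

With an empty weight list the function simply returns the errors themselves, which is the MA(0), pure-noise case.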
2. Time Series Concepts
 Some TS models (3 of 3)
 Autoregressive Conditional Heteroskedastic model of order q: ARCH(q)
 The current conditional variance is a constant plus a weighted sum of the past q squared error values [5]
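The ARCH(q) idea can be sketched in the same way. The example below assumes the standard formulation in which the variance also has a constant term (`alpha0` here); the function name, parameter values and Gaussian shocks are illustrative assumptions, not the project's own code.

```python
import math
import random

def simulate_arch(alpha0, alpha, n, seed=0):
    """Simulate n errors from an ARCH(q) process.

    alpha0 -- constant term of the variance equation (must be > 0)
    alpha  -- list of q non-negative weights on past squared errors
    Returns (errors, variances), two lists of length n.
    """
    rng = random.Random(seed)
    q = len(alpha)
    errors = [0.0] * q  # start-up errors standing in for the unseen past
    variances = []
    for _ in range(n):
        # variance depends on the q most recent squared errors, newest first
        var = alpha0 + sum(alpha[i] * errors[-1 - i] ** 2 for i in range(q))
        variances.append(var)
        errors.append(math.sqrt(var) * rng.gauss(0.0, 1.0))
    return errors[q:], variances

eps, var = simulate_arch(0.5, [0.3], n=200)  # an ARCH(1) series
```

Large errors raise the next variance, which is what produces the volatility clustering that ARCH models are meant to capture.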
2. Time Series Concepts
 One data set… and many, many candidate models
2. Time Series Concepts
 Partial solution to “The Problem”
 The Model Selection Criterion (MSC)
 A formula, based on parsimony, that trades goodness of fit against model complexity
 Scores competing models objectively
[Figure: a criterion assigns each candidate model a score, e.g. 101.21 and 99.90]
2. Time Series Concepts
 Some popular Model Selection Criteria:
 Akaike’s Information Criterion (AIC) [6]
 Bayesian Information Criterion (BIC) [7]
 …many more, including HQ [8], RCL [9], MML [10]
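For the two most familiar criteria in this list, the usual textbook definitions are short enough to write out. In the sketch below the log-likelihoods and parameter counts are made-up numbers chosen only to show how the scores are compared; lower is better for both criteria.

```python
import math

def aic(log_likelihood, k):
    """Akaike's Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical fits to one data set of n = 100 observations:
# name -> (maximised log-likelihood, number of free parameters)
candidates = {"MA(1)": (-142.0, 2), "MA(2)": (-141.5, 3)}
n = 100

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n))
```

Here the extra parameter of the larger model improves the likelihood too little to pay for itself, so both criteria prefer the smaller model.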
3. Minimum Message Length
 What it is & History
 An information-theoretic criterion for model selection and point estimation
 Developed here at Monash University by Wallace & Boulton in 1968 [11]
 Has been applied to mixture modelling (“snob”), decision tree/graph induction, generalized Bayesian networks, and more…
3. Minimum Message Length
 Theory
 A “message” can be encoded in 2 parts:
 Part 1: Model,
 Part 2: Data (given the Model in Part 1)
 Combined Message Length = Part 1 + Part 2
 We choose the model that yields the
smallest Combined Message Length
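A toy coin-flip example may make the two-part idea concrete. It is not a time series model and not the MML87 approximation; the data, the 1-bit cost for naming the model class and the 4-bit parameter grid are all assumptions chosen to keep the arithmetic visible.

```python
import math

def data_bits(heads, tails, p):
    """Part 2: bits needed to encode the flips given head probability p."""
    return -(heads * math.log2(p) + tails * math.log2(1.0 - p))

heads, tails = 80, 20  # made-up data: 100 coin flips

# Model "fair": no parameter to state; 1 bit names the model class.
fair_part1 = 1.0
fair_part2 = data_bits(heads, tails, 0.5)

# Model "biased": 1 bit for the model class plus 4 bits to state p
# as one of the grid values k/16 (k = 1..15).
grid = [k / 16 for k in range(1, 16)]
p_hat = min(grid, key=lambda p: data_bits(heads, tails, p))
biased_part1 = 1.0 + 4.0
biased_part2 = data_bits(heads, tails, p_hat)

totals = {
    "fair": fair_part1 + fair_part2,
    "biased": biased_part1 + biased_part2,
}
best = min(totals, key=totals.get)  # model with the shortest combined message
```

The biased model costs more to state, but it shortens the data part by far more than that, so its combined message is shorter. With data closer to 50/50 the saving would vanish and the fair model would win.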
3. Minimum Message Length
 Theory (example)
[Figure: two-part messages for four candidate models; each message consists of a model part followed by a data|model part]
3. Minimum Message Length
 MML87 Approximation:
 Developed by Wallace & Freeman in 1987 [13]
 Part 1 (model):
 Part 2 (data|model):
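The equations for the two parts did not survive on this slide. For reference, the Wallace-Freeman approximation is commonly written as below for a model with k parameters θ, prior density h(θ), likelihood f(x|θ), Fisher information matrix F(θ) and lattice constant κ_k (κ_1 = 1/12). The symbol names follow the usual convention and are an assumption about the notation the slide used.

```latex
\begin{align*}
\text{Part 1 (model):} \quad
  & -\log h(\theta) + \tfrac{1}{2}\log\lvert F(\theta)\rvert + \tfrac{k}{2}\log\kappa_k \\
\text{Part 2 (data}\mid\text{model):} \quad
  & -\log f(x \mid \theta) + \tfrac{k}{2} \\
\text{MessLen}(\theta, x) \;=\;
  & -\log h(\theta) + \tfrac{1}{2}\log\lvert F(\theta)\rvert
    - \log f(x \mid \theta) + \tfrac{k}{2}\left(1 + \log\kappa_k\right)
\end{align*}
```

The first part grows with the precision to which the parameters are stated, and the second part is the negative log-likelihood plus the expected cost of rounding the parameters to that precision.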
4. MML87-based MSC
 Past Research
 MML87-based MSC for:
 AR model inference [10]
 Stock market simulation of AR traders [14]
 ARMAX models [15]
 …Results:
 MML does very well when compared with the other Model Selection Criteria
4. MML-based MSC
 Results from Fitzgibbon, Dowe, Vahid (2004) [10]
 Motivation for my project
 How well do MML-based MSCs perform with other models?
5. My Project
 How well does an MML-based MSC perform with:
 Moving Average (MA) models?
 Autoregressive Conditional Heteroskedastic (ARCH) models?
 We need to derive two MSCs, one for each model
 MA is a conditional mean model, whereas ARCH is a conditional variance model, so the two derivations are quite different
 The Fisher Information matrix involves complex math, so we resort to approximations
5. My Project
 MML87 equation we will be using
6. Results
 Results (simulations)
 Moving Average (MA) models
(Results from Sak, Dowe, Ray (2005) [16]. Accepted for inclusion in the proceedings of Advanced Computing in Financial Markets ’05, Istanbul, Turkey, Dec 15-17, 2005.)
6. Results (MA simulations)
6. Results (ARCH simulations)
7. Conclusion & Future Work
 Conclusion
 MML-based MSC for MA models performs very well
 MML-based MSC for ARCH models….
 Future Work
 Try other MML approximations such as MMLD [17]
 Other Time Series models: Generalized ARCH (GARCH) [18], Generalized/Indexed AR (GAR) [19]
 Other parameter estimation methods: Maximum Likelihood Estimation (MLE) is very, very slow!
Thanks!
References (1 of 2)
1. J. Stebbins. The measurement of the light of stars with a selenium photometer, with an application to variations of Algol. The Astrophysical Journal, 32(3):185-214, 1910.
2. Data obtained from http://finance.yahoo.com/q?s=aapl
3. Data obtained from http://www.elmhurst.edu/~chm/vchembook/globalwarmA.html
4. R.J. Hyndman. (n.d.) Time Series Data Library, http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. Accessed on 24 Oct. 2005.
5. J.D. Hamilton. Time Series Analysis. Princeton University Press, 1994.
6. H. Akaike. Information theory and an extension of the Maximum Likelihood principle. In B.N. Petrov and F. Csaki (eds.), Second International Symposium on Information Theory, pages 267-281. Akademiai Kiado, Budapest, 1973.
7. G. Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461-464, 1978.
8. E.J. Hannan and B.G. Quinn. The determination of the order of an autoregression. Journal of the Royal Statistical Society, Series B (Methodological), 41(2):190-195, 1979.
9. H. Mitchell and D.M. McKenzie. GARCH model selection criteria. Quantitative Finance, 3:262-284, 2003.
10. L.J. Fitzgibbon, D.L. Dowe, and F. Vahid. Minimum Message Length Autoregressive Model Order Selection. In M. Palanaswami, C. Chandra Sekhar, G. Kumar Venayagamoorthy, S. Mohan and M.K. Ghantasala (eds.), International Conference on Intelligent Sensing and Information Processing (ICISIP), pages 439-444, Chennai, India, 4-7 January 2004 (ISBN: 0-7803-8243-9, IEEE Catalogue Number: 04EX783). www.csse.monash.edu.au/~dld/Publications/2004/Fitzgibbon+Dowe+Vahid2004.ref.
Want a copy of these slides? Send requests to monys@csse.monash.edu.au
References (2 of 2)
11. C.S. Wallace and D.M. Boulton. An information measure for classification. Computer Journal, 11(2):185-194, 1968.
12. L.J. Fitzgibbon. Message from Monte Carlo: A Framework for Minimum Message Length Inference using Markov Chain Monte Carlo Methods. PhD thesis, Monash University, Clayton Campus, Wellington Rd, Clayton, Victoria 3800, Australia, 2004.
13. C.S. Wallace and P.R. Freeman. Estimation and inference by compact encoding. Journal of the Royal Statistical Society, Series B (Methodological), 49(3):240-265, 1987.
14. M.J. Collie, D.L. Dowe, and L.J. Fitzgibbon. Stock market simulation and inference technique, 2005. Accepted for inclusion in proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS’05), Rio de Janeiro, Brazil, November 6-9, 2005.
15. [ Schmidt ]
16. M. Sak, D.L. Dowe, and S. Ray. Minimum Message Length Moving Average Time Series Data Mining. In Computational Intelligence: Methods and Applications. First International ICSC Symposium on Advanced Computing in Financial Markets (ACFM2005), 2005. Accepted for inclusion in proceedings of Advanced Computing in Financial Markets (ACFM2005), Istanbul, Turkey, Dec. 15-17, 2005.
17. E. Lam. Improved Approximations in MML. Honours Thesis, School of Computer Science and Software Engineering (CSSE), Monash University, Clayton 3168, Australia, 2000.
18. T. Bollerslev. Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31:307-327, 1986.
19. M.S. Peiris. Improving the Quality of Forecasting using Generalized AR Models: An Application to Statistical Quality Control. Statistical Methods, 5(2):156-171, 2003.
Negative Log Likelihood
 Takes into account the estimated variance
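One way to read this slide is as the Gaussian negative log-likelihood with the error variance estimated from the residuals. The sketch below assumes independent Gaussian errors with a single constant variance, which suits a fitted MA model but not ARCH, where the variance changes over time; the function name and inputs are hypothetical.

```python
import math

def gaussian_nll(residuals, variance=None):
    """Negative log-likelihood of residuals under a zero-mean Gaussian.

    If no variance is given, the maximum-likelihood estimate
    (mean squared residual) is used.
    """
    n = len(residuals)
    rss = sum(e * e for e in residuals)  # residual sum of squares
    if variance is None:
        variance = rss / n
    return 0.5 * n * math.log(2.0 * math.pi * variance) + rss / (2.0 * variance)

score = gaussian_nll([0.4, -1.1, 0.3, 0.9, -0.5])  # made-up residuals
```

A lower value means the fitted model assigns higher probability to the data, so this score rewards a model whose estimated variance matches the spread of its own residuals.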
6. Results
 Empirical Comparison
1. Simulate data sets from 200 models for each model order (i.e. MA(1) - MA(8)), for a total of 1,600 MA data sets
2. Estimate model parameters using Maximum Likelihood Estimation (MLE)
3. Pass the same 1,600 data sets, with the parameter estimates for each data set, to every Model Selection Criterion (MSC), and let each criterion choose the model it judges best represents the data
4. Assess each criterion on correct model order selection accuracy and negative log likelihood
5. Repeat the experiment for ARCH models (again 1,600 data sets)
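The accuracy measure used in the assessment step can be sketched as a simple tally. The function name and the example orders below are made up for illustration; they are not results from the experiment.

```python
def selection_accuracy(true_orders, chosen_orders):
    """Fraction of data sets for which a criterion picked the true order."""
    if len(true_orders) != len(chosen_orders):
        raise ValueError("one chosen order is needed per data set")
    hits = sum(1 for t, c in zip(true_orders, chosen_orders) if t == c)
    return hits / len(true_orders)

# Hypothetical outcome for four data sets scored by one criterion
true_orders = [1, 2, 3, 4]
chosen_orders = [1, 2, 2, 4]
accuracy = selection_accuracy(true_orders, chosen_orders)
```

Running the same tally once per criterion over the same data sets gives directly comparable accuracy figures.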