MA and ARCH Time Series model inference using Minimum Message Length
By: Mony Sak - 13080512
Supervisors: Assoc. Prof. David Dowe, Dr Sid Ray

Contents
1. “The Problem”
2. Time Series Concepts
3. Minimum Message Length (MML)
4. MML applied to Time Series
5. My Project
6. Results
7. Conclusion & Future Work

1. “The Problem”
Which model fits the data best?
  [Figure: candidate models, each marked “?”]

2. Time Series Concepts
What is a Time Series (TS)?
  Observations over time
  [Figure: observation value plotted against time]
  Some examples:
  1 of 4: Light curve of Beta Persei, also known as Algol or the “demon star” [1]
  2 of 4: Closing stock price of Apple Computer Inc. (AAPL), 1984-2005 [2]
  3 of 4: Global temperature difference vs. year [3]
  4 of 4: Average monthly bus ridership (weekdays) in Iowa City, 1971-1982 [4]

Why study Time Series?
  Description: the best method of conveying information
  Explanation: a good model = a good understanding of the underlying process generating the data
  Prediction: predict future observation values
  Control: if we can predict future values, we are able to “control” the time series to our benefit

Some TS models
  1 of 3: Autoregressive, order p = AR(p)
    The current observation value is a weighted sum of the past p observation values plus a random error [5]
  2 of 3: Moving Average, order q = MA(q)
    The current observation value is a weighted sum of the past q error values plus a random error [5]
  3 of 3: Autoregressive Conditional Heteroskedastic, order q = ARCH(q)
    The current conditional variance is a constant plus a weighted sum of the past q squared error values [5]

1 set of data… and many, many models
  [Figure: candidate models, each marked “?”]

Partial solution to “The Problem”: the Model Selection Criterion (MSC)
  An equation, based on parsimony
  Objective scoring of different models
  [Figure: a criterion assigning the scores 101.21 and 99.90 to two candidate models]

Some popular Model Selection Criteria:
  Akaike’s Information Criterion (AIC) [6]
  Bayesian Information Criterion (BIC) [7]
  …many more, incl. HQ [8], RCL [9], MML [10]

3. Minimum Message Length
What it is & History
  Information-theoretic criterion for model selection and point estimation
  Developed here at Monash University by Wallace & Boulton in 1968 [11]
  Has been applied to mixture modelling (“Snob”), decision tree/graph induction, generalized Bayesian networks, and more…

Theory
  A “message” can be encoded in 2 parts:
    Part 1: Model
    Part 2: Data (given the Model in Part 1)
  Combined Message Length = Part 1 + Part 2
  We choose the model that yields the smallest Combined Message Length

Theory (example)
  [Figure: two-part messages (model k, data|model k) for four candidate models, k = 1 to 4]

MML87 Approximation
  Developed by Wallace & Freeman in 1987 [13]
  Part 1 (model): [equation not recovered]
  Part 2 (data|model): [equation not recovered]

4. MML87-based MSC
Past Research
  MML87-based MSC for:
    AR model inference [10]
    Stock market simulation of AR traders [14]
    ARMAX models [15]
  …Results: MML does very well when compared to the other Model Selection Criteria

Results from Fitzgibbon, Dowe, Vahid (2004) [10]
  [Figure not recovered]

Motivation for my project
  How well do MML-based MSCs perform with other models?

5. My Project
How well does an MML-based MSC perform with:
  Moving Average (MA) models?
  Autoregressive Conditional Heteroskedastic (ARCH) models?
We need to derive 2 MSCs, 1 for each model
  MA is a conditional mean model, whereas ARCH is a conditional variance model: quite different
  The Fisher Information matrix involves complex math, so we resort to approximations

MML87 equation we will be using
  [equation not recovered]

6. Results
Results (simulations)
  Moving Average (MA) models
  (Results from Sak, Dowe, Ray (2005). Accepted for inclusion in proceedings of Advanced Computing in Financial Markets ’05, Istanbul, Turkey, Dec. 15-17, 2005.) [16]

Results (MA simulations)
  [Six slides of result plots; figures not recovered]

Results (ARCH simulations)
  [Figure not recovered]

7. Conclusion & Future Work
Conclusion
  MML-based MSC for MA models performs very well
  MML-based MSC for ARCH models….
Future Work
  Try other MML approximations such as MMLD [17]
  Other Time Series models: Generalized ARCH (GARCH) [18], Generalized/Indexed AR (GAR) [19]
  Other parameter estimation methods: Maximum Likelihood Estimation (MLE) is very slow!

Thanks!

References (1 of 2)
1. J. Stebbins. The measurement of the light of stars with a selenium photometer, with an application to variations of Algol. The Astrophysical Journal, 32(3):185-214, 1910.
2. Data obtained from http://finance.yahoo.com/q?s=aapl
3. Data obtained from http://www.elmhurst.edu/~chm/vchembook/globalwarmA.html
4. R.J. Hyndman. Time Series Data Library (n.d.), http://wwwpersonal.buseco.monash.edu.au/~hyndman/TSDL/. Accessed on 24 Oct. 2005.
5. J.D. Hamilton. Time Series Analysis. Princeton University Press, 1994.
6. H. Akaike. Information theory and an extension of the maximum likelihood principle. In B.N. Petrov and F. Csaki (eds.), Second International Symposium on Information Theory, pages 267-281. Akademiai Kiado, Budapest, 1973.
7. G. Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461-464, 1978.
8. E.J. Hannan and B.G. Quinn. The determination of the order of an autoregression. Journal of the Royal Statistical Society, Series B (Methodological), 41(2):190-195, 1979.
9. H. Mitchell and D.M. McKenzie. GARCH model selection criteria. Quantitative Finance, 3:262-284, 2003.
10. L.J. Fitzgibbon, D.L. Dowe, and F. Vahid. Minimum Message Length Autoregressive Model Order Selection. In M. Palanaswami, C. Chandra Sekhar, G. Kumar Venayagamoorthy, S. Mohan and M.K. Ghantasala (eds.), International Conference on Intelligent Sensing and Information Processing (ICISIP), pages 439-444, Chennai, India, 4-7 January 2004 (ISBN: 0-7803-8243-9, IEEE Catalogue Number: 04EX783), www.csse.monash.edu.au/~dld/Publications/2004/Fitzgibbon+Dowe+Vahid2004.ref.

Want a copy of these slides? Send requests to monys@csse.monash.edu.au

References (2 of 2)
11. C.S. Wallace and D.M. Boulton. An information measure for classification. Computer Journal, 11(2):185-194, 1968.
12. L.J. Fitzgibbon. Message from Monte Carlo: A Framework for Minimum Message Length Inference using Markov Chain Monte Carlo Methods. PhD thesis, Monash University, Clayton Campus, Wellington Rd, Clayton, Victoria 3800, Australia, 2004.
13. C.S. Wallace and P.R. Freeman. Estimation and inference by compact encoding. Journal of the Royal Statistical Society, Series B (Methodological), 49(3):240-265, 1987.
14. M.J. Collie, D.L. Dowe, and L.J. Fitzgibbon. Stock market simulation and inference technique, 2005. Accepted for inclusion in proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS’05), Rio de Janeiro, Brazil, November 6-9, 2005.
15. [ Schmidt ]
16. M. Sak, D.L. Dowe, and S. Ray. Minimum Message Length Moving Average Time Series Data Mining. In Computational Intelligence: Methods and Applications. First International ICSC Symposium on Advanced Computing in Financial Markets (ACFM2005), 2005. Accepted for inclusion in proceedings of Advanced Computing in Financial Markets (ACFM2005), Istanbul, Turkey, Dec. 15-17, 2005.
17. E. Lam. Improved Approximations in MML. Honours thesis, School of Computer Science and Software Engineering (CSSE), Monash University, Clayton 3168, Australia, 2000.
18. T. Bollerslev. Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31:307-327, 1986.
19. M.S. Peiris. Improving the Quality of Forecasting using Generalized AR Models: An Application to Statistical Quality Control. Statistical Methods, 5(2):156-171, 2003.

Negative Log Likelihood
  Takes into account the estimated variance

6. Results
Empirical Comparison
1. Simulate data sets for 200 models for each model order (i.e. MA(1) to MA(8)), for a total of 1,600 MA data sets
2. Estimate the model parameters using Maximum Likelihood Estimation (MLE)
3. Pass each Model Selection Criterion (MSC) the same 1,600 data sets and parameter estimates (for each data set), and let each criterion choose the model it thinks best represents the data
4. Assess each criterion on correct model order selection accuracy and negative log likelihood
5. Repeat the experiment for ARCH models (again 1,600 data sets)
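
The comparison steps above can be sketched in a few lines of Python. This is only an illustrative, scaled-down version under several assumptions that are not in the slides: a crude coordinate search on the conditional sum of squares stands in for full MLE, AIC and BIC stand in for the criteria (the MML87 criterion is not reproduced here, since the slides do not give its prior or Fisher Information), and all function names (simulate_ma, css, fit_ma, neg_log_lik, run_experiment) and default settings are hypothetical.

```python
import math
import random


def simulate_ma(theta, n, rng):
    """Draw n values of x_t = e_t + sum_j theta[j] * e_{t-1-j}, e_t ~ N(0, 1)."""
    q = len(theta)
    e = [rng.gauss(0.0, 1.0) for _ in range(n + q)]
    return [e[t] + sum(theta[j] * e[t - 1 - j] for j in range(q))
            for t in range(q, n + q)]


def css(theta, x):
    """Conditional sum of squared residuals (pre-sample errors taken as 0)."""
    q, e, total = len(theta), [], 0.0
    for t, xt in enumerate(x):
        et = xt - sum(theta[j] * e[t - 1 - j] for j in range(min(q, t)))
        e.append(et)
        total += et * et
    return total if math.isfinite(total) else math.inf


def fit_ma(x, q, sweeps=30):
    """Crude coordinate search minimising css; an assumed stand-in for MLE."""
    theta, best, step = [0.0] * q, css([0.0] * q, x), 0.5
    for _ in range(sweeps):
        improved = False
        for j in range(q):
            for delta in (step, -step):
                trial = list(theta)
                trial[j] += delta
                score = css(trial, x)
                if score < best:
                    theta, best, improved = trial, score, True
        if not improved:
            step /= 2.0  # refine the search once no move helps
    return theta, best


def neg_log_lik(ss, n):
    """Gaussian negative log likelihood with the variance estimated as ss / n."""
    return 0.5 * n * (math.log(2.0 * math.pi * ss / n) + 1.0)


# Stand-in criteria; k counts the q coefficients plus the variance.
CRITERIA = {
    "AIC": lambda nll, k, n: 2.0 * nll + 2.0 * k,
    "BIC": lambda nll, k, n: 2.0 * nll + k * math.log(n),
}


def run_experiment(true_orders=(1, 2), reps=5, n=200, max_q=3, seed=0):
    """Fraction of data sets on which each criterion picks the true order."""
    rng = random.Random(seed)
    hits = {name: 0 for name in CRITERIA}
    for q_true in true_orders:
        for _ in range(reps):
            # sum(|theta|) < 1 is sufficient for an invertible MA model.
            theta = [rng.uniform(-0.9 / q_true, 0.9 / q_true)
                     for _ in range(q_true)]
            x = simulate_ma(theta, n, rng)
            # Fit each candidate order once; every criterion sees the same fits.
            nlls = {q: neg_log_lik(fit_ma(x, q)[1], n)
                    for q in range(1, max_q + 1)}
            for name, score in CRITERIA.items():
                chosen = min(nlls, key=lambda q: score(nlls[q], q + 1, n))
                hits[name] += chosen == q_true
    total = len(true_orders) * reps
    return {name: count / total for name, count in hits.items()}


if __name__ == "__main__":
    print(run_experiment())
```

Sharing one set of fits across all criteria mirrors step 3: every criterion is scored on identical data and parameter estimates, so differences in accuracy come from the criteria alone. Scaling true_orders, reps and max_q up toward the 200 models per order and MA(1) to MA(8) of the slides would need a faster estimator than this search.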