New Graduate Course
Stevens Institute of Technology
Approved by GCC 06-12-13

School: Howe School of Technology Management
Course Title: Statistical Learning and Analytics Research Seminar
Program: Howe Ph.D. program
Course: MGT787

Catalog Description:
The large amount of corporate information now available requires a systematic, analytical approach to selecting the most important information and anticipating major events. Statistical learning algorithms support this process by helping to understand, model, and forecast the behavior of major corporate variables. This course introduces statistical and graphical models used in time series, machine learning, and data mining for inference and prediction. The emphasis of the course is on the learning capability of the algorithms and their application to finance, direct marketing, and operations. Students should have a basic knowledge of probability theory, linear algebra, and multivariate analysis.

Course Objectives:
Students will:
- Learn the fundamental concepts of time series analysis and statistical learning algorithms.
- Explore existing and new applications of statistical learning methods to finance, marketing, and operations problems.

Course Outcomes:
By the end of this course, students will be able to:
1. Understand the foundations of statistical learning algorithms.
2. Apply statistical models and analytical methods to several business domains using a statistical language such as R.
3. Recognize the value and the limits of statistical learning algorithms for solving business problems.

Prerequisites: A basic course in probability and statistics at the level of MGT 620. It is suggested that students have also taken MGT 718, Multivariate Analysis. Knowledge of linear algebra is very useful.

Credits: 3 credits

Grading Percentages:
Mid-term 25%
Final 35%
Projects 30%
Other (Participation) 10%

For Graduate Credit toward Degree or Certificate: X Yes   No   Not for Dept. Majors   Other

Project:
PhD students should prepare a paper that counts as the final project for this course. The paper should be based on a theoretical or applied exploration of one of the methods studied in this course or of any other method approved by the instructor. The research paper should follow (in general terms) this academic format. The following paper describes the statistical tests to be used to compare different learning algorithms:
Dietterich, T. G. (1998). Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation, 10 (7), 1895-1924. Postscript preprint. (Revised December 30, 1997).

Software:
R (open-source version of S-plus) is the main software package that will be used in this class. No prior knowledge of the package is required. R can be downloaded as a free package at http://www.r-project.org (with FinTS, fSeries, and the Ox package with G@RCH).

Mode of Delivery: X Class   Online   Modules   Other

Program/Department Ownership: Howe School of Technology Management
When first offered:
Department Point of Contact and Title: Germán Creamer, Associate Professor
Date approved by individual school and/or department curriculum committee: 05-06-13

Sample Syllabus:

Week     Topic(s)                                                    Reading(s)
Week 1   Linear algebra review
Week 2   Autoregressive & moving average models                      Tsay, ch. 2
Week 3   Seasonality, long memory ARMA and unit root test            Tsay, ch. 2
Week 4   Volatility modeling via conditional heteroscedastic models  Tsay, ch. 3
Week 5   Linear methods for classification                           Bishop, ch. 4; HTF, ch. 4
Week 6   Kernel methods                                              Bishop, ch. 6; HTF, ch. 6
Week 7   Support vector machines                                     Bishop, ch. 7; HTF, ch. 12
Week 8   Bayesian and Markovian graphical models                     Bishop, ch. 8; HTF, ch. 8
Week 9   EM algorithm                                                Bishop, ch. 9; HTF, ch. 8.5
Week 10  Inference                                                   Bishop, ch. 10
Week 11  Sampling methods                                            Bishop, ch. 11; HTF, ch. 8.6
Week 12  Latent variables                                            Bishop, ch. 12; HTF, ch. 11.7
Week 13  Hidden Markov models & linear dynamical systems             Bishop, ch. 13; HTF, ch. 3
Week 14  Ensemble methods                                            Bishop, ch. 14; HTF, ch. 8, 15 & 16

Note: Bishop is Christopher M. Bishop, Pattern Recognition and Machine Learning, 2006. HTF is Hastie, Tibshirani and Friedman, The Elements of Statistical Learning, 2010.

Textbook(s) or References:
Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning, Springer-Verlag, New York, 2010 (downloadable at http://www-stat.stanford.edu/~tibs/ElemStatLearn/).
R. S. Tsay, Analysis of Financial Time Series, 3rd ed., John Wiley, 2010. (Only chapters 2-3 will be reviewed in this course. The 2nd edition can be accessed through the library website.)

Optional Texts:
Tony Jebara, Machine Learning: Discriminative and Generative, Kluwer, Boston, MA, 2004.
R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, John Wiley & Sons, 2001.
Tom M. Mitchell, Machine Learning, McGraw-Hill Series in Computer Science, 1997.
Vasant Dhar and Roger Stein, Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, Upper Saddle River, 1997.

Specialized papers will be assigned to complement the course texts.