STAT 519 Multivariate Analysis, 2010 Spring

Instructor: Stephen Lee. Office: Brink 412. Phone: 5-7701. E-mail: stevel at uidaho dot edu

Office Hours: MWF 12:30-1:20 pm, or by appointment.

Course Homepage: http://www.webpages.uidaho.edu/~stevel/Stat519.html

Course Objectives: The objective of this class is to give you a solid and thorough knowledge of multivariate data analysis, so that you can understand the literature and properly analyze multivariate data. The emphasis will be on concepts and applications, i.e., developing a sound understanding of the methods, knowing which method to employ, and knowing their limitations.

Prerequisites: Stat 401 or equivalent coursework. You are expected to have a working knowledge of statistics, linear models, linear algebra, and computing in R/SAS. If you are concerned about your preparation, please come see me.

Text:
[1] Analyzing Multivariate Data, by J. Lattin, J.D. Carroll, and P.E. Green, Duxbury, 2003.
[2] An R and S-PLUS Companion to Multivariate Analysis, by Brian Everitt, Springer, 2007. (The datasets and R code used in the book are available at http://biostatistics.iop.kcl.ac.uk/publications/everitt/)
[3] SAS Companion to Analyzing Multivariate Data, by J. Lattin and Hasan Hamdan, Duxbury, 2008.
[4] Statistical Data Mining, by Wiesner Vos & Ludger Evers.
[5] Applied Multivariate Analysis Notes from U of Cambridge, at http://www.statslab.cam.ac.uk/~pat/AppMultNotes.pdf

Intended Course Coverage: Not necessarily in the following order.
Introduction to Linear Algebra
Introduction to R/SAS
Multivariate Exploratory Data Analysis
Vector and Matrix Geometry
Principal Component Analysis
Factor Analysis
Multidimensional Scaling
Hierarchical Clustering
Distance Measures
K-Means Clustering
Model-based Clustering
Multivariate Normal Distribution
MANOVA
Discriminant Analysis
Classification Trees (time permitting)
Random Forests (time permitting)
Canonical Correlation (time permitting)

Homework: You are expected to type your homework in LaTeX/Word/… Paste computer code in a constant-width font such as “Courier New” so that it aligns (proportional fonts such as “Times New Roman” do not align well). Turn in a printed copy of your homework in class on the due date. I will assign homework regularly. Please make things easy on me and yourself: make your homework easy to read and grade. Please type and show your work. Although many of the problems will involve computer work, your homework grade will be low if you just reproduce the computer code and output. You need to comment on your results and report your findings in answer to the questions asked. For the computational portions of the problems, I just want to see the MINIMUM computer output/graphics at the right spots. Finally, you must write up your homework independently, even if you discuss it with your peers. Similar homework papers suggest plagiarism, which will result in serious consequences according to UI policy.

Late Homework: Homework turned in after the due date will NOT be accepted, regardless of the reason. Please type your work and turn in a hard copy in class on the due date. Keep your own electronic copy (of course); if you cannot show up in class, you can e-mail it to me on the due date.

Computer Use: We will use the R and SAS languages. R is open-source freeware, available for download (Windows and Unix) from http://cran.us.r-project.org/. SAS is the traditional powerhouse in the field of statistics.
R is more interactive and produces better graphics. Both will be used for class demonstrations, and you are encouraged to learn BOTH. One language may be easier than the other for some questions in your homework and during the mid-term and final exams. The main thing is not the language you choose, but getting the job done the right way with a correct interpretation of the results.

Grades: For each person I will compute an overall score according to the formula

50% Homework + 20% Mid-term + 30% Final

and will assign grades according to the traditional cutoffs (i.e., 100-90, A; 80-89, B; …). Curving may or may not be considered once all homework and final grades are recorded, so do not rely on it for your grade! If there is any curving at all, it will take the form of dropping your worst homework assignment.

Exam Dates:

Mid-term Exam     Final Exam
3/11 Thursday     5/14 Friday
5:00-7:00 pm      12:30-2:30 pm

Exam Format: You will be tested on concepts and applications. Some data sets will be given to you a week prior to the exam date so that you can familiarize yourself with them and analyze them ahead of time. You may bring all your work and results to the exams. The exams are open book and open notes, and you can bring your laptop computer with you. Both the mid-term and final exams are comprehensive and cumulative, i.e., they cover all the material from day one.

The schedule and procedures above are subject to change; any changes will be announced in class.