Revised: March 7, 2016

Stevens Institute of Technology
Howe School of Technology Management

Syllabus
MGT 718 Multivariate Analysis

Fall, 2014

Yasuaki Sakamoto
Babbio 632
Tel: 201-216-8198
Fax: 201-216-5385
ysakamot@stevens.edu

Wednesdays, 6:15 pm
Babbio 641

Office Hours: Wednesday 5:30 and 9:00 pm
By appointment

Course & Web Address: BC 641
http://www.stevens.edu/moodle

Overview

This course introduces basic methods underlying multivariate analysis through computer applications using R, which is used by many data scientists and is an attractive environment for learning multivariate analysis. Students will master multivariate analysis techniques, including principal components analysis, factor analysis, structural equation modeling, multidimensional scaling, correspondence analysis, cluster analysis, multivariate analysis of variance, discriminant function analysis, logistic regression, as well as other methods used for dimension reduction, pattern recognition, classification, and forecasting. Students will build expertise in applying these techniques to real data through class exercises and a project, and learn how to visualize data and present results. This proficiency will prepare students to conduct their own independent research.
In this course, students will:
- master various techniques used in multivariate analysis
- learn how to apply multivariate analysis methods to real data
- improve their ability to think critically about data analysis and interpretation
- develop skills for visualizing and communicating results

Learning Goals

After taking this course, students will be able to:
- use R to analyze multivariate data
- visualize multivariate data and communicate results
- recognize patterns, classify information, and forecast events
- think critically about data and research findings

Additional learning objectives include the development of:
- Written and oral communication skills: the written project report will be used to assess written communication skills, and the oral presentations of the project will be used to assess oral communication skills.

Pedagogy

The course incorporates demonstrations, discussions, and in-class R exercises. Students are expected to complete a final project using their own data. The overall goal is to establish an active, comfortable, and creative learning environment.

Readings

Recommended textbooks
T. W. Anderson (2003). An Introduction to Multivariate Statistical Analysis, Third Edition, Wiley.
Abdelmonem A. Afifi, Virginia Clark, Susanne May (2004). Computer-Aided Multivariate Analysis, Fourth Edition, CRC Press.
Additional tutorials
Intro to R by GoogleDevelopers, Quick-R, inside-R, An Introduction to R, A short list of the most useful R commands, R reference card

Supplementary materials
Probability and statistics 1.151 or 18.05 from MIT OpenCourseWare (http://ocw.mit.edu/index.htm)

Assignments

- Take-home midterm exam, assigned in week 7: materials up to week 7
- Take-home final exam, assigned in week 13: comprehensive
- Final paper: the method and results sections in a journal manuscript format

Grading

Assignment      Percent
Midterm exam    30%
Final exam      30%
Final paper     40%
Total Grade     100%

Ethical Conduct

The following statement is printed in the Stevens Graduate Catalog and applies to all students taking Stevens courses, on and off campus.

“Cheating during in-class tests or take-home examinations or homework is, of course, illegal and immoral. A Graduate Academic Evaluation Board exists to investigate academic improprieties, conduct hearings, and determine any necessary actions. The term ‘academic impropriety’ is meant to include, but is not limited to, cheating on homework, during in-class or take home examinations and plagiarism.”

Consequences of academic impropriety are severe, ranging from receiving an “F” in a course, to a warning from the Dean of the Graduate School, which becomes a part of the permanent student record, to expulsion.

Reference: The Graduate Student Handbook, Academic Year 2003-2004 Stevens Institute of Technology, page 10.

Consistent with the above statements, all homework exercises, tests and exams that are designated as individual assignments MUST contain the following signed statement before they can be accepted for grading.

____________________________________________________________________
I pledge on my honor that I have not given or received any unauthorized assistance on this assignment/examination.
I further pledge that I have not copied any material from a book, article, the Internet or any other source except where I have expressly cited the source.

Signature _________________________ Date: _____________

Please note that assignments in this class may be submitted to www.turnitin.com, a web-based anti-plagiarism system, for an evaluation of their originality.

Course/Teacher Evaluation

Continuous improvement can only occur with feedback based on comprehensive and appropriate surveys. Your feedback is an important contributor to decisions to modify course content/pedagogy, which is why we strive for 100% class participation in the survey.

All course teacher evaluations are conducted on-line. You will receive an e-mail one week prior to the end of the course informing you that the survey site (https://www.stevens.edu/assess) is open, along with instructions for accessing the site. Log in using your campus username and password. All responses are strictly anonymous. We especially encourage you to clarify your position on any of the questions and to give explicit feedback on your overall evaluations in the section at the end of the formal survey, which allows for written comments. We ask that you submit your survey prior to the close of the examination period.

Course Schedule

Week 1
Topic: Overview and goals of multivariate analysis

Week 2
Topic: Statistical computing using the R environment, review of descriptive statistics
Reading:
- The 2013 KDnuggets Software Poll
- Why use R? and Follow up (pdfs)
- An introduction to R (pp 7-17)
- Using R for introductory statistics (pp 1-32)
Exercise: Install R, try R code, and submit output

Week 3
Topic: Getting used to R, review of probabilities and inferential statistics
Reading:
- Basic statistics (a pdf note)
- An introduction to R (pp 18-39)
- Using R for introductory statistics (pp 41-77)
Exercise: Think about data for a project in your research area.

Week 4
Topic: Looking at multivariate data, visualization methods, preparing for data analysis, selecting appropriate methods
Reading:
- Basic statistics
- An introduction to R (pp 62-75)
- Using R for introductory statistics (pp 32-41)
Exercise: Graph and interpret the data

Week 5
Topic: Simple regression, multiple regression, and correlation
Reading:
- Basic statistics
- An introduction to R (pp 50-61)
- Using R for introductory statistics (pp 77-89)
Exercise: Detect relationships between variables

Week 6
Topic: PCA, matrix manipulation, eigenvectors and eigenvalues
Reading: Computer-Aided Multivariate Analysis: PCA (pdf)
Exercise: Reduce the number of dimensions in the data

Week 7
Topic: Exploratory and confirmatory factor analysis
Reading: Computer-Aided Multivariate Analysis: Factor Analysis (pdf)
Exercise: Find underlying dimensions in the data

Week 8
Topic: Path diagrams and structural equation modeling
Reading: Using Multivariate Statistics: SEM (pdf)
Exercise: Detect structure in the data

Week 9
Topic: Multidimensional scaling and correspondence analysis
Reading: An R and S-PLUS Companion to Multivariate Analysis: MDS and correspondence analysis (pdf)
Exercise: Measure distance and find spatial relationships

Week 10
Topic: Clustering
Reading: An R and S-PLUS Companion to Multivariate Analysis: Cluster Analysis (pdf)
Exercise: Measure distance and partition data points

Week 11
Topic: Discriminant function analysis, MANOVA, Bayes net, neural net
Reading: Using Multivariate Statistics: Discriminant function analysis, MANOVA (pdf)
Exercise: Classify events

Week 12
Topic: Logistic regression, binomially distributed data, maximum likelihood
Reading: Using Multivariate Statistics: Logistic regression (pdf)
Exercise: Predict events

Week 13
Topic: Forecasting
Reading: Using Multivariate Statistics: Time-series analysis (pdf)
Exercise: Analyze longitudinal data

Week 14
Topic: Presenting results
Exercise: Writing method and results sections in a journal manuscript format