Statistics 505: Applied Multivariate Statistical Analysis
Fall 2014

Instructor: Matthew Reimherr
Department of Statistics
411 Thomas Building
mreimherr@psu.edu

Class Schedule: MWF 2:30-3:20, 111 Chambers Building

Office Hours: W 4:00-5:00 and by appointment

Textbook: Applied Multivariate Statistical Analysis, 6th ed., Johnson and Wichern.

Computing: This course will use R for all computational needs. Each week, as appropriate, different aspects and tools in R will be explored. For homework, R is encouraged, but students may use any statistical software they prefer (SAS, Minitab, SPSS, Stata, etc.). Please be advised that the instructor cannot guarantee support for any software other than R.

Webpage: Available through ANGEL; check regularly for updates.

Course Topics:
1. Review of basic matrix operations and random vectors
2. Numerical and graphical summaries of multivariate data
3. Multivariate normal distribution
4. Inference for multivariate means
5. Comparison of two or more mean vectors
6. Multivariate Linear Regression
7. Principal Components
8. Factor Analysis
9. Canonical Correlation
10. Discrimination and Classification
11. Cluster Analysis

Learning Objectives: This course aims to provide students with an introduction to the common statistical tools employed when handling multivariate data. Emphasis will be placed on exploring and understanding the methods discussed and on seeing them employed in a variety of applications.
1. The first objective is to provide a brief review of concepts that should be familiar to students. This includes concepts from linear algebra as well as various techniques for summarizing and visualizing data.
2. The backbone of most multivariate statistical procedures is the multivariate normal. Students will learn the basics of this extension of the scalar normal, including its theoretical properties and how to simulate multivariate normal data.
3. The first inferential tools learned will concern the means of multivariate samples (one and two sample). This section will also be used to emphasize how estimation and testing are conducted in higher dimensions. Computational tools and examples will be discussed.
4. Next we will move to multivariate regression, the backbone of multivariate analysis. Estimation and testing will be examined. Computational machinery, applications, and interpretations will be discussed.
5. The first topic that does not have a scalar analogue is Principal Component Analysis (PCA). Students will be introduced to PCA and to the concept and benefits of dimension reduction. The methods of factor analysis and canonical correlation will then be explored and contrasted with PCA.
6. After the low-dimensional techniques have been explored, we will move into classification and clustering of multivariate data. Model-based methods will be contrasted with purely algorithmic schemes.

Grading:
Homework: 40%
Midterm Exam: 25%
Midterm Report: 10%
Final Report and Presentation: 25%

Homework: Homework will be assigned every Friday and will be due the following Friday. All homework must be turned in by Friday to me or the dropbox on ANGEL. No late homework will be accepted. You are encouraged to work together on homework, but each student must turn in their own write-up and answers.

Exams: There will be one exam for this course, which will take place roughly one third of the way through the semester.

Projects/Presentation: Students will form teams of 1-3 and work on a project throughout the semester. A midterm report will be submitted that outlines the primary goals and details of the project. A final report will be submitted at the end of the semester, and students will give a 10-15 minute presentation based on that report. The project can fall under one of two categories:
1. Students may investigate a relevant research paper/methodology not described in the class. Students will need to present a short simulation study, of their own design, which explores the procedure.
2. Students may present a novel data application where they apply several of the discussed methods.

Academic Integrity: Academic integrity is the pursuit of scholarly activity free from fraud and deception and is an educational objective of this institution. All University policies regarding academic integrity apply to this course. Academic dishonesty includes, but is not limited to, cheating, plagiarizing, fabricating information or citations, facilitating acts of academic dishonesty by others, having unauthorized possession of examinations, submitting work of another person or work previously used without informing the instructor, or tampering with the academic work of other students. All exam answers must be your own, and you must not provide any assistance to other students during exams.

Disability Services: Penn State welcomes students with disabilities into the University’s educational programs. If you have a disability-related need for reasonable academic adjustments in this course, contact the Office for Disability Services (ODS) at 814-863-1807 (V/TTY). For further information regarding ODS, please visit the Office for Disability Services Web site at http://equity.psu.edu/ods/. In order to receive consideration for course accommodations, you must contact ODS and provide documentation. If the documentation supports the need for academic adjustments, ODS will provide a letter identifying appropriate academic adjustments. Please share this letter and discuss the adjustments with your instructor as early in the course as possible.