Statistics 505: App Mult Statistical Analysis Fall 2014

Instructor: Matthew Reimherr
Department of Statistics
411 Thomas Building
mreimherr@psu.edu
Class Schedule: MWF 2:30-3:20, 111 Chambers Building
Office Hours: W 4:00-5:00 and by appointment
Textbook: Applied Multivariate Statistical Analysis, 6th ed., Johnson and Wichern.
Computing: This course will utilize R for all computational needs. Each week, as
appropriate, different aspects/tools in R will be explored. For homework, while R is
encouraged, students may use any statistical software they prefer (SAS, Minitab,
SPSS, Stata, etc.). Please be advised that the instructor cannot guarantee support for
any software other than R.
Webpage: Available through ANGEL; check regularly for updates.
Course Topics:
1. Review of basic matrix operations and random vectors
2. Numerical and graphical summaries of multivariate data
3. Multivariate normal distribution
4. Inference for multivariate means
5. Comparison of two or more mean vectors
6. Multivariate Linear Regression
7. Principal Components
8. Factor Analysis
9. Canonical Correlation
10. Discrimination and Classification
11. Cluster Analysis
Learning Objectives:
This course aims to provide students with an introduction to the common statistical
tools employed when handling multivariate data. Emphasis will be placed on
exploring and understanding the methods discussed and on seeing them employed in a
variety of applications.
1. The first objective is to provide a brief review of concepts that should be
familiar to students. This includes concepts from linear algebra as well as
various techniques for summarizing/visualizing data.
2. The backbone of most multivariate statistical procedures is the multivariate
normal. Students will learn the basics about this extension of the scalar
normal, including theoretical properties and how to simulate multivariate
normal data.
3. The first inferential tools learned will be concerned with the means of
multivariate samples (one and two sample). This section will also be used to
emphasize how estimation and testing are conducted in higher dimensions.
Computational tools and examples will be discussed.
4. Next we will move to multivariate regression, the backbone of multivariate
analysis. Estimating and testing will be examined. Computational machinery,
applications, and interpretations will be discussed.
5. The first topic which does not have a scalar analogue is Principal Component
Analysis (PCA). Students will be introduced to PCA and the concept/benefits
of dimension reduction. The methods of factor analysis and canonical
correlation will then be explored and contrasted with PCA.
6. After the low-dimensional techniques have been explored, we will move into
classification and clustering of multivariate data. Model based methods will
be contrasted with purely algorithmic schemes.
Grading:
Homework: 40%
Midterm Exam: 25%
Midterm Report: 10%
Final Report and Presentation: 25%
Homework:
Homework will be assigned every Friday and will be due the following Friday. All
homework must be turned in by Friday to me or the dropbox on ANGEL. No late
homework will be accepted. You are encouraged to work together on homework,
but each student must turn in their own write-up and answers.
Exams: There will be one exam for this course, which will take place roughly one third of
the way through the semester.
Projects/Presentation: Students will form teams of 1-3 and work on a project
throughout the semester. A midterm report will be submitted which outlines the
primary goals and details of the project. A final report will be submitted at the end
of the semester and students will give a 10-15 minute presentation based on that
report. This project can fall under one of two categories:
1. Students may investigate a relevant research paper/methodology not
described in the class. Students will need to present a short simulation study,
of their own design, which explores the procedure.
2. Students may present a novel data application where they apply several of
the discussed methods.
Academic Integrity:
Academic integrity is the pursuit of scholarly activity free from fraud and deception
and is an educational objective of this institution. All University policies regarding
academic integrity apply to this course. Academic dishonesty includes, but is not
limited to, cheating, plagiarizing, fabricating of information or citations, facilitating
acts of academic dishonesty by others, having unauthorized possession of
examinations, submitting work of another person or work previously used without
informing the instructor, or tampering with the academic work of other students. All
exam answers must be your own, and you must not provide any assistance to other
students during exams.
Disability Services:
Penn State welcomes students with disabilities into the University’s educational
programs. If you have a disability-related need for reasonable academic adjustments
in this course, contact the Office for Disability Services (ODS) at 814-863-1807
(V/TTY). For further information regarding ODS, please visit the Office for Disability
Services Web site at http://equity.psu.edu/ods/. In order to receive consideration
for course accommodations, you must contact ODS and provide documentation. If
the documentation supports the need for academic adjustments, ODS will provide a
letter identifying appropriate academic adjustments. Please share this letter and
discuss the adjustments with your instructor as early in the course as possible.