STAT 519 Multivariate Analysis, 2010 Spring

Instructor: Stephen Lee. Office: Brink 412. Phone: 5-7701. E-mail: stevel at uidaho dot edu
Office Hours: MWF 12:30-1:20 pm, or by appointment.
Course Homepage: http://www.webpages.uidaho.edu/~stevel/Stat519.html
Course Objectives: The objective of this class is to give you a solid and thorough knowledge of
multivariate data analysis so that you can understand the literature and properly analyze
multivariate data. The emphasis will be on concepts and applications, i.e., developing a sound
understanding of the methods and knowing which method to employ and what its limitations are.
Prerequisites: Stat 401 or equivalent coursework. You are expected to have a working knowledge of
statistics, linear models, linear algebra, and computing in R/SAS. If you are concerned about your
preparation, please come see me.
Text:
[1] Analyzing Multivariate Data, by J. Lattin, J.D. Carroll, and P.E. Green, Duxbury, 2003.
[2] An R and S-PLUS Companion to Multivariate Analysis, by Brian Everitt, Springer, 2007. (The datasets
and R code used in the book are available at http://biostatistics.iop.kcl.ac.uk/publications/everitt/)
[3] SAS Companion to Analyzing Multivariate Data, by J. Lattin and Hasan Hamdan, Duxbury, 2008.
[4] Statistical Data Mining, by Wiesner Vos & Ludger Evers.
[5] Applied Multivariate Analysis Notes from U of Cambridge at
http://www.statslab.cam.ac.uk/~pat/AppMultNotes.pdf
Intended Course Coverage: Not necessarily in the following order.
Introduction To Linear Algebra
Introduction to R/SAS
Multivariate Exploratory Data Analysis
Vector and Matrix Geometry
Principal Component Analysis
Factor Analysis
Multidimensional Scaling
Hierarchical Clustering
Distance Measures
K-Means Clustering
Model-based Clustering
Multivariate Normal Distribution
MANOVA
Discriminant Analysis
Classification Trees (time permitting)
Random Forests (time permitting)
Canonical Correlation (time permitting)
Homework: Type your homework in LaTeX/Word/… Copy and paste computer code in a
constant-width font such as “Courier New” so that it stays aligned (unlike
“Times New Roman”, which does not align well). Turn in a printed copy of your homework in class
on the due date. I will assign homework regularly. Please make things easy on me and
yourself: make your homework easy to read and grade, and show your work.
Although many of the problems will involve computer work, your homework grade will be low if you
just reproduce the computer code and output. You need to comment on your results and report your
findings in answer to the questions asked. For the computational portions of the problems, I just want to see the
MINIMUM computer output/graphics at the right spots.
Finally, you must write up your homework independently, even if you discuss it with your
peers. Similar homework papers suggest plagiarism, which will result in serious consequences according to
UI policy.
Late Homework: Homework will NOT be accepted after the due date, regardless of the reason.
Please type your work and turn in a hard copy in class on the due date. Keep your own electronic
copy (of course); you can e-mail it to me on the due date if you cannot come to class.
Computer Use: We will use the R and SAS computer languages. R is free, open-source software and is available
for download (Windows and Unix) from http://cran.us.r-project.org/. SAS is the traditional
powerhouse in the field of statistics. R is more interactive and produces better graphics. Both will be used
for class demonstrations, and you are encouraged to learn BOTH. One language may be easier than the other for some
questions in your homework and during the mid-term and final exams. The main thing is
not the language you choose, but getting the job done in the right way with a correct interpretation of the results.
Grades: For each person I will compute an overall score according to the formula 50% Homework + 20%
Mid-term + 30% Final, and will assign grades according to the traditional cutoffs (i.e., 100-90, A; 80-89, B;
…). Curving may or may not be considered once all homework and final grades are recorded, so do
not rely on it for your grade! If there is any curving at all, it will take the form of dropping your worst
homework assignment.
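As a minimal sketch of the arithmetic in the grading formula, assuming each component is recorded on a 0-100 scale: the function names below are hypothetical, and only the A and B cutoffs stated above are encoded. Treating any score of 90 or above as an A and any score from 80 up to 90 as a B is an assumption about how fractional scores are handled.

```python
def overall_score(homework, midterm, final):
    """Weighted overall score: 50% homework + 20% mid-term + 30% final.

    Assumes each input is already a percentage on a 0-100 scale.
    """
    return 0.50 * homework + 0.20 * midterm + 0.30 * final


def letter_grade(score):
    """Map an overall score to a letter using only the cutoffs listed above.

    Assumption: >= 90 is an A and >= 80 is a B. Lower cutoffs are not
    listed in the syllabus, so None is returned for them.
    """
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    return None


if __name__ == "__main__":
    # Illustrative (made-up) component scores for one student.
    score = overall_score(homework=92, midterm=85, final=88)
    print(round(score, 1), letter_grade(score))
```

With the made-up scores above, the weighted sum is 46 + 17 + 26.4 = 89.4, which falls in the B range.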
Exam Dates:
Mid-term Exam: 3/11 Thursday, 5:00-7:00 pm
Final Exam: 5/14 Friday, 12:30-2:30 pm
Exam Format: You will be tested on concepts and applications. Some data sets will be given to you a
week prior to the exam date so that you can familiarize yourself with them and analyze them ahead of time. Bring all your work
and results to the exams. The exams are open book and open notes, and you can bring your laptop
computer with you. The mid-term and final exams are comprehensive and cumulative, i.e., they cover all the
material from day one.
The schedule and procedures above are subject to change; any changes will be announced in class.