Applied Multivariate Statistical Analysis in the Environmental Sciences
ENTX 6300

COURSE INFORMATION

Description: This course explains the mechanisms behind the multivariate techniques used in the environmental sciences, at a level suited to students in those fields. A wide variety of examples will be presented in class, with emphasis on the code, functions, and packages needed for multivariate data analysis in R and on the interpretation of R output. By the end of the course, students will be able to apply multivariate techniques to their own research.

Instructor: Rodica Gelca (http://www.tiehh.ttu.edu/rgelca/)

Credits: 3

Textbooks: No textbook is required. Lecture notes will be provided.

Recommended references:
1. An Introduction to R. Free to download at http://www.r-project.org/.
2. Dalgaard, P. 2002. Introductory Statistics with R. Springer-Verlag.
3. Everitt, B., Hothorn, T. 2011. An Introduction to Applied Multivariate Analysis with R. Springer (eBook at TTU library).
4. Shaw, P. 2003. Multivariate Statistics for the Environmental Sciences.

Prerequisites: Students are expected to understand basic concepts in statistics and to know how to use R, or to have permission from the instructor.

Meeting Time: Tuesdays and Thursdays, 3:30-5:00 pm.

Meeting Place: Experimental Science Building, room 120.

Office hours: Tuesdays and Thursdays, 5:00-6:00 pm (Holden Hall, room 72), or by appointment.

Attendance: Attendance is expected but not mandatory. Attending class will help students better understand the concepts presented.

Assignments: Assignments will be given at approximately two-week intervals.

Lecture Topics:

Introduction to Multivariate Data Analysis
1. R overview
2. Multivariate data and multivariate analysis: a brief history of the development of multivariate analysis, types of variables, missing values, covariance, correlation, distances, and the multivariate normal density function.
3. Multivariate data visualization: graphical display, scatterplot, bivariate boxplot, the convex hull of bivariate data, chi-plot, the bubble and other glyph plots, the scatterplot matrix, and trellis graphics.
4. Basic concepts in ordination: direct and indirect ordination.

Principal Components Analysis
1. Introduction
2. Normalizing the data
3. The extraction of principal components
4. Eigenvalues and eigenvectors
5. Calculating principal components scores
6. Plotting the principal components
7. Biplots
8. Examples in R

Cluster Analysis
1. Introduction
2. Agglomerative hierarchical clustering
3. K-means clustering
4. Model-based clustering
5. Displaying clustering solutions graphically
6. Examples in R

Multidimensional Scaling
1. Introduction
2. Models for proximity data
3. Spatial models for proximities: Multidimensional scaling
4. Classical multidimensional scaling
5. Non-metric multidimensional scaling
6. Examples in R

Correspondence Analysis
1. Introduction
2. The correspondence analysis algorithm
3. Limitations of correspondence analysis
4. Detrended correspondence analysis
5. The mechanics of detrended correspondence analysis
6. Examples in R

Canonical Correspondence Analysis
1. Introduction
2. The mechanics of canonical correspondence analysis
3. Examples in R

Grading:
Assignments: 60%
Final exam: 40%