STAT 462: Applied Regression Analysis

Instructor: Qing (Wendy) Wang
Email: quw104@psu.edu
Office: 422 Thomas Building
Office Hours: Mon 2:00-3:00 pm, or by appointment
Phone: 865-8634

TA: Wei Wang
Email: wzw114@psu.edu
Office: 301 Thomas Building
Office Hours: Mon 12:30-1:30 pm
Phone: 863-2314

Text: Applied Linear Regression Models, 4th edition, by Neter, Nachtsheim, Kutner, and Wasserman (recommended but NOT required)

Description: STAT 462 is an applied linear regression course that involves "hands on" data analysis. Students enrolling in this course should have taken at least one other Statistics course and should be familiar with the fundamentals of statistical testing and estimation.

Computer Usage and Data Sets: Data analysis is emphasized, so computers will be used frequently during the course. Throughout the course, Minitab for Windows will be used to analyze the data for handouts and lecture demonstrations in class. Those wishing to install Minitab on their own computers may go to www.minitab.com/education for details. Data sets can be found on the PSU ANGEL website.

Grading: There are 500 equally weighted points.

    Homework      200 pts
    Exam 1        100 pts
    Exam 2        100 pts
    Final Exam    100 pts

Homework: The homework is worth 40% of the semester grade and consists of weekly lab activities and textbook exercises. Lab activities must be submitted on ANGEL during class time. Each week's homework assignment is collected during the following Tuesday's lab unless otherwise specified. Late work will not be accepted without permission from the instructor, and no late homework will be accepted once solutions have been posted.

Exams: Each exam is worth 20% of the semester grade. Most questions will be short answer and will include Minitab analysis output. A simple calculator may be used, and some formulas and tables will be provided. Dates for these exams are July 16, July 30, and August 13 (tentative).
Conflicts on exam dates must be resolved in advance. An unexcused absence on an exam date will incur at least a 10% penalty and may result in a score of 0.

Please allow 24 hours for email response.

Letter Grades: Semester grades are assigned according to this scale.

    A     93 – 99%
    A-    90 – 92%
    B+    87 – 89%
    B     83 – 86%
    B-    80 – 82%
    C+    77 – 79%
    C     70 – 76%
    D     60 – 69%
    F      0 – 59%

Announcements: Lecture notes, lab activities, assignments, and all due dates will be posted on ANGEL, available at www.angel.psu.edu or from the PSU homepage. Students are expected to check ANGEL several times a week for updates.

Academic Integrity: This course will follow the guidelines found in Section 49-20 of the University Faculty Senate Policies for Students. See http://www.science.psu.edu/academic/Integrity for details concerning academic integrity.

Specific Topics Usually Covered

1. Simple Linear Regression Model
    Model for E(Y), model for distribution of errors
    Least squares estimation of model for E(Y)
    Estimation of variance

2. Inferences for Simple Linear Model
    Inferences concerning the slope (confidence intervals and t-test)
    Confidence interval estimate of the mean Y at a specific X
    Prediction interval for a new Y
    Analysis of Variance partitioning of variation in Y
    R-squared calculation and interpretation

3. Diagnostic Procedures for Aptness of Model
    Residual analyses
        o Plots of residuals versus fits, residuals versus X
        o Tests for normality of residuals
        o Lack of Fit test, Pure Error, Lack of Fit concepts
    Transformations as solution to problems with the model

4. Matrix Notation and Literacy
    X matrix, β vector, Y vector
    (X'X)^-1 X'Y estimates the coefficient vector
    Variance-covariance matrix

5. Multiple Regression Models and Estimation
    Hyperplane extension to simple linear model
    Interaction models
    Basic estimation and inference for multiple regression

6. General Linear F Test and Sequential SS
    Reduced and Full models
    F test for general linear hypotheses
    Effects of a variable controlled for other predictors
        o Sequential SS
        o Partial correlation

7. Multicollinearity between X Variables
    Effect on standard deviations of coefficients
    Problems interpreting effects of individual variables
    Apparent conflicts between overall F test and individual variable t tests
    Benefits of designed experiments

8. Polynomial Regression Models

9. Categorical Predictor Variables
    Indicator variables
    Interpretation of models containing indicator variables
    Piecewise regression

10. More Diagnostic Measures and Remedial Measures for Lack of Fit
    Variance Inflation Factors
    Deleted residuals
    Influence statistics: hat matrix, Cook's D, and related measures

11. Examining All Possible Regressions
    R², MSE, Cp, and PRESS criteria
    Stepwise algorithms

12. Miscellaneous Topics as Time Permits
    Estimating the regression model if residuals have autocorrelation
    Logistic regression models
    Nonlinear regression
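
Illustration (not course material): The least squares estimates named in topics 1, 2, and 4 can be computed directly from the data. The following is a minimal Python sketch under stated assumptions: the course itself uses Minitab, and the function name, dictionary keys, and data values here are made up for illustration. For a single predictor, the matrix formula (X'X)^-1 X'Y reduces to the slope Sxy/Sxx and the intercept ybar - b1*xbar used below.

```python
def fit_simple_linear_regression(x, y):
    """Least squares fit of y = b0 + b1*x (hypothetical helper for illustration)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Corrected sums of squares and cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept

    # Partition the variation in y: SSTO = SSR + SSE.
    fitted = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssto = sum((yi - y_bar) ** 2 for yi in y)

    return {
        "b0": b0,
        "b1": b1,
        "mse": sse / (n - 2),          # estimate of the error variance
        "r_squared": 1 - sse / ssto,   # proportion of variation explained
    }


# Made-up data, not one of the course data sets.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = fit_simple_linear_regression(x, y)
print(fit)
```

The MSE divides by n - 2 because two parameters (intercept and slope) are estimated from the data.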