STAT 462: Applied Regression Analysis
Spring 2015, Section 001

Instructor: Lingzhou Xue
Email: Through ANGEL or lingzhou@psu.edu
Office: 318 Thomas
Office Hours: Mon. and Wed. 9:00am – 10:00am, 11:15am – 12:15pm, and by appointment

TA: Muzi Zhang
Email: muz135@psu.edu
Office: 301 Thomas
Office Hours: Tue. 11:00am – 1:00pm; Thur. 12:30pm – 2:30pm, and by appointment

Time and Location:
Lecture: Mon. and Fri. 8:00am – 8:50am, 215 Thomas
Lab: Wed. 8:00am – 8:50am, 112 Boucke
(Lecture and lab attendance is required.)

Course Description: This is an applied linear regression course that involves hands-on data analysis. Students enrolling in this course should have taken at least one other statistics course and be familiar with the fundamentals of statistical testing and estimation. Statistical data analysis is emphasized, so computers will be used frequently during the course.

Prerequisite: One of STAT 200, STAT 220, STAT 240, STAT 250, STAT 301, or STAT 401

Course Website: Class announcements and materials will be regularly posted on ANGEL. Please check the ANGEL site frequently. Homework and lab assignments, solutions, data sets, etc. will be posted there.

Materials:
Textbook (recommended but NOT required): Applied Linear Statistical Models, 5th edition, by Kutner, Nachtsheim, Neter, and Li; or Applied Linear Regression Models, 4th edition, by Neter, Nachtsheim, and Kutner
Lecture Notes for Lecture and Lab Materials: will be posted on ANGEL before class
Software: Minitab, which is available in all Penn State computer labs on campus. Information on obtaining Minitab can be found at http://stat.psu.edu/education/statistical-software-packages. An alternative is to use WebApps (http://webapps.psu.edu). Its drawbacks are that it is slower, requires an internet connection, and cannot access local files directly. However, you can find out how to load your local data into your PASS space at http://clc.its.psu.edu/UnivServices/WebFiles.

Grading:
1.
Lab Activities (15%): There will be approximately eight (8) lab activity assignments throughout the semester, which are administered and completed in class.
- Lab activities are due during the class time in which they are assigned, and they should be submitted in the drop box on ANGEL.
- NO LATE LAB ACTIVITIES WILL BE ACCEPTED.
- The lowest grade will be dropped prior to calculating the final grade.
2. Homework (15%): There will be approximately six (6) homework assignments throughout the semester.
- You will NOT need Minitab to answer questions, but software output may be included.
- The homework will be given and collected on Friday at the beginning of class.
- You must show all work on the homework problems to receive full credit.
- A reasonable amount of collaboration is allowed, but each student must turn in his or her own written work, which reflects his or her own understanding of the material.
- NO LATE HOMEWORK WILL BE ACCEPTED unless the student has PRIOR permission from the instructor.
- The lowest grade will be dropped prior to calculating the final grade.
3. Quizzes (5%): There will be five (5) quizzes administered in class, each worth 1% of the final grade.
- Quizzes CANNOT be made up without a legitimate excuse.
4. Midterm Exams (40%): There will be two (2) midterm exams administered in class, each worth 20% of the final grade.
- They will NOT require statistical software but may include output in the problems.
- You will be allowed to bring and use the following items only: one double-sided formula sheet, pens or pencils, and a simple calculator.
- Tentative dates for the two midterm exams are February 24 and April 14.
- Midterm exams CANNOT be made up or rescheduled without a legitimate excuse.
5. Final Project (25%): In lieu of a final exam, there will be a team project. The final project report will be worth 20% of the final grade, and the final project presentation will be worth 5% of the final grade.
- Teams of 3 to 4 students will be assigned after the first midterm. I will take requests for teammates, and all others will be randomly assigned. Students are encouraged to pick their own data set and topic, but I can suggest back-up data sets if you cannot find one.
- Each team will present its project to the rest of the class during the last two weeks of class. After the presentation, students can receive feedback and revise their final analysis and report, which are due on the last day of finals week, Friday, May 8.
- The final project report should explicitly include logs of group meetings/efforts and report individual contributions in the appendix.
- More information on the project will be given in class and posted on ANGEL.

Grading Scale: Final grades will be determined as follows:
A: [93, 100]
A-: [90, 93)
B+: [87, 90)
B: [83, 87)
B-: [80, 83)
C+: [77, 80)
C: [70, 77)
D: [60, 70)
F: [0, 60)

Other Course Rules:
- You have one week to appeal a homework or exam grade. No grade changes will be made more than one week after a graded homework or exam is returned.
- Make-up midterm exams might be allowed, with prior arrangement, for students with direct conflicts due to other exams or required university activities (chess team, athletics, field trip, Blue Band trip, etc.). The director of that program must provide a valid letter requesting that the student be excused from the exam.
- Students are responsible for all announcements and supplements given in any lecture or email.
- If you need to leave class early, please sit in the rear and leave as quietly as possible.
- Please be courteous to your classmates and keep extra noise to a minimum.
- All cell phones must be turned off before you enter the classroom.

Academic Integrity: All Penn State and Eberly College of Science policies regarding academic integrity apply to this course. See http://science.psu.edu/current-students/Integrity for details.
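
As an illustration of how the weights and letter-grade intervals listed under Grading and Grading Scale combine, here is a minimal Python sketch. It is not an official grade calculator. It assumes every score is a percentage from 0 to 100, that items within a category count equally, and that no rounding is applied before the scale is used; none of these details are stated in the syllabus. The function names are hypothetical.

```python
# Illustrative sketch only; the equal weighting within a category and the
# absence of rounding are assumptions, not course policy.

def drop_lowest_mean(scores):
    """Mean of percentage scores after dropping the single lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    return (sum(scores) - min(scores)) / (len(scores) - 1)


def final_percentage(labs, homeworks, quizzes, midterms, report, presentation):
    """Combine category scores (each 0-100) using the syllabus weights."""
    return (
        0.15 * drop_lowest_mean(labs)          # Lab Activities, lowest dropped
        + 0.15 * drop_lowest_mean(homeworks)   # Homework, lowest dropped
        + sum(0.01 * q for q in quizzes)       # five quizzes, 1% each
        + sum(0.20 * m for m in midterms)      # two midterms, 20% each
        + 0.20 * report                        # final project report
        + 0.05 * presentation                  # final project presentation
    )


# Lower bounds of the half-open intervals in the Grading Scale, highest first.
SCALE = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"),
    (80, "B-"), (77, "C+"), (70, "C"), (60, "D"),
]


def letter_grade(percentage):
    """Map a final percentage to a letter using the intervals above."""
    for lower_bound, letter in SCALE:
        if percentage >= lower_bound:
            return letter
    return "F"


# Made-up scores for one student: one missed lab (0) is the dropped lab.
total = final_percentage(
    labs=[80] * 7 + [0],
    homeworks=[90] * 6,
    quizzes=[100] * 5,
    midterms=[85, 75],
    report=90,
    presentation=100,
)
print(round(total, 2), letter_grade(total))
```

Because the intervals are half-open, a score exactly on a boundary (for example 90) falls into the higher grade.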

ECOS Code of Mutual Respect and Cooperation: The Eberly College of Science (ECOS) Code of Mutual Respect and Cooperation embodies the values that we hope our faculty, staff, and students possess and will endorse to make The Eberly College of Science a place where every individual feels respected and valued, as well as challenged and rewarded. http://www.science.psu.edu/climate/Code-of-Mutual-Respect final.pdf

Accommodations for Students with Disabilities: Penn State welcomes students with disabilities into the University's educational programs. If you have a disability-related need for reasonable academic adjustments in this course, contact the Office for Disability Services (ODS) at 814-863-1807 (V/TTY). For further information regarding ODS, please visit the Office for Disability Services Web site at http://equity.psu.edu/ods/. In order to receive consideration for course accommodations, you must contact ODS and provide documentation (see the documentation guidelines at http://equity.psu.edu/ods/student-information). If the documentation supports the need for academic adjustments, ODS will provide a letter identifying appropriate academic adjustments. Please provide the letter and discuss any adjustments with me as early in the course as possible. You must contact ODS and request academic adjustment letters at the beginning of each semester.

Specific Topics to Be Covered
1. Simple Linear Regression Model
- Model for E(Y), model for distribution of errors
- Least squares estimation of model for E(Y)
- Estimation of variance
2. Inferences for Simple Linear Regression Model
- Inferences concerning the slope (confidence intervals and t-test)
- Confidence interval estimate of the mean Y at a specific X
- Prediction interval for a new Y
- Analysis of Variance partitioning of variation in Y
- R-squared calculation and interpretation
3.
Diagnostic Procedures for Aptness of Model: assessing regression assumptions
- Residual analyses
- Plots of residuals versus fits, residuals versus x
- Tests for normality of residuals
- Lack of Fit test, Pure Error, Lack of Fit concepts
- Transformations as solution to problems with the model
4. Multiple Regression Models and Estimation
- Matrix Notations
- Hyperplane extension to simple linear model
- Basic estimation and inference for multiple regression
5. Additional Topics for Multiple Regression Analysis
- General Linear F test and Sequential SS
- Effects of a variable controlled for other predictors
- Sequential SS
- Partial Correlation
- Multicollinearity between X variables
- Effect on standard deviations of coefficients
- Problems interpreting effects of individual variables
- Apparent conflicts between overall F test and individual variable t tests
- Benefits of designed experiments
6. Categorical Predictor Variables
- Indicator Variables
- Interpretation of models containing indicator variables
7. Model Comparison and Selection Methods
- R-squared, MSE, Cp, and PRESS criteria
- Stepwise algorithms
8. Logistic Regression: categorical outcome variables
- Binary outcome: Bernoulli Distribution
- Interpretation of models: odds ratio
9. Miscellaneous Topics as Time Permits

TENTATIVE SCHEDULE: THIS IS SUBJECT TO CHANGE.

Week No. and Dates:
1   Jan. 12-16
2   Jan. 19-23
3   Jan. 26-30
4   Feb. 2-6
5   Feb. 9-13
6   Feb. 16-20
7   Feb. 23-27
8   Mar. 2-6
9   Mar. 9-13
10  Mar. 16-20
11  Mar. 23-27
12  Mar. 30-Apr. 3
13  Apr. 6-10
14  Apr. 13-17
15  Apr. 20-24
16  Apr. 27-May 1
17  May 4-8

Chapter & Topic (in schedule order):
Chapter 1: Introduction and Simple Linear Regression
Chapter 2: Inferences for Simple Linear Regression
Chapter 2: Inferences for Simple Linear Regression (Cont.)
Chapter 3: Assessing Regression Assumptions
Chapter 4: Introduction to Multiple Regression
Chapter 4: Introduction to Multiple Regression (Cont.)
Chapter 5: General Linear F-Test Procedure and Multicollinearity
Chapter 5: General Linear F-Test Procedure and Multicollinearity (Cont.)
SPRING BREAK
Chapter 6: Categorical Predictors and Indicator Variables
Chapter 7: Model Comparison and Selection Methods
Chapter 7: Model Comparison and Selection Methods (Cont.)
Chapter 8: Diagnostics for Multiple Regression
Chapter 9: Logistic Regression
Chapter 9: Logistic Regression (Cont.)
Project Presentations
Project Presentations (Cont.)
FINALS WEEK

Assignments Due:
MIDTERM #1 on Feb. 24
MIDTERM #2 on Apr. 14
WORK ON PROJECTS; REPORT due Fri. May 8