STAT 462: Applied Regression Analysis
SYLLABUS – Fall 2014, Section 002
Penn State University

Instructor: Stefanie Austin
Email: Through ANGEL or stefanie.austin@psu.edu
Office: 424A Thomas
Office Hours: Mon. and Fri. 3:30pm – 4:30pm, and by appointment

TA: Ame Osotsi
Email: auo141@psu.edu
Office: 301 Thomas
Office Hours: Tue. and Wed. 3:00pm – 4:00pm

Class: MF 4:40pm – 5:30pm, 215 Thomas
       W 4:40pm – 5:30pm, 004 Life Science

Course Description: This is an applied linear regression course that involves hands-on data analysis. Students enrolling in this course should have taken at least one other statistics course and be familiar with the fundamentals of statistical testing and estimation.

Prerequisite: One of STAT 200, STAT 220, STAT 240, STAT 250, STAT 301, or STAT 401

Materials:
Textbooks (optional): Applied Linear Regression Models, 4th edition, by Neter, Nachtsheim, and Kutner, or Applied Linear Statistical Models, 5th edition, by Kutner, Nachtsheim, Neter, and Li
Software: Data analysis is emphasized, so computers will be used frequently during the course. This course will use the statistical software Minitab, version 16 or 17, which is installed in computer labs throughout campus. You may purchase Minitab for your personal computer if you wish, but note that it is available only for PC. For Mac users we recommend SPSS. Information on obtaining both software packages can be found at the Statistical Software page (http://stat.psu.edu/education/statistical-software-packages). Another alternative is to use WebApps (http://webapps.psu.edu). Its drawbacks are that it is slower, requires an internet connection, and cannot access local files directly. However, instructions for loading your local data into PASS space are available at http://clc.its.psu.edu/UnivServices/WebFiles.

Course Website: Class announcements and materials will be posted regularly on ANGEL, so it is recommended that you check the site frequently.
Materials such as lecture notes, homework and lab assignments, and solutions will be posted.

Grading:
1. Lab Activities (10%): There will be approximately seven (7) lab activity assignments throughout the semester, which are administered and completed in class.
   - Lab activities are due during the class time in which they are assigned.
   - NO LATE LAB ACTIVITIES WILL BE ACCEPTED.
   - The lowest grade will be dropped prior to calculating the final grade.
2. Homework (20%): There will be approximately six (6) homework assignments throughout the semester.
   - You will NOT need Minitab or SPSS to answer the questions, though software output may be included in the problems.
   - The homework will generally be due every other Monday at the beginning of class.
   - You must show all work to receive full credit.
   - A reasonable amount of collaboration is allowed, but each student must turn in his or her own written work, which reflects his or her understanding of the material.
   - NO LATE HOMEWORK WILL BE ACCEPTED unless the student has prior permission from the instructor.
   - The lowest grade will be dropped prior to calculating the final grade.
3. Midterm Exams (40%): There will be two (2) midterm examinations administered in class, each worth 20% of the final grade.
   - They will NOT require statistical software but may include output in the problems.
   - You may bring and use only the following items: one double-sided page of notes, plain scratch paper, pens or pencils, and a calculator. You may not use any other items or receive help from anyone.
   - Midterm exams CANNOT be made up or rescheduled without a legitimate excuse.
   Midterm #1 will tentatively take place on Friday, October 10 and will cover Chapters 1 through 4.
   Midterm #2 will tentatively take place on Friday, November 21 and will cover Chapters 5 through 8.
4. Final Project (30%): In lieu of a final exam, there will be a team project.
   - Teams of 3 to 4 students will be assigned after the first midterm exam.
     I will take requests for teammates, and all others will be randomly assigned.
   - You will use techniques learned from class to analyze a data set and draw conclusions.
   - You are encouraged to pick your own data set and topics, but I can suggest data sets if you cannot find one.
   - Each team will give a 20-minute presentation on their methods and findings during the last two weeks of class.
   - After the presentation, students can receive feedback and revise their final report, which is due the last day of finals week, Friday, December 19.
   - More information on the project will be given in class and posted on ANGEL.

Grading Scale: A (93-100); A- (90-92); B+ (87-89); B (83-86); B- (80-82); C+ (77-79); C (70-76); D (60-69); F (0-59)

Academic Integrity: All Penn State and Eberly College of Science policies regarding academic integrity apply to this course. See http://science.psu.edu/current-students/Integrity for details.

ECOS Code of Mutual Respect and Cooperation: The Eberly College of Science Code of Mutual Respect and Cooperation (http://www.science.psu.edu/climate/Code-of-Mutual-Respect final.pdf) embodies the values that we hope our faculty, staff, and students possess and will endorse to make The Eberly College of Science a place where every individual feels respected and valued, as well as challenged and rewarded.

Accommodations for Students with Disabilities: Penn State welcomes students with disabilities into the University's educational programs. If you have a disability-related need for reasonable academic adjustments in this course, contact the Office for Disability Services (ODS) at 814-863-1807 (V/TTY). For further information regarding ODS, please visit the Office for Disability Services Web site at http://equity.psu.edu/ods/. In order to receive consideration for course accommodations, you must contact ODS and provide documentation (see the documentation guidelines at http://equity.psu.edu/ods/student-information).
If the documentation supports the need for academic adjustments, ODS will provide a letter identifying appropriate academic adjustments. Please provide the letter and discuss any adjustments with me as early in the course as possible. You must contact ODS and request academic adjustment letters at the beginning of each semester.

Specific Topics Usually Covered
1. Simple Linear Regression Model
   - Model for E(Y), model for distribution of errors
   - Least squares estimation of model for E(Y)
   - Estimation of variance
2. Inferences for Simple Linear Regression Model
   - Inferences concerning the slope (confidence intervals and t-test)
   - Confidence interval estimate of the mean Y at a specific X
   - Prediction interval for a new Y
   - Analysis of Variance partitioning of variation in Y
   - R-squared calculation and interpretation
3. Diagnostic Procedures for Aptness of Model: assessing regression assumptions
   - Residual analyses
   - Plots of residuals versus fits, residuals versus x
   - Tests for normality of residuals
   - Lack of Fit test, Pure Error, Lack of Fit concepts
   - Transformations as solution to problems with the model
4. Multiple Regression Models and Estimation
   - Matrix Notations
   - Hyperplane extension to simple linear model
   - Basic estimation and inference for multiple regression
5. Additional Topics for Multiple Regression Analysis
   - General Linear F test and Sequential SS
   - Effects of a variable controlled for other predictors
   - Sequential SS
   - Partial Correlation
   - Multicollinearity between X variables
   - Effect on standard deviations of coefficients
   - Problems interpreting effects of individual variables
   - Apparent conflicts between overall F test and individual variable t tests
   - Benefits of designed experiments
6. Categorical Predictor Variables
   - Indicator Variables
   - Interpretation of models containing indicator variables
7. Model Comparison and Selection Methods
   - R-squared, MSE, Cp, and PRESS criteria
   - Stepwise algorithms
8. Logistic Regression: categorical outcome variables
   - Binary outcome: Bernoulli Distribution
   - Interpretation of models: odds ratio
9. Miscellaneous Topics as Time Permits

TENTATIVE SCHEDULE: THIS IS SUBJECT TO CHANGE and is only intended to serve as a guideline. I will try to keep an updated schedule on ANGEL. Weeks are defined as Monday to Sunday.

Week No.  Days                Chapter & Topic                                                   Assignments Due
1         Aug. 25 to Aug. 31  Chapter 1: Introduction and Simple Linear Regression
2         Sep. 1 to Sep. 7    Chapter 2: Inferences for Simple Linear Regression
3         Sep. 8 to Sep. 14   Chapter 2: Inferences for Simple Linear Regression
4         Sep. 15 to Sep. 21  Chapter 3: Assessing Regression Assumptions                       HW #1 due Mon. 9/15
5         Sep. 22 to Sep. 28  Chapter 4: Intro to Multiple Regression
6         Sep. 29 to Oct. 5   Chapter 4: Intro to Multiple Regression
7         Oct. 6 to Oct. 12   Chapter 5: General Linear F-Test Procedure and Multicollinearity  HW #2 due Mon. 10/6; MIDTERM #1 on Fri. 10/10, covering chapters 1-4
8         Oct. 13 to Oct. 19  Chapter 5: General Linear F-Test Procedure and Multicollinearity
9         Oct. 20 to Oct. 26  Chapter 6: Categorical Predictors and Indicator Variables         HW #3 due Mon. 10/20
10        Oct. 27 to Nov. 2   Chapter 7: Model Comparison and Selection Methods
11        Nov. 3 to Nov. 9    Chapter 7: Model Comparison and Selection Methods                 HW #4 due Mon. 11/3
12        Nov. 10 to Nov. 16  Chapter 8: Diagnostics for Multiple Regression
13        Nov. 17 to Nov. 23  Chapter 9: Logistic Regression                                    HW #5 due Mon. 11/17; MIDTERM #2 on Fri. 11/21, covering chapters 5-8
14        Nov. 24 to Nov. 30  THANKSGIVING BREAK
15        Dec. 1 to Dec. 7    Chapter 9: Logistic Regression; Project Presentations
16        Dec. 8 to Dec. 14   Project Presentations                                             HW #6 due Mon. 12/8
17        Dec. 15 to Dec. 21  FINALS WEEK – WORK ON PROJECTS                                    PROJECT REPORT due Fri. 12/19
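
Worked example of the grading arithmetic: the following Python sketch illustrates how the weights in the Grading section (labs 10%, homework 20%, two midterms at 20% each, project 30%) and the Grading Scale could combine into a final letter grade. It is only an illustration, not the official calculation. The function names are hypothetical, and several details are assumptions the syllabus does not state: that each score is a percentage from 0 to 100, that scores within a category are averaged with equal weight after the lowest is dropped, and that the lower bound of each grade range acts as a cutoff for non-integer totals.

```python
# Category weights as stated in the Grading section.
LAB_WEIGHT = 0.10
HOMEWORK_WEIGHT = 0.20
MIDTERM_WEIGHT_EACH = 0.20   # two midterms, 20% each
PROJECT_WEIGHT = 0.30

# Lower bounds from the Grading Scale. Using them as cutoffs for
# non-integer totals is an assumption; the syllabus lists integer ranges.
SCALE = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
         (77, "C+"), (70, "C"), (60, "D"), (0, "F")]


def drop_lowest_mean(scores):
    """Equal-weight mean of percent scores after dropping the lowest one.

    Equal weighting within a category is an assumption.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop the lowest")
    kept = sorted(scores)[1:]          # discard the single lowest score
    return sum(kept) / len(kept)


def final_percent(labs, homework, midterms, project):
    """Weighted final percentage from the four graded components."""
    if len(midterms) != 2:
        raise ValueError("the syllabus specifies exactly two midterms")
    return (LAB_WEIGHT * drop_lowest_mean(labs)
            + HOMEWORK_WEIGHT * drop_lowest_mean(homework)
            + MIDTERM_WEIGHT_EACH * sum(midterms)
            + PROJECT_WEIGHT * project)


def letter_grade(percent):
    """Map a final percentage to a letter using the Grading Scale."""
    for lower_bound, letter in SCALE:
        if percent >= lower_bound:
            return letter
    return "F"


if __name__ == "__main__":
    # Made-up scores for illustration only.
    pct = final_percent(
        labs=[100, 90, 80, 0, 100, 90, 80],
        homework=[85, 95, 75, 85, 95, 65],
        midterms=[80, 90],
        project=90,
    )
    print(round(pct, 1), letter_grade(pct))
```

In this sketch the missed lab (the 0) and the lowest homework (the 65) do not affect the result, which reflects the drop-lowest policy for those two categories. Midterm and project scores are never dropped.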