STAT 462: Applied Regression Analysis SYLLABUS – Fall 2014, Section 002

Penn State University
Instructor: Stefanie Austin
Email: Through ANGEL or stefanie.austin@psu.edu
Office: 424A Thomas
Office Hours: Mon. and Fri. 3:30pm – 4:30pm, and by appointment
TA: Ame Osotsi
Email: auo141@psu.edu
Office: 301 Thomas
Office Hours: Tue. and Wed. 3:00pm – 4:00pm
Class:
MF 4:40pm – 5:30pm 215 Thomas
W 4:40pm – 5:30pm 004 Life Science
Course Description: This is an applied linear regression course that involves hands-on data analysis.
Students enrolling in this course should have taken at least one other statistics course and be familiar
with the fundamentals of statistical testing and estimation.
Prerequisite: One of STAT 200, STAT 220, STAT 240, STAT 250, STAT 301, or STAT 401
Materials:
Textbooks (optional): Applied Linear Regression Models, 4th edition, by Neter, Nachtsheim,
and Kutner, or Applied Linear Statistical Models, 5th edition, by Kutner, Nachtsheim, Neter, and
Li
Software: Data analysis is emphasized, so computers will be used frequently during the course.
This course will use the statistical software Minitab, version 16 or 17. This software can be
found throughout the computer labs on campus. You may purchase Minitab for your personal
computer if you wish, but note that it is available only for PC. For Mac users, we recommend
SPSS. Information on obtaining both software packages can be found at the Statistical Software
page (http://stat.psu.edu/education/statistical-software-packages).
Another alternative is to use WebApps (http://webapps.psu.edu). Its drawbacks are that it is
slower, requires an internet connection, and cannot access local files directly. However, you
can find out how to load your local data into your PASS space at
http://clc.its.psu.edu/UnivServices/WebFiles.
Course Website: Class announcements and materials (lecture notes, homework and lab assignments,
solutions, etc.) will be posted regularly on ANGEL, so it is recommended that you check the site
frequently.
Grading:
1. Lab Activities (10%): There will be approximately seven (7) lab activity assignments throughout
the semester, which are administered and completed in class.
- Lab activities are due during the class time in which they are assigned.
- NO LATE LAB ACTIVITIES WILL BE ACCEPTED.
- The lowest grade will be dropped prior to calculating the final grade.
2. Homework (20%): There will be approximately six (6) homework assignments throughout the
semester.
- You will NOT need Minitab or SPSS to answer the questions, though software output
may be included in the problems.
- The homework will generally be due on every other Monday at the beginning of class.
- You must show all work to receive full credit.
- A reasonable amount of collaboration is allowed, but each student must turn in his or her
own written work which reflects his or her understanding of the material.
- NO LATE HOMEWORK WILL BE ACCEPTED unless the student has prior permission
from the instructor.
- The lowest grade will be dropped prior to calculating the final grade.
3. Midterm Exams (40%): There will be two (2) midterm examinations administered in class, each
worth 20% of the final grade.
- They will NOT require statistical software but may include output in the problems.
- You may bring and use only the following items: one double-sided page of notes, plain
scratch paper, pens or pencils, and a calculator. You may not use any other items or receive
help from anyone.
- Midterm exams CANNOT be made up or rescheduled without a legitimate excuse.
- Midterm #1 will tentatively take place on Friday, October 10 and will cover
Chapters 1 through 4.
- Midterm #2 will tentatively take place on Friday, November 21 and will cover
Chapters 5 through 8.
4. Final Project (30%): In lieu of a final exam, there will be a team project.
- Teams of 3 to 4 students will be assigned after the first midterm exam. I will take
requests for teammates and all others will be randomly assigned.
- You will use techniques learned from class to analyze a data set and draw conclusions.
- You are encouraged to pick your own data set and topics, but I can suggest datasets if you
cannot find one.
- Each team will give a 20-minute presentation on its methods and findings during
the last two weeks of class.
- After the presentation, students can receive feedback and revise their final report, which
is due the last day of finals week, Friday, December 19.
- More information on the project will be given in class and posted on ANGEL.
Grading Scale:
A (93-100); A- (90-92); B+ (87-89); B (83-86); B- (80-82); C+ (77-79); C (70-76);
D (60-69); F (0-59)
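The weighting above can be sketched as a short calculation. The following Python snippet is only an illustration, not an official grade calculator. It assumes that all scores are percentages on a 0-100 scale, that items within a category count equally, and that a fractional final percentage earns the highest letter whose lower bound it meets; none of these details are stated in the syllabus. The function names and the sample scores are hypothetical.

```python
# Category weights from the Grading section (fractions of the final grade).
WEIGHTS = {"labs": 0.10, "homework": 0.20, "midterms": 0.40, "project": 0.30}

# Lower bound of each letter range from the Grading Scale.
# Assumption: a fractional score earns the highest letter whose bound it meets.
LETTER_CUTOFFS = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
                  (77, "C+"), (70, "C"), (60, "D"), (0, "F")]


def mean_dropping_lowest(scores):
    """Average percentage scores after dropping the single lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]
    return sum(kept) / len(kept)


def final_percentage(labs, homework, midterms, project):
    """Combine the four categories using the syllabus weights."""
    return (WEIGHTS["labs"] * mean_dropping_lowest(labs)
            + WEIGHTS["homework"] * mean_dropping_lowest(homework)
            + WEIGHTS["midterms"] * sum(midterms) / len(midterms)
            + WEIGHTS["project"] * project)


def letter_grade(percentage):
    """Map a final percentage to a letter using the cutoffs above."""
    for cutoff, letter in LETTER_CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


# Made-up scores for one hypothetical student.
example = final_percentage(
    labs=[90, 90, 90, 90, 90, 90, 50],
    homework=[85, 85, 85, 85, 85, 60],
    midterms=[80, 90],
    project=90,
)
print(example, letter_grade(example))
```

Because the lowest lab and homework scores are dropped before averaging, a single missed lab or homework does not lower the category average.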
Academic Integrity: All Penn State and Eberly College of Science policies regarding academic integrity
apply to this course. See http://science.psu.edu/current-students/Integrity for details.
ECOS Code of Mutual Respect and Cooperation: The Eberly College of Science Code of Mutual
Respect and Cooperation (http://www.science.psu.edu/climate/Code-of-Mutual-Respect final.pdf)
embodies the values that we hope our faculty, staff, and students possess and will endorse to make The
Eberly College of Science a place where every individual feels respected and valued, as well as
challenged and rewarded.
Accommodations for Students with Disabilities: Penn State welcomes students with disabilities into
the University's educational programs. If you have a disability-related need for reasonable academic
adjustments in this course, contact the Office for Disability Services (ODS) at 814-863-1807 (V/TTY).
For further information regarding ODS, please visit the Office for Disability Services Web site at
http://equity.psu.edu/ods/ .
In order to receive consideration for course accommodations, you must contact ODS and provide
documentation (see the documentation guidelines at http://equity.psu.edu/ods/student-information). If the
documentation supports the need for academic adjustments, ODS will provide a letter identifying
appropriate academic adjustments. Please provide the letter and discuss any adjustments with me as early
in the course as possible. You must contact ODS and request academic adjustment letters at the beginning
of each semester.
Specific Topics Usually Covered
1. Simple Linear Regression Model
- Model for E(Y), model for distribution of errors
- Least squares estimation of model for E(Y)
- Estimation of variance
2. Inferences for Simple Linear Regression Model
- Inferences concerning the slope (confidence intervals and t-test)
- Confidence interval estimate of the mean Y at a specific X
- Prediction interval for a new Y
- Analysis of Variance partitioning of variation in Y
- R-squared calculation and interpretation
3. Diagnostic Procedures for Aptness of Model: assessing regression assumptions
- Residual analyses
- Plots of residuals versus fits, residuals versus x
- Tests for normality of residuals
- Lack of Fit test, Pure Error, Lack of Fit concepts
- Transformations as solution to problems with the model
4. Multiple Regression Models and Estimation
- Matrix Notations
- Hyperplane extension to simple linear model
- Basic estimation and inference for multiple regression
5. Additional Topics for Multiple Regression Analysis
- General Linear F test and Sequential SS
- Effects of a variable controlled for other predictors
- Sequential SS
- Partial Correlation
- Multicollinearity between X variables
- Effect on standard deviations of coefficients
- Problems interpreting effects of individual variables
- Apparent conflicts between overall F test and individual variable t tests
- Benefits of designed experiments
6. Categorical Predictor Variables
- Indicator Variables
- Interpretation of models containing indicator variables
7. Model Comparison and Selection Methods
- R-squared, MSE, Cp, and PRESS criteria
- Stepwise algorithms
8. Logistic Regression: categorical outcome variables
- Binary outcome: Bernoulli Distribution
- Interpretation of models: odds ratio
9. Miscellaneous Topics as Time Permits
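As a rough illustration of topics 1 and 2 above, the following Python sketch computes the least squares estimates for a simple linear regression, along with the MSE (the estimate of the error variance), the standard error of the slope, and R-squared. The course itself uses Minitab; Python appears here only to show the arithmetic. The function name and the data are hypothetical, and the code assumes at least three observations with non-constant x values.

```python
import math


def fit_simple_linear_regression(x, y):
    """Least squares fit of y = b0 + b1 * x, with basic summary quantities."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values must not all be equal")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Partition the variation in y: SSTO = SSR + SSE.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ssto = sum((yi - y_bar) ** 2 for yi in y)

    mse = sse / (n - 2)               # estimate of the error variance
    se_slope = math.sqrt(mse / sxx)   # standard error of the slope
    r_squared = 1 - sse / ssto if ssto > 0 else float("nan")

    return {"intercept": intercept, "slope": slope, "mse": mse,
            "se_slope": se_slope, "r_squared": r_squared}


# Made-up data for illustration only.
fit = fit_simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(fit)
```

Dividing the slope by its standard error gives the t statistic used for the slope test, with n - 2 degrees of freedom.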
TENTATIVE SCHEDULE: THIS IS SUBJECT TO CHANGE and is only intended to serve as a
guideline. I will try to keep an updated schedule on ANGEL. Weeks are defined as Monday to Sunday.
Week No.  Days                 Chapter & Topic                                                   Assignments Due
1         Aug. 25 to Aug. 31   Chapter 1: Introduction and Simple Linear Regression
2         Sep. 1 to Sep. 7     Chapter 2: Inferences for Simple Linear Regression
3         Sep. 8 to Sep. 14    Chapter 2: Inferences for Simple Linear Regression
4         Sep. 15 to Sep. 21   Chapter 3: Assessing Regression Assumptions                       HW #1 due Mon. 9/15
5         Sep. 22 to Sep. 28   Chapter 4: Intro to Multiple Regression
6         Sep. 29 to Oct. 5    Chapter 4: Intro to Multiple Regression
7         Oct. 6 to Oct. 12    Chapter 5: General Linear F-Test Procedure and Multicollinearity  HW #2 due Mon. 10/6; MIDTERM #1 on Fri. 10/10, covering chapters 1-4
8         Oct. 13 to Oct. 19   Chapter 5: General Linear F-Test Procedure and Multicollinearity
9         Oct. 20 to Oct. 26   Chapter 6: Categorical Predictors and Indicator Variables         HW #3 due Mon. 10/20
10        Oct. 27 to Nov. 2    Chapter 7: Model Comparison and Selection Methods
11        Nov. 3 to Nov. 9     Chapter 7: Model Comparison and Selection Methods                 HW #4 due Mon. 11/3
12        Nov. 10 to Nov. 16   Chapter 8: Diagnostics for Multiple Regression
13        Nov. 17 to Nov. 23   Chapter 9: Logistic Regression                                    HW #5 due Mon. 11/17; MIDTERM #2 on Fri. 11/21, covering chapters 5-8
14        Nov. 24 to Nov. 30   THANKSGIVING BREAK
15        Dec. 1 to Dec. 7     Chapter 9: Logistic Regression; Project Presentations
16        Dec. 8 to Dec. 14    Project Presentations                                             HW #6 due Mon. 12/8
17        Dec. 15 to Dec. 21   FINALS WEEK – WORK ON PROJECTS                                    PROJECT REPORT due Fri. 12/19