HUDM 5122: Applied Regression Analysis
Spring 2015

Lecture: Tuesday 1:00pm – 2:40pm
Labs:
Tuesday 3 – 4:40 (Lauren)
Tuesday 5 – 6:40 (Lauren)
Thursday 11 – 12:40 (Josh)
Thursday 1 – 2:40 (Josh)

Professor: L. Elizabeth Tipton
Office Hours: sign up at https://tipton.youcanbook.me/
Email: tipton@tc.columbia.edu
Office: Grace Dodge 462

TA1: Lauren Fellers
Email: laf2156@tc.columbia.edu
Office hours:

TA2: Josh Paxton
Email: jap2208@tc.columbia.edu
Office hours:

Prerequisite: HUDM 4122 or permission of instructor

Course Description:
Least squares estimation theory. Traditional simple and multiple regression models and polynomial regression models, with grouping variables including one-way ANOVA, two-way ANOVA, and analysis of covariance.

Required textbooks:
Allison, P. (1998) Multiple regression: A primer. The Pine Forge Press series in research methods and statistics. Thousand Oaks, CA: Sage. ($25 - $45)
Fox, J. (1991) Regression diagnostics. Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-079. Newbury Park, CA: Sage. ($5 - $16)
Hardy, M. (1993) Regression with dummy variables. Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-093. Newbury Park, CA: Sage. ($9 - $16)
Menard, S. (2001) Applied logistic regression analysis. Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-106. Thousand Oaks, CA: Sage. ($8 - $16)

Suggested textbooks (on reserve at the library):
Fox, J. (2008) Applied Regression Analysis and Generalized Linear Models. Second Edition. SAGE. (HA31.3.F69 2008 ON RESERVE)
Kutner, M.H., Nachtsheim, C.J., and Neter, J. (2004) Applied linear regression models. McGraw Hill. (QA278.2.A65 2004 ON RESERVE)
Field, A. (2009) Discovering statistics using SPSS. Third edition. SAGE. (HA32.F54 2009 ON RESERVE)

Software
We will use SPSS in the laboratory sessions. Use of other software is permitted; please speak to me if you would like to do this.
Best resource: UCLA statistical computing: http://www.ats.ucla.edu/stat/

The method of least squares is the automobile of modern statistical analysis: despite its limitations, occasional accidents, and incidental pollution, this method and its numerous variations, extensions, and related conveyances carry the bulk of statistical analyses, and are known and valued by nearly all.
--Stephen M. Stigler (1999, p. 320)

Course objectives:
(1) Identify, code, and recode continuous and discrete variables and choose the appropriate statistical procedure for their analysis;
(2) Describe relationships between predictor variables and a continuous outcome variable;
(3) Model, estimate, and interpret point estimates, standard errors, confidence intervals, and hypothesis tests for regression slopes;
(4) Delineate the assumptions of the linear statistical models used and evaluate whether particular analyses meet these assumptions;
(5) Recognize and explain similarities and differences between regression and ANOVA models;
(6) Conduct analyses to diagnose problems of multicollinearity, influential points, model fit, etc.;
(7) Become familiar with a nationally representative dataset and issues associated with using data of this kind for analysis;
(8) Formulate questions, identify appropriate models, estimate models, and interpret the results for questions of real social science interest.

Grading:

50% Homework
There are 6 assignments. The first 5 are REQUIRED. The 6th assignment is optional and its score can replace your lowest grade. For example, if you had scores of 79, 100, 97, 95, and 93 on HW 1-5, you could do HW #6 and replace the 79 with your HW #6 grade. You are encouraged to work together on your homework. However, students must submit separate homework assignments and must run their own analyses. NO LATE HOMEWORK WILL BE ACCEPTED.

20% Midterm
The midterm will cover Weeks 1 – 7. You will be allowed one 8 ½ x 11 one-sided "cheat sheet".
30% Final Project
You will choose a question, models, and hypotheses, and run the analysis using methods learned in class. You will write this up as a research publication, but with a shorter introduction section and a longer methods section.

Schedule

| Unit | Week / Date | Class Lecture | Reading (BEFORE Tuesday) | Homework due date (in lecture) |
|---|---|---|---|---|
| Intro & Review | (1) 1/27 | Syllabus, Introduction, Review | Runyon Ch 11 & 12 (on Moodle) | |
| Regression | (2) 2/3 | Simple regression | Allison Ch 1 & 5; Fox Ch 5 & 6 (on Moodle) | HW #1 {Review} |
| | (3) 2/10 | Multiple regression | Allison Ch 2 & 3; Fox Ch 5 & 6 (on Moodle) | HW #2.1 {Simple regression} |
| | (4) 2/17 | Qualitative variables | Hardy GB sections 1 – 3 (pp. 1 – 29) | HW #2.2 {Multiple Reg} |
| | (5) 2/24 | Interactions | Allison Ch 8; Hardy GB section 4 (pp. 29 – 63) | HW #3.1 {Dummy vars} |
| | (6) 3/3 | Non-linearity | Allison Ch 8; Hardy GB pp. 78 – 82; Fox GB section 7 (pp. 53 – 61) | HW #3.2 {Interactions} |
| Midterm | (7) 3/10 | Exam (2 hours) | | HW #3.3 {nonlinearity}; Midterm |
| | (8) 3/17 | SPRING BREAK | | |
| Working with real data | (9) 3/24 | Real data | Handout on Moodle: Datasets; Codebooks; Cleaning/Importing | Midterms returned |
| | (10) 3/31 | Choosing a model (causality, omitted variable bias, variable selection) | On Moodle: Gelman & Hill; Duncan & Gipson-Davis; Left-handed death; Explain vs. predict | HW #4.1 {real data} |
| | (11) 4/7 | Outliers, leverage, and influence | Allison Ch 6; Fox GB section 4 (pp. 21 – 40) | HW #4.2 {models} |
| | (12) 4/14 | Non-normality, Non-constant variance, Collinearity | Allison Ch 7; Fox GB sections 3, 5, 6, 10 (pp. 10 – 21; 40 – 53; 75 – 79); Hardy GB pp. 82 – 84 | HW #5.1 {outliers, etc} |
| Extensions | (13) 4/21 | Final Project information given out; Generalized linear models (e.g., logistic regression) | Menard GB sections 1 – 2.2; 3 (pp. 1 – 27; 41 – 67); Hardy GB pp. 75 – 78 | HW #5.2 {nonnorm, etc} |
| | (14) 4/28 | Other topics in regression (e.g., Poisson regression) | Hardy, pp. 56 – 60 | HW #6.1 |
| Final Project | (15) 5/5 | Review / meetings about final projects | | HW #6.2 {extensions} |
| | (16) 5/12 | Final project due | | Final Project [due 5/12, 10 am] |

Additional Statements

1. The College will make reasonable accommodations for persons with documented disabilities. Students are encouraged to contact the Office of Access and Services for Individuals with Disabilities for information about registration (166 Thorndike Hall). Services are available only to students who are registered and submit appropriate documentation. As your instructor, I am happy to discuss specific needs with you as well.

2. The grade of Incomplete will be assigned only when the course attendance requirement has been met but, for reasons satisfactory to the instructor, the granting of a final grade has been postponed because certain course assignments are outstanding. If the outstanding assignments are completed within one calendar year from the date of the close of term in which the grade of Incomplete was received and a final grade submitted, the final grade will be recorded on the permanent transcript, replacing the grade of Incomplete, with a transcript notation indicating the date that the grade of Incomplete was replaced by a final grade. If the outstanding work is not completed within one calendar year from the date of the close of term in which the grade of Incomplete was received, the grade will remain as a permanent Incomplete on the transcript. In such instances, if the course is a required course or part of an approved program of study, students will be required to re-enroll in the course, including repayment of all tuition and fee charges for the new registration, and satisfactorily complete all course requirements.
If the required course is not offered in subsequent terms, the student should speak with the faculty advisor or Program Coordinator about their options for fulfilling the degree requirement. Doctoral students with six or more credits with grades of Incomplete included on their program of study will not be allowed to sit for the certification exam.

3. Teachers College students have the responsibility for activating the Columbia University Network ID (UNI) and a free TC Gmail account. As official communications from the College (e.g., information on graduation, announcements of closing due to severe storm, flu epidemic, transportation disruption, etc.) will be sent to the student's TC Gmail account, students are responsible for either reading email there or using the mail forwarding option to forward mail from their account to an email address which they will monitor.

4. It is the policy of Teachers College to respect its members' observance of their major religious holidays. Students should notify instructors at the beginning of the semester about their wishes to observe holidays on days when class sessions are scheduled. Where academic scheduling conflicts prove unavoidable, no student will be penalized for absence due to religious reasons, and alternative means will be sought for satisfying the academic requirements involved. If a suitable arrangement cannot be worked out between the student and the instructor, students and instructors should consult the appropriate department chair or director. If an additional appeal is needed, it may be taken to the Provost.