HUDM 5122-1: Applied Regression Analysis
Spring 2013
Thursday: 11:00am – 2:00pm
Location: Thompson 229

Professor: L. Elizabeth Tipton
Office Hours: Tuesday 5-6, Thursday 2-4
Email: tipton@tc.columbia.edu
Office: Grace Dodge 453G

TA1: Marcus Waldman
Email: waldman@tc.columbia.edu
Office hours: Wednesdays, 12:30pm

TA2: Zack Fisher
Email: zff2000@tc.columbia.edu
Office hours: Saturday, 2pm (Library)

Prerequisite: HUDM 4122 or permission of instructor

Course Description: Least squares estimation theory. Traditional simple and multiple regression models and polynomial regression models, with grouping variables including one-way ANOVA, two-way ANOVA, and analysis of covariance.

Required textbook:
Chatterjee, S. & Hadi, A.S. (2006) Regression analysis by example. Wiley.
Book website: http://www.ilr.cornell.edu/~hadi/RABE4/

Highly suggested:
Allison, P. (1998) Multiple regression: A primer. (QA278.2.A435 1999 ON RESERVE)

Suggested textbooks (on reserve at the library):
Angrist, J.D. and Pischke, J.S. (2009) Mostly harmless econometrics: An empiricist’s companion. Princeton University Press. (HB139.A54 2009 ON RESERVE)
Berk, R.A. (2004) Regression analysis: A constructive critique. Advanced Quantitative Techniques in the Social Sciences Series. SAGE Publications. (QA278.2.B46 2004 ON RESERVE)
Fox, J. (2008) Applied Regression Analysis and Generalized Linear Models. Second Edition. SAGE. (HA31.3.F69 2008 ON RESERVE)
Kutner, M.H., Nachtsheim, C.J., and Neter, J. (2004) Applied linear regression models. McGraw Hill. (QA278.2.A65 2004 ON RESERVE)
Murnane, R.J. and Willett, J.B. (2011) Methods matter: Improving causal inference in educational and social science research. Oxford Press. (LB1028.M86 2010 ON RESERVE)

Software: We will use SPSS in the laboratory sessions. Use of other software is permitted; please speak to me if you would like to do this.
Field, A. (2009) Discovering statistics using SPSS. Third edition. SAGE. (HA32.F54 2009 ON RESERVE)

Websites:
UCLA statistical computing: http://www.ats.ucla.edu/stat/
Khan Academy (Statistics videos): http://www.khanacademy.org/#statistics

Note: SPSS is available in all computer labs and the library at TC. You can check out laptops with SPSS on them from the library.

The method of least squares is the automobile of modern statistical analysis: despite its limitations, occasional accidents, and incidental pollution, this method and its numerous variations, extensions, and related conveyances carry the bulk of statistical analyses, and are known and valued by nearly all.
--Stephen M. Stigler (1999, pg 320)

Course objectives:
(1) Identify, code, and recode continuous and discrete variables and choose the appropriate statistical procedure for their analysis;
(2) Describe relationships between predictor variables and a continuous outcome variable;
(3) Model, estimate, and interpret point estimates, standard errors, confidence intervals, and hypothesis tests for regression slopes;
(4) Delineate assumptions of linear statistical models used and evaluate particular analyses for meeting these assumptions;
(5) Recognize and explain similarities and differences between regression and ANOVA models;
(6) Conduct analyses to diagnose problems of multi-collinearity, influential points, model fit, etc.;
(7) Become familiar with a nationally representative dataset and issues associated with using data of this kind for analysis;
(8) Formulate questions, identify appropriate models, estimate models, and interpret the results for questions of real social science interest.

Grading:

50% Homework
There are 5 assignments, each worth 10% of the final grade. Assignments will include a mixture of handwork, SPSS model running, and output interpretation. You are encouraged to work together on your homework. However, students must submit separate homework assignments and must run their own analyses. NO LATE HOMEWORK WILL BE ACCEPTED.
20% Midterm
The midterm will cover Weeks 1 – 7. You will be allowed one 8 ½ x 11 one-sided “cheat sheet”.

30% Final Project
Everyone will use the same data set. You will choose a question, models, and hypotheses, and run the analysis using methods learned in class. You will write this up as a research publication, but with a shorter introduction section and longer methods section. More details to follow.

Schedule
(Reading is to be done BEFORE Thursday.)

Intro & Review
(1) 1/24: Syllabus, Introduction, Review
    Reading: Chapter 1

Regression
(2) 1/31: Simple regression
    Reading: 2.4 Simple model; 2.5 Estimation; 2.6 Hyp test; 2.7 Conf intervals; 2.9 Fit; 3.6 Prop of OLS
(3) 2/7: Multiple regression
    Reading: 3.2 Description; 3.4 Estimation; 3.5 Interpretation; 3.7 M Corr coef; 3.8 Inference; 3.9 Hyp tests; 9.5 Centering
(4) 2/14: Qualitative variables
    Reading: 5.1 Dummy; 5.3 Interactions; 5.4 Systems
(5) 2/21: Outliers, leverage, and influence
    Reading: 4.2 Reg Assumps; 4.3 Residuals; 4.4/5/6 Resid Graphs; 4.7 Test Lin/Norm; 4.8 Lev, Infl, Out; 4.9 Influence; 4.12 Added-var plot
(6) 2/28: Non-normality, Non-constant variance, Collinearity
    Reading: 6.4 Trans variance; 6.5 Detect heterosk; 6.7 WLS; 6.8 Log trans; 6.9 Power trans; 9.1 Collin; 9.4 Detect Collin
(7) 3/7: Review week (run by TA) (Prof at Conf)

Midterm
(8) 3/14: Exam (2 hours)

Spring break
(9) 3/21: Spring break

Working with real data
(10) 3/28: Non-linearity
    Reading: Midterm; 4.7 linearity; 6.2 Transform linear; 6.3 Non-linearity; 9.5 Centering
(11) 4/4: Secondary data analysis
    Reading: Handout on Moodle: Datasets; Codebooks; Cleaning/Importing
(12) 4/11: Choosing a model (causality, omitted variable bias, variable selection)
    Reading: Types of questions; Causality: Gelman & Hill (Ch 9); Variable selection: 11.5 Model fit stats
(13) 4/18: Non-continuous outcomes (focus: Logistic regression)
    Reading: 12.2 Modeling; 12.3 Logit model; 12.5 Diagnostics; 12.6/7 Fit

Experiments
(14) 4/25: ANOVA (1-way and 2-way)
    Reading: Fox 8.1 [1-way]; Fox 8.2 [2-way]
(15) 5/2: ANOVA (2-way), ANCOVA
    Reading: Fox 8.4 [ANCOVA]

Final
(16) 5/9: Review/meetings about final projects
(17) 5/14

Assignments/Exams/Projects
HW #1 {Review} [due 1/31]
HW #2 {Simple, Multiple Reg} [due 2/14]
HW #3 {Qual, Assumps} [due FRI 3/8]
Final Project information given out
HW #4 {Nonlinearity, big data} [due 4/18]
HW #5 {ANOVA, logistic regression} [due FRI 5/3]
Final Project [due 5/14, 9am]

Additional Statements

1. The College will make reasonable accommodations for persons with documented disabilities. Students are encouraged to contact the Office of Access and Services for Individuals with Disabilities for information about registration (166 Thorndike Hall). Services are available only to students who are registered and submit appropriate documentation. As your instructor, I am happy to discuss specific needs with you as well.

2. The grade of Incomplete will be assigned only when the course attendance requirement has been met but, for reasons satisfactory to the instructor, the granting of a final grade has been postponed because certain course assignments are outstanding. If the outstanding assignments are completed within one calendar year from the date of the close of term in which the grade of Incomplete was received and a final grade submitted, the final grade will be recorded on the permanent transcript, replacing the grade of Incomplete, with a transcript notation indicating the date that the grade of Incomplete was replaced by a final grade. If the outstanding work is not completed within one calendar year from the date of the close of term in which the grade of Incomplete was received, the grade will remain as a permanent Incomplete on the transcript. In such instances, if the course is a required course or part of an approved program of study, students will be required to re-enroll in the course including repayment of all tuition and fee charges for the new registration and satisfactorily complete all course requirements.
If the required course is not offered in subsequent terms, the student should speak with the faculty advisor or Program Coordinator about their options for fulfilling the degree requirement. Doctoral students with six or more credits with grades of Incomplete included on their program of study will not be allowed to sit for the certification exam.

3. Teachers College students are responsible for activating the Columbia University Network ID (UNI) and a free TC Gmail account. Because official communications from the College (e.g., information on graduation, announcements of closing due to severe storm, flu epidemic, transportation disruption, etc.) will be sent to the student’s TC Gmail account, students are responsible for either reading email there or using the mail forwarding option to forward mail from their account to an email address that they will monitor.

4. It is the policy of Teachers College to respect its members’ observance of their major religious holidays. Students should notify instructors at the beginning of the semester about their wishes to observe holidays on days when class sessions are scheduled. Where academic scheduling conflicts prove unavoidable, no student will be penalized for absence due to religious reasons, and alternative means will be sought for satisfying the academic requirements involved. If a suitable arrangement cannot be worked out between the student and the instructor, students and instructors should consult the appropriate department chair or director. If an additional appeal is needed, it may be taken to the Provost.