HUDM 5122-1: Applied Regression Analysis
Spring 2013
Thursday: 11:00am – 2:00pm
Location: Thompson 229
Professor: L. Elizabeth Tipton
Office Hours: Tuesday 5-6, Thursday 2-4
Email: tipton@tc.columbia.edu
Office: Grace Dodge 453G
TA1: Marcus Waldman
Email: waldman@tc.columbia.edu
Office hours: Wednesdays, 12:30pm
TA2: Zack Fisher
Email: zff2000@tc.columbia.edu
Office hours: Saturday, 2pm (Library)
Prerequisite: HUDM 4122 or permission of instructor
Course Description:
Least squares estimation theory. Traditional simple and multiple regression models and
polynomial regression models, with grouping variables including one-way ANOVA, two-way ANOVA, and analysis of covariance.
Required textbook:
Chatterjee, S. & Hadi, A.S. (2006) Regression analysis by example. Wiley.
Book website: http://www.ilr.cornell.edu/~hadi/RABE4/
Highly suggested:
Allison, P. (1999) Multiple regression: A primer. (QA278.2.A435 1999 ON RESERVE)
Suggested textbooks (on reserve at the library):
Angrist, J.D. and Pischke, J.S. (2009) Mostly harmless econometrics: An empiricist’s
companion. Princeton University Press. (HB139.A54 2009 ON RESERVE)
Berk, R.A. (2004) Regression analysis: A constructive critique. Advanced
Quantitative Techniques in the Social Sciences Series. SAGE Publications.
(QA278.2.B46 2004 ON RESERVE)
Fox, J. (2008) Applied Regression Analysis and Generalized Linear Models. Second
Edition. SAGE. (HA31.3.F69 2008 ON RESERVE)
Kutner, M.H., Nachtsheim, C.J., and Neter, J. (2004) Applied linear regression models.
McGraw Hill. (QA278.2.A65 2004 ON RESERVE)
Murnane, R.J. and Willett, J.B. (2011) Methods matter: Improving causal inference in
educational and social science research. Oxford University Press.
(LB1028.M86 2010 ON RESERVE)
Software:
We will use SPSS in the laboratory sessions. Use of other software is permitted; please
speak to me if you would like to do this.
Field, A. (2009) Discovering statistics using SPSS. Third edition. SAGE. (HA32.F54
2009 ON RESERVE)
Websites:
UCLA statistical computing: http://www.ats.ucla.edu/stat/
Khan Academy (Statistics videos): http://www.khanacademy.org/#statistics
Note: SPSS is available in all computer labs and the library at TC. You can check
out laptops with SPSS on them from the library.
The method of least squares is the automobile of modern statistical analysis: despite its
limitations, occasional accidents, and incidental pollution, this method and its
numerous variations, extensions, and related conveyances carry the bulk of statistical
analyses, and are known and valued by nearly all.
--Stephen M. Stigler (1999, pg 320)
Course objectives:
(1) Identify, code, and recode continuous and discrete variables and choose the
appropriate statistical procedure for their analysis;
(2) Describe relationships between predictor variables and a continuous outcome
variable;
(3) Model, estimate, and interpret point estimates, standard errors, confidence
intervals, and hypothesis tests for regression slopes;
(4) Delineate assumptions of linear statistical models used and evaluate particular
analyses for meeting these assumptions;
(5) Recognize and explain similarities and differences between regression and
ANOVA models;
(6) Conduct analyses to diagnose problems of multi-collinearity, influential points,
model fit, etc.;
(7) Become familiar with a nationally representative dataset and issues associated
with using data of this kind for analysis;
(8) Formulate questions, identify appropriate models, estimate models, and interpret
the results for questions of real social science interest.
Grading:
50% Homework
- There are 5 assignments, each worth 10% of the final grade.
- Assignments will include a mixture of handwork, SPSS model running, and
output interpretation.
- You are encouraged to work together on your homework. However, students must
submit separate homework assignments and must run their own analyses.
- NO LATE HOMEWORK WILL BE ACCEPTED.
20% Midterm
- The midterm will cover Weeks 1 – 7.
- You will be allowed one 8 ½ x 11 one-sided “cheat sheet”.
30% Final Project
- Everyone will use the same data set.
- You will choose a question, models, and hypotheses, and run the analysis using
methods learned in class.
- You will write this up as a research publication, but with a shorter introduction
section and longer methods section.
- More details to follow.
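As an illustration of how the weights above combine, here is a minimal sketch in Python. The function name, the 0-100 scale for every component, and the example scores are assumptions for illustration only and are not part of the course policy. Scoring an unsubmitted assignment as 0 is likewise an assumption about how the no-late-homework rule would be applied.

```python
# Weights come from the Grading section; everything else is hypothetical.
HOMEWORK_WEIGHT_EACH = 0.10   # 5 assignments x 10% = 50%
MIDTERM_WEIGHT = 0.20
FINAL_PROJECT_WEIGHT = 0.30


def course_grade(homework_scores, midterm_score, final_project_score):
    """Return the weighted course grade on a 0-100 scale.

    Assumes each component is scored out of 100 (not stated in the syllabus).
    """
    if len(homework_scores) != 5:
        raise ValueError("expected exactly 5 homework scores")
    homework_part = sum(HOMEWORK_WEIGHT_EACH * s for s in homework_scores)
    return (homework_part
            + MIDTERM_WEIGHT * midterm_score
            + FINAL_PROJECT_WEIGHT * final_project_score)


if __name__ == "__main__":
    # Made-up scores; the fourth homework is entered as 0 (assumed treatment
    # of an assignment that was not handed in on time).
    grade = course_grade([90, 85, 100, 0, 95],
                         midterm_score=80,
                         final_project_score=88)
    print(f"Weighted course grade: {grade:.1f}")
```

Because each homework carries 10% of the final grade, a single missed assignment lowers the course grade by up to 10 points on this scale.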
Schedule

Unit: Intro & Review
(1) 1/24
Class Lecture: Syllabus, Introduction, Review
Reading (BEFORE Thursday): Chapter 1

Unit: Regression
(2) 1/31
Class Lecture: Simple regression
Reading (BEFORE Thursday): 2.4 Simple model; 2.5 Estimation; 2.6 Hyp test; 2.7 Conf intervals; 2.9 Fit; 3.6 Prop of OLS
(3) 2/7
Class Lecture: Multiple regression
Reading (BEFORE Thursday): 3.2 Description; 3.4 Estimation; 3.5 Interpretation; 3.7 M Corr coef; 3.8 Inference; 3.9 Hyp tests; 9.5 Centering
(4) 2/14
Class Lecture: Qualitative variables
Reading (BEFORE Thursday): 5.1 Dummy; 5.3 Interactions; 5.4 Systems
(5) 2/21
Class Lecture: Outliers, leverage, and influence
Reading (BEFORE Thursday): 4.2 Reg Assumps; 4.3 Residuals; 4.4/5/6 Resid Graphs; 4.7 Test Lin/Norm; 4.8 Lev, Infl, Out; 4.9 Influence; 4.12 Added-var plot
(6) 2/28
Class Lecture: Non-normality, Non-constant variance, Collinearity
Reading (BEFORE Thursday): 6.4 Trans variance; 6.5 Detect heterosk; 6.7 WLS; 6.8 Log trans; 6.9 Power trans; 9.1 Collin; 9.4 Detect Collin
(7) 3/7
Class Lecture: Review week (run by TA) (Prof at Conf)

Unit: Midterm
(8) 3/14
Class Lecture: Exam (2 hours)

Unit: Spring break
(9) 3/21
Class Lecture: Spring break

Unit: Working with real data
(10) 3/28
Class Lecture: Non-linearity
Reading (BEFORE Thursday): 4.7 linearity; 6.2 Transform linear; 6.3 Non-linearity; 9.5 Centering
(11) 4/4
Class Lecture: Secondary data analysis
Reading (BEFORE Thursday): Handout on Moodle: -Datasets -Codebooks -Cleaning/Importing
(12) 4/11
Class Lecture: Choosing a model (causality, omitted variable bias, variable selection)
Reading (BEFORE Thursday): -Types of questions -Causality: Gelman&Hill (Ch 9) -Variable selection: 11.5 Model fit stats
(13) 4/18
Class Lecture: Non-continuous outcomes (focus: Logistic regression)
Reading (BEFORE Thursday): 12.2 Modeling; 12.3 Logit model; 12.5 Diagnostics; 12.6/7 Fit

Unit: Experiments
(14) 4/25
Class Lecture: ANOVA (1-way and 2-way)
Reading (BEFORE Thursday): Fox 8.1 [1-way]; Fox 8.2 [2-way]
(15) 5/2
Class Lecture: ANOVA (2-way), ANCOVA
Reading (BEFORE Thursday): Fox 8.4 [ANCOVA]
(16) 5/9
Class Lecture: Review/meetings about final projects

Unit: Final
(17) 5/14

Assignments/Exams/Projects
HW #1 {Review} [due 1/31]
HW #2 {Simple, Multiple Reg} [due 2/14]
HW #3 {Qual, Assumps} [due FRI 3/8]
Midterm [(8) 3/14]
Final Project information given out
HW #4 {Nonlinearity, big data} [due 4/18]
HW #5 {ANOVA, logistic regression} [due FRI 5/3]
Final Project [due 5/14, 9am]
Additional Statements
1. The College will make reasonable accommodations for persons with
documented disabilities. Students are encouraged to contact the Office of Access and
Services for Individuals with Disabilities for information about registration (166 Thorndike
Hall). Services are available only to students who are registered and submit appropriate
documentation. As your instructor, I am happy to discuss specific needs with you as well.
2. The grade of Incomplete will be assigned only when the course attendance
requirement has been met but, for reasons satisfactory to the instructor, the granting of a
final grade has been postponed because certain course assignments are outstanding. If the
outstanding assignments are completed within one calendar year from the date of the close
of term in which the grade of Incomplete was received and a final grade submitted, the final
grade will be recorded on the permanent transcript, replacing the grade of Incomplete, with
a transcript notation indicating the date that the grade of Incomplete was replaced by a final
grade. If the outstanding work is not completed within one calendar year from the date of
the close of term in which the grade of Incomplete was received, the grade will remain as a
permanent Incomplete on the transcript. In such instances, if the course is a required course
or part of an approved program of study, students will be required to re-enroll in the course
including repayment of all tuition and fee charges for the new registration and satisfactorily
complete all course requirements. If the required course is not offered in subsequent terms,
the student should speak with the faculty advisor or Program Coordinator about their
options for fulfilling the degree requirement. Doctoral students with six or more credits with
grades of Incomplete included on their program of study will not be allowed to sit for the
certification exam.
3. Teachers College students have the responsibility for activating the Columbia
University Network ID (UNI) and a free TC Gmail account. As official communications
from the College (e.g., information on graduation, announcements of closing due to severe
storm, flu epidemic, transportation disruption, etc.) will be sent to the student’s TC Gmail
account, students are responsible for either reading email there or using the mail
forwarding option to forward mail from their account to an email address which they will
monitor.
4. It is the policy of Teachers College to respect its members’ observance of their
major religious holidays. Students should notify instructors at the beginning of the
semester about their wishes to observe holidays on days when class sessions are scheduled.
Where academic scheduling conflicts prove unavoidable, no student will be penalized for
absence due to religious reasons, and alternative means will be sought for satisfying the
academic requirements involved. If a suitable arrangement cannot be worked out between
the student and the instructor, students and instructors should consult the appropriate
department chair or director. If an additional appeal is needed, it may be taken to the
Provost.