Pampel - the Sociology Department at CU

SOCY 7111 Data III Advanced Data Analysis
TTH 2:00-3:15 KTCH 33 Fall 2010
Instructor: Professor Fred Pampel
Office: 102A IBS #3 (1424 Broadway)
Email: fred.pampel@colorado.edu
Office Hours: 3:30-4:30 TTH and by appointment
Phone: 2-5620
Texts
Generalized Linear Models: An Applied Approach (2004) – John P. Hoffman
Multilevel Modeling (2004) – Douglas A. Luke
Missing Data (2002) – Paul D. Allison
Selections (see CU Learn or http://www.colorado.edu/ibs/pop/pampel)
Statistics with Stata: Updated for Version 10 (2009) – Lawrence Hamilton
Logistic Regression: A Primer (2000) – Fred C. Pampel
Regression Models for Categorical Dependent Variables Using Stata, 2nd Edition (2006)
– J. Scott Long and Jeremy Freese
UCLA Academic Technology Services: Resources to Help You Learn and Use Stata
(2010) – Xiao Chen, Phil Ender, Michael Mitchell, and Christine Wells, URL:
http://www.ats.ucla.edu/stat/stata
Introducing Multilevel Modeling (1998) – Ita Kreft and Jan de Leeuw
Multilevel Analysis: Techniques and Applications (2002) – Joop Hox
Quantitative Data Analysis: Doing Social Research to Test Ideas (2009) – Donald J.
Treiman
Stata Reference Manuals (2009) – Stata
Programs
Stata 11 – Stata Corp (versions 9 and 10 will work for most but not all assignments)
SPost: Post-Estimation Analysis with Stata – J. Scott Long and Jeremy Freese (free
download)
Objectives
Data III covers several widely used statistical methods that extend the basic regression
model to deal with 1) categorical and limited dependent variables, 2) multilevel data, 3)
missing data, and 4) complex survey designs. Although the methods also apply to
analysis of longitudinal data – a topic covered in another course – this course focuses on
cross-sectional data. Students should have had a previous course on multiple regression
and experience using Stata, the statistical analysis package to be used throughout the
course.
More informally, I intend this class as a practicum in quantitative social research that
emphasizes using statistics with real data. Perhaps the most important and most difficult
skill to teach is the insightful application of statistical techniques to real research
questions. The course thus emphasizes the match between theoretical reasoning,
substantive research problems, and statistical results. Toward that end, the assignments
require the application of the techniques covered in class to a topic and data set of your
choice.
Assignments
The sections below list the lecture topics and assigned readings. Mastering the material
requires more than reading, however. To apply the readings, the course requires
completion of four problem assignments and two papers (with details handed out during
the semester).
First, the four problem assignments involve the written interpretation of computer output.
Each assignment contributes 12.5 percent of the grade (50 percent in total). These
assignments involve the concrete application of material covered in a more abstract form
in the readings, will help you to explore your data, and are needed to prepare you to
complete the papers.
Second, the two papers are based on analysis of your data to address a research problem
of your own selection. They contribute 25 percent each to the final grade (50 percent in
total). Each should be about 10 pages, demonstrate your understanding of the statistical
techniques, and relate the results to a substantive issue and related theory. You can
choose the topic and data, but the end goal is to complete a professional research paper
that has the potential for later publication.
Stata will be used for the data analysis in the problems and papers, and it is available in
the Ketchum data labs (rooms 3 and 116). Having Stata on your own computer, however,
will make the assignments easier to complete. When working on the lab computers, it is
best to keep your data and command files on a flash drive and copy them to the particular
machine being used each time. Working on your own computer avoids this
inconvenience.
I assume some experience using Stata. Those unfamiliar with the program (and those
wanting to upgrade their understanding) should work through a Stata tutorial and consult
two manuals: Getting Started with Stata and the User’s Guide. The manuals are available
in the Ketchum 3 lab, and the web tutorials can be found at
http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/index.html,
http://data.princeton.edu/stata, and http://www.ats.ucla.edu/stat/stata/sk/modules_sk.htm.
I will give examples and guidance on commands for the statistical techniques we cover,
but you will need to know the basic commands to create and recode variables, select
cases, and obtain descriptive statistics.
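As a rough self-check on those basics, the short do-file sketch below creates a variable, recodes one, obtains descriptive statistics, and selects cases. It assumes the auto example data set that ships with Stata as a stand-in for your own data; the new variable names price_k and rep3 are made up for illustration and are not part of any assignment.

```stata
* Load an example data set installed with Stata (stand-in for your own data)
sysuse auto, clear

* Create a variable: price in thousands of dollars (price_k is a hypothetical name)
generate price_k = price / 1000

* Recode the 1-5 repair record into three categories, stored in a new variable
recode rep78 (1/2 = 1) (3 = 2) (4/5 = 3), generate(rep3)

* Descriptive statistics for continuous and categorical variables
summarize price_k mpg weight
tabulate rep3, missing

* Select cases: the same statistics for foreign cars only
summarize price_k mpg if foreign == 1
```

If each of these lines makes sense to you, you likely have the background assumed here; if not, the tutorials listed above cover the same commands.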
You will need to select a data set for analysis to be used in the problems and papers. I
have made several available to you on CU Learn: the 2008 General Social Survey, the
2006 National Health Interview Survey, the 2006 Monitoring the Future survey of teen
drug use, and the 2008 Eurobarometer Survey of European attitudes and climate views.
However, you may select any other data set of greater interest to you. ICPSR
(http://www.icpsr.umich.edu/icpsrweb/ICPSR) makes thousands of data sets available for
download, or you may have your own data to use. In any case, choose your topic, data,
and variables in the first week or two, in time to use them for the first problem.
Schedule
The schedule below lists the dates, topics, and assignments, and the section to follow lists
the specific readings to complete for each class period. At this stage, the schedule
represents a rough guide, and changes will occur throughout the semester. I would like to
cover all the topics listed but may need to spend more time on difficult topics or bring in
other materials. We may not proceed at exactly the pace initially planned.
Week  Date    Topic
1     Aug 24  Orientation
      Aug 26  Background: Logs and Exponents
2     Aug 31  Regression Review
      Sep 2   Regression Assumptions
3     Sep 7   Link Functions
      Sep 9   Maximum Likelihood
4     Sep 14  Logistic Regression
      Sep 16  SPost
5     Sep 21  Interpretation Problems
      Sep 23  Probit Models
6     Sep 28  Ordered Logit and Probit
      Sep 30  Multinomial Logistic Regression
7     Oct 5   Poisson Regression
      Oct 7   Negative Binomial Regression
8     Oct 12  Sample Selection
      Oct 14  Event History and Survival Models
9     Oct 19  Introduction to Multilevel Models
      Oct 21  Basic Multilevel Models
10    Oct 26  Building Multilevel Models
      Oct 28  Multilevel Analysis I
Assignments, Weeks 1-10 (in order): Select Data and Topic; Theory Outline; Problem 1 Due; Problem 2 Due; Paper 1 Due
11    Nov 2   Multilevel Analysis II
      Nov 4   Multilevel Model Assessment
12    Nov 9   Multilevel Extensions
      Nov 11  Standard Approaches to Missing Data
13    Nov 16  Multiple Imputation
      Nov 18  Imputation Phase
Assignment, Weeks 11-13: Problem 3 Due
Fall Break and Thanksgiving Holiday
14    Nov 30  Estimation Phase
      Dec 2   Complex Samples
15    Dec 7   Stata Survey Commands
      Dec 9   More on Stata Survey Commands
Assignment, Weeks 14-15: Problem 4 Due
Finals Week
16    Dec 14  4:30 (Tuesday): Paper 2 Due
Readings
Aug 26 Background
Freedman, David A. 1991. “Statistical Models and Shoe Leather.” Sociological
Methodology 21:291-313
Pampel, pp. 74-82
Aug 31 Regression Review
Hoffman, pp. 1-21
Sep 2 Regression Assumptions
Pampel, pp. 1-10
Hamilton, Chapter 7, “Regression Diagnostics,” pp. 209-228
Sep 7 Link Functions
Hoffman, pp. 22-33
Sep 9 Maximum Likelihood
Hoffman, pp. 33-44
Pampel, pp. 39-48
Sep 14 Logistic Models
Hoffman, pp. 45-54, 59-64
UCLA, “Logistic Regression with Stata.” URL:
http://www.ats.ucla.edu/stat/stata/webbooks/logistic/chapter1/statalog1.htm (up to
Tools to Assist)
UCLA, “Stata Annotated Output: Logistic Regression Analysis.” URL:
http://www.ats.ucla.edu/stat/stata/output/stata_logistic.htm
Sep 16 SPost: Additional Coefficients for Interpretations
Long and Freese, pp. 136-181
UCLA, “Logistic Regression with Stata.” URL:
http://www.ats.ucla.edu/stat/stata/webbooks/logistic/chapter1/statalog1.htm (start
with Tools to Assist)
Sep 21 Interpretation Problems
Mood, Carina. 2010. “Logistic Regression: Why We Cannot Do What We Think We Can
Do, and What We Can Do About It.” European Sociological Review 26:67-82
Sep 23 Probit Models
Hoffman, pp. 54-59
Pampel, pp. 54-68
Sep 28 Ordered Logit and Probit
Hoffman, pp. 65-82
UCLA, “Ordinal Logistic Regression.” URL:
http://www.ats.ucla.edu/stat/stata/dae/ologit.htm
Sep 30 Multinomial Logistic Regression
Hoffman, pp. 83-100
UCLA, “Stata Data Analysis Examples: Multinomial Logistic Regression.” URL:
http://www.ats.ucla.edu/stat/stata/dae/mlogit.htm
Oct 5 Poisson Regression
Hoffman, pp. 101-112
Long and Freese, pp. 349-370
UCLA, “Stata Data Analysis Examples: Poisson Regression.” URL:
http://www.ats.ucla.edu/stat/stata/dae/poissonreg.htm
Oct 7 Negative Binomial Regression
Hoffman, pp. 112-120
Long and Freese, pp. 372-381
UCLA, “Stata Data Analysis Examples: Negative Binomial Regression.” URL:
http://www.ats.ucla.edu/stat/stata/dae/nbreg.htm
UCLA, “Stata Annotated Output: Negative Binomial Regression.” URL:
http://www.ats.ucla.edu/stat/stata/output/stata_nbreg_output.htm
Oct 12 Sample Selection
Berk, Richard A. 1983. “An Introduction to Sample Selection Bias in Sociological Data.”
American Sociological Review 48:386-398
Stata 11 Reference Manual, “Heckman Selection Model,” pp. 644-653
Oct 14 Event History and Survival Models
Hoffman, pp. 121-148
Oct 19 Introduction to Multilevel Models
Luke, pp. 1-9
Kreft and de Leeuw, pp. 1-14
Oct 21 Basic Multilevel Models
Luke, pp. 9-23
Kreft and de Leeuw, pp. 35-44
Oct 26 Building Multilevel Models
Luke, pp. 23-33
Kreft and de Leeuw, pp. 44-56
Oct 28 Multilevel Analysis I
Hox, Chapter 4, “Some Important Methodological and Statistical Issues,” pp. 49-58
Hamilton, Chapter 15, “Multilevel and Mixed Effects Modeling,” pp. 413-421
Luke, pp. 48-53
Nov 2 Multilevel Analysis II
Hox, Chapter 4, “Some Important Methodological and Statistical Issues,” pp. 58-63
Hamilton, Chapter 15, “Multilevel and Mixed Effects Modeling,” pp. 421-434
Nov 4 Multilevel Model Assessment
Luke, pp. 33-48
Hox, Chapter 4, “Some Important Methodological and Statistical Issues,” pp. 63-66
Nov 9 Multilevel Extensions
Luke, pp. 53-72
Hamilton, Chapter 15, “Multilevel and Mixed Effects Modeling,” pp. 434-438
Nov 11 Standard Approaches to Missing Data
Allison, pp. 1-12
Treiman, Chapter 8 “Multiple Imputation of Missing Data,” pp. 181-194
Nov 16 Multiple Imputation
Allison, pp. 27-50
Nov 18 Imputation Phase
Allison, pp. 50-55, 69-73
Stata 11 Reference Manual, “mi,” pp. 1-13
Fall Break and Thanksgiving Holiday
Nov 30 Estimation Phase
Stata 11 Reference Manual, “mi,” pp. 14-23, 105-118
Dec 2 Complex Samples
Treiman, Chapter 9, “Sample Design and Survey Estimation,” pp. 195-215
Dec 7 Stata Survey Commands
Treiman, Chapter 9, “Sample Design and Survey Estimation,” pp. 215-225
Hamilton, Chapter 14, “Survey Data Analysis,” pp. 391-399
Dec 9 More on Stata Survey Commands
Hamilton, Chapter 14, “Survey Data Analysis,” pp. 399-408
Stata Reference Manual, “Introduction to Survey Commands,” pp. 3-14, 16-17
Special Issues
Disability. If you qualify for accommodations because of a disability, please submit to
me a letter from Disability Services in a timely manner so that your needs may be
addressed. Disability Services determines accommodations based on documented
disabilities. Contact: 303-492-8671, Willard 322, and
http://www.Colorado.EDU/disabilityservices.
Religious Obligations. Campus policy regarding religious observances requires that
faculty make every effort to reasonably and fairly deal with all students who, because of
religious obligations, have conflicts with scheduled exams, assignments or required
attendance. Please inform me ahead of time of any obligation that conflicts with the
assignments or exams so that accommodations can be made. See full details at
http://www.colorado.edu/policies/fac_relig.html.
Appropriate Behavior. Students and faculty each have responsibility for maintaining an
appropriate learning environment. Students who fail to adhere to such behavioral
standards may be subject to discipline. Faculty have the professional responsibility to
treat all students with understanding, dignity and respect, to guide classroom discussion
and to set reasonable limits on the manner in which they and their students express
opinions. Professional courtesy and sensitivity are especially important with respect to
individuals and topics dealing with differences of race, culture, religion, politics, sexual
orientation, gender variance, and nationalities. Class rosters are provided to the instructor
with the student's legal name. I will gladly honor your request to address you by an
alternate name or gender pronoun. Please advise me of this preference early in the
semester so that I may make appropriate changes to my records. See policies at
http://www.colorado.edu/policies/classbehavior.html and at
http://www.colorado.edu/studentaffairs/judicialaffairs/code.html#student_code.
Discrimination and Harassment. The University of Colorado at Boulder policy on
Discrimination and Harassment (http://www.colorado.edu/policies/discrimination.html),
the University of Colorado policy on Sexual Harassment, and the University of Colorado
policy on Amorous Relationships apply to all students, staff and faculty. Any student,
staff or faculty member who believes s/he has been the subject of discrimination or
harassment based upon race, color, national origin, sex, age, disability, religion, sexual
orientation, or veteran status should contact the Office of Discrimination and Harassment
(ODH) at 303-492-2127 or the Office of Judicial Affairs at 303-492-5550. Information
about the ODH and the campus resources available to assist individuals regarding
discrimination or harassment can be obtained at http://www.colorado.edu/odh.
Honor Code. All students of the University of Colorado at Boulder are responsible for
knowing and adhering to the academic integrity policy of this institution. Violations of
this policy may include: cheating, plagiarism, aid of academic dishonesty, fabrication,
lying, bribery, and threatening behavior. All incidents of academic misconduct shall be
reported to the Honor Code Council (honor@colorado.edu; 303-725-2273). Students
who are found to be in violation of the academic integrity policy will be subject to both
academic sanctions from the faculty member and non-academic sanctions (including but
not limited to university probation, suspension, or expulsion). Other information on the
Honor Code can be found at http://www.colorado.edu/policies/honor.html and
http://www.colorado.edu/academics/honorcode.