Syllabus - The University of North Carolina at Chapel Hill

SOCI 718: LONGITUDINAL AND MULTILEVEL DATA ANALYSIS
Professor: Guang Guo
216 Hamilton Hall
Phone: 919-962-1246 (o)
Email: guang_guo@unc.edu
Office hours: 2:00 p.m. – 3:00 p.m., Wednesdays or by appointment
Lectures: 10:00 a.m. - 12:50 p.m. Wednesdays
TA: Seulki Choi
Office: HM 273
Email: chois@email.unc.edu
Office Hours: 1:00 p.m. – 2:00 p.m. Tuesday and 1:00 p.m. – 2:00 p.m. Thursday
DESCRIPTION
This course introduces analytical tools for three types of data: longitudinal event history data,
longitudinal and multilevel linear data, and longitudinal and multilevel non-linear data. In the
event history analysis, we investigate the timing and conditions of the occurrence of an event. A
classic example is the study of the timing and conditions of human mortality. Longitudinal linear
data consist of repeated measures of an outcome over time on a number of units (e.g., individuals,
counties, countries, etc.). Pooled cross-sectional and time series data represent one main source
of longitudinal linear data for social scientists. Multilevel data are also referred to as hierarchical
data, in which lower-level units are clustered into higher-level units. School/student data are one
of the most well-known multilevel data sources, in which individual students are clustered into a
number of schools. This course provides you with (1) an introduction to statistical techniques for
the three types of data and (2) experience in running models on such data. We start with the
event history methods and then proceed with the longitudinal and multilevel data. As you will
see, much of the statistical methodology for the longitudinal and multilevel data is closely
related.
PREREQUISITES
Students should have statistical and statistical software background at least equivalent to that
provided by Soci 709 (linear regression) and Soci 711 (categorical data analysis). While we are
dedicated to making the materials accessible to everyone in the class, you will need to devote a
fair amount of time in order to achieve a reasonably good understanding of the subjects. Bear in
mind that Soci 718 is an optional statistical methodology course.
GRADING
45% Assignments (individual projects)
10% Class Participation
10% Group methodological critiques of application articles (group projects)
35% Final Paper (12/3) (individual projects)
PRESENTATION
The class will be divided into a number of groups, with each group responsible for one article.
Each group presents its critique and organizes and leads a class discussion on the article. Each
group is divided into two subgroups, with one presenting the ideas in the article and the other
providing a critical assessment of it. A guideline for the critiques will be distributed.
COURSE OUTLINE (approximate)
Presentations:
Group 1: Week 5
Group 2: Week 7
Group 3: Week 9
Group 4: Week 11
Group 5: Week 13
Group 6: Week 15
Event History Analysis
Week 1.
The outcome variable in event-history analysis; a few examples of
event-history analysis in social science research. Cohort life tables.
Required Readings- Class notes; Allison 1995: Chapter 3 (SAS
oriented); Allison 1984: Introduction. Optional Readings (some of
Kalbfleisch and Prentice can be difficult): Kalbfleisch and
Prentice: Chapter 1.
Week 2.
Cohort life tables (continued): concepts, computation, and
computer programming. Period life tables (brief mention).
Basic statistical concepts in event history analysis: density, CDF,
survivor function, hazard function, censoring and censored data,
and proportionality. The Kaplan-Meier estimate of the survivor function.
Failure time or survival time distributions. Required Readings–
Class notes; Allison 1995: Chapter 2. Optional Readings:
Kalbfleisch and Prentice: Chapter 2.
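As a rough illustration of the Kaplan-Meier estimate listed above, the following is a minimal sketch in plain Python rather than the SAS used in the Allison readings. The function name `kaplan_meier` and the five-subject data set are hypothetical and are not taken from the class notes. At each distinct event time the running survivor estimate is multiplied by one minus the share of the risk set that experiences the event.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time.

    durations: observed times (event time or censoring time)
    events: 1 if the event was observed, 0 if right-censored
    """
    # Number of observed events at each distinct event time.
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Risk set: everyone whose observed time is at least t.
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: five subjects, two of them right-censored.
durations = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
km = kaplan_meier(durations, events)
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is how the estimator uses incomplete observations.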
Week 3.
Continuous-time hazard model; review of maximum likelihood
estimation; maximum likelihood estimation of survival models;
exponential hazard model without censoring and covariates;
exponential hazard model with censoring, but without covariates;
proportional hazard model: proportionality, time varying
covariates, and time-varying effects; starting piece-wise
exponential hazard model: modeling baseline hazard. Required
Readings– Class notes; Allison 1995: Chapters 2 and 4. Optional
Readings: Kalbfleisch and Prentice: Chapter 2.
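The exponential hazard model with right censoring and no covariates has a closed-form maximum likelihood estimate, which makes it a convenient first example. The sketch below is only an illustration under that assumption; the function names and data are hypothetical. Each subject contributes d*log(rate) - rate*t to the log-likelihood, where d is 1 for an observed event and 0 for a censored case, so the estimate is the number of events divided by the total exposure time.

```python
import math


def exponential_loglik(rate, durations, events):
    """Log-likelihood of a constant hazard `rate` under right censoring."""
    return sum(e * math.log(rate) - rate * t
               for t, e in zip(durations, events))


def exponential_mle(durations, events):
    """Closed-form MLE: observed events divided by total time at risk."""
    return sum(events) / sum(durations)


# Hypothetical data: five subjects, two of them right-censored.
durations = [2.0, 3.0, 3.0, 5.0, 8.0]
events = [1, 1, 0, 1, 0]
rate_hat = exponential_mle(durations, events)  # 3 events / 21 time units
```

Evaluating `exponential_loglik` at values slightly above and below `rate_hat` gives a quick check that the closed form sits at the maximum.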
Week 4
Piece-wise exponential model continued; data preparation; baseline management;
interpretation of the coefficients in terms of the hazards ratios; time-varying
effects or interactions of the effects with time. Discrete time event history
analysis.
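The data preparation step for discrete time event history analysis is often described as building a person-period file: one record per subject per period at risk, with a binary outcome that equals 1 only in the period where the event occurs. The sketch below shows that expansion under this common setup; the function name, field names, and the two subjects are hypothetical, and the course's own data layout may differ.

```python
def to_person_period(subject_id, duration, event):
    """Expand one subject into one row per discrete period at risk.

    duration: number of periods observed (a positive integer)
    event: 1 if the event occurred in the last period, 0 if censored
    """
    rows = []
    for period in range(1, duration + 1):
        # The outcome is 1 only in the final period, and only for events.
        y = 1 if (event == 1 and period == duration) else 0
        rows.append({"id": subject_id, "period": period, "y": y})
    return rows


# Hypothetical subjects: "a" has the event in period 3,
# "b" is censored after period 2.
rows = to_person_period("a", 3, 1) + to_person_period("b", 2, 0)
```

A binary regression of `y` on period indicators and covariates, fitted to the expanded rows, is then one way to estimate a discrete-time hazard model.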
Week 5
First group presentation on Sorensen’s 2004 paper on race
composition and employment (Presentation should use about 40-45
minutes; the intro is long and we should spend just enough time to
understand it). Wrapping up discrete time models. Starting on
Cox’ models.
Week 6
The Cox’ proportional hazard model. Accelerated failure time models: the
differences between a PH model and an AFT model; interpretation; estimation;
and applications. Event history analysis for censored and truncated data. Left
truncation. Required Readings: Class notes; Allison 1995: Chapters 5-6; Guo
1993 Sociological Methodology paper. Optional Readings: Kalbfleisch and
Prentice.
Week 7
Second group presentation on Stuart and Ding’s 2006 AJS paper
using Cox’ model. Unobserved heterogeneity; clustered event-history
data and multi-level analysis of event history data;
competing risks. Readings: class notes; Allison 1995; Vaupel et
al.; Trussell and Richards. Optional readings: Guo 1993
Demography; Guo and Rodríguez 1992 JASA.
Readings for Event History Analysis:
Allison, Paul D. 1982. "Discrete-Time Methods for the Analysis of Event Histories." Pp. 61-98
in Sociological Methodology 1982, edited by S. Leinhardt. San Francisco: Jossey-Bass.
Allison, Paul D. 1984. Event History Analysis, Regression for Longitudinal Event Data.
Beverly Hills, CA: SAGE Publication, Inc.
Allison, Paul D. 1995. Survival Analysis Using the SAS System: A Practical Guide. NC: SAS
Institute Inc.
Guang Guo. 1993. "Event-History Analysis for Left-Truncated Data." In Sociological
Methodology 1993 edited by Peter V. Marsden Pp. 217-243. Washington, DC: American
Sociological Association.
Guo, Guang and Germán Rodríguez. 1992. "Estimating a Multivariate Proportional Hazards
Model for Clustered Data using the EM Algorithm, with an Application to Child Survival
in Guatemala." Journal of the American Statistical Association 87: 969-976.
Heckman, James and B. Singer. 1986. "Econometric Analysis of Longitudinal Data." Pp. 1689-1763
in Handbook of Econometrics, Vol. 3, edited by Z. Griliches and M.D. Intriligator.
Amsterdam: North-Holland.
Kalbfleisch, J. D., and R.L. Prentice. 1980. The Statistical Analysis of Failure Time Data. New
York: Wiley.
Judith D. Singer and John B. Willett. 2003. Applied Longitudinal Data Analysis: Modeling
Change and Event Occurrence. New York: Oxford University Press.
Trussell, J., and Toni Richards. 1985. "Correcting for Unmeasured Heterogeneity in Hazard
Models Using the Heckman-Singer Procedure", in Sociological Methodology 1985,
edited by N. Tuma. San Francisco: Jossey-Bass.
Vaupel, James W., Kenneth G. Manton, and Eric Stallard. 1979. "The Impact of Heterogeneity
in Individual Frailty on the Dynamics of Mortality." Demography 16:439-54.
Analysis of Linear and Non-Linear Longitudinal and Multilevel Data
Week 8
Random-effects linear models; the mixed models. Readings– Class notes; Diggle
et al. Chapters 1 and 3.
Week 9
Group 3 presentation of Keister’s 1998 AJS paper using random effects models. The
mixed models. Readings: Class notes.
Week 10
Multilevel models for linear data. Readings–Bryk and Raudenbush Chapters 1, 2,
4, and 5; Goldstein Chapters 2 and 3 (B/R and G’s chapters could be read for
studying random effects models and these chapters may be more accessible to
many than those by Greene and Diggle); class notes.
Week 11
Group 4 presentation of Gamoran’s paper on multilevel models. Multilevel
models. Random coefficient models. Readings: class notes.
Week 12
Linear longitudinal data. Exploring longitudinal data. Growth curve models.
Readings: class notes; Guo and Hipp paper.
Week 13
Group 5 presentation of Johnson’s paper using SEM growth curve models. Growth
curve models framed as random effects or SEM models. Readings: Guo and Hipp
paper and class notes.
Week 14
Difference models (fixed effects). Huber-White procedure. Readings: Liker et al.;
class notes.
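The logic of a difference model can be seen in a small two-wave example: if the outcome is y = a_i + b*x plus noise, subtracting wave 1 from wave 2 removes the unit-specific constant a_i and leaves the slope b identifiable from the changes alone. The sketch below assumes two waves, no noise, and no common time trend (so the differenced regression has no intercept); all names and numbers are hypothetical.

```python
def first_difference_slope(x1, x2, y1, y2):
    """Slope from regressing the change in y on the change in x.

    Uses regression through the origin, which assumes there is no
    common time trend between the two waves.
    """
    dx = [b - a for a, b in zip(x1, x2)]
    dy = [b - a for a, b in zip(y1, y2)]
    return sum(u * v for u, v in zip(dx, dy)) / sum(u * u for u in dx)


# Hypothetical panel: three units with very different fixed effects.
alpha = [10.0, -5.0, 3.0]   # unit-specific constants (unobserved in practice)
beta = 2.0                  # true slope used to generate the data
x1 = [1.0, 2.0, 3.0]        # covariate at wave 1
x2 = [2.0, 4.0, 3.5]        # covariate at wave 2
y1 = [a + beta * x for a, x in zip(alpha, x1)]
y2 = [a + beta * x for a, x in zip(alpha, x2)]

slope = first_difference_slope(x1, x2, y1, y2)
```

Because `alpha` cancels in the differences, the recovered slope does not depend on how the unit-specific constants relate to the covariate.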
Week 15
Group 6 presentation of England’s paper using fixed effects models. Readings:
Liker et al.; class notes.
Week 16
Multilevel models for binary data. GEE models. Readings: Guo and Zhao’s
Annual Review of Sociology paper.
Readings for Linear and Nonlinear Multilevel and Longitudinal Models:
Dielman, Terry E. 1983. "Pooled Cross-Sectional and Time Series Data: A Survey of Current
Statistical Methodology." The American Statistician 37:111-22.
Greene, William H. 1994. "Models that Use Both Cross-Section and Time-Series Data" Chapter
16 in Econometric Analysis. 2nd Edition. NY:Macmillan Publishing.
Guang Guo and John Hipp. “The Analysis of Linear Longitudinal Data: Random-effects Models
and Structural Equations Models.” In New Handbook on Data Analysis, edited by
Melissa Hardy. London, England: Sage. Forthcoming.
Hsiao, Cheng. 1986. Analysis of Panel Data. NY:Cambridge University Press.
Kmenta, Jan. 1986. "Section 12.2 Pooling of Cross-Section and Time-Series Data." Pp. 616-35 in
J. Kmenta. Elements of Econometrics. NY:Macmillan Publishing.
Liker, Jeffrey K., Sue Augustyniak, and Greg J. Duncan. 1985. "Panel Data and Models of
Change: A Comparison of First Difference and Conventional Two-Wave Models."
Social Science Research 14:80-101.
Bryk, Anthony S. and Stephen W. Raudenbush. Hierarchical Linear Models. New York:
SAGE.
Goldstein, Harvey. 1995. Multilevel Statistical Models. Second Edition. London: Arnold.
Diggle, Peter J., Kung-Yee Liang, and Scott L. Zeger. 1994. Analysis of Longitudinal Data.
London: Oxford University Press.
Mason, W.M., Wong G.M., and Entwisle, B. 1983. “Contextual Analysis through the multilevel
linear model,” in S. Leinhardt (Ed.), Sociological Methodology (pp.72-103). San
Francisco: Jossey-Bass.
Application Presentations:
Group 1:
Sorensen. 2004. Employment and Race Composition. American Journal of Sociology. (Piece-wise
exponential model; the intro is long; please do not spend too much time on the intro)
Group 2:
Dudros’ 1997 Social Forces paper (Discrete time models)
Group 3:
Keister, Lisa A. 1998. “Engineering Growth: Business Group Structure and Firm Performance in
China’s Transition Economy.” American Journal of Sociology 104: 404-440.
Group 4:
Gamoran, Adam. 1992. “The Variable Effects of High School Tracking.” American
Sociological Review 57: 812-828. (HLM model)
Group 5:
Johnson, Monica Kirkpatrick 2002. “Social Origins, Adolescent Experiences, and Work Value
Trajectories during the Transition to Adulthood.” Social Forces 80: 1307-1341.
Group 6:
England, Paula, George Farkas, Barbara Stanek Kilbourne, and Thomas Dou. 1988. “Explaining
Occupational Sex Segregation and Wages; Findings from a Model with Fixed Effects.”
American Sociological Review 53: 544-558. (Fixed effect model)
Other Papers on the Topics:
Arnold, Bruce, and John Hagan. 1992. “Careers of Misconduct: The Structure of Prosecuted
Professional Deviance among Lawyers.” American Sociological Review 57: 771-780.
(Cox’ partial likelihood model)
Barron David N., Elizabeth West, and Michael T. Hannan. 1994. “A Time to Grow and a Time
to Die: Growth and Mortality of Credit Unions in New York City, 1914-1990.”
American Journal of Sociology 100: 381-421
Carroll, Glenn R., and Karl Ulrich Mayer. 1986. "Job-Shift Patterns in the Federal Republic of
Germany: The Effects of Social Class, Industrial Sector and Organizational Size."
American Sociological Review 51:323-341. (Cox’ model)
Freeman, John and Glenn R. Carroll. 1983. "The Liability of Newness: Age Dependence in
Organizational Death Rates." American Sociological Review 48:692-710. (Exploratory
analysis and Gompertz model)
Nielsen, Francois. 1994. “Income Inequality and Industrial Development: the Dualism
Revisited.” American Sociological Review 59: 654-677. (Random effect model)
Michael, Robert T. and Nancy Brandon Tuma. 1985. "Entry into Marriage and Parenthood by
Young Men and Women: The Influence of Family Background." Demography 22:515-544.
(Cox’s model)
Sandefur, Gary. 1985. "Variations in Interstate Migration of Men across the Early Stages of the
Life Cycle." Demography 22:353-366. (Gompertz model, a parametric model)
Vanderhoeft, C. 1982. Accelerated Failure Time Models: An Application to Current Status
Breast-feeding Data from Pakistan. Genus 38: 135-157. (Current status event history
analysis)
Guang Guo. 1993. "Use of Sibling Data to Estimate Family Mortality Effects in Guatemala."
Demography 30(1): 15-32. (Random effect event history model for clustered data)
Guang Guo and Laurence Grummer-Strawn. 1993. "Child Mortality among Twins in Developing
Countries". Population Studies 47(3): 1-16. (Random effect event history model for
clustered data)
Harris, Kathleen Mullan 1993. “Work and Welfare among Single Mothers in Poverty.”
American Journal of Sociology 99: 317-352. (Discrete time model)
Guang Guo. 1998. “The Timing of the Influences of Cumulative Poverty on Children's
Cognitive Outcomes in Childhood and Early Adolescence.” Social Forces. (Random
effect model)
Guang Guo and Leah Vanwey. “Family Size and Children’s Educational Performance: Is the
Effect Causal?” Forthcoming in American Sociological Review. (Difference model)
Entwisle, Barbara and William M. Mason. 1985. “Multilevel Effects of Socioeconomic
Development and Family Planning Programs on Children Ever Born.” American Journal
of Sociology 91: 616-49. (Multilevel model)
Cox, David R. and Oakes, D. 1984. Analysis of Survival Data. New York: Chapman and Hall.
Diamond, I., McDonald, J. and Shah, I. 1986. Proportional Hazard Models for Current Status
Data: Application to the Study of Differentials in Age at Weaning in Pakistan.
Demography, 23:607-620.
Tuma, N.B. and M.T. Hannan. 1984. Social Dynamics, Models and Methods, New York:
Academic Press, INC.
Guang Guo. 1996. "Negative Multinomial Regression Models For Clustered Event Counts." In
Sociological Methodology 1996 edited by Adrian E. Raftery Pp. 113-132. Washington,
DC: American Sociological Association.
Firebaugh, Glenn, and Frank D. Beck. 1994. “Does Economic Growth Benefit the Masses? Growth,
Dependence, and Welfare in the Third World.” American Sociological Review 59: 631-653.
(Difference model)