SOCI 718: LONGITUDINAL AND MULTILEVEL DATA ANALYSIS

Professor: Guang Guo
216 Hamilton Hall
Phone: 919-962-1246 (o)
Email: guang_guo@unc.edu
Office hours: 2:00 p.m. – 3:00 p.m., Wednesdays or by appointment
Lectures: 10:00 a.m. – 12:50 p.m. Wednesdays

TA: Seulki Choi
Office: HM 273
Email: chois@email.unc.edu
Office Hours: 1:00 p.m. – 2:00 p.m. Tuesday and 1:00 p.m. – 2:00 p.m. Thursday

DESCRIPTION

This course introduces analytical tools for three types of data: longitudinal event history data, longitudinal and multilevel linear data, and longitudinal and multilevel non-linear data. In event history analysis, we investigate the timing and conditions of the occurrence of an event. A classic example is the study of the timing and conditions of human mortality. Longitudinal linear data consist of repeated measures of an outcome over time on a number of units (e.g., individuals, counties, or countries). Pooled cross-sectional and time series data represent one main source of longitudinal linear data for social scientists. Multilevel data are also referred to as hierarchical data, in which lower-level units are clustered into higher-level units. School/student data, in which individual students are clustered into a number of schools, are among the best-known multilevel data sources.

This course provides you with (1) an introduction to statistical techniques for the three types of data and (2) experience running models on such data. We start with the event history methods and then proceed to the longitudinal and multilevel data. As you will see, much of the statistical methodology for longitudinal and multilevel data is closely related.

PREREQUISITES

Students should have statistical and statistical software background at least equivalent to that provided by Soci 709 (linear regression) and Soci 711 (categorical data analysis).
While we are dedicated to making the materials accessible to everyone in the class, you will need to devote a fair amount of time in order to achieve a reasonably good understanding of the subjects. Bear in mind that Soci 718 is an optional statistical methodology course.

GRADING

45% Assignments (individual projects)
10% Class Participation
10% Group methodological critiques of application articles (group projects)
35% Final Paper (12/3) (individual projects)

PRESENTATION

The class will be divided into a number of groups, with each group responsible for one article. Each group presents its critique and organizes and leads a class discussion on the article. Each group is divided into two subgroups, with one presenting the ideas in the article and the other providing the critical assessment of it. A guideline for the critiques will be distributed.

COURSE OUTLINE (approximate)

Presentations:
Group 1: Week 5
Group 2: Week 7
Group 3: Week 9
Group 4: Week 11
Group 5: Week 13
Group 6: Week 15

Event History Analysis

Week 1. The outcome variable in event-history analysis; a few examples of event-history analysis in social science research. Cohort life tables.
Required Readings: Class notes; Allison 1995: Chapter 3 (SAS oriented); Allison 1984: Introduction.
Optional Readings (some of Kalbfleisch and Prentice can be difficult): Kalbfleisch and Prentice: Chapter 1.

Week 2. Cohort life tables (continued): concepts, computation, and computer programming. Period life table (brief mention). Basic statistical concepts in event history analysis: density, CDF, survivor function, hazard function, censoring and censored data, and proportionality. Kaplan-Meier estimate of the survivor function. Failure time or survival time distributions.
Required Readings: Class notes; Allison 1995: Chapter 2.
Optional Readings: Kalbfleisch and Prentice: Chapter 2.

Week 3.
Continuous-time hazard model; review of maximum likelihood estimation; maximum likelihood estimation of survival models; exponential hazard model without censoring and covariates; exponential hazard model with censoring, but without covariates; proportional hazard model: proportionality, time-varying covariates, and time-varying effects; starting the piece-wise exponential hazard model: modeling the baseline hazard.
Required Readings: Class notes; Allison 1995: Chapters 2 and 4.
Optional Readings: Kalbfleisch and Prentice: Chapter 2.

Week 4. Piece-wise exponential model continued; data preparation; baseline management; interpretation of the coefficients in terms of hazard ratios; time-varying effects or interactions of the effects with time. Discrete-time event history analysis.

Week 5. First group presentation on Sorensen’s 2004 paper on race composition and employment (the presentation should take about 40-45 minutes; the intro is long and we should spend just enough time to understand it). Wrapping up discrete-time models. Starting on Cox’s models.

Week 6. The Cox proportional hazard model. Accelerated failure time models: the differences between a PH model and an AFT model; interpretation; estimation; and applications. Event history analysis for censored and truncated data. Left truncation.
Required Readings: Class notes; Allison 1995: Chapters 5-6; Guo 1993 Sociological Methodology paper.
Optional Readings: Kalbfleisch and Prentice.

Week 7. Second group presentation on Stuart and Ding’s 2006 AJS paper using Cox’s model. Unobserved heterogeneity; clustered event-history data and multilevel analysis of event history data; competing risks.
Readings: Class notes; Allison 1995; Vaupel et al.; Trussell and Richards.
Optional Readings: Guo 1992 Demography; Guo and Rodríguez 1990 JASA.

Readings for Event History Analysis:

Allison, Paul D. 1982. "Discrete-Time Methods for the Analysis of Event Histories." Pp. 61-98 in Sociological Methodology 1982, edited by S. Leinhardt. San Francisco: Jossey-Bass.

Allison, Paul D. 1984. Event History Analysis: Regression for Longitudinal Event Data. Beverly Hills, CA: SAGE Publications, Inc.

Allison, Paul D. 1995. Survival Analysis Using the SAS System: A Practical Guide. NC: SAS Institute Inc.

Guo, Guang. 1993. "Event-History Analysis for Left-Truncated Data." Pp. 217-243 in Sociological Methodology 1993, edited by Peter V. Marsden. Washington, DC: American Sociological Association.

Guo, Guang, and Germán Rodríguez. 1992. "Estimating a Multivariate Proportional Hazards Model for Clustered Data using the EM Algorithm, with an Application to Child Survival in Guatemala." Journal of the American Statistical Association 87: 969-976.

Heckman, James, and B. Singer. 1986. "Econometric Analysis of Longitudinal Data." Pp. 1689-1763 in Handbook of Econometrics, Vol. 3, edited by Z. Griliches and M.D. Intriligator. Amsterdam: North-Holland.

Kalbfleisch, J.D., and R.L. Prentice. 1980. The Statistical Analysis of Failure Time Data. New York: Wiley.

Singer, Judith D., and John B. Willett. 2003. Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. New York: Oxford University Press.

Trussell, J., and Toni Richards. 1985. "Correcting for Unmeasured Heterogeneity in Hazard Models Using the Heckman-Singer Procedure." In Sociological Methodology 1985, edited by N. Tuma. San Francisco: Jossey-Bass.

Vaupel, James W., Kenneth G. Manton, and Eric Stallard. 1979. "The Impact of Heterogeneity in Individual Frailty on the Dynamics of Mortality." Demography 16: 439-54.

Analysis of Linear and Non-Linear Longitudinal and Multilevel Data

Week 8. Random-effects linear models; the mixed models.
Readings: Class notes; Diggle et al. Chapters 1 and 3.

Week 9. Group 3 presentation of Keister’s 1998 AJS paper using random effects models. The mixed models.
Readings: Class notes.

Week 10. Multilevel models for linear data.
Readings: Bryk and Raudenbush Chapters 1, 2, 4, and 5; Goldstein Chapters 2 and 3 (the Bryk/Raudenbush and Goldstein chapters can also be read for studying random effects models, and they may be more accessible to many than those by Greene and Diggle); class notes.

Week 11. Group 4 presentation of Gamoran’s paper on multilevel models. Multilevel models. Random coefficient models.
Readings: Class notes.

Week 12. Linear longitudinal data. Exploring longitudinal data. Growth curve models.
Readings: Class notes; Guo and Hipp paper.

Week 13. Group 5 presentation of Johnson’s paper using SEM growth curve models. Growth curve models framed as random effects or SEM models.
Readings: Guo and Hipp paper and class notes.

Week 14. Difference models (fixed effects). Huber-White procedure.
Readings: Liker et al.; class notes.

Week 15. Group 6 presentation of England’s paper using fixed effects models.
Readings: Liker et al.; class notes.

Week 16. Multilevel models for binary data. GEE models.
Readings: Guo and Zhao’s Annual Review of Sociology paper.

Readings for Linear and Nonlinear Multilevel and Longitudinal Models:

Dielman, Terry E. 1983. "Pooled Cross-Sectional and Time Series Data: A Survey of Current Statistical Methodology." The American Statistician 37: 111-22.

Greene, William H. 1994. "Models that Use Both Cross-Section and Time-Series Data." Chapter 16 in Econometric Analysis. 2nd Edition. NY: Macmillan Publishing.

Guo, Guang, and John Hipp. Forthcoming. "The Analysis of Linear Longitudinal Data: Random-effects Models and Structural Equations Models." In New Handbook on Data Analysis, edited by Melissa Hardy. London, England: Sage.

Hsiao, Cheng. 1986. Analysis of Panel Data. NY: Cambridge University Press.

Kmenta, Jan. 1986. "Section 12.2 Pooling of Cross-Section and Time-Series Data." Pp. 616-35 in J. Kmenta, Elements of Econometrics. NY: Macmillan Publishing.

Liker, Jeffrey K., Sue Augustyniak, and Greg J. Duncan. 1985. "Panel Data and Models of Change: A Comparison of First Difference and Conventional Two-Wave Models." Social Science Research 14: 80-101.

Bryk, Anthony S., and Stephen W. Raudenbush. Hierarchical Linear Models. New York: SAGE.

Goldstein, Harvey. 1995. Multilevel Statistical Models. Second Edition. London: Arnold.

Diggle, Peter J., Kung-Yee Liang, and Scott L. Zeger. 1994. Analysis of Longitudinal Data. London: Oxford University Press.

Mason, W.M., Wong, G.M., and Entwisle, B. 1983. "Contextual Analysis through the Multilevel Linear Model." Pp. 72-103 in Sociological Methodology, edited by S. Leinhardt. San Francisco: Jossey-Bass.

Application Presentations:

Group 1: Sorensen. 2004. Employment and Race Composition. American Journal of Sociology. (Piece-wise exponential model; the intro is long; please do not spend too much time on the intro)

Group 2: Dudros’ 1997 Social Forces paper. (Discrete time models)

Group 3: Keister, Lisa A. 1998. "Engineering Growth: Business Group Structure and Firm Performance in China’s Transition Economy." American Journal of Sociology 104: 404-440.

Group 4: Gamoran, Adam. 1992. "The Variable Effects of High School Tracking." American Sociological Review 57: 812-828. (HLM model)

Group 5: Johnson, Monica Kirkpatrick. 2002. "Social Origins, Adolescent Experiences, and Work Value Trajectories during the Transition to Adulthood." Social Forces 80: 1307-1341.

Group 6: England, Paula, George Farkas, Barbara Stanek Kilbourne, and Thomas Dou. 1988. "Explaining Occupational Sex Segregation and Wages: Findings from a Model with Fixed Effects." American Sociological Review 53: 544-558. (Fixed effect model)

Other Papers on the Topics:

Arnold, Bruce, and John Hagan. 1992. "Careers of Misconduct: The Structure of Prosecuted Professional Deviance among Lawyers." American Sociological Review 57: 771-780. (Cox’s partial likelihood model)

Barron, David N., Elizabeth West, and Michael T. Hannan. 1994.
"A Time to Grow and a Time to Die: Growth and Mortality of Credit Unions in New York City, 1914-1990." American Journal of Sociology 100: 381-421.

Carroll, Glenn R., and Karl Ulrich Mayer. 1986. "Job-Shift Patterns in the Federal Republic of Germany: The Effects of Social Class, Industrial Sector and Organizational Size." American Sociological Review 51: 323-341. (Cox’s model)

Freeman, John, and Glenn R. Carroll. 1983. "The Liability of Newness: Age Dependence in Organizational Death Rates." American Sociological Review 48: 692-710. (Exploratory analysis and Gompertz model)

Nielsen, Francois. 1994. "Income Inequality and Industrial Development: The Dualism Revisited." American Sociological Review 59: 654-677. (Random effect model)

Michael, Robert T., and Nancy Brandon Tuma. 1985. "Entry into Marriage and Parenthood by Young Men and Women: The Influence of Family Background." Demography 22: 515-544. (Cox’s model)

Sandefur, Gary. 1985. "Variations in Interstate Migration of Men across the Early Stages of the Life Cycle." Demography 22: 353-366. (Gompertz model, a parametric model)

Vanderhoeft, C. 1982. "Accelerated Failure Time Models: An Application to Current Status Breast-feeding Data from Pakistan." Genus 38: 135-157. (Current status event history analysis)

Guo, Guang. 1993. "Use of Sibling Data to Estimate Family Mortality Effects in Guatemala." Demography 30(1): 15-32. (Random effect event history model for clustered data)

Guo, Guang, and Laurence Grummer-Strawn. 1993. "Child Mortality among Twins in Developing Countries." Population Studies 47(3): 1-16. (Random effect event history model for clustered data)

Harris, Kathleen Mullan. 1993. "Work and Welfare among Single Mothers in Poverty." American Journal of Sociology 99: 317-352. (Discrete time model)

Guo, Guang. 1998. "The Timing of the Influences of Cumulative Poverty on Children's Cognitive Outcomes in Childhood and Early Adolescence." Social Forces. (Random effect model)

Guo, Guang, and Leah Vanwey. "Family Size and Children’s Educational Performance: Is the Effect Causal?" Forthcoming in American Sociological Review. (Difference model)

Entwisle, Barbara, and William M. Mason. 1985. "Multilevel Effects of Socioeconomic Development and Family Planning Programs on Children Ever Born." American Journal of Sociology 91: 616-49. (Multilevel model)

Cox, David R., and D. Oakes. 1984. Analysis of Survival Data. New York: Chapman and Hall.

Diamond, I., McDonald, J., and Shah, I. 1986. "Proportional Hazard Models for Current Status Data: Application to the Study of Differentials in Age at Weaning in Pakistan." Demography 23: 607-620.

Tuma, N.B., and M.T. Hannan. 1984. Social Dynamics: Models and Methods. New York: Academic Press, Inc.

Guo, Guang. 1996. "Negative Multinomial Regression Models for Clustered Event Counts." Pp. 113-132 in Sociological Methodology 1996, edited by Adrian E. Raftery. Washington, DC: American Sociological Association.

Firebaugh, Glenn, and Frank D. Beck. 1994. "Does Economic Growth Benefit the Masses? Growth, Dependence, and Welfare in the Third World." American Sociological Review 59: 631-653. (Difference model)