The University of North Carolina at Chapel Hill School of Social Work SOWO 917 Longitudinal and Multilevel Analysis Fall Semester, 2013 INSTRUCTOR Shenyang Guo, Ph.D., Room 524j, Tate Turner Kuralt Building CB #3550, School of Social Work, Chapel Hill, NC 27599-3550 Phone: (919) 843-2455 Email: sguo@email.unc.edu CLASS MEETING TIMES & OFFICE HOURS Class meets on Wednesdays 9:00-11:50 am Office hours are Tuesdays 10:30 – 12:30 (Room 524j TTK) COURSE DESCRIPTION This course introduces statistical frameworks, analytical tools, and social behavioral applications of three types of models: event history analysis, hierarchical linear modeling (HLM), and growth curve analysis. COURSE OBJECTIVES At the completion of the course, students will have a solid understanding of the challenges and problems in longitudinal and multilevel analysis. They will know how to choose appropriate statistical analyses that best suit the type of data and research questions for a given study. They are expected to be able to run, interpret, and communicate results clearly and effectively in writing based on the following models: life tables, Kaplan-Meier’s estimate of survivor function, discrete time model, Cox proportional hazard model, marginal models handling multilevel event data, two-level and three-level hierarchical linear models, growth curve analysis, and analysis of a categorical dependent variable using HGLM. PRE-REQUIREMENT Students are assumed to be familiar with descriptive and inferential statistics as well as multiple regression analysis. They should have statistical and statistical software background at least equivalent to that provided by SOCI209, PSYC282, EDUC284 (linear regression), or SOCI211 (categorical data analysis). Students without such prerequisites should contact the instructor to determine their eligibility to take this course. 1 STATISTICAL SOFTWARE PACKAGES Students may choose to use Stata, SAS, or SPSS as the primary statistical software package for the course, though the classroom lectures and materials will be based on Stata. Specialized software package HLM will also be demonstrated. TEXTBOOKS Guo, S. (2010). Survival Analysis: A Practical Guide to Social Work Research. New York, NY: Oxford University Press. Singer, J.D., & Willett, J.B., (2003). Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence, New York, NY: Oxford University Press RECOMMENDED TEXTBOOKS Allison, P.D. (1995). Survival Analysis Using the SAS System. Cary, NC: SAS Institute Inc. Collett, D. (1994). Modeling Survival Data in Medical Research. New York, NY: Chapman & Hall Cleves, M.A., Gould, W.W., & Gutierrez, R.G. (2004). An introduction to survival analysis using Stata, Rev. ed., College Station, TX: Stata Press. Raudenbush, S.W., & Bryk, A.S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods, Second Edition, Thousand Oaks, CA: Sage Publications Ltd. Rabe-Hesketh, S., & Skrondal, A. (2005). Multilevel and Longitudinal Modeling Using Stata, College Station, TX: Stata Press. ASSIGNMENTS GRADE PERCENTAGE Assignment 1 Assignment 2 Assignment 3 Assignment 4 Assignment 5 Midterm Exam (take home) Final Exam (take home) 10% 10% 10% 10% 10% 25% 25% GRADING SYSTEM The standard School of Social Work interpretation of grades and numerical scores will be used. H = 94-100 P = 80-93 L = 70-79 F = 69 and below 2 POLICY ON CLASS ATTENDANCE Class attendance is an important element of class evaluation, and you are expected to attend all scheduled sessions. Each class session will cover a great deal of materials, and you will fall behind the course when you miss even one class session. It’s student’s responsibility to inform the instructor via email in advance for missing a class session. You are expected not to miss more than two sessions for the whole semester. Starting from the second missing, your course grade will be reduced by 10% for each session missed. POLICY ON INCOMPLETE AND LATE ASSIGNMENTS Assignments are to be turned in to the professor by 5pm of the due date noted in the course outline. Extensions may be granted by the professor given advance notice of at least 24 hours. Late assignments (not turned in by 5pm on the due date) will be reduced 10 percent for each day late (including weekend days). A grade of incomplete will only be given under extenuating circumstances and in accordance with University policy. POLICY ON ACADEMIC DISHONESTY Students are expected to follow the UNC Honor Code. Please include the honor code statement along with your signature on all assignments: “I have neither given nor received unauthorized aid on this assignment.” Please refer to the APA Style Guide, the SSW Manual, and the SSW Writing Guide for information on attribution of quotes, plagiarism and appropriate use of assistance in preparing assignments. If reason exists to believe that academic dishonesty has occurred, a referral will be made to the Office of the Student Attorney General for investigation and further action as required. POLICY ON ACCOMMODATIONS FOR STUDENTS WITH DISABILITIES Students with disabilities which affect their participation in the course may notify the instructor if they wish to have special accommodations in instructional format, examination format, etc., considered. COURSE OUTLINE (TOPICS, READINGS, AND ASSIGNMENTS) 8/21/13 1. Introduction and course overview Review of longitudinal design: multi-wave panel, cohort, staggered multiple cohorts, cohort-sequential, experimental, survey, and designs using administrative data. Review of conventional approaches to longitudinal analysis: transition probability of Markov chain model, paired t test, within-subject ANOVA, repeated measure MANOVA. Review of statistical concepts: statistical assumptions embedded in OLS, problems of autocorrelation, and maximum likelihood estimator. Readings: Singer & Willett, Chapter 1. 3 Guo, S. (2008). Quantitative research. Entry for the Encyclopedia of Social Work, 20th Edition. New York, NY: The Oxford University Press. Guo, S. (2013). Advanced statistical analysis. Entry for the Encyclopedia of Social Work Online. New York, NY: The Oxford University Press. 8/28/13 2. Life table and Kaplan-Meier methods Overview of event history analysis Censoring Cohort life tables Kaplan-Meier’s estimate of survivor functions Readings: Guo, Chapters 1 & 2. Singer & Willett, Chapter 9. Guang Guo (1993). “Event history analysis for left-truncated data”, Sociological Methodology, 23, 217-243. Schwartz, I., Ortega, R., Guo, S., & Fishman, G. (1994) "Infants in nonpermanent placement". Social Service Review, 68(3), 405-416. (Hand out Assignment 1) Assignment 1 (Due: 9/11/13): (1) hand calculation of a life table; (2) use provided data set to construct life tables by stratum, perform a Kaplan-Meier test on group differences, and interpret findings; and (3) describe a longitudinal study that requires event history analysis. 9/4/13 3. Discrete time models Review of binary and multinomial logistic regression The logit model for discrete time Time-varying covariates Readings: Guo, Chapter 3. Singer & Willett, Chapters 10-11. Allison, P.D. (1982). “Discrete-time methods for the analysis of event histories”, Sociological Methodology, 13, 61-98. Harris, K.M. (1993). “Work and welfare among single mothers in poverty.” American Journal of Sociology 99: 317-352. 9/11/13 4. Parametric models The exponential model The Weibull model Overview of other parametric models Readings: Guo, Chapter 5. Singer & Willett, Chapters 12-13. Goerge, R.M. (1990). “The reunification process in substitute care”. Social Service Review 64(3): 422-457. 4 Sandefur & Cook, (1998). “Permanent exits from public assistance: The impact of duration, family, and work”. Social Forces, 77(2) 763-786. (Hand out Assignment 2) Assignment 2 (Due: 9/25/13): (1) solve problems on discrete-time and parametric models; and (2) use provided data to estimate a discrete-time model, and interpret findings. 9/18/13 5. Cox proportional hazards model (I) Overview Partial likelihood estimator Cox regression with time-varying covariates Readings: Guo, Chapter 4. Singer & Willett, Chapter 14. 9/25/13 6. Cox proportional hazards model (II) Competing risks Accelerated failure time models Model-predicted survivor curves Power analysis for survival models Readings: Singer & Willett, Chapter 15. Collett, D. (1994). Modeling Survival Data in Medical Research. London, Chapman & Hall. Chapter 9. Guo, S., Biegel, D., Johnson, J. & Dyches, H. (2001) “Assessing the impact of mobile crisis services on preventing hospitalization: A community-based evaluation”. Psychiatric Services 52(2):223-228. 10/2/13 7. Cox proportional hazards model (III) Introduction to multilevel event time data (multivariate failure time data) Marginal approaches to multilevel event times Unobserved heterogeneity Readings: Guo, Chapter 6. Guo, S., & Wells, K. (2003). Research on timing of foster-care outcomes: one methodological problem and approaches to its solution. Social Service Review 77(1): 1-24. Lin, D.Y. (1994). Cox regression analysis of multivariate failure time data: The marginal approach. Statistics in Medicine 13: 2233-2247. Guang Guo and German Rodriguez. (1992). “Estimating a multivariate proportional hazards model for clustered data using the EM algorithm, with an application to child survival in Guatemala.” Journal of American Statistical Association 87: 969-976. Heckman, J.J., & Singer, B. (1985), “Social science duration analysis”, in Longitudinal Studies of Labor Market Data, New York, NY: Cambridge University Press. Chapter 2. 5 Trussell, J., & Richards, T. (1985). “Correcting for unmeasured heterogeneity in hazard models using the Heckman-Singer procedure.” Sociological Methodology 15: 242-276. (Hand out Midterm) Midterm Exam (Due: 10/23/13): Use data sets provided by the course or data set you choose to run a Cox regression model. Write a paper (no more than 14 pages, double spaced) to present findings. The paper should include: (1) data and specification of Cox regression; (2) testing interaction terms; (3) present predicted survivor curves based on estimated model; and (4) interpret findings. 10/9/13 8. Overview of HLM and contextual analysis Multi-level hypotheses in social sciences Intra-class correlation Random effects Two-level model Readings: Guo, S. (2005). “Analyzing grouped data with hierarchical linear modeling”, Children and Youth Services Review 27:637-65. Singer & Willett, Chapter 2. 10/16/13 No Class, Fall Break 10/23/13 9. Growth curve and contextual analysis Three-level model Goodness-of-fit indices Application to contextual and multilevel analysis Readings: Singer & Willett, Chapters 3-4 Townsend, A., Miller, B., & Guo, S. (2001). “Depression in middle-aged and older married couples: A dyadic analysis”. Journal of Gerontology: Social Sciences, 56B(6): S352-S364. (Hand out Assignment 3) Assignment 3 (Due: 11/6/13): (1) describe a multilevel or longitudinal study that requires hierarchical linear modeling, (2) solve problems on HLM; (3) use provided data to estimate intra-class correlation, and (4) run a two-level HLM, interpret findings. 10/30/13 10. Computer Lab: Using Stata to run HLM Readings: Singer & Willett, Chapter 5-6 11/6/13 11. Principles of estimation, hypothesis testing, and GEE method Hypothesis testing Multiparameter testing 6 HLM assumptions about data Overview of estimation via ML and empirical Bayesian Generalized-estimating-equation (GEE) method Readings: Singer & Willett, Chapters 7 (Hand out Assignment 4) Assignment 4 (Due: 11/20/13): Read the Sampson, Raudenbush, & Earls’ (1997) article. Group discussion and classroom presentation. This Science-published study has made significant contributions to the field, both conceptually and methodologically. On the due date, students will present findings from their group discussions on the paper. Students will be divided into smaller groups in advance. Sampson, R.J., Raudenbush, S.W., & Earls, F. (1997) “Neighborhoods and violent crime: A multilevel study of collective efficacy.” Science 277(15): 918-924. 11/13/13 12. Growth curve analysis and testing mediating effects Graphic presentation of individual trajectories Three-level models in growth curve analysis Predicted values of outcome variable using SAS Proc Mixed Testing mediating effects in HLM Readings: Singer & Willett, Chapter 8 Guo, S. & Hussey, D. (1999). “Analyzing longitudinal rating data: A three-level hierarchical linear model”. Social Work Research, 23(4):258-269. Flannery, D., Vazsonyi, A., Liau, A., Guo, S., Powell, K., Atha, H., & Vesterdal, W. (2003). “Initial behavior outcomes for PeaceBuilders universal school-based violence prevention program”. Developmental Psychology 39 (2): 292-308. Krull, J.L., & MacKinnon, D.P. (1999). “Multilevel mediation modeling in groupbased intervention studies”. Evaluation Review 23(4):418-444. Krull, J.L., & MacKinnon, D.P. (2001). “Multilevel modeling of individual and group level mediated effects”. Multivariate Behavioral Research 36(2): 249-277. 11/20/13 13. Classroom discussion & other topics in HLM Part A (9:00-10:00): Student presentation and classroom discussion on Sampson, Raudenbush, & Earls (1997). Part B (10:00-11:50): Other applications of HLM: power analysis for HLM, HLM and SEM, multiple outcome variables, structured covariance matrix within subjects. Readings: Raudenbush, SW., & Liu, X. (2000). “Statistical power and optimal design for multisite randomized trials.” Psychological Methods 5(2): 199-213. 7 Curran, P.J. (2003). “have multilevel models been structural equation models all along?” Multivariate Behavioral Research 38(4): 529-569. Guang Guo & John Hipp. “The analysis of linear longitudinal data: Random-effects models and structural equation models.” In Melissa Hardy (edited), New Handbook on Data Analysis, London, England: Sage. MacCallum, R.C., & Kim, C. (2000). “Modeling multivariate change”, in Little, Schnabel, & Baumert edited, Modeling Longitudinal and Multilevel Data. Lawrence Erlbaum Associates, Publishers. Singer, L., Salvator, A., Guo, S., Collin, M., Lilien, L, & Baley, J. (1999). “Maternal psychological distress and parenting stress after the birth of a very low birthweight infant”. JAMA (Journal of American Medical Association), 281(9): 799-805. (Hand out Assignment 5) Assignment 5 (Due: 11/26/13): Choose one article from two that are provided by the course to perform a critical review. The provided articles employed (or potentially should have employed but did not) either a survival model or an HLM. This review (no more than two pages, single-spaced) should focus on: (1) strengths and limitations (very briefly), (2) major statistical problems, and (3) recommendations for revisions. (Hand out Final) Final Exam (Due: 12/7/13): Use data sets provided by the course or data set you choose to run an HLM. Write a paper (no more than 14 pages, double spaced) to present findings. The paper should include: (1) research objectives and questions; (2) methods and model specifications; (3) findings; and (4) conclusions and implications. 11/27/13 Thanksgiving holiday! No class. 12/4/13 14. Hierarchical generalized linear Models (HGLM) Overview of generalized linear model (GLM) Multilevel logistic regression Multilevel ordered logistic regression Multilevel Poisson regression Readings: Miller, B., & Guo, S. (2000) “Social support for spouse caregivers of persons with dementia”. Journal of Gerontology: Social Sciences 55B(3): S163-S172. Course summary Topics for future study 12/7/13 Final Exam Due 8