The University of North Carolina at Chapel Hill
School of Social Work

SOWO 917 Longitudinal and Multilevel Analysis
Fall Semester, 2014

INSTRUCTOR
Roderick Rose Ph.D.
Room 245D Tate Turner Kuralt, CB #3550
School of Social Work, Chapel Hill, NC 27599-3550
Phone: (919) 260-052
Email: rarose@email.unc.edu

CLASS MEETING TIMES & OFFICE HOURS
Class meets on Wednesdays 9:00-11:50 am (Room 135 TTK)
Office hours are Wednesday 12-1 and Thursday 10-2

COURSE DESCRIPTION
This course introduces the context and intuition for longitudinal and multilevel models, and the statistical frameworks, analytical tools, and social behavioral applications of three types of models: event history analysis (EHA), multilevel modeling (MLM), and growth curve analysis.

COURSE OBJECTIVES
At the completion of the course, students will have a solid understanding of the challenges and problems in longitudinal and multilevel analysis. They will know how to choose appropriate statistical analyses that best suit the type of data and research questions for a given study. They are expected to be able to conceptualize, design, run, interpret, and communicate results clearly and effectively in spoken and written settings based on multilevel modeling (including two-level and three-level hierarchical linear models, growth curve analysis, categorical MLMs, and understanding cross-classification and cross-level effects) and event history analysis (life tables, Kaplan-Meier’s estimate of survivor function, discrete time model, Cox proportional hazard model, marginal models handling multilevel event data).

PRE-REQUISITES
Students are assumed to be familiar with descriptive and inferential statistics as well as multiple regression analysis. They should have statistical and statistical software background at least equivalent to that provided by SOWO918, SOCI209, PSYC282, EDUC284 (linear regression), or SOCI211 (categorical data analysis).
Students without such prerequisites should contact the instructor to determine their eligibility to take this course.

SAKAI COURSE SITE
Go to: https://www.unc.edu/sakai/
Enter your ONYEN
Navigate to SOWO917.001.FA14
This syllabus is under “syllabus” on the left-hand navigation menu
All class lecture notes, assignments, and other materials as needed will be provided under “resources” on the left-hand navigation menu
All course materials are on the web site and students are responsible for bringing their materials to class.

STATISTICAL SOFTWARE PACKAGES
Students may choose to use Stata, SAS, or R as the primary statistical software package for the course. I will use all three at various times in classroom lectures, materials, and demonstrations.

TEXTBOOKS
Raudenbush, S.W., & Bryk, A.S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods, Second Edition. Thousand Oaks, CA: Sage Publications Ltd.
Singer, J.D., & Willett, J.B. (2003). Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. New York, NY: Oxford University Press.

RECOMMENDED TEXTBOOKS
Allison, P.D. (1995). Survival Analysis Using the SAS System. Cary, NC: SAS Institute Inc.
Cleves, M.A., Gould, W.W., & Gutierrez, R.G. (2004). An Introduction to Survival Analysis Using Stata, Rev. ed. College Station, TX: Stata Press.
Guo, S. (2010). Survival Analysis: A Practical Guide to Social Work Research. New York, NY: Oxford University Press.
Rabe-Hesketh, S., & Skrondal, A. (2005). Multilevel and Longitudinal Modeling Using Stata. College Station, TX: Stata Press.
SAS Online documentation for Proc Mixed: http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#mixed_toc.htm
R Manual for MLM: http://cran.r-project.org/doc/contrib/Bliese_Multilevel.pdf

ASSIGNMENTS                  GRADE PERCENTAGE
Assignment 1                 5%
Assignment 2                 5%
Assignment 3                 5%
Assignment 4                 5%
Assignment 5                 5%
Assignment 6                 5%
Assignment 7                 5%
Term project                 15%
Midterm Exam (take home)     30%
Final Exam (take home)       20%

GRADING SYSTEM
The standard School of Social Work interpretation of grades and numerical scores will be used.
H = 94-100
P = 80-93
L = 70-79
F = 69 and below

POLICY ON CLASS ATTENDANCE
Class attendance is an important element of class evaluation, and you are expected to attend all scheduled sessions. If you miss a class session you will easily fall behind, and your absence will affect the class learning project, so it is imperative to attend. Students are responsible for informing the instructor when they must miss a class session.

POLICY ON INCOMPLETE AND LATE ASSIGNMENTS
Assignments are to be turned in to the professor by 5pm on the due date noted in the course outline. Brief extensions may be granted by the professor given advance notice of at least 24 hours. Late assignments (not turned in by 5pm on the due date) will be reduced 10 percent for each day late (including weekend days). A grade of incomplete will only be given under extenuating circumstances and in accordance with University policy.

POLICY ON ACADEMIC DISHONESTY
Students are expected to follow the UNC Honor Code. Please include the honor code statement along with your signature on all assignments: “I have neither given nor received unauthorized aid on this assignment.” Please refer to the APA Style Guide, the SSW Manual, and the SSW Writing Guide for information on attribution of quotes, plagiarism and appropriate use of assistance in preparing assignments. If reason exists to believe that academic dishonesty has occurred, a referral will be made to the Office of the Student Attorney General for investigation and further action as required.
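As an illustration only, the grade percentages, the H/P/L/F ranges, and the late penalty stated above can be combined as in the following Python sketch. It rests on two assumptions that the syllabus does not spell out: the late penalty is read as 10 points per day on a 0-100 scale, and fractional course scores are compared to the cut-offs without rounding. The function names and the example student are hypothetical.

```python
# Illustrative sketch only; weights and cut-offs are copied from the syllabus.
WEIGHTS = {f"Assignment {i}": 5 for i in range(1, 8)}  # seven assignments at 5% each
WEIGHTS.update({"Term project": 15, "Midterm Exam": 30, "Final Exam": 20})
assert sum(WEIGHTS.values()) == 100  # the listed percentages sum to 100

LATE_POINTS_PER_DAY = 10  # ASSUMPTION: "10 percent" read as 10 points on a 0-100 scale


def apply_late_penalty(score, days_late):
    """Subtract the late penalty for each day late, never going below zero."""
    return max(0, score - LATE_POINTS_PER_DAY * days_late)


def course_score(scores):
    """Weighted average of component scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a numeric score to H/P/L/F. ASSUMPTION: fractional scores are not rounded."""
    if score >= 94:
        return "H"
    if score >= 80:
        return "P"
    if score >= 70:
        return "L"
    return "F"


# Hypothetical student: 90 on every component, except Assignment 3
# (raw score 95), which was turned in one day late.
scores = {name: 90 for name in WEIGHTS}
scores["Assignment 3"] = apply_late_penalty(95, days_late=1)
total = course_score(scores)
print(total, letter_grade(total))  # prints: 89.75 P
```

Because each assignment carries only 5% of the grade, a one-day late penalty on a single assignment moves the course score by half a point under these assumptions.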

POLICY ON ACCOMMODATIONS FOR STUDENTS WITH DISABILITIES
Students with disabilities that affect their participation in the course may notify the instructor if they wish to have special accommodations in instructional format, examination format, etc., considered.

SESSION SCHEDULE
All sessions meet in Tate-Turner-Kuralt Room 135 except as noted.
1    8/20
2    8/27
3    9/3
4    9/10    Class will meet in the Tate-Turner-Kuralt telepresence room, 101.
5    9/17
6    9/24
7    10/1
8    10/8
- Fall break on 10/15, no class -
9    10/22
- Midterm exam due on 10/23 -
10   10/29
11   11/5
12   11/12
13   11/19
- Thanksgiving on 11/26, no class -
14   12/3
- Final exam due on 12/7 -

COURSE OUTLINE (TOPICS, READINGS, AND ASSIGNMENTS)

1 Introduction and course overview
Overview and rationale for multilevel models.
Overview and rationale for survival models.
Review of fundamental statistical concepts.
Readings to be completed for this session:
Guo, S. (2013). Advanced statistical analysis. Entry for the Encyclopedia of Social Work Online. New York, NY: The Oxford University Press.
Optional Reading:
Guo, S. (2013). Maximum likelihood estimator: The untold stories, caveats, and tips for application. Chinese Sociological Review 45(3), 74-101.
Assignment #1 (Due in session 2): In this assignment you will demonstrate your readiness to use SAS, Stata and R and to convert data between them. You will be asked to fit a simple two-level longitudinal multilevel model in each, and to complete a few other introductory tasks to get you started on each package.

2 Introduction to multilevel and hierarchical linear modeling
The importance of context to social and behavioral science.
Overview of MLM/HLM.
Multi-level hypotheses in social sciences
Variance decomposition, intra-class correlation & reliability
Random effects & fixed effects
Two-level model
Readings to be completed for this session:
Raudenbush & Bryk, Chapters 1 and 2
Singer, J. D. (1998). Using SAS Proc Mixed to fit multilevel models, hierarchical models, and individual growth models.
Journal of Educational and Behavioral Statistics 23(4), 323-355.
Hedges, L. V. (2007). Correcting a significance test for clustering. Journal of Educational and Behavioral Statistics 32(2), 151-179.
Seminar reading to be completed for this session:
Sampson, R.J., Raudenbush, S.W., & Earls, F. (1997). “Neighborhoods and violent crime: A multilevel study of collective efficacy.” Science 277(15): 918-924.
Assignment 1 due today
Optional Reading:
Guo, S. (2005). “Analyzing grouped data with hierarchical linear modeling”, Children and Youth Services Review 27:637-65.
Assignment 2 (due in session 3): For this assignment you will be asked to explain the meaning of several equations in chapter 3 of Raudenbush & Bryk, 2002.

3 Multilevel models in organizational applications
Two-level model (finish)
Writing out equations and substitution.
Estimation theory.
Organizational designs
Variance explained and presenting results.
Readings to be completed for this session:
Raudenbush & Bryk, Chapter 5 (99-130) and selections from chapter 3.
  o For chapter 3, I am not expecting you to read it from beginning to end. However, in order to familiarize you with estimation concepts, I have designed assignment 2 to facilitate learning from this chapter.
Primo, D., Jacobsmeier, M. L., and Milyo, J. (2007). Estimating the impact of state policies and institutions with mixed-level data. State Politics and Policy Quarterly, 7(4), 446-449.
Snijders, T. A. B., & Bosker, R. J. (1994). Modeled variance in two-level models. Sociological Methods and Research 22(3), 342-363.
Seminar reading to be completed for this session:
Hedges, L. V. & Hedberg, E. C. (2007). Intraclass correlations for planning group randomized experiments in rural education. Educational Evaluation and Policy Analysis 29(1), 60-87.
Assignment 2 due today
Assignment 3 (due in session 4): Multilevel models in organizational applications.
Optional readings:
Raudenbush & Bryk, Finish chapter 3 and read chapter 4.

4 Multilevel models in organizational applications, continued.
Special topics including:
Centering.
Three-level organizational models (intro).
Model fitting and goodness-of-fit indices.
Power.
Prediction of effects for organizations.
Readings to be completed for this session:
Raudenbush & Bryk, Finish chapter 5
Rose, R. A., & Bowen, G. L. (2009). Power analysis in social work intervention research design: Designing cluster randomized trials. Social Work Research, 33(1), 43-52.
Raudenbush, S. W. (1997). Statistical analysis and optimal design for cluster randomized trials. Psychological Methods, 2(2), 173-185.
Seminar reading to be completed for this session:
Sanders, Saxton & Horn. The Tennessee Value-Added Assessment System. (Provided on Sakai).
Optional reading:
Schochet, P. Z. (2005). Statistical power for random assignment evaluations of education programs. Washington, DC: Mathematica Policy Research.
Assignment 3 due today

5 Individual growth models
Questions related to change
Longitudinal or panel data and specifying time
Random effects vs. fixed effects models
The multilevel model for change; model building
Readings to be completed for this session:
Singer & Willett, Chapters 1-3
Raudenbush, S. W. (2001). Comparing personal trajectories and drawing causal inferences from longitudinal data. Annual Review of Psychology 52, 501-525.
Raudenbush, S. W., & Liu, X. (2000). “Statistical power and optimal design for multisite randomized trials.” Psychological Methods 5(2): 199-213.
Seminar reading to be completed for this session:
Smokowski et al (Externalizing) (Under review).
Assignment 4 (due in session 6): Individual growth models.
Optional reading:
Raudenbush & Bryk, Chapter 6.

6 Individual growth models, continued.
The multilevel model for change; model building (continued)
Flexible time specifications
EBEs of individual growth parameters
Moderators and cross-level interactions
Readings to be completed for this session:
Singer & Willett, Chapters 4-5.
Morrell, C. H., Brant, L. J., & Ferrucci, L. (2009). Model choice can obscure results in longitudinal studies. Journal of Gerontology: Medical Sciences 64A(2), 215-222.
Seminar reading to be completed for this session:
Akos, P. T., Rose, R. A., & Orthner, D. (2014). Sociodemographic moderators of middle school transition effects on academic achievement. The Journal of Early Adolescence, Online first, 1-29.
Optional reading:
Marsh, H. W. & Hau, K. T. (2002). Multilevel modeling of longitudinal growth and change: Substantive effects or regression toward the mean artifacts? Multivariate Behavioral Research 37(2), 245-282.
Assignment 4 due today
Assignment 5 (Due either session 8 or 9): Seminar reading assignment.

7 Advanced MLM: Non-linearities
Non-linear and discontinuous change
Hierarchical generalized linear model (HGLM)
Multilevel models for binary and multinomial outcomes
Multilevel models for count data
Readings to be completed for this session:
Raudenbush & Bryk, Chapter 10
Singer & Willett, Chapter 6
Seminar reading to be completed for this session. For this session, each student will be assigned one of the readings:
Rose, R. A., Woolley, M. E., Orthner, D. K., Akos, P. T., & Jones-Sanpei, H. J. (2012). Increasing teacher use of career-relevant instruction: A randomized control trial of CareerStart. Educational Evaluation and Policy Analysis 34(3), 295-312.
Snelgrove, J. W., Pikhart, H., & Stafford, M. (2009). A multilevel analysis of social capital and self-rated health: Evidence from the British Household Panel Survey. Social Science & Medicine 68, 1993-2001.
Parish, S., Thomas, K., Rose, R., Kilany, M., & McConville, R. (2012). State insurance parity legislation for autism services and family financial burden. Intellectual and Developmental Disabilities 50(3), 190-198.

8 Advanced MLM: Complex data structures.
Three-level models
Cross-classified models
Readings to be completed for this session:
Raudenbush & Bryk, Chapter 8 and Chapter 12
Luo, W., & Kwok, O. (2009). The impacts of ignoring a crossed factor in analyzing cross-classified data. Multivariate Behavioral Research 44, 182-212.
Grady, M. W., & Beretvas, N. (2010). Incorporating student mobility in achievement growth modeling: A cross-classified multiple membership growth curve model. Multivariate Behavioral Research 45, 393-419.
Seminar readings will be shared by students (assignment 5; half of the students will go today, half in the next session).

9 Advanced MLM: practice and applications.
Diagnostics and model building
Estimation and convergence
Covariance structures
Innovative applications
Readings to be completed for this session:
Raudenbush & Bryk, Chapter 9
Singer & Willett, Chapter 7
Bauer, D. J. & Cai, L. (2009). Consequences of unmodeled nonlinear effects in multilevel models. Journal of Educational and Behavioral Statistics, 34(1), 97-114.
Guo, S. & Hussey, D. (1999). Analyzing longitudinal rating data: a three-level hierarchical linear model. Social Work Research 23(4), 258-269.
Optional readings:
Guo, S. & Bollen, K. A. (2013). Research using longitudinal ratings collected by multiple raters: One methodological problem and approaches to its solution. Social Work Research 37(2), 85-98.
Seminar readings will be shared by students (assignment 5; the remaining half of the students will go today).
Hand Out Midterm Exam (Due on date noted in session schedule): Use a data set provided by the course or a data set of your choice to run a multilevel regression model. Write a brief paper (no more than 12 pages, double spaced) to present findings.
The paper should include: (1) two research questions; (2) data description and specification of the multilevel regression; (3) description of the process by which the model will be fitted; (4) a description of model diagnostics and sensitivity tests; and (5) report and interpret the findings from each of (2)-(3). At least one of your research questions should imply a cross-level interaction term. You should be able to explain the findings to a lay audience.

10 Intro to Survival Analysis
Review of binary and multinomial logistic regression
Overview of event history analysis
Censoring
Discrete-time event occurrence
Life tables
Hazard and survival functions/curves
Readings to be completed for this session:
Singer & Willett, Chapters 9-10.
Yang, T. & Aldrich, H. E. (2012). Out of sight but not out of mind: Why failure to account for left truncation biases research on failure rates. Journal of Business Venturing 27, 477-492.
Seminar reading to be completed for this session:
Berger, M. C. & Black, D. A. (1998). The duration of Medicaid spells: An analysis using flow and stock samples. The Review of Economics and Statistics 80(4), 667-675.
Optional readings:
Guo, G. (1993). “Event history analysis for left-truncated data”, Sociological Methodology, 23, 217-243.
Harris, K.M. (1993). “Work and welfare among single mothers in poverty.” American Journal of Sociology 99: 317-352.

11 Discrete-time models, continued
The discrete-time hazard model
Alternate specifications for time
Time-varying covariates
Proportionality and unobserved heterogeneity
Parametric models (Weibull, accelerated failure time, etc.)
Readings to be completed for this session:
Singer & Willett, Chapters 11-12.
Nam, Y. (2005). The roles of employment barriers in welfare exits and reentries after welfare reform: Event history analysis. Social Service Review 79(2), 268-293.
Haque, M. M. & Washington, S. (2014).
A parametric duration model of the reaction times of drivers distracted by mobile phone conversations. Accident Analysis and Prevention 62, 42-53.
Seminar reading to be completed for this session:
Glick, J. E. & Van Hook, J. (2011). Does a house divided stand? Kinship and the continuity of shared living arrangements. Journal of Marriage and Family 73, 1149-1164.
Optional readings:
Lee, E. T. & Go, O. T. (1997). Survival analysis in public health research. Annual Review of Public Health 18, 105-134.
Allison, P.D. (1982). “Discrete-time methods for the analysis of event histories”, Sociological Methodology, 13, 61-98.
Hetling, A., Ovwigho, P. C., & Born, C. E. (2007). Do welfare avoidance grants prevent cash assistance? Social Service Review 81(4), 609-631.
Assignment 6, due in session 12.

12 Kaplan-Meier & Cox proportional hazards model
The clog-log model
Rare event models
Kaplan-Meier’s estimate of survivor functions
The cumulative hazard function and kernel smoothing
Partial likelihood estimator
Cox regression
Readings to be completed for this session:
Singer & Willett, Chapters 13 and 14 up to page 516.
Heise, M. (2012). Law and policy entrepreneurs: Empirical evidence on the expansion of school choice policy. Notre Dame Law Review 87(5), 1917-1940.
Seminar reading to be completed for this session:
Kosterman, R., Hawkins, D., Guo, J., Catalano, R. F., & Abbott, R. D. (2000). The dynamics of alcohol and marijuana initiation: Patterns and predictors of first use in adolescence. American Journal of Public Health 90(3), 360-366.
Assignment 6 due today
Optional readings:
Sandefur & Cook (1998). “Permanent exits from public assistance: The impact of duration, family, and work”. Social Forces, 77(2), 763-786.
Guo, S., Biegel, D., Johnson, J. & Dyches, H. (2001). “Assessing the impact of mobile crisis services on preventing hospitalization: A community-based evaluation”. Psychiatric Services 52(2): 223-228.
Cox, D. R. (1972). Regression models and life-tables.
Journal of the Royal Statistical Society 2(XX), 187-220.
Efron, B. (1977). The efficiency of Cox’s likelihood function for censored data. Journal of the American Statistical Association 72(359), 557-565.

13 Cox proportional hazards model, continued.
Partial likelihood method
Interpreting results
Alternate structures for time
Non-proportional hazards and interactions with time
Diagnostics
Competing risks
Power analysis for survival models
Introduction to multilevel event time data (multivariate failure time data)
Readings to be completed for this session:
Singer & Willett, finish chapter 14 and 15.
Jozwiak, K. & Moerbeek, M. (2012). Power analysis for trials with discrete-time survival endpoints. Journal of Educational and Behavioral Statistics 37(5), 630-654.
MORE
No seminar reading
Optional readings:
Stata documentation on STPOWER: http://www.stata.com/manuals13/ststpower.pdf
Heckman, J.J., & Singer, B. (1985). “Social science duration analysis”, in Longitudinal Studies of Labor Market Data, New York, NY: Cambridge University Press. Chapter 2.
Grilli, L. (2005). The random effects proportional hazards model with grouped survival data: A comparison between the group continuous and continuation ratio versions. Journal of the Royal Statistical Society Series A, 168(1), 83-94.
Guo, S., & Wells, K. (2003). Research on timing of foster-care outcomes: one methodological problem and approaches to its solution. Social Service Review 77(1): 1-24.
Lin, D.Y. (1994). Cox regression analysis of multivariate failure time data: The marginal approach. Statistics in Medicine 13: 2233-2247.
Trussell, J., & Richards, T. (1985). “Correcting for unmeasured heterogeneity in hazard models using the Heckman-Singer procedure.” Sociological Methodology 15: 242-276.
Assignment 7, Due before Session 14: Revisit one article from any assigned seminar reading to perform a critical review.
This review (no more than two pages, single-spaced) should focus on: (1) strengths and limitations (very briefly), (2) major statistical problems, and (3) recommendations for revisions. You will present your review in class during session 14.
(Hand out Final)
Final Exam (Due on date noted in session schedule): Use a data set provided by the course or a data set of your choice to run an event history/survival regression model (any type). Write a brief paper (no more than 12 pages, double spaced) to present findings. The paper should include: (1) two research questions; (2) data description and specification of the survival regression; (3) description of the process by which the model will be fitted; (4) a description of model diagnostics and sensitivity tests; and (5) report and interpret the findings from each of (2)-(3). You should be able to explain the findings to a lay audience.

14 Presentation of assignment 7 article reviews, no readings or other assignment.