A Conceptual Introduction to Multilevel Models as Structural Equations

Lee Branum-Martin
Georgia State University, Language & Literacy Initiative

A Workshop for the Society for the Scientific Study of Reading
July 9, 2013, Hong Kong, China

The analyses and software for this workshop were supported by the Institute of Education Sciences, U.S. Department of Education, through grants R305A10272 (Lee Branum-Martin, PI) and R305D090024 (Paras D. Mehta, PI) to the University of Houston. The initial data collection was jointly funded by NICHD (HD39521) and IES (R305U010001) to UH (David J. Francis, PI). The opinions expressed are those of the author and do not represent views of these funding agencies.

Important concepts for students interested in high-quality education research

Psychometrics/test theory is the basis for educational measurement.
• Item Response Theory
• Confirmatory Factor Analysis, Structural Equation Modeling
• Direct tests of theory

Multilevel models for nested data.
• Longitudinal models (observations nested within persons)
• Complex clustering (regular instruction + tutoring)
• Mixed effects, random effects, and multilevel models can be fit in a number of different software packages.

Overall Goals for Today

Get an introductory understanding of how theory and models get represented in three crucial dialects of social science research:
1. Diagrams (accurate and complete)
2. Equations
   a. Scalar equations for variables
   b. Matrix equations for variables
   c. Matrix representations of covariances
3. Code in different software

Apply these translations for simple multilevel models in some example software: Mplus, lme4, and xxm.
Get some experience with R.

Today’s Workshop

1. What is a multilevel model?
   a. Conceptual basis: what is clustering?
   b. Graphical approach: histograms, boxplots
   c. Equations, data structure, diagram
2. Adding a predictor
   a. Conceptual basis: what is a predictor?
   b. Graphical approach: scatterplot
   c. Equations, data structure, diagram
3. Extensions: bivariate to SEM?

Background

Branum-Martin, L. (2013). Multilevel modeling: Practical examples to illustrate a special case of SEM. In Y. Petscher, C. Schatschneider & D. L. Compton (Eds.), Applied quantitative analysis in the social sciences (pp. 95-124). New York: Routledge.
Singer, J. D. (1998). Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models. Journal of Educational and Behavioral Statistics, 23(4), 323-355.
Mehta, P. D., & Neale, M. C. (2005). People are variables too: Multilevel structural equations models. Psychological Methods, 10(3), 259–284.
West, B. T., Welch, K. B., & Gałecki, A. T. (2007). Linear mixed models: A practical guide using statistical software. Boca Raton: Chapman & Hall.

Nested Data: They’re everywhere

Developmental: items, trials, days, persons
Clinical: interview topics, sessions (days, weeks, months), persons, sites (relational, networked?)
Cognitive: items, tests, traits, person, social group, neighborhood (region, hemisphere: spatial!)
Neuropsychology: time (ms), electrode, person
Education: items, tests, years, students, classrooms, schools

If treatment is at one level, what does variability mean at lower and higher levels?

Students in Classrooms

802 students in 93 classrooms in 23 schools.
Passage comprehension W-scores on Woodcock Johnson Language Proficiency Battery-Revised.

Multilevel Regression: Random Intercept Model

Level 1 (i students):    $Y_{ij} = \beta_{0j} + e_{ij}$
Level 2 (j classrooms):  $\beta_{0j} = \gamma_{00} + u_{0j}$

$e_{ij}$: random residual for level 1
$\gamma_{00}$: fixed intercept for level 2 (grand intercept)
$u_{0j}$: random residual for level 2 (deviation from grand intercept)

By substitution, we get the full equation, with one fixed term ($\gamma_{00}$) and two random terms ($u_{0j}$ and $e_{ij}$):

\[ Y_{ij} = \gamma_{00} + u_{0j} + e_{ij} \]

SAS code (Singer, 1998):

    proc mixed covtest data = mydata;
      class classroom;
      model y = / solution;
      random intercept / subject = classroom;
    run;

Multilevel Regression: SEM Diagram

[Path diagram (Mehta & Neale, 2005). Level 2 (j classrooms): the constant 1, $\gamma_{00}$ (fixed intercept for level 2, the grand intercept), and $u_{0j}$ (random residual for level 2, the deviation from the grand intercept). Level 1 (i students): $Y_{ij}$ and $e_{ij}$ (random residual for level 1).]

Multilevel Regression: Variance components

Quantity                                        HLM-style notation   SEM notation
Grand intercept                                 $\gamma_{00}$        $\alpha$
Variance of classroom deviations ($u_{0j}$)     $\tau_{00}$          $\psi$
Variance of student deviations ($e_{ij}$)       $\sigma^2$           $\theta$

(Mehta & Neale, 2005)

Multilevel Regression: Results

Level 2 (j classrooms):
  Grand intercept $\alpha$ = 444.0
  Variance of classroom deviations $\psi$ = 89.8 (SD = 9.5)
Level 1 (i students):
  Variance of student deviations $\theta$ = 410.0 (SD = 20.2)

\[ \text{Intraclass correlation} = \frac{v(\text{between})}{v(\text{total})} = \frac{89.8}{89.8 + 410} = .18 \]

Model Results

[Figure annotations: Classroom SD = 9.5; $\gamma_{00}$ = 444.0; Student SD = 20.2]

How Does a Multilevel Model Work?
Data Set (Excel, SPSS)

Student   Classroom   Outcome
1         1           $Y_{11}$
2         1           $Y_{21}$
3         2           $Y_{32}$
4         2           $Y_{42}$
5         3           $Y_{53}$
6         3           $Y_{63}$

Classroom Regressions

$Y_{i1} = \eta_1 + e_{i1}$
$Y_{i2} = \eta_2 + e_{i2}$
$Y_{i3} = \eta_3 + e_{i3}$
where $\eta \sim N(\alpha, \psi)$ and $e \sim N(0, \theta)$

SEM

[Path diagram with the constant 1, $\alpha$, $\eta_j$, $\psi$, $Y_{ij}$, $e_{ij}$, and $\theta$.]

Multilevel Regression = Multilevel SEM

The same data set and classroom regressions, drawn as one small SEM per classroom:

Classroom SEMs
  $\eta_1$: $Y_{11}$ (residual $e_{11}$), $Y_{21}$ (residual $e_{21}$)
  $\eta_2$: $Y_{32}$ (residual $e_{32}$), $Y_{42}$ (residual $e_{42}$)
  $\eta_3$: $Y_{53}$ (residual $e_{53}$), $Y_{63}$ (residual $e_{63}$)
where $\eta \sim N(\alpha, \psi)$ and $e \sim N(0, \theta)$

Classroom SEM: Expanded version

[Path diagram: Classrooms 1, 2, and 3 side by side. Each classroom has its own $\eta_j$ with mean $\alpha$ (from the constant 1) and variance $\psi$. Each outcome $Y_{ij}$ has a loading of 1 on its classroom's $\eta_j$ and a residual $e_{ij}$ with variance $\theta$.]

Matrix equation for outcomes, where the 6 × 3 matrix of ones and zeros is the (implicit) cross-level linking matrix:

\[
\begin{bmatrix} Y_{11} \\ Y_{21} \\ Y_{32} \\ Y_{42} \\ Y_{53} \\ Y_{63} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{bmatrix}
+
\begin{bmatrix} e_{11} \\ e_{21} \\ e_{32} \\ e_{42} \\ e_{53} \\ e_{63} \end{bmatrix}
\]

Classroom SEM: Concise version

Student Model
  $Y_{ij}$: outcome
  $e_{ij}$: student residual
  $\theta$: variance of student residuals
Cross-level link
  $\lambda$
Classroom Model
  $\eta_j$: classroom deviation
  $\psi$: variance between classrooms
  $\alpha$: latent mean (across classrooms), from the constant 1

Model matrices: $\theta$, $\lambda$, $\psi$, $\alpha$

Passage Comprehension Predicted by Word Attack

802 students in 93 classrooms in 23 schools.
W-scores on Woodcock Johnson Language Proficiency Battery-Revised.
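The matrix equation for the outcomes and the reported variance components can be checked numerically. What follows is a minimal sketch, not the workshop's own code. It assumes the six-student, three-classroom example layout, plugs in the reported estimates (89.8 between classrooms, 410.0 for student residuals), and uses plain Python lists. The helper names linking_matrix and implied_covariance are hypothetical.

```python
# Sketch: model-implied covariance of the six example outcomes under the
# random intercept model, Sigma = Lambda * Psi * Lambda' + Theta.
# PSI and THETA are the estimates reported in the results; the helper
# names below are hypothetical and do not come from any package.

PSI = 89.8     # variance between classrooms
THETA = 410.0  # variance of student residuals

# Classroom membership for students 1..6 (two students per classroom).
classroom = [1, 1, 2, 2, 3, 3]
n_classrooms = 3


def linking_matrix(membership, n_groups):
    """Cross-level linking matrix: row i has a 1 in the column of student i's classroom."""
    return [[1 if g == j + 1 else 0 for j in range(n_groups)] for g in membership]


def implied_covariance(lam, psi, theta):
    """Sigma = Lambda (psi * I) Lambda' + theta * I, written with plain loops."""
    n = len(lam)
    sigma = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            # shared is 1 when students r and c sit in the same classroom.
            shared = sum(lam[r][k] * lam[c][k] for k in range(len(lam[r])))
            sigma[r][c] = psi * shared + (theta if r == c else 0.0)
    return sigma


lam = linking_matrix(classroom, n_classrooms)
sigma = implied_covariance(lam, PSI, THETA)
icc = PSI / (PSI + THETA)

for row in sigma:
    print(["%.1f" % v for v in row])
print("ICC = %.2f" % icc)
```

In the resulting matrix, two students in the same classroom share a covariance equal to the between-classroom variance, so their implied correlation is the intraclass correlation. Students in different classrooms have zero covariance.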
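Classroom Predictions of PC by WA

802 students in 93 classrooms in 23 schools.
W-scores on Woodcock Johnson Language Proficiency Battery-Revised.

Adding a Predictor

Data Set (Excel, SPSS)

Student   Classroom   Outcome    Predictor
1         1           $Y_{11}$   $X_{11}$
2         1           $Y_{21}$   $X_{21}$
3         2           $Y_{32}$   $X_{32}$
4         2           $Y_{42}$   $X_{42}$
5         3           $Y_{53}$   $X_{53}$
6         3           $Y_{63}$   $X_{63}$

Classroom Regressions

$Y_{i1} = \eta_{11} + X_{i1}\eta_{21} + e_{i1}$
$Y_{i2} = \eta_{12} + X_{i2}\eta_{22} + e_{i2}$
$Y_{i3} = \eta_{13} + X_{i3}\eta_{23} + e_{i3}$

Adding a Predictor: SEM

[Path diagram. Classroom Model: the constant 1 with means $\alpha_1$ and $\alpha_2$; latent variables $\eta_{1j}$ and $\eta_{2j}$ with variances $\psi_{11}$ and $\psi_{22}$ and covariance $\psi_{21}$. Student Model: outcome $Y_{ij}$, predictor $X_{ij}$, and residual $e_{ij}$ with variance $\theta$.]

Model Matrices

\[
\alpha_{2,2} = \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix}, \quad
\Psi_{2,2} = \begin{bmatrix} \psi_{11} & \psi_{12} \\ \psi_{21} & \psi_{22} \end{bmatrix}, \quad
\Lambda_{2,1} = \begin{bmatrix} 1 & X_{ij} \end{bmatrix}, \quad
\Theta_{1,1} = \begin{bmatrix} \theta_{11} \end{bmatrix}
\]

Observed Variable Matrices

\[
\begin{bmatrix} Y_{11} \\ Y_{21} \\ Y_{32} \\ Y_{42} \\ Y_{53} \\ Y_{63} \end{bmatrix}
=
\begin{bmatrix}
1 & X_{11} & 0 & 0 & 0 & 0 \\
1 & X_{21} & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & X_{32} & 0 & 0 \\
0 & 0 & 1 & X_{42} & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & X_{53} \\
0 & 0 & 0 & 0 & 1 & X_{63}
\end{bmatrix}
\begin{bmatrix} \eta_{11} \\ \eta_{21} \\ \eta_{12} \\ \eta_{22} \\ \eta_{13} \\ \eta_{23} \end{bmatrix}
+
\begin{bmatrix} e_{11} \\ e_{21} \\ e_{32} \\ e_{42} \\ e_{53} \\ e_{63} \end{bmatrix}
\]

Adding a Predictor: Results

Classroom Model
  $\alpha_1$ = 443.4
  $\alpha_2$ = .85
  $\psi_{11}$ = 37.0
  $\psi_{21}$ = -.34 (-.27)
  $\psi_{22}$ = .04
Student Model
  $\theta$ = 234.6

Not Just a Predictor: Two Outcomes

SEM: Random Slope
  Classroom Model: the constant 1, $\alpha_1$, $\alpha_2$, $\eta_{1j}$, $\eta_{2j}$, $\psi_{11}$, $\psi_{21}$, $\psi_{22}$
  Student Model: $Y_{ij}$, $X_{ij}$, $e_{ij}$, $\theta$

SEM: Bivariate Random Intercepts
  Classroom Model: the constant 1, $\alpha_1$, $\alpha_2$, $\eta_{1j}$, $\eta_{2j}$, $\psi_{11}$, $\psi_{21}$, $\psi_{22}$
  Student Model: $Y_{ij}$ and $X_{ij}$ with residuals $e_{1ij}$ and $e_{2ij}$; $\theta_{11}$, $\theta_{21}$, $\theta_{22}$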
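To see what the random slope estimates imply, the sketch below computes the model-implied mean and variance of the outcome at a given predictor value, using a row of the form [1, X] as in the classroom regressions. It is an illustration under stated assumptions, not the workshop's analysis code. The estimates are read off the results diagram in the order intercept mean 443.4, slope mean .85, intercept variance 37.0, covariance -.34, slope variance .04, and residual variance 234.6. The parenthetical -.27 is assumed here to be the intercept-slope correlation. The predictor values 0 and 10 are hypothetical, since the slides do not say how the predictor was scaled or centered. All function names are hypothetical.

```python
import math

# Estimates as read from the results diagram (an assumption about which
# number labels which path); function names are hypothetical.
ALPHA = [443.4, 0.85]                  # means of random intercept and random slope
PSI = [[37.0, -0.34], [-0.34, 0.04]]   # classroom-level covariance matrix
THETA = 234.6                          # student residual variance


def implied_mean(x):
    """E[Y | X = x] = alpha_1 + alpha_2 * x."""
    return ALPHA[0] + ALPHA[1] * x


def implied_variance(x):
    """Var[Y | X = x] = lam * Psi * lam' + theta, with lam = [1, x]."""
    lam = [1.0, x]
    between = sum(lam[r] * PSI[r][c] * lam[c] for r in range(2) for c in range(2))
    return between + THETA


def intercept_slope_correlation():
    """Correlation implied by the rounded covariance and variances."""
    return PSI[1][0] / math.sqrt(PSI[0][0] * PSI[1][1])


# Hypothetical predictor values, chosen only for illustration.
for x in (0.0, 10.0):
    print("x = %5.1f  mean = %.1f  variance = %.1f" % (x, implied_mean(x), implied_variance(x)))
print("implied correlation = %.3f" % intercept_slope_correlation())
```

Because the covariance between intercepts and slopes is negative, the classroom-level part of the variance first shrinks and then grows as the predictor moves away from zero. The correlation recomputed from the rounded estimates is roughly -.28, close to the parenthetical value in the diagram; the small gap is plausibly due to rounding of the slope variance.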