Latent variable modeling of psychological longitudinal data: taking into account unobserved heterogeneity using Mplus

Jacques Juhel
University Rennes 2, CRPCC, EA 1285
June 2-4, 2010 - Saint-Raphaël
INSERM workshop: Mixture modelling for longitudinal data

Introduction

Studying individual differences in learning, change and development.
A double compromise:
• random effect model,
• classification techniques.

Introduction

Among other methods, the growth mixture modeling (GMM) approach of Muthén and colleagues is a technique for longitudinal data that:
• combines categorical and continuous latent variables in the same model (“beyond SEM”),
• accommodates unobserved heterogeneity in the sample,
• allows the latent growth parameters of each class to be influenced by time-varying covariates and time-invariant predictor variables,
• incorporates consequent outcomes predicted by the latent class variable.

LGM specifications

The latent growth model (LGM) for a continuous outcome: the multivariate latent variable approach.

Factor analysis measurement model (level 1):

$Y_i = \nu + \Lambda \eta_i + \varepsilon_i$,   (1)

where
$Y_i$ (m×1): repeated measures over fixed time points,
$\nu$ (m×1): intercepts in the regression of $Y_i$ on $\eta_i$,
$\eta_i$ (p×1): latent growth factors,
$\Lambda$ (m×p): design matrix of factor loadings,
$\varepsilon_i$ (m×1): residuals in the regression of $Y_i$ on $\eta_i$ (covariance matrix $\Theta$).

LGM specifications

Structural regression model (level 2):

$\eta_i = \alpha + \mathrm{B}\eta_i + \zeta_i$,   (2)

where
$\alpha$ (p×1): means of $\eta_i$, or intercepts in the regression of $\eta_i$ on $\eta_i$,
$\mathrm{B}$ (p×p): coefficients in the regression of $\eta_i$ on $\eta_i$,
$\eta_i$ (p×1): latent growth factors,
$\zeta_i$ (p×1): residuals in the regression of $\eta_i$ on $\eta_i$ (covariance matrix $\Psi$).
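As a worked illustration of the measurement equation (1) and the structural equation (2): if the residuals have zero mean and the level-1 residuals are uncorrelated with the level-2 residuals, the two equations imply, for the repeated measures, a mean vector $\mu = \nu + \Lambda(I - \mathrm{B})^{-1}\alpha$ and a covariance matrix $\Sigma = \Lambda(I - \mathrm{B})^{-1}\Psi(I - \mathrm{B})^{-\top}\Lambda^{\top} + \Theta$, where $\Theta$ and $\Psi$ are the residual covariance matrices of (1) and (2). The sketch below computes both with NumPy. The function name `implied_moments`, the four-occasion linear design matrix and every numerical value are assumptions chosen for the example, not estimates from the talk.

```python
import numpy as np


def implied_moments(nu, Lam, alpha, B, Psi, Theta):
    """Model-implied mean vector and covariance matrix of Y under (1)-(2)."""
    p = Lam.shape[1]
    inv = np.linalg.inv(np.eye(p) - B)                 # (I - B)^{-1}
    mu = nu + Lam @ inv @ alpha                        # implied means
    Sigma = Lam @ inv @ Psi @ inv.T @ Lam.T + Theta    # implied covariances
    return mu, Sigma


# Hypothetical linear LGM: m = 4 occasions, p = 2 growth factors
Lam = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0],
                [1.0, 3.0]])          # intercept and linear slope loadings
nu = np.zeros(4)                      # measurement intercepts fixed at 0
alpha = np.array([10.0, 2.0])         # means of intercept and slope (made up)
B = np.zeros((2, 2))                  # no regressions among growth factors
Psi = np.array([[4.0, 0.5],
                [0.5, 1.0]])          # growth-factor covariance matrix (made up)
Theta = np.eye(4)                     # residual variances of y (made up)

mu, Sigma = implied_moments(nu, Lam, alpha, B, Psi, Theta)
print(mu)
print(Sigma)
```

With these made-up values the implied means increase by the slope mean at each occasion, which is the defining feature of the linear design matrix.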
LGM assumptions

The LGM for a continuous outcome: the multivariate latent variable approach.

The covariance and mean structures are derived for the population under the hypotheses that $\varepsilon$, $\zeta$ and $\eta$ are mutually uncorrelated, and that $E[\varepsilon] = 0$ and $E[\zeta] = 0$.

SEM representation

The unconditional linear LGM

[Path diagram: y1 to y4 load on the intercept factor $\eta_0$ (loadings 1, 1, 1, 1) and on the slope factor $\eta_1$ (loadings 0, 1, 2, 3).]

Fixed parameters:
$\nu = 0$, $\mathrm{B} = 0$, $\Lambda = \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}$.

Free parameters (Mplus output):
$\alpha$: means of $\eta_0$ and $\eta_1$,
$\Psi$: var($\eta_0$), var($\eta_1$), cov($\eta_0$, $\eta_1$),
res. var(y).

The model equations reduce to:

$Y_i = \Lambda \eta_i + \varepsilon_i$,   (1)
$\eta_i = \alpha + \zeta_i$.   (2)

LGM specifications

The LGM with time-varying covariates

Factor analysis measurement model (level 1):

$Y_i = \nu + \Lambda \eta_i + K a_i + \varepsilon_i$,   (1bis)

where
$Y_i$ (m×1): repeated measures over fixed time points,
$\nu$ (m×1): intercepts in the regression of $Y_i$ on $\eta_i$,
$\eta_i$ (p×1): latent growth factors,
$\Lambda$ (m×p): design matrix of factor loadings,
$K$ (m×r): coefficients in the regression of $Y_i$ on the time-varying covariates $a_i$,
$\varepsilon_i$ (m×1): residuals in the regression of $Y_i$ on $\eta_i$ (covariance matrix $\Theta$).
SEM representation

Linear LGM with time-varying covariates

[Path diagram: y1 to y4 load on $\eta_0$ (loadings 1, 1, 1, 1) and $\eta_1$ (loadings 0, 1, 2, 3); time-varying covariates a1 to a4 predict the y variables.]

Fixed: $\nu = 0$.
Free parameters (Mplus output):
$\alpha$: means of $\eta_0$ and $\eta_1$,
$\Psi$: var($\eta_0$), var($\eta_1$), cov($\eta_0$, $\eta_1$),
res. var(y),
cov(a, $\eta_0$), cov(a, $\eta_1$),
B: regression coefficients of y on a.

LGM specifications

The LGM with time-invariant covariates

Structural regression model (level 2), with vector of predictors x:

$\eta_i = \alpha + \mathrm{B}\eta_i + \Gamma X_i + \zeta_i$,   (3)

where
$\eta_i$ (p×1): latent growth factors,
$\alpha$ (p×1): means of $\eta_i$, or intercepts in the regression of $\eta_i$ on $\eta_i$,
$\mathrm{B}$ (p×p): coefficients in the regression of $\eta_i$ on $\eta_i$,
$X_i$ (q×1): time-invariant covariate predictors of change,
$\Gamma$ (p×q): coefficients in the regression of $\eta$ on $X$,
$\zeta_i$ (p×1): residuals in the regression of $\eta_i$ on $\eta_i$ (covariance matrix $\Psi$).

SEM representation

The linear LGM with time-varying and time-invariant covariates

[Path diagram: y1 to y4 load on $\eta_0$ (loadings 1, 1, 1, 1) and $\eta_1$ (loadings 0, 1, 2, 3); time-varying covariates a1 to a4; time-invariant covariates x1 to x3.]

Fixed: $\nu = 0$.
Free parameters (Mplus output):
$\alpha$: intercepts of $\eta_0$ and $\eta_1$, means of a1-a4,
$\Psi$: res. var($\eta_0$), res. var($\eta_1$), res. cov($\eta_0$, $\eta_1$),
res. var(y),
cov(a, $\eta_0$), cov(a, $\eta_1$), cov(a, x),
B: regression coefficients of y on a,
regression coefficients of $\eta_0$ and $\eta_1$ on x.

LGM specifications

The linear LGM with time-varying, time-invariant covariates and a distal outcome

Consequences of change can be predicted, as outcomes, by the latent growth factors:

$Z_i = \omega + \mathrm{B}\eta_i + \xi_i$,   (4)

where
$Z_i$ (d×1): vector of distal outcomes of change,
$\mathrm{B}$ (d×p): matrix of coefficients in the regression of $Z$ on $\eta$,
$\omega$ (d×1): vector of regression intercepts for $Z$,
$\xi_i$ (d×1): residuals in the regression of $Z_i$ on $\eta_i$ (covariance matrix $\Psi$).
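To make equations (3) and (4) concrete, the following sketch generates data from a linear LGM with one time-invariant covariate and one distal outcome, then computes the expected trajectory for a given covariate value. It assumes no regressions among the growth factors in equation (3), measurement intercepts fixed at 0, normally distributed residuals and no time-varying covariates. Every numerical value and every variable name (`Gamma`, `beta_z`, `expected_growth_factors`, and so on) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2010)    # fixed seed so the sketch is reproducible

n, m = 500, 4                                       # hypothetical: 500 persons, 4 occasions
Lam = np.column_stack([np.ones(m), np.arange(m)])   # linear design matrix (m x 2)
alpha = np.array([10.0, 2.0])                       # intercepts of the growth factors
Gamma = np.array([[1.0], [-0.5]])                   # effects of X on intercept and slope (p x q)
Psi = np.array([[4.0, 0.5], [0.5, 1.0]])            # residual covariance of the growth factors
omega = 5.0                                         # intercept of the distal outcome
beta_z = np.array([0.3, 1.2])                       # effects of the growth factors on Z


def expected_growth_factors(x):
    """Expected growth factors given X = x, from equation (3) with B = 0."""
    return alpha + Gamma @ np.atleast_1d(x)


# Equation (3) with B = 0: growth factors given the covariate
X = rng.normal(size=(n, 1))
zeta = rng.multivariate_normal(np.zeros(2), Psi, size=n)
eta = alpha + X @ Gamma.T + zeta

# Equation (1) with nu = 0: repeated measures
Y = eta @ Lam.T + rng.normal(scale=1.0, size=(n, m))

# Equation (4) with a single distal outcome (d = 1)
Z = omega + eta @ beta_z + rng.normal(scale=1.0, size=n)

# Expected trajectory over the four occasions for a person with x = 2
print(Lam @ expected_growth_factors(2.0))
```

Simulated data of this kind are convenient for checking that a fitted model recovers known parameter values before it is applied to real data.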
SEM representation

The linear LGM with time-varying, time-invariant covariates and a distal outcome

[Path diagram: y1 to y4 load on $\eta_0$ (loadings 1, 1, 1, 1) and $\eta_1$ (loadings 0, 1, 2, 3); time-varying covariates a1 to a4; time-invariant covariates x1 to x3; distal outcome z.]

Fixed: $\nu = 0$.
Free parameters (Mplus output):
$\alpha$: intercepts of $\eta_0$ and $\eta_1$, means of a1-a4, intercept of z,
$\Psi$: res. var($\eta_0$), res. var($\eta_1$), res. cov($\eta_0$, $\eta_1$),
res. var(y),
cov(a, $\eta_0$), cov(a, $\eta_1$), cov(a, x),
B: regression coefficients of y on a,
regression coefficients of $\eta_0$ and $\eta_1$ on x,
regression coefficients of z on $\eta_0$ and $\eta_1$.

Illustration: data set 1

Clinical symptomatology, performance on the Trail Making Test (TMT) and consciousness disorders in schizophrenia
• 130 stabilized patients with schizophrenia (mean age = 31.0 years, IQ > 90, all on neuroleptic medication).
• Time to complete TMT parts A and B, measured separately at 4 equally spaced time points (t = 0, 2, 4 and 6 months).
• t = -1: scores on the Positive and Negative Syndrome Scale.

Illustration: data set 1

Trail Making Test: response time (t0 to t3; only N = 102 complete cases!)

Illustration: data set 1

Fitting a linear LGM with time-varying and time-invariant covariates to TMT data (N=102)

[Path diagram: outcomes B1 to B4 (TMT form B); time-varying covariates A1 to A4 (TMT form A); growth factors i and s; time-invariant covariates Dis, Pos, Neg, Host and Anx.]

Illustration: data set 1

Is the linear growth model tenable?
Growth shape   Columns of Λ                       #par  chi-square  df  p-value  CFI    TLI    AIC   BIC   SSABIC  RMSEA  SRMR
linear         (1 1 1 1), (0 1 2 3)               21    44.676      29  0.0316   0.957  0.938  9139  9194  9128    0.073  0.046
quadratic      (1 1 1 1), (0 1 2 3), (0 1 4 9)    27    44.049      23  0.0052   0.943  0.886  9151  9221  9136    0.095  0.048
piecewise      (1 1 1 1), (0 1 2 2), (0 0 0 1)    27    42.489      23  0.0080   0.947  0.903  9149  9220  9135    0.091  0.064

Illustration: data set 1

Conditional LGM: results (ML estimation)

                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
I ON
  DISORG          5.075     2.666     1.904       0.057
  POS             2.983     2.536     1.176       0.240
  NEG             0.089     2.562     0.035       0.972
  HOST           -3.696     2.875    -1.285       0.199
  ANX             4.272     2.817     1.516       0.129
S ON
  DISORG         -2.006     1.034    -1.940       0.052
  POS            -1.376     0.984    -1.400       0.162
  NEG             1.408     0.991     1.421       0.155
  HOST            1.222     1.115     1.095       0.273
  ANX            -0.360     1.092    -0.330       0.742

Illustration: data set 1

Conditional LGM: results (ML estimation)

                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
B1 ON A1          1.674     0.226     7.394       0.000
B2 ON A2          1.703     0.166    10.274       0.000
B3 ON A3          1.511     0.115    13.110       0.000
B4 ON A4          1.797     0.156    11.516       0.000

Illustration: data set 1

Conditional LGM: results (ML estimation)

                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
Intercepts
  B1              0.000     0.000   999.000     999.000
  B2              0.000     0.000   999.000     999.000
  B3              0.000     0.000   999.000     999.000
  B4              0.000     0.000   999.000     999.000
  I             -39.325    27.652    -1.422       0.155
  S               4.543    10.730     0.423       0.672

Illustration: data set 1

Conditional LGM: results (ML estimation)

Estimate   S.E.   Est./S.E.   Two-Tailed
P-Value

Residual Variances
  B1           3172.312   461.870     6.868       0.000
  B2           1034.587   164.132     6.303       0.000
  B3            387.629    75.508     5.134       0.000
  B4            378.444    72.855     5.194       0.000
  I             265.423    61.838     4.292       0.000
  S               0.000     0.000   999.000     999.000
R-SQUARE
  B1              0.395     0.061     6.427       0.000
  B2              0.584     0.055    10.594       0.000
  B3              0.801     0.041    19.526       0.000
  B4              0.770     0.045    17.118       0.000
  I               0.468     0.144     3.240       0.001
  S               1.000   999.000   999.000     999.000

GMM specification

Representing heterogeneity with respect to the growth factors and covariates.
GMM specifies a separate LGM for each of the K latent classes simultaneously:

$Y_{ik} = \nu_k + \Lambda_k \eta_{ik} + K_k X_{ik} + \varepsilon_{ik}$,   (5)

and

$\eta_{ik} = \alpha_k + \mathrm{B}_k \eta_{ik} + \Gamma_k X_{ik} + \zeta_{ik}$.   (6)

GMM specification

Modeling predictive effects of time-invariant covariates on latent class membership

Mixture components (c) are related to covariates through a multinomial logistic regression model:

$\Pr(C_i = k \mid X_i) = \dfrac{\exp(\pi_{0k} + \Gamma_{Ck} X_i)}{\sum_{h=1}^{K} \exp(\pi_{0h} + \Gamma_{Ch} X_i)}$,   (7)

with K as the reference class (so that $\pi_{0K} = 0$ and $\Gamma_{CK} = 0$), where
$\Gamma_{Ck}$ (1×q): vector of logistic regression coefficients of C on X,
$\pi_{0k}$: logistic regression intercept for class k relative to class K,
$X_i$ (q×1): vector of time-invariant covariate predictors of change.

GMM selection

Indices for determining the “best” GMM:
- Information-based criteria: BIC, SABIC.
- Nested-model likelihood ratio tests: Lo-Mendell-Rubin (LMR) LRT, bootstrapped LRT.
- Latent classification accuracy: entropy, average latent class probabilities for most likely latent class membership.

Illustration: data set 1

Mplus representation of a linear GMM fitted to TMT data (N=102).
[Path diagram: outcomes B1 to B4; time-varying covariates A1 to A4; growth factors i and s; covariates x (Disorg, Pos, Neg, Host, Anx); latent class variable c.]

Illustration: data set 1

Determining the “best” two-class growth model

Restrictions (overall in models 1 to 5; with differences between classes in model 6):
Model 1: var(i)=0, var(s)=0; x -> c.
Model 2: var(s)=0; x -> c.
Model 3: x -> c.
Model 4: res. var(i)=0, res. var(s)=0; x -> c, i, s.
Model 5: res. var(s)=0; x -> c, i, s.
Model 6: res. var(s)=0; x -> c; x -> i within class 1 and within class 2.

Model                  1       2       3       4       5       6
#par                  18      19      21      28      29      39
Starts (2000 20)      OK      OK      OK      OK      OK      OK
BIC                 4083    4083    4079    4102    4102    4095
SSABIC              4026    4023    4012    4014    4010    4004
Entropy            0.841   0.787   1       0.991   0.987   0.801
LMR LRT p-value    0.14    0.78    0.03    0.01    0.036   0.20
Nc1 (%)            28.50   76.17   87.25   93.03   93.03   25.41
Nc2 (%)            71.49   23.85   12.75    6.97    6.97   74.59

Illustration: data set 1

GMM results: TMT data (N=102)

Information Criteria
  Number of Free Parameters          29
  Akaike (AIC)                       4025.603
  Bayesian (BIC)                     4101.727
  Sample-Size Adjusted BIC           4010.126
    (n* = (n + 2) / 24)

FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS BASED ON ESTIMATED POSTERIOR PROBABILITIES
  Latent Classes
    1      7.10321    0.06964
    2     94.89679    0.93036

Illustration: data set 1

GMM results: TMT data (N=102)

CLASSIFICATION QUALITY
  Entropy    0.987

CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP
  Class Counts and Proportions
  Latent Classes
    1      7    0.06863
    2     95    0.93137

Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column)
           1        2
    1    0.994    0.006
    2    0.002    0.998

Illustration: data set 1

Growth Mixture model results: TMT data (N=102)

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 1 (H0) VERSUS 2 CLASSES
H0 Loglikelihood Value, 2 Times the Loglikelihood Difference, Difference in the
Number of Parameters:

  H0 Loglikelihood Value                        -2001.982
  2 Times the Loglikelihood Difference             36.361
  Difference in the Number of Parameters                8
  Mean                                             -7.722
  Standard Deviation                               35.246
  P-Value                                          0.0355

LO-MENDELL-RUBIN ADJUSTED LRT TEST
  Value                                            35.404
  P-Value                                          0.0383

Illustration: data set 1

Growth Mixture model results: TMT data (N=102)

Categorical Latent Variables

                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
C#1 ON
  DISORG          1.478     0.550     2.689       0.007
  POS             1.967     0.603     3.260       0.001
  NEG            -1.250     0.397    -3.150       0.002
  HOST           -2.240     0.869    -2.579       0.010
  ANX            -0.282     0.399    -0.706       0.480
Intercepts
  C#1            -1.700     3.014    -0.564       0.573

Illustration: data set 1

GMM: probability of class membership as a function of the value of each covariate: TMT data (N=102)

c#1 on          Coefficient   Value of each covariate (nine scenarios)
disorg             1.478       1      2      1      1      1      1      1      1      1
pos                1.967       1      1      2      3      4      1      1      1      1
neg               -1.250       1      1      1      1      1      2      3      1      1
host              -2.240       1      1      1      1      1      1      1      2      3
anx               -0.282       0      0      0      0      0      0      0      0      0
intercept c#1     -1.700

log odds (c=1)               -1.75  -0.27   0.22   2.19   4.16  -3.00  -4.25  -3.99  -6.23
log odds (c=2)                0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Prob(c=1)                     0.15   0.43   0.56   0.90   0.98   0.05   0.01   0.02   0.00
Prob(c=2)                     0.85   0.57   0.44   0.10   0.02   0.95   0.99   0.98   1.00

Illustration: data set 1

Growth Mixture model results: TMT data (N=102)

Latent class 1 = Latent class 2

Estimate   S.E.   Est./S.E.   Two-Tailed
P-Value

I ON
  DISORG          1.335     2.595     0.514       0.607
  POS            -1.365     2.703    -0.505       0.613
  NEG             4.387     2.412     1.819       0.069
  HOST            0.264     3.270     0.081       0.936
  ANX             5.051     2.659     1.900       0.057
S ON
  DISORG         -1.617     1.090    -1.483       0.138
  POS            -0.892     1.196    -0.746       0.456
  NEG             0.917     0.899     1.019       0.308
  HOST            0.780     1.585     0.492       0.622
  ANX            -0.434     1.206    -0.360       0.719

Illustration: data set 1

Growth Mixture model results: TMT data (N=102)

[Figure: class sizes Nc#1 = 7 and Nc#2 = 95.]

Illustration: data set 2

Data set 2: learning to read and development of phonological and morphological processing
• 344 children (6-7 years) tested 6 times (6 weeks between measurement occasions).
• t1 - 1: Raven Matrix (int).
• t1 to t6: 4 observed variables: Syllables Implicit Processing, Phonemes Implicit Processing, Syllables Explicit Processing, Phonemes Explicit Processing.
• t6 + 1 week: word reading (frequent words, rare words, pseudo-words).

Illustration: data set 2

Data set 2: descriptive statistics

[Figure: four panels, each plotted over occasions t0 to t5.]

Illustration: data set 2

SEM representation of a quadratic GGMM with time-invariant antecedents of change and a distal outcome (N=344)

[Path diagram: at each of the six occasions the indicators sip, pip, sep and pep load on a first-order factor (f1 to f6); the growth factors i, s and q are defined on f1 to f6; Int is the covariate and c the latent class variable.]
The distal outcome Lect. has three indicators: freq.,
rare, pseudo-words.

Multiple indicators GMM

Multiple indicator LGM

First-order factor scores: measurement model with (strong) invariance constraints,

$Y_i = \nu + \Lambda \eta_i + \varepsilon_i$.

Second-order growth factors:

$\eta_i = \Gamma \xi_i + \zeta_i$.

Factor scores as deviations from the group mean:

$\xi_i = \kappa + \upsilon_i$.

Second-order growth model:

$Y_i = \nu + \Lambda[\Gamma(\kappa + \upsilon_i) + \zeta_i] + \varepsilon_i$.

Multiple indicators GMM

Multiple indicator GMM

$Y_{ik} = \nu_k + \Lambda_k[\Gamma_k(\kappa_k + \upsilon_{ik}) + \zeta_{ik}] + \varepsilon_{ik}$.

First-order constraints: $\nu_k = \nu$, $\Lambda_k = \Lambda$, $\Psi_k = \Psi$, $\Theta_k = \Theta$.

Differences between latent classes:
- means $\kappa_k$,
- covariances $\Phi_k$,
- parameters for representing growth $\Gamma_k$.

Illustration: data set 2

Unconditional GMM: 2 classes vs 3 classes

Within each block, models differ in which of var(i), var(s) and var(q) are fixed at 0 and in whether var(i), var(s) and cov(i,s) differ between classes.

Two-class GMM
Parameters   BIC     SABIC   Entropy  LMR-LRT  Nc1 (%)  Nc2 (%)
 96          29953   29648   0.94     0.000    82.46    17.51
 98          29120   28809   0.697    0.016    39.25    60.75
100          29080   28763   0.794    0.015    74.80    25.20
103          29058   28732   0.804    0.000    23.37    76.63
106          29015   28679   0.754    0.000    66.39    33.61

Three-class GMM
Parameters   BIC     SABIC   Entropy  LMR-LRT  Nc1 (%)  Nc2 (%)  Nc3 (%)
100          29459   29141   0.944    0.000    32.78    66.93     0.29
102          29062   28738   0.718    0.190    36.74    35.77    27.49
104          29057   28727   0.762    0.540    67.89    11.72    20.69
107          29028   28688   0.858    0.140     8.71    61.08    30.21

Illustration: data set 2

Three-class GMM with int as covariate, without (overall) and with (between) class differences

var(i) var(s) var(q) covariate class1 class2 class3 Parameters BIC SABIC Entropy LMR-LRT Nc1 overall 0 0 0 c on x overall overall between overall between between 111 33330 32978 0,916 0,023 56,16 114 32473 32112 0,991 0,000 11,69 0 0 c on x i on x i on x i on x 116 32622 32254 0,820 0,204 49,46 Nc2 34,90 81,28 36,64 11,73 81,10 11,65 80,99
80,99 Nc3 8,94 7,03 13,90 80,80 7,46 7,13 7,55 11,36 0 0 c i on x between 0 c i s on x 0 c on x c i s q on x i s on x i s on x i s on x 117 121 121 32259 32313 32263 31888 31930 31879 0,987 0,986 0,990 0,004 0,07 0,001 7,48 11,44 81,22 c on x i s q on x i s q on x i s q on x 127 32243 31841 0,986 0,013 11,46 c on x i s q on x, cov. i s q i s q on x, cov. i s q i s q on x, cov. i s q 139 32286 31845 0,987 0,050 7,66

Illustration: data set 2

Conditional GMM: estimated means

[Figure: estimated means.]

Illustration: data set 2

GMM results: information criteria and quality of classification

Information Criteria
  Number of Free Parameters          127
  Akaike (AIC)                       31755.780
  Bayesian (BIC)                     32243.542
  Sample-Size Adjusted BIC           31840.665
    (n* = (n + 2) / 24)

FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS BASED ON ESTIMATED POSTERIOR PROBABILITIES
  Latent Classes
    1    278.61914    0.80994
    2     39.41000    0.11456
    3     25.97086    0.07550

Illustration: data set 2

GMM results: information criteria and quality of classification

  Entropy    0.986

CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP
  Class Counts and Proportions
  Latent Classes
    1    280    0.81395
    2     38    0.11047
    3     26    0.07558

Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column)
           1        2        3
    1    0.995    0.005    0.000
    2    0.003    0.990    0.007
    3    0.000    0.011    0.989

Illustration: data set 2

GMM results: intercepts of i, s and q

Class 1
                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
Intercepts
  I               3.693     0.275    13.451       0.000
  S               1.103     0.145     7.632       0.000
  Q              -0.095     0.027    -3.559       0.000
Residual Variances
  I               0.961     0.106     9.084       0.000
  S               0.152     0.031     4.924       0.000
  Q               0.005     0.001     5.221       0.000

June 2-4, 2010 -
Saint-Raphaël INSERM workshop: Mixture modelling for longitudinal data

Illustration: data set 2

GMM results: intercepts of i, s and q

Class 2
                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
Intercepts
  I               2.616     0.420     6.223       0.000
  S               1.907     0.284     6.725       0.000
  Q              -0.254     0.055    -4.617       0.000
Residual Variances
  I               0.961     0.106     9.084       0.000
  S               0.152     0.031     4.924       0.000
  Q               0.005     0.001     5.221       0.000

Illustration: data set 2

GMM results: intercepts of i, s and q

Class 3
                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
Intercepts
  I               0.000     0.000   999.000     999.000
  S               1.127     0.354     3.187       0.001
  Q               0.077     0.068    -1.137       0.256
(linear trend in class 3 when fixing q@0)
Residual Variances
  I               0.961     0.106     9.084       0.000
  S               0.152     0.031     4.924       0.000
  Q               0.005     0.001     5.221       0.000

Illustration: data set 2

GMM results: coefficients of the regression of the categorical latent variable c on the covariate

Categorical Latent Variables
                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
C#1 ON
  INTNV           0.172     0.058     2.969       0.003
C#2 ON
  INTNV           0.044     0.076     0.575       0.565
Intercepts
  C#1             0.392     0.709     0.553       0.580
  C#2            -0.052     0.925    -0.056       0.955

Illustration: data set 2

GMM results: probability of class membership

c#1 on int        0.172
c#2 on int        0.044
intercept c#1     0.392
intercept c#2    -0.052

Value of int      0.5      1        2        5        10
log odds (c=1)    0.478    0.564    0.736    1.252    2.112
log odds (c=2)   -0.030   -0.008    0.036    0.168    0.388
log odds (c=3)    0        0        0        0        0
Prob(c=1)         0.45     0.47     0.51     0.62     0.77
Prob(c=2)         0.27     0.26     0.25     0.21     0.14
Prob(c=3)         0.28     0.27     0.24     0.18     0.09

Illustration: data set 2

Estimated probabilities for c as a function of int level

[Figure: estimated class probabilities plotted against int level.]

Illustration: data set 2

GMM results: regression of i, s and q on the covariate

Class 1 (columns: Estimate, S.E., Est./S.E., Two-Tailed P-Value)
I ON INTNV        0.122     0.020     6.050       0.000
S ON INTNV       -0.033     0.011    -2.939
0.003
Q ON INTNV        0.003     0.002     1.567       0.117
S WITH I         -0.008     0.040    -0.206       0.837
Q WITH I         -0.015     0.007    -2.309       0.021
Q WITH S         -0.026     0.005    -4.943       0.000

Illustration: data set 2

GMM results: regression of i, s and q on the covariate

Class 2
                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
I ON INTNV        0.140     0.040     3.477       0.001
S ON INTNV       -0.095     0.025    -3.802       0.000
Q ON INTNV        0.015     0.005     3.136       0.002
S WITH I         -0.008     0.040    -0.206       0.837
Q WITH I         -0.015     0.007    -2.309       0.021
Q WITH S         -0.026     0.005    -4.943       0.000

Illustration: data set 2

GMM results: regression of i, s and q on the covariate

Class 3
                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
I ON INTNV        0.341     0.022    15.275       0.000
S ON INTNV       -0.037     0.034    -1.085       0.278
Q ON INTNV        0.002     0.007     0.297       0.766
S WITH I         -0.008     0.040    -0.206       0.837
Q WITH I         -0.015     0.007    -2.309       0.021
Q WITH S         -0.026     0.005    -4.943       0.000

Illustration: data set 2

GMM results: reading proficiency level for each class

                Estimate     S.E.   Est./S.E.   Two-Tailed P-Value
Class 1 Means
  LECT            7.508     0.434    17.288       0.000
Class 2 Means
  LECT            4.430     0.287    15.455       0.000
Class 3 Means
  LECT            0.000     0.000   999.000     999.000

Concluding remarks

Interest, limitations, cautions

GMM is a promising approach for modeling heterogeneous latent change across unobserved population subgroups. But:
- GMM usually relies on large samples.
- The search for heterogeneity should be conducted in a principled and disciplined way; the best guide to GMM selection is to compare models derived from theory.
- GMM always identifies groups.
- The role that covariates play in the class enumeration process has to be clarified.
- An important question: how to model missing data on the x variables?
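As a numerical check of the multinomial logistic model for class membership, equation (7), the sketch below recomputes class probabilities from coefficients reported in the talk: for data set 2, C#1 ON INTNV = 0.172, C#2 ON INTNV = 0.044 and intercepts 0.392 and -0.052, with class 3 as reference; for data set 1, the C#1 ON coefficients for disorg, pos, neg, host and anx with intercept -1.700, with class 2 as reference. The function name `class_probabilities` and its argument layout are assumptions made for illustration; this is not Mplus code.

```python
import numpy as np


def class_probabilities(intercepts, slopes, x):
    """Multinomial-logit class probabilities; the last class is the reference."""
    intercepts = np.asarray(intercepts, dtype=float)   # shape (K-1,)
    slopes = np.asarray(slopes, dtype=float)           # shape (K-1, q)
    x = np.atleast_1d(np.asarray(x, dtype=float))      # shape (q,)
    # Log odds of each class versus the reference class, whose log odds are 0
    logits = np.append(intercepts + slopes @ x, 0.0)
    expl = np.exp(logits - logits.max())               # subtract max for stability
    return expl / expl.sum()


# Data set 2: three classes, one covariate (int)
intercepts2 = [0.392, -0.052]
slopes2 = [[0.172], [0.044]]
for level in (0.5, 1, 2, 5, 10):
    print(level, np.round(class_probabilities(intercepts2, slopes2, level), 2))

# Data set 1: two classes, covariates (disorg, pos, neg, host, anx) = (1, 1, 1, 1, 0)
p1 = class_probabilities([-1.700],
                         [[1.478, 1.967, -1.250, -2.240, -0.282]],
                         [1, 1, 1, 1, 0])
print(np.round(p1, 2))
```

Rounded to two decimals, the probabilities for int = 0.5 are 0.45, 0.27 and 0.28, and the probability of class 1 in the data set 1 scenario is 0.15, in agreement with the values reported for the two illustrations.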