Latent variable modeling of
psychological longitudinal data:
taking into account the unobserved
heterogeneity using Mplus
Jacques Juhel
University Rennes 2, CRPCC, EA 1285
June 2-4, 2010 - Saint-Raphaël
INSERM workshop : Mixture modelling for longitudinal data
Introduction
Studying individual differences in learning, change and development
A double compromise:
• random effect model,
• classification techniques.
Introduction
(Among other methods) the GMM approach of Muthén and colleagues
A technique for longitudinal data that:
• combines categorical and continuous latent variables in the same model (“beyond SEM”),
• accommodates unobserved heterogeneity in the sample,
• allows the latent growth parameters within each class to be influenced by time-varying covariates and time-invariant predictor variables,
• incorporates consequent outcomes predicted by the latent class variable.
LGM specifications
The LGM for a continuous outcome: the multivariate latent variable approach
Factor analysis measurement model (level 1):
$Y_i = \nu + \Lambda \eta_i + \varepsilon_i$, (1)
$Y_i$ ($m \times 1$): repeated measures over fixed time points,
$\nu$ ($m \times 1$): intercepts in the regression of $Y_i$ on $\eta_i$,
$\eta_i$ ($p \times 1$): latent growth factors,
$\Lambda$ ($m \times p$): design matrix of factor loadings,
$\varepsilon_i$ ($m \times 1$): residuals in the regression of $Y_i$ on $\eta_i$ (covariance matrix $\Theta$).
LGM specifications
The LGM for a continuous outcome: the multivariate latent variable approach
Structural regression model (level 2):
$\eta_i = \alpha + B \eta_i + \zeta_i$, (2)
$\alpha$ ($p \times 1$): means of $\eta_i$ or intercepts in the regression of $\eta_i$ on $\eta_i$,
$B$ ($p \times p$): regression coefficients in the regression of $\eta_i$ on $\eta_i$,
$\eta_i$ ($p \times 1$): latent growth factors,
$\zeta_i$ ($p \times 1$): residuals in the regression of $\eta_i$ on $\eta_i$ (covariance matrix $\Psi$).
LGM assumptions
The LGM for a continuous outcome: the multivariate latent variable approach
The covariance and mean structure are derived for the population under the hypothesis that:
$\varepsilon$, $\zeta$ and $\eta$ are mutually uncorrelated,
$E[\varepsilon]$ and $E[\zeta]$ equal 0.
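Under these assumptions the mean and covariance structure follow from equations (1) and (2). The following is a sketch rather than a statement taken from the slides: it assumes that $(I - B)$ is invertible and writes $\Theta$ and $\Psi$ for the covariance matrices of the level-1 and level-2 residuals $\varepsilon_i$ and $\zeta_i$. Solving (2) for $\eta_i$ and substituting into (1) gives:

```latex
\begin{aligned}
\eta_i &= (I - B)^{-1} (\alpha + \zeta_i), \\
\mu = E[Y_i] &= \nu + \Lambda (I - B)^{-1} \alpha, \\
\Sigma = \operatorname{Cov}(Y_i) &= \Lambda (I - B)^{-1} \Psi (I - B)^{-\top} \Lambda^{\top} + \Theta .
\end{aligned}
```

With $B = 0$, as in an unconditional growth model, these reduce to $\mu = \nu + \Lambda \alpha$ and $\Sigma = \Lambda \Psi \Lambda^{\top} + \Theta$.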
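SEM representation
The unconditional linear LGM
$Y_i = \Lambda \eta_i + \varepsilon_i$, (1)
$\eta_i = \alpha + \zeta_i$, (2)
[Path diagram: y1-y4 load on the growth factors η0 (loadings 1, 1, 1, 1) and η1 (loadings 0, 1, 2, 3).]
Constraints shown in the diagram: ν = 0, B = 0, Λ with columns (1, 1, 1, 1) and (0, 1, 2, 3).
Free parameters (Mplus output):
α: means of η0 and η1,
Ψ: var(η0), var(η1), cov(η0, η1), res. var(y).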
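To make the unconditional linear model concrete, the sketch below computes the means and covariances it implies for four occasions, using mu = Λα and Sigma = ΛΨΛ' + Θ with ν = 0 and B = 0. The function name `implied_moments` and all numeric parameter values are hypothetical and chosen only for illustration; they are not estimates from the slides.

```python
def implied_moments(loadings, alpha, psi, theta_diag):
    """Model-implied moments of an unconditional LGM (nu = 0, B = 0).

    mu = Lambda alpha, Sigma = Lambda Psi Lambda' + Theta (Theta diagonal).
    """
    m, p = len(loadings), len(alpha)
    mu = [sum(loadings[t][j] * alpha[j] for j in range(p)) for t in range(m)]
    sigma = [
        [
            sum(loadings[t][j] * psi[j][k] * loadings[u][k]
                for j in range(p) for k in range(p))
            + (theta_diag[t] if t == u else 0.0)
            for u in range(m)
        ]
        for t in range(m)
    ]
    return mu, sigma


# Loadings from the diagram: intercept factor (all 1) and linear slope (0..3).
LAMBDA = [[1, 0], [1, 1], [1, 2], [1, 3]]

# Hypothetical parameter values, for illustration only.
alpha = [10.0, 2.0]              # means of eta0 and eta1
psi = [[4.0, 0.5], [0.5, 1.0]]   # var(eta0), cov(eta0, eta1), var(eta1)
theta = [1.0, 1.0, 1.0, 1.0]     # residual variances of y1..y4

mu, sigma = implied_moments(LAMBDA, alpha, psi, theta)
print(mu)            # implied means at the four occasions
print(sigma[3][3])   # implied variance of y4
```

The implied means increase linearly by the slope mean, and the implied variances grow with time because the slope variance is weighted by the squared time score.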
LGM specifications
The LGM with time-varying covariates
Factor analysis measurement model (level 1):
$Y_i = \nu + \Lambda \eta_i + \mathrm{K} a_i + \varepsilon_i$, (1bis)
$Y_i$ ($m \times 1$): repeated measures over fixed time points,
$\nu$ ($m \times 1$): intercepts in the regression of $Y_i$ on $\eta_i$,
$\eta_i$ ($p \times 1$): latent growth factors,
$\Lambda$ ($m \times p$): design matrix of factor loadings,
$\mathrm{K}$ ($m \times r$): coefficients in the regression of $Y_i$ on the time-varying covariates $a_i$,
$\varepsilon_i$ ($m \times 1$): residuals in the regression of $Y_i$ on $\eta_i$ (covariance matrix $\Theta$).
SEM representation
Linear LGM with time-varying covariates
[Path diagram: y1-y4 load on the growth factors η0 (loadings 1, 1, 1, 1) and η1 (loadings 0, 1, 2, 3); y1-y4 are regressed on the time-varying covariates a1-a4.]
Constraint shown in the diagram: ν = 0.
Free parameters (Mplus output):
α: means of η0 and η1,
Ψ: var(η0), var(η1), cov(η0, η1), res. var(y), cov(a, η0), cov(a, η1),
B: regression coefficients from y on a.
LGM specifications
The LGM with time-invariant covariates
Structural regression model (level 2), with vector of predictors x:
$\eta_i = \alpha + B \eta_i + \Gamma X_i + \zeta_i$, (3)
$\eta_i$ ($p \times 1$): latent growth factors,
$\alpha$ ($p \times 1$): means of $\eta_i$ or intercepts in the regression of $\eta_i$ on $\eta_i$,
$B$ ($p \times p$): regression coefficients in the regression of $\eta_i$ on $\eta_i$,
$X_i$ ($q \times 1$): time-invariant covariate predictors of change,
$\Gamma$ ($p \times q$): regression coefficients in the regression of $\eta$ on $X$,
$\zeta_i$ ($p \times 1$): residuals in the regression of $\eta_i$ on $\eta_i$ (covariance matrix $\Psi$).
SEM representation
The linear LGM with time-varying and time-invariant covariates
[Path diagram: y1-y4 load on the growth factors η0 (loadings 1, 1, 1, 1) and η1 (loadings 0, 1, 2, 3); y1-y4 are regressed on the time-varying covariates a1-a4; η0 and η1 are regressed on the time-invariant covariates x1-x3.]
Constraint shown in the diagram: ν = 0.
Free parameters (Mplus output):
α: intercepts of η0 and η1, means of a1-a4,
Ψ: res. var(η0), res. var(η1), res. cov(η0, η1), res. var(y), cov(a, η0), cov(a, η1), cov(a, x),
B: regression coefficients from y on a, regression coefficients from η0 and η1 on x.
LGM specifications
The linear LGM with time-varying, time-invariant covariates and a distal outcome
Consequences of change, treated as outcomes, can be predicted by the latent growth factors:
$Z_i = \omega + B \eta_i + \xi_i$, (4)
$Z_i$ ($d \times 1$): vector of distal outcomes of change,
$B$ ($d \times p$): matrix of regression coefficients from $Z$ on $\eta$,
$\omega$ ($d \times 1$): vector of regression intercepts for $Z$,
$\xi_i$ ($d \times 1$): residuals in the regression of $Z_i$ on $\eta_i$ (covariance matrix $\Psi$).
SEM representation
The linear LGM with time-varying, time-invariant covariates and a distal outcome
[Path diagram: y1-y4 load on the growth factors η0 (loadings 1, 1, 1, 1) and η1 (loadings 0, 1, 2, 3); y1-y4 are regressed on the time-varying covariates a1-a4; η0 and η1 are regressed on the time-invariant covariates x1-x3; the distal outcome z is regressed on η0 and η1.]
Constraint shown in the diagram: ν = 0.
Free parameters (Mplus output):
α: intercepts of η0 and η1, means of a1-a4, intercept of z,
Ψ: res. var(η0), res. var(η1), res. cov(η0, η1), res. var(y), cov(a, η0), cov(a, η1), cov(a, x),
B: regression coefficients from y on a, regression coefficients from η0 and η1 on x, regression coefficients from z on η0 and η1.
Illustration: data set 1
Clinical symptomatology, performance on the TMT and consciousness disorders in schizophrenia
• 130 stabilized patients with schizophrenia (M = 31.0 years, IQ > 90, all on neuroleptic medication).
• Time to complete TMT parts A and B separately at 4 equally spaced time points (t=0, t=2, t=4 and t=6 months).
• t=-1: scores on the Positive and Negative Syndrome Scale.
Illustration: data set 1
Trail Making Test: response time (t0 to t3; only N = 102 complete cases!)
Illustration: data set 1
Fitting a linear LGM with time-varying and time-invariant covariates to TMT data (N=102)
[Path diagram: TMT form B times B1-B4 define the growth factors i and s; TMT form A times A1-A4 are the time-varying covariates; Dis, Pos, Neg, Host and Anx are the time-invariant covariates.]
Illustration: data set 1
Is the linear growth model tenable?

Growth shape      linear     quadratic   piecewise
Λ (rows t0-t3)    1 0        1 0 0       1 0 0
                  1 1        1 1 1       1 1 0
                  1 2        1 2 4       1 2 0
                  1 3        1 3 9       1 2 1
Fit indices
#par              21         27          27
chi-square        44.676     44.049      42.489
df                29         23          23
p-value           0.0316     0.0052      0.0080
CFI               0.957      0.943       0.947
TLI               0.938      0.886       0.903
AIC               9139       9151        9149
BIC               9194       9221        9220
SSABIC            9128       9136        9135
RMSEA             0.073      0.095       0.091
SRMR              0.046      0.048       0.064
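The RMSEA values reported for the three growth shapes can be checked from the chi-square, its degrees of freedom and the sample size (N = 102). The sketch below assumes the point estimate sqrt(max(chi2 - df, 0) / (df * N)); some programs use N - 1 in the denominator instead, so the choice of N here is an assumption that happens to agree with the reported values to three decimals.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA point estimate, assuming the (df * n) denominator."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))


# Chi-square and degrees of freedom from the fit table (N = 102).
fits = {
    "linear": (44.676, 29),
    "quadratic": (44.049, 23),
    "piecewise": (42.489, 23),
}

results = {shape: round(rmsea(chi2, df, 102), 3) for shape, (chi2, df) in fits.items()}
for shape, value in results.items():
    print(shape, value)
```

The linear model has the smallest RMSEA and the lowest AIC and BIC of the three, which is the basis for retaining the linear growth shape.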
Illustration: data set 1
Conditional LGM: results (ML estimation)

                 Estimate     S.E.  Est./S.E.  Two-Tailed P-Value
I ON
  DISORG            5.075    2.666      1.904   0.057
  POS               2.983    2.536      1.176   0.240
  NEG               0.089    2.562      0.035   0.972
  HOST             -3.696    2.875     -1.285   0.199
  ANX               4.272    2.817      1.516   0.129
S ON
  DISORG           -2.006    1.034     -1.940   0.052
  POS              -1.376    0.984     -1.400   0.162
  NEG               1.408    0.991      1.421   0.155
  HOST              1.222    1.115      1.095   0.273
  ANX              -0.360    1.092     -0.330   0.742
Illustration: data set 1
Conditional LGM: results (ML estimation)

                 Estimate     S.E.  Est./S.E.  Two-Tailed P-Value
B1 ON A1            1.674    0.226      7.394   0.000
B2 ON A2            1.703    0.166     10.274   0.000
B3 ON A3            1.511    0.115     13.110   0.000
B4 ON A4            1.797    0.156     11.516   0.000
Illustration: data set 1
Conditional LGM: results (ML estimation)

                 Estimate     S.E.  Est./S.E.  Two-Tailed P-Value
Intercepts
  B1                0.000    0.000    999.000   999.000
  B2                0.000    0.000    999.000   999.000
  B3                0.000    0.000    999.000   999.000
  B4                0.000    0.000    999.000   999.000
  I               -39.325   27.652     -1.422   0.155
  S                 4.543   10.730      0.423   0.672
Illustration: data set 1
Conditional LGM: results (ML estimation)

                 Estimate     S.E.  Est./S.E.  Two-Tailed P-Value
Residual Variances
  B1             3172.312  461.870      6.868   0.000
  B2             1034.587  164.132      6.303   0.000
  B3              387.629   75.508      5.134   0.000
  B4              378.444   72.855      5.194   0.000
  I               265.423   61.838      4.292   0.000
  S                 0.000    0.000    999.000   999.000
R-SQUARE
  B1                0.395    0.061      6.427   0.000
  B2                0.584    0.055     10.594   0.000
  B3                0.801    0.041     19.526   0.000
  B4                0.770    0.045     17.118   0.000
  I                 0.468    0.144      3.240   0.001
  S                 1.000  999.000    999.000   999.000
GMM specification
Representing heterogeneity with respect to the growth factors and covariates.
GMM specifies a separate LGM for each of the K latent classes simultaneously:
$Y_{ik} = \nu_k + \Lambda_k \eta_{ik} + \mathrm{K}_k X_{ik} + \varepsilon_{ik}$, (5)
and
$\eta_{ik} = \alpha_k + B_k \eta_{ik} + \Gamma_k X_{ik} + \zeta_{ik}$. (6)
GMM specification
Modeling predictive effects of time-invariant covariates on latent class membership
Mixture components (c) are related to covariates through a multinomial logistic regression model:
$\Pr(C_i = k \mid X_i) = \dfrac{\exp(\pi_{0k} + \Gamma_{Ck} X_i)}{\sum_{h=1}^{K} \exp(\pi_{0h} + \Gamma_{Ch} X_i)}$, (7)
with the reference class K,
$\Gamma_{Ck}$ ($1 \times q$): vector of logistic regression coefficients from C on X,
$\pi_{0k}$: logistic regression intercept for class k relative to class K,
$X_i$ ($q \times 1$): vector of time-invariant covariate predictors of change.
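Equation (7) can be evaluated directly once the intercepts and coefficients of the K-1 non-reference classes are known; the reference class K has both fixed at 0, so its log odds are 0. The sketch below is a minimal illustration in which the function name `class_probabilities` is hypothetical. The example values are the three-class estimates reported later in the deck for data set 2 (c#1 on int = 0.172, c#2 on int = 0.044, intercepts 0.392 and -0.052), evaluated at int = 0.5.

```python
import math


def class_probabilities(intercepts, gammas, x):
    """Class membership probabilities from a multinomial logistic model.

    intercepts: K-1 intercepts (classes 1..K-1, relative to class K).
    gammas:     K-1 coefficient vectors, each of length q.
    x:          covariate vector of length q.
    The reference class K is appended with log odds fixed at 0.
    """
    logits = [
        pi0 + sum(g * xj for g, xj in zip(gamma, x))
        for pi0, gamma in zip(intercepts, gammas)
    ]
    logits.append(0.0)  # reference class
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]


probs = class_probabilities([0.392, -0.052], [[0.172], [0.044]], [0.5])
print([round(p, 2) for p in probs])  # → [0.45, 0.27, 0.28]
```

Because only differences in log odds matter, fixing the reference class at 0 loses no generality; choosing another reference class changes the coefficients but not the probabilities.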
GMM selection
Indices for determining the “best” GMM
- Information-based criteria:
BIC, SABIC
- Nested model likelihood ratio tests:
LMR (Lo-Mendell-Rubin) LRT, bootstrapped LRT
- Latent classification accuracy:
Entropy, average latent class probabilities for most likely latent class membership
Illustration: data set 1
Mplus representation of a linear GMM fitted to TMT data (N=102).
[Path diagram: TMT form B times B1-B4 define the growth factors i and s; TMT form A times A1-A4 are the time-varying covariates; the covariates x (Disorg, Pos, Neg, Host, Anx) and the latent class variable c are added to the model.]
Illustration: data set 1
Determining the “best” two-class growth model
Model specifications (overall restrictions and differences between classes):
Model 1: var(i)=0, var(s)=0; x -> c
Model 2: var(s)=0; x -> c
Model 3: x -> c
Model 4: res. var(i)=0, res. var(s)=0; x -> c i s
Model 5: res. var(s)=0; x -> c i s
Model 6: res. var(s)=0; x -> c; x -> i in class 1 and in class 2

Model               1       2       3       4       5       6
#par               18      19      21      28      29      39
starts (2000 20)   OK      OK      OK      OK      OK      OK
BIC              4083    4083    4079    4102    4102    4095
SSABIC           4026    4023    4012    4014    4010    4004
Entropy         0.841   0.787       1   0.991   0.987   0.801
LMR LRT p-value  0.14    0.78    0.03    0.01   0.036    0.20
Nc1 (%)         28.50   76.17   87.25   93.03   93.03   25.41
Nc2 (%)         71.49   23.85   12.75    6.97    6.97   74.59
Illustration: data set 1
GMM results: TMT data (N=102)
Information Criteria
Number of Free Parameters     29
Akaike (AIC)                  4025.603
Bayesian (BIC)                4101.727
Sample-Size Adjusted BIC      4010.126
  (n* = (n + 2) / 24)
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS
BASED ON ESTIMATED POSTERIOR PROBABILITIES
Latent Classes
  1      7.10321    0.06964
  2     94.89679    0.93036
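The three information criteria above are all functions of the log-likelihood, the number of free parameters and the sample size. The sketch below uses the standard definitions (AIC = -2LL + 2p, BIC = -2LL + p ln n, sample-size adjusted BIC with n replaced by n* = (n + 2) / 24). The log-likelihood of the two-class model is not printed on this slide, so it is backed out here from the reported AIC; that derived value and the function name `information_criteria` are assumptions of this example.

```python
import math


def information_criteria(loglik, n_params, n):
    """AIC, BIC and sample-size adjusted BIC from a log-likelihood."""
    deviance = -2.0 * loglik
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n),
        "SABIC": deviance + n_params * math.log((n + 2) / 24),
    }


# Log-likelihood implied by the reported AIC (4025.603) with 29 parameters.
loglik = (2 * 29 - 4025.603) / 2

ic = information_criteria(loglik, n_params=29, n=102)
for name, value in ic.items():
    print(name, round(value, 3))
```

With n = 102 the BIC penalty per parameter is ln(102), about 4.62, against about 1.47 for the adjusted BIC, which is why the two criteria can rank models with many parameters differently in small samples.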
Illustration: data set 1
GMM results: TMT data (N=102)
CLASSIFICATION QUALITY
Entropy     0.987
CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP
Class Counts and Proportions
Latent classes
  1      7    0.06863
  2     95    0.93137
Average Latent Class Probabilities for Most Likely Latent Class Membership (Row)
by Latent Class (Column)
           1        2
  1    0.994    0.006
  2    0.002    0.998
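The classification summaries above are all computed from the matrix of posterior class probabilities. The sketch below shows, on a small hypothetical posterior matrix (not the TMT data), how class counts based on posterior probabilities differ from counts based on most likely membership, and how the relative entropy 1 - sum_i sum_k(-p_ik ln p_ik) / (n ln K) is obtained. The function names are hypothetical.

```python
import math


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; 1 means perfectly separated classes."""
    n, k = len(posteriors), len(posteriors[0])
    total = sum(-p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - total / (n * math.log(k))


def class_counts(posteriors):
    """Counts from summed posteriors (soft) and from modal assignment (hard)."""
    k = len(posteriors[0])
    soft = [sum(row[c] for row in posteriors) for c in range(k)]
    hard = [0] * k
    for row in posteriors:
        hard[row.index(max(row))] += 1
    return soft, hard


# Hypothetical posterior probabilities for four individuals and two classes.
posteriors = [[0.99, 0.01], [0.02, 0.98], [0.01, 0.99], [0.60, 0.40]]

soft, hard = class_counts(posteriors)
entropy = relative_entropy(posteriors)
print(soft, hard, round(entropy, 3))
```

The fourth individual is assigned to class 1 by modal assignment but contributes only 0.60 to the posterior-based count, which is why the two sets of counts can differ and why poorly classified individuals lower the entropy.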
Illustration: data set 1
Growth Mixture model results: TMT data (N=102)
VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 1 (H0) VERSUS 2 CLASSES
H0 Loglikelihood Value                     -2001.982
2 Times the Loglikelihood Difference          36.361
Difference in the Number of Parameters             8
Mean                                          -7.722
Standard Deviation                            35.246
P-Value                                       0.0355
LO-MENDELL-RUBIN ADJUSTED LRT TEST
Value                                         35.404
P-Value                                       0.0383
Illustration: data set 1
Growth Mixture model results: TMT data (N=102)
Categorical Latent Variables

                 Estimate     S.E.  Est./S.E.  Two-Tailed P-Value
C#1 ON
  DISORG            1.478    0.550      2.689   0.007
  POS               1.967    0.603      3.260   0.001
  NEG              -1.250    0.397     -3.150   0.002
  HOST             -2.240    0.869     -2.579   0.010
  ANX              -0.282    0.399     -0.706   0.480
Intercepts
  C#1              -1.700    3.014     -0.564   0.573
Illustration: data set 1
GMM: probability of class membership as a function of the value on each of the covariates: TMT data (N=102)
Coefficients (c#1 on): disorg 1.478, pos 1.967, neg -1.250, host -2.240, anx -0.282; intercept c#1 -1.700

Value on each of the covariates
disorg               1      2      1      1      1      1      1      1      1
pos                  1      1      2      3      4      1      1      1      1
neg                  1      1      1      1      1      2      3      1      1
host                 1      1      1      1      1      1      1      2      3
anx                  0      0      0      0      0      0      0      0      0
log odds (c=1)   -1.75  -0.27   0.22   2.19   4.16  -3.00  -4.25  -3.99  -6.23
log odds (c=2)    0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Prob(c=1)         0.15   0.43   0.56   0.90   0.98   0.05   0.01   0.02   0.00
Prob(c=2)         0.85   0.57   0.44   0.10   0.02   0.95   0.99   0.98   1.00
Illustration: data set 1
Growth Mixture model results: TMT data (N=102)
Latent class 1 = Latent class 2

                 Estimate     S.E.  Est./S.E.  Two-Tailed P-Value
I ON
  DISORG            1.335    2.595      0.514   0.607
  POS              -1.365    2.703     -0.505   0.613
  NEG               4.387    2.412      1.819   0.069
  HOST              0.264    3.270      0.081   0.936
  ANX               5.051    2.659      1.900   0.057
S ON
  DISORG           -1.617    1.090     -1.483   0.138
  POS              -0.892    1.196     -0.746   0.456
  NEG               0.917    0.899      1.019   0.308
  HOST              0.780    1.585      0.492   0.622
  ANX              -0.434    1.206     -0.360   0.719
Illustration: data set 1
Growth Mixture model results: TMT data (N=102)
Nc#1 = 7
Nc#2 = 95
Illustration: data set 2
Data set 2: Learning to read and development of phonological and morphological processing
• 344 children (6-7 years) tested 6 times (6 weeks between measurement occasions)
• t1-1: Raven Matrix (int)
• t1 – t6: 4 observed variables: Syllables Implicit Processing, Phonemes Implicit Processing, Syllables Explicit Processing, Phonemes Explicit Processing.
• t6 + 1 week: Word reading (frequent words, rare words, pseudo-words)
Illustration: data set 2
Data set 2: descriptive statistics
[Figure: four panels, each plotted over the measurement occasions t0 to t5.]
Illustration: data set 2
SEM representation of a quadratic GGMM with time-invariant antecedents of change and a distal outcome (N=344)
[Path diagram: at each of the six occasions, the indicators sip, pip, sep and pep measure a first-order factor (f1-f6); the growth factors i, s and q are defined on f1-f6; Int is the covariate, c the latent class variable, and Lect. the distal outcome measured by frequent words, rare words and pseudo-words.]
Multiple indicators GMM
Multiple indicator LGM
First-order factor scores: measurement model with (strong) invariance constraints
$Y_i = \nu + \Lambda \eta_i + \varepsilon_i$,
Second-order growth factors:
$\eta_i = \Gamma \xi_i + \zeta_i$,
Factor scores as deviations from the group mean:
$\xi_i = \kappa + \upsilon_i$,
Second-order growth model:
$Y_i = \nu + \Lambda \left[ \Gamma (\kappa + \upsilon_i) + \zeta_i \right] + \varepsilon_i$.
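As a sketch of what the combined second-order equation implies, assume that $\upsilon_i$, $\zeta_i$ and $\varepsilon_i$ have zero means and are mutually uncorrelated, with covariance matrices $\Phi$, $\Psi$ and $\Theta$ respectively (the symbols used for the class-specific and class-invariant parameters in the mixture version). These assumptions are not stated on the slide. Taking expectations and covariances of the second-order growth model then gives:

```latex
\begin{aligned}
E[Y_i] &= \nu + \Lambda \Gamma \kappa, \\
\operatorname{Cov}(Y_i) &= \Lambda \left( \Gamma \Phi \Gamma^{\top} + \Psi \right) \Lambda^{\top} + \Theta .
\end{aligned}
```

The growth factor means $\kappa$ therefore drive the mean trajectory of the indicators only through $\Lambda \Gamma$, which is why invariance of $\nu$ and $\Lambda$ over time is needed for the growth parameters to be interpretable.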
Multiple indicators GMM
Multiple indicator GMM
$Y_{ik} = \nu_k + \Lambda_k \left[ \Gamma_k (\kappa_k + \upsilon_{ik}) + \zeta_{ik} \right] + \varepsilon_{ik}$.
First-order constraints:
$\nu_k = \nu$, $\Lambda_k = \Lambda$, $\Psi_k = \Psi$, $\Theta_k = \Theta$,
Differences between latent classes:
- means $\kappa_k$,
- covariances $\Phi_k$,
- parameters for representing growth $\Gamma_k$.
Illustration: data set 2
Unconditional GMM: 2 classes vs 3 classes
The models differ in the restrictions on var(i), var(s) and var(q) (fixed at 0 or free) and in the differences allowed between classes (var(i), var(s), cov(i,s)).

Two-class GMM
Parameters       96      98     100     103     106
BIC           29953   29120   29080   29058   29015
SABIC         29648   28809   28763   28732   28679
Entropy        0.94   0.697   0.794   0.804   0.754
LMR-LRT       0.000   0.016   0.015   0.000   0.000
Nc1 (%)       82.46   39.25   74.80   23.37   66.39
Nc2 (%)       17.51   60.75   25.20   76.63   33.61

Three-class GMM
Parameters      100     102     104     107
BIC           29459   29062   29057   29028
SABIC         29141   28738   28727   28688
Entropy       0.944   0.718   0.762   0.858
LMR-LRT       0.000   0.190   0.540   0.140
Nc1 (%)       32.78   36.74   67.89    8.71
Nc2 (%)       66.93   35.77   11.72   61.08
Nc3 (%)        0.29   27.49   20.69   30.21
Illustration: data set 2
Three-class GMM with int as covariate, without (overall) and with (between) class differences
The models differ in the restrictions on var(i), var(s) and var(q) (fixed at 0 or free) and in the regressions on the covariate x that are included, either overall or within classes 1 to 3 (c on x; i, s, q on x; covariances among i, s, q). The 127-parameter model has c on x overall and i s q on x in each class; the 139-parameter model adds cov. i s q in each class.

Parameters      111     114     116     117     121     121     127     139
BIC           33330   32473   32622   32259   32313   32263   32243   32286
SABIC         32978   32112   32254   31888   31930   31879   31841   31845
Entropy       0.916   0.991   0.820   0.987   0.986   0.990   0.986   0.987
LMR-LRT       0.023   0.000   0.204   0.004    0.07   0.001   0.013   0.050
Nc1 (%)       56.16   11.69   49.46    7.48   11.44   81.22   11.46    7.66
Nc2 (%)       34.90   81.28   36.64   11.73   81.10   11.65   80.99   80.99
Nc3 (%)        8.94    7.03   13.90   80.80    7.46    7.13    7.55   11.36
Illustration: data set 2
Conditional GMM: estimated means
Illustration: data set 2
GMM results: information criteria and quality of classification
Information Criteria
Number of Free Parameters     127
Akaike (AIC)                  31755.780
Bayesian (BIC)                32243.542
Sample-Size Adjusted BIC      31840.665
  (n* = (n + 2) / 24)
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS
BASED ON ESTIMATED POSTERIOR PROBABILITIES
Latent Classes
  1    278.61914    0.80994
  2     39.41000    0.11456
  3     25.97086    0.07550
Illustration: data set 2
GMM results: information criteria and quality of classification
Entropy     0.986
CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP
Class Counts and Proportions
Latent Classes
  1    280    0.81395
  2     38    0.11047
  3     26    0.07558
Average Latent Class Probabilities for Most Likely Latent Class Membership (Row)
by Latent Class (Column)
           1        2        3
  1    0.995    0.005    0.000
  2    0.003    0.990    0.007
  3    0.000    0.011    0.989
Illustration: data set 2
GMM results: intercepts of i, s and q
Class 1

                 Estimate     S.E.  Est./S.E.  P-Value
Intercepts
  I                 3.693    0.275     13.451   0.000
  S                 1.103    0.145      7.632   0.000
  Q                -0.095    0.027     -3.559   0.000
Residual Variances
  I                 0.961    0.106      9.084   0.000
  S                 0.152    0.031      4.924   0.000
  Q                 0.005    0.001      5.221   0.000
Illustration: data set 2
GMM results: intercepts of i, s and q
Class 2

                 Estimate     S.E.  Est./S.E.  P-Value
Intercepts
  I                 2.616    0.420      6.223   0.000
  S                 1.907    0.284      6.725   0.000
  Q                -0.254    0.055     -4.617   0.000
Residual Variances
  I                 0.961    0.106      9.084   0.000
  S                 0.152    0.031      4.924   0.000
  Q                 0.005    0.001      5.221   0.000
Illustration: data set 2
GMM results: intercepts of i, s and q
Class 3

                 Estimate     S.E.  Est./S.E.  P-Value
Intercepts
  I                 0.000    0.000    999.000   999.000
  S                 1.127    0.354      3.187   0.001
  Q                 0.077    0.068     -1.137   0.256
(linear trend in class 3 when fixing q@0)
Residual Variances
  I                 0.961    0.106      9.084   0.000
  S                 0.152    0.031      4.924   0.000
  Q                 0.005    0.001      5.221   0.000
Illustration: data set 2
GMM results: regression coefficients of the categorical latent variable c on the covariate
Categorical Latent Variables

                 Estimate     S.E.  Est./S.E.  P-Value
C#1 ON
  INTNV             0.172    0.058      2.969   0.003
C#2 ON
  INTNV             0.044    0.076      0.575   0.565
Intercepts
  C#1               0.392    0.709      0.553   0.580
  C#2              -0.052    0.925     -0.056   0.955
Illustration: data set 2
GMM results: probability of class membership
Coefficients: c#1 on int 0.172, c#2 on int 0.044; intercept c#1 0.392, intercept c#2 -0.052

value of int         0.5       1       2       5      10
log odds (c=1)     0.478   0.564   0.736   1.252   2.112
log odds (c=2)    -0.030  -0.008   0.036   0.168   0.388
log odds (c=3)         0       0       0       0       0
Prob(c=1)           0.45    0.47    0.51    0.62    0.77
Prob(c=2)           0.27    0.26    0.25    0.21    0.14
Prob(c=3)           0.28    0.27    0.24    0.18    0.09
Illustration: data set 2
Estimated probabilities for c as a function of int level
Illustration: data set 2
GMM results: regression from i, s and q on covariate
Class 1

                 Estimate     S.E.  Est./S.E.  P-Value
I ON INTNV          0.122    0.020      6.050   0.000
S ON INTNV         -0.033    0.011     -2.939   0.003
Q ON INTNV          0.003    0.002      1.567   0.117
S WITH I           -0.008    0.040     -0.206   0.837
Q WITH I           -0.015    0.007     -2.309   0.021
Q WITH S           -0.026    0.005     -4.943   0.000
Illustration: data set 2
GMM results: regression from i, s and q on covariate
Class 2

                 Estimate     S.E.  Est./S.E.  P-Value
I ON INTNV          0.140    0.040      3.477   0.001
S ON INTNV         -0.095    0.025     -3.802   0.000
Q ON INTNV          0.015    0.005      3.136   0.002
S WITH I           -0.008    0.040     -0.206   0.837
Q WITH I           -0.015    0.007     -2.309   0.021
Q WITH S           -0.026    0.005     -4.943   0.000
Illustration: data set 2
GMM results: regression from i, s and q on covariate
Class 3

                 Estimate     S.E.  Est./S.E.  P-Value
I ON INTNV          0.341    0.022     15.275   0.000
S ON INTNV         -0.037    0.034     -1.085   0.278
Q ON INTNV          0.002    0.007      0.297   0.766
S WITH I           -0.008    0.040     -0.206   0.837
Q WITH I           -0.015    0.007     -2.309   0.021
Q WITH S           -0.026    0.005     -4.943   0.000
Illustration: data set 2
GMM results: reading proficiency level for each class

Means of LECT    Estimate     S.E.  Est./S.E.  P-Value
Class 1             7.508    0.434     17.288   0.000
Class 2             4.430    0.287     15.455   0.000
Class 3             0.000    0.000    999.000   999.000
Concluding remarks
Interest, limitations, cautions
GMM is a promising approach for modeling heterogeneous latent change across unobserved population subgroups.
But:
- GMM usually requires large samples.
- The search for heterogeneity should be conducted in a principled and disciplined way; the best way to guide GMM selection is to compare alternative models derived from theory.
- GMM always identifies groups.
- The role that covariates play in the enumeration process has to be clarified.
- An important question: how to model missing data on x variables?