Introduction to Mixture Modeling

CROSS-SECTIONAL MIXTURE MODELING
Shaunna L. Clark
Advanced Genetic Epidemiology Statistical
Workshop
October 23, 2012
OUTLINE
 What is a mixture?
 Introduction to LCA (LPA)
 Basic Analysis Ideas/Plan and Issues
 How to choose the number of classes
 How do we implement mixtures in OpenMx?
 Factor Mixture Model
 What do classes mean for twin modeling?
HOMOGENEITY VS. HETEROGENEITY
 Most models assume homogeneity
 i.e. individuals in a sample all follow the same model
 What we have seen so far (for the most part)
 But this is not always the case
 Ex: Sex, Age, Patterns of Substance Abuse

[Figure: bar chart of substance use probabilities (0 to 1) for Alcohol, Tobacco, Cannabis, Opiates, and Heroin]
WHAT IS MIXTURE MODELING?
 Used to model unobserved heterogeneity by identifying different subgroups of individuals
 Ex: IQ, Religiosity
LATENT CLASS ANALYSIS (LCA)
 Also known as Latent Profile Analysis (LPA) if you have continuously distributed variables
LATENT CLASS ANALYSIS
 Introduced by Lazarsfeld & Henry, Goodman, Clogg, and Dayton & Macready
 Setting
 Cross-sectional data
 Multiple items measuring a construct
 Hypothesized construct represented as a latent class variable (categorical latent variable)
 Ex: 12 items measuring the construct of Cannabis Abuse/Dependence
 Different categories of Cannabis Abuse/Dependence patterns
 Aim
 Identify items that indicate classes well
 Estimate proportion of sample in each class (class probabilities)
 Classify individuals into classes (posterior probabilities)
LATENT CLASS ANALYSIS CONT’D

[Figure: path diagram of the latent class variable C with indicators x1–x5, alongside a profile plot (probabilities 0 to 1) across Alcohol, Tobacco, Cannabis, Opiates, and Heroin showing three classes: Non Users, Legal Drug Users, and Multiple Illicit Users]
LATENT CLASS ANALYSIS MODEL
 Dichotomous (0/1) indicators u: u1, u2, ..., ur
 Categorical latent variable c: c = k; k = 1, 2, ..., K
 Marginal probability for item uj = 1:

P(uj = 1) = Σk=1..K P(c = k) P(uj = 1 | c = k)

 i.e. the probability that item uj = 1 is the sum over all classes of the product of the probability of being in class k and the probability of endorsing item uj given that you are in class k
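The marginal probability formula can be sketched numerically. This is an illustrative Python snippet (the workshop code itself is R/OpenMx), and the class and item-endorsement probabilities below are made-up numbers, not estimates from the workshop data:

```python
# P(u_j = 1) = sum over k of P(c = k) * P(u_j = 1 | c = k)

class_probs = [0.6, 0.4]             # P(c = k) for K = 2 classes (hypothetical)
item_probs_given_class = [0.9, 0.2]  # P(u_j = 1 | c = k) for one item j (hypothetical)

# Sum over classes of (class probability) x (conditional endorsement probability)
p_marginal = sum(pk * pjk for pk, pjk in zip(class_probs, item_probs_given_class))
print(p_marginal)  # 0.6*0.9 + 0.4*0.2 = 0.62
```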
JOINT PROBABILITIES
 Joint probability of all u's, assuming conditional independence:

P(u1, u2, ..., ur) = Σk=1..K P(c = k) P(u1 | c = k) P(u2 | c = k) ... P(ur | c = k)

 i.e. the probability of observing a given response pattern is the sum over all classes of the product of the probability of being in a given class and the probability of observing the response on each item given that you are in latent class k
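The conditional-independence product can be sketched the same way. This is an illustrative Python snippet with hypothetical numbers, not the workshop's R code:

```python
# P(u_1,...,u_r) = sum over k of P(c = k) * prod over j of P(u_j | c = k)
from math import prod

class_probs = [0.6, 0.4]            # P(c = k) (hypothetical)
# P(u_j = 1 | c = k) for r = 3 items, one row per class (hypothetical)
endorse = [[0.9, 0.8, 0.7],
           [0.2, 0.1, 0.1]]
pattern = [1, 0, 1]                 # an observed response pattern

def p_pattern_given_class(probs, u):
    # Product over items of P(u_j | c = k): p if endorsed, 1-p if not
    return prod(p if uj == 1 else 1 - p for p, uj in zip(probs, u))

p_joint = sum(pk * p_pattern_given_class(endorse[k], pattern)
              for k, pk in enumerate(class_probs))
print(p_joint)  # 0.6*(0.9*0.2*0.7) + 0.4*(0.2*0.9*0.1) = 0.0828
```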
POSTERIOR PROBABILITIES
 Probability of being in class k given your response pattern:

P(c = k | u) = P(c = k) P(u | c = k) / P(u)

 Used to assign most likely class membership (MLCM)
 Based on highest posterior probability

Individual   P(Class1)   P(Class2)   MLCM
A            .90         .10         1
B            .80         .20         2
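Posterior probabilities and most-likely-class assignment follow directly from Bayes' rule. A minimal Python sketch, with hypothetical class probabilities and pattern likelihoods:

```python
# P(c = k | u) = P(c = k) * P(u | c = k) / P(u), then assign the class
# with the highest posterior (most likely class membership, MLCM).

class_probs = [0.6, 0.4]          # P(c = k) (hypothetical)
p_u_given_class = [0.126, 0.018]  # P(u | c = k) for one response pattern (hypothetical)

# Denominator: marginal probability of the pattern
p_u = sum(pk * lk for pk, lk in zip(class_probs, p_u_given_class))
# Posterior for each class
posteriors = [pk * lk / p_u for pk, lk in zip(class_probs, p_u_given_class)]
# 1-based index of the highest posterior
mlcm = max(range(len(posteriors)), key=lambda k: posteriors[k]) + 1

print(posteriors, mlcm)
```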
MODEL TESTING
 Log-likelihood ratio χ2 test (LLRT)
 Overall test against the data with H1 being the unrestricted multinomial
 Problem: Not distributed as χ2 due to boundary conditions
 Don't use it!!! (McLachlan & Peel, 2000)
 Information Criteria
 Akaike Information Criterion, AIC (Akaike, 1974)
   AIC = 2h - 2ln(L)
 Bayesian Information Criterion, BIC (Schwarz, 1978)
   BIC = -2ln(L) + h*ln(n)
 Where L = likelihood, h = number of parameters, n = sample size
 Choose the model with the lowest value of the IC
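Both criteria are one-liners once you have the -2 log-likelihood (the form OpenMx reports). An illustrative Python sketch; the -2LL, parameter count, and sample size below are hypothetical:

```python
# AIC = -2ln(L) + 2h ;  BIC = -2ln(L) + h*ln(n)
from math import log

minus2LL = 6589.96   # -2 ln(L) (hypothetical value)
h = 25               # number of free parameters (hypothetical)
n = 1878             # sample size (hypothetical)

AIC = minus2LL + 2 * h
BIC = minus2LL + h * log(n)
print(AIC, BIC)  # lower values are preferred when comparing models
```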
OTHER TESTS
 Since we can't use the LLRT, use tests which approximate the difference in LL values between the k and k-1 class models
 Vuong-Lo-Mendell-Rubin, LMR-LRT (Lo, Mendell, & Rubin, 2001)
 Parametric bootstrapped LRT, BLRT (McLachlan, 1987)
 Both test H0: k-1 classes against H1: k classes
 A low p-value indicates a preference for the estimated model (i.e. k classes)
 Look for the first time the p-value is nonsignificant (i.e. greater than 0.05)
ANALYSIS PLAN
1. Fit model with 1 class
    Everyone in the same class
    Sometimes simple is better
2. Fit LCA models with 2 to K classes
3. Choose the best number of classes
 Seems simple, right???
NOT REALLY . . . LOTS OF KNOWN ISSUES IN MIXTURE ANALYSIS
 Global vs. Local Maximum

[Figure: two log-likelihood curves over a parameter, one where the optimizer reaches the global maximum and one where it gets stuck at a local maximum]

 Use multiple sets of random starting values to make sure you have the global solution. Make sure that the best LL value has replicated
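The multiple-random-starts idea can be demonstrated on any mixture. The sketch below is a toy two-component Gaussian mixture fit by EM in Python/NumPy, not the OpenMx LCA from these slides: fit from several random starting values, then check that the best log-likelihood is replicated by other starts.

```python
# Multiple random starts for a toy 1-D two-component Gaussian mixture.
import numpy as np

rng = np.random.default_rng(1)
# Simulated data: two well-separated normal components (hypothetical)
y = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

def loglik(y, p, mu, sd):
    # Mixture density: sum over components of weight * normal density
    dens = sum(pk * np.exp(-0.5 * ((y - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
               for pk, m, s in zip(p, mu, sd))
    return np.sum(np.log(dens))

def fit_em(y, rng, n_iter=200):
    # Random starting values for the means; equal weights, unit SDs to start
    mu = rng.uniform(y.min(), y.max(), 2)
    p, sd = np.array([0.5, 0.5]), np.array([1.0, 1.0])
    for _ in range(n_iter):
        # E-step: posterior component probabilities for each observation
        comp = np.stack([pk * np.exp(-0.5 * ((y - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
                         for pk, m, s in zip(p, mu, sd)])
        resp = comp / comp.sum(axis=0)
        # M-step: update weights, means, SDs
        nk = resp.sum(axis=1)
        p = nk / len(y)
        mu = (resp * y).sum(axis=1) / nk
        sd = np.sqrt((resp * (y - mu[:, None]) ** 2).sum(axis=1) / nk)
    return loglik(y, p, mu, sd)

# Fit from 10 random starts; the best LL should be replicated by other starts
lls = sorted(fit_em(y, rng) for _ in range(10))
best, runner_up = lls[-1], lls[-2]
print(best, runner_up)
```

If the best LL appears only once across starts, it may itself be a local maximum and more starts are needed.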
DETERMINING THE NUMBER OF CLASSES: CLASS ENUMERATION
 No agreed-upon way to determine the correct number of latent classes
 Statistical comparisons (i.e. ICs, LRTs)
 Interpretability and usefulness of classes
 Substantive theory
 Relationship to auxiliary variables
 Predictive validity of classes
 Class size
 Quality of classifications (not my favorite)
 Classification table based on posterior probabilities
 Entropy - a value close to 1 indicates good classification, in that many individuals have posterior probabilities close to 0 or 1
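The relative entropy statistic mentioned above can be sketched from a matrix of posterior probabilities. This Python snippet uses the common normalization E = 1 - (Σi Σk -pik ln pik) / (n ln K); the posterior matrix is hypothetical:

```python
# Relative entropy of a mixture classification: values near 1 mean most
# posteriors are close to 0 or 1 (well-separated classes).
from math import log

posteriors = [[0.95, 0.05],   # one row per individual, one column per class
              [0.90, 0.10],   # (hypothetical values)
              [0.10, 0.90],
              [0.05, 0.95]]
n, K = len(posteriors), len(posteriors[0])

# Sum over individuals and classes of -p * ln(p) (0 * ln 0 treated as 0)
sum_plogp = sum(-p * log(p) for row in posteriors for p in row if p > 0)
entropy = 1 - sum_plogp / (n * log(K))
print(entropy)
```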
SUGGESTED STRATEGY
 Nylund et al. (2007), Tofighi & Enders (2008), among others
 Simulation studies comparing the tests and information criteria described previously
 Suggest:
 Use BIC and LMR to narrow down the number of plausible models
 Then run the BLRT on those models, because the BLRT can be computationally intensive
OPENMX: LCA EXAMPLE SCRIPT
 LCA_example.R
MIXTURES IN OPENMX
 Specify class-specific models
 Create MxModel objects for each class
 Specify class probabilities
 Create an MxMatrix of class probabilities/proportions
 Specify model-wide objective function
 Weighted sum of the class models
 Pull everything together in a parent model with data
 Estimate entire model
 Note: One of potentially many ways to do this
CLASS SPECIFIC MODELS

nameList <- names(<dataset>)
class1 <- mxModel("Class1",
    mxMatrix("Iden", name = "R", nrow = nvar, ncol = nvar, free = FALSE),
    mxMatrix("Full", name = "M", nrow = 1, ncol = nvar, free = FALSE),
    mxMatrix("Full", name = "ThresholdsClass1", nrow = 1, ncol = nvar,
             dimnames = list("Threshold", nameList), free = TRUE),
    mxFIMLObjective(covariance = "R", means = "M", dimnames = nameList,
                    thresholds = "ThresholdsClass1", vector = TRUE))

Repeat for every class in your model. Don't be like me, make sure to change the class numbers.
DEFINE THE MODEL

lcamodel <- mxModel("lcamodel", class1, class2,
    mxData(vars, type = "raw"),

Next, specify the class membership probabilities
CLASS MEMBERSHIP PROBABILITIES
 When specifying, you need to remember:
 1. Class probabilities must be positive
 2. They must sum to a constant - 1

mxMatrix("Full", name = "ClassMembershipProbabilities",
         nrow = nclass, ncol = 1, free = TRUE,
         labels = c(paste("pclass", 1:nclass, sep = ""))),
mxBounds(c(paste("pclass", 1:nclass, sep = "")), 0, 1),
mxMatrix("Iden", nrow = 1, name = "constraintLHS"),
mxAlgebra(sum(ClassMembershipProbabilities),
          name = "constraintRHS"),
mxConstraint(constraintLHS == constraintRHS),
MODEL-WIDE OBJECTIVE FUNCTION
 Weighted sum of individual class likelihoods
 Weights are class probabilities

-2LL = -2 * Σi log( Σk pk Lik )

 So for two classes:

-2LL = -2 * Σi log( p1 L1i + p2 L2i )
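The weighted-sum objective can be sketched numerically. An illustrative Python snippet (the actual model uses the OpenMx mxAlgebra on the next slide); the class proportions and per-individual likelihoods are hypothetical:

```python
# -2LL = -2 * sum over individuals i of log( sum over classes k of p_k * L_ik ),
# where L_ik is individual i's likelihood under class k (OpenMx vector=TRUE).
from math import log

pclass = [0.7, 0.3]           # class proportions (hypothetical)
L = [(0.020, 0.001),          # per-individual likelihoods under class 1, class 2
     (0.004, 0.015),          # (hypothetical values)
     (0.010, 0.010)]

minus2LL = -2 * sum(log(sum(pk * lik for pk, lik in zip(pclass, liks)))
                    for liks in L)
print(minus2LL)
```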
MODEL-WIDE OBJECTIVE FUNCTION CONT’D

mxAlgebra(
    -2*sum(log(pclass1 %x% Class1.objective
             + pclass2 %x% Class2.objective)),
    name = "lca"),
mxAlgebraObjective("lca")
)

Now we run the model:

model <- mxRun(lcamodel)

And we wait and wait and wait till it's done.
PROFILE PLOT
 One way to interpret the classes is to plot them
 In our example we had binary items, so the thresholds are what distinguish between classes
 Can plot the thresholds
 Or you can plot the probabilities
 More intuitive
 Easier for non-statisticians to understand
PROFILE PLOTS IN R/OPENMX

# Pulling out thresholds
class1T <- model@output$matrices$Class1.ThresholdsClass1
class2T <- model@output$matrices$Class2.ThresholdsClass2

# Converting thresholds to probabilities
class1P <- t(1/(1 + exp(-class1T)))
class2P <- t(1/(1 + exp(-class2T)))
PROFILE PLOTS CONT’D

plot(class1P, type = "o", col = "blue", ylim = c(0, 1), axes = FALSE, ann = FALSE)
axis(1, at = 1:12, lab = nameList)
axis(2, las = 1, at = c(0, 0.2, 0.4, 0.6, 0.8, 1))
box()
lines(class2P, type = "o", pch = 22, lty = 2, col = "red")
title(main = "LCA 2 Class Profile Plot", col.main = "black", font.main = 4)
title(xlab = "DSM Items", col.lab = "black")
title(ylab = "Probability", col.lab = "black")
legend("bottomright", c("Class 1", "Class 2"), cex = 0.8,
       col = c("blue", "red"), pch = 21:22, lty = 1:2)
OPENMX EXERCISE
 Unfortunately, it takes a long time for these to run, so it's not feasible to do in this session
 However, I've run the 2-, 3-, and 4-class LCA models for this data and (hopefully) the .Rdata files are posted on the website
 Exercise: Using the .Rdata files
 1. Determine which model is better according to AIC/BIC
     Want the lowest value
 2. Make a profile plot of the best solution and interpret the classes
     What kinds of substance users are there?
CODE TO PULL OUT LL AND COMPUTE AIC/BIC

# Pull out -2LL
LL_2c <- model@output$Minus2LogLikelihood
nsam <- 1878

# Number of parameters
npar <- (nclass - 1) + (nthresh * nvar * nclass)
npar

# Compute AIC & BIC
AIC_2c <- 2*npar + LL_2c
AIC_2c
BIC_2c <- LL_2c + (npar * log(nsam))
BIC_2c
TABLE OF RESULTS

# Classes    -2LL       Npar    AIC     BIC
2            6589.96    25      6639    6778
3            6329.95    38      6405    6616
4            6308.96    51      6410    6693
3-CLASS PROFILE PLOT

[Figure: profile plot of the 3-class LCA solution]
FACTOR MIXTURE MODELING
PROBLEM WITH LCA
 Once in a class, everyone "looks" the same
 In the context of substance abuse, it is unlikely that every user will have the same patterns of use
 Withdrawal, tolerance, hazardous use
 There is variation within a latent class
 Severity
 One proposed solution is the factor mixture model
 Uses a latent class variable to classify individuals and a latent factor to model severity
FACTOR MIXTURE MODEL

[Figure: path diagram with a latent factor F (variance σ2F) and a latent class variable C, both pointing to items x1–x5 with loadings λ1–λ5]

 Classes can be indicated by item thresholds (categorical)/item means (continuous) or by the factor mean and variance
GENERAL FACTOR MIXTURE MODEL

yik = Λk ηik + εik
ηik = αk + ζik

where ζik ~ N(0, Ψk)

 Similar to the FA model, except many parameters can be class varying, as indicated by the subscript k
 Several variations of this model, which differ in terms of measurement invariance
 Lubke & Neale (2005), Clark et al. (2012)
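One way to get intuition for these equations is to generate data from them. A minimal Python/NumPy sketch of a two-class factor mixture, where (as one common variation) loadings are invariant and the classes differ only in the factor mean αk; all parameter values are hypothetical, and residuals are taken as normal with a common variance for simplicity:

```python
# Simulate y_ik = Lambda_k * eta_ik + eps_ik with eta_ik = alpha_k + zeta_ik,
# zeta_ik ~ N(0, Psi_k). Two classes differing in factor mean (severity shift).
import numpy as np

rng = np.random.default_rng(0)
n_per_class, n_items = 500, 5

Lambda = [np.full(n_items, 0.8), np.full(n_items, 0.8)]  # invariant loadings (hypothetical)
alpha = [0.0, 2.0]   # class-specific factor means alpha_k (hypothetical)
psi = [1.0, 1.0]     # factor variances Psi_k (hypothetical)
theta = 0.25         # residual variance, assumed common across items/classes

ys = []
for k in range(2):
    eta = alpha[k] + rng.normal(0, np.sqrt(psi[k]), n_per_class)   # eta_ik
    eps = rng.normal(0, np.sqrt(theta), (n_per_class, n_items))    # eps_ik
    ys.append(eta[:, None] * Lambda[k] + eps)                      # y_ik
y = np.vstack(ys)

# Class 2's items should be shifted up by roughly Lambda * alpha_2 = 1.6
print(y.shape, y[:n_per_class].mean(), y[n_per_class:].mean())
```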
FMM PROFILE PLOT

[Figure: profile plot (probabilities 0 to 0.9) across Alcohol, Tobacco, Cannabis, Opiates, and Heroin showing two classes: Non Users and Users]
HOW DO WE DO THIS IN OPENMX?
 You'll have to wait till tomorrow!
 The Factor Mixture Model is a generalization of the Growth Mixture Model we'll talk about tomorrow afternoon.
MIXTURES & TWIN MODELS
 How do we combine the ACE model and mixtures?
OPTION 1: ACE DIRECTLY ON THE CLASSES

[Figure: bivariate twin path diagram with A, C, and E influences (paths a, c, e) on the latent class variables CA and CB for twins A and B, each measured by items x1A–x4A and x1B–x4B; the A factors correlate 1.0 (MZ) / 0.5 (DZ) and the C factors correlate 1.0]
WHAT WOULD THIS LOOK LIKE FOR THE FMM?

[Figure: twin path diagram combining the FMM and ACE: each twin has a latent factor (FA, FB) and a latent class variable (CA, CB) with items x1A–x4A and x1B–x4B; A, C, and E influences (paths a, c, e) act on the factors and classes, with the A factors correlated 1.0 (MZ) / 0.5 (DZ) and the C factors correlated 1.0 across twins]
FMM & ACE CONT’D
 One of many possible ways to do FMM & ACE in the same model
 Can also have class-specific ACE on the factors
 Each class has its own heritability
 From Muthén et al. (2006)
ISSUE WITH OPTION 1
 The model utilizes the liability threshold model to "convert" the latent categorical variable, C, to a latent normal variable
 This requires that the classes are ordered
 Ex: high, medium, low users
 Don't always have nicely ordered classes
 Models are VERY time intensive
 Take a vacation for a week or two
OPTION 2: THREE-STEP METHOD
1. Estimate the mixture model
2. Assign individuals to their most likely latent class based on the posterior probabilities of class membership
3. Use the observed, categorical variable of assigned class membership as the phenotype in a liability threshold model version of the ACE analysis
 Note: Requires ordered classes
OPTION 2A
 Contingency table analysis using most likely class membership
 Concordance between twins in terms of most likely class membership
 If your classes are not ordered
 Odds Ratio
 Excess twin concordance due to a stronger genetic relationship can be represented by the OR for MZ twins compared to the OR for DZ twins
 Place restrictions on the contingency table to test specific hypotheses
 Mendelian segregation, only shared environmental effects
 Eaves (1993)
ISSUES WITH OPTION 2
 Potential for biased parameter estimates and underestimated standard errors
 Assigned membership ignores the fractional class membership suggested by the posterior probabilities
 Treats the classification as not having any sampling error
 Good option when entropy is high/classes are well separated

Individual   P(Class1)   P(Class2)   MLCM
A            .90         .10         1
B            .80         .20         2
C            .51         .49         1
SELECTION OF CROSS-SECTIONAL MIXTURE GENETIC ANALYSIS WRITINGS
 Latent Class Analysis
 Eaves, 1993; Muthén et al., 2006; Clark, 2010
 Factor Mixture Analysis
 Neale & Gillespie, 2005 (?); Clark, 2010; Clark et al. (in preparation)
 Additional References
 McLachlan, Do, & Ambroise, 2004
 Mixtures in Substance Abuse
 Gillespie (2011, 2012)
 Great cannabis examples