Applied
Structural Equation Modeling
for Dummies, by Dummies
February 22, 2013
Indiana University, Bloomington
Joseph J. Sudano, Jr., PhD
Center for Health Care Research and Policy
Case Western Reserve University at The MetroHealth System
Adam T. Perzynski, PhD
Center for Health Care Research and Policy
Case Western Reserve University at The MetroHealth System
Acknowledgements

Thanks Joe.

Thanks to Bill Pridemore and all of you here
at IU.

Thanks to Doug Gunzler.

Thanks to Kyle Kercher.
Rejected Titles for this Talk
Structural Equation Modeling
for Fashion Week
We have lots of Models!
Structural Equation Modelin’
fer Pirates
SEM be a statistical
technique for testin'
and estimatin' causal
relations usin' a
combination o'
statistical data and
qualitative causal
assumptions
*From Wikipedia
Assumptions

I do not actually assume you are dummies

Feel free to assume what you want about me

I do not assume you will be experts in SEM
after this presentation

I assume you know something about means
and regression (hopefully)
Outline

Important SEM Resources

Measurement (and measurement error)

Examples
◦ Measurement Invariance
◦ Latent Class Analysis
◦ Latent Growth Mixture Modeling

Model Specification
SEM Resources
SEM Resources: Statmodel.com
Measurement Models
A special type of causal model

Survey items are assumed to have
measurement error

◦ Each question has its own amount of error

Your answer to a survey question is
causally related to a latent, unobserved
variable.
Perfect Measurement
health
Self-rated health
1.0?
Causality and the Latent Concept of
Health

In general, how would you describe your
health?

We assume that every individual varies
along an infinite continuum from best
possible health to worst possible health.

When any given individual answers this
question, they are approximating their
position on this latent continuum.
Imperfect Measurement
1.0
health
Self-rated health
e4
Variance > 0
< 1.0
Measurement Models using Multiple
Indicators


Single items are unreliable
Single cases prevent generalizability

Use multiple indicators and large samples to
estimate the values of the latent, unobserved
variables or factors

The SF36 uses multiple indicators describing
multiple factors in order to measure health
more reliably.
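
To illustrate why several indicators beat a single item, here is a minimal sketch using the Spearman-Brown formula, which assumes the items are parallel (equally reliable). The single-item reliability of 0.50 is a hypothetical value, not an SF36 estimate.

```python
def spearman_brown(single_item_reliability: float, n_items: int) -> float:
    """Reliability of a composite of n parallel items (Spearman-Brown formula)."""
    r = single_item_reliability
    return n_items * r / (1 + (n_items - 1) * r)


# Hypothetical single-item reliability of 0.50 (not taken from the SF36)
for k in (1, 2, 5, 10):
    print(k, round(spearman_brown(0.50, k), 3))
```

Under these assumptions, reliability rises quickly with the first few items and then levels off.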
Acknowledgement: This study was funded by Grant number R01-AG022459 from
the NIH National Institute on Aging.
Measuring Disparities:
Bias in Self-reported
Health Among Spanish-speaking Patients
J.J. Sudano (1,2), A.T. Perzynski (1,2), T.E. Love (2), S.A. Lewis (1), B. Ruo (3), D.W. Baker (3)
(1) The MetroHealth System, Cleveland, OH; (2) Case Western Reserve University School of Medicine, Cleveland, OH; (3) Northwestern University Feinberg School of Medicine
Measurement Model of
the SF36
Objective & Significance

Do observed differences in SRH reflect
true differences in health?
◦ Cultural and language differences may
create measurement bias
◦ If outcomes aren’t measuring the same
thing in different groups, we have a problem
Measurement Equivalence &
Factorial Invariance

It is only possible to properly interpret group
differences after measurement equivalence has
been established (Horn & McArdle, 1992;
Steenkamp & Baumgartner, 1998).

“It may be the case that the groups differ … but it
also may be the case that extraneous influences
are giving rise to the observed difference.”
Meredith & Teresi (2006 p. S69)

The external validity of any conclusion regarding
group differences rests securely on whether the
measurement equivalence of the scale has been
established (Borsboom, 2006).
Cross-sectional Study

N= 1281

Medical patients categorized into four
groups: White, Black, English-speaking
Hispanic and Spanish-speaking Hispanic.

Multigroup Confirmatory Factor Analysis
(MGCFA)
Two Types of Invariance

Metric (Weak) Invariance
◦ Are the item factor loadings equivalent across
groups?
◦ Is a one unit change in the item equal to a one
unit change in the factor score for all groups?

Scalar (Strong) Invariance
◦ Are the item intercepts equivalent across groups?
◦ Unequal intercepts result in unequal scaling of
factor scores
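
A minimal sketch of why intercepts matter, assuming a simple linear measurement model (expected item score = intercept + loading × latent score). All numbers are hypothetical; the point is that two groups with identical latent health can still show different observed scores when only the intercepts differ.

```python
def expected_item_score(intercept: float, loading: float, latent_mean: float) -> float:
    """Expected observed score under a linear measurement model."""
    return intercept + loading * latent_mean


latent_mean = 50.0  # same true (latent) health in both groups (hypothetical)
loading = 0.8       # equal loadings: metric (weak) invariance holds

group_a = expected_item_score(10.0, loading, latent_mean)  # hypothetical intercept
group_b = expected_item_score(6.0, loading, latent_mean)   # lower intercept

# The observed gap comes entirely from the intercepts, not from health itself.
gap = group_a - group_b
print(group_a, group_b, gap)
```

This is the situation scalar (strong) invariance testing is meant to detect.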
Weak invariance
health
Self-rated health
What happens to the
model fit when we
constrain all of these
paths (loadings) to be
equal across groups?
e4
Table 1: Goodness of Fit for SF36 Multigroup Factorial Invariance Testing (N = 1281)

Model  Description                                RMSEA (95% CI)       CFI    B-S χ2*  df    Ref  ∆RMSEA  ∆CFI    B-S ∆χ2  ∆df
1      Unconstrained Model                        0.028 (.017 - .030)  0.936  3001     2172
2      Metric Invariance (Factor Weights)         0.029 (.028 - .030)  0.931  3110     2253  1    0.001   -0.005  109      81
3      Scalar Invariance (Intercepts)             0.033 (.032 - .034)  0.907  3215     2358  2    0.004   -0.024  105      105
4      Partial Scalar Invariance (B=W=HS not HE)  0.033 (.032 - .034)  0.909  3179     2323  2    0.004   -0.022  69       70
5      Partial Scalar Invariance (B=W=HE not HS)  0.030 (.029 - .032)  0.921  3180     2323  2    0.001   -0.010  70       70
6      2nd Order Structural Invariance**          0.030 (.029 - .032)  0.921  3187     2333  2    0.001   -0.010  77       80
7      2nd & 3rd Order Structural Invariance**    0.030 (.029 - .032)  0.921  3196     2339  2    0.001   -0.010  86       86

* The bootstrapped Bollen-Stine χ2 value is reported because of significant (p<.01) multivariate non-normality.
** Structural factor weights are constrained equal for Blacks, Whites and Hispanic English (Hispanic Spanish are unconstrained).
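
The difference columns of Table 1 are plain subtractions against each model's reference model. A minimal sketch that recomputes them from the tabled fit values (the dictionary layout and function name are ours, for illustration only):

```python
# Fit values copied from Table 1; "ref" is the reference model for the comparison.
models = {
    1: {"rmsea": 0.028, "cfi": 0.936, "chi2": 3001, "df": 2172, "ref": None},
    2: {"rmsea": 0.029, "cfi": 0.931, "chi2": 3110, "df": 2253, "ref": 1},
    3: {"rmsea": 0.033, "cfi": 0.907, "chi2": 3215, "df": 2358, "ref": 2},
    4: {"rmsea": 0.033, "cfi": 0.909, "chi2": 3179, "df": 2323, "ref": 2},
    5: {"rmsea": 0.030, "cfi": 0.921, "chi2": 3180, "df": 2323, "ref": 2},
}


def deltas(model_id: int) -> dict:
    """Change in fit for a model relative to its reference model."""
    m = models[model_id]
    ref = models[m["ref"]]
    return {
        "d_rmsea": round(m["rmsea"] - ref["rmsea"], 3),
        "d_cfi": round(m["cfi"] - ref["cfi"], 3),
        "d_chi2": m["chi2"] - ref["chi2"],
        "d_df": m["df"] - ref["df"],
    }


for model_id in (2, 3, 4, 5):
    print(model_id, deltas(model_id))
```

The drop in CFI is small for the metric model and noticeably larger once all intercepts are constrained, which is the pattern the following slides walk through.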
The Unconstrained Model
Fits the Data Well
The model with factor
loadings constrained still fits
the data well.
Metric (Weak) Invariance was
Confirmed
I forget what an intercept is

Scalar (Strong) Invariance
◦ Are the item intercepts equivalent across
groups?
Intercept: the intercept in a multiple
regression model is the mean for the
response when all of the explanatory
variables take on the value 0.

Could be called the “starting point”

Constraining the intercepts
results in a worsening of
model fit
The fit is still poor if you allow
intercepts for English-speaking
Hispanics to vary
The fit is acceptable if you allow
intercepts for Spanish speaking
Hispanics to vary
Scalar (Strong) Invariance is NOT
Confirmed
Measurement equivalence of the SF36 does not exist
for Spanish speaking Hispanics
Intercepts are lower for Spanish-speaking Hispanics on nearly all
items
Use of English Rating Categories on
Twitter
Use of Spanish Rating Categories on
Twitter
Everywhere and Nowhere: Latent Class Analysis
of Knowledge of the Spread of Hepatitis C
Adam T. Perzynski, PhD
E-mail: Adam.Perzynski@case.edu
Introduction


Hepatitis C is a widespread and serious disease that
affects the liver.
170 million people worldwide are infected.

3.9 million Americans infected with HCV. (AHRQ
2003)

More Americans die every year from Chronic HCV
infection than from HIV
HCV Transmission

Blood
◦ Injection Drug Use
◦ Blood Transfusions
◦ Needle Sticks
◦ Shared Household Items (Razor or Toothbrush)

Sexual transmission of HCV is recognized
but is infrequent.

HCV is not transmitted by Coughing,
Kissing, Sneezing, Touching, Bathrooms,
Fecal Matter, or Contaminated Food

Sample




Behavioral Risk Factor Surveillance System (BRFSS), 2001, Arizona

Conducted by the Centers for Disease Control and Prevention (CDC)

The world’s largest telephone survey

Nearly 200,000 people participated in 2001
Measure

Do you think hepatitis C can be spread
thru?
◦ Sneezing or Coughing
◦ Kissing
◦ Unprotected Sex
◦ Food or Water
◦ Sharing Needles to Inject Street Drugs
◦ Using the Same Bathroom
◦ Contact with the Blood of an Infected Person
Methods of Analysis

Analyzed with Mplus

Analysis proceeded in several stages
1. Exploratory Factor Analysis
2. Confirmatory Factor Analysis
3. Cluster Analysis (Not reported)
4. Latent Class Analysis
5. Mixture Modeling

Robust estimation for binary indicators

Missing Values Imputation using Full Information
Maximum Likelihood Estimation (FIML)
Distribution of Outcome Variables
Means and Standard Deviations
Exploratory Factor Analysis

Scree plot, Eigenvalues, and Root Mean
Square Residuals more or less supported a
two factor solution
Figure 1: Measurement Model #1
Confirmatory Factor Analysis with Two Latent Continuous Variables.
N = 3902
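[Path diagram: latent factors “HCV is Transmitted” and “HCV is not Transmitted”; indicators shown: Unprotected Sex, Sharing Needles, Blood Contact, Sneezing/Coughing, Food or Water, Bathrooms, Kissing. Estimates shown in the diagram: -.79, .86, .98, .83, .79, .83, .92, .96]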
What is different about LCA?

Instead of assuming that the latent variable is
continuous (infinitely poor to infinitely good)

We assume the latent variable is categorical.

Membership in “hidden” empirical forms
determines answers rather than a single latent
continuum.
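
A minimal sketch of the idea behind LCA with binary items: each class has its own probability of answering “yes” to each item, and Bayes' rule gives the probability that a respondent with a given answer pattern belongs to each class. The two classes, three items, and all probabilities below are hypothetical, and the function names are ours; this is not the Mplus estimation procedure.

```python
from math import prod


def pattern_likelihood(pattern, item_probs):
    """P(answer pattern | class), assuming items are independent within a class."""
    return prod(p if y == 1 else 1 - p for y, p in zip(pattern, item_probs))


def class_posteriors(pattern, class_sizes, class_item_probs):
    """P(class | answer pattern) for every class, via Bayes' rule."""
    joint = [
        size * pattern_likelihood(pattern, probs)
        for size, probs in zip(class_sizes, class_item_probs)
    ]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical model: 2 classes, 3 yes/no items
class_sizes = [0.6, 0.4]
class_item_probs = [
    [0.9, 0.9, 0.1],  # class 1: P(yes) for items 1-3
    [0.2, 0.3, 0.2],  # class 2: P(yes) for items 1-3
]

posteriors = class_posteriors((1, 1, 0), class_sizes, class_item_probs)
print([round(p, 3) for p in posteriors])
```

A respondent answering yes, yes, no is far more likely to belong to class 1 under these made-up values.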
Figure 2: Measurement Model #2
Latent Class Analysis with a Categorical Latent Variable
N = 3902
[Path diagram: categorical latent variable “HCV Transmission Awareness”; labels shown: Kissing, Sneezing/Coughing, Unprotected Sex, Food or Water, Sharing Needles, Bathrooms, Blood Contact, Don’t Know]
Estimation: Maximum Likelihood Robust
Fit Measures
Likelihood Ratio Chi-Square = 269.556
DF = 104
P-Value = 0.000
AIC = 17338.312
BIC = 17477.153
Adjusted BIC = 17404.073
Entropy = .888
Three Latent Classes

The Two Category and Four Category models do not fit
the data as well as the Three Category model.

HCV is Nowhere
◦ N = 1683 (The largest class!)

Full Awareness of how HCV is Spread
◦ N = 930

HCV is Everywhere
◦ N = 479
Figure 3: Estimated Probabilities of Knowing How
HCV is Spread by Class Membership
Additional Analyses

What predicts membership in each latent class?

Do the relationships between variables vary inside of a
particular class?

Mixture Modeling
◦ Simultaneously test continuous and categorical predictors of class
membership.
Figure 4: Causal Model with a Latent Categorical
Variable of HCV Transmission Awareness*
Socio-demographics: Age, Race/Ethnicity, Gender, Education, Income, Employment Status
HCV Module: Know Someone w/ HCV, HCV Test, HCV Diagnosis, Received Blood Transfusion, IDU, Self-Perceived HCV Risk
Health and Health Care: Self-rated Health, Health Care Access
HCV Transmission Awareness (latent categorical variable), indicators: Kissing, Sneezing/Coughing, Unprotected Sex, Food or Water, Sharing Needles, Bathrooms, Blood Contact
* Bivariate correlations are calculated for all variables inside of
boxed conceptual categories
Longitudinal Patterns of
Depressive Symptoms in
the Health and
Retirement Study
Adam T. Perzynski, PhD & Joseph J. Sudano, Jr., PhD
Center for Health Care Research and Policy
Case Western Reserve University and MetroHealth
Presentation at the Annual Meeting of the Gerontological
Society of America on November 22, 2010
Introduction

This is another measurement study

Explore the use of Latent Class Growth Analysis
to model changes in depressive symptoms over
time in the Health and Retirement Study.

Most studies compare the change in mean
scores between two waves.

A small number of studies have modeled change
as a single growth trajectory
Change in Means Between Waves

Often we simply calculate the mean
depressive symptoms at Wave 1 (baseline).

Subtract it from the mean at Wave 2
(followup).
What is a trajectory?



Regrettably, the term “trajectory” has taken
on multiple meanings across disciplines and
research studies.

A broad, inclusive definition of trajectory
modeling is the analysis of patterns of
change or stability.

Confusion is possible between aggregate
trajectories, which summarize an overall
average pattern of change for a population,
and disaggregated trajectories, which
examine multiple potential trajectories of
different shapes (George 2006).
Example of a single growth
trajectory
Continuous Latent Growth Curve
Analysis

LGA / LGCA

Studies in older adults (e.g., George and
Lynch 2003) typically find that the slope
of the latent growth curve for depressive
symptoms is small and positive, and that
the slope of the curve is steepest in the
oldest cohorts.
Example from George and Lynch
(2003)
Example of an LGA finding
LGA estimates a single Aggregate
trajectory

Assumes that the average population starting
point (intercept for the growth curve) and
average amount of change (slope) are a
sufficient depiction of variation over time in
depressive symptoms.

If discrete subtypes of depressive symptom
trajectories exist, but are ignored (as in
single latent growth curve and
autoregressive models) the magnitude of
associations could be grossly misestimated.
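
A minimal sketch of how a single aggregate trajectory can hide subgroups. The class shares are the ones reported for the four-class solution later in this talk; the per-wave symptom means for each class are hypothetical values chosen only to mimic the shapes (flat high, falling, rising, flat low).

```python
# Class shares from the four-class solution reported later in the talk
shares = {"many": 0.054, "decreasing": 0.096, "increasing": 0.115, "almost_none": 0.735}

# Hypothetical mean symptom counts at six waves (not the study's estimates)
trajectories = {
    "many":        [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    "decreasing":  [4.0, 3.4, 2.8, 2.2, 1.6, 1.0],
    "increasing":  [1.0, 1.6, 2.2, 2.8, 3.4, 4.0],
    "almost_none": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}

n_waves = 6
# Population-average trajectory: share-weighted mean at each wave
aggregate = [
    sum(shares[c] * trajectories[c][w] for c in shares) for w in range(n_waves)
]

print([round(x, 2) for x in aggregate])
print("aggregate change:", round(aggregate[-1] - aggregate[0], 3))
```

With these made-up values the aggregate curve is almost flat, even though about one in five people is on a trajectory that changes by three symptoms.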
What is Latent Class Growth
Analysis?

Latent Class Growth Analysis (LCGA),
also referred to as growth mixture
modeling, belongs to a family of statistical
techniques referred to as general latent
variable modeling or GLVM.
Why would we ever think we
should use LCGA?

Studying the mean change or using a single trajectory for
everyone assumes uniform heterogeneity in the population.

Researchers use familiar methods and typically assume that
the underlying (latent or real) distribution of variables is
continuous.

We have theoretical reasons to suspect that underlying
distributions could be categorical.

Life course theorists (Dannefer) specifically caution that
intracohort differentiation is unlikely to be homogeneous.
Why would we use LCGA?

We think individuals and cohorts diverge
over time

Cumulative change differentiates
individuals and cohorts.
Prior LCGA Models of Depression or
Depressive Symptoms

LCGA models and closely related
Longitudinal Latent Class Analysis (LLCA)
have been used to estimate models of
depressive symptoms in prior studies of
◦ maternity (Campbell et al 2009; Mora et al 2009)
◦ childhood and adolescence (Meadows et al 2006)
◦ adolescence through young adulthood (Olino et
al 2009)
◦ response to antidepressants among adults
(Muthen et al, 2007; Hunter et al 2009)
◦ patients who have had a cardiovascular event
(Kaptein et al 2006).
Methods

5,195 age-eligible respondents from the 1992 Health and
Retirement Study cohort, who completed interviews in all
seven waves through 2004.

Depressive symptoms in HRS are measured using a
dichotomous, 8-item version of the CES-D. Analysis begins
with Wave 2 data due to a change in response categories
from Wave 1.

Using Mplus, we compared the fit of LCGA models of two to
eight classes while also accounting for the HRS complex
sampling design.

We then tested the effect of a small number of covariates.
This is very similar to a multinomial logistic regression.
Demographic characteristics

Gender
◦ 60.3% female

Race/ethnicity
◦ 76.4% non-Hispanic White
◦ 14.4% Black
◦ 7.4% Hispanic
◦ 1.8% other racial/ethnic groups

Age
◦ Median=55

Education
◦ Mean=12.4 years (SD=3.0).
Rule for Determining the number of
Latent Classes

“How many trajectories are there?”

Measures of model fit including:
◦ Lo-Mendell-Rubin Test (LMR test)
◦ log-likelihood (LL)
◦ Bayesian Information Criterion (BIC) (Vuong, 1989; Muthen, 2004;
Muthen & Muthen, 2005; Nylund et al, 2007).
Here we will use the LMR Test

Where k is the number of latent classes, this test gives a p-value for the k-1 versus the k-class model when running the
k-class model (Vuong, 1989; Muthen, B. 2005).

The first time p > .05, k-1 is the preferred number of classes.
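
The stopping rule can be sketched in a few lines. The p-values are the LMR p-values from the model fit comparison table in the Results; the function name and the 0.05 default are ours, following the rule as stated on this slide.

```python
# LMR p-values by number of classes k, copied from the model fit comparison table
lmr_p = {2: 0.000, 3: 0.000, 4: 0.015, 5: 0.149, 6: 0.354, 7: 0.392, 8: 0.314}


def preferred_classes(lmr_p_values: dict, alpha: float = 0.05):
    """Return k-1 for the first k whose LMR p-value exceeds alpha."""
    for k in sorted(lmr_p_values):
        if lmr_p_values[k] > alpha:
            return k - 1
    return None  # every model improved on the one before it


print(preferred_classes(lmr_p))  # 4
```

The 5-class model is the first that fails to improve on its predecessor (p = 0.149), so the rule points to four classes.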

Results

How many classes are there?

What do the classes look like?

How is this different from looking at means or
single trajectory?

Are any demographic variables associated with
being in a particular class?
How many Classes are there?
Table 1. Depressive Symptoms LCGA Model Fit Comparison, N = 5,195

K  LL         BIC        Adjusted BIC  LMR Test  LMR p  Entropy
2  -56367.49  112983.09  112890.94     10525.69  0.000  0.955
3  -55146.65  110618.41  110497.66     2410.38   0.000  0.922
4  -54652.99  109708.09  109558.74     974.66    0.015  0.925
5  -54357.08  109193.27  109015.32     519.88    0.149  0.901
6  -54090.08  108736.27  108529.72     397.39    0.354  0.912
7  -54079.98  108793.06  108557.91     97.55     0.392  0.920
8  -53895.87  108501.85  108238.10     307.84    0.314  0.732
What do the classes look like?
Figure 1: Four Latent Classes of Depressive Symptoms over 12
Years of the HRS (N = 5195)
[Line chart. Y-axis: Mean # of Depressive Symptoms (0 to 6). X-axis: HRS Study Wave (1994 to 2004).]
Many Persistent Symptoms = 5.4%
Decreasing Symptoms = 9.6%
Increasing Symptoms = 11.5%
Almost No Symptoms = 73.5%
How is this different from looking at
Means or a Single Trajectory?
Online at:
http://spreadsheets.google.com/pub?key=0ApRkae54BRnudEYyUGdXZWlES3Z4VzZ6akNaOFFiekE&gid=5
Does anything influence the chances of
being in a particular class?
Figure 2. Relationship between Years of Education and
Depressive Symptoms Trajectory/Latent Class Membership (N = 5195)
[Line chart. Y-axis: Latent Class Probability (0.0 to 1.0). X-axis: Years of Education (1 to 18). Lines: Many Symptoms, Decreasing Symptoms, Increasing Symptoms, Almost No Symptoms.]
Does anything influence the chances of
being in a particular class?
Table. Effects of Demographics on the Likelihood of a Depressive Symptoms Trajectory (N = 5195)
Each trajectory vs. Almost No Symptoms (reference category)

               Many Symptoms         Decreasing            Increasing
               OR    b       p       OR    b       p       OR    b       p
Age            0.94  -0.057  0.010   0.96  -0.043  0.014   1.02  0.016   0.422
Female         2.19  0.785   0.000   1.53  0.428   0.001   1.41  0.346   0.002
Black          1.89  0.635   0.000   1.90  0.641   0.000   1.54  0.429   0.001
Hispanic       1.12  0.113   0.655   1.59  0.461   0.018   1.19  0.178   0.461
Low Education  1.32  0.274   0.000   0.84  -0.173  0.000   0.90  -0.105  0.000
• Females, African Americans and those
with fewer years of education have a
higher probability of being in the Many
Symptoms trajectory.
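
For readers new to multinomial logistic output: each odds ratio (OR) in the table is simply the exponentiated coefficient, OR = exp(b). A quick check using the Many Symptoms column (the variable names are ours):

```python
from math import exp

# Coefficients (b) for "Many Symptoms" vs. "Almost No Symptoms", from the table
b_many_symptoms = {
    "Age": -0.057,
    "Female": 0.785,
    "Black": 0.635,
    "Hispanic": 0.113,
    "Low Education": 0.274,
}

# Odds ratio = exp(b), rounded to two decimals as in the table
odds_ratios = {name: round(exp(b), 2) for name, b in b_many_symptoms.items()}
print(odds_ratios)
```

An OR above 1 means higher odds of the Many Symptoms trajectory relative to Almost No Symptoms; an OR below 1 means lower odds.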
Model Specification

Choosing the model that best represents
the data structure and addresses the
research questions of interest can be a
daunting task.

Brief overview of model specification
tests and procedures.
Model Specification
“First, your return to shore was not part
of our negotiations nor our agreement
so I must do nothing. And secondly, you
must be a pirate for the pirate's code to
apply and you're not. And thirdly, the
code is more what you'd call ‘guidelines’
than actual rules.”

Captain Barbossa from Pirates of the Caribbean:
The Curse of the Black Pearl (2003)
Model Specification
In model specification a researcher
can use:
◦ logic, theory and prior empirical evidence to
choose the initial model
◦ model comparison testing to compare the
initial model to competing models
◦ a combination of theory, prior evidence, and
the results of the model comparison testing
to decide upon which model or models are
appropriate for a given study
Nested or Not Nested?
Chi Square Test

The Chi-square statistic is computed and
used to test whether the model does fit
the data well.

It is the basis for most other fit tests.

Along with other fit tests we use it to
evaluate whether to include or exclude
model paths relating measures to each
other for a given study.
Chi Square Test

Also called the discrepancy function

If not significant, the model is regarded as
acceptable.*
Chi Square Test*

Some limitations are:

Complex models with many parameters

With large samples, models will most often
be rejected, sometimes unfairly

Where multivariate non-normality is
present, the chi-square fit index is inaccurate.
Modified tests (the Satorra-Bentler scaled
chi-square) are available.


Modification Indices
Modification indices can be calculated
individually for every path that is fixed to
zero, by estimating a chi-square test statistic
with one df.

The higher the value of the modification
index for a causal path, the better the
predicted improvement in overall model fit if
that path were added to the model.

Jöreskog suggested that a modification index
should be at least five before the researcher
considers adding the causal path and
modifying the hypothesized model.
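
Because a modification index is a chi-square statistic with one df, it can be turned into a p-value. A minimal sketch using only the standard library; it relies on the identity that, for one df, the upper-tail probability equals erfc(sqrt(x / 2)). The function name is ours.

```python
from math import erfc, sqrt


def chi2_1df_pvalue(x: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(x / 2.0))


for mi in (3.84, 5.0, 10.0):
    print(mi, round(chi2_1df_pvalue(mi), 4))
```

A modification index of five corresponds to a p-value below .05, so the rule of thumb is a little stricter than the conventional 3.84 cutoff.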

R-squared

In linear regression analysis, we interpret
the r2 value as the amount of variation in
the response that can be explained by the
regressors in the model.

In SEM, it is pretty much the same*

*Not exactly, but that is beyond the “for
dummies” version of this talk
AIC, BIC (and BCC)
Bayesian Information Criterion (BIC)

Akaike Information Criterion (AIC)

Based on the chi-squared test statistic

While the models under comparison can be
nested or non-nested, for a truly direct
comparison with these tests (as with all tests
in this section) we prefer that the same observed
measures are used in the models we are
comparing.

Both BIC and AIC feature the goodness-of-fit
term for our model, $\chi^2_M$, derived directly from the
discrepancy function when applicable, along with a
penalty term.
AIC, BIC (and BCC)
Cannot identify if a model has good fit.
Only if one model fits better than
another.

The lower the value of BIC, AIC and BCC,
the better the fit.

BCC penalizes for model complexity
slightly more than AIC.

BIC penalizes for model complexity more
than AIC and BCC.

Specification Search

Allows researchers to choose a model
from among a number of candidates.

Exploratory

Should be guided by theory
Specification Search

Given the model in Figure 1, with 7
unknown paths, the number of possible
specifications of the model is
2^7 = 128.

128 different possible models!
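
The count follows from each of the 7 optional paths being either in or out of the model. A minimal sketch that enumerates every combination; the path names are placeholders, not the labels used in the actual model.

```python
from itertools import product

# Placeholder names for the 7 optional paths (hypothetical labels)
paths = [f"path_{i}" for i in range(1, 8)]

# Each specification is a tuple of 7 include/exclude flags
specifications = list(product((False, True), repeat=len(paths)))

print(len(specifications))  # 128
# Example: the specification that includes only the first path
print(dict(zip(paths, (True,) + (False,) * 6)))
```

Every added optional path doubles the number of candidate models, which is why the search should be guided by theory.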
Specification Search

The unconstrained model (The one with
all seven ambiguous paths in the model)
demonstrates satisfactory overall model
fit
◦ CFI=.95
◦ TLI=.92
◦ RMSEA =.07
◦ chisq= 476.33, DF=63
Table. Results of Specification Search in AMOS

Model  Name                                       Params  df  C        C - df   AIC      BCC      BIC      C / df  p  R2
A      Unconstrained                              56      63  476.336  413.336  588.336  589.664  877.039  7.561   0  .11
B      No DIF path to PF1                         55      64  476.357  412.357  586.357  587.661  869.904  7.443   0  .11
C      No DIF path to PF1, PF2                    54      65  476.396  411.396  584.396  585.677  862.787  7.329   0  .11
D      No DIF paths to PF1, PF2, or PF3           53      66  476.857  410.857  582.857  584.114  856.093  7.225   0  .11
E      No DIF paths, No educ to SS                52      67  478.536  411.536  582.536  583.769  850.616  7.142   0  .11
F      No DIF paths, No educ to SS, SS to PP      51      68  482.272  414.272  584.272  585.481  847.197  7.092   0  .10
G      No DIF paths, No educ to SS, SS to PP/PF   50      69  495.104  426.104  595.104  596.289  852.873  7.175   0  .09
H      Fully Constrained (No DIF or SS paths)     49      70  540.892  470.892  638.892  640.054  891.506  7.727   0  .09
Notes: Reported R2 values are for the equation in each model with the endogenous PF latent
variable with the interpretation of total explained variance in physical functioning given all
other paths in the model. C is the chi-squared test statistic and df are the associated degrees
of freedom.
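
The AIC and BIC columns can be reproduced from C and the number of parameters. This sketch assumes the forms AIC = C + 2q and BIC = C + q ln(N), with N = 1281; both the formulas and the sample size are our assumptions (the table does not state them), though they match the tabled values to rounding.

```python
from math import log

N = 1281  # assumed sample size; not stated with the table

# (model, number of parameters q, chi-square C), copied from the table
models = [
    ("A", 56, 476.336), ("B", 55, 476.357), ("C", 54, 476.396), ("D", 53, 476.857),
    ("E", 52, 478.536), ("F", 51, 482.272), ("G", 50, 495.104), ("H", 49, 540.892),
]


def aic(c: float, q: int) -> float:
    """Assumed form: chi-square plus 2 per estimated parameter."""
    return c + 2 * q


def bic(c: float, q: int, n: int) -> float:
    """Assumed form: chi-square plus ln(n) per estimated parameter."""
    return c + q * log(n)


scores = {name: (aic(c, q), bic(c, q, N)) for name, q, c in models}
best_aic = min(scores, key=lambda m: scores[m][0])
best_bic = min(scores, key=lambda m: scores[m][1])

print("lowest AIC:", best_aic, round(scores[best_aic][0], 3))
print("lowest BIC:", best_bic, round(scores[best_bic][1], 3))
```

Under these assumptions AIC favors model E while BIC, with its heavier penalty per parameter, favors the more constrained model F.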
Specification Search

When a number of models are plausible,
specification tests can be used as
evidence for verification of or
improvement over an initial model.
‘Guidelines’

A researcher is ultimately left to decide if
the results of the specification tests are
unjustly in favor of a certain model due to
complexity or sample size, rather than the
meaning behind the causal paths.

Thus the specification tests act more like
guidelines, rather than strict codes
dictating the “best” fitting model.
Selected Strengths & Limitations of
SEM

Strengths
◦ Very flexible
◦ Estimate and correct for measurement error

Limitations
◦ Large sample sizes
◦ Challenging to learn
◦ Need lots of hands-on experience to learn
◦ Need a strong theoretical basis

It’s easy to mis-specify a model if you have no idea
what you are doing.
Applied Structural Equation
Modeling for Everyone!