SEM PURPOSE

LECTURE 16
• STRUCTURAL EQUATION MODELING
SEM PURPOSE
• Model phenomena from observed or
theoretical stances
• Develop and test constructs not directly
observed based on observed indicators
• Test hypothesized relationships, potentially
causal, ordered, or covarying
Relationships to other quantitative methods
[Figure: STRUCTURAL EQUATION MODELS (SEM) shown in relation to other methods, divided into LATENT and MANIFEST variable approaches. Labeled methods: structural path models; confirmatory and exploratory factor analysis; canonical analysis; discriminant analysis; true score theory, with validity (concurrent/predictive) and reliability (generalizability); HLM; multiple regression; GLM; ATI; ANOVA; ANCOVA; 2-group t-test; IRT; bivariate and partial correlation; logistic models; loglinear models, causal (Grizzle et al.) and associational (Holland et al.)]
Decomposition of Covariance/Correlation
• Most hypotheses about relationships can be
represented in a covariance matrix
• SEM is designed to reproduce the observed
covariance matrix as closely as possible
• How well the observed matrix is fitted by
the hypothesized matrix is Goodness of Fit
• Modeling can be either entirely theoretical
or a combination of theory and revision
based on imperfect fit of some parts.
Decomposition of Covariance/Correlation
• Example: Correlation between TAAS
reading level at grades 3 and 4 in 1999 was
.647 for 3316 schools that gave the test.
Suppose this is taken as the theoretical
value for year 2000. Thus,
[Path diagram: TAAS Grade 3 Reading → TAAS Grade 4 Reading (.647), with an error term on Grade 4 Reading (.768)]
Decomposition of Covariance/Correlation
• Example: Correlation between TAAS
reading level at grades 3 and 4 in 2000 was
.674 for 3435 schools that gave the test. We
then test the theory that the relationship is
stable across years;
H: $\rho = .647$
[Path diagram: TAAS Grade 3 Reading → TAAS Grade 4 Reading (observed .674), with an error term on Grade 4 Reading]
Decomposition of Covariance/Correlation
• In classical statistics this problem is solved
through Fisher’s Z-transform
$Z_r = \tanh^{-1} r = \tfrac{1}{2}\ln\left[(1 + r)/(1 - r)\right]$
and a normal statistic z is developed from the difference $Z_r - Z_H$
• In SEM this is a covariance problem of
fitting the observed covariance matrix to the
theoretical matrix:
$\begin{bmatrix} 1 & .674 \\ .674 & 1 \end{bmatrix} = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}$
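The Fisher transform for the two TAAS correlations can be checked numerically. This is a minimal Python sketch using only the standard library; `fisher_z` is a hypothetical helper name, not something from the lecture. It computes the two transformed values and their difference only, because the slide does not state the standard error used to scale the difference into a normal statistic.

```python
import math

def fisher_z(r):
    """Fisher's Z-transform: atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

z_observed = fisher_z(0.674)      # correlation observed in 2000
z_hypothesized = fisher_z(0.647)  # 1999 value taken as the theoretical value
difference = z_observed - z_hypothesized

print(round(z_observed, 4), round(z_hypothesized, 4), round(difference, 4))
```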
Decomposition of Covariance/Correlation
• The test is based on large sample
multivariate normality under either
maximum likelihood or generalized least
squares estimation. In this case there is no
estimation required, since all parameters are
known. For the Fisher Z-transform, the
statistic is z=1.044, p >.29. For the SEM
method,
Decomposition of Covariance/Correlation
• Under SEM, the model is represented as
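$F = \log|\Sigma| + \mathrm{tr}(S\Sigma^{-1}) - \log|S| - (p - q)$

$= \log\begin{vmatrix} 1 & .647 \\ .647 & 1 \end{vmatrix} + \mathrm{tr}\left\{ \begin{bmatrix} 1 & .674 \\ .674 & 1 \end{bmatrix} \begin{bmatrix} 1 & .647 \\ .647 & 1 \end{bmatrix}^{-1} \right\} - \log\begin{vmatrix} 1 & .674 \\ .674 & 1 \end{vmatrix} - (2 - 1)$

$= \log(1 - .647^2) + \mathrm{tr}\left\{ \begin{bmatrix} 1 & .674 \\ .674 & 1 \end{bmatrix} \begin{bmatrix} 1/(1-.647^2) & -.647/(1-.647^2) \\ -.647/(1-.647^2) & 1/(1-.647^2) \end{bmatrix} \right\} - \log(1 - .674^2) - 1$

$= -.23553 + 1.94 - (-.23552) - 1 = .94$

$\chi^2 = .94$, df = 1, p > .33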
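The pieces of this fit function are easy to evaluate for two-by-two correlation matrices. The Python sketch below assumes the standard maximum likelihood discrepancy with natural logarithms and the constant p (the number of observed variables); `ml_discrepancy_2x2` is a hypothetical name. Because of those assumptions the returned F need not agree with the hand arithmetic on the slide, although the trace term does (about 1.94).

```python
import math

def ml_discrepancy_2x2(r_observed, rho_model):
    """Return (F, trace) for S = [[1, r], [r, 1]] and Sigma = [[1, rho], [rho, 1]].

    F = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p, with p = 2 observed variables.
    """
    p = 2
    det_sigma = 1.0 - rho_model ** 2   # determinant of the model matrix
    det_s = 1.0 - r_observed ** 2      # determinant of the observed matrix
    # Each diagonal entry of S Sigma^-1 is (1 - r * rho) / (1 - rho^2).
    trace = 2.0 * (1.0 - r_observed * rho_model) / det_sigma
    f = math.log(det_sigma) + trace - math.log(det_s) - p
    return f, trace

f_value, trace_value = ml_discrepancy_2x2(0.674, 0.647)
print(round(trace_value, 2))  # 1.94
```

F is zero when the observed and model matrices coincide and grows as they diverge.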
Developing Theories
• Previous research: both model and estimates can
be used to create a theoretical basis for
comparison with new data
• Logical structures: time, variable stability, and
construct definition can provide order
– 1999 reading in grade 3 can affect 2000 reading in
grade 4, but not the reverse
– Trait anxiety can affect state anxiety, but not the reverse
– IQ can affect grade 3 reading, but grade 3 reading is
unlikely to alter IQ greatly (although we can think of
IQ measurements that are more susceptible to reading
than others)
Developing Theories
• Experimental randomized design: can be part of
SEM
• What-if: compare competing theories within a
data set. Are all equally well explained by the
data covariances?
– Danger: all just-identified models equally explain all
the data (i.e., if all degrees of freedom are used, any
model reproduces the data equally well)
• Parsimony: generally simpler models are
preferred; as simple as needed but not
simple-minded
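The just-identified case can be spotted by counting. A minimal Python sketch, assuming the usual covariance-structure rule that degrees of freedom equal the number of distinct variances and covariances minus the number of free parameters; the function name and the one-factor examples are illustrative, not from the lecture.

```python
def model_degrees_of_freedom(n_observed, n_free_parameters):
    """df = distinct variances and covariances minus free parameters."""
    n_moments = n_observed * (n_observed + 1) // 2
    return n_moments - n_free_parameters

# One factor, three indicators, factor variance fixed at 1:
# 3 loadings + 3 error variances = 6 free parameters.
print(model_degrees_of_freedom(3, 6))  # 0: just-identified, fit cannot be tested

# One factor, four indicators: 4 loadings + 4 error variances.
print(model_degrees_of_freedom(4, 8))  # 2: over-identified, fit can be tested
```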
MEASUREMENT MODELS
BASIC EQUATION
• $x = \tau + e$
• x = observed score
• $\tau$ = true (latent) score: represents
the score that would be obtained
over many independent
administrations of the same item
or test
• e = error: difference between x
and $\tau$
ASSUMPTIONS
• $\tau$ and e are independent
(uncorrelated)
• The equation can hold for an
individual or a group at one
occasion or across occasions:
• $x_{ijk} = \tau_{ijk} + e_{ijk}$ (individual)
• $x_{***} = \tau_{***} + e_{***}$ (group)
• combinations (individual across
time)
[Path diagram: true score $\tau$ → observed score x (loading $\lambda_x$), with error e → x]
RELIABILITY
• Reliability is a proportion of
variance measure (squared
variable)
• Defined as the proportion of
observed score (x) variance due
to true score ($\tau$) variance:
• $\rho^2_{x\tau} = \rho_{xx'} = \sigma^2_\tau / \sigma^2_x$
[Figure labels: Var($\tau$), Var(e), Var(x), reliability]
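Because the true score and the error are uncorrelated, the observed variance splits into two parts, which is what makes reliability a proportion. A short derivation, with $\sigma^2$ denoting a variance (notation assumed here):

```latex
\begin{aligned}
\sigma^2_x &= \operatorname{Var}(\tau + e)
            = \sigma^2_\tau + \sigma^2_e + 2\operatorname{Cov}(\tau, e)
            = \sigma^2_\tau + \sigma^2_e, \\
\rho_{xx'} &= \frac{\sigma^2_\tau}{\sigma^2_x}
            = 1 - \frac{\sigma^2_e}{\sigma^2_x}.
\end{aligned}
```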
Reliability: parallel forms
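• $x_1 = \tau + e_1$, $x_2 = \tau + e_2$
• $\rho(x_1, x_2)$ = reliability = $\rho_{xx'}$ = correlation between
parallel forms
[Path diagram: $\tau$ → $x_1$ and $\tau$ → $x_2$, each with loading $\lambda_x$ and its own error e, so that $\rho_{xx'} = \lambda_x \cdot \lambda_x$]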
Reliability: Spearman-Brown
• Can show the reliability of the
composite is
• $\rho_{kk'} = [k\,\rho_{xx'}]/[1 + (k-1)\,\rho_{xx'}]$
• k = # times test is lengthened
• example: test score has rel = .7
• doubling length produces
rel = 2(.7)/[1 + .7] = .824
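The doubling example can be reproduced in a few lines. A minimal Python sketch of the Spearman-Brown formula; `spearman_brown` is a hypothetical helper name.

```python
def spearman_brown(reliability, k):
    """Reliability of a test lengthened k times (Spearman-Brown formula)."""
    return k * reliability / (1.0 + (k - 1.0) * reliability)

# Doubling a test whose reliability is .7
print(round(spearman_brown(0.7, 2), 3))  # 0.824
```

The same function works for shortening a test by passing a fractional k, such as 0.5 for half length.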
Reliability: parallel forms
• For 3 or more items xi, same
general form holds
• reliability of any pair is the
correlation between them
• Reliability of the composite (sum
of items) is based on the average
inter-item correlation: stepped-up
reliability, Spearman-Brown
formula
COMPOSITES AND FACTOR
STRUCTURE
• 3 MANIFEST VARIABLES REQUIRED
FOR A UNIQUE IDENTIFICATION OF A
SINGLE FACTOR
• PARALLEL FORMS REQUIRES:
– EQUAL FACTOR LOADINGS
– EQUAL ERROR VARIANCES
– INDEPENDENCE OF ERRORS
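[Path diagram: single factor $\tau$ with indicators $x_1$, $x_2$, $x_3$, equal loadings $\lambda_x$, and independent errors e; $\rho_{xx'} = \lambda_{x_i\tau} \cdot \lambda_{x_j\tau}$]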
RELIABILITY FROM SEM
• TRUE SCORE VARIANCE OF THE
COMPOSITE IS OBTAINABLE FROM
THE LOADINGS:
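$\sigma^2_\tau = \sum_{i=1}^{K} \lambda_i^2$
K = # items or subtests
• $= K\lambda_x^2$ (when all loadings equal $\lambda_x$)
RELIABILITY FROM SEM
• RELIABILITY OF THE COMPOSITE IS
OBTAINABLE FROM THE LOADINGS:
$\alpha = \frac{K}{K-1}\left[1 - \frac{1}{\sigma^2_\tau}\right]$
• example: $\lambda_x^2 = .8$, K = 11
$\alpha = \frac{11}{10}\left[1 - \frac{1}{8.8}\right] = .975$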
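The worked example (eleven items, each with squared loading .8) can be reproduced as follows. This Python sketch follows the formula as it appears on the slide, K/(K-1) times one minus the reciprocal of the summed squared loadings; the function name is hypothetical.

```python
def alpha_from_loadings(loadings):
    """Composite reliability from loadings, following the slide's formula."""
    k = len(loadings)
    sum_squared = sum(loading ** 2 for loading in loadings)
    return k / (k - 1) * (1.0 - 1.0 / sum_squared)

# Eleven items, each with squared loading .8 (summed squared loadings = 8.8)
loadings = [0.8 ** 0.5] * 11
print(round(alpha_from_loadings(loadings), 3))  # 0.975
```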
CONGENERIC MODEL
• LESS RESTRICTIVE THAN PARALLEL
FORMS OR TAU EQUIVALENCE:
– LOADINGS MAY DIFFER
– ERROR VARIANCES MAY DIFFER
• MOST COMPLEX COMPOSITES ARE
CONGENERIC:
– WAIS, WISC-III, K-ABC, MMPI, etc.
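[Path diagram: single factor $\tau$ with indicators $x_1$, $x_2$, $x_3$, distinct loadings $\lambda_{x_1}$, $\lambda_{x_2}$, $\lambda_{x_3}$, and errors $e_1$, $e_2$, $e_3$; $\rho(x_1, x_2) = \lambda_{x_1} \cdot \lambda_{x_2}$]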
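Under a congeneric one-factor model with standardized variables, the correlation between any two indicators is the product of their loadings. A minimal Python sketch of the implied correlation matrix; the loadings .9, .7, and .5 are made-up illustration values, not estimates from the lecture.

```python
def implied_correlations(loadings):
    """Correlation matrix implied by a one-factor model with standardized loadings."""
    k = len(loadings)
    return [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(k)]
        for i in range(k)
    ]

matrix = implied_correlations([0.9, 0.7, 0.5])  # hypothetical loadings
for row in matrix:
    print([round(value, 2) for value in row])
```

With unequal loadings the off-diagonal entries differ from pair to pair, which is what separates the congeneric model from parallel forms.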
COEFFICIENT ALPHA
• $\rho_{xx'} = 1 - \sigma^2_E / \sigma^2_X$
• $= 1 - \left[\sum \sigma^2_i (1 - \rho_{ii})\right] / \sigma^2_X$,
• since errors are uncorrelated
• $\alpha = \frac{K}{K-1}\left[1 - \sum s^2_i / s^2_X\right]$
• where $X = \sum x_i$ (composite score)
• $s^2_i$ = variance of subtest $x_i$
• $s^2_X$ = variance of composite
• Does not assume knowledge of subtest $\rho_{ii}$
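Coefficient alpha needs only the subtest variances and the variance of the composite. A minimal Python sketch using the standard library; the function name and the three small item score lists are invented for illustration.

```python
from statistics import variance

def coefficient_alpha(items):
    """Coefficient alpha; items is a list of per-item score lists (same persons, same order)."""
    k = len(items)
    composite = [sum(scores) for scores in zip(*items)]  # X = sum of the x_i
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1.0 - sum_item_variances / variance(composite))

# Hypothetical scores for five persons on three subtests
items = [
    [2, 4, 3, 5, 1],
    [3, 5, 3, 4, 2],
    [2, 5, 4, 4, 1],
]
print(round(coefficient_alpha(items), 3))
```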
SEM MODELING OF
CONGENERIC FORMS
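PROC CALIS COV CORR MOD;
   /* one factor F1 with ten indicators; loadings L1-L10 are free */
   LINEQS
      X1 = L1 F1 + E1,
      X2 = L2 F1 + E2,
      …
      X10 = L10 F1 + E10;
   /* error variances named with the THE: prefix; factor variance fixed at 1 */
   STD E1-E10 = THE:, F1 = 1.0;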
MULTIFACTOR STRUCTURE
• Measurement Model: Does it hold for each
factor?
– PARALLEL VS. TAU-EQUIVALENT VS.
CONGENERIC
• How are factors related?
• What does reliability mean in the context of
multifactor structure?
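MINIMAL CORRELATED FACTOR
STRUCTURE
[Path diagram: two correlated factors, each with its own indicators among $x_1$ to $x_4$, loadings $\lambda$, and errors $e_1$ to $e_4$; the factor correlation carries the subscript 12]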
STRUCTURAL MODELS
• Path analysis for latent variables, but can
include recursive models
• Begins with measurement model
• Theory-based model of relationships among
all variables
• Modification of model at path level:
Lagrange and Wald modification indices
Measurement Model first
1
1
X1
Y1
Y2
2
1
1
X2
12
2
X4
3
4
5
Y6
6
3
2
X3
Y5
2
Y3
Y4
3
4
Path analysis for latent variables
1
1
X1
Y1
Y2
2
1
1
X2
12
2
X4
3
4
5
Y6
6
3
2
X3
Y5
2
Y3
Y4
4
3
y = By + x + 
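Solving the structural equation for y gives the reduced form. The steps below assume that $I - B$ is invertible and use $\Gamma$ for the coefficients on x and $\zeta$ for the disturbances, which is the usual SEM notation and an assumption here:

```latex
\begin{aligned}
y &= By + \Gamma x + \zeta \\
(I - B)\,y &= \Gamma x + \zeta \\
y &= (I - B)^{-1}(\Gamma x + \zeta)
\end{aligned}
```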
Modify Measurement Model as
needed
• Modification indices:
Lagrange Multiplier Index: release a constrained parameter
(usually a path fixed at 0)
* chi-square statistic with df = # releases
Wald Index: restrict a free
parameter to 0
* chi-square statistic with df = # restrictions
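Each index is referred to a chi-square distribution. When a single path is released or restricted (df = 1), the p-value can be computed with the standard library alone. This is a minimal Python sketch; the function name is hypothetical, and it covers only the df = 1 case.

```python
import math

def chi_square_p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With df = 1 the statistic is a squared standard normal, so
    # P(X > s) = erfc(sqrt(s / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))

print(round(chi_square_p_value_df1(3.84), 3))  # 0.05
```

Tests with more than one degree of freedom need a general chi-square routine, which the standard library does not provide.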
Test Full Model
• Examine overall Fit
• Examine Modification Indices
• Decide if there is evidence and theoretical
justification for dropping or adding a path
(Note: one path at a time; select the most
critical/theoretically important to start with)
• Liberal rule for keeping, conservative rule
for adding (VW recommendation)
Computer Programs
• AMOS 5.0: both drawing and syntax, SPSS
based
• Mplus 3.0: text data input, syntax only
• EQS 7.8: both drawing and syntax, similar to SAS
• LISREL 8.7: both drawing and syntax, difficult to
use
• SAS Proc Calis: syntax based, easiest to integrate
with other data analysis procedures