SEM_F14

Structural Equation Modeling
Jiyoon An
Kiran Pedada
Agenda
 Part 1 (Presented by Jiyoon An)
- SEM and latent variable
- Find a model from dataset
 Part 2 (Presented by Kiran Pedada)
- SEM Structural model and measurement models
- How to use Lavaan
- Addressing missing values
- Path Diagrams
Part 1
(Presented by Jiyoon An)
Structural equation modeling (SEM)
 Test and estimate the (causal) relationships among
observable measures and non-observable theoretical (or
latent) variables, and further to describe relationships
between the latent variables themselves with directed
arrows
Source: http://davidakenny.net/
Why latent variable?
 A latent variable, a random variable, differs from a fixed
process parameter
 Measuring a person’s characteristics (e.g. dominance)
 Everyone has a different level of dominance. Some are less
dominant and some are more dominant
 We cannot measure dominance directly and need a latent
variable
Source: Borsboom, D., Mellenbergh, G. J., & Van Heerden, J. (2003), The theoretical status of latent
variables, Psychological review, 110(2), 203.
Measuring ‘dominance’ by using latent
variable
 Latent variable
 Manifest variables
Dominance
Xi
 X1: “I would like a job where I have power
over others”
 X2: “I would make a good military leader”
 X3: “I try to control others”
Source: Borsboom, D., Mellenbergh, G. J., & Van Heerden, J. (2003), The theoretical status of latent
variables, Psychological review, 110(2), 203.
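The idea that one unobserved trait drives several observed item responses can be sketched with a small simulation. This is a minimal illustration only: the standardized latent score, the sample size, and the loadings (0.8, 0.7, 0.6) are assumed values, not estimates from Borsboom et al. (2003).

```r
# Simulate a latent variable and three manifest indicators (illustrative values only)
set.seed(1)
n <- 500
dominance <- rnorm(n)                          # latent score, never observed directly
loadings  <- c(X1 = 0.8, X2 = 0.7, X3 = 0.6)   # hypothetical loadings

# Each item = loading * latent score + item-specific measurement error
items <- sapply(loadings, function(l) l * dominance + rnorm(n, sd = sqrt(1 - l^2)))

# The items correlate with each other only because they share the latent cause
round(cor(items), 2)
```

In practice only the columns of `items` would be available, and the latent score has to be inferred from their covariation.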
When do you have a latent variable?
 A latent variable is defined as a random variable whose
realizations cannot be observed directly
 Recall the example of “ROA”
 Assess the true measure against measurement error (e.g. age)
Source: Borsboom, D. (2008), Latent variable theory, Measurement 6, 25-53,
Howell, R. D. (2014), course materials from MKT 6355 Theory Testing
SEM case in point: Student evaluation
 Infer from data structure to variable structure
 How to conceptualize latent variables?
 What are their causal relationships?
Source: Borsboom, D. (2008), Latent variable theory, Measurement 6, 25-53,
Howell, R. D. (2014), course materials from MKT 6355 Theory Testing
How to conceptualize latent variables?
 Perceived instructor competence (R1,
R3, R7, R8, R9, R10)
 Perceived instructor interaction (R6,
R4, R5)
 Perceived course quality (R11, R12,
R13, R14, R15, R16)
 R2 is removed
Factor analysis and SEM
 EFA
- Find a latent variable which affects observed variables
- Without prior assumption, all loadings are free to vary
 CFA
- Some loadings are forced to be zero by the researcher
- Factors are allowed to correlate
- No direct arrows between factors (measurement model)
 SEM
- Test and estimate the (causal) relationships
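The exploratory step can be sketched with base R's `factanal()`. The data below are simulated under an assumed two-factor structure (hypothetical item names `a1` to `b3`, loadings of 0.8); they are not the course evaluation data.

```r
# Exploratory factor analysis on simulated data: all loadings are free to vary
set.seed(2)
n  <- 1000
f1 <- rnorm(n)
f2 <- 0.5 * f1 + rnorm(n, sd = sqrt(0.75))     # second factor, correlated with the first

make_item <- function(f) 0.8 * f + rnorm(n, sd = 0.6)   # hypothetical loading of 0.8
dat <- data.frame(a1 = make_item(f1), a2 = make_item(f1), a3 = make_item(f1),
                  b1 = make_item(f2), b2 = make_item(f2), b3 = make_item(f2))

# Oblique (promax) rotation lets the extracted factors correlate
efa <- factanal(dat, factors = 2, rotation = "promax")
print(efa$loadings, cutoff = 0.3)
```

A CFA would instead fix the cross-loadings (for example `a1` on the second factor) to zero in advance, and an SEM would add directed paths between the factors.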
Where is latent variable?
[Figure: data matrix with students 1 to n as rows and responses R1 to R16 as columns, grouped under factors F1 (Competence), F2 (Interaction), and F3 (Course)]
[Figure: measurement model in which Comp. is measured by R1, R3, R7, R8, R9, R10; Inter. by R6, R4, R5; Course by R11 to R16; each indicator has its own error term (e)]
What are their causal relationships?
 Criteria for classifying an explanation as causal - Temporal
sequentiality, nonspurious correlation, and common sense logic
 Example of a spurious correlation: the number of drownings and ice cream consumption
Source: Hunt, S. D. (2010), Foundations of marketing theory:Toward a general theory
of marketing, ME Sharpe
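The drowning and ice cream example can be sketched as a simulation in which a common cause produces a correlation that vanishes once it is controlled for. Temperature as the common cause and all numeric values are assumptions made for illustration.

```r
# Spurious correlation: two outcomes driven by a shared cause (illustrative values)
set.seed(3)
n <- 365
temperature <- rnorm(n, mean = 20, sd = 8)                 # hypothetical common cause
ice_cream   <- 50 + 3 * temperature + rnorm(n, sd = 10)
drownings   <- 2 + 0.2 * temperature + rnorm(n, sd = 1)

raw_cor <- cor(ice_cream, drownings)

# Remove the part of each variable explained by temperature, then correlate the residuals
partial_cor <- cor(resid(lm(ice_cream ~ temperature)),
                   resid(lm(drownings ~ temperature)))

round(c(raw = raw_cor, partial = partial_cor), 2)
```

The raw correlation is large while the partial correlation is close to zero, which is the pattern the nonspurious-correlation criterion is meant to rule out.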
Applying criteria for choosing a model
• Latent variables: Perceived course quality, perceived instructor
competence, and perceived instructor interaction
• Discussion: What are our DV(s) and IV(s)?
A model that does not make sense
 A student forms an opinion about interaction,
which influences his/her opinion about
competence, which in turn influences his/her
opinion about course quality.
 Remember criteria of causality
[Path diagram: Inter. → Comp. → Course]
A model that makes more sense
 A student forms his/her opinion on interaction
and competence simultaneously, which influences
perceived course quality
 Opinions on interaction and competence are
correlated because they come from the same
student
 How the instructor offers and what the
instructor offers influence perceived quality
of course
[Path diagram: Comp. and Inter. are correlated, and each has a direct arrow to Course]
Source: Grönroos, C. (1984), A service quality model and its marketing implications, European Journal
of marketing, 18(4), 36-44.
Part 2
(Presented by Kiran Pedada)
SEM Structural Model
 SEM model for the case:
Model
Z = BzU + ez
Here:
 Z is the endogenous latent
variable,
 U is a (2x1) matrix of exogenous
latent variables
 Bz is a (1x2) matrix of
coefficients of exogenous
variables,
 ez is the error associated with
the endogenous variable.
[Path diagram: Perceived Competence and Perceived Interaction → Perceived Course Quality]
Source: “Factor Analysis, Path Analysis, and Structural Equations Modeling”, Book extract, Jones and Bartlett publishers.
http://www.jblearning.com/samples/0763755486/55485_CH14_Walker.pdf
Note: The equation is taken from the above mentioned source. However, the symbols are changed for ease and convenience.
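Written out for this case, the structural equation has one row and two predictors. The ordering of the exogenous variables (U1 as perceived competence, U2 as perceived interaction) and the coefficient names b1 and b2 are assumptions for illustration, since the source does not fix them.

```latex
Z = \begin{pmatrix} b_{1} & b_{2} \end{pmatrix}
    \begin{pmatrix} U_{1} \\ U_{2} \end{pmatrix} + e_{z}
  = b_{1} U_{1} + b_{2} U_{2} + e_{z}
```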
Exogenous Measurement Model
 Exogenous measurement model:
X = BxU + ex
Here:
 X is a (9 x 1) matrix of exogenous indicators,
 Bx is a (9 x 2) matrix of coefficients from the exogenous variables to exogenous
indicators,
 U is a (2 x 1) matrix of exogenous latent variables,
 ex is a (9 x 1) matrix for error associated with the exogenous indicators.
Source: “Factor Analysis, Path Analysis, and Structural Equations Modeling”, Book extract, Jones and Bartlett publishers.
http://www.jblearning.com/samples/0763755486/55485_CH14_Walker.pdf
Note: The equation is taken from the above mentioned source. However, the symbols are changed for ease and convenience.
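For the nine exogenous indicators in this case, the equation can be expanded as below. This is a sketch under stated assumptions: U1 is perceived competence, U2 is perceived interaction, the first loading of each factor is fixed to 1 for scaling (as in the lavaan results shown later), and the symbols λ are illustrative names for the free loadings.

```latex
\begin{pmatrix} R_{1}\\ R_{3}\\ R_{7}\\ R_{8}\\ R_{9}\\ R_{10}\\ R_{6}\\ R_{4}\\ R_{5} \end{pmatrix}
=
\begin{pmatrix}
1            & 0\\
\lambda_{3}  & 0\\
\lambda_{7}  & 0\\
\lambda_{8}  & 0\\
\lambda_{9}  & 0\\
\lambda_{10} & 0\\
0 & 1\\
0 & \lambda_{4}\\
0 & \lambda_{5}
\end{pmatrix}
\begin{pmatrix} U_{1}\\ U_{2} \end{pmatrix}
+
\begin{pmatrix} e_{1}\\ e_{3}\\ e_{7}\\ e_{8}\\ e_{9}\\ e_{10}\\ e_{6}\\ e_{4}\\ e_{5} \end{pmatrix}
```

The zeros in Bx are the loadings that the researcher forces to zero, so each indicator loads on exactly one latent variable.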
Endogenous Measurement Model
 Endogenous measurement model:
Y = ByZ + ey
Here:
 Y is a (6x1) matrix of endogenous indicators,
 By is a (6x1) matrix of coefficients from the endogenous variable to endogenous
indicators,
 Z is a (1x1) matrix of endogenous latent variable,
 ey is a (6x1) matrix for error associated with the endogenous indicators.
Source: “Factor Analysis, Path Analysis, and Structural Equations Modeling”, Book extract, Jones and Bartlett publishers.
http://www.jblearning.com/samples/0763755486/55485_CH14_Walker.pdf
Note: The equation is taken from the above mentioned source. However, the symbols are changed for ease and convenience.
SEM and Analysis of Covariance
 SEM is based on the analysis of covariances
 Analysis of covariances allows for estimation of both
standardized and unstandardized parameters
Source: www.structuralequations.com/resources/SEM+Essentials.pps
Example of Analysis of Covariance
Structure
Compare
S denotes the observed covariances (typically the unstandardized covariances)
∑ denotes the model-implied covariances
Source: www.structuralequations.com/resources/SEM+Essentials.pps
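As a rough sketch of what "model-implied covariances" means, the block below builds ∑ for the three Interaction indicators from the parameter estimates reported in the lavaan results later in this deck (loadings, factor variance, residual variances). The one-factor formula used here, ∑ = φ λλ' + Θ, is the standard factor-model expression and is assumed rather than quoted from the source.

```r
# Model-implied covariance matrix for the Interaction indicators
lambda <- c(RESP_6 = 1.000, RESP_4 = 1.151, RESP_5 = 1.196)  # loadings (Estimate column)
phi    <- 0.374                                               # variance of Interaction
theta  <- c(0.179, 0.103, 0.094)                              # residual variances

# Sigma = phi * lambda lambda' + diag(theta)
Sigma <- phi * tcrossprod(lambda) + diag(theta)
dimnames(Sigma) <- list(names(lambda), names(lambda))
round(Sigma, 3)

# With the raw data available, the observed counterpart S would be something like
# cov(evals[, names(lambda)], use = "complete.obs")
```

Estimation chooses the parameters so that ∑ is as close as possible to S, and the test statistic summarizes the remaining discrepancy.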
R Packages for SEM – Non-commercial
sem
Developer: John Fox (since 2001)
For a long time, the only option in R
Will not do multiple groups
OpenMx
Developer: Steven Boker (available at http://openmx.psyc.Virginia.edu/)
Very powerful
All parts of OpenMx are open-source, except for the NPSOL optimizer, which is closed-source
Somewhat idiosyncratic syntax
lavaan
Developer: Yves Rosseel (http://lavaan.ugent.be/)
First public release: May 2010. Version 0.5-17 was released on CRAN on 1 October 2014
Uses a more compact notation than sem
Will work on multiple groups
Source: Rosseel, Yves. "lavaan: An R package for structural equation modeling." Journal of Statistical Software 48.2 (2012): 1-36.
Source 2: https://personality-project.org/revelle/syllabi/454/wk6.lavaan.pdf
Why lavaan?
 A free, open-source package for latent variable modeling
 Easy and intuitive to use
 Results are typically very close to the results of Mplus
 Powerful, easy-to-use text-based syntax describing the model
 Fairly complete
Source: Rosseel,Yves. "lavaan: An R package for structural equation modeling."Journal of Statistical Software 48.2 (2012): 1-36.
Data
#Data
Data <- read.csv(file.choose(), header = TRUE)
#Responses 1 to 16, kept as a data frame so lavaan can match the column names
evals <- Data[, paste0("RESP_", 1:16)]
Formulae and Operators
Formula type             Operator   Mnemonic
Latent variable          =~         is manifested by
Regression               ~          is regressed on
Covariance               ~~         is correlated with
Defined parameter        :=         is defined as
Equality constraint      ==         is equal to
Inequality constraint    <          is smaller than
Inequality constraint    >          is larger than
Source: Rosseel,Yves. "lavaan: An R package for structural equation modeling."Journal of Statistical Software 48.2 (2012): 1-36.
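A short sketch of how these operators combine in one model string. The labels `b1` and `b2`, the defined parameter `diff`, and the shortened indicator lists are hypothetical and chosen only to show the syntax; the parsing step runs only if lavaan is installed.

```r
# lavaan model syntax using the operators from the table (illustrative model)
model_ops <- '
  # =~ : each latent variable is manifested by its indicators
  Interaction =~ RESP_6 + RESP_4 + RESP_5
  Competence  =~ RESP_1 + RESP_3 + RESP_7
  Course      =~ RESP_11 + RESP_12 + RESP_13

  # ~  : regression, with hypothetical labels b1 and b2 on the coefficients
  Course ~ b1 * Interaction + b2 * Competence

  # ~~ : covariance between the two exogenous latent variables
  Interaction ~~ Competence

  # := : defined parameter computed from the labelled coefficients
  diff := b2 - b1
'

# Parse the syntax into a parameter table if lavaan is available
if (requireNamespace("lavaan", quietly = TRUE)) {
  print(lavaan::lavaanify(model_ops)[, c("lhs", "op", "rhs", "label")])
}
```

Labelling coefficients is what makes defined parameters and constraints possible, because `:=`, `==`, `<`, and `>` refer to parameters by label.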
Specifying the Model
model <- '
# Defining the Latent Variables
Competence =~ RESP_1 + RESP_3 + RESP_7 + RESP_8 +
RESP_9 + RESP_10
Course =~ RESP_11 + RESP_12 + RESP_13 + RESP_14 +
RESP_15 + RESP_16
Interaction =~ RESP_6 + RESP_4 + RESP_5
#Regression
Course ~ Interaction + Competence
#covariance of latent variables
Interaction ~~ Competence '
Install Packages
 install.packages("lavaan")
 install.packages("semPlot")
Running the Model
require("lavaan")
#Fitting the data
fit <- sem(model, data = evals, missing = "FIML")
Dealing with Missing Values in Lavaan
 “listwise” - cases with any missing value are removed before the analysis
 “fiml” - full information maximum likelihood: estimation uses all available data for each case. This is also called “case-wise” maximum likelihood estimation.
Source: http://cran.r-project.org/web/packages/lavaan/lavaan.pdf
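The practical difference between the two options can be sketched on a toy data frame in base R. The four rows below are made up for illustration and are not taken from the evaluation data.

```r
# Toy data with missing responses (hypothetical values)
toy <- data.frame(RESP_1 = c(5, 4, NA, 3),
                  RESP_2 = c(4, NA, 5, 3))

n_listwise <- sum(complete.cases(toy))        # rows that survive listwise deletion
n_any_data <- sum(rowSums(!is.na(toy)) > 0)   # rows with at least one observed value
n_patterns <- nrow(unique(is.na(toy)))        # distinct missingness patterns

c(listwise = n_listwise, any_data = n_any_data, patterns = n_patterns)
```

Listwise deletion keeps only the two complete rows here, whereas a case-wise likelihood can draw on all four rows because each contributes whatever it has observed.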
Examining the Results
#Examining the results
summary(fit, fit.measures = TRUE, standardized = TRUE)
Examining the Results
  Number of observations                      Used      Total
                                              7828       7830
  Number of missing patterns                    92

  Estimator                                     ML
  Minimum Function Test Statistic         6068.046
  Degrees of freedom                            87
  P-value (Chi-square)                       0.000

Parameter estimates:

  Information                             Observed
  Standard Errors                         Standard
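The 87 degrees of freedom can be checked by hand as the number of sample moments minus the number of free parameters. The parameter count below is a sketch that assumes a mean structure is estimated (intercepts are reported in the results) and that the first loading of each factor is fixed to 1.

```r
# Degrees of freedom = sample moments - free parameters
p <- 15                                   # indicators in the model (RESP_2 is excluded)
n_moments <- p * (p + 1) / 2 + p          # variances, covariances, and means

n_free <- c(loadings           = 12,      # 15 loadings minus 3 fixed to 1
            regressions        = 2,       # Course ~ Interaction + Competence
            latent_covariance  = 1,       # Interaction ~~ Competence
            residual_variances = 15,
            latent_variances   = 3,       # 2 exogenous variances + 1 disturbance
            intercepts         = 15)

df <- n_moments - sum(n_free)

# Upper-tail probability of the reported test statistic on df degrees of freedom
p_value <- pchisq(6068.046, df = df, lower.tail = FALSE)
c(df = df, p_value = p_value)
```

With a sample this large the chi-square test rejects exact fit almost regardless of how small the misfit is, so it is usually read alongside other fit measures.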
Examining the Results
                   Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
Latent variables:
  Competence =~
    RESP_1            1.000                               0.778    0.902
    RESP_3            1.038    0.009  121.814    0.000    0.807    0.889
    RESP_7            1.072    0.009  114.296    0.000    0.834    0.867
    RESP_8            0.957    0.008  114.973    0.000    0.745    0.871
    RESP_9            1.026    0.009  110.423    0.000    0.798    0.855
    RESP_10           0.695    0.007   94.256    0.000    0.541    0.792
  Course =~
    RESP_11           1.000                               0.853    0.869
    RESP_12           0.971    0.009  110.946    0.000    0.829    0.891
    RESP_13           0.947    0.009  107.388    0.000    0.808    0.879
    RESP_14           0.766    0.008   90.252    0.000    0.654    0.805
    RESP_15           0.829    0.009   90.857    0.000    0.707    0.808
    RESP_16           0.890    0.010   88.775    0.000    0.760    0.795
  Interaction =~
    RESP_6            1.000                               0.612    0.822
    RESP_4            1.151    0.012   97.686    0.000    0.704    0.910
    RESP_5            1.196    0.012  100.429    0.000    0.731    0.922

Regressions:
  Course ~
    Interaction       0.075    0.019    4.059    0.000    0.054    0.054
    Competence        0.929    0.016   56.843    0.000    0.847    0.847

Covariances:
  Competence ~~
    Interaction       0.394    0.008   48.130    0.000    0.828    0.828
Plotting the SEM Path Diagram
#SEM path diagram
library(semPlot)
# Plot input path diagram
semPaths(fit, title = FALSE, curvePivot = TRUE, exoVar = FALSE, exoCov = FALSE)
# Plot output path diagram with standardized parameters
semPaths(fit, "std", edge.label.cex = 1.0, curvePivot = TRUE)
Input Path Diagram
Output Path Diagram
Relating to the Results
                   Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
Intercepts:
    RESP_1            4.380    0.010  448.881    0.000    4.380    5.077
    RESP_3            4.366    0.010  425.167    0.000    4.366    4.810
    RESP_7            4.306    0.011  395.835    0.000    4.306    4.479
    RESP_8            4.435    0.010  458.797    0.000    4.435    5.191
    RESP_9            4.361    0.011  413.101    0.000    4.361    4.674
    RESP_10           4.637    0.008  600.331    0.000    4.637    6.792
    RESP_11           4.295    0.011  386.301    0.000    4.295    4.372
    RESP_12           4.301    0.011  408.596    0.000    4.301    4.624
    RESP_13           4.313    0.010  414.576    0.000    4.313    4.694
    RESP_14           4.472    0.009  486.091    0.000    4.472    5.506
    RESP_15           4.408    0.010  444.632    0.000    4.408    5.036
    RESP_16           4.345    0.011  401.296    0.000    4.345    4.546
    RESP_6            4.578    0.008  543.730    0.000    4.578    6.155
    RESP_4            4.548    0.009  519.595    0.000    4.548    5.879
    RESP_5            4.558    0.009  507.973    0.000    4.558    5.747
    Competence        0.000                               0.000    0.000
    Course            0.000                               0.000    0.000
    Interaction       0.000                               0.000    0.000
Relating to the Results
                   Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
Variances:
    RESP_1            0.139    0.003                      0.139    0.187
    RESP_3            0.172    0.003                      0.172    0.209
    RESP_7            0.230    0.004                      0.230    0.248
    RESP_8            0.176    0.003                      0.176    0.241
    RESP_9            0.234    0.004                      0.234    0.269
    RESP_10           0.174    0.003                      0.174    0.373
    RESP_11           0.237    0.005                      0.237    0.245
    RESP_12           0.178    0.004                      0.178    0.205
    RESP_13           0.191    0.004                      0.191    0.227
    RESP_14           0.232    0.004                      0.232    0.352
    RESP_15           0.266    0.005                      0.266    0.348
    RESP_16           0.336    0.006                      0.336    0.368
    RESP_6            0.179    0.003                      0.179    0.324
    RESP_4            0.103    0.003                      0.103    0.172
    RESP_5            0.094    0.003                      0.094    0.150
    Competence        0.605    0.012                      1.000    1.000
    Course            0.148    0.004                      0.204    0.204
    Interaction       0.374    0.009                      1.000    1.000
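The standardized columns can be reproduced from the unstandardized estimates, which helps when reading the output. The sketch below uses the reported values for RESP_1 and Competence; reading 1 minus the standardized disturbance variance of Course as its explained variance is the usual interpretation and is added here as an assumption, since the slides do not report R-squared.

```r
# Relating Estimate, Std.lv, and Std.all for RESP_1 on Competence
loading   <- 1.000   # unstandardized loading of RESP_1
var_comp  <- 0.605   # variance of Competence
resid_var <- 0.139   # residual variance of RESP_1

# Std.lv: loading rescaled so that the latent variable has unit variance
std_lv <- loading * sqrt(var_comp)

# Std.all: additionally divide by the model-implied standard deviation of the indicator
std_all <- std_lv / sqrt(loading^2 * var_comp + resid_var)

# Share of variance in Course explained by Interaction and Competence
r2_course <- 1 - 0.204

round(c(std_lv = std_lv, std_all = std_all, r2_course = r2_course), 3)
```

The first two values match the 0.778 and 0.902 reported for RESP_1 in the latent variable table.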
References
 Borsboom, D., Mellenbergh, G. J., & Van Heerden, J. (2003), The theoretical status of latent variables, Psychological Review, 110(2), 203.
 Borsboom, D. (2008), Latent variable theory, Measurement, 6, 25-53.
 Grönroos, C. (1984), A service quality model and its marketing implications, European Journal of Marketing, 18(4), 36-44.
 Howell, R. D. (2014), course materials from MKT 6355 Theory Testing.
 Hunt, S. D. (2010), Foundations of marketing theory: Toward a general theory of marketing, ME Sharpe.
 Rosseel, Y. (2012), lavaan: An R package for structural equation modeling, Journal of Statistical Software, 48(2), 1-36.
Thank You