Nov. 16 - Global Association for Research Methods and Data Science

Structural Equation Modeling

Mgmt 291

Lecture 8 – Model Diagnostics and Model Validation

Nov. 16, 2009

Computing Problem 1:

“Not positive definite”

• A determinant ≤ 0 makes log|Σ| and log|S| undefined in the fit function:

$F = \log|\Sigma(\Theta)| + \mathrm{tr}\left(S\,\Sigma^{-1}(\Theta)\right) - \log|S| - (p+q)$

• The computation cannot move forward.
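To see why the computation stalls, here is a minimal Python sketch of the fit function above, assuming NumPy is available. The name `ml_fit` and the example matrix are illustrative and not part of LISREL; `k` stands for the number of observed variables (p+q).

```python
import numpy as np

def ml_fit(S, Sigma):
    """log|Sigma| + tr(S Sigma^-1) - log|S| - k, with k = p + q observed variables."""
    S = np.asarray(S, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    k = S.shape[0]
    sign_S, logdet_S = np.linalg.slogdet(S)
    sign_Sigma, logdet_Sigma = np.linalg.slogdet(Sigma)
    if sign_S <= 0 or sign_Sigma <= 0:
        # A determinant <= 0 has no real logarithm, so F cannot be evaluated.
        raise ValueError("determinant <= 0: log-determinant is undefined")
    # tr(S Sigma^-1) equals tr(Sigma^-1 S); solve() avoids forming the inverse.
    trace_term = np.trace(np.linalg.solve(Sigma, S))
    return logdet_Sigma + trace_term - logdet_S - k

S = [[1.0, 0.5], [0.5, 1.0]]
print(ml_fit(S, S))  # perfect fit, so F is 0 up to rounding
```

Passing a singular matrix such as `[[1, 1], [1, 1]]` raises the error, which mirrors the point at which an SEM program stops.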

Common Sources

1) There are redundancies among the variables in the correlation matrix; in other words, some of the variables may be a linear function of some of the other variables.

You can fix this by removing the redundant variables or collecting more data.

2) Your model may be estimating more parameters than you have degrees of freedom to use. You can check this by examining how many degrees of freedom you have and the number of parameters you are estimating.

3) LISREL is not correctly reading the raw data, correlation matrix, or covariance.
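The check in item 2 can be sketched with the usual counting rule: k observed variables supply k(k+1)/2 distinct variances and covariances, and the degrees of freedom are that count minus the number of free parameters. The function name and the numbers in the example are hypothetical.

```python
def model_df(n_observed, n_free_params):
    """Degrees of freedom: distinct sample moments minus free parameters."""
    n_moments = n_observed * (n_observed + 1) // 2
    return n_moments - n_free_params

# Hypothetical model: 6 observed variables, 13 free parameters.
df = model_df(6, 13)
print(df)  # 21 moments - 13 parameters = 8
if df < 0:
    print("More parameters than sample moments: the model cannot be estimated.")
```

A negative result means the model asks for more parameters than the data can support.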

Other causes of “not positive definite”

Starting Values

The model-implied matrix Sigma is computed from the model's parameter estimates.

Especially before iterations begin, those estimates may be such that Sigma is not positive definite. So if the problem relates to Sigma, first make sure that the model has been specified correctly, with no syntax errors. If the proposed model is "unusual," then the starting value routines that are incorporated into most SEM programs may fail. Then it is up to the researcher to supply likely starting values.

Sampling Variation

When sample size is small, a sample covariance or correlation matrix may be not positive definite due to mere sampling fluctuation. It has been documented how parameter matrices (Theta-Delta, Theta-Epsilon, Psi and possibly Phi) may be not positive definite through mere sampling fluctuation. Most often, such cases involve "improper solutions," where some variance parameters are estimated as negative. In such cases, it has been suggested that the offending estimates could be fixed to zero with minimal harm to the program.

Missing Data

Solution 1 - Diagnostics

Multi-collinearity

Missing Values
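A rough way to run both diagnostics outside LISREL is sketched below, assuming NumPy and raw data with missing values coded as NaN. The function name, the listwise deletion step, and the tolerance are assumptions made for illustration.

```python
import numpy as np

def diagnose(data, tol=1e-8):
    """Count missing values and test the covariance matrix for positive definiteness."""
    X = np.asarray(data, dtype=float)
    n_missing = int(np.isnan(X).sum())
    # Listwise deletion: keep only the rows with no missing value.
    complete = X[~np.isnan(X).any(axis=1)]
    S = np.cov(complete, rowvar=False)
    eig = np.linalg.eigvalsh(S)  # eigenvalues in ascending order
    return {
        "n_missing": n_missing,
        "n_complete": int(complete.shape[0]),
        "min_eigenvalue": float(eig[0]),
        # A smallest eigenvalue near or below zero signals collinearity.
        "positive_definite": bool(eig[0] > tol * eig[-1]),
    }

# Made-up data: the third column is the sum of the first two.
rows = [[1, 2, 3], [2, 1, 3], [3, 5, 8], [4, 4, 8], [0, 1, 1], [np.nan, 2, 2]]
print(diagnose(rows))
```

In this made-up data set the smallest eigenvalue is zero up to rounding, so the matrix is flagged as not positive definite.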

Solution 2

Provide starting values

ST .5 ALL

ST .6 BE(2,1) LY(1,3) …

• In SIMPLIS, write starting values in equations, in parentheses followed by an asterisk (*); a value written without parentheses fixes the coefficient at that value instead.

TotalScore = (1)* Verbal

TotalScore = 1*Verbal

Guideline on Selecting Starting Values

Parameter          Starting value                    Choice of a
BE(i,j), i ≠ j     a × (sd of y_i / sd of y_j)       |a| = .9 strong, .4 moderate, .2 weak
GAMMA(i,j), i ≠ j  a × (sd of y_i / sd of x_j)       |a| = .9 strong, .4 moderate, .2 weak
PS(i,i)            a × var(y_i)                      |a| = .9 weak fit, .4 moderate, .2 strong fit
PS(i,j), i ≠ j     a × (PS(i,i) × PS(j,j))^(1/2)     |a| = .9 strong, .4 moderate, .2 weak correlation
PH                 sample covariance of X
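The guideline rows translate directly into arithmetic. The helper names below are hypothetical, and the standard deviations in the example are made up; the sketch only shows how a chosen value of a is scaled.

```python
import math

def start_be(a, sd_yi, sd_yj):
    """BE(i,j): a times the ratio of the two y standard deviations."""
    return a * sd_yi / sd_yj

def start_gamma(a, sd_yi, sd_xj):
    """GAMMA(i,j): a times sd of y_i over sd of x_j."""
    return a * sd_yi / sd_xj

def start_ps_diag(a, var_yi):
    """PS(i,i): a times the variance of y_i (large a means weak fit)."""
    return a * var_yi

def start_ps_offdiag(a, ps_ii, ps_jj):
    """PS(i,j): a times the square root of PS(i,i) * PS(j,j)."""
    return a * math.sqrt(ps_ii * ps_jj)

# Made-up example: a moderate effect (a = .4) of y_2 on y_1,
# with sd(y_1) = 2.0 and sd(y_2) = 4.0.
print(start_be(0.4, 2.0, 4.0))
```

The resulting number is what would be supplied on an ST line for that parameter.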

Solution 3

Try other estimation methods

IV

2SLS

OLS

In LISREL, OU RC= c

Sidestepping the Problem

• Make a ridge adjustment to the covariance or correlation matrix. This involves adding some quantity to the diagonal elements of the matrix. This addition has the effect of attenuating the estimated relations between variables. A large enough addition is sure to result in a positive definite matrix.

• The price of this adjustment, however, is bias in the parameter estimates, standard errors, and fit indices.

• A constant times the diagonal of S is added to S, repeating 10 times until the matrix becomes positive definite.
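The ridge idea can be sketched in a few lines of Python, assuming NumPy. This is not LISREL's own routine: the starting constant, the rule of multiplying it by 10 after each failed attempt, and the cap of 10 attempts are assumptions chosen for illustration.

```python
import numpy as np

def is_positive_definite(M):
    """True when the smallest eigenvalue of the symmetric matrix M is above zero."""
    return bool(np.linalg.eigvalsh(M)[0] > 0)

def ridge_adjust(S, c=0.001, max_rounds=10):
    """Add c * diag(S) to S, growing c until the result is positive definite."""
    S = np.asarray(S, dtype=float)
    if is_positive_definite(S):
        return S, 0.0  # nothing to fix
    for _ in range(max_rounds):
        adjusted = S + c * np.diag(np.diag(S))
        if is_positive_definite(adjusted):
            return adjusted, c
        c *= 10  # assumed growth rule, see the note above
    raise ValueError("matrix still not positive definite after ridge adjustment")

# Made-up matrix with eigenvalues 2.2 and -0.2, so it is not positive definite.
S = [[1.0, 1.2], [1.2, 1.0]]
adjusted, c_used = ridge_adjust(S)
print(c_used)
print(adjusted)
```

Only the diagonal changes, which is why the implied correlations shrink and the estimates become biased.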

Computing Problem 2:

Negative error variance

• Construct with only one indicator

• Too many latent variables for one indicator

Example 1

• sab1.spl: syntax errors
• sab2.spl (created latent vars): still a problem
• sab3.spl (use correlation matrix): negative error variance
• sab4.spl (set error variance to .001): ok

Using the correlation matrix and setting the error variance to 0 solves the problem.

Example 2

Step by step diagnostics

• bollen80.ls8 (no method factors): ok
• bollen80f1.ls8 (with all methods in): not working
• bollen80f1t.ls8 (simplified): works
• bollen80f2.ls8 (then add back to move up): okay

Bollen’s model

[Figure: path diagram of Bollen's model. Political Liberties is measured by x1 to x4 and Democratic Rule by x5 to x8; Sussman, Gastil, and Banks are the method factors.]

MTMM – Multitrait-Multimethod

Convergent validity – high correlation of indicators from different methods for the same trait

Discriminant validity – low correlation of indicators from the same method for different traits

MTMM Correlation Matrix

               M1              M2
               T1      T2      T1      T2
               x1      x2      x3      x4
M1   T1   x1
     T2   x2   Corr12
M2   T1   x3   Corr13
     T2   x4
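Reading the two validity checks off such a matrix can be sketched as follows. The trait and method labels follow the layout above (x1 and x2 from M1, x3 and x4 from M2); the function name and the correlation values are made up for illustration.

```python
def mtmm_summary(R, labels):
    """Split the lower triangle of R into convergent and discriminant correlations."""
    convergent, discriminant = [], []
    for i in range(len(labels)):
        for j in range(i):
            trait_i, method_i = labels[i]
            trait_j, method_j = labels[j]
            if trait_i == trait_j and method_i != method_j:
                convergent.append(R[i][j])    # same trait, different methods
            elif trait_i != trait_j and method_i == method_j:
                discriminant.append(R[i][j])  # different traits, same method
    return convergent, discriminant

labels = [("T1", "M1"), ("T2", "M1"), ("T1", "M2"), ("T2", "M2")]
# Made-up correlations for x1..x4.
R = [
    [1.00, 0.30, 0.80, 0.20],
    [0.30, 1.00, 0.25, 0.70],
    [0.80, 0.25, 1.00, 0.35],
    [0.20, 0.70, 0.35, 1.00],
]
convergent, discriminant = mtmm_summary(R, labels)
print(convergent)    # [0.8, 0.7]
print(discriminant)  # [0.3, 0.35]
```

In this made-up case the convergent correlations are high and the discriminant ones are low, which is the pattern the two definitions ask for.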

References for Bollen’s Example

Kenneth Bollen. 1993. Liberal Democracy: Validity and Method Factors in Cross-National Measures. American Journal of Political Science, Vol. 37 (November), 1207-1230.

Kenneth Bollen. 1989. Structural Equations with Latent Variables. New York: Wiley.

Kenneth Bollen and J. Scott Long (eds.). 1993. Testing Structural Equation Models. Sage Publications.

Kline’s list of 35 ways to mislead us

3. Fail to have sufficient numbers of indicators of latent variables

7. Overfit the model

8. Add disturbance or measurement error correlations without substantive reasons

……

More

26. Interpret good fit as meaning that the model is “proved”.

More

34. Fail to provide enough information so that your reader can reproduce your results

Model validation

Estimation methods always minimize residuals

$\mathrm{CVI} = F(S_v, \Sigma) - \frac{1}{2 n_v}\,k(k+1)$

where F is the fit function, $S_v$ is the covariance matrix or correlation matrix of the validation sample, and $\Sigma$ is the covariance (correlation) matrix fitted in the exploration sample under the model. The last matrix is saved in a file by including the line:

Save Sigma in File SIGMA2
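As a rough numerical companion to the CVI expression, the sketch below assumes that F is the maximum likelihood fit function, that k is the number of observed variables, and that n_v is the validation sample size; the slide does not define k or n_v, so these readings are assumptions. The matrices are made up, and NumPy is assumed.

```python
import numpy as np

def fit_function(S, Sigma):
    """ML fit function: log|Sigma| + tr(S Sigma^-1) - log|S| - k."""
    S = np.asarray(S, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    k = S.shape[0]
    sign_S, logdet_S = np.linalg.slogdet(S)
    sign_Sigma, logdet_Sigma = np.linalg.slogdet(Sigma)
    if sign_S <= 0 or sign_Sigma <= 0:
        raise ValueError("determinant <= 0: log-determinant is undefined")
    return logdet_Sigma + np.trace(np.linalg.solve(Sigma, S)) - logdet_S - k

def cvi(S_v, Sigma, n_v):
    """CVI = F(S_v, Sigma) - k(k+1) / (2 n_v), with k assumed to be the number of variables."""
    k = np.asarray(S_v).shape[0]
    return fit_function(S_v, Sigma) - k * (k + 1) / (2.0 * n_v)

# Made-up matrices: Sigma fitted in the exploration sample, S_v from the validation sample.
Sigma = [[1.0, 0.5], [0.5, 1.0]]
S_v = [[1.0, 0.4], [0.4, 1.0]]
print(cvi(S_v, Sigma, n_v=395))
```

When several candidate models are compared this way, a smaller CVI indicates a model that carries over better to the validation sample.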

Cross Validation

Program Sample

Cross-Validating Panel Model 2

Observed Variables from File PANEL.LAB

Correlation Matrix from File PANELUSA.PMV

Sample Size 395

Crossvalidate File SIGMA2

End of Problem

Save Sigma from ex9b.spl

Use Ex9bcv.spl
