2/18/2010

JUMPSTART Mplus
Exploratory and Confirmatory Factor Analysis

Factor Analysis

Exploratory Factor Analysis (EFA)
A method of data reduction which infers the presence of latent factors that are responsible for the shared variance in a set of observed variables / items. EFA is by definition ‘exploratory’: the user does not specify a structure, and assumes each item / variable could be related to each latent factor.

Confirmatory Factor Analysis (CFA)
The user defines which observed variables / items are related to the specified constructs or latent factors, based on a priori theory or the results of EFA.

EFA vs. CFA

Purpose (EFA and CFA): to identify latent factors that account for variance and covariance among a set of observed variables (both are based on the common factor model).

EFA
• Descriptive / exploratory procedure
• Input: correlation matrix (all variables standardized)
• Factor selection based on eigenvalue procedures and model fit statistics
• Factor rotation to obtain simple structure
• Unique variances / measurement error uncorrelated

CFA
• Requires strong empirical or conceptual foundation
• Input: variance-covariance matrix (standardized and unstandardized solution)
• Prespecification of number of factors and pattern of factor loadings
• Simple structure is achieved by fixing (most) indicator cross-loadings to zero
• Unique variances / measurement error can be modelled

Overall, CFA offers more parsimonious solutions and greater modelling flexibility than EFA.

Latent Variables

Latent variables are variables that are not measured directly but are inferred through the relationships (or shared variance) of a set of observed (measured) variables.
For example: depression, measured by a set of questionnaire items (i.e. observed variables), or ability, measured by a set of items designed to tap IQ. This compares with temperature, which is directly measured.
An advantage of using latent variables is that they reduce the dimensionality of data.
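The common factor model that underlies both EFA and CFA makes this reduction explicit. The notation below is the conventional one and is an assumption here (the slides do not define symbols); the matrix names are chosen to match the LAMBDA, PSI and THETA labels used in Mplus output. With p observed items and m < p latent factors, and assuming the items are centred and the factors are uncorrelated with the unique parts:

```latex
\begin{aligned}
x      &= \Lambda \eta + \varepsilon, \\
\Sigma &= \Lambda \Psi \Lambda^{\top} + \Theta
\end{aligned}
```

Here x is the p x 1 vector of observed items, \Lambda the p x m matrix of factor loadings, \eta the m x 1 vector of latent factors, \varepsilon the unique variances / measurement error, \Psi the covariance matrix of the factors, \Theta the covariance matrix of the unique parts, and \Sigma the model-implied covariance matrix of the items.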
A large number of observable variables can be aggregated to represent an underlying concept.

EFA/CFA important data considerations

Prior to analyses, need to check:
• Continuous or categorical variables
• Normal distribution
• Missing data (or partially missing item data)
• Sample size
• Item endorsement
• Theoretical basis of model

Example of EFA using GHQ_12

1. Been able to concentrate on what you’re doing
2. Lost much sleep over worry
3. Felt you were playing a useful part in things
4. Felt capable of making decisions about things
5. Felt constantly under strain
6. Felt you couldn’t overcome your difficulties
7. Been able to enjoy your normal day-to-day activities
8. Been able to face up to your problems
9. Been feeling unhappy and depressed
10. Been losing confidence in yourself
11. Been thinking of yourself as a worthless person
12. Been feeling reasonably happy, all things considered

Response scale for a negatively worded item:
1. Not at all
2. No more than usual
3. Rather more than usual
4. Much more than usual

Response scale for a positively worded item:
1. More so than usual
2. Same as usual
3. Less useful than usual
4. Much less useful

NB: Mix of positively and negatively worded items

First Stage - Importing the data into Mplus: Stata2Mplus

In Stata:

    stata2mplus using E:\mplus\egoghq12r1.dta

Stata responds:

    Looks like this was a success.
    To convert the file to mplus, start mplus and run the file
    E:\mplus\egoghq12r1.dta.inp

(NB: Need to import this program into Stata first, using the findit command)

Stata2Mplus - Mplus

    Title:
      Stata2Mplus conversion for E:\mplus\egoghq12r1.dta.dta
      List of variables converted shown below

      id :
      ghq01 : Able to concentrate P
        1: Better than usual
        2: Same as usual
        3: Less than usual
        4: Much less than usual
      ghq02 : Lost sleep N
        1: Not at all
        2: No more than usual
      <SNIP> .......
      ! Item and value labels are automatically
      ! returned if labelled in Stata

    Data:
      File is E:\mplus\egoghq12r1.dta.dat ;
    Variable:
      Names are
         id ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
         ghq07 ghq08 ghq09 ghq10 ghq11 ghq12;
      Missing are all (-9999) ;
      ! Note if your missing are coded differently alter this
    Analysis:
      Type = basic ;
      ! Can run this initially to check your data and get descriptives
      ! But at the moment it will include all variables including IQ

Initially Run Type=Basic

    MISSING DATA PATTERNS (x = not missing)

                 1
     ID          x
     GHQ01       x
     GHQ02       x
     .
     .
     .
     GHQ12       x

    COVARIANCE COVERAGE OF DATA

                 ID      GHQ01    GHQ02
     ID        1.000
     GHQ01     1.000    1.000
     GHQ02     1.000    1.000    1.000
     .
     .

The output also contains the estimated sample statistics (means, covariances, correlations).

THIS WILL ALSO GIVE YOU THE NUMBER OF OBSERVATIONS IN THE ANALYSIS (IMPT TO CHECK)

Results from type = Basic

The output reports covariance coverage (as above), means, covariances and correlations. Means and correlations for the first five items:

    Means
      GHQ01    GHQ02    GHQ03    GHQ04    GHQ05
      2.383    2.161    2.132    2.123    2.424

    Correlations
               GHQ01    GHQ02    GHQ03    GHQ04    GHQ05
     GHQ01     1.000
     GHQ02     0.429    1.000
     GHQ03     0.407    0.231    1.000
     GHQ04     0.475    0.284    0.528    1.000
     GHQ05     0.492    0.526    0.252    0.336    1.000

EFA_1 - GHQ-12 Items as continuous

    Title:
      Stata2Mplus conversion for E:\mplus\egoghq12r1.dta.dta
      ! Can change title
    Data:
      File is E:\mplus\egoghq12r1.dta.dat ;
      ! Data file from stata so .dta.dat
    Variable:
      Names are
         id ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
         ghq07 ghq08 ghq09 ghq10 ghq11 ghq12;
      Missing are all (-9999) ;
      Usevariables are
      ! Here we specify which variables to use in the model (not IQ)
         ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
         ghq07 ghq08 ghq09 ghq10 ghq11 ghq12;
    Analysis:
      Type = efa 1 4;
      ! Specify potential number of factors based on no of items
      Estimator = ml;       ! Default is ULS
      Rotation = promax;    ! Default is geomin
    Output:
      sampstat;             ! This will give correlation matrix, means etc

Rotation

• Orthogonal rotation
  – factors are constrained to be uncorrelated
  – interpretability of orthogonally rotated solutions (i.e. factors and factor loadings)
  – e.g. varimax
• Oblique rotation
  – factors are allowed to intercorrelate
  – often preferred as it may provide a more realistic representation of how factors are interrelated
  – information on potential higher-order structure
  – e.g. promax

Mplus V5 offers a wide choice of rotation types.

EFA_1 output - Eigenvalues

    RESULTS FOR EXPLORATORY FACTOR ANALYSIS

    EIGENVALUES FOR SAMPLE CORRELATION MATRIX
          1        2        3        4        5        6
      6.277    1.072    0.803    0.597    0.565    0.497

          7        8        9       10       11       12
      0.460    0.445    0.375    0.365    0.319    0.225

EFA_1 - Test of model fit

    EXPLORATORY FACTOR ANALYSIS WITH 1 FACTOR(S):
    ! It will return goodness of fit for each
    ! of the factors specified

    TESTS OF MODEL FIT

    Chi-Square Test of Model Fit
      Value                         836.052
      Degrees of Freedom                 54
      P-Value                        0.0000   ! sensitive to sample size, looking for n/s

    RMSEA (Root Mean Square Error Of Approximation)
      Estimate                        0.114   (< 0.06 good model fit)
      90 Percent C.I.                 0.107  0.121   ! These do not include 0.06
      Probability RMSEA <= .05        0.000

    Root Mean Square Residual
      Value                           0.060   (< 0.08 good model fit)

Goodness of Fit EFA 1-4 factors - ML

                                        1F        2F        3F        4F
    Chi-Square Test of Model Fit
      Value                        836.052   476.506   155.396    87.864
      Degrees of Freedom                54        43        33        24
      P-Value                        0.000     0.000     0.000     0.000
    RMSEA
      Estimate                       0.114     0.095     0.058     0.049
      90 Percent C.I. (lower)        0.107     0.087     0.049     0.038
      90 Percent C.I. (upper)        0.121     0.103     0.067     0.060
      Probability RMSEA <= .05       0.000     0.000     0.081     0.553
    Root Mean Square Residual        0.060     0.039     0.020     0.014

EFA 1 - Factor loadings

ESTIMATED FACTOR LOADINGS (for 2 or more factors use rotated loadings)

    Model      1F     2F     2F     3F     3F     3F     4F     4F     4F     4F
    Factor      1      1      2      1      2      3      1      2      3      4
    GHQ01    0.68   0.42   0.33   0.49   0.41  -0.13   0.69   0.18  -0.11   0.01
    GHQ02    0.61  -0.09   0.72  -0.08   0.74   0.01   0.02   0.72  -0.05   0.03
    GHQ03    0.53   0.73  -0.09   0.69  -0.13   0.09   0.45  -0.15   0.18   0.19
    GHQ04    0.60   0.82  -0.09   0.76  -0.07   0.05   0.03   0.03  -0.02   1.03
    GHQ05    0.67  -0.01   0.71   0.01   0.73   0.01   0.15   0.64  -0.02   0.01
    GHQ06    0.75   0.24   0.57   0.21   0.49   0.16   0.17   0.46   0.18   0.09
    GHQ07    0.65   0.37   0.35   0.44   0.41  -0.11   0.71   0.16  -0.09  -0.06
    GHQ08    0.69   0.49   0.28   0.47   0.22   0.13   0.38   0.15   0.18   0.12
    GHQ09    0.81  -0.01   0.87  -0.03   0.69   0.28   0.12   0.60   0.27  -0.05
    GHQ10    0.80   0.23   0.63   0.07   0.31   0.61  -0.02   0.32   0.65   0.02
    GHQ11    0.73   0.34   0.46   0.17   0.06   0.72   0.03   0.04   0.85  -0.03
    GHQ12    0.75   0.36   0.46   0.36   0.35   0.16   0.56   0.14   0.23  -0.09

Slide callouts: "This item cross loads"; "Only 2 items on 3rd factor".

NB: Use the ‘rotated’ loadings from the output as factor loadings, i.e. regression coefficients. The factor structure is the correlation between each item and factor. Items loading < 0.40 are considered poor loadings.

EFA_1: Factor correlations and determinacies

    PROMAX FACTOR CORRELATIONS

    2 Factor:
                1        2
       1    1.000
       2    0.668    1.000

    3 Factor:
                1        2        3
       1    1.000
       2    0.627    1.000
       3    0.551    0.540    1.000

    FACTOR DETERMINACIES

    2 Factor:   1  0.916    2  0.949
    3 Factor:   1  0.911    2  0.931    3  0.902

EFA_2 with categorical variables

    DATA:
      File is E:\mplus\egoghq12r1.dta.dat ;
    VARIABLE:
      Names are
         id ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
         ghq07 ghq08 ghq09 ghq10 ghq11 ghq12;
      Missing are all (-9999) ;
      USEVARIABLES are ghq01 - ghq12;
      CATEGORICAL ARE ghq01 - ghq12;   ! Add in categorical statement
    ANALYSIS:
      TYPE = efa 1 4;
      ESTIMATOR = WLSMV;               ! Changed Estimator
      ROTATION = promax;
    OUTPUT:
      sampstat;

EFA with categorical variables

    Model      1F     2F     2F     3F     3F     3F     4F     4F     4F     4F
    Factor      1      1      2      1      2      3      1      2      3      4
    GHQ01    0.73   0.53   0.29   0.45   0.57  -0.19   0.71   0.21   0.07  -0.14
    GHQ02    0.66  -0.05   0.73  -0.14   0.74   0.12   0.06   0.69  -0.03   0.04
    GHQ03    0.60   0.79  -0.10   0.76  -0.13   0.12   0.32  -0.18   0.42   0.17
    GHQ04    0.70   0.86  -0.06   0.80  -0.05   0.11  -0.04   0.06   0.99  -0.03
    GHQ05    0.72   0.05   0.72  -0.07   0.79   0.08   0.11   0.74   0.02  -0.02
    GHQ06    0.79   0.23   0.63   0.16   0.53   0.25   0.01   0.56   0.23   0.18
    GHQ07    0.71   0.53   0.27   0.44   0.53  -0.16   0.67   0.19   0.06  -0.09
    GHQ08    0.73   0.53   0.30   0.45   0.29   0.15   0.30   0.17   0.25   0.17
    GHQ09    0.86   0.07   0.84  -0.01   0.66   0.36   0.25   0.49  -0.14   0.39
    GHQ10    0.87   0.07   0.84   0.11   0.28   0.66  -0.01   0.27   0.05   0.70
    GHQ11    0.83   0.09   0.78   0.16   0.09   0.78  -0.04   0.08   0.04   0.88
    GHQ12    0.80   0.43   0.45   0.35   0.44   0.16   0.64   0.07  -0.09   0.29

Slide callout: "Only 2 items on 3rd factor".

Difference from ML (continuous model): item 07 does not crossload but item 12 does.

EFA 2 with categorical variables - Goodness of fit

                                        1F        2F        3F        4F
    Chi-Square Test of Model Fit
      Value                        943.696    556.16   190.307    99.849
      Degrees of Freedom                33        28        26        20
      P-Value                        0.000     0.000     0.000     0.000
    RMSEA
      Estimate                       0.157      0.13     0.075     0.060
    Root Mean Square Residual        0.075     0.050     0.020     0.015

(90 Percent C.I. and Probability RMSEA <= .05 are listed on the slide without values.)

Factor determinacies:
    2F: 0.935, 0.966
    3F: 0.926, 0.948, 0.941

Goodness of fit suggests a 3 or 4 factor model, but only 2 items load on the 3rd factor.

EFA Summary

• EFA is exploratory and requires interpretation
• The Mplus user can specify if the item responses are continuous (as in PCA) or binary (categorical) or ordinal
• Treatment of variables as binary or ordinal is particularly useful if item responses are not normally distributed
• Mplus can also include missing item level data
• Different rotations can be specified

Confirmatory Factor Analysis in Mplus

Alternative model structures for the GHQ_12 from Shevlin/Adamson 2005

Model 2 is the same as suggested by EFA2 (WLSMV with categorical data).

CFA_1 GHQ-12 Politi et al, 1994

[Path diagram: items 01 to 12 as observed variables, with latent variables F1 (Dysphoria) and F2 (Social dysfunction)]
• Squares or rectangles represent observed variables
• Circles represent latent variables

Specifying the model

F1 (Dysphoria), items 02 05 06 09 10 11 12:

    F1 by ghq02* ghq05 ghq06 ghq09 ghq10 ghq11 ghq12;
    F1@1;

F2 (Social dysfunction), items 01 03 04 07 08 12:

    F2 by ghq01* ghq03 ghq04 ghq07 ghq08 ghq12;
    F2@1;

Correlation between F1 and F2:

    F1 with F2;

Mplus syntax for 2 factor CFA model

    VARIABLE:
      Names are
         id ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
         ghq07 ghq08 ghq09 ghq10 ghq11 ghq12;
      Missing are all (-9999) ;
      usevariables = ghq01-ghq12;
      categorical = ghq01-ghq12;
      idvariable = id;
    ANALYSIS:
      estimator = WLSMV;
    MODEL:                  ! (this differs from EFA)
      F1 by ghq02* ghq05 ghq06 ghq09 ghq10 ghq11 ghq12;
                            ! (1st item freely estimated)
      F1@1;                 ! (fix factor variance to 1)
      F2 by ghq01* ghq03 ghq04 ghq07 ghq08 ghq12;
      F2@1;
      F1 with F2;           ! (factors are correlated)
    OUTPUT:
      Sampstat Res ;

TESTS OF MODEL FIT

    Chi-Square Test of Model Fit
      Value                         561.922*
      Degrees of Freedom                 32**
      P-Value                        0.0000

* The chi-square value for MLM, MLMV, MLR, ULSMV, WLSM and WLSMV cannot be used for chi-square difference tests.
  MLM, MLR and WLSM chi-square difference testing is described in the Mplus Technical Appendices at www.statmodel.com. See chi-square difference testing in the index of the Mplus User's Guide.
** The degrees of freedom for MLMV, ULSMV and WLSMV are estimated according to a formula given in the Mplus Technical Appendices at www.statmodel.com. See degrees of freedom in the index of the Mplus User's Guide.

    Chi-Square Test of Model Fit for the Baseline Model
      Value                        9961.631
      Degrees of Freedom                 13
      P-Value                        0.0000

    CFI/TLI                                   (min = 0.90, good 0.95+)
      CFI                             0.947
      TLI                             0.978

    Number of Free Parameters            50

    RMSEA (Root Mean Square Error Of Approximation)
      Estimate                        0.122   (< 0.06)

    WRMR (Weighted Root Mean Square Residual)
      Value                           2.067   (< 1.0)

Results for CFA_1 (Politi model)

    MODEL RESULTS
                                                   Two-Tailed
                  Estimate     S.E.   Est./S.E.      P-Value
    F1 BY
      GHQ02          0.679    0.018      37.113        0.000
      GHQ05          0.745    0.015      48.738        0.000
      GHQ06          0.816    0.013      61.601        0.000
      GHQ09          0.884    0.009      93.445        0.000
      GHQ10          0.886    0.009      97.383        0.000
      GHQ11          0.845    0.012      68.721        0.000
      GHQ12          0.327    0.043       7.516        0.000
    F2 BY
      GHQ01          0.779    0.015      51.337        0.000
      GHQ03          0.638    0.021      30.899        0.000
      GHQ04          0.737    0.017      44.364        0.000
      GHQ07          0.760    0.015      49.694        0.000
      GHQ08          0.793    0.016      49.767        0.000
      GHQ12          0.516    0.043      11.967        0.000
    F1 WITH
      F2             0.825    0.012      69.584        0.000

Could try specifying the model without GHQ12 on F1 (same as the Andrich pos / neg model).

Results for CFA_2 Positive / Negative (Andrich, 1989)

    THE MODEL ESTIMATION TERMINATED NORMALLY

    TESTS OF MODEL FIT

    Chi-Square Test of Model Fit
      Value                         605.457*
      Degrees of Freedom                 33**
      P-Value                        0.0000

    Chi-Square Test of Model Fit for the Baseline Model
      Value                        9961.631
      Degrees of Freedom                 13
      P-Value                        0.0000

    CFI/TLI
      CFI                             0.942
      TLI                             0.977

    Number of Free Parameters            49

    RMSEA (Root Mean Square Error Of Approximation)
      Estimate                        0.125

    WRMR (Weighted Root Mean Square Residual)
      Value                           2.149   (this is still high)

Removing item 12 from F1 has not made much difference to model fit; if anything, the fit indices are slightly worse.

Results for CFA_2 (Pos / Neg model)

    MODEL RESULTS
                                                   Two-Tailed
                  Estimate     S.E.   Est./S.E.      P-Value
    F1 BY
      GHQ02          0.678    0.018      37.110        0.000
      GHQ05          0.744    0.015      48.716        0.000
      GHQ06          0.816    0.013      61.614        0.000
      GHQ09          0.885    0.009      93.569        0.000
      GHQ10          0.886    0.009      97.448        0.000
      GHQ11          0.845    0.012      68.663        0.000
    F2 BY
      GHQ01          0.764    0.015      50.150        0.000
      GHQ03          0.627    0.021      30.457        0.000
      GHQ04          0.725    0.017      43.228        0.000
      GHQ07          0.745    0.015      48.927        0.000
      GHQ08          0.776    0.015      50.181        0.000
      GHQ12          0.843    0.012      68.810        0.000

The factor structure looks good, though.

Mplus syntax for adding Modification Indices

    VARIABLE:
      Names are
         id ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
         ghq07 ghq08 ghq09 ghq10 ghq11 ghq12;
      Missing are all (-9999) ;
      usevariables = ghq01-ghq12;
      categorical = ghq01-ghq12;
      idvariable = id;
    ANALYSIS:
      estimator = WLSMV;
    MODEL:
      F1 by ghq02* ghq05 ghq06 ghq09 ghq10 ghq11 ghq12;
      F1@1;
      F2 by ghq01* ghq03 ghq04 ghq07 ghq08 ghq12;
      F2@1;
      F1 with F2;
      ghq02 with ghq05;     ! (this is needed to report modind for items)
    OUTPUT:
      Sampstat Res Mod (10) ;
      ! Specify cut-off: 3.84 = significant (chi-square, 1 df, p = 0.05)

Output for Modification Indices

    Minimum M.I. value for printing the modification index    10.000

                               M.I.    E.P.C.   Std E.P.C.   StdYX E.P.C.
    BY Statements
    F1    BY GHQ03           27.431    -0.299       -0.299         -0.299
    F1    BY GHQ04           20.801    -0.262       -0.262         -0.262
    F1    BY GHQ08           33.145     0.316        0.316          0.316
    F2    BY GHQ06           23.818     0.243        0.243          0.243

    WITH Statements
    GHQ03 WITH GHQ02         10.770    -0.093       -0.093         -0.161
    GHQ04 WITH GHQ03        133.201     0.231        0.231          0.444
    GHQ05 WITH GHQ01         30.583     0.108        0.108          0.252
    GHQ05 WITH GHQ03         19.834    -0.118       -0.118         -0.224
    GHQ06 WITH GHQ05         13.681     0.070        0.070          0.178
    GHQ07 WITH GHQ01         24.563     0.103        0.103          0.254
    GHQ07 WITH GHQ05         10.709     0.069        0.069          0.156
    GHQ08 WITH GHQ01         24.293    -0.107       -0.107         -0.280
    GHQ08 WITH GHQ06         24.203     0.089        0.089          0.254
    GHQ08 WITH GHQ07         11.508    -0.072       -0.072         -0.183
    GHQ09 WITH GHQ02         20.234     0.094        0.094          0.270
    GHQ09 WITH GHQ04         16.759    -0.101       -0.101         -0.321
    GHQ10 WITH GHQ05         16.467    -0.087       -0.087         -0.274
    GHQ10 WITH GHQ06         20.693    -0.087       -0.087         -0.326
    GHQ11 WITH GHQ01         15.510    -0.093       -0.093         -0.280
    GHQ11 WITH GHQ02         20.555    -0.127       -0.127         -0.317
    GHQ11 WITH GHQ05         20.511    -0.112       -0.112         -0.306
    GHQ11 WITH GHQ07         15.030    -0.098       -0.098         -0.284
    GHQ11 WITH GHQ09         14.038    -0.076       -0.076         -0.305
    GHQ11 WITH GHQ10        143.250     0.189        0.189          0.767
    GHQ12 WITH GHQ04         13.509    -0.086       -0.086         -0.216
    GHQ12 WITH GHQ09         15.710     0.071        0.071          0.257

Mplus syntax for adding residual correlations

    VARIABLE:
      Names are
         id ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
         ghq07 ghq08 ghq09 ghq10 ghq11 ghq12;
      Missing are all (-9999) ;
      usevariables = ghq01-ghq12;
      categorical = ghq01-ghq12;
      idvariable = id;
    ! The commented-out lines below are useful if you need to recode
    ! DEFINE:
    ! IF (ghq01 EQ 3) THEN ghq01=2;
    ! CUT ghq03(0 2 );
    ANALYSIS:
      estimator = WLSMV;
    MODEL:
      F1 by ghq02* ghq05 ghq06 ghq09 ghq10 ghq11 ghq12;
      F1@1;
      F2 by ghq01* ghq03 ghq04 ghq07 ghq08 ghq12;
      F2@1;
      F1 with F2;
      ghq03 with ghq04;
      ghq10 with ghq11;
    OUTPUT:
      Sampstat Res Mod (10) ;

    THE MODEL ESTIMATION TERMINATED NORMALLY

    TESTS OF MODEL FIT

    Chi-Square Test of Model Fit
      Value                         318.808*
      Degrees of Freedom                 32**
      P-Value                        0.0000

    Chi-Square Test of Model Fit for the Baseline Model
      Value                        9961.631
      Degrees of Freedom                 13
      P-Value                        0.0000

    CFI/TLI
      CFI                             0.971
      TLI                             0.988

    Number of Free Parameters            52

    RMSEA (Root Mean Square Error Of Approximation)
      Estimate                        0.089

    WRMR (Weighted Root Mean Square Residual)
      Value                           1.489

By adding residual correlations, WRMR has been reduced.

Graphs

    OUTPUT: Sampstat Res Mod (10) ;
    Plot: type=plot3;       ! Add plot command after output

• Use the menu bar to select graph
• Select type of plot
• Under variable selection, scroll down to find F1, F2 etc

[Figure: Distribution of factors]

[Figure: Scatterplot of factors]

Saving Factor Scores

    VARIABLE:
      Names are
         id ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
         ghq07 ghq08 ghq09 ghq10 ghq11 ghq12;
      Missing are all (-9999) ;
      USEVARIABLES are ghq01-ghq12;
      CATEGORICAL ARE ghq01-ghq12;
      IDVAR is id;          ! need to specify id
    <SNIP> ……………………………
    OUTPUT:
      sampstat res mod (10);
    PLOT:
      type=plot3;
    SAVEDATA:
      SAVE=FSCORES;
      FILE=E:\GHQ12score.DAT;

Saving Factor Scores

    SAVEDATA INFORMATION

    Order and format of variables

      GHQ01    F10.3
      GHQ02    F10.3
      GHQ03    F10.3
      GHQ04    F10.3
      GHQ05    F10.3
      GHQ06    F10.3
      GHQ07    F10.3
      GHQ08    F10.3
      GHQ09    F10.3
      GHQ10    F10.3
      GHQ11    F10.3
      GHQ12    F10.3
      ID       I5
      F1       F10.3
      F2       F10.3

    Save file
      E:\GHQ12score.DAT

    Save file format
      12F10.3 I5 2F10.3

F1 and F2 are the factor scores / latent trait scores for each individual.

This data file can be imported into SPSS / Excel etc using the text import wizard and merged back into your data file.

If you need other variables in the saved file that are not specified in your model, use the AUXILIARY option in the VARIABLE command, e.g.

    AUXILIARY = gender educ;

CFA_3 - 2F Schmitz et al model

This model is included to show that CFA is not always straightforward.

    ANALYSIS:
      ESTIMATOR = WLSMV;
    MODEL:
      ! F1 - Anxiety Depression (Schmitz et al)
      F1 by ghq01* ghq02 ghq03 ghq06 ghq07 ghq10 ghq11;
      F1@1;
      ! F2 - Social Performance
      F2 by ghq03* ghq04 ghq05 ghq08 ghq09 ghq12;
      F2@1;
      F1 with F2;
    OUTPUT:
      sampstat res mod (10) tech1;
    PLOT:
      type=plot3;

Identification Problems

    WARNING: THE RESIDUAL COVARIANCE MATRIX (THETA) IS NOT POSITIVE DEFINITE.
    THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR AN OBSERVED
    VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO OBSERVED
    VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO OBSERVED VARIABLES.
    CHECK THE RESULTS SECTION FOR MORE INFORMATION.
    PROBLEM INVOLVING VARIABLE GHQ03.

    THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE
    COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL.
    PROBLEM INVOLVING PARAMETER 40.

    THE CONDITION NUMBER IS 0.893D-16.

    FACTOR SCORES WILL NOT BE COMPUTED DUE TO NONCONVERGENCE OR NONIDENTIFIED
    MODEL.

Use Tech1 to establish what parameter 40 is:

    LAMBDA
                  F1       F2
     GHQ01        37        0
     GHQ02        38        0
     GHQ03        39       40
     GHQ04         0       41
     GHQ05         0       42
     GHQ06        43        0
     GHQ07        44        0
     GHQ08         0       45
     GHQ09         0       46
     GHQ10        47        0
     GHQ11        48        0
     GHQ12         0       49

Lambda is the matrix of factor loadings. There is a problem with the loading of GHQ03 on the second factor.

The problem could also be identified from the output:

    MODEL RESULTS
                   Estimate
    F1 BY
      GHQ01           0.729
      GHQ02           0.660
      GHQ03       -7545.363
      GHQ06           0.796
      GHQ07           0.714
      GHQ10           0.868
      GHQ11           0.828
    F2 BY
      GHQ03        7545.965
      GHQ04           0.692
      GHQ05           0.721
      GHQ08           0.732
      GHQ09           0.859
      GHQ12           0.795
    F1 WITH
      F2              1.000

This problem stems from the fact that the third item is loading on both factors. This does not always lead to problems (see other models fitted here), but in this case it has done.

It does not appear possible to replicate this model using the current dataset. If the aim was not replication, but merely to test one’s own theories, then removing the loading from one of the factors would solve the problem. Depending on your theories on the underlying mechanism, this may or may not be desirable.
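As a sketch of the remedy described for the CFA_3 model (not a replication of Schmitz et al), the GHQ03 loading could be dropped from one of the two factors. Keeping GHQ03 on F2 rather than F1 is an assumption made purely for illustration; the data file and variable names are those used in the earlier examples.

```
TITLE:    CFA_3 variant with GHQ03 on F2 only (illustrative)
DATA:     FILE IS E:\mplus\egoghq12r1.dta.dat;
VARIABLE: NAMES ARE id ghq01-ghq12;
          MISSING ARE ALL (-9999);
          USEVARIABLES = ghq01-ghq12;
          CATEGORICAL = ghq01-ghq12;
          IDVARIABLE = id;
ANALYSIS: ESTIMATOR = WLSMV;
MODEL:    ! Assumption: GHQ03 is removed from F1 and kept on F2
          F1 BY ghq01* ghq02 ghq06 ghq07 ghq10 ghq11;
          F1@1;             ! fix factor variance to 1
          F2 BY ghq03* ghq04 ghq05 ghq08 ghq09 ghq12;
          F2@1;
          F1 WITH F2;       ! factors are correlated
OUTPUT:   SAMPSTAT RESIDUAL MODINDICES (10) TECH1;
```

Whether this version converges and fits acceptably would have to be checked in the output (warnings, TECH1 parameter numbering and the fit indices); no results are implied here.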