"A User-Friendly Demonstration of Principal Components Analysis as a Data Reduction Method"
R. Michael Haynes, PhD, Assistant Vice President, Student Life Studies, Tarleton State University
Keith Lamb, MBA, Associate Vice President, Student Affairs, Midwestern State University

What is Principal Components Analysis (PCA)?
A member of the general linear model (GLM) family, in which all analyses are correlational
A term often used interchangeably with "factor analysis," although there are slight differences between the two
A method of reducing large data sets into more manageable "factors" or "components"
A method of identifying the most useful variables in a dataset
A method of identifying and classifying variables by the common themes, or constructs, they represent

Before we get started, a GLOSSARY of terms we'll be using today:
Bartlett's Test of Sphericity; communality coefficients; construct; correlation matrix; Cronbach's alpha coefficient; effect sizes (variance accounted for); eigenvalues; extraction; factor or component; Kaiser criterion for retaining factors; Kaiser-Meyer-Olkin Measure of Sampling Adequacy; latent; reliability; rotation; scree plot; split-half reliability; structure coefficients

Desired outcomes from today's session: understand
The terminology associated with principal components analysis (PCA)
When using PCA is appropriate
How to conduct PCA using SPSS 17.0
How to interpret a correlation matrix
How to interpret a communality matrix
How to interpret a components matrix, and the methods used to determine how many components to retain
How to analyze a component to determine which variables to include and why
The concept of reliability and why it is important in survey research

LET'S GET STARTED!

When is using PCA appropriate?
When your data are interval or ratio level
When you have at least 5 observations per variable and at least 100 observations in total (e.g., 20 variables x 5 = 100 observations)
When trying to reduce the number of variables to be used in another GLM technique (e.g., regression, MANOVA)
When attempting to identify latent constructs that are being measured by observed variables in the absence of a priori theory

HEURISTIC DATA
Responses to the Developing Purpose Inventory (DPI) collected at a large, metropolitan university between 2004 and 2006 (IRB approval received)
45 questions related to Chickering's developing purpose stage
Responses on a 5-point interval scale: 1 = "always true" to 5 = "never true"
Sample size = 998 participants
SUGGESTION: always visually inspect your data for missing cases and potential outliers (Wilkinson & Task Force on Statistical Inference, 1999); a short screening sketch follows the SPSS set-up steps below. There are multiple ways of dealing with missing data, but that's for another day!

SPSS 17.0
Make sure your set-up in "Variable View" is complete to accommodate your data: names, labels, possible values of the data, and type of measure
Analyze > Dimension Reduction > Factor
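The screening step suggested above can also be done programmatically before moving to SPSS. The following Python/pandas sketch is only an illustration, not part of the SPSS workflow; the file name dpi_responses.csv and the column names question1 through question45 are assumptions about how the DPI responses might be stored.

# Illustrative screening sketch (assumed file and column names).
import pandas as pd

df = pd.read_csv("dpi_responses.csv")
items = [f"question{i}" for i in range(1, 46)]

# Count missing responses per item.
missing_per_item = df[items].isna().sum()
print("Items with missing responses:")
print(missing_per_item[missing_per_item > 0])

# Flag responses outside the 1-5 scale (possible data-entry errors or outliers).
out_of_range = ((df[items] < 1) | (df[items] > 5)).sum()
print("Items with out-of-range values:")
print(out_of_range[out_of_range > 0])

# Cases with any missing item; listwise deletion (as in the FACTOR syntax below) drops these.
print("Cases with at least one missing item:", df[items].isna().any(axis=1).sum())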
SPSS 17.0 SYNTAX
Orange indicates sections specific to your analysis!

DATASET ACTIVATE DataSet1.
FACTOR
  /VARIABLES question1 question2 question3 question4 question5 question6 question7 question8 question9 question10 question11 question12 question13 question14 question15 question16 question17 question18 question19 question20 question21 question22 question23 question24 question25 question26 question27 question28 question29 question30 question31 question32 question33 question34 question35 question36 question37 question38 question39 question40 question41 question42 question43 question44 question45
  /MISSING LISTWISE
  /ANALYSIS question1 question2 question3 question4 question5 question6 question7 question8 question9 question10 question11 question12 question13 question14 question15 question16 question17 question18 question19 question20 question21 question22 question23 question24 question25 question26 question27 question28 question29 question30 question31 question32 question33 question34 question35 question36 question37 question38 question39 question40 question41 question42 question43 question44 question45
  /PRINT INITIAL CORRELATION SIG KMO EXTRACTION ROTATION FSCORE
  /FORMAT SORT BLANK(.000)
  /PLOT EIGEN
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PC
  /CRITERIA ITERATE(25)
  /ROTATION VARIMAX
  /SAVE AR(ALL)
  /METHOD=CORRELATION.

OUTPUT COMPONENTS
Correlation Matrix
Pearson r correlations between the individual variables. Values range from -1.0 to +1.0 (strong, modest, or weak; positive or negative). Correlations of 1.00 appear on the diagonal: every variable is "perfectly and positively" correlated with itself! This information is the basis for PCA; in other words, if you have only a correlation matrix, you can conduct PCA!

Correlation Matrix (first five items shown; columns are Questions 1-5)
Question 1 - ARI   1.000   .157   .077   .165   .069
Question 2 - VI     .157  1.000   .261   .109   .211
Question 3 - SL     .077   .261  1.000   .157   .017
Question 4 - ARI    .165   .109   .157  1.000   .098
Question 5 - VI     .069   .211   .017   .098  1.000

OUTPUT COMPONENTS
KMO Measure of Sampling Adequacy and Bartlett's Test of Sphericity
KMO values closer to 1.0 are better; Kaiser (1970 and 1975, as cited by Meyers, Gamst, & Guarino, 2006) considers a value of .70 adequate. For Bartlett's Test you want a statistically significant value, so that you can reject the null hypothesis of a lack of sufficient correlation between the variables.

Kaiser-Meyer-Olkin Measure of Sampling Adequacy = .861
Bartlett's Test of Sphericity: Approx. Chi-Square = 9193.879, df = 990, Sig. = .000

OUTPUT COMPONENTS
Communality Coefficients
The amount of variance in each variable accounted for by the components; higher coefficients = stronger variables, lower coefficients = weaker variables.

Communalities (first 15 items shown)   Initial   Extraction
Question 1 - ARI                        1.000      .560
Question 2 - VI                         1.000      .446
Question 3 - SL                         1.000      .773
Question 4 - ARI                        1.000      .519
Question 5 - VI                         1.000      .539
Question 6 - SL                         1.000      .439
Question 7 - ARI                        1.000      .605
Question 8 - VI                         1.000      .527
Question 9 - SL                         1.000      .537
Question 10 - ARI                       1.000      .775
Question 11 - VI                        1.000      .635
Question 12 - SL                        1.000      .476
Question 13 - ARI                       1.000      .542
Question 14 - VI                        1.000      .435
Question 15 - SL                        1.000      .426
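Because PCA works from the correlation matrix, the eigenvalues, variance percentages, and unrotated communalities can all be reproduced from R alone. A minimal NumPy sketch, using the 5 x 5 correlation submatrix shown above; this is not the SPSS FACTOR procedure, and the full DPI results come from the complete 45 x 45 matrix.

# Minimal sketch: PCA arithmetic from a correlation matrix alone (NumPy, not SPSS).
import numpy as np

# The 5 x 5 correlation submatrix shown above (Questions 1-5 of the DPI).
R = np.array([
    [1.000, .157, .077, .165, .069],
    [.157, 1.000, .261, .109, .211],
    [.077, .261, 1.000, .157, .017],
    [.165, .109, .157, 1.000, .098],
    [.069, .211, .017, .098, 1.000],
])

# Eigenvalues/eigenvectors of R; the eigenvalues are the components' variances.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]              # largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pct_variance = 100 * eigvals / eigvals.sum()   # % of variance per component
loadings = eigvecs * np.sqrt(eigvals)          # unrotated structure coefficients

# Communalities = row sums of squared loadings over the retained components
# (here, components with eigenvalue > 1, the Kaiser criterion discussed below).
retained = eigvals > 1
communalities = (loadings[:, retained] ** 2).sum(axis=1)

print("Eigenvalues:   ", np.round(eigvals, 3))
print("% of variance: ", np.round(pct_variance, 3))
print("Communalities: ", np.round(communalities, 3))

SPSS performs the same decomposition (plus rotation) on the full 45-variable matrix; the sketch is only meant to show where the numbers in the output come from.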
OUTPUT COMPONENTS
Total Variance Explained Table
Lists the individual components (remember, you have as many components as you have variables) by eigenvalue and variance accounted for. How do we determine how many components to retain?

Total Variance Explained (first 13 of 45 components shown)
           Initial Eigenvalues              Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component  Total   % of Var   Cum %         Total   % of Var   Cum %              Total   % of Var   Cum %
 1         7.216   16.035     16.035        7.216   16.035     16.035             3.666    8.147      8.147
 2         3.107    6.904     22.938        3.107    6.904     22.938             2.649    5.887     14.034
 3         2.455    5.456     28.395        2.455    5.456     28.395             2.597    5.771     19.806
 4         1.846    4.103     32.498        1.846    4.103     32.498             2.555    5.677     25.482
 5         1.690    3.755     36.253        1.690    3.755     36.253             2.243    4.984     30.466
 6         1.458    3.239     39.493        1.458    3.239     39.493             2.189    4.865     35.331
 7         1.307    2.906     42.398        1.307    2.906     42.398             1.746    3.880     39.212
 8         1.180    2.623     45.021        1.180    2.623     45.021             1.578    3.507     42.719
 9         1.107    2.461     47.482        1.107    2.461     47.482             1.555    3.455     46.174
10         1.064    2.364     49.846        1.064    2.364     49.846             1.314    2.919     49.093
11         1.024    2.275     52.121        1.024    2.275     52.121             1.221    2.712     51.805
12         1.014    2.253     54.374        1.014    2.253     54.374             1.156    2.569     54.374
13          .976    2.170     56.544

Common approaches for deciding how many components to retain:
Kaiser Criterion (K1 rule): retain only those components with an eigenvalue greater than 1; this can lead to retaining more components than necessary.
Pre-determined variance: retain as many components as will account for a pre-determined amount of variance, say 70%; this can lead to retention of components that are variable specific (Stevens, 2002).
Scree plot: plots eigenvalues on the Y axis and component number on the X axis; the recommendation is to retain all components in the steep descent before the point where the line levels off (Cattell, 1966; as cited by Stevens, 2002).
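The retention rules above can be expressed directly in terms of the eigenvalue column. A small sketch, assuming the first 13 eigenvalues from the Total Variance Explained table have been typed in by hand; it applies the Kaiser K1 rule and a cumulative-variance threshold (50% is used here only because the table above stops at 13 components; the slides mention 70% as another choice), and prints the drops between successive eigenvalues that a scree plot shows visually.

# Sketch of the retention rules, applied to the first 13 eigenvalues reported above.
import numpy as np

eigenvalues = np.array([7.216, 3.107, 2.455, 1.846, 1.690, 1.458, 1.307,
                        1.180, 1.107, 1.064, 1.024, 1.014, 0.976])
n_variables = 45                      # total variance equals the number of items

# Kaiser (K1) rule: retain components with eigenvalue > 1.
k1_retained = int((eigenvalues > 1).sum())

# Cumulative-variance rule: retain enough components to reach a preset threshold.
cum_pct = 100 * np.cumsum(eigenvalues) / n_variables
threshold = 50
var_retained = int(np.argmax(cum_pct >= threshold) + 1) if cum_pct[-1] >= threshold else None

# Scree: the size of the drop from one eigenvalue to the next.
drops = -np.diff(eigenvalues)

print("Kaiser K1 retains:", k1_retained, "components")
print("Cumulative % of variance:", np.round(cum_pct, 1))
print(f"{threshold}% of variance reached at component:", var_retained)
print("Eigenvalue drops (scree):", np.round(drops, 3))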
Other Retention Methods
Velicer's Minimum Average Partial (MAP) test
Seeks to determine which components are common; it does not look for a "cut-off" point but rather for a more "comprehensive" solution. Components that have a high number of highly correlated variables are retained. However, variable-based decisions can result in underestimating the number of components to retain (Ledesma & Valero-Mora, 2007).

Horn's Parallel Analysis (PA)
Compares observed eigenvalues with "simulated" eigenvalues from random data; retain all components with an eigenvalue greater than the mean of the corresponding simulated eigenvalues. Considered highly accurate and exempt from extraneous factors (Ledesma & Valero-Mora, 2007). A small simulation sketch follows the rotated component matrix below.

OUTPUT COMPONENTS
Component Matrix
Column values are structure coefficients, or the correlations between the test questions and the synthetic components. REMEMBER: squared structure coefficients tell us how well the item can reproduce the effect in the component!

Rotated Component Matrix (structure coefficients for components 1-12)
                        1      2      3      4      5      6      7      8      9     10     11     12
Question 42 - SL      .781  -.060   .000   .117   .034   .071   .055  -.062   .093  -.002   .032   .025
Question 39 - SL      .778  -.132   .107   .109   .008   .024  -.025   .018   .044  -.010   .022  -.025
Question 33 - SL      .765  -.042   .115   .098   .034   .090  -.035  -.035   .011   .013  -.012   .020
Question 9 - SL       .672  -.103   .127   .092   .050   .126   .005  -.119  -.002  -.063  -.034  -.114
Question 37 - ARI     .462  -.173   .193  -.103   .075   .197   .345  -.018   .024   .232   .009   .119
Question 15 - SL      .406  -.002   .340   .038   .050   .091   .120  -.007   .067  -.152  -.127  -.273
Question 36 - SL      .395  -.067   .212  -.104   .225   .125   .365  -.089   .110   .168  -.037   .221
Question 44 - VI      .375  -.033   .360   .128   .175   .091   .221  -.023   .177  -.035  -.027  -.001
Question 26 - VI     -.022   .660  -.113   .009   .021  -.063  -.096   .089   .044   .034  -.060   .174
Question 27 - SL     -.158   .652  -.088   .032   .069  -.091   .040   .193  -.032  -.150  -.019   .003
Question 38 - VI     -.058   .501  -.109  -.171   .032  -.276  -.051   .078  -.042   .255  -.016  -.097
Question 20 - VI     -.240   .489   .016   .076   .036  -.092  -.052   .434  -.102   .071  -.079   .056
Question 32 - VI     -.101   .488  -.134   .084  -.074  -.415  -.010   .046   .025  -.057  -.050   .020
Question 45 - SL     -.144   .443  -.049  -.097  -.105  -.026  -.097   .078  -.031   .057   .421  -.013
Question 29 - VI     -.006   .439   .154  -.114   .007   .231   .238  -.196   .145  -.098   .089  -.138
Question 41 - VI     -.019   .421  -.087  -.210   .006  -.107   .333  -.005   .125   .091   .300  -.082
Question 24 - SL      .129  -.067   .720   .101   .147   .119  -.003   .011   .005   .011  -.012   .203
Question 21 - SL      .125  -.164   .676  -.056   .161   .047   .160  -.044  -.012   .137  -.006   .029
Question 23 - VI      .313  -.164   .537   .286   .063   .007   .076  -.094   .119   .049   .123   .031
Question 17 - VI      .076  -.050   .459   .187   .040   .136   .314   .048   .120  -.212   .083  -.140
Question 30 - SL      .120   .114   .420   .287  -.081   .309  -.109  -.165   .061   .328  -.107   .161
Question 22 - ARI     .042   .075   .364   .045   .087  -.081  -.135  -.353   .324   .216   .016  -.188
Question 34 - ARI     .187   .042   .067   .791  -.002   .075  -.031  -.019   .012   .063  -.050  -.036
Question 1 - ARI     -.002  -.062   .082   .722   .055  -.018   .008  -.014   .039   .132   .015  -.075
Question 35 - VI      .113   .077   .015   .569   .030   .439  -.053  -.140   .067  -.089   .095   .105
Question 40 - ARI     .194  -.161   .176   .553   .033   .057  -.041   .016   .186  -.086   .216   .147
Question 10 - ARI     .029   .016   .144   .033   .860   .036   .010  -.074   .032   .063  -.010   .006
Question 3 - SL       .069  -.015   .197   .050   .848  -.029   .025  -.011   .067  -.026  -.003   .004
Question 12 - SL      .297   .069   .072   .000   .488   .137   .282   .024   .033   .091   .082   .158
Question 13 - ARI    -.046   .058  -.118   .045   .447  -.102   .321   .069   .128   .368  -.222  -.033
Question 11 - VI      .151  -.021   .024   .361   .115   .663   .000  -.006  -.124  -.028   .021   .104
Question 5 - VI       .154  -.134   .201   .042  -.057   .652   .020   .028  -.019   .124   .039  -.092
Question 8 - VI      -.090   .250  -.017   .010   .000  -.623  -.034   .115  -.105   .141   .120   .088
Question 18 - SL      .034   .003   .095  -.055   .092  -.039   .686  -.026   .015   .006  -.024   .036
Question 14 - VI      .241  -.157   .289  -.007   .132   .221   .418   .061  -.057  -.006   .122  -.080
Question 28 - ARI    -.232   .248   .051   .181  -.128  -.237   .357  -.112   .043   .074  -.144   .240
Question 16 - ARI    -.069   .213  -.008   .062  -.006  -.075   .033   .678  -.051  -.101  -.103   .023
Question 19 - ARI     .001   .054  -.042  -.241  -.033  -.010  -.112   .630   .147  -.010   .127   .036
Question 43 - ARI     .138  -.011   .067   .255   .017   .045  -.091   .086   .756   .024  -.074   .075
Question 31 - ARI     .062   .045   .069  -.048   .122  -.040   .186  -.053   .721   .140  -.077   .033
Question 4 - ARI      .023  -.057   .119   .100   .132   .007   .034  -.131   .184   .643   .020  -.088
Question 6 - SL      -.186   .177  -.039   .065  -.051  -.066   .087   .372  -.059   .390   .230  -.080
Question 7 - ARI      .024  -.059   .047   .149   .010   .005   .016  -.017  -.133   .008   .736   .126
Question 2 - VI       .234  -.198   .246   .175   .233   .094   .203   .086   .179  -.161   .254  -.162
Question 25 - ARI    -.048   .063   .119   .021   .073  -.049   .064   .085   .078  -.123   .108   .767
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
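As noted under Horn's Parallel Analysis above, here is a minimal simulation sketch in Python/NumPy. It generates eigenvalues from random normal data of the same dimensions as the DPI sample (998 cases, 45 items) and compares them with the observed eigenvalues; this illustrates the logic only and is not the program described by Ledesma and Valero-Mora (2007).

# Sketch of Horn's parallel analysis: compare observed eigenvalues with the mean
# eigenvalues of random data having the same number of cases and variables.
import numpy as np

def parallel_analysis(observed_eigenvalues, n_cases, n_vars, n_sims=100, seed=0):
    rng = np.random.default_rng(seed)
    simulated = np.empty((n_sims, n_vars))
    for s in range(n_sims):
        data = rng.standard_normal((n_cases, n_vars))
        corr = np.corrcoef(data, rowvar=False)          # correlation matrix of random data
        simulated[s] = np.sort(np.linalg.eigvalsh(corr))[::-1]
    mean_sim = simulated.mean(axis=0)
    # Retain leading components whose observed eigenvalue exceeds the mean simulated eigenvalue.
    keep = observed_eigenvalues > mean_sim[:len(observed_eigenvalues)]
    n_retain = len(keep) if keep.all() else int(np.argmin(keep))
    return mean_sim, n_retain

# First 13 observed eigenvalues from the Total Variance Explained table above.
observed = np.array([7.216, 3.107, 2.455, 1.846, 1.690, 1.458, 1.307,
                     1.180, 1.107, 1.064, 1.024, 1.014, 0.976])
mean_sim, n_retain = parallel_analysis(observed, n_cases=998, n_vars=45)
print("Mean simulated eigenvalues (first 13):", np.round(mean_sim[:13], 3))
print("Components retained by parallel analysis:", n_retain)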
OUTPUT COMPONENTS
Component Matrix
For heuristic purposes, we're retaining the first X components; what variables should we include in the components, and why?
Rule of thumb: include all items with structure coefficients with an absolute value of .300 or greater.
Stevens recommends a better way: test each structure coefficient for statistical significance against a critical value (CV) based on sample size.

Critical Values for a Correlation Coefficient at α = .01 for a Two-Tailed Test (Stevens, 2002, p. 394)
  n     CV        n     CV        n     CV
  50   .361      180   .192      400   .129
  80   .286      200   .182      600   .105
 100   .256      250   .163      800   .091
 140   .217      300   .149     1000   .081

For our sample size of 998, the tabled CV is |.081|; doubling it, as Stevens recommends when testing structure coefficients, gives a cut-off of |.162|.
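Whichever cut-off is chosen (|.300| as the rule of thumb, or |.162| from Stevens' critical-value approach for n = 998), sorting items into components is a simple thresholding exercise. A hedged sketch: loadings stands for the rotated component matrix as a NumPy array with items in rows and components in columns, which you would export or paste from SPSS yourself; the four-item example uses values from component 1 above.

# Sketch: list, for each retained component, the items whose rotated structure
# coefficients exceed a chosen absolute-value cut-off.
import numpy as np

def items_per_component(loadings, item_labels, cutoff=0.300):
    """loadings: rotated component matrix (n_items x n_components)."""
    assignment = {}
    for comp in range(loadings.shape[1]):
        hits = np.where(np.abs(loadings[:, comp]) >= cutoff)[0]
        assignment[comp + 1] = [item_labels[i] for i in hits]
    return assignment

labels = ["Question 42 - SL", "Question 39 - SL", "Question 33 - SL", "Question 9 - SL"]
loadings = np.array([[.781, -.060],
                     [.778, -.132],
                     [.765, -.042],
                     [.672, -.103]])
print(items_per_component(loadings, labels, cutoff=0.300))
# For the Stevens-style cut-off with n = 998, use cutoff=0.162 instead.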
Obtaining Continuous Component Values for Use in Further Analysis
Sum the interval values of the responses to all questions included in the retained component.
Or obtain mean values of the responses to all questions included in the retained component... hint... you'll get the same R, R², β, and structure coefficients as with the sums!
Or use SPSS to obtain factor scores for the component: choose the "Scores" button when setting up your PCA. Options include calculating scores based on regression, Bartlett, or Anderson-Rubin methodologies; be sure to check "Save as variables". The factor scores will appear in your data set and can be used as variables in other GLM analyses.
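The summing and averaging described above are easy to reproduce outside SPSS as well. A sketch in Python/pandas, assuming a data frame with the DPI items as columns and a hypothetical list of items retained for one component; the regression, Bartlett, and Anderson-Rubin factor scores would still come from the SPSS "Scores" dialog (or equivalent software).

# Sketch: unit-weighted component scores (sum or mean of the items loading on a component).
import pandas as pd

# Hypothetical: suppose these items were retained for Component 1.
component1_items = ["question42", "question39", "question33", "question9"]

df = pd.read_csv("dpi_responses.csv")                  # assumed file, as in the earlier sketch

df["comp1_sum"] = df[component1_items].sum(axis=1)     # summed responses
df["comp1_mean"] = df[component1_items].mean(axis=1)   # mean responses (same correlations
                                                       # with outside variables as the sum)

# Either column can now serve as a continuous variable in regression, MANOVA, etc.
print(df[["comp1_sum", "comp1_mean"]].describe())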
RELIABILITY
The extent to which scores on a test are consistent across multiple administrations of the test; the amount of measurement error in the scores yielded by a test (Gall, Gall, & Borg, 2003). Validity, by contrast, is about ensuring our tests really measure what we intended to measure: "You wouldn't administer an English literature test to assess math competency, would you?" Reliability can be measured several ways using SPSS 17.0.

A Visual Explanation of Reliability and Validity

RELIABILITY
Cronbach's Alpha Coefficient
RELIABILITY
  /VARIABLES=question1 question2 question3 question4 question5 question6 question7 question8 question9 question10 question11 question12 question13 question14 question15 question16 question17 question18 question19 question20 question21 question22 question23 question24 question25 question26 question27 question28 question29 question30 question31 question32 question33 question34 question35 question36 question37 question38 question39 question40 question41 question42 question43 question44 question45
  /SCALE('ALL VARIABLES') ALL
  /MODEL=ALPHA.

Split-Half Coefficient
RELIABILITY
  /VARIABLES=question1 question2 question3 question4 question5 question6 question7 question8 question9 question10 question11 question12 question13 question14 question15 question16 question17 question18 question19 question20 question21 question22 question23 question24 question25 question26 question27 question28 question29 question30 question31 question32 question33 question34 question35 question36 question37 question38 question39 question40 question41 question42 question43 question44 question45
  /SCALE('ALL VARIABLES') ALL
  /MODEL=SPLIT.

RELIABILITY
Cronbach's Alpha Coefficient
Reliability Statistics: Cronbach's Alpha = .749, N of Items = 45
Benchmarks for alpha:
• .9 and up = very good
• .8 to .9 = good
• .7 to .8 = acceptable
• below .7 = suspect
"...don't refer to the test as 'reliable'; rather, say that scores from this administration of the test yielded reliable results." (Kyle Roberts)

RELIABILITY
Split-Half Coefficient
Reliability Statistics
Cronbach's Alpha: Part 1 Value = .620 (23 items a); Part 2 Value = .623 (22 items b); Total N of Items = 45
Correlation Between Forms = .518
Spearman-Brown Coefficient: Equal Length = .683, Unequal Length = .683
Guttman Split-Half Coefficient = .683
a. The items are: Question 1 - ARI, Question 2 - VI, Question 3 - SL, Question 4 - ARI, Question 5 - VI, Question 6 - SL, Question 7 - ARI, Question 8 - VI, Question 9 - SL, Question 10 - ARI, Question 11 - VI, Question 12 - SL, Question 13 - ARI, Question 14 - VI, Question 15 - SL, Question 16 - ARI, Question 17 - VI, Question 18 - SL, Question 19 - ARI, Question 20 - VI, Question 21 - SL, Question 22 - ARI, Question 23 - VI.
b. The items are: Question 23 - VI, Question 24 - SL, Question 25 - ARI, Question 26 - VI, Question 27 - SL, Question 28 - ARI, Question 29 - VI, Question 30 - SL, Question 31 - ARI, Question 32 - VI, Question 33 - SL, Question 34 - ARI, Question 35 - VI, Question 36 - SL, Question 37 - ARI, Question 38 - VI, Question 39 - SL, Question 40 - ARI, Question 41 - VI, Question 42 - SL, Question 43 - ARI, Question 44 - VI, Question 45 - SL.
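For readers working outside SPSS, both reliability estimates above can be approximated with a few lines of code. A minimal sketch, assuming the same data frame of 45 DPI items as in the earlier sketches; the split mimics a first-half/second-half division of the items, so small differences from the SPSS output are possible (for example, in how the odd middle item is handled).

# Sketch: Cronbach's alpha and a split-half (Spearman-Brown corrected) coefficient.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def split_half(items: pd.DataFrame) -> float:
    half = items.shape[1] // 2
    first, second = items.iloc[:, :half].sum(axis=1), items.iloc[:, half:].sum(axis=1)
    r = first.corr(second)                  # correlation between the two half-scores
    return 2 * r / (1 + r)                  # Spearman-Brown correction to full length

df = pd.read_csv("dpi_responses.csv")       # assumed file name, as in earlier sketches
item_cols = [f"question{i}" for i in range(1, 46)]
complete = df[item_cols].dropna()           # listwise deletion, matching /MISSING LISTWISE
print("Cronbach's alpha:", round(cronbach_alpha(complete), 3))
print("Split-half (Spearman-Brown):", round(split_half(complete), 3))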
RELATED LINKS
http://faculty.chass.ncsu.edu/garson/PA765/factor.htm
http://www.uic.edu/classes/epsy/epsy546/Lecture%204%20--%20notes%20on%20PRINCIPAL%20COMPONENTS%20ANALYSIS%20AND%20FACTOR%20ANALYSIS1.pdf
http://www.ats.ucla.edu/stat/Spss/output/factor1.htm
http://www.statsoft.com/textbook/principal-components-factor-analysis/

REFERENCES
Gall, M. D., Gall, J. P., & Borg, W. R. (2003). Educational research: An introduction (7th ed.). Boston: Allyn and Bacon.
Ledesma, R. D., & Valero-Mora, P. (2007). Determining the number of factors to retain in EFA: An easy-to-use computer program for carrying out parallel analysis. Practical Assessment, Research & Evaluation, 12(2).
Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied multivariate research: Design and interpretation. Thousand Oaks, CA: Sage.
Stevens, J. P. (2002). Applied multivariate statistics for the social sciences (4th ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
University of California at Los Angeles Academic Technology Services (2009). Annotated SPSS output: Factor analysis. Retrieved January 11, 2010, from http://www.ats.ucla.edu/stat/Spss/output/factor1.htm
University of Illinois at Chicago (2009). Principal components analysis and factor analysis. Retrieved January 11, 2010, from http://www.uic.edu/classes/epsy/epsy546/Lecture%204%20--%20notes%20on%20PRINCIPAL%20COMPONENTS%20ANALYSIS%20AND%20FACTOR%20ANALYSIS1.pdf
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.