Principal Components

Principal components analysis is a method of dimension reduction. Suppose that you have a dozen variables that are correlated. You might use principal components analysis to reduce your 12 measures to a few principal components. Unlike factor analysis, principal components analysis is not usually used to identify underlying latent variables.

Wednesday, 08 April 2015 11:13 AM

Principal components analysis is a technique that requires a large sample size. It is based on the correlation matrix of the variables involved, and correlations usually need a large sample size before they stabilize.

As a rule of thumb, a bare minimum of 10 observations per variable is necessary to avoid computational difficulties.

    Number of Cases   Prospects
    50                very poor
    100               poor
    200               fair
    300               good
    500               very good
    1000              excellent

Comrey and Lee (1992) A First Course In Factor Analysis

In this example we have included many options. You may not wish to use all of them; we have included them here to aid in the explanation of the analysis.

In this example we examine students' assessment of academic courses. We restrict attention to 12 variables.

    Item 13 INSTRUCTOR WELL PREPARED
    Item 14 INSTRUCTOR SCHOLARLY GRASP
    Item 15 INSTRUCTOR CONFIDENCE
    Item 16 INSTRUCTOR FOCUS LECTURES
    Item 17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES
    Item 18 INSTRUCTOR SENSITIVE TO STUDENTS
    Item 19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS
    Item 20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS
    Item 21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING
    Item 22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION
    Item 23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS
    Item 24 COMPARED TO OTHER COURSES THIS COURSE WAS

Each item is scored on a five-point Likert scale, where higher is better.
Analyze > Dimension Reduction > Factor

Select variables 13-24, that is “instructor well prepared” to “compared to other courses this course was”, by using the arrow button. Use the buttons at the side of the screen to set additional options.

Use the Descriptives button at the side of the screen to set the descriptive options, then employ the Continue button to return to the main Factor Analysis screen.

Use the Extraction button at the side of the screen to set the extraction options, then employ the Continue button to return to the main Factor Analysis screen. Select the appropriate method and the eigenvalue criterion, set at 1. It is essential to obtain a scree plot.

Select the OK button to proceed with the analysis, or Paste to preserve the syntax.

Syntax

    factor
      /variables item13 item14 item15 item16 item17 item18 item19 item20 item21 item22 item23 item24
      /print initial correlation det kmo repr extraction univariate
      /format blank(.30)
      /plot eigen
      /extraction pc
      /method = correlate.

After “/extraction” you can introduce a promax rotation

    /rotation promax(4)

Descriptive Statistics

The descriptive statistics table is output because we used the univariate option.

                                                      Mean   Std. Deviation   Analysis N
INSTRUC WELL PREPARED                                 4.46    .729            1365
INSTRUC SCHOLARLY GRASP                               4.53    .700            1365
INSTRUCTOR CONFIDENCE                                 4.45    .732            1365
INSTRUCTOR FOCUS LECTURES                             4.28    .829            1365
INSTRUCTOR USES CLEAR RELEVANT EXAMPLES               4.17    .895            1365
INSTRUCTOR SENSITIVE TO STUDENTS                      3.93   1.035            1365
INSTRUCTOR ALLOWS ME TO ASK QUESTIONS                 4.08    .964            1365
INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS    3.78    .909            1365
INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING            3.77    .984            1365
I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION    3.61   1.116            1365
COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS     3.81    .957            1365
COMPARED TO OTHER COURSES THIS COURSE WAS             3.67    .926            1365

Mean - These are the means of the variables used in the factor analysis. Are these appropriate for a Likert scale?

Std. Deviation - These are the standard deviations of the variables used in the factor analysis. Are these appropriate for a Likert scale?

Analysis N - This is the number of cases used in the factor analysis.
The correlation matrix table was included in the output because we included the correlation option. This table gives the correlations between the original variables (which were specified). Before conducting a principal components analysis, you want to check the correlations between the variables. If any of the correlations are too high (say above 0.9), you may need to remove one of the variables from the analysis, as the two variables seem to be measuring the same thing. An alternative would be to combine the variables in some way (perhaps by taking the average).

If the correlations are too low, say below 0.1, then one or more of the variables might load onto only one principal component (in other words, make its own principal component). This is not helpful, as the whole point of the analysis is to reduce the number of items (variables).

The correlation matrix is extremely large.
Correlation Matrix (a)

Rows and columns are items 13 to 24, in the same order.

                                                            13     14     15     16     17     18     19     20     21     22     23     24
13 INSTRUC WELL PREPARED                                 1.000   .661   .600   .566   .577   .409   .286   .304   .476   .333   .564   .454
14 INSTRUC SCHOLARLY GRASP                                .661  1.000   .635   .500   .552   .433   .320   .315   .449   .333   .565   .443
15 INSTRUCTOR CONFIDENCE                                  .600   .635  1.000   .505   .587   .457   .359   .356   .509   .369   .582   .435
16 INSTRUCTOR FOCUS LECTURES                              .566   .500   .505  1.000   .586   .405   .335   .317   .452   .363   .459   .430
17 INSTRUCTOR USES CLEAR RELEVANT EXAMPLES                .577   .552   .587   .586  1.000   .555   .449   .417   .595   .450   .613   .521
18 INSTRUCTOR SENSITIVE TO STUDENTS                       .409   .433   .457   .405   .555  1.000   .627   .521   .554   .536   .569   .474
19 INSTRUCTOR ALLOWS ME TO ASK QUESTIONS                  .286   .320   .359   .335   .449   .627  1.000   .446   .499   .484   .444   .374
20 INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS     .304   .315   .356   .317   .417   .521   .446  1.000   .425   .383   .410   .357
21 INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING             .476   .449   .509   .452   .595   .554   .499   .425  1.000   .507   .598   .500
22 I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION     .333   .333   .369   .363   .450   .536   .484   .383   .507  1.000   .493   .444
23 COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS      .564   .565   .582   .459   .613   .569   .444   .410   .598   .493  1.000   .705
24 COMPARED TO OTHER COURSES THIS COURSE WAS              .454   .443   .435   .430   .521   .474   .374   .357   .500   .444   .705  1.000

a. Determinant = .002
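The screening rule described earlier (look for correlations above about 0.9 or below about 0.1) can be sketched in a few lines. This is only an illustration in plain Python, not part of the SPSS procedure; it assumes the rows and columns of the printed matrix are items 13 to 24 in order, and the names R, pairs, too_high and too_low are made up for the example.

```python
# Correlations re-typed from the SPSS output; rows/columns assumed to be items 13..24.
R = [
    [1.000, .661, .600, .566, .577, .409, .286, .304, .476, .333, .564, .454],
    [.661, 1.000, .635, .500, .552, .433, .320, .315, .449, .333, .565, .443],
    [.600, .635, 1.000, .505, .587, .457, .359, .356, .509, .369, .582, .435],
    [.566, .500, .505, 1.000, .586, .405, .335, .317, .452, .363, .459, .430],
    [.577, .552, .587, .586, 1.000, .555, .449, .417, .595, .450, .613, .521],
    [.409, .433, .457, .405, .555, 1.000, .627, .521, .554, .536, .569, .474],
    [.286, .320, .359, .335, .449, .627, 1.000, .446, .499, .484, .444, .374],
    [.304, .315, .356, .317, .417, .521, .446, 1.000, .425, .383, .410, .357],
    [.476, .449, .509, .452, .595, .554, .499, .425, 1.000, .507, .598, .500],
    [.333, .333, .369, .363, .450, .536, .484, .383, .507, 1.000, .493, .444],
    [.564, .565, .582, .459, .613, .569, .444, .410, .598, .493, 1.000, .705],
    [.454, .443, .435, .430, .521, .474, .374, .357, .500, .444, .705, 1.000],
]
items = list(range(13, 25))
n = len(items)

# Every distinct pair of items (upper triangle, diagonal of ones excluded).
pairs = [(items[i], items[j], R[i][j]) for i in range(n) for j in range(i + 1, n)]

r_max = max(pairs, key=lambda p: p[2])  # most strongly correlated pair
r_min = min(pairs, key=lambda p: p[2])  # most weakly correlated pair

# Rule-of-thumb cutoffs taken from the text: above 0.9 is "too high", below 0.1 "too low".
too_high = [p for p in pairs if p[2] > 0.9]
too_low = [p for p in pairs if 0.1 > abs(p[2])]

print("largest: ", r_max)
print("smallest:", r_min)
print("pairs above 0.9:", too_high)
print("pairs below 0.1:", too_low)
```

On these data the largest correlation is .705 (items 23 and 24) and the smallest is .286 (items 13 and 19), so neither cutoff flags any pair.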
Kaiser-Meyer-Olkin Measure of Sampling Adequacy - This measure varies between 0 and 1, and values closer to 1 are better. A value of 0.6 is a suggested minimum.

KMO and Bartlett's Test

    Kaiser-Meyer-Olkin Measure of Sampling Adequacy              .934
    Bartlett's Test of Sphericity     Approx. Chi-Square     8676.712
                                      df                           66
                                      Sig.                       .000

Bartlett's Test of Sphericity - This tests the null hypothesis that the correlation matrix is an identity matrix. An identity matrix is a matrix in which all of the diagonal elements are 1 and all off-diagonal elements are 0. You want to reject this null hypothesis.

Taken together, these tests provide a minimum standard, which should be passed before a principal components analysis (or a factor analysis) is conducted. Here the KMO value of .934 is well above 0.6 and Bartlett's test is significant, so both conditions are met.

Communalities

Communalities - This is the proportion of each variable's variance that can be explained by the principal components (e.g. the underlying latent continua).

                                                      Initial   Extraction
INSTRUC WELL PREPARED                                 1.000     .731
INSTRUC SCHOLARLY GRASP                               1.000     .690
INSTRUCTOR CONFIDENCE                                 1.000     .652
INSTRUCTOR FOCUS LECTURES                             1.000     .549
INSTRUCTOR USES CLEAR RELEVANT EXAMPLES               1.000     .661
INSTRUCTOR SENSITIVE TO STUDENTS                      1.000     .704
INSTRUCTOR ALLOWS ME TO ASK QUESTIONS                 1.000     .658
INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS    1.000     .494
INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING            1.000     .601
I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION    1.000     .557
COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS     1.000     .673
COMPARED TO OTHER COURSES THIS COURSE WAS             1.000     .509

Extraction Method: Principal Component Analysis.

Initial - By definition, the initial value of the communality in a principal components analysis is 1.

Extraction - The values in this column indicate the proportion of each variable's variance that can be explained by the principal components.
Variables with high values are well represented in the common factor space, while variables with low values are not well represented. (In this example, we don't have any particularly low values.)

Component - There are as many components extracted during a principal components analysis as there are variables that are put into it. In our example, we used 12 variables (item13 through item24), so we have 12 components.

Total Variance Explained

             Initial Eigenvalues                        Extraction Sums of Squared Loadings
Component    Total   % of Variance   Cumulative %       Total   % of Variance   Cumulative %
 1           6.249   52.076           52.076            6.249   52.076          52.076
 2           1.229   10.246           62.322            1.229   10.246          62.322
 3            .719    5.992           68.313
 4            .613    5.109           73.423
 5            .561    4.676           78.099
 6            .503    4.192           82.291
 7            .471    3.927           86.218
 8            .389    3.240           89.458
 9            .368    3.066           92.524
10            .328    2.735           95.259
11            .317    2.645           97.904
12            .252    2.096          100.000

Extraction Method: Principal Component Analysis.

Initial eigenvalues - Eigenvalues are the variances of the principal components. Because we conducted our principal components analysis on the correlation matrix, the variables are standardized, which means that each variable has a variance of 1, and the total variance is equal to the number of variables used in the analysis, in this case, 12.
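The statement that the total variance equals the number of variables can be written as a one-line identity. The symbols here are illustrative and not taken from the slides: R is the p x p correlation matrix and the lambda_i are its eigenvalues.

```latex
\sum_{i=1}^{p} \lambda_i \;=\; \operatorname{tr}(R) \;=\; \underbrace{1 + 1 + \cdots + 1}_{p\ \text{ones on the diagonal}} \;=\; p \;=\; 12
```

Adding up the twelve rounded eigenvalues printed in the Total column gives 11.999, which agrees with 12 up to rounding.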
Initial eigenvalues - Total - This column contains the eigenvalues. The first component will always account for the most variance (and hence have the highest eigenvalue), and the next component will account for as much of the leftover variance as it can, and so on. Hence, each successive component will account for less and less variance.

Initial eigenvalues - % of Variance - This column contains the percent of variance accounted for by each principal component. For the first component this is 6.249/12 = 0.52, that is, about 52%.
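The % of Variance and Cumulative % columns can be reproduced from the eigenvalues alone. The sketch below is plain Python rather than SPSS, uses the eigenvalues as printed (rounded to three decimals), and its variable names are invented for the illustration, so the last decimal place may differ slightly from the SPSS table.

```python
# Eigenvalues as printed in the "Total" column (rounded to three decimals).
eigenvalues = [6.249, 1.229, 0.719, 0.613, 0.561, 0.503,
               0.471, 0.389, 0.368, 0.328, 0.317, 0.252]

# With standardized variables the total variance equals the number of variables.
n_variables = len(eigenvalues)

# Percent of total variance carried by each component.
percent = [100 * ev / n_variables for ev in eigenvalues]

# Running total of the percentages.
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

for k, (ev, p, c) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"{k:>2}  {ev:6.3f}  {p:7.3f}  {c:8.3f}")
```

The first two rows come out at roughly 52.08% and 62.32%, in line with the table.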
Initial eigenvalues - Cumulative % - This column contains the cumulative percentage of variance accounted for by the current and all preceding principal components. For example, the second row shows a value of 62.322. This means that the first two components together account for 62.322% of the total variance.

Extraction Sums of Squared Loadings - The three columns in this half of the table exactly reproduce the values given on the same row on the left side of the table. The number of rows reproduced on the right side of the table is determined by the number of principal components whose eigenvalues are 1 or greater.
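The retention rule behind the right-hand side of the table can be sketched as follows. This is a plain Python illustration, not SPSS code; MIN_EIGENVALUE and the other names are hypothetical, and the comparison is written as "strictly greater than" on the assumption that this matches the extraction setting. No eigenvalue here is exactly 1, so the choice between "greater than" and "1 or greater" does not change the result.

```python
# Eigenvalues as printed in the Total Variance Explained table.
eigenvalues = [6.249, 1.229, 0.719, 0.613, 0.561, 0.503,
               0.471, 0.389, 0.368, 0.328, 0.317, 0.252]

MIN_EIGENVALUE = 1.0  # cutoff chosen in the Extraction step (assumed name)

# Keep (component number, eigenvalue) for every component above the cutoff.
retained = [(k, ev) for k, ev in enumerate(eigenvalues, start=1)
            if ev > MIN_EIGENVALUE]

n_retained = len(retained)

# Share of the total variance (12 standardized variables) kept by those components.
variance_retained = 100 * sum(ev for _, ev in retained) / len(eigenvalues)

print("components retained:", [k for k, _ in retained])
print(f"variance retained: {variance_retained:.2f}%")
```

Two components pass the cutoff, and together they carry about 62.3% of the total variance, which is the last Cumulative % value on the extraction side of the table.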
The scree plot graphs the eigenvalue against the component number.

In general, we are interested in keeping only those principal components whose eigenvalues are greater than 1 (we set this value).

Component Matrix (a)

Component Matrix - This table contains component loadings, which are the correlations between the variable and the component. Because these are correlations, possible values range from -1 to +1. It is usual not to report any loadings smaller than .3 in absolute value; these appear as blank cells below.

                                                      Component
                                                      1        2
INSTRUC WELL PREPARED                                 .727     -.449
INSTRUC SCHOLARLY GRASP                               .724     -.408
INSTRUCTOR CONFIDENCE                                 .746     -.308
INSTRUCTOR FOCUS LECTURES                             .685
INSTRUCTOR USES CLEAR RELEVANT EXAMPLES               .806
INSTRUCTOR SENSITIVE TO STUDENTS                      .755      .366
INSTRUCTOR ALLOWS ME TO ASK QUESTIONS                 .641      .497
INSTRUCTOR IS ACCESSIBLE TO STUDENTS OUTSIDE CLASS    .593      .378
INSTRUCTOR AWARE OF STUDENTS UNDERSTANDING            .763
I AM SATISFIED WITH STUDENT PERFORMANCE EVALUATION    .651      .364
COMPARED TO OTHER INSTRUCTORS, THIS INSTRUCTOR IS     .819
COMPARED TO OTHER COURSES THIS COURSE WAS             .714

Extraction Method: Principal Component Analysis.
a. 2 components extracted.

Component - The columns under this heading are the principal components that have been extracted. As you can see by the footnote provided by SPSS, two components were extracted (the two components that had an eigenvalue greater than 1).
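The loadings tie back to the Extraction column of the Communalities table: for each variable, the extraction communality is the sum of its squared loadings on the retained components. The check below is a plain Python sketch with made-up names; it covers only the items for which both loadings are printed, since loadings below .30 in absolute value were blanked in the output and cannot be recovered from the slides.

```python
# Loadings on components 1 and 2 as printed in the Component Matrix.
# Items with a blanked (suppressed) second loading are left out.
loadings = {
    "item13": (0.727, -0.449),
    "item14": (0.724, -0.408),
    "item15": (0.746, -0.308),
    "item18": (0.755, 0.366),
    "item19": (0.641, 0.497),
    "item20": (0.593, 0.378),
    "item22": (0.651, 0.364),
}

# Extraction communalities as printed in the Communalities table.
reported = {
    "item13": 0.731, "item14": 0.690, "item15": 0.652, "item18": 0.704,
    "item19": 0.658, "item20": 0.494, "item22": 0.557,
}

# Communality = sum of squared loadings across the retained components.
communality = {item: sum(l * l for l in pair) for item, pair in loadings.items()}

for item in loadings:
    print(f"{item}: computed {communality[item]:.3f}, reported {reported[item]:.3f}")
```

Because the printed loadings are rounded, the computed values agree with the reported communalities only to within about .001.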
You usually do not try to interpret the components in the way that you would factors that have been extracted from a factor analysis. Rather, most people are interested in the component scores, which are used for dimension reduction (as opposed to factor analysis, where you are looking for underlying latent continua).

For a component plot, employ the Rotation option.

It's always wise to plot your results. Note the clusters.

The advantages in adopting Factor Analysis as opposed to Principal Components Analysis for component evaluation and/or instrumental variable estimation purposes are reported by Travaglini (2011).
Under Factor Analysis, the scores are in fact shown to produce more efficient slope estimators when utilized as regressors and/or instruments. Together with the factors they also exhibit a higher degree of consistency even for large sample dimensions. Finally, under Factor Analysis, dimension reduction is definitely more stringent, greatly facilitating the search and identification of the common components of the available dataset (Travaglini 2011).

Principal Components Analysis and Factor Analysis share the search for a common structure characterized by few common components, usually known as “scores”, that determine the observed variables contained in matrix X. However, the two methods differ on the characterization of the scores as well as on the technique adopted for selecting their true number. In Principal Components Analysis the scores are the orthogonalised principal components obtained through rotation, while in Factor Analysis the scores are latent variables determined by unobserved factors and loadings which involve idiosyncratic error terms. The dimension reduction of X implemented by each method produces a set of fewer homogeneous variables, the true scores, which contain most of the model's information.

For a detailed discussion and a brief numerical derivation see Velicer and Jackson (1990), who also give an extensive bibliography.

“Should one do a component analysis? The choice is not obvious, because the two broad classes of procedures serve a similar purpose, and share many important mathematical characteristics. Despite many textbooks describing common factor analysis as the preferred procedure, principal component analysis has been the most widely applied.”

Velicer, W.F. and Jackson, D.N. 1990 “Component Analysis Versus Common Factor Analysis: Some Issues In Selecting An Appropriate Procedure” Multivariate Behavioral Research 25(1) 1-28.

After some mathematics!
“An examination of the algebraic representations of the two methods of analysis has served to highlight the differences between them. However, when the same number of components or factors are extracted, the results from different types of component or factor analysis procedures typically yield highly similar results. Discrepancies are rarely, if ever, of any practical importance in subsequent interpretations.” (Velicer and Jackson 1990)

Summary

Principal Components is used to help understand the covariance structure in the original variables and/or to create a smaller number of variables using this structure.

Factor Analysis, like principal components, is used to summarise the data covariance structure in a smaller number of dimensions. The emphasis is the identification of underlying “factors” that might explain the dimensions associated with large data variability.

Similarities

Principal Components Analysis and Factor Analysis have these assumptions in common:

    Measurement scale is interval or ratio level.
    Random sample - at least 5 observations per observed variable and at least 100 observations. Larger sample sizes are recommended for more stable estimates, 10-20 observations per observed variable.
    Over sample to compensate for missing values.
    Linear relationship between observed variables.
    Normal distribution for each observed variable.
    Each pair of observed variables has a bivariate normal distribution.

Both are variable reduction techniques. If communalities are large, close to 1.00, results could be similar.

Principal Components Analysis assumes the absence of outliers in the data.
Factor Analysis assumes a multivariate normal distribution when using the Maximum Likelihood extraction method.

Differences

    Principal Component Analysis: Principal components retained account for a maximal amount of variance of observed variables.
    Exploratory Factor Analysis:  Factors account for common variance in the data.

    Principal Component Analysis: Analysis decomposes correlation matrix.
    Exploratory Factor Analysis:  Analysis decomposes adjusted correlation matrix.

    Principal Component Analysis: Ones on the diagonals of the correlation matrix.
    Exploratory Factor Analysis:  Diagonals of correlation matrix adjusted with unique factors.

    Principal Component Analysis: Minimizes sum of squared perpendicular distance to the component axis.
    Exploratory Factor Analysis:  Estimates factors which influence responses on observed variables.

    Principal Component Analysis: Component scores are a linear combination of the observed variables weighted by eigenvectors.
    Exploratory Factor Analysis:  Observed variables are linear combinations of the underlying and unique factors.

SPSS Tips

Now you should go and try for yourself.

Each week our cluster (5.05) is booked for 2 hours after this session. This will enable you to come and go as you please.

Obviously other timetabled sessions for this module take precedence.