Factor Analysis

The objective of this handout is to introduce students to the factor analysis tool in SPSS.

Factor analysis is a method of data reduction. It is used to find underlying factors among observed variables. In other words, if your data contain many variables, you can use factor analysis to reduce their number. Factor analysis groups variables with similar characteristics together, producing a small number of factors that can explain most of the observed variance in the larger set of variables. The reduced factors can also be used for further analysis: factor analysis can be performed first to reduce the original number of variables, which may make interpretation easier, especially if meaningful factors have been selected.

For example, suppose that a bank asked a large number of questions about a given branch. Consider how the following characteristics might be more parsimoniously represented by just a few constructs (factors). Try to reduce 10 items to 3 factors, using factor analysis to confirm.

Steps involved in Factor Analysis

1. First, a correlation matrix is generated for all the variables. A correlation matrix is a square, symmetric array of the correlation coefficients of the variables with each other.
2. Second, factors are extracted from the correlation matrix based on the correlation coefficients of the variables.
3. Third, the factors are rotated in order to maximize the relationship between the variables and some of the factors.

Task 1: Running the Factor Analysis Procedure

From the menu bar select Analyze, choose Data Reduction, and then click on Factor. Highlight the related variables and move them to the Variables list. Then select the options described below and run the procedure.

Figure: The Factor Analysis dialogue box

Click on the DESCRIPTIVES button and its dialogue box will be loaded on the screen.
Within this dialogue box, select the following check boxes: Univariate descriptives, Coefficients, Determinant, KMO and Bartlett's test of sphericity, and Reproduced. Click on Continue to return to the Factor Analysis dialogue box. The Factor Analysis: Descriptives dialogue box should be completed as shown below.

From the Factor Analysis dialogue box click on the EXTRACTION button and its dialogue box will be loaded on the screen. Select the check box for Scree plot. Click on Continue to return to the Factor Analysis dialogue box. The Factor Analysis: Extraction dialogue box should be completed as shown below.

Figure: The Factor Analysis: Extraction dialogue box

From the Factor Analysis dialogue box click on the ROTATION button and its dialogue box will be loaded on the screen. Click on the radio button next to Varimax to select it. Click on Continue to return to the Factor Analysis dialogue box. The Factor Analysis: Rotation dialogue box should be completed as shown below.

Figure: The Factor Analysis: Rotation dialogue box

From the Factor Analysis dialogue box click on the OPTIONS button and its dialogue box will be loaded on the screen. Click on the check box labelled "Suppress absolute values less than" to select it, and type 0.50 in the text box. The Factor Analysis: Options dialogue box should be completed as shown below.

Figure: The Factor Analysis: Options dialogue box

Click on Continue to return to the Factor Analysis dialogue box, then click on OK to run the procedure. You will now be able to see the results of the factor analysis and interpret them.

Let’s work through an example to understand factor analysis. The following figure shows the SPSS data used in this analysis.

After carrying out the factor analysis procedure described above, you will get the following output. Let us look at the first output table. This table shows the factors that were extracted.
If you look at the section labelled “Rotation Sums of Squared Loadings,” it shows only those factors that met your cut-off criterion (set by the extraction method). In this case, there were three factors with eigenvalues greater than 1. SPSS always extracts as many factors initially as there are variables in the dataset, but the rest of them did not meet the criterion. The “% of variance” column tells you how much of the total variability (in all of the variables together) can be accounted for by each of these summary scales or factors. Factor 1 accounts for 34.88% of the variability in all 13 variables, Factor 2 accounts for 19.33%, and Factor 3 accounts for 15.6%. Altogether, these three factors account for 69.8% of the total variance.

Let us look at the second table. The Rotated Component Matrix shows the factor loadings for each variable. I went across each row and highlighted the factor that each variable loaded most strongly on. Based on these factor loadings, we know exactly which items belong to Factors 1, 2, and 3.

Factor 1            Factor 2              Factor 3
Expensive           Appeals to Others     Reliable
Exciting            Attractive Looking    Latest Features
Luxury              Trend Setting         Trust
Distinctive
Not Conservative
Not Family
Not Basic

Now we can give each factor a name that reflects what its items have in common. We call Factor 1 EXCLUSIVE, Factor 2 TRENDY, and Factor 3 RELIABLE.

Assigning Factor Scores

There are two ways to calculate the values of the three factors (Exclusive, Trendy, and Reliable).

Method 1: Assign equal weights to all variables
Method 2: Assign different weights to the variables based on how strongly each one loads on its factor

I will explain both methods here.

Method 1: Assign equal weights to all variables

In this method, we simply average the values of the items that belong to each factor, subtracting the items that load negatively, as shown below.
EXCLUSIVE = (Expensive + Exciting + Luxury + Distinctive - Conservative - Family - Basic) / 7
TRENDY = (Appeals to Others + Attractive Looking + Trend Setting) / 3
RELIABLE = (Reliable + Latest Features + Trust) / 3

The following figure shows how to compute these variables in SPSS.

Method 2: Calculate Factor Scores with Unequal Weights (Recommended)

In this case we calculate the value of each of the three factors as a weighted average of its items, where each item's weight is its squared factor loading, as given below.

Exclusive = [(.73)² × Expense + (.715)² × Exciting + (.626)² × Luxury + (.755)² × Distinctive - (.881)² × Conserv - (.797)² × Family - (.914)² × Basic] / [(.73)² + (.715)² + (.626)² + (.755)² + (.881)² + (.797)² + (.914)²]

Trendy = [(.88)² × Appeal + (.776)² × Attract + (.616)² × Trend] / [(.88)² + (.776)² + (.616)²]

Reliable = [(.662)² × Reliable + (.811)² × Latest + (.716)² × Trust] / [(.662)² + (.811)² + (.716)²]

We will enter the above formulae in the SPSS Compute Variable dialogue to calculate the values of the three factors we identified.

Now we can calculate descriptive statistics for the three new factors. There is no need to use the original 13 items anymore; we have fewer variables to analyze (mission accomplished). Look at the following descriptive statistics output.

Now we are able to compare five different cars based on the three factors Exclusive, Trendy, and Reliable.

Means of the three factors for each vehicle

Vehicle     Exclusive    Trendy    Reliable
Beetle        1.4         6.7       6.9
Hummer        3.9         6.2       6.7
Lotus         4.1         7.3       6.7
Minivan      -1.67        4.83      6.5
Pick-Up      -0.43        4.93      6.3

We can see that in terms of reliability all five cars are rated about equally (means between 6.3 and 6.9). However, on the Exclusive and Trendy factors the cars are perceived differently. Look at the following graph.

Figure: Vehicle by Component (Exclusive and Trendy scores for Pick-Up, Minivan, Lotus, Hummer, and Beetle)
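For readers who want to check the two scoring methods outside SPSS, the calculation can be sketched in Python. This is only an illustration, not part of the SPSS procedure: the loadings are the ones quoted in the Method 2 formulae, the function names are made up for this sketch, and the example ratings are hypothetical values rather than data from the handout.

```python
# Factor loadings as quoted in the Method 2 formulae. A negative sign marks
# an item that is subtracted in the score (Conservative, Family, Basic).
LOADINGS = {
    "Exclusive": {
        "Expensive": 0.73, "Exciting": 0.715, "Luxury": 0.626,
        "Distinctive": 0.755, "Conservative": -0.881,
        "Family": -0.797, "Basic": -0.914,
    },
    "Trendy": {
        "Appeals to Others": 0.88, "Attractive Looking": 0.776,
        "Trend Setting": 0.616,
    },
    "Reliable": {
        "Reliable": 0.662, "Latest Features": 0.811, "Trust": 0.716,
    },
}


def equal_weight_score(ratings, loadings):
    """Method 1: add positively loaded items, subtract negatively loaded
    ones, and divide by the number of items."""
    signed = [ratings[item] if w > 0 else -ratings[item]
              for item, w in loadings.items()]
    return sum(signed) / len(signed)


def loading_weight_score(ratings, loadings):
    """Method 2: weight each item by its squared loading, keep the sign of
    the loading, and divide by the sum of the squared loadings."""
    numerator = sum((1 if w > 0 else -1) * w ** 2 * ratings[item]
                    for item, w in loadings.items())
    denominator = sum(w ** 2 for w in loadings.values())
    return numerator / denominator


# Hypothetical ratings from one respondent (invented for illustration only).
example_ratings = {
    "Expensive": 6, "Exciting": 7, "Luxury": 5, "Distinctive": 6,
    "Conservative": 2, "Family": 3, "Basic": 2,
    "Appeals to Others": 7, "Attractive Looking": 5, "Trend Setting": 3,
    "Reliable": 6, "Latest Features": 7, "Trust": 6,
}

for factor, loadings in LOADINGS.items():
    m1 = equal_weight_score(example_ratings, loadings)
    m2 = loading_weight_score(example_ratings, loadings)
    print(f"{factor}: equal weights = {m1:.2f}, loading weights = {m2:.2f}")
```

When all items of a factor have the same sign and the same rating, the two methods return the same score; they differ only when items with stronger loadings are rated differently from items with weaker ones.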