
Week 12
Factor analysis
READING
Text reading
Dancey and Reidy, Chapter 12 in the 3rd edition,
or Chapter 14 in the 4th edition
CHAPTER OVERVIEW
In social science, we are very often in a situation where we need to measure traits
through the use of questionnaires. Often, we need to aggregate items together in some
coherent way, since single questions (i.e. items) simply are not sufficient to really
capture the nature of the underlying trait (called the latent variable). In questionnaire
research in particular, we need tools that help us describe and define the range of
items that will relate meaningfully to our underlying latent trait. This chapter will look
at factor analysis, an especially valuable technique for identifying groups or clusters
of variables (items). In this chapter, we will:
1- give a conceptual understanding of factor analysis
2- show how to enter a dataset into SPSS and analyse it by factor analysis
3- show how to interpret the statistical output from factor analysis
KEY TERMS
1- Factor (latent variable)
2- Principal component analysis
3- Method of Least Squares (MLS)
4- Factor loadings
5- Communality
6- Structure matrix
7- Pattern matrix
8- Kaiser criterion
9- Scree plot
10- Rotation
11- Oblique rotation
12- Orthogonal rotation
KEY POINTS
1- Factor analysis is used to uncover the latent structure (dimensions) of a set of
variables (items).
2- Exploratory factor analysis (EFA) seeks to uncover the underlying structure
of a relatively large set of variables (items). The researcher’s a priori
assumption is that any indicator may be associated with any factor. This is
the most common form of factor analysis. There is no prior theory and one
uses factor loadings to intuit the factor structure of the dataset.
3- Factors or components are the dimensions (or latent variables) identified with
clusters of variables (items), as computed using factor analysis.
4- Principal components analysis (PCA) is the most commonly used extraction
method in factor analysis. PCA seeks a linear combination of variables (items)
such that the maximum variance is extracted from the variables. It then
removes this variance and seeks a second linear combination which explains
the maximum proportion of the remaining variance, and so on. This is
sometimes called the principal axis method (not to be confused with principal
axis factoring, which analyses only common variance) and results in
orthogonal (uncorrelated) factors. PCA analyses total (common and unique)
variance.
5- Factor analysis generates a table in which the rows are the observed raw
indicator variables (items) and the columns are the factors or latent variables
which explain as much of the variance in these variables as possible. The cells
in this table are factor loadings, and the meaning of the factors must be
induced from seeing which variables are most heavily loaded on which
factors. This inferential labelling process will naturally introduce a level of
interpretation or subjectivity as different researchers could employ different
labels. This is normal and expected.
6- The factor loadings are the correlation coefficients between the variables
(rows) and factors (columns). Analogous to Pearson's r, the squared factor
loading is the percent of variance in that variable explained by the factor.
7- To get the percent of variance in all the variables accounted for by each factor,
sum the squared factor loadings for that factor (column), divide by the
number of variables, and multiply by 100.
8- The structure matrix is simply the factor loading matrix as in orthogonal
rotation, representing the variance in a measured variable explained by a factor
on both a unique and common contributions basis.
9- The pattern matrix, in contrast, contains coefficients which just represent
unique contributions. For oblique rotation, the researcher looks at both the
structure and pattern coefficients when attributing a label to a factor.
10- The sum of the squared factor loadings for all factors for a given variable
(row) is the variance in that variable accounted for by all the factors, and this
is called the communality.
11- The ratio of the squared factor loadings for a given variable shows the relative
importance of the different factors in explaining the variance of the given
variable. Factor loadings are the basis for imputing a label to the different
factors.
12- Communality is the squared multiple correlation for the variable as dependent
using the factors as predictors. The communality measures the percent of
variance in a given variable explained by all the factors jointly and may be
interpreted as the reliability of the indicator.
13- The eigenvalue for a given factor measures the variance in all the variables
(items) which is accounted for by that factor.
14- The Kaiser criterion is a common rule of thumb: drop all factors with
eigenvalues under 1.0.
15- Scree plot: the Cattell scree test plots the components on the X axis and the
corresponding eigenvalues on the Y axis. As one moves to the right, toward
later components, the eigenvalues drop. When the drop ceases and the curve
makes an elbow toward less steep decline, Cattell's scree test says to drop all
further components after the one starting the elbow (i.e. some factors may
still have eigenvalues greater than 1 but be essentially meaningless).
16- Rotation serves to make the output more understandable and is usually
necessary to facilitate the interpretation of factors. Unrotated solutions are
hard to interpret because variables (items) tend to load on multiple factors.
17- Oblique rotations allow the factors to be correlated.
18- When an orthogonal rotation such as varimax is selected, no factor correlation
matrix is produced, because the correlation of any factor with another is zero.
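Several of the key points above (factor loadings, eigenvalues, percent of
variance, communality, the Kaiser criterion and orthogonal rotation) can be
checked numerically. The following is a minimal sketch only, written in Python
with NumPy rather than SPSS; that choice is an assumption of this example and
not part of the course materials. The data are simulated and all names (items,
loadings, rotation and so on) are hypothetical. It extracts principal components
from the correlation matrix and applies an arbitrary orthogonal rotation, not
varimax, purely to show that such a rotation leaves the communalities unchanged.

```python
import numpy as np

# Hypothetical data: 200 respondents answering 6 questionnaire items.
# Items 0-2 are driven by one simulated latent trait, items 3-5 by another.
rng = np.random.default_rng(0)
traits = rng.normal(size=(200, 2))
weights = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.0],
                    [0.0, 0.8], [0.0, 0.7], [0.0, 0.9]])
items = traits @ weights.T + 0.5 * rng.normal(size=(200, 6))

# PCA on the correlation matrix (equivalent to standardising every item).
R = np.corrcoef(items, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)  # eigh returns ascending order
order = np.argsort(eigenvalues)[::-1]          # re-sort largest first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Kaiser criterion: keep only components with an eigenvalue above 1.0.
n_keep = int((eigenvalues > 1.0).sum())

# Loadings: correlations between items (rows) and components (columns).
loadings = eigenvectors[:, :n_keep] * np.sqrt(eigenvalues[:n_keep])

# Column sums of squared loadings reproduce the eigenvalues (key point 13).
column_ss = (loadings ** 2).sum(axis=0)

# Percent of variance per component: column sum / number of items (key point 7).
percent_variance = 100 * column_ss / loadings.shape[0]

# Communality: row sums of squared loadings (key point 10).
communalities = (loadings ** 2).sum(axis=1)

# An arbitrary orthogonal rotation (NOT varimax): a random orthogonal matrix.
rotation, _ = np.linalg.qr(rng.normal(size=(n_keep, n_keep)))
rotated_loadings = loadings @ rotation
rotated_communalities = (rotated_loadings ** 2).sum(axis=1)

print("eigenvalues:", np.round(eigenvalues, 2))
print("components kept:", n_keep)
print("percent of variance:", np.round(percent_variance, 1))
print("communalities:", np.round(communalities, 2))
```

Plotting the printed eigenvalues against component number (1, 2, 3, ...) would
give the scree plot described in key point 15. In SPSS the same quantities
appear in the Total Variance Explained, Communalities and Component Matrix
tables.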
Further reading
1- Factor analysis
http://www.psych.cornell.edu/Darlington/factor.htm
2- Best practices in factor analysis: An academic article detailing some of the
debates around factor analysis. Not for beginners, and has views stated that are
not necessarily shared by others.
http://pareonline.net/pdf/v10n7.pdf
3- Factor analysis using SPSS
http://www.sussex.ac.uk/Users/andyf/factor.pdf
SPSS activities
1- Factor analysis
http://calcnet.mth.cmich.edu/org/spss/Clips/24FACT~1.mov
ACTIVE LEARNING AND OPPORTUNITIES
1- In order to name factors that have been extracted, researchers look at:
A- The rotated factor loadings
B- The unrotated factor loadings
C- The table of the eigenvalues
D- None of the above
2- A factor is thought of as an underlying latent variable:
A- That is influenced by observed variables
B- That is unexplained by unobserved variables
C- Along which individuals differ
D- Along which individuals are homogeneous
3- Factor analysis requires that variables:
A- Are not related to each other
B- Are related to each other
C- Have only a weak relationship with each other
D- Are measured in the same units
4- The decision on how many factors to keep is based on:
A- Statistical criteria
B- Theoretical criteria
C- Both A and B
D- Neither A nor B
5- The original unrotated matrix is usually rotated so that
A- The factors are more significant
B- The mathematical calculations are easier
C- Interpretation is easier
D- All of these
6- A scree plot is a number of:
A- Variables plotted against variance accounted for
B- Variables plotted against factor loadings
C- Factors plotted against correlation coefficients
D- None of the above
Final note: Factor analysis is an unusual method in that its most effective use hinges
upon making several clear decisions about the analysis as you employ it. That is, there
are often certain assumptions and interpretations about “what goes with what”. For
example, when items cross-load (i.e. load on more than one factor), there are several
steps one can take, and sometimes this may even involve eliminating items and
rerunning the analysis. This is often a normal and responsible action. Sometimes results
appear with an unclear focus simply because too much variance is inherent in the
analysis. A far clearer picture may emerge once irrelevant items are taken out.
In short, in many ‘real life’ situations, it helps researchers to seek assistance from
experienced people accustomed to using this procedure. This is simply responsible
behaviour, as it is pointless to advance knowledge claims based upon incomplete or
superficial analyses. Indeed, a casual observer may suspect that this method is
somehow ‘suspicious’, in that an experienced person seems to ‘steer’ or ‘shape’ the
results he or she wants. But such a perception would be quite incorrect. An experienced
researcher cannot create a variable using factor analysis, but he or she can use this
tool to effectively locate and define a latent trait that genuinely does contribute to the
variance.