PRINCIPAL COMPONENTS ANALYSIS AND FACTOR ANALYSIS

Principal components analysis and factor analysis are used to identify underlying constructs or factors
that explain the correlations among a set of items.
They are often used to summarize a large number of items with a smaller number of derived variables,
called factors.
BASIC DEFINITIONS
Communality – Denoted h2. It is the proportion of the variance of an item that is accounted for by the common
factors in a factor analysis.
The unique variance of an item is 1 − h2 = item-specific variance + item error variance (random error).
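As an illustration, the communality of a single item can be computed from its loadings on orthogonal common factors. The loading values below are hypothetical, chosen only to show the arithmetic; this is a minimal sketch, not output from a real analysis.

```python
# Hypothetical loadings of one item on two orthogonal common factors.
loadings = [0.7, 0.4]

# Communality h2: proportion of the item's variance explained by the
# common factors (sum of squared loadings when factors are orthogonal).
h2 = sum(l ** 2 for l in loadings)   # 0.49 + 0.16 = 0.65

# Unique variance: item-specific variance plus random error variance.
unique_variance = 1 - h2             # 0.35

print(f"h2 = {h2:.2f}, unique variance = {unique_variance:.2f}")
```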
Eigenvalue - The standardized variance associated with a particular factor. The sum of the eigenvalues cannot
exceed the number of items in the analysis, since each standardized item contributes a variance of 1 to the total.
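A quick numeric check of this property, using a small hypothetical correlation matrix (the values are illustrative only):

```python
import numpy as np

# Illustrative correlation matrix for three standardized items.
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])

# Eigenvalues of R, sorted from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Each standardized item contributes a variance of 1, so the eigenvalues
# sum to the number of items (the trace of R).
print(eigenvalues, eigenvalues.sum())
```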
Factor: A linear combination of items (in a regression sense, where the total test score is the dependent variable,
and the items are the independent variables).
Factor loading: For a given item and a given factor, the correlation between subjects' responses to that item
and their predicted scores from a regression equation treating the entire set of items as independent variables.
The factor loading expresses the correlation of the item with the factor.
The square of the factor loading gives the proportion of variance the item shares with the factor.
Factor Pattern Matrix - A matrix containing the coefficients or "loadings" used to express the items in terms of
the factors. This is the same as the "structure matrix" if the factors are orthogonal (uncorrelated).
Factor Structure Matrix - A matrix containing the correlations of the item with each of the factors. This is the
same as the pattern matrix if the factors are orthogonal, i.e., uncorrelated (principal components analysis).
Rotated factor solution – A factor solution in which the axes of the factor plot are rotated for the purpose of
uncovering a more meaningful pattern of item factor loadings.
Scree plot: A plot of the obtained eigenvalue for each factor.
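A scree plot is usually drawn graphically, but the idea can be sketched in text form: list each factor's eigenvalue in decreasing order and look for the "elbow" where the values level off. The correlation matrix below is hypothetical.

```python
import numpy as np

# Hypothetical 4x4 correlation matrix for four items.
R = np.array([[1.0, 0.5, 0.4, 0.3],
              [0.5, 1.0, 0.4, 0.3],
              [0.4, 0.4, 1.0, 0.3],
              [0.3, 0.3, 0.3, 1.0]])

# Eigenvalues in decreasing order, one per potential factor.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Text-mode "scree plot": bar length is proportional to the eigenvalue.
for i, ev in enumerate(eigenvalues, start=1):
    print(f"Factor {i}: {'#' * round(ev * 10)} ({ev:.2f})")
```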
CORRELATION MATRIX INFORMATION
Reproduced - The estimated correlation matrix from the factor solution. Residuals (differences between
observed and reproduced correlations) are also displayed.
Anti-image - The anti-image correlation matrix contains the negatives of the partial correlation coefficients, and
the anti-image covariance matrix contains the negatives of the partial covariances. Most of the off-diagonal
elements should be small in a good factor model.
KMO and Bartlett's test of sphericity - The Kaiser-Meyer-Olkin measure of sampling adequacy indicates the
extent to which the partial correlations among items are small. Bartlett's test of sphericity tests whether the
correlation matrix is an identity matrix, which would indicate that the factor model is inappropriate.
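Bartlett's statistic has a simple closed form: chi-square = -(n - 1 - (2p + 5)/6) * ln det(R), on p(p - 1)/2 degrees of freedom. A sketch, using a hypothetical correlation matrix and sample size; an identity matrix gives a statistic of exactly 0, consistent with the null hypothesis.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's test statistic for H0: the correlation matrix is an
    identity matrix. R is a p x p correlation matrix, n the sample size.
    Returns the chi-square statistic and its degrees of freedom."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical correlation matrix and sample size.
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
chi2, df = bartlett_sphericity(R, n=100)
print(f"chi2 = {chi2:.2f} on {df} df")   # large chi2 -> reject sphericity

# Sanity check: no correlations at all gives a statistic of 0.
chi2_id, _ = bartlett_sphericity(np.eye(3), n=100)
```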
EXTRACTION METHODS IN FACTOR ANALYSIS
Principal components refers to the principal components model, in which items are assumed to be exact linear
combinations of factors. The principal components method assumes that the components ("factors") are
uncorrelated. When all components are retained, the sum of each item's squared loadings over the components
equals 1, implying that each item has 0 unique variance.
The remaining factor extraction methods allow the variance of each item to be a function of both the item's
communality and a nonzero unique item variance. The following are methods of Common
Factor Analysis:
Principal axis factoring uses squared multiple correlations as initial estimates of the communalities. These
communalities are entered into the diagonal of the correlation matrix before factors are extracted from
this matrix.
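The procedure can be sketched in NumPy. The correlation matrix below is hypothetical and the function is a simplified iterated version, not any particular package's exact algorithm: squared multiple correlations seed the diagonal, factors are extracted by eigendecomposition of the reduced matrix, and the communality estimates are updated until they stabilize.

```python
import numpy as np

def principal_axis(R, n_factors=1, n_iter=50):
    """Simplified iterated principal axis factoring (illustrative sketch)."""
    # Initial communality estimates: squared multiple correlations,
    # SMC_i = 1 - 1 / (R^-1)_ii.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        # Replace the diagonal of R with the current communalities.
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, h2)
        # Extract the leading factors by eigendecomposition.
        vals, vecs = np.linalg.eigh(R_reduced)
        order = np.argsort(vals)[::-1][:n_factors]
        loadings = vecs[:, order] * np.sqrt(np.clip(vals[order], 0, None))
        # Updated communalities: row sums of squared loadings.
        h2_new = (loadings ** 2).sum(axis=1)
        converged = np.allclose(h2_new, h2, atol=1e-8)
        h2 = h2_new
        if converged:
            break
    return loadings, h2

# Hypothetical correlation matrix for three items.
R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])
L, h2 = principal_axis(R, n_factors=1)
print("loadings:", L.ravel(), "communalities:", h2)
```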
Maximum likelihood produces parameter estimates that are the most likely to have produced the observed
correlations, if the sample is from a multivariate normal population.
Minimum residual factor analysis extracts factors from the correlation matrix, ignoring the diagonal elements.
Alpha factoring treats the items as a sample from the universe of possible items. It selects factors with the intent
of maximizing coefficient alpha reliability.
Image factoring is based on the concept of an "image" of an item, based on the multiple regression of one item
(dependent variable) on all the other items (independent variables).
Unweighted least squares minimizes the squared differences between the observed and reproduced correlation
matrices.
Generalized least squares also minimizes the squared differences between observed and reproduced
correlations, weighting each correlation inversely by the uniqueness of the items involved.
EXTRACTION CRITERIA
You can either retain all factors whose eigenvalues exceed a specified value (e.g. > 1), or retain a specific
number of factors.
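The first option, with a cutoff of 1, is the Kaiser criterion: a factor is retained only if it explains more variance than a single standardized item. A sketch with a hypothetical correlation matrix:

```python
import numpy as np

# Hypothetical correlation matrix for three items.
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser criterion: keep factors whose eigenvalue exceeds 1, i.e. factors
# that explain more variance than one standardized item on its own.
n_retained = int((eigenvalues > 1.0).sum())
print(eigenvalues, "-> retain", n_retained, "factor(s)")
```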
FACTOR ROTATION METHODS
Varimax rotates the axes such that they remain at 90 degrees (perpendicular) to each other. Assumes
uncorrelated factors. Also referred to as "orthogonal" rotation.
Oblique rotation (Direct Oblimin) rotates the axes such that the angle between them can differ from 90
degrees. Allows factors to be correlated.
One can specify Delta to control the extent to which factors can be correlated among themselves. Delta
should be 0 or negative, with 0 yielding the most highly correlated factors and large negative numbers
yielding nearly orthogonal solutions.
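Varimax rotation can be sketched in NumPy using the standard SVD-based form of Kaiser's algorithm; the loading matrix below is hypothetical. Because the rotation is orthogonal, each item's communality (row sum of squared loadings) is unchanged by it.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation (SVD-based sketch of Kaiser's algorithm)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # SVD of the gradient of the varimax criterion.
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3 - (gamma / p) * rotated
                          @ np.diag((rotated ** 2).sum(axis=0))))
        rotation = u @ vt
        var_new = s.sum()
        if var_new < var * (1 + tol):   # criterion stopped improving
            break
        var = var_new
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings: four items, two factors.
L = np.array([[0.8, 0.3],
              [0.7, 0.4],
              [0.3, 0.8],
              [0.2, 0.7]])
L_rot, T = varimax(L)
print(L_rot)
```

After rotation each item tends to load strongly on one factor and weakly on the other, which is the more interpretable pattern rotation is meant to produce.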
FACTOR SCORES
A factor score coefficient matrix shows the coefficients by which items are multiplied to obtain factor scores.
Regression Scores - Regression factor scores have a mean of 0 and variance equal to the squared multiple
correlation between the estimated factor scores and the true factor values. They can be correlated even when
factors are assumed to be orthogonal. The sum of squared discrepancies between true and estimated factors over
individuals is minimized.
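The regression (Thurstone) method has a compact matrix form: the factor score coefficient matrix is W = R^-1 L and the scores are F = Z W, where Z holds standardized item responses and L the loadings. A sketch on simulated data, using first-principal-component loadings as a stand-in for L (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated responses: 200 subjects x 3 items, standardized column-wise.
Z = rng.standard_normal((200, 3))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)

R = np.corrcoef(Z, rowvar=False)

# Stand-in loadings: first principal component of R.
vals, vecs = np.linalg.eigh(R)
L = vecs[:, [-1]] * np.sqrt(vals[-1])

# Regression method: coefficient matrix W = R^-1 L, factor scores F = Z W.
W = np.linalg.solve(R, L)
F = Z @ W

# The scores have mean 0; their variance equals the squared multiple
# correlation, which is 1 here because a principal component is an exact
# linear combination of the items.
print(F.mean(), F.var())
```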
Bartlett Scores - Bartlett factor scores have a mean of 0. The sum of squares of the unique factors over the set
of items is minimized.
Anderson-Rubin Scores - Anderson-Rubin factor scores are a modification of Bartlett scores to ensure
orthogonality of the estimated factors. They have a mean of 0 and a standard deviation of 1.