Principal Components
Analysis with SAS
Karl L. Wuensch
Dept of Psychology
East Carolina University
When to Use PCA
• You have a set of p continuous variables.
• You want to repackage their variance into
m components.
• You will usually want m to be < p, but not
always.
Components and Variables
• Each component is a weighted linear
combination of the variables
$C_i = W_{i1}X_1 + W_{i2}X_2 + \cdots + W_{ip}X_p$
• Each variable is a weighted linear
combination of the components.
$X_j = A_{1j}C_1 + A_{2j}C_2 + \cdots + A_{mj}C_m$
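The two equations above can be checked numerically. What follows is a minimal sketch, assuming NumPy and a small made-up data set (the data, the seed, and the variable names are hypothetical, not taken from the slides). It builds standardized components as weighted sums of the standardized variables, then rebuilds every variable from all p components.

```python
import numpy as np

# Hypothetical data: 200 cases on 4 correlated variables (not from the slides).
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 4))
raw[:, 1] += raw[:, 0]            # induce some correlation
raw[:, 3] += 0.5 * raw[:, 2]

# Standardize each variable (mean 0, sample variance 1).
Z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Eigen-decomposition of the correlation matrix, largest eigenvalue first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

W = eigvecs / np.sqrt(eigvals)    # column i holds the weights W_i1..W_ip for component i
A = eigvecs * np.sqrt(eigvals)    # column i holds the loadings A_i1..A_ip for component i

C = Z @ W                         # component scores: weighted sums of the variables
Z_rebuilt = C @ A.T               # variables: weighted sums of the components

print(np.allclose(Z_rebuilt, Z))                             # variables recovered?
print(np.allclose(np.corrcoef(C, rowvar=False), np.eye(4)))  # components uncorrelated?
```

With m < p components the second equation holds only approximately, which is the sense in which the retained components summarize the variables.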
Factors and Variables
• In Factor Analysis, we exclude from the
solution any variance that is unique, not
shared by the variables.
$X_j = A_{1j}F_1 + A_{2j}F_2 + \cdots + A_{mj}F_m + U_j$
• $U_j$ is the unique variance for $X_j$.
Goals of PCA and FA
• Data reduction.
• Discover and summarize pattern of
intercorrelations among variables.
• Test theory about the latent variables
underlying a set of measurement variables.
• Construct a test instrument.
• There are many other uses of PCA and
FA.
Data Reduction
• Ossenkopp and Mazmanian (Physiology
and Behavior, 34: 935-941).
• 19 behavioral and physiological variables.
• A single criterion variable, physiological
response to four hours of cold-restraint.
• Extracted five factors.
• Used multiple regression to develop a
model for predicting the criterion from
the five factors.
Exploratory Factor Analysis
• Want to discover the pattern of
intercorrelations among variables.
• Wilt et al., 2005 (thesis).
• Variables are items on the SOIS at ECU.
• Found two factors, one evaluative, one on
difficulty of course.
• Compared face-to-face (FTF) students to
distance education (DE) students on
structure and means.
Confirmatory Factor Analysis
• Have a theory regarding the factor
structure for a set of variables.
• Want to confirm that the theory describes
the observed intercorrelations well.
• Thurstone: Intelligence consists of seven
independent factors rather than one global
factor.
Construct Test Instrument
• Write a large set of items designed to test
the constructs of interest.
• Administer the survey to a sample of
persons from the target population.
• Use FA to help select those items that will
be used to measure each of the constructs
of interest.
• Use Cronbach’s alpha to check reliability
of resulting scales.
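The reliability check in the last bullet can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `cronbach_alpha` and the item responses are hypothetical, made up for the example. It uses the usual formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the total score).

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, one per item, each holding every respondent's score."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # each respondent's scale total
    item_var = sum(variance(col) for col in items)     # sum of the k item variances
    return k / (k - 1) * (1 - item_var / variance(totals))

# Hypothetical responses: 3 items answered by 6 respondents (made-up numbers).
scale = [
    [4, 5, 3, 2, 4, 1],
    [5, 5, 3, 1, 4, 2],
    [4, 4, 2, 2, 5, 1],
]
alpha = cronbach_alpha(scale)
print(round(alpha, 3))
```

Items that all track the same construct push alpha toward 1; an item that does not belong on the scale pulls it down.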
An Unusual Use of PCA
• Poulson, Braithwaite, Brondino, and Wuensch
(1997, Journal of Social Behavior and
Personality, 12, 743-758).
• Simulated jury trial, seemingly insane
defendant killed a man.
• Criterion variable = recommended verdict
– Guilty
– Guilty But Mentally Ill
– Not Guilty By Reason of Insanity.
• Predictor variables = jurors’ scores on 8
scales.
• Discriminant function analysis.
• Problem with multicollinearity.
• Used PCA to extract eight orthogonal
components.
• Predicted recommended verdict from
these 8 components.
• Transformed results back to the original
scales.
A Simple, Contrived Example
• Consumers rate importance of seven
characteristics of beer.
– low Cost
– high Size of bottle
– high Alcohol content
– Reputation of brand
– Color
– Aroma
– Taste
PCA-Beer.sas
• Download PCA-Beer.sas from
http://core.ecu.edu/psyc/wuenschk/SAS/SAS-Programs.htm.
• Bring it into SAS.
• Run the program. Look at the output.
Checking for Unique Variables 1
• Check the correlation matrix (page 1 of
output).
• If there are any variables not well
correlated with some others, might as well
delete them.
• Or add more variables expected to be
correlated with them.
• Can still include deleted variables in post-PCA analysis.
Checking for Unique Variables 2
Correlation Matrix
             cost     size  alcohol  reputat    color    aroma    taste
cost         1.00     .832     .767    -.406     .018    -.046    -.064
size         .832     1.00     .904    -.392     .179     .098     .026
alcohol      .767     .904     1.00    -.463     .072     .044     .012
reputat     -.406    -.392    -.463     1.00    -.372    -.443    -.443
color        .018     .179     .072    -.372     1.00     .909     .903
aroma       -.046     .098     .044    -.443     .909     1.00     .870
taste       -.064     .026     .012    -.443     .903     .870     1.00
Checking for Unique Variables 3
• For each variable, check $R^2$ between it
and the remaining variables. You will see
these when we cover factor analysis.
• Look at partial correlations – variables with
large partial correlations share variance
with one another but not with the
remaining variables – this is problematic.
• See page 2 of the output.
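Both checks on this slide can be computed from the inverse of the correlation matrix. The sketch below assumes NumPy and uses the beer correlations retyped from the Correlation Matrix slide (rounded to three decimals, so values may differ slightly from the SAS output). The variable names `smc` and `partial` are illustrative only.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

P = np.linalg.inv(R)
d = np.diag(P)

# R^2 between each variable and the remaining six (squared multiple correlation).
smc = 1 - 1 / d

# Partial correlation for each pair, controlling for the other five variables.
partial = -P / np.sqrt(np.outer(d, d))
np.fill_diagonal(partial, 1.0)

for name, r2 in zip(names, smc):
    print(f"{name:8s} R^2 = {r2:.3f}")
print(np.round(partial, 2))
```

A variable with a low R^2 shares little variance with the rest; a pair with a large partial correlation shares variance mostly with each other.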
Checking for Unique Variables 4
• Kaiser’s MSA will tell you, for each
variable, how much of this problem exists.
• The smaller the MSA, the greater the
problem.
• An MSA of .9 is marvelous, .5 miserable.
• See page 2 of the output.
• Typically we would have more than seven
variables, and MSA would likely be
larger.
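Kaiser's measure of sampling adequacy compares squared correlations with squared partial correlations. The sketch below assumes NumPy, the usual MSA formula (sum of squared correlations divided by that sum plus the sum of squared partial correlations, diagonal excluded), and the beer correlations retyped from the Correlation Matrix slide; the names `msa` and `overall` are illustrative.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

# Partial correlations, controlling for all other variables.
P = np.linalg.inv(R)
d = np.diag(P)
partial = -P / np.sqrt(np.outer(d, d))

off = ~np.eye(len(R), dtype=bool)             # mask selecting off-diagonal cells
r2 = np.where(off, R ** 2, 0.0)               # squared correlations
q2 = np.where(off, partial ** 2, 0.0)         # squared partial correlations

msa = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))   # one value per variable
overall = r2.sum() / (r2.sum() + q2.sum())                  # overall MSA

for name, value in zip(names, msa):
    print(f"{name:8s} MSA = {value:.3f}")
print(f"overall  MSA = {overall:.3f}")
```

Large partial correlations inflate the denominator, so the MSA shrinks as the problem described above grows.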
Extracting Principal Components 1
• From p variables we can extract p components.
• Each of p eigenvalues represents the amount of
standardized variance that has been captured
by one component.
• The first component accounts for the largest
possible amount of variance.
• The second captures as much as possible of
what is left over, and so on.
• Each is orthogonal to the others.
Extracting Principal Components 2
• Each variable has standardized variance =
1.
• The total standardized variance in the p
variables = p.
• The sum of the m = p eigenvalues = p.
• All of the variance is extracted.
• For each component, the proportion of
variance extracted = eigenvalue / p.
Extracting Principal Components 3
• For our beer data, here are the
eigenvalues and proportions of variance
for the seven components:
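These quantities can be approximated from the correlation matrix itself. A minimal sketch, assuming NumPy, with the matrix retyped from the Correlation Matrix slide (rounded to three decimals, so results may differ slightly from the SAS output):

```python
import numpy as np

R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigvals = np.linalg.eigvalsh(R)[::-1]     # eigvalsh returns ascending order; reverse it
proportion = eigvals / len(R)             # eigenvalue / p
cumulative = np.cumsum(proportion)

print("comp  eigenvalue  proportion  cumulative")
for i, (ev, pr, cu) in enumerate(zip(eigvals, proportion, cumulative), start=1):
    print(f"{i:4d}  {ev:10.3f}  {pr:10.3f}  {cu:10.3f}")
```

The eigenvalues sum to p = 7 because each standardized variable contributes a variance of 1.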
How Many Components to Retain
• From p variables we can extract p
components.
• We probably want fewer than p.
• Simple rule: Keep as many as have
eigenvalues ≥ 1.
• A component with eigenvalue < 1 captured
less than one variable’s worth of variance.
• Visual Aid: Use a Scree Plot
• Scree is rubble at base of cliff.
• See page 3 of the output.
Scree Plot
[Figure: scree plot of eigenvalue (vertical axis, 0.0 to 3.5) against component number (horizontal axis, 1 to 7).]
• Only the first two components have
eigenvalues greater than 1.
• Big drop in eigenvalue between
component 2 and component 3.
• Components 3-7 are scree.
• By default, SAS will retain all components
with eigenvalues of 1 or more.
• Should also look at a solution with one
fewer component and one with one more
component.
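The retention rule and the scree inspection can be sketched in a few lines. This assumes NumPy and the beer correlations retyped from the Correlation Matrix slide; it illustrates the rule itself and is not a reproduction of what SAS does internally.

```python
import numpy as np

R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigvals = np.linalg.eigvalsh(R)[::-1]         # largest eigenvalue first
n_retained = int((eigvals >= 1).sum())        # eigenvalue-of-1 rule
drops = eigvals[:-1] - eigvals[1:]            # step sizes down the scree plot

print("components with eigenvalue >= 1:", n_retained)
print("largest drop follows component", int(np.argmax(drops)) + 1)
```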
Loadings, Unrotated and Rotated
• Loading matrix = factor pattern matrix =
component matrix.
• Each loading is the Pearson r between one
variable and one component.
• Since the components are orthogonal, each
loading is also a β weight from predicting X from
the components.
• Here are the unrotated loadings for our 2
component solution:
Factor Pattern Matrix
Pre-Rotation Loadings
• All variables load well on first component,
economy and quality vs. reputation.
• Second component is more interesting,
economy versus quality.
• See page 4 of the output.
• See the preplot on page 5 of output.
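The unrotated loadings can be approximated by scaling each retained eigenvector by the square root of its eigenvalue. The sketch assumes NumPy and the beer correlations retyped from the Correlation Matrix slide. Eigenvector signs are arbitrary, so the orientation chosen in the code is an assumption made for readability and may differ from the SAS output.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:2]                  # the two largest eigenvalues
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# Assumed orientation: component 1 has a positive loading sum,
# component 2 has a positive loading for cost.
if loadings[:, 0].sum() < 0:
    loadings[:, 0] *= -1
if loadings[0, 1] < 0:
    loadings[:, 1] *= -1

for name, row in zip(names, loadings):
    print(f"{name:8s} {row[0]:7.3f} {row[1]:7.3f}")
```

Under this orientation, reputation takes the opposite sign from the other six variables on the first component, and the second component separates the cost, size, and alcohol cluster from the color, aroma, and taste cluster.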
Rotate the Axes
• Rotate these axes so that the two
dimensions pass more nearly through the
two major clusters (COST, SIZE, ALCH
and COLOR, AROMA, TASTE).
• The number of degrees by which I rotate
the axes is the angle PSI. For these data,
rotating the axes -40.63 degrees has the
desired effect.
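Rotating the axes by a fixed angle amounts to multiplying the loading matrix by a 2 × 2 rotation matrix. The sketch assumes NumPy, the beer correlations retyped from the Correlation Matrix slide, and a particular sign convention for the unrotated components (stated in the comments). With a different sign convention the same angle would point the axes elsewhere, so treat the orientation as an assumption.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# Assumed orientation: component 1 has a positive loading sum,
# component 2 has a positive loading for cost.
if loadings[:, 0].sum() < 0:
    loadings[:, 0] *= -1
if loadings[0, 1] < 0:
    loadings[:, 1] *= -1

psi = np.radians(-40.63)                 # the angle PSI quoted on the slide
c, s = np.cos(psi), np.sin(psi)
T = np.array([[c, -s],
              [s,  c]])                  # coordinates of each point on axes rotated by psi
rotated = loadings @ T

for name, row in zip(names, rotated):
    print(f"{name:8s} {row[0]:7.3f} {row[1]:7.3f}")
```

Rotation changes how the variance is divided between the two components but leaves each variable's summed squared loadings unchanged.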
Loadings After Rotation
Components After Rotation
• Component 1 = Quality versus reputation.
• Component 2 = Economy (or cheap drunk)
versus reputation.
• Page 6 of output.
• See the postplot on page 7 of the output.
Number of Components in the
Rotated Solution
• Try extracting one fewer component, try one
more component.
• Which produces the more sensible solution?
• Error = difference between obtained structure and true
structure.
• Overextraction (too many components)
produces less error than underextraction.
• If there is only one true factor and no unique
variables, can get “factor splitting.”
• In this case, first unrotated factor ≈ true
factor.
• But rotation splits the factor, producing an
imaginary second factor and corrupting the
first.
• Can avoid this problem by including a
garbage variable that will be removed prior
to the final solution.
Explained Variance
• Square the loadings and then sum them across
variables.
• Get, for each component, the amount of
variance explained.
• Prior to rotation, these are eigenvalues.
• Our SAS output shows the SSL for each
component on page 6, just below the rotated
factor pattern.
• After rotation the two components together
account for (3.02 + 2.91) / 7 = 85% of the total
variance. If the last component has a small SSL,
one should consider dropping it.
• If SSL = 1, the component has extracted one
variable’s worth of variance.
• If only one variable loads well on a component,
the component is not well defined.
• If only two load well, it may be reliable if the two
variables are highly correlated with one another
but not with other variables.
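The arithmetic on this slide can be written out explicitly. Assuming the loading notation used earlier, where $A_{ij}$ is the loading of variable $j$ on component $i$, the sum of squared loadings (SSL) and the proportion of variance for the two rotated components are:

```latex
\mathrm{SSL}_i = \sum_{j=1}^{p} A_{ij}^{2},
\qquad
\frac{\mathrm{SSL}_1 + \mathrm{SSL}_2}{p}
  = \frac{3.02 + 2.91}{7}
  = \frac{5.93}{7}
  \approx 0.847 \approx 85\%.
```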
Naming Components
• For each component, look at how it is
correlated with the variables.
• Try to name the construct represented by
that factor.
• If you cannot, perhaps you should try a
different solution.
• I have named our components “aesthetic
quality” and “cheap drunk.”
Communalities
• For each variable, sum the squared
loadings across components.
• This gives you the $R^2$ for predicting the
variable from the components,
• which is the proportion of the variable’s
variance which has been extracted by the
components.
• See page 4 of the output.
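Communalities are row sums of squared loadings. The sketch below assumes NumPy and the beer correlations retyped from the Correlation Matrix slide; the names `h2_two` and `h2_all` are illustrative. It compares the two-component solution with the full seven-component solution, where every communality is 1 because all of the variance is extracted.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
all_loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

h2_two = (all_loadings[:, :2] ** 2).sum(axis=1)   # two retained components
h2_all = (all_loadings ** 2).sum(axis=1)          # all seven components

for name, h2 in zip(names, h2_two):
    print(f"{name:8s} communality = {h2:.3f}")
```

Because squaring removes signs and an orthogonal rotation preserves row lengths, the communalities are the same before and after rotation.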
Orthogonal Rotations
• Varimax -- minimize the complexity of the
components by making the large loadings
larger and the small loadings smaller
within each component.
• Quartimax -- makes large loadings larger
and small loadings smaller within each
variable.
• Equamax -- a compromise between these
two.
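To make the varimax idea concrete, here is a minimal sketch of the widely used SVD-based iteration for raw varimax (no Kaiser normalization). It assumes NumPy and the beer correlations retyped from the Correlation Matrix slide; the function names `varimax` and `criterion` are illustrative. SAS may apply row normalization by default, so its rotated loadings can differ slightly from what this code prints.

```python
import numpy as np

R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

def criterion(B):
    """Varimax criterion: variance of squared loadings within each component, summed."""
    return (B ** 2).var(axis=0).sum()

def varimax(L, max_iter=100, tol=1e-8):
    """Raw varimax rotation; returns the rotated loadings and the rotation matrix."""
    p, k = L.shape
    T = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        B = L @ T
        G = L.T @ (B ** 3 - B * (B ** 2).sum(axis=0) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                      # nearest orthogonal matrix to the gradient
        d_new = s.sum()
        if d_new < d * (1 + tol):       # stop when the objective no longer improves
            break
        d = d_new
    return L @ T, T

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

rotated, T = varimax(loadings)
print(np.round(rotated, 3))
print("criterion before:", round(criterion(loadings), 4),
      "after:", round(criterion(rotated), 4))
```

The criterion rises after rotation because each component ends up with a few large loadings and many near-zero ones, which is the simplification the Varimax bullet describes.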
Oblique Rotations
• Axes drawn through the two clusters in the
upper right quadrant would not be
perpendicular.
• May better fit the data with axes that are
not perpendicular, but at the cost of having
components that are correlated with one
another.
• More on this later.