Psychology 522/622
Winter 2008
Lab Lecture #10
Exploratory Factor Analysis
Last week we explored Principal Components Analysis. This week we explore
Exploratory Factor Analysis. We continue with the humor scale data set. We’ll use
Principal Axis Factor Analysis to conduct the EFA. We’ll compare the results of the
PCA from last week with the EFA from this week.
Last week, we decided that there were two dimensions underlying the humor scale, one
dimension relating to a Rickles style of humor and the other dimension related to an
Allen style of humor. Using Principal Components Analysis, we forced these two
dimensions (or components) to be uncorrelated. However, are they really uncorrelated?
We’ll use a different approach and rotation method to explore whether these two
dimensions are really uncorrelated.
Conceptually, scores for each item can be separated into two distinct sources, one source
due to true differences on the underlying dimension and one source due to measurement
error. Principal Components Analysis makes no attempt to separate the variance in each
item due to these two sources. In other words, PCA analyzes all of the variance in the set
of items. In contrast, EFA (and factor analysis in general) attempts to separate item
variance into these two sources, variance due to true scores and variance due to error.
We change the names slightly in factor analysis. Error variance is called “unique
variance” and true score variance is called “communality.” An item’s communality is the
percentage of variance that the item shares with the underlying factor. Naturally, we
want this to be high as we hope that the items are at least moderately related to the
underlying factor(s).
Now the difficulty is in determining each item’s communality. There are several ways to
do this as outlined in Tabachnick and Fidell. The details of how this is accomplished are
less important than the big picture idea that differentiates PCA from EFA: PCA ignores
measurement error and EFA attempts to estimate the measurement error in each item by
estimating its communality. Another way to think of this is that PCA bases its solution
on the total variance in each item whereas EFA bases its solution on the variance in each
item due to the underlying factors. Thus, EFA is more closely aligned with issues of
reliability and measurement error in conjunction with data reduction, whereas PCA is
seen more strictly as a data reduction technique.
So there are two issues we’ll focus on today: 1) conducting an EFA and focusing on
communalities and output interpretation and 2) rotating the factor solution so factors can
be correlated.
EFA of the Humor Scale
DATAFILE: HUMOR.SAV
We conduct an EFA of the humor scale using Principal Axis Factoring.
Analyze → Data Reduction → Factor Analysis
Select all 10 items and move them to the Variables box.
Click Descriptives, initial solution should be selected by default; in the “Correlation
Matrix” section select Reproduced. Click Continue.
Click Extraction, select Principal Axis Factoring from the “Method” drop down menu,
select scree plot (unrotated factor solution should be selected by default); in the “Extract”
section click on the button next to Number of factors and type “2” in the box. Click
Continue.
Click Rotation, select Direct Oblimin (in the Display box, rotated solution should be
selected by default). Click Continue.
Click Options, in the “Coefficient Display Format” section select Sort by size. Click
Continue.
Click OK.
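If you prefer syntax, clicking Paste instead of OK will generate SPSS commands along
the following lines. This is a sketch rather than the exact pasted output, and it assumes
the ten humor items are named item01 through item10 (the same naming used in the
Appendix):

FACTOR
  /VARIABLES item01 item02 item03 item04 item05 item06 item07 item08 item09 item10
  /PRINT INITIAL EXTRACTION ROTATION REPR
  /FORMAT SORT
  /PLOT EIGEN
  /CRITERIA FACTORS(2)
  /EXTRACTION PAF
  /ROTATION OBLIMIN .

Running this syntax should produce essentially the same output as the menu steps above.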
Factor Analysis
Communalities

                                                                    Initial   Extraction
I like to make fun of others.                                         .147       .148
I make people laugh by making fun of myself.                          .424       .403
People find me funny when I make jokes about others.                  .475       .700
I talk about my problems to make people laugh.                        .462       .395
I frequently make others the target of my jokes.                      .395       .472
People find me funny when I tell them about my failings.              .404       .401
I love to get people to laugh by using sarcasm.                       .248       .282
I am funniest when I talk about my own weaknesses.                    .339       .343
I make people laugh by exposing other people's stupidities.           .209       .199
I am funny when I tell others about the dumb things I have done.      .397       .312

Extraction Method: Principal Axis Factoring.
These variables are in the order in which you specified them in the analysis (i.e., they are
not sorted by size yet). We see both the initial and the post-extraction
communality estimates. They range from .148 for item 1 to .700 for item 3. The
communalities are the percentage of variance in each item accounted for by the factors.
Thus, variability in the responses to item 3 is well described by the two factors whereas
variability in the responses to item 1 is less well described by the two factors. It seems
that item 1 has a lot of error variance/measurement error.
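To connect this back to the idea of unique variance: for item 1, 1 - .148 = .852, so
roughly 85% of the variance in "I like to make fun of others" is unique (error) variance
under the two-factor solution, whereas for item 3 only about 30% (1 - .700) is.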
Total Variance Explained

                Initial Eigenvalues                Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings(a)
Factor   Total   % of Variance   Cumulative %      Total   % of Variance   Cumulative %   Total
1        2.510       25.103         25.103         1.872       18.724         18.724      1.864
2        2.319       23.186         48.290         1.782       17.822         36.546      1.796
3        1.130       11.297         59.587
4         .960        9.600         69.187
5         .759        7.585         76.772
6         .649        6.487         83.259
7         .583        5.826         89.085
8         .445        4.447         93.532
9         .334        3.337         96.869
10        .313        3.131        100.000

Extraction Method: Principal Axis Factoring.
a. When factors are correlated, sums of squared loadings cannot be added to obtain a total variance.
We see here the initial eigenvalues, the extraction sums of squared loadings, and their
values after rotation. The initial eigenvalues are the same as they were for the PCA last
week. Note that the extraction sums of squared loadings are smaller than last week's
(Factor 1: 1.87 vs. 2.51; Factor 2: 1.78 vs. 2.32). This reflects the difference between
PCA and EFA in how each handles (or ignores) error variance, as discussed above, so it
is no surprise that these values are lower. The rotation sums of squared loadings differ
only a little from the extraction sums of squared loadings. Because the extraction sums
of squared loadings are computed assuming uncorrelated (orthogonal) factors, and these
values change little after rotation, we might surmise that the two factors are close to
being uncorrelated. We'll see whether this conjecture holds later in the output.
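As a quick arithmetic check (our own calculation, not part of the SPSS output): each
% of Variance entry is simply the corresponding sum of squared loadings divided by the
number of items. For example, the initial eigenvalue for Factor 1 gives 2.510/10 = 25.1%,
and the extraction sum of squared loadings gives 1.872/10 = 18.7%.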
Factor Matrix(a)

                                                                       Factor
                                                                      1        2
I make people laugh by making fun of myself.                        .634     .031
I talk about my problems to make people laugh.                      .576     .252
I am funniest when I talk about my own weaknesses.                  .559     .175
People find me funny when I tell them about my failings.            .546     .321
I am funny when I tell others about the dumb things I have done.    .537     .153
I like to make fun of others.                                      -.290     .253
People find me funny when I make jokes about others.               -.201     .812
I frequently make others the target of my jokes.                   -.143     .672
I love to get people to laugh by using sarcasm.                    -.270     .458
I make people laugh by exposing other people's stupidities.        -.153     .420

Extraction Method: Principal Axis Factoring.
a. 2 factors extracted. 11 iterations required.
This table includes the factor loadings for the 10 items onto the two factors pre-rotation
(i.e., the two factors are still forced to be uncorrelated). Note that the order of the items
has been changed so that the items with the largest loadings on each factor occur first.
You’ll recognize the first five items as belonging to the Allen scale and the last four items
belonging to the Rickles Scale. The item “I like to make fun of others” loads about
equally on each scale and might be deemed “complex.” Again, though, this matrix
represents pre-rotation factor loadings, so we won’t worry about it too much.
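A useful check that ties this matrix back to the Communalities table (again our own
arithmetic, not SPSS output): because the extracted factors are orthogonal before
rotation, an item's extraction communality is the sum of its squared loadings in this
matrix. For "I make people laugh by making fun of myself," .634² + .031² ≈ .403, which
matches the Extraction column above.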
Reproduced Correlations

[Reproduced Correlations table: the upper panel gives the correlations among the 10 items
as implied by the two-factor solution, with the reproduced communalities on the diagonal;
the lower panel gives the residuals, i.e., observed minus reproduced correlations.
Extraction Method: Principal Axis Factoring.]
Here we see the residual matrix. The output alerts you that 42% of the residuals are
larger than .05. The largest residual is .202 between the item “I talk about my problems
to make people laugh” and “I make people laugh by making fun of myself.” On the
whole, the residuals are small and show no apparent pattern that would warrant fitting a
third factor. Things look okay, so we'll continue.
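Incidentally, to see where a reproduced correlation comes from (our own arithmetic based
on the unrotated loadings above): the reproduced correlation between "I make people
laugh by making fun of myself" and "I talk about my problems to make people laugh" is
(.634)(.576) + (.031)(.252) ≈ .37. Because a residual is the observed correlation minus
the reproduced correlation, the .202 residual implies that the observed correlation
between these two items is about .58, which the two-factor solution under-reproduces.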
Pattern Matrix(a)

                                                                       Factor
                                                                      1        2
I talk about my problems to make people laugh.                      .630     .061
People find me funny when I tell them about my failings.            .630     .136
I make people laugh by making fun of myself.                        .598    -.168
I am funniest when I talk about my own weaknesses.                  .585    -.007
I am funny when I tell others about the dumb things I have done.    .556    -.021
People find me funny when I make jokes about others.                .129     .837
I frequently make others the target of my jokes.                    .129     .685
I love to get people to laugh by using sarcasm.                    -.073     .521
I make people laugh by exposing other people's stupidities.         .022     .448
I like to make fun of others.                                      -.170     .331

Extraction Method: Principal Axis Factoring.
Rotation Method: Oblimin with Kaiser Normalization.
a. Rotation converged in 4 iterations.
After an oblique rotation, Tabachnick and Fidell suggest interpreting the pattern matrix.
(For discussion of the issues and dangers, see page 602.) The pattern matrix gives
coefficients that describe the unique relationship between each item and each factor
(controlling for the other factors). After rotation, the five Allen items all hang together
on the first factor and the last five Rickles items all hang together on the second factor.
All of the primary loadings are above the |.30| level used to flag a “salient” loading, and
none of the cross-loadings reach it. We would call this a “clean” solution: there are no
complex items, and each item’s loading on its primary factor is above the salient threshold.
Structure Matrix

                                                                       Factor
                                                                      1        2
I talk about my problems to make people laugh.                      .626     .009
People find me funny when I tell them about my failings.            .619     .085
I make people laugh by making fun of myself.                        .612    -.217
I am funniest when I talk about my own weaknesses.                  .586    -.054
I am funny when I tell others about the dumb things I have done.    .558    -.066
People find me funny when I make jokes about others.                .061     .827
I frequently make others the target of my jokes.                    .073     .675
I love to get people to laugh by using sarcasm.                    -.115     .527
I make people laugh by exposing other people's stupidities.        -.015     .446
I like to make fun of others.                                      -.197     .345

Extraction Method: Principal Axis Factoring.
Rotation Method: Oblimin with Kaiser Normalization.
The structure matrix yields the correlation of each item with each factor. Though people
tend not to report the structure matrix, it is a good idea to inspect it to see whether the
interpretation from the pattern matrix also holds for the structure matrix. We see that
these two matrices yield similar conclusions.
Factor Correlation Matrix

Factor        1         2
1           1.000     -.082
2           -.082     1.000

Extraction Method: Principal Axis Factoring.
Rotation Method: Oblimin with Kaiser Normalization.
Here we see the correlation between the Allen and Rickles factors. Note that the
correlation is not zero (though it is close to zero). The correlation is negative, which
suggests that people who score high on the Rickles scale tend to score lower on the
Allen scale.
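As a sanity check connecting the last three tables (our own arithmetic, not SPSS output):
each structure coefficient equals the item’s pattern coefficients weighted by the factor
correlations. For “I talk about my problems to make people laugh,” Factor 1 gives
.630 + (.061)(-.082) ≈ .625 and Factor 2 gives (.630)(-.082) + .061 ≈ .009, matching the
Structure Matrix values within rounding.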
Summary
We’ve conducted an EFA of the humor scale. We found evidence of a good fit of a
two-factor solution to the data. The two factors are interpretable as Allen-type humor and
Rickles-type humor. These factors correlated negatively and weakly. The two-factor
solution is clean with no complex items and all items are salient on their respective
factors. These results suggest that we could create two scales based on the 10 items, one
an Allen scale and one a Rickles scale, and include these scale scores in later analyses, if
desired.
The table below summarizes the different PCA/EFA procedures that we’ve covered.
FA Procedure             Extraction                 Components are                    Rotation          Matrix of Interest
PCA unrotated solution   Principal Components       Orthogonal (i.e., uncorrelated)   None              Component Matrix
PCA rotated solution     Principal Components       Orthogonal (i.e., uncorrelated)   Varimax           Rotated Component Matrix
EFA rotated solution     Principal Axis Factoring   Oblique (i.e., correlated)        Direct Oblimin    Pattern Matrix
APPENDIX
Review: Creating Composite Scores
A common method of computing composite scores is to calculate the mean of the item
responses. We learned this waaaaaay back in the first week of PSY 522/622, but let’s
practice by creating a mean composite score for the Rickles scale.
Using the “mean” command:
COMPUTE rickles = mean(item01,item03,item05,item07,item09) .
EXECUTE .
This version of the command will compute a mean from any/all
available data. If a participant only responded to one of these items (e.g., let’s say they
responded 5 to item 1) then they would receive a value of 5 for the rickles variable.
A slightly different way to compute mean composite scores:
COMPUTE rickles = mean.3(item01,item03,item05,item07,item09) .
EXECUTE .
This version of the command will only compute a mean composite score for
individuals who responded to at least 3 of the 5 items. If a participant only responded to
item 1, they will not get a score for the rickles variable.
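For example (our own illustration): a participant who answered item01 = 5 and
item03 = 3 but skipped the other three Rickles items would receive rickles = 4 from the
mean() version above but a system-missing value from this mean.3() version.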
The following command produces the same score as the first command when a
participant responded to all five items. Note, however, that it is not identical for cases
with missing data: because SPSS arithmetic returns a system-missing value whenever any
operand is missing, this version only yields a composite score for participants who
responded to every item:
COMPUTE rickles = (item01+item03+item05+item07+item09)/5 .
EXECUTE .