Quit Whining and Learn Factor Analysis Already

Factor Analysis is Your Friend
AnnMaria De Mars, PhD.
The Julia Group & 7 Generation Games
WHY?
Imagine this
What exactly were you planning on
doing with that?
Let’s say you have
a massive pile of
data …
You Could:
“9% of adolescents reported blah blah blah”
“23% of adults said blah blah blah”
The Problem:
1. Boring! No one is going to read each one.
2. You cannot run a separate relational analysis on
every variable: that many tests is a statistical sin.
3. Individual items are notoriously unreliable.
SUBSCALES ?
•Do you own guinea pigs?
•Do you have any stuffed
animals?
Don’t Be Scared of Factor Analysis!
Conceptually, it’s pretty simple
Image Source: scottyoungpsalm37.blogspot.com
Factor analysis is for …
• 1. Revealing patterns of interrelationships among
variables
• 2. Detecting clusters of variables
• 3. Reducing a large number of variables to a
smaller number of variables, the factors of factor
analysis.
You can factor analyze anything
– test scores,
– individual items on a test
– measurements of various dimensions (e.g., height
or weight)
– agricultural measures like yield of a rice field
– socioeconomic measures
Is this a fair test?
• You have studied for a final exam in Biology
101. There is one question, “What is the
relationship between respiration and
photosynthesis?”
• Your child is in fifth grade. Her weekly
spelling test consists of one word.
You already understand this
True Variance vs. Error Variance
Variability & Validity
Image Source: www.edudemic.com
Questions …
How do I …
• Decide on the number of factors
• Interpret factors
• Do it all with SAS Enterprise Guide
How to do Factor Analysis using
SAS Enterprise Guide
A Brief Overview of the Process
1. Open a data set, run a factor analysis, and
observe the data’s fit.
2. If necessary, run a correlation analysis to
create a dataset to analyze
3. If necessary, make modifications and run
your model once or twice more
A complete
project
Our data
• From the 500 Family Study
• Hundreds of questions answered
• Example uses 42 items asked of adolescents
regarding parent communication, rules,
decision-making
FILE > OPEN > DATA
Select the variables
Hold down shift key to select more than one at a time
TASKS > MULTIVARIATE > FACTOR ANALYSIS
Look at your log first!
• When you get your results, do NOT look at
your results first. Be smarter than most people
and look at your log. To do that you click on
the tab that says LOG
What if you see this?
• WARNING: 123 OF 465 OBSERVATIONS IN
DATA SET WORK.SORTTEMTABLESORTED
OMITTED DUE TO MISSING VALUES.
• If we didn’t have a lot of people missing data,
we could skip the next few steps, but hey,
that’s life.
TASKS > DESCRIBE > SUMMARY STATISTICS
Drag and drop to select variables to
analyze
Why factor analyze the correlation
matrix?
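• The default for SAS is to delete a record if it is
missing ANY of the variables.
• My first analysis was missing 120 records, but no single item was missing
for more than 49 people.
• A correlation matrix computed from every available pair of answers keeps
most of that data in play.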
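To see why dropping whole records hurts so much, here is a minimal Python sketch (an illustration only, not SAS and not the 500 Family Study data). The six records and the item names `q1` to `q3` are made up. It compares how many records are lost when any missing item disqualifies a record against how many answers each pair of items still shares.

```python
# Hypothetical responses; None marks a skipped item.
records = [
    {"q1": 4, "q2": 3, "q3": 5},
    {"q1": None, "q2": 2, "q3": 4},
    {"q1": 5, "q2": None, "q3": 3},
    {"q1": 2, "q2": 4, "q3": None},
    {"q1": 3, "q2": 5, "q3": 1},
    {"q1": 1, "q2": 1, "q3": 2},
]
items = ["q1", "q2", "q3"]

# Listwise deletion: a record is dropped if ANY item is missing.
dropped_listwise = sum(1 for r in records if any(r[i] is None for i in items))

# Per-item missing counts: what each single item costs on its own.
missing_per_item = {i: sum(1 for r in records if r[i] is None) for i in items}

# Pairwise counts: each correlation uses every record that answered both items.
pair_counts = {
    (a, b): sum(1 for r in records if r[a] is not None and r[b] is not None)
    for k, a in enumerate(items)
    for b in items[k + 1:]
}

print(f"Records dropped listwise: {dropped_listwise} of {len(records)}")
print(f"Missing per item: {missing_per_item}")
print(f"Records available per pair: {pair_counts}")
```

No single item is missing more than once, yet half of the records vanish under listwise deletion, while every pair of items still has four usable records.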
Select the variables you want
by clicking on them and pressing the blue arrows in
between the panes
Step Two: Create a Correlation Matrix
Dataset
DATA SET > TASKS > MULTIVARIATE > CORRELATIONS.
Output matrix as SAS dataset of
type=CORR:
OUTPUT DATA > SAVE OUTPUT > DATA > RUN
TASKS > MULTIVARIATE > FACTOR ANALYSIS
Select the variables
Hold down shift key to select more than one at a time
So now what does all of this factor analysis crap mean, anyway?
What exactly is a factor?
• Recall, a factor is some underlying trait that is
measured indirectly by the items you
measured directly.
• Factor analysis is AKA a ‘dimension reduction technique’
How many factors?
3 possibilities
Eigenvalue
• The amount of variance in the individual
measures explained by the factor.
• Square the loadings in that factor’s column of the unrotated factor pattern and add them up. The total is the eigenvalue.
Prediction: At least one person who reads this will do exactly that
and be surprised that I am right. Contrary to appearances, I do not make this s*** up.
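If you would rather not do that by hand, here is a small Python sketch of the same arithmetic (an illustration, not SAS output). It assumes a made-up example: two standardized items correlated 0.6, analyzed as unrotated principal components. A 2x2 correlation matrix is used because its eigenvalues and eigenvectors are known in closed form.

```python
import math

r = 0.6  # hypothetical correlation between two standardized items
R = [[1.0, r], [r, 1.0]]

# For a 2x2 correlation matrix the eigenvalues are 1 + r and 1 - r,
# with unit eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
eigenvalues = [1 + r, 1 - r]
root_half = 1 / math.sqrt(2)
eigenvectors = [[root_half, root_half], [root_half, -root_half]]

# Sanity check that these really satisfy R v = lambda v.
for lam, v in zip(eigenvalues, eigenvectors):
    Rv = [sum(R[i][j] * v[j] for j in range(2)) for i in range(2)]
    assert all(math.isclose(Rv[i], lam * v[i]) for i in range(2))

# Unrotated loadings: each eigenvector scaled by the square root of its eigenvalue.
# loadings[k][i] is the loading of item i on factor k.
loadings = [[math.sqrt(lam) * x for x in v] for lam, v in zip(eigenvalues, eigenvectors)]

# Square the loadings for each factor and add them up.
sums_of_squares = [sum(x * x for x in column) for column in loadings]

for k, (lam, ss) in enumerate(zip(eigenvalues, sums_of_squares), start=1):
    print(f"Factor {k}: eigenvalue = {lam:.3f}, sum of squared loadings = {ss:.3f}")
```

The two columns of numbers printed for each factor agree, which is the point of the slide. After a rotation the sums of squared loadings are redistributed across factors, so the match holds for the unrotated pattern.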
Eigenvalue
• Common criterion for deciding the number of
factors is “Minimum eigenvalue greater than
1.”
• Makes intuitive sense (a factor should explain at least as much variance as a single standardized item) but …
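As a quick sketch of the rule in Python (an illustration, not SAS): the six eigenvalues below are invented for a hypothetical 6-item analysis. Because each standardized item contributes one unit of variance, the eigenvalues of a correlation matrix add up to the number of items.

```python
# Hypothetical eigenvalues for a 6-item correlation matrix (not from the talk).
eigenvalues = [2.9, 1.4, 0.7, 0.5, 0.3, 0.2]

# Total variance equals the number of standardized items.
total_variance = sum(eigenvalues)

# Minimum-eigenvalue rule: keep factors worth more than one item's variance.
retained = [ev for ev in eigenvalues if ev > 1.0]

for k, ev in enumerate(eigenvalues, start=1):
    verdict = "keep" if ev > 1.0 else "drop"
    print(f"Factor {k}: eigenvalue {ev:.1f}, {ev / total_variance:.0%} of variance, {verdict}")
print(f"Factors retained: {len(retained)}")
```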
Method 3
• Parallel analysis criterion (macros available –
Google it)
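The idea behind parallel analysis: keep a factor only if its eigenvalue beats the eigenvalue you would get at the same position from pure random data with the same number of records and items. What follows is a rough pure-Python sketch, not one of the SAS macros. The simulated data, the function names, the 50 random replications, and the use of the mean random eigenvalue (rather than, say, a 95th percentile) are all assumptions made for illustration.

```python
import math
import random

def correlation_matrix(rows):
    """Pearson correlations between the columns of a list-of-rows dataset."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    ss = [sum(row[j] ** 2 for row in centered) for j in range(p)]
    return [[sum(row[i] * row[j] for row in centered) / math.sqrt(ss[i] * ss[j])
             for j in range(p)] for i in range(p)]

def eigenvalues_symmetric(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    p = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if a[i][j] == 0.0:
                    continue
                diff = a[i][i] - a[j][j]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[i][j] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki + s * akj, -s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik + s * ajk, -s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)

def parallel_analysis(rows, n_sims=50, seed=0):
    """Count leading factors whose eigenvalue exceeds the random-data average."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    observed = eigenvalues_symmetric(correlation_matrix(rows))
    totals = [0.0] * p
    for _ in range(n_sims):
        noise = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
        for k, ev in enumerate(eigenvalues_symmetric(correlation_matrix(noise))):
            totals[k] += ev
    random_mean = [t / n_sims for t in totals]
    retained = 0
    for obs, rnd in zip(observed, random_mean):
        if obs <= rnd:
            break
        retained += 1
    return retained, observed, random_mean

# Simulated data: 300 records, 6 items, two underlying traits with 3 items each.
data_rng = random.Random(42)
rows = []
for _ in range(300):
    f1, f2 = data_rng.gauss(0, 1), data_rng.gauss(0, 1)
    rows.append([0.8 * f + 0.6 * data_rng.gauss(0, 1) for f in (f1, f1, f1, f2, f2, f2)])

retained, observed, random_mean = parallel_analysis(rows)
for k, (obs, rnd) in enumerate(zip(observed, random_mean), start=1):
    print(f"Factor {k}: observed {obs:.2f} vs random {rnd:.2f}")
print(f"Factors retained: {retained}")
```

The Jacobi routine is only there to keep the sketch free of third-party packages. With a numerical library installed, its own symmetric eigenvalue solver would be the natural replacement.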
Now do it again
What do the factors
mean?
Important Point One
• The correlation of a variable with a factor is
called the loading.
– loadings can be positive or negative.
Important point two
• To ease interpretation we’d really like to have
“simple structure”
– variables load close to 1.0 on one factor and close
to zero on the others.
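One rough way to check for simple structure is to flag, for each item, which factors it loads on at all. The Python sketch below is an illustration only: the item names and loadings are invented, and the 0.40 cutoff for a "salient" loading is a common rule of thumb assumed here, not something from the slides.

```python
# Hypothetical rotated loadings: item -> [loading on factor 1, loading on factor 2].
loadings = {
    "talk_with_parents": [0.82, 0.05],
    "parents_listen": [0.76, 0.12],
    "curfew_rules": [0.08, 0.79],
    "homework_rules": [0.45, 0.48],
}
SALIENT = 0.40  # assumed rule-of-thumb cutoff

def classify(row, cutoff=SALIENT):
    """Describe which factors an item loads on, using absolute loadings."""
    salient = [k + 1 for k, x in enumerate(row) if abs(x) >= cutoff]
    if len(salient) == 1:
        return f"loads on factor {salient[0]}"
    if not salient:
        return "loads on no factor"
    return "cross-loads on factors " + ", ".join(str(k) for k in salient)

report = {item: classify(row) for item, row in loadings.items()}
for item, verdict in report.items():
    print(f"{item}: {verdict}")
```

Items that cross-load or load on nothing are the ones to look at when deciding what to modify before the next run.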
Next step: iterate
No, it is not a party in the pool, sorry.
Secret to factor analysis
• Finding a solution that
is defensible BOTH
statistically and
theoretically
• Nice first step to
structural equation
modeling. In fact, it IS
the first step.