Factor Analysis is Your Friend
AnnMaria De Mars, PhD.
The Julia Group & 7 Generation Games

WHY?

Imagine this
What exactly were you planning on doing with that?

Let’s say you have a massive pile of data …
You Could:
“9% of adolescents reported blah blah blah”
“23% of adults said blah blah blah”

The Problem:
1. Boring! No one is going to read each one.
2. Cannot conduct relational analysis of each variable (statistical sin).
3. Individual items are notoriously unreliable.

SUBSCALES?
• Do you own guinea pigs?
• Do you have any stuffed animals?

Don’t Be Scared of Factor Analysis!
Conceptually, it’s pretty simple
Image Source: scottyoungpsalm37.blogspot.com

Factor analysis is for …
1. Revealing patterns of interrelationships among variables
2. Detecting clusters of variables
3. Reducing a large number of variables to a smaller number of variables, the factors of factor analysis.

You can factor analyze anything
– test scores
– individual items on a test
– measurements of various dimensions (e.g., height or weight)
– agricultural measures like yield of a rice field
– socioeconomic measures

Is this a fair test?
• You have studied for a final exam in Biology 101. There is one question, “What is the relationship between respiration and photosynthesis?”
• Your child is in fifth grade. Her weekly spelling test consists of one word.

You already understand this
True Variance vs. Error Variance

Variability & Validity
Image Source: www.edudemic.com

Questions …
How do I …
• Decide on the number of factors
• Interpret factors
• With SAS Enterprise Guide

How to do Factor Analysis using SAS Enterprise Guide

A Brief Overview of the Process
1. Open a data set, run a factor analysis, and observe the data’s fit.
2. If necessary, run a correlation analysis to create a dataset to analyze.
3. If necessary, make modifications and run your model once or twice more.

A complete project

Our data
• From the 500 Family Study
• Hundreds of questions answered
• Example uses 42 items asked of adolescents regarding parent communication, rules, decision-making

FILE > OPEN > DATA

Select the variables
Hold down shift key to select more than one at a time

TASKS > MULTIVARIATE > FACTOR ANALYSIS

Look at your log first!
• When you get your results, do NOT look at your results first. Be smarter than most people and look at your log. To do that, click on the tab that says LOG.

What if you see this?
• WARNING: 123 OF 465 OBSERVATIONS IN DATA SET WORK.SORTTEMTABLESORTED OMITTED DUE TO MISSING VALUES.
• If we didn’t have a lot of people missing data, we could skip the next few steps, but hey, that’s life.

Tasks > Describe > Summary statistics
Drag and drop to select variables to analyze

Why factor analyze the correlation matrix?
• The default for SAS is to delete a record if it is missing ANY of the variables. My first analysis was missing 120 records, but no single item was missing for more than 49 people.

Select the variables you want by clicking on them and pressing the blue arrows in between the panes

Step Two: Create a Correlation Matrix Dataset
DATA SET > TASKS > MULTIVARIATE > CORRELATIONS.
Output matrix as SAS dataset of type=CORR: OUTPUT DATA > SAVE OUTPUT > DATA > RUN

TASKS > MULTIVARIATE > FACTOR ANALYSIS
Select the variables
Hold down shift key to select more than one at a time

So Now
What does all of this Factor Analysis crap mean, anyway?

What exactly is a factor?
• Recall, a factor is some underlying trait that is measured indirectly by the items you measured directly.
• AKA ‘dimension reduction technique’

How many factors?
3 possibilities

Eigenvalue
• The amount of variance in the individual measures explained by the factor. Square the loadings in the factor pattern and add them up. The total is the eigenvalue.
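The sum-of-squared-loadings claim above can be checked numerically. The sketch below is not SAS output. It is a minimal Python illustration that assumes principal component extraction from a correlation matrix with no rotation, and the three-item correlation matrix `R` is made up for the example. It finds the first factor by power iteration, builds the loadings, and compares the sum of their squares to the eigenvalue.

```python
import math

# Hypothetical correlation matrix for three items (made-up values,
# not taken from the 500 Family Study).
R = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # For a unit vector v, the Rayleigh quotient v'Rv is the eigenvalue.
    eigenvalue = sum(v[i] * matrix[i][j] * v[j]
                     for i in range(n) for j in range(n))
    return eigenvalue, v


eigenvalue, vector = leading_eigenpair(R)

# Unrotated principal-component loadings: eigenvector elements
# scaled by the square root of the eigenvalue.
loadings = [x * math.sqrt(eigenvalue) for x in vector]
sum_of_squares = sum(l * l for l in loadings)

print("loadings on factor 1:", [round(l, 3) for l in loadings])
print("sum of squared loadings:", round(sum_of_squares, 4))
print("eigenvalue:", round(eigenvalue, 4))
```

The two printed totals agree because the eigenvector has length 1, so scaling it by the square root of the eigenvalue makes the squared loadings sum to the eigenvalue. After a rotation the column sums of squares change, so this check applies to the unrotated factor pattern.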
Prediction: At least one person who reads this will do exactly that and be surprised that I am right. Contrary to appearances, I do not make this s*** up.

Eigenvalue
• A common criterion for deciding the number of factors is “Minimum eigenvalue greater than 1.”
• Makes intuitive sense but …

Method 3
• Parallel analysis criterion (macros available – Google it)

Now do it again

What do the factors mean?

Important Point One
• The correlation of a variable with a factor is called the loading.
– loadings can be positive or negative.

Important Point Two
• To ease interpretation we’d really like to have “simple structure”
– variables load close to 1.0 on one factor and close to zero on the others.

Next step: iterate
No, it is not party in the pool, sorry.

Secret to factor analysis
• Finding a solution that is defensible BOTH statistically and theoretically
• Nice first step to structural equation modeling. In fact, it IS the first step.
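For readers curious about the parallel analysis criterion listed as Method 3, here is a rough sketch of the idea in Python rather than a SAS macro. It compares the eigenvalues of the observed correlation matrix with the average eigenvalues from random data of the same size, and keeps factors only while the observed value is larger. Everything here is an assumption made for illustration: the simulated six-item data set with two underlying factors, the function names, the 50 replications, and the use of the mean (some versions use the 95th percentile instead). It uses only the standard library, so the eigenvalues come from a small Jacobi routine; in practice a linear algebra library would do that step.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlations between the columns of a list of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    sds = [math.sqrt(sum(c[j] ** 2 for c in centered)) for j in range(p)]
    return [[sum(c[i] * c[j] for c in centered) / (sds[i] * sds[j])
             for j in range(p)] for i in range(p)]


def eigenvalues(sym):
    """Eigenvalues of a symmetric matrix (cyclic Jacobi), largest first."""
    a = [row[:] for row in sym]
    n = len(a)
    for _ in range(100):  # sweeps; small matrices converge in a handful
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-20:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (
                    abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def parallel_analysis(rows, replications=50, seed=0):
    """Count factors whose eigenvalue beats the mean random eigenvalue."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    observed = eigenvalues(correlation_matrix(rows))
    totals = [0.0] * p
    for _ in range(replications):
        noise = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
        for j, ev in enumerate(eigenvalues(correlation_matrix(noise))):
            totals[j] += ev
    random_mean = [t / replications for t in totals]
    retained = 0
    for obs, rnd in zip(observed, random_mean):
        if obs <= rnd:
            break  # stop at the first factor that does not beat chance
        retained += 1
    return observed, random_mean, retained


# Simulated data (hypothetical): 300 respondents, six items,
# items 1-3 driven by one factor and items 4-6 by another.
rng = random.Random(42)
data = []
for _ in range(300):
    f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
    row = [0.8 * f1 + 0.6 * rng.gauss(0, 1) for _ in range(3)]
    row += [0.8 * f2 + 0.6 * rng.gauss(0, 1) for _ in range(3)]
    data.append(row)

observed, random_mean, n_factors = parallel_analysis(data)
print("observed eigenvalues:", [round(e, 2) for e in observed])
print("mean random eigenvalues:", [round(e, 2) for e in random_mean])
print("factors retained:", n_factors)
```

The comparison against random data is what separates this from the “minimum eigenvalue greater than 1” rule: with a finite sample, even pure noise produces a few eigenvalues above 1, so the bar for keeping a factor is set by what chance alone would produce.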