Multivariate Statistics
Principal Components/Factor Analysis
Structural Equation Modeling

Principal Components & Factor Analysis
• We have only one set of variables.
• Each is well correlated with some of the others.
• We want to capture the variance, or covariance, of the p variables,
• repackaging it into m components or factors [usually with m < p],
• each of which is a weighted linear combination of the variables.

Creation or Discovery?
• I tend to think of the components/factors as being things we have created out of the data.
• Psychologists are more likely to think of them as estimates of underlying dimensions (latent variables).
• I am skeptical about their being a concrete reality that we can know; I think our reality is created.

Data Reduction
• PCA/FA may be used to reduce the p variables to a smaller set of m components/factors for use in subsequent analysis.
• Chia, Wuensch, Childers, Chuang, Cheng, Cesar-Romero, & Nava (1994)
• Students in Mexico, Taiwan, and the US
• 45-item “family values” scale.
• My research associate wanted me to conduct 45 three-way ANOVAs (one on each item).
• I balked and insisted on reducing the 45 variables to a smaller number of components.
• Then I did a Culture x Sex x Age (under 20 vs. over 20) ANOVA on each component.
• The loadings were used to name the components.

The Seven Components
1. Family Solidarity (respect for the family)
2. Executive Male (men make decisions, women are homemakers)
3. Conscience (important for family to conform to social and moral standards)
4. Equality of the Sexes (minimizing sexual stereotyping)
5. Temporal Farsightedness (interest in the future and the past)
6. Independence (desire for material possessions with freedom from parental constraint)
7. Spousal Employment (each spouse should make decisions about his/her own job)

Results of the ANOVAs
• US students (especially the women) stood out as being sexually egalitarian, wanting independence.
• Younger US students put little importance on family solidarity.
• Taiwanese students were distinguished by scoring very high on the temporal farsightedness component.
• Taiwanese students were low on the conscience component.
• Among Taiwanese students the men were more sexually egalitarian than the women,
• and the women were more concerned with independence than were the men.
• The Mexican students were like the Taiwanese in being concerned with family solidarity,
• but not with sexual egalitarianism and independence.
• The Mexican students were like the US students in attaching more importance to conscience and less to temporal farsightedness than did the Taiwanese.
• Among the Mexican students the men attached more importance to independence than did the women.

Factor Analysis and Test Development
• The 21 items in Patel’s SBS
• were designed to measure a single dimension.
• FA indicated there were three dimensions.
  – avoidance behaviors (such as moving away from a gay person): 13 items
  – aggression from a distance (such as making harassing phone calls): 6 items
  – up-close aggression (physical fighting): 2 items

Item Analysis
• For a scale or subscale, see how well each item correlates with the sum of scores on the other items.
• Find Cronbach's alpha, the average split-half reliability, corrected for attenuation.
• This is a conservative estimate of reliability.
• Would deletion of any item increase Cronbach's alpha?

Homework Assignment
• Please read the document Cronbach's Alpha and Maximized Lambda4.
• Follow the instructions there to conduct an item analysis with SAS and with SPSS.
• Bring your output to class for discussion.

Path Analysis
• This is a simplification of Structural Equation Modeling.
• Only measured variables are included, no latent variables.
• We test to see how well our causal model fits with the (nonexperimentally obtained) data.

Sunita Patel’s Thesis
The data are the same as those discussed earlier under the topic of canonical correlation.
Significant paths are in red. The path coefficients are standardized, like beta weights.

Structural Equation Modeling
• Essentially a combination of sequential multiple regression and factor analysis.
• There are two parts of the model.
• The measurement model is how the observed variables are related to latent variables (factors).
• The structural model relates the latent variables to one another.

Student Opinion of Instruction Surveys
• Greenwald and Gillmore (1997)
• Measured variables are items on the survey.
• Latent variables are
  – Expected grade in course
  – Work put into the course by the student
  – The evaluation the student gives the professor

Conclusions
• Professors who have lenient grading policies (making Expected Grades high)
  – Get good evaluations
  – Do not motivate the students to put much work into the course
• Professors with stringent grading policies
  – Get poor evaluations
  – Do motivate the students to work hard on the course.

Confirmatory Factor Analysis
• A type of SEM.
• How well does an a priori factor model fit the data?
• Example from Tabachnick & Fidell (2007)
  – Measured variables are subscales of the Wechsler Intelligence Scale for Children (WISC)
  – Latent variables are Verbal IQ and Performance IQ.

Goodness of Fit
• There are several statistics that measure how well the model fits the data, and standards for how well is good enough.
• These are used in path analysis, CFA, and other applications of SEM.
• One may also modify the model and see how that affects the fit.

Adding the path in red produced noticeable improvements in the goodness of fit statistics.
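
As a rough illustration of how fit statistics can be compared before and after adding a path, here is a minimal Python sketch. The slides do not name specific indices; RMSEA and CFI are assumed here as two commonly reported ones, computed from model chi-square values with the N - 1 convention for RMSEA (some software uses N instead). Every number in the example (chi-squares, degrees of freedom, sample size) is hypothetical and is not taken from any of the studies above.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation (N - 1 convention, an assumption)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    # Misfit beyond what is expected by chance, floored at zero.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int, chi2_base: float, df_base: int) -> float:
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    if denom == 0.0:
        return 1.0  # neither model shows misfit beyond chance
    return 1.0 - d_model / denom


# Hypothetical values for illustration only.
N = 200
BASELINE = (400.0, 15)  # (chi-square, df) for the independence model
ORIGINAL = (60.0, 9)    # a priori model
MODIFIED = (12.0, 8)    # same model with one added path (one fewer df)

for label, (chi2, df) in (("original", ORIGINAL), ("modified", MODIFIED)):
    print(
        f"{label}: chi2={chi2:.1f}, df={df}, "
        f"RMSEA={rmsea(chi2, df, N):.3f}, "
        f"CFI={cfi(chi2, df, *BASELINE):.3f}"
    )
```

Lower RMSEA and higher CFI indicate better fit. Cutoffs for "good enough" vary by source, so none are hard-coded here.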