Multivariate Statistics
Least Squares ANOVA & ANCOV
Repeated Measures ANOVA
Cluster Analysis

Least Squares ANOVA
• Do the ANOVA as a multiple regression.
• Each factor with k levels is represented by k-1 dichotomous dummy variables.
• Interactions are represented as products of dummy variables.

A x B Factorial: Dummy Variables
• Two levels of A
  – A1 = 1 if at Level 1 of A, 0 if not
• Three levels of B
  – B1 = 1 if at Level 1 of B, 0 if not
  – B2 = 1 if at Level 2 of B, 0 if not
• A x B interaction (2 df)
  – A1B1 codes the one df
  – A1B2 codes the other df

A x B Factorial: The Model
• Y = a + b1A1 + b2B1 + b3B2 + b4A1B1 + b5A1B2 + error.
• Do the multiple regression.
• The regression SS represents the combined effects of A and B (and their interaction).

Partitioning the Sums of Squares
• Drop A1 from the full model.
  – The decrease in the regression SS is the SS for the main effect of A.
• Drop B1 and B2 from the full model.
  – The decrease in the regression SS is the SS for the main effect of B.
• Drop A1B1 and A1B2 from the full model.
  – The decrease in the regression SS is the SS for the interaction.

Unique Sums of Squares
• This method produces a unique sum of squares for each effect, representing the effect after eliminating overlap with any other effects in the model.
• In SAS these are Type III sums of squares.
• Overall and Spiegel called them Method I sums of squares.

Analysis of Covariance
• Simply put, this is a multiple regression where there are both categorical and continuous predictors.
• In the ideal circumstance (the grouping variables are experimentally manipulated), there will be no association between the covariate and the grouping variables.
• Adding the covariate to the model may reduce the error sum of squares and give you more power.
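The drop-a-term partitioning described for the A x B factorial can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the 2 x 3 data set is made up, the helper names (`regression_ss`, `ss_for`) are hypothetical, and the predictors use effect coding (1, 0, -1) rather than 0/1 coding. The coding choice matters: with 0/1 dummy variables and product terms left in the model, dropping A1 tests the effect of A at the reference level of B, not the Type III main effect.

```python
import numpy as np

# Hypothetical balanced 2 x 3 data set: two scores per cell.
a = np.array([1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2])
b = np.array([1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3])
y = np.array([3., 5., 4., 6., 8., 10., 2., 4., 7., 9., 3., 5.])

# Effect codes (an assumption of this sketch): the last level of each
# factor is scored -1 on every code for that factor.
A1 = np.where(a == 1, 1.0, -1.0)
B1 = np.where(b == 1, 1.0, np.where(b == 3, -1.0, 0.0))
B2 = np.where(b == 2, 1.0, np.where(b == 3, -1.0, 0.0))
columns = {"A1": A1, "B1": B1, "B2": B2, "A1B1": A1 * B1, "A1B2": A1 * B2}


def regression_ss(names):
    """Regression SS for an OLS model with an intercept plus the named predictors."""
    X = np.column_stack([np.ones_like(y)] + [columns[n] for n in names])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((X @ beta - y.mean()) ** 2))


full = ["A1", "B1", "B2", "A1B1", "A1B2"]
ss_full = regression_ss(full)


def ss_for(dropped):
    """Decrease in regression SS when the listed predictors leave the full model."""
    kept = [n for n in full if n not in dropped]
    return ss_full - regression_ss(kept)


ss_a = ss_for(["A1"])               # main effect of A, 1 df
ss_b = ss_for(["B1", "B2"])         # main effect of B, 2 df
ss_ab = ss_for(["A1B1", "A1B2"])    # A x B interaction, 2 df
ss_error = float(np.sum((y - y.mean()) ** 2)) - ss_full

print(f"SS_A={ss_a:.2f}  SS_B={ss_b:.2f}  SS_AxB={ss_ab:.2f}  SS_error={ss_error:.2f}")
```

Because these made-up data are balanced, the three unique sums of squares add up to the full-model regression SS. With unequal cell sizes they generally would not, which is the overlap that the unique sums of squares exclude.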
Big Error = Small F, Large p
[Figure: Sums of Squares for Treatment and Error]

Add Covariate, Lower Error
[Figure: Sum of Squares for Treatment, Baseline, and Error]

Big F = Happy Researcher
[Figure: Sum of Squares for Treatment, Baseline, Scatophobin, and Error]

Confounded ANCOV
• If the data are nonexperimental, or the covariate is measured after manipulating the independent variables, then the covariate will be correlated with the grouping variables.
• Including it in the model will change the treatment sums of squares.
• That makes interpretation rather slippery.

A Simple Example
• One independent variable (A) with three levels.
• One covariate (C).
• Y = a + b1A1 + b2A2 + b3C + b4A1C + b5A2C + error.
• A1C and A2C represent the interaction between the independent variable and the covariate.

Covariate x IV Interaction
• We drop the two interaction terms from the model.
• If the regression SS decreases markedly, then the relationship between the covariate and Y varies across levels of the IV.
• This violates the homogeneity of regression assumption of the traditional ANCOV.

Wuensch & Poteat, 1998
• Decision (stop or continue the research) was not the only dependent variable.
• Subjects also were asked to indicate how justified the research was.
• Predict justification scores from
  – Idealism and relativism (covariates)
  – Sex and purpose of research (grouping variables)

Covariates Not Necessarily Nuisances
• Psychologists often think of covariates as nuisance variables.
• They want their effects taken out of error variance.
• For my research, however, I had a genuine interest in the effects of idealism and relativism.

The Results
• There were no significant interactions.
• Every main effect was significant.
• Idealism was negatively related to justification.
• Relativism was positively related to justification.
• Men thought the research more justified than did women.
• Purpose of the research had a significant effect.
• The cosmetic testing and neuroscience theory testing received mean justification ratings significantly lower than those of the medical research.
• Hmmm, our students think the cosmetic testing not justified, but they vote to continue it anyhow.

Repeated Measures ANOVA
• In the traditional (“univariate”) approach, the subjects factor is treated as an additional classification variable.
• A one-way RM ANOVA is really a two-way ANOVA, with subjects being the second factor.
• This analysis assumes sphericity.

Sphericity
• Suppose we have five levels of repeated factor A.
• Find the standard error for the difference between level j and level k.
• We assume that this standard error is constant across all (j, k) pairs.
• This assumption is frequently violated with behavioral data.

Corrections
• There are procedures that correct for violation of the assumption of sphericity.
• They reduce the degrees of freedom, much as is done in the Welch ANOVA.
• Greenhouse-Geisser is the more conservative procedure.
• Huynh-Feldt is the less conservative procedure.

The Multivariate Approach
• Suppose you have a one-way RM design with five levels of the grouping variable (G).
• You treat the scores at any one level of G as one variable, so you now have five variables (G1 through G5), not two variables (G and Y).

Orthogonal Contrasts
• Behind the scenes, your statistical program creates a complete set of orthogonal contrasts for the RM factor.
• It then tests the null that every one of those contrasts has a mean of zero.
• If that null is rejected, you conclude the RM factor has a significant effect.
• There is no sphericity assumption with the multivariate-approach analysis.

Doubly Multivariate Analysis
• Suppose that you have a design with one or more RM factor(s).
• And you also have multiple dependent variables.
• If you take the multivariate approach to analysis of the RM factor(s), then you have a doubly multivariate analysis.
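The sphericity assumption and the degrees-of-freedom corrections described under Repeated Measures ANOVA can be illustrated numerically. The sketch below assumes a made-up data matrix (8 subjects by 5 levels) and hypothetical function names. It computes the variance of the difference scores for every pair of levels, which sphericity requires to be equal, and then the standard Greenhouse-Geisser estimate of epsilon from the double-centered covariance matrix. The slides do not give this formula, so treat it as background rather than as the author's procedure.

```python
import numpy as np


def difference_variances(scores):
    """Variance of the difference scores for every pair of levels (columns)."""
    k = scores.shape[1]
    return {
        (j, m): float(np.var(scores[:, j] - scores[:, m], ddof=1))
        for j in range(k)
        for m in range(j + 1, k)
    }


def gg_epsilon(cov):
    """Greenhouse-Geisser epsilon from a k x k covariance matrix of the repeated measures."""
    k = cov.shape[0]
    center = np.eye(k) - np.ones((k, k)) / k
    dc = center @ cov @ center  # double-centered covariance matrix
    return float(np.trace(dc) ** 2 / ((k - 1) * np.sum(dc ** 2)))


# Hypothetical data: a subject effect plus noise that grows across levels,
# so the difference-score variances are deliberately unequal.
rng = np.random.default_rng(0)
n, k = 8, 5
subject_effect = rng.normal(size=(n, 1))
scores = subject_effect + rng.normal(size=(n, k)) * np.array([1.0, 1.0, 2.0, 3.0, 4.0])

diff_vars = difference_variances(scores)
epsilon = gg_epsilon(np.cov(scores, rowvar=False))

# The correction multiplies both degrees of freedom of the univariate F by epsilon.
df_effect = epsilon * (k - 1)
df_error = epsilon * (k - 1) * (n - 1)
print(f"epsilon = {epsilon:.3f}, adjusted df = ({df_effect:.2f}, {df_error:.2f})")
```

Epsilon equals 1 when sphericity holds exactly and can fall as low as 1/(k-1), so the adjusted degrees of freedom are never larger than the unadjusted ones.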
Effects of Cross-Species Rearing
• Wuensch (1992)
• Newborn Mus fostered onto Mus, Peromyscus, or Rattus.
• Tested in an apparatus where they could visit four tunnels, which smelled like
  – Clean pine shavings
  – Mus
  – Peromyscus
  – Rattus
[Images: Mus musculus, Peromyscus maniculatus, Rattus norvegicus]

The Design
• Dependent variables were
  – Latency to first visit of each tunnel
  – Number of visits to each tunnel
  – Cumulative time spent in each tunnel
• Independent variables were
  – Scent of tunnel (4 levels, within-subjects)
  – Foster species (3 levels, between-subjects)

Doubly Multivariate Results
• There were significant effects of Foster Species, Scent of Tunnel, and the interaction.
• This was followed by a univariate ANOVA, Foster Species x Scent of Tunnel, on each of the three dependent variables.

Results of the Univariate ANOVAs
• The interaction was significant for each dependent variable.
• Conducted simple main effects analysis.
• Mus reared by Rattus had significantly more visits to and cumulative time in the rat-scented tunnel than did the other groups, and shorter latencies as well.
• The other groups avoided the rat-scented tunnel.

Cluster Analysis
• Goal is to cluster cases into groups based on shared characteristics.
• Start out with each case being a one-case cluster.
• The clusters are located in v-dimensional space, where v is the number of variables.
• Compute the squared Euclidean distance between each case and each other case.

Squared Euclidean Distance
$\sum_{i=1}^{v} (X_i - Y_i)^2$
• the sum across variables (from i = 1 to v) of the squared difference between the score on variable i for the one case (Xi) and the score on variable i for the other case (Yi)

Agglomerate
• The two cases closest to each other are agglomerated into a cluster.
• The distances between entities (clusters and cases) are recomputed.
• The two entities closest to each other are agglomerated.
• This continues until all cases end up in one cluster.

What is the Correct Solution?
• You may have theoretical reasons to expect a certain k-cluster solution.
• Look at that solution and see if it matches your expectations.
• Alternatively, you may try to make sense out of solutions at two or more levels of the analysis.

Faculty Salaries
• Subjects were faculty in Psychology at ECU.
• Variables were rank, experience, number of publications, course load, and salary.
• The two-cluster solution was adjuncts versus everybody else.
• Adjuncts had lower rank, experience, number of publications, course load, and salary.

Three Cluster Solution
• Non-adjuncts were split into senior faculty and junior faculty.
• Senior faculty had higher salary, experience, rank, and number of pubs.

Four Cluster Solution
• The senior faculty were split into two groups: the acting chair of the department and all of the rest of the senior faculty.
• The acting chair had a higher salary and number of publications.

Workaholism
• Aziz & Zickar (2005)
• Workaholics may be defined as those
  – High in work involvement,
  – High in drive to work, and
  – Low in work enjoyment.
• For each case, a score was obtained for each of these three dimensions.

The Three Cluster Solution
• Workaholics
  – High work involvement
  – High drive to work
  – Low work enjoyment
• Positively engaged workers (KLW)
  – High work involvement
  – Medium drive to work
  – High work enjoyment
• Unengaged workers
  – Low work involvement
  – Low drive to work
  – Low work enjoyment
• Past research/theory indicated there should be six clusters, but the theorized six clusters were not obtained.
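The agglomeration procedure described under Cluster Analysis can be sketched in plain Python. The slides do not say how the distance between two multi-case clusters is recomputed, so this sketch assumes average linkage (the mean squared Euclidean distance over all cross-cluster pairs of cases). The function names and the five two-variable cases are hypothetical.

```python
def squared_euclidean(x, y):
    """Sum across variables of the squared difference between two cases."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def average_linkage(c1, c2, cases):
    """Mean squared distance over all cross-cluster pairs (assumed linkage rule)."""
    total = sum(squared_euclidean(cases[i], cases[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(cases, n_clusters=1):
    """Merge the two closest entities repeatedly until n_clusters remain.

    Returns the final clusters (lists of case indices) and the merge history
    as (merged cluster, distance at which it was formed) pairs.
    """
    clusters = [[i] for i in range(len(cases))]  # every case starts as its own cluster
    history = []
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = average_linkage(clusters[p], clusters[q], cases)
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        merged = sorted(clusters[p] + clusters[q])
        history.append((merged, d))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (p, q)]
        clusters.append(merged)
    return sorted(clusters), history


# Hypothetical cases measured on two variables.
cases = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
two_clusters, _ = agglomerate(cases, n_clusters=2)
three_clusters, _ = agglomerate(cases, n_clusters=3)
print(two_clusters)
print(three_clusters)
```

Comparing the solutions at two or more levels, as with the faculty salary example, amounts to calling `agglomerate` with different values of `n_clusters` and inspecting which cases split apart at each step.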