EXPERIMENTS WITH MORE THAN TWO GROUPS

EXPERIMENTAL DESIGN: ADDING TO THE BASIC BUILDING BLOCK

Every experimental design is based on the two-group design.
Moving beyond this design allows us to ask more complicated and interesting questions.

The single-factor multiple-group design
  These are designs with 1 IV that has 3 or more levels.
  They can have any number of control and experimental groups.
  Generally, the IV would have 5 or fewer levels.

Between-subjects design
  Participants are randomly assigned to the levels: different people in each level, who have not been paired or matched.
  Random assignment (RA) controls for EVs that you may be unaware of. RA will hopefully equate the groups, but you cannot be certain.
  If the groups are not equal with respect to one or more EVs, we have a confound (or confounds).

Within-subjects designs
  This design involves doing one of three things: testing the same participants in each level, using natural matches, or matching participants on one or more relevant EVs.
  Recall: if you do not match on a relevant EV, error variance will not decrease, the TS value will not change, and you will have FEWER degrees of freedom, so power decreases. You would be better off running a between-subjects design if this is the case.
  Use a within-subjects design if you have few participants (n < 20 per level) or you expect a small effect size, but note that within-subjects designs are not always feasible or possible.

Comparing the multiple-group vs. two-group single-factor designs
  Two-group designs are well suited if you first need to establish whether the IV has an effect.
  Always do a lit review first, to make sure the question has not already been answered and to give you design ideas (do’s and don’ts).
  Multiple-group designs are well suited if you want more precise information about your IV once you’ve established that it does have an effect.
  Follow the principle of parsimony (KISS): do not add more groups unless you need to.
  The more levels (groups) you have, the more difficult it can become to detect a treatment effect, because error variance tends to increase.

Choosing a multiple-groups design
  The decision to use or not use this design depends on the research question.
  Once you’ve decided that a single factor (1 IV) is appropriate, you need to decide whether to use a between-subjects or a within-subjects design. This decision depends on:
    sample size: if < 20 per group, use within subjects
    expected effect size: if small, use within subjects
    practical issues: depending on the issues, could go either way, and a within-subjects design may not even be possible
  Practical issues often relate to the number of levels in your design and to the participants.
  If you have many levels and you’re considering a within-subjects design: it is more difficult to find matched sets and natural sets, and if you are using the same people in each group, it is less likely that people will want to serve in all conditions. (These are reasons not to do a within-subjects design.)
  If you have many levels and you’re considering a between-subjects design: because you have different people in each group, and because you should have at least 20 per group for RA to work, you will need lots of people. (These are reasons not to do a between-subjects design.)

Variations on the multiple-group design
  The IV may be a treatment (true IV) or a classification/subject variable.
  If the latter, the design is called a quasi-experiment or ex post facto design.
  Regardless, a single-factor multiple-group design can have any number of control groups (including 0) and experimental groups.

ANALYZING MULTIPLE GROUP DESIGNS

Calculating your statistics
  Between subjects = one-way independent ANOVA
    Check for homogeneity of variance using Levene’s F.
    If Levene’s test is significant, this is bad: adjust for the elevated risk of a Type I error by declaring “significance” only if the sig value is less than .01.
  Within subjects = one-way repeated ANOVA
    Check for sphericity using Mauchly’s test.
    If Mauchly’s test is significant, this is bad: use the Greenhouse-Geisser corrected values on the SPSS output sheet. This will control for the inflated risk of a Type I error.

Rationale of ANOVA
  ANOVA compares between-group and within-group variability:
    F = variance between / variance within
      = (treatment effect + individual differences + error) / (individual differences + error)
  Note: “variance within” is also called error variance.
  If a treatment effect is present, F will be large.
  If no treatment effect is present, F will be close to 1.
  The bigger the F, the more likely it is to be significant.

Interpreting your statistics
  Look at descriptive statistics (mean and SD) as well as inferential statistics (F values, df, p or sig values, eta squared).
  Eta squared is an estimate of effect size. If you have just two groups, effect size is measured by Cohen’s d.
  SPSS will give you df for between and within groups:
    df between is for the treatment effect
    df within is for the error term
  Report F as: F(df for the treatment effect, df for the error term) = actual value of F, p = actual probability of F. If the result is not significant, you can say p > .05.
  If F is significant AND more than two means are being compared, you must do a post hoc test to see which means are different from which.
  Post hoc tests include Scheffe, Tukey, NK (Newman-Keuls), and LSD t. These are listed from least to most powerful. I usually run all except NK, and report the one that gives me the results I like the best.

  Important notes about post hoc tests:
  In a one-way independent ANOVA, SPSS calls the post hoc tests “Post hoc tests”. If two or more means are significantly different from each other, simply tell people where the differences were.
  For example: “A (name of the post hoc test used) showed that group X was significantly greater than group Y (p < .05), but that X and Y were not significantly different from Z.”
  In a one-way repeated ANOVA, SPSS calls the post hoc tests “compare main effects” in the ANALYSIS menu and “pairwise comparisons” on the output page. These comparisons are akin to the LSD t-test, and they still compare the means two at a time to see which are different from which. If any are significant, report this the same way as in the example above, except that where it says “name of the post hoc test used”, just say “multiple pairwise comparisons”.

Translating statistics into words
  In the results section of an APA report, you report both descriptive and inferential stats.
  If the ANOVA is not significant, it will read something like:
    “To test the hypothesis that ____, __name of the DV___ was/were analyzed using a one-way independent (dependent) ANOVA. These results are summarized in Figure 1. The ANOVA was not significant, F(2, 24) = 3.22, p > .05. Sales clerks’ latency to help did not differ significantly whether the customer was dressed sloppily, casually, or richly.”
  If the ANOVA is significant, it will read something like:
    “To test the hypothesis that ____, __name of the DV___ was analyzed using a one-way independent (dependent) ANOVA. These results are summarized in Figure 1. The ANOVA was significant, F(2, 24) = 13.22, p = .002, with an estimated effect size eta squared = .83. Post hoc Tukey tests indicated (p < .05) that sales clerks took longer to respond to sloppily dressed customers than to casually or richly dressed customers, who did not differ significantly from each other.”

  Eta squared is a measure of effect size, similar to r squared: it tells you the proportion of variance in the DV accounted for by the IV. The bigger your eta squared, the larger your effect size. Note that it can take on any value between 0 and +1.
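
To make the F ratio and eta squared concrete, here is a minimal sketch that computes them by hand for a one-way independent ANOVA. The scores are made up for illustration (they are not results from any study), and the names one_way_anova and latency are hypothetical. SPSS does this arithmetic for you; the sketch only shows where the numbers come from.

```python
# Minimal sketch of a one-way independent ANOVA computed by hand.
# The data and the names (one_way_anova, latency) are hypothetical.

def one_way_anova(groups):
    """Return df, F, and eta squared for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups variability: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups (error) variability: how far each score is from its group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1               # df for the treatment effect
    df_within = len(all_scores) - len(groups)  # df for the error term

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "F": ms_between / ms_within,
        # Proportion of total variance in the DV accounted for by the IV.
        "eta_squared": ss_between / (ss_between + ss_within),
    }

# Made-up latency-to-help scores for three levels of customer dress.
latency = {
    "sloppy": [8, 9, 10, 11, 12],
    "casual": [4, 5, 6, 7, 8],
    "rich": [3, 4, 5, 6, 7],
}
result = one_way_anova(list(latency.values()))
print(f"F({result['df_between']}, {result['df_within']}) = {result['F']:.2f}, "
      f"eta squared = {result['eta_squared']:.2f}")
# prints: F(2, 12) = 14.00, eta squared = 0.70
```

The sketch stops at F and eta squared. The p value comes from the F distribution with these two df values, which is what SPSS reports in the sig column, and a significant F with three means would still need a post hoc test.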