Two-way Repeated Measures ANOVA. Page 1 Research Methods II: Spring Term 2002 Using SPSS: Two-way Repeated-Measures ANOVA: Suppose we have an experiment in which there are two independent variables: time of day at which subjects are tested (with two levels: morning and afternoon) and amount of caffeine consumption (with three levels: low, medium and high). Subjects are given a memory test under all permutations of these two variables. In other words, each subject's performance is tested six times: after low, medium and high doses of caffeine in the morning, and after low, medium and high doses of caffeine in the afternoon. (Each subject would receive these six conditions in a different random order, to avoid systematic effects of practice, etc.) A two-way repeated-measures ANOVA is the appropriate test in these circumstances. 1. Entering the Data: Entering the data is a little more complicated than with previous ANOVA's. (Or rather, it's a bit more complicated to explain: once you get the idea of what's required, it's not too difficult to do). Basically, we have to use a separate column for the data that come from each permutation of our two variables. Assigning codes to the various conditions: In this example, we have two IV's: time of day and caffeine consumption. Time of day has two "levels": morning, and afternoon. Caffeine consumption has three levels: low, medium and high. We need to give code-numbers to the IV's, and to the levels of each IV, to help SPSS identify them correctly. (a) Let's call time of day variable 1, and caffeine consumption variable 2. (b) For the levels of time of day, let's use a 1 to identify "morning", and a 2 to identify "afternoon". (c) For the levels of caffeine consumption, let's use a 1 to identify "low". a 2 to identify "medium" and a 3 to identify "high". Now each combination of code-numbers identifies a specific level of one or other of our variables, as follows: 1,1 means "time of day: morning; caffeine consumption: low". 1,2 means "time of day: morning"; caffeine consumption: medium". 1,3 means "time of day: morning; caffeine consumption: high". 2,1 means "time of day:afternoon; caffeine consumption: low". 2,2 means "time of day: afternoon; caffeine consumption, medium". 2,3 means "time of day: afternoon; caffeine consumption: high". Entering the data into columns in SPSS: With a one-way repeated-measures ANOVA, we entered the data for each condition in a separate column (see Using SPSS handout 12). So, in this instance, if we were interested only in the effects of caffeine (and had not considered time of day), we would have had only three columns, for "low", "medium" and "high" levels of caffeine. Now we have the additional variable of time of day and we need to include the columns for these data somehow. In this example, the data would be entered in six columns, one for each permutation of caffeine and time of day. We need a separate column for each of the following: "morning, low caffeine" data (1,1 in our codes); "morning, medium caffeine" data (1,2), "morning, high caffeine" (1,3); "afternoon, low caffeine" (2,1); "afternoon, medium caffeine (2,2); and finally, "afternoon, low caffeine" (2,3). Our SPSS data-screen might look like this (as usual, I've included a column labelled "subject", to show whose data is whose, but it's not required for the analysis). Two-way Repeated Measures ANOVA. Page 2 Notice how I have arranged the columns. First, I have all the columns that relate to the "morning" level of my time of day IV. So, I begin with the columns which correspond to the various levels of caffeine consumption (low, medium and high) for the morning testing. Then , I have all the columns which relate to the "afternoon" level of the time of day variable. (It's not strictly necessary to arrange the data like this, but it makes it easier for you to keep track of what you are doing). I would strongly advise you to label the columns with as meaningful titles as you can manage. Here, for example, it's pretty obvious that "amlow" contains the data for the "morning testing /low caffeine dose" data, "pmmedium" contains the data for the "afternoon testing/medium caffeine dose" data, and so on. Running the ANOVA: Having entered your data, do the following. (a) Click on "Analyze"; then click on "General Linear Model"; then click on "Repeated Measures". The "Repeated Measures Define Factor(s) dialog box: Two-way Repeated Measures ANOVA. Page 3 (b) For each of your IV's, you have to make entries in this box. You have to tell SPSS the name of each IV, and how many levels it has. Start with the time of day variable. Replace the words "factor 1" with a more meaningful name that describes this variable - for example, "testtime". Then enter the number of levels in the next box down. We have two levels of time of day, so we enter a "2" in the box. Now click on the button labelled "Add", and SPSS will put a brief summary of this IV into the box beside the button. In this case, SPSS will put "testtime(2)" into the box, as shown below: Two-way Repeated Measures ANOVA. Page 4 (c) Repeat step (b) for the caffeine consumption IV. So, next to "within-subject factor name", enter a label for this variable. I've used "caffeine". For this variable, there are three levels, so I have entered "3" in the next box down. Finally, click on "Add" to enter the details. Your dialog box will now look like this: (c) Now we have to tell SPSS which columns contain the data needed for the ANOVA. Click on the button labelled "Define". The "Repeated Measures Define Factor(s)" dialog box disappears, and is replaced with a new, fearsome-looking one, entitled "Repeated Measures ". Two-way Repeated Measures ANOVA. Page 5 This looks horrendous, but stay calm. Let's take it bit by bit. On the left-hand side of the dialog box is a box containing the names of the columns in your SPSS data-window. On the right-hand side, is a box which contains empty slots (shown as _?_[1,1], for example).. Your mission, should you choose to accept it, is to move each column name on the left, into its correct slot on the right. This is where all that fuss about column labelling and coding pays off. Take the top slot in the right-hand box: it's got (1,1) next to it. This means that this is the slot for the name of the column that represents the permutation of the first level of IV1 and the first level of IV2. In our example, this means the column containing the data for "morning/low caffeine consumption" (coded 1,1 earlier on). The next slot is for "morning/medium caffeine consumption" (which we coded as 1,2), and so on. Click on the slot first; then click on the appropriate column name; then click on the arrow-button between the boxes, to enter the column name into the slot. Do this for each slot in turn. Your dialog box should end up looking like this: (d) Click on "Options…", and then "Descriptive statistics" in the dialogue box, and then "Continue" to return to the previous box. (e) Click on "Contrasts…". Click on Caffeine to highlight it. In the "Change Contrasts" box click on the arrow find the "Repeated" option; click on this, and then click on "Change". Finally, click on "Continue". This step is to prduce post hoc tests for Caffeine, which has three levels. No post hoc tests are needed for Testtime, which only has two levels. (f) Click "OK". The ANOVA Output: This is the output that you would get from our example. My explanations are the bold-type bracketed bits: General Linear Model Two-way Repeated Measures ANOVA. Page 6 Within-Subjects Factors Measure: MEASURE_1 TESTTIME 1 2 CAFFEINE 1 2 3 1 2 3 Dependent Variable AMLOW AMMEDIUM AMHIGH PMLOW PMMEDIUM PMHIGH De scri ptive Statistics AMLOW AMME DIUM AMHIGH PMLOW PMME DIUM PMHIGH Mean 10.8750 11.8750 13.5000 14.7500 19.6250 22.0000 St d. Deviat ion 1.2464 2.9001 2.3299 3.9911 2.7742 2.1381 N 8 8 8 8 8 8 [Just ignore this next table of" Multivariate Tests".] Multivariate Testsb Effect TESTTIME CAFFEINE TESTTIME * CAFFEINE Pillai's Trace Wilks' Lambda Hotelling's Trace Roy's Largest Root Pillai's Trace Wilks' Lambda Hotelling's Trace Roy's Largest Root Pillai's Trace Wilks' Lambda Hotelling's Trace Roy's Largest Root Value .938 .062 15.221 15.221 .695 .305 2.276 2.276 .735 .265 2.771 2.771 F Hypothesis df 106.546a 1.000 106.546a 1.000 a 106.546 1.000 106.546a 1.000 6.829a 2.000 a 6.829 2.000 6.829a 2.000 a 6.829 2.000 8.314a 2.000 a 8.314 2.000 8.314a 2.000 8.314a 2.000 Error df 7.000 7.000 7.000 7.000 6.000 6.000 6.000 6.000 6.000 6.000 6.000 6.000 Sig. .000 .000 .000 .000 .028 .028 .028 .028 .019 .019 .019 .019 a. Exact s tatis tic b. Design: Intercept Within Subjects Des ign: TESTTIME+CAFFEINE+TESTTIME*CAFFEINE [The next table tests the assumption of sphericity (see last handout for definition). Sphericity is tested separately for each effect; you should look at the associated p values ("sig.") - if any is below .05, the assumption has been violated. In each case, there is no violation of the sphericity assumption, so we do not need to consider any corrections to the F tests for each effect.] Two-way Repeated Measures ANOVA. Page 7 Mauchly's Test of Sphericityb Measure: MEASURE_1 Epsilon Within Subjects Effect TESTTIME CAFFEINE TESTTIME * CAFFEINE Mauchly's W 1.000 .932 .737 Approx. Chi-Square .000 .424 1.832 df Sig. 0 2 2 . .810 .397 Greenhous e-Geis ser 1.000 .936 .792 a Huynh-Feldt 1.000 1.000 .985 Lower-bound 1.000 .500 .500 Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix. a. May be used to adjust the degrees of freedom for the averaged tests of s ignificance. Corrected tests are displayed in the Tests of Within-Subjects Effects table. b. Design: Intercept Within Subjects Des ign: TESTTIME+CAFFEINE+TESTTIME*CAFFEINE [Here is the Analysis of Variance summary table. For each effect you can look at just the row labelled "Sphericity Assumed". Remember that if the sphericity assumption had been violated for any effect, you would have looked at the row labelled "Huyn-Feldt" for that effect. Note also that each effect has its own associated error term. The error for each effect is its interaction with subjects; e.g. the error for testtime is the interaction of testtime with subjects, as explained in the lecture. First in the table, there is the main effect of "testtime": was test performance significantly affected by whether subjects were tested in the morning or the afternoon (averaging over caffeine dosage)? In this example, there is a highly significant F-ratio - time of testing had a significant effect on performance. Next there is the main effect for "caffeine": was test performance significantly affected by the amount of caffeine subjects received (averaging over time of testing)? In fact, we have a highly significant effect of caffeine consumption on memory performance. Finally, the interaction between your IV's: do the effects of one IV depend on the level of the other IV? In this example, there is a significant interaction between time of testing and caffeine consumption: the effects of caffeine depend on what time of day people were tested.] Two-way Repeated Measures ANOVA. Page 8 Te sts of W ithi n-Subje cts Effects Measure: MEASURE_1 Source TESTTIME Sphericity Ass umed Greenhous e-Geiss er Huynh-Feldt Lower-bound Error(TESTTIME) Sphericity Ass umed Greenhous e-Geiss er Huynh-Feldt Lower-bound CAFFEINE Sphericity Ass umed Greenhous e-Geiss er Huynh-Feldt Lower-bound Error(CAFFEINE) Sphericity Ass umed Greenhous e-Geiss er Huynh-Feldt Lower-bound TESTTIME * CAFFEINE Sphericity Ass umed Greenhous e-Geiss er Huynh-Feldt Lower-bound Error(TESTTIME*CAFF Sphericity Ass umed EINE) Greenhous e-Geiss er Huynh-Feldt Lower-bound Ty pe III Sum of Squares 540.021 540.021 540.021 540.021 35.479 35.479 35.479 35.479 197.375 197.375 197.375 197.375 148.292 148.292 148.292 148.292 49.292 49.292 49.292 49.292 55.708 55.708 55.708 55.708 df 1 1.000 1.000 1.000 7 7.000 7.000 7.000 2 1.872 2.000 1.000 14 13.106 14.000 7.000 2 1.583 1.969 1.000 14 11.083 13.784 7.000 Mean Square 540.021 540.021 540.021 540.021 5.068 5.068 5.068 5.068 98.688 105.417 98.688 197.375 10.592 11.315 10.592 21.185 24.646 31.132 25.032 49.292 3.979 5.026 4.041 7.958 F 106.546 106.546 106.546 106.546 Sig. .000 .000 .000 .000 9.317 9.317 9.317 9.317 .003 .003 .003 .019 6.194 6.194 6.194 6.194 .012 .020 .012 .042 [The following table produces various post hoc tests. No post hoc tested is needed for testtime; in fact, notice that the test it performs on testtime produces exactly the same F value and p value as in the ANOVA above. So reporting it again here is just a way SPSS has of wasting space. In fact, ignore any of the rows which mention "testtime". You may be interested in the post hoc tests it performs for caffeine (remember: you would only look at these if the main effect of caffeine was significant); SPSS presents you with tests of successive levels, i.e. low with medium and medium with high. This may be sufficient for your purposes, or you may want all possible pair-wise comparisons - see below for how to compute these.] Tests of Within-Subjects Contrasts Measure: MEASURE_1 Source TESTTIME Error(TESTTIME) CAFFEINE TESTTIME Linear Linear Error(CAFFEINE) TESTTIME * CAFFEINE Linear Error(TESTTIME*CAFF EINE) Linear CAFFEINE Level 1 vs. Level 2 Level 2 vs. Level 3 Level 1 vs. Level 2 Level 2 vs. Level 3 Level 1 vs. Level 2 Level 2 vs. Level 3 Level 1 vs. Level 2 Level 2 vs. Level 3 Type III Sum of Squares 180.007 11.826 138.063 64.000 162.437 110.000 60.062 2.250 80.437 55.750 df 1 7 1 1 7 7 1 1 7 7 Mean Square 180.007 1.689 138.063 64.000 23.205 15.714 60.062 2.250 11.491 7.964 F 106.546 Sig. .000 5.950 4.073 .045 .083 5.227 .283 .056 .612 Two-way Repeated Measures ANOVA. Page 9 [The following part of the output tests to see if the overall mean (of all your data) is significantly different from zero; this is not very interesting in this case.] Te sts of Betw een-Subjects Effects Measure: MEASURE_1 Transformed Variable: Average Source Int ercept Error Ty pe III Sum of Squares 11439. 188 65.646 df 1 7 Mean Square 11439. 188 9.378 F 1219.793 Sig. .000 Post hoc tests You may wish to analyze two effects further: the main effect of caffeine and the interaction. Taking these cases in turn: Interpreting a main effect For the main effect of testtime there is no further inferential test that needs to be done: Subjects overall have different levels of performance in the morning rather than the afternoon and that’s that. There are only two levels and we know there is a difference so there is nothing more to be tested. However, the main effect of caffeine indicates that at least one level is different from at least one other, but we don’t know what the pattern is. So we may like to perform post hoc tests to determine the pattern. Remember you would go on to perform these post hoc tests ONLY IF the main effect was significant. But just because the main effect is significant, it does not mean you have to perform the post hoc tests – just do so if you are interested in the pattern. You might argue that since the main effect is qualified by an interaction, that means the pattern is an average of two possibly quite different patterns (one in the morning and one in the afternoon) and you are not interested in the average of two quite different things. So then you would just go on to interpret the interaction. Assuming you did want to analyze the main effect further, how would you do it? There are a number of techniques you could do, and this is something you will want to check with your supervisor if it comes up in your project. You could use the same procedure as we used for the one-way repeated measures case. That is, you perform comparisons between each possible pair of levels: low with medium, low with high, and medium with high. The complication in this case compared with the one-way case is that we want to perform these comparisons averaging over time of day. Here’s the quickest way of getting SPSS to do this. Click once more on: Analyze> General Linear Model > Repeated Measures tell SPSS that you have two factors, but this time say that they only have two levels each. For caffeine enter the two levels as low and medium. Just enter four columns when it asks you to match columns to combinations of IVs: morning low, morning medium, afternoon low, afternoon medium. In the results, the ONLY effect you are interested in is the main effect of caffeine. This is a test of whether low is different from medium, averaging over time of day. Ignore the results for all other effects in this output. Repeat the procedure for the other two pair-wise comparisons. Interpreting an interaction Just as for the second module, if you have a significant interaction, you may be interested in gaining further information on the pattern of the interaction. Just as before, you could break down the interaction in two ways: 1) the effect of caffeine in the morning; and the effect of caffeine in the afternoon; OR 2) the effect of time of day at low doses of caffeine; the effect of time of day a medium doses; and the effect of time of day at high doses. Choose ONE of these ways of breaking it down, unless you have a theory that makes important predictions according to both ways. For (1) you would determine the simple effect of caffeine for each time of day separately. Click once more on Analyze > General Linear Model > Repeated Measures tell SPSS that you have one factor, caffeine, with three levels. Just select the three columns for morning and run the one-way ANOVA. This is the effect of caffeine in the morning. If significant (which it is), you could perform further post hoc tests to determine the pattern of mean differences; e.g. all possible pairwise comparisons. That is, click once more on Analyze > General Linear Model > Repeated Measures tell SPSS you have one factor, caffeine, with TWO levels. Select two of the levels and run the one-way ANOVA. (Note: You could have run a t-test using the related t-test command and you would get exactly the same p value out. In fact, if you squared the t-value you would get the F value. The two procedures are equivalent when there are only two levels of a single IV.) Having conducted all possible pair-wise comparisons, determine the simple effect of caffeine for the afternoon, followed up by post hoc tests. For (2), you would determine the simple effect of time of day for each level of caffeine. That is, you would compare morning low with afternoon low, using either related t-tests or the repeated measures one-way ANOVA; then compare morning medium with afternoon medium in the same way; and finally morning high with afternoon high. Two-way Repeated Measures ANOVA. Page 10 Interpreting the Results: The means for the "morning" sessions are lower than those for the "afternoon" sessions (as confirmed by the significant effect of time of testing in the ANOVA output). Also, the more caffeine, the better the memory performance (as confirmed by the significant effect of caffeine consumption in the ANOVA output). However, the effect of caffeine is more pronounced when testing was in the morning than when it took place in the afternoon (which is why there is a significant interaction between time of testing and caffeine consumption in the output). The graph shows this more clearly (remember to add standard errors to your own graphs!). Effects of caffeine and time of testing on memory-test score 25 Mean memory score 20 15 Morning Afternoon 10 5 0 Low1 2 Medium 3 High Level of caffeine (mg)