A SIMPLE FACTORIAL, 2X2, ANALYSIS OF VARIANCE

D:\98947318.doc
Page 1 of 10
TREATMENT BY LENGTH (# CIGARETTES)
Suppose a researcher is working with smokers to help them cut down on their consumption of cigarettes. From experience, he
believes that those who have been smoking for 20 years are not as successful as those who have been smoking for 10 years.
Additionally, he believes that counseling with a money incentive is superior to counseling alone. He designs an experiment to test
these hypotheses.
He has 6 clients who have been smoking 20 years and 6 who have been smoking 10 years. All are smoking the same number of
cigarettes before any treatment. He randomly assigns half of each group to either (a) counseling, or (b) counseling plus money.
After treatment he counts the number of cigarettes consumed daily by each client.
Here is the data:
                            TREATMENT
LENGTH      Counseling          Counseling Plus $
20 years    30, 35, 40          28, 30, 32
10 years    28, 25, 22          16, 20, 24
He decides to do a factorial, 2X2 ANOVA, Treatment by Length.
The first task is to decide how to enter the data into SPSS for Windows. Since a 2X2 Factorial ANOVA always has two
independent variables and one dependent variable, you know you must define at least three variables, not counting ID numbers.
For this example, we will not use ID numbers.
Here are the three variables we must define:
1. numcigs - (dependent variable)
2. treatmen - (independent variable) [1 = counseling; 2 = money]
3. length - (independent variable) [1 = 10 years; 2 = 20 years]
How do we enter them?
_____ 1. Well, first you should click the Variable View tab at the bottom left of the SPSS screen. There, you can define the variables
and the VALUE LABELS:
_____ 2. To designate the Value Labels, click in the Values column of the treatmen row until you see a small gray box with three
dots.
_____ 3. Click on the gray box and define the value labels by typing a 1 in the Value field, and counseling in the Value Label field,
then click Add. Now, type a 2 in the Value field and money in the Value Label field and click Add. Then, click OK:
_____ 4. Now do the same thing to define Value Labels for the length variable:
_____ 5. Now, return to the data screen by clicking the Data View tab in the lower left corner of the SPSS screen:
Here is what the first few rows of your data screen will look like:
Now it is time to enter the data. Here is the raw data table, again:
Look at the first case above, the one who smoked 30 cigarettes. This person received counseling only (1) and has been smoking 20
years (2). So, we will enter a 30 under numcigs, a 1 under treatmen, and a 2 under length.
Now, with a pencil, fill out the following SPSS data screen. In the table of scores above, work upper left quadrant to upper right
quadrant, to lower left quadrant, to lower right quadrant.
Now, turn to the next page to see what the data should look like:
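If you would like to check your pencil entries another way, the same twelve rows can be produced with a short Python sketch. Python is not part of the SPSS procedure in this handout; it is assumed here only as a cross-check, and the names cells and rows are illustrative. The codes follow the variable definitions given earlier.

```python
# Long-format layout: one row per client, one column per SPSS variable.
# Codes follow the handout: treatmen 1 = counseling, 2 = money;
# length 1 = 10 years, 2 = 20 years.
cells = [
    # (treatmen, length, scores), in the quadrant order described above
    (1, 2, [30, 35, 40]),  # upper left:  counseling, 20 years
    (2, 2, [28, 30, 32]),  # upper right: counseling plus $, 20 years
    (1, 1, [28, 25, 22]),  # lower left:  counseling, 10 years
    (2, 1, [16, 20, 24]),  # lower right: counseling plus $, 10 years
]

# Expand each cell into one (numcigs, treatmen, length) row per client.
rows = [
    (numcigs, treatmen, length)
    for treatmen, length, scores in cells
    for numcigs in scores
]

print("numcigs  treatmen  length")
for numcigs, treatmen, length in rows:
    print(f"{numcigs:7d}  {treatmen:8d}  {length:6d}")
```

The first printed row should match the first case described above: 30 under numcigs, 1 under treatmen, and 2 under length.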
You can make the job of checking these scores after you enter them a little less error prone by making SPSS display the VALUE
LABELS instead of the numbers for the treatmen and length variables. To do that:
_____ 1. In the menu bar at the top of the screen click View, then put a check mark next to Value Labels. Now your data should look
like this, and it will be easy to make sure you made no mistakes:
Running the Two-Way ANOVA
_____ 1. Click Analyze, General Linear Model, and Univariate. The following box will open:
_____ 2. numcigs is the dependent variable, and treatmen and length are both independent variables (FACTORS). Since they do
not represent repeated measures (such as pretest/posttest), they are both Fixed Factors. Move the variable names into the appropriate
boxes as follows:
There are six buttons along the right side of the above box. You will need to click on each and enter the information needed:
_____ 1. Click on Model and make sure the choice is for Full factorial and click CONTINUE.
_____ 2. Click on Contrasts and make sure that both treatmen and length are in the Factors: field. If both are not there, you made an
error on one of the previous steps.
_____ 3. Now click the Plots button. You can order graphs of any interaction found. To do so, highlight treatmen and click the right
arrow button to move it into the Horizontal Axis: field. Then highlight length and click the right arrow button to move it into the
Separate Lines: field. Then click the ADD button next to PLOTS:
Now, repeat this exactly, except this time, place length in the Horizontal Axis: field and treatmen in the
Separate Lines: field.
This will order two different graphs, reversing which of the two independent variables will appear on the X axis, and which will be
defined by separate lines in the body of the graph. When you are finished, the box should look like this (notice the large field at the
bottom of the box):
_____ 4. Now, click the CONTINUE button. Now, click the Post Hoc button. This will allow you to order post hoc analyses in case
of significant findings. This is necessary whenever there are more than two means in a main effect. In this analysis, there are only
two means in each of the independent variables. THEREFORE, NO POST HOC TESTS WILL BE NECESSARY. If the main
effect for treatmen is significant, there will be only two means. Therefore, we know that is the pair that differs (since it is the ONLY
pair) and we can simply look at those means and see which is higher. The same is true for length, if it proves to be significant.
_____ 5. Click the Continue button.
_____ 6. Click the Options button. Move all the variables listed on the left into the field marked Display Means For: At the bottom
of the box, put check marks next to Descriptive statistics, Estimates of effect size, Observed power and Homogeneity tests. The box
should look like this:
_____ 7. Click the Continue button.
_____ 8. Click the OK button to run the analysis. PRINT OUT THE OUTPUT TO STUDY. It should be just like the handout I
gave you.
INTERPRETING THE RESULT OF THE FACTORIAL, 2X2, ANOVA
One of the first things you will need to help you understand the results of your analysis is a table of means with
cell means, row means, and column means filled in. Use the printout you just ran or the one I gave you to
locate these means and fill in the following table. Then, read on to see where you find these means on the
printout, and to see how this table should be filled out:
                            TREATMENT
LENGTH      Counseling          Counseling Plus $
20 years    ____                ____                Row Mean = ____
10 years    ____                ____                Row Mean = ____
            Column Mean = ____  Column Mean = ____  GRAND MEAN = ____
The means you need for the table can be found on page 1 of the output. They are in the second box on the page, labeled Descriptive
Statistics. Check your table to be sure it looks like the following:
                            TREATMENT
LENGTH      Counseling          Counseling Plus $
20 years    35                  30                  Row Mean = 32.5
10 years    25                  20                  Row Mean = 22.5
            Column Mean = 30    Column Mean = 25    GRAND MEAN = 27.5
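The cell, row, column, and grand means can also be checked by hand from the raw scores. Here is a minimal Python sketch of that arithmetic, using only the standard library. It is not SPSS output, and the names scores and marginal_mean are illustrative.

```python
from statistics import mean

# Raw scores keyed by (length, treatment), copied from the data table.
scores = {
    ("20 years", "counseling"): [30, 35, 40],
    ("20 years", "money"): [28, 30, 32],
    ("10 years", "counseling"): [28, 25, 22],
    ("10 years", "money"): [16, 20, 24],
}

# Cell means: one mean per combination of length and treatment.
cell_means = {cell: mean(vals) for cell, vals in scores.items()}


def marginal_mean(position, level):
    """Mean of every score whose cell key has `level` at `position`."""
    pooled = [
        x
        for cell, vals in scores.items()
        if cell[position] == level
        for x in vals
    ]
    return mean(pooled)


row_means = {lvl: marginal_mean(0, lvl) for lvl in ("20 years", "10 years")}
col_means = {lvl: marginal_mean(1, lvl) for lvl in ("counseling", "money")}
grand_mean = mean(x for vals in scores.values() for x in vals)

print(cell_means)
print(row_means, col_means, grand_mean)
```

Because every cell has the same number of clients (3), each row or column mean is also the simple average of the two cell means it spans.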
Just looking at these means, do you predict that there is a significant main effect for:
A. TREATMENT?
B. LENGTH?
Do you predict a significant interaction effect for
C. TREATMENT BY LENGTH?
The next thing we need is a traditional two-way ANOVA source table. See if you can use the printout to fill in the empty spaces in the
following table. Then, go on to the next page where you will find a completed table and more information.
Here is the source table from the factorial ANOVA:
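SOURCE                SS        df      MS        F        p
Rows (length)         ____      ____    ____      ____     ____
Columns (treatmen)    ____      ____    ____      ____     ____
Interaction           ____      ____    ____      ____     ____
Within (Error)        ____      ____    ____
Total                 ____      ____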
Here is how the source table from the factorial 2X2 ANOVA should look:
SOURCE                SS        df      MS        F        p
Rows (length)         300.00     1      300.00    22.22    .002
Columns (treatmen)     75.00     1       75.00     5.56    .046
Interaction              .00     1         .00      .00    -
Within (Error)        108.00     8       13.50
Total                 483.00    11
Where is this data found? It is actually found in several places on the printout.
The most convenient place to find it is in the FIRST BOX on the SECOND PAGE of the printout, labeled Tests of Between-Subjects
Effects. Look there now.
You will notice that there is no mention of columns or rows. That is because your choice of either of these independent variables for
columns or for rows is entirely arbitrary. You could just as easily and correctly have drawn your table of means the opposite way, with
length in the columns and treatmen in the rows.
So, you will need to select carefully from page 2 of the SPSS printout to make sure you get the right numbers.
Look back at the printout. Find the LENGTH row in the Tests of Between-Subjects Effects box.
This is the information that goes in our source table in the first line headed rows (length).
On the printout, the line labeled TREATMEN goes in the second line of our source table, and the information in the line labeled
TREATMEN * LENGTH goes in the source table in the line labeled Interaction.
On the SPSS printout, the line labeled Error has the data that go in the line of our source table labeled Within (Error).
And, finally, on the SPSS printout, the line labeled Corrected Total goes in our source table line labeled TOTAL.
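The SS, df, MS, and F columns of the source table can be reproduced from the raw scores. The following is a minimal Python sketch of the usual hand calculation for a balanced design (equal numbers in every cell). It is not what SPSS runs internally, the variable names are illustrative, and the p values are left out because the standard library has no F distribution.

```python
from statistics import mean

# Raw scores keyed by (length, treatmen), copied from the data table.
scores = {
    ("20 years", "counseling"): [30, 35, 40],
    ("20 years", "money"): [28, 30, 32],
    ("10 years", "counseling"): [28, 25, 22],
    ("10 years", "money"): [16, 20, 24],
}
all_scores = [x for vals in scores.values() for x in vals]
grand = mean(all_scores)


def level_scores(position, level):
    """All scores whose cell key has `level` at `position`."""
    return [
        x
        for cell, vals in scores.items()
        if cell[position] == level
        for x in vals
    ]


def main_effect_ss(position):
    """Sum over levels of n * (level mean - grand mean) squared."""
    levels = {cell[position] for cell in scores}
    total = 0.0
    for level in levels:
        vals = level_scores(position, level)
        total += len(vals) * (mean(vals) - grand) ** 2
    return total


ss_length = main_effect_ss(0)
ss_treatmen = main_effect_ss(1)
# Within (Error): squared deviations of each score from its own cell mean.
ss_within = sum((x - mean(vals)) ** 2 for vals in scores.values() for x in vals)
ss_total = sum((x - grand) ** 2 for x in all_scores)
# In a balanced design the interaction SS is what remains.
ss_interaction = ss_total - ss_length - ss_treatmen - ss_within

df_within = len(all_scores) - len(scores)  # 12 clients - 4 cells = 8
ms_within = ss_within / df_within

# Each effect has 1 df in a 2X2 design, so MS equals SS for the effects.
f_length = ss_length / ms_within
f_treatmen = ss_treatmen / ms_within
f_interaction = ss_interaction / ms_within

print(ss_length, ss_treatmen, ss_interaction, ss_within, ss_total)
print(round(f_length, 2), round(f_treatmen, 2), round(f_interaction, 2))
```

Computing the interaction by subtraction works only because the design is balanced. With unequal cell sizes the sums of squares no longer add up this simply.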
Look at the top of page 2 of the SPSS printout. Find the lines of text just under the first box. They give the Adjusted R Squared as .693.
What does this mean? An R Squared gives the proportion of the variance of the numcigs scores that can be explained by type of
treatment and smoking length. The R Squared is obtained by adding together the SS for length, treatmen, and Interaction and
dividing by the Corrected Total SS [(300 + 75 + 0)/483 = 375/483 = .776], so about 78% of the variance in this sample is explained.
The Adjusted R Squared corrects for sample size and the number of effects in the model and is .693. So, as a more conservative
estimate, about 69% of the variance in number of cigarettes smoked can be explained by type of treatment and length of smoking time.
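Both figures can be checked with a few lines of Python. This sketch assumes the standard adjusted R squared formula, 1 - (1 - R squared)(n - 1)/(n - k - 1), where n is the number of cases and k is the number of model degrees of freedom (1 each for length, treatmen, and the interaction). The formula is not stated in this handout, but it does reproduce the .693 quoted above.

```python
# Sums of squares copied from the source table.
ss_length = 300.0
ss_treatmen = 75.0
ss_interaction = 0.0
ss_corrected_total = 483.0

n_cases = 12   # clients in the study
model_df = 3   # length (1) + treatmen (1) + interaction (1)

# Proportion of total variation explained by the model.
r_squared = (ss_length + ss_treatmen + ss_interaction) / ss_corrected_total

# Standard adjustment for sample size and number of model terms (assumed).
adj_r_squared = 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - model_df - 1)

print(f"R Squared = {r_squared:.3f}")
print(f"Adjusted R Squared = {adj_r_squared:.3f}")
```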
************************************
Now, look at the source table at the top of page 2 of the SPSS printout.
Two of the three Fs are significant.
That means we reject those two null hypotheses. What were they?
No difference in mean no. of cigarettes smoked by 10 year vs. 20 year smokers (length variable)
No difference in mean no. of cigarettes smoked after counseling vs. counseling plus money (treatmen)
The F for interaction, however, was NOT significant.
Since there is NO INTERACTION EFFECT, we do not reject that null (Null: There will be no interaction between length and
treatmen), and we can very confidently now consider our significant main effect results. If there HAD been a significant
interaction, we would have to be very cautious about even looking at the main effects until we thoroughly examined and understood
our significant interaction effect.
In considering our significant main effect results for both length and treatmen, we need to have our table of means in front of us
again:
                            TREATMENT
LENGTH      Counseling          Counseling Plus $
20 years    35                  30                  Row Mean = 32.5
10 years    25                  20                  Row Mean = 22.5
            Column Mean = 30    Column Mean = 25    GRAND MEAN = 27.5
First, consider the significant main effect for LENGTH. What that part of the ANOVA did was compare the row mean of 32.5
cigarettes for 20-year smokers to the row mean of 22.5 cigarettes for the 10-year smokers. Since that F was significant, we know
these differ, and we conclude that regardless of treatment, 20-year smokers are not as successful as 10-year smokers in cutting
down on their habit.
Now, consider the significant main effect for TREATMEN. What that part of the ANOVA did was compare the column mean of 30
cigarettes for those who received counseling only to the column mean of 25 cigarettes for those who received counseling plus the
money incentive. Since that F was significant, we know these differ, and we conclude that regardless of length of time spent
smoking, counseling plus money results in fewer cigarettes smoked than does counseling alone.
Note that no post hoc, multiple comparison tests are necessary here, since there were only two means in each of the main effects.
Therefore, when we find that one of these is significant, we know they differ since there is only that one pair to compare. Multiple
comparison tests such as Tukey and Scheffe are necessary only when there are three or more means in a significant main effect.
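To see why two means need no post hoc test, count the pairs that could differ. With k means there are k(k - 1)/2 possible pairs. The short Python sketch below is only an illustration of that count, and the function name n_pairs is made up for this example.

```python
from math import comb


def n_pairs(k_means):
    """Number of distinct pairs of means that could be compared."""
    return comb(k_means, 2)


for k in (2, 3, 4):
    print(f"{k} means -> {n_pairs(k)} pairwise comparison(s)")
```

With two means there is exactly one pair, so a significant F already tells us that pair differs. With three or more means there are several pairs, and a significant F does not say which of them differ.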
The nonsignificant interaction effect could easily have been predicted by looking at the table of means above.
Look across each row. The difference between the two cells in the first row is +5, and the difference between the two cells in the
second row is also exactly +5. That means there is no interaction. Another way to say this is that the effect of treatment is the same
for both lengths. If you look at the table, you will see that counseling alone results in 5 more cigarettes on the average than does
counseling plus money, and that is true for 20-year smokers and for 10-year smokers.
Or, we could say the effect of length is the same for both treatments. Look at the table. 20-year smokers smoke 10 more cigarettes
on the average than 10-year smokers, and that is true whether they receive counseling or counseling plus money.
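The comparison just described is a "difference of differences," and it can be written out in a few lines of Python. This is a minimal sketch using the cell means from the table above. The variable names are illustrative, and a contrast of zero describes these sample means only (the F test is what judges significance).

```python
# Cell means from the table of means, keyed by (length, treatment).
cell_means = {
    ("20 years", "counseling"): 35,
    ("20 years", "money"): 30,
    ("10 years", "counseling"): 25,
    ("10 years", "money"): 20,
}

# Effect of treatment within each length (looking across each row).
treat_effect_20 = cell_means[("20 years", "counseling")] - cell_means[("20 years", "money")]
treat_effect_10 = cell_means[("10 years", "counseling")] - cell_means[("10 years", "money")]

# Effect of length within each treatment (looking down each column).
length_effect_counseling = cell_means[("20 years", "counseling")] - cell_means[("10 years", "counseling")]
length_effect_money = cell_means[("20 years", "money")] - cell_means[("10 years", "money")]

# If the treatment effect is the same at both lengths, this is zero.
interaction_contrast = treat_effect_20 - treat_effect_10

print(treat_effect_20, treat_effect_10)
print(length_effect_counseling, length_effect_money)
print(interaction_contrast)
```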
If we were to graph the interaction, the lines in the graph should be exactly parallel, indicating no interaction at all. The two graphs
are on pages 3 and 4 of the printout. These are line graphs. Until the last few years, these were the only kind of graph used to
depict interaction in ANOVA. Bar graphs are becoming more common today, and you will see both types in journals.
SPSS for Windows does line graphs as part of a factorial ANOVA.
The first graph on page 3 of the SPSS printout has the independent variable TREATMEN on the horizontal axis. There is one point
on that axis for counseling, and one for money.
The values representing means of the dependent variable NUMCIGS are on the vertical axis.
There is one line for 10-year smokers, and one for 20-year smokers. On the printout, the lines are in different colors, which you
cannot see on your copy. For your information, the upper line represents 20-year smokers, and the lower line represents 10-year
smokers.
Here is a copy of that graph:
The other graph on page 4 reverses the independent variables. In this one, the lower line represents those who received counseling
plus money, while the upper line represents those who received counseling alone. It makes very little difference which one we use. In
a factorial ANOVA, we generally choose the one that will result in the fewest separate lines. However, if there are the same number
of levels in each independent variable as there are here (because it is 2 X 2), that is not a consideration.
We would then write up our analysis, in which we would suggest that counseling plus money is the best alternative for both types of
smokers. Further, we would suggest that 10-year smokers achieve more success in cutting down on smoking than do 20-year
smokers.
END