Psychological Bulletin
1991, Vol. 110, No. 3, 571-573
Copyright 1991 by the American Psychological Association, Inc.
0033-2909/91/$3.00
Misinterpretation of Interaction Effects: A Reply to Rosnow and Rosenthal
Donald L. Meyer
University of Pittsburgh
The structural model of analysis of variance for multiway tables, with main effects and interaction
parameters, is overspecified. Only the set of estimable functions of the cell means is useful. The
individual estimates of the parameters are artificial as regards the underlying scientific process.
Rosnow and Rosenthal (1989a) stated that it is "absolutely necessary" (p. 146) that interactions
should be interpreted by examining the usual estimates of the interaction parameters. This is
incorrect.
I thank the reviewers for their many helpful comments.
Correspondence concerning this article should be addressed to Donald L. Meyer, Department of Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260.

Rosnow and Rosenthal (1989a) stated that it is an error to report cell means to interpret the pattern of an observed interaction. Because I suspect that several people will be concerned with this statement, especially when Rosnow and Rosenthal (1989b) stated that interaction effects are "the universally most misinterpreted empirical results in psychology" (p. 1282), I felt this reply was in order. As an aside, I find it fascinating that a four-page article was published explaining how to interpret four numbers, more than 50 years after R. A. Fisher (1935) explained the analysis of variance (ANOVA).
Reviewers of a previous version of this article called my attention to a number of articles also discussing interaction effects in the ANOVA: Marascuilo and Levin (1970, 1976), Levin and Marascuilo (1972, 1973), Games (1973), and Boik (1979). The same lack of sensitivity to the principal point of the present article is evident in some of these articles.
Rosnow and Rosenthal (1989a) present an "algorithm" for obtaining two-way interaction effects "corrected" for main effects. They are seemingly unaware that their algorithm is nothing more than the solution for the estimates of the interaction parameters in the usual linear model for a two-way table of means:

$$\mu(ij) = \mu + \alpha(i) + \beta(j) + \gamma(ij), \tag{1}$$

in which μ(ij) is the mean in row i and column j, μ is the grand mean, α(i) is the main effect for row i, β(j) is the main effect for column j, and γ(ij) is the interaction effect for the cell in row i and column j. The estimates of the γ(ij) are well known to be m(ij) - m(i) - m(j) + m, in which all the ms are observed means (cell, row, column, and grand mean, respectively).
The principal error of Rosnow and Rosenthal (1989a) is that they have confused the estimation of the statistical model with scientific reality. Stated another way, even if the cell means in a two-way table result from additive effects as in Model 1, the parameters of that model are unknowable. The model is overspecified, and the parameters are said to be nonestimable. Think of a 2 × 2 table: There is one grand mean, two row effects, two column effects, and four interaction effects. That gives a total of nine parameters (effects). It is at once obvious that four observed cell means cannot estimate nine effects!
How then is progress made? The answer is that certain functions of the parameters can be estimated. In particular, the difference of the differences of the cell means in row 1 and the means in row 2 estimates the corresponding difference of the differences of the γ(ij):

$$[m(1, 2) - m(1, 1)] - [m(2, 2) - m(2, 1)].$$

This is the familiar interaction contrast shown in statistics textbooks.
For concreteness, consider the example used by Rosnow and Rosenthal (1989a) of an investigator studying the despair of bereavement of family members when a child dies. Grief intensity is the dependent variable; health and sex of child are the two "factors."
At the top of Table 1 is a hypothetical set of true population means consistent with Rosnow and Rosenthal's (1989a) ranking of the cells: Healthy male > healthy female > unhealthy female > unhealthy male. Next are presented the "scientific" effects, or what Levin and Marascuilo (1973) refer to as the "latent structure of the variable" (p. 308). This structure specifies that the true means arise from combinations of additive row, column, and interaction effects, as in Model 1. These "true" effects show that being healthy adds 3 points to bereavement, and being unhealthy adds 10 points. Girls receive 5 points; boys receive 0. The numbers in the cells are the interaction effects when sex is mixed with health status. Healthy boys have a true mean of 17 because they are in a cell receiving 3 for healthy, 0 for boys, 12 for the interaction, and 2 for the grand mean. Similarly, the other three true means result as shown. These effects were simply made up and are just one of the infinite number of sets of possible effects consistent with the true means.
Samples from the four groups are taken, and suppose our observed cell means are the same as the true means: a successful research study. The bottom of Table 1 shows the usual estimates using Model 1. These are calculated using constraints on the estimates of the parameters. I emphasize this to show that the statistical analysis is arbitrary as regards the actual values of the parameters. The usual constraints specify that the row and column estimates both sum to zero and the interaction estimates sum to zero in any row or column. This results in one "free" row parameter, one free column parameter, and one free interaction parameter. With the grand mean, a 2 × 2 table with 4 data points has the usual four degrees of freedom.

Table 1
Cell Means, Effects, and Estimates

Health status          Girl      Boy      By row
Population M
  Healthy               14        17       15.5
  Unhealthy              6         5        5.5
  M                     10        11
  Grand M                                  10.5
Scientific effect
  Healthy                4        12        3.0
  Unhealthy            -11        -7       10.0
  Column effect          5         0        2.0
Estimate
  Healthy               -1         1        5
  Unhealthy              1        -1       -5
  Column estimate       -0.5       0.5     10.5

Note. In the lower two sections, the entry in the last row of the By row column is the grand mean effect (2.0) or its estimate (10.5).
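The arithmetic behind Table 1 can be checked with a short sketch. It is only an illustration under stated assumptions: the function and variable names (`usual_estimates`, `cell_means`, `scientific`) are hypothetical and not from the original article, and equal cell sizes are assumed so that row and column means are simple averages of cell means.

```python
# Cell means from the top of Table 1 (rows: healthy, unhealthy; columns: girl, boy).
means = [[14.0, 17.0], [6.0, 5.0]]

# Made-up "scientific" effects from the middle of Table 1:
# (grand mean, row effects, column effects, interaction effects).
scientific = (2.0, [3.0, 10.0], [5.0, 0.0], [[4.0, 12.0], [-11.0, -7.0]])


def usual_estimates(m):
    """Sum-to-zero estimates of Model 1 for a two-way table of cell means."""
    n_rows, n_cols = len(m), len(m[0])
    grand = sum(sum(row) for row in m) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in m]
    col_means = [sum(row[j] for row in m) / n_rows for j in range(n_cols)]
    row_est = [r - grand for r in row_means]
    col_est = [c - grand for c in col_means]
    inter_est = [
        [m[i][j] - row_means[i] - col_means[j] + grand for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return grand, row_est, col_est, inter_est


def cell_means(grand, row_eff, col_eff, inter_eff):
    """Rebuild the cell means implied by a set of Model 1 parameters."""
    return [
        [grand + row_eff[i] + col_eff[j] + inter_eff[i][j] for j in range(len(col_eff))]
        for i in range(len(row_eff))
    ]


estimated = usual_estimates(means)
print(estimated)                # (10.5, [5.0, -5.0], [-0.5, 0.5], [[-1.0, 1.0], [1.0, -1.0]])
print(cell_means(*scientific))  # [[14.0, 17.0], [6.0, 5.0]]
print(cell_means(*estimated))   # [[14.0, 17.0], [6.0, 5.0]]
```

Both parameter sets reproduce the same four cell means, which is the sense in which the individual parameters cannot be recovered from the data.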
One should notice that the familiar "1, -1" pattern of interaction estimates results. For Cell 12, 17 - 15.5 - 11 + 10.5 = 1 or, as Rosnow and Rosenthal (1989a) show: 17 - (5 + 0.5 + 10.5). Note that the estimates of the interaction parameters have no resemblance to the "true" interaction effects! However, the difference (of the differences of column 2 and column 1) for row 1 and row 2 is 4 points for all three sections of Table 1: (17 - 14) - (5 - 6) = (12 - 4) - (-7 + 11) = (1 + 1) - (-1 - 1) = 4. In the technical jargon, this interaction contrast is an estimable function. Rosnow and Rosenthal's (1989a) example concerned a researcher who predicted a certain ranking of group means and who collected data that didn't verify the prediction (in relation to the ranking of the means). Rosnow and Rosenthal obtained the 1, -1 pattern for both the researcher's prediction and for the observed data (Rosnow & Rosenthal, 1989a, Table 1). They stated, "Also seemingly unbeknown to the investigator, we see that he found exactly the pattern of interaction that was predicted" (Rosnow & Rosenthal, 1989a, p. 144). Rosnow and Rosenthal made the interaction contrast to be 4 in both cases. The 1, -1 pattern must result but does not reflect the true interaction parameters or the ranking of the true means.
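The invariance of the interaction contrast across the three sections of Table 1 can be verified directly. The following is a minimal sketch; the helper name `interaction_contrast` is hypothetical and simply encodes the difference of differences for a 2 × 2 table.

```python
def interaction_contrast(t):
    """Difference of differences for a 2 x 2 table:
    (column 2 - column 1 in row 1) - (column 2 - column 1 in row 2)."""
    return (t[0][1] - t[0][0]) - (t[1][1] - t[1][0])


# Rows: healthy, unhealthy; columns: girl, boy (values from Table 1).
population_means = [[14, 17], [6, 5]]
scientific_interaction = [[4, 12], [-11, -7]]
estimated_interaction = [[-1, 1], [1, -1]]

contrasts = [
    interaction_contrast(t)
    for t in (population_means, scientific_interaction, estimated_interaction)
]
print(contrasts)  # [4, 4, 4]
```

The individual interaction entries differ completely between the second and third tables, yet the contrast is the same in each case.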
Rosnow and Rosenthal's (1989a) criticism was that the researcher reported the obtained cell means and said they didn't conform to his prediction. If a person predicts a certain ranking of means, there is no obligation on that person to specify implied main effects or interactions. The only obligation is honesty, for which this researcher should be applauded. If one wants to discuss interaction, then noting that the sex effect is 3 points for healthy and -1 for unhealthy is sufficient for point estimation. The difference of these "simple" effects is the interaction contrast of 4. For confidence intervals one would also calculate the variance of the interaction contrast and, if desired, the variance of these simple contrasts.
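As a worked illustration of the variances just mentioned, the expressions below give the standard results for linear contrasts of independent cell means. The assumptions are not stated in the original and are introduced here only for the sketch: independent samples of size $n_{ij}$ in each cell and a common error variance $\sigma^2$.

```latex
% Interaction contrast (difference of the simple sex effects), Table 1 values:
\hat{\psi} = [m(1,2) - m(1,1)] - [m(2,2) - m(2,1)] = 3 - (-1) = 4

% Variance of the interaction contrast (all four weights are +1 or -1):
\operatorname{Var}(\hat{\psi})
  = \sigma^2 \left( \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}} \right)

% Variance of a simple effect, e.g. the sex effect for healthy children:
\operatorname{Var}\bigl( m(1,2) - m(1,1) \bigr)
  = \sigma^2 \left( \frac{1}{n_{11}} + \frac{1}{n_{12}} \right)
```

In practice $\sigma^2$ would be replaced by the within-cell mean square, which is an assumption of the usual fixed-effects analysis rather than something specified in the example.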
The example also shows that even the difference and order of row and column effects are misestimated! The true scientific effects show a 7-point effect favoring unhealthy, but the ANOVA model shows a difference of 10 points favoring healthy. When Rosnow and Rosenthal "peel" away main effects, they use unreal main effects. It is a well-known fact that when interaction occurs, main effects are not readily interpretable. Rosnow and Rosenthal give a second example in which they calculate meaningless F tests for main effects in the presence of a significant interaction. At least they are in good company, because this is seen repeatedly both in articles and textbooks!
This second example shows how Rosnow and Rosenthal's
advice can confuse even themselves. The example concerned
inexperienced (I) and experienced (E) ball players assigned to
either a control treatment or a "Ralphing" treatment. Both
groups scored 3 in the control condition, and I went up to 5 and
E went up to 7 in the Ralphing condition. Both groups benefited from Ralphing, but the experienced group benefited even
more. Rosnow and Rosenthal plotted the interaction estimates
and obtained the +, - crossing pattern. They then said, "The
experienced ball players benefited moderately from Ralphing
to the same degree that inexperienced ball players were harmed
by it" (Rosnow & Rosenthal, 1989a, p. 146).
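The numbers in the Ralphing example can be reproduced with a brief sketch, assuming equal cell sizes (an assumption made here for illustration; the variable names are hypothetical). It contrasts the usual interaction estimates with the observed gains from control to Ralphing.

```python
# Cell means from the Ralphing example
# (rows: inexperienced, experienced; columns: control, Ralphing).
means = [[3.0, 5.0], [3.0, 7.0]]

grand = sum(sum(row) for row in means) / 4
row_means = [sum(row) / 2 for row in means]
col_means = [(means[0][j] + means[1][j]) / 2 for j in range(2)]

# Usual sum-to-zero interaction estimates: m(ij) - m(i) - m(j) + m.
interaction_estimates = [
    [means[i][j] - row_means[i] - col_means[j] + grand for j in range(2)]
    for i in range(2)
]

# Observed change from control to Ralphing within each experience group.
ralphing_gain = [row[1] - row[0] for row in means]

print(interaction_estimates)  # [[0.5, -0.5], [-0.5, 0.5]]
print(ralphing_gain)          # [2.0, 4.0]
```

The estimate for inexperienced players under Ralphing is negative even though their cell mean rose by 2 points, which is how reading the estimates alone can suggest harm where the means show benefit.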
The theoretical cell means are estimable by the observed cell means (or more generally, by linear functions of the means). In plain words, they reflect reality. When interaction occurs, they are the best guesses (non-Bayesian) of future behavior for the various factor combinations defining the cells. Furthermore, in the presence of interaction, the structural model with main effects is of no use. Rosnow and Rosenthal (1989a) state, "The point of this article is to emphasize that if investigators are claiming to speak of an interaction, the exercise of looking at the corrected cell (or condition) means is absolutely necessary" (p. 146). This is misguided. The usual graph of cell means shows how the two factors are behaving together. The interaction is seen in the lack of parallelism of the two lines. Rosnow and Rosenthal would have us always plot the lines as crossing!
Marascuilo and Levin (1970) stated that many of the errors in analysis after rejection of a test for interaction come from "an incorrect understanding that researchers have concerning interaction in a factorial design" (p. 414). They called this the "intuitive model," or "synergistic model." Their example concerns four groups: placebo, Drug A, Drug B, both Drug A and B. Their model specifies the following:
Group          Effect
placebo        M
Drug A         M + A
Drug B         M + B
Drug A, B      M + A + B + (AB)
Marascuilo and Levin (1970) identify the (AB) parameter as the interaction resulting from the synergistic effect of the two drugs administered together. They go on to say that in this model it makes good sense to think of interaction in a single cell, but in the generally applied ANOVA model, an interaction is "a component that involves every cell of the design and not just the cell representing the joint administration of the two drugs" (2 × 2 designs; Marascuilo & Levin, 1970, p. 416).
Table 2 shows the estimates using the same true means as in the top of Table 1. Lo and behold! The synergistic interaction parameter is estimated to be 4, which is the value of the interaction contrast of the cell means calculated previously. Again the main effects of drugs, A = -1 and B = 8, bear no resemblance to either the usual estimates or the "latent" effects of Table 1. This "intuitive" model is just another way of parameterizing four cell means. Whether the true science is synergistic or follows the latent structural model is unknowable from four data points (or more in a larger two-way table)!
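The synergistic estimates can be reproduced with a small sketch. The function name `synergistic_estimates` is hypothetical, and the assignment of Table 1 cells to drug groups is inferred from the reported values (M = 6, A = -1, B = 8): placebo corresponds to unhealthy girls, Drug A alone to unhealthy boys, Drug B alone to healthy girls, and both drugs to healthy boys.

```python
def synergistic_estimates(m_placebo, m_a, m_b, m_both):
    """Solve M, A, B, and (AB) from the four cell means of the synergistic model."""
    M = m_placebo
    A = m_a - m_placebo
    B = m_b - m_placebo
    AB = m_both - m_a - m_b + m_placebo  # same as the interaction contrast
    return M, A, B, AB


# Table 1 population means mapped onto the four drug groups (assumed mapping).
M, A, B, AB = synergistic_estimates(m_placebo=6, m_a=5, m_b=14, m_both=17)
print(M, A, B, AB)  # 6 -1 8 4

# The four parameters reproduce the four cell means exactly.
print([M, M + A, M + B, M + A + B + AB])  # [6, 5, 14, 17]
```

The (AB) term equals the difference of differences of the cell means, so this parameterization changes the main-effect values while leaving the estimable contrast at 4.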
Much of the discussion is valuable because Marascuilo and Levin (1970) emphasize interaction as the difference of differences and they address the problem of multiple-error rates. However, they seem not to have followed their own dictum that interaction involves every cell when they made a rather large issue of calculating confidence intervals for single interaction parameters. Games made almost the same criticism when he pointed out that the interaction parameters "have no generalizability to a modified replication" (Games, 1973, p. 305). As I have shown, they do not even have utility for a current data set, except as they enter estimable functions.
Most presentations of the structural model (with more parameters than data points) in statistics textbooks for the behavioral sciences are incomplete. This is probably because of the desire to avoid rather heavy mathematics. However, it is unfortunate that the distinction between the original (scientific) model and the reparameterized model is blurred. The ANOVA is an analysis of estimable functions, not of individual parameters. Two important points are (a) any reparameterization leads to the same estimate of an estimable function, and (b) it is possible to test hypotheses only about estimable functions (Kempthorne, 1952). The failure to make the distinction has misled many, the most recent being Rosnow and Rosenthal (1989a). Once one thinks in terms of cell means or the cell means model as Marascuilo and Levin did later (Marascuilo & Levin, 1976), most of the issues disappear, because contrasts of the cell means are estimable functions. In a 2 × 2 table, four means can be compared in 11 different ways using unit weights. Only one of those ways directly addresses interaction: the difference of the differences. If a researcher wants to report other comparisons like "simple effects" and if those comparisons are deemed useful in interpretation, then by all means they should be reported. I believe Games (1973) subscribed to that view. If one wants to analyze various interaction contrasts in larger tables, refer to textbooks such as Kirk (1982) and the article by Boik (1979). The only real issue is not how to interpret interactions, but how one controls significance levels in multiple-testing situations. The structural model with only main effects is still overparameterized but seemingly causes no difficulties in interpretation.

Table 2
Estimates Using Synergistic Parameterization

                              Drug A
Drug B     No                 Yes
Yes        M + B = 6 + 8      M + A + B + (AB) = 6 - 1 + 8 + 4
No         M = 6              M + A = 6 - 1
To all those who make graphs of cell means to show the interaction as nonparallel lines, ignore the Rosnow and Rosenthal articles and keep on plotting!
References
Boik, R. J. (1979). Interactions, partial interactions and interaction
contrasts in the analysis of variance. Psychological Bulletin, 86,
1084-1089.
Fisher, R. A. (1935). The design of experiments. Edinburgh, Scotland:
Oliver & Boyd.
Games, P. A. (1973). Type IV errors revisited. Psychological Bulletin,
80, 304-307.
Kempthorne, O. (1952). The design and analysis of experiments. New
York: Wiley.
Kirk, R. E. (1982). Experimental design: Procedures for the behavioral
sciences (2nd ed.). Monterey, CA: Brooks/Cole.
Levin, J. R., & Marascuilo, L. A. (1972). Type IV errors and interaction.
Psychological Bulletin, 78, 368-374.
Levin, J. R., & Marascuilo, L. A. (1973). Type IV errors and games.
Psychological Bulletin, 80, 308-309.
Marascuilo, L. A., & Levin, J. R. (1970). Appropriate post hoc comparisons for interaction and nested hypotheses in analysis of variance
designs: The elimination of Type IV errors. American Educational
Research Journal, 7, 397-421.
Marascuilo, L. A., & Levin, J. R. (1976). The simultaneous investigation of interaction and nested hypotheses in two-factor analysis of
variance designs. American Educational Research Journal, 13, 61-65.
Rosnow, R. L., & Rosenthal, R. (1989a). Definition and interpretation
of interaction effects. Psychological Bulletin, 105, 143-146.
Rosnow, R. L., & Rosenthal, R. (1989b). Statistical procedures and the
justification of knowledge in psychological science. American Psychologist, 44, 1276-1284.
Received November 29, 1989
Revision received October 4, 1990
Accepted December 24, 1990