Perceptual and Motor Skills, 1967, 24, 443-450.

INTERACTIVE CLASSIFICATION: A METHOD FOR ASSESSING THE ADEQUACY OF COUNTERBALANCING AS A MEANS OF CONTROL¹

JOHN J. FUREDY
Indiana University

Summary. In situations where a treatment is varied over the same Ss in order that each S may serve as his own control and where an associated source of variation is controlled for by counterbalancing between Ss, the success of this method of control depends on the absence of interaction between the treatments and the counterbalanced factor. When the data are classified into a factorial system involving the treatments and the counterbalanced factor as the two classifications, it is difficult to find a statistical model to test such an interaction. A strategy, by means of which established statistical models can be used to evaluate this interaction, is presented here, and its range of application is discussed.

The device of presenting all treatments to all Ss is common in experimental work. It is generally supposed that this procedure reduces sampling error and thus increases the power of the test of the treatments hypothesis. To take the simple case of two treatments as an example, suppose that in a classical conditioning study E is concerned with the effect on performance of varying the intensity of the unconditioned stimulus (UCS). Rather than using two separate groups of Ss to compare the pairings of a conditioned stimulus (CS) with a weak and a strong UCS, E may choose to pair the two CSs with the weak and strong UCSs respectively over the same group of Ss. To obtain a within-S evaluation of the treatments hypothesis of UCS intensity, E may then perform a related-samples t test based on the differences between CSs paired with the weak and strong UCS for each S.
However, since the two CSs must differ in order to allow differentiation of the effects of the weak and strong UCS training conditions, this attempt to use Ss "as their own control" introduces another factor into the situation: the nature of the CS. This associated source of variation, which could affect the dependent variate in such a way as to invalidate the above evaluation of the treatments hypothesis, is usually controlled by counterbalancing: each CS, say a tone (T) and a light (L), is paired with the weak (W) UCS for half of the Ss and with the strong (S) UCS for the other half of the Ss.

In addition, E sometimes attempts to assess the effect of this counterbalanced CS factor (T-L) by reclassifying the data as performance to T vs performance to L, taking differences between these pairs of values for each S and performing a related-samples t test as before. However, insofar as E is concerned only with the validity of his evaluation of the treatments hypothesis of UCS intensity (W-S), the results of this second t test are irrelevant, since it can be shown that the absence of a main effect due to CS (T-L) is neither a necessary nor a sufficient condition for the adequacy of counterbalancing as a method of control.

That such an absence is not a necessary condition is illustrated by Panels I and II in Fig. 1. The conclusion that there is (or is not) a treatment effect of S > W is unaffected by a main effect due to T-L being absent (Panel I) or present (Panel II), so that E can ignore the CS effect in his assessment of the UCS treatments hypothesis.

¹ Appreciation is extended to R. H. Day, R. S. Rodger, and J. P. Sutcliffe for their helpful comments, with special thanks to J. P. Sutcliffe for providing the intellectual goad which led to the technique presented in this paper. The preparation of this report was supported by Contract Nonr 908-15, Roger W. Russell, Project Director.
The absence of a main effect due to CS is not sufficient for the effectiveness of counterbalancing, as can be shown if we consider the possibility of interaction between the T-L and W-S factors. Panels III and IV in Fig. 1 are two illustrations where, despite the absence of any main effect due to T-L, the interaction affects the validity of the conclusions about the W-S treatment effect. In the first case (III), the simple comparison of W with S yields the conclusion that there is no W-S effect, but this is misleading. The interaction shows that there is indeed a W-S effect, although it is differentially influenced by the nature of the CS involved. In the second idealized case (IV), a simple comparison of W with S would indicate a W-S treatment effect, but this should be modified in the light of the interaction, which suggests that the effect due to UCS intensity occurs appreciably only when the CS is T.

Unlike the main effect of the T-L factor, the interaction of this factor with the W-S treatment factor cannot be ignored. If it is decided that such an interaction exists, it follows that, for the purpose of assessing the treatments hypothesis, the counterbalancing procedure can no longer be regarded as an adequate experimental control of the T-L factor, the effectiveness of this control being limited to the main effect only of that factor.

This is not to say that E's strategy in the design of future experimental tests of the treatments hypothesis is fully determined by the decision that interaction is present. In some circumstances, E may still be content to retain the T-L factor (i.e., test Ss with both UCSs and differing CSs) despite its apparent interference, or interaction, with the W-S treatment factor. E would then attempt to provide statistical control of the interactive effect of T-L by unconfounding the variance due to this interaction from that due to the main effect of the W-S factor. In other instances, a detected interaction may lead E in future experiments to eliminate any possible source of variation from the T-L factor by holding it constant over treatments. E would then present the two UCSs to different groups of Ss, using only one CS (T or L) and thus giving up the attempt to reduce sampling error in the test of the W-S treatments hypothesis. No doubt there will also be situations where, on detecting interaction, E will find it difficult to choose between retaining and abandoning the T-L factor in future studies.

FIG. 1. Four sets of idealized results of presenting two UCS treatments (W and S) with counterbalancing over an associated source of CS variation (T and L). (In all four panels the data are plotted with the T-L factor as the variable and the W-S treatment factor as the parameter.)

While a detected interaction can thus pose new problems for E, it is at least clear that, for a valid assessment of the treatments hypothesis, it is necessary to consider whether such an interaction is present. Since the decision about the presence or absence of such an interaction would be expected to be based on a statistical test, E's first reaction is to classify the data in a conventional 2 × 2 factorial, as in Fig. 2. CS type (tone or light) and UCS intensity (weak or strong) are the A and B classifications, respectively. The symbols n1 and n2 specify the two equal sub-groups of Ss on which measures of performance are taken, and it will be noted that the same Ss appear in the diagonals, but not in the rows or columns, of the factorial. Because of this "crossover" distribution of Ss, no standard statistical model is obviously applicable for testing the A, B, and AB effects, the third being of crucial interest.
It will be proposed in the next section that we transform the classification of the data in such a way that it becomes possible to apply established statistical models for testing the significance of this AB interaction, as well as the A and B effects.

INTERACTIVE CLASSIFICATION

Consider the alternative system of classification drawn up in Fig. 3 where the A classification of CS type has been replaced by a new classification, A_b. Whereas the A classification (T-L) involved the CS type per se, this new classification specifies which of the two UCSs specified by the B classification (W or S) is paired with the T (and, consequently, which is paired with the L). The subscript, "b," indicates that this new classification specifies something in connection with the B classification; in this case, A_b specifies CS type in connection with UCS intensity. The A_b classification may be said to "interact with" the B classification, in the sense that it specifies something about what is specified by the B classification. For this reason, the system of classifying the data in Fig. 3 may be called an interactive classification system, as distinct from the simple system of Fig. 2.

FIG. 2. Conventional 2 × 2 classification of hypothetical data obtained from presenting two UCS treatments (W and S) to all Ss (n1 and n2) with counterbalancing over an associated source of CS variation (T and L). (Simple Classification System: A × B)

FIG. 3. Alternative 2 × 2 classification of hypothetical data in Fig. 2. (Interactive Classification System: A_b × B)

Two points, which together constitute the solution to the problem of testing for the A, B, and AB effects of Fig. 2, can be made about this interactive classification of the data. First, inspection of the distribution of Ss, n1 and n2, of the interactive system in Fig. 3 shows that, as compared with the situation in Fig. 2, the design is now a two-factor experiment with repeated measures on one factor, for which a standard statistical model is available (e.g., Winer, 1962, pp. 302-312; McNemar, 1955, p. 332), a model which, from its agricultural origin, is frequently known as the "split-plot" (Cochran & Cox, 1957, pp. 293-296). A split-plot analysis will yield statistical decisions about the significance of the three effects: A_b, B, and A_bB. Given the data based on a sample of A_b groups of Ss, each measured over B conditions, as classified in Fig. 3, E can use the split-plot model to decide whether the A_b, B, and A_bB effects are present or absent in the population.

The second point is best introduced by considering any given set of scores and applying, in turn, the interactive (Fig. 3) and simple (Fig. 2) systems of classification to this set. It can then be asserted that, besides the trivial identity of the B effect, which appears in both systems, the following two equivalences exist between the effects of the two systems:

$$A_b \equiv AB \qquad [1]$$
$$A_bB \equiv A \qquad [2]$$

Assertion [1] states that an A_b effect (in the interactive system) is materially equivalent to an AB effect (in the simple system); assertion [2] states a similar equivalence between the A_bB and A effects in the interactive and simple systems, respectively. While the equivalences are stated in terms of effects which are present, equivalences between effects which are absent are also implied. For example, [1] could be written as "not-A_b ≡ not-AB," an equivalence between the negations of effects. The equivalences, whose truth is suggested by inspection of Figs. 2 and 3, can be formally demonstrated to hold for any set of scores,² so that [1] and [2] are true for population, sample, or any other type of scores. Briefly, the demonstration involves defining row and interaction effects in terms of the four cell scores of any 2 × 2 factorial, and then showing that, in terms of these definitions, [1] and [2] hold.

² A cyclostyled copy of this demonstration is available from the author and as Document No. 9335 from the ADI Auxiliary Publications Project, Photoduplication Service, Library of Congress, Washington, D. C. 20540. Remit $1.25 for photocopies or $1.25 for 35-mm. microfilm.

The interactive classification strategy, then, is useful when we are faced with the problem of wanting to test for interaction in a situation of the sort shown in Fig. 2, where the same group of Ss appear in the principal diagonal cells but not in the same row of the factorial. The data are first reclassified into an interactive system as in Fig. 3, and the split-plot model is used to make statistical decisions or inferences about the effects in the interactive system where these refer to the state of affairs in the population. The second step is to use the logical equivalences, [1] and [2], which hold between the interactive and simple systems for any given set of scores and which allow the results of statistical decisions to be interpreted in simple system terms of order, treatment, and order-treatment interaction. Thus, although the use of the strategy is clearly statistical, it is essentially a logical technique, resting as it does on certain equivalence relations between alternate systems of classification, relations which hold independently of statistical considerations.

APPLICATIONS

Experimental Examples

The counterbalanced source of variation may either be introduced by E as a convenience, or its presence may be necessary for the rationale of the experiment. The conditioning example discussed above is clearly a case of convenience. Since E may eliminate the CS factor by using only one CS (e.g., the T) and presenting only one of the two UCS intensity treatments to any S, we may speak of this CS variable as being extrinsic to the experiment.
When the source of variation is necessary for testing the treatments hypothesis, so that E is not free to present only one treatment to any S, this type of source is intrinsic to the experiment. In indicating some of the concrete cases to which the interactive strategy can be applied, the examples are distinguished in terms of the extrinsic-intrinsic criterion, although it will be noted that the application itself of the strategy is not affected by this distinction.

Extrinsic sources. The situation which led to the use of this type of classification was a pilot study in which two durations of the UCS were the treatments with counterbalancing over the associated CS source of variation of tone and light (Furedy, 1965, p. 206). The intensity of the UCS has also been studied in a similar within-S setting by Wickens and Harding, who have pointed out the advantages of using such within-S designs (Wickens & Harding, 1965, p. 153). The interactive strategy can be used to check whether this gain has been offset by any interaction of the treatment effect with the counterbalanced factor, an interaction for which counterbalancing does not control.

More generally, there is a large class of experiments in psychology where the counterbalanced factor is simply the order in which a given treatment is presented. In a perceptual study R. P. Power of Queen's University of Belfast (personal communication) was concerned with the rate of apparent reversals of rotating figures as a function of the physical shape of the figure. A circular and an elliptical shape constituted two levels of the treatments factor, which was tested within Ss by administering both treatments to all of 20 Ss. The associated source of variation was whether a given treatment was presented first or second, and this order factor was controlled by counterbalancing, 10 Ss being presented with the circle first and the other 10 Ss with the ellipse first. In terms of Fig. 2, then, the A and B classifications were presentation order (first-second) and rotating figure shape (circular-elliptical) respectively, with the two groups, n1 and n2, each containing 10 Ss. When the data were reclassified into an interactive system as in Fig. 3, a significant (F = 5.35, df = 1/18, p < .05) A_b effect was found. In terms of the simple system this result indicated an AB interaction effect, in this case an interaction between presentation order and rotating figure shape. This result led E to abandon the method of "using each S as his own control" in subsequent experiments in favor of presenting only one shape to any S.

Power's study is representative of all experiments in which the associated source of variation introduced by presenting both treatments to all Ss is the order in which a given treatment is presented. If the application of the interactive strategy suggests an interaction between this order factor and the treatments effect, it may well be advisable for E to eliminate the former source of variation which, in these cases, is always extrinsic to the experiment.

Intrinsic sources. When part of the treatment effect is that it be varied within the same S, the associated source of variation becomes intrinsic to the experiment. All differential conditioning studies (both instrumental and classical) contain such an intrinsic source, because, in order to examine the degree to which S can be differentially conditioned to two stimuli, the nature of the stimuli themselves has to be varied. This source of variation is usually counterbalanced against the treatment effect of differential reinforcement. However, if the interactive strategy indicates an interaction between the nature-of-the-stimulus factor and the treatment effect of differential reinforcement, E cannot eliminate the former source of variation. He can only vary it over a different range of values, in the hope that the interaction will not occur within that new range.
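
The reclassification step and the equivalences [1] and [2] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cell values are hypothetical, the function names are invented for the example, and the placement of group n1 in the T-with-W and L-with-S cells (the principal diagonal) is an assumption about the layout of Fig. 2. Each effect is expressed as a contrast on the four cell values; the split-plot significance test itself is not shown.

```python
# Sketch of the 2 x 2 interactive reclassification (hypothetical values).
# Assumption: group n1 has T paired with W and L paired with S;
# group n2 has T paired with S and L paired with W.

def simple_effects(cells):
    """Contrasts in the simple A x B system; keys are (CS, UCS)."""
    tw, ts = cells[("T", "W")], cells[("T", "S")]
    lw, ls = cells[("L", "W")], cells[("L", "S")]
    return {
        "A": (tw + ts) - (lw + ls),   # CS main effect (T vs L)
        "B": (tw + lw) - (ts + ls),   # UCS main effect (W vs S)
        "AB": (tw + ls) - (ts + lw),  # CS x UCS interaction
    }

def reclassify(cells):
    """Move each cell to its (group, UCS) position in the A_b x B system."""
    return {
        ("n1", "W"): cells[("T", "W")],
        ("n1", "S"): cells[("L", "S")],
        ("n2", "W"): cells[("L", "W")],
        ("n2", "S"): cells[("T", "S")],
    }

def interactive_effects(table):
    """Contrasts in the interactive system; keys are (group, UCS)."""
    n1w, n1s = table[("n1", "W")], table[("n1", "S")]
    n2w, n2s = table[("n2", "W")], table[("n2", "S")]
    return {
        "Ab": (n1w + n1s) - (n2w + n2s),   # between-groups effect
        "B": (n1w + n2w) - (n1s + n2s),    # UCS main effect (W vs S)
        "AbB": (n1w + n2s) - (n1s + n2w),  # groups x UCS interaction
    }

# Hypothetical cell means, chosen only to exercise the contrasts.
cells = {("T", "W"): 4.0, ("T", "S"): 9.0, ("L", "W"): 3.0, ("L", "S"): 5.0}
simple = simple_effects(cells)
inter = interactive_effects(reclassify(cells))

assert inter["Ab"] == simple["AB"]   # assertion [1]
assert inter["AbB"] == simple["A"]   # assertion [2]
assert inter["B"] == simple["B"]     # trivial identity of the B effect
```

Because the interactive contrasts are the same four cell values combined with the same signs as their simple-system counterparts, the agreement does not depend on the particular numbers chosen.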
Formal Limits and Extensions

By considering the applicability of the interactive classification to cases which are formally different from the examples quoted above, formal extensions of the method can be investigated. For example, Gerjuoy and Myers (1964) have shown in another context that equivalence assertions [1] and [2] do not hold if the number of Ss on which each cell aggregate is based is not constant. This is not a serious limitation here, since the procedure of counterbalancing ensures that this constancy condition is met.

Of more interest are cases where the counterbalancing is preserved, but where the complexity of the design is increased. Adding further bases of classification to the original 2 × 2 factorial is one such source of complication. To take the UCS intensity case as an example again (Fig. 2), we may want to add a third basis of classification, C, which may either be within Ss (e.g., two or more stages of training) or between Ss (e.g., two or more levels of anxiety). As long as these new bases of classification are not involved in the interactive reclassification, their addition does not affect the applicability of the strategy. In the A × B case for two levels of A and B, the simple ABC system which results from adding C as a new basis of classification can be reclassified into an interactive A_bBC system (note that the C classification is not involved in this shift). On the basis of the equivalence assertions [1] and [2], any statistical effects in the interactive system are interpretable through the substitution of the terms AB and A for the terms A_b and A_bB respectively. Thus a significant A_bBC effect is equivalent to an AC interaction effect by [2], while an A_bC effect is interpretable, by [1], as a second-order ABC interaction effect. More generally, given any multiplicative classification system ABCD... such that the A and B classifications have only two categories, the interactive shift into an A_bBCD... system is applicable in the sense that any A_b and A_bB terms can be replaced by AB and A terms respectively in the interpretation of the statistical effects.

The other method of increasing the complexity of the design is to increase the number of categories in the original A and B classifications which are involved in the interactive shift. In the case of the UCS intensity treatments effect, this would involve presenting three levels of UCS intensity (weak, medium, and strong) to all Ss, and pairing each UCS with a different CS (perhaps a light and two tones of different frequencies). A more common example, however, is provided when the counterbalanced factor is the order of presenting a given treatment. Thus, if three treatments were administered to all Ss, counterbalancing for order would be carried out by dividing the Ss into three equal groups and varying the order of presentation of treatments in such a way that each treatment is presented first, second, and third to equal numbers of Ss. The data would then be classified into a simple system similar to that in Fig. 2, with order and treatments as the A and B classifications respectively, with the exceptions that these two factors would have three levels each, and there would be three equal sub-groups of Ss, which could be labeled n1, n2, and n3, respectively. The interactive system analogous to that in Fig. 3 is that system of classification in which the same Ss appear in any given row of the factorial. In these 3 × 3 cases it appears that the technique is only partially applicable, since the two equivalences no longer hold.
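
The loss of the two equivalences with three categories can be illustrated by a small hypothetical table. The sketch below assumes a cyclic Latin-square counterbalancing (group g receives treatment t in position (t - g) mod 3) and treats an effect as present when any deviation from the relevant means is nonzero. Both choices, like the function names and the cell values, are assumptions made for illustration and are not the author's demonstration.

```python
# Hypothetical 3 x 3 example; pure Python, no external libraries.

def effects(table, tol=1e-9):
    """Return (row effect, column effect, interaction) as booleans."""
    n = len(table)
    grand = sum(sum(row) for row in table) / (n * n)
    row_means = [sum(row) / n for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(n)]
    row_effect = any(abs(m - grand) > tol for m in row_means)
    col_effect = any(abs(m - grand) > tol for m in col_means)
    interaction = any(
        abs(table[i][j] - row_means[i] - col_means[j] + grand) > tol
        for i in range(n)
        for j in range(n)
    )
    return row_effect, col_effect, interaction

def to_interactive(simple):
    """Rows become groups of Ss; columns remain treatments.

    Assumes group g receives treatment t in position (t - g) mod n.
    """
    n = len(simple)
    return [[simple[(t - g) % n][t] for t in range(n)] for g in range(n)]

# Simple system: rows = order position (A), columns = treatment (B).
# Every row mean and column mean is zero, so only an AB interaction exists.
simple = [
    [1, -1, 0],
    [-1, 0, 1],
    [0, 1, -1],
]

a, b, ab = effects(simple)
a_b, b_again, a_b_b = effects(to_interactive(simple))

assert (a, b, ab) == (False, False, True)              # AB present, A absent
assert (a_b, b_again, a_b_b) == (False, False, True)   # A_b absent, A_bB present
```

In this table an AB interaction is present in the simple system although no A_b effect appears after reclassification, and an A_bB interaction appears although there is no A effect, so neither equivalence of the 2 × 2 case carries over.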
Instead, one can assert only the following two relations of implication between the interactive and the simple systems:

$$A_b \rightarrow AB \qquad [1']$$
$$\text{not-}A_bB \rightarrow \text{not-}A \qquad [2']$$

That is, for the 3 × 3 case, an A_b effect implies an AB interaction in the simple system ([1']), and no A_bB effect implies no A effect in the simple system ([2']); however, not-A_b no longer implies not-AB, and A_bB no longer implies A. No detailed demonstration of the truth of these implications is given here.³ As for the 2 × 2 case, the method involves considering the row and interaction effects in terms of cell scores. When this is done, it becomes clear that the reason the equivalence relations of the 2 × 2 case no longer hold is that, in the 3 × 3 case, the defining conditions for the absence of interaction are of a more complex form than those for the absence of a row effect. No demonstration is yet available for cases involving more than three categories in the original A and B classifications, but it would appear that the relations stated by [1'] and [2'] also hold for these n × n cases where n > 3. Compared to the 2 × 2 case, however, it is clear that the power of the strategy has been reduced.

In the three-treatments case which was used to discuss the difference between the 2 × 2 and 3 × 3 systems it will be noted that there are two, not one, independent ways in which the order factor can be counterbalanced. Thus, to counterbalance the order of the three treatments, T1, T2, and T3, either the sequence . . . T1, T2, T3 . . ., or the sequence . . . T1, T3, T2 . . ., may be used, these alternatives being the two levels of the additional factor of sequence. Inasmuch as E feels constrained to vary this sequence factor, it seems that increasing the number of categories in the A and B classifications when the former specifies order of presentation introduces an additional basis of classification, C.
This addition constitutes the first method of increasing design complexity discussed above, which was found not to interfere with the applicability of the interactive stratagem. Thus, where there are more than two categories in the original A × B system, so that only the relations of implications stated by [1'] and [2'] hold between that system and the interactive system, the addition of further classifications leaves these relations of implications unchanged. For example, given any multiplicative system ABCD..., such that the A and B classifications have more than two categories, an A_bCD... effect is interpretable as an ABCD... effect, although an A_bBCD... effect is no longer interpretable as an ACD... effect. The latter limitation, however, is not due to the addition of new bases of classification. It is only the addition of categories to the classifications involved in the interactive shift which weakens the power of the strategy.

It remains only to be stressed that the strategy provides no new statistic. Though set in a statistical context, it is no more than a logical device for reclassifying data into a form which is amenable to statistical analysis by the familiar split-plot model. The statistical validity of the procedure, then, is identical to that of the split-plot model on which the statistical analysis of the interactive classification system is based.

REFERENCES

COCHRAN, W. G., & COX, G. M. Experimental designs. New York: Wiley, 1957.
COPI, I. M. Symbolic logic. New York: Macmillan, 1959.
FUREDY, J. J. Reinforcement through UCS offset in classical aversive conditioning. Aust. J. Psychol., 1965, 17, 205-212.
GERJUOY, H., & MYERS, A. E. Analysis of variance: effects of interchanging main effects and interactions by rearranging data. Percept. mot. Skills, 1964, 18, 833-838.
McNEMAR, Q. Psychological statistics. New York: Wiley, 1955.
WICKENS, D. D., & HARDING, G. B. Effect of UCS strength on GSR conditioning: a within-subject design. J. exp. Psychol., 1965, 70, 151-153.
WINER, B. J. Statistical principles in experimental design. New York: McGraw-Hill, 1962.

Accepted February 21, 1967.

³ A cyclostyled copy of this demonstration is available from the author or from the American Documentation Institute. See Footnote 2 for details.