Stat 404

Fixing X (by constructing variables that meet the linearity assumption)

A. Consider the following experiment: A list of 15 words is projected on a screen in front of three groups of subjects. Prior to this, subjects in one group (Group #1) were told to “memorize the words for a recall test to be given later.” Another group (Group #2) was told to “meditate on the words while relaxing using biofeedback.” A third group (Control, Group #3) received no prior instructions.

1. The data (for each group, the first column is the raw number of words recalled and the second is its deviation from the group mean):

   Group #1                  Group #2                  Group #3
   Rote memorization         Bio-memorization          No memorization
   Y      Y - mean           Y      Y - mean           Y      Y - mean
   9      -2                 10      0                 5      -1
   12      1                 8      -2                 6       0
   11      0                 11      1                 8       2
   15      4                 13      3                 7       1
   8      -3                 8      -2                 4      -2

   Group means: $\bar{Y}_1 = 11$, $\bar{Y}_2 = 10$, $\bar{Y}_3 = 6$
   Overall mean: $\bar{Y} = 9$

2. A plot:

   [Figure: number of words recalled (vertical axis) plotted by group (horizontal axis, Groups 1 to 3).]

3. The ANOVA table:

   Source       SS     df    MS      F
   Treatment    70      2    35      7.24
   Error        58     12    4.83
   Total       128     14

4. Note that group differences in this table explain a significant amount of variance at both the .05 and .01 levels of significance, since $F^{2}_{12,\,.05} = 3.88$ and $F^{2}_{12,\,.01} = 6.93$.

5. The next few lectures will consider a variety of ways that independent variables can be constructed to explain this variance (i.e., the treatment sum of squares of 70 given in the ANOVA table). SPSS output that summarizes these ways is provided at the end of this section of your lecture notes.

B. Doing an ANOVA with dummy variables.

1. The variables:

$$D_1 = \begin{cases} 1 & \text{if Treatment \#1} \\ 0 & \text{otherwise} \end{cases} \quad \text{and} \quad D_2 = \begin{cases} 1 & \text{if Treatment \#2} \\ 0 & \text{otherwise} \end{cases}$$

2. The data matrix:

   Group                          Y     D1    D2
   Treatment #1 (rote memory)     9     1     0
   Treatment #1 (rote memory)     12    1     0
   Treatment #1 (rote memory)     11    1     0
   Treatment #1 (rote memory)     15    1     0
   Treatment #1 (rote memory)     8     1     0
   Treatment #2 (bio-memory)      10    0     1
   Treatment #2 (bio-memory)      8     0     1
   Treatment #2 (bio-memory)      11    0     1
   Treatment #2 (bio-memory)      13    0     1
   Treatment #2 (bio-memory)      8     0     1
   Control                        5     0     0
   Control                        6     0     0
   Control                        8     0     0
   Control                        7     0     0
   Control                        4     0     0

3. The resulting regression equation:

$$\hat{Y} = 6 + 5D_1 + 4D_2$$

Notice that (as you might expect) the estimated Y-value for each group equals the group mean:

$$\hat{Y}_1 = 6 + 5(1) + 4(0) = 11 = \bar{Y}_1$$
$$\hat{Y}_2 = 6 + 5(0) + 4(1) = 10 = \bar{Y}_2$$
$$\hat{Y}_3 = 6 + 5(0) + 4(0) = 6 = \bar{Y}_3$$

4. Interpreting the regression coefficients:
a.
$\hat{a} = 6$: On average, 6 words were recalled by subjects with no prior instructions.
b. $\hat{b}_1 = 5$: On average, the rote memorization group recalled 5 words more than the control group.
c. $\hat{b}_2 = 4$: On average, those using bio-memorization recalled 4 words more than did those in the control group (i.e., those with no prior instructions).

6. Note: When slopes associated with dummy variables are stated in words, you do not (unless they are constructed from different nominal-level variables [such as race, gender, religious affiliation, etc.]) refer to the effects of one dummy variable being adjusted for its collinearity with another one.

C. Effect coding

1. The variables:

$$E_1 = \begin{cases} 1 & \text{if Treatment \#1} \\ -1 & \text{if Control group} \\ 0 & \text{otherwise} \end{cases} \quad \text{and} \quad E_2 = \begin{cases} 1 & \text{if Treatment \#2} \\ -1 & \text{if Control group} \\ 0 & \text{otherwise} \end{cases}$$

2. The data matrix:

   Group                          Y     E1    E2
   Treatment #1 (rote memory)     9     1     0
   Treatment #1 (rote memory)     12    1     0
   Treatment #1 (rote memory)     11    1     0
   Treatment #1 (rote memory)     15    1     0
   Treatment #1 (rote memory)     8     1     0
   Treatment #2 (bio-memory)      10    0     1
   Treatment #2 (bio-memory)      8     0     1
   Treatment #2 (bio-memory)      11    0     1
   Treatment #2 (bio-memory)      13    0     1
   Treatment #2 (bio-memory)      8     0     1
   Control                        5     -1    -1
   Control                        6     -1    -1
   Control                        8     -1    -1
   Control                        7     -1    -1
   Control                        4     -1    -1

3. Definition: A contrast is a variable having a mean of zero.
a. Every variable can be converted to a contrast by subtracting its mean from each of its values. Statisticians sometimes refer to such a conversion as centering one’s data on a variable.
b. Note that unlike dummy variables, effect measures are contrasts.
c. Note also that when all of one’s independent variables are contrasts, the constant in one’s regression equation is an estimate of the dependent variable’s mean. For example, in this case $\hat{a} = \bar{Y} - \hat{b}_1\bar{E}_1 - \hat{b}_2\bar{E}_2 = 9 - 2(0) - 1(0) = 9 = \bar{Y}$.
d. Whenever the constant in a regression model estimates the mean, slopes associated with contrasts can be described as deviations from the overall mean.

4. The resulting regression equation:

$$\hat{Y} = 9 + 2E_1 + 1E_2$$

Again note that the estimated Y-value for each group equals the group mean:

$$\hat{Y}_1 = 9 + 2(1) + 1(0) = 11 = \bar{Y}_1$$
$$\hat{Y}_2 = 9 + 2(0) + 1(1) = 10 = \bar{Y}_2$$
$$\hat{Y}_3 = 9 + 2(-1) + 1(-1) = 6 = \bar{Y}_3$$

5. Interpreting the regression coefficients:
a. $\hat{a} = 9$: Overall, 9 words were recalled on average.
b.
$\hat{b}_1 = 2$: On average, the rote memorization group recalled 2 words more than the overall average.
c. $\hat{b}_2 = 1$: On average, those using bio-memorization recalled 1 word more than the overall average.

6. Note that the “effect” of the control group (i.e., the deviation of the mean of the control group from the overall mean) equals $-\sum_{i=1}^{2} \hat{b}_i = -(2 + 1) = -3$. After obtaining this third effect, notice that the 3 effects sum to zero.

7. In the U.S. we have a folk saying, “There are many ways to skin a cat.” The meaning of the expression is that there are a variety of ways to do some things. In these notes, you are learning a variety of ways to explain the variance among the 2 treatment groups and the control group (i.e., to explain $SS_{TREATMENT} = 70$). At this point we have considered 2 ways to “skin this cat.”

D. Weighted effect coding

1. A potential problem: Effect coding (as described above) only yields contrasts in balanced experimental designs (i.e., when each of the groups being compared is of the same size [e.g., in this case 5 subjects in each group]). Weighted effect coding is needed when group sizes are unequal.

2. The variables (assuming a 3-level nominal-level variable with $n_i$ units of analysis in the $i$th group for $i = 1, 2, 3$):

$$E_{W1} = \begin{cases} 1 & \text{if Treatment \#1} \\ -\dfrac{n_1}{n_3} & \text{if Control group} \\ 0 & \text{otherwise} \end{cases} \quad \text{and} \quad E_{W2} = \begin{cases} 1 & \text{if Treatment \#2} \\ -\dfrac{n_2}{n_3} & \text{if Control group} \\ 0 & \text{otherwise} \end{cases}$$

3. The data matrix (an altered version of the previous data matrix in which the overall mean and the treatment means are the same, but the mean of the control group is $\bar{Y}_3 = 29/5 = 5.8$; here $n_1 = 6$, $n_2 = 4$, and $n_3 = 5$):

   Group                          Y     EW1     EW2
   Treatment #1 (rote memory)     9     1       0
   Treatment #1 (rote memory)     12    1       0
   Treatment #1 (rote memory)     11    1       0
   Treatment #1 (rote memory)     15    1       0
   Treatment #1 (rote memory)     8     1       0
   Treatment #1 (rote memory)     11    1       0
   Treatment #2 (bio-memory)      8     0       1
   Treatment #2 (bio-memory)      11    0       1
   Treatment #2 (bio-memory)      13    0       1
   Treatment #2 (bio-memory)      8     0       1
   Control                        5     -1.2    -.8
   Control                        6     -1.2    -.8
   Control                        7     -1.2    -.8
   Control                        7     -1.2    -.8
   Control                        4     -1.2    -.8

4. Notice how weighted effect coding leaves each effect code a contrast:

$$\bar{E}_{W1} = \frac{6(1) + 5(-1.2)}{15} = 0 \quad \text{and} \quad \bar{E}_{W2} = \frac{4(1) + 5(-.8)}{15} = 0$$

5. With the changes made to the data matrix, the regression equation is identical to the previous one (i.e., $\hat{Y} = 9 + 2E_{W1} + 1E_{W2}$).
Moreover, interpretations of regression coefficients are identical to those in the previous equation as well.

6. However, the effect of the control group is different: $-\sum_{i=1}^{2} \frac{n_i}{n_3}\hat{b}_i = -1.2(2) - .8(1) = -2.4 - .8 = -3.2$, which is the deviation of the control group’s mean from the overall mean (i.e., $9 - 3.2 = 5.8$).

7. Weighted effects sum to zero:
a. In a balanced design one need only sum each group’s effects (i.e., each group mean’s deviation from the overall mean).
b. In an unbalanced design one must weight each group’s effects by its size before summing them. Thus if among $k$ unequal groups, the $i$th group’s size is $n_i$ and its effect is $\hat{b}_i$, then the weighted effects sum (as illustrated using the above data) to zero as follows:

$$\sum_{i=1}^{k} n_i\hat{b}_i = 6(2) + 4(1) + 5(-3.2) = 12 + 4 - 16 = 0$$

E. Orthogonal contrasts

1. As already mentioned during our discussion of principal components, when two independent variables are constructed by the researcher in such a way that they explain distinct amounts of variance in any dependent variable, they are said to be orthogonal to each other.

2. The key advantages of orthogonal independent variables are that their effects will not be confounded with each other (i.e., they will each explain a distinct amount of the dependent variable’s variance) and that their slopes can be interpreted without concern for their having been adjusted for such confounding.

3. Note that orthogonality implies an absence of multicollinearity. Yet the two involve the researcher in different ways.
a. On the one hand, the researcher may actively assign sets of values to categories of his/her independent variable such that each set is orthogonal to every other one.
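
Returning briefly to the three codings in sections B through D: their regression equations can be checked numerically. The following is a minimal sketch in Python, assuming NumPy is available. It is not the SPSS output these notes refer to, and the helper name `fit` is hypothetical, introduced here only for illustration. It fits each coding by ordinary least squares and recovers the constants and slopes reported above.

```python
import numpy as np

def fit(y, *columns):
    """Ordinary least squares with a constant; returns [a, b1, b2, ...].
    (Hypothetical helper for illustration, not part of the course materials.)"""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Balanced data from sections A-C: groups 1 and 2 are treatments, 3 is control.
y_bal = np.array([9, 12, 11, 15, 8, 10, 8, 11, 13, 8, 5, 6, 8, 7, 4], dtype=float)
g_bal = np.repeat([1, 2, 3], [5, 5, 5])

# Dummy coding (section B): 1 for the named treatment, 0 otherwise.
d1 = np.where(g_bal == 1, 1.0, 0.0)
d2 = np.where(g_bal == 2, 1.0, 0.0)
dummy_coef = fit(y_bal, d1, d2)          # constant 6, slopes 5 and 4

# Effect coding (section C): control group coded -1 on both variables.
e1 = np.where(g_bal == 1, 1.0, np.where(g_bal == 3, -1.0, 0.0))
e2 = np.where(g_bal == 2, 1.0, np.where(g_bal == 3, -1.0, 0.0))
effect_coef = fit(y_bal, e1, e2)         # constant 9, slopes 2 and 1

# Unbalanced data from section D (group sizes 6, 4, and 5).
y_unb = np.array([9, 12, 11, 15, 8, 11, 8, 11, 13, 8, 5, 6, 7, 7, 4], dtype=float)
g_unb = np.repeat([1, 2, 3], [6, 4, 5])
n1, n2, n3 = (int(np.sum(g_unb == k)) for k in (1, 2, 3))

# Weighted effect coding: control group coded -n_i / n_3.
ew1 = np.where(g_unb == 1, 1.0, np.where(g_unb == 3, -n1 / n3, 0.0))
ew2 = np.where(g_unb == 2, 1.0, np.where(g_unb == 3, -n2 / n3, 0.0))
weighted_coef = fit(y_unb, ew1, ew2)     # constant 9, slopes 2 and 1

# Effect of the control group and the size-weighted sum of all three effects.
control_effect = -(n1 / n3) * weighted_coef[1] - (n2 / n3) * weighted_coef[2]
weighted_sum = n1 * weighted_coef[1] + n2 * weighted_coef[2] + n3 * control_effect

print(np.round(dummy_coef, 2), np.round(effect_coef, 2), np.round(weighted_coef, 2))
print(round(control_effect, 2), round(weighted_sum, 2))
```

The same design matrices could be passed to any regression routine; `np.linalg.lstsq` is used here only because it keeps the sketch self-contained.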