Stat 404
Fixing X (by constructing variables that meet the linearity assumption)

A. Consider the following experiment: A list of 15 words is projected on a screen in front of three groups of subjects. Prior to this, subjects in one group (Group #1) were told to "memorize the words for a recall test to be given later." Another group (Group #2) was told to "meditate on the words while relaxing using biofeedback." A third group (Control, Group #3) received no prior instructions.

1. The data (raw scores and deviations from each group's mean):

   Group #1 (rote memorization):  Y = 9, 12, 11, 15, 8;   Y - mean = -2, 1, 0, 4, -3;   $\bar{Y}_1 = 11$
   Group #2 (bio-memorization):   Y = 10, 8, 11, 13, 8;   Y - mean = 0, -2, 1, 3, -2;   $\bar{Y}_2 = 10$
   Group #3 (no memorization):    Y = 5, 6, 8, 7, 4;      Y - mean = -1, 0, 2, 1, -2;   $\bar{Y}_3 = 6$

   Overall mean: $\bar{Y} = 9$

2. A plot:

   [Plot: number of words recalled (5 to 15) for each subject, by group (1, 2, 3); scores cluster highest in Groups 1 and 2 and lowest in Group 3.]

3. The ANOVA table:

   Source      |  SS  | df |  MS   |  F
   Treatment   |  70  |  2 | 35    | 7.24
   Error       |  58  | 12 |  4.83 |
   Total       | 128  | 14 |       |

4. Note that group differences in this table explain a significant amount of variance at both the .05 and .01 levels of significance, since $F^{2}_{12,.05} = 3.88$ and $F^{2}_{12,.01} = 6.93$, and the obtained F of 7.24 exceeds both critical values.

5. The next few lectures will consider a variety of ways that independent variables can be constructed to explain this variance (i.e., the treatment sum of squares of 70 given in the ANOVA table). SPSS output that summarizes these ways is provided at the end of this section of your lecture notes.

B. Doing an ANOVA with dummy variables

1. The variables:

   D1 = 1 if Treatment #1; 0 otherwise
   D2 = 1 if Treatment #2; 0 otherwise

2. The data matrix (5 subjects per group):

   Group                        Y values            D1   D2
   Treatment #1 (rote memory)   9, 12, 11, 15, 8     1    0
   Treatment #2 (bio-memory)    10, 8, 11, 13, 8     0    1
   Control                      5, 6, 8, 7, 4        0    0

3. The resulting regression equation: $\hat{Y} = 6 + 5D_1 + 4D_2$

   Notice that (as you might expect) the estimated Y-value for each group equals the group mean:

   $\hat{Y}_1 = 6 + 5(1) + 4(0) = 11 = \bar{Y}_1$
   $\hat{Y}_2 = 6 + 5(0) + 4(1) = 10 = \bar{Y}_2$
   $\hat{Y}_3 = 6 + 5(0) + 4(0) = 6 = \bar{Y}_3$

4. Interpreting the regression coefficients:
   a. $\hat{a} = 6$: On average, 6 words were recalled by subjects with no prior instructions.
   b. $\hat{b}_1 = 5$: On average, the rote memorization group recalled 5 words more than this.
   c. $\hat{b}_2 = 4$: On average, those using bio-memorization recalled 4 words more than did those in the control group (i.e., those with no prior instructions).

5. Note: When slopes associated with dummy variables are stated in words, you do not (unless they are constructed from different nominal-level variables [like race, gender, and religious affiliation]) refer to the effects of one dummy variable being adjusted for its collinearity with another one.
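As a quick numerical check, here is a minimal Python sketch (my own illustration, not part of the original SPSS materials; it assumes numpy is available) that reproduces the dummy-variable equation above via ordinary least squares:

```python
import numpy as np

# Recall scores for the three groups (5 subjects each)
y = np.array([9, 12, 11, 15, 8,    # Treatment #1 (rote memory)
              10, 8, 11, 13, 8,    # Treatment #2 (bio-memory)
              5, 6, 8, 7, 4])      # Control

d1 = np.repeat([1, 0, 0], 5)       # 1 for Treatment #1, else 0
d2 = np.repeat([0, 1, 0], 5)       # 1 for Treatment #2, else 0

X = np.column_stack([np.ones(15), d1, d2])   # constant, D1, D2
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)   # approximately [6. 5. 4.]: Y-hat = 6 + 5*D1 + 4*D2
```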
C. Effect coding

1. The variables:

   E1 = 1 if Treatment #1; -1 if Control group; 0 otherwise
   E2 = 1 if Treatment #2; -1 if Control group; 0 otherwise

2. The data matrix (5 subjects per group):

   Group                        Y values            E1   E2
   Treatment #1 (rote memory)   9, 12, 11, 15, 8     1    0
   Treatment #2 (bio-memory)    10, 8, 11, 13, 8     0    1
   Control                      5, 6, 8, 7, 4       -1   -1

3. Definition: A contrast is a variable having a mean of zero.
   a. Every variable can be converted to a contrast by subtracting its mean from each of its values. Statisticians sometimes refer to such a conversion as centering one's data on a variable.
   b. Note that unlike dummy variables, effect measures are contrasts.
   c. Note also that when all of one's independent variables are contrasts, the constant in one's regression equation is an estimate of the dependent variable's mean. For example, in this case $\hat{a} = \bar{Y} - \hat{b}_1\bar{E}_1 - \hat{b}_2\bar{E}_2 = 9 - 2(0) - 1(0) = 9 = \bar{Y}$.
   d. Whenever the constant in a regression model estimates the mean, slopes associated with contrasts can be described as deviations from the overall mean.

4. The resulting regression equation: $\hat{Y} = 9 + 2E_1 + 1E_2$

   Again note that the estimated Y-value for each group equals the group mean:

   $\hat{Y}_1 = 9 + 2(1) + 1(0) = 11 = \bar{Y}_1$
   $\hat{Y}_2 = 9 + 2(0) + 1(1) = 10 = \bar{Y}_2$
   $\hat{Y}_3 = 9 + 2(-1) + 1(-1) = 6 = \bar{Y}_3$

5. Interpreting the regression coefficients:
   a. $\hat{a} = 9$: Overall, 9 words were recalled on average.
   b. $\hat{b}_1 = 2$: On average, the rote memorization group recalled 2 words more than this.
   c. $\hat{b}_2 = 1$: On average, those using bio-memorization recalled 1 word more than the overall average.

6. Note that the "effect" of the control group (i.e., the deviation of the control group's mean from the overall mean) equals $-\sum_{i=1}^{2}\hat{b}_i = -(2 + 1) = -3$. After obtaining this third effect, notice that the 3 effects sum to zero.

7. In the U.S. we have a folk saying, "There are many ways to skin a cat," meaning that there are a variety of ways to do some things. In these notes, you are learning a variety of ways to explain the variance among the 2 treatment groups and the control group (i.e., to explain $SS_{TREATMENT} = 70$). At this point we have considered 2 ways to "skin this cat."

D. Weighted effect coding

1. A potential problem: Effect coding (as described above) only yields contrasts in balanced experimental designs (i.e., when each of the groups being compared is of the same size [e.g., in this case 5 subjects per group]). Weighted effect coding is needed when group sizes are unequal.

2. The variables (assuming a 3-level nominal-level variable with $n_i$ units of analysis in the $i$th group, for i = 1, 2, 3):

   EW1 = 1 if Treatment #1; $-n_1/n_3$ if Control group; 0 otherwise
   EW2 = 1 if Treatment #2; $-n_2/n_3$ if Control group; 0 otherwise

3. The data matrix (an altered version of the previous data matrix in which the overall mean and the treatment means are the same, but group sizes now differ and the mean of the control group is $\bar{Y}_3 = 29/5 = 5.8$):

   Group                                 Y values               EW1    EW2
   Treatment #1 (rote memory, n1 = 6)    9, 12, 11, 15, 8, 11    1      0
   Treatment #2 (bio-memory, n2 = 4)     8, 11, 13, 8            0      1
   Control (n3 = 5)                      5, 6, 7, 7, 4          -1.2   -0.8

4. Notice how weighted effect coding leaves each effect code a contrast:

   $\bar{E}_{W1} = \frac{6(1) + 5(-1.2)}{15} = 0$ and $\bar{E}_{W2} = \frac{4(1) + 5(-0.8)}{15} = 0$

5. With the changes made to the data matrix, the regression equation is identical to the previous one. Moreover, interpretations of the regression coefficients are identical to those in the previous equation as well.

6. However, the effect of the control group is different: $-\sum_{i=1}^{2}\frac{n_i}{n_3}\hat{b}_i = -(1.2 \times 2 + 0.8 \times 1) = -(2.4 + 0.8) = -3.2$, which is the deviation of the control group's mean from the overall mean (i.e., 9 - 3.2 = 5.8).

7. Weighted effects sum to zero:
   a. In a balanced design one need only sum each group's effects (i.e., each group mean's deviation from the overall mean).
   b. In an unbalanced design one must weight each group's effect by its group's size before summing. Thus if among m unequal groups the $i$th group's size is $n_i$ and its effect is $\hat{b}_i$, then the weighted effects sum to zero, as illustrated using the above data:

      $\sum_{i=1}^{m} n_i\hat{b}_i = 6(2) + 4(1) + 5(-3.2) = 12 + 4 - 16 = 0$
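The two claims above (the weighted codes are contrasts, and the fitted equation is unchanged) can be verified directly. A minimal sketch using the unbalanced data (again my own illustration, assuming numpy):

```python
import numpy as np

# Unbalanced version of the data (6, 4, and 5 subjects per group)
y = np.array([9, 12, 11, 15, 8, 11,   # Treatment #1, mean 11
              8, 11, 13, 8,           # Treatment #2, mean 10
              5, 6, 7, 7, 4])         # Control, mean 5.8

n1, n2, n3 = 6, 4, 5
ew1 = np.repeat([1, 0, -n1 / n3], [n1, n2, n3])   # weighted effect code 1
ew2 = np.repeat([0, 1, -n2 / n3], [n1, n2, n3])   # weighted effect code 2

print(ew1.mean(), ew2.mean())   # both 0: the codes remain contrasts

X = np.column_stack([np.ones(15), ew1, ew2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)   # approximately [9. 2. 1.]: the same equation as before
```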
E. Orthogonal contrasts

1. As already mentioned during our discussion of principal components, when two independent variables are constructed by the researcher in such a way that they explain distinct amounts of variance in any dependent variable, they are said to be orthogonal to each other.

2. The key advantages of orthogonal independent variables are that their effects will not be confounded with each other (i.e., they will each explain a distinct amount of the dependent variable's variance) and that their slopes can be interpreted without concern for their having been adjusted for such confounding.

3. Note that orthogonality implies an absence of multicollinearity. Yet the two involve the researcher in different ways.
   a. On the one hand, the researcher may actively assign sets of values to categories of his/her independent variable such that each set is orthogonal to every other one.
   b. On the other hand, the researcher may passively acknowledge meaningful multicollinearity among measures of various concepts.
   c. A third possibility is to passively describe orthogonal dimensions among collinear measures of the same concepts, as we did with measures combined using principal component analysis.

4. The dot product is a technique for testing orthogonality among contrasts. If the dot product between a pair of variables equals zero, they are orthogonal. The formula for a dot product is as follows:

   $\sum_{i=1}^{m} n_i C_{1i} C_{2i}$

   where $n_i$ = the number of units of analysis in the $i$th group, $C_{ji}$ = the value of the $j$th contrast for all units of analysis in the $i$th group, and m = the total number of groups.

5. The variables:

   C1 = 1 if Treatment #1; 1 if Treatment #2; -2 if Control group
   C2 = 1 if Treatment #1; -1 if Treatment #2; 0 if Control group

   Notice how the first variable "contrasts" the treatment groups with the control group, whereas the second variable "contrasts" the first (rote memorization) treatment group with the second (bio-memorization).

6. To see if the contrasts are orthogonal (and recalling that $n_i = 5$ for each group), we compute their dot product as follows:

   $\sum_{i=1}^{m} n_i C_{1i} C_{2i} = 5(1)(1) + 5(1)(-1) + 5(-2)(0) = 5 - 5 + 0 = 0$

   Since the dot product equals zero, the two contrasts are orthogonal.

7. The data matrix (5 subjects per group):

   Group                        Y values            C1   C2
   Treatment #1 (rote memory)   9, 12, 11, 15, 8     1    1
   Treatment #2 (bio-memory)    10, 8, 11, 13, 8     1   -1
   Control                      5, 6, 8, 7, 4       -2    0

8. The resulting regression equation: $\hat{Y} = 9 + 1.5C_1 + 0.5C_2$

   Once again, the estimated Y-value for each group equals the group's mean:

   $\hat{Y}_1 = 9 + 1.5(1) + 0.5(1) = 11 = \bar{Y}_1$
   $\hat{Y}_2 = 9 + 1.5(1) + 0.5(-1) = 10 = \bar{Y}_2$
   $\hat{Y}_3 = 9 + 1.5(-2) + 0.5(0) = 6 = \bar{Y}_3$

9. Regression coefficients associated with orthogonal contrasts are rarely interpreted, since they are typically used to test for differences in group means (and nothing more). Nonetheless, here's a "valiant attempt" at interpreting them:
   a. $\hat{a} = 9$: Overall, subjects recalled 9 words on average. (No problem here. When estimating contrast effects, the constant always has this meaning.)
   b. $\hat{b}_1 = 1.5$: This is the amount by which $\frac{\bar{Y}_1 + \bar{Y}_2}{2} - \bar{Y}$ and $\frac{\bar{Y}_3 - \bar{Y}}{2}$ differ in opposite directions from zero. Or maybe, "the two treatment groups remembered an average of 1.5 words more than the overall average."
   c. $\hat{b}_2 = 0.5$: This is the amount by which $\bar{Y}_1 - \bar{Y}_t$ and $\bar{Y}_2 - \bar{Y}_t$ differ in opposite directions from zero, where $\bar{Y}_t$ is the average number of words recalled among subjects in the 2 treatment groups only (i.e., without averaging in the control subjects' words recalled). Or maybe, "on average the rote memorization group remembered 0.5 words more than the average number of words remembered among subjects who used either rote memorization or bio-memorization."

10. Finally, please note that we have now accounted for the same variance in yet a different way (i.e., we have "skinned the cat" differently again).
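Here, as a minimal sketch of my own (numpy assumed), is the dot-product check and the orthogonal-contrast regression in code. Because the codes are constant within groups, a plain elementwise dot product over all 15 subjects equals $\sum n_i C_{1i} C_{2i}$:

```python
import numpy as np

y = np.array([9, 12, 11, 15, 8, 10, 8, 11, 13, 8, 5, 6, 8, 7, 4])
c1 = np.repeat([1, 1, -2], 5)    # treatments vs. control
c2 = np.repeat([1, -1, 0], 5)    # rote vs. bio-memorization

print(np.dot(c1, c2))            # 0: the contrasts are orthogonal

X = np.column_stack([np.ones(15), c1, c2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)                     # approximately [9. 1.5 0.5]
```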
F. Orthogonal polynomial contrasts (a particular kind of orthogonal contrast)

1. Polynomial measures are used when one can speak of "units of distance" between adjacent groups to which you are assigning X-values. Accordingly, the experiment (as we have described it thus far) does not lend itself to the development of polynomial measures, since there are no common units in terms of which the 3 groups can be compared.

2. So let's reconceptualize our experiment such that our groups differ according to how long subjects were given to memorize the words. Let's imagine that each group was instructed to do rote memorization, but that ...
   a. the (previously Control) group was given 5 minutes to memorize the words,
   b. the (previously bio-memorization) group was given 10 minutes to memorize the words, and
   c. the (previously rote memorization) group was given 15 minutes to memorize the words.

3. Now consider the following variables:

   P1 = 1 if 15 min.; 0 if 10 min.; -1 if 5 min.
   P2 = 1 if 15 min.; -2 if 10 min.; 1 if 5 min.

   Notice how the first variable measures memorization time linearly in 5-minute intervals, whereas the second variable measures quadratic variation across time.

4. Again note that the contrasts are orthogonal when we compute their dot product:

   $\sum_{i=1}^{m} n_i P_{1i} P_{2i} = 5(1)(1) + 5(0)(-2) + 5(-1)(1) = 5 + 0 - 5 = 0$

   Since the dot product equals zero, the two polynomial contrasts are orthogonal.

5. The data matrix (5 subjects per group):

   Group                        Y values            P1   P2
   15 minutes (rote memory)     9, 12, 11, 15, 8     1    1
   10 minutes (rote memory)     10, 8, 11, 13, 8     0   -2
   5 minutes (rote memory)      5, 6, 8, 7, 4       -1    1

6. The resulting regression equation: $\hat{Y} = 9 + 2.5P_1 - 0.5P_2$

   And again, the estimated Y-value for each group equals the group's mean:

   $\hat{Y}_1 = 9 + 2.5(1) - 0.5(1) = 11 = \bar{Y}_1$
   $\hat{Y}_2 = 9 + 2.5(0) - 0.5(-2) = 10 = \bar{Y}_2$
   $\hat{Y}_3 = 9 + 2.5(-1) - 0.5(1) = 6 = \bar{Y}_3$

   Accordingly, we have "skinned the cat" a fourth time (i.e., we have explained the same between-group variance, $SS_{TREATMENT} = 70$, once again).

7. The constant and slope coefficients associated with these polynomial contrasts can be interpreted as follows:
   a. $\hat{a} = 9$ (the constant is interpreted the same way as in all regressions with only contrasts as independent variables): Overall, subjects recalled 9 words on average.
   b. $\hat{b}_1 = 2.5$ (this slope's interpretation should sound familiar): There is an increase of 2.5 words remembered by respondents for each additional 5 minutes they are given to memorize the 15 words.
      NOTE: This slope corresponds, of course, to the linear relation between two interval-level variables. When taken in conjunction with quadratic, cubic, etc. relations between variables, it is not surprising that some social scientists (psychologists, in particular) tend to argue that linear regression is a special case of analysis of variance, namely the case in which each independent variable is a linear polynomial.
   c. $\hat{b}_2 = -0.5$: Note that the quadratic polynomial measure contrasts values on the dependent variable for the extremes (i.e., having 5 or 15 minutes of memorization time) versus those for the middle (i.e., having 10 minutes for memorization). Here the extremes have a half-word less recall than one would estimate based on the linear relation between time for memorization and words memorized.

8. A plot:

   [Plot: mean words recalled (6, 10, 11) against time for memorization (5, 10, 15 minutes), showing the linear trend $\hat{b}_1 = 2.5$ per 5-minute step and the quadratic adjustment $\hat{b}_2 = -0.5$ pulling the extremes below the linear line.]
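A minimal sketch of my own (numpy assumed) reproduces this fourth "skinning of the cat," including the fact that the fitted values again account for the full between-group sum of squares of 70:

```python
import numpy as np

y = np.array([9, 12, 11, 15, 8, 10, 8, 11, 13, 8, 5, 6, 8, 7, 4])
p1 = np.repeat([1, 0, -1], 5)    # linear trend across 15, 10, 5 minutes
p2 = np.repeat([1, -2, 1], 5)    # quadratic trend (extremes vs. middle)

X = np.column_stack([np.ones(15), p1, p2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)   # approximately [9. 2.5 -0.5]

# The fitted values are the group means, so the explained (between-group)
# sum of squares is again SS_TREATMENT = 70:
print(np.sum((X @ coefs - y.mean()) ** 2))   # 70.0
```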
9. Unlike other contrasts, polynomials take into account distances between the groups being compared. That is, the assignment of polynomial values to groups assumes that these groups differ on an interval (or ratio) scale. For example, imagine that the times for memorization were 5 minutes, 10 minutes, and 30 minutes. In this case, the polynomial contrast

   P1 = -2 if 5 min.; -1 if 10 min.; 3 if 30 min.

   could be used. If we obtained the same data as before, a revised plot might look as follows:

   [Plot: words recalled against time for memorization, with groups at 5 minutes (P1 = -2), 10 minutes (P1 = -1), and 30 minutes (P1 = 3) spaced proportionally along the horizontal axis.]

   a. Note that the values of P1 preserve the relative distances among 5, 10, and 30.
   b. Moreover, unlike the "time for memorization" variable, P1 is a contrast:

      $\sum_{i=1}^{3} n_i P_{1i} = 5(-2) + 5(-1) + 5(3) = -10 - 5 + 15 = 0$

   c. AN ASIDE: You may have noticed some similarities between the contrasts and the (equally spaced) polynomials we have considered:

      P1 = 1 if 15 min.; 0 if 10 min.; -1 if 5 min.      C1 = 1 if Treatment #1; 1 if Treatment #2; -2 if Control
      P2 = 1 if 15 min.; -2 if 10 min.; 1 if 5 min.      C2 = 1 if Treatment #1; -1 if Treatment #2; 0 if Control

      In particular, P1 and C2 are the same, as are P2 and C1, once values for the second and third groups are reversed. However, note that this similarity ends with 4 or more groups.
   d. In the 4-group case, polynomial and nonpolynomial contrasts might look as follows:

      P1 = -3 if 5 min.; -1 if 10 min.; 1 if 15 min.; 3 if 20 min.      C1 = 1 if Treat. #1; 1 if Treat. #2; -1 if Treat. #3; -1 if Treat. #4
      P2 = 1 if 5 min.; -1 if 10 min.; -1 if 15 min.; 1 if 20 min.      C2 = 1 if Treat. #1; -1 if Treat. #2; 1 if Treat. #3; -1 if Treat. #4
      P3 = -1 if 5 min.; 3 if 10 min.; -3 if 15 min.; 1 if 20 min.      C3 = 1 if Treat. #1; -1 if Treat. #2; -1 if Treat. #3; 1 if Treat. #4

      Notice how linear (no bend), quadratic (one bend), and cubic (two bends in the regression line) trends in Y-values across increasing numbers of minutes can be modeled via variations in P1, P2, and P3, respectively. Yet with C1, mean values of Y are compared between the first 2 versus the last 2 treatments. With C2, mean comparisons are made between the 1st and 3rd versus the 2nd and 4th treatments. And with C3, mean comparisons are made between the 1st and 4th versus the 2nd and 3rd treatments.

10. How to construct a linear contrast:
   a. Consider the case in which four equal-sized groups have equal "distances" between adjacent pairs. Let this distance be "h" and let the value of the lowest group be "ℓ + h."
   b. It follows that values for the four groups equal ℓ + h, ℓ + 2h, ℓ + 3h, and ℓ + 4h.
      Note: The h's trace a linear function across 4 equally spaced intervals. The ℓ places this set of intervals at a specific point on the number line.
   c. Since we want a contrast, these values should sum to zero. That is, 4ℓ + 10h = 0, and so 2ℓ = -5h.
   d. At this point we can choose any value for "h" and the value of "ℓ" will be set. So let's choose h = 2, thereby setting ℓ = -5. The resulting linear contrast is as follows:

      ℓ + h = -5 + 2 = -3;  ℓ + 2h = -5 + 4 = -1;  ℓ + 3h = -5 + 6 = 1;  ℓ + 4h = -5 + 8 = 3

      Note that this contrast increases linearly in jumps of h = 2 units from one group to the next. It is a contrast because the group values sum to zero and each of the 4 groups is of the same size.

11. A quadratic contrast is found by squaring a linear contrast and then centering it (i.e., by then creating a contrast via subtracting out the mean of the squared values):

         P1    P1²   P1² - mean(P1²)   P2    P1·P2   P3
         -3     9     9 - 5 =  4        1     -3     -1
         -1     1     1 - 5 = -4       -1      1      3
          1     1     1 - 5 = -4       -1     -1     -3
          3     9     9 - 5 =  4        1      3      1
   Sums:   0    20          0           0      0      0

   (P2 is simply the centered column rescaled by 1/4; rescaling a contrast affects neither its zero mean nor its orthogonality. The last two columns, the product P1·P2 and the true cubic P3, are discussed next.)
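The square-then-center recipe for a quadratic contrast is two lines of code. A minimal sketch of my own (numpy assumed):

```python
import numpy as np

p1 = np.array([-3, -1, 1, 3])        # linear contrast for 4 equal-sized groups
p2 = p1**2 - np.mean(p1**2)          # square, then center: [4. -4. -4. 4.]

print(p2, p2.sum())                  # sums to zero: a contrast
print(np.dot(p1, p2))                # 0: the quadratic is orthogonal to the linear
```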
12. Cubic and higher-order orthogonal contrasts are most easily found in tables of orthogonal polynomials (usually at the ends of advanced analysis-of-variance texts). For example, the last column in the above table lists the cubic polynomial contrast for the 4-group case.
   a. A word of caution: Cubic contrasts are not found by raising a linear contrast to the third power or by multiplying linear and quadratic contrasts (as with the P1·P2 column above). For example, you will note that the cubic polynomial, P3, is orthogonal to both P1 and P2 (which, as promised, are themselves orthogonal to each other):

      $\sum_{i=1}^{4} P_{1i}P_{2i} = (-3)(1) + (-1)(-1) + (1)(-1) + (3)(1) = -3 + 1 - 1 + 3 = 0$
      $\sum_{i=1}^{4} P_{1i}P_{3i} = (-3)(-1) + (-1)(3) + (1)(-3) + (3)(1) = 3 - 3 - 3 + 3 = 0$
      $\sum_{i=1}^{4} P_{2i}P_{3i} = (1)(-1) + (-1)(3) + (-1)(-3) + (1)(1) = -1 - 3 + 3 + 1 = 0$

      However, although the product P1·P2 is orthogonal to P2, it is not orthogonal to P1:

      $\sum_{i=1}^{4} P_{1i}(P_1P_2)_i = (-3)(-3) + (-1)(1) + (1)(-1) + (3)(3) = 9 - 1 - 1 + 9 = 16$
      $\sum_{i=1}^{4} P_{2i}(P_1P_2)_i = (1)(-3) + (-1)(1) + (-1)(-1) + (1)(3) = -3 - 1 + 1 + 3 = 0$

   b. Every cubic polynomial measure affords a regression line with two bends in the pattern of Y-values across increasing units of X (i.e., of the variable from which the polynomial measure was constructed). Quadratic polynomial measures afford one such bend, and linear polynomial measures afford no bends (i.e., they afford straight-line relations) across X's units. In general, the order of the polynomial term needed will depend on the number of bends with which you theorize that Y will vary across increasing values of X.
      NOTE: Nonlinear relations can be estimated in multiple regression using polynomial terms. Contrasts can be approximated by first subtracting out the variable's mean before raising it to a power of 2, 3, etc.
   c. WARNING: Contrasts listed in tables of orthogonal contrasts are only orthogonal in balanced research designs. Orthogonal polynomial contrasts listed there are only orthogonal in balanced research designs in which consecutive groups have equal distances between them.
      i. It is easy to incorporate unequal distances into the construction of a linear polynomial. (See part F.9 above.) This situation is more complicated when constructing higher-order polynomials, however.
      ii. In unbalanced designs, polynomials' values are commonly attached weights in ways that ensure that they remain contrasts (as we did with weighted effect codes) and that they are orthogonal to each other.
      iii. For our purposes, it suffices to say that it is ALWAYS possible to construct m - 1 orthogonal polynomial contrasts among m groups that can be differentiated according to their values along a single interval- or ratio-level metric.
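The dot-product checks in F.12.a can be reproduced in a few lines. A minimal sketch of my own (numpy assumed), which also shows why the product of the linear and quadratic contrasts fails as a cubic:

```python
import numpy as np

# Four-group polynomial contrasts (from the table above)
p1 = np.array([-3, -1, 1, 3])    # linear
p2 = np.array([1, -1, -1, 1])    # quadratic
p3 = np.array([-1, 3, -3, 1])    # cubic (from a table of orthogonal polynomials)

for a, b, name in [(p1, p2, "P1.P2"), (p1, p3, "P1.P3"), (p2, p3, "P2.P3")]:
    print(name, np.dot(a, b))    # all 0: mutually orthogonal

bad_cubic = p1 * p2              # [-3, 1, -1, 3]: NOT a valid cubic contrast
print(np.dot(p1, bad_cubic))     # 16: not orthogonal to the linear contrast
print(np.dot(p2, bad_cubic))     # 0: orthogonal to the quadratic, but that is not enough
```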
TABLE 1
Summary Table of Researcher-Constructed Variables

   Type of variable                 Zero     Zero correlations   Units?   Constant    Slope                  Form of hypothesis tests
                                    means?   with each other?             estimates   estimates
   Dummy                            No       No                  No       μm          μi - μm                H0: μi = μm
   Effect                           Yes      No                  No       μ           μi - μ                 H0: μi = μ
   Orthogonal contrast              Yes      Yes                 No       μ           rarely interpreted     H0: μg = μh
   Orthogonal polynomial contrast   Yes      Yes                 Yes      μ           bi (interpreted        H0: V and Y have no linear,
                                                                                      or not)                quadratic, etc. relation

   Note: Variables are constructed from a variable, V, with m attributes. These constructed variables are used to explain variance in the dependent variable, Y. In all cases the subscript i ranges from i = 1, ..., m-1. The overall population mean is μ, and the means of populations with successive attributes of V are μ1, μ2, ..., μm, respectively. μg and μh are population means for 2 groups having distinct subsets of V's attributes.

G. As a general observation, note that it is because dummy variables, effect contrasts, orthogonal contrasts, and orthogonal polynomials all yield the same values of $\hat{Y}$ that they all explain the same variance in the dependent variable (i.e., each is a method for "skinning the same cat"). Where they differ is in the interpretations that each affords the researcher.

H. Interaction (a.k.a. moderation) coding

When the effect of one variable on a dependent variable is nonlinear, a measure is (or measures are) needed to estimate the quadratic, cubic, etc. shape of the nonlinear relation. Polynomial measures allow this. When the effects of two variables are nonadditive, a measure (or measures) may be needed to estimate the joint (or multiplicative) effects of the two variables. Interaction measures allow this.

1. What is an interaction?
   a. Technically speaking, interaction occurs when the effect of one variable differs among levels of another variable.
   b. For example, consider the effects of gender and race on income. The variables are ...

      X1 = -1 if Male; 1 if Female
      X2 = -1 if White; 1 if Nonwhite
      Y = annual income in thousands of dollars

   c. Note that there are 4 possible combinations of gender and race: nonwhite female (NF), white female (WF), nonwhite male (NM), and white male (WM).
   d. Let's assume that our sample has 5 subjects within each of these 4 combinations, yielding a total sample size of n = 20. We calculate average income values among subjects with the following results:

      $\bar{Y}_{NF} = 10$   $\bar{Y}_{WF} = 16$   $\bar{Y}_F = 13$
      $\bar{Y}_{NM} = 18$   $\bar{Y}_{WM} = 28$   $\bar{Y}_M = 23$
      $\bar{Y}_N = 14$      $\bar{Y}_W = 22$      $\bar{Y} = 18$

   e. A plot:

      [Plot: mean income (10 to 30) against gender (male = -1, female = 1), with separate points for whites (W) and nonwhites (N); whites lie above nonwhites at each gender, and the race gap is wider among males.]

      Notice that there are 3 "things" going on in this plot:
      - Marginal gender effects: Men earn more than women ($10,000 annually).
      - Marginal race effects: Whites earn more than nonwhites ($8,000 annually).
      - Gender-by-race interaction/moderation: This can be expressed either as "gender differences in income are greater among whites than nonwhites" or, equivalently, "race differences in income are greater among males than among females."

   f. Some data:

      Characteristics    # of subjects (ni)   Gender (X1)   Race (X2)   Dot product (ni·X1·X2)
      Nonwhite female           5                  1            1               5
      White female              5                  1           -1              -5
      Nonwhite male             5                 -1            1              -5
      White male                5                 -1           -1               5
      Sums:                    20                  0            0               0

   g. Note that not only are X1 and X2 contrasts (because they have zero means), but they are orthogonal contrasts as well (because their dot product equals zero). You will recall that this means that X1 and X2 will explain distinct amounts of the variance in income. Moreover, note that when Y is regressed on X1 and X2, the resulting regression equation is as follows:

      $\hat{Y} = 18 - 5X_1 - 4X_2$

      Accordingly, the following are the estimated values for the four combinations of gender and race characteristics:

      $\hat{Y}_{NF} = 18 - 5(1) - 4(1) = 9$, versus $\bar{Y}_{NF} = 10$
      $\hat{Y}_{WF} = 18 - 5(1) - 4(-1) = 17$, versus $\bar{Y}_{WF} = 16$
      $\hat{Y}_{NM} = 18 - 5(-1) - 4(1) = 19$, versus $\bar{Y}_{NM} = 18$
      $\hat{Y}_{WM} = 18 - 5(-1) - 4(-1) = 27$, versus $\bar{Y}_{WM} = 28$

      Note that these estimates are of higher incomes for white women and nonwhite men, and of lower incomes for nonwhite women and white men, than are actually the case (based on the groups' means). These deviations result because at the moment we have only estimated the marginal effects of gender and race, and have yet to estimate their interaction.
   h. Consider the following plot (a sketch reproducing these estimates in code follows):

      [Plot: estimated incomes $\hat{Y}_{NF} = 9$, $\hat{Y}_{WF} = 17$, $\hat{Y}_{NM} = 19$, $\hat{Y}_{WM} = 27$ against gender, with parallel lines for whites and nonwhites: a $10,000 gender difference within each race and an $8,000 race difference within each gender; arrows show each group mean's deviation from its estimate.]

      Among these estimates, notice that gender differences in income are the same ($10,000) for nonwhites and whites. Similarly, race differences in income are the same ($8,000) for females and males. However (as suggested by the arrows in the plot), the estimates differ from the group means in that nonwhite females' and white males' incomes are underestimated, and white females' and nonwhite males' incomes are overestimated.
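A minimal sketch of my own (numpy assumed) that reproduces the marginal-effects equation. Because the predictors are constant within cells and the design is balanced, entering each subject at his or her cell mean yields the same coefficients as the raw data would:

```python
import numpy as np

# One row per subject: 5 subjects in each gender-by-race cell, at the cell mean
y  = np.repeat([10, 16, 18, 28], 5)        # NF, WF, NM, WM mean incomes ($1000s)
x1 = np.repeat([1, 1, -1, -1], 5)          # gender: female = 1, male = -1
x2 = np.repeat([1, -1, 1, -1], 5)          # race: nonwhite = 1, white = -1

X = np.column_stack([np.ones(20), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)          # approximately [18. -5. -4.]
print(X @ coefs)      # cell estimates 9, 17, 19, 27 -- NOT the cell means
```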
      Three thoughts:
      - You can tell that no interaction is estimated in your regression model when the regression lines between Y and one independent variable are the same (i.e., are parallel) among levels of another independent variable. This is evident in the above sketch when you note that, among nonwhites versus whites, the regression lines between income and gender are parallel.
      - The greater the income difference between nonwhite females and white males relative to white females and nonwhite males, the more the effect of gender will differ by race (or, equivalently, the more the effect of race will differ by gender), which is to say the stronger the interaction effect of gender and race on income.
      - The required interaction measure could be constructed with high values for "nonwhite females and white males" but low values for "white females and nonwhite males."

2. Constructing an interaction measure
   a. The concept of interaction is based on the idea that "the whole is greater than the sum of its parts."
      i. If the gender and race variables are considered in isolation, each is binomial and each is associated with a single degree of freedom.
      ii. But when the two are combined into a "whole," a four-level variable is produced (nonwhite female, white female, nonwhite male, white male). There are three degrees of freedom associated with this new variable: one for each of the marginal effects of gender and race, plus one for the gender-by-race interaction.
   b. A general procedure for computing an interaction measure between two variables (i.e., a two-way interaction measure) is to subtract each variable's mean from its respective values and to multiply these differences together. For example, the variable INT could be constructed to estimate the interaction effects of X1 and X2 as follows:

      $INT = (X_1 - \bar{X}_1)(X_2 - \bar{X}_2)$

   c. Returning to the gender-by-race illustration, note that since X1 and X2 are contrasts, $\bar{X}_1 = \bar{X}_2 = 0$ and so $INT = X_1 X_2$. Here are the calculations:

      Characteristics    Gender (X1)   Race (X2)   Gender-by-race interaction (INT = X1·X2)
      Nonwhite female         1            1                    1
      White female            1           -1                   -1
      Nonwhite male          -1            1                   -1
      White male             -1           -1                    1

      Note that this measure has the qualities we seek. That is, it takes high scores for "nonwhite females and white males" and low scores for "white females and nonwhite males." You can also check the dot products to verify that INT is orthogonal to X1 and X2. Given this orthogonality, the regression coefficients associated with X1 and X2 will not change when INT is added into the regression equation.

3. Interpreting coefficients associated with interaction measures
   a. With our data, regressing income on gender, race, and INT produces the following regression model (fitted in the sketch below):

      $\hat{Y} = 18 - 5X_1 - 4X_2 + 1 \cdot INT$

      Note that unlike the previous equation, the Y-estimates from this equation equal the means of each of the four groups. This results as nonwhite females' and white males' estimates are each increased by $1,000 and as white females' and nonwhite males' estimates are decreased by the same amount.
   b. The regression lines would now no longer be parallel but would (as per the sketch of group means) be closer together among females than among males. This is the nonparallel pattern one would expect if the slope associated with INT were positive. Note that if the slope were negative, the lines would be closer among males than among females.
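A minimal sketch of my own (numpy assumed) adding INT to the model. It confirms both the orthogonality claim in 2.c (the marginal slopes do not change) and the claim in 3.a that the full model reproduces the four cell means exactly:

```python
import numpy as np

y    = np.repeat([10, 16, 18, 28], 5)     # NF, WF, NM, WM mean incomes
x1   = np.repeat([1, 1, -1, -1], 5)       # gender: female = 1, male = -1
x2   = np.repeat([1, -1, 1, -1], 5)       # race: nonwhite = 1, white = -1
int_ = x1 * x2                            # interaction (both X's are contrasts)

print(np.dot(int_, x1), np.dot(int_, x2))   # 0, 0: INT is orthogonal to both

X = np.column_stack([np.ones(20), x1, x2, int_])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)        # approximately [18. -5. -4. 1.]: marginal slopes unchanged
print(X @ coefs)    # now exactly the four cell means: 10, 16, 18, 28
```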
      More generally, the sign of the slope associated with an interaction measure indicates the direction in which one's regression lines change from being parallel (i.e., which one shifts counterclockwise as the other shifts clockwise).
   c. Two important pieces of information are needed to determine whether two independent variables have a specific interaction effect on a dependent variable:
      i. The interaction term must explain a significant amount of variation in the dependent variable in addition to the variance explained by the marginal effects of the variables out of which it was constructed. Accordingly (and as already mentioned in our previous discussion of hierarchical models), the following regression model would be misspecified:

         $\hat{Y} = \hat{a} + \hat{b}_1 X_1 + \hat{b}_2 INT$

         The proper significance test is an F-test that compares the R-squareds from the following two regression models:

         Reduced model:  $\hat{Y} = \hat{a} + \hat{b}_1 X_1 + \hat{b}_2 X_2$
         Complete model: $\hat{Y} = \hat{a} + \hat{b}_1 X_1 + \hat{b}_2 X_2 + \hat{b}_3 INT$

      ii. The sign of the slope associated with the interaction term must be consistent with your theory. (Note: Since the just-mentioned F-test is two-tailed, you should perform the test at the 2α significance level and fail to reject if the slope's sign is not as hypothesized.)

4. How does one know when to estimate interactions among variables in addition to the marginal effects of the variables themselves?
   a. The decision to estimate a quadratic, cubic, etc. polynomial relation is easy: Simply obtain a boxplot to see if Y has a nonlinear association with X. Unfortunately, interaction effects cannot be detected in bivariate plots of one's data.
   b. Consider the following full and partial tables:

      Full table:
      Voted for the president?      Democratic   Republican
      Yes                              100          100
      No                               100          100

      Partial tables, by work status:
                                    Unemployed                 Employed
      Voted for the president?   Democratic  Republican   Democratic  Republican
      Yes                            10          90           90          10
      No                             90          10           10          90

      Clearly there is a strong interaction between "work status" (W) and "party voted for" (P), since the associations between "voted for the president" (V) and "party voted for" are reversed between employed and unemployed respondents in this hypothetical survey of US voters. (Note: This is neither distortion nor suppression. At issue here is not that associations between the full and partial tables differ, but that associations differ among the partial tables.)
   c. Maybe we can detect the interaction by looking at a few correlations:

      $r_{VP} = 0$   $r_{VW} = 0$   $r_{PW} = 0$

      Not even partial correlations disclose the existence of an interaction:

      $r_{VP.W} = \dfrac{r_{VP} - r_{VW}r_{PW}}{\sqrt{1 - r_{VW}^2}\sqrt{1 - r_{PW}^2}} = \dfrac{0 - 0}{1} = 0$

   d. In conclusion, you should not look to your data for guidance on where you might find interaction effects. The "place" to look is among your theoretical hunches about why your variables are interrelated. So, for instance, it was only after someone theorized that discrimination is not additive (but that the effects of a second discriminatory status are much less detrimental than those of the first) that researchers in the field of stratification found evidence of a "double negative equals positive" effect regarding the disadvantage of Black women in the United States. (The sketch below reconstructs the voting tables numerically.)
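A minimal sketch of my own (numpy assumed) that rebuilds the hypothetical voting data from the partial tables and confirms that all of the bivariate correlations are zero, even though the interaction measure is strongly related to V:

```python
import numpy as np

# V = voted for the president (1 = yes), P = party (1 = Democratic),
# W = work status (1 = employed); one tuple per table cell with its count.
rows = []
for w, p, v, n in [(0, 1, 1, 10), (0, 0, 1, 90), (0, 1, 0, 90), (0, 0, 0, 10),
                   (1, 1, 1, 90), (1, 0, 1, 10), (1, 1, 0, 10), (1, 0, 0, 90)]:
    rows += [(v, p, w)] * n
V, P, W = np.array(rows).T.astype(float)

print(np.round(np.corrcoef([V, P, W]), 3))   # all off-diagonal r's are 0

# Yet the centered product (the interaction measure) is strongly related to V:
INT = (P - P.mean()) * (W - W.mean())
print(round(np.corrcoef(V, INT)[0, 1], 3))   # 0.8
```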
5. An illustration: One theory in political science argues that citizens of a country will only support aggression toward a foreign power (e.g., Iraq) when the country has both recently suffered a humiliating military defeat (e.g., in Iran) AND has a healthy economy. To test this theory, let's assume that the unit of analysis is the event (in this case, the instance of aggression of one country against another), and that we have data on the following variables:

   PUBOP = public support for aggression within an aggressor nation
   DEFEAT = whether or not the aggressor nation experienced a recent humiliating military defeat
   PCGNP = economic health within the aggressor nation (as measured by per capita GNP)
   INTERACT = an interaction measure constructed from DEFEAT and PCGNP

   a. If you are unsure whether an interaction measure should be included among the independent variables, you should sketch a plot of how your data would look if the theory were correct. For example, note in the following plot how high public support is found only for instances of aggression that followed a recent humiliating military defeat:

      [Plot: PUBOP (low to high) against economic health (PCGNP, low to high); events following a recent humiliating defeat ("d") rise steeply with economic health, while events with no recent defeat ("n") remain low at all levels of economic health.]

   b. Now, imagine that you have measures of PUBOP (a public opinion measure on which high scores mean support for aggression); DEFEAT (1 = had a recent defeat; 2 = had none); PCGNP (per capita gross national product); and $INTERACT = (DEFEAT - \overline{DEFEAT})(PCGNP - \overline{PCGNP})$. You collect data on 100 instances of aggression and access these data using SPSS and the following command:

      regression vars=pubop,defeat,pcgnp,interact/dep=pubop/stepwise.

      In the first step of the stepwise regression procedure, the variable DEFEAT enters the model and the output is (in part) as follows:

      Model Summary: R = .462, R Square = .213
      Predictors: (Constant), DEFEAT

      In the second step, INTERACT enters the model with the following output:

      Model Summary: R = .552, R Square = .305
      Predictors: (Constant), DEFEAT, INTERACT

      There is no third step, because PCGNP does not increment R-square by a significant amount at the .05 level (the default significance level in SPSS).

      Please take note: You should not end your analysis with a regression model that includes INTERACT but excludes PCGNP!!! The regression estimated in the second step is misspecified, because it excludes the lower-order measure, PCGNP. Please remember that your regression models must always be correctly specified. (Look back to our notes on model specification, and the discussion there on hierarchically related regression models.)

      So now imagine that PCGNP is forced into two regressions as follows:

      regression vars=pubop,defeat,pcgnp/dep=pubop/enter.
      regression vars=pubop,defeat,pcgnp,interact/dep=pubop/enter.

      These two commands yield (in part) the following output:

      Model Summary: R = .476, R Square = .226
      Predictors: (Constant), DEFEAT, PCGNP

      and

      Model Summary: R = .579, R Square = .335
      Predictors: (Constant), DEFEAT, PCGNP, INTERACT

      Two pieces of information are needed to evaluate whether there is support for the theory.
      i. First, one must test whether INTERACT explains a significant amount of variance in PUBOP in addition to that explained by DEFEAT and PCGNP. Accordingly, a familiar F-test is called for (subscripts c and r denote the complete and reduced models; the sketch below reproduces this arithmetic):

         $F^{k_c - k_r}_{n - k_c - 1} = \dfrac{(R_c^2 - R_r^2)/(k_c - k_r)}{(1 - R_c^2)/(n - k_c - 1)}$

         $F^{1}_{96} = \dfrac{(.335 - .226)/(3 - 2)}{(1 - .335)/(100 - 3 - 1)} = 15.69$

         Since $15.69 > 2.79 = F^{1}_{60,.10} \ge F^{1}_{96,.10}$, INTERACT's increment to R-square is significant. Note that this significance test used a critical value of F at the .10 significance level, despite the fact that the .05 level of significance was being used.
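A minimal sketch of my own for the incremental F-test, using the R-squareds reported above (because those R-squareds are rounded, the result differs slightly from the 15.69 in the notes, which would follow from SPSS's unrounded values):

```python
# Incremental F-test for INTERACT, comparing reduced and complete models
r2_reduced, r2_complete = 0.226, 0.335   # DEFEAT+PCGNP vs. +INTERACT
n, k_complete, k_reduced = 100, 3, 2

F = ((r2_complete - r2_reduced) / (k_complete - k_reduced)) / \
    ((1 - r2_complete) / (n - k_complete - 1))
print(round(F, 2))   # about 15.7 on 1 and 96 degrees of freedom
```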
         This is because the F-test is inherently a 2-tailed test, whereas our theory specifies a particular way that the slopes between PUBOP and PCGNP should differ among levels of DEFEAT. If the slopes differ in the opposite direction, one would not have support for the theory even if a significant amount of variation had been explained by INTERACT. So we come to the other piece of information needed to evaluate whether or not we have support for our theory.
      ii. Second, one must see whether the sign of the slope associated with INTERACT corresponds to what one would expect if the theory were true. What follows is a "cookbook" for deciding if your theory is consistent with the sign of a 2-way interaction measure's partial slope.

         1. Begin by finding the values that INTERACT takes for various combinations of high and low values on the variables from which it was constructed:

            Measure table                   Healthy economy   No healthy economy
            Recent defeat (DEFEAT = 1)            -                   +
            No recent defeat (DEFEAT = 2)         +                   -

            Note that since $INTERACT = (DEFEAT - \overline{DEFEAT})(PCGNP - \overline{PCGNP})$, it takes positive values in the "+" cells of this table and negative values in the "-" cells. Let's call this 2x2 table our measure table.
         2. Next, plot the parallel lines that would be estimated based on the marginal effects of the variables used in constructing the interaction measure (here, DEFEAT and PCGNP).
         3. On this plot (see below), place arrows at the ends of the lines to indicate the way in which the lines must rotate to fit the pattern suggested by your theory.

            [Plot: PUBOP against economic health (PCGNP), with parallel marginal-effect lines for "recent defeat" and "no recent defeat"; arrows show the recent-defeat line rotating upward at high economic health (and downward at low), and the no-recent-defeat line rotating downward at high economic health (and upward at low).]

         4. Using "up arrows" to indicate where PUBOP estimates should increase, and "down arrows" to indicate where they should decrease, create a theory table in which +'s and -'s correspond respectively to the up- and down-arrows in the plot:

            Theory table                    Healthy economy   No healthy economy
            Recent defeat (DEFEAT = 1)            +                   -
            No recent defeat (DEFEAT = 2)         -                   +

         5. If identical signs are in corresponding cells of these two tables, your theory suggests a positive partial slope between INTERACT and the dependent variable; if opposite signs are in corresponding cells, the theory suggests a negative partial slope. Since the latter situation holds in this illustration, we conclude that a negative partial slope (along with a significantly large F-statistic) would provide evidence in support of the political science theory. Put differently, the hypothesis being tested here is as follows:

            $H_0: b_I \ge 0$
            $H_A: b_I < 0$

            The alternative hypothesis is that the slope is negative, because if it were positive the estimated values of PUBOP would shift from the parallel lines in the above plot in the direction opposite to that in which the arrows are pointing. Why? Well, the parallel lines are those estimated by a regression model that includes only the marginal effects of DEFEAT and PCGNP. When INTERACT is added to the model, the Y-hats from the previous model will be larger when $\hat{b}_I \cdot INTERACT$ is positive (i.e., when a positive number is added to them), but will be smaller when this product is negative (i.e., when a negative number is added).
            If (as hypothesized) $\hat{b}_I$ is negative, $\hat{b}_I \cdot INTERACT$ will be positive when INTERACT is negative (i.e., when an event occurs after a recent defeat and at a time of economic health, or when it occurs neither after a recent defeat nor at a time of economic health), and it will be negative when INTERACT is positive (i.e., when an event occurs after a recent defeat but not at a time of economic health, or when it occurs not after a recent defeat but at a time of economic health). The hypothesized negative relation between INTERACT and PUBOP can thus be sketched as follows:

            [Plot: PUBOP (low to high) declining as INTERACT increases; events combining "defeat + health" or "no defeat + no health" (negative INTERACT) show high support, while "defeat + no health" and "no defeat + health" (positive INTERACT) show low support.]

I. Although illustrations of polynomial and interaction measures in these notes have consistently used independent variables with only a few levels (e.g., three lengths of time subjects are given to memorize words, or two levels distinguishing events that occurred "soon after" versus "not soon after" a humiliating military defeat), in many studies these measures are constructed from finely grained interval- or ratio-level variables. In these cases, polynomial and interaction measures are typically constructed by respectively raising a variable to the desired power (squaring for quadratic, cubing for cubic, etc.) or by multiplying variables together (two variables for a 2-way interaction, three variables for a 3-way interaction, etc.). To reduce collinearity among polynomial measures, or between interaction measures and the variables from which they were constructed, I recommend that you subtract out the means of (i.e., center) your variables before raising them to a specific power or multiplying them together. What follow are some comments on this act of subtracting out a number(s) prior to power-raising or multiplying of variables:

1. Subtracting out numbers does not influence the Y-hat values generated by one's regression models. Consider the following equation with an interaction measure obtained by multiplying centered variables:

   $\hat{Y} = \hat{a} + \hat{b}_1 X_1 + \hat{b}_2 X_2 + \hat{b}_3 (X_1 - \bar{X}_1)(X_2 - \bar{X}_2)$
   $\hat{Y} = \hat{a} + \hat{b}_1 X_1 + \hat{b}_2 X_2 + \hat{b}_3 (X_1X_2 - X_1\bar{X}_2 - \bar{X}_1X_2 + \bar{X}_1\bar{X}_2)$
   $\hat{Y} = (\hat{a} + \hat{b}_3\bar{X}_1\bar{X}_2) + (\hat{b}_1 - \hat{b}_3\bar{X}_2)X_1 + (\hat{b}_2 - \hat{b}_3\bar{X}_1)X_2 + \hat{b}_3 X_1X_2$

   Thus if the estimated equation is instead of the form $\hat{Y} = \hat{a}' + \hat{b}'_1 X_1 + \hat{b}'_2 X_2 + \hat{b}'_3 X_1X_2$ (i.e., with an uncentered product term), you will find that

   $\hat{a}' = \hat{a} + \hat{b}_3\bar{X}_1\bar{X}_2$,  $\hat{b}'_1 = \hat{b}_1 - \hat{b}_3\bar{X}_2$,  $\hat{b}'_2 = \hat{b}_2 - \hat{b}_3\bar{X}_1$,  and  $\hat{b}'_3 = \hat{b}_3$.

   Note that the slope associated with the interaction is the same whether or not the variables were centered while constructing the interaction measure. We shall return to this point in a moment.

2. It is easy to verify that collinearity can be reduced via prior centering. For example, consider a variable, X, that takes values of the integers 1 through 10. If we square X before centering versus after centering, we find that X has a correlation of .97 with the noncentered squared values of itself, but a correlation of zero with its squared-but-centered values.
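The algebra in point 1 can be confirmed numerically. A minimal sketch of my own, using simulated (entirely hypothetical) data and numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
x1, x2 = rng.normal(5, 2, 200), rng.normal(3, 1, 200)
y = 2 + 0.5 * x1 + 1.5 * x2 + 0.8 * x1 * x2 + rng.normal(0, 1, 200)

def ols(*cols):
    X = np.column_stack([np.ones(len(y))] + list(cols))
    return np.linalg.lstsq(X, y, rcond=None)[0]

centered   = ols(x1, x2, (x1 - x1.mean()) * (x2 - x2.mean()))
uncentered = ols(x1, x2, x1 * x2)

a, b1, b2, b3 = centered
print(uncentered)
print([a + b3 * x1.mean() * x2.mean(),   # a'  = a  + b3 * mean(X1) * mean(X2)
       b1 - b3 * x2.mean(),              # b1' = b1 - b3 * mean(X2)
       b2 - b3 * x1.mean(),              # b2' = b2 - b3 * mean(X1)
       b3])                              # b3' = b3: the interaction slope is unchanged
```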
3. Sometimes you may wish to subtract out a theoretically meaningful number to give intuitive meaning to the resulting slope. For example, consider the argument that high school graduates with no post-high school training are paid less because there is too great a supply of them in the labor market. This argument suggests that income is a quadratic function of education, in which people with more or less than a high school education earn more than those with a high school diploma. In this case, it would be reasonable to subtract 12 from the "years of education" measure as the quadratic term is constructed.

4. Finally, you should be aware that it is always possible to construct a polynomial measure that is orthogonal to all its lower-order polynomial measures. Likewise, it is always possible to construct an interaction measure that is orthogonal to all lower-order interactions and marginal measures from which it was constructed. The key here is in one's selection of the numbers subtracted out prior to power-raising or multiplying of variables. By "subtracting out the right constants," you can ensure that your higher-order measure is orthogonal to all lower-order measures associated with it.

The following pages contain output from an SPSS program with which a measure of church attendance (ATTEND) is regressed on RACE (1 = white; 2 = black) and SEX (1 = male; 2 = female). After listing frequencies on these three variables, regression output is provided first for the regression of ATTEND on RACE, SEX, and INT (which was computed simply by multiplying RACE by SEX). Next, output is provided for a parallel regression, except that this time INT was computed as $(race - 1.1475)(sex - 1.575)$. Unlike in the previous regression, this time INT has zero correlations with RACE and SEX (although the latter two variables do have a modest positive correlation of .113). Of most interest in this output are differences in the slopes associated with the marginal effects of RACE and SEX in the two regression models. Which slopes should be believed?

   Independent variables in model    Slope for RACE   Slope for SEX   Slope for INT
   race, sex, int (uncentered)           -.849            -.935           1.139
   race, sex, int (centered)              .945             .372           1.139
   race & sex only                        .942             .369            n/a

The slopes associated with RACE and SEX change signs when one changes the way in which INT is calculated!!! As noted above, the slope associated with INT (as well as its standard error) is not altered when INT is calculated from uncentered or centered RACE and SEX variables. Deciding which slopes to believe should be based on your knowledge that it is always possible to construct an interaction measure that is orthogonal to all lower-order interactions and marginal measures from which it was constructed. If you had constructed an interaction measure that was orthogonal to RACE and SEX, the slopes associated with RACE and SEX would be identical to those obtained in the model that includes RACE and SEX only.

So why not just estimate the marginal effects of RACE and SEX based on a model that simply excludes the interaction measure? Actually, that is what I am recommending that you do. However, you should be careful to alter the standard errors associated with these marginal effects to take into account the reduction in mean squared error from the reduced model from which INT is excluded ($MSE_R$) to the MSE from the complete model that includes INT ($MSE_C$). Recall that the formula for a slope's standard error is as follows:

   $\hat{\sigma}_{\hat{b}_X} = \sqrt{\dfrac{MSE}{SS_X\left(1 - R^2_{X.V_1V_2...V_{k-1}}\right)}}$

   (Here $V_1 ... V_{k-1}$ is a list of all of the model's independent variables other than X.)

Now think about what will change in this formula if an interaction measure is added into this regression model (an interaction measure that is orthogonal to X). Well, $SS_X$ won't change. Nor will $1 - R^2_{X.V_1V_2...V_{k-1}}$, since the orthogonality of X to the added measure (i.e., the interaction) has been ensured. And so it is the MSE that has to be changed here.
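In other words, each reduced-model standard error is simply rescaled by $\sqrt{MSE_C / MSE_R}$. A minimal sketch of my own, using the MSE and standard-error values from the SPSS output presented below (because those inputs are rounded, the second result lands between the .131 and .130 reported in the notes):

```python
import math

# MSEs and reduced-model standard errors from the SPSS output below
mse_c, mse_r = 6.275, 6.310          # complete (with INT) and reduced models
se_race, se_sex = 0.184, 0.131       # SEs from the race-and-sex-only model

# Swapping MSE_R for MSE_C inside the SE formula amounts to scaling
# each standard error by sqrt(MSE_C / MSE_R):
scale = math.sqrt(mse_c / mse_r)
print(se_race * scale, se_sex * scale)   # about .183 and .130-.131
```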
For the output listed below, the corrected standard errors for RACE and SEX are as follows:

   $\hat{\sigma}_{Race} = \sqrt{\dfrac{MSE_C}{SS_{Race}(1 - r^2_{RaceSex})}} = \sqrt{\dfrac{MSE_C}{MSE_R}} \times \sqrt{\dfrac{MSE_R}{SS_{Race}(1 - r^2_{RaceSex})}} = \sqrt{\dfrac{6.275}{6.310}} \times .184 = .183$

   $\hat{\sigma}_{Sex} = \sqrt{\dfrac{MSE_C}{SS_{Sex}(1 - r^2_{RaceSex})}} = \sqrt{\dfrac{MSE_C}{MSE_R}} \times \sqrt{\dfrac{MSE_R}{SS_{Sex}(1 - r^2_{RaceSex})}} = \sqrt{\dfrac{6.275}{6.310}} \times .131 = .130$

Interaction & Collinearity

The program:

select if ((race ne 3)and(wrkstat eq 1)).
frequencies vars=attend,race,sex.
compute int=race*sex .
pearson corr vars=race, sex, int.
regression vars=attend,race,sex,int/dep=attend/enter.
compute k1=1.1475 .
compute k2=1.5750 .
compute int=(race - k1)*(sex - k2).
pearson corr vars=race, sex, int.
regression vars=attend,race,sex,int/dep=attend/enter.
regression vars=attend,race,sex/dep=attend/enter.

ATTEND  HOW OFTEN R ATTENDS RELIGIOUS SERVICES

   Value                 Frequency   Percent   Valid Percent   Cumulative Percent
   0 NEVER                   223       14.4        14.8              14.8
   1 LT ONCE A YEAR          136        8.8         9.0              23.9
   2 ONCE A YEAR             242       15.6        16.1              40.0
   3 SEVRL TIMES A YR        240       15.5        16.0              56.0
   4 ONCE A MONTH            114        7.4         7.6              63.5
   5 2-3X A MONTH            147        9.5         9.8              73.3
   6 NRLY EVERY WEEK          82        5.3         5.5              78.8
   7 EVERY WEEK              226       14.6        15.0              93.8
   8 MORE THN ONCE WK         93        6.0         6.2             100.0
   Valid total              1503       96.9       100.0
   Missing (9 DK,NA)          48        3.1
   Total                    1551      100.0

RACE  RACE OF RESPONDENT

   Value       Frequency   Percent   Valid Percent   Cumulative Percent
   1 WHITE        1317       84.9        84.9              84.9
   2 BLACK         234       15.1        15.1             100.0
   Total          1551      100.0       100.0

SEX  RESPONDENTS SEX

   Value       Frequency   Percent   Valid Percent   Cumulative Percent
   1 MALE          830       53.5        53.5              53.5
   2 FEMALE        721       46.5        46.5             100.0
   Total          1551      100.0       100.0

Correlations (INT computed as race*sex; n = 1551; all off-diagonal 2-tailed significances are .000):

           RACE    SEX     INT
   RACE   1.000   .113    .726
   SEX     .113  1.000    .735
   INT     .726   .735   1.000

Regression (ATTEND on RACE, SEX, and uncentered INT):

   Model Summary: R = .175, R Square = .031, Adjusted R Square = .029, Std. Error of the Estimate = 2.50

   ANOVA:
   Source       SS         df     MS       F        Sig.
   Regression    298.428     3    99.476   15.853   .000
   Residual     9406.111  1499     6.275
   Total        9704.539  1502

   Coefficients (dependent variable: ATTEND):
                 B       Std. Error   Beta    t        Sig.
   (Constant)   3.966      .715               5.548    .000
   RACE         -.849      .610      -.119   -1.391    .164
   SEX          -.935      .444      -.183   -2.107    .035
   INT          1.139      .370       .385    3.076    .002

Correlations (INT computed from centered race and sex; n = 1551):

           RACE    SEX     INT
   RACE   1.000   .113    .000
   SEX     .113  1.000    .000
   INT     .000   .000   1.000

   (The RACE-INT and SEX-INT correlations have 2-tailed significances of .993 and .987, respectively.)

Regression (ATTEND on RACE, SEX, and centered INT):

   Model Summary: R = .175, R Square = .031, Adjusted R Square = .029, Std. Error of the Estimate = 2.50

   ANOVA:
   Source       SS         df     MS       F        Sig.
   Regression    298.428     3    99.476   15.853   .000
   Residual     9406.111  1499     6.275
   Total        9704.539  1502
   Coefficients (dependent variable: ATTEND):
                 B       Std. Error   Beta    t        Sig.
   (Constant)   1.907      .276               6.913    .000
   RACE          .945      .183       .132    5.157    .000
   SEX           .372      .130       .073    2.857    .004
   INT          1.139      .370       .078    3.076    .002

Regression (ATTEND on RACE and SEX only):

   Model Summary: R = .157, R Square = .025, Adjusted R Square = .023, Std. Error of the Estimate = 2.512

   ANOVA:
   Source       SS         df     MS        F        Sig.
   Regression    239.063     2   119.531    18.942   .000
   Residual     9465.476  1500     6.310
   Total        9704.539  1502

   Coefficients (dependent variable: ATTEND):
                 B       Std. Error   Beta    t        Sig.
   (Constant)   1.937      .277               7.006    .000
   RACE          .942      .184       .132    5.126    .000
   SEX           .369      .131       .073    2.827    .005

Class Examples on Dummy Variables and Effect, Orthogonal, & Polynomial Contrasts

The program:

data list records=1
  / words 1-2 d1 4 d2 6 e1 8-9 e2 11-12 c1 14-15 c2 17-18 p1 20-21 p2 23-24.
begin data.
 9 1 0  1  0  1  1  1  1
12 1 0  1  0  1  1  1  1
11 1 0  1  0  1  1  1  1
15 1 0  1  0  1  1  1  1
 8 1 0  1  0  1  1  1  1
10 0 1  0  1  1 -1  0 -2
 8 0 1  0  1  1 -1  0 -2
11 0 1  0  1  1 -1  0 -2
13 0 1  0  1  1 -1  0 -2
 8 0 1  0  1  1 -1  0 -2
 5 0 0 -1 -1 -2  0 -1  1
 6 0 0 -1 -1 -2  0 -1  1
 8 0 0 -1 -1 -2  0 -1  1
 7 0 0 -1 -1 -2  0 -1  1
 4 0 0 -1 -1 -2  0 -1  1
end data.
regression vars=words d1 d2/des=corr/dep=words/enter.
regression vars=words e1 e2/des=corr/dep=words/enter.
regression vars=words c1 c2/des=corr/dep=words/enter.
regression vars=words p1 p2/des=corr/dep=words/enter.

Output from first regression (dummy variables):

   Correlations:
            WORDS    D1      D2
   WORDS   1.000    .484    .242
   D1       .484   1.000   -.500
   D2       .242   -.500   1.000

   Model Summary: R = .740, R Square = .547, Adjusted R Square = .471, Std. Error of the Estimate = 2.20

   ANOVA:
   Source       SS        df    MS       F       Sig.
   Regression    70.000     2   35.000   7.241   .009
   Residual      58.000    12    4.833
   Total        128.000    14

   Coefficients (dependent variable: WORDS):
                 B       Std. Error   Beta   t       Sig.
   (Constant)   6.000      .983              6.103   .000
   D1           5.000     1.390       .807   3.596   .004
   D2           4.000     1.390       .645   2.877   .014

Output from second regression (effect coding):

   Correlations:
            WORDS    E1      E2
   WORDS   1.000    .699    .559
   E1       .699   1.000    .500
   E2       .559    .500   1.000

   Model Summary: R = .740, R Square = .547, Adjusted R Square = .471, Std. Error of the Estimate = 2.20

   ANOVA:
   Source       SS        df    MS       F       Sig.
   Regression    70.000     2   35.000   7.241   .009
   Residual      58.000    12    4.833
   Total        128.000    14

   Coefficients (dependent variable: WORDS):
                 B       Std. Error   Beta   t        Sig.
   (Constant)   9.000      .568              15.855   .000
   E1           2.000      .803       .559    2.491   .028
   E2           1.000      .803       .280    1.246   .237
Output from third regression (orthogonal contrasts):

   Correlations:
            WORDS    C1      C2
   WORDS   1.000    .726    .140
   C1       .726   1.000    .000
   C2       .140    .000   1.000

   Model Summary: R = .740, R Square = .547, Adjusted R Square = .471, Std. Error of the Estimate = 2.20

   ANOVA:
   Source       SS        df    MS       F       Sig.
   Regression    70.000     2   35.000   7.241   .009
   Residual      58.000    12    4.833
   Total        128.000    14

   Coefficients (dependent variable: WORDS):
                 B       Std. Error   Beta   t        Sig.
   (Constant)   9.000      .568              15.855   .000
   C1           1.500      .401       .726    3.737   .003
   C2            .500      .695       .140     .719   .486

Output from fourth regression (orthogonal polynomial contrasts):

   Correlations:
            WORDS    P1      P2
   WORDS   1.000    .699   -.242
   P1       .699   1.000    .000
   P2      -.242    .000   1.000

   Model Summary: R = .740, R Square = .547, Adjusted R Square = .471, Std. Error of the Estimate = 2.20

   ANOVA:
   Source       SS        df    MS       F       Sig.
   Regression    70.000     2   35.000   7.241   .009
   Residual      58.000    12    4.833
   Total        128.000    14

   Coefficients (dependent variable: WORDS):
                 B       Std. Error   Beta    t        Sig.
   (Constant)   9.000      .568               15.855   .000
   P1           2.500      .695       .699     3.596   .004
   P2           -.500      .401      -.242    -1.246   .237
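For readers without SPSS, the four class-example regressions can be replicated in one pass. A minimal sketch of my own (numpy assumed), which confirms that every coding yields the same R-square of .547 (i.e., explains the same $SS_{TREATMENT} = 70$):

```python
import numpy as np

y = np.array([9, 12, 11, 15, 8, 10, 8, 11, 13, 8, 5, 6, 8, 7, 4])
codings = {
    "dummy":      ([1, 0, 0], [0, 1, 0]),
    "effect":     ([1, 0, -1], [0, 1, -1]),
    "orthogonal": ([1, 1, -2], [1, -1, 0]),
    "polynomial": ([1, 0, -1], [1, -2, 1]),
}
for name, (v1, v2) in codings.items():
    X = np.column_stack([np.ones(15), np.repeat(v1, 5), np.repeat(v2, 5)])
    b, res, *_ = np.linalg.lstsq(X, y, rcond=None)   # res = residual SS (58)
    r2 = 1 - res[0] / np.sum((y - y.mean()) ** 2)
    print(f"{name:11s} coefs={np.round(b, 2)}  R^2={r2:.3f}")   # R^2 = .547 for all
```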