Stat 404
Fixing X (by constructing variables that meet the linearity assumption)
A. Consider the following experiment: A list of 15 words is projected on a screen in front of three groups of subjects. Prior to this, subjects in one group (Group #1) were told to "memorize the words for a recall test to be given later." Another group (Group #2) was told to "meditate on the words while relaxing using biofeedback." A third group (Control, Group #3) received no prior instructions.
1. The data (each subject's raw score Y and its deviation from the group mean, $Y - \hat{Y}$):

   Group #1, Rote memorization:   Y = 9, 12, 11, 15, 8;   deviations = -2, 1, 0, 4, -3;   group mean $\bar{Y}_1 = 11$
   Group #2, Bio-memorization:    Y = 10, 8, 11, 13, 8;   deviations = 0, -2, 1, 3, -2;   group mean $\bar{Y}_2 = 10$
   Group #3, No memorization:     Y = 5, 6, 8, 7, 4;      deviations = -1, 0, 2, 1, -2;   group mean $\bar{Y}_3 = 6$

   Overall mean: $\bar{Y} = 9$
2. A plot: [Figure: number of words recalled plotted for each subject by group (1 = rote memorization, 2 = bio-memorization, 3 = no memorization); the vertical axis runs from roughly 5 to 15 words, and numerals beside points mark overlapping observations.]
3. The ANOVA table:

   Source      SS    df    MS     F
   Treatment   70     2    35     7.24
   Error       58    12    4.83
   Total      128    14
4. Note that group differences in this table explain a significant amount of variance at both the .05 and .01 levels of significance, since $F^{2}_{12,.05} = 3.88$ and $F^{2}_{12,.01} = 6.93$.
5. The next few lectures will be considering a variety of ways that independent variables
can be constructed to explain this variance (i.e., the treatment sum of squares of 70 given
in the ANOVA table). SPSS output that summarizes these ways is provided at the end of
this section of your lecture notes.
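The arithmetic behind the ANOVA table in part A.3 can also be checked directly. Below is a minimal sketch in Python (the notes themselves use SPSS); the scores are those listed in part A.1.

import numpy as np

groups = {
    "rote": np.array([9, 12, 11, 15, 8]),
    "bio": np.array([10, 8, 11, 13, 8]),
    "control": np.array([5, 6, 8, 7, 4]),
}
y = np.concatenate(list(groups.values()))
grand_mean = y.mean()                                                                  # 9.0

ss_treatment = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups.values())     # 70.0
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups.values())                   # 58.0
df_treatment, df_error = len(groups) - 1, len(y) - len(groups)                         # 2, 12
f_stat = (ss_treatment / df_treatment) / (ss_error / df_error)                         # about 7.24

print(ss_treatment, ss_error, round(f_stat, 2))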
B. Doing an ANOVA with dummy variables
1. The variables:
$$D_1 = \begin{cases} 1 & \text{if Treatment \#1} \\ 0 & \text{otherwise} \end{cases}
\qquad\text{and}\qquad
D_2 = \begin{cases} 1 & \text{if Treatment \#2} \\ 0 & \text{otherwise} \end{cases}$$
2. The data matrix:
   Group                         Y    D1   D2
   Treatment #1 (rote memory)    9    1    0
                                12    1    0
                                11    1    0
                                15    1    0
                                 8    1    0
   Treatment #2 (bio-memory)    10    0    1
                                 8    0    1
                                11    0    1
                                13    0    1
                                 8    0    1
   Control                       5    0    0
                                 6    0    0
                                 8    0    0
                                 7    0    0
                                 4    0    0
3. The resulting regression equation:

   $\hat{Y} = 6 + 5D_1 + 4D_2$

   Notice that (as you might expect) the estimated Y-value for each group equals the group mean:

   $\hat{Y}_1 = 6 + 5(1) + 4(0) = 11 = \bar{Y}_1$
   $\hat{Y}_2 = 6 + 5(0) + 4(1) = 10 = \bar{Y}_2$
   $\hat{Y}_3 = 6 + 5(0) + 4(0) = 6 = \bar{Y}_3$
4. Interpreting the regression coefficients:
   a. $\hat{a} = 6$: On average, 6 words were recalled by subjects with no prior instructions.
   b. $\hat{b}_1 = 5$: On average, the rote memorization group recalled 5 words more than this.
   c. $\hat{b}_2 = 4$: On average, those using bio-memorization recalled 4 words more than did those in the control group (i.e., those with no prior instructions).
5. Note: When slopes associated with dummy variables are stated in words, you do not refer to the effects of one dummy variable being adjusted for its collinearity with another one (unless the dummy variables are constructed from different nominal-level variables, like race, gender, and religious affiliation).
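As a quick check on the arithmetic above, here is a sketch of the same dummy-variable regression in Python (an illustrative translation, not part of the notes' SPSS output): the intercept recovers the control-group mean and each slope recovers a group's difference from the control group.

import numpy as np

y = np.array([9, 12, 11, 15, 8,    # Treatment #1 (rote memorization)
              10, 8, 11, 13, 8,    # Treatment #2 (bio-memorization)
              5, 6, 8, 7, 4])      # Control
d1 = np.repeat([1, 0, 0], 5)       # 1 for Treatment #1, else 0
d2 = np.repeat([0, 1, 0], 5)       # 1 for Treatment #2, else 0

X = np.column_stack([np.ones(15), d1, d2])
a_hat, b1_hat, b2_hat = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(a_hat, 3), round(b1_hat, 3), round(b2_hat, 3))      # 6.0 5.0 4.0
print(np.round(X @ [a_hat, b1_hat, b2_hat], 3))                 # fitted values: the group means 11, 10, 6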
C. Effect coding
1. The variables:
$$E_1 = \begin{cases} 1 & \text{if Treatment \#1} \\ -1 & \text{if Control group} \\ 0 & \text{otherwise} \end{cases}
\qquad\text{and}\qquad
E_2 = \begin{cases} 1 & \text{if Treatment \#2} \\ -1 & \text{if Control group} \\ 0 & \text{otherwise} \end{cases}$$
2. The data matrix:
   Group                         Y    E1   E2
   Treatment #1 (rote memory)    9    1    0
                                12    1    0
                                11    1    0
                                15    1    0
                                 8    1    0
   Treatment #2 (bio-memory)    10    0    1
                                 8    0    1
                                11    0    1
                                13    0    1
                                 8    0    1
   Control                       5   -1   -1
                                 6   -1   -1
                                 8   -1   -1
                                 7   -1   -1
                                 4   -1   -1
3. Definition: A contrast is a variable having a mean of zero.
a. Every variable can be converted to a contrast by subtracting its mean from each of its
values. Statisticians sometimes refer to such a conversion as centering one’s data
on a variable.
b. Note that unlike dummy variables, effect measures are contrasts.
c. Note also that when all of one's independent variables are contrasts, the constant in one's regression equation is an estimate of the dependent variable's mean. For example, in this case $\hat{a} = \bar{Y} - \hat{b}_1\bar{E}_1 - \hat{b}_2\bar{E}_2 = 9 - 2(0) - 1(0) = 9 = \bar{Y}$.
d. Whenever the constant in a regression model estimates the mean, slopes associated with contrasts can be described as deviations from the overall mean.
4. The resulting regression equation:

   $\hat{Y} = 9 + 2E_1 + 1E_2$

   Again note that the estimated Y-value for each group equals the group mean:

   $\hat{Y}_1 = 9 + 2(1) + 1(0) = 11 = \bar{Y}_1$
   $\hat{Y}_2 = 9 + 2(0) + 1(1) = 10 = \bar{Y}_2$
   $\hat{Y}_3 = 9 + 2(-1) + 1(-1) = 6 = \bar{Y}_3$
5. Interpreting the regression coefficients:
   a. $\hat{a} = 9$: Overall, 9 words were recalled on average.
   b. $\hat{b}_1 = 2$: On average, the rote memorization group recalled 2 words more than this.
   c. $\hat{b}_2 = 1$: On average, those using bio-memorization recalled 1 word more than the overall average.
6. Note that the "effect" of the control group (i.e., the deviation of the mean of the control group from the overall mean) equals $-\sum_{i=1}^{2}\hat{b}_i = -(2 + 1) = -3$. After obtaining this third effect, notice that the 3 effects sum to zero.
7. In the U.S. we have a folk saying, "There are many ways to skin a cat," meaning that there are a variety of ways to do some things. In these notes you are learning a variety of ways to explain the variance among the 2 treatment groups and the control group (i.e., to explain $SS_{TREATMENT} = 70$). At this point we have considered 2 ways to "skin this cat."
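A parallel sketch for effect coding (again in Python, purely for illustration): the intercept is now the overall mean, the slopes are the treatment groups' deviations from it, and the control group's effect is minus their sum.

import numpy as np

y = np.array([9, 12, 11, 15, 8, 10, 8, 11, 13, 8, 5, 6, 8, 7, 4])
e1 = np.repeat([1, 0, -1], 5)      # 1 for Treatment #1, -1 for Control
e2 = np.repeat([0, 1, -1], 5)      # 1 for Treatment #2, -1 for Control

X = np.column_stack([np.ones(15), e1, e2])
a_hat, b1_hat, b2_hat = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(a_hat, 3), round(b1_hat, 3), round(b2_hat, 3))   # 9.0 2.0 1.0
print(round(-(b1_hat + b2_hat), 3))                          # -3.0, the control group's effect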
D. Weighted effect coding
1. A potential problem: Effect coding (as described above) only yields contrasts in balanced
experimental designs (i.e., when each of the groups being compared is of the same size
[e.g., in this case 5 subjects in each group]). Weighted effect coding is needed when
group sizes are unequal.
2. The variables (assuming a 3-level nominal-level variable with $n_i$ units of analysis in the $i$th group, for i = 1, 2, 3):

$$EW_1 = \begin{cases} 1 & \text{if Treatment \#1} \\ -\dfrac{n_1}{n_3} & \text{if Control group} \\ 0 & \text{otherwise} \end{cases}
\qquad\text{and}\qquad
EW_2 = \begin{cases} 1 & \text{if Treatment \#2} \\ -\dfrac{n_2}{n_3} & \text{if Control group} \\ 0 & \text{otherwise} \end{cases}$$
3. The data matrix (an altered version of the previous data matrix in which the overall mean and the treatment means are the same, but the mean of the control group is $\bar{Y}_3 = \frac{29}{5} = 5.8$):

   Group                         Y    EW1    EW2
   Treatment #1 (rote memory)    9    1      0
                                12    1      0
                                11    1      0
                                15    1      0
                                 8    1      0
                                11    1      0
   Treatment #2 (bio-memory)     8    0      1
                                11    0      1
                                13    0      1
                                 8    0      1
   Control                       5   -1.2   -.8
                                 6   -1.2   -.8
                                 7   -1.2   -.8
                                 7   -1.2   -.8
                                 4   -1.2   -.8

   Here $n_1 = 6$, $n_2 = 4$, and $n_3 = 5$, so $-n_1/n_3 = -1.2$ and $-n_2/n_3 = -.8$.
4. Notice how weighted effect coding leaves each effect code a contrast:

   $\overline{EW}_1 = \dfrac{6(1) + 5(-1.2)}{15} = 0 \qquad\text{and}\qquad \overline{EW}_2 = \dfrac{4(1) + 5(-.8)}{15} = 0$
5. With the changes made to the data matrix, the regression equation is identical to the
previous one. Moreover, interpretations of regression coefficients are identical to those
in the previous equation as well.
6. However, the effect of the control group is different:

   $-\sum_{i=1}^{2}\dfrac{n_i}{n_3}\hat{b}_i = -[1.2(2) + .8(1)] = -(2.4 + .8) = -3.2$,

   which is the deviation of the control group's mean from the overall mean (i.e., 9 - 3.2 = 5.8).
7. Weighted effects sum to zero:
   a. In a balanced design one need only sum each group's effects (i.e., each group mean's deviation from the overall mean).
   b. In an unbalanced design one must weight each group's effect by its size before summing. Thus if among m unequal groups the $i$th group's size is $n_i$ and its effect is $\hat{b}_i$, then the weighted effects sum (as illustrated using the above data) to zero as follows:

      $\sum_{i=1}^{m} n_i\hat{b}_i = 6(2) + 4(1) + 5(-3.2) = 12 + 4 - 16 = 0$
E. Orthogonal contrasts
1. As already mentioned during our discussion of principal components, when two
independent variables are constructed by the researcher in such a way that they explain
distinct amounts of variance in any dependent variable, they are said to be orthogonal to
each other.
2. The key advantages of orthogonal independent variables are that their effects will not be
confounded with each other (i.e., they will each explain a distinct amount of the
dependent variable’s variance) and that their slopes can be interpreted without concern
for their having been adjusted for such confounding.
3. Note that orthogonality implies an absence of multicollinearity. Yet the two involve the
researcher in different ways.
a. On the one hand, the researcher may actively assign sets of values to categories of his/her independent variable such that each set is orthogonal to every other one.
b. On the other hand, the researcher may passively acknowledge meaningful multicollinearity among measures of various concepts.
c. A third possibility is to passively describe orthogonal dimensions among collinear measures of the same concepts, as we did with measures combined using principal component analysis.
4. The dot product is a technique to test for orthogonality among contrasts. If the dot product between a pair of variables equals zero, they are orthogonal. The formula for a dot product is as follows:

   $\sum_{i=1}^{m} n_i C_{1i} C_{2i}$

   where $n_i$ = the number of units of analysis in the $i$th group,
         $C_{ji}$ = the value of the $j$th contrast for all units of analysis in the $i$th group, and
         $m$ = the total number of groups.
5. The variables:
$$C_1 = \begin{cases} 1 & \text{if Treatment \#1} \\ 1 & \text{if Treatment \#2} \\ -2 & \text{if Control group} \end{cases}
\qquad\text{and}\qquad
C_2 = \begin{cases} 1 & \text{if Treatment \#1} \\ -1 & \text{if Treatment \#2} \\ 0 & \text{if Control group} \end{cases}$$
Notice how the first variable “contrasts” the treatment groups from the control group,
whereas the second variable “contrasts” the first (rote memorization) from the second
(bio-memorization) treatment group.
6. To see if the contrasts are orthogonal (and recalling that $n_i = 5$ for each group), we compute their dot product as follows:

   $\sum_{i=1}^{m} n_i C_{1i} C_{2i} = 5(1)(1) + 5(1)(-1) + 5(-2)(0) = 5 - 5 + 0 = 0$

   Since the dot product equals zero, the two contrasts are orthogonal.
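The dot-product check lends itself to a tiny helper function. The sketch below (Python; the function name is mine, not from the notes) reproduces the computation just shown, and also shows why the test is meant for contrasts: the effect codes E1 and E2, which are contrasts but not orthogonal, give a nonzero dot product.

import numpy as np

def dot_product(n, c1, c2):
    """Group-weighted dot product: sum over groups of n_i * C1_i * C2_i."""
    n, c1, c2 = np.asarray(n), np.asarray(c1), np.asarray(c2)
    return float((n * c1 * c2).sum())

n = [5, 5, 5]                                       # equal group sizes
print(dot_product(n, [1, 1, -2], [1, -1, 0]))       # C1 vs C2: 0.0, so orthogonal
print(dot_product(n, [1, 0, -1], [0, 1, -1]))       # E1 vs E2: 5.0, so not orthogonal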
7. The data matrix:

   Group                         Y    C1   C2
   Treatment #1 (rote memory)    9    1    1
                                12    1    1
                                11    1    1
                                15    1    1
                                 8    1    1
   Treatment #2 (bio-memory)    10    1   -1
                                 8    1   -1
                                11    1   -1
                                13    1   -1
                                 8    1   -1
   Control                       5   -2    0
                                 6   -2    0
                                 8   -2    0
                                 7   -2    0
                                 4   -2    0
8. The resulting regression equation:

   $\hat{Y} = 9 + 1.5C_1 + 0.5C_2$

   Once again, the estimated Y-value for each group equals the group's mean:

   $\hat{Y}_1 = 9 + 1.5(1) + 0.5(1) = 11 = \bar{Y}_1$
   $\hat{Y}_2 = 9 + 1.5(1) + 0.5(-1) = 10 = \bar{Y}_2$
   $\hat{Y}_3 = 9 + 1.5(-2) + 0.5(0) = 6 = \bar{Y}_3$
9. Regression coefficients associated with orthogonal contrasts are rarely interpreted, since
they are typically used to test for differences in group means (and nothing more).
Nonetheless, here’s a “valiant attempt” at interpreting them:
a. $\hat{a} = 9$: Overall, subjects recalled 9 words on average. (No problem here. When estimating contrast effects, the constant always has this meaning.)
b. $\hat{b}_1 = 1.5$: This is the amount by which $\dfrac{(\bar{Y}_1 - \bar{Y}) + (\bar{Y}_2 - \bar{Y})}{2}$ and $\dfrac{\bar{Y}_3 - \bar{Y}}{2}$ differ in opposite directions from zero. Or maybe, "the two treatment groups remembered an average of 1.5 words more than the overall average."
c. $\hat{b}_2 = 0.5$: This is the amount by which $(\bar{Y}_1 - \bar{Y}_t)$ and $(\bar{Y}_2 - \bar{Y}_t)$ differ in opposite directions from zero, where $\bar{Y}_t$ is the average words recalled among subjects in the 2 treatment groups only (i.e., without averaging in the control subjects' words recalled). Or maybe, "on average the rote memorization group remembered 0.5 words more than the average number of words remembered among subjects who used either rote memorization or bio-memorization."
10. Finally, please note that we have now accounted for the same variance in yet a different
way (i.e., we have “skinned the cat” differently again).
F. Orthogonal polynomial contrasts—a particular kind of orthogonal contrast
1. Polynomial measures are used when one can speak of “units of distance” between
adjacent groups to which you are assigning X-values. Accordingly, the experiment (as
we have described it thus far) does not lend itself to the development of polynomial
measures, since there are no common units in terms of which the 3 groups can be
compared.
2. So let's reconceptualize our experiment such that our groups differ according to how long subjects were given to memorize the words. Let's imagine that each group was instructed to do rote memorization, but that ...
   a. the group previously called the Control group was given 5 minutes to memorize the words,
   b. the (previously bio-memorization) group was given 10 minutes to memorize the words, and
   c. the (previously rote memorization) group was given 15 minutes to memorize the words.
3. Now consider the following variables:
$$P_1 = \begin{cases} 1 & \text{if 15 min.} \\ 0 & \text{if 10 min.} \\ -1 & \text{if 5 min.} \end{cases}
\qquad\text{and}\qquad
P_2 = \begin{cases} 1 & \text{if 15 min.} \\ -2 & \text{if 10 min.} \\ 1 & \text{if 5 min.} \end{cases}$$
Notice how the first variable measures memorization-time linearly in 5-minute intervals,
whereas the second variable measures quadratic variation across time.
4. Again note that the contrasts are orthogonal when we compute their dot product as follows:

   $\sum_{i=1}^{m} n_i P_{1i} P_{2i} = 5(1)(1) + 5(0)(-2) + 5(-1)(1) = 5 + 0 - 5 = 0$

   Since the dot product equals zero, the two polynomial contrasts are orthogonal.
5. The data matrix:

   Group                         Y    P1   P2
   15 minutes (rote memory)      9    1    1
                                12    1    1
                                11    1    1
                                15    1    1
                                 8    1    1
   10 minutes (rote memory)     10    0   -2
                                 8    0   -2
                                11    0   -2
                                13    0   -2
                                 8    0   -2
   5 minutes (rote memory)       5   -1    1
                                 6   -1    1
                                 8   -1    1
                                 7   -1    1
                                 4   -1    1
5. The resulting regression equation:

   $\hat{Y} = 9 + 2.5P_1 - 0.5P_2$

   And again, the estimated Y-value for each group equals the group's mean:

   $\hat{Y}_1 = 9 + 2.5(1) - 0.5(1) = 11 = \bar{Y}_1$
   $\hat{Y}_2 = 9 + 2.5(0) - 0.5(-2) = 10 = \bar{Y}_2$
   $\hat{Y}_3 = 9 + 2.5(-1) - 0.5(1) = 6 = \bar{Y}_3$

   Accordingly, we have "skinned the cat" a fourth time (i.e., we have explained the same between-group variance, $SS_{TREATMENT} = 70$, once again).
6. The constant and slope coefficients associated with these polynomial contrasts can be interpreted as follows:
   a. $\hat{a} = 9$ (The constant is interpreted the same way as for all regressions with only contrasts as independent variables.): Overall, subjects recalled 9 words on average.
   b. $\hat{b}_1 = 2.5$ (This slope's interpretation should sound familiar.): There is an increase of 2.5 words remembered by respondents for each additional 5 minutes they are given to memorize the 15 words.
      NOTE: This slope corresponds, of course, to the linear relation between two interval-level variables. When taken in conjunction with quadratic, cubic, etc. relations between variables, it is not surprising that some social scientists (psychologists, in particular) tend to argue that linear regression is a special case of analysis of variance, namely the case in which each independent variable is a linear polynomial.
   c. $\hat{b}_2 = -0.5$: Note that the quadratic polynomial measure contrasts values on the dependent variable for the extremes (i.e., having 5 or 15 minutes of memorization time) versus those for the middle (i.e., with 10 minutes for memorization). Here the extremes have a half-word less recall than one would estimate based on the linear relation between time for memorization and words memorized.
6. A plot: [Figure: number of words recalled plotted against time for memorization (5, 10, and 15 minutes), with the corresponding (P1, P2) values (-1, 1), (0, -2), and (1, 1) marked beneath each time point and the slopes $\hat{b}_1 = 2.5$ and $\hat{b}_2 = -0.5$ indicated.]
7. Unlike other contrasts, polynomials take into account distances between the groups being compared. That is, the assignment of polynomial values to groups assumes that these groups differ on an interval (or ratio) scale. For example, imagine that the times for memorization were 5 minutes, 10 minutes, and 30 minutes. In this case, the polynomial contrast $P_1 = \begin{cases} -2 & \text{if 5 min.} \\ -1 & \text{if 10 min.} \\ 3 & \text{if 30 min.} \end{cases}$ could be used. If we obtained the same data as before, a revised plot might look as follows:
[Figure: number of words recalled plotted against time for memorization at 5, 10, and 30 minutes, with the corresponding P1 values (-2, -1, and 3) marked on the horizontal axis.]
a. Note that the values of P1 preserve the relative distances among 5, 10, and 30.
b. Moreover, unlike the "time for memorization" variable, P1 is a contrast:

   $\sum_{i=1}^{3} n_i P_{1i} = 5(-2) + 5(-1) + 5(3) = -10 - 5 + 15 = 0$
c. AN ASIDE: You may have noticed some similarities between the contrasts and the (equally-spaced) polynomials we have considered:

   $$P_1 = \begin{cases} 1 & \text{if 15 min.} \\ 0 & \text{if 10 min.} \\ -1 & \text{if 5 min.} \end{cases} \qquad
   P_2 = \begin{cases} 1 & \text{if 15 min.} \\ -2 & \text{if 10 min.} \\ 1 & \text{if 5 min.} \end{cases}$$

   and

   $$C_1 = \begin{cases} 1 & \text{if Treatment \#1} \\ 1 & \text{if Treatment \#2} \\ -2 & \text{if Control group} \end{cases} \qquad
   C_2 = \begin{cases} 1 & \text{if Treatment \#1} \\ -1 & \text{if Treatment \#2} \\ 0 & \text{if Control group} \end{cases}$$

   In particular, $P_1$ and $C_2$ are the same, as are $P_2$ and $C_1$, once values for the second and third groups are reversed. However, note that this similarity ends with 4 or more groups.
d. In the 4-group case, polynomial and nonpolynomial contrasts might look as follows:

   $$P_1 = \begin{cases} -3 & \text{if 5 min.} \\ -1 & \text{if 10 min.} \\ 1 & \text{if 15 min.} \\ 3 & \text{if 20 min.} \end{cases} \qquad
   P_2 = \begin{cases} 1 & \text{if 5 min.} \\ -1 & \text{if 10 min.} \\ -1 & \text{if 15 min.} \\ 1 & \text{if 20 min.} \end{cases} \qquad
   P_3 = \begin{cases} -1 & \text{if 5 min.} \\ 3 & \text{if 10 min.} \\ -3 & \text{if 15 min.} \\ 1 & \text{if 20 min.} \end{cases}$$

   and

   $$C_1 = \begin{cases} -1 & \text{if Treat. \#1} \\ -1 & \text{if Treat. \#2} \\ 1 & \text{if Treat. \#3} \\ 1 & \text{if Treat. \#4} \end{cases} \qquad
   C_2 = \begin{cases} -1 & \text{if Treat. \#1} \\ 1 & \text{if Treat. \#2} \\ -1 & \text{if Treat. \#3} \\ 1 & \text{if Treat. \#4} \end{cases} \qquad
   C_3 = \begin{cases} 1 & \text{if Treat. \#1} \\ -1 & \text{if Treat. \#2} \\ -1 & \text{if Treat. \#3} \\ 1 & \text{if Treat. \#4} \end{cases}$$

   Notice how linear (no bend), quadratic (one bend), and cubic (two bends in the regression line) trends in Y-values across increasing numbers of minutes can be modeled via variations in $P_1$, $P_2$, and $P_3$ respectively. Yet with $C_1$, mean values of Y are compared between the first 2 versus last 2 treatments. With $C_2$, mean comparisons are made between the 1st and 3rd versus the 2nd and 4th treatments. And with $C_3$, mean comparisons are made between the 1st and 4th versus the 2nd and 3rd treatments.
8. How to construct a linear contrast:
   a. Consider the case in which four equal-sized groups have equal "distances" between adjacent pairs. Let this distance be "h" and let the value of the lowest group be "ℓ + h."
   b. It follows that values for the four groups equal ℓ + h, ℓ + 2h, ℓ + 3h, and ℓ + 4h.
      Note: The h's trace a linear function across 4 equally-spaced intervals. The ℓ places this set of intervals at a specific point on the number line.
   c. Since we want a contrast, these values should sum to zero. That is, 4ℓ + 10h = 0. And so, 2ℓ = -5h.
   d. At this point we can choose any value for "h" and the value of "ℓ" will be set. So let's choose h = 2, thereby setting ℓ = -5. The resulting linear contrast is as follows:

      -5 + 2 = -3
      -5 + 4 = -1
      -5 + 6 =  1
      -5 + 8 =  3

      Note that this contrast increases linearly in jumps of h = 2 units from one group to the next. It is a contrast because the group values sum to zero and each of the 4 groups is of the same size.
9. A quadratic contrast is found by squaring a linear contrast and then centering it (i.e., by then creating a contrast via subtracting out the mean of the squared values):

            P1    P1^2   P1^2 - mean(P1^2)    P2    P1×P2    P3
            -3      9       9 - 5 =  4         1      -3     -1
            -1      1       1 - 5 = -4        -1       1      3
             1      1       1 - 5 = -4        -1      -1     -3
             3      9       9 - 5 =  4         1       3      1
   Sums:     0     20                0         0       0      0
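The recipe in parts 8 and 9 is easy to automate. The sketch below (Python, illustrative only) builds the linear contrast for four equally spaced, equal-sized groups from a chosen step h, squares and centers it to obtain a quadratic contrast, and confirms that the two are orthogonal.

import numpy as np

h, m = 2, 4                           # chosen step size and number of groups
ell = -h * (m + 1) / 2                # solves m*ell + h*(1 + 2 + ... + m) = 0, so ell = -5
p1 = ell + h * np.arange(1, m + 1)    # [-3, -1, 1, 3], the linear contrast

p1_squared = p1 ** 2                         # [9, 1, 1, 9]
quadratic = p1_squared - p1_squared.mean()   # [4, -4, -4, 4], proportional to P2 = [1, -1, -1, 1]

print(p1, quadratic)
print(p1.sum(), quadratic.sum())      # both 0: each is a contrast
print((p1 * quadratic).sum())         # 0: the linear and quadratic contrasts are orthogonal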
10. Cubic and higher-order orthogonal contrasts are most easily found in tables of orthogonal polynomials (usually at the ends of advanced analysis of variance texts). For example, the last column in the above table lists the cubic polynomial contrast for the 4-group case.
    a. A word of caution: Cubic contrasts are not found by raising a linear contrast to the third power or by multiplying linear and quadratic contrasts (as with the P1×P2 column in the table above). For example, you will note that the cubic polynomial, $P_3$, is orthogonal to both $P_1$ and $P_2$ (which, as promised, are themselves orthogonal to each other):

       $\sum_{i=1}^{4} P_{1i}P_{2i} = (-3)(1) + (-1)(-1) + (1)(-1) + (3)(1) = -3 + 1 - 1 + 3 = 0$
       $\sum_{i=1}^{4} P_{1i}P_{3i} = (-3)(-1) + (-1)(3) + (1)(-3) + (3)(1) = 3 - 3 - 3 + 3 = 0$
       $\sum_{i=1}^{4} P_{2i}P_{3i} = (1)(-1) + (-1)(3) + (-1)(-3) + (1)(1) = -1 - 3 + 3 + 1 = 0$

       However, although the product $P_1 \times P_2$ is orthogonal to $P_2$, it is not orthogonal to $P_1$:

       $\sum_{i=1}^{4} P_{1i}(P_1P_2)_i = (-3)(-3) + (-1)(1) + (1)(-1) + (3)(3) = 9 - 1 - 1 + 9 = 16$
       $\sum_{i=1}^{4} P_{2i}(P_1P_2)_i = (1)(-3) + (-1)(1) + (-1)(-1) + (1)(3) = -3 - 1 + 1 + 3 = 0$
b. Every cubic polynomial measure affords a regression line with two bends in the pattern of Y-values across increasing units of X (i.e., of the variable from which the polynomial measure was constructed). Quadratic polynomial measures afford one such bend, and linear polynomial measures afford no bends (i.e., they afford straight-line relations) across X's units. In general, the order of the polynomial term needed will depend on the number of bends you theorize in how Y varies with increasing values of X.
   NOTE: Nonlinear relations can be estimated in multiple regression using polynomial terms. Contrasts can be approximated by first subtracting out the variable's mean before raising it to a power of 2, 3, etc.
c. WARNING: Contrasts listed in tables of orthogonal contrasts are only orthogonal in
balanced research designs. Orthogonal polynomial contrasts listed there are only
orthogonal in balanced research designs in which consecutive groups have equal
distances between them.
i. It is easy to incorporate unequal distances into the construction of a linear polynomial. (See part F.7. above.) This situation is more complicated when constructing higher-order polynomials, however.
ii. In unbalanced designs, polynomials’ values are commonly attached weights in
ways that ensure that they remain contrasts (as we did with weighted effect codes)
and that they are orthogonal to each other.
iii. For our purposes, it suffices to say that it is ALWAYS possible to construct “m-1”
orthogonal polynomial contrasts among “m” groups that can be differentiated
according to their values along a single interval- or ratio-level metric.
TABLE 1
Summary Table of Researcher-Constructed Variables

   Type of Variable                 Zero      Zero correlations   Units?   Constant     Slope               Form of
                                    means?    with each other?             estimates    estimates           hypothesis tests
   Dummy                            No        No                  No       μm           μi − μm             H0: μi = μm
   Effect                           Yes       No                  No       μ            μi − μ              H0: μi = μ
   Orthogonal Contrast              Yes       Yes                 No       μ            rarely interpreted  H0: μg = μh
   Orthogonal Polynomial Contrast   Yes       Yes                 Yes      μ            bi (interpreted     H0: V and Y have no linear,
                                                                                        or not)             quadratic, etc. relation

Note: Variables are constructed from a variable, V, with m attributes. These constructed variables are used to explain variance in the dependent variable, Y. In all cases the subscript i ranges from i = 1, ..., m−1. The overall population mean is μ, and the means of populations with successive attributes of V are μ1, μ2, ..., μm respectively. μg and μh are population means for 2 groups having distinct subsets of V's attributes.
G. As a general observation, note that it is because dummy variables, effect contrasts, orthogonal contrasts, and orthogonal polynomials all yield the same values of $\hat{Y}$ that they all explain the same variance in the dependent variable (i.e., that each is a method for "skinning the same cat"). Where they differ is in the interpretations that each affords to the researcher.
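Point G can also be demonstrated numerically. The sketch below (Python, illustrative only) fits the recall data with each of the four codings and shows that every one of them yields the same fitted group values (11, 10, 6) and the same 70 units of explained treatment variance.

import numpy as np

y = np.array([9, 12, 11, 15, 8, 10, 8, 11, 13, 8, 5, 6, 8, 7, 4])
codings = {
    "dummy":               ([1, 0, 0], [0, 1, 0]),
    "effect":              ([1, 0, -1], [0, 1, -1]),
    "orthogonal contrast": ([1, 1, -2], [1, -1, 0]),
    "polynomial contrast": ([1, 0, -1], [1, -2, 1]),
}

for name, (v1, v2) in codings.items():
    X = np.column_stack([np.ones(15), np.repeat(v1, 5), np.repeat(v2, 5)])
    y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    ss_explained = ((y_hat - y.mean()) ** 2).sum()
    # one fitted value per group, plus the explained (treatment) sum of squares
    print(name, np.round(y_hat[::5], 1), round(ss_explained, 1))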
H. Interaction (a.k.a. moderation) coding
When the effect of one variable on a dependent variable is nonlinear, a measure is (or
measures are) needed to estimate the quadratic, cubic, etc. shape of the nonlinear relation.
Polynomial measures allow this. When the effects of two variables are nonadditive, a
measure (or measures) may be needed to estimate the joint (or multiplicative) effects of the
two variables. Interaction measures allow this.
1. What is an interaction?
a. Technically speaking, interaction occurs when the effect of one variable differs
among levels of another variable.
b. For example, consider the effects of gender and race on income. The variables are ...

   $$X_1 = \begin{cases} -1 & \text{if Male} \\ 1 & \text{if Female} \end{cases} \qquad
   X_2 = \begin{cases} -1 & \text{if White} \\ 1 & \text{if Nonwhite} \end{cases}$$

   Y = annual income in thousands of dollars
c. Note that there are 4 possible combinations among gender and race: Nonwhite Female (NF), White Female (WF), Nonwhite Male (NM), and White Male (WM).
d. Let's assume that our sample is of 5 subjects within each of these 4 combinations, yielding a total sample size of n = 20. We calculate average income values among subjects with the following results:

                     Nonwhite          White             (row mean)
   Female            Ȳ(NF) = 10        Ȳ(WF) = 16        Ȳ(F) = 13
   Male              Ȳ(NM) = 18        Ȳ(WM) = 28        Ȳ(M) = 23
   (column mean)     Ȳ(N) = 14         Ȳ(W) = 22         Ȳ = 18
e. A plot: [Figure: mean income (in $1,000s, vertical axis from 0 to 30) plotted against gender (X1 = -1 for Male, +1 for Female), with separate lines for whites (W = -1) and nonwhites (N = +1).]
Notice that there are 3 "things" going on in this plot:
   • Marginal gender effects: Men earn more than women ($10,000 annually).
   • Marginal race effects: Whites earn more than nonwhites ($8,000 annually).
   • Gender-by-race interaction/moderation: This can be expressed either as "gender differences in income are greater among whites than nonwhites" or, equivalently, "race differences in income are greater among males than females."
f. Some data:

   Characteristics     # of Subjects (ni)   Gender (X1)   Race (X2)   Dot Product (ni·X1·X2)
   Nonwhite Female             5                 1            1                5
   White Female                5                 1           -1               -5
   Nonwhite Male               5                -1            1               -5
   White Male                  5                -1           -1                5
   Sums:                      20                 0            0                0
g. Note that not only are X1 and X2 contrasts (because they have zero means), but they are orthogonal contrasts as well (because their dot product equals zero). You will recall that this means that X1 and X2 will explain distinct amounts of the variance in income. Moreover, note that when Y is regressed on X1 and X2, the resulting regression equation is as follows:

   $\hat{Y} = 18 - 5X_1 - 4X_2$

   Accordingly, the following are the estimated values for the four combinations of gender and race characteristics:

   $\hat{Y}_{NF} = 18 - 5(1) - 4(1) = 9 \ne \bar{Y}_{NF} = 10$
   $\hat{Y}_{WF} = 18 - 5(1) - 4(-1) = 17 \ne \bar{Y}_{WF} = 16$
   $\hat{Y}_{NM} = 18 - 5(-1) - 4(1) = 19 \ne \bar{Y}_{NM} = 18$
   $\hat{Y}_{WM} = 18 - 5(-1) - 4(-1) = 27 \ne \bar{Y}_{WM} = 28$

   Note that these estimates are of higher incomes for white women and nonwhite men and of lower incomes for nonwhite women and white men than are actually the case (based on the groups' means). These deviations result because at the moment we have only estimated the marginal effects of gender and race, and have yet to estimate their interaction.
h. Consider the following plot: [Figure: the estimated incomes from the additive model, $\hat{Y}_{NF} = 9$, $\hat{Y}_{WF} = 17$, $\hat{Y}_{NM} = 19$, and $\hat{Y}_{WM} = 27$, shown as two parallel lines (whites and nonwhites) across gender; the within-race gender difference is $10,000 and the within-gender race difference is $8,000.]
Among these estimates, notice that gender differences in income are the same ($10,000) for nonwhites and whites. Similarly, race differences in income are the same ($8,000) for females and males. However (as suggested by the arrows in the plot), the estimates differ from the group means in that nonwhite females' and white males' incomes are underestimated, and white females' and nonwhite males' incomes are overestimated. Three thoughts:
   • You can tell that no interaction is estimated in your regression model when regression lines between Y and one independent variable are the same (i.e., are parallel) among levels of another independent variable. This is evident in the above sketch when you note, among nonwhites versus whites, that the regression lines between income and gender are parallel.
   • The greater the income difference between nonwhite females and white males relative to white females and nonwhite males, the more the effect of gender will differ by race (or, equivalently, the more the effect of race will differ by gender), which is to say the stronger the interaction effect of gender and race on income.
   • The required interaction measure could be constructed with high values for "nonwhite females and white males" but low values for "white females and nonwhite males."
2. Constructing an interaction measure
a. The concept of interaction is based on the idea that “the whole is greater than the sum
of its parts.”
i. If the gender and race variables are considered in isolation, each is binomial and
each is associated with a single degree of freedom.
ii. But when the two are combined into a “whole,” a four-level variable is produced
(nonwhite female, white female, nonwhite male, white male). There are three
degrees of freedom associated with this new variable: One for each of the
marginal effects of gender and race, plus one for the gender-by-race interaction.
b. A general procedure for computing an interaction measure between two variables (i.e., a two-way interaction measure) is to subtract each variable's mean from its respective values and to multiply these differences together. For example, the variable INT could be constructed to estimate the interaction effects of X1 and X2 as follows:

   $INT = (X_1 - \bar{X}_1)(X_2 - \bar{X}_2)$

c. Returning to the gender-by-race illustration, note that since X1 and X2 are contrasts, $\bar{X}_1 = \bar{X}_2 = 0$ and so $INT = X_1 \times X_2$. Here are the calculations:
   Characteristics     Gender (X1)   Race (X2)   Gender-by-Race Interaction (INT = X1·X2)
   Nonwhite Female          1            1                    1
   White Female             1           -1                   -1
   Nonwhite Male           -1            1                   -1
   White Male              -1           -1                    1

   Note that this measure has the qualities we seek. That is, it takes high scores for "nonwhite females and white males" and takes low scores for "white females and nonwhite males." You can also check the dot products to verify that INT is orthogonal to X1 and X2. Given this orthogonality, the regression coefficients associated with X1 and X2 will not change when INT is added into the regression equation.
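The gender-by-race arithmetic can be reproduced with a short sketch (Python, illustrative only). The scores below are artificial: five identical incomes per cell, chosen so the cell means match the 10, 16, 18, and 28 used above. The main-effects model recovers the coefficients 18, -5, and -4, and adding INT recovers the interaction slope of 1.

import numpy as np

# artificial scores with cell means 10 (NF), 16 (WF), 18 (NM), 28 (WM), 5 per cell
y = np.repeat([10.0, 16.0, 18.0, 28.0], 5)
x1 = np.repeat([1, 1, -1, -1], 5)      # gender: +1 female, -1 male
x2 = np.repeat([1, -1, 1, -1], 5)      # race: +1 nonwhite, -1 white
int_term = x1 * x2                     # the interaction measure (the means are already zero)

X_main = np.column_stack([np.ones(20), x1, x2])
X_full = np.column_stack([X_main, int_term])
print(np.round(np.linalg.lstsq(X_main, y, rcond=None)[0], 3))   # [18, -5, -4]: marginal effects only
print(np.round(np.linalg.lstsq(X_full, y, rcond=None)[0], 3))   # [18, -5, -4, 1]: cell means now fit exactly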
3. Interpreting coefficients associated with interaction measures
a. With our data, regressing income on gender, race, and INT produces the following regression model:

   $\hat{Y} = 18 - 5X_1 - 4X_2 + 1INT$

   Note that unlike the previous equation, the Y-estimates from this equation would equal the means of each of the four groups. This would result as nonwhite females' and white males' estimates were each increased by $1,000 and as white females' and nonwhite males' estimates were decreased by the same amount.
b. The regression lines would now no longer be parallel but would (as per the sketch of
group means) be closer among females than among males. This is the nonparallel
pattern one would expect if the slope associated with INT were positive. Note that if
the slope were negative, the lines would be closer among males than among females.
More generally, the sign of the slope associated with an interaction measure indicates
the direction in which one’s regression lines change from being parallel (i.e., which
one shifts counterclockwise as the other shifts clockwise).
c. Two important pieces of information are needed to determine if two independent variables have a specific interaction effect on a dependent variable:
   i. The interaction term must explain a significant amount of variation in the dependent variable in addition to the variance explained by the marginal effects of the variables out of which it was constructed. Accordingly (and as already mentioned in our previous discussion of hierarchical models), the following regression model would be misspecified: $\hat{Y} = \hat{a} + \hat{b}_1X_1 + \hat{b}_2INT$
      The proper significance test would be an F-test that compares R-squareds from the following two regression models:

      Reduced model: $\hat{Y} = \hat{a} + \hat{b}_1X_1 + \hat{b}_2X_2$
      Complete model: $\hat{Y} = \hat{a} + \hat{b}_1X_1 + \hat{b}_2X_2 + \hat{b}_3INT$

   ii. The sign of the slope associated with the interaction term must be consistent with your theory. (Note: Since the just-mentioned F-test is two-tailed, you should perform the test at the 2α significance level and fail to reject if the slope's sign is not as hypothesized.)
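The R-squared comparison in step i is the same hierarchical F-test used earlier in the course. The sketch below (Python; the function name is mine) codes the formula and, as a preview, plugs in the R-squared values that appear in the illustration that follows (reduced R² = .226, complete R² = .335, n = 100).

def r2_increment_f(r2_complete, r2_reduced, n, k_complete, k_reduced):
    """F statistic for the variables added to the reduced model, with
    (k_complete - k_reduced, n - k_complete - 1) degrees of freedom."""
    numerator = (r2_complete - r2_reduced) / (k_complete - k_reduced)
    denominator = (1 - r2_complete) / (n - k_complete - 1)
    return numerator / denominator

print(round(r2_increment_f(.335, .226, 100, 3, 2), 2))   # about 15.7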
4. How does one know when to estimate interactions among variables in addition to the
marginal effects of the variables themselves?
a. The decision to estimate a quadratic, cubic, etc. polynomial relation is easy: Simply
obtain a boxplot to see if Y has a nonlinear association with X. Unfortunately,
interaction effects cannot be detected in bivariate plots of one’s data.
b. Consider the following full and partial tables:

   Full table:
                                 Party voted for
   Voted for the president       Democratic   Republican
      Yes                            100          100
      No                             100          100

   Partial tables (by work status):
                                 Unemployed                   Employed
   Voted for the president       Democratic   Republican      Democratic   Republican
      Yes                            10            90             90           10
      No                             90            10             10           90
Clearly there is a strong interaction between “work status” (W) and “party voted for”
(P), since the associations between “voted for the president” (V) and “party voted for”
are reversed between employed and unemployed respondents in this hypothetical
survey of US voters. (Note: This is neither distortion nor suppression. Here at issue
is not that associations between full and partial tables differ, but that associations
differ among partial tables.)
c. Maybe we can detect the interaction by looking at a few correlations:

   $r_{VP} = 0 \qquad r_{VW} = 0 \qquad r_{PW} = 0$

   Not even partial correlations disclose the existence of an interaction:

   $r_{VP.W} = \dfrac{r_{VP} - r_{VW}r_{PW}}{\sqrt{(1 - r_{VW}^2)(1 - r_{PW}^2)}} = \dfrac{0}{1} = 0$
d. In conclusion, you should not look to your data for guidance on where you might find interaction effects. The "place" to look is among your theoretical hunches about why your variables are interrelated. So, for instance, it was only after someone theorized that discrimination is not additive (but that the effects of a second discriminatory status are much less detrimental than those of the first) that researchers in the field of stratification found evidence of a "double negative equals positive" effect regarding the disadvantage of Black women in the United States.
5. An illustration:
One theory of political science argues that citizens of a country will only support
aggression toward a foreign power (e.g., Iraq) when the country has both recently
suffered a humiliating military defeat (e.g., in Iran) AND has a healthy economy. To test
this theory let’s assume that the unit of analysis is the event (in this case, the instance of
aggression of one country against another), and that we have data on the following three
variables:
   PUBOP    = public support for aggression within an aggressor nation
   DEFEAT   = whether or not the aggressor nation experienced a recent humiliating military defeat
   PCGNP    = economic health within the aggressor nation (as measured by per capita GNP)
   INTERACT = an interaction measure constructed from DEFEAT and PCGNP
a. If you are unsure whether an interaction measure should be included among the
independent variables, you should sketch a plot of how your data would look if the theory
were correct. For example, note in the following plot how high public support is only
found for instances of aggression that followed a recent humiliating military defeat:
   [Figure: public support for aggression (PUBOP) plotted against economic health (PCGNP); events following a recent humiliating defeat (plotted as "d", DEFEAT = 1) show support rising with economic health, while events with no recent defeat (plotted as "n", DEFEAT = 2) show low support at all levels of economic health.]
b. Now, imagine that you have measures of PUBOP (a public opinion measure on which high scores mean support for aggression); DEFEAT (1 = had a recent defeat; 2 = had none); PCGNP (per capita gross national product); and $INTERACT = (DEFEAT - \overline{DEFEAT}) \times (PCGNP - \overline{PCGNP})$. You collect data on 100 instances of aggression and access these data using SPSS and the following command:

   regression vars=pubop,defeat,pcgnp,interact/dep=pubop/stepwise.
In the first step of the stepwise regression procedure, the variable DEFEAT enters the model and the output is (in part) as follows:

   Model Summary
   Model   R      R Square
   1       .462   .213
   a. Predictors: (Constant), DEFEAT
In the second step, INTERACT enters the model with the following output:

   Model Summary
   Model   R      R Square
   1       .552   .305
   a. Predictors: (Constant), DEFEAT, INTERACT
There is no third step, because PCGNP does not increment R-square by a significant
amount at the .05 level (the default significance level in SPSS). Please take note:
You should not end your analysis with a regression model that includes INTERACT,
but excludes PCGNP!!! The regression estimated in the second step is misspecified,
because it excludes the lower-order measure, PCGNP. Please remember that your
regression models must always be correctly specified. (Look back to our notes on
model specification, and the discussion there on hierarchically related regression
models.)
So now imagine that PCGNP is forced into two regressions as follows:
regression vars=pubop,defeat,pcgnp/dep=pubop/enter.
regression vars=pubop,defeat,pcgnp,interact/dep=pubop/enter.
These two commands yield (in part) the following output:

   Model Summary
   Model   R      R Square
   1       .476   .226
   a. Predictors: (Constant), DEFEAT, PCGNP

   and

   Model Summary
   Model   R      R Square
   1       .579   .335
   a. Predictors: (Constant), DEFEAT, PCGNP, INTERACT
Two pieces of information are needed to evaluate whether there is support for the theory.
   i. First, one must test whether INTERACT explains a significant amount of variance in PUBOP in addition to that explained by DEFEAT and PCGNP. Accordingly, a familiar F-test is called for:

      $F^{k-g}_{n-k-1} = \dfrac{(R_c^2 - R_r^2)/(k - g)}{(1 - R_c^2)/(n - k - 1)}$

      $F^{3-2}_{100-3-1} = \dfrac{(.335 - .226)/(3 - 2)}{(1 - .335)/(100 - 3 - 1)} = 15.69$

      $F^{1}_{96,.10} < F^{1}_{60,.10} = 2.79 < 15.69$
Note that this significance test used a critical value of F at the .10 significance
level, despite the fact that the .05 level of significance was being used. This is
because the F-test is inherently a 2-tailed test, whereas our theory specifies a
specific way that the slopes between PUBOP and PCGNP should differ among
levels of DEFEAT. If the slopes differ in the opposite direction, one would not
have support for the theory even if a significant amount of variation may have
been explained by INTERACT. So we come to the other piece of information
that is needed to evaluate whether or not we have support for our theory.
ii. Second, one must see if the sign of the slope associated with INTERACT corresponds to what one would expect if the theory were true. What follows is a "cookbook" for deciding if your theory is consistent with the sign of a 2-way interaction measure's partial slope.
    1. Begin by finding the values that INTERACT takes for various combinations of high and low values on the variables from which it was constructed.

                                         Healthy Economy    No Healthy Economy
       Recent defeat (DEFEAT = 1)              –                    +
       No recent defeat (DEFEAT = 2)           +                    –

       Note that since $INTERACT = (DEFEAT - \overline{DEFEAT})(PCGNP - \overline{PCGNP})$, it takes positive values in the "+" cells of this table and takes negative values in the table's "–" cells. Let's call this 2x2 table our measure table.
    2. Next plot the parallel lines that would be estimated based on the marginal effects of the variables used in constructing the interaction measure (here, DEFEAT and PCGNP).
    3. On this plot (see below) place arrows at the ends of the lines to indicate the way in which the lines must rotate to fit the pattern suggested by your theory.

       [Figure: the parallel lines of PUBOP against PCGNP for recent-defeat and no-recent-defeat events, with arrows at the ends of the lines indicating the rotation the theory requires.]
    4. Using "up arrows" to indicate where PUBOP estimates should increase, and "down arrows" to indicate where they should decrease, create a theory table in which +'s and –'s correspond respectively to the up- and down-arrows in the plot:

                                         Healthy Economy    No Healthy Economy
       Recent defeat (DEFEAT = 1)              +                    –
       No recent defeat (DEFEAT = 2)           –                    +
5. If identical signs are in corresponding cells of these two tables, your theory
suggests a positive partial slope between INTERACT and the dependent
variable; if opposite signs are in corresponding cells of the tables, the theory
suggests a negative partial slope. Since the latter situation holds in this
illustration, we conclude that a negative partial slope (along with a
significantly large F-statistic) would provide evidence in support of the
political science theory. Put differently, the hypothesis being tested here is as
follows:
       $H_0: b_I \ge 0$
       $H_A: b_I < 0$
       The alternative hypothesis is that the slope is negative, because if it were positive the estimated values of PUBOP would shift from the parallel lines in the above plot in the direction opposite to that in which the arrows are pointing. Why? Well, the parallel lines are those estimated by a regression model that only includes the marginal effects of DEFEAT and PCGNP. When INTERACT is added to the model, the Y-hats from the previous model will be larger when $\hat{b}_I \times INTERACT$ is positive (i.e., when a positive number is added to them), but will be smaller when this product is negative (i.e., when a negative number is added). If (as hypothesized) $\hat{b}_I$ is negative, $\hat{b}_I \times INTERACT$ will be positive when INTERACT is negative (i.e., when an event occurs after a recent defeat and at a time of economic health, or when it occurs neither after a recent defeat nor at a time of economic health), and it will be negative when INTERACT is positive (i.e., when an event occurs after a recent defeat but not at a time of economic health, or when it occurs not after a recent defeat but at a time of economic health). The hypothesized negative relation between INTERACT and PUBOP can thus be sketched as follows:
       [Figure: PUBOP plotted against INTERACT, showing the hypothesized negative relation: events combining a recent defeat with economic health, and events with neither, lie at the low (negative) end of INTERACT and have high predicted support, while events with only one of the two ("Defeat + No health", "No defeat + Health") lie at the high (positive) end and have low predicted support.]
I. Although illustrations of polynomial and interaction measures in these notes have consistently used independent variables that only have a few levels (e.g., three lengths of time subjects are given to memorize words, or two levels distinguishing events that occurred "soon after" versus "not soon after" a humiliating military defeat), in many studies these measures are constructed from finely-grained interval- or ratio-level variables. In these cases, polynomial and interaction measures are typically constructed by respectively raising a variable to the desired power (squaring for quadratic, cubing for cubic, etc.) or by multiplying variables together (two variables for a 2-way interaction, three variables for a 3-way interaction, etc.). To reduce collinearity among polynomial measures or between interaction measures and the variables from which they were constructed, I recommend that you subtract out the means of (i.e., center) your variables before raising them to a specific power or multiplying them together. What follow are some comments on this act of subtracting out a number (or numbers) prior to power-raising or multiplying of variables:
1. Subtracting out numbers does not influence the Y-hat values generated by one's regression models. Consider the following equation with an interaction measure obtained by multiplying centered variables:

   $\hat{Y} = \hat{a} + \hat{b}_1X_1 + \hat{b}_2X_2 + \hat{b}_3(X_1 - \bar{X}_1)(X_2 - \bar{X}_2)$
   $\hat{Y} = \hat{a} + \hat{b}_1X_1 + \hat{b}_2X_2 + \hat{b}_3(X_1X_2 - X_1\bar{X}_2 - \bar{X}_1X_2 + \bar{X}_1\bar{X}_2)$
   $\hat{Y} = \hat{a} + \hat{b}_1X_1 + \hat{b}_2X_2 + \hat{b}_3X_1X_2 - \hat{b}_3\bar{X}_2X_1 - \hat{b}_3\bar{X}_1X_2 + \hat{b}_3\bar{X}_1\bar{X}_2$
   $\hat{Y} = (\hat{a} + \hat{b}_3\bar{X}_1\bar{X}_2) + (\hat{b}_1 - \hat{b}_3\bar{X}_2)X_1 + (\hat{b}_2 - \hat{b}_3\bar{X}_1)X_2 + \hat{b}_3X_1X_2$

   Thus if the estimated equation is of the form

   $\hat{Y} = \hat{a}' + \hat{b}_1'X_1 + \hat{b}_2'X_2 + \hat{b}_3'X_1X_2$,

   you will find that

   $\hat{a}' = \hat{a} + \hat{b}_3\bar{X}_1\bar{X}_2$
   $\hat{b}_1' = \hat{b}_1 - \hat{b}_3\bar{X}_2$
   $\hat{b}_2' = \hat{b}_2 - \hat{b}_3\bar{X}_1$, and
   $\hat{b}_3' = \hat{b}_3$.

   Note that the slope associated with the interaction is the same whether or not the variables were centered while constructing the interaction measure. We shall return to this point in a moment.
2. It is easy to verify that collinearity can be reduced via prior centering. For example, consider a variable, X, that takes values of the integers 1 through 10. If we square X before centering versus after centering, we find that X has a correlation of .97 with the noncentered squared values of itself but a correlation of zero with its squared-but-centered values.
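Both claims in parts 1 and 2 are easy to check numerically. The sketch below (Python; the x1, x2, and y values are simulated, not data from the notes) shows that centering X before squaring removes its correlation with the squared term, and that centering before multiplying leaves the interaction slope and the fitted values unchanged.

import numpy as np

x = np.arange(1, 11, dtype=float)
print(round(np.corrcoef(x, x ** 2)[0, 1], 2))                   # about .97
print(round(np.corrcoef(x, (x - x.mean()) ** 2)[0, 1], 2))      # 0.0

rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
y = 1 + 2 * x1 - x2 + 0.5 * x1 * x2 + rng.normal(size=100)      # simulated outcome

X_raw = np.column_stack([np.ones(100), x1, x2, x1 * x2])
X_centered = np.column_stack([np.ones(100), x1, x2, (x1 - x1.mean()) * (x2 - x2.mean())])
b_raw = np.linalg.lstsq(X_raw, y, rcond=None)[0]
b_centered = np.linalg.lstsq(X_centered, y, rcond=None)[0]
print(round(b_raw[3], 4), round(b_centered[3], 4))              # identical interaction slopes
print(np.allclose(X_raw @ b_raw, X_centered @ b_centered))      # True: identical Y-hats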
3. Sometimes you may wish to subtract out a theoretically meaningful number to give
intuitive meaning to the resulting slope. For example, consider the argument that high
school graduates with no post-high school training are paid less because there is too great
a supply of them in the labor market. This argument suggests that income is a quadratic
function of education, in which people with more or less than a high school education
earn more than those with a high school diploma. In this case, it would be reasonable to
subtract 12 from the “years of education” measure as a quadratic term is constructed.
4. Finally, you should be aware that it is always possible to construct a polynomial measure that is orthogonal to all its lower-order polynomial measures. Likewise, it is always possible to construct an interaction measure that is orthogonal to all lower-order interactions and marginal measures from which it was constructed. The key here is in one's selection of the numbers subtracted out prior to power-raising or multiplying of variables. By "subtracting out the right constants," you can ensure that your higher-order measure is orthogonal to all lower-order measures associated with it. The following pages contain output from an SPSS program with which a measure of church attendance (ATTEND) is regressed on RACE (1 = white; 2 = black) and SEX (1 = male; 2 = female). After listing frequencies on these three variables, regression output is provided first for the regression of ATTEND on RACE, SEX, and INT (which was computed simply by multiplying RACE by SEX). Next, output is provided for a parallel regression, except this time INT was computed as $(race - 1.1475)(sex - 1.5750)$. Unlike in the previous regression, this time INT has zero correlations with RACE and SEX (although the latter two variables do have a modest positive correlation of .113).
   • Of most interest in this output are differences in the slopes associated with the marginal effects of RACE and SEX in the two regression models. Which slopes should be believed?
   Independent variables in model     slope for race   slope for sex   slope for int
   race, sex, int (uncentered)            -.849            -.935          1.139
   race, sex, int (centered)               .945             .372          1.139
   race & sex only                         .942             .369           n/a

   • The slopes associated with race and sex change signs when one changes the way in which int is calculated!!! As noted above, the slope associated with int (as well as its standard error) is not altered when int is calculated from uncentered or centered race and sex variables.
   • Deciding which slope to believe should be based on your knowledge that it is always possible to construct an interaction measure that is orthogonal to all lower-order interactions and marginal measures from which it was constructed. If you had constructed an interaction measure that was orthogonal to race and sex, the slopes associated with race and sex would be identical to those obtained in the model that includes race and sex only.
   • So why not just estimate the marginal effects of race and sex based on a model that simply excludes the interaction measure? Actually, that is what I am recommending that you do. However, you should be careful to alter the standard errors associated with these marginal effects to take into account the reduction in MSE from the reduced model from which int is excluded ($MSE_R$) to the MSE from the complete model that includes int ($MSE_C$).
   • Recall that the formula for a slope's standard error is as follows:

     $\hat{\sigma}_{\hat{b}_X} = \sqrt{\dfrac{MSE}{SS_X(1 - R^2_{X.V_1V_2...V_{k-1}})}}$

     (Here V1, ..., Vk-1 is a list of all of the model's independent variables other than X.)

Now think about what will change in this formula if an interaction measure is added
into this regression model—an interaction measure that is orthogonal to X. Well, SSX
won’t change. Nor will “ 1  R X2 .V1V2 ...Vk 1 ,” since the orthogonality of X to the added
measure (i.e., the interaction) has been ensured. And so it is the MSE that has to be
changed here. For the output listed in the next four pages, the corrected standard
errors for race and sex are as follows:
ˆ Race 
MSE C
MSE R

2

MSE R
SS Race 1  rRaceSex
ˆ Sex 
MSEC
MSE R


2
MSE R
SS Sex 1  rRaceSex


37
6.275
 .184  .183
6.310
6.275
 .131  .130
6.310
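The correction itself is just a rescaling. Here is a sketch of the arithmetic (Python, illustrative only), using the MSE values and reduced-model standard errors from the SPSS output on the following pages.

import math

mse_complete, mse_reduced = 6.275, 6.310        # MSE with and without INT in the model
se_race_reduced, se_sex_reduced = .184, .131    # standard errors from the race-and-sex-only model

shrinkage = math.sqrt(mse_complete / mse_reduced)
print(round(shrinkage * se_race_reduced, 4))    # about .1835, i.e. the .183 above
print(round(shrinkage * se_sex_reduced, 4))     # about .1306, i.e. the .130 above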
Interaction & Collinearity
select if ((race ne 3)and(wrkstat eq 1)).
frequencies vars=attend,race,sex.
compute int=race*sex .
pearson corr vars=race, sex, int.
regression vars=attend,race,sex,int/dep=attend/enter.
compute k1=1.1475 .
compute k2=1.5750 .
compute int=(race - k1)*(sex - k2).
pearson corr vars=race, sex, int.
regression vars=attend,race,sex,int/dep=attend/enter.
regression vars=attend,race,sex/dep=attend/enter.
ATTEND  HOW OFTEN R ATTENDS RELIGIOUS SERVICES

                                Frequency   Percent   Valid Percent   Cumulative Percent
   Valid    0 NEVER                223        14.4         14.8             14.8
            1 LT ONCE A YEAR       136         8.8          9.0             23.9
            2 ONCE A YEAR          242        15.6         16.1             40.0
            3 SEVRL TIMES A YR     240        15.5         16.0             56.0
            4 ONCE A MONTH         114         7.4          7.6             63.5
            5 2-3X A MONTH         147         9.5          9.8             73.3
            6 NRLY EVERY WEEK       82         5.3          5.5             78.8
            7 EVERY WEEK           226        14.6         15.0             93.8
            8 MORE THN ONCE WK      93         6.0          6.2            100.0
            Total                 1503        96.9        100.0
   Missing  9 DK,NA                 48         3.1
   Total                          1551       100.0

RACE  RACE OF RESPONDENT

                                Frequency   Percent   Valid Percent   Cumulative Percent
   Valid    1 WHITE               1317        84.9         84.9             84.9
            2 BLACK                234        15.1         15.1            100.0
            Total                 1551       100.0        100.0

SEX  RESPONDENTS SEX

                                Frequency   Percent   Valid Percent   Cumulative Percent
   Valid    1 MALE                 830        53.5         53.5             53.5
            2 FEMALE               721        46.5         46.5            100.0
            Total                 1551       100.0        100.0
Correlations (Pearson correlations; 2-tailed significance in parentheses; N = 1551)

                RACE           SEX            INT
   RACE        1.000           .113 (.000)    .726 (.000)
   SEX          .113 (.000)   1.000           .735 (.000)
   INT          .726 (.000)    .735 (.000)   1.000
Regression

   Model Summary
   Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
   1       .175   .031       .029                2.50
   a. Predictors: (Constant), INT, RACE, SEX

   ANOVA (Dependent Variable: ATTEND)
   Model 1       Sum of Squares   df     Mean Square   F        Sig.
   Regression        298.428        3      99.476      15.853   .000
   Residual         9406.111     1499       6.275
   Total            9704.539     1502

   Coefficients (Dependent Variable: ATTEND)
                 B       Std. Error   Beta    t        Sig.
   (Constant)    3.966   .715                  5.548   .000
   RACE          -.849   .610         -.119   -1.391   .164
   SEX           -.935   .444         -.183   -2.107   .035
   INT           1.139   .370          .385    3.076   .002
Correlations (Pearson correlations; 2-tailed significance in parentheses; N = 1551)

                RACE           SEX            INT
   RACE        1.000           .113 (.000)    .000 (.993)
   SEX          .113 (.000)   1.000           .000 (.987)
   INT          .000 (.993)    .000 (.987)   1.000
Regression

   Model Summary
   Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
   1       .175   .031       .029                2.50
   a. Predictors: (Constant), INT, RACE, SEX

   ANOVA (Dependent Variable: ATTEND)
   Model 1       Sum of Squares   df     Mean Square   F        Sig.
   Regression        298.428        3      99.476      15.853   .000
   Residual         9406.111     1499       6.275
   Total            9704.539     1502

   Coefficients (Dependent Variable: ATTEND)
                 B       Std. Error   Beta    t       Sig.
   (Constant)    1.907   .276                 6.913   .000
   RACE           .945   .183          .132   5.157   .000
   SEX            .372   .130          .073   2.857   .004
   INT           1.139   .370          .078   3.076   .002
Regression

   Model Summary
   Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
   1       .157   .025       .023                2.512
   a. Predictors: (Constant), SEX, RACE

   ANOVA (Dependent Variable: ATTEND)
   Model 1       Sum of Squares   df     Mean Square   F        Sig.
   Regression        239.063        2     119.531      18.942   .000
   Residual         9465.476     1500       6.310
   Total            9704.539     1502

   Coefficients (Dependent Variable: ATTEND)
                 B       Std. Error   Beta    t       Sig.
   (Constant)    1.937   .277                 7.006   .000
   RACE           .942   .184          .132   5.126   .000
   SEX            .369   .131          .073   2.827   .005
Class Examples on Dummy Variables
and Effect, Orthogonal, & Polynomial Contrasts
The program:
data list records=1 / words 1-2 d1 4 d2 6 e1 8-9 e2 11-12
c1 14-15 c2 17-18 p1 20-21 p2 23-24.
begin data.
9 1 0 1 0 1 1 1 1
12 1 0 1 0 1 1 1 1
11 1 0 1 0 1 1 1 1
15 1 0 1 0 1 1 1 1
8 1 0 1 0 1 1 1 1
10 0 1 0 1 1 -1 0 -2
8 0 1 0 1 1 -1 0 -2
11 0 1 0 1 1 -1 0 -2
13 0 1 0 1 1 -1 0 -2
8 0 1 0 1 1 -1 0 -2
5 0 0 -1 -1 -2 0 -1 1
6 0 0 -1 -1 -2 0 -1 1
8 0 0 -1 -1 -2 0 -1 1
7 0 0 -1 -1 -2 0 -1 1
4 0 0 -1 -1 -2 0 -1 1
end data.
regression vars=words d1 d2/des=corr/dep=words/enter.
regression vars=words e1 e2/des=corr/dep=words/enter.
regression vars=words c1 c2/des=corr/dep=words/enter.
regression vars=words p1 p2/des=corr/dep=words/enter.
Output from first regression:

   Correlations (Pearson)
             WORDS    D1      D2
   WORDS     1.000    .484    .242
   D1         .484   1.000   -.500
   D2         .242   -.500   1.000

   Model Summary
   Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
   1       .740   .547       .471                2.20
   a. Predictors: (Constant), D2, D1

   ANOVA (Dependent Variable: WORDS)
   Model 1       Sum of Squares   df   Mean Square   F       Sig.
   Regression         70.000       2     35.000      7.241   .009
   Residual           58.000      12      4.833
   Total             128.000      14

   Coefficients (Dependent Variable: WORDS)
                 B       Std. Error   Beta    t       Sig.
   (Constant)    6.000    .983                6.103   .000
   D1            5.000   1.390         .807   3.596   .004
   D2            4.000   1.390         .645   2.877   .014
Output from second regression:

   Correlations (Pearson)
             WORDS    E1      E2
   WORDS     1.000    .699    .559
   E1         .699   1.000    .500
   E2         .559    .500   1.000

   Model Summary
   Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
   1       .740   .547       .471                2.20
   a. Predictors: (Constant), E2, E1

   ANOVA (Dependent Variable: WORDS)
   Model 1       Sum of Squares   df   Mean Square   F       Sig.
   Regression         70.000       2     35.000      7.241   .009
   Residual           58.000      12      4.833
   Total             128.000      14

   Coefficients (Dependent Variable: WORDS)
                 B       Std. Error   Beta    t        Sig.
   (Constant)    9.000   .568                 15.855   .000
   E1            2.000   .803          .559    2.491   .028
   E2            1.000   .803          .280    1.246   .237
Output from third regression:

   Correlations (Pearson)
             WORDS    C1      C2
   WORDS     1.000    .726    .140
   C1         .726   1.000    .000
   C2         .140    .000   1.000

   Model Summary
   Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
   1       .740   .547       .471                2.20
   a. Predictors: (Constant), C2, C1

   ANOVA (Dependent Variable: WORDS)
   Model 1       Sum of Squares   df   Mean Square   F       Sig.
   Regression         70.000       2     35.000      7.241   .009
   Residual           58.000      12      4.833
   Total             128.000      14

   Coefficients (Dependent Variable: WORDS)
                 B       Std. Error   Beta    t        Sig.
   (Constant)    9.000   .568                 15.855   .000
   C1            1.500   .401          .726    3.737   .003
   C2             .500   .695          .140     .719   .486
Output from fourth regression:

   Correlations (Pearson)
             WORDS    P1      P2
   WORDS     1.000    .699   -.242
   P1         .699   1.000    .000
   P2        -.242    .000   1.000

   Model Summary
   Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
   1       .740   .547       .471                2.20
   a. Predictors: (Constant), P2, P1

   ANOVA (Dependent Variable: WORDS)
   Model 1       Sum of Squares   df   Mean Square   F       Sig.
   Regression         70.000       2     35.000      7.241   .009
   Residual           58.000      12      4.833
   Total             128.000      14

   Coefficients (Dependent Variable: WORDS)
                 B       Std. Error   Beta    t        Sig.
   (Constant)    9.000   .568                 15.855   .000
   P1            2.500   .695          .699    3.596   .004
   P2            -.500   .401         -.242   -1.246   .237