Stat 404
Fixing X
(by constructing variables that meet the linearity assumption)
A. Consider the following experiment: A list of 15 words is projected on a screen in front of
three groups of subjects. Prior to this subjects in one group (Group #1) were told to
“memorize the words for a recall test to be given later.” Another group (Group #2) was told
to “meditate on the words while relaxing using biofeedback.” A third group (Control,
Group #3) received no prior instructions.
1. The data:
Group #1              Group #2              Group #3
Rote memorization     Bio-memorization      No memorization
Y     Y - Ŷ           Y     Y - Ŷ           Y     Y - Ŷ
 9     -2             10      0             5     -1
12      1              8     -2             6      0
11      0             11      1             8      2
15      4             13      3             7      1
 8     -3              8     -2             4     -2

Ȳ1 = 11               Ȳ2 = 10               Ȳ3 = 6

Overall mean: Ȳ = 9
2. A plot: [scatterplot of the number of words recalled (vertical axis, roughly 4 to 15) against group membership (horizontal axis: Groups 1, 2, 3), with five points plotted per group]
3. The ANOVA table:
Source       SS    df    MS      F
Treatment    70     2    35      7.24
Error        58    12     4.83
Total       128    14
4. Note that group differences in this table explain a significant amount of variance at both
the .05 and .01 levels of significance, since F(2,12; .05) = 3.88 and F(2,12; .01) = 6.93.
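As a check on this table, the sums of squares and F-ratio can be reproduced from the raw recall scores with a short computation (a sketch in plain Python; the group labels are ours):

```python
# Raw recall scores from the lecture example, keyed by group.
groups = {
    "rote":    [9, 12, 11, 15, 8],   # Group #1, mean 11
    "bio":     [10, 8, 11, 13, 8],   # Group #2, mean 10
    "control": [5, 6, 8, 7, 4],      # Group #3, mean 6
}

all_scores = [y for ys in groups.values() for y in ys]
grand_mean = sum(all_scores) / len(all_scores)            # 9

# Treatment SS: each group's squared deviation of its mean from the
# grand mean, weighted by group size.
ss_treat = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
               for ys in groups.values())                 # 70

# Error SS: squared deviations of scores from their own group mean.
ss_error = sum((y - sum(ys) / len(ys)) ** 2
               for ys in groups.values() for y in ys)     # 58

df_treat = len(groups) - 1                                # 2
df_error = len(all_scores) - len(groups)                  # 12
F = (ss_treat / df_treat) / (ss_error / df_error)
print(ss_treat, ss_error, round(F, 2))                    # 70.0 58.0 7.24
```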
5. The next few lectures will be considering a variety of ways that independent variables
can be constructed to explain this variance (i.e., the treatment sum of squares of 70 given
in the ANOVA table). SPSS output that summarizes these ways is provided at the end of
this section of your lecture notes.
B. Doing an ANOVA with dummy variables.
1. The variables:
D1 = 1 if Treatment #1, 0 otherwise

and

D2 = 1 if Treatment #2, 0 otherwise
2. The data matrix:
                      Y    D1   D2
Treatment #1          9    1    0
(rote memory)        12    1    0
                     11    1    0
                     15    1    0
                      8    1    0
Treatment #2         10    0    1
(bio-memory)          8    0    1
                     11    0    1
                     13    0    1
                      8    0    1
Control               5    0    0
                      6    0    0
                      8    0    0
                      7    0    0
                      4    0    0
3. The resulting regression equation:

Ŷ = 6 + 5D1 + 4D2

Notice that (as you might expect) the estimated Y-value for each group equals the group
mean:

Ŷ1 = 6 + 5(1) + 4(0) = 11 = Ȳ1
Ŷ2 = 6 + 5(0) + 4(1) = 10 = Ȳ2
Ŷ3 = 6 + 5(0) + 4(0) = 6 = Ȳ3
4. Interpreting the regression coefficients:
a. â = 6: On average, 6 words were recalled by subjects with no prior instructions.
b. b̂1 = 5: On average, the rote memorization group recalled 5 more words than this.
c. b̂2 = 4: On average, those using bio-memorization recalled 4 more words than did
those in the control group (i.e., those with no prior instructions).
5. Note: When slopes associated with dummy variables are stated in words, you do not
(unless they are constructed from different nominal-level variables [such as race, gender,
religious affiliation, etc.]) refer to the effects of one dummy variable being adjusted for its
collinearity with another one.
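The dummy-variable regression above can be verified by least squares. The sketch below uses numpy (the variable names are ours) and recovers the constant and slopes reported in the equation:

```python
import numpy as np

# Recall scores in data-matrix order: rote, bio, control (5 each).
Y  = [9, 12, 11, 15, 8,  10, 8, 11, 13, 8,  5, 6, 8, 7, 4]
D1 = [1]*5 + [0]*5 + [0]*5   # 1 for rote-memorization rows
D2 = [0]*5 + [1]*5 + [0]*5   # 1 for bio-memorization rows

# Design matrix: constant, D1, D2.
X = np.column_stack([np.ones(15), D1, D2])
coef, *_ = np.linalg.lstsq(X, np.array(Y, dtype=float), rcond=None)
a, b1, b2 = coef
print(round(a), round(b1), round(b2))   # 6 5 4
```

Because the three columns saturate the three groups, the fitted values are exactly the group means, which is why the coefficients match the hand calculation.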
C. Effect coding
1. The variables:
E1 =  1 if Treatment #1
     -1 if Control group
      0 otherwise

and

E2 =  1 if Treatment #2
     -1 if Control group
      0 otherwise
2. The data matrix:
                      Y    E1   E2
Treatment #1          9    1    0
(rote memory)        12    1    0
                     11    1    0
                     15    1    0
                      8    1    0
Treatment #2         10    0    1
(bio-memory)          8    0    1
                     11    0    1
                     13    0    1
                      8    0    1
Control               5   -1   -1
                      6   -1   -1
                      8   -1   -1
                      7   -1   -1
                      4   -1   -1
3. Definition: A contrast is a variable having a mean of zero.
a. Every variable can be converted to a contrast by subtracting its mean from each of its
values. Statisticians sometimes refer to such a conversion as centering one’s data
on a variable.
b. Note that unlike dummy variables, effect measures are contrasts.
c. Note also that when all of one's independent variables are contrasts, the constant in
one's regression equation is an estimate of the dependent variable's mean. For
example, in this case â = Ȳ - b̂1Ē1 - b̂2Ē2 = 9 - 2(0) - 1(0) = 9 = Ȳ.
d. Whenever the constant in a regression model estimates the mean, slopes associated
with contrasts can be described as deviations from the overall mean.
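The centering idea in item 3a is easy to demonstrate: subtracting a variable's mean from each of its values yields a variable whose mean is zero, i.e., a contrast. A minimal sketch (the values are hypothetical):

```python
# Any variable can be converted to a contrast by centering it.
xs = [3, 7, 2, 9, 4]                      # hypothetical raw values
mean_x = sum(xs) / len(xs)
centered = [x - mean_x for x in xs]       # subtract the mean from each value
print(sum(centered) / len(centered))      # 0.0 -- the centered variable is a contrast
```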
4. The resulting regression equation:

Ŷ = 9 + 2E1 + 1E2

Again note that the estimated Y-value for each group equals the group mean:

Ŷ1 = 9 + 2(1) + 1(0) = 11 = Ȳ1
Ŷ2 = 9 + 2(0) + 1(1) = 10 = Ȳ2
Ŷ3 = 9 + 2(-1) + 1(-1) = 6 = Ȳ3
5. Interpreting the regression coefficients:
a. â = 9: Overall, 9 words were recalled on average.
b. b̂1 = 2: On average, the rote memorization group recalled 2 more words than this.
c. b̂2 = 1: On average, those using bio-memorization recalled 1 more word than the
overall average.
6. Note that the "effect" of the control group (i.e., the deviation of the mean of the control
group from the overall mean) equals -(b̂1 + b̂2) = -(2 + 1) = -3. After obtaining this third
effect, notice that the 3 effects sum to zero.
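The effect-coded regression can be checked the same way as the dummy-coded one. A numpy sketch (variable names are ours) recovering the constant, slopes, and the control group's implied effect:

```python
import numpy as np

# Same recall scores; effect codes replace the dummy codes.
Y  = [9, 12, 11, 15, 8,  10, 8, 11, 13, 8,  5, 6, 8, 7, 4]
E1 = [1]*5 + [0]*5 + [-1]*5   # 1 rote, -1 control, 0 otherwise
E2 = [0]*5 + [1]*5 + [-1]*5   # 1 bio,  -1 control, 0 otherwise

X = np.column_stack([np.ones(15), E1, E2])
a, b1, b2 = np.linalg.lstsq(X, np.array(Y, dtype=float), rcond=None)[0]

# The control group's effect is the negative sum of the fitted slopes,
# so the three effects sum to zero.
control_effect = -(b1 + b2)
print(round(a), round(b1), round(b2), round(control_effect))   # 9 2 1 -3
```

With a balanced design the constant estimates the overall mean (9), and each slope is a group mean's deviation from it.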
7. In the U.S. we have a folk saying, “There are many ways to skin a cat.” The meaning of
the expression is that there are a variety of ways to do some things. In these notes, you
are learning a variety of ways to explain the variance among the 2 treatment groups and
the control group (i.e., to explain SS_Treatment = 70). At this point we have considered 2
ways to “skin this cat.”
D. Weighted effect coding
1. A potential problem: Effect coding (as described above) only yields contrasts in balanced
experimental designs (i.e., when each of the groups being compared is of the same size
[e.g., in this case 5 subjects in each group]). Weighted effect coding is needed when
group sizes are unequal.
2. The variables (assuming a 3-level nominal-level variable with ni units of analysis in the
i th group for i=1,2,3):
EW1 =       1 if Treatment #1
       -n1/n3 if Control group
            0 otherwise

and

EW2 =       1 if Treatment #2
       -n2/n3 if Control group
            0 otherwise
3. The data matrix (an altered version of the previous data matrix in which the overall mean
and the treatment means are the same, but the mean of the control group is Ȳ3 = 29/5 = 5.8):
                      Y    EW1    EW2
Treatment #1          9     1      0
(rote memory)        12     1      0
                     11     1      0
                     15     1      0
                      8     1      0
                     11     1      0
Treatment #2          8     0      1
(bio-memory)         11     0      1
                     13     0      1
                      8     0      1
Control               5   -1.2   -0.8
                      6   -1.2   -0.8
                      7   -1.2   -0.8
                      7   -1.2   -0.8
                      4   -1.2   -0.8
4. Notice how weighted effect coding leaves each effect code a contrast:

ĒW1 = [6(1) + 5(-1.2)] / 15 = 0    and    ĒW2 = [4(1) + 5(-0.8)] / 15 = 0
5. With the changes made to the data matrix, the regression equation is identical to the
previous one. Moreover, interpretations of regression coefficients are identical to those
in the previous equation as well.
6. However, the effect of the control group is different:

-(n1/n3)b̂1 - (n2/n3)b̂2 = -[1.2(2) + 0.8(1)] = -(2.4 + 0.8) = -3.2,

which is the deviation of the control
group's mean from the overall mean (i.e., 9 - 3.2 = 5.8).
7. Weighted effects sum to zero:
a. In a balanced design one need only sum each group’s effects (i.e., each group mean’s
deviation from the overall mean).
b. In an unbalanced design one must weight each group’s effects by its size before
summing them. Thus if among k unequal groups, the i th group’s size is ni and its
effect is b̂i , then the weighted effects sum (as illustrated using the above data) to zero
as follows: Σ(i=1 to k) ni·b̂i = 6(2) + 4(1) + 5(-3.2) = 12 + 4 - 16 = 0.
E. Orthogonal contrasts
1. As already mentioned during our discussion of principal components, when two
independent variables are constructed by the researcher in such a way that they explain
distinct amounts of variance in any dependent variable, they are said to be orthogonal to
each other.
2. The key advantages of orthogonal independent variables are that their effects will not be
confounded with each other (i.e., they will each explain a distinct amount of the
dependent variable’s variance) and that their slopes can be interpreted without concern
for their having been adjusted for such confounding.
3. Note that orthogonality implies an absence of multicollinearity. Yet the two involve the
researcher in different ways.
a. On the one hand, the researcher may actively assign sets of values to categories of
his/her independent variable such that each set is orthogonal to every other one.