class 24 mod mult regression II

Moderated Multiple
Regression
Class 23
STATS TAKE HOME EXERCISE IS
DUE THURSDAY DEC. 12
Deliver to Kent’s Mailbox or Place
under his door (Rm. 352)
Regression Model for
Esteem and Affect as Information
Model
Y = b0 + b1X + b2Z + b3XZ
Where
Y = cry rating
X = upset
Z = esteem
XZ = esteem*upset
And
b0 = X.XX = MEANING?
b1 = X.XX = MEANING?
b2 = X.XX = MEANING?
b3 = X.XX = MEANING?
Regression Model for
Esteem and Affect as Information
Model:
Y = b0 + b1X + b2Z + b3XZ
Where
Y = cry rating
X = upset
Z = esteem
XZ = esteem*upset
And
b0 = 6.53 = intercept (average score when upset, esteem, upset X esteem = 0)
b1 = -0.53 = slope (influence) of upset
b2 = -0.48 = slope (influence) of esteem
b3 = 0.18 = slope (influence) of upset X esteem interaction
Plotting Outcome: Baby Cry Ratings as a Function
of Listener's Upset and Listener's Self Esteem
[Figure: cry rating plotted as a function of upset and self-esteem]
Plotting Interactions with
Two Continuous Variables
Y = b0 + b1X + b2Z + b3XZ
equals
Y = (b1 + b3Z)X + (b2Z + b0)
(b1 + b3Z) is the simple slope of Y on X at a given value of Z.
It means "the effect X has on Y, conditioned by the
interactive contribution of Z." (b2Z + b0) is the intercept of that line.
Thus, when Z takes one value, the X slope takes one value;
when Z takes another value, the X slope takes a different value.
Plotting Simple Slopes
1. Compute regression to obtain values of
Y = b0 + b1X + b2Z + b3XZ
2. Transform Y = b0 + b1X + b2Z + b3XZ into
Y = (b1 + b3Z)X + (b2Z + b0) and insert values
Y = (? + ?Z)X + (?Z + ?)
3. Select 3 values of Z that display the simple slopes of
X when Z is low, when Z is average, and when Z is high.
Standard practice:
Z at one SD above the mean = ZH
Z at the mean = ZM
Z at one SD below the mean = ZL
Interpreting SPSS Regression Output (a)
Regression
Descriptive Statistics
            Mean       Std. Deviation   N
crytotl     5.1715     .53171           77
upset       2.9351     1.20675          77
esteem      3.9519     .76168           77
upsteem     11.3481    4.87638          77
Plotting Simple Slopes
(continued)
4. Insert values for all the regression coefficients (i.e., b1, b2, b3) and
the intercept (i.e., b0), from computation (i.e., SPSS print-out).
5. Insert ZH into (b1 + b3Z)X + (b2Z + b0) to get slope when Z is high
Insert ZM into (b1 + b3Z)X + (b2Z + b0) to get slope when Z is
moderate
Insert ZL into (b1 + b3Z)X + (b2Z + b0) to get slope when Z is low
Example of Plotting Baby Cry Study, Part I
Y (cry rating) = b0 (rating when all predictors = zero)
+ b1X (effect of upset) + b2Z (effect of esteem)
+ b3XZ (effect of upset X esteem interaction).
Y = 6.53 + -.53X + -.48Z + .18XZ
Y = (b1 + b3Z)X + (b2Z + b0) [conversion for simple slopes]
Y = (-.53 + .18Z)X + (-.48Z + 6.53)
Compute ZH, ZM, ZL via "Frequencies" for esteem: mean = 3.95, SD = .76
ZH = (3.95 + .76) = 4.71
ZM = (3.95 + 0) = 3.95
ZL = (3.95 - .76) = 3.19
Slope at ZH = (-.53 + .18 * 4.71)X + ([-.48 * 4.71] + 6.53) = .32X + 4.27
Slope at ZM = (-.53 + .18 * 3.95)X + ([-.48 * 3.95] + 6.53) = .18X + 4.64
Slope at ZL = (-.53 + .18 * 3.19)X + ([-.48 * 3.19] + 6.53) = .04X + 4.99
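The conversion for simple slopes can be sketched in a few lines of Python. This is only an illustration, not part of the SPSS procedure: the function name `simple_line` is made up here, and the coefficients are the two-decimal values printed on this slide, so results can differ in the second decimal from figures based on unrounded SPSS coefficients.

```python
def simple_line(b0, b1, b2, b3, z):
    """Return (slope, intercept) of the Y-on-X line at moderator value z."""
    slope = b1 + b3 * z        # (b1 + b3Z)
    intercept = b0 + b2 * z    # (b2Z + b0)
    return slope, intercept

# Two-decimal coefficients from the baby cry example on this slide.
b0, b1, b2, b3 = 6.53, -0.53, -0.48, 0.18
esteem_mean, esteem_sd = 3.95, 0.76

# Conditional values of the moderator: one SD above, at, and one SD below the mean.
z_values = {
    "ZH": esteem_mean + esteem_sd,
    "ZM": esteem_mean,
    "ZL": esteem_mean - esteem_sd,
}

lines = {label: simple_line(b0, b1, b2, b3, z) for label, z in z_values.items()}
for label, (slope, intercept) in lines.items():
    print(f"{label}: Y = {slope:.2f}X + {intercept:.2f}")
```

The slope grows from ZL to ZH because b3 is positive: each added point of esteem raises the slope of upset by b3.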
Example of Plotting, Baby Cry Study, Part II
1. Compute mean and SD of main predictor ("X") i.e., Upset
Upset mean = 2.94, SD = 1.21
2. Select values on the X axis displaying main predictor, e.g. upset at:
Low upset = 1 SD below mean = 2.94 – 1.21 = 1.73
Medium upset = mean = 2.94 – 0.00 = 2.94
High upset = 1 SD above mean = 2.94 + 1.21 = 4.15
3. Plug these values into ZH, ZM, ZL simple slope equations
Simple Slope   Formula            Low Upset    Medium Upset   High Upset
                                  (X = 1.73)   (X = 2.94)     (X = 4.15)
ZH             Y = .32X + 4.28    4.83         5.22           5.61
ZM             Y = .18X + 4.64    4.95         5.17           5.38
ZL             Y = .04X + 4.99    5.06         5.11           5.16
4. Plot values into graph
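As a rough check on step 3, the grid of predicted values can be generated from the three simple-slope formulas. The sketch below assumes the two-decimal slopes and intercepts printed in the table; the names `lines`, `x_points`, and `predict` are illustrative only. Because the inputs are rounded, a cell can differ from the slide by .01.

```python
# Simple-slope lines as printed in the table: (slope, intercept).
lines = {"ZH": (0.32, 4.28), "ZM": (0.18, 4.64), "ZL": (0.04, 4.99)}

# X-axis points for upset: one SD below the mean, the mean, one SD above.
upset_mean, upset_sd = 2.94, 1.21
x_points = {
    "low": upset_mean - upset_sd,
    "medium": upset_mean,
    "high": upset_mean + upset_sd,
}

def predict(slope, intercept, x):
    """Predicted cry rating on one simple-slope line at upset value x."""
    return slope * x + intercept

# One row per esteem level, one column per upset level.
grid = {
    z_label: {
        x_label: round(predict(slope, intercept, x), 2)
        for x_label, x in x_points.items()
    }
    for z_label, (slope, intercept) in lines.items()
}

for z_label, row in grid.items():
    print(z_label, row)
```

Each row of `grid` supplies the three points needed to draw one line in the plot.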
Graph Displaying Simple Slopes
[Figure: baby cry ratings (y-axis, about 4.6 to 5.8) by participants' level of upset (x-axis: mild, moderate, extreme upset), with separate lines for high, medium, and low esteem]
Are the Simple Slopes Significant?
Question: Do the slopes of each of the simple effects lines
(ZH, ZM, ZL) significantly differ from zero?
Procedure to test, using as an example ZH (the slope when esteem is high):
1. Transform Z to Zcvh (CV = conditional value) by subtracting ZH from Z.
Zcvh = Z - ZH = Z – 4.71
Conduct this transformation in SPSS as:
COMPUTE esthigh = esteem - 4.71.
2. Create new interaction term specific to Zcvh, i.e., (X* Zcvh)
COMPUTE upesthi = upset*esthigh.
3. Run regression, using same X as before, but substituting
Zcvh for Z, and X* Zcvh for XZ
Are the Simple Slopes Significant?--Programming
COMMENT SIMPLE SLOPES FOR CLASS DEMO.
COMPUTE esthigh = esteem - 4.71.
COMPUTE estmed = esteem - 3.95.
COMPUTE estlow = esteem - 3.19.
COMPUTE upesthi = esthigh*upset.
COMPUTE upestmed = estmed*upset.
COMPUTE upestlow = estlow*upset.
COMMENT Regression for the simple effect of upset at high esteem (esthigh).
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS BCOV R ANOVA CHANGE
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT crytotl
/METHOD=ENTER upset esthigh
/METHOD=ENTER upset esthigh upesthi .
Simple Slopes Significant?—Results
Regression
Model Summary
                                                                        Change Statistics
Model   R       R Square   Adjusted    Std. Error of   R Square   F Change   df1   df2   Sig. F Change
                           R Square    the Estimate    Change
1       .461a   .213       .191        .47810          .213       9.999      2     74    .000
2       .545b   .297       .269        .45473          .085       8.803      1     73    .004
a. Predictors: (Constant), esthigh, upset
b. Predictors: (Constant), esthigh, upset, upesthi
NOTE: Key outcome is B of "upset" in Model 2. If significant,
then the simple slope of upset at high esteem is significant.
Coefficients
                    Unstandardized           Standardized
                    Coefficients             Coefficients
Model               B        Std. Error      Beta       t        Sig.
1   (Constant)      4.639    .145                       31.935   .000
    upset           .211     .047            .479       4.462    .000
    esthigh         .114     .075            .163       1.522    .132
2   (Constant)      4.277    .184                       23.212   .000
    upset           .336     .062            .762       5.453    .000
    esthigh         -.478    .212            -.685      -2.256   .027
    upesthi         .183     .062            1.009      2.967    .004
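Why does the B for "upset" in Model 2 equal the simple slope at high esteem? Substituting Z = Zcvh + 4.71 into Y = b0 + b1X + b2Z + b3XZ gives Y = (b0 + b2*4.71) + (b1 + b3*4.71)X + b2*Zcvh + b3*X*Zcvh. The sketch below checks that algebra against the SPSS output. It assumes b0 = 6.53 and b1 = -.53 from the original model, and takes b2 and b3 to three decimals from the Model 2 rows, since recentering leaves them unchanged. The function name `recentered` is illustrative.

```python
# Original-model coefficients (b2 and b3 to three decimals from the SPSS output).
b0, b1, b2, b3 = 6.53, -0.53, -0.478, 0.183
z_high = 4.71  # esteem one SD above its mean

def recentered(b0, b1, b2, b3, c):
    """Coefficients after replacing Z with (Z - c) and XZ with X*(Z - c)."""
    return {
        "constant": b0 + b2 * c,   # intercept of the line at Z = c
        "x": b1 + b3 * c,          # simple slope of X at Z = c
        "z_centered": b2,          # unchanged by recentering
        "x_by_z_centered": b3,     # unchanged by recentering
    }

coefs = recentered(b0, b1, b2, b3, z_high)
for name, value in coefs.items():
    print(f"{name}: {value:.3f}")
```

The computed constant and upset coefficient land within rounding error of the 4.277 and .336 reported by SPSS, which is why the t test on "upset" in the recentered regression serves as the test of the simple slope.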
Moderated Multiple Regression with
Continuous Predictor and Categorical Moderator
(Aguinis, 2004)
Problem: Does caffeine lead to more arguments, but mainly for people with hostile personalities?
Criterion: Weekly arguments (continuous var., 0-10)
Predictor: Caffeinated coffee (categorical var., 0 = decaf, 1 = caffeinated)
Moderator: Hostility (continuous var., 1-7)
Regression Models to Test Moderating
Effect of Hostility on Arguments
Without Interaction
Arguments = b0 (args when all predictors = zero) + b1 (coffee.type) + b2 (hostility.score)
With Interaction
Arguments = b0 (args when all predictors = zero) + b1 (coffee) + b2 (hostility)
+ b3 (coffee*hostility)
Coffee is categorical, therefore a "dummy variable" with values 0 or 1.
These values are markers; they do not convey quantity.
Interaction term = predictor * moderator = coffee*hostility. That simple.
Conduct regression, plotting, and simple slopes analyses the same way as when
predictor and moderator are both continuous variables.
Coffee   Hostility   Args.   Coff.hostile
.00      2.00        3.00    .00
.00      3.00        2.00    .00
.00      4.00        4.00    .00
.00      5.00        5.00    .00
.00      2.00        3.00    .00
.00      3.00        3.00    .00
.00      4.00        6.00    .00
.00      5.00        4.00    .00
.00      1.00        .00     .00
.00      7.00        5.00    .00
1.00     2.00        2.00    2.00
1.00     3.00        3.00    3.00
1.00     4.00        4.00    4.00
1.00     5.00        3.00    5.00
1.00     2.00        2.00    2.00
1.00     3.00        3.00    3.00
1.00     4.00        2.00    4.00
1.00     5.00        1.00    5.00
1.00     1.00        3.00    1.00
1.00     7.00        3.00    7.00
DATASET ACTIVATE DataSet1.
COMPUTE coffee.hostile=coffee * hostile.personality.
EXECUTE.
REGRESSION
/DESCRIPTIVES MEAN STDDEV CORR SIG N
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA CHANGE
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT arguments
/METHOD=ENTER coffee hostile.personality
/METHOD=ENTER coffee.hostile .
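With a 0/1 dummy predictor, the full interaction model is equivalent to fitting a separate line for each coffee group: b0 and b2 are the intercept and hostility slope of the decaf group, while b1 and b3 are the caffeinated group's differences in intercept and slope. The sketch below applies that to the twenty cases listed in the data slide. It is a hand-rolled least-squares fit for illustration, not the SPSS procedure, and `fit_line` is a made-up helper name.

```python
# Data from the slide: both coffee groups have the same ten hostility scores.
hostility = [2, 3, 4, 5, 2, 3, 4, 5, 1, 7]
args_decaf = [3, 2, 4, 5, 3, 3, 6, 4, 0, 5]   # coffee = 0
args_caff = [2, 3, 4, 3, 2, 3, 2, 1, 3, 3]    # coffee = 1

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

int_decaf, slope_decaf = fit_line(hostility, args_decaf)
int_caff, slope_caff = fit_line(hostility, args_caff)

b0 = int_decaf                   # intercept when coffee = 0
b2 = slope_decaf                 # hostility slope when coffee = 0
b1 = int_caff - int_decaf        # shift in intercept for coffee = 1
b3 = slope_caff - slope_decaf    # shift in hostility slope for coffee = 1

print(f"b0={b0:.2f}  b1={b1:.2f}  b2={b2:.2f}  b3={b3:.2f}")
```

Reading the coefficients this way also shows what b3 means here: it is how much the hostility slope changes when moving from decaf to caffeinated coffee.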
Plotting of Arguments due to Caffeine & Hostility
Y (arguments) = b0 (args when all predictors = zero)
+ b1X (effect of coffee) + b2Z (effect of hostility)
+ b3XZ (effect of coffee X hostility).
Y = 0.84 + 1.71X + 0.74Z + -0.73XZ
Y = (b1 + b3Z)X + (b2Z + b0) [conversion for simple slopes]
Y = (1.71 + -.73Z)X + (.74Z + .84)
Compute ZH, ZM, ZL via "Frequencies" for hostility: mean = 3.60, SD = 1.72
ZH = (3.60 + 1.72) = 5.32
ZM = (3.60 + 0) = 3.60
ZL = (3.60 - 1.72) = 1.88
Slope at ZH = (1.71 - .73 * 5.32)X + ([.74 * 5.32] + .84) = -2.17X + 4.78
Slope at ZM = (1.71 - .73 * 3.60)X + ([.74 * 3.60] + .84) = -0.92X + 3.50
Slope at ZL = (1.71 - .73 * 1.88)X + ([.74 * 1.88] + .84) = 0.34X + 2.23
Plotting Dummy Variable Interaction
1. Main predictor has only 2 values, 0 and 1
2. Select values on the X axis displaying main predictor, e.g. coffee at:
No Caffeine = 0
Caffeine = 1
3. Plug these values into ZH, ZM, ZL simple slope equations
Simple Slope   Formula               No Caff.   Caffeinated
                                     (X = 0)    (X = 1)
ZH             Y = -2.17X + 4.78     4.78       2.61
ZM             Y = -0.92X + 3.50     3.50       2.58
ZL             Y = 0.34X + 2.23      2.23       2.57
4. Plot values into graph
Graph Displaying Simple Slopes
[Figure: arguments (y-axis, 0.00 to 8.00) by coffee type (x-axis: no caffeine, caffeinated), with separate lines for low, medium, and high hostility]
Centering Data
Centering subtracts a variable's mean from every score, so the mean becomes zero (the SD is unchanged).
Aiken and West recommend doing it in all cases.
* Makes zero score meaningful
* Has other benefits
Aguinis recommends doing it in some cases.
* Sometimes uncentered scores are meaningful
Procedure
Before: upset M = 2.94, SD = 1.19; esteem M = 3.94, SD = 0.75
COMPUTE upcntr = upset - 2.94.
COMPUTE estcntr = esteem - 3.94.
After: upcntr M = 0, SD = 1.19; estcntr M = 0, SD = 0.75
Centering may affect the slopes of predictor and moderator, BUT
it does not affect the interaction term.
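A minimal sketch of what centering does to a variable's mean and SD. The raw upset and esteem scores are not listed in these slides, so the twenty hostility scores from the coffee data set stand in as example data; the variable names are illustrative.

```python
from statistics import mean, stdev

# Stand-in data: the ten hostility scores from the coffee example, once per group.
hostility = [2, 3, 4, 5, 2, 3, 4, 5, 1, 7] * 2

# Centering: subtract the sample mean from every score.
hostility_mean = mean(hostility)
centered = [score - hostility_mean for score in hostility]

print(f"raw:      M = {mean(hostility):.2f}, SD = {stdev(hostility):.2f}")
print(f"centered: M = {mean(centered):.2f}, SD = {stdev(centered):.2f}")
```

After centering, a score of zero means "at the sample mean," which is what makes the intercept and the lower-order slopes easier to interpret.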
Requirements and Assumptions (Continued)
Independent Errors: Residuals for Sub. 1 are uncorrelated with residuals for Sub. 2.
For example, Sub. 2 sees Sub. 1 screaming as Sub. 1 leaves the
experiment. Sub. 1 might influence Sub. 2. If each new sub. is
affected by the preceding sub., then this influence will reduce
independence of errors, i.e., create autocorrelation.
Autocorrelation is bias due to temporal adjacency.
Assess: Durbin-Watson test. Values range from 0 to 4; "2" is
ideal. Closer to 0 means positive correlation between adjacent residuals; closer to 4 means negative correlation.
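[Figure: six consecutive subjects, Sub 1 and Sub 2 (funny movie), Sub 3 and Sub 4 (sad movie), Sub 5 and Sub 6 (funny movie), with correlations between adjacent subjects' residuals: r(s1 s2), r(s2 s3), r(s3 s4), r(s4 s5), r(s5 s6); three of the five are marked +]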
Durbin-Watson Test of Autocorrelation
DATASET ACTIVATE DataSet1.
REGRESSION
/DESCRIPTIVES MEAN STDDEV CORR SIG N
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA CHANGE
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT crytotl
/METHOD=ENTER age upset
/RESIDUALS DURBIN.
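For intuition about the statistic that /RESIDUALS DURBIN reports, the Durbin-Watson value is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it by hand on two made-up residual series (hypothetical numbers, not from the baby cry data); the function name `durbin_watson` is illustrative.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals in order of testing.
carry_over = [1.0, 0.9, 0.8, -0.8, -0.9, -1.0]    # neighbours resemble each other
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # neighbours are opposites

print(f"carry-over:  {durbin_watson(carry_over):.2f}")
print(f"alternating: {durbin_watson(alternating):.2f}")
```

When adjacent residuals resemble each other (positive autocorrelation) the differences are small and the statistic falls below 2; when they flip sign (negative autocorrelation) it rises above 2.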
Multicollinearity
In multiple regression, the statistic assumes that each new predictor
is in fact a unique measure.
If two predictors, A and B, are very highly correlated, then a
model testing the added effect of Predictors A and B might, in
effect, be testing Predictor A twice.
If so, the slopes of the two variables are not orthogonal (going in
different directions) but instead run parallel to each other (i.e.,
they are collinear).
[Figure: non-orthogonal vs. orthogonal predictors]
Mac Collinearity: A Multicollinearity Saga
Suffering negative publicity regarding the health risks of fast food, the
fast food industry hires the research firm of Fryes, Berger, and Shayque
(FBS) to show that there is no intrinsic harm in fast food.
FBS surveys a random sample, and asks:
a. To what degree are you a meat eater? (carnivore)
b. How often do you purchase fast food? (fast.food)
c. What is your health status? (health)
FBS conducts a multiple regression, entering fast.food in step one and
carnivore in step 2.
FBS Fast Food and Carnivore Analysis
"See! See!" the FBS researchers rejoiced. "Fast food negatively
predicts health in Model 1, BUT the effect of fast food on health goes
away in Model 2, when being a carnivore is considered."
Not So Fast, Fast Food Flacks
Collinearity Diagnostics
1. Correlation table
2. Collinearity Statistics
VIF (should be < 10) and/or
Tolerance (should be more than .20)
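With only two predictors, tolerance is 1 minus the squared correlation between them, and VIF is 1 divided by tolerance. The sketch below works through that for a small made-up sample; the `carnivore` and `fast_food` scores are hypothetical, not FBS survey data, and `pearson_r` is a hand-written helper.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical survey scores for eight respondents.
carnivore = [1, 2, 3, 4, 5, 6, 7, 8]
fast_food = [2, 1, 4, 3, 6, 5, 8, 7]

r = pearson_r(carnivore, fast_food)
tolerance = 1 - r ** 2   # share of one predictor NOT explained by the other
vif = 1 / tolerance      # variance inflation factor

print(f"r = {r:.2f}, tolerance = {tolerance:.2f}, VIF = {vif:.2f}")
```

In this made-up sample the predictors correlate at about .90, so tolerance falls below the .20 guideline even though VIF stays under 10; the two rules of thumb do not flag the same cases, since a tolerance of .20 corresponds to a VIF of 5.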