Comparing Regression Lines

Comparing Regression Lines From Independent Samples
The Design
• You have two or more groups.
• One or more continuous predictors (C).
• And one continuous outcome variable (Y).
• You want to know if
• Y = a + b1C1 + … + bpCp + error
• Is the same across groups.
Poteat, Wuensch, & Gregg
• Predictive validity study
• Children referred for school psychology
services
• Does Grades = a + bIQ + error
• Differ across races?
• Called a “Potthoff analysis” by school
psychologists
• Differences fell short of significance.
Two Groups, One X
• Y = a + b1C + b2G + b3CG
• If there are more than two groups, group membership is represented by k-1 dummy variables, where k is the number of groups.
• Wuensch, Jenkins, & Poteat
• Y = Attitude to animals
• C = Misanthropy
• G = High idealism or not
SAS
• Potthoff.sas (download)
• Potthoff.dat (download)
• Point the program file to the data file.
• Run the program.
• Data step: MxI = Misanth * Idealism
• Page 1: Ignoring idealism, there is a .2 correlation between misanthropy and attitude to animals.
Zero-Order Correlations
Interpretation
• Misanthropy is significantly related to
support for animal rights.
• The two idealism groups do not differ
significantly on support for animal rights –
rpb = .092.
• The two idealism groups do not differ
significantly on misanthropy – rpb = -.099.
Four Regression Models
Proc Reg;
CGI: model ar = misanth idealism MxI;
C: model ar = misanth;
CG: model ar = misanth idealism;
CI: model ar = misanth MxI;
run; quit;
Model CGI
Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3          4.05237       1.35079      5.10   0.0022
Error            150         39.73945       0.26493
Corrected Total  153         43.79182
Model C
Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1          2.13252       2.13252      7.78   0.0060
Error            152         41.65930       0.27407
Corrected Total  153         43.79182
Test of Coincidence
• Compare model CGI with model C
• Use a partial F test
$$F = \frac{SS_{\text{reg,full}} - SS_{\text{reg,reduced}}}{(f - r)\,MSE_{\text{full}}}$$
$$F(2, 150) = \frac{4.05237 - 2.13252}{(3 - 1)(.26493)} = 3.623$$
p = .029
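The partial F above can be checked by hand. A minimal sketch in Python (not part of the SAS program), assuming f and r are the numbers of predictors in the full and reduced models; the function name is made up for illustration, and the inputs are copied from the Model CGI and Model C ANOVA tables.

```python
def partial_f(ss_reg_full, ss_reg_reduced, n_pred_full, n_pred_reduced, mse_full):
    """Partial F for a full model versus a nested reduced model."""
    df_numerator = n_pred_full - n_pred_reduced
    return (ss_reg_full - ss_reg_reduced) / (df_numerator * mse_full)

# Full model CGI (3 predictors) versus reduced model C (1 predictor).
f_coincidence = partial_f(4.05237, 2.13252, 3, 1, 0.26493)
print(round(f_coincidence, 3))
```

The p value still has to come from an F distribution with 2 and 150 df; this sketch only reproduces the F ratio.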
Conclusion
• This was a simultaneous test of intercept
and slopes.
• We conclude that the two groups differ
with respect to
• The intercepts, or
• The slopes,
• Or both.
An Easier Way
proc reg; model ar = misanth idealism MxI;
TEST idealism=0, MxI=0; run;
Test 1 Results for Dependent Variable ar
Source         DF   Mean Square   F Value   Pr > F
Numerator       2       0.95992      3.62   0.0291
Denominator   150       0.26493
Model CI
Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2          2.29525       1.14763      4.18   0.0172
Error            151         41.49657       0.27481
Corrected Total  153         43.79182
Test of Intercepts
• Compare model CGI with model CI.
$$F(1, 150) = \frac{4.05237 - 2.29525}{(3 - 2)(.26493)} = 6.632$$
p = .011
• The intercepts differ significantly.
F(1, 150) = 6.632, p = .011
• As you know, on one df, t = SQRT(F)
• Look back at Model CGI
• For the test of main effect of idealism,
t = SQRT(6.632) = 2.58, p = .011.
• If we had more than two groups we could
not take this shortcut.
Model CGI
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              1.62581          0.19894      8.17     <.0001
misanth      1              0.30006          0.08059      3.72     0.0003
idealism     1              0.77869          0.30236      2.58     0.0110
MxI          1             -0.28472          0.12641     -2.25     0.0258
Test of Parallelism
• Do the slopes differ significantly?
• Compare model CGI with model CG
• Is model fit significantly reduced when we
remove the interaction term?
$$F(1, 150) = \frac{4.05237 - 2.70839}{(3 - 2)(.26493)} = 5.073$$
p = .026
Model CG
Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2          2.70839       1.35419      4.98   0.0081
Error            151         41.08343       0.27208
Corrected Total  153         43.79182
F(1, 150) = 5.073, p = .026
• As you know, on one df, t = SQRT(F)
• Look back at Model CGI
• For the test of the interaction, t = -2.25
(|t| = SQRT(5.073) = 2.25), p = .026.
• If we had more than two groups we could
not take this shortcut.
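The one-df shortcut is easy to verify from the printed output. A small Python check (an illustration, not part of the SAS program): each partial F is rebuilt from the ANOVA sums of squares, and each t is rebuilt as estimate divided by standard error from the Model CGI parameter table.

```python
import math

MSE_FULL = 0.26493  # error mean square of Model CGI

# One-df partial F tests: Model CGI versus Model CI, and versus Model CG.
f_intercepts = (4.05237 - 2.29525) / ((3 - 2) * MSE_FULL)
f_slopes = (4.05237 - 2.70839) / ((3 - 2) * MSE_FULL)

# t = estimate / standard error, from the Model CGI parameter estimates.
t_idealism = 0.77869 / 0.30236
t_interaction = -0.28472 / 0.12641

print(round(math.sqrt(f_intercepts), 2), round(abs(t_idealism), 2))
print(round(math.sqrt(f_slopes), 2), round(abs(t_interaction), 2))
```

SQRT(F) recovers only the absolute value of t; the sign comes from the parameter estimate.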
Get the Separate Regression
Lines
• Sort by groups.
• Run the bivariate regressions
• For nonidealists,
$$\widehat{AR} = 1.63 + .30 \times \text{Misanth}$$
• For idealists,
$$\widehat{AR} = 2.40 + .02 \times \text{Misanth}$$
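The two lines can also be recovered from the full model without re-running anything. A minimal sketch in Python, assuming idealism is coded 0 for nonidealists and 1 for idealists as in the slides; the function name is made up for illustration.

```python
# Model CGI parameter estimates: intercept, misanth, idealism, MxI.
B0, B_C, B_G, B_CG = 1.62581, 0.30006, 0.77869, -0.28472

def group_line(idealism):
    """Return (intercept, slope) of the within-group line for idealism = 0 or 1."""
    return B0 + B_G * idealism, B_C + B_CG * idealism

for label, code in (("nonidealists", 0), ("idealists", 1)):
    intercept, slope = group_line(code)
    print(f"{label}: AR = {intercept:.2f} + {slope:.2f} * Misanth")
```

With a dummy-coded group and its interaction in the model, the full model reproduces the separate within-group lines exactly, which is why the two approaches agree.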
Prepare Plots
Proc sgplot; scatter x = misanth y = ar;
reg x = misanth y = ar;
yaxis label='Attitude to Animals'
grid values=(1 to 5 by 1);
xaxis label='Misanthropy'
grid values=(1 to 5 by 1);
by idealism; run;
Another Plot
proc sgplot; reg x = misanth
y = ar / group = idealism nomarkers;
yaxis label='Attitude to Animals';
xaxis label='Misanthropy'; run;
Full Model Slope 1
• AR = 1.626 + (.300)Misanthropy +
(.779)Idealism + (- .285)Interaction.
• This is a conditional slope.
• The predicted increase in AR accompanying a
one-point increase in misanthropy is .3,
given that idealism has value zero (the
nonidealists).
$$\theta_{X \to Y} = b_X + b_I M$$
• The conditional effect of X on Y given a
particular value of the moderator M is the
conditional slope for predictor X plus the
interaction slope times the value of the
moderator.
• Idealism as moderator, simple effect of
misanthropy
$$\theta_{X \to Y} = .3 + (-.285)M$$
Simple Slopes for Misanthropy
• Idealism = 0 (nonidealists)
$$\theta_{X \to Y} = .3 + (-.285)(0) = .3$$
• Each one-point increase in misanthropy leads
to a .3-point increase in predicted AR.
• Idealism = 1 (idealists)
$$\theta_{X \to Y} = .3 + (-.285)(1) = .015$$
• Each one-point increase in misanthropy leads
to a .015-point increase in predicted AR.
$$\theta_{X \to Y} = b_X + b_I M$$
• Misanthropy as moderator, simple effects
of idealism (group differences)
$$\theta_{X \to Y} = .779 + (-.285)M$$
Full Model Slope 2
• AR = 1.626 + (.300)Misanthropy +
(.779)Idealism + (- .285)Interaction.
• This is a conditional slope.
• The predicted increase in AR
accompanying a one-point increase in
idealism (idealism groups were coded 0,1)
is .779 given that misanthropy has value
zero.
$$\theta_{X \to Y} = b_X + b_I M$$
• Treating misanthropy as the moderator, the simple
slope for the effect of idealism on AR is
$$\theta_{X \to Y} = .779 + (-.285)M$$
Simple Slopes for Idealism
• Predict the difference between the two
idealism groups (idealist minus
nonidealist) when misanthropy = 1:
• .779 - .285(1) = .494.
• If misanthropy = 4, the predicted difference
in means is .779 - .285(4) = -.361
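The same arithmetic can be run at any value of the moderator. A short Python sketch (an illustration only), using the rounded coefficients from the slide equation; the function name is made up.

```python
def group_difference(misanthropy, b_idealism=0.779, b_interaction=-0.285):
    """Predicted AR for idealists minus nonidealists at a given misanthropy score."""
    return b_idealism + b_interaction * misanthropy

for m in (1, 2, 3, 4):
    print(m, round(group_difference(m), 3))
```

The predicted difference shrinks by .285 for each one-point increase in misanthropy and changes sign between misanthropy = 2 and 3.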
Probing the Interaction
• Same as simple effects analysis in ANOVA
• We have already shown that the
relationship between misanthropy and
support for animal rights is significant for
nonidealists but not for idealists.
• Change perspectives: how does
misanthropy moderate the relationship
between idealism (group) and support of
animal rights?
Analysis of Simple Slopes
• Arbitrarily pick two or more values of
misanthropy and compare the groups at
those points.
• The points are often 1 SD below the
mean, the mean, and 1 SD above the
mean.
• Here, that would be misanthropy = 1.65,
2.32, and 2.99.
Testing the Simple Slopes
• To test the null that mean AR does not
differ between groups when misanthropy =
1.65, we center the misanthropy scores
around 1.65, recompute the interaction
term, and run the full model again.
• We repeat this action with the scores
centered around 2.32 and then again
centered around 2.99.
• See the code in the program.
The Code
Data Centered; set kevin;
MisanthLow = misanth - 1.65;
InteractLow = MisanthLow * Idealism;
MisanthMean = misanth - 2.32;
InteractMean = MisanthMean * Idealism;
MisanthHigh = misanth - 2.99;
InteractHigh = MisanthHigh * Idealism;
proc reg;
Low: model ar = MisanthLow idealism InteractLow;
Mean: model ar = MisanthMean idealism InteractMean;
High: model ar = MisanthHigh idealism InteractHigh;
run; Quit;
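Why centering works: subtracting a constant c from misanthropy leaves the fitted values unchanged but turns the idealism coefficient into the group difference at misanthropy = c. A minimal sketch in Python (not the SAS program), using the Model CGI estimates; the function names are made up for illustration.

```python
def recenter(coefs, c):
    """Coefficients of the same model after replacing C with (C - c)."""
    b0, b_c, b_g, b_cg = coefs
    return (b0 + b_c * c, b_c, b_g + b_cg * c, b_cg)

def predict(coefs, c_value, g):
    """Predicted Y from intercept, C, G, and C*G coefficients."""
    b0, b_c, b_g, b_cg = coefs
    return b0 + b_c * c_value + b_g * g + b_cg * c_value * g

original = (1.62581, 0.30006, 0.77869, -0.28472)  # Model CGI estimates
low = recenter(original, 1.65)

# Same prediction for an idealist with misanthropy = 2.0 under both codings.
print(round(predict(original, 2.0, 1), 4), round(predict(low, 2.0 - 1.65, 1), 4))
# The idealism coefficient is now the group difference at misanthropy = 1.65.
print(round(low[2], 3))
```

The sketch shows only how the coefficient moves. Its standard error and t test still come from refitting the model in Proc Reg, as in the code above.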
Low Misanthropy
• When MIS is low, AR is significantly
higher (by .309) in the idealistic group
than it is in the nonidealistic group.
Average Misanthropy
• The groups do not differ significantly when
MIS is average.
High Misanthropy
• The groups do not differ significantly when
MIS is High.
Process Hayes
• Makes it way easier to do this analysis.
• Bring the process.sas program into SAS
and run it.
• You have already read the data into the
work file “kevin.”
• Hayes also provides a script to do the
same in SPSS.
The SAS Macro
%process (data=kevin,vars=ar misanth
idealism,y=ar,x=idealism,m=misanth,
model=1,jn=1,plot=1);
• Data= points to the SAS data file
• Vars= identifies the variables
• Y= identifies the outcome variable
• X= identifies the focal predictor variable
• M= identifies the moderator variable
• Model=1 identifies the simple moderation model –
see the templates document
• jn=1 invokes the Johnson-Neyman technique
• Plot=1 requests the values for making a plot to
visualize the interaction.
• Notice that the output includes all of the tests
we did earlier, the hard way.
Johnson-Neyman Technique
• Maps out the values of the moderator for
which the effect of the focal predictor is
significant versus those values for which it
is not significant.
• I’ll use idealism groups as the focal
predictor and misanthropy as the
moderator.
The Boundary
• When misanthropy = 2.1286 or less, the
difference between the groups is
statistically significant (higher for the
idealists), otherwise it is not.
• If we were to extrapolate beyond misanthropy = 4,
we would find a second region where the
difference between the groups would be significant
(with the mean higher for the nonidealists).
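How such boundaries arise: the Johnson-Neyman points are the moderator values at which the conditional effect divided by its standard error equals the critical t, which reduces to a quadratic in the moderator. A hedged sketch in Python, not the PROCESS implementation. The function name is made up, and every number in the example is hypothetical, because the printed output does not include the covariance between the idealism and interaction estimates that would be needed to reproduce 2.1286.

```python
import math

def jn_boundaries(b_x, b_int, var_x, var_int, cov_x_int, t_crit):
    """Moderator values M where |b_x + b_int*M| / SE(b_x + b_int*M) equals t_crit."""
    a = b_int ** 2 - t_crit ** 2 * var_int
    b = 2 * (b_x * b_int - t_crit ** 2 * cov_x_int)
    c = b_x ** 2 - t_crit ** 2 * var_x
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        return []  # no pair of real boundaries
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

# Hypothetical estimates, variances, covariance, and critical t (illustration only).
bounds = jn_boundaries(b_x=1.0, b_int=-0.5, var_x=0.09, var_int=0.01,
                       cov_x_int=-0.025, t_crit=2.0)
print([round(m, 3) for m in bounds])
```

Two roots correspond to two boundaries, which matches the slide's point that a second significance region can appear if you extrapolate far enough.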
Don’t Confuse Test of Slopes
with Test of Correlation
Coefficients
• If the slopes are the same across groups,
the correlation coefficients (standardized
slopes) may or may not be.
• If the correlation coefficients are the same
across groups, the slopes may or may not be.
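A toy illustration of the first point, in Python. The data are made up for this sketch: both groups share the same X values and the same least-squares slope, but one group has more scatter around its line.

```python
import math

def slope_and_r(x, y):
    """Least-squares slope of y on x and the Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_group_a = [2.5, 3, 7, 7, 10.5]  # made-up data, small residuals
y_group_b = [3, 2, 8, 6, 11]      # made-up data, larger residuals

slope_a, r_a = slope_and_r(x, y_group_a)
slope_b, r_b = slope_and_r(x, y_group_b)
print(slope_a, round(r_a, 3))
print(slope_b, round(r_b, 3))
```

The slope depends on how much Y changes per unit of X. The correlation also depends on how tightly the points cluster around the line, so the two can disagree across groups.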
Different Slopes, Similar
Correlations
Identical Slopes, Different
Correlation Coefficients
Comparing the Groups on r
• The code for this analysis is at "SPSS and SAS
programs for comparing Pearson correlations and OLS
regression coefficients."
• r is significant for nonidealists, not for
idealists.
• For more than two groups, use this
chi-square test (also available at the link above).
Analysis of Covariance
• You already know how to do this.
• Just drop the interaction term from the
model.
• Here that would not be appropriate, as
we have heterogeneity of regression.
A Couple of t Tests
• You may also want to compare the groups
on Y (ignoring C) and/or on C.
• I have included those tests in the program.
• This is redundant with the initial Proc Corr
output (point biserial correlations).
SPSS
• This analysis is easy to do with SPSS too.
• See my handout.
• You can do the analysis in a sequential
fashion.
• And get the partial F tests from SPSS,
even with df > 1: Leave your calculator in
the desk drawer.
Three Groups
• Two dummy variables, G1 and G2
• Two interaction terms, G1C and G2C
• To test the slopes you would see if the
model fit were significantly reduced by
simultaneously removing G1C and G2C .
• That would be an F with two df in its
numerator.
• To test the intercepts, remove both G1 and
G2
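What the three-group design looks like for a single case, sketched in Python. The group labels and the choice of reference group are hypothetical; the point is only how G1, G2, G1C, and G2C are built from the group and the covariate.

```python
GROUPS = ("g0", "g1", "g2")  # hypothetical labels; g0 is the reference group

def design_row(c, group):
    """Predictors for one case: covariate, two dummies, two interaction terms."""
    g1 = 1 if group == GROUPS[1] else 0
    g2 = 1 if group == GROUPS[2] else 0
    return {"C": c, "G1": g1, "G2": g2, "G1C": g1 * c, "G2C": g2 * c}

for group in GROUPS:
    print(group, design_row(2.5, group))
```

Dropping G1C and G2C together gives the two-df test of slopes; dropping G1 and G2 together gives the two-df test of intercepts.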
Let’s Go Fishing
• Length = a + bWeight for flounder
• Does the relationship differ across
regions?
– Pamlico Sound
– Pamlico River
– Tar River
• Potthoff3.sas and Potthoff3.dat
• Download and run.
Proc GLM
• Proc GLM; class Location;
  Model Length = WeightSR|Location;
• GLM creates the (2) dummy variables for
you
• And the (2) interaction terms.
Full Model
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              5      1927227.450    385445.490   3088.09   <.0001
Error            745        92988.508       124.817
Corrected Total  750      2020215.957
Covariate Only Model
Proc GLM; class Location;
model Length = WeightSR / solution;
Weights are significantly correlated with
lengths, r² = .95, F(1, 749) = 14,544, p < .001.
Test of Coincidence
$$F = \frac{SS_{\text{reg,full}} - SS_{\text{reg,reduced}}}{(f - r)\,MSE_{\text{full}}} = \frac{1927227.450 - 1921272.704}{(5 - 1)(124.817)} = 11.927$$
• On 4, 745 df, p < .001
• The lines are not coincident.
Look at Full Model
• The slopes do not differ significantly,
F(2, 745) = 1.63, p = .20.
• The intercepts do differ significantly,
F(2, 745) = 4.90, p = .008.
• Since the slopes do not differ significantly
• But the intercepts do,
• The group means must differ.
Source              DF   Type III SS    Mean Square   F Value   Pr > F
WeightSR             1   692152.6411    692152.6411   5545.35   <.0001
Location             2     1223.2971       611.6486      4.90   0.0077
WeightSR*Location    2      407.5286       203.7643      1.63   0.1961
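With more than two groups the tests come straight from the Type III sums of squares, and the F ratios are easy to confirm by hand. A small Python check (an illustration, not part of Potthoff3.sas), using the values in the GLM output and the full-model error mean square; the function name is made up.

```python
MSE_FULL = 124.817  # error mean square from the full flounder model

def type3_f(type3_ss, df):
    """F ratio for an effect: (Type III SS / df) / full-model MSE."""
    return (type3_ss / df) / MSE_FULL

f_intercepts = type3_f(1223.2971, 2)  # Location
f_slopes = type3_f(407.5286, 2)       # WeightSR*Location
print(round(f_intercepts, 2), round(f_slopes, 2))
```

Each F has 2 df in its numerator, so the t = SQRT(F) shortcut from the two-group case does not apply here.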
Analysis of Covariance
• Location significantly affects mean length
of flounder, after adjusting for the effect of
weight.
Unadjusted Means (notice the
different pattern)