Comparing Regression Lines From Independent Samples

The Design
• You have two or more groups.
• One or more continuous predictors (C).
• And one continuous outcome variable (Y).
• You want to know whether Y = a + b1C1 + … + bpCp + error is the same across groups.

Poteat, Wuensch, & Gregg
• Predictive validity study
• Children referred for school psychology services
• Does Grades = a + bIQ + error differ across races?
• Called a “Potthoff analysis” by school psychologists
• Differences fell short of significance.

Two Groups, One X
• Y = a + b1C + b2G + b3CG
• If there are more than two groups, group membership is represented by k-1 dummy variables.
• Wuensch, Jenkins, & Poteat
• Y = Attitude to animals
• C = Misanthropy
• G = High idealism or not

SAS
• Potthoff.sas -- download
• Potthoff.dat -- download
• Point the program file to the data file.
• Run the program.
• Data step: MxI = Misanth * Idealism
• Page 1: Ignoring idealism, there is a .2 correlation between misanthropy and attitude to animals.

Zero-Order Correlations

Interpretation
• Misanthropy is significantly related to support for animal rights.
• The two idealism groups do not differ significantly on support for animal rights, rpb = .092.
• The two idealism groups do not differ significantly on misanthropy, rpb = -.099.
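The data step and zero-order correlations described above can be sketched as follows. This is only an illustration, not the contents of Potthoff.sas: it assumes the data have already been read into a work data set named kevin (the name the program uses later) with variables ar, misanth, and idealism, and the output data set name kevin_mxi is hypothetical.

```sas
/* Sketch only: assumes work.kevin holds ar, misanth, and idealism (coded 0/1). */
data kevin_mxi;               /* hypothetical output data set name */
  set kevin;
  MxI = misanth * idealism;   /* product term that carries the interaction */
run;

proc corr data=kevin_mxi;
  /* Pearson r involving the 0/1 idealism variable is the point-biserial r */
  var ar misanth idealism;
run;
```

Because idealism is a 0/1 dummy variable, the ordinary Pearson correlations that Proc Corr reports for it are the point-biserial correlations quoted in the interpretation.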
Four Regression Models
  Proc Reg;
    CGI: model ar = misanth idealism MxI;
    C:   model ar = misanth;
    CG:  model ar = misanth idealism;
    CI:  model ar = misanth MxI;

Model CGI
Analysis of Variance
  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              3          4.05237       1.35079      5.10   0.0022
  Error            150         39.73945       0.26493
  Corrected Total  153         43.79182

Model C
Analysis of Variance
  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              1          2.13252       2.13252      7.78   0.0060
  Error            152         41.65930       0.27407
  Corrected Total  153         43.79182

Test of Coincidence
• Compare model CGI with model C
• Use a partial F test, where f and r are the numbers of predictors in the full and reduced models:
\[ F = \frac{SS_{reg,\,full} - SS_{reg,\,reduced}}{(f - r)\,MSE_{full}} \]
\[ F(2, 150) = \frac{4.05237 - 2.13252}{(3 - 1)(.26493)} = 3.623, \quad p = .029 \]

Conclusion
• This was a simultaneous test of intercept and slopes.
• We conclude that the two groups differ with respect to
• The intercepts, or
• The slopes,
• Or both.

An Easier Way
  proc reg;
    model ar = misanth idealism MxI;
    TEST idealism=0, MxI=0;
  run;

Test 1 Results for Dependent Variable ar
  Source         DF   Mean Square   F Value   Pr > F
  Numerator       2       0.95992      3.62   0.0291
  Denominator   150       0.26493

Model CI
Analysis of Variance
  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              2          2.29525       1.14763      4.18   0.0172
  Error            151         41.49657       0.27481
  Corrected Total  153         43.79182

Test of Intercepts
• Compare model CGI with model CI.
\[ F(1, 150) = \frac{4.05237 - 2.29525}{(3 - 2)(.26493)} = 6.632, \quad p = .011 \]
• The intercepts differ significantly.

F(1, 150) = 6.632, p = .011
• As you know, on one df, t = SQRT(F)
• Look back at Model CGI
• For the test of main effect of idealism, t = SQRT(6.632) = 2.58, p = .011.
• If we had more than two groups we could not take this shortcut.

Model CGI
Parameter Estimates
  Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
  Intercept    1              1.62581          0.19894      8.17     <.0001
  misanth      1              0.30006          0.08059      3.72     0.0003
  idealism     1              0.77869          0.30236      2.58     0.0110
  MxI          1             -0.28472          0.12641     -2.25     0.0258

Test of Parallelism
• Do the slopes differ significantly?
• Compare model CGI with model CG
• Is model fit significantly reduced when we remove the interaction term?
\[ F(1, 150) = \frac{4.05237 - 2.70839}{(3 - 2)(.26493)} = 5.073, \quad p = .026 \]

Model CG
Analysis of Variance
  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              2          2.70839       1.35419      4.98   0.0081
  Error            151         41.08343       0.27208
  Corrected Total  153         43.79182

F(1, 150) = 5.073, p = .026
• As you know, on one df, t = SQRT(F)
• Look back at Model CGI
• For the test of the interaction, t = -2.25, p = .026.
• If we had more than two groups we could not take this shortcut.

Get the Separate Regression Lines
• Sort by groups.
• Run the bivariate regressions.
• For nonidealists, AR = 1.63 + .30 Misanth
• For idealists, AR = 2.40 + .02 Misanth

Prepare Plots
  Proc sgplot;
    scatter x = misanth y = ar;
    reg x = misanth y = ar;
    yaxis label='Attitude to Animals' grid values=(1 to 5 by 1);
    xaxis label='Misanthropy' grid values=(1 to 5 by 1);
    by idealism;
  run;

Another Plot
  proc sgplot;
    reg x = misanth y = ar / group = idealism nomarkers;
    yaxis label='Attitude to Animals';
    xaxis label='Misanthropy';
  run;

Full Model Slope 1
• AR = 1.626 + (.300)Misanthropy + (.779)Idealism + (-.285)Interaction.
• This is a conditional slope.
• The predicted increase in AR accompanying a one-point increase in misanthropy is .3 given that idealism has value zero (the nonidealists).
\[ \theta_{X \to Y} = b_X + b_I M \]
• The conditional effect of X on Y given a particular value of the moderator is the conditional slope for predictor X plus the interaction slope times the value of the moderator.
• Idealism as moderator, simple effect of misanthropy:
\[ \theta_{X \to Y} = .3 + (-.285)M \]

Simple Slopes for Misanthropy
• Idealism = 0 (nonidealists)
\[ \theta_{X \to Y} = .3 + (-.285)(0) = .3 \]
• Each one-point increase in Misanthropy leads to a .3 point increase in AR.
• Idealism = 1 (idealists)
\[ \theta_{X \to Y} = .3 + (-.285)(1) = .015 \]
• Each one-point increase in Misanthropy leads to a .015 point increase in AR.
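The three partial F tests (coincidence, intercepts, parallelism) all use the same arithmetic, so they can be checked in one short data step. The following is a minimal sketch, not part of Potthoff.sas; the variable names are made up for illustration, and every number is copied from the ANOVA tables for Models CGI, C, CI, and CG shown above.

```sas
/* Sketch only: inputs are copied from the ANOVA tables in the slides. */
data _null_;
  ss_full  = 4.05237;   /* Model CGI regression SS (3 predictors) */
  mse_full = 0.26493;   /* Model CGI mean square error */
  df_error = 150;       /* Model CGI error df */

  /* Coincidence: full model vs. Model C (drops idealism and MxI, 2 df) */
  f_coin = (ss_full - 2.13252) / ((3 - 1) * mse_full);
  p_coin = 1 - probf(f_coin, 2, df_error);

  /* Intercepts: full model vs. Model CI (drops idealism, 1 df) */
  f_int = (ss_full - 2.29525) / ((3 - 2) * mse_full);
  p_int = 1 - probf(f_int, 1, df_error);

  /* Parallelism: full model vs. Model CG (drops MxI, 1 df) */
  f_par = (ss_full - 2.70839) / ((3 - 2) * mse_full);
  p_par = 1 - probf(f_par, 1, df_error);

  /* Write the results to the log */
  put f_coin= 8.3 p_coin= 6.4;
  put f_int=  8.3 p_int=  6.4;
  put f_par=  8.3 p_par=  6.4;
run;
```

The F values written to the log should reproduce 3.623, 6.632, and 5.073, and the p values should agree with those in the slides up to rounding.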
\[ \theta_{X \to Y} = b_X + b_I M \]
• Misanthropy as moderator, simple effects of idealism (group differences):
\[ \theta_{X \to Y} = .779 + (-.285)M \]

Full Model Slope 2
• AR = 1.626 + (.300)Misanthropy + (.779)Idealism + (-.285)Interaction.
• This is a conditional slope.
• The predicted increase in AR accompanying a one-point increase in idealism (idealism groups were coded 0,1) is .779 given that misanthropy has value zero.
\[ \theta_{X \to Y} = b_X + b_I M \]
• Treating misanthropy as the moderator, the simple slope for the effect of idealism on AR is
\[ \theta_{X \to Y} = .779 + (-.285)M \]

Simple Slopes for Idealism
• Predict the difference between the two idealism groups (idealist minus nonidealist) when misanthropy = 1:
• .779 - .285(1) = .494.
• If misanthropy = 4, the predicted difference in means is .779 - .285(4) = -.361.

Probing the Interaction
• Same as simple effects analysis in ANOVA
• We have already shown that the relationship between misanthropy and support for animal rights is significant for nonidealists but not for idealists.
• Change perspectives -- how does misanthropy moderate the relationship between idealism (group) and support of animal rights?

Analysis of Simple Slopes
• Arbitrarily pick two or more values of misanthropy and compare the groups at those points.
• The points are often 1 SD below the mean, the mean, and 1 SD above the mean.
• Here, that would be misanthropy = 1.65, 2.32, and 2.99.

Testing the Simple Slopes
• To test the null that mean AR does not differ between groups when misanthropy = 1.65, we center the misanthropy scores around 1.65, recompute the interaction term, and run the full model again.
• We repeat this action with the scores centered around 2.32 and then again centered around 2.99.
• See the code in the program.
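Before testing anything, the point estimates of the simple slopes can be computed directly from the full-model coefficients. The data step below is a rough sketch with made-up variable names; the coefficients are the Model CGI parameter estimates reported earlier, and the step gives estimates only, not significance tests.

```sas
/* Sketch only: coefficients are the Model CGI parameter estimates. */
data _null_;
  b_mis   =  0.30006;   /* conditional slope for misanthropy */
  b_ideal =  0.77869;   /* conditional slope for idealism */
  b_int   = -0.28472;   /* interaction slope (MxI) */

  /* Slope for misanthropy within each idealism group (0 = nonidealist) */
  do idealism = 0 to 1;
    slope_mis = b_mis + b_int * idealism;
    put idealism= slope_mis= 8.3;
  end;

  /* Idealist minus nonidealist difference at low, mean, and high misanthropy */
  do misanth = 1.65, 2.32, 2.99;
    diff = b_ideal + b_int * misanth;
    put misanth= diff= 8.3;
  end;
run;
```

The predicted difference shrinks as misanthropy increases and changes sign between the mean and 1 SD above the mean, which is the pattern the centered regressions then put to a test.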
The Code
  Data Centered; set kevin;
    MisanthLow = misanth - 1.65;   InteractLow = MisanthLow * Idealism;
    MisanthMean = misanth - 2.32;  InteractMean = MisanthMean * Idealism;
    MisanthHigh = misanth - 2.99;  InteractHigh = MisanthHigh * Idealism;
  proc reg;
    Low:  model ar = MisanthLow idealism InteractLow;
    Mean: model ar = MisanthMean idealism InteractMean;
    High: model ar = MisanthHigh idealism InteractHigh;
  run; Quit;

Low Misanthropy
• When MIS is low, AR is significantly higher (by .309) in the idealistic group than it is in the nonidealistic group.

Average Misanthropy
• The groups do not differ significantly when MIS is average.

High Misanthropy
• The groups do not differ significantly when MIS is high.

Process (Hayes)
• Makes it much easier to do this analysis.
• Bring the process.sas program into SAS and run it.
• You have already read the data into the work file “kevin.”
• Hayes also provides a script to do the same in SPSS.

The SAS Macro
  %process (data=kevin,vars=ar misanth idealism,y=ar,x=idealism,m=misanth, model=1,jn=1,plot=1);
• Data= points to the SAS data file
• Vars= identifies the variables
• Y= identifies the outcome variable
• X= identifies the focal predictor variable
• M= identifies the moderator variable
• Model=1 identifies the simple moderation model (see the templates document)
• jn=1 invokes the Johnson-Neyman technique
• Plot=1 requests the values for making a plot to visualize the interaction.
• Notice that the output includes all of the tests we did earlier, the hard way.

Johnson-Neyman Technique
• Maps out the values of the moderator for which the effect of the focal predictor is significant versus those values for which it is not significant.
• I’ll use idealism groups as the focal predictor and misanthropy as the moderator.
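The three pick-a-point tests done earlier by recentering can also be obtained from a single run of the uncentered model, which gives a cross-check on the recentering results and the PROCESS output. The sketch below swaps in Proc GLM and its ESTIMATE statement; it is not in the original program, and it assumes that work.kevin already contains the product term MxI.

```sas
/* Sketch only: assumes work.kevin contains ar, misanth, idealism, and MxI. */
proc glm data=kevin;
  model ar = misanth idealism MxI;
  /* Each ESTIMATE tests b(idealism) + b(MxI) * misanthropy value = 0,   */
  /* that is, the idealist minus nonidealist difference at that value.   */
  estimate 'Group diff at misanth = 1.65' idealism 1 MxI 1.65;
  estimate 'Group diff at misanth = 2.32' idealism 1 MxI 2.32;
  estimate 'Group diff at misanth = 2.99' idealism 1 MxI 2.99;
run;
quit;
```

Each estimate is the same linear combination of coefficients that the idealism slope becomes after recentering, so the t tests should match those for idealism in the Low, Mean, and High models.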
The Boundary
• When misanthropy = 2.1286 or less, the difference between the groups is statistically significant (higher for the idealists); otherwise it is not.
• If we were to extrapolate beyond misanthropy = 4, we would find a second region where the difference between the groups would be significant (with the mean higher for the nonidealists).

Don’t Confuse Test of Slopes with Test of Correlation Coefficients
• If the slopes are the same across groups, the correlation coefficients (standardized slopes) may or may not be.
• If the correlation coefficients are the same across groups, the slopes may or may not be.

Different Slopes, Similar Correlations

Identical Slopes, Different Correlation Coefficients

Comparing the Groups on r
• The code for this analysis is at “SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients.”
• r is significant for nonidealists, not for idealists.
• For more than two groups, use this chi-square test (also available at the link above).

Analysis of Covariance
• You already know how to do this.
• Just drop the interaction term from the model.
• Here that would not be appropriate, as we have heterogeneity of regression.

A Couple of t Tests
• You may also want to compare the groups on Y (ignoring C) and/or on C.
• I have included those tests in the program.
• This is redundant with the initial Proc Corr output (point-biserial correlations).

SPSS
• This analysis is easy to do with SPSS too.
• See my handout.
• You can do the analysis in a sequential fashion.
• And get the partial F tests from SPSS, even with df > 1: Leave your calculator in the desk drawer.

Three Groups
• Two dummy variables, G1 and G2
• Two interaction terms, G1C and G2C
• To test the slopes you would see if the model fit were significantly reduced by simultaneously removing G1C and G2C.
• That would be an F with two df in its numerator.
• To test the intercepts, remove both G1 and G2.

Let’s Go Fishing
• Length = a + bWeight for flounder
• Does the relationship differ across regions?
  – Pamlico Sound
  – Pamlico River
  – Tar River
• Potthoff3.sas and Potthoff3.dat
• Download and run.

Proc GLM
  Proc GLM; class Location;
    Model Length = WeightSR|Location;
• GLM creates the (2) dummy variables for you
• And the (2) interaction terms.

Full Model
  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              5      1927227.450    385445.490   3088.09   <.0001
  Error            745        92988.508       124.817
  Corrected Total  750      2020215.957

Covariate Only Model
  Proc GLM; class Location;
    model Length = WeightSR / solution;
• Weights are significantly correlated with lengths, r2 = .95, F(1, 749) = 14,544, p < .001.

Test of Coincidence
\[ F = \frac{SS_{reg,\,full} - SS_{reg,\,reduced}}{(f - r)\,MSE_{full}} = \frac{1927227.450 - 1921272.704}{(5 - 1)(124.817)} = 11.927 \]
• On 4, 745 df, p < .001
• The lines are not coincident.

Look at Full Model
• The slopes do not differ significantly, F(2, 745) = 1.63, p = .20.
• The intercepts do differ significantly, F(2, 745) = 4.90, p = .008.
• Since the slopes do not differ significantly
• But the intercepts do,
• The group means must differ.

  Source              DF    Type III SS    Mean Square   F Value   Pr > F
  WeightSR             1    692152.6411    692152.6411   5545.35   <.0001
  Location             2      1223.2971       611.6486      4.90   0.0077
  WeightSR*Location    2       407.5286       203.7643      1.63   0.1961

Analysis of Covariance
• Location significantly affects mean length of flounder, after adjusting for the effect of weight.

Unadjusted Means (notice the different pattern)
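The contrast between adjusted and unadjusted means can be produced in one Proc GLM step. This is a minimal sketch rather than the contents of Potthoff3.sas: the data set name flounder is hypothetical, while Length, WeightSR, and Location are the variable names used in the slides.

```sas
/* Sketch only: "flounder" is a hypothetical data set name. */
proc glm data=flounder;
  class Location;
  /* ANCOVA: covariate plus group, with the interaction term dropped */
  model Length = WeightSR Location / solution;
  /* Means adjusted for WeightSR, with pairwise comparison p values */
  lsmeans Location / pdiff;
  /* Unadjusted (raw) group means, for comparison with the adjusted means */
  means Location;
run;
quit;
```

Dropping the interaction is defensible here because the slopes did not differ significantly across locations; LSMEANS then reports the means adjusted for weight, while MEANS reports the raw location means.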