Analysis of Covariance

• Allows comparison between groups allowing for the effect of another variate
  – e.g. comparison of nitrate levels over regions allowing for differences in salinity
• Regression with a combination of a categorical variate (factor) and a continuous variate
• Additive model: simple interpretation, separate effects

Example – Mountain Possums

• trapped at seven different sites
• nine morphometric measurements
• gender and age also recorded

1. Is there any gender difference in total length?
2. Relationship between total length and head length - is this the same for males and females?
   Prediction of animal size from skulls

Gender difference?

sex    N   Mean  StDev  SE Mean
f     43  87.91   4.18     0.64
m     61  86.51   4.34     0.56

Difference = mu (f) - mu (m)
Estimate for difference: 1.39550
95% CI for difference: (-0.28452, 3.07552)
T-Test of difference = 0 (vs not =): T-Value = 1.65  P-Value = 0.102  DF = 92

Conclusion?

Relationship between lengths

Linear regression:

Predictor     Coef  SE Coef     T      P
Constant     9.888    8.000  1.24  0.219
hdlngth    0.83367  0.08633  9.66  0.000

[Figure: Fitted Line Plot of totlngth vs hdlngth, totlngth = 9.888 + 0.8337 hdlngth; S = 3.13075, R-Sq = 47.8%, R-Sq(adj) = 47.2%]

Head length and gender?

The regression equation is
totlngth = 8.26 + 0.864 hdlngth - 2.06 male

Predictor     Coef  SE Coef      T      P
Constant     8.261    7.615   1.08  0.281
hdlngth    0.86432  0.08249  10.48  0.000
male       -2.0646   0.5957  -3.47  0.001

• Both terms are significant
• Additive model
  – Total length increases by 0.86 for each unit increase in head length
  – For the same head length males are 2 units shorter

[Figure: Scatterplot of totlngth and main effects fits vs hdlngth, by sex (f, m)]

Simple interpretation, but not a convincing fit!
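
To make the additive model concrete, the following is a minimal Python sketch that plugs the reported coefficients into the regression equation. It assumes that `male` is an indicator coded 1 for males and 0 for females (consistent with the interpretation that males are about 2 units shorter). The function names `predict_additive` and `t_value` are hypothetical and not part of the original output.

```python
# Coefficients copied from the additive-model output above.
INTERCEPT = 8.261
B_HDLNGTH = 0.86432
B_MALE = -2.0646


def predict_additive(hdlngth, male):
    """Fitted total length under the additive (main effects) model.

    Assumption: male is an indicator, 1 for males and 0 for females.
    """
    return INTERCEPT + B_HDLNGTH * hdlngth + B_MALE * male


def t_value(coef, se_coef):
    """T statistic as shown in the output: coefficient / standard error."""
    return coef / se_coef


if __name__ == "__main__":
    for h in (85, 95):
        f = predict_additive(h, male=0)
        m = predict_additive(h, male=1)
        # The gap m - f is identical at every head length: parallel lines.
        print(f"hdlngth={h}: female={f:.2f}, male={m:.2f}, gap={m - f:.2f}")
    print(f"T for hdlngth: {t_value(B_HDLNGTH, 0.08249):.2f}")
    print(f"T for male:    {t_value(B_MALE, 0.5957):.2f}")
```

Because the model has no product term, the two fitted lines share one slope and differ only by the constant shift given by the `male` coefficient.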
Evidence of different slopes: INTERACTION

Interaction model

• Need product between indicators and continuous variables (here: m_hdlen = hdlngth * male)

The regression equation is
totlngth = -28.7 + 1.27 hdlngth + 45.1 male - 0.511 m_hdlen

Predictor     Coef  SE Coef      T      P
Constant    -28.72    15.98  -1.80  0.075
hdlngth     1.2657   0.1733   7.30  0.000
male         45.08    18.06   2.50  0.014
m_hdlen    -0.5107   0.1955  -2.61  0.010

• All terms except the constant (P = 0.075) are significant at the 5% level

Use Scatterplot with fitted lines for groups to display this

[Figure: Scatterplot of totlngth vs hdlngth with fitted lines by sex (f, m)]

Analysis of covariance

• output as for standard ANOVA:

Source           DF       SS      MS      F      P
Regression        3  1077.41  359.14  42.94  0.000
Residual Error  100   836.41    8.36
Total           103  1913.83

• Total DF: N - 1 = 104 - 1 = 103
• Regression DF = p - 1, with p = 4 estimated parameters
• Residual Error DF = N - p = 104 - 4 = 100
• MS = SS/DF
• F = MS (Regr.) / MS (Residual Err.)

Unusual Observations

Obs  hdlngth  totlngth     Fit  SE Fit  Residual  St Resid
 31       93    93.000  86.419   0.370     6.581      2.29 R
 39       85    75.000  78.479   1.364    -3.479     -1.36 X
 48       99    85.000  90.797   0.633    -5.797     -2.05 R
 51       96    85.000  92.275   0.743    -7.275     -2.60 R
 55      103    92.500  94.194   0.993    -1.694     -0.62 X
 73       83    82.000  78.643   1.013     3.357      1.24 X
 82       86    82.000  80.125   1.153     1.875      0.71 X
 85       90    92.000  83.927   0.483     8.073      2.83 R

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.

Interpretation? Look at these observations in the scatter plot!
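
The arithmetic behind the interaction model, the ANOVA table and the standardized residuals can be checked by hand. The sketch below is illustrative only and rests on several assumptions: `male` is coded 1 for males and 0 for females; the standardized residual uses the usual least-squares identity Var(residual) = MS residual - (SE Fit)^2; and the cutoff of 2 for the R flag is an assumption that happens to agree with the flagged rows, since the output itself only says "large". All function names are hypothetical.

```python
import math

# Coefficients copied from the interaction-model output above.
B0, B_HD, B_MALE, B_INT = -28.72, 1.2657, 45.08, -0.5107


def group_line(male):
    """Return (intercept, slope) of the fitted line for one sex.

    Assumption: male is coded 1 for males and 0 for females.
    """
    return B0 + B_MALE * male, B_HD + B_INT * male


def anova_f(ss_regr, df_regr, ss_resid, df_resid):
    """Return (MS regression, MS residual, F) with MS = SS / DF."""
    ms_regr = ss_regr / df_regr
    ms_resid = ss_resid / df_resid
    return ms_regr, ms_resid, ms_regr / ms_resid


def standardized_residual(residual, se_fit, ms_resid):
    """Residual divided by its estimated standard error."""
    return residual / math.sqrt(ms_resid - se_fit ** 2)


if __name__ == "__main__":
    for label, male in (("f", 0), ("m", 1)):
        intercept, slope = group_line(male)
        print(f"{label}: totlngth = {intercept:.2f} + {slope:.4f} hdlngth")

    ms_regr, ms_resid, f_stat = anova_f(1077.41, 3, 836.41, 100)
    print(f"MS regr = {ms_regr:.2f}, MS resid = {ms_resid:.2f}, F = {f_stat:.2f}")

    # (Obs, Residual, SE Fit) for two rows of the Unusual Observations table.
    for obs, resid, se_fit in ((31, 6.581, 0.370), (55, -1.694, 0.993)):
        st = standardized_residual(resid, se_fit, ms_resid)
        flag = "R" if abs(st) > 2 else ""  # cutoff of 2 is an assumption
        print(f"obs {obs}: St Resid = {st:.2f} {flag}")
```

With the product term in the model each sex gets its own intercept and its own slope, which is why the fitted lines in the scatterplot are no longer parallel.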