Chapter 9: Multiple Regression

We can use several predictors to build a model for brain weight of mammals.

> brain <- read.csv("data/brain-body.csv")
> plot(brain)        ## matrix of scatterplots
> plot(log(brain))   ## use log on all variables

[Figure: scatterplot matrices of brain, body, gestation, and litter, on the raw scale and on the log scale.]

Which scale is preferred? Which predictor has the strongest correlation with brain weight (logged or raw)?

> brainL2 <- log(brain)
> names(brainL2) = c("Lbrain","Lbody","Lgest", "Llitter")
> brain.fit1 <- lm(Lbrain ~ Lbody, brainL2)
> summary(brain.fit1)

Call:
lm(formula = Lbrain ~ Lbody, data = brainL2)

Residuals:
     Min       1Q   Median       3Q      Max
-1.16218 -0.44640 -0.04525  0.35076  1.83561

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.33235    0.07325   31.84   <2e-16
Lbody        0.71919    0.02037   35.30   <2e-16

Residual standard error: 0.5781 on 94 degrees of freedom
Multiple R-squared: 0.9299, Adjusted R-squared: 0.9291
F-statistic: 1246 on 1 and 94 DF, p-value: < 2.2e-16

## Two possible predictors remain. Do they improve the fit?
> par(mfrow=c(1,2))
> plot(resid(brain.fit1) ~ brainL2$Lgest)
> ## plot(resid(brain.fit1) ~ brainL2$Llitter)
> plot(resid(brain.fit1) ~ jitter(brainL2$Llitter))
> cor(resid(brain.fit1), brainL2$Lgest)     ## .29
> cor(resid(brain.fit1), brainL2$Llitter)   ## -.44

[Figure: resid(brain.fit1) plotted against brainL2$Lgest (left) and against jitter(brainL2$Llitter) (right).]

Add a second predictor. This takes us into another dimension.

> brain.fit2 <- lm(Lbrain ~ Lbody + Llitter, data=brainL2)
> summary(brain.fit2)

Residuals:
     Min       1Q   Median       3Q      Max
-1.13795 -0.32132 -0.02347  0.35217  1.65212

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.79827    0.10014  27.945  < 2e-16
Lbody        0.65153    0.02079  31.339  < 2e-16
Llitter     -0.53845    0.09029  -5.963 4.42e-08

Residual standard error: 0.4943 on 93 degrees of freedom
Multiple R-squared: 0.9493, Adjusted R-squared: 0.9482
F-statistic: 869.9 on 2 and 93 DF, p-value: < 2.2e-16

Important: the t-test in the Lbody row is for the linear effect of log body weight on log brain weight given that log litter size is in the model. Similarly, the t-test in the Llitter row is for the linear effect of log litter size on log brain weight given that log body weight is in the model.

> anova(brain.fit2)
Analysis of Variance Table

Response: Lbrain
          Df Sum Sq Mean Sq  F value    Pr(>F)
Lbody      1 416.40  416.40 1704.268 < 2.2e-16
Llitter    1   8.69    8.69   35.562 4.417e-08
Residuals 93  22.72    0.24

The anova command is sequential. It first does an extra-sum-of-squares (ESS) F test for the SLR of Lbrain on Lbody compared to a null model of a single mean (no Lbody effect). Then it adopts the model with the Lbody effect as the null model and tests whether Llitter improves the fit.
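As a hand check on the sequential table, the two F values can be reproduced from the printed sums of squares. One detail worth noting: for a single fitted model, R's anova divides every row by the residual mean square of the full model (here 22.72 on 93 df), which is why the Lbody F value is about 1704 rather than the 1246 reported in the SLR summary. The figures below use the rounded values from the table, so they agree with the printed output only approximately.

```latex
\begin{align*}
\mathrm{MS}_{\text{resid}} &= \frac{22.72}{93} \approx 0.244 \\[4pt]
F_{\text{Lbody}} &= \frac{416.40 / 1}{22.72 / 93} \approx 1704 \\[4pt]
F_{\text{Llitter} \mid \text{Lbody}} &= \frac{8.69 / 1}{22.72 / 93} \approx 35.6
\end{align*}
```

The last value also matches the square of the Llitter t statistic in summary(brain.fit2): (-5.963)^2 is about 35.56, and the two p-values agree (4.42e-08).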
The second test tests the Llitter effect conditional on having Lbody in the model.

Try the other predictor:

> brain.fit3 <- lm(Lbrain ~ Lbody + Lgest, data=brainL2)
> summary(brain.fit3)

Residuals:
     Min       1Q   Median       3Q      Max
-1.00286 -0.30372 -0.05242  0.37851  1.58788

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.45728    0.45848  -0.997    0.321
Lbody        0.55117    0.03236  17.033  < 2e-16
Lgest        0.66782    0.10875   6.141 2.00e-08

Residual standard error: 0.4902 on 93 degrees of freedom
Multiple R-squared: 0.9501, Adjusted R-squared: 0.949
F-statistic: 885.2 on 2 and 93 DF, p-value: < 2.2e-16

The t-test in the Lbody row is for: ________
The t-test in the Lgest row is for: ________

The F test uses the “single mean” null model, so it answers the question: “Is either predictor helping to explain the response?”

> anova(brain.fit3)
Analysis of Variance Table

Response: Lbrain
          Df Sum Sq Mean Sq  F value    Pr(>F)
Lbody      1 416.40  416.40 1732.785 < 2.2e-16
Lgest      1   9.06    9.06   37.713 2.002e-08
Residuals 93  22.35    0.24

First test: ________
Second test: ________

> brain.fit4 <- lm(Lbrain ~ Lbody + Lgest + Llitter, data=brainL2)
> summary(brain.fit4)

Residuals:
     Min       1Q   Median       3Q      Max
-0.95415 -0.29639 -0.03105  0.28111  1.57491

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.85482    0.66167   1.292  0.19962
Lbody        0.57507    0.03259  17.647  < 2e-16
Lgest        0.41794    0.14078   2.969  0.00381
Llitter     -0.31007    0.11593  -2.675  0.00885

Residual standard error: 0.4748 on 92 degrees of freedom
Multiple R-squared: 0.9537, Adjusted R-squared: 0.9522
F-statistic: 631.6 on 3 and 92 DF, p-value: < 2.2e-16

Lbody line tests: ________
Lgest line tests: ________
Llitter line tests: ________

> anova(brain.fit4)
Analysis of Variance Table

Response: Lbrain
          Df Sum Sq Mean Sq   F value    Pr(>F)
Lbody      1 416.40  416.40 1847.4486 < 2.2e-16
Lgest      1   9.06    9.06   40.2084 8.388e-09
Llitter    1   1.61    1.61    7.1541  0.008852
Residuals 92  20.74    0.23

First test: ________
Second test: ________
Third test: ________

Coefficient estimates:

Model number   intercept   Lbody   Lgest   Llitter
1
2
3
4

These predictors are correlated, sharing some of the same information, so when we add a term to the model, it changes the coefficient estimates for the others.

With multiple regression, all inferences are conditional on the other terms in the model. The t-tests are conditional on all other terms (above or below). The anova F tests are sequential, conditional on the terms in the rows above.

Assumptions are the same as with SLR.
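The difference between conditional t-tests and sequential F tests can be seen in a small simulation. The sketch below uses made-up data (the variables x1, x2, y and the data frame sim are hypothetical stand-ins, not the mammal data) with two correlated predictors. It fits the same model with the terms entered in both orders, and it also compares the x1 slope with and without x2 in the model.

```r
## Simulated stand-in data (NOT the mammal data): two correlated predictors.
set.seed(1)
n  <- 96
x1 <- rnorm(n)
x2 <- 0.7 * x1 + rnorm(n, sd = 0.5)   # x2 shares information with x1
y  <- 2 + 0.6 * x1 + 0.4 * x2 + rnorm(n, sd = 0.5)
sim <- data.frame(y, x1, x2)

fit.a <- lm(y ~ x1 + x2, data = sim)  # x1 entered first
fit.b <- lm(y ~ x2 + x1, data = sim)  # same model, x2 entered first

## t statistics: conditional on all other terms, so order does not matter
t.a <- coef(summary(fit.a))[c("x1", "x2"), "t value"]
t.b <- coef(summary(fit.b))[c("x1", "x2"), "t value"]

## Sequential F statistics: each term is tested given only the terms above it
F.a <- anova(fit.a)[c("x1", "x2"), "F value"]
F.b <- anova(fit.b)[c("x1", "x2"), "F value"]

print(rbind(t.a, t.b))   # rows agree
print(rbind(F.a, F.b))   # rows differ; columns are x1, x2

## The x1 slope changes once the correlated predictor x2 is added
b.alone <- coef(lm(y ~ x1, data = sim))["x1"]
b.both  <- coef(fit.a)["x1"]
print(c(alone = unname(b.alone), with.x2 = unname(b.both)))
```

In each fit, the F statistic for the term entered last equals the square of that term's t statistic, because both test the term given everything else in the model. The same relation holds in the output above: for Llitter in brain.fit4, (-2.675)^2 is about 7.16, matching the anova F of 7.1541 up to rounding.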
Diagnostics are the same:

> par(mfrow=c(1,3)); plot(brain.fit4, which = 1:3)

[Figure: diagnostic plots for brain.fit4: Residuals vs Fitted, Normal Q-Q, and Scale-Location. The labeled points are Human being, Dolphin, and Tapir.]

Normality: seems good; no problems in the normal Q-Q plot.
Linearity: there is curvature in the first plot, so linearity is questionable.
Constant variance: slight fan shape in plot 1, slight trend in plot 3.
Independent observations: no reason to doubt independence.

To address the curvature, add a quadratic term in Lbody:

> summary(brain.fit5 <- update(brain.fit4, . ~ . + I(Lbody^2)))

Call:
lm(formula = Lbrain ~ Lbody + Lgest + Llitter + I(Lbody^2), data = brainL2)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.927918   0.643674   1.442  0.15285
Lbody        0.618311   0.035979  17.185  < 2e-16
Lgest        0.415145   0.136820   3.034  0.00314
Llitter     -0.280850   0.113250  -2.480  0.01498
I(Lbody^2)  -0.013114   0.005178  -2.532  0.01304

Residual standard error: 0.4614 on 91 degrees of freedom
Multiple R-squared: 0.9567, Adjusted R-squared: 0.9548
F-statistic: 503.2 on 4 and 91 DF, p-value: < 2.2e-16
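One way to judge whether a quadratic term is worth keeping is to compare the models with and without it directly. The sketch below does this on simulated data with mild curvature; the variables x, y and the data frame sim are hypothetical and are not the mammal data. It uses update() in the same way as above and then calls anova() on the two nested fits.

```r
## Simulated stand-in data with mild curvature (NOT the mammal data).
set.seed(2)
n <- 96
x <- runif(n, -4, 8)
y <- 2 + 0.7 * x - 0.02 * x^2 + rnorm(n, sd = 0.5)
sim <- data.frame(x, y)

fit.lin  <- lm(y ~ x, data = sim)
fit.quad <- update(fit.lin, . ~ . + I(x^2))   # add the quadratic term

## Extra-sum-of-squares F test: does the quadratic term improve the fit?
cmp    <- anova(fit.lin, fit.quad)
F.quad <- cmp[2, "F"]

## t statistic for the quadratic term in the larger model
t.quad <- coef(summary(fit.quad))["I(x^2)", "t value"]

print(cmp)
print(c(F = F.quad, t.squared = t.quad^2))   # the two values agree
```

Because only one term is added, the F statistic from the model comparison is the square of the t statistic for that term. Applied to the output above, the I(Lbody^2) row has t = -2.532, so comparing brain.fit4 with brain.fit5 would give F of about 6.41 with the same p-value, 0.013.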