DO ANIMALS WHO LIVE LONGER HAVE LONGER GESTATION PERIODS?

Animal            Gestation (days)   Longevity (years)
baboon                  187                 20
bear (grizzly)          225                 25
beaver                  122                  5
camel                   406                 12
chimpanzee              231                 20
cow                     284                 15
dog                      61                 12
elephant                645                 40
fox                      52                  7
giraffe                 425                 10
goat                    151                  8
guinea pig               68                  4
horse                   330                 20
leopard                  98                 12
monkey                  164                 15
mouse                    21                  3
pig                     112                 10
rabbit                   31                  5
sea lion                350                 12
squirrel                 44                 10
wolf                     63                  5

1. Judging from the Research Question at the top of this page,
   a. Explanatory variable = longevity    Units = years
   b. Response variable = gestation    Units = days

2. Make a scatterplot of the data on your calculator and sketch it neatly below.

3. Describe the plot (DOFS).
   Fairly strong, positive, linear association. One point lies far from the pack (the elephant).

4. Do a linear regression on your calculator and write the regression equation here, and save the equation to Y1 (use LinReg L1, L2, Y1). Add the line to your scatterplot above.
   gestation-hat = 16.760 + 13.770 (longevity)
   a. Interpret the y-intercept of this line.
      An animal with a longevity of 0 years is predicted to have a gestation period of 16.760 days. (No animal has a longevity of 0, so this value has no practical meaning.)
   b. Interpret the slope of this line.
      For each additional year of longevity, we predict 13.770 more days of gestation.

5. r = 0.7316    Interpret this value:
   Fairly strong, positive linear association.

6. r² = 0.535    Interpret this value:
   53.5% of the variation in gestation can be explained by the linear model relating gestation to longevity.

7. Use your calculator to create a residuals plot and sketch it neatly below. Be sure to include a horizontal line where resid = 0.
   One var stats on RESID:

8. Does there appear to be any relationship between residuals and longevities?
   Not really.

9. Calculate s, the standard deviation of the residuals (see Step 4 on page 178 of your text). Note: you'll need to store RESID to another list name (e.g., L6) to do a OneVarStats. Units?
   s = SQRT(245746.6/(21-2)) = 114 days, where 245746.6 is the sum of the squared residuals and 21 is the number of animals.

10. Which of the animals is clearly an outlier both in longevity and in gestation period?
Circle this animal on your scatterplot and on your residuals plot, and calculate its residual value (take its gestation period from the table and subtract its predicted value according to your regression line).
   elephant: resid = 645 - (16.76 + 13.77*40) = 645 - 568 = 77 days
   a. Does your model over- or under-predict that animal's gestation period?
      It under-predicts (the residual is positive).
   b. Does it seem to have the largest residual (in absolute value) of any animal?
      No. Looking at the residuals plot, this is not a relatively large residual.

Outlier: In the context of regression lines, outliers are observations which fall far from the regression line, not following the pattern of the line. Outliers may have large residuals.

11. Is the animal you identified in #10 an outlier to the regression line?
    No. It is extreme in both longevity and gestation, but it follows the pattern of the line, so its residual is modest.

12. Which animal does have the largest (in absolute value) residual? Is its gestation period longer or shorter than expected for an animal with its longevity?
    giraffe; longer. Its residual is positive: 425 - (16.76 + 13.77*10) = 425 - 154 = 271 days.

13. Use the Stats List Editor to eliminate the giraffe's information from the analysis. Use your calculator to determine the equation of the regression line for predicting gestation period from longevity in this case. Record the equation and the value of r² below. Save the new regression line to Y2 and add this line to your scatterplot above (label it "No Giraffe").
    predicted gestation = -3.88 + 14.317 (longevity)    r = 0.802, r² = 0.644

14. Compare your new regression line to the original one. Is it substantially different?
    No. The difference is barely visible on the graph.

15. Return the giraffe's values to the list (gestation = 425, longevity = 10; you can just add it at the bottom of the lists) and now eliminate the elephant's data. Find the new equation of the regression line and save it to Y3. Record the regression equation and the new r² here. Add the line to your scatterplot above and label it "No elephant."
    predicted gestation = 48.18 + 10.70 (longevity)    r = 0.511, r² = 0.261

16.
Compare your new regression line to the original one. Is it substantially different? In which case (giraffe or elephant) did the removal of one animal affect the regression line more?
    Yes, it is substantially different. Removing the elephant is certainly more influential than removing the giraffe.

Influential Observation: In the context of regression lines, influential points are ones whose removal would substantially affect the regression line. Points that are outliers in the x-direction of a scatterplot are often influential for the regression.

17. To appreciate even further the potential influence of the elephant, change its gestation period to 45 days instead of 645 days. Use your calculator to find the new regression line (save it to Y4) and record the line and r² value here. Add this line to the scatterplot (label it "Elephant=45"). Describe what happened.
    The line is much flatter; hardly any of the variation in gestation is accounted for by this linear model relating gestation to longevity.
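
For anyone who wants to check the calculator work away from the TI, the sketch below repeats the same least-squares steps in plain Python. It is only an illustration, not part of the worksheet: the helper names fit_line and drop, and every variable name, are made up here, and the data are typed in from the table at the top of the page.

```python
from math import sqrt

# Data copied from the worksheet table (same order as the table rows).
animals = ["baboon", "bear (grizzly)", "beaver", "camel", "chimpanzee", "cow",
           "dog", "elephant", "fox", "giraffe", "goat", "guinea pig", "horse",
           "leopard", "monkey", "mouse", "pig", "rabbit", "sea lion",
           "squirrel", "wolf"]
gestation = [187, 225, 122, 406, 231, 284, 61, 645, 52, 425, 151, 68, 330,
             98, 164, 21, 112, 31, 350, 44, 63]          # days
longevity = [20, 25, 5, 12, 20, 15, 12, 40, 7, 10, 8, 4, 20,
             12, 15, 3, 10, 5, 12, 10, 5]                # years


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b, sxy ** 2 / (sxx * syy)


def drop(name):
    """Return (longevity, gestation) lists with one animal removed."""
    keep = [i for i, animal in enumerate(animals) if animal != name]
    return [longevity[i] for i in keep], [gestation[i] for i in keep]


# Full data set (questions 4 to 6).
a, b, r2 = fit_line(longevity, gestation)

# Residual = observed - predicted (questions 7, 10, 12).
residuals = [y - (a + b * x) for x, y in zip(longevity, gestation)]
elephant_resid = residuals[animals.index("elephant")]
largest = max(zip(animals, residuals), key=lambda pair: abs(pair[1]))

# Standard deviation of the residuals, dividing by n - 2 (question 9).
s = sqrt(sum(e ** 2 for e in residuals) / (len(residuals) - 2))

# Refits with one animal removed or altered (questions 13, 15, 17).
no_giraffe = fit_line(*drop("giraffe"))
no_elephant = fit_line(*drop("elephant"))
altered = list(gestation)
altered[animals.index("elephant")] = 45
elephant_45 = fit_line(longevity, altered)

for label, (a_i, b_i, r2_i) in [("All animals", (a, b, r2)),
                                ("No giraffe", no_giraffe),
                                ("No elephant", no_elephant),
                                ("Elephant=45", elephant_45)]:
    print(f"{label:12s} gestation-hat = {a_i:.2f} + {b_i:.3f}(longevity), "
          f"r^2 = {r2_i:.3f}")
print(f"s = {s:.1f} days; elephant residual = {elephant_resid:.1f} days")
print(f"largest |residual|: {largest[0]} ({largest[1]:.1f} days)")
```

Comparing the four printed lines side by side shows the point of questions 13 to 17: dropping the animal with the largest residual barely moves the line, while dropping or altering the animal that is extreme in the x-direction changes both the slope and r² a great deal.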