answer key

DO ANIMALS THAT LIVE LONGER HAVE LONGER GESTATION PERIODS?
Animal            Gestation (days)   Longevity (years)
baboon            187                20
bear (grizzly)    225                25
beaver            122                5
camel             406                12
chimpanzee        231                20
cow               284                15
dog               61                 12
elephant          645                40
fox               52                 7
giraffe           425                10
goat              151                8
guinea pig        68                 4
horse             330                20
leopard           98                 12
monkey            164                15
mouse             21                 3
pig               112                10
rabbit            31                 5
sea lion          350                12
squirrel          44                 10
wolf              63                 5
1. Judging from the Research Question at the top of this page, identify:
a. Explanatory variable = longevity Units = years
b. Response variable = gestation Units = days
2. Make a scatterplot of the data on your calculator and sketch it neatly below.
3. Describe the plot (DOFS) Fairly strong, positive, linear association. One point far from
the pack (the elephant)
4. Do a linear regression on your calculator and write the regression equation here, and
save the equation to Y1 (use LinReg L1, L2, Y1). Add the line to your scatterplot above.
gestation-hat = 16.760 + 13.770 (longevity)
a. Interpret the y-intercept of this line. An animal with a longevity of 0 years is
predicted to have a gestation period of 16.760 days (an extrapolation with no
practical meaning).
b. Interpret the slope of this line. For each additional year of longevity, we predict
13.770 more days of gestation.
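For anyone checking the calculator output by hand, here is a minimal Python sketch of the same least-squares fit. The worksheet itself uses a TI calculator (LinReg), so the names DATA and fit_line are illustrative assumptions, not part of the activity. The data are copied from the table at the top of the page.

```python
# (animal, gestation in days, longevity in years), copied from the table
DATA = [
    ("baboon", 187, 20), ("bear (grizzly)", 225, 25), ("beaver", 122, 5),
    ("camel", 406, 12), ("chimpanzee", 231, 20), ("cow", 284, 15),
    ("dog", 61, 12), ("elephant", 645, 40), ("fox", 52, 7),
    ("giraffe", 425, 10), ("goat", 151, 8), ("guinea pig", 68, 4),
    ("horse", 330, 20), ("leopard", 98, 12), ("monkey", 164, 15),
    ("mouse", 21, 3), ("pig", 112, 10), ("rabbit", 31, 5),
    ("sea lion", 350, 12), ("squirrel", 44, 10), ("wolf", 63, 5),
]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return intercept, slope


longevity = [row[2] for row in DATA]  # explanatory variable (years)
gestation = [row[1] for row in DATA]  # response variable (days)

intercept, slope = fit_line(longevity, gestation)
print(f"gestation-hat = {intercept:.3f} + {slope:.3f} (longevity)")
```

The printed equation should agree with the one in the key up to rounding.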
5. r = 0.7316
Interpret this value: fairly strong, positive linear association
6. r² = 0.535
Interpret this value: 53.5% of the variation in gestation can be
explained by our linear model relating gestation to longevity.
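The values of r and r² in #5 and #6 can be checked with a short Python sketch. This is only an illustration under the assumption that the table data are entered as shown; the function name correlation is hypothetical, and the calculator remains the tool the worksheet expects.

```python
import math

# (animal, gestation in days, longevity in years), copied from the table
DATA = [
    ("baboon", 187, 20), ("bear (grizzly)", 225, 25), ("beaver", 122, 5),
    ("camel", 406, 12), ("chimpanzee", 231, 20), ("cow", 284, 15),
    ("dog", 61, 12), ("elephant", 645, 40), ("fox", 52, 7),
    ("giraffe", 425, 10), ("goat", 151, 8), ("guinea pig", 68, 4),
    ("horse", 330, 20), ("leopard", 98, 12), ("monkey", 164, 15),
    ("mouse", 21, 3), ("pig", 112, 10), ("rabbit", 31, 5),
    ("sea lion", 350, 12), ("squirrel", 44, 10), ("wolf", 63, 5),
]


def correlation(xs, ys):
    """Pearson correlation coefficient r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


longevity = [row[2] for row in DATA]
gestation = [row[1] for row in DATA]

r = correlation(longevity, gestation)
r_squared = r ** 2  # fraction of variation in gestation explained by the line
print(f"r = {r:.4f}, r^2 = {r_squared:.3f}")
```

Squaring r gives the fraction of variation explained only because this is a simple (one-predictor) linear regression.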
7. Use your calculator to create a residuals plot and sketch it neatly below. Be sure to
include a horizontal line where resid=0.
One var stats on RESID:
8. Does there appear to be any relationship between residuals and longevities? not really
9. Calculate s, the standard deviation of the residuals (see Step 4 on page 178 of your text).
(Note: you'll need to store RESID to another list name, e.g. L6, to do a OneVarStats!)
Units?
s = SQRT(245746.6/(21-2)) = 114 days, where 245746.6 is the sum of the squared residuals
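The arithmetic in #9 can be reproduced with the sketch below: fit the line, compute each residual (observed minus predicted), sum the squares, and divide by n - 2 before taking the square root. The helper name fit_line is an assumption for illustration; the worksheet does this with RESID and OneVarStats on the calculator.

```python
import math

# (animal, gestation in days, longevity in years), copied from the table
DATA = [
    ("baboon", 187, 20), ("bear (grizzly)", 225, 25), ("beaver", 122, 5),
    ("camel", 406, 12), ("chimpanzee", 231, 20), ("cow", 284, 15),
    ("dog", 61, 12), ("elephant", 645, 40), ("fox", 52, 7),
    ("giraffe", 425, 10), ("goat", 151, 8), ("guinea pig", 68, 4),
    ("horse", 330, 20), ("leopard", 98, 12), ("monkey", 164, 15),
    ("mouse", 21, 3), ("pig", 112, 10), ("rabbit", 31, 5),
    ("sea lion", 350, 12), ("squirrel", 44, 10), ("wolf", 63, 5),
]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


longevity = [row[2] for row in DATA]
gestation = [row[1] for row in DATA]
intercept, slope = fit_line(longevity, gestation)

# residual = observed gestation - predicted gestation
residuals = [y - (intercept + slope * x) for x, y in zip(longevity, gestation)]
sse = sum(e ** 2 for e in residuals)         # sum of squared residuals
s = math.sqrt(sse / (len(residuals) - 2))    # n - 2 degrees of freedom
print(f"SSE = {sse:.1f}, s = {s:.0f} days")
```

The units of s are days, the same as the response variable.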
10. Which of the animals is clearly an outlier both in longevity and in gestation period?
Circle this animal on your scatterplot and on your residuals plot, and calculate its
residual value (take its gestation period from the table and subtract its calculated
predicted value according to your regression line).
elephant resid = 645 - (16.76 + 13.77*40) = 645 - 568 = 77 days (underpredicted)
a. Does your model over- or under-predict that animal’s gestation period?
b. Does it seem to have the largest residual (in absolute value) of any animal?
No. Looking at the residual plot, this is not a relatively large residual!
Outlier: In the context of regression lines, outliers are observations which fall far from
the regression line, not following the pattern of the line. Outliers may have large
residuals.
11. Is the animal you identified in #10 an outlier to the regression line? no; its residual
is modest, so it follows the pattern of the line. It is extreme in the x-direction (and in
y), but it is not far from the line.
12. Which animal does have the largest (in absolute value) residual? Is its gestation period
longer or shorter than expected for an animal with its longevity? giraffe; longer (its
residual is positive: 425 - (16.76 + 13.77*10) = about 271 days)
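Questions #10 through #12 come down to comparing residuals, which the sketch below ranks by absolute value. It is a rough check under the assumption that the table data are entered as shown; fit_line, residual_by_animal, and ranked are illustrative names. A positive residual means the actual gestation period is longer than the line predicts.

```python
# (animal, gestation in days, longevity in years), copied from the table
DATA = [
    ("baboon", 187, 20), ("bear (grizzly)", 225, 25), ("beaver", 122, 5),
    ("camel", 406, 12), ("chimpanzee", 231, 20), ("cow", 284, 15),
    ("dog", 61, 12), ("elephant", 645, 40), ("fox", 52, 7),
    ("giraffe", 425, 10), ("goat", 151, 8), ("guinea pig", 68, 4),
    ("horse", 330, 20), ("leopard", 98, 12), ("monkey", 164, 15),
    ("mouse", 21, 3), ("pig", 112, 10), ("rabbit", 31, 5),
    ("sea lion", 350, 12), ("squirrel", 44, 10), ("wolf", 63, 5),
]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line([row[2] for row in DATA], [row[1] for row in DATA])

# residual = observed - predicted; positive means the model under-predicts
residual_by_animal = {
    name: gest - (intercept + slope * years) for name, gest, years in DATA
}

# Sort animals from largest to smallest absolute residual
ranked = sorted(residual_by_animal.items(), key=lambda item: abs(item[1]), reverse=True)

for name, resid in ranked[:5]:
    print(f"{name:15s} {resid:8.1f} days")
print(f"elephant        {residual_by_animal['elephant']:8.1f} days")
```

The elephant's residual is positive but sits well down the ranking, while the giraffe tops it.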
13. Use the Stats List Editor to eliminate the giraffe’s information from the analysis. Use
your calculator to determine the equation of the regression line for predicting gestation
period from longevity in this case. Record the equation and the value of r² below. Save
the new regression line to Y2 and add this line to your scatterplot above (label it “No
Giraffe”).
predicted gestation = -3.88 + 14.317 (longevity)    r = 0.802, r² = 0.644
14. Compare your new regression line to the original one. Is it substantially different?
No; I can barely see the difference on the graph.
15. Return the giraffe’s values to the list (gestation = 425, longevity=10; you can just add it
at the bottom of the lists) and now eliminate the elephant’s data. Find the new
equation of the regression line and save it to Y3. Record the regression equation and
the new r² here. Add the line to your scatterplot above and label it “No elephant.”
predicted gestation = 48.18 + 10.70 (longevity)
r = 0.511, r² = 0.261
16. Compare your new regression line to the original one. Is it substantially different? In
which case (giraffe or elephant) did the removal of one animal affect the regression line
more? Yes: the slope drops from 13.77 to 10.70 and r² from 0.535 to 0.261. Removing the
elephant affects the line far more than removing the giraffe.
Influential Observation: In the context of regression lines, influential points are
ones whose removal would substantially affect the regression line. Points that
are outliers in the x-direction of a scatterplot are often influential for the
regression.
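The comparison in #13 through #16 can be sketched in Python by refitting the line with one animal left out at a time. This is an illustrative check, not the calculator procedure the worksheet asks for; fit_summary and without are hypothetical helper names.

```python
# (animal, gestation in days, longevity in years), copied from the table
DATA = [
    ("baboon", 187, 20), ("bear (grizzly)", 225, 25), ("beaver", 122, 5),
    ("camel", 406, 12), ("chimpanzee", 231, 20), ("cow", 284, 15),
    ("dog", 61, 12), ("elephant", 645, 40), ("fox", 52, 7),
    ("giraffe", 425, 10), ("goat", 151, 8), ("guinea pig", 68, 4),
    ("horse", 330, 20), ("leopard", 98, 12), ("monkey", 164, 15),
    ("mouse", 21, 3), ("pig", 112, 10), ("rabbit", 31, 5),
    ("sea lion", 350, 12), ("squirrel", 44, 10), ("wolf", 63, 5),
]


def fit_summary(rows):
    """Return (intercept, slope, r_squared) for gestation regressed on longevity."""
    xs = [years for _, _, years in rows]
    ys = [gest for _, gest, _ in rows]
    n = len(rows)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy ** 2 / (sxx * syy)


def without(rows, animal):
    """Copy of the data with one animal removed."""
    return [row for row in rows if row[0] != animal]


full = fit_summary(DATA)
no_giraffe = fit_summary(without(DATA, "giraffe"))
no_elephant = fit_summary(without(DATA, "elephant"))

for label, (a, b, r2) in [("all animals", full),
                          ("no giraffe", no_giraffe),
                          ("no elephant", no_elephant)]:
    print(f"{label:12s} gestation-hat = {a:7.2f} + {b:6.3f} (longevity), r^2 = {r2:.3f}")
```

Dropping the giraffe (largest residual) barely moves the slope, while dropping the elephant (extreme x-value) changes both the slope and r² noticeably.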
17. To appreciate even further the potential influence of the elephant, change its gestation
period to 45 days instead of 645 days. Use your calculator to find the new regression
line (save it to Y4) and record the line and r² value here. Add this line to the scatterplot
(label it Elephant=45). Describe what happened.
The line is much flatter, and hardly any of the variation in gestation is accounted for by
this linear model.
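To see the effect described in #17 numerically, the sketch below refits the line after changing the elephant's gestation period from 645 to 45 days. The key does not record the resulting equation, so the values this prints are a computed illustration rather than the key's own numbers; fit_summary and the variable names are assumptions.

```python
# (animal, gestation in days, longevity in years), copied from the table
DATA = [
    ("baboon", 187, 20), ("bear (grizzly)", 225, 25), ("beaver", 122, 5),
    ("camel", 406, 12), ("chimpanzee", 231, 20), ("cow", 284, 15),
    ("dog", 61, 12), ("elephant", 645, 40), ("fox", 52, 7),
    ("giraffe", 425, 10), ("goat", 151, 8), ("guinea pig", 68, 4),
    ("horse", 330, 20), ("leopard", 98, 12), ("monkey", 164, 15),
    ("mouse", 21, 3), ("pig", 112, 10), ("rabbit", 31, 5),
    ("sea lion", 350, 12), ("squirrel", 44, 10), ("wolf", 63, 5),
]


def fit_summary(rows):
    """Return (intercept, slope, r_squared) for gestation regressed on longevity."""
    xs = [years for _, _, years in rows]
    ys = [gest for _, gest, _ in rows]
    n = len(rows)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy ** 2 / (sxx * syy)


# Hypothetical change from #17: elephant gestation set to 45 days instead of 645
altered = [
    (name, 45 if name == "elephant" else gest, years)
    for name, gest, years in DATA
]

original_fit = fit_summary(DATA)
altered_fit = fit_summary(altered)

print("original:    slope = {:.3f}, r^2 = {:.3f}".format(original_fit[1], original_fit[2]))
print("elephant=45: slope = {:.3f}, r^2 = {:.3f}".format(altered_fit[1], altered_fit[2]))
```

A single point with an extreme x-value is enough to pull the slope most of the way toward zero.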