AP Statistics Chapter 8 Practice Problems
Linear Regression

A diver is investigating a wreck under the water and has to come up to the surface slowly. The following chart details his depth from the time he starts ascending.

Time (sec)    Depth (ft)
0             240
30            225
60            203
100           189
140           180
180           164
210           155
280           185
330           130
360           125
390           120

1. We wish to perform regression on the data. What are the three conditions that we must check before we attempt to do regression?
    a) Both variables are quantitative.
    b) The relationship is linear.
    c) There are no outliers.

2. Graph the scatterplot and determine if linear regression seems appropriate.
    a) No, regression doesn't seem appropriate because there appears to be an outlier.

3. Which point is the outlier?
    a) At time 280 seconds, he was at a depth of 185 feet.

4. Determine the equation for the Least Squares Regression Line (LSRL).
    a) $\widehat{\text{Depth}} = 225.642 - 0.272(\text{time})$

5. Describe the association.
    a) There appears to be a strong, negative, linear association between time and depth, but there appears to be an outlier.

6. Determine the correlation.
    a) r = -0.932

7. Eliminate the outlier. What is the new correlation? Why does it change?
    a) r = -0.980; the correlation is stronger because, without the outlier, the residuals are smaller.

8. Determine the equation for the Least Squares Regression Line (LSRL) without the outlier.
    a) $\widehat{\text{Depth}} = 225.698 - 0.292(\text{time})$

9. Explain the meaning of the slope of the line.
    a) For every 1-second increase in time, our model predicts an average decrease of 0.292 feet in depth.

10. Explain the meaning of $b_0$ in the context of this problem.
    a) At a time of 0 seconds, our model predicts a depth of 225.698 feet.
    b) Although this makes sense, we know that the diver began to ascend at a depth of 240 ft.

11. Describe the relationship between time and depth using $r^2$ to make your description more precise.
    a) Since r = -0.980, $r^2 = 0.960$.
    b) 96% of the variation in depth can be explained by the approximate linear relationship with time.

12. Using the modified model, predict the depth of the diver at each of the following times and comment on the confidence of your prediction:
    a) 2 min. 50 sec.: ≈ 176 feet
    b) 5 min.: ≈ 138 feet
    c) 6 min. 30 sec. What is the residual at this time? ≈ 112 feet. The residual is 120 - 112 = 8 (or, if we use the calculator, ≈ 8.26).
    d) 10 min.: ≈ 50.4 feet (this is an extrapolation, since the data stop at 390 seconds)

Using the following summary statistics of a statistics class, determine the LSRL (assume that IQ is the explanatory variable):

$\overline{IQ} = 112$, $s_{IQ} = 10$, $\overline{SAT} = 1821$, $s_{SAT} = 107$, $r = 0.893$

Slope:
$b_1 = r \dfrac{s_{SAT}}{s_{IQ}} = 0.893 \cdot \dfrac{107}{10} \approx 9.555$

Intercept:
$b_0 = \bar{y} - b_1 \bar{x}$
$b_0 = \overline{SAT} - b_1 (\overline{IQ})$
$b_0 = 1821 - 9.555(112) = 750.840$

LSRL: $\hat{y} = b_0 + b_1 x$
Since $b_0 = 750.840$ and $b_1 = 9.555$:
$\widehat{SAT} = 750.840 + 9.555\,(IQ)$

With an LSRL of $\widehat{SAT} = 750.840 + 9.555\,(IQ)$:

Interpret $b_0$:
With an IQ of 0, our model predicts an SAT score of 750.840. This makes absolutely no sense. You can't have an IQ of 0!

Interpret $b_1$:
For every increase of 1 point in IQ, our model predicts an average increase of 9.555 points on the SAT.

Interpret $r^2$:
Since r = 0.893, $r^2 = 0.797$. Approximately 80% of the variation in SAT score can be explained by the approximate linear relationship with IQ.

Review Question

A researcher uses a regression equation to predict home heating bills (dollar cost), based on home size (square feet). The correlation between predicted bills and home size is 0.70. What is the correct interpretation of this finding?
    a) 70% of the variability in home heating bills can be explained by home size.
    b) 49% of the variability in home heating bills can be explained by home size.
    c) For each added square foot of home size, heating bills increased by 70 cents.
    d) For each added square foot of home size, heating bills increased by 49 cents.
    e) None of the above.

The answer is b), since the coefficient of determination measures the proportion of variation in the dependent variable that is predictable from the independent variable. Here $r^2 = 0.70^2 = 0.49$.

Review Question

A national consumer magazine reported the following correlations:
The correlation between car weight and car reliability is -0.30.
The correlation between car weight and annual maintenance cost is 0.20.

Which of the following statements are true?
    I. Heavier cars tend to be less reliable.
    II. Heavier cars tend to cost more to maintain.
    III. Car weight is related more strongly to reliability than to maintenance cost.

    a) I only
    b) II only
    c) III only
    d) I and II
    e) I, II, and III

The answer is e), since reliability tends to decrease as car weight increases, costs tend to increase as car weight increases, and the strength of a relationship increases as the correlation gets closer to ±1 (here |-0.30| > |0.20|).

Review Question

In the context of regression analysis, which of the following statements are true?
    I. When the sum of the residuals is greater than zero, the data set is nonlinear.
    II. A random pattern of residuals supports a linear model.
    III. A random pattern of residuals supports a non-linear model.

    a) I only
    b) II only
    c) III only
    d) I and II
    e) I and III

The answer is b), since a random pattern of residuals supports a linear model; a non-random pattern supports a non-linear model. The sum of the residuals is always zero, whether the data set is linear or nonlinear.

Assignment

             Chapter 8            Chapter 9
Lesson:      Linear Regression    Regression Wisdom
Read:        Chapter 8            Chapter 9
Problems:    1 – 49 (odd)         1 – 31 (odd)
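
The diver problem and the IQ/SAT problem above can be checked numerically. The following is a minimal sketch in plain Python, not the calculator procedure the slides assume. The function names `lsrl`, `lsrl_from_summary`, and `without_time` are illustrative choices, not anything from the original problems. Results computed from the rounded summary statistics may differ slightly in the last decimal place from hand-worked values.

```python
import math

# Diver data from the practice problems: time since starting ascent (sec), depth (ft).
TIMES = [0, 30, 60, 100, 140, 180, 210, 280, 330, 360, 390]
DEPTHS = [240, 225, 203, 189, 180, 164, 155, 185, 130, 125, 120]


def lsrl(x, y):
    """Return (b0, b1, r) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                    # slope
    b0 = y_bar - b1 * x_bar           # intercept: line passes through (x_bar, y_bar)
    r = sxy / math.sqrt(sxx * syy)    # correlation coefficient
    return b0, b1, r


def lsrl_from_summary(x_bar, s_x, y_bar, s_y, r):
    """Return (b0, b1) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1


def without_time(x, y, t):
    """Return copies of x and y with every observation at time t removed."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if xi != t]
    return [p[0] for p in pairs], [p[1] for p in pairs]


if __name__ == "__main__":
    # All eleven points, including the suspected outlier at 280 sec.
    b0, b1, r = lsrl(TIMES, DEPTHS)
    print(f"all points:      depth_hat = {b0:.3f} + ({b1:.3f}) * time, r = {r:.3f}")

    # Same fit with the 280-second observation removed.
    x_trim, y_trim = without_time(TIMES, DEPTHS, 280)
    c0, c1, r_trim = lsrl(x_trim, y_trim)
    print(f"outlier removed: depth_hat = {c0:.3f} + ({c1:.3f}) * time, "
          f"r = {r_trim:.3f}, r^2 = {r_trim ** 2:.3f}")

    # Predictions at 2:50, 5:00, 6:30, and 10:00 (600 sec is an extrapolation).
    for t in (170, 300, 390, 600):
        print(f"t = {t:3d} sec -> predicted depth {c0 + c1 * t:.1f} ft")

    # Residual = observed - predicted at 390 sec, where the observed depth is 120 ft.
    print(f"residual at 390 sec: {120 - (c0 + c1 * 390):.2f} ft")

    # LSRL from the IQ/SAT summary statistics alone.
    s0, s1 = lsrl_from_summary(112, 10, 1821, 107, 0.893)
    print(f"SAT_hat = {s0:.3f} + {s1:.3f} * IQ")
```

Computing the fit from sums of squared deviations keeps the sketch dependency-free and mirrors the formulas $b_1 = r \, s_y / s_x$ and $b_0 = \bar{y} - b_1 \bar{x}$ used in the summary-statistics problem.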