Chapter 3 Review-KEY

The midterm and final test grades of a sample of 11 statistics students were recorded. Use this data to answer questions 1 – 8. Graph the data so that you can predict final exam grade from midterm grade.

Student Number   Midterm Score   Final Exam Score
 1               77              81
 2               90              96
 3               65              72
 4               86              91
 5               59              82
 6               92              93
 7               97              95
 8               72              69
 9               79              89
10               76              74
11               50              42

1. Identify the explanatory and response variables. Justify your choices.
The explanatory variable is the midterm score and the response variable is the final exam score because we are predicting final exam score from midterm score.

2. Describe the form, strength, and direction of the scatterplot. Are there any distinguishing characteristics or influential points?
The graph shows a strong, positive, linear relationship. The point at (59, 82) is an influential point because it appears to be an outlier in the y-direction.

3. Using your calculator, determine the equation of the regression line. Plot that line on your graph.
Predicted final exam score = 8.546 + 0.937 (midterm score)

4. Identify the slope and y-intercept. Interpret them in context.
Slope: 0.937; the predicted final exam score increases by 0.937 points for every 1-point increase in midterm score.
Y-intercept: 8.546; this is the predicted final exam score for a midterm score of 0 points.

5. The correlation for this data is 0.8544. How would removing the point (59, 82) change the correlation?
Removing the point (59, 82) would increase the correlation because that point adds scatter about the line. Without it, the remaining points lie closer to the line.

6. Which point appears to have the largest negative residual?
The point at (50, 42) appears to have the largest negative residual.

7. Calculate the residual for that point and interpret it in context.
The predicted value needs to be calculated before the residual can be calculated.
Using the equation of the line from question 3:
Predicted value = 8.546 + 0.937 (midterm grade)
Predicted value = 8.546 + 0.937 (50) = 55.4 points
Residual = actual value - predicted value
Residual = 42 points - 55.4 points = -13.4 points; this student scored 13.4 points lower than the regression line predicted.

8. What does the following residual plot tell about the form of the data?
Because the residual plot has no obvious pattern and is completely scattered, a linear model is appropriate for this data.

Use this regression analysis for diameter (in inches) versus age (in years) for a sample of 25 oak trees to answer questions 9 - 12.

9. What is the equation of the line?
Predicted diameter = 1.1755 + 0.16476 (age)

10. What does a coefficient of determination of 80.89% represent in the context of this problem?
A coefficient of determination (r²) of 80.89% means that 80.89% of the variation in diameter is explained by the linear relationship with age. The other 19.11% of the variation could be attributed to such things as soil conditions, growing season, and annual rainfall or droughts.

11. Calculate the correlation. What does this tell you about the data?
Correlation (r) is calculated by taking the square root of the coefficient of determination. For r² = 0.8089, r = ±√0.8089 = ±0.90. Because the slope is positive, the correlation is also positive. A correlation of 0.90 indicates a strong relationship.

12. What would happen to the value of the correlation if the diameter were measured in centimeters rather than inches?
The correlation would not change with a change in units because it does not depend on units.

An ecologist studying breeding habits of the common crossbill in different years finds that there is a linear relationship between the number of breeding pairs of crossbills and the abundance of the spruce cones. Below are statistics on eight years of measurements, where x = average number of cones per tree and y = number of breeding pairs of crossbills in a certain forest. The correlation between x and y is r = 0.968.
Use this information to answer questions 13 – 15.

                     x = mean number of cones/tree   y = number of crossbill pairs
Mean                 23.0                            18.0
Standard deviation   16.2                            15.1

13. Determine the equation of the least-squares regression line (with y as the response variable).
Slope: b = r (sy / sx)
b = 0.968 (15.1 / 16.2)
b = 0.9023
Y-intercept: a = ȳ - b x̄
a = 18.0 - (0.9023)(23.0) = -2.753
Therefore the equation of the line is:
predicted number of crossbill pairs = -2.753 + 0.9023 (number of cones per tree)

14. What percentage of the variation in numbers of breeding pairs of crossbills can be accounted for by this regression?
The percentage of the variation is the coefficient of determination (r²).
r = 0.968
r² = (0.968)² = 0.937 or 93.7%
93.7% of the variation in the number of breeding pairs can be accounted for by the linear relationship with the number of cones per tree.

15. Based on these data, can we conclude that the abundance of spruce cones is responsible for the number of breeding pairs of crossbills? Explain.
No, correlation does not imply causation. A strong linear relationship between two variables does not necessarily mean that one variable causes the other.
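
The arithmetic in questions 3, 7, 11, 13, and 14 can be checked with a short script. The following is a minimal sketch, not part of the original key: the function names `least_squares` and `line_from_summary` are illustrative, and it uses only the data and summary statistics given above. The first function fits the line from the raw midterm and final scores; the second uses b = r (sy / sx) and a = ȳ - b x̄.

```python
import math

# Midterm and final exam scores from the table for questions 1-8.
midterm = [77, 90, 65, 86, 59, 92, 97, 72, 79, 76, 50]
final = [81, 96, 72, 91, 82, 93, 95, 69, 89, 74, 42]


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line predicting y from x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def line_from_summary(r, x_bar, y_bar, s_x, s_y):
    """Return (intercept, slope) from summary statistics: b = r*s_y/s_x, a = y_bar - b*x_bar."""
    slope = r * s_y / s_x
    return y_bar - slope * x_bar, slope


# Questions 3 and 7: regression line, then the residual for the point (50, 42).
a, b = least_squares(midterm, final)
predicted_50 = a + b * 50
residual_50 = 42 - predicted_50  # residual = actual - predicted

# Question 11: correlation from r^2 (positive root, because the slope is positive).
r_oak = math.sqrt(0.8089)

# Questions 13 and 14: crossbill line from r, the means, and the standard deviations.
a_c, b_c = line_from_summary(r=0.968, x_bar=23.0, y_bar=18.0, s_x=16.2, s_y=15.1)
r_squared = 0.968 ** 2

print(f"Q3:  predicted final = {a:.3f} + {b:.3f} (midterm)")
print(f"Q7:  predicted = {predicted_50:.1f}, residual = {residual_50:.1f}")
print(f"Q11: r = {r_oak:.2f}")
print(f"Q13: predicted pairs = {a_c:.3f} + {b_c:.4f} (cones per tree)")
print(f"Q14: r^2 = {r_squared:.3f}")
```

Because the script carries the unrounded slope into the intercept, its question 13 intercept comes out near -2.752 rather than -2.753; the key rounds the slope to 0.9023 before computing a, which accounts for the small difference.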
