STAT 113 Week 14 (Apr 8)
Work Sheet 9: Chapter 14 & 15 Scatterplots

1. The gas mileage of an automobile first increases and then decreases as the speed increases. Suppose that this relationship is very regular, as shown by the following data on speed (miles per hour) and mileage (miles per gallon):

   Speed     30   40   50   60   70
   Mileage   20   24   26   24   20

(a) Make a scatterplot of mileage versus speed.

[Scatterplot: Mileage vs Speed (x-axis: Speed, y-axis: Mileage)]

(b) The correlation between speed and mileage is r = 0. Explain why the correlation is 0 even though there is a strong relationship between speed and mileage.

The relationship is nonlinear. Correlation measures only the strength of straight-line association between two variables. Here mileage rises and then falls symmetrically around 50 mph, so the positive and negative parts of the association cancel.

2. Below is a scatterplot of grade on Exam 2 versus time spent playing video games (in hours) during the Exam 2 week for 40 students.

[Scatterplot: Exam 2 (x-axis: Time spent playing video games, y-axis: Grade)]

(a) Describe the important features of the relationship between the two variables:

Form: linear or nonlinear? Linear
Direction: positive or negative (or no association)? Negative
Strength: strong, moderate, or weak? Strong

(b) If we have an outlier like the one in the scatterplot below, will the correlation increase, decrease, or stay the same?

[Scatterplot: Exam 2 with an outlier (x-axis: Time spent playing video games, y-axis: Grade)]

The correlation will decrease (the linear association becomes weaker).

(c) What should we do to see if the outlier is influential?

Re-run the regression without the outlier and see how much the slope and R^2 change. If there are big changes, then the outlier is influential.

3.
Below is the Excel output for least-squares regression in Example 2:

Regression Statistics
Multiple R           0.900706456
R Square             0.81127212
Adjusted R Square    0.806305597
Standard Error       10.48251906
Observations         40

ANOVA
             df    SS          MS          F           Significance F
Regression   1     17949.21    17949.21    163.3481    2.47E-15
Residual     38    4175.562    109.8832
Total        39    22124.77

             Coefficients    Standard Error    t Stat      P-value
Intercept    101.1837484     3.728178          27.14027    1.75E-26
Time         -3.011027947    0.23559           -12.7808    2.47E-15

(a) What is the least-squares regression line for this data? Be sure to identify your variables by name, not just x and y.

y = 101.18 - 3.01x, where y = predicted Exam 2 grade and x = time spent playing video games (in hours) during the Exam 2 week.

(b) What does the intercept mean in terms of the story?

The predicted Exam 2 grade for a student who does not play video games at all is 101.18.

(c) What does the slope mean in terms of the story?

The predicted Exam 2 grade decreases by 3.01 points for each additional hour spent playing video games.

(d) What is the predicted value of Exam 2 score for a student who plays 6 hours of video games during Exam 2 week? Is this a prediction or an extrapolation? Explain your answer.

y = 101.18 - 3.01*6 = 83.12
This is a prediction because 6 is within the range of x-values in our data.

(e) What is the predicted value of Exam 2 score for a student who plays 33 hours of video games during Exam 2 week? Is this a prediction or an extrapolation? Explain your answer.

y = 101.18 - 3.01*33 = 1.85
This is an extrapolation because 33 is outside the range of x-values in our data.

(f) How do we interpret R-square in terms of the story?

81.13% of the variation in Exam 2 grades is explained by the least-squares regression of grade on time spent playing video games.

(g) What is the correlation between grade on Exam 2 and time spent playing video games in hours for the Exam 2 week?

r = −√0.8113 = −0.9007. Equivalently, attach a negative sign to the “Multiple R” value in the output, because the slope is negative.

4.
In 1988, the Kalamazoo (Michigan) Symphony advertised a “Mozart for Minors” program with this statement: “Questions: Which students scored 51 points higher in verbal skills and 39 points higher in math? Answer: Students who had experience in music.”

(a) What do you think of the claim that “experience in music” causes higher test scores? Explain possible lurking variables that could be interfering.

It is not appropriate to conclude that the relationship is due to cause and effect. Lurking variables might explain the relationship: students with music experience might have other advantages (wealthier parents, better school systems…).

(b) This is best described as an example of:
a. Confounding
b. A valid conclusion
c. Causation
d. Common response

(c) Draw a circles-and-arrows diagram to demonstrate your answer.

x: experience in music
y: higher test scores
z: lurking variables (wealthier parents, better school systems…)

5. Come up with a story not used in class or in the book that is an example of common response and another that is an example of confounding. Draw a circles-and-arrows diagram to go with each story.

1. Common response: during summer more people drink Coke (x) and more people drown (y). It is the warmer weather (z) that causes the increase in both.
2. Confounding: Joe got a good night's sleep (x) before his statistics exam, and he got an A on the exam (y). Sleep is important for mental focus, but so are studying, doing practice problems, etc. (z), so the effect of sleep cannot be separated from the effects of these other variables.

[Circles-and-arrows diagrams for stories 1 and 2]
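
The numerical answers in Problems 1(b), 3(d), and 3(e) can be checked with a few lines of Python. This is a minimal sketch, not part of the original worksheet: the helper names `pearson_r` and `predict_grade` are hypothetical, and the regression coefficients are the rounded values 101.18 and -3.01 used in the answers to Problem 3.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed directly from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Problem 1: speed (mph) and mileage (mpg) from the worksheet table
speed = [30, 40, 50, 60, 70]
mileage = [20, 24, 26, 24, 20]
r = pearson_r(speed, mileage)
print(f"Problem 1(b): r = {r:.4f}")

# Problem 3: rounded coefficients from the worksheet answer (assumption:
# 101.18 and -3.01 rather than the full-precision Excel values)
def predict_grade(hours, intercept=101.18, slope=-3.01):
    """Predicted Exam 2 grade for a given number of hours of video games."""
    return intercept + slope * hours

print(f"Problem 3(d): {predict_grade(6):.2f}")
print(f"Problem 3(e): {predict_grade(33):.2f}")
```

The correlation for Problem 1 comes out as zero up to floating-point rounding, because the products of deviations on either side of 50 mph cancel. The code cannot decide whether a value such as 33 hours is an extrapolation; that judgment depends on the range of x-values in the data.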