Worksheet 9

STAT 113 Week 14 (Apr 8)
Worksheet 9: Chapters 14 & 15
Scatterplots
1. The gas mileage of an automobile first increases and then decreases as the speed
increases. Suppose that this relationship is very regular, as shown by the following
data on speed (miles per hour) and mileage (miles per gallon):
Speed (mph):    30  40  50  60  70
Mileage (mpg):  20  24  26  24  20
(a) Make a scatterplot of mileage versus speed.
[Scatterplot: Mileage vs Speed, with Speed on the horizontal axis and Mileage on the vertical axis]
(b) The correlation between speed and mileage is r = 0. Explain why the correlation is
0 even though there is a strong relationship between speed and mileage.
The relationship is nonlinear. Correlation measures the strength of only the
straight-line association between two variables, and here the rise in mileage
at low speeds is exactly offset by the fall at high speeds.
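As a quick check of the claim in (b), the following minimal sketch computes the correlation directly from the five data points in the table above. The function name `correlation` is introduced here only for illustration; it applies the usual formula for r.

```python
import math

# Data from question 1 of the worksheet.
speed = [30, 40, 50, 60, 70]
mileage = [20, 24, 26, 24, 20]


def correlation(xs, ys):
    """Return the correlation r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(speed, mileage)
# r is zero up to floating-point rounding, even though mileage
# depends on speed in a perfectly regular (curved) way.
print(abs(r) < 1e-9)
```

The positive products of deviations on one side of the peak cancel the negative products on the other side, which is why the sum in the numerator vanishes.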
2. Below is a scatterplot of grade on exam 2 versus time spent playing video games
in hours for the Exam 2 week for 40 students.
[Scatterplot: Exam 2 grade (vertical axis, 0 to 150) versus time spent playing video games (horizontal axis, 0 to 30 hours)]
(a) Describe the important features of the relationship between the two variables:
Form: linear or nonlinear?
Linear
Direction: positive or negative (or no association)?
Negative
Strength: strong, moderate, weak?
Strong
(b) If we have an outlier like in the scatterplot below, will correlation increase,
decrease or stay the same?
[Scatterplot: Exam 2 grade (vertical axis, 0 to 150) versus time spent playing video games (horizontal axis, 0 to 30 hours), with one outlier added]
The correlation will decrease.
(c) What should we do to see if the outlier is influential?
Re-run the regression without the outlier and see how much the slope and
R^2 change. If there are big changes, then the outlier is influential.
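The check described in (c) can be sketched as follows. The data below are hypothetical and made up purely for illustration; they are NOT the 40 students from the worksheet, and the helper name `fit_line` is likewise an assumption of this sketch.

```python
def fit_line(xs, ys):
    """Least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# HYPOTHETICAL data (not from the worksheet): hours of video games and grades.
# The last point (25 hours, grade 95) plays the role of the outlier.
hours = [0, 2, 5, 8, 10, 12, 15, 18, 20, 25]
grades = [100, 96, 85, 78, 70, 66, 55, 48, 40, 95]

# Fit once with every point, then again with the outlier dropped.
slope_with, _, r2_with = fit_line(hours, grades)
slope_without, _, r2_without = fit_line(hours[:-1], grades[:-1])

print(f"with outlier:    slope = {slope_with:.2f}, R^2 = {r2_with:.2f}")
print(f"without outlier: slope = {slope_without:.2f}, R^2 = {r2_without:.2f}")
```

In this made-up data set the slope and R^2 change a great deal when the last point is dropped, so that point would be called influential.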
3. Below is the Excel output for least-squares regression in Example 2:
Regression Statistics
Multiple R           0.900706456
R Square             0.81127212
Adjusted R Square    0.806305597
Standard Error       10.48251906
Observations         40

ANOVA
              df    SS          MS          F           Significance F
Regression    1     17949.21    17949.21    163.3481    2.47E-15
Residual      38    4175.562    109.8832
Total         39    22124.77

              Coefficients     Standard Error    t Stat      P-value
Intercept     101.1837484      3.728178          27.14027    1.75E-26
Time          -3.011027947     0.23559           -12.7808    2.47E-15
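The entries in the output above are tied together by a few standard identities (R Square = SS regression / SS total, MS = SS / df, F = MS regression / MS residual, t Stat = coefficient / standard error). A minimal sketch that recomputes them from the printed numbers, with variable names chosen here for illustration:

```python
# Values copied from the Excel output above.
ss_regression = 17949.21
ss_residual = 4175.562
ss_total = 22124.77
df_regression = 1
df_residual = 38

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Each t statistic is the coefficient divided by its standard error.
t_intercept = 101.1837484 / 3.728178
t_time = -3.011027947 / 0.23559

print(f"R Square    = {r_squared:.4f}")
print(f"MS residual = {ms_residual:.4f}")
print(f"F           = {f_stat:.2f}")
print(f"t intercept = {t_intercept:.2f}")
print(f"t Time      = {t_time:.2f}")
```

With a single explanatory variable, F equals the square of the slope's t statistic (up to rounding in the printed values), which is why the slope's P-value matches Significance F.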
(a) What is the least squares regression line for this data? Be sure to identify your
variables by name, not just x and y.
y = 101.18 - 3.01x,
where y = predicted Exam 2 grade and
x = time spent playing video games (in hours) during the Exam 2 week.
(b) What does the intercept mean in terms of the story?
The predicted Exam 2 grade for a student who does not play video games
at all is 101.18.
(c) What does the slope mean in terms of the story?
The predicted Exam 2 grade decreases by 3.01 points for each additional
hour spent playing video games.
(d) What is the predicted value of Exam 2 score for a student who plays 6 hours of
video games during Exam 2 week? Is this a prediction or an extrapolation?
Explain your answer.
y = 101.18 - 3.01*6 = 83.12
This is a prediction because 6 is within the range of x-values
in our data.
(e) What is the predicted value of Exam 2 score for a student who plays 33 hours of
video games during Exam 2 week? Is this a prediction or an extrapolation?
Explain your answer.
y = 101.18 - 3.01*33 = 1.85
This is an extrapolation because 33 is outside the range of
x-values in our data.
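The calculations in (d) and (e) can be sketched in a few lines. The range of 0 to 30 hours is an assumption read off the axis of the scatterplot in question 2, not a value stated in the data, and the function names are hypothetical.

```python
# Rounded coefficients from the fitted line in part (a).
INTERCEPT = 101.18
SLOPE = -3.01

# ASSUMPTION: observed x-values run from about 0 to 30 hours,
# based on the scatterplot axis rather than on the raw data.
X_MIN, X_MAX = 0, 30


def predict_exam2(hours):
    """Predicted Exam 2 grade for a given number of hours of video games."""
    return INTERCEPT + SLOPE * hours


def is_extrapolation(hours):
    """True when hours falls outside the assumed range of observed x-values."""
    return hours < X_MIN or hours > X_MAX


for hours in (6, 33):
    kind = "extrapolation" if is_extrapolation(hours) else "prediction"
    print(f"{hours} hours: predicted grade {predict_exam2(hours):.2f} ({kind})")
```

Rounded to two decimals, the results agree with the worksheet's 83.12 and 1.85.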
(f) How do we interpret R-square in terms of the story?
81.13% of the variation in Exam 2 grades is explained by the
least-squares regression of Exam 2 grade on time spent playing video games.
(g) What is the correlation between grade on exam 2 and time spent playing video
games in hours for the Exam 2 week?
r = −√0.8113 = −0.9007 (negative because the slope is negative), or attach
a negative sign to the “Multiple R” part of the output.
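A minimal sketch of the rule used in (g): in a regression with one explanatory variable, r is the square root of R Square, carrying the sign of the slope. The variable names are chosen here for illustration.

```python
import math

# Values from the Excel output in question 3.
r_squared = 0.81127212
slope = -3.011027947

# copysign takes the magnitude of the first argument and the sign of the second.
r = math.copysign(math.sqrt(r_squared), slope)

print(f"r = {r:.4f}")
```

Excel's "Multiple R" is always reported as non-negative, so the sign has to be taken from the slope.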
4. In 1988, the Kalamazoo (Michigan) Symphony advertised a “Mozart for Minors”
program with this statement: “Questions: Which students scored 51 points higher
in verbal skills and 39 points higher in math? Answer: Students who had
experience in music.”
(a) What do you think of the claim that “experience in music” causes higher test
scores? Explain possible lurking variables that could be interfering.
It is not appropriate to conclude that the relationship is due to cause
and effect. Lurking variables might explain the relationship.
Students with music experience might have other advantages
(wealthier parents, better school systems…)
(b) This is best described as an example of:
a. Confounding
b. A valid conclusion
c. Causation
d. Common response
(c) Draw a circles-and-arrows diagram to demonstrate your answer.
x: experience in music
y: higher test scores
z: lurking variables (wealthier parents, better school systems…)
5. Come up with a story not used in class or in the book that is an example of
common response and another that is an example of confounding. Draw a
circles-and-arrows diagram to go with each story.
1. Common response: during summer more people drink Coke (x)
and more people drown (y). It is the warmer weather (z) that
causes an increase in both.
2. Confounding: Joe got a good night's sleep (x) before his
statistics exam, and he got an A on the exam (y). Sleep is
important for mental focus, but so are studying, doing practice
problems, etc. (z).
[Diagrams 1 and 2: circles-and-arrows diagrams for the two stories]