Stat 301 B -- Fall 2014 -- Midterm exam 2
6 November 2014
Instructions:
1. Please put your name on the back of the last page. I don’t want to see your name until I have
finished grading.
2. Read each question carefully and completely. Ask if you don’t understand something.
3. Answer each question and show work in the space provided. Scratch paper is provided for your
use, but I will only read and evaluate what you put in the answer spaces.
4. Use the JMP output. I am very happy to answer questions along the lines of ‘(You pointing to a
number on the JMP output) Is this the confidence interval for the regression slope?’.
There are 90 points of questions; you get 10 points “for free”.
Investigators have studied whether the “pace of life” is associated with health. You may be familiar with
the stereotypes: Los Angeles is a “laid back” city, with a slower pace of life; New York City is a “hurry up”
city, with a faster pace of life. These investigators studied 36 US cities, which can be considered a
random sample of US cities (population larger than 100,000 people). They quantified the pace of life
with three measures:
1. bank: the average speed of a specific transaction at a bank
2. walk: the average walking speed of pedestrians in the business district
3. talk: the average number of words per minute spoken by postal clerks
The response variable, heart, is the number of heart attacks per 1000 population. All questions on this
exam concern various analyses of this data set. The packet of JMP output includes:
1. Correlations among each pair of variables and the scatterplot matrix
2. Summary statistics for each variable (condensed from JMP output)
3. Analyze / Fit Model with X = walk
4. Analyze / Fit Model with X = walk and bank
5. Analyze / Fit Model with X = walk, bank, and talk
6. Analyze / Fit Model with X = walk, bank, and walk*bank
7. Analyze / Fit Model with X = walk, bank, and walk²
8. Residual vs. predicted value plot for the model with X = walk and bank
For some questions, but not all, I have indicated parts of the output that might be relevant for a
question.
1) 6 pts. JMP output 1 and 3. Five numbers that describe different aspects of the relationship between
walk and heart are:
the correlation coefficient: 0.348
the regression slope: 0.423
the standard error of the regression slope: 0.196
the p-value for the regression slope: 0.038
the root mean-squared-error for the regression: 4.96
What number most clearly:
(Note: No explanations needed; just give me the number that is your answer).
a) describes the strength of the linear association between walk and heart?
b) predicts the difference in number of heart attacks between a city with a walk value of 15 and a city
with a walk value of 16?
c) supports your claim that the walking speed helps predict the number of heart attacks?
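The numbers listed in question 1 are linked to one another. The following is a minimal Python sketch, assuming the simple regression of heart on walk has n - 2 = 34 error degrees of freedom (36 cities). It recovers the t statistic from the slope and its standard error, then the correlation from that t statistic. The function names are illustrative only and are not part of the JMP output.

```python
import math

def t_statistic(slope, std_error):
    """t ratio for testing slope = 0 in a simple linear regression."""
    return slope / std_error

def correlation_from_t(t, df_error):
    """Correlation implied by the slope's t statistic: r = t / sqrt(t^2 + df)."""
    return t / math.sqrt(t * t + df_error)

n_cities = 36            # stated in the exam
df_error = n_cities - 2  # error df for a one-predictor model

t = t_statistic(0.423, 0.196)        # slope and standard error from question 1
r = correlation_from_t(t, df_error)  # should land close to the reported 0.348

print(round(t, 2), round(r, 3))
```

The small gap between the recovered correlation and the reported 0.348 comes from rounding in the slope and its standard error.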
2) 5 pts. What is the equation that predicts the number of heart attacks per 1000 population from the
values of bank, talk, and walk?
3) 5 pts. Another equation that predicts the number of heart attacks per 1000 population from the
values of bank, talk, and walk is:
heart = 5.2 + 0.5 Walk + 0.5 Bank + 0.5 Talk
The prediction equation in your answer to question 2 is better than this equation. Explain in what way
your answer to question 2 is better.
4) 5 pts. JMP output 1, 5. The negative coefficient for talk in the model with walk, bank, and talk is
somewhat surprising. Is there any concern with multicollinearity in this model? Briefly explain why or
why not.
5) 5 pts. Does adding information about the speed of talking (the talk variable) improve predictions of
the number of heart attacks, above and beyond what you would predict from bank and walk alone?
Briefly explain your answer.
6) 5 pts. JMP output 4. Give a careful interpretation of the estimated coefficient for walk in the model
with walk and bank.
7) 5 pts. JMP output 1, 2, and 4. In the model with X=walk and bank, which variable (walk or bank) is
more important in predicting the number of heart attacks? Briefly explain your answer.
8) 5 pts. JMP output 4. The output from the model with X=bank and walk includes a p-value of 0.0214
(underlined in the Analysis of Variance block of output). This p-value of 0.0214 is the result of a test of a
particular null hypothesis. What is that null hypothesis?
9) 5 pts. JMP output 4 and 8. Is there any concern about lack of fit of the regression model with X=bank
and walk? Briefly explain why or why not.
10) 5 pts. JMP output 4 and 8. Is there any concern about the assumption of equal variance when
fitting the model with X=bank and walk? Briefly explain why or why not.
11) 5 pts. Consider the model with X=walk and bank. This model proposes that the relationship
between walking speed (walk) and the number of heart attacks is described by a straight line. Is this
appropriate, or is the relationship to walking speed something more complicated? Briefly explain your
answer.
12) 5 pts. Does the slope of the relationship between walking speed (walk) and the number of heart
attacks depend on the speed of a bank transaction (bank)? Briefly explain why or why not.
13) 5 pts. JMP output 6. Give a careful interpretation of the estimated coefficient for walk in the
model with walk, bank and walk*bank.
14) 5 pts. JMP output 6. A friend looks at the results for parameter estimates from the model with
walk, bank, and walk*bank and comments that “All variables are non-significant (p > 0.05), so there is no
use trying to predict the number of heart attacks from the walking speed or bank speed”. Do you agree
or not? Briefly explain your answer.
15) 5 pts. Consider the model with walk, bank, and walk*bank. Calculate the F statistic that tests the
null hypothesis that the slope for bank = 0 and slope for walk*bank=0.
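A model comparison F statistic of the kind asked for in question 15 contrasts the error sum of squares of a reduced model with that of a full model. The sketch below shows the general calculation in Python. The sums of squares are hypothetical placeholder values, not taken from the JMP output; only the error degrees of freedom (34 for the model with walk alone, 32 for the model with walk, bank, and walk*bank, given 36 cities) follow from the exam text.

```python
def comparison_f(sse_reduced, df_reduced, sse_full, df_full):
    """F = [(SSE_reduced - SSE_full) / (df_reduced - df_full)] / (SSE_full / df_full)."""
    extra_ss = sse_reduced - sse_full  # reduction in error SS from the added terms
    extra_df = df_reduced - df_full    # number of coefficients being tested
    mse_full = sse_full / df_full      # error mean square of the full model
    return (extra_ss / extra_df) / mse_full

# Hypothetical sums of squares, for illustration only.
f_stat = comparison_f(sse_reduced=100.0, df_reduced=34,
                      sse_full=80.0, df_full=32)
print(f_stat)
```

To apply this to the exam data, replace the placeholder values with the error sums of squares read from the Analysis of Variance blocks of the two fitted models.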
16) 5 pts. JMP output 5 and 7. Could you use a model comparison F statistic to compare the 3-variable
model with walk, bank, and talk, to the quadratic model with walk and walk²? Briefly explain why or
why not.
17) 5 pts. JMP output 1, 2, and 3. Would you have any concerns using the model with X= walk to
predict the number of heart attacks in a city with walk = 35? Briefly explain why or why not.
18) 4 pts. JMP output 1, 2, and 3. A friend is interested in the “backwards” relationship, a regression
model to predict Y= walking speed from the X = number of heart attacks per 1000 population. What is
the slope, 𝛽₁, for this regression? Show your work.
NOTE: we did not cover the material for Q 18 in 2015.
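One possible route to the “backwards” slope in question 18 uses the fact that the two simple regression slopes multiply to the squared correlation: the slope of Y on X is r·(sY/sX) and the slope of X on Y is r·(sX/sY). The Python sketch below applies this identity to the correlation and slope quoted in question 1. The function name is illustrative, and the result is only as precise as those rounded inputs.

```python
def reverse_slope(r, slope_y_on_x):
    """Slope of the regression of X on Y, from r and the slope of Y on X.

    Uses slope_y_on_x * slope_x_on_y = r**2.
    """
    return r * r / slope_y_on_x

# Correlation and slope for heart on walk, as quoted in question 1.
b_walk_on_heart = reverse_slope(0.348, 0.423)
print(round(b_walk_on_heart, 3))
```

The same value can be obtained as r·(s_walk/s_heart) from the standard deviations in the summary statistics output.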