Statistics 101: Section L - Laboratory 7

In today’s lab, we are going to look more at topics in least squares regression.
Activity 1: In last week’s lab we saw that our attempts to use least squares regression to estimate
the weight of an empty bag and the weight of a single M&M did not turn out too well. In this
activity we will look at how we can do a better job of estimating these quantities from simple
summary statistics and using a feature in JMP to fit special lines. Below are some simple summary
statistics on the Total Weight (g), Contents Weight (g) and Number of M&Ms.
                       Total Weight (g)    Contents Weight (g)    Number of M&Ms
# of Fun Size bags     88                  88                     88
Sample Mean            19.14 g             18.33 g                21.26
1. Weight of an empty bag?
a) From the simple summary statistics, come up with an estimate of the weight of a single
empty bag. Explain your reasoning.
b) In last week’s lab we used Total Weight as the Y, Response and Contents Weight as the X,
Factor and obtained the following prediction equation.
Predicted Total Weight = 2.91 + 0.885*Contents Weight
What is the value of the estimated y-intercept? Given your answer in a) does this value
look right? Explain your answer.
c) What is the value of the estimated slope? Why does this value seem wrong? Hint: Think
of the interpretation of the slope within the context of the problem. What should the true
value of the slope of the line relating Total Weight to Contents Weight be?
d) Use Analyze – Fit Y by X with Total Weight as the Y, Response and Contents Weight as
the X, Factor. Use Fit Special and constrain the slope to its true value. What is the value of
the estimated y-intercept for this special fit? How close is this value to the value you came
up with in a)?
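As a rough illustration of what a slope-constrained fit does: when the slope is held fixed, the least squares intercept is simply the average of y minus slope times x. The sketch below assumes a small set of made-up weights (not the 88 bags from the lab), and the function name `intercept_with_fixed_slope` is hypothetical, not a JMP feature.

```python
def intercept_with_fixed_slope(x, y, slope):
    """Least squares intercept when the slope is held fixed.

    Minimizing sum((y - b0 - slope*x)^2) over b0 alone gives
    b0 = mean(y - slope*x).
    """
    n = len(x)
    return sum(yi - slope * xi for xi, yi in zip(x, y)) / n


# Hypothetical weights in grams (made up for illustration only).
contents = [17.9, 18.4, 18.1, 18.8, 18.5]
total = [18.7, 19.3, 18.9, 19.5, 19.3]

# Constrain the slope to 1: each extra gram of contents adds one gram in total.
b0 = intercept_with_fixed_slope(contents, total, slope=1.0)
print(f"Estimated intercept with slope fixed at 1: {b0:.2f} g")
```

With the slope fixed at 1, the intercept reduces to the mean of Total Weight minus the mean of Contents Weight, which is why the special fit should agree closely with an estimate built from the summary statistics.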
2. Weight of a single M&M?
a) From the simple summary statistics, come up with an estimate for the weight of a single
M&M. Explain your reasoning.
b) In last week’s lab we used Contents Weight as the Y, Response and Number as the X, Factor
and obtained the following prediction equation.
Predicted Contents Weight = 3.62 + 0.692*Number
What is the value of the estimated slope? Given your answer in a) does this value look
right? Explain your answer.
c) What is the value of the estimated y-intercept? Why does this value seem wrong? Hint:
Think about the interpretation of the intercept within the context of the problem. What
should the value of the intercept of the line that relates Contents Weight to Number be?
d) Use Analyze – Fit Y by X with Contents Weight as the Y, Response and Number as the X,
Factor. Use Fit Special and constrain the intercept to its true value. What is the value of the
estimated slope for this special fit? How close is this value to the value you came up with
in a)?
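For comparison, here is a minimal sketch of a line forced through a fixed intercept. With the intercept set to 0, the least squares slope is the sum of x times y divided by the sum of x squared. The data are made up for illustration, and `slope_with_fixed_intercept` is a hypothetical helper, not something JMP provides under that name.

```python
def slope_with_fixed_intercept(x, y, intercept=0.0):
    """Least squares slope when the intercept is held fixed.

    Minimizing sum((y - intercept - b1*x)^2) over b1 alone gives
    b1 = sum(x * (y - intercept)) / sum(x^2).
    """
    numerator = sum(xi * (yi - intercept) for xi, yi in zip(x, y))
    denominator = sum(xi ** 2 for xi in x)
    return numerator / denominator


# Hypothetical counts and contents weights in grams (made up for illustration).
number = [20, 21, 22, 21, 23]
contents = [17.2, 18.1, 19.0, 18.2, 19.8]

# Zero M&Ms should weigh zero grams, so the intercept is fixed at 0.
b1 = slope_with_fixed_intercept(number, contents, intercept=0.0)
print(f"Estimated weight per M&M: {b1:.3f} g")
```

Note that this slope is not exactly the ratio of the two sample means, although the two values are usually close when the counts do not vary much.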
Activity 2: In this activity, your group is going to play Survivor - Pennies. In this rip-off of the TV
show Survivor, the goal is to simply outlast the other groups. Each group begins the game with 100
pennies in a plastic cup. At the start of each round, your group will shake the pennies in the cup
and pour them out on the table. Any penny that lands “heads” up is a “survivor” and continues to
the next round. Pennies landing “tails” up are losers and are set aside. At the end of each round,
you need to count the number of “survivors” and record this value in the table provided. The game
continues until no “survivors” are left.
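If you would like to preview how a game might unfold before playing it, the rules above can be sketched as a short simulation. This assumes each penny lands heads with probability 0.5 independently of the others; the function name `play_survivor` and the seed value are illustrative choices only.

```python
import random


def play_survivor(n_pennies=100, seed=None):
    """Simulate one game; return the survivor count at each round.

    Index 0 holds the starting number of pennies, and the list ends
    with the first round in which no pennies survive.
    """
    rng = random.Random(seed)
    counts = [n_pennies]
    while counts[-1] > 0:
        # Each remaining penny is a survivor if it lands heads (assumed p = 0.5).
        survivors = sum(1 for _ in range(counts[-1]) if rng.random() < 0.5)
        counts.append(survivors)
    return counts


counts = play_survivor(n_pennies=100, seed=7)
for round_number, number in enumerate(counts):
    print(round_number, number)
```

A simulated game is no substitute for your own data, but it gives a sense of how many rounds to expect.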
Once you have played the game, you should enter your data (round and number) into JMP. Using
round as the explanatory variable (X, Factor) and number as the response variable (Y, Response),
Fit Y by X. Turn in the JMP output for both fits 1. and 2. below.
1. Fit line
a) Describe the relationship between round and number. How well does the regression line
summarize this relationship?
b) Give the equation for the least squares regression line.
c) Using the least squares regression line, what is the predicted number of pennies that have
survived until round 4? Is this prediction more or less than the observed number of pennies
that actually did survive until round 4?
d) Describe the residual plot. Do you see any problems with using a line to summarize the
relationship between round and number?
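The calculations behind the fitted line, the round 4 prediction, and the residuals can be sketched outside JMP as follows. The survivor counts here are hypothetical (roughly halving each round) and stand in for your own recorded data; `fit_line` is an illustrative helper using the usual least squares formulas.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical game results (made up; replace with your recorded table).
rounds = list(range(8))
survivors = [100, 52, 27, 13, 7, 3, 1, 0]

b0, b1 = fit_line(rounds, survivors)
predicted_round_4 = b0 + b1 * 4

# Residual = observed - predicted, one per round.
residuals = [y - (b0 + b1 * x) for x, y in zip(rounds, survivors)]

print(f"Predicted number = {b0:.2f} + ({b1:.2f}) * round")
print(f"Round 4: predicted {predicted_round_4:.1f}, observed {survivors[4]}")
print([round(r, 1) for r in residuals])
```

When you look at the residuals, pay attention to the order of their signs across rounds rather than only their sizes.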
2. Fit Special – Y Transformation – Natural Logarithm: log(y)
Your final round has zero survivors, and the natural logarithm of zero is not defined. Before
fitting, change the number of survivors for your final round to 0.5.
a) Give the equation for the least squares regression line that relates log(y) to X.
b) Back transform the equation in a) so that you get a prediction equation on the original scale.
c) Using the prediction equation in b), what is the predicted number of pennies that have
survived until round 4? Is this prediction more or less than the observed number of pennies
that actually did survive until round 4?
d) Describe the residual plot. Do you see any problems with using the special prediction
equation to summarize the relationship between round and number?
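The log transformation and back transformation can be sketched in the same way. If the fitted line is log(y) = b0 + b1*x, then exponentiating both sides gives y = exp(b0) * exp(b1)^x. The counts below are hypothetical stand-ins for your own data, and the helper names are illustrative only.

```python
import math


def fit_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical game results (made up; replace with your recorded table).
rounds = list(range(8))
survivors = [100, 52, 27, 13, 7, 3, 1, 0]

# Replace the final zero with 0.5 so the natural logarithm is defined.
adjusted = [y if y > 0 else 0.5 for y in survivors]
log_y = [math.log(y) for y in adjusted]

b0, b1 = fit_line(rounds, log_y)


def predict(x):
    """Back-transformed prediction on the original scale."""
    return math.exp(b0) * math.exp(b1) ** x


print(f"log(number) = {b0:.3f} + ({b1:.3f}) * round")
print(f"number = {math.exp(b0):.1f} * {math.exp(b1):.3f}^round")
print(f"Round 4: predicted {predict(4):.1f}, observed {survivors[4]}")
```

The back-transformed multiplier exp(b1) can be read as the estimated fraction of pennies surviving each round, which you can compare with what you would expect from a fair coin.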
Stat 101 L: Laboratory 7 – Answer Sheet
Names: _________________________
_________________________
_________________________
_________________________
Activity 1:
1. Weight of an empty bag?
a) From the simple summary statistics, come up with an estimate of the weight of a single
empty bag. Explain your reasoning.
b) What is the value of the estimated y-intercept? Given your answer in a) does this value
look right? Explain your answer.
c) What is the value of the estimated slope? Why does this value seem wrong? Hint: Think
of the interpretation of the slope within the context of the problem. What should the true
value of the slope of the line relating Total Weight to Contents Weight be?
d) What is the value of the estimated y-intercept for this special fit? How close is this value to
the value you came up with in a)?
2. Weight of a single M&M?
a) From the simple summary statistics, come up with an estimate for the weight of a single
M&M. Explain your reasoning.
b) What is the value of the estimated slope? Given your answer in a) does this value look
right? Explain your answer.
c) What is the value of the estimated y-intercept? Why does this value seem wrong? Hint:
Think about the interpretation of the intercept within the context of the problem. What
should the value of the intercept of the line that relates Contents Weight to Number be?
d) What is the value of the estimated slope for this special fit? How close is this value to the
value you came up with in a)?
Activity 2:
Round     0     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15
Number    100   ___   ___   ___   ___   ___   ___   ___   ___   ___   ___   ___   ___   ___   ___   ___
1. Fit line
a) Describe the relationship between round and number. How well does the regression line
summarize this relationship?
b) Give the equation for the least squares regression line.
c) Using the least squares regression line, what is the predicted number of pennies that have
survived until round 4? Is this prediction more or less than the observed number of pennies
that actually did survive until round 4?
d) Describe the residual plot. Do you see any problems with using a line to summarize the
relationship between round and number?
2. Fit Special – Y Transformation – Natural Logarithm: log(y)
a) Give the equation for the least squares regression line that relates log(y) to X.
b) Back transform the equation in a) so that you get a prediction equation on the original scale.
c) Using the prediction equation in b), what is the predicted number of pennies that have
survived until round 4? Is this prediction more or less than the observed number of pennies
that actually did survive until round 4?
d) Describe the residual plot. Do you see any problems with using the special prediction
equation to summarize the relationship between round and number?