Statistics 101: Section L - Laboratory 7

In today’s lab, we are going to look further at topics in least squares regression.

Activity 1:
In last week’s lab we saw that our attempts to use least squares regression to estimate the weight of an empty bag and the weight of a single M&M did not turn out too well. In this activity we will look at how we can do a better job of estimating these quantities, first from simple summary statistics and then by using a feature in JMP to fit special lines. Below are some simple summary statistics on the Total Weight (g), Contents Weight (g) and Number of M&Ms.

                        # of Fun Size bags    Sample Mean
Total Weight (g)                88              19.14 g
Contents Weight (g)             88              18.33 g
Number of M&Ms                  88              21.26

1. Weight of an empty bag?
a) From the simple summary statistics, come up with an estimate of the weight of a single empty bag. Explain your reasoning.
b) In last week’s lab we used Total Weight as the Y, Response and Contents Weight as the X, Factor and obtained the following prediction equation.
      Predicted Total Weight = 2.91 + 0.885*Contents Weight
   What is the value of the estimated y-intercept? Given your answer in a), does this value look right? Explain your answer.
c) What is the value of the estimated slope? Why does this value seem wrong? Hint: Think of the interpretation of the slope within the context of the problem. What should the true value of the slope of the line relating Total Weight to Contents Weight be?
d) Use Analyze – Fit Y by X with Total Weight as the Y, Response and Contents Weight as the X, Factor. Use Fit Special and constrain the slope to its true value. What is the value of the estimated y-intercept for this special fit? How close is this value to the value you came up with in a)?

2. Weight of a single M&M?
a) From the simple summary statistics, come up with an estimate for the weight of a single M&M. Explain your reasoning.
b) In last week’s lab we used Contents Weight as the Y, Response and Number as the X, Factor and obtained the following prediction equation.
      Predicted Contents Weight = 3.62 + 0.692*Number
   What is the value of the estimated slope? Given your answer in a), does this value look right? Explain your answer.
c) What is the value of the estimated y-intercept? Why does this value seem wrong? Hint: Think about the interpretation of the intercept within the context of the problem. What should the value of the intercept of the line that relates Contents Weight to Number be?
d) Use Analyze – Fit Y by X with Contents Weight as the Y, Response and Number as the X, Factor. Use Fit Special and constrain the intercept to its true value. What is the value of the estimated slope for this special fit? How close is this value to the value you came up with in a)?

Activity 2:
In this activity, your group is going to play Survivor - Pennies. In this rip-off of the TV show Survivor, the goal is simply to outlast the other groups. Each group begins the game with 100 pennies in a plastic cup. At the start of each round, your group will shake the pennies in the cup and pour them out on the table. Any penny that lands “heads” up is a “survivor” and continues to the next round. Pennies landing “tails” up are losers and are set aside. At the end of each round, count the number of “survivors” and record this value in the table provided. The game continues until no “survivors” are left.

Once you have played the game, enter your data (round and number) into JMP. Using round as the explanatory variable (X, Factor) and number as the response variable (Y, Response), use Fit Y by X. Turn in the JMP output for both fits, 1. and 2., below.

1. Fit line
a) Describe the relationship between round and number. How well does the regression line summarize this relationship?
b) Give the equation for the least squares regression line.
c) Using the least squares regression line, what is the predicted number of pennies that have survived until round 4? Is this prediction more or less than the observed number of pennies that actually did survive until round 4?
d) Describe the residual plot. Do you see any problems with using a line to summarize the relationship between round and number?

2. Fit Special – Y Transformation – Natural Logarithm: log(y)
This fit takes the natural logarithm of the number of survivors. Your final round has zero survivors, and log(0) is not defined, so change the number of survivors for your final round to 0.5 before fitting.
a) Give the equation for the least squares regression line that relates log(y) to X.
b) Back transform the equation in a) so that you get a prediction equation on the original scale.
c) Using the prediction equation in b), what is the predicted number of pennies that have survived until round 4? Is this prediction more or less than the observed number of pennies that actually did survive until round 4?
d) Describe the residual plot. Do you see any problems with using the special prediction equation to summarize the relationship between round and number?

Stat 101 L: Laboratory 7 – Answer Sheet

Names: _________________________   _________________________
       _________________________   _________________________

Activity 1:

1. Weight of an empty bag?
a) From the simple summary statistics, come up with an estimate of the weight of a single empty bag. Explain your reasoning.
b) What is the value of the estimated y-intercept? Given your answer in a), does this value look right? Explain your answer.
c) What is the value of the estimated slope? Why does this value seem wrong? Hint: Think of the interpretation of the slope within the context of the problem. What should the true value of the slope of the line relating Total Weight to Contents Weight be?
d) What is the value of the estimated y-intercept for this special fit? How close is this value to the value you came up with in a)?

2. Weight of a single M&M?
a) From the simple summary statistics, come up with an estimate for the weight of a single M&M. Explain your reasoning.
b) What is the value of the estimated slope? Given your answer in a), does this value look right? Explain your answer.
c) What is the value of the estimated y-intercept? Why does this value seem wrong? Hint: Think about the interpretation of the intercept within the context of the problem. What should the value of the intercept of the line that relates Contents Weight to Number be?
d) What is the value of the estimated slope for this special fit? How close is this value to the value you came up with in a)?

Activity 2:

Round     0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
Number  100   __   __   __   __   __   __   __   __   __   __   __   __   __   __   __

1. Fit line
a) Describe the relationship between round and number. How well does the regression line summarize this relationship?
b) Give the equation for the least squares regression line.
c) Using the least squares regression line, what is the predicted number of pennies that have survived until round 4? Is this prediction more or less than the observed number of pennies that actually did survive until round 4?
d) Describe the residual plot. Do you see any problems with using a line to summarize the relationship between round and number?

2. Fit Special – Y Transformation – Natural Logarithm: log(y)
a) Give the equation for the least squares regression line that relates log(y) to X.
b) Back transform the equation in a) so that you get a prediction equation on the original scale.
c) Using the prediction equation in b), what is the predicted number of pennies that have survived until round 4? Is this prediction more or less than the observed number of pennies that actually did survive until round 4?
d) Describe the residual plot. Do you see any problems with using the special prediction equation to summarize the relationship between round and number?
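
Optional check outside JMP: the constrained fits in Activity 1 and the log(y) fit in Activity 2 can be reproduced with a few lines of Python. This is only a rough sketch and is not part of the lab. It rests on several assumptions: every function name is made up for this illustration, the bag weights in the demo are invented rather than the class data, and the penny game is simulated assuming each penny lands heads with probability 0.5. It uses the least squares formulas directly and does not claim to mirror how JMP computes Fit Special.

```python
import math
import random


def intercept_with_fixed_slope(x, y, slope):
    """Least squares intercept when the slope is held at a known value.

    Minimizing sum((y - b0 - slope*x)^2) over b0 gives b0 = mean(y - slope*x).
    """
    return sum(yi - slope * xi for xi, yi in zip(x, y)) / len(x)


def slope_with_fixed_intercept(x, y, intercept):
    """Least squares slope when the intercept is held at a known value.

    Minimizing sum((y - intercept - b1*x)^2) over b1 gives
    b1 = sum(x*(y - intercept)) / sum(x^2).
    """
    numerator = sum(xi * (yi - intercept) for xi, yi in zip(x, y))
    return numerator / sum(xi * xi for xi in x)


def least_squares_line(x, y):
    """Ordinary (unconstrained) least squares; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def play_survivor(n_pennies=100, seed=7):
    """Simulate one game; returns survivor counts for rounds 0, 1, 2, ..."""
    rng = random.Random(seed)  # assumption: fair pennies, heads prob. 0.5
    counts = [n_pennies]
    while counts[-1] > 0:
        heads = sum(rng.random() < 0.5 for _ in range(counts[-1]))
        counts.append(heads)
    return counts


def fit_log_survivors(counts):
    """Fit log(number) on round after changing the final 0 to 0.5."""
    rounds = list(range(len(counts)))
    adjusted = [c if c > 0 else 0.5 for c in counts]
    return least_squares_line(rounds, [math.log(c) for c in adjusted])


def predict_survivors(intercept, slope, round_number):
    """Back transform: number = exp(intercept + slope*round)."""
    return math.exp(intercept + slope * round_number)


if __name__ == "__main__":
    # Invented weights in grams, for illustration only (not the lab data).
    contents = [17.9, 18.4, 18.1, 18.8]
    total = [18.7, 19.3, 18.9, 19.6]
    number = [20, 22, 21, 22]
    print("intercept, slope fixed at 1:", intercept_with_fixed_slope(contents, total, 1.0))
    print("slope, intercept fixed at 0:", slope_with_fixed_intercept(number, contents, 0.0))

    counts = play_survivor()
    b0, b1 = fit_log_survivors(counts)
    print("simulated survivors:", counts)
    print("log-scale fit predicts for round 4:", predict_survivors(b0, b1, 4))
```

To compare with your own JMP output, replace the invented lists with your recorded values. The back transformed prediction has the form number = exp(b0) * exp(b1)^round, so exp(b1) can be read as the estimated fraction of pennies surviving each round.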