MATH 1342. Instructor: Mary Parker Homework Chapter 5 notes last updated Sept 14, 2007 page 1 of 2 Homework Notes Chapter 5 These notes do not include full answers. Students are expected to read the full answers in StatsPortal before looking at these notes. I am preparing these notes to provide more details about what you should be thinking about and learning from each of the exercises. The multiple choice questions just before the Exercises at the end of the chapter are not discussed here because there is information in StatsPortal beside the correct answer to help students see why that answer is better than the others. (And also because it would take me twice as long to write these if I had to address all of those questions as well!) Chapter 5: [5.1, 5.3(M), 5.5(M), 5.6, 5.7(M), 5.9(M), 5.11, 5.13, 5.15, 5.17, 5.19, 5.21, 5.23(M)], 5.25, 5.27(M), 5.31, 5.33(M), 5.34(M), 5.35(M), 5.37, 5.39(T), 5.45(M), 5.51, 5.54(T) In my classes, before you do any homework problems, you should go through the worksheet as part of learning about the first 8 pages of this chapter. 5.1 – 5.5. I have nothing more to say than was said in StastPortal. 5.6. When you see numerical correlations given, you should visualize pictures more or less like those on page 102. From those, you can answer the question in this problem. I realize the answer to this is not in StatsPortal. Since this might be given for a quiz problem, I will not provide the actual answer here. 5.7. You can do this problem, as stated, by hand fairly easily, since they have given you the regression equation and all the residuals. It is a good idea to visualize how you would do each part of it by hand. And then it is a good idea to also practice doing all the parts of it using MINITAB. That will include finding the regression equation, storing the residuals, and then using those stored residuals – both to find the sum of them and to make a scatterplot of the residuals (on the vertical axis) versus the x-values (on the horizontal axis.) Probably you’ll find it useful to look at the MINITAB Manual for instructions for doing these tasks. (Choose Resources > Technology Manuals > Chapter __ > Minitab Manual) 5.9. For some reason, StatsPortal only includes the answer for part a, but not b and c. However, the answers in the back of the book includes answers to parts b and c. b. Do you recognize which question here is asking you for the slope coefficient. Do you recognize which question is asking for r2? It’s very important that you learn to recognize how you may be asked for these. c. You noticed that the prediction was for a negative number of people living on farms, and clearly that isn’t reasonable. That does not mean there is anything wrong with either the mathematics or the statistics. It just means that the somewhat linear pattern between 1935 and 1980 does not continue in that same form. This illustrates the problem with extrapolation. Most variables which are linearly related are only linearly related over some set of x-values. And you don’t know exactly where that begins and ends. So if you are making predictions outside the range of the data you observed, you must recognize that such MATH 1342. Instructor: Mary Parker Homework Chapter 5 notes last updated Sept 14, 2007 page 2 of 2 predictions may not be correct because the model has changed for this new set of values for x. That’s why we must “beware of extrapolation.” The crucial thing here is not that the prediction is negative, but that it is a prediction for an x-value fairly far from the observed x-values. 5.11, 5.13. These are somewhat open-ended questions. It is important for you to write these in your own words and then determine whether what you say communicates the same type of answers as those given in StatsPortal. 5.25, 5.31, 5.33. I have nothing to add to what is in StatsPortal. 5.34. The point of this question is that the numerical summaries (correlation coefficient and regression line) are not adequate to understand the data set. You must also look at graphs. Since this is an even-numbered problem, I’m not giving you the answers. I believe you’ll find it fairly easy to answer these when you look at the scatterplots. 5.35, 5.37, 5.39. 5.41. These are very important questions for you to do and understand. I have nothing else to say about them except what is in StatsPortal. 5.45. The result of part b shouldn’t be a surprise because doing the regression finds the linear relationship between the two variables. So the residuals show what is left from the data after all the effects of the linear relationship are subtracted off. Thus, the residuals should not have anything left of a linear relationship with the explanatory variable. And the correlation coefficient measures the extent of a linear relationship between the variables. Since there can be no remnants of a linear relationship left between x and the residuals, then the correlation coefficient between them must be zero, up to round-off error. 5.51. I have nothing else to say about this except what is in StatsPortal. 5.54. Since this is an even-numbered problem, and may be used for a quiz, I will not provide an answer. I believe that, if you create the scatterplots as indicated, it will be clear why it is not reasonable to use a linear regression equation to describe the relationship in some of the situations.