Practice 5: Regression

1. The scatter plot and best-fit line show the relation between the number of cars waiting by a school (y) and the amount of time after the end of classes (x), in arbitrary units. The correlation coefficient is −0.55. Determine the amount of variation in the number of cars that is not explained by the variation in time after school.

SOLUTION
About 70%. With r = −0.55, r^2 ≈ 0.30 of the variation is explained, so 1 − r^2 ≈ 0.70 is not.

2. On the basis of the following plot of horn length versus tail length of peppermint-flavored unicorns, what is the approximate value of the correlation coefficient?

[Scatter plot: horn length on the vertical axis (0 to 25) against tail length on the horizontal axis (0 to 10).]

SOLUTION
a. -1
b. -0.6
c. 0
d. +0.6
e. +1

3. If the coefficient of correlation equals 0.61, what proportion of the variation in the dependent variable is explained by the variation in the independent variable?

SOLUTION
37%. With r = 0.61, r^2 ≈ 0.37 (the coefficient of determination).

4. Suppose that a report contains this graph. (Note: to complete the graph, connect the *'s with a smooth curve.)

[Graph: annual income in thousands of $ per year on the vertical axis (ticks at 25 and 50) against years of experience in trade on the horizontal axis (ticks at 10, 20 and 30); data points are marked with *.]

a. What does the graph indicate as annual income for someone with no experience in the trade?
b. Describe the relation between income and experience over the interval from 0 to 20.
c. Describe the relation between income and experience over the interval from 20 to 30.
d. Describe the overall graph.

SOLUTION
a. Around 12,500 dollars per year.
b. There appears to be approximately a straight-line relation in which income increases with experience over the interval from 0 to 20. (There seems to be some curvature or flattening for experience near 20.) Income in this range rises from around 12.5 to around 48 (thousands of dollars), so the rate of increase is roughly $35,500/20 = $1,775 per year of experience.
c.
The relation between experience and income for experience between 20 and 30 years also appears to be roughly a straight line, but a flat one, indicating that income stays roughly constant at a little less than $50,000 per year.
d. The overall graph indicates income initially around $12,500 (no experience), increasing over the range from 0 to 20 years of experience and approaching a limit that seems to be a little below $50,000. That limit seems to be reached sometime between 10 and 25 years. (Income seems to remain about constant afterward.)

5. Below are the average heights for American boys. (Source: Physician's Handbook, 1990)

Age (years)    Height (cm)
birth          50.8
2              83.8
3              91.4
5              106.6
7              119.3
10             137.1
14             157.5

a. Calculate the least squares line. Put the equation in the form y = a + bx.
b. Find the estimated average height for a 1-year-old and for an 11-year-old.
c. Draw a scatter plot of the data and plot the least squares line on your graph.
d. Use the least squares line to estimate the average height for a 62-year-old man. Do you think that your answer is reasonable? Why or why not?
e. What is the slope of the least squares (best-fit) line? Interpret the slope.

SOLUTION
a. y = 65.0876 + 7.0948x
b. 72.2 cm; 143.13 cm
d. 505.0 cm; No. An age of 62 lies far outside the range of the data (birth to 14 years), so the line cannot be trusted there.
e. Slope = 7.0948. As the age of an American boy increases by one year, the average height tends to increase by 7.0948 cm.

6. Once upon a time there was a wizard who invented a means of amplifying the sounds produced by the king's musicians. By dropping gold coins into a slot, the king could amplify sound: 1 coin gave 1-fold amplification, 2 coins gave 2-fold, and so on. The wizard persuaded the king to write down a number indicating the pleasure he experienced from various performances of the musicians. Later he presented these results:

Coins       0    1    2    3
Pleasure    10   15   20   25

Said the wizard, "Clearly, your majesty, we have found the fountain of pure joy. The more coins, the greater amplification, and the greater your pleasure."
Which, if any, of these amounts do you recommend that the king use for his next investment in amplification? (You should suggest more than one if appropriate.) Explain your advice.

a. 3 coins
b. 100 coins
c. 1000 coins
d. 4 coins
e. 10 coins

SOLUTION
I would recommend only 3 coins because, not having any observations over 3 coins, I have no idea what may happen when extrapolating. Another factor may enter the problem at higher levels that may be very unpleasant. Don't forget that kings can be very nasty people, liable to chop off heads as their whims take them.

7. The following data show the number of hurricanes by category to directly strike the mainland U.S. each decade. (Source: www.nhc.noaa.gov/gifs/table6.gif) A major hurricane is one with a strength rating of 3, 4 or 5.

Decade       Total Number of Hurricanes    Number of Major Hurricanes
1941-1950    24                            10
1951-1960    17                            8
1961-1970    14                            6
1971-1980    12                            4
1981-1990    15                            5
1991-2000    14                            5
2001-2004    9                             3

a. Using only completed decades (1941-2000), calculate the least squares line for the number of major hurricanes expected based upon the total number of hurricanes.

SOLUTION
y = 0.5x − 1.67

b. The data for 2001-2004 show that 9 hurricanes have hit the mainland United States. The line of best fit predicts 2.83 major hurricanes to hit the mainland U.S. Choose the best answer to the following question: Can the least squares line be used to make this prediction?

A. No, because 9 lies outside the independent variable values
B. Yes, because, in fact, there have been 3 major hurricanes this decade
C. No, because 2.83 lies outside the dependent variable values
D. Yes, because how else could we predict what is going to happen this decade?
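
The numerical answers in problems 1, 3, 5 and 7 can be checked with a short Python sketch. It uses the textbook formulas b = Sxy/Sxx and a = ȳ − b·x̄ for the line y = a + bx, together with r^2 as the explained share of variation. The function name `least_squares` and the variable names are illustrative choices, not part of the worksheet; the data are copied from the tables in problems 5 and 7.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = y_mean - b * x_mean  # intercept
    return a, b


# Problems 1 and 3: share of variation explained (r^2) and not explained (1 - r^2).
unexplained_1 = 1 - (-0.55) ** 2
explained_3 = 0.61 ** 2

# Problem 5: average height (cm) against age (years); "birth" is entered as age 0.
ages = [0, 2, 3, 5, 7, 10, 14]
heights = [50.8, 83.8, 91.4, 106.6, 119.3, 137.1, 157.5]
a5, b5 = least_squares(ages, heights)
height_at = {age: a5 + b5 * age for age in (1, 11, 62)}

# Problem 7a: major hurricanes against total hurricanes, completed decades only.
totals = [24, 17, 14, 12, 15, 14]
majors = [10, 8, 6, 4, 5, 5]
a7, b7 = least_squares(totals, majors)
pred_2001_2004 = a7 + b7 * 9  # value the fitted line gives for x = 9

print(f"Problem 1: unexplained share = {unexplained_1:.4f}")
print(f"Problem 3: explained share   = {explained_3:.4f}")
print(f"Problem 5: y = {a5:.4f} + {b5:.4f}x")
for age, h in height_at.items():
    print(f"  age {age}: {h:.2f} cm")
print(f"Problem 7: y = {b7:.2f}x + ({a7:.2f}); at x = 9: {pred_2001_2004:.2f}")
```

The printed values should agree with the worksheet's solutions up to rounding. Note that the sketch only evaluates the fitted line at x = 9; whether that prediction is appropriate is the question posed in 7b.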