Practice 5: Regression
1. The scatter plot and best-fit line show the relation between the number of cars waiting by a school
(y) and the amount of time after the end of classes (x) in arbitrary units. The correlation coefficient is −0.55.
Determine the amount of variation in the number of cars not explained by the variation in time after school.
SOLUTION
About 70%: r**2 = (−0.55)**2 ≈ 0.30 of the variation is explained, so 1 − 0.30 ≈ 0.70 is not.
2. On the basis of the following plot of horn length versus tail length of peppermint flavored unicorns, what
is the approximate value of the correlation coefficient?
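[Figure: scatter plot of horn length and tail length for the unicorns. Axis labels: Tail Length, Horn Length; one axis runs from 0 to 25 and the other from 0 to 10. The point positions are not recoverable from the text.]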
SOLUTION
a. -1
b. -.6
c. 0
d. +.6
e. +1
3. If the coefficient of correlation equals 0.61, what proportion of the variation in the dependent
variable is explained by the variation in the independent variable?
SOLUTION: 37%
r = 0.61
r**2 = 0.61**2 ≈ 0.37 (coefficient of determination)
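The arithmetic behind problems 1 and 3 can be checked with a short sketch. The function name `variation_split` is an illustrative choice, not something from the worksheet; it only squares r and takes the complement.

```python
def variation_split(r):
    """Return (explained, unexplained) proportions of variation for correlation r."""
    explained = r ** 2            # coefficient of determination
    unexplained = 1.0 - explained
    return explained, unexplained


# Problem 3: proportion explained when r = 0.61
explained, _ = variation_split(0.61)
print(f"r = 0.61: explained = {explained:.2f}")       # 0.37

# Problem 1: proportion NOT explained when r = -0.55
_, unexplained = variation_split(-0.55)
print(f"r = -0.55: unexplained = {unexplained:.2f}")  # 0.70
```

The sign of r does not matter here, because squaring removes it.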
4. Suppose that a report contains this graph (Note: to complete graph, connect the *'s with a smooth curve.)
[Graph: Annual Income (thousands of $ per year; vertical axis, ticks at 25 and 50) against Years of Experience in Trade (horizontal axis, ticks at 10, 20 and 30), with four plotted points (*). The point positions are not recoverable from the text.]
a. What does the graph indicate as annual income for someone with no experience in the trade?
b. Describe the relation between income and experience over the interval from 0 to 20.
c. Describe the relation between income and experience over the interval 20 to 30.
d. Describe the overall graph.
SOLUTION
a. Around 12,500 dollars per year.
b. There appears to be approximately a straight line relation in which income increases with experience
over the interval from 0 to 20. (There seems to be some curvature or flattening for experience near 20.)
The change in income in this range is from around $12,500 to around $48,000, so the rate of increase in income is
roughly $35,500/20 = $1775 per year.
c. The relation between experience and income for experience between 20 and 30 years also appears to be
roughly a straight line, but a flat straight line, indicating that income stays roughly constant at a little less
than $50,000 per year.
d. The overall graph indicates income initially around $12,500 (no experience), increasing income in the
range from 0 to 20 years experience, approaching a limit that seems to be a little below $50,000. That limit
seems to be reached sometime between 10 and 25 years. (Income seems to remain about constant
afterward.)
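The rate quoted in part b is an average rate of change between two points read off the graph. A minimal sketch, assuming the approximate endpoint readings used in the solution ($12,500 at 0 years, $48,000 at 20 years); the helper name `average_rate` is hypothetical:

```python
def average_rate(x_start, y_start, x_end, y_end):
    """Average change in y per unit of x between two points."""
    return (y_end - y_start) / (x_end - x_start)


# Approximate readings from the graph (assumed values, taken from the solution text)
rate = average_rate(0, 12_500, 20, 48_000)
print(rate)  # 1775.0 dollars per additional year of experience
```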
5. Below are the average heights for American boys. (Source: Physician’s Handbook, 1990)
Age (years)   Height (cm)
birth         50.8
2             83.8
3             91.4
5             106.6
7             119.3
10            137.1
14            157.5
a. Calculate the least squares line. Put the equation in the form y = a + bx.
b. Find the estimated average height for a 1-year-old and for an 11-year-old.
c. Draw a scatter plot of the data and plot the least squares line on your graph.
d. Use the least squares line to estimate the average height for a sixty-two-year-old man. Do you think
that your answer is reasonable? Why or why not?
e. What is the slope of the least squares (best-fit) line? Interpret the slope.
SOLUTION
a. y = 65.0876 + 7.0948*x
b. 72.2 cm; 143.13 cm
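d. 505.0 cm; No. Age 62 lies far outside the observed ages (birth to 14), so the line is being extrapolated, and 505 cm is not a plausible height.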
e. slope = 7.0948. As the age of an American boy increases by one year, the average height tends to
increase by 7.0948 cm.
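The coefficients in part a can be reproduced with the usual sum formulas for a simple least squares line. This is a sketch under one assumption: "birth" is coded as age 0. The function name `least_squares` is illustrative.

```python
# Data from the table; "birth" is coded as age 0 (an assumption)
ages = [0, 2, 3, 5, 7, 10, 14]
heights = [50.8, 83.8, 91.4, 106.6, 119.3, 137.1, 157.5]


def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)  # slope
    a = (sum_y - b * sum_x) / n                                   # intercept
    return a, b


a, b = least_squares(ages, heights)
print(f"y = {a:.4f} + {b:.4f}x")

# Parts b and d: estimated heights at ages 1, 11 and 62
for age in (1, 11, 62):
    print(age, round(a + b * age, 2))
```

The estimate for age 62 is computed the same way as the others, which is exactly why it needs a sanity check: the formula gives no warning when x is far outside the data.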
6. Once upon a time there was a wizard who invented a means of amplifying the sounds produced by the
king's musicians. By dropping a gold coin into a slot, the king could amplify sound: 1 coin, 1-fold; 2 coins,
2-fold; etc. The wizard persuaded the king to write down a number indicating the pleasure he experienced
from various performances of the musicians. Later he presented these results:
Coins      0    1    2    3
Pleasure   10   15   20   25
Said the wizard, "Clearly, your majesty, we have found the fountain of pure joy. The more coins, the
greater amplification, and the greater your pleasure."
Which, if any, of these amounts do you recommend that the king use for his next investment in
amplification (you should suggest more than one if appropriate). Explain your advice.
a. 3 coins
b. 100 coins
c. 1000 coins
d. 4 coins
e. 10 coins
SOLUTION
I would recommend only 3 coins: with no observations above 3 coins, I have no idea what
may happen when extrapolating. Another factor may enter the problem at higher levels that may be very
unpleasant. Don't forget that kings can be very nasty people, liable to chop off heads as their whims take
them.
7. The following data show the number of hurricanes by category to directly strike the mainland U.S. each
decade. A major hurricane is one with a strength rating of 3, 4 or 5. (Source: www.nhc.noaa.gov/gifs/table6.gif)
Decade       Total Number of Hurricanes   Number of Major Hurricanes
1941-1950    24                           10
1951-1960    17                           8
1961-1970    14                           6
1971-1980    12                           4
1981-1990    15                           5
1991-2000    14                           5
2001-2004    9                            3
a. Using only completed decades (1941 – 2000), calculate the least squares line for the number of major
hurricanes expected based upon the total number of hurricanes.
SOLUTION: y = 0.5x − 1.67
b. The data for 2001-2004 show 9 hurricanes have hit the mainland United States. The line of best fit
predicts 2.83 major hurricanes to hit mainland U.S. Choose the best answer to the following question: Can
the least squares line be used to make this prediction?
A. No, because 9 lies outside the independent variable values
B. Yes, because, in fact, there have been 3 major hurricanes this decade
C. No, because 2.83 lies outside the dependent variable values
D. Yes, because how else could we predict what is going to happen this decade.
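One way to think about part b is to compare the new x value with the range of x values used to fit the line. The sketch below takes the fitted line from part a as given and uses two hypothetical helpers, `predict_major` and `within_observed_range`; it illustrates the check and is not the worksheet's own answer key.

```python
# Total hurricanes in the completed decades (1941-2000), from the table
totals = [24, 17, 14, 12, 15, 14]


def predict_major(total):
    """Predicted number of major hurricanes from the line y = 0.5x - 1.67."""
    return 0.5 * total - 1.67


def within_observed_range(x, xs):
    """True if x lies between the smallest and largest observed x values."""
    return min(xs) <= x <= max(xs)


x_new = 9  # total hurricanes for 2001-2004
print(round(predict_major(x_new), 2))        # 2.83
print(min(totals), max(totals))              # 12 24
print(within_observed_range(x_new, totals))  # False
```

A prediction at x = 9 therefore uses the line outside the interval (12 to 24) on which it was fitted.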