# Quiz 14 - La Sierra University
```
Math 251, Practice Questions on Topics Since 3rd Test
(Linear Regression, Goodness of Fit, Analysis of Variance)
1. (From p. 594 #12) Let x be the age of a licensed driver in years. Let y be the percentage of all
fatal accidents (for a given age) due to failure to yield the right of way. For example, the first
data pair says that 5% of all fatal accidents of 37-year-olds are due to failure to yield the right of
way. The Wall Street Journal reported the following data:
x    37   47   57   67   77   87
y     5    8   10   16   30   43
Note: Σx = 372, Σy = 112, Σx² = 24,814, Σy² = 3194, Σxy = 8254
(a) Draw a scatter plot and find the equation of the least squares line for this data.
(b) Use your line from (a) to predict the percentage of fatal accidents due to failure to yield right
of way for 40-year-olds.
(c) Use the regression line in (a) to predict the age for which the percentage of fatal accidents due
to failure to yield is 20%.
(d) Do the data appear to be positively or negatively correlated? Explain.
(e) Compute the correlation coefficient. Does its sign agree with your answer in (d)? Does it
suggest that there is a good linear fit?
(f) Compute the coefficient of determination, and interpret what it means.
Answer. A scatter plot of data with predicted values of line of best fit is included below:
[Figure: "Plot of Data & Predicted Values". Horizontal axis: x, from 0 to 100. Vertical axis: y, from -5 to 50. Series: y and Predicted y.]
The rest of the answers to question 1 may be found in a PDF file that better displays math.
2. (From p. 655 #3) The type of raw material used to construct stone tools found at the
archaeological site Casa del Rito is shown below. A random sample of 1486 stone tools was
obtained from a current excavation site.
Raw Material      Regional Percent    Observed Number of Tools
                  of Stone Tools      at Current Excavation Site
Basalt            61.3%               906
Obsidian          10.6%               162
Welded tuff       11.4%               168
Pedernal chert    13.1%               197
Other              3.6%                53
(a) Use a 1% level of significance to test the claim that the regional distribution of raw materials
fits the current excavation site.
Answer. We use a Chi-Square test (Section 11.2) with:
Null Hypothesis: The distribution is as claimed.
Alternative Hypothesis: The distribution is different from what is claimed.
At the 1% level of significance with 5 - 1 = 4 degrees of freedom, the critical region (from Table 7, p. A29) is χ² ≥ 13.28.
We now complete the following table to compute the sample test statistic.
Raw Material      Observed    Expected    (O - E)²/E
Basalt            906         910.918     .026552
Obsidian          162         157.516     .127646
Welded tuff       168         169.404     .011636
Pedernal chert    197         194.666     .027984
Other              53          53.496     .004599
The observed value is χ² = Σ (O - E)²/E = sum of the right column values = 0.1984. Because this
does not fall in the critical region, we fail to reject the null hypothesis.
(b) Repeat (a) using a 5% level of significance.
Answer. The only difference is that the critical region is now χ² ≥ 9.49. The observed value of
0.1984 still does not fall in the critical region, so we fail to reject the null hypothesis.
3. (From p. 694 #9) A sociologist studying New York City ethnic groups wishes to determine if
there is a difference in income for immigrants from four different countries during their first year
in the city. She obtained the data in the following table from a random sample of immigrants
from these countries (incomes in thousands of dollars). Use a 0.05 level of significance to test the
claim that there is no difference in the earnings of immigrants from the four different countries.
Country I    Country II    Country III    Country IV
12.7          8.3          20.3           17.2
 9.2         17.2          16.6            8.8
10.9         19.1          22.7           14.7
 8.9         10.3          25.2           21.3
16.4                       19.9           19.8
(a) State the null and alternative hypotheses.
The null hypothesis is: μ₁ = μ₂ = μ₃ = μ₄
The alternative hypothesis is: at least two of the means μ₁, μ₂, μ₃, μ₄ are different.
(b) What assumptions should be made on the populations in order to conduct the test?
Answer: The populations should be normal (or approximately normal) with equal variances.
(c) State the critical region for the test.
Answer: First observe that N = 19 and k = 4. Thus the numerator has k - 1 = 3 degrees of freedom,
and the denominator has N - k = 15 degrees of freedom. Using Table 8 on p. A32, we find that the
critical region is F ≥ 3.29.
(d) Given that MS_BET = 79.408 and MS_W = 17.223 for the given data, report the conclusion of the
hypothesis test.
Answer: Using the formula for F, we find that F = 79.408 / 17.223 = 4.61. Thus F is in the
critical region, so we reject the null hypothesis, that is, there is evidence that the population
means are not all the same.
(e) Repeat the test at the 1% level of significance.
Answer: For this, the critical region is F ≥ 5.42. The observed value of F = 4.61 does not
fall in the critical region, so we would not reject the null hypothesis at the 1% level of
significance.
```
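
The quiz does not show the working for question 1, parts (a) through (f). As a hedged sketch, the quantities those parts ask for can be computed from the six data pairs in plain Python. The function name `least_squares` and all variable names are illustrative assumptions, not part of the original quiz. The formulas are the standard computational forms: slope b = SS_xy / SS_xx, intercept a = ȳ - b·x̄, and r = SS_xy / √(SS_xx · SS_yy).

```python
import math

# Data pairs from question 1: x = driver age, y = percent of fatal accidents
# due to failure to yield the right of way.
ages = [37, 47, 57, 67, 77, 87]
percents = [5, 8, 10, 16, 30, 43]


def least_squares(x, y):
    """Return (intercept a, slope b, correlation r) for the line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(u * v for u, v in zip(x, y))
    # Sums of squares in computational form.
    ss_xx = sum_x2 - sum_x ** 2 / n
    ss_yy = sum_y2 - sum_y ** 2 / n
    ss_xy = sum_xy - sum_x * sum_y / n
    b = ss_xy / ss_xx
    a = sum_y / n - b * sum_x / n
    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    return a, b, r


a, b, r = least_squares(ages, percents)
pred_40 = a + b * 40      # part (b): predicted percent at age 40
age_20 = (20 - a) / b     # part (c): age at which the predicted percent is 20
r_squared = r ** 2        # part (f): coefficient of determination

print(f"y = {a:.3f} + {b:.4f} x")
print(f"predicted y at x = 40: {pred_40:.2f}")
print(f"x at which predicted y = 20: {age_20:.1f}")
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

The intermediate sums can be checked against the note in the question (Σx = 372, Σy = 112, Σx² = 24,814, Σy² = 3194, Σxy = 8254). The slope is positive for these data, which bears on parts (d) and (e).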
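
The goodness-of-fit table in question 2 can be reproduced with a few lines of Python. This is a minimal sketch: the helper name `chi_square_statistic` is a hypothetical choice, and the critical values are copied from the answer text (4 degrees of freedom), not computed.

```python
# Regional percentages and observed counts from question 2.
materials = ["Basalt", "Obsidian", "Welded tuff", "Pedernal chert", "Other"]
regional_share = [0.613, 0.106, 0.114, 0.131, 0.036]
observed = [906, 162, 168, 197, 53]

# Critical values copied from the answer text (4 degrees of freedom).
CRITICAL = {0.01: 13.28, 0.05: 9.49}


def chi_square_statistic(obs, shares):
    """Return (expected counts, chi-square statistic) for a goodness-of-fit test."""
    total = sum(obs)
    expected = [total * p for p in shares]
    stat = sum((o - e) ** 2 / e for o, e in zip(obs, expected))
    return expected, stat


expected, chi2 = chi_square_statistic(observed, regional_share)
for name, o, e in zip(materials, observed, expected):
    print(f"{name:15s} O = {o:4d}  E = {e:8.3f}  (O-E)^2/E = {(o - e) ** 2 / e:.6f}")
print(f"chi-square = {chi2:.4f}")

# Reject H0 only if the statistic falls in the critical region.
decisions = {alpha: chi2 >= crit for alpha, crit in CRITICAL.items()}
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject' if reject else 'fail to reject'} H0")
```

Each expected count is the sample size 1486 multiplied by the regional percentage, which is how the Expected column in the answer is obtained.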
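
Question 3(d) gives MS_BET and MS_W without showing where they come from. The sketch below recomputes them from the income table, assuming the usual one-way ANOVA definitions: the between-groups sum of squares divided by k - 1, and the within-groups sum of squares divided by N - k. The function name `one_way_anova` is illustrative, and the critical values are copied from the answer text, not computed.

```python
# Incomes (thousands of dollars) from question 3, one list per country.
samples = {
    "Country I": [12.7, 9.2, 10.9, 8.9, 16.4],
    "Country II": [8.3, 17.2, 19.1, 10.3],
    "Country III": [20.3, 16.6, 22.7, 25.2, 19.9],
    "Country IV": [17.2, 8.8, 14.7, 21.3, 19.8],
}

# Critical values copied from the answer text for (3, 15) degrees of freedom.
CRITICAL = {0.05: 3.29, 0.01: 5.42}


def one_way_anova(groups):
    """Return (MS between, MS within, F) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups variation: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups variation: observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


ms_between, ms_within, f_stat = one_way_anova(list(samples.values()))
print(f"MS_BET = {ms_between:.3f}, MS_W = {ms_within:.3f}, F = {f_stat:.2f}")

# Reject H0 only if F falls in the critical region.
decisions = {alpha: f_stat >= crit for alpha, crit in CRITICAL.items()}
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject' if reject else 'fail to reject'} H0")
```

Because Country II has only four observations, the group sizes are taken from the data and not assumed equal.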