Homework #10 - BetsyMcCall.net

Stat 2470, Homework #10, Fall 2014
Name ______________________________________
Instructions: Show work or give calculator commands used to solve each problem. You may use Excel or
other software for any graphs. Be sure to answer all parts of each problem as completely as possible,
and attach work to this cover sheet with a staple.
1. As the air temperature drops, river water becomes super-cooled and ice crystals form. Such ice
can significantly affect the hydraulics of a river. An article described an experiment in which ice
thickness (mm) was studied as a function of elapsed time (hr) under specified conditions. The
following data was read from a graph in the article. 𝑛 = 33, 𝑥 = 0.16, 0.33, 0.50, 0.67,
… 5.50; 𝑦 = 0.50, 1.25, 1.50, 2.75, 3.50, 4.75, 5.75, 5.60, 7.00, 8.00, 8.25, 9.50, 10.5, 11.00,
10.75, 12.5, 12.25, 13.25, 15.50, 15.00, 15.25, 16.25, 17.25, 18.00, 18.25, 18.15, 20.25, 19.50,
20.00, 20.50, 20.60, 20.50, 19.80.
a. The 𝑟² value resulting from a least squares fit is 0.977. Interpret this value in the context of
the problem and comment on the appropriateness of assuming an approximate linear
relationship.
b. The residuals, listed in the same order as the 𝑥-values, are −1.03, −0.92, −1.35, −0.78,
−0.68, −0.11, 0.21, −0.59, 0.13, 0.45, 0.06, 0.62, 0.94, 0.80, −0.14, 0.93, 0.04, 0.36, 1.92,
0.78, 0.35, −0.24, −0.43, −1.01, −1.75, −3.14. Plot the residuals against elapsed time.
What does the plot suggest?
2. Continuous recording of heart rate can be used to obtain information about the level of exercise
intensity or physical strain during sports participation, work, or other daily activities. An article
reported on a study to investigate using heart rate response (x, as a percentage of the maximum
rate) to predict oxygen uptake (y, as a percentage of maximum uptake) during exercise. The
accompanying data was read from a graph in the article.
HR    43.5  44.0  44.0  44.5  44.0  45.0  48.0  49.0
VO2   22.0  21.0  22.0  21.5  25.5  24.5  30.0  28.0
HR    49.5  51.0  54.5  57.5  57.7  61.0  63.0  72.0
VO2   32.0  29.0  38.5  30.5  57.0  40.0  58.0  72.0
Perform a simple linear regression analysis (construct a scatterplot of the data, find the linear
regression line, find the residuals, and construct a residual plot), paying particular attention to
the presence of any unusual or influential observations.
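One possible way to organize the computations for this problem is sketched below in Python, using only the standard library. The helper name least_squares and the variable names are illustrative assumptions, not part of the assignment; the scatterplot and residual plot themselves can be made in Excel or other software from the printed values.

```python
# Data as listed in Problem 2 (x = HR, y = VO2).
hr = [43.5, 44.0, 44.0, 44.5, 44.0, 45.0, 48.0, 49.0,
      49.5, 51.0, 54.5, 57.5, 57.7, 61.0, 63.0, 72.0]
vo2 = [22.0, 21.0, 22.0, 21.5, 25.5, 24.5, 30.0, 28.0,
       32.0, 29.0, 38.5, 30.5, 57.0, 40.0, 58.0, 72.0]


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


b0, b1 = least_squares(hr, vo2)

# Residual = observed y minus fitted y, in the same order as the data.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(hr, vo2)]

print(f"fitted line: y = {b0:.3f} + {b1:.3f} x")
for xi, ei in zip(hr, residuals):
    print(f"x = {xi:5.1f}   residual = {ei:8.3f}")
```

Residuals that are large relative to the others, or points whose 𝑥 value lies far from the rest, are the ones worth examining as unusual or influential.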
3. No tortilla chip aficionado likes soggy chips, so it is important to find characteristics of the
production process that produce chips with an appealing texture. The following data on
x=frying time (sec) and y=moisture content (%) appeared in an article on the subject.
𝑥    5     10    15    20    25    30    45    60
𝑦    16.3  9.7   8.1   4.2   3.4   2.9   1.9   1.3
a. Construct a scatterplot of y vs x and comment.
b. Construct a scatterplot of the (ln(𝑥), ln(𝑦)) pairs and comment.
c. What probabilistic relationship between x and y is suggested by the linear pattern in the plot
of part (b)?
d. Predict the value of moisture content when frying time is 20, in a way that conveys
information about reliability and precision.
e. Analyze the residuals from fitting the simple linear regression model to the transformed
data and comment.
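The transformed fit in parts (b) through (e) could be set up along the following lines. This is a minimal sketch using only the Python standard library; the variable names are assumptions chosen for illustration. It gives only a point prediction at 𝑥 = 20, so the interval requested in part (d) still has to be built from the appropriate t critical value and standard error.

```python
import math

x = [5, 10, 15, 20, 25, 30, 45, 60]             # frying time (sec)
y = [16.3, 9.7, 8.1, 4.2, 3.4, 2.9, 1.9, 1.3]   # moisture content (%)

# Transform both variables, then fit a straight line to the transformed pairs.
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]

n = len(x)
mean_lx = sum(log_x) / n
mean_ly = sum(log_y) / n
sxx = sum((u - mean_lx) ** 2 for u in log_x)
sxy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(log_x, log_y))
slope = sxy / sxx
intercept = mean_ly - slope * mean_lx

# Residuals on the transformed (log) scale, for part (e).
residuals = [v - (intercept + slope * u) for u, v in zip(log_x, log_y)]

# Point prediction at x = 20, back-transformed to the original scale.
pred_log = intercept + slope * math.log(20)
pred = math.exp(pred_log)

print(f"ln(y) = {intercept:.4f} + {slope:.4f} ln(x)")
print(f"predicted moisture content at x = 20: {pred:.2f}")
```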
4. A plot in an article suggests that the expected value of thermal conductivity y is a linear function
of 10⁴ ∙ 1/𝑥, where 𝑥 is lamellar thickness (Å).
𝑥    240   410   460   490   520   590   745   8300
𝑦    12.0  14.7  14.7  15.2  15.2  15.6  16.0  18.1
a. Estimate the parameters of the regression function and the regression function itself.
b. Predict the value of thermal conductivity when lamellar thickness is 500 Å.
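A sketch of the regression on the transformed predictor 𝑥′ = 10⁴/𝑥 is given below, assuming the data pairs are as tabulated in this problem. It uses only the Python standard library, and the names x_prime, b0, and b1 are illustrative.

```python
x = [240, 410, 460, 490, 520, 590, 745, 8300]          # lamellar thickness
y = [12.0, 14.7, 14.7, 15.2, 15.2, 15.6, 16.0, 18.1]   # thermal conductivity

# Transformed predictor x' = 10^4 * (1/x).
x_prime = [1e4 / v for v in x]

n = len(x)
xp_bar = sum(x_prime) / n
y_bar = sum(y) / n
sxx = sum((u - xp_bar) ** 2 for u in x_prime)
sxy = sum((u - xp_bar) * (v - y_bar) for u, v in zip(x_prime, y))
b1 = sxy / sxx
b0 = y_bar - b1 * xp_bar

# Prediction at x = 500, i.e. at x' = 10^4 / 500 = 20.
pred_500 = b0 + b1 * (1e4 / 500)

print(f"y = {b0:.3f} + ({b1:.4f}) * (10^4 / x)")
print(f"predicted thermal conductivity at x = 500: {pred_500:.2f}")
```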
5. In each of the following cases, decide whether the given function is intrinsically linear. If so,
identify 𝑥′ and 𝑦′, and then explain how a random error term 𝜖 can be introduced to yield an
intrinsically linear probabilistic model.
a. 𝑦 = 1/(𝛼 + 𝛽𝑥)
b. 𝑦 = 1/(1 + 𝑒^(𝛼+𝛽𝑥))
c. 𝑦 = 𝑒^(𝑒^(𝛼+𝛽𝑥))
d. 𝑦 = 𝛼 + 𝛽𝑒^(𝜆𝑥)
6. The following data on y=glucose concentration (g/L) and x=fermentation time (days) for a
particular blend of malt liquor was read from a scatterplot in an article.
𝑥    1    2    3    4    5    6    7    8
𝑦    74   54   52   51   52   53   58   71
a. Verify that a scatterplot of the data is consistent with the choice of a quadratic regression
model.
b. The estimated quadratic regression equation is 𝑦 = 84.482 − 15.875𝑥 + 1.7679𝑥².
Predict the value of glucose concentration for a fermentation time of six days, and compute
the corresponding residual.
c. Using SSE=61.77, what proportion of observed variation can be attributed to the quadratic
regression relationship?
d. The 𝑛 = 8 standardized residuals based on the quadratic model are 1.91, −1.95, −0.25,
0.58, 0.90, 0.04, −0.66, 0.20. Construct a plot of the standardized residuals versus x. Does
the plot exhibit any troublesome features?
e. The estimated standard deviation of the estimator of 𝜇𝑌,6 is 1.69. Compute a 95% confidence interval for
𝜇𝑌,6.
f. Compute a 95% prediction interval for a glucose concentration observation made after 6
days of fermentation time.
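The arithmetic in parts (b), (c), (e), and (f) can be checked with a short script such as the one below. It is a sketch under stated assumptions: the coefficients, SSE, and the standard deviation 1.69 are taken from the problem statement, the critical value 2.571 is the t-table entry for a two-sided 95% interval with 𝑛 − 3 = 5 degrees of freedom, and the variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7, 8]             # fermentation time (days)
y = [74, 54, 52, 51, 52, 53, 58, 71]     # glucose concentration (g/L)


def fitted(x_val):
    """Estimated quadratic regression function from part (b)."""
    return 84.482 - 15.875 * x_val + 1.7679 * x_val ** 2


# Part (b): prediction and residual at x = 6.
y_hat_6 = fitted(6)
residual_6 = y[x.index(6)] - y_hat_6

# Part (c): proportion of variation explained, R^2 = 1 - SSE/SST.
sse = 61.77
y_bar = sum(y) / len(y)
sst = sum((v - y_bar) ** 2 for v in y)
r_squared = 1 - sse / sst

# Parts (e) and (f): intervals at x = 6.
t_crit = 2.571                 # t table value, 0.025 upper tail, 5 df (assumed)
s_mu_hat = 1.69                # given in part (e)
s_squared = sse / (len(y) - 3)  # error variance estimate, n - 3 df
ci_half = t_crit * s_mu_hat
pi_half = t_crit * math.sqrt(s_squared + s_mu_hat ** 2)

print(f"prediction at x = 6: {y_hat_6:.4f}, residual: {residual_6:.4f}")
print(f"R^2 = {r_squared:.4f}")
print(f"95% CI: ({y_hat_6 - ci_half:.2f}, {y_hat_6 + ci_half:.2f})")
print(f"95% PI: ({y_hat_6 - pi_half:.2f}, {y_hat_6 + pi_half:.2f})")
```

The prediction interval is wider than the confidence interval because it adds the variance of a single new observation to the variance of the estimated mean.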
7. Let y=sales at a fast-food outlet (1000s of $), 𝑥1 =number of competing outlets within a 1-mile
radius, 𝑥2 =population within a 1-mile radius (1000s of people), and 𝑥3 be an indicator variable
that equals 1 if the outlet has a drive-up window and 0 otherwise. Suppose that the true
regression model is 𝑌 = 10.00 − 1.2𝑥1 + 6.8𝑥2 + 15.3𝑥3 + 𝜖.
a. What is the mean value of sales when the number of competing outlets is 2, there are 8000
people within a 1-mile radius, and the outlet has a drive-up window?
b. What is the mean value of sales for an outlet without a drive-up window, that has three
competing outlets and 5000 people within a 1-mile radius?
c. Interpret 𝛽3 .
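Parts (a) and (b) amount to evaluating the regression function with 𝜖 replaced by its mean of zero. A minimal Python sketch follows; the function name mean_sales is an illustrative assumption. Note that 𝑥2 is measured in thousands of people, so 8000 people corresponds to 𝑥2 = 8.

```python
def mean_sales(x1, x2, x3):
    """Mean sales (1000s of $) under the stated true regression model.

    x1: number of competing outlets within a 1-mile radius
    x2: population within a 1-mile radius, in 1000s of people
    x3: 1 if the outlet has a drive-up window, 0 otherwise
    """
    return 10.00 - 1.2 * x1 + 6.8 * x2 + 15.3 * x3


part_a = mean_sales(x1=2, x2=8, x3=1)   # 8000 people -> x2 = 8
part_b = mean_sales(x1=3, x2=5, x3=0)   # 5000 people -> x2 = 5

print(f"(a) mean sales: {part_a:.1f} thousand dollars")
print(f"(b) mean sales: {part_b:.1f} thousand dollars")
```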
8. What conclusion would be appropriate for an upper-tailed chi-squared test in each of the
following situations?
a. 𝛼 = 0.05, 𝑑𝑓 = 4, 𝜒² = 12.25
b. 𝛼 = 0.01, 𝑑𝑓 = 3, 𝜒² = 8.54
c. 𝛼 = 0.10, 𝑑𝑓 = 2, 𝜒² = 4.36
d. 𝛼 = 0.01, 𝑑𝑓 = 6, 𝜒² = 10.20
9. Say as much as you can about the P-value for an upper-tailed chi-squared test in each of the
following situations.
a. 𝜒² = 7.5, 𝑑𝑓 = 2
b. 𝜒² = 13.0, 𝑑𝑓 = 6
c. 𝜒² = 18.0, 𝑑𝑓 = 9
d. 𝜒² = 21.3, 𝑑𝑓 = 5
e. 𝜒² = 5.0, 𝑘 = 4
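Table-based bounds on a P-value can be cross-checked numerically. The sketch below computes the upper-tail area of a chi-squared distribution with integer degrees of freedom using only the Python standard library. The function name chi2_upper_tail is an assumption, and treating part (e) as 𝑑𝑓 = 𝑘 − 1 = 3 assumes 𝑘 counts categories with fully specified probabilities.

```python
import math


def chi2_upper_tail(stat, df):
    """P(X > stat) for X ~ chi-squared with integer df >= 1."""
    if stat <= 0:
        return 1.0
    z = stat / 2.0
    # Start from a closed form of the regularized upper incomplete gamma Q(a, z).
    if df % 2 == 0:
        a = 1.0
        tail = math.exp(-z)                # Q(1, z)
    else:
        a = 0.5
        tail = math.erfc(math.sqrt(z))     # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return tail


# Cases (a) through (d) of Problem 9; case (e) uses df = k - 1 = 3 (assumed).
cases = [(7.5, 2), (13.0, 6), (18.0, 9), (21.3, 5), (5.0, 3)]
for stat, df in cases:
    print(f"chi-squared = {stat}, df = {df}: P-value = {chi2_upper_tail(stat, df):.4f}")
```

Comparing each printed value with the bracketing columns of the chi-squared table is a useful check that the table was read correctly.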
10. Criminologists have long debated whether there is a relationship between weather conditions
and the incidence of violent crime. An article classified 1361 homicides according to season,
resulting in the accompanying data. Test the null hypothesis of equal proportions using 𝛼 =
0.01 by using the chi-squared table to say as much as possible about the P-value.
Winter   Spring   Summer   Fall
328      334      372      327
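The test statistic for equal proportions can be computed as in the following sketch (Python standard library only; variable names are illustrative). The critical value 11.34 is the chi-squared table entry for 𝛼 = 0.01 with 3 degrees of freedom and is entered by hand here as an assumption about the table being used.

```python
observed = {"Winter": 328, "Spring": 334, "Summer": 372, "Fall": 327}

n = sum(observed.values())
k = len(observed)

# Under H0 every season has probability 1/4, so each expected count is n/4.
expected = n / k

# Chi-squared goodness-of-fit statistic: sum of (observed - expected)^2 / expected.
stat = sum((count - expected) ** 2 / expected for count in observed.values())
df = k - 1

critical_value = 11.34          # chi-squared table, alpha = 0.01, df = 3 (assumed)
reject = stat >= critical_value

print(f"n = {n}, expected count per season = {expected}")
print(f"chi-squared = {stat:.3f} on {df} df; reject H0 at 0.01: {reject}")
```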
11. Consider a large population of families in which each family has exactly three children. If the
genders of the three children in any family are independent of one another, then the number of
male children in a randomly selected family will have a binomial distribution based on three
trials.
a. Suppose a random sample of 160 families yields the following results. Test the relevant
hypotheses.
Number of Male Children   0    1    2    3
Frequency                 14   66   64   16
b. Suppose a random sample of families in a nonhuman population resulted in the observed
frequencies shown in the table below. Would the chi-squared test be based on the same
number of degrees of freedom? Conduct the test.
Number of Male Children   0    1    2    3
Frequency                 15   20   12   3
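For part (a), one way to set up the goodness-of-fit computation is sketched below, assuming the binomial success probability 𝑝 is unknown and is estimated from the sample (so one degree of freedom is lost for the estimate). The function name binomial_gof is illustrative, and the sketch does not check whether expected counts are large enough, which matters for part (b).

```python
import math


def binomial_gof(freq, trials=3):
    """Chi-squared fit of counts freq[k] (k = 0..trials) to Bin(trials, p).

    p is estimated as (total number of males) / (total number of children).
    Returns (p_hat, expected counts, chi-squared statistic).
    """
    n = sum(freq)
    p_hat = sum(k * f for k, f in enumerate(freq)) / (trials * n)
    expected = [
        n * math.comb(trials, k) * p_hat ** k * (1 - p_hat) ** (trials - k)
        for k in range(trials + 1)
    ]
    stat = sum((o - e) ** 2 / e for o, e in zip(freq, expected))
    return p_hat, expected, stat


# Part (a): 160 families, number of male children 0, 1, 2, 3.
p_hat, expected, stat = binomial_gof([14, 66, 64, 16])
df = 4 - 1 - 1   # categories - 1 - (one estimated parameter)

print(f"estimated p = {p_hat:.4f}")
print("expected counts:", [round(e, 2) for e in expected])
print(f"chi-squared = {stat:.3f} on {df} df")
```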
12. Each individual in a random sample of high school students and college students was cross-classified with respect to both political views and marijuana usage, resulting in the data
displayed in the accompanying two-way table. Does the data support the hypothesis that
political views and marijuana usage level are independent within the population? Test the
appropriate hypotheses using the level of significance 0.01.
                    Usage Level
Political Views     Never   Rarely   Frequently
Liberal             479     173      119
Conservative        214     47       15
Other               172     45       85
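The expected counts and test statistic for a two-way table can be computed as in this sketch (Python standard library only; names are illustrative). The critical value 13.28 is the chi-squared table entry for 𝛼 = 0.01 with 4 degrees of freedom, entered by hand as an assumption about the table being used. The same layout works for any two-way table of counts.

```python
# Rows: Liberal, Conservative, Other. Columns: Never, Rarely, Frequently.
table = [
    [479, 173, 119],
    [214, 47, 15],
    [172, 45, 85],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Under independence: expected = (row total) * (column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

stat = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(table, expected)
    for obs, exp in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)

critical_value = 13.28          # chi-squared table, alpha = 0.01, df = 4 (assumed)
reject = stat >= critical_value

print(f"chi-squared = {stat:.2f} on {df} df; reject H0 at 0.01: {reject}")
```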
13. The accompanying data gives the degree of spirituality for samples of natural and social scientists at
research universities as well as for a sample of non-academics with graduate degrees.
                   Degree of Spirituality
                   Very   Moderate   Slightly   Not at all
Natural Science    56     162        198        211
Social Science     56     223        243        239
Graduate Degree    109    164        74         28
a. Is there substantial evidence for concluding that the three types of individuals are not
homogeneous with respect to their degree of spirituality? State and test the appropriate
hypotheses.
b. Considering just the natural scientists and social scientists, is there evidence for nonhomogeneity? Base your conclusion on a P-value.