AP Statistics

Inference for Means
Chapters 23-25
REVIEW
Name________________________________
Class period_____
1. The Environmental Protection Agency (EPA) estimated that the 1991 G-car obtains a mean of 35 miles
per gallon on the highway, and the company that manufactures the car claims that that is incorrect. To
support its assertion, the company randomly selects thirty-six 1991 G-cars and records the mileage
obtained for each car over a driving course similar to that used by the EPA. The company found a mean
of 36.8 miles per gallon with a standard deviation of 6 miles per gallon. Do the data provide sufficient
evidence to support the manufacturer’s claim that the mean miles per gallon is different from 35? Use a
5% significance level (α = .05).
2. A restaurant franchise company has a policy of opening new restaurants in those areas that have a
mean household income of at least $35,000 per year. The company is currently considering an area to
open a new restaurant. The research department took a sample of 20 households from this area and found
that the mean income of these households is $34,124 per year with a standard deviation of $3400. The
income distribution in this area is approximately normal. Using the 1% significance level, would you
conclude that the company should not open a restaurant in this area? (α = .01)
3. Last year the average grade on the inference test was 82. A random sample of this year’s student
scores on the same test produced the results below. Test at the .02 significance level to see whether
the scores have changed significantly from last year.
83 87 98 92 91 100 88 89 93 81 94 96
4. Which would give a wider confidence interval in each case below?
a) 95% confidence with n = 25 or 95% confidence with n = 75?
b) 99% confidence with n = 50 or 90% confidence with n = 50?
5. Two different firms design their own version of the same aptitude test, and a psychologist administers
both versions to randomly selected adult subjects with the results given below. At the 0.05 level of
significance, test the claim that both versions produce about the same results.
Subject   Test 1   Test 2
A         106      99
B         115      112
C         101      104
D         124      113
E         123      101
F         96       88
G         101      110
H         105      109
I         110      109
6. The accompanying data on water salinity (%) was obtained during a study of seasonal influence of
Amazon River water on biological production in the western tropical Atlantic. Let μ1 denote the true
average salinity level of summer and μ2 denote the true average salinity level of winter. Test the claim
that the salinity level is higher in the winter at the 2% significance level.
Period   Sample size   Sample mean   Sample standard deviation
Summer   51            33.4          0.428
Winter   54            35.39         0.294
7. A botanist measures the heights of 40 random seedlings and obtains a mean and standard deviation of
58.4 cm and 6.2 cm, respectively. Construct the 98% confidence interval for the population mean.
8. If 60 persons on a diet had weight losses with x̄ = 22.9 pounds and s = 4.2 pounds, construct a 95%
confidence interval for the true mean weight loss of persons on this diet.
9. Suppose that the yield of a certain fruit variety has variance σ² = 900 lb². A random sample of 100
trees this season has a mean yield of x̄ = 275 lb per tree. Give a 95% confidence interval for the mean
yield of this variety this season.
10. Spreading fires tend to do more damage to hardwood forests than do spot fires. Data on the percent
of trees scarred by fire that appeared in the paper “Natural Disturbance Regimes in Hemlock-Hardwood
Forests of the Upper Great Lakes Region” can be used to quantify the difference in the extent of damage.
Test the claim that spreading fires cause more damage.
Use α = 0.05.
Percent Scarred
Spreading fires 21.9 26.7 9.2 6.7 29.2 26.7 6.7 8.3 18.4 4.9
Spot fires 1.6 4.6 1.1 5.6 21.1 11.9 4.8 14.7 7.5
11. Give a 90% confidence interval for μ1 − μ2 using the information in problem 10 and explain
what it means.
12. As you may know, there has been bad blood between the Montague family and the Capulet family for
a good while. In this modern day, the appropriate resolution would be by psychological tests. In the test
“propensity to fall in love” the mean of the 6 Montagues was 54 and the mean of the 10 Capulets was 64.
(Italian norms show a national average of 100.) When a statistician compared the families with a t-test, a
value of 1.75 was obtained. If you adopt an α level of .10 (two-tailed test), you should conclude that the
Capulets are
A) significantly more loving than the Montagues.
B) significantly less loving than the Montagues.
C) not significantly different from the Montagues.
D) not yet comparable; additional information is needed.
13. We want to estimate the mean energy consumption level for a home in one region. We want to be
98% confident that our sample mean is within 20 kwh of the true population mean. Past data strongly
suggests that the population standard deviation is 120 kwh. How large must our sample be?
14. The logic of hypothesis testing involves assuming
A) that two populations have equal means and then using sample data to conclude that they are
probably equal.
B) that two populations have unequal means and then using sample data to conclude that they are
probably unequal.
C) that two populations have equal means and then using sample data to conclude that they are
probably unequal.
D) that two populations have unequal means and then using sample data to conclude that they are
probably equal.
15. A manager at a manufacturing plant tests a shipment of parts to see if the diameter of the parts meets
the specifications of 5 mm. He tests 50 randomly selected parts and finds the mean diameter to be 5.93
mm. Describe a Type I and Type II error for this problem.
16. When completing a test of significance, what does the P-value mean?
17. A 90% confidence interval for the mean score of statistics students on a particular unit is
(79.3 , 85.1). If the teacher claims that the mean score is 86 and runs a one tailed significance test at the
5% level, what can you expect?
18. What is t* for a 70% confidence interval with a sample size of 25?
19. Fruit Loops is labeled to contain 14 ounces of cereal in each box. To test this, 32 randomly selected
boxes are opened and the cereal is weighed. Ho: μ = 14 is tested against Ha: μ ≠ 14 at the 2% level.
a) What is the corresponding confidence level for this test?
b) If the corresponding confidence interval is (12.9 , 14.1), would you reject or
fail to reject Ho? Why?
20. A confidence interval is found to be (16.9, 25.5). What is the sample mean?
21. If a result is statistically significant (reject Ho) at the 10% level, is it always significant at the 5%
level? Is it significant at the 20% level?
22. What is the margin of error?
23. The mean time it takes an egg to hard boil is approximately 9 minutes with a certain campstove. A
random sample of 15 eggs was tested one at a time with the same amount of water. At the 10%
significance level, what is the critical value if
a) Ho: μ = 9 and Ha: μ > 9
b) Ho: μ = 9 and Ha: μ ≠ 9
c) Ho: μ = 9 and Ha: μ < 9
24. A 98% confidence interval about the difference between sample means is (3, 6). For a one tailed t-test
at the .01 significance level we would
A) fail to reject the null hypothesis
B) reject the null hypothesis
C) more information is needed
25. a) A one-tailed t-test gives a test statistic of t = 2.234. If df = 18, between what two probabilities does
the P-value fall in the table? b) Answer the same question for a two-tailed test with the same statistic.
26. A 98% confidence interval for the difference of two means is (−7, 2). What could you conclude?
A) The first group has a much lower mean than the second group.
B) There is a significant difference in the means of the two groups.
C) There is no significant difference in the means of the two groups.
D) There is a 2% chance that the two means are the same.
27. A new variety of apple is intended to resist cedar apple rust which tends to stunt growth. A
horticulturist grows 50 trees of the new strain and 100 of the parent strain under the same field conditions.
At the end of three years, the height of each tree is measured. The new strain has a mean of 15.2 feet with
a standard deviation of 1.4 feet while the parent strain has a mean height of 14.9 feet with a standard
deviation of 1.8 feet. Give a 99% confidence interval for the difference between the heights of new strain
versus parent strain trees. State what the confidence interval means.
For questions 28-30, name the appropriate test. DO NOT COMPLETE THE TEST.
28. An environmentalist is interested in whether or not education would help students properly use
recycling bins. He randomly selects 10 days before his presentation and 10 days after his presentation to
count the number of non-plastic items in the plastic bottle recycling container at WWHS with the results
listed in the table below. Did the educational presentation work?
Before   12   15   9    12   15   23   14   8    13   14
After    4    3    5    0    1    1    6    3    0    0
29. A nature preserve is interested in the effect hikers walking dogs have on the wildlife. An
environmentalist randomly selects 23 dogs from the Humane Society and takes each for a walk in the
preserve, noting whether or not the dog barks or startles the wildlife. He finds that 17 of the dogs tend to
startle local wildlife. Does this contradict the dog lover’s claim that only 30% of dogs chase wildlife?
30. The environmentalist above decides to see if dogs can be trained to walk without barking at the birds
and squirrels. He takes 10 randomly selected dogs for a walk in the nature preserve and counts the
number of times each dog barks at the wildlife. Then the dogs are sent for special training for three
weeks. After the training the dogs again go for a walk in the preserve where the number of times they
bark at wildlife is recorded. Did the training help deter the barking at wildlife?
Dog      A    B    C    D    E    F    G    H    I    J
Before   12   15   9    12   15   23   14   8    13   14
After    4    3    5    0    1    1    6    3    0    0
**********************Answers**********************************
1. population characteristic: μ = mean miles per gallon; Ho: μ = 35; Ha: μ ≠ 35;
conditions: random sample of cars given; independence is reasonable since one car’s mileage per gallon
doesn’t affect another car’s mpg. Large enough sample: n = 36 ≥ 30; t = (36.8 − 35)/(6/√36) = 1.8, df = 35;
p-value = .0805. Fail to reject Ho since the p-value > α. There is not sufficient evidence to support the
claim that a mean of 35 miles per gallon is incorrect. The G-cars may get 35 mpg.
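The test statistic in answer 1 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_sample_t` is chosen here for illustration and is not part of the worksheet. It computes the statistic only, so the p-value still comes from a t table or calculator.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for a one-sample test of Ho: mu = mu0."""
    standard_error = s / math.sqrt(n)  # estimated standard deviation of x-bar
    return (xbar - mu0) / standard_error

# Problem 1: sample mean 36.8, hypothesized mean 35, s = 6, n = 36
t = one_sample_t(36.8, 35, 6, 36)
print(round(t, 3), "df =", 36 - 1)
```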
2. population characteristic: μ = mean household income; Ho: μ = 35000; Ha: μ < 35000
conditions: random sample of households given; independence is reasonable since one household’s
income doesn’t affect another household. Given normal population income distribution so an
approximately normal sampling distribution is reasonable; t = (34124 − 35000)/(3400/√20) = −1.15, df = 19,
p-value = .132. Fail to reject Ho because the p-value > α. The sample data is not sufficient to claim that
the mean income is less than $35,000. A restaurant in this area may be successful.
3. population characteristic: μ = average grade on inference test; Ho: μ = 82; Ha: μ ≠ 82;
conditions: random sample of student scores given, Independence is reasonable since one student’s score
shouldn’t affect another student’s score. It appears reasonable to assume an approximately normal
sampling distribution since the boxplot of sample scores is fairly symmetric.
[Boxplot of sample scores; axis from 80 to 100]
t = 5.435; df = 11; p-value = .0002055; reject Ho since the p-value < α. There is enough
evidence to show that the test scores are indeed different this year. Students appear to be scoring
significantly higher.
4. a) 95% confidence with n = 25
b) 99% confidence with n = 50
5. μd = mean of the differences between subjects’ scores on Test 1 and Test 2; Ho: μd = 0; Ha: μd ≠ 0;
d̄ = 4.0, sd = 9.287. Assumptions: random sample of subjects given; aptitude scores are dependent on
subject. The boxplot appears symmetric enough to assume an approximately normal sampling
distribution of differences.
[Boxplot of differences; axis from −10 to 20]
Test statistic: t = (4 − 0)/(9.287/√9) = 1.292; df = 8; P-value = 0.232
Fail to reject Ho since p-value > α. There is not enough evidence to claim the two versions are
significantly different. They could produce about the same results.
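The paired calculation in answer 5 can be reproduced from the raw scores. The sketch below is one way to do it with Python's standard library; the variable names are illustrative, and the differences are taken as test 1 minus test 2, which matches the mean difference of 4.0 in the answer.

```python
import math
import statistics

test1 = [106, 115, 101, 124, 123, 96, 101, 105, 110]
test2 = [99, 112, 104, 113, 101, 88, 110, 109, 109]

# Paired design: analyze the per-subject differences (test 1 minus test 2)
diffs = [a - b for a, b in zip(test1, test2)]

d_bar = statistics.mean(diffs)   # mean difference
s_d = statistics.stdev(diffs)    # sample standard deviation of the differences
n = len(diffs)
t = (d_bar - 0) / (s_d / math.sqrt(n))

print(d_bar, round(s_d, 3), round(t, 3), "df =", n - 1)
```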
6. μ1, μ2 defined in problem; Ho: μ1 = μ2; Ha: μ1 < μ2; Conditions: random water samples given;
independence is reasonable since there are certainly more than 510 (51×10) summer samples and 540
(54×10) winter water samples available. Approximately normal sampling distribution of (x̄1 − x̄2) since both
samples are large enough; n1 = 51 and n2 = 54 are both > 30.
Test statistic: t = (33.4 − 35.39)/√(.428²/51 + .294²/54)
P-value ≈ 0; reject Ho since p-value < α. The data strongly supports the claim that the salinity level is
higher in the winter.
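Answer 6 leaves the value of the test statistic unevaluated. As a rough check, the unpooled two-sample formula can be evaluated from the summary statistics in the problem; the function name `two_sample_t` below is hypothetical, and the result comes out to about −27.6, which is why the P-value is essentially zero.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Unpooled two-sample t statistic for Ho: mu1 = mu2."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2) / standard_error

# Problem 6: summer is group 1, winter is group 2
t = two_sample_t(33.4, 0.428, 51, 35.39, 0.294, 54)
print(round(t, 2))
```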
7. one sample t-interval ; conditions: Random seedlings given; Independence is reasonable since one
seedling’s height doesn’t affect another’s height. large enough sample since n = 40 > 30.
(56.022 , 60.778) We are 98% confident the mean seedling height is between 56.022 cm and 60.778 cm.
8. one sample t-interval; conditions: random selection not addressed in problem; large enough sample
since n = 60 > 30. 21.815 < μ < 23.985 (Note: some texts write confidence intervals like this)
We are 95% confident the mean weight loss for persons on this diet is between 21.8 and 24 pounds.
9. one sample t-interval; conditions: Random sample of trees given; large enough sample since
n = 100 > 30; independent tree yields seem reasonable. (269.05 , 280.95) We are 95% confident the
true mean yield for this variety of tree is between 269.05 and 280.95 pounds.
10. μ1 = mean percent of scarred trees from spreading fires; μ2 = mean percent of scarred trees from spot
fires; Ho: μ1 = μ2; Ha: μ1 > μ2; two-sample t-test. Conditions: random sample of Upper Great Lakes forest
fires not addressed; we’ll assume that spreading fires and spot fires are independent. Approximately
normal sampling distributions as shown by boxplots (show both graphs on the test). Critical value: 1.740;
Test statistic: t = 2.06; P-value = 0.028; reject Ho since p-value < α.
The data supports the claim that spreading fires cause more damage than spot fires.
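The statistic in answer 10 can be recomputed directly from the percent-scarred data. The following is a sketch under one assumption: it uses the unpooled standard error with the Welch-Satterthwaite degrees of freedom, and the answer key does not say which df convention it used, so the df printed here may differ from the one behind the quoted critical value.

```python
import math
import statistics

spreading = [21.9, 26.7, 9.2, 6.7, 29.2, 26.7, 6.7, 8.3, 18.4, 4.9]
spot = [1.6, 4.6, 1.1, 5.6, 21.1, 11.9, 4.8, 14.7, 7.5]

# Each sample's contribution to the squared standard error: s^2 / n
v1 = statistics.variance(spreading) / len(spreading)
v2 = statistics.variance(spot) / len(spot)

diff = statistics.mean(spreading) - statistics.mean(spot)
t_stat = diff / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom (an assumption here)
df = (v1 + v2) ** 2 / (v1**2 / (len(spreading) - 1) + v2**2 / (len(spot) - 1))

print(round(diff, 2), round(t_stat, 2), round(df, 1))
```

The difference in sample means, 7.77, is also the center of the interval given in answer 11.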
11. Two-sample t-interval; (1.183, 14.357) We are 90% confident that spreading fires cause between
1.183% and 14.357% more of the forest to be damaged. (Conditions given in #10)
12. C
13. n = 195
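The sample size in answer 13 follows from solving z*·σ/√n ≤ E for n and rounding up. A minimal sketch, assuming the population standard deviation is treated as known so that a z critical value applies; the function name `sample_size_for_mean` is illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z* * sigma / sqrt(n) <= margin."""
    tail = (1 - confidence) / 2
    z_star = NormalDist().inv_cdf(1 - tail)  # about 2.326 for 98% confidence
    return math.ceil((z_star * sigma / margin) ** 2)

n = sample_size_for_mean(sigma=120, margin=20, confidence=0.98)
print(n)
```

Rounding up rather than to the nearest integer guarantees the margin of error does not exceed 20 kwh.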
14. C
15. Type I error: He states that the parts do not meet the specifications when in fact they do.
Type II error: He states that the parts do meet the specifications when in fact they do not.
16. It is the probability, assuming the null hypothesis is true, of getting sample results at least as extreme as those observed by chance alone.
17. Since 86 is not in the corresponding interval, we can expect that the true mean is not as high as 86 and
we would reject her claim of a higher mean. The average grade appears to be lower than 86.
18. 1.059
19. a) 98% since the test is 2-tailed
b) fail to reject Ho since 14 is included in the interval.
20. 21.2
21. If α = .10 then the test may not be significant at the α = .05 level since .05 < .10, so more proof is
needed. But the results will always be significant at the α = .20 level since .20 > .10.
22. The margin of error occurs since we are taking a sample to represent an entire population. Different
samples produce different results. The margin of error tells us that the population is likely to be not more
than that far away from our sample statistic. The formula for margin of error is critical value (t*) times
the standard error.
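Because a confidence interval for a mean is the point estimate plus or minus the margin of error, both can be recovered from the endpoints alone. The small sketch below applies this to the interval from problem 20; the helper name `center_and_margin` is made up for illustration.

```python
def center_and_margin(lower, upper):
    """Recover the point estimate and margin of error from interval endpoints."""
    center = (lower + upper) / 2   # the sample statistic sits at the midpoint
    margin = (upper - lower) / 2   # half the interval width
    return center, margin

center, margin = center_and_margin(16.9, 25.5)
print(round(center, 1), round(margin, 1))
```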
23. a) 1.345
b) −1.761, 1.761
c) −1.345
24. B
25. a) .01 < p < .02
b) .02 < p < .04
26. C
27. 2 sample t-interval; conditions: Random assignment to treatments not discussed in problem.
Independence seems reasonable since the height of one tree probably doesn’t affect the height of another
tree and there’s potentially an infinite number of each tree which satisfies the 10% rule of thumb. Large
enough samples since n1 = 50 > 30 and n2 = 100 > 30. (−.4002, 1.0) We are 99% confident the mean
height difference between the two types of trees is between −.4 feet and 1 foot. Since zero is part of the
interval, there does not appear to be a significant difference between the two strains of apple trees.
28. Two sample t-test
29. one proportion z-test
30. paired t-test since the data are dependent on the dog