Statistics 151 Practice problems for final exam

Instructions: In questions 1-11, write your answers in the space provided. If you need extra space,
you may use the back of the page. To receive full credit you must show all of your calculations. In
testing hypotheses, you must state your H0, Ha, test statistic, p-value/critical value, and conclusion.
In multiple-choice questions 12-22, circle the correct answers.
1. The diameters of apples grown in an orchard are normally distributed with mean 3.4 and
standard deviation 0.85 inches.
a. What proportion of the apples have diameters greater than 4 inches?
Answer: X is N(3.4, 0.85). So, P(X > 4) = P(Z > (4 − 3.4)/0.85) = P(Z > 0.71) = 0.2389 (about 24%).
b. 95% of the apples will have a diameter greater than what value?
Answer: P(X > k) = 0.95 or P(X ≤ k) = 0.05 ⇒ k = 3.4 − 1.645 × 0.85 = 2.00 inches.
c. Suppose that we take a random sample of three apples from this orchard. What is the
probability that the average diameter of the three apples is greater than 4 inches?
Answer: P(X̄ > 4) = P(Z > (4 − 3.4) / (0.85/√3)) = P(Z > 1.22) = 0.1112.
d. How large a sample from this population should be taken if one wants to be 99% sure
that the sampling error (SE) does not exceed 0.25 inches?
Answer: n ≥ [z* × σ / SE]² = [2.575 × 0.85 / 0.25]² = 76.65 ⇒ n = 77.
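The four parts of question 1 can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard library; the variable names are illustrative and not part of the exam. Because it uses unrounded z-values, the results differ slightly in the third decimal from the table-based answers above.

```python
from math import ceil, sqrt
from statistics import NormalDist

mu, sigma = 3.4, 0.85            # mean and sd of apple diameters (inches), from the problem
X = NormalDist(mu, sigma)

# (a) proportion of apples with diameter greater than 4 inches
p_a = 1 - X.cdf(4)

# (b) value k with P(X > k) = 0.95, i.e. the 5th percentile of X
k = X.inv_cdf(0.05)

# (c) the mean of n = 3 apples has standard deviation sigma / sqrt(3)
p_c = 1 - NormalDist(mu, sigma / sqrt(3)).cdf(4)

# (d) smallest n whose 99% margin of error is at most 0.25 inches
z_star = NormalDist().inv_cdf(0.995)
n = ceil((z_star * sigma / 0.25) ** 2)

print(round(p_a, 4), round(k, 2), round(p_c, 4), n)
```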
2. A poll is conducted to estimate support for a city Mayor following a recent controversy. In a
simple random sample of 600 voters, 345 said that they support the Mayor. Estimate the
population proportion of voters that support the Mayor and construct a 90% confidence
interval.
Answer: The sample proportion in support of the Mayor is p̂ = 345/600=0.575. This is
an estimate for p, population proportion. The 90% confidence interval for p is
p̂ ± zα/2 √(p̂(1 − p̂)/n) = 0.575 ± 1.645 √(0.575(0.425)/600)
= 0.575 ± 0.033 = (0.542, 0.608)
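As a check on the interval for the Mayor's support, here is a small sketch of the large-sample confidence interval for a proportion. The helper name `proportion_ci` is hypothetical and introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # two-sided critical value, e.g. about 1.645 for 90% confidence
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)

p_hat, margin, (low, high) = proportion_ci(345, 600, 0.90)
print(f"{p_hat:.3f} +/- {margin:.3f} gives ({low:.3f}, {high:.3f})")
```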
3. A car dealer specializing in Corvettes enlarged his facilities and offered a number of models for
sale during the open house. From his data on price (y in $1000) and age (x in years) of
Corvettes, the following data were obtained:
x      1     2     4     5     6     6    10    11    11    12    12    12    12    13    15
y   39.9  32.0  25.0  20.0  16.0  20.0  13.0  13.7  11.0  12.0  20.0   9.0   9.0  12.5   7.0
x̄ = 8.8, ȳ = 17.34, SSxy = ∑(x − x̄)(y − ȳ) = −499.78, SSxx = ∑(x − x̄)² = 268.4, ∑(y − ȳ)² = 1175.816,
SSE = ∑(y − ŷ)² = 245.19
a. Calculate the Least-squares regression line of y on x.
Answer: β̂1 = SSxy / SSxx = −499.78 / 268.4 = −1.862, β̂0 = ȳ − β̂1 x̄ = 17.34 − (−1.862) × 8.8 = 33.73
∴ The regression equation is: Price = 33.73 − 1.862 × Age
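The slope and intercept can also be recomputed directly from the 15 data pairs rather than from the summary sums. This is a sketch for checking the arithmetic; the list names `age` and `price` are illustrative.

```python
from statistics import mean

# age in years (x) and price in $1000 (y), from the table in question 3
age = [1, 2, 4, 5, 6, 6, 10, 11, 11, 12, 12, 12, 12, 13, 15]
price = [39.9, 32.0, 25.0, 20.0, 16.0, 20.0, 13.0, 13.7,
         11.0, 12.0, 20.0, 9.0, 9.0, 12.5, 7.0]

x_bar, y_bar = mean(age), mean(price)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, price))
ss_xx = sum((x - x_bar) ** 2 for x in age)

slope = ss_xy / ss_xx                 # least-squares estimate of beta1
intercept = y_bar - slope * x_bar     # least-squares estimate of beta0

print(round(slope, 3), round(intercept, 2))
```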
b. Is there sufficient evidence to indicate a negative linear relationship between selling price
and age? Test appropriate hypotheses using α = 0.05.
Answer: H0: β1 = 0 vs Ha: β1 < 0
σ̂ = s = √(∑(y − ŷ)² / (n − 2)) = √(245.19 / 13) = 4.3429, s(β̂1) = s / √SSxx
Test statistic: t = (β̂1 − 0) / s(β̂1) = −1.862 / (4.3429 / √268.4) = −7.02, df = 13;
P-value < .0005;
Conclusion: There is a negative linear relationship between selling price and age.
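The test statistic for the slope follows from the summary values given in the problem. The sketch below only reproduces the arithmetic; the P-value itself would still come from a t table with 13 degrees of freedom.

```python
from math import sqrt

n = 15
ss_xy, ss_xx, sse = -499.78, 268.4, 245.19   # summary values stated in question 3

slope = ss_xy / ss_xx
s = sqrt(sse / (n - 2))            # residual standard deviation
se_slope = s / sqrt(ss_xx)         # standard error of the slope estimate
t_stat = (slope - 0) / se_slope    # test statistic for H0: beta1 = 0
df = n - 2

print(round(s, 4), round(se_slope, 4), round(t_stat, 2), df)
```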
c. Predict with 95% confidence the resale value of a 5-year old Corvette to be listed in the next
week’s paper.
Answer: 95% Prediction interval is:
ŷ ± t* × s × √(1 + 1/n + (x* − x̄)² / ∑(x − x̄)²) = 24.42 ± (2.16) × (4.3429) × √(1 + 1/15 + (5 − 8.8)²/268.4)
= 24.42 ± 9.93 = (14.49, 34.35) = $14,490 to $34,350.
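The prediction interval can be reproduced with a short function. This is a sketch under the values used in the answer (t* = 2.16 for 13 degrees of freedom is taken from the answer, not computed); the function name `prediction_interval` is hypothetical.

```python
from math import sqrt

def prediction_interval(x_star, intercept, slope, s, n, x_bar, ss_xx, t_star):
    """Point prediction and margin for one new observation at x_star."""
    y_hat = intercept + slope * x_star
    margin = t_star * s * sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / ss_xx)
    return y_hat, margin

y_hat, margin = prediction_interval(
    x_star=5, intercept=33.73, slope=-1.862,
    s=4.3429, n=15, x_bar=8.8, ss_xx=268.4, t_star=2.16,
)
print(round(y_hat, 2), round(margin, 2))   # price is in $1000
```

The leading 1 under the square root is what separates a prediction interval for a single car from a confidence interval for the mean price at that age.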
d. Is it justifiable to predict the selling price of a 20-year old Corvette from the fitted regression
line? Give reasons for your answer.
Answer: No, we cannot predict beyond the range of the data (ages 1 to 15 years). The linear relation cannot
hold for cars that old: the fitted line gives 33.73 − 1.862 × 20 = −3.51, and we know the selling price never goes below zero.
4. Suppose the following display represents 500 automobile accidents that occurred in a large city:
                     No fatalities    At least one fatality
Involved alcohol           68                 142
No alcohol                194                  96
Determine whether there is a relationship between alcohol and fatality. Use α = 0.01 in your
test.
Answer:
H0 : Fatality is not related to use of alcohol vs Ha: Fatality is related to use of alcohol.
Under the null hypothesis of independence, the expected frequencies are:
E11 = 110.04, E12 = 99.96, E21 = 151.96, E22 = 138.04
Test statistic: χ² = ∑ (O − E)² / E = 58.175, df = 1,
P-value = P(χ² ≥ 58.175) < 0.005
Conclusion: There is strong evidence to conclude that there is an association between alcohol
use and fatality.
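The expected counts and the chi-square statistic can be recomputed from the observed table. The sketch below assumes rows are alcohol involvement and columns are fatality status. The P-value line relies on a fact that holds only for df = 1: a chi-square variable with one degree of freedom is the square of a standard normal.

```python
from math import erfc, sqrt

observed = [[68, 142],    # involved alcohol: no fatalities, at least one fatality
            [194, 96]]    # no alcohol:       no fatalities, at least one fatality

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# expected count under independence = row total * column total / grand total
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))

# df = 1 only: P(chi2 >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi_sq / 2))

print([[round(e, 2) for e in row] for row in expected], round(chi_sq, 3))
```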
5. Two makes of automobiles are being compared on engine quality. A sample of 100 cars of each
make is taken. For one make of car, 32 of the sample cars required a complete engine overhaul
within the first 150,000 km. For the second make, only 18 of the sample cars required engine
overhaul.
Construct a 99% confidence interval for the difference in proportions of cars requiring engine
overhaul. Explain clearly what the above interval means.
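Answer: The 99% confidence interval estimate for p1 − p2 is
p̂1 − p̂2 ± zα/2 √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)
= .32 − .18 ± 2.575 √(.32(1 − .32)/100 + .18(1 − .18)/100)
= 0.14 ± 0.1556 = (−.0156, .2956)
We are 99% confident that the proportion of cars requiring an engine overhaul for the first make is
between about 2 percentage points lower and about 30 percentage points higher than for the second
make. Because the interval contains 0, the data do not establish a difference between the two makes
at this confidence level.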
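The interval for the difference in overhaul proportions can be verified with a short function. This is an illustrative sketch; `two_proportion_ci` is a hypothetical helper name, and z* = 2.575 is the table value used in the answer.

```python
from math import sqrt

def two_proportion_ci(x1, n1, x2, n2, z_star):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

low, high = two_proportion_ci(32, 100, 18, 100, z_star=2.575)
contains_zero = low <= 0 <= high
print(round(low, 4), round(high, 4), contains_zero)
```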
6. Suppose that 20% of the trees in a forest are infested with a certain type of parasite.
a. Find the probability that at least one tree will contain parasite in a random sample of 10
trees.
Answer: X is B(10, .2). So, P(X ≥ 1) = 1 − P(X = 0) = 1 − (.8)^10 = 1 − .1074 = 0.8926
b. After sampling 300 trees, suppose that 72 trees are found to have the parasite. Does this
provide strong evidence that the population proportion (p) of infested trees is higher than
20%? Test appropriate hypotheses using α = 0.05.
Answer:
Hypotheses: H0: p =.20 vs Ha: p >.20;
Test statistic: z = (p̂ − p0) / √(p0(1 − p0)/n) = (.24 − .2) / √(.2 × .8 / 300) = 1.73
P-value: P(Z ≥ 1.73) = 0.0418
Conclusion: Since the P-value is less than α = 0.05, we reject H0. At the 5% significance level,
we conclude that the percentage of trees infested with parasites is more than 20%.
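Both parts of question 6 can be checked in a few lines. The sketch below uses an exact binomial probability for part (a) and the one-sided z test for part (b); `binom_pmf` is a hypothetical helper. The P-value uses the unrounded z, so it is slightly smaller than the table-based 0.0418.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# (a) at least one infested tree among 10
p_at_least_one = 1 - binom_pmf(0, 10, 0.2)

# (b) one-sided z test of H0: p = 0.20 against Ha: p > 0.20
p0, n, infested = 0.20, 300, 72
p_hat = infested / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)

print(round(p_at_least_one, 4), round(z, 2), round(p_value, 4))
```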
7. Many homeowners buy detectors to check for the invisible gas radon in their homes. How
accurate are these detectors? To answer this question, university researchers placed 12 radon
detectors in a chamber where they were exposed to 105 picocuries per liter (pCi/l) of radon over
3 days. The detector readings were as follows: 91.9, 97.8, 111.4, 122.3, 105.4, 95.0, 103.8, 99.6,
96.6, 119.3, 104.8, 101.7
a. Calculate the mean and sample standard deviation for this data.
Answer: Sample mean = 104.13, sample standard deviation, s = 9.40.
b. Give a 90% CI for the mean reading μ of all detectors of this type.
Answer: Assuming the readings are approximately normally distributed, a 90%
confidence interval is, using a t distribution with 11 degrees of freedom,
x̄ ± tα/2 × s/√n = 104.13 ± (1.796) × 9.40/√12 = 104.13 ± 4.87 = (99.26, 109.00).
c. What conclusion can you make from this interval?
Answer: The interval contains the true radon level of 105 pCi/l, so the detectors appear to be accurate on average.
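The summary statistics and interval for the radon readings can be recomputed from the raw data. In this sketch the critical value 1.796 (t with 11 degrees of freedom) is taken from the answer rather than computed; the endpoints may differ in the last digit from the answer because no intermediate rounding is done.

```python
from math import sqrt
from statistics import mean, stdev

readings = [91.9, 97.8, 111.4, 122.3, 105.4, 95.0,
            103.8, 99.6, 96.6, 119.3, 104.8, 101.7]

x_bar = mean(readings)
s = stdev(readings)          # sample standard deviation (divides by n - 1)
t_star = 1.796               # 90% two-sided critical value, 11 df, from the answer

margin = t_star * s / sqrt(len(readings))
low, high = x_bar - margin, x_bar + margin
contains_true_level = low <= 105 <= high

print(round(x_bar, 2), round(s, 2), round(low, 2), round(high, 2))
```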
8. Does increasing the amount of calcium in our diet reduce blood pressure? A randomised
comparative experiment gave one group of 10 black men a calcium supplement for 12 weeks. The
control group of 11 black men received a placebo that appeared identical. The experiment was
double-blind. The following table gave data on the change, x, in the blood pressure (x = blood
pressure before treatment – blood pressure after treatment) for two groups of black males.
Group      Sample size, n      x̄         s
Calcium         10           5.000     8.743
Placebo         11           −.273     5.901
Is there any evidence that calcium lowers blood pressure more than a placebo? Test
appropriate hypotheses using α = 0.05. State the assumptions you make in your test.
Answer: Let μ1 and μ2 be the mean changes in blood pressure in the Calcium and Placebo
groups, respectively.
Assumptions: We assume that the changes in blood pressure in each group are
normally distributed.
Check the condition of equal variation: s1² / s2² = 2.19 < 3. The pooled t procedure is
appropriate for the analysis.
Hypotheses: H0: μ1 − μ2 = 0 vs Ha: μ1 − μ2 > 0
Pooled variance: sp² = [9 × (8.743)² + 10 × (5.901)²] / 19 = 54.536, df = 19
Test statistic: t = (x̄1 − x̄2) / √(sp²/n1 + sp²/n2) = (5 − (−.273)) / √(54.536/10 + 54.536/11) = 1.634
P-value: 0.05 < P(t ≥ 1.634) < .10 (p-value > α = 0.05)
Conclusion: There is insufficient evidence at the 5% significance level that calcium lowers blood
pressure more than a placebo.
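The pooled two-sample t statistic depends only on the summary table, so it is easy to recompute. The following sketch uses a hypothetical helper `pooled_t`; the P-value range would still be read from a t table with 19 degrees of freedom.

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled-variance two-sample t statistic for H0: mu1 - mu2 = 0."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance
    t = (mean1 - mean2) / sqrt(sp2 / n1 + sp2 / n2)
    return t, sp2, df

# calcium group first, placebo group second
t_stat, sp2, df = pooled_t(5.000, 8.743, 10, -0.273, 5.901, 11)
variance_ratio = 8.743**2 / 5.901**2   # informal check used in the answer

print(round(t_stat, 3), round(sp2, 3), df, round(variance_ratio, 2))
```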
9. Three models of automobiles were tested for fuel efficiency as follows. Exactly three litres of
gasoline were placed in the gasoline tank of a car. The car was then driven until the fuel was
used up. The number of kilometers traveled for each model was recorded for several tests
each. The following data were collected.
Car Model    Sample size    Sample mean    Sample std. dev.
A                 4            18              1.633
B                 6            16              1.414
C                 8            20.25           2.053
Is there any significant difference in average distance traveled on 3 litres of fuel for these
three models? Test appropriate hypotheses using α = 0.01.
Answer: H0 : μ1 = μ2 = μ3 , Ha : Not all three means are equal.
Test statistic: F = MST/MSE
ȳ = ∑ ni ȳi / n = 330 / 18 = 18.33
MST = ∑ ni (ȳi − ȳ)² / (k − 1) = [4(18 − 18.33)² + 6(16 − 18.33)² + 8(20.25 − 18.33)²] / (3 − 1) = 62.50 / 2 = 31.25
MSE = ∑ (ni − 1) si² / (n − k) = [3 × 1.633² + 5 × 1.414² + 7 × 2.053²] / (18 − 3) = 47.50 / 15 = 3.17
F = MST/MSE = 31.25/3.17 = 9.86, with degrees of freedom ν1 = 2, ν2 = 15.
P-value = P(F ≥ 9.86) < 0.01.
Conclusion: There is a significant difference in average distance traveled on 3 litres of fuel
among these three models.
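The one-way ANOVA F statistic can be built from group sizes, means, and standard deviations alone. The sketch below uses a hypothetical helper `anova_from_summary`. With unrounded MSE it gives F of about 9.87 rather than the 9.86 obtained above from the rounded value 3.17.

```python
def anova_from_summary(sizes, means, sds):
    """One-way ANOVA F statistic computed from group summary statistics."""
    n, k = sum(sizes), len(sizes)
    grand_mean = sum(ni * m for ni, m in zip(sizes, means)) / n
    # between-group mean square (treatments)
    mst = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(sizes, means)) / (k - 1)
    # within-group mean square (error)
    mse = sum((ni - 1) * s**2 for ni, s in zip(sizes, sds)) / (n - k)
    return mst / mse, mst, mse, (k - 1, n - k)

f_stat, mst, mse, dfs = anova_from_summary(
    sizes=[4, 6, 8], means=[18, 16, 20.25], sds=[1.633, 1.414, 2.053]
)
print(round(f_stat, 2), round(mst, 2), round(mse, 2), dfs)
```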
10. Does alcohol affect the ability to think? A random sample consisting of 11 automobile drivers
was selected to study whether or not alcohol has some effect on time to complete a puzzle.
Under one scenario, the person would drink a beverage that contained no alcohol; under
another scenario, the person would drink a beverage with alcohol. Each person's time to
complete the puzzle was recorded. The following data were obtained:
Driver           1     2     3     4     5     6     7     8     9    10    11
No alcohol     7.1   6.3   6.8   8.4   6.9   8.5   7.3   7.7   8.1   7.4   6.6
With alcohol   7.4   6.2   6.6   9.3   7.2   8.8   7.6   7.9   8.7   7.9   7.0
Do the data provide any evidence to conclude that more time is required to complete the
puzzle after consuming alcohol? Test the appropriate hypotheses at α = 0.01.
Answer: Let μ = μ1 − μ2 = μwithout − μwith = the mean difference in completion
times of the puzzle (no alcohol minus with alcohol). Hypotheses are: H0: μ = 0 vs Ha: μ < 0.
Using the paired t-procedure on the 11 differences (mean −0.318, standard deviation 0.303),
t = −3.49 with df = 10, and p-value < 0.005 (less than α = 0.01).
Conclusion: The data provide sufficient evidence to indicate that it requires
more time, on average, to complete the puzzle after consuming alcohol.
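The paired t statistic comes from the per-driver differences, which can be computed directly from the table. This sketch only reproduces the statistic; the P-value bound is still read from a t table with 10 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

no_alcohol   = [7.1, 6.3, 6.8, 8.4, 6.9, 8.5, 7.3, 7.7, 8.1, 7.4, 6.6]
with_alcohol = [7.4, 6.2, 6.6, 9.3, 7.2, 8.8, 7.6, 7.9, 8.7, 7.9, 7.0]

# difference for each driver: time without alcohol minus time with alcohol
diffs = [a - b for a, b in zip(no_alcohol, with_alcohol)]

d_bar = mean(diffs)
s_d = stdev(diffs)
t_stat = d_bar / (s_d / sqrt(len(diffs)))
df = len(diffs) - 1

print(round(d_bar, 3), round(s_d, 3), round(t_stat, 2), df)
```

Pairing matters here because each driver serves as their own control, so the analysis uses one sample of differences instead of two independent samples.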
11. A research study investigating the relationship between smoking and heart disease in a sample
of 1000 men over 50 years of age provided the following data:
               Heart disease    No heart disease    Total
Smoker              100               200             300
Non-smoker           80               620             700
Total               180               820            1000
(a) If one of the 1000 men is randomly selected, find the probability that a man over
50 years of age is a smoker and has a record of heart disease.
Answer: P(Smoker and Heart Disease) = 100/1000 = 0.1
(b) If one of the 1000 men is randomly selected, find the probability that he has heart
disease, given that the man over 50 years of age is a smoker.
Answer: P(Heart Disease|Smoker) = 100/300 = .33
(c) Does the research indicate that heart disease and smoking are independent
events? Explain.
Answer: P(Heart Disease|Smoker)= .33 ≠ P(Heart Disease) =.18.
Therefore, the events are not independent.
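The three probabilities in question 11 can be computed exactly from the table counts. The sketch below uses `Fraction` so that the independence check compares exact values; the dictionary layout is just one illustrative way to hold the table.

```python
from fractions import Fraction

counts = {
    ("smoker", "disease"): 100,
    ("smoker", "no disease"): 200,
    ("non-smoker", "disease"): 80,
    ("non-smoker", "no disease"): 620,
}
total = sum(counts.values())

p_smoker_and_disease = Fraction(counts[("smoker", "disease")], total)
p_smoker = Fraction(sum(v for (s, _), v in counts.items() if s == "smoker"), total)
p_disease = Fraction(sum(v for (_, d), v in counts.items() if d == "disease"), total)

# conditional probability: P(disease | smoker) = P(smoker and disease) / P(smoker)
p_disease_given_smoker = p_smoker_and_disease / p_smoker

# independence would require P(smoker and disease) = P(smoker) * P(disease)
independent = p_smoker_and_disease == p_smoker * p_disease

print(p_smoker_and_disease, p_disease_given_smoker, p_disease, independent)
```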
12. If two events A and B are independent, then:
(A) P(A|B) = P(A)
(B) P(B|A) = P(B)
(C) P(AB) = P(A)P(B)
(D) all of the above
(E) none of the above
Answer: D
13. An event A will occur with probability 0.5. An event B will occur with probability 0.6. The
probability that both A and B will occur is 0.1. We may conclude
A) that events A and B are independent.
B) that events A and B are disjoint.
C) that either A or B always occurs.
D) None of the above.
Answer: C
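The reasoning behind this answer is the addition rule: P(A or B) = P(A) + P(B) − P(A and B). A brief sketch that checks each option against the given probabilities (variable names are illustrative):

```python
p_a, p_b, p_both = 0.5, 0.6, 0.1

# addition rule for the union of two events
p_either = p_a + p_b - p_both

independent = abs(p_both - p_a * p_b) < 1e-12   # would need P(A and B) = 0.30
disjoint = p_both == 0                          # would need P(A and B) = 0
always_one_occurs = abs(p_either - 1) < 1e-12   # P(A or B) = 1

print(p_either, independent, disjoint, always_one_occurs)
```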
14. The proportion of students who own a cell phone on college campuses across the country has
increased tremendously over the past few years. It is estimated that approximately 90% of
students now own a cell phone. Fifteen students are to be selected at random from a large
university. Assume that the proportion of students who own a cell phone at this university is
the same as nationwide. Let X = the number of students in the sample of 15 who own a cell
phone. What is the appropriate distribution for X?
A) X is N(15, 0.9)
B) X is B(15, 0.9)
C) X is B(15, 13.5)
D) X is N(13.5, 1.16)
Answer : B
15. If the null hypothesis is rejected, then
(A) Only a type I error is possible
(B) Only a type II error is possible
(C) Both type I and type II errors are possible
(D) Neither a type I nor a type II error is possible
Answer: A
16. The null hypothesis is rejected when
(A) The p-value is smaller than the level of significance
(B) The p-value is greater than the level of significance
(C) The p-value is greater than 90%
(D) The p-value is greater than 10%
Answer: A
17. In a test of significance,
(A) It is possible to prove the alternative hypothesis
(B) It is possible to find evidence to support the alternative
(C) It is not possible to find evidence to support the alternative
(D) None of the above
Answer: B
18. In a test of significance,
(A) The null hypothesis is believed to be true and we are trying to find evidence to
support it
(B) The alternative hypothesis is believed to be true and we are trying to find
evidence to support it
(C) Neither of the above
Answer: C
19. It has been claimed that by the year 2010, 20% of all American adults will be drawing retirement
benefits. If you believe that the number is greater than 20%, which is the way to formulate
the hypotheses?
(A) H0: p̂ = .20 versus Ha: p̂ > .20
(B) H0: p̂ ≥ .20 versus Ha: p̂ < .20
(C) H0: p = .20 versus Ha: p > .20
(D) H0: p ≥ .20 versus Ha: p < .20
Answer: C
20. The sampling distribution of a statistic describes
(A) The shape of the distribution of all potential values of the statistic in repeated
sampling
(B) The centre of all potential values of the statistic in repeated sampling
(C) The amount of variability associated with all potential values of the statistic in
repeated sampling
(D) All of the above
Answer: D
21. Suppose that the 45th percentile for weight is 70 kg. This means:
(A) 45 percent weigh more than 70 kg
(B) 45 percent weigh less than 70 kg
(C) 55 percent weigh less than 70 kg
(D) 70 percent weigh more than 45 kg
Answer: B
22. A population has μ = 83, σ = 3.25. Chebyshev’s theorem indicates that the proportion of the
population between 70 and 96 must be:
(A) At least 6.25%
(B) At most 6.25%
(C) At least 93.75%
(D) At most 93.75%
(E) None of these
Answer: C
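The bound follows from Chebyshev's theorem: at least 1 − 1/k² of any population lies within k standard deviations of the mean, and here 70 and 96 are each 13 / 3.25 = 4 standard deviations from 83. A minimal sketch, with a hypothetical helper name:

```python
def chebyshev_lower_bound(mu, sigma, low, high):
    """Lower bound on the proportion of a population inside [low, high]."""
    # use the largest symmetric interval around mu that fits inside [low, high]
    k = min(mu - low, high - mu) / sigma
    if k <= 1:
        return 0.0            # the theorem gives no information for k <= 1
    return 1 - 1 / k**2

bound = chebyshev_lower_bound(mu=83, sigma=3.25, low=70, high=96)
print(f"at least {bound:.2%}")
```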