Discussion problems week 5

Sections 2C and 2D - Stats 10
Problem 7.52
In a simple random sample of 1200 Americans age 20 and over, the proportion with diabetes was
found to be 0.115 (or 11.5%).
1. What is the standard error for the estimate of the proportion of all Americans age 20 and over
with diabetes?
$$SE = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.115 \times (1 - 0.115)}{1200}} = 0.0092$$
or about 0.92% standard error.
2. Find the margin of error, using a 95% confidence level, for estimating this proportion.
The margin of error for a 95% CI is
m = 1.96 × SE = 1.96 × 0.0092 = 0.018
or about 1.8%.
3. Report the 95% confidence interval for the proportion of all Americans age 20 and over with
diabetes.
The lower boundary for the 95% CI is then
0.115 − 0.018 = 0.097
and the upper boundary is
0.115 + 0.018 = 0.133
Therefore, we are 95% confident that the true proportion of persons aged 20 or more with diabetes lies between 0.097 and 0.133.
4. According to the Centers for Disease Control, nationally, 10.7% of all Americans age 20 or over
have diabetes. Does the confidence interval you found in part 3 support or refute this claim?
Explain.
Yes, the confidence interval found in part 3 supports this claim, since 10.7% (0.107) falls
within the 95% CI.
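The arithmetic in Problem 7.52 can be checked with a short Python sketch. The function name `proportion_ci` and its parameters are illustrative choices, not part of the textbook solution, and the multiplier 1.96 is the usual 95% Normal value used above.

```python
import math

def proportion_ci(p_hat, n, z_star=1.96):
    """Return (SE, margin of error, lower, upper) for a sample proportion.

    Uses the normal-approximation interval p_hat +/- z_star * SE,
    with SE = sqrt(p_hat * (1 - p_hat) / n).
    """
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return se, margin, p_hat - margin, p_hat + margin

# Values from Problem 7.52: p_hat = 0.115, n = 1200
se, margin, lower, upper = proportion_ci(0.115, 1200)
print(f"SE = {se:.4f}, margin = {margin:.3f}, CI = ({lower:.3f}, {upper:.3f})")

# Part 4: is the CDC figure of 10.7% inside the interval?
print("0.107 inside CI:", lower <= 0.107 <= upper)
```

The same function applies to Problem 7.64 by passing `310 / 1381` and `1381`.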
Problem 7.54
In a 2008 survey, the National Highway Traffic Safety Administration (NHTSA) reported that 83% of
people used seat belts. The margin of error is 3 percentage points.
1. Assuming that the confidence level is 95% and the survey was random, find a 95% confidence
interval for the percentage of people who used seat belts in 2008.
The statement of the problem provides a sample proportion (0.83) and a margin of error (0.03).
The lower boundary for the 95% CI is then
0.83 − 0.03 = 0.80
and the upper boundary is
0.83 + 0.03 = 0.86
Therefore, we are 95% confident that the true proportion of persons who wear their seatbelt is
between 0.80 and 0.86.
2. The NHTSA said the percentage of people using seat belts in 2000 was 71%. If 71% were suggested as the percentage for 2008, would you reject that as implausible? Why or why not? Does
this suggest a change in seat belt use between 2000 and 2008? Explain.
A percentage of 71% would be rejected as a plausible percentage of seatbelt users in 2008 since
the corresponding proportion of 0.71 lies well outside (in this case below) the 95% CI. This
provides evidence that seatbelt use has increased since 2000.
Problem 7.58
In the 1960 presidential election, 34,226,731 people voted for Kennedy; 34,108,157 for Nixon; and
197,029 for third-party candidates (www.uselectionatlas.org).
1. What percentage of voters chose Kennedy?
The percentage of voters who voted for Kennedy was
$$\frac{34{,}226{,}731}{34{,}226{,}731 + 34{,}108{,}157 + 197{,}029} = \frac{34{,}226{,}731}{68{,}531{,}917} = 0.4994$$
or about 49.94%.
2. Would it be appropriate to find a confidence interval for the proportion of voters choosing
Kennedy? Why or why not?
It does not make sense to find a confidence interval (for any degree of confidence) for this
proportion because this is the population proportion, not a sample proportion. If instead we
had been given a sample proportion, then it makes sense to use statistical methods to make an
inference as to the population proportion.
Problem 7.64
In the 2008 General Social Survey, people were asked whether they thought the sun went around the
earth or vice versa. Of 1381 people, 310 thought the sun went around the earth.
1. What proportion of people in the survey believed the sun went around the earth?
The proportion of respondents who believe the sun goes around the earth is
$$\frac{310}{1381} = 0.224$$
2. Find a 95% confidence interval for the proportion of all people with this belief.
For the 95% CI we first calculate the standard error, and from this then the margin of error, m.
$$SE = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.224 \times (1 - 0.224)}{1381}} = 0.0112$$
Then we have
m = 1.96 × SE = 1.96 × 0.0112 = 0.022
Finally, the lower boundary for the 95% CI is
0.224 − 0.022 = 0.202
and the upper boundary is
0.224 + 0.022 = 0.246
Therefore, we are 95% confident that the true proportion of people who believe the sun goes around the earth is between 0.202 and 0.246.
3. Suppose a scientist said that 30% of people in the general population believe the sun goes
around the earth. Using the confidence interval, would you say that was plausible? Explain
your answer.
A claim that 30% of the general public believes the sun goes around the earth is not plausible
since it lies outside (above) the 95% CI.
Problem 8.37
In a Rasmussen poll of 1000 adults in July 2010, 520 of those polled said that schools should ban
sugary snacks and soft drinks.
1. Do a majority of adults (more than 50%) support a ban on sugary snacks and soft drinks? Perform a hypothesis test using a significance level of 0.05.
Step 1: We begin by defining the hypotheses:
H0 : p = 0.50 (the proportion supporting banning is 50%)
H1 : p > 0.50 (a majority supports banning sugary foods and soft drinks)
Step 2: Choose the one-proportion z-test and check that the conditions are satisfied so that the
normal distribution provides an appropriate model for the distribution of sample proportions.
Observe that
np = 1000 × 0.50 = 500
and
n(1 − p) = 1000 × 0.50 = 500
so both are greater than 10, sampling is random, and the population is sufficiently large. The
conditions are satisfied.
Step 3: In order to test the alternative hypothesis, we find the z-score and from it the p-value.
The SE can be found by
$$SE = \sqrt{\frac{0.50 \times (1 - 0.50)}{1000}} = \sqrt{\frac{0.25}{1000}} = 0.0158$$
The z-score is
$$z = \frac{\hat{p} - p}{SE} = \frac{520/1000 - 0.50}{0.0158} = \frac{0.52 - 0.50}{0.0158} = 1.26$$
Because we have a one-tailed alternative hypothesis, the p-value corresponds to the area to the
right of z = 1.26. From the Normal table, the p-value is 1 - 0.8962 = 0.1038.
Step 4: At α = 0.05 we fail to reject H0. There is insufficient evidence to conclude that a
majority of adults support a ban on sugary snacks and soft drinks.
2. Choose the best interpretation of the results you obtained in part 1:
a) The percentage of all adults who favor banning is significantly more than 50%.
b) The percentage of all adults who favor banning is not significantly more than 50%.
Thus, statement (b) is correct: The percentage of all adults who favor banning is not significantly more than 50%.
Problem 8.40
According to one source, 50% of plane crashes are due at least in part to pilot error
(http://www.planecrashinfo.com). Suppose that in a random sample of 100 separate airplane accidents,
62 of them were due to pilot error (at least in part).
1. Test the hypothesis that the proportion of airplane accidents due to pilot error is not 0.50.
Use a significance level of 0.05.
Step 1: We begin by defining the hypotheses:
H0 : p = 0.50 (the proportion of airplane accidents due at least in part to pilot error is 0.50)
H1 : p ≠ 0.50 (the proportion of airplane accidents due at least in part to pilot error is not 0.50)
Step 2: We use a one-proportion z-test and check that the conditions are satisfied so that the
normal distribution provides an appropriate model for the distribution of sample proportions.
Observe that
np = 100 × 0.50 = 50
and
n(1 − p) = 100 × 0.50 = 50
so both are greater than 10, sampling is random, and the population is sufficiently large (there
are likely more than 1000 accidents in total). The conditions are satisfied.
Step 3: In order to test the alternative hypothesis, we find the z-score and from it the p-value.
The SE can be found by
$$SE = \sqrt{\frac{0.50 \times (1 - 0.50)}{100}} = \sqrt{\frac{0.25}{100}} = 0.05$$
The z-score is
$$z = \frac{\hat{p} - p}{SE} = \frac{62/100 - 0.50}{0.05} = \frac{0.62 - 0.50}{0.05} = 2.4$$
The area corresponding to the right tail defined by z = 2.4 is 1 - 0.9918 = 0.0082 as found in the
Normal table. Because we are using a two-tailed alternative hypothesis, we double this value
to get a p-value of 2 × 0.0082 = 0.0164.
Step 4: At α = 0.05 we reject H0 and conclude that the proportion of airplane accidents due at
least in part to pilot error is not 0.50.
2. Choose the correct interpretation:
a) The percentage of plane crashes due to pilot error is not significantly different from 50%.
b) The percentage of plane crashes due to pilot error is significantly different from 50%.
Thus, statement (b) is correct: The percentage of plane crashes due to pilot error is significantly
different from 50%.
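The one-proportion z-tests in Problems 8.37 and 8.40 can be reproduced with the sketch below. The function `one_proportion_z_test` and its `alternative` argument are hypothetical names chosen for illustration. The sketch uses the exact Normal CDF from Python's standard library instead of a printed table, so its p-values may differ from the table-based values in the third or fourth decimal place.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative):
    """One-proportion z-test.

    alternative: 'greater', 'less', or 'two-sided'.
    The standard error is computed under the null value p0.
    """
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    upper_tail = 1 - NormalDist().cdf(z)  # area to the right of z
    if alternative == "greater":
        p_value = upper_tail
    elif alternative == "less":
        p_value = 1 - upper_tail
    elif alternative == "two-sided":
        # double the smaller tail area
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Problem 8.37: 520 of 1000, H1: p > 0.50
z_ban, p_ban = one_proportion_z_test(520, 1000, 0.50, "greater")
print(f"8.37: z = {z_ban:.2f}, p-value = {p_ban:.4f}")

# Problem 8.40: 62 of 100, H1: p != 0.50
z_crash, p_crash = one_proportion_z_test(62, 100, 0.50, "two-sided")
print(f"8.40: z = {z_crash:.2f}, p-value = {p_crash:.4f}")
```

Comparing each p-value with α = 0.05 gives the same decisions as the worked solutions: fail to reject H0 in Problem 8.37 and reject H0 in Problem 8.40.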
Problem 8.49
A study used nicotine gum to help people quit smoking. The study was placebo-controlled, randomized, and double-blind. Each participant was interviewed after 28 days, and success was defined as
being abstinent from cigarettes for 28 days. The results showed that 174 out of 1649 people using the
nicotine gum succeeded, and 66 out of 1648 using the placebo succeeded. Although the sample was
not random, the assignment to groups was randomized.
1. Find the proportion of people using nicotine gum that stopped smoking and the proportion
of people using the placebo that stopped smoking, and compare them. Is this what the researchers had expected?
The proportion of people using nicotine gum who stopped smoking is 174/1649 = 0.1055, while
the proportion of people on placebo who stopped smoking is 66/1648 = 0.040. The proportion
of persons who stopped smoking while using nicotine gum was higher (than for those taking
placebo) as researchers hoped.
2. Find the observed value of the test statistic, assuming that the conditions for a two-proportion
z-test hold.
The value of the z-statistic can be found most easily using technology. For example, from R,
the output is
Sample    X     N      Sample p
Gum       174   1649   0.105518
Placebo   66    1648   0.040049
Difference = p (Gum) - p (Placebo)
Estimate for difference: 0.0654700
95% lower bound for difference: 0.0507060
Test for difference = 0 (vs > 0): Z = 7.23 P-Value = 0.000
Thus, the z-statistic is z = 7.23.
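For readers without statistical software at hand, the z-statistic can also be computed directly. The sketch below assumes the pooled-proportion form of the two-proportion z-test, which agrees with the reported z = 7.23 to within rounding. The function name `two_proportion_z` is illustrative only.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (p1, p2, z) for a two-proportion z-test with a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # combined success proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return p1, p2, (p1 - p2) / se

# Problem 8.49: gum 174 of 1649, placebo 66 of 1648
p_gum, p_placebo, z = two_proportion_z(174, 1649, 66, 1648)

# One-sided alternative (gum > placebo): area to the right of z
p_value = 1 - NormalDist().cdf(z)
print(f"p_gum = {p_gum:.4f}, p_placebo = {p_placebo:.4f}, z = {z:.2f}")
print("p-value below 0.001:", p_value < 0.001)
```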
Problem 8.61
The Gallup organization frequently conducts polls in which they ask the following question:
“In general, do you feel that the laws covering the sale of firearms
should be made more strict, less strict, or kept as they are now?”
In February 1999, 60% of those surveyed said ‘more strict,’ and on April 26, 1999, shortly after the
Columbine High School shootings, 66% of those surveyed said ‘more strict.’
1. Assume that both polls used samples of 560 people. Determine the number of people in the
sample that said ‘more strict’ in February 1999, before the school shootings, and the number
that said ‘more strict’ in late April 1999, after the school shootings.
The number of persons in the February sample who responded with ‘more strict’ was 336 since
560 × 0.60 = 336. The number of persons in the late April sample who responded with ‘more
strict’ was 370 since 560 × 0.66 = 369.6, which rounds to 370.
2. Do a test to see whether the proportion that said ‘more strict’ is statistically significantly different in the two different surveys, using a significance level of 0.01.
Step 1: The hypotheses needed to test whether or not the proportion who said ‘more strict’
is statistically significantly different in the two different time periods (before and after the
Columbine massacre) are
H0 : p_B = p_A (the proportions responding "more strict" before and after are the same)
H1 : p_B ≠ p_A (the proportions responding "more strict" before and after are different)
where p_B represents the proportion from the earlier sample (Before) who responded more strict and p_A
represents the proportion from the later sample (After) who responded more strict.
Step 2: We use the two-proportion z-test. This is valid since the participants were chosen at
random. Also, since the pooled proportion is
$$\hat{p} = \frac{336 + 370}{560 + 560} = 0.6304$$
then
n_1 × p̂ = n_2 × p̂ = 560 × 0.6304 = 353
n_1 × (1 − p̂) = n_2 × (1 − p̂) = 560 × 0.3696 = 207
and each of these is larger than 10. The conditions required for us to use the Normal distribution to represent the distribution of differences of sample proportions have been met.
Step 3: The value of the z-statistic can be found most easily using technology. For example,
from R, the output is
Sample   X     N     Sample p
B        336   560   0.600000
A        370   560   0.660714
Difference = p (B) - p (A)
Estimate for difference: -0.0607143
99% CI for difference: (-0.134873, 0.0134444)
Test for difference = 0 (vs not = 0): Z = -2.10 P-Value = 0.035
The z-statistic is z = −2.10 and the p-value is 0.035, which fails to be statistically significant at
α = 0.01.
Step 4: At α = 0.01 we fail to reject H0 and conclude that there is insufficient evidence that the
proportions of people who responded more strict to the question differ before and after the
Columbine shootings.
3. Repeat the problem, assuming that the sample sizes were both 1120.
Step 1 from part (2) remains the same. The conditions described in Step 2 are still met with the
increased sample sizes. We recalculate the z-statistic and corresponding p-value and form a
new conclusion.
Step 3: The value of the z-statistic can be found most easily using technology. For example,
from R, the output is
Sample   X     N      Sample p
B        672   1120   0.600000
A        740   1120   0.660714
Difference = p (B) - p (A)
Estimate for difference: -0.0607143
99% CI for difference: (-0.113152, -0.00827618)
Test for difference = 0 (vs not = 0): Z = -2.98 P-Value = 0.003
The z-statistic is z = −2.98 and the p-value is 0.003, which is significant at α = 0.01.
Step 4: At α = 0.01 we reject H0 and conclude that the difference between the proportions of
people who responded more strict to the question before and after the Columbine shootings is
statistically significant.
4. Comment on the effect of different sample sizes on the p-value and on the conclusion.
The larger sample size (in part (3)) resulted in a z-value that was farther from 0 and a lower
p-value than what was observed in part (2). As a result, we were able to conclude in (3) that the
difference was statistically significant at α = 0.01.
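The effect of sample size in Problem 8.61 can be seen by running the same pooled two-proportion z-test on both sets of counts. This is a sketch under the assumption that the reported z-values come from the pooled standard error. The helper name `two_sided_two_proportion_test` is made up for this example, and the counts are copied from the output tables above.

```python
import math
from statistics import NormalDist

def two_sided_two_proportion_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) using the pooled standard error."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

alpha = 0.01

# Counts (before, after) and common sample size, as used in parts 2 and 3
z_560, p_560 = two_sided_two_proportion_test(336, 560, 370, 560)
z_1120, p_1120 = two_sided_two_proportion_test(672, 1120, 740, 1120)

for n, z, p in [(560, z_560, p_560), (1120, z_1120, p_1120)]:
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"n = {n}: z = {z:.2f}, p-value = {p:.3f}, {decision}")
```

The sample proportions are the same in both runs. Only the standard error shrinks as n grows, which moves z farther from 0.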
Problems from Gould and Ryan, Introductory Statistics