chp8_section1_a_ans

Chapter 8 Section 1
Homework A
8.7 Can we use the large-sample confidence interval?
In each of the following circumstances state whether you would use the large-sample confidence
interval. The variable X denotes the number of observed successes out of n attempts.
(a) n = 50, X = 30
(b) n = 90, X = 15.
(c) n=10, X = 2
(d) n = 60; X = 50.
(e) n = 25, X = 15.
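A minimal sketch of how the check for 8.7 could be automated. It assumes the rule of thumb quoted later in this answer key for the large-sample interval (at least 15 successes and at least 15 failures); the function name `large_sample_ci_ok` and the `minimum` parameter are illustrative, not from the textbook.

```python
def large_sample_ci_ok(n, x, minimum=15):
    """Return True when successes and failures both reach the threshold.

    The threshold of 15 is an assumption taken from the rule of thumb
    quoted elsewhere in this answer key.
    """
    successes = x
    failures = n - x
    return successes >= minimum and failures >= minimum

# Cases (a) through (e) from Exercise 8.7 as (n, X) pairs.
cases = {"a": (50, 30), "b": (90, 15), "c": (10, 2), "d": (60, 50), "e": (25, 15)}
results = {label: large_sample_ci_ok(n, x) for label, (n, x) in cases.items()}

for label, (n, x) in cases.items():
    print(f"({label}) successes={x}, failures={n - x}, use interval: {results[label]}")
```

Under this assumed threshold only cases (a) and (b) qualify; a different textbook threshold would change the answers.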
8.9 What's wrong? Explain what is wrong with each of the following:
(a) An approximate 99% confidence interval for an unknown proportion p is p̂ plus or minus its
standard error.
(c) A significance test is used to evaluate H0: p̂ = 0.2 versus the two-sided alternative.
8.11 Gambling and college athletics. Gambling is an issue of great concern to those involved in
intercollegiate athletics. Because of this, the National Collegiate Athletic Association (NCAA) surveyed
student-athletes concerning their gambling-related behaviors. There were 5594 Division I male athletes
in the survey. Of these, 3547 reported some participation in some gambling behavior. This included
playing cards, betting on games of skill, buying lottery tickets, and betting on sports.
(a) Find the sample proportion and the large-sample margin of error for 95% confidence. Explain in
simple terms the meaning of the 95%. Make sure you confirm that you can use the formula that you
have used.
(b) Because of the way that the study was designed to protect the anonymity of the student-athletes
who responded, it was not possible to calculate the number of students who were asked to respond
but did not. Does this fact affect the way you interpret the results? Write a short paragraph
explaining your answer.
(c) In order to use the formula that you used in (a) what did you have to verify is true about the
distribution of p̂ ?
8.12 Gambling and female athletes. In the study described in the previous exercise, 1447 of a total of 3469
female student-athletes reported participation in some gambling activity.
(a) Use the large-sample methods to find an estimate of the true proportion with a 95% confidence
interval.
$\hat{p} = \frac{1447}{3469} = 0.4171$
$0.4171 \pm 1.96\sqrt{\frac{0.4171(1 - 0.4171)}{3469}}$, which gives (0.4007, 0.4335).
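The interval above can be checked with a short script. This is a sketch under the same assumptions as the hand calculation (z* = 1.96 for 95% confidence); the function name `large_sample_ci` is illustrative.

```python
import math

def large_sample_ci(x, n, z_star=1.96):
    """Large-sample confidence interval for a proportion.

    Returns the sample proportion, the margin of error, and the interval.
    """
    p_hat = x / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat, margin, (p_hat - margin, p_hat + margin)

# Exercise 8.12: 1447 of 3469 female student-athletes.
p_hat, margin, (lower, upper) = large_sample_ci(1447, 3469)
print(f"p_hat = {p_hat:.4f}, margin = {margin:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```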
(b) The margin of error for this sample is not same as the margin of error calculated for the previous
exercise. Explain why.
The margin of error is determined by the value of p̂ and the sample size. In the previous exercise
these values were different, which is the reason for a different margin of error.
1. Do you enjoy driving your car? In 1991, a Gallup Poll of the U.S. population reported that 79% of
drivers enjoy driving their cars.
(a) The Pew Research Center recently (2008) conducted the same poll in the U.S. with n = 50, and 36
reported that they enjoy driving. Does this sample provide evidence that “the percent of drivers who
enjoy driving their cars has declined since 1991?” Make sure you state the null and alternative
hypotheses. Report the large-sample z statistic and its P-value. Verify that you can use the method you
used to calculate the P-value.
While the 79% is really a statistic, let us assume for the moment that it is a parameter.
H0: p = 0.79 Ha: p < 0.79
50(0.79) = 39.5 > 10 and 50(1 – 0.79) = 10.5 > 10, so we can use a normal approximation.
$\hat{p} = \frac{36}{50} = 0.72$
Test Statistic: $Z = \frac{0.72 - 0.79}{\sqrt{\frac{0.79(1 - 0.79)}{50}}} = -1.2152$
P-value: $P(\hat{p} < 0.72) = P(Z < -1.2152) = 0.1121$
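As a check on the arithmetic, here is a minimal sketch of the one-sided large-sample z test using only the Python standard library. The function name `one_sided_z_test` is illustrative; the standard error uses the null value p0, as in the hand calculation.

```python
import math
from statistics import NormalDist

def one_sided_z_test(x, n, p0):
    """Large-sample z test of H0: p = p0 against Ha: p < p0."""
    p_hat = x / n
    # The standard error is computed under the null hypothesis.
    se0 = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    # Lower-tail area under the standard Normal curve.
    p_value = NormalDist().cdf(z)
    return z, p_value

z, p_value = one_sided_z_test(36, 50, 0.79)
print(f"z = {z:.4f}, P-value = {p_value:.4f}")
```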
(b) Draw a sketch of a standard Normal curve and mark the location of your z statistic. Shade the appropriate
area that corresponds to the P-value.
(c) The researchers will reject the null hypothesis if p̂ < 0.66. What is the value 0.66 called?
The critical value.
(d) What is the probability of a Type I error?
$P(\hat{p} < 0.66) = P\left(Z < \frac{0.66 - 0.79}{\sqrt{\frac{0.79(1 - 0.79)}{50}}}\right) = P(Z < -2.2569) = 0.012$
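The Type I error probability is the chance of landing in the rejection region when H0 is true. A short sketch, assuming the Normal approximation for p̂ under H0 with n = 50 and p0 = 0.79 (the variable names are illustrative):

```python
import math
from statistics import NormalDist

n = 50
p0 = 0.79
cutoff = 0.66  # reject H0 when p_hat falls below this value

# Approximate sampling distribution of p_hat when H0 is true.
se0 = math.sqrt(p0 * (1 - p0) / n)
null_distribution = NormalDist(mu=p0, sigma=se0)

# Probability of rejecting H0 even though it is true.
alpha = null_distribution.cdf(cutoff)
z_cutoff = (cutoff - p0) / se0
print(f"z at cutoff = {z_cutoff:.4f}, P(Type I error) = {alpha:.4f}")
```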
(e) Gallup conducts the same survey (2008) in the U.S. with n = 500, and 360 reported that they
enjoy driving. Does this sample provide evidence that “the percent of drivers who enjoy driving their
cars has declined since 1991?” Make sure you state the null and alternative hypotheses. Report the
large-sample z statistic and its P-value. Verify that you can use the method you used to calculate the P-value.
500(0.79) = 395 > 10 and 500(1 – 0.79) = 105 > 10, so we can use a normal approximation.
H0: p = 0.79 Ha: p < 0.79
$\hat{p} = \frac{360}{500} = 0.72$
Test Statistic: $Z = \frac{0.72 - 0.79}{\sqrt{\frac{0.79(1 - 0.79)}{500}}} = -3.843$
P-value: $P(\hat{p} < 0.72) = P(Z < -3.843) < 0.0001$
(f) Draw a sketch of a standard Normal curve and mark the location of your z statistic. Shade the appropriate
area that corresponds to the P-value.
(g) Notice that problem (e) and problem (a) both produced the same p̂ value. What caused the difference
in p-values?
While the sample proportion is the same in both problems, the sample sizes are very different.
The increase in sample size makes the standard error of p-hat smaller (about 0.0182 for n = 500
versus about 0.0576 for n = 50). The same difference of 0.07 between p-hat and 0.79 is therefore
many more standard errors away from 0.79 (z = -3.843 instead of z = -1.2152), and that is why the
P-value is so much smaller.
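The effect of sample size can be seen by running the same test for both values of n. This is an illustrative sketch; the helper `z_and_p_value` is not from the textbook.

```python
import math
from statistics import NormalDist

P0 = 0.79     # null value of the proportion
P_HAT = 0.72  # sample proportion in both (a) and (e)

def z_and_p_value(n, p_hat=P_HAT, p0=P0):
    """Return the null standard error, z statistic, and lower-tail P-value."""
    se0 = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    return se0, z, NormalDist().cdf(z)

se_small, z_small, p_small = z_and_p_value(50)
se_large, z_large, p_large = z_and_p_value(500)

print(f"n =  50: SE = {se_small:.4f}, z = {z_small:.4f}, P-value = {p_small:.4f}")
print(f"n = 500: SE = {se_large:.4f}, z = {z_large:.4f}, P-value = {p_large:.6f}")
```

The difference p̂ - p0 is identical in both runs; only the standard error in the denominator changes.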
(h) Create a 95% confidence interval with the data from (e).
$0.72 \pm 1.96\sqrt{\frac{0.72(1 - 0.72)}{500}}$, which gives (0.6806, 0.7594).
(i) We want to estimate p, to ± 0.01. How large of a sample is needed to construct a 95% confidence
interval for the proportion of U.S. drivers who enjoy driving their automobiles? Use the estimate found in
problem (e) as the value for p*.
$\left(\frac{1.96}{0.01}\right)^2 (0.72)(1 - 0.72) \approx 7744.67$, which rounds up to 7745.
8.15 Getting angry at other drivers. Refer to Exercise 8.14. The same Pew Poll found that 38% of the
respondents "shouted, cursed or made gestures to other drivers" in the last year.
(b) Does the fact that the respondent is self-reporting these actions affect the way that you interpret the
results? Write a short paragraph explaining your answer.
2. Long sermons. The National Congregations Study collected data in a one-hour interview with a key
informant, that is, a minister, priest, rabbi, or other staff person or leader. One question asked concerned
the length of the typical sermon. For this question 9 out of 119 congregations reported that the typical
sermon lasted more than 30 minutes.
(a) Estimate the true proportion for this question with a 95% confidence interval.
Notice that I do not have enough successes (need at least 15 successes and 15 failures) to use the
“large sample” confidence interval formula. But the sample size is large enough (need a sample
size of at least 5) to use the plus four formula.
$\tilde{p} = \frac{9 + 2}{119 + 4} \approx 0.0894$
$0.0894 \pm 1.96\sqrt{\frac{0.0894(1 - 0.0894)}{119 + 4}}$, which gives (0.0390, 0.1398).
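A minimal sketch of the plus four calculation: add two successes and two failures, then apply the usual interval formula to the adjusted counts. The function name `plus_four_ci` is illustrative.

```python
import math

def plus_four_ci(x, n, z_star=1.96):
    """Plus four confidence interval for a proportion."""
    # Add 2 successes and 2 failures (4 observations in total).
    p_tilde = (x + 2) / (n + 4)
    standard_error = math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    margin = z_star * standard_error
    return p_tilde, (p_tilde - margin, p_tilde + margin)

# Long sermons: 9 of 119 congregations.
p_tilde, (lower, upper) = plus_four_ci(9, 119)
print(f"p_tilde = {p_tilde:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```

The code keeps the adjusted proportion unrounded, so the last printed digit can differ slightly from a hand calculation that rounds it to 0.0894 first.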
(b) The respondents to this question were not asked to use a stopwatch to record the lengths of a random
sample of sermons at their congregations. They responded based on their impressions of the sermons. Do
you think that ministers, priests, rabbis, or other staff persons or leaders might perceive sermon lengths
differently from the people listening to the sermons? Discuss how your ideas would influence your
interpretation of the results of this study.
3. Confidence level and interval width. Refer to Exercise 1(h). Would a 90% confidence interval be
wider or narrower than the one that you found in that exercise? Verify your results by computing the
interval.
The interval should be narrower, since a lower confidence level uses a smaller critical value
(1.645 instead of 1.96).
$0.72 \pm 1.645\sqrt{\frac{0.72(1 - 0.72)}{500}}$, which gives (0.6870, 0.7530).
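To verify the comparison, the two intervals can be computed side by side. This sketch uses the table values z* = 1.96 and z* = 1.645; the function name `interval` is illustrative.

```python
import math

def interval(p_hat, n, z_star):
    """Large-sample confidence interval for a proportion."""
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Data from problem 1(e): p_hat = 0.72 with n = 500.
ci_95 = interval(0.72, 500, 1.96)
ci_90 = interval(0.72, 500, 1.645)
width_95 = ci_95[1] - ci_95[0]
width_90 = ci_90[1] - ci_90[0]

print(f"95%: ({ci_95[0]:.4f}, {ci_95[1]:.4f}), width = {width_95:.4f}")
print(f"90%: ({ci_90[0]:.4f}, {ci_90[1]:.4f}), width = {width_90:.4f}")
```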
8.21 Can we use the z test? In each of the following cases state whether or not the Normal approximation
to the binomial should be used for a significance test on the population proportion p.
(a) n = 30 and H0: p = 0.2.
(b) n = 30 and H0: p = 0.6.
(c) n = 100 and H0: p = 0.5.
(d) n = 200 and H0: p = 0.01.
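One way to organize the check for 8.21 is to compute the expected counts under H0 for each case. This sketch assumes the rule used in the worked answers of this key, namely that n·p0 and n·(1 - p0) should both be at least 10; the function name and the threshold parameter are illustrative.

```python
def z_test_ok(n, p0, minimum=10):
    """Return True when both expected counts under H0 reach the threshold.

    The threshold of 10 is an assumption based on the checks used in the
    worked answers; other textbooks may use a different value.
    """
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return expected_successes >= minimum and expected_failures >= minimum

# Cases (a) through (d) as (n, p0) pairs.
cases = [(30, 0.2), (30, 0.6), (100, 0.5), (200, 0.01)]
results = [z_test_ok(n, p0) for n, p0 in cases]

for (n, p0), ok in zip(cases, results):
    print(f"n = {n}, p0 = {p0}: n*p0 = {n * p0:.1f}, n*(1-p0) = {n * (1 - p0):.1f}, use z test: {ok}")
```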
8.22 Instant versus fresh-brewed coffee. A matched pairs experiment compares the taste of instant
versus fresh-brewed coffee. Each subject tastes two unmarked cups of coffee, one of each type, in random
order and states which he or she prefers. Of the 40 subjects who participate in the study, 12 prefer the
instant coffee. Let p be the probability that a randomly chosen subject prefers fresh-brewed coffee to
instant coffee. (In practical terms, p is the proportion of the population who prefer fresh-brewed coffee.)
(a) Test the claim that a majority of people prefer the taste of fresh-brewed coffee. What are the null
hypothesis and the alternative hypothesis? Report the large-sample z statistic and its P-value. Verify that
the method you used to calculate the P-value is valid (why do we need to do this?).
Let X count the number of people out of 40 that select the fresh-brewed coffee. Now I want to show
that most people (over half) prefer fresh-brewed coffee. Thus, we can not assume that most people
prefer fresh-brewed coffee. We need a neutral assumption. A neutral assumption is that the
preference is the same for instant versus fresh-brewed.
Let us assume that p = 0.5, that is half of the population prefers fresh-brewed coffee over instant,
evenly split. The measurement will be those people who said that they preferred fresh-brewed. We
will try and see if there is any evidence that this proportion is more than 0.5, which would indicate
that most people prefer fresh brewed coffee.
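$\hat{p} = \frac{28}{40} = 0.7$
We observed 28 successes and 12 failures, and under H0 the expected counts 40(0.5) = 20 and
40(1 - 0.5) = 20 are both greater than 10, so we can use a normal approximation.
H0: p = 0.5, Ha: p > 0.5.
Test Statistic: $Z = \frac{0.7 - 0.5}{\sqrt{\frac{0.5(0.5)}{40}}} = 2.53$
P-value: $P(\hat{p} > 0.7) = P(Z > 2.53) = 0.0057$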
The p-value is 0.0057, thus the result is statistically significant, meaning we would see a proportion
as high as 0.7 or higher (further away from 0.5) when we assume 0.5 is the correct proportion
around 57 times out of 10,000 attempts. So our result is rare if 0.5 is true, thus, we wonder why we
would see such a value? Of course what we will be more likely to believe is that p is not 0.5 but
some other value that is actually larger than 0.5 and that is why we saw a sample proportion of 0.7.
We could say that our result is statistically significant at 1%.
(b) Draw a sketch of a standard Normal curve and mark the location of your z statistic. Shade the
appropriate area that corresponds to the P-value.
I put the actual binomial curve of the null situation so you can compare to our approximation
procedure.
(c) Is your result significant at the 5% level? What is your practical conclusion? What is the critical value?
If your result is significant at the 5% level, calculate a 95% confidence interval to estimate the value of the
parameter p using the same data set.
Yes, the result we found is statistically significant at 5%, since 0.0057 < 0.05.
The critical value is tied to the 5%. The z-score associated with the 5% is 1.645.
$0.5 + 1.645\sqrt{\frac{0.5(0.5)}{40}} = 0.6300$. The critical value is 0.6300.
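The critical value on the p̂ scale is the null value plus 1.645 null standard errors. A short sketch, assuming the one-sided 5% level and the table value z* = 1.645 (the variable names are illustrative):

```python
import math

n = 40
p0 = 0.5
z_star = 1.645  # upper 5% point of the standard Normal, from the table

# Standard error of p_hat under H0, then the cutoff on the p_hat scale.
se0 = math.sqrt(p0 * (1 - p0) / n)
critical_value = p0 + z_star * se0

p_hat = 28 / 40
reject_h0 = p_hat > critical_value
print(f"critical value = {critical_value:.4f}, p_hat = {p_hat:.2f}, reject H0: {reject_h0}")
```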
8.32 Sample size needed for an evaluation. You are planning an evaluation of a semester-long alcohol
awareness campaign at your college. Previous evaluations indicate that about 25% of the students
surveyed will respond "Yes" to the question "Did the campaign alter your behavior toward alcohol
consumption?" How large a sample of students should you take if you want the margin of error for 95%
confidence to be about 0.1?
The formula is $n = \left(\frac{z^*}{m}\right)^2 p^*(1 - p^*)$, but the issue is what to make p* equal to. Since a previous study
was conducted and the p-hat value was 25%, we could use 0.25 as the value of p*:
$\left(\frac{1.96}{0.1}\right)^2 (0.25)(0.75) \approx 72.03$, which rounds up to 73.
Another option is to go to the other extreme, which would be using 0.5 as p-star:
$\left(\frac{1.96}{0.1}\right)^2 (0.5)(0.5) \approx 96.04$, which rounds up to 97.
A last option is to choose a p-star value between 0.25 and 0.5 that we
would guess is closer to the actual population proportion (parameter) p.
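The sample size formula can be wrapped in a small helper that always rounds up. This is a sketch; the function name `sample_size` and the small tolerance used to guard against floating-point noise are illustrative choices, not part of the textbook formula.

```python
import math

def sample_size(margin, p_star, z_star=1.96):
    """Smallest n for which the margin of error is at most `margin`."""
    raw = (z_star / margin) ** 2 * p_star * (1 - p_star)
    # Subtract a tiny tolerance so that a value that is a whole number
    # up to floating-point rounding is not bumped to the next integer.
    return math.ceil(raw - 1e-9)

n_guess = sample_size(0.1, 0.25)        # using the earlier estimate p* = 0.25
n_conservative = sample_size(0.1, 0.5)  # using the conservative p* = 0.5
print(f"p* = 0.25: n = {n_guess}")
print(f"p* = 0.50: n = {n_conservative}")
```

Rounding up rather than to the nearest integer keeps the margin of error no larger than the target.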
8.33 Sample size needed for an evaluation, continued. The evaluation in the previous exercise will also
have questions that have not been asked before, so you do not have previous information about the
possible value of p. Repeat the calculation above for the following values of p*: 0.1, 0.3, 0.5, 0.7, and 0.9.
Summarize the results in a table and graphically. What sample size will you use?
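One possible way to build the table and a rough text plot for 8.33. This sketch assumes the same margin of error (0.1) and z* = 1.96 as in the previous exercise; the function name `sample_size` and the bar scaling are illustrative.

```python
import math

def sample_size(margin, p_star, z_star=1.96):
    """Smallest n for which the margin of error is at most `margin`."""
    raw = (z_star / margin) ** 2 * p_star * (1 - p_star)
    # Tiny tolerance guards against floating-point noise before rounding up.
    return math.ceil(raw - 1e-9)

p_stars = [0.1, 0.3, 0.5, 0.7, 0.9]
sizes = [sample_size(0.1, p) for p in p_stars]

print(" p*     n")
for p, n in zip(p_stars, sizes):
    # One '#' per 5 subjects gives a crude bar chart next to the table.
    print(f"{p:.1f}  {n:4d}  " + "#" * (n // 5))
```

The required n is largest at p* = 0.5 and symmetric around it, so choosing the size for p* = 0.5 covers every question regardless of the true p.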
8.34 Are the customers dissatisfied? An automobile manufacturer would like to know what proportion
of its customers are dissatisfied with the service received from their local dealer. The customer relations
department will survey a random sample of customers and compute a 95% confidence interval for the
proportion that are dissatisfied. From past studies, they believe that this proportion will be about 0.15.
Find the sample size needed if the margin of error of the confidence interval is to be no more than 0.02.
Using the guessed value: $\left(\frac{1.96}{0.02}\right)^2 (0.15)(0.85) \approx 1224.51$, which rounds up to 1225.
Using the conservative (extreme) value: $\left(\frac{1.96}{0.02}\right)^2 (0.5)(0.5) = 2401$.