AP Statistics

Inference for Proportions
Chapters 19-22
REVIEW
Name________________________________
Class period_____
1. Some defendants in criminal proceedings plead guilty and are sentenced without a trial, whereas others
plead innocent, are subsequently found guilty, and then are sentenced. In recent years legal scholars have
speculated as to whether the sentences of those who plead guilty differ in severity from the sentences of
those who plead innocent and are subsequently judged guilty. Consider the accompanying data on a group
of randomly selected defendants from San Francisco County, all of whom were accused of robbery and
had previous prison records. Test the claim that the sentences are different for defendants who plead
guilty. Use α = 0.01.
Plea                          Guilty        Not guilty
Number judged guilty          n1 = 191      n2 = 128
Number sentenced to prison    101           112
Sample proportion             p1 = .529     p2 = .875
2. There is a remedy for male pattern baldness---at least that’s what millions of males hope, since the
FDA approved Upjohn’s minoxidil for such treatment. Minoxidil was investigated in a large 27-center
study where patients were randomly assigned to receive topical minoxidil or an identical looking placebo.
Ignoring the center-to-center variation, suppose that the preliminary results were as follows. Test the
claim that minoxidil does make a difference in new hair growth. Assume all conditions are met. No need
to write them all out.
preparation administered    sample size    percent with new hair growth
minoxidil                   310            32
placebo                     309            20
3. In tests of a computer component, it is found that 7% of the components are defective. In a random
sample of 175 components, 18 were found to be defective. At the 5% level of significance,
does this sample show that significantly more defective components are being made?
4. A new variety of apple is intended to resist cedar apple rust. A horticulturist grows 50 trees of the new
strain and 100 of the parent strain under the same field conditions. At the end of three years, each tree is
checked by a researcher for serious apple rust infection. The results show that 22 trees of the new strain
and 61 of the parent strain are seriously infected. Give a 99% confidence interval for the difference
between the proportions of new strain versus parent strain trees with the infection. State what the
confidence interval means.
5. A New York Times poll on women’s issues interviewed 1025 women and 472 men randomly selected
from the United States excluding Alaska and Hawaii. The poll found that 47% of the women said they do
not get enough time for themselves.
a) The poll announced a margin of error of ± 3 percentage points for 95% confidence in conclusions
about women. Explain to someone who knows no statistics why we can’t just say that 47% of all adult
women do not get enough time for themselves.
b) Explain clearly what “95% confidence” means.
c) The margin of error for results concerning men was ± 4 percentage points. Why is this larger than the
margin of error for women?
6. Which example would give a wider confidence interval in each case below?
a) 95% confidence with n = 25 or 95% confidence with n = 75?
b) 99% confidence with n = 50 or 90% confidence with n = 50?
7. A young dairy farmer believes he can compete with Amy’s Ice Cream and hopes to open an ice cream
shop in the same vicinity. He will close the deal if his ice cream is preferred by more than 40% of the
local residents. His statistical analysis uses the hypotheses: Ho: P = .4 against Ha: P > .4. Describe in
words what a Type I error would be.
8. Describe in words what a Type II error would be for the problem above.
9. The dairy farmer above completes his test using α = .05 and gets a p-value of 0.067. His value for
β = .16. What is the power of his test?
10. Twelve of 200 randomly selected shark attacks occurred in deep water. Construct the 98%
confidence interval for the true proportion of shark attacks that occur in deep water.
11. Among 80 fish caught in a certain lake, 28 were inedible as a result of the chemical pollution of their
environment. Estimate the true population proportion with a 99% confidence interval.
12. An automobile insurance company wants to determine from a sample what proportion of its
thousands of policyholders intend to buy a new car within the next twelve months. How large a sample
will they need to be able to assert with a probability of at least 0.95 that the sample proportion will differ
from the true proportion by less than 0.06?
13. Smeltzer finds that about 89% of her statistics students turn in homework on any given assignment.
Her conclusion was based on a random sample of 180 student assignments and has a margin of error of
4.2%. What level of confidence did she use?
14. You want to estimate the proportion of RRISD students who live in a household with two parents.
You decide that you should be 98% confident that your estimate is within ± 3% of the true proportion.
How many students must you survey?
15. If a confidence interval is (.532 , .628), what is the margin of error?
16. When completing a test of significance, what does the P-value mean?
17. A 90% confidence interval for the proportion of cheese purchases that are cheddar is (.793 , .851). If
the store owner claims the proportion of cheddar cheese sales is .86 and runs a significance test at the 5%
level, what can you expect?
18. What is z* for a 70% confidence interval?
19. M&Ms are packaged to contain 20% orange candies. To test this, 32 randomly selected packages are
opened and the proportion of orange M&Ms is calculated. Ho: P = .2 is tested against Ha: P ≠ .2 at the
2% level.
a) What is the corresponding confidence level for this test?
b) If the corresponding confidence interval is (.185 , .245), would you reject or fail to reject Ho? Why?
20. A confidence interval is found to be (.756, .844) What is the sample proportion?
21. If a result is statistically significant (reject Ho) at the 10% level, is it always significant at the 5%
level? Is it significant at the 20% level?
22. What is the margin of error?
23. What is the critical value of a proportion test at the α = .12 level if
a) Ho: P = .4 and Ha: P > .4.
b) Ho: P = .4 and Ha: P < .4.
c) Ho: P = .4 and Ha: P ≠ .4.
24. If x1 = 54, n1 = 95, x2 = 71, n2 = 103, find the pooled p̂.
25. Researchers at a large health maintenance organization (HMO) are planning a study of a certain mild
illness. They will select a random sample of patients who are ages 35 to 54 and see if they contract the
illness in the next year. The researchers are interested in estimating the proportion of men and of women
who are likely to develop the illness in each of 4 age groups: 35-39, 40-44, 45-49, and 50-54.
The researchers plan to include 2,000 patients in the study. Suppose the researchers draw a random
sample from all patients at this HMO who are ages 35-54 and find the following numbers within each
gender and age-group. (An actual AP question!)
Age-Groups    Male    Female
35-39         350     445
40-44         230     370
45-49         150     245
50-54         60      150
a) Suppose that at the end of the study, 10 percent of the females in the 40-44 age-group contracted the
illness. Calculate a 95% confidence interval to estimate the population proportion of females in this
age-group that contracted the illness.
Interpret this confidence interval in the context of this situation.
Interpret the confidence level of 95 percent.
b) Suppose that at the end of the study, 10 percent of the males in the 40-44 age-group contracted the
illness. The corresponding 95 percent confidence interval to estimate the population proportion of males
in this age-group that contracted the illness is (0.061 , 0.139).
Note that this interval and the interval in part (a) are of different lengths even though the two sample
proportions were identical. What would be an alternative way to allocate a sample of 2,000 subjects so
that the 95 percent confidence interval widths for all male age-groups and for all female age-groups
(i.e. for all 8 groups) would be the same when the sample proportions are the same. Justify your answer.
c) Based on previous studies, researchers believe that the percentages of those who contract the illness
will be similar for males and females, and therefore plan to ignore gender when selecting a sample for this
study. Previous studies also indicate that the percentage of adults who will contract this illness in the
35-39, 40-44, 45-49, and 50-54 age-groups are anticipated to be 5%, 8%, 20%, and 35%, respectively. How
should the sample of 2,000 subjects be allocated with respect to age-groups so that the widths of the 95%
confidence intervals for the four groups will be approximately the same? Justify your answer.
Answers
1. P1 = proportion of defendants who plead guilty and are sent to prison; P2 = proportion of defendants
who plead not guilty, are found guilty, and are sent to prison; Ho: P1 = P2; Ha: P1 ≠ P2; Use a
2-proportion z-test; Critical values: −2.575, 2.575; Conditions: random samples from the population of
defendants (given); independence is reasonable since one defendant’s plea should not affect another’s OR
we can assume there are more than 1,910 (191 × 10) defendants who plead guilty and more than 1,280
defendants who plead not guilty over the course of time; large enough samples:
n1p1 = (191)(.529) = 101.039 > 10; n1(1 − p1) = (191)(1 − .529) = 89.961 > 10;
n2p2 = (128)(.875) = 112 > 10; n2(1 − p2) = (128)(1 − .875) = 16 > 10;
Test statistic: z = −6.434, p-value ≈ 0; reject Ho since p-value < α; The data support the claim that
sentences are different for defendants who plead guilty. Pleading guilty appears to lighten the sentence.
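The test statistic in answer 1 can be checked with a short script. This is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the function name `two_proportion_z_test` is illustrative and not part of the original worksheet.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test of Ho: P1 = P2 against Ha: P1 != P2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # pooled sample proportion
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p_value


# Problem 1: 101 of 191 (guilty plea) vs. 112 of 128 (not-guilty plea)
z, p_value = two_proportion_z_test(101, 191, 112, 128)
print(f"z = {z:.3f}, p-value = {p_value:.3g}")
```

The same function applies to the minoxidil data in problem 2 by calling `two_proportion_z_test(99, 310, 62, 309)`.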
2. x1 = 99, x2 = 62, P1 = proportion of minoxidil users with new hair; P2 = proportion of placebo users
with new hair; Ho: P1 = P2; Ha: P1 ≠ P2; Stated all conditions met. Test statistic: z = 3.366,
p-value = .000762; reject Ho: P1 = P2 since p-value < α; The data support the claim that minoxidil does
make a difference. It appears that more new hair growth results from use of this product.
3. population characteristic: P = proportion of defective components; Ho: P = .07; Ha: P > .07;
conditions: random sample of components (given); large enough sample since np = (175)(.07) = 12.25 > 10
and n(1 − p) = (175)(1 − .07) = 162.75 > 10; Independence is reasonable since one component being
defective shouldn’t affect another component. z = 1.70, p-value = .0442; Reject Ho since the p-value < α;
There is sufficient evidence at the 5% level to claim that more defective components are being made. Stop
production and fix the problem.
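The calculation behind answer 3 can be sketched as follows. This assumes Python 3.8+ and the standard-library `statistics.NormalDist`; the function name `one_proportion_z_test` is hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(x, n, p0):
    """One-sided z-test of Ho: P = p0 against Ha: P > p0."""
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)   # standard error uses the hypothesized p0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # upper-tail area
    return z, p_value


# Problem 3: 18 defective components out of 175, hypothesized rate 7%
z, p_value = one_proportion_z_test(18, 175, 0.07)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```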
4. Use a two proportion z-interval. Conditions: Randomness and independence are not known and could
make our results suspect. Large enough samples condition is met since n1p1 = 50(.44) = 22 > 10;
n1(1 − p1) = 50(.56) = 28 > 10; n2p2 = 100(.61) = 61 > 10; n2(1 − p2) = 100(.39) = 39 > 10.
Interval: (−.3902, .0501). We are 99% confident that the difference between the proportions of new strain
versus parent strain trees with the infection is between −.3902 and .0501. Since zero is part of the
interval, there appears to be no significant difference between the two strains of apple trees.
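The interval in answer 4 can be reproduced with the sketch below. It assumes Python 3.8+ and the standard-library `statistics.NormalDist`; the function name `two_proportion_z_interval` is illustrative only. Unlike the pooled test, the interval uses the unpooled standard error.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_interval(x1, n1, x2, n2, confidence):
    """Confidence interval for P1 - P2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Problem 4: 22 of 50 new-strain trees vs. 61 of 100 parent-strain trees
low, high = two_proportion_z_interval(22, 50, 61, 100, 0.99)
print(f"({low:.4f}, {high:.4f})")
```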
5. a) A random sample gives a good estimate of the population but because the entire population was not
surveyed, there is a margin of error that occurs due to sampling error. Different samples will give a
different estimate.
b) If all possible samples of the same size were taken, 95% of the resulting confidence intervals would
contain the true population proportion. We are 95% confident that our interval has captured the true
population proportion.
c) Fewer men were surveyed, so the margin of error is larger.
6.a) 95% confidence with n=25
b) 99% confidence with n = 50
7. He mistakenly concludes that more than 40% of the residents prefer his ice cream when, in reality, 40%
or fewer prefer it. He would open his shop and possibly lose money.
8. He mistakenly finds that only 40% or less of the residents prefer his ice cream when really more than
40% prefer it. He would then choose not to open his shop when he could have made a go of it. Yikes!
He’d lose out on potential money. Best not to make a mistake when starting a small business.
9. power of test = 0.84 ; The probability of correctly rejecting Ho when it is not true is .84.
10. Use a one proportion z-interval. Conditions: SRS from the population of shark attacks (given);
approximately normal distribution of p̂ since the sample is large enough, as shown by
np̂ = (200)(.06) = 12 ≥ 10 and n(1 − p̂) = (200)(.94) = 188 ≥ 10. Independence is reasonable since one
shark attack shouldn’t affect another, unless, of course, it’s the same pesky shark.
Interval: (.021, .099). Writing out the formula is required on the test.
We are 98% confident that the true proportion of shark attacks that occur in deep water is between
.021 and .099.
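The one proportion z-interval used in answer 10 can be sketched as follows, assuming Python 3.8+ and the standard-library `statistics.NormalDist`. The function name `one_proportion_z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(x, n, confidence):
    """Confidence interval p_hat +/- z* sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Problem 10: 12 of 200 shark attacks in deep water, 98% confidence
low, high = one_proportion_z_interval(12, 200, 0.98)
print(f"({low:.3f}, {high:.3f})")
```

Calling `one_proportion_z_interval(28, 80, 0.99)` handles the fish data in problem 11 in the same way.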
11. Use a one proportion z-interval. Conditions: Random selection is not discussed and could be a
problem. Maybe the toxic fish were easier to catch. Independence? We’re not sure here either. We’ll
assume both and carry on. Large enough sample for an approximately normal distribution of p̂ since
np̂ = (80)(.35) = 28 ≥ 10 and n(1 − p̂) = (80)(.65) = 52 ≥ 10.
Interval: $.35 \pm 2.576\sqrt{\dfrac{.35(1 - .35)}{80}} = (.213,\ .487)$
We are 99% confident the true proportion of inedible fish in the lake due to chemical pollution is between
.213 and .487.
12. n=267
13. About 93% (z* = .042/.0233 ≈ 1.80, which corresponds to a central area of about .928)
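The confidence level in problem 13 can be recovered by solving the margin-of-error formula for z* and converting z* to a central area. This is a minimal sketch, assuming Python 3.8+ and `statistics.NormalDist`; the function name `implied_confidence_level` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def implied_confidence_level(p_hat, n, margin):
    """Solve margin = z* sqrt(p_hat (1 - p_hat) / n) for z* and the level."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    z_star = margin / se
    level = 2 * NormalDist().cdf(z_star) - 1  # central area between -z* and z*
    return z_star, level


# Problem 13: p_hat = .89, n = 180, margin of error = .042
z_star, level = implied_confidence_level(0.89, 180, 0.042)
print(f"z* = {z_star:.2f}, confidence level = {level:.3f}")
```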
14. 1509
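The sample sizes in answers 12 and 14 follow from n = (z*/m)² p(1 − p) with the conservative guess p = .5, rounded up. The sketch below assumes Python 3.8+ and `statistics.NormalDist`; the function name `sample_size` and its optional `z_star` parameter are hypothetical. The key's 1509 for problem 14 appears to come from the table value z* = 2.33, so the sketch allows a table value to be passed in.

```python
from math import ceil
from statistics import NormalDist


def sample_size(margin, confidence=None, z_star=None, p=0.5):
    """Smallest n with z* sqrt(p (1 - p) / n) <= margin (p = .5 is conservative)."""
    if z_star is None:
        z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star / margin) ** 2 * p * (1 - p))


# Problem 12: margin .06 at 95% confidence
print(sample_size(0.06, confidence=0.95))
# Problem 14: margin .03 at 98% confidence, using the table value z* = 2.33
print(sample_size(0.03, z_star=2.33))
```

With the unrounded critical value for 98% confidence (about 2.326) the required sample is a few students smaller, so small differences from the key are a rounding effect rather than an error.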
15. .048
16. It is the probability of getting sample results at least as extreme as those observed, by chance alone,
if Ho is true.
17. Since .86 is not in the corresponding confidence interval, we can expect that the true proportion is not
as high as .86 and that we would reject the owner’s claim of a higher proportion. It appears that less than
86% of the cheese sales are cheddar since the entire interval is below .86.
18. 1.04
19. a) 98% since the test is 2-tailed
b) fail to reject Ho since 0.2 is included in the interval.
20. .8
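Answers 15 and 20 both come from the fact that a confidence interval for a proportion is centered at the sample proportion, with the margin of error equal to half the width. A minimal sketch in Python follows; the function name `center_and_margin` is illustrative.

```python
def center_and_margin(lower, upper):
    """Return the sample proportion (midpoint) and margin of error (half-width)."""
    center = (lower + upper) / 2
    margin = (upper - lower) / 2
    return center, margin


# Problem 15: interval (.532, .628); problem 20: interval (.756, .844)
for lower, upper in [(0.532, 0.628), (0.756, 0.844)]:
    center, margin = center_and_margin(lower, upper)
    print(f"center = {center:.3f}, margin of error = {margin:.3f}")
```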
21. If α = .10 then the test may not be significant at the α = .05 level since .05 < .10 so more proof is
needed. But the results will always be significant at the α = .20 level since .20 > .10.
22. The margin of error occurs since we are taking a sample to represent an entire population. Different
samples produce different results. The margin of error tells us that the population parameter is likely
to be no more than that far away from our sample statistic.
23. a) 1.175
b) −1.175
c) −1.555, 1.555
24. 0.6313 (sum of two x values divided by sum of two n values)
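The critical values in answer 23 and the pooled proportion in answer 24 can be checked with the sketch below, which assumes Python 3.8+ and `statistics.NormalDist`. The variable names are illustrative only.

```python
from statistics import NormalDist

alpha = 0.12
std_normal = NormalDist()

# One-tailed tests put all of alpha in one tail; a two-tailed test splits it.
upper_tail = std_normal.inv_cdf(1 - alpha)      # Ha: P > .4
lower_tail = std_normal.inv_cdf(alpha)          # Ha: P < .4
two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # Ha: P != .4, use +/- this value

# Pooled proportion: total successes divided by total sample size.
x1, n1, x2, n2 = 54, 95, 71, 103
pooled = (x1 + x2) / (n1 + n2)

print(f"{upper_tail:.3f}  {lower_tail:.3f}  +/-{two_tailed:.3f}  pooled = {pooled:.4f}")
```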
25. a) Use a one proportion z-interval.
Conditions: Random sample of patients from the population of all patients discussed.
Large enough sample since np̂ = 370(.1) = 37 > 10 and n(1 − p̂) = 370(.9) = 333 > 10
$.1 \pm 1.96\sqrt{\dfrac{.1(.9)}{370}} = (.0694,\ .1306)$
We can be 95% confident that the true proportion of this HMO’s 40 to 44 year old female patients
who contract the disease is between 0.07 and 0.13.
Confidence level: Ninety-five percent of all possible random samples of size 370 from this population
will result in a confidence interval that includes the true proportion of this HMO’s 40 to 44 year old
female patients who contracted the disease.
b) The width of the interval is determined by the magnitude of $\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$,
which depends on $\hat{p}$ and n. If the sample proportions are equal, then the confidence interval widths
will be the same if the sample sizes are the same for all 8 age-gender groups. Thus, we need to take
random samples of size 250 from each of the 8 groups.
c) The width of the interval is proportional to $\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$. The interval
widths for all of the groups will be the same if $\dfrac{\hat{p}(1-\hat{p})}{n}$ is the same for each
group. This will happen when the sample size is proportional to $\hat{p}(1-\hat{p})$ and
n1 + n2 + n3 + n4 = 2000.
$\sum \hat{p}(1-\hat{p}) = (.05)(.95) + (.08)(.92) + (.2)(.8) + (.35)(.65) = .0475 + .0736 + .1600 + .2275 = .5086$
We want
$n_1 = \left(\dfrac{.0475}{.5086}\right)2000 = 186.79$
$n_2 = \left(\dfrac{.0736}{.5086}\right)2000 = 289.42$
$n_3 = \left(\dfrac{.1600}{.5086}\right)2000 = 629.18$
$n_4 = \left(\dfrac{.2275}{.5086}\right)2000 = 894.61$
So n1 = 187, n2 = 289 , n3 = 629 , and n4 = 895
(No! I would not expect you to get 25b or 25c correct for your test. Just thought you’d like to see an
example of an investigative task from an AP test. I’m hoping you can wade through the suggested answer
and make some sense of it.)
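The allocation in 25c can be checked numerically. The sketch below is a minimal illustration in Python; the names `anticipated`, `weights`, and `allocation` are hypothetical. It splits the 2,000 subjects in proportion to p̂(1 − p̂) and then confirms that p̂(1 − p̂)/n, and therefore the interval width, is nearly the same for every age-group.

```python
from math import sqrt

total = 2000
# Anticipated proportions for the 35-39, 40-44, 45-49, and 50-54 age-groups
anticipated = [0.05, 0.08, 0.20, 0.35]

# Sample size proportional to p(1 - p) keeps p(1 - p)/n constant across groups.
weights = [p * (1 - p) for p in anticipated]
allocation = [round(total * w / sum(weights)) for w in weights]
print(allocation, sum(allocation))

# Half-width of each 95% interval under this allocation (z* = 1.96)
half_widths = [1.96 * sqrt(p * (1 - p) / n) for p, n in zip(anticipated, allocation)]
print([f"{h:.4f}" for h in half_widths])
```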