Inference for Proportions
Chapters 19-22 REVIEW
Name________________________________ Class period_____

1. Some defendants in criminal proceedings plead guilty and are sentenced without a trial, whereas others plead innocent, are subsequently found guilty, and then are sentenced. In recent years legal scholars have speculated as to whether the sentences of those who plead guilty differ in severity from the sentences of those who plead innocent and are subsequently judged guilty. Consider the accompanying data on a group of randomly selected defendants from San Francisco County, all of whom were accused of robbery and had previous prison records. Test the claim that the sentences are different for defendants who plead guilty. Use α = 0.01.

Plea                          Guilty       Not guilty
Number judged guilty          n1 = 191     n2 = 128
Number sentenced to prison    101          112
Sample proportion             p1 = .529    p2 = .875

2. There is a remedy for male pattern baldness, or at least that’s what millions of males hope, since the FDA approved Upjohn’s minoxidil for such treatment. Minoxidil was investigated in a large 27-center study where patients were randomly assigned to receive topical minoxidil or an identical-looking placebo. Ignoring the center-to-center variation, suppose that the preliminary results were as follows. Test the claim that minoxidil does make a difference in new hair growth. Assume all conditions are met. No need to write them all out.

preparation administered    sample size    percent with new hair growth
minoxidil                   310            32
placebo                     309            20

3. In tests of a computer component, it is found that 7% of the components are defective. In a random sample of 175 components, 18 were found to be defective. At the 5% level of significance, does this sample show that significantly more defective components are being made?

4. A new variety of apple is intended to resist cedar apple rust. A horticulturist grows 50 trees of the new strain and 100 of the parent strain under the same field conditions.
At the end of three years, each tree is checked by a researcher for serious apple rust infection. The results show that 22 trees of the new strain and 61 of the parent strain are seriously infected. Give a 99% confidence interval for the difference between the proportions of new strain versus parent strain trees with the infection. State what the confidence interval means.

5. A New York Times poll on women’s issues interviewed 1025 women and 472 men randomly selected from the United States excluding Alaska and Hawaii. The poll found that 47% of the women said they do not get enough time for themselves.
a) The poll announced a margin of error of 3 percentage points for 95% confidence in conclusions about women. Explain to someone who knows no statistics why we can’t just say that 47% of all adult women do not get enough time for themselves.
b) Explain clearly what “95% confidence” means.
c) The margin of error for results concerning men was 4 percentage points. Why is this larger than the margin of error for women?

6. Which example would give a longer confidence interval in each case below?
a) 95% confidence with n = 25 or 95% confidence with n = 75?
b) 99% confidence with n = 50 or 90% confidence with n = 50?

7. A young dairy farmer believes he can compete with Amy’s Ice Cream and hopes to open an ice cream shop in the same vicinity. He will close the deal if his ice cream is preferred by more than 40% of the local residents. His statistical analysis uses the hypotheses Ho: P = .4 against Ha: P > .4. Describe in words what a Type I error would be.

8. Describe in words what a Type II error would be for the problem above.

9. The dairy farmer above completes his test using α = .05 and gets a p-value of 0.067. His value for β is .16. What is the power of his test?

10. Twelve of 200 randomly selected shark attacks occurred in deep water. Construct the 98% confidence interval for the true proportion of shark attacks that occur in deep water.

11.
Among 80 fish caught in a certain lake, 28 were inedible as a result of the chemical pollution of their environment. Estimate the true population proportion with a 99% confidence interval.

12. An automobile insurance company wants to determine from a sample what proportion of its thousands of policyholders intend to buy a new car within the next twelve months. How large a sample will they need to be able to assert with a probability of at least 0.95 that the sample proportion will differ from the true proportion by less than 0.06?

13. Smeltzer finds that about 89% of her statistics students turn in homework on any given assignment. Her conclusion was based on a random sample of 180 student assignments and has a margin of error of 4.2%. What level of confidence did she use?

14. You want to estimate the proportion of RRISD students who live in a household with two parents. You decide that you should be 98% confident that your estimate is within 3% of the true proportion. How many students must you survey?

15. If a confidence interval is (.532, .628), what is the margin of error?

16. When completing a test of significance, what does the P-value mean?

17. A 90% confidence interval for the proportion of cheese purchases that are cheddar is (.793, .851). If the store owner claims the proportion of cheddar cheese sales is .86 and runs a significance test at the 5% level, what can you expect?

18. What is z* for a 70% confidence interval?

19. M&Ms are packaged to contain 20% orange candies. To test this, 32 randomly selected packages are opened and the proportion of orange M&Ms is calculated. Ho: P = .2 is tested against Ha: P ≠ .2 at the 2% level.
a) What is the corresponding confidence level for this test?
b) If the corresponding confidence interval is (.185, .245), would you reject or fail to reject Ho? Why?

20. A confidence interval is found to be (.756, .844). What is the sample proportion?

21.
If a result is statistically significant (reject Ho) at the 10% level, is it always significant at the 5% level? Is it significant at the 20% level?

22. What is the margin of error?

23. What is the critical value of a proportion test at the α = .12 level if
a) Ho: P = .4 and Ha: P > .4?
b) Ho: P = .4 and Ha: P < .4?
c) Ho: P = .4 and Ha: P ≠ .4?

24. If x1 = 54, n1 = 95, x2 = 71, n2 = 103, find the pooled p̂.

25. Researchers at a large health maintenance organization (HMO) are planning a study of a certain mild illness. They will select a random sample of patients who are ages 35 to 54 and see if they contract the illness in the next year. The researchers are interested in estimating the proportion of men and of women who are likely to develop the illness in each of 4 age-groups: 35-39, 40-44, 45-49, and 50-54. The researchers plan to include 2,000 patients in the study. Suppose the researchers draw a random sample from all patients at this HMO who are ages 35-54 and find the following numbers within each gender and age-group. (An actual AP question!)

Age-Group    Male    Female
35-39        350     445
40-44        230     370
45-49        150     245
50-54        60      150

a) Suppose that at the end of the study, 10 percent of the females in the 40-44 age-group contracted the illness. Calculate a 95% confidence interval to estimate the population proportion of females in this age-group that contracted the illness. Interpret this confidence interval in the context of this situation. Interpret the confidence level of 95 percent.

b) Suppose that at the end of the study, 10 percent of the males in the 40-44 age-group contracted the illness. The corresponding 95 percent confidence interval to estimate the population proportion of males in this age-group that contracted the illness is (0.061, 0.139). Note that this interval and the interval in part (a) are of different lengths even though the two sample proportions were identical.
What would be an alternative way to allocate a sample of 2,000 subjects so that the 95 percent confidence interval widths for all male age-groups and for all female age-groups (i.e. for all 8 groups) would be the same when the sample proportions are the same? Justify your answer.

c) Based on previous studies, researchers believe that the percentages of those who contract the illness will be similar for males and females, and therefore plan to ignore gender when selecting a sample for this study. Previous studies also indicate that the percentages of adults who will contract this illness in the 35-39, 40-44, 45-49, and 50-54 age-groups are anticipated to be 5%, 8%, 20%, and 35%, respectively. How should the sample of 2,000 subjects be allocated with respect to age-groups so that the widths of the 95% confidence intervals for the four groups will be approximately the same? Justify your answer.

Answers

1. P1 = proportion of defendants sent to prison who plead guilty; P2 = proportion of defendants sent to prison who plead not guilty but were found guilty; Ho: P1 = P2; Ha: P1 ≠ P2; Use a 2 proportion z-test; Critical values: −2.575, 2.575; Conditions: random samples from the population of defendants given; independence is reasonable since one defendant’s plea should not affect another’s OR we can assume there are more than 1,910 (191 x 10) defendants who plead guilty and more than 1,280 defendants who plead not guilty over the course of time; large enough samples: n1p1 = (191)(.529) = 101.039 > 10; n1(1 − p1) = (191)(1 − .529) = 89.961 > 10; n2p2 = (128)(.875) = 112 > 10; n2(1 − p2) = (128)(1 − .875) = 16 > 10; Test statistic: z = −6.434, p-value ≈ 0; reject Ho since p-value < α; The data support the claim that sentences are different for defendants who plead guilty. Pleading guilty appears to lighten the sentence.

2. x1 = 99, x2 = 62, P1 = proportion of minoxidil users with new hair; P2 = proportion of placebo users with new hair; Ho: P1 = P2; Ha: P1 ≠ P2; Stated all conditions met.
Test statistic: z = 3.366, p-value = .000762; reject Ho: P1 = P2 since p-value < α; The data support the claim that minoxidil does make a difference. It appears that more new hair growth results from use of this product.

3. Population characteristic: P = proportion of defective components. Ho: P = .07; Ha: P > .07; Conditions: random sample of components (given); large enough sample since np = (175)(.07) = 12.25 > 10 and n(1 − p) = (175)(1 − .07) = 162.75 > 10; independence is reasonable since one component being defective shouldn’t affect another component. z = 1.70, p-value = .0442; Reject Ho since the p-value < α; There is sufficient evidence at the 5% level to claim that more defective components are being made. Stop production and fix the problem.

4. Use a two proportion z-interval. Conditions: Randomness and independence are not known and could make our results suspect. Large enough samples condition is met since n1p1 = 50(.44) = 22 > 10; n1(1 − p1) = 50(.56) = 28 > 10; n2p2 = 100(.61) = 61 > 10; n2(1 − p2) = 100(.39) = 39 > 10. Interval: (−.3902, .0501). We are 99% confident that the difference between the proportions of new strain versus parent strain trees with the infection is between −.3902 and .0501. Since zero is part of the interval, there appears to be no significant difference between the two strains of apple trees.

5. a) A random sample gives a good estimate of the population, but because the entire population was not surveyed, there is a margin of error that occurs due to sampling error. Different samples will give different estimates.
b) If all possible samples of the same size were taken, 95% of the resulting confidence intervals would contain the true population proportion. We are 95% sure that we have captured the true population proportion.
c) Fewer men were surveyed, so the margin of error increases.

6. a) 95% confidence with n = 25 b) 99% confidence with n = 50

7.
He mistakenly believes that more than 40% of the residents prefer his ice cream when, in reality, only 40% (or less) prefer his ice cream. He would open his shop and possibly lose money.

8. He mistakenly finds that only 40% or less of the residents prefer his ice cream when really more than 40% prefer it. He would then choose not to open his shop when he could have made a go of it. Yikes! He’d lose out on potential money. Best not to make a mistake when starting a small business.

9. power of test = 1 − β = 1 − .16 = 0.84; The probability of correctly rejecting Ho when it is not true is .84.

10. Use a one proportion z-interval. Conditions: SRS from the population of shark attacks given; approximately normal distribution of p̂ since the sample is large enough as shown by np̂ = (200)(.06) = 12 ≥ 10 and n(1 − p̂) = (200)(.94) = 188 ≥ 10. Independence is reasonable since one shark attack shouldn’t affect another, unless, of course, it’s the same pesky shark. Interval: (.021, .099). Writing out the formula is required on the test. We are 98% confident that the true proportion of shark attacks that occur in deep water is between .021 and .099.

11. Use a one proportion z-interval. Conditions: Random selection is not discussed and could be a problem. Maybe the toxic fish were easier to catch. Independence? We’re not sure here either. We’ll assume both and carry on. Large enough sample for an approximately normal distribution of p̂ since np̂ = (80)(.35) = 28 ≥ 10 and n(1 − p̂) = (80)(.65) = 52 ≥ 10. Interval: .35 ± 2.576 √(.35(1 − .35)/80) = (.213, .487). We are 99% confident the true proportion of inedible fish in the lake due to chemical pollution is between .213 and .487.

12. n = 267

13. 93% (z* = .042/√(.89(.11)/180) ≈ 1.80, which gives about 92.8% confidence)

14. 1509 (using z* = 2.33)

15. .048

16. It is the probability that a sample would give results at least as extreme as those observed by chance alone if Ho is true.

17. Since .86 is not in the corresponding confidence interval, we can expect that the true proportion is not as high as .86 and that we would reject the owner’s claim of a higher proportion.
It appears that less than 86% of the cheese sales are cheddar since the entire interval is below .86.

18. 1.04

19. a) 98% since the test is 2-tailed b) fail to reject Ho since 0.2 is included in the interval.

20. .8

21. If α = .10 then the test may not be significant at the α = .05 level since .05 < .10, so more proof is needed. But the results will always be significant at the α = .20 level since .20 > .10.

22. The margin of error occurs since we are taking a sample to represent an entire population. Different samples produce different results. The margin of error tells us that the population parameter is likely to be no more than that far away from our sample statistic.

23. a) 1.175 b) −1.175 c) −1.555, 1.555

24. 0.6313 (sum of two x values divided by sum of two n values)

25. a) Use a one proportion z-interval. Conditions: Random sample of patients from the population of all patients discussed. Large enough sample since np̂ = 370(.1) = 37 > 10 and n(1 − p̂) = 370(.9) = 333 > 10. Interval: .1 ± 1.96 √(.1(.9)/370) = (.0694, .1306). We can be 95% confident that the true proportion of this HMO’s 40 to 44 year old female patients who contract the disease is between 0.07 and 0.13. Confidence level: Ninety-five percent of all possible random samples of size 370 from this population will result in a confidence interval that includes the true proportion of this HMO’s 40 to 44 year old female patients who contracted the disease.

b) The width of the interval is determined by the magnitude of √(p̂(1 − p̂)/n), which depends on p̂ and n. If the sample proportions are equal, then the confidence interval widths will be the same if the sample sizes are the same for all 8 age-gender groups. Thus, we need to take random samples of size 250 from each of the 8 groups.

c) The width of the interval is proportional to √(p̂(1 − p̂)/n). The interval widths for all of the groups will be the same if p̂(1 − p̂)/n is the same for each group. This will happen when the sample size is proportional to p̂(1 − p̂) and n1 + n2 + n3 + n4 = 2000.
Σ p̂(1 − p̂) = (.05)(.95) + (.08)(.92) + (.2)(.8) + (.35)(.65) = .0475 + .0736 + .1600 + .2275 = .5086, so we want

n1 = (.0475/.5086)(2000) = 186.79
n2 = (.0736/.5086)(2000) = 289.42
n3 = (.1600/.5086)(2000) = 629.18
n4 = (.2275/.5086)(2000) = 894.61

So n1 = 187, n2 = 289, n3 = 629, and n4 = 895.

(No! I would not expect you to get 25b or 25c correct for your test. Just thought you’d like to see an example of an investigative task from an AP test. I’m hoping you can wade through the suggested answer and make some sense of it.)
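
Checking answers by computer: the sketch below is one possible way to re-check a few of the answers above (problems 1, 10, 12, and 25c) using only Python's standard library. The function names are made up for this illustration and are not part of the course materials. It assumes the usual large-sample formulas from these chapters: a pooled two-proportion z-test, a one-proportion z-interval, the sample-size formula n = (z*/ME)² p(1 − p) with p = .5 as the conservative guess, and an allocation proportional to p(1 − p).

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def z_star(conf):
    """Critical value z* for a two-sided confidence level such as 0.95."""
    return Z.inv_cdf(1 - (1 - conf) / 2)


def two_prop_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - Z.cdf(abs(z)))


def one_prop_z_interval(x, n, conf):
    """One-proportion z-interval; returns (lower, upper)."""
    p_hat = x / n
    me = z_star(conf) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me


def sample_size(me, conf, p=0.5):
    """Smallest n giving margin of error at most `me` (p = 0.5 is conservative)."""
    return ceil((z_star(conf) / me) ** 2 * p * (1 - p))


def allocate(total, props):
    """Split `total` subjects in proportion to p(1 - p), as in answer 25c."""
    weights = [p * (1 - p) for p in props]
    return [round(total * w / sum(weights)) for w in weights]


# Problem 1: guilty plea vs. not guilty plea
print(two_prop_z_test(101, 191, 112, 128))
# Problem 10: shark attacks in deep water, 98% confidence
print(one_prop_z_interval(12, 200, 0.98))
# Problem 12: sample size for margin of error 0.06 at 95% confidence
print(sample_size(0.06, 0.95))
# Problem 25c: allocation across the four age-groups
print(allocate(2000, [0.05, 0.08, 0.20, 0.35]))
```

Because `z_star` uses the exact normal critical value rather than a rounded table value, sample-size answers can differ by a few subjects from answers worked with a rounded z*.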