Test 9B          AP Statistics          Name:

Part 1: Multiple Choice. Circle the letter corresponding to the best answer.

1. A significance test was performed to test the null hypothesis H0: p = 0.5 versus the alternative Ha: p > 0.5. The test statistic is z = 1.40. Which of the following is closest to the P-value for this test?
(a) 0.0808
(b) 0.1492
(c) 0.1616
(d) 0.2984
(e) 0.9192

2. The mean time it takes for a person to experience pain relief from aspirin is 25 minutes. A new ingredient is added to help speed up relief. Let µ denote the mean time to obtain pain relief with the new product. An experiment is conducted to verify if the new product works more quickly. What are the null and alternative hypotheses for the appropriate test of significance?
(a) H0: µ = 25 vs. Ha: µ ≠ 25
(b) H0: µ = 25 vs. Ha: µ < 25
(c) H0: µ < 25 vs. Ha: µ = 25
(d) H0: µ < 25 vs. Ha: µ > 25
(e) H0: µ = 25 vs. Ha: µ > 25

3. A test of H0: µ = 60 versus Ha: µ ≠ 60 produces a sample mean of x̄ = 58 and a P-value of 0.04. At an α = 0.05 level, which of the following is an appropriate conclusion?
(a) There is sufficient evidence to conclude that µ < 60.
(b) There is sufficient evidence to conclude that µ = 60.
(c) There is insufficient evidence to conclude that µ = 60.
(d) There is insufficient evidence to conclude that µ ≠ 60.
(e) There is sufficient evidence to conclude that µ ≠ 60.

4. Because t procedures are robust, the most important condition for their use is
(a) the population standard deviation is known.
(b) the population distribution is approximately Normal.
(c) the data can be regarded as a random sample from the population.
(d) np and n(1 − p) are both at least 10.
(e) all values in the sample are within two standard deviations of the mean.

5. We want to test H0: µ = 1.5 vs. Ha: µ ≠ 1.5 at α = 0.05. A 95% confidence interval for µ calculated from a given random sample is (1.4, 3.6). Based on this finding we
(a) fail to reject H0.
(b) reject H0.
(c) cannot make any decision at all because the value of the test statistic is not available.
(d) cannot make any decision at all because the distribution of the population is unknown.
(e) cannot make any decision at all because (1.4, 3.6) is only a 95% confidence interval for µ.

©BFW Publishers The Practice of Statistics for AP*, 5/e

6. Which of the following statements is/are correct?
I. The power of a significance test depends on the effect size.
II. The probability of a Type II error is equal to the significance level of the test.
III. Error probabilities can be expressed only when a significance level has been specified.
(a) I and II only
(b) I and III only
(c) II and III only
(d) I, II, and III
(e) None of the above gives the complete set of correct responses.

Use the following for questions 7 and 8:
The water diet requires you to drink two cups of water every half hour from the time you get up until you go to bed, but otherwise allows you to eat whatever you like. Four adult volunteers agree to test the diet. They are weighed prior to beginning the diet and after six weeks on the diet. The weights (in pounds) are

Subject                  A     B     C     D
Weight before diet      180   125   240   150
Weight after 6 weeks    170   130   215   152

7. Which of the following conditions must be met in order to use a t-procedure on these paired data?
(a) Only the distribution of pre-diet weights must be approximately Normal.
(b) Only the distribution of differences (after 6 weeks − before) must be approximately Normal.
(c) The distribution of both pre-diet weights and six-week weights must be approximately Normal.
(d) The distribution of pre-diet weights and the distribution of differences (after 6 weeks − before) must be approximately Normal.
(e) All three distributions (before diet, after 6 weeks, and the difference) must be approximately Normal.

8. What would a Type II error be for this test of the water diet?
(a) Concluding that the diet leads to weight loss when it doesn’t.
(b) Concluding that the diet leads to weight loss when it really does.
(c) Not concluding that the diet leads to weight loss when it does.
(d) Not concluding that the diet leads to weight loss when it really doesn’t.
(e) Drawing a conclusion from this test when the Normality condition has not been satisfied.

9. A researcher wishes to determine if people are able to complete a certain pencil-and-paper maze more quickly while listening to classical music. Suppose previous research has established that the mean time needed for people to complete a certain maze (without music) is 40 seconds. The researcher, therefore, decides to test the hypotheses H0: µ = 40 versus Ha: µ < 40, where µ = the mean time in seconds needed to complete the maze while listening to classical music. To do so, the researcher has 10,000 people complete the maze with classical music playing. The mean time for these people is x̄ = 39.92 seconds, and the P-value of his significance test is 0.0002. Which statement below best describes the appropriate conclusion to draw from this study?
(a) The researcher has proved that listening to classical music substantially improves the time it takes to complete the maze.
(b) The researcher has strong evidence that listening to classical music substantially improves the time it takes to complete the maze.
(c) The researcher has moderate evidence that listening to classical music substantially improves the time it takes to complete the maze.
(d) Although the researcher has obtained a statistically significant result, it appears to have little practical significance.
(e) Since the P-value is greater than the reciprocal of the sample size, this is not a significant result.

10. The recommended daily Calcium intake for women over 21 (and under 50) is 1000 mg per day.
The health services at a college are concerned that women at the college get less Calcium than that, so they take a random sample of female students in order to test the hypotheses H0: µ = 1000 versus Ha: µ < 1000. Prior to the study they estimate that the power of their test against the alternative Ha: µ = 900 is 0.85. Which of the following is the best interpretation of this value?
(a) The probability of making a Type II error.
(b) The probability of rejecting the null hypothesis when the parameter value is 1000.
(c) The probability of rejecting the null hypothesis when the parameter value is 900.
(d) The probability of failing to reject the null hypothesis when the parameter value is 1000.
(e) The probability of failing to reject the null hypothesis when the parameter value is 900.

Part 2: Free Response
Show all your work. Indicate clearly the methods you use, because you will be graded on the correctness of your methods as well as on the accuracy and completeness of your results and explanations.

11. Publishing scientific papers online is fast, and the papers can be long. Publishing in a paper journal means that the paper will live forever in libraries. The British Medical Journal combines the two: it prints short and readable versions, with longer versions available online. Is this OK with authors? The journal asked a random sample of 104 of its recent authors several questions. One question was “Should the journal continue using this system?” In the sample, 72 said “Yes.”
(a) Do the data give good evidence that more than two-thirds (67%) of authors support continuing this system? Carry out an appropriate test to help answer this question.
(b) Interpret the P-value from your test in the context of the problem.

12. “Red tide” is a bloom of poison-producing algae, a few different species of a class of plankton called dinoflagellates.
When weather and water conditions cause these blooms, shellfish such as clams living in the area develop dangerous levels of a paralysis-inducing toxin. In Massachusetts, the Division of Marine Fisheries (DMF) monitors levels of the toxin in shellfish by regular sampling of shellfish along the coastline. If the mean level of toxin in clams exceeds 800 µg (micrograms) of toxin per kg of clam meat in any area at a 5% level of significance, clam harvesting is banned there until the bloom is over and levels of toxin in clams subside. During a bloom, the distribution of toxin levels in clams on a single mudflat is distinctly non-Normal.
(a) Define the parameter of interest and state appropriate hypotheses for the DMF to test.
(b) Because of budget constraints and the large number of coastal areas that must be tested, the DMF would like to sample no more than 10 clams from any single area. Explain why this sample size may lead to problems in carrying out the significance test from (a).
(c) Describe a Type I and a Type II error in this situation and the consequences of each.
(d) The DMF is considering changing the significance level of the test to 10%. Discuss the impact this might have on error probabilities and the power of the test, and describe the practical consequences of this change.

Test 9B

Part I
1. a Area under the standard Normal curve above z = 1.40 is 0.0808.
2. b The question is whether the new product works more quickly, so we are interested in the one-tailed alternative that the mean time for pain relief is less than 25 minutes.
3. e Since the P-value is less than α, we have sufficient evidence against H0 to conclude that H0 is not likely to be true. Since the test is two-tailed, we can conclude only that µ ≠ 60.
4. c t-procedures are robust with respect to the Normality of the population if n is large enough, so the most important condition is randomness.
5.
a Since the null value of 1.5 is in the 95% confidence interval, we cannot reject H0.
6. b I is true because power is greater if the alternative (actual) value of the parameter is farther from the null. III is true because P(Type I error) = α and P(Type II error) depends on α. II is false, since P(Type I error) = α, not P(Type II error).
7. b Only the Normality of the differences is important, since we are testing the mean of the difference in weights, not the pre-diet and post-diet mean weights.
8. c A Type II error is failing to reject H0 when it is not true. Here that means failing to conclude that the diet leads to weight loss when the mean weight loss is actually positive.
9. d A test with such a large sample size will often produce a significant result for a very small departure from the null value. There is little practical significance to a change in finish time of 0.08 seconds.
10. c Power is the probability of rejecting a false null hypothesis, which in this case is concluding that the true mean daily Calcium intake for women at the college is lower than 1000 mg when it is actually 900 mg.

Part II
11. (a) State: We wish to test H0: p = 0.67 versus Ha: p > 0.67, where p = the true proportion of authors who support the system of only publishing longer papers online. We will use a significance level of α = 0.05.
Plan: The procedure is a one-sample z-test for a proportion. Conditions:
Random: The journal took a random sample of 104 recent authors.
10%: It seems reasonable to assume that there are more than 10 × 104 = 1040 recent authors of articles in the British Medical Journal.
Large counts: Assuming H0: p = 0.67 is true, np = (104)(0.67) = 69.7 ≥ 10 and n(1 − p) = (104)(0.33) = 34.3 ≥ 10.
Do: p̂ = 72/104 ≈ 0.69, so z = (0.69 − 0.67) / √((0.67)(0.33)/104) = 0.43; P-value = 0.3336.
Conclude: A P-value of 0.3336 is greater than α = 0.05, so we fail to reject H0: there is insufficient evidence to conclude that the proportion of recent authors who support the system is greater than 67%.
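The arithmetic in the Do step can be checked numerically. The following is a minimal sketch, assuming the one-sided one-sample z-test for a proportion described above. The helper names `upper_tail` and `one_prop_z_test` are hypothetical, chosen only for illustration, and the Normal tail area comes from Python's standard `math.erfc` rather than from Table A. Because the code does not round p̂ to 0.69 before computing z, its first result differs slightly from the hand calculation; the second calculation repeats the key's rounding.

```python
import math


def upper_tail(z):
    """Return P(Z >= z) for a standard Normal random variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def one_prop_z_test(successes, n, p0):
    """One-sided test of H0: p = p0 vs. Ha: p > p0 (hypothetical helper).

    Returns (p_hat, z, p_value). The standard error uses p0, as the
    test assumes H0 is true.
    """
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    return p_hat, z, upper_tail(z)


# Data from Question 11: 72 of 104 authors said "Yes"; null value 0.67.
p_hat, z, p_value = one_prop_z_test(72, 104, 0.67)
print(f"unrounded: p_hat = {p_hat:.4f}, z = {z:.2f}, P-value = {p_value:.4f}")

# Hand calculation from the key: p_hat rounded to 0.69, z rounded to 2 places.
z_rounded = round((0.69 - 0.67) / math.sqrt(0.67 * 0.33 / 104), 2)
print(f"rounded:   z = {z_rounded:.2f}, P-value = {upper_tail(z_rounded):.4f}")
```

Either way the P-value is far above α = 0.05, so the decision to fail to reject H0 does not depend on the rounding.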
(b) If the true proportion of authors who support the system is 0.67, the probability of getting a sample proportion of 0.69 or higher in a random sample of 104 authors is 0.3336.

12. (a) H0: µ = 800 versus Ha: µ > 800, where µ = the true mean concentration of Red Tide toxin in clams in the area, in µg/kg.
(b) With a population that is known to be distinctly non-Normal, a sample of only 10 clams is too small for t procedures to be trusted, so the P-value from the test may not be accurate.
(c) Type I error: Concluding that the mean level of toxin is above 800 µg/kg when it is not. The DMF would close the area to clam harvesting, which would have a negative economic impact on anyone who depends on the clam business, even though the clams are safe to eat. Type II error: Not concluding that the mean level of toxin is above safe levels when it is. This could cause anyone who eats clams from this area to become sick or even die.
(d) Raising the significance level to 10% would increase the probability of a Type I error, but decrease the probability of a Type II error and increase the power of the test. This would decrease the likelihood of people eating toxic clams, so it might be a good idea. Better safe than sorry.
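The trade-off described in 12(d) can be illustrated numerically. This is only a sketch under assumptions that are not part of the problem: a Normal sampling distribution for x̄ with a known standard deviation of σ = 100 µg/kg, a true mean of 850 µg/kg, and n = 10 clams. These values and the helper name `power_upper_tail` are hypothetical, and the z-test used here is an idealization (the problem itself notes that the population is distinctly non-Normal). The point is the direction of the change, not the specific numbers.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal distribution (mean 0, sd 1)


def power_upper_tail(mu0, mu_true, sigma, n, alpha):
    """Power of a one-sided z-test of H0: mu = mu0 vs. Ha: mu > mu0.

    Idealized: sigma is treated as known and x-bar as Normal.
    Returns P(reject H0) when the true mean is mu_true.
    """
    se = sigma / sqrt(n)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)  # reject H0 when z > z_crit
    shift = (mu_true - mu0) / se            # true mean in standard-error units
    return 1 - STD_NORMAL.cdf(z_crit - shift)


MU0 = 800      # null value from the problem (µg/kg)
MU_TRUE = 850  # hypothetical true mean, assumed for illustration
SIGMA = 100    # hypothetical population standard deviation, assumed
N = 10         # the DMF's proposed maximum sample size

results = {}
for alpha in (0.05, 0.10):
    power = power_upper_tail(MU0, MU_TRUE, SIGMA, N, alpha)
    results[alpha] = power
    print(f"alpha = {alpha:.2f}: P(Type I) = {alpha:.2f}, "
          f"P(Type II) = {1 - power:.3f}, power = {power:.3f}")
```

Under these assumed values, moving from α = 0.05 to α = 0.10 lowers the rejection cutoff, so the computed power rises and P(Type II error) falls, which matches the reasoning in 12(d).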