Stat 2470, Portfolio Problem Set #3, Fall 2014
Name ______________________________________

Instructions: Follow the directions for the portfolio carefully. Be sure to answer all parts of each question as completely as possible, and provide explanations for your responses. Try to think about each question in the context of a larger study.

1. Let μ denote the true average radioactivity level (in picocuries per liter, pCi/L). The value 5 pCi/L is considered the dividing line between safe and unsafe water. Would you recommend testing H₀: μ = 5 versus Hₐ: μ > 5, or H₀: μ = 5 versus Hₐ: μ < 5? Explain your reasoning. [Hint: Think about the consequences of a Type I error and a Type II error for each possibility.] What α value would you use for the case you decided on?

2. The calibration of a scale is to be checked by weighing a 10 kg test specimen 25 times. Suppose the results of different weighings are independent of one another and that the weight on each trial is normally distributed with σ = 0.200 kg. Let μ denote the true average weight reading of the scale.
   a. What hypotheses should be tested?
   b. Suppose the scale is to be recalibrated if either x̄ ≥ 10.1032 or x̄ ≤ 9.8969. What is the probability that recalibration is carried out when it is actually unnecessary?
   c. What is the probability that recalibration is judged unnecessary when in fact μ = 10.1? When μ = 9.8?
   d. Let z = (x̄ − 10)/(σ/√n). For what value c is the rejection region of part (b) equivalent to the two-tailed region of either z ≤ −c or z ≥ c?
   e. If the sample size were only 10 rather than 25, how should the procedure for part (d) be altered so that α = 0.05?
   f. Using the test of part (e), what would you conclude from the following data?
      9.981  10.006  9.857  10.107  9.888  9.728  10.439  10.214  10.190  9.793
   g. Re-express the test procedure of part (b) in terms of the standardized test statistic Z = (X̄ − 10)/(σ/√n).
   h. Suppose the rejection region is {x̄ : x̄ ≥ 10.1004 or x̄ ≤ 9.8940} = {z : z ≥ 2.51 or z ≤ −2.65}. What is α for this procedure?
   i. What is β under the situation in part (h) when μ = 10.1? When μ = 9.9? Is this desirable?
   j. Sketch the graph of the rejection regions for parts (b) and (h).

3. Automatic identification of the boundaries of significant structures within a medical image is an area of ongoing research. A paper from 2005 discussed a new technique for such identification. A measure of the accuracy of the automatic region is the average linear displacement (ALD). The paper gave the following ALD observations for a sample of 49 kidneys (in units of pixel dimensions).

   1.38  0.44  1.09  0.75  0.66  1.28  0.51
   0.39  0.70  0.46  0.54  0.83  0.58  0.64
   1.30  0.57  0.43  0.62  1.00  1.05  0.82
   1.10  0.65  0.99  0.56  0.56  0.64  0.45
   0.82  1.06  0.41  0.58  0.66  0.54  0.83
   0.59  0.51  1.04  0.85  0.45  0.52  0.58
   1.11  0.34  1.25  0.38  1.44  1.28  0.51

   a. Summarize/describe the data.
   b. Is it plausible that ALD is at least approximately normally distributed? Why or why not? Must normality be assumed prior to calculating a confidence interval for true average ALD or testing hypotheses about true average ALD? Explain.
   c. The author of the article commented that in most cases the ALD is better than or of the order of 1.0. Do the data in fact provide strong evidence that the true average ALD under these circumstances is less than 1.0? Carry out an appropriate test of hypotheses.
   d. Calculate an upper confidence bound for true average ALD using a confidence level of 95%, and interpret this bound.

4. Hexavalent chromium has been identified as an inhalation carcinogen and an air toxin of concern in a number of different locales. An article gave the accompanying data on both indoor and outdoor concentrations (nanograms/m³) for a sample of houses selected from a certain region.

   House     1     2     3     4     5     6     7     8     9
   Indoor    0.07  0.08  0.09  0.12  0.12  0.12  0.13  0.14  0.15
   Outdoor   0.29  0.68  0.47  0.54  0.97  0.35  0.49  0.84  0.86

   House     10    11    12    13    14    15    16    17    18
   Indoor    0.15  0.17  0.17  0.18  0.18  0.18  0.18  0.19  0.20
   Outdoor   0.28  0.32  0.32  1.55  0.66  0.29  0.21  1.02  1.59

   House     19    20    21    22    23    24    25    26    27
   Indoor    0.22  0.22  0.23  0.23  0.25  0.26  0.28  0.28  0.29
   Outdoor   0.90  0.52  0.12  0.54  0.88  0.49  1.24  0.48  0.27

   House     28    29    30    31    32    33
   Indoor    0.34  0.39  0.40  0.45  0.54  0.62
   Outdoor   0.37  1.26  0.70  0.76  0.99  0.36

   a. Calculate a confidence interval for the population mean difference between indoor and outdoor concentrations using a confidence level of 95%, and interpret the resulting interval.
   b. If a 34th house were to be randomly selected from the population, between what values would you predict the difference in concentrations to lie?
   c. Explain why it would be incorrect to conduct a two-sample t-test (or z-test) on this data set.

5. In medical investigations, the ratio θ = p₁/p₂ is often of more interest than the difference p₁ − p₂. Let θ̂ = p̂₁/p̂₂. When m and n are both large, the statistic ln(θ̂) has approximately a normal distribution with approximate mean value ln(θ) and approximate standard deviation [(m − x)/(mx) + (n − y)/(ny)]^(1/2).
   a. Use these facts to obtain a large-sample 95% confidence interval for estimating ln(θ), and then a confidence interval for θ itself.

   Treatment     Sample Size   Successes
   Aspirin       730           141
   Non-Aspirin   549           81

   b. Using the data provided on the effectiveness of aspirin in colorectal cancer treatment, calculate an interval of plausible values for θ at the 95% confidence level. What does this interval suggest about the efficacy of the aspirin treatment?
   c. Compare this result with the more traditional approach based on the difference in proportions, p₁ − p₂.
   d. Explain why medical investigators might find that one approach gives them more information than the other.

6. In an experiment to compare bearing strengths of pegs inserted in two different types of mounts, a sample of 14 observations on stress limit for red oak mounts resulted in a sample mean and sample standard deviation of 8.48 MPa and 0.79 MPa, respectively, whereas a sample of 12 observations when Douglas fir mounts were used gave a mean of 9.36 MPa and a standard deviation of 1.52 MPa. Test whether or not the true average stress limits are identical for the two types of mounts. Compare the degrees of freedom and P-values for the unpooled and pooled t-tests. Explain why you prefer one analysis over the other in terms of what it tells you about the underlying data.

7. An article reports the following data on total Fe (iron) for four types of iron formation (1 = carbonate, 2 = silicate, 3 = magnetite, 4 = hematite).

   1 (carbonate)   20.5  28.1  27.8  27.0  28.0  25.2  25.3  27.1  20.5  31.3
   2 (silicate)    26.3  24.0  26.2  20.2  23.7  34.0  17.1  26.8  23.7  24.9
   3 (magnetite)   29.5  34.0  27.5  29.4  27.9  26.2  29.9  29.5  30.0  35.6
   4 (hematite)    36.5  44.2  34.1  30.3  31.4  33.1  34.1  32.9  36.3  25.5

   Carry out an analysis of variance F test at significance level 0.01 and summarize the results in an ANOVA table. Would the results change if you used a significance level of 0.05? Carry out Tukey’s procedure to determine how the means of the iron formations are grouped. Use the underscoring method and then add a paragraph to summarize the results.
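
For Problem 7, the arithmetic behind a one-way ANOVA table can be sketched as follows. This is a minimal illustration under the usual assumptions (independent samples, equal variances, approximate normality), not a prescribed solution method. The function name `one_way_anova` and the dictionary keys are hypothetical choices made for this sketch; the data are copied from the problem statement.

```python
# One-way ANOVA sums of squares computed by hand (standard library only).
from statistics import mean

# Total Fe observations from Problem 7, keyed by iron formation type.
groups = {
    "carbonate": [20.5, 28.1, 27.8, 27.0, 28.0, 25.2, 25.3, 27.1, 20.5, 31.3],
    "silicate":  [26.3, 24.0, 26.2, 20.2, 23.7, 34.0, 17.1, 26.8, 23.7, 24.9],
    "magnetite": [29.5, 34.0, 27.5, 29.4, 27.9, 26.2, 29.9, 29.5, 30.0, 35.6],
    "hematite":  [36.5, 44.2, 34.1, 30.3, 31.4, 33.1, 34.1, 32.9, 36.3, 25.5],
}


def one_way_anova(samples):
    """Return the entries of a one-way ANOVA table (hypothetical helper)."""
    all_obs = [x for s in samples for x in s]
    grand_mean = mean(all_obs)
    # Treatment (between-group) sum of squares.
    sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Error (within-group) sum of squares.
    sse = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_treatment = len(samples) - 1
    df_error = len(all_obs) - len(samples)
    mstr = sstr / df_treatment
    mse = sse / df_error
    return {
        "df_treatment": df_treatment,
        "df_error": df_error,
        "SSTr": sstr,
        "SSE": sse,
        "MSTr": mstr,
        "MSE": mse,
        "F": mstr / mse,
    }


table = one_way_anova(list(groups.values()))
for name, value in table.items():
    print(f"{name:>12}: {value:.4f}")
```

The computed F statistic would then be compared with the critical value from an F table for the numerator and denominator degrees of freedom reported above, at whichever significance level (0.01 or 0.05) is being used. Tukey's procedure additionally needs a Studentized range critical value, which this sketch does not compute.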