17582_06_ch06_p290-359.qxd 11/25/08 4:00 PM Page 323 6.6 Choosing Sample Sizes for Inferences about $\mu_1 - \mu_2$ 323

two populations were shifted by amounts 0, $.4\sigma$, and $.8\sigma$, where $\sigma$ denotes the standard deviation of the distribution. (When the population distribution is Cauchy, $\sigma$ denotes a scale parameter.) From Table 6.18, we can make the following observations. The level of the paired t test remains nearly equal to .05 for uniform and double exponential distributions, but is much less than .05 for the very heavy-tailed Cauchy distribution. The Wilcoxon signed-rank test's level is nearly .05 for all four distributions, as expected, because the level of the Wilcoxon test only requires that the population distribution be symmetric. When the distribution is normal, the t test has only slightly greater power values than the Wilcoxon signed-rank test. When the population distribution is short-tailed and uniform, the paired t test has slightly greater power than the signed-rank test. Note also that the power values for the t test under the uniform distribution are slightly less than the corresponding t power values when the population distribution is normal. For the double exponential, the Wilcoxon test has slightly greater power than the t test. For the Cauchy distribution, the level of the t test deviates significantly from .05 and its power is much lower than that of the Wilcoxon test. Other studies indicate that if the distribution of differences is grossly skewed, the nominal t probabilities may be misleading. The skewness has less of an effect on the level of the Wilcoxon test.

Even with this discussion, you might still be confused as to which statistical test or confidence interval to apply in a given situation. First, plot the data and attempt to determine whether the population distribution is very heavy-tailed or very skewed. In such cases, use a Wilcoxon rank-based test. When the plots are not definitive in their detection of nonnormality, perform both tests.
If the results from the different tests yield different conclusions, carefully examine the data to identify any peculiarities and to understand why the results differ. If the conclusions agree and there are no blatant violations of the required conditions, you should be very confident in your conclusions. This particular “hedging” strategy is appropriate not only for paired data but also for many situations in which there are several alternative analyses.

6.6 Choosing Sample Sizes for Inferences about $\mu_1 - \mu_2$

Sections 5.3 and 5.5 were devoted to sample-size calculations to obtain a confidence interval about $\mu$ with a fixed width and specified degree of confidence or to conduct a statistical test concerning $\mu$ with predefined levels for $\alpha$ and $\beta$. Similar calculations can be made for inferences about $\mu_1 - \mu_2$ with either independent samples or with paired data. Determining the sample size for a $100(1 - \alpha)\%$ confidence interval about $\mu_1 - \mu_2$ of width $2E$ based on independent samples is possible by solving the following expression for $n$:

$$z_{\alpha/2}\,\sigma\sqrt{\frac{1}{n} + \frac{1}{n}} = E$$

Note that, in this formula, $\sigma$ is the common population standard deviation and that we have assumed equal sample sizes.

Sample Sizes for a $100(1 - \alpha)\%$ Confidence Interval for $\mu_1 - \mu_2$ of the Form $\bar{y}_1 - \bar{y}_2 \pm E$, Independent Samples

$$n = \frac{2 z_{\alpha/2}^2\,\sigma^2}{E^2}$$

(Note: If $\sigma$ is unknown, substitute an estimated value to get an approximate sample size.)

Chapter 6 Inferences Comparing Two Population Central Values

The sample sizes obtained using this formula are usually approximate because we have to substitute an estimated value of $\sigma$, the common population standard deviation. This estimate will probably be based on an educated guess from information on a previous study or on the range of population values.
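As a rough illustration of the confidence-interval formula above, the following Python sketch computes $n = 2z_{\alpha/2}^2\sigma^2/E^2$ and rounds up to the next whole observation. The function name `ci_sample_size` and the inputs ($\sigma = 10$, $E = 5$) are hypothetical, chosen only for illustration; they do not come from the text. The normal quantile is taken from Python's standard library rather than from Table 1, so results can differ slightly from hand calculations that use rounded table values.

```python
import math
from statistics import NormalDist


def ci_sample_size(sigma, half_width, conf_level=0.95):
    """Per-group sample size so that the confidence interval for
    mu1 - mu2 (independent samples, equal n, common sigma) has
    half-width E = half_width.

    Implements n = 2 * z_{alpha/2}^2 * sigma^2 / E^2, rounded up.
    """
    alpha = 1.0 - conf_level
    # Upper alpha/2 critical value of the standard normal distribution.
    z_half_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n = 2.0 * (z_half_alpha * sigma / half_width) ** 2
    return math.ceil(n)


# Hypothetical inputs: sigma is an assumed guess at the common
# population standard deviation, as the note in the text suggests.
n_per_group = ci_sample_size(sigma=10.0, half_width=5.0, conf_level=0.95)
print(n_per_group)
```

With these assumed inputs the function returns 31 observations per group. Halving $E$ to 2.5 raises the requirement to 123, which reflects the fact that $n$ grows in proportion to $1/E^2$.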
Corresponding sample sizes for one- and two-sided tests of $\mu_1 - \mu_2$ based on specified values of $\alpha$ and $\beta$, where we desire a level $\alpha$ test having the probability of a Type II error $\beta(\mu_1 - \mu_2) \le \beta$ whenever $|\mu_1 - \mu_2| \ge \Delta$, are shown here.

Sample Sizes for Testing $\mu_1 - \mu_2$, Independent Samples

One-sided test:

$$n = \frac{2\sigma^2 (z_{\alpha} + z_{\beta})^2}{\Delta^2}$$

Two-sided test:

$$n = \frac{2\sigma^2 (z_{\alpha/2} + z_{\beta})^2}{\Delta^2}$$

where $n_1 = n_2 = n$ and the probability of a Type II error is to be $\le \beta$ when the true difference $|\mu_1 - \mu_2| \ge \Delta$. (Note: If $\sigma$ is unknown, substitute an estimated value to obtain an approximate sample size.)

EXAMPLE 6.10

One of the crucial factors in the construction of large buildings is the amount of time it takes for poured concrete to reach a solid state, called the “set-up” time. Researchers are attempting to develop additives that will accelerate the set-up time without diminishing any of the strength properties of the concrete. A study is being designed to compare the most promising of these additives to concrete without the additive. The research hypothesis is that the concrete with the additive will have a smaller mean set-up time than the concrete without the additive. The researchers have decided to have the same number of test samples for both the concrete with and without the additive. For an $\alpha = .05$ test, determine the appropriate number of test samples needed if we want the probability of a Type II error to be less than or equal to .10 whenever the concrete with the additive has a mean set-up time of 1.5 hours less than the concrete without the additive. From previous experiments, the standard deviation in set-up time is 2.4 hours.

Solution

Let $\mu_1$ be the mean set-up time for concrete without the additive and $\mu_2$ be the mean set-up time for concrete with the additive. From the description of the problem, we have

● One-sided research hypothesis: $\mu_1 > \mu_2$
● $\sigma = 2.4$
● $\alpha = .05$
● $\beta \le .10$ whenever $\mu_1 - \mu_2 \ge 1.5$
● $n_1 = n_2 = n$

From Table 1 in the Appendix, $z_{\alpha} = z_{.05} = 1.645$ and $z_{\beta} = z_{.10} = 1.28$.
Substituting into the formula, we have

$$n = \frac{2\sigma^2 (z_{\alpha} + z_{\beta})^2}{\Delta^2} = \frac{2(2.4)^2 (1.645 + 1.28)^2}{(1.5)^2} = 43.8, \text{ or } 44$$

Thus, we need 44 test samples of concrete with the additive and 44 test samples of concrete without the additive.
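The calculation in Example 6.10 can be checked with a short Python sketch of the one- and two-sided testing formulas. The function name `sample_size_for_test` and its parameter names are hypothetical, introduced only for this illustration. The sketch uses normal quantiles from Python's standard library instead of the rounded values 1.645 and 1.28 from Table 1, so the unrounded $n$ differs slightly from the hand calculation (about 43.85 rather than 43.8), although it rounds up to the same 44.

```python
import math
from statistics import NormalDist


def sample_size_for_test(sigma, delta, alpha=0.05, beta=0.10, two_sided=False):
    """Unrounded per-group sample size for testing mu1 - mu2 with
    independent samples, equal n, and common standard deviation sigma.

    One-sided: n = 2 * sigma^2 * (z_alpha   + z_beta)^2 / delta^2
    Two-sided: n = 2 * sigma^2 * (z_alpha/2 + z_beta)^2 / delta^2
    """
    upper_quantile = NormalDist().inv_cdf
    # A two-sided test splits alpha between the two tails.
    tail_area = alpha / 2.0 if two_sided else alpha
    z_alpha = upper_quantile(1.0 - tail_area)
    z_beta = upper_quantile(1.0 - beta)
    return 2.0 * sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2


# Inputs from Example 6.10: sigma = 2.4 hours, Delta = 1.5 hours,
# alpha = .05, beta = .10, one-sided research hypothesis.
n_one_sided = math.ceil(sample_size_for_test(2.4, 1.5, alpha=0.05, beta=0.10))
print(n_one_sided)  # 44 per group, matching the example

# For comparison only (not part of the example): the same inputs
# under a two-sided alternative.
n_two_sided = math.ceil(
    sample_size_for_test(2.4, 1.5, alpha=0.05, beta=0.10, two_sided=True)
)
print(n_two_sided)
```

Under a two-sided alternative the same inputs require 54 samples per group, because $z_{\alpha/2}$ exceeds $z_{\alpha}$ and a larger sample is needed to keep the Type II error probability at or below .10.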