Chapter 8 – Comparing Two Treatments

Inference about Two Population Means

We want to compare the means of two populations to see whether they differ. There are two situations to consider, as shown in the following examples:

1) In an experiment designed to study the effects of illumination level on task performance (“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating Engineering, 1976: 235-242), subjects were required to insert a fine-tipped probe into the eyeholes of ten needles in rapid succession, both for a low light level with a black background and for a higher level with a white background. It is of interest to compare the mean times for completion of the task under the two different conditions.

2) Compare the mean lifetime, $\mu_1$, for transistors produced by production line 1 to the mean lifetime, $\mu_2$, for transistors produced by production line 2. We want to know whether these two means differ.

In the first case, we are comparing related means, using dependent samples. For each member of one sample, there is a matched member of the other sample. In the second case, we are comparing unrelated means, using independent samples. There is no natural way to match each member of one sample with a member of the other sample. We will use somewhat different procedures for hypothesis tests, depending on whether our samples are dependent or independent.

There is another issue to be considered: are the variances of the two populations equal or unequal? This issue, of course, did not arise with inference about a single population. We will see that the procedure for inference about the difference between the means depends on the comparison of the variances.

Comparing Two Means, Independent Samples

We will assume the following:

1) We have selected a random sample from each of the two populations. The r.s., of size $n_1$, from population 1 will be denoted by $X_{11}, X_{12}, \ldots, X_{1n_1}$. These r.v.’s are assumed to be i.i.d. with mean $\mu_1$ and variance $\sigma_1^2$. The r.s., of size $n_2$, from population 2 will be denoted by $X_{21}, X_{22}, \ldots, X_{2n_2}$. These r.v.’s are assumed to be i.i.d. with mean $\mu_2$ and variance $\sigma_2^2$.

2) The two populations are independent. This implies that all of the $n_1 + n_2$ r.v.’s listed above are independent of each other.

3) Either both populations are normal, or the conditions of the Central Limit Theorem apply. (We may also check for normality of each population using normal probability plots with the samples of data.)

We want to estimate the difference, $\mu_1 - \mu_2$, between the population means. A logical point estimator of this parameter is $\bar{X}_1 - \bar{X}_2$. It is easily shown that this statistic is an unbiased estimator of the parameter. It is also easily shown that the variance of the estimator is

$$V(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$

Given these results and the assumptions listed above, the random variable

$$\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

has an approximate standard normal distribution. We want to use this fact to do inference about the difference between the two population means. However, the random variable given above depends on two other unknown parameters, so we need to estimate the two population variances.

Testing Hypotheses About the Difference Between Two Means

Assume that we want to test whether the means of two populations, population 1 and population 2, differ. In other words, our alternative hypothesis has one of the following forms:

$H_a: \mu_1 - \mu_2 \neq 0$
$H_a: \mu_1 - \mu_2 < 0$
$H_a: \mu_1 - \mu_2 > 0$

There are two cases to consider: either the populations have the same variability, or they do not. The form of the test statistic will depend on whether we can make the assumption that the populations have equal variances.

If we can assume equal variances, then we want to use the following statistic:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},$$

where $(\mu_1 - \mu_2)_0$ is the value of $\mu_1 - \mu_2$ specified by the null hypothesis. Here the quantity

$$S_P^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

is the pooled variance estimate. Under the null hypothesis, this statistic has a t-distribution with $n_1 + n_2 - 2$ degrees of freedom.
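A short worked step may help explain why the pooled estimate is a sensible choice. Assuming both populations share a common variance (written here as $\sigma^2$, a symbol introduced only for this sketch), the pooled estimate is a weighted average of the two sample variances, and it is unbiased for $\sigma^2$ because each sample variance is:

```latex
\begin{align*}
S_P^2 &= w\,S_1^2 + (1 - w)\,S_2^2,
  \qquad w = \frac{n_1 - 1}{n_1 + n_2 - 2}, \\
E\!\left[S_P^2\right] &= w\,E\!\left[S_1^2\right] + (1 - w)\,E\!\left[S_2^2\right]
  = w\,\sigma^2 + (1 - w)\,\sigma^2 = \sigma^2 .
\end{align*}
```

When $n_1 = n_2$ the weight is $w = 1/2$, so the pooled estimate is simply the average of the two sample variances. With unequal sample sizes, the larger sample receives more weight.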
If we cannot assume equal variances, then we want to use the following statistic:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},$$

where $(\mu_1 - \mu_2)_0$ is the value of $\mu_1 - \mu_2$ specified by the null hypothesis. Under the null hypothesis, this statistic has an approximate t-distribution with degrees of freedom given by

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2 / n_2\right)^2}{n_2 - 1}}.$$

In either case, our hypothesis test proceeds as follows:

Step 1: State the null and alternative hypotheses.
Step 2: State the chosen sample sizes and significance level.
Step 3: State the test statistic (substituting 0 for $\mu_1 - \mu_2$), and state the distribution of the test statistic under the null hypothesis.
Step 4: Find the rejection region and critical value(s).
Step 5: Choose the samples, collect the data, and calculate the value of the test statistic.
Step 6: State the conclusion. If we reject the null hypothesis, the conclusion should be stated in the following form: “We reject $H_0$ at the ($\alpha$) level of significance. We have sufficient evidence to conclude that (statement of alternative hypothesis).” If we fail to reject the null hypothesis, the conclusion should be stated in the following form: “We fail to reject $H_0$ at the ($\alpha$) level of significance. We do not have sufficient evidence to conclude that (statement of alternative hypothesis).”

In the following examples, we assume that the population variances are equal. We can also do a simple graphical check for equality by constructing side-by-side boxplots of the two data sets. We could also check for normality using probability plots, if we had the data sets available. We will learn later how to test for equality of the variances.

Example: Let $\mu_1$ and $\mu_2$ denote the true average tread lives for two competing brands of size P205/65R15 radial tires. We want to test whether the average tread lives are different. We choose a random sample of 45 tires of the first type and a random sample of 45 tires of the second type. We test each tire under identical conditions until the tread wears out. We obtain the following data:

$\bar{x}_1 = 42{,}500$ mi, $s_1 = 2200$ mi, $\bar{x}_2 = 40{,}400$ mi, $s_2 = 1900$ mi.
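The calculation in Step 5 can be sketched in a few lines of Python for the tire data. This is an illustrative sketch, not part of the original notes: the function names `pooled_t` and `welch_t` are made up here, the significance level of 0.05 is an assumption (the example does not state one), and the critical value uses the large-sample normal approximation in place of a t table.

```python
import math
from statistics import NormalDist


def pooled_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Equal-variance t statistic and its degrees of freedom (n1 + n2 - 2)."""
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2 - null_diff) / se, n1 + n2 - 2


def welch_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Unequal-variance t statistic and its approximate degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2 - null_diff) / se, df


# Summary data from the tire example (miles).
tire = dict(mean1=42500, sd1=2200, n1=45, mean2=40400, sd2=1900, n2=45)

t_pooled, df_pooled = pooled_t(**tire)
t_welch, df_welch = welch_t(**tire)

ALPHA = 0.05  # assumed significance level; not given in the example
# Large-sample stand-in for the two-sided t critical value (about 1.96).
# A t table for 88 d.f. gives a slightly larger value.
crit = NormalDist().inv_cdf(1 - ALPHA / 2)
reject_h0 = abs(t_pooled) > crit

print(f"pooled: t = {t_pooled:.3f}, d.f. = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, d.f. = {df_welch:.1f}")
print(f"two-sided test at alpha = {ALPHA}: reject H0? {reject_h0}")
```

Because the two samples have the same size, the pooled and unequal-variance statistics coincide here and only the degrees of freedom differ. The statistic is far beyond the critical value, so the small difference between the normal and t critical values does not affect the decision. If SciPy is available, `scipy.stats.ttest_ind_from_stats` with `equal_var=True` or `equal_var=False` should reproduce these statistics and also give a p-value.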
Example: The accompanying table gives summary data on cube compressive strength (N/mm²) for concrete specimens made with a pulverized fuel-ash mix (“A study of twenty-five-year-old pulverized fuel ash concrete used in foundation structures,” Proceedings of the Institute of Civil Engineers, Mar. 1985, 149-165). We want to test whether the true mean 7-day strength is less than the true mean 28-day strength.

Age (days)    Sample Size    Sample Mean    Sample SD
7             68             26.99          4.89
28            74             35.76          6.43

Confidence Intervals for Differences Between Population Means

We can find confidence interval estimates for the difference between two population means (independent samples) using the following formulas, depending on whether the population variances are equal or unequal:

1) For equal population variances, use

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\ \mathrm{d.f.}} \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}.$$

In this case, d.f. = $n_1 + n_2 - 2$.

2) For unequal population variances, use

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\ \mathrm{d.f.}} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

In this case, d.f. = the smaller of the values $n_1 - 1$ and $n_2 - 1$. This is a conservative choice: it is never larger than the approximate degrees of freedom computed from the sample variances for the unequal-variance test statistic, so it gives a somewhat wider interval.

Example: Estimate the difference in mean tread lives from the first example above. Interpret this interval estimate.

Example: Estimate the difference, $\mu_7 - \mu_{28}$, in mean compressive strengths from the second example above. Interpret this interval estimate.

Choice of Sample Sizes, When Variances Are Equal

For the two-sided alternative hypothesis, $H_A: \mu_1 - \mu_2 \neq 0$, with equal sample sizes and equal population variances, we may use Charts Va and Vb in the Appendix, together with

$$d = \frac{|\mu_1 - \mu_2|}{2\sigma},$$

to find the appropriate sample size. Here $|\mu_1 - \mu_2|$ is the size of the true difference to be detected, measured from the hypothesized value 0, and $\sigma$ is the common population standard deviation. The sample size read from the curve will be $n^* = 2n - 1$, so each sample requires $n = (n^* + 1)/2$ observations.

Tests of Hypotheses Concerning the Difference Between Two Population Means, Unequal Variances

If we do not have reason to believe that the populations have equal variability, we should check for equal variability by some method. One way to do this is to do a hypothesis test in which the null hypothesis is that the population variances are equal.
Another way is to graphically compare the two data sets. We will look at the second method now, and look at testing equality of the variances later.

Example: The void volume within a textile fabric affects comfort, flammability, and insulation properties. Permeability of a fabric refers to the accessibility of void spaces to the flow of a gas or liquid. The paper “The relationship between porosity and air permeability of woven textile fabrics” (Journal of Testing and Evaluation, 1997: 108-114) gave summary information on air permeability (cm³/cm²/sec) for a number of different fabric types. Consider the following data on two different types of plain-weave fabric:

Fabric Type    Sample Size    Sample Mean    Sample SD
Cotton         10             51.71          0.79
Triacetate     10             136.14         3.59

We want to test whether plain-weave triacetate has a higher mean permeability than plain-weave cotton. We also want a 95% confidence interval estimate of the difference between the means.

Tests of Hypotheses Concerning the Difference Between Two Population Means, Dependent Samples

When there is a natural pairing between each member of one population and a member of the other population, the test for differences between the population means must be done somewhat differently. For dependent samples, inferences are performed based on the differences between the scores for each pair.

Let $X_1, X_2, X_3, \ldots, X_n$ be the observations made on the members of the first sample, and let $Y_1, Y_2, Y_3, \ldots, Y_n$ be the observations made on members of the second sample. The random variables we will use in this test will be $D_1 = X_1 - Y_1$, $D_2 = X_2 - Y_2$, $D_3 = X_3 - Y_3$, …, $D_n = X_n - Y_n$. Using the set of difference random variables, we conduct a one-sample t-test. The sample mean is then

$$\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i,$$

and the sample variance is

$$S_D^2 = \frac{\sum_{i=1}^{n} \left(D_i - \bar{D}\right)^2}{n - 1}.$$

We may either use the raw data of difference scores to conduct our t-test, or we may use these sample statistics based on the difference scores.
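One way to see why paired data call for their own procedure is to look at the variance of the mean difference. The sketch below assumes that the observations in each pair have variances $\sigma_1^2$ and $\sigma_2^2$ and correlation $\rho$, and that different pairs are independent (the symbol $\rho$ is introduced here for illustration only):

```latex
\begin{align*}
V(D_i) &= V(X_i) + V(Y_i) - 2\,\mathrm{Cov}(X_i, Y_i)
        = \sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1\sigma_2, \\
V(\bar{D}) &= \frac{\sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1\sigma_2}{n} .
\end{align*}
```

With $\rho = 0$ this reduces to the independent-samples variance $\sigma_1^2/n_1 + \sigma_2^2/n_2$ with $n_1 = n_2 = n$. When pairing makes the two measurements positively correlated, the variance of $\bar{D}$ is smaller than that, and the independent-samples statistic would not reflect it. Working directly with the differences lets $S_D^2$ estimate $V(D_i)$ without needing to know $\rho$.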
The alternative hypotheses have one of the following forms:

$H_a: \mu_1 - \mu_2 \neq 0$  or  $H_a: \mu_D \neq 0$
$H_a: \mu_1 - \mu_2 < 0$  or  $H_a: \mu_D < 0$
$H_a: \mu_1 - \mu_2 > 0$  or  $H_a: \mu_D > 0$

Here $\mu_D = \mu_1 - \mu_2$, the difference between the population means. (More generally, 0 may be replaced by a hypothesized value $D_0$.)

The steps in the hypothesis test are similar to those in previous tests:

Step 1: State the two hypotheses to be tested.
Step 2: State the sample size (note that both samples must have the same size) and the chosen significance level.
Step 3: The test statistic is

$$t = \frac{\bar{d} - D_0}{s_D / \sqrt{n}},$$

where $D_0$ is the value of $\mu_D$ specified by the null hypothesis. Under the null hypothesis, this statistic has a t distribution with d.f. = $n - 1$.
Step 4: Find the rejection region and critical value(s).
Step 5: Choose samples, collect data, and calculate the value of the test statistic.
Step 6: State the conclusion in terms of the alternative hypothesis, being sure to include the significance level of the test.

Confidence Intervals for Differences Between Related Population Means

If we want to obtain an interval estimate of the difference between two population means, when there is a pairwise relationship between members of one population and members of the other, we first compute the difference scores $D_i = X_i - Y_i$.

Note that, if the samples are either from normal distributions or are large enough that we may use the Central Limit Theorem, then

$$T = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$

has an (approximate) t distribution with d.f. = $n - 1$. Then a $(1 - \alpha)100\%$ confidence interval estimate for $\mu_D$ is

$$\left(\bar{D} - t_{\alpha/2,\ n-1}\,\frac{S_D}{\sqrt{n}},\ \ \bar{D} + t_{\alpha/2,\ n-1}\,\frac{S_D}{\sqrt{n}}\right).$$

Example: In an experiment designed to study the effects of illumination level on task performance (“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating Engineering, 1976: 235-242), subjects were required to insert a fine-tipped probe into the eyeholes of ten needles in rapid succession, both for a low light level with a black background and for a higher level with a white background. It is of interest to compare the mean times for completion of the task under the two different conditions. The data are given in the table below.
We want to test whether the higher level of illumination yields a decrease of more than 5 seconds in true mean task completion time.

Subject    1      2      3      4      5      6      7      8      9
Black      25.85  28.84  32.05  25.74  20.89  41.05  25.01  24.96  27.47
White      18.23  20.84  22.96  19.68  19.50  24.98  16.61  16.07  24.59

We also want a 95% confidence interval estimate of the difference between the mean times to complete the task under low illumination versus high illumination.
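The paired-sample calculations for this example can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the worked solution from the notes. The differences are taken as Black minus White, so the hypotheses become $H_0: \mu_D = 5$ against $H_a: \mu_D > 5$. The significance level of 0.05 is an assumption (none is stated), and the two critical values are the usual t-table entries for 8 degrees of freedom.

```python
import math
import statistics

# Task completion times from the table (one pair per subject).
black = [25.85, 28.84, 32.05, 25.74, 20.89, 41.05, 25.01, 24.96, 27.47]
white = [18.23, 20.84, 22.96, 19.68, 19.50, 24.98, 16.61, 16.07, 24.59]

# Difference scores D_i = X_i - Y_i (low illumination minus high illumination).
diffs = [b - w for b, w in zip(black, white)]
n = len(diffs)

d_bar = statistics.mean(diffs)   # sample mean of the differences
s_d = statistics.stdev(diffs)    # sample SD of the differences (divisor n - 1)
se = s_d / math.sqrt(n)          # estimated standard error of d_bar

D0 = 5.0                         # hypothesized mean decrease under H0
t_stat = (d_bar - D0) / se
df = n - 1

# Critical values from a standard t table with 8 d.f.
T_UPPER_05 = 1.860    # upper 5% point, for the one-sided test
T_UPPER_025 = 2.306   # upper 2.5% point, for the 95% interval

ALPHA = 0.05  # assumed significance level; not stated in the example
reject_h0 = t_stat > T_UPPER_05

ci_low = d_bar - T_UPPER_025 * se
ci_high = d_bar + T_UPPER_025 * se

print(f"d_bar = {d_bar:.2f}, s_D = {s_d:.3f}, t = {t_stat:.3f}, d.f. = {df}")
print(f"one-sided test at alpha = {ALPHA}: reject H0? {reject_h0}")
print(f"95% CI for mu_D: ({ci_low:.2f}, {ci_high:.2f})")
```

The statistic comes out at roughly 1.87, only slightly above the tabled 5% critical value, so the conclusion is sensitive to the choice of significance level. A smaller $\alpha$ such as 0.01 would lead to failing to reject $H_0$.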