252meanl 10/21/05

D. COMPARISON OF TWO SAMPLES

The first five methods shown here are appropriate in cases where one wants to compare the means of two samples. The first one is the most general, since it is acceptable for most large samples. The second and third methods are usually thought of as small-sample methods, but they are also appropriate for large samples. However, methods 2, 3, and 4 assume that the underlying distributions are normal, so the methods based on ranking must be used in small-sample situations where the underlying distribution is not normal. For the first four methods, the tests are very similar and thus are summarized together.

Suppose that we have two samples. The first sample is a sample of $n_1$ items, has a sample mean of $\bar{x}_1$, a sample variance of $s_1^2$, and comes from a population with a mean of $\mu_1$ and a variance of $\sigma_1^2$. The second sample is a sample of $n_2$ items, has a sample mean of $\bar{x}_2$, a sample variance of $s_2^2$, and comes from a population with a mean of $\mu_2$ and a variance of $\sigma_2^2$. Our hypotheses are

$$H_0: \mu_1 = \mu_2, \quad H_1: \mu_1 \ne \mu_2 \qquad \text{or, more generally,} \qquad H_0: D = D_0, \quad H_1: D \ne D_0,$$

where $D = \mu_1 - \mu_2$ and $d = \bar{x}_1 - \bar{x}_2$. If you use the second representation of the hypotheses to test equality of population means, then you set $D_0 = 0$.

For each of the first four methods, there are three different approaches to testing these hypotheses. Each of these can be expressed in similar notation. For examples use 252meanx1.

a. Confidence Interval: $D = d \pm t_{\alpha/2}\, s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_d$. Form a confidence interval using this formula. If $D_0$ (which may be zero) is in the interval, do not reject $H_0$.

b. Test Ratio: $t = \dfrac{d - D_0}{s_d}$ or $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$.

c. Critical Value: $d_{cv} = D_0 \pm t_{\alpha/2}\, s_d$ or $(\bar{x}_1 - \bar{x}_2)_{cv} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_d$. If $d = \bar{x}_1 - \bar{x}_2$ is between the two critical values, do not reject $H_0$.

The difference between the cases comes down to the choice of $t$ and the formula for $s_d$. Let us now consider the first four cases.

1.
Two Means, Two Independent Samples, Large Samples. If the total number of degrees of freedom is large (or the two samples come from normally distributed populations with known variances $\sigma_1^2$ and $\sigma_2^2$), then replace $t$ with $z$ and use

$$s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

2. Two Means, Two Independent Samples, Populations Normally Distributed, Population Variances Assumed Equal. Use $t = t_{\alpha/2}^{(n_1 + n_2 - 2)}$ and

$$s_d = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}, \qquad \text{where} \qquad s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}.$$

3. Two Means, Two Independent Samples, Populations Normally Distributed, Population Variances not Assumed Equal. This time the degrees of freedom for $t$ must be calculated by the Satterthwaite approximation. The formula is

$$df = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}},$$

but the formula for the standard deviation is the same as in method 1, $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$. This formula tends to give nearly identical answers to method 1 when the degrees of freedom are large. It also tends to give answers similar to method 2 when sample variances are of similar size. For Minitab examples see 252meanx3 and 252meanx5.

4. Two Means, Paired Samples (If samples are small, populations should be normally distributed). If $n$ is the number of pairs of data, then $t = t_{\alpha/2}^{(n - 1)}$ and

$$s_d = \sqrt{\frac{1}{n} \cdot \frac{\sum d^2 - n\, \bar{d}^2}{n - 1}}.$$

In this case $d_1 = x_{11} - x_{21}$, $d_2 = x_{12} - x_{22}$, etc., and $\bar{d}$ is the mean of these differences.

5. Rank Tests. (The remainder of this document is expanded in 252meanx2.) Especially in the case where samples are small and the underlying distributions are not normal, it is not appropriate to compare means.

a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples. If samples are independent, this test is appropriate to test whether the two samples come from the same distribution. If the distributions are similar, it is often called a test of equality of medians.

b. Wilcoxon Signed Rank Test for Paired Samples. This is a more powerful test for equality of medians when the data are paired. It can also be used for the median of a single sample.
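The Sign Test for paired data is a simpler test to use in this situation, but it is less powerful.

6. Proportions. For independent samples: if $p_1$ is the proportion of successes in the first population, and $p_2$ is the proportion of successes in the second population, we define $\Delta p = p_1 - p_2$. Then our hypotheses will be

$$H_0: p_1 = p_2, \quad H_1: p_1 \ne p_2 \qquad \text{or, more generally,} \qquad H_0: \Delta p = \Delta p_0, \quad H_1: \Delta p \ne \Delta p_0.$$

Let $\bar{p}_1 = \dfrac{x_1}{n_1}$, $\bar{p}_2 = \dfrac{x_2}{n_2}$ and $\Delta \bar{p} = \bar{p}_1 - \bar{p}_2$, where $x_1$ is the number of successes in the first sample, $x_2$ is the number of successes in the second sample, and $n_1$ and $n_2$ are the sample sizes. Throughout, $q = 1 - p$. The usual three approaches to testing the hypotheses can be used.

a. Confidence Interval: $\Delta p = \Delta \bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$ or $p_1 - p_2 = (\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2}\, s_{\Delta p}$, where $s_{\Delta p} = \sqrt{\dfrac{\bar{p}_1 \bar{q}_1}{n_1} + \dfrac{\bar{p}_2 \bar{q}_2}{n_2}}$. Compare this interval with $\Delta p_0$.

b. Test Ratio: $z = \dfrac{\Delta \bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \dfrac{(\bar{p}_1 - \bar{p}_2) - (p_{10} - p_{20})}{\sigma_{\Delta p}}$, where $p_{10}$ and $p_{20}$ come from the null hypothesis if specified and $\sigma_{\Delta p} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$, although $s_{\Delta p}$ may have to be used if $p_1$ and $p_2$ are unknown. Also note that if the null hypothesis is $p_1 = p_2$ or $\Delta p_0 = 0$, we use

$$\sigma_{\Delta p} = \sqrt{\bar{p}_0\, \bar{q}_0 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}, \qquad \text{where} \qquad \bar{p}_0 = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2}$$

and $x_1$ and $x_2$ are the number of successes in sample 1 and sample 2, respectively.

c. Critical Value: $\Delta \bar{p}_{CV} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$ or $(\bar{p}_1 - \bar{p}_2)_{CV} = (p_{10} - p_{20}) \pm z_{\alpha/2}\, \sigma_{\Delta p}$. Test this against $\bar{p}_1 - \bar{p}_2$. For calculation of $\sigma_{\Delta p}$, see Test Ratio above.

For paired samples, use the McNemar Test. This is described in 252meanx2.

7. Variances. Switch to document 252meanx4l here for examples. Test the ratios $\dfrac{s_1^2}{s_2^2}$ and $\dfrac{s_2^2}{s_1^2}$ against values of $F$.

If we want to test $H_0: \sigma_1^2 \le \sigma_2^2$ against $H_1: \sigma_1^2 > \sigma_2^2$, compare $\dfrac{s_1^2}{s_2^2}$ against $F_{\alpha}^{(DF_1, DF_2)}$, where $DF_1 = n_1 - 1$ and $DF_2 = n_2 - 1$. If the ratio of sample variances is larger than $F_{\alpha}^{(DF_1, DF_2)}$, reject $H_0$.

If we want to do the opposite test, $H_0: \sigma_1^2 \ge \sigma_2^2$ against $H_1: \sigma_1^2 < \sigma_2^2$, compare $\dfrac{s_2^2}{s_1^2}$ against $F_{\alpha}^{(DF_2, DF_1)}$.

For the 2-sided test, $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \ne \sigma_2^2$, do both tests above, but use $F_{\alpha/2}$. A 2-sided confidence interval is

$$\frac{s_2^2}{s_1^2} \cdot \frac{1}{F_{\alpha/2}^{(DF_2, DF_1)}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\, F_{\alpha/2}^{(DF_1, DF_2)}.$$

8.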
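Appendix: Sample sizes for confidence intervals for differences between means and proportions.

For means,

$$n_1 = n_2 = \frac{z_{\alpha/2}^2 \left( \sigma_1^2 + \sigma_2^2 \right)}{e^2}.$$

For proportions,

$$n_1 = n_2 = \frac{z_{\alpha/2}^2 \left( p_1 q_1 + p_2 q_2 \right)}{e^2}.$$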
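
The two sample-size formulas in this appendix can be sketched in Python as follows. This is only an illustration under stated assumptions: $e$ is taken to be the desired half-width (maximum error) of the confidence interval, the result is rounded up to the next whole item (a common convention that the formulas themselves do not state), and the function names `z_upper`, `n_for_means` and `n_for_proportions` are hypothetical, not taken from the course materials.

```python
import math
from statistics import NormalDist


def z_upper(alpha: float) -> float:
    """Return z_{alpha/2}, the upper-tail standard normal critical value."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def n_for_means(sigma1_sq: float, sigma2_sq: float, e: float,
                alpha: float = 0.05) -> int:
    """Common sample size n1 = n2 for an interval on mu1 - mu2.

    Assumes e is the desired half-width and that the population
    variances (or planning guesses for them) are supplied.
    """
    z = z_upper(alpha)
    # n = z^2 (sigma1^2 + sigma2^2) / e^2, rounded up (assumed convention)
    return math.ceil(z ** 2 * (sigma1_sq + sigma2_sq) / e ** 2)


def n_for_proportions(p1: float, p2: float, e: float,
                      alpha: float = 0.05) -> int:
    """Common sample size n1 = n2 for an interval on p1 - p2.

    Assumes e is the desired half-width; p1 and p2 are planning values.
    """
    z = z_upper(alpha)
    # n = z^2 (p1 q1 + p2 q2) / e^2 with q = 1 - p, rounded up
    return math.ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / e ** 2)


if __name__ == "__main__":
    # Illustrative inputs only, not data from the handout.
    print(n_for_means(4.0, 9.0, e=1.0))          # 50
    print(n_for_proportions(0.5, 0.5, e=0.05))   # 769
```

Using $p_1 = p_2 = 0.5$ when nothing is known about the proportions gives the largest value of $p_1 q_1 + p_2 q_2$, so it is the conservative planning choice.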