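252solnD1 2/26/03 (Open this document in 'Page Layout' view!) Re-edited to use the symbol D.

D. COMPARISON OF TWO SAMPLES

1. Two Means, Two Independent Samples, Large Samples. 9.2, 9.3
2. Two Means, Two Independent Samples, Populations Normally Distributed, Population Variances Assumed Equal. 9.4, 9.6a,b†, 9.19†
3. Two Means, Two Independent Samples, Populations Normally Distributed, Population Variances not Assumed Equal. (D3, D4)
4. Two Means, Paired Samples (If samples are small, populations should be normally distributed). D1, D2, 9.33*
5. Rank Tests. Downing & Clark 17-15, 17-9. Text 15-12†, 15-28†
   a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples.
   b. Wilcoxon Signed Rank Test for Paired Samples. D5
6. Proportions. 9.42†, 9.44*, look at 9.40†
7. Variances. D6, D7, 9.70†, 9.71†, 9.76*, 9.77*

Solutions to outline points 1 through 3 are in this document.

From the formula table (in every case $D = \mu_1 - \mu_2$ and $d = \bar{x}_1 - \bar{x}_2$):

Difference between Two Means ($\sigma$ known)
   Confidence interval: $D = d \pm z_{\alpha/2}\,\sigma_d$
   Hypotheses: $H_0: D = D_0$ *, $H_1: D \ne D_0$
   Test ratio: $z = \dfrac{d - D_0}{\sigma_d}$
   Critical value: $d_{cv} = D_0 \pm z_{\alpha/2}\,\sigma_d$
   Standard error: $\sigma_d = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Difference between Two Means ($\sigma$ unknown, variances assumed equal)
   Confidence interval: $D = d \pm t_{\alpha/2}\,s_d$, with $DF = n_1 + n_2 - 2$
   Hypotheses: $H_0: D = D_0$ *, $H_1: D \ne D_0$
   Test ratio: $t = \dfrac{d - D_0}{s_d}$
   Critical value: $d_{cv} = D_0 \pm t_{\alpha/2}\,s_d$
   Standard error: $s_d = \hat{s}_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, where $\hat{s}_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

Difference between Two Means ($\sigma$ unknown, variances assumed unequal)
   Confidence interval: $D = d \pm t_{\alpha/2}\,s_d$
   Hypotheses: $H_0: D = D_0$ *, $H_1: D \ne D_0$
   Test ratio: $t = \dfrac{d - D_0}{s_d}$
   Critical value: $d_{cv} = D_0 \pm t_{\alpha/2}\,s_d$
   Standard error: $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
   Degrees of freedom: $DF = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$

* Same as $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ if $D_0 = 0$.

Problems with 2 means and independent samples

Exercise 9.2: In this problem the population variances are known. Our data are as follows: $\mu_1 = 12$ and $\mu_2 = 10$.
   For sample 1: $\sigma_1 = 4$, $n_1 = 64$, $\bar{x}_1$ = sample mean.
   For sample 2: $\sigma_2 = 3$, $n_2 = 64$, $\bar{x}_2$ = sample mean.
If $\sigma$ is known and the data is chosen from a normal population, $\bar{x}$ will have the normal distribution with a standard deviation of $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ or a variance of $\sigma_{\bar{x}}^2 = \dfrac{\sigma^2}{n}$.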
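Sums or differences of sample means will also have the normal distribution, and the variances of sums or differences of sample means from independent samples will be the sums of the variances of the individual sample means.

a) For sample 1, $\mu_1 = 12$ and $\sigma_{\bar{x}_1} = \dfrac{\sigma_1}{\sqrt{n_1}} = \dfrac{4}{\sqrt{64}} = 0.5$. So $\bar{x}_1 \sim N(12, 0.5)$, where the second argument is the standard deviation.

b) For sample 2, $\mu_2 = 10$ and $\sigma_{\bar{x}_2} = \dfrac{\sigma_2}{\sqrt{n_2}} = \dfrac{3}{\sqrt{64}} = 0.375$. So $\bar{x}_2 \sim N(10, 0.375)$.

c) Last semester we learned that if $D = \mu_1 - \mu_2$ and $d = \bar{x}_1 - \bar{x}_2$, then
   $E(d) = E(\bar{x}_1 - \bar{x}_2) = E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 = 12 - 10 = 2 = D$
and
   $\sigma_d^2 = Var(d) = Var(\bar{x}_1 - \bar{x}_2) = Var(\bar{x}_1) - 2Cov(\bar{x}_1, \bar{x}_2) + Var(\bar{x}_2)$.
But if the first and second samples are independent, $Cov(\bar{x}_1, \bar{x}_2) = 0$, $Var(\bar{x}_1) = \sigma_{\bar{x}_1}^2 = \dfrac{\sigma_1^2}{n_1} = \dfrac{4^2}{64} = \dfrac{16}{64}$ and $Var(\bar{x}_2) = \sigma_{\bar{x}_2}^2 = \dfrac{\sigma_2^2}{n_2} = \dfrac{3^2}{64} = \dfrac{9}{64}$. So
   $\sigma_d^2 = \dfrac{\sigma_1^2}{n_1} - 0 + \dfrac{\sigma_2^2}{n_2} = \dfrac{16}{64} + \dfrac{9}{64} = \dfrac{25}{64}$.
Finally $\sigma_d = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{25}{64}} = \dfrac{5}{8} = 0.625$. (The formula in the outline is $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.) So we conclude $d = \bar{x}_1 - \bar{x}_2 \sim N(2, 0.625)$.

d) Yes.

Exercise 9.3: a) From the outline, if $D = \mu_1 - \mu_2$ and $d = \bar{x}_1 - \bar{x}_2$, the formula for a confidence interval is $D = d \pm t_{\alpha/2}\,s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,s_d$, and for large samples we replace $t$ with $z$ and use $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
   For sample 1: $s_1 = 150$, $n_1 = 400$, $\bar{x}_1 = 5275$.
   For sample 2: $s_2 = 200$, $n_2 = 400$, $\bar{x}_2 = 5240$.
Note that $n_1 + n_2 - 2 = 400 + 400 - 2 = 798$. Since this is a large sample, we replace $t$ with $z$. $\alpha = .05$, so $z_{\alpha/2} = z_{.025} = 1.960$,
   $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{150^2}{400} + \dfrac{200^2}{400}} = \sqrt{\dfrac{62500}{400}} = \sqrt{156.25} = 12.5$
and $d = \bar{x}_1 - \bar{x}_2 = 5275 - 5240 = 35$. Finally $D = d \pm z_{\alpha/2}\,s_d = 35 \pm 1.96(12.5) = 35 \pm 24.5$, or 10.5 to 59.5.

b) The hypotheses are $H_0: D = 0$, $H_1: D \ne 0$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$.

(i) Test Ratio (from the outline or Table 3 b): $t = \dfrac{d - D_0}{s_d}$ or $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$. Here $z = \dfrac{d - D_0}{s_d} = \dfrac{35 - 0}{12.5} = 2.8$. Make a diagram with zero in the middle showing 'reject' regions below -1.96 and above 1.96. Since 2.8 falls in the upper 'reject' region, reject $H_0$. Or use the p-value: $pval = 2P(z \ge 2.80) = 2(.5 - .4974) = .0052$. Since the p-value is below the significance level $\alpha = .05$, reject $H_0$.

(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\,s_d$. If $d = \bar{x}_1 - \bar{x}_2$ is between the two critical values, do not reject $H_0$.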
$d_{CV} = 0 \pm z_{\alpha/2}\,s_d = 0 \pm 1.960(12.5) = 0 \pm 24.5$. Make a diagram with zero in the middle showing 'reject' regions below -24.5 and above 24.5. Since $d = 35$ falls in the upper 'reject' region, reject $H_0$.

(iii) Confidence Interval: Since $D_0 = 0$ does not fall in the confidence interval in a), reject $H_0$.

c) The hypotheses are $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$, or $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$, or $H_0: D \le 0$, $H_1: D > 0$.

(i) Test Ratio: $t = \dfrac{d - D_0}{s_d}$ or $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. If this test ratio lies below $t_\alpha$, do not reject $H_0$. Here $z = \dfrac{d - D_0}{s_d} = \dfrac{35 - 0}{12.5} = 2.8$. Make a diagram with zero in the middle showing a 'reject' region above $z_{.05} = 1.645$. Since 2.8 falls in the 'reject' region, reject $H_0$. Or use the p-value: $pval = P(z \ge 2.80) = .5 - .4974 = .0026$. Since the p-value is below the significance level, $\alpha = .05$, reject $H_0$.

(ii) Critical Value: $d_{CV} = D_0 + t_\alpha\,s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) + t_\alpha\,s_d$. If $d = \bar{x}_1 - \bar{x}_2$ is not above the critical value, do not reject $H_0$. Here $d_{CV} = 0 + z_\alpha\,s_d = 0 + 1.645(12.5) = 20.5625$. Make a diagram with zero in the middle showing a 'reject' region above 20.5625. Since $d = 35$ falls in the 'reject' region, reject $H_0$.

(iii) Confidence Interval: The formula for a two-sided confidence interval was $D = d \pm z_{\alpha/2}\,s_d$. But since the alternate hypothesis is now $H_1: D > 0$, the confidence interval becomes $D \ge d - z_\alpha\,s_d = 35 - 1.645(12.5) = 14.4375$. Make a diagram. Shade the area above 14.4375 to represent the confidence interval, $D \ge 14.4375$. Shade the area below zero to represent the null hypothesis, $H_0: D \le 0$. Since the two areas do not touch, the confidence interval contradicts the null hypothesis, and we must reject it.

d) The hypotheses are $H_0: \mu_1 - \mu_2 = 25$, $H_1: \mu_1 - \mu_2 \ne 25$, or $H_0: D = 25$, $H_1: D \ne 25$, so this time $D_0 = 25$.

(i) Test Ratio: $t = \dfrac{d - D_0}{s_d}$ or $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. If this test ratio lies between $\pm t_{\alpha/2} = \pm z_{\alpha/2}$, do not reject $H_0$. Here $z = \dfrac{d - D_0}{s_d} = \dfrac{35 - 25}{12.5} = 0.8$. Make a diagram with zero in the middle showing 'reject' regions below -1.96 and above 1.96. Since 0.8 does not fall in one of the 'reject' regions, do not reject $H_0$. Or use the p-value: $pval = 2P(z \ge 0.80) = 2(.5 - .2881) = .4238$.
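Since the p-value is above the significance level $\alpha = .05$, do not reject $H_0$.

(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\,s_d$. If $d = \bar{x}_1 - \bar{x}_2$ is between the two critical values, do not reject $H_0$. Here $d_{CV} = D_0 \pm z_{\alpha/2}\,s_d = 25 \pm 1.960(12.5) = 25 \pm 24.5$, or 0.5 and 49.5. Make a diagram with 25 in the middle showing 'reject' regions below 0.5 and above 49.5. Since $d = 35$ does not fall in a 'reject' region, do not reject $H_0$.

(iii) Confidence Interval: Since $D_0 = 25$ falls in the confidence interval in a) (10.5 to 59.5), do not reject $H_0$.

e) You must assume that $\bar{x}_1$ and $\bar{x}_2$ are two independent random variables.

Exercise 9.4: To use the t statistic, the narrowest assumptions are that we have two independent samples and that each population is approximately normal. If we wish to use the method presented in the text (the second method in the outline), where $t = t^{(n_1 + n_2 - 2)}$, $s_d = \sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$ and $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, we must also assume that the population variances are equal.

Exercise 9.6a,b†: All this problem wants is computation of $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ to use in $s_d = \sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$.

a) For sample 1: $s_1^2 = 120$, $n_1 = 25$. For sample 2: $s_2^2 = 100$, $n_2 = 25$. So
   $s_p^2 = \dfrac{(25 - 1)120 + (25 - 1)100}{25 + 25 - 2} = \dfrac{24(120) + 24(100)}{48} = \dfrac{120 + 100}{2} = 110$.

b) For sample 1: $s_1^2 = 12$, $n_1 = 20$. For sample 2: $s_2^2 = 20$, $n_2 = 10$. So
   $s_p^2 = \dfrac{(20 - 1)12 + (10 - 1)20}{20 + 10 - 2} = \dfrac{19(12) + 9(20)}{28} = \dfrac{408}{28} = 14.5714$.

Exercise 9.19†:
   For sample 1: $\bar{x}_1 = 0.0491$, $s_1^2 = 0.009800$, $n_1 = 27$.
   For sample 2: $\bar{x}_2 = 0.0307$, $s_2^2 = 0.002465$, $n_2 = 23$.
$\alpha = .05$. We are assuming $\sigma_1^2 = \sigma_2^2$. $d = \bar{x}_1 - \bar{x}_2 = 0.0491 - 0.0307 = 0.0184$. So
   $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{26(0.009800) + 22(0.002465)}{48} = 0.006438$
and
   $s_d = \sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{0.006438\left(\dfrac{1}{27} + \dfrac{1}{23}\right)} = \sqrt{0.006438(0.037037 + 0.043478)} = \sqrt{0.006438(0.080515)} = \sqrt{0.00051836} = 0.02276747$.
$df = n_1 + n_2 - 2 = 27 + 23 - 2 = 48$.

a) The hypotheses are $H_0: D = 0$, $H_1: D \ne 0$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$.

(i) If we use a test ratio, $t = \dfrac{d - D_0}{s_d} = \dfrac{0.0184 - 0}{0.02276747} = 0.80817$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$.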
Make a diagram with zero in the middle showing 'reject' regions below $-t^{48}_{.025} = -2.011$ and above $t^{48}_{.025} = 2.011$. Since 0.80817 does not fall in a 'reject' region, do not reject $H_0$. Or use the p-value: $pval = 2P(t \ge 0.80817)$. Since 0.80817 lies between $t^{48}_{.25} = 0.680$ and $t^{48}_{.20} = 0.849$, we can say $.40 < pval < .50$. Since the p-value is above the significance level $\alpha = .05$, do not reject $H_0$.

(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d = 0 \pm 2.011(0.02276747) = \pm 0.0458$. Make a diagram with zero in the middle showing 'reject' regions below -0.0458 and above 0.0458. Since $d = 0.0184$ does not fall in a 'reject' region, do not reject $H_0$.

b) The confidence interval is $D = d \pm t_{\alpha/2}\,s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,s_d = 0.0184 \pm 2.011(0.02276747) = 0.0184 \pm 0.0458$, or -0.0274 to 0.0642. Since this interval includes $D_0 = 0$, do not reject $H_0$.

Problem D3 (Optional): A secretary types 16 pages on word processor 1 and 16 pages on word processor 2. Her times are:
   $\bar{x}_1 = 8.20$, $s_1^2 = 4.10$
   $\bar{x}_2 = 7.10$, $s_2^2 = 4.20$
If $D = \mu_1 - \mu_2$, test $D \le 0$ at the 90 per cent confidence level. Assume that these are independent samples and that $\sigma_1^2 \ne \sigma_2^2$. ($\alpha = .10$)

Solution: We have $n_1 = 16$ and $n_2 = 16$. The two-sided confidence interval for this problem was done in class. From the Syllabus supplement:

Difference between Two Means ($\sigma$ unknown, variances assumed unequal)
   Confidence interval: $D = d \pm t_{\alpha/2}\,s_d$
   Hypotheses: $H_0: D = D_0$, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$
   Test ratio: $t = \dfrac{d - D_0}{s_d}$
   Critical value: $d_{cv} = D_0 \pm t_{\alpha/2}\,s_d$
   Standard error: $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
   Degrees of freedom: $DF = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$

We found the following in class: $\dfrac{s_1^2}{n_1} = \dfrac{4.1}{16} = 0.25625$, $\dfrac{s_2^2}{n_2} = \dfrac{4.2}{16} = 0.26250$, so $\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} = 0.25625 + 0.26250 = 0.51875$, $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{0.51875} = 0.720$, $d = \bar{x}_1 - \bar{x}_2 = 8.20 - 7.10 = 1.10$ and
   $DF = \dfrac{(0.51875)^2}{\dfrac{(0.25625)^2}{15} + \dfrac{(0.26250)^2}{15}} = \dfrac{0.26910}{0.00438 + 0.00459} \approx 29.996$.
I will follow my own advice this time and round the degrees of freedom down to 29. (If we had followed this advice in class, we would have used $t^{29}_{.025} = 2.045$ and the two-sided confidence interval would have been $D = d \pm t_{\alpha/2}\,s_d = 1.10 \pm 2.045(0.720) = 1.10 \pm 1.47$.)
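The standard error and degrees-of-freedom arithmetic for Problem D3 can be checked with a short Python sketch. The function names `std_error_unequal` and `satterthwaite_df` are illustrative choices, not names from the course materials; the inputs are the summary statistics stated in the problem.

```python
import math

def std_error_unequal(var1, n1, var2, n2):
    """Standard error of d = xbar1 - xbar2 when variances are not pooled."""
    return math.sqrt(var1 / n1 + var2 / n2)

def satterthwaite_df(var1, n1, var2, n2):
    """Approximate degrees of freedom from the unequal-variances formula."""
    a = var1 / n1  # s1^2 / n1
    b = var2 / n2  # s2^2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Summary statistics from Problem D3 (sample variances, sample sizes).
s_d = std_error_unequal(4.10, 16, 4.20, 16)
df = satterthwaite_df(4.10, 16, 4.20, 16)

# Rounding down is the conservative choice recommended in the text.
df_rounded = math.floor(df)

print(f"s_d = {s_d:.3f}, DF = {df:.3f}, rounded down to {df_rounded}")
```

Rounding down rather than to the nearest integer gives a slightly larger critical value of t, which is why it is the cautious choice.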
We are now testing $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$, or $H_0: D \le 0$, $H_1: D > 0$. Since our hypotheses are one-sided, we use $t^{29}_{.10} = 1.311$.

(i) Test Ratio: $t = \dfrac{d - D_0}{s_d} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_d} = \dfrac{1.10 - 0}{0.720} = 1.527$. Make a diagram with zero in the middle showing a 'reject' region above $t^{29}_{.10} = 1.311$. Since 1.527 falls in the 'reject' region, reject $H_0$.

(ii) Critical Value: $d_{CV} = D_0 + t_\alpha\,s_d = 0 + 1.311(0.720) = 0.944$. Make a diagram with zero in the middle showing a 'reject' region above 0.944. Since $d = 1.10$ falls in the 'reject' region, reject $H_0$.

(iii) Confidence Interval: $D = d \pm t_{\alpha/2}\,s_d$ becomes $D \ge d - t_\alpha\,s_d = 1.10 - 1.311(0.720) = 0.156$. $D \ge 0.156$ contradicts the null hypothesis $D \le 0$, so reject $H_0$.

Problem D4: (Old Minitab Manual - modified) In a study of tool life, two independent samples of wear are taken. The first of these represents volume loss in millionths of a cubic inch from 10 untreated tools. The second represents loss in the same units from 10 tools that were treated by a new wear retardant process.

   Untreated  .56 .50 .69 .59 .47 .42 .45 .47 .50 .50
   Treated    .13 .13 .18 .23 .18 .31 .35 .23 .31 .33

On the assumption that the parent populations are Normal, test the hypothesis that the means are equal and do a confidence interval for the difference between the means ($\alpha = .05$) (a) assuming that the variances are equal and (b) assuming that the variances are not equal.

Solution: The sums we need are:

   $x_1$:    0.56   0.50   0.69   0.59   0.47   0.42   0.45   0.47   0.50   0.50     sum = 5.15
   $x_1^2$:  0.3136 0.2500 0.4761 0.3481 0.2209 0.1764 0.2025 0.2209 0.2500 0.2500   sum = 2.7085
   $x_2$:    0.13   0.13   0.18   0.23   0.18   0.31   0.35   0.23   0.31   0.33     sum = 2.38
   $x_2^2$:  0.0169 0.0169 0.0324 0.0529 0.0324 0.0961 0.1225 0.0529 0.0961 0.1089   sum = 0.628

   $\bar{x}_1 = \dfrac{\sum x_1}{n_1} = \dfrac{5.15}{10} = 0.515$, $s_1^2 = \dfrac{\sum x_1^2 - n_1\bar{x}_1^2}{n_1 - 1} = \dfrac{2.7085 - 10(0.515)^2}{9} = 0.00625$.
   $\bar{x}_2 = \dfrac{\sum x_2}{n_2} = \dfrac{2.38}{10} = 0.238$, $s_2^2 = \dfrac{\sum x_2^2 - n_2\bar{x}_2^2}{n_2 - 1} = \dfrac{0.628 - 10(0.238)^2}{9} = 0.00684$.
   $d = \bar{x}_1 - \bar{x}_2 = 0.515 - 0.238 = 0.277$.

The hypotheses are $H_0: D = 0$, $H_1: D \ne 0$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$.

a) We assume that the variances are equal ($\sigma_1^2 = \sigma_2^2$). So we use the traditional method for this problem.
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{9(0.00625) + 9(0.00684)}{18} = \dfrac{0.00625 + 0.00684}{2} = 0.006545 \approx 0.00655$ and
$s_d = \sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{0.006545\left(\dfrac{1}{10} + \dfrac{1}{10}\right)} = \sqrt{0.006545(0.2)} = 0.03618$.
$df = n_1 + n_2 - 2 = 10 + 10 - 2 = 18$.

(i) Test Ratio: $t = \dfrac{d - D_0}{s_d} = \dfrac{0.277 - 0}{0.03618} = 7.66$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$. Make a diagram with zero in the middle showing 'reject' regions below $-t^{18}_{.025} = -2.101$ and above $t^{18}_{.025} = 2.101$. Since 7.66 falls in a 'reject' region, reject $H_0$.

(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d = 0 \pm 2.101(0.03618) = \pm 0.076$. Make a diagram with zero in the middle showing 'reject' regions below -0.076 and above 0.076. Since $d = 0.277$ falls in a 'reject' region, reject $H_0$.

(iii) Confidence Interval: $D = d \pm t_{\alpha/2}\,s_d = 0.277 \pm 2.101(0.03618) = 0.277 \pm 0.076$, or 0.201 to 0.353. Since zero is not on this interval, reject $H_0$.

b) (Optional) We assume that the variances are not equal ($\sigma_1^2 \ne \sigma_2^2$). So use the Satterthwaite approximation.
$\dfrac{s_1^2}{n_1} = \dfrac{0.00625}{10} = 0.000625$, $\dfrac{s_2^2}{n_2} = \dfrac{0.00684}{10} = 0.000684$, so $\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} = 0.000625 + 0.000684 = 0.001309$, $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{0.001309} = 0.03618$, $d = \bar{x}_1 - \bar{x}_2 = 0.515 - 0.238 = 0.277$ and
   $DF = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \dfrac{(0.001309)^2}{\dfrac{(0.000625)^2}{9} + \dfrac{(0.000684)^2}{9}} = 17.96$.
We probably should round down to 17 degrees of freedom, but note that, if we use 18 degrees of freedom, our results with this method are the same as those with the traditional method.

(i) Test Ratio: $t = \dfrac{d - D_0}{s_d} = \dfrac{0.277 - 0}{0.03618} = 7.66$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$. Make a diagram with zero in the middle showing 'reject' regions below $-t^{17}_{.025} = -2.110$ and above $t^{17}_{.025} = 2.110$. Since 7.66 falls in a 'reject' region, reject $H_0$.

(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d = 0 \pm 2.110(0.03618) = \pm 0.076$. Make a diagram with zero in the middle showing 'reject' regions below -0.076 and above 0.076. Since $d = 0.277$ falls in a 'reject' region, reject $H_0$.

(iii) Confidence Interval: $D = d \pm t_{\alpha/2}\,s_d = 0.277 \pm 2.110(0.03618) = 0.277 \pm 0.076$, or 0.201 to 0.353. Since zero is not on this interval, reject $H_0$.
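As a cross-check on Problem D4 part (a), here is a minimal Python sketch of the pooled-variance (traditional) method applied to the raw data. The function name `pooled_two_sample` and the constant `T_CRIT` are illustrative assumptions; the critical value 2.101 is simply the table value quoted in the solution, not something the code computes.

```python
import math
import statistics

# Raw data from Problem D4 (volume loss, millionths of a cubic inch).
untreated = [.56, .50, .69, .59, .47, .42, .45, .47, .50, .50]
treated = [.13, .13, .18, .23, .18, .31, .35, .23, .31, .33]

def pooled_two_sample(x1, x2, d0=0.0):
    """Return (d, s_d, t, df) for the equal-variances two-sample method."""
    n1, n2 = len(x1), len(x2)
    d = statistics.mean(x1) - statistics.mean(x2)
    v1 = statistics.variance(x1)  # sample variance, divisor n - 1
    v2 = statistics.variance(x2)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
    s_d = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (d - d0) / s_d
    return d, s_d, t, n1 + n2 - 2

d, s_d, t, df = pooled_two_sample(untreated, treated)

T_CRIT = 2.101  # t(18 df, .025) as quoted from the t table in the solution
lower, upper = d - T_CRIT * s_d, d + T_CRIT * s_d
reject = abs(t) > T_CRIT  # two-sided test of H0: D = 0

print(f"d = {d:.3f}, s_d = {s_d:.5f}, t = {t:.2f}, df = {df}")
print(f"interval: {lower:.3f} to {upper:.3f}, reject H0: {reject}")
```

Because the two samples have the same size here, the pooled standard error equals the unpooled one, which is why parts (a) and (b) give the same test ratio and differ only in degrees of freedom.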
Parts not copied ©2003 Roger Even Bove