BA 2606

Section 13.4 Inferences on Two Population Variances

In addition to comparing two means, researchers may be interested in comparing two population variances. For example, is the variation in two quality control processes different? Another reason is to determine which t test to use when comparing two means: the pooled variance case or the unequal variances case. To compare two variances or standard deviations, an F test is used.

Note that when comparing means, we look at the difference between the two means. When comparing variances, we look at the ratio of the two variances, $\sigma_1^2/\sigma_2^2$. Not surprisingly, the ratio $s_1^2/s_2^2$ is used as the test statistic, where $s_1^2$ is the larger variance. When $\sigma_1^2 = \sigma_2^2$, the sampling distribution of $s_1^2/s_2^2$ is the F distribution.

Properties of the F distribution

- The values of F cannot be negative, because variances are always positive or zero.
- The distribution is positively skewed.
- The F distribution is a family of curves based on the degrees of freedom of the variance in the numerator and the degrees of freedom of the variance in the denominator.

The assumptions here are that the two independent random samples come from normally distributed populations.

Test Statistic:
\[ F = \frac{s_1^2}{s_2^2} \]
with numerator df $\nu_1 = n_1 - 1$ and denominator df $\nu_2 = n_2 - 1$, where $s_1^2$ is the larger variance.

A $(1-\alpha)100\%$ Confidence Interval for $\sigma_1^2/\sigma_2^2$ has
\[ LCL = \frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2,\,\nu_1,\,\nu_2}} \quad \text{and} \quad UCL = \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2,\,\nu_2,\,\nu_1} \]

Table 6 in Appendix B gives the critical values for the F distribution for $\alpha$ = 0.05, 0.025 and 0.01. The table is limited, because two sets of degrees of freedom (one for the numerator and one for the denominator) require a lot of numbers. For the lower-tailed critical value we may use the relationship
\[ F_{1-\alpha/2,\,\nu_1,\,\nu_2} = \frac{1}{F_{\alpha/2,\,\nu_2,\,\nu_1}} \]
Choosing the larger of the two sample variances as the numerator may save some work.

Example: Find the critical values for
a. a right-tailed F test when $\alpha = 0.05$ and $\nu_1 = 15$, $\nu_2 = 21$. (2.18)
b. a two-tailed F test when $\alpha = 0.05$ and $\nu_1 = 20$, $\nu_2 = 12$. (3.07 and 1/2.68)

When testing the equality of two variances, these hypotheses are used:

Two-tailed: $H_0: \sigma_1^2 = \sigma_2^2$, $H_A: \sigma_1^2 \ne \sigma_2^2$
Right-tailed: $H_0: \sigma_1^2 = \sigma_2^2$, $H_A: \sigma_1^2 > \sigma_2^2$
Left-tailed: $H_0: \sigma_1^2 = \sigma_2^2$, $H_A: \sigma_1^2 < \sigma_2^2$

Test Statistic: $F = s_1^2/s_2^2$ with numerator df $\nu_1 = n_1 - 1$ and denominator df $\nu_2 = n_2 - 1$

Rejection Region:
a) two-tailed test: Reject $H_0$ if $F > F_{\alpha/2,\,\nu_1,\,\nu_2}$ or if $F < F_{1-\alpha/2,\,\nu_1,\,\nu_2} = 1/F_{\alpha/2,\,\nu_2,\,\nu_1}$
b) right-tailed test: Reject $H_0$ if $F > F_{\alpha,\,\nu_1,\,\nu_2}$
c) left-tailed test: Reject $H_0$ if $F < F_{1-\alpha,\,\nu_1,\,\nu_2} = 1/F_{\alpha,\,\nu_2,\,\nu_1}$

Calculations: No p-value is necessary for the F test, due to the limitations of the table.

Conclusion:

Exercises pages 481-482

13.59 b.
\[ LCL = \frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{.025,24,24}} = \frac{28}{19} \cdot \frac{1}{2.27} = .6492 \]
\[ UCL = \frac{s_1^2}{s_2^2} \cdot F_{.025,24,24} = \frac{28}{19}(2.27) = 3.34526 \]
A 95% Confidence Interval for $\sigma_1^2/\sigma_2^2$ is (.6492, 3.34526). We are 95% confident that $\sigma_1^2/\sigma_2^2$ falls in this interval.

13.60
$H_0: \sigma_1^2 = \sigma_2^2$
$H_A: \sigma_1^2 \ne \sigma_2^2$
$\alpha = .05$
Test Statistic: $F = s_1^2/s_2^2$ with numerator df $\nu_1 = n_1 - 1$ and denominator df $\nu_2 = n_2 - 1$
Rejection Region: Reject $H_0$ if $F > F_{.025,9,10} = 3.78$ or if $F < F_{.975,9,10} = 1/F_{.025,10,9} = 1/3.96 = .2525$
Calculations:
Machine 1: s = .002394438
Machine 2: s = .0033709993
\[ F = \frac{(.002394438)^2}{(.0033709993)^2} = \frac{.00000573333}{.0000113636} = .5045 \]
Note that the reciprocal is F = 1.982. With the variances in that order, the rejection region would be: reject the null hypothesis if $F > 3.96$ or if $F < 1/3.78 = .26455$.
Conclusion: Fail to reject $H_0$. There is insufficient evidence to conclude that the two machines differ in the consistency of their fills.

13.63
$H_0: \sigma_1^2 = \sigma_2^2$
$H_A: \sigma_1^2 > \sigma_2^2$
$\alpha = .05$
Test Statistic: $F = s_1^2/s_2^2$ with numerator df $\nu_1 = n_1 - 1$ and denominator df $\nu_2 = n_2 - 1$
Rejection Region: Reject $H_0$ if $F > F_{.05,99,99} = 1.39$
Calculations:
Week 1: $s_1^2 = 19.38$, $n_1 = 100$
Week 2: $s_2^2 = 12.70$, $n_2 = 100$
\[ F = \frac{19.38}{12.70} = 1.526 \]
Conclusion: Reject $H_0$. There is enough evidence to infer that limiting both minimum and maximum speeds reduces the variation in speeds.
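The Section 13.4 confidence interval and two-tailed rejection region can be checked numerically. The following is a minimal Python sketch, not part of the original notes. The function names `variance_ratio_interval` and `two_tailed_f_test` are hypothetical, and the F critical values are assumed to be read from a table such as Table 6 and passed in by hand; the code does not compute them.

```python
# Sketch only: F critical values come from a table and are supplied by the caller.

def variance_ratio_interval(s1_sq, s2_sq, f_crit_12, f_crit_21):
    """Return (LCL, UCL) for sigma1^2 / sigma2^2.

    f_crit_12 is F_{alpha/2, nu1, nu2}; f_crit_21 is F_{alpha/2, nu2, nu1}.
    """
    ratio = s1_sq / s2_sq
    return ratio / f_crit_12, ratio * f_crit_21


def two_tailed_f_test(s1_sq, s2_sq, f_crit_12, f_crit_21):
    """Return (F, lower critical value, upper critical value, reject H0?)."""
    f_stat = s1_sq / s2_sq
    lower = 1.0 / f_crit_21  # F_{1-alpha/2, nu1, nu2} via the reciprocal relationship
    upper = f_crit_12        # F_{alpha/2, nu1, nu2}
    return f_stat, lower, upper, (f_stat > upper or f_stat < lower)


# Exercise 13.59 b: s1^2 = 28, s2^2 = 19, F_{.025,24,24} = 2.27
lcl, ucl = variance_ratio_interval(28, 19, 2.27, 2.27)
print(round(lcl, 4), round(ucl, 4))

# Exercise 13.60: F_{.025,9,10} = 3.78 and F_{.025,10,9} = 3.96
f_stat, lower, upper, reject = two_tailed_f_test(
    0.002394438 ** 2, 0.0033709993 ** 2, 3.78, 3.96
)
print(round(f_stat, 4), round(lower, 4), upper, reject)
```

Passing the table values in explicitly keeps the sketch dependency-free and mirrors the hand calculation: the only arithmetic is the variance ratio and one reciprocal.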
Section 13.5 Inference about the Difference between two Population Proportions

Sampling Distribution of $\hat{p}_1 - \hat{p}_2$

Let $\hat{p}_1 = X_1/n_1$ and $\hat{p}_2 = X_2/n_2$, where $X_1$ and $X_2$ are the number of successes in their respective samples.

If $n_1\hat{p}_1,\ n_1\hat{q}_1,\ n_2\hat{p}_2,\ n_2\hat{q}_2 \ge 5$, then $\hat{p}_1 - \hat{p}_2$ is approximately normally distributed.

The mean or expected value of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$.

The variance of $\hat{p}_1 - \hat{p}_2$ is $\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}$.

The standard error of $\hat{p}_1 - \hat{p}_2$ is $\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$.

Therefore
\[ Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}} \]
has an approximately standard normal distribution.

However, this is not exactly the test statistic we will use. There are two possible cases, each with its own test statistic. Both statistics are Z's with standard normal distributions, but they differ slightly.

Case 1: We test that there is no difference between the population proportions. There are three possible sets of hypotheses.

Two-tailed: $H_0: p_1 - p_2 = 0$, $H_A: p_1 - p_2 \ne 0$
Right- or upper-tailed: $H_0: p_1 - p_2 = 0$, $H_A: p_1 - p_2 > 0$
Left- or lower-tailed: $H_0: p_1 - p_2 = 0$, $H_A: p_1 - p_2 < 0$

We would like to use the Z statistic from above, but the standard error of $\hat{p}_1 - \hat{p}_2$, $\sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}$, is unknown, and so it must be estimated from the sample data. When the two population proportions are equal, as hypothesized, we can pool the data from the two samples to come up with a pooled proportion estimate
\[ \hat{p} = \frac{X_1 + X_2}{n_1 + n_2} \]

Test Statistic for Case 1:
\[ Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]

Case 2: We test that there exists a specified difference between the population proportions. The three possible sets of hypotheses are as follows.

Two-tailed: $H_0: p_1 - p_2 = D_0$, $H_A: p_1 - p_2 \ne D_0$
Right- or upper-tailed: $H_0: p_1 - p_2 = D_0$, $H_A: p_1 - p_2 > D_0$
Left- or lower-tailed: $H_0: p_1 - p_2 = D_0$, $H_A: p_1 - p_2 < D_0$

where $D_0 \ne 0$.

Test Statistic for Case 2:
\[ Z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}} \]

There is only one case for interval estimation for the difference in population proportions. A $(1-\alpha)100\%$ Confidence Interval for $p_1 - p_2$ is:
\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} \]

Exercises pages 493-497

13.69 b. A 90% Confidence Interval for $p_1 - p_2$ is:
\[ (.48 - .52) \pm 1.645\sqrt{\frac{.48(1-.48)}{100} + \frac{.52(1-.52)}{100}} = -.04 \pm .1162 = (-.1562,\ .0762) \]
We are 90% confident that $p_1 - p_2$ falls in this interval.

13.72
Let Population 1 be those who score under 600 on a credit scorecard and Population 2 be those who score 600 or more.
$H_0: p_1 - p_2 = 0$
$H_A: p_1 - p_2 > 0$
$\alpha = .10$
Test Statistic: $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Rejection Region: Reject $H_0$ if $Z > 1.28$
Calculations:
$\hat{p}_1 = 11/562 = .01957$
$\hat{p}_2 = 7/804 = .0087$
$\hat{p} = \dfrac{11 + 7}{562 + 804} = \dfrac{18}{1366} = .013177$
\[ Z = \frac{.01957 - .0087}{\sqrt{.013177(1 - .013177)\left(\dfrac{1}{562} + \dfrac{1}{804}\right)}} = 1.7336 \]
p-value = 1 - .9582 = .0418
Conclusion: Reject $H_0$. There is enough evidence to conclude that those who score under 600 are more likely to default than those who score 600 or more.

13.73 a.
Let Population 1 be voter preference six months ago and Population 2 be voter preference this month. Also, $p_1$ represents the proportion of voters who supported this politician six months ago and $p_2$ represents the proportion of voters who support this politician this month.
$H_0: p_1 - p_2 = 0$
$H_A: p_1 - p_2 > 0$
$\alpha = .05$
Test Statistic: $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Rejection Region: Reject $H_0$ if $Z > 1.645$
Calculations:
$\hat{p}_1 = X_1/n_1 = .56$, $X_1 = 616$, $n_1 = 1100$
$\hat{p}_2 = X_2/n_2 = .46$, $X_2 = 368$, $n_2 = 800$
$\hat{p} = \dfrac{616 + 368}{1100 + 800} = \dfrac{984}{1900} = .51789$
\[ Z = \frac{.56 - .46}{\sqrt{.51789(1 - .51789)\left(\dfrac{1}{1100} + \dfrac{1}{800}\right)}} = 4.31 \]
p-value ≈ 0
Conclusion: Reject $H_0$ at the .05 level of significance. There is enough evidence to conclude that the politician's popularity has decreased significantly.

13.73 b.
$H_0: p_1 - p_2 = 0.05$
$H_A: p_1 - p_2 > 0.05$
$\alpha = .05$
Test Statistic: $Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}}$
Rejection Region: Reject $H_0$ if $Z > 1.645$
Calculations:
\[ Z = \frac{(.56 - .46) - .05}{\sqrt{\dfrac{.56(.44)}{1100} + \dfrac{.46(.54)}{800}}} = 2.16 \]
p-value = 1 - .9846 = .0154
Conclusion: Reject $H_0$ at the .05 level of significance. There is enough evidence to conclude that the politician's popularity has decreased by more than 5%.

c. A 95% Confidence Interval for $p_1 - p_2$ is $.10 \pm .045$, that is, (.055, .145). We are 95% confident that $p_1 - p_2$ falls in this interval.

Chapter Exercises pages 499 - 502
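As a numerical cross-check on the Section 13.5 formulas, here is a minimal Python sketch, not part of the original notes. The function names (`pooled_z`, `unpooled_z`, `difference_interval`, `upper_tail_p_value`) are hypothetical, and the standard library's `statistics.NormalDist` is assumed as a stand-in for the normal table, so results can differ from table lookups in the last digit.

```python
from math import sqrt
from statistics import NormalDist  # standard normal cdf and quantiles (Python 3.8+)

STD_NORMAL = NormalDist()


def pooled_z(x1, n1, x2, n2):
    """Case 1 statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)
    se = sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


def unpooled_se(x1, n1, x2, n2):
    """Estimated standard error used in Case 2 and in the confidence interval."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    return sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)


def unpooled_z(x1, n1, x2, n2, d0):
    """Case 2 statistic for H0: p1 - p2 = d0 with d0 != 0."""
    return ((x1 / n1 - x2 / n2) - d0) / unpooled_se(x1, n1, x2, n2)


def difference_interval(x1, n1, x2, n2, confidence):
    """Return (lower, upper) confidence limits for p1 - p2."""
    z = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    diff = x1 / n1 - x2 / n2
    margin = z * unpooled_se(x1, n1, x2, n2)
    return diff - margin, diff + margin


def upper_tail_p_value(z):
    """p-value for a right-tailed test."""
    return 1 - STD_NORMAL.cdf(z)


# Exercise 13.73 data: X1 = 616 of n1 = 1100, X2 = 368 of n2 = 800
z_a = pooled_z(616, 1100, 368, 800)             # part a, Case 1
z_b = unpooled_z(616, 1100, 368, 800, 0.05)     # part b, Case 2 with D0 = .05
ci = difference_interval(616, 1100, 368, 800, 0.95)  # part c
print(round(z_a, 2), round(z_b, 2), tuple(round(v, 3) for v in ci))
```

The pooled standard error appears only in `pooled_z`, because pooling is justified only when the null hypothesis says the two proportions are equal; the Case 2 statistic and the interval share the unpooled standard error.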