BA 2606 Chapter 13
Inference about the Comparison of Two Populations

The basic concepts of hypothesis testing were explained in the last two chapters. Here we extend the discussion to two populations: we will compare two means, two variances and two proportions. As before, we will use the Z and t test statistics, but we will also introduce a new distribution, the F distribution, for comparing the variances of two populations. There are basically six new test statistics that we must familiarize ourselves with, understanding the theory, the formulas and the underlying assumptions.

We start with tests of hypothesis on two population means. There are many instances in which researchers wish to compare the means of two groups. For example, the average lifetimes of two different brands of bus tires might be compared to see whether there is any difference in tread wear, or two brands of cough syrup might be tested to see whether one brand is more effective than the other.

Section 13-1 Inference About the Difference between Two Means: Independent Samples

Sampling Distribution of $\bar{X}_1 - \bar{X}_2$
- $\bar{X}_1 - \bar{X}_2$ is normally distributed if the populations are normal (or approximately so if the sample sizes are sufficiently large).
- The expected value (mean) of $\bar{X}_1 - \bar{X}_2$ is $\mu_1 - \mu_2$.
- The variance of $\bar{X}_1 - \bar{X}_2$ is $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$.
- The standard error of $\bar{X}_1 - \bar{X}_2$ is $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.

If you recall that $Z = \frac{\text{estimator} - \text{mean}}{\text{standard error}}$ and that confidence intervals are of the form estimate $\pm$ $z$ (standard error), then it follows that in this case

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \qquad \text{and} \qquad (\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

We will not be using either of these, since the chance that both population variances are known is quite rare. What do we use when the population variances are unknown? The t distribution!

There are two cases for testing the difference in two means when the two random samples are independent of each other. For both we must assume that the two populations are normally distributed.

Case 1: The two populations are normally distributed, the two samples are random and independent of each other, and the variances are equal.

Test Statistic:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad \nu = n_1 + n_2 - 2,$$
where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
is the pooled estimate of the common population variance $\sigma^2$.

A $(1-\alpha)100\%$ Confidence Interval for $\mu_1 - \mu_2$:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \qquad \nu = n_1 + n_2 - 2.$$

Case 2: The two populations are normally distributed, the two samples are random and independent of each other, and the variances are unequal.

Test Statistic:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \qquad \nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}},$$
and round $\nu$ to the nearest integer. Some textbooks use an approximation for this df: the minimum of $(n_1 - 1)$ and $(n_2 - 1)$.

A $(1-\alpha)100\%$ Confidence Interval for $\mu_1 - \mu_2$:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},$$
with the same $\nu$ as above.

There are 3 possible sets of hypotheses:
$H_0: \mu_1 - \mu_2 = D_0$ vs. $H_A: \mu_1 - \mu_2 \ne D_0$ (two tailed)
$H_0: \mu_1 - \mu_2 = D_0$ vs. $H_A: \mu_1 - \mu_2 > D_0$ (right or upper tailed)
$H_0: \mu_1 - \mu_2 = D_0$ vs. $H_A: \mu_1 - \mu_2 < D_0$ (left or lower tailed)

Use the same procedure for hypothesis testing.

The question is: how do we know whether the variances are equal or unequal? We need to conduct another test on the ratio of the two variances, which we will cover in Section 13-4. For the comparison of two variances or standard deviations, an F test is used. Note that when comparing means we look at the difference between the two means; when comparing variances we look at the ratio of the two variances, $\frac{\sigma_1^2}{\sigma_2^2}$. Not surprisingly, the ratio $\frac{s_1^2}{s_2^2}$ will be used as the test statistic, where $s_1^2$ is the larger variance. The sampling distribution of $\frac{s_1^2}{s_2^2}$ is the F distribution.
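Before turning to the properties of the F distribution, here is a minimal code sketch of the two t-test cases above, assuming NumPy and SciPy are available. The two sample arrays and all variable names are invented for illustration and do not come from the textbook exercises.

```python
# Sketch of the two independent-samples t tests: Case 1 (pooled/equal variances)
# and Case 2 (unequal variances, Satterthwaite df).  Illustration data only.
import numpy as np
from scipy import stats

x1 = np.array([5.1, 6.3, 4.8, 5.9, 6.1, 5.5, 4.9, 6.0])
x2 = np.array([6.8, 7.2, 6.5, 7.9, 7.1, 6.9, 7.4, 6.6])
n1, n2 = len(x1), len(x2)
m1, m2 = x1.mean(), x2.mean()
v1, v2 = x1.var(ddof=1), x2.var(ddof=1)          # sample variances s1^2, s2^2
D0 = 0.0                                         # hypothesized difference

# Case 1: pooled-variance t statistic, df = n1 + n2 - 2
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_pooled = (m1 - m2 - D0) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2

# Case 2: unequal-variances t statistic with the approximate df from the formula above
se2 = v1 / n1 + v2 / n2
t_unequal = (m1 - m2 - D0) / np.sqrt(se2)
df_unequal = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

# Two-tailed p-values from the t distribution
p_pooled = 2 * stats.t.sf(abs(t_pooled), df_pooled)
p_unequal = 2 * stats.t.sf(abs(t_unequal), df_unequal)
print(t_pooled, df_pooled, p_pooled)
print(t_unequal, round(df_unequal), p_unequal)

# Cross-check against SciPy's built-in versions
print(stats.ttest_ind(x1, x2, equal_var=True))
print(stats.ttest_ind(x1, x2, equal_var=False))
```

The built-in `ttest_ind` calls serve only as a cross-check on the hand formulas; `equal_var=False` selects the unequal-variances (Welch) version.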
Properties of the F distribution
- The values of F cannot be negative, because variances are always positive or zero.
- The distribution is positively skewed.
- The F distribution is a family of curves based on the degrees of freedom of the variance in the numerator and the degrees of freedom of the variance in the denominator.
- The assumption here is that the two independent random samples come from normally distributed populations.

Test Statistic:
$$F = \frac{s_1^2}{s_2^2}, \qquad \text{numerator df } \nu_1 = n_1 - 1, \quad \text{denominator df } \nu_2 = n_2 - 1,$$
where $s_1^2$ is the larger variance. By choosing the numerator variance to be the larger of the two sample variances, we need only find the upper tail critical value. We will revisit this in Section 13-4, where we will learn how to find the lower tail critical value.

Example:
a. Find the critical value for a right tailed F test with a 0.05 level of significance when the degrees of freedom for the numerator are 15 and the degrees of freedom for the denominator are 20. (2.20)
b. Find the upper critical value for a two tailed F test with a 0.05 level of significance when the sample size from which the numerator variance was obtained was 21 and the sample size from which the denominator variance was obtained was 12. (3.23)

Exercises pages 456-460

13.12
$H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 < 0$, $\alpha = 0.10$. Note: assume both populations are normal and both samples are random.

Two-tail F test: $s_1 = 2.424412873$, $s_2 = 2.406010991$, $F = 1.015355$. Since this value is not greater than $F_{.05,9,9} = 3.18$, we fail to reject the hypothesis that the population variances are equal; use the equal-variances test statistic.

Test Statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Rejection region: Reject $H_0$ if $t < -t_{\alpha}$; $-t_{.10,18} = -1.330$.

Calculations:
$$t = \frac{(5.10 - 7.30) - 0}{\sqrt{\frac{(10-1)5.88 + (10-1)5.79}{10 + 10 - 2}\left(\frac{1}{10} + \frac{1}{10}\right)}} = -2.04$$
p-value = .0283 by computer program; by table, .025 < p-value < .05.

Conclusion: Reject $H_0$ at $\alpha = .10$. There is enough evidence to infer that there are fewer errors when the yellow ball is used.

13.29
$H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 < 0$, $\alpha = 0.05$. Note: assume both populations are normal and both samples are random.

From Appendix A:
With Textbook: $\bar{x}_1 = 63.71$, $s_1 = 5.90$, $n_1 = 173$
No Textbook: $\bar{x}_2 = 66.80$, $s_2 = 6.85$, $n_2 = 202$

Two-tail F test: $F = 1.348$. Since this value is greater than $F_{.025,201,172} \approx 1.32$, we reject the hypothesis that the population variances are equal; use the unequal-variances test statistic.

Test Statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \qquad \nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}} = 372.9862, \text{ rounded to } 373.$$

Rejection region: Reject $H_0$ if $t < -t_{\alpha}$; $-t_{.05,373} = -1.653$.

Calculations:
$$t = \frac{(63.71 - 66.80) - 0}{\sqrt{\frac{5.90^2}{173} + \frac{6.85^2}{202}}} = -4.69, \qquad \text{p-value} < .005.$$

Conclusion: Reject $H_0$ at $\alpha = .05$. There is enough evidence to infer that students without textbooks outperform those with textbooks.

Section 13-2 Observational and Experimental Data

We can obtain our data in two different ways:

1) Experiment (or controlled study) - the researcher randomly divides subjects into appropriate groups. Some treatment is applied to one or more groups and the effect or response is observed. For example, patients may be randomly given unmarked capsules of either aspirin or acetaminophen and the effects of the medication measured for each group. Experiments often have a treatment group and a control group where, ideally, neither the subjects nor the researchers know which group is which.
In the Salk vaccine experiment of 1954, half the children received the polio vaccine while the other half received a placebo; their doctors did not even know who received what. This is an example of a double-blind approach. Blinding occurs when the subjects and the researchers do not know who is receiving which treatment. Controlled experiments can indicate cause-and-effect relationships.

2) Observational Study - here there is no choice as to who goes into the treatment or control group. Specific characteristics are observed and measured, but the researcher does not attempt to modify the subjects studied. For example, a researcher cannot ethically tell 100 people to smoke 3 packs of cigarettes a day and another 100 to smoke only a pack a day; they can only observe people who habitually smoke these amounts. Results of such studies can suggest relationships, but it is difficult to conclude cause and effect.

In an experiment we impose some change or treatment and measure the result or response. In an observational study we simply observe and measure something that has taken place or is taking place, while not trying to cause any changes by our presence.

Which is more appropriate, an experiment or an observational study?
a) A study designed to determine whether daily calcium supplements benefit women by increasing bone mass? Experimental.
b) A study designed to examine the life expectancies of tall versus short people? Observational (examine medical records of heights and ages at time of death). An experiment, choosing half the subjects to be short and half to be tall, makes no sense.

Exercises page 466

13.90 It is an experimental study if volunteers are randomly assigned to eating oat bran versus another grain cereal.

13.91
a. Observational, because students choose the software package.
b. If students are randomly assigned to a software package, it becomes an experimental study.

Section 13-3 Inference About the Difference between Two Means: Matched Pairs Experiment

In this section our two samples are dependent samples, where the subjects are paired or matched in some way - for example, heart rate before and after taking a certain medication. We define the differences $D_i = X_i - Y_i$ and we are interested in the population mean of the differences, $\mu_D$. We assume that the differences are normally distributed and we use the t distribution.

There are 3 possible sets of hypotheses:
$H_0: \mu_D = D_0$ vs. $H_A: \mu_D \ne D_0$ (two tailed)
$H_0: \mu_D = D_0$ vs. $H_A: \mu_D > D_0$ (right or upper tailed)
$H_0: \mu_D = D_0$ vs. $H_A: \mu_D < D_0$ (left or lower tailed)

Use the same procedure for hypothesis testing.

Test Statistic:
$$t = \frac{\bar{X}_D - D_0}{s_D / \sqrt{n_D}}, \qquad \nu = n_D - 1.$$

A $(1-\alpha)100\%$ Confidence Interval for $\mu_D$ is
$$\bar{X}_D \pm t_{\alpha/2}\frac{s_D}{\sqrt{n_D}}.$$
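Here is a minimal sketch of the matched-pairs procedure in code, assuming NumPy and SciPy are available. The before/after arrays are taken from the cholesterol example worked out next, so the output can be checked against the hand calculation.

```python
# Matched-pairs t test and confidence interval for mu_D = mean of the differences.
# The data reproduce the cholesterol example below (D = Before - After).
import numpy as np
from scipy import stats

before = np.array([210, 235, 208, 190, 172, 244])
after  = np.array([190, 170, 210, 188, 173, 228])
d = before - after                                 # differences D_i = X_i - Y_i
n = len(d)

d_bar = d.mean()
s_d = d.std(ddof=1)
t_stat = (d_bar - 0) / (s_d / np.sqrt(n))          # test statistic, df = n - 1
p_value = 2 * stats.t.sf(abs(t_stat), n - 1)       # two-tailed p-value

# 90% confidence interval for mu_D (alpha = .10)
t_crit = stats.t.ppf(0.95, n - 1)                  # t_{alpha/2, n-1}
ci = (d_bar - t_crit * s_d / np.sqrt(n), d_bar + t_crit * s_d / np.sqrt(n))

print(t_stat, p_value, ci)
print(stats.ttest_rel(before, after))              # built-in cross-check
```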
Example
A dietician wishes to see if a person's cholesterol level will change if the diet is supplemented by a certain mineral. Six subjects were pretested and then took the mineral supplement for a six-week period. The results are shown in the following table (cholesterol level is measured in milligrams per deciliter). Can it be concluded that the cholesterol level has changed at the .10 level of significance? Assume that cholesterol levels are normally distributed. Also construct a 90% confidence interval for the mean of the differences.

Subject   Before   After
1         210      190
2         235      170
3         208      210
4         190      188
5         172      173
6         244      228

Define D = Before - After.

$H_0: \mu_D = 0$, $H_A: \mu_D \ne 0$, $\alpha = 0.10$

Test Statistic: $t = \frac{\bar{X}_D - D_0}{s_D/\sqrt{n_D}}$, $\nu = 5$

Rejection Region: Reject $H_0$ if $t < -2.015$ or $t > 2.015$.

Calculations: Differences = 20, 65, -2, 2, -1, 16; $\sum D_i = 100$, $\sum D_i^2 = 4890$, $\bar{d} = 16.67$, $s_D = 25.39$.
$$t = \frac{16.67 - 0}{25.39/\sqrt{6}} = 1.61, \qquad 2(0.05) < \text{p-value} < 2(0.10).$$

Conclusion: Fail to reject $H_0$ at the .10 level of significance. There is insufficient evidence to support the claim that the mineral changes a person's cholesterol level.

Note: A 90% Confidence Interval for $\mu_D$ is
$$16.67 \pm 2.015\frac{25.39}{\sqrt{6}} = 16.67 \pm 20.89 \approx (-4.2,\ 37.6).$$
We are 90% confident that $\mu_D$ falls in this interval. Note that 0 falls within this interval, which implies the null hypothesis that $\mu_D = 0$ cannot be rejected.

Exercises Pages 478-479

Section 13-4 Inferences on Two Population Variances

In addition to comparing two means, researchers may be interested in comparing two population variances. For example, is the variation in two quality control processes different? Another reason may be to determine which t test to use when comparing two means: the pooled-variance case or the unequal-variances case.

For the comparison of two variances or standard deviations, an F test is used. Note that when comparing means we look at the difference between the two means; when comparing variances we look at the ratio of the two variances, $\frac{\sigma_1^2}{\sigma_2^2}$. Not surprisingly, the ratio $\frac{s_1^2}{s_2^2}$ will be used as the test statistic, where $s_1^2$ is the larger variance. The sampling distribution of $\frac{s_1^2}{s_2^2}$ is the F distribution.

Properties of the F distribution
- The values of F cannot be negative, because variances are always positive or zero.
- The distribution is positively skewed.
- The F distribution is a family of curves based on the degrees of freedom of the variance in the numerator and the degrees of freedom of the variance in the denominator.
- The assumption here is that the two independent random samples come from normally distributed populations.

Test Statistic:
$$F = \frac{s_1^2}{s_2^2}, \qquad \text{numerator df } \nu_1 = n_1 - 1, \quad \text{denominator df } \nu_2 = n_2 - 1,$$
where $s_1^2$ is the larger variance.

A $(1-\alpha)100\%$ Confidence Interval for $\frac{\sigma_1^2}{\sigma_2^2}$ has
$$LCL = \frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{\alpha/2,\nu_1,\nu_2}} \qquad \text{and} \qquad UCL = \frac{s_1^2}{s_2^2}\cdot F_{\alpha/2,\nu_2,\nu_1}.$$

Table 6 in Appendix B gives the critical values of the F distribution for $\alpha$ = 0.05, 0.025, 0.01 and 0.005. It is limited, since the requirement of two sets of degrees of freedom (numerator and denominator) means a lot of numbers. We use the relationship
$$F_{1-\alpha,\nu_1,\nu_2} = \frac{1}{F_{\alpha,\nu_2,\nu_1}}$$
for the lower tail critical value. Choosing the larger of the two sample variances as the numerator may save some work.

Example: Find the critical values for
a. a right tailed F test when $\alpha = 0.05$ and $\nu_1 = 15$, $\nu_2 = 21$. (2.18)
b. a two tailed F test when $\alpha = 0.05$ and $\nu_1 = 20$, $\nu_2 = 12$. (upper: $F_{.025,20,12} = 3.07$; lower: $1/F_{.025,12,20} = 1/2.68 \approx 0.37$)

When testing the equality of two variances, these hypotheses are used:
$H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_A: \sigma_1^2 \ne \sigma_2^2$ (two-tailed)
$H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_A: \sigma_1^2 > \sigma_2^2$ (right-tailed)
$H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_A: \sigma_1^2 < \sigma_2^2$ (left-tailed)

Test Statistic: $F = \frac{s_1^2}{s_2^2}$ with numerator df $\nu_1 = n_1 - 1$ and denominator df $\nu_2 = n_2 - 1$.

Rejection Region:
a) two tailed test: Reject $H_0$ if $F > F_{\alpha/2,\nu_1,\nu_2}$ or if $F < F_{1-\alpha/2,\nu_1,\nu_2} = \frac{1}{F_{\alpha/2,\nu_2,\nu_1}}$
b) right tailed test: Reject $H_0$ if $F > F_{\alpha,\nu_1,\nu_2}$
c) left tailed test: Reject $H_0$ if $F < F_{1-\alpha,\nu_1,\nu_2} = \frac{1}{F_{\alpha,\nu_2,\nu_1}}$

Calculations and Conclusion then follow as in the usual procedure; no p-value is necessary for the F test, due to the limitations of the table.
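A minimal code sketch of the F procedures in this section, assuming SciPy is available (software replaces Table 6, so lower-tail critical values and p-values pose no problem). The summary values in the second part match exercise 13.121 b below, where the df of 24 implies sample sizes of 25, so the confidence-interval output can be compared with the hand calculation.

```python
# F critical values, two-sided F test, and CI for sigma1^2 / sigma2^2.
from scipy import stats

# Critical values, e.g. the right-tailed example above: alpha = .05, df 15 and 21
print(stats.f.ppf(1 - 0.05, 15, 21))        # upper critical value, about 2.18
print(1 / stats.f.ppf(1 - 0.05, 21, 15))    # lower value via 1/F_{alpha, df2, df1}
print(stats.f.ppf(0.05, 15, 21))            # same lower value obtained directly

# Two-sided F test and CI from summary data (values match exercise 13.121 b below)
s1_sq, n1 = 28.0, 25
s2_sq, n2 = 19.0, 25
alpha = 0.05

F = s1_sq / s2_sq
# One common software convention for the two-sided p-value (the notes skip
# p-values for F because of the table's limitations):
p_value = 2 * min(stats.f.cdf(F, n1 - 1, n2 - 1), stats.f.sf(F, n1 - 1, n2 - 1))

lcl = (s1_sq / s2_sq) / stats.f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)
ucl = (s1_sq / s2_sq) * stats.f.ppf(1 - alpha / 2, n2 - 1, n1 - 1)
print(F, p_value, (lcl, ucl))
```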
Exercises on pages 486-487

13.121 b.
$$LCL = \frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{.025,24,24}} = \frac{28}{19}\cdot\frac{1}{2.27} = .6492$$
$$UCL = \frac{s_1^2}{s_2^2}\cdot F_{.025,24,24} = \frac{28}{19}\cdot 2.27 = 3.3453$$
A 95% Confidence Interval for $\frac{\sigma_1^2}{\sigma_2^2}$ is (.6492, 3.3453). We are 95% confident that $\frac{\sigma_1^2}{\sigma_2^2}$ falls in this interval.

13.122
$H_0: \sigma_1^2 = \sigma_2^2$, $H_A: \sigma_1^2 \ne \sigma_2^2$, $\alpha = .05$

Test Statistic: $F = \frac{s_1^2}{s_2^2}$ with numerator df $\nu_1 = n_1 - 1$ and denominator df $\nu_2 = n_2 - 1$.

Rejection Region: Reject $H_0$ if $F > F_{.025,9,10} = 3.78$ or if $F < F_{.975,9,10} = \frac{1}{F_{.025,10,9}} = \frac{1}{3.96} = .2525$.

Calculations: Machine 1: $s_1 = .002394438$; Machine 2: $s_2 = .0033709993$.
$$F = \frac{.002394438^2}{.0033709993^2} = \frac{.00000573333}{.0000113636} = .5045$$
Note that the reciprocal is $F = 1.982$, and the rejection region for that version would be: reject the null hypothesis if $F > 3.96$ or if $F < \frac{1}{3.78} = .26455$.

Conclusion: Fail to reject $H_0$. There is insufficient evidence to conclude that the two machines differ in the consistency of their fills.

Example: The CEO of an airport hypothesizes that the variance of the number of passengers for American airports is greater than the variance of the number of passengers for foreign airports. At $\alpha = 0.10$, is there enough evidence to support this hypothesis? The data, in millions of passengers per year, are shown for selected airports. Assume the variable is normally distributed.

American airports: 36.8  73.5  72.4  61.2  60.5  40.1
Foreign airports:  60.7  51.2  42.7  38.6

Section 13-5 Inference about the Difference between Two Population Proportions

Sampling Distribution of $\hat{p}_1 - \hat{p}_2$

Let $\hat{p}_1 = \frac{X_1}{n_1}$ and $\hat{p}_2 = \frac{X_2}{n_2}$, where $X_1$ and $X_2$ are the numbers of successes in their respective samples.
- If $n_1\hat{p}_1$, $n_1\hat{q}_1$, $n_2\hat{p}_2$, $n_2\hat{q}_2 \ge 5$, then $\hat{p}_1 - \hat{p}_2$ is approximately normally distributed.
- The mean or expected value of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$.
- The variance of $\hat{p}_1 - \hat{p}_2$ is $\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}$.
- The standard error of $\hat{p}_1 - \hat{p}_2$ is $\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$.

Therefore
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$$
has a standard normal distribution. However, this is not exactly the test statistic we will use. There are two possible cases with two separate test statistics; although both are Z's with standard normal distributions, there are slight differences between the two.

Case 1: We test that there is no difference between the population proportions. There are three possible sets of hypotheses:
$H_0: p_1 - p_2 = 0$ vs. $H_A: p_1 - p_2 \ne 0$ (two tailed)
$H_0: p_1 - p_2 = 0$ vs. $H_A: p_1 - p_2 > 0$ (right or upper tailed)
$H_0: p_1 - p_2 = 0$ vs. $H_A: p_1 - p_2 < 0$ (left or lower tailed)

We would like to use the Z statistic from above, but the standard error of $\hat{p}_1 - \hat{p}_2$, $\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, is unknown and so must be estimated from the sample data. When the two population parameters are, as hypothesized, equal, we can pool the data from the two samples to come up with a pooled proportion estimate $\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$.

Test Statistic for Case 1:
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Case 2: We test that there exists a specified difference between the population proportions. The three possible sets of hypotheses are as follows:
$H_0: p_1 - p_2 = D_0$ vs. $H_A: p_1 - p_2 \ne D_0$ (two tailed)
$H_0: p_1 - p_2 = D_0$ vs. $H_A: p_1 - p_2 > D_0$ (right or upper tailed)
$H_0: p_1 - p_2 = D_0$ vs. $H_A: p_1 - p_2 < D_0$ (left or lower tailed)
where $D_0 \ne 0$.

Test Statistic for Case 2:
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}}$$

There is only one case for interval estimation of the difference in population proportions. A $(1-\alpha)100\%$ Confidence Interval for $p_1 - p_2$ is
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}.$$

Exercises pages 498-499

13.133 b.
A 90% Confidence Interval for $p_1 - p_2$ is
$$(.48 - .52) \pm 1.645\sqrt{\frac{.48(1-.48)}{100} + \frac{.52(1-.52)}{100}} = -.04 \pm .1162 = (-.1562,\ .0762).$$
We are 90% confident that $p_1 - p_2$ falls in this interval.
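Before working through exercise 13.137, here is a minimal sketch of the Case 1 (pooled) two-proportion z test, assuming NumPy and SciPy are available. The counts reproduce exercise 13.137 a below, so the result can be checked against the hand calculation (z of about 4.31).

```python
# Case 1 two-proportion z test: H0: p1 - p2 = 0, pooled standard error.
import numpy as np
from scipy import stats

x1, n1 = 616, 1100        # successes and sample size, population 1
x2, n2 = 368, 800         # successes and sample size, population 2

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)                   # pooled proportion estimate
se = np.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

p_value_upper = stats.norm.sf(z)                   # right-tailed p-value
print(z, p_value_upper)
```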
13.137 a.
Let Population 1 be voter preference six months ago and Population 2 be voter preference this month; $p_1$ represents the proportion of voters who supported this politician six months ago and $p_2$ represents the proportion of voters who support this politician this month.

$H_0: p_1 - p_2 = 0$, $H_A: p_1 - p_2 > 0$, $\alpha = .05$

Test Statistic:
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Rejection Region: Reject $H_0$ if $Z > 1.645$.

Calculations:
$$\hat{p}_1 = \frac{X_1}{n_1} = \frac{616}{1100} = .56, \qquad \hat{p}_2 = \frac{X_2}{n_2} = \frac{368}{800} = .46, \qquad \hat{p} = \frac{616 + 368}{1100 + 800} = \frac{984}{1900} = .51789$$
$$Z = \frac{.56 - .46}{\sqrt{.51789(1 - .51789)\left(\frac{1}{1100} + \frac{1}{800}\right)}} = 4.31, \qquad \text{p-value} \approx 0$$

Conclusion: Reject $H_0$ at the .05 level of significance. There is enough evidence to conclude that the politician's popularity has decreased significantly.

13.137 b.
$H_0: p_1 - p_2 = 0.05$, $H_A: p_1 - p_2 > 0.05$, $\alpha = .05$

Test Statistic:
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}}$$

Rejection Region: Reject $H_0$ if $Z > 1.645$.

Calculations:
$$Z = \frac{(.56 - .46) - .05}{\sqrt{\frac{.56(.44)}{1100} + \frac{.46(.54)}{800}}} = 2.16, \qquad \text{p-value} = 1 - .9846 = .0154$$

Conclusion: Reject $H_0$ at the .05 level of significance. There is enough evidence to conclude that the politician's popularity has decreased by more than 5%.

13.137 c.
A 95% Confidence Interval for $p_1 - p_2$ is $.10 \pm .045$. We are 95% confident that $p_1 - p_2$ falls in this interval.

13.148
$H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 > 0$. Note: the wording is incorrect in the text.

Rejection region: $z > z_{\alpha} = z_{.05} = 1.645$

Calculations:
$$\hat{p}_1 = \frac{64}{187} = .3422, \qquad \hat{p}_2 = \frac{47}{183} = .2568, \qquad \hat{p} = \frac{64 + 47}{187 + 183} = .3000$$
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{.3422 - .2568}{\sqrt{.3000(1 - .3000)\left(\frac{1}{187} + \frac{1}{183}\right)}} = 1.79$$
p-value = P(Z > 1.79) = 1 - .9633 = .0367

Conclusion: There is enough evidence to allow us to conclude that drivers behind a male driver are more likely to honk.
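To close the section, here is a minimal sketch of the Case 2 test statistic (specified difference $D_0$) and the confidence interval for $p_1 - p_2$, assuming NumPy and SciPy are available. The inputs reproduce exercise 13.137 parts b and c above (z of about 2.16, interval of about .10 ± .045).

```python
# Case 2 two-proportion z test (H0: p1 - p2 = D0) and CI for p1 - p2,
# both using the unpooled standard error.
import numpy as np
from scipy import stats

x1, n1 = 616, 1100
x2, n2 = 368, 800
D0 = 0.05                                   # hypothesized difference under H0
alpha = 0.05

p1_hat, p2_hat = x1 / n1, x2 / n2
se_unpooled = np.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

# Case 2 test statistic and right-tailed p-value
z = (p1_hat - p2_hat - D0) / se_unpooled
p_value = stats.norm.sf(z)

# (1 - alpha)100% confidence interval for p1 - p2
z_crit = stats.norm.ppf(1 - alpha / 2)
ci = ((p1_hat - p2_hat) - z_crit * se_unpooled,
      (p1_hat - p2_hat) + z_crit * se_unpooled)
print(z, p_value, ci)
```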