Hypothesis Testing with Two Populations
Week 9
GT00303

Comparing 2 Populations

Previously we looked at techniques to estimate and test parameters for one population: the population mean (\(\mu\)) and the population variance (\(\sigma^2\)).

We will still consider these parameters when we look at two populations; however, our interest will now be in:
The difference between two means, \(\mu_1 - \mu_2\).
The ratio of two variances, \(\sigma_1^2 / \sigma_2^2\).

Difference between 2 Means (Independent Samples)

In order to test and estimate the difference between two population means, we draw a random sample from each of the two populations. Initially, we will consider independent samples, that is, samples that are completely unrelated to one another.

Population 1: parameters \(\mu_1\) and \(\sigma_1^2\); sample of size \(n_1\); statistics \(\bar{x}_1\) and \(s_1^2\).
Population 2: parameters \(\mu_2\) and \(\sigma_2^2\); sample of size \(n_2\); statistics \(\bar{x}_2\) and \(s_2^2\).

The sampling distribution of \(\bar{x}_1 - \bar{x}_2\):
(1) If the populations are normal (or approximately normal), \(\bar{x}_1 - \bar{x}_2\) is normally distributed.
(2) If the populations are non-normal, \(\bar{x}_1 - \bar{x}_2\) is approximately normally distributed when the sample sizes are large.

The expected value of \(\bar{x}_1 - \bar{x}_2\):
\[ E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2 \]

The standard error of \(\bar{x}_1 - \bar{x}_2\):
\[ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]

Test statistic = (Statistic - Parameter) / Standard error:
\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

In practice, the population variances (\(\sigma_1^2\) and \(\sigma_2^2\)) are unknown, so the z statistic is rarely used. In this case, we have to replace them with the sample variances (\(s_1^2\) and \(s_2^2\)) and use the t statistic.

However, the choice of t-test depends on which of 2 conditions holds:
(1) We believe the population variances are equal (equal-variances t-test).
(2) We believe the population variances are not equal (unequal-variances t-test).

(1) Equal-variances t-test for \(\mu_1 - \mu_2\)

Test statistic:
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}, \qquad \text{d.f.} = n_1 + n_2 - 2 \]

CI estimator:
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}, \qquad \text{d.f.} = n_1 + n_2 - 2 \]

Here \(s_p^2\) is the pooled variance estimator.

(2) Unequal-variances t-test for \(\mu_1 - \mu_2\)

Test statistic:
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

CI estimator:
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]

Degrees of freedom:
\[ \text{d.f.} = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} \]

When the two population variances are unequal, we cannot pool the data and produce a common estimator.

So, which t-test to use? Equal-variances or unequal-variances? We first have to test the hypothesis of equal variances.

1st Hypothesis
H0: \(\sigma_1^2 / \sigma_2^2 = 1\)
H1: \(\sigma_1^2 / \sigma_2^2 \neq 1\)

If we reject H0 and conclude that the variances are unequal, we use the unequal-variances t-test for the 2nd hypothesis. If we do not reject H0 and conclude that there is insufficient evidence that the variances are unequal, we use the equal-variances t-test for the 2nd hypothesis.

2nd Hypothesis
H0: \(\mu_1 - \mu_2 = 0\)
H1: \(\mu_1 - \mu_2 > 0\)

How to test the 1st Hypothesis?

H0: \(\sigma_1^2 / \sigma_2^2 = 1\)
H1: \(\sigma_1^2 / \sigma_2^2 \neq 1\)

Test statistic:
\[ F = \frac{s_1^2}{s_2^2}, \qquad \text{d.f.: } \nu_1 = n_1 - 1, \; \nu_2 = n_2 - 1 \]

The rejection region and critical value can be obtained from the F-table.

Illustration 1:

Millions of investors buy mutual funds, choosing from thousands of possibilities. Some funds can be purchased directly from banks or other financial institutions, while others must be purchased through brokers, who charge a fee for this service. This raises the question: can investors do better by buying mutual funds directly than by purchasing mutual funds through brokers? To help answer this question, a group of researchers randomly sampled the annual returns from mutual funds that can be acquired directly and mutual funds that are bought through brokers, and recorded the net annual returns, which are the returns on investment after deducting all relevant fees. Can we conclude at the 5% significance level that directly-purchased mutual funds outperform mutual funds bought through brokers?

Population 1: Net annual return from directly-purchased mutual funds
Population 2: Net annual return from broker-purchased mutual funds
\(\mu_1\) = mean net annual return for population 1
\(\mu_2\) = mean net annual return for population 2

From the data Xm13-01, we can compute the sample means and sample variances using Excel:
\(n_1 = 50\), \(\bar{x}_1 = 6.63\), \(s_1^2 = 37.49\)
\(n_2 = 50\), \(\bar{x}_2 = 3.72\), \(s_2^2 = 43.34\)

Preliminary Test

Since the population variances are unknown, we will use the t-distribution. But which t-test to use? Equal-variances or unequal-variances? To decide, we apply the F-test to the following hypotheses:
H0: \(\sigma_1^2 / \sigma_2^2 = 1\)
H1: \(\sigma_1^2 / \sigma_2^2 \neq 1\)

\[ F = \frac{s_1^2}{s_2^2} = \frac{37.49}{43.34} = 0.86 \]

We will compare this test statistic with the critical values (the rejection region). For the F-table, we use \(\alpha = 5\%\) and degrees of freedom \(\nu_1 = 50 - 1 = 49\), \(\nu_2 = 50 - 1 = 49\). It is a two-tail test.

Right-tail critical value: \(F_{0.025,49,49} \approx F_{0.025,50,50} = 1.75\)

Left-tail critical value: The F-table gives critical values for right-tail tests. Because the F distribution is not symmetric, and there are no negative values, you CANNOT simply take the negative of the right critical value to find the left critical value. The way to find a left critical value is to reverse the degrees of freedom, look up the right critical value, and then take the reciprocal of this value:
\[ F_{0.975,49,49} = \frac{1}{F_{0.025,49,49}} \approx \frac{1}{F_{0.025,50,50}} = \frac{1}{1.75} = 0.57 \]

The rejection region is F < 0.57 or F > 1.75.

The test statistic of 0.86 does not fall into the rejection region. Do not reject H0 and conclude that there is insufficient evidence to infer that the population variances are unequal. So, for this illustration, we will conduct the hypothesis test using the equal-variances t-test.

Step 1: The hypothesis to be tested is that the mean net annual return from directly-purchased mutual funds (\(\mu_1\)) is larger than (outperforms) the mean of broker-purchased funds (\(\mu_2\)).
H0: \(\mu_1 - \mu_2 = 0\) (H0 is presumed to be true)
H1: \(\mu_1 - \mu_2 > 0\) (this is what we want to prove)

Step 2: Since the population variances are unknown, we will use the t-distribution. Our F-test earlier suggests the use of the equal-variances t-test.

Pooled variance estimator:
\[ s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(50 - 1)(37.49) + (50 - 1)(43.34)}{50 + 50 - 2} = 40.42 \]

Test statistic:
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = \frac{(6.63 - 3.72) - 0}{\sqrt{\dfrac{40.42}{50} + \dfrac{40.42}{50}}} = 2.29 \]

d.f. = \(n_1 + n_2 - 2 = 50 + 50 - 2 = 98\)

Step 3: With \(\alpha = 0.05\) and a one-tail (right-tail) test, the critical value is \(t_{0.05,98} = 1.661\).

Step 4: Reject H0 if the computed test statistic (from Step 2, t = 2.29) falls into the rejection region, t > 1.661.

Step 5: Reject H0 at the 5% level of significance and conclude that there is sufficient evidence to infer that, on average, directly-purchased mutual funds outperform broker-purchased mutual funds.

Can we estimate the 95% confidence interval for \(\mu_1 - \mu_2\)?

CI estimator:
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}, \qquad \text{d.f.} = n_1 + n_2 - 2 \]
\[ (6.63 - 3.72) \pm 1.984 \sqrt{40.42 \left( \frac{1}{50} + \frac{1}{50} \right)} = 2.91 \pm 2.52 = (0.39, 5.43) \]

It is estimated that the return on directly-purchased mutual funds is on average between 0.39 and 5.43 percentage points larger than that on broker-purchased mutual funds.

Illustration 2:

What happens to the family-run business when the boss's son or daughter takes over? Does the business do better after the change if the new boss is the offspring of the owner, or does the business do better when an outsider is made chief executive officer (CEO)? In pursuit of an answer, researchers randomly selected 140 firms between 1994 and 2002, 30% of which passed ownership to an offspring and 70% of which appointed an outsider as CEO. For each company, the researchers calculated the operating income as a proportion of assets in the year before and the year after the new CEO took over. Do these data allow us to infer at the 5% level of significance that the effect of making an offspring CEO is different from the effect of hiring an outsider as CEO?
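Looking back at Illustration 1, the equal-variances calculations can be checked numerically. The following is a minimal sketch that works from the summary statistics only. The function name `pooled_t_test` and the constant names are illustrative assumptions, not part of the course material, and the critical values are copied from the t-table rather than computed.

```python
import math


def pooled_t_test(n1, mean1, var1, n2, mean2, var2, hypothesized_diff=0.0):
    """Equal-variances t-test computed from summary statistics.

    Returns (t statistic, pooled variance, standard error, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var / n1 + pooled_var / n2)
    t_stat = ((mean1 - mean2) - hypothesized_diff) / std_err
    return t_stat, pooled_var, std_err, df


# Summary statistics from Illustration 1 (direct vs. broker-purchased funds).
t_stat, pooled_var, std_err, df = pooled_t_test(50, 6.63, 37.49, 50, 3.72, 43.34)

# Critical values copied from the t-table (assumed inputs, not computed here).
T_CRIT_ONE_TAIL = 1.661  # t_{0.05, 98}
T_CRIT_TWO_TAIL = 1.984  # t_{0.025, 98}

# Right-tail test: reject H0 when t exceeds the one-tail critical value.
reject_h0 = t_stat > T_CRIT_ONE_TAIL

# 95% confidence interval for mu1 - mu2.
margin = T_CRIT_TWO_TAIL * std_err
ci_low = (6.63 - 3.72) - margin
ci_high = (6.63 - 3.72) + margin

print(f"t = {t_stat:.2f}, d.f. = {df}, reject H0: {reject_h0}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

Passing the critical values in as constants keeps the sketch free of third-party libraries; a statistics package could supply them instead of the printed table.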
Population 1: Operating income of companies whose CEO is an offspring of the previous CEO
Population 2: Operating income of companies whose CEO is an outsider
\(\mu_1\) = mean operating income for population 1
\(\mu_2\) = mean operating income for population 2

From the data Xm13-02, we can compute the sample means and sample variances using Excel:
\(n_1 = 42\), \(\bar{x}_1 = -0.10\), \(s_1^2 = 3.79\)
\(n_2 = 98\), \(\bar{x}_2 = 1.24\), \(s_2^2 = 8.03\)

Preliminary Test

Since the population variances are unknown, we will use the t-distribution. But which t-test to use? Equal-variances or unequal-variances? To decide, we apply the F-test to the following hypotheses:
H0: \(\sigma_1^2 / \sigma_2^2 = 1\)
H1: \(\sigma_1^2 / \sigma_2^2 \neq 1\)

\[ F = \frac{s_1^2}{s_2^2} = \frac{3.79}{8.03} = 0.47 \]

We will compare this test statistic with the critical values (the rejection region). To use the F-table, we use \(\alpha = 5\%\) and degrees of freedom \(\nu_1 = 42 - 1 = 41\), \(\nu_2 = 98 - 1 = 97\). It is a two-tail test.

Right-tail critical value: \(F_{0.025,41,97} \approx F_{0.025,40,100} = 1.64\)

Left-tail critical value:
\[ F_{0.975,41,97} = \frac{1}{F_{0.025,97,41}} \approx \frac{1}{F_{0.025,100,40}} = \frac{1}{1.74} = 0.57 \]

The rejection region is F < 0.57 or F > 1.64.

The test statistic of 0.47 falls into the rejection region. Reject H0 and conclude that there is sufficient evidence to infer that the population variances are unequal. So, for this illustration, we will conduct the hypothesis test using the unequal-variances t-test.

Step 1: The hypothesis to be tested is that the mean operating income for companies whose CEO is an offspring of the previous CEO (\(\mu_1\)) is different from the mean operating income of companies whose CEO is an outsider (\(\mu_2\)).
H0: \(\mu_1 - \mu_2 = 0\) (H0 is presumed to be true)
H1: \(\mu_1 - \mu_2 \neq 0\) (this is what we want to prove)

Step 2: Since the population variances are unknown, we will use the t-distribution. Our F-test earlier suggests the use of the unequal-variances t-test.

Test statistic:
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(-0.10 - 1.24) - 0}{\sqrt{\dfrac{3.79}{42} + \dfrac{8.03}{98}}} = -3.22 \]

Degrees of freedom:
\[ \text{d.f.} = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} = \frac{\left( \dfrac{3.79}{42} + \dfrac{8.03}{98} \right)^2}{\dfrac{(3.79/42)^2}{42 - 1} + \dfrac{(8.03/98)^2}{98 - 1}} \approx 111 \]

Step 3: With \(\alpha = 0.05\) and a two-tail test, the critical values are:
\(-t_{0.025,111} \approx -t_{0.025,110} = -1.982\)
\(t_{0.025,111} \approx t_{0.025,110} = 1.982\)

Step 4: Reject H0 if the computed test statistic (from Step 2, t = -3.22) falls into the rejection region, t < -1.982 or t > 1.982.

Step 5: Reject H0 at the 5% level of significance and conclude that there is sufficient evidence to infer that the mean operating incomes of the two populations are different.

Difference between 2 Means (Matched Pairs)

Previously, we considered independent samples, that is, samples that are completely unrelated to one another. When an observation in one sample is matched with an observation in a second sample, this is called a matched pairs experiment.

Illustration 3A:

In the last few years, a number of web-based companies that offer job placement services have been created. The manager of one such company wanted to investigate the job offers recent MBAs were obtaining. In particular, she wanted to know whether finance majors were being offered higher salaries than marketing majors. In a preliminary study, she randomly sampled 50 recently graduated MBAs, half of whom majored in finance and half in marketing. From each she obtained the highest salary offer (including benefits). Can we infer at the 5% level of significance that finance majors obtain higher salary offers than do marketing majors among MBAs?
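Looking back at Illustration 2, the unequal-variances test statistic and its degrees of freedom can be checked with a short sketch. The function name `welch_t_test` is an illustrative assumption, and the critical value is copied from the t-table. With the rounded summary statistics the statistic comes out near -3.23 rather than the -3.22 reported above; the small gap is presumably due to rounding in the published summary figures.

```python
import math


def welch_t_test(n1, mean1, var1, n2, mean2, var2, hypothesized_diff=0.0):
    """Unequal-variances t-test computed from summary statistics.

    Returns (t statistic, degrees of freedom before rounding).
    """
    a = var1 / n1  # s1^2 / n1
    b = var2 / n2  # s2^2 / n2
    t_stat = ((mean1 - mean2) - hypothesized_diff) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_stat, df


# Summary statistics from Illustration 2 (offspring CEO vs. outsider CEO).
t_stat, df = welch_t_test(42, -0.10, 3.79, 98, 1.24, 8.03)

# Critical value copied from the t-table (assumed input): t_{0.025, 110}.
T_CRIT_TWO_TAIL = 1.982

# Two-tail test: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > T_CRIT_TWO_TAIL

print(f"t = {t_stat:.3f}, d.f. = {df:.2f} (rounded: {round(df)})")
print(f"reject H0: {reject_h0}")
```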
Population 1: Highest salary offer to finance majors
Population 2: Highest salary offer to marketing majors
\(\mu_1\) = mean highest salary offer for population 1
\(\mu_2\) = mean highest salary offer for population 2

From the data Xm13-04, we can compute the sample means and sample variances using Excel:
\(n_1 = 25\), \(\bar{x}_1 = 65624\), \(s_1^2 = 360433294\)
\(n_2 = 25\), \(\bar{x}_2 = 60423\), \(s_2^2 = 262228559\)

Preliminary Test

Since the population variances are unknown, we will use the t-distribution. But which t-test to use? Equal-variances or unequal-variances? To decide, we apply the F-test to the following hypotheses:
H0: \(\sigma_1^2 / \sigma_2^2 = 1\)
H1: \(\sigma_1^2 / \sigma_2^2 \neq 1\)

\[ F = \frac{s_1^2}{s_2^2} = \frac{360433294}{262228559} = 1.37 \]

We will compare this test statistic with the critical values (the rejection region). To use the F-table, we use \(\alpha = 5\%\) and degrees of freedom \(\nu_1 = 25 - 1 = 24\), \(\nu_2 = 25 - 1 = 24\). It is a two-tail test.

Right-tail critical value: \(F_{0.025,24,24} = 2.27\)

Left-tail critical value:
\[ F_{0.975,24,24} = \frac{1}{F_{0.025,24,24}} = \frac{1}{2.27} = 0.44 \]

The rejection region is F < 0.44 or F > 2.27.

The test statistic of 1.37 does not fall into the rejection region. Do not reject H0 and conclude that there is insufficient evidence to infer that the population variances are unequal. So, for this illustration, we will conduct the hypothesis test using the equal-variances t-test.

Step 1: The hypothesis to be tested is that the mean highest salary offer for finance majors (\(\mu_1\)) is larger than the mean highest salary offer for marketing majors (\(\mu_2\)).
H0: \(\mu_1 - \mu_2 = 0\) (H0 is presumed to be true)
H1: \(\mu_1 - \mu_2 > 0\) (this is what we want to prove)

Step 2: Since the population variances are unknown, we will use the t-distribution. Our F-test earlier suggests the use of the equal-variances t-test.

Pooled variance estimator:
\[ s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(25 - 1)(360433294) + (25 - 1)(262228559)}{25 + 25 - 2} = 311330926 \]

Test statistic:
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = \frac{(65624 - 60423) - 0}{\sqrt{\dfrac{311330926}{25} + \dfrac{311330926}{25}}} = 1.04 \]

d.f. = \(n_1 + n_2 - 2 = 25 + 25 - 2 = 48\)

Step 3: With \(\alpha = 0.05\) and a one-tail (right-tail) test, the critical value is \(t_{0.05,48} \approx t_{0.05,50} = 1.676\).

Step 4: Reject H0 if the computed test statistic (from Step 2, t = 1.04) falls into the rejection region, t > 1.676.

Step 5: Do not reject H0 at the 5% level of significance and conclude that there is insufficient evidence to infer that finance majors receive higher salary offers than marketing majors.

Illustration 3B:

Suppose now that we redo the experiment in the following way. We examine the transcripts of finance and marketing MBA majors. We randomly sample a finance and a marketing major whose grade point average (GPA) falls between 3.92 and 4 (based on a maximum of 4). We then randomly sample a finance and a marketing major whose GPA is between 3.84 and 3.92. We continue this process until the 25th pair of finance and marketing majors is selected, whose GPAs fall between 2.0 and 2.08. (The minimum GPA required for graduation is 2.0.) As in Illustration 3A, we record the highest salary offer. Can we infer at the 5% level of significance that finance majors obtain higher salary offers than do marketing majors among MBAs?

In Illustration 3A, the samples are independent. Illustration 3B is a matched pairs experiment, i.e., each observation in one sample is matched with an observation in the other sample. The matching is conducted by selecting finance and marketing majors with similar GPAs.
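A matched pairs design of this kind reduces two samples to a single sample of differences. The sketch below shows that reduction using only the Python standard library. The function name `paired_t_statistic` and the four salary pairs (in thousands) are hypothetical values invented for illustration; they are not taken from the Xm13-05 data.

```python
import math
import statistics


def paired_t_statistic(sample1, sample2, hypothesized_mean_diff=0.0):
    """Matched pairs t statistic from two equally long, pair-aligned samples.

    Returns (t statistic, mean difference, std. dev. of differences, d.f.).
    """
    if len(sample1) != len(sample2):
        raise ValueError("matched samples must have the same length")
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    t_stat = (mean_d - hypothesized_mean_diff) / (sd_d / math.sqrt(n))
    return t_stat, mean_d, sd_d, n - 1


# Hypothetical salary offers, one finance/marketing pair per GPA group.
finance = [70, 65, 60, 58]
marketing = [66, 63, 59, 55]

t_stat, mean_d, sd_d, df = paired_t_statistic(finance, marketing)

# The mean of the paired differences equals the difference of the sample means.
diff_of_means = statistics.mean(finance) - statistics.mean(marketing)

print(f"mean difference = {mean_d}, difference of means = {diff_of_means}")
print(f"t = {t_stat:.3f}, d.f. = {df}")
```

The pairing only helps if each index in the two lists refers to the same matched group, so the inputs must be kept in pair order.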
For each GPA group (e.g., Group 1 has a GPA between 3.92 and 4), we calculate the matched pair difference between the salary offers for finance and marketing majors.

The difference of the means is equal to the mean of the differences; hence we will consider the "mean of the paired differences" as our parameter of interest:
\[ \mu_D = \mu_1 - \mu_2 \]

From the data Xm13-05, we can compute the sample mean and sample standard deviation of the differences using Excel:
\(\bar{x}_D = 5065\), \(s_D = 6647\)

Step 1: The hypothesis to be tested is that the mean highest salary offer for finance majors (\(\mu_1\)) is larger than the mean highest salary offer for marketing majors (\(\mu_2\)).
H0: \(\mu_D = 0\) (H0 is presumed to be true)
H1: \(\mu_D > 0\) (this is what we want to prove)

Step 2: The test statistic for the mean of the population of differences (\(\mu_D\)) is:
\[ t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n_D}}, \qquad \text{d.f.} = n_D - 1 \]
\[ t = \frac{5065 - 0}{6647 / \sqrt{25}} = 3.81 \]

d.f. = \(n_D - 1 = 25 - 1 = 24\)

Step 3: With \(\alpha = 0.05\) and a one-tail (right-tail) test, the critical value is \(t_{0.05,24} = 1.711\).

Step 4: Reject H0 if the computed test statistic (from Step 2, t = 3.81) falls into the rejection region, t > 1.711.

Step 5: Reject H0 at the 5% level of significance and conclude that there is sufficient evidence to infer that finance majors receive higher salary offers than marketing majors.

Can we estimate the 95% confidence interval for \(\mu_D\)?

CI estimator:
\[ \bar{x}_D \pm t_{\alpha/2} \frac{s_D}{\sqrt{n_D}}, \qquad \text{d.f.} = n_D - 1 \]
\[ 5065 \pm 2.064 \, \frac{6647}{\sqrt{25}} = 5065 \pm 2744 = (2321, 7809) \]

It is estimated that the mean salary offer to finance majors exceeds the mean salary offer to marketing majors by an amount that lies between $2,321 and $7,809.

Independent Samples versus Matched Pairs

Independent samples (Illustration 3A)
Test statistic:
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = 1.04 \]
Conclusion: Do not reject H0 at the 5% level of significance and conclude that there is insufficient evidence to infer that finance majors receive higher salary offers than marketing majors.

Matched pairs (Illustration 3B)
Test statistic:
\[ t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n_D}} = 3.81 \]
Conclusion: Reject H0 at the 5% level of significance and conclude that there is sufficient evidence to infer that finance majors receive higher salary offers than marketing majors.

Mean difference (numerator):
Independent samples: \(\bar{x}_1 - \bar{x}_2 = 5201\)
Matched pairs: \(\bar{x}_D = 5065\)

Standard error (denominator):
Independent samples: \(\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}} = 4991\)
Matched pairs: \(\dfrac{s_D}{\sqrt{n_D}} = 1329\)

The difference in the test statistics was caused not by the numerator but by the denominator. The matched pairs experiment reduced the variation in the data here, although this will not always be the case.

The Ratio of Two Population Variances

When looking at two population variances, we consider the ratio of the variances, i.e., the parameter of interest to us is \(\sigma_1^2 / \sigma_2^2\).

Test statistic:
\[ F = \frac{s_1^2}{s_2^2}, \qquad \text{d.f.: } \nu_1 = n_1 - 1, \; \nu_2 = n_2 - 1 \]

CI estimator:
\[ \text{LCL} = \left( \frac{s_1^2}{s_2^2} \right) \frac{1}{F_{\alpha/2,\nu_1,\nu_2}}, \qquad \text{UCL} = \left( \frac{s_1^2}{s_2^2} \right) F_{\alpha/2,\nu_2,\nu_1} \]

Illustration 4:

Container-filling machines are used to package a variety of liquids, including milk, soft drinks, and paint. Ideally, the amount of liquid should vary only slightly, since large variations will cause some containers to be under-filled (cheating the customer) and some to be overfilled (resulting in costly waste). The president of a company that developed a new type of machine boasts that this machine can fill 1-liter (1,000 cubic centimeters) containers consistently. A random sample of 25 1-liter fills was taken and the results (in cubic centimeters) recorded. Suppose that the statistics practitioner also collected data from another container-filling machine and recorded the fills of a randomly selected sample. Can we infer at the 5% significance level that the second machine is superior in its consistency?
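Looking back at the comparison of independent samples and matched pairs in Illustrations 3A and 3B, the numerators and denominators of the two test statistics can be recomputed side by side. This is a minimal sketch from the reported summary statistics; the variable names are illustrative assumptions, and the critical value 2.064 is copied from the t-table.

```python
import math

# Illustration 3A: independent samples, equal-variances t-test.
n1, n2 = 25, 25
mean1, mean2 = 65624, 60423
var1, var2 = 360433294, 262228559

pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
se_independent = math.sqrt(pooled_var / n1 + pooled_var / n2)
t_independent = (mean1 - mean2) / se_independent

# Illustration 3B: matched pairs t-test on the differences.
n_d = 25
mean_d = 5065
sd_d = 6647

se_paired = sd_d / math.sqrt(n_d)
t_paired = mean_d / se_paired

# 95% confidence interval for mu_D; t_{0.025, 24} copied from the t-table.
T_CRIT_TWO_TAIL = 2.064
ci_low = mean_d - T_CRIT_TWO_TAIL * se_paired
ci_high = mean_d + T_CRIT_TWO_TAIL * se_paired

print(f"independent: numerator = {mean1 - mean2}, "
      f"SE = {se_independent:.0f}, t = {t_independent:.2f}")
print(f"matched:     numerator = {mean_d}, "
      f"SE = {se_paired:.0f}, t = {t_paired:.2f}")
print(f"95% CI for mu_D: ({ci_low:.0f}, {ci_high:.0f})")
```

The numerators are close (5201 versus 5065), while the matched pairs standard error is roughly a quarter of the independent-samples one, which is what drives the larger t statistic.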
Because the information we want is about the consistency of the two machines, the parameter of interest is \(\sigma_1^2 / \sigma_2^2\).
\(\sigma_1^2\) = variance of machine 1
\(\sigma_2^2\) = variance of machine 2

From the data Xm13-07, we can compute the sample means and sample variances using Excel:
\(n_1 = 25\), \(\bar{x}_1 = 999.68\), \(s_1^2 = 0.6333\)
\(n_2 = 25\), \(\bar{x}_2 = 999.81\), \(s_2^2 = 0.4528\)

Step 1: We need to determine whether machine 2 is more consistent than machine 1 (more consistent means smaller variance). Hence, we are testing whether there is sufficient evidence to infer that \(\sigma_1^2 > \sigma_2^2\).
H0: \(\sigma_1^2 / \sigma_2^2 = 1\) (H0 is presumed to be true)
H1: \(\sigma_1^2 / \sigma_2^2 > 1\) (this is what we want to prove)

Step 2:
\[ F = \frac{s_1^2}{s_2^2} = \frac{0.6333}{0.4528} = 1.40 \]

Step 3: To use the F-table, we use \(\alpha = 5\%\) and degrees of freedom \(\nu_1 = 25 - 1 = 24\), \(\nu_2 = 25 - 1 = 24\). It is a one-tail (right-tail) test.
\(F_{0.05,24,24} = 1.98\)

Step 4: Reject H0 if the computed test statistic (from Step 2, F = 1.40) falls into the rejection region, F > 1.98.

Step 5: Do not reject H0 at the 5% level of significance and conclude that there is insufficient evidence to infer that the variance of machine 2 is less than the variance of machine 1 (that is, that machine 2 is more consistent).

Can we estimate the 95% confidence interval for \(\sigma_1^2 / \sigma_2^2\)?

CI estimator:
\[ \text{LCL} = \left( \frac{s_1^2}{s_2^2} \right) \frac{1}{F_{\alpha/2,\nu_1,\nu_2}}, \qquad \text{UCL} = \left( \frac{s_1^2}{s_2^2} \right) F_{\alpha/2,\nu_2,\nu_1}, \qquad \text{d.f.: } \nu_1 = n_1 - 1, \; \nu_2 = n_2 - 1 \]

From the F-table, \(F_{0.025,24,24} = 2.27\).
\[ \text{LCL} = \left( \frac{0.6333}{0.4528} \right) \frac{1}{2.27} = 0.616, \qquad \text{UCL} = \left( \frac{0.6333}{0.4528} \right) 2.27 = 3.17 \]

It is estimated that the ratio of the two population variances lies between 0.616 and 3.17.
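The F-test and interval estimate in Illustration 4 can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function name `variance_ratio_interval` and its parameter names are illustrative assumptions, and both critical values are copied from the F-table rather than computed.

```python
def variance_ratio_interval(var1, var2, f_upper_v1_v2, f_upper_v2_v1):
    """Confidence limits for sigma1^2 / sigma2^2.

    f_upper_v1_v2 is F_{alpha/2, v1, v2}; f_upper_v2_v1 is F_{alpha/2, v2, v1}.
    Both are right-tail critical values looked up in the F-table.
    """
    ratio = var1 / var2
    lcl = ratio / f_upper_v1_v2
    ucl = ratio * f_upper_v2_v1
    return lcl, ucl


# Sample variances from Illustration 4 (machine 1 vs. machine 2).
var1, var2 = 0.6333, 0.4528

# Right-tail test of H1: sigma1^2 / sigma2^2 > 1.
f_stat = var1 / var2
F_CRIT_RIGHT = 1.98  # F_{0.05, 24, 24} from the F-table (assumed input)
reject_h0 = f_stat > F_CRIT_RIGHT

# 95% confidence interval; F_{0.025, 24, 24} = 2.27 from the F-table.
lcl, ucl = variance_ratio_interval(var1, var2, 2.27, 2.27)

print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
print(f"95% CI for the variance ratio: ({lcl:.3f}, {ucl:.2f})")
```

Because both samples have 24 degrees of freedom here, the two critical values coincide; with unequal sample sizes the two lookups would differ, which is why the function takes them separately.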