Section 8.2 Testing the Difference Between Means (Independent Samples, σ1 and σ2 Unknown)

Section 8.2 Objectives
• Perform a t-test for the difference between two population means μ1 and μ2 using independent samples with population standard deviations unknown.

Two-Sample t-Test for the Difference Between Means
These conditions are necessary to use a t-test for the difference between means for independent samples.
1. The samples must be randomly selected.
2. The samples must be independent.
3. Each population must have a normal distribution, or each sample size must be at least 30.
4. The population standard deviations (σ1, σ2) are unknown.

Two-Sample t-Test for the Difference Between Means
• The test statistic is $\bar{x}_1 - \bar{x}_2$.
• The standardized test statistic is
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}.$$
• The standard error and the degrees of freedom of the sampling distribution depend on whether the population variances $\sigma_1^2$ and $\sigma_2^2$ are equal.

Two-Sample t-Test for the Difference Between Means
• Variances are equal
Information from the two samples is combined to calculate a pooled estimate of the standard deviation $\hat{\sigma}$:
$$\hat{\sigma} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
The standard error for the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is
$$s_{\bar{x}_1 - \bar{x}_2} = \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad \text{d.f.} = n_1 + n_2 - 2.$$

Two-Sample t-Test for the Difference Between Means
• Variances are not equal
If the population variances are not equal, then the standard error is
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}, \qquad \text{d.f.} = \text{smaller of } n_1 - 1 \text{ and } n_2 - 1.$$

Normal or t-Distribution?
1. Are both populations normally distributed, or are both sample sizes at least 30?
   No: You cannot use the z-test or the t-test.
   Yes: Go to question 2.
2. Are both population standard deviations (σ1 and σ2) known?
   Yes: Use the z-test.
   No: Go to question 3.
3. Are the population variances equal?
   Yes: Use the t-test with $s_{\bar{x}_1 - \bar{x}_2} = \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ and d.f. = n1 + n2 - 2.
   No: Use the t-test with $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and d.f. = smaller of n1 - 1 and n2 - 1.

Two-Sample t-Test for the Difference Between Means (Independent Samples, σ1, σ2 Unknown)
Each step is given in words, followed by the same step in symbols.
1. State the claim mathematically and verbally. Identify the null and alternative hypotheses. In symbols: state H0 and Ha.
2. Specify the level of significance. In symbols: identify α.
3. Determine the degrees of freedom. In symbols: d.f. = n1 + n2 - 2 (equal variances) or d.f. = smaller of n1 - 1 and n2 - 1 (unequal variances).
4. Determine the critical value(s). Use Table 5 in Appendix B.
5. Determine the rejection region(s).
6. Find the standardized test statistic and sketch the sampling distribution. In symbols:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$
7. Make a decision to reject or fail to reject the null hypothesis. If t is in the rejection region, reject H0. Otherwise, fail to reject H0.
8. Interpret the decision in the context of the original claim.

Example: Two-Sample t-Test for the Difference Between Means
The results of a state mathematics test for random samples of students taught by two different teachers at the same school are shown below. Can you conclude that there is a difference in the mean mathematics test scores for the students of the two teachers? Use α = 0.10. Assume the populations are normally distributed and the population variances are unknown and not equal.

                     Teacher 1    Teacher 2
Sample mean          473          459
Sample std. dev.     39.7         24.5
Sample size          8            18

Solution: Two-Sample t-Test for the Difference Between Means
• H0: μ1 = μ2
• Ha: μ1 ≠ μ2 (claim)
• α = 0.10
• d.f. = 8 - 1 = 7
• Rejection region: t < -1.895 or t > 1.895 (two-tailed test, area 0.05 in each tail, d.f. = 7)
• Test statistic:
$$t = \frac{(473 - 459) - 0}{\sqrt{\dfrac{39.7^2}{8} + \dfrac{24.5^2}{18}}} \approx 0.922$$
• Decision: Fail to reject H0, because t ≈ 0.922 is not in the rejection region. At the 10% level of significance, there is not enough evidence to support the claim that the mean mathematics test scores for the students of the two teachers are different.

Example: Two-Sample t-Test for the Difference Between Means
A manufacturer claims that the mean calling range (in feet) of its 2.4-GHz cordless telephone is greater than that of its leading competitor.
You perform a study using 14 randomly selected phones from the manufacturer and 16 randomly selected similar phones from its competitor. The results are shown below. At α = 0.05, can you support the manufacturer's claim? Assume the populations are normally distributed and the population variances are equal (and unknown).

                     Manufacturer    Competitor
Sample mean          1275 ft         1250 ft
Sample std. dev.     45 ft           30 ft
Sample size          14              16

Solution: Two-Sample t-Test for the Difference Between Means
• H0: μ1 ≤ μ2
• Ha: μ1 > μ2 (claim)
• α = 0.05
• d.f. = 14 + 16 - 2 = 28
• Rejection region: t > 1.701 (right-tailed test, area 0.05 in the tail)
• Test statistic: first compute the standard error from the pooled estimate,
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{\frac{(14 - 1)45^2 + (16 - 1)30^2}{14 + 16 - 2}} \cdot \sqrt{\frac{1}{14} + \frac{1}{16}} \approx 13.8018,$$
then standardize the observed difference:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{(1275 - 1250) - 0}{13.8018} \approx 1.811$$
• Decision: Reject H0, because t ≈ 1.811 > 1.701. At the 5% level of significance, there is enough evidence to support the manufacturer's claim that its phone has a greater mean calling range than its competitor's.

Example: Two-Sample t-Test for the Difference Between Means
Maximal oxygen consumption is a way to measure the fitness of an individual. It is the amount of oxygen, in milliliters, that a person uses per kilogram of body weight per minute. A medical research center claims that athletes have a greater mean maximal oxygen consumption than non-athletes. The results for samples from the two groups are below. At α = 0.05, can you support the research center's claim? (Assume the population variances are equal, the populations are normally distributed, and the σ's are unknown.)

                     Athletes         Non-athletes
Sample mean          56 ml/kg/min     47 ml/kg/min
Sample std. dev.     4.9 ml/kg/min    3.1 ml/kg/min
Sample size          23               21

Test of Inference
I. H0: µ1 ≤ µ2, Ha: µ1 > µ2 (claim)
II. α = 0.05
III. Test statistic:
$$t = \frac{\text{observed} - \text{mean}}{\text{standard error}} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{(56 - 47) - 0}{\sqrt{\dfrac{22 \cdot 4.9^2 + 20 \cdot 3.1^2}{23 + 21 - 2}} \cdot \sqrt{\dfrac{1}{23} + \dfrac{1}{21}}}$$
$$= \frac{9}{\sqrt{\dfrac{720.4200}{42}} \cdot \sqrt{0.0911}} = \frac{9}{4.1416 \cdot 0.3018} = \frac{9}{1.2499} \approx 7.200$$
IV. Define t0 (table): d.f. = n1 + n2 - 2 = 23 + 21 - 2 = 42, so t0 ≈ 1.684 (from the d.f. = 40 row, the nearest tabulated row below 42; the large-sample value 1.645 leads to the same decision).
V. Define rejection region: reject H0 if t > 1.684.
VI. Decision: reject H0, because t ≈ 7.200 > 1.684.
VII. There is enough evidence, at the 5% significance level, to support the claim that athletes have a greater mean maximal oxygen consumption than non-athletes.

Section 8.2 Summary
• Performed a t-test for the difference between two means μ1 and μ2 from independent samples when the population standard deviations are not known.
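
As a cross-check on the three worked examples in this section, the standardized test statistic and degrees of freedom can be recomputed from the summary statistics alone. The Python sketch below is illustrative only: the function name `two_sample_t` and its parameters are hypothetical, not part of any library, and it uses only the standard `math` module. For unequal variances it follows this section's rule (d.f. = smaller of n1 - 1 and n2 - 1), which is more conservative than the Welch-Satterthwaite approximation that statistical software typically applies. Critical values still come from the t-table.

```python
import math


def two_sample_t(mean1, s1, n1, mean2, s2, n2, equal_variances, hypothesized_diff=0.0):
    """Return (t, df) for two independent samples given summary statistics.

    Hypothetical helper for illustration; follows the formulas in this section.
    """
    if equal_variances:
        # Pooled estimate of the common standard deviation.
        pooled_sd = math.sqrt(
            ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
        )
        std_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
        df = n1 + n2 - 2
    else:
        # Unpooled standard error with the section's conservative d.f. rule
        # (assumption: not the Welch-Satterthwaite d.f. used by most software).
        std_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
        df = min(n1 - 1, n2 - 1)
    t = ((mean1 - mean2) - hypothesized_diff) / std_error
    return t, df


# Summary statistics taken from the three examples in this section.
t_teachers, df_teachers = two_sample_t(473, 39.7, 8, 459, 24.5, 18, equal_variances=False)
t_phones, df_phones = two_sample_t(1275, 45, 14, 1250, 30, 16, equal_variances=True)
t_oxygen, df_oxygen = two_sample_t(56, 4.9, 23, 47, 3.1, 21, equal_variances=True)

for label, t, df in [
    ("teachers", t_teachers, df_teachers),
    ("phones", t_phones, df_phones),
    ("oxygen", t_oxygen, df_oxygen),
]:
    print(f"{label}: t = {t:.3f}, d.f. = {df}")
```

Comparing each printed t with the tabulated critical value for its d.f. reproduces the decisions reached in the examples: fail to reject H0 for the teachers, reject H0 for the phones and for the oxygen-consumption data.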