I. Confidence intervals and tests for the difference of two means (Section 9.3)

Often we’re interested in comparing two groups. For example:
• Are the rates of teen smoking the same among white and minority groups? If not, how large is the difference?
• Is a new medication more effective at reducing blood pressure than the currently used medication? If so, how much more effective?
We’ll learn how to answer questions about the difference in means between the two groups.

Basic setting
• Population 1 has mean $\mu_1$ and standard deviation $\sigma_1$.
• Population 2 has mean $\mu_2$ and standard deviation $\sigma_2$.
• We’re interested in $\mu_1 - \mu_2$.
• We have independent samples of size $n_1$ and $n_2$ from the two populations.
• We typically will not know the population standard deviations $\sigma_1$ and $\sigma_2$, so we’ll estimate them by the sample standard deviations $S_1$ and $S_2$.
• We’ll assume that the populations are normally distributed, or that the sample sizes $n_1$ and $n_2$ are large. (See the text for details.)

Example:
• A new medication has been developed to reduce blood pressure.
• Let $\mu_1$ represent the mean reduction in blood pressure using the new medication for people with high blood pressure.
• Let $\mu_2$ represent the mean reduction in blood pressure using the old medication for people with high blood pressure.
• We’re interested in testing the hypotheses
$$H_0: \mu_1 - \mu_2 \le 0 \qquad H_a: \mu_1 - \mu_2 > 0$$
and in forming a confidence interval for $\mu_1 - \mu_2$.
• We randomly assign $n_1 = 20$ people with high blood pressure to use the new medication, and $n_2 = 15$ people with high blood pressure to use the old medication.

The t distribution again
• The difference in sample means $\bar{X}_1 - \bar{X}_2$ is a good estimator of $\mu_1 - \mu_2$.
• As usual, we’ll standardize the estimator.
• Fact: The standardized estimator
$$\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$
has (approximately) a t distribution.
• We need to know how many degrees of freedom!
• Answer 1: There is a complicated (but accurate) formula that gives the degrees of freedom in the text. Most software (e.g. Minitab) uses this formula.
• Answer 2 (easier but less accurate): Use the minimum of $n_1 - 1$ and $n_2 - 1$ as the degrees of freedom.
• We’ll always use Answer 2.

Formulas
• Confidence interval for $\mu_1 - \mu_2$:
$$(\bar{X}_1 - \bar{X}_2) \pm c \sqrt{S_1^2/n_1 + S_2^2/n_2}.$$
• We get the multiplier $c$ from Table B.3, the t table.
• Hypothesis test for whether $\mu_1 - \mu_2$ is 0:
  – Test statistic is
$$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$
  – Compute p-values the usual way using Table B.3.

Back to example
Recall that $\mu_1$ is the mean BP reduction using the new medication and $\mu_2$ is the mean BP reduction using the old medication, and that $n_1 = 20$ and $n_2 = 15$.
• Collect data and find
$$\bar{X}_1 = 19, \quad S_1 = 5, \qquad \bar{X}_2 = 16, \quad S_2 = 3$$
• Want to test
$$H_0: \mu_1 - \mu_2 \le 0 \qquad H_a: \mu_1 - \mu_2 > 0$$
• Test statistic:
$$T = \frac{19 - 16}{\sqrt{5^2/20 + 3^2/15}} = \frac{3}{\sqrt{25/20 + 9/15}} \approx 2.206.$$
• Degrees of freedom: 14.
• The p-value is the area to the right of 2.206. From Table B.3 we can bound the p-value to be between 0.01 and 0.025.
• A 99% confidence interval for the mean difference $\mu_1 - \mu_2$ is
$$(19 - 16) \pm 2.977 \sqrt{5^2/20 + 3^2/15}$$
• This is about $(-1.05, 7.05)$.

Example:
• Chapin Social Insight test (measures how accurately a person appraises other people) was given to a group of college students.
• Question of interest: Do males and females differ in average social insight?
• Formally we want to test
$$H_0: \mu_1 - \mu_2 = 0 \qquad H_a: \mu_1 - \mu_2 \ne 0$$
where $\mu_1$ and $\mu_2$ are the mean scores for males and females.
• Data:

  Gender        Male     Female
  Sample size   133      162
  Sample mean   25.34    24.94
  Sample s.d.   5.05     5.44

• Test statistic:
$$\frac{25.34 - 24.94}{\sqrt{5.05^2/133 + 5.44^2/162}} \approx 0.654.$$
• Compute a p-value from the t distribution with 132 degrees of freedom.
• Since our table doesn’t have this, we’ll use 120 degrees of freedom.
• We want twice the area to the right of 0.654.
• The area to the right of 0.845 is 0.2, so the area to the right of 0.654 is more than 0.2.
• So the p-value is more than 0.4.
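
The arithmetic in the two examples above can be checked with a short script. This is only a sketch using the Python standard library: the function names `two_sample_t` and `confidence_interval` are made up for illustration, the degrees of freedom follow Answer 2 (the minimum of $n_1 - 1$ and $n_2 - 1$), and the multiplier 2.977 is typed in by hand from the t table rather than computed.

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, standard error, conservative df).

    Illustrative helper, not from the slides. The statistic tests
    whether mu1 - mu2 is 0; df uses Answer 2: min(n1 - 1, n2 - 1).
    """
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    t = (mean1 - mean2) / se
    df = min(n1 - 1, n2 - 1)
    return t, se, df


def confidence_interval(mean1, mean2, se, multiplier):
    """Return (low, high) for mu1 - mu2, given a t-table multiplier c."""
    diff = mean1 - mean2
    return diff - multiplier * se, diff + multiplier * se


# Blood pressure example: new medication (group 1) vs old (group 2).
t_bp, se_bp, df_bp = two_sample_t(19, 5, 20, 16, 3, 15)
# 2.977 is the 99% multiplier for 14 df, read from the t table.
ci_bp = confidence_interval(19, 16, se_bp, 2.977)
print("BP example: t =", round(t_bp, 3), "df =", df_bp)
print("BP example: 99% CI =", tuple(round(x, 2) for x in ci_bp))

# Chapin Social Insight example: males (group 1) vs females (group 2).
t_ch, se_ch, df_ch = two_sample_t(25.34, 5.05, 133, 24.94, 5.44, 162)
print("Chapin example: t =", round(t_ch, 3), "df =", df_ch)
```

The p-values are still bounded from the table as in the slides; the script only reproduces the test statistics, the degrees of freedom, and the confidence interval.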