Week 9 October 27-31
Four Mini-Lectures
QMM 510 Fall 2014

Chapter 10 Two-Sample Hypothesis Tests

Chapter Contents
10.1 Two-Sample Tests
10.2 Comparing Two Means: Independent Samples
10.3 Confidence Interval for the Difference of Two Means, µ1 - µ2
10.4 Comparing Two Means: Paired Samples
10.5 Comparing Two Proportions
10.6 Confidence Interval for the Difference of Two Proportions, π1 - π2
10.7 Comparing Two Variances
So many topics, so little time …

Two-Sample Tests

What Is a Two-Sample Test?
• A two-sample test compares two sample estimates with each other.
• A one-sample test compares a sample estimate to a nonsample benchmark.

Basis of Two-Sample Tests
• Two-sample tests are especially useful because they possess a built-in point of comparison.
• The logic of two-sample tests is based on the fact that two samples drawn from the same population may yield different estimates of a parameter due to chance.
• If the two sample statistics differ by more than the amount attributable to chance, then we conclude that the samples came from populations with different parameter values.

Comparing Two Means: Independent Samples (ML 9.1)

Format of Hypotheses
• The hypotheses for comparing two independent population means µ1 and µ2 are:
  Left-tailed:  H0: µ1 - µ2 ≥ D0   H1: µ1 - µ2 < D0
  Two-tailed:   H0: µ1 - µ2 = D0   H1: µ1 - µ2 ≠ D0
  Right-tailed: H0: µ1 - µ2 ≤ D0   H1: µ1 - µ2 > D0
  where D0 is the hypothesized difference (often 0).

Case 1: Known Variances
• When the population variances σ1² and σ2² are known, use the normal distribution for the test (assuming a normal population).
• The test statistic is:
  z = [(x̄1 - x̄2) - D0] / sqrt(σ1²/n1 + σ2²/n2)

Case 2: Unknown Variances, Assumed Equal
• If the variances are unknown, they must be estimated and the Student's t distribution used to test the means.
• Assuming the population variances are equal, s1² and s2² can be used to estimate a common pooled variance sp²:
  sp² = [(n1 - 1)s1² + (n2 - 1)s2²] / (n1 + n2 - 2)
  t = [(x̄1 - x̄2) - D0] / sqrt(sp²/n1 + sp²/n2), with d.f. = n1 + n2 - 2
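The Case 2 calculation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `pooled_t` and its argument names are assumptions, and the inputs are the Friday/Saturday summary statistics from the Order Size example later in this chapter. It computes only the test statistic and degrees of freedom; the p-value would still come from Excel, MegaStat, or a t table.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Case 2 (unknown variances, assumed equal).

    Hypothetical helper: returns (t, df, pooled variance, standard error).
    d0 is the hypothesized difference mu1 - mu2 (0 for a test of equal means).
    """
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of the difference in sample means
    se = math.sqrt(sp2 / n1 + sp2 / n2)
    t = ((mean1 - mean2) - d0) / se
    return t, df, sp2, se

# Order Size example: Friday (mean 22.32, s 4.35, n 13) vs. Saturday (mean 25.56, s 6.16, n 18)
t, df, sp2, se = pooled_t(22.32, 4.35, 13, 25.56, 6.16, 18)
print(f"pooled variance = {sp2:.5f}, std. error = {se:.5f}, t = {t:.3f}, df = {df}")
```

The pooled variance, standard error, and degrees of freedom printed here can be checked against the MegaStat pooled-variance output for the same example.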
Case 3: Unknown Variances, Assumed Unequal
• If the population variances cannot be assumed equal, the distribution of the random variable x̄1 - x̄2 is uncertain (Behrens-Fisher problem).
• The Welch-Satterthwaite test addresses this difficulty by estimating each variance separately and then adjusting the degrees of freedom:
  t = [(x̄1 - x̄2) - D0] / sqrt(s1²/n1 + s2²/n2), where D0 is the hypothesized difference µ1 - µ2
  d.f. = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 - 1) + (s2²/n2)²/(n2 - 1) ]
• A quick rule for degrees of freedom is to use min(n1 - 1, n2 - 1). You will get smaller d.f. but avoid the tedious formula above.

Test Statistic
• If the population variances σ1² and σ2² are known, then use the normal distribution. Of course, we rarely know σ1² and σ2².
• If population variances are unknown and estimated using s1² and s2², then use the Student's t distribution (Case 2 or Case 3).
• If you are testing for zero difference of means (H0: µ1 - µ2 = 0), the formulas simplify to:
  Case 1: z = (x̄1 - x̄2) / sqrt(σ1²/n1 + σ2²/n2)
  Case 2: t = (x̄1 - x̄2) / sqrt(sp²/n1 + sp²/n2), with d.f. = n1 + n2 - 2
  Case 3: t = (x̄1 - x̄2) / sqrt(s1²/n1 + s2²/n2), with the Welch-adjusted d.f.

Which Assumption Is Best?
• If the sample sizes are equal, the Case 2 and Case 3 test statistics will be identical, although the degrees of freedom may differ and therefore the p-values may differ.
• If the variances are similar, the two tests will usually agree.
• If no information about the population variances is available, then the best choice is Case 3.
• The fewer assumptions, the better.

Must Sample Sizes Be Equal?
• Unequal sample sizes are common and the formulas still apply.

Large Samples
• If both samples are large (n1 ≥ 30 and n2 ≥ 30) and the population is not badly skewed, it is reasonable to assume normality for the difference in sample means and use Appendix C.
• Assuming normality makes the test easier. However, it is not conservative to replace t with z.
• Excel does the calculations, so we should use t whenever population variances are unknown (i.e., almost always).

Three Caveats:
• Are the populations severely skewed?
  Are there outliers? Check using histograms and/or dot plots of each sample. t tests are OK if moderately skewed, while outliers are more serious.
• In small samples, the mean may not be a reliable indicator of central tendency and the t-test will lack power.
• In large samples, a small difference in means could be "significant" but may lack practical importance.

Example: Order Size
Are the means equal? Test the hypotheses:
  H0: μ1 = μ2
  H1: μ1 ≠ μ2
Summary statistics in 8 spreadsheet cells and use MegaStat:

             Friday   Saturday
  mean       22.32    25.56
  std. dev.  4.35     6.16
  n          13       18

Hypothesis Test: Independent Groups (t-test, pooled variance)
        29  df
  -3.24000  difference (Friday - Saturday)
  30.07397  pooled variance
   5.48397  pooled std. dev.
   1.99604  standard error of difference
         0  hypothesized difference
    -1.623  t
     .1154  p-value (two-tailed)

Hypothesis Test: Independent Groups (t-test, unequal variance)
        28  df
  -3.24000  difference (Friday - Saturday)
   1.88777  standard error of difference
         0  hypothesized difference
    -1.716  t
     .0972  p-value (two-tailed)

Assuming either Case 2 or Case 3, we would not reject H0 at α = .05 (because the p-value exceeds .05).

Comparing Two Means: Paired Samples (ML 9.2)

Paired Data
• Data occur in matched pairs when the same item is observed twice but under different circumstances.
• For example, blood pressure is taken before and after a treatment is given.
• Paired data are typically displayed in columns.

Paired t Test
• Paired data typically come from a before/after experiment.
• In the paired t test, the difference between x1 and x2 is measured as d = x1 - x2.
• The mean and standard deviation for the differences d are:
  d̄ = Σd / n
  sd = sqrt( Σ(d - d̄)² / (n - 1) )
• The test statistic becomes just a one-sample t-test:
  tcalc = (d̄ - µd) / (sd / √n), with d.f. = n - 1
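The paired t calculation can be sketched in Python as follows. This is an illustrative sketch, not part of the original slides: the helper name `paired_t` is an assumption, and the data are the six Post-Test and Pre-Test exam scores from the Exam Scores example that follows. Only the test statistic is computed; the critical value or p-value would still come from Appendix D or Excel.

```python
import math
import statistics

def paired_t(x1, x2, mu_d0=0.0):
    """Hypothetical helper: paired t test as a one-sample t test on d = x1 - x2.

    Returns (mean difference, std. dev. of differences, t statistic, df).
    """
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = statistics.mean(d)      # mean of the differences
    s_d = statistics.stdev(d)       # sample std. dev. (n - 1 in the denominator)
    se = s_d / math.sqrt(n)         # standard error of the mean difference
    t = (d_bar - mu_d0) / se
    return d_bar, s_d, t, n - 1

# Exam Scores example (Cecil, David, Edward, Fred, Gary, Henry)
post_test = [85, 97, 81, 77, 96, 68]
pre_test = [79, 87, 78, 82, 96, 69]

d_bar, s_d, t, df = paired_t(post_test, pre_test)
print(f"mean diff = {d_bar:.4f}, st dev = {s_d:.4f}, t = {t:.4f}, df = {df}")
```

Because the test reduces to a one-sample t test on the differences, the order of subtraction (Post-Test minus Pre-Test here) determines the sign of t and therefore which tail corresponds to "improved".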
Steps in Testing Paired Data
• Step 1: State the hypotheses. For example:
  H0: µd = 0
  H1: µd ≠ 0
• Step 2: Specify the decision rule. Choose α (the level of significance) and determine the critical values from Appendix D or with use of Excel.
• Step 3: Calculate the test statistic t.
• Step 4: Make the decision. Reject H0 if the test statistic falls in the rejection region(s) as defined by the critical values.

Analogy to Confidence Interval
A two-tailed test for a zero difference is equivalent to asking whether the confidence interval for the true mean difference µd includes zero.

Example: Exam Scores
Right-tailed test to see if mean scores improved.

  Name     Post-Test  Pre-Test  Diff
  Cecil    85         79         6
  David    97         87        10
  Edward   81         78         3
  Fred     77         82        -5
  Gary     96         96         0
  Henry    68         69        -1

  Mean difference        2.1667
  St dev of differences  5.3448

  H0: μd = 0 (no change in mean)
  H1: μd > 0 (improved mean score)
  tcalc = d̄ / (sd / √n) = 2.1667 / (5.3448 / √6) = 0.9930
  t.05 = 2.015
  p-value = 0.1832, from =T.DIST.RT(0.9930,5)

Do not reject H0 because tcalc does not exceed t.05 (p > .05).

Using MegaStat:

Hypothesis Test: Paired Observations
    0.000  hypothesized value
   84.000  mean Post-Test
   81.833  mean Pre-Test
    2.167  mean difference (Post-Test - Pre-Test)
    5.345  std. dev.
    2.182  std. error
        6  n
        5  df
    0.993  t
    .1832  p-value (one-tailed, upper)
   -3.442  confidence interval 95.% lower
    7.776  confidence interval 95.% upper
    5.609  margin of error

The confidence interval includes zero.

Comparing Two Proportions (ML 9.3)

Testing for Zero Difference: π1 - π2 = 0
To test for equality of two population proportions, π1 and π2, use the following hypotheses:
  Left-tailed:  H0: π1 - π2 ≥ 0   H1: π1 - π2 < 0
  Two-tailed:   H0: π1 - π2 = 0   H1: π1 - π2 ≠ 0
  Right-tailed: H0: π1 - π2 ≤ 0   H1: π1 - π2 > 0

Sample Proportions
The sample proportion p1 is a point estimate of π1 and p2 is a point estimate of π2:
  p1 = x1 / n1
  p2 = x2 / n2

Pooled Proportion
If H0 is true, there is no difference between π1 and π2, so the samples are pooled (or averaged) in order to estimate the common population proportion:
  pc = (x1 + x2) / (n1 + n2)

Test Statistic
• If the samples are large, p1 - p2 may be assumed normally distributed.
• The test statistic is the difference of the sample proportions divided by the standard error of the difference.
• The standard error is calculated by using the pooled proportion.
• The test statistic for the hypothesis π1 - π2 = 0 is:
  zcalc = (p1 - p2) / sqrt( pc(1 - pc)(1/n1 + 1/n2) )

Example: Hurricanes
  p1 = x1 / n1 = 19/46 = .4130
  p2 = x2 / n2 = 45/70 = .6429
  pc = (x1 + x2) / (n1 + n2) = (19 + 45) / (46 + 70) = 64/116 = .5517
  zcalc = -2.435

… or using MegaStat:

Hypothesis test for two independent proportions
      p1      p2      pc
   0.413  0.6429  0.5517  p (as decimal)
   19/46   45/70  64/116  p (as fraction)
     19.     45.     64.  X
      46      70     116  n

  -0.2298  difference
       0.  hypothesized difference
   0.0944  std. error
   -2.435  z
   .01491  p-value (two-tailed), from =2*NORM.S.DIST(-2.435,1)
   -0.468  confidence interval 99.% lower
   0.0084  confidence interval 99.% upper
   0.2382  margin of error

Checking for Normality
• We have assumed a normal distribution for the statistic p1 - p2.
• This assumption can be checked.
• For a test of two proportions, the criterion for normality is nπ ≥ 10 and n(1 - π) ≥ 10 for each sample, using each sample proportion in place of π.
• If either sample proportion is not normal, their difference cannot safely be assumed normal.
• The sample size rule of thumb is equivalent to requiring that each sample contains at least 10 "successes" and at least 10 "failures."

Testing for Nonzero Difference

Comparing Two Variances (ML 9.4)

Format of Hypotheses
We may need to test whether two population variances are equal.

The F Test
• The test statistic is the ratio of the sample variances: F = s1² / s2²
• If the variances are equal, this ratio should be near unity: F = 1.
• If the test statistic is far below 1 or above 1, we would reject the hypothesis of equal population variances.
• The numerator s1² has degrees of freedom df1 = n1 - 1 and the denominator s2² has degrees of freedom df2 = n2 - 1.
• The F distribution is skewed with mean > 1 and mode < 1.
Example: 5% right-tailed area for F with df1 = 11 and df2 = 8.

F Test: Critical Values
• For a two-tailed test, critical values for the F test are denoted FL (left tail) and FR (right tail).
• A right-tail critical value FR may be found from Appendix F using df1 and df2 degrees of freedom: FR = F(df1, df2). Excel function is: =F.INV.RT(α, df1, df2)
• A left-tail critical value FL may be found by reversing the numerator and denominator degrees of freedom, finding the critical value from Appendix F and taking its reciprocal: FL = 1/F(df2, df1). Excel function is: =F.INV(α, df1, df2)

Two-Tailed F-Test:
• Step 1: State the hypotheses:
  H0: σ1² = σ2²
  H1: σ1² ≠ σ2²
• Step 2: Specify the decision rule.
  Degrees of freedom are:
  Numerator: df1 = n1 - 1
  Denominator: df2 = n2 - 1
  Choose α and find the left-tail and right-tail critical values from Appendix F or from Excel.
• Step 3: Calculate the test statistic.
• Step 4: Make the decision. Reject H0 if the test statistic falls in the rejection regions as defined by the critical values.

One-Tailed F-Test
• Step 1: State the hypotheses. For example:
  H0: σ1² ≥ σ2²
  H1: σ1² < σ2²
• Step 2: State the decision rule. Degrees of freedom are:
  Numerator: df1 = n1 - 1
  Denominator: df2 = n2 - 1
  Choose α and find the critical value from Appendix F or Excel.
• Step 3: Calculate the test statistic.
• Step 4: Make the decision. Reject H0 if the test statistic falls in the rejection region as defined by the critical value.
Example: 5% left-tailed area for F with df1 = 11 and df2 = 8, from =F.INV(0.05,11,8)

Excel's F Test
Note: Excel uses a left-tailed test if s1² < s2². Conversely, Excel uses a right-tailed test if s1² > s2². So, if you want a two-tailed test, you must double Excel's one-tailed p-value.

Assumptions of the F Test
• The F test assumes that the populations being sampled are normal. It is sensitive to nonnormality of the sampled populations.
• MINITAB reports both the F test and a robust alternative called Levene's test along with their p-values.
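The F statistic and its degrees of freedom can be sketched in Python as follows. This is an illustration only, not part of the original slides: the helper name `f_ratio` is an assumption, and the inputs reuse the Friday/Saturday standard deviations from the Order Size example purely as sample numbers (the slides do not run an F test on them). Python's standard library has no F distribution, so critical values and p-values would still come from Appendix F or the Excel functions named above.

```python
def f_ratio(s1, n1, s2, n2):
    """Hypothetical helper: F statistic for comparing two variances.

    s1, s2 are sample standard deviations; n1, n2 are sample sizes.
    Returns (F, df1, df2, tail), where tail is the one-tailed direction
    Excel would use according to the note above: left if s1^2 < s2^2,
    otherwise right (a ratio of exactly 1 is labeled right here).
    """
    f = s1**2 / s2**2            # ratio of sample variances
    df1, df2 = n1 - 1, n2 - 1    # numerator and denominator degrees of freedom
    tail = "left" if f < 1 else "right"
    return f, df1, df2, tail

# Illustrative inputs: Friday (s = 4.35, n = 13) and Saturday (s = 6.16, n = 18)
f, df1, df2, tail = f_ratio(4.35, 13, 6.16, 18)
print(f"F = {f:.4f}, df1 = {df1}, df2 = {df2}, Excel one-tailed direction: {tail}")
```

With these inputs the ratio falls below 1, so a two-tailed decision would compare F with the left-tail critical value FL, or equivalently double the one-tailed p-value.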