Chapter 8 (2/3e) or 9 (1e)
Hypothesis Testing: Two Sample Test for Means and Proportions

Introduction:
The two sample test is similar to the one sample test, except that we are now testing for differences between two populations rather than between a sample and a population.
There are three types of two sample tests:
- Hypothesis Testing with Sample Means (Large Samples)
- Hypothesis Testing with Sample Means (Small Samples)
- Hypothesis Testing with Sample Proportions (Large Samples)

The Question to be Answered:
“Is the difference between sample statistics large enough to conclude that the populations represented by the samples are significantly different?”

Null Hypothesis:
The H0 is that the populations are the same.
H0: μ1 = μ2
If the difference between the sample statistics is large enough (that is, if a difference of this size would be unlikely assuming the H0 is true), we will reject the H0 and conclude there is a difference between the populations.

Null Hypothesis (cont.)
The H0 is a statement of “no difference”.
The 0.05 level will continue to be our indicator of a significant difference.
We change the sample statistics to a Z score, place the Z score on the sampling distribution, and use Appendix A to determine the probability of getting a difference that large if the H0 is true.

Alternate Hypothesis:
The alternate hypothesis is the research hypothesis. If the null hypothesis is rejected, then we will have found evidence to support the research hypothesis.
H1: μ1 ≠ μ2

Formula for Hypothesis Testing with Sample Means (Large Samples)
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{x}-\bar{x}}}$$
Explanation of formula:
The numerator, $\bar{X}_1 - \bar{X}_2$, is the difference in sample means.
The denominator, $\sigma_{\bar{x}-\bar{x}}$, is the “pooled estimate” of the standard error for both samples.
The pooled estimate is calculated by using the sample information in the following formula:
$$\sigma_{\bar{x}-\bar{x}} = \sqrt{\frac{s_1^2}{n_1 - 1} + \frac{s_2^2}{n_2 - 1}}$$

The Five Step Model
1. Make assumptions and meet test requirements.
2. State the H0 and H1.
3. Select the sampling distribution and determine the critical region.
4. Calculate the test statistic.
5. Make a decision and interpret results.

Example: Hypothesis Testing in the Two Sample Case
Text 1e 9.5b, 2/3e 8.5b (Email messages): Middle class families average 8.7 email messages and working class families average 5.7 messages. The middle class families seem to use email more, but is the difference significant?

Problem Information: E-Mail Messages

                     Sample 1 (M. Class)    Sample 2 (W. Class)
Mean                 X̄1 = 8.7               X̄2 = 5.7
Standard deviation   s1 = 0.3               s2 = 1.1
Sample size          n1 = 89                n2 = 55

Step 1 Make Assumptions and Meet Test Requirements
We have:
- Independent random samples
- Level of measurement is interval-ratio
- Sampling distribution is normal in shape because we have a large sample: n1 + n2 ≥ 100 (in this case, n1 + n2 = 144)

Step 2 State the Null Hypothesis
H0: μ1 = μ2
The null asserts there is no significant difference between the populations.
H1: μ1 ≠ μ2
The research hypothesis contradicts the H0 and asserts there is a significant difference between the populations.

Step 3 Select the Sampling Distribution and Establish the Critical Region
Sampling Distribution = Z distribution
Alpha (α) = 0.05
Z (critical) = ±1.96

Step 4 Calculate the Test Statistic
Using the formula, compute the pooled estimate (S.E.):
$$\sigma_{\bar{x}-\bar{x}} = \sqrt{\frac{s_1^2}{n_1 - 1} + \frac{s_2^2}{n_2 - 1}} = \sqrt{\frac{(.3)^2}{89 - 1} + \frac{(1.1)^2}{55 - 1}} = \sqrt{.001 + .022} = .152$$
Solve for Z:
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{x}-\bar{x}}} = \frac{8.7 - 5.7}{.152} = 19.74$$

Step 5 Make a Decision
The obtained test statistic (Z = 19.74) falls in the critical region, so reject the null hypothesis.
The difference between the sample means is so large that we can conclude (at α = 0.05) that a difference exists between the populations represented by the samples.
The difference between the email usage of middle class and working class families is significant (Z = 19.74, α = .05).

Two-tailed Hypothesis Test:
[Figure: sampling distribution with critical regions (C) beyond Z = -1.96 and Z = +1.96; the obtained Z = 19.74 falls in the upper critical region.]
When α = .05, .025 of the area lies in each tail of the curve, in the critical regions (C).
The .95 in the middle section represents no significant difference between the two populations.
The cut-off between the middle section and each .025 tail is a Z-value of ±1.96.
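
The Step 4 arithmetic for the email example can be checked with a short Python sketch. The function names here are illustrative and do not come from the text, and `math.erfc` stands in for the Appendix A lookup. Note that the slide rounds the two terms under the square root before adding them (.001 + .022), which gives .152 and Z = 19.74; carrying full precision gives a standard error of about .153 and a Z of about 19.6. The decision is the same either way.

```python
import math


def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """Return (standard_error, z) for the large-sample test of two means."""
    # Pooled estimate of the standard error: sqrt(s1^2/(n1-1) + s2^2/(n2-1))
    se = math.sqrt(s1**2 / (n1 - 1) + s2**2 / (n2 - 1))
    z = (mean1 - mean2) / se
    return se, z


def two_tailed_p(z):
    """Two-tailed probability of a Z at least this extreme if the H0 is true."""
    # Stand-in for the Appendix A table lookup (assumption: normal curve areas)
    return math.erfc(abs(z) / math.sqrt(2))


# Email example: middle class (sample 1) vs. working class (sample 2)
se, z = two_sample_z(8.7, 0.3, 89, 5.7, 1.1, 55)
p_value = two_tailed_p(z)

z_critical = 1.96  # alpha = 0.05, two-tailed
reject_h0 = abs(z) > z_critical

print(f"S.E. = {se:.3f}, Z = {z:.2f}, reject H0: {reject_h0}")
```

Comparing |Z| with Z (critical) mirrors Step 5; the p-value is an extra check and is not required by the five step model.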

Factors in Making a Decision
- The use of one- vs. two-tailed tests (we are more likely to reject with a one-tailed test).
- The size of the sample (n). The larger the sample, the more likely we are to reject the H0.

Significance vs. Importance
As long as we work with random samples, we must conduct a test of significance.
Significance is not the same thing as importance.
Differences that are otherwise trivial or uninteresting may be significant.

Significance vs. Importance (cont.)
When working with large samples, even small differences may be significant.
The value of the test statistic (step 4) is a direct function of n: the larger the n, the greater the value of the test statistic, and the more likely it will fall in the critical region (region of rejection) and be declared significant.

Significance vs. Importance (cont.)
Significance and importance are different things. A sample outcome could be:
- significant and important
- significant but unimportant
- not significant but important
- not significant and unimportant

Formula for Hypothesis Testing with Sample Proportions (Large Samples)
Formula for proportions:
$$Z = \frac{P_{s1} - P_{s2}}{\sigma_{p-p}}$$
To solve the denominator of this equation, you need to calculate two values: the pooled estimate of the population proportion (Pu) and the standard deviation of the sampling distribution.

Calculating Pu (the Pooled Estimate of the Population Proportion) and the Standard Deviation of the Sampling Distribution
To calculate Pu (the pooled estimate, fig. 7.7 or 8.7):
$$P_u = \frac{n_1 P_{s1} + n_2 P_{s2}}{n_1 + n_2}$$
Standard deviation of the sampling distribution (fig. 7.7 or 8.8):
$$\sigma_{p-p} = \sqrt{P_u (1 - P_u)} \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$$

Example:
Using the same guidelines as for the large sample test for means (above) and the 5-step method, work with a partner and try #9.11 to test for a difference in proportions.
The answer to this question can be found at the back of your text.
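
The two-step calculation for proportions (first Pu, then the standard deviation of the sampling distribution, then Z) can be sketched in Python as follows. The sample values are hypothetical and chosen only for illustration; they are not the data from problem #9.11, and the function name is likewise an assumption.

```python
import math


def two_sample_z_proportions(ps1, n1, ps2, n2):
    """Return (pu, sigma_pp, z) for the large-sample test of two proportions."""
    # Pooled estimate of the population proportion
    pu = (n1 * ps1 + n2 * ps2) / (n1 + n2)
    # Standard deviation of the sampling distribution of the difference
    sigma_pp = math.sqrt(pu * (1 - pu)) * math.sqrt((n1 + n2) / (n1 * n2))
    z = (ps1 - ps2) / sigma_pp
    return pu, sigma_pp, z


# Hypothetical data (NOT from the textbook problem):
# sample 1: proportion .60 with n = 100; sample 2: proportion .50 with n = 100
pu, sigma_pp, z_prop = two_sample_z_proportions(0.60, 100, 0.50, 100)

z_critical = 1.96  # alpha = 0.05, two-tailed
reject_h0_prop = abs(z_prop) > z_critical

print(f"Pu = {pu:.2f}, sigma = {sigma_pp:.4f}, Z = {z_prop:.2f}, "
      f"reject H0: {reject_h0_prop}")
```

With these made-up numbers the Z value stays inside ±1.96, so the H0 would not be rejected. Steps 1 to 3 and step 5 of the five step model still have to be worked through by hand.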

Formula (t-test) for Hypothesis Testing with Sample Means (Small Samples: n1 + n2 < 100)
Formula:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{x}-\bar{x}}}$$
S.E.:
$$\sigma_{\bar{x}-\bar{x}} = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$$
Note: Use the t-table with df = n1 + n2 - 2.

Example:
Using the same format as for the large sample test (above) and the 5-step method, work with a partner and try #9.7a (1e) or #8.7a (2/3e). Do part b for homework.
The answer to this question can be found at the back of your text.

Using SPSS to do Independent Samples Test for Difference in Two Means
SPSS uses a t-test rather than a z-test for both large and small samples.
Follow the guidelines in the text at the end of the chapter.
In interpreting your printout, look at Levene’s test first (shown in the first two columns, F and Sig.).
If the p-value (Sig.) is greater than α = .05, focus on interpreting the top row of the “t-test for Equality of Means”. If it is less than .05, use the bottom row of the t-test.
If the significance level (Sig. 2-tailed) is less than α = .05, then the difference between the sample means is significant.
Report t, df, and your α-level in your interpretation.

Practice Problems
- #8.4 (2/3e) / 9.4 (1e)
- #8.8 (2/3e) / 9.8 (1e)
- #8.12a (2/3e) / 9.12a (1e)
Answers can be found in the lecture list directly below this presentation. No looking before you have tried the questions!
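
For checking hand calculations on the small-sample problems, the t formula and its degrees of freedom can be sketched in Python. The sample values below are hypothetical and are not taken from any textbook problem; the function name is an assumption, and the critical value is passed in by hand after looking it up in a t-table for the computed df.

```python
import math


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Return (standard_error, t, df) for the small-sample test of two means."""
    # Pooled standard error:
    # sqrt((n1*s1^2 + n2*s2^2) / (n1 + n2 - 2)) * sqrt((n1 + n2) / (n1*n2))
    se = (math.sqrt((n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2))
          * math.sqrt((n1 + n2) / (n1 * n2)))
    t = (mean1 - mean2) / se
    df = n1 + n2 - 2
    return se, t, df


# Hypothetical data (NOT from the textbook):
# sample 1: mean 10.0, s = 2.0, n = 15; sample 2: mean 8.0, s = 2.5, n = 12
se_t, t_obtained, df = two_sample_t(10.0, 2.0, 15, 8.0, 2.5, 12)

# Critical t for df = 25, alpha = 0.05, two-tailed, read from a t-table
t_critical = 2.060
reject_h0_t = abs(t_obtained) > t_critical

print(f"S.E. = {se_t:.3f}, t = {t_obtained:.3f}, df = {df}, "
      f"reject H0: {reject_h0_t}")
```

This follows the pooled formula on the slide rather than the procedure SPSS runs, so an SPSS printout for the same data may not match it exactly.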