Section 8.3 Testing the Difference Between Means (Dependent Samples)

Section 8.3 Objectives
• Perform a t-test to test the mean of the differences for a population of paired data

t-Test for the Difference Between Means
• To perform a two-sample hypothesis test with dependent samples, the difference between each data pair is first found:
  $d = x_1 - x_2$ (difference between entries for a data pair)
• The test statistic is the mean $\bar{d}$ of these differences:
  $\bar{d} = \frac{\sum d}{n}$ (mean of the differences between paired data entries in the dependent samples)

Three conditions are required to conduct the test.
1. The samples must be randomly selected.
2. The samples must be dependent (paired).
3. Both populations must be normally distributed.
If these requirements are met, then the sampling distribution for $\bar{d}$ is approximated by a t-distribution with n – 1 degrees of freedom, where n is the number of data pairs.
[Figure: t-distribution of $\bar{d}$ centered at $\mu_d$, with critical values $-t_0$ and $t_0$ marked]

Symbols used for the t-Test for $\mu_d$
Symbol       Description
n            The number of pairs of data
d            The difference between entries for a data pair, $d = x_1 - x_2$
$\mu_d$      The hypothesized mean of the differences of paired data in the population
$\bar{d}$    The mean of the differences between the paired data entries in the dependent samples: $\bar{d} = \frac{\sum d}{n}$
$s_d$        The standard deviation of the differences between the paired data entries in the dependent samples: $s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}}$

• The test statistic is $\bar{d} = \frac{\sum d}{n}$.
• The standardized test statistic is $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$.
• The degrees of freedom are d.f. = n – 1.

t-Test for the Difference Between Means (Dependent Samples)
1. State the claim mathematically and verbally. Identify the null and alternative hypotheses.
   In symbols: State H0 and Ha.
2. Specify the level of significance.
   In symbols: Identify α.
3. Determine the degrees of freedom.
   In symbols: d.f. = n – 1
4. Determine the critical value(s).
   In symbols: Use Table 5 in Appendix B. If n > 29, use the last row (∞).
5. Determine the rejection region(s).
6. Calculate $\bar{d}$ and $s_d$.
   In symbols: $\bar{d} = \frac{\sum d}{n}$, $s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}}$
7. Find the standardized test statistic.
   In symbols: $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$
8. Make a decision to reject or fail to reject the null hypothesis.
   In symbols: If t is in the rejection region, reject H0. Otherwise, fail to reject H0.
9. Interpret the decision in the context of the original claim.

Example: t-Test for the Difference Between Means
A teacher claims that a grammar seminar will help students reduce the number of grammatical errors made when writing essays. The table shows the number of errors made before and after participating in the seminar. Assuming the populations are normally distributed, is there enough evidence to support the claim at α = 0.01?

Student            1    2    3    4    5    6    7
Errors (before)   15   10   12    8    5    4    9
Errors (after)    11    9    6    5    1    0    9

Hypotheses:
Claim: mean_before > mean_after, so mean_before – mean_after > 0, that is, $\mu_{before} - \mu_{after} > 0$, or $\mu_1 - \mu_2 > 0$, or $\mu_d > 0$.
H0: $\mu_d \le 0$
Ha: $\mu_d > 0$ (claim)

Get differences: The two data sets (samples) become a single set of data (the set of differences, "d").

Student            1    2    3    4    5    6    7    Sum
Errors (before)   15   10   12    8    5    4    9
Errors (after)    11    9    6    5    1    0    9
d = x1 – x2        4    1    6    3    4    4    0     22
d^2               16    1   36    9   16   16    0     94

So,
$\bar{d} = \frac{\sum d}{n} = \frac{22}{7} \approx 3.143$
$s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{94 - 22^2/7}{6}} = \sqrt{\frac{94 - 69.143}{6}} = \sqrt{4.143} \approx 2.035$

Get the t test statistic:
$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{3.143 - 0}{2.035 / \sqrt{7}} = \frac{3.143}{0.769} \approx 4.09$

Determine t0: d.f. = n – 1 = 6, α = 0.01, one tail (from Ha). From the table, t0 = 3.143.
Rejection region: t greater than 3.143.
Decision: Since t is in the rejection region (4.09 > 3.143), we reject H0.
There is enough evidence at the 1% significance level to support the teacher's claim that the seminar does reduce the number of grammatical errors.

Example: t-Test for the Difference Between Means
A shoe manufacturer claims that athletes can increase their vertical jump heights using the manufacturer's new Strength Shoes®. The vertical jump heights of eight randomly selected athletes are measured. After the athletes have used the Strength Shoes® for 8 months, their vertical jump heights are measured again. The vertical jump heights (in inches) for each athlete are shown in the table. Assuming the vertical jump heights are normally distributed, is there enough evidence to support the manufacturer's claim at α = 0.10?

Athlete          1    2    3    4    5    6    7    8
Height (old)    24   22   25   28   35   32   30   27
Height (new)    26   25   25   29   33   34   35   30

Hypotheses:
Claim: the shoes increase vertical jump, so mean_before < mean_after, that is, mean_before – mean_after < 0, or $\mu_{before} - \mu_{after} < 0$, or $\mu_1 - \mu_2 < 0$, or $\mu_d < 0$.
H0: $\mu_d \ge 0$
Ha: $\mu_d < 0$ (claim)

Solution: Two-Sample t-Test for the Difference Between Means
d = (jump height before shoes) – (jump height after shoes)
• H0: $\mu_d \ge 0$
• Ha: $\mu_d < 0$ (claim)
• α = 0.10
• d.f. = 8 – 1 = 7
• Rejection Region: t < –1.415 (left-tailed test; from the t-table, the critical value for d.f. = 7 and α = 0.10 is t0 = –1.415)

Before   After     d    d^2
  24      26      –2     4
  22      25      –3     9
  25      25       0     0
  28      29      –1     1
  35      33       2     4
  32      34      –2     4
  30      35      –5    25
  27      30      –3     9
               Σ = –14   Σ = 56

$\bar{d} = \frac{\sum d}{n} = \frac{-14}{8} = -1.75$
$s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{56 - \frac{(-14)^2}{8}}{8 - 1}} \approx 2.1213$

• Test Statistic: $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{-1.75 - 0}{2.1213 / \sqrt{8}} \approx -2.333$
• Decision: Because t ≈ –2.333 is less than t0 = –1.415, it lies in the rejection region. Reject H0.
[Figure: rejection region for the left-tailed test, with the standardized test statistic t ≈ –2.333 marked]
At the 10% level of significance, there is enough evidence to support the shoe manufacturer's claim that athletes can increase their vertical jump heights using the new Strength Shoes®.

Section 8.3 Summary
• Performed a t-test to test the mean of the differences for a population of paired data
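
The two worked examples in this section can be checked numerically. The following is a minimal sketch in Python using only the standard library. The function name paired_t_statistic is a hypothetical helper, not part of any library. The critical values are hard-coded rather than computed: 3.143 is the value quoted in the grammar example, and –1.415 is assumed from a standard t-table for d.f. = 7 and α = 0.10 (one tail).

```python
import math


def paired_t_statistic(before, after, mu_d=0.0):
    """Return (d_bar, s_d, t) for paired data, with d = before - after.

    Hypothetical helper for illustration; mu_d is the hypothesized
    mean of the differences (0 in both examples of this section).
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    n = len(before)
    d = [x1 - x2 for x1, x2 in zip(before, after)]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    d_bar = sum_d / n
    # Shortcut formula: s_d = sqrt((sum d^2 - (sum d)^2 / n) / (n - 1))
    s_d = math.sqrt((sum_d2 - sum_d ** 2 / n) / (n - 1))
    # Standardized test statistic with n - 1 degrees of freedom
    t = (d_bar - mu_d) / (s_d / math.sqrt(n))
    return d_bar, s_d, t


# Grammar seminar example: right-tailed test, Ha: mu_d > 0
errors_before = [15, 10, 12, 8, 5, 4, 9]
errors_after = [11, 9, 6, 5, 1, 0, 9]
d_bar, s_d, t = paired_t_statistic(errors_before, errors_after)
t0_grammar = 3.143  # critical value quoted in the example (d.f. = 6, alpha = 0.01)
print(f"grammar: d_bar={d_bar:.3f}, s_d={s_d:.3f}, t={t:.3f}, reject H0: {t > t0_grammar}")

# Strength Shoes example: left-tailed test, Ha: mu_d < 0
height_old = [24, 22, 25, 28, 35, 32, 30, 27]
height_new = [26, 25, 25, 29, 33, 34, 35, 30]
d_bar, s_d, t = paired_t_statistic(height_old, height_new)
t0_shoes = -1.415  # assumed from a standard t-table (d.f. = 7, alpha = 0.10)
print(f"shoes: d_bar={d_bar:.3f}, s_d={s_d:.3f}, t={t:.3f}, reject H0: {t < t0_shoes}")
```

The helper uses the shortcut form of $s_d$ so that the intermediate sums (Σd and Σd²) match the columns of the worked tables. A statistics package could compute the critical values directly instead of hard-coding them.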