Class Notes April 07

The general steps in any hypothesis test:
1. Check the appropriate conditions for a valid significance test.
2. Determine the null and alternative hypotheses.
3. Summarize the data into an appropriate test statistic (t-statistic for a mean, z-statistic for a proportion).
4. Assuming the null hypothesis is true, find the p-value.
5. Decide whether or not the result is statistically significant based on the p-value.

13.2 Testing Hypotheses about One Population Mean

The usual procedure for testing a hypothesis about one population mean is called a one-sample t-test (that's why we use t* for the CI).

Conditions under which a "t-test" for one population mean is valid:
1. The population of the measurements of interest is bell-shaped and a random sample of any size is measured. In practice, we can use this method as long as there is no evidence that the shape is notably skewed or that there are extreme outliers.
2. The population of measurements of interest is not bell-shaped, but a large random sample is measured. A sample size of 30 is usually taken as "large".

The test statistic is the t-statistic:

$$t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}}$$

Example: Suppose we want to see if the average height of Stat 200 students is less than 67 inches. The following is the Minitab output:

    One-Sample T: Height

    Test of mu = 67 vs mu < 67

    Variable    N    Mean  StDev  SE Mean
    Height    418  66.682  4.863    0.238

    Variable  95.0% Upper Bound      T      P
    Height               67.074  -1.34  0.091

Here t = (66.682 - 67) / 0.238 = -1.34.

Determine the p-value: most software programs can do the work for you. You can also find the p-value from "Tables of the Student t-distribution."

p-value and conclusion: In our case, the p-value is 0.091 > 0.05. Hence, we do not reject the null hypothesis. We do not have enough evidence to conclude that the average height of Stat 200 students is less than 67 inches. Note that failing to reject the null hypothesis does not show that the average height is exactly 67 inches.

Matched Pair Analysis

Paired data most often arise when the researcher collects two measurements from each observational unit.
For instance, if we record weights both before and after a diet program for each person in a sample, we have paired data. With paired data, we calculate the difference between the measurements for each pair, and then we use procedures for analyzing a single sample.

Note: We use d for the difference between the 2 treatments.

The following are the 5 steps that we use to solve a Matched Pair problem.
1. Compute the difference for each unit.
2. Set up the null and alternative hypotheses.
3. Calculate the average difference and the standard deviation of the differences.
4. Check the conditions: the data are paired and the differences are approximately normally distributed.
5. Calculate the confidence interval

$$\bar{d} \pm t^* \frac{s_d}{\sqrt{n}},$$

where $\bar{d}$ is the average difference between the 2 measurements, $s_d$ is the sample standard deviation of the differences, and $n$ is the sample size (the number of pairs). Or carry out a hypothesis test with the t-statistic

$$t = \frac{\bar{d} - \text{null value}}{s_d / \sqrt{n}}.$$

Example: Stichler, Richey, and Mandel compared two methods of measuring treadwear in their paper "Measurement of Treadwear of Commercial Tires" (Rubber Age, 73:2, 1953). 11 tires were each measured for treadwear by two methods, one based on weight and the other on groove wear. We want to see if there is any difference between these two methods.

    weight  groove  difference
      30.5    28.7         1.8
      30.9    25.9         5.0
      31.9    23.3         8.6
      30.4    23.1         7.3
      27.3    23.7         3.6
      20.4    20.9        -0.5
      24.5    16.1         8.4
      20.9    19.9         1.0
      18.9    15.2         3.7
      13.7    11.5         2.2
      11.4    11.2         0.2

The Minitab output is the following:

    Test of mu = 0 vs mu not = 0

    Variable     N   Mean  StDev  SE Mean
    difference  11  3.755  3.221    0.971

    Variable            95.0% CI     T      P
    difference  ( 1.590, 5.919)  3.87  0.003

How do you calculate the difference between "weight" and "groove"? What is the value of $\bar{d}$? What is the value of $s_d$? What are the values of the t-statistic and the p-value?

p-value and conclusion: In this problem, we have p-value = 0.003 < 0.05. Hence, we reject the null hypothesis and conclude that there is a difference between the 2 methods.
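The five matched-pair steps can be traced by hand on the treadwear data. The following is a minimal Python sketch, not part of the original notes, that recomputes the summary statistics from the table above using only the standard library. The critical value `T_STAR = 2.228` is an assumption taken from a standard t-table (95% confidence, 10 degrees of freedom); the p-value is left to software or a table.

```python
import math
import statistics

# Treadwear measurements for the 11 tires, copied from the table above.
weight = [30.5, 30.9, 31.9, 30.4, 27.3, 20.4, 24.5, 20.9, 18.9, 13.7, 11.4]
groove = [28.7, 25.9, 23.3, 23.1, 23.7, 20.9, 16.1, 19.9, 15.2, 11.5, 11.2]

# Step 1: difference for each tire (weight minus groove).
diff = [w - g for w, g in zip(weight, groove)]

# Step 3: average difference and sample standard deviation of the differences.
n = len(diff)
d_bar = statistics.mean(diff)
s_d = statistics.stdev(diff)      # divides by n - 1
se = s_d / math.sqrt(n)           # standard error of the mean difference

# Step 5: t-statistic for H0: mean difference = 0.
null_value = 0.0
t_stat = (d_bar - null_value) / se

# Assumption: t* for 95% confidence with n - 1 = 10 degrees of freedom,
# read from a t-table rather than computed here.
T_STAR = 2.228
ci = (d_bar - T_STAR * se, d_bar + T_STAR * se)

print(f"n = {n}, d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, SE = {se:.3f}")
print(f"t = {t_stat:.2f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Up to rounding of the table value for t*, these quantities should agree with the Minitab output (mean 3.755, StDev 3.221, SE 0.971, T 3.87).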
13.3 The Difference between Two Means (Independent Samples)

It is often of interest to determine whether the means of the populations represented by two independent samples of a quantitative variable differ. In most cases when comparing two means, the null hypothesis is that they are equal:

$$H_0: \mu_1 - \mu_2 = 0$$

Note: Again, in order to carry out this hypothesis test, we need to check the 2 conditions listed in Section 13.2 for both samples. Moreover, the samples must be independent.

In the 2-sample t-test case, we have:

$$t = \frac{\text{sample statistic} - \text{null value}}{\text{standard error}} = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

This is the t-statistic for the unpooled version, where we do not assume equal variances.

Example: Suppose we want to see if, on average, the female students in Stat 200 study more than the male students do. We do a 2-sample t-test using Minitab, and the following is the Minitab output.

    Two-Sample T-Test and CI: StudyHrs, Gender

    Two-sample T for StudyHrs

    Gender    N   Mean  StDev  SE Mean
    female  258  15.93   9.53     0.59
    male    159  14.19   8.57     0.68

    Difference = mu (female) - mu (male)
    Estimate for difference:  1.735
    95% lower bound for difference:  0.247
    T-Test of difference = 0 (vs >): T-Value = 1.92  P-Value = 0.028  DF = 361

p-value and conclusion: From the output, we see the p-value is 0.028 < 0.05. We have evidence to reject the null hypothesis. Hence we conclude that, on average, female students do study more than male students do.

13.4 The Difference between Two Proportions

Conditions for a confidence interval for the difference in two proportions:
1. Sample proportions are available based on independent samples from the two populations.
2. All of the quantities $n_1 \hat{p}_1$, $n_2 \hat{p}_2$, $n_1 (1 - \hat{p}_1)$, and $n_2 (1 - \hat{p}_2)$ are at least 10.

We will use the z-statistic for proportions:

$$z = \frac{\text{sample statistic} - \text{null value}}{\text{standard error}} = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n_1} + \dfrac{\hat{p}(1 - \hat{p})}{n_2}}}, \quad \text{where } \hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}.$$

For example, if we want to see if there is any difference between the proportions of right-handedness among male and female Stat 200 students, we can use a 95% confidence interval to answer this question. The Minitab output is the following:

    Test and CI for Two Proportions: Handed, Gender

    Success = right-handed

    Gender    X    N  Sample p
    female  236  258  0.914729
    male    129  156  0.826923

    Estimate for p(female) - p(male):  0.0878056
    95% CI for p(female) - p(male):  (0.0193535, 0.156258)
    Test for p(female) - p(male) = 0 (vs not = 0):  Z = 2.51  P-Value = 0.012

p-value and conclusion: From the output, we see the p-value is 0.012 < 0.05. We have evidence to reject the null hypothesis. Hence we conclude that there is a difference between the proportions of right-handedness among male and female Stat 200 students. One thing to notice is that although the conclusion is "there is a difference", we can in fact say something more specific: the proportion of right-handed students is higher among females than among males (how?).

13.5 The Relationship between Significance Tests and Confidence Intervals

So far, we have introduced confidence intervals and hypothesis tests for one mean, the difference between 2 means, one proportion, and the difference between 2 proportions. In this section, we concentrate on the important relationship between hypothesis tests and confidence intervals.

Confidence Intervals and Two-sided Alternatives

When testing one population mean or the difference in two population means, with the hypotheses H0: parameter = null value vs. Ha: parameter ≠ null value:
1. If the null value is covered by a (1 - α)100% confidence interval, the null hypothesis is not rejected and the test is not statistically significant at level α.
2. If the null value is not covered by a (1 - α)100% confidence interval, the null hypothesis is rejected and the test is statistically significant at level α.
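The agreement between a two-sided test and a confidence interval can be checked numerically. The sketch below is not part of the original notes; it applies the idea to the handedness counts from Section 13.4 (the same idea carries over to proportions when both the test and the interval use the same standard error). It uses only the Python standard library, and `normal_cdf` is a helper name introduced here. One detail worth flagging: the z formula in Section 13.4 uses the pooled proportion, whereas the Z = 2.51 in the Minitab output appears to correspond to the unpooled standard error. That is an inference from the arithmetic, not something the notes state, so both versions are computed.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Counts of right-handed students from the Minitab output in Section 13.4.
x1, n1 = 236, 258   # female
x2, n2 = 129, 156   # male
p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Unpooled standard error (assumed to be what the Minitab output used).
se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_unpooled = diff / se_unpooled
p_value = 2 * (1 - normal_cdf(abs(z_unpooled)))   # two-sided p-value

# Pooled standard error, following the z formula given in the notes.
p_pool = (x1 + x2) / (n1 + n2)
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_pooled = diff / se_pooled

# 95% confidence interval; z* = 1.96 is the usual table value.
Z_STAR = 1.96
ci = (diff - Z_STAR * se_unpooled, diff + Z_STAR * se_unpooled)

# Two-sided rule: reject H0 at level 0.05 exactly when the 95% CI misses 0.
null_value = 0.0
covered = ci[0] <= null_value <= ci[1]
significant = p_value < 0.05

print(f"z (unpooled) = {z_unpooled:.2f}, z (pooled) = {z_pooled:.2f}")
print(f"p-value = {p_value:.3f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"null value covered: {covered}, significant: {significant}")
```

In this example the interval lies entirely above 0 and the p-value is below 0.05, so the interval and the test lead to the same decision. The pooled and unpooled z values differ slightly but give the same conclusion here.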
Question 1: Suppose that, in order to compare the average time spent driving from State College to NYC with the average time spent driving from NYC to State College, we ask 100 randomly selected students to drive from here to NYC and then drive back. After careful calculation, we obtain a p-value of 0.02, and we test at significance level 0.05. Which one of the following intervals is a possible 95% confidence interval?
a. (0.342, 1.987)
b. (-0.056, 0.743)
c. (-1.256, 0.245)
d. (-2.583, 3.872)

(*The Following Material is Not Required*)

Confidence Intervals and One-sided Alternatives

When testing the hypotheses H0: parameter = null value versus a one-sided alternative (either ">" or "<"), compare the null value to a (1 - 2α)100% confidence interval:
1. If the null value is covered by the interval, the test is not statistically significant at level α.
2. For the alternative Ha: parameter > null value, the test is statistically significant at level α if the entire interval falls above the null value.
3. For the alternative Ha: parameter < null value, the test is statistically significant at level α if the entire interval falls below the null value.

Question 2: Suppose that, in order to see if the average time spent driving from State College to NYC is less than 5 hours, we ask 100 randomly selected students to drive from here to NYC. For each of the following 90% confidence intervals, decide if we can reject the null hypothesis at the significance level α = 0.05.
1. (4.375, 5.551)
2. (3.985, 4.891)
3. (5.075, 5.876)
4. (4.577, 5.425)
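The one-sided rule from the optional material above can be sketched as a small Python function. This is an illustration rather than part of the original notes: the function name `one_sided_significant` is hypothetical, and the critical value 1.649 is an assumed approximate t* for 90% confidence with 361 degrees of freedom. To avoid giving away Question 2, the sketch is applied to the study-hours output from Section 13.3, where the alternative is ">" and α = 0.05.

```python
import math

def one_sided_significant(interval, null_value, alternative):
    """Apply the one-sided rule to a (1 - 2*alpha)100% confidence interval.

    Returns True when the whole interval lies on the side of the null
    value named by the alternative (">" or "<"), False otherwise.
    """
    lower, upper = interval
    if alternative == ">":
        return lower > null_value
    if alternative == "<":
        return upper < null_value
    raise ValueError("alternative must be '>' or '<'")

# Summary values from the study-hours Minitab output in Section 13.3.
n_f, sd_f = 258, 9.53      # female
n_m, sd_m = 159, 8.57      # male
estimate = 1.735           # Minitab's estimate for mu(female) - mu(male)

# Unpooled standard error of the difference in sample means.
se = math.sqrt(sd_f**2 / n_f + sd_m**2 / n_m)

# Assumption: approximate t* for 90% confidence with DF = 361.
T_STAR_90 = 1.649
ci90 = (estimate - T_STAR_90 * se, estimate + T_STAR_90 * se)

# H0: difference = 0 vs Ha: difference > 0, at alpha = 0.05.
reject = one_sided_significant(ci90, 0.0, ">")
print(f"90% CI = ({ci90[0]:.3f}, {ci90[1]:.3f}), significant: {reject}")

# Hypothetical intervals (made up for illustration) for a '<' alternative.
print(one_sided_significant((-3.0, -1.0), 0.0, "<"))   # interval below 0
print(one_sided_significant((-1.0, 1.0), 0.0, "<"))    # interval covers 0
```

The lower end of this 90% interval is the same quantity as the "95% lower bound for difference" in the Minitab output, which is why a (1 - 2α)100% interval is the right one to compare against for a one-sided test at level α.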