Chapter 6: Statistical Inference

1. The population mean is a fixed number, not a random variable, so it is incorrect to state that it has some probability of lying in a range of values. It is correct to say that you are “95% confident that the population mean lies between -5 and 5,” since this refers to the method by which you are estimating the confidence interval.

2. False. Accepting the null hypothesis does not mean that the null hypothesis is true. It means that there is insufficient evidence to reject it in favor of the alternative hypothesis.

3. False. Rejecting the null hypothesis means that the probability of error associated with incorrectly rejecting the null hypothesis is low enough to allow you to accept the alternative hypothesis. However, the null hypothesis might still be true and your decision might still be in error.

4. a.
\[
\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)
= \left(50 - 1.96\,\frac{20}{\sqrt{25}},\ 50 + 1.96\,\frac{20}{\sqrt{25}}\right)
= (50 - 7.84,\ 50 + 7.84) = (42.16,\ 57.84)
\]

b.
\[
\left(\bar{x} - t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},\ \bar{x} + t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right)
= \left(50 - 2.064\,\frac{20}{\sqrt{25}},\ 50 + 2.064\,\frac{20}{\sqrt{25}}\right)
= (50 - 8.256,\ 50 + 8.256) = (41.74,\ 58.26)
\]

5. a. Let μ = mean price of Civics in San Francisco. The two hypotheses are:
H0: μ = 8500
Ha: μ ≠ 8500
We are assuming that the distribution of Honda Civic prices is normal.

b. The alternative is two-sided, because the hypothesis is that μ differs from the national average; in other words, it is either greater than or less than that average.

c. The acceptance region for the null hypothesis covers the range
\[
\left(\mu_0 - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \mu_0 + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right)
\]
where σ = 600, n = 9, μ0 = 8500 (the hypothesized mean), and α = 0.05. Finally, use Excel’s NORMSINV function to calculate the value of z_{1–α/2} = NORMSINV(0.975) = 1.96. The acceptance region is therefore
\[
\left(8500 - 1.96\,\frac{600}{\sqrt{9}},\ 8500 + 1.96\,\frac{600}{\sqrt{9}}\right)
= (8500 - 392,\ 8500 + 392) = (8108,\ 8892)
\]
Since the acceptance region does not include the sample mean of 9000, we reject the null hypothesis and accept the alternative that the mean price of Civics in San Francisco is not equal to $8500.

d. One way to perform this test is with a one-sample t test.
The t statistic is:
\[
t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
\]
where \bar{x} = 9000, μ0 = 8500, s = 600, and n = 10, so that
\[
t = \frac{9000 - 8500}{600/\sqrt{10}} = 2.635
\]
Using Excel’s TDIST function, TDIST(2.635,9,2) = 0.0271. Hence the two-sided p-value of the t statistic is 2.71%, and at the 5% significance level the price difference is significant.

6. a. The unknown population mean of the American speakers is μ1; the population mean for the imported speakers is μ2. The null and alternative hypotheses are:
H0: μ1 – μ2 = 0
Ha: μ1 – μ2 ≠ 0

b. Call the sample average of the American speakers \bar{x}_1 and the sample average of the imported speakers \bar{x}_2. To test whether these averages indicate that the types of speakers are different in price, use the t-test:
\[
t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
\quad\text{where}\quad
s = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
\]
The first group, the American-made speakers, has a sample size n1 = 10 and a standard deviation s1 = 5. The second group, the imported speakers, has a sample size n2 = 5 and a standard deviation s2 = 4. Therefore, the pooled standard deviation, s, is:
\[
s = \sqrt{\frac{(10 - 1)\,5^2 + (5 - 1)\,4^2}{10 + 5 - 2}} = \sqrt{\frac{289}{13}} = 4.715
\]
Under the null hypothesis, the value of the t statistic is:
\[
t = \frac{(90 - 85) - 0}{4.715\,\sqrt{\dfrac{1}{10} + \dfrac{1}{5}}} = \frac{5}{2.5825} = 1.936
\]
This t statistic follows the t distribution with 13 degrees of freedom under the null hypothesis. Using Excel’s TDIST function we can calculate the probability of a t-distributed random variable having such an extreme value: TDIST(1.936,13,2) = 0.0749. The difference between the two speakers is not significant at the 5% level, and we cannot reject the hypothesis of equal population means for the two speaker groups.

c. The difference is significant at the 10% level, so at this less stringent level of statistical significance we can reject the hypothesis of equal means.

7.
Given the distribution of the t-statistic, we know that:
\[
P\left(-t_{1-\alpha/2,\,n-1} \le \frac{\bar{x} - \mu}{s/\sqrt{n}} \le t_{1-\alpha/2,\,n-1}\right) = 1 - \alpha
\]
Multiplying the inequalities by s/\sqrt{n} yields:
\[
P\left(-t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \le \bar{x} - \mu \le t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right) = 1 - \alpha
\]
and subtracting \bar{x} from each term of this inequality, we get:
\[
P\left(-\bar{x} - t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \le -\mu \le -\bar{x} + t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right) = 1 - \alpha
\]
We multiply each term by –1 (changing the signs and the direction of the inequalities):
\[
P\left(\bar{x} + t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \ge \mu \ge \bar{x} - t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right) = 1 - \alpha
\]
and finally, we rearrange the terms and inequalities to arrive at the final form of the confidence interval:
\[
P\left(\bar{x} - t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right) = 1 - \alpha
\]
So, the upper and lower confidence limits for μ are:
\[
\bar{x} \pm t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
\]

8. a. Let μ1 = mean number of beds for non-rural homes and μ2 = mean number of beds in the rural homes. The null and alternative hypotheses are:
H0: μ1 – μ2 = 0
Ha: μ1 – μ2 ≠ 0

b. The results of the two-sample t-test are:

Descriptive Statistics (Beds)
Location     N    Mean     Std. Dev   Std. Err
Non-rural    18   111.39   43.568     10.269
Rural        34   83.68    36.436     6.249

t test Analysis (Beds)
Method     Mean Diff.   Std. Err.   t       df      p-value   lower 95%   upper 95%
pooled     27.71        11.370      2.437   50.00   0.018     4.87        50.55
unpooled   27.71        12.021      2.305   29.81   0.028     3.13        52.30

Equality of Variance Tests (p-values)
F-test   Bartlett   Levene
0.370    0.395      0.808

c. The distribution for the two location types appears as follows:
[Figure: distribution of Beds for non-rural and rural homes]
The standard deviations of the two samples are fairly close (43.568 and 36.436), which would lead us to believe that a pooled estimate of the standard deviation is appropriate. Moreover, the results of the F-test, Bartlett test, and Levene test support this conclusion. However, there is an outlier in both samples, which may cause us to doubt the validity of using the t-test in this situation.

d. In the Mann-Whitney test, we choose between the following hypotheses:
H0: The median number of beds in the rural nursing homes = median number of beds in the non-rural homes.
Ha: The median numbers of beds are not equal.
The results of the Mann-Whitney test are:

Descriptive Statistics (Beds)
Location     N    Minimum   1st Quartile   Median   3rd Quartile   Maximum
Non-rural    18   60.00     81.75          120.00   120.00         244.00
Rural        34   25.00     59.25          80.00    106.50         221.00

Mann-Whitney Rank Analysis (Beds)
Median Diff.   Rank Sum1   Rank Sum2   p-value   lower 95%   upper 95%
26.50          613.5       764.5       0.009     4.00        52.00

Based on the results of the test, we reject the null hypothesis with a p-value of 0.009 and accept the alternative hypothesis that there are more beds in non-rural homes.

e. Whether we use the results of the t-tests or the Mann-Whitney test, the conclusion is the same: there are significantly more beds in non-rural homes. The confidence intervals and estimates of the difference between the locations are also very similar.

9. a. The first few values of the Days_Beds variable are:

Beds   Medical Days   Total Days   Revenue   Salaries   Expenses   Location    Days_Beds
244    128            385          23521     5230       5334       Non-rural   1.578
59     155            203          9160      2459       493        Rural       3.441
120    281            392          21900     6304       6115       Non-rural   3.267
120    291            419          22354     6590       6346       Non-rural   3.492
120    238            363          17421     5362       6225       Non-rural   3.025
65     180            234          10531     3622       449        Rural       3.600
120    306            372          22147     4406       4998       Rural       3.100
90     214            305          14025     4173       966        Rural       3.389
96     155            169          8812      1955       1260       Non-rural   1.760
120    133            188          11729     3224       6442       Rural       1.567

b. First use a two-sample t-test. Let μ1 = mean of the Days_Beds variable for non-rural homes and μ2 = mean of the Days_Beds variable for rural homes. The null and alternative hypotheses are:
H0: μ1 – μ2 = 0
Ha: μ1 – μ2 ≠ 0
Another approach is to use the Mann-Whitney test to evaluate the hypotheses:
H0: The median Days_Beds value in the rural nursing homes = median Days_Beds value in the non-rural homes.
Ha: The median Days_Beds values are not equal.
The results of the two tests are:

Descriptive Statistics (Days_Beds)
Location     N    Mean      Std. Dev   Std. Err
Non-rural    18   3.01505   0.576282   0.135831
Rural        34   3.11144   0.637401   0.109313

t test Analysis (Days_Beds)
Mean Diff.   Std. Err.   t        df      p-value   lower 95%   upper 95%
-0.09640     0.179938    -0.536   50.00   0.595     -0.45781    0.26502

Equality of Variance Tests (p-values)
F-Test   Bartlett   Levene
0.673    0.641      0.688

Descriptive Statistics (Days_Beds)
Location     N    Minimum   1st Quartile   Median   3rd Quartile   Maximum
Non-rural    18   1.578     2.775          3.222    3.458          3.550
Rural        34   1.567     2.818          3.279    3.545          4.699

Mann-Whitney Rank Analysis (Days_Beds)
Median Diff.   Rank Sum1   Rank Sum2   p-value   lower 95%   upper 95%
-0.061         435.500     942.500     0.419     -0.322      0.165

c. The distribution of the Days_Beds variable appears as follows:
[Figure: distribution of Days_Beds by location]

d. We fail to reject the null hypothesis under either test. There is no indication that the beds are being utilized at different rates.

10. a. Let μ1 = mean draft number in the first half of the year and μ2 = mean draft number in the second half of the year. The null and alternative hypotheses are:
H0: μ1 – μ2 = 0
Ha: μ1 – μ2 ≠ 0

b. The equality of variance tests give no significant evidence against the hypothesis that the standard deviations of the two samples are equal, so we'll use the pooled estimate. The result of the two-sample t-test is:

Descriptive Statistics (Number)
Half       N     Mean     Std. Dev   Std. Err
Half = 1   182   206.38   106.149    7.868
Half = 2   184   160.92   100.757    7.428

t test Analysis (Number)
Mean Diff.   Std. Err.   t       df       p-value   lower 95%   upper 95%
45.46        10.817      4.202   364.00   0.000     24.18       66.73

Equality of Variance Tests (p-values)
F-Test   Bartlett   Levene
0.482    0.483      0.380

c. The distribution of the two samples appears as:
[Figure: distribution of draft numbers for each half of the year]
The distribution resembles a Uniform distribution; however, since the t-test is robust to departures from Normality, it can still be used here.

d. The 95% confidence intervals are:

Half       N     Mean     Std. Dev   Std. Err   lower 95%   upper 95%
Half = 1   182   206.38   106.149    7.868      190.85      221.90
Half = 2   184   160.92   100.757    7.428      146.27      175.58

e.
A draft number selected for a person born in the first half of the year is, on average, 45.46 points higher than a draft number for a person whose birthday falls in the second half of the year. This is a statistically significant difference. The ramification of this result is that the assignment of draft numbers is not truly random and that people born in the second half of the year are more likely to receive low draft numbers, and hence are more likely to be drafted.

11. a. Let μ1 = mean female salary and μ2 = mean male salary. The null and alternative hypotheses are:
H0: μ1 – μ2 = 0
Ha: μ1 – μ2 ≠ 0
We'll use a significance level of 5% for this test.

b. The unpooled and pooled t-test results are:

Descriptive Statistics (Salary)
Sex   N    Mean        Std. Dev    Std. Err
F     37   27,027.35   5,478.231   900.616
M     44   33,004.91   5,831.991   879.206

t test Analysis (Salary)
Method     Mean Diff.   Std. Err.   t        df      p-value   lower 95%   upper 95%
unpooled   -5,977.56    1,258.615   -4.749   78.00   0.000     -8,483.27   -3,471.85
pooled     -5,977.56    1,265.517   -4.723   79.00   0.000     -8,496.51   -3,458.61

Equality of Variance Tests (p-values)
F-Test   Bartlett   Levene
0.705    0.698      0.000

The conclusions of the two tests are the same, suggesting that there is a statistically significant difference in salary between the male and female professors. A histogram of the difference appears as follows:
[Figure: histogram of the salary difference]

c. There are only enough data for the assistant professor and instructor groups. The unpooled and pooled t-test results are:

Descriptive Statistics
Rank         Sex   N    Mean        Std. Dev    Std. Err
asst prof    F     17   28,274.18   6,598.744   1,600.430
asst prof    M     15   31,202.33   5,027.220   1,298.023
instructor   F     20   25,967.54   4,197.807   938.658
instructor   M     17   30,960.41   5,829.235   1,413.797

unpooled t test Analysis
Rank         Mean Diff.   Std. Err.   t        df      p-value   lower 95%   upper 95%
asst prof    -2,928.16    2,060.641   -1.421   29.42   0.166     -7,142.64   1,286.33
instructor   -4,992.87    1,697.027   -2.942   28.54   0.006     -8,469.08   -1,516.66

pooled t test Analysis
Rank         Mean Diff.   Std. Err.   t        df      p-value   lower 95%   upper 95%
asst prof    -2,928.16    2,096.262   -1.397   30.00   0.173     -7,209.29   1,352.98
instructor   -4,992.87    1,652.707   -3.021   35.00   0.005     -8,348.05   -1,637.69

Equality of Variance Tests (p-values)
Rank         F-Test   Bartlett   Levene
asst prof    0.312    0.307      0.173
instructor   0.173    0.177      0.005

The conclusion from the pooled and unpooled t-tests is the same: there is a significant difference for instructors, but not one for assistant professors.

d. There is some evidence that the university has underpaid its female faculty members. The overall p-value across all ranks is less than 0.0001. When testing individual ranks, there is statistical significance only for the instructors. There is no significant difference for assistant professors and not enough data for the other teaching ranks. However, there are some factors that have not been considered yet. For example, the age at which a person was hired and a measure of the person's teaching experience should also be considered in any analysis of this type.

12. a. Let μ1 = mean graduation rate for white female athletes and μ2 = mean graduation rate for white male athletes. Let μd = μ1 – μ2. The null and alternative hypotheses are:
H0: μd = 0
Ha: μd ≠ 0

b. The results of the paired t-test are:

Descriptive Statistics
Group           N    Mean    Std. Dev.   Std. Err.
White Females   11   82.73   7.254       2.187
White Males     11   68.45   8.466       2.553

t-Test Analysis
             N    Mean    Std. Dev   Std. Err   t       df   p-value   lower 95%   upper 95%
Difference   11   14.27   6.513      1.964      7.268   10   0.000     9.90        18.65

There is strong evidence for a difference in the graduation rates. White female athletes have a graduation rate that is 14.27 points higher than their male counterparts. The 95% confidence interval for the difference is (9.90, 18.65).
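The interval for the paired difference can be checked by hand from the summary statistics in the table (N = 11, mean difference 14.27, standard deviation 6.513). The following is a minimal Python sketch, not the software used to produce the table; the function name `paired_t_interval` is hypothetical, and the critical values 2.228 and 1.812 are assumed t-table values for 10 degrees of freedom.

```python
import math


def paired_t_interval(mean_diff, sd_diff, n, t_crit):
    """Confidence interval for the mean of n paired differences.

    t_crit is the t critical value for n - 1 degrees of freedom,
    supplied by the caller (for example, from a printed t table).
    """
    std_err = sd_diff / math.sqrt(n)      # standard error of the mean difference
    half_width = t_crit * std_err
    return mean_diff - half_width, mean_diff + half_width


# Summary statistics copied from the paired t-test table.
N, MEAN_DIFF, SD_DIFF = 11, 14.27, 6.513

# Assumed table values for 10 degrees of freedom:
T_TWO_SIDED_95 = 2.228   # t(0.975, 10)
T_TWO_SIDED_90 = 1.812   # t(0.95, 10)

ci95 = paired_t_interval(MEAN_DIFF, SD_DIFF, N, T_TWO_SIDED_95)
ci90 = paired_t_interval(MEAN_DIFF, SD_DIFF, N, T_TWO_SIDED_90)
print("95%% interval: (%.2f, %.2f)" % ci95)
print("90%% interval: (%.2f, %.2f)" % ci90)
```

Lowering the confidence level only changes the critical value, so the 90% interval is narrower than the 95% interval around the same mean difference.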
The 90% confidence interval is (10.71, 17.83).

c. The results of the one-sample Wilcoxon signed rank test are:

Descriptive Statistics
Group           N    Minimum   1st Quartile   Median   3rd Quartile   Maximum
White Females   11   71        78             82       87             94
White Males     11   55        64             69       72             83

Wilcoxon Sign Rank Analysis
             N    N<0   N=0   N>0   Median   p-value   lower 95%   upper 95%
Difference   11   0     0     11    15.00    0.001     10.00       19.00

The results of the Sign test are:

Descriptive Statistics
Group           N    Minimum   1st Quartile   Median   3rd Quartile   Maximum
White Females   11   71        78             82       87             94
White Males     11   55        64             69       72             83

Sign Test Analysis
             N    N<0   N=0   N>0   Median   p-value   lower 95%   upper 95%   Achieved Confidence
Difference   11   0     0     11    15.00    0.001     9.28        20.15       0.950

In both tests we reject the null hypothesis, accepting the alternative that there is a difference in graduation rates.

d. We make the assumption that the only difference involved is gender, since each pair comes from the same school and involves athletes, so these are paired comparisons.

e. White female athletes have a significantly higher graduation rate than white male athletes in the Big Ten conference. However, since this sample was not a random sample but a sample of a specific group, the results can only be applied to the Big Ten schools. To test whether these results can be applied to all universities, a random sample of universities would have to be drawn.

13. a. Let μ1 = mean minority refusal rate and μ2 = mean white refusal rate. Let μd = μ1 – μ2. The null and alternative hypotheses are:
H0: μd = 0
Ha: μd ≠ 0

b. The paired t-test results are:

Descriptive Statistics
Group      N    Mean     Std. Dev.   Std. Err.
Minority   20   36.882   13.0509     2.9183
White      20   15.625   7.7958      1.7432

t-Test Analysis
             N    Mean     Std. Dev   Std. Err   t        df   p-value   lower 95%   upper 95%
Difference   20   21.257   8.2954     1.8549     11.460   19   0.000     17.374      25.139

c. The histogram and P-plot appear as follows:
[Figure: histogram "Differences in Refusal Rates" (x-axis: Refusal Rates, y-axis: Count)]
[Figure: "P-Plot of Difference in Refusals"]
There is no evidence of a violation of the assumption of Normality.

d. The results of the Wilcoxon Signed Rank test are:

Descriptive Statistics
Group      N    Minimum   1st Quartile   Median   3rd Quartile   Maximum
Minority   20   10.6      26.4           37.4     45.3           62.2
White      20   3.7       9.2            15.8     20.2           32.4

Wilcoxon Sign Rank Analysis
             N    N<0   N=0   N>0   Median   p-value   lower 95%   upper 95%
Difference   20   0     0     20    19.050   0.000     17.315      25.100

The results of the Wilcoxon test match the results of the t-test.

e. For high income applicants, the results of the t-test and the Wilcoxon test are as follows:

Descriptive Statistics
Group                  N    Mean     Std. Dev.   Std. Err.
High Income Minority   20   27.515   10.9220     2.4422
High Income White      20   11.300   6.5164      1.4571

t-Test Analysis
             N    Mean     Std. Dev   Std. Err   t       df   p-value   lower 95%   upper 95%
Difference   20   16.215   8.1082     1.8131     8.943   19   0.000     12.420      20.010

Descriptive Statistics
Group                  N    Minimum   1st Quartile   Median   3rd Quartile   Maximum
High Income Minority   20   5.8       21.3           29.1     37.0           41.3
High Income White      20   2.2       7.4            9.8      15.1           26.8

Wilcoxon Sign Rank Analysis
             N    N<0   N=0   N>0   Median   p-value   lower 95%   upper 95%
Difference   20   0     1     19    17.350   0.000     12.100      20.750

The distribution of the paired differences in refusal rates appears as follows:
[Figure: histogram "Differences in Refusal Rates (High Income)" (x-axis: Refusal Rates, y-axis: Counts)]
[Figure: "P-Plot of Difference in Refusals (High Income)"]
There is no reason to doubt the assumption of normality from these charts.

14. a. The 95% confidence intervals for the numeric variables are:

Variable             Area      N    Mean        Std. Dev.   lower 95%   upper 95%
Salary Pupil Ratio   North     21   6.3510      0.78299     5.9945      6.7074
Salary Pupil Ratio   South     17   7.1318      0.83919     6.7003      7.5632
Salary Pupil Ratio   West      13   7.1131      1.50153     6.2057      8.0204
Salary Pupil Ratio   Overall   51   6.8055      1.07669     6.5027      7.1083
Spending per Pupil   North     21   3,900.62    796.898     3,537.88    4,263.36
Spending per Pupil   South     17   3,274.41    756.910     2,885.24    3,663.58
Spending per Pupil   West      13   3,919.15    1,560.191   2,976.34    4,861.97
Spending per Pupil   Overall   51   3,696.61    1,054.761   3,399.95    3,993.26
Teacher Salary       North     21   24,424.14   3,725.544   22,728.30   26,119.99
Teacher Salary       South     17   22,894.00   3,553.857   21,066.78   24,721.22
Teacher Salary       West      13   26,158.62   5,123.734   23,062.37   29,254.86
Teacher Salary       Overall   51   24,356.22   4,179.426   23,180.73   25,531.70

b. The 95% non-parametric confidence intervals are:

Variable             Area      N    Median   lower 95%   upper 95%
Salary Pupil Ratio   North     21   6.25     5.960       6.710
Salary Pupil Ratio   South     17   7.34     6.660       7.585
Salary Pupil Ratio   West      13   6.86     6.055       8.070
Salary Pupil Ratio   Overall   51   6.77     6.455       7.090
Spending per Pupil   North     21   3,621    3,457.0     4,278.0
Spending per Pupil   South     17   2,980    2,828.5     3,664.5
Spending per Pupil   West      13   3,705    3,115.5     4,603.0
Spending per Pupil   Overall   51   3,554    3,334.0     3,840.0
Teacher Salary       North     21   24,500   22,632.5    26,406.0
Teacher Salary       South     17   22,080   21,179.5    24,003.0
Teacher Salary       West      13   25,788   23,561.0    27,488.5
Teacher Salary       Overall   51   23,382   23,029.5    25,036.0

15. a. The null and alternative hypotheses are:
H0: The number of pollution days is the same throughout the time period.
Ha: The number of pollution days is not equal.

b. The results of the t-test analysis are:

t-Test Analysis
         N    Mean      Std. Dev.   Std. Err.   t        df       p-value   lower 95%   upper 95%
Diff80   14   -16.600   26.2461     7.0146      -2.367   13.000   0.034     -31.754     -1.446

Based on this analysis we reject the null hypothesis and conclude, with a p-value of 0.034, that the average number of pollution days from 1985 to 1989 is significantly lower than the number of days in 1980. The 95% confidence interval puts the decrease at between 1.446 and 31.754 days.

c. The Normal P-plot appears as follows:
[Figure: Normal P-plot of Diff80]
Because of one extreme outlier there is some question whether the t-test is appropriate for this data. The outlier might have a large impact on the test's conclusion.

d. The results of the Wilcoxon test are:

Wilcoxon Sign Rank Analysis
         N    N<0   N=0   N>0   Median   p-value   lower 95%   upper 95%
Diff80   14   11    0     3     -10.9    0.008     -21.600     -4.400

The test results are actually stronger than those provided by the t-test. This is probably because the Wilcoxon test is not influenced by the presence of a large outlier. Because of the outlier, it is better to report the non-parametric result.

16. a. The null and alternative hypotheses are:
H0: The resistance levels will be the same for men and women throughout the course of the study.
Ha: The resistance levels will not be the same.

b. The descriptive statistics of the Resistance variable are:

Day   Gender   N   Mean      Std. Dev   Std. Err
1     Female   8   211.400   10.0504    3.5534
1     Male     7   185.071   21.0666    7.9624
5     Female   8   211.025   16.5697    5.8583
5     Male     7   179.871   20.5025    7.7492
9     Female   8   203.763   16.0285    5.6669
9     Male     7   175.314   17.3432    6.5551
10    Female   8   254.463   15.5860    5.5105
10    Male     7   199.829   24.1288    9.1198
13    Female   8   238.600   14.6502    5.1796
13    Male     7   195.629   19.6891    7.4418
16    Female   8   241.975   21.7236    7.6805
16    Male     7   202.100   21.0396    7.9522
19    Female   8   236.750   15.4489    5.4620
19    Male     7   200.414   18.1770    6.8703
20    Female   8   208.163   23.2408    8.2169
20    Male     7   179.071   14.6212    5.5263
22    Female   8   191.838   22.5575    7.9753
22    Male     7   182.386   19.4655    7.3573
24    Female   8   192.750   19.2699    6.8129
24    Male     7   179.414   13.1021    4.9521

c. The t-test results are:

Resistance   Mean Diff.   Std. Err.   t       df      p-value   lower 95%   upper 95%
Day = 1      26.329       8.3327      3.160   13.00   0.008     8.327       44.330
Day = 5      31.154       9.5690      3.256   13.00   0.006     10.481      51.826
Day = 9      28.448       8.6163      3.302   13.00   0.006     9.834       47.062
Day = 10     54.634       10.3447     5.281   13.00   0.000     32.286      76.982
Day = 13     42.971       8.8815      4.838   13.00   0.000     23.784      62.159
Day = 16     39.875       11.0811     3.598   13.00   0.003     15.936      63.814
Day = 19     36.336       8.6758      4.188   13.00   0.001     17.593      55.079
Day = 20     29.091       10.2144     2.848   13.00   0.014     7.024       51.158
Day = 22     9.452        10.9651     0.862   13.00   0.404     -14.237     33.140
Day = 24     13.336       8.6475      1.542   13.00   0.147     -5.346      32.018

There are significant differences between males and females on every day except days 22 and 24. This is the case even if an unpooled variance estimate is used.

d. The scatterplot appears as follows:
[Figure: scatterplot "Resistance Data" (x-axis: Day, y-axis: Resistance; series: Female, Male)]

e. The results of the Mann-Whitney test are:

Resistance   Median Diff.   Rank Sum1   Rank Sum2   p-value   lower 95%   upper 95%
Day = 1      -31.250        38.5        81.5        0.047     -44.400     0.000
Day = 5      -32.700        35.0        85.0        0.014     -52.000     -9.700
Day = 9      -29.900        33.0        87.0        0.006     -48.500     -7.400
Day = 10     -63.200        29.0        91.0        0.001     -75.800     -30.600
Day = 13     -43.050        30.0        90.0        0.001     -56.900     -22.200
Day = 16     -44.750        33.0        87.0        0.006     -62.600     -14.500
Day = 19     -39.950        32.0        88.0        0.004     -56.500     -16.200
Day = 20     -27.650        36.5        83.5        0.025     -55.800     -5.800
Day = 22     -9.850         48.5        71.5        0.430     -36.200     15.900
Day = 24     -15.100        42.0        78.0        0.121     -32.000     6.600

The conclusions of this test match the conclusions of the t-test.

f. During the first nine days of observation (the control period), females had higher resistance scores than males, indicating a higher level of blood loss. During the test period, from day 10 through day 20, the resistance levels increased. This was most evident on the first day of the test period, and was particularly noticeable for females. In the recovery period (days 22 and 24), the resistance levels decreased and there was no significant difference between the male and female participants.

17. a. Let μ1 = mean exam score for control students and μ2 = mean exam score for experimental method students. This is a two-sided test. The null and alternative hypotheses are:
H0: μ1 – μ2 = 0
Ha: μ1 – μ2 ≠ 0

b. The results of the unpooled and pooled t-tests are:

Descriptive Statistics (Final Score)
Group          N    Mean    Std. Dev   Std. Err
Control        79   23.87   11.597     1.305
Experimental   85   27.34   8.847      0.960

t test Analysis (Final Score)
Method     Mean Diff.   Std. Err.   t        df       p-value   lower 95%   upper 95%
unpooled   -3.47        1.620       -2.141   145.64   0.034     -6.67       -0.27
pooled     -3.47        1.604       -2.162   162.00   0.032     -6.64       -0.30

Equality of Variance Tests (p-values)
F-Test   Bartlett   Levene
0.015    0.016      0.039

Based on the results of the equality of variance tests, we should use the unpooled t-test results.
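The unpooled figures can be reproduced from the descriptive statistics alone. The sketch below is an illustration in Python rather than the output of the statistics package; it assumes the unpooled test is Welch's t-test with the Welch-Satterthwaite approximation for the degrees of freedom, and the function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled (Welch) two-sample t statistic from summary statistics.

    Returns (standard error, t statistic, approximate degrees of freedom).
    """
    v1 = sd1 ** 2 / n1                    # squared standard error, group 1
    v2 = sd2 ** 2 / n2                    # squared standard error, group 2
    std_err = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / std_err
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return std_err, t_stat, df


# Control vs. Experimental, values copied from the descriptive statistics table.
std_err, t_stat, df = welch_t(23.87, 11.597, 79, 27.34, 8.847, 85)
print("std err = %.3f, t = %.3f, df = %.2f" % (std_err, t_stat, df))
```

Because the inputs are rounded table values, the t statistic can differ from the printed -2.141 in the third decimal place.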
Based on that test, the 95% confidence interval for the difference in final scores is (–6.67, –0.27) with a p-value of 0.034. Test scores are statistically significantly higher for students using the experimental method.

c. The distribution of scores for the two groups is:
[Figure: distribution of final scores for the Control and Experimental groups]
The scores are not evenly distributed. The distribution of the scores in both methods seems to break into two groups. The first group contains those students who score less than 10 on the final exam. The second group consists of those students who score 19 or better on the final. There are very few students who score between 10 and 19. Because of the shape of the distribution, there is some question whether the t-test is the appropriate test for this data.

d. The results of the Mann-Whitney test are:

Descriptive Statistics (Final Score)
Group          N    Minimum   1st Quartile   Median   3rd Quartile   Maximum
Control        79   0.00      20.00          28.00    32.00          39.00
Experimental   85   0.00      25.00          29.00    34.00          38.00

Mann-Whitney Rank Analysis (Final Score)
Median Diff.   Rank Sum1   Rank Sum2   p-value   lower 95%   upper 95%
-2.00          6,006.5     7,523.5     0.092     -4.00       0.00

Using the non-parametric test, we would not reject the null hypothesis since the p-value is 0.092.

e. We arrive at two different conclusions based on the two statistical tests. In examining the distribution of the scores, we noticed that the scores did not appear to follow the Normal distribution. This raises some concern about the appropriateness of the t-test, so we rely on the Mann-Whitney test and fail to reject the null hypothesis.

18. a. The results of the paired t-tests are:

Descriptive Statistics
Region    Variable   N    Mean     Std. Dev.   Std. Err.
E         Dem1980    13   42.415   5.4626      1.5150
E         Dem1984    13   42.162   5.1699      1.4339
MW        Dem1980    10   37.690   7.1507      2.2612
MW        Dem1984    10   40.040   6.5862      2.0827
S         Dem1980    10   48.000   4.1918      1.3256
S         Dem1984    10   38.370   2.0022      0.6332
W         Dem1980    17   34.106   6.8112      1.6520
W         Dem1984    17   35.753   6.0693      1.4720
Overall   Dem1980    50   39.762   7.9227      1.1204
Overall   Dem1984    50   38.800   5.8179      0.8228

t-Test Analysis (Difference)
Region    N    Mean     Std. Dev   Std. Err   t        df   p-value   lower 95%   upper 95%
E         13   0.254    3.5225     0.9770     0.260    12   0.799     -1.875      2.382
MW        10   -2.350   3.3909     1.0723     -2.192   9    0.056     -4.776      0.076
S         10   9.630    3.3549     1.0609     9.077    9    0.000     7.230       12.030
W         17   -1.647   3.6797     0.8925     -1.846   16   0.084     -3.539      0.245
Overall   50   0.962    5.6308     0.7963     1.208    49   0.233     -0.638      2.562

b. The only area in which there was a significant difference in voting was the Southern region. The p-value for this region is < 0.0001 and the 95% confidence interval is (7.23, 12.03), indicating a decline in voting percentage of between 7 and 12 points. Overall, there was not a significant difference after pooling all regions.

19. a. Let μ1 = mean exam score for the female students and μ2 = mean exam score for male students. The null and alternative hypotheses are:
H0: μ1 – μ2 = 0
Ha: μ1 – μ2 ≠ 0

b. The results of the two-sample t-tests are:

Descriptive Statistics (Exam Score)
Gender   N    Mean    Std. Dev   Std. Err
Female   37   80.41   10.057     1.653
Male     43   79.93   12.618     1.924

t test Analysis (Exam Score)
Method     Mean Diff.   Std. Err.   t       df      p-value   lower 95%   upper 95%
unpooled   0.48         2.537       0.187   77.58   0.852     -4.58       5.53
pooled     0.48         2.580       0.184   78.00   0.854     -4.66       5.61

Equality of Variance Tests (p-values)
F-Test   Bartlett   Levene
0.167    0.165      0.193

There is no reason not to use the pooled t-test results. We fail to reject the null hypothesis with a p-value of 0.854.
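As a cross-check, the pooled standard error and confidence interval can be recomputed from the descriptive statistics. This is a minimal Python sketch under stated assumptions, not the package's own routine: the function name `pooled_t` is hypothetical, and the critical value 1.991 is an assumed t-table value for 78 degrees of freedom.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled two-sample t statistic from summary statistics.

    Returns (standard error, t statistic, degrees of freedom).
    """
    df = n1 + n2 - 2
    # Pooled standard deviation: weighted average of the two sample variances
    s_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    std_err = s_pooled * math.sqrt(1 / n1 + 1 / n2)
    t_stat = (mean1 - mean2) / std_err
    return std_err, t_stat, df


# Female vs. Male exam scores, values copied from the descriptive statistics table.
std_err, t_stat, df = pooled_t(80.41, 10.057, 37, 79.93, 12.618, 43)

T_CRIT = 1.991  # assumed table value of t(0.975, 78)
mean_diff = 80.41 - 79.93
lower = mean_diff - T_CRIT * std_err
upper = mean_diff + T_CRIT * std_err
print("std err = %.3f, t = %.3f, df = %d" % (std_err, t_stat, df))
print("95%% interval: (%.2f, %.2f)" % (lower, upper))
```

The recomputed values agree with the pooled row of the table to within rounding of the reported means; the interval contains zero, which matches the failure to reject the null hypothesis.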
The 95% confidence interval for the difference in scores is (–4.66, 5.61).

c. The distribution of the scores for the two groups is:
[Figure: distribution of exam scores for female and male students]
The t-distribution assumes a continuous distribution from minus infinity to positive infinity; however, the exam scores are constrained to the range [0, 100]. In practical terms, the range is more like (70, 100]. Moreover, exam scores are discrete, taking on only integer values. However, the robustness of the t-test to departures from normality may still allow us to analyze the data with the t-test without coming to an erroneous conclusion.

20. a. The t-test results are:

t-Test Analysis (Difference)
Comparison          N    Mean      Std. Dev   Std. Err   t       df   p-value   lower 95%   upper 95%
React1 vs. React2   14   0.00350   0.020843   0.005570   0.628   13   0.541     -0.00853    0.01553
React1 vs. React3   14   0.01429   0.016198   0.004329   3.300   13   0.006     0.00493     0.02364
React2 vs. React3   14   0.01079   0.019589   0.005235   2.060   13   0.060     -0.00052    0.02210

As the meet continues, the reaction time drops with each round. The drop is not significant between round 1 and round 2, but it is significant when the first round is compared with the third round and almost significant when the second round is compared with the third.

b. The P-plots for each paired difference are:
[Figures: Normal P-plots for React1 vs. React2, React1 vs. React3, and React2 vs. React3]
There is no reason to reject the assumption of normality for the 1 to 2 plot and for the 1 to 3 plot; however, there are some problems with the 2 to 3 plot.
One value appears out of line with the others, making it appear as if the data were not normally distributed.

c. The Wilcoxon test results are:

Wilcoxon Sign Rank Analysis (Difference)
Comparison          N    N<0   N=0   N>0   Median    p-value   lower 95%   upper 95%
React1 vs. React2   14   5     1     8     0.00750   0.542     -0.00800    0.01700
React1 vs. React3   14   1     2     11    0.01100   0.003     0.00400     0.02450
React2 vs. React3   14   4     0     10    0.01000   0.078     -0.00050    0.02100

The results match what was observed using the t-test: 1) there is no significant difference between round 1 and round 2, 2) there is a significant difference between round 1 and round 3, and 3) there is no significant difference between round 2 and round 3 at the 5% level, but there is one at the 10% level.

d. The reaction times decrease as the sprinters advance in the competition. There is not a significant decrease in the reaction times between the first and second round, but comparing the third round reaction times to those in the first shows a significant decrease of 0.014 seconds.

21. a. The t-test results are:

t-Test Analysis (Difference)
Comparison          N    Mean     Std. Dev   Std. Err   t       df   p-value   lower 95%   upper 95%
Round1 vs. Round2   14   0.1043   0.12233    0.03269    3.190   13   0.007     0.0337      0.1749
Round2 vs. Round3   14   0.0007   0.05045    0.01348    0.053   13   0.959     -0.0284     0.0298

b. There is a significant decrease in the race time between the first and second round, with a p-value of 0.007. The 95% confidence interval for the difference is (0.0337, 0.1749). However, there is no significant difference between the times in round 2 and round 3. This suggests that the level of competition in the second and third rounds is high enough that the race times do not decrease as much.
The competition in the first round is not as severe, allowing the elite racers to conserve their strength for the later rounds.
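
The paired t statistics for problem 21 follow directly from the mean and standard deviation of the differences. The sketch below is an illustration in Python, assuming the tabulated summary values as inputs; the function name `paired_t` is hypothetical, and 2.160 is an assumed t-table value for 13 degrees of freedom.

```python
import math


def paired_t(mean_diff, sd_diff, n):
    """Paired t statistic from the summary of n paired differences.

    Returns (standard error, t statistic, degrees of freedom).
    """
    std_err = sd_diff / math.sqrt(n)
    return std_err, mean_diff / std_err, n - 1


T_CRIT = 2.160  # assumed table value of t(0.975, 13)

# (mean difference, std. dev. of differences, N) copied from the t-test table.
comparisons = {
    "Round1 vs. Round2": (0.1043, 0.12233, 14),
    "Round2 vs. Round3": (0.0007, 0.05045, 14),
}

results = {}
for name, (mean_diff, sd_diff, n) in comparisons.items():
    std_err, t_stat, df = paired_t(mean_diff, sd_diff, n)
    interval = (mean_diff - T_CRIT * std_err, mean_diff + T_CRIT * std_err)
    results[name] = (std_err, t_stat, df, interval)
    print("%s: std err = %.5f, t = %.3f, df = %d, 95%% interval = (%.4f, %.4f)"
          % ((name, std_err, t_stat, df) + interval))
```

The Round2 vs. Round3 statistic is sensitive to rounding because the reported mean difference has only one significant digit, so it may differ slightly from the tabulated 0.053.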