Tests of Significance

I. 1 Sample z-Test

Consider the following question: A coin is tossed 100 times, giving 58 heads. Does the coin seem to be “fair”? Test at the 1% level of significance.

(1) Every legitimate test of significance involves a box model.

Box (fair coin): one ticket marked 1 and one ticket marked 0; AV = .5, SD = .5
100 draws: $EV = (100)(.5) = 50$, $SE = \sqrt{100}\,(.5) = 5$

(2) To make a test of significance, the null hypothesis has to be formulated as a statement about the box model.

H₀: The observed number of heads (58) is not significantly greater than the expected number of heads (50); any difference is due to chance variation.

(3) Calculate the appropriate test statistic.

$$z = \frac{\text{Observed} - \text{Expected}}{SE} = \frac{58 - 50}{5} = \frac{8}{5} = 1.6$$

(4) Calculate the P-value from the normal table.

[Figure: normal curve centered at 50 (z = 0). The central area is 89%; the right tail beyond 58 (z = 1.6) is P ≈ 5.5%.]

Note: The P-value is the probability of getting a value more extreme than the observed value; you will be finding the area of 1 tail of the normal curve.

(5) Compare the P-value to the level of significance.

If the P-value is less than the given level of significance, then reject the null; this would suggest that the difference between the observed value and the expected value is due to some factor other than chance variation. If the P-value is not less than the given level of significance, then fail to reject the null; this would suggest that the difference is probably due to chance variation.

P ≈ 5.5% > 1%, so fail to reject the null: there is no reason to believe that the coin is not fair.

Note: The 1 Sample z-Test is used when comparing the results of an experiment (sample) with an established standard or ideal situation (population box model).

When comparing percents, use
$$z = \frac{\text{Observed}(\%) - \text{Expected}(\%)}{SE(\%)}$$

When comparing averages, use
$$z = \frac{\text{Observed}(\text{Ave}) - \text{Expected}(\text{Ave})}{SE(\text{Ave})}$$

II. 1 Sample t-Test

Consider the following question: A car company states that its 2002 Celantra gets an average of 32 mpg.
A consumer testing company randomly selects 8 Celantras with the following gas mileages (in mpg): 29, 33, 30, 30, 28, 31, 34, and 29. Do the data support the car company’s claim? Test at the 5% level of significance.

(1) Make a box model.

Box (company’s standard): AV = 32 mpg, $SD \approx \sqrt{8/7}\,(1.94) \approx 2.07$ mpg (estimated from the sample SD)
Sample (8 draws): AV = 30.5 mpg, SD = 1.94 mpg
$SE(\text{Ave}) = \dfrac{2.07}{\sqrt{8}} \approx .73$ mpg

(2) Formulate the null hypothesis.

H₀: The observed average of 30.5 mpg is not significantly less than the expected average of 32 mpg.

(3) Calculate the t-statistic.

$$t = \frac{\text{Observed}(\text{Ave}) - \text{Expected}(\text{Ave})}{SE(\text{Ave})} = \frac{30.5 - 32}{.73} \approx -2.05$$

(4) Calculate the P-value from the Student’s t-curve with df = 7.

[Figure: Student’s t-curve centered at 32 (t = 0). The area to the left of t = -1.89 is 5% and the area to the left of t = -2.36 is 2.5%; the observed average 30.5 (t = -2.05) falls between them.]

Thus, 2.5% < P < 5%.

(5) Since P-value < 5%, reject the null. The data do not seem to support the car company’s claim.

Note: The 1 sample t-test will be used instead of the 1 sample z-test whenever the sample size is small (N ≤ 26), the SD of the population is unknown, and the population is normally distributed.

III. 2 Sample z-Test

Consider the following question: A testing laboratory is testing the life of air conditioning compressors produced by two different companies. A random sample of 400 compressors was taken from company A, giving an average life of 110 months with SD of 60 months. Also, a random sample of 100 compressors was taken from company B, giving an average life of 90 months with SD of 40 months. Do the compressors of company A last significantly longer than those of company B? Test at the 1% level of significance.

(1) Make the box models.

Box A (all compressors, company A): AV = ?, SD ≈ 60
Sample from company A (400 draws): AV = 110, SD = 60, $SE(\text{Ave}) = \dfrac{60}{\sqrt{400}} = 3$

Box B (all compressors, company B): AV = ?, SD ≈ 40
Sample from company B (100 draws): AV = 90, SD = 40, $SE(\text{Ave}) = \dfrac{40}{\sqrt{100}} = 4$

SE for Difference of Averages $= \sqrt{3^2 + 4^2} = 5$

(2) Formulate the null hypothesis.
H₀: The observed difference of the averages (110 months – 90 months = 20 months) is not significantly greater than the expected difference of the averages (0 months).

(3) Calculate the z-statistic.

$$z = \frac{\text{Observed Difference} - \text{Expected Difference}}{SE\text{ for Difference}} = \frac{20 - 0}{5} = \frac{20}{5} = 4$$

(4) Calculate the P-value from the normal curve.

[Figure: normal curve centered at 0 (z = 0). The central area is 99.9937%; the right tail beyond 20 (z = 4) is P ≈ .00315%.]

(5) Since P-value < 1%, reject the null. Thus, the difference between the averages seems to be significant; company A’s compressors seem to last longer than those from company B.

IV. 2 Sample t-Test

Consider the following question: A high school counselor wishes to test the effectiveness of a SAT prep course. One class of 10 students takes the prep course and another class of 15 students does not receive any special instruction. The SAT is given to both classes at the end of 8 weeks. The first class scored an average of 1180 with SD of 120, and the second scored an average of 1000 with SD of 160. Is the difference in the average scores significant? Test at the 1% level of significance.

(1) Make the box models.

Box (population for SAT prep): AV = ?, $SD \approx \sqrt{10/9}\,(120) \approx 126.5$
SAT prep class (10 draws): AV = 1180, SD = 120, $SE(\text{Ave}) = \dfrac{126.5}{\sqrt{10}} \approx 40$

Box (population for regular class): AV = ?, $SD \approx \sqrt{15/14}\,(160) \approx 165.6$
Regular class (15 draws): AV = 1000, SD = 160, $SE(\text{Ave}) = \dfrac{165.6}{\sqrt{15}} \approx 42.76$

SE for Difference of Averages $= \sqrt{40^2 + (42.76)^2} \approx 58.6$

(2) Formulate the null hypothesis.

H₀: The observed difference of the averages (1180 – 1000 = 180) is not significantly greater than the expected difference of the averages (0).

(3) Calculate the t-statistic.

$$t = \frac{\text{Observed}(\text{Diff}) - \text{Expected}(\text{Diff})}{SE(\text{Diff})} = \frac{180 - 0}{58.6} \approx 3.07$$

(4) Calculate the P-value from the Student’s t-curve with combined df = df (1st sample) + df (2nd sample) = 9 + 14 = 23.

[Figure: Student’s t-curve centered at 0 (t = 0). The area to the right of t = 2.81 is 0.5%; the observed difference 180 (t = 3.07) lies beyond it, so P < 0.5%.]

(5) Since P-value < 0.5% < 1%, reject the null. Thus, the difference between the averages seems to be significant; the SAT prep course seems to have improved the scores.

V. Matched Difference (1 Sample z-Test or t-Test)

Consider the following question: A systems analyst is testing the possibility of using a new computer system. In order to make a decision, a sample of seven jobs was selected and the processing time in seconds was recorded on the old and the new systems with the following results:

Job    1    2    3    4    5    6    7
Old    8    4   10    9    8    7   12
New    6    3    7    8    5    8    9

Is there sufficient evidence to conclude that the old system uses more processing time? Test at the 0.5% level of significance.

We could use a 2 sample t-test to compare the average processing time on the old system with the average processing time on the new system. However, there is a more effective technique that can be used. Since the same seven jobs are used for both samples of processing times, we can use the natural pairing or matching between these times and do a 1 sample t-test on the seven differences. This technique eliminates the chance variation between two different jobs that would occur if we used the 2 sample test.

(1) Make the box model.

The seven differences in processing times are: 8 – 6 = 2, 4 – 3 = 1, 10 – 7 = 3, 9 – 8 = 1, 8 – 5 = 3, 7 – 8 = –1, and 12 – 9 = 3.

Box (matched differences in processing times for all jobs): AV = 0 (*), $SD \approx \sqrt{7/6}\,(1.385) \approx 1.496$ sec
Sample (7 draws, differences in processing times for the 7 jobs): AV = 1.714 sec, SD = 1.385 sec
$SE(\text{Ave}) = \dfrac{1.496}{\sqrt{7}} \approx 0.565$ sec

Note (*): In this problem, we assume that the box model is “theoretical”, consisting of a large number of tickets that are normally distributed and have an average of 0.

(2) Formulate the null hypothesis.

H₀: The observed average of the differences (1.714 sec) is not significantly greater than the expected average of the differences (0 sec).

(3) Calculate the t-statistic.

$$t = \frac{\text{Observed}(\text{Ave of Differences}) - \text{Expected}(\text{Ave of Differences})}{SE(\text{Ave of Differences})} = \frac{1.714\ \text{sec} - 0\ \text{sec}}{0.565\ \text{sec}} \approx 3.03$$

(4) Calculate the P-value from the Student’s t-curve with df = 6.
[Figure: Student’s t-curve centered at 0 sec (t = 0). The area to the right of t = 2.45 is 2.5% and the area to the right of t = 3.14 is 1%; the observed average 1.714 sec (t = 3.03) falls between them.]

Thus, 1% < P < 2.5%.

(5) Since P > 0.5%, fail to reject the null. Thus, the difference in processing times between the old system and the new system does not seem to be significant.

VI. χ²-Test (Goodness of Fit)

Consider the following question: A die is rolled 60 times with the following results:

Number on Die         1    2    3    4    5    6
Observed Frequency    4    6   17   16    8    9

Are these results significantly different from what we would expect from a fair die? Test at the 1% level of significance.

We could use a 1 sample z-test to test each of the 6 numbers individually. In each of the 6 cases, the box model would be the same:

Box: one ticket marked 1 and five tickets marked 0; AV = 1/6, $SD = \sqrt{\tfrac{1}{6} \cdot \tfrac{5}{6}} \approx .373$
60 draws: $EV = 60 \cdot \tfrac{1}{6} = 10$, $SE = .373\,\sqrt{60} \approx 2.9$

For instance, if we test whether the results for “6” are significantly different from what we would expect, we would get $z = \dfrac{9 - 10}{2.9} \approx -.34$ with a corresponding P-value of 37% and no significant difference. However, if we test the results for “3”, we would get $z = \dfrac{17 - 10}{2.9} \approx 2.41$ with a corresponding P-value of 0.8% and a significant difference. The χ²-test allows us to test all 6 numbers at once.

(1) Make a box model for each of the 6 numbers. We need to calculate the EV, not the SE.

Box: one ticket marked 1 and five tickets marked 0; AV = 1/6
60 draws: $EV = 60 \cdot \tfrac{1}{6} = 10$

(2) Formulate the null hypothesis.

H₀: There is no significant difference between the observed frequencies and the expected frequencies for a fair die.

(3) Calculate the χ²-statistic.

Number on Die         1    2    3    4    5    6
Observed Frequency    4    6   17   16    8    9
Expected Frequency   10   10   10   10   10   10

$$\chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(4-10)^2}{10} + \frac{(6-10)^2}{10} + \frac{(17-10)^2}{10} + \frac{(16-10)^2}{10} + \frac{(8-10)^2}{10} + \frac{(9-10)^2}{10} = 14.2$$

(4) Calculate the P-value from the Chi-Square table with df = 6 – 1 = 5.

[Figure: χ² curve with df = 5. The area to the right of 11.07 is 5% and the area to the right of 15.09 is 1%; the observed χ² = 14.2 falls between them.]

Thus, 1% < P < 5%.

(5) Since P > 1%, fail to reject the null. There is no reason to believe that the die is not fair.

VII. χ²-Test (Independence Between 2 Attributes)

Consider the following question: Wake Forest University recorded the statistics shown below in the table for its 1992 – 1993 annual giving campaign:

                              Class
                      1970    1980    1990
Contributed            192     182     325
Did Not Contribute     174     260     586

Are the giving patterns independent of the class year? Test at the 5% level of significance.

(1) Make the box models.

First, we must get the column totals and the row totals.

                              Class
                      1970    1980    1990    Total
Contributed            192     182     325      699
Did Not Contribute     174     260     586     1020
Totals                 366     442     911     1719

Thus, the ratio of “contributed” in the population to “did not contribute” in the population is 699 to 1020. If the giving patterns are independent of the class year, then the ratio of “contributed” in each individual class to “did not contribute” in that same class should be the same as for the population. In order to calculate the expected number of “contributed” in the class 1970, use the following box model:

Box: 699 tickets marked 1 and 1020 tickets marked 0; AV = 699/1719
366 draws: $EV = \dfrac{699}{1719} \cdot 366 \approx 149$

Since the expected number of “contributed” in the class 1970 is 149, the expected number of “did not contribute” in 1970 is 366 – 149 = 217. In order to calculate the expected number of “contributed” in the classes 1980 and 1990, use the same box model except with 442 draws and 911 draws:

1980: $EV = \dfrac{699}{1719} \cdot 442 \approx 180$
1990: $EV = \dfrac{699}{1719} \cdot 911 \approx 370$

Also, the expected number of “did not contribute” in 1980 is 442 – 180 = 262 and in 1990 is 911 – 370 = 541. Thus, the completed contingency table would be:

                                        Class
                        1970          1980          1990
                      Obs   Exp     Obs   Exp     Obs   Exp    Total
Contributed           192   149     182   180     325   370      699
Did Not Contribute    174   217     260   262     586   541     1020
Totals                366   366     442   442     911   911     1719

(2) Formulate the null hypothesis.

H₀: The giving patterns are independent of the class year.

(3) Calculate the χ²-statistic.

$$\chi^2 = \frac{(192-149)^2}{149} + \frac{(182-180)^2}{180} + \frac{(325-370)^2}{370} + \frac{(174-217)^2}{217} + \frac{(260-262)^2}{262} + \frac{(586-541)^2}{541} \approx 30.18$$

(4) Calculate the P-value.
If the contingency table has C columns and R rows, then df = (C – 1)(R – 1). Thus, in this example, df = (3 – 1)(2 – 1) = 2(1) = 2.

[Figure: χ² curve with df = 2. The area to the right of 9.21 is 1%; the observed χ² = 30.18 lies beyond it.]

Thus, P < 1%.

(5) Since P < 5%, reject the null. It seems as though the giving patterns are not independent of the class year.
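
The two χ² calculations (the fair-die test and the giving-pattern test) can be checked with a short script. This is a minimal sketch, not part of the original handout: the function names `chi_square_statistic` and `expected_counts` are illustrative, the expected counts are rounded to whole numbers only to mirror the hand calculation, and the 1% critical values are the table values quoted in the text (9.21 for df = 2, 15.09 for df = 5).

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (Obs - Exp)^2 / Exp over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(table, round_to_whole=True):
    """Expected counts for a contingency table (a list of rows),
    assuming the row attribute is independent of the column attribute.

    round_to_whole=True mirrors the hand calculation in the text,
    which rounds each expected count to a whole number.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    expected = []
    for r in row_totals:
        row = [r * c / grand_total for c in col_totals]
        if round_to_whole:
            row = [round(x) for x in row]
        expected.append(row)
    return expected


# 1% critical values quoted in the text, keyed by degrees of freedom.
CRITICAL_1_PERCENT = {2: 9.21, 5: 15.09}

# Goodness of fit: 60 rolls of a die, expected 10 per face.
die_observed = [4, 6, 17, 16, 8, 9]
die_expected = [10] * 6
die_chi2 = chi_square_statistic(die_observed, die_expected)
die_df = len(die_observed) - 1

# Independence: rows are Contributed / Did Not Contribute,
# columns are the classes 1970, 1980, 1990.
giving = [[192, 182, 325],
          [174, 260, 586]]
giving_expected = expected_counts(giving)
giving_chi2 = chi_square_statistic(
    [o for row in giving for o in row],
    [e for row in giving_expected for e in row],
)
giving_df = (len(giving[0]) - 1) * (len(giving) - 1)

for name, stat, df in [("die", die_chi2, die_df),
                       ("giving", giving_chi2, giving_df)]:
    beyond = stat > CRITICAL_1_PERCENT[df]
    print(f"{name}: chi2 = {stat:.2f}, df = {df}, "
          f"beyond 1% critical value: {beyond}")
```

With rounded expected counts the script gives about 14.2 for the die and about 30.18 for the giving data, matching the hand calculations. Passing `round_to_whole=False` keeps the unrounded expected counts, which shifts the second statistic slightly but still leaves it far beyond 9.21.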