252chisq 2/29/08 (Open this document in 'Outline' view!)

E. CHI-SQUARED AND RELATED TESTS.

These tests are generalizations of the one-sample and two-sample tests of proportions. A test of Goodness of Fit is needed when a single sample is to be divided into more than two categories. A Test of Homogeneity is needed when one wants to compare more than one sample. A Test of Independence is used to see whether two variables or categorizations are related; it is formally identical to a test of homogeneity.

1. Tests of Homogeneity and Independence

Two possible null hypotheses apply here. The observed data are indicated by O, the expected data by E.

H0: Cities are homogeneous by income groups.

  O               City 1   City 2   City 3   City 4   Total    pr
  Upper Income        10       15       15       10      50    50/150 = 1/3
  Middle Income        5       10       15       10      40    40/150 = 4/15
  Lower Income        15       15       10       20      60    60/150 = 2/5
  Sample Size         30       40       40       40     150     1
  pc                 1/5     4/15     4/15     4/15      1

H0: Sick days are independent of age.

  O             0 Days   1 Day    2 Days   3 Days   Total    pr
  Age 15-25         10       15       15       10      50    1/3
  Age 26-49          5       10       15       10      40    4/15
  Age 50 up         15       15       10       20      60    2/5
  Total             30       40       40       40     150     1
  pc               1/5     4/15     4/15     4/15      1

The numbers are obviously identical in these two cases, and in each case the expected values are computed the same way. There are r = 3 rows, c = 4 columns and rc = 12 cells, with n = 150. Each cell gets E = pc pr n = pr (column total). For example, for the upper left corner the expected value is (1/3)(1/5)(150) = (1/3)(30) = 10.

  E          Column 1  Column 2  Column 3  Column 4   Total    pr
  Row 1           10    13 1/3    13 1/3    13 1/3       50    1/3
  Row 2            8    10 2/3    10 2/3    10 2/3       40    4/15
  Row 3           12        16        16        16       60    2/5
  Total           30        40        40        40      150     1
  pc             1/5      4/15      4/15      4/15       1

The test statistic is χ² = Σ(O - E)²/E = Σ(O²/E) - n. The first of these two formulas is shown below.
For an explanation of the equivalence of these two formulas, the reason why the degrees of freedom are as given below, and the relation of the chi-squared test to a z test of proportions, see 252chisqnote. The formula for the chi-squared statistic is χ² = Σ(O - E)²/E or χ² = Σ(O²/E) - n.

       E       O      O - E    (O - E)²   (O - E)²/E
   10.0000    10     0.0000     0.0000     0.00000
    8.0000     5    -3.0000     9.0000     1.12500
   12.0000    15     3.0000     9.0000     0.75000
   13.3333    15     1.6667     2.7778     0.20833
   10.6667    10    -0.6667     0.4445     0.04167
   16.0000    15    -1.0000     1.0000     0.06250
   13.3333    15     1.6667     2.7779     0.20834
   10.6667    15     4.3333    18.7775     1.76038
   16.0000    10    -6.0000    36.0000     2.25000
   13.3333    10    -3.3333    11.1109     0.83332
   10.6667    10    -0.6667     0.4445     0.04167
   16.0000    20     4.0000    16.0000     1.00000
  150.0000   150     0.0000                8.28121

The degrees of freedom for this application are (r - 1)(c - 1) = (3 - 1)(4 - 1) = (2)(3) = 6. The most common test is a one-tailed test, on the grounds that the larger the discrepancy between O and E, the larger Σ(O - E)²/E will be. If our significance level is 5%, compare the sum above with χ².05(6) = 12.5916. Since our value of the sum, 8.28121, is less than the table chi-squared, do not reject the null hypothesis.

Note: Rule of thumb for E. All values of E should be above 5, and we generally combine cells to make this so. However, a value of E as low as 2 is acceptable if (i) our computed χ² turns out to be less than the table χ² anyway, or (ii) that particular value of E makes a very small contribution to Σ(O - E)²/E relative to the value of the total.

Note: Marascuilo Procedure. The Marascuilo procedure says that, for 2 by c tests, if (i) equality is rejected and (ii) |pa - pb| > √χ²(c-1) · sp, where a and b represent two of the groups, the chi-squared has c - 1 degrees of freedom and the standard deviation is sp = √(pa qa/na + pb qb/nb), you can say that you have a significant difference between pa and pb.
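As a check on the arithmetic in the homogeneity test above, the expected values and the chi-squared statistic can be computed in a few lines of Python (a sketch using only the standard library; the table is the 3-by-4 city-by-income table from the text):

```python
# Observed counts from the city-by-income table (3 rows, 4 columns).
observed = [
    [10, 15, 15, 10],
    [ 5, 10, 15, 10],
    [15, 15, 10, 20],
]

row_totals = [sum(row) for row in observed]        # 50, 40, 60
col_totals = [sum(col) for col in zip(*observed)]  # 30, 40, 40, 40
n = sum(row_totals)                                # 150

# Each cell gets E = (row total)(column total)/n, i.e. E = pr * pc * n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)

# chi_sq is about 8.281 on 6 degrees of freedom; since 8.281 < 12.5916,
# we do not reject the null hypothesis of homogeneity.
print(chi_sq, df)
```

The same loop works for any r-by-c table, and χ² = Σ(O²/E) - n gives the identical value.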
The Marascuilo condition is equivalent to using a confidence interval of

  pa - pb = (pa - pb) ± √χ²(c-1) √(pa qa/na + pb qb/nb)

Example: Pelosi and Sandifer give the data below for satisfaction with local phone service classified by type of company providing the service. 1) Is the proportion of people who rate the phone service as excellent independent of the type of company? 2) If it is not, test for a difference in the proportion who rate their service as excellent against the best-rated provider. Remember that this is a test of equality of proportions. Service 1 is a long distance company, Service 2 is a local phone company, Service 3 is a power company, Service 4 is CATV (cable) and Service 5 is cellular. p1 is thus the proportion of long distance company customers that rate their service as excellent.

H0: p1 = p2 = p3 = p4 = p5    H1: Not all ps equal.

  p1 = .1592    p2 = .2520    p3 = .2127    p4 = .3328    p5 = .2571
  n1 = 1658     n2 = 1762     n3 = 616      n4 = 646      n5 = 770

Solution: Set up the O table. To get the number that rate service as excellent for long distance, note that p1 n1 = .1592(1658) = 263.95. But this must be a whole number, so round it to 264. The number that do not rate it as excellent is 1658 - 264 = 1394. This gives us our first column. pq/n is also computed for use later.

  O            Long Dist   Local Ph    Power      CATV     Cellular    Total      pr
  Excellent         264        444       131        215        198      1252    .2296
  Not              1394       1318       485        431        572      4200    .7704
  Sum              1658       1762       616        646        770      5452   1.0000
  Proportion
   Excellent      .1592      .2520     .2127      .3328      .2571
  pq/n         .0000807   .0001070  .0002718   .0003437   .0002481

Note that in addition to the overall proportions of excellent and not excellent service (.2296 and .7704), the 'proportion excellent' has been computed for each type of service, as well as the variance pq/n used in the confidence interval formula. If we apply the row proportions to the column sums we get the following expected values.

  E            Long Dist   Local Ph    Power      CATV     Cellular    Total      pr
  Excellent       380.68     404.56    141.43     148.32     176.79     1252    .2296
  Not            1277.32    1357.44    474.57     497.68     593.21     4200    .7704
  Sum               1658       1762       616        646        770     5452   1.0000

The chi-squared test follows.

  Row       E          O        O - E      (O - E)²   (O - E)²/E
   1       380.68      264    -116.677    13613.5      35.7612
   2      1277.32     1394     116.677    13613.5      10.6578
   3       404.56      444      39.445     1555.9       3.8459
   4      1357.44     1318     -39.445     1555.9       1.1462
   5       141.43      131     -10.434      108.9       0.7697
   6       474.57      485      10.434      108.9       0.2294
   7       148.32      215      66.678     4446.0      29.9755
   8       497.68      431     -66.678     4446.0       8.9335
   9       176.79      198      21.208      449.8       2.5441
  10       593.21      572     -21.208      449.8       0.7582
         5452.00     5452       0.000                  94.622

The degrees of freedom are (r - 1)(c - 1) = (2 - 1)(5 - 1) = 4 and χ².05(4) = 9.488, so we reject the null hypothesis and say that there is a difference among the proportions that rate their service as excellent. Since the highest proportion satisfied was with CATV, we compare each proportion with the proportion calculated for CATV, using the confidence interval formula above.

  Long distance:  .1592 - .3328 ± √9.488 √(.0000807 + .0003437) = -.1736 ± .0635
  Local Phone:    .2520 - .3328 ± √9.488 √(.0001070 + .0003437) = -.0808 ± .0654
  Power:          .2127 - .3328 ± √9.488 √(.0002718 + .0003437) = -.1201 ± .0764
  Cellular:       .2571 - .3328 ± √9.488 √(.0002481 + .0003437) = -.0757 ± .0749

Notice that the absolute size of the error term is always smaller than the absolute size of the difference in proportions, so we can say that all of these differences are significant. Though I have not checked it, I doubt that we would get such strong results if we compared all the other proportions with the proportion saying that cellular service is excellent.

2. Tests of Goodness of Fit

a. Uniform Distribution

Let us pool the data above, that is, treat it all as if it were one sample, and ask if it is uniformly distributed.

H0: Uniform distribution

    E      O     O - E   (O - E)²   (O - E)²/E
    50     50       0        0          0
    50     40     -10      100          2
    50     60      10      100          2
   150    150       0                   4

Since there are 3 numbers here, there are 2 degrees of freedom. Since 4 is less than χ².05(2) = 5.9915, we cannot reject the null hypothesis. An easier way to do this is to compute χ² = Σ(O²/E) - n.
Remember χ² = Σ(O²/E) - n:

    E      O     O²/E
    50     50      50
    50     40      32
    50     60      72
   150    150     154       154 - 150 = 4

For a combined chi-squared test of both uniformity and homogeneity see 252chisqx1.

b. Poisson Distribution

Example: I believe that there is almost a daily accident on my corner. To make this into a testable hypothesis, let us say that I believe that the distribution is Poisson with a parameter of 0.8 and that I observe the numbers of accidents shown below over 200 days. For example, there are 100 days with no accidents, 60 days with 1, etc.

H0: Poisson(0.8)    H1: Not Poisson(0.8)

To get f, I look up f(x) on the Poisson table and multiply by n = 200, using the formula E = fn. Unfortunately, I cannot work with E as it appears here: I must have each E at least 5. To fix the problem, I add the smallest cells together to increase E to 5 or more.

    x      O       f       E = fn
    0     100    .4493      89.86
    1      60    .3595      71.90
    2      30    .1438      28.76
    3       6    .0383       7.66
    4       0    .0077       1.54  (<5)
    5       4    .0012       0.24  (<5)
    6       0    .0002       0.04  (<5)
   7+       0    .0000       0.00  (<5)
          200  1.0000      200.00

After combining the cells for x = 3 and above:

    x      O        E       O²/E
    0     100     89.86    111.28
    1      60     71.90     50.07
    2      30     28.76     31.29
   3+      10      9.48     10.56
          200    200.00    203.20

χ² = 203.20 - 200 = 3.20. Since I did not estimate the mean of 0.8 from the data, I have 3 degrees of freedom. χ².05(3) = 7.815, so I do not reject the null hypothesis.

But what if my hypotheses are simply H0: Poisson against H1: Not Poisson? Then I would have to estimate the mean from the data. Looking at the x and O columns I calculate

  x-bar = [0(100) + 1(60) + 2(30) + 3(6) + 5(4)]/200 = 158/200 = 0.79.

I would still use the Poisson distribution with a parameter of 0.8 unless I had a computer handy to compute it with a parameter of 0.79, but my degrees of freedom are now 3 - 1 = 2, because I used the data to estimate a parameter.
c. Normal Distribution

A common way to set up a χ² test of normality is to group the data starting at the mean and ending each group one-half of a standard deviation from the mean, proceeding outward from the mean until four or five groups have been sectioned off in each direction. For example, if our null hypothesis is that x ~ N(100, 10), we can start at 100 and let the width of each group be one half of 10, or 5. The groups would be 100-105, 105-110, etc. going up, and 95-100, 90-95, etc. going down. Then for the highest number in each interval, compute z. For example, for the interval 90-95 compute z = (95 - 100)/10 = -0.5. Then use the normal distribution to compute F(z); for example, F(-0.5) = P(z ≤ -0.5) = .5 - .1915 = .3085. Then, to find the frequency f of an interval, subtract the F(z) for the previous interval from the F(z) for this one. An example of calculating E this way is shown below.

H0: x ~ N(100, 10), with n = 1000.

  x interval     z       F(z)      f      E = fn
  below 80     -2.0     .0228    .0228     22.8
  80-85        -1.5     .0668    .0440     44.0
  85-90        -1.0     .1587    .0919     91.9
  90-95        -0.5     .3085    .1498    149.8
  95-100        0.0     .5000    .1915    191.5
  100-105       0.5     .6915    .1915    191.5
  105-110       1.0     .8413    .1498    149.8
  110-115       1.5     .9332    .0919     91.9
  115-120       2.0     .9772    .0440     44.0
  120 up               1.0000    .0228     22.8

For smaller values of n we may find that some numbers in E are less than 5, so that we have to combine some intervals. In the above example the degrees of freedom for χ² are 10 - 1 = 9 if the mean and variance are known. If they both had to be computed from the data, the degrees of freedom would be reduced by 2, to 7.

3. Kolmogorov-Smirnov Test

a. Kolmogorov-Smirnov One-Sample Test

This is a more powerful test of goodness of fit than the chi-squared test. Unfortunately, it can only be used when the distribution in the null hypothesis is totally specified. For example, if we wanted to do the test for Poisson(0.8) above, we would look up the cumulative distribution Fe for Poisson(0.8) and proceed as below.
Note that this would not work if our hypothesis were that the distribution was Poisson without the mean specified.

    x      O      O/n     Fo        Fe       D = |Fo - Fe|
    0     100     .50     .50     .4493        .0507
    1      60     .30     .80     .8088        .0088
    2      30     .15     .95     .9526        .0026
    3       6     .03     .98     .9909        .0109
    4       0     .00     .98     .9986        .0186
    5       4     .02    1.00     .9998        .0002
    6       0     .00    1.00    1.0000        .0000
   7+       0     .00    1.00    1.0000        .0000
          200    1.00

The maximum difference is Max D = .0507, which must be checked against the Kolmogorov-Smirnov table for n = 200. According to the table, for α = .05, the critical value is 1.36/√200 = .0962. Since Max D is less than .0962, accept the null hypothesis.

b. Lilliefors Test

Because the Kolmogorov-Smirnov test is so limited in application, it proved advantageous to develop a special version of that test for a normal distribution whose mean and variance are unknown. Once a sample mean and variance are found, this test is identical to the K-S test except for the use of a special table.

Problem E9: Is the following data normal? 420, 440, 445, 450, 460, 475, 480, 500, 520, 530

Solution: Assume α = .05. H0: Normal. The only practical method is the Lilliefors method. (Question: Why is chi-squared impractical and Kolmogorov-Smirnov impossible?) The numbers must be in order before we begin computing cumulative probabilities! Checking the data we find that x-bar = 472 and s = 35.92. We compute z = (x - x-bar)/s. (This is really a t.) Fe is the cumulative distribution, gotten from the normal table by adding or subtracting 0.5. Fo comes from the fact that there are 10 numbers, so that each number is one-tenth of the distribution. For α = .05 and n = 10 the critical value from the Lilliefors table is 0.2616. Since the largest deviation here is .1293, we do not reject H0. Remember that the Lilliefors method is a specialized version of the K-S method, used only in situations where you are testing for a normal distribution and using a sample mean and standard deviation estimated from the data.
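The Lilliefors computation for Problem E9 can be sketched in Python (a sketch using only the standard library; it follows the text's convention of comparing Fe with Fo = i/n for the i-th ordered observation, so tiny differences from the hand table come from the table's rounding of z to two places):

```python
import math

data = [420, 440, 445, 450, 460, 475, 480, 500, 520, 530]
n = len(data)
mean = sum(data) / n                                          # 472
s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))   # about 35.92

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Max D = max |Fo - Fe|, with Fo = i/n for the i-th ordered observation.
max_d = max(abs((i + 1) / n - phi((x - mean) / s))
            for i, x in enumerate(sorted(data)))

# Max D is about 0.13, below the Lilliefors critical value of about 0.26
# for n = 10 at the 5% level, so we do not reject normality.
print(round(mean), round(s, 2), round(max_d, 2))
```

Full Lilliefors software also compares Fe with (i - 1)/n at each observation; that refinement does not change the conclusion here.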
The K-S method can only be used in situations where the null hypothesis, including its parameters, is specified in advance. A chi-squared test of goodness of fit is usually considered a large-sample test, but it can be adjusted for estimation of parameters.

     x       z       Fo       Fe        D
    420    -1.45   0.1000   .0735    .0265
    440    -0.89   0.2000   .1867    .0133
    445    -0.75   0.3000   .2266    .0734
    450    -0.61   0.4000   .2709    .1291
    460    -0.33   0.5000   .3707    .1293
    475     0.08   0.6000   .5319    .0681
    480     0.22   0.7000   .5871    .1129
    500     0.78   0.8000   .7823    .0177
    520     1.34   0.9000   .9099    .0099
    530     1.61   1.0000   .9463    .0537

©2002 Roger Even Bove