Analysis of Count Data
Chapter 26

Goodness of fit
Formulas and models for two-way tables
- tests for independence
- tests of homogeneity

Example 1: Car accidents and day of the week
A study of 667 drivers who were using a cell phone when they were involved in a collision on a weekday examined the relationship between these accidents and the day of the week. Are the accidents equally likely to occur on any day of the working week?

Example 2: M&M Colors
Mars, Inc. periodically changes the M&M (milk chocolate) color proportions. Last year the proportions were: yellow 20%; red 20%; orange, blue, and green 10% each; brown 30%. In a recent bag of 106 M&M's I had the following numbers of each color:

Color    Count   Percent
Yellow      29     27.4%
Red         23     21.7%
Orange      12     11.3%
Blue        14     13.2%
Green        8      7.5%
Brown       20     18.9%

Is this evidence that Mars, Inc. has changed the color distribution of M&M's?

Example 3: Are successful people more likely to be born under some astrological signs than others?
256 executives of Fortune 400 companies have the birthday signs shown below. There is some variation in the number of births per sign, and there are more Pisces. Can we claim that successful people are more likely to be born under some signs than others?

Births   Sign
    23   Aries
    20   Taurus
    18   Gemini
    23   Cancer
    20   Leo
    19   Virgo
    18   Libra
    21   Scorpio
    19   Sagittarius
    22   Capricorn
    24   Aquarius
    29   Pisces

To answer these questions we use the chi-square goodness of fit test.
Data for n observations on a categorical variable with k possible outcomes are summarized as observed counts $n_1, n_2, \ldots, n_k$ in k cells.
There are 2 hypotheses: the null hypothesis H0 and the alternative hypothesis HA.
H0 specifies probabilities $p_1, p_2, \ldots, p_k$ for the possible outcomes.
HA states that the probabilities are different from those in H0.

The Chi-Square Test Statistic
The chi-square test statistic is:

\[ \chi^2 = \sum_{\text{all cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} \]

where:
Obs = observed frequency in a particular cell
Exp = expected frequency in a particular cell if H0 is true
The expected frequency in cell i is $n p_i$.

The Chi-Square Test Statistic (cont.)
The χ² test statistic approximately follows a chi-square distribution with k − 1 degrees of freedom, where k is the number of categories.
If the χ² test statistic is large, this is evidence against the null hypothesis.

Decision rule: if $\chi^2 > \chi^2_{.05}$, reject H0; otherwise, do not reject H0.

[Figure: χ² axis starting at 0; "Do not reject H0" region below $\chi^2_{.05}$, "Reject H0" region above it, upper-tail area .05.]

Car accidents and day of the week
H0 specifies that all days are equally likely for car accidents: each $p_i = 1/5$.
The expected count for each of the five days is $n p_i = 667(1/5) = 133.4$.

\[ \chi^2 = \sum_{\text{day}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \sum_{\text{day}} \frac{(\text{count}_{\text{day}} - 133.4)^2}{133.4} = 8.49 \]

Under H0, this statistic approximately follows the chi-square distribution with 5 − 1 = 4 degrees of freedom.
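The reported statistic can also be turned into a p-value without a printed table. The sketch below is an illustration rather than part of the original slides: `chi2_sf` is a hypothetical helper name, and it relies on the standard closed-form expressions for the chi-square upper-tail probability with integer degrees of freedom, using only Python's `math` module.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Hypothetical helper for illustration; df must be a positive integer.
    """
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term = 1.0
        total = 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^j / (1*3*...*(2j+1))
    term = 1.0
    total = 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total


# Car accident example: statistic 8.49 with 5 - 1 = 4 degrees of freedom.
p_value = chi2_sf(8.49, 4)
print(f"p-value = {p_value:.3f}")  # about 0.075
```

A p-value of roughly 0.075 is above 0.05, which agrees with comparing 8.49 against the 0.05 critical value for 4 degrees of freedom.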
Chi-square critical values (column headings are the upper-tail probability p)

df   0.25    0.2   0.15    0.1   0.05  0.025   0.02   0.01  0.005 0.0025  0.001 0.0005
 1   1.32   1.64   2.07   2.71   3.84   5.02   5.41   6.63   7.88   9.14  10.83  12.12
 2   2.77   3.22   3.79   4.61   5.99   7.38   7.82   9.21  10.60  11.98  13.82  15.20
 3   4.11   4.64   5.32   6.25   7.81   9.35   9.84  11.34  12.84  14.32  16.27  17.73
 4   5.39   5.99   6.74   7.78   9.49  11.14  11.67  13.28  14.86  16.42  18.47  20.00
 5   6.63   7.29   8.12   9.24  11.07  12.83  13.39  15.09  16.75  18.39  20.51  22.11
 6   7.84   8.56   9.45  10.64  12.59  14.45  15.03  16.81  18.55  20.25  22.46  24.10
 7   9.04   9.80  10.75  12.02  14.07  16.01  16.62  18.48  20.28  22.04  24.32  26.02
 8  10.22  11.03  12.03  13.36  15.51  17.53  18.17  20.09  21.95  23.77  26.12  27.87
 9  11.39  12.24  13.29  14.68  16.92  19.02  19.68  21.67  23.59  25.46  27.88  29.67
10  12.55  13.44  14.53  15.99  18.31  20.48  21.16  23.21  25.19  27.11  29.59  31.42
11  13.70  14.63  15.77  17.28  19.68  21.92  22.62  24.72  26.76  28.73  31.26  33.14
12  14.85  15.81  16.99  18.55  21.03  23.34  24.05  26.22  28.30  30.32  32.91  34.82
13  15.98  16.98  18.20  19.81  22.36  24.74  25.47  27.69  29.82  31.88  34.53  36.48

Since the value of the test statistic, 8.49, is less than the table value of 9.49 (df = 4, p = 0.05), we do not reject H0.
There is no significant evidence of different car accident rates for different weekdays when the driver was using a cell phone.

Using software
The chi-square function in Excel does not compute expected counts automatically but instead lets you provide them. This makes it easy to test for goodness of fit. You then get the test's p-value, but no details of the χ² calculations.

=CHITEST(array of actual values, array of expected values)
with values arranged in two similar r * c tables --> returns the p-value of the chi-square test

Note: Many software packages do not provide a direct way to compute the chi-square goodness of fit test. But you can find a way around:
Make a two-way table where the first column contains k cells with the observed counts.
Make a second column with counts that correspond exactly to the probabilities specified by H0, using a very large number of observations.
Then analyze the two-way table with the chi-square function.

Example 2: M&M Colors
H0: $p_{yellow} = .20$, $p_{red} = .20$, $p_{orange} = .10$, $p_{blue} = .10$, $p_{green} = .10$, $p_{brown} = .30$

        Yellow    Red   Orange   Blue   Green   Brown   Total
Obs.        29     23       12     14       8      20     106
Exp.      21.2   21.2     10.6   10.6    10.6    31.8     106

Expected yellow = 106 × .20 = 21.2, and similarly for the other expected counts.

\[ \chi^2 = \sum_{\text{all cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(29 - 21.2)^2}{21.2} + \frac{(23 - 21.2)^2}{21.2} + \frac{(12 - 10.6)^2}{10.6} + \frac{(14 - 10.6)^2}{10.6} + \frac{(8 - 10.6)^2}{10.6} + \frac{(20 - 31.8)^2}{31.8} \]

\[ \chi^2 = 2.87 + 0.153 + 0.185 + 1.091 + 0.638 + 4.379 = 9.316 \]

Example 2: M&M Colors (cont.)
The test statistic is $\chi^2 = 9.316$, with 6 − 1 = 5 degrees of freedom; $\chi^2_{0.05}$ with 5 d.f. = 11.070.

Decision rule: if $\chi^2 > \chi^2_{.05}$, reject H0; otherwise, do not reject H0.

[Figure: χ² axis starting at 0; "Do not reject H0" region below $\chi^2_{0.05} = 11.070$, "Reject H0" region above it, upper-tail area 0.05.]

Here, $\chi^2 = 9.316 < \chi^2_{.05} = 11.070$, so we do not reject H0: there is not sufficient evidence to conclude that Mars has changed the color proportions.

Models for two-way tables
The chi-square test is an overall technique for comparing any number of population proportions, testing for evidence of a relationship between two categorical variables. There are 2 types of tests:
1. Test for independence: Take one SRS and classify the individuals in the sample according to two categorical variables (attribute or condition). This is an observational study, historical design.
2. Compare several populations (tests for homogeneity): Randomly select several SRSs, each from a different population (or from a population subjected to different treatments). This is an experimental study.
Both models use the χ² test to test the hypothesis of no relationship.

Testing for independence
We now have a single sample from a single population. For each individual in this SRS of size n we measure two categorical variables. The results are then summarized in a two-way table.
The null hypothesis is that the row and column variables are independent. The alternative hypothesis is that the row and column variables are dependent.
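Before turning to the two-way formulas, the goodness-of-fit arithmetic from the M&M example above can be re-checked with a short script. This is a minimal sketch, not part of the original slides: the variable names are illustrative, the counts and null proportions are the ones given in the example, and the critical value 11.07 is read from the chi-square table (df = 5, p = 0.05).

```python
# Observed counts from the bag of 106 M&M's and the proportions stated by H0.
observed = {"yellow": 29, "red": 23, "orange": 12, "blue": 14, "green": 8, "brown": 20}
null_probs = {"yellow": 0.20, "red": 0.20, "orange": 0.10, "blue": 0.10, "green": 0.10, "brown": 0.30}

n = sum(observed.values())

# Expected count in each cell is n * p_i under H0.
expected = {color: n * p for color, p in null_probs.items()}

# Per-cell contributions (Obs - Exp)^2 / Exp, then their sum.
components = {
    color: (observed[color] - expected[color]) ** 2 / expected[color]
    for color in observed
}
chi2_stat = sum(components.values())
df = len(observed) - 1

critical_05 = 11.07  # table value for df = 5, p = 0.05
reject_h0 = chi2_stat > critical_05

for color in observed:
    print(f"{color:>7}: obs={observed[color]:3d}  exp={expected[color]:5.1f}  component={components[color]:.3f}")
print(f"chi-square = {chi2_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Without intermediate rounding the sum comes out slightly below 9.316 (the slide adds components that were already rounded to three decimals); the decision not to reject H0 is unchanged.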
Chi-square tests for independence

\[ \chi^2 = \sum_{\text{all cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} \]

Expected cell frequencies:

\[ \text{Exp} = \frac{\text{row total} \times \text{column total}}{n} \]

where:
row total = sum of all frequencies in the row
column total = sum of all frequencies in the column
n = overall sample size

H0: The two categorical variables are independent (i.e., there is no relationship between them)
H1: The two categorical variables are dependent (i.e., there is a relationship between them)

Example 1: Parental smoking
Does parental smoking influence the incidence of smoking in children when they reach high school? Randomly chosen high school students were asked whether they smoked (columns) and whether their parents smoked (rows).

                     Student
Parent            Smoke   No smoke   Total
Both smoke          400       1380    1780
One smokes          416       1823    2239
Neither smokes      188       1168    1356
Total              1004       4371    5375

Are parent smoking status and student smoking status related?
H0: parent smoking status and student smoking status are independent
HA: parent smoking status and student smoking status are not independent

Example 1: Parental smoking (cont.)
Examine the computer output for the chi-square test performed on these data. What does it tell you?
[Computer output not reproduced here.]
Hypotheses?
Are the data OK for the χ² test? (All expected counts 5 or more.)
df = (rows − 1)(cols − 1) = 2 × 1 = 2
Interpretation? Since the P-value is less than .05, reject H0 and conclude that parent smoking status and student smoking status are related.

Example 2: meal plan selection
The meal plan selected by 200 students is shown below:

                   Number of meals per week
Class standing   20/week   10/week   none   Total
Fresh.                24        32     14      70
Soph.                 22        26     12      60
Junior                10        14      6      30
Senior                14        16     10      40
Total                 70        88     42     200
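Both two-way tables above can be run through the same calculation. The sketch below is illustrative only and does not reproduce the slide's original computer output: `chi2_two_way` is a hypothetical helper that applies the expected-count formula (row total × column total / n) and sums the (Obs − Exp)²/Exp terms over all cells.

```python
def chi2_two_way(table):
    """Return (statistic, df, expected) for a two-way table of observed counts.

    Hypothetical helper for illustration; `table` is a list of rows.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Rows: both parents smoke, one smokes, neither smokes; columns: student smokes, does not.
smoking = [[400, 1380], [416, 1823], [188, 1168]]
# Rows: Fresh., Soph., Junior, Senior; columns: 20/week, 10/week, none.
meals = [[24, 32, 14], [22, 26, 12], [10, 14, 6], [14, 16, 10]]

smoking_stat, smoking_df, smoking_exp = chi2_two_way(smoking)
meals_stat, meals_df, meals_exp = chi2_two_way(meals)

# Rule of thumb from the slide: all expected counts should be 5 or more.
smoking_ok = min(min(row) for row in smoking_exp) >= 5

print(f"smoking: chi-square = {smoking_stat:.2f}, df = {smoking_df}, counts ok: {smoking_ok}")
print(f"meals:   chi-square = {meals_stat:.3f}, df = {meals_df}")
```

For the smoking table the statistic is about 37.57 with 2 degrees of freedom, far above the 0.05 table value of 5.99, which is consistent with rejecting H0. For the meal plan table it is about 0.709 with 6 degrees of freedom.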
The hypotheses to be tested are:
H0: Meal plan and class standing are independent (i.e., there is no relationship between them)
H1: Meal plan and class standing are dependent (i.e., there is a relationship between them)

Example 2: meal plan selection (cont.)
Expected Cell Frequencies

Observed:

                   Number of meals per week
Class standing   20/wk   10/wk   none   Total
Fresh.              24      32     14      70
Soph.               22      26     12      60
Junior              10      14      6      30
Senior              14      16     10      40
Total               70      88     42     200

Expected cell frequencies if H0 is true:

                   Number of meals per week
Class standing   20/wk   10/wk   none   Total
Fresh.            24.5    30.8   14.7      70
Soph.             21.0    26.4   12.6      60
Junior            10.5    13.2    6.3      30
Senior            14.0    17.6    8.4      40
Total               70      88     42     200

Example for one cell (Junior, 20/wk):

\[ \text{Exp} = \frac{\text{row total} \times \text{column total}}{n} = \frac{30 \times 70}{200} = 10.5 \]

Example 2: meal plan selection (cont.)
The Test Statistic
The test statistic value is:

\[ \chi^2 = \sum_{\text{all cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(24 - 24.5)^2}{24.5} + \frac{(32 - 30.8)^2}{30.8} + \cdots + \frac{(10 - 8.4)^2}{8.4} = 0.709 \]

$\chi^2_{0.05} = 12.592$ from the chi-square distribution with (4 − 1)(3 − 1) = 6 degrees of freedom.

Example 2: meal plan selection (cont.)
Decision and Interpretation
The test statistic is $\chi^2 = 0.709$; $\chi^2_{0.05}$ with 6 d.f. = 12.592.
Decision rule: if $\chi^2 > 12.592$, reject H0; otherwise, do not reject H0.

[Figure: χ² axis starting at 0; "Do not reject H0" region below $\chi^2_{0.05} = 12.592$, "Reject H0" region above it, upper-tail area 0.05.]

Here, $\chi^2 = 0.709 < \chi^2_{0.05} = 12.592$, so do not reject H0.
Conclusion: there is not sufficient evidence that meal plan and class standing are related.

NEXT: the second type of test for two-way tables.
Compare several populations (tests for homogeneity): Randomly select several SRSs, each from a different population (or from a population subjected to different treatments). This is an experimental study. As with the test for independence, the χ² test is used to test the hypothesis of no relationship.

Comparing several populations (tests for homogeneity)
Select independent SRSs from each of c populations, of sizes $n_1, n_2, \ldots, n_c$. Classify each individual in a sample according to a categorical response variable with r possible values. There are c different probability distributions, one for each population.
The null hypothesis is that the distributions of the response variable are the same in all c populations. The alternative hypothesis says that these c distributions are not all the same.

Example: Cocaine addiction (test for homogeneity)
Cocaine produces short-term feelings of physical and mental well-being. To maintain the effect, the drug may have to be taken more frequently and at higher doses. After stopping use, users will feel tired, sleepy, and depressed.
The pleasurable high followed by unpleasant after-effects encourages repeated compulsive use, which can easily lead to dependency.
We compare treatment with an antidepressant (desipramine), a standard treatment (lithium), and a placebo.
Population 1: Antidepressant treatment (desipramine)
Population 2: Standard treatment (lithium)
Population 3: Placebo ("sugar pill")

Cocaine addiction
H0: The proportions of success (no relapse) are the same in all three populations.

Expected counts:

               No relapse            Relapse
Desipramine    25*26/74 ≈ 8.78       25*48/74 ≈ 16.22
Lithium        26*26/74 ≈ 9.14       26*48/74 ≈ 16.86
Placebo        23*26/74 ≈ 8.08       23*48/74 ≈ 14.92

Cocaine addiction
Table of counts, "actual / expected," with three rows and two columns:

               No relapse      Relapse
Desipramine    15 / 8.78       10 / 16.22
Lithium         7 / 9.14       19 / 16.86
Placebo         4 / 8.08       19 / 14.92

df = (3 − 1)(2 − 1) = 2

\[ \chi^2 = \frac{(15 - 8.78)^2}{8.78} + \frac{(10 - 16.22)^2}{16.22} + \frac{(7 - 9.14)^2}{9.14} + \frac{(19 - 16.86)^2}{16.86} + \frac{(4 - 8.08)^2}{8.08} + \frac{(19 - 14.92)^2}{14.92} = 10.74 \]

χ² components:

               No relapse   Relapse
Desipramine          4.41      2.39
Lithium              0.50      0.27
Placebo              2.06      1.12

Cocaine addiction: Table χ
H0: The proportions of success (no relapse) are the same in all three populations.

Chi-square critical values (column headings are the upper-tail probability p)

df   0.25    0.2   0.15    0.1   0.05  0.025   0.02   0.01  0.005 0.0025  0.001 0.0005
 1   1.32   1.64   2.07   2.71   3.84   5.02   5.41   6.63   7.88   9.14  10.83  12.12
 2   2.77   3.22   3.79   4.61   5.99   7.38   7.82   9.21  10.60  11.98  13.82  15.20
 3   4.11   4.64   5.32   6.25   7.81   9.35   9.84  11.34  12.84  14.32  16.27  17.73
 4   5.39   5.99   6.74   7.78   9.49  11.14  11.67  13.28  14.86  16.42  18.47  20.00
 5   6.63   7.29   8.12   9.24  11.07  12.83  13.39  15.09  16.75  18.39  20.51  22.11
 6   7.84   8.56   9.45  10.64  12.59  14.45  15.03  16.81  18.55  20.25  22.46  24.10
 7   9.04   9.80  10.75  12.02  14.07  16.01  16.62  18.48  20.28  22.04  24.32  26.02
 8  10.22  11.03  12.03  13.36  15.51  17.53  18.17  20.09  21.95  23.77  26.12  27.87
 9  11.39  12.24  13.29  14.68  16.92  19.02  19.68  21.67  23.59  25.46  27.88  29.67
10  12.55  13.44  14.53  15.99  18.31  20.48  21.16  23.21  25.19  27.11  29.59  31.42
11  13.70  14.63  15.77  17.28  19.68  21.92  22.62  24.72  26.76  28.73  31.26  33.14
12  14.85  15.81  16.99  18.55  21.03  23.34  24.05  26.22  28.30  30.32  32.91  34.82
13  15.98  16.98  18.20  19.81  22.36  24.74  25.47  27.69  29.82  31.88  34.53  36.48
14  17.12  18.15  19.41  21.06  23.68  26.12  26.87  29.14  31.32  33.43  36.12  38.11
15  18.25  19.31  20.60  22.31  25.00  27.49  28.26  30.58  32.80  34.95  37.70  39.72
16  19.37  20.47  21.79  23.54  26.30  28.85  29.63  32.00  34.27  36.46  39.25  41.31
17  20.49  21.61  22.98  24.77  27.59  30.19  31.00  33.41  35.72  37.95  40.79  42.88
18  21.60  22.76  24.16  25.99  28.87  31.53  32.35  34.81  37.16  39.42  42.31  44.43
19  22.72  23.90  25.33  27.20  30.14  32.85  33.69  36.19  38.58  40.88  43.82  45.97
20  23.83  25.04  26.50  28.41  31.41  34.17  35.02  37.57  40.00  42.34  45.31  47.50
21  24.93  26.17  27.66  29.62  32.67  35.48  36.34  38.93  41.40  43.78  46.80  49.01

$\chi^2 = 10.74 > 5.99$ (df = 2, p = 0.05), so we reject H0.
The proportions of success are not the same in all three populations (Desipramine, Lithium, Placebo).
Desipramine is a more successful treatment: 15 of 25 patients (60%) did not relapse, compared with 7 of 26 (27%) on lithium and 4 of 23 (17%) on placebo.
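The homogeneity test for the cocaine data can be reproduced in a few lines. This is a minimal sketch under stated assumptions, not the original analysis: the variable names are illustrative, the observed counts are the "actual" values from the table of counts, and the p-value uses the fact that for 2 degrees of freedom the chi-square upper-tail probability is exactly exp(−χ²/2).

```python
import math

# Observed counts per treatment: (no relapse, relapse).
observed = {
    "desipramine": (15, 10),
    "lithium": (7, 19),
    "placebo": (4, 19),
}

row_totals = {name: sum(counts) for name, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
n = sum(row_totals.values())

# Expected count under H0 (same distribution in all populations):
# row total * column total / n.
expected = {
    name: tuple(row_totals[name] * col_totals[j] / n for j in range(2))
    for name in observed
}

chi2_stat = sum(
    (observed[name][j] - expected[name][j]) ** 2 / expected[name][j]
    for name in observed
    for j in range(2)
)
df = (len(observed) - 1) * (2 - 1)

# Closed form valid only for df = 2.
p_value = math.exp(-chi2_stat / 2)

print(f"chi-square = {chi2_stat:.2f}, df = {df}, p-value = {p_value:.4f}")
```

With unrounded expected counts the statistic comes out at about 10.73, marginally below the 10.74 obtained from expected counts rounded to two decimals. The p-value is below 0.005, in line with the table lookup (10.60 < 10.74 < 11.98 in the df = 2 row).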