Categorical Data Analysis
Chapter 3: Inference for Contingency Tables

Estimation of Association Parameters
• Proportion difference
  – point and interval estimators
• Relative risk
  – point and interval estimators
• Odds ratio
  – point and interval estimators
  – Example

IxJ Contingency Tables
• Inference on difference, RR, odds ratio
• Are X and Y independent?
• H0: independence

Measure the Lack of Independence
• Pearson chi-square statistic:
    $X^2 = \sum_{i,j} \frac{(n_{ij} - \hat{\mu}_{ij})^2}{\hat{\mu}_{ij}}$
• Likelihood ratio (LR) test statistic:
    $G^2 = 2 \sum_{i,j} n_{ij} \log\left(\frac{n_{ij}}{\hat{\mu}_{ij}}\right)$
A large value of either statistic is strong evidence against independence.

Tests for Independence
• For the Pearson or LR test, the df of the chi-square test is the dimension of the whole parameter space ($\Theta$) minus the dimension of the hypothesized parameter space ($\Theta_0$), i.e.
    $\mathrm{df} = \dim(\Theta) - \dim(\Theta_0)$
• Poisson sampling: df = IJ - (I+J-1) = (I-1)(J-1)
• Single multinomial sampling: df = (IJ-1) - (I+J-2) = (I-1)(J-1)
• Independent multinomial sampling: df = I(J-1) - (J-1) = (I-1)(J-1)

Example: Oral Contraceptive vs. Heart Attack
• Case-control study: retrospective sampling; column totals were fixed

                           Heart attack
    Oral contraceptives     Yes     No
    Used                     23     34
    Never used               35    132
    Total                    58    166

Follow-up Chi-squared Tests
• Pearson and standardized residuals
• Partitioning chi-squared

Residuals
• Pearson residual:
    $e_{ij} = \frac{n_{ij} - \hat{\mu}_{ij}}{\hat{\mu}_{ij}^{1/2}}$
• Standardized Pearson residual:
    $e_{ij}^{s} = \frac{n_{ij} - \hat{\mu}_{ij}}{[\hat{\mu}_{ij}(1 - p_{i+})(1 - p_{+j})]^{1/2}}$

Partitioning Chi-squared
• Describing association in an IxJ table
• Partition an IxJ table into (I-1)(J-1) 2x2 subtables
• Chi-squared = the sum of (I-1)(J-1) independent chi-squared components

Rules for Independent Partitioning
1. The df for the subtables must sum to (I-1)(J-1)
2. Each cell count $n_{ij}$ must appear in one and only one subtable
3. Each marginal total ($n_{i+}$ or $n_{+j}$) must be a marginal total for one and only one subtable

Example: Aspirin vs. Heart Attack
• Prospective sampling; row totals were fixed

               Fatal H.A.   Non-fatal H.A.   No H.A.
    Placebo        18            171          10845
    Aspirin         5             99          10933

Ordinal X: Trend Tests
• Test for linear trend alternative: $M^2$
• Choice of scores
• Example: Table 2.8. 1996 General Social Survey

                                   Job Satisfaction
    Income     Very           Little         Moderately    Very
               dissatisfied   dissatisfied   satisfied     satisfied
    <15K            1              3             10            6
    15K-25K         2              3             10            7
    25K-40K         1              6             14           12
    >40K            0              1              9           11

Exact Test for Independence
• The chi-squared tests are large-sample tests
• As a rule of thumb, the chi-squared tests are valid only when the sample size is large enough that the expected frequencies are greater than or equal to 5 for 80% or more of the categories

Fisher's Exact Test
• Consider a 2x2 table
• Under the three sampling methods, what is the distribution of $n_{11}$ conditional on $n_{1+}$, $n_{2+}$, $n_{+1}$, $n_{+2}$?
• Example: Table 3.8
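
A short sketch may help with the question on the Fisher's exact test slide. Under independence, and conditional on all four margins, $n_{11}$ follows a hypergeometric distribution under each of the three sampling schemes. The Python below evaluates that distribution and a one-sided exact p-value, $P(n_{11} \ge \text{observed})$. Because Table 3.8 is not reproduced in these slides, it uses the oral contraceptive vs. heart attack counts from the earlier example as an assumed stand-in. The function names `hypergeom_pmf` and `fisher_one_sided` are illustrative and do not come from the slides.

```python
from math import comb


def hypergeom_pmf(t, row1, row2, col1):
    """P(n11 = t | margins) for a 2x2 table under independence."""
    return comb(row1, t) * comb(row2, col1 - t) / comb(row1 + row2, col1)


def fisher_one_sided(n11, n12, n21, n22):
    """One-sided exact p-value P(n11 >= observed), conditioning on all margins.

    Returns the p-value and the full conditional distribution of n11.
    """
    row1, row2 = n11 + n12, n21 + n22
    col1 = n11 + n21
    # Range of n11 values compatible with the fixed margins.
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    pmf = {t: hypergeom_pmf(t, row1, row2, col1) for t in range(lo, hi + 1)}
    p_value = sum(p for t, p in pmf.items() if t >= n11)
    return p_value, pmf


# Assumed illustration: oral contraceptive (rows) vs. heart attack (columns).
p_value, pmf = fisher_one_sided(23, 34, 35, 132)
print(f"one-sided exact p-value = {p_value:.4g}")
```

The alternative here is one-sided (a positive association). A two-sided version would need a rule for which tables count as "at least as extreme", and these slides do not specify one.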