If we live with a deep sense of gratitude, our life will be greatly embellished.

Categorical Data Analysis
Chapter 10: Tests for Matched Pairs

Meta Analysis
• Also known as stratified analysis
• Section 6.3.2: Cochran-Mantel-Haenszel test; test for conditional independence
Situation: another variable (the stratification variable Z) may “pollute” the effect of a categorical explanatory variable X on a categorical response Y
Goal: study the effect of X on Y while controlling for the stratification variable Z, without assuming a model

Example: Respiratory Improvement (SAS textbook, p. 46)

Center  Treatment  Yes  No  Total
1       Test        29  16     45
1       Placebo     14  31     45
        Total       43  47     90
2       Test        37   8     45
2       Placebo     24  21     45
        Total       61  29     90

SAS Output

Summary Statistics for trtmnt by response
Controlling for center

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic  Alternative Hypothesis   DF    Value    Prob
-------------------------------------------------------
1          Nonzero Correlation       1  18.4106  <.0001
2          Row Mean Scores Differ    1  18.4106  <.0001
3          General Association       1  18.4106  <.0001

What to Do if Dependent
• (Section 6.3.5) When X and Y are NOT conditionally independent given Z, we would like to test for homogeneous association
• (Section 6.3.6) If X, Y, Z have homogeneous association, we would like to estimate the common conditional odds ratio for X and Y given Z

SAS Output

Estimates of the Common Relative Risk (Row1/Row2)

Type of Study       Method            Value   95% Confidence Limits
-------------------------------------------------------------------
Case-Control        Mantel-Haenszel  4.0288   2.1057   7.7084
(Odds Ratio)        Logit            4.0286   2.1057   7.7072
Cohort (Col1 Risk)  Mantel-Haenszel  1.7368   1.3301   2.2680
                    Logit            1.6760   1.2943   2.1703
Cohort (Col2 Risk)  Mantel-Haenszel  0.4615   0.3162   0.6737
                    Logit            0.4738   0.3264   0.6877

Breslow-Day Test for Homogeneity of the Odds Ratios
---------------------------------------------------
Chi-Square    0.0002
DF                 1
Pr > ChiSq    0.9900

Total Sample Size = 180

Matched-pair Data
• Comparing categorical
responses for two “paired” samples, when either
  • each sample has the same subjects (that is, subjects are measured twice), or
  • a natural pairing exists between each subject in one sample and a subject in the other sample (e.g., twins)

Example: Rating for Prime Minister

                 Second Survey
First Survey   Approve  Disapprove
Approve            794         150
Disapprove          86         570

Marginal Homogeneity
• The probabilities of “success” are identical for both samples (the data table shows “symmetry” across the main diagonal)
• E.g., the probability of approval is the same at the first and second surveys

Estimating Differences of Proportions
• Sample estimate: $p_{+1} - p_{1+}$
• Standard error of $p_{+1} - p_{1+}$ (based on the multinomial distribution of the data):
  $SE(p_{+1} - p_{1+}) = \sqrt{\dfrac{p_{1+}(1 - p_{1+}) + p_{+1}(1 - p_{+1}) - 2(p_{11}p_{22} - p_{12}p_{21})}{n}}$
• Asymptotic $(1-\alpha)$ confidence interval: $(p_{+1} - p_{1+}) \pm z_{\alpha/2}\, SE(p_{+1} - p_{1+})$

McNemar Test (for 2x2 Tables only)
• See SAS textbook Sec. 3.7 (p. 40)
• H0: marginal homogeneity; Ha: no marginal homogeneity
• A special case of the C-M-H test; an approximate test (when n* = n12 + n21 > 10)
• Exact test (when n* = n12 + n21 < 10)

Level of Agreement: Kappa Coefficient
• The larger the kappa coefficient, the stronger the agreement
• The kappa coefficient is the difference between the observed agreement and the agreement expected under independence, relative to the maximum possible difference

SAS Output

McNemar's Test
Statistic (S)         17.3559
DF                          1
Asymptotic Pr >  S     <.0001
Exact      Pr >= S  3.716E-05

Simple Kappa Coefficient (level of agreement)
Kappa                  0.6996
ASE                    0.0180
95% Lower Conf Limit   0.6644
95% Upper Conf Limit   0.7348

Sample Size = 1600

Chi-square Test for Square Tables
Consider an IxI table
• Marginal homogeneity: $\pi_{i+} = \pi_{+i}$, $i = 1, \ldots, I$
• Symmetry: $\pi_{ij} = \pi_{ji}$ for all pairs of cells
• Symmetry => marginal homogeneity; the converse (<=) does not hold in general when I > 2

Chi-square Test for Square Tables
H0: symmetry vs.
Ha: not symmetry
• Fitted values: $\hat{\mu}_{ij} = \hat{\mu}_{ji} = (n_{ij} + n_{ji}) / 2$
• Standardized Pearson residuals: $r_{ij} = (n_{ij} - n_{ji}) / \sqrt{n_{ij} + n_{ji}}$
• Pearson chi-square test statistic: $X^2 = \sum_{i<j} r_{ij}^2$
• Under H0, X^2 approximately follows a chi-square distribution with df = I(I-1)/2

Example: Coffee Purchase

                          2nd purchase
1st purchase   High point  Taster’s  Sanka  Nescafe  Brim
High point             93        17     44        7    10
Taster’s                9        46     11        0     9
Sanka                  17        11    155        9    12
Nescafe                 6         4      9       15     2
Brim                   10         4     12        2    27

Example: Coffee Purchase
• X^2 = 20.4 with df = 5(5-1)/2 = 10: lack of fit (reject H0: symmetry)
• Which pairs of cells cause the lack of fit? Examine their standardized Pearson residuals
• The pair (1,3) and (3,1) contributes the most; the other pairs contribute much less (most have r_ij^2 around 1 or less)
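
The symmetry test for the coffee example can be checked by hand. The sketch below is only an illustration in plain Python, not the SAS procedure used in the course; the function name `symmetry_test` and the choice to give an empty pair a residual of 0 are assumptions made here. It applies r_ij = (n_ij - n_ji) / sqrt(n_ij + n_ji) to every pair above the diagonal and sums the squares.

```python
from math import sqrt

# Coffee purchase counts as read from the example table
# (rows: 1st purchase, columns: 2nd purchase; order: High point,
# Taster's, Sanka, Nescafe, Brim).
table = [
    [93, 17, 44, 7, 10],
    [9, 46, 11, 0, 9],
    [17, 11, 155, 9, 12],
    [6, 4, 9, 15, 2],
    [10, 4, 12, 2, 27],
]


def symmetry_test(counts):
    """Return (X2, df, residuals) for H0: symmetry in a square table.

    Hypothetical helper for illustration; residuals are keyed by the
    1-based pair (i, j) with i < j.
    """
    size = len(counts)
    residuals = {}
    for i in range(size):
        for j in range(i + 1, size):
            total = counts[i][j] + counts[j][i]
            # Assumption: a pair with no observations gets residual 0.
            if total == 0:
                r = 0.0
            else:
                r = (counts[i][j] - counts[j][i]) / sqrt(total)
            residuals[(i + 1, j + 1)] = r
    x2 = sum(r * r for r in residuals.values())
    df = size * (size - 1) // 2
    return x2, df, residuals


x2, df, residuals = symmetry_test(table)
print(f"X^2 = {x2:.1f}, df = {df}")
# List the pairs from largest to smallest squared residual.
for (i, j), r in sorted(residuals.items(), key=lambda kv: -kv[1] ** 2):
    print(f"pair ({i},{j}): r = {r:+.2f}, r^2 = {r * r:.2f}")
```

Sorting by squared residual puts the pair responsible for most of the lack of fit at the top of the listing, which is the diagnostic step the last slide describes.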
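
The matched-pairs quantities for the Prime Minister example (difference of proportions with its standard error, McNemar's statistic, and kappa) can be reproduced the same way. This is a minimal sketch under stated assumptions: the variable names are invented here, z = 1.96 is hard-coded for a 95% interval, and the asymptotic p-value uses the identity P(chi-square with 1 df > x) = erfc(sqrt(x/2)). The exact p-value and the ASE of kappa from the SAS output are not computed.

```python
from math import erfc, sqrt

# Prime Minister approval counts from the example
# (rows: first survey, columns: second survey).
n11, n12 = 794, 150
n21, n22 = 86, 570
n = n11 + n12 + n21 + n22

p11, p12, p21, p22 = n11 / n, n12 / n, n21 / n, n22 / n
p1_plus = p11 + p12  # proportion approving at the first survey
p_plus1 = p11 + p21  # proportion approving at the second survey

# Difference of marginal proportions and its multinomial standard error.
diff = p_plus1 - p1_plus
se = sqrt(
    (p1_plus * (1 - p1_plus) + p_plus1 * (1 - p_plus1)
     - 2 * (p11 * p22 - p12 * p21)) / n
)
z = 1.96  # assumption: approximate z_{alpha/2} for a 95% interval
ci = (diff - z * se, diff + z * se)

# McNemar's statistic uses only the off-diagonal (discordant) counts.
mcnemar = (n12 - n21) ** 2 / (n12 + n21)
p_value = erfc(sqrt(mcnemar / 2))  # upper tail of chi-square with 1 df

# Kappa: observed agreement versus agreement expected under independence.
p_obs = p11 + p22
p_exp = p1_plus * p_plus1 + (1 - p1_plus) * (1 - p_plus1)
kappa = (p_obs - p_exp) / (1 - p_exp)

print(f"difference = {diff:.4f}, SE = {se:.4f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"McNemar S = {mcnemar:.4f}, asymptotic p = {p_value:.2e}")
print(f"kappa = {kappa:.4f}")
```

The McNemar statistic and kappa computed this way agree with the SAS output values 17.3559 and 0.6996.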
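
For the stratified respiratory example, the Cochran-Mantel-Haenszel statistic and the Mantel-Haenszel common odds ratio can also be computed directly. The sketch below is an illustration, not the SAS implementation: the helper name `cmh` is hypothetical, no continuity correction is applied, and the center 2 Test row is taken as 37 Yes / 8 No, which is what its row total of 45 implies.

```python
from math import erfc, sqrt

# Each stratum: ((Test yes, Test no), (Placebo yes, Placebo no)).
strata = [
    ((29, 16), (14, 31)),  # center 1
    ((37, 8), (24, 21)),   # center 2 (No count implied by the row total 45)
]


def cmh(tables):
    """Return (CMH statistic, Mantel-Haenszel odds ratio) for 2x2xK data.

    Hypothetical helper for illustration; no continuity correction.
    """
    diff_sum = var_sum = or_num = or_den = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        # Observed minus expected count in cell (1,1) under
        # conditional independence, and its hypergeometric variance.
        diff_sum += a - row1 * col1 / n
        var_sum += row1 * row2 * col1 * col2 / (n * n * (n - 1))
        # Numerator and denominator of the Mantel-Haenszel estimator.
        or_num += a * d / n
        or_den += b * c / n
    return diff_sum ** 2 / var_sum, or_num / or_den


stat, odds_ratio = cmh(strata)
p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
print(f"CMH = {stat:.4f}, p = {p_value:.2e}")
print(f"Mantel-Haenszel odds ratio = {odds_ratio:.4f}")
```

With these counts the statistic and the odds ratio agree with the SAS output (18.4106 and 4.0288).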