Chapter 10 Categorical Data Analysis

Inference for a Single Proportion (p)
• Goal: Estimate the proportion of individuals in a population with a certain characteristic (p). This is equivalent to estimating a binomial probability.
• Sample: Take an SRS of n individuals from the population and observe y that have the characteristic. The sample proportion is y/n and has the following sampling properties:
Sample proportion: $\hat{p} = y/n$
Mean and std. dev. of sampling distribution: $\mu_{\hat{p}} = p$, $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$
Estimated standard error: $SE_{\hat{p}} = \sqrt{\hat{p}(1-\hat{p})/n}$
Shape: approximately normal for large samples (rule of thumb: $np \ge 5$ and $n(1-p) \ge 5$)

Large-Sample Confidence Interval for p
• Take an SRS of size n from a population where p is the true (unknown) proportion of successes.
– Observe y successes
– Set confidence level $(1-\alpha)$ and obtain $z_{\alpha/2}$ from the z-table
Point estimate: $\hat{p} = y/n$
Estimated standard error: $SE_{\hat{p}} = \sqrt{\hat{p}(1-\hat{p})/n}$
Margin of error: $m = z_{\alpha/2}\,SE_{\hat{p}}$
$(1-\alpha)100\%$ confidence interval for p: $\hat{p} \pm m$

Example - Ginkgo and Acetazolamide for AMS
• Study Goal: Measure the effect of Ginkgo and Acetazolamide (G&A) on the occurrence of Acute Mountain Sickness (AMS) in Himalayan trekkers
• Parameter: p = true proportion of all trekkers receiving G&A who would suffer from AMS.
• Sample Data: n=126 trekkers received G&A, y=18 suffered from AMS
$$\hat{p} = \frac{18}{126} = .143 \qquad SE_{\hat{p}} = \sqrt{\frac{(.14)(.86)}{126}} = .031$$
Margin of error ($(1-\alpha)100\% = 95\%$): $m = 1.96(.031) = .061$
95% CI for p: $.143 \pm .061 \equiv (.082, .204)$

Wilson-Agresti-Coull Method
• For moderate to small sample sizes, large-sample methods may not work well with respect to coverage probabilities
• Simple approach that works well in practice:
– Adjust the observed number of successes (y) and the sample size (n):
$$\tilde{y} = y + 0.5\,z_{\alpha/2}^2 \qquad \tilde{n} = n + z_{\alpha/2}^2$$
Note that for $\alpha = 0.05$, $z_{.025}^2 = 1.96^2 \approx 4$
Point estimate: $\tilde{p} = \tilde{y}/\tilde{n}$
Estimated standard error: $SE_{\tilde{p}} = \sqrt{\tilde{p}(1-\tilde{p})/\tilde{n}}$
Margin of error: $m = z_{\alpha/2}\,SE_{\tilde{p}}$
$(1-\alpha)100\%$ confidence interval for p: $\tilde{p} \pm m$

Example: Lister’s Tests with Antiseptic
• Experiments with antiseptic in patients with upper limb amputations (Joseph Lister, circa 1870)
• n=12 patients received antiseptic, y=1 died
$$\tilde{y} = 1 + 0.5(1.96^2) = 1 + 0.5(3.84) = 2.92 \qquad \tilde{n} = 12 + 1.96^2 = 12 + 3.84 = 15.84$$
$$\tilde{p} = \frac{2.92}{15.84} = .1843 \qquad SE_{\tilde{p}} = \sqrt{\frac{.1843(.8157)}{15.84}} = .0974$$
Margin of error ($(1-\alpha)100\% = 95\%$): $1.96(.0974) = .1910$
95% CI for p: $.1843 \pm .1910 \equiv (-.0067, .3753)$, reported as $(0, .38)$ since a proportion cannot be negative

Sample Size for Margin of Error = E
• Goal: Estimate p within E with $100(1-\alpha)\%$ confidence
• Confidence interval will have a width of 2E
$$m = E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \;\Rightarrow\; n = \frac{z_{\alpha/2}^2\,p(1-p)}{E^2}$$
Since p is unknown, an educated guess can be used, or set p = 0.5. This is the most conservative choice, as p(1-p) is largest for p = 0.5.
For $\alpha = 0.05$, p = 0.5:
$$n = \frac{z_{\alpha/2}^2\,p(1-p)}{E^2} \approx \frac{2^2(0.5)(1-0.5)}{E^2} = \frac{1}{E^2}$$

Significance Test for a Proportion
• Goal: test whether a proportion (p) equals some null value $p_0$
$H_0: p = p_0$
Test statistic: $z_{obs} = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
$H_a: p > p_0$   RR: $z_{obs} \ge z_\alpha$   P-value $= P(Z \ge z_{obs})$
$H_a: p < p_0$   RR: $z_{obs} \le -z_\alpha$   P-value $= P(Z \le z_{obs})$
$H_a: p \ne p_0$   RR: $|z_{obs}| \ge z_{\alpha/2}$   P-value $= 2P(Z \ge |z_{obs}|)$
The large-sample test works well when $np_0 \ge 5$ and $n(1-p_0) \ge 5$

Ginkgo and Acetaz for AMS
• Can we claim that the incidence rate of AMS is less than 25% for trekkers receiving G&A?
• $H_0: p = 0.25$   $H_a: p < 0.25$
$n = 126$, $y = 18$, $\hat{p} = 18/126 = 0.143$, $p_0 = 0.25$
Test statistic: $z_{obs} = \dfrac{.143 - .25}{\sqrt{.25(.75)/126}} = \dfrac{-.107}{.039} = -2.75$
RR ($\alpha = .05$): $z_{obs} \le -z_{.05} = -1.645$
P-value $= P(Z \le -2.75) = .0030$
Strong evidence that the incidence rate is below 25% (p < 0.25)

Comparing Two Population Proportions
• Goal: Compare two populations/treatments with respect to a nominal (binary) outcome
• Sampling Design: Independent vs Dependent Samples
• Methods based on large vs small samples
• Contingency tables used to summarize data
• Measures of Association: Absolute Risk, Relative Risk, Odds Ratio

Contingency Tables
• Tables representing all combinations of levels of explanatory and response variables
• Numbers in the table represent counts of the number of cases in each cell
• Row and column totals are called marginal counts

2x2 Tables - Notation
                 Outcome Present   Outcome Absent     Group Total
Group 1          y1                n1-y1              n1
Group 2          y2                n2-y2              n2
Outcome Total    y1+y2             (n1+n2)-(y1+y2)    n1+n2

Example - Firm Type/Product Quality
                        High Quality   Low Quality   Group Total
Not Integrated          33             55            88
Vertically Integrated   5              79            84
Outcome Total           38             134           172
• Groups: Not Integrated (Weave only) vs Vertically Integrated (Spin and Weave) Cotton Textile Producers
• Outcomes: High Quality (High Count) vs Low Quality (Low Count)
Source: Temin (1988)

Notation
• Proportion in Population 1 with the characteristic of interest: $p_1$
• Sample size from Population 1: $n_1$
• Number of individuals in Sample 1 with the characteristic of interest: $y_1$
• Sample proportion from Sample 1 with the characteristic of interest: $\hat{p}_1 = y_1/n_1$
• Similar notation for Population/Sample 2

Example - Cotton Textile Producers
$p_1$ - True proportion of all non-integrated firms that would produce High quality
$p_2$ - True proportion of all vertically integrated firms that would produce High quality
$$n_1 = 88 \quad y_1 = 33 \quad \hat{p}_1 = \frac{y_1}{n_1} = \frac{33}{88} = 0.375$$
$$n_2 = 84 \quad y_2 = 5 \quad \hat{p}_2 = \frac{y_2}{n_2} = \frac{5}{84} = 0.060$$

Notation (Continued)
• Parameter of Primary Interest: $p_1 - p_2$, the difference in the 2 population proportions with the characteristic (2 other measures given below)
• Estimator: $\hat{D} = \hat{p}_1 - \hat{p}_2$
• Standard Error (and its estimate):
$$\sigma_{\hat{D}} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} \qquad SE_{\hat{D}} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
• Pooled Estimated Standard Error when $p_1 = p_2 = p$:
$$SE_{\hat{D},P} = \sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \qquad \hat{p} = \frac{y_1 + y_2}{n_1 + n_2}$$

Cotton Textile Producers (Continued)
• Parameter of Primary Interest: $p_1 - p_2$, the difference in the 2 population proportions that produce High quality output
• Estimator: $\hat{D} = \hat{p}_1 - \hat{p}_2 = 0.375 - 0.060 = 0.315$
• Standard Error (and its estimate):
$$SE_{\hat{D}} = \sqrt{\frac{0.375(0.625)}{88} + \frac{0.060(0.94)}{84}} = \sqrt{.003335} = .0577$$
• Pooled Estimated Standard Error when $p_1 = p_2 = p$:
$$\hat{p} = \frac{33 + 5}{88 + 84} = 0.221 \qquad SE_{\hat{D},P} = \sqrt{0.221(0.779)\left(\frac{1}{88} + \frac{1}{84}\right)} = .0633$$

Significance Tests for $p_1 - p_2$
• Deciding whether $p_1 = p_2$ can be done by interpreting “plausible values” of $p_1 - p_2$ from the confidence interval:
– If the entire interval is positive, conclude $p_1 > p_2$ ($p_1 - p_2 > 0$)
– If the entire interval is negative, conclude $p_1 < p_2$ ($p_1 - p_2 < 0$)
– If the interval contains 0, do not conclude that $p_1 \ne p_2$
• Alternatively, we can conduct a significance test:
– $H_0: p_1 = p_2$   $H_a: p_1 \ne p_2$ (2-sided)   $H_a: p_1 > p_2$ (1-sided)
– Test Statistic: $z_{obs} = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
– RR: $|z_{obs}| \ge z_{\alpha/2}$ (2-sided)   $z_{obs} \ge z_\alpha$ (1-sided)
– P-value: $2P(Z \ge |z_{obs}|)$ (2-sided)   $P(Z \ge z_{obs})$ (1-sided)

Example - Cotton Textile Production
$H_0: p_1 = p_2$ ($p_1 - p_2 = 0$)   $H_A: p_1 \ne p_2$ ($p_1 - p_2 \ne 0$)
$$TS: z_{obs} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.375 - 0.060}{\sqrt{0.221(0.779)\left(\frac{1}{88} + \frac{1}{84}\right)}} = \frac{0.315}{0.0633} = 4.98$$
RR: $|z_{obs}| \ge z_{.025} = 1.96$
P-value $= 2P(Z \ge 4.98) \approx 0$
Again, there is strong evidence that non-integrated firms are more likely to produce high quality output than integrated firms

Fisher’s Exact Test
• Method for testing whether $p_2 = p_1$ when one or both of the group sample sizes is small
• Measures (conditional on the group sizes and the number of cases with and without the characteristic) the chances we would see differences of this magnitude or larger in the sample proportions, if there were no differences in the populations

Example – Echinacea Purpurea for Colds
• Healthy adults randomized to receive EP (n1=24) or placebo (n2=22, two were dropped)
• Among EP subjects, 14 of 24 developed a cold after exposure to RV-39 (58%)
• Among Placebo subjects, 18 of 22 developed a cold after exposure to RV-39 (82%)
• Out of a total of 46 subjects, 32 developed a cold
• Out of a total of 46 subjects, 24 received EP
Source: Sperber, et al (2004)

Example – Echinacea Purpurea for Colds (Continued)
• Conditional on 32 people developing colds, 24 receiving EP and 22 receiving placebo, the following table gives the outcomes that would have been as strong or stronger evidence that EP reduced the risk of developing a cold (1-sided test). The P-value from SPSS is .079 (next slide).
EP/Cold   Placebo/Cold   Probability
14        18             0.059808
13        19             0.016025
12        20             0.002604
11        21             0.000229
10        22             0.000008
Sum                      0.078674
Probabilities:
$$p(y_{EP}, y_{PL}) = \frac{\binom{n_{EP}}{y_{EP}}\binom{n_{PL}}{y_{PL}}}{\binom{n_{EP}+n_{PL}}{y_{EP}+y_{PL}}}$$
$$p(14, 18) = \frac{\binom{24}{14}\binom{22}{18}}{\binom{46}{32}} = \frac{1961256(7315)}{239877544005} = .059808 \quad \ldots \quad p(10, 22) = \frac{\binom{24}{10}\binom{22}{22}}{\binom{46}{32}} = \frac{1961256(1)}{239877544005} = .0000082$$

Example - SPSS Output
[SPSS output for the Echinacea example (crosstabulation and Fisher's exact test); the table is not recoverable from the extraction]

McNemar’s Test for Paired Samples
• Common subjects (or matched pairs) being observed under 2 conditions (2 treatments, before/after, 2 diagnostic tests) in a crossover setting
• Two possible outcomes (Presence/Absence of Characteristic) on each measurement
• Four possibilities for each subject/pair with respect to outcome:
– Present in both conditions
– Absent in both conditions
– Present in Condition 1, Absent in Condition 2
– Absent in Condition 1, Present in Condition 2

McNemar’s Test for Paired Samples (Continued)
Condition 1\2   Present   Absent
Present         n11       n12
Absent          n21       n22
• Data: n12 = # of pairs where the characteristic is present in condition 1 and not 2, and n21 = # of pairs where it is present in condition 2 and not 1
• H0: Probability the outcome is Present is the same for the 2 conditions ($p_1 = p_2$)
• HA: Probabilities differ for the 2 conditions ($p_1 \ne p_2$)
Large-Sample Test (Normal Approximation to Binomial):
$$T.S.: z_{obs} = \frac{n_{12} - n_{21}}{\sqrt{n_{12} + n_{21}}} \qquad P\text{-val} = 2P(Z \ge |z_{obs}|)$$

Example - Reporting of Silicone Breast Implant Leakage in Revision Surgery
• Subjects - 165 women having revision surgery involving silicone gel breast implants
• Conditions (each being observed on all women)
– Self Report of Presence/Absence of Rupture/Leak
– Surgical Record of Presence/Absence of Rupture/Leak
[SPSS crosstabulation of self report by surgical record; the cell counts are not recoverable from the extraction, apart from the discordant counts n12 = 28 and n21 = 5 that appear in the test computation]
Source: Brown and Pennello (2002), “Replacement Surgery and Silicone Gel Breast Implant Rupture”, Journal of Women’s Health & Gender-Based Medicine, Vol. 11, pp 255-264

Example - Reporting of Silicone Breast Implant Leakage in Revision Surgery (Continued)
• H0: Tendency to report ruptures/leaks is the same for self reports and surgical records
• HA: Tendencies differ
$$T.S.: z_{obs} = \frac{n_{12} - n_{21}}{\sqrt{n_{12} + n_{21}}} = \frac{28 - 5}{\sqrt{28 + 5}} = 4.00$$
$$P\text{-val} = 2P(Z \ge |z_{obs}|) = 2P(Z \ge 4) = 2(.0000317) \approx 0$$
Exact P-value: $2P(Y \ge 28 \mid Y \sim B(n = 33, p = 0.5))$

Multinomial Experiment / Distribution
• Extension of the Binomial Distribution to experiments where each trial can end in exactly one of k categories
• n independent trials
• Probability a trial results in category i is $p_i$
• $n_i$ is the number of trials resulting in category i
• $p_1 + \ldots + p_k = 1$
• $n_1 + \ldots + n_k = n$

Multinomial Distribution / Test for Cell Probabilities
$$p(n_1, \ldots, n_k) = \frac{n!}{n_1! \cdots n_k!}\, p_1^{n_1} \cdots p_k^{n_k} \qquad \sum_{i=1}^{k} n_i = n,\ \sum_{i=1}^{k} p_i = 1,\ n_i \ge 0,\ p_i \ge 0$$
Testing whether the category probabilities are specific values:
$H_0: p_1 = p_{10}, \ldots, p_k = p_{k0}$ with $\sum_{i=1}^{k} p_{i0} = 1$
$H_A$: At least one cell probability is not as specified
Expected cell counts under $H_0$: $E_i = n p_{i0}$, $i = 1, \ldots, k$
Test Statistic: $X^2_{obs} = \sum_{i=1}^{k} \dfrac{(n_i - E_i)^2}{E_i}$
Rejection Region: $X^2_{obs} \ge \chi^2_{\alpha, k-1}$
P-value: $P(\chi^2_{k-1} \ge X^2_{obs})$

Goodness of Fit Test for a Probability Distribution
• Data are collected and we wish to determine whether they come from a particular probability distribution (e.g. Poisson, Normal, Gamma)
• Estimate any unknown model parameters (p estimates)
• Break down the range of data values into k > p + 1 intervals (typically where ≥ 80% have expected counts ≥ 5) and obtain observed (n) and expected (E) values for each interval
Test Statistic: $X^2_{obs} = \sum_{i=1}^{k} \dfrac{(n_i - E_i)^2}{E_i}$
P-value: $P(\chi^2_{k-1-p} \ge X^2_{obs})$ (one degree of freedom is lost for each estimated parameter)
Assessing quality of fit to the hypothesized distribution:
P-value    Quality of Fit
> .25      Excellent
.15-.25    Good
.05-.15    Moderately Good
.01-.05    Poor
< .01      Unacceptable

Associations Between Categorical Variables
• Case where both the explanatory (independent) variable and the response (dependent) variable are qualitative
• Association: The distributions of responses differ among the levels of the explanatory variable (e.g. Party affiliation by gender)

Contingency Tables
• Cross-tabulations of frequency counts where the rows (typically) represent the levels of the explanatory variable and the columns represent the levels of the response variable.
• Numbers within the table represent the numbers of individuals falling in the corresponding combination of levels of the two variables
• Row and column totals are called the marginal distributions for the two variables

Example - Cyclones Near Antarctica
• Period of Study: September 1973 - May 1975
• Explanatory Variable: Region (40-49, 50-59, 60-79) (Degrees South Latitude)
• Response: Season (Aut(4), Wtr(5), Spr(4), Sum(8)) (Number of months in parentheses)
• Units: Cyclones in the study area
• Treating the observed cyclones as a “random sample” of all cyclones that could have occurred
Source: Howarth (1983), “An Analysis of the Variability of Cyclones around Antarctica and Their Relation to Sea-Ice Extent”, Annals of the Association of American Geographers, Vol. 73, pp 519-537

Example - Cyclones Near Antarctica
Region\Season   Autumn   Winter   Spring   Summer   Total
40-49S          370      452      273      422      1517
50-59S          526      624      513      1059     2722
60-79S          980      1200     995      1751     4926
Total           1876     2276     1781     3232     9165
For each region (row) we can compute the percentage of storms occurring during each season, the conditional distribution. Of the 1517 cyclones in the 40-49 band, 370 occurred in Autumn, a proportion of 370/1517 = .244, or 24.4% as a percentage.
Region\Season   Autumn   Winter   Spring   Summer   Total% (n)
40-49S          24.4     29.8     18.0     27.8     100.0 (1517)
50-59S          19.3     22.9     18.8     38.9     100.0 (2722)
60-79S          19.9     24.4     20.2     35.5     100.0 (4926)

Example - Cyclones Near Antarctica
[Figure: bar chart of the percentage of cyclones (vertical axis, 10 to 40) by season (Autumn, Winter, Spring, Summer), one bar per region (40-49S, 50-59S, 60-79S)]
Graphical Conditional Distributions for Regions

Guidelines for Contingency Tables
• Compute percentages for the response (column) variable within the categories of the explanatory (row) variable. Note that in journal articles, rows and columns may be interchanged.
• Divide the cell counts by the row (explanatory category) total and multiply by 100 to obtain a percent; the row percents will add to 100
• Give a title and clearly define the variables and categories.
• Include row (explanatory) total sample sizes

Independence & Dependence
• Statistically Independent: Population conditional distributions of one variable are the same across all levels of the other variable
• Statistically Dependent: Conditional distributions are not all equal
• When testing, researchers typically wish to demonstrate dependence (alternative hypothesis), and wish to refute independence (null hypothesis)

Pearson’s Chi-Square Test
• Can be used for nominal or ordinal explanatory and response variables
• Variables can have any number of distinct levels
• Tests whether the distribution of the response variable is the same for each level of the explanatory variable (H0: No association between the variables)
• r = # of levels of explanatory variable
• c = # of levels of response variable

Pearson’s Chi-Square Test
• Intuition behind the test statistic
– Obtain the marginal distribution of outcomes for the response variable
– Apply this common distribution to all levels of the explanatory variable, by multiplying each proportion by the corresponding sample size
– Measure the difference between the actual cell counts and the expected cell counts from the previous step

Pearson’s Chi-Square Test
• Notation to obtain the test statistic
– Rows represent the explanatory variable (r levels)
– Cols represent the response variable (c levels)
        1     2     …    c     Total
1       n11   n12   …    n1c   n1.
2       n21   n22   …    n2c   n2.
…       …     …     …    …     …
r       nr1   nr2   …    nrc   nr.
Total   n.1   n.2   …    n.c   n..

Pearson’s Chi-Square Test
• Observed frequency ($n_{ij}$): The number of individuals falling in a particular cell
• Expected frequency ($E_{ij}$): The number we would expect in that cell, given the sample sizes observed in the study and the assumption of independence.
– Computed by multiplying the row total and the column total, and dividing by the overall sample size: $E_{ij} = n_{i.}\,n_{.j}/n_{..}$
– Applies the overall marginal probability of the response category to the sample size of the explanatory category

Pearson’s Chi-Square Test
• Large-sample test (at least 80% of $E_{ij}$ > 5)
• H0: Variables are statistically independent (No association between variables)
• Ha: Variables are statistically dependent (Association exists between variables)
• Test Statistic: $X^2_{obs} = \sum_{i}\sum_{j} \dfrac{(n_{ij} - E_{ij})^2}{E_{ij}}$
• P-value: Area above $X^2_{obs}$ in the chi-squared distribution with (r-1)(c-1) degrees of freedom. (Critical values in Table 8)

Example - Cyclones Near Antarctica
Observed Cell Counts ($n_{ij}$):
Region\Season   Autumn   Winter   Spring   Summer   Total
40-49S          370      452      273      422      1517
50-59S          526      624      513      1059     2722
60-79S          980      1200     995      1751     4926
Total           1876     2276     1781     3232     9165
Note that overall, (1876/9165)100% = 20.5% of all cyclones occurred in Autumn. If we apply that percentage to the 1517 that occurred in the 40-49S band, we would expect (0.205)(1517) = 310.5 to have occurred in the first cell of the table.
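The expected-count rule just described (row total times column total, divided by the overall total) and the resulting Pearson statistic can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own code: the function names `expected_counts` and `pearson_chi_square` are hypothetical, and the counts are the observed cyclone counts from the table above.

```python
# Observed cyclone counts: rows are latitude bands, columns are seasons
# (Autumn, Winter, Spring, Summer), as in the observed-count table.
observed = [
    [370, 452, 273, 422],    # 40-49S
    [526, 624, 513, 1059],   # 50-59S
    [980, 1200, 995, 1751],  # 60-79S
]


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for Pearson's chi-square test."""
    expected = expected_counts(table)
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


expected = expected_counts(observed)
chi2, df = pearson_chi_square(observed)
print(round(expected[0][0], 1))  # expected count for 40-49S / Autumn
print(round(chi2, 2), df)        # Pearson statistic and its degrees of freedom
```

The first printed value should agree with the 310.5 quoted above. The statistic is computed from unrounded expected counts, so it can differ in the second decimal place from a hand calculation that rounds each expected count first.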
The full table of $E_{ij}$:
Region\Season   Autumn   Winter   Spring   Summer   Total
40-49S          310.5    376.7    294.8    535.0    1517
50-59S          557.2    676.0    529.0    959.9    2722
60-79S          1008.3   1223.3   957.3    1737.1   4926
Total           1876     2276     1781     3232     9165

Example - Cyclones Near Antarctica
Computation of $X^2_{obs}$:
Region   Season   n_ij   E_ij     (n-E)^2    ((n-E)^2)/E
40-49S   Autumn   370    310.5    3540.25    11.4018
40-49S   Winter   452    376.7    5670.09    15.0520
40-49S   Spring   273    294.8    475.24     1.6121
40-49S   Summer   422    535.0    12769.00   23.8673
50-59S   Autumn   526    557.2    973.44     1.7470
50-59S   Winter   624    676.0    2704.00    4.0000
50-59S   Spring   513    529.0    256.00     0.4839
50-59S   Summer   1059   959.9    9820.81    10.2311
60-79S   Autumn   980    1008.3   800.89     0.7943
60-79S   Winter   1200   1223.3   542.89     0.4438
60-79S   Spring   995    957.3    1421.29    1.4847
60-79S   Summer   1751   1737.1   193.21     0.1112
X^2_obs                                      71.2292

Example - Cyclones Near Antarctica
• H0: Seasonal distribution of cyclone occurrences is independent of latitude band
• Ha: Seasonal distributions of cyclone occurrences differ among latitude bands
• Test Statistic: $X^2_{obs} = 71.2$
• RR: $X^2_{obs} \ge \chi^2_{.05,6} = 12.59$
• P-value: Area in the chi-squared distribution with (3-1)(4-1) = 6 degrees of freedom above 71.2. From Table 8, $P(\chi^2 \ge 22.46) = .001 \Rightarrow P < .001$

Likelihood Ratio Statistic
Alternative statistic provided by many computer packages:
Test Statistic: $X^2_{LR} = 2\sum_{i=1}^{r}\sum_{j=1}^{c} n_{ij}\ln\left(\dfrac{n_{ij}}{E_{ij}}\right) = 2\sum_{i=1}^{r}\sum_{j=1}^{c} n_{ij}\ln\left(\dfrac{n_{..}\,n_{ij}}{n_{i.}\,n_{.j}}\right)$
Rejection Region: $X^2_{LR} \ge \chi^2_{\alpha,(r-1)(c-1)}$
P-value: $P(\chi^2_{(r-1)(c-1)} \ge X^2_{LR})$
Note: The formula on page 512 of the textbook is incorrect
Row(i)   Column(j)   n_ij   n_i●   n_●j   X2(LR)
1        1           370    1517   1876   129.6947
1        2           452    1517   2276   164.6768
1        3           273    1517   1781   -41.9335
1        4           422    1517   3232   -200.192
2        1           526    2722   1876   -60.5646
2        2           624    2722   2276   -99.8393
2        3           513    2722   1781   -31.4258
2        4           1059   2722   3232   208.0912
3        1           980    4926   1876   -55.8208
3        2           1200   4926   2276   -46.1601
3        3           995    4926   1781   76.9673
3        4           1751   4926   3232   27.84263
Sum                                       71.33672

SPSS Output - Cyclone Example
REGION * SEASON Crosstabulation
REGION                      Autumn   Winter   Spring   Summer   Total
40-49S   Count              370      452      273      422      1517
         Expected Count     310.5    376.7    294.8    535.0    1517.0
         % within REGION    24.4%    29.8%    18.0%    27.8%    100.0%
50-59S   Count              526      624      513      1059     2722
         Expected Count     557.2    676.0    529.0    959.9    2722.0
         % within REGION    19.3%    22.9%    18.8%    38.9%    100.0%
60-79S   Count              980      1200     995      1751     4926
         Expected Count     1008.3   1223.3   957.3    1737.1   4926.0
         % within REGION    19.9%    24.4%    20.2%    35.5%    100.0%
Total    Count              1876     2276     1781     3232     9165
         Expected Count     1876.0   2276.0   1781.0   3232.0   9165.0
         % within REGION    20.5%    24.8%    19.4%    35.3%    100.0%

Chi-Square Tests
                               Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square             71.189a   6    .000
Likelihood Ratio               71.337    6    .000
Linear-by-Linear Association   23.418    1    .000
N of Valid Cases               9165
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 294.79.
(The Asymp. Sig. column is the P-value.)

Misuses of chi-squared Test
• Expected frequencies too small (at least 80% of expected counts should be above 5; this is not necessary for the observed counts)
• Dependent samples (the same individuals are in each row, see McNemar’s test)
• Can be used for nominal or ordinal variables, but more powerful methods exist when both variables are ordinal and a directional association is hypothesized

Ordinal Explanatory and Response Variables
• Pearson’s Chi-square test can be used to test associations among ordinal variables, but more powerful methods exist
• When theories exist that the association is directional (positive or negative), measures exist to describe and test for these specific alternatives to independence:
– Gamma
– Kendall’s $\tau_b$

Concordant and Discordant Pairs
• Concordant Pairs - Pairs of individuals where one individual scores “higher” on both ordered variables than the other individual
• Discordant Pairs - Pairs of individuals where one individual scores “higher” on one ordered variable and the other individual scores “higher” on the other
• C = # Concordant Pairs   D = # Discordant Pairs
– Under Positive association, expect C > D
– Under Negative association, expect C < D
– Under No association, expect C ≈ D

Example - Alcohol Use and Sick Days
• Alcohol Risk (Without Risk, Hardly any Risk, Some to Considerable Risk)
• Sick Days (0, 1-6, ≥7)
• Concordant Pairs - Pairs of respondents where one scores higher on both alcohol risk and sick days than the other
• Discordant Pairs - Pairs of respondents where one scores higher on alcohol risk and the other scores higher on sick days
Source: Hermansson, et al (2003)

Example - Alcohol Use and Sick Days
Alcohol Risk \ Sick Days      0     1-6   ≥7    Total
Without Risk                  347   113   145   605
Hardly any Risk               154   63    56    273
Some to Considerable Risk     52    25    34    111
Total                         553   201   235   989
• Concordant Pairs: Each individual in a given cell is concordant with each individual in cells “Southeast” of theirs
• Discordant Pairs: Each individual in a given cell is discordant with each individual in cells “Southwest” of theirs
$$C = 347(63 + 56 + 25 + 34) + 113(56 + 34) + 154(25 + 34) + 63(34) = 83164$$
$$D = 145(154 + 63 + 52 + 25) + 113(154 + 52) + 56(52 + 25) + 63(52) = 73496$$

Measures of Association
• Goodman and Kruskal’s Gamma: $\hat{\gamma} = \dfrac{C - D}{C + D}$, with $-1 \le \hat{\gamma} \le 1$
• Kendall’s $\tau_b$: $\hat{\tau}_b = \dfrac{C - D}{0.5\sqrt{\left(n^2 - \sum_i n_{i.}^2\right)\left(n^2 - \sum_j n_{.j}^2\right)}}$
When there’s no association between the ordinal variables, the population-based values of these measures are 0. Statistical software packages provide these tests.
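The "Southeast / Southwest" counting rule and the two measures above can be sketched as follows. This is a minimal illustration under stated assumptions: the 3x3 cell counts are read off from the C and D computations in the alcohol use example (rows ordered by increasing alcohol risk, columns by increasing sick days), and the function names are hypothetical.

```python
import math

# Assumed layout, inferred from the C and D computations in the example:
# rows = alcohol risk (without, hardly any, some to considerable),
# columns = sick days in increasing order.
table = [
    [347, 113, 145],
    [154, 63, 56],
    [52, 25, 34],
]


def concordant_discordant(t):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    rows, cols = len(t), len(t[0])
    c = d = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # only rows below cell (i, j)
                for m in range(cols):
                    if m > j:                 # "Southeast" cell: concordant
                        c += t[i][j] * t[k][m]
                    elif m < j:               # "Southwest" cell: discordant
                        d += t[i][j] * t[k][m]
    return c, d


def gamma(t):
    """Goodman and Kruskal's gamma: (C - D) / (C + D)."""
    c, d = concordant_discordant(t)
    return (c - d) / (c + d)


def kendall_tau_b(t):
    """Kendall's tau-b using the row and column totals, as in the formula above."""
    c, d = concordant_discordant(t)
    n = sum(sum(row) for row in t)
    row_sq = sum(sum(row) ** 2 for row in t)
    col_sq = sum(sum(col) ** 2 for col in zip(*t))
    return (c - d) / (0.5 * math.sqrt((n**2 - row_sq) * (n**2 - col_sq)))


C, D = concordant_discordant(table)
print(C, D)
print(round(gamma(table), 4))
print(round(kendall_tau_b(table), 4))
```

With these counts, C and D should reproduce the 83164 and 73496 obtained by hand. Because tau-b divides by a quantity that also accounts for ties, its magnitude is never larger than that of gamma.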
Example - Alcohol Use and Sick Days
$$\hat{\gamma} = \frac{C - D}{C + D} = \frac{83164 - 73496}{83164 + 73496} = 0.0617$$
[SPSS output: Symmetric Measures (Kendall's tau-b and Gamma) for the alcohol risk by sick days table; values not recoverable from the extraction]

Measures of Association
• Absolute Risk (AR): $p_1 - p_2$
• Relative Risk (RR): $p_1 / p_2$
• Odds Ratio (OR): $o_1 / o_2$ (where $o = p/(1-p)$)
• Note that if $p_1 = p_2$ (No association between outcome and grouping variables):
– AR=0
– RR=1
– OR=1

Relative Risk
• Ratio of the probability that the outcome characteristic is present for one group, relative to the other
• Sample proportions with the characteristic from groups 1 and 2:
$$\hat{p}_1 = \frac{y_1}{n_1} \qquad \hat{p}_2 = \frac{y_2}{n_2}$$

Relative Risk
• Estimated Relative Risk: $RR = \dfrac{\hat{p}_1}{\hat{p}_2}$
95% Confidence Interval for Population Relative Risk:
$$\left(RR\,e^{-1.96\sqrt{v}},\ RR\,e^{1.96\sqrt{v}}\right) \qquad e \approx 2.71828 \qquad v = \frac{1 - \hat{p}_1}{y_1} + \frac{1 - \hat{p}_2}{y_2}$$

Relative Risk
• Interpretation
– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1

Example - Concussions in NCAA Athletes
• Units: Game exposures among college soccer players 1997-1999
• Outcome: Presence/Absence of a Concussion
• Group Variable: Gender (Female vs Male)
• Contingency Table of case outcomes:
Gender   Concussion   No Concussion   Total
Female   158          74924           75082
Male     101          75633           75734
Total    259          150557          150816
Source: Covassin, et al (2003)

Example - Concussions in NCAA Athletes
Among Females: $\hat{p}_F = \dfrac{158}{75082} = 0.0021$ (2.1 concussions per 1000 female player/games)
Among Males: $\hat{p}_M = \dfrac{101}{75734} = 0.0013$ (1.3 concussions per 1000 male player/games)
$$RR(F/M) = \frac{\hat{p}_F}{\hat{p}_M} = \frac{.0021}{.0013} = 1.62 \qquad v = \frac{1 - .0021}{158} + \frac{1 - .0013}{101} = .0162 \qquad \sqrt{v} = .1273$$
95% CI for Population Relative Risk:
$$\left(1.62\,e^{-1.96(.1273)},\ 1.62\,e^{1.96(.1273)}\right) \equiv (1.26, 2.08)$$
There is strong evidence that females have a higher risk of concussion

Odds Ratio
• Odds of an event is the probability it occurs divided by the probability it does not occur
• Odds ratio is the odds of the event for group 1 divided by the odds of the event for group 2
• Sample odds of the outcome for each group:
$$odds_1 = \frac{y_1/n_1}{(n_1 - y_1)/n_1} = \frac{y_1}{n_1 - y_1} \qquad odds_2 = \frac{y_2}{n_2 - y_2}$$

Odds Ratio
• Estimated Odds Ratio:
$$OR = \frac{odds_1}{odds_2} = \frac{y_1/(n_1 - y_1)}{y_2/(n_2 - y_2)} = \frac{y_1(n_2 - y_2)}{y_2(n_1 - y_1)}$$
95% Confidence Interval for Population Odds Ratio:
$$\left(OR\,e^{-1.96\sqrt{v}},\ OR\,e^{1.96\sqrt{v}}\right) \qquad e \approx 2.71828 \qquad v = \frac{1}{y_1} + \frac{1}{n_1 - y_1} + \frac{1}{y_2} + \frac{1}{n_2 - y_2}$$

Odds Ratio
• Interpretation
– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1

Osteoarthritis in Former Soccer Players
• Units: 68 former British professional football players and 136 age/sex matched controls
• Outcome: Presence/Absence of Osteoarthritis (OA)
• Data:
• Of n1 = 68 former professionals, y1 = 9 had OA, n1-y1 = 59 did not
• Of n2 = 136 controls, y2 = 2 had OA, n2-y2 = 134 did not
$$odds_1 = \frac{y_1}{n_1 - y_1} = \frac{9}{59} = .1525 \qquad odds_2 = \frac{2}{134} = .0149 \qquad OR = \frac{odds_1}{odds_2} = \frac{.1525}{.0149} = 10.23$$
$$v = \frac{1}{9} + \frac{1}{59} + \frac{1}{2} + \frac{1}{134} = .6355 \qquad \sqrt{v} = .797$$
95% CI for Population Odds Ratio:
$$\left(10.23\,e^{-1.96(.797)},\ 10.23\,e^{1.96(.797)}\right) \equiv (2.14, 48.80)$$
The entire interval is above 1
Source: Shepard, et al (2003)

Mantel-Haenszel Test / CI for Multiple Tables
• Data collected from q studies or strata in 2x2 contingency tables with common groupings/outcomes
• Each table has 4 cells: $n_{h11}, n_{h12}, n_{h21}, n_{h22}$, h = 1,…,q
• They can be combined for an overall Chi-square statistic or odds ratio and confidence interval
Table 1
Trt\Response   1       2       Total
1              n_111   n_112   n_11●
2              n_121   n_122   n_12●
Total          n_1●1   n_1●2   n_1●●
Table q
Trt\Response   1       2       Total
1              n_q11   n_q12   n_q1●
2              n_q21   n_q22   n_q2●
Total          n_q●1   n_q●2   n_q●●

Mantel-Haenszel Computations
Test Statistic:
$$X^2_{MH} = \frac{\left[\sum_{h=1}^{q}\left(n_{h11} - \dfrac{n_{h1\bullet}\,n_{h\bullet 1}}{n_{h\bullet\bullet}}\right)\right]^2}{\sum_{h=1}^{q} \dfrac{n_{h1\bullet}\,n_{h2\bullet}\,n_{h\bullet 1}\,n_{h\bullet 2}}{n_{h\bullet\bullet}^2\,(n_{h\bullet\bullet} - 1)}}$$
Rejection Region: $X^2_{MH} \ge \chi^2_{\alpha,1}$
P-value: $P(\chi^2_1 \ge X^2_{MH})$
$$\widehat{OR}_{MH} = \frac{R}{S} \qquad R = \sum_{h=1}^{q} \frac{n_{h11}\,n_{h22}}{n_{h\bullet\bullet}} \qquad S = \sum_{h=1}^{q} \frac{n_{h12}\,n_{h21}}{n_{h\bullet\bullet}}$$
95% CI for Overall Odds Ratio: $\left(\widehat{OR}_{MH}\,e^{-1.96\sqrt{v}},\ \widehat{OR}_{MH}\,e^{1.96\sqrt{v}}\right)$
$$v = \hat{V}\left(\ln \widehat{OR}_{MH}\right) = \frac{1}{S^2}\sum_{h=1}^{q} S_h^2\left(\frac{1}{n_{h11}} + \frac{1}{n_{h12}} + \frac{1}{n_{h21}} + \frac{1}{n_{h22}}\right) \qquad S_h = \frac{n_{h12}\,n_{h21}}{n_{h\bullet\bullet}}$$
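As a rough illustration of the Mantel-Haenszel computations, the sketch below pools several 2x2 tables into the chi-square statistic and the combined odds ratio R/S. It is a sketch under assumptions: the function name `mantel_haenszel` is hypothetical, the two-stratum counts are made up purely for illustration, and the statistic is computed without a continuity correction, which is an assumption about the intended formula. The confidence interval for the pooled odds ratio is not implemented here.

```python
def mantel_haenszel(tables):
    """Pool 2x2 tables given as ((n11, n12), (n21, n22)) per stratum.

    Returns (chi-square statistic, Mantel-Haenszel odds ratio R/S).
    No continuity correction is applied (an assumption of this sketch).
    """
    diff_sum = 0.0   # sum over strata of n11 minus its expected value
    var_sum = 0.0    # sum over strata of the variance of n11
    r = s = 0.0      # numerator and denominator of OR_MH
    for (n11, n12), (n21, n22) in tables:
        n = n11 + n12 + n21 + n22
        row1, row2 = n11 + n12, n21 + n22
        col1, col2 = n11 + n21, n12 + n22
        diff_sum += n11 - row1 * col1 / n
        var_sum += row1 * row2 * col1 * col2 / (n**2 * (n - 1))
        r += n11 * n22 / n
        s += n12 * n21 / n
    return diff_sum**2 / var_sum, r / s


# Sanity check with a single stratum: the osteoarthritis data from this chapter
# (9 of 68 former players with OA, 2 of 136 controls with OA).
chi2_single, or_single = mantel_haenszel([((9, 59), (2, 134))])
print(round(or_single, 2))  # reduces to the ordinary odds ratio 9*134 / (2*59)

# Hypothetical two-stratum data, invented for illustration only.
hypothetical_strata = [((12, 38), (6, 44)), ((20, 30), (11, 39))]
chi2_mh, or_mh = mantel_haenszel(hypothetical_strata)
print(round(chi2_mh, 3), round(or_mh, 3))
```

With one stratum the pooled estimate collapses to the usual sample odds ratio, which gives a quick check on the R/S arithmetic. The unrounded value for the osteoarthritis data is about 10.22, slightly below the 10.23 obtained earlier from rounded odds.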