Effect Modification & Confounding
Kostas Danis
EPIET Introductory course, Menorca 2012

Analytical epidemiology
• Study design: cohort, case-control and cross-sectional studies
• Choice of a reference group
• Biases
• Impact
• Causal inference
• Stratification
  – Effect modification
  – Confounding
• Matching
• Multivariable analysis

Cohort studies: marching towards outcomes

Cohort study
              Total   Cases   Non-cases   Risk %
Exposed         100      50          50     50 %
Not exposed     100      10          90     10 %
Risk ratio = 50% / 10% = 5

[Diagram: source population (exposed, unexposed), cases, and controls drawn as a sample]
• Controls: sample of the denominator
• Representative with regard to exposure
• Controls are non-cases

[Diagram: cases and non-cases in the source population, from start to end]
• Low attack rate: non-cases likely to represent exposure in the source population
• High attack rate: non-cases unlikely to represent exposure in the source population

Case-control study
                   Cases   Controls
Exposed              a        b
Not exposed          c        d
Total               a+c      b+d
Odds of exposure    a/c      b/d
Odds ratio: OR = (a/c) / (b/d) = ad / bc

Who are the right controls?
Controls may not be easy to find

Cross-sectional study: sampling
[Diagram: target population, sampling population, sample]

Cross-sectional study
              Total   Cases   Non-cases   Prevalence %
Exposed       1,000     500         500           50 %
Not exposed   1,000     100         900           10 %
Prevalence ratio (PR) = 50% / 10% = 5

Should I believe my measurement?
[Diagram: Exposure → Outcome, RR = 4]
• Chance?
• Bias?
• Confounding?
• True association: causal or non-causal

[Diagram: Exposure, Outcome, Third variable]

Two main complications
(1) Effect modifier: useful information
(2) Confounding factor: bias

Solution = stratification (stratified analysis)
• To analyse effect modification
• To eliminate confounding
• Create strata according to categories inside the range of values taken by the third variable

Effect modification

Effect modifier
• Variation in the magnitude of a measure of effect across levels of a third variable
• Happens when the RR or OR differs between strata (subgroups of the population)

Effect modifier
• To identify a subgroup with a lower or higher risk ratio
• To target public health action
• To study interaction between risk factors

Effect modification
[Diagram: Factor A (asbestos) → Disease (lung cancer); Factor B (smoking)]
Effect modifier = interaction

Asbestos (As) and lung cancer (Ca)
Case-control study, unstratified data
As       Ca      Controls   OR
Yes       693       320     4.8
No        307       680     Ref.
Total    1000      1000

[Diagram: Asbestos → Lung cancer; third variable: Smoking]

Smokers
As       Ca      Controls   OR
Yes       517       160     6.0
No        183       340     Ref.
Total     700       500

Nonsmokers
As       Ca      Controls   OR
Yes       176       160     3.0
No        124       340     Ref.
Total     300       500

Asbestos (As), smoking and lung cancer (Ca)
As    Smoking   Cases   Controls   OR
Yes   Yes         517       160    8.9
Yes   No          176       160    3.0
No    Yes         183       340    1.5
No    No          124       340    Ref.
1.5 × 3.0 < 8.9
1.5 × 3.0 × interaction = 8.9

Physical activity and MI
Physical activity   MI    Controls   OR (95% CI)
≥ 2500 kcal/d       190      264     0.64 (0.6-0.9)
< 2500 kcal/d       176      157     Ref.

[Diagram: Physical activity → Infarction; third variable: Gender]

Men
Physical activity   MI    Controls   OR (95% CI)
≥ 2500 kcal/d       141      208     0.53 (0.4-0.7)
< 2500 kcal/d       144      112     Ref.

Women
Physical activity   MI    Controls   OR (95% CI)
≥ 2500 kcal/d        49       56     1.2 (0.7-2.2)
< 2500 kcal/d        32       45     Ref.

Vaccine efficacy
VE = (ARU - ARV) / ARU
VE = 1 - RR
(ARU: attack rate in the unvaccinated; ARV: attack rate in the vaccinated)

Vaccine efficacy
Status   Pop.       Cases   Cases per 1000   RR
V        301,545      150        0.49        0.28
NV       298,655      515        1.72        Ref.
Total    600,200      665        1.11
VE = 1 - RR = 1 - 0.28 = 72%

[Diagram: Vaccine → Disease; third variable: Age]

Vaccine efficacy by age group
Age       Status   Pop.     Cases   Cases/1000   RR     VE
< 1y      V        35,625     38       1.07      0.87   13%
          NV       24,375     30       1.23
1-4y      V        44,220     34       0.77      0.42   58%
          NV       46,780     86       1.84
5-9y      V        78,200     50       0.64      0.19   81%
          NV       75,000    250       3.33
10-24y    V        83,400     18       0.22      0.15   85%
          NV       82,600    120       1.45
> 24y     V        60,100     10       0.17      0.40   60%
          NV       69,900     29       0.41

Effect modification
• Different effects (RR) in different strata (age groups)
• VE is modified by age
• Test for homogeneity among strata (Woolf test)

Any statistical test to help us?
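The Woolf test of homogeneity named above can be sketched in a few lines. This is a minimal illustration, not the implementation behind the slides: the function names `chi2_sf` and `woolf_test` are hypothetical, the cell layout (a, b, c, d) follows the case-control table of this deck, and the statistic is the usual inverse-variance weighted sum of squared deviations of the stratum log odds ratios from their pooled value, compared with a chi-square distribution on (number of strata - 1) degrees of freedom. Statistical packages may differ in details, so p-values need not match those printed on the slides.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df % 2 == 0:
        q, nu = math.exp(-x / 2), 2
    else:
        q, nu = math.erfc(math.sqrt(x / 2)), 1
    # Recurrence: Q(x; nu + 2) = Q(x; nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1)
    while nu < df:
        q += (x / 2) ** (nu / 2) * math.exp(-x / 2) / math.gamma(nu / 2 + 1)
        nu += 2
    return q

def woolf_test(strata):
    """Woolf test of homogeneity of odds ratios.

    strata: list of (a, b, c, d) = exposed cases, exposed controls,
    unexposed cases, unexposed controls. All cells must be non-zero.
    Returns (chi-square statistic, degrees of freedom, p-value).
    """
    log_ors = [math.log(a * d / (b * c)) for a, b, c, d in strata]
    # Inverse of the variance of each log odds ratio
    weights = [1 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in strata]
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    chi2 = sum(w * (l - pooled) ** 2 for w, l in zip(weights, log_ors))
    df = len(strata) - 1
    return chi2, df, chi2_sf(chi2, df)

# Asbestos and lung cancer, stratified by smoking (smokers, nonsmokers)
asbestos_by_smoking = [(517, 160, 183, 340), (176, 160, 124, 340)]
chi2, df, p = woolf_test(asbestos_by_smoking)
print(f"Woolf chi2 = {chi2:.2f}, df = {df}, p = {p:.4f}")
```

For the asbestos strata (OR 6.0 in smokers, 3.0 in nonsmokers) this calculation gives a chi-square of about 12 on 1 degree of freedom, which supports reading smoking as an effect modifier; that figure is computed here and is not reported in the slides.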
• Breslow-Day test
• Woolf test
• Test for trends: chi-square

How to conduct a stratified analysis?
Crude analysis, then stratified analysis:
1. Do stratum-specific estimates look different?
2. Do the 95% CIs of the OR/RR NOT overlap?
3. Is the test of homogeneity significant?
• YES: EFFECT MODIFICATION (report estimates by stratum)
• NO: check for confounding (compare crude RR/OR with MH RR/OR)

Stratified analysis: effect modification
ORs / RRs different across strata
• 95% CIs of the ORs / RRs do not overlap: effect modification
• 95% CIs of the ORs / RRs overlap: use Woolf's test
  – Woolf's test significant: effect modification
  – Woolf's test not significant: effect modification unlikely; discuss lack of power of Woolf's test

Death from diarrhoea according to breast feeding, Brazil, 1980s (crude analysis)
                    Diarrhoea   Controls   OR (95% CI)
No breast feeding      120        136      3.6 (2.4-5.5)
Breast feeding          50        204      Ref.

[Diagram: No breast feeding → Diarrhoea; third variable: Age]

Death from diarrhoea according to breast feeding, Brazil, 1980s
Infants < 1 month of age
                    Cases   Controls   OR (95% CI)
No breast feeding     10        3      32 (6-203)
Breast feeding         7       68      Ref.

Infants ≥ 1 month of age
                    Cases   Controls   OR (95% CI)
No breast feeding    110      133      2.6 (1.7-4.1)
Breast feeding        43      136      Ref.

Woolf test (test of homogeneity): p = 0.03

Risk of gastroenteritis by exposure, Outbreak X, Place, time X (crude analysis)
            Exposed: Yes       Exposed: No
Exposure    n     AR (%)*      n     AR (%)*     RR† (95% CI‡)
pasta       94      77          7      4.2       18.0 (8.8-38)
tuna        49      68         49     24          2.9 (2.1-3.8)
* AR = attack rate
† RR = risk ratio
‡ 95% CI = 95% confidence interval of the RR

[Diagram: Tuna → Gastroenteritis; third variable: Pasta]

Risk of gastroenteritis by exposure, Outbreak X, Place, time X (stratified analysis)
Pasta: Yes
            Cases   Total   AR (%)   RR (95% CI)
Tuna          43      52      83     1.1 (0.9-1.3)
No tuna       46      60      77     Ref.

Pasta: No
            Cases   Total   AR (%)   RR (95% CI)
Tuna           4      17      24     11 (2.6-46)
No tuna        3     144       2     Ref.

Woolf test (test of homogeneity): p = 0.0007

Tuna, pasta and gastroenteritis
Tuna   Pasta   Cases   AR (%)   RR
Yes    Yes       43      83     42
Yes    No         4      23     12
No     Yes       46      76     38
No     No         3       2     Ref.
38 × 12 > 42
38 × 12 × interaction = 42

Risk of HIV by injecting drug use (idu), surveillance data, Spain, 1988-2004
          Cases   Total    AR (%)   RR (95% CI)
Idu        268     2,732    9.8     3.9 (3.3-4.4)
No idu     484    18,822    2.5     Ref.

[Diagram: idu → hiv; third variable: gender]

Risk of HIV by injecting drug use (idu), Spain, 1988-2004 (stratified analysis)
Males
          Cases   Total    AR (%)   RR (95% CI)
Idu         86      693    12       20 (14-28)
No idu      52    8,306     0.6     Ref.

Females
          Cases   Total    AR (%)   RR (95% CI)
Idu        182    2,039     8.9     2.3 (1.9-2.6)
No idu     432   10,576     4.1     Ref.

Woolf test (test of homogeneity): p < 0.00001

Idu, gender and hiv
Idu   Male   Cases   AR (%)   RR
Yes   Yes      86     12.4    3.0
Yes   No      182      8.9    2.2
No    Yes      52      0.6    0.14
No    No      432      4.1    Ref.
0.14 × 2.2 < 3.0
0.14 × 2.2 × interaction = 3.0

Confounding

Confounding
• Distortion of a measure of effect because of a third factor
• Should be prevented
• Needs to be controlled for

Confounding
[Diagram: Skateboarding → Chlamydia; third variable: Age]
Age not evenly distributed between the 2 exposure groups
– Skateboarders: 90% young
– Non-skateboarders: 20% young

[Diagram: Exposure (coffee) → Outcome (lung cancer); third variable (smoking)]

[Diagram: Grey hair → Stroke; third variable: Age]

[Figure: cases of Down syndrome per 100,000 live births, by birth order (1 to 5)]

[Figure: cases of Down syndrome per 100,000 live births, by maternal age group (< 20, 20-24, 25-29, 30-34, 35-39, 40+)]

[Diagram: Birth order → Down syndrome; third variable: Age of mother]

[Figure: cases of Down syndrome per 100,000 live births, by birth order (1 to 5) and mother's age group (< 20 to 40+)]

Confounding
To be a confounding factor, 2 conditions must be met:
[Diagram: Exposure, Outcome, Third variable]
• Be associated with exposure, without being the consequence of exposure
• Be associated with outcome, independently of exposure

[Diagram: Exposure (hypercholesterolaemia) → Third factor (atheroma) → Outcome (myocardial infarction)]
Any factor which is a necessary step in the causal chain is not a confounder
[Diagram: Salt → Hypertension → Myocardial infarction]

The nuisance introduced by confounding factors
• May simulate an association
• May hide an association that does exist
• May alter the strength of the association
  – Increased
  – Decreased

Confounding factor
• Apparent association: [Diagram: Ethnicity → Pneumonia; confounding factor: Crowding]
• Altered strength of association: [Diagram: Crowding → Pneumonia; confounding factor: Malnutrition]

How to prevent/control confounding?
Prevention
– Randomization (experiment)
– Restriction to one stratum
– Matching
Control
– Stratified analysis
– Multivariable analysis

Are Mercedes more dangerous than Porsches?
Type       Total   Accidents   AR %   RR
Porsche    1,000      300       30    1.5
Mercedes   1,000      200       20    Ref.
Total      2,000      500       25
95% CI = 1.3 - 1.8

[Diagram: Car type → Accidents; confounding factor: Age of driver]

< 25 years
Type       Total   Accidents   AR %    RR (95% CI)
Porsche      550      250      45.5    1.14 (0.9-1.3)
Mercedes     300      120      40.0    Ref.

≥ 25 years
Type       Total   Accidents   AR %    RR (95% CI)
Porsche      450       50      11.1    0.97 (0.7-1.4)
Mercedes     700       80      11.4    Ref.

Crude RR = 1.5
Adjusted RR = 1.1 (0.94 - 1.27)

Incidence of malaria according to the presence of a radio set, Kahinbhi Pradesh
Crude data
            Malaria   Total   AR %   RR
Radio set      80       520    15    0.7
No radio      220     1,080    20    Ref.
RR: 0.7; 95% CI: 0.6 - 0.9; p < 0.02

[Diagram: Radio → Malaria; confounding factor: Mosquito net]

Sleeping under mosquito net
            Malaria   Total   AR %   RR
Radio          30      400     7.5   1.02
No radio       50      680     7.4   Ref.

No mosquito net
            Malaria   Total   AR %   RR
Radio          50      120    41.7   0.98
No radio      170      400    42.5   Ref.

Crude RR = 0.7
Adjusted RR = 1.01

To identify confounding
• Compare the crude measure of effect (RR or OR) with the adjusted (weighted) measure of effect (Mantel-Haenszel RR or OR)

Any statistical test to help us?
When is the OR_MH different from the crude OR?
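The comparison of crude and adjusted estimates can be sketched numerically with the car-accident example. This is a minimal sketch under stated assumptions: the function names `risk_ratio` and `mh_risk_ratio` are hypothetical, the Mantel-Haenszel risk ratio is computed with the standard cohort formula (sum of exposed cases × unexposed total / stratum total, divided by the sum of unexposed cases × exposed total / stratum total), and the relative change is expressed as (adjusted - crude) / crude, which is consistent with the "Adjusted/crude relative change" line of the Stata output in this deck.

```python
def risk_ratio(cases_exp, total_exp, cases_unexp, total_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / total_exp) / (cases_unexp / total_unexp)

def mh_risk_ratio(strata):
    """Mantel-Haenszel adjusted risk ratio.

    strata: list of (cases_exp, total_exp, cases_unexp, total_unexp).
    """
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return num / den

# Porsche (exposed) versus Mercedes (reference), stratified by age of driver
car_strata = [
    (250, 550, 120, 300),  # drivers < 25 years
    (50, 450, 80, 700),    # drivers >= 25 years
]

crude = risk_ratio(300, 1000, 200, 1000)
adjusted = mh_risk_ratio(car_strata)
change = 100 * (adjusted - crude) / crude

print(f"Crude RR = {crude:.2f}")
print(f"MH-adjusted RR = {adjusted:.2f}")
print(f"Adjusted/crude relative change = {change:.1f} %")
```

With these data the adjusted RR comes out close to the 1.1 shown on the slide, and the relative change is a drop of roughly a quarter from the crude RR of 1.5, so age of driver behaves as a confounder in this example.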
• When the adjusted estimate differs from the crude one by more than 10-20 %

Mantel-Haenszel summary measure
• Adjusted or weighted RR or OR
• Advantages of MH
  – Zeroes allowed

OR_MH = Σ (a_i × d_i / n_i) / Σ (b_i × c_i / n_i)

Mantel-Haenszel summary measure
• Mantel-Haenszel (adjusted or weighted) OR

Stratum 1
         Cases   Controls
Exp+      a1        b1
Exp-      c1        d1
Total: n1

Stratum 2
         Cases   Controls
Exp+      a2        b2
Exp-      c2        d2
Total: n2

OR_MH = [ (a1 × d1) / n1 + (a2 × d2) / n2 ] / [ (b1 × c1) / n1 + (b2 × c2) / n2 ]

How to conduct a stratified analysis?
Crude analysis, then stratified analysis:
1. Do stratum-specific estimates look different?
2. Do the 95% CIs of the OR/RR NOT overlap?
3. Is the test of homogeneity significant?
• YES: EFFECT MODIFICATION (report estimates by stratum)
• NO: check for confounding (compare crude RR/OR with MH RR/OR)

Risk of gastroenteritis by exposure, Outbreak X, Place, time X (crude analysis)

. cstable case pesto pasta

            Exposed                  Unexposed
Exposure    Total   Cases   AR%      Total   Cases   AR%      Risk Ratio             P
pasta        121      94    77.69     165       7     4.24    18.31 [8.81-38.04]     0.000
pesto         79      45    56.96     212      58    27.36     2.08 [1.56-2.79]      0.000

Stratified analysis

. csinter case pesto, by(pasta)

pasta = Exposed
pesto         Total   Cases   Risk %
Exposed         56      43    76.79
UnExposed       65      51    78.46
Risk difference    -0.02   [-0.17-0.13]
Risk Ratio          0.98   [0.81-1.19]
Attrib.risk.exp     0.02   [-0.19-0.19]
Attrib.risk.pop     0.01   [.-.]

pasta = Unexposed
pesto         Total   Cases   Risk %
Exposed         20       1     5.00
UnExposed      145       6     4.14
Risk difference     0.01   [-0.09-0.11]
Risk Ratio          1.21   [0.15-9.53]
Attrib.risk.exp     0.17   [-5.52-0.90]
Attrib.risk.pop     0.02   [.-.]

Test of Homogeneity (M-H): p-value: 0.8366301
Crude RR for pesto: 2.08 [1.56-2.79]
MH RR for pesto adjusted for pasta: 0.99 [0.81-1.20]
Adjusted/crude relative change: -52.67 % (> 10-20%)

Examples of stratified analysis
Example      1      2      3      4      5
Stratum 1    4.00   1.01   3.05   1.02   1.07
Stratum 2    4.00   1.03   5.20   1.86   9.40
Crude RR     4.00   4.00   4.00   4.00   4.00

Effect modifier
• Belongs to nature
• Different effects in different strata
• Simple
• Useful
  – Increases knowledge of biological mechanism
  – Allows targeting of PH action

Confounding factor
• Belongs to study
• Weighted RR different from crude RR
• Distortion of effect
• Creates confusion in data
• Prevent (protocol)
• Control (analysis)

Analyzing a third factor
Examine the crude OR / RR, then examine the ORs / RRs in each stratum
• Different ORs / RRs across strata: effect modification
  – Stop the analysis. DO NOT adjust!
  – Report MULTIPLE ORs / RRs, one for each stratum
• Identical ORs / RRs across strata
  – Strata ORs / RRs similar to crude (crude value falls between strata): third factor does not play a role; report ONE crude OR / RR
  – Strata ORs / RRs different from crude (crude value does not fall between strata): confounding factor; adjust using the M-H technique to eliminate the confounding; report ONE adjusted OR / RR

How to conduct a stratified analysis
• Perform crude analysis: measure the strength of association
• List potential effect modifiers and confounders
• Stratify data according to potential modifiers or confounders
• Check for effect modification
• If effect modification present, show the data by stratum
• If no effect modification present, check for confounding
  – If confounding, show adjusted data
  – If no confounding, show crude data

How to define the strata?
• Strata defined according to third variable:
  – 'Usual' confounders (e.g. age, sex, socio-economic status)
  – Any other suspected confounder, effect modifier or additional risk factor
  – Stratum of public health interest
• For two risk factors:
  – stratify on one to study the effect of the second on outcome
• Two or more exposure categories:
  – each is a stratum
• Residual confounding?

Logical order of data analysis
How to deal with multiple risk factors:
• Crude analysis
• Multivariable analysis
  1. Stratified analysis
  2. Modelling
     – linear regression
     – logistic regression

Multivariable analysis
• Mathematical model
• Simultaneous adjustment of all confounding and risk factors
• Can address effect modification

A train can mask a second train
A variable can mask another variable

Back-up slides

Risk factors for Salmonella enteritidis infections, France, 1995
Delarocque-Astagneau et al. Epidemiol. Infect. 1998;121:561-7

Cases of Salmonella enteritidis gastroenteritis according to egg storage and season
                 Duration of storage   Cases   Controls   OR (95% CI)
Summer           ≥ 2 weeks               12        2      7.4 (1.5-69.9)
                 < 2 weeks               52       64
Other seasons    ≥ 2 weeks                7        3      2.6 (0.5-16.8)
                 < 2 weeks               32       36
All seasons      ≥ 2 weeks               19        5      4.5 (1.5-16.1)
                 < 2 weeks               84      100

[Diagram: Duration of storage → Salmonellosis; third variable: Season]

Cases of Salmonella enteritidis gastroenteritis according to egg storage and season
Summer (A)   "Long" storage (B)   Cases   Controls   OR
Yes          Yes                    12        2      OR_AB = 6.8
Yes          No                     52       64      OR_A = 0.9
No           Yes                     7        3      OR_B = 2.6
No           No                     32       36      Ref.

Advantages & Disadvantages of Stratified Analysis
• Advantages
  – straightforward to implement and comprehend
  – easy way to evaluate interaction
• Disadvantages
  – only one exposure-disease association at a time
  – requires continuous variables to be grouped
    • loss of information; possible "residual confounding"
  – deteriorates with multiple confounders
    • e.g. suppose 4 confounders with 3 levels each: 3 × 3 × 3 × 3 = 81 strata needed
    • unless the sample is huge, many cells contain "0" and strata have undefined effect measures
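As a closing illustration, the Mantel-Haenszel odds ratio defined earlier in the deck, OR_MH = Σ (a_i × d_i / n_i) / Σ (b_i × c_i / n_i), can be applied to the egg-storage data from the back-up slides. This is a minimal sketch: the function names `odds_ratio` and `mh_odds_ratio` are hypothetical, and the adjusted value it prints is computed here rather than taken from the slides.

```python
def odds_ratio(a, b, c, d):
    """OR = ad / bc for exposed cases a, exposed controls b,
    unexposed cases c, unexposed controls d."""
    return (a * d) / (b * c)

def mh_odds_ratio(strata):
    """Mantel-Haenszel adjusted odds ratio over a list of (a, b, c, d) strata.

    A zero cell in one stratum does not break the calculation,
    because only the sums over strata are divided.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Egg storage >= 2 weeks (exposed) versus < 2 weeks, by season
egg_storage = {
    "Summer": (12, 2, 52, 64),
    "Other seasons": (7, 3, 32, 36),
}

for season, cells in egg_storage.items():
    print(f"{season}: OR = {odds_ratio(*cells):.1f}")

crude = odds_ratio(19, 5, 84, 100)  # all seasons combined
adjusted = mh_odds_ratio(list(egg_storage.values()))
change = 100 * (adjusted - crude) / crude

print(f"Crude OR = {crude:.2f}")
print(f"MH-adjusted OR = {adjusted:.2f}")
print(f"Adjusted/crude relative change = {change:.1f} %")
```

In this calculation the adjusted OR stays very close to the crude OR of 4.5, so season does not appear to confound the association, while the stratum-specific ORs (7.4 and 2.6) remain the figures to inspect when asking whether season modifies the effect.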