Matching (in case control studies)
James Stuart, Fernando Simón
EPIET Dublin, 2006

Remember confounding…
A confounding factor is a variable independently associated with
• exposure of interest
• outcome
that distorts the measurement of association

Control of confounders
In the study design
• Restriction
• Matching
In the analysis
• Stratification
• Multivariate analysis

Matching
Selection of controls to match specific characteristics of cases
a) Frequency matching
Select controls to get the same distribution of a variable as the cases (e.g. age group)
b) Individual matching
Select a specific control per case by matching variable (e.g. date of birth)

… but matching introduces bias
because controls are no longer representative of the source population.
To remove this selection bias:
• Stratify the analysis by the matching criteria: a matched design requires a matched analysis
• The effect of the matching variables on the outcome cannot be studied

a) Frequency matching
Useful if the distribution of cases for a confounding variable differs markedly from the distribution of that variable in the source population

a) Frequency matching

Age (years)   Cases   Controls (unmatched)   Controls (matched)
0-14            50            10                    50
15-29           30            25                    30
30-44           15            25                    15
45+              5            40                     5
TOTAL          100           100                   100

a) Frequency matching: analysis
• Mantel-Haenszel odds ratio (weighted)

$$\mathrm{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$$

• Conditional logistic regression for multiple variables

a) Frequency matching: analysis
• Keep stratification by age group

0-14 years
            Exposed: Yes   Exposed: No   Total
Cases          45 (a)         5 (c)        50
Controls       30 (b)        20 (d)        50
Total          75            25           100 (n_i)

a d / n_i = 900 / 100 = 9
b c / n_i = 150 / 100 = 1.5

a) Frequency matching: analysis

15-29 years
            Exposed: Yes   Exposed: No   Total
Cases          15 (a)        15 (c)        30
Controls        4 (b)        26 (d)        30
Total          19            41            60 (n_i)

a d / n_i = 390 / 60 = 6.5
b c / n_i = 60 / 60 = 1.0

Same process for each age group:

$$\mathrm{OR}_{MH} = \frac{9 + 6.5 + \dots}{1.5 + 1.0 + \dots}$$

b) Individual matching
Each pair can be considered as one stratum
4 possible outcomes per pair

                                   Case          Control
Pair type                        E+    E-       E+    E-      ad    bc
Both exposed                      1     0        1     0       0     0
Case exposed, control not         1     0        0     1       1     0
Neither exposed                   0     1        0     1       0     0
Control exposed, case not         0     1        1     0       0     1

ad = zero unless case exposed, control not exposed
bc = zero unless control exposed, case not exposed

b) Individual matching

$$\mathrm{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$$

The only pairs that contribute to the OR are the discordant pairs:

OR_MH = (sum of discordant pairs where case exposed) / (sum of discordant pairs where control exposed)

b) Individual matching
Change the way of presenting case and control data to show them in pairs:

                        Controls
                   Exposed      Unexposed
Cases  Exposed        e         f (ad = 1)
       Unexposed   g (bc = 1)       h

OR_MH = (sum of discordant pairs where case exposed) / (sum of discordant pairs where control exposed) = f/g

b) Individual matching: for n controls
Each set is analysed in pairs; the case is used in as many pairs as it has controls

Set     Case   Exposed controls (of 4)   Case+/Ctrl-   Case-/Ctrl+
1        +               1                    3             0
2        +               3                    1             0
3        -               0                    0             0
4        +               1                    3             0
5        -               1                    0             1
6        +               3                    1             0
7        +               4                    0             0
Total                                         8             1

OR = (pairs with case exposed, control not) / (pairs with case not exposed, control exposed) = 8/1 = 8

Matched study: example
• 20 cases of cryptosporidiosis
• Hypothesis: associated with attendance at local swimming pool
• 2 matched studies conducted
(i) controls from same general practice and nearest date of birth
(ii) case nominated (friend) controls

Analysis: GP and age matched controls
Swimming pool exposure

                Controls
               +       -
Cases   +      1      15
        -      1       3

OR = f/g = 15/1 = 15.0

Analysis: friend controls
Swimming pool exposure

                Controls
               +       -
Cases   +     13       3
        -      1       3

OR = f/g = 3/1 = 3.0

Why do matched studies?
• Random sample may not be possible
• Quick and easy way to get controls
• Improves efficiency of study (smaller sample size)
• Can control for confounding due to factors that are difficult to measure, or even for unknown confounders

Disadvantages of matching
• Cannot examine risks associated with the matching variable
• If no matching control can be identified (more likely with too many matching variables), the case's data are lost, and vice versa
• Overmatching on the exposure of interest will bias the OR towards 1
• There may be residual confounding in frequency matching

Over-matching
• Matching on exposure to the risk factor of interest
• Under-estimates the true association
• May fail to find a true association

Key points
• Matching controls for confounding factors in the study design
• A matched design requires a matched analysis
• Matching for variables that are not confounders complicates the design
• Frequency matching is simpler than individual matching
• Multivariable analysis reduces the need to match
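
The two estimators used in these slides can be checked with a short Python sketch. It is illustrative only: the function names (`mantel_haenszel_or`, `matched_pair_or`, `pairs_to_strata`) are hypothetical and not part of the course material. The slides show only two of the four age strata, so the stratified result below is a partial illustration, not the final estimate for the frequency-matched example.

```python
from fractions import Fraction


def mantel_haenszel_or(strata):
    """Mantel-Haenszel OR = sum(a*d/n) / sum(b*c/n) over strata.

    Each stratum is (a, b, c, d) using the slide notation:
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    numerator = Fraction(0)
    denominator = Fraction(0)
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += Fraction(a * d, n)
        denominator += Fraction(b * c, n)
    if denominator == 0:
        raise ValueError("no stratum contributes to the denominator")
    return numerator / denominator


def matched_pair_or(f, g):
    """Matched-pair OR = f / g (discordant pairs only)."""
    if g == 0:
        raise ValueError("no pairs with control exposed and case unexposed")
    return Fraction(f, g)


def pairs_to_strata(e, f, g, h):
    """Expand pair counts into one (a, b, c, d) stratum per pair."""
    return (
        [(1, 1, 0, 0)] * e    # both exposed
        + [(1, 0, 0, 1)] * f  # case exposed, control not (ad = 1)
        + [(0, 1, 1, 0)] * g  # control exposed, case not (bc = 1)
        + [(0, 0, 1, 1)] * h  # neither exposed
    )


# Only the two age strata shown in the slides (0-14 and 15-29 years).
shown_strata = [(45, 30, 5, 20), (15, 4, 15, 26)]
partial_or = mantel_haenszel_or(shown_strata)

# Swimming pool example, GP and age matched controls: e=1, f=15, g=1, h=3.
gp_or = matched_pair_or(15, 1)
pairs_as_strata_or = mantel_haenszel_or(pairs_to_strata(1, 15, 1, 3))

print(float(partial_or), float(gp_or), float(pairs_as_strata_or))
```

Treating every pair as its own stratum gives the same value as f/g, which is the point of the "each pair can be considered as one stratum" slide: concordant pairs add zero to both sums, and each discordant pair adds 1/2 to either the numerator or the denominator.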