Matching in case control studies
Yvan Hutin

[Map: Cases of acute hepatitis (E) by residence, Girdharnagar, Gujarat, India, 2008. Legend: attack rate per 1,000; water pumping station; leak; drain overflow]

Risk of hepatitis by place of residence, Girdharnagar, Gujarat, India, 2008

Source of water                      Hepatitis   No hepatitis    Total
Leaking pipes / overflowing drain          144          8,694    8,838
No leakages / overflowing drain             89         12,436   12,525
Total                                      233         21,130   21,363

RR = 2.3, Chi-square = 41.1, df = 1, P < 0.001

[Map: Attack rate of acute hepatitis (E) by zone of residence, Baripada, Orissa, India, 2004. Legend: attack rate per 1,000 (0-0.9, 1-9.9, 10-19.9, 20+); underground water supply; pump from river bed]

Case-control study methods, acute hepatitis outbreak, Baripada, Orissa, India, 2004
• Cases
  – All cases identified through active case search
• Controls
  – Equal number of controls selected from affected wards but in households without cases
• Data collection
  – Reported source of drinking water
  – Common events
  – Restaurants

Consumption of pipeline water among acute hepatitis cases and controls, Baripada, Orissa, India, 2004

                               Acute hepatitis   Control   Total
Drank pipeline water                       493       134     627
Did not drink pipeline water                45       404     449
Total                                      538       538    1076

Adjusted odds ratio = 33, 95% confidence interval: 23-47

Key elements
• The concept of matching
• The matched analysis
• Pros and cons of matching

Controlling a confounding factor
• Stratification
• Restriction
• Matching
• Randomization
• Multivariate analysis

The concept of matching
• Confounding is anticipated
  – Adjustment will be necessary
• Preparation of the strata a priori
  – Recruitment of cases and controls
    • By strata
    • To ensure sufficient strata size
• If cases are made identical to controls for the matching variable, the difference must be explained by the exposure investigated

Consequence...
• The problem:
  – Confounding
• Is solved with another problem:
  – Introduction of more confounding,
  – so that stratified analysis can eliminate it.

Definition of matching
• Creation of a link between cases and controls
• This link is:
  – Based upon common characteristics
  – Created when the study is designed
  – Kept through the analysis

Types of matching strategies
• Frequency matching
  – Large strata
• Set matching
  – Small strata
  – Sometimes very small (1/1: pairs)

[Diagram: Unmatched control group. Cases and controls: a bag of cases and a bag of controls]

[Diagram: Matched control group. Cases and controls: sets of cases and controls that cannot be dissociated]

Matching: False preconceived ideas
• Matching is necessary for all case-control studies
• Matching needs to be done on age and sex
• Matching is a way to adjust the number of controls to the number of cases

Matching: True statements
• Matching can get you into trouble
• Matching can be useful to quickly recruit controls

Matching criteria
• Potential confounding factors
  – Associated with exposure
  – Associated with the outcome
• Criteria
  – Unique
  – Multiple
  – Always justified

Risk factors for microsporidiosis among HIV-infected patients
• Case control study
• Exposure
  – Food preferences
• Potential confounder
  – CD4 / mm3
• Matching by CD4 category
• Analysis by CD4 categories

Mantel-Haenszel adjusted odds ratio

\[ \mathrm{OR}_{M-H} = \frac{\sum_i (a_i d_i / T_i)}{\sum_i (b_i c_i / T_i)} \]

Matched analysis by set (Pairs of 1 case / 1 control)
• Concordant pairs
  – Cases and controls have the same exposure
  – No ad or bc products: no input to the calculation

Both exposed: no effect

               Cases   Controls   Total
Exposed            1          1       2
Non-exposed        0          0       0
Total              1          1       2

Both non-exposed: no effect

               Cases   Controls   Total
Exposed            0          0       0
Non-exposed        1          1       2
Total              1          1       2

Matched analysis by set (Pairs of 1 case / 1 control)
• Discordant pairs
  – Cases and controls have different exposures
  – ad and bc products: input to the calculation

Case exposed, control non-exposed: positive association

               Cases   Controls   Total
Exposed            1          0       1
Non-exposed        0          1       1
Total              1          1       2

Case non-exposed, control exposed: negative association

               Cases   Controls   Total
Exposed            0          1       1
Non-exposed        1          0       1
Total              1          1       2

The Mantel-Haenszel odds ratio...

\[ \mathrm{OR}_{M-H} = \frac{\sum_i (a_i d_i / T_i)}{\sum_i (b_i c_i / T_i)} \]

...becomes the matched odds ratio

\[ \mathrm{OR}_{M-H} = \frac{\text{Discordant sets, case exposed}}{\text{Discordant sets, control exposed}} \]

...and the analysis can be done with paper clips!
• Concordant questionnaires: Trash
• Discordant questionnaires: On the scale
  – The "exposed case" pairs weigh for a positive association
  – The "exposed control" pairs weigh for a negative association

Analysis of matched case control studies with more than one control per case
• Sort out the sets according to the exposure status of the cases and controls
  Example for 1 case / 2 controls
  – Sets with case exposed: +/++, +/+-, +/--
  – Sets with case unexposed: -/++, -/+-, -/--
• Count reconstituted case-control pairs for each type of set
• Multiply the number of discordant pairs in each type of set by the number of sets
• Calculate odds ratio using the f/g formula

The old 2 x 2 table...

               Cases   Controls   Total
Exposed            a          b      L1
Unexposed          c          d      L0
Total             C1         C0       T

Odds ratio: ad/bc

... is difficult to recognize!
                        Controls
Cases          Exposed   Unexposed   Total
Exposed              e           f       a
Unexposed            g           h       c
Total                b           d       P (T/2)

Odds ratio: f/g

The McNemar chi-square

\[ \chi^2_{McN} = \frac{(f - g)^2}{f + g} \]

Matching: Advantages
• Easy to communicate
• Useful for strong confounding factors
• May increase power of small studies
• May ease control recruitment
• Suits studies where only one factor is studied
• Allows looking for interaction with matching criteria

Matching: Disadvantages
✘ Must be understood by the author
✘ Is deleterious in the absence of confounding
✘ Can decrease power
✘ Can complicate control recruitment
✘ Is limiting if more than one factor is studied
✘ Does not allow examining the matching criteria

Matching with a variable associated with exposure, but not with illness (Overmatching)
• Reduces variability
• Increases the number of concordant pairs
• Has deleterious consequences:
  – If matched analysis: reduction of power
  – If match broken: odds ratio biased towards one

Hidden matching ("Crypto-matching")
• Some control recruitment strategies amount de facto to matching
  – Neighbourhood controls
  – Friend controls
• Matching must be identified and taken into account in the analysis

Matching for operational reasons
• Outbreak investigation setting
• Friend or neighbour controls are a common choice
• Advantages:
  – Allows identifying controls fast
  – Will take care of gross confounding factors
  – May result in some overmatching, which places the investigator on "the safe side"

Breaking the match
• Rationale
  – Matching may limit the analysis
  – Matching may have been decided for operational purposes
• Procedure
  – Conduct matched analysis
  – Conduct unmatched analysis
  – Break the match if the results are unchanged

Take home messages
• Matching is a difficult technique
• Matching design means matched analysis
• Matching can always be avoided
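
Worked sketch: matched odds ratio and McNemar chi-square
The f/g formula and the McNemar chi-square from the matched analysis slides can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function names (discordant_pairs, matched_odds_ratio, mcnemar_chi2) and all counts are hypothetical, and the pair-counting approach assumes every set has the same number of controls per case, as in the 1 case / 2 controls example.

```python
from typing import Iterable, Tuple

# One matched set: (case_exposed, number_of_exposed_controls, number_of_controls).
# This layout is an assumption made for the sketch, not a standard format.
MatchedSet = Tuple[bool, int, int]


def discordant_pairs(sets: Iterable[MatchedSet]) -> Tuple[int, int]:
    """Count reconstituted discordant case-control pairs.

    f = pairs where the case is exposed and the control is unexposed
    g = pairs where the case is unexposed and the control is exposed
    Concordant pairs contribute nothing and are ignored.
    """
    f = 0
    g = 0
    for case_exposed, exposed_controls, total_controls in sets:
        if case_exposed:
            f += total_controls - exposed_controls
        else:
            g += exposed_controls
    return f, g


def matched_odds_ratio(f: int, g: int) -> float:
    """Matched odds ratio f/g."""
    if g == 0:
        raise ValueError("no discordant pairs with the control exposed (g = 0)")
    return f / g


def mcnemar_chi2(f: int, g: int) -> float:
    """McNemar chi-square (f - g)^2 / (f + g), without continuity correction."""
    if f + g == 0:
        raise ValueError("no discordant pairs")
    return (f - g) ** 2 / (f + g)


# Hypothetical 1:1 study: 10 pairs +/+, 20 pairs +/-, 5 pairs -/+, 15 pairs -/-
pairs = (
    [(True, 1, 1)] * 10
    + [(True, 0, 1)] * 20
    + [(False, 1, 1)] * 5
    + [(False, 0, 1)] * 15
)
f, g = discordant_pairs(pairs)
print("f =", f, "g =", g)
print("matched OR =", matched_odds_ratio(f, g))
print("McNemar chi-square =", mcnemar_chi2(f, g))

# Hypothetical 1:2 study: 3 sets of type +/+- and 1 set of type -/++
triplets = [(True, 1, 2)] * 3 + [(False, 2, 2)]
f2, g2 = discordant_pairs(triplets)
print("1:2 design: f =", f2, "g =", g2, "OR =", matched_odds_ratio(f2, g2))
```

Concordant sets drop out of both counts, which is why only the discordant questionnaires end up "on the scale".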