Selection Bias Concepts
Hein Stigum
Presentation, data and programs at: http://folk.uio.no/heins/talks
April 15, H.S.

Questions
Given that the appropriate variables are measured:
Can you adjust for confounding? Yes
Can you adjust for selection bias? Depends on the definition

Contents
• Background
  – Define bias
• Selection bias
  – as effect modification (old concept)
  – as collider stratification bias (new concept)
  – DAG structure
• Examples
• Size and direction of bias

Bias definition
• Bias
  – Frequency: expected risk ≠ true risk
  – Effect: association ≠ causal effect

Selection bias concepts
Concept                                     DAG structure
Effect responders ≠ Effect non-responders   Effect modification
Differential response bias                  Collider stratification bias
Differential loss to follow-up              Collider stratification bias
Healthy worker bias                         Collider stratification bias
Berkson’s bias (case-control)               Collider stratification bias

Selection bias as effect modification

Selection bias: Risk
• Selection of responders
  – The prevalence is different among
    • the responders compared to the full population
    • the responders compared to the non-responders
Population: Rp    Non-responders: R0    Responders: R1
Rp is the weighted mean of R0 and R1

Effect modification
• Selection of responders
  – The effect of E on D is different among
    • the responders compared to the full population
    • the responders compared to the non-responders
Population: RRp    Non-responders: RR0    Responders: RR1

Problems
• Is not a bias: RR0 and RR1 are the true effects
• Is effect modification by the selection variable S
• Leads to the conclusions that:
  1. Biological effects are protected from bias
  2. The bias cannot be adjusted for
  3. RRp is the average of RR0 and RR1
  Not true for collider stratification bias
“DAG” structure: [diagram with nodes S, E, D]

Selection bias as collider stratification bias

Example with paths
• Study
  – Milk on bone density
  – Exclude calcium supplements
DAG 1: E = milk, D = bone density, S = calcium supp.
DAG 2: E = milk, D = bone density, S = calcium supp., C = family history
Structure: Collider stratification
([X] = conditioned on X)

DAG 1
Path                   Type        Status
1  E → D               Causal      Open
2  E → [S] ← D         Noncausal   Open

DAG 2
Path                   Type        Status
2  E → [S] ← [C] → D   Noncausal   Closed

Lessons learned:
• Biological effect not protected
• May adjust for selection bias

Examples
• Differential response
  – Survey: Alcohol and CHD
  – DAG: E = alcohol, D = CHD, S = respond, C = education
• Differential loss to follow-up
  – Randomized trial: drug and disease
  – DAG: E = drug, D = disease, S = loss to follow-up, C = smoking
• Healthy worker effect
  – Cross-section: Melt hall dust and lung disease
  – DAG: E = dust, D = lung disease, S = working, C = health
Note: no confounding

Selection bias structure

Paths
1. Causal
2. Confounding: an open non-causal path without colliders
3. Selection bias: a non-causal path that is open due to conditioning on a collider
[Diagrams: three DAGs on nodes A, B, C, E, D, labelled Causal, Confounding and Selection bias; annotation “BCVs?”]

Collider stratification bias
• Selection bias = Collider stratification bias
• Selection bias, path definition
  – A non-causal path that is open due to conditioning on a collider
[Diagrams: example DAGs with selection node S and nodes E, D, C, A, B]

Selection bias examples

Folic acid and cardiac malformation
DAG 1: E = folic acid, D = card. mal., C = live born
Selection: Study only live born
Bias? Yes, E → [C] ← D is open
DAG 2: E = folic acid, D = card. mal., C = live born, S = grief
Selection: Non-grieving parents volunteer
Bias? Yes, E → [C] ← D is (partially) open

Education and unfaithfulness
• Study the effect among couples in a relationship (not divorced)?
DAG: E = education, D = unfaithful, R = divorced, S = sensation seeking

Path               Type        Population   Sample
1  E → D           Causal      Open         Open
2  E → R ← D       Noncausal   Closed       Open
3  E → R ← S → D   Noncausal   Closed       Open
Paths 2 and 3 are open only in the sample: selection bias

Size and Direction of bias
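
Sketch: deriving the responder and non-responder tables
The bias examples in this part apply a response probability to each cell of a population 2×2 table and then compare RR, OR and RD between responders and non-responders. Below is a minimal Python sketch of that calculation, assuming expected counts (no sampling error). The names Table2x2 and split_by_response are illustrative only and are not taken from the programs that accompany the talk. The population counts and response probabilities are those shown in Example 1 and Example 2.

```python
from dataclasses import dataclass


@dataclass
class Table2x2:
    """Counts indexed by exposure E and disease D."""
    e1d1: float  # exposed, diseased
    e1d0: float  # exposed, not diseased
    e0d1: float  # unexposed, diseased
    e0d0: float  # unexposed, not diseased

    def risks(self):
        """Risk of D among the exposed and among the unexposed."""
        r1 = self.e1d1 / (self.e1d1 + self.e1d0)
        r0 = self.e0d1 / (self.e0d1 + self.e0d0)
        return r1, r0

    def rr(self):
        r1, r0 = self.risks()
        return r1 / r0

    def odds_ratio(self):
        return (self.e1d1 * self.e0d0) / (self.e1d0 * self.e0d1)

    def rd(self):
        r1, r0 = self.risks()
        return r1 - r0

    def total(self):
        return self.e1d1 + self.e1d0 + self.e0d1 + self.e0d0


def split_by_response(pop, p11, p10, p01, p00):
    """Expected responder and non-responder tables.

    pXY is the response probability in the cell E=X, D=Y.
    """
    resp = Table2x2(pop.e1d1 * p11, pop.e1d0 * p10,
                    pop.e0d1 * p01, pop.e0d0 * p00)
    non = Table2x2(pop.e1d1 - resp.e1d1, pop.e1d0 - resp.e1d0,
                   pop.e0d1 - resp.e0d1, pop.e0d0 - resp.e0d0)
    return resp, non


population = Table2x2(114, 286, 86, 514)

# Response depends on E only: 0.9 if exposed, 0.3 if unexposed.
resp_e, non_e = split_by_response(population, 0.9, 0.9, 0.3, 0.3)
# Response depends on D only: 0.9 if diseased, 0.3 if not.
resp_d, non_d = split_by_response(population, 0.9, 0.3, 0.9, 0.3)

for label, tab in [("population", population),
                   ("E only, responders", resp_e),
                   ("E only, non-responders", non_e),
                   ("D only, responders", resp_d),
                   ("D only, non-responders", non_d)]:
    print(f"{label:24s} RR={tab.rr():.1f} OR={tab.odds_ratio():.1f} "
          f"RD={tab.rd():.2f}")
```

When response depends on E only, both strata keep an RR of about 2.0. When response depends on D only, the RR differs between strata (about 1.6 among responders and 2.3 among non-responders), while the OR among responders equals the population OR.
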
Example 1, full table

Response probabilities
        D=1     D=0
E=1     0.9     0.9
E=0     0.3     0.3
Response = 54 %
(Adjusted) RRs: E → R 3.0, D → R 1.0, E → D 2.0

Population
        D=1     D=0     Sum
E=1     114     286     400
E=0      86     514     600
Sum                    1000
RR = 2.0, OR = 2.4, RD = 0.14

Responders
        D=1     D=0
E=1     103     257
E=0      26     154
RR = 2.0, OR = 2.4, RD = 0.14

Non-responders
        D=1     D=0
E=1      11      29
E=0      60     360
RR = 2.0, OR = 2.4, RD = 0.14

RR by response stratum: responders (R=1) 2.0, non-responders (R=0) 2.0
[Figure: True and biased RRs, by proportion responding in the (1,1) group]

Example 2

Response probabilities
        D=1     D=0
E=1     0.9     0.3
E=0     0.9     0.3
Response = 42 %
Pattern: Only D influences response
RRs: E → R 1.0, D → R 3.0, E → D 2.0
RR by response stratum: responders (R=1) 1.6, non-responders (R=0) 2.3
Result: RR (and RD) biased, OR unbiased
ODS (outcome-dependent sampling), case-control

Example 3

Response probabilities
        D=1     D=0
E=1     0.9     0.45
E=0     0.45    0.225
Response = 38 %
Pattern: Both E and D influence response
RRs: E → R 2.0, D → R 2.0, E → D 1.0
RR by response stratum: responders (R=1) 1.0, non-responders (R=0) 0.3
Result: Surprise: responders are unbiased
Theory: bias in at least one stratum

Example 4

Response probabilities
        D=1     D=0
E=1     0.45    0.9
E=0     0.225   0.45
Response = 56 %
Pattern: Both E and D influence response
RRs: E → R 2.0, D → R 0.5, E → D 2.0
RR by response stratum: responders (R=1) 2.2, non-responders (R=0) 3.6
Result: Surprise: both strata biased upwards
True RR is not a weighted average of the stratum RRs

Example 5

Response probabilities, scenario A
        D=1     D=0
E=1     0.99    0.66
E=0     0.495   0.33
Response = 51 %
RRs: E → R 2.0, D → R 1.5, E → D 2.0
RR by response stratum: responders (R=1) 1.9, non-responders (R=0) 0.1

Response probabilities, scenario B
        D=1     D=0
E=1     0.5     0.3333
E=0     0.25    0.1667
Response = 26 %
RRs: E → R 2.0, D → R 1.5, E → D 2.0
RR by response stratum: responders (R=1) 1.9, non-responders (R=0) 1.8

Pattern: Both E and D influence response
Result: Same DAG, different results
The DAG does not fully determine the selection!

Summing up
• Selection bias as “effect modification”:
  – Is not a bias, should not be called selection bias
  – Has properties different from proper selection bias
• Selection bias as “collider stratification”:
  – Structure defined in DAG
  – Distinct from confounding
  – Consistent with
    • Differential response bias
    • Differential loss to follow-up
    • Healthy worker bias
    • Berkson’s bias (case-control)
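
Sketch: checking whether a path is open
The path definition of selection bias (a non-causal path that is open due to conditioning on a collider) can be checked mechanically with the usual d-separation rules. Below is a minimal Python sketch; path_is_open and descendants are illustrative names. The arrow directions used for the education and unfaithfulness DAG (E → D, E → R, D → R, S → R, S → D) are an assumption read off the path table of that example, not something the slides state explicitly.

```python
def descendants(edges, node):
    """Return every node reachable from `node` by following arrows."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


def path_is_open(edges, path, conditioned=()):
    """Apply the d-separation rules to one path (a list of node names)."""
    edge_set = set(edges)
    conditioned = set(conditioned)
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        is_collider = (prev, node) in edge_set and (nxt, node) in edge_set
        if is_collider:
            # A collider blocks the path unless it, or one of its
            # descendants, is conditioned on.
            if not (({node} | descendants(edges, node)) & conditioned):
                return False
        elif node in conditioned:
            # A conditioned non-collider blocks the path.
            return False
    return True


# Assumed arrows: E=education, D=unfaithful, R=divorced, S=sensation seeking.
EDGES = [("E", "D"), ("E", "R"), ("D", "R"), ("S", "R"), ("S", "D")]
PATHS = {
    "1 E-D": ["E", "D"],
    "2 E-R-D": ["E", "R", "D"],
    "3 E-R-S-D": ["E", "R", "S", "D"],
}

for name, path in PATHS.items():
    in_population = path_is_open(EDGES, path)        # nothing conditioned on
    in_sample = path_is_open(EDGES, path, {"R"})     # restricted on R
    print(name,
          "| population:", "open" if in_population else "closed",
          "| sample:", "open" if in_sample else "closed")
```

Under the assumed arrows, paths 2 and 3 are closed in the population and open once the sample is restricted on R. Adding S to the conditioning set closes path 3 but leaves path 2 open.
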