Brian C. Sauer, PhD MS
SLC VA Career Development Awardee
• PhD in Pharmacoepidemiology from College of
Pharmacy at University of Florida
• MS in Biomedical Informatics from University of Utah
• Assistant Research Professor in Division of
Epidemiology, Department of Internal
Medicine
• Mentors
– Matthew Samore, MD
– Tom Greene, PhD
– Jonathan Nebeker, MD
• Simulation
– Chen Wang, MS (Statistics)
• Primary References:
– Causal Inference Book: Jamie Robins & Miguel Hernán
• http://www.hsph.harvard.edu/faculty/miguel-hernan/causalinference-book/
– Modern Epidemiology, 3rd Ed., Chapter 12. Rothman, Greenland, Lash
• Causal Inference and the counterfactual framework
• Exchangeability & conditional exchangeability
• Use of Directed Acyclic Graphs (DAGs) to identify a minimal set of covariates to remove confounding.
• Understand the rationale for
– randomized controlled trials.
– covariate selection in observational research.
• Identify the minimal set of covariates needed to produce unbiased effect estimates.
• Develop terminology and language to describe these ideas with precision.
• Become familiar with notation for causal inference, which is a barrier to this literature.
• Neyman (1923)
– Effects of point exposures in randomized experiments
• Rubin (1974)
– Effects of point exposures in randomized and observational studies (potential outcomes and
Rubin Causal Framework)
• Robins (1986)
– Effects of time-varying exposures in randomized and observational studies. (counterfactuals)
Working Example:
• Zeus took the heart pill; 5 days later he died
• Had he not taken the heart pill, he would still be alive on that 5th day
– that is, all other things being equal
• Did the pill cause Zeus’s death?
Working Example:
• Hera didn’t take the pill
– 5 days later she was alive
• Had she taken the pill, she would still be alive 5 days later
• Did the pill cause Hera’s survival?
• Newt Gingrich and
William R Forstchen
• Historical Fiction
– Imagines how the war would have ended had there been a Confederate victory at Gettysburg
• Y=1 if patient died, 0 otherwise
– Y_z = 1, Y_h = 0
• A=1 if patient treated, 0 otherwise
– A_z = 1, A_h = 0

Pat ID | A | Y
Zeus   | 1 | 1
Hera   | 0 | 0
Outcome under No Treatment
• Y^{a=0} = 1 if the subject would have died had he not taken the pill
– Y_z^{a=0} = 0, Y_h^{a=0} = 0
Outcome under Treatment
• Y^{a=1} = 1 if the subject would have died had he taken the pill
– Y_z^{a=1} = 1, Y_h^{a=1} = 0
Pat ID | A | Y^{a=0} | Y^{a=1}
Zeus   | 1 | 0       | 1
Hera   | 0 | 0       | 0
ID      | A | Y | Y^{a=0} | Y^{a=1}
Zeus    | 1 | 1 | ?       | 1
Hera    | 0 | 0 | 0       | ?
Apollo  | 1 | 0 | ?       | 0
Cyclope | 0 | 0 | 0       | ?
Formal definition of causal effects:
– For Zeus: the pill has a causal effect because Y_z^{a=1} ≠ Y_z^{a=0}
– For Hera: the pill does not have a causal effect because Y_h^{a=1} = Y_h^{a=0}
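This individual-level definition can be stated as a minimal Python sketch, using the Zeus/Hera potential outcomes from the slides above:

```python
# Individual causal effects from the working example's potential outcomes:
# Zeus would die only if treated; Hera survives either way.
potential = {
    "Zeus": {"Y_a0": 0, "Y_a1": 1},
    "Hera": {"Y_a0": 0, "Y_a1": 0},
}

def has_individual_effect(po):
    """Treatment has a causal effect for a subject iff Y^{a=1} != Y^{a=0}."""
    return po["Y_a1"] != po["Y_a0"]

for name, po in potential.items():
    print(name, has_individual_effect(po))  # Zeus True, Hera False
```

The pill caused Zeus's death (the two counterfactual outcomes differ) but did not cause Hera's survival (they agree).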
Formal Definition of Average Causal Effects
• In the population, exposure A has a causal effect on the outcome Y if
– Pr[Y^{a=1}=1] ≠ Pr[Y^{a=0}=1]
• The causal null hypothesis holds if
– Pr[Y^{a=1}=1] = Pr[Y^{a=0}=1]
– equivalently, E[Y^{a=1}] = E[Y^{a=0}] for dichotomous Y
Causal effects can be measured on many scales
• Risk difference: Pr[Y^{a=1}=1] − Pr[Y^{a=0}=1]
• Risk ratio: Pr[Y^{a=1}=1] ÷ Pr[Y^{a=0}=1]
• Odds ratio, hazard ratio, etc.
No average causal effect:
• Risk difference: Pr[Y^{a=1}=1] − Pr[Y^{a=0}=1] = 0
• Risk ratio: Pr[Y^{a=1}=1] ÷ Pr[Y^{a=0}=1] = 1
• Are there individual Causal Effects?
Pat ID     | Y^{a=0} | Y^{a=1}
Rheia      | 0       | 1
Kronos     | 1       | 0
Demeter    | 0       | 0
Hades      | 0       | 0
Hestia     | 0       | 0
Poseidon   | 1       | 0
Hera       | 0       | 0
Zeus       | 0       | 1
Artemis    | 1       | 1
Apollo     | 1       | 0
Leto       | 0       | 1
Ares       | 1       | 1
Athena     | 1       | 1
Hephaestus | 0       | 1
Aphrodite  | 0       | 1
Cyclope    | 0       | 1
Persephone | 1       | 1
Hermes     | 1       | 0
Hebe       | 1       | 0
Dionysus   | 1       | 0
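The average causal effect can be computed directly from such a counterfactual table. The values below follow Table 1.1 of Hernán & Robins' Causal Inference book, which this slide's table is drawn from:

```python
# Average causal effect from counterfactual outcomes (Y^{a=0}, Y^{a=1});
# values follow Table 1.1 of Hernán & Robins' Causal Inference book.
counterfactuals = {
    "Rheia": (0, 1), "Kronos": (1, 0), "Demeter": (0, 0), "Hades": (0, 0),
    "Hestia": (0, 0), "Poseidon": (1, 0), "Hera": (0, 0), "Zeus": (0, 1),
    "Artemis": (1, 1), "Apollo": (1, 0), "Leto": (0, 1), "Ares": (1, 1),
    "Athena": (1, 1), "Hephaestus": (0, 1), "Aphrodite": (0, 1),
    "Cyclope": (0, 1), "Persephone": (1, 1), "Hermes": (1, 0),
    "Hebe": (1, 0), "Dionysus": (1, 0),
}

n = len(counterfactuals)
risk_a0 = sum(y0 for y0, _ in counterfactuals.values()) / n  # Pr[Y^{a=0}=1]
risk_a1 = sum(y1 for _, y1 in counterfactuals.values()) / n  # Pr[Y^{a=1}=1]

risk_difference = risk_a1 - risk_a0  # 0.0 -> causal null holds on average
risk_ratio = risk_a1 / risk_a0       # 1.0 -> causal null holds on average

# Yet individual causal effects exist wherever Y^{a=1} != Y^{a=0}:
n_individual_effects = sum(y0 != y1 for y0, y1 in counterfactuals.values())
```

The risk difference is 0 and the risk ratio is 1 (no average causal effect), even though 12 of the 20 subjects have individual causal effects.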
Definition
• Causal effects are calculated by contrasting counterfactual risks within the population.
• Counterfactual contrasts are by definition causal effects.
Observed in the real world – association ≠ causation
• Pr[Y=1|A=1] = Pr[Y=1|A=0]: treatment A and outcome Y are independent
• We can also quantify the strength of association – risk difference, risk ratio, OR, HR, etc.
• Pr[Y=1|A=1] − Pr[Y=1|A=0] = 0
• Pr[Y=1|A=1] ÷ Pr[Y=1|A=0] = 1
The key conceptual difference:
• A causal effect is defined as a comparison of the same subjects under different actions
– Assumes the counterfactual approach
– Everyone simultaneously treated and untreated
– Marginal effects
• Association is defined as a comparison of different subjects under different conditions
– Effects conditional on treatment assignment group
Question:
• Under what conditions can associational measures be used to estimate causal effects?
Answer:
• Ideal randomized experiments
• Randomization still leaves each subject’s counterfactual outcome under the other treatment missing
• The missing counterfactuals are missing completely at random (MCAR)
• Because of this, causal effects can be consistently estimated with ideal RCTs despite the missing data
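A small Python simulation sketch of this point (all numbers invented): each subject carries both potential outcomes, a coin flip decides which one is observed, and the associational risks recover the counterfactual risks.

```python
# Simulation sketch of an ideal RCT: each subject carries both potential
# outcomes, but randomization decides which one we observe, so the
# unobserved counterfactual is missing completely at random (MCAR).
import random

random.seed(7)
n = 200_000
treated_deaths = untreated_deaths = n_treated = 0

for _ in range(n):
    y_a0 = 1 if random.random() < 0.5 else 0  # Pr[Y^{a=0}=1] = 0.5
    y_a1 = 1 if random.random() < 0.5 else 0  # Pr[Y^{a=1}=1] = 0.5 (causal null)
    a = random.randint(0, 1)                  # randomized treatment
    y = y_a1 if a == 1 else y_a0              # consistency: observed outcome
    if a == 1:
        n_treated += 1
        treated_deaths += y
    else:
        untreated_deaths += y

# Associational risks consistently estimate the counterfactual risks:
assoc_risk_treated = treated_deaths / n_treated            # ~ Pr[Y^{a=1}=1]
assoc_risk_untreated = untreated_deaths / (n - n_treated)  # ~ Pr[Y^{a=0}=1]
```

Both associational risks come out near 0.5, the true counterfactual risks, despite half of each subject's data being missing.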
Exchangeability:
• The risk under the potential treatment value a among the treated equals the risk under the potential treatment value a among the untreated
• Pr[Y^a=1|A=1] = Pr[Y^a=1|A=0]
• Because these conditional risks are equal in all subsets defined by treatment status, they must also equal the marginal risk under treatment value a in the whole population
• Under exchangeability, the counterfactual risk under treatment in the treated subset of the population equals the counterfactual risk under treatment in the entire population.
• A= heart transplant
• Y= death
• L=prognostic factor
• Counts
– 13 of 20 (65%) gods were treated
– 9 of the 12 (75%) with the prognostic factor (L=1) were treated
– 3 of the 7 (43%) not treated had the prognostic factor
Pat ID     | L | A | Y
Rheia      | 0 | 0 | 0
Kronos     | 0 | 0 | 1
Demeter    | 0 | 0 | 0
Hades      | 0 | 0 | 0
Hestia     | 0 | 1 | 0
Poseidon   | 0 | 1 | 0
Hera       | 0 | 1 | 0
Zeus       | 0 | 1 | 1
Artemis    | 1 | 0 | 1
Apollo     | 1 | 0 | 1
Leto       | 1 | 0 | 0
Ares       | 1 | 1 | 1
Athena     | 1 | 1 | 1
Eros       | 1 | 1 | 1
Aphrodite  | 1 | 1 | 1
Cyclope    | 1 | 1 | 1
Persephone | 1 | 1 | 1
Hermes     | 1 | 1 | 0
Hebe       | 1 | 1 | 0
Dionysus   | 1 | 1 | 0
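From this observed data the associational (crude) risks follow directly. The rows below follow Table 2.2 of Hernán & Robins, which this slide's table mirrors (with Eros listed where the book lists Hephaestus):

```python
# Crude (associational) risks from the heart-transplant example.
# Rows are (L, A, Y) triples following Table 2.2 of Hernán & Robins.
data = {
    "Rheia": (0, 0, 0), "Kronos": (0, 0, 1), "Demeter": (0, 0, 0),
    "Hades": (0, 0, 0), "Hestia": (0, 1, 0), "Poseidon": (0, 1, 0),
    "Hera": (0, 1, 0), "Zeus": (0, 1, 1), "Artemis": (1, 0, 1),
    "Apollo": (1, 0, 1), "Leto": (1, 0, 0), "Ares": (1, 1, 1),
    "Athena": (1, 1, 1), "Eros": (1, 1, 1), "Aphrodite": (1, 1, 1),
    "Cyclope": (1, 1, 1), "Persephone": (1, 1, 1), "Hermes": (1, 1, 0),
    "Hebe": (1, 1, 0), "Dionysus": (1, 1, 0),
}

treated = [y for (_, a, y) in data.values() if a == 1]
untreated = [y for (_, a, y) in data.values() if a == 0]

pr_y1_given_a1 = sum(treated) / len(treated)      # Pr[Y=1|A=1] = 7/13
pr_y1_given_a0 = sum(untreated) / len(untreated)  # Pr[Y=1|A=0] = 3/7
crude_rr = pr_y1_given_a1 / pr_y1_given_a0
```

The crude risk ratio is about 1.26 even though the causal risk ratio is 1; the gap is confounding by the prognostic factor L.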
• Design 1:
– 13 of 20 treated: Randomly selected 65% for treatment
• Design 2:
– 9 of 12 (75%) in critical condition were treated
– 4 of 8 (50%) not in critical condition were treated
• Simply a combination of two marginally randomized experiments:
one conducted in the subset of the population with L=0, the other in the subset with L=1
• Values are not MCAR, but they are MAR conditional on the covariate L
• Marginal exchangeability is not achieved
• Randomization generates conditional exchangeability
• Question 1: How do you typically analyze a marginally randomized experiment?
• Hint: dependent and independent variables?
• Answer: A crude or unadjusted analysis with treatment and outcome.
• Question 2: How do you typically analyze a conditionally randomized experiment?
• Answer 2:
– Robins recommends standardization and IPW
– Stratification-type methods are more common
– Conditions where standardization ≠ stratification are reviewed in their text.
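Both recommended analyses can be sketched in Python on the conditionally randomized heart-transplant data; the rows are (L, A, Y) triples following Table 2.2 of Hernán & Robins, which this example mirrors:

```python
# Standardization and inverse probability weighting (IPW) on the
# conditionally randomized heart-transplant data (L, A, Y triples
# following Table 2.2 of Hernán & Robins).
rows = [
    (0, 0, 0), (0, 0, 1), (0, 0, 0), (0, 0, 0),
    (0, 1, 0), (0, 1, 0), (0, 1, 0), (0, 1, 1),
    (1, 0, 1), (1, 0, 1), (1, 0, 0),
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1),
    (1, 1, 0), (1, 1, 0), (1, 1, 0),
]
n = len(rows)

def pr_y_given(a, l):
    """Pr[Y=1 | A=a, L=l] from the observed data."""
    ys = [y for (li, ai, y) in rows if ai == a and li == l]
    return sum(ys) / len(ys)

def pr_l(l):
    return sum(1 for (li, _, _) in rows if li == l) / n

# Standardization: Pr[Y^a=1] = sum_l Pr[Y=1|A=a,L=l] * Pr[L=l]
std_risk = {a: sum(pr_y_given(a, l) * pr_l(l) for l in (0, 1)) for a in (0, 1)}

def pr_a_given_l(a, l):
    """Pr[A=a | L=l]: the (conditional) treatment probability."""
    sub = [ai for (li, ai, _) in rows if li == l]
    return sum(1 for ai in sub if ai == a) / len(sub)

# IPW: weight each observed outcome by 1 / Pr[A = observed a | L]
ipw_risk = {}
for a in (0, 1):
    ipw_risk[a] = sum(y / pr_a_given_l(a, l)
                      for (l, ai, y) in rows if ai == a) / n
```

Standardization averages the stratum-specific risks over the distribution of L; IPW reweights each subject by the inverse probability of the treatment actually received. Both return a risk of 0.5 under either treatment value, recovering the causal null that the crude analysis misses.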
• Randomization produces
– Marginal exchangeability
– Conditional exchangeability
• Exchangeability
– Allows us to use associational measure to estimate causal effects
• The investigator has no control over treatment assignment (e.g., no randomization)
• Cannot achieve exchangeability by design
• To estimate a causal contrast we must obtain valid observable substitute quantities for the desired counterfactual quantities
• If we don’t have good substitutes, then we have a confounded relationship, i.e., the associational RR ≠ causal RR
Conceptual justification
• Conceptualize observational studies as though they are conditionally randomized experiments.
• We assume that some components of the observational study happen by chance.
• Consistency: treatment levels are not assigned by the researcher but correspond to well-defined interventions
• Positivity: all conditional probabilities of treatment are greater than zero
• Conditional exchangeability: the conditional probabilities of being assigned to a specific treatment are not chosen by the investigator, but can be calculated from the data
Causal Inference
• Exchangeability and conditional exchangeability cannot be achieved by design.
• Question 1 : How do we address conditional exchangeability in Observational studies?
• Question 2: How should we pick covariates for our observational studies?
• Covariates should be selected to produce conditional exchangeability
• Confounding must be removed to produce conditional exchangeability
– A variable whose adjustment removes confounding is a confounder
• Adjusting for certain types of covariates can either block paths, open paths or do nothing
• We want to adjust variables that block all backdoor paths between the treatment and outcome, i.e., remove confounding.
• Mathematically formalized by
– Pearl (1988, 1995, 2000)
– Spirtes, Glymour, and Scheines (1993, 2000)
• DAGs are abstract mathematical objects.
• They encode an investigator’s a priori assumptions about the causal relations among the exposure, outcomes, and covariates.
• They represent:
– joint probability distributions
– causal structures.
• Support communication among researchers and clinicians
• Explicate our beliefs and background knowledge about causal structures
• Allow us to determine what needs to be measured to remove confounding
• Help us determine how bias can be induced
• Help us choose appropriate statistics
• Directed edges (arrows) linking nodes (variables)
• Variables joined by an arrow are said to be adjacent or neighbors
• Acyclic because there are no arrows from descendants (effects) to ancestors (causes)
• Descendants of a variable X are the variables affected either directly or indirectly by X
• Ancestors of X are all the variables that affect X directly or indirectly
• Paths between two variables can be directed or undirected
• Rules linking absence of open paths to statistical independencies
• Describe expected data distributions if the causal structure represented by the graph is correct
• Unconditional d-separation
– A path is open or unblocked if there is no collider on the path
– A collider blocks a path
• d-connected
– An open path exists between two variables
• Conditioning (adjusting) on a collider F on a path, or on any descendant of F, opens the path at F
– U1 and U2 are marginally independent, but conditionally associated (conditioning on F)
• Conditioning on a non-collider C closes the path and removes C as a source of association between A and Y
– A and Y are marginally associated, but conditionally independent (conditioning on C)
• Statistical: a confounder must
– Be associated with the exposure under study in the source population
– Be a risk factor for the outcome, though it need not actually cause the outcome
– Not be affected by the exposure or the outcome
• Graphical: a confounder must
– Be a common cause
– Have an unblocked back-door path
• Bias can be reduced to or explained by 3 structures
– Reverse causation: e.g., in case-control studies the outcome precedes exposure measurement, or the outcome can affect the exposure; includes measurement error and information bias
– Common cause: confounding, confounding by indication
– Conditioning on common effects: collider, selection bias, time varying confounding
• Adequate Background Knowledge
– Confounder identification must be grounded on an understanding of the causal structure linking the variables being studied (treatment and disease)
– Build a directed acyclic graph (DAG) to check whether the necessary criteria for confounding exists.
– Condition on the minimal set of variables necessary to remove confounding
• Inadequate Background Knowledge
– Remove known instrumental variables, colliders, and intermediates (variables measured post-treatment)
– Use automated selection procedures such as HDPS
• Under-adjustment occurs when
– an open backdoor path was not closed
• Over-adjustment can occur from adjusting for
– Instrumental variables
– Intermediate variables
– Colliders
– Variables caused by outcome
• Discussion of variable types
• Common Cause, i.e., confounder
• Confounder L distorts the effect of treatment A on disease Y
• Always adjust for confounders, unless the data set is small and the confounder has a strong association with treatment and a weak association with the outcome
• Goal is to produce conditional exchangeability
• A = treatment
– a=1 statin alone
– a=0 niacin alone
• L = Baseline Cholesterol
– l=1: LDL ≥ 160 mg/dL
– l=0: LDL < 160 mg/dL
• Y = Myocardial infarction
– Y=1: Yes
– Y=0: No
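This setup can be simulated. All probabilities below are invented for illustration; the only structural assumptions are the slide's: baseline LDL (L) affects both treatment choice (A) and MI (Y), and the true effect of A on Y is null.

```python
# Simulation sketch of confounding by a common cause (probabilities
# invented): high baseline LDL (L=1) makes both statin treatment (A=1)
# and MI (Y=1) more likely, while A has no effect on Y.
import random

random.seed(1)
n = 400_000
counts = {}  # (l, a) -> [subjects, MIs]

for _ in range(n):
    l = 1 if random.random() < 0.4 else 0
    pr_treat = 0.8 if l == 1 else 0.3  # treatment depends on L
    a = 1 if random.random() < pr_treat else 0
    pr_mi = 0.3 if l == 1 else 0.1     # outcome depends on L only (null effect of A)
    y = 1 if random.random() < pr_mi else 0
    counts.setdefault((l, a), [0, 0])
    counts[(l, a)][0] += 1
    counts[(l, a)][1] += y

def risk(a):
    subj = sum(v[0] for (l, ai), v in counts.items() if ai == a)
    mis = sum(v[1] for (l, ai), v in counts.items() if ai == a)
    return mis / subj

crude_rr = risk(1) / risk(0)  # biased upward: statin users look worse

# Within each LDL stratum the risk ratio is ~1 (the truth):
stratum_rr = {l: (counts[(l, 1)][1] / counts[(l, 1)][0]) /
                 (counts[(l, 0)][1] / counts[(l, 0)][0])
              for l in (0, 1)}
```

The crude risk ratio comes out well above 1 (confounding by indication), while the L-stratified risk ratios are approximately 1, the true null.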
• Adjusting for an intermediate variable I in a fixed-covariate model will remove the effect of treatment A on disease/outcome Y
• In a fixed-covariate model we do not want to include variables influenced by A or Y
• Time-varying treatment models do include time-varying confounders that are also intermediate variables
• A = treatment
– a=1 statin alone
– a=0 niacin alone
• I = Post-treatment Cholesterol
– i=1: LDL ≥ 160 mg/dL
– i=0: LDL < 160 mg/dL
• Y = Myocardial infarction
– Y=1: Yes
– Y=0: No
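A simulation sketch of why conditioning on an intermediate is harmful (probabilities invented): here the statin works only by lowering post-treatment LDL (I), and MI risk depends only on I.

```python
# Simulation sketch of over-adjustment for an intermediate: the statin
# (A=1) lowers post-treatment LDL (I), and MI (Y) depends only on I,
# so I carries the entire effect of A. Probabilities are invented.
import random

random.seed(2)
n = 400_000
tab = {}  # (a, i) -> [subjects, MIs]

for _ in range(n):
    a = random.randint(0, 1)              # randomized treatment
    pr_high_ldl = 0.2 if a == 1 else 0.6  # A affects the intermediate I
    i = 1 if random.random() < pr_high_ldl else 0
    pr_mi = 0.3 if i == 1 else 0.1        # Y depends only on I
    y = 1 if random.random() < pr_mi else 0
    tab.setdefault((a, i), [0, 0])
    tab[(a, i)][0] += 1
    tab[(a, i)][1] += y

def risk(a):
    subj = sum(v[0] for (ai, _), v in tab.items() if ai == a)
    mis = sum(v[1] for (ai, _), v in tab.items() if ai == a)
    return mis / subj

crude_rr = risk(1) / risk(0)  # < 1: the real (total) protective effect of A

# Within strata of the intermediate I, A and Y are unassociated:
stratum_rr = {i: (tab[(1, i)][1] / tab[(1, i)][0]) /
                 (tab[(0, i)][1] / tab[(0, i)][0])
              for i in (0, 1)}
```

The crude comparison shows the real protective effect, but within strata of I the treatment-outcome association vanishes: adjusting for the intermediate has "adjusted away" the very effect we want to estimate.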
• Adjusting for the collider C can produce bias
• Conditioning on a common effect F without adjustment for U1 or U2 will induce an association between U1 and U2, which will confound the association between A and Y
• A = antidepressant use
• Y = lung cancer
• U1 = depression
• U2 = smoking status
• F= cardiovascular disease
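This collider structure can be sketched with the slide's variables (all probabilities invented): depression (U1) and smoking (U2) are generated independently, and each raises the risk of cardiovascular disease (F).

```python
# Simulation sketch of collider bias: depression (U1) and smoking (U2)
# are independent causes of cardiovascular disease (F). Conditioning on
# F induces an association between them. Probabilities are invented.
import random

random.seed(3)
n = 400_000
both = u1_count = u2_count = 0
f_n = f_both = f_u1 = f_u2 = 0

for _ in range(n):
    u1 = 1 if random.random() < 0.2 else 0  # depression
    u2 = 1 if random.random() < 0.3 else 0  # smoking (independent of U1)
    pr_f = 0.05 + 0.3 * u1 + 0.3 * u2       # common effect: CVD
    f = 1 if random.random() < pr_f else 0
    u1_count += u1; u2_count += u2; both += u1 * u2
    if f:
        f_n += 1; f_u1 += u1; f_u2 += u2; f_both += u1 * u2

# Marginal independence: Pr[U1=1, U2=1] ~ Pr[U1=1] * Pr[U2=1]
marginal_gap = both / n - (u1_count / n) * (u2_count / n)

# Conditioning on the collider F=1 induces a (negative) association:
conditional_gap = f_both / f_n - (f_u1 / f_n) * (f_u2 / f_n)
```

Marginally, U1 and U2 are (near) independent; restricting to F=1 induces a clearly negative association between them, which can then confound the antidepressant-lung cancer comparison.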
• Inclusion of variables associated with treatment (A) only can cause bias and imprecision
• Variables associated with disease but not treatment (risk factors) can be included in models; they are expected to decrease the variance of the treatment effect without increasing bias
• Including variables associated with disease reduces the chance of missing important confounders
Shrier I, Platt RW. Reducing bias through directed acyclic graphs. BMC Medical Research Methodology. 2008;8:70.
• Produce a DAG and get clinical experts to agree on the underlying causal network
• Block (condition on) the variables that close open backdoor paths
– Open backdoor paths carry confounding
• Pearl’s 6-step approach for determining the minimal set of variables (illustrated by Shrier & Platt. Reducing bias through DAGs. BMC Medical Research Methodology. 2008;8:70)
• Subject matter knowledge is often not good enough to draw a DAG that can be used to determine the minimal set of covariates needed to produce conditional exchangeability
• In large database studies with many providers it is difficult to know all the factors that influence treatment decisions.
• Recommendation :
– Propensity score (PS) approach :
• Remove colliders and instruments (variables associated with treatment but not disease)
• In a large PS study we should include as many of the remaining variables as possible.
• Focus should be on variables that are a priori thought to be strongly causally related to outcomes (risk factors, confounders)
– Outcome models approach:
• Use a change in estimate approach to select variables
– Since evidence on the best variable-selection approaches is limited, researchers should explore the sensitivity of their results to different variable-selection strategies, as well as to removal and inclusion of variables that could be IVs or colliders.
• Fixed treatments
– Propensity Score
– Instrumental Variables
– IPW
• Time-varying treatments (sequential randomization)
– IPW
– G-estimation
– Doubly robust
• We have developed simulations to understand and teach these concepts.
• Poster at CDA conference.
• If interested, please contact me:
• Brian.sauer@va.gov; brian.sauer@utah.edu