Systematic reviews in comparative effectiveness


Jesse A. Berlin, ScD

Johnson & Johnson Pharmaceutical R&D

University of Pennsylvania Clinical Trials Symposium

April 2010


Comparative Effectiveness

• “Comparative effectiveness research (CER) is the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery of care. The purpose of CER is to assist consumers, clinicians, purchasers, and policy makers to make informed decisions that will improve health care at both the individual and population levels.”

(IOM, Initial National Priorities for Comparative Effectiveness Research, 2009)


Principles for conducting meta-analyses of CER studies

• The usual guidelines for systematic reviews still apply

– Need a protocol (yes, in advance of doing the work)

– Ideally, define the question independently of knowledge of the trial results (truly prospective meta-analysis)

– Worry about accuracy of data extraction

– “Appropriate” data analysis is always a good idea


What studies to include in the systematic review?

• Any (?) appropriate (meeting pre-defined entry and exclusion criteria) randomized studies (Treatment A vs. Treatment B)

• Think PICOTS (vs. usual registration trials?)

– Patient populations

– Interventions

– Comparators

– Outcome measures of interest

– Timing

– Settings


What about non-inferiority studies?

• Include non-inferiority (NI) studies that directly compare A with B?

• Must understand the earlier placebo-controlled studies

– Principles outlined in FDA draft guidance on NI studies

• Steps to ensure that the “active” comparator has the anticipated effect

– Do we require that the active comparator separates from placebo?

• If the goal is to show superiority, then including NI studies may actually underestimate the true difference?

– Expected bias from “sloppy” trials is toward the null

– HOWEVER – could just be showing superiority to an active comparator that didn’t really “work” in this study


Why network meta-analysis / mixed treatment comparisons might be helpful

• Evidence from head to head comparison trials is often limited or unavailable

• It allows all evidence to be combined in a single analysis

– Inference based on more evidence can (usually) provide more precision

– Treatments can be ranked

– Better informed decisions (maybe)

• But there are some serious caveats

(coming soon)


Indirect comparisons

• Using evidence from A vs B and A vs C trials to draw conclusions about the effect of B relative to C

[Diagram: Treatment A is compared directly with Treatment B and with Treatment C; B and C are compared only indirectly through A]


Evolution: from unadjusted to adjusted indirect comparisons

Unadjusted indirect comparison

• All A arms are pooled across the A trials (A vs P, A vs P, A vs C, A vs D, A vs F → ∑A) and all B arms across the B trials (B vs P, B vs P, B vs C, B vs D, B vs F → ∑B), and ∑A is compared with ∑B

• Results of single arms from different trials are compared

• It ignores randomization

• Biased results

• Over-precise estimates

Adjusted indirect comparison

• Trials are paired through a common comparator: [A vs P] vs [B vs P], [A vs C] vs [B vs C], [A vs D] vs [B vs D], [A vs F] vs [B vs F]

• Find a common comparator

• Analysis based on treatment differences

• It takes randomization into account

• Correct analysis


Mechanics of indirect comparisons

• OR (A vs. B) = OR (A vs. placebo) / OR (B vs. placebo)

• Wider confidence intervals:

– Variance (A vs. B) = Variance (A vs. placebo) + Variance (B vs. placebo), on the log odds-ratio scale

• As a strategy, indirect comparisons are statistically inefficient! (A minimal sketch of the calculation follows.)
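As a concrete illustration of the formulas above, here is a minimal Python sketch of the adjusted indirect comparison on the log odds-ratio scale. The odds ratios and standard errors are made up for illustration and are not taken from any trial.

```python
import math

def adjusted_indirect_or(log_or_a_p, se_a_p, log_or_b_p, se_b_p):
    """Adjusted indirect comparison of A vs. B through a common comparator
    (placebo): difference of log odds ratios, with variances adding."""
    log_or_a_b = log_or_a_p - log_or_b_p
    se_a_b = math.sqrt(se_a_p ** 2 + se_b_p ** 2)
    lo = math.exp(log_or_a_b - 1.96 * se_a_b)
    hi = math.exp(log_or_a_b + 1.96 * se_a_b)
    return math.exp(log_or_a_b), lo, hi

# Made-up inputs: OR(A vs. placebo) = 0.70 (SE 0.15), OR(B vs. placebo) = 0.85 (SE 0.20)
or_ab, lo, hi = adjusted_indirect_or(math.log(0.70), 0.15, math.log(0.85), 0.20)
print(f"Indirect OR(A vs. B) = {or_ab:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
# The confidence interval is wider than either direct comparison's, reflecting the added variances.
```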


Assumptions

• Homogeneity (depending on the question)

– Similar assumption to traditional meta-analysis

– All placebo controlled trials for Treatment A are “similar”

And

– All placebo controlled trials for Treatment B are “similar”

• Similarity (the key sticking point)

– Factors that affect the response to a treatment (effect modifiers) must be similarly distributed across the sets of trials being compared

– Confounding alone is not enough; effect modification is what causes trouble

– Clinical: patient characteristics, settings, follow-up, outcomes

– Methodological


How it can go wrong (1)

• Two treatments (A and B) with identical effects vs. control (C)

• Control event rate

– in men = 20%

– in women = 10%

• RR (A vs. C) = RR (B vs. C)

– in MEN = 1.0

– in WOMEN = 0.5

• Study 1 (compares A vs. C) : 80% men

• Study 2 (compares B vs. C) : 20% men


How it can go wrong (2)

Study 1 (A vs. C), MEN

              Event       No event   Total
A             80 (20%)    320        400
C (control)   80 (20%)    320        400
Total         160         640        800

Study 1 (A vs. C), WOMEN

              Event       No event   Total
A             5 (5%)      95         100
C (control)   10 (10%)    90         100
Total         15          185        200


How it can go wrong (3)

Study 1 TOTAL (ignoring sex)

              Event       No event   Total
A             85 (17%)    415        500
C (control)   90 (18%)    410        500
Total         175         825        1000


How it can go wrong (4)

• RR Study 1:

– RR (A vs. C) in Men = 1.0

– RR (A vs. C) in Women = 0.5

– RR (A vs. C) (ignoring sex) = 17% / 18% = 0.94


How it can go wrong (5)

Study 2 (B vs. C), MEN

              Event       No event   Total
B             20 (20%)    80         100
C (control)   20 (20%)    80         100
Total         40          160        200

Study 2 (B vs. C), WOMEN

              Event       No event   Total
B             20 (5%)     380        400
C (control)   40 (10%)    360        400
Total         60          740        800


How it can go wrong (6)

Study 2 TOTAL (ignoring sex)

              Event       No event   Total
B             40 (8%)     460        500
C (control)   60 (12%)    440        500
Total         100         900        1000


How it can go wrong (7)

• RR Study 2:

– RR (B vs. C) in Men = 1.0

– RR (B vs. C) in Women = 0.5

– RR (B vs. C) ignoring sex = 8% / 12% = 0.67


How it can go wrong (8)

• RR for indirect comparison of A vs. B = RR (A vs. C) / RR (B vs. C) = 0.94 / 0.67 ≈ 1.4

• SHOULD BE: RR (A vs. B) = 1.0 in men and 1.0 in women

• SHOULD CONCLUDE: A = B (the arithmetic is reproduced in the sketch below)
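The calculation from the preceding slides, restated as a short Python sketch; the counts are exactly those in the tables above.

```python
def crude_rr(events_trt, n_trt, events_ctl, n_ctl):
    """Risk ratio ignoring sex (crude analysis of the combined 2x2 table)."""
    return (events_trt / n_trt) / (events_ctl / n_ctl)

# Study 1 (A vs. C), 80% men: men 80/400 vs. 80/400; women 5/100 vs. 10/100
rr_a_c = crude_rr(80 + 5, 500, 80 + 10, 500)      # = 0.94
# Study 2 (B vs. C), 20% men: men 20/100 vs. 20/100; women 20/400 vs. 40/400
rr_b_c = crude_rr(20 + 20, 500, 20 + 40, 500)     # = 0.67

print(f"RR(A vs. C) = {rr_a_c:.2f}")
print(f"RR(B vs. C) = {rr_b_c:.2f}")
print(f"Indirect RR(A vs. B) = {rr_a_c / rr_b_c:.2f}")  # ~1.4, although the true RR is 1.0 in both sexes
```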


Empirical findings

• Direct and indirect comparisons usually (but not always) agree (Song F, Altman DG, Glenny AM, Deeks JJ. Validity of indirect comparison for estimating efficacy of competing interventions: empirical evidence from published meta-analyses. BMJ 2003;326:472-5.)

• Adjusted indirect comparisons may be less biased than head to head trials for evaluating new drugs (Song F, Harvey I, Lilford R. Adjusted indirect comparison may be less biased than direct comparison for evaluating new pharmaceutical interventions. J Clin Epidemiol 2008;61:455-63.)

• Concern about bias remains (Song F, Loke YK, Glenny A-M, Eastwood AJ, Altman DG. Methodological problems in the use of indirect comparisons for evaluating healthcare interventions: survey of published systematic reviews. BMJ 2009;338:b1147.)


Mixed treatment comparison/ Network meta-analysis

• Extension of meta-analysis to permit the simultaneous analysis of any number of interventions

• It pools direct evidence of the effect of B relative to C with indirect evidence (a simplified pooling sketch follows the diagram)

[Network diagram: Treatments A, B, C, and D connected by the trials that compare them directly]
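The slides describe the Bayesian framework actually used for mixed treatment comparisons; as a much-simplified stand-in, the sketch below pools one direct and one indirect estimate of B vs. C by fixed-effect inverse-variance weighting. The numbers are made up for illustration.

```python
import math

def inverse_variance_pool(estimates, variances):
    """Fixed-effect inverse-variance pooling of independent estimates."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)

# Made-up inputs on the log odds-ratio scale:
direct_log_or, direct_var = math.log(0.80), 0.05      # from B vs. C trials
indirect_log_or, indirect_var = math.log(0.70), 0.12   # via a common comparator

pooled, pooled_var = inverse_variance_pool([direct_log_or, indirect_log_or],
                                           [direct_var, indirect_var])
print(f"Pooled OR(B vs. C) = {math.exp(pooled):.2f} "
      f"(SE of log OR = {math.sqrt(pooled_var):.2f})")
```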


[Diagram (evolution, repeated from the earlier slide): separate A-vs-comparator and B-vs-comparator trials (A vs P, A vs C, A vs D, A vs F; B vs P, B vs C, B vs D, B vs F) evolving into a single connected network for mixed treatment comparison]

Mixed treatment comparisons/network meta-analysis

• Uses Bayesian framework

• Simultaneous comparison of multiple treatments

• Handles multi-arm trials, accounting for the correlation among arms from the same trial

• Ranks treatments

• Pools direct and indirect evidence

• Evaluates consistency of direct and indirect evidence (see the sketch below)
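One way to make the consistency point concrete: a minimal, frequentist (Bucher-style) check that compares the direct and indirect estimates of the same contrast. The numbers are assumed for illustration only; a full Bayesian network meta-analysis would assess consistency across the whole network.

```python
import math

def consistency_check(direct_log_or, direct_var, indirect_log_or, indirect_var):
    """Compare direct and indirect estimates of the same comparison on the
    log odds-ratio scale; a large standardized difference suggests inconsistency."""
    diff = direct_log_or - indirect_log_or
    se_diff = math.sqrt(direct_var + indirect_var)
    z = diff / se_diff
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # two-sided normal p-value
    return diff, z, p

# Assumed inputs, for illustration only
diff, z, p = consistency_check(math.log(0.80), 0.05, math.log(0.70), 0.12)
print(f"Difference in log OR = {diff:.2f}, z = {z:.2f}, p = {p:.2f}")
```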


How Popular?

National Institute for Health and Clinical Excellence (NICE), UK

• When conducting systematic reviews to evaluate effectiveness, direct evidence should be used wherever possible

• If little or no such evidence exists, it may be necessary to look for indirect comparisons

• The reviewer needs to be aware that the results may be susceptible to bias

• If both direct and indirect comparisons are possible, it is recommended that these be done separately before considering whether to pool data


Canadian Agency for Drugs and Technologies in Health

• Indirect treatment comparisons should be restricted to those situations in which it is not possible to perform a direct head to head trial

– (AGAIN – define “not possible”)

• The inconsistency of the network needs to be considered


Cochrane Collaboration

• Indirect comparisons are observational studies across trials, and may suffer the biases of observational studies

• But… they publish network meta-analyses


Pharmaceutical Benefits Advisory Committee

• Body that assesses prescription medicines for reimbursement in Australia

• There has not been sufficient evaluation of these methods to support their current use in submissions to the PBAC


Agency for Healthcare Research and Quality (AHRQ)

• Awards contracts to institutions to review all relevant scientific literature and synthesize evidence on health care topics

• AHRQ Evidence-based Practice Centers now describe indirect comparisons

• Receiving training in these methods


How popular?

• Increasing number of publications in high impact journals

– Example: network meta-analysis of antidepressants (Lancet 2009 Feb 28;373:746-58)


What about non-randomized studies?

• Multiple potential sources of bias

• RCTs could overcome the bias but often:

– Enroll restricted populations

– Don’t study clinically important outcomes

– Don’t choose helpful comparators

– These are not flaws inherent to the RCT design

– Pragmatic trials could help

• Meanwhile, non-randomized studies may fill gaps in the RCT literature, assuming a certain degree of internal validity (and reproducibility!)

• NECESSARY FOR ASSESSING POTENTIAL HARMS


Ongoing work

• Institute of Medicine Committee on Standards for Systematic Reviews of Clinical Effectiveness Research

• Mandated by Congress

• Note the word “clinical”

– Cost is not on the proverbial table

• Emphasis on aspects unique to “CER”

• Final report to be released in February 2011

• Study website: http://www.iom.edu/Activities/Quality/SystemReviewCER.aspx

• ROSTER: Al Berg (Chair); Anna Maria Siega-Riz; Chris Schmid; David Mrazek; Giselle Corbie-Smith; Hal Sox; Jeremy Grimshaw; Jesse Berlin; Katie Maslow; Kay Dickersin; Marguerite Koster; Mark Helfand; Mohit Bhandari; Paul Wallace; Sally Morton (Co-chair); Vince Kerr


Principles (according to Berlin)

• Do the head to head when possible

– In a relevant (broad enough) population

– Endpoints relevant to patients, caregivers, physicians, payors (or combinations thereof)

• Clinically important (not JUST surrogates)

• Long-term

• Patient-reported outcomes (symptoms)?

• (By the way, who defines “when possible” and how do they define it?)

• Indirect comparisons useful for planning future trials?


BACKUP SLIDES


Confounding alone is not enough

(you need effect modification for things to go wrong)

• Two treatments with identical effects vs. control

• RR (D or E vs. control) = 0.5 in BOTH men and women

• Control event rate in men = 20%

• Control event rate in women = 10%

• Study 1 (D vs. control): 80% men

• Study 2 (E vs. control): 20% men


Confounding is not enough

• RR = 0.5 for D vs. control in both men and women AND overall

• RR = 0.5 for E vs. control in both men and women AND overall

• RR (D vs. E) = 1.0 (see the sketch below)
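A short sketch of the backup-slide point. The slides give only the control event rates and sex mixes, so the arm sizes below (500 patients per arm, split by the stated sex mix) are assumed for illustration; with RR = 0.5 in both sexes, the crude indirect comparison comes out unbiased.

```python
def crude_rr(events_trt, n_trt, events_ctl, n_ctl):
    """Risk ratio ignoring sex (crude analysis of the combined 2x2 table)."""
    return (events_trt / n_trt) / (events_ctl / n_ctl)

# Study 1 (D vs. control), 80% men: per arm 400 men, 100 women (assumed sizes)
#   control events: 400*0.20 + 100*0.10 = 90; D events at RR 0.5: 45
rr_d_c = crude_rr(45, 500, 90, 500)   # = 0.50
# Study 2 (E vs. control), 20% men: per arm 100 men, 400 women (assumed sizes)
#   control events: 100*0.20 + 400*0.10 = 60; E events at RR 0.5: 30
rr_e_c = crude_rr(30, 500, 60, 500)   # = 0.50

print(f"Indirect RR(D vs. E) = {rr_d_c / rr_e_c:.2f}")  # 1.00: confounding alone does not bias it
```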
