Systematic reviews in comparative effectiveness

Jesse A. Berlin, ScD

Johnson & Johnson Pharmaceutical R&D

University of Pennsylvania Clinical Trials Symposium

April 2010


Comparative Effectiveness

“Comparative effectiveness research (CER) is the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery of care. The purpose of CER is to assist consumers, clinicians, purchasers, and policy makers to make informed decisions that will improve health care at both the individual and population levels.”

(Initial National Priorities for CER 2009)


Principles for conducting meta-analyses of CER studies

The usual guidelines for systematic reviews still apply

Need a protocol (yes, in advance of doing the work)

Ideally, define the question independently of knowledge of the trial results (truly prospective meta-analysis)

Worry about accuracy of data extraction

“Appropriate” data analysis is always a good idea


What studies to include in the systematic review?

Any (?) appropriate randomized studies (meeting pre-defined inclusion and exclusion criteria) comparing Treatment A vs. Treatment B

Think PICOTS (vs. usual registration trials?)

Patient populations

Interventions

Comparators

Outcome measures of interest

Timing

Settings


What about non-inferiority studies?

Include non-inferiority (NI) studies that directly compare A with B?

Must understand the earlier placebo-controlled studies

Principles outlined in FDA draft guidance on NI studies

Steps to ensure that “active” comparator has the anticipated effect

Do we require that the active comparator separates from placebo?

If the goal is to show superiority, then including NI studies may actually underestimate the true difference?

Expected bias from “sloppy” trials is toward the null

HOWEVER: could just be showing superiority to an active comparator that didn’t really “work” in this study


Why network meta-analysis / mixed treatment comparisons MIGHT be helpful

Evidence from head to head comparison trials is often limited or unavailable

It allows all evidence to be combined in a single analysis

Inference based on more evidence can (usually) provide more precision

Treatments can be ranked

Better informed decisions (maybe)

But there are some serious caveats

(coming soon)


Indirect comparisons

Using evidence from A vs B and A vs C trials to draw conclusions about the effect of B relative to C

[Diagram: Treatment A compared directly with Treatment B and with Treatment C; no direct B vs. C trials]

Evolution: from unadjusted to adjusted indirect comparisons

Unadjusted indirect comparison

Single arms from different trials are pooled and compared: all A arms (A vs P, A vs P, A vs C, A vs D, A vs F → ΣA) against all B arms (B vs P, B vs P, B vs C, B vs D, B vs F → ΣB)

It ignores randomization

Biased results

Over-precise estimates

Adjusted indirect comparison

Find a common comparator: [A vs P] vs [B vs P]

Analysis based on treatment differences

It takes randomization into account

Correct analysis

Mechanics of indirect comparisons

OR (A vs. B) = OR (A vs. placebo) / OR (B vs. placebo)

Variance (A vs. B) = Variance (A vs. placebo) + Variance (B vs. placebo)

The variances add, giving wider confidence intervals: as a strategy, indirect comparisons are statistically inefficient!
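As a concrete illustration (not from the slides), here is a minimal Python sketch of this calculation on the log-odds-ratio scale, in the spirit of the Bucher adjusted indirect comparison; the function name and the example numbers are hypothetical:

```python
import math

def indirect_or(or_ap, se_log_ap, or_bp, se_log_bp):
    """Adjusted indirect comparison of A vs. B through a common
    placebo comparator, on the log-OR scale."""
    log_or = math.log(or_ap) - math.log(or_bp)      # log OR(A vs B)
    se = math.sqrt(se_log_ap**2 + se_log_bp**2)     # variances add
    ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
    return math.exp(log_or), ci

# Hypothetical inputs: ORs vs. placebo and SEs of the log ORs.
# Note the indirect SE exceeds either input SE.
print(indirect_or(0.80, 0.10, 0.70, 0.12))
```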


Assumptions

Homogeneity (depending on the question)

Similar assumption to traditional meta-analysis

All placebo controlled trials for Treatment A are “similar”

And

All placebo controlled trials for Treatment B are “similar”

Similarity (the key sticking point)

Factors that affect the response to a treatment must be similarly distributed in the various treatment arms

Confounding alone is not enough

Clinical: patient characteristics, settings, follow-up, outcomes

Methodological


How it can go wrong (1)

Two treatments (A and B) with identical effects vs. control (C)

Control event rate:

in men = 20%

in women = 10%

RR (A vs. C) = RR (B vs. C):

in MEN = 1.0

in WOMEN = 0.5

Study 1 (compares A vs. C) : 80% men

Study 2 (compares B vs. C) : 20% men


How it can go wrong (2)

Study 1, MEN:

             Event       No event    Total
A            80 (20%)    320         400
C (control)  80 (20%)    320         400
Total        160         640         800

Study 1, WOMEN:

             Event       No event    Total
A            5 (5%)      95          100
C (control)  10 (10%)    90          100
Total        15          185         200

How it can go wrong (3)

Study 1 TOTAL (ignoring sex):

         Event       No event    Total
A        85 (17%)    415         500
C        90 (18%)    410         500
Total    175         825         1000

How it can go wrong (4)

RR Study 1:

RR (A vs. C) in Men = 1.0

RR (A vs. C) in Women = 0.5

RR (A vs. C) (ignoring sex) = 17% / 18% = 0.94


How it can go wrong (5)

Study 2, MEN:

             Event       No event    Total
B            20 (20%)    80          100
C (control)  20 (20%)    80          100
Total        40          160         200

Study 2, WOMEN:

             Event       No event    Total
B            20 (5%)     380         400
C (control)  40 (10%)    360         400
Total        60          740         800

How it can go wrong (6)

Study 2 TOTAL (ignoring sex):

         Event       No event    Total
B        40 (8%)     460         500
C        60 (12%)    440         500
Total    100         900         1000

How it can go wrong (7)

RR Study 2:

RR (B vs. C) in Men = 1.0

RR (B vs. C) in Women = 0.5

RR (B vs. C) ignoring sex = 8% / 12% = 0.67


How it can go wrong (8)

RR for indirect comparison of A vs. B = RR (A vs. C) / RR (B vs. C) = 0.94 / 0.67 = 1.4

SHOULD BE:

RR (A vs. B) = 1.0 in men and 1.0 in women

SHOULD CONCLUDE: A = B
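To make the mechanics transparent, here is a minimal Python sketch (not from the slides) reproducing this example from the crude totals above; the bias arises purely from the different sex mixes of the two studies:

```python
def rr(events_t, n_t, events_c, n_c):
    """Crude risk ratio, treatment vs. control."""
    return (events_t / n_t) / (events_c / n_c)

# Study 1 (A vs. C, 80% men): men and women pooled, ignoring sex
rr_a = rr(85, 500, 90, 500)   # = 0.94
# Study 2 (B vs. C, 20% men)
rr_b = rr(40, 500, 60, 500)   # = 0.67
print(rr_a / rr_b)            # indirect RR(A vs B) ~ 1.4; truth is 1.0
```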


Empirical findings

Direct and indirect comparisons usually (but not always) agree (Song F, Altman DG, Glenny AM, Deeks JJ. Validity of indirect comparison for estimating efficacy of competing interventions: empirical evidence from published meta-analyses. BMJ 2003;326:472-5.)

Adjusted indirect comparisons may be less biased than head-to-head trials for evaluating new drugs (Song F, Harvey I, Lilford R. Adjusted indirect comparison may be less biased than direct comparison for evaluating new pharmaceutical interventions. J Clin Epidemiol 2008;61:455-63.)

Concern about bias remains (Song F, Loke YK, Glenny A-M, Eastwood AJ, Altman DG. Methodological problems in the use of indirect comparisons for evaluating healthcare interventions: survey of published systematic reviews. BMJ 2009;338:b1147.)


Mixed treatment comparison/ Network meta-analysis

Extension of meta-analysis to permit the simultaneous analysis of any number of interventions

It pools direct evidence of the effect of B relative to C with indirect evidence

[Diagram: network of Treatments A, B, C, and D, with direct comparisons linking them]

Evolution: mixed treatment comparisons / network meta-analysis

[Diagram: the same trial network as before (A vs P, A vs P, A vs C, A vs D, A vs F; B vs P, B vs P, B vs C, B vs D, B vs F), now analyzed jointly]

Uses Bayesian framework

Simultaneous comparison of multiple treatments

Deals with multi-arm trials (accounts for the correlation among their arms)

Ranks treatments

Pools direct and indirect evidence (see the sketch after this list)

Evaluates consistency
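A full mixed treatment comparison is usually fitted as a Bayesian hierarchical model; as a stripped-down illustration (not from the slides), this sketch shows only the core pooling idea: combining a direct and an indirect estimate of the same contrast by inverse-variance weighting on the log scale. All numbers are hypothetical.

```python
import math

def pool(est_dir, se_dir, est_ind, se_ind):
    """Fixed-effect inverse-variance combination of a direct and an
    indirect log-OR estimate of the same contrast (e.g., B vs. C)."""
    w_dir, w_ind = 1 / se_dir**2, 1 / se_ind**2
    est = (w_dir * est_dir + w_ind * est_ind) / (w_dir + w_ind)
    se = math.sqrt(1 / (w_dir + w_ind))
    return est, se

# Hypothetical log-OR estimates for B vs. C; a large gap between the
# direct and indirect estimates would signal inconsistency.
print(pool(-0.20, 0.15, -0.35, 0.25))
```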


How Popular?

National Institute for Health and Clinical Excellence (NICE), UK

When conducting systematic reviews to evaluate effectiveness, direct evidence should be used wherever possible

If little or no such evidence exists, it may be necessary to look for indirect comparisons

The reviewer needs to be aware that the results may be susceptible to bias

If both direct and indirect comparisons are possible, it is recommended that these be done separately before considering whether to pool data


Canadian Agency for Drugs and Technologies in Health

Indirect treatment comparisons should be restricted to situations in which it is not possible to perform a direct head-to-head trial

(AGAIN: define “not possible”)

Inconsistency in the network needs to be considered


Cochrane Collaboration

Indirect comparisons are observational studies across trials, and may suffer the biases of observational studies

But… they publish network meta-analyses


Pharmaceutical Benefits Advisory Committee (PBAC)

Body that assesses prescription medicines for reimbursement in Australia

These methods have not been evaluated sufficiently to support their current use in submissions to the PBAC


Agency for Healthcare Research and Quality (AHRQ)

Awards contracts to institutions to review all relevant scientific literature and synthesize evidence on health care topics

AHRQ Evidence-based Practice Centers now describe indirect comparisons and are receiving training


How popular?

Increasing number of publications in high impact journals

Network meta-analysis of antidepressants (Lancet 2009;373:746-58)


What about non-randomized studies?

Multiple potential sources of bias

RCTs could overcome the bias but often:

Enroll restricted populations

Don’t study clinically important outcomes

Don’t choose helpful comparators

Not flaws inherent to the RCT design

Pragmatic trials could help

Meanwhile, we may fill gaps in the RCT literature, assuming a certain degree of internal validity (and reproducibility!)

NECESSARY FOR ASSESSING POTENTIAL HARMS


Ongoing work

Institute of Medicine Committee on Standards for Systematic Reviews of Clinical Effectiveness Research

Mandated by Congress

Note the word “clinical”

Cost is not on the proverbial table

Emphasis on aspects unique to “CER”

Final report to be released in February 2011

Study website: http://www.iom.edu/Activities/Quality/SystemReviewCER.aspx

ROSTER: Al Berg (Chair); Anna Maria Siega-Riz; Chris Schmid; David Mrazek; Giselle Corbie-Smith; Hal Sox; Jeremy Grimshaw; Jesse Berlin; Katie Maslow; Kay Dickersin; Marguerite Koster; Mark Helfand; Mohit Bhandari; Paul Wallace; Sally Morton (Co-chair); Vince Kerr


Principles (according to Berlin)

Do the head-to-head comparison when possible

In a relevant (broad enough) population

Endpoints relevant to patients, caregivers, physicians, payors (or combinations thereof)

Clinically important (not JUST surrogates)

Long-term

Patient-reported outcomes (symptoms)?

(By the way, who defines “when possible” and how do they define it?)

Indirect comparisons useful for planning future trials?


BACKUP SLIDES


Confounding alone is not enough

(you need effect modification for things to go wrong)

Two treatments with identical effects vs. control

RR (D or E vs. control) = 0.5 in BOTH men and women

Control event rate in men = 20%

Control event rate in women = 10%

Study 1 (D vs. control) = 80% men

Study 2 (E vs. control) = 20% men


Confounding is not enough

RR = 0.5 for D vs. control in both men and women AND overall

RR = 0.5 for E vs. control in both men and women AND overall

RR (D vs. E) = 1.0
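As a numerical check (not from the slides), here is a minimal Python sketch confirming that confounding without effect modification leaves the indirect comparison unbiased; the function and its parameterization are hypothetical:

```python
def crude_rr(p_men_t, p_women_t, p_men_c, p_women_c, frac_men):
    """Study-level risk ratio, marginalizing over sex (randomization
    keeps the sex mix identical in the two arms of each study)."""
    risk_t = frac_men * p_men_t + (1 - frac_men) * p_women_t
    risk_c = frac_men * p_men_c + (1 - frac_men) * p_women_c
    return risk_t / risk_c

# RR = 0.5 in both sexes for both D and E (no effect modification)
rr_d = crude_rr(0.10, 0.05, 0.20, 0.10, frac_men=0.8)  # Study 1: 80% men
rr_e = crude_rr(0.10, 0.05, 0.20, 0.10, frac_men=0.2)  # Study 2: 20% men
print(rr_d, rr_e, rr_d / rr_e)  # 0.5, 0.5, 1.0: unbiased
```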
