Jesse A. Berlin, ScD
Johnson & Johnson Pharmaceutical R&D
University of Pennsylvania Clinical Trials Symposium
April 2010
• “Comparative effectiveness research (CER) is the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery of care. The purpose of CER is to assist consumers, clinicians, purchasers, and policy makers to make informed decisions that will improve health care at both the individual and population levels.”
(Institute of Medicine, Initial National Priorities for Comparative Effectiveness Research, 2009)
Principles for conducting meta-analyses of CER studies
• The usual guidelines for systematic reviews still apply
– Need a protocol (yes, in advance of doing the work)
– Ideally, define the question independently of knowledge of the trial results (truly prospective meta-analysis)
– Worry about accuracy of data extraction
– “Appropriate” data analysis is always a good idea
• Any (?) appropriate randomized studies (meeting pre-defined inclusion and exclusion criteria) comparing Treatment A vs. Treatment B
• Think PICOTS (vs. usual registration trials?)
– Patient populations
– Interventions
– Comparators
– Outcome measures of interest
– Timing
– Settings
• Include non-inferiority (NI) studies that directly compare A with B?
• Must understand the earlier placebo-controlled studies
– Principles outlined in FDA draft guidance on NI studies
• Steps to ensure that “active” comparator has the anticipated effect
– Do we require that the active comparator separates from placebo?
• If the goal is to show superiority, then including NI studies may actually lead to underestimating the true difference?
– Expected bias from “sloppy” trials is toward the null
– HOWEVER – could just be showing superiority to an active comparator that didn’t really “work” in this study
Why network meta-analysis / mixed treatment comparisons might be helpful
• Evidence from head-to-head comparison trials is often limited or unavailable
• It allows all evidence to be combined in a single analysis
– Inference based on more evidence can (usually) provide more precision
– Treatments can be ranked
– Better informed decisions (maybe)
• But there are some serious caveats
(coming soon)
• Using evidence from A vs B and A vs C trials to draw conclusions about the effect of B relative to C
[Diagram: Treatment A is connected to Treatment B and to Treatment C; the B vs. C comparison is made indirectly through A.]
Unadjusted indirect comparison
• All Treatment A arms across trials (A vs P, A vs P, A vs C, A vs D, A vs F) are pooled into one summary, ∑A, and all Treatment B arms (B vs P, B vs P, B vs C, B vs D, B vs F) into ∑B; ∑A is then compared with ∑B
• Results of single arms from different trials are compared
• It ignores randomization
• Biased results
• Over-precise estimates

Adjusted indirect comparison
• Trials are paired through a common comparator: [A vs P] vs [B vs P], [A vs C] vs [B vs C], [A vs D] vs [B vs D], [A vs F] vs [B vs F]
• Find a common comparator
• Analysis is based on within-trial treatment differences
• It takes randomization into account
• Correct analysis
• OR (A vs B) = OR (A vs. placebo) / OR (B vs. placebo)
• Wider confidence intervals:
– Var[log OR (A vs. B)] = Var[log OR (A vs. placebo)] + Var[log OR (B vs. placebo)]
• As a strategy, indirect comparisons are statistically inefficient!
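To make the arithmetic on this slide concrete, here is a minimal Python sketch of the adjusted indirect (Bucher) comparison. The trial counts are invented for illustration; only the formulas (the ratio of odds ratios, and variances adding on the log-OR scale) come from the slide.

```python
import math

def log_or_and_se(events_t, n_t, events_c, n_c):
    """Log odds ratio and its standard error from a single trial's 2x2 table."""
    a, b = events_t, n_t - events_t   # treatment arm: events, non-events
    c, d = events_c, n_c - events_c   # control arm: events, non-events
    return math.log((a * d) / (b * c)), math.sqrt(1/a + 1/b + 1/c + 1/d)

# Hypothetical placebo-controlled trials (numbers are illustrative only)
log_or_a, se_a = log_or_and_se(30, 100, 20, 100)   # A vs placebo
log_or_b, se_b = log_or_and_se(25, 100, 20, 100)   # B vs placebo

# Adjusted indirect comparison: log OR(A vs B) = log OR(A vs P) - log OR(B vs P)
log_or_ab = log_or_a - log_or_b
# Variances add on the log-OR scale, hence the wider confidence interval
se_ab = math.sqrt(se_a**2 + se_b**2)

lo = math.exp(log_or_ab - 1.96 * se_ab)
hi = math.exp(log_or_ab + 1.96 * se_ab)
print(f"Indirect OR(A vs B) = {math.exp(log_or_ab):.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```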
• Homogeneity (depending on the question)
– Similar assumption to traditional meta-analysis
– All placebo-controlled trials for Treatment A are “similar”
AND
– All placebo-controlled trials for Treatment B are “similar”
• Similarity (the key sticking point)
– Factors that affect the response to a treatment must be similarly distributed across the sets of trials being compared
• Confounding alone is not enough
– Clinical
• Patient characteristics, settings, follow-up, outcomes
– Methodological
• Two treatments (A and B) with identical effects vs. control (C)
• Control event rate
– in men = 20%
– in women = 10%
• RR (A vs. C) = RR (B vs. C)
– in MEN = 1.0
– in WOMEN = 0.5
• Study 1 (compares A vs. C): 80% men
• Study 2 (compares B vs. C): 20% men
Study 1 (A vs. C), MEN

              Event       No event    Total
A             80 (20%)    320         400
C (control)   80 (20%)    320         400
Total         160         640         800

Study 1 (A vs. C), WOMEN

              Event       No event    Total
A             5 (5%)      95          100
C (control)   10 (10%)    90          100
Total         15          185         200
Study 1 TOTAL (ignoring sex)

              Event       No event    Total
A             85 (17%)    415         500
C (control)   90 (18%)    410         500
Total         175         825         1000
• RR Study 1:
– RR (A vs. C) in Men = 1.0
– RR (A vs. C) in Women = 0.5
– RR (A vs. C) (ignoring sex) = 17% / 18% = 0.94
Study 2 (B vs. C), MEN

              Event       No event    Total
B             20 (20%)    80          100
C (control)   20 (20%)    80          100
Total         40          160         200

Study 2 (B vs. C), WOMEN

              Event       No event    Total
B             20 (5%)     380         400
C (control)   40 (10%)    360         400
Total         60          740         800
Study 2 TOTAL (ignoring sex)

              Event       No event    Total
B             40 (8%)     460         500
C (control)   60 (12%)    440         500
Total         100         900         1000
• RR Study 2:
– RR (B vs. C) in Men = 1.0
– RR (B vs. C) in Women = 0.5
– RR (B vs. C) ignoring sex = 8% / 12% = 0.67
• RR for indirect comparison of A vs. B = RR (A vs. C) / RR (B vs. C) = 0.94 / 0.67 = 1.4
SHOULD BE: RR (A vs. B) = 1.0 in men and 1.0 in women
SHOULD CONCLUDE: A = B (the sketch below checks the arithmetic)
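A quick Python check of this example's arithmetic, using only the counts from the tables above: within each sex A and B are identical, yet the crude indirect comparison drifts to roughly 1.4.

```python
# Worked example: identical treatments, an effect modifier (sex), and
# different sex mixes across trials bias the indirect comparison even
# though each individual trial is randomized.

def rr(events_t, n_t, events_c, n_c):
    """Risk ratio of treatment vs. control."""
    return (events_t / n_t) / (events_c / n_c)

# Study 1 (A vs. C): 80% men
rr_a_men   = rr(80, 400, 80, 400)   # 1.0
rr_a_women = rr(5, 100, 10, 100)    # 0.5
rr_a_crude = rr(85, 500, 90, 500)   # ~0.94 (ignoring sex)

# Study 2 (B vs. C): 20% men
rr_b_men   = rr(20, 100, 20, 100)   # 1.0
rr_b_women = rr(20, 400, 40, 400)   # 0.5
rr_b_crude = rr(40, 500, 60, 500)   # ~0.67 (ignoring sex)

# Indirect comparison on the crude RRs: biased away from the truth (1.0)
print(f"Crude indirect RR(A vs B) = {rr_a_crude / rr_b_crude:.2f}")   # ~1.42

# Stratified indirect comparisons recover the truth within each sex
print(rr_a_men / rr_b_men, rr_a_women / rr_b_women)                   # 1.0 1.0
```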
• Direct and indirect comparisons usually (but not always) agree
(Song F, Altman DG, Glenny AM, Deeks JJ. Validity of indirect comparison for estimating efficacy of competing interventions: empirical evidence from published meta-analyses. BMJ 2003;326:472-5.)
• Adjusted indirect comparisons may be less biased than head-to-head trials for evaluating new drugs
(Song F, Harvey I, Lilford R. Adjusted indirect comparison may be less biased than direct comparison for evaluating new pharmaceutical interventions. J Clin Epidemiol 2008;61:455-63.)
• Concern about bias remains
(Song F, Loke YK, Glenny A-M, Eastwood AJ, Altman DG. Methodological problems in the use of indirect comparisons for evaluating healthcare interventions: survey of published systematic reviews. BMJ 2009;338:b1147.)
Mixed treatment comparison / network meta-analysis
• Extension of meta-analysis to permit the simultaneous analysis of any number of interventions
• It pools direct evidence of the effect of B relative to C with indirect evidence
[Diagram: a four-treatment network (A, B, C, D) in which the direct B vs. C comparison is pooled with indirect evidence through A and D.]
Mixed treatment comparisons/network meta-analysis
[Diagram: the evidence network of trials (A vs P ×2, A vs C, A vs D, A vs F; B vs P ×2, B vs C, B vs D, B vs F), now analyzed jointly.]
• Uses a Bayesian framework
• Simultaneous comparison of multiple treatments
• Handles trials with multiple arms, accounting for the correlation among arms within a trial
• Ranks treatments
• Pools direct and indirect evidence
• Evaluates consistency (a sketch of these last two ideas follows below)
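As a sketch of the last two bullets, here is one simple way a direct estimate and an indirect estimate of the same contrast might be pooled by inverse-variance weighting, with a z-statistic for their disagreement serving as a basic consistency check. The inputs and function name are illustrative assumptions, not taken from the slides; full network meta-analyses typically estimate all contrasts jointly, often in a Bayesian model.

```python
import math

def pool_and_check(log_or_direct, se_direct, log_or_indirect, se_indirect):
    """Inverse-variance pooling of direct and indirect log ORs for one
    contrast, plus a z-statistic for direct-vs-indirect inconsistency."""
    w_d, w_i = 1 / se_direct**2, 1 / se_indirect**2
    pooled = (w_d * log_or_direct + w_i * log_or_indirect) / (w_d + w_i)
    pooled_se = math.sqrt(1 / (w_d + w_i))
    # Inconsistency: how far apart are the two sources of evidence?
    z = (log_or_direct - log_or_indirect) / math.sqrt(se_direct**2 + se_indirect**2)
    return pooled, pooled_se, z

# Illustrative inputs: a direct B vs C trial, and an indirect estimate via A
pooled, se, z = pool_and_check(math.log(0.80), 0.15, math.log(0.95), 0.25)
print(f"Pooled OR(B vs C) = {math.exp(pooled):.2f} "
      f"(SE of log OR = {se:.2f}); inconsistency z = {z:.2f}")
```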
National Institute for Health and Clinical Excellence (NICE), UK
• When conducting systematic reviews to evaluate effectiveness, direct evidence should be used wherever possible
• If little or no such evidence exists, it may be necessary to look for indirect comparisons
• The reviewer needs to be aware that the results may be susceptible to bias
• If both direct and indirect comparisons are possible, it is recommended that these be done separately before considering whether to pool data
Canadian Agency for Drugs and Technologies in Health
• Indirect treatment comparisons should be restricted to those situations in which it is not possible to perform a direct head-to-head trial
– (AGAIN – define “not possible”)
• The inconsistency of the network needs to be considered
• Indirect comparisons are observational studies across trials and may suffer the biases of observational studies
• But… they publish network meta-analyses
Pharmaceutical Benefits Advisory Committee (PBAC), Australia
• Body that assesses prescription medicines for reimbursement in Australia
• Not sufficient evaluation of these methods to support their current use in submissions to the PBAC
Agency for Healthcare Research and Quality (AHRQ)
• Awards contracts to institutions to review all relevant scientific literature and synthesize evidence on health care topics
• AHRQ Evidence-based Practice Centers now describe indirect comparisons
– Receiving training
• Increasing number of publications in high-impact journals
– e.g., network meta-analysis of antidepressants (Cipriani et al., Lancet 2009;373:746-58)
• Multiple potential sources of bias
• RCTs could overcome the bias but often:
– Enroll restricted populations
– Don’t study clinically important outcomes
– Don’t choose helpful comparators
• Not flaws inherent to the RCT design
– Pragmatic trials could help
• Meanwhile, we may fill gaps in the RCT literature
– assuming a certain degree of internal validity (and reproducibility!)
• NECESSARY FOR ASSESSING POTENTIAL HARMS
• Institute of Medicine Committee on Standards for Systematic Reviews of Clinical Effectiveness Research
• Mandated by Congress
• Note the word “clinical”
– Cost is not on the proverbial table
• Emphasis on aspects unique to “CER”
• Final report to be released in February 2011
• Study website: http://www.iom.edu/Activities/Quality/SystemReviewCER.aspx
• ROSTER: Al Berg (Chair); Anna Maria Siega-Riz; Chris Schmid; David Mrazek; Giselle Corbie-Smith; Hal Sox; Jeremy Grimshaw; Jesse Berlin; Katie Maslow; Kay Dickersin; Marguerite Koster; Mark Helfand; Mohit Bhandari; Paul Wallace; Sally Morton (Co-chair); Vince Kerr
• Do the head-to-head comparison when possible
– In a relevant (broad enough) population
– Endpoints relevant to patients, caregivers, physicians, payors (or combinations thereof)
• Clinically important (not JUST surrogates)
• Long-term
• Patient-reported outcomes (symptoms)?
• (By the way, who defines “when possible” and how do they define it?)
• Indirect comparisons useful for planning future trials?
Confounding alone is not enough
(you need effect modification for things to go wrong)
• Two treatments with identical effects vs. control
• RR (D or E vs. control) = 0.5 in BOTH men and women
• Control event rate in men = 20%
• Control event rate in women = 10%
• Study 1 (D vs. control) = 80% men
• Study 2 (E vs. control) = 20% men
• RR = 0.5 for D vs. control in both men and women AND overall
• RR = 0.5 for E vs. control in both men and women AND overall
• RR (D vs. E) = 1.0
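A quick numeric confirmation of this backup example. The arm sizes here (400 men + 100 women, and 100 men + 400 women, per arm) are an assumption chosen to match the stated sex mixes and the 500-per-arm trials of the earlier example.

```python
def crude_rr(n_men, n_women, rr=0.5, cer_men=0.20, cer_women=0.10):
    """Crude (sex-ignoring) RR for a trial whose two equal-size arms each
    contain n_men men and n_women women, with RR = 0.5 in both sexes."""
    control_events = n_men * cer_men + n_women * cer_women
    treated_events = rr * control_events   # constant RR in every stratum
    n = n_men + n_women
    return (treated_events / n) / (control_events / n)

print(crude_rr(400, 100))   # Study 1 (D vs. control), 80% men -> 0.5
print(crude_rr(100, 400))   # Study 2 (E vs. control), 20% men -> 0.5
# Both crude RRs are 0.5, so the indirect RR(D vs E) = 0.5 / 0.5 = 1.0.
# Confounding (different sex mixes) without effect modification does no
# harm here, unlike the A-vs-B example earlier.
```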