Applying Bayesian evidence synthesis
in comparative effectiveness research
David Ohlssen (Novartis Pharmaceuticals Corporation)
Overview
 Part 1 Bayesian DIA CER sub-team
 Part 2 Overview of Bayesian evidence synthesis
Part 1 Bayesian DIA CER sub-team
Team Members
 Chair: David Ohlssen
 Co-chair: Haijun Ma
 Other team members:
• Fanni Natanegara, George Quartey, Mark Boye, Ram Tiwari, Yu Chang
Problem Statement
 Comparative effectiveness research (CER) is designed to inform health-care decisions by providing evidence on the effectiveness, benefits, and harms of different treatment options
 Timely research and dissemination of CER results allow clinicians, patients, policymakers, and health plans and other payers to make informed decisions at both the individual and population levels
 Bayesian approaches provide a natural framework for combining information from a variety of sources in comparative effectiveness research
• Rapid technical development, as evidenced by a recent flurry of publications
 There is still limited understanding of how Bayesian techniques should be applied in practice
Objectives
 Encourage the appropriate application of Bayesian approaches to the problem of comparative effectiveness
 Provide input into ongoing initiatives on comparative effectiveness within the medical product development setting, through white papers/publications and sessions at future meetings
Project Scope
 Analysis of patient benefit-risk using existing data
 Initially focused on:
 1) The use of Bayesian evidence synthesis techniques such as mixed treatment comparisons
 2) Joint modeling in benefit-risk assessment
Current aims for 2012
 Literature review of Bayesian methods in CER – Q4 2012
 Gain an understanding and appreciation of other CER working groups – Q4 2012
• Decide on the list of CER working groups to contact
• Understand the objectives and status of each group
Part 2 Overview of Bayesian evidence
synthesis
Introduction
Evidence synthesis in drug development
 The ideas and principles behind evidence synthesis date back to the work of Eddy et al. (1992)
 However, widespread application has been driven by the need for quantitative health technology assessment:
• cost effectiveness
• comparative effectiveness
 Ideas often closely linked with Bayesian principles and
methods:
• Good decision making should ideally be based on all relevant
information
• MCMC computation
Recent developments in comparative effectiveness
 Health agencies have become increasingly interested in health technology assessment and the comparative effectiveness of various treatment options
 Statistical approaches include extensions of standard meta-analysis models that allow multiple treatments to be compared
 FDA Partnership in Applied Comparative Effectiveness Science (PACES), including projects on utilizing historical data in clinical trials and subgroup analysis
Aims of this talk
Evidence synthesis
 Introduce some basic concepts
 Illustration through a series of applications:
• Motivating public health example
• Network meta-analysis
• Using historical data in the design and analysis of clinical trials
• Subgroup analysis
 Focus on principles and understanding of critical
assumptions rather than technical details
Basic concepts
Framework and Notation for evidence synthesis
• Y1, …, YS: data from S sources
• θ1, …, θS: source-specific parameters/effects of interest (e.g. a mean difference)
• The question of interest relates to θ1, …, θS (e.g. the average effect, or the effect in a new study)
Strategies for HIV screening
Ades and Cliffe (2002)
 HIV: synthesizing evidence from multiple sources
 Aim to compare strategies for screening for HIV in prenatal clinics:
• Universal screening of all women,
• or targeted screening of current injecting drug users (IDU) or women
born in sub-Saharan Africa (SSA)
 Use synthesis to determine the optimal policy
Key parameters
Ades and Cliffe (2002)
 a – Proportion of women born in sub-Saharan Africa (SSA)
 b – Proportion of women who are injecting drug users (IDU)
 c – HIV infection rate in SSA
 d – HIV infection rate in IDU
 e – HIV infection rate in non-SSA, non-IDU
 f – Proportion of HIV already diagnosed in SSA
 g – Proportion of HIV already diagnosed in IDU
 h – Proportion of HIV already diagnosed in non-SSA, non-IDU
No direct evidence concerning e and h!
A subset of the data used in the synthesis
Ades and Cliffe (2002)
 HIV prevalence, women not born in SSA, 1997–98 (74 / 136139):
[db + e(1 − a − b)]/(1 − a)
 Overall HIV prevalence in pregnant women, 1999 (254 / 102287):
ca + db + e(1 − a − b)
 Diagnosed HIV in SSA women as a proportion of all diagnosed HIV, 1999 (43 / 60):
fca/[fca + gdb + he(1 − a − b)]
Implementation of the evidence synthesis
Ades and Cliffe (2002)
 The evidence was synthesized by placing all data sources
within a single Bayesian model
 Easy to code in WinBUGS
 Key assumption – consistency of evidence across the
different data sources
 Can be checked by comparing direct and indirect evidence
at various “nodes” in the graphical model (Conflict p-value)
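To make the synthesis step concrete, below is a minimal sketch in R (not the original WinBUGS code) of how the three data items shown earlier can be tied to the basic parameters a–h through a single joint likelihood; the uniform priors and the illustrative parameter vector are assumptions made only for this sketch.

```r
## Minimal sketch (base R), not the Ades & Cliffe WinBUGS model: a joint
## log-posterior for a subset of the HIV screening data. Uniform(0,1) priors
## on the basic parameters a..h are an illustrative assumption.
log_post <- function(p, dat) {
  if (any(p <= 0 | p >= 1)) return(-Inf)
  a <- p[1]; b <- p[2]; c <- p[3]; d <- p[4]
  e <- p[5]; f <- p[6]; g <- p[7]; h <- p[8]
  ## Each observed proportion is binomial with a probability that is a
  ## deterministic function of the basic parameters -- the synthesis step.
  p1 <- (d * b + e * (1 - a - b)) / (1 - a)            # prevalence, non-SSA women
  p2 <- c * a + d * b + e * (1 - a - b)                # overall prevalence
  p3 <- f * c * a /
        (f * c * a + g * d * b + h * e * (1 - a - b))  # diagnosed HIV, SSA share
  if (p1 <= 0 || p1 >= 1 || p2 <= 0 || p2 >= 1 || p3 <= 0 || p3 >= 1) return(-Inf)
  dbinom(dat$y1, dat$n1, p1, log = TRUE) +
    dbinom(dat$y2, dat$n2, p2, log = TRUE) +
    dbinom(dat$y3, dat$n3, p3, log = TRUE)
}

dat <- list(y1 = 74,  n1 = 136139,   # prevalence, women not born in SSA
            y2 = 254, n2 = 102287,   # overall prevalence, pregnant women
            y3 = 43,  n3 = 60)       # diagnosed HIV in SSA women / all diagnosed

## Evaluate at one plausible parameter vector; in practice this function
## would be passed to an MCMC sampler to obtain the joint posterior.
log_post(c(0.02, 0.005, 0.02, 0.01, 0.0003, 0.6, 0.6, 0.3), dat)
```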
Network meta-analysis
Motivation for Network Meta-Analysis
 There are often many treatments for health conditions
 Published systematic reviews and meta-analyses typically
focus on pair-wise comparisons
• More than 20 separate Cochrane reviews for adult smoking
cessation
• More than 20 separate Cochrane reviews for chronic asthma in
adults
 An alternative approach involves extending standard meta-analysis techniques to accommodate multiple treatments
 This emerging field has been described as both network
meta-analysis and mixed treatment comparisons
Network meta-analysis graphic
[Graphic: a network of eight treatments, A–H, with edges linking treatments that have been compared directly in trials.]
Network meta-analysis – key assumptions
Three key assumptions (Song et al., 2009):
 Homogeneity assumption – Studies in the network MA
which compare the same treatments must be sufficiently
similar.
 Similarity assumption – When comparing A and C
indirectly via B, the patient populations of the trial(s)
investigating A vs B and those investigating B vs C must
be sufficiently similar.
 Consistency assumption – direct and indirect
comparisons, when done separately, must be roughly in
agreement.
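Under the consistency assumption, an indirect comparison can be read off the network. A minimal base-R sketch of the simplest case, the Bucher adjusted indirect comparison of A vs C through a common comparator B, is shown below; the relative risks and standard errors are invented purely for illustration.

```r
## Minimal sketch (base R): anchored indirect comparison of A vs C via a
## common comparator B (Bucher method). The log relative risks and standard
## errors below are invented for illustration only.
logrr_AB <- log(0.80); se_AB <- 0.15   # A vs B, from A-vs-B trial(s)
logrr_CB <- log(1.10); se_CB <- 0.20   # C vs B, from C-vs-B trial(s)

## Consistency assumption: d_AC = d_AB - d_CB on the log scale
logrr_AC <- logrr_AB - logrr_CB
se_AC    <- sqrt(se_AB^2 + se_CB^2)    # variances add for an indirect estimate

## Indirect relative risk of A vs C with a 95% confidence interval
exp(logrr_AC + c(estimate = 0, lower = -1.96, upper = 1.96) * se_AC)
```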
Example 2: Network meta-analysis
Trelle et al (2011) – Cardiovascular safety of non-steroidal anti-inflammatory drugs
 Primary endpoint was myocardial infarction
 [Network graphic: placebo, naproxen, ibuprofen, diclofenac, celecoxib, etoricoxib, rofecoxib, lumiracoxib]
 Data synthesis: 31 trials in 116,429 patients with more than 115,000 patient-years of follow-up were included
 A network random-effects meta-analysis was used in the analysis
 Critical aspect: the assumptions regarding the consistency of evidence across the network
 How reasonable is it to rank and compare treatments with this technique?
Results from Trelle et al
Myocardial infarction analysis
Relative risk (RR) with 95% confidence interval compared with placebo

Treatment      RR estimate   Lower limit   Upper limit
Celecoxib          1.35          0.71          2.72
Diclofenac         0.82          0.29          2.20
Etoricoxib         0.75          0.23          2.39
Ibuprofen          1.61          0.50          5.77
Lumiracoxib        2.00          0.71          6.21
Naproxen           0.82          0.37          1.67
Rofecoxib          2.12          1.26          3.56
Authors' conclusion:
Although uncertainty remains, little evidence exists to
suggest that any of the investigated drugs are safe in
cardiovascular terms. Naproxen seemed least harmful.
Comments on Trelle et al
 Drug doses could not be considered (data not available)
 Average duration of exposure differed between trials
 Therefore, ranking of treatments relies on the strong assumption that the risk ratio is constant over time for all treatments
 The authors conducted extensive sensitivity analyses, and the results appeared to be robust
Additional Example: Using network meta-analysis for Phase IIIB probability of success in a pricing trial
[Network graphic: placebo, treatments A, B, C, and D, and a combination product.]
Use of Historical controls
Introduction
Objective and Problem Statement
 Design a study with a control arm / treatment arm(s)
 Use historical control data in the design and analysis
 Ideally: a smaller trial comparable to a standard trial
 Used in some Novartis phase I and II trials
 Design options
• Standard design: “n vs. n”
• New design: “n* + (n − n*) vs. n”, with n* = “prior sample size”
 How can the historical information be quantified?
 How much is it worth?
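One common way to express how much the historical information is worth is as a prior effective sample size. The sketch below assumes the historical information has already been summarized as a Beta prior for a control response rate; the numbers are purely illustrative.

```r
## Minimal sketch (base R): prior effective sample size n* for a control
## response rate whose historical-data prior has been approximated by a
## Beta(a, b) distribution (a common convention takes n* ~ a + b).
## The values are illustrative assumptions, not from a real trial.
a <- 14.4
b <- 75.6
c(prior_mean = a / (a + b),                              # prior response rate
  prior_sd   = sqrt(a * b / ((a + b)^2 * (a + b + 1))),  # prior uncertainty
  n_star     = a + b)                                    # worth about 90 patients
```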
The Meta-Analytic-Predictive Approach
Framework and Notation
• Y1, …, YH: historical control data from H trials
• θ1, …, θH: control “effects” (unknown)
• ‘Relationship/Similarity’ (unknown): ranging from no relation to the same effects
• θ*: effect in the new trial (unknown)
• Y*: data in the new study (yet to be observed)
Design objective: the prior [θ* | Y1, …, YH] for the control effect in the new trial
Example – meta-analytic-predictive approach to form priors
Application: a random-effects meta-analysis of the historical control groups provides prior information for the control group in the new study, corresponding to a prior sample size n*.
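As a rough illustration of this step, the sketch below runs a two-stage (non-Bayesian) random-effects meta-analysis on the log-odds scale and reads off the prediction interval for a new study. It is only an approximation to the fully Bayesian meta-analytic-predictive analysis, assumes the metafor package is installed, and uses invented historical counts.

```r
## Minimal sketch (R, metafor package): approximate meta-analytic-predictive
## summary of historical control data on the log-odds scale. The counts are
## invented; the published MAP approach is fully Bayesian, whereas this is a
## two-stage approximation shown only for illustration.
library(metafor)

r <- c(12, 18,  9, 15, 21)        # historical control responders (assumed)
n <- c(60, 80, 50, 70, 90)        # historical control sample sizes (assumed)

y  <- qlogis(r / n)               # observed control log-odds per trial
se <- sqrt(1 / r + 1 / (n - r))   # approximate standard errors

fit <- rma(yi = y, sei = se, method = "DL")   # random-effects meta-analysis
predict(fit, transf = plogis)     # estimate, CI and prediction interval for
                                  # the control rate in a *new* study
```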
Bayesian setup using historical control data
[Diagram]
 Meta-analysis of historical data: observed control response rates from historical trials 1–8 are combined in a meta-analysis, giving a predictive distribution for the control response rate in a new study.
 Study analysis: this predictive distribution serves as the prior distribution for the control response rate and is combined with the observed control data; a prior distribution for the drug response rate is combined with the observed drug data.
 The Bayesian analysis yields posterior distributions for the control response rate, the drug response rate, and the difference in response.
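A minimal sketch of the analysis step in this diagram, assuming the historical-data prior for the control rate has already been approximated by a Beta distribution and a vague Beta(1,1) prior is used for the drug arm; all counts are invented for illustration.

```r
## Minimal sketch (base R): analysis step of the diagram above. The Beta
## approximation of the historical prior and the new-study counts are
## illustrative assumptions.
set.seed(1)
a <- 14.4; b <- 75.6             # hypothetical Beta prior for the control rate
y_ctr <- 4;  n_ctr <- 20         # observed control data, new study (assumed)
y_drg <- 11; n_drg <- 40         # observed drug data, new study (assumed)

## Conjugate beta-binomial updates, then Monte Carlo for the difference
ctr    <- rbeta(1e5, a + y_ctr, b + n_ctr - y_ctr)   # posterior, control rate
drg    <- rbeta(1e5, 1 + y_drg, 1 + n_drg - y_drg)   # posterior, drug rate
d_post <- drg - ctr                                  # posterior, difference

c(mean = mean(d_post), quantile(d_post, c(0.025, 0.975)),
  prob_d_ge_0.2 = mean(d_post >= 0.2))
```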
Utilization in a quick kill quick win PoC Design
Decision rules based on the posterior probability of the treatment difference d:

                                 1st Interim   2nd Interim   Final analysis
Positive PoC if P(d ≥ 0.2) ...   ≥ 90%         ≥ 90%         ≥ 70%
Negative PoC if P(d < 0.2) ...   ≥ 50%         ≥ 50%         > 50%

With N=60, 2:1 Active:Placebo, IA's after 20 and 40 patients
Operating characteristics (pPlacebo = 0.15, 10,000 simulation runs):

Scenario   1st interim:        1st interim:        2nd interim:        2nd interim:        Final:           Final:   Overall
           stop for efficacy   stop for futility   stop for efficacy   stop for futility   claim efficacy   fail     power
d = 0             1.6%               49.0%                1.4%               26.0%               0.2%        21.9%      3.2%
d = 0.2          33.9%                5.1%               27.7%                3.0%               8.8%        21.6%     70.4%
d = 0.5          96.0%                0.0%                4.0%                0.0%               0.0%         0.0%    100.0%
R package available for design investigation
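The operating characteristics above come from simulation. A heavily simplified base-R sketch of such a simulation is shown below; it is not the R package mentioned above, it uses vague Beta(1,1) priors on both arms rather than an informative historical control prior, and it codes the stopping rules directly from the slide, so its numbers will only roughly track the table.

```r
## Minimal sketch (base R): simulating one operating characteristic of the
## quick-kill / quick-win design. Vague Beta(1,1) priors are an illustrative
## simplification of the actual design.
run_trial <- function(p_ctrl, p_act, looks = c(20, 40, 60), ratio = 2/3,
                      pos_cut = c(0.90, 0.90, 0.70), neg_cut = 0.50,
                      ndraw = 4000) {
  n_act_max <- round(max(looks) * ratio)
  x_act <- rbinom(n_act_max, 1, p_act)                # active-arm responses
  x_ctr <- rbinom(max(looks) - n_act_max, 1, p_ctrl)  # control-arm responses
  for (j in seq_along(looks)) {
    n_act <- round(looks[j] * ratio); n_ctr <- looks[j] - n_act
    y_act <- sum(x_act[seq_len(n_act)]); y_ctr <- sum(x_ctr[seq_len(n_ctr)])
    ## Posterior probability that the difference d is at least 0.2
    d <- rbeta(ndraw, 1 + y_act, 1 + n_act - y_act) -
         rbeta(ndraw, 1 + y_ctr, 1 + n_ctr - y_ctr)
    p_pos <- mean(d >= 0.2)
    if (p_pos >= pos_cut[j]) return("positive")                   # efficacy stop / claim
    if (1 - p_pos >= neg_cut && j < length(looks)) return("futility")
  }
  "fail"                                                          # negative at final
}

## Approximate overall power under d = 0.2 (slide reports about 70%)
set.seed(1)
mean(replicate(1000, run_trial(0.15, 0.35)) == "positive")
```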
Subgroup Analysis
Introduction to Subgroup analysis
 For biological reasons treatments may be more effective in
some populations of patients than others
• Risk factors
• Genetic factors
• Demographic factors
 This motivates interest in statistical methods that can
explore and identify potential subgroups of interest
Challenges with exploratory subgroup analysis
Random high bias – Fleming 2010
Effects of 5-Fluorouracil Plus Levamisole on Patient Survival Presented Overall and Within Subgroups, by Sex and Age*

Hazard ratio (risk of mortality)

Analysis       North Central Treatment Group Study (n = 162)   Intergroup Study #0035 (n = 619)
All patients   0.72                                             0.67
Female         0.57                                             0.85
Male           0.91                                             0.50
Young          0.60                                             0.77
Old            0.87                                             0.59
Assumptions to deal with extremes
Jones et al (2011)
 Similar methods to those used when combining historical data
 However, the focus is on the individual subgroup parameters γ1, …, γG rather than the prediction of a new subgroup
1) Unrelated parameters (u): γ1, …, γG
Assumes a different treatment effect in each subgroup
2) Equal parameters (c): γ1 = … = γG
Assumes the same treatment effect in each subgroup
3) Compromise (r)
Effects are similar/related to a certain degree
Comments on shrinkage estimation
 This type of approach is sometimes called shrinkage
estimation
 Shrinkage estimation attempts to adjust for random high
bias
 When relating subgroups, it is often desirable and logical to
use structures that allow greater similarity between some
subgroups than others
 A variety of possible subgroup structures can be examined
to assess robustness
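As a toy illustration of shrinkage, the sketch below applies a crude empirical-Bayes adjustment to four subgroup estimates, pulling each toward the overall mean. The estimates and standard errors are invented; a full analysis would use the hierarchical models of Jones et al (2011).

```r
## Minimal sketch (base R): empirical-Bayes shrinkage of subgroup treatment
## effects toward the overall mean. Estimates and standard errors are
## invented for illustration.
est <- c(female = 0.57, male = 0.91, young = 0.60, old = 0.87)  # subgroup estimates (assumed)
se  <- c(0.10, 0.12, 0.11, 0.13)                                # standard errors (assumed)

tau2   <- max(0, var(est) - mean(se^2))   # crude between-subgroup variance
w      <- tau2 / (tau2 + se^2)            # shrinkage weights in [0, 1]
shrunk <- w * est + (1 - w) * mean(est)   # extreme values pulled toward the mean

round(cbind(raw = est, shrunk = shrunk), 2)
```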
Subgroup analysis – Extension to multiple studies
Data summary from several studies
• Subgroup analysis in a meta-analytic context
• Efficacy comparison T vs. C
• Data from 7 studies
• 8 subgroups, defined by 3 binary baseline covariates A, B, C
• A, B, C high (+) or low (−), describing burden of disease (BOD)
• Idea: patients with higher BOD at baseline might show better efficacy
Graphical model
Subgroup analysis involving several studies
• Y1, …, YS: data from S studies
• Study-specific parameters θ1, …, θS: allow data to be combined from multiple studies
• Subgroup parameters γ1, …, γG: the main parameters of interest; various modeling structures can be examined
Extension to multiple studies
Example 3: sensitivity analyses across a range of subgroup structures
• 8 subgroups, defined by 3 binary baseline covariates A, B, C
• A, B, C high (+) or low (−), describing burden of disease (BOD)
Summary
Subgroup analysis
 Important to distinguish between exploratory subgroup
analysis and confirmatory subgroup analysis
 Exploratory subgroup analysis can be misleading due to
random high bias
 Evidence synthesis techniques that account for similarity
among subgroups will help adjust for random high bias
 Examine a range of subgroup models to assess the
robustness of any conclusions
Conclusions
• There is general agreement that good decision making should be
based on all relevant information
• However, this is not easy to do in a formal/quantitative way
• Evidence synthesis
- offers fairly well-developed methodologies
- has many areas of application
- is particularly useful for company-internal decision making (we have used
and will increasingly use evidence synthesis in our phase I and II trials)
- has become an important tool when making public health policy decisions
References
Evidence Synthesis/Meta-Analysis
DerSimonian, Laird (1986). Meta-analysis in clinical trials. Controlled Clinical Trials, 7;
177-88
Gould (1991). Using prior findings to augment active-controlled trials and trials with
small placebo groups. Drug Information J. 25 369--380.
Normand (1999). Meta-analysis: formulating, evaluating, combining, and reporting
(Tutorial in Biostatistics). Statistics in Medicine 18: 321-359.
See also Letters to the Editor
by Carlin (2000) 19: 753-59, and Stijnen (2000) 19:759-761
Spiegelhalter et al. (2004); see main reference
Stangl, Berry (eds) (2000). Meta-Analysis in Medicine and Health Policy. Marcel Dekker
Sutton, Abrams, Jones, Sheldon, Song (2000). Methods for Meta-analysis in Medical
Research. John Wiley & Sons
Trelle et al. (2011). Cardiovascular safety of non-steroidal anti-inflammatory drugs: network
meta-analysis. BMJ, 342: c7086
Historical Controls
Ibrahim, Chen (2000). Power prior distributions for regression models. Statistical
Science, 15: 46-60
Neuenschwander, Branson, Spiegelhalter (2009). A note on the power prior.
Statistics in Medicine, 28: 3562-3566
Neuenschwander, Capkun-Niggli, Branson, Spiegelhalter. (2010). Summarizing
Historical Information on Controls in Clinical Trials. Clinical Trials, 7: 5-18
Pocock (1976). The combination of randomized and historical controls in
clinical trials. Journal of Chronic Diseases, 29: 175-88
Spiegelhalter et al. (2004); see main reference
Thall, Simon (1990). Incorporating historical control data in planning phase II
studies. Statistics in Medicine, 9: 215-28
Subgroup Analyses
Berry, Berry (2004). Accounting for multiplicities in assessing drug safety:
a three-level hierarchical mixture model. Biometrics, 60: 418-26
Davis, Leffingwell (1990). Empirical Bayes estimates of subgroup effects in clinical trials.
Controlled Clinical Trials, 11: 37-42
Dixon, Simon (1991). Bayesian subgroup analysis. Biometrics, 47: 871-81
Fleming (2010). Clinical trials: discerning hype from substance. Annals of Internal
Medicine, 153: 400-406
Hodges, Cui, Sargent, Carlin (2007). Smoothing balanced single-error-term analysis of
variance. Technometrics, 49: 12-25
Jones, Ohlssen, Neuenschwander, Racine, Branson (2011). Bayesian models for subgroup
analysis in clinical trials. Clinical Trials, 8: 129-143
Louis (1984). Estimating a population of parameter values using Bayes and empirical Bayes
methods. JASA, 79: 393-98
Pocock, Assman, Enos, Kasten (2002). Subgroup analysis, covariate adjustment and
baseline comparisons in clinical trial reporting: current practice and problems. Statistics in
Medicine, 21: 2917-2930
Spiegelhalter et al. (2004); see main reference
Thall, Wathen, Bekele, Champlin, Baker, Benjamin (2003). Hierarchical
Bayesian approaches to phase II trials in diseases with multiple subtypes, Statistics in
Medicine, 22: 763-80
Poisson network meta-analysis model
 Model extension to K treatments: Lu, Ades (2004). Combination of direct and indirect evidence in mixed treatment comparisons. Statistics in Medicine, 23: 3105-3124
r_it ~ Poisson(λ_it · E_it),   i = 1, …, S;  t = 1, …, K

log(λ_i1) = μ_i − δ_i2/K − … − δ_iK/K
log(λ_i2) = μ_i + ((K − 1)/K)·δ_i2 − δ_i3/K − … − δ_iK/K
…
log(λ_iK) = μ_i − δ_i2/K − δ_i3/K − … + ((K − 1)/K)·δ_iK

where μ_i = (1/K) Σ_k log(λ_ik) is the mean log rate in study i and δ_ik = log(λ_ik) − log(λ_i1) is the contrast of treatment k versus treatment 1.

 Different choices for the μ's and δ's. They can be:
• common (over studies), fixed (unconstrained), or “random”
• Note: random δ's imply a (K − 1)-dimensional random-effects distribution
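For intuition, a fixed-effect version of this Poisson likelihood can be fitted with standard software. The base-R sketch below uses glm() with study and treatment factors and a log-exposure offset; this is a conventional parametrization rather than the mean-centred Lu and Ades one, and the data are invented for illustration.

```r
## Minimal sketch (base R): fixed-effect Poisson network meta-analysis fitted
## by glm(). Data are invented; the parametrization uses study intercepts and
## treatment contrasts versus a reference, not the mean-centred form above.
dat <- data.frame(
  study     = factor(rep(1:3, each = 2)),
  treatment = factor(c("A", "B",  "A", "C",  "B", "C")),
  r         = c(12, 8, 15, 9, 10, 7),          # event counts per arm (assumed)
  E         = c(100, 105, 120, 118, 90, 95)    # exposure in patient-years (assumed)
)

fit <- glm(r ~ study + treatment + offset(log(E)),
           family = poisson, data = dat)

exp(coef(fit)[c("treatmentB", "treatmentC")])  # rate ratios vs reference A,
                                               # combining direct and indirect evidence
```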
Acknowledgements
Stuart Bailey, Björn Bornkamp, Beat Neuenschwander, Heinz Schmidli, Min Wu, Andrew Wright