Efficient source data verification in randomized trials

Marc Buyse
IDDI, Louvain-la-Neuve, and I-BioStat, Hasselt University, Belgium

University of Pennsylvania Annual Conference on Statistical Issues in Clinical Trials
April 13, 2011
Outline
1. Trials as a cost-effective, sustainable activity
2. Scientific vs. regulatory requirements
3. The continuum from errors to fraud
4. Monitoring strategies
   – Extensive monitoring
   – Reduced monitoring
   – Targeted monitoring
5. The SMART project
6. Conclusions
Potential reductions in clinical trial costs

Assumptions:
• Treatment of a chronic disease
• 20,000 patients
• 1,000 sites
• 48 months: enrollment (24) + follow-up (24)
• 24 visits per site (every other month)
• 60-page CRF
• $10,000 site payment per patient

Total budget: $421M, of which
• Coordinating center: $170M (40%)
• Site payments: $200M (48%)
• Other costs (travel, meetings, etc.): $51M (12%)

Ref: Eisenstein et al, Clinical Trials 2008;5:75.
Potential reductions in clinical trial costs

                 Scenario 1    Scenario 2        Scenario 3
Planning         4 mths        4 mths            4 mths
Accrual          24 mths       18 mths           18 mths
Sites            1,000         750               100
Site visits      24            4                 none
CRF              60 pages      20 pages + EDC    5 pages + EDC
Cost per site    $10,000       $5,000            $650

Ref: Eisenstein et al, Clinical Trials 2008;5:75.
Scientific vs. regulatory requirements for a clinical trial

From a scientific point of view, a trial must estimate the effect of a treatment without bias. Randomized trials enable such unbiased inference even in the presence of massive random errors, which merely cause conservatism (in tests for superiority).

From a regulatory point of view, a trial must provide verifiable evidence that it was carried out according to specifications. The absence of errors must be demonstrated regardless of their consequences.
The continuum from errors to fraud

Type                     Typical examples                          Intent
Errors                   Poorly calibrated equipment               Wholly unintentional
Sloppiness               Data missing or incorrectly copied        Limited awareness
                         from source documents
Fraud                    Data fabricated to avoid missing data     Deliberate
                         or to create patients
Treatment-related fraud  Data fabricated or falsified to favor     Definite "intention to cheat"
                         a treatment
The continuum from errors to fraud

Type                     Typical examples                          Impact
Errors                   Poorly calibrated equipment               Potential (small) loss in power / no bias
Sloppiness               Data missing or incorrectly copied        Potential (small) loss in power / no bias
                         from source documents
Fraud                    Data fabricated to avoid missing data     Unknown effect on power / no bias
                         or to create patients
Treatment-related fraud  Data fabricated or falsified to favor     Definite bias
                         a treatment
The continuum from errors to fraud

Type                     Typical examples                          Ease of detection
Errors                   Poorly calibrated equipment               Difficult to detect
Sloppiness               Data missing or incorrectly copied        May be hard to detect
                         from source documents
Fraud                    Data fabricated to avoid missing data     Detectable through center comparisons
                         or to create patients
Treatment-related fraud  Data fabricated or falsified to favor     Detectable through treatment-by-center comparisons
                         a treatment
Monitoring strategies
Extensive monitoring
• 100% SDV for primary and key secondary
outcomes
Reduced monitoring
• Random sampling of centers / patients /
outcomes to ensure rate of errors < x%
• Risk-adapted monitoring
Targeted monitoring
• Monitoring based on Key Risk Indicators
• Statistical Monitoring
Extensive monitoring

"(...) trial management procedures ensuring validity and reliability of the results are vastly more important than absence of clerical errors. Yet, it is clerical inconsistencies, referred to as 'errors', that are chased by the growing GCP departments."

Ref: Lörstad, ISCB-27, Geneva, August 28-31, 2006
"Monitoring confirms consistency between data collection forms and source documents; if the source documents are wrong because of laboratory, clinical, or clerical errors, then monitoring adds expense without benefit. A common misinterpretation of sponsors is that GCP requires audits of 100% of data; by contrast, random audits might suffice."

Ref: Glickman et al, NEJM 2009;360:816.
Reduced monitoring

[Diagram: random sampling across countries, centers, patients, items, and visits]
Risk-adapted monitoring: risk categories

Risk A – Negligible risk (non-invasive procedures)
Risk B – Risk similar to that of usual care (trials involving approved drugs)
Risk C – High risk (phase III trials of new agents, new indications, or at-risk populations)
Risk D – Very high risk (phase I or II trials of new agents)
OPTIMON: OPTimisation of MONitoring for clinical research studies

Design: centers accruing > 5 patients in several trials, with trials stratified by risk group (A, B, C), are randomized between a control arm ("pharma" standards) and an experimental arm (fewer visits / checks).

Goal: non-inferiority of the proportion of patients with at least one severe error in informed consent, suspected unexpected serious adverse event reports, major eligibility criteria, or primary endpoint (expected: 95%, with a non-inferiority margin of 5%).

Source: Geneviève Chêne, University Teaching Hospital Bordeaux, France
https://ssl2.isped.u-bordeaux2.fr/optimon/Documents.aspx
Targeted monitoring based on Key Risk Indicators

[Diagram: data management and the monitoring team use Key Risk Indicators to focus on specific countries, centers, patients, items, and visits]
Examples of "Key Risk Indicators"

Study conduct
• Actual accrual vs. target
• % patients with protocol violations
• % dropouts
• …

Treatment compliance
• % dose reductions
• % dose delays
• Reasons for treatment stops
• …

Safety
• AE rate
• AE grade 3/4 rate
• SAE rate
• …

Data management
• Overdue forms
• Query rate
• Query resolution time
• …
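In code, such indicators reduce to simple per-center aggregates. The sketch below computes a few of them from flat patient records; the record fields (`center`, `violation`, `dropout`) are illustrative assumptions, not part of any tool shown in the slides.

```python
from collections import defaultdict

def key_risk_indicators(patients):
    """Per-center KRIs from a list of patient records.
    Each record is a dict with 'center', 'violation' (bool), 'dropout' (bool)."""
    by_center = defaultdict(list)
    for p in patients:
        by_center[p["center"]].append(p)
    kris = {}
    for center, ps in by_center.items():
        n = len(ps)
        kris[center] = {
            "accrual": n,  # actual accrual, to be compared to target
            "pct_violations": 100 * sum(p["violation"] for p in ps) / n,
            "pct_dropouts": 100 * sum(p["dropout"] for p in ps) / n,
        }
    return kris

# Tiny illustrative dataset
patients = [
    {"center": "A", "violation": False, "dropout": False},
    {"center": "A", "violation": True,  "dropout": False},
    {"center": "B", "violation": False, "dropout": True},
    {"center": "B", "violation": False, "dropout": False},
]
print(key_risk_indicators(patients))
```

Centers whose indicators drift far from the trial-wide averages would then be prioritized for monitoring visits.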
Targeted monitoring based on statistical monitoring

[Diagram: SMART feeds data management and the monitoring team with statistical flags across countries, centers, patients, items, and visits]
Targeted monitoring
(…)
Ref: Baigent et al, Clinical Trials 2008;5:49.
Principles behind statistical checks

• Plausible data are hard to fabricate → check plausibility
  (e.g. mean, variance, correlation structure, outliers, inliers, dates, etc.)
• Humans are poor random number generators → check randomness
  (e.g. Benford's law for the first digit, digit preference, etc.)
• Clinical trial data are highly structured → check comparability
  (e.g. between centers, treatment arms, etc.)
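As a concrete illustration of the randomness check, here is a minimal sketch of a Benford first-digit test: a chi-squared comparison of observed leading-digit frequencies against Benford's proportions. The data and thresholds are illustrative; SMART's actual battery of tests is not reproduced here.

```python
import math
from collections import Counter

def first_digit(x):
    """Leading decimal digit of a nonzero number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def benford_chi2(values):
    """Chi-squared distance (8 df) between observed first-digit
    frequencies and Benford's law; large values suggest fabrication."""
    nonzero = [v for v in values if v]
    obs = Counter(first_digit(v) for v in nonzero)
    n = len(nonzero)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # Benford proportion of digit d
        chi2 += (obs.get(d, 0) - expected) ** 2 / expected
    return chi2

genuine = [2 ** k for k in range(1, 200)]   # powers of 2 closely follow Benford
fabricated = [55, 56, 57, 58, 59] * 20      # strong preference for first digit 5
print(benford_chi2(genuine), benford_chi2(fabricated))
```

The fabricated series produces a chi-squared statistic orders of magnitude larger than the genuine one, which is exactly the kind of center-level flag a statistical monitoring system can raise automatically.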
Ref: Encyclopaedic Companion to Medical Statistics (Everitt B, Palmer C, eds.), Arnold Publishers Ltd, London, 2010.
SMART*

Software that systematically performs a large battery of statistical tests on the values of all variables collected in a clinical trial. These tests generate a large number of p-values, ranks, and other statistics that are kept in a database for checks of randomness, plausibility, and comparability.

* Statistical Monitoring Applied to Randomized Trials
Brute-force approach

• In multicenter trials, the distribution of every variable can be compared between each center and all other centers
• These tests can be applied automatically, without regard to meaning or plausibility
• They yield a very large number of center-specific statistics
• Meta-statistics can be applied to these statistics to identify outlying centers
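The center-vs-rest idea can be sketched in a few lines: for one variable, test each center's mean against all other centers pooled, then apply a simple meta-rule (here a Bonferroni cutoff) to flag outliers. The z-test and the simulated data are illustrative assumptions, not SMART's actual tests.

```python
import math
import random
from statistics import mean, pvariance

def center_vs_rest_flags(data, alpha=0.05):
    """For each center, compare its values for one variable against all
    other centers pooled (two-sample z-test on means), then flag centers
    whose p-value survives a Bonferroni correction over all centers."""
    flags = []
    centers = list(data)
    for c in centers:
        x = data[c]
        rest = [v for other, vals in data.items() if other != c for v in vals]
        z = (mean(x) - mean(rest)) / math.sqrt(
            pvariance(x) / len(x) + pvariance(rest) / len(rest))
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        if p < alpha / len(centers):          # Bonferroni meta-rule
            flags.append(c)
    return flags

# Simulated trial: 10 centers of 30 patients; center "C9" has shifted values
random.seed(1)
data = {f"C{i}": [random.gauss(0, 1) for _ in range(30)] for i in range(10)}
data["C9"] = [random.gauss(3, 1) for _ in range(30)]
print(center_vs_rest_flags(data))
```

Applied to every variable in the database, this yields the "very large number of center-specific statistics" mentioned above, on which further meta-statistics can be computed.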
An example

• Trial in depression
• Two stages:
  – an open-label run-in treatment stage
  – a double-blind randomized treatment stage
• ≈ 800 patients from ≈ 70 centers
Exemplary findings: heart rate / blood pressure

• To be taken at each visit, in two positions (supine/standing)
• Variability suspiciously low for several centers
• "Strange" patient:

VISIT  POS   HR  SYSBP  DIABP
  1     1    72   115     75
  1     2    70   110     70
  2     1    72   115     75
  2     2    70   110     70
  3     1    70   110     75
  3     2    70   110     70
  4     1    72   110     75
  4     2    70   105     70
  5     1    74   115     75
  5     2    72   110     70
 ...   ...  ...   ...    ...
Is it worth asking for inessential, tedious measurements?
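Patterns like this one are easy to screen for automatically. The sketch below flags a measurement series whose spread is implausibly small and whose values are all multiples of 5 (terminal digit preference), using the blood pressure readings of the "strange" patient above; the standard-deviation threshold is an arbitrary illustrative choice.

```python
from statistics import pstdev

# Blood pressure readings of the "strange" patient (5 visits × 2 positions)
sysbp = [115, 110, 115, 110, 110, 110, 110, 105, 115, 110]
diabp = [75, 70, 75, 70, 75, 70, 75, 70, 75, 70]

def suspicious(values, sd_threshold=4.0):
    """Flag a series whose standard deviation is implausibly small
    and whose terminal digits show strong preference for 0 and 5."""
    rounded = all(v % 5 == 0 for v in values)
    return pstdev(values) < sd_threshold and rounded

print(suspicious(sysbp), suspicious(diabp))
```

Both series are flagged, while a series of genuinely measured pressures (which rarely land only on multiples of 5) would not be.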
Exemplary findings: baseline MADRS score

• MADRS score (the sum of scores on 10 items) < 12 required to enter the randomized stage
• Half of the patients were expected to have a score < 12 after the run-in period
• In reality, 67% had a score < 12
• "Strange" centers:
  – Center A: 5 8 5 4 7 8 9 4 6 5 7 5 4 3
  – Center B: 11 11 11 11 10 11 11 11 11
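One way to quantify the Center B pattern is to measure how tightly eligible scores cluster just below the cutoff. The function below does this for the two centers listed above; the 2-point band is an illustrative assumption, the cutoff of 12 is from the protocol described here.

```python
def cluster_below_cutoff(scores, cutoff=12, band=2):
    """Fraction of eligible scores (< cutoff) that sit within `band`
    points below the cutoff. Values near 1 suggest scores may have
    been nudged just under the eligibility threshold."""
    eligible = [s for s in scores if s < cutoff]
    just_under = [s for s in eligible if s >= cutoff - band]
    return len(just_under) / len(eligible) if eligible else 0.0

center_a = [5, 8, 5, 4, 7, 8, 9, 4, 6, 5, 7, 5, 4, 3]
center_b = [11, 11, 11, 11, 10, 11, 11, 11, 11]
print(cluster_below_cutoff(center_a), cluster_below_cutoff(center_b))
```

Center A's scores spread naturally well below the cutoff (fraction 0), whereas every one of Center B's scores sits just under it (fraction 1), which is the kind of inlier pattern a statistical check can surface for follow-up.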
Conclusions

• Current clinical research practices (such as intensive on-site monitoring and 100% source data verification) are neither useful, effective, nor sustainable
• A statistical approach to quality assurance could yield huge cost savings and yet increase the reliability of trial results
• Regulatory requirements should evolve accordingly