• General endpoint considerations
• Surrogate endpoints
• Composite endpoints and recurrent events
• Safety outcomes (adverse events)
• Some background
• Definitions
• Data collection considerations
• Reporting of safety outcomes
• Recommendations
• The collection of safety outcomes in trials is often not done with the same rigor as for efficacy outcomes (e.g., less focused, less thought given to the data collection protocol).
• Safety reporting in trials and in systematic reviews is inadequate (this is also true for interim analyses) (Ioannidis, JAMA 2001, and Ernst, BMJ 2001).
CONSORT guidelines were modified in 2004 (Ann Intern Med).
• Major safety outcomes should be recorded irrespective of attribution to study treatment.
• Safety is best assessed using aggregated data against a randomized control.
• Safety of treatments is not known with certainty until they have been used for many years (Lasser, JAMA 2002).
• FIAU’s toxicity mimicked the underlying disease and other medications taken by the patients (hepatic and pancreatic adverse events not attributed to the treatment).
• FIAU’s toxicity was not predicted by animal studies.
• Each adverse event was considered separately – no cumulative analysis of events by treatment group.
• In the NIH trial, 15 patients were randomized to 2 doses of FIAU for 24 weeks (no placebo control).
• Study stopped because one patient developed liver failure.
• Ultimately, 7 patients experienced severe liver toxicity and 5 died after cessation of study treatment.
• Retrospective review of 2 other studies revealed delayed toxicities of FIAU.
N Engl J Med 1995; 333: 1099-1105.
• Rosiglitazone was approved by the FDA in 1999 on the basis of short-term studies indicating that it lowered glucose and HbA1c.
• Questions about safety arose based on a meta-analysis by Nissen and Wolski in 2007.
• FDA advisory committee concludes that rosiglitazone increases the risk of MI compared to placebo and notes limitations of the meta-analysis. FDA adds a “boxed” warning and recommends a long-term head-to-head comparison with another diabetes drug to compare CVD outcomes, but does not withdraw the drug from the market.
• The TIDE trial is initiated, but its ethics are questioned; the Senate Finance Committee discusses the data, and ultimately the trial is discontinued and severe restrictions are placed on use of the drug.
IOM Report 2012, Ethical and Scientific Issues in Studying the Safety of Approved Drugs.
• FDA Panel reviews CVD safety data again
– 13 panel members vote to ease the risk mitigation strategy, including the special certification required to use the drug
– 7 vote to remove all restrictions
– 5 vote to continue current restrictions
• Short-term studies using surrogate outcomes are not powered to identify clinically relevant safety signals.
• Many trials done for registration are not conducted to reliably assess major safety issues (e.g., data not collected following treatment discontinuation, major clinical outcomes not collected in a standardized manner, good meta-analyses hard to do).
• It is difficult to do a trial to confirm a safety signal.
• International Conference on Harmonisation (ICH) of Technical Requirements for Registration of Pharmaceuticals for Human Use
– Clinical Safety Data Management: Definitions and Standards for Expedited Reporting (E2A)
• Regulatory authorities have reporting requirements for adverse events
– Code of Federal Regulations (CFR) Title 21, Part 312 describes safety reporting for investigational drugs (trials being carried out under an IND with the FDA)
– EU Directive provides guidance for reporting safety data for investigational medicinal products
• Adverse events that require reporting to regulatory authorities
– Serious Adverse Events (SAEs)
– Suspected Unexpected Serious Adverse Reactions (SUSARs)
• Adverse events for planned treatment comparisons
– Treatment discontinuation due to adverse effects
– Adverse effects according to severity by standard grading table
• Open-ended
• Side effect check-list
• Another way of thinking of the categorization:
– Case reports of individual events
– Data on case report forms with which to compute counts and rates of events by treatment group
• Both are important (refer to week 1 notes for discussion of case histories and series)
Serious Events Definition – ICH Guidelines
• Events resulting in death
• Life-threatening events
• Events leading to hospitalization or prolonging existing hospitalization
• Events leading to persistent/significant disability or incapacity
• Congenital abnormalities/birth defects
• Other important medical events that may jeopardize the participant or require intervention to prevent one of the other outcomes above
• Unexpected – an adverse drug experience which is not consistent with the current investigator brochure (or label for approved drug)
• Related – cannot rule out the possibility that the treatment caused the adverse event (i.e., the investigator cannot check “not related”)
• SUSAR = Suspected Unexpected Serious Adverse Reaction
• SUSAR reporting is required by European Union (EU) Directive 2001/20/EC for all trials being conducted at sites in the European Economic Area (EEA) that use investigational medicinal products (IMPs)
• Serious: per ICH GCP Guideline E2A
• Unexpected: per labeling of the suspect agent
• Suspected adverse reaction: related to treatment
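As a minimal sketch of how the three SUSAR criteria above combine, the Python below flags an event as a SUSAR only when it is serious, unexpected, and suspected to be related to treatment. The `AdverseEvent` record and its field names are hypothetical, chosen for illustration; in practice these judgments are captured on case report forms and in the sponsor's safety database.

```python
from dataclasses import dataclass


@dataclass
class AdverseEvent:
    # Hypothetical record; field names are illustrative assumptions.
    serious: bool     # meets a seriousness criterion (e.g., death, hospitalization)
    unexpected: bool  # not consistent with the investigator brochure or label
    related: bool     # a causal relationship to treatment cannot be ruled out


def is_susar(event: AdverseEvent) -> bool:
    """A SUSAR must satisfy all three criteria at once."""
    return event.serious and event.unexpected and event.related


# Serious and related, but already described in the brochure: an SAE, not a SUSAR.
expected_sae = AdverseEvent(serious=True, unexpected=False, related=True)
new_signal = AdverseEvent(serious=True, unexpected=True, related=True)
print(is_susar(expected_sae), is_susar(new_signal))  # False True
```

Keeping the three judgments as separate fields, rather than storing a single SUSAR flag, makes it possible to count all serious events by treatment group regardless of attribution.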
Recommendations for Reporting Serious Adverse Events
• Require serious events to be reported on all participants, regardless of treatment status or relationship to study treatment (attribution is unreliable and hard to standardize).
• This allows a randomized comparison of the occurrence of serious events over the course of the trial
• Establish systems to meet expedited reporting requirements by regulatory authorities (within 7 days of awareness)
– IND safety reports for serious adverse events (SAEs) associated with the treatment and unexpected (e.g., see 21 CFR 312.32 for drugs)
– SUSARs
• The sponsor or designee sends individual case reports of SAEs or SUSARs to all investigators using the investigational product (e.g., an investigator may receive safety reports for patients in another institution and in another trial)
• Investigators notify IRB/EC
• Sponsor updates investigator brochure on a regular basis
• Problem: individual case reports provide numerators but not denominators
• Where do you draw the line?
– Collect all adverse events irrespective of severity?
– Only serious adverse events?
– Only while the participant is taking the study treatment?
• Standardization
– Tables for grading AEs for severity
– Open-ended or checklist
– MedDRA coding
– Event review committee
• More detailed data collection is important in early phase studies.
• Describe safety data collection and reporting in the protocol and in the trial report
• Evaluation of 75,598 “routine” adverse events collected on 1,181 patients between 1999 and 2001.
• An average of 2,588 adverse events per study.
• Most adverse events reported were mild; 3% were severe; 1% required expedited reporting
• Much of what is collected is not reported.
Mahoney M, et al, J Clin Oncol 2005; 23:9275-9281
• You will fail!
• Bradford Hill suggested the following 3 questions be asked by the person developing a case report form:
– Is this question essential?
– Can I obtain useful answers to it?
– Can I analyze them usefully at the end?
• Peto is less diplomatic: “…the statistician should, at the design stage, cross out most of the things that the trial organizer wants to ask.”
Hill, Principles of Medical Statistics; Peto, Biomedicine 28:24-36, 1978.
A Minimalist’s Approach: Composite Hierarchical Outcomes for Safety
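[Figure: treatments A and B compared on a nested hierarchy of composite safety outcomes]
– Death
– Death or SAE
– Death, SAE, or severe AE
– Death, SAE, severe AE, or treatment discontinuation (D/C) due to AE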
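The nested hierarchy above (death; death or SAE; death, SAE, or severe AE; and so on) can be tabulated from one value per participant: the highest-ranked event that participant experienced. The following is a minimal sketch under that assumption. The level labels and the example data are hypothetical and do not come from any trial.

```python
from collections import Counter

# Hierarchy from most to least severe; labels are illustrative assumptions.
LEVELS = ["death", "sae", "severe_ae", "dc_due_to_ae"]


def composite_counts(worst_events):
    """Count participants reaching each nested composite outcome.

    worst_events holds one entry per participant: the highest-ranked
    level that participant experienced, or None if no event occurred.
    """
    counts = Counter(e for e in worst_events if e is not None)
    cumulative = {}
    running = 0
    for level in LEVELS:
        running += counts[level]
        cumulative[level] = running  # participants with this level or anything worse
    return cumulative


# Hypothetical data for one treatment arm of 8 participants.
arm_a = ["death", "sae", None, "severe_ae", "sae", None, "dc_due_to_ae", None]
print(composite_counts(arm_a))
# {'death': 1, 'sae': 3, 'severe_ae': 4, 'dc_due_to_ae': 5}
```

Running the same function on each arm and dividing by the number randomized gives the proportions to compare between treatments A and B at every level of the hierarchy.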
• Herpes zoster (shingles) vaccine trial; 38,546 participants 60 years of age and older
• Safety evaluation
– Adverse events within 42 days of vaccination for all participants
– Serious adverse events for all of follow-up (3+ years) for all participants
– Substudy of 300 participants to closely monitor symptoms using a checklist within 42 days of vaccination
N Engl J Med 2005; 352: 2271-2284.
Parameter           Grade 1 (Mild)       Grade 2 (Moderate)    Grade 3 (Severe)                      Grade 4 (Potentially life-threatening)
Allergic reaction   Pruritus w/o rash    Localized urticaria   Generalized urticaria or angioedema   Anaphylaxis
Creatinine          >1 – 1.5 × ULN       >1.5 – 3 × ULN        >3.0 – 6 × ULN                        >6.0 × ULN
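To show how a standard grading table removes judgment from severity coding, here is a minimal sketch that applies the creatinine thresholds shown above (multiples of the upper limit of normal, ULN). The function name and the use of grade 0 for values at or below the ULN are assumptions made for illustration; an actual trial would follow the grading table named in its protocol.

```python
def creatinine_grade(value: float, uln: float) -> int:
    """Map a creatinine value to a severity grade using multiples of the ULN.

    Returns 0 when the value is at or below the ULN (an assumed convention),
    otherwise 1 (mild) through 4 (potentially life-threatening).
    """
    if uln <= 0:
        raise ValueError("ULN must be positive")
    ratio = value / uln
    if ratio > 6.0:
        return 4
    if ratio > 3.0:
        return 3
    if ratio > 1.5:
        return 2
    if ratio > 1.0:
        return 1
    return 0


# Example values expressed directly as multiples of a ULN of 1.0.
print([creatinine_grade(v, 1.0) for v in (1.0, 1.2, 2.0, 4.0, 7.0)])  # [0, 1, 2, 3, 4]
```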
• Structured data collection is easier to standardize, process and analyze
• Structured data is easier to combine across investigators/studies
• Greater chance of missing something with structured data collection
• No or less freedom of expression with structured data collection
1) “Did you look at any of the following newspapers or magazines or journals yesterday? Put a tick (√) against each of them that you definitely looked at yesterday. Put a cross (X) if you did not look at it yesterday.”
Daily Mail [ ]
The Times [ ]
...
Daily Herald [ ]
List all others:
Belson W., Duncan J. Applied Statistics , 1962.
2) “Did you look at any newspapers or magazines or journals yesterday? This would include daily papers, Sunday papers, weeklies and any that come out monthly. Write down the name of each that you definitely looked at yesterday.”
[Table: results by type of publication (Daily, Sunday, Weekly & Monthly) and response method (√ list vs. open response) for Group 1 and Group 2. Values in source order. Group 1: -, 1.72, 0.78, 1.23, 0.17. Group 2: 1.52, 1.12, -, 0.90, -, 0.23.]
Publication       Percent √’d   Percent Mentioned with Open Response   Ratio*
Daily Mail        16            12                                     0.78
The Times         4             3                                      0.70
Daily Herald      9             5                                      0.54
Evening News      39            27                                     0.70
Daily Mirror      46            39                                     0.85
Star              21            12                                     0.56
Daily Telegraph   17            13                                     0.77
* Before rounding
Average Number of Publications Identified Which Did Not Appear in the √ List
[Table: by type of publication (Daily, Sunday, Weekly & Monthly, ALL) and response method (√ list vs. open response) for Group 1 and Group 2. Values in source order. Group 1: 0.06, 0.18, 0.29, 0.47, 0.26, 0.39, 0.71. Group 2: 0.35, 0.61, -, 0.33, 0.68, -, 0.57, 1.18.]
Publication        Percent Mentioned Under “All Others”   Percent Mentioned with Open Response   Ratio*
News Chronicle     6                                      7                                      1.12
Daily Express      15                                     25                                     1.72
Daily Sketch       7                                      13                                     1.95
Evening Standard   5                                      10                                     1.94
All Others         3                                      5                                      1.99
* Before rounding
1. There is a sharp difference in the two methods indicating that at least one of them may be in error when assessing yesterday’s behavior.
2. The yields from the two systems cannot be compared or pooled either within a survey or between surveys.
3. One cannot assume that using an “all others” response in the √ list solves the problem of not having an exhaustive √ list.
4. Unstructured questions provide more freedom of expression and structured questions allow less expression of individuality.
5. Some TV programs which were not broadcast the day before were included in the √ list and actually checked by some respondents. They were not mentioned in the open response suggesting an inflating tendency with the √ list.
Collection of Adverse Events in Clinical Trials
• After taking placebo for 1 month (as part of a run-in to a trial) 214 men with benign prostatic hyperplasia were randomly assigned to 3 methods for collecting adverse events.
• 3 self-administered forms used.
• No assessment of severity; responses to open-ended questions were recorded
Bent S et al, Ann Intern Med 2006; 144:257-261.
A Randomized Trial of Methods for Assessing Medical Problems - 2
Method                               Total Adverse Events Reported   No. (%) with Event
Open-Ended Questionnaire (N=70)      11                              10 (14.3%)
Open-Ended Defined Question (N=70)   14                              9 (12.9%)
Checklist (N=74)                     238                             57 (77.0%)
Open-Ended Questionnaire: “Did you have any significant medical problem since your last visit?”
Open-Ended Defined Question: “Since the last study visit, have you limited your usual daily activities for more than 1 day because of a medical problem?”
Checklist: “Since the last visit, have you experienced any of the following?” (53 symptoms)
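The contrast between the three collection methods is easier to see as rates. The sketch below recomputes the percentage of participants with an event from the counts reported above and adds events per participant, a derived quantity that is not in the original table. The dictionary keys are illustrative names only.

```python
# Counts taken from the table above (Bent et al.); key names are illustrative.
arms = {
    "open_ended": {"n": 70, "events": 11, "with_event": 10},
    "open_ended_defined": {"n": 70, "events": 14, "with_event": 9},
    "checklist": {"n": 74, "events": 238, "with_event": 57},
}


def summarize(arm):
    """Return (percent of participants with an event, events per participant)."""
    pct_with_event = 100 * arm["with_event"] / arm["n"]
    events_per_participant = arm["events"] / arm["n"]
    return round(pct_with_event, 1), round(events_per_participant, 2)


for name, arm in arms.items():
    print(name, summarize(arm))
# open_ended (14.3, 0.16)
# open_ended_defined (12.9, 0.2)
# checklist (77.0, 3.22)
```

The checklist yields roughly 20 times as many events per participant as the open-ended question, which illustrates why results collected by different methods cannot be pooled.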
(See Chapter 11 of Friedman, Furberg, and DeMets)
Percent Ever Reporting Bloody Stools
                              Aspirin   Placebo
Volunteered (open-ended)      1.29      0.45
Elicited (√-list)             4.86      2.99
Reason for dosage reduction   0.22      0.04
Visit-Driven Checklist
Open-Ended Questionnaire for Serious Events
Event-Driven Form
• Medical Dictionary for Regulatory Activities
• Used in U.S., EU, and Japan; mandated for safety reporting in EU and Japan
• Each event assigned an 8-digit number
• Multi-axial: one event may be linked to multiple System Organ Classes
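MedDRA hierarchy (number of terms at each level):
– System Organ Class (SOC): n = 26
– High Level Group Term (HLGT): n > 300
– High Level Term (HLT): n > 1,600
– Preferred Term (PT): n > 18,000
– Lowest Level Term (LLT): n > 66,000
Example term shown in the hierarchy: Ischaemic Coronary Artery Disorders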
Was event ascertainment similar for the treatments under study?
When did collection of AEs begin and end?
What was the timing of follow-up assessments for adverse events?
How were AEs defined?
Were severe events centrally reviewed and adjudicated?
“Grade 4 events were reported irrespective of their perceived relationship to the use of interleukin-2 or antiretroviral therapy and were coded according to the Medical Dictionary for Regulatory Activities (version 12.0)”
ESPRIT HIV Study. N Engl J Med 2009; 361:1548-59.
“The study was designed as an intention-to-treat analysis with data of clinical events (excluding survival) and each subject’s tolerance censored at the termination of the study medication.”
AIDS 1994;8:1185.
“We used a five-point scale…to grade adverse events occurring while the patient was taking study drugs and during the eight weeks after their permanent discontinuation”.
“All analyses were performed according to intention to treat”.
N Engl J Med 1996;335:1099-1106.
• Many trial reports are lacking in safety data collection and reporting, which limits the ability to weigh risks versus benefits.
• Collect less, better, and report it.
• Often a combination of “visit-driven” and “event-driven” data collection is optimal, with use of both a checklist and open-ended response questions.
• Collect safety outcomes for entire trial duration irrespective of relationship to treatment.
• Some treatments, even if taken for a short period of time, may delay or reverse disease progression.
• Likewise, some treatments, even if taken for a short time, may cause toxicities (e.g., liver damage, an acceleration of atherosclerosis) that may not manifest themselves while taking the treatment.
• A “Consumer Reports” analysis of safety and efficacy outcomes is helpful, but cannot be done reliably unless you collect data on all patients for the duration of the study.
• Composite hierarchical safety outcomes can be useful to supplement an analysis of the individual safety components.
• Interim treatment comparisons, for which numerators and denominators are available, are the most important component of safety monitoring. These must be done to supplement the requirements for reporting individual case summaries to IRBs/ECs and regulatory authorities.
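An interim comparison with numerators and denominators can be as simple as the risk of a serious event in each arm, the risk difference, and the relative risk with an approximate confidence interval. The following is a minimal sketch with hypothetical counts. The function name and the large-sample log-scale interval are choices made for illustration, not a prescribed monitoring method.

```python
import math


def compare_arms(events_a, n_a, events_b, n_b):
    """Compare event risks between two randomized arms.

    Requires at least one event in each arm, since the relative risk and
    its log-scale standard error are undefined otherwise.
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rr = risk_a / risk_b
    # Large-sample standard error of log(relative risk).
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
    upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
    return {
        "risk_a": risk_a,
        "risk_b": risk_b,
        "risk_difference": risk_a - risk_b,
        "relative_risk": rr,
        "rr_95ci": (lower, upper),
    }


# Hypothetical interim data: 30 of 500 vs. 18 of 500 participants with an SAE.
result = compare_arms(30, 500, 18, 500)
print(round(result["relative_risk"], 2))  # 1.67
```

Individual case reports supply only the 30 and the 18. The comparison becomes interpretable once the number randomized in each arm is known.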
• Targeted data collection based on stage of research
• Pre-identification of data not necessary to collect
• Collection of data in a subsample
• Decreased frequency of laboratory monitoring
Determining the extent of safety data collection needed in late stage premarket and postapproval clinical investigations, February 2012.