FDA and M-CERSI Workshop: Complex Data Analytics. Henry “Skip” Francis, M.D.

FDA and M-CERSI Workshop.
Complex data analytics
Henry “Skip” Francis, M.D.
Data Mining Research and Evaluation Team, FDA/CDER
September 11, 2015
Pharmacovigilance
IT Tools Should Not Replace Our Own Eyes or Good Clinical Judgment
(Slide courtesy of June Almenoff, M.D., Ph.D.)
• Human judgment is irreplaceable
DIA
www.diahome.org
Workshop Questions
From a data mining and informatics viewpoint
• Mobile health and social media.
• Tools and technologies for real-time analysis.
• Methodological opportunities and challenges.
• Understanding trust and influence in social media.
• Standards and interoperability.
Complex Data Analytics
Unifying Goal
• Convergence is an approach to problem solving that integrates expertise from life sciences with physical, mathematical and computational sciences, medicine and engineering to form comprehensive synthetic frameworks that merge areas of knowledge from multiple fields to address specific challenges.
• National Research Council. Convergence: Facilitating Transdisciplinary Integration of Life Sciences, Physical Sciences, Engineering, and Beyond. Washington, DC: The National Academies Press, 2014.
Converging data sources
FDA Complex Data Analytic methods
Components of data mining program (CRISP-DM model)
Data mining universe and its problems
[Diagram labels: business process management; visualization tools; in-database analytics; database; data warehouse; decision operations; descriptive analysis; predictive analysis; optimization; business rules; adaptive control; key performance indicators; enterprise platform (Oracle 11g).]
Source: Operational Analytics: Putting Analytics to Work in Operational Systems. 2010 BeyeNETWORK and Decision Management Solutions. http://www.oracle.com/us/products/applications/hyperion/operational-analytics-report-081829.pdf, page 14.
Workshop Questions
From a data mining and informatics viewpoint
• Mobile health and social media.
• Tools and technologies for real-time analysis.
• Methodological opportunities and challenges.
• Understanding trust and influence in social media.
• Standards and interoperability.
Peramivir use during H1N1 influenza A epidemic
Using a combined system for convergent data
• 14 to 34 million cases
• 63,000 to 153,000 hospitalizations
• 7,000 to 14,000 deaths
• 1371 requests for drug
• 369 AE reports
• 344 unique patients
• 900 adverse events
Visual Look at RAPID Stakeholders
1. Mobile Data Collection: Clinicians / reporters enter MCM AE, attach digital file (i.e., photo) and app auto-captures geolocation (lat/long).
2. Data Transferred to FDA: Data is submitted from mobile device and sent over cellular network or WiFi to FDA’s cloud environment.
3. Data Processed in Cloud: Data is processed and stored in FDA cloud, adhering to FDA security and data best practices. Dashboards provide leadership, comm staff and QC views into MCM AE data.
4. Response Sent to Reporter: Within 24 hours a targeted response is sent via email containing link to digital file (i.e., podcast) with additional information.
5. Perform Analytics: Utilize existing tools for location-based analysis (ArcGIS) and signal detection (Empirica) of captured MCM AE data.
RAPID MOBILE APPS
CHIO Cloud Infrastructure
• Taha Kass-Hout (CHIO)
• Jim Milto
• AWS Contractor Team
Security
• Lewis Watson (CISO)
• Shawn Porter
FDA RAPID LEADERSHIP
• Dr Henry Francis
• Richard Zhang
• Bruce Weaver
• Syed Haider
FDA COMMUNICATIONS
Office of Communications
• Paul Buckman
• Sherunda Lister
• Kim Rawlings
MOBILE DEVELOPMENT
• NIH National Library of Medicine: George Thoma, Sameer Antani, Stacey Arnesen
• FDA RAPID Contractor Team: Booz Allen Hamilton, Program Manager Dinesh Kolla
MedDRA / Data SMEs
• Sonja Brajovic
• Roger Goetsch
• Krishna Chary (E2B)
• Mitra Roca (HL7)
3500A / MEDWATCH
• Joseph Tonning
• Dr Robert Ball
RAPID CLOUD ENVIRONMENT
OIM (Streaming)
• Josh Lehman
Division of Drug Information (DDI)
• Mary Kremzer
• Catherine Chew
Web Services, Database, Dashboards
• FDA RAPID Contractor Team
DMEP (Emergency)
• Mary Beth Roberts
EXISTING ANALYTICAL TOOLS
ArcGIS (Geolocation from Mobile Devices, Heat Maps)
• Martha O’Connor
• Wayne Gorski
• Nathan Beck
• Newland Agbenowosi (RAPID Contractor Team)
Empirica (Signal Detection)
• Ana Szarfman
• Marilyn Pitts
• Michael Johnston
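As a rough illustration of the mobile data collection and transfer steps on this slide, the sketch below builds a report record with a captured latitude/longitude and serializes it for submission. The class name, field names and sample values are hypothetical; the slides do not describe the actual RAPID data model.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class McmAeReport:
    """Hypothetical mobile adverse event report; all field names are illustrative."""
    reporter_id: str
    product: str
    event_description: str
    latitude: float
    longitude: float
    attachment_name: Optional[str] = None  # e.g. the file name of an attached photo
    submitted_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def validate(self) -> None:
        # Reject coordinates that cannot be a real geolocation.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")

    def to_json(self) -> str:
        # Validate first, then serialize for transfer to a back end.
        self.validate()
        return json.dumps(asdict(self), sort_keys=True)


report = McmAeReport(
    reporter_id="clinician-001",
    product="peramivir",
    event_description="rash",
    latitude=39.03,
    longitude=-76.98,
    attachment_name="photo.jpg",
)
payload = report.to_json()
```

Validating before serialization keeps malformed coordinates out of any later location-based analysis.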
Workshop Questions
From a data mining and informatics viewpoint
• Mobile health and social media.
• Tools and technologies for real-time analysis.
• Methodological opportunities and challenges.
• Understanding trust and influence in social media.
• Standards and interoperability.
App Flow
The cloud-based back-end design will support the ingestion, analysis and visualization of streaming adverse event data for pharmacovigilance study.
Visualization Layer
Main Screen: Single Drug Analysis Modules and Multi-Drug Analysis Modules
• View Age Group Breakdowns: The graph shows the number of reports for the drug alone compared to the number of reports for the drug-event combination. The bars are broken out into nine age groups. Discrepancies in these distributions are often worthy of follow-up.
• Analyze Drug-AE Combinations Stratified into Age or Gender Groups: What are the drug-AE combinations and SDR metrics for a particular age group or gender?
• Analyze Drug-Event Combination per SDR Metric per Drug Class: What are all the drug-AE combinations per SDR metric per drug class?
• Search Across All AE Reports: What information can one find on any drug, AE, or drug-AE pair by searching across a database of AE reports?
• Analyze the AE Profile for a Particular Drug: What is the AE profile of a particular drug across all System Organ Classes?
• Analyze Drug-Event Associations: Are there associations between different groups of drug-AE pairs based on their prevalence in AE reports?
• View a Sector Map: A sector map for Em… management data is … presentation of data across all System Or… (SOCs). Each System O… represented by … tile") in the sect… Smaller tiles ("P… SOC tile repres… These are rank… order of values
[Figure: Propylthiouracil (left) vs. Methimazole (right)]
© 2011 Oracle Corporation
3. RAPID Phase II
The proposed RAPID Biosurveillance Platform includes a cloud-based open source big
data analytic tool to facilitate detection of adverse event signals in near real-time
Cubism.js
[Alerts] Display Drugs with > 50% increase in the number of reports in the last 2 weeks
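The alert rule above can be sketched as a comparison of report counts in two adjacent 14-day windows. The slide does not state the baseline, so comparing against the preceding two weeks is an assumption, as are the function names and the sample data.

```python
from datetime import date, timedelta


def percent_increase(daily_counts, today, window_days=14):
    """Percent change in reports: last `window_days` vs. the window before it.

    daily_counts maps a date to a report count. The preceding-window baseline
    is an assumption; the slide only says "in the last 2 weeks".
    """
    recent_start = today - timedelta(days=window_days - 1)
    prior_start = recent_start - timedelta(days=window_days)
    recent = sum(c for d, c in daily_counts.items() if recent_start <= d <= today)
    prior = sum(c for d, c in daily_counts.items() if prior_start <= d < recent_start)
    if prior == 0:
        return None  # percent change is undefined without a baseline
    return 100.0 * (recent - prior) / prior


def drugs_to_alert(counts_by_drug, today, threshold_pct=50.0):
    """Return drugs whose report count rose by more than threshold_pct."""
    alerts = []
    for drug, daily in counts_by_drug.items():
        change = percent_increase(daily, today)
        if change is not None and change > threshold_pct:
            alerts.append(drug)
    return sorted(alerts)


# Made-up data: drug_a doubles its daily reports in the last two weeks.
today = date(2015, 9, 11)
days = [today - timedelta(days=i) for i in range(28)]
counts = {
    "drug_a": {d: (2 if i < 14 else 1) for i, d in enumerate(days)},
    "drug_b": {d: 2 for d in days},
}
alerts = drugs_to_alert(counts, today)
```

Returning `None` when the earlier window is empty avoids flagging a newly reported drug on an undefined percentage.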
My Projects
• Project 1
• Project 2
• Project 3
Create Project
Drug-AE Filter:
• All drugs in which 1 or more
cases are fatal
• All drugs in which 50% of the
adverse event cases are
reported in pediatrics
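The two example filters could be expressed as simple aggregations over case records. This is a minimal sketch: the record layout, the age cutoff of 18 years for "pediatric" and reading "50%" as "at least 50%" are all assumptions not stated on the slide.

```python
from collections import defaultdict

PEDIATRIC_AGE_LIMIT = 18  # assumed cutoff in years; not given in the slides


def summarize_cases(cases):
    """Count total, fatal and pediatric cases per drug.

    Each case is a dict with hypothetical keys: drug, fatal, age_years.
    """
    summary = defaultdict(lambda: {"total": 0, "fatal": 0, "pediatric": 0})
    for case in cases:
        s = summary[case["drug"]]
        s["total"] += 1
        s["fatal"] += int(case["fatal"])
        s["pediatric"] += int(case["age_years"] < PEDIATRIC_AGE_LIMIT)
    return dict(summary)


def drugs_with_fatal_case(summary):
    """Drugs for which one or more cases are fatal."""
    return sorted(d for d, s in summary.items() if s["fatal"] >= 1)


def drugs_mostly_pediatric(summary, min_share=0.5):
    """Drugs for which at least min_share of cases are pediatric."""
    return sorted(
        d for d, s in summary.items() if s["pediatric"] / s["total"] >= min_share
    )


# Made-up case records for illustration only.
cases = [
    {"drug": "drug_a", "fatal": True, "age_years": 60},
    {"drug": "drug_a", "fatal": False, "age_years": 10},
    {"drug": "drug_b", "fatal": False, "age_years": 45},
    {"drug": "drug_b", "fatal": False, "age_years": 52},
]
summary = summarize_cases(cases)
fatal_drugs = drugs_with_fatal_case(summary)
pediatric_drugs = drugs_mostly_pediatric(summary)
```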
Disproportionality Metric Filter:
• ROR
• PRR
• GPS
• IC
• Corresponding ranking statistic: e.g., the 5 percent quantile Q0.05 of the posterior distribution
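For orientation, the sketch below computes common textbook forms of three of the listed metrics (PRR, ROR and an unshrunk IC point estimate) from a 2x2 table of report counts. It is not taken from the RAPID platform or Empirica; GPS and the posterior-quantile ranking statistic require a Bayesian model and are not shown. The counts are made up.

```python
import math


def prr(a, b, c, d):
    """Proportional reporting ratio.

    a: reports with drug and event      b: reports with drug, other events
    c: reports with event, other drugs  d: reports with neither
    """
    return (a / (a + b)) / (c / (c + d))


def ror(a, b, c, d):
    """Reporting odds ratio."""
    return (a * d) / (b * c)


def ic(a, b, c, d):
    """Information component point estimate: log2(observed / expected).

    No shrinkage is applied here; Bayesian implementations differ.
    """
    n = a + b + c + d
    expected = (a + b) * (a + c) / n
    return math.log2(a / expected)


# Made-up counts for a single drug-event pair.
a, b, c, d = 20, 80, 100, 9800
prr_value = prr(a, b, c, d)   # about 19.8
ror_value = ror(a, b, c, d)   # 24.5
ic_value = ic(a, b, c, d)     # log2(20 / 1.2)
```

All three compare how often the event is reported with the drug against how often it would be expected from the rest of the database, so values well above 1 (or above 0 for IC) point to disproportionate reporting rather than proven causation.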
Time Series Visualization for Number of Reports for Drug of Interest:
Drug-Event Tracker
Adverse Event Data and Social Media Data
Drug                      % Change in Number of Reports (daily)    Algorithm & Ranking Statistic (all data)
Peramivir-H1N1            +3                                       2.0
Avandia-Diabetes          0.17                                     1.05
MPA-Steroid Injections    1.0                                      1.3
Zanamivir-H1N1            0.17                                     1
ALL                       +10                                      2.5
Clicking on a drug in the list
above activates the time series
and geographic visualization
[Time series chart, time axis from 09:23 to 09:31. Series: FAERS, ProMED-mail, FAERS + ProMED-mail, FAERS - ProMED-mail. Values shown: 10, 14, 24, -4.]
*Mouseover or use the arrow keys to inspect values
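The derived series in this chart (the sum and the difference of the FAERS and ProMED-mail series) amount to element-wise arithmetic on two aligned time series. A minimal sketch follows; the function name and the second data point in each series are made up, while the first point matches the values shown on the slide (10, 14, 24, -4).

```python
import operator


def combine(series_a, series_b, op):
    """Combine two equally long, time-aligned series element by element."""
    if len(series_a) != len(series_b):
        raise ValueError("series must cover the same time points")
    return [op(x, y) for x, y in zip(series_a, series_b)]


faers = [10, 12]    # counts per time step from one source
promed = [14, 9]    # counts per time step from a second source

combined = combine(faers, promed, operator.add)    # [24, 21]
difference = combine(faers, promed, operator.sub)  # [-4, 3]
```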
[Screenshot of the Cubism.js project page: Cubism.js is a D3 plugin for visualizing time series, pulling data from Graphite, Cube and other sources, offered under the Apache License on GitHub. It fetches and renders time series data incrementally, polling only the most recent values.]
*Mouseover bubbles to view information on adverse events reported at different geographic locations
RAPID GIS Visualizing
Growing volumes of rich and complex data sets have outstripped
basic tools and methods, resulting in a ‘data analysis gap’
[Quadrant chart. Vertical axis: analytic complexity, from accurate historical observations to predictive and real-time analytics capabilities. Horizontal axis: volume, variety, velocity of data, from small amounts of data or samples (megabytes to gigabytes) to large (or all) amounts of data (gigabytes to petabytes). Quadrants: Basic Analytics, Advanced Analytics, Big Data Analytics, Cloud Analytics.]
Existing Capabilities (Basic/Advanced Analytics):
• Analyze the average number of violations at a single manufacturer over time
• Simulate the impact of a foodborne illness on the wider population
Future Capabilities (Big Data/Cloud Analytics):
• Predict which manufacturers are at greatest risk for producing contaminated products based on a variety of data sources
• Analyze relationships between contaminated produce and multiple environmental variables
Workshop Questions
From a data mining and informatics viewpoint
• Mobile health and social media.
• Tools and technologies for real-time analysis.
• Methodological opportunities and challenges.
• Understanding trust and influence in social media.
• Standards and interoperability.
Social Media World
Mining Social Media for Product Safety
• Description of the problem
  - Like drinking from a firehose
  - 90% of data stored in the world today has been created in the past two years. (Kass-Hout, T. and Alhinnawi, H. British Medical Bulletin 2013;108:5-24)
• Solutions to the problem
  - Informatics, data mining, bidirectional communication.
  - “Real time” adaptation to public health events
• How will the solution impact FDA?
  - Develop multidimensional vision of medication use and drug safety
What’s the Value of Social Media?
• Responsibility to monitor public discussions
• Expectation that someone is listening
• Massive volume of discussion and patient-reported outcomes of unknown depth (and quality)
• Unique stream of intelligence that is not necessarily captured in other data sources
• Not intended to supplant traditional post-marketing surveillance
Vernacular-to-Regulatory
Translation
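One simple way to picture vernacular-to-regulatory translation is a phrase lookup that maps colloquial wording in a post to standardized adverse event terms. The sketch below is only an illustration under that assumption: the phrase list is invented, the target terms are written in the style of MedDRA preferred terms, and the slides do not say how the actual translation is performed.

```python
import re

# Hypothetical lookup table; a real system would rely on a curated vocabulary.
VERNACULAR_TO_TERM = {
    "threw up": "Vomiting",
    "throwing up": "Vomiting",
    "can't sleep": "Insomnia",
    "head is pounding": "Headache",
}


def translate(post):
    """Return standardized terms for colloquial phrases found in a post."""
    text = post.lower().replace("’", "'")  # normalize curly apostrophes
    found = []
    for phrase, term in VERNACULAR_TO_TERM.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text) and term not in found:
            found.append(term)
    return found


terms = translate("Started the new med and threw up twice, now I can't sleep")
```

Exact phrase matching misses misspellings, slang variants and negation ("didn't throw up"), which is part of why this translation step is a methodological challenge.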
Workshop Questions
From a data mining and informatics viewpoint
• Mobile health and social media.
• Tools and technologies for real-time analysis.
• Methodological opportunities and challenges.
• Understanding trust and influence in social media.
• Standards and interoperability.
• Where do you do the work?
The cloud-based RAPID Biosurveillance System will support collaboration between FDA and other Federal agencies to enhance monitoring of emerging health threats
RAPID Biosurveillance System
Tier 1: Regulatory Action/Guidance
• FDA and external partners issue guidance to ensure patient safety
Tier 2: Data Visualization
• Dashboards summarizing key information provide safety alerts
• Disproportionality metrics and detailed analyses allow FDA and collaborators to understand emerging issues
Tier 3: Data Management & Analytics
• RAPID data is combined with existing MedWatch and MedWatcher reports and data from external collaborators
• Advanced analytics support AE signal detection
Tier 4: Processing of Adverse Events
• Healthcare professionals submit AE data via the RAPID mobile app
• AE data is stored in a “data lake” to support real-time access
MedWatcher
Drug Shortage
The cloud-based RAPID Biosurveillance System will support collaboration
between FDA and other Federal agencies to enhance monitoring of
investigational therapies for Ebola and other emerging health threats
Four points worth discussion
• Convergence of information sources
• What solutions will solve convergence issues?
• Each information source has its own message
• Each information source has its own problems
Holistic Drug Safety
Pharmacovigilance
IT Tools Should Not Replace Our Own Eyes or Good Clinical Judgment
(Slide courtesy of June Almenoff, M.D., Ph.D.)
• Distinguishing humans and non-humans
Data Mining Team
• Syed Haider
• Alfred Sorbello
• Ana Szarfman
• Joseph Tonning
• Bruce Weaver
• Lin Yu (collaborator)
• Richard Zhang