BIG DATA & ANALYTICS 3.0
Chuck Lyon, Operations Lead
Enterprise Data Warehouse
3 April 2015
What is Big Data?
INTERMOUNTAIN’S DEFINITION
Using additional data sources and new analytic tools to produce superior,
actionable analytic insights (not previously possible or cost effective) leading to:
• Improved healthcare outcomes
• Reduced cost
• Improved patient experience
New Data Sources and Analytic Tools
POSSIBILITIES FOR NEW, SUPERIOR INSIGHTS
Additional Data Sources
• Unstructured physician notes, discharge summaries and clinical documentation
• High volume, streaming clinical device data
• Personal device data
• Genomics data
• Cerner population health data (47 million patient lives)
• External data, social media, etc.
Potential New Tools
• Low cost, high volume storage, distributed processing (Hadoop)
• Semantic content recognition (unstructured to structured with clinical significance, NLP – natural language processing; see the sketch after this list)
• Machine learning correlation and causation discovery
• High volume data federation, indexing and search (SOLR)
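To make "unstructured to structured" concrete, here is a minimal Python sketch of concept extraction from a clinical note. The term-to-code dictionary, the example note, and the output fields are hypothetical stand-ins; the deck does not name a specific NLP engine or clinical vocabulary, and a production pipeline would use one rather than simple dictionary matching.

# Minimal sketch of "unstructured to structured" concept extraction from a
# clinical note. The term-to-code dictionary and codes below are hypothetical
# stand-ins; a real pipeline would use a clinical NLP engine and vocabulary,
# which the slides do not name.
import re

CONCEPT_DICTIONARY = {          # hypothetical term -> concept code
    "congestive heart failure": "CHF",
    "shortness of breath": "DYSPNEA",
    "preterm birth": "PRETERM_BIRTH",
}

def extract_concepts(note_text: str) -> list[dict]:
    """Return structured rows (concept code, matched term, character span)."""
    rows = []
    lowered = note_text.lower()
    for term, code in CONCEPT_DICTIONARY.items():
        for match in re.finditer(re.escape(term), lowered):
            rows.append({"concept": code, "term": term,
                         "start": match.start(), "end": match.end()})
    return rows

note = ("Patient with congestive heart failure reports shortness of breath "
        "on exertion; no edema noted.")
print(extract_concepts(note))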
Care Coordination
Analytics 3.0 *
INTERMOUNTAIN’S PATH FORWARD
Analytics 3.0 is an emerging analytics movement producing superior descriptive,
predictive and prescriptive analytic insights by tightly integrating big data and
traditional analytics to achieve insights and outcomes not previously possible.
• 1.0 – Traditional Analytics (structured data, relational, statistical)
• 2.0 – Big Data (unstructured data, volume, variety, velocity, veracity)
• 3.0 – New, Superior Business Value (integrated big data AND traditional analytics)
* International Institute for Analytics
Planning and Roadmap
ANTICIPATED PHASES
Initial Use Cases
• Physiologic Data
• EDW Augmentation
  • Historical EMR (HELP) archiving
  • ETL offload from EDW
• NLP
  • Concept extraction from text documents
• Search
  • SOLR search over EDW data (sketched below)
  • End user self-service
  • Data investigation
• Genomics
  • Storage of raw genomic files
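As a rough illustration of the "SOLR search over EDW data" use case, the sketch below indexes and searches two documents with the pysolr client. The Solr URL, the "edw_docs" core name, the field layout, and the example records are assumptions for illustration; the deck names SOLR but specifies none of these details.

# Rough sketch of SOLR search over EDW-derived documents, assuming the pysolr
# client and a hypothetical "edw_docs" core with id/patient_id/text fields.
import pysolr

solr = pysolr.Solr("http://solr.example.org:8983/solr/edw_docs", timeout=10)

# Index two illustrative EDW-derived documents.
solr.add([
    {"id": "doc-1", "patient_id": "p001",
     "text": "Discharge summary: CHF exacerbation, diuresis, stable at discharge."},
    {"id": "doc-2", "patient_id": "p002",
     "text": "Echocardiogram report: reduced ejection fraction, follow-up advised."},
], commit=True)

# Self-service style free-text search over the indexed documents.
results = solr.search("text:CHF OR text:echocardiogram", rows=10)
for hit in results:
    print(hit["id"], hit["patient_id"])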
Physiologic Monitor Data
Data Ingest to Visualization
• Device interface at Intermountain – has existed for 30+ years
• Sampled data is pushed to the EMR at 15 minute intervals
• Data is deleted after a minimal storage time
• For researcher requests, the data pipe is dammed up, the data is collected and sent, then deleted
Need:
• Store the data for historical analysis and complex event correlation
• Use algorithms to enhance clinical alerting and decision-making
Current Status:
• Storing near-real-time data in Hadoop (10 minute queue processing; sketched below)
Visualization:
• Tableau connected to Hive
Tableau Visualization of Hive Data (10 minute maximum latency)
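A minimal sketch of the 10 minute queue processing described above, assuming readings arrive on an in-process queue and each batch lands as a date-partitioned CSV file that an external Hive table could read. The actual device interface, Hadoop landing path, record fields, and Hive schema are not described in the deck.

# Minimal sketch of 10-minute micro-batch ingest for physiologic monitor data.
# The queue source, field names, and landing directory are assumptions; the
# landing path stands in for an HDFS location backing an external Hive table.
import csv
import queue
from datetime import datetime, timezone
from pathlib import Path

BATCH_SECONDS = 10 * 60                  # "10 minute queue processing"
LANDING_DIR = Path("physio_landing")     # stand-in for an HDFS landing path

readings: "queue.Queue[dict]" = queue.Queue()

def drain_batch() -> list[dict]:
    """Pull everything currently queued; each item is one monitor sample."""
    batch = []
    while True:
        try:
            batch.append(readings.get_nowait())
        except queue.Empty:
            return batch

def write_partition(batch: list[dict]) -> None:
    """Append the batch to a dt=YYYY-MM-DD partition directory as CSV."""
    if not batch:
        return
    now = datetime.now(timezone.utc)
    part_dir = LANDING_DIR / f"dt={now:%Y-%m-%d}"
    part_dir.mkdir(parents=True, exist_ok=True)
    out_file = part_dir / f"batch_{now:%H%M%S}.csv"
    with out_file.open("w", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=["patient_id", "device", "metric", "value", "ts"])
        writer.writeheader()
        writer.writerows(batch)

if __name__ == "__main__":
    # Example: enqueue one sample, then run a single batch cycle; a long-running
    # version would sleep BATCH_SECONDS between cycles.
    readings.put({"patient_id": "p001", "device": "monitor-7", "metric": "hr",
                  "value": 72, "ts": datetime.now(timezone.utc).isoformat()})
    write_partition(drain_batch())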
Value
WHAT HAS BIG DATA ACCOMPLISHED IN HEALTHCARE?
Clinical Benefits:
• Deriving optimum clinical pathways to reduce variance in treating various chronic and acute conditions
• Predicting the "next top 5%" of expensive patients
• Improving prediction accuracy of heart condition diagnosis from echocardiogram data
• Improving prediction of 30 day CHF readmissions using unstructured clinical data
• Using genomic data to better predict and prevent pre-term births
• Using genomic data to prescribe personalized treatment for leukemia patients
Operational Benefits:
• Retaining ICU device data for granular assessment of acute events
• Offloading EDW data landing and expanding to include complete source application data
• Reducing data modeling and prescriptive ETL through ELT and discovery-based approaches
• Retiring legacy applications with an active archive
• Storage and processing of genomic data
Financial Benefits:
• Improving recovery of claims
Contact Information
Charles (Chuck) Lyon
Enterprise Data Warehouse, Operations Lead
Intermountain Healthcare
Chuck.lyon@imail.org
801-507-8080