- Health Analytics - Georgia Institute of Technology

Health Analytics at Georgia Tech: From
Information to Knowledge to Decision Making
Nicoleta Serban, PhD and Julie Swann, PhD
Industrial and Systems Engineering
Georgia Institute of Technology
May 2014
Data Science Framework
Information: Infrastructure, Management
Data: Representation, Sampling
• Data architectures
• Data integration, sharing and federation
• Data privacy rules
• Data wrangling
Knowledge: Computation, Tools
• Data mining
• Machine learning
• Statistical inference
• Network analysis
• Simulations
• Visualization
Decisions: System engineering
• Deriving hypotheses
• Validating hypotheses
• Eliciting causal relations
• Designing, planning, and optimizing
• Testing, ranking, scoring
• System dynamics
Scope of Data in Healthcare
Data Types: Examples
1. Disease Registry: Cystic Fibrosis
2. Disease Progression: “Natural History” models
3. Electronic Health Records: Queries (CHOA, VHA) on specific projects
4. Facility Info: VA satellite clinics
5. Medical Claims Data: Medicaid (children and pregnant women, GA + 13 other states, 2005-2009)
6. National Survey or Examination Data: NHANES, HCUP KIDS
7. State Databases: GA’s Oasis, HCUP SEDD and SID
8. General: Census, National Provider Index, GIS
Medicaid claims data will be used as a test bed for decision-making support tools that build knowledge about the
care of children on Medicaid.
CMS Medicaid Claims Data
• MAX Claims Data
▫ Personal Summary: patients, demographics, birthdate, etc.
▫ Inpatient: claims, diagnoses, procedures, LOS, payment
▫ Other Therapy: claims for physician, lab, clinic, outpatient
▫ Long Term Care: facility type, date of service, etc.
▫ Prescription Drug: paid drug claims
• Patient-level identifiable files with locations and a provider ID
• Years 2005 – 2009 for 14 states (+2010-2011 upcoming)
▫ SE: Georgia, Alabama, Arkansas, Louisiana, Mississippi, N. Carolina, S.
Carolina, Tennessee, Texas
▫ Other: California, Minnesota, New York, Pennsylvania
• Study population: children and pregnant women
GT Project Champion: Beth Mynatt (IPaT, GT)
GT Lead on Information Technology: Matt Sanders (GTRI)
GT Research Leads: Nicoleta Serban and Julie Swann
Medicaid Project: Approved Topics
1) MEASURING AND EXPLAINING INEQUITIES:
To assess the impact of healthcare system characteristics
vs. inequities in healthcare, including inequities in
geography, use, quality, expenditure and outcomes among
child Medicaid enrollees, especially in states with historic
inequities such as those in the Southeast.
2) OPTIMIZING INTERVENTIONS AND DELIVERY
SYSTEMS
To analyze flows and policies across the system (e.g., the
match between supply and demand, and financial flows),
both geographically and across time, along with the
corresponding costs or outcomes, and to analyze improved
methods of delivery, including medical homes.
Medicaid Project: Implementation
Information:
• Identifiable patient-level claims
• 5 years + 14 states = 266,839,307,070 observations, 2 terabytes of information
Data:
• Represented as patient care trajectories: utilization, cost and patient characteristics
• Sampled by disease
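One way to picture the "patient care trajectory" representation is as claims grouped by patient and ordered in time. The following is a minimal sketch under stated assumptions: the record layout, the field order, and the function name `build_trajectories` are hypothetical and are not taken from the MAX file specification or from the project's own tools.

```python
from collections import defaultdict
from datetime import date

# Hypothetical claim records: (patient_id, service_date, care_setting, paid_amount).
claims = [
    ("p1", date(2009, 3, 2), "physician", 80.0),
    ("p2", date(2009, 1, 15), "ed", 450.0),
    ("p1", date(2009, 1, 10), "ed", 500.0),
    ("p1", date(2009, 6, 20), "pharmacy", 35.0),
    ("p2", date(2009, 2, 1), "inpatient", 3200.0),
]

def build_trajectories(records):
    """Group claims by patient and order each patient's claims by service date."""
    by_patient = defaultdict(list)
    for patient_id, service_date, setting, paid in records:
        by_patient[patient_id].append((service_date, setting, paid))

    trajectories = {}
    for patient_id, events in by_patient.items():
        events.sort(key=lambda event: event[0])  # chronological order
        trajectories[patient_id] = {
            "settings": [setting for _, setting, _ in events],  # utilization sequence
            "total_paid": sum(paid for _, _, paid in events),   # cost summary
        }
    return trajectories

trajectories = build_trajectories(claims)
```

Patient characteristics (age, county, eligibility) would be joined onto each trajectory in a fuller version; they are left out here to keep the sketch short.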
Challenge #1: HIPAA and CMS data safeguards compliance
- data environment: access, sharing, linking, storage
Challenge #2: Database backbone
- projected research needs
- projected computational needs
Challenge #3: Data Processing
- unavailability of tools to process-mine claims
- additional data and information needs
- expert opinion & collaborations
Medicaid Project: Safeguards
• Data stored in secured location at Georgia Tech,
with access to the identifiable patient files by a
limited set of employees approved by CMS & IRB
• Sharing of aggregated data is allowed with
collaborators, if consistent with research protocol
• Cells should have at least 11 entries
▫ Data undergoes review process at GT before release from data workstation
• Significant liability involved if breach occurs
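The minimum cell size rule can be illustrated with a small check applied to an aggregated table before release. This is only a sketch: the threshold of 11 comes from the slide, while the table layout and the function name `suppress_small_cells` are assumptions, and it does not cover complementary suppression (hiding additional cells so that a suppressed value cannot be recovered from row or column totals).

```python
MIN_CELL_SIZE = 11  # threshold stated on the slide

def suppress_small_cells(table, min_size=MIN_CELL_SIZE):
    """Return a copy of the table with counts below min_size replaced by None."""
    return {
        cell: (count if count >= min_size else None)
        for cell, count in table.items()
    }

# Hypothetical aggregated counts of patients by county.
county_counts = {"County A": 120, "County B": 7, "County C": 11}
released = suppress_small_cells(county_counts)
```

A check like this would sit alongside, not replace, the manual review step described above.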
Medicaid Project: Health Analytics
Data:
• Baseline Metrics
• Care Pathway
• Access & Outcomes
Knowledge:
• Systematic disparities in access, outcomes and cost
• Network of providers
• Profiles of patient-level care pathways
Methods:
• Process Mining
• Spatial Statistical Models
• Functional Data Analysis
• Unsupervised classification
• Sequence clustering
• Markov-decision processes
• Optimization
Medicaid Project: Health Analytics
Knowledge:
• Systematic disparities in access, outcomes and cost
• Network of providers
• Profiles of patient-level care pathways
Decision Making:
• Policy interventions
• Network Interventions
Methods:
• Markov-decision processes
• Causal Inference
• Optimization Modeling
• Simulations
Medicaid Project: Research Scope
• Limitations
▫ Research must fit within the scope proposed to CMS
▫ Analysis of raw data must be conducted at GT
▫ Process for analyzing data is onerous, time-consuming, and “expensive”
▫ The most recent (~2) years of data are not available
• Positives
▫ We can benchmark GA against 13 other states
▫ Patients and/or providers can be followed longitudinally
▫ Permission to pursue research topics or to publish the related findings does not need to be requested from CMS or GT
Medicaid Project: Opportunities
• Developing the proof of concept in building large
infrastructures for protected information
• Becoming the center for deployment of tools for
mining claims data
• Advancing rigor in health analytics
• Educating students and visiting researchers
• Informing policy making in understanding and
managing the healthcare system
Health analytics at GT bridges fundamental mathematical and
computational modeling with health service research and health
economics as a means of translating health and healthcare data
into knowledge and decision making.
Health Analytics:
Serban & Swann Group
• Healthcare Access & Outcomes
▫ Measurement
▫ Linking Access & Outcomes
• Interventions
▫ Policy & Network Interventions
▫ Cost-effectiveness: Telemedicine
• Pediatric Asthma
▫ Baseline Metrics
▫ Care Pathways in Utilization & Cost
• Collaborations between GT ISyE, GT IPaT,
Children’s, CDC, VHA, DCH, DPH and other health
entities
We define healthcare access as the equal opportunity of people to get
appropriate care to maintain or improve their health. We focus primarily on
making inferences on spatial access, which is particularly important for
managing chronic diseases where regular visits and adherence to
recommended care practices can reduce severe outcomes.
Healthcare Access: Five Dimensions
Health Analytics and Access
• Measure Status Quo
• Infer Disparities
• Link to Outcomes
• Evaluate Interventions
Measurement
Estimating spatial access of different
populations by taking into account supply and
demand trade-offs and system constraints.
Inference: Equity
Studying systematic disparities in access to
services between population groups.
Inference: Linking to Outcomes
Understanding how access is associated with health
outcomes geographically and longitudinally
Evaluating Interventions
Informed decision making in healthcare
delivery -- policy and network interventions
targeting improvement in spatial access with a
significant projected impact on outcomes
Access Measures: Pediatric Primary Care
• Data: National Provider Index (NPI), Medicaid
claims, Census Bureau, Geographic Information
System, among other sources
• Study Population: Children in 14 states
• Measurement Model: Matching patients to
providers using optimization modeling estimated at
the census tract level
• Spatial Access Measures: Travel distance/time,
Congestion & Coverage
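To make the measurement model more concrete, here is a toy sketch of matching census tracts to providers under capacity limits. It is not the authors' optimization model: it enumerates every assignment of three hypothetical tracts to two hypothetical providers, keeps the feasible one that serves the most children, and breaks ties by total travel distance. The tract and provider names, all numbers, `MAX_DISTANCE`, and the simple definitions of coverage (share of demand served) and congestion (load divided by capacity) are assumptions made for illustration.

```python
from itertools import product

# Hypothetical inputs: children per census tract, provider capacity,
# and travel distance in miles between each tract and provider.
demand = {"tract1": 30, "tract2": 40, "tract3": 20}
capacity = {"provA": 60, "provB": 40}
distance = {
    ("tract1", "provA"): 5, ("tract1", "provB"): 20,
    ("tract2", "provA"): 10, ("tract2", "provB"): 8,
    ("tract3", "provA"): 25, ("tract3", "provB"): 12,
}
MAX_DISTANCE = 25  # assumed maximum acceptable travel distance

def match_tracts_to_providers():
    """Exhaustively search assignments (one provider or None per tract)."""
    tracts = list(demand)
    options = list(capacity) + [None]  # None means the tract is left unserved
    best_choice, best_key = None, None
    for choice in product(options, repeat=len(tracts)):
        load = {p: 0 for p in capacity}
        served = travel = 0
        feasible = True
        for tract, prov in zip(tracts, choice):
            if prov is None:
                continue
            if distance[(tract, prov)] > MAX_DISTANCE:
                feasible = False
                break
            load[prov] += demand[tract]
            served += demand[tract]
            travel += demand[tract] * distance[(tract, prov)]
        if not feasible or any(load[p] > capacity[p] for p in capacity):
            continue
        key = (-served, travel)  # most children served, then least travel
        if best_key is None or key < best_key:
            best_choice, best_key = dict(zip(tracts, choice)), key
    return best_choice

assignment = match_tracts_to_providers()
served = sum(demand[t] for t, p in assignment.items() if p is not None)
total_travel = sum(demand[t] * distance[(t, p)]
                   for t, p in assignment.items() if p is not None)
coverage = served / sum(demand.values())
load = {p: sum(demand[t] for t, q in assignment.items() if q == p) for p in capacity}
congestion = {p: load[p] / capacity[p] for p in capacity}
```

In this toy instance tract3 is sent to the farther provider because the nearer one is full, which is the kind of supply and demand trade-off the measurement model is meant to capture. Exhaustive search only works for tiny examples; a model at the scale of 14 states would need a linear or integer programming solver.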
Access Measures: Pediatric Primary Care
GEORGIA: Disparities across geography
[Maps: Congestion; Coverage]
• Large discrepancies between urban and rural care.
• High congestion across state except for some cities.
• High coverage in broad regions surrounding the most populated cities. Low coverage in many rural areas.
GEORGIA: Disparities between Medicaid & non-Medicaid
[Maps: Congestion; Coverage]
• Yellow and red regions indicate areas where the non-Medicaid patients' availability of services is superior to that of Medicaid patients, while blue regions indicate areas where the reverse is true.
Access ~ Outcomes: Pediatric Asthma
• Data: Healthcare Cost and Utilization Project (NC),
DPH OASIS (GA)
• Study Population: Children ages 4-17 in GA & NC
• Geographic Access: travel distance between
patients and matched providers using optimization
estimated at the census tract level
• Outcome measure: ED visit & hospitalization rates
at the county level
Access ~ Outcomes: Pediatric Asthma
• Access is significant alone and in interactions with other factors
• Impact of access varies with geography
• Improving access is expected to reduce the occurrence of severe outcomes.
[Figure: Predicted Reduction in Number of ED Visits in Georgia. x-axis: Reduction in Number of ED Visits (1 to 5, 5 to 10, 10 to 15, >15); y-axis: Number of County/Age Pairs; series: Specialist5, Specialist15, Specialist5:Primary15, Specialist15:Primary15]
Access ~ Outcomes: Cystic Fibrosis
• Data: Clinical information derived from the Cystic
Fibrosis Foundation Registry
• Study Population: 229,968 observations on 7,823
patients from 2002 to 2011.
• Geographic Access: realized travel distance from
the zip code of each patient to the care centers.
• Outcome measure: %FEV1, a common outcome
for research in cystic fibrosis
Access ~ Outcomes: Cystic Fibrosis
• No consistent relationship between geographic access and outcomes
• Access is significant for some age groups but not for all
• State-based analysis shows that other factors impact outcome levels
[Figure: The distribution of %FEV1 & CF center locations in Georgia]
A policy intervention is a type of action that involves the design, revision,
implementation or translation of a health policy for reducing costs and for
improving health outcomes, healthcare access and quality.
A network intervention refers to an action that involves altering an existing
network of care, including networks consisting of medical facilities.
Policy Interventions
Improving Access to Pediatricians
for Medicaid Population
• Considered policies that would:
▫ Improve patients’ mobility.
▫ Increase percentage of physicians accepting any Medicaid patients.
▫ Increase percentage of caseload physicians devote to Medicaid patients.
• Simulate policy change by altering inputs in the access measurement models.
• Evaluate impact at the statewide and local level with respect to access and health outcomes.
Network Interventions
Research Question: How
can the existing network be
modified to meet specified
goals?
Evaluation Criteria: Equity,
Effectiveness, Efficiency
Interventions
• Open new facilities
• Expand hours
• Mobile clinics, telemedicine
Cost-Effectiveness: Telemedicine
Telemedicine:
• Originated in the Netherlands in the early 1900s
• More than 100 definitions of telemedicine (WHO, 2010)
• Time Magazine has called telemedicine “healing by wire”
• Countries of implementation:
▫ Almost everywhere on the globe!
▫ Provider-driven implementations in the US (e.g., VHA)
• Example of an implementation:
▫ Tele-ophthalmology at VHA
▫ Cost-effectiveness: Diabetic Retinopathy Screening
Cost-Effectiveness: Telemedicine
• Step 1: Model Individual Disease Progression
▫ Data: Sample of veterans with diabetes (Atlanta)
▫ Model: Markov Decision Model
• Step 2: Simulate individual disease profiles
▫ Data: VHA & General parameter input
▫ Model: Estimated Disease Progression Model
• Step 3: Compare traditional to telemedicine
▫ Simulate screening with and without telemedicine
▫ Utility measures: Quality Adjusted Life Years (QALY) vs. costs of the program
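The three steps can be sketched with a much simpler stand-in for the estimated model: a cohort Markov chain that is run once per screening strategy and then compared on incremental cost per QALY. Everything in the block is an assumption made for illustration, including the three health states, every transition probability, cost and utility, the 20-cycle horizon, and the absence of discounting. It is not the VHA model and its outputs carry no clinical meaning.

```python
def run_cohort(transition, utility, cost, start, cycles):
    """Propagate a cohort through a Markov chain.

    Returns (total_cost, total_qaly) per person over the given number of cycles.
    """
    dist = dict(start)
    total_cost = total_qaly = 0.0
    for _ in range(cycles):
        total_cost += sum(dist[s] * cost[s] for s in dist)
        total_qaly += sum(dist[s] * utility[s] for s in dist)
        nxt = {s: 0.0 for s in dist}
        for s, share in dist.items():
            for t, p in transition[s].items():
                nxt[t] += share * p
        dist = nxt
    return total_cost, total_qaly

# Hypothetical states and yearly utilities (1.0 = full health).
utility = {"no_retinopathy": 0.85, "retinopathy": 0.75, "vision_loss": 0.50}
start = {"no_retinopathy": 1.0, "retinopathy": 0.0, "vision_loss": 0.0}

# Hypothetical yearly transition probabilities and costs per strategy.
usual = {
    "transition": {
        "no_retinopathy": {"no_retinopathy": 0.9, "retinopathy": 0.1},
        "retinopathy": {"retinopathy": 0.8, "vision_loss": 0.2},
        "vision_loss": {"vision_loss": 1.0},
    },
    "cost": {"no_retinopathy": 100.0, "retinopathy": 500.0, "vision_loss": 3000.0},
}
telemedicine = {
    "transition": {
        "no_retinopathy": {"no_retinopathy": 0.9, "retinopathy": 0.1},
        "retinopathy": {"retinopathy": 0.9, "vision_loss": 0.1},  # earlier detection
        "vision_loss": {"vision_loss": 1.0},
    },
    "cost": {"no_retinopathy": 150.0, "retinopathy": 550.0, "vision_loss": 3000.0},
}

cost_usual, qaly_usual = run_cohort(usual["transition"], utility, usual["cost"], start, 20)
cost_tele, qaly_tele = run_cohort(
    telemedicine["transition"], utility, telemedicine["cost"], start, 20
)

# Incremental cost-effectiveness ratio: extra cost per extra QALY gained.
icer = (cost_tele - cost_usual) / (qaly_tele - qaly_usual)
```

A negative ratio with a positive QALY gain would mean the telemedicine strategy is both cheaper and more effective under these made-up inputs. A realistic analysis would add discounting, patient-level simulation and sensitivity analysis over the parameters.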
Cost-Effectiveness: Cost vs. QALY Ratio
[Figures: Average Cost/QALY vs. Patient Pool Size (3000, 3500, 4000, 6000, 9000) and Average Cost/QALY vs. Patient Age in Years; each plot marks the cost-effective region]
• Cost-effectiveness for pool sizes of 3500 or larger (~9000
VHA patients in Atlanta area)
• Cost-effectiveness for patients between the ages of 50-80
Our research spans multiple directions, including deriving a set of
baseline measures for asthma care, linking access to outcomes, and
identifying care pathways in utilization and cost. The end point is to
design policy and network interventions to improve health outcomes
and access with limited resources.
Baseline: Utilization, Cost & Treatment
Objective: develop a set of baseline metrics for
pediatric asthma to be used in designing and
evaluating interventions to have the greatest
impact with limited resources.
Pilot Study: Children ages 4-17 with Medicaid
insurance in Georgia, 2009
Baseline:
Utilization Metrics by Race & Age
The Other population (mostly Hispanic)
has the most visits per patient and the
African American population has the most
patients per 1000 children on Medicaid.
Across age groups, there are no differences
in visits per patient, but the number of
patients per 1000 children decreases with age.
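The two utilization metrics named on this slide can be computed from visit records and enrollment counts as in the sketch below. The record layout, the group labels and the function name `utilization_metrics` are hypothetical, and the numbers are invented.

```python
from collections import defaultdict

def utilization_metrics(visits, enrolled_children):
    """Compute visits per patient and patients per 1000 enrolled children by group.

    visits: iterable of (patient_id, group), one row per asthma visit.
    enrolled_children: {group: number of children enrolled in Medicaid}.
    """
    visit_count = defaultdict(int)
    patients = defaultdict(set)
    for patient_id, group in visits:
        visit_count[group] += 1
        patients[group].add(patient_id)

    metrics = {}
    for group, enrolled in enrolled_children.items():
        n_patients = len(patients[group])
        metrics[group] = {
            "visits_per_patient": visit_count[group] / n_patients if n_patients else None,
            "patients_per_1000": 1000 * n_patients / enrolled,
        }
    return metrics

# Invented example data for two groups.
visits = [("p1", "A"), ("p1", "A"), ("p2", "A"), ("p3", "B"), ("p3", "B"), ("p3", "B")]
enrolled = {"A": 400, "B": 250}
metrics = utilization_metrics(visits, enrolled)
```

The same function can be applied with age bands instead of race as the grouping key.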
Baseline: Cost Metrics by Race
The Other population has the highest
charge per visit, followed by the African
American population. Payment amount
and prepaid value show no significant
differences.
The Other population has the highest
charge per enrollee per month, followed by
the African American population.
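As a rough illustration of the two cost metrics, charge per visit divides total charges by the number of visits, and charge per enrollee per month divides total charges by enrolled member-months. The sketch assumes one charge amount per visit and a known member-month count per group; both the layout and the figures are invented.

```python
from collections import defaultdict

def cost_metrics(visit_charges, member_months):
    """visit_charges: iterable of (group, charge); member_months: {group: months enrolled}."""
    total = defaultdict(float)
    n_visits = defaultdict(int)
    for group, charge in visit_charges:
        total[group] += charge
        n_visits[group] += 1
    return {
        group: {
            "charge_per_visit": total[group] / n_visits[group] if n_visits[group] else None,
            "charge_per_enrollee_per_month": total[group] / months,
        }
        for group, months in member_months.items()
    }

# Invented example data.
visit_charges = [("A", 200.0), ("A", 100.0), ("B", 600.0)]
member_months = {"A": 1200, "B": 600}
costs = cost_metrics(visit_charges, member_months)
```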
Baseline: Treatment Control Metrics
The African American population has a lower
medication ratio than the other two
populations, indicating a lower use of long
term controller medication.
Fulton county and the surrounding areas
have the lowest medication ratio in the
state.
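The slide does not define the medication ratio. A common convention, assumed here, is the share of controller medication among all asthma medication dispensed: controller units divided by controller plus reliever units. Under that assumption a lower ratio means relatively less long term controller use. The function name and the example counts are hypothetical.

```python
def medication_ratio(controller_units, reliever_units):
    """Controller share of all asthma medication units (assumed definition).

    Returns None when no asthma medication was dispensed.
    """
    total = controller_units + reliever_units
    if total == 0:
        return None
    return controller_units / total

# Invented dispensing counts for two patients over one year.
low_controller_use = medication_ratio(controller_units=3, reliever_units=9)
high_controller_use = medication_ratio(controller_units=6, reliever_units=2)
```

Group or county level values could then be summarized as the mean ratio or as the share of patients above a chosen cutoff; which summary the study used is not stated in the slides.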
Baseline: Treatment Adherence Metrics
Care Pathways: Utilization & Cost
Objective: To identify underlying care pathways
and to visualize the utilization relational system
for pediatric asthma care in the Medicaid system
using large patient-level claims data.
Pilot Study: Children ages 4-17 with Medicaid
insurance in Georgia, 2009
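A basic building block for discovering care pathways from claims is the directly-follows count used in process mining: how often one type of care event is immediately followed by another across patients. The sketch below assumes each patient's claims have already been ordered into a sequence of care settings; the sequences and setting names are invented, and this is a generic illustration rather than the project's pathway model.

```python
from collections import Counter, defaultdict

# Invented, time-ordered care-setting sequences per patient.
sequences = {
    "p1": ["physician", "pharmacy", "ed", "physician"],
    "p2": ["ed", "inpatient", "physician"],
    "p3": ["physician", "pharmacy"],
}

def directly_follows(seqs):
    """Count how often setting a is immediately followed by setting b."""
    counts = Counter()
    for events in seqs.values():
        for a, b in zip(events, events[1:]):
            counts[(a, b)] += 1
    return counts

def transition_probabilities(counts):
    """Normalize directly-follows counts into per-source transition shares."""
    outgoing = defaultdict(int)
    for (a, _), n in counts.items():
        outgoing[a] += n
    return {(a, b): n / outgoing[a] for (a, b), n in counts.items()}

counts = directly_follows(sequences)
probs = transition_probabilities(counts)
```

The resulting weighted edges can be drawn as a graph of care settings, and patients can then be grouped by similarity of their sequences.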
Care Pathways: Utilization
Care Pathways: Cost to MCOs
Care Pathways: Cost to the Medicaid System
Care Pathways: Cost & Utilization
Acknowledgements
Supporting Institutes and Organizations
- National Science Foundation (CAREER Award)
- Institute of People and Technology
- Children’s Healthcare of Atlanta
Research Team
IT Staff: Matthew Sanders and Paul Diederich
Postdoctoral fellow: Dr. Monica Gentili
Undergraduate students: Sarah Drath, Pravara
Harati, Qiming Zhang, Sean Monahan
PhD Graduate students: Erin Garcia, Ross Hilton,
Ben Johnson, Kevin Johnson, Zihao Li, Rodrique
Ngueyep, Richard Zheng
Contact Us
• Nicoleta Serban
nserban@isye.gatech.edu or 404-385-7255
• Julie Swann
jswann@isye.gatech.edu or 404-385-3054