Health Care Data Warehousing

Health Care Data Warehousing:
Current and Future Directions
Alan R. Hevner
ahevner@coba.usf.edu
Information Systems and Decision Sciences
College of Business Administration
University of South Florida
Tampa, FL
March 28, 2003
Computer Science 40th Anniversary Symposium
Outline
• Research Portfolio
• Health Care Data Warehousing
  - Bioterrorism Surveillance Systems
  - Environmental Health
  - Physician/Hospital/Procedure Volume Studies
• Future Directions
Research Portfolio
• National Institute for Software Testing and Productivity (NISTP) – DOD Funding
  - Computational Intelligence (CI) Testing Tools
  - Graduate Courses on Software Testing
  - Telemedicine Quality Attributes – VA Partner
  - Formal Methods for Network-Centric System Specification
  - E-Commerce Software Development
  - Collaborative Programming and Agile Methods
  - Inspection Techniques for Graphical Models
  - Design Science as a Research Paradigm in IS
Health Care Data Warehouse Research
• Co-Principal Investigators
  - James Studnicki - COPH, USF
  - Don Berndt and Alan Hevner - COBA, USF
• Research Staff
  - Center for Health Outcomes Research Staff
  - COBA and COPH Doctoral and Masters Students
• Collaboration and Funding
  - U.S. Dept. of Commerce TOP Grant
  - Florida Department of Health
  - Bear Stearns Research Laboratory
  - Florida Communities
CATCH Data Warehouse
• Utilizes over 400 health status indicators.
[Diagram labels: Marketing Data, Hospital Discharges, Demographics, Vital Statistics, Cancer Registry]
Data Dissemination Modes
• Effective Presentation of Data Warehouse Information to Decision Makers
• Data Dissemination Modes
  - Ad-Hoc Queries and Data Browsing (SQL/QBE)
  - Pre-Defined Report Generation (CATCH Reports)
  - Desktop Data Warehousing (iCAP - MS Excel)
  - Online Analytic Processing (OLAP)
  - Geographic Information Systems (GIS)
  - Web-Enabled Access
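To make two of the dissemination modes above concrete, the following is a minimal sketch of an ad-hoc SQL query and an OLAP-style roll-up, using Python's built-in sqlite3 module. The table name discharge_fact, its columns, and all sample rows are hypothetical; the slides do not describe the actual CATCH schema.

```python
import sqlite3

# Hypothetical fact table; the real CATCH schema is not given in the slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE discharge_fact "
    "(county TEXT, year INTEGER, diagnosis TEXT, discharges INTEGER)"
)
conn.executemany(
    "INSERT INTO discharge_fact VALUES (?, ?, ?, ?)",
    [
        # Made-up illustrative rows, not real data.
        ("County A", 2000, "asthma", 120),
        ("County A", 2001, "asthma", 135),
        ("County B", 2000, "asthma", 80),
        ("County B", 2001, "diabetes", 60),
    ],
)


def adhoc_query(conn, diagnosis):
    """Ad-hoc query: discharges per county and year for one diagnosis."""
    sql = """
        SELECT county, year, SUM(discharges)
        FROM discharge_fact
        WHERE diagnosis = ?
        GROUP BY county, year
        ORDER BY county, year
    """
    return conn.execute(sql, (diagnosis,)).fetchall()


def roll_up_by_county(conn):
    """OLAP-style roll-up: aggregate away the year and diagnosis dimensions."""
    sql = """
        SELECT county, SUM(discharges)
        FROM discharge_fact
        GROUP BY county
        ORDER BY county
    """
    return conn.execute(sql).fetchall()
```

Calling adhoc_query(conn, "asthma") returns one row per county and year, while roll_up_by_county(conn) collapses the same facts to one total per county.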
Pre-Defined CATCH Reports
[Diagram: CATCH Workflow]
• Workflow stages: Data Staging, Indicator Calculation, Report Production
• Data sources: State-Specific Data Sources, National Data Sources
• Stored Procedures: customized for state data
• Other labels: CATCH Data Warehouse, OLAP Access
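The three workflow stages named on this slide (data staging, indicator calculation, report production) can be sketched roughly as follows. Everything here is an assumption for illustration: the record fields, the function names, and the choice of a crude death rate per 100,000 residents as the example indicator. The slides mention stored procedures but do not show their logic.

```python
from collections import defaultdict


def stage(raw_records):
    """Data staging: drop malformed records and normalize county names."""
    staged = []
    for rec in raw_records:
        if rec.get("county") and rec.get("deaths") is not None:
            staged.append(
                {
                    "county": rec["county"].strip().title(),
                    "deaths": int(rec["deaths"]),
                }
            )
    return staged


def calculate_indicator(staged, population):
    """Indicator calculation: crude death rate per 100,000 residents.

    `population` maps county name to resident count (hypothetical input).
    Counties without a population figure are skipped.
    """
    totals = defaultdict(int)
    for rec in staged:
        totals[rec["county"]] += rec["deaths"]
    return {
        county: 100_000 * deaths / population[county]
        for county, deaths in totals.items()
        if county in population
    }


def produce_report(indicator):
    """Report production: one formatted line per county, sorted by name."""
    return "\n".join(
        f"{county}: {rate:.1f} per 100,000"
        for county, rate in sorted(indicator.items())
    )
```

Chaining the calls as produce_report(calculate_indicator(stage(raw), population)) mirrors the staged flow on the slide; in a real warehouse each stage would more likely run inside the database.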
Research Applications
• Bioterrorism Surveillance Systems
• Environmental Health Impacts – EPA Project
• Physician/Hospital/Procedure Volume Studies
Bioterrorism Surveillance Systems
• Sentinel Networks Throughout Florida
  - Anticipate and prevent (if possible).
  - Sense and provide early warning.
  - React and minimize epidemiological impacts.
• Surveillance System Infrastructure
  - Networks connecting sensors and early data indicators.
  - Historical data warehouses for pattern recognition.
  - Intelligent agents to alert populace, disseminate reaction, and eliminate threat.
Three Pillars of Threat Surveillance
“A surveillance system includes a functional capacity for data collection, analysis, and dissemination linked to public health programs.” [CDC]
• Real-Time Data Collection
  - Data with minimal lag time.
• Data Warehousing & Data Mining
  - Historical context, analysis, and pattern recognition techniques for threat detection.
• Communications & Alert Networks
  - Timely dissemination to response groups.
Surveillance System Architecture
Research Challenges
• Flash / Real-Time Data Warehouses
• Threat Detection via Data Mining
Data Mining & Pattern Recognition
• Alarm thresholds are determined on the basis of historical pattern recognition.
  - Does the current real-time data constitute an abnormal pattern?
• Historical health care data is maintained in a data warehouse.
• Sophisticated browsing technologies and/or intelligent data mining will help recognize when data streams trigger alarm thresholds for a threat.
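One simple way to picture a historically derived alarm threshold is a baseline of mean plus k standard deviations over past counts. This is only an illustrative assumption: the slides do not say which pattern recognition or data mining technique is used, and the function names and the default k = 3 are hypothetical.

```python
from statistics import mean, stdev


def alarm_threshold(historical_counts, k=3.0):
    """Threshold = historical mean + k sample standard deviations.

    `historical_counts` would come from the data warehouse, for example
    daily counts of a syndrome over a comparable past period.
    """
    return mean(historical_counts) + k * stdev(historical_counts)


def is_abnormal(current_count, historical_counts, k=3.0):
    """Return True when the real-time count exceeds the alarm threshold."""
    return current_count > alarm_threshold(historical_counts, k)
```

A fixed mean-plus-k-sigma rule ignores seasonality and day-of-week effects, which is one reason the slide points toward more sophisticated data mining over the warehouse history.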
Environmental Health Project
• Develop a series of indicators that can characterize community exposure to various environmental contaminants or hazards.
  - Use evolving EPA models of dispersion and exposure.
  - Collect experiential and historical information.
• Describe the socioeconomic and demographic characteristics of these communities of interest.
• Investigate any association between exposure levels and variation in health status.
  - Tools include hierarchical linear models and spatial statistics.
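As a first look at "association between exposure levels and variation in health status", one could merge the two data sets by community and compute a plain Pearson correlation. This is a deliberately simpler stand-in for the hierarchical linear models and spatial statistics named above, not the project's method; the county-keyed dictionaries and function names are hypothetical.

```python
from math import sqrt


def merge_by_county(health, exposure):
    """Inner join of two county-keyed dicts into (exposure, health) pairs."""
    return {c: (exposure[c], health[c]) for c in health if c in exposure}


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)
```

A correlation of this kind ignores confounders and spatial dependence between neighboring communities, which is why multilevel and spatial methods are the tools listed on the slide.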
Health Status Indicators Merged with Environmental Exposure Data
[Map: All Cancer Mortality]
Health Status Indicators Merged with Environmental Exposure Data
[Map: Total TRI Air Emissions. Source: EPA 1999 Toxic Release Inventory]
Clinical Volume Research
• Web-based access to volume data on:
  - Procedures
  - Physicians
  - Hospitals
• Relationship of volume to positive patient outcomes
• Applications
  - Healthcare Organizations
  - Consumers
• Compelling research results
  - Paper under review at a peer-refereed journal
Extract of Research Findings
Table 3. For Selected Surgical Procedures in Florida (1998), Physician Volume Distribution Characteristics

Operations of Cardiovascular System

All Physicians with at Least One Procedure
Procedure                                 Total       Total       Mean        Median      % Single-Procedure  Maximum
                                          Procedures  Physicians  Procedures  Procedures  Physicians          Volume
                                                                  /Physician  /Physician
Coronary artery bypass graft              25124       384         65.4        20          37.0%               421
Removal coronary artery obstruction       39986       906         44.1        9           29.8%               574
Open heart surgery                        7052        317         22.2        11          31.2%               210
Insertion, removal, replacement,
  repair of pacemaker leads and devices   14569       1124        13.0        3           36.1%               204
Puncture of vessel                        21549       4712        4.6         2           46.4%               121
Cardiac catheterization                   48956       1350        36.3        14          31.6%               405

By Surgical Volume
                                          Lowest 50% Physicians       Highest 10% Physicians
Procedure                                 Mean          % Total       Mean          % Total
                                          Procedures    Procedures    Procedures    Procedures
                                          /Physician                  /Physician
Coronary artery bypass graft              2.2           1.7%          235.0         35.5%
Removal coronary artery obstruction       2.0           2.3%          196.1         44.1%
Open heart surgery                        2.7           6.1%          92.0          40.4%
Insertion, removal, replacement,
  repair of pacemaker leads and devices   1.4           5.3%          65.7          50.5%
Puncture of vessel                        1.1           11.7%         23.1          50.4%
Cardiac catheterization                   2.4           3.4%          153.6         42.0%
© 2002 Studnicki et al. Submitted for journal publication. Do not copy without permission of authors.
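The column definitions in Table 3 can be read as simple summary statistics over a list of per-physician procedure counts. The sketch below shows one way to compute them; the function name, the dictionary keys, and the rounding of the "lowest 50%" and "highest 10%" cut-offs are assumptions, since the slides do not describe how the authors formed those groups.

```python
from statistics import mean, median


def volume_distribution(volumes):
    """Summarize procedures per physician for one procedure type.

    `volumes` holds one count per physician with at least one procedure.
    Cut-offs use floor division; this is an assumption, not the paper's rule.
    """
    if len(volumes) < 2:
        raise ValueError("need at least two physicians")
    ordered = sorted(volumes)
    n = len(ordered)
    total = sum(ordered)
    lowest_half = ordered[: n // 2]          # lowest 50% of physicians
    highest_tenth = ordered[-max(1, n // 10):]  # highest 10% of physicians
    return {
        "total_procedures": total,
        "total_physicians": n,
        "mean_per_physician": total / n,
        "median_per_physician": median(ordered),
        "pct_single_procedure": 100 * sum(1 for v in ordered if v == 1) / n,
        "maximum_volume": ordered[-1],
        "low50_mean": mean(lowest_half),
        "low50_pct_total": 100 * sum(lowest_half) / total,
        "top10_mean": mean(highest_tenth),
        "top10_pct_total": 100 * sum(highest_tenth) / total,
    }
```

As a consistency check on the table itself, the mean column equals total procedures divided by total physicians; for coronary artery bypass graft, 25124 / 384 is approximately 65.4.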
Application Directions
• Hospital Use
  - Risk Management
    - Retrospectively analyze patterns of low-volume hospital/MD combinations
    - Identify variation in volume-related processes of care (e.g., based on payer source)
    - “Real time” risk assessment at scheduling
    - Develop guidelines/algorithms
    - Referral patterns
    - Privileging considerations
• Consumer Use
  - Selection of Doctors and Hospitals
  - Education
Future Research Directions
• Health care data quality at source
  - Motivation for reporting timely and accurate data
  - Data collection audits
• Use of health care data in communities
  - National standards for community health assessments (very limited data sets required)
  - How are data used to make health care decisions?
  - Different stakeholders' use of data
• Ability to inform national debate on health care issues
  - Quality of care for poor, disadvantaged, and minority populations
  - Comparison of care under different health care programs (Managed Care Programs)