Health Care Data Warehousing: Current and Future Directions
  Alan R. Hevner
  ahevner@coba.usf.edu
  Information Systems and Decision Sciences
  College of Business Administration
  University of South Florida
  Tampa, FL
  March 28, 2003
  Computer Science 40th Anniversary Symposium

Outline
  Research Portfolio
  Health Care Data Warehousing
  Future Directions
  Bioterrorism Surveillance Systems
  Environmental Health
  Physician/Hospital/Procedure Volume Studies

Research Portfolio
  National Institute for Software Testing and Productivity (NISTP) – DOD Funding
  Computational Intelligence (CI) Testing Tools
  Graduate Courses on Software Testing
  Telemedicine Quality Attributes – VA Partner
  Formal Methods for Network-Centric System Specification
  E-Commerce Software Development
  Collaborative Programming and Agile Methods
  Inspection Techniques for Graphical Models
  Design Science as a Research Paradigm in IS

Health Care Data Warehouse Research
  Co-Principal Investigators
    James Studnicki - COPH, USF
    Don Berndt and Alan Hevner - COBA, USF
  Research Staff
    Center for Health Outcomes Research Staff
    COBA and COPH Doctoral and Masters Students
  Collaboration and Funding
    U.S. Dept. of Commerce TOP Grant
    Florida Department of Health
    Bear Stearns Research Laboratory
    Florida Communities

CATCH Data Warehouse
  Utilizes over 400 health status indicators.
  [Figure labels: Marketing Data, Hospital Discharges, Demographics, Vital Statistics, Cancer Registry]

Data Dissemination Modes
  Effective Presentation of Data Warehouse Information to Decision Makers
  Data Dissemination Modes
    Ad-Hoc Queries and Data Browsing (SQL/QBE)
    Pre-Defined Report Generation (CATCH Reports)
    Desktop Data Warehousing (iCAP - MS Excel)
    Online Analytic Processing (OLAP)
    Geographic Information Systems (GIS)
    Web-Enabled Access

Pre-Defined CATCH Reports
  CATCH Workflow: Data Staging, Indicator Calculation, Report Production
  [Figure labels: State-Specific Data Sources; National Data Sources; Stored Procedures (customized for state data); CATCH Data Warehouse; OLAP Access]

Research Applications
  Bioterrorism Surveillance Systems
  Environmental Health Impacts – EPA Project
  Physician/Hospital/Procedure Volume Studies

Bioterrorism Surveillance Systems
  Sentinel Networks Throughout Florida
    Anticipate and prevent (if possible).
    Sense and provide early warning.
    React and minimize epidemiological impacts.
  Surveillance System Infrastructure
    Networks connecting sensors and early data indicators.
    Historical data warehouses for pattern recognition.
    Intelligent agents to alert populace, disseminate reaction, and eliminate threat.

Three Pillars of Threat Surveillance
  “A surveillance system includes a functional capacity for data collection, analysis, and dissemination linked to public health programs.” [CDC]
  Real-Time Data Collection
    Data with minimal lag time.
  Data Warehousing & Data Mining
    Historical context, analysis, and pattern recognition techniques for threat detection.
  Communications & Alert Networks
    Timely dissemination to response groups.
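
The Data Warehousing & Data Mining pillar above relies on historical context to decide when incoming data looks abnormal. The slides do not name a specific detection method, so the following is only a minimal sketch under an assumed rule: flag a current count when it exceeds the historical mean by more than k sample standard deviations. The function names, the value of k, and the daily counts are all hypothetical and chosen for illustration.

```python
from statistics import mean, stdev


def alarm_threshold(history, k=3.0):
    """Upper alarm threshold: historical mean plus k sample standard deviations.

    Assumption: a simple mean + k*sigma rule, not the method used by the
    surveillance system described in the slides.
    """
    return mean(history) + k * stdev(history)


def is_abnormal(current, history, k=3.0):
    """Return True when the current count exceeds the alarm threshold."""
    return current > alarm_threshold(history, k)


# Hypothetical historical daily counts for one indicator, as they might be
# pulled from a data warehouse.
history = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15]

if __name__ == "__main__":
    for todays_count in (16, 30):
        status = "ALARM" if is_abnormal(todays_count, history) else "normal"
        print(f"count={todays_count}: {status}")
```

A rule this simple ignores seasonality and reporting lag; it is meant only to show how a warehouse of historical data can supply the baseline against which real-time data is compared.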

Surveillance System Architecture
  Research Challenges
    Flash / Real-Time Data Warehouses
    Threat Detection via Data Mining

Data Mining & Pattern Recognition
  Alarm thresholds are determined on the basis of historical pattern recognition.
  Does the current real time data constitute an abnormal pattern?
  Historical health care data is maintained in a data warehouse.
  Sophisticated browsing technologies and/or intelligent data mining will help recognize when data streams trigger alarm thresholds for a threat.

Environmental Health Project
  Develop a series of indicators that can characterize community exposure to various environmental contaminants or hazards.
  Use evolving EPA models of dispersion and exposure.
  Collect experiential and historical information.
  Describe the socioeconomic and demographic characteristics of these communities of interest.
  Investigate any association between exposure levels and variation in health status.
  Tools include hierarchical linear models and spatial statistics.

Health Status Indicators Merged with Environmental Exposure Data
  [Figure: All Cancer Mortality]

Health Status Indicators Merged with Environmental Exposure Data
  [Figure: Total TRI Air Emissions]
  Source: EPA 1999 Toxic Release Inventory

Clinical Volume Research
  Web-based access to volume data on:
    Procedures
    Physicians
    Hospitals
  Relationship of volume to positive patient outcomes
  Applications
    Healthcare Organizations
    Consumers
  Compelling research results
    Paper under review at Peer Refereed Journal

Extract of Research Findings
Table 3.
For Selected Surgical Procedures in Florida (1998), Physician Volume Distribution Characteristics

Column groups: "All Physicians with at Least One Procedure" covers Total Procedures through Maximum Volume; "By Surgical Volume, Lowest 50% Physicians" and "By Surgical Volume, Highest 10% Physicians" each cover a Mean Procedures/Physician column and a % Total Procedures column.

Procedure | Total Procedures | Total Physicians | Mean Procedures/Physician | Median Procedures/Physician | % Single Procedure Physicians | Maximum Volume | Lowest 50%: Mean Procedures/Physician | Lowest 50%: % Total Procedures | Highest 10%: Mean Procedures/Physician | Highest 10%: % Total Procedures
Operations of Cardiovascular System | | | | | | | | | |
Coronary artery bypass graft | 25124 | 384 | 65.4 | 20 | 37.0% | 421 | 2.2 | 1.7% | 235.0 | 35.5%
Removal coronary artery obstruction | 39986 | 906 | 44.1 | 9 | 29.8% | 574 | 2.0 | 2.3% | 196.1 | 44.1%
Open heart surgery | 7052 | 317 | 22.2 | 11 | 31.2% | 210 | 2.7 | 6.1% | 92.0 | 40.4%
Insertion, removal, replacement, repair of pacemaker leads and devices | 14569 | 1124 | 13.0 | 3 | 36.1% | 204 | 1.4 | 5.3% | 65.7 | 50.5%
Puncture of vessel | 21549 | 4712 | 4.6 | 2 | 46.4% | 121 | 1.1 | 11.7% | 23.1 | 50.4%
Cardiac catheterization | 48956 | 1350 | 36.3 | 14 | 31.6% | 405 | 2.4 | 3.4% | 153.6 | 42.0%

© 2002 Studnicki et al. Submitted for journal publication. Do not copy without permission of authors.

Application Directions
  Hospital Use
    Risk Management
    Retrospectively analyze patterns of low hospital/MD combinations
    Identify variation in volume related processes of care (e.g., based on payer source)
    “Real time” risk assessment at scheduling
    Develop guidelines/algorithms
    Referral patterns
    Privileging considerations
  Consumer Use
    Selection of Doctors and Hospitals
    Education

Future Research Directions
  Health care data quality at source
    Motivation for reporting timely and accurate data
    Data collection audits
  Use of health care data in communities
    National Standards for community health assessments (very limited data sets required)
    How are data used to make health care decisions?
    Different stakeholders' use of data
  Ability to inform national debate on health care issues
    Quality of Care to poor, disadvantaged, minorities
    Comparison of care under different health care programs (Managed Care Programs)