MUSC’s Enterprise Data Warehouse at a Glance
10/15/2013

The MUSC Enterprise Data Warehouse (EDW) is intended to be a comprehensive collection of all of the organization’s important data. MUSC established the EDW to support clinical, management, research and other operational initiatives. The EDW consolidates data from various computer systems across the MUSC enterprise into a single unified “data collection” spanning numerous disciplines and domains.

The vision for the EDW is to convert data into meaningful information that is secure, accurate and timely, so that staff, faculty and administrators throughout the enterprise can use the information for decision-making and research activities.

Architecture

The EDW technical architecture consists of three principal components: the Analytics Database Server, the ETL Server, and the Business Intelligence Server.

The Analytics Database Server uses Sybase IQ, a high-performance columnar database designed specifically for data warehousing, as its central repository of all enterprise data. IQ uses multiple indexes and data compression to provide fast loading of data and efficient query execution.

The ETL Server primarily uses Informatica PowerCenter to extract, transform and load data from a wide variety of systems in use at MUSC, including relational databases such as Sybase ASE, Oracle, and Microsoft SQL Server. The ETL Server also handles the import of data from non-relational sources, typically via file transfers of data extracts generated by vendor reporting systems (where direct access to the product’s underlying data repository is not available or not supported). The open source Talend ETL product is also used in a few special cases.
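The extract, transform and load steps described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not the EDW's actual implementation, which uses Informatica PowerCenter and Sybase IQ. An in-memory SQLite database stands in for both the source system and the warehouse, and every table, column and function name below (lab_results, fact_lab_result, run_demo, and so on) is hypothetical.

```python
import csv
import io
import sqlite3


def extract_from_database(conn):
    """Extract: read rows directly from a relational source system."""
    return conn.execute(
        "SELECT patient_id, result_date, result_value FROM lab_results"
    ).fetchall()


def extract_from_file(text):
    """Extract: parse a delimited extract produced by a vendor reporting system."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["patient_id"], row["result_date"], row["result_value"])
        for row in reader
    ]


def transform(rows, source_name):
    """Transform: normalise types, drop empty results, tag the source system."""
    cleaned = []
    for patient_id, result_date, result_value in rows:
        value = str(result_value).strip()
        if not value:
            continue  # skip rows that carry no result
        cleaned.append(
            (int(patient_id), result_date.strip(), float(value), source_name)
        )
    return cleaned


def load(warehouse, rows):
    """Load: append the cleaned rows to the warehouse table."""
    warehouse.executemany(
        "INSERT INTO fact_lab_result VALUES (?, ?, ?, ?)", rows
    )
    warehouse.commit()
    return len(rows)


def run_demo():
    # Hypothetical relational source (stand-in for a system such as Oracle).
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE lab_results "
        "(patient_id INTEGER, result_date TEXT, result_value TEXT)"
    )
    source.executemany(
        "INSERT INTO lab_results VALUES (?, ?, ?)",
        [(1, "2013-10-01", "4.2"), (2, "2013-10-02", " 5.1 ")],
    )

    # Hypothetical vendor file extract; the last row has no result value.
    vendor_extract = (
        "patient_id,result_date,result_value\n"
        "3,2013-10-03,6.0\n"
        "4,2013-10-04,\n"
    )

    # Hypothetical warehouse target (stand-in for the analytics database).
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_lab_result "
        "(patient_id INTEGER, result_date TEXT, "
        "result_value REAL, source_system TEXT)"
    )

    loaded = load(
        warehouse, transform(extract_from_database(source), "source_db")
    )
    loaded += load(
        warehouse, transform(extract_from_file(vendor_extract), "vendor_file")
    )
    return warehouse, loaded
```

Calling run_demo() returns the warehouse connection and the number of rows loaded. The same transform and load functions serve both the direct database read and the file-based extract, which mirrors how a single ETL layer can accept relational and non-relational sources.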
The Business Intelligence Server uses MicroStrategy Business Intelligence to deliver reports and dashboards supporting operational, analytical and decision support users. A mix of desktop, web, and mobile clients is currently supported in this environment.

M.Daniels ** WORKING DRAFT **

Infrastructure

There are currently two EDW environments: a test environment and a production environment. For the production environment, the ETL and database servers reside on virtual machines (Solaris containers) hosted on a dedicated Oracle/Sun Solaris server. The business intelligence applications reside on a small group of Linux and Windows servers. The IQ server excels at compressing data and indices; currently, the server uses approximately 2 TB of disk space.

Data Sources

The data found in the EDW originates from dozens of computer systems across the MUSC enterprise. In some cases, data is extracted directly from an application’s database (for example, Picis anesthesia). In other cases, data is post-processed by a secondary system to apply additional business logic (for example, Trendstar’s post-processing of Keane charge data). In still other cases, data from multiple sources is collected and aggregated in a data repository (for example, data collected by the Oacis Clinical Data Repository). Examples of the data collection options are shown graphically in Figure 1.

The data sources vary in domain, source system, level of detail, and temporality. The earliest data in the EDW are patient registrations, laboratory results, radiology procedures, and hospital transcription, which date back to June 1, 1993. Other data sources start (and stop) at various points over the last 20 years. A data source stops only when its source system is retired (e.g., Practice Partner), and often the retired system is replaced by another system.

Data sources, along with data type and status, are shown in Appendix A.
Note that this list includes data sources that are scheduled to be harvested in the near future.

Figure 1 - Data Collection Options

Other Points of Information

• The EDW infrastructure supports analytics and reporting efforts by teams using other tools (e.g., Tableau, SAS, Access).
• The EDW team builds and supports various data extraction scripts to provide specific data extracts for internal use and distribution to approved external entities, for example:
  o Health Sciences South Carolina (HSSC)
  o Insurance payers
  o DHEC – Syndromic Surveillance
  o Pharmacy 340B Program
  o Pediatric Medicaid – Out of Network
  o Meaningful Use Quantity Measure data

Appendix A – Data Sources

Source Abbott Point of Care Action OI AHRQ ANSOS Apollo Avatar / HCAHPS Centricity Cerner Lab Clinic Scheduling CMS Dental Medicine Dietary Employee Survey EndoWorks Epic UHC Op DB Quality Indicators Nurse Shift / Scheduling Cardiology Patient Satisfaction OB Blood Bank Gen Lab MicroBiology MicroBiology Charge Cost Mapping Pathology IDX Flowcast/Imagecast Public Indicators Labor/Supply Bchmk Status EDW Adult only Inpatient Delivery Discrete Report Susceptibilities Report EDW EDW (1) EDW EDW EDW EDW EDW (1) EDW EDW (1) Public Website Identify Source Identify Source Note Info not in Cerner Sodexo ePremis GE Echo GE MUSE GE Viewpoint GetWell Hand Hygiene Description QCM / iStat Endoscopic Ultrasound Ambulatory Ambulatory Ambulatory ASAP - ED ASAP - ED Immunizations Medications Administered Medications Ordered Bed Management Transport Echo Pacs EKG PreNatal Ultrasound Patient Education Source - EI team Report Discrete Immunizations Report Discrete Discrete Discrete Discrete EDW (1) EDW (1) EDW EDW EDW (1) EDW EDW EDW EDW EDW EDW EDW (1) EDW (1) EDW (1) EDW Hollings CC HPF IDX FlowCast IDX ImageCast Keane Reg/PatAccnt Cancer /Tumor Registry Document Imaging UMA Radiology Report Patient Charges Source
Identified EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW (1) EDW EDW (1) CPT codes only Payment Patient Registration Krames Patient Education McKesson Clinicals ClinDoc CPOE Orders ED Track Board Glucose Co-Signs Longitudinal (HHS) Med Administrations McKesson Medication Verified Pharmacy Accudose Cabinet Med Dispense MediServ PT/OT/Speech RT MedFlow Ophthalmology Meducare RESCUENET Getting Access Neurophysiology EEG/EMG NSQIP ACS Nursing Operations Accommodation Codes ETA Team Olympus Endoscopy EndoWorks Outcomes Science Stroke Data Parking Local Application Management Patient Call Back Call back results ETA Team Philips Vu Physiologic Monitor Physician Survey Identify Source Picis Anesthesia Report OR Report Anesthesia Discrete OR Discrete PMM Material / Supply Practice Partner Immunizations Practice Partner Reports Press Ganey Patient Satisfaction Pediatrics Only EDW (1) EDW EDW (1) EDW EDW (1) EDW EDW EDW (1) EDW (1) EDW EDW EDW EDW EDW PSN Pyxis Quantros Reach Patient Safety Events Supply Dispensing CMS Quality Indicators Stroke Data (Telemedicine) Death Registry State EMS Data Infection Control / Risk Pulmonary Function Tests Ventilator S.C. Vital Stats S.C.
Vital Stats Safety Surveillor Sensormedics Siemens Trauma Registry Trendstar UHC Data Velos Verge Source Identified Source Identified Patient Cost Accounting Transplant Credentialing BioBank Production Access Specimen Data OBIS Time HEMM Supply Mgmt ORSP Proposal / Award Download / License Use ORSP Proposal / Award ORSP Grant Expenditure Hospital HR Hospital Payroll University HR University Payroll PVID / UDAK info ORSP PI Space Kronos McKesson Supply Coeus Software License ERMA GCA SmartStream Identity SAMWA

No Data in EDW; Data Load in Progress; Data Load Live; Data Load Suspended; pointer to report - report not loaded

EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW (1)