MUSC’s Enterprise Data Warehouse at a Glance
10/15/2013
The MUSC Enterprise Data Warehouse (EDW) is intended to be a
comprehensive collection of the organization’s important data.
MUSC established the EDW to support clinical, management, research
and other operational initiatives. The
EDW consolidates data from various computer systems across the
MUSC enterprise. The data is brought together to form a single
unified “data collection” spanning numerous disciplines and domains.
The vision for the EDW is to convert data into meaningful information
that is secure, accurate and timely so that staff, faculty and
administrators throughout the enterprise will be able to use the
information for decision-making and research activities.
Architecture
The EDW technical architecture consists of three principal
components: Analytics Database Server, ETL Server, and the
Business Intelligence Server.
The Analytics Database Server uses Sybase IQ, a high-performance
columnar database designed specifically for data warehousing, as its
central repository of all enterprise data. IQ utilizes multiple indexes
and data compression to provide fast loading of data and efficient
query execution.
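As a rough illustration of why a columnar layout compresses well, the sketch below pivots a few rows into columns and run-length encodes one of them. It is a generic teaching example, not a description of how Sybase IQ stores or indexes data internally; the sample rows and the helper names (to_columns, run_length_encode) are hypothetical.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column.

    Assumes every row has the same keys (hypothetical sample data).
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Synthetic rows for illustration only; not real EDW data.
rows = [
    {"visit_id": 1, "department": "Radiology"},
    {"visit_id": 2, "department": "Radiology"},
    {"visit_id": 3, "department": "Radiology"},
    {"visit_id": 4, "department": "Laboratory"},
]

columns = to_columns(rows)
encoded_department = run_length_encode(columns["department"])
```

Because all values of a column sit together, repeated values such as a department name collapse into a handful of pairs, and a query that touches only one column does not need to read the others.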
The ETL Server primarily uses Informatica PowerCenter to extract,
transform and load data from a wide variety of systems in use at
MUSC, including relational databases such as Sybase ASE, Oracle,
and Microsoft SQL Server. The ETL Server also facilitates import of
data from non-relational sources, typically via file transfers of data
extracts generated by vendor reporting systems (where no direct
access to their product’s underlying data repository is available or
supported). The Talend open source middleware ETL product is also
being used in a few special cases.
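The extract, transform and load flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Informatica PowerCenter or Talend configuration used at MUSC: in-memory SQLite databases stand in for both a relational source and the warehouse, a CSV string stands in for a vendor file extract, and every table, column and function name is hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical vendor file extract (synthetic values, not real data).
VENDOR_EXTRACT = "mrn,test_name,result\n 1001 ,glucose,98\n1002,GLUCOSE,\n"


def extract_from_database(source):
    """Extract rows directly from a relational source system."""
    query = "SELECT mrn, test_name, result FROM lab_results"
    return source.execute(query).fetchall()


def extract_from_file(text):
    """Extract rows from a delimited file produced by a vendor report."""
    reader = csv.DictReader(io.StringIO(text))
    return [(r["mrn"], r["test_name"], r["result"]) for r in reader]


def transform(rows):
    """Normalize types and casing; skip rows that have no result value."""
    cleaned = []
    for mrn, test_name, result in rows:
        if result in (None, ""):
            continue
        cleaned.append(
            (int(str(mrn).strip()), test_name.strip().upper(), float(result))
        )
    return cleaned


def load(warehouse, rows):
    """Load the cleaned rows into a (hypothetical) warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_lab_result "
        "(mrn INTEGER, test_name TEXT, result REAL)"
    )
    warehouse.executemany("INSERT INTO fact_lab_result VALUES (?, ?, ?)", rows)
    warehouse.commit()


# Stand-in relational source with one synthetic row.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE lab_results (mrn INTEGER, test_name TEXT, result REAL)")
source.execute("INSERT INTO lab_results VALUES (1000, 'sodium', 140.0)")

warehouse = sqlite3.connect(":memory:")
extracted = extract_from_database(source) + extract_from_file(VENDOR_EXTRACT)
load(warehouse, transform(extracted))
loaded = warehouse.execute(
    "SELECT mrn, test_name, result FROM fact_lab_result ORDER BY mrn"
).fetchall()
```

Keeping the two extract paths separate while sharing one transform step mirrors the distinction in the text between direct database access and file-based imports.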
The Business Intelligence Server uses MicroStrategy Business
Intelligence to deliver reports and dashboards supporting operational,
analytical and decision support users. A mix of desktop, web, and
mobile clients is currently supported in this environment.
M.Daniels ** WORKING DRAFT **
Infrastructure
There are currently two EDW environments: a test environment and a
production environment. For the production environment, the ETL
and database reside on virtual machines (Solaris containers) hosted
on a dedicated Oracle/Sun Solaris server. The business intelligence
applications reside on a small group of Linux and Windows servers.
The IQ server excels at compressing data and indices; the server
currently uses approximately 2 TB of disk space.
Data Sources
The data found in the EDW originates from dozens of computer
systems across the MUSC enterprise. In some cases, data is
extracted directly from an application’s database (an example would
be Picis anesthesia), in some cases data is post processed by a
secondary system to apply additional business logic (an example
would be Trendstar’s post processing of Keane charge data), and in
some cases data from multiple sources are collected and aggregated
in a data repository (an example would be data collected by the Oacis
Clinical Data Repository). Examples of the data collection options are
shown graphically in Figure 1.
The data sources vary in domain, source system, level of detail, and
temporality. The earliest data in the EDW covers patient
registrations, laboratory results, radiology procedures, and hospital
transcription; this data dates back to June 1, 1993. Other data
sources start (and stop) at various points over the last 20 years. A
data source stops only when its source system is retired (e.g.,
Practice Partner), and often the retired system is replaced by another.
Data sources, along with their data type and status, are shown in
Appendix A. Note that this list includes data sources that are slated
to be harvested in the near future.
Figure 1 - Data Collection Options
Other Points of Information
• The EDW infrastructure supports analytics and reporting efforts
by teams using other tools (e.g., Tableau, SAS, Access)
• The EDW team builds and supports various data extraction
scripts to provide specific data extracts for internal use and
distribution to approved external entities, for example:
o Health Science South Carolina (HSSC)
o Insurance payers
o DHEC – Syndromic Surveillance
o Pharmacy 340B Program
o Pediatric Medicaid – Out of Network
o Meaningful Use Quantity Measure data
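A data extraction script of the kind described in the list above might look roughly like the following sketch. It is an assumption-laden illustration, not one of the EDW team's actual scripts: an in-memory SQLite database stands in for the warehouse, and the table, the columns, the recipient key and the allowlist (APPROVED_COLUMNS) are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical allowlist: the columns each recipient is approved to receive.
APPROVED_COLUMNS = {
    "syndromic_surveillance": ["visit_date", "zip_code", "chief_complaint"],
}


def build_extract(warehouse, recipient, start_date, end_date):
    """Return CSV text with only the approved columns for a date range."""
    # Raises KeyError if the recipient has no approved column list.
    columns = APPROVED_COLUMNS[recipient]
    # Column names come from the fixed allowlist above, never from user
    # input, so interpolating them into the query text is safe here.
    query = (
        f"SELECT {', '.join(columns)} FROM ed_visits "
        "WHERE visit_date BETWEEN ? AND ? ORDER BY visit_date"
    )
    rows = warehouse.execute(query, (start_date, end_date)).fetchall()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


# Stand-in warehouse table with synthetic rows (not real patient data).
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE ed_visits "
    "(visit_date TEXT, zip_code TEXT, chief_complaint TEXT, patient_name TEXT)"
)
warehouse.executemany(
    "INSERT INTO ed_visits VALUES (?, ?, ?, ?)",
    [
        ("2013-10-01", "29401", "fever", "Example Patient A"),
        ("2013-11-05", "29403", "cough", "Example Patient B"),
    ],
)

extract_text = build_extract(
    warehouse, "syndromic_surveillance", "2013-10-01", "2013-10-31"
)
```

Driving the column list from a per-recipient allowlist keeps fields such as the patient name out of an extract unless they are explicitly approved for that recipient.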
Appendix A – Data Sources
Source Abbott Point of Care Action OI AHRQ ANSOS Apollo Avatar / HCAHPS
Centricity Cerner Lab Clinic Scheduling CMS Dental Medicine Dietary
Employee Survey EndoWorks Epic UHC Op DB Quality Indicators
Nurse Shift / Scheduling Cardiology Patient Satisfaction OB Blood Bank
Gen Lab MicroBiology MicroBiology Charge Cost Mapping Pathology
IDX Flowcast/Imagecast Public Indicators Labor/Supply Bchmk
Status EDW Adult only Inpatient Delivery Discrete Report
Susceptibilities Report EDW EDW (1) EDW EDW EDW EDW EDW (1) EDW
EDW (1) Public Website Identify Source Identify Source
Note Info not in Cerner Sodexo ePremis GE Echo GE MUSE GE Viewpoint
GetWell Hand Hygiene
Description QCM / iStat Endoscopic Ultrasound Ambulatory Ambulatory
Ambulatory ASAP - ED ASAP - ED Immunizations Medications Administered
Medications Ordered Bed Management Transport Echo Pacs EKG
PreNatal Ultrasound Patient Education Source - EI team Report Discrete
Immunizations Report Discrete Discrete Discrete Discrete
EDW (1) EDW (1) EDW EDW EDW (1) EDW EDW EDW EDW EDW EDW EDW (1) EDW (1) EDW (1) EDW
Hollings CC HPF IDX FlowCast IDX ImageCast Keane Reg/PatAccnt
Cancer /Tumor Registry Document Imaging UMA Radiology Report
Patient Charges Source Identified
EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW (1) EDW EDW (1)
CPT codes only Payment Patient Registration Krames Patient Education
McKesson Clinicals ClinDoc CPOE Orders ED Track Board Glucose Co-Signs
Longitudinal (HHS) Med Administrations McKesson Medication Verified
Pharmacy Accudose Cabinet Med Dispense MediServ PT/OT/Speech RT
MedFlow Ophthalmology Meducare RESCUENET Getting Access
Neurophysiology EEG/EMG NSQIP ACS Nursing Operations
Accommodation Codes ETA Team Olympus Endoscopy EndoWorks
Outcomes Science Stroke Data Parking Local Application Management
Patient Call Back Call back results ETA Team Philips Vu
Physiologic Monitor Physician Survey Identify Source Picis
Anesthesia Report OR Report Anesthesia Discrete OR Discrete PMM
Material / Supply Practice Partner Immunizations Practice Partner
Reports Press Ganey Patient Satisfaction Pediatrics Only
EDW (1) EDW EDW (1) EDW EDW (1) EDW EDW EDW (1) EDW (1) EDW EDW EDW EDW EDW
PSN Pyxis Quantros Reach Patient Safety Events Supply Dispensing
CMS Quality Indicators Stroke Data (Telemedicine) Death Registry
State EMS Data Infection Control / Risk Pulmonary Function Tests
Ventilator S.C. Vital Stats S.C. Vital Stats Safety Surveillor
Sensormedics Siemens Trauma Registry Trendstar UHC Data Velos Verge
Source Identified Source Identified Patient Cost Accounting Transplant
Credentialing BioBank Production Access Specimen Data OBIS Time
HEMM Supply Mgmt ORSP Proposal / Award Download / License Use
ORSP Proposal / Award ORSP Grant Expenditure Hospital HR
Hospital Payroll University HR University Payroll PVID / UDAK info
ORSP PI Space Kronos McKesson Supply Coeus Software License ERMA GCA
SmartStream Identity SAMWA
No Data in EDW
Data Load in Progress
Data Load Live
Data Load Suspended
pointer to report - report not loaded
EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW EDW (1)