VA Data Lifecycle

Reading for Fun – Official History
 VistA - U.S. Department of Veterans Affairs national-scale HIS
 Steven H. Brown, Michael J. Lincoln, Peter J. Groen, Robert M. Kolodner
 International Journal of Medical Informatics 69 (2003) 135-156
 VistA Document Library (VDL)
 www4.va.gov/vdl
(With Pharmacy Additions)
Richard Pham, PharmD
Enterprise Architect
OI&T Corporate Data Warehouse – Architecture
Richard.Pham@va.gov
The Health Care System Is MUCH MORE COMPLICATED Than
Other Business Processes (~7% of VistA)
Each Process Is A World Unto Itself
DHCP/VistA/CPRS
 VistA – Veterans Health Information Systems and
Technology Architecture - Refers both to the architecture
and the database which the architecture supports
 DHCP - Decentralized Hospital Computer Program – The
DOS (Unix-like) system where many of VistA’s non-clinical
entries take place
 CPRS - Computerized Patient Record System – A user-friendly GUI providing access to clinical order entry functions
Objectives
 The main objective is to understand the data lifecycle
of VA’s VistA/CPRS and the user experience of
VistA/CPRS
 A high-level overview of VistA Internals
 Learn about data structures and outputs in VistA
 Learn where data enters and travels throughout the VA
 Try to make sense of data resources within the VA and
how they are accessed
The VA Data Lifecycle
Core Patient Care Functionality
 VistA is first and foremost an Electronic Medical
Record. The architecture design supports veteran
health care.
Core Patient Care Functionality
 VistA Internals
 DHCP
 CPRS
VistA Internals 101
 MUMPS
 Server and Operating System
 Kernel
 “Three Wise Men (Managers)”
 TaskMan
 MailMan
 FileMan
 Modules
Massachusetts General Hospital Utility
Multi-Programming System (MUMPS or M)
 My definition in English: M is a programming language
designed for hierarchical databases that is convenient for
medical applications or anything else where speed and data
storage upkeep are a problem and programmer
intelligence/organization is not
 My technical definition: M is a Turing-complete, low and
high-level, imperative, machine-compiled (no longer
interpreted) programming language utilizing a hierarchical
global array file structure
 Used commonly in healthcare and financial industry
settings
Structure of The Veterans Administration
Data Efforts (Late 1970s)
 VHA Ancestor: Department of Medicine and Surgery (DMAS)
 VHA-OI Ancestor: Computer Assisted System Staff (CASS)
 OI&T Ancestor: Office of Data Management & Telecommunications (ODM&T)
Comparing The Two Offices
CASS
 Decentralized design philosophy
 Rapid, agile development
 SME-involved development
ODM&T
 Centralized design philosophy
 Bureaucratic, process-focused development
 Development without SMEs
Highlights of ODM&T Development
 Took 6 years to deploy APPLES Pharmacy at 10 sites
 A 1980 paper detailing ODM&T’s transactional patient
treatment file (PTF) system promised an interactive
national solution by 1990.
 Navigating the mandated 17 steps between system
specification and deployment alone is said to have
required at least 3 years.
Beginnings of DHCP
 There were subject matter experts that believed that
they could put out useful applications faster than the
ODM&T sloth
 Development of the testing and principles was done
unofficially throughout the late 1970s
 The Second
Original DHCP Design Principles
 A commitment to rapid prototype development
 All use ANSI MUMPS
 Modular Design
 Actively Maintained Data Dictionary
 Code Sharing/Portability
 Involve the SME’s
DHCP Kernel
 Functions as both an operating system for VistA
applications and an M virtual machine
 Kernel shields DHCP modules from needing to know
hardware and OS configurations on the server
 Isolates M to the ANSI standard (1995)
 Provides a toolbox of standard functions for most
programmers
VistA to Relational Database
Terminology
VistA (Example) | Relational Database (Example)
Namespace (VHAFRE) | Database (VA Fresno)
“Package” – Not hardcoded | Schema (RxOutpatient)
File (50.68 – VA PRODUCT) | Table (NationalDrug)
Field (.01 – NAME) | Column (DrugNameWithDose)
Domain (cardinal/decimal, setofcodes, freetext/wordprocessing) | Field Type (numeric, boolean, varchar)
Internal Entry Number (IEN or .001) | ~Key (9722)
Record | Tuple/Row (ISOSORBIDE MONONITRATE 120MG TAB,SA)
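To make the terminology mapping above concrete, here is a minimal Python sketch (not M, and not how VistA or any warehouse tool actually does it). It stores the slide's example as a nested dictionary keyed by file number, IEN, and field number, then reads it out as a relational-style row. The dictionary layout and the function name `to_row` are assumptions for illustration; only the file number, field number, IEN, column name, and drug name come from the slide.

```python
# Hypothetical stand-in for a global: file number -> IEN -> field number -> value.
va_product = {
    "50.68": {
        9722: {".01": "ISOSORBIDE MONONITRATE 120MG TAB,SA"},
    }
}

# Column name for the relational side, taken from the slide's example.
COLUMN_NAMES = {".01": "DrugNameWithDose"}


def to_row(global_data, file_number, ien):
    """Return one record as a dict shaped like a relational row."""
    fields = global_data[file_number][ien]
    row = {"IEN": ien}
    for field_number, value in fields.items():
        # Fall back to the raw field number when no column name is mapped.
        row[COLUMN_NAMES.get(field_number, field_number)] = value
    return row


row = to_row(va_product, "50.68", 9722)
print(row)
```

The IEN plays the role of the row key, and each field number becomes a named column.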
MUMPS Classic Database
 One Data Type
 String (Text)
 Other types
 Cardinal Numbers
 Float Numbers
 $H Dates
 One Data Storage Type
 Multidimensional Array aka Globals
 Dynamic (duck) typing
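Because everything is stored as a string, dates need a convention. M's $HOROLOG ($H) value is a string of the form "days,seconds", where the day count starts from December 31, 1840 and the seconds are counted from midnight. A minimal Python sketch of decoding one follows; the function name is mine and is not part of any VistA library.

```python
from datetime import datetime, timedelta

# $H day 0 is December 31, 1840, so day 1 is January 1, 1841.
HOROLOG_EPOCH = datetime(1840, 12, 31)


def horolog_to_datetime(h_value):
    """Convert a $H-style 'days,seconds' string to a Python datetime."""
    days_text, seconds_text = h_value.split(",")
    return HOROLOG_EPOCH + timedelta(days=int(days_text), seconds=int(seconds_text))


print(horolog_to_datetime("47117,0"))     # 1970-01-01 00:00:00
print(horolog_to_datetime("47117,3661"))  # 1970-01-01 01:01:01
```

The value is still just text in the database; it only becomes a date when a program chooses to read it as one.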
VistA Data Organization
 Namespace: 654 (VAMC Reno)
 File: 120.5 (GMR Vitals)
 Field: .01 (DATE/TIME VITALS TAKEN)
 Record: IEN-1, BP, 140/90
 Most Files have an entry at the 0.001 Field called “IEN” or
“Internal Entry Number” as an identity key to mark the
record as unique
Upside of Using Globals
 Faster - No joins
 Faster – All parameter pointers built in
 Faster – Direct and planned programmatic access to
database (Look at SQL execution plans)
 Less Data Storage Overhead and faster paging – If a data point does not exist in the array, nothing is stored for it; a relational table still reserves a fixed slot
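The storage point can be sketched in Python, assuming a plain dictionary as a stand-in for a sparse global and a fixed tuple as a stand-in for a relational row. The column names are made up for the example.

```python
# Columns every relational-style row must carry, filled or not (hypothetical names).
COLUMNS = ("name", "strength", "route", "schedule", "comments")

# Sparse, global-like storage: only the data points that exist are stored.
sparse_record = {"name": "ISOSORBIDE MONONITRATE", "strength": "120MG"}


def to_fixed_row(record):
    """Expand a sparse record into a fixed-width row, padding gaps with None."""
    return tuple(record.get(column) for column in COLUMNS)


fixed_row = to_fixed_row(sparse_record)
print(len(sparse_record), "stored values in the sparse record")
print(len(fixed_row), "slots in the fixed row:", fixed_row)
```

The sparse record carries two values; the fixed row carries five slots, three of them empty.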
Downside of Using Globals
 No Intrinsic Structure and No Enforcement* - M
believes whatever you put into the globals (most M
programmers view this as an advantage while
relational programmers have an MI)
 ACID-compliance not mandated
 (Il)logical data structures guaranteed – There are many interesting* ways in which M programmers modeled the data that do not make sense to later viewers
MUMPS Quirks
 Whitespace (Space) matters
 Requires knowledge of kernel and sometimes lower-
level concepts
 Programming Without Type or Structure Enforcement
 VA programming standards and conventions
The Three Wise Men (Managers)
 TaskMan – The man(ager) that schedules tasks to the kernel
 MailMan – The man(ager) that passes messages between the user, TaskMan, and packages, and handles any other two-way communication between packages
 FileMan – The man(ager) that controls internal file (data structure) interactions
TaskMan
 TaskMan handles application processing:
 Creation of application processing tasks
 Scheduling these tasks
 Monitoring health/statistics of these tasks
 If the Kernel is the brain, then TaskMan is the body of the operation
 If programming, NEVER EVER use the TaskMan global.
This subverts TaskMan’s scheduling queue, and can cause a
system memory leak. Use the calls instead…
MailMan
 VistA needs a way to pass and receive data from the
database to other areas
 MailMan fulfilled this function in the pre-TCP/IP days
 “Electronic mail” doesn’t mean just email
 Practically any message between the database and
anyone else (the end-user, another site, or application,
etc.) can be moved this way
 Gives programmers methods to both receive and
return data to the database
FileMan
 A higher-level method to access the VistA database
without exposing a programmer interface
 Mostly menu-driven
 One can use limited programming
 Serves as the model for all other modules that interact
with the VistA database
ODM&T Initial Action Plan To DHCP
Development (1980)
 Ordered that development stop
 Fired the developers
 Removed the hardware
 Cut the DMAS budget so it would never happen
again…
The official history
Development Goes Underground
 Developers that survived the ODM&T purge
continued their work as a black project in DMAS
 During 1980 and 1981, the survivors (Underground
Railroad) continued work on developing modules for
system integration
Modules
 Modules are programmed to interact with the VistA
database
 Most use FileMan as a model for programming
Some of the Many Modules
Medicine
Surgery
Dentistry
Nursing
Pharmacy
Laboratory
Care Management
Patient Care Encounters
ADT
Mental Health
EDIS
Oncology
Nutrition and Food Service
Imaging/PACS
Prosthetics
Covering each module is outside the scope of this presentation.
Try the VistA Documentation Library:
http://www4.va.gov/vdl/
Or
VHA eHealth University (VeHU): http://www.vehu.va.gov/
Acceptance and DHCP 1.0
 Once there was a critical mass of packages that were
shown to be useful, the tide turned and the project was
blessed…
 Initial testing done in 1980-83
 1.0 installation was in 1985
 Most of the underlying packages can still be
recognized by the original programmers
Special Topic – The Pharmacy
Package (File 50 Series)
 Where are the files?
 File 50 Series is the main line, though there are other
places
 How many files are there? - 437
 Look at screen capture on next slide for File 50 series.
 How many columns? - 3174
How Many File 50 Tables Are
There? (1096 – 659 = 437 Tables!!)
How Many File 50 Series Columns
Are There? (8526 – 5352 = 3174!!)
Further Information On The
Background
 For the VA Base M Training
 http://vaww.vistau.med.va.gov/VistaU/MTraining/Default.htm
 For the VA Programming Standards and Conventions
 http://vista.med.va.gov/sacc/
 For the VA Document Library
 http://vista.med.va.gov/vdl/
Computerized Patient Record
System (CPRS)
 A Real-Time Order Checking System that alerts clinicians during the ordering session
that a possible problem could exist if the order is processed
 A Notification System that immediately alerts clinicians about clinically significant
events
 A Patient Posting System, displayed on every CPRS screen, that alerts clinicians to issues related specifically to the patient, including crisis notes, warnings, adverse reactions, and advance directives
 The Clinical Reminder System, which allows caregivers to track and improve preventive
health care for patients and ensure timely clinical interventions are initiated
 Remote Data View functionality that allows clinicians to view a patient’s medical history
from other VA facilities to ensure the clinician has access to all clinically relevant data
available at VA facilities
 CPRS DOES NOT STORE DATA!!!
CPRS Internals
 Written in Embarcadero Delphi (NOT in MUMPS)
 Connects from the Graphical User Interface to the VistA database using a Remote Procedure Call (RPC) Broker
 This Remote Procedure Call Broker translates
instruction sets from other languages into M
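The broker idea can be sketched generically in Python: a client names a procedure and passes parameters, and the broker looks up the registered server-side routine and returns its result. This is a toy dispatcher, not the VistA RPC Broker's actual protocol or API; the class name, the procedure name `DEMO GET PATIENT NAME`, and the data are all hypothetical.

```python
class ToyBroker:
    """Maps procedure names to server-side callables (illustration only)."""

    def __init__(self):
        self._procedures = {}

    def register(self, name, routine):
        self._procedures[name] = routine

    def call(self, name, *params):
        if name not in self._procedures:
            raise KeyError(f"unknown remote procedure: {name}")
        return self._procedures[name](*params)


# Stand-in for data held on the server side (hypothetical).
PATIENTS = {1: "DOE,JOHN"}

broker = ToyBroker()
broker.register("DEMO GET PATIENT NAME", lambda ien: PATIENTS.get(ien, ""))

# The GUI side only knows the procedure name and its parameters.
print(broker.call("DEMO GET PATIENT NAME", 1))
```

The point of the design is that the GUI never touches the database directly; it only asks for named procedures, which is consistent with CPRS not storing data itself.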
Present State of VistA
 Large MUMPS database
 50+ Main Clinical Packages
 10,000+ Tables
 Each medical center holds roughly 2-4 TB of data accumulated over 30 years (mostly imaging)
 Many processes
 300+ MB of running executable at any given time
 Over 20,000 subroutines (VDL)
 Many simultaneous users
The VA Data Lifecycle
National Analytic Systems
 A list of systems that support policy, planning, and
congressional needs
 There are more extracts than this, but I have chosen
the most common ones…
Diagram of Data Sources Available to VA Researchers
 Local VistA Installations: Local VistA File, Site 1 through Site N
 Local Extract Software (Site 1 through Site N): DSS Extract Software, PBM Extract Software, NPCD Extract Software, etc.
 VistA Extracts: DSS Extract Files, PBM Extract Files, NPCD Extract Files, etc.
 Load and Translate Software
 National Database: DSS Production Database, PBM Database, NPCD Database, etc.
 Host Location: DSS – Austin, TX; PBM – Hines, IL; NPCD – Austin, TX; etc.
 Build SAS Datasets: DSS Build Software, PBM Build Software, NPCD Build Software, etc.
 SAS Datasets: DSS NDE SAS Datasets, Medical SAS Datasets, etc.
 Custom Extract By Database Owner: PBM Custom Extract, etc.
 External Data: Medicare Data, National Death Index, etc.
 Research Database
 Researchers
Systems to Support Planning
 Decision Support System (DSS)
 Supports accounting and costing for the OIG, GAO,
CBO, and other auditing agencies
 Allocation Resource Center
 Supports personnel and resource allocation at the
medical center level
 Workload capture, resource allocation
 Basis for the VERA (VA’s Fund Control Point) Model
VistA DSS Data Feeds
ADM: Admissions
DEN: Dental
IVP: Pharmacy: IV
LAR: Lab Results
MTL: Mental Health Test
PAS: Patient Assessment
PRO: Prosthetics
SUR: Surgery
UDP: Pharmacy Unit Dose IP
CLI: Clinic Visits
ECS: Event Capture System
LAB: Lab
MOV: Patient Movement – Inpatient
NUR: Nursing
PRE: Pharmacy OP
RAD: Radiology/Nuclear Med
TRT: Patient Treatment Specialty
May be transmitted by site at any time of the month, ideally
around the 25th of the month prior to the processing month.
Systems to Support Research
 National Patient Care Database
 An integrated set of data that captures a patient’s care
encounter with the VA
 Corporate Data Warehouse –
 A near real-time accumulation of much of the same data
 The result of the Health Data Repository process
NPCD Data Flow Diagram
VistA
MailMan
•NPCD data is sent from the facilities to the AAC via
MailMan messaging
•Once a message reaches the AAC MailMan server, it automatically moves to the Data Management Interface System (DMI)
•NPCD and other applications retrieve their respective
data from DMI for use
• Acknowledgement messages are sent to facilities
[Diagram labels: VistA MailMan; Data Stream; Austin MailMan Server; DMI (data received in DMI 24x7); NPCD; Data extracted by application; Data extracted & backed up nightly M-F; Acknowledgement message (shown twice); HL7 data to Oracle DB. Hardware labels: z900, PROLIANT 8000, ESC, DLT.]
NPCD Processing
 UNIX (Oracle on Unix, NPCD): Daily Data Loading. Flat files are indexed and loaded into the database daily.
 z900 (MAINFRAME): DSS data extracted; Master Extract File (MEF); SAS. Data is checked for duplicates bimonthly.
 WINDOWS: VSSC/KLF Menu. Data is extracted and filtered for reporting twice a month.
Resources for Further Information
 VA Information Resource Center (ViREC)
 http://www.virec.research.va.gov
 National Patient Care Database –
 (Internal) http://vaww.aac.va.gov/npcd/
 National Data Systems (NDS)
 (Internal) http://vaww4.va.gov/NDS/DataAccess.asp
Secrets of the VA Data Universe
 This was an extremely brief introduction to a
complicated area
 I have another presentation on the availability of
databases in the VA and how to access them for
operations and/or research
The VA Data Lifecycle
Regional Remote Data Processing
Center Shadow Systems
 An offsite backup process to ensure continuity of operations for VistA Patient Care
Regional Data Processing Centers
(RDPCs)
 Read only backup VistA systems are set up to take
journaling files
 When a record is written to or altered in a local medical center’s VistA, a journal file with that entry is prepared and sent to a Regional Data Processing Center
 This maintains an active backup in case the local medical
center’s VistA goes down
 VistA goes down
 OpenVMS goes down
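A minimal Python sketch of the journaling idea, assuming each journal entry records either a set or a kill (M's delete) against a key. The entry format and the function name are assumptions for illustration, not the actual journal file format used by the RDPCs.

```python
def apply_journal(shadow, journal_entries):
    """Replay journal entries, in order, against a shadow copy."""
    for operation, key, value in journal_entries:
        if operation == "set":
            shadow[key] = value
        elif operation == "kill":
            # Deleting a key that is already absent is harmless.
            shadow.pop(key, None)
        else:
            raise ValueError(f"unknown journal operation: {operation}")
    return shadow


# Writes made at the local medical center, captured as journal entries.
journal = [
    ("set", ("120.5", 1, "BP"), "140/90"),
    ("set", ("120.5", 2, "BP"), "120/80"),
    ("kill", ("120.5", 2, "BP"), None),
]

shadow_copy = apply_journal({}, journal)
print(shadow_copy)
```

Because entries are replayed in the order they were written, the shadow ends up in the same state as the source without ever being written to directly by users.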
Regions and RDPCs
 Region I RDPC – Sacramento (SAC) and Denver (DEN)
 Region IV RDPC – Philadelphia (PHI) and Brooklyn
The Day VistA Died
Northern Cal Without VistA
 The medical staff was forced to write discharge instructions and notes on paper.
 The electronic lists of instructions and of medications were not available for the patients being discharged.
 Patients being discharged could not be given follow-up appointments at the time of discharge.
 The appointments had to be made later and the patient notified by phone.
 There were delays in obtaining discharge medications and patients remained on the wards longer than would normally be required.
 The nurses administered medications to the patients and used the paper Medication Administration Record, or MAR, to record the administration events.
 Initial medication passes were interrupted and delayed until the paper copies of the MAR could be printed.
RDPC Denver and Brooklyn
The VA Data Lifecycle
Business Intelligence
Business Intelligence in the VA –
Making the Data Work For Us
 VistA has a wealth of clinical and administrative data
available
 In the past, giving a value-added, timely VistA dataset
was hard
 Querying the active system with minimal impact
 Needed an interface between M and analyst languages
(SAS, SQL, etc.)
 Easy-to-read reports were hard to build
Corporate/Regional Data
Warehouse
 Takes a copy of the journal file that goes into the backup
shadow system
 Translated from the M array to a relational database format using InterSystems Caché’s class mapping program
 Staged in a Feeder-Collector system for collection
 Indexed, with value-added columns produced, and loaded to a VISN RDW Server
CDW Governance
[Diagram boxes: VHA Business Owners/SMEs; CDW Governance Board; OI&T; VHA-OI Data Quality; 10N, OIA, VBA; Corporate Data Warehouse]
[Diagram labels: Communicates Organizational Priorities; Organizes SMEs and Data Stewards; Sets and monitors domain, work priorities, and timelines for completion; Provides Documentation and Clarification of Business Logic]
CDW Governance Is In VHA’s Hands
 Ordered By VHA
 Domain and Work Prioritization By CDW Governance Board
 Chair – KLF (OIA)
 Vice-Chair – Larry Mole (Public Health SHG)
 Monitored and Accountable To VHA
 Project management provided by John Quinn (National Data
Systems) and KLF (OIA)
 Supported By VHA
 OI Data Quality
 Business Owners
 PBM’s Data Steward is Rob Silverman
“As the number of eyes goes up,
the number of bugs goes down.”
 Writing documentation about the business logic of the
files and fields
 Answering end user questions about the data
 Data validation
 Preferably before
 Inpatient Pharmacy
 ADR/Allergy Package
1st category models are simple –
V Health Factor
Source Mapping
FMFile | FMField | ResolveFld | DWTableName | DWFieldName
V HEALTH FACTORS | HEALTH FACTOR | | HealthFactor | HealthFactorTypeIEN
V HEALTH FACTORS | HEALTH FACTOR | 0.01 | HealthFactor | HealthFactorType
V HEALTH FACTORS | PATIENT NAME | | HealthFactor | PatientIEN
V HEALTH FACTORS | EVENT DATE AND TIME | | HealthFactor | EventDateTime
V HEALTH FACTORS | VISIT | 0.01 | HealthFactor | VisitVistaDate
V HEALTH FACTORS | VISIT | 0.01 | HealthFactor | VisitDateTime
V HEALTH FACTORS | LEVEL/SEVERITY | | HealthFactor | LevelSeverity
V HEALTH FACTORS | VISIT | | HealthFactor | VisitIEN
V HEALTH FACTORS | ENCOUNTER PROVIDER | | HealthFactor | EncounterStaffIEN
V HEALTH FACTORS | COMMENTS | | HealthFactor | Comments
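The source mapping above amounts to a lookup from FileMan field to warehouse column. A minimal Python sketch follows, assuming a record arrives as a dictionary keyed by FileMan field name. Only the direct rows of the mapping are included (rows that carry a ResolveFld value, which follow a pointer to another file, are left out), and the sample values are invented.

```python
# FileMan field name -> warehouse column, for rows without a ResolveFld entry.
FIELD_TO_COLUMN = {
    "HEALTH FACTOR": "HealthFactorTypeIEN",
    "PATIENT NAME": "PatientIEN",
    "EVENT DATE AND TIME": "EventDateTime",
    "LEVEL/SEVERITY": "LevelSeverity",
    "VISIT": "VisitIEN",
    "ENCOUNTER PROVIDER": "EncounterStaffIEN",
    "COMMENTS": "Comments",
}


def map_record(fileman_record):
    """Rename the fields of one V HEALTH FACTORS record to warehouse columns."""
    return {
        FIELD_TO_COLUMN[field]: value
        for field, value in fileman_record.items()
        if field in FIELD_TO_COLUMN
    }


sample = {"HEALTH FACTOR": 12, "PATIENT NAME": 345, "LEVEL/SEVERITY": "M"}  # made-up values
print(map_record(sample))
```

This is why the first category is called simple: each warehouse column is a rename of one source field, with no restructuring.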
2nd category models require
transformation – Prescription
 FileMan: Prescription and 1st fill; Refill; Partial Fill
 Data Warehouse: Prescription Only; All Fills
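One way to picture this transformation is a Python sketch that splits a FileMan-style prescription record, which carries its first fill inline with refills and partial fills as sub-lists, into a prescription-only row plus one row per fill. The record layout and field names here are assumptions for illustration, not the real Prescription file or the warehouse schema.

```python
def split_prescription(rx):
    """Return (prescription_row, fill_rows) from one nested prescription record."""
    prescription_row = {"rx_number": rx["rx_number"], "drug": rx["drug"]}

    # The original fill lives on the prescription itself in the source record.
    fill_rows = [{"rx_number": rx["rx_number"], "fill_type": "original",
                  "fill_date": rx["first_fill_date"]}]
    for fill_type, key in (("refill", "refills"), ("partial", "partials")):
        for fill_date in rx.get(key, []):
            fill_rows.append({"rx_number": rx["rx_number"],
                              "fill_type": fill_type,
                              "fill_date": fill_date})
    return prescription_row, fill_rows


rx_record = {  # made-up example data
    "rx_number": "1001",
    "drug": "ISOSORBIDE MONONITRATE 120MG TAB,SA",
    "first_fill_date": "2010-01-05",
    "refills": ["2010-02-04", "2010-03-06"],
    "partials": ["2010-01-20"],
}

prescription, fills = split_prescription(rx_record)
print(prescription)
print(len(fills), "fill rows")
```

Putting every fill in one table means an analyst can count dispensing events without knowing that the first fill is stored differently from the rest.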
3rd category models not usable
without transformation - PCMM
Levels of Data
 National – Corporate Data Warehouse (CDW)
 Region – Regional Data Warehouse (RDW)
 VISN – VISN Data Warehouse (VDW)
 Medical Center – Local Data
Entities Who Produce Business
Intelligence Products
 National – VSSC, PSSG, DMDC, HEC, ARC, DSS, BIPL, OQP, PCS, PBM
 Region – Regional BISL Teams
 VISN – VISN Data Warehouse, VISN PBM
 Local – DSS
 Bolded are ones that have substantial resources in clinical business
intelligence
 PSSG handles much of the GIS and Statistical Demography for the VA
Data Access
 VISN and Station Level – Contact Your VISN Database
Manager
 Regional/Corporate Access – Contact NDS for the 9957
Permissions
Operational Challenges of VistA
 System Resources
 $8 Billion investment over 20 years
 New needs for new domains
 MUMPS Programmers must be internally trained (and many of
them are retiring or dying)
 Communication with Other Systems
 HIMISS compliance with data interchange
 E-functions (billing, prescribing, verification)
 Interagency Cooperation – DoD and NHIN
 Business Intelligence
 Closing the data lifecycle and bringing back clinical data for
knowledge discovery
Why Is Pharmacy ALWAYS Picked
As An IT Test Case Project?
 Pharmacy is data savvy
 Local data quality control from ADPACs
 Federated data quality control from VISN PBM, CMOP, PBM
SHG
 Pharmacy is one of the few domains that have active business
logic SME’s
 Pharmacy did not contract institutional memory out the door
 Pharmacy gets things done
 Pharmacy has more technology success stories as a collective
than any other PCS office (BCMA, Automation, Central Fills)
 Pharmacy actively mines data
Acknowledgments
 Kernel – Jack Schram (Oakland OIFO)
 SQLI – Ellen Zufall (SF IRMS)
 RPC Broker and MUMPS coding – Perry Richmond
(VISN 18 BI)
 Regional Data Process – Vincent Bui and Ken Koenig
(Region I SQL Back Office Team)
Acknowledgments
 OI&T Business Intelligence Product Line (BIPL)
 Jack Bates – Manager, OI&T BIPL
 Stephen Anderson – Lead Data Architect
 Mike Baker – Lead ETL Architect
 Denver Griffith/Ken Fuchsel – Server Administrators
 Dave Fackler
 Ron Talmage
 Dan Hardan, Jeff King, Jeff Price
Questions