Presentation - Association for Pathology Informatics

Pathology Informatics &
MD Anderson Cancer Center
Mark Routbort, MD, PhD
University of Texas
MD Anderson Cancer Center
Houston, Texas
MD Anderson Cancer Center
• Major tertiary care/referral center for cancer diagnostics,
treatment, and research
• Over 200 active clinical trials
• No residency program, but over 30 new fellows annually,
distributed between
– Surgical pathology
– Hematopathology
– Laboratory medicine
– Molecular pathology
• 80+ pathology/lab medicine faculty
• Visiting rotations available in pathology informatics
Visiting pathology informatics fellowships
Informatics as part of a career in pathology
• As the primary or a major focus
– Medical directorship with oversight of a technical group
– There are currently 3 active faculty vacancies in the US
and Canada fitting this description
• As a “hat” or role in a department or practice
– For an individual lab, e.g. molecular pathology
– Participation in processes (such as RFP) and
committees
– Engaging in search for solutions
• As a life-long interest
– The “go-to” person
Why do we care?
• As providers of pathology & laboratory services, interpretation and accurate
reporting of information are at the core of our clinical endeavors
• 70/70 rule-of-thumb for lab data
– 70% of clinical data points emerge from laboratory data
– Used in 70% of clinical decision making
• Pathology data
– The definitive diagnosis of cancer
• Diagnosis dictates chemotherapy/surgery and prognosis
• Staging guides protocols & management
• Practical knowledge is practically useful
Principles driving informatics at MD Anderson
• Consider the “source of truth” of data
– Seek to minimize the number of transformations, and especially human
transcriptions, data goes through
– This favors a query-based architecture of primary systems, and encourages
stewardship of data by domain experts
• Use standards where possible and practicable
– Don’t re-invent the wheel
– But don’t let the pursuit of perfection prevent forward progress
– Generally, it is most important to design well, and openly
• Leverage and develop clinician informaticians to work at the junction of
clinical practice, data management, and research
– Synergistic skills
– This means recognizing informatics as both a clinical and research service to the
institution
Pathology informatics faculty at MD Anderson
• 3 full-time faculty members with significant (>=50%) dedicated informatics time
• We are recruiting one more!
• Michael Riben, MD
– Workflow & change management
– Vocabularies/terminology
• Mary Edgerton, MD
– Institutional tissue bank
– Microarray data standards and computational modeling/analysis
First principles and practical applications
• Well covered by practical informatics course at
this conference & John Sinard’s “bible” –
Practical Pathology Informatics
• Data structures & web services
• Medical data transmission, HL7, and pathology
reporting
• Workflow fundamentals
• Information retrieval and vocabularies
The need for relational models
• A model of laboratory blood draws using a simple two-dimensional table rapidly fails
Relational databases enable an extensible model of the real world
Databases versus “Excel”
• Databases
– Model the world as entities and relationships
– Primary keys for identifying entities
– Foreign keys for linking entities
– Denormalization
– Referential integrity
• The “Excel” view (two-dimensional tables) is not bad:
– Needed for:
• Statistical analysis
• Graphing
– But it can always be derived as a “view” from the database model
Microsoft Access®, distributed with the near-ubiquitous Microsoft Office suite, is an
excellent model system for learning about relational databases
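The entity/relationship points above can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module rather than Access; the table names, column names, and sample values are all hypothetical and are not taken from any MD Anderson system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce referential integrity

conn.executescript("""
CREATE TABLE patient (
    mrn  TEXT PRIMARY KEY,              -- primary key identifies the entity
    name TEXT NOT NULL
);
CREATE TABLE blood_draw (
    draw_id  INTEGER PRIMARY KEY,
    mrn      TEXT NOT NULL REFERENCES patient(mrn),   -- foreign key links entities
    drawn_at TEXT NOT NULL
);
CREATE TABLE result (
    result_id INTEGER PRIMARY KEY,
    draw_id   INTEGER NOT NULL REFERENCES blood_draw(draw_id),
    test_code TEXT NOT NULL,
    value     REAL NOT NULL
);
-- The flat, two-dimensional "Excel" table derived as a view
CREATE VIEW flat_results AS
SELECT p.mrn, p.name, d.drawn_at, r.test_code, r.value
FROM result r
JOIN blood_draw d ON d.draw_id = r.draw_id
JOIN patient p ON p.mrn = d.mrn;
""")

conn.execute("INSERT INTO patient VALUES ('999999', 'TEST, PATIENT')")
conn.execute("INSERT INTO blood_draw VALUES (1, '999999', '2005-03-31T15:20')")
conn.executemany(
    "INSERT INTO result (draw_id, test_code, value) VALUES (?, ?, ?)",
    [(1, "WBC", 2.4), (1, "RBC", 3.03)],
)

flat_rows = conn.execute(
    "SELECT * FROM flat_results ORDER BY test_code"
).fetchall()


def insert_orphan_result():
    """Try to store a result for a draw that does not exist."""
    conn.execute(
        "INSERT INTO result (draw_id, test_code, value) VALUES (42, 'PLT', 150)"
    )
```

One draw can carry any number of results without adding columns, and the attempt to store a result for a nonexistent draw is rejected by the database rather than silently accepted.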
Relational lingua franca for the Web: XML
• Extensible Markup Language
• W3C specification for data modeling
• Human and machine readable
• Self-describing
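As a small illustration of "self-describing" and "machine readable", the sketch below parses a lab result document with Python's standard xml.etree.ElementTree module. The element and attribute names (labResult, observation, code, units) are invented for this example and do not come from any published schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: tag and attribute names label each value.
xml_text = """
<labResult>
  <patient mrn="999999"/>
  <observation code="WBC" units="K/UL">2.4</observation>
  <observation code="RBC" units="M/UL">3.03</observation>
</labResult>
"""

root = ET.fromstring(xml_text)

# Map each test code to (numeric value, units).
observations = {
    obs.get("code"): (float(obs.text), obs.get("units"))
    for obs in root.findall("observation")
}
patient_mrn = root.find("patient").get("mrn")
```

A person can read the document directly, and a program can pull out the same values by name instead of by column position.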
XML Schema
• Describes what conforming XML data should look like (a blueprint)
• Required and optional elements and attributes, cardinality, data types, complex structure
• Conformance to a schema represents a contract for exchange of information
Web services and WSDL
• Web services:
– Software constructs designed to support interoperable Machine to Machine
interaction over the World Wide Web
– Service provider
– Service requester
– Optionally, service broker
– Most commonly enabled by the SOAP (simple object access protocol)
specification
• WSDL: Web services description language
– an XML-based language that provides a model for describing Web services
– Modern application development environments (Java, .NET) can consume
WSDL directly to automatically create client (requester)-side code to
consume the service
MD Anderson SPiDR
• Web services based “shared pathology data repository”
for all clinical lab and pathology data
Web services and schemas can greatly facilitate
connections to complex data sources
1. Instantiate the service proxy
2. Execute a service call to return a LabData object
3. Bind the grid to the LabData object
The data model/schema directly dictates
the run-time appearance:
Information transfer: Health Level 7 (HL7)
• Messaging standard for health care inter-systems communication
• Founded 1987, versions 2.1, 2.2, 2.3 from 1990-1999, in wide use for
communicating lab and pathology results (version 2.x)
• ANSI standard
CBC (Supergroup) result message examples - Partial result message
MSH|^~\&|ESI|LAB|INVISION_PMS|HIS|20050331155000-0600||ORU^R01|2980822|T|2.1
PID|1||000000000999999|00000|TEST^MICKEY^N||19400313|F||W|||||||UNK|000010501880256|428827901
PV1|1|O|DICT^DICT|||||||731||||HIS|||0000361^WALTERS, RONALD S. M|R||||||||||||||||||||||||||200503011442000600|20050402155000-0600
OBR|1|5500280|01014775200001550550028025032847925032847900000000101|5500312^CBC^COMPLETE BLOOD CNT/DIF/PLT|RT|20050331152000-0600|20050331154200-0600|||PCCGS^SO, CELIA G.||||20050331154300-0600||0000361^WALTERS, RONALD S. M||1||0000509003089|G|||LA|P||^^^200503311520^^RT
OBX|001|NM|5500009^WBC^WHITE BLOOD CELL COUNT|| 2.4|K/UL| 4.0-11.0|L|||F||00000000000000225200|20050331155000.0000-0600|IIM^INSTRUMENT PERFORMED ID|PCNDA^ACOSTA, NOEL D.
OBX|002|NM|5500018^RBC^RED BLOOD CELL COUNT|| 3.03|M/UL| 4.00-5.50|L|||F||00000000000000225200|20050331155000.0000-0600|IIM^INSTRUMENT PERFORMED ID|PCNDA^ACOSTA, NOEL D.
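The delimiter structure of the message above (segments on separate lines, fields split by "|", components split by "^") can be sketched as follows. This is a simplified illustration, not a conforming HL7 parser: it ignores escape sequences, repetitions, and the special numbering of the MSH segment. The sample is a shortened copy of the message, and writing the reference ranges with a hyphen (4.0-11.0) is an assumption about the original values.

```python
def parse_hl7(message: str) -> list[dict]:
    """Split an HL7 v2 message into segments, fields, and components."""
    segments = []
    for line in message.strip().splitlines():  # splitlines also handles "\r"
        fields = line.split("|")
        segments.append({
            "type": fields[0],
            # Each field becomes a list of its "^"-separated components.
            "fields": [f.split("^") for f in fields],
        })
    return segments


def extract_observations(segments: list[dict]) -> list[dict]:
    """Collect numeric results from OBX segments."""
    results = []
    for seg in segments:
        if seg["type"] != "OBX":
            continue
        f = seg["fields"]
        results.append({
            "code": f[3][1],             # e.g. WBC
            "name": f[3][2],
            "value": float(f[5][0]),
            "units": f[6][0],
            "range": f[7][0].strip(),
            "flag": f[8][0],             # L = below the reference range
        })
    return results


SAMPLE = "\r".join([
    "MSH|^~\\&|ESI|LAB|INVISION_PMS|HIS|20050331155000-0600||ORU^R01|2980822|T|2.1",
    "OBX|001|NM|5500009^WBC^WHITE BLOOD CELL COUNT|| 2.4|K/UL| 4.0-11.0|L|||F",
    "OBX|002|NM|5500018^RBC^RED BLOOD CELL COUNT|| 3.03|M/UL| 4.00-5.50|L|||F",
])

parsed = parse_hl7(SAMPLE)
lab_values = extract_observations(parsed)
```

Note that the meaning of each value depends entirely on its position in the segment, which is one reason a receiving system needs its own display logic.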
How is information commonly conveyed?
[Diagram: Pathologist → entry by pathologist, transcriptionist, or resident → AP-LIS (“native” pathology report, e.g. “DIAGNOSIS: Metastatic adenocarcinoma.”) → format conversion to ASCII text → HL7 → interface engine → HL7 → HIS database/viewer → custom display logic → Clinician]
HL7 is not WYSIWYG (what you see is what you get)
HIS viewer
Pathology system
The integrity of semantic content
is at stake in any transformation
“Direct” electronic delivery of pathology reports
[Diagram: Pathologist → entry by self, transcriptionist, or resident → “native” pathology report stored in PowerPath database in Rich Text Format (RTF) → web service based direct query for report → custom path report viewing control in HIS viewer → Clinician]
Current pathology reporting at MD Anderson
Web
service
Pathology system
EMR
Workflow foundations from an informatics perspective
• Data model of the objects involved in your business
process
• Defined transitions between states associated with rules
and events
• Identify objects with machine readable technologies
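A minimal sketch of the three foundations above: an object identified by a bar code, a set of states, and rules saying which event may move it from one state to the next. The state and event names are hypothetical and are not MD Anderson's actual workflow model.

```python
from dataclasses import dataclass, field

# Hypothetical rules: (current state, event) -> next state.
TRANSITIONS = {
    ("received", "accession"): "accessioned",
    ("accessioned", "gross"): "grossed",
    ("grossed", "transcribe"): "transcribed",
    ("transcribed", "sign_out"): "signed_out",
}


@dataclass
class Specimen:
    barcode: str                      # machine readable identifier
    state: str = "received"
    history: list = field(default_factory=list)

    def fire(self, event: str) -> str:
        """Apply an event, refusing any transition the rules do not define."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        self.history.append(key)
        self.state = TRANSITIONS[key]
        return self.state


specimen = Specimen(barcode="S-1")
specimen.fire("accession")
specimen.fire("gross")
```

Because every change of state passes through one function, the history doubles as a source of workflow metrics.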
Workflow foundations: asset identification
• Bar coding
– 1D or 2D machine readable data encodings
– Line of sight (laser or digital imaging detection)
– Can represent
• machine readable form of human information, or
• a unique identifier for the asset
• RFID (radio frequency identification)
– digital data encoded in an RFID tag is captured by a
reader using radio waves
– non-line of sight
– can multiplex
– active, passive, and hybrid forms
– relatively expensive compared to bar codes
RFID vs. Barcode
• RFID
– Does not require “line of sight”
– No-contact
– No operator
– Simultaneous (parallel) identification
– Data storage is greater (up to 30x)
– Smaller tag size required
– Read reliability: eliminates multiple scanning attempts
– Harder to deploy
• Barcode
– Line of sight required
– Requires operator most of the time
– Serial identification
– Limited information (increased with 2D)
– Requires larger tag size
– May require multiple scan attempts
– Cheap
– Easily deployable
Workflow example at MD Anderson:
Introducing new technology to the grossing lab
Background
• Previous system: Telephony based with very limited dictation
workflow control or metrics
• “Free-flow” dictation style with batching of numerous
dictations in a single session
• No connection between dictation system and AP-LIS
• Significant percentage of transcription time spent
listening to dictation for case information, typing in
accession numbers, and loading case into PowerPath
(estimate 10-20% depending on case type)
Background
• Routing and priority
dependent on correct
punching of numbers on
touch pad
• “Paper-towel syndrome”
provoked by profound
distrust in system
Goals
• Replace dictation system with telephony independent solution
• Use non-proprietary dictation hardware if possible
• Needed solution that can tolerate conditions of grossing
environment (fluids, biohazards)
• Bring clinical and ancillary information closer to grossing personnel
• Drive system with bar codes
– Connect physical specimens to AP-LIS
– NO numeric input by humans
– Wanted to be focus-free (no keyboard wedges)
• Route priority and process according to known workflow rules
based on case type
Hardware selection
Ergotron LX wall mount and
Elo 1529 Touch panels
InSync Buddy
Microphone
Symbol MS3207
Scanners
Kinesis Footpedals
Solution: Software
• Dictation module – WinScribe™
– Client/server application capable of operating with numerous off-the-shelf
dictation hardware solutions
– Basic API for automation
– Capable of basic rule & priority based routing
– Attractive licensing model for our installation (concurrent transcriptionists)
• AP-LIS – PowerPath™
• Touch panel keyboard (for system login) – Click-N-Type freeware
• Workflow “wrapper” – PathStation
– MD Anderson custom workflow application for Pathology
– Supplements, not supplants, existing AP-LIS
– Integrate bar coding with WinScribe, PowerPath, and EMR
– Made it possible to create a fully hands-free dictation environment, except for
initial login
Specimen arrival
• Institutional label
• MRN is bar coded in Code 39
Institutional bar code (MRN) drives accessioning
• If no recent specimen match, start a new case on the patient
• If recent specimens exist, system offers choice of new case
or add on to existing case
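The accessioning rule above can be sketched as a small decision function. The slides do not say how "recent" is defined, so the seven-day window here is purely an assumption, as are the record layout and the accession numbers.

```python
from datetime import date, timedelta

RECENT_WINDOW = timedelta(days=7)  # assumption: "recent" is not defined in the slides


def accession_options(mrn: str, cases: list[dict], today: date) -> list[str]:
    """Return the choices offered after a patient's MRN bar code is scanned."""
    recent = [
        c for c in cases
        if c["mrn"] == mrn and today - c["received"] <= RECENT_WINDOW
    ]
    if not recent:
        return ["new_case"]          # no recent specimen match: start a new case
    # Recent specimens exist: offer a new case or an add-on to each existing case.
    return ["new_case"] + [f"add_to:{c['accession']}" for c in recent]


# Hypothetical existing cases for one patient.
existing_cases = [
    {"mrn": "999999", "accession": "S05-0001", "received": date(2005, 3, 29)},
    {"mrn": "999999", "accession": "S05-0000", "received": date(2005, 1, 3)},
]
options = accession_options("999999", existing_cases, date(2005, 3, 31))
```

The scanned MRN is the only input, so no one types a number at any point in the decision.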
Ready to gross with new labels that drive workflow
Scanning specimen at workstation:
1. Opens case in PowerPath
2. Starts a new dictation in the WinScribe system
3. Sets the patient & case context in PathStation
Real-time dictation job overview and review
Payoffs
• No human data entry of accessions or MRNs either by
dictators or transcriptionists
• Dictations instantly and easily available for review at the
case level
• No prioritization or case type flags entered by hand –
simply scan specimen or requisition and dictate
• Better dictation turn around times with new system
• Well-established framework for further enhancement
– bar codes on case paperwork also drive pathologist workflow at
signout
• Users have found novel uses of the system
interconnects (patient and case context synchronization)
Information retrieval
“Those who cannot remember the past are condemned to repeat it.”
George Santayana
• Effective retrieval and analysis from our laboratory and
pathology information systems is key to knowledge
development
– Transactional queries: Where is this specimen right now?
– Analytic queries: How many lab tests of each type are we doing
monthly?
– Identification queries: Can I find cases of metastatic
rhabdomyosarcomas in our database?
Mechanics of search engines
[Diagram: Spidering/crawling → Corpus → Document caching and indexing → Query tools: keywords, phrases, conjunction, ranking]
Searching the web vs searching pathology reports
• Web based searches:
– Favor relevance (precision) over full retrieval (recall), thus
• Ranking extremely important
• Pathology reports – predominantly case-finding for:
– Examples (full recall not needed)
– Cohort generation (need high recall and precision)
– Data extraction (similar to cohort generation, but with added goal
of automatically or semiautomatically harvesting granular data)
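Precision and recall have simple definitions that make the trade-off above concrete: precision is the fraction of retrieved cases that are relevant, and recall is the fraction of relevant cases that were retrieved. The case numbers below are made up for illustration.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    """Return (precision, recall) for one query."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 cases returned, 6 cases truly relevant, 2 in common.
retrieved_cases = {1, 2, 3, 4}
relevant_cases = {2, 3, 5, 6, 7, 8}
precision, recall = precision_recall(retrieved_cases, relevant_cases)
```

Finding a teaching example only needs one good hit near the top of the list, while building a cohort needs both numbers to be high.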
Foundations: inverted files
(text indexes; concordances)
• The key to efficient retrieval
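A minimal sketch of an inverted file: instead of scanning every report for a word, the index maps each word to the set of reports containing it. The three report texts are invented examples, and the tokenizer (lowercase letters and digits only) is a simplifying assumption.

```python
import re
from collections import defaultdict


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the IDs of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search_all(index: dict[str, set[str]], *words: str) -> set[str]:
    """Return IDs of documents containing every query word (conjunction)."""
    postings = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*postings) if postings else set()


reports = {
    "A": "Metastatic adenocarcinoma.",
    "B": "No evidence of metastatic carcinoma.",
    "C": "Follicular lymphoma.",
}
index = build_index(reports)
metastatic_hits = search_all(index, "metastatic")
```

A query only touches the entries for its own words, so the cost depends on the query rather than on the size of the whole corpus.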
The power of indexes
• Library of Congress
– ~ 20 million books
– ~ 5 trillion words
– If fully indexed, a binary search could find
any word by looking at about 42 entries
– At that point, you’d have a list of every
single book of the 20 million which
contained that word
• Indexes versus one-dimensional
catalogs
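The figure for binary search follows from the fact that each probe halves the remaining range, so the worst case over n sorted entries is floor(log2 n) + 1 probes. The sketch below checks this for the slide's approximate counts; it treats all 5 trillion words as separate sorted entries, which is a simplifying assumption.

```python
def max_probes(n_entries: int) -> int:
    """Worst-case number of entries a binary search examines among n sorted entries."""
    # floor(log2(n)) + 1, computed exactly with integer arithmetic.
    return n_entries.bit_length()


words_in_library = 5_000_000_000_000   # ~5 trillion, from the slide
books_in_library = 20_000_000          # ~20 million, from the slide

probes_for_words = max_probes(words_in_library)
probes_for_books = max_probes(books_in_library)
```

For 5 trillion entries the worst case works out to 43 probes, in line with the slide's rough figure, and a sorted list of 20 million book titles would need at most 25.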
Text index based retrieval
[Diagram: results table with columns Diagnosis (dx), Gross description (gd), Clinical history (e.g. “None given.”), and SNOMED (sc); full-text index (system); table index (SQL Server)]
Text index based retrieval tool at MD Anderson
Text-based retrievals versus data element capture (synoptic reporting)
• Text retrievals
– With appropriate index engine, can be extremely fast
– Simple to use
– Do not require any modification of incoming data
– Very good at retrieving rare diagnoses
– Very poor at
• Data extraction
• Semantic retrievals (find all cases which have metastatic carcinoma, not those
in which the phrase “no evidence of metastatic carcinoma” occurs)
• Intraobserver retrieval
• Synoptic reporting/structured documentation – elements in report are defined
and maintained for downstream use
– Separation of presentation of report from content/data
– Data elements are “marked up” and remain searchable
• What is the average size of largest lymph node metastasis in tumor X?
• With clinical data correlation, can determine clinically relevant elements
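The contrast between the two approaches can be shown with three invented reports. Each carries free text plus two structured elements; the field names (metastasis_present, largest_met_mm) are hypothetical stand-ins for synoptic data elements.

```python
reports = {
    "case-1": {
        "text": "Metastatic carcinoma involving two lymph nodes.",
        "metastasis_present": True,
        "largest_met_mm": 12.0,
    },
    "case-2": {
        "text": "No evidence of metastatic carcinoma.",
        "metastasis_present": False,
        "largest_met_mm": None,
    },
    "case-3": {
        "text": "Metastatic carcinoma in one lymph node.",
        "metastasis_present": True,
        "largest_met_mm": 8.0,
    },
}

# Text retrieval: matches the phrase, including the negated report.
text_hits = sorted(
    case_id for case_id, r in reports.items()
    if "metastatic carcinoma" in r["text"].lower()
)

# Structured retrieval: uses the captured data element instead of the wording.
structured_hits = sorted(
    case_id for case_id, r in reports.items() if r["metastasis_present"]
)

# Data extraction: average size of the largest metastasis, where recorded.
sizes = [r["largest_met_mm"] for r in reports.values() if r["largest_met_mm"] is not None]
mean_largest_met_mm = sum(sizes) / len(sizes)
```

The phrase search returns the negative case as a false positive, while the structured query excludes it and also answers the size question directly.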
Vocabularies/ontologies: SNOMED CT
• A systematized, hierarchical nomenclature (ontology) facilitating
– Concept (not word) based retrieval
– Aggregation and subsumption
– Every concept has interrelationships with other concepts that provide
logical computer readable definitions. These include hierarchical
relationships and clinical attributes.
[Hierarchy diagram: Disease > Cancer > Carcinoma (Adenocarcinoma, Squamous cell carcinoma) and Lymphoma; Lymphoma > Hodgkin lymphoma (Lymphocyte-predominant Hodgkin lymphoma; Classical Hodgkin lymphoma: Nodular sclerosis HD, Mixed cellularity HD, Lymphocyte rich HD) and Non-Hodgkin’s lymphoma > Non-Hodgkin B-cell lymphoma (Mantle cell lymphoma, Follicular lymphoma)]
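Subsumption means that a query for a general concept also retrieves cases coded with any of its more specific descendants. The sketch below uses a small parent table based on part of the hierarchy in the diagram; the parent links are an assumption about how the diagram is laid out, and real SNOMED CT uses numeric concept identifiers and allows a concept to have more than one parent, which this simplification ignores.

```python
# Child concept -> parent concept (single-parent simplification).
PARENT = {
    "Cancer": "Disease",
    "Carcinoma": "Cancer",
    "Adenocarcinoma": "Carcinoma",
    "Squamous cell carcinoma": "Carcinoma",
    "Lymphoma": "Cancer",
    "Hodgkin lymphoma": "Lymphoma",
    "Classical Hodgkin lymphoma": "Hodgkin lymphoma",
    "Non-Hodgkin's lymphoma": "Lymphoma",
    "Non-Hodgkin B-cell lymphoma": "Non-Hodgkin's lymphoma",
    "Mantle cell lymphoma": "Non-Hodgkin B-cell lymphoma",
    "Follicular lymphoma": "Non-Hodgkin B-cell lymphoma",
}


def is_a(concept: str, ancestor: str) -> bool:
    """True if `concept` is `ancestor` or one of its descendants."""
    current = concept
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)   # walk up; None once past the root
    return False


# Hypothetical coded cases.
CODED_CASES = {
    "case-1": "Mantle cell lymphoma",
    "case-2": "Adenocarcinoma",
    "case-3": "Follicular lymphoma",
}


def find_cases(ancestor: str) -> list[str]:
    """Return cases whose code is subsumed by the query concept."""
    return sorted(cid for cid, code in CODED_CASES.items() if is_a(code, ancestor))


lymphoma_cases = find_cases("Lymphoma")
```

Neither lymphoma case contains the bare code "Lymphoma", yet both are retrieved, which is what distinguishes concept-based retrieval from word matching.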
Summing up –
Major pathology informatics projects in the last 3 years at MD Anderson
• Completed
– Creation of a real-time RTF viewer for pathology reports in our
EMR to improve legibility and comprehensibility of reports
– Importation of a large scale legacy Fortran store of lab data going
back to the 1970’s into a modern relational format
– Creating a unified, structurally robust repository for all clinical lab
and pathology data which is used in a federated architecture by
our EMR over web services
– Real-time document scanning of all paperwork associated with
pathology cases
– Creation of a “Workflow integration application” (PathStation) for
pathologists which unifies multiple disparate applications under
single-signon and patient context, driving workflow through bar
codes
– Integration of the PathStation application in the grossing room to
drive dictation and associated grossing workflow
Summing up –
Major pathology informatics projects in the last 3 years at MD Anderson
• In progress
– Pathology workflow optimization: Integrate bar coded identification of
pathology blocks and slides with a real-time workflow model
– Expansion of use of the shared pathology data repository to research
users and non-transactional query models
– Virtual slide implementation for outside consultation material
– Implementation of SCC SoftLab LIS as well as specialized modules for
• Cytogenetics
• Flow cytometry
• Molecular diagnostics
– Establishment of enterprise vocabulary services for the institution (Mike
Riben)
– Microarray analysis and statistical modeling of breast carcinoma (Mary
Edgerton)
– Tissue banking enhancements (Mary Edgerton)
Summing up - Optimizing the pathology informatics cycle
[Diagram: cycle with three vertices: Domain expertise, Information models and technologies, Workflow]
• Only imagination and time limit the scope of work to do
• There is a tremendous synergism available when someone possesses the
skill and inclination to engage at all vertices
• Consider advanced training in informatics