Peter Haug's Slide

Computable Semantics and
Probabilistic Graphical Models
Where Probabilistic Systems and
Semantics Rub Elbows
Peter Haug, MD
Homer Warner Center for Informatics Research
Intermountain Healthcare
First of all: Thanks
This work has many contributors:
Dominik Aronsky, MD, PhD
Jeffrey Ferraro, PhD
Stan Huff, MD
Scott Evans, PhD
Robert Hausam, MD
Lee Pierce
Xinzu Wu, PhD
Matthew Ebert
Kumar Mynam
And many more!
Please ask questions …
Agenda
• Why Decision Support?
• Introduction: Bayesian Diagnostic Networks
• Bayesian Systems
• A Few Bayesian Tools
• Diagnostic Systems
• Representing the Semantics of Diagnosis
• A Framework for Computable Models
• Diagnostic Modeling with Ontologies
• Ontologies -> Bayesian Network
• Clinical Data
• A Brief Look at Medical Data Forms
Computerized Decision Support:
Core Assumptions
‘... man is not perfectible. There are limits to
man’s capabilities as an information
processor that assure the occurrence of
random errors in his activities.’
~ Clement J. McDonald, MD (1976)
‘The complexity of modern medicine exceeds the
inherent limitations of the unaided human
mind.’
~ David M. Eddy, MD, Ph.D. (1990)
Patient
Underlying principle:
We are designing the system so that the
computer is an active part of patient
care, not just a way of getting data to
people to read.
The Reverend Thomas Bayes
1702 to 1761
Bayes's theory of probability was published posthumously in 1764. After
Bayes's death, Richard Price, a friend of Bayes, discovered two unpublished
essays among Bayes's papers, which he forwarded to the Royal
Society.
A Way to Think about Probabilistic Systems
(and an introduction to some terminology)
Learning from Data
• The data comes from Health Care Encounters
• It is captured in Electronic Health Records (EHRs)
• It is aggregated and organized in Enterprise Data Warehouses (EDWs)
• It includes the diagnoses and the data that support them
Bayesian Networks
• Model the joint probability distribution of the data and diagnoses
• Use directed graphs to structure these models
Re-Using Healthcare Data
Episodes of Care
Medical Information System
Enterprise Data
Warehouse
• Quality Improvement
• Measures of Care
• Clinical Research
• Medical Decision Support
Example: Patients with Symptoms of Heart Disease
Patient Population
Data Collected in a
Care Setting
Original Data
Patient ID | Myocardial Infarction | Chest Pain | ST Segment
1 | Present | Present | Elevated
2 | Absent | Absent | Normal
3 | Present | Absent | Depressed
4 | Absent | Absent | Normal
5 | Absent | Absent | Normal
6 | Absent | Absent | Normal
7 | Absent | Absent | Normal
…. | …. | …. | ….
Summarizing the Data: The Numbers
A Condensed Look at 1000 Cases
MI | No MI | Total
20 | 980 | 1000

              | MI | No MI | Total
Chest Pain    | 15 | 80    | 95
No Chest Pain | 5  | 900   | 905
Total         | 20 | 980   | 1000
Another Summary: The Joint Probability Distribution
Another View of the 2x2 Table
              | MI        | No MI       | Total
Chest Pain    | 15 (1.5%) | 80 (8.0%)   | 95 (10%)
No Chest Pain | 5 (0.5%)  | 900 (90.0%) | 905 (91%)
Total         | 20 (2%)   | 980 (98%)   | 1000 (100%)
The inner cells are the joint probabilities; the row and column totals are the "Marginal" Probabilities.
Dividing by the Column Marginals
              | MI                                   | No MI
Chest Pain    | 75% (Sensitivity: P(F|D))            | 8% (False Positive Rate: P(F|no D))
No Chest Pain | 25% (False Negative Rate: P(no F|D)) | 92% (Specificity: P(no F|no D))
Total         | 100%                                 | 100%
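The summaries above can be reproduced directly from the four cell counts. The following is a minimal Python sketch, not the presenter's tooling; the dictionary keys and variable names are illustrative assumptions, while the counts are the ones in the 1000-case table.

```python
# Cell counts from the 1000-case table, keyed by (finding, disease).
counts = {
    ("chest_pain", "mi"): 15,
    ("chest_pain", "no_mi"): 80,
    ("no_chest_pain", "mi"): 5,
    ("no_chest_pain", "no_mi"): 900,
}
total = sum(counts.values())

# Joint probability distribution: every cell divided by the grand total.
joint = {cell: n / total for cell, n in counts.items()}

# Marginal probabilities: sum the joint distribution over the other variable.
p_disease = {}
p_finding = {}
for (finding, disease), p in joint.items():
    p_disease[disease] = p_disease.get(disease, 0.0) + p
    p_finding[finding] = p_finding.get(finding, 0.0) + p

# Conditional probabilities P(finding | disease): divide by the column marginal.
p_finding_given_disease = {
    (finding, disease): p / p_disease[disease]
    for (finding, disease), p in joint.items()
}

sensitivity = p_finding_given_disease[("chest_pain", "mi")]             # P(F | D)
false_positive_rate = p_finding_given_disease[("chest_pain", "no_mi")]  # P(F | no D)
specificity = p_finding_given_disease[("no_chest_pain", "no_mi")]       # P(no F | no D)

print(f"P(MI) = {p_disease['mi']:.3f}")
print(f"P(Chest Pain) = {p_finding['chest_pain']:.3f}")
print(f"Sensitivity = {sensitivity:.2f}")
print(f"False positive rate = {false_positive_rate:.2f}")
print(f"Specificity = {specificity:.2f}")
```

Dividing by the row marginals instead of the column marginals would give the predictive values, P(D|F) and P(no D|no F).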
Bayes Equation
Inferring the probability of a Disease (D) from a Finding (F)
$$P(D \mid F) = \frac{P(D)\,P(F \mid D)}{P(F)}$$
• P(D | F): Posterior Disease Probability
• P(D): Prior Disease Probability
• P(F | D): Sensitivity
• P(F): Probability of Finding
Probability Updating
The Disease is Myocardial Infarction
The Finding is Chest Pain
$$P(\text{MI} \mid \text{Chest Pain}) = \frac{P(\text{MI})\,P(\text{Chest Pain} \mid \text{MI})}{P(\text{Chest Pain})}$$
P(MI) = 2.0% (0.02)
P(Chest Pain|MI) = 75% (0.75)
P(Chest Pain) = ?
The Question of P(F)
Simple Bayes
• Patient has One and Only One Disease
Multi-Membership Bayes
• Patient has Any Group of Diseases
• Each Disease is Evaluated Independently
Bayesian Networks
• Patient has Any Group of Diseases
• Diseases are Evaluated According to Their Collective (Joint) Behavior
For Simple Bayes, add all of the probabilities of having both the Finding and each Disease:
$$P(F) = \sum_i P(F \text{ and } D_i)$$
$$P(F) = \sum_i P(D_i)\,P(F \mid D_i)$$
The Question of P(F): Multi-Membership Bayes
• Patient has Any Group of Diseases
• Each Disease is Evaluated Independently
Two States Apply for Each Disease: With and Without the Disease
$$P(F) = P(D_i)\,P(F \mid D_i) + P(\bar{D}_i)\,P(F \mid \bar{D}_i)$$
The Question of P(F): Bayesian Networks
• Patient has Any Group of Diseases
• Diseases are Evaluated According to Their Collective (Joint) Behavior
P(F) is Determined from the Joint Effect of Child Nodes on Their Parents
[Figure: network with nodes Disease, Intermediate Concept, Finding 1, Finding 2, Finding 3, Finding 4]
Probability Updating
Multi-Membership Bayes
The Disease is Myocardial Infarction
The Finding is Chest Pain
$$P(\text{MI} \mid \text{Chest Pain}) = \frac{P(\text{MI})\,P(\text{Chest Pain} \mid \text{MI})}{P(\text{Chest Pain})}$$
$$P(\text{Chest Pain}) = P(\text{MI})\,P(\text{Chest Pain} \mid \text{MI}) + P(\text{no MI})\,P(\text{Chest Pain} \mid \text{no MI})$$
P(MI) = 2.0% (0.02)
P(Chest Pain|MI) = 75% (0.75)
P(Chest Pain) = ?
Probability Updating
Using the Multi-Membership Model
The Disease is Myocardial Infarction
The Finding is Chest Pain
$$P(\text{MI} \mid \text{Chest Pain}) = \frac{0.02 \times 0.75}{0.02 \times 0.75 + 0.98 \times 0.08}$$
P(MI) = 2.0% (0.02)
P(Chest Pain|MI) = 75% (0.75)
P(Chest Pain) = 0.02 x 0.75 + 0.98 x 0.08
Probability Updating
Using the Multi-Membership Model
The Disease is Myocardial Infarction
The Finding is Chest Pain
$$P(\text{MI} \mid \text{Chest Pain}) \approx 0.16$$
P(MI) = 2.0% (0.02)
P(Chest Pain|MI) = 75% (0.75)
P(Chest Pain) = 0.02 x 0.75 + 0.98 x 0.08
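The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch of the multi-membership update using the three numbers from the slides (prior 0.02, sensitivity 0.75, false positive rate 0.08); the function name `posterior` is an illustrative assumption.

```python
def posterior(prior, p_f_given_d, p_f_given_not_d):
    """Return P(D | F), with P(F) = P(D)P(F|D) + P(no D)P(F|no D)."""
    p_f = prior * p_f_given_d + (1.0 - prior) * p_f_given_not_d
    return prior * p_f_given_d / p_f


# P(MI) = 0.02, P(Chest Pain | MI) = 0.75, P(Chest Pain | no MI) = 0.08
p_mi_given_chest_pain = posterior(0.02, 0.75, 0.08)
print(round(p_mi_given_chest_pain, 2))  # 0.16
```

Chest pain raises the probability of MI from 2% to about 16%, because P(Chest Pain) works out to 0.0934.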
Diagnostic Bayesian Networks
(Demonstrating Different Characteristics)
Simple Bayes
• Patient has one Disease
• All findings are Conditionally Independent
Multi-Membership Bayes
• Patient can have multiple Diseases
• All Diseases are evaluated independently
Bayesian Networks
• Any relationship among diseases and findings
• Can represent any of the other models
• Multilayered models
• Graphical/probabilistic representation of knowledge
Using a Bayesian Network
Examples of Bayesian Diagnostics
In Netica (www.Norsys.com)
[Figure: A Simple Bayesian Network (One Finding). Myocardial Infarction: Present 2.00, Absent 98.0. Chest Pain: Present 9.34, Absent 90.7.]
[Figure: A Simple Bayesian Network (Several Findings). Myocardial Infarction: Present 2.00, Absent 98.0. Chest Pain: Present 9.34, Absent 90.7. ST Elevation: Present 13.6, Absent 86.4. Troponin Increase: Present 4.74, Absent 95.3.]
More Diagnostic Examples
(Myocardial Infarction)
Using Pulmonary Diseases
• Pneumonia
• Asthma
• COPD
• Pulmonary Embolism
With Increasingly Complex Models
• Simple Bayes
• Multi-Membership Bayes
• Complex Relationships
Bayesian Diagnostic Models
(Naïve Bayes)
[Figure: Netica network. Disease node states: Pneumonia 100, Asthma 0, Chronic Bronchitis 0, Other 0. Finding nodes (each Present/Absent): Elevated_WBC, Dyspnea, Wheezing, Cough, Fever.]
Bayesian Diagnostic Models
(Multi-Membership Bayes)
[Figure: Netica networks with separate disease nodes Pneumonia and Asthma (each Present/Absent), each with its own finding nodes drawn from Dyspnea, Wheezing, Elevated WBC, Fever, and Cough.]
Bayesian Diagnostic Models
(Bayesian Network: Two-Layer)
[Figure: Netica network. Disease nodes: Asthma, Pneumonia (each Present/Absent). Finding nodes (each Present/Absent): Dyspnea, Elevated WBC, Cough, Fever, Wheezing.]
Bayesian Diagnostic Models
(Multi-Layer Bayesian Network)
[Figure: Netica network. Disease nodes: Pneumonia, Asthma. Intermediate node: Systemic Inflammation. Finding nodes (each Present/Absent): Dyspnea, Wheezing, Cough, Fever, Elevated WBC.]
Bayesian Diagnostic Models
(Multi-Layer with Continuous Variables)
[Figure: Netica network. Disease nodes: Pneumonia, Asthma. Intermediate node: Systemic Inflammation. Discrete finding nodes (Present/Absent): Dyspnea, Wheezing, Cough. Continuous nodes shown as binned distributions: Temperature (bins of 0.5 from 35 to 45, 37.9 ± 1.2) and Elevated WBC (bins of 5 from 0 to 40, 8.26 ± 2.3).]
Bayesian Diagnostic Models
(Multi-Layer with Added Associations)
[Figure: Netica network. Disease nodes: Chronic Bronchitis, Pulmonary Embolus, Asthma, Pneumonia (each Present/Absent). Finding nodes (Present/Absent): Chest Pain, Dyspnea, Cough, Wheezing. Continuous nodes shown as binned distributions: Temperature (bins of 0.5 from 35 to 45, 37.8 ± 1.2) and WBC (bins of 2.5 from 0 to 40, 9.37 ± 2.6).]
Using Bayesian Diagnostic
Systems in Care
Example: Diagnosing Pneumonia?
Protocols: Computers Intervene in the Workflow
(an example from the ED)
Goal:
• Rapidly Screen for Pneumonia Patients in the ED
• Assess Risk of Death
• Apply a Pneumonia Care Protocol
Approach:
• Use Probabilistic System to Identify Patients
• Diagnostic Bayesian Networks
• Supported with Natural Language Processing*
• Suggest Enrollment in Pneumonia Protocol
• Provide Therapeutic Suggestions
*Extracts Data from the X-ray Report
Advanced CDS
(Diagnostic Models)
Example: Community-Acquired Pneumonia
Computable Medical
Knowledge Repository
Chest Xray
Reports
Chest Xray Report
Processing
(Structured Data
Extraction)
Data Supporting
Pneumonia
Assessment
Does the patient
have pneumonia?
Pneumonia
Screening Tool
Should we use the
protocol?
Pneumonia
Protocol
Enrollment
Pneumonia
Treatment
Protocol
Apply Pneumonia
Care Protocol.
Clinical Data
Repository
The Emergency Department Workflow
Embed logic and orders into the process of care
Alerting for Pneumonia in the Patient Tracking System
System Watches the Data Flow in the ED
Treatment Protocol
Uses Data from the EHR Combined with Manually Input Data
Implemented Using:
Diagnostic System
• Bayesian Network
• Model Trained Using EDW Data
NLP System
• Random Forests-Based Concept Identification
• Trained with Documents in the EDW
[Figure: Netica pneumonia network. Target node PNEUMONIA (Absent 94.9, Present 5.09). Other nodes: ChiefComplaint, Age, TempC, HeartRate, RespRate, BPSystolic, BPDiastolic, MeanBP, SpO2, WBC, BUN, Creatinine, Sodium, Chloride, NLP_FINDING (Positive 25.9, Negative 74.1), and breath-sound findings named BS_* (for example BS_CLEAR, BS_CRACKLES, BS_RALES, BS_RHONCHI, BS_WHEEZES, BS_DECREASED, BS_ABNORMAL).]
The Process of Data-Based Research
(finding the right data)
• Identify Research Problem: Clinical Researcher
• Determine Subject Availability: Clinical Researcher + Data Analyst + Terminologist
• Determine Data Availability: Clinical Researcher + Data Analyst + Terminologist
• Collect/Analyze Data: Clinical Researcher + Data Analyst + Terminologist + Statistician
• Review Results: Clinical Researcher
Supporting activities shown in the diagram: Query Database, Data Review/Analysis
Data discovery and extraction take 80-90% of the time.
Building a System to Automate Predictive Modeling
• Build a System That Can:
  • Identify the Target Patients
  • Identify Relevant Data Elements
  • Extract Patients and Data from the EDW/AHR
  • Provide Initial Analyses
  • Support Refinement
• The Key is Teaching the System a Certain Amount of Medical Knowledge
• Ontologies: Tools For Capturing Complex Medical Knowledge
Ontology-Driven Model Discovery
• Can we use knowledge embedded in ontologies to drive research?
• The Ontology would:
  • Help select research patients
  • Identify and extract relevant data
  • Provide preliminary analysis of the data
A tool to support Medical Data Mining
[Figure: system diagram. Components: Disease Ontology; Analysis Design Utility; Analytic Workbench (Screening Models, Model Comparisons, visualization of the data); Analytic Health Repository; Natural Language Processing Subsystem; Prediction Algorithm. Flows: Concept Retrieval from the Ontology; Structural Knowledge Retrieval from the Ontology; Relevant Ontologic Concepts; Concept Translation to EDW Representation; Retrieval from the Analytic Health Repository; Output Analytic Data; Analysis Results; Model Explanation (by reference to the Ontology); return of data and results to the user for further study.]
$$P(d \mid f) = \frac{P(d)\,P(f \mid d)}{\sum_i P(d_i)\,P(f \mid d_i)}$$
Ontologies Describe How Diseases Are Related
(according to ICD9)
Pneumonia
• Viral Pneumonia: Viral pneumonia (ICD9: 480)
• Bacterial Pneumonia
• Pneumococcal Pneumonia: Pneumococcal pneumonia (ICD9: 481)
• Other Bacterial Pneumonia: Other bacterial pneumonia (ICD9: 482)
  • Pseudomonas Pneumonia: Pneumonia due to Pseudomonas (ICD9: 482.1)
  • Hemophilus Pneumonia: Pneumonia due to Hemophilus influenzae (ICD9: 482.2)
  • Streptococcal Pneumonia: Pneumonia due to Other Streptococcus (ICD9: 482.3)
  • Staphylococcal Pneumonia: Pneumonia due to Staphylococcus (ICD9: 482.4)
    • Staph Aureus Pneumonia: Pneumonia due to Staphylococcus, unspecified (ICD9: 482.40)
    • MSSA Staph Pneumonia: Methicillin Susceptible Staph Aureus (MSSA) Pneumonia (ICD9: 482.41)
    • MRSA Staph Pneumonia: Methicillin Resistant Staph Aureus (MRSA) Pneumonia (ICD9: 482.42)
    • Other Staph Pneumonia: Other Staphylococcus pneumonia (ICD9: 482.49)
  • More Bacterial Pneumonias
• Bronchopneumonia: Bronchopneumonia, organism unspecified (ICD9: 485)
• More Pneumonias
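One way the disease hierarchy could help select research patients is by expanding a parent concept into all of its descendant codes before querying. The sketch below assumes that parent/child links follow the ICD9 code structure (482 over 482.x, 482.4 over 482.4x); the function names and the encounter list are hypothetical and only the codes come from the slide.

```python
# Parent -> children links, assumed from the structure of the ICD9 codes on the slide.
ICD9_CHILDREN = {
    "482": ["482.1", "482.2", "482.3", "482.4"],
    "482.4": ["482.40", "482.41", "482.42", "482.49"],
}


def descendants(code, children=ICD9_CHILDREN):
    """Return the code plus every code beneath it in the hierarchy."""
    result = [code]
    for child in children.get(code, []):
        result.extend(descendants(child, children))
    return result


def select_patients(encounters, root_code):
    """Return sorted patient ids whose diagnosis code falls under root_code."""
    wanted = set(descendants(root_code))
    return sorted({patient_id for patient_id, code in encounters if code in wanted})


# Hypothetical (patient id, ICD9 code) pairs, for illustration only.
encounters = [(1, "482.41"), (2, "480"), (3, "482.2"), (4, "481"), (1, "482.1")]
print(select_patients(encounters, "482"))  # [1, 3]
```

Asking for "Other Bacterial Pneumonia" (482) therefore also retrieves patients coded with MSSA pneumonia (482.41) without the researcher listing each code.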
Ontologies Describe How Clinical Data are Related to Diseases
Diseases
• Pneumonia
  • Pneumonia: Pneumonia, Organism unspecified (ICD9: 486)
  • Bacterial Pneumonia
  • Pneumococcal pneumonia: Pneumococcal pneumonia (ICD9: 481)
  • Other Bacterial Pneumonia: Other bacterial pneumonia (ICD9: 482)
  • More Bacterial Pneumonias
  • More Pneumonias
Relationships to clinical data
• has_Altered_VS: Temperature (Vital Signs: Temperature, LOINC: 8310-5)
• has_Altered_Lab_Value: White Blood Count (Hematology: White Blood Count, LOINC: 62239-9)
• has_Sign: Pulmonary Rales (Signs: Chest Auscultation-Rales, PTXT: 28.1.3.22.34.2.1.32)
• has_X-ray_Manifestation: Localized Infiltrate (X-ray Finding: Localized Infiltrate, SNOMED: 128309002)
• has_Micro_Manifestation: + Sputum Culture (Sputum Culture: Positive, SNOMED: 442773002)
• has_??_Manifestation: More Manifestations
Visualizing the Results
Comparing Two Models Using the ROC Curves
Inspecting the Tradeoffs in Accuracy
Extensions of Diagnostic Modeling
• Large Models
• Redundant Data
• Equations and Logic
• Temporal Models
• Following Disease Over Time
• Summarized Data as Features
[Figure: Simple Temporal Model in Netica with Time Slice 1, Time Slice 2, and Time Slice 3. Nodes: AGE, CC (chief complaint), Admit Dx: Pneumonia, PNEUMONIA, PNEUMONIA1 to PNEUMONIA3, TEMP and TEMP1 to TEMP3, WBC and WBC1 to WBC3, NLP_FINDING, NLP_FINDING1, NLP_FINDING2.]
A diagram of a simple clinical model
(A Data Object)
Clinical Element Model for White Blood Count
White Blood Count (WBCLabObs)
  data: 9.6 x 10^3
  Units: Cells per CC
  quals:
    Specimen Type (SpecimenType)
      data: Whole Blood
    Comment (Comment)
      data: Specimen Hemolyzed
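A data object like this one can be sketched as a small nested structure. The Python below is only an illustration of the shape shown in the diagram (a named observation with data, units, and qualifiers); the class and field names are assumptions and are not the actual Clinical Element Model definition.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Qualifier:
    """A qualifying element such as specimen type or comment."""
    type_name: str
    data: str


@dataclass
class Observation:
    """An observation with a value, units, and named qualifiers."""
    name: str
    type_name: str
    data: float
    units: str
    quals: Dict[str, Qualifier] = field(default_factory=dict)


# Values taken from the White Blood Count diagram.
wbc = Observation(
    name="White Blood Count",
    type_name="WBCLabObs",
    data=9.6e3,
    units="Cells per CC",
    quals={
        "Specimen Type": Qualifier("SpecimenType", "Whole Blood"),
        "Comment": Qualifier("Comment", "Specimen Hemolyzed"),
    },
)

print(wbc.quals["Specimen Type"].data)  # Whole Blood
```

Keeping the qualifiers inside the observation means decision logic can check the specimen type before interpreting the value.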
What Does a Medical Concept Look Like
(in probability space)
Concepts vary based on source, goals, and usage.
Pneumonia
• Present
• Absent
White Blood Count
• Specimen Type
• Units
• Value
Cough
• Present
• Absent
• Unknown
Pulmonary Infiltrate
(Chest X-ray Report)
• Present
• Possible
• Absent
• Unknown
What Does a Concept Look Like
Some concepts have subconcepts.
Pulmonary Infiltrate
(Chest X-ray Report)
• Present
• Possible
• Absent
• Unknown
White Blood Count
• Specimen Type
• Units
• Value
Specimen Type
• Blood
• Pleural Fluid
• Ascitic Fluid
• …
Units
• Mg per Deciliter
• Grams
• Cells per CC
• …
Value
• Real Number
What Does a Concept Look Like
Concepts can be Modeled Probabilistically
[Figure: Netica nodes, one per concept]
• Pneumonia: Present 1.50, Absent 98.5
• White_Blood_Count_Specimen: Blood 82.0, Pleural Fluid 4.00, Ascitic Fluid 2.00, Urine 12.0
• White_Blood_Count_Units: mg per deciliter, kilograms, grams, cells per cc, etc
• White_Blood_Count_Value: bins of 1000 from 0 to >= 13000, 6010 ± 2000
• Cough: Present 4.26, Absent 53.2, Unknown 42.6
• CBC_White_Blood_Count: Unavailable 95.4, remainder in bins of 1000 from 0 to >= 13000
• Pulmonary Infiltrate (Chest X-Ray Report): Present 6.22, Possible 3.38, Absent 19.2, Unknown 71.2
What Does a Concept Look Like
Concepts are (in part) defined by their relationships.
Pneumonia
• Present
• Absent
Causes
White Blood Count
• Specimen Type
• Units
• Value
Pulmonary Infiltrate
• Present
• Absent
Specimen: Blood
Units: Cells/CC
Value Thresholds:
High-9,000
Low-2,000
Reported As
Pulmonary Infiltrate
(Chest X-ray Report)
• Present
• Possible
• Absent
• Unknown
White Blood Count
• Elevated
• Normal
• Reduced
• Unavailable
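The "Reported As" relationship for White Blood Count can be read as a small mapping from the detailed concept (specimen, units, value) to the reported states. The sketch below uses the specimen, units, and thresholds on the slide; the function name and the handling of values exactly at a threshold are assumptions.

```python
from typing import Optional

HIGH = 9000.0  # "High-9,000" threshold from the slide
LOW = 2000.0   # "Low-2,000" threshold from the slide


def report_wbc(specimen: Optional[str], units: Optional[str], value: Optional[float]) -> str:
    """Map a detailed White Blood Count to Elevated / Normal / Reduced / Unavailable."""
    # Only blood specimens reported in cells per cc are interpreted.
    if value is None or specimen != "Blood" or units != "Cells/CC":
        return "Unavailable"
    if value > HIGH:  # assumption: a value exactly at the threshold counts as Normal
        return "Elevated"
    if value < LOW:
        return "Reduced"
    return "Normal"


print(report_wbc("Blood", "Cells/CC", 12000.0))         # Elevated
print(report_wbc("Blood", "Cells/CC", 6010.0))          # Normal
print(report_wbc("Pleural Fluid", "Cells/CC", 6010.0))  # Unavailable
```

In the probabilistic version of the concept, the same mapping would be expressed as a conditional probability table rather than hard rules.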
What Does a Concept Look Like
And there are a number of ways to compute Concepts.
[Figure: Netica network linking the concept nodes with the relationships "Causes" and "Reported As"]
• Pneumonia: Present 1.50, Absent 98.5
• Pulmonary Infiltrate: Present 5.41, Absent 94.6
• Pulmonary Infiltrate (Chest X-Ray Report): Present 6.22, Possible 3.38, Absent 19.2, Unknown 71.2
• White_Blood_Count_Specimen: Blood 82.0, Pleural Fluid 4.00, Ascitic Fluid 2.00, Urine 12.0
• White_Blood_Count_Units: mg per deciliter, kilograms, grams, cells per cc, etc
• White_Blood_Count_Value: bins of 1000 from 0 to >= 13000, 6010 ± 2000
• CBC_White_Blood_Count: Unavailable 95.4, remainder in bins of 1000 from 0 to >= 13000
• White_Blood_Count: Elevated 0.30, Normal 4.15, Reduced .098, Unavailable 95.4
Specimen: Blood
Units: Cells/CC
Value Thresholds: High-9,000, Low-2,000
Conclusion
• Graphical Probabilistic Models can capture the Semantics of Medical Diagnosis.
• These models can be manufactured using data collected during the course of care.
• Probabilistic models can participate in clinical care.
• Medical terminologies, embedded in Ontologies, can help to develop these models.
Comments and Questions
Questions???
Probability and Semantics
One way to think of semantics: a set of relationships between concepts
• Disease -> Finding: Pneumonia -> Cough
• Concept -> Word: Mammal -> Mouse
• Whole -> Part: Hand -> Thumb
The arrows provide links across which we can reason
P(A)
P(B|A)
A diagram of a simple clinical model
Clinical Element Model for Systolic Blood Pressure
SystolicBP (SystolicBPObs)
  data: 138 mmHg
  quals:
    BodyLocation (BodyLocation)
      data: Right Arm
    PatientPosition (PatientPosition)
      data: Sitting
What if there is no model?
Site #1
• Dry Weight: 70 kg
Site #2
• Weight: 70 kg
• Weight type: Dry (choices: Dry, Wet, Ideal)
Too many ways to say the same thing
A single name/code and value
• Dry Weight is 70 kg
Combination of two names/codes and values
• Weight is 70 kg
• Weight type is dry
Terminology
Probability
• P(D) – Probability of Disease
• Implies a Ratio or Rate
• Names: Prevalence, Prior Probability
• Location Specific: Population from a Specific Setting
$$P(D) = \frac{\text{Number with Disease}}{\text{Number in Population}}$$
More Terminology
Conditional Probability
• Probability of a Finding in a patient with a Disease
$$P(F \mid D) = \frac{\text{Number with Disease and Finding}}{\text{Number with Disease}}$$
• Probability of a Disease in a Patient with a Finding
$$P(D \mid F) = \frac{\text{Number with Disease and Finding}}{\text{Number with Finding}}$$
• Probability of Disease in a patient with Finding 1, Finding 2, neg Finding 3, Finding 4, no Finding 5, etc.
$$P(D \mid F_1, F_2, \ldots) = \frac{\text{Number with Disease and a Group of Findings}}{\text{Number with the Group of Findings}}$$
Names for the Numbers
Prevalence, Prior Probability: P(D)
MI | No MI | Total
2% | 98%   | 100%
Yet Another View
Dividing by the Row Marginals
              | MI                                      | No MI                                            | Total
Chest Pain    | 16% (Positive Predictive Value: P(D|F)) | 84%                                              | 100%
No Chest Pain | 0.6%                                    | 99% (Negative Predictive Value: P(no D|no F))    | 100%
From Data to Probabilities
Data -> Bayesian Calculation -> Probability for each diagnosis
DIAGNOSIS          | PROB
Pneumonia          | 92%
Asthma             | 14%
Chronic Bronchitis | 12%
Acute Bronchitis   | 8%
Bayes Equation
$$P(D \mid F) = \frac{P(F \text{ and } D)}{P(F)}$$
• P(D | F): Probability of Disease When the Finding is Present
• P(F and D): Probability of Both the Disease and Finding
• P(F): Probability of Finding
From probability theory:
$$P(F \text{ and } D) = P(D)\,P(F \mid D)$$
Bayes Equation
$$P(D \mid F) = \frac{P(D)\,P(F \mid D)}{P(F)}$$
• P(D | F): Posterior Disease Probability
• P(D): Prior Disease Probability
• P(F | D): Sensitivity
• P(F): Probability of Finding
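What about more findings?
• The joy of recursion!
F1 = Chest Pain
F2 = ST Elevation
F3 = CK Increased
….
$$P(D \mid F_1) = \frac{P(D)\,P(F_1 \mid D)}{P(F_1)}$$
$$P(D' \mid F_2) = \frac{P(D')\,P(F_2 \mid D')}{P(F_2)}$$
$$P(D'' \mid F_3) = \frac{P(D'')\,P(F_3 \mid D'')}{P(F_3)}$$
etc.
Each posterior becomes the prior for the next finding: P(D') is P(D | F1) and P(D'') is P(D' | F2).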
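The recursion can be sketched as a loop in which each posterior is fed back in as the next prior. This is a minimal Python illustration that treats the findings as conditionally independent given the disease and uses the two-state form of P(F). Only the chest pain numbers (0.75, 0.08) and the prior (0.02) come from the slides; the ST elevation and CK values are hypothetical placeholders.

```python
def update(prior, p_f_given_d, p_f_given_not_d):
    """One Bayes step: return P(D | F) from the current prior."""
    p_f = prior * p_f_given_d + (1.0 - prior) * p_f_given_not_d
    return prior * p_f_given_d / p_f


# (finding, P(F | MI), P(F | no MI)).
# Chest pain values are from the slides; the other two rows are hypothetical.
findings = [
    ("Chest Pain", 0.75, 0.08),
    ("ST Elevation", 0.60, 0.05),
    ("CK Increased", 0.80, 0.10),
]

p_mi = 0.02  # prior P(MI) from the slides
for name, p_f_d, p_f_not_d in findings:
    p_mi = update(p_mi, p_f_d, p_f_not_d)  # posterior becomes the next prior
    print(f"after {name}: P(MI) = {p_mi:.3f}")
```

Under the conditional independence assumption, the final value equals what a single naïve Bayes calculation over all three findings would give.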
Modeling Medical
Phenomena
Examples of Some of the Things that can
be Modeled
The Effect of Noise on the Diagnosis of Pneumonia
• Noisy Lab (continuous) Data
• Noisy Physical Exam (categorical) Data
Types of Noise
• Bias
• Imprecision
Noise/Bias modeled with Simple Discrete Distributions
[Figure: Netica network with captions "Normal and LogNormal Distributions" and "Different Types of Normal Noise/Bias"]
• Pneumonia: Present 2.00, Absent 98.0
• Real White Blood Count: binned from 0 to >= 130, 8.15 ± 2.4
• Measured White Blood Count: binned from 0 to >= 130, 8.17 ± 3.5
• Source of Result: Small SD 25.0, Big SD 25.0, Bias High 25.0, Bias Low 25.0
• Rales Really There!: Present 6.70, Absent 93.3
• Auscultated Rales: Present 18.6, Absent 81.4
• Reported By: Medical Student, Resident, Attending, Pulmonologist, Over Sensitive Med Student (20.0 each)
Boolean Logic
• Four Variables
• Five Interconnected Rules
Probabilistic Logic
• If A and B then C
  P(C) = P(A and B) = P(A) * P(B|A)
• If A or B then C
  P(C) = P(A or B) = P(A) + P(B) – P(A and B)
In a Bayesian Network, the resolution of Linked Rules Occurs Automatically
[Figure: Netica network with input nodes A, B, E (Present/Absent) and H (High/Medium/Low) and five rule nodes]
• C: If A and B then C (Present 0.20, Absent 99.8)
• D: If A or B then D (Present 20.8, Absent 79.2)
• F: If A or B or E then F (Present 24.8, Absent 75.2)
• G: If (C and D) or (E and F) then G (Present 5.19, Absent 94.8)
• I: If B and F and H = High Then I
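The two probabilistic rules can be checked by enumerating the possible worlds. The sketch below assumes A and B are independent with P(A) = 0.20 and P(B) = 0.01; those priors are an assumption chosen because they reproduce the percentages shown for nodes C (0.20) and D (20.8), and the helper name `prob` is illustrative.

```python
from itertools import product

P_A = 0.20  # assumed prior for A
P_B = 0.01  # assumed prior for B, independent of A


def prob(event):
    """Sum the probability of every (a, b) combination where event(a, b) is true."""
    total = 0.0
    for a, b in product([True, False], repeat=2):
        weight = (P_A if a else 1.0 - P_A) * (P_B if b else 1.0 - P_B)
        if event(a, b):
            total += weight
    return total


p_c = prob(lambda a, b: a and b)  # C: If A and B then C
p_d = prob(lambda a, b: a or b)   # D: If A or B then D

print(f"P(C) = {p_c:.4f}")  # 0.0020
print(f"P(D) = {p_d:.4f}")  # 0.2080
```

The enumeration agrees with the closed forms on the slide: P(A) * P(B|A) for "and", and P(A) + P(B) – P(A and B) for "or".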
Temporal Phenomena
Several Approaches to Temporal Modeling have been Proposed
Markov and Hidden Markov Models are Most Common
• Called Dynamic or Temporal Bayesian Networks
• Can Model Complex Disease Behavior
• Trained from Data Organized in “Time Slices”
• Can be Extended to Include Decisions and Utilities (become “Partially Observable Markov Decision Processes”)
The Dynamic Bayesian Network
• Can Model Changing Medical Phenomena
• Changes in the State or Status of a Disease
• Findings Caused by the Disease in its Various States
• Can be Used for Diagnosis, Prediction, and Explanation
First Time Slice
• Disease_Status: Absent 90.0, Mild 4.00, Moderate 3.00, Severe 2.00, Dead 1.00
• Test: Normal 85.7, Mildly Abnormal 8.40, Severely Abnormal 4.87, Patient Deceased 1.00
Second Time Slice
• Disease_Status1: Absent 81.6, Mild 7.54, Moderate 4.60, Severe 3.12, Dead 3.13
• Test1: Normal 78.0, Mildly Abnormal 11.1, Severely Abnormal 7.87, Patient Deceased 3.13
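Moving from one time slice to the next amounts to multiplying the current state distribution by a transition table. The sketch below starts from the first-slice Disease_Status values on the slide; the transition probabilities are hypothetical (the slide does not show them), so the result is not expected to reproduce the second-slice numbers.

```python
STATES = ["Absent", "Mild", "Moderate", "Severe", "Dead"]

# First time slice, from the Disease_Status node on the slide.
slice1 = [0.90, 0.04, 0.03, 0.02, 0.01]

# Hypothetical transition table P(state at t+1 | state at t); each row sums to 1.
TRANSITION = [
    [0.90, 0.06, 0.02, 0.01, 0.01],  # from Absent
    [0.30, 0.50, 0.15, 0.04, 0.01],  # from Mild
    [0.10, 0.20, 0.50, 0.15, 0.05],  # from Moderate
    [0.02, 0.08, 0.20, 0.50, 0.20],  # from Severe
    [0.00, 0.00, 0.00, 0.00, 1.00],  # Dead is an absorbing state
]


def next_slice(current, transition):
    """Propagate a state distribution forward by one time slice."""
    n = len(current)
    return [sum(current[i] * transition[i][j] for i in range(n)) for j in range(n)]


slice2 = next_slice(slice1, TRANSITION)
for state, p in zip(STATES, slice2):
    print(f"{state}: {100 * p:.2f}")
```

In a trained dynamic Bayesian network the transition table would be learned from data organized in time slices, and findings such as Test would hang off each slice's disease node.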
Pancreatitis Over Time
[Figure: two-slice dynamic Bayesian network in Netica (First Time Slice, Second Time Slice). Each slice has a Pancreatitis node (Acute, Recovering, Dischargeable) with finding nodes Pain, Abdominal Pain, Amylase, Lipase, Glucose, and WBC.]
Bayesian Networks and
Diagnosis
Re-Purposing Clinical Data
Strategic Goals
Minimum goal: Be able to share
applications, reports, alerts,
protocols, and decision support with
ALL customers of our same vendor
Maximum goal: Be able to share
applications, reports, alerts,
protocols, and decision support with
anyone in the WORLD
Why do we need detailed
clinical models?
How are the models used in an EMR?
Data entry screens, flow sheets, reports, ad hoc queries
• Basis for application access to clinical data
Computer-to-Computer Interfaces
• Creation of maps from departmental/external system models to the standard database model
Core data storage services
• Validation of data as it is stored in the database
Decision logic
• Basis for referencing data in decision support logic
Does NOT dictate physical storage strategy
Ontologies, Concepts, and
Probabilities
The way from medical concepts to
diagnostic models
Relational database implications
Patient Identifier | Date and Time | Observation Type | Observation Value | Units
123456789 | 7/4/2005 | Dry Weight | 70 | kg
123456789 | 7/19/2005 | Current Weight | 73 | kg

Patient Identifier | Date and Time | Observation Type | Weight type | Observation Value | Units
123456789 | 7/4/2005 | Weight | Dry | 70 | kg
123456789 | 7/19/2005 | Weight | Current | 73 | kg
How would you calculate the desired weight loss during the hospital stay?
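The question can be made concrete by running the same calculation against both table designs. The sketch below uses Python's built-in sqlite3 module with the two rows from each table; the table and column names are hypothetical, and the dates are rewritten in ISO form for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Design 1: the weight type is folded into the observation type.
CREATE TABLE obs_single_code (
    patient_id TEXT, obs_date TEXT, obs_type TEXT, value REAL, units TEXT);
INSERT INTO obs_single_code VALUES
    ('123456789', '2005-07-04', 'Dry Weight', 70, 'kg'),
    ('123456789', '2005-07-19', 'Current Weight', 73, 'kg');

-- Design 2: the weight type is stored in its own column.
CREATE TABLE obs_two_codes (
    patient_id TEXT, obs_date TEXT, obs_type TEXT, weight_type TEXT,
    value REAL, units TEXT);
INSERT INTO obs_two_codes VALUES
    ('123456789', '2005-07-04', 'Weight', 'Dry', 70, 'kg'),
    ('123456789', '2005-07-19', 'Weight', 'Current', 73, 'kg');
""")

# Desired weight loss = current weight minus dry weight, per patient.
loss_single = conn.execute("""
    SELECT cur.value - dry.value
    FROM obs_single_code AS cur
    JOIN obs_single_code AS dry ON dry.patient_id = cur.patient_id
    WHERE cur.obs_type = 'Current Weight' AND dry.obs_type = 'Dry Weight'
""").fetchone()[0]

loss_two = conn.execute("""
    SELECT cur.value - dry.value
    FROM obs_two_codes AS cur
    JOIN obs_two_codes AS dry ON dry.patient_id = cur.patient_id
    WHERE cur.obs_type = 'Weight' AND dry.obs_type = 'Weight'
      AND cur.weight_type = 'Current' AND dry.weight_type = 'Dry'
""").fetchone()[0]

print(loss_single, loss_two)  # 3.0 3.0
```

Both designs give 3 kg, but the queries differ, so logic written against one site's representation would not run unchanged at the other. That is the case for a shared detailed clinical model.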