Similarity measure between patient traces for clinical pathway analysis

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
Similarity measure between patient traces for
clinical pathway analysis: problem, method, and
applications
Zhengxing Huang*, Wei Dong*, Huilong Duan**, Haomin Li**
Abstract—Clinical pathways leave traces, described as event
sequences with regard to a mixture of various latent treatment
behaviors. Measuring similarities between patient traces can
profitably be exploited further as a basis for providing insights
into the pathways, and complementing existing techniques of
clinical pathway analysis, which mainly focus on looking at
aggregated data seen from an external perspective. Most existing methods measure similarities between patient traces via
computing the relative distance between their event sequences.
However, clinical pathways, as typical human-centered processes,
always take place in an unstructured fashion, i.e., clinical events
occur arbitrarily without a particular order. Imposing an order on the
chaos of clinical pathways may reduce the accuracy of similarity
measures between patient traces, and may degrade the efficiency of
further analysis tasks. In this paper, we present a behavioral topic
analysis approach to measure similarities between patient traces.
More specifically, a probabilistic graphical model, i.e., Latent
Dirichlet Allocation, is employed to discover latent treatment
behaviors of patient traces for clinical pathways such that
similarities of pairwise patient traces can be measured based
on their underlying behavioral topical features. The presented
method provides a basis for further applications in clinical
pathway analysis. In particular, three possible applications are
introduced in this paper, i.e., patient trace retrieval, clustering,
and anomaly detection. The proposed approach and the presented
applications are evaluated via a real-world data-set of several
specific clinical pathways collected from a Chinese hospital.
Index Terms—Clinical pathway analysis, Similarity measure,
Latent Dirichlet Allocation, Patient trace clustering, Patient trace
retrieval, Anomaly detection
I. INTRODUCTION
Clinical pathways define the essential components of the
complex health-care process, with the objective of linking evidence to practice for specific health conditions and, thereby,
optimizing patient outcomes and maximizing clinical efficiency
[1–6]. They have been proposed to support the translation of
clinical guidelines into local protocols and clinical practice [7],
and as a strategy, to optimize resource allocation in a climate
of increasing health-care costs [8, 9].
A preliminary version of this paper appeared in the 14th Conference on
Artificial Intelligence in Medicine (AIME2013)
Zhengxing Huang and Huilong Duan are from the College of Biomedical
Engineering and Instrument Science of Zhejiang University. The Key Laboratory of Biomedical Engineering, Ministry of Education, China.
Wei Dong is with the Department of Cardiology, Chinese PLA General
Hospital.
Haomin Li is from the College of Computer Science of Zhejiang University.
*Both authors contributed equally to this work.
**Corresponding authors: duanhl@zju.edu.cn, haomin li@yahoo.com
Clinical pathway analysis (CPA) has experienced increased
attention over the years due to its importance to health-care
management in general and its usefulness for capturing the
actionable knowledge and interesting insights to administrate,
automate, and schedule the best practice for individual patients
in clinical pathways [3, 10, 11]. For example, it is possible to
discover a clinical pathway model from past clinical pathway
instances (i.e. patient traces) [3], detect the anomalies in clinical pathways [11], identify care-points where patient traces
deviate from expected and/or normative medical behaviors
[12], and enrich pathway models based on patient traces, etc.
Predominant approaches to CPA take an external perspective of clinical pathways [3]. For example, Muluk et al.
[13] evaluated the effects of the clinical pathway of nonurgent
abdominal aortic aneurysm surgery, i.e., charges, length of
stay, and mortality rate. Barbieri et al. [14] presented a
meta-analysis method to evaluate the use of clinical pathways for hip and knee joint replacements by assessing the
major outcomes of in-hospital hip and knee joint replacement
processes: postoperative complications, number of patients
discharged home, length of stay, and direct cost, etc. Kul
proposed a patient survival analysis for clinical pathways
[15]. As valuable as these approaches are, they typically look
at aggregated data seen from outcome measures, e.g., length of
stay, mortality, and infection rate, etc. [10], and thus restrict
the attention to an external perspective of CPA. In clinical
settings, pathways are evolving and clinicians typically have
an oversimplified and incorrect view of the actual clinical
pathways. In this regard, health-care organizations need
insights into clinical pathways that enable various
types of analysis.
In this study, we argue that a careful inspection of patient
traces can support health-care organizations to analyze and
improve clinical pathways from an internal perspective. Patient
traces properly group sets of consistent examples, representing
frequent, similar modifications to instances of the same pathway model, and allowing generalized knowledge to be extracted for
clinical pathways. Measuring similarities between patient
traces can be useful to health-care organizations for a
number of reasons, including better overall clinical pathway
management and maintenance [16]. For example, similar patient traces can be grouped to exploit the specific knowledge
or previously experienced situations, identify standardized and
consolidated clinical pathways, and retrieve suggestions on
how to improve and optimize clinical pathways, etc.
In order to measure similarities between patient traces, it is
a common technique to provide a measure of distance in the
Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.
feature space, e.g., to compute similarity primarily by using
event sequences of patient traces. Traditional sequence
similarity measures focus on direct matching
between sequences, commonly applying classical distance
concepts. They may not be appropriate for measuring similarities
between patient traces for clinical pathways. Clinical pathways, as typical human-centric processes, always take place
in an unstructured fashion, i.e., clinical events may occur
arbitrarily without a particular order in the pathways. Bringing
order to the chaos of clinical pathways probably requires
similarity measure strategies that differ from the existing
methods [17].
To this end, we employ a probabilistic graphical model,
i.e., Latent Dirichlet Allocation (LDA) [18], to measure similarities between patient traces for clinical pathways. The
assumption made is that the possible treatment behaviors of
patient traces in clinical pathways may be represented by a
relatively small number of simple and common behavioral
topics, where each topic is characterized by a probability
distribution over treatment behaviors, i.e., a set of specific
clinical events performed on specific patients. The derived
treatment topics can be combined with the original patient
traces to measure similarities between traces. Many further
interesting applications, e.g., patient trace retrieval, clustering,
and anomaly detection, can be performed based on similarity
to analyze clinical pathways.
The remainder of this paper is organized as follows: We
present our similarity measure method in Section 2. Section 3
experimentally evaluates our approach based on three typical
applications, i.e., patient trace retrieval, clustering and anomaly
detection. We present the system prototype in Section 4.
Finally, Section 5 concludes and discusses possible directions
for future work.
II. METHOD
In this section, we first introduce some notation and terminology for the patient trace representation. This is followed
by a description of the proposed similarity measure between
patient traces for clinical pathways.
A. Patient trace representation
Clinical pathways leave traces, described as sequences of
clinical events with regard to a mixture of various latent
treatment behaviors. Typically, we assume that it is possible to
sequentially record various kinds of clinical events in clinical
pathways such that each event refers to a clinical activity
(i.e., a well-defined step in clinical pathways) and is related
to a particular patient (i.e., a patient trace). Furthermore,
additional information may be available, such as the time-stamp of the event and
patient data elements recorded with the event (e.g., age, sex,
first diagnosis code, and care level). In general, hospital
information systems record such information. To introduce the
patient trace representation model and our similarity measure
method, we first define the following concepts.
Definition 1: Let $E$ be the set of clinical events$^1$. A patient
trace is a non-empty sequence of clinical events performed on
a particular patient, i.e., $c = \langle e_1, e_2, \ldots, e_n \rangle$, where $e_i \in E$
($1 \le i \le n$) is a particular clinical event. For convenience,
let $c(i)$ be the $i$th clinical event in the trace. A patient trace
repository $R$ is a multi-set of patient traces.
For example, Table 1 shows an example of a patient trace
repository, which consists of ten patient traces, i.e., $R = \{c_1, c_2, \cdots, c_{10}\}$. Each clinical event in the repository is
linked to a particular trace and globally unique, i.e., the same
event cannot occur twice in a repository. For example, let
$e = (\mathit{Adm}, 1)$ be a specific clinical event, which indicates
that the patient is admitted at time stamp 1. For the
sake of simplicity, the time stamps of these event examples
are integer values; however, they could also be presented as date-format time stamps. A patient trace in a repository represents a
particular clinical process instance, also referred to as a “case”, of
the treatment of a patient. The trace contains a set of clinical
events, which spread along the observed time period of the
patient’s length of stay. Table 2 lists the meaning of these
event types.
B. Similarity measure between patient traces
As mentioned earlier, a patient trace is represented by
a mixture of treatment behaviors, w.r.t specific categories
of clinical events in clinical pathways. In this study, we
employ a specific topic analysis approach, i.e., Latent Dirichlet
Allocation, to mine the set of latent treatment behavioral topics
from patient trace repository. And then, based on the derived
treatment behavioral topics, similarities among patient traces
can be measured efficiently.
LDA has been widely used to model the generative process
of a text document corpus, where a document is summarized
as a mixture of topics. With respect to our study, patient traces
are a mixture of latent treatment behaviors. Since treatment
behaviors are recognized as sets of clinical events, we can
extract clinical event types to represent “words” in the model,
and clinical events of a particular patient trace are combined
to form a “document”. All patient traces in the repository are
thus converted into a collection of documents.
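The conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy repository, the trace identifiers `c_a` and `c_b`, and the helper names `trace_to_document` and `build_corpus` are hypothetical and do not come from the paper.

```python
from collections import Counter

# Hypothetical toy repository: each trace is a list of (event_type, time_stamp) pairs.
repository = {
    "c_a": [("Adm", 1), ("EBT", 1), ("Vei", 2), ("Vei", 3), ("Dis", 5)],
    "c_b": [("Adm", 1), ("Con", 1), ("BTH", 4), ("Dis", 6)],
}

def trace_to_document(trace):
    """Drop the time stamps and keep the event types as the 'words' of a document."""
    return [event_type for event_type, _ in trace]

def build_corpus(repo):
    """Turn every trace into a document and collect the event-type vocabulary."""
    documents = {tid: trace_to_document(tr) for tid, tr in repo.items()}
    vocabulary = sorted({w for doc in documents.values() for w in doc})
    bags = {tid: Counter(doc) for tid, doc in documents.items()}  # bag-of-events view
    return documents, vocabulary, bags

documents, vocabulary, bags = build_corpus(repository)
```

Note that the bag-of-events view deliberately ignores event order, which matches the premise that clinical events may occur without a particular order.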
In general, LDA helps to explain the behavioral similarity
of patient traces by grouping clinical events into unobserved
sets. As shown in Figure 1, a mixture of these sets then
constitutes the observable patient trace. The generative process
of LDA is as follows. For each patient trace $c$, a mixture of
topic proportions $\theta_c \sim \mathrm{Dir}(\alpha)$ is sampled from a Dirichlet
distribution parameterized by the hyperparameter $\alpha$. Each
clinical event $e$ in a trace is generated by first sampling a
topic $t$ from a multinomial distribution $t \sim \mathrm{Mult}(\theta_c)$, and then
sampling $e \sim \mathrm{Mult}(\phi_t)$, also from a multinomial distribution.
For each treatment behavioral topic $t$, $\phi_t \sim \mathrm{Dir}(\beta)$ is
sampled from a Dirichlet distribution parameterized by $\beta$. In
1 Some clinical events might have a duration, i.e., they are conducted not
at a specific time-stamp, but over a time period. However, such a clinical
event can be assumed to consist of a pair of sub clinical activities, i.e., a
start event and an end event, which mark the beginning and the end of the
activity, respectively. In this study, we assume that clinical events are time point
events, and intervals are represented by starting and ending time point events.
TABLE I
EXAMPLE PATIENT TRACES FOR THE INTRACRANIAL HEMORRHAGE CLINICAL PATHWAY. THE TRACES ARE SIMPLIFIED INFORMATION EXTRACTED FROM PATIENT RECORDS OF ZHEJIANG HUZHOU CENTRAL HOSPITAL OF CHINA.
c1: ⟨(Adm, 1), (EPT, 1), (EKS, 1), (EBT, 1), (OxS, 1), (Iso, 2), (OxI, 2), (OxS, 2), (BT, 2), (Mic, 2), (Coa, 2), (Uri, 2), (Vei, 2), (EBT, 2), (EKS, 2), (Hig, 2), (Hep, 2), (Ele, 2), (Cat, 2), (Sto, 2), (Rep, 3), (Rep, 3), (IhH, 4), (OxS, 4), (Vei, 5), (Pos, 6), (Ind, 6), (Rep, 6), (Vei, 7), (OxI, 7), (BT, 8), (Hep, 8), (Ele, 8), (Vei, 9), (Vei, 9), (Vei, 11), (Vei, 11), (KAR, 14), (Osm, 16), (OxI, 16), (Vei, 16), (Hep, 16), (Ele, 16), (Vei, 18), (Dis, 18)⟩
c2: ⟨(Adm, 1), (Con, 1), (Sto, 1), (Cat, 1), (Ele, 1), (LKF, 1), (Sil, 1), (EPT, 1), (EKS, 1), (EBT, 1), (Thy, 1), (Uri, 1), (Coa, 1), (Vei, 1), (Sex, 1), (BT, 1), (OxS, 1), (Iso, 1), (InH, 2), (OxI, 3), (Vei, 3), (Pos, 5), (Rep, 5), (Lum, 6), (aCB, 6), (rCF, 6), (Lum, 7), (CFA, 7), (rCF, 7), (Lum, 8), (BT, 8), (CFA, 8), (rCF, 8), (CSF, 8), (aCC, 8), (Hep, 8), (Ele, 8), (OxI, 11), (OxS, 11), (Ind, 11), (Rep, 11), (Dis, 11)⟩
c3: ⟨(Adm, 1), (Det, 1), (Tum, 1), (Oxs, 1), (Hem, 1), (Hem, 1), (Ane, 1), (Ane, 1), (Coa, 1), (Coa, 1), (Ana, 1), (Ana, 1), (Uri, 1), (Uri, 1), (Vei, 1), (Thy, 1), (Thy, 1), (EBT, 1), (EKS, 1), (EKS, 1), (EUS, 1), (EPT, 1), (EPT, 1), (LKG, 1), (LKG, 1), (Sto, 1), (Sto, 1), (Oxs, 2), (Vei, 2), (Con, 2), (Oxs, 3), (ERS, 5), (FAC, 7), (Oxs, 7), (Oxs, 8), (BTH, 10), (Ele, 10), (LKF, 10), (Hem, 10), (Det, 11), (CDR, 11), (Dis, 18)⟩
c4: ⟨(Adm, 1), (EBT, 1), (EPT, 1), (EE, 1), (Osm, 1), (Rep, 2), (Oxs, 2), (InH, 2), (Sto, 2), (Hig, 2), (Ele, 2), (Vei, 2), (Uri, 2), (Vei, 2), (Osm, 2), (Coa, 2), (BT, 2), (Cat, 2), (Hig, 2), (Con, 3), (Vei, 4), (EKS, 5), (Osm, 7), (Vei, 7), (Osm, 8), (aCB, 8), (rCSF, 8), (Bac, 8), (Hep, 8), (Ele, 8), (Vei, 8), (BT, 8), (OxI, 9), (Ind, 9), (Pos, 10), (Vei, 10), (Rep, 10), (Rep, 10), (Vei, 12), (Vei, 14), (Vei, 15), (Ele, 18), (Osm, 18), (Hep, 18), (BT, 18), (Vei, 18), (Dis, 21)⟩
c5: ⟨(Adm, 1), (EPT, 1), (Con, 1), (OxS, 1), (Sto, 2), (Hep, 2), (Ele, 2), (Vei, 2), (Uri, 2), (Coa, 2), (Myo, 2), (Sex, 2), (BT, 2), (Lip, 2), (Mul, 2), (Ind, 3), (Hep, 3), (Ele, 3), (Vei, 3), (Vei, 3), (BT, 3), (Sex, 3), (Vei, 4), (OxI, 4), (Rep, 6), (Ind, 6), (InH, 6), (Vei, 7), (Osm, 9), (Hep, 9), (BT, 9), (Vei, 9), (Con, 10), (Vei, 10), (Vei, 11), (CDR, 12), (Rep, 12), (Pos, 12), (Vei, 12), (Ele, 13), (Osm, 13), (OxS, 13), (BT, 13), (Vei, 13), (Vei, 13), (Hep, 13), (rCSF, 14), (Lum, 14), (aCB, 14), (Bac, 14), (Det, 16), (Vei, 16), (Vei, 18), (Vei, 18), (Vei, 19), (Dis, 21)⟩
c6: ⟨(Adm, 1), (Tum, 1), (ESR, 1), (Coa, 1), (Uri, 1), (Thy, 1), (EBT, 1), (EKS, 1), (EPT, 1), (LKG, 1), (Sto, 1), (EUS, 1), (Con, 2), (CDR, 2), (Hol, 6), (Inf, 7), (Hem, 7), (Ele, 7), (Hep, 7), (BTH, 7), (BTH, 13), (BTH, 13), (Hep, 13), (Ele, 13), (CA7, 13), (Dis, 14)⟩
c7: ⟨(Adm, 1), (Tum, 1), (Ser, 1), (Hem, 1), (Gly, 1), (Ane, 1), (Coa, 1), (Ana, 1), (Thy, 1), (Uri, 1), (BDH, 1), (EBT, 1), (EKS, 1), (EPT, 1), (LKG, 1), (Sto, 1), (EUS, 1), (Oxs, 2), (BDH, 2), (BDH, 2), (BDH, 2), (BDH, 2), (Con, 2), (BDH, 2), (Oxs, 3), (Spu, 3), (Spu, 3), (BDH, 3), (Oxs, 4), (Oxs, 5), (Vei, 7), (LFP, 7), (Hol, 7), (ESR, 8), (BTH, 8), (LKF, 8), (Ele, 8), (Vei, 14), (LPF, 15), (Dis, 15)⟩
c8: ⟨(Adm, 1), (Hem, 1), (Coa, 1), (Uri, 1), (Thy, 1), (EBT, 1), (EKS, 1), (Tum, 1), (LKG, 1), (Sto, 1), (EUS, 1), (Con, 4), (Ele, 5), (Ren, 5), (BTH, 5), (BTH, 12), (Hep, 12), (Ele, 12), (Hol, 13), (Dis, 14)⟩
c9: ⟨(Adm, 1), (Tum, 1), (Oxs, 1), (Hem, 1), (Coa, 1), (Uri, 1), (Vei, 1), (Thy, 1), (EBT, 1), (EKS, 1), (LKG, 1), (Sto, 1), (EUS, 1), (Oxs, 2), (Oxs, 3), (FAO, 4), (Oxs, 4), (Oxs, 5), (EKS, 5), (EBT, 5), (EUS, 5), (Con, 5), (Oxs, 6), (Hem, 6), (BTH, 6), (Glu, 6), (LKF, 6), (Ele, 6), (Oxs, 8), (BTH, 13), (LKG, 13), (Ele, 13), (Dis, 15)⟩
c10: ⟨(Adm, 1), (EBT, 1), (EBT, 1), (EE, 1), (EE, 1), (BT, 1), (EKS, 1), (EPT, 1), (Cra, 2), (OxI, 2), (Ful, 2), (BT, 2), (Sex, 2), (Myo, 2), (Gas, 2), (Gas, 2), (Mic, 2), (Osm, 2), (Cor, 2), (Uri, 2), (Thy, 2), (HA, 2), (Hep, 2), (Ele, 2), (Sto, 2), (BNP, 2), (Con, 3), (rCF, 5), (GPC, 5), (Lum, 5), (BT, 5), (aCB, 5), (Bac, 5), (Osm, 5), (Hep, 5), (Ele, 5), (rCF, 6), (BT, 6), (aCB, 6), (Bac, 6), (Osm, 6), (Lum, 6), (Hep, 6), (Ele, 6), (Bac, 7), (Bac, 7), (aCB, 7), (aCB, 7), (Lum, 7), (rCF, 7), (rCF, 7), (Dis, 8)⟩
TABLE II
THE MEANING OF THE EXAMPLE ALPHABETIC LABELS OF CLINICAL EVENTS SHOWN IN TABLE 1.
aCB: Acute Cerebrospinal fluid biochemical
Ana: Analysis of urine microalbumin
Bac: Bacteria and fungi were cultured and identified
BTH: Blood test+Hypersensitive CRP
Cat: Catheterization
Coa: Coagulation + D-dimer
Cra: Craniotomy for intracranial decompression
Dis: Discharge
EBT: Emergency blood test
Ele: Electrolyte
ESR: ESR
Ful: Full set of Lipids (hospital)
Gly: Glycosylated hemoglobin
HA: Hepatitis A antibody
HLA: HLA-B27
Imu: Immune (5 items)
InT: Infrared treatment
KAR: Kidneys and renal vascular color Doppler ultrasound
LKG: Liver, kidney, the glycolipid heart enzyme (hospitalization)
Mic: Micro-jet atomization mask
OxS: Oxygen saturation monitoring
Pos: Postoperative drainage
Rep: Replacement of drainage
Sex: Sex hormones
Sto: Stool examination
Tum: Tumors (10 items)
aCC: Acute CSFRT cryptococcal
Ane: Anemia (3 items)
BFG: (BFGF) topical bovine basic fibroblast growth
BNP: B-type natriuretic peptide
CA7: CA72-4
Con: Conventional ECG Exam
CSF: CSF biochemical
DLV: Determination of left ventricular function
EE: Emergency Electrolyte
EPT: Emergency PT
EUS: Emergency ultra-sensitivity CRP
Gas: Gastrointestinal high nutrition therapy
GPC: General physical cooling
Hep: The hepatorenal sugar (hospitalization)
Hol: 24-hour Holter
In3: Inflammation (3 items)
InH: Intracranial hematoma(including simple epidural)
LFP: Low-frequency pulse power treatment
Lip: Lipids (7 items, hospitalization)
Mul: Multiple intracranial hematoma
OxI: Oxygen inhalation
rCF: CSF routine
Ser: Serum troponin T assay
Sil: Silicone suction drainage
Th7: Thyroid function (7 items)
Uri: Urine + sediment test
Adm: Admission
Ant: Anti-O rheumatoid
BT: Blood test
BDH: B-D Heparin cap
CDR: Color Doppler routine inspection
Cor: Cortisol
CFA: Cerebrospinal fluid biochemical+ADA
DT3: Determination of tumor (3 items)
EKS: Emergency kidney, sugar
ERS: Emergency renal, sugar
FAC: By the femoral artery catheter cerebral arteriography
Glu: Glucose
Hem: Hemorheology
Hig: High-frequency oxygen / hour
IDF: Intracranial Doppler flow imaging (TCD)
Ind: Indwelling catheter
Iso: Isoflurane (live Ning)/1ml/ml
LKF: Liver and kidney function (hospitalization)
Lum: Lumbar puncture
Myo: Myocardial enzymes
Osm: Osmotic pressure
Ren: Renal function (hospitalization)
SE+: Stool examination+OB
Spu: Sputum culture
Thy: Thyroid (five items)
Vei: Vein catheterization
LDA, each patient trace $c$ is a mixture of topics represented by
$\theta_c$ and each topic $t$ is a distribution over all events represented
by $\phi_{t,e} = \Pr(e \mid t)$.
Using this generative model, the treatment behavioral topic
assignments for clinical events can be calculated based on
the current topic assignment of all the other clinical event
positions. More specifically, the topic assignment is sampled
from:
$$\Pr(t_i = t \mid \mathbf{t}_{\neg i}, c) = \frac{n_{t,\neg i}^{\pi_a(e)} + \beta}{\sum_{b \in A} n_t^{b} + \beta |A|} \cdot \frac{n_{c,\neg i}^{t} + \alpha}{\sum_{j \in K} n_c^{t_j} + \alpha K} \qquad (1)$$
where $t_i = t$ represents the assignment of the $i$th occurrence
to topic $t$, $\mathbf{t}_{\neg i}$ represents all treatment behavioral topic assignments not including the $i$th occurrence, $K$ is the number
of topics, $|A|$ is the number of clinical event types, $n_{t,\neg i}^{\pi_a(e)}$ is
the number of times the event type $\pi_a(e)$ is assigned to topic $t$,
not including the current instance, and $n_{c,\neg i}^{t}$ is the number of
times topic $t$ is assigned to the patient trace $c$, not including the
current instance.
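For intuition, the generative process that this sampling step inverts can be simulated directly. The following is a minimal sketch using only the Python standard library; the function names, the six-label vocabulary, the trace length, and the random seed are illustrative assumptions, and the Dirichlet draw is implemented by normalizing Gamma variates.

```python
import random

def sample_dirichlet(concentration, dim, rng):
    """Draw one sample from a symmetric Dirichlet via normalized Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [d / total for d in draws]

def sample_topics(event_types, num_topics, beta, rng):
    """phi[t] ~ Dir(beta): one distribution over event types per topic."""
    return [sample_dirichlet(beta, len(event_types), rng) for _ in range(num_topics)]

def generate_trace(event_types, phi, length, alpha, rng):
    """Generate one synthetic trace as a list of event types (no time stamps)."""
    theta = sample_dirichlet(alpha, len(phi), rng)  # theta_c ~ Dir(alpha)
    trace = []
    for _ in range(length):
        t = rng.choices(range(len(phi)), weights=theta)[0]  # t ~ Mult(theta_c)
        e = rng.choices(event_types, weights=phi[t])[0]     # e ~ Mult(phi_t)
        trace.append(e)
    return theta, trace

rng = random.Random(42)
event_types = ["Adm", "EBT", "Vei", "Lum", "BTH", "Dis"]  # illustrative subset of labels
phi = sample_topics(event_types, num_topics=2, beta=0.1, rng=rng)
theta, trace = generate_trace(event_types, phi, length=12, alpha=0.2, rng=rng)
```

The topic-event distributions are drawn once and shared by all traces, whereas the topic proportions are drawn anew for every trace.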
Fig. 1. Graphical representation of LDA-based similarity measure between
patient traces [4].
From these count matrices, we can estimate the topic-event
distribution $\phi$ and the trace-topic distribution $\theta$ by
$$\phi_{t,e} = \frac{n_t^{\pi_a(e)} + \beta}{\sum_{b \in A} n_t^{b} + \beta |A|} \qquad (2)$$
and
$$\theta_{c,t} = \frac{n_c^{t} + \alpha}{\sum_{t' \in T} n_c^{t'} + \alpha K} \qquad (3)$$
Exact inference in LDA is generally intractable. Therefore, we use Gibbs sampling to estimate the counts $n_c^t$
and $n_t^e$, from which we can determine the model parameters
$\phi_{t,e}$ and $\theta_{c,t}$. The pseudo-code for Gibbs sampling is shown
in Algorithm 1. By inspection, the complexity of Algorithm 1
scales linearly with the number of latent treatment topics $K$,
the number of clinical events in the patient trace repository
$R$, $N = \sum_{c \in R} |c|$, and the number of Gibbs
samples $L$, giving an overall complexity of $O(L \cdot N \cdot K)$.
Algorithm 1 Gibbs sampling for LDA
Procedure: LDAGibbsSampling(R, α, β, L)
Input:
  R is a patient trace repository
  α, β are Dirichlet hyper-parameters
  L is the number of Gibbs samples
Output:
  T is the set of estimated treatment topics based on the probability Pr(e|t)
Steps:
  // Initialization
  Initialize the count parameters, n_c^t = 0, n_t^e = 0
  For each event e in R
    Sample a treatment topic t from t ∼ Mult(1/|T|)
    Let n_c^t = n_c^t + 1, n_t^e = n_t^e + 1
  End For
  // Run the chain
  For l = 1 to L do
    For each c ∈ R do
      For each e_i ∈ c do
        Let n_c^t = n_c^t − 1, n_t^e = n_t^e − 1
        Sample t_{c,i} according to Equation (1)
        Let n_c^t = n_c^t + 1, n_t^e = n_t^e + 1
      End For
    End For
    Store the l-th Gibbs sample
  End For
  Output: Estimated treatment topics based on Pr(e|t), computed by Equations (2) and (3)
End Procedure
Taking the traces shown in Table 1 as an example, clinical experts from the cooperating hospital have indicated that
the two derived topics have specific clinical intentions, i.e.,
cerebral hemorrhage treatment (ICD-10: I61), and subdural
hematoma treatment (ICD-10: I62.006), respectively. Thus, we
denote K = 2 for the example traces. Note that the derived
topics reflect a collaborative shared view of medical behaviors
contained in the traces, and the event types of the topics
reflect a common vocabulary to describe the patient trace.
Table 3 shows typical examples of event types (satisfying
p(e|t) ≥ 0.01) of the derived treatment behavioral topics. As
can be seen, the topics group typically co-occurring events. For
example, clinical event types “Intracranial hematoma surgery
(including simple epidural)” and “Postoperative drainage” are
correlated with each other, and they have the same value of
the event-topic distribution. The relationships between clinical
event types via treatment behavioral topics can be used to
provide good classification of patient traces. Note that the
derived latent topics are not necessarily disjoint. For example, “ECG”
occurs in the cerebral hemorrhage topic as well as in the
subdural hematoma topic.
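The collapsed Gibbs sampling procedure of Algorithm 1 can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' implementation: the function name `lda_gibbs`, the toy documents, and the seed are assumptions, and for brevity the sketch estimates the parameters from the final state of the chain only, instead of storing every Gibbs sample.

```python
import random

def lda_gibbs(docs, num_topics, alpha, beta, num_sweeps, seed=0):
    """Collapsed Gibbs sampler for LDA over documents of event types.

    docs is a list of lists of event-type strings.
    Returns (theta, phi, vocab): trace-topic and topic-event estimates.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics
    n_ct = [[0] * K for _ in docs]      # times topic t is assigned in trace c
    n_te = [[0] * V for _ in range(K)]  # times event type e is assigned to topic t
    n_t = [0] * K                       # total number of events assigned to topic t
    z = []                              # current topic of every event position
    for c, doc in enumerate(docs):      # initialization with uniformly random topics
        z.append([])
        for w in doc:
            t = rng.randrange(K)
            z[c].append(t)
            n_ct[c][t] += 1
            n_te[t][index[w]] += 1
            n_t[t] += 1
    for _ in range(num_sweeps):         # run the chain
        for c, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t, e = z[c][i], index[w]
                n_ct[c][t] -= 1         # remove the current assignment
                n_te[t][e] -= 1
                n_t[t] -= 1
                # Unnormalized Equation (1); the trace-side denominator is constant in t.
                weights = [(n_te[k][e] + beta) / (n_t[k] + beta * V) * (n_ct[c][k] + alpha)
                           for k in range(K)]
                t = rng.choices(range(K), weights=weights)[0]
                z[c][i] = t
                n_ct[c][t] += 1         # add the new assignment
                n_te[t][e] += 1
                n_t[t] += 1
    phi = [[(n_te[k][e] + beta) / (n_t[k] + beta * V) for e in range(V)]
           for k in range(K)]
    theta = [[(n_ct[c][k] + alpha) / (len(doc) + alpha * K) for k in range(K)]
             for c, doc in enumerate(docs)]
    return theta, phi, vocab

toy_docs = [["Adm", "Vei", "Lum", "Dis"],
            ["Adm", "BTH", "Tum", "Dis"],
            ["Adm", "Vei", "Lum", "Lum", "Dis"]]
theta, phi, vocab = lda_gibbs(toy_docs, num_topics=2, alpha=0.2, beta=0.1, num_sweeps=50)
```

Each sweep visits every event once and evaluates K weights, which is consistent with the O(L · N · K) complexity stated for Algorithm 1.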
Once we have learned the model parameters, we can measure the similarity between patient traces. In particular, for
a specific trace $c$ in the repository $R$, we obtain the topic
distribution $\vec{\theta}_c = \{\hat{\theta}_{c,t_1}, \hat{\theta}_{c,t_2}, \cdots, \hat{\theta}_{c,t_K}\}$, where each $\hat{\theta}_{c,t_i}$
is the posterior estimate of $\theta_{c,t_i}$ for the treatment behavioral
topic $t_i$ ($1 \le i \le K$). Upon this, we are able to calculate the
similarity between two traces $c$ and $c^*$ ($c, c^* \in R$) as follows:
$$\mathrm{sim}(c, c^*) = \frac{\sum_{t \in T} \hat{\theta}_{c,t} \times \hat{\theta}_{c^*,t}}{\sqrt{\sum_{t \in T} \hat{\theta}_{c,t}^2} \, \sqrt{\sum_{t \in T} \hat{\theta}_{c^*,t}^2}} \qquad (4)$$
Taking the traces shown in Table 1 as examples, for patient
trace c1 , the top 5 similar traces are c4 (sim(c1 , c4 ) = 0.9999),
c5 (sim(c1 , c5 ) = 0.9997), c2 (sim(c1 , c2 ) = 0.9834), c10
(sim(c1 , c10 ) = 0.9761) and c6 (sim(c1 , c6 ) = 0.9323).
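Equation (4) is the cosine similarity of two topic-proportion vectors. A minimal sketch is given below; the trace identifiers and the topic proportions in `thetas` are made-up illustrative values, not results from the paper, and `top_similar` is a hypothetical helper name.

```python
import math

def cosine_similarity(theta_a, theta_b):
    """Equation (4): cosine of the angle between two topic-proportion vectors."""
    dot = sum(a * b for a, b in zip(theta_a, theta_b))
    norm_a = math.sqrt(sum(a * a for a in theta_a))
    norm_b = math.sqrt(sum(b * b for b in theta_b))
    return dot / (norm_a * norm_b)

def top_similar(query_id, thetas, k=5):
    """Rank all other traces in the repository by similarity to the query trace."""
    scores = [(other, cosine_similarity(thetas[query_id], theta))
              for other, theta in thetas.items() if other != query_id]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical posterior topic proportions for three traces and K = 2 topics.
thetas = {"c_a": [0.9, 0.1], "c_b": [0.8, 0.2], "c_c": [0.1, 0.9]}
ranking = top_similar("c_a", thetas, k=2)
```

Because topic proportions are non-negative, the similarity always lies between 0 and 1.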
III. CASE STUDY
The presented similarity measure approach provides a basis
for further CPA tasks. In this section, three possible CPA applications, i.e., patient trace retrieval, clustering, and anomaly
detection, are presented as follows. To test the feasibility of
the proposed method, experiments on data-sets collected from
Zhejiang Huzhou Central Hospital of China were performed.
The explanation of the experimental setups and obtained
results are presented in the following.
A. Data set description
The experimental data-set was extracted from Zhejiang
Huzhou Central Hospital of China. The application of information technology in this hospital is at a relatively high
level, and the electronic medical records system has been
gradually adopted since 2004. The system records many kinds
of information about clinical pathways, e.g., examinations, lab
tests, surgeries, etc. In the experiments, we built a specific
patient trace repository of clinical pathways of several specific
types of cancer, i.e., bronchial lung cancer, colon cancer, rectal
cancer, breast cancer, and gastric cancer, from the system. The
collected data is from 2007/08 to 2009/09. In addition, we
preprocessed the traces by removing incomplete traces
TABLE III
THE TYPICAL EVENT TYPES FOR THE DERIVED TREATMENT BEHAVIORAL TOPICS FROM EXAMPLE TRACES SHOWN IN TABLE 1.
Topic 1: Admission; Oxygen saturation monitoring; Vein catheterization; Conventional ECG Exam; Blood test; Coagulation + D-dimer; Electrolyte; Emergency blood test; Emergency kidney and sugar; Emergency PT; Liver, kidney and the glycolipid heart enzyme (hospitalization); Urine + sediment test; The hepatorenal sugar (hospitalization); Thyroid function (7 items); Emergency ultra-sensitivity CRP; Stool examination + OB; Tumors (10 items); Blood test + Hypersensitive CRP; Hemorheology; Discharge
Topic 2: Oxygen saturation monitoring; High-frequency oxygen/hour; Vein catheterization; Intracranial hematoma surgery (including simple epidural); Postoperative drainage; Indwelling catheter; Replacement of drainage bag; Lumbar puncture; Blood test; Electrolyte; Stool examination; Oxygen inhalation; Cerebrospinal fluid biochemical; CSF routine; The hepatorenal sugar (hospitalization); Osmotic pressure; Bacteria and fungi were cultured and identified; Blood test + Hypersensitive CRP; Hemorheology; B-D Heparin cap
(e.g., traces in which the patient died or was transferred
during his or her length of stay (LOS)) from the repository. In detail, there
are 258 traces and 11028 clinical events with 266 event types.
The average LOS of these traces is 25.39 days, while some
traces take a very short time, e.g., only 4 days in hospital, and
other traces take much longer, e.g., 66 days in the hospital,
which implicitly indicates the diversity of treatment behaviors
in the intracranial hemorrhage clinical pathway.
B. Similarity measure methods considered
In order to evaluate the performance of the presented
similarity measure method, we compare the presented LDA-based similarity measure with the traditional edit-distance-based similarity measure, and a classical simple term vector
based method.
• “Edit distance” measures the temporal similarity
between pairwise patient traces c and c∗ in terms of
the penalties of transforming the
trace c into c∗, or vice versa, through a set of editing
operations, i.e., “no change”, “substitution”, “deletion”
and “insertion”, applied to one of the traces iteratively.
For more details about the “edit distance” approach, please
refer to [19].
• Term vector has been widely used for representing text
documents. Adapted to our setting, the term vector of a
particular patient trace $c$ has the following form: $\vec{w}(c) = \{w_1, \cdots, w_{|V|}\}$, where $V$ is the event type vocabulary
of the patient trace repository. Note that the element $w_i$
in the vector, which corresponds to the $i$th term in $V$,
is weighted by using some scheme such as TF×IDF. In
this study, we use Equation (5) to calculate the similarity
between two patient traces $c_i$ and $c_j$ based on their term
vectors:
$$\mathrm{sim}(c_i, c_j) = \frac{\sum_{v \in V} w_{i,v} \times w_{j,v}}{\sqrt{\sum_{v \in V} w_{i,v}^2} \, \sqrt{\sum_{v \in V} w_{j,v}^2}} \qquad (5)$$
In the following experiments, we refer to LDA-based similarity
measure with K-topic model (K = 1, 2, 3, · · · , 20) as LDA-K, edit-distance-based similarity measure as ED, and term
vector based similarity measure as TV.
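The two baselines can be sketched as follows. Both parts rest on assumptions: the ED sketch is the classical Levenshtein distance with unit costs over event-type sequences, whereas the exact penalty scheme of [19] is not given here, and the TV sketch uses raw term frequency times log(N/df) as one common TF×IDF variant. All function names and toy traces are illustrative.

```python
import math
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance between two event-type sequences (unit costs assumed)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1          # "no change" versus "substitution"
            curr.append(min(prev[j] + 1,       # deletion
                            curr[j - 1] + 1,   # insertion
                            prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def tfidf_vectors(docs):
    """Term vectors weighted by tf * log(N / df), one assumed TFxIDF variant."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    vocab = sorted(df)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[w] * math.log(n / df[w]) for w in vocab])
    return vocab, vectors

def cosine(u, v):
    """Equation (5); returns 0.0 when one of the vectors is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

toy_docs = [["Adm", "Vei", "Dis"], ["Adm", "Vei", "Vei", "Dis"], ["Adm", "Lum", "Dis"]]
vocab, vectors = tfidf_vectors(toy_docs)
```

Event types that occur in every trace, such as admission and discharge in the toy data, receive zero weight under this scheme and therefore do not contribute to the TV similarity.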
C. Experimental settings
Constructing an LDA model amounts to fitting latent treatment behavioral
topics to the patient trace repositories. In the experiments, we
conducted topic analysis for the experimental repository using
LDA with different numbers of treatment behavioral topics
(K = 1, 2, 3, · · · , 20). The Dirichlet priors α and β of LDA
are set to 0.2 and 0.1, respectively, which are common settings in the literature.
The number of iterations of Gibbs sampling is set to 10000.
Note that Gibbs sampling converges before 10000 iterations
in the experiments. In addition, to expand the number of trials
when we construct the LDA model, we adopt a fivefold cross-validation strategy. For each repository, we split it randomly
into five mutually exclusive subsets of equal size. We then
designate each subset in turn as the testing data set, which is used to
compute the perplexity score, while the others serve as the
training data set. To minimize potential biases that may result
from the randomized folding process, we perform this fivefold
cross-validation process five times and estimate the overall
performance by averaging the performance estimates obtained
from the 250 individual trials. The topic models are exploited
for experiments hereafter.
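The repeated fivefold splitting can be sketched as below. The helper names and the seed are assumptions, and `perplexity` uses the standard definition, the exponential of the negative average per-event log-likelihood on the held-out traces, which is assumed here because the text does not spell it out.

```python
import math
import random

def repeated_kfold(trace_ids, k=5, repeats=5, seed=0):
    """Yield (repeat, fold, train_ids, test_ids) for a repeated k-fold split."""
    rng = random.Random(seed)
    for r in range(repeats):
        ids = list(trace_ids)
        rng.shuffle(ids)                        # new random folding per repetition
        folds = [ids[i::k] for i in range(k)]   # k mutually exclusive, near-equal subsets
        for f in range(k):
            test = folds[f]
            train = [t for g in range(k) if g != f for t in folds[g]]
            yield r, f, train, test

def perplexity(total_log_likelihood, num_events):
    """exp of the negative average log-likelihood per held-out event."""
    return math.exp(-total_log_likelihood / num_events)

splits = list(repeated_kfold(range(258)))  # 258 traces, as in the data set description
```

For a single repository and a single value of K, this yields k × repeats train/test splits; lower perplexity on the test folds indicates a better fit.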
Now that we have built the LDA model from patient trace
repository, several interesting applications could be performed
based on the learned LDA model. As shown below, we
evaluated the presented approach based on three specific applications, i.e., patient trace retrieval, clustering, and anomaly
detection.
D. Patient trace retrieval
The first application based on similarity measure is patient
trace retrieval. Patient traces describe the knowledge acquired
after solving specific problems [20]. When a clinician encounters problems in executing a patient trace, he/she may
retrieve suggestions from past traces. Given a query, those
patient traces with high similarities are good candidates for
recommendation, i.e., closer to the query in terms of their
behavioral similarities.
There is an assumption in using Equation (4) to measure
the treatment behavioral similarity between pairwise traces c
and c∗ : both traces should be placed into the patient trace
repository R such that topic analysis can be performed, and an
LDA model can be learned from the traces in the repository.
However, in most cases of retrieval, the queried trace is a
new one outside the repository, and thus Equation (4) is not
appropriate for measuring similarities between a trace in the
repository and an external query. To this end, we employed
an LDA-based retrieval model [21] to measure similarities
between patient traces. The basic idea of the LDA-based
retrieval model is to use the query likelihood,
where each trace is scored by its likelihood of generating a query
trace c∗ , Pr(c|c∗ ). Thus, similarity can be measured as
$\mathrm{sim}_b(c, c^*) \propto \Pr(c \mid c^*)$.
To calculate the query likelihood, we need to sum over the
treatment topical variable for each clinical event type of the
query trace c∗ . Given the posterior estimates θ̂ and φ̂, the query
likelihood of a particular trace c (c ∈ R) given c∗ , Pr(c|c∗ ),
can be calculated as
$$\Pr(c \mid c^*) = \prod_{e \in c^*} \sum_{t \in T} \Pr(e \mid t, \hat{\phi}) \, \Pr(t \mid \hat{\theta}, c) \tag{6}$$
Taking Equation (6) to replace Equation (4) for similarity
measure, similar traces with a query can be retrieved from
the repository R.
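As a rough illustration of Equation (6), the sketch below scores repository traces against a query in log space (to avoid underflow on long queries). The data layout is an assumption made here for illustration: `phi[t]` is a dictionary mapping event types to Pr(e|t), `thetas[cid]` is the list of topic proportions Pr(t|c) of a repository trace, and the function names are hypothetical.

```python
import math

def query_log_likelihood(query_events, theta_c, phi):
    """Log of Equation (6): sum over query events of log(sum_t Pr(e|t) * Pr(t|c))."""
    total = 0.0
    for event in query_events:
        p = sum(theta_c[t] * phi[t].get(event, 0.0) for t in range(len(theta_c)))
        if p == 0.0:
            # The trace cannot generate this event under the estimated model.
            return float("-inf")
        total += math.log(p)
    return total

def retrieve(query_events, thetas, phi, top_n=5):
    """Return the ids of the top_n repository traces ranked by query likelihood."""
    ranked = sorted(
        thetas,
        key=lambda cid: query_log_likelihood(query_events, thetas[cid], phi),
        reverse=True,
    )
    return ranked[:top_n]
```

Because the logarithm is monotonic, ranking by log-likelihood gives the same order as ranking by Pr(c|c∗) itself.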
E. Evaluation metrics on patient trace retrieval
For evaluation on patient trace retrieval, the metric “Precision” is calculated:
$$\text{Precision} = \frac{1}{|Q|} \times \sum_{i=1}^{5} \frac{rel(c_i)}{5} \tag{7}$$
where Q is the set of query traces from the repository.
Specifically, 25 traces are randomly selected as query traces
from the repository. i denotes the ith retrieved trace, with i ranging
from 1 to 5, i.e., given a query, we retrieved the top 5 similar
traces. rel(ci) denotes the relevance of the trace ci to the
query. If ci is relevant to the query, rel(ci) = 1; otherwise
rel(ci) = 0. Evidently, we need to determine whether a retrieved trace
is relevant to a particular query. For this purpose, a manual
evaluation was conducted independently by three managers
of medical services at the Zhejiang Huzhou Central hospital,
and their judgments were combined by majority voting.
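The evaluation above can be sketched in a few lines. This is only an illustration: it assumes that Equation (7) is meant to be averaged over all queries in Q, and the function names and input layout (one list of three 0/1 votes per retrieved trace) are hypothetical.

```python
def majority_vote(votes):
    """Relevance label from independent 0/1 judgments: 1 if more than half say relevant."""
    return 1 if sum(votes) * 2 > len(votes) else 0

def precision_at_k(votes_by_query, k=5):
    """Mean fraction of relevant traces among the top-k results of each query."""
    per_query = []
    for votes_per_result in votes_by_query:
        rel = [majority_vote(v) for v in votes_per_result[:k]]
        per_query.append(sum(rel) / k)
    return sum(per_query) / len(per_query)
```

With three evaluators, a retrieved trace counts as relevant when at least two of them judge it relevant.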
F. Evaluation results on patient trace retrieval
Figure 2 shows detailed experimental results comparing
LDA-K (K = 1, 2, 3, · · · , 20), ED and TV on
retrieval performance. We observe that the number of
treatment topics K has only a weak impact on the retrieval performance
of the proposed LDA method. As depicted in Figure 2, as
K increases, the precision increases slowly at first, and then
remains stable with further increases of K. However, when
K surpasses a certain threshold, the precision decreases slowly
with further increases of K. The precision is best when K
is around 11, while smaller values like K = 1 or larger
values like K = 20 can degrade the performance.
This phenomenon indicates that the number of latent treatment
topics used for analysis should match the topics actually present in
the repository. In comparison with ED and TV, the precision
achieved by LDA-11 is 0.792, while the precisions achieved
by ED and TV are 0.632 and 0.664, respectively, i.e., relative
improvements of roughly 25% and 19%, which is quite
remarkable. In fact, as shown in Figure 2, the presented LDA-K outperforms ED and TV regardless of the value of K. This
indicates that LDA is more appropriate for patient trace
retrieval than ED and TV.
G. Patient trace clustering
In clinical pathways, patients who have similar symptoms, chief complaints, pathology examination results, and
other clinical features, may have similar traces, and can be
grouped into the same cluster. Patient trace clustering helps
reveal the underlying characteristics and commonalities among
a large collection of traces. The information extracted by
clustering can also facilitate subsequent analysis, for instance,
to extract common treatment patterns of execution in the
traces, or speed up trace indexing and anomaly detection.
A reasonable similarity measure sim(c, c′) is critical for
patient trace clustering. The objective of clustering methods that work on a similarity measure function is
to maximize the intra-cluster similarity and minimize the
inter-cluster similarity [22, 23]. In this study, we adopted
a hierarchical micro-clustering algorithm [24] to generate
partitions of patient traces in the repository. In Algorithm 2, we
iteratively merge the two trace clusters with the largest similarity,
where the similarity between two clusters is defined as the
similarity between the farthest (i.e., least similar) traces in the two clusters. The
algorithm terminates when the maximum similarity between
clusters becomes smaller than a user-specified threshold ε.
The algorithm outputs a set of clusters of patient traces. It
guarantees that the similarity between any pair of traces in
the same cluster is at least ε.
Algorithm 2 Density-based k nearest neighbor clustering.
Procedure::DensityBasedKNNClustering(R, ε)
  Input:
    R is a patient trace repository
    ε is the threshold of similarity
  Output:
    Φ is the set of patient trace clusters
  Steps:
    Let Φ ⇐ ∅
    For each trace c in R
      Let φc = {c}, Φ ⇐ Φ ∪ {φc}
    End For
    For each pair of clusters φi and φj in Φ
      Let sim_ij = sim(ci, cj) be the similarity between φi and φj, where
      φi = {ci}, φj = {cj}
    End For
    Set the current maximum similarity sim = max(sim_ij)
    While (sim ≥ ε and Φ contains more than one cluster)
      Select sim_xy where (x, y) = argmax_{i,j} sim_ij
      Let φz = φx ∪ φy
      Let Φ ⇐ Φ ∪ {φz} − {φx} − {φy}
      For each φv ≠ φz in Φ
        Let sim_vz = min(sim(c1, c2)) where c1 ∈ φv and c2 ∈ φz
      End For
      Set sim = max(sim_ij) over the remaining pairs of clusters in Φ
    End While
    Output Φ
End Procedure
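The merging procedure can be sketched in Python as follows. This is a simplified illustration rather than the authors' C# implementation: it follows the definition given in the text (cluster similarity is that of the least similar pair, i.e., complete linkage), recomputes the linkage at every iteration instead of caching it, and takes the pairwise similarity `sim` as a caller-supplied function. The name `cluster_traces` is hypothetical.

```python
from itertools import combinations

def cluster_traces(traces, sim, eps):
    """Merge the two most similar clusters until the best similarity falls below eps."""
    clusters = [[c] for c in traces]

    def linkage(a, b):
        # Complete linkage: similarity of the least similar ("farthest") pair.
        return min(sim(x, y) for x in a for y in b)

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < eps:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters
```

Since a merge only happens when the least similar pair across the two clusters reaches ε, every pair of traces inside an output cluster has similarity at least ε. Recomputing all linkages keeps the sketch short but is slow for large repositories; a practical version would cache the pairwise cluster similarities.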
H. Evaluation metrics on patient trace clustering
In the experiments, we compare the generated clusters with
the benchmark clusters. The benchmark clusters are identified
from the experimental repository. In particular, we use the first
diagnosis code to categorize patient traces. As mentioned above,
5 categories, i.e., bronchial lung cancer, colon cancer, rectal
cancer, breast cancer, and gastric cancer, are extracted from
the repository. Since the experimental repository contains these
general categories, they can be used as benchmark clusters for
evaluating the overall performance of clustering.
To evaluate the patient trace clustering, we first calculate
the accuracy of the system on a per-trace basis and then build
a global score for all patient traces in the repository, i.e., for
a patient trace c, the precision and recall with respect to that
trace are calculated as follows:
$$\text{Precision}_c = \frac{|\phi_c \cap \varphi_c|}{|\phi_c|} \tag{8}$$
$$\text{Recall}_c = \frac{|\phi_c \cap \varphi_c|}{|\varphi_c|} \tag{9}$$
where $\phi_c$ is the generated cluster containing c, $\varphi_c$ is the
benchmark cluster containing c, and $|\phi_c \cap \varphi_c|$ is the number of
patient traces that appear in both $\phi_c$ and $\varphi_c$.
Fig. 2. The results of patient trace retrieval.
The final precision and recall values are calculated as
follows:
$$\text{Precision} = \frac{1}{|R|} \sum_{c \in R} \text{Precision}_c \tag{10}$$
$$\text{Recall} = \frac{1}{|R|} \sum_{c \in R} \text{Recall}_c \tag{11}$$
Usually, precision and recall are not used separately, but
combined into the Fβ measure as follows:
$$F_\beta = (1 + \beta^2) \times \frac{\text{Precision} \times \text{Recall}}{\beta^2 \times \text{Precision} + \text{Recall}} \tag{12}$$
In the experiments, we set β = 0.5 to weight precision twice as
much as recall. This is because we prefer average-size clusters with high precision to merging them into a large
cluster with higher recall but low precision.
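Equations (8) to (12) can be computed directly from two cluster assignments. The sketch below assumes, for illustration only, that both the generated and the benchmark clusterings are given as dictionaries mapping a trace id to a cluster label; the function name `clustering_scores` is hypothetical.

```python
def clustering_scores(generated, benchmark, beta=0.5):
    """Per-trace precision/recall averaged over all traces, combined into F_beta."""
    def members(assignment):
        groups = {}
        for trace, label in assignment.items():
            groups.setdefault(label, set()).add(trace)
        return groups

    gen, ben = members(generated), members(benchmark)
    precision = recall = 0.0
    for trace in generated:
        g = gen[generated[trace]]   # generated cluster containing the trace
        b = ben[benchmark[trace]]   # benchmark cluster containing the trace
        overlap = len(g & b)        # at least 1, since the trace is in both
        precision += overlap / len(g)
        recall += overlap / len(b)
    n = len(generated)
    precision, recall = precision / n, recall / n
    f_beta = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
    return precision, recall, f_beta
```

For example, with four traces where the generated clusters are {a, b} and {c, d} and the benchmark clusters are {a, b, c} and {d}, the averaged precision is 0.75, the averaged recall is 2/3, and F0.5 is 30/41 ≈ 0.732.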
I. Evaluation results on patient trace clustering
Using the benchmark clusters, we can evaluate clustering
performance on F0.5. In particular, by taking the maximum
value of F0.5 (among different merging thresholds ε from 0.0
to 0.4), we compare the performance of ED, TV and LDA-K
(K = 1, 2, 3, · · · , 20). As shown in Figure 3, when the number
of topics exceeds a particular value (K ≥ 8), the F0.5 is
quite stable. Thus, K ≈ 8 is probably a suitable number
of topics for the experimental patient trace repository.
Now we study the impact of the parameter ε on the
experimental results, where ε is the merging threshold in the
clustering step. We vary the value of ε from 0.0 to 0.4. Figure
4 shows the results of ED, TV and LDA-8 (using the 8-topic
model). From the figure, we can see that LDA-8 provides a
significant improvement over ED and TV. The maximum value
of F0.5 of LDA-8 is 0.6622, compared with 0.1044 for ED and
0.3565 for TV, i.e., the ED and TV scores are roughly 84% and
46% lower than that of LDA-8. Note that when the
merging threshold is zero, each patient trace is classified
into a specific cluster. That explains why the three curves have
the same starting value of F0.5 in Figure 4. In addition,
the inclusion of latent topics increases similarity among patient
traces. As a result, when the merging threshold is small, LDA-8
does not show an advantage over ED and TV. When the merging
threshold increases, LDA-8 obtains better results on F0.5 than
ED and TV, while TV increases slowly with increasing
ε, and ED remains stable regardless of the
value of ε. In particular, LDA-8 provides the most significant
improvement when ε is 0.15, after which F0.5 decreases slowly
with further increases of ε. This means that a suitable value of
ε is around 0.15 for the experimental repository. Note that,
except at ε = 0, LDA-8 always obtains better results than
ED and TV. This indicates that the treatment
behavioral features have a much more significant influence on
the similarity measure and subsequent analysis (e.g., patient
trace clustering) than the sequential order of clinical events
in the traces. It supports our assumption that
clinical pathways take place in an unstructured fashion, such
that traditional temporal similarity measures between patient
traces would not achieve accurate results and may distort
the subsequent tasks of CPA.
J. Anomaly detection
With regard to the set of trace clusters discovered by
the method presented in the previous section, it is possible to
determine whether a particular patient trace c is normal or anomalous,
since patient traces within a specific cluster have similar
care journeys. We argue that when facing a
new piece of information, humans first classify it into an
existing information category [25], and then compare it to the
previous members of the category to understand how it varies
in relation to the general characteristics of that
category. Once the “normality” has been roughly captured
by the clusters discovered from a particular patient trace
repository, one can look for those individual patient traces
whose patient-care journey deviates from the normal one. To
this end, we assume that each discovered patient trace cluster
φ represents a particular clinical pathway category, which is
supported by a subset of patient traces in the trace repository
R (φ ⊆ R). Traces of φ share a set of common properties
Fig. 3. Performance of clustering using ED, TV and LDA with different latent treatment behavioral topic models on the experiment repository. For each
clustering setting (ED, TV or LDA with different topic models), we changed the merging threshold and obtained the maximum F0.5 for comparison.
Fig. 4. The comparison between ED, TV and LDA-8 on patient trace clustering.
that make them perceptually similar to each other, while also
making them different from the traces of other clusters. If a
particular patient trace c has similar features with the traces
in φ, we can say that c is regular with regard to φ, otherwise,
c is an anomaly.
To this end, similarities between c and the traces of φ are
combined to generate a conclusion about c. Based on the
presented similarity measure between pairwise traces, we
compute the similarity between a particular patient trace c
and the previous members of each trace cluster by defining a
function ∆φ (c) as:
$$\Delta_\phi(c) = \sum_{c^* \in \phi} \omega_\phi(c^*) \cdot sim(c, c^*) \tag{13}$$
where ωφ(c∗) is the weight of each member c∗ in the cluster
φ, which indicates the participation of c∗ in φ:
$$\omega_\phi(c^*) = \frac{1}{|\phi|} \sum_{c^{**} \in \phi} sim(c^*, c^{**}) \tag{14}$$
∆φ(c) represents the average weighted similarity between a
particular patient trace c and the members of a cluster
φ. The selected membership cluster φ∗ is found as:
$$\phi^* = \arg\max_{\forall \phi} \Delta_\phi(c) \tag{15}$$
Once the membership decision of a new particular trace has
been made, we can focus our attention on deciding whether
the new particular trace is normal or not. Intuitively speaking,
we want to decide the normality of a new trace based on its
closeness to the previous members of its membership cluster.
This is done with respect to the average closeness between
the previous members of its membership cluster. In particular,
we define a particular trace c as normal with respect to its
membership cluster φ∗ if ∆φ∗(c) is no smaller than a particular
normality threshold µ, i.e., if ∆φ∗(c) ≥ µ, c is normal w.r.t.
φ∗. Otherwise, it is an anomaly.
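The membership and normality decisions of Equations (13) to (15) can be sketched as follows. This is a minimal illustration, assuming clusters are lists of traces and `sim` is any pairwise similarity function supplied by the caller; the function names are hypothetical. Equation (13) is implemented as written, i.e., as a sum over cluster members, so no additional normalization is applied because the text does not specify one.

```python
def weight(member, cluster, sim):
    """Equation (14): mean similarity of a member to the traces of its cluster."""
    return sum(sim(member, other) for other in cluster) / len(cluster)

def delta(trace, cluster, sim):
    """Equation (13): similarity of a trace to a cluster, weighted by member participation."""
    return sum(weight(m, cluster, sim) * sim(trace, m) for m in cluster)

def classify(trace, clusters, sim, mu):
    """Equation (15) plus the threshold test: returns (membership cluster, score, is_normal)."""
    best = max(clusters, key=lambda cl: delta(trace, cl, sim))
    score = delta(trace, best, sim)
    return best, score, score >= mu
```

A trace is reported as an anomaly when the third returned value is False, i.e., when its score with respect to its membership cluster falls below µ.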
K. Evaluation metrics on anomaly detection
In this subsection, we evaluate the proposed anomaly detection method. The experimental procedure
involves three steps:
1) By applying the proposed approach, we evaluate the
normality of each patient trace in the repository. In particular, we set up 10-fold cross-validation experiments,
which means that the traces in the repository are split
into ten partitions. Nine partitions are training data, and
one partition is testing data. Based on the training data, the
proposed anomaly detection model is built. Then, for
the partition of testing data, the normality of each trace
is calculated based on the learned model. In all, the
set of anomalies is extracted from the repository R,
named Anomalies = {c|c ∈ R ∧ ∆φ∗(c) < µ}, where
µ is a particular normality threshold value. The calculation
process and methods have been introduced in Section
4.3 of this paper in detail.
2) As to the benchmark (or ground truth) evaluation
data, we asked three experienced physicians of Zhejiang Huzhou Central hospital to evaluate the discovered
anomalies by majority voting. Formally, we let
bc be the clinical experts’ evaluation result of an anomaly
trace c discovered by our method. If the clinical experts also
take c as an anomaly, bc = 1; otherwise, bc = 0.
3) The last step is the comparison between the calculation
results and the benchmark. In particular, the metric “Precision” is obtained as follows:
$$\text{Precision} = \frac{\sum_{c \in \text{Anomalies}} b_c}{|\text{Anomalies}|} \times 100\% \tag{16}$$
L. Evaluation results on anomaly detection
As mentioned above, LDA-K achieves the best clustering
performance with K = 8. Thus, we investigated the performance of LDA-8 on anomaly detection. Table 4 shows the
number of detected anomalies and the corresponding precision
value of LDA-8 with different µ, where µ is the threshold
value of normality of patient traces. In the experiments, we
vary the value of µ from 0.5 to 0.9 in steps of 0.05. The general trend of
precision is shown in Table 4. For example, when µ = 0.9,
there are 29 anomalies detected by LDA-8, and the precision is
about 58.6% (i.e., 17 detected anomalies are evaluated as true,
and 12 detected anomalies are evaluated as false by clinical
experts). When µ is reduced to a certain value, i.e., µ ≤ 0.8, the
number of detected anomalies and the corresponding precision
of LDA-8 remain stable with further decreases of µ, i.e.,
at most 3 detected anomalies and 100% precision. Although
the precision achieved is quite remarkable when µ ≤ 0.8,
there are at least 14 (17-3) anomalies recognized by clinical
experts but not detected by LDA-8. When µ = 0.85,
LDA-8 is able to detect most of the anomalies in the
repository with a precision of 92.3%. Thus, as a conservative estimate, the default value
for the normality threshold µ is set at 0.85.
It should be mentioned that, when ED is applied in anomaly
detection, all 258 traces in the repository are judged as
anomalies even when µ = 0.5. For TV, 133 out of 258 patient
traces are recognized as anomalies when µ = 0.5, which is still
quite a large number of anomalies. This is because the measured
similarities between patient traces are quite small using ED or
TV. For example, using ED for any trace c in the repository,
the maximum ∆φ∗(c) (over varying ε from 0.0 to 0.4 in the
clustering step) is 0.382, which is far less than µ. Evidently,
both ED and TV are unsuitable for anomaly
detection on the experimental repository. This also supports our
assumption that clinical pathways are typically unstructured,
such that strategies other than traditional
methods are required to measure similarities between patient traces.
IV. PROOF-OF-CONCEPT PROTOTYPE
We have implemented and tested the proposed approach
using Microsoft C#. Figure 5 depicts a screenshot of our
prototype. Based on the input trace repository extracted from
Zhejiang Huzhou Central hospital of China, we can describe
the details of each trace. For example, Figure 5 lists a set of
patient traces of the intracranial hemorrhage pathway. The
left part of Figure 5 presents the basic information from the
repository, e.g., number of traces, number of events, number
of event types, minimum LOS, maximum LOS, average LOS,
etc. In addition, all traces with their IDs and all event types
existing in the trace repository are listed. Each event type is
represented as a colored dot so that it can be clearly distinguished. Users can
select the traces and the event types of interest to display.
For each patient trace, the right part of Figure 5 shows the time-line display, categories,
and similar traces. The time-line
display distributes all the events over the corresponding in-patient days, which means that for each in-patient day, the
events are sorted by time from the earliest event to the latest
one in that day. Categories show the treatment behavioral
topics the trace belongs to. Sometimes the trace is a mixture
of two or more categories, in which case we assign it a probability
for each category. Similar traces present the typical as well as
similar traces we have found in the patient trace repository
by using the methods presented in this work. They are also
shown with the time-line display.
V. CONCLUSION
In this paper, we present a probabilistic approach to measuring the similarities between patient traces for CPA. The
proposed approach can provide a basis for subsequent CPA
tasks (e.g., patient trace retrieval, clustering, and anomaly detection)
and assist in getting better insights into clinical pathways.
The advantages of the proposed approach have been pointed
out in our proposal. Note that all we need is to gather a
patient trace repository and use it for analyzing and improving
clinical pathways. Analysis on the patient trace repository is
totally unsupervised. It requires little human effort for
preprocessing the traces in the repository. This is particularly
useful when dealing with clinical pathways lacking formal
consensus models, where patient traces can still be measured
based on their treatment behavioral similarities. As a result,
the solution works well for CPA.
We believe that our approach is highly appealing in the
field of CPA. Measuring similarities between patient traces
can profitably be exploited as a basis for further tasks of
CPA, not limited to the applications listed in this article.
For example, critical/essential treatment behaviors can be detected,
analyzed, and optimized based on the topic analysis presented
in this study, and association rules between recognized anomalies
and patient states can be derived. We will address these
tasks by exploiting the potential of the proposed method and its
applications, as a crucial advantage over traditional techniques
for clinical pathway analysis and optimization.
ACKNOWLEDGMENT
This work was supported by the National Natural Science
Foundation of China under Grant No 81101126, and the
National Hi-Tech R&D Plan of China under Grant No
2012AA02A601. The authors would like to give special
thanks to all experts who cooperated in the evaluation of the
proposed method.
TABLE IV
THE RESULTS OF ANOMALY DETECTION USING LDA-8 ON THE EXPERIMENTAL REPOSITORY.
µ       # of detected anomalies   # of benchmark anomalies   Precision
0.9     29                        17                         58.6%
0.85    13                        12                         92.3%
0.8     3                         3                          100%
0.75    2                         2                          100%
0.7     2                         2                          100%
0.65    2                         2                          100%
0.6     2                         2                          100%
0.55    1                         1                          100%
0.5     1                         1                          100%
Fig. 5. A screen-shot of the system prototype.
REFERENCES
[1] D.A. Alexandrou, I.E. Skitsas, and G.N. Mentzas. A
holistic environment for the design and execution of self-adaptive clinical pathways. Information Technology in
Biomedicine, IEEE Transactions on, 15(1):108–118, jan.
2011.
[2] L. Maruster and R.J. Jorna. From data to knowledge: a
method for modeling hospital logistic processes. Information Technology in Biomedicine, IEEE Transactions
on, 9(2):248–255, june 2005.
[3] Z. Huang, X. Lu, and H. Duan. On mining clinical
pathway patterns from medical behaviors. Artificial
Intelligence in Medicine, 56(1):35–50, 2012.
[4] Z. Huang, X. Lu, and H. Duan. Latent treatment topic
discovery for clinical pathways. Journal of Medical
Systems, 37(2):1–10, 2013.
[5] Z. Huang, X. Lu, H. Duan, and W. Fan. Summarizing
clinical pathways from event logs. Journal of Biomedical
Informatics, 46(1):111–127, 2013.
[6] Z. Huang, X. Lu, and H. Duan. Similarity measuring
between patient traces for clinical pathway analysis.
In Niels Peek, Roque Marín Morales, and Mor Peleg,
editors, Artificial Intelligence in Medicine, volume 7885
of Lecture Notes in Computer Science, pages 268–272.
Springer Berlin Heidelberg, 2013.
[7] H. Campbell, R. Hotchkiss, N. Bradshaw, and M. Porteous. Integrated care pathways. British Medical Journal,
316(7125):133–137, 1998.
[8] A. Dogac, Y. Kabak, T. Namli, and A. Okcan. Collaborative business process support in ehealth: Integrating
ihe profiles through ebxml business process specification
language. Information Technology in Biomedicine, IEEE
Transactions on, 12(6):754–762, 2008.
[9] J. Kimberly, G. de Pouvourville, and T. d’Aunno. The
globalization of managerial innovation in healthcare.
Cambridge: University Press, 2009.
[10] F. Lin, S. Chen, S. Pan, and Y. Chen. Mining time
dependency patterns in clinical pathways. International
Journal of Medical Informatics, 62(1):11–25, 2001.
[11] A. Rebuge and D.R. Ferreira. Business process analysis
in healthcare environments: A methodology based on
process mining. Information Systems, 37(2):99–116,
2012.
[12] J. van de Klundert, P. Gorissen, and S. Zeemering. Measuring clinical pathway adherence. Journal of Biomedical
Informatics, 43(6):861–872, 2010.
[13] S.C. Muluk, L. Painter, S. Sile, R.Y. Rhee, M.S.
Makaroun, D.L. Steed, and M.W. Webster. Utility
of clinical pathway and prospective case management
to achieve cost and hospital stay reduction for aortic
aneurysm surgery at a tertiary care hospital. Journal of
Vascular Surgery, 25(1):84–93, 1997.
[14] A. Barbieri, K. Vanhaecht, P. Van Herck, W. Sermeus,
F. Faggiano, S. Marchisio, and M. Panella. Effects
of clinical pathways in the joint replacement: a meta-analysis. BMC Medicine, 7(32):1–11, 2009.
[15] S. Kul. The use of survival analysis for clinical pathways.
International Journal of Care Pathways, 14(1):23–26,
2010.
[16] M. Qiao, R. Akkiraju, and A. Rembert. Towards efficient
business process clustering and retrieval: Combining
language modeling and structure matching. In Stefanie
Rinderle-Ma, Farouk Toumani, and Karsten Wolf, editors, Business Process Management, volume 6896 of
Lecture Notes in Computer Science, pages 199–214.
Springer Berlin / Heidelberg, 2011.
[17] S. Goedertier, J. De Weerdt, D. Martens, J. Vanthienen,
and B. Baesens. Process discovery in event logs: An
application in the telecom industry. Applied Soft Computing, 11(2):1697–1710, 2011.
[18] D.M. Blei, A.Y. Ng, and M.I. Jordan. Latent dirichlet
allocation. Journal of Machine Learning Research, 3:993–
1022, March 2003.
[19] D. Gusfield. Algorithms on strings, trees and sequences,
Computer Science and Computational Biology. Cambridge University, 1997.
[20] J.M. Juarez, M. Campos, J. Palma, and R. Marin. T-care: temporal case retrieval system. Expert Systems,
28(4):324–338, 2011.
[21] X. Wei and W.B. Croft. LDA-based document models
for ad-hoc retrieval. In Proceedings of the 29th annual
international ACM SIGIR conference on Research and
development in information retrieval, SIGIR ’06, pages
178–185, New York, NY, USA, 2006. ACM.
[22] Y. Cheng. Mean shift, mode seeking, and clustering.
Pattern Analysis and Machine Intelligence, IEEE Transactions on, 17(8):790–799, aug 1995.
[23] A.K. Jain, M.N. Murty, and P.J. Flynn. Data clustering: A
review. ACM Computing Surveys, 31(3):264–323, 1999.
[24] L. Ertoz, M. Steinbach, and V. Kumar. Finding clusters
of different sizes, shapes, and densities in noisy, high dimensional data. In Third SIAM International Conference
on Data Mining (SDM), pages 47–58, 2003.
[25] E. Rosch, C. Mervis, W. Gray, D. Johnson, and P. Boyes-Braem. Basic objects in natural categories. Cognitive
Psychology, 8:382–439, 1976.
Zhengxing Huang received his B.S. in 2003, and
Ph.D. in 2010 in the College of Biomedical Engineering and Instrument Science at Zhejiang University, P.R. China. At present he is an instructor of the
College of Biomedical Engineering and Instrument
Science at Zhejiang University, P.R. China. His
research interests include computer-aided medical
decision support, artificial intelligence in medicine,
etc.
Wei Dong received her B.S. in Clinical Medicine
from Taishan Medical College, China in 1993, M.S.
in Cardiology from PLA General hospital, China in
1999, and Ph.D. in Cardiology from PLA General
hospital, China in 2002. At present, she is a deputy
chief physician of the Cardiology department of the
PLA General hospital, a young faculty member of the Chinese
Society of Cardiology, and a young faculty member of the PLA
Society of Cardiology. Her research interests include
coronary heart disease, the diagnosis and treatment
of acute and chronic heart failure, and clinical decision support.
Huilong Duan received his B.S. in Medical Instrumentation from Zhejiang University, China in 1985,
M.S. in Biomedical Engineering from Zhejiang University, P.R. China in 1988, and Ph.D. in Engineering
(Evoked Potential) from Zhejiang University, P.R.
China in 1991. He is currently a Professor in the
Department of Biomedical Engineering, and Dean
of the College of Biomedical Engineering & Instrument
Science, Zhejiang University.
His research interests are in Medical Image Processing, Medical Information System and Biomedical
Informatics.
He has published over 100 scholarly research papers in the above research
areas. He is Program Committee Member of Computer Aided Radiology
and Surgery; Editorial Board of Space Medicine & Medical Engineering
and Chinese Journal of Medical Instruments respectively; Editorial Board
of Chinese Journal of Biomedical Engineering; Secretary-General of BME
Education Steering Committee, Chinese Ministry of Education; and Member
of The Brain-Bridge Program Committee, Philips, TU/e and ZJU.
Haomin Li is an assistant professor of Biomedical
Engineering, Zhejiang University, China. He holds
a Ph.D. degree in Biomedical Engineering from
Zhejiang University. He was formerly a postdoctoral research fellow at the NHLBI Proteomics Center, David
Geffen School of Medicine at UCLA. His research
interests currently focus on clinical knowledge translation
and decision support.