
Automatic Checking of the Correctness of Clinical Guidelines in GLARE
Paolo Terenziani (b), Luca Anselma (a), Alessio Bottrighi (b), Laura Giordano (b), Stefania Montani (b)
(a) DI, Università di Torino, Corso Svizzera 185, 10149 Torino, Italy, E-mail: anselma@di.unito.it
(b) DI, Univ. del Piemonte Orientale “Amedeo Avogadro”, Spalto Marengo 33, 15100 Alessandria, Italy
Abstract
Representing clinical guidelines is a very complex knowledge-representation task, requiring considerable expertise and effort.
Nevertheless, guideline representations often contain several
kinds of errors. Therefore, checking the well-formedness and
correctness of a guideline representation is an important task,
which can be drastically improved with the adoption of
computer programs. In this paper, we discuss the advanced
facilities provided by the GLARE system to assist physicians
in the production of correct representations of clinical
guidelines.
Keywords:
Artificial Intelligence, clinical guidelines, syntactic and
semantic correctness
Introduction
Clinical guidelines are a means for specifying the “best”
clinical procedures and for standardizing them. Despite the
fact that they are usually produced as a result of long-term
cooperative efforts of large teams of experts, clinical
guideline representations can nevertheless contain different
forms of errors and/or inconsistencies. This fact seems to us a
natural consequence of the intrinsic difficulty of organizing
and representing huge amounts of both explicit and implicit
knowledge. The size of guidelines is usually such that
an extensive human check of correctness is infeasible. On
the other hand, advanced Artificial Intelligence techniques can
be used in order to automate large parts of such a check. In
the rest of the paper we describe the advanced facilities
provided by GLARE (Guideline Acquisition, Representation
and Execution), a domain-independent prototypical system to
acquire, represent and execute clinical guidelines, in order to
check the correctness of guidelines being represented.
Background
In recent years, the medical community has started to
recognize that computer-based systems dealing with clinical
guidelines provide relevant advantages, since, e.g., they can
be used to support physicians in the diagnosis and treatment
of diseases, or for education, critical review and evaluation
aims [5]. Thus, many different approaches and projects have
been developed in recent years to create domain-independent
computer-assisted tools for managing clinical guidelines (see
e.g., Asbru [13], EON [9], GEM [14], GLIF [10, 11], GUIDE
[12], PROforma [3], and also [5, 7, 8, 16]). Besides the
above-mentioned advantages, we believe that computer-based
tools might also play a crucial role in assisting experts in the
extremely difficult task of producing correct guideline
representations. In the following, we first sketch the main
features of our system, and then we show how it has been
extended in order to provide several forms of automatic
checks, to enhance the production of terminologically,
syntactically and semantically correct representations of
guidelines.
The GLARE system
GLARE (Guideline Acquisition, Representation and
Execution) is a domain-independent tool to acquire, represent
and execute clinical guidelines [4, 15]. It has been developed
since 1997 in a long-term cooperation between
Dipartimento di Informatica of Università del Piemonte
Orientale, Alessandria, Italy, and Azienda Ospedaliera S.
Giovanni Battista, Torino, Italy, one of the largest hospitals in
Italy. In the rest of this section, we sketch some of the more
interesting general features of GLARE’s approach, while
in the following section we focus on GLARE’s advanced
approach to checking the correctness of guideline
representations.
Representation formalism.
In order to guarantee usability of the program to physicians
not expert in Computer Science, in GLARE we aimed at
defining a limited set of clear representation primitives,
covering most of the relevant aspects of a guideline [15]. We
distinguish between atomic and composite actions (plans),
where atomic actions represent simple steps in a guideline,
and plans represent actions which can be defined in terms of
their components via the has-part relation. The has-part
relation supports top-down refinement: a guideline itself can
be seen as a composite action. Control relations establish
which actions can be executed next, and in what order. We
distinguish between four different control relations: sequence,
controlled, alternative and repetition.
Four different types of atomic actions have been defined as
well: work actions (actions to be performed at a certain step of
the guideline), query actions (requests for information),
decisions (selections among alternatives) and conclusions
(explicit output of a decision process). Actions are described
in terms of their attributes.
Acquisition and Execution tools.
As in most approaches in the literature, GLARE distinguishes
between the acquisition phase (when a guideline is introduced
into the system –e.g., by a committee of expert physicians)
and the execution phase (when a guideline is applied to a
specific patient). Therefore, the system is composed of two
main modules, the acquisition tool and the execution tool. The
tools closely interact with a set of databases, including the
terminological database (during acquisition) and the patient
database (during execution).
The acquisition tool provides a graphical interface to acquire
atomic actions, has-part relations and control relations
between the components of plans. The guideline is depicted as
a graph, where each action is represented by a node (different
forms and colours are used to distinguish among different
types of actions), while control relations are represented by
arcs. By clicking on the nodes in the graph, the user can
trigger other windows to acquire the internal descriptions
(attributes) of the nodes. The interface also shows the
hierarchical structure of the guideline in the form of a tree,
where plans can be seen as parents of their components (see
figure 1).
We have already tested our representation formalism and
acquisition tool prototype. Several groups of expert
physicians, following a few-hour training session, used
GLARE to acquire algorithms concerning different clinical
domains (e.g., bladder cancer, reflux esophagitis, and heart
failure), with the help of a knowledge engineer. In all the tests,
our representation formalism and acquisition tool proved
expressive enough to cover the clinical algorithms, and the
acquisition of a clinical guideline was reasonably fast (e.g., the
acquisition of the guideline on heart failure, starting from a
non-structured textual representation, required only 3 days).
Methods
Our acquisition module provides an “intelligent” interface to
expert physicians, in the sense that it helps them in the task of
acquiring a consistent guideline. To achieve this goal,
our acquisition module supports different types of consistency
checking, which operate automatically whenever the expert
physician modifies a guideline (typically, by adding new nodes,
i.e., actions; arcs, i.e., control relations; or descriptions of the
attributes of a node). Other checks can be applied
afterwards, when an entire guideline has been acquired, to
verify its semantic consistency.
The execution tool is typically used “on-line”: a user
physician applies a guideline with reference to a specific
patient. This method is used for integrating guidelines into
clinical practice. Moreover, GLARE is available for “off-line”
execution, i.e. for education, critical review and evaluation
purposes. The execution tool also provides a decision support
facility, which allows physicians to navigate through the
guideline to see and compare alternative paths (stemming
from decision actions).
Checking terminological correctness
This first type of consistency checking is automatically
triggered whenever the expert physician introduces a new
term or value within the description of an action in a
guideline. The acquisition module closely interacts with the
clinical vocabulary Database in order to provide expert physicians with a standard terminology, and with a standard
range of values for clinical findings. We currently support
two modalities for the execution of the acquisition tool: the
“safe” and the “advanced” modalities. In the “safe” mode, the
expert physician is only allowed to introduce in the guideline
terms/values that have already been defined within the
clinical vocabulary Database. In order to make this task
easier, the acquisition tool provides a means of browsing the
clinical vocabulary Database, on the basis of the hierarchical
organization of the data it contains. For instance, in the
decision “GERD differential diagnosis” one criterion regards
duration of heartburn. This datum can be found by browsing
the database as follows: patient history (section) → subjective
symptoms (class) → specific complaints (category) →
heartburn (datum) → duration (attribute). The possible values
of this attribute, as stored in the Clinical Database, are “null”,
“less than 3 months”, “more than 3 months”.
Thus, in the “safe” mode, the acquisition tool requires all
data and values to be consistent with the dictionary provided
by the Clinical Database.
In the “advanced” mode we also allow expert physicians to
introduce new terms/values that are not already contained
within the Clinical Database. In such a case, a warning is
shown to the expert physician. If s/he decides to go on, s/he
is directly responsible for the correctness of the new
term/value.
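The two modalities can be illustrated with a minimal sketch. It assumes, purely for illustration, that the clinical vocabulary is a dictionary keyed by the hierarchical path of a datum; the names VOCABULARY and check_term are hypothetical and are not part of GLARE.

```python
# Hypothetical in-memory stand-in for the clinical vocabulary database.
# Keys follow the hierarchy section > class > category > datum > attribute.
VOCABULARY = {
    ("patient history", "subjective symptoms", "specific complaints",
     "heartburn", "duration"): {"null", "less than 3 months", "more than 3 months"},
}


def check_term(path, value, mode="safe", vocabulary=VOCABULARY):
    """Return (accepted, message) for a term/value entered during acquisition."""
    if mode not in ("safe", "advanced"):
        raise ValueError(f"unknown mode: {mode}")
    # A value is known only if its path exists and lists that value.
    if value in vocabulary.get(path, set()):
        return True, None
    if mode == "safe":
        # Safe mode: only terms/values already in the vocabulary are allowed.
        return False, "rejected: term/value not in the clinical vocabulary"
    # Advanced mode: accept, but warn that the expert is now responsible.
    return True, "warning: new term/value, correctness is up to the expert"


path = ("patient history", "subjective symptoms", "specific complaints",
        "heartburn", "duration")
print(check_term(path, "less than 3 months"))          # known value
print(check_term(path, "about a year"))                # refused in safe mode
print(check_term(path, "about a year", "advanced"))    # accepted with a warning
```

In this sketch the only difference between the two modes is what happens to an unknown value: it is refused in “safe” mode and accepted with a warning in “advanced” mode.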
Checking syntactic consistency
Figure 1: A window of GLARE’s acquisition tool graphical
interface (concerning part of the gallbladder stones treatment
guideline): on the left, the hierarchical structure of the
guideline is displayed; on the right, the representation of
control relations is shown in the form of a graph.
The second type of consistency checking is automatically
triggered whenever the expert physician introduces a node or
arc within a guideline. The acquisition module checks
whether the new element being introduced is consistent with
several “logical design criteria” of guidelines we want to
enforce with our tool. For example, the acquisition module:
(i) checks that each alternative is preceded by a decision;
whenever alternative arcs exit a node, the acquisition module
checks that such a node is a decisional one. This check
ensures that whenever alternative ways of
achieving goals are considered in the guideline, the guideline
also contains an explicit way of discriminating between them.
This property is important especially at execution time, since,
in such a way, the execution tool can provide specific support
to the user physicians whenever they have to choose from
alternatives;
(ii) checks that decision actions are preceded by query
actions specifying all the data involved in the decision
criteria. Thus, at execution time, a decision action is
executable only after all necessary data for that decision are
available.
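Checks (i) and (ii) can be sketched over a simple graph encoding. The encoding below (a dictionary of typed nodes and a list of labelled arcs) and the function name check_syntax are assumptions made for this illustration only; in particular, “preceded by” is interpreted here as “reachable backwards along any arcs”, which may differ from the criterion actually used by the tool.

```python
def check_syntax(nodes, arcs):
    """nodes: {name: {"type": str, "data": set}}; arcs: [(src, dst, relation)]."""
    errors = []

    # Check (i): alternative arcs may only leave decision nodes.
    bad_sources = sorted({src for src, _, rel in arcs
                          if rel == "alternative"
                          and nodes[src]["type"] != "decision"})
    for src in bad_sources:
        errors.append(f"alternative arcs leave non-decision node '{src}'")

    # Check (ii): every datum used by a decision must be requested by
    # some query action that precedes it in the graph.
    preds = {}
    for src, dst, _ in arcs:
        preds.setdefault(dst, set()).add(src)
    for name, node in nodes.items():
        if node["type"] != "decision":
            continue
        available, seen, stack = set(), set(), [name]
        while stack:  # backward traversal over all predecessors
            for p in preds.get(stack.pop(), ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
                    if nodes[p]["type"] == "query":
                        available |= nodes[p].get("data", set())
        missing = sorted(node.get("data", set()) - available)
        if missing:
            errors.append(f"decision '{name}' lacks queried data: {missing}")
    return errors


nodes = {
    "ask_history": {"type": "query", "data": {"heartburn duration"}},
    "diagnosis": {"type": "decision",
                  "data": {"heartburn duration", "endoscopy result"}},
    "drug_therapy": {"type": "work"},
    "surgery": {"type": "work"},
}
arcs = [
    ("ask_history", "diagnosis", "sequence"),
    ("diagnosis", "drug_therapy", "alternative"),
    ("diagnosis", "surgery", "alternative"),
]
print(check_syntax(nodes, arcs))
```

In this toy guideline the decision uses an endoscopy result that no preceding query action requests, so check (ii) reports it, while check (i) is satisfied.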
Checking semantic consistency: temporal constraints
In most therapies, actions have to be performed according to a
set of temporal constraints concerning their relative order,
their duration, and the delays between them. Additionally, in
many cases, actions must be repeated at regular (i.e., periodic)
times. Furthermore, it is also necessary to carefully take into
account the (implicit) temporal constraints derived from the
hierarchical decomposition of actions into their components
and from the control-flow of actions in the guideline.
Checking the consistency of such a set of implicit and explicit
constraints is a very hard task, which cannot be performed
manually by any expert. On the other hand, within the
Artificial Intelligence community, several approaches to
perform automatically the propagation of temporal constraints
and to check their consistency have been developed [17].
Despite the large amount of valuable work, there still seems
to be a gap between the range of phenomena covered by
current AI temporal reasoning approaches and the needs
arising from clinical guidelines management. In particular, in
clinical guidelines,
(1) qualitative and quantitative constraints, as well as
repeated/periodic events need to be considered at the
same time; all types of constraints may be imprecise
and/or partially defined;
(2) a structured representation of complex events (in terms
of part-of relations) must be supported, to deal with
structured descriptions of the domain knowledge;
(3) the distinction between classes of actions (e.g. an action
in a general guideline) and instances of such actions
(e.g., the specific execution of an action in a guideline)
has to be supported.
Obviously, the interplay between issues (1)-(3) needs to be
dealt with, too. For example, the interaction between
composite and periodic events might be complex to represent
and manage. In fact, in the case of a composite periodic event,
the temporal pattern regards the components, which may,
recursively, be composite and/or periodic events. For instance,
consider Ex. 1, in which the instances of the melphalan
treatment must respect the temporal pattern “twice a day, for 5
days”, but such a pattern must be repeated for six cycles, each
one followed by a delay of 23 days, since the melphalan
treatment is part of the general therapy for multiple myeloma.
(Ex. 1) The therapy for multiple myeloma consists of six cycles
of 5-day treatment, each one followed by a delay of 23 days
(for a total time of 24 weeks). Within each cycle of 5 days, 2
inner cycles can be distinguished: the melphalan treatment, to
be provided twice a day, for each of the 5 days, and the
prednisone treatment, to be provided once a day, for each of
the 5 days. These two treatments must be performed in
parallel.
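The figures in Ex. 1 can be checked with a few lines of arithmetic. The snippet below is only a sanity check of the numbers stated in the example; the variable names are ours.

```python
CYCLES = 6               # outer cycles of the therapy
TREATMENT_DAYS = 5       # days of treatment in each cycle
DELAY_DAYS = 23          # delay following each cycle
MELPHALAN_PER_DAY = 2    # melphalan: twice a day
PREDNISONE_PER_DAY = 1   # prednisone: once a day

cycle_length = TREATMENT_DAYS + DELAY_DAYS     # one cycle plus its delay
total_days = CYCLES * cycle_length             # whole therapy, in days
total_weeks = total_days // 7                  # whole therapy, in weeks

# Number of administrations over the whole therapy.
melphalan_total = CYCLES * TREATMENT_DAYS * MELPHALAN_PER_DAY
prednisone_total = CYCLES * TREATMENT_DAYS * PREDNISONE_PER_DAY

print(cycle_length, total_days, total_weeks)   # 28 168 24
print(melphalan_total, prednisone_total)       # 60 30
```

Each cycle with its delay spans 28 days, so six cycles span 168 days, i.e., the 24 weeks stated in the example.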
Unfortunately, no current work in the AI and in the
guideline literature proposes a comprehensive approach in
which all the above phenomena can be represented, and
correct, complete and tractable temporal reasoning can be
performed. In GLARE, we define an approach addressing all
the above-mentioned issues [1].
A complete automatic treatment of temporal constraints
involves, besides the design of an expressive representation
formalism, also the development of suitable temporal
reasoning algorithms operating on them, to be applied both at
acquisition and at execution time. However, subtle issues such
as the trade-off between the expressiveness of the
representation formalism and the tractability of correct and
complete temporal reasoning algorithms have to be faced in
order to deal with temporal constraints in a principled and
well-founded way; few works in the area of computerized
guidelines have deeply analyzed this topic so far.
As a starting point, we have chosen to rely as much as
possible on STP (Simple Temporal Problem) [2], a well-known
and well-established Artificial Intelligence approach
coping with different types of temporal constraints. However,
we had to extend it, in order to cope with the aforementioned additional temporal issues. Specifically, we have
chosen to model the constraints regarding repeated actions
into separate STPs, one for each repeated action. Thus, in our
approach, the overall set of constraints between actions in the
guideline is represented by a tree of STPs (STP-tree
henceforth). The root of the tree (node N1 in the example in
Fig. 2) is the STP which homogeneously represents the
constraints (including the ones derived from the control-flow
of actions in the guideline) between all the actions in the
guideline (e.g., in N1, the fact that the duration of the
chemotherapy is 168 days), except repeated actions. Each
node in the tree is an STP, and has as many children as the
number of repeated actions it contains. Each edge in the tree
connects a pair of endpoints in an STP (the starting and ending
point of a repeated action) to the STP containing the
constraints between its subactions, and is labeled with the list
of properties describing the temporal constraints on the
repetitions. For example, in Fig. 2, we show the STP-tree
representing the temporal constraints of
Ex. 1.
Figure 2: STP-tree for the multiple myeloma chemotherapy
guideline. Thin lines and arcs between nodes in an STP
represent bounds-on-differences constraints. Arcs from a pair of
nodes to a child STP represent repetitions. Arcs between any
two nodes X and Y in an STP of the STP-tree are labeled by a
pair [n,m] representing the minimum and maximum distance
between X and Y.
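The shape of an STP-tree can be sketched as a small data structure. This is an illustrative encoding only: the class names, the point names, the node names other than N1, the way the repetition properties are stored, and the 5-day bounds inside the cycle node are all assumptions of this sketch, not the representation used in GLARE. Only the 168-day bound and the repetition figures come from Ex. 1.

```python
from dataclasses import dataclass, field


@dataclass
class STP:
    """One node of the tree: bounds on differences between time points."""
    name: str
    # (x, y) -> (min, max) admissible distance from x to y, in days
    bounds: dict = field(default_factory=dict)
    # edges towards the STPs of the repeated actions contained in this node
    repetitions: list = field(default_factory=list)


@dataclass
class Repetition:
    """Tree edge, attached to the start/end points of a repeated action."""
    start: str
    end: str
    properties: dict   # hypothetical encoding of the repetition constraints
    child: STP


def count_nodes(stp):
    """Number of STPs in the tree rooted at stp."""
    return 1 + sum(count_nodes(r.child) for r in stp.repetitions)


# Leaves: one STP per inner repeated treatment (internal constraints omitted).
melphalan = STP("N3")
prednisone = STP("N4")

# One 5-day cycle: the two treatments run in parallel (illustrative bounds).
cycle = STP("N2",
            bounds={("Smel", "Emel"): (5, 5), ("Spre", "Epre"): (5, 5),
                    ("Smel", "Spre"): (0, 0)},
            repetitions=[
                Repetition("Smel", "Emel", {"per_day": 2, "days": 5}, melphalan),
                Repetition("Spre", "Epre", {"per_day": 1, "days": 5}, prednisone),
            ])

# Root: the whole chemotherapy lasts 168 days and repeats the cycle 6 times.
root = STP("N1",
           bounds={("Sch", "Ech"): (168, 168)},
           repetitions=[
               Repetition("Sch", "Ech",
                          {"times": 6, "days": 5, "delay_days": 23}, cycle),
           ])

print(count_nodes(root))   # 4
```

Each node has exactly as many outgoing edges as repeated actions it contains: one for the root (the cycle) and two for the cycle (melphalan and prednisone).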
In order to check the consistency of the STP-tree, it is not
sufficient to check the consistency of each node separately,
since doing so would neglect the
repetition/periodicity information. Temporal consistency
checking thus proceeds in a top-down fashion, starting from
the root of the STP-tree. The root contains a
“standard” STP, so the Floyd-Warshall algorithm can be
applied to check its consistency (as shown in [2]). Thereafter,
we proceed top-down towards the leaves of the tree. For each
node in the tree, we first check that the constraints on the arcs,
considered alone, are consistent. If so, we then merge such
constraints with the constraints in the child node, and propagate
the resulting constraints to verify their joint consistency. We
also formally proved the following.
Property. Our algorithm to check the consistency of STP-trees is correct, complete, and operates in polynomial time.
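The check applied to a single “standard” STP, such as the root, can be sketched as follows. The sketch uses the standard distance-graph formulation of [2]: a constraint lo <= t_y - t_x <= hi becomes an edge x→y of weight hi and an edge y→x of weight -lo, and the STP is consistent if and only if the graph has no negative cycle. The function name, the point names and the sample constraints are ours; the merging of repetition constraints with child nodes described above is not reproduced here.

```python
import math


def stp_consistent(points, constraints):
    """constraints: iterable of (x, y, lo, hi) meaning lo <= t_y - t_x <= hi.

    Returns (is_consistent, d), where d holds all-pairs shortest distances.
    """
    idx = {p: i for i, p in enumerate(points)}
    n = len(points)
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for x, y, lo, hi in constraints:
        i, j = idx[x], idx[y]
        d[i][j] = min(d[i][j], hi)     # t_y - t_x <= hi
        d[j][i] = min(d[j][i], -lo)    # t_x - t_y <= -lo
    # Floyd-Warshall all-pairs shortest paths.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # A negative entry on the diagonal reveals a negative cycle.
    return all(d[i][i] >= 0 for i in range(n)), d


points = ["S", "A", "E"]   # start, an intermediate point, end (in days)
ok, _ = stp_consistent(points, [("S", "E", 168, 168),
                                ("S", "A", 5, 5),
                                ("A", "E", 163, 163)])
print(ok)    # True

bad, _ = stp_consistent(points, [("S", "E", 168, 168),
                                 ("S", "A", 5, 5),
                                 ("A", "E", 170, 200)])
print(bad)   # False: 5 + 170 days cannot fit into 168 days
```

The three nested loops make the cost cubic in the number of time points of the node being checked.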
Checking semantic “logical” consistency
Besides the property of being temporally consistent, several
other semantic properties (e.g., logical consistency, safeness)
should be checked on clinical guidelines. Specifically, we
have identified four different classes of properties relevant in
the clinical guideline context, and we have proposed a general
and task-independent approach to check all of them.
(i) Properties concerning a guideline “per se”. One can
check if the guideline contains a path of actions
satisfying a given set of conditions (e.g., a path
including actions X, Y and Z, or a path in which no
action of type X is executed, or a path not requiring a
given laboratory test, or a path requiring only a given
set of resources, and so on).
(ii) Properties of a guideline in a given context. Specific
contexts of execution may impose several limitations
on the executable actions of guidelines, related, e.g., to
the lack of certain resources (e.g., laboratory
instruments). The consequences of such limitations
may be automatically investigated. For instance, one
can check whether or not there is a therapy for a patient
affected by a given disease, in the case a specific set of
resources is available (or not available).
(iii) Properties of a guideline when applied to a specific
patient. For instance, the feasibility of a given action
or path of actions on the specific patient can be
checked.
(iv) Integrated proofs. Any combination of the above
types of checks can be performed. For instance, one
may ask whether, given a patient with a specific
disease and set of symptoms, and given a hospital
with a specific set of resources, there is a path in the
guideline which applies to the patient and satisfies a
given set of properties.
In our approach, we provide a general way of proving all the
above properties by loosely coupling GLARE with the model
checker SPIN [6]. Roughly speaking, we have devised a tool
to map clinical guidelines acquired by the GLARE system into
the Promela language, which is the language used by the SPIN
model-checker. Promela allows a high level model of a
distributed system to be defined by modelling each agent in
an extended pseudo-C code, including synchronization
primitives and message exchange primitives. Specifically,
GLARE’s guidelines are translated into a set of agents (e.g.,
the agents representing the guideline actions, the agent
representing the physician executing the guideline, and so on).
Once we have the translation of GLARE’s guidelines in
Promela code, we can use SPIN as a general-purpose engine
to prove any property that can be expressed in the temporal
logic LTL. In fact, SPIN translates each process (each agent)
into a finite automaton, and the global behaviour of the system
is obtained by computing an asynchronous interleaving
product of automata. The resulting automaton represents the
global state space of the system and can be built on-the-fly during
the verification process. The property which has to be verified
on the system is passed to the verifier through an interface,
which maps it into a temporal formula, as required by SPIN.
SPIN converts the negation of the temporal formula into a
Büchi automaton and computes its synchronous product with
the system global state space. If the language of the resulting
Büchi automaton is empty then the property is true on all the
possible executions of the system; otherwise the verifier
provides a counterexample for the property (an execution path
on which it is false).
In such a way, we provide a general-purpose approach to
automatically check all the types of properties discussed
above.
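The kind of question answered for properties of class (i) can be illustrated with a plain explicit search over a toy guideline graph. This is not SPIN, Promela or LTL: it is a simplified stand-in, with hypothetical names and a hypothetical graph, that only shows what “there is a path including some actions and avoiding others” means. With a model checker, one would state that no such path exists and read the counterexample as the path sought.

```python
def find_path(graph, start, goal, must_include=(), must_avoid=()):
    """Depth-first search for a simple path from start to goal that visits
    every action in must_include and none in must_avoid; None if absent."""
    must_include, must_avoid = set(must_include), set(must_avoid)

    def dfs(node, path):
        if node in must_avoid:
            return None
        path = path + [node]
        if node == goal:
            return path if must_include <= set(path) else None
        for nxt in graph.get(node, ()):
            if nxt not in path:          # keep the path simple (no loops)
                found = dfs(nxt, path)
                if found:
                    return found
        return None

    return dfs(start, [])


# Toy guideline: after the decision, either a laboratory test or imaging.
guideline = {
    "visit": ["decision"],
    "decision": ["lab_test", "imaging"],
    "lab_test": ["therapy"],
    "imaging": ["therapy"],
}

# Is there a path to the therapy that does not require the laboratory test?
print(find_path(guideline, "visit", "therapy", must_avoid={"lab_test"}))
# If neither instrument is available, no path to the therapy remains.
print(find_path(guideline, "visit", "therapy",
                must_avoid={"lab_test", "imaging"}))
```

The second query corresponds to a property of class (ii): a context lacking certain resources is modelled here simply by forbidding the actions that need them.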
Results
GLARE provides a set of facilities to help physicians during
the acquisition of clinical guidelines, and to verify a-posteriori
the correctness of guidelines being represented. To the best of
our knowledge, no other guideline system in the literature
provides such a large set of facilities. GLARE’s facilities have
proven to be quite effective in our testing activities. In
particular, in the cases in which a textually written guideline
had to be entered into the GLARE system, the adoption of
GLARE facilities allowed us to detect different kinds of errors
in the original guidelines. In certain cases, these errors were
simply due to omissions (e.g., lack of a decision action to
discriminate between alternative paths of actions). In other
cases, they were due to the difficulty for humans of propagating
constraints (and, in particular, temporal constraints) along
long paths of actions. In all cases the expert physicians agreed
that the errors detected by the GLARE system were
“genuine” errors in the original guidelines, and agreed to
correct them.
Acknowledgments
We acknowledge Prof. Gianpaolo Molino and Dr. Mauro
Torchio of Azienda Ospedaliera S. Giovanni Battista, Turin,
Italy for their cooperation in defining and testing the GLARE
system. The research reported in this paper has been partially
supported by a grant from Koine Sistemi, Torino, and by
PRIN’06.
References
1. L. Anselma, P. Terenziani, S. Montani, A. Bottrighi.
Towards a Comprehensive Treatment of Repetitions,
Periodicity and Temporal Constraints in Clinical Guidelines.
Artificial Intelligence in Medicine, 2006, 38(2): 171-195.
2. R. Dechter, I. Meiri, J. Pearl, Temporal Constraint
Networks, Artificial Intelligence, 1991; 49: 61-95.
3. J. Fox, N. Johns, A. Rahmanzadeh, R. Thomson,
Disseminating medical knowledge: the PROforma approach,
AI in Medicine, 1998; 14: 157-181.
4. L. Giordano, P. Terenziani, A. Bottrighi, S. Montani, L.
Donzella, Model Checking for Clinical Guidelines: an Agent-Based Approach, Proc. AMIA 2006, Washington D.C.;
November 2006. p. 289-293.
5. C. Gordon and J.P. Christensen, eds., Health Telematics for
Clinical Guidelines and Protocols (IOS Press, Amsterdam);
1995.
6. G.J. Holzmann, The SPIN Model Checker. Primer and
Reference Manual. Addison-Wesley; 2003.
7. JAMIA, Focus on Clinical Guidelines and Patient
Preferences, JAMIA, 1998; 15(3).
8. Special Issue on Workflow Management and Clinical
Guidelines, D.B. Fridsma (Guest ed.), JAMIA, 2001; 22(1):180.
9. M.A. Musen, S.W. Tu, A.K. Das, and Y. Shahar, EON: A
component-based approach to automation of protocol-directed
therapy. Journal of the American Medical Informatics
Association 1996; 3(6): 367-388.
10. L. Ohno-Machado, J.H. Gennari, S. Murphy, N.L. Jain,
S.W. Tu, D.E. Oliver, et al., The GuideLine Interchange
Format: A Model for Representing Guidelines, JAMIA, 1998;
5(4):357-372.
11. M. Peleg, A.A. Boxwala, et al., GLIF3: The evolution of
a Guideline Representation Format, in: Proc. AMIA Annual
Symposium; 2000.
12. S. Quaglini, M. Stefanelli, A. Cavallini, G. Miceli, C.
Fassino, and C. Mossa, Guideline-based careflow systems,
Artificial Intelligence in Medicine, 2000; 20(1):5-22.
13. Y. Shahar, S. Mirksch, P. Johnson, The Asgaard Project: a
Task-Specific Framework for the Application and Critiquing
of Time-Oriented Clinical Guidelines, Artificial Intelligence
in Medicine, 1998; 14:29-51.
14. R.N. Shiffman, B.T. Karras, A. Agrawal, R. Chen, L.
Menco, and S. Nath, GEM: a proposal for a more
comprehensive guideline document model using XML,
JAMIA, 2000; 7(5):488-498.
15. P. Terenziani, G. Molino, M. Torchio. A Modular
Approach for Representing and Executing Clinical Guidelines.
Artificial Intelligence in Medicine 23 (2001) 249-276.
16. S.W. Tu, M.S. Mark, A. Musen, A Flexible Approach to
Guideline Modeling, in: Proc. AMIA’99; 1999. p. 420-4.
17. L. Vila, A Survey on Temporal Reasoning in Artificial
Intelligence, AI Communications, 1994; 7(1):4-28.
Please address all correspondence to Paolo Terenziani
Prof. Paolo Terenziani,
Dipartimento di Informatica, Università del Piemonte Orientale
“Amedeo Avogadro”
Via Bellini 25/g, 15100 Alessandria, Italy
Email: terenz@mfn.unipmn.it -- Phone: +39 0131 360174