
WIREs Data Min Knowl - 2022 - Guzzo - Process mining applications in the healthcare domain A comprehensive review

Received: 22 September 2021
Revised: 9 November 2021
Accepted: 10 November 2021
DOI: 10.1002/widm.1442
ADVANCED REVIEW
Process mining applications in the healthcare domain:
A comprehensive review
Antonella Guzzo1 | Antonino Rullo1 | Eugenio Vocaturo1,2
1 DIMES Department, University of Calabria, Rende, CS, Italy
2 CNR-NANOTEC National Research Council, Rende, CS, Italy
Correspondence
Antonella Guzzo, DIMES Department,
University of Calabria, Rende, CS 87036,
Italy.
Email: antonella.guzzo@unical.it
Edited by: Elisa Bertino, Associate Editor
and Witold Pedrycz, Editor in Chief
Abstract
Process mining (PM) is a well-known research area that includes techniques,
methodologies, and tools for analyzing processes in a variety of application
domains. In healthcare, processes are characterized by high
variability in terms of activities, duration, and involved resources
(e.g., physicians, nurses, administrators, and machinery). In addition, the
multitude of diseases affecting the patients housed in healthcare facilities makes medical contexts highly heterogeneous. As a result, understanding and analyzing healthcare processes are not trivial tasks,
and administrators and doctors look for tools and methods that can concretely support them in improving the healthcare services they are involved
in. In this context, PM has been increasingly used for a wide range of applications, as reported in some recent reviews. However, these reviews mainly
focus on applications related to clinical pathways, while a
systematic review of all possible applications is absent. In this article, we
selected 172 papers published in the last 10 years that present applications of
PM in the healthcare domain. The objective of this study is to help and guide
researchers interested in the medical field to understand the main PM applications in healthcare, and also to suggest new directions for developing promising
and not yet fully investigated applications. Moreover, our study could be of
interest to practitioners who are considering applications of PM, who can
identify and choose PM algorithms, techniques, tools, methodologies, and
approaches on the basis of past successful experiences.
This article is categorized under:
Application Areas > Health Care
Fundamental Concepts of Data and Knowledge > Key Design Issues in Data
Mining
KEYWORDS
conformance analysis, healthcare, hospital information system, process discovery, process
mining, process performance measurements, process simulation
WIREs Data Mining Knowl Discov. 2022;12:e1442.
https://doi.org/10.1002/widm.1442
wires.wiley.com/dmkd
© 2021 Wiley Periodicals LLC.
1 of 47
GUZZO ET AL.
1 | INTRODUCTION
The high heterogeneity that characterizes the large amount of data generated by health information systems represents
the major obstacle to the assessment of healthcare quality. Such heterogeneity is due to multiple factors. First, patients
are treated in many healthcare organizations that differ from each other in many respects, such
as resource management, specialized departments, medical disciplines, outsourced care services, technological equipment, and size of the healthcare facility, to name a few. These differences at the healthcare facility level mean that
healthcare processes, even in similar contexts (e.g., the treatment of a specific disease), may differ from each other in
quality, duration, and the quantity and type of resources involved in the process (e.g., physicians, nurses, and administrators).
However, the dynamics of healthcare processes may also depend on further aspects, including the introduction of new administrative procedures and medical guidelines; the discovery of new drugs, treatments, diagnostic procedures, and diseases; the wide range of reactions of a patient's body when subjected to a specific treatment; the
effectiveness of a treatment on each patient; patient behavior; the course and evolution of diseases in each individual
patient; and the individual experience of the hospital staff, which leads to different decisions and interpretations. The result
of such a complex, variable, dynamic, and multidisciplinary nature of the healthcare domain is that healthcare processes have a high degree of variability, a non-repetitive character, and, to a large
extent, a nondeterministic order of execution.
It follows that understanding and analyzing healthcare processes are definitely not trivial tasks. An effective assessment of their quality may require the processes to be analyzed from different perspectives, such as
workflow (the sequence of performed activities), time (temporal relations between performed activities), resource (actors
and medical equipment involved in the process), and data (additional information such as drugs, department, and hospital-related
data). It is clear, then, that hospital administrators and doctors need to be assisted in the task of process understanding by tools for the automatic extraction of knowledge from data that are able to process and correlate large
amounts of (possibly unstructured) heterogeneous information.
Such demand calls for the use of process mining (PM), a research area that includes techniques, methodologies, and tools to analyze and improve processes in a variety of applications, such as the automated discovery of
knowledge from data, checking conformance with regulations, and choosing the most suitable model to represent complex processes by the sequence of events, their timing, and the set of involved resources. The PM
approach makes it possible to find patterns hidden in the huge volume of data, and to evaluate process performance
with the aim of improving processes and generating predictions. Moreover, comparative analyses can
be performed between the discovered models and reference ones, such as medical protocols and clinical guidelines, in order to evaluate the compliance of actual processes with prescribed ones, or in other words, to
figure out what is really happening compared to what was expected to occur. In the last decade, PM techniques
have been used in the field of healthcare processes in a wide range of applications, including the discovery of process models,
the analysis of social interactions, and conformance checking, as reported in some recent surveys. However, these surveys
mainly focus on applications related to clinical pathways, while a systematic review of all possible
applications is completely absent.
The objective of this study is to help and guide researchers who are interested in the medical field to understand the
main PM applications in the healthcare domain, and also to suggest new directions for promising and not yet fully
investigated applications. Our study could also be of interest to practitioners who are considering
applications of PM, who can identify and choose PM algorithms, techniques, tools, methodologies, and approaches
on the basis of past successful experiences. We manually searched the main electronic digital libraries of scientific literature for studies published in the last 10 years
and selected 172 papers for a thorough review. In this
article, we propose a taxonomy for the conceptual organization of the retrieved papers on PM applications in
healthcare, and use it to discuss the results.
The article is organized as follows: Section 2 overviews the review protocol of the current study by presenting
research questions and details about the publication selection process; Section 3 gives a short overview of main PM
techniques, tools and methods; Section 4 provides insight on healthcare PM application by characterizing data and process in the related field; Section 5 reports a literature discussion based on the proposed categorization of the works that
were selected; Section 6 includes a summary of our findings together with open issues and future research directions;
Section 7 presents past reviews as related works; finally, Section 8 concludes the article by providing overall results and
plans for future work.
19424795, 2022, 2, Downloaded from https://wires.onlinelibrary.wiley.com/doi/10.1002/widm.1442 by Egyptian National Sti. Network (Enstinet), Wiley Online Library on [21/04/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
2 | REVIEW PROTOCOL
In order to retrieve and select relevant studies, we conducted a systematic literature search according to the approach
described by Kitchenham (2004), which allowed us to identify and classify the reviewed papers on the basis of the
adopted PM application methods and algorithms. In particular, our literature search and review process consists of
the following steps:
1. Specification of the research questions to be answered;
2. Definition of the query to be launched on the most relevant databases of academic papers;
3. Identification of inclusion and exclusion criteria for the screening of the retrieved papers;
4. Review of the papers obtained so far;
5. Definition of a taxonomy to guide the classification of reviewed papers.
The following research questions have been defined:
RQ1: What methods and techniques have been adopted for the PM applications in healthcare?
RQ2: How can the processes that are performed in healthcare facilities be categorized, and which kinds of processes have
received the most attention in the literature?
RQ3: What are the main challenges PM faces when applied to the healthcare domain?
In order to identify as many targeted studies as possible and to assess the volume of potentially relevant studies,
we defined the following query:
(“process mining” OR “workflow mining”) AND (“health” OR “care” OR “healthcare” OR “hospital” OR “clinic” OR “clinical” OR “pathways” OR “health-care” OR “patient”)
Then we launched the query on the most relevant databases of academic papers, that include PubMed, dblp, Google
Scholar, Scopus, Web of Science, IEEE Xplore, ACM Digital Library, SpringerLink, and ScienceDirect. The retrieved
papers were the result of the above query evaluated on the title, keywords and abstract of the articles. The search was
done independently by each author within the reference time range from 2010 to 2021. As expected, the number of
items provided by each database was unreasonably high (in the order of thousands in Google Scholar, and of hundreds
in the other platforms); however, it was evident that after a certain number of pages the results were articles either
already identified or off-topic (e.g., only PM or only healthcare).
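The screening logic of the query can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`title`, `keywords`, `abstract`) and the whole-word, case-insensitive matching rule are hypothetical, and each bibliographic database applies its own matching semantics.

```python
import re

# Term groups copied from the query above.
PM_TERMS = ["process mining", "workflow mining"]
HEALTH_TERMS = ["health", "care", "healthcare", "hospital", "clinic",
                "clinical", "pathways", "health-care", "patient"]

def _contains(text, term):
    # Assumption: whole-word (or whole-phrase), case-insensitive matching.
    pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

def matches_query(record):
    """Evaluate (PM terms) AND (health terms) on title, keywords, and abstract."""
    text = " ".join(record.get(field, "") for field in ("title", "keywords", "abstract"))
    return (any(_contains(text, t) for t in PM_TERMS)
            and any(_contains(text, t) for t in HEALTH_TERMS))

# Hypothetical records, invented for illustration only.
records = [
    {"title": "Process mining of patient flows", "abstract": "Event logs from a ward."},
    {"title": "Process mining in manufacturing", "abstract": "Event logs from a plant."},
    {"title": "Hospital scheduling", "abstract": "An integer programming model."},
]
selected = [r["title"] for r in records if matches_query(r)]
```

Only the first record satisfies both halves of the conjunction; the other two are the "only PM" and "only healthcare" cases mentioned above.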
In parallel, we carried out a search on the four websites processmining.org, processmining4healthcare.org,
bpmcenter.org, and pods4h.com, trying to identify relevant studies or projects concerning the application of PM in the
healthcare field.
We found a considerable overlap in the obtained results, as many articles are indexed in several databases. After
an aggregation process, a total of 296 articles were considered in our review process.
To determine which articles to include in, and which to exclude from, this review, we adopted the following criteria:
Inclusion criteria:
• Articles where PM has been applied in the healthcare domain;
• Articles published in English;
• Articles published in relevant journals and conferences;
Exclusion criteria:
• Articles that did not present a PM application in the context of healthcare;
• Articles whose contribution was not clear;
• Articles not written in English;
• Bachelor's, master's, and PhD theses;
• Conference articles for which an extended journal version has been published; in these cases, the journal version
only was selected.
Following Kitchenham (2004), these criteria were decided during the protocol definition and refined during the
search process, in order to reduce the likelihood of bias. Of the 296 articles retrieved by the above
query, 187 were retained after the application of the inclusion and exclusion criteria. Finally, 15 papers were
excluded from the review process because they are review papers or systematic mapping studies; these are discussed separately in
Section 7. The remaining 172 papers are the subject of this review.
To ensure the quality of the research process, a number of activities were undertaken. Database searches were performed in anonymous (incognito) browsing mode to avoid conditioning by previous navigation, and the search for papers was extended to websites
accredited in PM for healthcare in order to minimize the threats of incompleteness that may result from the choice of search terms and search engines. The search, analysis, and evaluation of articles based on the initial queries were performed manually by the
first author and by at least one additional author. Any disagreements about the inclusion/exclusion of a paper were
resolved by all authors through discussion. Therefore, we believe that by following the well-established methodology proposed by Kitchenham (2004), we have ensured an adequate and inclusive basis for this
study and that the proportion of missing publications, if any, is negligible.
3 | PROCESS MINING
The goal of PM is to discover, monitor, analyze, improve, and predict real processes. Organizations that have limited
information on the processes they manage resort to PM to get a deeper insight into the details that characterize these
processes, such as performances, bottlenecks, involved resources, process deviations, and so on. This is because there
may be a significant gap between how a process is supposed to behave, and what actually happens.
Figure 1 depicts a taxonomy for PM according to which a PM task is characterized by six aspects, that are: Application, which identifies the purpose of the PM task; Technique, which identifies the algorithm used to achieve the
intended purpose; Tool, which identifies the technology used to implement the technique; Approach, which identifies
the steps followed to finalize the application; Process, which identifies the kind of process being analyzed; and Data,
which identifies the data source from which the event log has been extracted.
The most widespread PM applications are process discovery, conformance analysis, process enhancement, process performance measurement, and social network analysis (SNA). Process discovery is the task of inferring a model of the process under analysis by means of a discovery algorithm, which takes as input a log of process executions and outputs a
process model described in terms of a process modeling language. In a conformance analysis task, the behavior prescribed by
an a priori model is compared with the logged data in order to find where reality deviates from the model. Enhancement techniques make it possible to improve and/or extend an existing process model with the insights extracted from logged process executions. The scope of a process performance measurement task is to describe the process under analysis in terms of a set
of performance key indicators (PKI), defined with the aim of finding bottlenecks and inefficiencies. SNA makes it possible to discover the interaction patterns that characterize the actors involved in the process.
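As a minimal illustration of process discovery, the following Python sketch computes a directly-follows graph, one of the simplest models that can be inferred from a log. It is not one of the specific discovery algorithms covered by the reviewed papers, and the activity names and traces are invented for the example.

```python
from collections import Counter

def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    dfg = Counter()
    for trace in log:
        # Pair each activity with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg

# Hypothetical log: each trace is the ordered list of activities of one case.
log = [
    ["registration", "triage", "visit", "discharge"],
    ["registration", "triage", "x-ray", "visit", "discharge"],
    ["registration", "triage", "visit", "discharge"],
]
dfg = directly_follows(log)
```

Each key of `dfg` is an arc of the discovered graph and each value is its frequency, which is the kind of information a discovery tool typically renders as an annotated process map.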
PM techniques include algorithms for process discovery, conformance checking, process deviation, trace clustering,
and event clustering; but also data visualization methods such as dotted charts, and languages for the representation of
processes. Existing modeling languages visualize the content of a process execution either as declarative or procedural
process models. Procedural languages are Petri nets (Peterson, 1977), process trees, causal nets (Van Der Aalst,
Adriansyah, & Van Dongen, 2011), state machines, and BPMN (Chinosi & Trombetta, 2012), to name a few; declarative
languages are LTL (van der Aalst et al., 2005) and Declare (Pesic et al., 2007).
Various tools make it possible to implement or apply the above-mentioned PM techniques. Some are GUI-based, such as ARIS,1 Disco,2 ProM,3 PALIA-ER (Rojas, Fernandez-Llatas, et al., 2017), and Celonis.4 GUI-based tools
allow nonexpert PM users to deploy complete PM applications, but also provide experienced ones with a convenient
environment to implement ad hoc solutions. Other tools include programming language libraries, such as pMineR
(Gatta et al., 2017), bupaR (Janssenswillen et al., 2019), and pm4py (Berti et al., 2019).
FIGURE 1 A taxonomy for process mining
The typical approach followed to conduct a PM task consists of four steps: data collection, event log extraction, process discovery, and process analysis. The last step can serve different purposes; for example, the user may wish to
verify the compliance of the actual process execution with medical guidelines, to
analyze process performance, or to run a process simulation for a what-if analysis. However,
more specific PM frameworks have been proposed in the literature in the form of guidelines to be followed in order to conduct a sound process analysis. These include L*, which describes the life cycle of a typical PM project (Van Der
Aalst, Adriansyah, De Medeiros, et al., 2011); the PM Project Methodology (PM²), which includes iterations and consists of six key phases (Van Eck et al., 2015); and, in the context of healthcare, the DIAG approach (Araghi, Fontanili,
et al., 2018), which provides location awareness by means of data generated by location systems.
A PM task can also be characterized by means of processes and data when referring to a specific application context,
such as healthcare in our case. Indeed, the medical and organizational processes that take
place within a healthcare facility are disparate, as are the information systems from which process data are extracted. These two
aspects are described in more detail in Sections 4.1 and 4.2. Processes can be analyzed from different viewpoints, also
called perspectives:
• Control-flow, which focuses on the activities that are executed and the relationships of precedence among them
(in terms of preconditions and post-conditions);
• Organizational, (a.k.a. resource) which focuses on the actors that are involved, on their roles, and on how they are
mapped/assigned to the activities;
• Temporal, (a.k.a. performance) which focuses on the temporal aspect of process executions, such as their duration,
the temporal relations between activities, frequency of specific activities, and is typically considered for the detection
of bottlenecks and the estimation of performance indicators, such as throughput and average duration of process
executions;
• Data, which focuses on the data used and generated during the process execution, including the attributes that characterize the given enactment, and the attributes that characterize the triggered events.
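The temporal perspective listed above can be made concrete with a short Python sketch that derives the duration of each case and the average duration over all cases. The input layout, a flat list of (case identifier, activity, ISO 8601 timestamp) tuples, and the sample values are assumptions chosen for illustration.

```python
from datetime import datetime
from statistics import mean

def case_durations_hours(events):
    """Return, for each case, the hours elapsed between its first and last event."""
    first, last = {}, {}
    for case_id, _activity, timestamp in events:
        t = datetime.fromisoformat(timestamp)
        if case_id not in first or t < first[case_id]:
            first[case_id] = t
        if case_id not in last or t > last[case_id]:
            last[case_id] = t
    return {c: (last[c] - first[c]).total_seconds() / 3600 for c in first}

# Hypothetical events for two cases.
events = [
    ("c1", "registration", "2021-03-01T08:00:00"),
    ("c1", "discharge", "2021-03-01T12:00:00"),
    ("c2", "registration", "2021-03-01T09:00:00"),
    ("c2", "discharge", "2021-03-01T11:00:00"),
]
durations = case_durations_hours(events)
average_duration = mean(durations.values())
```

The same grouping, applied to pairs of consecutive activities instead of whole cases, yields the waiting times commonly used to locate bottlenecks.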
PM techniques assume that process executions are recorded in the form of an event log, that is, a log consisting of a
set of traces. Each trace represents a process case (i.e., a specific execution or instance of a process), and is characterized
by a set of attributes and by a temporally ordered sequence of events. Events, in turn, are characterized by a timestamp
and a set of event attributes describing the intrinsic properties of the event, for example, the activity's name, the performing
resource, the cost of executing the activity, the life cycle phase of the activity, the place of origin, and so forth. An XML-based
standard for event logs is the eXtensible Event Stream (XES) (IEEE, 2016), accepted as the standard format for the interchange of event log data between tools and application domains, and supported by the vast majority of PM tools.
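The log, trace, and event structure described above can be sketched as an XES-like XML document using only the Python standard library. The attribute keys `concept:name`, `org:resource`, and `time:timestamp` come from the standard XES extensions; everything else is a simplification, since the sketch omits the extension declarations, globals, and classifiers that a fully conformant XES file would carry, and the sample case is invented.

```python
import xml.etree.ElementTree as ET

def to_xes(log):
    """Serialize {case_id: [event dicts]} into a simplified XES-like XML string."""
    root = ET.Element("log")
    for case_id, events in log.items():
        trace = ET.SubElement(root, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        # ISO 8601 timestamps in a uniform format sort correctly as strings.
        for ev in sorted(events, key=lambda e: e["timestamp"]):
            node = ET.SubElement(trace, "event")
            ET.SubElement(node, "string", {"key": "concept:name", "value": ev["activity"]})
            ET.SubElement(node, "string", {"key": "org:resource", "value": ev["resource"]})
            ET.SubElement(node, "date", {"key": "time:timestamp", "value": ev["timestamp"]})
    return ET.tostring(root, encoding="unicode")

# Hypothetical case whose events arrive out of order.
log = {
    "case-1": [
        {"activity": "discharge", "resource": "nurse-2", "timestamp": "2021-03-01T12:00:00"},
        {"activity": "admission", "resource": "clerk-1", "timestamp": "2021-03-01T08:00:00"},
    ]
}
xes_text = to_xes(log)
```

In practice one would rely on an existing PM library to read and write XES; the sketch only shows how traces nest events and how event attributes are expressed as typed key and value pairs.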
The use of PM provides several benefits with respect to other data mining approaches when dealing with process
execution data. First, it combines the strengths of process modeling and data mining, thus automating the generation
of process models from raw data. Furthermore, with the help of data visualization methods and process model representation languages, such models can include a large set of process information such as different event relations and
temporal features. Other techniques such as sequential pattern mining and episode mining, while also sequence-oriented data mining techniques, focus only on the discovery of sequential orderings and partially ordered sets of events,
leaving out many process characteristics that can be useful for process management, such as (exclusive) choice relations, concurrency, and loops.
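To illustrate how process-level relations such as causality and concurrency can be read off a log, the following Python sketch builds a footprint of pairwise relations in the style used by the alpha algorithm. The alpha algorithm is named here only as a familiar reference point and is not attributed to any of the reviewed papers; the relation labels and the sample traces are choices made for this example.

```python
def footprint(log):
    """Classify every ordered pair of activities from directly-follows evidence."""
    follows = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}
    activities = sorted({a for trace in log for a in trace})
    relations = {}
    for a in activities:
        for b in activities:
            ab, ba = (a, b) in follows, (b, a) in follows
            if ab and ba:
                relations[(a, b)] = "parallel"       # observed in both orders
            elif ab:
                relations[(a, b)] = "causes"         # a directly precedes b only
            elif ba:
                relations[(a, b)] = "caused_by"      # b directly precedes a only
            else:
                relations[(a, b)] = "no_succession"  # never adjacent
    return relations

# Hypothetical log in which two examinations occur in either order.
log = [
    ["admit", "blood test", "x-ray", "discharge"],
    ["admit", "x-ray", "blood test", "discharge"],
]
relations = footprint(log)
```

Two activities that share a predecessor and are in the `no_succession` relation are candidates for an exclusive choice, while `parallel` pairs suggest concurrency; a sequential pattern miner would report the observed orderings without drawing either conclusion.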
On the other hand, PM can have drawbacks depending on the nature of the process being analyzed. For instance, in
complex scenarios process models may oversimplify the real processes, as they may fail to capture important aspects,
thus providing a naive view of reality (Van der Aalst, 2009); conversely, a process discovery task may generate an
overfitting model, that is, a model which perfectly describes a given set of process executions but is unable to accommodate
unexpected process deviations. Another common issue is the so-called spaghetti-like effect, the PM jargon term for a model that is hardly intelligible because it contains a high number of paths between starting and ending activities. These circumstances occur when dealing with unstructured processes, that is, when process traces do not form a
homogeneous group because they are very different from each other, or when logged events are too fine-grained; in both cases,
process discovery algorithms produce incomprehensible or unrepresentative models.
Sometimes, models for other perspectives can be created, such as flow times and social networks; however, it is very
unlikely that all of these can be folded into a meaningful single process model as the highly variable nature of the
control-flow does not allow it (Van der Aalst, 2011).
In these cases, PM must be supported by a domain expert who, depending on the objectives of the investigation,
brings the analysis to a different level of abstraction by inserting data into, or removing data from, the event log, so as to obtain a
clearer view of the process. The inclusion of a human in the process discovery task is known as interactive process
mining, which will be discussed later in this survey.
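One simple way to change the level of abstraction of an event log is to let the domain expert supply a mapping from fine-grained events to coarser activities. The Python sketch below applies such a mapping and collapses consecutive repetitions; the mapping, the activity names, and the `drop_unmapped` option are hypothetical and serve only to illustrate the idea.

```python
def abstract_log(log, mapping, drop_unmapped=False):
    """Relabel events with coarser activities and merge consecutive duplicates."""
    result = []
    for trace in log:
        coarse = []
        for activity in trace:
            if activity in mapping:
                label = mapping[activity]
            elif drop_unmapped:
                continue  # remove events the expert considers irrelevant
            else:
                label = activity
            # Keep a single event for a run of identical coarse labels.
            if not coarse or coarse[-1] != label:
                coarse.append(label)
        result.append(coarse)
    return result

# Hypothetical expert-defined mapping.
mapping = {
    "measure temperature": "vital signs",
    "measure blood pressure": "vital signs",
    "measure pulse": "vital signs",
}
log = [["admission", "measure temperature", "measure blood pressure",
        "visit", "measure pulse", "discharge"]]
coarse_log = abstract_log(log, mapping)
```

Running discovery on `coarse_log` instead of `log` reduces the number of distinct activities and paths, which is one way to mitigate the spaghetti-like effect discussed above.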
4 | PM IN HEALTHCARE
As already mentioned, a complicating factor for the application of PM in the healthcare domain is the highly complex
and variable nature of processes: a wide range of processes with different characteristics are executed daily; the delivery
of care services involves several departments and organizations, and requires collaboration between professionals with different skills; the flow of each healthcare process is determined by many factors, such as the unpredictable
outcome of treatments and the different interpretations physicians may give to a patient's symptoms. Additionally, the
amount of data registered in health information systems may be so large that it is difficult for organizations to be
aware of inefficiencies and bottlenecks. Furthermore, health information systems are updated daily with information related to different actors such as payers, patients, physicians, nurses, and other service providers, which prevents the integration of medical data and, consequently, the evaluation of healthcare quality.
In this context, PM aims to automatically identify models of patient care, of hospital staff practices, and of organizational
processes, and can help to carry out proactive process analysis in order to anticipate problems and to put in place countermeasures that enhance the quality of outsourced care services, for example, by shortening treatment times and waiting
times in emergency departments, and also by reducing healthcare costs.
Besides the identification and the consequent enhancement of actual processes, PM is widely used for the assessment of clinical guidelines. Clinical guidelines are general recommendations about care, treatments, and diagnosis,
defined a priori with the aim of guiding hospital staff in doing their job. However, many processes are defined in theory
but are hard to follow in practice due to the wide variety of unexpected scenarios that some events can lead to. Thus, a common practice is to perform a conformance analysis of actual processes against the clinical guidelines so as to find the
differences between what really happens and what was expected to occur, and then update/enrich the clinical guidelines
accordingly.
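A conformance analysis against a guideline can be sketched, in its simplest form, as a set of precedence rules checked on every trace. The Python example below is a hedged illustration: the two rules are invented for the example and are not medical guidance, and real conformance checking techniques (for instance, those based on replaying the log on a process model) are considerably richer.

```python
def violates_precedence(trace, before, after):
    """True if `after` occurs without `before` having occurred earlier in the trace."""
    seen_before = False
    for activity in trace:
        if activity == before:
            seen_before = True
        elif activity == after and not seen_before:
            return True
    return False

def check_guideline(log, rules):
    """Return, for each deviating case, the list of violated (before, after) rules."""
    report = {}
    for case_id, trace in log.items():
        broken = [rule for rule in rules if violates_precedence(trace, *rule)]
        if broken:
            report[case_id] = broken
    return report

# Hypothetical guideline rules: (activity that must come first, dependent activity).
rules = [("blood test", "antibiotics"), ("triage", "visit")]
log = {
    "c1": ["triage", "visit", "blood test", "antibiotics"],
    "c2": ["triage", "antibiotics", "blood test", "visit"],
    "c3": ["visit", "discharge"],
}
report = check_guideline(log, rules)
```

The resulting report lists which cases deviate and from which rule, which is the kind of evidence that can then be used either to correct practice or to revise the guideline.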
In this review, we propose the taxonomy shown in Figure 1 as a reference point for the characterization of PM tasks
carried out in a healthcare context. Healthcare Data and Processes are described in more detail in Sections 4.1 and 4.2,
respectively; Applications are discussed in Sections 5.1 and 5.2; Approaches are discussed in Section 5.3; Tools are described in Section 5.4. We separately discuss PM techniques and algorithms for each of these aspects.
4.1 | Healthcare data
Process-oriented commercial systems store process information by tracing the process related activities in the form of
an event log. Healthcare data, however, are typically archived by means of DBMS-like systems in a relational/
transactional format, which is not suitable for use in PM applications, and therefore requires a preliminary
preprocessing step to be turned into the form of an event log.
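The preprocessing step mentioned above amounts, at its core, to choosing a case identifier, an activity label, and a timestamp among the columns of the relational data, and then grouping and ordering the rows accordingly. The following Python sketch shows this under stated assumptions: the column names (`patient_id`, `event`, `recorded_at`) and the rows are hypothetical, and ties between identical timestamps are broken alphabetically by activity name.

```python
from collections import defaultdict
from datetime import datetime

def build_event_log(rows, case_col, activity_col, time_col):
    """Group rows by case identifier and order each group by timestamp."""
    traces = defaultdict(list)
    for row in rows:
        when = datetime.fromisoformat(row[time_col])
        traces[row[case_col]].append((when, row[activity_col]))
    # Sorting the (timestamp, activity) tuples yields a temporally ordered trace.
    return {case: [activity for _, activity in sorted(events)]
            for case, events in traces.items()}

# Hypothetical rows, as they might be returned by a query on a transactional table.
rows = [
    {"patient_id": "p2", "event": "discharge", "recorded_at": "2021-05-02T10:00:00"},
    {"patient_id": "p1", "event": "admission", "recorded_at": "2021-05-01T08:30:00"},
    {"patient_id": "p2", "event": "admission", "recorded_at": "2021-05-01T09:00:00"},
    {"patient_id": "p1", "event": "surgery", "recorded_at": "2021-05-01T14:00:00"},
]
event_log = build_event_log(rows, "patient_id", "event", "recorded_at")
```

The choice of the case identifier is the key design decision: grouping by patient, by admission, or by department produces different event logs and therefore different process models from the same data.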
The digital record technologies most used in the healthcare domain are hospital information systems (HIS), electronic medical records (EMR), and electronic health records (EHR). An EMR is the digital version of the patient's medical record, as it contains the patient's clinical history in digital form, and typically does not leave the doctor's
office. An EHR can be considered an augmented EMR, as it includes a broader view of a patient's care. Besides
the patient's medical history, an EHR contains information about all the staff involved in the patient's care, and it can
also share information with other healthcare facilities. A HIS is a wider information system that includes
several subsystems such as EMR, EHR, and Picture Archiving and Communication Systems (PACS), as well as other systems to support care/hospital management. It stores heterogeneous data, as it is designed to manage multiple operational aspects such as medical, administrative, and financial ones. In a HIS, thus, we can find disparate information related
to patient registration, admission, discharge, and transfer; but also finance functions such as billing, accounts
receivable, accounts payable, and some clinical documentation such as diagnoses and billing codes associated with
medical encounters and procedures (Piccialli et al., 2021).
FIGURE 2 Data sources used in the reviewed papers
As we will show in the next sections, the HIS, EHR, and EMR of healthcare facilities are not the only sources of data for
PM applications. In some works the event log has been synthetically generated following (Burattin, 2016; Burattin &
Alessandro Sperduti, 2010), extracted from video recordings (e.g., of a surgical operation; A. Grando et al., 2017;
Kelleher et al., 2014; Lira et al., 2019; S. Yang, Tao, et al., 2018), from body-sensor data (e.g., sequences of blood pressure
measurements; Fernandez-Llatas, Garcia-Gomez, et al., 2011; Kaymak et al., 2012; McGregor et al., 2011), from location
systems such as Real Time Location Systems (RTLS) logs, and from the Medical Information Mart for Intensive Care III
(MIMIC-III)5 open access dataset (Alharbi et al., 2018; A. P. Kurniati et al., 2018a, 2019; Marazza et al., 2020; Pika
et al., 2019; Rojas & Capurro, 2018).
An RTLS (Kamel Boulos and Berry, 2012) contains information on the location of process actors such as patients, clinicians, and medical machinery. Locations are determined by means of wireless technologies such as radio frequency identification (RFID) tags, integrated in patient identification bands or staff cards, and antennas deployed in the
hospital's departments. RTLS data have been used for PM to collect patients' movements (Araghi et al., 2019, 2020;
Araghi, Fontaili, et al., 2018; Araghi, Fontanili, et al., 2018; Fernandez-Llatas et al., 2013; Martinez-Millana
et al., 2019), to mine the workflow of medical equipment (Liu et al., 2014), to study a surgical process (Fernandez-Llatas et al., 2015), and to enhance event logs extracted from a HIS (Fernandez-Llatas et al., 2021; Martin, 2018).
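Raw RTLS data typically consist of repeated location readings, so a preprocessing step is needed before they can be used as process events. The Python sketch below turns a stream of readings into one event per zone change; the tuple layout (tag identifier, ISO 8601 timestamp, zone) and the sample readings are assumptions made for illustration, not the format of any specific RTLS product.

```python
def location_events(pings):
    """Collapse repeated readings into (zone, entry time) events for each tag."""
    visits = {}
    # Process readings per tag in chronological order.
    for tag_id, timestamp, zone in sorted(pings, key=lambda p: (p[0], p[1])):
        trail = visits.setdefault(tag_id, [])
        # Emit an event only when the tag enters a different zone.
        if not trail or trail[-1][0] != zone:
            trail.append((zone, timestamp))
    return visits

# Hypothetical readings for one patient wristband.
pings = [
    ("tag7", "2021-01-01T10:05:00", "waiting room"),
    ("tag7", "2021-01-01T10:00:00", "waiting room"),
    ("tag7", "2021-01-01T10:20:00", "radiology"),
    ("tag7", "2021-01-01T10:40:00", "waiting room"),
]
movements = location_events(pings)
```

Each tag then plays the role of a case and each zone entry the role of an activity, so the output can be fed to the same discovery and performance analyses used for logs extracted from a HIS.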
MIMIC-III is an open access healthcare database which contains obfuscated data of about 46,000 intensive care unit
patients who were housed at the Beth Israel Deaconess Medical Center in Boston, USA, between June 2001 and October
2012. The data were extracted from the hospital's EHR, and can be used to create a relational database with 26 tables,
16 of which contain timestamped data useful for building patient workflows. The dataset includes body measurements,
medications, laboratory measurements, observations and notes charted by clinicians, procedure codes, diagnostic codes,
images, hospital length of stay, and more, providing a comprehensive example of EHR data from a large and busy hospital (A. E. W. Johnson, Pollard, et al., 2016).
Figure 2 shows the percentage of reviewed papers in which each repository was used as the data source for the experiments
evaluating the proposed PM application.
4.2 | Healthcare processes
Rojas et al. (2015) categorized healthcare processes into two types: the medical treatment processes, which include activities from diagnostics to the execution of patient relief activities, and the organizational processes, which focus on the
collaborative information of the healthcare professionals and their organizational units. Another classification
(R. Mans et al., 2015) divided operational healthcare processes (i.e., “orchestration and management of the care processes
rather than individual activities”) into two types: nonelective care, including medical emergencies (i.e., “patient for whom
medical treatment is unexpected and needs to be planned on short notice”); and elective care, which includes scheduled
standard, routine and nonroutine procedures (i.e., “care for which it is medically sound to postpone treatment for days or
weeks”).
In this review, we propose a further categorization of healthcare processes to answer RQ2, which specializes the one proposed by Rojas et al. (2015). Our categorization, however, does not rely on any a priori knowledge about
the organization of healthcare facilities and medical treatments; rather, it is the result of classifying the processes analyzed in the reviewed papers. Overall, it consists of seven types of processes, each of a medical (M) and/or organizational (O) nature:
• Clinical pathway (M): is the sequence of patients' clinical activities, such as medical examinations, drug prescriptions,
surgical operations, symptoms, and so forth, which determines the clinical history of a patient. This type of process is
typically considered in a PM task for prediction purposes, that is, to determine all the possible outcomes of a disease
(each with an associated probability) given the sequence of clinical activities that have occurred in the past. These
data are usually retrieved from a HIS or an EHR.
• Patient pathway (M–O): is the sequence of steps a patient undergoes within a healthcare facility, such as registration, visit, diagnosis received, prescription received, hospitalization or discharge, and the departments the patient has
passed through. Often, patient pathways are analyzed with the purpose of determining the process performance in a
particular department (e.g., an emergency department) by means of performance indicators such as the average
length of stay of a patient. As for clinical pathways, patient pathway data are typically stored in HIS
and EHR.
• Patient behavior (M): is a sequence of activities performed on the patient's body, such as temperature measurements,
body weight measurements, values of a specific blood component, but also related to the patient's behavior, such as
what or how much the patient has eaten over time, or other data related to the patient's habits. Processes of this kind
are analyzed mainly to predict disease trajectories and build disease/patient models. Patient behavior data can be
retrieved by means of body-sensor measurements, video recordings, or directly from the patient.
• Personnel interactions (O): is the sequence of interactions between hospital staff within a specific environment. The
analysis of this kind of process is called SNA, and is typically adopted for the discovery of interaction patterns
between hospital staff or different departments. These data can be gathered from the hospital HIS/EHR, but also by
means of an RTLS.
• Medical processes (M): is the sequence of the activities that identify a specific medical process, such as a surgical operation, or a resuscitation. Such processes are analyzed with the aim of improving process performances, and the
sequence of activities is typically manually extracted from video recordings.
• Human movements (O): is the sequence of places in a specific department or portion of the hospital through which a
patient or a member of the medical staff passes during a certain time slot. Discovering the model of human trajectories has been mainly used with the aim of improving the hospital/department layout. The data are typically gathered
by means of an RTLS.
• Personnel tasks (O): is the sequence of activities performed by medical staff, such as the interactions with information
systems such as HIS, EHR, and PACS. These processes are analyzed to study the temporal aspect of the process by characterizing
the frequency of execution of tasks over time and detecting possible trends, cyclic behavior, or autocorrelation between
tasks, but also for clustering analysis aimed at separating, detecting, and studying regular behavior, variants, and
infrequent behavior in the habits of health professionals. These data can be obtained from HIS/EHR, RTLS,
or PACS.
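As an illustration of the performance indicators mentioned for patient pathways, the following minimal Python sketch computes the average length of stay from an event log. The log layout (case identifier, activity, timestamp), the sample data, and the function name are assumptions made for this example and are not taken from any of the reviewed papers.

```python
from datetime import datetime
from statistics import mean

# Hypothetical patient-pathway log: (case_id, activity, timestamp).
EVENTS = [
    ("p1", "registration", "2021-03-01 08:00"),
    ("p1", "visit",        "2021-03-01 09:30"),
    ("p1", "discharge",    "2021-03-01 12:00"),
    ("p2", "registration", "2021-03-01 10:00"),
    ("p2", "visit",        "2021-03-01 10:45"),
    ("p2", "discharge",    "2021-03-01 16:00"),
]

def length_of_stay_hours(events):
    """Return {case_id: hours between the first and last event of the case}."""
    first, last = {}, {}
    for case_id, _activity, ts in events:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        first[case_id] = min(first.get(case_id, t), t)
        last[case_id] = max(last.get(case_id, t), t)
    return {c: (last[c] - first[c]).total_seconds() / 3600 for c in first}

stays = length_of_stay_hours(EVENTS)
average_stay = mean(stays.values())  # average length of stay, in hours
```

Here the length of stay is approximated by the time between the first and last recorded event of each case; a real analysis would need to decide which activities mark admission and discharge.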
Figure 3 shows the percentage of reviewed papers in which each process category was taken into account.
5 | LITERATURE DISCUSSION
To answer RQ1, here we characterize the state of the art of PM in the healthcare domain according to the taxonomy
reported in Figure 1. We separately discuss the PM applications and for each application we report on the adopted techniques. Then we discuss the PM tools used for the implementation or deployment of PM techniques, and the PM methodologies followed for the accomplishment of PM tasks (process and data aspects have already been discussed in
Sections 4.1 and 4.2). For each of these aspects we report on the work done by the research community. In the Appendix, Table A1 summarizes the classification of the surveyed papers.
FIGURE 3 Processes analyzed in the reviewed papers
5.1 | Preprocessing: The initial step of PM applications
In the best of cases, event logs are generated by Process Aware Information Systems, which are specifically programmed
to detect and register process events in a form suited for PM purposes. However, in most cases event logs are the
result of a preprocessing task performed by a human on raw data stored in different forms (e.g., relational DB, video
recordings, hand notes, etc.), in different places, such as the multiple components of a HIS (e.g., RTLS, EHR, administrative data, etc.), and from different, non-synchronized sources (e.g., different hospitals). Collecting, transforming,
merging, organizing, cleaning, and enhancing data are thus fundamental steps of any PM task, which we collectively refer to as
preprocessing. Without any kind of data preprocessing, a PM task conducted on a poor-quality event log may return
biased information about the process under analysis.
In this review, we distinguish between two types of preprocessing, namely Event Log Extraction and Preparation
and Event Log Quality Improvement. The former covers methods for generating event logs from raw data and
for selecting relevant data from event logs; the latter covers techniques that improve event log quality by
removing “dirty” data (e.g., outliers) or by including additional information, both with the aim of improving the performance of subsequent PM tasks.
5.1.1 | Event log extraction and preparation
Binder et al. (2012) provided different approaches to exploit data that are not already available in temporally ordered
event logs, and developed a prototype for structured data acquisition called PTDocs. PTDocs was tested on a billing-oriented view of medical patient treatments, which does not provide direct information about administered treatments.
However, by cross-referencing medication data recorded at the pharmacy, the underlying treatments were
reconstructed. Perimal-Lewis et al. (2014) showed step by step how to convert a set of comma-separated files into a
single event log, where cases were the patient pathways within an emergency department. García et al. (2015) developed software that can be used by nonexpert users to generate event logs from a HIS without relying on other
external tools, and with the same characteristics, format, and structure as those obtained with the XESame tool (Verbeek
et al., 2010). Naeem et al. (2017) showed how to convert nonevent biomedical data into events by querying across multiple dataset tables. Then, they proposed the “LOG Generator” algorithm, used to generate an event log from the
preprocessed event data. Metsker et al. (2017) developed text mining-based algorithms for the interpretation of medical records as a potential solution to the problem of knowledge retrieval from EHR. Rinner et al. (2018)
described a data preparation method that uses a naming convention based on “time boxing” for recurring events to
model the time aspects used in medical guidelines. Each activity is associated with a “time box” (the fixed time period in which it
falls), each time box corresponds to an event in the medical guideline, and the events in each time box are
named after the time box. In this way, they were able to extract an event log by cross-referencing data
from clinical guidelines and an EHR in the context of melanoma surveillance. Vogelgesang and Appelrath (2013)
introduced the concept of “factors of influence,” that is, data dimensions which can be used to form a multidimensional
data cube. A separation of the relevant data is desirable to mine different models for different groups of patients, the
selection of which can be considered as OLAP (OnLine Analytical Processing) operations on the multidimensional
cube. PM is then performed on a data slice selected by OLAP operations.
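The extraction step discussed in this section, in which several raw sources are merged into a single, temporally ordered event log, can be sketched as follows. This is only an illustrative Python sketch and not the procedure of any cited paper: the two CSV extracts, their column names, and the helper functions are hypothetical.

```python
import csv
import io
from datetime import datetime

# Two hypothetical CSV extracts (column names are illustrative assumptions).
ADMISSIONS_CSV = """patient_id,event,time
17,triage,2021-05-02T10:15
17,admission,2021-05-02T11:00
42,triage,2021-05-02T09:50
"""
LAB_CSV = """patient_id,test,time
17,blood test,2021-05-02T10:40
42,x-ray,2021-05-02T10:05
"""

def read_events(text, activity_column):
    """Map each CSV row to a (case_id, activity, timestamp) tuple."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        (r["patient_id"], r[activity_column], datetime.fromisoformat(r["time"]))
        for r in rows
    ]

def build_event_log(*sources):
    """Merge all sources, ordering events by case and by time within a case."""
    events = [event for source in sources for event in source]
    return sorted(events, key=lambda e: (e[0], e[2]))

event_log = build_event_log(
    read_events(ADMISSIONS_CSV, "event"),
    read_events(LAB_CSV, "test"),
)

# One trace (ordered list of activities) per patient.
traces = {}
for case_id, activity, _timestamp in event_log:
    traces.setdefault(case_id, []).append(activity)
```

The sketch assumes that the patient identifier can serve as the case identifier and that the clocks of the two sources are synchronized, which, as noted in Section 5.1, is often not the case in practice.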
5.1.2 | Event log quality improvement
Event logs may suffer from different types of inaccuracies such as missing timestamps, sequences of events stored in
nonchronological order, missing resources, outliers, noise, and so on. Besides, the high complexity of healthcare processes makes the event log a set of heterogeneous data that may lead to the generation of unintelligible process models. The
following papers proposed solutions to overcome such issues.
Alharbi et al. (2017) proposed an approach to filter out the outlier events of a repeated activity using an interval-based
selection method as a preprocessing step to improve pattern detection. The method reduces variation in
clinical pathway data by reducing the number of repeated events while still preserving the mainstream temporal pattern.
It uses the interval pattern of the repeated activity as a threshold to remove outlier events. With the aim of
avoiding the “spaghetti-like” effect in process discovery tasks, the same authors (Alharbi et al., 2018) proposed an event
abstraction method in which the event log is enriched with HMM-derived states, so as to remodel the
healthcare processes as state transitions. Martin (2018) showed how indoor location system (ILS) data can be used to
attenuate data quality issues present in the event logs extracted from a HIS. In particular, he proposed various ways of
cross-referencing data from an ILS and a HIS. O. A. Johnson et al. (2018) presented the ClearPath method, which
extends the PM² methodology with a process simulation task to address data quality issues. As noisy environments and
building structures can interfere with RTLS signals and affect their accuracy, Fernandez-Llatas et al. (2021) proposed the
use of the interactive PM paradigm to support the semiautomated correction of RTLS data. To this end, they presented an “interactive trace correction” method that uses an edit distance framework to correct the data.
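To give an idea of how an edit distance can support trace correction, the Python sketch below computes the Levenshtein distance between sequences of location labels and selects the reference trace closest to an observed one. It is a generic illustration and not the method of Fernandez-Llatas et al. (2021); the reference traces and the function names are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x with y, or match
            ))
        prev = curr
    return prev[-1]

def closest_reference(trace, references):
    """Return the reference trace with the smallest edit distance to `trace`."""
    return min(references, key=lambda ref: edit_distance(trace, ref))

# Hypothetical RTLS trace with one implausible reading ("corridor").
observed = ["entrance", "triage", "corridor", "ward"]
references = [
    ["entrance", "triage", "box", "ward"],
    ["entrance", "waiting room", "exit"],
]
suggestion = closest_reference(observed, references)
```

In an interactive setting, such a suggestion would be shown to a domain expert, who decides whether to accept the correction.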
Activity-based PM, introduced by Fernandez-Llatas et al. (2010), is an interactive preprocessing technique
designed for healthcare contexts that enriches the event log with the activity
outcomes (i.e., information arising from the result of the actions taken), with the aim of obtaining more explanatory process
models. The term “interactive” means that the log enrichment task is conducted by a human expert alongside other automated procedures (with which the human interacts). An example of an Activity-based PM algorithm is PALIA
(Fernandez-Llatas et al., 2010), used for process discovery. Ibanez-Sanchez et al. (2019) demonstrated the applicability
of this methodology on emergency department processes, following a question-driven, interactive PM approach. As
Activity-based techniques are based on discrete labels, Fernandez-Llatas et al. (2014) coupled activity-based PM with a
temporal abstraction framework to create high-level labels. Temporal abstraction techniques derive high-level
concepts from timestamped data by transforming the timestamped representation of raw data into an interval-based description of time series. The authors argued that these high-level, discrete labels overcome some limitations introduced by the use of continuous variables; in particular, they enable experts to better understand how the process is being
deployed.
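The idea of temporal abstraction, turning timestamped values into labeled intervals, can be illustrated with the short Python sketch below. The thresholds, the labels, and the sample readings are purely illustrative assumptions and do not reproduce the framework used by Fernandez-Llatas et al. (2014).

```python
def abstract_intervals(samples, label_of):
    """Turn (timestamp, value) samples into (start, end, label) intervals by
    merging consecutive samples that map to the same discrete label."""
    intervals = []
    for timestamp, value in samples:
        label = label_of(value)
        if intervals and intervals[-1][2] == label:
            start, _end, _label = intervals[-1]
            intervals[-1] = (start, timestamp, label)  # extend current interval
        else:
            intervals.append((timestamp, timestamp, label))  # open a new one
    return intervals

def temperature_label(celsius):
    # Illustrative thresholds only, not clinical guidance.
    if celsius >= 38.0:
        return "fever"
    if celsius < 36.0:
        return "low"
    return "normal"

# Hypothetical readings: (hours since admission, body temperature in Celsius).
SAMPLES = [(0, 36.8), (4, 37.1), (8, 38.4), (12, 38.9), (16, 37.2)]
intervals = abstract_intervals(SAMPLES, temperature_label)
```

The resulting discrete labels (e.g., "fever" from hour 8 to hour 12) can then be used as activity labels in an event log.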
Activity clustering is a preprocessing technique aimed at assigning the same label to similar activities. This
yields more understandable process models and is particularly useful when dealing with data coming
from heterogeneous sources, such as the several information systems adopted in healthcare contexts (e.g., HIS, EHR, EMR,
etc.). Similar activities are, for instance, those that share some characteristics, or those that have different names but the same
semantics.
Lismont et al. (2016) applied two clustering methods, namely, the pattern abstraction technique (Jagadeesh Chandra Bose & Van der Aalst, 2009), and the activity clustering mining proposed by Günther and van der Aalst (2006). The
former groups activities that frequently appear together; the latter identifies activities that always happen in series and
that share some characteristics. Duma and Aringhieri (2017) classified events into 15 classes, then merged consecutive events of the same event class because such repetitions are irrelevant from a control-flow perspective. Stefanini et al. (2016) manually
grouped activities having different names but the same semantics. In Naeem et al. (2017), clusters were discovered
based on time aspects, and activities with a lower frequency of occurrence were grouped as one. With the aim of
dealing with the problems related to batch processing (i.e., the recording of several events with the same timestamp, such
as a group of laboratory results received at the same time), Alharbi et al. (2017) proposed grouping batched events into a
single event.
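Two of the simplifications described above, merging consecutive events of the same class and grouping events that share a timestamp, can be sketched in Python as follows. The function names and the "+"-joined label given to a batch are assumptions made for illustration; the cited papers may label merged events differently.

```python
from itertools import groupby

def merge_consecutive(trace):
    """Collapse runs of identical consecutive labels into a single event."""
    return [label for label, _run in groupby(trace)]

def group_batches(events):
    """Merge events that share a timestamp into one event whose label
    combines the sorted labels of the batch."""
    ordered = sorted(events, key=lambda e: e[0])
    grouped = []
    for timestamp, batch in groupby(ordered, key=lambda e: e[0]):
        labels = sorted({label for _ts, label in batch})
        grouped.append((timestamp, "+".join(labels)))
    return grouped

# Hypothetical trace of event classes and hypothetical batched lab results.
trace = ["visit", "lab", "lab", "lab", "visit"]
lab_events = [(10, "sodium"), (10, "glucose"), (12, "x-ray")]

merged_trace = merge_consecutive(trace)
batched = group_batches(lab_events)
```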
Topic modeling (TM) (D. Blei et al., 2010) is a clustering approach based on latent Dirichlet allocation (LDA)
(D. M. Blei et al., 2003) that has been widely used for activity clustering in medical event logs. LDA is an unsupervised
algorithm that groups a set of documents into K different topics, each of which represents a set of words. The
objective is to find a correspondence between documents and topics, such that the words in each document are captured by the K topics. Translated into medical terms, each patient trace is modeled as a multinomial distribution over
topics, and each topic is modeled as a multinomial distribution over treatment activities. TM can be implemented with
the R library topicmodels. Among the surveyed papers, those that adopted the TM approach are Huang et al. (2014),
Stefanini et al. (2016), X. Xu et al. (2017), Chiudinelli et al. (2020), and Prokofyeva and Zaytsev (2020).
Whereas Chiudinelli et al. (2020), Huang et al. (2014), and Prokofyeva and Zaytsev (2020) applied TM to
medical event logs, X. Xu et al. (2017) used billing data as a reflection of medical behaviors, and used LDA to organize
billing items into high-granularity topics. They start from the assumption that the billing items used for a patient are goal-oriented, and thus that the different items that occurred in 1 day represent the goals of that day.
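The correspondence between LDA and patient traces described above can be written as the standard LDA generative process. In the notation below, which is introduced here for illustration and is not taken from the reviewed papers, d indexes patient traces, n indexes the activities within a trace, k = 1, ..., K indexes topics, and α and β are the Dirichlet hyperparameters.

```latex
\begin{align*}
\phi_k   &\sim \mathrm{Dirichlet}(\beta),  & k &= 1,\dots,K
  && \text{(topic $k$: distribution over treatment activities)}\\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & &
  && \text{(trace $d$: distribution over topics)}\\
z_{d,n}  &\sim \mathrm{Multinomial}(\theta_d), & &
  && \text{(topic assigned to the $n$-th activity of trace $d$)}\\
w_{d,n}  &\sim \mathrm{Multinomial}(\phi_{z_{d,n}}), & &
  && \text{(observed activity)}
\end{align*}
```

Fitting the model amounts to estimating θ and φ from the observed activities w; activities that receive high probability under the same topic can then be treated as one cluster.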
According to the Semantic PM paradigm, an event log is annotated with the labels of a predefined ontology. An
ontology consists of the concepts, categories, and their properties and relationships that characterize a subject area. Like
activity clustering and topic modeling, semantic PM is a preprocessing method that simplifies the collected data, in this case by means of semantic labeling. The ontology can be manually created from scratch, or it
can be automatically mined from the event log and then manually improved. Jangi et al. (2019) reviewed six papers to
show the use of semantic PM for hospital processes. The ontology used by Antonelli and Bruno (2015) is modeled as a
UML class diagram, and uses the ICD-10 vocabulary for disease classification, the MDC vocabulary for the major diagnostic categories, the DRG vocabulary for diagnosis-related groups, and the ATC vocabulary for drug classification. In
Montani et al. (2017), ground terms were used to represent patient management actions, while abstracted terms were
used for medical goals. Then, a rule-based mechanism was used to identify which ancestor of an action in the ontology
had to be considered for abstracting the action itself. Finally, consecutive actions on the trace with the same ancestor were
merged and labeled as the common ancestor. M. A. Grando et al. (2011) proposed a semantic PM approach to check the
conformance of computer-interpretable guidelines (CIG) with the related medical recommendations. Dewandono
et al. (2013) proposed a method that combines ontology and PM to give recommendations in cases of diabetes treatment. The ontology, implemented in the OWL description logic, is used to match the data of a patient with those of
healthy patients, while conformance checking algorithms are used to compare the process model of a patient with
the process model of healthy patients. Detro et al. (2017) proposed a semantic PM framework for selecting the appropriate process variant according to the patient's symptoms by means of ontologies based on known expertise.
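The ontology-based abstraction described for Montani et al. (2017), in which each action is replaced by an ancestor and consecutive actions with the same ancestor are merged, can be sketched in Python as follows. The ontology fragment is hypothetical, and the choice of ancestor is reduced here to a fixed number of levels, whereas the original work uses a rule-based mechanism.

```python
# Hypothetical fragment of an ontology, stored as child -> parent.
PARENT = {
    "aspirin": "antiplatelet therapy",
    "clopidogrel": "antiplatelet therapy",
    "antiplatelet therapy": "stroke prevention",
    "ct scan": "neuroimaging",
    "mri": "neuroimaging",
    "neuroimaging": "diagnosis",
}

def ancestor(term, levels=1):
    """Climb `levels` steps up the ontology (stopping at the root)."""
    for _ in range(levels):
        term = PARENT.get(term, term)
    return term

def abstract_trace(trace, levels=1):
    """Replace each action by its ancestor and merge consecutive duplicates."""
    abstracted = []
    for action in trace:
        label = ancestor(action, levels)
        if not abstracted or abstracted[-1] != label:
            abstracted.append(label)
    return abstracted

trace = ["ct scan", "mri", "aspirin", "clopidogrel"]
one_level = abstract_trace(trace, levels=1)
two_levels = abstract_trace(trace, levels=2)
```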
5.2 | PM applications in the healthcare domain
In this section, we discuss the PM applications presented in the reviewed papers. Figure 4 shows the percentage of
reviewed papers that address each application.
5.2.1 | Process discovery
The process discovery task consists in applying a process discovery algorithm with the aim of inferring a model of the
process from the event log. In the healthcare context, many of the medical and organizational processes are standardized, that is, clinical guidelines are provided by medical organizations or government institutions to the hospital staff so
that they know what to do in specific situations, and healthcare services can be outsourced uniformly by the healthcare
facilities that belong to the same community (e.g., states, private organizations, etc.). However, such guidelines are difficult to follow strictly given that, for example, patients suffering from the same disease may require different treatments
depending on how their body reacts to certain drugs and medications. For this reason, processes carried out within the
same medical context may differ from each other in terms of the control-flow, organizational, and temporal perspectives.
Thus, the process discovery task is of paramount importance for physicians and administrators, as it provides a
broad view of what actually happens, and possibly makes it possible to rectify or enrich guidelines accordingly.
The literature on process discovery can be divided into two main strands, namely automated and interactive discovery
of process models. An automated discovery task makes use of a process discovery algorithm that takes the
event log as input and outputs a process model expressed in a process modeling language, without using any a priori knowledge.
FIGURE 4 Applications proposed in the reviewed papers
An interactive discovery task, in addition, takes advantage of a domain expert to discover the process model, exploiting
the expert's domain knowledge along with the event log (Benevento, Dixit, et al., 2019). Table 1 provides a quick overview of
the work divided according to the above classification.
Before discussing the main contributions in the field of automated discovery of process models, we briefly introduce
the most used process discovery algorithms.
The α miner (Van der Aalst et al., 2004) is one of the first process discovery algorithms to appear in the PM literature.
It examines causal relationships between events, and constructs a Petri net where each transition corresponds to an
observed event. It is able to discover concurrency, loops, and points of choice, but it focuses only on the control-flow perspective, ignoring the temporal, data, and organizational perspectives.
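The causal relationships examined by the α miner are derived from the "directly follows" relation between activities. The Python sketch below computes this relation and the resulting footprint (causality, parallelism, choice) for a small, made-up log. It covers only this first step of the algorithm; the construction of the Petri net places is omitted, and the function names are assumptions.

```python
from itertools import product

def directly_follows(log):
    """Set of pairs (a, b) such that b immediately follows a in some trace."""
    return {(a, b) for trace in log for a, b in zip(trace, trace[1:])}

def footprint(log):
    """Classify every ordered pair of activities by its ordering relation."""
    df = directly_follows(log)
    activities = sorted({a for trace in log for a in trace})
    relation = {}
    for a, b in product(activities, repeat=2):
        if (a, b) in df and (b, a) not in df:
            relation[a, b] = "->"   # causality: a is followed by b, never the reverse
        elif (b, a) in df and (a, b) not in df:
            relation[a, b] = "<-"   # reverse causality
        elif (a, b) in df and (b, a) in df:
            relation[a, b] = "||"   # both orders observed: parallelism
        else:
            relation[a, b] = "#"    # never adjacent: choice or unrelated
    return relation

# Hypothetical log with three distinct traces.
LOG = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["a", "e", "d"]]
relations = footprint(LOG)
```

In this log, b and c occur in both orders and are therefore classified as parallel, while b and e never follow one another and are classified as alternatives.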
The heuristic miner (Weijters et al., 2006) improves on the α algorithm by considering frequencies; it can therefore filter
out infrequent behavior and noise. It is usually adopted for event logs that do not contain too many distinct events. The algorithm first constructs a graph characterized by a frequency-based metric that indicates the likelihood that one activity
depends on another. Then, a dependency matrix is constructed and the all-activities connected heuristic (i.e., choosing
the best candidate within all connected activities) is applied to the matrix to extract a process model. Some specific heuristics are used to exclude from the model those activities that take place under certain conditions that are not explanatory of the process, and to alleviate the duplicate tasks and the long distance dependencies problems.
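The frequency-based dependency metric of the heuristic miner is commonly defined as (|a>b| − |b>a|) / (|a>b| + |b>a| + 1), where |a>b| is the number of times b directly follows a in the log. The Python sketch below computes the resulting dependency matrix for a made-up log; the function name and the data are assumptions, and the later steps of the algorithm (thresholds, long-distance dependencies) are not shown.

```python
from collections import Counter

def dependency_matrix(log):
    """Dependency measure for every ordered pair of activities.

    For a != b: (|a>b| - |b>a|) / (|a>b| + |b>a| + 1).
    For a == b (length-one loops): |a>a| / (|a>a| + 1).
    """
    counts = Counter((a, b) for trace in log for a, b in zip(trace, trace[1:]))
    activities = sorted({a for trace in log for a in trace})
    dependency = {}
    for a in activities:
        for b in activities:
            ab, ba = counts[a, b], counts[b, a]
            if a == b:
                dependency[a, b] = ab / (ab + 1)
            else:
                dependency[a, b] = (ab - ba) / (ab + ba + 1)
    return dependency

# Hypothetical log: nine traces a-b-c and one noisy trace a-c-b.
LOG = [["a", "b", "c"]] * 9 + [["a", "c", "b"]]
dependency = dependency_matrix(LOG)
```

Values close to 1 indicate a strong dependency of the second activity on the first; the single noisy trace lowers the score of the pair (b, c) only slightly, which is what allows infrequent behavior to be filtered out with a threshold.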
The inductive miner (Leemans et al., 2013) is an algorithm built to improve on the performance of the α miner and the heuristic miner. It works by repeatedly finding a split in the event log and detecting the logical operator that
describes the split. This algorithm outputs a process tree, a hierarchical structure in which the leaves are activities
and the internal nodes are the operators (e.g., sequence, exclusive choice, parallelism, loop) that describe how their
child subtrees are ordered and combined.
The fuzzy miner (Günther & Van Der Aalst, 2007) is the first algorithm that addresses the problems of large numbers of activities and highly unstructured behavior. It uses significance/correlation metrics to simplify the process
model at the desired level of abstraction. Besides the significance and correlation of two events directly following one
another, it can also measure longer-term relationships, and leave out less important activities (or hide them in clusters).
The fuzzy model is designed to allow aggregation and abstraction so as to simplify the observed behavior with the aim
of making it more understandable. However, it must be considered that there is a trade-off between model simplification and description of the actual behavior.
Genetic mining algorithms (de Medeiros et al., 2007) try to learn processes characterized by invisible tasks, and by
situations where there is a mixture of choice and synchronization. In the initial population, each individual is represented as a matrix whose elements are the probabilities that a relation exists between pairs of activities. Each individual
is then assigned a fitness value that measures its capability of matching the log traces, and a new population is generated by
means of genetic operators such as crossover, mutation, and elitism selection. This is repeated until the end of the evolution process; finally, the individual that best fits the log traces is selected as the final solution.
TABLE 1 List of papers divided per discovery approach
Discovery approach | References
Automated | (Abo-Hamad, 2017; Alharbi et al., 2018; Alvarez et al., 2018; Amantea et al., 2020; Antonelli & Bruno, 2015; Araghi et al., 2020; Araghi, Fontaili, et al., 2018; Baker et al., 2017; Bouarfa & Dankelman, 2012; Caron et al., 2011; Caron et al., 2014; B. Chen et al., 2021; Chiudinelli et al., 2020; Cho et al., 2014; Dagliati et al., 2014; De Oliveira et al., 2020; De Weerdt et al., 2012; Duma & Aringhieri, 2020; Fei et al., 2010; Fernandez-Llatas et al., 2015; Fernandez-Llatas et al., 2010; Fernandez-Llatas, Garcia-Gomez, et al., 2011; Furniss et al., 2016; Garg & Agarwal, 2016; A. Grando et al., 2017; H. Xu, Pang, Yang, Jinghui, et al., 2020; Hendricks, 2019; Huang et al., 2012, 2013; Jaroenphol et al., 2015; Kim et al., 2013; Kirchner & Markovic, 2018; Kukreja & Batra, 2017; Leonardi et al., 2019; R. Mans, Reijers, et al., 2012; Meneu et al., 2013; Mertens et al., 2018; Miclo et al., 2015; Montani et al., 2013; Naeem et al., 2017; Neumuth et al., 2011, 2012; Perimal-Lewis et al., 2014; Placidi et al., 2021; Poelmans et al., 2010; Prodel et al., 2015; Rojas & Capurro, 2018; Stefanini et al., 2016; Valero-Ramon et al., 2019; X. Xu et al., 2017; S. Yang, Li, et al., 2017; S. Yang, Zhou, et al., 2017; Lee & Rismanchian, 2018; Zhang & Chen, 2012; M. Zhou et al., 2017)
Automated with clustering | (de Toledo et al., 2019; Delias et al., 2014, 2015; Huang et al., 2014; Kovalchuk et al., 2018; Lakshmanan et al., 2013; Lismont et al., 2016; Najjar et al., 2018; Pebesma et al., 2019; Prokofyeva & Zaytsev, 2020; Ronny et al., 2015; Lu et al., 2019; S. Yang, Tao, et al., 2018)
Interactive | (Benevento, Dixit, et al., 2019; Valero-Ramon, Fernandez-Llatas, Martinez-Millana, & Traver, 2020; Valero-Ramon, Fernandez-Llatas, Valdivieso, & Traver, 2020)
ILP-based PM techniques (Jan Martijn et al., 2008) provide precise results by design, as the user can explicitly define
the constraints a process model must satisfy to fit the log traces exactly. However, they do not scale well, as the
simplex algorithm solves linear programs in a time that is, in the worst case, exponential in the number of variables (activities). ILP-based discovery algorithms mine causal dependencies between activities that are detected in the event log. However, this approach
performs well only under the assumption that the process under analysis shows frequent behavior, and is therefore
ineffective in describing low-frequency exceptional behavior. The algorithm proposed by Jan Martijn et al. (2008) constructs a Petri net as a solution of the linear program, whose objective function expresses a preference for finding the Petri net places.
The PALIA algorithm (parallel activity-based log inference algorithm) (Fernandez-Llatas et al., 2010) is an activity-based PM algorithm. It first builds a graph taking into account the process parallelism; then, for each node, the
algorithm verifies whether its posterior branches are equivalent and, if so, fuses the corresponding nodes and transitions. Finally, repeated
transitions and unused nodes are deleted. This algorithm has also been used in the context of interactive process discovery, and follows an Activity-based strategy, which means that it uses activities along with their outcomes to generate state
transition models.
Some of the first works on the automated discovery of process models proposed an HMM-based algorithm (Poelmans et al., 2010)
and the heuristic and genetic miners (Fei et al., 2010) for the discovery of patient pathways in the fields of
oncology and surgery, respectively. The following year, Neumuth et al. (2011) presented a method to calculate a medical
process model as a statistical “mean” intervention model, called generic Surgical Process Model (gSPM), from a number
of surgical interventions. Caron et al. (2011) used the heuristic miner for the study of cancer treatments, and the social
network miner for discovering the interactions between the hospital departments involved. The fuzzy miner was
adopted by De Weerdt et al. (2012) for mining clinical pathways of oncological patients. Huang et al. (2012) used
sequential pattern mining to find a set of oncological clinical pathways given a minimum support threshold. Neumuth
et al. (2012) designed and implemented a surgical workflow management system (SWFMS) to provide guidance for
cataract surgery operations. R. Mans, Reijers, et al. (2012) used the heuristic miner to discover the personnel interactions
process during dentistry operations. They investigated the process from the control-flow, the organizational, and the
performance perspectives. Bouarfa and Dankelman (2012) derived a workflow consensus from multiple logs of medical
processes, and then detected workflow outliers automatically and without any prior knowledge. B. Chen et al. (2021) and
Furniss et al. (2016) proposed frameworks to discover the usage patterns of personnel when interacting with the HIS.
Cho et al. (2014), Jaroenphol et al. (2015), and Kim et al. (2013) used heuristic and fuzzy miners to discover an outpatient care process model. Additionally, in Kim et al. (2013), the model was compared with a process model designed by a
domain expert in terms of the accuracy of the matching rate. Fernandez-Llatas et al. (2010) and Meneu et al. (2013)
used PALIA for the discovery of patient pathways in the fields of cardiology and home hospitalization, respectively. Montani et al. (2013) proposed a heuristic miner-based framework with the aim of analyzing the quality of stroke
management processes, in order to verify whether different patient categories are differently treated, and whether hospitals of different levels actually implement different processes. Huang et al. (2013) formally defined the clinical pathway summarization problem as an optimization problem which can be solved by using dynamic programming. This
approach first represents the pathway as continuous and overlapping time intervals, then discovers frequent patterns in
each time interval from the log. Dagliati et al. (2014) proposed a heuristic miner-based approach to analyze complex
temporal datasets of type 2 diabetes patients. The main idea underlying the approach is to use temporal data mining
techniques to derive patient behavior processes. Fernandez-Llatas, Garcia-Gomez, et al. (2011) used PALIA for mining
the behavior of patients affected by dementia. The same author (Fernandez-Llatas et al., 2015) tested both the heuristic
miner and PALIA for mining patient pathways from RTLS data collected in a surgical ward. Miclo et al. (2015) and
Araghi, Fontaili, et al. (2018) used the fuzzy miner, and the heuristic miner, respectively, on RTLS data to mine human
movements. Prodel et al. (2015) used an integer linear programming approach to discover the care process at a macroscopic scale from a large-size database. In the approach proposed by Naeem et al. (2017) similar activities are grouped
by means of a clustering algorithm, then inductive, heuristic, and fuzzy miners are applied to extract the clinical pathway of patients affected by hepatitis. The preliminary activity clustering step avoids the so-called spaghetti
effect. A similar approach was presented by X. Xu et al. (2017), where topics are assigned to similar activities and the
fuzzy miner is applied to extract clinical pathways in the context of surgery and neurology. S. Yang, Li, et al. (2017) proposed an HMM-based algorithm to investigate the trauma resuscitation process mined from video recordings. In the
same year, M. Zhou et al. (2017) and S. Yang, Zhou, et al. (2017) tested their trace alignment-based algorithm on a
trauma resuscitation dataset for the extraction of the process model. Mertens et al. (2018) proposed an algorithm called
DeciClareMiner that combines process and decision mining to extract a process model from past executions. Here the
process is modeled using Declare, a declarative, LTL-based modeling language. Abo-Hamad (2017) applied the fuzzy
miner to discover the actual patient pathways in an emergency department; then they studied the variance in patient
pathways taken by diverse groups of patients, and proposed a performance analysis in terms of bottleneck and resource
utilization. Rojas and Capurro (2018) provided an approach to identify specific patient cohorts based on complex digital
phenotypes as a starting point to identify process models. Using temporal abstraction-based digital phenotyping and
pattern matching, they identified a cohort of patients with sepsis from the MIMIC II database, and then applied heuristic mining to discover medication use patterns. With the aim of finding the optimal layout in an emergency department,
Lee and Rismanchian (2018) used a sequence clustering plug-in to remove infrequent events and to derive the process
model in the form of a Markov chain. Alharbi et al. (2018) used an unsupervised method for detecting hidden clinical
pathways in the form of hidden Markov models from the MIMIC-III dataset. Kirchner and Markovic (2018) exploited
the local process models paradigm (Tax, Sidorova, Haakma, & van der Aalst, 2016) to model medical processes partially,
thus enabling the detection of major process steps. Valero-Ramon et al. (2019) proposed a method based on the PALIA
algorithm to discover and identify weight-change behavior. They preliminarily applied a trace clustering algorithm to
manage variability. L. Xu et al. (2019) presented a constraint-based method using multi-perspective declarative PM for
the extraction of clinical pathways from cardiology data. The α algorithm was used in Garg and Agarwal (2016) and
Kukreja and Batra (2017), along with the heuristic miner, for the discovery of patient pathways. De Oliveira
et al. (2020) used a metaheuristic approach as a combination of Monte Carlo sampling and Tabu search to overcome
the complexity that characterizes medical event logs, and used a “replayability score” to determine the fitness of the discovered process model under specific size constraints.
A slightly different approach for the automated discovery of process models consists in applying a trace clustering
algorithm to the event log, so as to group together similar traces, and then applying a discovery algorithm to each
cluster. This way, a number of clearer, more coherent, and more comprehensible process models are extracted from the event
log, each identifying a group of cases united by their behavior. In the medical domain, the event logs extracted from
healthcare information systems typically contain many cases that follow different procedures. When
an event log summarizes patient treatments, trace clustering allows patients to be clustered based on the patient data and
on the characteristics of their care journeys. In contrast, sequence clustering focuses only on the control-flow perspective, thus generating simpler models (Rebuge & Ferreira, 2012).
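The cluster-then-discover idea can be sketched as follows. This is a minimal illustration and not the method of any surveyed paper: the toy event log, the grouping criterion (traces that use the same set of activities), and all function names are assumptions, and a directly-follows graph stands in for a full discovery algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical toy log: each trace is the ordered list of activities of one case.
LOG = [
    ["triage", "x-ray", "cast", "discharge"],
    ["triage", "x-ray", "cast", "discharge"],
    ["triage", "x-ray", "x-ray", "cast", "discharge"],
    ["triage", "blood test", "admission"],
    ["triage", "blood test", "blood test", "admission"],
]


def cluster_by_activity_set(log):
    """Group traces that use the same set of activities (a deliberately crude clustering)."""
    clusters = defaultdict(list)
    for trace in log:
        clusters[frozenset(trace)].append(trace)
    return list(clusters.values())


def directly_follows_graph(traces):
    """Count how often activity a is immediately followed by activity b."""
    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


# One model per cluster instead of a single model for the whole log.
clusters = cluster_by_activity_set(LOG)
models = [directly_follows_graph(c) for c in clusters]
for traces, dfg in zip(clusters, models):
    print(len(traces), "traces:", dict(dfg))
```

Each per-cluster graph contains only the transitions of its own group of cases, which is what keeps the resulting models small; a real pipeline would replace both helper functions with a proper trace clustering algorithm and a discovery algorithm.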
FIGURE 5 Process discovery techniques adopted in the surveyed papers
Delias et al. (2014) and Ronny et al. (2015) followed the clustering approach to demonstrate the potential of PM in
the healthcare domain. Huang et al. (2014) proposed an approach based on probabilistic topic models to mine latent clinical
pathways that, composed together, enable the recognition of patient behavior. Traces were preliminarily clustered
based on their probability of belonging to a specific clinical pathway. Delias et al. (2015) proposed a spectral
clustering technique and a trace similarity metric to downgrade the effect of noise and outliers, avoiding the “spaghetti
effect” on the process model. Najjar et al. (2018) extracted clinical pathways following a clustering approach, then these
pathways are also clustered to distill typical pathways, enabling interpretation of clusters by experts. Kovalchuk
et al. (2018) used the K-means clustering algorithm to identify groups of patients based on the paths they followed
within the hospital. S. Yang, Tao, et al. (2018) presented a framework for analyzing associations between patient
cohorts and the trauma resuscitation procedures their patients received. Patient cohorts are determined by unsupervised
clustering, so that patients clustered into the same cohort share similar relevant attributes. To
this end they proposed an algorithm for measuring patient similarity, which learns a weighted attribute distance that is then used as the similarity measure during clustering. Finally, one workflow model was discovered from each patient
cohort. With the aim of improving the management of patients' diseases, Pebesma et al. (2019) proposed to cluster frequent pathways based on risk factors such as the gender of the patients. de Toledo et al. (2019) proposed two trace representation methods, namely, vectorial and syntactic. In the vector-based approach, each vector dimension
corresponds to a diagnosis, and traces are represented as vectors. Each dimension can be represented with a binary
value, meaning the absence or presence of the diagnosis, or with a numeric value as the frequency of occurrences of the
diagnosis in the log. In the syntactic-based approach, traces are represented as sequences of events. For clustering, the
Hamming and Levenshtein methods are used to define the distance between traces. Lu et al. (2019) proposed a trace clustering approach according to which frequent sequence patterns are first learned based on a sample set of patients, then
used as a base to rank patients, and finally used to discover a process map. In Prokofyeva and Zaytsev (2020), groups of
patient routes are discovered with a hierarchical agglomerative algorithm; a topic is then assigned to each cluster by
means of probabilistic topic modeling, and finally popular patient route patterns are identified.
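The two trace representations and the two distances mentioned above can be illustrated with a short sketch. The diagnosis names, the function names, and the pairing of Hamming distance with the vector form and Levenshtein distance with the sequence form are assumptions made for illustration; the source only states that both distances are used.

```python
from collections import Counter


def binary_vector(trace, vocabulary):
    """1 if the diagnosis occurs in the trace, 0 otherwise."""
    return [1 if item in trace else 0 for item in vocabulary]


def frequency_vector(trace, vocabulary):
    """Number of occurrences of each diagnosis in the trace."""
    counts = Counter(trace)
    return [counts[item] for item in vocabulary]


def hamming(u, v):
    """Number of positions at which two equal-length vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(u, v))


def levenshtein(s, t):
    """Minimum number of insertions, deletions, and substitutions turning s into t."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        curr = [i]
        for j, b in enumerate(t, 1):
            curr.append(min(prev[j] + 1,              # delete a
                            curr[j - 1] + 1,          # insert b
                            prev[j - 1] + (a != b)))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical vocabulary and traces.
VOCABULARY = ["diabetes", "hypertension", "asthma"]
t1 = ["diabetes", "hypertension", "diabetes"]
t2 = ["hypertension", "asthma"]

vector_distance = hamming(binary_vector(t1, VOCABULARY), binary_vector(t2, VOCABULARY))
sequence_distance = levenshtein(t1, t2)
print(frequency_vector(t1, VOCABULARY), vector_distance, sequence_distance)
```

The vector form discards the order of events, whereas the sequence form keeps it, so the two distances can rank the same pair of traces differently.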
Overall, the processes that have been most taken into consideration in the context of automated process discovery
are patient pathways (24 papers), and clinical pathways (20 papers), followed by medical processes (6 papers), personnel
tasks (4 papers), human movements (3 papers), personnel interactions (3 papers), and patient behavior (3 papers).
Much less attention has been paid to the use of interactive PM for the discovery of process models. Benevento,
Dixit, et al. (2019) showed that an interactive process discovery approach yields process models that are more
accurate and compliant with clinical guidelines than those obtained by means of automated discovery techniques.
Valero-Ramon, Fernandez-Llatas, Valdivieso, and Traver (2020) introduced a method for discovering dynamic risk
models for chronic diseases using PALIA, based on health sensor data as evidence of the patients' dynamic
behavior. The same author used PALIA to examine the comorbidities associated with obesity in order to obtain common
patterns of patients' behaviors (Valero-Ramon, Fernandez-Llatas, Martinez-Millana, & Traver, 2020).
Figure 5 shows in which percentage each process discovery technique has been adopted in the reviewed papers.
5.2.2 | Conformance analysis
Conformance refers to the analysis of the relation between the expected behavior of a process and the event log that
has been recorded during the execution of the process (i.e., model “to be” vs. model “as is”). There are three approaches
to conformance checking: log replay, alignment, or rule checking. In the log replay approach, each trace of the considered log is reproduced event by event in the model to verify that each event complies with the specification of the model
itself. In the alignment approach, on the other hand, the log trace is compared with an execution trace of
the model, trying to maximize the correspondence between the events of the two traces. Finally, the last approach corresponds to
the classic meaning of model checking, in which conformance is interpreted as verifying that the
rules derived from the model are consistent with the traces of the event log. The study of the selected documents revealed not only
that in the health context the three approaches were used, but also that they can be considered complementary in some
sense. In fact, log replay can be preferred when the interest is on the evaluation of early deviations, or simply to have
an intuition of the replayability of the traces. Instead, for an in-depth analysis of the deviations, the alignment is preferable (Caron et al., 2014; S. Yang, Sarcevic, et al., 2018). Finally, rule checking (H. Xu, Pang, Yang, Jinghui, et al., 2020;
H. Xu, Yan, Pang, Nan, et al., 2020) provides a simple but effective way to analyze the fulfillment of a set of rules. In
some cases, a combination of the above approaches has been proposed (Asare et al., 2020; Dewandono et al., 2013;
Kirchner et al., 2012) in order to give a more exhaustive analysis of the case study.
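Of the three approaches, rule checking is the simplest to sketch. The example below is a hedged illustration only: the two Declare-style templates (response and precedence), the clinical activity names, and the toy log are assumptions and do not reproduce the algorithms of the cited papers.

```python
def response(trace, a, b):
    """True if every occurrence of a is eventually followed by b."""
    pending = False
    for activity in trace:
        if activity == a:
            pending = True
        elif activity == b:
            pending = False
    return not pending


def precedence(trace, a, b):
    """True if b never occurs before the first occurrence of a."""
    seen_a = False
    for activity in trace:
        if activity == a:
            seen_a = True
        elif activity == b and not seen_a:
            return False
    return True


CHECKERS = {"response": response, "precedence": precedence}

# Hypothetical rules derived from a guideline: (template, a, b).
RULES = [
    ("response", "admission", "discharge"),
    ("precedence", "CT scan", "thrombolysis"),
]

# Hypothetical event log: case identifier -> ordered activities.
LOG = {
    "case1": ["admission", "CT scan", "thrombolysis", "discharge"],
    "case2": ["admission", "thrombolysis", "CT scan", "discharge"],
    "case3": ["admission", "CT scan"],
}


def violations(log, rules):
    """Map each case to the list of rules its trace violates."""
    return {
        case: [r for r in rules if not CHECKERS[r[0]](trace, r[1], r[2])]
        for case, trace in log.items()
    }


report = violations(LOG, RULES)
for case, broken in report.items():
    print(case, broken)
```

Rule checking reports which constraints a trace breaks but not where the trace could be repaired, which is why alignment is preferred for an in-depth analysis of deviations.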
In the context of healthcare, the work on conformance analysis includes papers focusing on comparing the actual
process model with the designed and prescriptive process model (e.g., clinical guidelines, models hand-made by medical
experts, etc.) (Anggrainingsih et al., 2018; Asare et al., 2020; Caron et al., 2014; de Vries et al., 2017; Fernandez-Llatas,
Meneu, et al., 2011; Ganesha et al., 2017; Jaturogpattana et al., 2017; Kelleher et al., 2014; Kim et al., 2013; Kirchner
et al., 2012; Kukreja & Batra, 2017; Mannhardt & Blinde, 2017; Neira et al., 2019; Placidi et al., 2021; Rinner et al., 2018;
Rovani et al., 2015; H. Xu, Pang, Yang, Jinghui, et al., 2020; H. Xu, Pang, Yang, Ma, et al., 2020; H. Xu, Yan, Pang, Nan,
et al., 2020; S. Yang, Li, et al., 2017; S. Yang, Sarcevic, et al., 2018). In these works, the result of conformance analysis
is the detection and full understanding of where clinical practice deviates from the specifications in a care pathway,
which leads to higher quality of care and better patient outcomes. While in the majority of cases conformance checking
consisted in the simple application of well-known algorithms, some works provided further contributions in terms of
new algorithms, process modeling, and process analysis. H. Xu, Pang, Yang, Jinghui, et al. (2020) presented a conformance checking algorithm based on LTL (linear temporal logic), and tested it for checking the conformance of an ischemic stroke treatment process with clinical guidelines. Kirchner et al. (2012) introduced a method for checking the
conformance of ongoing treatment processes. The interesting aspect is that the conformance checking is done on a
sparse log, that is, an execution log that does not store all the activities of a patient's treatment. Rinner et al. (2018) described a
method, called “time boxing,” for data preparation using a specific naming convention to model the time aspects used
in medical guidelines. Then they exploited the potential of time boxing for a more precise trace alignment between discovered process models and clinical guidelines. In (S. Yang, Sarcevic, et al., 2018), a framework for compliance analysis
is proposed that first uses conformance checking to detect deviations of an event log from a process model of trauma
resuscitation and, then, an ad hoc algorithm to discriminate between true and false deviations (alarms). In particular,
Yang et al. focused attention on so-called “false” deviations, namely: (1) gaps or discrepancies between the model
(“work as imagined”) and actual practice (“work as done”); (2) errors in the coding of activity traces; and (3) algorithm
limitations. The approach followed by Dunkl et al. (2011) consisted of the following four steps: (1) a process model of
the clinical guidelines (CGPM) is generated; (2) synthetic treatment process data are generated by means of a simulation tool fed with CGPM and the process model of the actual process; (3) a process model is extracted from synthetic
data (SDPM); (4) CGPM and SDPM are compared using the Decision Miner ProM plug-in. Kelleher et al. (2014) showed
that trauma resuscitation processes without pre-arrival notification are performed with more variable adherence to
advanced trauma life support protocols, and proposed the implementation of a checklist to improve the performance of
such processes.
In contrast to the above papers, Dewandono et al. (2013) used conformance checking for comparing the event log of
incoming patients to that of previously healthy patients, so as to find the most similar medical record of a patient cured
by the medical center. This process comparison allows doctors to obtain useful tips while treating patients on the basis
of the treatments given to healed patients.
Figure 6 shows in which percentage each conformance checking technique has been adopted in the reviewed
papers.
5.2.3 | Process analysis
In most PM tasks, process analysis is the last step after a process model has been discovered. The analysis of a
healthcare process may concern various aspects such as workflow data, statistics, similarities with the same process performed in different hospitals, performance measurements, how process variants (e.g., patient pathways) differ from
each other, and so on. We divided the process analysis papers into five classes, which we discuss separately below.
Variants analysis
Health processes are mostly classified as ad hoc processes, in the sense that the same treatment can follow different paths
by virtue of the patient's response, but also of the discretion of doctors who have the power to deviate from the guidelines
on the basis of their knowledge and experience to address specific patient situations (Detro et al., 2017). In this context, it
is of crucial importance to have techniques capable of discriminating different behaviors in the same process so that the
analyst can study them separately. We refer to these techniques as process variants detection techniques aiming to identify
a mapping function between each process execution trace (i.e., case) and a specific variant (typically identified through a
label). A wide range of methods for log-based process variant analysis have been proposed in the past decade: Supervised
or semi-supervised approaches use statistical or visual models for identifying variants, especially in clinical pathways
(Andrews et al., 2020; A. Kurniati et al., 2018b). Unsupervised approaches specialize traditional clustering algorithms
(e.g., K-means) for event logs, where the support of each cluster is used to discriminate main behavior from process variants
or infrequent behaviors. Finally, these clusters can be used to reduce the complexity of the analysis, in a sort of “divide
and conquer,” and thus provide a zoom in for specific clinical cases, or highlight outliers in terms of medical errors, deviations from the clinical guidelines, or system defects (Caron et al., 2014). For example, Rebuge and Ferreira (2012) proposed a method based on the synergistic use of a cluster diagram and a minimum coverage tree to cluster the traces on
the basis of the frequencies of traces, thus managing to discriminate between the set of traces representing the main
behavior, and the less frequent clusters of traces representing the set of process variants.
Recently, variants analysis has also been used to compare processes that have the same goal but belong to different
organizations, also referred to as process model comparison, which we discuss in the next paragraph. For example, Partington
et al. (2015) and Suriadi et al. (2014) presented a case study on the application of PM techniques to measure and quantify
the differences in the treatment of patients with chest pain symptoms across four South Australian hospitals.
Process model comparison
PM techniques can be profitably used for model comparison of healthcare processes (Fernandez-Llatas et al., 2013), and
suitable metrics for measuring model similarity have been proposed (Montani et al., 2014). In the medical domain, model
comparison makes it possible not only to compare the process actually implemented with the existing clinical reference guideline,
to verify its compliance, and/or to understand the level of adaptation to local constraints that may have been required,
but also, and especially, to discover different practices used to treat similar patients and to analyze their effects on the
final outcome of the process. In fact, from the studies treated so far it emerged that the existence of local resource constraints can lead to differences between the models implemented in different hospitals, even when referring to the treatment of the same disease (and to the same guideline). In (Andrews et al., 2020), comparison and similarity of processes
for different patient populations in the domain of breast cancer has been assessed in three different ways: (i) visual
inspection, that is, human judgment, (ii) similarity measures on directed graphs extracted from the process model, and
(iii) cross-log conformance checking by means of the plug-in “Replay a Log on Petri Net for Conformance Analysis.”
Process performance measurements
A quantification of model differences (and perhaps a ranking of hospitals resulting from them) can be exploited for several purposes, such as administrative purposes, performance evaluation, and distribution of public funds.
Especially nowadays it is important to quantify and monitor the quality of care in order to support the effectiveness and
efficiency of clinical care, as well as to improve the continuous measurement of performance. Also in this context, PM
techniques have proved to be an important tool for the definition and automatic extraction of key performance
indicators (KPIs) from HISs, as well as an effective framework for analyzing the interactions between KPIs (see
Figure 7) and the process environment (control-flow, organizational resources involved, and data; Abo-Hamad, 2017;
R. Mans et al., 2013; R. Mans, Reijers, et al., 2012; Perimal-Lewis et al., 2012). Even though clinical KPIs strongly depend
on the treated disease and the clinical guidelines, the time factor (i.e., consultation wait time, time spent per task/
process, and the average length of stay [LOS] for all patients from arrival to departure, whether discharged or admitted
to the hospital) is the most important when considering process performance, especially in the field of time-critical diseases
(Jaisook & Premchaiswadi, 2015; Kukreja & Batra, 2017; Stefanini et al., 2018; Yoo et al., 2016). In Gattnar, Ekinci, and
Detschew (2011) and Gattnar, Ekinci, Detschew, and Capel-Tunon (2011), a clinical reference process model describing
the clinically meaningful KPIs for acute diseases was proposed.
FIGURE 6 Conformance checking techniques adopted in the surveyed papers
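As an illustration of how time-based KPIs such as consultation wait time and length of stay can be extracted from an event log, consider the following minimal sketch. The event records, activity names, and helper functions are hypothetical, and the sketch assumes that each activity occurs at most once per case.

```python
from datetime import datetime
from statistics import mean

# Hypothetical event log: (case identifier, activity, timestamp).
EVENTS = [
    ("p1", "arrival", "2021-03-01 08:00"),
    ("p1", "consultation", "2021-03-01 08:45"),
    ("p1", "departure", "2021-03-01 10:00"),
    ("p2", "arrival", "2021-03-01 09:00"),
    ("p2", "consultation", "2021-03-01 09:15"),
    ("p2", "departure", "2021-03-01 12:00"),
]


def timestamps_by_case(events):
    """Index timestamps as case -> activity -> datetime (one occurrence per activity assumed)."""
    cases = {}
    for case, activity, stamp in events:
        cases.setdefault(case, {})[activity] = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
    return cases


def minutes_between(cases, start, end):
    """Elapsed minutes from activity `start` to activity `end`, per case."""
    return {
        case: (acts[end] - acts[start]).total_seconds() / 60
        for case, acts in cases.items()
        if start in acts and end in acts
    }


cases = timestamps_by_case(EVENTS)
length_of_stay = minutes_between(cases, "arrival", "departure")
wait_time = minutes_between(cases, "arrival", "consultation")
mean_los = mean(length_of_stay.values())
mean_wait = mean(wait_time.values())
print(f"mean LOS: {mean_los} min, mean consultation wait: {mean_wait} min")
```

The same two helpers cover any KPI defined as the time elapsed between two activities; KPIs that involve repeated activities or resources would need a richer event representation.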
Visual process analytic
Visual Analytics consists in the analysis of processes using the visual representations of the data in the form of charts,
graphs, histograms, maps, tables, and so on. Seeing and working with event logs in a visual format helps (nonexpert)
users identify patterns and understand data insights much more quickly.
Aguirre et al. (2019) used bar charts and donut diagrams to detect the existing bottleneck in the surgery process.
Fernandez-Llatas et al. (2013) used heat maps (Figure 8b) to highlight the locations and behavior patterns of the
patients. To this end, they proposed HMRA (heat maps rendering algorithm), which generates heat maps by highlighting the patient flows with different colors based on the temporal duration of the visited places and the number of transitions between them. Heat maps make it possible to detect the most important parts of a patient's movements at first glance, by
highlighting the most visited locations. Perer et al. (2015) proposed Care Pathway Explorer (CPE; Figure 8a), a tool which uses a
frequent sequence mining algorithm specifically designed to work with EMR data, and techniques for managing event
concurrency. CPE visualizes the mined pathways at different levels of abstraction in a user interface consisting of flow
visualizations.
Dotted charts (Figure 9) are another useful tool for conducting visual analysis. A dotted chart shows an overview
of the process where the X-axis shows the time, the Y-axis shows the process execution instances, and each point
represents an event; the colors of the points refer to the different activities (García et al., 2015). Yoo et al. (2016)
analyzed process changes based on changes in the hospital environment (e.g., the construction of a new building),
in order to measure the effects of environmental changes in terms of consultation wait time, time spent per task, and
outpatient care processes. To analyze the task event distribution, they used a dotted chart analysis and derived a
two-dimensional graph by indicating task events using dots based on time or frequency.
Trace alignment is a PM technique which allows comparison between the activity sequences of process traces. Combined with visualization techniques, trace alignment is a powerful means to detect deviations and to analyze and discover specific cases (Figure 10a). S. Chen et al. (2017) proposed PIMA (process-oriented iterative multiple alignment), an
algorithm optimized to handle workflow data, and tested it with endotracheal intubation, primary survey, and trauma
resuscitation data.
De Oliveira et al. (2020) proposed a “bow-tie” graph representation of the discovered process model (Figure 10b),
where the central activity is the activity of interest, and the left-hand and the right-hand sides of the graph contain the
set of activities that happened before and after the central one, respectively. In the “bow-tie” graph, circles represent
event nodes of the process model, and links represent the time-ordered sequence of one node following another. The
sizes of nodes and links are proportional to the number of patients following this pathway.
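The node sizes of such a “bow-tie” view can be approximated with a few lines of code. This is a simplified sketch under stated assumptions: the toy log and the function name are hypothetical, only the first occurrence of the central activity is considered, and the links between nodes are not computed.

```python
from collections import Counter

# Hypothetical log: one ordered list of activities per patient.
LOG = [
    ["admission", "imaging", "surgery", "rehab", "discharge"],
    ["admission", "surgery", "discharge"],
    ["admission", "imaging", "discharge"],
]


def bow_tie(log, center):
    """Count, per activity, the patients who performed it before / after `center`."""
    before, after = Counter(), Counter()
    for trace in log:
        if center not in trace:
            continue  # patients who never reach the central activity are ignored
        i = trace.index(center)  # first occurrence of the central activity
        before.update(set(trace[:i]) - {center})
        after.update(set(trace[i + 1:]) - {center})
    return before, after


left, right = bow_tie(LOG, "surgery")
print("before:", dict(left))
print("after:", dict(right))
```

Each count is a number of patients rather than a number of events, so it can serve directly as the node size on the corresponding side of the graph.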
FIGURE 7 Duration of time analysis based on the Disco (Fluxicon) performance approach within the collected medical event log (Abo-Hamad, 2017; Jaisook & Premchaiswadi, 2015): (a) a screenshot of the resulting fuzzy mining graph; (b) investigation of the sections/wards “irradiation cystitis” to itself (i.e., as a loop); (c) performance analysis for patients with different triage categories
FIGURE 8 (a) Care pathway explorer: Bubble chart displaying events of the most frequent patterns mined (left), and a flow visualization to show the most frequent patterns (right; Perer et al., 2015); (b) heat map (Fernandez-Llatas et al., 2013)
FIGURE 9 An example of dotted chart
FIGURE 10 (a) Trace alignment: Activities of the same type are aligned in the same column, and each row represents an individual case as its sequence of activities (M. Zhou et al., 2017); (b) bow-tie graph (De Oliveira et al., 2020)
Process understanding
Contrary to the papers discussed above, those classified as Process Understanding do not follow a precise analysis
approach; rather, they provide a framework for obtaining information on various aspects of the process under consideration. In the context of the 2011 Business Process Intelligence Challenge, the participants were provided with a real-life event log taken from a Dutch Academic Hospital, where each case is a patient of a gynecology department. With
the aim of reporting on a broad range of aspects, Jagadeesh Chandra Bose and van der Aalst (2011) disclosed differences
among patients with respect to the diagnosis, treatment, control-flow, and time perspectives. Then, they were able to
derive artifacts (i.e., any concrete, identifiable, self-describing chunk of information used in business processes;
Nigam & Caswell, 2003) focusing on the organizational perspective. Finally, by means of the fuzzy mining and trace
alignment techniques implemented as ProM plug-ins, they reported on common patterns of execution, anomalies, and
distinguishing aspects with respect to the treatment procedures followed among cases. Still from the BPI Challenge
2011, Caron et al. (2011) first used the heuristic miner for process discovery, then reported on the differences at the
control-flow level between cases, and on common sequences between treatments. They also performed an SNA for discovering the links between hospital departments. Finally, they provided several observations about the use of specific
therapies, such as deviations between the prescribed average number of therapy cycles and the real average, and the
relation between personnel and therapies. Forsberg et al. (2016) analyzed PACS (Picture Archiving and Communication
System) usage patterns by means of the heuristic miner, and the discovered process model was represented by a Petri
net to compute the complexity of the process in terms of three metrics, namely: the number of splits and joins, the ratio of
the number of arcs over the number of states, and the number of states that can be reached by all splits. They also provided some statistics on PACS users, such as the number of cases, the average number of commands per case, the average number of command types per case, and the median time to read an examination. With the aim of supporting
governance, Agostinelli et al. (2020) used the inductive visual miner for discovering emergency room processes, and the
patient distribution among the different radiology subdepartments; the social network miner for inferring the interactions between the different subdepartments, and different operating rooms; and dotted charts for visualizing the distribution of patients without reservation.
5.2.4 | Process simulation
A process simulation task makes it possible to mimic an existing business process with the aim of conducting a what-if analysis,
that is, observing the different trajectories a process would follow as a result of the application of potential changes, or
simply of observing the process at runtime from an external point of view in order to understand its dynamics. This
way, the redesign of a business process can be explored and evaluated before it is actually implemented. The prediction
of the impact of potential changes enables process understanding, bottleneck analysis, problem identification, performance evaluation, efficiency improvement, and comparison of alternative configurations. By using PM, a simulation
model can be generated based on the process model extracted from an event log by means of a process discovery task.
In the context of healthcare, the discrete event simulation (DES)-based approach appears to be dominant. DES
concerns the modeling of a system as it evolves over time by a representation in which the state variables change
instantaneously at separate points in time (Law et al., 2000). The approach adopted for the implementation of a process
improvement task as the result of a process simulation consists of four steps, which are depicted in Figure 11 and
described as follows:
FIGURE 11 Graphical representation of a process simulation task
1. Data collection: raw data are collected and organized in the form of an event log.
2. Process mining: the event log is fed into a process discovery algorithm which returns the “As-is” process model.
3. Generation of the “To-be” process model: this is a macro step that can be further divided in the following sub-steps:
a. Model transformation: a simulation model is generated from the As-is process model.
b. Model validation: the simulation model is validated in order to verify whether it accurately mimics the behavior
of the As-is process in relation to a set of performance measures. For each performance measure, the average
value obtained from the multiple simulations of the As-is model is compared with the corresponding value
obtained in the PM step, so as to determine whether the simulation model is valid.
c. Simulation: the process described by the validated model is simulated many times in order to investigate the
impact of a set of process changes on the performance measures considered in the previous step. The optimal
combination of changes, that is, the one which allows for the desired process performance, is merged with the
validated process model so as to obtain the To-be process model.
4. Process improvement: the real process is changed in order to be aligned with the discovered To-be process model.
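The model validation sub-step (3b) can be illustrated with a minimal sketch. It assumes that each performance measure is a single number and that a relative tolerance (a hypothetical 5% here) decides whether the simulation model is valid; the function name, the measure names, and the figures are illustrative assumptions and are not taken from the reviewed papers.

```python
from statistics import mean

def validate_simulation(observed, simulated_runs, tolerance=0.05):
    """Compare mined measures with their averages over several simulation runs.

    observed: dict measure -> value obtained in the PM step (assumed non-zero)
    simulated_runs: list of dicts, one per simulation run of the As-is model
    tolerance: hypothetical relative-error threshold (an assumption)
    Returns (is_valid, relative error per measure).
    """
    errors = {}
    for measure, target in observed.items():
        average = mean(run[measure] for run in simulated_runs)
        errors[measure] = abs(average - target) / abs(target)
    return all(err <= tolerance for err in errors.values()), errors

# Made-up values for two performance measures.
observed = {"length_of_stay_h": 50.0, "patients_per_day": 20.0}
runs = [
    {"length_of_stay_h": 49.0, "patients_per_day": 21.0},
    {"length_of_stay_h": 52.0, "patients_per_day": 19.0},
    {"length_of_stay_h": 50.5, "patients_per_day": 20.5},
]
is_valid, errors = validate_simulation(observed, runs)
```

With these made-up figures both measures deviate by about 1% from the mined values, so the model would be accepted under the assumed tolerance and rejected under a stricter one such as 0.5%.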
In the healthcare domain, process simulation is used to evaluate the impact of both medical and organizational
decisions, by means of a set of performance indicators that include, but are not limited to: the time a patient spends
within a hospital department; the time consumed between two activities; how many times activities are performed for a
patient; how many times staff members perform certain activities; how many patients are in the hospital daily; the cost
(in terms of human resources, consumed energy, spent money, etc.) of the process under analysis; percentage of the utilization of the hospital machinery in the process; death rate.
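Some of these indicators can be computed directly from an event log. The sketch below is only an illustration under stated assumptions: the event log is a list of (case, activity, resource, timestamp) tuples, and all names and values are invented for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Toy event log: (case id, activity, resource, ISO timestamp). All values are made up.
EVENT_LOG = [
    ("p1", "triage", "nurse_a", "2021-03-01T08:00"),
    ("p1", "x-ray", "tech_b", "2021-03-01T09:30"),
    ("p1", "discharge", "dr_c", "2021-03-01T11:00"),
    ("p2", "triage", "nurse_a", "2021-03-01T08:15"),
    ("p2", "x-ray", "tech_b", "2021-03-01T10:15"),
    ("p2", "x-ray", "tech_b", "2021-03-01T12:15"),
    ("p2", "discharge", "dr_c", "2021-03-01T13:15"),
]

def length_of_stay_hours(log):
    """Hours between the first and the last event of each case."""
    times = defaultdict(list)
    for case, _activity, _resource, ts in log:
        times[case].append(datetime.fromisoformat(ts))
    return {c: (max(t) - min(t)).total_seconds() / 3600 for c, t in times.items()}

def activity_frequency_per_case(log):
    """How many times each activity is performed for each patient."""
    return Counter((case, activity) for case, activity, _r, _ts in log)

def activity_frequency_per_resource(log):
    """How many times each staff member performs each activity."""
    return Counter((resource, activity) for _c, activity, resource, _ts in log)

los = length_of_stay_hours(EVENT_LOG)
per_case = activity_frequency_per_case(EVENT_LOG)
per_resource = activity_frequency_per_resource(EVENT_LOG)
```

Cost, utilization, and death rate would need additional attributes (e.g., resource cost or outcome) that this toy log does not carry.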
R. Mans et al. (2013) simulated a dental surgery process and found that the introduction of new digital technologies
is largely beneficial for patients and dental lab owners, whereas for dentists there is hardly any benefit. In (Z. Zhou
et al., 2014), a DES model was developed to study the impact of critical resources on the length of stay for patients. Different operational scenarios were also analyzed in order to provide recommendations for clinic management and
improvement. Cho et al. (2014) conducted a what-if analysis on an outpatient service considering the overall time for
out-clinic, the time for each activity, the frequency of each activity per head, the work time for each activity by
resources, the frequency for each activity by resources, and the hourly frequency of patients as performance indicators.
Lamine et al. (2015) used DES to assess the efficiency of the management of an emergency call center considering the
speed to answer and the phone call duration as performance indicators. Augusto et al. (2016) conducted a case study on
patients having cardiovascular diseases and eligible to receive an implantable defibrillator. They also studied the impact
of medical decisions, such as implanting or not a defibrillator, on the relapse rate, the death rate, and the cost.
Kovalchuk et al. (2018) demonstrated an example of the proposed approach's application within a task of simulating
the key departments involved in acute coronary syndrome treatment procedures, considering the length of stay as performance indicator. In Tamburis and Esposito (2020), the authors used DES to get to an effective analysis of a cataract
surgery process. In (Phan et al., 2019), a what-if analysis is proposed to improve care pathway for hernia affected
patients who are the most exposed. Here the simulation is used to understand times of occurrence of complications and
associated costs. O. A. Johnson et al. (2018) presented the ClearPath method, which extends the PM² PM method with a
process simulation approach that addresses issues of poor quality and missing data. Franck et al. (2020) presented a simulation model in order to analyze patient pathways from the ED to hospital discharge, and to find the causes of congestion problems in the ED. Then they proposed several designs of experiments in order to test medical unit capacity
variations, taking into account real data and practitioners' expertise.
The simulation tools used in the reviewed papers are: AnyLogic¹⁰ (30%); ProModel¹¹ (10%); CPN Tools¹² (30%);
Witness¹³ (10%); Simul8¹⁴ (10%); NETIMIS (O. A. Johnson, Hall, & Hulme, 2016; 10%).
5.2.5 | Social network analysis
SNA is the analysis of the organizational perspective of a process for the evaluation of social structures such as the relationships among people, teams, departments, and organizations. In the context of healthcare, SNA centers on the
resource (human and departments) aspect of a care process, providing an analysis of the responsibilities, the authorization issues, and interactions. The output of a SNA can be either a social graph (Figure 12), in which nodes represent
personnel or hospital departments, and (possibly weighted and/or oriented) edges denote interaction between nodes, or
a comparison matrix (Figure 13).
FIGURE 12 Example of an oriented, non-weighted social graph, in which nodes represent hospital departments (Agostinelli et al., 2020)
FIGURE 13 Example of a comparison matrix. The symbols “→”, “←”, and “←→” represent the direction of information exchange that occurred in the process between the two entities (Riz et al., 2016)
Although SNA can play a crucial role in the analysis of healthcare organizational processes, the literature on the topic is scarce and mostly limited to the five metrics implemented in ProM for generating social networks: the Handover of work metric, which determines who passes work to whom and is based on the causal dependency between
activities, with options to consider only direct succession or to take into account a causality fall factor; the Subcontracting metric, which is similar to Handover of work, except that the relationship between two individuals is bidirectional rather than unidirectional; the Working together metric, which focuses on how frequently certain
individuals work together on the same case, without taking into account the activity dependency; the Similar task metric,
which determines who performs the same type of activities; and the Reassignment metric, which detects the reassignment of
activities from one individual to another, that is, a person delegates work to somebody else but not vice versa.
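Two of these metrics are simple enough to sketch in a few lines. The code below is not the ProM implementation: it is a simplified illustration that assumes events are already sorted by time within each case, counts only direct succession (no causality fall factor), and ignores self-handovers. The log and resource names are invented.

```python
from collections import Counter, defaultdict

def handover_of_work(log):
    """Count direct-succession handovers (giver, receiver) within each case.

    log: iterable of (case_id, activity, resource), assumed time-ordered per case.
    """
    by_case = defaultdict(list)
    for case, _activity, resource in log:
        by_case[case].append(resource)
    handovers = Counter()
    for resources in by_case.values():
        for giver, receiver in zip(resources, resources[1:]):
            if giver != receiver:  # simplification: self-handovers are ignored
                handovers[(giver, receiver)] += 1
    return handovers

def working_together(log):
    """Count the cases in which each unordered pair of resources both appear."""
    by_case = defaultdict(set)
    for case, _activity, resource in log:
        by_case[case].add(resource)
    together = Counter()
    for resources in by_case.values():
        ordered = sorted(resources)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                together[(first, second)] += 1
    return together

# Toy log with hospital departments as resources (made-up data).
LOG = [
    ("c1", "admit", "nurse"), ("c1", "scan", "radiology"), ("c1", "operate", "surgeon"),
    ("c2", "admit", "nurse"), ("c2", "operate", "surgeon"),
    ("c3", "admit", "nurse"), ("c3", "scan", "radiology"),
    ("c3", "rescan", "radiology"), ("c3", "operate", "surgeon"),
]
handovers = handover_of_work(LOG)
together = working_together(LOG)
```

The handover counts can be read as the weights of an oriented social graph, while the working-together counts are symmetric and correspond to a non-oriented one.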
In (Caron et al., 2011, 2014; Naeem et al., 2017; Riz et al., 2016; Ronny et al., 2015) SNA is used to enable the
exploration of departmental collaborations; in (R. Mans, Reijers, et al., 2012), for discovering personnel interactions
during dentistry surgery operations; in (Durojaiye et al., 2019; Conca et al., 2018), for investigating multidisciplinary
collaboration in pediatric trauma care and in the treatment of patients with type 2 diabetes in primary care, respectively. More rarely, SNA serves as a starting point for further analysis. For example, in (Alvarez et al., 2018) SNA is exploited to discover
role interaction models in emergency department processes, and to provide useful knowledge that can help to
improve ED processes. A. Grando et al. (2017) used SNA to compare observed handover-of-care interactions with
patient cases, to evaluate the timing performance of personnel involved in the care process, so as to find correlations
between care paths, and to deduce the time spent by different patient groups.
5.2.6 | Predictive process analytics
Learning how a process behaves at the present time may make it possible to predict how it will evolve in the future. Process prediction is an important task for policy making, resource management and utilization, decision support, planning, and
optimization. In the context of healthcare, forecasting patient flows can help managers allocate money and human
resources to the many health services provided by the hospital (Benevento, Aloini, et al., 2019; Duma &
Aringhieri, 2017, 2020; Kempa-Liehr et al., 2020; Van Der Spoel et al., 2012). Also, the prediction of complications, comorbidities, and the severity of a disease on the basis of the patients' characteristics can help physicians to choose the best treatments and drugs (Back et al., 2020; H. Xu, Pang, Yang, Li, & Zhao, 2020; H. Xu et al., 2021). In some cases, the problem
of predicting the process behavior reduces to a classification problem (Benevento, Aloini, et al., 2019; Duma &
Aringhieri, 2017; H. Xu, Pang, Yang, Li, & Zhao, 2020; Van Der Spoel et al., 2012; H. Xu et al., 2021), that is, finding
the most probable label for a given sequence of events, that can represent a process activity, a quantity, a time instant, a
resource, or any other key aspect of the process. In other cases, linear regression techniques are preferred to classifiers
(Benevento, Aloini, et al., 2019; Kempa-Liehr et al., 2020) as tools for inferring the trajectory of ongoing processes.
Van Der Spoel et al. (2012) proposed to use a set of care paths belonging to the same diagnosis as training data for
the supervised learning of the cost associated with a care product. By means of a random forest classifier they were able to
predict the cost of a care product with an F-score around 0.6, starting from a noisy event log. With the aim of providing
a tool to cope with overcrowding, Duma and Aringhieri (2017) used a decision tree classifier to identify the possible
paths of a patient on the basis of the information available at the triage. Benevento, Aloini, et al. (2019) used both linear
regression (LASSO) and classification (random forest) techniques for predicting the waiting time in an emergency
department, based on queue-based indicators. Duma and Aringhieri (2020) aimed to predict the next activities in
view of a possible application to online optimization. In particular, they used a decision tree classifier to predict the use
of the emergency department resources by each patient on the basis of only the information known at the arrival of the
patient. Kempa-Liehr et al. (2020) proposed to use geometric regression for predicting patient postoperative length of
stay. To assist medical staff with thrombolytic therapy decision-making for stroke patients, H. Xu, Pang, Yang, Li, and
Zhao (2020) proposed a clinical decision support method using decision tree and random forest classifiers for the prediction of the next activity given a sequence of events. H. Xu et al. (2021) proposed a transfer learning-based framework
for predicting the long-term recurrence risk in patients with ICE after discharge from hospitals, in order to point out
high-risk patients for intervention.
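To make the framing of next-activity prediction as a classification problem concrete, the sketch below uses a deliberately simple frequency baseline: the predicted label is the activity that most often followed the last observed activity in the training traces. This is not the decision tree or random forest approach of the cited papers; it is a much weaker stand-in chosen for brevity, and the traces are invented.

```python
from collections import Counter, defaultdict

def train_next_activity(traces):
    """Count, for each activity, which activities directly follow it."""
    followers = defaultdict(Counter)
    for trace in traces:
        for current, following in zip(trace, trace[1:]):
            followers[current][following] += 1
    return followers

def predict_next(followers, prefix):
    """Most frequent follower of the last activity of a non-empty prefix, or None if unseen."""
    counts = followers.get(prefix[-1])
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Made-up historical traces of an emergency department.
HISTORY = [
    ["triage", "visit", "x-ray", "discharge"],
    ["triage", "visit", "discharge"],
    ["triage", "visit", "x-ray", "admission"],
]
followers = train_next_activity(HISTORY)
prediction = predict_next(followers, ["triage", "visit"])
```

A real classifier would use the whole prefix together with patient attributes (e.g., triage information) as features rather than the last activity alone.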
In contrast to the works cited above, Back et al. (2020) used Bayesian belief networks for predicting cycle times of individual phases of the patient flow. In general, Bayesian networks can be used for finding the probability of any event
occurring, such as a surgery taking more than x minutes given the case type and condition of the patient, or the likely destination of the patient given other evidence.
5.2.7 | Other applications
In this section, we discuss some papers selected in the process of collecting and filtering the works as dealing with new
emerging PM applications in healthcare. In fact, although the number of papers is small, the related applications are successfully employed in other domains and are attracting attention in healthcare too. In particular, concept
drift is a well-known topic that analyzes process changes, especially from the control-flow perspective, by detecting
changes in the process model (insertion, deletion, substitution, and reordering of process fragments), and in the time
points in which they occur. In the context of healthcare, concept drift is a particular concern since patterns of care
emerge and evolve in response to individual patient needs and through complex interactions between people, process,
technology and changing organizational structure. In (A. P. Kurniati et al., 2020) concept drift was exploited in the
well-established PM² methodology (Van Eck et al., 2015) in order to discover and analyze changes over time in complex
longitudinal healthcare data. Process change detection, localization, and characterization were carried out at three different levels of abstraction: model, trace, and activity. The case study examined process data related to the treatment of
endometrial cancer over a 15-year period (2003–2017) in one of the UK's largest cancer centers (Leeds Cancer Centre)
with a specific focus on the routes to diagnosis.
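The idea of characterizing control-flow changes can be sketched by comparing the directly-follows relations observed in two time windows. This is only an illustration and not the method of the cited study: it assumes the split point between the two windows is already known, so it covers change characterization but not change detection, and the traces are invented.

```python
def directly_follows(traces):
    """Set of (a, b) pairs such that b directly follows a in at least one trace."""
    return {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}

def control_flow_changes(traces_before, traces_after):
    """Directly-follows relations that appear or disappear between two windows."""
    before = directly_follows(traces_before)
    after = directly_follows(traces_after)
    return {"inserted": after - before, "deleted": before - after}

# Made-up diagnostic routes in two periods: the scan step is dropped in the second one.
PERIOD_1 = [["referral", "scan", "biopsy", "diagnosis"]]
PERIOD_2 = [["referral", "biopsy", "diagnosis"]]
changes = control_flow_changes(PERIOD_1, PERIOD_2)
```

In this toy case the deletion of an activity shows up as two deleted relations and one inserted relation that bypasses the removed step.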
Outlier detection is another hot PM application, whose aim is to identify an event, a single process execution (trace),
or a process model whose behavior is different from the expected one. In (Bouarfa and Dankelman, 2012) deviations
from standard surgical practice are detected in order to enrich and extend medical protocols automatically in the case
of good practices, or to take countermeasures in the case of serious complications or errors. To detect outliers, a global
pair-wise sequence alignment (Needleman–Wunsch) algorithm is used and applied on video coming from Laparoscopic
Cholecystectomy (LAPCHOL) procedures. Alharbi et al. (2017) defined an event as an outlier if it occurs more often than a
threshold interval determined from the central tendency and measure of dispersion of intervals for that event. More
often, clustering is the basic method of detecting outliers, which are identified as the smallest clusters (Folino
et al., 2011). This approach was used in (Han et al., 2011), where an Abnormal Process Instances Identification Method (APIIM) was proposed to detect abnormal process instances in the clinical pathway.
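Alignment-based outlier detection can be illustrated with the Needleman–Wunsch global alignment score computed between a reference sequence of surgical steps and each observed sequence. The sketch assumes that traces are already available as lists of activity labels; the scoring values (+1 match, -1 mismatch, -1 gap), the step names, and the threshold are hypothetical choices made for the example.

```python
def needleman_wunsch_score(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of two activity sequences."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # prefix of seq_a aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap  # prefix of seq_b aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]

REFERENCE = ["incision", "dissect", "clip", "cut", "close"]  # hypothetical protocol
OBSERVED = {
    "case_1": ["incision", "dissect", "clip", "cut", "close"],
    "case_2": ["incision", "dissect", "cut", "close"],
    "case_3": ["incision", "bleeding", "suture", "convert", "close"],
}
THRESHOLD = 2  # hypothetical cut-off: lower scores are flagged as outliers

scores = {case: needleman_wunsch_score(REFERENCE, trace) for case, trace in OBSERVED.items()}
outliers = [case for case, value in scores.items() if value < THRESHOLD]
```

Whether a flagged case is a good practice worth adding to the protocol or an error requiring countermeasures remains a judgment for domain experts.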
It is well known that one of the most important challenges in health applications concerns the privacy and security
of sensitive data, so much so that there are multiple health regulations at different government levels that define strict
privacy requirements (e.g., personal data is protected by privacy legislation, such as the GDPR in the European Union,
the Health Insurance Portability and Accountability Act in the United States, or the Personal Information Protection
and Electronic Documents Act in Canada). It is surprising that in the PM literature we have found no more than one
paper related to the application of Privacy Preserving PM in the healthcare domain. In (Pika et al., 2019), data privacy
and utility requirements for healthcare process data were analyzed in order to assess the suitability of existing privacy-preserving data transformation approaches, and to propose a privacy-preserving PM framework that can support PM
analysis of healthcare processes with obfuscated data.
The effectiveness of PM applications is strictly dependent on the underlying data quality. Data acquisition represents thus a key step for the subsequent jobs performed with data such as PM tasks. Typically, data quality issues are
due to organizational reasons, in fact, often the data collection task in hospitals is first accomplished using paper and
put into electronic format at a later time. Moreover, HISs may be unable to exchange data with each other, thus leaving
to the hospital staff the responsibility to do it manually. Data quality assessment is thus a fundamental step to be carried out before any PM task, as it enables the correct evaluation of the adopted PM algorithms. Perimal-Lewis
et al. (2016) assessed the quality of emergency department data extracted from the EHR of an Australian public hospital. The presence of incorrect timestamps was identified as the cause of flow anomalies found in patient pathways
mined from the event log. The corrective actions proposed by the authors to address such
data quality issues exclusively concern the improvement of the data collection task carried out by the hospital staff. A. P.
Kurniati et al. (2019) provided an assessment of the MIMIC-III data quality in terms of missing data, incorrect data,
imprecise data, and irrelevant data, following the method presented by Weiskopf and Weng (2013). Although they found
(1) missing data among events, case attributes, activity names, timestamps, and event attributes; (2) incorrect data
among events, cases, and timestamps; and (3) imprecise data among resources and timestamps; the overall data quality
of MIMIC-III was found to be good for performing PM tasks. Lanzola et al. (2014) assessed the data quality of a stroke
registry of an Italian hospital, and found that the majority of errors and missing data detected among records was represented by missing onset-arrival time (43% of the whole dataset), and by missing stroke scales at follow-up (22%),
followed by the violation of temporal constraints among dates (7.5%) and other minor issues.
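Checks of this kind can be automated before any PM task. The sketch below counts missing values for a set of required fields and violations of temporal order within each case; the field names, the record layout, and the sample records are assumptions made for illustration and do not reproduce the assessment procedures of the cited studies.

```python
from datetime import datetime

def assess_quality(records, required=("case", "activity", "timestamp")):
    """Count missing required fields and timestamp-order violations per log.

    records: list of dicts in recorded order; timestamps are ISO strings or None.
    """
    missing = {field: 0 for field in required}
    order_violations = 0
    last_seen = {}  # case id -> timestamp of the previous valid event
    for record in records:
        for field in required:
            if not record.get(field):
                missing[field] += 1
        case, ts = record.get("case"), record.get("timestamp")
        if case and ts:
            current = datetime.fromisoformat(ts)
            if case in last_seen and current < last_seen[case]:
                order_violations += 1  # event recorded after a later-stamped one
            last_seen[case] = current
    return {"missing": missing, "order_violations": order_violations}

# Made-up stroke-registry style records.
RECORDS = [
    {"case": "p1", "activity": "onset", "timestamp": "2021-05-01T10:00"},
    {"case": "p1", "activity": "arrival", "timestamp": "2021-05-01T09:00"},
    {"case": "p1", "activity": "ct_scan", "timestamp": None},
    {"case": "p2", "activity": "", "timestamp": "2021-05-02T08:00"},
    {"case": "p2", "activity": "discharge", "timestamp": "2021-05-02T12:00"},
]
report = assess_quality(RECORDS)
```

Here the arrival recorded before the onset is counted as a temporal violation, in the spirit of the constraint checks among dates mentioned above.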
5.3 | PM methodologies for the analysis of healthcare processes
In this section, we report on the PM frameworks that were proposed to guide the implementation of PM applications
with healthcare data. Most of them were explicitly built on existing, well-known PM methodologies, such as PM² (Van
Eck et al., 2015), L* (Van Der Aalst, Adriansyah, De Medeiros, et al., 2011), and CRISP-DM (Wirth & Hipp, 2000); other
ones, instead, simply follow the classical pipeline (1) data Collection, (2) Event log preparation, (3) process Discovery,
(4) process Analysis (CEDA). The main phases of these PM approaches are depicted in Figure 14, with the addition of
the DIAG approach, the only PM methodology designed specifically for PM tasks carried out in a healthcare context.
McGregor et al. (2011) extended the CRISP-DM model with temporal and multidimensional aspects to provide a
structured approach to knowledge discovery of new conditions onset pathophysiologies in physiological data streams,
and used Patient Journey Modelling Architecture (PaJMa) for the knowledge representation. PaJMa (McGregor
et al., 2008), designed specifically for healthcare, provides a visual representation of the process and of the information and technologies involved in the patient's journey. The CRISP-DM (CRoss Industry Standard Process for Data Mining) reference
model provides an overview of the life cycle of a data mining project. The life cycle is divided into six phases, as shown
in Figure 14. Although the proposed schematization is that of a block diagram, the sequence of the phases is not rigorous. The arrows indicate the most important and frequent dependencies between phases, especially in the particular
context of healthcare, where the phase to be performed depends on the outcome of each phase, or task of a phase previously performed. However, in general the process is not intended to be finished once a solution is deployed, as the
knowledge learned during the process can trigger new, often more targeted, business questions.
Gonzalez-García et al. (2020) proposed a PM²-based methodology for the Code Stroke analysis use case. In Erdogan
and Tarhan (2018), the authors proposed a goal-driven PM²-based approach, and tested it on the surgery process of a
university hospital. The application of the proposed approach revealed bottlenecks and deviations that were crucial
for determining measures to improve the efficiency of the surgery process. The PM² approach considers two
types of models, namely process and analytical models. Process models describe the ordering of activities in a process,
detailing time constraints, resource, or data usage. Analytical models are any other type of model that provides information about the process, such as decision trees. In the planning and extraction phases, initial research questions are
defined and event data are extracted, respectively. Then, multiple analysis iterations are performed, even in parallel. In
general, each analysis iteration performs the remaining phases one or more times, each focusing on answering a specific
research question by applying PM algorithms and evaluating the discovered models. If the results are satisfactory, then
they can be used for improving the process.
FIGURE 14 Process mining methodologies
Delias et al. (2014) demonstrated the potentials of PM for the analysis of emergency department processes. The main
contributions include a CEDA-based approach enriched with a preliminary trace clustering before process discovery;
the identification and visualization of the process paths followed by patients; the discrepancy between rare flows; performance analysis; and compliance checking with respect to the medical standards. Trace clustering allows grouping
patient pathways with similar characteristics, so as to apply the process discovery phase to each of such clusters and
obtain a set of process models instead of a single “spaghetti like” model. Lismont et al. (2016) proposed
a PM framework which follows the CEDA approach reinforced with trace clustering as well, with the addition of an
activity clustering task with the aim of renaming similar activities with the same label so as to obtain more understandable
process models. Tóth et al. (2017) provided an overview of the difficulties of the application of PM in healthcare, gave
recommendations for managing such problems, and suggested a CEDA-based workflow to generate more precise process models.
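The effect of trace clustering before process discovery can be sketched with a very simple greedy procedure: each trace joins the first cluster whose seed trace shares enough activities with it, measured by Jaccard similarity on activity sets. This is an illustrative stand-in, not the clustering technique of the cited works; the similarity measure, the 0.5 threshold, and the traces are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two non-empty sets of activities."""
    return len(a & b) / len(a | b)

def cluster_traces(traces, threshold=0.5):
    """Greedy clustering: assign each trace to the first sufficiently similar seed."""
    clusters = []  # list of (seed activity set, member traces)
    for trace in traces:
        profile = set(trace)
        for seed, members in clusters:
            if jaccard(profile, seed) >= threshold:
                members.append(trace)
                break
        else:
            clusters.append((profile, [trace]))  # no similar seed: open a new cluster
    return [members for _seed, members in clusters]

# Made-up patient pathways: two ambulatory-like and two surgical-like traces.
TRACES = [
    ["triage", "visit", "discharge"],
    ["triage", "visit", "x-ray", "discharge"],
    ["triage", "surgery", "icu", "ward", "discharge"],
    ["triage", "surgery", "icu", "discharge"],
]
clusters = cluster_traces(TRACES)
```

Process discovery would then be run once per cluster, yielding one smaller model for each group of similar pathways.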
The L* methodology was used as a basis for a PM framework for the analysis of cancer pathways mined from the
MIMIC-III dataset (A. P. Kurniati et al., 2018a). In contrast to the other PM approaches, in L* particular attention is
given to the “process modeling” phase, which is divided into two stages (yellow blocks in Figure 14): in the first one, a
process model is mined from the subsets of data extracted in the previous stages; in the second, the model is enriched
with additional data coming from the different data perspectives of the process under analysis.
The DIAG approach helps visualize and analyze patient pathways by combining real-time localization and PM
(Araghi, Fontanili, et al., 2018). The analyses provided by this approach are based on the location data generated by
real-time location systems (RTLS), which, besides being the basis for the analysis of patient movements, also prove
helpful as a complement to other medical data. DIAG is structured in four layers, namely data, information, awareness and governance, and operates through six stages (depicted in Figure 14). Through the gathering data stage, the
movements of patients/objects are monitored and stored by the localization systems. The positioning algorithms are
used to identify the tags placed on the monitored objects, while through the log refinement stage the data are
preprocessed to make them suitable for the PM techniques used in the modeling. In the modeling stage process discovery algorithms are executed. The analyzing stage provides quantitative analysis for users, and the performance
level of the processes is assessed by considering the way patients circulate in the environment while receiving their
treatment. In the diagnostic stage, the emphasis is placed on the causes that trigger weaknesses in the execution of processes. Finally, various possible scenarios for improvement are prefigured in the prognosis stage through simulation
techniques.
Although not in the form of PM framework, the following papers have outlined a series of guidelines in the form of
key questions to be answered, or challenges to be faced when conducting a PM task in a healthcare context.
R. S. Mans, Van der Aalst, et al. (2012) investigated the data challenges that are faced when answering four frequently posed questions during PM projects in hospitals, that are: (Q1) What are the most followed paths and what
exceptional paths are followed?; (Q2) Are there differences in care paths followed by different patient groups?; (Q3) Do we
comply with internal and external guidelines?; and (Q4) Where are the bottlenecks in the process? They investigated the
characteristics of HIS data and whether they help solve the above questions. Finally, they illustrated which data challenges exist when answering the questions, and provided tips for addressing them.
Homayounfar (2012) identified three main sources of problems that challenge the performances of PM tasks
in the healthcare domain. These are: (1) the complexity of processes caused by the heterogeneous nature
of hospital environments; (2) the continuous ad hoc actions that are performed by physicians, which is the main
cause of the so-called “spaghetti effect” on the process models as the result of a process discovery task; (3) the
poor quality of the data collected in HIS, which is the main cause of the generation of process models that do not fit with
the real processes.
Kaymak et al. (2012) argued that existing PM techniques fail to extract intelligible process models,¹⁵ and proposed a
few recommendations for making such techniques more effective when applied in the healthcare domain.
5.3.1 | Other approaches
The methodologies discussed so far are high-level PM guidelines that guide the execution of PM tasks from the collection of raw data till the analysis and improvement of the process, without going much into the details of the PM techniques used in each phase. The PM approaches we are going to discuss, instead, are specific for the discovery or the
conformance checking of process models.
Business Process Life cycle (BPL) (Weske, 2007) is a PM methodology typically adopted for conformance checking.
Among the surveyed papers, those that followed this approach are nine in total (de Vries et al., 2017; Kelleher
et al., 2014; Kirchner et al., 2012; Neira et al., 2019; Rinner et al., 2018; Rovani et al., 2015; H. Xu, Pang, Yang, Ma,
et al., 2020; H. Xu, Yan, Pang, Nan, et al., 2020; S. Yang, Sarcevic, et al., 2018). As shown in Figure 15, BPL consists of
four phases organized in a cyclical structure. In the design phase, a hand-made model of the “To-be” process is
designed. In the configuration phase, the process is configured according to the model designed in the previous
phase. In the enactment phase, the real process is executed and data are collected. In the evaluation phase, the “As-is”
process model is mined from collected data by means of process discovery algorithms and compared with the “To-be”
model designed in the design phase. The discrepancies between the two models can be used to improve the real process,
and the BPL methodology can then be reapplied on the improved process.
Although the BPL structure does not imply any specific temporal order in which phases have to be executed, in the
majority of surveyed papers the design stage was the starting point.
The interactive PM is a PM approach specific for the discovery of process models, and bases its strengths on the combination of domain knowledge and PM techniques (Dixit et al., 2018). This allows for the generation of more precise
process models, as the potential drawbacks of automated discovery techniques are suppressed by the intervention of the
domain expert. In a healthcare context, where processes are highly complex, heterogeneous, and dynamic, physicians
lend their deep domain knowledge to provide useful advances in the process discovery task (Fernandez-Llatas, 2021a,
2021b; Ibanez-Sanchez et al., 2019; Martinez-Millana et al., 2021).
FIGURE 15 The business process life cycle methodology
The majority of the PM papers reviewed so far focus on the analysis of healthcare processes from a single point of
view, which most of the time is the control-flow perspective (see Section 5.2.1), and less often, the organizational (see
Section 5.2.5) and performance perspectives (see Section 5.2.3). However, the high complexity of the healthcare domain
requires the analysis of multiple perspectives to have a more complete view of the processes under consideration. To
this end, the multi-perspective approach provides for the analysis of the processes from different points of view, making it possible to acquire deeper knowledge by combining the information mined from the different perspectives. Typically, this approach consists in the application of PM algorithms to different fragments of the event log,
namely, activities for the analysis of the control-flow perspective, resources for the analysis of the organizational perspective, and timestamps for the analysis of the performance perspective. In other cases, dedicated tools, such as the
Multi-perspective Process Explorer ProM plug-in (Mannhardt et al., 2015), automate this task by taking as input the
whole log. Rebuge and Ferreira (2012) studied infrequent behavior and process variants from the analysis of the
control-flow, organizational, and performance perspectives. R. Mans et al. (2013) glued together in one model the
results obtained from the control-flow, organizational, and performance perspectives, and the obtained model was used
for process simulation purposes. Mannhardt and Blinde (2017) used the Multi-perspective Process Explorer ProM plug-in for investigating the performance and control-flow perspectives. Araghi et al. (2019) collected RTLS data and analyzed
patients' movements from the control-flow and performance perspectives. H. Xu, Pang, Yang, Jinghui, et al. (2020) presented a declarative, multi-perspective PM method for the modeling of clinical processes. To capture the different process perspectives, they classified event attributes into seven types based on medical services, and each relationship
between types was represented in a relationship matrix in terms of a constraint expressed in the Declare language. The
control-flow, organizational, and performance perspectives were also at the basis of the analysis conducted by R. Mans,
Reijers, et al. (2012), Neumuth et al. (2012), and Ronny et al. (2015).
The process discovery algorithms discussed in Section 5.2.1 share the common characteristic of searching for “complete” process models, that is, they try to extract a model which describes the whole event log. However, in highly flexible settings, such as healthcare event logs where traces may differ considerably from each other and no global process
model exists that describes the whole log, such algorithms may be subject to the so-called “spaghetti effect,” that is, they
output an uninterpretable process model. To overcome this issue, Kirchner and Markovic (2018) exploited the potential
of Local Process Model Discovery (LPMD; Tax, Sidorova, Haakma, & van der Aalst, 2016) to extract a set of process
models for the clinical pathways followed by living liver donors. LPMD extracts a set of process models each representing the process described by a subset of the event log, thus enabling the detection of most representative process
behaviors. Although computationally inefficient in case of large number of distinct activities, heuristics for speedup the
LPMD task have been proposed (Tax, Sidorova, van der Aalst, & Haakma, 2016). Figure 16 shows in which percentage
each PM methodology has been adopted in the reviewed papers.
FIGURE 16 Process mining methodologies adopted in the surveyed papers
19424795, 2022, 2, Downloaded from https://wires.onlinelibrary.wiley.com/doi/10.1002/widm.1442 by Egyptian National Sti. Network (Enstinet), Wiley Online Library on [21/04/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
5.4 | Tools
For the implementation of the PM applications discussed in the previous sections, authors have typically used
dedicated PM tools such as ProM,16 Disco,17 and Celonis,18 whereas more rarely they have developed PM algorithms
in programming languages such as Python, C#, R, and C. Figure 17 shows the percentage of use of the PM tools and
programming languages adopted by the authors of the surveyed papers (simulation tools are not included, as they are already
listed in Section 5.2.4). The most used PM tools were ProM (43%) and Disco (20%), which together
account for more than half of all choices. ProM is the most complete tool, with tens of algorithms for process
discovery, conformance, alignment, and other PM applications, and the possibility for developers to implement their
own solutions. A more detailed description of ProM can be found in Tibeme et al. (2018), who analyze
how ProM algorithms perform with clinical workflows.
Despite their limited use, the PALIA suite and R.IO-DIAG19 deserve attention, as they were designed
specifically for healthcare data.
The PALIA suite consists of PALIA-ER (Rojas, Fernandez-Llatas, et al., 2017) and the PALIA ILS Web Tool
(Fernandez-Llatas et al., 2015), both web-based PM tools that use the PALIA discovery algorithm and are designed
to be easy to use for non-experts in PM. The former is a tool for question-driven PM in emergency rooms (ER), which
includes model simplification and filtering features tailored to the ER domain. The latter is specifically designed
for dealing with ILS data in healthcare environments, and provides a graphical view of the process under analysis,
plus trace clustering algorithms for generating different models for groups of patients with similar
behavior.
R.IO-DIAG is an open-source software developed to assist users in the implementation of PM tasks that follow the
DIAG approach. DIAG is a PM methodology for the analysis of patient pathways obtained from RTLS data (Araghi,
Fontaili, et al., 2018). R.IO-DIAG allows users to perform process discovery, conformance checking, and process
enhancement from event logs corresponding to location data of patients.
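As an illustration of how location data can become an event log, the Python sketch below collapses consecutive location readings taken in the same zone into one event with a start and an end time. It rests on a simplified assumption about the input format (patient identifier, timestamp in minutes, zone) and is not the DIAG procedure itself; the class, function, and zone names are hypothetical.

```python
from itertools import groupby
from typing import NamedTuple


class Reading(NamedTuple):
    patient: str
    minute: int  # minutes since the start of observation
    zone: str


def to_events(readings):
    """Collapse consecutive readings in the same zone into (patient, zone, start, end)."""
    events = []
    ordered = sorted(readings, key=lambda r: (r.patient, r.minute))
    # groupby only merges adjacent items, so a later return to a zone
    # produces a separate event, as expected for a patient pathway.
    for (patient, zone), run in groupby(ordered, key=lambda r: (r.patient, r.zone)):
        run = list(run)
        events.append((patient, zone, run[0].minute, run[-1].minute))
    return events


# Hypothetical readings for two patients.
readings = [
    Reading("p1", 0, "waiting_room"),
    Reading("p1", 5, "waiting_room"),
    Reading("p1", 12, "consultation"),
    Reading("p1", 20, "waiting_room"),
    Reading("p2", 3, "waiting_room"),
]
for event in to_events(readings):
    print(event)
```

Each resulting tuple can be read as one event of a trace, with the patient as the case and the zone as the activity.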
Among the various programming languages used for the development of PM tasks in the healthcare domain, R was
the most used, in particular through the pMineR library (Gatta et al., 2017), designed by Gatta et al. to implement PM tasks with
medical data. pMineR supports process discovery and conformance checking, and presents processes in the form
of Markov models, which are easy to understand for medical users. Moreover, it is well suited to the representation of
clinical guidelines (in terms of human-readability), thanks to some aspects taken from the Computer Interpretable
Clinical-Guidelines field.
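The Markov-model view of a process can be illustrated independently of pMineR. The following Python sketch (not pMineR's API, which is an R library) estimates first-order transition probabilities between activities from a list of traces. The artificial START and END states and the example activities are assumptions introduced only for illustration.

```python
from collections import Counter, defaultdict


def transition_probabilities(traces):
    """Estimate P(next activity | current activity) from observed traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        path = ["START", *trace, "END"]  # artificial boundary states
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(successors.values()) for nxt, n in successors.items()}
        for state, successors in counts.items()
    }


# Hypothetical traces: one list of activities per patient case.
traces = [
    ["triage", "visit", "discharge"],
    ["triage", "visit", "x_ray", "discharge"],
    ["triage", "x_ray", "visit", "discharge"],
]
model = transition_probabilities(traces)
print(model["triage"])
```

In this toy log, two of the three cases go from triage to a visit and one goes to an X-ray, so the estimated probabilities out of `triage` are 2/3 and 1/3.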
FIGURE 17 Tools used in the surveyed papers
Unlike the PM tools described above, PMApp (Valero-Ramon, Fernandez-Llatas, Martinez-Millana, &
Traver, 2020) is a tool designed to create custom PM dashboards for medical purposes. Such dashboards allow the selection of the most adequate views and algorithms for each medical case. PMApp is built using the Process Choreography
Paradigm (Barros et al., 2010), and integrates PMCode, a .NET toolkit for the development of PM algorithms.
Zhu et al. (2010) developed a workflow-based dashboard to help hospital staff monitor the state of the process in real
time. Martinez-Millana et al. (2019) developed a dashboard for discovering patient flows from the location data of patients undergoing an intervention. The dashboard also supports data filtering and the computation of statistics.
Figure 17 shows how often (in percentage) each tool was used in the reviewed papers.
6 | OPEN ISSUES AND FUTURE RESEARCH DIRECTIONS
In this section, we provide an answer to the research question Q3 defined in Section 2. The literature discussion
shows that the high complexity of the healthcare domain is an obstacle to obtaining satisfactory results from PM tasks. This is due to four main reasons: first, data collection is not always performed by process-aware
information systems, so the data format is often not suited to PM tasks; second, data are often
first collected on paper and converted into electronic format at a later time, which makes them error-prone; third, health data are collected in different, non-synchronized repositories (EHR, PACS, etc.), so they need to be manually recovered from different data sources before being processed; fourth, medical event logs may contain traces (e.g., patient cases) that differ greatly
from each other, which prevents process discovery algorithms from generating process models descriptive enough to
be successfully employed by physicians. These issues slow down the knowledge acquisition process, as they require
an intensive, time- and resource-consuming data preprocessing task before PM applications can generate useful results. Note
that three of the four issues are related to data collection and as such must be addressed by hospital managers, while
only one derives from the nature of the data itself. Data collection issues can be solved (albeit partially) by introducing
automated data collection tools, such as RTLS and wearable smart devices that autonomously and punctually collect
information on patients and staff, and by providing hospital staff with electronic tools that increase the degree of automation
and the quality of the data collection process. Furthermore, the adoption of a single information system shared across
different healthcare facilities could be a valid solution for easily recovering the whole care history of a patient and
analyzing the patient's entire care path. On the other hand, the heterogeneous nature of medical event logs is a problem
for data scientists. As already mentioned in the previous sections, the high complexity of health data is the
primary cause of the so-called “spaghetti effect” that characterizes the process models mined from such data. Different techniques have been proposed to overcome this issue, such as trace clustering, activity clustering, topic modeling,
and local process model discovery, which split the process discovery task across different sub-logs, each showing
different characteristics. However, such PM methodologies need to be properly tuned to each specific medical case,
which requires deep knowledge of the medical domain, and thus close collaboration between data scientists and
physicians is needed.
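The effect of splitting a heterogeneous log into sub-logs can be sketched in a few lines of Python. The example below is a deliberately simple stand-in for trace clustering: it greedily groups traces whose activity sets have a Jaccard similarity above a threshold, and then counts the distinct directly-follows pairs in the whole log and in each sub-log. The threshold, the traces, and the function names are hypothetical, and the clustering methods used in the surveyed papers are considerably more sophisticated.

```python
def jaccard(a, b):
    """Jaccard similarity between two non-empty sets of activities."""
    return len(a & b) / len(a | b)


def cluster_traces(traces, threshold=0.5):
    """Greedy clustering: assign each trace to the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of traces; its first trace is the seed
    for trace in traces:
        for cluster in clusters:
            if jaccard(set(trace), set(cluster[0])) >= threshold:
                cluster.append(trace)
                break
        else:
            clusters.append([trace])
    return clusters


def directly_follows(traces):
    """Set of (a, b) pairs such that b immediately follows a in some trace."""
    return {pair for t in traces for pair in zip(t, t[1:])}


# Hypothetical log mixing two kinds of patient pathways.
log = [
    ["admission", "x_ray", "cast", "discharge"],
    ["admission", "x_ray", "surgery", "cast", "discharge"],
    ["admission", "blood_test", "infusion", "discharge"],
    ["admission", "blood_test", "infusion", "blood_test", "discharge"],
]
sub_logs = cluster_traces(log)
print(len(directly_follows(log)), [len(directly_follows(s)) for s in sub_logs])
```

In this toy case the whole log yields ten distinct directly-follows pairs, while each of the two sub-logs yields five, which hints at why models discovered on sub-logs tend to be easier to read.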
Another important aspect that future research in PM for healthcare has to take into account is the presentation
of the results obtained by means of knowledge discovery tools. An improvement in the representation of complex
and variable health processes would help physicians understand the outcome of PM applications, and consequently
increase the usability of PM in healthcare contexts. A first step toward meeting this challenge is to hide the complex PM
techniques behind user-friendly and interactive interfaces and notations, through which analysis
models can be configured automatically using simple parameter and filter settings.
Another promising research direction is to integrate PM analysis results into simulation and scheduling optimization
frameworks. PM has already been used in production to extract relevant multiproduct planning information or to plan activities in accordance with business rules. To the best of our knowledge, these approaches have not been investigated in
healthcare settings, although the topic appears worthy of further investigation.
Conformance analysis is another research area to be further investigated: although it is a widely
investigated line of research in PM, its applicability to the medical field is not trivial. The challenge is to revise these
techniques so that they can work with less structured processes, taking into account different perspectives of analysis
(not just control flow) and a greater number of activities.
Finally, we argue that privacy-preserving PM needs further investigation, as it has received very little attention
(we found only one paper). In fact, event logs contain sensitive personal information, which must be obfuscated using
privacy protection techniques that modify the original data while preserving its usability.
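One basic building block of such protection is the pseudonymization of case identifiers. The Python sketch below replaces patient identifiers with keyed hashes (HMAC-SHA256 from the standard library), so that events of the same patient remain linkable without exposing the original identifier. It is only an illustrative assumption about what a first preprocessing step might look like: the key, the field names, and the log are hypothetical, and pseudonymization alone does not guarantee privacy, since the sequence of activities may still identify a patient.

```python
import hashlib
import hmac


def pseudonymize(event_log, secret_key: bytes):
    """Return a copy of the log with each case id replaced by a keyed hash."""

    def mask(case_id: str) -> str:
        digest = hmac.new(secret_key, case_id.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()[:16]  # shortened only for readability

    return [{**event, "case_id": mask(event["case_id"])} for event in event_log]


# Hypothetical event log: one dictionary per event.
log = [
    {"case_id": "patient-001", "activity": "admission"},
    {"case_id": "patient-001", "activity": "x_ray"},
    {"case_id": "patient-002", "activity": "admission"},
]
masked = pseudonymize(log, secret_key=b"hypothetical-secret-key")
for event in masked:
    print(event)
```

The same input identifier always maps to the same pseudonym under a given key, so traces stay intact for process discovery, while the original log is left unmodified.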
7 | RELATED WORK
In this section we report on the survey papers that have been published in the context of PM in healthcare. The discussion is organized according to the publication year.
2021:
Dallagassa et al. (2021) discussed 270 articles ranging from 2002 to 2019. They first showed how the use of PM has
evolved in the healthcare domain over time; then they proposed two classifications of the surveyed articles, based on
PM applications and on PM techniques. They found that the discovery of process models was the most frequent PM
application, and the most adopted algorithms were the fuzzy miner and heuristic miner.
2020:
Motivated by the fact that previous survey articles did not report clinical aspects in a uniform way, did not follow a
standard clinical coding scheme, and did not always describe the details of the event log data, Helm et al. (2020) surveyed 38 PM studies in healthcare, published between 2016 and 2018, with an emphasis on the details of the event log
data, algorithms and techniques used. In particular, for each reviewed paper they described the characteristics of the
event log, and referred to a standard clinical coding scheme for the information on clinical specialty and medical
diagnoses.
Grüger et al. (2020) surveyed 55 papers in the medical domain of oncology, and investigated how PM has been
applied in order to acquire semantic case descriptions from healthcare information systems (HIS). The authors envision
that such information can be used as experience by a case-based reasoning (CBR) system to support eminence-based
decision making. They found that (i) most papers focus on the analysis of data using PM and less on describing the
process and difficulties of exporting and extracting HIS data and transforming them into event logs; (ii) none of the surveyed papers examines the use of PM for case acquisition for CBR; and (iii) the application of PM in oncology focuses mainly
on the control-flow perspective.
2019:
Farid et al. (2019) reviewed eight papers where PM techniques were applied to the care of frail elderly people. The
authors presented the results referring to five emerging themes, namely, geographical location, analysis methodology,
source data types, adopted process, medical context and open challenges.
Jangi et al. (2019) showed the use of semantic PM to enhance hospital processes. In total, six articles were surveyed.
2018:
G. P. Kusuma et al. (2018) provided a summary of PM studies undertaken in the field of cardiology. They identified
32 relevant studies from 2008 to 2017, and analyzed them across five themes: process and data types, techniques, perspectives, tools, methodologies. In the analysis of the limitations and the future work, they pointed to data quality as
the major issue that needs to be addressed.
Williams et al. (2018) investigated the extent to which PM has been applied to primary care, and identified seven relevant papers. They summarized the data sources, geographical location and medical domains that were reported, and
identified PM challenges in a primary care context. The criticalities identified in the selected studies concern the coherence and completeness of the data collected, the choice of different algorithms and tools, and the effective presentation
and application of the obtained results in real scenarios.
Erdogan and Tarhan (2018) presented a systematic mapping of PM in healthcare where 172 research papers published between 2005 and 2017 are categorized considering the type of research, the specific application context, mining
algorithms, and process modeling language. The authors also summarized the demographic and bibliometric trends
specific to the domain, in terms of publication volume, most influential documents in the reference research
community, geographical location of contributing researchers, and top venues.
Batista and Solanas (2018) investigated the applications of PM in the healthcare sector. In particular, the authors
focused on the major trends by dividing the discussion into the following main aspects: medical data and preprocessing,
medical fields, medical process type, objective, perspective, algorithms, tools, and medical structures. Heterogeneity
within the healthcare domain has proved to be one of the main challenges for PM, which is due to the fact that different
therapies can be followed to treat different patients with the same disease. The authors argued that this aspect makes it
necessary to enhance the quality of the event log data with further details related to the context.
2016:
Ghasemi and Amyot (2016) presented a systematized literature review based on the analysis of papers obtained
through existing reviews rather than on the direct analysis of a large list of first-hand papers. They also discussed the literature analyses provided by other review papers.
A. P. Kurniati et al. (2016), through an accurate thematic review, analyzed 37 works in the field of oncology, formulating specific research questions according to the processes and types of data, techniques, methodologies and tools
used, and also highlighting the limits found and the research guidelines for future works. The review highlights the
potential value of PM for improving cancer care processes, provides an overview of the work undertaken, and
identifies research opportunities in this field of study.
Rojas et al. (2016) reviewed 74 papers according to the following main aspects: process and data types, frequently
posed questions, PM techniques, perspectives and tools, methodologies, implementation and analysis strategies, geographical analysis, and medical fields. The most commonly used categories, emerging topics and future trends were
identified. The authors concluded the paper with the list of most critical challenges: (i) portable solutions must be
developed that can adapt to hospital environments other than those in which they have been used; (ii) user-friendly
tools for the visualization of the process models are needed; (iii) benchmarking studies must be conducted between different hospitals to identify and emulate success stories.
2015:
Rojas et al. (2015) conducted a bibliographic study on the use of different algorithms, techniques and PM tools
applied in the healthcare domain. The authors highlighted the limitations and the resulting challenges, namely:
assessing the compliance of real medical processes with medical protocols and guidelines; overcoming the technical pitfalls inherent in identifying, accessing, and integrating data sources; and overcoming poor data quality issues by means
of preprocessing tasks.
Mans et al. published a book (R. S. Mans et al., 2015) which gives a wide overview of the application of PM in the
healthcare domain.
2014:
W. Yang and Su (2014) analyzed 37 studies published between 2004 and 2013, focusing on the discovery of
the medical process as an essential tool for the design of clinical pathways, on the analysis of process variants, and on
process performance measurements to identify possible improvements. From the discussion of the limitations of the
applicability of PM in medical contexts, it emerges that most of the mining algorithms are not adequate to handle
medical processes because of their unstructured, complex and variable nature. It follows that the models obtained are
not able to explain the numerous variants, slowing down the overall improvement process. To overcome these issues,
the authors suggested four research directions: (1) in-depth analysis of variants, (2) integrated process management,
(3) customization, and (4) self-learning improvement of the clinical pathways.
8 | CONCLUSIONS
In this article we have reviewed 172 papers published in the last 10 years, that present various applications of PM in the
healthcare domain. We have seen how PM techniques have been applied to health data for mining useful knowledge,
and how the acquired knowledge can help physicians and hospital managers to better understand and improve health
processes. We have discussed the surveyed papers based on the taxonomy proposed in Section 1, according to which a
PM task can be characterized by the application, which defines the purpose of the PM task; the algorithms used to
achieve the intended purpose; the tools used to implement the algorithms; the approach followed to finalize the application; the kind of process; and the source of data. The literature discussion shows that PM research in
healthcare is growing rapidly and features a large number of techniques, which are necessary to deal with the high complexity of healthcare data. In particular, the main contributions in the health sector that we identified
while drafting this work concern:
• Data preprocessing including the collection and preparation of event logs, as well as their optimization and quality
improvement;
• Discovery and analysis of process models for the evaluation of the healthcare services;
• Evaluation of the process conformance with respect to medical protocols and clinical guidelines;
• Performance evaluation of healthcare processes in terms of bottlenecks and time management;
• Investigation of the management of hospital resources by means of SNA;
• PM tools for the application of PM techniques and for the visualization of the obtained results;
• Simulation and predictive process techniques for the investigation of potential scenarios;
• PM methodologies to guide the execution of PM tasks in the healthcare domain.
Despite the great effort that researchers have spent in the last decade, the field is still open for further
research and practice. In Section 6, we have outlined the challenges to be faced in order to make PM methods more
effective when applied to health data, which can be summarized as follows: producing higher quality data; designing better-performing PM techniques to deal with highly variable processes; providing user-friendly PM tools for nonexpert
PM users; and devoting more effort to privacy-preserving PM, so that data containing sensitive information, as in the case of medical event logs, can be analyzed without incurring privacy issues. We expect the results of this review to be
used fruitfully to make the research community and practitioners more aware of the importance of PM applications in
the healthcare sector and to provide insights to direct future efforts.
CONFLICT OF INTEREST
The authors have declared no conflicts of interest for this article.
DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
AUTHOR CONTRIBUTIONS
Antonella Guzzo: Conceptualization (equal); formal analysis (equal); investigation (equal); methodology (equal);
supervision (equal). Eugenio Vocaturo: Data curation (equal); investigation (equal); resources (equal). Antonino
Rullo: Conceptualization (equal); formal analysis (equal); investigation (equal); methodology (equal); supervision
(equal).
ENDNOTES
1 https://www.ariscommunity.com
2 https://fluxicon.com/disco
3 https://www.promtools.org
4 https://www.celonis.com
5 https://mimic.mit.edu
6 https://www.who.int/standards/classifications
7 https://cheatography.com/deleted-2754/cheat-sheets/major-diagnostic-category-mdc-to-ms-drg-mapping/
8 https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/MedicareFeeforSvcPartsAB/downloads/DRGDesc05.eps
9 https://www.whocc.no/atc_ddd_index/
10 https://www.anylogic.com/
11 https://www.promodel.com/
12 https://cpntools.org/
13 https://www.lanner.com/en-us/technology/witness-simulation-software.html
14 https://www.simul8.com/
15 Note that this paper was published in 2012.
16 https://www.promtools.org/
17 https://fluxicon.com/disco/
18 https://www.celonis.com/
19 https://research-gi.mines-albi.fr/display/RIOSUITE/R-IOSuite+Home
REFERENCES
Abo-Hamad, W. (2017). Patient pathways discovery and analysis using process mining techniques: An emergency department case study. In
International conference on health care systems engineering (pp. 209–219). Springer.
Agostinelli, S., Covino, F., D'Agnese, G., De Crea, C., Leotta, F., & Marrella, A. (2020). Supporting governance in healthcare through process
mining: A case study. IEEE Access, 8, 186012–186025.
Aguirre, J. A., Torres, A. C., & Pescoran, M. E. (2019). Evaluation of operational process variables in healthcare using process mining and
data visualization techniques. Health, 7, 19.
Alharbi, A., Bulpitt, A., & Johnson, O. (2017). Improving pattern detection in healthcare process mining using an interval-based event selection method. In International conference on business process management (pp. 88–105). Springer.
Alharbi, A., Bulpitt, A., & Johnson, O. A. (2018). Towards unsupervised detection of process models in healthcare. In MIE (pp. 381–385). IOS
Press.
Alvarez, C., Rojas, E., Arias, M., Munoz-Gama, J., Sepúlveda, M., Herskovic, V., & Capurro, D. (2018). Discovering role interaction models in
the emergency room using process mining. Journal of Biomedical Informatics, 78, 60–77.
Amantea, I. A., Sulis, E., Boella, G., Marinello, R., Bianca, D., Brunetti, E., Bo, M., & Fernandez-Llatas, C. (2020). A process mining application for the analysis of hospital-at-home admissions. In Studies in Health Technology and Informatics (Vol. 270, pp. 522–526). Europe
PCM Plus.
Andrews, R., Suriadi, S., Wynn, M., ter Hofstede, A. H. M., & Rothwell, S. (2018). Improving patient flows at St. Andrew's War Memorial
Hospital's Emergency Department through process mining. In Business process management cases (pp. 311–333). Springer.
Andrews, R., Wynn, M. T., Vallmuur, K., Ter Hofstede, A. H. M., & Bosley, E. (2020). A comparative process mining analysis of road trauma
patient pathways. International Journal of Environmental Research and Public Health, 17(10), 3426.
Anggrainingsih, R., Johannanda, B. O. P., & Cahyani, D. E. (2018). Business process evaluation of outpatient services using process mining.
Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 10(2–4), 125–128.
Antonelli, D., & Bruno, G. (2015). Application of process mining and semantic structuring towards a lean healthcare network. In Working
conference on virtual enterprises (pp. 497–508). Springer.
Araghi, S. N., Fontaili, F., Lamine, E., Salatge, N., Lesbegueries, J., Pouyade, S. R., Tancerel, L., & Benaben, F. (2018). A conceptual framework to support discovering of patients' pathways as operational process charts. In 2018 IEEE/ACS 15th international conference on computer systems and applications (AICCSA) (pp. 1–6). IEEE.
Araghi, S. N., Lamine, E., Salatge, N., & Benaben, F. (2020). Interpretation of patients' location data to support the application of process
mining notations. In HEALTHINF (pp. 472–481). SciTePress.
Araghi, S. N., Fontanili, F., Lamine, E., Salatge, N., Lesbegueries, J., Pouyade, S. R., & Benaben, F. (2019). Evaluating the process capability
ratio of patients' pathways by the application of process mining, SPC and RTLS. In HEALTHINF (pp. 302–309). SciTePress.
Araghi, S. N., Fontanili, F., Lamine, E., Tancerel, L., & Benaben, F. (2018). Applying process mining and RTLS for modeling, and analyzing
patients' pathways. In HEALTHINF (pp. 540–547). SciTePress.
Arias, M., Rojas, E., Aguirre, S., Cornejo, F., Munoz-Gama, J., Sepúlveda, M., & Capurro, D. (2020). Mapping the patient's journey in
healthcare through process mining. International Journal of Environmental Research and Public Health, 17(18), 6586.
Asare, E., Wang, L., & Fang, X. (2020). Conformance checking: Workflow of hospitals and workflow of open-source EMRs. IEEE Access, 8,
139546–139566.
Augusto, V., Xie, X., Prodel, M., Jouaneton, B., & Lamarsalle, L. (2016). Evaluation of discovered clinical pathways using process mining and
joint agent-based discrete-event simulation. In 2016 winter simulation conference (WSC) (pp. 2135–2146). IEEE.
Back, C. O., Manataki, A., & Harrison, E. (2020). Mining patient flow patterns in a surgical ward. In Proceedings of the 13th international joint
conference on biomedical engineering systems and technologies. SciTePress.
Badakhshan, P., & Alibabaei, A. (2020). Using process mining for process analysis improvement in pre-hospital emergency. In ICT for an
inclusive world (pp. 567–580). Springer.
Baker, K., Dunwoodie, E., Jones, R. G., Newsham, A., Johnson, O., Price, C. P., Wolstenholme, J., Leal, J., McGinley, P., Twelves, C., &
Hall, G. (2017). Process mining routinely collected electronic health records to define real-life clinical pathways during chemotherapy.
International Journal of Medical Informatics, 103, 32–41.
Barros, A., Hettel, T., & Flender, C. (2010). Process choreography modeling. In Handbook on business process management (Vol. 1,
pp. 257–277). Springer.
Batista, E., & Solanas, A. (2018). Process mining in healthcare: A systematic review. In 2018 9th international conference on information,
intelligence, systems and applications (IISA) (pp. 1–6). IEEE.
Benevento, E., Aloini, D., Squicciarini, N., Dulmin, R., & Mininno, V. (2019). Queue-based features for dynamic waiting time prediction in
emergency department. Measuring Business Excellence, 23(4), 458–471.
Benevento, E., Dixit, P. M., Sani, M. F., Aloini, D., & van der Aalst, W. M. P. (2019). Evaluating the effectiveness of interactive process discovery in healthcare: A case study. In International conference on business process management (pp. 508–519). Springer.
Berti, A., van Zelst, S. J., & van der Aalst, W. (2019). Process mining for Python (PM4Py): Bridging the gap between process and data science.
arXiv:1905.06169.
Binder, M., Dorda, W., Duftschmid, G., Dunkl, R., Fröschl, K. A., Gall, W., Grossmann, W., Harmankaya, K., Hronsky, M., Rinderle-Ma, S.,
Rinner, C., & Weber, S. (2012). On analyzing process compliance in skin cancer treatment: An experience report from the evidence-based
medical compliance cluster (EBMC 2). In International conference on advanced information systems engineering (pp. 398–413). Springer.
Blei, D., Carin, L., & Dunson, D. (2010). Probabilistic topic models. IEEE Signal Processing Magazine, 27(6), 55–65.
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
Bouarfa, L., & Dankelman, J. (2012). Workflow mining and outlier detection from clinical activity logs. Journal of Biomedical Informatics,
45(6), 1185–1190.
Burattin, A. (2016). PLG2: Multiperspective process randomization with online and offline simulations. In BPM (Demos) (pp. 1–6).
Burattin, A., & Sperduti, A. (2010). PLG: A framework for the generation of business process models and their execution logs. In
International conference on business process management (pp. 214–219). Springer.
Caron, F., Vanthienen, J., De Weerdt, J., & Baesens, B. (2011). Beyond x-raying a care-flow: Adopting different
focuses on care-flow mining. In Proc. First Int. Bus. Process Intell. Chall (pp. 1–11).
Caron, F., Vanthienen, J., Vanhaecht, K., Van Limbergen, E., De Weerdt, J., & Baesens, B. (2014). Monitoring care processes in the gynecologic oncology department. Computers in Biology and Medicine, 44, 88–96.
Chang, H., Yu, J. Y., Yoon, S. Y., Hwang, S. Y., Yoon, H., Cha, W. C., Sim, M. S., Jo, I. J., & Kim, T. (2020). Impact of COVID-19 pandemic
on the overall diagnostic and therapeutic process for patients of emergency department and those with acute cerebrovascular disease.
Journal of Clinical Medicine, 9(12), 3842.
Chen, B., Alrifai, W., Cheng, G., Jones, B., Novak, L., Lorenzi, N., France, D., Malin, B., & Chen, Y. (2021). Mining tasks and task characteristics from electronic health record audit logs with unsupervised machine learning. Journal of the American Medical Informatics Association, 28(6), 1168–1177. https://doi.org/10.1093/jamia/ocaa338
Chen, S., Yang, S., Zhou, M., Burd, R., & Marsic, I. (2017). Process-oriented iterative multiple alignment for medical process mining. In 2017
IEEE international conference on data mining workshops (ICDMW) (pp. 438–445). IEEE.
Chinosi, M., & Trombetta, A. (2012). BPMN: An introduction to the standard. Computer Standards & Interfaces, 34(1), 124–134.
Chiudinelli, L., Dagliati, A., Tibollo, V., Albasini, S., Geifman, N., Peek, N., Holmes, J. H., Corsi, F., Bellazzi, R., & Sacchi, L. (2020). Mining
post-surgical care processes in breast cancer patients. Artificial Intelligence in Medicine, 105, 101855.
Cho, M., Song, M., Park, J., Yeom, S.-R., Wang, I.-J., & Choi, B.-K. (2020). Process mining-supported emergency room process performance
indicators. International Journal of Environmental Research and Public Health, 17(17), 6290.
Cho, M., Song, M., & Yoo, S. (2014). A systematic methodology for outpatient process analysis based on process mining. In Asia-Pacific conference on business process management (pp. 31–42). Springer.
Conca, T., Saint-Pierre, C., Herskovic, V., Sepúlveda, M., Capurro, D., Prieto, F., & Fernandez-Llatas, C. (2018). Multidisciplinary collaboration in the treatment of patients with type 2 diabetes in primary care: Analysis using process mining. Journal of Medical Internet
Research, 20(4), e8884.
Dagliati, A., Sacchi, L., Cerra, C., Leporati, P., de Cata, P., Chiovato, L., Holmes, J. H., & Bellazzi, R. (2014). Temporal data mining and process mining techniques to identify cardiovascular risk-associated clinical pathways in type 2 diabetes patients. In IEEE-EMBS international conference on biomedical and health informatics (BHI) (pp. 240–243). IEEE.
Dallagassa, M. R., dos Santos Garcia, C., Scalabrin, E. E., Ioshii, S. O., & Carvalho, D. R. (2021). Opportunities and challenges for applying
process mining in healthcare: A systematic mapping study. Journal of Ambient Intelligence and Humanized Computing, 1–18.
de Medeiros, A. K. A., Weijters, A. J. M. M., & van der Aalst, W. M. P. (2007). Genetic process mining: An experimental evaluation. Data
Mining and Knowledge Discovery, 14(2), 245–304.
De Oliveira, H., Prodel, M., Lamarsalle, L., Inada-Kim, M., Ajayi, K., Wilkins, J., Sekelj, S., Beecroft, S., Snow, S., Slater, R., & Orlowski, A.
(2020). “Bow-tie” optimal pathway discovery analysis of sepsis hospital admissions using the hospital episode statistics database in
England. JAMIA Open, 3(3), 439–448.
de Toledo, P., Joppien, C., Sesmero, M. P., & Drews, P. (2019). Mining disease courses across organizations: A methodology based on process
mining of diagnosis events datasets. In 2019 41st annual international conference of the IEEE engineering in medicine and biology society
(EMBC) (pp. 354–357). IEEE.
de Vries, G.-J., Neira, R. A. Q., Geleijnse, G., Dixit, P., & Mazza, B. F. (2017). Towards process mining of EMR data. In International joint
conference on biomedical engineering systems and technologies (BIOSTEC). SciTePress.
De Weerdt, J., Caron, F., Vanthienen, J., & Baesens, B. (2012). Getting a grasp on clinical pathway data: An approach based on process
mining. In Pacific-Asia conference on knowledge discovery and data mining (pp. 22–35). Springer.
Delias, P., Doumpos, M., Grigoroudis, E., Manolitzas, P., & Matsatsinis, N. (2015). Supporting healthcare management decisions via robust
clustering of event logs. Knowledge-Based Systems, 84, 203–213.
Delias, P., Manolitzas, P., Grigoroudis, E., & Matsatsinis, N. (2014). Applying process mining to the emergency department. In Encyclopedia
of business analytics and optimization (pp. 168–178). IGI Global.
Detro, S. P., Santos, E. A. P., Panetto, H., de Freitas Rocha Loures, E., & Lezoche, M. (2017). Managing business process variability through
process mining and semantic reasoning: An application in healthcare. In Working conference on virtual enterprises (pp. 333–340).
Springer.
Dewandono, R. D., Fauzan, R., Sarno, R., & Sidiq, M. (2013). Ontology and process mining for diabetic medical treatment sequencing. In Proceedings of the 7th international conference on information & communication technology and systems (ICTS) (pp. 171–178).
Dixit, P. M., Verbeek, H. M. W., Buijs, J. C. A. M., & van der Aalst, W. M. P. (2018). Interactive data-driven process model construction. In
International conference on conceptual modeling (pp. 251–265). Springer.
Duma, D., & Aringhieri, R. (2017). Mining the patient flow through an emergency department to deal with overcrowding. In International
conference on health care systems engineering (pp. 49–59). Springer.
Duma, D., & Aringhieri, R. (2020). An ad hoc process mining approach to discover patient paths of an emergency department. Flexible
Services and Manufacturing Journal, 32(1), 6–34.
Dunkl, R., Fröschl, K. A., Grossmann, W., & Rinderle-Ma, S. (2011). Assessing medical treatment compliance based on formal process
modeling. In Symposium of the Austrian HCI and usability engineering group (pp. 533–546). Springer.
Durojaiye, A. B., Levin, S., Toerper, M., Kharrazi, H., Lehmann, H. P., & Gurses, A. P. (2019). Evaluation of multidisciplinary collaboration
in pediatric trauma care using EHR data. Journal of the American Medical Informatics Association, 26(6), 506–515.
Durojaiye, A. B., McGeorge, N. M., Puett, L. L., Stewart, D., Fackler, J. C., Hoonakker, P. L. T., Lehmann, H. P., & Gurses, A. P. (2018).
Mapping the flow of pediatric trauma patients using process mining. Applied Clinical Informatics, 9(03), 654–666.
Erdogan, T. G., & Tarhan, A. (2018). A goal-driven evaluation method based on process mining for healthcare processes. Applied Sciences,
8(6), 894.
Erdogan, T. G., & Tarhan, A. (2018). Systematic mapping of process mining studies in healthcare. IEEE Access, 6, 24543–24567.
Farid, N. F., De Kamps, M., & Johnson, O. A. (2019). Process mining in frail elderly care: A literature review. In Proceedings of the 12th international joint conference on biomedical engineering systems and technologies—Volume 5: HEALTHINF (Vol. 5, pp. 332–339). SciTePress,
Science and Technology Publications.
Fei, H., & Meskens, N. (2010). Discovering patient care process models from event logs. In 8th international conference of modeling. Citeseer.
Fernandez-Llatas, C. (2021a). Applying interactive process mining paradigm in healthcare domain. In Interactive process mining in healthcare
(pp. 103–117). Springer.
Fernandez-Llatas, C. (2021b). Bringing interactive process mining to health professionals: Interactive data rodeos. In Interactive process mining in healthcare (pp. 119–140). Springer.
Fernandez-Llatas, C., Benedi, J. M., Gama, J. M., Sepulveda, M., Rojas, E., Vera, S., & Traver, V. (2021). Interactive process mining in surgery
with real time location systems: Interactive trace correction. In Interactive process mining in healthcare (pp. 181–202). Springer.
Fernandez-Llatas, C., Benedi, J.-M., García-Gómez, J. M., & Traver, V. (2013). Process mining for individualized behavior modeling using
wireless tracking in nursing homes. Sensors, 13(11), 15434–15451.
Fernandez-Llatas, C., Garcia-Gomez, J. M., Vicente, J., Naranjo, J. C., Robles, M., Benedi, J. M., & Traver, V. (2011). Behaviour patterns
detection for persuasive design in nursing homes to help dementia patients. In 2011 annual international conference of the IEEE engineering in medicine and biology society (pp. 6413–6417). IEEE.
Fernandez-Llatas, C., Lizondo, A., Monton, E., Benedi, J.-M., & Traver, V. (2015). Process mining methodology for health process tracking
using real-time indoor location systems. Sensors, 15(12), 29821–29840.
Fernandez-Llatas, C., Meneu, T., Benedi, J. M., & Traver, V. (2010). Activity-based process mining for clinical pathways computer aided
design. In 2010 annual international conference of the IEEE engineering in medicine and biology (pp. 6178–6181). IEEE.
Fernandez-Llatas, C., Meneu, T., Benedí, J.-M., & Traver, V. (2011). Continuous clinical pathways evaluation by using automatic learning
algorithms. In HEALTHINF (pp. 228–234). SciTePress.
Fernandez-Llatas, C., Sacchi, L., Benedi, J. M., Dagliati, A., Traver, V., & Bellazzi, R. (2014). Temporal abstractions to enrich activity-based
process mining corpus with clinical time series. In IEEE-EMBS international conference on biomedical and health informatics (BHI)
(pp. 785–788). IEEE.
Folino, F., Greco, G., Guzzo, A., & Pontieri, L. (2011). Mining usage scenarios in business processes: Outlier-aware discovery and run-time
prediction. Data & Knowledge Engineering, 70(12), 1005–1029. https://doi.org/10.1016/j.datak.2011.07.002
Forsberg, D., Rosipko, B., & Sunshine, J. L. (2016). Analyzing PACS usage patterns by means of process mining: Steps toward a more detailed
workflow analysis in radiology. Journal of Digital Imaging, 29(1), 47–58.
Franck, T., Bercelli, P., Aloui, S., & Augusto, V. (2020). A generic framework to analyze and improve patient pathways within a healthcare
network using process mining and discrete-event simulation. In 2020 winter simulation conference (WSC) (pp. 968–979). IEEE.
Furniss, S. K., Burton, M. M., Grando, A., Larson, D. W., & Kaufman, D. R. (2016). Integrating process mining and cognitive analysis to study
EHR workflow. In AMIA annual symposium proceedings (Vol. 2016, p. 580). American Medical Informatics Association.
Ganesha, K., Dhanush, S., & Swapnil Raj, S. M. (2017). An approach to fuzzy process mining to reduce patient waiting time in a hospital. In
2017 international conference on innovations in information, embedded and communication systems (ICIIECS) (pp. 1–6). IEEE.
García, A. O., Pérez-Alfonso, D., & Armenteros, O. U. L. (2015). Analysis of hospital processes with process mining techniques. In MedInfo
(pp. 310–314). IOS Press. doi:10.3233/978-1-61499-564-7-310
Garg, N., & Agarwal, S. (2016). Process mining for clinical workflows. In Proceedings of the international conference on advances in information communication technology & computing (pp. 1–5). ACM. doi:10.1145/2979779.2979784
Gatta, R., Lenkowicz, J., Vallati, M., Rojas, E., Damiani, A., Sacchi, L., de Bari, B., Dagliati, A., Fernandez-Llatas, C., Montesi, M.,
Marchetti, A., Castellano, M., & Valentini, V. (2017). pMineR: An innovative R library for performing process mining in medicine. In
Conference on artificial intelligence in medicine in Europe (pp. 351–355). Springer.
Gattnar, E., Ekinci, O., & Detschew, V. (2011). A novel generic clinical reference process model for event-based process times measurement.
In International conference on business information systems (pp. 65–76). Springer.
Gattnar, E., Ekinci, O., Detschew, V., & Capel-Tunon, M. (2011). Event-based workflow analysis in healthcare. In
IVM/FTMDD/RTSOABIS/MSVVEIS (pp. 61–70). SciTePress.
Ghasemi, M., & Amyot, D. (2016). Process mining in healthcare: A systematised literature review. International Journal of Electronic
Healthcare, 9(1), 60–88.
Gonzalez-García, J., Tellería-Orriols, C., Estupiñan-Romero, F., & Bernal-Delgado, E. (2020). Construction of empirical care pathways process models from multiple real-world datasets. IEEE Journal of Biomedical and Health Informatics, 24(9), 2671–2680.
Grando, A., Groat, D., Furniss, S. K., Nowak, J., Gaines, R., Kaufman, D. R., Poterack, K. A., Miksch, T., & Helmers, R. A. (2017). Using
process mining techniques to study workflows in a pre-operative setting. In AMIA annual symposium proceedings (Vol. 2017, p. 790).
American Medical Informatics Association.
Grando, M. A., Schonenberg, M. H., & van der Aalst, W. M. P. (2011). Semantic process mining for the verification of medical recommendations. In HEALTHINF (pp. 5–16). SciTePress.
Grüger, J., Bergmann, R., Kazik, Y., & Kuhn, M. (2020). Process mining for case acquisition in oncology: A systematic literature review. In
Trabold, D., Welke, P., Piatkowski, N. (Eds.), CEUR Workshop Proceedings (Vol. 2738, pp. 162–173). CEUR-WS.
Günther, C. W., & van der Aalst, W. M. P. (2006). Mining activity clusters from low-level event logs. Beta Research School for Operations
Management and Logistics.
Günther, C. W., & Van Der Aalst, W. M. P. (2007). Fuzzy mining–adaptive process simplification based on multi-perspective metrics. In
International conference on business process management (pp. 328–343). Springer.
Han, B., Jiang, L., & Cai, H. (2011). Abnormal process instances identification method in healthcare environment. In 2011 IEEE 10th international conference on trust, security and privacy in computing and communications (pp. 1387–1392). IEEE.
Helm, E., Lin, A. M., Baumgartner, D., Lin, A. C., & Küng, J. (2020). Towards the use of standardized terms in clinical case studies for
process mining in healthcare. International Journal of Environmental Research and Public Health, 17(4), 1348.
Hendricks, R. M. (2019). Process mining of incoming patients with sepsis. Online Journal of Public Health Informatics, 11(2), e14. doi:10.5210/ojphi.v11i2.10151
Homayounfar, P. (2012). Process mining challenges in hospital information systems. In 2012 federated conference on computer science and
information systems (FedCSIS) (pp. 1135–1140). IEEE.
Huang, Z., Dong, W., Ji, L., Gan, C., Lu, X., & Duan, H. (2014). Discovery of clinical pathway patterns from event logs using probabilistic topic models. Journal of Biomedical Informatics, 47, 39–57.
Huang, Z., Lu, X., & Duan, H. (2012). On mining clinical pathway patterns from medical behaviors. Artificial Intelligence in Medicine,
56(1), 35–50.
Huang, Z., Lu, X., Duan, H., & Fan, W. (2013). Summarizing clinical pathways from event logs. Journal of Biomedical Informatics, 46(1),
111–127.
Ibanez-Sanchez, G., Celda, M. A., Mandingorra, J., & Fernandez-Llatas, C. (2021). Interactive process mining in emergencies. In Interactive
process mining in healthcare (pp. 165–180). Springer.
Ibanez-Sanchez, G., Fernandez-Llatas, C., Martinez-Millana, A., Celda, A., Mandingorra, J., Aparici-Tortajada, L., Valero-Ramon, Z., Munoz-Gama, J., Sepúlveda, M., Rojas, E., Galvez, V., Capurro, D., & Traver, V. (2019). Toward value-based healthcare through interactive process
mining in emergency rooms: The stroke case. International Journal of Environmental Research and Public Health, 16(10), 1783.
IEEE. (2016). IEEE Standard for eXtensible Event Stream (XES) for achieving interoperability in event logs and event streams (IEEE Std
1849-2016) (pp. 1–50). IEEE. https://doi.org/10.1109/IEEESTD.2016.7740858
Jagadeesh Chandra Bose, R. P. & van der Aalst, W. M. P. 1118
Jagadeesh Chandra Bose, R. P., & Van der Aalst, W. M. P. (2009). Abstractions in process mining: A taxonomy of patterns. In International
conference on business process management (pp. 159–175). Springer.
Jaisook, P., & Premchaiswadi, W. (2015). Time performance analysis of medical treatment processes by using disco. In 2015 13th international conference on ICT and knowledge engineering (ICT & Knowledge Engineering 2015) (pp. 110–115). IEEE.
van der Werf, J. M. E. M., van Dongen, B. F., Hurkens, C. A. J., & Serebrenik, A. (2008). Process discovery using integer linear
programming. In International conference on applications and theory of Petri nets (pp. 368–387). Springer.
Jangi, M., Moghbeli, F., Ghaffari, M., & Vahedinemani, A. (2019). Hospital management based on semantic process mining: A systematic
review. Frontiers in Health Informatics, 8(1), 4.
Janssenswillen, G., Depaire, B., Swennen, M., Jans, M., & Vanhoof, K. (2019). bupaR: Enabling reproducible business process analysis.
Knowledge-Based Systems, 163, 927–930.
Jaroenphol, E., Porouhan, P., & Premchaiswadi, W. (2015). Analysis of the patients' treatment process in a hospital in Thailand using fuzzy
mining algorithms. In 2015 13th international conference on ICT and knowledge engineering (ICT & Knowledge Engineering 2015)
(pp. 131–136). IEEE.
Jaturogpattana, T., Arpasat, P., Kungcharoen, K., Intarasema, S., & Premchaiswadi, W. (2017). Conformance analysis of outpatient data
using process mining technique. In 2017 15th international conference on ICT and knowledge engineering (ICT&KE) (pp. 1–6). IEEE.
Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L.-W. H., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., & Mark, R. G. (2016).
MIMIC-III, a freely accessible critical care database. Scientific Data, 3(1), 1–9.
Johnson, O. A., Dhafari, T. B., Kurniati, A., Fox, F., & Rojas, E. (2018). The clearpath method for care pathway process mining and simulation. In International conference on business process management (pp. 239–250). Springer.
Johnson, O. A., Hall, P. S., & Hulme, C. (2016). NETIMIS: Dynamic simulation of health economics outcomes using big data.
PharmacoEconomics, 34(2), 107–114.
Kamel Boulos, M. N., & Berry, G. (2012). Real-time locating systems (RTLS) in healthcare: A condensed primer. International Journal of
Health Geographics, 11(1), 1–8.
Kaymak, U., Mans, R., Van de Steeg, T., & Dierks, M. (2012). On process mining in health care. In 2012 IEEE international conference on systems, man, and cybernetics (SMC) (pp. 1859–1864). IEEE.
Kelleher, D. C., Jagadeesh Chandra Bose, R. P., Waterhouse, L. J., Carter, E. A., & Burd, R. S. (2014). Effect of a checklist on advanced
trauma life support workflow deviations during trauma resuscitations without pre-arrival notification. Journal of the American College of
Surgeons, 218(3), 459–466.
Kempa-Liehr, A. W., Lin, C. Y.-C., Britten, R., Armstrong, D., Wallace, J., Mordaunt, D., & O'Sullivan, M. (2020). Healthcare pathway discovery and probabilistic machine learning. International Journal of Medical Informatics, 137, 104087.
Kim, E., Kim, S., Song, M., Kim, S., Yoo, D., Hwang, H., & Yoo, S. (2013). Discovery of outpatient care process of a tertiary university hospital
using process mining. Healthcare Informatics Research, 19(1), 42.
Kirchner, K., Herzberg, N., Solti, A. R., & Weske, M. (2012). Embedding conformance checking in a process intelligence system in hospital
environments. In Process support and knowledge representation in health care (pp. 126–139). Springer.
Kirchner, K., & Markovic, P. (2018). Unveiling hidden patterns in flexible medical treatment processes–a process mining case study. In International conference on decision support system technology (pp. 169–180). Springer.
Kitchenham, B. (2004). Procedures for performing systematic reviews. Keele University.
Kovalchuk, S. V., Funkner, A. A., Metsker, O. G., & Yakovlev, A. N. (2018). Simulation of patient flow in multiple healthcare units using
process and data mining techniques for model identification. Journal of Biomedical Informatics, 82, 128–142.
Kukreja, G., & Batra, S. (2017). Analogize process mining techniques in healthcare: Sepsis case study. In 2017 4th international conference on
signal processing, computing and control (ISPCC) (pp. 482–487). IEEE.
Kurniati, A., Hall, G., Hogg, D., & Johnson, O. (2018b). Process mining to explore variation in chemotherapy pathways for breast cancer
patients. British Journal of Cancer, 119, 16.
Kurniati, A. P., Hall, G., Hogg, D., & Johnson, O. (2018a). Process mining in oncology using the MIMIC-III dataset. Journal of Physics:
Conference Series, 971, 012008.
Kurniati, A. P., Johnson, O., Hogg, D., & Hall, G. (2016). Process mining in oncology: A literature review. In 2016 6th international conference
on information communication and management (ICICM) (pp. 291–297). IEEE.
Kurniati, A. P., McInerney, C., Zucker, K., Hall, G., Hogg, D., & Johnson, O. (2020). Using a multi-level process comparison for process
change analysis in cancer pathways. International Journal of Environmental Research and Public Health, 17(19), 7210.
Kurniati, A. P., Rojas, E., Hogg, D., Hall, G., & Johnson, O. A. (2019). The assessment of data quality issues for process mining in healthcare
using medical information mart for intensive care III, a freely available e-health record database. Health Informatics Journal, 25(4),
1878–1893.
Kusuma, G., Sykes, S., McInerney, C., & Johnson, O. (2020). Process mining of disease trajectories: A feasibility study. In Proceedings of the
13th international joint conference on biomedical engineering systems and technologies (Vol. 5, pp. 705–712). Science and Technology
Publications.
Kusuma, G. P., Hall, M., Gale, C. P., & Johnson, O. A. (2018). Process mining in cardiology: A literature review. International Journal of
Bioscience, Biochemistry and Bioinformatics, 8, 226–236.
Lakshmanan, G. T., Rozsnyai, S., & Wang, F. (2013). Investigating clinical care pathways correlated with outcomes. In Business process
management (pp. 323–338). Springer.
Lamine, E., Fontanili, F., Di Mascolo, M., & Pingaud, H. (2015). Improving the management of an emergency call service by combining
process mining and discrete event simulation approaches. In Working conference on virtual enterprises (pp. 535–546). Springer.
Lanzola, G., Parimbelli, E., Micieli, G., Cavallini, A., & Quaglini, S. (2014). Data quality and completeness in a web stroke registry as the
basis for data and process mining. Journal of Healthcare Engineering, 5(2), 163–184.
Law, A. M., & Kelton, W. D. (2000). Simulation modeling and analysis (Vol. 3). McGraw-Hill New York.
Lee, Y. H., & Rismanchian, F. (2018). Optimizing hospital facility layout planning through process mining of clinical pathways. Annals of
Optimization Theory and Practice, 1(1), 1–9.
Leemans, S. J. J., Fahland, D., & van der Aalst, W. M. P. (2013). Discovering block-structured process models from event logs—A constructive approach. In International conference on applications and theory of Petri nets and concurrency (pp. 311–329). Springer.
Leonardi, G., Montani, S., Portinale, L., Quaglini, S., & Striani, M. (2019). Discovering knowledge embedded in bio-medical databases: Experiences in food characterization and in medical process mining. In Innovations in big data mining and embedded knowledge (pp. 117–136). Springer.
Lira, R., Salas-Morales, J., Leiva, L., Fuentes, R., Delfino, A., Nazal, C. H., Sepúlveda, M., Arias, M., Herskovic, V., & Munoz-Gama, J. (2019).
Process-oriented feedback through process mining for surgical procedures in medical training: The ultrasound-guided central venous
catheter placement case. International Journal of Environmental Research and Public Health, 16(11), 1877.
Lismont, J., Janssens, A.-S., Odnoletkova, I., vanden Broucke, S., Caron, F., & Vanthienen, J. (2016). A guide for the application of analytics
on healthcare processes: A dynamic view on patient pathways. Computers in Biology and Medicine, 77, 125–134.
Liu, C., Ge, Y., Xiong, H., Xiao, K., Geng, W., & Perkins, M. (2014). Proactive workflow modeling by stochastic processes with application to
healthcare operation and management. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and
data mining (pp. 1593–1602). ACM.
Lu, X., Tabatabaei, S. A., Hoogendoorn, M., & Reijers, H. A. (2019). Trace clustering on very large event data in healthcare using frequent
sequence patterns. In International conference on business process management (pp. 198–215). Springer.
Mannhardt, F., & Blinde, D. (2017). Analyzing the trajectories of patients with sepsis using process mining. In RADAR+ EMISA@ CAiSE
(pp. 72–80). CEUR-WS.
Mannhardt, F., De Leoni, M., & Reijers, H. A. (2015). The multi-perspective process explorer. BPM (Demos), 1418, 130–134.
Mans, R., Reijers, H., van Genuchten, M., & Wismeijer, D. (2012). Mining processes in dentistry. In Proceedings of the 2nd ACM SIGHIT international health informatics symposium (pp. 379–388). ACM.
Mans, R., Reijers, H., Wismeijer, D., & Van Genuchten, M. (2013). A process-oriented methodology for evaluating the impact of IT: A proposal and an application in healthcare. Information Systems, 38(8), 1097–1115.
Mans, R. S., Van der Aalst, W. M. P., & Vanwersch, R. J. B. (2015). Process mining in healthcare: Evaluating and exploiting operational
healthcare processes. Springer International Publishing. https://doi.org/10.1007/978-3-319-16071-9_2
Mans, R. S., Van der Aalst, W. M. P., Vanwersch, R. J. B., & Moleman, A. J. (2012). Process mining in healthcare: Data challenges when
answering frequently posed questions. In Process support and knowledge representation in health care (pp. 140–153). Springer.
Marazza, F., Bukhsh, F. A., Geerdink, J., Vijlbrief, O., Pathak, S., van Keulen, M., & Seifert, C. (2020). Automatic process comparison for subpopulations: Application in cancer care. International Journal of Environmental Research and Public Health, 17(16), 5707.
Martin, N. (2018). Using indoor location system data to enhance the quality of healthcare event logs: Opportunities and challenges. In International conference on business process management (pp. 226–238). Springer.
Martinez-Millana, A., Lizondo, A., Gatta, R., Vera, S., Salcedo, V. T., & Fernandez-Llatas, C. (2019). Process mining dashboard in operating
rooms: Analysis of staff expectations with analytic hierarchy process. International Journal of Environmental Research and Public Health,
16(2), 199.
Martinez-Millana, A., Merino-Torres, J.-F., Valdivieso, B., & Fernandez-Llatas, C. (2021). Interactive process mining in type 2 diabetes
mellitus. In Interactive process mining in healthcare (pp. 203–215). Springer.
McGregor, C., Catley, C., & James, A. (2011). A process mining driven framework for clinical guideline improvement in critical care. In
Proceedings of the learning from medical data streams workshop. Bled, Slovenia (July 2011), (Vol. 765, pp. 34–45). CEUR.
McGregor, C., Percival, J., Curry, J., Foster, D., Anstey, E., & Churchill, D. (2008). A structured approach to requirements gathering creation using
PaJMa models. In 2008 30th annual international conference of the IEEE engineering in medicine and biology society (pp. 1506–1509). IEEE.
Meneu, T., Traver, V., Guillén, S., Valdivieso, B., Benedi, J., & Fernandez-Llatas, C. (2013). Heart cycle: Facilitating the deployment of
advanced care processes. In 2013 35th annual international conference of the IEEE engineering in medicine and biology society (EMBC)
(pp. 6996–6999). IEEE.
Mertens, S., Gailly, F., & Poels, G. (2018). Discovering health-care processes using DeciClareMiner. Health Systems, 7(3), 195–211.
Metsker, O., Bolgova, E., Yakovlev, A., Funkner, A., & Kovalchuk, S. (2017). Pattern-based mining in electronic health records for complex
clinical process analysis. Procedia Computer Science, 119, 197–206.
Miclo, R., Fontanili, F., Marquès, G., Bomert, P., & Lauras, M. (2015). RTLS-based process mining: Towards an automatic process diagnosis
in healthcare. In 2015 IEEE international conference on automation science and engineering (CASE) (pp. 1397–1402). IEEE.
Montani, S., Leonardi, G., Quaglini, S., Cavallini, A., & Micieli, G. (2013). Mining and retrieving medical processes to assess the quality of
care. In International conference on case-based reasoning (pp. 233–240). Springer.
Montani, S., Leonardi, G., Quaglini, S., Cavallini, A., & Micieli, G. (2014). Improving structural medical process comparison by exploiting
domain knowledge and mined information. Artificial Intelligence in Medicine, 62(1), 33–45.
Montani, S., Striani, M., Quaglini, S., Cavallini, A., & Leonardi, G. (2017). Knowledge-based trace abstraction for semantic process mining.
In Conference on artificial intelligence in medicine in Europe (pp. 267–271). Springer.
Naeem, M. R., Naeem, H., Aamir, M., Ali, W., & Abro, W. A. (2017). A multi-level process mining framework for correlating and clustering
of biomedical activities using event logs. International Journal of Advanced Computer Science and Applications, 8(3), 393–401.
Najjar, A., Reinharz, D., Girouard, C., & Gagné, C. (2018). A two-step approach for mining patient treatment pathways in administrative
healthcare databases. Artificial Intelligence in Medicine, 87, 34–48.
Neira, R. A. Q., Hompes, B. F. A., de Vries, J. G.-J., Mazza, B. F., Simões de Almeida, S. L., Stretton, E., Buijs, J. C. A. M., & Hamacher, S.
(2019). Analysis and optimization of a sepsis clinical pathway using process mining. In International conference on business process management (pp. 459–470). Springer.
Neumuth, T., Liebmann, P., Wiedemann, P., & Meixensberger, J. (2012). Surgical workflow management schemata for cataract procedures.
Methods of Information in Medicine, 51(05), 371–382.
Neumuth, T., Jannin, P., Schlomberg, J., Meixensberger, J., Wiedemann, P., & Burgert, O. (2011). Analysis of surgical intervention
populations using generic surgical process models. International Journal of Computer Assisted Radiology and Surgery, 6(1), 59–71.
Nigam, A., & Caswell, N. S. (2003). Business artifacts: An approach to operational specification. IBM Systems Journal, 42(3), 428–445.
Partington, A., Wynn, M., Suriadi, S., Ouyang, C., & Karnon, J. (2015). Process mining for clinical processes: A comparative analysis of four
Australian hospitals. ACM Transactions on Management Information Systems (TMIS), 5(4), 1–18.
Pebesma, J., Martinez-Millana, A., Sacchi, L., Fernandez-Llatas, C., de Cata, P., Chiovato, L., Bellazzi, R., & Traver, V. (2019). Clustering cardiovascular risk trajectories of patients with type 2 diabetes using process mining. In 2019 41st annual international conference of the
IEEE engineering in medicine and biology society (EMBC) (pp. 341–344). IEEE.
Perer, A., Wang, F., & Hu, J. (2015). Mining and exploring care pathways from electronic medical records with visual analytics. Journal
of Biomedical Informatics, 56, 369–378.
Perimal-Lewis, L., De Vries, D., & Thompson, C. H. (2014). Health intelligence: Discovering the process model using process mining by constructing start-to-end patient journeys. In Proceedings of the seventh Australasian workshop on health informatics and knowledge management (Vol. 153, pp. 59–67). Australian Computer Society, Inc.
Perimal-Lewis, L., Qin, S., Thompson, C., & Hakendorf, P. (2012). Gaining insight from patient journey data using a process-oriented analysis approach. In Proceedings of the fifth Australasian workshop on health informatics and knowledge management (Vol. 129, pp. 59–66).
Australian Computer Society Inc.
Perimal-Lewis, L., Teubner, D., Hakendorf, P., & Horwood, C. (2016). Application of process mining to assess the data quality of routinely
collected time-based performance data sourced from electronic health records by validating process conformance. Health Informatics
Journal, 22(4), 1017–1029.
Pesic, M., Schonenberg, H., & Van der Aalst, W. M. P. (2007). Declare: Full support for loosely-structured processes. In 11th IEEE international enterprise distributed object computing conference (EDOC 2007) (p. 287). IEEE.
Peterson, J. L. (1977). Petri nets. ACM Computing Surveys (CSUR), 9(3), 223–252.
Phan, R., Augusto, V., Martin, D., & Sarazin, M. (2019). Clinical pathway analysis using process mining and discrete-event simulation: An
application to incisional hernia. In 2019 winter simulation conference (WSC) (pp. 1172–1183). IEEE.
Piccialli, F., Di Somma, V., Giampaolo, F., Cuomo, S., & Fortino, G. (2021). A survey on deep learning in medicine: Why, how and when?
Information Fusion, 66, 111–137. https://doi.org/10.1016/j.inffus.2020.09.006
Pika, A., Wynn, M. T., Budiono, S., ter Hofstede, A. H. M., van der Aalst, W. M. P., & Reijers, H. A. (2019). Towards privacy-preserving process mining in healthcare. In International conference on business process management (pp. 483–495). Springer.
Placidi, L., Boldrini, L., Lenkowicz, J., Manfrida, S., Gatta, R., Damiani, A., Chiesa, S., Ciellini, F., & Valentini, V. (2021). Process mining to optimize
palliative patient flow in a high-volume radiotherapy department. Technical Innovations & Patient Support in Radiation Oncology, 17, 32–39.
Poelmans, J., Dedene, G., Verheyden, G., Van der Mussele, H., Viaene, S., & Peters, E. (2010). Combining business process and data discovery techniques for analyzing and improving integrated care pathways. In Industrial conference on data mining (pp. 505–517). Springer.
Prodel, M., Augusto, V., Xie, X., Jouaneton, B., & Lamarsalle, L. (2015). Discovery of patient pathways from a national hospital database
using process mining and integer linear programming. In 2015 IEEE international conference on automation science and engineering
(CASE) (pp. 1409–1414). IEEE.
Prokofyeva, E. S., & Zaytsev, R. D. (2020). Clinical pathways analysis of patients in medical institutions based on hard and fuzzy clustering
methods. Data Analysis and Intelligence Systems, 14(1), 19–31.
Rebuge, A., & Ferreira, D. R. (2012). Business process analysis in healthcare environments: A methodology based on process mining. Information Systems, 37(2), 99–116.
Rinner, C., Helm, E., Dunkl, R., Kittler, H., & Rinderle-Ma, S. (2018). An application of process mining in the context of melanoma surveillance using time boxing. In International conference on business process management (pp. 175–186). Springer.
Riz, G., Santos, E. A. P., de Freitas, E., & Loures, R. (2016). Process mining to knowledge discovery in healthcare processes. In Transdisciplinary engineering: Crossing boundaries (pp. 1019–1028). IOS Press.
Rojas, E., Arias, M., & Sepúlveda, M. (2015). Clinical processes and its data, what can we do with them. In Proceedings of the international
conference on health informatics (HEALTHINF 2015), Lisbon, Portugal (pp. 12–15). SciTePress.
Rojas, E., & Capurro, D. (2018). Characterization of drug use patterns using process mining and temporal abstraction digital phenotyping. In
International conference on business process management (pp. 187–198). Springer.
Rojas, E., Cifuentes, A., Burattin, A., Munoz-Gama, J., Sepúlveda, M., & Capurro, D. (2018). Analysis of emergency room episodes duration
through process mining. In International conference on business process management (pp. 251–263). Springer.
Rojas, E., Fernandez-Llatas, C., Traver, V., Munoz-Gama, J., Sepúlveda, M., Herskovic, V., & Capurro, D. (2017). PALIA-ER: Bringing
question-driven process mining closer to the emergency room. In BPM (Demos).
Rojas, E., Munoz-Gama, J., Sepúlveda, M., & Capurro, D. (2016). Process mining in healthcare: A literature review. Journal of Biomedical
Informatics, 61, 224–236.
Rojas, E., Sepúlveda, M., Munoz-Gama, J., Capurro, D., Traver, V., & Fernandez-Llatas, C. (2017). Question-driven methodology for analyzing emergency room processes using process mining. Applied Sciences, 7(3), 302.
Ronny, S., Mans, M. H. S., Song, M., Van der Aalst, W. M. P., & Bakker, P. J. M. (2015). Process Mining in Healthcare. In International
conference on health informatics (HEALTHINF'08) (pp. 118–125). SciTePress.
Rovani, M., Maggi, F. M., De Leoni, M., & Van Der Aalst, W. M. P. (2015). Declarative process mining in healthcare. Expert Systems with
Applications, 42(23), 9236–9251.
Sato, D. M. V., Mantovani, L. K., Safanelli, J., Guesser, V., Nagel, V., Moro, C. H. C., Cabral, N. L., Scalabrin, E. E., Moro, C., &
Santos, E. A. P. (2020). Ischemic stroke: Process perspective, clinical and profile characteristics, and external factors. Journal of Biomedical Informatics, 111, 103582.
Stefanini, A., Aloini, D., Benevento, E., Dulmin, R., & Mininno, V. (2018). Performance analysis in emergency departments: A data-driven
approach. Measuring Business Excellence, 22(2), 130–145. https://doi.org/10.1108/MBE-07-2017-0040
Stefanini, A., Aloini, D., Dulmin, R., & Mininno, V. (2016). Linking diagnostic-related groups (DRGs) to their processes by process mining.
HEALTHINF, 5, 438–443.
Suriadi, S., Mans, R. S., Wynn, M. T., Partington, A., & Karnon, J. (2014). Measuring patient flow variations: A cross-organisational process
mining approach. In Asia-Pacific conference on business process management (pp. 43–58). Springer.
Tamburis, O., & Esposito, C. (2020). Process mining as support to simulation modeling: A hospital-based case study. Simulation Modelling
Practice and Theory, 104, 102149.
Tax, N., Sidorova, N., Haakma, R., & van der Aalst, W. M. P. (2016). Mining local process models. Journal of Innovation in Digital Ecosystems,
3(2), 183–196.
Tax, N., Sidorova, N., van der Aalst, W. M. P., & Haakma, R. (2016). Heuristic approaches for generating local process models through log
projections. In 2016 IEEE symposium series on computational intelligence (SSCI) (pp. 1–8). IEEE.
Tibeme, B., Shahriar, H., & Zhang, C. (2018). Process mining algorithms for clinical workflow analysis. In SoutheastCon 2018 (pp. 1–6).
IEEE.
Tóth, K., Machalik, K., Fogarassy, G., & Vathy-Fogarassy, A. (2017). Applicability of process mining in the exploration of healthcare
sequences. In 2017 IEEE 30th Neumann colloquium (NC) (pp. 000151–000156). IEEE.
Valero-Ramon, Z., Fernandez-Llatas, C., Martinez-Millana, A., & Traver, V. (2019). A dynamic behavioral approach to nutritional assessment using process mining. In 2019 IEEE 32nd international symposium on computer-based medical systems (CBMS) (pp. 398–404).
IEEE.
Valero-Ramon, Z., Fernandez-Llatas, C., Martinez-Millana, A., & Traver, V. (2020). Interactive process indicators for obesity modelling using
process mining. In Advanced computational intelligence in Healthcare-7 (pp. 45–64). Springer.
Valero-Ramon, Z., Fernandez-Llatas, C., Valdivieso, B., & Traver, V. (2020). Dynamic models supporting personalised chronic disease management through healthcare sensors with interactive process mining. Sensors, 20(18), 5330.
Van Der Aalst, W., Adriansyah, A., De Medeiros, A. K. A., Arcieri, F., Baier, T., Blickle, T., Bose, J. C., Van Den Brand, P.,
Brandtjen, R., Buijs, J., Burattin, A., Carmona, J., Castellanos, M., Claes, J., Cook, J., Costantini, N., Curbera, F., Damiani, E., de
Leoni, M., … Wynn, M. (2011). Process mining manifesto. In International conference on business process management (pp. 169–194). Springer.
Van Der Aalst, W., Adriansyah, A., & Van Dongen, B. (2011). Causal nets: A modeling language tailored towards process discovery. In International conference on concurrency theory (pp. 28–42). Springer.
Van der Aalst, W. M. P. (2009). Process-aware information systems: Lessons to be learned from process mining. In Transactions on Petri nets
and other models of concurrency II (pp. 1–26). Springer.
Van der Aalst, W. M. P. (2011). Process mining: Discovering and improving spaghetti and lasagna processes. In 2011 IEEE symposium on
computational intelligence and data mining (CIDM) (pp. 1–7). IEEE.
van der Aalst, W. M. P., De Beer, H. T., & van Dongen, B. F. (2005). Process mining and verification of properties: An approach based
on temporal logic. In OTM confederated international conferences on the move to meaningful internet systems (pp. 130–147).
Springer.
Van der Aalst, W., Weijters, T., & Maruster, L. (2004). Workflow mining: Discovering process models from event logs. IEEE Transactions on
Knowledge and Data Engineering, 16(9), 1128–1142.
Van Der Spoel, S., Van Keulen, M., & Amrit, C. (2012). Process prediction in noisy data sets: A case study in a Dutch hospital. In International symposium on data-driven process discovery and analysis (pp. 60–83). Springer.
Van Eck, M. L., Lu, X., Leemans, S. J. J., & Van Der Aalst, W. M. P. (2015). PM²: A process mining project methodology. In International
conference on advanced information systems engineering (pp. 297–313). Springer.
Verbeek, H. M. W., Buijs, J. C. A. M., Van Dongen, B. F., & Van Der Aalst, W. M. P. (2010). XES, XESame, and ProM 6. In International conference on advanced information systems engineering (pp. 60–75). Springer.
Vogelgesang, T., & Appelrath, H.-J. (2013). Multidimensional process mining: A flexible analysis approach for health services research. In
Proceedings of the joint EDBT/ICDT 2013 workshops (Vol. 2013, pp. 17–22). ACM.
Weijters, A. J. M. M., van Der Aalst, W. M. P., & De Medeiros, A. K. A. (2006). Process mining with the heuristics miner algorithm (Tech. Rep.
WP 166). Technische Universiteit Eindhoven.
Weiskopf, N. G., & Weng, C. (2013). Methods and dimensions of electronic health record data quality assessment: Enabling reuse for clinical
research. Journal of the American Medical Informatics Association, 20(1), 144–151.
Weske, M. (2007). Business process management—Concepts, languages, architectures. Springer.
Williams, R., Rojas, E., Peek, N., & Johnson, O. A. (2018). Process mining in primary care: A literature review. Studies in Health Technology
and Informatics, 247, 376–380.
Wirth, R., & Hipp, J. (2000). CRISP-DM: Towards a standard process model for data mining. In Proceedings of the 4th international conference
on the practical applications of knowledge discovery and data mining (Vol. 1). Springer-Verlag.
Xu, H., Pang, J., Yang, X., Jinghui, Y., Li, X., & Zhao, D. (2020). Modeling clinical activities based on multi-perspective declarative process
mining with OpenEHR's characteristic. BMC Medical Informatics and Decision Making, 20(14), 1–11.
Xu, H., Pang, J., Yang, X., Jinghui, Y., & Zhao, D. (2019). A modeling approach based on multi-perspective declarative process mining for
clinical activity. In 2019 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 1688–1691). IEEE.
Xu, H., Pang, J., Yang, X., Li, M., & Zhao, D. (2020). Using predictive process monitoring to assist thrombolytic therapy decision-making for
ischemic stroke patients. BMC Medical Informatics and Decision Making, 20(3), 1–10.
Xu, H., Pang, J., Yang, X., Ma, L., Mao, H., & Zhao, D. (2020). Applying clinical guidelines to conformance checking for diagnosis and treatment: A case study of ischemic stroke. In 2020 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 2125–2130).
IEEE.
Xu, H., Pang, J., Zhang, W., Li, X., Li, M., & Zhao, D. (2021). Predicting recurrence for patients with ischemic cerebrovascular events based
on process discovery and transfer learning. IEEE Journal of Biomedical and Health Informatics, 25(7), 2445–2453.
Xu, H., Yan, H., Pang, J., Nan, S., Yang, X., & Zhao, D. (2020). Evaluating the relative value of care interventions based on clinical pathway
variation detection and propensity score. In 2020 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 1184–1187). IEEE.
Xu, X., Jin, T., Wei, Z., & Wang, J. (2017). Incorporating topic assignment constraint and topic correlation limitation into clinical goal discovering for clinical pathway mining. Journal of Healthcare Engineering, 2017(5208072), 1–13.
Yang, S., Li, J., Tang, X., Chen, S., Marsic, I., & Burd, R. S. (2017). Process mining for trauma resuscitation. The IEEE Intelligent Informatics
Bulletin, 18(1), 15.
Yang, S., Sarcevic, A., Farneth, R. A., Chen, S., Ahmed, O. Z., Marsic, I., & Burd, R. S. (2018). An approach to automatic process deviation
detection in a time-critical clinical process. Journal of Biomedical Informatics, 85, 155–167.
Yang, S., Tao, F., Li, J., Wang, D., Chen, S., Ahmed, O. Z., Marsic, I., & Burd, R. S. (2018). Process mining the trauma resuscitation patient
cohorts. In 2018 IEEE international conference on healthcare informatics (ICHI) (pp. 29–35). IEEE.
Yang, S., Zhou, M., Chen, S., Dong, X., Ahmed, O., Burd, R. S., & Marsic, I. (2017). Medical workflow modeling using alignment-guided
state-splitting HMM. In 2017 IEEE international conference on healthcare informatics (ICHI) (pp. 144–153). IEEE.
Yang, W., & Su, Q. (2014). Process mining for clinical pathway: Literature review and future directions. In 2014 11th international conference
on service systems and service management (ICSSSM) (pp. 1–5). IEEE.
Yoo, S., Cho, M., Kim, E., Kim, S., Sim, Y., Yoo, D., Hwang, H., & Song, M. (2016). Assessment of hospital processes using a process mining
technique: Outpatient process analysis at a tertiary hospital. International Journal of Medical Informatics, 88, 34–43.
Zhang, X., & Chen, S. (2012). Pathway identification via process mining for patients with multiple conditions. In 2012 IEEE international
conference on industrial engineering and engineering management (pp. 1754–1758). IEEE.
Zhou, M., Yang, S., Li, X., Lv, S., Chen, S., Marsic, I., Farneth, R. A., & Burd, R. S. (2017). Evaluation of trace alignment quality and
its application in medical process mining. In 2017 IEEE international conference on healthcare informatics (ICHI) (pp. 258–267).
IEEE.
Zhou, Z., Wang, Y., & Li, L. (2014). Process mining based modeling and analysis of workflows in clinical care-a case study in a Chicago outpatient clinic. In Proceedings of the 11th IEEE international conference on networking, sensing and control (pp. 590–595). IEEE.
Zhu, Q., Nie, H., Lu, X., & Duan, H. (2010). Radiology workflow-based monitoring dashboard in a heterogeneous environment. In 2010 3rd
international conference on biomedical engineering and informatics (Vol. 6, pp. 2494–2498). IEEE.
How to cite this article: Guzzo, A., Rullo, A., & Vocaturo, E. (2022). Process mining applications in the
healthcare domain: A comprehensive review. Wiley Interdisciplinary Reviews: Data Mining and Knowledge
Discovery, 12(2), e1442. https://doi.org/10.1002/widm.1442
APPENDIX
TABLE A1 Classification of the surveyed papers
Reference
Application Process
Approach
Tool
(Rojas, Fernandez-Llatas,
et al., 2017)
PMTD
(Araghi, Fontanili,
et al., 2018)
PMM
HM
DIAG
R.IO-DIAG
(Kelleher et al., 2014)
CA
MP
BPL
ProM
(A. Grando et al., 2017)
PD, SNA
PI, HM, PT CEDA
ProM, Disco,
PMApp
(S. Yang, Tao, et al., 2018)
PD
MP
CEDA
2
Technique
Data source Data
FC, HC
RTLS
PALIA-Web
Video rec.
TR
HoWSN
Video rec.
Surgery
Disco
FM, HC, KM
Video rec.
TR
Celonis
FM
Video rec.
Surgery
BSD
Sepsis
PALIA
BSD
Geriatrics
(Lira et al., 2019)
PPM
MP
PM
(McGregor et al., 2011)
PMM
PB
CRISP-DM
(Fernandez-Llatas, Garcia-Gomez, et al., 2011)
PD
PB
AB
(Kaymak et al., 2012)
PMM
MP
CEDA
ProM
HM
BSD
Surgery
(Alharbi et al., 2018)
PD, ELQI
CP
CEDA
Disco, R lib.
HMM-based
MIMIC
Neurology
(Rojas & Capurro, 2018)
PD
CP
CEDA
Disco
HM
MIMIC
Sepsis
CP
PM 2, L*
ProM
RLPN
MIMIC
Oncology
L*
ProM
HM
MIMIC
(A. P. Kurniati et al., 2018a) PMM
(A. P. Kurniati et al., 2019)
DQA
(Pika et al., 2019)
PPPM
PP
HIS, MIMIC
(Marazza et al., 2020)
VPA, VA
CP
CEDA
(Fernandez-Llatas
et al., 2013)
VPA
HM
CEDA
(Araghi, Fontanili, et al., 2018)
PD
HM
DIAG
ProM
R.IO-DIAG
IVM, RLPN
MIMIC
Oncology
PALIA
ILS
HH
HM
RTLS
Urology
(Araghi et al., 2019)
PPM
HM
DIAG+MP R.IO-DIAG
HM
RTLS
(Martinez-Millana
et al., 2019)
PMTD
HM
DIAG
PALIA-Web
IVM
RTLS
(Araghi et al., 2020)
PD
HM
DIAG
R.IO-DIAG
(Fernandez-Llatas
et al., 2015)
PD
PP
(Martin, 2018)
ELQI
(Fernandez-Llatas
et al., 2021)
ELQI
HM
(Binder et al., 2012)
ELE
CP
ProM, PALIA-Web
Surgery
ILS
HM, PALIA,
RLPN, FrM
RTLS
Surgery
HIS, ILS
IPM
ProM
PALIA
RTLS
Surgery
HM
HIS
Oncology
Cardiology
(Perimal-Lewis et al., 2014)
PD, ELE
PP
CEDA
ProM, Disco
HM
HIS
(García et al., 2015)
VPA, ELE
PT
CEDA
ProM
HM, RLPN
HIS
(Naeem et al., 2017)
PD, SNA,
CP
ELE, ELQI
CEDA
ProM, Disco
IM, IVM, HM,
FM, RLPN
HIS
Hepatitis
(Metsker et al., 2017)
ELE
CP
CEDA
Disco
TM
EHR
Cardiology
(Rinner et al., 2018)
CA, ELQI
CP
BPL + MP
ProM
MPPE
EHR
Oncology
(Vogelgesang &
Appelrath, 2013)
PMM
(Alharbi et al., 2017)
ELQI
ProM
IM, IMI
MIMIC
Diabetes
OLAP
MP
CEDA
2
(O. A. Johnson et al., 2018)
Sim, ELQI
CP
PM
(Fernandez-Llatas
et al., 2010)
PD, ELQI
PP
CEDA
(Ibanez-Sanchez
et al., 2019)
PMM, ELQI
PP
IPM
(Fernandez-Llatas
et al., 2014)
ELQI
(Lismont et al., 2016)
PD, PMM,
ELQI
PP
(Duma & Aringhieri, 2017)
PPA, ELQI
(Stefanini et al., 2016)
(Huang et al., 2014)
ProM, Netimis
EHR
PALIA
PMApp
Neurology
Cardiology
PALIA
HIS
Cardiology
CEDA+TC ProM, Disco
SNM, FM, SC
EHR
Diabetes
PP
CEDA
ProM
IMI, HM, DT
HIS
ED
PD, ELQI
PP
CEDA
Disco
FM
HIS
Oncology
PD, ELQI
CP
CEDA
LDA
HIS
Oncology
(Prokofyeva &
Zaytsev, 2020)
PD, ELQI
CP
CEDA+TC stringdist
HC, KM, LDA
HIS
Sepsis
(Chiudinelli et al., 2020)
PD, ELQI
CP
CEDA
SPM, LDA
EHR
Oncology
(X. Xu et al., 2017)
PD, ELQI
CP
CEDA
FM, LDA
HIS
Neurology
(Antonelli & Bruno, 2015)
PD, ELQI
PP
CEDA
Disco
FM
HIS
(Montani et al., 2017)
ELQI
PP
CEDA
ProM
HM
HIS
(M. A. Grando et al., 2011)
ELQI
CP
CEDA
ProM
α++, LTL-C
(Dewandono et al., 2013)
CA, ELQI
CP
CEDA
ProM
(Detro et al., 2017)
VA, ELQI
(Benevento, Dixit,
et al., 2019)
PD
CP
IPM
(Neumuth et al., 2011)
PD
MP
CEDA
(Caron et al., 2011)
PD, PU, SNA MP
CEDA
topicmodels
Cardiology
Respiratory
dis.
Diabetes
CEDA
Cardiology
ProM
IPM
HIS
ProM
HM, SNM, LTL-C
HIS
HIS
Oncology
Surgery
Oncology
FM
HIS
Oncology
SPM
EMR
Oncology
(De Weerdt et al., 2012)
PD
CP
CEDA
(Huang et al., 2012)
PD
CP
CEDA
C# lib.
(Neumuth et al., 2012)
PD
MP
MP
YAWL-PM
(R. Mans, Reijers,
et al., 2012)
PD, PPM,
SNA
PI
MP
ProM
(Bouarfa &
Dankelman, 2012)
PD, OD
MP
CEDA
(Furniss et al., 2016)
PD
PT
CEDA
Disco
(B. Chen et al., 2021)
PD
PT
CEDA
ProM
(Kim et al., 2013)
PD, CA
PP
CEDA
ProM
(Cho et al., 2014)
PD, Sim
PP
DIAG
ProM, CPN
Tools
(Jaroenphol et al., 2015)
PD
CP
CEDA
ProM, Disco
(Meneu et al., 2013)
PD
PP
CEDA
(Montani et al., 2013)
PD
CP
ProM
HM
HIS
Cardiology
(Huang et al., 2013)
PD
CP
C# lib.
Ad hoc
EMR
Oncology
(Dagliati et al., 2014)
PD
PB
CEDA
ProM
HM
HIS
Diabetes
(Miclo et al., 2015)
PD
HM
CEDA
Disco
FM
RTLS
CPLEX
ILP
HIS
Cardiology
HMM-based
Video rec.
TR
M-COFFEE
TA
Video rec.,
SD
TR
CEDA
Disco
HMM-based
HIS
TR
Declare
HIS
Orthopedic
CEDA
Disco
FM
HIS
ED
ProM
SC
HIS
ED
HIS
OS
PALIA, QTC
HIS
Malnutrition
ProM
α-M, HM
HIS
CEDA
ProM
α-M, HM, FM,
RLPN
HIS
Python lib.
(Prodel et al., 2015)
PD
PP
(S. Yang, Li, et al., 2017)
PD, CA
PP
(M. Zhou et al., 2017)
PD
MP
(S. Yang, Zhou, et al., 2017) PD
MP
Surgery
HM, RLPN
EMR
Dentistry
Video rec.
Surgery
Video rec.
IM
EHR
Pediatric
HIS
OS
HM, FM
HIS
OS
FM
HIS
OS
PALIA
CEDA
(Mertens et al., 2018)
PD
PP
(Abo-Hamad, 2017)
PD, PPM
PP
(Lee & Rismanchian, 2018)
PD
PP
(Kirchner &
Markovic, 2018)
PD
MP
LPM
(Valero-Ramon et al., 2019) PD
PB
CEDA+TC PALIA-Web
(Garg & Agarwal, 2016)
PD
PP
(Kukreja & Batra, 2017)
PD, PPM,
CA
PP
ProM
HH
Sepsis
(De Oliveira et al., 2020)
PD, VPA
CP
CEDA
HIS
Sepsis
(Rebuge & Ferreira, 2012)
VA
MP
CEDA+MP ProM
RLPN, SC
HIS
ED
(Ronny et al., 2015)
PD, SNA
CP
CEDA+TC ProM
SNM
HIS
Oncology
(Delias et al., 2014)
PD, PMM
PP
CEDA+TC ProM
HM, FM, GM
HIS
ED
(Delias et al., 2015)
PD
PP
CEDA+TC ProM, Disco
ILP, FM, USC
(Najjar et al., 2018)
PD
CP
TC
HMM-based
HIS
Cardiology
(Kovalchuk et al., 2018)
PD, Sim,
PPA
PP
CEDA+TC SKLearn
KM
EHR
Cardiology
(Pebesma et al., 2019)
PD
CP
CEDA+TC
HIS
Diabetes
(de Toledo et al., 2019)
PD
CP
CEDA+TC ProM
HM, FM, MBC,
HC, KM
HIS
Diabetes
(Lu et al., 2019)
PD
CP
TC
FSPM
HIS
Oncology,
pediatric,
diabetes
C lib.
ProM
ED
EHR
(Valero-Ramon, Fernandez- PD
Llatas, Valdivieso, &
Traver, 2020)
PB
IPM
PMApp
PALIA
(Valero-Ramon, Fernandez- PD
Llatas, Martinez-Millana,
& Traver, 2020)
PB
IPM
PMApp
PALIA
(Hendricks, 2019)
PD
PP
CEDA
ProM
α-M
HIS
Sepsis
(Caron et al., 2014)
PD, VA, CA, CP
SNA
CEDA
HM, SNM, TA,
LTL-C
HIS
Oncology
(Baker et al., 2017)
PD
PP
Ad hoc
EHR
Oncology
(Alvarez et al., 2018)
PD, SNA
PI
FM
HIS
ED
CEDA
Disco
Cardiology
Obesity
(Leonardi et al., 2019)
PD
CP
CEDA
ProM
HM
HIS
Cardiology
(Amantea et al., 2020)
PD
PP
CEDA
PALIA-Web
PALIA
HIS
Geriatrics
(Duma & Aringhieri, 2020)
PD, PPA
PP
CEDA
Ad hoc
HIS
ED
(H. Xu, Pang, Yang,
Jinghui, et al., 2020)
PD, CA
CP
MP
ProM
Declare
EMR
Cardiology
(Placidi et al., 2021)
PD, CA
PP
CEDA
pMineR
HMM-based
HIS
Oncology
(Lakshmanan et al., 2013)
PD
CP
CEDA+TC ProDiscovery
HM, SC
EMR
Cardiology
(Fernandez-Llatas, Meneu,
et al., 2011)
CA
PP
CEDA
PALIA
(Kirchner et al., 2012)
CA
PP
BPL
Signavio
TA
HIS
Surgery
(Rovani et al., 2015)
CA
PP
BPL
ProM
Declare
HIS
Urology
(de Vries et al., 2017)
CA
PP
BPL
ProM
EMR
Sepsis
(Ganesha et al., 2017)
CA
CP
CEDA
ProM
Video rec.
OS
ProM, Disco
FM, RLPN
ED
(Jaturogpattana et al., 2017) CA
PP
CEDA
IM, FM, RLPN
HIS
OS
(Mannhardt &
Blinde, 2017)
CA
PP
CEDA+MP ProM
IM, MPPE
HIS
Sepsis
(Anggrainingsih
et al., 2018)
PPM, CA
PP
CEDA
ProM
α++, IM
HIS
OS
(S. Yang, Sarcevic,
et al., 2018)
CA
MP
BPL
ProM
TA
Video rec.
Pediatric
(Neira et al., 2019)
PPM, CA
CP
BPL
ProM, Disco
RLPN, FCBDP
HIS
Sepsis
(Asare et al., 2020)
CA
PP
CEDA
ProM
IM, RLPN
(H. Xu, Pang, Yang, Ma,
et al., 2020)
CA
CP
BPL
ProM
Declare
HIS
Cardiology
(H. Xu, Yan, Pang, Nan,
et al., 2020)
CA
CP
BPL
A*
EMR
Cardiology
(Dunkl et al., 2011)
CA
CP
DM
HIS
Oncology
(Andrews et al., 2020)
VA, PPM
PP
(Suriadi et al., 2014)
VA
PP
(Partington et al., 2015)
VA
PP
(Montani et al., 2014)
PMC
PP
(Perimal-Lewis et al., 2012)
PPM
PP
CEDA
ProM
HIS
(R. Mans et al., 2013)
PPM, Sim
MP
MP
ProM, CPN
Tools
HIS
ProM, CPN
Tools
CEDA
L*
ProM
IM
HIS
RTC
ProM
HM, FM, PM,
TA, KM
EHR
Cardiology
ProM
HM, FM, RLPN
HIS
Cardiology
ProM
HM
HIS
Cardiology
Dentistry
(Jaisook &
Premchaiswadi, 2015)
PPM
CP
CEDA
Disco
FM
HIS
Respiratory
dis.
(Yoo et al., 2016)
VPA, PMC,
PPM
PP
CEDA
HM
EHR
Oncology
(Stefanini et al., 2018)
PPM
PP
CEDA
FM, IM
HIS
ED
(Gattnar, Ekinci, &
Detschew, 2011)
PPM
PP
FM
HIS
Surgery
FSPM
EMR
Cardiology
TA
HIS, Video
rec., SD
Oncology
FM, TA
HIS
Oncology
(Gattnar, Ekinci, Detschew, PPM, PMM
& Capel-Tunon, 2011)
MP
(Aguirre et al., 2019)
VPA
PP
PM 2
(Perer et al., 2015)
VPA
CP
CEDA
(S. Chen et al., 2017)
VPA
ProM, Disco
ARIS
CEDA
Celonis
(Jagadeesh Chandra Bose & PU
van der Aalst, 2011)
PP
(Forsberg et al., 2016)
PU
PT
CEDA
ProM
HM, RLPN
PACS
Radiology
(Agostinelli et al., 2020)
PU
CP
PM 2
ProM
IVM, SNM
HIS
ED, OS
(Z. Zhou et al., 2014)
Sim
PP
Disco, ProModel α-M, FM
HIS
OS
(Lamine et al., 2015)
Sim
PT
Disco, Witness
FM
HIS
ED
(Augusto et al., 2016)
Sim
CP
Anylogic
ILP
HIS
Cardiology
(Tamburis &
Esposito, 2020)
Sim
CP
CEDA
ProM, Disco,
Simul8
PALIA
HIS
Surgery
(Phan et al., 2019)
PPM, Sim
CP
CEDA
Disco, Anylogic
FM
HIS
Surgery
(Franck et al., 2020)
Sim
PP
CEDA
ProM, Anylogic
HIS
Cardiology
(Riz et al., 2016)
SNA
PP
CEDA
ProM
SNA
HIS
Oncology
(Durojaiye et al., 2019)
SNA
PI
CEDA
igraph,
graphkernels,
kernlab
USC, GKC
EHR
Pediatric
(Conca et al., 2018)
SNA
PI
CEDA
PALIA-Web
PALIA, SC
HIS
Diabetes
CEDA
ProM
(Van Der Spoel et al., 2012)
PPA
PP
CEDA
(Benevento, Aloini,
et al., 2019)
PPA
PP
CRISP-DM Disco
RF
(Kempa-Liehr et al., 2020)
PPM, PPA
CP
CEDA+TC ProM
(Back et al., 2020)
PPA
PP
CEDA
HIS
Cardiology
HIS
ED
IVM, EEL
EMR
Surgery
α-M
HIS
Surgery
(H. Xu et al., 2021)
PPA
CP
CEDA
IM, Declare
EMR
Cardiology
(H. Xu, Pang, Yang, Li, &
Zhao, 2020)
PPA
CP
CEDA
SKLearn
DT, RF
EMR
Cardiology
(A. P. Kurniati et al., 2020)
CD
PP
PM 2
ProM, bupaR
IM, IMI,
IDAHM, ILP
EHR
Oncology
(Han et al., 2011)
OD
PP
YAWL-BPM
HIS
Diabetes
(Perimal-Lewis et al., 2016)
DQA
PP
Disco
EHR
ED
(Lanzola et al., 2014)
DQA
PP
CEDA
HIS
2
bupaR
(Gonzalez-García
et al., 2020)
PMM
CP
PM
(Erdogan & Tarhan, 2018)
PMM
PP
PM 2
Disco
(Tóth et al., 2017)
PMM
CP
CEDA
ProM
Cardiology
HIS
Cardiology
FM
HIS
Surgery
IVM
HIS
Oncology
(R. S. Mans, Van der Aalst,
et al., 2012)
PMM
CEDA
ProM
(Homayounfar, 2012)
PMM
(Fernandez-Llatas, 2021a)
PMM
IPM
(Fernandez-Llatas, 2021b)
PMM
IPM
(Martinez-Millana
et al., 2021)
PMM
PP
(Tibeme et al., 2018)
PMM
PP
(Zhu et al., 2010)
PMTD
PT
(Rojas, Sepúlveda,
et al., 2017)
PMM
(Rojas et al., 2018)
PPM
PP
CEDA
ProM, Disco
(Andrews et al., 2018)
PPM
PP
CEDA
Disco
(Durojaiye et al., 2018)
PPM
PP
CEDA+TC ProM
(Chang et al., 2020)
PPM
PP
CEDA
bupaR
(Sato et al., 2020)
PPM
PP
CEDA
(Arias et al., 2020)
PMM
PP
CEDA
2
HIS
Surgery
HIS
Diabetes
PP
IPM
α-M, IM, HM,
FM
SD
HIS
CEDA
Radiology
ED
FM
HIS
ED
HIS
Cardiology
HM, RLPN
HIS
Pediatric
HM
EMR
Cardiology
Disco
EHR
Cardiology
Celonis
HIS
Cardiology
(G. Kusuma et al., 2020)
PD
PB
PM
ProM, Disco
RLPN
SD
(Cho et al., 2020)
PPM
PP
CEDA
ProM, Disco,
ProDiscovery
IM, FrM
EHR
ED
(Badakhshan &
Alibabaei, 2020)
PPM
PP
CEDA
Disco
FM
HIS
ED
(Ibanez-Sanchez
et al., 2021)
PPM
PP
IPM
HIS
ED
Note:
Applications: CA = conformance analysis; CD = concept drift; ELE = event log extraction; ELQI = event log quality improvement; DQA = data quality assessment; PU = process understanding; PD = process discovery; PMC = process model comparison; PMM = PM methodology; PMTD = PM tool development; PPA = predictive process analytics; PPM = process performance measures; PPPM = privacy-preserving PM; OD = outlier detection; Sim = simulation; SNA = social network analysis; VA = variants analysis; VPA = visual process analytics.
Processes: CP = clinical pathways; HM = human movements; MP = medical processes; PB = patient behavior; PI = personnel interactions; PP = patient pathways; PT = personnel tasks.
Techniques: α-M = α-miner; bupaR = R lib; CPLEX = C lib for simplex method; DM = decision miner (ProM plug-in); DT = decision tree; EEL = explore event log (ProM plug-in); FC = fuzzy clustering; FCBDP = find context based differences in process (ProM plug-in); FM = fuzzy miner; FrM = frequency mining; FSPM = frequent sequential pattern mining; GKC = graph kernel clustering; GM = genetic miner; graphkernels = R lib; HC = hierarchical clustering; HM = heuristic miner; HMM = hidden Markov model; HoWSN = handover-of-work social network; IDAHM = interactive data aware heuristic miner (ProM plug-in); igraph = R lib; ILP = integer linear programming; IM = inductive miner; IMI = inductive miner infrequent; IPM = interactive PM (ProM plug-in); IVM = inductive visual miner; kernlab = R lib; KM = K-means clustering; LDA = latent Dirichlet allocation; LTL-C = LTL-checker; MBC = model based clustering; MPPE = multi-perspective process explorer; PM = passage miner; pMineR = R lib; QTC = quality threshold clustering; RF = random forest; RLPN = replay log on Petri net; SC = sequence clustering; SKLearn = Python lib; SNM = social network miner; SPM = sequential pattern mining; stringdist = R lib; TM = text mining; topicmodels = R lib; USC = unsupervised spectral clustering; TA = trace alignment.
PM methodologies: BPL = business process life cycle; CEDA = data collection, event log extraction, process discovery, process analysis; IPM = interactive PM; LPM = local process model; MP = multi-perspective PM; TC = trace clustering.
Data source: BSD = body-sensor data; EHR = electronic health record; EMR = electronic medical record; ILS = indoor location system; PACS = picture archiving and communication system; RTLS = real time location system; HIS = hospital information system.
Data: ED = emergency department; HH = home hospitalization; OS = outpatient service; RTC = road traffic crashes; TR = trauma resuscitation; SD = synthetic data.