Received: 22 September 2021 | Revised: 9 November 2021 | Accepted: 10 November 2021
DOI: 10.1002/widm.1442

ADVANCED REVIEW

Process mining applications in the healthcare domain: A comprehensive review

Antonella Guzzo1 | Antonino Rullo1 | Eugenio Vocaturo1,2

1 DIMES Department, University of Calabria, Rende, CS, Italy
2 CNR-NANOTEC National Research Council, Rende, CS, Italy

Correspondence: Antonella Guzzo, DIMES Department, University of Calabria, Rende, CS 87036, Italy. Email: antonella.guzzo@unical.it

Edited by: Elisa Bertino, Associate Editor and Witold Pedrycz, Editor in Chief

Abstract

Process mining (PM) is a well-known research area that includes techniques, methodologies, and tools for analyzing processes in a variety of application domains. In the case of healthcare, processes are characterized by high variability in terms of activities, duration, and involved resources (e.g., physicians, nurses, administrators, machinery, etc.). Besides, the multitude of diseases affecting the patients housed in healthcare facilities makes medical contexts highly heterogeneous. As a result, understanding and analyzing healthcare processes are certainly not trivial tasks, and administrators and doctors look for tools and methods that can concretely support them in improving the healthcare services they are involved in. In this context, PM has been increasingly used for a wide range of applications, as reported in some recent reviews. However, these reviews mainly focus on applications related to clinical pathways, while a systematic review of all possible applications is absent. In this article, we selected 172 papers published in the last 10 years that present applications of PM in the healthcare domain. The objective of this study is to help and guide researchers interested in the medical field to understand the main PM applications in healthcare, but also to suggest new ways to develop promising and not yet fully investigated applications.
Moreover, our study could be of interest to practitioners who are considering applications of PM, and who can use it to identify and choose PM algorithms, techniques, tools, methodologies, and approaches on the basis of past successful experiences.

This article is categorized under:
Application Areas > Health Care
Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining

KEYWORDS
conformance analysis, healthcare, hospital information system, process discovery, process mining, process performance measurements, process simulation

WIREs Data Mining Knowl Discov. 2022;12:e1442. https://doi.org/10.1002/widm.1442 wires.wiley.com/dmkd © 2021 Wiley Periodicals LLC.

1 | INTRODUCTION

The high heterogeneity that characterizes the large amount of data generated by health information systems represents the major obstacle to the assessment of healthcare quality. Such heterogeneity is due to multiple factors. First, patients are treated in several healthcare organizations that differ from each other in many aspects, such as resource management, specialized departments, medical disciplines, outsourced care services, technological equipment, and size of the healthcare facility. These differences at the healthcare facility level mean that healthcare processes, even in similar contexts (e.g., the treatment of a specific disease), may differ from each other in quality, duration, and in the quantity and type of the resources involved in the process (e.g., physicians, nurses, administrators, etc.).
However, the dynamics of healthcare processes may also depend on further aspects, including the introduction of new administrative procedures and medical guidelines; the discovery of new drugs, treatments, diagnostic procedures, and diseases; the wide range of reactions of patients' bodies when subjected to a specific treatment; the effectiveness of a treatment on each patient; the patient's behavior; the course and evolution of diseases for each single patient; and the individual experience of the hospital staff, which leads to different decisions and interpretations. The result of such a complex, variable, dynamic, and multidisciplinary nature of the healthcare domain is that healthcare processes have a high degree of variability, a non-repetitive character, and, to a large extent, a nondeterministic order of execution. It follows that the understanding and analysis of healthcare processes are definitely not trivial tasks. An effective assessment of their quality, indeed, may require the processes to be analyzed from different perspectives, such as workflow (the sequence of performed activities), time (temporal relation between performed activities), resource (actors and medical equipment involved in the process), and data (additional information such as drugs, department, and hospital-related data). It is clear, then, that hospital administrators and doctors need to be assisted in the task of process understanding by tools for the automatic extraction of knowledge from data that are able to process and correlate large amounts of (possibly unstructured) heterogeneous information.
Such demand calls for the use of process mining (PM), a research area that includes techniques, methodologies, and tools to analyze and improve processes in a variety of applications, such as the automated discovery of knowledge from data, checking conformance with regulations, and choosing the most suitable model to represent complex processes in terms of the sequence of events, their timing, and the set of involved resources. The PM approach makes it possible to uncover patterns hidden in the huge volume of data, and to evaluate process performance with the aim of acting on process improvement and generating predictions. Moreover, comparative analyses can be performed between the discovered models and reference ones, such as medical protocols and clinical guidelines, in order to evaluate the compliance of actual processes with prescribed ones or, in other words, to figure out what is really happening compared to what was expected to occur. In the last decade, PM techniques have been used in the field of healthcare processes in a wide range of applications, such as the discovery of process models, the analysis of social interactions, and conformance checking, as reported in some recent surveys. However, these surveys mainly focus the discussion on applications related to clinical pathways, while a systematic review of all possible applications is completely absent. The objective of this study is to help and guide researchers who are interested in the medical field to understand the main PM applications in the healthcare domain, but also to suggest new directions for promising and not yet fully investigated applications. On the other hand, our study could also be of interest to practitioners who are considering applications of PM, and who can use it to identify and choose PM algorithms, techniques, tools, methodologies, and approaches on the basis of past successful experiences.
We searched manually for the studies published in the last 10 years in the main electronic digital libraries of scientific literature and we selected 172 papers for a thorough review. In this article, we propose a taxonomy for the conceptual organization of the retrieved papers on PM applications in healthcare, and use this to discuss results. The article is organized as follows: Section 2 overviews the review protocol of the current study by presenting research questions and details about the publication selection process; Section 3 gives a short overview of the main PM techniques, tools, and methods; Section 4 provides insight into healthcare PM applications by characterizing data and processes in the related field; Section 5 reports a literature discussion based on the proposed categorization of the works that were selected; Section 6 includes a summary of our findings together with open issues and future research directions; Section 7 presents past reviews as related works; finally, Section 8 concludes the article by providing overall results and plans for future work.

2 | REVIEW PROTOCOL

In order to retrieve and select relevant studies, we conducted a systematic literature search according to the approach described by Kitchenham (2004), which allowed us to identify and classify the reviewed papers on the basis of the adopted PM application methods and algorithms. In particular, our literature search and review process consists of the following steps:

1. Specification of the research questions to be answered;
2. Definition of the query to be launched on the most relevant databases of academic papers;
3. Identification of inclusion and exclusion criteria for the screening of the retrieved papers;
4. Review of the papers obtained so far;
5. Definition of a taxonomy to guide the classification of reviewed papers.

The following research questions have been defined:

RQ1: What methods and techniques have been adopted for the PM applications in healthcare?
RQ2: How can the processes that are performed in healthcare facilities be categorized, and which kinds of such processes have received more attention in the literature?
RQ3: What are the main challenges PM faces when applied to the healthcare domain?

In order to identify as many targeted studies as possible and to assess the volume of potentially relevant studies, we have defined the following query:

(“process mining” OR “workflow mining”) AND (“health” OR “care” OR “healthcare” OR “hospital” OR “clinic” OR “clinical” OR “pathways” OR “health-care” OR “patient”)

Then we launched the query on the most relevant databases of academic papers, including PubMed, dblp, Google Scholar, Scopus, Web of Science, IEEE Xplore, ACM Digital Library, SpringerLink, and ScienceDirect. The retrieved papers were the result of the above query evaluated on the title, keywords, and abstract of the articles. The search was done independently by each author within the reference time range from 2010 to 2021. As expected, the number of items provided by each database was unreasonably high (in the order of thousands in Google Scholar, and of hundreds in the other platforms); however, it was evident that after a certain number of pages the results were articles either already identified or off-topic (e.g., only PM or only healthcare).
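As an illustration of how the query above could be evaluated on the title, keywords, and abstract of a record, the following minimal sketch encodes the two groups of terms in Python. The function names and the whole-word, case-insensitive matching are assumptions made for the example; each database applies its own matching rules, which are not detailed here.

```python
import re

# Term groups copied from the query; everything else is illustrative.
PM_TERMS = ["process mining", "workflow mining"]
DOMAIN_TERMS = ["health", "care", "healthcare", "hospital", "clinic",
                "clinical", "pathways", "health-care", "patient"]


def contains_term(text: str, term: str) -> bool:
    """True if `term` occurs in `text` as a whole word or phrase (case-insensitive)."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matches_query(title: str, keywords: str, abstract: str) -> bool:
    """Evaluate (PM term) AND (domain term) on title, keywords, and abstract."""
    text = " ".join([title, keywords, abstract])
    has_pm = any(contains_term(text, t) for t in PM_TERMS)
    has_domain = any(contains_term(text, t) for t in DOMAIN_TERMS)
    return has_pm and has_domain
```

A record mentioning only PM or only healthcare is rejected by this filter, which mirrors the off-topic cases noted above.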
In parallel, we carried out a search on the four websites processmining.org, processmining4healthcare.org, bpmcenter.org, and pods4h.com, trying to identify relevant studies or projects that fit with the application of PM in the healthcare field. We found a considerable overlap in the obtained results, as many articles are contained in different databases. After an aggregation process, a total of 296 articles were considered in our review process. To determine which articles to include in, and exclude from, this review, we adopted the following criteria:

Inclusion criteria:
• Articles where PM has been applied in the healthcare domain;
• Articles published in English;
• Articles published in relevant journals and conferences;

Exclusion criteria:
• Articles that did not present a PM application in the context of healthcare;
• Articles whose contribution was not clear;
• Articles not written in English;
• Bachelor, master, and PhD theses;
• Conference articles for which an extended journal version has been published; in these cases, only the journal version was selected.

Following Kitchenham (2004), these criteria were decided during the protocol definition and refined during the search process, in order to reduce the likelihood of bias. Of the 296 articles found from the evaluation of the above query, 187 were considered after the application of the inclusion and exclusion criteria. Finally, 15 papers were excluded from the review process because they are review papers or systematic mapping studies, which are discussed separately in Section 7, leaving 172 papers for the review.
To ensure the quality of the research process, a number of activities have been undertaken. Database searches were performed in anonymous browsing mode to avoid conditioning by previous navigation, and the search for papers was extended to websites accredited in PM for healthcare in order to minimize threats of incompleteness that may result from search terms and search engines. The search, analysis, and evaluation of articles based on the initial queries were performed manually by the first author and by at least one additional author. Any disagreements about the inclusion or exclusion of a paper were resolved by all authors through discussion. Therefore, we believe that by following the well-accredited methodology proposed by Kitchenham (2004), we have managed to ensure an adequate and inclusive basis for this study and that the rate of any missing publications would be negligible.

3 | PROCESS MINING

The goal of PM is to discover, monitor, analyze, improve, and predict real processes. Organizations that have limited information on the processes they manage resort to PM to get a deeper insight into the details that characterize these processes, such as performance, bottlenecks, involved resources, process deviations, and so on. This is because there may be a significant gap between how a process is supposed to behave and what actually happens. Figure 1 depicts a taxonomy for PM according to which a PM task is characterized by six aspects: Application, which identifies the purpose of the PM task; Technique, which identifies the algorithm used to achieve the intended purpose; Tool, which identifies the technology used to implement the technique; Approach, which identifies the steps followed to finalize the application; Process, which identifies the kind of process being analyzed; and Data, which identifies the data source from which the event log has been extracted.
The most widespread PM applications are process discovery, conformance analysis, process enhancement, process performance measurement, and social network analysis (SNA). Process discovery is the task of inferring a model of the process under analysis by means of a discovery algorithm, which takes as input a log of process executions and outputs a process model described in terms of a process modeling language. In a conformance analysis task, the data generated from an a priori model are compared with real data in order to check the model against reality. Enhancement techniques make it possible to improve and/or extend an existing process model with the insights extracted from logged process executions. The scope of a process performance measurement task is to describe the process under analysis in terms of a set of performance key indicators (PKI), defined with the aim of finding bottlenecks and inefficiencies. SNA makes it possible to discover the interaction patterns that characterize the actors involved in the process. PM techniques include algorithms for process discovery, conformance checking, process deviation, trace clustering, and event clustering, but also data visualization methods such as dotted charts, and languages for the representation of processes. Existing modeling languages visualize the content of a process execution either as declarative or as procedural process models. Procedural languages are Petri nets (Peterson, 1977), process trees, causal nets (Van Der Aalst, Adriansyah, & Van Dongen, 2011), state machines, and BPMN (Chinosi & Trombetta, 2012), to name a few; declarative languages are LTL (van der Aalst et al., 2005) and Declare (Pesic et al., 2007). Various tools allow the above-mentioned PM techniques to be implemented or applied.
Some are GUI-based, such as ARIS,1 Disco,2 ProM,3 PALIA-ER (Rojas, Fernandez-Llatas, et al., 2017), and Celonis.4 GUI-based tools allow nonexpert PM users to deploy complete PM applications, but also provide experienced ones with a convenient environment to implement ad hoc solutions. Other tools include programming-language libraries, such as pMineR (Gatta et al., 2017), bupaR (Janssenswillen et al., 2019), and pm4py (Berti et al., 2019).

FIGURE 1 A taxonomy for process mining

The typical approach followed to conduct a PM task consists of four steps: data collection, event log extraction, process discovery, and process analysis. The last step can serve different purposes: for example, the user may wish to verify the compliance of the actual process execution with medical guidelines, to analyze the process performance, or to perform a process simulation to conduct a what-if analysis. However, more specific PM frameworks have been proposed in the literature in the form of guidelines to be followed in order to conduct a sound process analysis. These include the L*, which describes the life cycle of a typical PM project (Van Der Aalst, Adriansyah, De Medeiros, et al., 2011), the PM Project Methodology (PM²), which includes iterations and consists of six key phases (Van Eck et al., 2015), and, in the context of healthcare, the DIAG approach (Araghi, Fontanili, et al., 2018), which provides location awareness by means of data generated by location systems.
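The discovery and analysis steps of the typical approach can be sketched on a toy log. The example below is a minimal illustration and not a technique singled out in this review: it assumes an already extracted event log reduced to activity names, uses a directly-follows graph as a deliberately simple stand-in for a discovery algorithm, and checks each trace against a hypothetical set of allowed transitions standing in for a guideline. All activity names are invented.

```python
from collections import Counter

# Hypothetical toy event log: one list of activity names per case (trace).
event_log = [
    ["registration", "triage", "visit", "discharge"],
    ["registration", "triage", "visit", "lab test", "visit", "discharge"],
    ["registration", "visit", "discharge"],  # triage is skipped in this case
]


def discover_dfg(log):
    """Process discovery (simplified): count how often a is directly followed by b."""
    dfg = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


def deviations(trace, allowed_pairs):
    """Process analysis (simplified): directly-follows pairs not allowed by the reference."""
    return [(a, b) for a, b in zip(trace, trace[1:]) if (a, b) not in allowed_pairs]


dfg = discover_dfg(event_log)

# Hypothetical reference behavior standing in for a medical guideline.
guideline = {
    ("registration", "triage"), ("triage", "visit"), ("visit", "lab test"),
    ("lab test", "visit"), ("visit", "discharge"),
}

# Map each case index to the transitions that deviate from the reference.
report = {i: deviations(trace, guideline) for i, trace in enumerate(event_log)}
```

Here `report` flags only the third case, where the visit directly follows registration. Libraries such as those cited above offer far richer discovery and conformance algorithms; this sketch only shows the shape of the input and output of the two steps.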
A PM task can also be characterized by means of processes and data when referring to a specific application context, such as healthcare in our case. Indeed, the medical and organizational processes that take place within a healthcare facility are disparate, as are the information systems from which process data are extracted. These two aspects are described in more detail in Sections 4.1 and 4.2. Processes can be analyzed from different viewpoints, also called perspectives:

• Control-flow, which focuses on the activities that are executed and the relationships of precedence among them (in terms of preconditions and post-conditions);
• Organizational (a.k.a. resource), which focuses on the actors that are involved, on their roles, and on how they are mapped/assigned to the activities;
• Temporal (a.k.a. performance), which focuses on the temporal aspect of process executions, such as their duration, the temporal relations between activities, and the frequency of specific activities, and is typically considered for the detection of bottlenecks and the estimation of performance indicators, such as throughput and average duration of process executions;
• Data, which focuses on the data used and generated during the process execution, including the attributes that characterize the given enactment and the attributes that characterize the triggered events.

PM techniques assume that process executions are recorded in the form of an event log, that is, a log consisting of a set of traces. Each trace represents a process case (i.e., a specific execution or instance of a process), and is characterized by a set of attributes and by a temporally ordered sequence of events. Events, in turn, are characterized by a timestamp and a set of event attributes describing the intrinsic properties of the event, for example, the activity name, the performing resource, the cost of executing the activity, the life cycle phase of the activity, the place of origin, and so forth.
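The event log structure just described can be written down as a small data model. The sketch below is only one possible rendering: the class, field, and method names are assumptions made for the example, and the three helper methods loosely correspond to the control-flow, organizational, and temporal perspectives.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    activity: str                 # activity name
    timestamp: datetime           # when the event occurred
    resource: str                 # who performed the activity
    attributes: dict = field(default_factory=dict)  # e.g., cost, life cycle phase


@dataclass
class Trace:
    case_id: str                  # identifies one process case
    attributes: dict = field(default_factory=dict)  # case-level attributes
    events: list = field(default_factory=list)

    def add(self, event: Event) -> None:
        """Append an event and keep the sequence temporally ordered."""
        self.events.append(event)
        self.events.sort(key=lambda e: e.timestamp)

    def control_flow(self):
        """Control-flow perspective: the ordered activity names."""
        return [e.activity for e in self.events]

    def resources(self):
        """Organizational perspective: the actors involved in the case."""
        return {e.resource for e in self.events}

    def duration(self):
        """Temporal perspective: time between first and last event."""
        return self.events[-1].timestamp - self.events[0].timestamp


# Hypothetical case; events are deliberately added out of order.
trace = Trace(case_id="case-1", attributes={"department": "emergency"})
trace.add(Event("visit", datetime(2021, 3, 1, 9, 30), "physician"))
trace.add(Event("registration", datetime(2021, 3, 1, 9, 0), "administrator"))
trace.add(Event("discharge", datetime(2021, 3, 1, 11, 0), "physician", {"cost": 40}))

event_log = [trace]  # an event log is a collection of traces
```

Sorting on insertion is a convenience for the example; with large logs one would more likely sort each trace once after loading.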
An XML-based standard for event logs is the extensible event stream (XES) (IEEE, 2016), accepted as the standard format for the interchange of event log data between tools and application domains, and supported by the vast majority of PM tools. The use of PM provides several benefits with respect to other data mining approaches when dealing with process execution data. First, it combines the strengths of process modeling and data mining, thus automating the generation of process models from raw data. Furthermore, with the help of data visualization methods and process model representation languages, such models can include a large set of process information, such as different event relations and temporal features. Other techniques such as sequential pattern mining and episode mining, although sequence-oriented, focus only on the discovery of sequential orderings and partially ordered sets of events, leaving out many process characteristics that can be useful for process management, such as (exclusive) choice relations, concurrency, and loops. On the other hand, PM can have drawbacks depending on the nature of the process being analyzed. For instance, in complex scenarios process models may oversimplify the real processes, as they may fail to capture important aspects, thus providing a naive view of reality (Van der Aalst, 2009); or, conversely, a process discovery task may generate an overfitting model, that is, a model which perfectly describes a given set of process executions but is unable to adapt to unexpected process deviations. Another common issue is the so-called spaghetti-like effect, the PM jargon for a model that is hardly intelligible because it is characterized by a high number of paths between starting and ending activities.
These circumstances occur when dealing with unstructured processes, that is, when process traces do not form a homogeneous group because they are very different from each other, or when logged events are too fine-grained; in both cases, process discovery algorithms produce incomprehensible models, or models that are not representative. Sometimes, models for other perspectives can be created, such as flow times and social networks; however, it is very unlikely that all of these can be folded into a meaningful single process model, as the highly variable nature of the control-flow does not allow it (Van der Aalst, 2011). In these cases, PM must be supported by a domain expert who, depending on the objectives of the investigation, moves the analysis to a different level of abstraction by adding data to, or removing data from, the event log, so as to obtain a clearer view of the process. The inclusion of a human in the process discovery task is known as interactive process mining, which will be discussed later in this survey.
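The effect of overly fine-grained events, and of an expert changing the level of abstraction, can be made concrete with a small experiment. The sketch below is an assumption-laden illustration: the log, the activity names, and the expert-provided mapping are all hypothetical, and the number of distinct trace variants and directly-follows edges is used only as a rough proxy for how spaghetti-like a discovered model would be.

```python
from collections import Counter

# Hypothetical fine-grained log: every case follows a different path.
event_log = [
    ["blood draw", "hemoglobin test", "x-ray", "discharge"],
    ["blood draw", "glucose test", "ct scan", "discharge"],
    ["blood draw", "hemoglobin test", "ct scan", "discharge"],
    ["blood draw", "glucose test", "x-ray", "discharge"],
]

# Hypothetical mapping supplied by a domain expert to coarsen activities.
abstraction = {
    "hemoglobin test": "lab test",
    "glucose test": "lab test",
    "x-ray": "imaging",
    "ct scan": "imaging",
}


def variants(log):
    """Count distinct activity sequences (trace variants)."""
    return Counter(tuple(trace) for trace in log)


def edges(log):
    """Set of directly-follows pairs, i.e., arcs a discovered graph would show."""
    return {(a, b) for trace in log for a, b in zip(trace, trace[1:])}


def abstract(log, mapping):
    """Replace each activity by its coarse label; unmapped activities are kept."""
    return [[mapping.get(a, a) for a in trace] for trace in log]


coarse_log = abstract(event_log, abstraction)
```

On this toy log the abstraction collapses four variants and eight arcs into a single variant with three arcs. Whether such a mapping is clinically meaningful is exactly the judgment the domain expert contributes.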
4 | PM IN HEALTHCARE

As already mentioned, a complicating factor for the application of PM in the healthcare domain is the highly complex and variable nature of processes: a wide range of processes with different characteristics are executed daily; the delivery of care services involves several departments and organizations, and requires collaboration between professionals with different skills; the flow of each healthcare process is determined by many factors, such as the unpredictable outcome of treatments and the different interpretations physicians may give to a patient's symptoms. Additionally, the amount of data registered in health information systems may be so large that it is difficult for organizations to be aware of inefficiencies and bottlenecks. Furthermore, health information systems are updated daily with information related to different actors such as payers, patients, physicians, nurses, and other service providers, which prevents the integration of medical data and, consequently, the evaluation of healthcare quality. In this context, PM aims to automatically identify models of patient care, of hospital staff practices, and of organizational processes, and can help to carry out proactive process analysis in order to anticipate problems and to put in place countermeasures that enhance the quality of outsourced care services, for example, by shortening treatment times and waiting times in emergency departments, and also by reducing healthcare costs. Besides the identification and the consequent enhancement of actual processes, PM is widely used for the assessment of clinical guidelines. Clinical guidelines are general recommendations about care, treatments, and diagnosis, defined a priori with the aim of guiding hospital staff in doing their job. However, many processes are theoretically defined but hard to follow due to the high variety of unexpected scenarios that some events can lead to.
Thus, a common practice is to perform a conformance analysis of actual processes against the clinical guidelines so as to find the differences between what really happens and what was expected to occur, and then update/enrich clinical guidelines accordingly. In this review, we propose the taxonomy shown in Figure 1 as a reference point for the characterization of PM tasks carried out in a healthcare context. Healthcare Data and Processes are described in more detail in Sections 4.1 and 4.2, respectively; Applications are discussed in Sections 5.1 and 5.2; Tools are described in Section 5.4; Approaches are discussed in Section 5.3. We separately discuss PM techniques and algorithms for each of these aspects.

4.1 | Healthcare data

Process-oriented commercial systems store process information by tracing the process-related activities in the form of an event log. Healthcare data, however, are typically archived by means of DBMS-like systems in a relational/transactional format, which is not suitable for use in PM applications and therefore requires a preliminary preprocessing step to be turned into the form of an event log. The digital records technologies most used in the healthcare domain are hospital information systems (HIS), electronic medical records (EMR), and electronic health records (EHR). An EMR is the digital version of the patient's medical record, as it contains the patient's clinical history in digital form, and typically does not leave the doctor's office. An EHR can be considered as an augmented EMR, as it includes a broader view of a patient's care. Besides the patient's medical history, an EHR contains information about all the staff involved in the patient's care, and it can also share information with other healthcare facilities.
A HIS is a wider information system that includes several subsystems such as EMR, EHR, and Picture Archiving and Communication Systems (PACS), as well as other systems to support care/hospital management. It keeps heterogeneous data, as it is designed to manage multiple operational aspects such as medical, administrative, and financial ones. In a HIS, thus, we can find disparate information related to patient registration, admission, discharge, and transfer, but also finance functions such as billing, accounts receivable, and accounts payable, and some clinical documentation such as diagnoses and billing codes associated with medical encounters and procedures (Piccialli et al., 2021).

FIGURE 2 Data sources used in the reviewed papers

As we will show in the next sections, the HIS, EHR, and EMR of healthcare facilities are not the only sources of data for PM applications. In some works the event log has been synthetically generated (Burattin, 2016; Burattin & Alessandro Sperduti, 2010), extracted from video recordings (e.g., of a surgical operation; A. Grando et al., 2017; Kelleher et al., 2014; Lira et al., 2019; S. Yang, Tao, et al., 2018), from body-sensor data (e.g., sequence of blood pressure measurements; Fernandez-Llatas, Garcia-Gomez, et al., 2011; Kaymak et al., 2012; McGregor et al., 2011), from location systems such as Real Time Location Systems (RTLS) logs, and from the Medical Information Mart for Intensive Care III (MIMIC-III)5 open access dataset (Alharbi et al., 2018; A. P. Kurniati et al., 2018a, 2019; Marazza et al., 2020; Pika et al., 2019; Rojas & Capurro, 2018).
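Because healthcare data are typically kept in relational form, an event log has to be assembled from timestamped tables before any PM technique can be applied. The sketch below shows one way this preprocessing step might look, using an in-memory SQLite database. The two-table schema, the column names, and the choice of the patient identifier as case identifier are all assumptions made for the example; they do not reproduce the schema of any HIS or of MIMIC-III.

```python
import sqlite3
from itertools import groupby

# Hypothetical relational source with two timestamped tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admissions (patient_id TEXT, admitted_at TEXT, discharged_at TEXT);
CREATE TABLE lab_tests  (patient_id TEXT, test_name TEXT, performed_at TEXT);
INSERT INTO admissions VALUES ('p1', '2021-03-01 09:00', '2021-03-03 12:00');
INSERT INTO admissions VALUES ('p2', '2021-03-02 14:00', '2021-03-02 20:00');
INSERT INTO lab_tests  VALUES ('p1', 'blood test', '2021-03-01 10:30');
INSERT INTO lab_tests  VALUES ('p2', 'x-ray',      '2021-03-02 15:00');
""")

# Flatten every timestamped column into (case, activity, timestamp) rows.
# ORDER BY 1, 3 sorts by case and then by timestamp; the ISO-like strings
# used here sort chronologically when compared as text.
rows = conn.execute("""
SELECT patient_id, 'admission', admitted_at FROM admissions
UNION ALL
SELECT patient_id, test_name, performed_at FROM lab_tests
UNION ALL
SELECT patient_id, 'discharge', discharged_at FROM admissions
ORDER BY 1, 3
""").fetchall()

# Group the sorted rows into one trace per case.
event_log = {
    case_id: [(activity, ts) for _, activity, ts in events]
    for case_id, events in groupby(rows, key=lambda r: r[0])
}
```

In a real setting the difficult part is usually deciding what counts as a case and which columns carry meaningful timestamps, rather than the query itself.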
An RTLS (Kamel Boulos and Berry, 2012) contains information on the location of process actors such as patients, clinicians, and medical machinery. Locations are determined by means of wireless technologies such as radio frequency identification (RFID) tags, integrated in patient identification bands or staff cards, and antennas which are deployed in the hospital's departments. RTLS data have been used for PM to collect patients' movements (Araghi et al., 2019, 2020; Araghi, Fontaili, et al., 2018; Araghi, Fontanili, et al., 2018; Fernandez-Llatas et al., 2013; Martinez-Millana et al., 2019), to mine the workflow of medical equipment (Liu et al., 2014), to study a surgical process (Fernandez-Llatas et al., 2015), and to enhance event logs extracted from a HIS (Fernandez-Llatas et al., 2021; Martin, 2018). MIMIC-III is an open-source healthcare database which contains obfuscated data of about 46,000 intensive care unit patients that were housed at the Beth Israel Deaconess Medical Centre in Boston, USA, between June 2001 and October 2012. The data were extracted from the hospital's EHR, and can be used to create a relational database with 26 tables, 16 of which contain timestamped data useful for building patient workflows. The dataset includes body measurements, medications, laboratory measurements, observations and notes charted by clinicians, procedure codes, diagnostic codes, images, hospital length of stay, and more, providing a comprehensive example of EHR data from a large and busy hospital (A. E. W. Johnson, Pollard, et al., 2016). Figure 2 shows the percentage of reviewed papers in which each repository has been used as the data source for the experiments conducted to evaluate the proposed PM application.

4.2 | Healthcare processes

Rojas et al. (2015) categorized healthcare processes into two types: the medical treatment processes, which include activities from diagnostics to the execution of patient relief activities, and the organizational processes, which focus on the collaborative information of the healthcare professionals and their organizational units. Another classification (R. Mans et al., 2015) divided operational healthcare processes (i.e., “orchestration and management of the care processes rather than individual activities”) into two types: nonelective care, including medical emergencies (i.e., “patient for whom medical treatment is unexpected and needs to be planned on short notice”); and elective care, which includes scheduled standard, routine and nonroutine procedures (i.e., “care for which it is medically sound to postpone treatment for days or weeks”). In this review, we propose a further categorization of healthcare processes to answer RQ2, which specializes the one proposed by Rojas et al. (2015). Our categorization, however, does not rely on any a priori knowledge about the organization of healthcare facilities and medical treatments; rather, it is the result of the classification of the processes analyzed in the reviewed papers. Overall, it consists of seven types of processes, each of medical (M) and/or organizational (O) nature:

• Clinical pathway (M): is the sequence of patients' clinical activities, such as medical examinations, drug prescriptions, surgical operations, symptoms, and so forth, which determines the clinical history of a patient.
This type of process is typically considered in a PM task for prediction purposes, that is, to determine all the possible outcomes of a disease (each with an associated probability) given the sequence of clinical activities that have occurred in the past. These data are usually retrieved from a HIS or an EHR.
• Patient pathway (M–O): is the sequence of steps a patient undergoes within a healthcare facility, such as registration, visit, diagnosis received, prescription received, hospitalization or discharge, and the departments the patient has stopped by. Often, patient pathways are analyzed with the purpose of determining the process performance in a particular department (e.g., an emergency department) by means of performance indicators such as the average length of stay of a patient. As for clinical pathways, patient pathway data are typically stored in a HIS or an EHR.
• Patient behavior (M): is a sequence of activities performed on the patient's body, such as temperature measurements, body weight measurements, and values of a specific blood component, but also activities related to the patient's behavior, such as what or how much the patient has eaten over time, or other data related to the patient's habits. These kinds of processes are analyzed mainly to predict disease trajectories and build disease/patient models. Patient behavior data can be retrieved by means of body-sensor measurements, video recordings, or directly from the patient.
• Personnel interactions (O): is the sequence of interactions between hospital staff within a specific environment. The analysis of this kind of process is called SNA, and is typically adopted for the discovery of interaction patterns between hospital staff or different departments. These data can be gathered from the hospital HIS/EHR, but also by means of an RTLS.
• Medical processes (M): is the sequence of the activities that identify a specific medical process, such as a surgical operation, or a resuscitation.
Such processes are analyzed with the aim of improving process performance, and the sequence of activities is typically extracted manually from video recordings.
• Human movements (O): is the sequence of places in a specific department or portion of the hospital through which a patient or a member of the medical staff passes during a certain time slot. Discovering the model of human trajectories has been mainly used with the aim of improving the hospital/department layout. The data are typically gathered by means of an RTLS.
• Personnel tasks (O): is the sequence of activities performed by medical staff, such as the interactions with information systems such as HIS, EHR, and PACS. These processes are analyzed to study the temporal aspect of the process by characterizing the frequency of execution of tasks over time and detecting possible trends, cyclic behavior, or auto-correlation between tasks, but also for clustering analysis aimed at separating, detecting, and studying regular behavior, variants, and infrequencies in the habits of health professionals. These data can be obtained from HIS/EHR, RTLS, or PACS.

Figure 3 shows the percentage of reviewed papers in which each process category has been taken into account.

5 | LITERATURE DISCUSSION

To answer RQ1, here we characterize the state of the art of PM in the healthcare domain according to the taxonomy reported in Figure 1. We separately discuss the PM applications and for each application we report on the adopted techniques. Then we discuss the PM tools used for the implementation or deployment of PM techniques, and the PM methodologies followed for the accomplishment of PM tasks (process and data aspects have already been discussed in Sections 4.1 and 4.2). For each of these aspects we report on the work done by the research community. In the Appendix, Table A1 summarizes the classification of the surveyed papers.
FIGURE 3 Processes analyzed in the reviewed papers

5.1 | Preprocessing: The initial step of PM applications

In the best of cases, event logs are generated by Process Aware Information Systems, which are specifically programmed to detect and register process events in a form suited for PM purposes. However, in most cases event logs are the result of a preprocessing task performed by a human on raw data stored in different forms (e.g., relational DB, video recordings, hand notes, etc.), in different places, such as the multiple components of a HIS (e.g., RTLS, EHR, administrative data, etc.), and from different, non-synchronized sources (e.g., different hospitals). Collecting, transforming, merging, organizing, cleaning, and enhancing data are thus fundamental steps for any PM task, which we refer to as preprocessing. Without any kind of data preprocessing, a PM task conducted on a poor-quality event log may return biased information about the process under analysis. In this review, we distinguish between two types of preprocessing: Event Log Extraction and Preparation, and Event Log Quality Improvement. The former identifies methods for the generation of event logs from raw data, and for the selection of relevant data from event logs; the latter identifies techniques to improve event log quality by removing “dirty” data (e.g., outliers), or by including additional information, both with the aim of improving the performance of subsequent PM tasks.

5.1.1 | Event log extraction and preparation

Binder et al.
(2012) provided different approaches to empower data that is not already available in temporally ordered event logs, and developed a prototype for structured data acquisition called PTDocs. PTDocs was tested on a billing-oriented view of medical patient treatments, which does not provide direct information about administered treatments. However, by cross-referencing medication data recorded at the pharmacy, the underlying treatments were reconstructed. Perimal-Lewis et al. (2014) showed step by step how to convert a set of comma-separated files into a unique event log, where cases were the patient pathways within an emergency department. García et al. (2015) developed software that can be used by nonexpert users for generating event logs from a HIS without relying on other external tools, and with the same characteristics, format, and structure as those obtained with the XESame tool (Verbeek et al., 2010). Naeem et al. (2017) showed how to convert nonevent biomedical data into events by querying among multiple dataset tables. Then, they proposed the “LOG Generator” algorithm, used to generate an event log from the preprocessed event data. Metsker et al. (2017) developed text mining-based algorithms for the interpretation of medical records as a potential solution to the problem of knowledge retrieval from EHR. In Rinner et al. (2018), the authors described a data preparation method which uses a naming convention based on “time boxing” for recurring events to model the time aspects used in medical guidelines. Each activity is associated with a “time box” (a fixed time period into which it falls), each time box corresponds to an event in the medical guideline, and the events in each time box are named according to the name of the time box. This way they were able to extract an event log by cross-referencing data from clinical guidelines and an EHR in the context of melanoma surveillance.
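As a rough illustration of the extraction step discussed in this subsection, the sketch below merges records from two source tables into a single event log with the usual case, activity, and timestamp attributes, ordered chronologically within each case. The table contents, column names, and the helper `build_event_log` are assumptions made for the example; they do not come from any of the cited tools.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw exports; real HIS tables would have different columns.
ADMISSIONS_CSV = """patient_id,admitted_at
p1,2021-03-01 08:00
p2,2021-03-01 09:30
"""
LAB_CSV = """patient_id,test,reported_at
p1,blood count,2021-03-01 10:15
p2,blood count,2021-03-01 09:45
p1,x-ray,2021-03-01 09:00
"""


def parse(ts):
    """Parse the timestamp format used in the hypothetical exports."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")


def build_event_log(admissions_csv, lab_csv):
    """Return (case_id, activity, timestamp) tuples sorted by case, then time."""
    events = []
    for row in csv.DictReader(io.StringIO(admissions_csv)):
        events.append((row["patient_id"], "admission", parse(row["admitted_at"])))
    for row in csv.DictReader(io.StringIO(lab_csv)):
        events.append((row["patient_id"], "lab: " + row["test"], parse(row["reported_at"])))
    # Non-chronological source rows (e.g., the x-ray above) are reordered here.
    events.sort(key=lambda e: (e[0], e[2]))
    return events


event_log = build_event_log(ADMISSIONS_CSV, LAB_CSV)
for case_id, activity, ts in event_log:
    print(case_id, activity, ts.isoformat(sep=" "))
```

In practice the harder part is deciding which column plays the role of case identifier and which timestamps are trustworthy, which is where the approaches surveyed above differ.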
Vogelgesang and Appelrath (2013) introduced the concept of “factors of influence,” that is, data dimensions which can be used to form a multidimensional data cube. A separation of the relevant data is desirable to mine different models for different groups of patients, the selection of which can be considered as OLAP (OnLine Analytical Processing) operations on the multidimensional cube. PM is then performed on a data slice selected by OLAP operations.

5.1.2 | Event log quality improvement

Event logs may suffer from different types of inaccuracies such as missing timestamps, sequences of events stored in nonchronological order, missing resources, outliers, noise, and so on. Besides, the high complexity of healthcare processes makes an event log a set of heterogeneous data that may lead to the generation of unintelligible process models. The following papers proposed solutions to overcome such issues. Alharbi et al. (2017) proposed an approach to filter out the outlier events of repeated activity using an interval-based selection method as a preprocessing step to improve pattern detection. The method reduces variation in clinical pathway data by reducing the number of repeated events while still preserving the mainstream temporal pattern. The method uses the interval-pattern of repeated activity as a threshold to remove outlier events.
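The idea of thinning repeated events with an interval threshold can be sketched as follows. This is a simplified illustration and not the algorithm of Alharbi et al. (2017): here the threshold is assumed to be the median gap between consecutive occurrences of the same activity within a trace, and any repetition that follows the previously kept occurrence by less than that gap is dropped. The function name and the sample trace are hypothetical.

```python
from statistics import median


def filter_repeated(trace, activity):
    """Thin out repetitions of `activity` in a time-sorted trace.

    trace: list of (activity_name, time_in_minutes) pairs.
    An occurrence is dropped when it comes sooner than the median gap
    (assumed threshold) after the last occurrence that was kept.
    """
    times = [t for a, t in trace if a == activity]
    if len(times) < 3:
        return list(trace)  # too few repetitions to estimate a pattern
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    threshold = median(gaps)

    kept, last_kept = [], None
    for a, t in trace:
        if a == activity:
            if last_kept is not None and t - last_kept < threshold:
                continue  # outlier repetition: too close to the previous one
            last_kept = t
        kept.append((a, t))
    return kept


trace = [
    ("admit", 0),
    ("heart rate", 10),
    ("heart rate", 12),   # recorded only 2 minutes after the previous one
    ("heart rate", 70),
    ("heart rate", 130),
    ("discharge", 200),
]
filtered = filter_repeated(trace, "heart rate")
print(filtered)
```

Other choices of threshold (for example a clinically motivated fixed interval) would fit the same structure; the median is used here only to keep the example self-contained.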
With the aim of avoiding the “spaghetti-like” effect on process discovery tasks, the same authors (Alharbi et al., 2018) proposed an event abstraction method according to which the event log is enriched with HMM-derived states, so as to remodel the healthcare processes as state transitions. Martin (2018) showed how indoor location system (ILS) data can be used to attenuate data quality issues present in the event logs extracted from a HIS. In particular, he proposed various ways of cross-referencing data from an ILS and a HIS. O. A. Johnson et al. (2018) presented the ClearPath method, which extends the PM² methodology with a process simulation task to address data quality issues. As noisy environments and building structures can interfere with RTLS signals, affecting their accuracy, Fernandez-Llatas et al. (2021) proposed the use of the interactive PM paradigm for supporting the semiautomated correction of RTLS data. To this end, they presented an “interactive trace correction” method which uses an edit distance framework to correct the data.

Activity-based PM, introduced by Fernandez-Llatas et al. (2010), is an interactive preprocessing technique designed for healthcare contexts that enriches the event log with the activity outcomes (i.e., information arising from the result of taken actions), with the aim of obtaining more explanatory process models. The term “interactive” means that the log enrichment task is conducted by a human expert, alongside other automated procedures (with which the human interacts). An example of an activity-based PM algorithm is PALIA (Fernandez-Llatas et al., 2010), used for process discovery. Ibanez-Sanchez et al. (2019) demonstrated the applicability of this methodology on emergency department processes, following a question-driven, interactive PM approach. As activity-based techniques are based on discrete labels, Fernandez-Llatas et al.
(2014) coupled activity-based PM with a temporal abstraction framework to create high-level labels. Temporal abstraction techniques make it possible to abstract high-level concepts from timestamped data by transforming the timestamped representation of raw data into an interval-based description of time series. The authors argued that these high-level, discrete labels overcome some limitations introduced by the use of continuous variables; in particular, they enable experts to better understand how the process is being deployed.

Activity clustering is a preprocessing technique aimed at categorizing similar activities with the same label. This yields more understandable process models, and is particularly useful when dealing with data coming from heterogeneous sources such as the several information systems adopted in healthcare contexts (e.g., HIS, EHR, EMR, etc.). Similar activities are, for instance, those that share some characteristics, or those having different names but the same semantics. Lismont et al. (2016) applied two clustering methods, namely, the pattern abstraction technique (Jagadeesh Chandra Bose & Van der Aalst, 2009), and the activity clustering mining proposed by Günther and van der Aalst (2006). The former groups activities that frequently appear together; the latter identifies activities that always happen in series and that share some characteristics. Duma and Aringhieri (2017) classified events into 15 classes, then merged consecutive events of the same event class because the repetition is irrelevant from a control-flow perspective. Stefanini et al. (2016) manually grouped activities having different names but the same semantics. In Naeem et al. (2017), clusters were discovered based on time aspects, where activities having a lower frequency of occurrence were grouped as one.
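Merging consecutive events that share a class, as described above for Duma and Aringhieri (2017), can be sketched in a few lines. The mapping from activity names to classes and the sample trace are invented for illustration; the cited work used its own 15 event classes, which are not reproduced here.

```python
from itertools import groupby

# Hypothetical activity-to-class mapping (not the classes of the cited work).
ACTIVITY_CLASS = {
    "triage": "triage",
    "blood test": "laboratory",
    "urine test": "laboratory",
    "x-ray": "imaging",
    "ct scan": "imaging",
    "discharge": "discharge",
}


def abstract_trace(trace):
    """Map each activity to its class and collapse runs of the same class."""
    classes = (ACTIVITY_CLASS.get(activity, activity) for activity in trace)
    # groupby yields one key per run of equal consecutive values.
    return [cls for cls, _run in groupby(classes)]


trace = [
    "triage", "blood test", "urine test", "x-ray",
    "blood test", "ct scan", "x-ray", "discharge",
]
abstracted = abstract_trace(trace)
print(abstracted)
```

Only adjacent repetitions are merged, so a class that recurs later in the trace still appears again, which preserves the control-flow ordering between classes.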
With the aim of dealing with the problems related to batch processing (i.e., the recording of several events with the same timestamp, such as a group of laboratory results received at the same time), Alharbi et al. (2017) proposed to group batched events into a single event.

Topic modeling (TM) (D. Blei et al., 2010) is a clustering approach based on latent Dirichlet allocation (LDA) (D. M. Blei et al., 2003) that has been largely used for activity clustering in medical event logs. LDA is an unsupervised algorithm that allows grouping a set of documents into K different topics, each of which represents a set of words. The objective is to find a correspondence between documents and topics, such that the words in each document are captured by the K topics. Translated into medical terms, each patient trace is a multinomial distribution of topics, and each topic is modeled as a multinomial distribution over treatment activities. TM can be implemented with the R library topicmodels. Among the surveyed papers, those that adopted the TM approach are Huang et al. (2014), Stefanini et al. (2016), X. Xu et al. (2017), Prokofyeva and Zaytsev (2020), and Chiudinelli et al. (2020). Unlike Chiudinelli et al. (2020), Huang et al. (2014), and Prokofyeva and Zaytsev (2020), who applied TM to medical event logs, X. Xu et al. (2017) used billing data as the reflection of medical behaviors, and used LDA to organize billing items into high-granularity topics.
They start from the assumption that the billing items used for a patient are goal-oriented, and thus the different items that occurred in 1 day represent the goals of that day.

According to the Semantic PM paradigm, an event log is annotated with the labels of a predefined ontology. An ontology consists of the concepts, categories, and their properties and relationships that characterize a subject area. Like activity clustering and topic modeling, semantic PM is another preprocessing method which allows for the simplification, by means of semantic labeling, of the collected data. The ontology can be manually created from scratch, or it can be automatically mined from the event log and then manually improved. Jangi et al. (2019) reviewed six papers to show the use of semantic PM for hospital processes. The ontology used by Antonelli and Bruno (2015) is modeled as a UML class diagram, and uses the ICD10 vocabulary for disease classification, the MDC vocabulary for the major diagnostic categories, the DRG vocabulary for diagnosis related groups, and the ATC vocabulary for drug classification. In Montani et al. (2017), ground terms were used to represent patient management actions, while abstracted terms were used for medical goals. Then, a rule-based mechanism was used to identify which ancestor of an action in the ontology had to be considered for abstracting the action itself. Finally, consecutive actions on the trace with the same ancestor were merged and labeled as the common ancestor. M. A. Grando et al. (2011) proposed a semantic PM approach to check the conformance of a computer interpretable guideline (CIG) with the related medical recommendations. Dewandono et al. (2013) proposed a method that combines ontology and PM for giving recommendations in cases of diabetic treatment.
The ontology, implemented in the OWL description logic, is used for matching the data of a patient with that of other healthy patients, while conformance checking algorithms are used to compare the process model of a patient with the process model of healthy patients. Detro et al. (2017) proposed a semantic PM framework for selecting the appropriate process variant according to the patient's symptoms by means of ontologies based on known expertise.

5.2 | PM applications in the healthcare domain

In this section, we discuss the PM applications presented in the reviewed papers. Figure 4 shows the percentage of reviewed papers in which each application has been treated.

5.2.1 | Process discovery

The process discovery task consists of applying a process discovery algorithm with the aim of inferring a model of the process from the event log. In the healthcare context, many of the medical and organizational processes are standardized, that is, clinical guidelines are provided by medical organizations or government institutions to the hospital staff so that they know what to do in specific situations, and healthcare services can be outsourced uniformly by the healthcare facilities that belong to the same community (e.g., states, private organizations, etc.). However, such guidelines are difficult to follow given that, for example, patients suffering from the same disease may require different treatments depending on how their body reacts to certain drugs and medications. For this reason, processes carried out within the same medical context may differ from each other in terms of the control-flow, organizational, and temporal perspectives. Thus, the process discovery task is of paramount importance for physicians and administrators, as it allows them to obtain a broad view of what actually happens, and possibly to rectify or enrich guidelines accordingly. The literature on process discovery can be divided into two main strands: automated and interactive discovery of process models.
An automated discovery task makes use of a process discovery algorithm which takes as input the event log and outputs a process model in terms of a process modeling language, without using any a priori knowledge. An interactive discovery task, in addition, takes advantage of a domain expert to discover the process model, exploiting the expert's domain knowledge along with the event log (Benevento, Dixit, et al., 2019). Table 1 provides a quick overview of the work divided according to the above classification.

FIGURE 4 Applications proposed in the reviewed papers

Before discussing the main contributions in the field of automated discovery of process models, we briefly introduce the most used process discovery algorithms.

The α miner (Van der Aalst et al., 2004) is one of the first process discovery algorithms to appear in the PM literature. It examines causal relationships between events, and constructs a Petri net where each transition corresponds to an observed event. It is able to discover concurrency, loops, and points of choice, and it only focuses on the control-flow perspective, ignoring the temporal, data, and organizational perspectives.

The heuristic miner (Weijters et al., 2006) improves the α algorithm by considering frequencies; therefore, it can filter out infrequent behavior and noise. It is usually adopted with event logs with not too many different events. The algorithm first constructs a graph characterized by a frequency-based metric that indicates the likelihood that one activity depends on another.
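The frequency-based dependency metric can be made concrete with a small sketch. It counts how often an activity a is directly followed by an activity b in the log and computes the dependency measure commonly associated with the heuristic miner, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), where |a>b| is the number of times a is directly followed by b. The toy log and the function names are assumptions for illustration, and a full miner adds thresholds and further heuristics that are omitted here.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency measure in (-1, 1); values near 1 suggest that b depends on a."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


# Toy log: three traces follow the usual order, one swaps triage and visit.
log = [
    ["register", "triage", "visit", "discharge"],
    ["register", "triage", "visit", "discharge"],
    ["register", "visit", "triage", "discharge"],
    ["register", "triage", "visit", "discharge"],
]
df = directly_follows(log)
print(dependency(df, "register", "triage"))
print(dependency(df, "triage", "visit"))
print(dependency(df, "visit", "triage"))
```

The swapped trace lowers the triage-to-visit score and makes the reverse score negative, which is how infrequent orderings can be filtered out by a threshold on this measure.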
Then, a dependency matrix is constructed and the all-activities connected heuristic (i.e., choosing the best candidate within all connected activities) is applied to the matrix to extract a process model. Some specific heuristics are used to exclude from the model those activities that take place under certain conditions that are not explanatory of the process, and to alleviate the duplicate tasks and the long distance dependencies problems.

The inductive miner (Leemans et al., 2013) is an algorithm built to improve the performance of the α miner and the heuristic miner. It works by recursively finding a split in the event log and detecting the logical operator that describes the split. This algorithm outputs a process tree, a hierarchical structure whose leaves are activities and whose internal nodes are operators (e.g., sequence, choice, parallelism, and loop) that describe how their children are ordered.

The fuzzy miner (Günther & Van Der Aalst, 2007) was the first algorithm to address the problems of large numbers of activities and highly unstructured behavior. It uses significance/correlation metrics to simplify the process model at the desired level of abstraction. Besides the significance and correlation of two events directly following one another, it can also measure longer-term relationships, and leave out less important activities (or hide them in clusters). The fuzzy model is designed to allow aggregation and abstraction so as to simplify the observed behavior with the aim of making it more understandable. However, it must be considered that there is a trade-off between model simplification and description of the actual behavior.

Genetic mining algorithms (de Medeiros et al., 2007) try to learn processes characterized by invisible tasks, and by situations where there is a mixture of choice and synchronization.
In the initial population, each individual is represented as a matrix whose elements are the probabilities that a relation exists between pairs of activities. Each individual is then assigned a fitness value measuring its capability of matching the log traces, and a new population is generated by means of genetic operators such as crossover, mutation, and elitism selection. This is repeated until the end of the evolution process; finally, the individual which best fits the log traces is selected as the final solution.

TABLE 1 List of papers divided per discovery approach

Discovery approach | References
Automated | (Abo-Hamad, 2017; Alharbi et al., 2018; Alvarez et al., 2018; Amantea et al., 2020; Antonelli & Bruno, 2015; Araghi et al., 2020; Araghi, Fontaili, et al., 2018; Baker et al., 2017; Bouarfa & Dankelman, 2012; Caron et al., 2011; Caron et al., 2014; B. Chen et al., 2021; Chiudinelli et al., 2020; Cho et al., 2014; Dagliati et al., 2014; De Oliveira et al., 2020; De Weerdt et al., 2012; Duma & Aringhieri, 2020; Fei et al., 2010; Fernandez-Llatas et al., 2015; Fernandez-Llatas et al., 2010; Fernandez-Llatas, Garcia-Gomez, et al., 2011; Furniss et al., 2016; Garg & Agarwal, 2016; A. Grando et al., 2017; H. Xu, Pang, Yang, Jinghui, et al., 2020; Hendricks, 2019; Huang et al., 2012, 2013; Jaroenphol et al., 2015; Kim et al., 2013; Kirchner & Markovic, 2018; Kukreja & Batra, 2017; Leonardi et al., 2019; R. Mans, Reijers, et al., 2012; Meneu et al., 2013; Mertens et al., 2018; Miclo et al., 2015; Montani et al., 2013; Naeem et al., 2017; Neumuth et al., 2011, 2012; Perimal-Lewis et al., 2014; Placidi et al., 2021; Poelmans et al., 2010; Prodel et al., 2015; Rojas & Capurro, 2018; Stefanini et al., 2016; Valero-Ramon et al., 2019; X. Xu et al., 2017; S. Yang, Li, et al., 2017; S. Yang, Zhou, et al., 2017; Lee & Rismanchian, 2018; Zhang & Chen, 2012; M. Zhou et al., 2017)
Automated with clustering | (de Toledo et al., 2019; Delias et al., 2014, 2015; Huang et al., 2014; Kovalchuk et al., 2018; Lakshmanan et al., 2013; Lismont et al., 2016; Najjar et al., 2018; Pebesma et al., 2019; Prokofyeva & Zaytsev, 2020; Ronny et al., 2015; Lu et al., 2019; S. Yang, Tao, et al., 2018)
Interactive | (Benevento, Dixit, et al., 2019; Valero-Ramon, Fernandez-Llatas, Martinez-Millana, & Traver, 2020; Valero-Ramon, Fernandez-Llatas, Valdivieso, & Traver, 2020)

ILP-based PM techniques (Jan Martijn et al., 2008) provide precise results by design, as the user can explicitly define the constraints a process model must satisfy to fit exactly with the log traces. However, they do not scale well, as the simplex algorithm solves linear programs in a time that, in the worst case, is exponential in the number of variables (activities). ILP-based discovery algorithms mine causal dependencies between activities that are detected in the event log. However, this approach performs well only under the assumption that the process under analysis shows frequent behavior; thus, it is ineffective in describing low-frequency exceptional behavior. The algorithm proposed by Jan Martijn et al. (2008) constructs a Petri net as a solution of the linear program, whose objective function expresses a preference for finding the Petri net places.

The PALIA algorithm (parallel activity-based log inference algorithm) (Fernandez-Llatas et al., 2010) is an activity-based PM algorithm.
It first builds a graph taking into account the process parallelism; then, for each node, the algorithm verifies whether the posterior branches are equivalent; if so, nodes and transitions are fused. Finally, repeated transitions and unused nodes are deleted. This algorithm has also been used in the context of interactive process discovery, and follows an activity-based strategy, which means that it uses activities along with their outcomes to generate state transition models.

Some of the first work on automated discovery of process models proposed an HMM-based algorithm (Poelmans et al., 2010), and the heuristic and genetic miners (Fei et al., 2010), for the discovery of patient pathways in the fields of oncology and surgery, respectively. Neumuth et al. (2011) presented a method to calculate a medical process model as a statistical “mean” intervention model, called generic Surgical Process Model (gSPM), from a number of surgical interventions. Caron et al. (2011) used the heuristic miner for the study of cancer treatments, and the social network miner for discovering the interactions between the hospital departments involved. The fuzzy miner was adopted by De Weerdt et al. (2012) for mining clinical pathways of oncological patients. Huang et al. (2012) used sequential pattern mining to find a set of oncological clinical pathways given a minimum support threshold. Neumuth et al. (2012) designed and implemented a surgical workflow management system (SWFMS) to provide guidance for cataract surgery operations. R. Mans, Reijers, et al. (2012) used the heuristic miner to discover the personnel interactions process during dentistry operations. They investigated the process from the control-flow, the organizational, and the performance perspectives. Bouarfa and Dankelman (2012) derived a workflow consensus from multiple logs of medical processes, and then detected workflow outliers automatically and without any prior knowledge. B. Chen et al.
(2021) and Furniss et al. (2016) proposed a framework to discover the usage patterns of personnel when interacting with the HIS. Cho et al. (2014), Jaroenphol et al. (2015), and Kim et al. (2013) used heuristic and fuzzy miners to discover an outpatient care process model. Additionally, in Kim et al. (2013), the model was compared with a process model designed by a domain expert in terms of the accuracy of the matching rate. In Fernandez-Llatas et al. (2010) and Meneu et al. (2013), PALIA was used for the discovery of patient pathways in the fields of cardiology and home hospitalization, respectively. Montani et al. (2013) proposed a heuristic miner-based framework with the aim of analyzing the quality of stroke management processes, in order to verify whether different patient categories are treated differently, and whether hospitals of different levels actually implement different processes. Huang et al. (2013) formally defined the clinical pathway summarization problem as an optimization problem which can be solved by using dynamic programming. This approach first represents the pathway as continuous and overlapping time intervals, then discovers frequent patterns in each time interval from the log. Dagliati et al. (2014) proposed a heuristic miner-based approach to analyze complex temporal datasets of type 2 diabetes patients. The main idea underlying the approach is to use temporal data mining techniques to derive patient behavior processes. Fernandez-Llatas, Garcia-Gomez, et al. (2011) used PALIA for mining the behavior of patients affected by dementia.
The same authors (Fernandez-Llatas et al., 2015) tested both the heuristic miner and PALIA for mining patient pathways from RTLS data collected in a surgical ward. Miclo et al. (2015) and Araghi, Fontaili, et al. (2018) used the fuzzy miner and the heuristic miner, respectively, on RTLS data to mine human movements. Prodel et al. (2015) used an integer linear programming approach to discover the care process at a macroscopic scale from a large-size database. In the approach proposed by Naeem et al. (2017), similar activities are grouped by means of a clustering algorithm, then inductive, heuristic, and fuzzy miners are applied to extract the clinical pathway of patients affected by hepatitis. The preliminary activity clustering step avoids the so-called spaghetti effect. A similar approach was presented by X. Xu et al. (2017), where topics are assigned to similar activities and the fuzzy miner is applied to extract clinical pathways in the context of surgery and neurology. S. Yang, Li, et al. (2017) proposed an HMM-based algorithm to investigate the trauma resuscitation process mined from video recordings. In the same year, M. Zhou et al. (2017) and S. Yang, Zhou, et al. (2017) tested their trace alignment-based algorithm on a trauma resuscitation dataset for the extraction of the process model. Mertens et al. (2018) proposed an algorithm called DeciClareMiner that combines process and decision mining to extract a process model from past executions. Here the process is modeled using Declare, a declarative, LTL-based modeling language. Abo-Hamad (2017) applied the fuzzy miner to discover the actual patient pathways in an emergency department, then studied the variance in the pathways taken by diverse groups of patients, and proposed a performance analysis in terms of bottlenecks and resource utilization.
Rojas and Capurro (2018) provided an approach to identify specific patient cohorts based on complex digital phenotypes as a starting point to identify process models. Using temporal abstraction-based digital phenotyping and pattern matching, they identified a cohort of patients with sepsis from the MIMIC II database, and then applied heuristic mining to discover medication use patterns. With the aim of finding the optimal layout in an emergency department, Lee and Rismanchian (2018) used a sequence clustering plug-in to remove infrequent events and to derive the process model in the form of a Markov chain. Alharbi et al. (2018) used an unsupervised method for detecting hidden clinical pathways in the form of hidden Markov models from the MIMIC-III dataset. Kirchner and Markovic (2018) exploited the local process models paradigm (Tax, Sidorova, Haakma, & van der Aalst, 2016) to model medical processes partially, thus enabling the detection of major process steps. Valero-Ramon et al. (2019) proposed a method based on the PALIA algorithm to discover and identify weight change behavior. They preliminarily applied a trace clustering algorithm to manage variability. L. Xu et al. (2019) presented a constraint-based method using multi-perspective declarative PM for the extraction of clinical pathways from cardiology data. The α algorithm was used in Garg and Agarwal (2016) and Kukreja and Batra (2017), along with the heuristic miner, for the discovery of patient pathways. De Oliveira et al. (2020) used a metaheuristic approach combining Monte Carlo sampling and Tabu search to overcome the complexity that characterizes medical event logs, and used a “replayability score” to determine the fitness of the discovered process model under specific size constraints.
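Several of the miners cited above, the heuristic miner in particular, start from counts of which activity directly follows which in the event log. The following minimal sketch illustrates that first step together with the dependency measure commonly associated with the heuristic miner. It is not the implementation used in any of the surveyed papers; the function names and the toy log are hypothetical.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency measure (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), for a != b."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


# Hypothetical toy log: one ordered list of activities per patient.
toy_log = [
    ["triage", "visit", "lab", "discharge"],
    ["triage", "visit", "discharge"],
    ["triage", "lab", "visit", "discharge"],
]
dfg = directly_follows(toy_log)

# A value close to 1 suggests that the first activity tends to precede the
# second; a value close to 0 suggests no clear ordering between the two.
print(dependency(dfg, "triage", "visit"))
print(dependency(dfg, "visit", "lab"))
```

Arcs whose dependency value falls below a chosen threshold are typically dropped before the model is drawn, which is one way of limiting the spaghetti effect mentioned above.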
FIGURE 5 Process discovery techniques adopted in the surveyed papers

A slightly different approach for the automated discovery of process models consists in applying a trace clustering algorithm to the event log, so as to group together similar traces, and then applying a discovery algorithm to each cluster. This way, a number of clearer, more coherent, and more comprehensible process models are extracted from the event log, each identifying a group of cases united by their behavior. In the medical domain, the event logs extracted from healthcare information systems typically contain many cases that follow different procedures. When an event log summarizes patient treatments, trace clustering makes it possible to cluster patients based on the patient data and on the characteristics of their care journeys. In contrast, sequence clustering focuses only on the control-flow perspective, thus generating simpler models (Rebuge & Ferreira, 2012). Delias et al. (2014) and Ronny et al. (2015) followed the clustering approach to demonstrate the potential of PM in the healthcare domain. Huang et al. (2014) proposed an approach based on probabilistic topic models to mine latent clinical pathways that, composed together, enable the recognition of patient behavior. Traces were preliminarily clustered based on their probability distribution to belong to a specific clinical pathway. Delias et al. (2015) proposed a spectral clustering technique and a trace similarity metric to reduce the effect of noise and outliers, avoiding the “spaghetti effect” on the process model. Najjar et al.
(2018) extracted clinical pathways following a clustering approach; these pathways are then clustered in turn to distill typical pathways, enabling interpretation of the clusters by experts. Kovalchuk et al. (2018) used the K-means clustering algorithm to identify groups of patients based on the paths they followed within the hospital. S. Yang, Tao, et al. (2018) presented a framework for analyzing associations between patient cohorts and the trauma resuscitation procedures their patients received. Patient cohorts are decided by unsupervised clustering, according to which patients clustered into the same cohort must share similar relevant attributes. To this end they proposed an algorithm for measuring patient similarity, so as to use the learned weighted attribute distance as the similarity measure during clustering. Finally, one workflow model was discovered from each patient cohort. With the aim of improving the management of patients' diseases, Pebesma et al. (2019) proposed to cluster frequent pathways based on risk factors such as the gender of the patients. de Toledo et al. (2019) proposed two trace representation methods, namely, vectorial and syntactic. In the vector-based approach, traces are represented as vectors in which each dimension corresponds to a diagnosis. Each dimension can hold either a binary value, denoting the absence or presence of the diagnosis, or a numeric value, denoting the frequency of occurrence of the diagnosis in the log. In the syntactic approach, traces are represented as sequences of events. For clustering, the Hamming and Levenshtein methods are used to define the distance between traces. Lu et al. (2019) proposed a trace clustering approach according to which frequent sequence patterns are first learned from a sample set of patients, then used as a basis to rank patients, and finally used to discover a process map.
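The vector-based and syntactic trace representations attributed above to de Toledo et al. (2019) can be illustrated with a short sketch. It is written under stated assumptions and is not the authors' code: the function names and the diagnosis labels are hypothetical, and pairing the Hamming distance with the vectors and the Levenshtein distance with the sequences is an assumption of this example.

```python
def binary_vector(trace, vocabulary):
    """1 if the diagnosis occurs in the trace, 0 otherwise."""
    return [1 if d in trace else 0 for d in vocabulary]


def frequency_vector(trace, vocabulary):
    """Number of occurrences of each diagnosis in the trace."""
    return [trace.count(d) for d in vocabulary]


def hamming(u, v):
    """Number of positions in which two equal-length vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(u, v))


def levenshtein(s, t):
    """Minimum number of insertions, deletions and substitutions turning s into t."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1,          # delete a
                            curr[j - 1] + 1,      # insert b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical diagnosis labels and traces, for illustration only.
vocabulary = ["I10", "E11", "I21"]
trace_a = ["I10", "E11", "I10"]
trace_b = ["I10", "I21"]

vector_distance = hamming(binary_vector(trace_a, vocabulary),
                          binary_vector(trace_b, vocabulary))
sequence_distance = levenshtein(trace_a, trace_b)
```

Either distance can then be handed to a standard clustering algorithm. The vector view ignores the order of events, whereas the sequence view is sensitive to it.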
In (Prokofyeva and Zaytsev, 2020), groups of patient routes are discovered with a hierarchical agglomerative algorithm; each cluster is then assigned a topic by means of probabilistic topic modeling, and finally popular patient route patterns are identified. Overall, the processes that have been most taken into consideration in the context of automated process discovery are patient pathways (24 papers) and clinical pathways (20 papers), followed by medical processes (6 papers), personnel tasks (4 papers), human movements (3 papers), personnel interactions (3 papers), and patient behavior (3 papers). Much less attention has been paid to the use of interactive PM for the discovery of process models. Benevento, Dixit, et al. (2019) showed that an interactive process discovery approach yields process models that are more accurate and more compliant with clinical guidelines than those obtained by means of automated discovery techniques. Valero-Ramon, Fernandez-Llatas, Valdivieso, and Traver (2020) introduced a method for discovering dynamic risk models for chronic diseases using PALIA, based on health sensor data as evidence of the patients' dynamic behavior. The same author used PALIA to examine the comorbidities associated with obesity in order to obtain common patterns of patients' behaviors (Valero-Ramon, Fernandez-Llatas, Martinez-Millana, & Traver, 2020). Figure 5 shows in which percentage each process discovery technique has been adopted in the reviewed papers.

5.2.2 | Conformance analysis

Conformance refers to the analysis of the relation between the expected behavior of a process and the event log that has been recorded during the execution of the process (i.e., model “to be” vs. model “as is”). There are three approaches to conformance checking: log replay, alignment, and rule checking. In the log replay approach, each trace of the considered log is reproduced event by event in the model to verify that each event complies with the specification of the model itself. In the alignment approach, on the other hand, each log trace is compared with an execution trace of the model, trying to maximize the correspondence between the events of the two traces. Finally, rule checking follows the classic meaning of model checking, in which conformance is interpreted as verifying that the rules derived from the model are consistent with the traces of the event log. The study of the selected documents revealed not only that all three approaches were used in the health context, but also that they can be considered complementary in some sense. In fact, log replay can be preferred when the interest is in the evaluation of early deviations, or simply to get an intuition of the replayability of the traces. For an in-depth analysis of the deviations, instead, alignment is preferable (Caron et al., 2014; S. Yang, Sarcevic, et al., 2018). Finally, rule checking (H. Xu, Pang, Yang, Jinghui, et al., 2020; H. Xu, Yan, Pang, Nan, et al., 2020) provides a simple but effective way to analyze the fulfillment of a set of rules. In some cases, a combination of the above approaches has been proposed (Asare et al., 2020; Dewandono et al., 2013; Kirchner et al., 2012) in order to give a more exhaustive analysis of the case study.
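To make the difference between log replay and rule checking more concrete, the following simplified sketch may help. It assumes that the model is given as a plain set of permitted activity-to-activity moves, which is much weaker than the Petri nets normally used for replay, and it checks a single Declare-style “response” rule. Alignment is not shown. The function names, activities, and rule are hypothetical and do not represent clinical guidance.

```python
def replay_deviations(trace, allowed_moves):
    """Replay a trace step by step and report every move the model does not permit."""
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(trace, trace[1:]))
        if (a, b) not in allowed_moves
    ]


def violates_response(trace, trigger, target):
    """Response rule: every occurrence of trigger is eventually followed by target."""
    pending = False
    for activity in trace:
        if activity == trigger:
            pending = True
        elif activity == target:
            pending = False
    return pending


# Hypothetical "to be" model, expressed as permitted directly-follows moves.
allowed_moves = {
    ("admission", "ct_scan"),
    ("ct_scan", "thrombolysis"),
    ("thrombolysis", "discharge"),
}

conforming = ["admission", "ct_scan", "thrombolysis", "discharge"]
deviating = ["admission", "thrombolysis", "discharge"]

print(replay_deviations(conforming, allowed_moves))
print(replay_deviations(deviating, allowed_moves))
print(violates_response(deviating, "admission", "ct_scan"))
```

Replay pinpoints where in the trace the first deviation occurs, whereas the rule check only states whether a given constraint holds for the trace as a whole.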
In the context of healthcare, the work on conformance analysis includes papers focusing on comparing the actual process model with the designed and prescriptive process model (e.g., clinical guidelines, models hand-made by medical experts, etc.) (Anggrainingsih et al., 2018; Asare et al., 2020; Caron et al., 2014; de Vries et al., 2017; Fernandez-Llatas, Meneu, et al., 2011; Ganesha et al., 2017; Jaturogpattana et al., 2017; Kelleher et al., 2014; Kim et al., 2013; Kirchner et al., 2012; Kukreja & Batra, 2017; Mannhardt & Blinde, 2017; Neira et al., 2019; Placidi et al., 2021; Rinner et al., 2018; Rovani et al., 2015; H. Xu, Pang, Yang, Jinghui, et al., 2020; H. Xu, Pang, Yang, Ma, et al., 2020; H. Xu, Yan, Pang, Nan, et al., 2020; S. Yang, Li, et al., 2017; S. Yang, Sarcevic, et al., 2018). In these works, conformance analysis serves to detect and fully understand where clinical practice deviates from the specifications in a care pathway, which leads to higher quality of care and better patient outcomes. While in the majority of cases conformance checking consisted in the simple application of well-known algorithms, some works provided further contributions in terms of new algorithms, process modeling, and process analysis. H. Xu, Pang, Yang, Jinghui, et al. (2020) presented a conformance checking algorithm based on LTL (linear temporal logic), and tested it for checking the conformance of an ischemic stroke treatment process with clinical guidelines. Kirchner et al. (2012) introduced a method for checking the conformance of ongoing treatment processes. The interesting aspect is that the conformance checking is done on a sparse log, that is, an execution log that does not store all the activities of a patient's treatment. Rinner et al. (2018) described a method, called “time boxing,” for data preparation using a specific naming convention to model the time aspects used in medical guidelines.
Then they exploited the potential of time boxing for a more precise trace alignment between discovered process models and clinical guidelines. In (S. Yang, Sarcevic, et al., 2018), a framework for compliance analysis is proposed that first uses conformance checking to detect deviations of an event log from a process model of trauma resuscitation and then applies an ad hoc algorithm to discriminate between true and false deviations (alarms). In particular, Yang et al. focused attention on so-called “false” deviations, that is: (1) gaps or discrepancies between the model (“work as imagined”) and actual practice (“work as done”); (2) errors in the coding of activity traces; and (3) algorithm limitations. The approach followed by Dunkl et al. (2011) consisted of the following four steps: (1) a process model of the clinical guidelines (CGPM) is generated; (2) synthetic treatment process data are generated by means of a simulation tool fed with the CGPM and the process model of the actual process; (3) a process model is extracted from the synthetic data (SDPM); (4) CGPM and SDPM are compared using the Decision Miner ProM plug-in. Kelleher et al. (2014) showed that trauma resuscitation processes without pre-arrival notification are performed with more variable adherence to advanced trauma life support protocols, and proposed the implementation of a checklist to improve the performance of such processes. In contrast to the above papers, Dewandono et al. (2013) used conformance checking for comparing the event log of incoming patients to that of previously healthy patients, so as to find the most similar medical record of a patient cured by the medical center. This process comparison allows doctors to obtain useful tips while treating patients on the basis of the treatments given to healed patients. Figure 6 shows in which percentage each conformance checking technique has been adopted in the reviewed papers.
5.2.3 | Process analysis

In most PM tasks, process analysis is the last step after a process model has been discovered. The analysis of a healthcare process may concern various aspects such as workflow data, statistics, similarities with the same process performed in different hospitals, performance measurements, how process variants (e.g., patient pathways) differ from each other, and so on. We divided the process analysis papers into five classes, which we discuss separately below.

Variants analysis

Health processes are mostly classified as ad hoc processes, in the sense that the same treatment can follow different paths depending on the patient's response, but also on the discretion of doctors, who have the power to deviate from the guidelines on the basis of their knowledge and experience to address specific patient situations (Detro et al., 2017). In this context, it is of crucial importance to have techniques capable of discriminating different behaviors in the same process so that the analyst can study them separately. We refer to these techniques as process variants detection techniques, which aim to identify a mapping function between each process execution trace (i.e., case) and a specific variant (typically identified through a label). A wide range of methods for log-based process variant analysis has been proposed in the past decade. Supervised or semi-supervised approaches use statistical or visual models for identifying variants, especially in clinical pathways (Andrews et al., 2020; A. Kurniati et al., 2018b).
Unsupervised approaches specialize traditional clustering algorithms (e.g., K-means) for event logs, where the support of each cluster is used to discriminate the main behavior from process variants or infrequent behaviors. These clusters can then be used to reduce the complexity of the analysis, in a sort of “divide and conquer,” and thus to zoom in on specific clinical cases, or to highlight outliers in terms of medical errors, deviations from the clinical guidelines, or system defects (Caron et al., 2014). For example, Rebuge and Ferreira (2012) proposed a method based on the synergistic use of a cluster diagram and a minimum coverage tree to cluster the traces on the basis of their frequencies, thus managing to discriminate between the set of traces representing the main behavior and the less frequent clusters of traces representing the set of process variants. Recently, variants analysis has also been used to compare processes that have the same goal but belong to different organizations, also referred to as process model comparison, which we discuss in the next paragraph. For example, Partington et al. (2015) and Suriadi et al. (2014) presented a case study on the application of PM techniques to measure and quantify the differences in the treatment of patients with chest pain symptoms across four South Australian hospitals.

Process model comparison

PM techniques can be profitably used for model comparison of healthcare processes (Fernandez-Llatas et al., 2013), and suitable metrics for measuring model similarity have been proposed (Montani et al., 2014).
In the medical domain, model comparison makes it possible not only to compare the process actually implemented with the existing clinical reference guideline, to verify its compliance, and/or to understand the level of adaptation to local constraints that may have been required; it is also especially useful for discovering different practices used to treat similar patients and for analyzing their effects on the final outcome of the process. In fact, the studies discussed so far show that the existence of local resource constraints can lead to differences between the models implemented in different hospitals, even when referring to the treatment of the same disease (and to the same guideline). In (Andrews et al., 2020), the comparison and similarity of processes for different patient populations in the domain of breast cancer were assessed in three different ways: (i) visual inspection, that is, human judgment, (ii) similarity measures on directed graphs extracted from the process model, and (iii) cross-log conformance checking by means of the plug-in “Replay a Log on Petri Net for Conformance Analysis.”

Process performance measurements

A quantification of model differences (and perhaps a ranking of hospitals resulting from them) can be exploited for several purposes, such as administrative purposes, performance evaluation, and the distribution of public funds. Especially nowadays it is important to quantify and monitor the quality of care in order to support the effectiveness and efficiency of clinical care, as well as to enable a continuous measurement of performance. Also in this context, PM techniques have proved to be an important tool for the definition and automatic extraction of key performance indicators (KPIs) from HISs, as well as an effective framework for analyzing the interactions between KPIs (see Figure 7) and the process environment (control-flow, organizational resources involved, and data; Abo-Hamad, 2017; R. Mans et al., 2013; R.
Mans, Reijers, et al., 2012; Perimal-Lewis et al., 2012). Even though clinical KPIs strongly depend on the treated disease and on the clinical guidelines, the time factor (i.e., consultation wait time, time spent per task/process, and the average length of stay [LOS] for all patients from arrival to departure, whether discharged or admitted to the hospital) is the most important when considering process performance, especially in the field of time-critical diseases (Jaisook & Premchaiswadi, 2015; Kukreja & Batra, 2017; Stefanini et al., 2018; Yoo et al., 2016). In Gattnar, Ekinci, and Detschew (2011) and Gattnar, Ekinci, Detschew, and Capel-Tunon (2011), a clinical reference process model describing the clinically meaningful KPIs for acute diseases has been proposed.

FIGURE 6 Conformance checking techniques adopted in the surveyed papers

Visual process analytic

Visual Analytics consists in the analysis of processes using visual representations of the data in the form of charts, graphs, histograms, maps, tables, and so on. Seeing and working with event logs in a visual format helps (nonexpert) users identify patterns and understand data insights much more quickly. Aguirre et al. (2019) used bar charts and donut diagrams to detect the existing bottleneck in the surgery process. Fernandez-Llatas et al. (2013) used heat maps (Figure 8b) to highlight the locations and behavior patterns of the patients.
To this end, they proposed HMRA (heat maps rendering algorithm), which generates heat maps by highlighting patient flows with different colors based on the time spent in the visited places and the number of transitions between them. Heat maps make it possible to detect the most important parts of a patient's movements at first glance, by highlighting the most visited locations. Perer et al. (2015) proposed Care Pathway Explorer (CPE; Figure 8a), a tool that uses a frequent sequence mining algorithm specifically designed to work with EMR data, together with techniques for managing event concurrency. CPE visualizes the mined pathways at different levels of abstraction in a user interface consisting of flow visualizations. Dotted charts (Figure 9) are another useful tool for conducting visual analysis. A dotted chart shows an overview of the process where the X-axis shows the time, the Y-axis shows the process execution instances, and each point represents an event; the colors of the points refer to the different activities (García et al., 2015). Yoo et al. (2016) analyzed process changes following changes in the hospital environment (e.g., the construction of a new building), and measured the effects of such environmental changes in terms of consultation wait time, time spent per task, and outpatient care processes. To analyze the task event distribution, they used a dotted chart analysis and derived a two-dimensional graph by plotting task events as dots based on time or frequency. Trace alignment is a PM technique that allows comparison between the activity sequences of process traces. Combined with visualization techniques, trace alignment is a powerful means to detect deviations and to analyze and discover specific cases (Figure 10a). S. Chen et al. (2017) proposed PIMA (process-oriented iterative multiple alignment), an algorithm optimized to handle workflow data, and tested it with endotracheal intubation, primary survey, and trauma resuscitation data. De Oliveira et al.
(2020) proposed a “bow-tie” graph representation of the discovered process model (Figure 10b), where the central activity is the activity of interest, and the left-hand and right-hand sides of the graph contain the sets of activities that happened before and after the central one, respectively. In the “bow-tie” graph, circles represent event nodes of the process model, and links represent the time-ordered sequence of one node following another. The sizes of nodes and links are proportional to the number of patients following this pathway.

FIGURE 7 Duration of time analysis based on the Disco (Fluxicon) performance approach within the collected medical event log (Abo-Hamad, 2017; Jaisook & Premchaiswadi, 2015): (a) a screenshot of the resulting fuzzy mining graph; (b) investigation of the sections/wards “irradiation cystitis” to itself (i.e., as a loop); (c) performance analysis for patients with different triage categories

FIGURE 8 (a) Care pathway explorer: Bubble chart displaying events of the most frequent patterns mined (left), and a flow visualization to show the most frequent patterns (right; Perer et al., 2015); (b) heat map (Fernandez-Llatas et al., 2013)

FIGURE 9 An example of dotted chart

FIGURE 10 (a) Trace alignment: Activities of the same type are aligned in the same column, and each row represents an individual case as its sequence of activities (M. Zhou et al., 2017); (b) bow-tie graph (De Oliveira et al., 2020)

Process understanding

Contrary to the papers discussed above, those classified as Process Understanding do not follow a precise analysis approach; rather, they provide a framework for obtaining information on various aspects of the process under consideration. In the context of the 2011 Business Process Intelligence Challenge, the participants were provided with a real-life event log taken from a Dutch Academic Hospital, where each case is a patient of a gynecology department. With the aim of reporting on a broad range of aspects, Jagadeesh Chandra Bose and van der Aalst (2011) disclosed differences among patients with respect to the diagnosis, treatment, control-flow, and time perspectives. Then, they were able to derive artifacts (i.e., any concrete, identifiable, self-describing chunk of information used in business processes; Nigam & Caswell, 2003) focusing on the organizational perspective. Finally, by means of the fuzzy mining and trace alignment techniques implemented as ProM plug-ins, they reported on common patterns of execution, anomalies, and distinguishing aspects with respect to the treatment procedures followed among cases. Still from the BPI Challenge 2011, Caron et al.
(2011) first used the heuristic miner for process discovery, then reported on the differences at the control-flow level between cases, and on common sequences between treatments. They also performed an SNA for discovering the links between hospital departments. Finally, they provided several observations about the use of specific therapies, such as deviations between the prescribed average number of therapy cycles and the real average, and the relation between personnel and therapies. Forsberg et al. (2016) analyzed PACS (Picture Archiving and Communication System) usage patterns by means of the heuristic miner, and the discovered process model was represented by a Petri net to compute the complexity of the process in terms of three metrics, namely: the number of splits and joins, the ratio of the number of arcs over the number of states, and the number of states that can be reached by all splits. They also provided some statistics on PACS users, such as the number of cases, the average number of commands per case, the average number of command types per case, and the median time to read an examination. With the aim of supporting governance, Agostinelli et al. (2020) used the inductive visual miner for discovering emergency room processes and the patient distribution among the different radiology subdepartments; the social network miner for inferring the interactions between the different subdepartments and different operating rooms; and dotted charts for visualizing the distribution of patients without reservation.

5.2.4 | Process simulation

A process simulation task makes it possible to mimic an existing business process with the aim of conducting a what-if analysis, that is, observing the different trajectories a process would follow as a result of the application of potential changes, or simply of observing the process at runtime from an external point of view in order to understand its dynamics.
This way, the redesign of a business process can be explored and evaluated before it is actually implemented. The prediction of the impact of potential changes enables process understanding, bottleneck analysis, problem identification, performance evaluation, efficiency improvement, and comparison of alternative configurations. By using PM, a simulation model can be generated based on the process model extracted from an event log by means of a process discovery task. In the context of healthcare, the discrete event simulation (DES) approach appears to be dominant. DES concerns the modeling of a system as it evolves over time by a representation in which the state variables change instantaneously at separate points in time (Law et al., 2000). The approach adopted for the implementation of a process improvement task as the result of a process simulation consists of four steps, which are depicted in Figure 11 and described as follows:

1. Data collection: raw data are collected and organized in the form of an event log.
2. Process mining: the event log is fed into a process discovery algorithm which returns the “As-is” process model.
3. Generation of the “To-be” process model: this is a macro step that can be further divided into the following sub-steps:
   a. Model transformation: a simulation model is generated from the As-is process model.
   b. Model validation: the simulation model is validated in order to verify whether it accurately mimics the behavior of the As-is process in relation to a set of performance measures. For each performance measure, the average value obtained from multiple simulations of the As-is model is compared with the corresponding value obtained in the PM step, so as to determine whether the simulation model is valid.
   c. Simulation: the process described by the validated model is simulated many times in order to investigate the impact of a set of process changes on the performance measures considered in the previous step. The optimal combination of changes, that is, the one which allows for the desired process performance, is merged with the validated process model so as to obtain the To-be process model.
4. Process improvement: the real process is changed in order to be aligned with the discovered To-be process model.

FIGURE 11 Graphical representation of a process simulation task

In the healthcare domain, process simulation is used to evaluate the impact of both medical and organizational decisions, by means of a set of performance indicators that include, but are not limited to: the time a patient spends within a hospital department; the time consumed between two activities; how many times activities are performed for a patient; how many times staff members perform certain activities; how many patients are in the hospital daily; the cost (in terms of human resources, consumed energy, spent money, etc.) of the process under analysis; the percentage of utilization of the hospital machinery in the process; and the death rate. R. Mans et al. (2013) simulated a dental surgery process and found that the introduction of new digital technologies is largely beneficial for patients and dental lab owners, whereas for dentists there is hardly any benefit. In (Z. Zhou et al., 2014), a DES model was developed to study the impact of critical resources on the length of stay for patients.
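As a rough illustration of the simulation sub-step in the list above, the following minimal sketch runs a discrete event simulation of a first-come, first-served service point and compares the mean waiting time of an As-is configuration with that of a candidate change. It is not taken from any of the surveyed papers: the function name, the arrival and service times, and the staffing levels are all hypothetical, and in practice the inputs would be estimated from the event log.

```python
import heapq
from collections import deque


def simulate(arrivals, services, n_servers):
    """Event-driven FCFS simulation; returns the waiting time of each patient.

    arrivals: non-decreasing arrival times; services: service duration per patient.
    """
    events = [(t, pid, "arrival") for pid, t in enumerate(arrivals)]
    heapq.heapify(events)
    free_servers = n_servers
    queue = deque()
    waits = [0.0] * len(arrivals)
    while events:
        now, pid, kind = heapq.heappop(events)  # state changes only at event times
        if kind == "arrival":
            queue.append(pid)
        else:  # departure: a server becomes available again
            free_servers += 1
        while free_servers and queue:
            nxt = queue.popleft()
            waits[nxt] = now - arrivals[nxt]
            free_servers -= 1
            heapq.heappush(events, (now + services[nxt], nxt, "departure"))
    return waits


def mean(values):
    return sum(values) / len(values)


# Hypothetical toy inputs (time units are arbitrary).
arrivals = [0, 1, 2, 3]
services = [3, 3, 3, 3]

as_is_wait = mean(simulate(arrivals, services, n_servers=1))  # current staffing
to_be_wait = mean(simulate(arrivals, services, n_servers=2))  # what-if: one more server
print(as_is_wait, to_be_wait)
```

For validation, the As-is result would first be compared with the waiting time measured on the event log; only once the two agree within an accepted tolerance would the what-if scenarios be trusted.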
Different operational scenarios were also analyzed in order to provide recommendations for clinic management and improvement. Cho et al. (2014) conducted a what-if analysis on an outpatient service, considering as performance indicators the overall time for out-clinic, the time for each activity, the frequency of each activity per head, the work time for each activity by resources, the frequency for each activity by resources, and the hourly frequency of patients. Lamine et al. (2015) used DES to assess the efficiency of the management of an emergency call center, considering the speed to answer and the phone call duration as performance indicators. Augusto et al. (2016) conducted a case study on patients having cardiovascular diseases and eligible to receive an implantable defibrillator. They also studied the impact of medical decisions, such as implanting or not a defibrillator, on the relapse rate, the death rate, and the cost. Kovalchuk et al. (2018) demonstrated their approach by simulating the key departments involved in acute coronary syndrome treatment procedures, considering the length of stay as the performance indicator. Tamburis and Esposito (2020) used DES to analyze a cataract surgery process. In (Phan et al., 2019), a what-if analysis is proposed to improve the care pathway for the hernia patients who are most exposed. Here the simulation is used to understand the times of occurrence of complications and the associated costs. O. A. Johnson et al. (2018) presented the ClearPath method, which extends the PM² method with a process simulation approach that addresses issues of poor quality and missing data. Franck et al. (2020) presented a simulation model in order to analyze patient pathways from the ED to hospital discharge, and to find the causes of congestion problems in the ED.
Then they proposed several designs of experiments in order to test medical unit capacity variations, taking into account real data and practitioners' expertise. The simulation tools used in the reviewed papers are: AnyLogic¹⁰ (30%); ProModel¹¹ (10%); CPN Tools¹² (30%); Witness¹³ (10%); Simul8¹⁴ (10%); NETIMIS (O. A. Johnson, Hall, & Hulme, 2016; 10%).

5.2.5 | Social network analysis

SNA is the analysis of the organizational perspective of a process for the evaluation of social structures such as the relationships among people, teams, departments, and organizations. In the context of healthcare, SNA centers on the resource (human and departments) aspect of a care process, providing an analysis of the responsibilities, the authorization issues, and the interactions. The output of an SNA can be either a social graph (Figure 12), in which nodes represent personnel or hospital departments and (possibly weighted and/or oriented) edges denote interactions between nodes, or a comparison matrix (Figure 13).

FIGURE 12 Example of an oriented, non-weighted social graph, in which nodes represent hospital departments (Agostinelli et al., 2020)

FIGURE 13 Example of a comparison matrix. The symbols “→”, “←”, and “↔” represent the direction of information exchange that occurred in the process between the two entities (Riz et al., 2016)

Although SNA can play a crucial role in the analysis of healthcare organizational processes, the literature is limited, and is restricted to the analysis of the five metrics implemented in ProM to generate social networks: the Handover of work metric, which determines who passes work to whom, and is based on the causal dependency between activities, with options to consider only direct succession or to take into account a causality fall factor; the Subcontracting metric, which is similar to Handover of work, except that the relationship between two individuals is bidirectional, while in the previous metric it is unidirectional; the Working together metric, which focuses on how frequently certain individuals work together on the same case, without taking into account the activity dependency; the Similar task metric, which determines who performs the same type of activities; and the Reassignment metric, which detects the reassignment of activities from one individual to another, that is, a person delegates work to somebody but not vice versa. In (Caron et al., 2011, 2014; Naeem et al., 2017; Riz et al., 2016; Ronny et al., 2015) SNA is used to explore departmental collaborations; in (R. Mans, Reijers, et al., 2012), to discover personnel interactions during dentistry surgery operations; in (Conca et al., 2018) and (Durojaiye et al., 2019), to investigate multidisciplinary collaboration in the treatment of patients with type 2 diabetes in primary care and in pediatric trauma care, respectively. Rarely is SNA used as a springboard for further analysis. For example, in (Alvarez et al., 2018) SNA is exploited to discover role interaction models in emergency department processes, and to provide useful knowledge that can help to improve ED processes. A. Grando et al.
(2017) used SNA to compare observed handover-of-care interactions with patient cases, to evaluate timing performances of personnel involved in the care process, so as to find correlation between care paths, and to deduce the time spent by different patient groups.

5.2.6 | Predictive process analytics

Learning how a process behaves at the present time may make it possible to predict how it will evolve in the future. Process prediction is an important task for policy making, resource management and utilization, decision support, planning, and optimization. In the context of healthcare, forecasting patient flows can help managers allocate money and human resources to the many health services provided by the hospital (Benevento, Aloini, et al., 2019; Duma & Aringhieri, 2017, 2020; Kempa-Liehr et al., 2020; Van Der Spoel et al., 2012). Also, predicting complications, comorbidities, and the severity of a disease on the basis of the patients' characteristics can help physicians choose the best treatments and drugs (Back et al., 2020; H. Xu, Pang, Yang, Li, & Zhao, 2020; H. Xu et al., 2021). In some cases, the problem of predicting the process behavior reduces to a classification problem (Benevento, Aloini, et al., 2019; Duma & Aringhieri, 2017; H. Xu, Pang, Yang, Li, & Zhao, 2020; Van Der Spoel et al., 2012; H. Xu et al., 2021), that is, finding the most probable label for a given sequence of events, where the label can represent a process activity, a quantity, a time instant, a resource, or any other key aspect of the process.
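To make the classification view concrete, the following minimal sketch predicts the next activity of a running case as the most frequent label observed after the same recent activities in historical traces. It is only an illustration under stated assumptions: the surveyed works use decision trees and random forests, whereas this sketch swaps in a simple frequency-based classifier, and the function names, the window size `k`, and the toy activity labels are all hypothetical.

```python
from collections import Counter, defaultdict


def train_next_activity(traces, k=2):
    """Count, for every window of up to k preceding activities, what came next."""
    counts = defaultdict(Counter)
    for trace in traces:
        for i in range(1, len(trace)):
            for j in range(1, min(k, i) + 1):
                context = tuple(trace[i - j:i])
                counts[context][trace[i]] += 1
    return counts


def predict_next_activity(counts, prefix, k=2):
    """Return the most frequent label seen after the longest known context."""
    context = tuple(prefix[-k:])
    while context:
        if context in counts:
            return counts[context].most_common(1)[0][0]
        context = context[1:]  # back off to a shorter context
    return None  # nothing comparable was seen in training


# Hypothetical emergency department traces (one list of activities per patient).
history = [
    ["triage", "visit", "xray", "discharge"],
    ["triage", "visit", "xray", "discharge"],
    ["triage", "visit", "lab", "admission"],
    ["triage", "visit", "xray", "admission"],
]
model = train_next_activity(history)
print(predict_next_activity(model, ["triage", "visit"]))
```

The back-off step is a design choice of this sketch: when the full window was never observed, the prediction falls back to a shorter one instead of failing.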
In other cases, linear regression techniques are preferred to classifiers (Benevento, Aloini, et al., 2019; Kempa-Liehr et al., 2020) as tools for inferring the trajectory of ongoing processes. Van Der Spoel et al. (2012) proposed to use a set of care paths belonging to the same diagnosis as training data for the supervised learning of the cost associated with a care product. By means of a random forest classifier they were able to predict the cost of a care product with an F-score of around 0.6, starting from a noisy event log. With the aim of providing a tool to cope with overcrowding, Duma and Aringhieri (2017) used a decision tree classifier to identify the possible paths of a patient on the basis of the information available at triage. Benevento, Aloini, et al. (2019) used both linear regression (LASSO) and classification (random forest) techniques for predicting the waiting time in an emergency department, based on queue-based indicators. Duma and Aringhieri (2020) aimed to predict the next activities with a view to a possible application to online optimization. In particular, they used a decision tree classifier to predict the use of the emergency department resources by each patient on the basis of only the information known when the patient arrives. Kempa-Liehr et al. (2020) proposed to use geometric regression for predicting patient postoperative length of stay. To assist medical staff with thrombolytic therapy decision-making for stroke patients, H. Xu, Pang, Yang, Li, and Zhao (2020) proposed a clinical decision support method using decision tree and random forest classifiers for the prediction of the next activity given a sequence of events. H. Xu et al. (2021) proposed a transfer learning-based framework for predicting the long-term recurrence risk in patients with ICE after discharge from hospital, in order to point out high-risk patients for intervention. Contrary to the work cited above, Back et al.
(2020) used Bayesian belief networks for predicting cycle times of individual phases of the patient flow. In general, Bayesian networks can be used for finding the probability of any event occurring, such as a surgery taking more than x minutes given the case type and condition of the patient, or the likely destination of the patient given other evidence.

5.2.7 | Other applications

In this section, we discuss some papers that were selected during the collection and filtering of the works because they deal with new, emerging PM applications in healthcare. Although the number of papers is small, the related applications are successfully employed in other domains and are attracting attention in healthcare too.

In particular, concept drift is a well-known topic that analyzes process changes, especially from the control-flow perspective, by detecting changes in the process model (insertion, deletion, substitution, and reordering of process fragments) and the time points at which they occur. In the context of healthcare, concept drift is a particular concern since patterns of care emerge and evolve in response to individual patient needs and through complex interactions between people, process, technology, and changing organizational structure. In (A. P. Kurniati et al., 2020) concept drift was exploited within the well-established PM² methodology (Van Eck et al., 2015) in order to discover and analyze changes over time in complex longitudinal healthcare data. Process change detection, localization, and characterization were carried out at three different levels of abstraction: model, trace, and activity. The case study examined process data related to the treatment of endometrial cancer over a 15-year period (2003–2017) in one of the UK's largest cancer centers (Leeds Cancer Centre), with a specific focus on the routes to diagnosis.
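As an illustration of control-flow drift detection, the sketch below compares the directly-follows relations observed in two time windows of a log and reports the relations that appear or disappear. This is a simplified stand-in, not the method of A. P. Kurniati et al. (2020); the function names, the `min_share` threshold, and the activity labels are hypothetical, and the split of traces into windows is assumed to have been done beforehand.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    pairs = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            pairs[(a, b)] += 1
    return pairs


def frequent_pairs(traces, min_share):
    """Keep the pairs whose share of all observed pairs reaches min_share."""
    pairs = directly_follows(traces)
    total = sum(pairs.values()) or 1
    return {pair for pair, n in pairs.items() if n / total >= min_share}


def drift_between(window_a, window_b, min_share=0.1):
    """Report pairs that became frequent or stopped being frequent."""
    before = frequent_pairs(window_a, min_share)
    after = frequent_pairs(window_b, min_share)
    return {
        "appeared": sorted(after - before),
        "disappeared": sorted(before - after),
    }


# Hypothetical diagnostic pathways before and after a change in practice.
early = [["referral", "test_a", "biopsy", "diagnosis"]] * 3
late = [["referral", "test_b", "biopsy", "diagnosis"]] * 3
print(drift_between(early, late))
```

Here the substitution of one process fragment shows up as two appearing and two disappearing relations, while the unchanged step from biopsy to diagnosis is not reported.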
Outlier detection is another popular PM application, whose aim is to identify an event, a single process execution (trace), or a process model whose behavior differs from the expected one. In (Bouarfa and Dankelman, 2012) deviations from standard surgical practice are detected in order to enrich and extend medical protocols automatically in the case of good practices, or to take countermeasures in the case of serious complications or errors. To detect outliers, a global pair-wise sequence alignment (Needleman–Wunsch) algorithm is applied to video coming from Laparoscopic Cholecystectomy (LAPCHOL) procedures. Alharbi et al. (2017) defined an event as an outlier if it occurs more often than a threshold interval determined from the central tendency and measure of dispersion of intervals for that event. More often, clustering is the basic method for detecting outliers, which are identified as the smallest clusters (Folino et al., 2011). This approach was used in (Han et al., 2011), where an Abnormal Process Instances Identification Method (APIIM) was proposed to detect abnormal process instances in the clinical pathway.
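The alignment-based idea mentioned above can be sketched as follows: a trace is scored against a reference sequence with the Needleman–Wunsch global alignment recurrence, and traces whose normalized score falls below a threshold are flagged. This is a minimal sketch that assumes the surgical steps have already been extracted as activity labels; the scoring values, the `min_similarity` threshold, the function names, and the step names are hypothetical and not taken from Bouarfa and Dankelman (2012).

```python
def needleman_wunsch(reference, observed, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of two activity sequences."""
    n, m = len(reference), len(observed)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # reference steps aligned with gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # observed steps aligned with gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = reference[i - 1] == observed[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[n][m]


def is_outlier(reference, observed, min_similarity=0.8):
    """Flag a trace whose length-normalized alignment score is too low."""
    longest = max(len(reference), len(observed), 1)
    return needleman_wunsch(reference, observed) / longest < min_similarity


# Hypothetical reference procedure and one observed execution that skips a step.
protocol = ["incision", "dissection", "clipping", "removal", "closure"]
observed = ["incision", "dissection", "removal", "closure"]
print(needleman_wunsch(protocol, observed), is_outlier(protocol, observed))
```

Normalizing by the longer sequence keeps the threshold comparable across procedures of different length; other normalizations are equally plausible.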
It is well known that one of the most important challenges in health applications concerns the privacy and security of sensitive data, so much so that multiple health regulations at different government levels define strict privacy requirements (e.g., personal data are protected by privacy legislation such as the GDPR in the European Union, the Health Insurance Portability and Accountability Act in the United States, or the Personal Information Protection and Electronic Documents Act in Canada). It is surprising that in the PM literature we found no more than one paper related to the application of privacy-preserving PM in the healthcare domain. In (Pika et al., 2019), data privacy and utility requirements for healthcare process data were analyzed in order to assess the suitability of existing privacy-preserving data transformation approaches, and to propose a privacy-preserving PM framework that can support PM analysis of healthcare processes with obfuscated data.

The effectiveness of PM applications is strictly dependent on the underlying data quality. Data acquisition thus represents a key step for the subsequent work performed with the data, such as PM tasks. Typically, data quality issues are due to organizational reasons: often the data collection task in hospitals is first accomplished on paper, and the data are put into electronic format at a later time. Moreover, HISs may be unable to exchange data with each other, thus leaving to the hospital staff the responsibility of doing it manually. Data quality assessment is thus a fundamental step to be carried out before any PM task, as it enables the correct evaluation of the adopted PM algorithms. Perimal-Lewis et al. (2016) assessed the quality of emergency department data extracted from the EHR of an Australian public hospital. The presence of incorrect timestamps was identified as the cause of flow anomalies found in patient pathways mined from the event log.
The set of corrective actions proposed by the authors to address such data quality issues exclusively regards the improvement of the data collection task carried out by the hospital staff. A. P. Kurniati et al. (2019) provided an assessment of the MIMIC-III data quality in terms of missing data, incorrect data, imprecise data, and irrelevant data, following the method presented by Weiskopf and Weng (2013). Although they found (1) missing data among events, case attributes, activity names, timestamps, and event attributes; (2) incorrect data among events, cases, and timestamps; and (3) imprecise data among resources and timestamps, the overall data quality of MIMIC-III was found to be good for performing PM tasks. Lanzola et al. (2014) assessed the data quality of a stroke registry of an Italian hospital, and found that the most common errors and missing data detected among records were missing onset-arrival time (43% of the whole dataset) and missing stroke scales at follow-up (22%), followed by the violation of temporal constraints among dates (7.5%) and other minor issues.

5.3 | PM methodologies for the analysis of healthcare processes

In this section, we report on the PM frameworks that were proposed to guide the implementation of PM applications with healthcare data. Most of them were explicitly built on existing, well-known PM methodologies, such as PM² (Van Eck et al., 2015), L* (Van Der Aalst, Adriansyah, De Medeiros, et al., 2011), and CRISP-DM (Wirth & Hipp, 2000); others, instead, simply follow the classical pipeline (1) data Collection, (2) Event log preparation, (3) process Discovery, (4) process Analysis (CEDA). The main phases of these PM approaches are depicted in Figure 14, with the addition of the DIAG approach, the only PM methodology designed for PM tasks carried out in a healthcare context. McGregor et al.
(2011) extended the CRISP-DM model with temporal and multidimensional aspects to provide a structured approach to knowledge discovery of new condition onset pathophysiologies in physiological data streams, and used Patient Journey Modelling Architecture (PaJMa) for the knowledge representation. PaJMa (McGregor et al., 2008), designed specifically for healthcare, provides a visual representation of the process, and of the information and technologies involved in the patient's journey. The CRISP-DM (CRoss Industry Standard Process for Data Mining) reference model provides an overview of the life cycle of a data mining project. The life cycle is divided into six phases, as shown in Figure 14. Although the proposed schematization is that of a block diagram, the sequence of the phases is not rigid. The arrows indicate the most important and frequent dependencies between phases, especially in the particular context of healthcare, where the phase to be performed next depends on the outcome of the phase, or of the task within a phase, performed previously. In general, the process is not intended to be finished once a solution is deployed, as the knowledge learned during the process can trigger new, often more targeted, business questions.

FIGURE 14 Process mining methodologies

Gonzalez-García et al. (2020) proposed a PM²-based methodology for the Code Stroke analysis use case. In Erdogan and Tarhan (2018), the authors proposed a goal-driven PM²-based approach, and tested it on the surgery process of a university hospital.
The application of the proposed approach revealed bottlenecks and deviations that were crucial for determining measures to improve the efficiency of the surgery process. The PM² approach considers two types of models, namely process and analytical models. Process models describe the ordering of activities in a process, detailing time constraints, resource, or data usage. Analytical models are any other type of model that provides information about the process, such as decision trees. In the planning and extraction phases, initial research questions are defined and event data are extracted, respectively. Then, multiple analysis iterations are performed, possibly in parallel. In general, each analysis iteration performs the remaining phases one or more times, each focusing on answering a specific research question by applying PM algorithms and evaluating the discovered models. If the results are satisfactory, they can be used for improving the process.

Delias et al. (2014) demonstrated the potential of PM for the analysis of emergency department processes. The main contributions include a CEDA-based approach enriched with a preliminary trace clustering before process discovery; the identification and visualization of the process paths followed by patients; the discrepancy between rare flows; performance analysis; and compliance checking with respect to the medical standards. Trace clustering groups patient pathways with similar characteristics, so that the process discovery phase can be applied to each cluster to obtain a set of process models instead of a single “spaghetti-like” model. Lismont et al. (2016) proposed a PM framework that also follows the CEDA approach reinforced with trace clustering, with the addition of an activity clustering task that renames similar activities with the same label so as to obtain more understandable process models. Tóth et al. (2017) provided an overview of the difficulties of applying PM in healthcare, gave recommendations for managing such problems, and suggested a CEDA-based workflow to generate more precise process models.

The L* methodology was used as a basis for a PM framework for the analysis of cancer pathways mined from the MIMIC-III dataset (A. P. Kurniati et al., 2018a). In contrast to the other PM approaches, in L* particular attention is given to the “process modeling” phase, which is divided into two stages (yellow blocks in Figure 14): in the first one, a process model is mined from the subsets of data extracted in the previous stages; in the second, the model is enriched with additional data coming from the different data perspectives of the process under analysis.

The DIAG approach helps visualize and analyze patient pathways by combining real-time localization and PM (Araghi, Fontanili, et al., 2018). The analyses provided by this approach are based on the location data generated by real-time location systems (RTLS), which, besides being the basis for the analysis of patient movements, also prove helpful as a complement to other medical data. DIAG is structured in four layers, namely data, information, awareness, and governance, and operates through six stages (depicted in Figure 14). In the data gathering stage, the movements of patients/objects are monitored and stored by the localization systems. The positioning algorithms are used to identify the tags placed on the monitored objects, while in the log refinement stage the data are preprocessed to make them suitable for the PM techniques used in the modeling. In the modeling stage, process discovery algorithms are executed. The analyzing stage provides quantitative analysis for users, and the performance level of the processes is assessed by considering the way patients circulate in the environment while receiving their treatment.
In the diagnostic stage, the emphasis is placed on the causes that trigger weaknesses in the execution of processes. Finally, various possible scenarios for improvement are prefigured in the prognosis stage through simulation techniques.

Although not in the form of a PM framework, the following papers have outlined a series of guidelines in the form of key questions to be answered, or challenges to be faced, when conducting a PM task in a healthcare context. R. S. Mans, Van der Aalst, et al. (2012) investigated the data challenges that are faced when answering four questions frequently posed during PM projects in hospitals, namely: (Q1) What are the most followed paths and what exceptional paths are followed?; (Q2) Are there differences in care paths followed by different patient groups?; (Q3) Do we comply with internal and external guidelines?; and (Q4) Where are the bottlenecks in the process? They investigated the characteristics of HIS data and whether they help answer the above questions. Finally, they illustrated which data challenges exist when answering the questions, and provided tips for addressing them. Homayounfar (2012) identified three main sources of problems that challenge the performance of PM tasks in the healthcare domain.
These are: (1) the complexity of processes caused by the heterogeneous nature of hospital environments; (2) the continuous ad hoc actions performed by physicians, which are the main cause of the so-called “spaghetti effect” on the process models resulting from a process discovery task; and (3) the poor quality of the data collected in HISs, which is the main cause of the generation of process models that do not fit the real processes. Kaymak et al. (2012) argued that existing PM techniques fail to extract intelligible process models,¹⁵ and proposed a few recommendations for making such techniques more effective when applied in the healthcare domain.

5.3.1 | Other approaches

The methodologies discussed so far are high-level PM guidelines that guide the execution of PM tasks from the collection of raw data to the analysis and improvement of the process, without going much into the details of the PM techniques used in each phase. The PM approaches we are going to discuss, instead, are specific to the discovery or the conformance checking of process models.

Business Process Life cycle (BPL) (Weske, 2007) is a PM methodology typically adopted for conformance checking. Among the surveyed papers, nine followed this approach (de Vries et al., 2017; Kelleher et al., 2014; Kirchner et al., 2012; Neira et al., 2019; Rinner et al., 2018; Rovani et al., 2015; H. Xu, Pang, Yang, Ma, et al., 2020; H. Xu, Yan, Pang, Nan, et al., 2020; S. Yang, Sarcevic, et al., 2018). As shown in Figure 15, BPL consists of four phases organized in a cyclical structure. In the design phase, a hand-made model of the “To-be” process is designed. In the configuration phase, the process is configured according to the model designed in the previous phase. In the enactment phase, the real process is executed and data are collected.
In the evaluation phase, the “As-is” process model is mined from the collected data by means of process discovery algorithms and compared with the “To-be” model designed in the design phase. The discrepancies between the two models can be used to improve the real process, and the BPL methodology can then be reapplied to the improved process. Although the BPL structure does not imply any specific temporal order in which the phases have to be executed, in the majority of the surveyed papers the design stage was the starting point.

FIGURE 15 The business process life cycle methodology

Interactive PM is a PM approach specific to the discovery of process models, and bases its strengths on the combination of domain knowledge and PM techniques (Dixit et al., 2018). This allows for the generation of more precise process models, as the potential drawbacks of automated discovery techniques are mitigated by the intervention of the domain expert. In a healthcare context, where processes are highly complex, heterogeneous, and dynamic, physicians lend their deep domain knowledge to improve the process discovery task (Fernandez-Llatas, 2021a, 2021b; Ibanez-Sanchez et al., 2019; Martinez-Millana et al., 2021).

The majority of the PM papers reviewed so far focus on the analysis of healthcare processes from a single point of view, which most of the time is the control-flow perspective (see Section 5.2.1) and, less often, the organizational (see Section 5.2.5) and performance perspectives (see Section 5.2.3).
However, the high complexity of the healthcare domain requires the analysis of multiple perspectives to have a more complete view of the processes under consideration. To this end, the multi-perspective approach analyzes the processes from different points of view, which makes it possible to acquire deeper knowledge by combining the information mined from the different perspectives. Typically, this approach consists in the application of PM algorithms to different fragments of the event log, namely, activities for the analysis of the control-flow perspective, resources for the analysis of the organizational perspective, and timestamps for the analysis of the performance perspective. In other cases, dedicated tools, such as the Multi-perspective Process Explorer ProM plug-in (Mannhardt et al., 2015), automate this task by taking the whole log as input. Rebuge and Ferreira (2012) studied infrequent behavior and process variants from the analysis of the control-flow, organizational, and performance perspectives. R. Mans et al. (2013) combined in one model the results obtained from the control-flow, organizational, and performance perspectives, and the obtained model was used for process simulation purposes. Mannhardt and Blinde (2017) used the Multi-perspective Process Explorer ProM plug-in to investigate the performance and control-flow perspectives. Araghi et al. (2019) collected RTLS data and analyzed patients' movements from the control-flow and performance perspectives. H. Xu, Pang, Yang, Jinghui, et al. (2020) presented a declarative, multi-perspective PM method for the modeling of clinical processes. To capture the different process perspectives, they classified event attributes into seven types based on medical services, and each relationship between types was represented in a relationship matrix in terms of a constraint expressed in the Declare language.
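The perspective-wise use of event log fragments described above can be sketched as a simple projection of the log: activities yield trace variants (control-flow), resources yield who-does-what counts (organizational), and timestamps yield case durations (performance). This is only an illustrative sketch under assumed field names (`case`, `activity`, `resource`, `timestamp`) and made-up events; it does not reproduce any of the cited tools.

```python
from collections import Counter, defaultdict
from datetime import datetime


def project_perspectives(events):
    """Split an event log into control-flow, organizational, and performance views."""
    by_case = defaultdict(list)
    for event in sorted(events, key=lambda e: (e["case"], e["timestamp"])):
        by_case[event["case"]].append(event)

    # Control-flow: how many cases follow each ordered sequence of activities.
    control_flow = Counter(
        tuple(e["activity"] for e in case_events) for case_events in by_case.values()
    )
    # Organizational: how often each resource performs each activity.
    organizational = Counter((e["resource"], e["activity"]) for e in events)
    # Performance: minutes elapsed between the first and last event of each case.
    performance = {
        case: (evs[-1]["timestamp"] - evs[0]["timestamp"]).total_seconds() / 60
        for case, evs in by_case.items()
    }
    return control_flow, organizational, performance


# Hypothetical log with two patient cases.
log = [
    {"case": "p1", "activity": "triage", "resource": "nurse", "timestamp": datetime(2021, 1, 1, 8, 0)},
    {"case": "p1", "activity": "visit", "resource": "physician", "timestamp": datetime(2021, 1, 1, 9, 30)},
    {"case": "p2", "activity": "triage", "resource": "nurse", "timestamp": datetime(2021, 1, 1, 8, 15)},
    {"case": "p2", "activity": "visit", "resource": "physician", "timestamp": datetime(2021, 1, 1, 8, 45)},
]
variants, workload, durations = project_perspectives(log)
print(variants, workload, durations)
```

Combining the three views, for example by comparing durations across variants, is where a multi-perspective analysis would go beyond this projection.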
The control-flow, organizational, and performance perspectives were also at the basis of the analyses conducted by R. Mans, Reijers, et al. (2012), Neumuth et al. (2012), and Ronny et al. (2015).

The process discovery algorithms discussed in Section 5.2.1 share the common characteristic of searching for “complete” process models, that is, they try to extract a model which describes the whole event log. However, in highly flexible settings, such as healthcare event logs where traces may differ considerably from each other and no global process model exists that describes the whole log, such algorithms may be subject to the so-called “spaghetti effect,” that is, they output an uninterpretable process model. To overcome this issue, Kirchner and Markovic (2018) exploited the potential of Local Process Model Discovery (LPMD; Tax, Sidorova, Haakma, & van der Aalst, 2016) to extract a set of process models for the clinical pathways followed by living liver donors. LPMD extracts a set of process models, each representing the process described by a subset of the event log, thus enabling the detection of the most representative process behaviors. Although LPMD is computationally inefficient in the case of a large number of distinct activities, heuristics for speeding it up have been proposed (Tax, Sidorova, van der Aalst, & Haakma, 2016). Figure 16 shows the percentage in which each PM methodology has been adopted in the reviewed papers.

FIGURE 16 Process mining methodologies adopted in the surveyed papers

5.4 | Tools

For the implementation of the PM applications discussed in the previous sections, authors have typically made use of dedicated PM tools such as ProM,¹⁶ Disco,¹⁷ Celonis,¹⁸ and so forth; more rarely, they have developed PM algorithms using programming languages such as Python, C#, R, and C. Figure 17 shows the percentage of use of the PM tools and programming languages adopted by the authors of the surveyed papers (simulation tools are not included, as they are already listed in Section 5.2.4). It can be seen that the most used PM tools were ProM (43%) and Disco (20%), which together account for more than half of all choices. ProM is the most complete tool, with tens of algorithms for process discovery, conformance, alignment, and other PM applications, and the possibility for developers to implement their own solutions. A more detailed description of ProM can be found in (Tibeme et al., 2018), where the authors analyze how ProM algorithms perform with clinical workflows. Despite their scarce use, it is worth paying attention to the PALIA suite and R.IO-DIAG¹⁹ tools, as they were designed for working with healthcare data. The PALIA suite consists of PALIA-ER (Rojas, Fernandez-Llatas, et al., 2017) and PALIA ILS Web Tool (Fernandez-Llatas et al., 2015), both web-based PM tools that use the PALIA discovery algorithm and are designed to be easy to use for users who are not PM experts. The former is a tool for question-driven PM in Emergency Rooms, which includes model simplification and filtering features specific to the ER domain.
The latter is specifically designed for dealing with ILS data in healthcare environments, and provides a graphical view of the process under analysis, plus trace clustering algorithms for the generation of different models related to groups of patients with similar behavior. R.IO-DIAG is an open-source software developed to assist users in the implementation of PM tasks that follow the DIAG approach, a PM methodology for the analysis of patient pathways obtained from RTLS data (Araghi, Fontaili, et al., 2018). R.IO-DIAG allows users to perform process discovery, conformance checking, and process enhancement from event logs corresponding to location data of patients.

Among the various programming languages used for the development of PM tasks in the healthcare domain, R was the most used, in particular through the pMineR library (Gatta et al., 2017), designed to implement PM tasks with medical data. pMineR supports process discovery and conformance checking, and presents processes in the form of Markov models, which are easy to understand for medical users. Moreover, it is well suited for the representation of clinical guidelines (in terms of human-readability), thanks to some aspects taken from the Computer Interpretable Clinical-Guidelines field.

FIGURE 17 Tools used in the surveyed papers

Different from the PM tools described above, PMApp (Valero-Ramon, Fernandez-Llatas, Martinez-Millana, & Traver, 2020) is a tool designed to create custom PM dashboards for medical purposes.
Such dashboards allow the selection of the most adequate views and algorithms for each medical case. PMApp is built using the Process Choreography Paradigm (Barros et al., 2010), and integrates PMCode, a .NET toolkit for the development of PM algorithms. Zhu et al. (2010) developed a workflow-based dashboard to help hospital staff monitor the state of the process in real time. Martinez-Millana et al. (2019) developed a dashboard that discovers patient flows from the location data of patients undergoing an intervention. The dashboard also allows users to filter data and compute statistics. Figure 17 shows the percentage of reviewed papers in which each tool has been used.

6 | OPEN ISSUES AND FUTURE RESEARCH DIRECTIONS

In this section, we provide an answer to the research question Q3 defined in Section 2. The literature discussion shows that the high complexity of the healthcare domain hinders PM tasks from generating satisfactory results. This is due to four main reasons: first, data collection is not always performed by process-aware information systems, so the data format is not suited to PM tasks; second, data are often first collected on paper and put into electronic format at a later time, which is error prone; third, health data are collected in different, non-synchronized repositories (EHR, PACS, etc.), so they need to be manually recovered from different data sources before being processed; fourth, medical event logs may contain traces (e.g., patient cases) that differ greatly from each other, which prevents process discovery algorithms from generating process models descriptive enough to be successfully employed by physicians. These issues slow down the knowledge acquisition process, as they require an intensive, time- and resource-consuming data preprocessing task before PM applications can generate useful results.
Note that three of the four are data collection issues and, as such, must be addressed by hospital managers, while only one derives from the nature of the data itself. Data collection issues can be solved (albeit partially) by introducing automated data collection tools, such as RTLS and wearable smart devices that autonomously and punctually collect information on patients and staff, and by providing hospital staff with electronic tools that increase the degree of automation and the quality of the data collection process. Furthermore, the adoption of a single information system shared across different healthcare facilities could be a valid solution to easily recover a patient's whole care history and to analyze the entire care path.

On the other hand, the heterogeneous nature of medical event logs is a problem for data scientists. As already mentioned in the previous sections, the high complexity of health data is the primary cause of the so-called “spaghetti effect” that characterizes the process models mined from such data. Different techniques have been proposed to overcome this issue, such as trace clustering, activity clustering, topic modeling, and local process model discovery, which split the process discovery task across different sub-logs, each showing different characteristics. However, such PM methodologies need to be properly tuned on each specific medical case, which requires a deep knowledge of the medical domain and thus a close collaboration between data scientists and physicians.

Another important aspect that future research in PM for healthcare has to take into account concerns the presentation of the results obtained by means of knowledge discovery tools. An improvement in the representation of complex and variable health processes would help physicians understand the outcome of PM applications, and consequently increase the usability of PM in healthcare contexts.
The tip of the iceberg of this challenge is to hide the complex PM techniques behind user-friendly and interactive interfaces and notations, through which analysis models can be set up automatically using simple settings on parameters and filters.

Another promising research direction is to integrate PM analysis results into simulation and scheduling optimization frameworks. PM has already been used in production contexts to extract relevant multiproduct planning information or to plan activities in accordance with business rules. To the best of our knowledge, these approaches have not been investigated in healthcare settings, and we believe this topic is worthy of further investigation.

Among the research areas to be further investigated is conformance analysis: although it is a widely investigated line of research in PM, its applicability to the medical field is not trivial. The challenge is to revise these techniques so that they can work with less structured processes, by taking into account different perspectives of analysis (not just control flow) and a greater number of activities.

Finally, we argue that privacy-preserving PM needs further investigation, as it has received very little attention (we found only one paper). In fact, event logs contain sensitive personal information, which must be obfuscated using privacy protection techniques that modify the original data while preserving its usability.

7 | RELATED WORK

In this section we report on the survey papers that have been published in the context of PM in healthcare.
The discussion is organized by publication year.

2021: Dallagassa et al. (2021) discussed 270 articles ranging from 2002 to 2019. They first show how the use of PM has evolved in the healthcare domain over time; then they propose two classifications of the surveyed articles, based on PM applications and on PM techniques. They found that the discovery of process models was the most frequent PM application, and that the most adopted algorithms were the fuzzy miner and the heuristic miner.

2020: Motivated by the fact that previous survey articles did not report clinical aspects in a uniform way, did not follow a standard clinical coding scheme, and did not always describe the details of the event log data, Helm et al. (2020) surveyed 38 PM studies in healthcare, published between 2016 and 2018, with an emphasis on the details of the event log data, algorithms, and techniques used. In particular, for each reviewed paper they described the characteristics of the event log, and referred to a standard clinical coding scheme for the information on clinical specialty and medical diagnoses. Grüger et al. (2020) surveyed 55 papers in the medical domain of oncology, and investigated how PM has been applied in order to acquire semantic case descriptions from healthcare information systems (HIS). The authors envision that such information can be used as experience by a case-based reasoning (CBR) system to support eminence-based decision making. They found that (i) most papers focus on the analysis of data using PM and less on describing the process and difficulties of exporting and extracting HIS data and transforming them into event logs; (ii) none of the surveyed papers examines the use of PM for case acquisition for CBR; and (iii) the application of PM in oncology especially focuses on the control-flow perspective.

2019: Farid et al. (2019) reviewed eight papers where PM techniques were applied to the care of frail elderly people.
The authors presented the results referring to five emerging themes, namely, geographical location, analysis methodology, source data types, adopted process, medical context and open challenges. Jangi et al. (2019) showed the use of semantic PM to enhance hospital processes. In total, six articles were surveyed.

2018: G. P. Kusuma et al. (2018) provided a summary of PM studies undertaken in the field of cardiology. They identified 32 relevant studies from 2008 to 2017, and analyzed them across five themes: process and data types, techniques, perspectives, tools, and methodologies. In the analysis of the limitations and future work, they pointed to data quality as the major issue that needs to be addressed. Williams et al. (2018) investigated the extent to which PM has been applied to primary care, and identified seven relevant papers. They summarized the data sources, geographical locations, and medical domains that were reported, and identified PM challenges in a primary care context. The critical issues identified in the selected studies concern the coherence and completeness of the data collected, the choice of different algorithms and tools, and the effective presentation and application of the obtained results in real scenarios. Erdogan and Tarhan (2018) presented a systematic mapping of PM in healthcare where 172 research papers published between 2005 and 2017 are categorized considering the type of research, the specific application context, mining algorithms, and process modeling language. The authors also summarized the demographic and bibliometric trends specific to the domain, in terms of publication volume, most influential documents in the reference research community, geographical location of contributing researchers, and top venues. Batista and Solanas (2018) investigated the applications of PM in the healthcare sector.
In particular, the authors focused on the major trends by dividing the discussion into the following main aspects: medical data and preprocessing, medical fields, medical process type, objective, perspective, algorithms, tools, and medical structures. Heterogeneity within the healthcare domain has proved to be one of the main challenges for PM, which is due to the fact that different therapies can be followed to treat different patients with the same disease. The authors argued that this aspect makes it necessary to enhance the quality of the event log data with further details related to the context.

2016: Ghasemi and Amyot (2016) presented a systematized literature review based on the analysis of papers obtained through existing reviews rather than on the direct analysis of a large list of first-hand papers. They also discuss the literature analysis provided by other review papers. A. P. Kurniati et al. (2016), through an accurate thematic review, analyzed 37 works in the field of oncology, formulating specific research questions according to the processes and types of data, techniques, methodologies, and tools used, and also highlighting the limits found and the research guidelines for future works. The review highlights the potential value of PM for improving cancer care processes, provides an overview of the work undertaken, and finally identifies research opportunities in this field of study. Rojas et al.
(2016) reviewed 74 papers according to the following main aspects: process and data types, frequently posed questions, PM techniques, perspectives and tools, methodologies, implementation and analysis strategies, geographical analysis, and medical fields. The most commonly used categories, emerging topics, and future trends were identified. The authors concluded the paper with a list of the most critical challenges: (i) portable solutions must be developed that can adapt to hospital environments other than those in which they have been used; (ii) user-friendly tools for the visualization of the process models are needed; (iii) benchmarking studies must be conducted between different hospitals to identify and emulate success stories.

2015: Rojas et al. (2015) conducted a bibliographic study on the use of different algorithms, techniques, and PM tools applied in the healthcare domain. The authors highlighted the limitations and the resulting challenges, namely: assessing the compliance of real medical processes with medical protocols and guidelines; overcoming the technical pitfalls inherent in identifying, accessing, and integrating data sources; and overcoming poor data quality issues by means of preprocessing tasks. Mans et al. published a book (R. S. Mans et al., 2015) that gives a wide overview of the application of PM in the healthcare domain.

2014: W. Yang and Su (2014) analyzed 37 studies published between 2004 and 2013, focusing on the discovery of the medical process as an essential tool for the design of clinical pathways, on the analysis of process variants, and on process performance measurements to identify possible improvements. From the discussion on the limited applicability of PM in medical contexts, it emerges that most of the mining algorithms are not adequate to handle medical processes because of their unstructured, complex, and variable nature.
It follows that the models obtained are not able to explain the numerous variants, slowing down the overall improvement process. To overcome these issues, the authors suggested four research directions: (1) in-depth analysis of variants, (2) integrated process management, (3) customization, and (4) self-learning improvement of the clinical pathways.

8 | CONCLUSIONS

In this article we have reviewed 172 papers published in the last 10 years that present various applications of PM in the healthcare domain. We have seen how PM techniques have been applied to health data for mining useful knowledge, and how the acquired knowledge can help physicians and hospital managers to better understand and improve health processes. We have discussed the surveyed papers based on the taxonomy proposed in Section 1, according to which a PM task can be characterized by the application, which defines the purpose of the PM task; the algorithms used to achieve the intended purpose; the tools used to implement the algorithms; the approach followed to finalize the application; the kind of process; and the source of data. The literature discussion shows that PM research in healthcare is growing rapidly and features a large number of techniques, which are necessary to deal with the high complexity of healthcare data.
In particular, the main contributions in the health sector that we have been able to appreciate in the drafting of this work concern:

• Data preprocessing, including the collection and preparation of event logs, as well as their optimization and quality improvement;
• Discovery and analysis of process models for the evaluation of healthcare services;
• Evaluation of process conformance with respect to medical protocols and clinical guidelines;
• Performance evaluation of healthcare processes in terms of bottlenecks and time management;
• Investigation of the management of hospital resources by means of SNA;
• PM tools for the application of PM techniques and for the visualization of the obtained results;
• Simulation and predictive process techniques for the investigation of potential scenarios;
• PM methodologies to guide the execution of PM tasks in the healthcare domain.

Despite the great effort that has been spent by researchers in this last decade, the field is still open for further research and practice. In Section 6, we have outlined the challenges to be faced in order to make PM methods more effective when applied to health data, which can be summarized as: producing higher quality data; designing better-performing PM techniques to deal with highly variable processes; providing user-friendly PM tools to be used by nonexpert PM users; and spending more effort on privacy-preserving PM, so as to be able to analyze data containing sensitive information, as in the case of medical event logs, without incurring privacy issues.
We expect the results of this review to be used fruitfully to make the research community and practitioners more aware of the importance of PM applications in the healthcare sector, and to provide insights to direct future efforts.

CONFLICT OF INTEREST

The authors have declared no conflicts of interest for this article.

DATA AVAILABILITY STATEMENT

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

AUTHOR CONTRIBUTIONS

Antonella Guzzo: Conceptualization (equal); formal analysis (equal); investigation (equal); methodology (equal); supervision (equal). Eugenio Vocaturo: Data curation (equal); investigation (equal); resources (equal). Antonino Rullo: Conceptualization (equal); formal analysis (equal); investigation (equal); methodology (equal); supervision (equal).

ENDNOTES

1 https://www.ariscommunity.com
2 https://fluxicon.com/disco
3 https://www.promtools.org
4 https://www.celonis.com
5 https://mimic.mit.edu
6 https://www.who.int/standards/classifications
7 https://cheatography.com/deleted-2754/cheat-sheets/major-diagnostic-category-mdc-to-ms-drg-mapping/
8 https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/MedicareFeeforSvcPartsAB/downloads/DRGDesc05.eps
9 https://www.whocc.no/atc_ddd_index/
10 https://www.anylogic.com/
11 https://www.promodel.com/
12 https://cpntools.org/
13 https://www.lanner.com/en-us/technology/witness-simulation-software.html
14 https://www.simul8.com/
15 Note that this paper was published in 2012.
16 https://www.promtools.org/
17 https://fluxicon.com/disco/
18 https://www.celonis.com/
19 https://research-gi.mines-albi.fr/display/RIOSUITE/R-IOSuite+Home

REFERENCES

Abo-Hamad, W. (2017). Patient pathways discovery and analysis using process mining techniques: An emergency department case study. In International conference on health care systems engineering (pp. 209–219). Springer.
Agostinelli, S., Covino, F., D'Agnese, G., De Crea, C., Leotta, F., & Marrella, A. (2020). Supporting governance in healthcare through process mining: A case study. IEEE Access, 8, 186012–186025. Aguirre, J. A., Torres, A. C., & Pescoran, M. E. (2019). Evaluation of operational process variables in healthcare using process mining and data visualization techniques. Health, 7, 19. Alharbi, A., Bulpitt, A., & Johnson, O. (2017). Improving pattern detection in healthcare process mining using an interval-based event selection method. In International conference on business process management (pp. 88–105). Springer. Alharbi, A., Bulpitt, A., & Johnson, O. A. (2018). Towards unsupervised detection of process models in healthcare. In MIE (pp. 381–385). IOS Press. Alvarez, C., Rojas, E., Arias, M., Munoz-Gama, J., Sepúlveda, M., Herskovic, V., & Capurro, D. (2018). Discovering role interaction models in the emergency room using process mining. Journal of Biomedical Informatics, 78, 60–77. Amantea, I. A., Sulis, E., Boella, G., Marinello, R., Bianca, D., Brunetti, E., Bo, M., & Fernandez-Llatas, C. (2020). A process mining application for the analysis of hospital-at-home admissions. In Studies in Health Technology and Informatics (Vol. 270, pp. 522–526). Europe PCM Plus. Andrews, R., Suriadi, S., Wynn, M., ter Hofstede, A. H. M., & Rothwell, S. (2018). Improving patient flows at St. Andrew's War Memorial Hospital's Emergency Department through process mining. In Business process management cases (pp. 311–333). Springer. Andrews, R., Wynn, M. T., Vallmuur, K., Ter Hofstede, A. H.
M., & Bosley, E. (2020). A comparative process mining analysis of road trauma patient pathways. International Journal of Environmental Research and Public Health, 17(10), 3426. Anggrainingsih, R., Johannanda, B. O. P., & Cahyani, D. E. (2018). Business process evaluation of outpatient services using process mining. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 10(2–4), 125–128. Antonelli, D., & Bruno, G. (2015). Application of process mining and semantic structuring towards a lean healthcare network. In Working conference on virtual enterprises (pp. 497–508). Springer. Araghi, S. N., Fontaili, F., Lamine, E., Salatge, N., Lesbegueries, J., Pouyade, S. R., Tancerel, L., & Benaben, F. (2018). A conceptual framework to support discovering of patients' pathways as operational process charts. In 2018 IEEE/ACS 15th international conference on computer systems and applications (AICCSA) (pp. 1–6). IEEE. Araghi, S. N., Lamine, E., Salatge, N., & Benaben, F. (2020). Interpretation of patients' location data to support the application of process mining notations. In HEALTHINF (pp. 472–481). SciTePress. Araghi, S. N., Fontanili, F., Lamine, E., Salatge, N., Lesbegueries, J., Pouyade, S. R., & Benaben, F. (2019). Evaluating the process capability ratio of patients' pathways by the application of process mining, SPC and RTLS. In HEALTHINF (pp. 302–309). ScitePress. Araghi, S. N., Fontanili, F., Lamine, E., Tancerel, L., & Benaben, F. (2018). Applying process mining and RTLS for modeling, and analyzing patients' pathways. In HEALTHINF (pp. 540–547). SciTePress. Arias, M., Rojas, E., Aguirre, S., Cornejo, F., Munoz-Gama, J., Sepúlveda, M., & Capurro, D. (2020). Mapping the patient's journey in healthcare through process mining. International Journal of Environmental Research and Public Health, 17(18), 6586. Asare, E., Wang, L., & Fang, X. (2020). Conformance checking: Workflow of hospitals and workflow of open-source EMRs. IEEE Access, 8, 139546–139566. 
Augusto, V., Xie, X., Prodel, M., Jouaneton, B., & Lamarsalle, L. (2016). Evaluation of discovered clinical pathways using process mining and joint agent-based discrete-event simulation. In 2016 winter simulation conference (WSC) (pp. 2135–2146). IEEE. Back, C. O., Manataki, A., & Harrison, E. (2020). Mining patient flow patterns in a surgical ward. In Proceedings of the 13th international joint conference on biomedical engineering systems and technologies. SciTePress. Badakhshan, P., & Alibabaei, A. (2020). Using process mining for process analysis improvement in pre-hospital emergency. In ICT for an inclusive world (pp. 567–580). Springer. Baker, K., Dunwoodie, E., Jones, R. G., Newsham, A., Johnson, O., Price, C. P., Wolstenholme, J., Leal, J., McGinley, P., Twelves, C., & Hall, G. (2017). Process mining routinely collected electronic health records to define real-life clinical pathways during chemotherapy. International Journal of Medical Informatics, 103, 32–41. Barros, A., Hettel, T., & Flender, C. (2010). Process choreography modeling. In Handbook on business process management (Vol. 1, pp. 257–277). Springer. Batista, E., & Solanas, A. (2018). Process mining in healthcare: A systematic review. In 2018 9th international conference on information, intelligence, systems and applications (IISA) (pp. 1–6). IEEE. Benevento, E., Aloini, D., Squicciarini, N., Dulmin, R., & Mininno, V. (2019). Queue-based features for dynamic waiting time prediction in emergency department. Measuring Business Excellence, 23(4), 458–471. Benevento, E., Dixit, P. M., Sani, M. F., Aloini, D., & van der Aalst, W. M. P. (2019). Evaluating the effectiveness of interactive process discovery in healthcare: A case study. In International conference on business process management (pp. 508–519). Springer. Berti, A., van Zelst, S. J., & van der Aalst, W. (2019). Process mining for Python (PM4Py): Bridging the gap between process and data science. arXiv:1905.06169. 
Binder, M., Dorda, W., Duftschmid, G., Dunkl, R., Fröschl, K. A., Gall, W., Grossmann, W., Harmankaya, K., Hronsky, M., Rinderle-Ma, S., Rinner, C., & Weber, S. (2012). On analyzing process compliance in skin cancer treatment: An experience report from the evidence-based medical compliance cluster (EBMC 2). In International conference on advanced information systems engineering (pp. 398–413). Springer. Blei, D., Carin, L., & Dunson, D. (2010). Probabilistic topic models. IEEE Signal Processing Magazine, 27(6), 55–65. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022. Bouarfa, L., & Dankelman, J. (2012). Workflow mining and outlier detection from clinical activity logs. Journal of Biomedical Informatics, 45(6), 1185–1190. Burattin, A. (2016). PLG2: Multiperspective process randomization with online and offline simulations (pp. 1–6). BPM (Demos). Burattin, A., & Sperduti, A. (2010). PLG: A framework for the generation of business process models and their execution logs. In International conference on business process management (pp. 214–219). Springer. Caron, F., Vanthienen, J., De Weerdt, J., & Baesens, B. (2011). Beyond x-raying a care-flow: Adopting different focuses on care-flow mining. In Proc. First Int. Bus. Process Intell. Chall (pp. 1–11). Caron, F., Vanthienen, J., Vanhaecht, K., Van Limbergen, E., De Weerdt, J., & Baesens, B. (2014). Monitoring care processes in the gynecologic oncology department. Computers in Biology and Medicine, 44, 88–96.
Chang, H., Yu, J. Y., Yoon, S. Y., Hwang, S. Y., Yoon, H., Cha, W. C., Sim, M. S., Jo, I. J., & Kim, T. (2020). Impact of COVID-19 pandemic on the overall diagnostic and therapeutic process for patients of emergency department and those with acute cerebrovascular disease. Journal of Clinical Medicine, 9(12), 3842. Chen, B., Alrifai, W., Cheng, G., Jones, B., Novak, L., Lorenzi, N., France, D., Malin, B., & Chen, Y. (2021). Mining tasks and task characteristics from electronic health record audit logs with unsupervised machine learning. Journal of the American Medical Informatics Association, 28(6), 1168–1177. https://doi.org/10.1093/jamia/ocaa338 Chen, S., Yang, S., Zhou, M., Burd, R., & Marsic, I. (2017). Process-oriented iterative multiple alignment for medical process mining. In 2017 IEEE international conference on data mining workshops (ICDMW) (pp. 438–445). IEEE. Chinosi, M., & Trombetta, A. (2012). BPMN: An introduction to the standard. Computer Standards & Interfaces, 34(1), 124–134. Chiudinelli, L., Dagliati, A., Tibollo, V., Albasini, S., Geifman, N., Peek, N., Holmes, J. H., Corsi, F., Bellazzi, R., & Sacchi, L. (2020). Mining post-surgical care processes in breast cancer patients. Artificial Intelligence in Medicine, 105, 101855. Cho, M., Song, M., Park, J., Yeom, S.-R., Wang, I.-J., & Choi, B.-K. (2020). Process mining-supported emergency room process performance indicators. International Journal of Environmental Research and Public Health, 17(17), 6290. Cho, M., Song, M., & Sooyoung Yoo, A. (2014). Systematic methodology for outpatient process analysis based on process mining. In AsiaPacific conference on business process management (pp. 31–42). Springer. Conca, T., Saint-Pierre, C., Herskovic, V., Sepúlveda, M., Capurro, D., Prieto, F., & Fernandez-Llatas, C. (2018). Multidisciplinary collaboration in the treatment of patients with type 2 diabetes in primary care: Analysis using process mining. Journal of Medical Internet Research, 20(4), e8884. 
Dagliati, A., Sacchi, L., Cerra, C., Leporati, P., de Cata, P., Chiovato, L., Holmes, J. H., & Bellazzi, R. (2014). Temporal data mining and process mining techniques to identify cardiovascular risk-associated clinical pathways in type 2 diabetes patients. In IEEE-EMBS international conference on biomedical and health informatics (BHI) (pp. 240–243). IEEE. Dallagassa, M. R., dos Santos Garcia, C., Scalabrin, E. E., Ioshii, S. O., & Carvalho, D. R. (2021). Opportunities and challenges for applying process mining in healthcare: A systematic mapping study. Journal of Ambient Intelligence and Humanized Computing, 1–18. de Medeiros, A. K. A., Weijters, A. J. M. M., & van der Aalst, W. M. P. (2007). Genetic process mining: An experimental evaluation. Data Mining and Knowledge Discovery, 14(2), 245–304. De Oliveira, H., Prodel, M., Lamarsalle, L., Inada-Kim, M., Ajayi, K., Wilkins, J., Sekelj, S., Beecroft, S., Snow, S., Slater, R., & Orlowski, A. (2020). “Bow-tie” optimal pathway discovery analysis of sepsis hospital admissions using the hospital episode statistics database in England. JAMIA Open, 3(3), 439–448. de Toledo, P., Joppien, C., Sesmero, M. P., & Drews, P. (2019). Mining disease courses across organizations: A methodology based on process mining of diagnosis events datasets. In 2019 41st annual international conference of the IEEE engineering in medicine and biology society (EMBC) (pp. 354–357). IEEE. de Vries, G.-J., Neira, R. A. Q., Geleijnse, G., Dixit, P., & Mazza, B. F. (2017). Towards process mining of EMR data. In International joint conference on biomedical engineering systems and technologies (BIOSTEC). SciTePress. De Weerdt, J., Caron, F., Vanthienen, J., & Baesens, B. (2012). Getting a grasp on clinical pathway data: An approach based on process mining. In Pacific-Asia conference on knowledge discovery and data mining (pp. 22–35). Springer. Delias, P., Doumpos, M., Grigoroudis, E., Manolitzas, P., & Matsatsinis, N. (2015). 
Supporting healthcare management decisions via robust clustering of event logs. Knowledge-Based Systems, 84, 203–213. Delias, P., Manolitzas, P., Grigoroudis, E., & Matsatsinis, N. (2014). Applying process mining to the emergency department. In Encyclopedia of business analytics and optimization (pp. 168–178). IGI Global. Detro, S. P., Santos, E. A. P., Panetto, H., de Freitas Rocha Loures, E., & Lezoche, M. (2017). Managing business process variability through process mining and semantic reasoning: An application in healthcare. In Working conference on virtual enterprises (pp. 333–340). Springer. Dewandono, R. D., Fauzan, R., Sarno, R. & Sidiq, M. (2013). Ontology and process mining for diabetic medical treatment sequencing. In Proceedings of the 7th international conference on information & communication technology and systems (ICTS) (pp. 171–178). Dixit, P. M., Verbeek, H. M. W., Buijs, J. C. A. M., & van der Aalst, W. M. P. (2018). Interactive data-driven process model construction. In International conference on conceptual modeling (pp. 251–265). Springer. Duma, D., & Aringhieri, R. (2017). Mining the patient flow through an emergency department to deal with overcrowding. In International conference on health care systems engineering (pp. 49–59). Springer. Duma, D., & Aringhieri, R. (2020). An ad hoc process mining approach to discover patient paths of an emergency department. Flexible Services and Manufacturing Journal, 32(1), 6–34. Dunkl, R., Fröschl, K. A., Grossmann, W., & Rinderle-Ma, S. (2011). Assessing medical treatment compliance based on formal process modeling. In Symposium of the Austrian HCI and usability engineering group (pp. 533–546). Springer. Durojaiye, A. B., Levin, S., Toerper, M., Kharrazi, H., Lehmann, H. P., & Gurses, A. P. (2019). Evaluation of multidisciplinary collaboration in pediatric trauma care using EHR data. Journal of the American Medical Informatics Association, 26(6), 506–515. 
Durojaiye, A. B., McGeorge, N. M., Puett, L. L., Stewart, D., Fackler, J. C., Hoonakker, P. L. T., Lehmann, H. P., & Gurses, A. P. (2018). Mapping the flow of pediatric trauma patients using process mining. Applied Clinical Informatics, 9(03), 654–666. Erdogan, T. G., & Tarhan, A. (2018). A goal-driven evaluation method based on process mining for healthcare processes. Applied Sciences, 8(6), 894. Erdogan, T. G., & Tarhan, A. (2018). Systematic mapping of process mining studies in healthcare. IEEE Access, 6, 24543–24567. Farid, N. F., De Kamps, M., & Johnson, O. A. (2019). Process mining in frail elderly care: A literature review. In Proceedings of the 12th international joint conference on biomedical engineering systems and technologies—Volume 5: HEALTHINF (Vol. 5, pp. 332–339). SciTePress, Science and Technology Publications. Fei, H., & Meskens, N. (2010). Discovering patient care process models from event logs. In 8th international conference of modeling. Citeseer. Fernandez-Llatas, C. (2021a). Applying interactive process mining paradigm in healthcare domain. In Interactive process mining in healthcare (pp. 103–117). Springer. Fernandez-Llatas, C. (2021b). Bringing interactive process mining to health professionals: Interactive data rodeos. In Interactive process mining in healthcare (pp. 119–140). Springer. Fernandez-Llatas, C., Benedi, J. M., Gama, J. M., Sepulveda, M., Rojas, E., Vera, S., & Traver, V. (2021). Interactive process mining in surgery with real time location systems: Interactive trace correction. In Interactive process mining in healthcare (pp. 181–202).
Springer. Fernandez-Llatas, C., Benedi, J.-M., García-Gómez, J. M., & Traver, V. (2013). Process mining for individualized behavior modeling using wireless tracking in nursing homes. Sensors, 13(11), 15434–15451. Fernandez-Llatas, C., Garcia-Gomez, J. M., Vicente, J., Naranjo, J. C., Robles, M., Benedi, J. M., & Traver, V. (2011). Behaviour patterns detection for persuasive design in nursing homes to help dementia patients. In 2011 annual international conference of the IEEE engineering in medicine and biology society (pp. 6413–6417). IEEE. Fernandez-Llatas, C., Lizondo, A., Monton, E., Benedi, J.-M., & Traver, V. (2015). Process mining methodology for health process tracking using real-time indoor location systems. Sensors, 15(12), 29821–29840. Fernandez-Llatas, C., Meneu, T., Benedi, J. M., & Traver, V. (2010). Activity-based process mining for clinical pathways computer aided design. In 2010 annual international conference of the IEEE engineering in medicine and biology (pp. 6178–6181). IEEE. Fernandez-Llatas, C., Meneu, T., Benedí, J.-M., & Traver, V. (2011). Continuous clinical pathways evaluation by using automatic learning algorithms. In HEALTHINF (pp. 228–234). SciTePress. Fernandez-Llatas, C., Sacchi, L., Benedi, J. M., Dagliati, A., Traver, V., & Bellazzi, R. (2014). Temporal abstractions to enrich activity-based process mining corpus with clinical time series. In IEEE-EMBS international conference on biomedical and health informatics (BHI) (pp. 785–788). IEEE. Folino, F., Greco, G., Guzzo, A., & Pontieri, L. (2011). Mining usage scenarios in business processes: Outlier-aware discovery and run-time prediction. Data & Knowledge Engineering, 70(12), 1005–1029. https://doi.org/10.1016/j.datak.2011.07.002 Forsberg, D., Rosipko, B., & Sunshine, J. L. (2016). Analyzing PACS usage patterns by means of process mining: Steps toward a more detailed workflow analysis in radiology. Journal of Digital Imaging, 29(1), 47–58.
Franck, T., Bercelli, P., Aloui, S., & Augusto, V. (2020). A generic framework to analyze and improve patient pathways within a healthcare network using process mining and discrete-event simulation. In 2020 winter simulation conference (WSC) (pp. 968–979). IEEE. Furniss, S. K., Burton, M. M., Grando, A., Larson, D. W., & Kaufman, D. R. (2016). Integrating process mining and cognitive analysis to study EHR workflow. In AMIA annual symposium proceedings (Vol. 2016, p. 580). American Medical Informatics Association. Ganesha, K., Dhanush, S., & Swapnil Raj, S. M. (2017). An approach to fuzzy process mining to reduce patient waiting time in a hospital. In 2017 international conference on innovations in information, embedded and communication systems (ICIIECS) (pp. 1–6). IEEE. García, A. O., Pérez-Alfonso, D., & Armenteros, O. U. L. (2015). Analysis of hospital processes with process mining techniques. In MedInfo (pp. 310–314). IOS Press. doi:10.3233/978-1-61499-564-7-310 Garg, N., & Agarwal, S. (2016). Process mining for clinical workflows. In Proceedings of the international conference on advances in information communication technology & computing (pp. 1–5). ACM. doi:10.1145/2979779.2979784 Gatta, R., Lenkowicz, J., Vallati, M., Rojas, E., Damiani, A., Sacchi, L., de Bari, B., Dagliati, A., Fernandez-Llatas, C., Montesi, M., Marchetti, A., Castellano, M., & Valentini, V. (2017). pMineR: An innovative R library for performing process mining in medicine. In Conference on artificial intelligence in medicine in Europe (pp. 351–355). Springer. Gattnar, E., Ekinci, O., & Detschew, V. (2011). A novel generic clinical reference process model for event-based process times measurement. In International conference on business information systems (pp. 65–76). Springer. Gattnar, E., Ekinci, O., Detschew, V., & Capel-Tunon, M. (2011). Event-based workflow analysis in healthcare. In IVM/FTMDD/RTSOABIS/MSVVEIS (pp. 61–70). SciTePress. Ghasemi, M., & Amyot, D. (2016). 
Process mining in healthcare: A systematised literature review. International Journal of Electronic Healthcare, 9(1), 60–88. Gonzalez-García, J., Tellería-Orriols, C., Estupiñan-Romero, F., & Bernal-Delgado, E. (2020). Construction of empirical care pathways process models from multiple real-world datasets. IEEE Journal of Biomedical and Health Informatics, 24(9), 2671–2680. Grando, A., Groat, D., Furniss, S. K., Nowak, J., Gaines, R., Kaufman, D. R., Poterack, K. A., Miksch, T., & Helmers, R. A. (2017). Using process mining techniques to study workflows in a pre-operative setting. In AMIA annual symposium proceedings (Vol. 2017, p. 790). American Medical Informatics Association. Grando, M. A., Schonenberg, M. H., & van der Aalst, W. M. P. (2011). Semantic process mining for the verification of medical recommendations. In HEALTHINF (pp. 5–16). SciTePress. Grüger, J., Bergmann, R., Kazik, Y. & Kuhn, M. (2020). Process mining for case acquisition in oncology: A systematic literature review. In Trabold, D., Welke, P., Piatkowski, N. (Eds.), CEUR Workshop Proceedings, (Vol. 2738, pp. 162-173). CEUR-WS. Günther, C. W., & van der Aalst, W. M. P. (2006). Mining activity clusters from low-level event logs. Beta Research School for Operations Management and Logistics. Günther, C. W., & Van Der Aalst, W. M. P. (2007). Fuzzy mining–adaptive process simplification based on multi-perspective metrics. In International conference on business process management (pp. 328–343). Springer. Han, B., Jiang, L., & Cai, H. (2011). Abnormal process instances identification method in healthcare environment.
In 2011 IEEE 10th international conference on trust, security and privacy in computing and communications (pp. 1387–1392). IEEE. Helm, E., Lin, A. M., Baumgartner, D., Lin, A. C., & Küng, J. (2020). Towards the use of standardized terms in clinical case studies for process mining in healthcare. International Journal of Environmental Research and Public Health, 17(4), 1348. Hendricks, R. M. (2019). Process mining of incoming patients with sepsis. Online Journal of Public Health Informatics, 11(2), e14. doi:10.5210/ojphi.v11i2.10151 Homayounfar, P. (2012). Process mining challenges in hospital information systems. In 2012 federated conference on computer science and information systems (FedCSIS) (pp. 1135–1140). IEEE. Huang, Z., Dong, W., Ji, L., Gan, C., Xudong, L., & Duan, H. (2014). Discovery of clinical pathway patterns from event logs using probabilistic topic models. Journal of Biomedical Informatics, 47, 39–57. Huang, Z., Xudong, L., & Duan, H. (2012). On mining clinical pathway patterns from medical behaviors. Artificial Intelligence in Medicine, 56(1), 35–50. Huang, Z., Xudong, L., Duan, H., & Fan, W. (2013). Summarizing clinical pathways from event logs. Journal of Biomedical Informatics, 46(1), 111–127. Ibanez-Sanchez, G., Celda, M. A., Mandingorra, J., & Fernandez-Llatas, C. (2021). Interactive process mining in emergencies. In Interactive process mining in healthcare (pp. 165–180). Springer. Ibanez-Sanchez, G., Fernandez-Llatas, C., Martinez-Millana, A., Celda, A., Mandingorra, J., Aparici-Tortajada, L., Valero-Ramon, Z., Munoz-Gama, J., Sepúlveda, M., Rojas, E., Galvez, V., Capurro, D., & Traver, V. (2019). Toward value-based healthcare through interactive process mining in emergency rooms: The stroke case. International Journal of Environmental Research and Public Health, 16(10), 1783. IEEE. (2016). IEEE Standard for eXtensible Event Stream (XES) for achieving interoperability in event logs and event streams (IEEE Std 1849-2016) (pp. 1–50). IEEE.
https://doi.org/10.1109/IEEESTD.2016.7740858 Jagadeesh Chandra Bose, R. P. & van der Aalst, W. M. P. 1118 Jagadeesh Chandra Bose, R. P., & Van der Aalst, W. M. P. (2009). Abstractions in process mining: A taxonomy of patterns. In International conference on business process management (pp. 159–175). Springer. Jaisook, P., & Premchaiswadi, W. (2015). Time performance analysis of medical treatment processes by using disco. In 2015 13th international conference on ICT and knowledge engineering (ICT & Knowledge Engineering 2015) (pp. 110–115). IEEE. Jan Martijn, E. M., der Werf, V., van Dongen, B. F., Hurkens, C. A. J., & Serebrenik, A. (2008). Process discovery using integer linear programming. In International conference on applications and theory of Petri nets (pp. 368–387). Springer. Jangi, M., Moghbeli, F., Ghaffari, M., & Vahedinemani, A. (2019). Hospital management based on semantic process mining: A systematic review. Frontiers in Health Informatics, 8(1), 4. Janssenswillen, G., Depaire, B., Swennen, M., Jans, M., & Vanhoof, K. (2019). bupaR: Enabling reproducible business process analysis. Knowledge-Based Systems, 163, 927–930. Jaroenphol, E., Porouhan, P., & Premchaiswadi, W. (2015). Analysis of the patients' treatment process in a hospital in Thailand using fuzzy mining algorithms. In 2015 13th international conference on ICT and knowledge engineering (ICT & Knowledge Engineering 2015) (pp. 131–136). IEEE. Jaturogpattana, T., Arpasat, P., Kungcharoen, K., Intarasema, S., & Premchaiswadi, W. (2017). Conformance analysis of outpatient data using process mining technique. In 2017 15th international conference on ICT and knowledge engineering (ICT&KE) (pp. 1–6). IEEE. Johnson, A. E. W., Pollard, T. J., Lu Shen, H., Li-Wei, L., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., & Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. Scientific Data, 3(1), 1–9. Johnson, O. A., Dhafari, T. B., Kurniati, A., Fox, F., & Rojas, E. 
(2018). The clearpath method for care pathway process mining and simulation. In International conference on business process management (pp. 239–250). Springer. Johnson, O. A., Hall, P. S., & Hulme, C. (2016). NETIMIS: Dynamic simulation of health economics outcomes using big data. PharmacoEconomics, 34(2), 107–114. Kamel Boulos, M. N., & Berry, G. (2012). Real-time locating systems (RTLS) in healthcare: A condensed primer. International Journal of Health Geographics, 11(1), 1–8. Kaymak, U., Mans, R., Van de Steeg, T., & Dierks, M. (2012). On process mining in health care. In 2012 IEEE international conference on systems, man, and cybernetics (SMC) (pp. 1859–1864). IEEE. Kelleher, D. C., Jagadeesh Chandra Bose, R. P., Waterhouse, L. J., Carter, E. A., & Burd, R. S. (2014). Effect of a checklist on advanced trauma life support workflow deviations during trauma resuscitations without pre-arrival notification. Journal of the American College of Surgeons, 218(3), 459–466. Kempa-Liehr, A. W., Lin, C. Y.-C., Britten, R., Armstrong, D., Wallace, J., Mordaunt, D., & O'Sullivan, M. (2020). Healthcare pathway discovery and probabilistic machine learning. International Journal of Medical Informatics, 137, 104087. Kim, E., Kim, S., Song, M., Kim, S., Yoo, D., Hwang, H., & Yoo, S. (2013). Discovery of outpatient care process of a tertiary university hospital using process mining. Healthcare Informatics Research, 19(1), 42. Kirchner, K., Herzberg, N., Solti, A. R., & Weske, M. (2012). Embedding conformance checking in a process intelligence system in hospital environments.
In Process support and knowledge representation in health care (pp. 126–139). Springer. Kirchner, K., & Markovic, P. (2018). Unveiling hidden patterns in flexible medical treatment processes–a process mining case study. In International conference on decision support system technology (pp. 169–180). Springer. Kitchenham, B. (2004). Procedures for performing systematic reviews. Keele University. Kovalchuk, S. V., Funkner, A. A., Metsker, O. G., & Yakovlev, A. N. (2018). Simulation of patient flow in multiple healthcare units using process and data mining techniques for model identification. Journal of Biomedical Informatics, 82, 128–142. Kukreja, G., & Batra, S. (2017). Analogize process mining techniques in healthcare: Sepsis case study. In 2017 4th international conference on signal processing, computing and control (ISPCC) (pp. 482–487). IEEE. Kurniati, A., Hall, G., Hogg, D., & Johnson, O. (2018b). Process mining to explore variation in chemotherapy pathways for breast cancer patients. British Journal of Cancer, 119, 16. Kurniati, A. P., Hall, G., Hogg, D., & Johnson, O. (2018a). Process mining in oncology using the MIMIC-III dataset. Journal of Physics: Conference Series, 971, 012008. Kurniati, A. P., Johnson, O., Hogg, D., & Hall, G. (2016). Process mining in oncology: A literature review. In 2016 6th international conference on information communication and management (ICICM) (pp. 291–297). IEEE. Kurniati, A. P., McInerney, C., Zucker, K., Hall, G., Hogg, D., & Johnson, O. (2020). Using a multi-level process comparison for process change analysis in cancer pathways. International Journal of Environmental Research and Public Health, 17(19), 7210. Kurniati, A. P., Rojas, E., Hogg, D., Hall, G., & Johnson, O. A. (2019). The assessment of data quality issues for process mining in healthcare using medical information mart for intensive care III, a freely available e-health record database. Health Informatics Journal, 25(4), 1878–1893.
Kusuma, G., Sykes, S., McInerney, C., & Johnson, O. (2020). Process mining of disease trajectories: A feasibility study. In Proceedings of the 13th international joint conference on biomedical engineering systems and technologies (Vol. 5, pp. 705–712). Science and Technology Publications. Kusuma, G. P., Hall, M., Gale, C. P., & Johnson, O. A. (2018). Process mining in cardiology: A literature review. International Journal of Bioscience, Biochemistry and Bioinformatics, 8, 226–236. Lakshmanan, G. T., Rozsnyai, S., & Wang, F. (2013). Investigating clinical care pathways correlated with outcomes. In Business process management (pp. 323–338). Springer. Lamine, E., Fontanili, F., Di Mascolo, M., & Pingaud, H. (2015). Improving the management of an emergency call service by combining process mining and discrete event simulation approaches. In Working conference on virtual enterprises (pp. 535–546). Springer. Lanzola, G., Parimbelli, E., Micieli, G., Cavallini, A., & Quaglini, S. (2014). Data quality and completeness in a web stroke registry as the basis for data and process mining. Journal of Healthcare Engineering, 5(2), 163–184. Law, A. M., David Kelton, W., & Kelton, W. D. (2000). Simulation modeling and analysis (Vol. 3). McGraw-Hill New York. Lee, Y. H., & Rismanchian, F. (2018). Optimizing hospital facility layout planning through process mining of clinical pathways. Annals of Optimization Theory and Practice, 1(1), 1–9. Leemans, S. J. J., Fahland, D., & van der Aalst, W. M. P. (2013). Discovering block-structured process models from event logs—A constructive approach. In International conference on applications and theory of Petri nets and concurrency (pp. 311–329). Springer. Leonardi, G., Montani, S., Portinale, L., Quaglini, S., & Striani, M. (2019). Discovering knowledge embedded in bio-medical databases: Experiences in food characterization and in medical process mining. In Innovations in big data mining and embedded knowledge (pp. 117–136). Springer.
Lira, R., Salas-Morales, J., Leiva, L., Fuentes, R., Delfino, A., Nazal, C. H., Sepúlveda, M., Arias, M., Herskovic, V., & Munoz-Gama, J. (2019). Process-oriented feedback through process mining for surgical procedures in medical training: The ultrasound-guided central venous catheter placement case. International Journal of Environmental Research and Public Health, 16(11), 1877. Lismont, J., Janssens, A.-S., Odnoletkova, I., vanden Broucke, S., Caron, F., & Vanthienen, J. (2016). A guide for the application of analytics on healthcare processes: A dynamic view on patient pathways. Computers in Biology and Medicine, 77, 125–134. Liu, C., Ge, Y., Xiong, H., Xiao, K., Geng, W., & Perkins, M. (2014). Proactive workflow modeling by stochastic processes with application to healthcare operation and management. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1593–1602). ACM. Lu, X., Tabatabaei, S. A., Hoogendoorn, M., & Reijers, H. A. (2019). Trace clustering on very large event data in healthcare using frequent sequence patterns. In International conference on business process management (pp. 198–215). Springer. Mannhardt, F., & Blinde, D. (2017). Analyzing the trajectories of patients with sepsis using process mining. In RADAR+ EMISA@ CAiSE (pp. 72–80). CEUR-WS. Mannhardt, F., De Leoni, M., & Reijers, H. A. (2015). The multi-perspective process explorer. BPM (Demos), 1418, 130–134. Mans, R., Reijers, H., van Genuchten, M., & Wismeijer, D. (2012). Mining processes in dentistry. In Proceedings of the 2nd ACM SIGHIT international health informatics symposium (pp. 379–388). ACM. Mans, R., Reijers, H., Wismeijer, D., & Van Genuchten, M. (2013). A process-oriented methodology for evaluating the impact of IT: A proposal and an application in healthcare. Information Systems, 38(8), 1097–1115.
Mans, R. S., Van der Aalst, W. M. P., & Vanwersch, R. J. B. (2015). Process mining in healthcare: Evaluating and exploiting operational healthcare processes. Springer International Publishing. https://doi.org/10.1007/978-3-319-16071-9_2 Mans, R. S., Van der Aalst, W. M. P., Vanwersch, R. J. B., & Moleman, A. J. (2012). Process mining in healthcare: Data challenges when answering frequently posed questions. In Process support and knowledge representation in health care (pp. 140–153). Springer. Marazza, F., Bukhsh, F. A., Geerdink, J., Vijlbrief, O., Pathak, S., van Keulen, M., & Seifert, C. (2020). Automatic process comparison for subpopulations: Application in cancer care. International Journal of Environmental Research and Public Health, 17(16), 5707. Martin, N. (2018). Using indoor location system data to enhance the quality of healthcare event logs: Opportunities and challenges. In International conference on business process management (pp. 226–238). Springer. Martinez-Millana, A., Lizondo, A., Gatta, R., Vera, S., Salcedo, V. T., & Fernandez-Llatas, C. (2019). Process mining dashboard in operating rooms: Analysis of staff expectations with analytic hierarchy process. International Journal of Environmental Research and Public Health, 16(2), 199. Martinez-Millana, A., Merino-Torres, J.-F., Valdivieso, B., & Fernandez-Llatas, C. (2021). Interactive process mining in type 2 diabetes mellitus. In Interactive process mining in healthcare (pp. 203–215). Springer. McGregor, C., Catley, C., & James, A. (2011). A process mining driven framework for clinical guideline improvement in critical care. In Proceedings of the learning from medical data streams workshop. Bled, Slovenia (July 2011), (Vol. 765, pp.
34–45). CEUR. McGregor, C., Percival, J., Curry, J., Foster, D., Anstey, E., & Churchill, D. (2008). A structured approach to requirements gathering creation using PaJMa models. In 2008 30th annual international conference of the IEEE engineering in medicine and biology society (pp. 1506–1509). IEEE. Meneu, T., Traver, V., Guillén, S., Valdivieso, B., Benedi, J., & Fernandez-Llatas, C. (2013). Heart cycle: Facilitating the deployment of advanced care processes. In 2013 35th annual international conference of the IEEE engineering in medicine and biology society (EMBC) (pp. 6996–6999). IEEE. Mertens, S., Gailly, F., & Poels, G. (2018). Discovering health-care processes using DeciClareMiner. Health Systems, 7(3), 195–211. Metsker, O., Bolgova, E., Yakovlev, A., Funkner, A., & Kovalchuk, S. (2017). Pattern-based mining in electronic health records for complex clinical process analysis. Procedia Computer Science, 119, 197–206. Miclo, R., Fontanili, F., Marquès, G., Bomert, P., & Lauras, M. (2015). RTLS-based process mining: Towards an automatic process diagnosis in healthcare. In 2015 IEEE international conference on automation science and engineering (CASE) (pp. 1397–1402). IEEE. Montani, S., Leonardi, G., Quaglini, S., Cavallini, A., & Micieli, G. (2013). Mining and retrieving medical processes to assess the quality of care. In International conference on case-based reasoning (pp. 233–240). Springer. Montani, S., Leonardi, G., Quaglini, S., Cavallini, A., & Micieli, G. (2014). Improving structural medical process comparison by exploiting domain knowledge and mined information. Artificial Intelligence in Medicine, 62(1), 33–45. Montani, S., Striani, M., Quaglini, S., Cavallini, A., & Leonardi, G. (2017). Knowledge-based trace abstraction for semantic process mining. In Conference on artificial intelligence in medicine in Europe (pp. 267–271). Springer. Naeem, M. R., Naeem, H., Aamir, M., Ali, W., & Abro, W. A. (2017). 
A multi-level process mining framework for correlating and clustering of biomedical activities using event logs. International Journal of Advanced Computer Science and Applications, 8(3), 393–401. Najjar, A., Reinharz, D., Girouard, C., & Gagné, C. (2018). A two-step approach for mining patient treatment pathways in administrative healthcare databases. Artificial Intelligence in Medicine, 87, 34–48. Neira, R. A. Q., Hompes, B. F. A., de Vries, J. G.-J., Mazza, B. F., Simões de Almeida, S. L., Stretton, E., Buijs, J. C. A. M., & Hamacher, S. (2019). Analysis and optimization of a sepsis clinical pathway using process mining. In International conference on business process management (pp. 459–470). Springer. Neumuth, T., Liebmann, P., Wiedemann, P., & Meixensberger, J. (2012). Surgical workflow management schemata for cataract procedures. Methods of Information in Medicine, 51(05), 371–382. Neumuth, T., Jannin, P., Schlomberg, J., Meixensberger, J., Wiedemann, P., & Burgert, O. (2011). Analysis of surgical intervention populations using generic surgical process models. International Journal of Computer Assisted Radiology and Surgery, 6(1), 59–71. Nigam, A., & Caswell, N. S. (2003). Business artifacts: An approach to operational specification. IBM Systems Journal, 42(3), 428–445. Partington, A., Wynn, M., Suriadi, S., Ouyang, C., & Karnon, J. (2015). Process mining for clinical processes: A comparative analysis of four Australian hospitals. ACM Transactions on Management Information Systems (TMIS), 5(4), 1–18. Pebesma, J., Martinez-Millana, A., Sacchi, L., Fernandez-Llatas, C., de Cata, P., Chiovato, L., Bellazzi, R., & Traver, V. (2019). Clustering cardiovascular risk trajectories of patients with type 2 diabetes using process mining. In 2019 41st annual international conference of the IEEE engineering in medicine and biology society (EMBC) (pp. 341–344). IEEE. Perer, A., Wang, F., & Jianying, H. (2015). 
Mining and exploring care pathways from electronic medical records with visual analytics. Journal of Biomedical Informatics, 56, 369–378. Perimal-Lewis, L., De Vries, D., & Thompson, C. H. (2014). Health intelligence: Discovering the process model using process mining by constructing start-to-end patient journeys. In Proceedings of the seventh Australasian workshop on health informatics and knowledge management (Vol. 153, pp. 59–67). Australian Computer Society, Inc. Perimal-Lewis, L., Qin, S., Thompson, C., & Hakendorf, P. (2012). Gaining insight from patient journey data using a process-oriented analysis approach. In Proceedings of the fifth Australasian workshop on health informatics and knowledge management (Vol. 129, pp. 59–66). Australian Computer Society Inc. Perimal-Lewis, L., Teubner, D., Hakendorf, P., & Horwood, C. (2016). Application of process mining to assess the data quality of routinely collected time-based performance data sourced from electronic health records by validating process conformance. Health Informatics Journal, 22(4), 1017–1029. Pesic, M., Schonenberg, H., & Van der Aalst, W. M. P. (2007). Declare: Full support for loosely-structured processes. In 11th IEEE international enterprise distributed object computing conference (EDOC 2007) (p. 287). IEEE. Peterson, J. L. (1977). Petri nets. ACM Computing Surveys (CSUR), 9(3), 223–252. Phan, R., Augusto, V., Martin, D., & Sarazin, M. (2019). Clinical pathway analysis using process mining and discrete-event simulation: An application to incisional hernia. In 2019 winter simulation conference (WSC) (pp.
1172–1183). IEEE. Piccialli, F., Di Somma, V., Giampaolo, F., Cuomo, S., & Fortino, G. (2021). A survey on deep learning in medicine: Why, how and when? Information Fusion, 66, 111–137. https://doi.org/10.1016/j.inffus.2020.09.006 Pika, A., Wynn, M. T., Budiono, S., ter Hofstede, A. H. M., van der Aalst, W. M. P., & Reijers, H. A. (2019). Towards privacy-preserving process mining in healthcare. In International conference on business process management (pp. 483–495). Springer. Placidi, L., Boldrini, L., Lenkowicz, J., Manfrida, S., Gatta, R., Damiani, A., Chiesa, S., Ciellini, F., & Valentini, V. (2021). Process mining to optimize palliative patient flow in a high-volume radiotherapy department. Technical Innovations & Patient Support in Radiation Oncology, 17, 32–39. Poelmans, J., Dedene, G., Verheyden, G., Van der Mussele, H., Viaene, S., & Peters, E. (2010). Combining business process and data discovery techniques for analyzing and improving integrated care pathways. In Industrial conference on data mining (pp. 505–517). Springer. Prodel, M., Augusto, V., Xie, X., Jouaneton, B., & Lamarsalle, L. (2015). Discovery of patient pathways from a national hospital database using process mining and integer linear programming. In 2015 IEEE international conference on automation science and engineering (CASE) (pp. 1409–1414). IEEE. Prokofyeva, E. S., & Zaytsev, R. D. (2020). Clinical pathways analysis of patients in medical institutions based on hard and fuzzy clustering methods. Data Analysis and Intelligence Systems, 14(1), 19–31. Rebuge, A., & Ferreira, D. R. (2012). Business process analysis in healthcare environments: A methodology based on process mining. Information Systems, 37(2), 99–116. Rinner, C., Helm, E., Dunkl, R., Kittler, H., & Rinderle-Ma, S. (2018). An application of process mining in the context of melanoma surveillance using time boxing. In International conference on business process management (pp. 175–186). Springer. Riz, G., Santos, E. A.
P., de Freitas, E., & Loures, R. (2016). Process mining to knowledge discovery in healthcare processes. In Transdisciplinary engineering: Crossing boundaries (pp. 1019–1028). IOS Press. Rojas, E., Arias, M., & Sepúlveda, M. (2015). Clinical processes and its data, what can we do with them. In Proceedings of the international conference on health informatics (HEALTHINF 2015), Lisbon, Portugal (pp. 12–15). SciTePress. Rojas, E., & Capurro, D. (2018). Characterization of drug use patterns using process mining and temporal abstraction digital phenotyping. In International conference on business process management (pp. 187–198). Springer. Rojas, E., Cifuentes, A., Burattin, A., Munoz-Gama, J., Sepúlveda, M., & Capurro, D. (2018). Analysis of emergency room episodes duration through process mining. In International conference on business process management (pp. 251–263). Springer. Rojas, E., Fernandez-Llatas, C., Traver, V., Munoz-Gama, J., Sepúlveda, M., Herskovic, V., & Capurro, D. (2017). PALIA-ER: Bringing question-driven process mining closer to the emergency room. In BPM (Demos). Rojas, E., Munoz-Gama, J., Sepúlveda, M., & Capurro, D. (2016). Process mining in healthcare: A literature review. Journal of Biomedical Informatics, 61, 224–236. Rojas, E., Sepúlveda, M., Munoz-Gama, J., Capurro, D., Traver, V., & Fernandez-Llatas, C. (2017). Question-driven methodology for analyzing emergency room processes using process mining. Applied Sciences, 7(3), 302. Ronny, S., Mans, M. H. S., Song, M., Van der Aalst, W. M. P., & Bakker, P. J. M. (2015). Process Mining in Healthcare. In International conference on health informatics (HEALTHINF'08) (pp. 118–125). SciTePress. Rovani, M., Maggi, F. M., De Leoni, M., & Van Der Aalst, W. M. P. (2015). Declarative process mining in healthcare. Expert Systems with Applications, 42(23), 9236–9251. Sato, D. M. V., Mantovani, L. K., Safanelli, J., Guesser, V., Nagel, V., Moro, C. H. C., Cabral, N. L., Scalabrin, E. 
E., Moro, C., & Santos, E. A. P. (2020). Ischemic stroke: Process perspective, clinical and profile characteristics, and external factors. Journal of Biomedical Informatics, 111, 103582. Stefanini, A., Aloini, D., Benevento, E., Dulmin, R., & Mininno, V. (2018). Performance analysis in emergency departments: A data-driven approach. Measuring Business Excellence, 22(2), 130–145. https://doi.org/10.1108/MBE-07-2017-0040 Stefanini, A., Aloini, D., Dulmin, R., & Mininno, V. (2016). Linking diagnostic-related groups (DRGs) to their processes by process mining. HEALTHINF, 5, 438–443. Suriadi, S., Mans, R. S., Wynn, M. T., Partington, A., & Karnon, J. (2014). Measuring patient flow variations: A cross-organisational process mining approach. In Asia-Pacific conference on business process management (pp. 43–58). Springer. Tamburis, O., & Esposito, C. (2020). Process mining as support to simulation modeling: A hospital-based case study. Simulation Modelling Practice and Theory, 104, 102149. Tax, N., Sidorova, N., Haakma, R., & van der Aalst, W. M. P. (2016). Mining local process models. Journal of Innovation in Digital Ecosystems, 3(2), 183–196. Tax, N., Sidorova, N., van der Aalst, W. M. P., & Haakma, R. (2016). Heuristic approaches for generating local process models through log projections. In 2016 IEEE symposium series on computational intelligence (SSCI) (pp. 1–8). IEEE. Tibeme, B., Shahriar, H., & Zhang, C. (2018). Process mining algorithms for clinical workflow analysis. In SoutheastCon 2018 (pp. 1–6). IEEE. Tóth, K., Machalik, K., Fogarassy, G., & Vathy-Fogarassy, A. (2017). Applicability of process mining in the exploration of healthcare sequences. In 2017 IEEE 30th Neumann colloquium (NC) (pp. 000151–000156). IEEE.
Valero-Ramon, Z., Fernandez-Llatas, C., Martinez-Millana, A., & Traver, V. (2019). A dynamic behavioral approach to nutritional assessment using process mining. In 2019 IEEE 32nd international symposium on computer-based medical systems (CBMS) (pp. 398–404). IEEE. Valero-Ramon, Z., Fernandez-Llatas, C., Martinez-Millana, A., & Traver, V. (2020). Interactive process indicators for obesity modelling using process mining. In Advanced computational intelligence in Healthcare-7 (pp. 45–64). Springer. Valero-Ramon, Z., Fernandez-Llatas, C., Valdivieso, B., & Traver, V. (2020). Dynamic models supporting personalised chronic disease management through healthcare sensors with interactive process mining. Sensors, 20(18), 5330. Van Der Aalst, W., Adriansyah, A., De Medeiros, A. K. A., Arcieri, F., Baier, T., Blickle, T., Bose, J. C., Van Den Brand, P., Brandtjen, R., Buijs, J., Burattin, A., Carmona, J., Castellanos, M., Claes, J., Cook, J., Costantini, N., Curbera, F., Damiani, E., de Leoni, M., … Wynn, M. (2011). Process mining manifesto. In International conference on business process management (pp. 169–194). Springer. Van Der Aalst, W., Adriansyah, A., & Van Dongen, B. (2011). Causal nets: A modeling language tailored towards process discovery. In International conference on concurrency theory (pp. 28–42). Springer. Van der Aalst, W. M. P. (2009). Process-aware information systems: Lessons to be learned from process mining. In Transactions on Petri nets and other models of concurrency II (pp. 1–26). Springer. Van der Aalst, W. M. P. (2011). Process mining: Discovering and improving spaghetti and lasagna processes. In 2011 IEEE symposium on computational intelligence and data mining (CIDM) (pp. 1–7). IEEE. van der Aalst, W. M. P., De Beer, H. T., & van Dongen, B.
F. (2005). Process mining and verification of properties: An approach based on temporal logic. In OTM confederated international conferences on the move to meaningful internet systems (pp. 130–147). Springer. Van der Aalst, W., Weijters, T., & Maruster, L. (2004). Workflow mining: Discovering process models from event logs. IEEE Transactions on Knowledge and Data Engineering, 16(9), 1128–1142. Van Der Spoel, S., Van Keulen, M., & Amrit, C. (2012). Process prediction in noisy data sets: A case study in a Dutch hospital. In International symposium on data-driven process discovery and analysis (pp. 60–83). Springer. Van Eck, M. L., Lu, X., Leemans, S. J. J., & Van Der Aalst, W. M. P. (2015). PM²: A process mining project methodology. In International conference on advanced information systems engineering (pp. 297–313). Springer. Verbeek, H. M. W., Buijs, J. C. A. M., Van Dongen, B. F., & Van Der Aalst, W. M. P. (2010). Xes, XESame, and ProM 6. In International conference on advanced information systems engineering (pp. 60–75). Springer. Vogelgesang, T., & Appelrath, H.-J. (2013). Multidimensional process mining: A flexible analysis approach for health services research. In Proceedings of the joint EDBT/ICDT 2013 workshops (Vol. 2013, pp. 17–22). ACM. Weijters, A. J. M. M., van Der Aalst, W. M. P., & De Medeiros, A. K. A. (2006). Process mining with the heuristics miner algorithm (Tech. Rep. WP 166). Technische Universiteit Eindhoven. Weiskopf, N. G., & Weng, C. (2013). Methods and dimensions of electronic health record data quality assessment: Enabling reuse for clinical research. Journal of the American Medical Informatics Association, 20(1), 144–151. Weske, M. (2007). Business process management—Concepts, languages, architectures. Springer. Williams, R., Rojas, E., Peek, N., & Johnson, O. A. (2018). Process mining in primary care: A literature review. Studies in Health Technology and Informatics, 247, 376–380. Wirth, R., & Hipp, J. (2000).
CRISP-DM: Towards a standard process model for data mining. In Proceedings of the 4th international conference on the practical applications of knowledge discovery and data mining (Vol. 1). Springer-Verlag. Xu, H., Pang, J., Yang, X., Jinghui, Y., Li, X., & Zhao, D. (2020). Modeling clinical activities based on multi-perspective declarative process mining with OpenEHR's characteristic. BMC Medical Informatics and Decision Making, 20(14), 1–11. Xu, H., Pang, J., Yang, X., Jinghui, Y., & Zhao, D. (2019). A modeling approach based on multi-perspective declarative process mining for clinical activity. In 2019 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 1688–1691). IEEE. Xu, H., Pang, J., Yang, X., Li, M., & Zhao, D. (2020). Using predictive process monitoring to assist thrombolytic therapy decision-making for ischemic stroke patients. BMC Medical Informatics and Decision Making, 20(3), 1–10. Xu, H., Pang, J., Yang, X., Ma, L., Mao, H., & Zhao, D. (2020). Applying clinical guidelines to conformance checking for diagnosis and treatment: A case study of ischemic stroke. In 2020 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 2125–2130). IEEE. Xu, H., Pang, J., Zhang, W., Li, X., Li, M., & Zhao, D. (2021). Predicting recurrence for patients with ischemic cerebrovascular events based on process discovery and transfer learning. IEEE Journal of Biomedical and Health Informatics, 25(7), 2445–2453. Xu, H., Yan, H., Pang, J., Nan, S., Yang, X., & Zhao, D. (2020). Evaluating the relative value of care interventions based on clinical pathway variation detection and propensity score. In 2020 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 1184–1187). IEEE. Xu, X., Jin, T., Wei, Z., & Wang, J. (2017). Incorporating topic assignment constraint and topic correlation limitation into clinical goal discovering for clinical pathway mining. Journal of Healthcare Engineering, 2017(5208072), 1–13.
Yang, S., Li, J., Tang, X., Chen, S., Marsic, I., & Burd, R. S. (2017). Process mining for trauma resuscitation. The IEEE Intelligent Informatics Bulletin, 18(1), 15. Yang, S., Sarcevic, A., Farneth, R. A., Chen, S., Ahmed, O. Z., Marsic, I., & Burd, R. S. (2018). An approach to automatic process deviation detection in a time-critical clinical process. Journal of Biomedical Informatics, 85, 155–167. Yang, S., Tao, F., Li, J., Wang, D., Chen, S., Ahmed, O. Z., Marsic, I., & Burd, R. S. (2018). Process mining the trauma resuscitation patient cohorts. In 2018 IEEE international conference on healthcare informatics (ICHI) (pp. 29–35). IEEE. Yang, S., Zhou, M., Chen, S., Dong, X., Ahmed, O., Burd, R. S., & Marsic, I. (2017). Medical workflow modeling using alignment-guided state-splitting HMM. In 2017 IEEE international conference on healthcare informatics (ICHI) (pp. 144–153). IEEE. Yang, W., & Su, Q. (2014). Process mining for clinical pathway: Literature review and future directions. In 2014 11th international conference on service systems and service management (ICSSSM) (pp. 1–5). IEEE. Yoo, S., Cho, M., Kim, E., Kim, S., Sim, Y., Yoo, D., Hwang, H., & Song, M. (2016). Assessment of hospital processes using a process mining technique: Outpatient process analysis at a tertiary hospital. International Journal of Medical Informatics, 88, 34–43. Zhang, X., & Chen, S. (2012). Pathway identification via process mining for patients with multiple conditions. In 2012 IEEE international conference on industrial engineering and engineering management (pp. 1754–1758). IEEE.
Zhou, M., Yang, S., Li, X., Lv, S., Chen, S., Marsic, I., Farneth, R. A., & Burd, R. S. (2017). Evaluation of trace alignment quality and its application in medical process mining. In 2017 IEEE international conference on healthcare informatics (ICHI) (pp. 258–267). IEEE. Zhou, Z., Wang, Y., & Li, L. (2014). Process mining based modeling and analysis of workflows in clinical care-a case study in a Chicago outpatient clinic. In Proceedings of the 11th IEEE international conference on networking, sensing and control (pp. 590–595). IEEE. Zhu, Q., Nie, H., Lu, X., & Duan, H. (2010). Radiology workflow-based monitoring dashboard in a heterogeneous environment. In 2010 3rd international conference on biomedical engineering and informatics (Vol. 6, pp. 2494–2498). IEEE.
How to cite this article: Guzzo, A., Rullo, A., & Vocaturo, E. (2022). Process mining applications in the healthcare domain: A comprehensive review. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 12(2), e1442. https://doi.org/10.1002/widm.1442
APPENDIX
TABLE A1 Classification of the surveyed papers
Reference Application Process Approach Tool Technique Data source Data
(Rojas, Fernandez-Llatas, et al., 2017) PMTD (Araghi, Fontanili, et al., 2018) PMM HM DIAG R.IO-DIAG (Kelleher et al., 2014) CA MP BPL ProM (A. Grando et al., 2017) PD, SNA PI, HM, PT CEDA ProM, Disco, PMApp (S. Yang, Tao, et al., 2018) PD MP CEDA 2 FC, HC RTLS PALIA-Web Video rec. TR HoWSN Video rec. Surgery Disco FM, HC, KM Video rec. TR Celonis FM Video rec. Surgery BSD Sepsis PALIA BSD Geriatrics (Lira et al., 2019) PPM MP PM (McGregor et al., 2011) PMM PB CRISP-DM (Fernandez-Llatas, Garcia-Gomez, et al., 2011) PD PB AB (Kaymak et al., 2012) PMM MP CEDA ProM HM BSD Surgery (Alharbi et al., 2018) PD, ELQI CP CEDA Disco, R lib. HMM-based MIMIC Neurology (Rojas & Capurro, 2018) PD CP CEDA Disco HM MIMIC Sepsis CP PM², L* ProM RLPN MIMIC Oncology L* ProM HM MIMIC (A. P. Kurniati et al., 2018a) PMM (A. P.
Kurniati et al., 2019) DQA (Pika et al., 2019) PPPM PP HIS, MIMIC (Marazza et al., 2020) VPA, VA CP CEDA (Fernandez-Llatas et al., 2013) VPA HM CEDA (Araghi, Fontanili, et al., 2018) PD HM DIAG ProM R.IO-DIAG IVM, RLPN MIMIC Oncology PALIA ILS HH HM RTLS Urology (Araghi et al., 2019) PPM HM DIAG+MP R.IO-DIAG HM RTLS (Martinez-Millana et al., 2019) PMTD HM DIAG PALIA-Web IVM RTLS (Araghi et al., 2020) PD HM DIAG R.IO-DIAG (Fernandez-Llatas et al., 2015) PD PP (Martin, 2018) ELQI (Fernandez-Llatas et al., 2021) ELQI HM (Binder et al., 2012) ELE CP ProM, PALIA-Web Surgery ILS HM, PALIA, RLPN, FrM RTLS Surgery HIS, ILS IPM ProM PALIA RTLS Surgery HM HIS Oncology Cardiology (Perimal-Lewis et al., 2014) PD, ELE PP CEDA ProM, Disco HM HIS (García et al., 2015) VPA, ELE PT CEDA ProM HM, RLPN HIS (Naeem et al., 2017) PD, SNA, ELE, ELQI CP CEDA ProM, Disco IM, IVM, HM, FM, RLPN HIS Hepatitis (Metsker et al., 2017) ELE CP CEDA Disco TM EHR Cardiology (Rinner et al., 2018) CA, ELQI CP BPL + MP ProM MPPE EHR Oncology (Vogelgesang & Appelrath, 2013) PMM (Alharbi et al., 2017) ELQI ProM IM, IMI MIMIC Diabetes OLAP MP CEDA 2 (O. A.
Johnson et al., 2018) Sim, ELQI CP PM (Fernandez-Llatas et al., 2010) PD, ELQI PP CEDA (Ibanez-Sanchez et al., 2019) PMM, ELQI PP IPM (Fernandez-Llatas et al., 2014) ELQI (Lismont et al., 2016) PD, PMM, ELQI PP (Duma & Aringhieri, 2017) PPA, ELQI (Stefanini et al., 2016) (Huang et al., 2014) ProM, Netimis EHR PALIA PMApp Neurology Cardiology PALIA HIS Cardiology CEDA+TC ProM, Disco SNM, FM, SC EHR Diabetes PP CEDA ProM IMI, HM, DT HIS ED PD, ELQI PP CEDA Disco FM HIS Oncology PD, ELQI CP CEDA LDA HIS Oncology (Prokofyeva & Zaytsev, 2020) PD, ELQI CP CEDA+TC stringdist HC, KM, LDA HIS Sepsis (Chiudinelli et al., 2020) PD, ELQI CP CEDA SPM, LDA EHR Oncology (X. Xu et al., 2017) PD, ELQI CP CEDA FM, LDA HIS Neurology (Antonelli & Bruno, 2015) PD, ELQI PP CEDA Disco FM HIS (Montani et al., 2017) ELQI PP CEDA ProM HM HIS (M. A. Grando et al., 2011) ELQI CP CEDA ProM α++, LTL-C (Dewandono et al., 2013) CA, ELQI CP CEDA ProM (Detro et al., 2017) VA, ELQI (Benevento, Dixit, et al., 2019) PD CP IPM (Neumuth et al., 2011) PD MP CEDA (Caron et al., 2011) PD, PU, SNA MP CEDA topicmodels Cardiology Respiratory dis. Diabetes CEDA Cardiology ProM IPM HIS ProM HM, SNM, LTL-C HIS HIS Oncology Surgery Oncology FM HIS Oncology SPM EMR Oncology (De Weerdt et al., 2012) PD CP CEDA (Huang et al., 2012) PD CP CEDA C# lib. (Neumuth et al., 2012) PD MP MP YAWL-PM (R.
Mans, Reijers, et al., 2012) PD, PPM, SNA PI MP ProM (Bouarfa & Dankelman, 2012) PD, OD MP CEDA (Furniss et al., 2016) PD PT CEDA Disco (B. Chen et al., 2021) PD PT CEDA ProM (Kim et al., 2013) PD, CA PP CEDA ProM (Cho et al., 2014) PD, Sim PP DIAG ProM, CPN Tools (Jaroenphol et al., 2015) PD CP CEDA ProM, Disco (Meneu et al., 2013) PD PP CEDA (Montani et al., 2013) PD CP ProM HM HIS Cardiology (Huang et al., 2013) PD CP C# lib. Ad hoc EMR Oncology (Dagliati et al., 2014) PD PB CEDA ProM HM HIS Diabetes (Miclo et al., 2015) PD HM CEDA Disco FM RTLS CPLEX ILP HIS Cardiology HMM-based Video rec. TR M-COFFEE TA Video rec., SD TR CEDA Disco HMM-based HIS TR Declare HIS Orthopedic CEDA Disco FM HIS ED ProM SC HIS ED HIS OS PALIA, QTC HIS Malnutrition ProM α-M, HM HIS CEDA ProM α-M, HM, FM, RLPN HIS Python lib. (Prodel et al., 2015) PD PP (S. Yang, Li, et al., 2017) PD, CA PP (M. Zhou et al., 2017) PD MP (S. Yang, Zhou, et al., 2017) PD MP Surgery HM, RLPN EMR Dentistry Video rec. Surgery Video rec. IM EHR Pediatric HIS OS HM, FM HIS OS FM HIS OS PALIA CEDA (Mertens et al., 2018) PD PP (Abo-Hamad, 2017) PD, PPM PP (Lee & Rismanchian, 2018) PD PP (Kirchner & Markovic, 2018) PD MP LPM (Valero-Ramon et al., 2019) PD PB CEDA+TC PALIA-Web (Garg & Agarwal, 2016) PD PP (Kukreja & Batra, 2017) PD, PPM, CA PP ProM HH Sepsis (De Oliveira et al., 2020) PD, VPA CP CEDA HIS Sepsis (Rebuge & Ferreira, 2012) VA MP CEDA+MP ProM RLPN, SC HIS ED (Ronny et al., 2015) PD, SNA CP CEDA+TC ProM SNM HIS Oncology (Delias et al., 2014) PD, PMM PP CEDA+TC ProM HM, FM, GM HIS ED (Delias et al., 2015) PD PP CEDA+TC ProM, Disco ILP, FM, USC (Najjar et al., 2018) PD CP TC HMM-based HIS Cardiology (Kovalchuk et al., 2018) PD, Sim, PPA PP CEDA+TC SKLearn KM EHR Cardiology (Pebesma et al., 2019) PD CP CEDA+TC HIS Diabetes (de Toledo et al., 2019) PD CP CEDA+TC ProM HM, FM, MBC, HC, KM HIS Diabetes (Lu et al., 2019) PD CP TC FSPM HIS Oncology, pediatric, diabetes C lib. 
ProM ED EHR (Valero-Ramon, Fernandez-Llatas, Valdivieso, & Traver, 2020) PD PB IPM PMApp PALIA (Valero-Ramon, Fernandez-Llatas, Martinez-Millana, & Traver, 2020) PD PB IPM PMApp PALIA (Hendricks, 2019) PD PP CEDA ProM α-M HIS Sepsis (Caron et al., 2014) PD, VA, CA, SNA CP CEDA HM, SNM, TA, LTL-C HIS Oncology (Baker et al., 2017) PD PP Ad hoc EHR Oncology (Alvarez et al., 2018) PD, SNA PI FM HIS ED CEDA Disco Cardiology Obesity (Leonardi et al., 2019) PD CP CEDA ProM HM HIS Cardiology (Amantea et al., 2020) PD PP CEDA PALIA-Web PALIA HIS Geriatrics (Duma & Aringhieri, 2020) PD, PPA PP CEDA Ad hoc HIS ED (H. Xu, Pang, Yang, Jinghui, et al., 2020) PD, CA CP MP ProM Declare EMR Cardiology (Placidi et al., 2021) PD, CA PP CEDA pMineR HMM-based HIS Oncology (Lakshmanan et al., 2013) PD CP CEDA+TC ProDiscovery HM, SC EMR Cardiology (Fernandez-Llatas, Meneu, et al., 2011) CA PP CEDA PALIA (Kirchner et al., 2012) CA PP BPL Signavio TA HIS Surgery (Rovani et al., 2015) CA PP BPL ProM Declare HIS Urology (de Vries et al., 2017) CA PP BPL ProM EMR Sepsis (Ganesha et al., 2017) CA CP CEDA ProM Video rec. OS ProM, Disco FM, RLPN ED (Jaturogpattana et al., 2017) CA PP CEDA IM, FM, RLPN HIS OS (Mannhardt & Blinde, 2017) CA PP CEDA+MP ProM IM, MPPE HIS Sepsis (Anggrainingsih et al., 2018) PPM, CA PP CEDA ProM α++, IM HIS OS (S. Yang, Sarcevic, et al., 2018) CA MP BPL ProM TA Video rec.
Pediatric (Neira et al., 2019) PPM, CA CP BPL ProM, Disco RLPN, FCBDP HIS Sepsis (Asare et al., 2020) CA PP CEDA ProM IM, RLPN (H. Xu, Pang, Yang, Ma, et al., 2020) CA CP BPL ProM Declare HIS Cardiology (H. Xu, Yan, Pang, Nan, et al., 2020) CA CP BPL A* EMR Cardiology (Dunkl et al., 2011) CA CP DM HIS Oncology (Andrews et al., 2020) VA, PPM PP (Suriadi et al., 2014) VA PP (Partington et al., 2015) VA PP (Montani et al., 2014) PMC PP (Perimal-Lewis et al., 2012) PPM PP CEDA ProM HIS (R. Mans et al., 2013) PPM, Sim MP MP ProM, CPN Tools HIS ProM, CPN Tools CEDA L* ProM IM HIS RTC ProM HM, FM, PM, TA, KM EHR Cardiology ProM HM, FM, RLPN HIS Cardiology ProM HM HIS Cardiology Dentistry (Jaisook & Premchaiswadi, 2015) PPM CP CEDA Disco FM HIS Respiratory dis. (Yoo et al., 2016) VPA, PMC, PPM PP CEDA HM EHR Oncology (Stefanini et al., 2018) PPM PP CEDA FM, IM HIS ED (Gattnar, Ekinci, & Detschew, 2011) PPM PP FM HIS Surgery FSPM EMR Cardiology TA HIS, Video rec., SD Oncology FM, TA HIS Oncology (Gattnar, Ekinci, Detschew, & Capel-Tunon, 2011) PPM, PMM MP (Aguirre et al., 2019) VPA PP PM² (Perer et al., 2015) VPA CP CEDA (S. Chen et al., 2017) VPA ProM, Disco ARIS CEDA Celonis (Jagadeesh Chandra Bose & van der Aalst, 2011) PU PP (Forsberg et al., 2016) PU PT CEDA ProM HM, RLPN PACS Radiology (Agostinelli et al., 2020) PU CP PM² ProM IVM, SNM HIS ED, OS (Z.
Zhou et al., 2014) Sim PP Disco, ProModel α-M, FM HIS OS (Lamine et al., 2015) Sim PT Disco, Witness FM HIS ED (Augusto et al., 2016) Sim CP Anylogic ILP HIS Cardiology (Tamburis & Esposito, 2020) Sim CP CEDA ProM, Disco, Simul8 PALIA HIS Surgery (Phan et al., 2019) PPM, Sim CP CEDA Disco, Anylogic FM HIS Surgery (Franck et al., 2020) Sim PP CEDA ProM, Anylogic HIS Cardiology (Riz et al., 2016) SNA PP CEDA ProM SNA HIS Oncology (Durojaiye et al., 2019) SNA PI CEDA igraph, graphkernels, kernlab USC, GKC EHR Pediatric (Conca et al., 2018) SNA PI CEDA PALIA-Web PALIA, SC HIS Diabetes CEDA ProM (Van Der Spoel et al., 2012) PPA PP CEDA (Benevento, Aloini, et al., 2019) PPA PP CRISP-DM Disco RF (Kempa-Liehr et al., 2020) PPM, PPA CP CEDA+TC ProM (Back et al., 2020) PPA PP CEDA HIS Cardiology HIS ED IVM, EEL EMR Surgery α-M HIS Surgery (H. Xu et al., 2021) PPA CP CEDA IM, Declare EMR Cardiology (H. Xu, Pang, Yang, Li, & Zhao, 2020) PPA CP CEDA SKLearn DT, RF EMR Cardiology (A. P. Kurniati et al., 2020) CD PP PM² ProM, bupaR IM, IMI, IDAHM, ILP EHR Oncology (Han et al., 2011) OD PP YAWL-BPM HIS Diabetes (Perimal-Lewis et al., 2016) DQA PP Disco EHR ED (Lanzola et al., 2014) DQA PP CEDA HIS 2 bupaR (Gonzalez-García et al., 2020) PMM CP PM (Erdogan & Tarhan, 2018) PMM PP PM² Disco (Tóth et al., 2017) PMM CP CEDA ProM Cardiology HIS Cardiology FM HIS Surgery IVM HIS Oncology (R. S.
Mans, Van der Aalst, et al., 2012) PMM CEDA ProM (Homayounfar, 2012) PMM (Fernandez-Llatas, 2021a) PMM IPM (Fernandez-Llatas, 2021b) PMM IPM (Martinez-Millana et al., 2021) PMM PP (Tibeme et al., 2018) PMM PP (Zhu et al., 2010) PMTD PT (Rojas, Sepúlveda, et al., 2017) PMM (Rojas et al., 2018) PPM PP CEDA ProM, Disco (Andrews et al., 2018) PPM PP CEDA Disco (Durojaiye et al., 2018) PPM PP CEDA+TC ProM (Chang et al., 2020) PPM PP CEDA bupaR (Sato et al., 2020) PPM PP CEDA (Arias et al., 2020) PMM PP CEDA 2 HIS Surgery HIS Diabetes PP IPM α-M, IM, HM, FM SD HIS CEDA Radiology ED FM HIS ED HIS Cardiology HM, RLPN HIS Pediatric HM EMR Cardiology Disco EHR Cardiology Celonis HIS Cardiology (G. Kusuma et al., 2020) PD PB PM ProM, Disco RLPN SD (Cho et al., 2020) PPM PP CEDA ProM, Disco, ProDiscovery IM, FrM EHR ED (Badakhshan & Alibabaei, 2020) PPM PP CEDA Disco FM HIS ED (Ibanez-Sanchez et al., 2021) PPM PP IPM HIS ED
Note: Applications: CA—conformance analysis; CD—concept drift; ELE—event log extraction; ELQI—event log quality improvement; DQA—data quality assessment; PU—process understanding; PD—process discovery; PMC—process model comparison; PMM—PM methodology; PMTD—PM tool development; PPA—predictive process analytic; PPM—process performance measures; PPPM—privacy preserving PM; OD—outlier detection; Sim—simulation; SNA—social network analysis; VA—variants analysis; VPA—visual process analytic.
Processes: CP—clinical pathways; HM—human movements; MP—medical processes; PB—patient behavior; PI—personnel interactions; PP—patient pathways; PT—personnel tasks.
Techniques: α-M—α-miner; bupaR—R lib; CPLEX—C lib for simplex method; DM—decision miner (ProM plug-in); DT—decision tree; EEL—explore event log (ProM plug-in); FC—fuzzy clustering; FCBDP—find context based differences in process (ProM plug-in); FM—fuzzy miner; FrM—frequency mining; FSPM—frequent sequential pattern mining; GKC—graph kernel clustering; GM—genetic miner; graphkernels—R lib; HC—hierarchical clustering; HM—heuristic miner; HMM—hidden Markov model; HoWSN—handover-of-work social network; IDAHM—interactive data aware heuristic miner (ProM plug-in); igraph—R lib; ILP—integer linear programming; IM—inductive miner; IMI—inductive miner infrequent; IPM—interactive PM (ProM plug-in); IVM—inductive visual miner; kernlab—R lib; KM—K-means clustering; LDA—latent Dirichlet allocation; LTL-C—LTL-checker; MBC—model based clustering; MPPE—multi-perspective process explorer; PM—passage miner; pMineR—R lib; QTC—quality threshold clustering; RF—random forest; RLPN—replay log on Petri net; SC—sequence clustering; SKLearn—Python lib; SNM—social network miner; SPM—sequential pattern mining; stringdist—R lib; TM—text mining; topicmodels—R lib; USC—unsupervised spectral clustering; TA—trace alignment.
PM methodologies: BPL—business process life cycle; CEDA—data collection, event log extraction, process discovery, process analysis; IPM—interactive PM; LPM—local process model; MP—multi-perspective PM; TC—trace clustering.
Data source: BSD—body-sensor data; EHR—electronic health record; EMR—electronic medical record; ILS—indoor location system; PACS—picture archiving and communication system; RTLS—real time location system; HIS—hospital information system.
Data: ED—emergency department; HH—home hospitalization; OS—outpatient service; RTC—road traffic crashes; TR—trauma resuscitation; SD—synthetic data.