A Scalable Heterogeneous Architecture for Agent-Oriented Workflow Management

Content Areas: Agent Technology, Interoperable Agent Platforms, FIPA, Workflow Management

John Hickie (1), James Kennedy (1), Georgios Koudouridis (2), Vaggelis Ouzounis (3) and Matthew Studley (4)

(1) Broadcom Éireann Research Ltd., Kestrel House, Clanwilliam Place, Dublin, Ireland. jh@broadcom.ie, jky@broadcom.ie
(2) Telia Research AB, Vitsandsgatan 9, D713, SE- 123 6 Farsta, Sweden. George.P.Koudouridis@telia.se
(3) GMD - FOKUS, Kaiserin-Augusta Allee 31, D-10589 Berlin, Germany. ouzounis@fokus.gmd.de
(4) BT Laboratories, Martlesham Heath, Ipswich IP5 3RE, United Kingdom. matthew.studley@bt.com

“This paper has not already been accepted by and is not currently under review for a journal or another conference, nor will it be submitted for such during IJCAI's review period.”

Abstract

With a growth rate of 35% per annum, the workflow management market has become a multibillion dollar business. This paper describes the work of the EURESCOM P815 project, which is investigating the advantages of bringing agent concepts and technologies to this booming sector. Existing workflow systems provide an intuitive way of defining and controlling business processes. However, a lack of interoperability and scalability in these tools has been identified as a serious weakness. P815 proposes a workflow architecture that will address these issues by applying agent standards and technologies. This agent architecture is found to map well to the problems described above.

1 Introduction

This paper discusses the ongoing work in the EURESCOM project P815. This project concentrates on applying agent technologies and standards to the management of business processes. However, this paper focuses on one of the case-studies under development, and in particular on how this case-study achieves a scalable interoperable architecture for agent-oriented workflow.
Section 2 outlines traditional workflow and identifies various shortcomings, while section 3 outlines the case-study under consideration.

2 Traditional Workflow Management

Workflow systems have evolved over time to meet the demands of current business processes. A workflow management system (WfMS) ensures that the individual tasks in a process occur in the order in which they are supposed to. A number of different software packages have been developed to perform this task. The WfMSs developed so far have traditionally been proprietary. Different systems have different approaches, strengths and weaknesses; consequently there is a lack of interoperability between them. To address this problem, the Workflow Management Coalition has been set up to try to integrate offerings from different vendors. Although much of its specification is complete, it is not widely adopted. However, the specification provides an insight into the commonalities between vendors.

The workflow reference model (http://www.aiim.org/wfmc/mainframe.htm) attempts to depict the general structure of workflow systems. These systems are based upon the precept that workflow relevant and non-workflow relevant data should be separated. The separation is achieved by maintaining two information sources, one for each kind of data. The workflow engine instructs tasks to start based on the information held in the workflow relevant database. The task then starts, knowing that it can get its data from the non-workflow relevant database. This model works well for centralised systems, where database access is ubiquitous. However, in distributed systems there are no such central repositories, as central databases start to become performance bottlenecks as the scale of the workflow system increases.

Aside from this model of communication, there are other important aspects of workflow management system design.
Chief among these is how the user is allowed to define their business logic, and how that business logic maps onto the underlying operational data.

2.1 Business Logic and Data

In any WfMS, two types of data flow can be clearly distinguished: the data flowing between the entities which perform tasks (the actors), and the data associated with the initiation and control of these entities’ activities. The first, referred to by the Workflow Management Coalition (WfMC) as ‘application data’, may be defined thus: ‘Data which is application specific and not accessible by the WfMS’ (reference model [WfMCa]). The second, the business logic or ‘workflow relevant data’, is defined by the WfMC as: ‘Data that is used by a WfMS to determine the state transitions of a workflow instance, for example within pre- and post-conditions, transition conditions or workflow participant assignment.’ [WfMCb].

There is a clear advantage in maintaining a partition between these data types. The workflow relevant data has nothing to do with the underlying tasks. While the actors may change, a given process may remain the same; all that is required is that there exists an actor for each role referred to within the process definition. Equally, an actor may be required to enact its role as part of many different processes, and in many instances of these processes.

2.2 Shortcomings of Traditional Workflow

The rapidly increasing capability and flexibility of telecommunications technology, especially the emergence and expansion of the WWW, Internet/Intranet and broadband network infrastructures, have brought a number of new possibilities and challenges to the design and organisation of business activities using telecommunications networks. These possibilities and challenges change the features required of business process management, and the emphasis placed on them.
The ubiquity of business processes and activities in the emerging global information infrastructure (GII) requires a WfMS to offer the following features:

Openness: A business context in the emerging telecommunications environment accommodates a variety of heterogeneous roles, players and software/hardware components/tools. Not all of these can be identified during the WfMS development phase. A suitable WfMS design should therefore support a higher degree of openness in a twofold sense: (i) open to the integration of (and the interoperation with) highly heterogeneous software components and tools that could exist in the global environment in which the WfMS is to be deployed, and (ii) open to the accommodation and integration of, and interoperation with, heterogeneous business sub-contexts with their specific requirements, roles and players.

Dynamic Nature: Business processes within the virtual telecom world will have a highly dynamic nature in relation to the environmental business restrictions, customer requirements, internal objectives/interests and the technologies deployed by the local organisations or the partners. Each process therefore has to dynamically modify its model and knowledge of the business contexts, and adapt its functions following the evolution of the business requirements. As a result, a WfMS should enable the business processes/activities to dynamically and reactively adapt (learn) their co-operation interfaces (services, APIs) and behaviours, especially by negotiating and interacting with their environments.

Distribution and Globalisation: Globalisation of the telecommunications infrastructure enables the wide distribution of business activities within the workflow instances.
Unlike traditional WfMS scenarios, where workflows are typically managed within relatively small administrative and geographical domains, future WfMSs will have to deal with concurrent business processes that are distributed over large areas such as countries, and over physically different enterprises (e.g. federations for Virtual Business Enterprises).

Decentralised, Autonomous Activities: Emerging virtual business applications typically consist of decentralised and autonomous entities. One example is the Virtual Business Enterprise, where independent enterprises can jointly form a new federated business for specific interests and for a specific duration. Each business entity/process in this context can have its own objective and agenda, and will base its behaviour on the feedback from the environment. Co-operation and co-ordination will have to be based on dynamic negotiations following either pre-specified or dynamically negotiated protocols.

2.3 Agent-Oriented Workflow

Current WfMSs, based on the traditional Remote Procedure Call (RPC) paradigm, support only the very basic services for workflow modelling, enactment and co-ordination. Most of these services are in the context of workflow process/activity life-cycle management. The idea behind an agent-oriented workflow management system (AoWfMS) is to utilise the agent-oriented paradigm to augment and enhance WfMSs so that they can manage business processes enacted in highly dynamic and distributed environments. Specifically, agent-oriented workflow contributes to the WfMS objective of achieving high availability and reliability of management activities. Centralised WfMS schemes result in bottlenecks around the WfMS servers, whereas AoWfMSs achieve flexibility by distributing control.
AoWfMSs provide mechanisms for co-ordination among workflow activities and processes that are fault and exception tolerant, and dynamically distribute service functionality to achieve a load balance among service processing management sites. Agent systems are traditionally well suited to solving this type of problem, as the sections that follow explain.

So far, there is no universally accepted definition that clarifies the exact semantics of the notion of an agent. Rather than attempting to define an agent, it is better to identify and reason upon some of the properties of agents and their contribution to WfMSs. According to the definition given in [Wooldridge] (footnote 1), an agent is a problem solving entity that has the following properties:

Autonomy: An agent is autonomous in the sense that it does not require direct intervention from the user to carry out its tasks. It has control over its actions and internal state.
Pro-activity: An agent dynamically pursues goals that have either been assigned by the user or activated due to external events. Being pro-active means being goal-oriented.
Reactivity: As agents have the ability to perceive their environment, they can react appropriately and in a timely fashion to changes in it.
Social ability: An agent can interact with other agents and its user by means of communication based on speech-acts. This communication provides a means for agents to co-ordinate their actions and to co-operate with other agents.

AoWfMSs address the above-mentioned shortcomings of WfMSs in the following ways.

Openness: the integration between heterogeneous systems becomes easier because agents provide speech-act-based communication, allowing the agents to be loosely coupled.
Dynamic Nature: coping with changes and exceptions as they arise becomes possible due to the agents’ ability to perceive their environment and to co-ordinate themselves and their actions accordingly at run-time.
Decentralisation & Autonomy: rather than having centrally imposed control, agents can, by being autonomous and goal-oriented, evaluate and decide upon their own actions based on their own goals and the prevailing circumstances.
Distribution & Globalisation: distribution of system elements is also an inherent characteristic of an AoWfMS, due to the delegation and co-ordination of actions between agents that could be situated at different locations.

In the rest of this paper and in terms of workflow, we focus on the “openness” issue, as this is the main objective of our proposed scalable heterogeneous architecture for AoWfMSs.

Footnote 1: Defining an agent has been the subject of a lot of debate in the agent community. The literature abounds with attempts to define agency, varying from rigorous to informal (see [Franklin], [Petrie 96], [Russell] and [Wooldridge] for example). The definition of an agent put forward by Wooldridge and Jennings, however, seems to have gained widespread acceptance.

3 P815 AoWfMS: A Case-Study in International Leased Line Provisioning

3.1 Business Process Definitions

The process definition as defined by WfMC [WfMCa] is the computerized representation of the activities, steps, participants, rules and sequences involved in a particular process. It depicts, for example, process initiation and completion conditions, rules for driving process activities, user tasks to be undertaken, references to applications which may be invoked and the definition of any required workflow relevant data. The process definition can be represented in textual and/or graphical form or by means of a formal language notation, which can be the same as the content language used for representing an agent’s knowledge.
In either case, a process definition language should be equipped with the semantics required for the description of business processes and should also allow goal-oriented specifications of activities (that is, a description of what is to be achieved rather than a prescription of how it is to be achieved). Figure 1 illustrates the main structural elements of a business process.

    (business process
      :name "Inland Section Provision"
      :type composite
      :role "Inland Section Representative"
      :data (in (SLA-ID #4))
      :pre-condition (order handled)
      :post-condition (inland section provided)
      :nodes (process-node "Inland Section Design")
             (process-node "Inland Section Test")
             ...
    )

[Annotations in the original figure: "Begin business process definition" marks the opening expression; "Parameter value" marks the values that follow each keyword.]

A business process definition is an object data structure consisting of a sequence of process parameters, introduced by parameter keywords (constants) beginning with a colon. As depicted in the figure and according to the above discussion, the process’s parameters state the following information: the process’s name (:name); the process’s type, which is either composite or atomic (:type); the initiation conditions for the process to be enacted (:pre-condition); the termination conditions indicating process termination (:post-condition); the role of the organizational entity (:role); the process’s input and output application data (:data); and the activities to be carried out within the process (:nodes). In the case where a process is atomic, the parameter “:nodes” includes the name of the function to be executed. Like the business process itself, pre- and post-conditions can be atomic or composite, i.e., consisting of or-branches or and-branches of conditions. Conditions can also be negated by using “not”. Depending on the workflow engine, the data parameter can be used to pass application data, or references to it in the case where control and data flow are separated.
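As an illustration only, the parameters listed above could be mirrored in a small Python data structure. The sketch below is not part of the P815 design: the class and function names (BusinessProcess, And, Or, Not, holds) and the idea of checking conditions against a set of established facts are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Union


@dataclass
class Not:
    """Negated condition (the "not" form mentioned in the text)."""
    condition: "Condition"


@dataclass
class And:
    """Composite condition: every branch must hold."""
    branches: List["Condition"]


@dataclass
class Or:
    """Composite condition: at least one branch must hold."""
    branches: List["Condition"]


# An atomic condition is modelled as a plain string (an assumption).
Condition = Union[str, Not, And, Or]


def holds(cond: Condition, facts: Set[str]) -> bool:
    """Evaluate a condition against the set of facts established so far."""
    if isinstance(cond, str):
        return cond in facts
    if isinstance(cond, Not):
        return not holds(cond.condition, facts)
    if isinstance(cond, And):
        return all(holds(c, facts) for c in cond.branches)
    if isinstance(cond, Or):
        return any(holds(c, facts) for c in cond.branches)
    raise TypeError(f"unsupported condition: {cond!r}")


@dataclass
class BusinessProcess:
    name: str                 # :name
    process_type: str         # :type, "composite" or "atomic"
    role: str                 # :role
    data_in: Dict[str, int]   # :data (input application data or references)
    pre_condition: Condition  # :pre-condition
    post_condition: Condition # :post-condition
    nodes: List[str]          # :nodes (sub-process names or a function name)

    def can_start(self, facts: Set[str]) -> bool:
        return holds(self.pre_condition, facts)

    def is_complete(self, facts: Set[str]) -> bool:
        return holds(self.post_condition, facts)


# The definition from Figure 1, transcribed into the hypothetical structure.
inland = BusinessProcess(
    name="Inland Section Provision",
    process_type="composite",
    role="Inland Section Representative",
    data_in={"SLA-ID": 4},
    pre_condition="order handled",
    post_condition="inland section provided",
    nodes=["Inland Section Design", "Inland Section Test"],
)
```

With the fact set {"order handled"}, inland.can_start(...) is true while inland.is_complete(...) is still false. Keeping conditions as data rather than code is one way to let a definition be stored in a repository and distributed between agents unchanged.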
The minimal functionality required for handling business process definitions should include: (1) editorial or graphical capabilities that facilitate the definition of processes and enable designers to control process deployment (these capabilities can be either incorporated in or external to the agent); (2) the maintenance of a repository used to store these definitions; and (3) a mechanism that deploys and distributes these definitions among agents and inserts them in the agents. These functions should be available both off-line and at run-time to allow automatic and dynamic adaptations to environmental changes.

3.2 Workflow Engine

The workflow engine must respond to an initiation signal which indicates the type of process which is to be enacted. Using this information, the workflow engine retrieves a Business Process Definition (BPD) from a local repository, and from it creates a Business Process Instance (BPI). There can be many BPIs concurrently executing, and these may be instances of multiple BPDs.

Figure 1: Main structural elements of a business process [annotations in the original figure: "Business process parameters", "Parameter expression"]

A BPI may be thought of as a directed graph. Each node in the graph may be a reference to a further BPD, similarly available from the local repository. Alternatively, a node may define an activity in the BPI. Activities are to be enacted by agents, which are long-lived entities that register their abilities with a central directory. The workflow engine uses the directory facility (DF) to find the logical address of an agent which can satisfy the requirements specified by a non-BPD node in a BPI. When an agent finishes its task, it signals the termination conditions to the workflow engine. The workflow engine may use these signals to decide which step in the BPI is the next to be performed.

In our case, there are three general types of agents that support the execution and management of the business processes:
Workflow Request Agent (WRA): receives the requests for business process execution/instantiation.
Workflow Engine Agent (WEA): executes and monitors the execution of a unique BPI.
Resource Agent (RA): provides the functionality required by different activities (non-BPD nodes in the BPI).

In general, a client requests the execution of a business process by sending a request for a BP invocation to the WRA. Subsequently, the WRA instantiates a WEA. The WEA undertakes the responsibility for the execution and control of the BPI. Whenever an activity needs to be executed, the WEA locates the appropriate RA and asks it to provide the required functionality. Whenever a sub-BP needs to be executed, the WEA contacts another WEA and asks it to execute the sub-BP.

In general, a client can start, pause, resume and terminate a business process. Upon request of these operations, the WRA instantiates a WEA that will undertake the responsibility to execute the BP. For that purpose, the WEA contacts the Business Process Repository, retrieves the BPD and creates a BPI. After the creation of the BPI, its execution can be started. Whenever a client wants to pause a BP, it sends a request to the WRA, and the WRA then pauses the execution of the WEA. Thus, the operations that the WEA provides to the WRA are the instantiation of a BPD, the initiation of a BPI, the pausing of a BPI, the resumption of a BPI and, finally, the termination of a BPI. The WRA has full control over the execution of the BPI and, based on client requests, can force the WEA to behave accordingly.

If the WEA identifies that the execution of the BPI should be continued with the invocation of an activity, then the WEA contacts the DF and locates the physical address of the RA that can provide the activity. Then, the WEA can request the RA to start, stop, resume or terminate the activity.
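The interplay described above between the WRA, the WEA, the repository and the DF can be sketched in a few lines of Python. This is a hypothetical, single-process illustration and not the P815 implementation: the class names, the agent addresses, the flat node list standing in for the BPI graph, and the exact set of permitted state transitions are all assumptions.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    RUNNING = "running"
    PAUSED = "paused"
    TERMINATED = "terminated"


class DirectoryFacility:
    """Maps an activity name to the address of an agent registered for it."""

    def __init__(self):
        self._agents = {}

    def register(self, activity, address):
        self._agents[activity] = address

    def locate(self, activity):
        # Raises KeyError if no agent has registered for the activity.
        return self._agents[activity]


class WorkflowEngineAgent:
    """Executes one BPI; here the BPI is simplified to an ordered node list."""

    def __init__(self, bpd, df):
        self.pending = list(bpd["nodes"])  # the BPI is a copy of the BPD nodes
        self.df = df
        self.state = State.CREATED

    def _move(self, allowed_from, new_state):
        if self.state not in allowed_from:
            raise RuntimeError(
                f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state

    def start(self):
        self._move({State.CREATED}, State.RUNNING)

    def pause(self):
        self._move({State.RUNNING}, State.PAUSED)

    def resume(self):
        self._move({State.PAUSED}, State.RUNNING)

    def terminate(self):
        self._move({State.CREATED, State.RUNNING, State.PAUSED},
                   State.TERMINATED)

    def next_activity(self):
        """Return (activity, RA address) for the next node, or None when done."""
        if self.state is not State.RUNNING:
            raise RuntimeError("BPI is not running")
        if not self.pending:
            self.terminate()
            return None
        activity = self.pending.pop(0)
        return activity, self.df.locate(activity)


class WorkflowRequestAgent:
    """Receives client requests and instantiates a WEA per process."""

    def __init__(self, repository, df):
        self.repository = repository  # stands in for the BP Repository
        self.df = df

    def start_process(self, name):
        wea = WorkflowEngineAgent(self.repository[name], self.df)
        wea.start()
        return wea


# Example wiring; the addresses are made up for illustration.
df = DirectoryFacility()
df.register("Inland Section Design", "ra-design@domain-a")
df.register("Inland Section Test", "ra-test@domain-a")
repository = {
    "Inland Section Provision": {
        "nodes": ["Inland Section Design", "Inland Section Test"],
    },
}
wra = WorkflowRequestAgent(repository, df)
wea = wra.start_process("Inland Section Provision")
```

Calling wea.next_activity() repeatedly walks the node list and resolves each activity to an RA address through the DF; while the WEA is paused the call raises an error, which mirrors the WRA's control over the execution of the BPI.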
When the RA completes the provision of the activity normally, it notifies the WEA by sending it a notification message. The message can be sent in either synchronous or asynchronous mode. Based on these messages, the WEA can evaluate the constraints of the BPI and decide which activity or sub-BP needs to be executed next.

If the WEA identifies that the execution of the BPI should be continued with the invocation of a sub-BP, then the WEA contacts the DF and locates the physical address of the WEA that can support the execution of this sub-BP. Subsequently, the source WEA contacts the target WEA and asks it to instantiate the sub-BP. The target WEA retrieves the sub-BP from the Business Process Repository, interprets it and creates a sub-BPI. Based on this BPI, the target WEA will start the execution of the sub-BP. The source WEA can request it to start, stop, resume and terminate the execution. In any case, the target WEA informs the source WEA about the results of the sub-BPs by sending notification messages.

The above-mentioned agents interact by exchanging ACL messages in order to support the execution of the workflow on behalf of the clients. Three types of messages can be exchanged, namely request messages, response messages and notification messages. Request messages are exchanged between an agent a and an agent b when agent a wants a service or operation to be executed by agent b. Response messages are exchanged when agent b wants to respond to the request message of agent a by providing the result of the operation requested. Finally, notification messages are exchanged when agent b wants to inform agent a about the termination (normal or abnormal) of its task. Messages can be sent either synchronously or asynchronously according to the needs of the agents.

In some cases, workflow occurs across the boundary of organisational units.
An example of this is the current case-study, International Leased Line (ILL) Provisioning. Consider for a moment the following scenario: there are two telecommunications companies, a and b. A customer requests a to establish an ILL between two points, a1 and b1, where a1 is locally accessible to a and b1 is locally accessible to b. In such a case it is merely necessary that a should know that there exists a BPD within the remote domain b which can satisfy a’s requirements, subject to prior agreement.

For the workflow engine, requests are received, and actors are discovered who can satisfy the task requirements. One BPI’s actor may in some cases be another workflow engine, and a BPI’s initiating ‘customer’ may, vice versa, be a remote workflow engine. In this manner, the AoWfMS presented in this paper addresses the problems of scaling and encompasses heterogeneity. Different domains may be expected to have BPDs that differ though they are functionally equivalent. However, in so far as these BPDs can be referred to at a level of abstraction that hides their differences, the heterogeneity between domains becomes transparent.

3.3 Agent Communication

The P815 AoWfMS will use a FIPA [FIPA98] compliant interface to communicate between heterogeneous WfMSs. In this way, FIPA agent message formats can be used to limit the amount of communications management needed. Table 1 depicts how issues such as session handling, control of the workflow process and non-workflow relevant data are separated.

An AoWfMS will need to communicate workflow relevant and non-workflow relevant data between its agents. The workflow relevant data will be generic to all workflows that are modelled. An example of this may be a ‘Start Task’ message that is sent between tasks in a workflow. Equally, we might expect a ‘Task Completed’ response from a task which has finished a requested service.
These types of generic messages have been well defined in the WfMC interoperability specification [WfMCa]. This specification outlines the type of messaging that might be expected in the generic case. There are basic message types defined which allow two workflow engines to communicate. Issues such as session and message handling are addressed in [WfMCa]. The scale of what this specification attempts to do is perhaps reflected in the fact that the standard remains largely unimplemented. Defining the interfaces between different WfMSs has led to a lengthy specification.

Agent communication technologies would seem to be ideally suited to removing much of the complexity of interoperability. Agent platforms may be used to ‘hide’ much of the detail, providing a layer of workflow management primitives upon which workflow systems may communicate. Agent standards for interoperability [FIPA98] have grown in maturity over the last few years to a stage where real applications may be built upon them [FACTS]. There has already been work done on the application of agents for workflow interoperability [KAM]. The P815 project will focus on scalability and openness.

To improve the scalability of a WfMS, the handling of workflow relevant and non-workflow relevant data is key. In a centralised model, control of the workflow process is separated from the data needed to complete a domain specific task. This may be both a logical and a physical separation, as workflow specific data (control data for the purposes of the workflow) is stored in a separate database. This approach helps to distinguish between the workflow process and the individual tasks. However, in a distributed system, there may not be universal access to either workflow specific or non-workflow specific data. Messages passed between systems will contain both workflow specific and non-workflow specific data. It might seem that this approach may blur the lines between the two.
To separate the two, much of the communication management that was previously specified in the WfMC interoperability specification will be replaced by agent communication tools. There is a distinct logical separation of workflow and non-workflow relevant data. The workflow relevant data is dealt with in the middle layer of Table 1.

Table 1: Three layers of communication in P815 AoWfMS

    Non-Workflow Relevant Data
    Workflow Relevant Data
    Agent Communication (FIPA) and Management

4 Conclusions

Workflow management would benefit from applying agent-oriented technology in several ways. It is envisaged that an AoWfMS can obtain the required properties of openness, dynamic nature, distribution and decentralisation by employing common agent features and capabilities as well as patterns for agent interaction.

In this paper we presented an architecture for an AoWfMS that addresses the problems of scaling and encompasses heterogeneity to deal with the openness issue. By equipping agents with a workflow engine and by utilising agents’ social ability and the way they communicate using speech acts, we can achieve a WfMS that facilitates the enactment of business processes in heterogeneous domains or even environments. Furthermore, as business process definitions can be referred to at a level of abstraction that hides their differences, the heterogeneity between domains becomes transparent. Additionally, the proposed architecture enables the creation of an agent-oriented workflow management system that is easily extensible in functionality and can adapt to changing business environments. Such a feature gives us a system that can be deployed and developed at the same time.

At the time of writing, phase 1 of the project, which builds the initial infrastructure, is under implementation. This phase concentrates on achieving interoperability between various FIPA platforms with built-in workflow engines. Once this phase is completed, phase 2 will build on this infrastructure to determine the benefits of an AoWfMS.
P815 is due for completion in March 2000.

Disclaimer

This document is based on results achieved in a EURESCOM Project. It is not a document approved by EURESCOM, and may not reflect the technical position of all the EURESCOM Shareholders. The contents and the specifications given in this document may be subject to further changes without prior notification. Neither the Project participants nor EURESCOM warrant that the information contained in the report is capable of use, or that use of the information is free from risk, and accept neither liability for loss or damage suffered by any person using this information nor for any damage which may be caused by the modification of a specification. This document contains material which is the copyright of some EURESCOM Project Participants and may not be reproduced or copied without permission. The commercial use of any information contained in this document may require a license from the proprietor of that information.

References

[FACTS] http://www.labs.bt.com/profsoc/facts/
[FIPA98] http://www.fipa.org/spec/FIPA98.html
[Franklin] http://www.msci.memphis.edu/~franklin/AgentProg.html
[KAM] Mohan Kamath, Oracle Corp., Krithi Ramamritham, Pragmatic Issues in Coordination Execution and Failure Handling of Workflows in Distributed Workflow Control Architectures.
[Petrie 96] Petrie, Charles J., ‘Agent-Based Engineering the Web and Intelligence’, IEEE Expert, December 1996, also http://cdr.stanford.edu/NextLink/Expert.html
[Russell] Russell, Stuart J., Norvig, Peter. Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice Hall, 1995.
[WfMCa] Interoperability Abstract Specification, 20th October 1996, Workflow Management Coalition.
[WfMCb] Workflow Management Coalition: Terminology & Glossary (WFMC-TC-1011, June 1996, 2.0).
[Wooldridge] Wooldridge, M., Jennings, N. Intelligent Agents: Theory and Practice. The Knowledge Engineering Review 10(2), pp. 115-152, 1995.