Deliverable1Wp5

Report on the State-of-the-Art and Requirements Analysis
DIP
Data, Information and Process Integration with Semantic Web Services
FP6 - 507483
Deliverable
D5.1
Report on the State-of-the-Art and Requirements Analysis
(WP 5 – Service Mediation)
Emilia Cimpian
Christian Drumm
Michael Stollberg
Ion Constantinescu
Liliana Cabral
John Domingue
Farshad Hakimpour
Atanas Kiryakov
07 March 2016
EXECUTIVE SUMMARY
This deliverable covers the current state-of-the-art in data, information, and process
mediation, and provides an analysis of mediation requirements for the DIP Mediation
Component.
The document treats the mediation of data and information separately from process
mediation since process mediation requires the interpretation of goals and workflow as
well as flexible Web Service invocation, which are not required for data and
information mediation.
This document consists of two main parts. The first part provides an overview of the
current state of the art in mediation, describing some of the existing approaches and
projects. In this part, the industrial and research approaches to data and information
mediation are treated separately.
The second part of the document provides an analysis of mediation requirements. Three
types of requirements are considered: requirements regarding the general
architecture of the DIP Mediation Component (covering both the run-time
environment and the design-time tool), requirements for data and
information mediation, and requirements for process mediation.
Document Information
IST Project Number: FP6 – 507483
Acronym: DIP
Full title: Data, Information and Process Integration with Semantic Web Services
Project URL: http://dip.semanticweb.org
Document URL: https://bscw.dip.deri.ie/bscw/bscw.cgi/0/521
EU Project officer: Daniele Rizzi
Deliverable Number: 5.1
Deliverable Title: Report on the State-of-the-Art and Requirements Analysis
Work package Number: 5
Work package Title: Service Mediation
Date of delivery: Contractual M6; Actual 30-June-04
Status: version 0.2, final
Nature: Prototype / Report / Dissemination
Dissemination Level: Public / Consortium
Authors (Partner): Emilia Cimpian (NUIG), Christian Drumm (SAP),
Michael Stollberg (UIBK), Ion Constantinescu (EPFL),
Liliana Cabral (OU), John Domingue (OU), Farshad Hakimpour (OU),
Atanas Kiryakov (SIRMA)
Responsible Author: Emilia Cimpian
Partner: NUIG
Email: emilia.cimpian@deri.ie
Phone: +353-91-512640
Abstract (for dissemination):
In the last twelve years since Gio Wiederhold [Wiederhold, 1992] first
came up with the idea of mediation and mediation systems, intensive
research has been done in this field. In this report, we provide an overview
of the current state-of-the-art in data, information, and processes
mediation and an analysis of the requirements for constructing a mediation
system.
Keywords:
Data and information mediation; processes mediation; schema matching;
ontology mapping, merging and alignment; choreography; orchestration;
collaboration
Version log
Date | Change | Author
27-Feb-04 | First draft of the skeleton of the deliverable | Emilia Cimpian
19-Feb-04 | Changes on the skeleton conforming to the discussions during the Wiesbaden meeting | Emilia Cimpian
22-March-04 | Paragraphs added regarding the state-of-the-art in data mediation | Christian Drumm
29-March-04 | Outline of the state-of-the-art analysis for process mediation | Michael Stollberg
05-April-04 | IRS II description | John Domingue, Liliana Cabral
21-April-04 | Bullet points on requirements analysis | Christian Drumm
29-April-04 | D5.1a: Survey of Industrial Data Integration Systems – separate from this document, but part of the same deliverable | Vladimir Alexiev, Atanas Kiryakov
29-April-04 | Compilation of work on ontology-based data mediation | Farshad Hakimpour
30-April-04 | Restructuring of the document | Emilia Cimpian, Christian Drumm, Michael Stollberg
07-May-04 | EPFL contribution and process mediation included | Ion Constantinescu
14-May-04 | Paragraphs added to the introduction | Michael Stollberg
17-May-04 | SAP XI and XMapper descriptions added; also significant changes to the state of the art in data and information mediation | Christian Drumm
20-May-04 | Paragraphs added to “Overview on Data and Information Mediation Approach” and on the requirements; reordering of the references in alphabetical order. | Emilia Cimpian
21-May-04 | More requirements | Christian Drumm
25-May-04 | More requirements | Christian Drumm
28-May-04 | Update on process mediation | Michael Stollberg
1-June-04 | Reference added to D5.1a | Atanas Kiryakov, Emilia Cimpian
02-June-04 | Ms BizTalk description added; restructuring of the research state-of-the-art in data and information mediation | Emilia Cimpian
03-June-04 | More requirements | Christian Drumm
04-June-04 | Small changes concerning the form of the document | Emilia Cimpian
22-June-04 | Changes in the entire document, conforming to the reviewers comments | Emilia Cimpian, Christian Drumm, Michael Stollberg
24-June-04 | Small changes after the proofreading | Emilia Cimpian
Project Consortium Information
Partner
Acronym
NUIG
Contact
Prof. Dr. Christoph Bussler
Digital Enterprise Research Institute (DERI)
National University of Ireland, Galway
National University of Ireland Galway
Galway
Ireland
Email: chris.bussler@deri.ie
Tel: +353 91 512460
Bankinter
Monica Martinez Montes
Fundacion de la Innovation. BankInter,
Paseo Castellana, 29
28046 Madrid,
Fundacion De La Innovacion.Bankinter
Spain
Email: mmtnez@bankinter.es
Tel: 916234238
Berlecon
Dr. Thorsten Wichmann
Berlecon Research GmbH,
Oranienburger Str. 32
10117 Berlin,
Berlecon Research GmbH
Germany
Email: tw@berlecon.de
Tel: +49 30 2852960
BT
Dr John Davies
BT Exact (Orion Floor 5 pp12)
Adastral Park Martlesham,
British Telecommunications Plc.
Ipswich IP5 3RE,
United Kingdom
Email: john.nj.davies@bt.com
Tel: +44 1473 609583
EPFL
Prof. Karl Aberer
Distributed Information Systems Laboratory
École Polytechnique Féderale de Lausanne
Swiss Federal Institute of Technology,
Lausanne
Bât. PSE-A
1015 Lausanne, Switzerland
Email : Karl.Aberer@epfl.ch
Tel: +41 21 693 4679
Essex
Mary Rowlatt,
Essex County Council,
PO Box 11, County Hall, Duke Street,
Chelmsford, Essex, CM1 1LX,
Essex County Council
United Kingdom.
Email: maryr@essexcc.gov.uk
Tel: +44 (0)1245 436524
FZI
Andreas Abecker
Forschungszentrum Informatik
Haid-und-Neu Strasse 10-14
76131 Karlsruhe,
Forschungszentrum Informatik
Germany
Email: abecker@fzi.de
Tel: +49 721 9654 0
UIBK
Prof. Dieter Fensel
Institut für Informatik, Leopold-Franzens Universität Innsbruck
Institute of computer science
University of Innsbruck
Technikerstr. 25
A-6020 Innsbruck, Austria
Email: dieter.fensel@deri.org
Tel: +43 512 5076485
ILOG
Christian de Sainte Marie
9 Rue de Verdun, 94253
Gentilly, France
ILOG SA
Email: csma@ilog.fr
Tel: +33 1 49082981
Inubit
Torsten Schmale,
inubit AG
Lützowstraße 105-106
D-10785 Berlin,
inubit AG
Germany
Email: ts@inubit.com
Tel: +49 30726112 0
iSOCO
Dr. V. Richard Benjamins, Director R&D
Intelligent Software Components, S.A.
Pedro de Valdivia 10
Intelligent Software Components, S.A.
28006 Madrid, Spain
Email: rbenjamins@isoco.com
Tel. +34 913 349 797
Net Dynamics
Peter Smolle
Net Dynamics Internet Technologies GmbH &.
Co KG
Net Dynamics Internet Technologies
GmbH u. Co KG
Prinz-Eugen-Strasse 68-70
A-1040 Wien, Austria
Email: peter.smolle@netdynamics-tech.com
Tel.: +43 1 503982615
OU
Dr. John Domingue
Knowledge Media Institute,
The Open University, Walton Hall,
Milton Keynes, MK7 6AA,
The Open University
United Kingdom
Email: j.b.domingue@open.ac.uk
Tel.: +44 1908 655014
SAP
Dr. Elmar Dorner
SAP Research, CEC Karlsruhe
SAP AG
SAP AG
Vincenz-Priessnitz-Str. 1
76131 Karlsruhe, Germany
Email: elmar.dorner@sap.com
Tel: +49 721 6902 31
Sirma
Atanas Kiryakov,
Ontotext Lab, - Sirma AI EAD,
Office Express IT Centre, 3rd Floor
135 Tzarigradsko Chausse,
Sirma AI Ltd.
Sofia 1784, Bulgaria
Email: atanas.kiryakov@sirma.bg
Tel.: +359 2 9768 303
Tiscali
Tiscali Österreich Gmbh
Dieter Haacker
Tiscali Österreich GmbH.
Diefenbachgasse 35,
A-1150 Vienna,
Austria
Email: Dieter.Haacker@at.tiscali.com
Tel: +43 1 899 33 160
Unicorn
Jeff Eisenberg
Unicorn Solutions Ltd,
Malcha Technology Park 1
Jerusalem 96951,
Unicorn Solution Ltd.
Israel
Email: Jeff.Eisenberg@unicorn.com
Tel.: +972 2 6491111
VUB
Carlo Wouters,
Starlab- VUB
Vrije Universiteit Brussel
Pleinlaan 2, G-10
Vrije Universiteit Brussel
1050 Brussel ,Belgium
Email: carlo.wouters@vub.ac.be
Tel.: +32 (0) 2 629 3719
TABLE OF CONTENTS
EXECUTIVE SUMMARY
TABLE OF CONTENTS
1 INTRODUCTION
2 STATE-OF-THE-ART ANALYSIS
2.1 MEDIATION OF DATA AND INFORMATION
2.1.1 Importance of Data and Information Mediation in Semantic Web Services
2.1.2 Industrial State-Of-The-Art
2.1.2.1 Approaches and Projects
2.1.2.2 Conclusions
2.1.3 Research State-of-the-Art
2.1.3.1 Classification Based on the Scope of Mediation
2.1.3.2 Classification Based on the Classes of Application
2.1.3.3 Approaches in Constructing the Mediator
2.1.3.4 Conclusions
2.2 MEDIATION OF PROCESSES
2.2.1 Processes and Process Technologies
2.2.1.1 Usage of Process Technologies within Semantic Web Services
2.2.1.2 Need for Process Mediation
2.2.1.2.1. Choreography
2.2.1.2.2. Orchestration
2.2.2 Technologies for Process Mediation
2.2.2.1 Existing Process Representation Technologies
2.2.2.2 Formalization of Process Representation
2.2.2.2.1. Logics for Representing Interaction Protocols
2.2.2.2.2. Formalizing Choreography Description
2.2.2.2.3. Formalizing Web Service Orchestrations
2.2.2.3 Process Integration by Process Composition
2.2.2.3.1. Situation Calculus for Service Composition
2.2.2.3.2. Hierarchical Task Planning for Service Composition
2.2.2.3.3. Type Based Service Composition
2.2.3 Conclusion
3 REQUIREMENT ANALYSIS
3.1 ARCHITECTURAL REQUIREMENTS FOR DIP MEDIATION COMPONENT
3.1.1 Requirements for the Run-Time Environment
3.1.2 Requirements on the Design-Time Tool
3.2 REQUIREMENTS FOR DATA LEVEL MEDIATION
3.3 REQUIREMENTS FOR PROCESS LEVEL MEDIATION
4 CONCLUSIONS
5 REFERENCES
1 INTRODUCTION
Due to an ever-increasing number of resources available on-line, end-users are
presented with large amounts of data and information, and can find it nearly impossible
to extract the relevant information items. Possibly the best solution to this overloading
problem is to mediate between different heterogeneous sources and to provide the user
with a single, relevant information source that is obtained by combining and relating a
wide variety of different sources.
The systems used for integrating heterogeneous sources are mediators [Wiederhold,
1992]. The basis of mediators and mediation architecture, as introduced by Wiederhold,
is that a mediator resolves mismatches during the run-time of a system; it does not wrap
resources before they are used in a system. Additionally, a mediator not only provides
the mediation, but also provides an authoring environment in order to define mediation.
From the business data viewpoint, it is a mapping tool that maps concepts from different
data sources without losing or altering their semantics. Mediation and integration, or more
generally, dealing with heterogeneity, become very important when operating in
distributed systems such as Internet applications and the Semantic Web, and
especially when dealing with Semantic Web Services [Bussler, 2003].
Mediation rests on four components. Firstly, it requires a high-level technique for
describing the structure of resources. A mechanism checks the resources to be
integrated (that is, to be made interoperable) according to their structure and then
provides mapping functionality in order to make the resources interoperable. The basis
of such a mechanism is an exhaustive, declarative description technique that allows the
description of all features of the resources: an expressive ontology language
for ontologies, and a suitable process description language for business processes (for
ontologies we adopt the definition provided by Gruber [Gruber, 1993]: an ontology is
a specification of a conceptualization; by processes we mean a set of activities and
transitions with conditions for transitions).
Secondly, an algebra is needed on top of this that defines the computable relations
between the modeling primitives and the operations between them [Papakonstantinou et
al., 1996].
Thirdly, a classification of mismatches that can occur between resources is required; the
mismatch identification scheme should also classify the types of mismatches according
to their resolvability.
The fourth component of a mediator is a mechanism that works on the representation
language and resolves a subset of the mismatches, with the algebra as the basis.
In general, mediation is an open-ended problem field, and only partial solutions can be
realized by automated mechanisms. The reason is that it is considered
impossible to define an algebra and mismatch-resolving mechanisms for all kinds of
heterogeneity that can appear [Yerneni et al., 1999]. Most mediation technologies are
only semi-automatic when resolving conceptual mismatches, where both resources model
some aspect differently but correctly; that is, they require human intervention
when defining the mappings between concepts.
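The interplay of declared mappings, a mismatch classification by resolvability, and a resolving mechanism can be sketched as follows. This is a minimal illustration only: the vocabulary terms, the two resolvability classes, and the function names are assumptions made for the example and are not taken from any of the cited systems.

```python
from dataclasses import dataclass
from enum import Enum


class Resolvability(Enum):
    AUTOMATIC = "automatic"  # resolvable from the declared mappings alone
    MANUAL = "manual"        # needs a human-defined mapping


@dataclass(frozen=True)
class Mismatch:
    source_term: str
    resolvability: Resolvability


def classify(source_terms, mappings):
    """Classify each source term by whether a declared mapping resolves it."""
    result = []
    for term in source_terms:
        kind = Resolvability.AUTOMATIC if term in mappings else Resolvability.MANUAL
        result.append(Mismatch(term, kind))
    return result


def mediate(record, mappings):
    """Translate the resolvable fields; return the rest for human review."""
    mediated, unresolved = {}, []
    for mismatch in classify(record.keys(), mappings):
        if mismatch.resolvability is Resolvability.AUTOMATIC:
            mediated[mappings[mismatch.source_term]] = record[mismatch.source_term]
        else:
            unresolved.append(mismatch.source_term)
    return mediated, unresolved


# Hypothetical mapping between two vocabularies.
MAPPINGS = {"surname": "family_name", "zip": "postal_code"}
mediated, unresolved = mediate(
    {"surname": "Smith", "zip": "1010", "title": "Dr"}, MAPPINGS
)
```

Here the field `title` has no declared mapping, so it is reported back instead of being guessed, which mirrors the semi-automatic character of most mediation technologies.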
In the Web Service Modeling Framework (WSMF) [Fensel and Bussler, 2002], which
represents the basis of DIP, and which serves as the conceptual foundation of Web
Service Modeling Ontology (WSMO) [Roman et al., 2004], three levels of mediation
are differentiated that are needed for Semantic Web Services. These levels are:
- Data Level: establishes interoperability between heterogeneous data sources. As
DIP uses ontologies for data representation, special attention should be given to
ontology mediation.
- Process Level: establishes interoperability between heterogeneous processes. In
the external visible behaviour of Web Services that ought to interact, there are
some mismatches that may occur (for example mismatching regarding the
sequence of activities). These have to be resolved in order to make the processes
interoperable.
- Protocol Level: establishes interoperability between resources that request
and/or use heterogeneous messaging patterns or messaging sequences (see footnote 1).
It is important to note that for automated handling of Semantic Web Services, they have
to be mediated on all three levels.
The objective of DIP Work Package (WP) 5 is to specify and develop the DIP
Mediation Component that is required to handle all mediation aspects within the DIP
architecture (see [Altenhofen et al., 2004]). The challenges for WP 5 are to specify the
architecture and functionality of the DIP Mediation Component. The component has to be usable
for the resolution of mismatches between different DIP components, and it should be
possible to invoke adequate mediation facilities within the DIP architecture.
In fact, the DIP Mediation Component poses two major challenges: firstly, the
implementation of a mediation-oriented architecture in accordance with the concept of
mediators and their usage in modern system architectures as outlined in [Wiederhold,
1992] and, secondly, the development of suitable mediation facilities, that is,
mechanisms that actually resolve mismatches between possibly heterogeneous
resources at the three distinct levels of mediation identified above. The mechanisms to
be developed within WP 5 have to support the representation languages and ontological
structures of the different components in DIP, and they should apply and extend
existing mediation techniques in order to provide high quality concerning the
resolvability of possible mismatches.
The aim of this deliverable is to provide an introduction to the field of mediation, and
to analyse existing technologies with respect to their applicability for the DIP Mediation
Component, resulting in a requirements analysis for WP 5. The document will serve as
the basis of WP 5. To achieve this, the document covers the following aspects: for the
requirements analysis of the DIP Mediation Component architecture, on the one hand
existing architectures should be taken into consideration (see the survey in DIP D5.1a),
and, on the other, the interoperability and usability within the overall DIP architecture
have to be ensured.
Footnote 1: There have been terminological dissimilarities between the naming of the different levels of mediation.
While WSMF understands protocol level mediation to be concerned with messaging sequences, another
interpretation is that protocol level mediation is concerned with different communication protocols, for
example HTTP, SOAP, FIPA, and so on. The latter is the understanding that underlies the DIP WP 5
structure: here, process level mediation is understood to deal with heterogeneities between business
processes (the workflow from an application domain point of view), and with the external visible
behaviour of services along with possible mismatches in messaging sequences. With protocol level
mediation, DIP WP 5 refers to the aspects of the communication protocol technology used.
The WP 5 consortium agreed that establishing interoperability between the external visible behaviours
and the messaging patterns of Web Services is a fundamental challenge to be addressed in the DIP
project. Thus, the Protocol Mediation Level has been included in the Process Level aspects in this WP 5
(at least at this point in time).
Regarding the mediation facilities to be developed in WP 5, the objective is to create
high quality mediation facilities that extend existing approaches and techniques for the
different levels of mediation. Therefore, we have to exhaustively investigate existing
mediation technologies and systems, whereupon a reasonable requirements analysis and
specification of the mediation facilities can be derived. For mediation within Semantic
Web Services, the primary interest is in data level mediation and process level
mediation.
Protocol level mediation is covered by the fact that SOAP-based communication
protocols are used for messaging within Semantic Web Services, thus no mismatches
are expected on this level. (At least, protocol level mediation does not have the same priority
as data level and process level mediation, with regard to the distinction explained in
footnote 1.) Thus, in the following we concentrate the state-of-the-art and requirements
analysis on these two areas of mediation. The requirements analysis covers
the architecture of the DIP Mediation Component as well as the
development of mediation facilities for the data and the process level.
This document is structured as follows: Chapter 2 reports on the state-of-the-art in
mediation facilities and technologies for the data and the process level, illustrating
current techniques and existing mediation systems; Chapter 3 provides the requirements
analysis for the DIP Mediation Component; Chapter 4 concludes the document.
2 STATE-OF-THE-ART ANALYSIS
As mentioned in the previous chapter (Chapter 1), mediation in Semantic Web Services
can take place at three different levels: data, process, and protocol levels. However, in
our report we concentrate only on data (information) and process mediation.
The relationship between data and information has to be clearly stated here before
continuing with this report. Any facts handled on the Web are considered to be data, but
they remain meaningless if interpreted out-of-context; data only becomes meaningful
information if it makes sense in some context and has some sense for humans. As a
consequence, the mediation of pure data can be done only on a syntactic level. To
obtain meaningful mediated data, the semantic aspects must be considered, and the
mediation must be based on the information inferred from the data. So the mediation of
data actually implies the mediation of information and requires semantic mapping
capabilities, as well as specialized mapping and integration techniques for a specific
application context.
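The difference between mediating pure data and mediating information can be illustrated with a small sketch. The field names, the context dictionaries, and the choice of a temperature reading are assumptions made for this example; the point is only that the same number needs context before it can be translated meaningfully.

```python
def syntactic_mediation(record):
    """Rename the field only; the number is copied without interpretation."""
    return {"temperature": record["temp"]}


def semantic_mediation(record, source_context, target_context):
    """Use context (here: the unit) to convert the value as well as the name."""
    value = record["temp"]
    if source_context["unit"] == "celsius" and target_context["unit"] == "fahrenheit":
        value = value * 9 / 5 + 32
    return {"temperature": value}


reading = {"temp": 30}
as_data = syntactic_mediation(reading)
as_information = semantic_mediation(
    reading, {"unit": "celsius"}, {"unit": "fahrenheit"}
)
```

The syntactic variant produces a well-formed target record whose value is wrong for a consumer expecting Fahrenheit, whereas the semantic variant uses the context to produce a value the consumer can interpret correctly.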
While data and information are static structures, when mediating processes we have to
take into consideration the dynamic aspects: the order of the activities, the transactions
that may occur, and the conditions for transitions. In other words, we have to consider
the interpretation of goals, as well as workflow and flexible Web Service invocation.
Considering this difference between static and dynamic, we present the current state-of-the-art in data and information mediation separately from the state-of-the-art in process
mediation.
2.1 Mediation of Data and Information
In the last twelve years, there has been intense research activity in this field, resulting in
the development of many mediation techniques. Work in this area has followed two different
directions that strongly marked the resulting solutions: the
industrial and the research directions.
In the industrial area, the main focus was the rapid development of mediation systems
appropriate for particular needs. The industrial systems are application oriented,
offering simple solutions with a low risk factor, based on technologies backed by years of
experience.
In contrast, research activity is strongly concentrated on finding new and innovative
solutions to improve the quality of results and reduce the effort of the human user,
aiming to semi-automate mediation systems. One of the most daring approaches is the
consideration of semantics as an indispensable factor in mediation solutions.
Together with already well-refined (as much as possible) syntactical techniques, this
approach is intended to play a crucial role in the emergence of Semantic Web Services.
Unfortunately, both the research and industrial approaches still rely on human user
input.
In the following sections, we first provide a short rationale for the use of data and
information mediation in Semantic Web Services, and then we present the current state-of-the-art in both the research and industrial fields.
2.1.1 Importance of Data and Information Mediation in Semantic Web Services
The main reason for the use of Web Services is to provide a standard means of strongly
decoupled interoperation between different software applications, running on a variety
of platforms [Booth et al., 2004]. The purpose of adding semantics to the Web and to
Web Services is to define meanings that enable computers to operate in a more
appropriate manner on the information they manage.
The process of engaging a Web Service (or a Semantic Web Service), consists of the
following steps [Booth et al., 2004]:
1. the requester and provider entities become known to each other;
2. the requester and provider entities agree on the service description and semantics that
will govern the interaction between the requester and provider agents;
3. the service description and semantics are realized by the requester and provider
agents;
4. the requester and provider agents interact by exchanging messages.
These steps are illustrated in the following figure:
Figure 1: Engaging a Web Service (Source: [Booth et al., 2004])
A misunderstanding between the service requester and service provider can appear
during either step 1 or 2, because the two entities involved in the process can
use different data sources. This requires mediation at the data and information
level to facilitate the communication between the requester and the provider of a
service.
2.1.2 Industrial State-Of-The-Art
The current state-of-the-art in industrial applications is a central integration server,
which intercepts messages between different systems and translates each message from a
given source format into the necessary target format. The transformations that
perform these translations are static scripts that are executed by the integration server.
The decision as to which script to execute is taken either at design time by the
system designer or at run-time based on predefined rules. Examples of such
systems are MS BizTalk [MS BizTalk, 2004], Seeburger Business Integration Server
[SBIS, 2004], and SAP Exchange Infrastructure [SAP XI, 2004].
Most current integration servers also allow the dynamic routing of messages at run-time based on the message content. This enables some dynamic behaviour of the system,
as the target of a given message can be determined at run-time and does not need to be
defined at design time (for example, in SAP Master Data Management).
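The combination of a static transformation script with content-based routing can be sketched in a few lines. This is an illustrative toy, not the interface of any of the products named above: the message fields, the rule table, and the target names are all hypothetical.

```python
def order_to_invoice(message):
    """Static transformation script: source format -> target format."""
    return {"invoice_for": message["order_id"], "amount": message["total"]}


# Predefined routing rules: (predicate on message content, target system).
ROUTING_RULES = [
    (lambda m: m["total"] >= 1000, "manual_approval"),
    (lambda m: True, "billing"),  # default target
]


def route(message):
    """Return the first target whose rule matches the message content."""
    for predicate, target in ROUTING_RULES:
        if predicate(message):
            return target
    raise LookupError("no routing rule matched")


def integration_server(message):
    """Intercept a message, determine its target at run-time, translate it."""
    return route(message), order_to_invoice(message)


small = integration_server({"order_id": "A1", "total": 50})
large = integration_server({"order_id": "B2", "total": 2500})
```

The transformation itself is fixed at design time, while the target is chosen per message, which is the limited form of dynamic behaviour the paragraph describes.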
The creation of a single view over multiple databases is also a problem that is addressed
by several products.
The points of interest in the industrial state-of-the-art are how the necessary
transformations are created and how these transformations are executed. The rest of the
mediation process depends strongly on the implementation and is not relevant for
this state-of-the-art analysis.
1. Creation of Transformation
The creation of transformations in all existing solutions relies heavily on input from a domain
expert; that is, it is semi-automatic or entirely manual. Two very different
approaches to the creation of transformations exist: they are created either using a
graphical tool or by directly programming the transformations in some kind of
scripting language. Both approaches have advantages and disadvantages. For
example, graphical tools become very confusing for large message schemas, while direct
programming gives no feedback on which elements have already been treated or
which elements may have been forgotten. Some companies, for example Seeburger,
combine the two approaches in an IDE for creating transformations. This enables
the user to choose the approach that is most suitable for the given problem.
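The missing feedback in hand-written transformations can be approximated by a simple coverage check over the two schemas. The sketch below assumes flat schemas given as lists of field names and a mapping given as a dictionary; the function name and all field names are hypothetical.

```python
def mapping_coverage(source_fields, target_fields, mapping):
    """Report which schema elements a mapping treats and which it misses."""
    return {
        "unmapped_source": sorted(set(source_fields) - set(mapping)),
        "unfilled_target": sorted(set(target_fields) - set(mapping.values())),
    }


report = mapping_coverage(
    source_fields=["id", "name", "street", "city"],
    target_fields=["customer_id", "full_name", "address"],
    mapping={"id": "customer_id", "name": "full_name"},
)
```

A check of this kind flags `street` and `city` as untreated source elements and `address` as a target element that nothing fills, which is the feedback a graphical tool gives implicitly.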
2. Execution of Transformation
There are two basic approaches for the execution of the transformation. One is to
interpret the scripting language used to define the transformation at run-time. As an
example, one could use XSLT [W3C, XSLT, 1999], a language for transforming XML
documents into other XML documents, to program a transformation and
use a standard XSLT processor to execute it. The second approach is to compile the
transformation into an executable program, for example, a Java class, and to simply call
this program at run-time.
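The two execution approaches can be contrasted on a toy transformation description. The sketch below is an analogy in plain Python rather than XSLT or Java: the `SPEC` format, the field names, and both function names are assumptions made for the example. `interpret` walks the description for every message, while `compile_spec` generates an executable function once and reuses it.

```python
# Declarative transformation: target field -> (source field, converter).
SPEC = {"customer_id": ("id", int), "full_name": ("name", str.strip)}


def interpret(spec, message):
    """Approach 1: walk the transformation description for every message."""
    return {
        target: convert(message[source])
        for target, (source, convert) in spec.items()
    }


def compile_spec(spec):
    """Approach 2: generate an executable function once, call it per message."""
    lines = ["def transform(message, converters):", "    return {"]
    converters = {}
    for target, (source, _convert) in spec.items():
        converters[target] = _convert
        lines.append(
            f"        {target!r}: converters[{target!r}](message[{source!r}]),"
        )
    lines.append("    }")
    namespace = {}
    exec("\n".join(lines), namespace)  # build the function from generated source
    transform = namespace["transform"]
    return lambda message: transform(message, converters)


message = {"id": "42", "name": "  Ada Lovelace "}
interpreted = interpret(SPEC, message)
compiled = compile_spec(SPEC)(message)
```

Both variants produce the same result; the compiled variant pays the cost of analysing the description once instead of once per message.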
2.1.2.1 Approaches and Projects
A broad survey is provided as a separate sub-deliverable, D5.1a “Survey of Industrial
Data Integration Systems”. It is provided separately because of its size and complex
internal structure. The survey provides an introduction to industrial data integration
systems and an overview of the systems provided by a few of the leaders in database
management and enterprise application integration, namely ORACLE, IBM, Microsoft,
WebLogic, and Cape Clear. Sub-deliverable D5.1a also presents the current
“paradigms” and structuring of the data integration area without a bias towards Web
Services, ontologies, or the Semantic Web, because most of the
experience, industrial approaches, and technologies used for data mediation and
integration are non-semantic.
In order to point out some of the interesting features of existing systems, we now
describe SAP Exchange Infrastructure, Seeburger Business Integration Server, and
Microsoft BizTalk Server.
The SAP Exchange Infrastructure [SAP XI, 2004] is a system that enables the
integration of different enterprise applications on different platforms. It offers a runtime infrastructure for message exchange, configuration options for managing message
flows and business processes, and support for the creation of the necessary message
transformations. An overview of the architecture of SAP XI is shown in Figure 2.
Figure 2: SAP Exchange Infrastructure Overview4
The SAP Exchange Infrastructure consists of three main parts: the Integration
Repository, the Integration Directory and the Integration Server. The Integration
Repository is used to capture all the information available during design time about an
integration project. This information consists of interface descriptions, components,
mappings, and business processes. In addition to this information, the Integration
Directory contains configuration information, such as information about the
system landscape or business partners. The central component of SAP XI is the
Integration Server, the communication engine for messages sent
between different systems. The Integration Server is responsible for the routing and
mapping of messages. At run-time it uses the information stored in the Integration
Directory to perform these tasks dynamically, based on the content of a message.
4
Source: [SAP XI, 2004]
SAP XI is built upon the Java 2 Enterprise Edition platform. It supports a large
number of open standards in order to ensure wider interoperability. Examples of
supported standards include WSDL for the description of interfaces as well as the
SOAP with Attachments specification and XML on which the communication is built.
The Integration Server is currently capable of executing two types of transformations,
XSLT scripts and Java classes. XSLT is supported only for enabling the integration of
pre-existing mappings. Therefore, SAP XI offers no support for creating XSLT
mappings. During run-time the appropriate mappings are dynamically selected from the
Integration Directory based on the message header or contents using user-defined rules.
For the creation of Java mapping programs, SAP XI offers a graphical tool that is very
similar to many other products in this area. A user creates a mapping by graphically
connecting elements of the schema of the source message with elements of the target
message schema. In order to support more complexity than just one-to-one element
mapping, the tool offers the possibility of inserting arbitrary functions into the data
flow. The tool offers a large number of predefined functions and the possibility of
creating new user-defined ones if needed.
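The idea of inserting functions into the data flow between source and target elements can be sketched as follows. This is a minimal Python illustration, not the SAP XI tool or its API: the field names and the functions `concat` and `to_cents` are hypothetical.

```python
# Hypothetical source message, flattened to field -> value.
source = {"firstName": "Ada", "lastName": "Lovelace", "amount": "12.50"}


# Functions that can be inserted into the data flow (illustrative names).
def concat(first, last):
    return f"{first} {last}"


def to_cents(amount):
    return round(float(amount) * 100)


# Each target element is fed by a function applied to one or more source
# elements. A one-to-one mapping is the special case of the identity function.
MAPPING = {
    "fullName": (concat, ["firstName", "lastName"]),
    "amountInCents": (to_cents, ["amount"]),
    "surname": (lambda value: value, ["lastName"]),
}


def apply_mapping(mapping, message):
    """Evaluate every target element from its connected source elements."""
    return {
        target: function(*(message[name] for name in inputs))
        for target, (function, inputs) in mapping.items()
    }


target = apply_mapping(MAPPING, source)
```

A user-defined function is then simply another entry that can be referenced in `MAPPING`.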
Figure 3 shows a screen shot of the SAP XI Mapping Tool.
Figure 3: The SAP XI Mapping Tool5
Another industrial product is the Seeburger Business Integration Server [SBIS, 2004],
which is an integration engine for B2B integration. An overview of the architecture is
shown in Figure 4.
The core of BIS is the so-called Workflow Engine. This engine controls all integration
processes that are handled by BIS. This engine is connected to different Event Sources
that can trigger the execution of a workflow. Events can be triggered by different
sources, for example files, databases, or messages. Another possible source for events is
the Web Services interface: BIS is capable of offering Web Services to other systems and of using
requests to such a Web Service to trigger the execution of a workflow. The connection of BIS
to the legacy systems is achieved through the Components. The available Components
include Connectors to standard ERP or eBusiness solutions, Communication
Components to enable communication with external partners, Converters that enable the
easy conversion between different communication standards, and further components
such as security components that enable secure communication.
While the overall architecture of BIS is very similar to the architecture of most
industrial integration systems, there are two special features of BIS. Firstly, it contains a
large number of converters and connectors. There exist transformations to convert
among all major communication standards and connectors to connect with other
available business systems. This makes it easy to solve a large number of integration
problems, as the development can focus on the integration of workflows and does
not need to be concerned with the creation of transformations.
5
Source: [SAP XI, 2004]
Figure 4: Overview of the Seeburger BIS6
The second special feature that we want to highlight is the tool for generating
transformations, the so-called Mapping Designer. This Mapping Designer is a very
advanced tool, and can be seen as an integrated development environment for
transformations. Whereas most other tools let the user create transformations with either a graphical
interface or a scripting language, the Mapping Designer uses both approaches in an
integrated fashion. The user can either write a transformation script in a special scripting
language and immediately see a graphical representation of the created transformation
or create transformations graphically with the tool creating the resulting script. This
allows a development process similar to those supported, for example, by well-known
UML tools.
The Mapping Designer also integrates debugging support in order to support the
complete development process.
The last system described here is Microsoft BizTalk Server [MS BizTalk, 2004], which
enables the connection of diverse applications using a graphical user interface to create
and modify business processes that use services from those applications. In order to do
this, the Microsoft BizTalk Server engine must provide a way to specify the business
processes and a mechanism for communication between the applications that the
business process uses.
6
Source: [SBIS, 2004]
The main components of BizTalk Server 2004 are send and receive adapters, send and
receive pipelines, orchestrations, the BizTalk Server message box, and the business
rules engine (Figure 5).
Figure 5: BizTalk Server Architecture7
Figure 5 also illustrates how a message is processed:
1. A message is received through a receive adapter, and then processed through a
receive pipeline (this processing can include converting the message from its
native format into an XML document and validating its digital signature).
2. The message is delivered to the so-called MessageBox database, a database that
uses Microsoft SQL Server.
3. The message is dispatched to its target orchestration, which takes whatever
action the business process requires. The result is usually another message,
which is also saved in the MessageBox database.
4. The new message is processed by a send pipeline (the processing can include its
conversion from the internal XML format used by BizTalk Server 2004 to the
format required by its destination and the addition of a digital signature). The
message is then passed to the send adapter.
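The four steps above can be sketched as a chain of functions. The following Python fragment is only an illustration of the message flow and does not use any BizTalk Server API: a dictionary stands in for the internal XML document, a list stands in for the MessageBox database, and all function names and message formats are assumptions.

```python
import json

message_box = []  # stands in for the MessageBox database


def receive_pipeline(raw):
    """Step 1: convert the native format ('key=value;...') to the internal form."""
    return dict(pair.split("=") for pair in raw.split(";"))


def orchestration(message):
    """Step 3: the business process; here it simply confirms an order."""
    return {"orderId": message["id"], "status": "confirmed"}


def send_pipeline(message):
    """Step 4: convert the internal form to the format the destination expects."""
    return json.dumps(message, sort_keys=True)


def process(raw):
    inbound = receive_pipeline(raw)    # step 1: receive adapter and pipeline
    message_box.append(inbound)        # step 2: store the message
    outbound = orchestration(inbound)  # step 3: dispatch to the orchestration
    message_box.append(outbound)       # the resulting message is stored as well
    return send_pipeline(outbound)     # step 4: result goes to the send adapter


sent = process("id=42;item=widget")
```

Validation and digital signatures, which the pipelines may also perform, are left out of the sketch.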
BizTalk Server 2004 is built completely around the Microsoft .NET Framework and
Microsoft Visual Studio .NET. It also has native support for communicating by using
Web Services, along with the ability to import and export business processes described
in Business Process Execution Language.
7
Source: [MS BizTalk, 2004]
2.1.2.2 Conclusions
The mediation systems described in this section are, like most (or all) industrial
mediation systems, application oriented and tailored to particular needs. In industry,
the major challenge is not to obtain a general solution that is open to new and
innovative techniques, but to offer a simple solution with a low risk factor.
The first impression may be that these approaches do not raise any challenges, and that
they are not appropriate for mediation in the context of Semantic Web Services.
However, they should not be ignored: when developing a new mediation system, the
option of improving one of the existing industrial mediators should be considered,
since these are robust and reliable.
2.1.3 Research State-of-the-Art
The main focus of the research activity in data and information mediation is the
development of approaches and systems that require minimal human
intervention. The ideal scenario would be no requirement for human input at
all, but this remains an unsolved problem.
The large number of projects on the mediation of data and information
prevents us from enumerating or describing all of them. Instead, we provide a
short list of possible classifications for these projects and illustrate each approach
with a project based on it.
The classification criteria we chose are based on:
A) Scope of Mediation
B) Classes of application
C) Approaches in constructing the mediator8
2.1.3.1 Classification Based on the Scope of Mediation
Based on the scope of mediation, three strategies for ontology mediation are
distinguished: ontology mapping, ontology merging and ontology alignment [Noy and
Musen, 2000].
Ontology Mapping
In so-called ontology mapping, rules are defined to enable interoperability between two
or more ontologies. The rules and the source ontologies are kept separate after
integration. The advantage of this approach is that the mapping rules, once
defined, can be reused as many times as needed; a mapping rule must be rewritten only
when one of the ontologies changes.
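The separation of mapping rules from the ontologies can be illustrated with a small Python sketch. The concept and attribute names, the rule format and the function `map_instance` are hypothetical; the point is only that the rules are stored outside both ontologies and can be applied to any number of instances.

```python
# Hypothetical mapping rules between a source and a target ontology.
# They are kept apart from both ontologies and only change when one of
# the ontologies changes.
RULES = {
    "concept": {"Person": "Human"},
    "attribute": {"surname": "familyName", "birthYear": "yearOfBirth"},
}


def map_instance(instance, rules):
    """Express an instance of the source ontology in terms of the target ontology."""
    return {
        "concept": rules["concept"][instance["concept"]],
        "attributes": {
            rules["attribute"][name]: value
            for name, value in instance["attributes"].items()
        },
    }


instances = [
    {"concept": "Person", "attributes": {"surname": "Noy", "birthYear": 1970}},
    {"concept": "Person", "attributes": {"surname": "Musen", "birthYear": 1955}},
]

# The same rules are reused for every instance.
mapped = [map_instance(instance, RULES) for instance in instances]
```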
A project that implements this approach is the COIN (COntext INterchange) [Goh et al.,
1999] project, which implements an architecture for semantic interoperability.
The COIN Framework consists of three main components: the domain model, the elevation
axioms and the context axioms. The domain model defines the application domain in terms of
primitive types and semantic types. The elevation axioms identify the correspondences
between the attributes of the source ontology and the types defined in the domain
model. The last component, the context axioms, corresponds to the named contexts
associated with different sources, providing the semantics of data in terms of the values
assigned to semantic objects. Associated with a source ontology, the context axioms
provide the articulation of the data semantics.
Articulating data on the basis of a domain model (ontology) and relating the data to that
domain model are important aspects of the COIN architecture. However, the domain
model is closer to a conceptual schema than to an ontology. In an example in [Goh et al.,
1999], "money amount" is considered a subtype of "semantic number",
while number is only a primitive type for representing the value; similarly, "currency type" is a
subtype of "semantic string". However, according to [Gruber, 1993] the definition of an
ontology is based on the conceptualizations of the people in a community. From that perspective,
"money amount" is an amount or a quantity. Treating "semantic number" as a supertype
reflects the influence of application development, whereas "money amount" and
"currency type" are related to a value of type number or string, respectively, only for
representation purposes.
8
These classification criteria are not disjoint. A project can be classified using any (or all) of these
criteria.
Ontology Merging
In the ontology merging approach the two source ontologies are united into a single
ontology that comprises all the information of the source ontologies [Noy and Musen,
2000]. The algorithm used for merging the ontologies should be able to eliminate any
possible duplicates or inconsistencies that can occur (the original ontologies may cover
similar or overlapping domains which implies that some concepts may be defined in
both of them, not always in a similar or even consistent manner).
The KRAFT project [Visser et al., 1999] implements the ontology merging approach. It is
a project for the integration of heterogeneous information, using ontologies to resolve
semantic problems. The approach is to extract the vocabulary of the community and
the definitions of terms from documents existing in an application domain. KRAFT uses
a shared ontology [Jones, 1998] as a basis for mapping between ontology definitions and
communication between agents. In [Visser et al., 1999], the architecture is "chosen to
make shared ontology as expressive as the 'union' of the ontologies". However, the
definition of the union of ontologies and its similarities or differences with shared
ontology is not stated. KRAFT detects a set of ontology mismatches (as described in
[Visser et al., 1999]) and establishes mappings between the shared ontology and local
ontologies.
Ontology Alignment
The alignment of the ontologies is accomplished by establishing links between them. A
consequence of the alignment is that the two ontologies can reuse information from one
another.
The ontology alignment approach is applied in Observer [Mena et al., 2000], which
uses ontologies to allow queries against heterogeneous sources. It replaces terms in user
queries with suitable terms in target ontologies, by means of Inter-Ontology relations.
Observer uses description logic as both ontology definition language and query
language.
There are three steps in processing a query: query construction, access to underlying
data, and controlled query expansion to new ontologies. The first step, query
construction, needs human intervention in selecting the user ontology (which contains
information about the semantics of the query) and in editing the query. The execution of
the query is performed in the second step (access to underlying data), when the user
ontology is accessed. If the user is not satisfied with the query results, then other
ontologies containing related terms are visited (this being the third step, controlled
query expansion to new ontologies).
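The three steps can be illustrated with a short Python sketch. It is a strong simplification and does not reproduce the Observer implementation: ontologies are plain dictionaries, the inter-ontology relations are reduced to a synonym table, and all names and data are hypothetical.

```python
# Hypothetical data: each ontology maps its own terms to the records it can reach.
ONTOLOGIES = {
    "user": {"book": ["Semantic Web Primer"]},
    "library": {"publication": ["Ontology Matching"], "journal": ["JWS"]},
}

# Hypothetical inter-ontology relations: (ontology, term) -> synonym elsewhere.
SYNONYMS = {("user", "book"): ("library", "publication")}


def run_query(term, ontology):
    """Step 2: access the data underlying one ontology."""
    return list(ONTOLOGIES[ontology].get(term, []))


def expand(term, ontology):
    """Step 3: rewrite the term for another ontology and query it."""
    target = SYNONYMS.get((ontology, term))
    if target is None:
        return []
    target_ontology, target_term = target
    return run_query(target_term, target_ontology)


def answer(term, user_ontology, satisfied):
    # Step 1 (query construction) is the user's choice of term and ontology.
    results = run_query(term, user_ontology)
    if not satisfied(results):
        results += expand(term, user_ontology)
    return results


answers = answer("book", "user", satisfied=lambda found: len(found) >= 2)
```

The `satisfied` callback stands in for the user's judgement that decides whether the expansion step is needed.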
A graphical representation of these three approaches is shown in Figure 6.
[Figure 6 panel labels: Ontology Mapping; Ontology Alignment; Mapping Rules; "Ontology A is made compatible to ontology B"; Ontology Merging]
Figure 6: Ontology Integration Strategies
An important point to note here is that ontology merging and
ontology alignment cannot be considered totally independently from ontology mapping,
as mappings are still necessary for making the merging or the alignment possible. A
good example in this sense is the Observer project, which maps the query results
obtained by consulting remote ontologies with the results obtained by consulting the
user ontology.
The choice of one of these mediation strategies is determined by the application field. If
the only requirement is to express instances of one ontology in terms of the other, then
ontology mapping is the most appropriate technique. If it is necessary to have a set
of rules and links that permit the usage and the interoperation of two ontologies, the
ontology alignment approach is better suited. Finally, if the purpose is to obtain an
ontology containing information from two different sources (ontologies), then merging
them is the right solution.
2.1.3.2 Classification Based on the Classes of Application
Considering the classes of application, we can distinguish the following approaches
[Madhavan et al., 2002]: information integration and Semantic Web Services, ontology
merging, and data migration.
Information Integration and Semantic Web Services
The information integration and Semantic Web Services approach is appropriate when
many heterogeneous sources need to be used without explicitly referring to
each of them. The user just queries a mediated logical schema containing relevant
information for the application. [Wache and Fensel, 2000] proposed so-called
intelligent integration, which should allow the integration of a large variety of data sources,
should be based on semantics by means of ontologies, and should provide an
advanced query processor that includes facilities for the extraction of content, data
abstraction and a semantic-based query interface.
An example of a system that uses this approach is the IRS-II (Internet Reasoning
Service) [Motta et al., 2003] system. Since this system addresses mediation in the
context of Semantic Web Services, a more detailed description is provided than for the
previously presented systems.
The mediation component of IRS is called a Bridge (a type of adapter) and stands
within a framework for describing knowledge components. IRS bridges are not
explicitly modeled (as, for example, in Web Service Modeling Ontology [Roman et al.,
2004]), but they have specific roles, as discussed below.
The IRS-II [Motta et al., 2003] is a Semantic Web Services framework, which allows
applications to semantically describe and execute Web services.
IRS-II is based on the Unified Problem Solving Method Development Language
(UPML) framework [Omelayenko et al., 2003], which distinguishes between the
following categories of components specified by means of an appropriate ontology:
- Domain models: these describe the domain of an application, for example,
vehicles, a medical disease.
- Task models: these provide a generic description of the task to be solved,
specifying the input and output types, the goal to be achieved and applicable
preconditions.
- Problem Solving Methods (PSMs): these provide abstract, implementation-independent descriptions of reasoning processes that can be applied to solve
tasks in a specific domain.
- Bridges: these specify mappings between the different model components within
an application.
The IRS implementation of the UPML framework covers semantic mappings amongst
knowledge components and integration techniques for task-centred invocation of Web
Services. The publishing platforms of IRS-II facilitate the invocation of Semantic Web
Services by mediating between the server of semantic descriptions and the actual Web
service. The definitions of task, problem solving method (PSM) and bridge in IRS
correspond to the definitions of goal, web service description and mediators in WSMF
[Fensel and Bussler, 2002] since both approaches derive from the UPML framework.
The process of semantically describing services in IRS involves several mediation
activities: mapping generic tasks and PSMs to a domain model, mapping PSMs to tasks
or, in general, adapting existing resources. More specifically, in the UPML framework,
the knowledge components of a library can be described and connected together in
different running systems, through the creation of explicit mediating elements—
adapters. In particular, bridge adapters connect two kinds of components by way of
mapping relations between the ontologies of both components, such as:
- Task-Domain bridge
- PSM-Domain bridge
- Task-PSM bridge
IRS supports the direct acquisition of the value of an input role, according to the task
ontology. If the domain knowledge does not conform to the task ontology, the IRS
supports users in constructing a mapping relation between the task role and the
corresponding domain knowledge. A domain-task mapping relation defines the
transformation of a piece of domain knowledge or attributes into an instantiated input
role for the task; mappings may also be required for task outputs to conform to the
domain ontology.
The UPML description of the library may also include a set of PSM–task bridges. If not
already provided in the library, the IRS supports the creation of such bridges to map the
inputs and outputs of the described task to the ones of the selected PSM. IRS users
specify the domain entities that fill in the input–output roles of the PSM. Some of the
roles for the PSM are inherited from the configured task, through a corresponding
PSM–task bridge. In addition, the selected PSM may define supplemental roles. For
example, a PSM can define the notion of an abstractor, a function that computes
abstract types from raw data. The IRS supports the acquisition of domain-method
mapping knowledge in a way similar to the domain-task mapping during task
description.
The invocation process consists of running the Web Service associated with the PSM to
realize the specified task, with domain case data entered by the user. The IRS first
acquires case data from the user and instantiates the case inputs of the PSM by
interpreting the Task-Domain, Task–PSM and Domain–PSM mapping relations. The
IRS also checks the preconditions of the PSM and task on the mapped case data. The
IRS then runs the Web Service with the mapped inputs, by accessing the publishing
platform used for registering the service for the PSM. IRS uses the publishing platform
to retrieve knowledge about the location and type of PSM code. Finally, the IRS fills in
the domain outputs with the results of PSM execution, possibly transformed with
domain–PSM mapping relations defined at PSM description time.
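The invocation process described above can be summarised in a short Python sketch. It does not use the IRS API: the bridges are modelled as plain functions, the publishing platform as a dictionary, and the currency-conversion service and all names are hypothetical.

```python
def invoke(case_data, bridges, preconditions, locate_service):
    """Run the service registered for a PSM on user-supplied case data."""
    # 1. Instantiate the PSM inputs by interpreting the mapping relations.
    inputs = {role: mapper(case_data) for role, mapper in bridges["inputs"].items()}
    # 2. Check the preconditions on the mapped case data.
    if not all(check(inputs) for check in preconditions):
        raise ValueError("precondition failed")
    # 3. Ask the publishing platform for the service code and run it.
    service = locate_service("exchange-psm")
    result = service(**inputs)
    # 4. Fill in the domain outputs, transformed by the output mapping.
    return bridges["output"](result)


# Hypothetical publishing platform: PSM name -> executable service.
REGISTRY = {"exchange-psm": lambda amount, rate: amount * rate}

BRIDGES = {
    "inputs": {
        "amount": lambda domain: domain["price_eur"],
        "rate": lambda domain: domain["eur_usd"],
    },
    "output": lambda value: {"price_usd": round(value, 2)},
}

PRECONDITIONS = [lambda inputs: inputs["amount"] >= 0]

output = invoke({"price_eur": 10.0, "eur_usd": 1.2}, BRIDGES, PRECONDITIONS, REGISTRY.get)
```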
The IRS-Protégé implementation supports a structured methodology for mapping the
input–output roles of reasoning resources to relevant domain entities. The methodology
provides a typology of mapping-relation templates, that is, a mapping ontology, which
covers a wide range of mapping relations, from simple renaming mappings to complex
numerical or lexical transformations of entities.
Data Migration
The data migration approach is used for importing external data and then mapping,
merging or aligning them with internal application data (see Section 2.1.3.1 for more
details about these three approaches). The decision as to which of these three techniques
is most appropriate is again dictated by the application field, the main criterion being that
the mismatches between internal and external data should be covered as much as
possible.
The Clio project [Popa et al., 2002] is an example of a project that implements a data
migration approach using queries for ontology mapping. Clio is a high-level schema-mapping tool that guides the user to the mapping specification by using so-called
value correspondences. These value correspondences specify how the values of the
source attributes are mapped to values of the target attributes.
The entire process consists of two main steps: semantic translation and data translation.
The first step requires an understanding of the given value correspondences, which means that
the semantic mappings must be understood and converted to logical mappings, while in
the second step the logical mappings are transformed into low-level mappings, in this case
queries.
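The step from value correspondences to a query can be illustrated with the following Python sketch, which uses the standard `sqlite3` module. It is not Clio's algorithm: the table names, the correspondences and the function `to_query` are assumptions, and the semantic translation step is taken as already done.

```python
import sqlite3

# Hypothetical value correspondences: target attribute -> expression over
# source attributes.
CORRESPONDENCES = {
    "full_name": "first_name || ' ' || last_name",
    "salary_eur": "salary",
}


def to_query(correspondences, source_table, target_table):
    """Turn the logical mapping into a low-level mapping, here an SQL statement."""
    targets = ", ".join(correspondences)
    expressions = ", ".join(correspondences.values())
    return (
        f"INSERT INTO {target_table} ({targets}) "
        f"SELECT {expressions} FROM {source_table}"
    )


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employee (first_name TEXT, last_name TEXT, salary INTEGER)")
db.execute("CREATE TABLE staff (full_name TEXT, salary_eur INTEGER)")
db.execute("INSERT INTO employee VALUES ('Ada', 'Lovelace', 5000)")

query = to_query(CORRESPONDENCES, "employee", "staff")
db.execute(query)  # data translation: run the generated query
rows = db.execute("SELECT full_name, salary_eur FROM staff").fetchall()
```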
Ontology Merging
The ontology merging approach was described in the previous section (Classification
Based on the Scope of Mediation).
The three approaches illustrated in this section are not, by any means, disjoint. A
separation between them may be theoretically possible, but an actual implementation of
information integration or data migration is not possible without combining it with
ontology merging (or with another one of the techniques described in the previous
section).
2.1.3.3 Approaches in Constructing the Mediator
There are three main approaches to constructing a mediator: machine learning [Doan et
al., 2002], structure-based analysis (schema matching), and linguistic/lexical analysis [Rahm
and Bernstein, 2001].
Machine Learning
In the machine learning approach, the mapping rules are “learned” based on existing
examples of mappings. These mappings are usually constructed manually or semi-automatically (in which case the system makes mapping suggestions but input from a
domain expert is still needed). The larger the training set, the more accurate the
results obtained by using this approach.
As an example of a system that uses machine learning techniques to assist the
ontology mapping process, we describe here the Glue system [Doan et al., 2002].
By applying probabilistic definitions for similarity measures, Glue is able to find the
most similar concepts between two heterogeneous data sources. The architecture of the
system is shown in Figure 7.
Figure 7: GLUE Architecture9
The main elements of the architecture are the distribution estimator, the similarity
estimator and the relaxation labeler. The distribution estimator applies machine
learning techniques to compute the joint probability distribution between two concepts
belonging to two different taxonomies (the probability that the two concepts have the
same semantics). The similarity estimator applies a user-supplied function to this
probability distribution, obtaining a similarity factor for each pair of concepts. Considering
the entire taxonomies, these similarity factors form a similarity matrix that, together
with domain-specific constraints and heuristic knowledge, is used by the
relaxation labeler to obtain a mapping configuration.
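The role of the similarity estimator can be illustrated with a short Python sketch. The joint distributions below are invented numbers rather than output of a real distribution estimator, and the Jaccard coefficient is used only as one possible example of a user-supplied similarity function; the relaxation labeler is not covered.

```python
def jaccard(joint):
    """Example similarity function: P(A and B) / P(A or B)."""
    both = joint[(True, True)]
    either = both + joint[(True, False)] + joint[(False, True)]
    return both / either if either else 0.0


# Hypothetical joint distributions for pairs of concepts from two taxonomies.
# Keys are (instance belongs to A, instance belongs to B).
JOINT = {
    ("Professor", "Faculty"): {
        (True, True): 0.25, (True, False): 0.0,
        (False, True): 0.25, (False, False): 0.5,
    },
    ("Professor", "Course"): {
        (True, True): 0.0, (True, False): 0.25,
        (False, True): 0.25, (False, False): 0.5,
    },
}


def similarity_matrix(joint_distributions, similarity):
    """Apply the user-supplied function to every pair of concepts."""
    return {pair: similarity(joint) for pair, joint in joint_distributions.items()}


matrix = similarity_matrix(JOINT, jaccard)
```

Because the similarity function is passed in as a parameter, a different measure can be substituted without changing the rest of the sketch.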
Schema Matching and Linguistic/Lexical Analysis
In this case, the internal structure of the concepts is analyzed. Simultaneously,
heuristic functions based on linguistic similarities may be used (for example
consulting a dictionary or a thesaurus to find lexical relations between concept
names, such as synonymy or hyponymy).
The XMapper system10 is appropriate for illustrating the structure-based approach. It
was especially developed to create transformations between different XML message
formats. XMapper uses only instance information to create the transformations. Figure 8
shows the functionality of the XMapper system.
9
Source: [Doan et al., 2002]
10
http://citeseer.nj.nec.com/kurgan02semantic.html
To actually create a transformation XMapper first extracts a number of XML message
instances from each data source. From these instances, XMapper then extracts the
structure of the message as well as all XML elements of each message and a set of
possible values for each of these elements. In the next step a feature vector containing
22 features (like type, allowed values, length) is created for each XML element. Sixteen
elements of each feature vector are created by the constraint analysis and 6 elements by
the learning component, using an algorithm called DataSqueezer [Larson et al., 1989].
After a feature vector for each XML element has been created, a Distance Table is built
by calculating the distance between every two elements of the different sources. The
transformation can now easily be found by mapping the two elements of each source
that have the shortest distance.
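The last two steps, building the Distance Table and selecting the closest elements, can be sketched in Python. The sketch makes several assumptions that are not stated in the text: the feature vectors have three invented features instead of 22, the Euclidean distance is used as the distance measure, and the element names are hypothetical.

```python
import math

# Hypothetical three-feature vectors (the text describes 22 features per element).
SOURCE_A = {"custName": [1.0, 0.0, 20.0], "zip": [0.0, 1.0, 5.0]}
SOURCE_B = {"postcode": [0.0, 1.0, 6.0], "client": [1.0, 0.0, 18.0]}


def distance_table(a, b):
    """Distance between every element of source A and every element of source B."""
    return {
        (x, y): math.dist(vector_a, vector_b)
        for x, vector_a in a.items()
        for y, vector_b in b.items()
    }


def closest_pairs(a, b):
    """Map each element of A to the element of B with the shortest distance."""
    table = distance_table(a, b)
    return {x: min(b, key=lambda y: table[(x, y)]) for x in a}


mapping = closest_pairs(SOURCE_A, SOURCE_B)
```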
Figure 8: Functionality of the XMapper system11
As in the previous section (Classification Based on the Classes of Application), the
techniques illustrated here are often combined. The XMapper system actually uses both
of these approaches, by “learning” while it creates the 22-feature vectors.
11
Source: http://citeseer.nj.nec.com/kurgan02semantic.html
2.1.3.4 Conclusions
In this section (research state-of-the-art on data and information mediation – Section
2.1) we have described some of the existing research approaches to data and
information mediation. The projects presented were selected on the basis of the
classification criteria and the approaches identified; for each approach, we presented a
project that illustrates its particularities.
The vast number of research approaches and projects in this area leads to one
conclusion: the research is far from over, and better solutions may still be found. As
previously stated in this section, an attempt to implement one particular approach while
completely disregarding all the others is neither an optimal nor a feasible solution. The
best solution is probably to extract the most important features (from the
functionality point of view) of all these approaches and to combine them in order
to achieve the desired functionality.
2.2 Mediation of Processes
To analyse the state-of-the-art in process mediation, we first need a clear understanding
of the related concepts. Therefore, we will examine uses of processes in systems, paying
special attention to the usage of process technologies within Semantic Web Services.
Then, we point out where process mediation is needed and the specific requirements
that arise for different process mediation scenarios. This section investigates these
aspects. Starting with a general overview of process technologies, we point out the
application scenarios for process level mediation within Semantic Web Services, and
then investigate the existing technologies and approaches that can serve as a starting
point for development of the Process Level Mediation Module of the DIP Mediation
Component.
2.2.1 Processes and Process Technologies
This section gives a brief overview of process technologies and their use within
Semantic Web Services, and the resulting requirements for process level mediation
scenarios.
2.2.1.1 Usage of Process Technologies within Semantic Web Services
A process is a set of activities and transitions with conditions for transitions. Depending
on the specific process, its tasks could be a combination of services that stand for
queries, transactions, applications, and administrative activities. These services can be
distributed within or across enterprises and are coordinated by constraining control and
data among them. The services can themselves be composite, that is, implemented as
processes, thus introducing nested processes and recursive definition of processes.
Before explaining the technologies’ requirements and the challenges arising for process
technologies, we first describe the general building blocks of processes and their
definition [Bussler, 2003]:
- Activity/Action/State: a step in a process that can be resolved arbitrarily, that is,
either by a simple program or by a more complex one as well as by another process
(‘sub-process’ or ‘nested process’), or by a manual activity. Activities in a process
represent the basic building blocks of what is done or achieved in a process. With
regard to the level of abstraction represented by the process, activities are not split
into smaller building blocks.
- Transition: a transition is a conversion between activities. Transitions are realized
by conditions.
- Data Flow: process technology allows specifying, executing, and controlling
complex, multi-step information processing. Thus, the information to be processed
has to be passed through the building blocks of the process. Data flow is concerned
with real application data, in contrast to control flow, which deals with process
technology information. The duty of data flow technology for processes is to ensure
that each activity in a process receives the information it needs for execution.
- Control Flow: control information is needed in order to provide the means for
defining the nature of a process and for controlling its execution. Control flow
primitives can be distinguished as:
a. Process Logic Primitives:
Control flow elements for the specification of control flow structures
that can be combined into more complex algorithms. The most
common process logic description primitives are (naming in
accordance with BPEL4WS, see [Curbera et al., 2002]):
- sequence, for serial execution
- while, to implement a loop
- switch, for multiple way branching
- flow, for parallel execution
- pick, for choosing among alternative paths based on an external event
b. Execution Control Primitives:
Primitives for defining the execution handling of a process or its
activities. This group (optionally) contains primitives for:
- Timing: handling of timeouts, and so on during process execution
- Event Handling: support for event-driven execution of processes
- Interaction, that is interoperation between parties
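How activities, data flow and process logic primitives fit together can be illustrated with a small Python sketch. It is not BPEL4WS and not part of any process engine: activities are plain functions, the data flow is a dictionary passed from one activity to the next, and only `sequence`, `while` and `switch` are modelled. The primitives `flow` and `pick` are left out because they would require concurrency and external events. All names are hypothetical.

```python
def sequence(*steps):
    """Serial execution: run each step in order on the data passed along."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run


def while_(condition, body):
    """Loop: repeat the body as long as the condition holds."""
    def run(data):
        while condition(data):
            data = body(data)
        return data
    return run


def switch(*branches, otherwise):
    """Multi-way branching: run the first step whose condition holds."""
    def run(data):
        for condition, step in branches:
            if condition(data):
                return step(data)
        return otherwise(data)
    return run


# Activities: the basic building blocks, each receiving the data it needs.
def add_item(data):
    return {**data, "items": data["items"] + 1}


def approve(data):
    return {**data, "status": "approved"}


def reject(data):
    return {**data, "status": "rejected"}


process = sequence(
    while_(lambda data: data["items"] < 3, add_item),
    switch((lambda data: data["items"] >= 3, approve), otherwise=reject),
)

result = process({"items": 0})
```

Since each primitive returns a step of the same shape as an activity, a composed process can itself be used as an activity, which corresponds to the nested processes mentioned above.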
Adequate process technologies face a number of technical challenges. Firstly, they have
to support modeling processes and ensure correctness of execution with respect to the
model and to the constraints of the underlying services and their resources. Normal
execution of a process is easy when the process model specifies a partial order of the
activities in the process. Exception conditions can be more difficult to model and
handle. More importantly, because interesting business processes are often long running,
interactions among them are non-atomic, leading to the possibility that the information
they take as input can be subject to revision, causing their own results to be invalidated.
Exceptions and revisions are the main sources of complications in the modeling of a
process.
Secondly, a suitable process technology has to support interfacing the process with
underlying functionalities, that is, the resources that actually fulfil the distinct activities
in a process. Within database systems, this would require linkage to the concurrency
control and recovery mechanisms of a DBMS – within Semantic Web Services a
linkage to the execution control of Web Services is required.
A major use of process technologies is to allow the automation of business processes
within organizations. This area is commonly referred to as “workflow technologies”,
which are a special type of processes12. Numerous academic, industrial, and joint efforts
12
We apply the following understanding of process technologies and workflow technologies within this
document, according to [Bussler, 2003]: ‘process technology’ is the general notion for technologies for
work on specification requirements, overall frameworks, and software tools for
workflow technologies – the estimates range from 100 to 250 such efforts world-wide.
Within the area of Semantic Web Services, we identify the following purposes for the
use of process technologies:
Choreography. This takes the perspective of a process as being a set of message
exchanges between participants, that is, when a user (machine or human) interacts by
exchanging data and information via messages. The message exchanges are constrained
to occur in various sequences and may be grouped into various transactions. Thus,
within this application field, process technologies are needed to formally describe the
externally visible behaviour of Web Services, that is, the steps of the business process
that a Web Service shows to its user in order to allow the interaction and information
exchange needed to fulfil its service.
Orchestration. This takes the perspective of a process as a program or partial order of
operations that need to be executed. This view is logically centralized in that it views
the process from the perspective of one “orchestrating” engine. It is as if the process
specification is being executed under the control of or on behalf of a specific party. In
other words, the Orchestration of a Web Service A describes how other Web Services
(W1, .., Wn) are composed into the functionality of Web Service A.
Collaboration. This takes the perspective of a process as a collaboration involving
some business partners. The business partners not only send messages to one another
but also enter into business relationships such as contracts and obligations. They
generate flexible message exchanges depending on the evolving circumstances and their
local policies, for example for handling business exceptions.
Collaboration is emerging as a serious approach for carrying out large-scale business
processes. Automated collaboration describes the aim of Semantic Web Services
technologies in correspondence to the vision of the Semantic Web: several autonomous
Web Services shall be combinable and usable in a collaborative manner as components
for more complex, specific applications that support various kinds of functionality by
re-use and dynamic composition.
2.2.1.2 Need for Process Mediation
With respect to the usage scenarios of process technologies within Semantic Web
Services, the need for process mediation technologies that support handling and
resolving heterogeneities consequently emerges as a further technological challenge. For
example, if a production scheduling software system employs a different modeling
formalism than a purchase order processing software in a supply chain, then the given
enterprises’ cooperation may be adversely affected. Especially in open and
handling transactional and dynamic behaviours, while ‘workflow technology’ is a special kind of process
technology that is concerned with real-world business processes.
decentralized environments like the Internet, heterogeneity handling is a major issue in
system design.
Interoperability among processes, which clearly is an important need in practical
settings, requires some kind of translator among process models – this is what we
understand to be technologies for process level mediation within the context of WP 5 in
DIP. For investigating existing technologies, as well as for depicting the requirements
for the DIP Process Level Mediation Module, we have the following understanding of
process level mediation:
- The overall aim of process technologies within Semantic Web Services is to
allow automated Collaboration (see above) of Web Services with complex (that
is, multi-step, process-like) external behaviours.
- Process technologies for Choreography allow the use of complex Web Services
by a user or Service Requester (a more general term which can be a human,
another Web Service, or any other agent).
- Process technologies for Orchestration allow the composition of existing Web
Services into a more complex Web Service.
- The realization of automated Collaboration can be achieved by the combination
of suitable technologies for Choreography and Orchestration into a coherent
framework.
On the basis of these general requirements for process mediation technologies, we can
examine the needs arising for process mediation within Semantic Web Services,
whereby we focus on the notions of Choreography and Orchestration. Choreography
and Orchestration are explained in more detail below and the specific needs for process
level mediation are discussed13.
2.2.1.2.1. Choreography
The Choreography of a single Semantic Web Service describes the externally visible
behaviour of a Web Service as needed to use the Web Service, along with the
messaging sequence expected for its use, that is, the pattern of user-service interaction.
In other words, the Choreography of Web Services is described by the data flow,
control flow, and message exchange pattern that a Web Service makes visible.
Consequently, a Web Service only makes those aspects of its functionality externally
visible where it needs interaction with the user (for example, input or notification). On
the basis of such a behavioural description of a single Web Service, global interaction
models for several Web Services can be determined. The interaction is realized by
defining a message exchange between possibly several Web Services in accordance with
their individual Choreographies.
The challenge for process level mediation within Choreography is to establish a global
interaction model of Web Services that do not have compatible Choreographies a priori.
13
The definitions of the notions of Choreography and Orchestration are based on the definition provided in
[Singh and Huhns, 2004]. This also corresponds to the terminology definitions within the Web Service
Modeling Ontology WSMO [Roman et al., 2004].
Figure 9 shows the general structure of a Choreography of a single Web Service, with
further explanations on the need for process level mediation below.
Figure 9: Web Service Choreography - General Structure
As an example, we can imagine a Web Service for purchasing goods. It is a multi-step
service with the following sequential activities:
a. Select the goods to purchase.
b. Make an agreement between the seller (the owner/provider of the Web Service)
and the buyer (the Service Requester) on the purchase contract for the selected
goods and payment.
c. Deliver the goods.
In order to use this Web Service, the Web Service and the Service Requester have to
interact within the distinct activities for (1) choosing the goods; (2) offering and
accepting the contract; and (3) choosing the delivery method. The external behaviour of
the buyer in this interaction has to be compatible with the business process of the Web
Service.
This means that the buyer has to have facilities to select the goods, to accept a contract,
and so on, and (most relevant for process mediation) the sequence of the buyer's
activities has to be compatible with that of the seller. For example, the buyer cannot pay
after product delivery if the seller requires pre-delivery payment, because then each
process would be frozen waiting for the other.
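The payment example can be made concrete with a small compatibility check. The following Python sketch is an assumption-laden simplification, not a DIP or WSMO mechanism: each party is reduced to a fixed sequence of send and receive actions, sending never blocks, and a receive blocks until the named message has arrived. The message names and the function `compatible` are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Action = Tuple[str, str]  # ("send" or "recv", message name)


def compatible(a: List[Action], b: List[Action]) -> bool:
    """Return True if both sequential behaviours can run to completion together."""
    scripts = [a, b]
    pos = [0, 0]                    # index of the next action of each party
    inbox = [Counter(), Counter()]  # messages delivered but not yet consumed
    progress = True
    while progress:
        progress = False
        for me in (0, 1):
            other = 1 - me
            while pos[me] < len(scripts[me]):
                kind, msg = scripts[me][pos[me]]
                if kind == "send":
                    inbox[other][msg] += 1      # asynchronous send never blocks
                elif inbox[me][msg] > 0:
                    inbox[me][msg] -= 1         # receive consumes a waiting message
                else:
                    break                       # blocked: expected message not there
                pos[me] += 1
                progress = True
    finished = all(pos[i] == len(scripts[i]) for i in (0, 1))
    nothing_left = all(sum(box.values()) == 0 for box in inbox)
    return finished and nothing_left


seller = [("recv", "order"), ("send", "contract"), ("recv", "acceptance"),
          ("recv", "payment"), ("send", "goods")]
buyer_pays_first = [("send", "order"), ("recv", "contract"), ("send", "acceptance"),
                    ("send", "payment"), ("recv", "goods")]
buyer_pays_later = [("send", "order"), ("recv", "contract"), ("send", "acceptance"),
                    ("recv", "goods"), ("send", "payment")]

print(compatible(seller, buyer_pays_first))
print(compatible(seller, buyer_pays_later))
```

With the pre-paying buyer both parties finish. With the buyer who pays after delivery, the seller waits for the payment while the buyer waits for the goods, so neither can proceed and the check reports incompatibility.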
If we think of the buyer in this example as a Web Service rather than a human user, it
becomes clearer which process technologies Choreography will have to support, as well as what
is needed for process level mediation within Choreography.
The process technology has to be able to describe a business process, that is, a
workflow. This description contains only those activities of the internal functionality of
the Web Service in which user interaction is needed. For example, the Choreography for
the payment functionality contains the need for the user selection of the payment
method and a notification when the payment is achieved, but not a description of the steps
involved in how the payment is actually done. The activities in the Choreography
description comprise the messaging pattern needed for user service interaction for this
specific activity. In Figure 9, the blue arrows denote the process control flow, while the
black arrows denote the messages as the data flow between the interacting entities.
Also, the need for process level mediation within Choreographies becomes clearer. The
process level mediation technology has to ensure that the process structures of the
Choreographies of two Web Services that will interact are compatible, and that the
distinct activities of the Choreographies are compatible, within the definition of
compatibility as exemplified above. We understand this as the “determination of
Choreography compatibility”, in other words, establishing a suitable global interaction
model of Web Services in Choreographies that previously were not compatible.
Therefore, compatibility refers to the workflows of the distinct Web Services, that is,
the business processes from the application perspective (called Process Level within
WSMF), as well as the congruency of the messaging patterns defined in the
Choreographies of the individual services (called Protocol Level in WSMF). The
challenge of the Process Level Mediation Module in the DIP Mediation component is to
provide the means for resolving any mismatches that occur within these aspects, and to
provide (semi) automated support for resolving these mismatches.
2.2.1.2.2. Orchestration
The Orchestration of a Web Service A describes how other Web Services (W1 .. Wn) are
composed into the functionality of Web Service A. The task for process level mediation
within Orchestration is to determine the correctness and suitability of the composition
of Web Services (W1 .. Wn), and to resolve any disparities that might occur in the
composition. Figure 10 shows the general structure of a Web Service Orchestration,
with further explanations on the need for process level mediation below.
Figure 10: Web Service Orchestration - general structure
In Figure 10, the squares represent the activities of the business process of Web Service
A with respect to the functional decomposition of A. For each of these activities, there is
a request for an implementation that implements or realizes this functionality. The
different symbols inside the process activities show the realization of the activities:
Activity 1 is realized by invoking a single Web Service; for Activity 2, two Web
Services need to be composed; for Activity 3, a single Web Service is used which has a
complex (multi-step) Choreography – therefore the realization has to hold a compatible
Choreography. While these realizations are concerned with service usage (that is, service
interaction) and thus denote Choreography, the Orchestration of Web
Service A describes the decomposed functionality of the Web Service. Although it serves a
very different functional purpose, this is a process description similar to the behavioural
description in Choreography.
Orchestration describes the functional decomposition of a Web Service. On the one hand,
this decomposition is defined in the process description, that is, the single activities and
their process structure; on the other hand, the composition of the used Web Services
(W1, .., Wn) has to be executable. This requires the resolution of disparities between the particular
Web Services as well as between the distinct activities of the business process of the
Orchestration. So, the process level mediation for Orchestration has to provide the
means to combine other Web Services according to the functional needs defined in the
decomposition of a Web Service, and to resolve disparities inside the composition. In
conclusion, the need for process mediation within Orchestration is support for solving
mismatches in the composition of external Web Services (regarding data and control
flow), and the integration of several composed Web Services into the Orchestration of
another Web Service.
2.2.2 Technologies for Process Mediation
Having outlined the usage of process technologies within Semantic Web Services, as
well as the specific needs for process level mediation, we now investigate existing
approaches and technologies for mediation of processes. In order to provide a suitable
support for mediation of processes in DIP, the Process Level Mediation Module has to
support the representation for processes to be chosen within the DIP framework. To this
end, we briefly examine existing process technologies that are currently mentioned
within the field of Semantic Web Services as well as formalisms that support inference-based reasoning as the technology to be used for process mediation.
2.2.2.1 Existing Process Representation Technologies
After outlining the general understanding and usage of process technologies, we
commence an analysis of the state-of-the-art in process technologies. This is restricted
to existing process technologies for Semantic Web Services as the major field of interest
in DIP. We briefly summarize the most frequently mentioned approaches and point to
exhaustive surveys existing in literature, for example, [Solanki and Abela, 2003] and
[Peltz, 2003].
BPEL4WS
The Business Process Execution Language for Web Services (BPEL4WS) [Curbera et
al., 2002] is an approach for describing the behaviour of Web Services in a business
interaction. It specifies an XML-based grammar for the control logic that is used to
coordinate Web Services, and is thus to be considered a process technology for
Orchestration as defined above. BPEL4WS is based on industrial
initiatives for process description languages: XLANG14 developed by Microsoft and the
Web Service Flow Language WSFL15 developed at IBM. Thus, BPEL4WS combines
the features of block-structured process languages (XLANG) with those of graph-based
approaches (WSFL).
BPEL4WS provides a language for the formal specification of business processes (that
is, the control level), and business interaction protocols (that is, the Web Service
interaction level). It distinguishes two kinds of processes:
- Executable business processes model the actual behaviour of a participant in a
business interaction.
14
XLANG focused on the creation of business processes and the interactions between web service
providers. The specification provided support for sequential, parallel, and conditional process control
flow. It also included a robust exception handling facility, with support for long-running transactions
through compensation. XLANG used WSDL as a means to describe the service interface of a process. see
Microsoft Specification at: http://www.gotdotnet.com/team/xml_wsspecs/xlang-c/default.htm
15
WSFL was proposed to describe both public and private process flows [Leymann, 2001]. WSFL
defines a specific order of activities and data exchanges for a particular process. It defines both the
execution sequence and the mapping of each step in the flow to specific operations, referred to as flow
models and global models. The flow model represents the series of activities in the process, while the
global model binds each activity to a specific web service instance. A WSFL definition can also be
exposed with a WSDL interface, allowing for recursive decomposition. WSFL supports the handling of
exceptions but has no direct support for transactions.
- Business protocols use process descriptions that specify the mutually visible
message exchange behaviour of each of the parties involved in the protocol,
without revealing their internal behaviour. That is, the descriptions specify
interfaces. The process descriptions for business protocols are called abstract
processes and cannot be executed.
In BPEL4WS, a simple business process is layered on WSDL-defined Web Services.
The interaction model of WSDL is essentially a stateless client-server model of synchronous or
uncorrelated asynchronous interactions. However, BPEL4WS defines business processes
consisting of stateful, long-running interactions in which each interaction has a
beginning, a defined behavior and an end, modeled by a flow. This flow is composed of
a sequence of activities. The behavior context for each activity is provided by a scope.
A scope can provide fault handlers, event handlers, compensation handlers and a set of
data variables and correlation sets. Table 1 summarizes the functionalities of these
process modeling concepts:
Table 1: BPEL4WS Process Modeling Concepts
Activities: An action/step in a process. Activities can be combined via the
following connectors:
- Receive: message arrival handling
- Reply: answering a received message
- Invoke: invocation of a request-response operation on a portType offered by a partner
- Assign: for updating values in variables
- Throw: generates a fault inside the business process
- Wait: time-out wait
- Empty: insertion of an empty operation into a process
- Sequence: defines a collection of activities to be performed sequentially
- Switch: selects a branch of activities from a set of choices
- While: repetition of an activity until a certain success condition has been met
- Pick: blocks a process and waits for a suitable message to arrive
- Flow: specifies one or more activities to be performed concurrently
- Scope: defines a nested activity with its own associated variables, fault handlers
and compensation handlers
Variables: Variables allow specifying stateful interactions in a business process.
They provide the means for holding messages that constitute the state of a business
process. These messages can be either those that have been received from business
partners or those that are to be sent to the business partners. Variables can also
hold data which are needed for holding state related to the process and never
exchanged with partners. They are associated with a messageType, which
corresponds with a WSDL message type definition.
Correlations: Correlations deal with conversational and negotiation properties.
Business processes exchange information using messages in XML syntax. This
exchange of information can be enhanced by means of correlation. During its
lifetime, a business process typically holds one or more conversations with
partners involved in its work. Conversations may be based on sophisticated
communication infrastructure that correlates the messages involved in a
conversation using some form of conversation identity.
Event Handling: Each scope can be associated with a set of event handlers that are
invoked when a certain event occurs. The actions performed within an event handler
can range from simple to sequenced activities. In BPEL4WS there are two types
of events: alarms that go off after user-set times, and incoming messages
corresponding to a request/response or a one-way WSDL operation.
Fault Handling: Each scope can be associated with a set of custom fault-handling
activities. Each fault-handling activity is intended to handle a specific kind of fault. These faults can
result from a WSDL operation fault or a programmatic throw activity.
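To illustrate the correlation concept from Table 1, the following Python sketch routes incoming messages to conversations by a key built from selected message properties. It only mirrors the idea of a conversation identity; it is not BPEL4WS syntax, and the class `CorrelatingEngine`, the property names, and the message layout are assumptions made for the example.

```python
from typing import Any, Dict, List, Tuple


class CorrelatingEngine:
    """Groups incoming messages into conversations by a correlation key."""

    def __init__(self, correlation_properties: Tuple[str, ...]) -> None:
        # Names of the message properties that together identify a conversation.
        self.properties = correlation_properties
        # One message list per conversation, indexed by correlation key.
        self.conversations: Dict[Tuple[Any, ...], List[Dict[str, Any]]] = {}

    def key_of(self, message: Dict[str, Any]) -> Tuple[Any, ...]:
        """Extract the correlation key (the conversation identity) from a message."""
        return tuple(message[name] for name in self.properties)

    def deliver(self, message: Dict[str, Any]) -> Tuple[Any, ...]:
        """Append the message to its conversation, creating the conversation if new."""
        key = self.key_of(message)
        self.conversations.setdefault(key, []).append(message)
        return key


engine = CorrelatingEngine(("customer_id", "order_no"))
k1 = engine.deliver({"customer_id": "C7", "order_no": 1, "type": "purchaseOrder"})
k2 = engine.deliver({"customer_id": "C7", "order_no": 1, "type": "payment"})
k3 = engine.deliver({"customer_id": "C9", "order_no": 1, "type": "purchaseOrder"})

print(len(engine.conversations))
print([m["type"] for m in engine.conversations[k1]])
```

The purchase order and the payment from the same customer and order number end up in one conversation, while the order from the second customer opens a new one. This is what allows a stateful, long-running process instance to be found again on top of otherwise uncorrelated WSDL operations.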
Regarding the needs for Web Service Orchestration stated in the introduction,
BPEL4WS offers a modeling technique that covers the control level and the interaction
level in a suitable manner. A study that examines the expressivity of BPEL4WS in
terms of workflow and communication support is presented in [Wohed et al., 2003].
A major drawback of this language is that it does not explicitly rely on or incorporate a
formalized process representation, as is needed in order to compose Web Services
dynamically by intelligent mechanisms working on descriptive information. XLANG is
based on π-Calculus, a further development of the Process Algebra CCS (see Section
2.2.2.2). WSFL, the other basis of BPEL4WS, relies on statecharts, a technique for
describing complex transitions in finite state machines [Harel, 1987]. The approach and
the formal model underlying statecharts are comparable to those of the Process Algebra
CSP (see Section 2.2.2.2). These formal models are only implicitly present within
BPEL4WS, and there does not exist a direct mapping from BPEL4WS to π-Calculus
or any related formalism.
BPML
The Business Process Modeling Language [BPML, 2003] is a meta-language for the
modeling of business processes. In conjunction with WSCI (see below), BPML
provides a similar functionality as BPEL4WS. Therein, BPML provides a modeling
technique for the process control level and WSCI is designed for specifying interactions
between Web Services.
BPML provides an abstracted execution model for collaborative and transactional
business processes based on the concept of a transactional finite-state machine. It
consists of three parts: a Public Interface and two Private Implementations (one for each
partner). The Public Interface, which is common to the partners, is supported by
protocols such as ebXML16, RosettaNet17, and BizTalk18; the Private Implementations are
specific to each partner and can be described in any executable language. BPML
provides a BPML XML Schema as the general ontological structure for process
descriptions. Business processes are represented as the interleaving of control flow, data
flow, and event flow, while adding orthogonal design capabilities for business rules,
security roles, and transaction contexts. BPML offers explicit support for synchronous
and asynchronous distributed transactions, and therefore can be used as an execution
model for embedding existing applications within e-Business processes as process
components. Process specifications can also be loosely, BPML provides similar process
flow constructs and activities as BPEL4WS. Basic activities for sending, receiving, and
invoking services are available, along with structured activities that handle conditional
choices, sequential and parallel activities, joins, and looping. BPML also supports the
scheduling of tasks at specific times. Other features supported in BPML include
persistence, roles, instance correlation, and recursive decomposition, i.e. the ability to
compose sub-processes into a larger business process. The language has been designed
to manage long-lived processes, with persistence supported in a transparent manner.
In comparison to BPEL4WS, XML exchanges occur between the various participants,
with roles and partner components similar to the BPEL constructs. Both short and long
running transactions are supported, with compensation techniques used for more
complex transactions. BPML uses a scoping technique similar to BPEL4WS to manage
the compensation rules. It also provides the ability to nest processes and transactions, a
feature that BPEL currently does not provide. Also, a robust exception handling
mechanism is available within BPML, following many of the constructs in XLANG.
Timeout constraints can also be specified for specific activities defined within the
process.
The formal foundation of BPML is similar to that of BPEL4WS. There is no formal
process representation that explicitly supports BPML, but the constructs inherited from
XLANG (which has been applied as a foundation for BPML as well) still exist within
BPML. Nevertheless, no concrete formalization of BPML process descriptions exists.
WSCI / WS-CDL
The Web Service Choreography Interface (WSCI) [Arkin et al., 2002] is an XML-based
interface description language that describes the flow of messages exchanged by a Web
Service participating in choreographed interactions with other services, that is, Web
Service collaboration. It describes the dynamic interface of the Web Service
participating in a given message exchange by means of reusing the operations defined
for a static interface. WSCI works in conjunction with the Web Service Description
16
see: http://www.ebxml.org/
17
see: http://www.rosettanet.org/
18
see: http://www.microsoft.com/biztalk/
Language [W3C, WSDL, 2004], but it can also work with another service definition
language that exhibits the same characteristics as WSDL.
A WSCI specification supports message correlation, sequencing rules, exception
handling, transactions, and dynamic collaboration. Specific transactional contexts can
be set up within WSCI, similar to the scope activity in BPEL4WS. When a set of
activities is defined within a context, any failure will result in the entire group being
rolled back. WSCI describes the observable behaviour of a Web Service. This is
expressed in terms of temporal and logical dependencies among the exchanged
messages, featuring sequencing rules, correlation, exception handling, and transactions.
WSCI also describes the collective message exchange among interacting Web Services,
thus providing a global, message-oriented view of the interactions.
WSCI supports both basic and structured activities. The <action> tag is used to define a
basic request or response message. Each activity specifies the WSDL operations
involved and the role being played by this participant. External services can then be
invoked through the <call> tag. A wide variety of structured activities are supported,
including sequential and parallel processing, and conditional looping. WSCI also
introduces an <all> activity, used to indicate that the specific actions have to be
performed, but not in any particular order.
WSCI does not address the definition and the implementation of the internal processes
that actually drive the message exchange. Rather, the goal of WSCI is to describe the
observable behaviour of a Web Service by means of a message-flow oriented interface.
This description enables developers, architects and tools to describe and compose a
global view of the dynamics of the message exchange by understanding the interactions
with the Web Service. WSCI does not address the definition of executable business
processes as defined by BPEL4WS.
A WSCI Choreography includes a set of WSCI documents, one for each partner in the
interaction. In WSCI, there is no single controlling process managing the interaction.
Each action in WSCI represents a unit of work, which typically would map to a specific
WSDL operation. WSCI can be thought of as the glue around WSDL, describing how
the operations can be choreographed. In other words, WSDL would be used to describe
the entry points for each service available and WSCI would describe the interactions
among these WSDL operations, very similar to how BPEL4WS leverages WSDL.
Work on WSCI is not being continued, as the W3C Web Services Choreography Working
Group19 is concentrating its work on the Web Services Choreography Description
Language (WS-CDL) [Kavantzas et al., 2004]. The aim of WS-CDL is to develop a
language for describing global interaction models for several Web Services, thus
following a different understanding of Choreography than WSCI.
Within WS-CDL, the notion of Choreography within Web Services is concerned with
global, multi-party, peer-to-peer collaborations. WS-CDL aims at providing the
description language for this, describing a common observable behaviour of two or
more participants. The description perspective is a global, participant-agnostic viewpoint,
wherein information exchange takes place when jointly agreed among the participants.
Therefore, a set of information-driven reactive rules is specified. In contrast to
19
homepage: http://www.w3.org/2002/ws/chor/
BPEL4WS, WS-CDL follows a top-down approach for describing the interaction of
aggregated Web Services. BPEL4WS starts by specifying the
behavioural requirements of single participants, and then tries to aggregate them.
WS-CDL, by contrast, starts with the specification of globally visible information, that is,
the information needed for the interaction as well as the global message exchange between
participants, along with information-driven rules that allow dynamic compatibility
checking during run-time. Then, the requirements and descriptions for the participants’
behaviours are recursively determined top-down from the global settings, aiming at
automated generation of the behavioural interfaces of participants [Kavantzas et al.,
2004]. The description elements defined in WS-CDL are organized in three groups:
1. Information Typing: description of the information to be exchanged between
participants. These are placeholders and container-structures specified at design
time, and filled with real data at execution time.
2. Participant Description: defines the entities to participate in a Choreography,
describing their identity, their roles, and the relationships between them.
3. Information-Driven Collaboration Roles: defines notions of channels (concrete
information exchange paths between participants), process description notions, work
units, and management of Choreography descriptions (import / reuse support).
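The top-down idea (deriving each participant's behavioural interface from a global description) can be sketched in a few lines of Python. This is a strong simplification and not WS-CDL syntax: the global model is assumed to be a plain ordered list of interactions, and the role names, message names, and the function `project` are hypothetical.

```python
from typing import Dict, List, Tuple

Interaction = Tuple[str, str, str]   # (sender role, receiver role, message)
LocalAction = Tuple[str, str, str]   # ("send" or "recv", partner role, message)


def project(global_model: List[Interaction]) -> Dict[str, List[LocalAction]]:
    """Derive each participant's local view from the global interaction sequence."""
    local: Dict[str, List[LocalAction]] = {}
    for sender, receiver, message in global_model:
        # Every global interaction appears once as a send and once as a receive.
        local.setdefault(sender, []).append(("send", receiver, message))
        local.setdefault(receiver, []).append(("recv", sender, message))
    return local


global_model = [
    ("Buyer", "Seller", "order"),
    ("Seller", "Buyer", "contract"),
    ("Buyer", "Seller", "payment"),
    ("Seller", "Shipper", "shippingRequest"),
    ("Shipper", "Buyer", "goods"),
]

local = project(global_model)
for role, actions in local.items():
    print(role, actions)
```

Because all local views are generated from one global model, they are compatible by construction. A bottom-up approach would instead start from independently written local behaviours and has to check their compatibility afterwards.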
Regarding the formal foundation of the W3C efforts around WSCI and WS-CDL,
WSCI does not rely on or incorporate a formalized representation. In order to support
inference-based handling of WSCI-definitions, mappings to suitable formalisms have to
be defined retroactively (see Section 2.2.2.2.2 for a description of this approach).
Obviously, such techniques carry the risk that the mappings might not be isomorphic
(information-preserving), or that certain aspects cannot be modelled in the formalism.
In contrast, WS-CDL claims to be based on a formal foundation: the Explicit Solos
Calculus, which is a variant of π-Calculus and allows modeling a system from a global
point of view (although this formal foundation has been announced as an important
feature of WS-CDL, the specification has not been officially released at the point of
writing).
OWL-S
In OWL-S, Web Services are understood as processes, whereby the term process is used
in the sense of an activity, as in its antecedent DAML-S [DAML-S, 2004]. The
objective of the OWL-S process model is to define an ontology that covers all
information needed for semantically enhanced Web Service composition. This
information is modelled in the OWL-S Process Ontology [OWL-S, 2004]. Additionally,
a so-called process control ontology is defined that describes the monitoring of a
process execution. Figure 11 shows the structure of the OWL-S Process Model.
Figure 11: OWL-S Process Ontology20
OWL-S defines three types of Web Service process: atomic, simple, and composite
processes. A process is described via data inputs, data outputs, pre-conditions, and
effects (which concern state-of-the-world conditions, described using condition
concepts). The OWL-S Process Ontology defines basic control constructs (Sequence,
Split, Fork + Join, Unordered, Condition, If-Then-Else, Iterate, Repeat-While, and
Repeat-Until). The expressiveness of the OWL-S process model is not as rich as that of
the workflow definition technologies presented above. This process ontology only
provides very basic modeling concepts for processes, concentrating on control
constructs. It is unclear how the process descriptions (input, output, pre-conditions, and
effects) are meant to be used for dynamic discovery and composition of Web Services,
or whether they are to be used for describing the interaction behaviour of a Web Service. Because
of this, current research proposes the replacement or enhancement of the OWL-S
process model with ontological descriptions based on the process models underlying
BPEL4WS or BPML/WSCI [Lara et al, 2003].
Similar to WSCI, the OWL-S process model is not based explicitly on a formal process
representation. As a retroactive formalization that allows analysis and simulation
of the OWL-S process model, [Narayanan and McIlraith, 2003] provide a formalization
on the basis of situation calculus. Although the mapping allows formal representation of
OWL-S process models, it is a proprietary approach and not explicitly supported within
the OWL-S framework.
20 Source: [OWL-S, 2004].
2.2.2.2 Formalization of Process Representation
In order to allow mediation of processes on the basis of reasoning mechanisms, the
modeling constructs of the language used for defining the processes need to be
formalized. With such a formalization, specific mechanisms
for (semi-)automatic mediation can be defined. An example of such an approach is the
Process Specification Language (PSL21), a NIST research project for
developing a standardized process ontology for the manufacturing domain, including
techniques for “Semantic Translations” between different manufacturing systems. It
uses the Knowledge Interchange Format (KIF) for representing processes and defines
rules for translations of processes on the semantic level [Schlenoff et al, 2000].
However, PSL represents a proprietary approach for formal process specification and
transformation for a specific domain, whereas we are interested in a general model for
process level mediation based on a sound theoretical foundation.
Such a formalization has to support all the modeling constructs for processes provided
by the process representation language, especially notions for states (that is, activities)
and state transitions (that is, crossovers from one activity to the next in a process).
Moreover, control structures need to be defined that determine the correctness of the
information exchanged between activities (data flow), and validity of transitions
(control flow). The survey of existing formalizations is restricted to an investigation of
existing formalisms that might serve as a starting point for the formalization of process
representations as the basis of the Process Level Mediation Module in the DIP
Mediation component.
The requirements outlined above are supported by certain types of logical formalisms,
mostly referred to as logics for specifying dynamics [Eck et al, 2001]. Analyses of such
logical languages are provided in [Constantinescu and Faltings, 2002] and [Keller and
de Bruijn, 2004]. In accordance with these works, we briefly summarize the existing
logical approaches to the representation of processes that might serve as a basis for the
development of the Process Level Mediation Module in the DIP Mediation Component.
In general, two different groups of logical formalisms for representing the dynamics of
processes are distinguished. On the one hand, notions for representing states and state
changes are needed, and, on the other hand, techniques for formalization of
communication and interaction between different parties are needed. The first group is
concerned with the formal representation of the general notions of processes
(states/activities, transitions, data- and control flow as discussed in Section 2.2.1.1). The
second group is concerned with the deployment of processes, that is, when a process is
actually used for executing a complex, multi-step interaction between two or more
parties, including transaction management. With regard to process mediation, we are
only interested in the second group: in order to make processes interoperable, we have
to inspect the process specifications of interacting parties at a structural level (that is,
without regard to the actual content of the interaction), and resolve any
heterogeneities between the process specifications. Thus, the following briefly
summarizes the most promising existing formalisms for the second group.
21 See homepage: http://www.mel.nist.gov/psl/
2.2.2.2.1. Logics for Representing Interaction Protocols
As outlined above, we are mostly interested in formalisms for representing the process
of interactions between parties. More precisely, we need a formalism that allows the
description of the structure of the processes that parties take when they are participating
in an interaction. These formalisms will serve as the formal basis for describing
processes and defining mappings for the mediation of possibly heterogeneous processes.
Subsequently, we discuss existing approaches for formalization of processes within
Choreography and Orchestration of Semantic Web Services with regard to the needs
identified above. In fact, the approaches to be investigated are those that are indirectly
applied within the process representation technologies examined above.
Process Algebras
Process Algebras (PA) provide this type of process representation formalism. A PA is a
formal description technique for complex computer systems that pays special attention
to concurrently executing components that interact in parallel and distributed systems.
The objective of PA is to allow the observation of the behaviour of a system or its
components. The approach is the definition of a formal language for the constituting
elements of the processes and the performance of algebraic calculations on the basis of
these process descriptions [Bergstra et al., 2001].
Compared to the general idea of mediation facilities outlined in the introduction, PA
seems to be the appropriate choice for process level mediation within Semantic Web
Services. The formal description language adds formal semantics to process
descriptions, and the algebraic calculations existing for process algebras can serve as a
basis for the development of inference-based mediation facilities for processes. Also,
currently existing approaches for formalization of processes within Choreography and
Orchestration apply PAs, as investigated in more detail below.
Research within PA started in the 1970s, touching many topic areas of computer science
and discrete maths, including system design notations, logic, concurrency theory,
specification and verification, operational semantics, algorithms, complexity theory,
and, of course, algebra. The very early works on PA developed automata theory,
which identified states and state changes for modeling a process and was concerned
with formally describing the execution of a process, a so-called run. Soon, the notion of
interaction was added, and the attention of PA research turned to behaviour observation
within the interactions of process-driven systems of components [Baeten, 2003].
The main approaches developed within PA are the Calculus of Communicating Systems
(CCS) [Milner, 1980], Communicating Sequential Processes (CSP) [Hoare, 1978] and
the Algebra of Communicating Processes (ACP) [Bergstra and Klop, 1984]. We briefly
describe these approaches, omitting formal analysis here to concentrate on the support
offered by existing process specification technologies and determine their usability for
development of the DIP Process Level Mediation Module. The formalizations will be
investigated more closely in DIP Deliverable D5.3, the specification of the Process
Level Mediation Component.
CCS is mainly the work of Robin Milner and was developed over time. The main focus
of CCS is to formalize behaviours and determine equivalence between them, which is
basically the same objective as in process level mediation. CCS relies on a
synchronization tree as the underlying model for representing processes: a node
represents a process activity and an arc is a transition; arcs are equipped with so-called
laws that specify the conditions or validity constraints for a transition. Based on
this, CCS provides an algebraic model for process equivalence. A newer approach
based on CCS is the π-Calculus, which adds notions for handling process interactions
dynamically [Milner, 1991]. In contrast to CCS, which describes processes
individually along with the notion of process equivalence, CSP builds on the message
passing paradigm of communication. Besides the
underlying models for process representation, CCS and CSP have developed techniques
to handle process identification, failures and deadlocks, and inference-based
determination of equivalences of processes. The development of CCS and CSP is closely
interwoven, and most modern approaches combine the concepts of both. An exhaustive
comparison of CCS and CSP is provided in [Glabbeek, 1997].
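To make the notion of process equivalence more tangible, the following minimal sketch checks strong bisimilarity, one of several equivalences studied for CCS-style process descriptions, on a small labelled transition system. The representation (a dictionary from states to labelled moves), the function name bisimilar, and the example processes are assumptions made for illustration; the naive fixpoint computation is not an algorithm taken from the cited works.

```python
from typing import Dict, List, Set, Tuple

# state -> list of (action label, next state); all names here are hypothetical
LTS = Dict[str, List[Tuple[str, str]]]


def bisimilar(lts: LTS, p: str, q: str) -> bool:
    """Naive greatest-fixpoint check of strong bisimilarity between states p and q."""
    states = set(lts) | {n for moves in lts.values() for _, n in moves}
    # start from the full relation and remove pairs that violate the condition
    relation: Set[Tuple[str, str]] = {(a, b) for a in states for b in states}

    def matched(a: str, b: str) -> bool:
        # every move of a must be answered by an equally labelled move of b
        return all(any(l2 == l1 and (n1, n2) in relation for l2, n2 in lts.get(b, []))
                   for l1, n1 in lts.get(a, []))

    changed = True
    while changed:
        changed = False
        for a, b in list(relation):
            if not (matched(a, b) and matched(b, a)):
                relation.discard((a, b))
                changed = True
    return (p, q) in relation


# p0 = a.(b + c), q0 = a.b + a.c, r0 = a.(b + c) with renamed states
example: LTS = {
    "p0": [("a", "p1")], "p1": [("b", "p2"), ("c", "p3")],
    "q0": [("a", "q1"), ("a", "q2")], "q1": [("b", "q3")], "q2": [("c", "q4")],
    "r0": [("a", "r1")], "r1": [("b", "r2"), ("c", "r3")],
}
print(bisimilar(example, "p0", "r0"), bisimilar(example, "p0", "q0"))
```

The second pair is distinguished because q0 commits to either b or c when performing a, whereas p0 keeps both options open; this is the kind of behavioural difference a process level mediator would have to detect.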
Situation Calculus
The term situation calculus, initially mentioned in [McCarthy and Hayes, 1969], is used
for a variety of formalisms treating situations as objects, considering fluents that take
values in situations, and events (including actions) that generate new situations from
old ones. The most widely used situation calculus language, as defined in [Reiter, 2001], is a
first-order logical language for reasoning about actions, based on Predicate Calculus. The
aim is to represent dynamically changing worlds in which all of the changes are the
direct result of named actions performed by some agent. Situations are sequences of
actions, evolving from a distinguished initial situation, designated by the constant S0.
If a(y) is a parameterized action and s is a situation, the result of performing a in s is the
situation represented by the function do(a,s). Functions and relations whose values vary
from situation to situation are called fluents, and are denoted by a predicate symbol
taking a situation term as the last argument (for example, Own(bookName,s)). Finally,
Poss(a,s) is a distinguished fluent expressing that action a is possible to perform in
situation s.
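The notions of S0, do(a,s), fluents, and Poss(a,s) can be illustrated with a minimal sketch in which a situation is simply the sequence of actions performed so far. The actions "buy" and "sell", the book name "bookA", and the chosen preconditions are hypothetical and only serve the illustration; this is not a situation calculus reasoner.

```python
from typing import Tuple

Action = Tuple[str, str]          # (action name, argument), e.g. ("buy", "bookA")
Situation = Tuple[Action, ...]    # a situation is the sequence of actions performed so far

S0: Situation = ()                # the distinguished initial situation


def do(a: Action, s: Situation) -> Situation:
    """Return the situation that results from performing action a in situation s."""
    return s + (a,)


def own(book: str, s: Situation) -> bool:
    """Fluent Own(book, s): true if the book was bought and not sold again afterwards."""
    owned = False
    for name, arg in s:
        if arg == book and name == "buy":
            owned = True
        elif arg == book and name == "sell":
            owned = False
    return owned


def poss(a: Action, s: Situation) -> bool:
    """Poss(a, s): hypothetical preconditions for the two illustrative actions."""
    name, arg = a
    if name == "buy":
        return not own(arg, s)
    if name == "sell":
        return own(arg, s)
    return False


s1 = do(("buy", "bookA"), S0)
print(own("bookA", S0), own("bookA", s1), poss(("sell", "bookA"), s1))
```

Note that own has to state explicitly which actions change the fluent and, implicitly, that all other actions leave it untouched.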
The general difficulty with situation calculus is that several aspects
have to be expressed explicitly in order to achieve the correct semantics of the
part of the world one wants to formalize:
- Quantification Problem (known in the literature as the qualification problem): concerns the
executability of an action in specific situations. Usually, Poss(a,s) means that activity a
can be executed in situation s. The problem is that it is nearly impossible to determine all
situations in which an activity can be executed. In situation calculus, this has
to be specified explicitly, narrowing the applicability of this formalism to
small and closed worlds.
- Frame Problem: adhering to the general law of inertia, it has to be defined
which information (that is, the fluents in situation calculus) remains
unaffected by the execution of an activity. This is called the frame problem,
wherein a frame is understood as the set of information items that represent a
state. Similar to the Quantification Problem, all fluents that are not affected
by executing an activity have to be specified explicitly.
Abstract State Machines
The approach of Abstract State Machines (ASM), originally proposed by [Gurevich,
1994], is an attempt to provide operational semantics to programs and programming
languages, in order to overcome the gap between formal models of computation and
practical specification methods. The ASM thesis is that any algorithm can be modeled
at its natural abstraction level by an appropriate ASM. Many research efforts have been
made around ASMs in the recent years, resulting in a simple, generic methodology for
describing simple abstract machines that correspond to algorithms.22
An ASM consists of an algebra A over a signature (a finite
collection of function names) together with interpretations of the signature, and
a program that holds transition rules. Basically, such a transition rule is an expression of
the form f(t) := t0, with f as a function symbol, t as a tuple of terms over the signature, and t0
as another term. When this rule is fired in a certain algebra S0, the value of f at the
arguments t is updated to the value of t0 in the course of the transition, resulting in a new
algebra S1 as the new state of
the universe of discourse. Thus, in an ASM only the transitions are defined; the run of
an ASM results in consecutive algebras, terminating when no further transition rules can
be executed.
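A run of an ASM can be sketched as follows. The encoding of a state as a dictionary from locations (function name plus arguments) to values, the function run, and the countdown rule are assumptions made for illustration; conflicting updates from different rules are not detected in this sketch.

```python
from typing import Callable, Dict, List, Tuple

Location = Tuple[str, Tuple]                    # function name plus argument tuple
State = Dict[Location, int]                     # interpretation of the signature (one algebra)
Rule = Callable[[State], Dict[Location, int]]   # returns the updates "f(t) := t0" that fire


def countdown_rule(state: State) -> Dict[Location, int]:
    """Hypothetical rule: if counter() > 0 then counter() := counter() - 1."""
    value = state[("counter", ())]
    if value > 0:
        return {("counter", ()): value - 1}
    return {}


def run(state: State, rules: List[Rule], max_steps: int = 1000) -> List[State]:
    """Fire all rules step by step and collect the sequence of states S0, S1, ..."""
    trace = [dict(state)]
    for _ in range(max_steps):
        updates: Dict[Location, int] = {}
        for rule in rules:
            updates.update(rule(trace[-1]))
        new_state = {**trace[-1], **updates}
        if new_state == trace[-1]:      # no rule changes anything: the run terminates
            break
        trace.append(new_state)
    return trace


trace = run({("counter", ()): 3}, [countdown_rule])
print([s[("counter", ())] for s in trace])
```

Only the transition rule is written down; everything the rule does not mention is carried over unchanged from one state to the next.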
With regard to the general idea of mediation facilities outlined in the introduction,
process algebras seem to be the appropriate choice for process level mediation within
Semantic Web Services: the formal description language adds formal semantics to
process descriptions and the algebraic calculations existing for process algebras can
serve as a basis for development of inference-based mediation facilities for processes.
Also, currently existing approaches for formalization of processes within Choreography
and Orchestration apply PAs, as investigated in more detail below.
Situation Calculus provides a similar expressivity for describing interaction processes as
process algebras. In contrast, the ASM approach supports only very abstract definitions
of processes. However, the major advantage of ASMs is that the problems arising within
process algebras and situation calculus (namely the quantification problem and the
frame problem, as described above) are avoided, as only the transition rules are
specified within an ASM.
Further investigation of the appropriate formalism for process representation within DIP
will be provided in Deliverable D5.3 of this Work Package.
2.2.2.2.2. Formalizing Choreography Description
A more recent approach for formalizing Choreography descriptions in order to support
the definition of inference-mechanisms on processes is provided in [Brogi et al., 2004].
This paper presents a formalization of WSCI (see above) on the basis of CCS, aiming at
22 Exhaustive information on ASM and the ASM research community can be found on several websites,
for instance or www.eecs.umich.edu/gasm.
the specification of a technique to determine the compatibility of Web Services by
checking the interoperability of the Choreographies of distinct Web Services that are
supposed to interact automatically, as well as automated specification of mediators that
make a priori incompatible Web Services interoperable. We briefly summarize the
essence of this approach.
The starting point of this approach is WSCI, which was chosen because it offers, in
contrast to other representation techniques, the two possible views on Choreographies
of Web Services: the <interface> construct that describes the externally visible behaviour
of a single Web Service, and the <model> construct that allows describing the
combination of several interfaces (that is, distinct Web Services) into a global model of
interaction.
For an isomorphic formalization, CCS has been selected. For the formalization of the
WSCI constructs (individual process logic primitives: sequence, parallel, choice, switch,
loop, activities, and so on; exceptions: on fault, on timeout, on message; calling of
processes in the global model) each WSDL message is represented by a CCS channel,
the individual process logic primitives are mapped to the corresponding CCS primitives,
and the calls for processes in the global model are transformed into parallel CCS models
(for the detailed discussion of the transformations see the paper).
A set of (WSCI) interfaces is defined as compatible if it terminates according to the
order and the content of messages, that is, if the conversation of the Web Services does
not run into an infinite loop. Conversely, interfaces (Web Services) are not compatible
if the system fails because it runs into an infinite loop. This possibility is checked
by matching the input and output actions of interfaces according to their structure and
content.
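The idea of checking compatibility by matching input and output actions can be sketched for two interfaces that are modelled as small finite protocols. This is a simplified stand-in, not the CCS-based technique of [Brogi et al., 2004]: it assumes synchronous communication, treats two interfaces as compatible when every reachable joint state can still reach the joint final state, and uses hypothetical names ("!m" for sending and "?m" for receiving message m, the client and shop protocols).

```python
from typing import Dict, List, Set, Tuple

# An interface is a finite protocol: state -> list of (action, next state).
Protocol = Dict[str, List[Tuple[str, str]]]
Joint = Tuple[str, str]


def joint_steps(p1: Protocol, p2: Protocol, state: Joint) -> List[Joint]:
    """Successor states reached when one side sends a message the other side receives."""
    s1, s2 = state
    successors = []
    for a1, n1 in p1.get(s1, []):
        for a2, n2 in p2.get(s2, []):
            if a1[1:] == a2[1:] and {a1[0], a2[0]} == {"!", "?"}:
                successors.append((n1, n2))
    return successors


def compatible(p1: Protocol, p2: Protocol, start: Joint, final: Joint) -> bool:
    """True if every reachable joint state can still reach the joint final state."""
    reachable: Set[Joint] = set()
    frontier = [start]
    while frontier:
        state = frontier.pop()
        if state not in reachable:
            reachable.add(state)
            frontier.extend(joint_steps(p1, p2, state))
    can_finish = {final} & reachable
    changed = True
    while changed:                      # backward propagation up to a fixpoint
        changed = False
        for state in reachable - can_finish:
            if any(n in can_finish for n in joint_steps(p1, p2, state)):
                can_finish.add(state)
                changed = True
    return reachable == can_finish


client = {"c0": [("!order", "c1")], "c1": [("?invoice", "c2")]}
shop = {"s0": [("?order", "s1")], "s1": [("!invoice", "s2")]}
bad_shop = {"s0": [("?order", "s1")], "s1": [("?payment", "s2")]}
print(compatible(client, shop, ("c0", "s0"), ("c2", "s2")),
      compatible(client, bad_shop, ("c0", "s0"), ("c2", "s2")))
```

In the second call the client waits for an invoice while the shop waits for a payment, so the conversation cannot complete and the interfaces are reported as incompatible.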
Another important aspect is replaceability, that is, finding other interfaces (Services)
that can provide the same functionality, commonly referred to as compensation within
the area of Semantic Web Services. This is determined by a set of heuristics, whereby
the replacement of S1 by S2 is feasible if the interface of S2 is a subset of the interface
of S1, or if the behavior of S1 and S2 is consistent, that is, if:
1. S2 preserves the semantics of S1 (concerning global choices),
2. S2 does not extend S1 (that is, all actions in S2 are also in S1),
3. S2 terminates whenever S1 does, which is called “behavioral subtyping”.
Additionally, the paper outlines how to develop mediators, called “adaptors”, with
reference to earlier work [Bracciali et al., 2002]. Therein, mapping rules for process
entities of interfaces in a global model (that is, interfaces that participate in an interaction) are
specified in order to resolve structural and content mismatches. These mappings are
only defined for specific use cases, but the approach can be generalized into general
process level mappings.
This approach covers the most important aspects needed for determining compatibility
of Choreographies automatically, and can thus serve as a basis for the development of the
Process Level Mediation Module. It also underlines the choice of process algebras as the
theoretical basis for formalization of process representations in order to support
inference-based mechanisms for mediation. On the other hand, it applies WSCI as the
Choreography description language, and does not provide a generic framework for
formalization of Choreographies within Semantic Web Services. Nevertheless, this
work serves as a starting point for further development.
2.2.2.2.3. Formalizing Web Service Orchestrations
Some approaches for formalizing Web Service Orchestration in the sense defined above
have recently been presented. We briefly summarize two approaches that may serve as a
starting point for the Process Level Mediation Module with regard to the requirements
of process mediation within Web Service Orchestration.
The first approach takes BPEL4WS as a starting point and creates a framework for
verification of interactions of BPEL4WS-described Web Services by transforming
BPEL4WS descriptions into formal languages and determining the validity of Web
Service Orchestrations [Fu et al., 2004].
The work relies on a model for Web Service conversations in which a global model
defines the overall task to be achieved by combining several Web Services (this is similar
to the approach of BPEL4WS: a process described in BPEL4WS defines the overall
task, and external Web Services are called in order to fulfil the task defined in a specific
activity of this process), and each Service is described using pre- and postconditions.
The verification mechanism determines the suitability of the execution order specified
in the global model by checking the pre- and postconditions of successive Web Service
calls.
For realization, BPEL4WS descriptions are transformed into proprietary formal
languages that are supported by a reasoning mechanism that performs “Synchronizability
Analysis”, in which heuristics that describe the structure and relations of “valid
interactions” are used to determine the suitability of a Web Service Orchestration.
Although this approach uses proprietary formalisms and does not provide means for
specifying mapping rules to resolve possible mismatches in an Orchestration, it might
be considered as an example of formalization and verification in Web Service
Orchestration.
The second approach, presented at the same conference, is the Concurrent Transaction
Logic CTR-S [Davulcu et al, 2004]. CTR-S is a sound logic with a standard model
theory that aims at combining process modeling and, based on this, automated contracting
of Web Services within multi-party processes, which corresponds to Web Service
Orchestration in the understanding outlined above.
The aim of this approach is to specify a logical language with formal semantics for
representing processes and to provide a means for automated contracting using
workflows that restrict the requirements of possibly usable Web Services for a certain
process activity.
The paper concentrates on the explanation of the CTR-S syntax, the language model,
and proof theory. It also provides a set of inference rules for proving CTR-S statements
and a pre-defined set of constraints on valid contracts (that is, Service compositions),
which are used to determine the validity of Web Service Orchestrations.
The advantage of this approach is that there is a single representation language for
modeling processes as well as the constraints on possible Web Service Orchestrations.
As this is a logical language, it allows the specification of axioms and inference-rules
along with a sound proof theory to determine suitability and validity of Service
compositions. A shortcoming is that the approach is not well aligned with frameworks
for Semantic Web Service descriptions or existing process representation techniques,
and thus would require substantial adaptation for compliance.
2.2.2.3 Process Integration by Process Composition
Process mediation involves various techniques that aim to make different processes
interoperate. Process composition is one important approach to achieve interoperability
among processes, making them collaborate in order to achieve given user goals. This
approach is also used in orchestration, as explained above. Some existing approaches to
process composition are reviewed below.
2.2.2.3.1. Situation Calculus for Service Composition
An initial approach to process composition in [McIlraith et al., 2001] and [McIlraith and
Son, 2002] was to use a planning formalism based on the situation calculus, a first-order
logical language for reasoning about action and change. In the situation calculus, the
state of the world is expressed in terms of functions and relations relativized to a
particular situation. The advantage of this approach is that complex control constructs
like loops can be modelled using this framework. The drawback of this approach is its
high computational complexity.
This work builds on and extends Golog, a high-level logic programming language
developed at the University of Toronto. Golog supports the specification and execution
of complex actions in dynamic domains.
2.2.2.3.2. Hierarchical Task Planning for Service Composition
In [Wu et al., 2003] the authors describe SHOP2, a hierarchical planning formalism for
encoding the composition domains. This approach is more efficient than the situation
calculus approach, but it does not support complex constructs like loops.
SHOP2 is a domain-independent HTN planning system. HTN planning is an AI
planning methodology that creates a plan by task decomposition. This is a process in
which the planning system decomposes tasks into smaller and smaller subtasks, until
primitive tasks are found that can be performed directly. The concept of task
decomposition in HTN is very similar to the concept of process decomposition in
DAML-S (see [DAML-S, 2004]).
One difference between SHOP2 and most other HTN planning systems is that SHOP2
plans for tasks in the same order in which they will later be executed. Planning for tasks in
the order they will be performed makes it possible to know the current state of the world
at each step in the planning process, which makes it possible for SHOP2’s precondition-evaluation
mechanism to incorporate significant inferencing and reasoning power,
including the ability to call external programs. This allows SHOP2 to integrate planning
with external information sources as in the Web environment.
In order to do planning in a given planning domain, SHOP2 needs to be given the
knowledge about that domain. SHOP2’s knowledge base contains operators and
methods. Each operator is a description of what needs to be done to accomplish some
primitive task, and each method tells how to decompose some compound task into
partially ordered subtasks.
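The interplay of operators and methods can be sketched with a small forward decomposition planner. This is not the SHOP2 implementation or its input language: the function plan, the encoding of states as sets of facts, and the travel domain are hypothetical, and each method here returns a totally ordered list of subtasks rather than a partial order.

```python
from typing import Callable, Dict, FrozenSet, List, Optional

State = FrozenSet[str]                           # set of facts that currently hold
Operator = Callable[[State], Optional[State]]    # primitive task: new state, or None if not applicable
Method = Callable[[State], Optional[List[str]]]  # compound task: subtasks, or None if not applicable


def plan(state: State, tasks: List[str],
         operators: Dict[str, Operator],
         methods: Dict[str, List[Method]]) -> Optional[List[str]]:
    """Decompose tasks left to right, in the order in which they will be executed."""
    if not tasks:
        return []
    first, rest = tasks[0], tasks[1:]
    if first in operators:
        new_state = operators[first](state)      # preconditions are checked on the current state
        if new_state is None:
            return None
        tail = plan(new_state, rest, operators, methods)
        return None if tail is None else [first] + tail
    for method in methods.get(first, []):        # try alternative decompositions in turn
        subtasks = method(state)
        if subtasks is not None:
            result = plan(state, subtasks + rest, operators, methods)
            if result is not None:
                return result
    return None


# Hypothetical travel domain.
operators = {
    "book_flight": lambda s: s | {"has_flight"} if "has_budget" in s else None,
    "book_hotel": lambda s: s | {"has_hotel"} if "has_flight" in s else None,
}
methods = {"arrange_trip": [lambda s: ["book_flight", "book_hotel"]]}

print(plan(frozenset({"has_budget"}), ["arrange_trip"], operators, methods))
```

Because tasks are planned in execution order, the state passed to each operator is the actual state at that point, which is what would allow a precondition check to consult an external information source.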
2.2.2.3.3. Type Based Service Composition
The previously presented approaches compose processes based on the semantic mark-up
of the parameters in service descriptions. Another possible approach [Constantinescu et
al., 2004] extends this by also using type information during composition.
Formalism and assumptions
In this approach services and queries are represented in the standard way [W3C, WSDL,
2004] as two sets of parameters (inputs and outputs). A parameter is defined through its
name and a type that can be primitive [W3C, XML, 2003] (for example, a decimal in
the range [10,12] or [14,16]) or a class/ontological type [OWL, 2004]. Both primitive
and class types are represented as sets of numeric intervals. For instance, the generic
type Colour may be encoded as the interval [1,3], whereas the specific colours
(subtypes) Red, Green, and Blue may be represented as the single-point subintervals
[1,1], [2,2], and [3,3]. For more details on the encoding of classes/ontologies as numeric
intervals, see “Representing types” below.
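The interval encoding can be illustrated with a minimal sketch in which a type is a list of closed integer intervals, so that subsumption becomes interval inclusion. The names includes and intersect are hypothetical, and the inclusion test assumes that the intervals of a type are not adjacent to each other (a specific interval must fit into a single general interval).

```python
from typing import List, Tuple

Interval = Tuple[int, int]          # closed interval [low, high]
Type = List[Interval]               # a type is a set of intervals

COLOUR: Type = [(1, 3)]             # generic type
RED: Type = [(1, 1)]                # subtypes as single-point subintervals
GREEN: Type = [(2, 2)]
BLUE: Type = [(3, 3)]


def includes(general: Type, specific: Type) -> bool:
    """True if every interval of `specific` lies inside some interval of `general`."""
    return all(any(g_lo <= lo and hi <= g_hi for g_lo, g_hi in general)
               for lo, hi in specific)


def intersect(a: Type, b: Type) -> Type:
    """Pairwise intersection of the intervals of two types (empty list means no overlap)."""
    result = []
    for a_lo, a_hi in a:
        for b_lo, b_hi in b:
            lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
            if lo <= hi:
                result.append((lo, hi))
    return result


print(includes(COLOUR, RED), includes(RED, COLOUR), intersect(RED, GREEN))
```

With this encoding the same two operations serve both primitive ranges, such as a decimal in [10,12] or [14,16], and class hierarchies.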
Input and output parameters of service descriptions have the following semantics:
- In order for the service to be invokable, a value must be known for each of the
service input parameters and it has to be consistent with the respective parameter
type. For primitive data types, the invocation value must be in the range of
allowed values; in the case of classes, the invocation value must be subsumed
by the parameter type.
- Upon successful invocation, the service will provide a value for each of the
output parameters and each of these values will be consistent with the respective
parameter type.
Service composition queries are represented in a similar manner but have different
semantics:
- The query inputs are the parameters available to the integration (for example,
provided by the user). Each of these input parameters can be either a concrete
value of a given type, or just the type information. In the second case the
integration solution has to be able to handle all the possible values for the given
input parameter type.
- The query outputs are the parameters that a successful integration must provide
and the parameter types define what ranges of values can be handled. The
integration solution must be able to provide a value for each of the parameters in
the problem output and the value must be in the range defined by the respective
problem output parameter type.
For manipulating service or query descriptions, we will make use of the following
helper functions:
- in(X), out(X): return the set of input or output parameter names of a service or
query description X.
- type(P,X): returns the type of a parameter named P in the frame of a service or
query description X as the set of intervals of all possible values for P. The
operator $\subseteq$ in conjunction with this function will represent a range inclusion in
the case that P has a primitive data type or subsumption in case P is defined
through a class or concept description [OWL, 2004]. The operator $\cap$ in
conjunction with this function will represent a range intersection in the case that
P has a primitive data type or in the case of a class/concept description it will
represent the sub-class common to both the arguments of the operator (possibly
the bottom class Nothing).
We assume that both service and query descriptions (X) are well formed in that they
cannot have the same parameter both as input and output: $\mathrm{in}(X) \cap \mathrm{out}(X) = \emptyset$.
The rationale behind this assumption is that if a description had an overlap between
input and output parameters this would only lead to two equally undesirable cases:
either the two parameters would have the same type, in which case the output parameter
is redundant, or they would have different types, in which case the service description is
inconsistent.
Parameter names (properties in the case of OWL-S [OWL-S, 2004] or strings in the
case of WSDL [W3C, WSDL, 2004]) attach also some semantic information to the
parameters23. Thus, in our composition algorithm we not only consider type
compatibility between parameters but also semantic compatibility.
Composing services
Informally, the idea of composing services using forward chaining is to iteratively apply
a possible service S to a set of input parameters provided by a query Q (that is, all inputs
required by S have to be available). If applying S does not solve the problem (that is,
still not all the outputs required by the query Q are available) then a new query Q’ can
be computed from Q and S and the whole process is iterated. This part of our
framework corresponds to the planning techniques currently used for service
composition [Thakkar et al., 2002].
Now we consider the conditions needed for a service S to be applied to the inputs
available from a query Q using forward chaining: for all of the inputs required by the
service S, there has to be a compatible parameter in the inputs provided by the query Q.
Compatibility has to be achieved both for names (that have to be semantically
equivalent) and for types, where the range provided by the query Q has to be more
specific ($\subseteq$) than the one accepted by the service S:
$$(\forall P \in \mathrm{in}(S))\,(P \in \mathrm{in}(Q) \wedge \mathrm{type}(P,Q) \subseteq \mathrm{type}(P,S))$$
This kind of matching between the inputs of query Q and of service S corresponds to
the plugIn match identified by Paolucci [Paolucci et al., 2002].
23 For WSDL this is not explicitly specified by the standard, but we assume that two parameters with the same name are
semantically equivalent.
Forward complete matching of types is too restrictive and might not always work,
because the types accepted by the available services may only partially overlap the type
specified in the query. For example, a query for restaurant recommendation services
across all of Switzerland could specify that the integer parameter zip code could be in the
range [1000,9999], while an existing service providing recommendations for the French-speaking
part of Switzerland could accept only integers in the range [1000,2999] for the
zip code parameter.
The above condition for forward chaining can be modified such that services with
partial type matches can be supported. To do so, we relax the type inclusion to a
simple overlap:
$$(\forall P \in \mathrm{in}(S))\,(P \in \mathrm{in}(Q) \wedge \mathrm{type}(P,Q) \cap \mathrm{type}(P,S) \neq \emptyset)$$
This kind of matching between the inputs of query Q and of service S corresponds to
the overlap or intersection match identified by Li [Li and Horrocks, 2003] and
Constantinescu [Constantinescu and Faltings, 2003].
We will also consider the condition needed for a backward chaining approach. The
service S has to provide at least one output that is required by the query Q. This
corresponds to the plugIn match for query and service outputs. Using the formal
notation above this can be specified as:
$$(\exists P \in \mathrm{out}(S))\,(P \in \mathrm{out}(Q) \wedge \mathrm{type}(P,S) \subseteq \mathrm{type}(P,Q))$$
The above condition can be also relaxed such that services with partial type matches
can be backward chained:
(P  out(S ))( P  out(Q)  type( P, Q)  type( P, S )  ))
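As a minimal sketch of the backward chaining conditions, the following Python fragment again assumes that types are single closed integer ranges and that equal names denote equivalent parameters. All names and example values are hypothetical.

```python
from typing import Dict, Tuple

Interval = Tuple[int, int]      # closed integer range [low, high] (assumed type model)
Params = Dict[str, Interval]    # parameter name -> type


def includes(inner: Interval, outer: Interval) -> bool:
    """True if inner is a sub-range of outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two ranges share at least one value."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def backward_plugin_match(query_out: Params, service_out: Params) -> bool:
    """At least one service output is required by the query and is more specific."""
    return any(p in query_out and includes(t, query_out[p])
               for p, t in service_out.items())


def backward_overlap_match(query_out: Params, service_out: Params) -> bool:
    """At least one service output is required by the query and overlaps its type."""
    return any(p in query_out and overlaps(t, query_out[p])
               for p, t in service_out.items())


required = {"rating": (0, 10)}                          # outputs the query asks for
narrow = {"rating": (1, 5), "address_id": (0, 999)}     # rating range inside the required one
wide = {"rating": (0, 20)}                              # rating range only overlaps
unrelated = {"price": (0, 100)}                         # provides nothing the query needs

print(backward_plugin_match(required, narrow), backward_overlap_match(required, narrow))
print(backward_plugin_match(required, wide), backward_overlap_match(required, wide))
print(backward_plugin_match(required, unrelated), backward_overlap_match(required, unrelated))
```

Note the difference in quantifiers: forward matching checks all inputs of the service, whereas backward matching only needs one useful output.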
Type-compatible service composition versus planning
As the majority of service composition approaches today rely on planning, we will
analyse the correspondence between our formalism for service descriptions with types
and a hypothetical planning formalism using symbol-free first-order logic formulas for
preconditions and effects.
As an example, let us consider a service description S that has two input parameters, A
and B, and two output parameters, C and D. Their types are represented as sets of
accepted and provided values: a1, a2 for A; b1, b2 for B; c1, c2 for C; and d1, d2 for D.
This corresponds to an operator S that has disjunctive preconditions
and disjunctive effects. Negation is not required.
Table 2: Service with types and corresponding planning operator

Service description        Planning operator
in(S) = [A, B]             :action S
type(A,S) = [a1, a2]       :precondition
type(B,S) = [b1, b2]         (and (or a1 a2)
                                  (or b1 b2))
out(S) = [C, D]            :effect
type(C,S) = [c1, c2]         (and (or c1 c2)
type(D,S) = [d1, d2]              (or d1 d2))
Written in this way our formalism has some correspondence with existing planning
languages like ADL [Pednault, 1989] or more recently PDDL [McDermott, 1998]
(concerning the disjunctive preconditions) and planning with non-deterministic actions
[Kushmerick et al., 1995] (regarding the disjunctive effects), but the combination as a
whole (positive-only disjunctive preconditions and effects) stands as a novel formalism.
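The correspondence between a typed service description and a planning operator can be sketched as a small rendering function. This is an illustration only: the function name and the data layout are assumptions, and the output is PDDL-like text that has not been checked against any planner.

```python
from typing import Dict, List


def to_planning_operator(name: str,
                         inputs: Dict[str, List[str]],
                         outputs: Dict[str, List[str]]) -> str:
    """Render a service with enumerated types as a PDDL-like operator.

    Each parameter becomes one positive disjunction over its values;
    the disjunctions of all parameters are joined by a conjunction.
    """
    def conjunction(params: Dict[str, List[str]]) -> str:
        clauses = ["(or " + " ".join(values) + ")" for values in params.values()]
        return "(and " + " ".join(clauses) + ")"

    return (f"(:action {name}\n"
            f"  :precondition {conjunction(inputs)}\n"
            f"  :effect {conjunction(outputs)})")


# The service S from the example: inputs A, B and outputs C, D.
operator_text = to_planning_operator(
    "S",
    inputs={"A": ["a1", "a2"], "B": ["b1", "b2"]},
    outputs={"C": ["c1", "c2"], "D": ["d1", "d2"]},
)
print(operator_text)
```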
The structure of type-compatible service composition problems
As described previously we specify a service integration query in terms of a set of
available input parameters and a set of required output parameters. An integration
solution consists of a given ordering of services that can be invoked so that finally all
parameters required by the query are known.
From the perspective of the match type between services and queries (see Figure 12
below) we consider the following three cases: forward complete matches, backward
complete matches, and forward partial matches.
By using forward-completely matching services the initial set of available parameters
can be incrementally extended. As there is a single point from which a service can be
applied, once all its required inputs are available, forward chaining services does not
introduce any choice points.
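Forward chaining with completely matching services can be sketched as a simple fixed-point loop. The sketch assumes single closed integer ranges as types; the service names, parameters and ranges are invented for illustration.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]                  # closed integer range (assumed type model)
Params = Dict[str, Interval]
Service = Tuple[str, Params, Params]        # (name, inputs, outputs)


def includes(inner: Interval, outer: Interval) -> bool:
    """True if inner is a sub-range of outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def forward_chain(available: Params, services: List[Service]):
    """Apply every forward-completely matching service until nothing changes.

    Returns the extended set of known parameters and the invocation order.
    No choice points arise: a service is applied as soon as all of its
    inputs are available with a compatible type.
    """
    known = dict(available)
    plan: List[str] = []
    remaining = list(services)
    progress = True
    while progress:
        progress = False
        for service in list(remaining):
            name, inputs, outputs = service
            if all(p in known and includes(known[p], t) for p, t in inputs.items()):
                for p, t in outputs.items():
                    known.setdefault(p, t)   # keep an already known type
                plan.append(name)
                remaining.remove(service)
                progress = True
    return known, plan


services = [
    ("recommend", {"region_id": (1, 30)}, {"restaurant_id": (0, 99999)}),
    ("geo", {"zip_code": (1000, 9999)}, {"region_id": (1, 26)}),
    ("book", {"restaurant_id": (0, 99999), "party_size": (1, 10)}, {"booking_id": (0, 10**6)}),
]
known, plan = forward_chain({"zip_code": (1000, 2999)}, services)
print(plan)
print(sorted(known))
```

In this run the hypothetical "book" service is never applied because its input party_size is not available, so the set of known parameters stops growing after two invocations.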
Applying backward-completely matching services creates a directed graph of sets of
required parameters as the order in which different parameters can be applied affects the
set of parameters that still need to be provided.
Several forward-partially matching services can be aggregated together into a composite
service as a software switch that maps each possible combination of parameter values
from the space of available inputs to one or more partially matching services. In order
to be able to fulfil the same functionality as the completely matching service we have to
have for each possible range combination of input parameters one or more services that
can accept those values.
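The coverage condition for a switch can be sketched for the simplest case of one integer input parameter. The restriction to a single dimension, the function names and the example services are assumptions made for illustration; the text describes the general case over combinations of several parameters.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]   # closed integer range [low, high] (assumed type model)


def covers(target: Interval, ranges: List[Interval]) -> bool:
    """True if the union of the given ranges contains every value of target."""
    low, high = target
    next_needed = low                      # smallest value not yet covered
    for range_low, range_high in sorted(ranges):
        if range_low > next_needed:
            return False                   # gap before this range starts
        next_needed = max(next_needed, range_high + 1)
        if next_needed > high:
            return True
    return next_needed > high


def dispatch(value: int, services: Dict[str, Interval]) -> List[str]:
    """The switch itself: the services that accept a concrete run-time value."""
    return [name for name, (low, high) in services.items() if low <= value <= high]


# Hypothetical partially matching services for the zip code parameter.
partial_services = {"romandie": (1000, 2999), "rest_of_country": (3000, 9999)}
query_range = (1000, 9999)

print(covers(query_range, list(partial_services.values())))
print(dispatch(1200, partial_services))
```

Only when the coverage check succeeds can the aggregated switch stand in for a completely matching service; the branch is chosen at run-time from the actual value.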
[Figure 12 labels: query; forward complete matches; forward partial matches; backward complete matches; available parameters; required parameters; switches; branch available parameters; backward required parameter set; service; switch; sub-problem]
Figure 12: The structure of type-compatible service composition problems
Our software switch corresponds to a non-deterministic planning operator in that the
choice point that it introduces will allow for a number of possible service invocation
paths to be followed without commitment at integration time to a particular one. The
choice will be made only at run-time based on the values of the switch input parameters.
Each of the branches in a switch will provide a (possibly different) set of available
parameters. It has to be noted that in order for the switch to be part of a service
integration solution all of the distinct sets of available outputs of the switch will have to
be part of an integration solution. Still, to determine which branches can lead to a
solution, we might have to construct a sub-problem for each pair of branch available
outputs and backward complete required inputs, which we then solve recursively.
Representing types
Service descriptions are a key element for service discovery and service composition
and should enable automated interactions between applications. Currently, different
overlapping formalisms are proposed (for example, [UDDI, 2004], [FIPA, 2003], [OWL-S, 2004], [Ankolekar et al., 2002]) and any single choice could be quite controversial
due to the trade-off between expressiveness and tractability specific to any of the
aforementioned formalisms.
In this paper, we will partially build on existing developments, such as [UDDI, 2004]
[Ankolekar et al., 2002], by considering a simple table-based formalism where each
service is described through a set of tuples mapping service parameters (unique names
of inputs or outputs) to parameter types (the spaces of possible values for a given
parameter). Parameter types can be expressed either as sets of intervals of basic data
types (for example, date/time, integers, floating-points) or as classes of individuals.
Class parameter types can be defined through a descriptive language like XML Schema
[W3C, XML, 2003] or the Web Ontology Language [OWL, 2004]. From the
descriptions we can then derive either directly or by using a description logic classifier a
directed graph (DG) of simple is-a relations.
For efficiency reasons, we represent the DG numerically. We assume that each class
will be represented as a set of intervals. Then we encode each parent-child relation by
sub-dividing each of the intervals of the parent; in the case of multiple parents the child
class will then be represented by the union of the sub-intervals resulting from the
encoding of each of the parent-child relations. Since for a given domain we can have
several parameters represented by intervals, the space of all possible parameter values
can be represented as a rectangular hyperspace, with a dimension for each parameter.
Details concerning the numerical encoding of services can be found in [Constantinescu
and Faltings, 2003].
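One possible reading of this interval encoding is sketched below. It is not the encoding of [Constantinescu and Faltings, 2003], which should be consulted for the actual scheme: the equal-width subdivision, the slice reserved for the parent itself, the half-open intervals and the containment test for is-a are all assumptions made for this example, as are the class names.

```python
from fractions import Fraction
from graphlib import TopologicalSorter   # Python 3.9+
from typing import Dict, List, Tuple

Interval = Tuple[Fraction, Fraction]     # half-open interval [low, high)


def encode(parents: Dict[str, List[str]]) -> Dict[str, List[Interval]]:
    """Assign each class a set of intervals derived from its parents.

    parents maps every class to its direct parents (empty list for a root).
    Each interval of a parent is cut into len(children) + 1 equal slices:
    slice 0 stays with the parent, slice k goes to the k-th child. A class
    with several parents collects the union of the slices it receives.
    """
    children: Dict[str, List[str]] = {c: [] for c in parents}
    for child, direct_parents in parents.items():
        for parent in direct_parents:
            children[parent].append(child)

    intervals: Dict[str, List[Interval]] = {c: [] for c in parents}
    roots = [c for c, direct_parents in parents.items() if not direct_parents]
    for i, root in enumerate(roots):
        intervals[root].append((Fraction(i), Fraction(i + 1)))

    # static_order() yields every class after all of its parents.
    for cls in TopologicalSorter(parents).static_order():
        kids = children[cls]
        for low, high in intervals[cls]:
            width = (high - low) / (len(kids) + 1)
            for k, kid in enumerate(kids, start=1):
                intervals[kid].append((low + k * width, low + (k + 1) * width))
    return intervals


def is_a(sub: str, sup: str, intervals: Dict[str, List[Interval]]) -> bool:
    """True if some interval of sub lies inside some interval of sup."""
    return any(sup_low <= sub_low and sub_high <= sup_high
               for sub_low, sub_high in intervals[sub]
               for sup_low, sup_high in intervals[sup])


hierarchy = {
    "Thing": [],
    "Restaurant": ["Thing"],
    "Shop": ["Thing"],
    "Bakery": ["Restaurant", "Shop"],   # multiple parents
    "Pizzeria": ["Restaurant"],
}
codes = encode(hierarchy)
print(is_a("Bakery", "Shop", codes), is_a("Pizzeria", "Shop", codes))
```

With such an encoding, a subsumption test between classes reduces to numeric interval comparisons, the same operation used for parameters with basic data types.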
2.2.3 Conclusion
In Section 2.2 we have studied the general needs for process level mediation within
Semantic Web Services as well as existing approaches that might serve as a starting
point for the development of the Process Level Mediation Module of the DIP Mediation
Component.
We initially outlined the use of process technologies within Semantic Web Services,
differentiating between Choreography and Orchestration. The former is concerned with
the usage and interaction of Web Services with a complex, multi-step externally visible
behaviour, while the latter is concerned with the composition of several Web Services
into a higher-level functionality.
The mediation requirements for these two notions are very different. Within a Choreography,
one must determine the compatibility of the externally visible behaviours of Web
Services. Therefore, techniques are needed that allow the description of the behaviour
of individual Web Services as well as observing the process between Web Services that
participate in an interaction. For Orchestration, the challenge of process mediation is to
determine the correctness and validity of a Web Service composition.
Both aspects of process level mediation require the formalization of process
descriptions as the basis for automated mediation facilities. The analysis of existing
approaches has shown that process algebras seem to be a proper basis for the
formalization of process descriptions, and that there are some initial approaches that
follow this direction.
In conclusion, we have defined the scope of mediation of processes to be tackled within
the DIP Mediation component, and outlined possible starting points for development.
These have to be generalized, and they have to be combined into a coherent framework
for process level mediation.
3 REQUIREMENT ANALYSIS
Requirements for the Mediation Component originate from two sources in the DIP
project. One is the overall DIP architecture as developed in Work Package 6
[Altenhofen et al., 2004] and the other the case studies developed in Work Packages 8, 9
and 10 ([Hadek et al., 2004], [Davies and Rowlatt, 2004], [Montez et al., 2004]).
3.1 Architectural Requirements for DIP Mediation Component
In this section we outline the architectural requirements as defined in [Altenhofen et al.,
2004].
The goal we want to achieve with Semantic Web Services is the seamless integration of
different services. In order to enable this seamless integration, mediation must be
performed transparently for both the requestor and the provider of a service. Therefore,
the main requirement for the DIP Mediation Component is:
[R1] Transparency: The Mediation Component needs to be transparent for both the
requestor and the provider of a service. For example, the requestor of a service does
not need, and probably does not want, to know the intermediary processes needed for
obtaining a certain service.
In addition to that [Altenhofen et al., 2004] defines two main requirements for the DIP
Mediation Component. It should be:
• Independent of particular execution environments, and
• Decoupled from any other components of the DIP architecture.
The solution suggested for achieving this is to develop the DIP Mediation Component
as a Web Service or a set of Web Services. The following requirements result from the
architectural decisions taken for the overall architecture of DIP.
[R2] The Mediation Component must be available as a Web Service. This will enable
the desired decoupling from the other components and make the component
independent of the execution environment. Furthermore this enables the usage of the
Mediation Component inside existing web service environments, thereby offering a
transition path for gradual adoption of the technology developed in DIP.
As shown in the state-of-the-art section of this document (see Section 2), no
technology is currently available that offers automated mediation support for either the
data level or the process level. Therefore, [Altenhofen et al., 2004] suggests separating the
functionality of the DIP Mediation Component into two sub-components: a run-time
environment and a design-time tool.
[R3] Run-Time environment: There must be a run-time environment that is, given an
existing transformation and either a source and a target data format or a source and a
target process, capable of mediating between them. Note that the functionality to
mediate between different data formats and the functionality to mediate between
processes need to be independent of each other.
[R4] Design-time tool: There must be a design-time tool that assists the user during
the creation of transformations. This tool must enable the user to create the
necessary transformations much more quickly and easily than is possible with
today’s state-of-the-art tools. It must also be possible to create these transformations
with a minimal amount of additional background knowledge. This can be achieved
by developing new approaches and algorithms based, for example, on semantics and
reuse. (For additional requirements on the design-time tool see Section 3.1.2.)
3.1.1 Requirements for the Run-Time Environment
In addition to the requirements the DIP architecture has for the Mediation Component,
there are also a number of requirements that the case studies have elucidated for the
component. These requirements can easily be derived from the input the case studies
gave to the architectural team and result mainly in additional requirements for the
Run-Time environment.
Since at least two of the case studies have to deal with sensitive customer data, one of
the most important requirements derived from the studies for the run-time environment
is the requirement for security.
[R5] Support transport-level security: As messages to and from the Mediation
Component might be sent over the public internet, it is very important to offer
transport-level security. This can be achieved by implementing or using one of the
specifications developed in the Web Service area.
Still, there is one important aspect to take into consideration when dealing with
security: although many algorithms for assuring security already exist (such as
secret-key algorithms and other encryption algorithms), they are known to be time
consuming and thus reduce the efficiency of the mediator. A compromise therefore needs
to be made between security and efficiency.
[R6] Trust: To ensure that sensitive data is only sent to and handled by trusted
partners, a mechanism to establish trust is needed. As we don’t want to limit the
run-time environment to the usage of a particular trust policy, a mechanism to support
different policies is necessary. For a more detailed discussion on trust see
[Altenhofen et al., 2004].
[R7] Data integrity: This requirement actually combines the previous two. By
assuring data integrity, the system assures not only quality in terms of accuracy, but
also security. Once delivered to the mediator system, data must not be the subject of
incorrect mediation, and must not be corrupted by any external factor.
[R8] Auditing: Audit trails are necessary in the run-time environment for two
reasons. Firstly the ability to trace messages and message transformation in the
system is a prerequisite when dealing with sensitive (for example, the e-Government
case study) or mission critical (for example, the VISP case study) information. The
availability of such audit trails might even be a legal obligation in order to enable
the establishment of contracts and so on. Secondly, such audit trails will be needed
to support debugging and error resolution in a highly distributed system like DIP.
In addition to the requirements that originated from the need for security and trust, there
will also be strong requirements on the quality of the mediation service. In none of the
scenarios described in the case studies could incorrect transformations be tolerated at
run-time. This results in the following requirement:
[R9] Transformation quality: At run-time an executed transformation needs to be
correct. In the area of data transformation this translates into the requirement for
achieving precision=1 and recall=1 [Do and Rahm, 2002]. However the
requirements on the transformation quality are different during design time (see
Section 3.1.2).
The proposed DIP architecture will result in a highly distributed system. Therefore the
remaining requirements that result from the DIP architecture are common requirements,
such as scalability and flexibility, that are normally placed on a distributed system.
[R10] Scalability: As it is not possible to estimate the number of different data sources
and clients the run-time environment will have to deal with in a given scenario, it
must be designed to be scalable in both areas. Scalability in the number of different
data sources can only be achieved if the effort for adding new sources is very low.
[R11] Flexibility: Flexibility is needed on two levels in the run-time environment.
Internally the DIP Mediation Component must allow the easy exchange and the
arbitrary combination of the matching algorithms used. This will enable the
integration of newly developed, improved algorithms as well as adjusting the
Mediation Component to specific scenarios by selecting certain algorithms. In
addition the Mediation Component must support different deployment scenarios (for
example, mediation component as part of “own” infrastructure vs. external
mediation component).
[R12] Stackable: We define a stackable run-time environment as one capable of using
other external mediation services. However, this requirement raises another
question: which external mediation services are trusted? Another possible problem
is that a failure in the execution of one of the intermediary mediation services might
lead to a failure of the entire process. Therefore a hierarchy of the quality of external
mediation services needs to be created and recovery methods need to be defined.
3.1.2 Requirements on the Design-Time Tool
As mentioned above, the development of the DIP Mediation Component will be divided
into run-time and design-time parts. In the beginning of the project the runtime will
mainly execute static transformations generated using the design-time tool. But as the
project evolves we hope to be able to dynamically generate transformations at runtime,
possibly from pre-existing building blocks.
The main goal for the design-time tool is to enable the user to create transformations
between different data formats very quickly, with minimum manual effort and with as
much automatic assistance as possible. This results in the following requirements for the
design-time tool.
[R13] Transformation-IDE: As the creation of transformations between either
processes or data formats is a development process, the design-time tool needs to
support the developer during this whole process. This results in the need for the
integration of the creation, debugging and the final deployment of the
transformation.
In order to achieve this IDE-like behaviour several requirements have to be met. The
most important requirement is the quality of the algorithms used to automatically
generate transformations.
[Do and Rahm, 2002] identified the following parameters as useful in determining the
accuracy of a mediator system:
False negatives (A) – matches needed but not automatically discovered
True positives (B) – possible matches correctly identified
False positives (C) – matches falsely proposed by the mediator
Based on these parameters [Do and Rahm, 2002] measured the precision and the recall,
by using the following functions:
$$precision = \frac{|B|}{|B| + |C|}$$
$$recall = \frac{|B|}{|A| + |B|}$$
In the ideal case, precision and recall both have a value of one, but that hardly ever
happens in a semi-automatic tool. Note that the parameters for a given mediation system
may be computed relative to a set of perfect (and most probably manually determined)
mappings.
One problem with the usage of precision and recall for measuring the performance of a
mapping system is that one of the two can easily be maximized at the expense of the
other. Recall, for example, can easily be maximized by returning all possible matches
(that is, the cross product) resulting in very poor precision. Precision can be maximized
by returning a single correct match. Therefore [Do and Rahm, 2002] suggest using
overall as an additional indicator. Overall is defined as a combination of precision and
recall:
$$overall = recall \cdot \left(2 - \frac{1}{precision}\right)$$
These three indicators, or similar ones, should be used to measure the accuracy
offered by a mediation system on test cases.
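The three indicators can be computed from a set of proposed matches and a set of reference matches, as in the following sketch. The function name, the representation of a match as a pair of element names and the example data are assumptions. Overall is evaluated in the algebraically equivalent form (|B| - |C|) / (|A| + |B|), which follows from substituting the definitions of precision and recall and avoids a division by zero when precision is 0.

```python
from typing import Dict, Set, Tuple

Match = Tuple[str, str]   # (source element, target element); assumed representation


def match_quality(proposed: Set[Match], reference: Set[Match]) -> Dict[str, float]:
    """Precision, recall and overall of proposed matches against a reference set.

    A = false negatives, B = true positives, C = false positives.
    precision = |B| / (|B| + |C|)
    recall    = |B| / (|A| + |B|)
    overall   = recall * (2 - 1/precision) = (|B| - |C|) / (|A| + |B|)
    """
    if not reference:
        raise ValueError("the reference set of matches must not be empty")
    a = len(reference - proposed)
    b = len(reference & proposed)
    c = len(proposed - reference)
    precision = b / (b + c) if (b + c) else 0.0
    recall = b / (a + b)
    overall = (b - c) / (a + b)
    return {"precision": precision, "recall": recall, "overall": overall}


reference = {("firstname", "givenName"), ("lastname", "familyName"),
             ("zip", "postalCode"), ("town", "city")}
proposed = {("firstname", "givenName"), ("lastname", "familyName"),
            ("zip", "city")}

quality = match_quality(proposed, reference)
print(quality)
```

Overall becomes negative as soon as there are more false positives than true positives, which reflects the case where removing wrong matches costs the user more effort than the automatic proposals save.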
As well as the quality of the mapping algorithms, the execution time of these algorithms
also needs to be taken into account. A competitive system should provide good results
in a reasonable time.
[R14] Quality of the “online” algorithms: Although the algorithms used to generate
transformations will not be able to generate perfect transformations, they must be
precise enough (with respect to the defined measures) to enable the user to generate
correct transformations in a reasonable time. What exactly “precise enough”,
“sufficient recall”, and “reasonable time” mean will have to be investigated during
the project. The quality requirements will most likely be different in the area of
process mediation and the area of data mediation.
[R15] Error and consistency checking: Dynamic error and consistency checking is
very important for assuring good transformation quality. However, as specified in
requirement [R9], transformations must be correct at run-time. Therefore this
functionality is needed in the design-time tool to support the user during the creation
of transformations. This will result in high quality transformations and reduce the
need for debugging.
[R16] Graphical User interface: In order to enable the easy creation of necessary
transformations a graphical user interface is needed. This GUI must support the user
as much as possible during the generation of the transformations. Visual indications
are needed to show which parts of a message or a process are already transformed,
which parts of this transformation have been generated automatically or manually,
how high the confidence in the automatically generated results is, and so on.
As it might not be possible to provide a fully integrated tool right from the beginning,
two tools – one for creating data transformations and one for creating process
transformations – would also be suitable.
A very special requirement results from the e-Government case study. All solutions
used in the future in any UK government institution must conform to the e-Government
Interoperability Framework (eGIF). The eGIF states that: “XSL has to be used for data
transformation”. As we do not want to be restricted to the use of XSL as a
transformation language, this results in the following requirement:
[R17] XSLT Export: The tool used to create data transformations must be capable of
exporting the created transformations as an XSL script.
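As an illustration of what such an export could look like for the simplest kind of mapping, the sketch below generates an XSLT 1.0 stylesheet that copies a document unchanged except for renaming elements. The function name and the mapping format are assumptions, element names are assumed to be valid un-namespaced XML names, and the sketch only checks that the generated stylesheet is well-formed XML; it does not execute it.

```python
import xml.etree.ElementTree as ET
from typing import Dict

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def export_rename_xslt(renames: Dict[str, str]) -> str:
    """Build an XSLT 1.0 stylesheet: identity copy plus one rename template per entry."""
    lines = [
        f'<xsl:stylesheet version="1.0" xmlns:xsl="{XSL_NS}">',
        # Identity template: copy everything that has no more specific rule.
        '  <xsl:template match="@*|node()">',
        '    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>',
        '  </xsl:template>',
    ]
    for source_name, target_name in renames.items():
        lines += [
            f'  <xsl:template match="{source_name}">',
            f'    <{target_name}><xsl:apply-templates select="@*|node()"/></{target_name}>',
            '  </xsl:template>',
        ]
    lines.append('</xsl:stylesheet>')
    return "\n".join(lines)


stylesheet = export_rename_xslt({"firstname": "givenName", "lastname": "familyName"})
parsed = ET.fromstring(stylesheet)            # raises ParseError if not well-formed
template_count = len(parsed.findall(f"{{{XSL_NS}}}template"))
print(stylesheet)
```

The resulting text can be handed to any XSLT 1.0 processor; more complex mappings than one-to-one renames would of course need richer templates.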
The possibility to export the XSLT scripts will also enable the easy integration of our
mediation technology into existing EAI [24] systems (for example, SAP Exchange
Infrastructure) as most of these systems are capable of executing XSLT scripts. As a
result this might help the adoption of semantic web technology in general.
[24] Enterprise Application Integration: This term is generally used for systems enabling the interoperability of enterprise applications that would otherwise not be able to communicate.
3.2 Requirements for Data Level Mediation
The difficulty of solving a given data level mediation problem heavily depends on the
types of mismatches between the two data instances at hand. These mismatches can
range from simple naming conflicts of XML elements (for example, <firstname> versus
<givenName>) to complex mismatches (for example, objects of type “Human” with the
attribute gender versus objects of type “Man” and “Woman”). As different algorithms
will be needed to resolve different types of mismatches a classification of possible
mismatches is necessary.
[R18] Classification of Mismatches between Data Instances: The semantics of data
instances is described using ontologies. In order to abstract from syntactical
mismatches introduced by a specific data format we will need to classify
mismatches between ontologies that describe data instances. This classification will
then be used by different algorithms to resolve these mismatches.
[R19] Algorithms for resolving Mismatches between Data Instances: There must be a
library of algorithms that are capable of resolving a subset of the classified
mismatches.
As described in requirement [R18] we want to solve the data level mediation problem,
not on the syntactical but on the semantic level, as we strongly believe that this
approach will achieve better results. An abstract view on how to mediate between
business data on different semantic levels is given in Figure 13.
Figure 13: Business data mediation on the ontology level
In order to enable this approach, we need functionality to lift data from the syntactical to
the semantic level as well as functionality to drop data back down to the syntax level
after the mediation has been performed.
[R20] Lifting mechanism: A mechanism for lifting data from the syntax level (for
example, XML) to the semantic level is needed.
[R21] Dropping mechanism: A mechanism for dropping data from the semantic level
down to the syntax level is needed.
In addition to these mechanisms a formalism to specify transformations between
ontologies is necessary. This language will be used to describe how an instance of one
ontology can be transformed into an instance of another ontology. Such a language is
needed for two reasons. Firstly, it enables the storage of the transformations
independent of the implementation of the run-time environment, and, secondly, such a
machine understandable language is the key to allowing reuse of mappings.
[R22] Formalism to specify transformations: It is necessary to develop a machine
understandable language, which can be used to express transformations between
data on the ontology levels.
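The interplay of lifting, a declarative mapping and dropping can be sketched as follows. This is a deliberately small illustration under stated assumptions: ontology instances are modelled as plain dictionaries, the mapping is a simple attribute renaming table, and the concept, element and function names are hypothetical. A real formalism would have to cover far more than one-to-one attribute mappings.

```python
import xml.etree.ElementTree as ET
from typing import Dict

Instance = Dict[str, object]   # {"concept": name, "attributes": {attribute: value}}


def lift(xml_text: str, concept: str) -> Instance:
    """Lift a flat XML message to an instance of the given concept."""
    root = ET.fromstring(xml_text)
    return {"concept": concept,
            "attributes": {child.tag: child.text for child in root}}


def mediate(instance: Instance, target_concept: str,
            attribute_map: Dict[str, str]) -> Instance:
    """Apply a declarative mapping (stored as data, not as program code)."""
    attributes = instance["attributes"]
    return {"concept": target_concept,
            "attributes": {attribute_map[name]: value
                           for name, value in attributes.items()
                           if name in attribute_map}}


def drop(instance: Instance, root_tag: str) -> str:
    """Drop an instance back down to the XML syntax level."""
    root = ET.Element(root_tag)
    for name, value in instance["attributes"].items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


source_xml = "<person><firstname>Ada</firstname><lastname>Lovelace</lastname></person>"
person_to_contact = {"firstname": "givenName", "lastname": "familyName"}

lifted = lift(source_xml, "Person")
mediated = mediate(lifted, "Contact", person_to_contact)
target_xml = drop(mediated, "contact")
print(target_xml)
```

Because the mapping is held as data rather than as code, it can be stored independently of the run-time environment and inspected or reused by other tools.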
It is important to note, that standard transformation languages like, for example, XSLT
are not suitable in this setting. Although they provide an implementation neutral format
to store transformations they are basically programming languages. The semantics of a
program written in such a language is not easily understandable, even for a human.
[R23] Language to describe instances: In addition to a formalism to specify
transformations we also need a language that describes the instances of either
processes or data that need to be transformed.
Although we want to tackle the data transformation problem on a semantic level it is
important to recognize that there exists a certain class of problems that cannot be solved
at this semantic level. An example of such a problem is the Identity Problem. Consider
for example that in both ontologies there exists a concept CITY. In each of them there
also exists an instance of the concept, but in the first ontology the name of this instance
is “Insbruck” while in the other it is “Innsbruck”. This kind of problem cannot be solved
on a semantic level only.
[R24] Syntax level mediation: To solve mediation problems similar to the one
described in the previous paragraph, the Mediation Component must enable mediation
on the syntax level when needed.
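One simple form of syntax level mediation for the identity problem is approximate string comparison. The sketch below uses the standard library class difflib.SequenceMatcher; the function name and the threshold of 0.9 are arbitrary assumptions and would have to be tuned on real data.

```python
from difflib import SequenceMatcher


def probably_same_individual(name_a: str, name_b: str, threshold: float = 0.9) -> bool:
    """Guess whether two instance names denote the same individual.

    Compares the lower-cased names character by character and accepts
    them as equal when the similarity ratio reaches the threshold.
    """
    ratio = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
    return ratio >= threshold


print(probably_same_individual("Insbruck", "Innsbruck"))
print(probably_same_individual("Salzburg", "Innsbruck"))
```

Such a purely syntactic test can only propose candidates: two different cities may have similar names, so a proposed identity would still need confirmation at design time or from additional attributes.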
3.3 Requirements for Process Level Mediation
The overall objective for the DIP technology for process level mediation is to define a
suitable technology for mediation of process definitions within Semantic Web Services.
As outlined above, this component should provide mediation facilities for process
technologies applied within Choreography and Orchestration, as the overall notions
within behavioural description of Web Services. The following summarizes the
requirements for the Process Level Mediation Component.
[R25] Integration with DIP Mediation Component: regarding the architectural
construction, the Process Level Mediation Module has to be integrated into the
design of the overall DIP Mediation Component. It should also provide design-time
tool support, as well as a run-time environment. Therefore, the same requirements
hold as defined above.
[R26] Conformability with DIP technology: in order to provide mediation support for
Web Services, the mediation technology for processes has to conform to the
technologies and languages to be used within DIP for representing process
definitions.
Regarding the technological realization of the Process Mediation Facility, the following
requirements arise.
[R27] General Requirements on Process Level Mediation technology: identification
and specification of the building blocks for the process mediation facility, which
are:
• Process Representation Language
• Formalization and Algebra
• Classification of mismatches between Processes
• Mechanism(s) for resolving (a subset of) the mismatches
[R28] Mediation Support for Choreography and Orchestration: Within the section on
State-of-the-Art Analysis for Mediation of Processes (see Section 2.2), we have
determined Choreography and Orchestration as the two fields where process
technologies are applied within Semantic Web Services. These notions have very
different requirements for mediation, as outlined below.
[R29] Choreography Mediation Requirements: Choreography is concerned with usage
and interaction of Web Services that have complex, multi-step externally visible
behaviours for communication with a service requester. The requirements for
Choreography description are to provide a technique for describing local behaviours
as well as global interaction models, which is to be based on a sound formal
foundation. The requirements for Choreography mediation are:
• Ability to observe interaction processes
• Classification of possibly occurring mismatches in interactions
• Specification of a language for “mapping rules” to resolve mismatches
[R30] Orchestration Mediation Requirements: Orchestration is concerned with the
composition of several Web Services into a higher level functionality, which is then
the functionality of another Web Service. Therefore, a suitable Orchestration
description language is needed, which allows defining the decomposition of
functionalities into sub-functionalities. The requirements for a suitable Orchestration
mediation technology are:
• Ability to determine correctness and validity of Web Service compositions in an Orchestration
• Classification of possibly occurring mismatches in Web Service compositions
• Specification of a language for “mapping rules” to resolve mismatches
Regarding application of existing technologies and approaches that can serve as a basis
for the process level mediation technology, we have investigated the most relevant ones
within the State of the Art analysis on mediation of processes in Section 2.2.
[R31] Formal Representation of Processes: as examined throughout the document, the
prerequisite for the required mediation facilities for Choreography and Orchestration
is a formalization of process representations. Therefore, a suitable approach on the basis
of existing formal languages for processes (see Section 2.2.2.2) has to be developed.
[R32] Dependency on DIP Process Representation Language: The foundation of the
Process Level Mediation Component is the process representation language to be
used or developed within DIP. In order to provide suitable mediation support, this
language has to be supported by the Process Level Mediation Module. Thus, there is
a strong interrelation with the DIP Deliverable D3.4 wherein the “business process
and protocol ontology” for DIP is defined.
Further requirements and design decisions for the Process Level Mediation Module will
be investigated in detail in DIP Deliverable D5.3 “Process Level Mediation Module
Specification”.
4 CONCLUSIONS
This deliverable has studied existing mediation systems and technologies (Section 2,
State-of-the-Art Analysis) and derived requirements for the development of the DIP
Mediation Component (Section 3, Requirements Analysis).
For a better illustration of the mediation problem, the mediation of processes was
studied separately from the mediation of data and information. The purpose of this
separation was to underline the complexity of this problem, and also to emphasize the
differences between these two apparently similar problems.
The requirements were analysed from three perspectives: the DIP Mediation Component
(covering the requirements for the design-time tool and for the run-time environment),
data mediation, and process level mediation. To obtain a
more complete set of requirements, we also based our analysis on the input provided
by Work Packages 8, 9 and 10, which deal with the three case studies (Virtual
Internet Service Providers, eGovernment and eBanking).
The results achieved within this deliverable will be addressed and further elaborated in
the following deliverables:
- D5.2: Business data level mediation module specification – for this deliverable
the parts concerning data and information mediation will be of further use
- D5.3: Business Process Level Mediation Module specification – the overview of
the current state of the art and the requirements analysis for process mediation
provided by this document serve as guidelines for elaborating the Process Level
Mediation Module specification
- D5.4: Business data and process mediation module prototype – this deliverable
will be based on this document and the other two deliverables listed here. Special
attention should be paid here to the requirements for the DIP Mediation
Component and for the design-time tool.
As a general conclusion, this deliverable has defined the scope of data,
information and process mediation within the DIP Mediation Component and
outlined possible starting points for development, which should be combined
with the results obtained in other work packages.
5 REFERENCES
[Altenhofen et al., 2004]
M. Altenhofen, M. Hauswirth, V. Kirov, A. Kiryakov, C. Mack, J. Quantz, and R.
Schmidt: Report on requirements analysis and state-of-the-art, Deliverable 6.1, DIP
Project, 2004.
[Ankolekar et al., 2002]
A. Ankolekar, M. Burstein, J.R. Hobbs, O. Lassila, D. Martin, D. McDermott, S.A.
McIlraith, S. Narayanan, M. Paolucci, T. Payne, and K. Sycara: DAML-S: Web
service description for the Semantic Web, Lecture Notes in Computer Science, vol.
2342, 2002.
[Arkin et al., 2002]
A. Arkin, S. Askary, S. Fordin, W. Jekeli, K. Kawaguchi, D. Orchard, S. Pogliani,
K. Riemer, S. Struble, P. Takacsi-Nagy, I. Trickovic, and S. Zimek: Web Service
Choreography Interface (WSCI) 1.0. W3C Note 8 August 2002, available at:
http://www.w3.org/TR/wsci/, 2002.
[Baeten, 2003]
J.C.M. Baeten: Brief History of Process Algebra. Technische Universiteit
Eindhoven, 2003.
[Bergstra and Klop, 1984]
J.A. Bergstra and J.W. Klop: Process algebra for synchronous communication. In:
Information and Control, 60(1/3):109–137, 1984.
[Bergstra et al., 2001]
J.A. Bergstra, A. Ponse, and S.A. Smolka (eds.): Handbook of Process Algebra.
Amsterdam: Elsevier, 2001.
[Booth et al., 2004]
D. Booth, H. Haas, F. McCabe, E. Newcomer, I.M. Champion, C. Ferris, and D.
Orchard (eds): Web Services Architecture, W3C Working Group Note 11 February
2004, available at http://www.w3.org/TR/2004/NOTE-ws-arch-20040211/, 2004.
[BPML, 2003]
Business Process Modeling Language, http://xml.coverpages.org/bpml.html, 2003.
[Bracciali et al., 2002]
A. Bracciali, A. Brogi, and C. Canal: A formal approach to component adaptation.
In: Component deployment, LNCS 2370, pp. 185-199. Springer, 2002.
[Brogi et al., 2004]
A. Brogi, C. Canal, E. Pimentel, and A. Vallecillo: Formalizing Web Service
Choreographies. In Proceedings of First International Workshop on Web Services
and Formal Methods, Pisa, Italy, February 2004. To appear in ENTCS, 2004.
[Bussler, 2003]
C. Bussler: B2B Integration. Berlin, Heidelberg: Springer, 2003.
[Constantinescu and Faltings, 2002]
I. Constantinescu and B. Faltings: Behavioural Description Formalisms for Service
Integration - Survey and Comparation. Technical Report No. 200224, Swiss
Federal Institute of Technology (EPFL), Lausanne (Switzerland), 2002.
[Constantinescu and Faltings, 2003]
I. Constantinescu and B. Faltings: Efficient matchmaking and directory Services. In
The 2003 IEEE/WIC International Conference on Web Intelligence, 2003.
[Constantinescu et al., 2004]
I. Constantinescu, B. Faltings, and W. Binder: Large scale, type-compatible service
composition. In IEEE International Conference on Web Services (ICWS-2004), San
Diego, CA, USA, July 2004.
[Crubezy et al., 2002]
M. Crubezy, E. Motta, W. Lu, and M. Musen: Configuring Online Problem-Solving
Resources with the Internet Reasoning Service. IEEE Intelligent Systems 2002.
[Curbera et al., 2002]
F. Curbera, Y. Goland, J. Klein, F. Leymann, D. Roller, S. Thatte, and S.
Weerawarana: Business Process Execution Language For Web Services, BEA
Systems & IBM Corporation & Microsoft Corporation, 2002.
[DAML-S, 2004]
DAML Services, http://www.daml.org/services, 2004.
[Davies and Rowlatt, 2004]
R. Davies and M. Rowlatt: Analysis Report: e-Government Business Needs,
Deliverable 9.1, DIP Project, 2004.
[Davulcu et al., 2004]
H. Davulcu, M. Kifer, and I.V. Ramakrishnan: CTR-S: A Logic for Specifying
Contracts in Semantic Web Services. In: Proceedings of the Alternate Tracks of the
13th World Wide Web Conference 2004, New York, pp. 144-153, 2004.
[Do and Rahm, 2002]
H.-H. Do and E. Rahm: COMA - a system for flexible combination of schema
matching approaches. In Proceedings 28th VLDB Conference, 2002.
[Doan et al., 2002]
AH. Doan, J. Madhavan, P. Domingos, and A. Halevy: Learning to map between
Ontologies on the Semantic Web. In WWW2002, 2002.
[Eck et al., 2001]
P. van Eck, J. Engelfriet, D. Fensel, F. van Harmelen, Y. Venema, and M. Willems:
A Survey of Languages for Specifying Dynamics: A Knowledge Engineering
Perspective. IEEE Transactions on Knowledge and Data Engineering, 13(3):462-496, May/June 2001.
[Fensel and Bussler, 2002]
D. Fensel and C. Bussler: The Web Service Modeling Framework WSMF.
Electronic Commerce Research and Applications, 1(2), 2002.
[Fensel and Motta, 2001]
D. Fensel and E. Motta: Structured Development of Problem Solving Methods.
IEEE Transactions on Knowledge and Data Engineering, vol. 13, pp. 913-932,
2001.
[FIPA, 2003]
Foundation for Intelligent Physical Agents Web Site, http://www.fipa.org/, 2003.
[Fu et al., 2004]
X. Fu, T. Bultan, and J. Su: Analysis of Interacting BPEL Web Services. In:
Proceedings of the 13th World Wide Web Conference 2004, New York, pp. 621-630, 2004.
[Glabbeeck, 1997]
R.J. van Glabbeek: Notes on the methodology of CCS and CSP. In: Theoretical
Computer Science 177 (1997), pp. 329-349, 1997.
[Gruber, 1993]
T.R. Gruber: A translation approach to portable ontologies. Knowledge
Acquisition, 5(2):199-220, 1993.
[Goh et al., 1999]
C.H. Goh, S. Bressan, S. Madnick, and M. Siegel: Context interchange: New
features and formalisms for the intelligent integration of information. ACM
Transaction on Information Systems, 17(3):270-290, 1999.
[Gurevich, 1994]
Y. Gurevich: Evolving Algebras 1993: Lipari Guide. In E. Börger (ed):
Specification and Validation Methods, Oxford (GB): Oxford University Press,
1994.
[Hadek et al., 2004]
T. Hadek, M. Isop, C. Mack, A. Duke, K. Niederacher, and A. Wahler: Analysis
Report: VISP Business Needs, Deliverable 8.1, DIP Project, 2004.
[Harel, 1987]
D. Harel: Statecharts: A visual formalism for complex systems. Science of
Computer Programming, 8, pp. 231-274, 1987.
[Hoare, 1978]
C.A.R. Hoare: Communicating sequential processes. In: Communications of the
ACM, 21(8):666–677, 1978.
[Jones, 1998]
D. Jones: Developing shared ontologies in multi-agent systems. In ECAI’98
Workshop on Intelligent Information Integration, Brighton, U.K, 1998.
[Kavantzas et al, 2004]
N. Kavantzas, D. Burdett, and G. Ritzinger: Web Services Choreography Language
Version 1.0. W3C Working Draft, 27 April 2004.
[Keller and de Brujin]
U. Keller and J. de Bruijn: Language Evaluation and Comparison. WSMO
Deliverable D8, available at: http://www.wsmo.org, 2004.
[Kushmerick et al., 1995]
N. Kushmerick, S. Hanks, and D.S. Weld: An algorithm for probabilistic planning,
Artificial Intelligence, vol. 76, no. 1-2, pp. 239-286, 1995.
[Lara et al., 2003]
R. Lara, H. Lausen, S. Arroyo, J. de Bruijn, and D. Fensel: Semantic Web Services.
Description Requirements and Current Technologies. In Proceedings of the
Semantic Web Services for Enterprise Application Integration and E-Commerce
workshop, at the Fifth International Conference on Electronic Commerce (ICEC
2003), Pittsburgh, 1-3 October, 2003.
[Larson et al., 1989]
J.A. Larson, S.B. Navathe, and R. Elmasri: A theory of attribute equivalence in
databases with application to schema integration. IEEE Transactions on Software
Engineering, 15(4):449-463, 1989.
[Lenat, 1995]
D.B. Lenat: CYC: A large-scale investment in knowledge infrastructure.
Communications of the ACM, 38(11):33-38, 1995.
[Leymann, 2001]
F. Leymann: Web Service Flow Language 1.0. IBM Software Group 2001.
Available at: www-4.ibm.com/software/solutions/webservices/pdf/WSFL.pdf, 2001.
[Li and Horrocks., 2003]
L. Li and I. Horrocks: A software framework for matchmaking based on semantic
web technology, In Proceedings of the 12th International Conference on the World
Wide Web, 2003.
[Madhavan et al., 2002]
J. Madhavan, P.A. Bernstein, P. Domingos, and A. Halevy: Representing and
reasoning about mappings between domain models. Eighteenth National Conference
on Artificial Intelligence, pp. 80-86, Edmonton, Alberta, Canada, July 28-August 1,
2002.
[McCarthy and Hyes, 1969]
J. McCarthy and P.J. Hayes: Some Philosophical Problems from the Standpoint of
Artificial Intelligence. In B. Meltzer and D. Michie (eds.): Machine Intelligence 4,
pages 463 - 502. Edinburgh University Press, 1969.
[McDermott, 1998]
D. McDermott: The planning domain definition language manual. Yale Computer
Science, Tech. Rep. 1165, 1998.
[McIlraith et al., 2001]
S. McIlraith, T. Son, and H. Zeng: Mobilizing the semantic web with daml-enabled
web services. In Proceedings Second International Workshop on the Semantic Web
(SemWeb-2001), Hong Kong, China, May 2001.
[McIlraith and Son, 2002]
S. A. McIlraith and T. C. Son: Adapting Golog for composition of semantic web
services. In Proceedings of the 8th International Conference on Principles of
Knowledge Representation and Reasoning (KR-02), D. Fensel, F. Giunchiglia, D.
McGuinness, and M.-A. Williams (eds.), San Francisco, CA: Morgan Kaufmann
Publishers, pp. 482-496, 2002.
[Mena et al., 2000]
E. Mena, A. Illarramendi, V. Kashyap, and A. Sheth: OBSERVER: An Approach
for Query Processing in Global Information Systems Based on Interoperation
across Pre-existing Ontologies. Distributed and Parallel Databases, 8(2):223-271,
2000.
[Milner, 1980]
R. Milner: A Calculus of Communicating Systems. Number 92 in Lecture Notes in
Computer Science. Springer Verlag, 1980.
[Milner, 1991]
R. Milner: The Polyadic π-Calculus: a Tutorial. Edinburgh, 1991.
[Montez et al., 2004]
M.M. Montes, J.L. Bas, S. Bellido, J.M. López, S. Losada, and R. Benjamins:
Analysis Report on eBanking Business Needs, Deliverable 9.1, DIP Project, 2004.
[Motta et al., 2003]
E. Motta, J. Domingue, L. Cabral, and M. Gaspari: IRS-II: A Framework and
Infrastructure for Semantic Web Services. 2nd International Semantic Web
Conference (ISWC2003), Sundial Resort, Sanibel Island, Florida, USA,
20-23 October 2003.
[MS BizTalk, 2004]
Microsoft BizTalk Server, http://www.biztalk.org, 2004.
[Narayanan and McIlraith, 2003]
S. Narayanan and S. McIlraith: Analysis and simulation of Web Services.
Computer Networks 42(5):675-693, 2003.
[Noy and Munsen, 2000]
N.F. Noy and M.A. Musen: PROMPT. Algorithm and Tool for Automated
Ontology Merging and Alignment. In Proceedings of the 17th National Conference
on Artificial Intelligence (AAAI-2000). Menlo Park (California): AAAI/MIT Press,
2000.
[Omelayenko et al., 2003]
B. Omelayenko, M. Crubezy, D. Fensel, R. Benjamins, B. Wielinga, E. Motta, M.
Musen, and Y. Ding: UPML: The language and Tool Support for Making the
Semantic Web Alive. In D. Fensel et al. (eds.): Spinning the Semantic Web:
Bringing the WWW to its Full Potential. MIT Press, pp. 141–170, 2003.
[OWL, 2004]
OWL web ontology language 1.0 reference, http://www.w3.org/tr/owl-ref/, 2004.
[OWL-S, 2004]
The OWL Services Coalition: OWL-S: Semantic Markup for Web Services,
version 1.0 available at http://www.daml.org/services/owl-s/1.0/owl-s.pdf, 2004.
[Paolucci et al., 2002]
M. Paolucci, T. Kawamura, T.R. Payne, and K. Sycara: Semantic matching of web
services capabilities, In Proceedings of the 1st International Semantic Web
Conference (ISWC), 2002.
[Papakonstantinou et al., 1996]
Y. Papakonstantinou, H. Garcia-Molina, and J. Ullman: MedMaker: A Mediation
System Based on Declarative Specifications. In Proceedings of the International
Conference on Data Engineering (ICDE 96), pp. 132-141, 1996.
[Pednault, 1989]
E.P.D. Pednault: Adl: Exploring the middle ground between strips and the situation
calculus. In Proceedings of the First International Conference on Principles of
Knowledge Representation and Reasoning (KR'89), Morgan Kaufmann Publishers,
pp. 324-332, 1989.
[Peltz, 2003]
C. Peltz: Web Service Orchestration. A Review of emerging technologies, tools, and
standards. Hewlett Packard, CO., January 2003.
[Popa et al., 2002]
L. Popa, M.A. Hernandez, Y. Velegrakis, R.J. Miller, F. Naumann, and H. Ho:
Mapping XML and Relational Schemas with Clio, Demo, In International
Conference on Data Engineering, 2002.
[Rahm and Bernstein, 2001]
E. Rahm and P.A. Bernstein: A survey of approaches to automatic schema
matching. In VLDB Journal: Very Large Data Bases, 10(4):334-350, 2001.
[Reiter, 2001]
R. Reiter: Knowledge in Action: Logical Foundations for Specifying and
Implementing Dynamical Systems. Boston: MIT Press 2001.
[Roman et al., 2004]
D. Roman, H. Lausen, and U. Keller (eds.): Web Service Modeling Ontology Standard
(WSMO Standard), version 0.1 available at
http://www.wsmo.org/2004/d2/v0.3/20040329/, 2004.
[SAP XI, 2004]
SAP Exchange Infrastructure, http://www.sap.com/xi, 2004.
[S BIS, 2004]
Seeburger Business Integration Server, http://www.seeburger.de, 2004.
[Schlenoff et al, 2000]
C. Schlenoff, M. Gruninger, M. Ciocoiu, and J. Lee: The essence of the Process
Specification Language. In: Transactions of the Society for Computer Simulation
International, 16(4):204-216, 2000.
[Singh and Huhns, 2004]
M. Singh and M.N. Huhns: Service-Oriented Computing: Semantics, Transactions,
Agents. Wiley, 2004 [to appear].
[Solanki and Abela, 2003]
M. Solanki and C. Abela: The Landscape of Markup Languages for Web Service
Composition, May 2003.
[Thakkar et al., 2002]
S. Thakkar, C.A. Knoblock, J.L. Ambite, and C. Shahabi: Dynamically composing
web services from on-line sources, In Proceedings of the AAAI-2002 Workshop on
Intelligent Service Integration, Edmonton, Alberta, Canada, pp. 1-7, July 2002.
[UDDI, 2004]
UDDI, Universal Description, Discovery and Integration Web Site,
http://www.uddi.org/, 2004.
[Visser et al., 1999]
P.R. Visser, D.M. Jones, M. Beer, T. Bench-Capon, B. Diaz, and M. Shave:
Resolving ontological heterogeneity in the KRAFT project. In 10th International
Conference and Workshop on Database and Expert Systems Applications
DEXA'99. University of Florence, Italy, 1999.
[Wache and Fensel, 2000]
H. Wache and D. Fensel: Special issue of the International Journal of Cooperative
Information Systems on Intelligent Information Integration, 9(4), 2000.
[Wiederhold, 1992]
G. Wiederhold: Mediators in the architecture of the future information systems.
Computer, 25(3):38-49, 1992.
[Wu et al., 2003]
D. Wu, B. Parsia, E. Sirin, J. Hendler, and D. Nau: Automating DAML-S web
services composition using SHOP2. In Proceedings of 2nd International Semantic
Web Conference (ISWC2003), Sanibel Island, Florida, October 2003.
[Wohed et al, 2003]
P. Wohed, W.M.P. van der Aalst, M. Dumas, and A.H.M. ter Hofstede: Analysis
of Web Services Composition Languages: The Case of BPEL4WS. ER 2003: 200-215, 2003.
[W3C, WSDL, 2004]
W3C, Web Services Description Language (WSDL) version 1.2,
http://www.w3.org/tr/wsdl12, 2004.
[W3C, XML, 2003]
W3C, XML Schema, http://www.w3.org/xml/schema, 2003.
[XML, 2001]
XML Schema Part 2: Datatypes, http://www.w3.org/tr/xmlschema-2/, 2001.
[W3C, XSLT, 1999]
W3C, XSL Transformations, http://www.w3.org/TR/xslt/ , 1999.
[Yerneni et al., 1999]
R. Yerneni, C. Li, H. Garcia-Molina, and J. Ullman: Computing Capabilities of
Mediators. In Proceedings of ACM SIGMOD, Philadelphia, 1999.