RSSM: A Rough Sets based Service Matchmaking Algorithm

advertisement
RSSM: A Rough Sets based Service Matchmaking Algorithm
Bin Yu and Maozhen Li
Electronic and Computer Engineering, School of Engineering and Design
Brunel University, Uxbridge, UB8 3PH, UK
{Bin.Yu, Maozhen.Li}@brunel.ac.uk
notably UDDIe [6] and UDDI-MT [7]. UDDIe extends
UDDI with user-defined properties such as service
leasing,
service
access
cost,
performance
characteristics, or usage index associated with a
service. UDDI-MT extends UDDI in a more flexible
way allowing metadata to be attached to various
entities associated with a service. The work of UDDIMT has been utilised in the myGrid project [8] to
facilitate Grid service discovery with semantic
descriptions [9]. There is also current work in
implementing a service registry based on extensions to
UDDI, called GRIMOIRES [10].
With the development of Semantic Web [11],
services can be annotated with metadata for
enhancement of service discovery. One key enabling
technology to facilitate service annotation and
matching is OWL-S [12], an OWL [13] based ontology
for encoding properties of Web services. OWL-S
ontology defines a service profile for encoding a
service description, a service model for specifying the
behavior of a service, and a service grounding for how
to invoke the service. Typically, a service discovery
process involves a matching between the profile of a
service advertisement and the profile of a service
request using domain ontologies described in OWL.
The service profile not only describes the functional
properties of a service such as its inputs, outputs, preconditions, and effects (IOPEs), but also nonfunctional features including service name, service
category, and aspects related to the quality of a service.
Paolucci et al. present an algorithm for matching
OWL-S services [14]. This work has been extended in
various way for service discovery, e.g. Jaeger et al.
[15] introduce “contravariance” in matching inputs and
outputs between service advertisements and service
requests using OWL-S, Li et al. [16] introduce a
“intersection” relationship between a service
advertisement and a service request, Majithia et al. [17]
introduce reputation metrics in matching services.
Besides OWL-S, another prominent effort to
enhance service description and discovery is WSMO
Abstract
The past few years have seen the Grid is evolving as
a service-oriented computing infrastructure. It is
envisioned that various resources in a future Grid
environment will be exposed as services. Service
discovery becomes an issue of vital importance for
utilising Grid facilities. This paper presents RSSM, a
Rough Sets based service matchmaking algorithm for
service discovery that can deal with uncertainty of
service
properties
when
matching
service
advertisements with service requests. The evaluation
results show that the RSSM algorithm is more effective
in service discovery compared with other mechanisms
such as UDDI and OWL-S.
1. Introduction
The past few years have seen the Grid [1, 2] is
evolving
as
a
service-oriented
computing
infrastructure. Open Grid Services Architecture
(OGSA) [3], a service-oriented architecture promoted
by the Global Grid Forum (GGF) [23], has facilitated
the evolution. It is expected that Web Services
Resource Framework (WSRF) [4] will be acting as an
enabling technology to drive this further. It is
envisioned that a future Grid environment may
therefore host a large number of services exposing
resources such as applications, software libraries,
CPUs, disk storage, network links, instruments and
visualisation devices. Service discovery becomes an
issue of vital importance for utilising Grid facilities.
UDDI [5] has been proposed as an industry standard
for Web service publication and discovery. However,
the search mechanism supported by UDDI is limited to
keyword matches and does not support any inference
based on the taxonomies. To enhance service
discovery, UDDI has been extended in such a way that
service descriptions are attached with metadata,
1
dispensable properties which will be marked in
advertised domain services (step 7). Invoked by the
Dependent Property Reduction component (step 8), the
Service Matching and Ranking component accesses the
advertised domain services for service matching and
ranking (step 9), and finally it produces a list of
matched services (step 10).
[18], which is built on four key concepts – ontologies,
standard WSDL based Web services, goals, and
mediators. WSMO stresses the role of a mediator in
order to support interoperation between Web services.
WSMO introduces mediators aiming to support distinct
ontologies employed by service requests and service
advertisements.
The above-mentioned methods facilitate service
discovery in some way. However, when matching
service advertisements with service requests, these
methods assume that service advertisements and
service requests use consistent properties to describe
relevant services. This is not always true for a largescale Grid system where service publishers and
requestors may use their pre-defined properties to
describe services. Some properties used in a service
advertisement may not be used by a service request
when searching for services. Therefore, one
challenging work in service discovery is that service
matchmaking should be able to deal with uncertainty
of service properties when matching service
advertisements with service requests.
In this paper, we present RSSM, a Rough Sets [19]
based service matchmaking algorithm for service
discovery that can deal with uncertainty of service
properties. Experiment results show that the algorithm
is more effective for service matchmaking than UDDI
and OWL-S mechanisms.
The remainder of this paper is organised as follows.
Section 2 presents in depth the design of the RSSM
matchmaking algorithm. Section 3 compares the
RSSM with UDDI and OWL-S mechanisms in terms
of effectiveness and overhead in service discovery.
Section 4 concludes this paper and presents future
work.
Figure 1. RSSM components.
In the following sections, we describe in depth the
design of RSSM components. Firstly, we introduce
Rough Sets for service discovery.
2.1 Service Discovery with Rough Sets
Rough sets theory is a mathematic tool for
knowledge discovery in databases. It is based on the
concept of an upper and a lower approximation of a set
as shown in Figure 2. For a given set X , the yellow
grids (lighter shading) represent its upper
approximation, and the green grids (darker shading)
represent its lower approximation.
2. Algorithm Design
RSSM considers input and output properties
individually when matching services. For the
simplicity of expression, input and output properties
used in a service request are generally referred to as
service request properties. The same goes to service
advertisements.
Figure 1 shows RSSM components for service
matchmaking. The Irrelevant Property Reduction
component takes a service request as an input (step 1),
and then it accesses a set of advertised domain services
(step 2) to remove irrelevant service properties using
the domain ontology (step 3). Reduced properties will
be marked in the set of advertised domain services
(step 4). Once invoked (step 5), the Dependent
Property Reduction component accesses the advertised
domain services (step 6) to discover and reduce
Figure 2. Approximation in rough sets.
Let
Ω be a domain ontology.
2
Following the work proposed by Paolucci et al.
[14], we define the following relationships between
p R and p A :
U be a set of N advertised domain services,
U = {s1 , s2 ,K, s N } , N ≥ 1 .
P be a set of K properties used to describe the
N advertised services of the set U ,
P = { p1 , p2 , K , p K } , K ≥ 2 .
PA be a set of M advertised service properties
relevant to a service request R in terms of the
domain ontology Ω ,
PA = { p A1 , p A2 , K , p AM } , PA ⊆ P , M ≥ 1 .
X be a set of advertised services relevant to the
service request R , X ⊆ U .
X be the lower approximation of the set X .
exact match, p R and p A are equivalent or
p R is a subclass of p A .
plug-in match, p A subsumes p R .
subsume match, p R subsumes p A .
nomatch, no subsumption between p R and
pA .
Algorithm 1. Reducing irrelevant properties
from advertised services.
X be the upper approximation of the set X .
1: for each property pA used in advertised domain services
2: for all properties used in a service request
3:
if pA is nomatch with any pR
According to the Rough Sets theory, we have
X = {x ∈U : [x]P ⊆ X}
(1)
X = {x ∈U : [x]P I X ≠ ∅}
(2)
A
4:
then pA is marked with nomatch
5:
end if
6: end for
7: end for
8: remove all pA that are marked with nomatch
A
For a property p ∈ PA , we have
For each property of a service request, we use the
reduction algorithm as shown in Algorithm 1 to
remove irrelevant properties from advertised services.
For those properties in advertised services that have a
nomatch result they will be treated as irrelevant
properties. Advertised services are organised as service
records in a database. Properties are organised in such
a way that each uses one column to ensure the
correctness in the following reduction of dependent
properties. As a property used by one advertised
service might not be used by another advertised
service, some properties may have empty values. A
property with an empty value in an advertised service
record becomes an uncertain property in terms of a
service request. If a property in an advertised service
record is marked as nomatch, the column associated
with the property will be marked as nomatch. As a
result, all properties within the column including
uncertain properties (i.e. properties with empty values)
will not be considered in service matchmaking.
∀x ∈ X , x definitely has property p .
∀x ∈ X , x possibly has property p .
∀x ∈ U − X , x absolutely does not have
property p .
The use of “definitely”, “possibly” and “absolutely”
are used to encode properties that cannot be specified
in a more exact way. This is a significant addition to
existing work, where discovery of services needs to be
encoded in a precise way, making it difficult to find
services which have an approximate match to a query.
2.2 Reducing Irrelevant Properties
When searching for a service, a service requestor
may use some properties irrelevant to the properties
used by a service publisher in terms of a domain
ontology. These irrelevant properties used by
advertised services should be removed before the
service matchmaking process is carried out.
2.3 Reducing Dependent Properties
Property reduction is an import concept in Rough Sets.
Properties used by an advertised service may have
dependencies. Dependent properties are indecisive
properties that are dispensable in matching advertised
services.
Let
p R be a property for a service request.
p A be a property for a service advertisement.
3
discovery process can be speeded up because of the
reduction of properties in matching advertised services.
Let
Ω , U , P , PA be defined as in Section 2.1.
2.4 Service Matching and Ranking
PAD be a set of LD properties of which each is an
individual decisive property for identifying
advertised services relevant to the service request
R in terms of Ω ,
The Service Matching and Ranking component uses
the decisive properties received from the Dependent
Property Reduction component to match and rank
advertised services in terms of a service request.
D
PAD = { p AD1 , p AD2 ,K, p AL
} , PAD ⊆ PA ,
D
LD ≥ 1 .
PAIND be a set of LIND properties of which each is
Algorithm 2. Reducing dependent properties from
advertised services.
an individual indecisive property for identifying
advertised services relevant to the service request
R in terms of Ω ,
S is a set of advertised services with the maximum number of nonempty
property values relevant to a service request;
PA is a set of properties used by the S set of advertised services;
PAD is a set of decisive properties, PAD ⊆ PA;
IND
IND
PAIND = { p AIND
1 , p A 2 , K , p ALIND } ,
PAIND is a set of individual indecisive properties, PAIND ⊆ PA ;
PAIND_Core is a set of combined indecisive properties,
PAIND ⊆ PA , LIND ≥ 1 .
IND _ Core
A
IND
A
P
be a core subset of P
that has the
maximum number of individual indecisive
properties and the group of these properties in
1:
2:
3:
4:
PAIND _ Core are indecisive in identifying advertised
services relevant to the service request R in terms
IND
IND _ Core
of Ω , PA
⊆ PA .
IND ( PA
5:
6:
add p into PAIND_Core;
7:
end if
8: end for
9: for i=2 to sizeof(PAIND)-1
IND
) be an equivalence relation also
called indiscernibility relation on U .
f be a decision function discerning an advertised
10:
11:
12:
13:
service with properties.
= {(x, y) ∈U : ∀p AiIND ∈ PAIND , f ( x, p AiIND ) = f ( y, p AiIND )}
PAD = PA − PAIND
calculate all possible i combinations of the properties in PAIND;
if any combined i properties are indecisive properties for identifying
the S set of services
then
PAIND_Core = Ø;
add the i properties into PAIND_Core;
14:
15:
continue;
16:
else if any combined i properties are decisive properties
17:
then break;
18:
end if
19: end for
20: PAD = PA-PAIND_Core;
21: return PAD;
Then
IND ( PAIND )
PAIND_Core ⊆ PAIND;
PAD = Ø; PAIND = Ø; PAIND_Core = Ø;
for each property p∈ PA
if p is an indecisive property for identifying the S set of services
then
add p into PAIND;
PAIND_Core = Ø;
(3)
(4)
Let
The Dependent Property Reduction component uses
the algorithm as shown in Algorithm 2 to find the
decisive properties used by advertised services.
Algorithm 2 uses the advertised services with the
maximum number of nonempty property values as
targets to find indecisive properties. The targeted
services can still be uniquely identified without using
these indecisive properties. All possible combinations
of individual indecisive properties will be checked with
an aim to remove the maximum indecisive properties
which may include uncertain properties whose values
are empty. In the mean time, the following service
Ω , U , P , PA be defined as in Section 2.1.
PR be a set of M properties used in a service
request R . PR = { p R1 , p R 2 , K , p RM } , M ≥ 1 .
PAD be a set of LD decisive properties for
identifying advertised services relevant to the
service request R in terms of Ω ,
D
PAD = { p AD1 , p AD2 ,K, p AL
} , LD ≥ 1 .
D
4
service request property and the relationship is out of
five generations. Similarly, an advertised service
property will be given a match degree of 50% if it has
a subsume relationship with a service request property
and the relationship is out of three generations.
Each decisive property used for identifying
advertised services has a maximum match degree when
matching all the properties used in a service request.
S ( R, s ) can be calculated using formula (5).
m( p Ri , p Aj ) be a match degree between a
requested service property
service
property
PAj
PRi and an advertised
in
terms
of
Ω,
p Ri ∈ PR , 1 ≤ i ≤ M , p Aj ∈ PAD , 1 ≤ j ≤ LD .
v( p Aj ) be a value of the property
p Aj ,
p Aj ∈ PAD , 1 ≤ j ≤ LD .
S ( R, s ) be a similarity degree between an
advertised service s and the service request R ,
LD
S ( R, s ) = ∑
j =1
s ∈U .
4:
5:
6:
7:
8:
9:
10:
11:
12:
PAD, v(pAj) ≠ NULL
for each property pRi ∈ PR
if pAj is an exact match with pRi
then m(pRi, pAj) = 1;
else if pAj is a plug-in match with pRi
then if pRi is the kth subclass of pAj and 2≤k≤5
then m(pRi, pAj) = 1-(k-1)×0.1;
else if pRi is the kth subclass of pAj and k>5
then m(pRi, pAj) = 0.5;
Ri
, p Aj )) LD (5)
We have implemented the RSSM algorithm using Java
on a Pentium IIII 2.6G with 512M RAM running Red
Hat Fedora Linux 3. jUDDI [20] and mySQL [21]
were used to build a UDDI registry. We designed Pizza
services for the tests using the Pizza ontology defined
by http://www.co-ode.org/ontologies/pizza/pizza_20041007.owl.
Figure 3 shows the Pizza ontology structure. The
approach adopted here can be applied to other domains
– where a specific ontology can be specified. The use
of service properties needs to be related to a particular
application-specific ontology.
end if
else if pAj is a subsume match with pRi
then if pAj is the kth subclass of pRi and 1≤k≤3
Size
then m(pRi, pAj) = 0.8-(k-1)×0.1;
else if pAj is the kth subclass of pRi and k>3
13:
14:
15:
16:
17:
end if
18: end for
19: end for
then m(pRi,
end if
pAj) = 0.5;
Pizza
has attributes of
Price
Location
Pizzashop
Vegetabletopping
PAD,
for each property pRi ∈
22:
m(pRi,
23: end for
24: end for
Delivery
Hot Pizza
Nuttopping
20: for each property pAj ∈
21:
i =1
3. Experimental Results
1: for each property pAj ∈
3:
∑ max(m( p
Using the formula (5), each advertised service has a
similarity degree in terms of a service request.
Algorithm 3. Rules for calculating match degrees
between requested service properties and
advertised service properties.
2:
M
Fruittopping
v(pAj) = NULL
has attributes of Vegetarian
Pizza
Meat
Pizza
Tomatotopping
has attributes of
Meattopping
Fishtopping
PR
pAj) = 0.5;
Mushroom
Veneziana
Rosa
Caprina
Soho
Algorithm 3 shows the rules for calculating a match
degree between a requested service property and an
advertised service property. A decisive property with
an empty value has a match degree of 50% when
matching all the properties used by a service request.
An advertised service property will be given a match
degree of 100% if it has an exact match relationship
with a property used by a service request. An
advertised service property will be given a match
degree of 50% if it has a plug-in relationship with a
Fiorentina
Figure 3: Pizza ontology structure.
In the following sections, we evaluate the RSSM
algorithm in terms of its effectiveness and overhead in
service discovery. We compare RSSM matchmaking
with UDDI and the OWL-S matchmaking algorithm
proposed by Paolucci et al. [14] respectively. RACER
5
[22] was used by the OWL-S algorithm to reason the
relationship between an advertised service property
and a requested service property. We implemented a
simple reasoning component in RSSM to reduce the
high overhead incurred by RACER.
3.2 Evaluating Overhead in Service Discovery
We have registered 10,000 Pizza service records in
a database for testing the overhead of the RSSM
matchmaking. We compare the overhead of the RSSM
matchmaking with that of UDDI and the OWL-S
matchmaking respectively in service matching as
shown in Figure 4. We also compare their performance
in accessing service records as shown in Figure 5.
From Figure 4 we can see that UDDI has the least
overhead in service matching services. This is because
UDDI only supports keyword matching. It does not
support the inference of the relationships between
requested service properties and advertised service
properties which is a time consuming process.
3.1 Evaluating Effectiveness in Service Discovery
We have performed four groups of tests to evaluate
the effectiveness of RSSM in service discovery. Each
group had some targeted services to be discovered.
Properties such as Size, Price, Nuttoping,
Vegetariantopping, and Fruittopping were used by the
advertised services. Table 1 shows the evaluation
results.
Table 1. The results of effectiveness evaluation.
UDDI
Service
Property
Keyword
Matchmaking
OWL-S
RSSM
Matchmaking
Matchmaking
Certainty
Correct
Matched
Correct
Matched
Correct
Rate
Matching
Service
Matching
Service
Matching
Service
Rate
Records
Rate
Records
Rate
Records
100%
50%
4
100%
3
100%
3
70%
50%
30%
33.3%
0%
0%
3
1
1
0%
0%
0%
0
0
0
100%
100%
100%
1
1
1
Matched
In the tests conducted for group one, both the OWLS matchmaking and RSSM matchmaking have a
correct match rate of 100%. This is because all services
in this group do not have uncertain properties (i.e.
properties with empty values). UDDI discovers four
services, but only two services are correct because of
the use of keyword matching (in this case making use
of a categoryBag).
In the tests of the last three groups where
advertised, services have uncertain properties, the
OWL-S matchmaking cannot discover any services
producing a matching rate of 0%. Although UDDI can
still discover some services in the tests of the last three
groups, the correct matching rate for each group is low.
For example, in the tests of group three and group four
where advertised services have only 50% and 30%
certain properties respectively, UDDI cannot discover
any correct services. The RSSM matchmaking is more
effective than UDDI and the OWL-S matchmaking in
tolerating uncertain properties when matching services.
For example, the RSSM matchmaking is still able to
produce a correct match rate of 100% in the tests of the
last three groups.
Figure 4. Overhead evaluation in matching services.
Figure 5: Overhead evaluation in accessing service
records.
We also observe that the RSSM matchmaking has a
better performance than the OWL-S matchmaking
when the number of advertised services is within 5500.
This is because the RSSM matchmaking used a simpler
reasoning component than RACER, which was used by
6
the OWL-S matchmaking. However, the overhead of
the RSSM matchmaking increases when the number of
services gets larger, due to the reduction of dependent
properties. The major overhead of the OWL-S
matchmaking is caused by RACER, which is sensitive
to the number of properties used by advertised services
instead of the number of services.
From Figure 5 we can see that the RSSM
matchmaking has the best performance in accessing
service records due to its reduction of dependent
properties. The OWL-S matchmaking has a similar
performance to UDDI in this process.
[8] myGrid project, http://www.mygrid.org.uk
[9] S. Miles, J. Papay, T. Payne, K. Decker, and L.
Moreau. Towards a protocol for the attachment of
semantic descriptions to grid services. Proceedings
of the Second European across Grids Conference,
volume 3165 of Lecture Notes in Computer
Science, Nicosia, Cyprus, January 2004.
[10] L. Moreau, Grid RegIstry with Metadata Oriented
Interface: Robustness, Efficiency, Security
(GRIMOIRES). http://www.grimoires.org/, 2005.
[11] T. Berners-Lee, J. Hendler, and O. Lassila. The
Semantic Web. Scientific American, Vol. 284 (4),
pages 34-43, 2001.
[12] OWL Services Coalition, OWL-S: Semantic
Markup for Web Services, white paper, Nov.
2003; www.daml.org/services/owl-s/1.0.
[13] S. Bechhofer, F. Harmelen, J. Hendler, I.
Horrocks, D. McGuinness, P. F. Patel-Schneider,
and L. A. Stein. OWL Web Ontology Language
Reference. W3C Recommendation, Feb. 2004.
[14] M. Paolucci, T. Kawamura, T. Payne, and K.
Sycara. Semantic matching of web service
capabilities. Proceedings of 1st International
Semantic Web Conference. (ISWC2002), Berlin,
2002.
[15] M. C. Jaeger, G. Rojec-Goldmann, G. Mühl, C.
Liebetruth, and K. Geihs. Ranked Matching for
Service Descriptions using OWL-S. Proceedings
of KiVS 2005, Kaiserslautern, Germany, Feb.
2005.
[16] L. Li and I. Horrocks. A software framework for
matchmaking based on semantic web technology.
Int. J. of Electronic Commerce, 8(4):39-60, 2004.
[17] S. Majithia, A. S. Ali, O. F. Rana, D. W. Walker:
Reputation-Based Semantic Service Discovery.
Proceedings of WETICE 2004, Italy.
[18] Web Service Modeling Ontology (WSMO),
http://www.wsmo.org
[19] Z. Pawlak. Rough sets. International Journal of
Computer and Information Science, 11(5):341356, 1982.
[20] jUDDI, http://ws.apache.org/juddi/
[21] mySQL, http://www.myswl.com
[22] V. Haarslev and R. Möller. Description of the
RACER System and its Applications. Proceedings
International Workshop on Description Logics
(DL-2001), Stanford, USA.
[23] Global Grid Forum (GGF), http://www.ggf.org
4. Conclusions and Future Work
In this paper we have presented RSSM, a Rough
Sets based algorithm for service matchmaking. By
dynamically reducing irrelevant and dependent
properties in terms of a service request, the RSSM
algorithm can tolerate uncertain properties in service
discovery. Furthermore, RSSM uses a lower
approximation and an upper approximation of a set to
dynamically determine the number of discovered
services which may maximally satisfy user requests.
Experimental results have shown that the RSSM
algorithm is effective in service matchmaking.
As the reduction of dependent properties is a time
consuming process, the future work will be focused on
the design of heuristics for reducing dependent
properties to speed up the process.
References
[1] I. Foster and C. Kesselman, The Grid, Blueprint
for a New Computing Infrastructure, Morgan
Kaufmann Publishers Inc., San Francisco, USA,
1998.
[2] M. Li and M.A.Baker, The Grid: Core
Technologies, Wiley, 2005.
[3] Open Grid Services Architecture (OGSA),
http://www.globus.org/ogsa/
[4] Web Services Resource Framework (WSRF),
http://www.globus.org/wsrf/
[5] Universal Description, Discovery and Integration
(UDDI), http://www.uddi.org/
[6] A. ShaikhAli, O. F. Rana, R. J. Al-Ali, D. W.
Walker: UDDIe: An Extended Registry for Web
Service. Proceedings of SAINT Workshops,
Orlando, Florida, USA, 2003.
[7] S. Miles, J. Papay, V. Dialani, M. Luck, K.
Decker, T. Payne, and L. Moreau. Personalised
Grid Service Discovery. IEE Proceedings
Software: Special Issue on Performance
Engineering, 150(4):252-256, August 2003.
7
Download