APPLICATION RESPONSE MEASUREMENT OF DISTRIBUTED WEB SERVICES

J.D. Turner, D.A. Bacigalupo, S.A. Jarvis, D.N. Dillenberger+, G.R. Nudd
High Performance Systems Group, University of Warwick, Coventry, UK
IBM TJ Watson Research Center, New York, USA+
{jdt,daveb,saj,grn}@dcs.warwick.ac.uk engd@us.ibm.com+
Web service technology will provide a platform for dynamic e-business
applications. This paper describes a framework for identifying, monitoring
and reporting performance data of critical transactions within a web
service using the Java ARM standard, a Transaction Definition Language
(TDL) and a bytecode instrumentation tool. The data extracted using this
framework is shown to be appropriate for dynamically selecting web
services based on performance metrics. A case study is provided using
Gourmet2Go, an IBM demonstrator of a dynamic business-to-business
(B2B) application.
Keywords: Web Services, ARM, Performance Measurement, Service Routing, Java
1 Introduction
High performance computing has traditionally been
based on large parallel multi-node MIMD machines.
However, the vast increase in network bandwidth, in
desktop computer performance and in the availability
of large numbers of commodity computing
components, has resulted in a change in the preferred
high performance computing infrastructure. This is
demonstrated most notably in the emergence of Grid
computing [FOST98, LEIN99]; geographically dispersed
resource-sharing networks and disparate heterogeneous
resources whose management, rather than being centralised,
is maintained through multiple administrative domains. A
number of middleware standards are emerging to support
this infrastructure (including Globus [FOST01], for
example), together with high performance resource
allocation services with the ability to adapt to the
continuous variations in user demands and resource
availability.
More recently it has been proposed that Grids provide a
supporting infrastructure for both e-science and
e-business applications [FOST02], allowing efficient
execution while maintaining user-driven quality of service
(QoS) requirements. To be of maximum value, such an
infrastructure should assist applications in meeting these
requirements autonomically - that is, applications should
be self-configuring so they are able to hide their
complexities and be less reliant on human intervention
[IBM01a]. However, the different characteristics of these
e-science and e-business applications (e.g. large
‘run-once’ jobs versus high-frequency request-driven
services) necessitate different methodologies for each -
for example, the use of traditional resource scheduling
for large scientific jobs and service routing for
high-frequency e-business-type requests.
1.1 Performance-Driven Grid Scheduling
The scheduling of distributed scientific applications
over Grid architectures is facilitated through the use of
TITAN [SPOO02] (see Figure 1). TITAN uses iterative
heuristic algorithms to improve a number of
scheduling metrics including makespan and idle time,
while aiming to meet QoS requirements including
dead-line time. This work is differentiated from other
Grid scheduling research through the application of
PACE [NUDD00, CAO00] - a performance prediction
environment - and A4 [CAO01, CAO02] - an agent-based
resource discovery and advertisement
framework. PACE provides performance prediction
statistics for communication and computation by
analysing and abstracting the characteristics of the
application (application modelling) and evaluating this
model on an abstracted hardware model (resource
modelling) provided from benchmarks of the available
computing and network resources. These predictions
can be calculated in real-time, which means that the
TITAN scheduler can be continuously updated as new
applications are launched and as the available
resources change.
Figure 1: TITAN: a scheduling environment incorporating
performance prediction, resource allocation, system
policies, performance histories and QoS. Feedback during
execution aids the prediction and scheduling of similar
applications in the future.
1.2 Performance-Driven Web Service Routing
While TITAN is well suited to the scheduling of large
scientific applications, a different strategy is required
for e-business applications. Business to business
(B2B) and business to consumer (B2C) applications
are often characterised by their long-running and
request-driven services. The requests are potentially
high frequency and typically exhibit short execution
time.
The web services framework, which is currently being
standardised with a strong but not exclusive focus on
e-business applications, will provide a platform for these
high-throughput, dynamic, distributed applications. In a
similar way that scientific Grid applications require
resource allocation and scheduling, it will be necessary
for web services to be apportioned in some way. This
process is likely to involve routing the communication
from a service requester, through several intermediaries,
to the ultimate service provider; in order to achieve this
behaviour a service routing mechanism is required.

This paper describes a performance-driven service routing
framework for web services. Routing decisions are based on
measured end-to-end response times of available services,
measurements of which are made through the use of the
Application Response Measurement [JOHN00, OPNG01] (ARM)
open standard. Real-time performance monitoring and
historical performance data are used as the basis for
service routing. A case study is provided using
Gourmet2Go, a dynamic B2B application supplied with the
IBM Web Services Toolkit [IBM02]. Results are documented
and show how performance metrics such as average service
response time and associated confidence (in the predicted
service time) impact on the selected service and routing
behaviour.

The remaining sections of this paper provide:

• an introduction to web services, their related
technologies and service routing intermediaries (Section
2);
• an overview of ARM and an ARM-compliant XML-based
Transaction Definition Language [TURN01], for the
automatic ‘ARMing’ of Java applications (Section 3);
• a demonstration of our performance monitoring and
service routing framework using an example web services
application (Section 4).
2 Web Services
The recent interest in web service technology parallels
the increase in distributed infrastructures and their
associated application domains. The Web, for
example, is known to facilitate human-to-application
interaction, but currently provides limited support for
fully automated and dynamic application-to-application
interactions. Distributed enterprise computing platforms
and their associated component technologies, including
Enterprise Java Beans (EJB), COM+ and CORBA, facilitate
application-to-application integration, but are hard to
apply across organisational boundaries because of the lack
of a standardised infrastructure.
One approach to facilitating application-to-application
interactions across distributed infrastructures is the
creation of a standard framework that can be used to
create, describe, publish and connect network
accessible ‘web’ services (including programs,
databases, networks, computational or storage
resources etc. [FOST02]). It has been the emergence
of e-business however that has led to the increase in
the industry-wide effort to create a standard web
services framework, the primary goals of which are
described as [CURB01a]:
• systematic application-to-application interactions on
the Web;
• the smooth integration of existing infrastructure;
• the support for efficient application integration.
However, although this effort is focused on creating a
web services framework based on existing web
protocols (including HTTP, for example), web services
are likely to function in other environments including
private wide-area and local-area networks.
The open, heterogeneous and dynamic nature of the Web
leads to a set of key requirements for a web-based
distributed computing model such as the web services
framework [CURB01b]:
• ‘A small set of supported standard protocols and data
formats for each computing node.’ E.g. a web services
communications protocol and a web services description
language are all that is required for basic interactions.
• ‘A loose coupling between web services.’ E.g. allowing
web services to switch between sub-services which provide
the same functionality.
• ‘Interfaces defined in terms of the messages that they
process.’ Involving a shift in focus from the APIs
supported by each platform to what goes ‘on the wire’.
• ‘Web services which are able to integrate with each
other on an ad-hoc basis.’ E.g. e-business web services
that are able to establish short-term goal-directed
relationships.
• ‘Web services located ‘by-what’ and ‘by-how’.’ E.g.
searching for web services that provide particular
functionality (from a given community) with a predefined
QoS.
These requirements could be realised by a web
services platform in which a wide variety of
functionalities are encapsulated as web services and
run on any computing node that meets a minimal set
of requirements. These web services would be able to
‘dynamically bind’ to each other at runtime, which
might involve a web service searching for other web
services based on functional and non-functional
criteria. These services would interact through the
exchange of ‘messages on the wire’ and would enable
future applications to be constructed by dynamically
orchestrating collections of contributing services.
2.1 Web Service Standards

One of the early attempts at web services standardisation
began with the release of IBM and Microsoft's web service
specifications for B2B e-business. The specifications,
built on existing industry standards, were based on XML
and contained two core initiatives [CURB01a]:

1. A data exchange protocol called SOAP [BOX00], for
sending XML messages wrapped in a standard ‘header and
body’-based XML envelope over an arbitrary transport
protocol. SOAP includes a binding for HTTP and a
convention for Remote Procedure Call (RPC).

2. A unified representation for services called WSDL
[CHRI01], providing an XML-based language for describing
web services. An abstract description, for example the XML
messages sent and received by the web service, is mapped
onto one or more communication protocol bindings. An
application wishing to use the web service chooses which
of the available protocols to use. It is likely that other
specifications will build on WSDL to enable a more
complete description of web services (for example,
non-functional aspects of the service such as QoS).

More recently, web service invocation frameworks such as
WSIF [DUFT01] have been proposed. These should simplify
interaction with WSDL-described services: for example, an
application might only be required to specify the WSDL
file of a web service in order to interact with it at the
XML exchange level. Remaining interactions, such as
selecting the protocol binding, could then be handled by
the invocation framework.

SOAP and WSDL have now become de facto industry standards.
Since their release, many other specifications have been
proposed, with standards for service discovery receiving
particular attention. For example, service registries such
as UDDI [UDDI00] allow web service descriptions to be
published (see Figure 2 [KREG01]). These registries can be
searched at development time and at run time; for example,
web services can search for and ‘dynamically bind’ to
other services. Services can also be published using
inspection languages such as WS-Inspection [BALL01], where
service discovery is performed through the inspection of
service sites.

Figure 2: Publishing a web service; a provider publishes a
service description in a registry. Subsequently, a
requestor finds the description and proceeds to ‘bind’ to
the service.
2.2 Implementing Web Services
There are currently two main implementation platforms for
web services: Microsoft's .NET and J2EE. Both platforms
use SOAP, WSDL and UDDI to promote basic interoperability,
but additional non-standard functionality is also
provided. Supporting web-server application products are
currently being implemented and include IBM's WebSphere.

In addition to the WebSphere e-business platform, IBM has
also published its J2EE-based Web Services Toolkit (WSTK)
[IBM02]. This provides a variety of APIs, a pre-installed
run-time environment, an implementation of a private UDDI
registry and a number of demonstrators. One of these
demonstrators, Gourmet2Go, forms the basis of the case
study found in Section 4.
2.3 Service Routing Intermediaries
The web services framework provides various mechanisms for
interacting with services at higher levels of abstraction
and enables different versions of a conceptual service to
be published. For example, in an e-business scenario the
communication between a service requester and a conceptual
service that has been found in a registry could traverse
several service selection levels:

1. Service provider selection: A standard interface for
providing a particular product or service which is
implemented (or extended) by competing companies.
2. Service level agreement (SLA) selection: The service
provider offers the web service with various QoS
characteristics and associated prices.
3. Service replica selection: The web service is
replicated on different servers and a server that can
provide the agreed QoS level is selected.

It is likely that the communication will pass through a
number of intermediaries that will contribute toward
service selection decisions and hence perform a service
routing role. Web service routing (or message exchange) is
the subject of significant attention and has been
identified as one of the key functional components of the
web services framework [IBM01b]. An example of an
intermediary might be a broker that helps a shopper select
and then connect to a company to obtain a product or
service based on criteria such as price, quality and the
performance of the company's web services, using
parameters set by the requester and/or provider.

It is important for intermediaries to consider
non-functional criteria such as the performance of the web
services being considered for selection. This requires
mechanisms for measuring, storing and utilising web
service performance data. An approach to web service
performance monitoring using the ARM standard is discussed
in the next section.
3 Application Response Measurement

With increasingly complex computer infrastructure it is
more difficult to analyse and report on performance.
Individual units of work (or transactions) can be
distributed, executed on different systems, across
multiple domains, via different processes and threads. The
type of work that these transactions carry out may also
vary, including database access, application logic and
data input and/or presentation. Analysing performance is
therefore problematic and administrators face a number of
complex performance-related questions:
• Are transactions succeeding?
• If a transaction fails, what is the cause of the
failure?
• What is the response time experienced by the end user?
• Which sub-transactions are taking too long?
• Where are the bottlenecks?
• How many of which transactions are being used?
• How can the application and environment be tuned to be
more robust and to perform better?
The Application Response Measurement (ARM)
standard allows these questions to be answered.
ARM allows the developer to define a number of
transactions within an application, the performance of
which is then measured during execution by an ARM consumer
interface. The choice of which transactions
to monitor is made by the developer and will usually
correspond to the areas of the application which are
considered performance critical.
3.1 Java 3.0 Standard
The ARM 3.0 specification [OPNG01] provides a Java binding
for the response measurement of Java applications. It
defines a number of Java interfaces which must be
implemented by a valid ARM consumer interface (an example
of which can be found in Figure 3). Transactions are
defined using an ‘ArmTransactionFactory’ and by creating
new instantiations of ‘ArmTransaction’ objects. The
consumer interface measures the performance of this
transaction using start and stop calls and reports the
information to an associated data repository (represented
in Figure 3 by the reporting classes).
Figure 3: An example of an ‘ARMed’ application. The
transaction is defined using an ARM consumer interface.
All ARM transactions contain (optional) definition
information. This is achieved using ‘ArmMetricDefinition’,
‘ArmTranDefinition’ and ‘ArmUserDefinition’ objects. When
the application invokes a process call, the consumer
interface records this information in the reporting
classes. The application then creates a new instantiation
of an ‘ArmTransaction’ object, defining the location of
the transaction using start and stop transaction calls.
Optional descriptive information can be associated with
the transaction to aid analysis. This data comes in the
form of definition objects, including the
‘ArmMetricDefinition’, ‘ArmTranDefinition’ and
‘ArmUserDefinition’ objects found in Figure 3.
Instantiations of these objects are provided by the
consumer interface's implementation of the
‘ArmDefinitionFactory’. These objects are populated with
descriptive information provided by the application and
are processed at set intervals. The descriptive
information is associated with each transaction using
UUIDs, allowing meaningful descriptive data for each
transaction to be stored within the data repository.
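As an illustration of this style of instrumentation, the
sketch below hand-‘ARMs’ a single back-end operation. The
type and transaction names echo those used in Figures 3
and 7, but the stand-in interfaces and their method
signatures are simplified assumptions made for this
example; they are not the ARM 3.0 Java binding itself.

// Illustrative sketch only: simplified stand-ins for the ARM 3.0 interfaces.
interface ArmTransaction {
    void start();              // begin timing the transaction
    void stop(int status);     // end timing and report the status
}

interface ArmTransactionFactory {
    // Assumed convenience method: the real factory also carries
    // definition and correlator arguments, omitted here.
    ArmTransaction newArmTransaction(String applName, String tranName, String userName);
}

public class GetBidService {
    private final ArmTransactionFactory factory;

    public GetBidService(ArmTransactionFactory factory) {
        this.factory = factory;
    }

    public double getBid(String shoppingList) {
        // Define and time the performance-critical 'getBid' transaction.
        ArmTransaction tran = factory.newArmTransaction(
                "Sammys Grocery Ordering Service", "getBid_backend", "ARM");
        tran.start();
        try {
            double bid = priceShoppingList(shoppingList);  // application logic
            tran.stop(0);                                   // 0 = success (assumed status code)
            return bid;
        } catch (RuntimeException e) {
            tran.stop(1);                                   // non-zero = failure (assumed)
            throw e;
        }
    }

    private double priceShoppingList(String shoppingList) {
        return 42.0;  // placeholder pricing logic
    }
}

The bytecode instrumentation tool described in Section 3.3
removes the need to write such wrapping code by hand.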
3.2 An ARM Implementation
An ARM consumer implementation and data repository have
been developed to allow distributed ‘ARMed’ applications
to process and report transactions to a remote data
repository service. The ARM consumer interface connects to
this data repository and reports all processed transaction
measurement data, storing the information in an efficient
format for future use. A number of standard clients have
also been implemented that allow data to be extracted from
the repository; this includes a GUI so that developers can
monitor current and previous transactions. The ARM data
repository infrastructure is shown in Figure 4.

Figure 4: An ARM implementation model containing: an ARM
consumer interface client residing in the same JVM as the
original application; an ARM data repository server for
all reported and processed transactions; a number of
server clients for data retrieval, including a GUI client
for a continuous update of all previously reported and
currently executing transactions, and a data client for
specific transaction queries.

In the current implementation a central ARM repository
service is defined. Further scalability enhancements can
be provided by associating local data repository services
with each ‘group’ of web services and linking these
through a connectivity hierarchy.
3.3 Transaction Definition Language
While the ARM standard provides a well-defined
framework for the response measurement of
distributed applications, there are a number of
inherent disadvantages to this approach. These
include the need for a detailed knowledge of the ARM
specification, the time required to instrument
performance-critical transactions, and the availability
of source code.
In order to overcome these issues a Transaction
Definition Language (TDL) has been developed
[TURN01]. This allows developers to define where
transactions are located within their applications using
an XML-based descriptor language. This is then used
to automatically instrument the object code of the
application using a bytecode instrumentation tool,
which itself conforms to both the ARM and Java
Virtual Machine specifications. The TDL allows
descriptive information to be associated with each
transaction using a number of optional attributes; it is
also possible to define groups of transactions which
may reside in an application's jarfile or classpath. The
current version of the TDL specification can be seen
in Figure 5.
The main advantages of this approach are that developers
can use the TDL to ARM performance-critical areas of code
by editing the associated XML file. The process of code
instrumentation is then automated (saving the developer
time) and bytecode-based (eliminating the need for
proprietary source code). Applications can then be
monitored during the process of development, or
alternatively after they have been deployed in a live
environment. The Transaction Definition Language,
including its specification, design and implementation, is
described in [TURN01].
<!-- Transaction Definition Language DTD (v1.1) -->
<!ELEMENT tdl (transaction+)>
<!ELEMENT transaction (location, line_number?, metric*)>
<!ATTLIST transaction type (method_source | method_call | line_number) #REQUIRED
                      appl_name CDATA #IMPLIED
                      tran_name CDATA #IMPLIED
                      user_name CDATA #IMPLIED
                      fail_on_exception (yes | no) "yes">
<!ELEMENT location EMPTY>
<!ATTLIST location class CDATA #REQUIRED
                   method CDATA #REQUIRED>
<!ELEMENT line_number EMPTY>
<!ATTLIST line_number begin CDATA #REQUIRED
                      end CDATA #REQUIRED>
<!ELEMENT metric EMPTY>
<!ATTLIST metric type (Counter32 | Counter64 | CounterFloat32 |
                       Guage32 | Guage64 | GuageFloat32 |
                       NumericId32 | NumericId64 |
                       String8 | String32) #REQUIRED
                 value CDATA #REQUIRED>
Figure 5: Transaction Definition Language DTD (v1.1).
4 Case Study
Service routing intermediaries (described in Section 2)
are likely to play an important role in future web
service networks. Such intermediaries will need to
consider QoS metrics when making service selection
decisions; the end-to-end response time being one
such example. For this to be possible there must be a
mechanism for monitoring the performance of the web
services and a way of using this data during service
selection.
This section provides a case study that demonstrates
an ARM-based routing framework. Gourmet2Go, a
demonstrator of a dynamic B2B application provided
with the IBM Web Services Toolkit, is used together
with the ARM standard which provides the
infrastructure for service performance monitoring and
measurement. The process involved in configuring
Gourmet2Go for performance-based service routing is
described; the results obtained from a number of
Gourmet2Go scenarios are also documented.
4.1 Gourmet2Go
Gourmet2Go is a web services demonstrator from the IBM Web
Services Toolkit. The demonstrator provides an example of
how a web service acts as an intermediary broker,
assisting users to select back-end web services by
obtaining bids from services published in a UDDI registry.
In Gourmet2Go, the back-end web services sell groceries
and the broker presents itself as a value-added ‘meal
planning’ service. The underlying architecture is however
generic and can therefore be used to demonstrate the
brokering of any kind of service.
The architecture of the Gourmet2Go demonstrator
can be found in Figure 6. A typical interaction is as
follows:
1. The user interacts with the Gourmet2Go Web
application via a Web browser with the
intention of building a shopping list of
groceries. This represents the user specifying
the service that the back-end web service
must perform.
2. The broker searches the registry for
businesses with published web services that
sell groceries. This represents the broker
selecting a number of potential services using
information provided in the registry.
3. The shopping list is passed by the broker to
each of the back-end web services (located
from the registry) via a ‘getBid’ request. This
represents the broker contacting the short-list
of candidates directly.
4. The broker summarises the bids for the user
based on price, and the user then selects a
supplier using the information presented to
them. This stage represents the broker
assisting the user in decision making.
5. The broker sends a ‘commitBid’ request to the
service that the user selects. This represents
the broker continuing to act as an
intermediary while the user interacts with the
selected back-end service. This interaction
has the potential to be significantly more
complex than the confirmation message in the
Gourmet2Go demonstrator. For example,
specifying the details of what exactly is to be
purchased and how it is to be paid for,
through to delivery tracking and after-sales
support.
The Web services in Gourmet2Go are implemented in
Java using the WSTK and associated standards.
Figure 6: The design of the Gourmet2Go demonstrator from
the IBM Web Services Toolkit. A typical interaction is
shown in which a user browses the Gourmet2Go broker to
obtain information on the available services and then
selects the preferred service.

4.2 Enhancing Gourmet2Go with Performance Information
From an application performance perspective, there
are two limitations within Gourmet2Go that our
framework overcomes:
1. The only metric the broker considers when
evaluating the back-end web services is the
price of those services. When dynamically
orchestrating web services it is also important
to consider performance.
2. It is currently the user who makes the final
decision as to which back-end web service to
select. It would also be useful if the broker
was able to make this decision automatically
and thereby act as a service router.
A new framework based on ARM and using the TDL is
described. This framework allows the performance of
web services to be monitored and allows historic
performance data to be utilised by intermediaries such
as the broker provided in the Gourmet2Go
demonstrator. The design of the framework is
described in terms of four key design questions:
What performance data should be collected?
As stated in Section 3, one aim of ARM is to provide a
basis for the measurement of the performance of defined
transactions within e-business applications; this includes
the response time of the transaction, the status of the
transaction, etc. In this case study end-to-end response
time is used as the primary service routing metric. It
would be possible however to extend this method to include
other custom metrics such as transaction success-rate.
The following performance data is retrieved:
• Mean end-to-end response time and confidence: In the
Gourmet2Go example, the key measure is the mean end-to-end
response time of recent ‘commitBid’ requests. We also
calculate how representative this mean is likely to be;
this is done by providing an additional confidence
estimate based on the number of recently recorded
invocations1. When a new web service is published in the
registry there will not be any historical performance data
available for ‘commitBid’ requests. Our solution is to
extrapolate from the recent measurements of the ‘getBid’
request, and associate a lower confidence value.
• Communication and processing costs: We also measure the
mean response time of the Java application that implements
the back-end web service. Assuming that both mean response
time figures are calculated from measurements of
invocations from a specified broker to a specified web
service, this allows the mean communication time from the
broker to the service (and back) to be calculated. As the
measurements are being taken at the Java application
level, the communication figure includes the time it takes
to traverse the communication stack at both ends
(including the web service communication API, java.net
classes etc.), but this is likely to be small in
comparison to the overall communication cost. The mean
communication time can be calculated as the mean
end-to-end response time minus the mean processing time
(see the sketch after this list).
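A minimal arithmetic sketch of this decomposition follows;
the class, method and example figures are illustrative
only and are not taken from the Gourmet2Go source.

public final class CommunicationCost {
    /** Both means are in milliseconds; the difference estimates the round-trip communication cost. */
    public static double meanCommunicationTime(double meanEndToEnd, double meanProcessing) {
        return meanEndToEnd - meanProcessing;
    }

    public static void main(String[] args) {
        // e.g. 480 ms measured at the broker, 310 ms measured inside the back-end service JVM
        System.out.println(meanCommunicationTime(480.0, 310.0) + " ms");  // prints 170.0 ms
    }
}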
In the current service routing algorithm the end-to-end
response time and confidence are used.
Communication and processing costs provide the user
or system administrator with additional information,
but are not used as part of the service routing
algorithm. The framework can also be enhanced by
decomposing the end-to-end response time into key
transactions such as communications to other web
services (e.g. databases) and service registries etc.
Initial experiments have shown that this can provide
useful additional information.
The Gourmet2Go demonstrator is monitored by defining key
transactions around the communication between the broker
and the back-end web services (front-end transactions) and
also around the ‘getBid’ and ‘commitBid’ operations
(back-end transactions). Each transaction is assigned
additional descriptive information - the application name
is set to the name of the back-end service and the
transaction name is set to the name of the operation. The
user name associated with the transaction remains constant
throughout.

1 In this set of experiments we are focusing on a
confidence metric defined in terms of the amount and
relevance of historical performance data. It would be
possible to extend this, for example to consider the
variability of the data used to calculate the mean.
How should the performance data be measured
and stored?
Using the transactions described in the previous section,
the broker and back-end web services were ‘ARMed’2. A
sample from the TDL for one of the back-end web services
is shown in Figure 7.
<transaction type="method_source"
             appl_name="Sammys Grocery Ordering Service"
             tran_name="getBid_backend"
             user_name="ARM">
  <location class="com/ibm/ews/g2gServices/grocery/SammysGroceryService"
            method="getBid"/>
</transaction>
<transaction type="method_source"
             appl_name="Sammys Grocery Ordering Service"
             tran_name="commitBid_backend"
             user_name="ARM">
  <location class="com/ibm/ews/g2gServices/grocery/SammysGroceryService"
            method="commitBid"/>
</transaction>
Figure 7: Gourmet2Go TDL sample showing the
transaction definitions for the ‘getBid’ and ‘commitBid’
operations within one of the back-end web services’
interfaces.
How can useful data be extracted from the
performance information?
Before the performance data can be used for service
routing, it is necessary to determine how the mean
response time and confidence figures for ‘getBid’ and
‘commitBid’ requests contribute to the evaluation of
web service performance. In the current framework
two simplifications are made:
• ‘Using the mean response time of recent requests as the
primary performance metric is appropriate.’ For simplicity
we assume that the variability of the back-end web
services' response times is low, so it is appropriate to
observe them for a limited period to calculate a mean
response time prior to establishing a relationship with
them.

• ‘The complexity of the e-business scenario is
restricted.’ A number of simulation assumptions are made:
the factor by which ‘commitBid’ operations are, on
average, slower than ‘getBid’ operations is known for all
service providers and has a low variability; it will be
rare for multiple services to have the same rating on
which the broker must make a decision; and there will be
no correlation of broker and back-end service transactions
(for individual invocation chains) - while this is
possible using ARM parent correlators, it is not the aim
of this set of experiments.

2 It should be noted that this research is based on the
ARMing of Java; however, the approach is generally
applicable and can be widely applied, for example to
Microsoft's .NET.
Given the above assumptions - including that which
states that there is a known factor (P) by which the
mean end-to-end response time of a ‘commitBid’
operation (rc) is, on average, slower than the mean
end-to-end response time of a ‘getBid’ operation (rb),
for a particular service provider - the expected
performance (EP) of a service provider is the overall
average:
EP = (nb / nt) P rb + (nc / nt) rc
where nb, nc and nt are the number of bids, commits
and total number of invocations respectively.
Calculating confidence is more complex. When a web
service is first registered, the broker will have no
reliable confidence assigned until a number of
invocations have been recorded:
 0,
nb < nm
Cb = 
 n b , otherwise
 0,
nc < nm
Cc = 
 n c , otherwise
where Cb and Cc are the confidence values for bids
and commits and nm is the minimum number of
invocations required by the broker before the service
is used.
The combined confidence is calculated by weighting
each figure and adding them, as the increased use of
the service will increase the reliability of the
confidence value (although the rate of increase in
reliability will level over time):
Co = log(Wb (Cb + 1) + Wc (Cc + 1))
where Wb and Wc are the weights for bids and
commits. The higher the Wc relative to Wb the more
importance the broker will put on having actual
performance data as opposed to data extrapolated
from getBid requests. In the experimental framework
Wb = Wc = 0.5.
Finally, the weighted expected performance (WEP) for a
particular service provider is:

WEP = Co / EP
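The sketch below transcribes these formulas directly into
Java for illustration. The class and method names are
ours, the base of the logarithm is assumed to be 10 (the
paper does not state it), and the example invocation
counts and response times in main are hypothetical; nm, P
and the weights follow the experimental settings given in
the text.

public final class ServiceRating {

    /** Expected performance EP: a weighted mean response time in ms (lower is better). Assumes nb + nc > 0. */
    public static double expectedPerformance(int nb, int nc, double rb, double rc, double p) {
        int nt = nb + nc;                                   // total recorded invocations
        return (nb / (double) nt) * p * rb + (nc / (double) nt) * rc;
    }

    /** Confidence contribution: zero until nm invocations have been recorded. */
    public static double confidence(int n, int nm) {
        return (n < nm) ? 0.0 : n;
    }

    /** Combined confidence Co = log(Wb(Cb + 1) + Wc(Cc + 1)); base 10 assumed. */
    public static double combinedConfidence(double cb, double cc, double wb, double wc) {
        return Math.log10(wb * (cb + 1) + wc * (cc + 1));
    }

    /** Weighted expected performance WEP = Co / EP; higher values are preferred. */
    public static double weightedExpectedPerformance(int nb, int nc, double rb, double rc,
                                                     double p, int nm, double wb, double wc) {
        double ep = expectedPerformance(nb, nc, rb, rc, p);
        double co = combinedConfidence(confidence(nb, nm), confidence(nc, nm), wb, wc);
        return co / ep;
    }

    public static void main(String[] args) {
        // Hypothetical example: 30 'getBid' and 12 'commitBid' invocations with mean
        // response times of 400 ms and 450 ms; P = 1, nm = 10, Wb = Wc = 0.5 as in Section 4.3.
        System.out.println(
                weightedExpectedPerformance(30, 12, 400.0, 450.0, 1.0, 10, 0.5, 0.5));
        // prints roughly 0.0032, the same order of magnitude as the WEP values in Figures 8-10
    }
}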
How should the extracted performance information be used
by the broker?
In this experimental framework the broker has two metrics
for each back-end web service: price and weighted expected
choose to have the broker act as a service router to
the back-end web services by sorting them by either
figure and attempting to connect to the services in that
order until successful.
WEP is rounded to a user-specified number of
significant digits; if two or more web services exhibit
the same rounded weighted expected performance
they are sorted randomly. This allows the broker to
obtain additional performance information about two
or more web services when their WEPs are observed
as being identical.
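An illustrative sketch of this ordering step is given
below. It is not taken from the Gourmet2Go source; the
service names and WEP values are hypothetical, and rounded
ties are broken by shuffling before a stable sort.

import java.math.BigDecimal;
import java.math.MathContext;
import java.util.*;

public final class ServiceRouter {

    record Candidate(String name, double wep) {}

    /** Round a WEP value to the given number of significant digits. */
    static double roundToSignificant(double value, int digits) {
        return new BigDecimal(value).round(new MathContext(digits)).doubleValue();
    }

    /** Order candidates: highest rounded WEP first; equal rounded values keep a random order. */
    static List<Candidate> routingOrder(List<Candidate> candidates, int digits, Random rnd) {
        List<Candidate> order = new ArrayList<>(candidates);
        Collections.shuffle(order, rnd);                       // random tie-break
        order.sort(Comparator.comparingDouble(
                (Candidate c) -> roundToSignificant(c.wep(), digits)).reversed());
        return order;                                          // the broker tries these in turn
    }

    public static void main(String[] args) {
        List<Candidate> services = List.of(
                new Candidate("NaturalBag", 0.00324),          // hypothetical WEP values
                new Candidate("Sammys",     0.00321),
                new Candidate("DeeDees",    0.00510));
        // With 2 significant digits the first two services tie and are ordered randomly.
        System.out.println(routingOrder(services, 2, new Random()));
    }
}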
Incorporating basic pricing into these equations is
straightforward. However, it is noted that in a real-world
system it is likely that pricing is more complex, including
QoS-based pricing schemes etc.
Alternatively, the user can choose to have the broker
display all potential services for user selection. The
services can be sorted by either metric, but all price
and performance information for each potential
service is available, including a response time
breakdown into communication and processing times.
This approach is particularly useful when there are
factors that are hard for an automated broker to take
into account, for example a user's history with a
particular service provider.
In the current implementation, the service routing
functionality is added to the Gourmet2Go broker by
modifying the source to allow communication with the ARM
data repository service. This provides performance
information which is then used by the service bid
processing algorithm. A straightforward extension would be
to extract the service routing functionality into a web
service, thus providing a dedicated service routing
intermediary.

Summary of the Design Methodology

1. The user connects to the broker and specifies their
requirements for a service, whether to make a selection
based on performance or price, and whether the broker
should select the best service automatically.
2. The broker queries a list of potential services for
bids; the end-to-end response time and the
communication/back-end service processing times are
reported to the ARM data repository as the services
respond.
3. The broker obtains from the repository historical
performance information regarding the services and uses
this to select a service, possibly with user involvement.
In the current implementation only the end-to-end service
response time is considered.
4. The broker commits the selected bid, the response of
which is also measured and reported to the repository.
4.3 Results
This framework has been tested using Gourmet2Go
and a number of experiments have been conducted
that provide insight into routing algorithms and web
service performance. To facilitate the analysis of the
results, a number of enhancements were made to the
Gourmet2Go simulation:
• A client application was written that interacts with the
broker's HTTP-based web service interface using the same
requests that a user-controlled browser would submit
during a run-through of the Gourmet2Go demonstrator. This
allows the demonstrator to run through a fixed number of
iterations.
• The ‘getBid’ and ‘commitBid’ operations of the three
grocery back-end web services were weighted with a random
performance delay. Both the ‘Natural Bag’ (S1) and
‘Sammy's’ (S2) services were equally weighted to provide
(on average) the same processing response time so as to
represent the established service providers; the ‘Dee
Dees’ (S3) service, representing a new service provider,
was then weighted differently for each of the three
simulations whose results are documented below. Each
weighting incorporates a random element allowing a 40%
performance increase or decrease from the mean (a sketch
of this delay model follows this list). This random
element models the possible range of response times
induced by factors such as communication delay and load on
the broker and back-end servers (due to the volume of
requests, for example). For simplicity both ‘getBid’ and
‘commitBid’ are identically weighted by setting P = 1, and
the duration of the experiment is considered to be too
short to start expiring any measurements in the
performance history.
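A sketch of one possible form of this random weighting is
given below. The uniform distribution and the example mean
are assumptions; the paper only states the ±40% range.

import java.util.Random;

public final class SimulatedDelay {
    private final double meanMillis;
    private final Random rnd = new Random();

    public SimulatedDelay(double meanMillis) {
        this.meanMillis = meanMillis;
    }

    /** A delay drawn uniformly from [0.6 * mean, 1.4 * mean], i.e. within +/-40% of the mean. */
    public long nextDelayMillis() {
        double factor = 0.6 + 0.8 * rnd.nextDouble();
        return Math.round(meanMillis * factor);
    }

    public static void main(String[] args) throws InterruptedException {
        SimulatedDelay s3 = new SimulatedDelay(300.0);   // hypothetical mean for the 'Dee Dees' service
        Thread.sleep(s3.nextDelayMillis());              // stand-in for the weighted back-end operation
    }
}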
Three sets of results are presented (see Figures 8, 9
and 10) from which conclusions are drawn as to how
the routing algorithm responds to a newly published
service when other available services already have an
established performance-confidence rating.
S1 and S2 are both published in the service registry at
the start of each simulation. S3 is weighted at three
different levels ranging from a very high performance
(in comparison to the other two services), to the same
performance, and is published in each simulation after
20 iterations.
For each simulation all web services and the HTTP
client were executed on the same machine3 with
random communication delays modelled as stated
above. The minimum number of iterations required
before the service confidence rating can rise above
zero is set to 10 (although this figure can of course be
tuned).
[Plot: weighted expected performance against number of
iterations for services S1, S2 and S3.]
Figure 8: The simulation results when S3 is published as a
‘high performance’ service in relation to S1 and S2. The
results show that S1 is initially selected over S2 because
it performed better during this period. It then continues
to be used because of its increased confidence rating over
S2; the broker picks a service that seems to perform well
and keeps using it whilst it performs consistently. After
20 iterations the service S3 is published and provides
‘getBid’ response times even though it is not selected by
the broker. Once available for selection, its superior
performance provides a sharp increase in weighted expected
performance, assuring its dominant selection for the
remainder of the experiment.

[Plot: weighted expected performance against number of
iterations for services S1, S2 and S3.]
Figure 9: The simulation results when S3 is published as a
service with higher performance in relation to S1 and S2.
The results are very similar to Figure 8, except that it
takes longer for S3 to become the dominant service. Again,
once selected, the confidence in service S3 increases and
it continues to be used as the favoured service.
It follows from the results that a new web service would
have to perform significantly better than the currently
published services if the broker is to select this new
service over services for which it has a large degree of
confidence. The level at which this selection is made can
be partly controlled by modifying the weights given to the
confidence values (Wb and Wc), the minimum number of
invocations required by the broker before a service is
used (nm) and the factor by which a ‘commitBid’ operation
is, on average, slower than a ‘getBid’ operation (P).

3 All results were obtained using a Pentium III 450 MHz
with 256 MB RAM, running Redhat Linux 7.1, kernel 2.4.17,
IBM JVM 1.3.7, and the WSTK 2.4.2 run-time environment,
which includes WebSphere Micro-Edition 4.
[Plot: weighted expected performance against number of
iterations for services S1, S2 and S3.]
Figure 10: The simulation results when S3 is published
with the same performance as S1 and S2. Unless the
average response time for S3 improves over the other
services, the confidence will never overtake that of S2,
for which ‘commitBid’ as well as ‘getBid’ requests are
being recorded. It is noted that S2 is the preferred
service throughout the simulation due to its better
performance during the first 10 iterations.
The results for the three simulations confirm that the
routing algorithm is driving the selection of web services
based on actual performance results. As the performance of
the web services changes, so the routing of requests will
change. The rate of change will once again depend on the
confidence weightings set.
While Wb, Wc, nm and P might initially be set by hand,
tuning will probably be automated. This can be based
on the overall success of the system, judged for
example by the ‘contract success rate’ of the
subscribing customers, which can be calculated in
real-time based on system monitoring and feedback.
This is currently being investigated.
5 Conclusion
This paper demonstrates how the ARM standard and
a novel transaction definition and instrumentation
technique can be applied to the performance
monitoring of web services. The process of
transaction identification, mark up and performance
reporting provides insight into the process of applying
existing performance tools and standards to web
service based applications.
The framework can be used as the basis for
performance-based web service routing, which is
illustrated by applying it to Gourmet2Go, a dynamic B2B
application supplied with the IBM Web Services Toolkit.
The resulting application is both self-monitoring and
self-configuring: two essential properties for an
autonomic computing system [IBM01a].
Future work includes the addition of QoS contracts aimed
at providing different levels of service provision. This
will allow routing decisions to be influenced by the
service classes associated with each request.
Load balancing and additional dynamic system
properties (such as network outage and performance
bottlenecks) will also be explored in the context of
service routing.
A predictive framework is also being developed that
allows the performance of MPI-based e-science
applications to be characterised and predicted prior to
their execution; ARM is being used as a monitoring
framework for automated characterisation refinement.
Using such a framework to enhance the performance
of e-business applications such as those described
within this paper is currently being investigated.
6 Acknowledgments
The authors would like to express their gratitude to
IBM's TJ Watson Research Center and Hursley
Laboratories for their contributions towards this
research, and in particular to Robert Berry for his
valuable comments.
The work is sponsored in part by the EPSRC e-Science Core
Programme (contract no. GR/S03058/01), the NASA AMES
Research Center administered by USARDSG (contract no.
N68171-01-C-9012), the EPSRC (contract no. GR/R47424/01)
and IBM UK Ltd.
7 References
[BALL01] K. Ballinger, P. Brittenham, A. Malhotra, W.
Nagy and S. Pharies, “Web Services Inspection
Language (WS-Inspection) 1.0'', November 2001.
Available at http://www.ibm.com/developerworks/
webservices/library/ws-wsilspec.html
[BOX00] D. Box, D. Ehnebuske, G. Kakivaya, A.
Layman, N. Mendelsohn, H. F. Nielsen, S. Thatte and
D. Winer, “Simple Object Access Protocol (SOAP)
1.1'', W3C Note, May 2000. Available at
http://www.w3.org/TR/SOAP
[CAO00] J. Cao, D.J. Kerbyson, E. Papaefstathiou and G.R.
Nudd, “Modeling of ASCI High Performance Applications
using PACE'', 19th IEEE International Performance,
Computing and Communication Conference, Phoenix, USA,
485-492 (2000).
[CAO01] J. Cao, D.J. Kerbyson and G.R. Nudd,
“Performance Evaluation of an Agent-Based
Resource Management Infrastructure for GRID
Computing'', Proceedings of 1st IEEE/ACM Int.
Symposium on Cluster Computing and the Grid,
Brisbane Australia. 311-318 (2001).
[CAO02] J. Cao, D.P. Spooner, J.D. Turner, S.A.
Jarvis, D.J. Kerbyson, S. Saini and G.R. Nudd,
“Agent-based Resource Management for Grid
Computing'', Invited paper at the 2nd IEEE/ACM Int.
Symposium on Cluster Computing and the Grid,
Berlin, May (2002).
[CHRI01] E. Christensen, F. Curbera, G. Meredith and
S. Weerawarana, “Web Services Description
Language (WSDL) 1.1'', W3C Note, March 2001,
Available at http://www.w3.org/TR/wsdl
[CURB01a] F. Curbera, W. Nagy and S.
Weerawarana, “Web Services: Why and How'',
OOPSLA 2001 Workshop on Object-Oriented Web
Services, Florida, USA, 2001
[CURB01b] F. Curbera, N. Mukhi and S. Weerawarana, “On the
Emergence of a Web Services Component Model'', 6th
International Workshop on Component-Oriented Programming
(ECOOP), Budapest, Hungary, 2001.
[DUFT01] M. Duftler, N. Mukhi, A. Slominski and S.
Weerawarana, “Web Services Invocation Framework (WSIF)'',
OOPSLA 2001 Workshop on Object-Oriented Web Services,
Florida, USA, 2001.
[FOST98] I. Foster and C. Kesselman, “The Grid :
Blueprint for a New Computing Infrastructure'', Morgan
Kaufmann. 279-290 (1998).
[FOST01] I. Foster, C. Kesselman and S. Tuecke,
“The Anatomy of the Grid: Enabling Scalable Virtual
Organizations'', International J. Supercomputer
Applications, 15(3), 2001.
[OPNG01] The Open Group, “Application Response
Measurement (Issue 3.0 - Java Binding)'', Open
Group Technical Specification, October 2001.
Available at http://www.opengroup.org/management/
arm.htm
[SPOO02] D.P. Spooner, J. Cao, J.D. Turner, H.N. Lin
Choi Keung, S.A. Jarvis and G.R. Nudd, “Localised
Workload Management Using Performance Prediction
and QoS Contracts'', 18th Annual UK Performance
Engineering Workshop (UKPEW2002), University of
Glasgow, UK. July 2002
[FOST02] I. Foster, C. Kesselman, J. Nick, S. Tuecke,
“The Physiology of the Grid: An Open Grid Services
Architecture for Distributed Systems Integration'',
February 2002. Available at http://www.globus.org/
research/papers/ogsa.pdf
[TURN01] J.D. Turner, D.P. Spooner, J. Cao, S.A.
Jarvis, D.N. Dillenberger and G.R. Nudd, “A
Transaction Definition Language for Application
Response Measurement'', International Journal of
Computer Resource Measurement, 105, 55-65
(2001).
[IBM01a] IBM Corporation, “Autonomic Computing
Manifesto'', October 2001. Available at
http://www.research.ibm.com/autonomic/
[UDDI00] UDDI Project, “UDDI Technical White Paper'',
September 2000. Available at http://www.uddi.org
[IBM01b] IBM Corporation, Microsoft Corporation,
“Web Services Framework'', W3C Web Services
Workshop, San Jose, USA, April 2001. Available at
http://w3.org/2001/03/WSWS-popa/
[IBM02] IBM Web Services Toolkit. Available at
http://www.alphaworks.ibm.com/tech/webservicestoolkit
[JOHN00] M.W. Johnson and J. Crowe, “Measuring
the Performance of ARM 3.0 for Java'', Proceedings of
CMG2000 International Conference, Orlando, USA,
December 2000.
[KREG01] H. Kreger, “IBM Web Services Conceptual
Architecture (WSCA 1.0)”, May 2001. Available at
http://www.ibm.com/software/solutions/webservices/
pdf/WSCA.pdf
[LEIN99] W. Leinberger and V. Kumar, “Information
Power Grid: The New Frontier in Parallel
Computing?'', IEEE Concurrency 7(4), (1999).
[NUDD00] G.R. Nudd, D.J. Kerbyson, E. Papaefstathiou, S.C.
Perry, J.S. Harper and D.V. Wilcox, “PACE - A Toolset for
the Performance Prediction of Parallel and Distributed
Systems'', International Journal of High Performance
Computing Applications, Special Issue on Performance
Modelling, 14(3), 228-251 (2000).