APPLICATION RESPONSE MEASUREMENT OF DISTRIBUTED WEB SERVICES

J.D. Turner, D.A. Bacigalupo, S.A. Jarvis, D.N. Dillenberger+, G.R. Nudd
High Performance Systems Group, University of Warwick, Coventry, UK
IBM TJ Watson Research Center, New York, USA+
{jdt,daveb,saj,grn}@dcs.warwick.ac.uk
engd@us.ibm.com+

Web service technology will provide a platform for dynamic e-business applications. This paper describes a framework for identifying, monitoring and reporting performance data for critical transactions within a web service using the Java ARM standard, a Transaction Definition Language (TDL) and a bytecode instrumentation tool. The data extracted using this framework is shown to be appropriate for dynamically selecting web services based on performance metrics. A case study is provided using Gourmet2Go, an IBM demonstrator of a dynamic business-to-business (B2B) application.

Keywords: Web Services, ARM, Performance Measurement, Service Routing, Java

1 Introduction

High performance computing has traditionally been based on large parallel multi-node MIMD machines. However, the vast increase in network bandwidth, in desktop computer performance and in the availability of large numbers of commodity computing components has resulted in a change in the preferred high performance computing infrastructure. This is demonstrated most notably in the emergence of Grid computing [FOST98, LEIN99]: geographically dispersed resource-sharing networks of disparate heterogeneous resources whose management, rather than being centralised, is maintained through multiple administrative domains. A number of middleware standards are emerging to support this infrastructure (including Globus [FOST01], for example), together with high performance resource allocation services with the ability to adapt to continuous variations in user demands and resource availability.

More recently it has been proposed that Grids provide a supporting infrastructure for both e-science and e-business applications [FOST02], allowing efficient execution while maintaining user-driven quality of service (QoS) requirements. To be of maximum value, such an infrastructure should assist applications in meeting these requirements autonomically - that is, applications should be self-configuring so they are able to hide their complexities and be less reliant on human intervention [IBM01a]. However, the different characteristics of e-science and e-business applications (e.g. large ‘run-once’ jobs versus high-frequency request-driven services) necessitate different methodologies for each - for example, the use of traditional resource scheduling for large scientific jobs and service routing for high-frequency e-business-type requests.

1.1 Performance-Driven Grid Scheduling

The scheduling of distributed scientific applications over Grid architectures is facilitated through the use of TITAN [SPOO02] (see Figure 1). TITAN uses iterative heuristic algorithms to improve a number of scheduling metrics, including makespan and idle time, while aiming to meet QoS requirements such as deadline time. This work is differentiated from other Grid scheduling research through the application of PACE [NUDD00, CAO00] - a performance prediction environment - and A4 [CAO01, CAO02] - an agent-based resource discovery and advertisement framework.
PACE provides performance prediction statistics for communication and computation by analysing and abstracting the characteristics of the application (application modelling) and evaluating this model on an abstracted hardware model (resource modelling) derived from benchmarks of the available computing and network resources. These predictions can be calculated in real-time, which means that the TITAN scheduler can be continuously updated as new applications are launched and as the available resources change.

Figure 1: TITAN: a scheduling environment incorporating performance prediction, resource allocation, system policies, performance histories and QoS. Feedback during execution aids the prediction and scheduling of similar applications in the future.

1.2 Performance-Driven Web Service Routing

While TITAN is well suited to the scheduling of large scientific applications, a different strategy is required for e-business applications. Business-to-business (B2B) and business-to-consumer (B2C) applications are often characterised by their long-running and request-driven services. The requests are potentially high frequency and typically exhibit short execution times. The web services framework, which is currently being standardised with a strong but not exclusive focus on e-business applications, will provide a platform for these high-throughput, dynamic, distributed applications. Just as scientific Grid applications require resource allocation and scheduling, it will be necessary for web services to be apportioned in some way. This process is likely to involve routing the communication from a service requester, through several intermediaries, to the ultimate service provider; in order to achieve this behaviour a service routing mechanism is required.

This paper describes a performance-driven service routing framework for web services. Routing decisions are based on measured end-to-end response times of available services; these measurements are made through the use of the Application Response Measurement (ARM) open standard [JOHN00, OPNG01]. Real-time performance monitoring and historical performance data are used as the basis for service routing. A case study is provided using Gourmet2Go, a dynamic B2B application supplied with the IBM Web Services Toolkit [IBM02]. Results are documented and show how performance metrics such as average service response time and associated confidence (in the predicted service time) impact on the selected service and routing behaviour.

The remaining sections of this paper provide:

- an introduction to web services, their related technologies and service routing intermediaries (Section 2);
- an overview of ARM and an ARM-compliant XML-based Transaction Definition Language [TURN01], for the automatic ‘ARMing’ of Java applications (Section 3);
- a demonstration of our performance monitoring and service routing framework using an example web services application (Section 4).

2 Web Services

The recent interest in web service technology parallels the increase in distributed infrastructures and their associated application domains. The Web, for example, is known to facilitate human-to-application interaction, but currently provides limited support for fully automated and dynamic application-to-application interactions. Distributed enterprise computing platforms and their associated component technologies, including Enterprise Java Beans (EJB), COM+ and CORBA, facilitate application-to-application integration, but are hard to apply across organisational boundaries because of the lack of a standardised infrastructure.

One approach to facilitating application-to-application interactions across distributed infrastructures is the creation of a standard framework that can be used to create, describe, publish and connect network-accessible ‘web’ services (including programs, databases, networks, computational or storage resources etc. [FOST02]). It has been the emergence of e-business, however, that has led to the increase in the industry-wide effort to create a standard web services framework, the primary goals of which are described as [CURB01a]:
- systematic application-to-application interactions on the Web;
- the smooth integration of existing infrastructure;
- the support for efficient application integration.

However, although this effort is focused on creating a web services framework based on existing web protocols (including HTTP, for example), web services are likely to function in other environments, including private wide-area and local-area networks. The open, heterogeneous and dynamic nature of the Web leads to a set of key requirements for a web-based distributed computing model such as the web services framework [CURB01b]:

- ‘A small set of supported standard protocols and data formats for each computing node.’ E.g. a web services communications protocol and a web services description language are all that is required for basic interactions.
- ‘A loose coupling between web services.’ E.g. allowing web services to switch between sub-services which provide the same functionality.
- ‘Interfaces defined in terms of the messages that they process.’ Involving a shift in focus from the APIs supported by each platform to what goes ‘on the wire’.
- ‘Web services which are able to integrate with each other on an ad-hoc basis.’ E.g. e-business web services that are able to establish short-term goal-directed relationships.
- ‘Web services located ‘by-what’ and ‘by-how’.’ E.g. searching for web services that provide particular functionality (from a given community) with a predefined QoS.

These requirements could be realised by a web services platform in which a wide variety of functionalities are encapsulated as web services and run on any computing node that meets a minimal set of requirements. These web services would be able to ‘dynamically bind’ to each other at runtime, which might involve a web service searching for other web services based on functional and non-functional criteria. These services would interact through the exchange of ‘messages on the wire’ and would enable future applications to be constructed by dynamically orchestrating collections of contributing services.
2.1 Web Service Standards

One of the early attempts at web services standardisation began with the release of IBM and Microsoft's web service specifications for B2B e-business. The specifications, built on existing industry standards, were based on XML and contained two core initiatives [CURB01a]:

1. A data exchange protocol called SOAP [BOX00], for sending XML messages wrapped in a standard ‘header and body’-based XML envelope over an arbitrary transport protocol. SOAP includes a binding for HTTP and a convention for Remote Procedure Call (RPC).

2. A unified representation for services called WSDL [CHRI01], providing an XML-based language for describing web services. An abstract description, for example the XML messages sent and received by the web service, is mapped onto one or more communication protocol bindings. An application wishing to use the web service chooses which of the available protocols to use. It is likely that other specifications will build on WSDL to enable a more complete description of web services (for example non-functional aspects of the service such as QoS).

More recently, web service invocation frameworks such as WSIF [DUFT01] have been proposed. These should simplify interaction with WSDL-described services; for example, an application might only be required to specify the WSDL file of a web service in order to interact with it at the XML exchange level. Remaining interactions, such as selecting the protocol binding, could then be handled by the invocation framework.

SOAP and WSDL have now become de facto industry standards. Since their release, many other specifications have been proposed, with standards for service discovery receiving particular attention. For example, service registries such as UDDI [UDDI00] allow web service descriptions to be published (see Figure 2 [KREG01]). These registries can be searched at develop-time and at run-time; for example, web services can search for and ‘dynamically bind’ to other services. Services can also be published using inspection languages such as WS-Inspection [BALL01], where service discovery is performed through the inspection of service sites.

Figure 2: Publishing a web service; a provider publishes a service description in a registry. Subsequently, a requestor finds the description and proceeds to ‘bind’ to the service.

2.2 Implementing Web Services

There are currently two main implementation platforms for web services, Microsoft's .NET and J2EE. Both platforms use SOAP, WSDL and UDDI to promote basic interoperability, but additional non-standard functionality is also provided. Supporting web-server application products are currently being implemented and include IBM's WebSphere. In addition to the WebSphere e-business platform, IBM has also published its J2EE-based Web Services Toolkit (WSTK) [IBM02]. This provides a variety of APIs, a pre-installed run-time environment, an implementation of a private UDDI registry and a number of demonstrators. One of these demonstrators, Gourmet2Go, forms the basis of the case study found in Section 4.
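To make the SOAP convention of Section 2.1 concrete - an XML ‘header and body’ envelope carried over HTTP - the following is a minimal sketch of a hand-built SOAP 1.1 call using only standard java.net classes. The endpoint URL, operation name and SOAPAction value are hypothetical; a production client would normally use a toolkit API (e.g. from the WSTK) driven by the service's WSDL description rather than hand-built envelopes.

import java.io.*;
import java.net.*;

public class SoapCallSketch {
    public static void main(String[] args) throws IOException {
        // Hypothetical endpoint and operation; real values would come
        // from the service's published WSDL description.
        URL endpoint = new URL("http://example.com/services/grocery");
        String envelope =
            "<SOAP-ENV:Envelope xmlns:SOAP-ENV="
            + "\"http://schemas.xmlsoap.org/soap/envelope/\">"
            + "<SOAP-ENV:Body><getBid/></SOAP-ENV:Body>"
            + "</SOAP-ENV:Envelope>";

        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        conn.setRequestProperty("SOAPAction", "\"urn:getBid\"");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(envelope.getBytes("UTF-8"));
        }

        // The response carries the result (or a SOAP fault) in the same
        // header-and-body envelope format.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            for (String line; (line = in.readLine()) != null; ) {
                System.out.println(line);
            }
        }
    }
}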
2.3 Service Routing Intermediaries

The web services framework provides various mechanisms for interacting with services at higher levels of abstraction and enables different versions of a conceptual service to be published. For example, in an e-business scenario the communication between a service requester and a conceptual service that has been found in a registry could traverse several service selection levels:

1. Service provider selection: a standard interface for providing a particular product or service is implemented (or extended) by competing companies.

2. Service level agreement (SLA) selection: the service provider offers the web service with various QoS characteristics and associated prices.

3. Service replica selection: the web service is replicated on different servers and a server that can provide the agreed QoS level is selected.

It is likely that the communication will pass through a number of intermediaries that will contribute toward service selection decisions and hence perform a service routing role. Web service routing (or message exchange) is the subject of significant attention and has been identified as one of the key functional components of the web services framework [IBM01b]. An example of an intermediary might be a broker that helps a shopper select and then connect to a company to obtain a product or service based on criteria such as price, quality and the performance of the company's web services, using parameters set by the requester and/or provider.

It is important for intermediaries to consider non-functional criteria such as the performance of the web services being considered for selection. This requires mechanisms for measuring, storing and utilising web service performance data. An approach to web service performance monitoring using the ARM standard is discussed in the next section.

3 Application Response Measurement

With increasingly complex computer infrastructure it is more difficult to analyse and report on performance. Individual units of work (or transactions) can be distributed, executed on different systems, across multiple domains, via different processes and threads. The type of work that these transactions carry out may also vary, including database access, application logic and data input and/or presentation. Analysing performance is therefore problematic and administrators face a number of complex performance-related questions:

- Are transactions succeeding?
- If a transaction fails, what is the cause of the failure?
- What is the response time experienced by the end user?
- Which sub-transactions are taking too long?
- Where are the bottlenecks?
- How many of which transactions are being used?
- How can the application and environment be tuned to be more robust and to perform better?

The Application Response Measurement (ARM) standard allows these questions to be answered. ARM allows the developer to define a number of transactions within an application, the performance of which is then measured during execution by an ARM consumer interface. The choice of which transactions to monitor is made by the developer and will usually correspond to the areas of the application which are considered performance critical.

3.1 Java 3.0 Standard

The ARM 3.0 specification [OPNG01] provides a Java binding for the response measurement of Java applications. It defines a number of Java interfaces which must be implemented by a valid ARM consumer interface (an example of which can be found in Figure 3). Transactions are defined using an ‘ArmTransactionFactory’: the application creates a new instantiation of an ‘ArmTransaction’ object and delimits the transaction using start and stop transaction calls. The consumer interface measures the performance of the transaction between these calls and reports the information to an associated data repository (represented in Figure 3 by the reporting classes).

Figure 3: An example of an ‘ARMed’ application. The transaction is defined using an ARM consumer interface.

Optional descriptive information can be associated with each transaction to aid analysis. This data comes in the form of definition objects, including the ‘ArmMetricDefinition’, ‘ArmTranDefinition’ and ‘ArmUserDefinition’ objects found in Figure 3. Instantiations of these objects are provided by the consumer interface's implementation of the ‘ArmDefinitionFactory’. These objects are populated with descriptive information provided by the application and are processed at set intervals; when the application invokes a process call, the consumer interface records this information in the reporting classes. The descriptive information is associated with each transaction using unique UUID numbers, allowing meaningful descriptive data for each transaction to be stored within the data repository.
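As an illustration of this pattern, the following is a minimal sketch of a hand-‘ARMed’ operation. The interfaces shown are simplified stand-ins for the ARM 3.0 Java binding (exact factory acquisition and method signatures are defined by the specification [OPNG01] and the consumer implementation, and are simplified here); the application, transaction and user names follow the TDL sample of Figure 7.

// Simplified stand-ins for the ARM 3.0 Java binding interfaces; the
// real signatures are defined by the specification [OPNG01].
interface ArmTransaction {
    int STATUS_GOOD = 0;
    int STATUS_FAILED = 1;
    void start();              // begin response-time measurement
    void stop(int status);     // end measurement and report the status
}

interface ArmTransactionFactory {
    ArmTransaction newTransaction(String applName, String tranName,
                                  String userName);
}

class ArmedGroceryService {
    private final ArmTransactionFactory factory;

    ArmedGroceryService(ArmTransactionFactory factory) {
        this.factory = factory;
    }

    double getBid(String shoppingList) {
        // Wrap the performance-critical operation in start/stop calls;
        // the consumer interface measures the elapsed time and reports
        // it, with the descriptive names, to the data repository.
        ArmTransaction tx = factory.newTransaction(
            "Sammys Grocery Ordering Service", "getBid_backend", "ARM");
        tx.start();
        try {
            double bid = computeBid(shoppingList);
            tx.stop(ArmTransaction.STATUS_GOOD);
            return bid;
        } catch (RuntimeException e) {
            tx.stop(ArmTransaction.STATUS_FAILED);
            throw e;
        }
    }

    private double computeBid(String shoppingList) {
        return 42.0; // placeholder for the real bid calculation
    }
}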
3.2 An ARM Implementation

An ARM consumer implementation and data repository have been developed to allow distributed ‘ARMed’ applications to process and report transactions to a remote data repository service. The ARM consumer interface connects to this data repository and reports all processed transaction measurement data, storing the information in an efficient format for future use. A number of standard clients have also been implemented that allow data to be extracted from the repository; this includes a GUI so that developers can monitor current and previous transactions. The ARM data repository infrastructure is shown in Figure 4.

Figure 4: An ARM implementation model containing: an ARM consumer interface client residing in the same JVM as the original application; an ARM data repository server for all reported and processed transactions; a number of server clients for data retrieval, including a GUI client for a continuous update of all previously reported and currently executing transactions, and a data client for specific transaction queries.

In the current implementation a central ARM repository service is defined. Further scalability enhancements can be provided by associating local data repository services with each ‘group’ of web services and linking these through a connectivity hierarchy.

3.3 Transaction Definition Language

While the ARM standard provides a well-defined framework for the response measurement of distributed applications, there are a number of inherent disadvantages to this approach. These include the need for a detailed knowledge of the ARM specification, the time required to instrument performance-critical transactions, and the availability of source code. In order to overcome these issues a Transaction Definition Language (TDL) has been developed [TURN01]. This allows developers to define where transactions are located within their applications using an XML-based descriptor language. This is then used to automatically instrument the object code of the application using a bytecode instrumentation tool, which itself conforms to both the ARM and Java Virtual Machine specifications. The TDL allows descriptive information to be associated with each transaction using a number of optional attributes; it is also possible to define groups of transactions which may reside in an application's jarfile or classpath. The current version of the TDL specification can be seen in Figure 5.

The main advantage of this approach is that developers can ARM performance-critical areas of code simply by editing the associated XML file. The process of code instrumentation is then automated (saving the developer time) and bytecode-based (eliminating the need for proprietary source code). Applications can be monitored during the process of development, or alternatively after they have been deployed in a live environment. The Transaction Definition Language, including its specification, design and implementation, is described in [TURN01].
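Conceptually, the instrumentation tool rewrites the bytecode of each method named in the TDL so that it behaves as if the start/stop calls of Section 3.1 had been written by hand. The sketch below shows this effect as source code for readability (the tool itself operates on bytecode, and the renamed helper method is hypothetical); it reuses the stand-in interfaces from the previous sketch.

class InstrumentedGroceryService {
    private final ArmTransactionFactory factory;

    InstrumentedGroceryService(ArmTransactionFactory factory) {
        this.factory = factory;
    }

    // The wrapper the instrumenter effectively generates around the
    // original method body.
    double getBid(String shoppingList) {
        ArmTransaction tx = factory.newTransaction(
            "Sammys Grocery Ordering Service", "getBid_backend", "ARM");
        tx.start();
        try {
            double result = getBid$original(shoppingList);
            tx.stop(ArmTransaction.STATUS_GOOD);
            return result;
        } catch (RuntimeException e) {
            // Corresponds to fail_on_exception="yes" in the TDL.
            tx.stop(ArmTransaction.STATUS_FAILED);
            throw e;
        }
    }

    // Hypothetical renamed copy of the original method body.
    private double getBid$original(String shoppingList) {
        return 42.0;
    }
}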
<!-- Transaction Definition Language DTD (v1.1) -->
<!ELEMENT tdl (transaction+)>
<!ELEMENT transaction (location, line_number?, metric*)>
<!ATTLIST transaction
    type (method_source | method_call | line_number) #REQUIRED
    appl_name CDATA #IMPLIED
    tran_name CDATA #IMPLIED
    user_name CDATA #IMPLIED
    fail_on_exception (yes | no) "yes">
<!ELEMENT location EMPTY>
<!ATTLIST location
    class CDATA #REQUIRED
    method CDATA #REQUIRED>
<!ELEMENT line_number EMPTY>
<!ATTLIST line_number
    begin CDATA #REQUIRED
    end CDATA #REQUIRED>
<!ELEMENT metric EMPTY>
<!ATTLIST metric
    type (Counter32 | Counter64 | CounterFloat32 | Gauge32 | Gauge64 |
          GaugeFloat32 | NumericId32 | NumericId64 | String8 | String32)
          #REQUIRED
    value CDATA #REQUIRED>

Figure 5: Transaction Definition Language DTD (v1.1).

4 Case Study

Service routing intermediaries (described in Section 2) are likely to play an important role in future web service networks. Such intermediaries will need to consider QoS metrics when making service selection decisions, the end-to-end response time being one such example. For this to be possible there must be a mechanism for monitoring the performance of the web services and a way of using this data during service selection. This section provides a case study that demonstrates an ARM-based routing framework. Gourmet2Go, a demonstrator of a dynamic B2B application provided with the IBM Web Services Toolkit, is used together with the ARM standard, which provides the infrastructure for service performance monitoring and measurement. The process involved in configuring Gourmet2Go for performance-based service routing is described; the results obtained from a number of Gourmet2Go scenarios are also documented.

4.1 Gourmet2Go

Gourmet2Go is a web services demonstrator from the IBM Web Services Toolkit. The demonstrator provides an example of how a web service acts as an intermediary broker, assisting users to select back-end web services by obtaining bids from services published in a UDDI registry. In Gourmet2Go, the back-end web services sell groceries and the broker presents itself as a value-added ‘meal planning’ service. The underlying architecture is, however, generic and can therefore be used to demonstrate the brokering of any kind of service. The architecture of the Gourmet2Go demonstrator can be found in Figure 6. A typical interaction is as follows:

1. The user interacts with the Gourmet2Go Web application via a Web browser with the intention of building a shopping list of groceries. This represents the user specifying the service that the back-end web service must perform.

2. The broker searches the registry for businesses with published web services that sell groceries. This represents the broker selecting a number of potential services using information provided in the registry.

3. The shopping list is passed by the broker to each of the back-end web services (located from the registry) via a ‘getBid’ request. This represents the broker contacting the short-list of candidates directly.

4. The broker summarises the bids for the user based on price, and the user then selects a supplier using the information presented to them. This stage represents the broker assisting the user in decision making.

5. The broker sends a ‘commitBid’ request to the service that the user selects. This represents the broker continuing to act as an intermediary while the user interacts with the selected back-end service. This interaction has the potential to be significantly more complex than the confirmation message in the Gourmet2Go demonstrator - for example, specifying the details of what exactly is to be purchased and how it is to be paid for, through to delivery tracking and after-sales support.

The Web services in Gourmet2Go are implemented in Java using the WSTK and associated standards.
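A hypothetical Java view of the broker/back-end contract implied by this interaction is sketched below. The operation names follow the demonstrator (‘getBid’, ‘commitBid’); the types and the price-based selection loop are illustrative inventions, and the automatic selection replacing step 4 anticipates the enhancement described in Section 4.2 (in the demonstrator itself the user makes the final choice).

import java.util.List;

// Illustrative types; the demonstrator's actual classes differ.
class ShoppingList { /* groceries requested by the user (step 1) */ }
class Bid { double price; }
class Receipt { String confirmation; }

interface GroceryService {
    Bid getBid(ShoppingList list);   // step 3: obtain a quote
    Receipt commitBid(Bid bid);      // step 5: accept a quote
}

class BrokerSketch {
    // Steps 2-5, with the user's choice (step 4) replaced by an
    // automatic cheapest-price selection. Assumes at least one candidate.
    Receipt run(List<GroceryService> candidates, ShoppingList list) {
        GroceryService chosen = null;
        Bid best = null;
        for (GroceryService s : candidates) {
            Bid b = s.getBid(list);
            if (best == null || b.price < best.price) {
                best = b;
                chosen = s;
            }
        }
        return chosen.commitBid(best);
    }
}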
Figure 6: The design of the Gourmet2Go demonstrator from the IBM Web Services Toolkit. A typical interaction is shown in which a user browses the Gourmet2Go broker to obtain information on the available services and then selects the preferred service.

4.2 Enhancing Gourmet2Go with Performance Information

From an application performance perspective, there are two limitations within Gourmet2Go that our framework overcomes:

1. The only metric the broker considers when evaluating the back-end web services is the price of those services. When dynamically orchestrating web services it is also important to consider performance.

2. It is currently the user who makes the final decision as to which back-end web service to select. It would also be useful if the broker were able to make this decision automatically and thereby act as a service router.

A new framework based on ARM and using the TDL is described. This framework allows the performance of web services to be monitored and allows historic performance data to be utilised by intermediaries such as the broker provided in the Gourmet2Go demonstrator. The design of the framework is described in terms of four key design questions.

What performance data should be collected?

As stated in Section 3, one aim of ARM is to provide a basis for measuring the performance of defined transactions within e-business applications; this includes the response time of the transaction, the status of the transaction, etc. In this case study end-to-end response time is used as the primary service routing metric. It would be possible, however, to extend this method to include other custom metrics such as transaction success rate. The following performance data is retrieved:

- Mean end-to-end response time and confidence: in the Gourmet2Go example, the key measure is the mean end-to-end response time of recent ‘commitBid’ requests. We also calculate how representative this mean is likely to be; this is done by providing an additional confidence estimate based on the number of recently recorded invocations. (In this set of experiments we focus on a confidence metric defined in terms of the amount and relevance of historical performance data; it would be possible to extend this, for example, to consider the variability of the data used to calculate the mean.) When a new web service is published in the registry there will not be any historical performance data available for ‘commitBid’ requests. Our solution is to extrapolate from the recent measurements of the ‘getBid’ request, and associate a lower confidence value.

- Communication and processing costs: we also measure the mean response time of the Java application that implements the back-end web service. Assuming that both mean response time figures are calculated from measurements of invocations from a specified broker to a specified web service, this allows the mean communication time from the broker to the service (and back) to be calculated. As the measurements are taken at the Java application level, the communication figure includes the time it takes to traverse the communication stack at both ends (including the web service communication API, java.net classes etc.), but this is likely to be small in comparison to the overall communication cost. The mean communication time can be calculated as the mean end-to-end response time minus the mean processing time.
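Writing \bar{r}_{e2e} for the mean end-to-end response time measured at the broker and \bar{r}_{proc} for the mean processing time measured at the back-end Java application (notation introduced here for clarity, not taken from the paper), the estimate of the mean round-trip communication time is:

    \bar{r}_{comm} = \bar{r}_{e2e} - \bar{r}_{proc}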
In the current service routing algorithm the end-to-end response time and confidence are used. Communication and processing costs provide the user or system administrator with additional information, but are not used as part of the service routing algorithm. The framework can also be enhanced by decomposing the end-to-end response time into key transactions such as communications to other web services (e.g. databases) and service registries etc. Initial experiments have shown that this can provide useful additional information.

The Gourmet2Go demonstrator is monitored by defining key transactions around the communication between the broker and the back-end web services (front-end transactions) and also around the ‘getBid’ and ‘commitBid’ operations (back-end transactions). Each transaction is assigned additional descriptive information: the application name is set to the name of the back-end service and the transaction name is set to the name of the operation. The user name associated with the transaction remains constant throughout.

How should the performance data be measured and stored?

Using the transactions described in the previous section, the broker and back-end web services were ‘ARMed’. (It should be noted that this research is based on the ARMing of Java; however, the approach is generally applicable and can be widely applied, for example to Microsoft's .NET.) A sample from the TDL for one of the back-end web services is shown in Figure 7.

<transaction type="method_source"
    appl_name="Sammys Grocery Ordering Service"
    tran_name="getBid_backend"
    user_name="ARM">
    <location class="com/ibm/ews/g2gServices/grocery/SammysGroceryService"
        method="getBid"/>
</transaction>

<transaction type="method_source"
    appl_name="Sammys Grocery Ordering Service"
    tran_name="commitBid_backend"
    user_name="ARM">
    <location class="com/ibm/ews/g2gServices/grocery/SammysGroceryService"
        method="commitBid"/>
</transaction>

Figure 7: Gourmet2Go TDL sample showing the transaction definitions for the ‘getBid’ and ‘commitBid’ operations within one of the back-end web services’ interfaces.

How can useful data be extracted from the performance information?

Before the performance data can be used for service routing, it is necessary to determine how the mean response time and confidence figures for ‘getBid’ and ‘commitBid’ requests contribute to the evaluation of web service performance. In the current framework two simplifications are made:

- ‘Using the mean response time of recent requests as the primary performance metric is appropriate.’ For simplicity we assume that the variability of the back-end web services’ response times is low, so it is appropriate to observe them for a limited period to calculate a mean response time prior to establishing a relationship with them.

- ‘The complexity of the e-business scenario is restricted.’ A number of simulation assumptions are made: the factor by which ‘commitBid’ operations are, on average, slower than ‘getBid’ operations is known for all service providers and has a low variability; it will be rare for multiple services to have the same rating on which the broker must make a decision; and there will be no correlation of broker and back-end service transactions (for individual invocation chains) - while this is possible using ARM parent correlators, it is not the aim of this set of experiments.
Given the above assumptions - including that there is a known factor (P) by which the mean end-to-end response time of a ‘commitBid’ operation (r_c) is, on average, slower than the mean end-to-end response time of a ‘getBid’ operation (r_b) for a particular service provider - the expected performance (EP) of a service provider is the overall average:

    EP = \frac{n_b}{n_t} P r_b + \frac{n_c}{n_t} r_c

where n_b, n_c and n_t are the number of bids, the number of commits and the total number of invocations respectively.

Calculating confidence is more complex. When a web service is first registered, the broker will have no reliable confidence in it until a number of invocations have been recorded:

    C_b = \begin{cases} 0, & n_b < n_m \\ n_b, & \text{otherwise} \end{cases}
    \qquad
    C_c = \begin{cases} 0, & n_c < n_m \\ n_c, & \text{otherwise} \end{cases}

where C_b and C_c are the confidence values for bids and commits, and n_m is the minimum number of invocations required by the broker before the service is used. The combined confidence is calculated by weighting each figure and adding them, as increased use of the service will increase the reliability of the confidence value (although the rate of increase in reliability will level off over time):

    C_o = \log\big(W_b (C_b + 1) + W_c (C_c + 1)\big)

where W_b and W_c are the weights for bids and commits. The higher W_c is relative to W_b, the more importance the broker places on actual performance data as opposed to data extrapolated from ‘getBid’ requests. In the experimental framework W_b = W_c = 0.5.

Finally, the weighted expected performance (WEP) for a particular service provider is:

    WEP = C_o \cdot EP
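The following is a minimal sketch transcribing these formulas into Java (assuming, as the definitions suggest, that n_t = n_b + n_c, and using the natural logarithm, since the paper does not fix the log base). Variable names mirror the symbols above; this is illustrative, not the broker's actual code.

class ServiceRatingSketch {
    // EP = (nb/nt) * P * rb + (nc/nt) * rc
    static double expectedPerformance(int nb, int nc, double rb, double rc,
                                      double P) {
        int nt = nb + nc; // assumed: total invocations = bids + commits
        return (nb / (double) nt) * P * rb + (nc / (double) nt) * rc;
    }

    // Cb (or Cc): zero until nm invocations have been recorded.
    static double confidence(int n, int nm) {
        return (n < nm) ? 0.0 : n;
    }

    // WEP = Co * EP with Co = log(Wb*(Cb+1) + Wc*(Cc+1)).
    static double weightedExpectedPerformance(int nb, int nc, double rb,
            double rc, double P, int nm, double Wb, double Wc) {
        double Cb = confidence(nb, nm);
        double Cc = confidence(nc, nm);
        double Co = Math.log(Wb * (Cb + 1) + Wc * (Cc + 1));
        return Co * expectedPerformance(nb, nc, rb, rc, P);
    }
}

For example, with W_b = W_c = 0.5 (as in the experimental framework), a service that has not yet reached n_m recorded invocations of either kind has C_b = C_c = 0, so C_o = log(1) = 0 and its WEP remains zero until the minimum number of invocations has been observed.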
How should the extracted performance information be used by the broker?

In this experimental framework the broker has two metrics for each back-end web service: price and weighted expected performance (WEP). The user can choose to have the broker act as a service router to the back-end web services by sorting them by either figure and attempting to connect to the services in that order until successful. WEP is rounded to a user-specified number of significant digits; if two or more web services exhibit the same rounded weighted expected performance they are sorted randomly. This allows the broker to obtain additional performance information about two or more web services when their WEPs are observed as being identical. Incorporating basic pricing into these equations is straightforward; however, it is noted that in a real-world system pricing is likely to be more complex, including QoS-based pricing schemes etc.

Alternatively, the user can choose to have the broker display all potential services for user selection. The services can be sorted by either metric, but all price and performance information for each potential service is available, including a response time breakdown into communication and processing times. This approach is particularly useful when there are factors that are hard for an automated broker to take into account, for example a user's history with a particular service provider.

In the current implementation, the service routing functionality is added to the Gourmet2Go broker by modifying the source to allow communication with the ARM data repository service. This provides the performance information which is then used by the service bid processing algorithm. A straightforward extension would be to extract the service routing functionality into a web service, thus providing a dedicated service routing intermediary.

Summary of the Design Methodology

1. The user connects to the broker and specifies their requirements for a service, whether to make a selection based on performance or price, and whether the broker should select the best service automatically.

2. The broker queries a list of potential services for bids; the end-to-end response time and communication/back-end service processing times are reported to the ARM data repository as the services respond.

3. The broker obtains from the repository historical performance information regarding the services and uses this to select a service, possibly with user involvement. In the current implementation only the end-to-end service response time is considered.

4. The broker commits the selected bid, the response to which is also measured and reported to the repository.

4.3 Results

This framework has been tested using Gourmet2Go and a number of experiments have been conducted that provide insight into routing algorithms and web service performance. To facilitate the analysis of the results, a number of enhancements were made to the Gourmet2Go simulation:

- A client application was written that interacts with the broker's HTTP-based web service interface using the same requests that a user-controlled browser would submit during a run-through of the Gourmet2Go demonstrator. This allows the demonstrator to run through a fixed number of iterations.

- The ‘getBid’ and ‘commitBid’ operations of the three grocery back-end web services were weighted with a random performance delay. Both the ‘Natural Bag’ (S1) and ‘Sammy’s’ (S2) services were equally weighted to provide (on average) the same processing response time, so as to represent the established service providers; the ‘Dee Dees’ (S3) service, representing a new service provider, was then weighted differently for each of the three simulations whose results are documented below. Each weighting incorporates a random element allowing a 40% performance increase or decrease from the mean. This random element models the possible range of response times induced by factors such as communication delay and load on the broker and back-end servers (due to the volume of requests, for example). For simplicity both ‘getBid’ and ‘commitBid’ are identically weighted by setting P = 1, and the duration of the experiment is considered to be too short to start expiring any measurements from the performance history.

Three sets of results are presented (see Figures 8, 9 and 10) from which conclusions are drawn as to how the routing algorithm responds to a newly published service when other available services already have an established performance-confidence rating. S1 and S2 are both published in the service registry at the start of each simulation. S3 is weighted at three different levels, ranging from a very high performance (in comparison to the other two services) to the same performance, and is published in each simulation after 20 iterations. For each simulation all web services and the HTTP client were executed on the same machine, with random communication delays modelled as stated above. (All results were obtained using a Pentium III 450 MHz with 256 MB RAM, running Red Hat Linux 7.1, kernel 2.4.17, IBM JVM 1.3.7, and the WSTK 2.4.2 run-time environment, which includes WebSphere Micro-Edition 4.)
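As a concrete illustration of the random weighting applied to the back-end operations, each simulated operation is delayed by its mean service time perturbed by up to plus or minus 40%. The sketch below shows one way to realise this; the method name and use of java.util.Random are illustrative, not the simulation's actual code.

import java.util.Random;

class DelayWeighting {
    // Returns a delay drawn uniformly from [0.6, 1.4) times the mean,
    // i.e. up to a 40% increase or decrease in response time.
    static long weightedDelayMillis(long meanMillis, Random rng) {
        double factor = 1.0 + (rng.nextDouble() * 0.8 - 0.4);
        return Math.round(meanMillis * factor);
    }
}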
The minimum number of iterations required before a service's confidence rating can rise above zero is set to 10 (although this figure can of course be tuned).

[Figures 8-10 plot weighted expected performance against the number of iterations for services S1, S2 and S3.]

Figure 8: The simulation results when S3 is published as a ‘high performance’ service in relation to S1 and S2. The results show that S1 is initially selected over S2 because it performed better during this period. It then continues to be used because of its increased confidence rating over S2; the broker picks a service that seems to perform well and keeps using it whilst it performs consistently. After 20 iterations the service S3 is published and provides ‘getBid’ response times even though it is not selected by the broker. Once available for selection, its superior performance provides a sharp increase in weighted expected performance, assuring its dominant selection for the remainder of the experiment.

Figure 9: The simulation results when S3 is published as a service with higher performance in relation to S1 and S2. The results are very similar to Figure 8, except that it takes longer for S3 to become the dominant service. Again, once selected, the confidence in service S3 increases and it continues to be used as the favoured service.

Figure 10: The simulation results when S3 is published with the same performance as S1 and S2. Unless the average response time for S3 improves over the other services, its confidence will never overtake that of S2, for which ‘commitBid’ as well as ‘getBid’ requests are being recorded. It is noted that S2 is the preferred service throughout the simulation due to its better performance during the first 10 iterations.

It follows from the results that a new web service would have to perform significantly better than the currently published services if the broker is to select this new service over services for which it has a large degree of confidence. The level at which this selection is made can be partly controlled by modifying the weights given to the confidence values (Wb and Wc), the minimum number of invocations required by the broker before a service is used (nm) and the factor by which a ‘commitBid’ operation is, on average, slower than a ‘getBid’ operation (P).

The results for the three simulations confirm that the routing algorithm is driving the selection of web services based on actual performance results. As the performance of the web services changes, so the routing of requests will change. The rate of change will once again depend on the confidence weightings set. While Wb, Wc, nm and P might initially be set by hand, tuning will probably be automated.
This can be based on the overall success of the system, judged for example by the ‘contract success rate’ of the subscribing customers, which can be calculated in real-time based on system monitoring and feedback. This is currently being investigated.

5 Conclusion

This paper demonstrates how the ARM standard and a novel transaction definition and instrumentation technique can be applied to the performance monitoring of web services. The process of transaction identification, mark-up and performance reporting provides insight into the process of applying existing performance tools and standards to web service based applications. The framework can be used as the basis for performance-based web service routing, which is illustrated by applying it to Gourmet2Go, a dynamic B2B application supplied with the IBM Web Services Toolkit. The resulting application is both self-monitoring and self-configuring; two essential properties for an autonomic computing system [IBM01a].

Future work includes the addition of QoS contracts aimed at providing different levels of service provision. This will allow routing decisions to be influenced by the service classes associated with each request. Load balancing and additional dynamic system properties (such as network outages and performance bottlenecks) will also be explored in the context of service routing. A predictive framework is also being developed that allows the performance of MPI-based e-science applications to be characterised and predicted prior to their execution; ARM is being used as a monitoring framework for automated characterisation refinement. Using such a framework to enhance the performance of e-business applications such as those described within this paper is currently being investigated.

6 Acknowledgments

The authors would like to express their gratitude to IBM's TJ Watson Research Center and Hursley Laboratories for their contributions towards this research, and in particular to Robert Berry for his valuable comments. The work is sponsored in part by the EPSRC e-Science Core Programme (contract no. GR/S03058/01), the NASA AMES Research Center administered by USARDSG (contract no. N68171-01-C-9012), the EPSRC (contract no. GR/R47424/01) and IBM UK Ltd.

7 References

[BALL01] K. Ballinger, P. Brittenham, A. Malhotra, W. Nagy and S. Pharies, “Web Services Inspection Language (WS-Inspection) 1.0”, November 2001. Available at http://www.ibm.com/developerworks/webservices/library/ws-wsilspec.html

[BOX00] D. Box, D. Ehnebuske, G. Kakivaya, A. Layman, N. Mendelsohn, H.F. Nielsen, S. Thatte and D. Winer, “Simple Object Access Protocol (SOAP) 1.1”, W3C Note, May 2000. Available at http://www.w3.org/TR/SOAP

[CAO00] J. Cao, D.J. Kerbyson, E. Papaefstathiou and G.R. Nudd, “Modeling of ASCI High Performance Applications using PACE”, 19th IEEE International Performance, Computing and Communication Conference, Phoenix, USA. 485-492 (2000).

[CAO01] J. Cao, D.J. Kerbyson and G.R. Nudd, “Performance Evaluation of an Agent-Based Resource Management Infrastructure for GRID Computing”, Proceedings of the 1st IEEE/ACM Int. Symposium on Cluster Computing and the Grid, Brisbane, Australia. 311-318 (2001).

[CAO02] J. Cao, D.P. Spooner, J.D. Turner, S.A. Jarvis, D.J. Kerbyson, S. Saini and G.R. Nudd, “Agent-based Resource Management for Grid Computing”, Invited paper at the 2nd IEEE/ACM Int. Symposium on Cluster Computing and the Grid, Berlin, May (2002).
[CHRI01] E. Christensen, F. Curbera, G. Meredith and S. Weerawarana, “Web Services Description Language (WSDL) 1.1”, W3C Note, March 2001. Available at http://www.w3.org/TR/wsdl

[CURB01a] F. Curbera, W. Nagy and S. Weerawarana, “Web Services: Why and How”, OOPSLA 2001 Workshop on Object-Oriented Web Services, Florida, USA, 2001.

[CURB01b] F. Curbera, N. Mukhi and S. Weerawarana, “On the Emergence of a Web Services Component Model”, 6th International Workshop on Component-Oriented Programming (ECOOP), Budapest, Hungary, 2001.

[DUFT01] M. Duftler, N. Mukhi, A. Slominski and S. Weerawarana, “Web Services Invocation Framework (WSIF)”, OOPSLA 2001 Workshop on Object-Oriented Web Services, Florida, USA, 2001.

[FOST98] I. Foster and C. Kesselman, “The Grid: Blueprint for a New Computing Infrastructure”, Morgan Kaufmann. 279-290 (1998).

[FOST01] I. Foster, C. Kesselman and S. Tuecke, “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, International J. Supercomputer Applications, 15(3), 2001.

[FOST02] I. Foster, C. Kesselman, J. Nick and S. Tuecke, “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, February 2002. Available at http://www.globus.org/research/papers/ogsa.pdf

[IBM01a] IBM Corporation, “Autonomic Computing Manifesto”, October 2001. Available at http://www.research.ibm.com/autonomic/

[IBM01b] IBM Corporation and Microsoft Corporation, “Web Services Framework”, W3C Web Services Workshop, San Jose, USA, April 2001. Available at http://w3.org/2001/03/WSWS-popa/

[IBM02] IBM Web Services Toolkit. Available at http://www.alphaworks.ibm.com/tech/webservicestoolkit

[JOHN00] M.W. Johnson and J. Crowe, “Measuring the Performance of ARM 3.0 for Java”, Proceedings of the CMG2000 International Conference, Orlando, USA, December 2000.

[KREG01] H. Kreger, “IBM Web Services Conceptual Architecture (WSCA 1.0)”, May 2001. Available at http://www.ibm.com/software/solutions/webservices/pdf/WSCA.pdf

[LEIN99] W. Leinberger and V. Kumar, “Information Power Grid: The New Frontier in Parallel Computing?”, IEEE Concurrency 7(4), (1999).

[NUDD00] G.R. Nudd, D.J. Kerbyson, E. Papaefstathiou, S.C. Perry, J.S. Harper and D.V. Wilcox, “PACE - A Toolset for the Performance Prediction of Parallel and Distributed Systems”, International Journal of High Performance Computing Applications, Special Issue on Performance Modelling. 14(3), 228-251 (2000).

[OPNG01] The Open Group, “Application Response Measurement (Issue 3.0 - Java Binding)”, Open Group Technical Specification, October 2001. Available at http://www.opengroup.org/management/arm.htm

[SPOO02] D.P. Spooner, J. Cao, J.D. Turner, H.N. Lin Choi Keung, S.A. Jarvis and G.R. Nudd, “Localised Workload Management Using Performance Prediction and QoS Contracts”, 18th Annual UK Performance Engineering Workshop (UKPEW2002), University of Glasgow, UK, July 2002.

[TURN01] J.D. Turner, D.P. Spooner, J. Cao, S.A. Jarvis, D.N. Dillenberger and G.R. Nudd, “A Transaction Definition Language for Application Response Measurement”, International Journal of Computer Resource Measurement, 105, 55-65 (2001).

[UDDI00] UDDI Project, “UDDI Technical White Paper”, September 2000. Available at http://www.uddi.org