GGF-17 Report

Omer Rana – GridNet ID: 107
o.f.rana@cs.cardiff.ac.uk
Cardiff School of Computer Science/Welsh eScience Centre
May 30, 2006

GGF-17 took place at the Tokyo International Forum in Japan and was co-located with the “GridWorld” event. The event attracted around 300 delegates (for GGF) and over 3,500 delegates (for GridWorld). Most of the presentations in the GridWorld event were in Japanese, with limited simultaneous translation available. It was, however, useful to see a variety of industry presentations – ranging from hardware/software vendors to some specialist end users. I attended the event from May 9 to May 13.

1 GRAAP – WS-Agreement

The sessions associated with this activity were primarily intended to analyze comments received on the current specification (released in September 2005). The comments are based on emails to the GRAAP mailing list and subsequent discussions in weekly telecons. Good progress is being made towards a more robust version of the specification. It was felt, however, that some of the comments were out of scope for the current specification (such as support for negotiation), and these requirements would be considered in a future version of the document. Most comments have now been taken into account, although some aspects need to be considered in more detail:

• The need to support the expiration/cancellation of an agreement. It was felt that this provision was necessary for bookkeeping purposes, and that simply associating a time period over which an agreement was valid was not enough. It was necessary to provide a more explicit mechanism for cancellation of an existing agreement. Supporting expiration was, however, not straightforward, as it may involve resolving existing reservations for resources – which may not be under the direct control of the agreement initiator. It was felt that additional intermediate agreement states may need to be introduced to achieve this. Discussion on this continues.
• The need to support aggregation of multiple agreements. In this case it was necessary to combine constraints that were part of multiple agreements and merge these constraints into a single composite agreement. It was generally felt that more examples were needed before this aspect could be pursued further.
• Guarantee terms in the agreement have not been modified – although some comments on this have been received. The comments primarily indicate that it is necessary to specify which service objectives have associated guarantee terms (not currently supported). Hence, in the current version of the specification, guarantee terms cannot be associated with particular objectives defined in the agreement. In the same context, it was also necessary to clarify the state of a guarantee term – i.e. whether it had been fulfilled, violated (and to what extent), etc.
• It was also felt necessary to state the relationship between an agreement template and the created agreement in a more straightforward manner.

Some of the comments received relate to the requirement for “dynamic Service Level Agreements”, and this formed the basis for two sessions. In particular, there had been feedback from participants in the European “OntoGrid” project that support for negotiation was needed within WS-Agreement. Most of these participants advocate an agent-based approach and feel that work undertaken within the multi-agent systems community must be taken on board. None of the OntoGrid participants were present at this event to discuss their particular viewpoints.

The remainder of the session focused on interoperability issues in WS-Agreement. For instance, how would interoperability constraints be defined in the context of Service Level Agreements, and in particular, what are the interoperability issues in the context of WS-Agreement?
Some argued that it was too early to consider interoperability, and that the current focus should be on defining a stable specification. Others felt that the specification should account for possible interoperability issues that might arise when the specification is subsequently implemented. It was felt that to support interaction between different implementations it was necessary to define, in a precise way, how a client from one application making use of WS-Agreement could talk to another. It was felt that experience from OGSA-DAI should be used to define mechanisms to support such interoperability.

Some of these issues would necessarily relate to application-specific data models – for instance, the terms within an agreement would be application specific, and interoperability would then require the two applications to share these terms. However, it was also necessary to define some generic, application-independent terms which all implementations of WS-Agreement must support.

To demonstrate interoperability, it was felt that multiple implementations of the same specification were necessary, so that it could be established whether interoperability between them is possible. It was also felt that advice from the “Grid Interoperability Now” (GIN) group should be sought for this.

It was generally agreed that an experience document would be useful at this stage, outlining how WS-Agreement is currently being used and what additional capability particular users need, and subsequently identifying what type of operations are necessary to support interoperability. It was also necessary to undertake tests against the specification – which could only be done once the specification had been implemented. An agreement server could be set up to allow members of the community to test their specification for interoperability. The next step was to provide implementation of the agreement once the specification had been finalized.
It was necessary (1) to identify who was involved in implementing the specification; (2) to make parts available for others to use; and (3) to make parts of the specification open source. In the same context, it was also necessary to identify “micro-specs” for additional domain-specific terms that could be used in the agreement.

1.1 GRAAP Session 2 – Dynamic Agreements

In the context of supporting dynamic agreements two talks were given: (1) “Dynamic SLAs” by the author of this report; (2) “Function-based WS-Agreement” by Viktor Yarmolenko (University of Manchester). The first talk discussed the requirements for establishing dynamic SLAs, which included:

• The need to modify an agreement that had already been established – especially if the agreement is used at a time much later than when it was defined. The requirement here relates to comparing the cost of establishing a new agreement with the cost of adapting an agreement that is already in place.
• The need to support flexibility in the agreement if an agreement initiator is not fully aware of the operating environment when the agreement is defined. In this case, the agreement initiator may not have enough information to determine what to ask for from a provider. This is likely to be the case when an agreement initiator or provider operates with imprecise knowledge about the other party involved in the agreement.

Based on these requirements, two types of agreement were defined:

1. Case 1: Static Agreement. In this case, it was necessary to identify Service Description Terms, Guarantee Terms, and Service Level Objectives (SLOs). Both Guarantee Terms and SLOs were to be precisely defined at agreement creation time.
2. Case 2: Dynamic Agreement. In this case, it was necessary to identify Service Description Terms, Guarantee Terms which were now defined as ranges or as functions, and Service Level Objectives which were also defined as ranges or as functions.

The use of range-based or function-based agreements provided a useful basis for supporting dynamicity in the agreement. Examples from the European FP6 “CATNETS” project were used to demonstrate how dynamic agreements could be specified and used for developing a Grid resource and service market [1].

The second talk focused on the ability to specify agreements as functions – where options are expressed as a set of variables. Variables could include terms such as the start time and end time of a particular job. In this instance, a client would send a function-based agreement to a provider – who would evaluate the function locally and return to the client the type of resources that it had the ability to provide (at the time the agreement was to become valid). The aim here was to minimize the number of re-negotiations necessary to reach consensus on the values associated with agreement terms. The presenter discussed a case study that demonstrated how the approach could be used in practice [2]. Currently, the focus within this work was on specifying guarantee terms as functions, and the presenter compared the use of binary and fuzzy functions. A key advantage of this approach was that terms whose values were not available at the time of agreement creation could still be included in the agreement. This would therefore allow greater flexibility in the way that an agreement was defined. WS-Agreement in its current form could be used to support this function-based approach, although it was necessary to identify how the functions would be described using XML.

It was agreed that a workshop would be organized a day before the next GGF in Washington to discuss: (1) common application-independent terms that may be used within an agreement, thereby leading to re-usable implementations; (2) implementations of the WS-Agreement specification that are currently available.
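
The function-based exchange described in the second talk can be illustrated with a small sketch. It is not the presenter's implementation and does not use WS-Agreement syntax: the names `Offer`, `binary_guarantee`, `fuzzy_guarantee`, `cpus_required` and `evaluate_locally`, the 48 CPU-hour workload, and the candidate time windows are all hypothetical, chosen only to show a provider evaluating client-supplied functions locally and contrasting a binary with a fuzzy guarantee term.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Offer:
    """What the provider says it can deliver (hypothetical structure)."""
    start: int  # hours from now
    end: int    # hours from now
    cpus: int


def binary_guarantee(deadline: int) -> Callable[[int], float]:
    """Binary function: 1.0 if the job ends by the deadline, otherwise 0.0."""
    return lambda end: 1.0 if end <= deadline else 0.0


def fuzzy_guarantee(deadline: int, tolerance: int) -> Callable[[int], float]:
    """Fuzzy function: satisfaction falls linearly to 0.0 over `tolerance` hours."""
    def satisfaction(end: int) -> float:
        if end <= deadline:
            return 1.0
        return max(0.0, 1.0 - (end - deadline) / tolerance)
    return satisfaction


def cpus_required(start: int, end: int, work: int = 48) -> int:
    """Client-side function: CPUs needed to finish `work` CPU-hours in the window.

    Assumes end > start.
    """
    return -(-work // (end - start))  # ceiling division


def evaluate_locally(
    windows: List[Tuple[int, int, int]],
    cpus_needed: Callable[[int, int], int],
    guarantee: Callable[[int], float],
) -> Tuple[Optional[Offer], float]:
    """Provider side: evaluate the client's functions for each (start, end, free_cpus)
    window and return the feasible offer with the highest satisfaction, if any."""
    best, best_score = None, 0.0
    for start, end, free in windows:
        need = cpus_needed(start, end)
        score = guarantee(end)
        if need <= free and score > best_score:
            best, best_score = Offer(start, end, need), score
    return best, best_score


# Invented provider availability: (start, end, free CPUs)
WINDOWS = [(0, 4, 8), (2, 10, 8), (4, 16, 4)]

strict = evaluate_locally(WINDOWS, cpus_required, binary_guarantee(deadline=8))
relaxed = evaluate_locally(WINDOWS, cpus_required, fuzzy_guarantee(deadline=8, tolerance=4))
print(strict)
print(relaxed)
```

In this sketch the binary guarantee yields no acceptable offer, whereas the fuzzy guarantee lets the provider return a partially satisfying offer in a single exchange, which is the kind of reduction in re-negotiation the talk aimed for.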

2 Job Submission Description Language (JSDL)

The session on JSDL primarily focused on extensions being proposed to the specification to enable the description of parallel jobs (with particular focus on MPI). Two efforts in this area were presented, one from Imperial College (UK) based on the GridSAM project and the other from the Japanese Grid initiative (as part of NAREGI).

JSDL extensions in the NAREGI project were based on two types of job requirements: (i) single MPI executables, which would make use of “worker side” JSDL; and (ii) multiple executables running on multiple systems at multiple sites. In the first case, it was enough to provide a reference to a single MPI executable, whereas in the second case it was necessary to relate multiple MPI executables. Based on these job types, the following types of JSDL extensions were proposed:

• JSDL submission and JSDL execution;
• JSDL abstract and JSDL concrete;
• Wrapper for different JSDL documents – primarily by providing extensions for MPI jobs.

Consequently, researchers in the NAREGI project have added ComplexJobInstance, JobInstance, and AssociateJobID. Other MPI-specific extensions include MPIType, MPITasks, TasksPerHost, HelperCommand, etc. For instance, MPITasks defines how many tasks run on a host. Some of these extensions are therefore aimed at capturing the command line arguments that are passed when executing an MPI job.

The GridSAM project also proposed MPI-based extensions to JSDL and focused on identifying the minimal set of terms that could be used across all MPI versions. The GridSAM extensions were particularly focused on supporting job submission within the UK National Grid Service (NGS), which made use of a Globus-based submission interface. It was therefore necessary that the absolute minimum set of assumptions be made about the types of jobs being submitted. The work extended “POSIX” elements with additional terms defined in the GridSAM project.
It was also clarified that GridSAM does not execute jobs, but submits them to the appropriate system, which is then responsible for their execution.

During discussion in this session it became clear that vendors, such as IBM, had their own internal developments taking place with reference to JSDL, and that such vendors were extending the JSDL specification internally. One example presented was the use of terms associated with the Tivoli workload scheduler – a product internal to IBM. One reason cited for these extensions was the lack of expressiveness in the existing JSDL specification; it was outlined that IBM's requirements were not being met by the existing specification, as it was too coarse grained for their internal use. However, IBM was clearly interested in participating in the JSDL group within GGF, and in making contributions based on their use of this specification. It was also made clear that JSDL primarily provides a basis for job execution, and was not intended to provide a programming model.

As various JSDL extensions were being proposed, it was recognized that using very specific terms in JSDL may be too restrictive – thereby leading to incompatibility between different extensions. It was felt, therefore, that a more open symmetric matching scheme should be employed, allowing developers to add more complex attributes if necessary. It was noted that Condor ClassAds primarily provides a set of “conventions”, which can however be extended in arbitrary ways. In this context, the relationship between JSDL and Condor ClassAds was discussed, in addition to similarities with other projects at Boston/Harvard. These projects are focused on allowing the specification of arbitrary attributes. It was therefore necessary that some “asymmetric” matching scheme be employed that enabled a resource to advertise its capabilities using a particular set of terms.
Subsequently, a job would define its own requirements using terms that may not be identical to those used for defining the capabilities of a resource. This would therefore allow resource providers to focus on the capabilities of their own resources, and application developers to focus on their own requirements. Hence, a match between task requirements and resource capabilities would be incidental rather than planned. It was felt that such a match making scheme would lead to greater use, and at the same time avoid the need for everyone to use the same set of terms.

In general it was agreed that more discussion was necessary on how JSDL descriptions could map into a more open framework. Furthermore, the resource requirements section of JSDL should act primarily as a placeholder that could be extended by developers as necessary. Hence, the intention would be not to put the extensibility in the JSDL schema, but to keep it outside. It was also generally agreed that the Common Information Model (CIM) provided a useful basis for adding terms to JSDL, although additional work was necessary to fully understand how terms in CIM could be deployed in real scenarios.

The JSDL specification provides a useful basis for defining terms within a Service Level Agreement (such as WS-Agreement). Any updates being proposed to this specification will therefore increase the expressiveness of a WS-Agreement. The flexibility being proposed could benefit service providers/users, but would lead to greater complexity when used as part of a Service Level Agreement – as it would now be more difficult to evaluate whether an agreement had been violated by a provider.

3 Information Model

The Information Model session focused on the extension of terms in JSDL, to enable better match making between a client request and a provider advertisement. A position paper by E. J. Stokes (from IBM) [3] formed the basis for discussion in this session.
The discussion focused on the requirements for providing an information model for resources – which included the ability:

• to manage resource information in the system;
• to advertise the capabilities of resources in the system;
• to express a set of requirements needed by a job that needs resources to run.

It was necessary to determine the granularity at which such information was to be provided. Too much detail would make the information model complex to use, and too little detail may make the model unusable – leading vendors to extend the model in ways that would constrain interoperability. The focus of the discussion in this session was on the type of XML syntax (primarily name-value pairs, where values could be literals or ranges) that could be used to specify the information model. Examples of terms that could be included are:

    <node>
      <name>computerA.acme.com</name>
      <processor>
        <type>Intel</type>
        <CPUspeed>3200</CPUspeed>
      </processor>
      <OS>
        <type>Linux</type>
        <physicalMemory>3000000</physicalMemory>
        <virtualMemory>12000000</virtualMemory>
        <maxProcessesPerUser>32</maxProcessesPerUser>
      </OS>
    </node>

advertising the capability of resource computerA. Similarly, the requirement for some job activity1 would be expressed as:

    <activity>
      <name>activity1</name>
      <processor>
        <type>Intel</type>
        <CPUCount>&lt;=2</CPUCount>
        <CPUspeed>&gt;3000</CPUspeed>
      </processor>
      <OS>
        <type>WindowsXP</type>
      </OS>
    </activity>

Overall, work on the information model provides a good starting point for evaluating more useful match making techniques. The use of ranges in XML – as advocated in this position paper – is possibly too restrictive for a real application. There has been significant work in the match making/service discovery community on using more complex description techniques (based on RDF or OWL) that are better suited to encoding such resource capabilities or task requirements. Clearly, a closer collaboration is needed between the OGSA-WG and the Semantic Grid-RG.
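
A minimal sketch of how the advertisement and requirement examples above might be compared is given below. It is an illustration only and is not taken from the position paper: the functions `leaves`, `satisfies` and `unmet` are hypothetical, the advertisement is abbreviated, and treating a term that is absent from the advertisement as unmet is an assumption. Note that a standard XML parser requires `<` inside element text to be escaped as `&lt;`.

```python
import operator
import re
import xml.etree.ElementTree as ET

# Abbreviated copy of the resource advertisement.
NODE_AD = """
<node>
  <name>computerA.acme.com</name>
  <processor><type>Intel</type><CPUspeed>3200</CPUspeed></processor>
  <OS><type>Linux</type><physicalMemory>3000000</physicalMemory></OS>
</node>
"""

# Job requirement; comparison operators are escaped so the text is well-formed XML.
ACTIVITY = """
<activity>
  <name>activity1</name>
  <processor>
    <type>Intel</type>
    <CPUCount>&lt;=2</CPUCount>
    <CPUspeed>&gt;3000</CPUspeed>
  </processor>
  <OS><type>WindowsXP</type></OS>
</activity>
"""

_RANGE = re.compile(r"^(<=|>=|<|>)\s*(\d+(?:\.\d+)?)$")
_OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}


def leaves(root):
    """Map 'parent/child' paths to text for every leaf element, skipping <name>."""
    found = {}

    def walk(elem, prefix):
        for child in elem:
            path = f"{prefix}/{child.tag}" if prefix else child.tag
            if len(child):
                walk(child, path)
            elif path != "name":
                found[path] = (child.text or "").strip()

    walk(root, "")
    return found


def satisfies(requirement, advertised):
    """Literal requirements must match exactly; range requirements compare numerically."""
    m = _RANGE.match(requirement)
    if m is None:
        return requirement == advertised
    try:
        value = float(advertised)
    except ValueError:
        return False
    return _OPS[m.group(1)](value, float(m.group(2)))


def unmet(activity_xml, node_xml):
    """Return the requirement paths that the advertisement does not satisfy."""
    required = leaves(ET.fromstring(activity_xml))
    offered = leaves(ET.fromstring(node_xml))
    return sorted(
        path for path, req in required.items()
        if path not in offered or not satisfies(req, offered[path])
    )


print(unmet(ACTIVITY, NODE_AD))
```

Because the comparison relies on both documents using identical element names, it also shows why an agreed set of terms matters: a requirement on CPUCount cannot be checked against an advertisement that never mentions it.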

The information model being developed in this WG also has implications for WS-Agreement – as one aim of work on Service Level Agreements (SLAs) is to validate the provision of a particular resource capability that has been defined in the SLA. The set of terms that are agreed upon in the information model, and subsequently the mechanisms to define constraints on these terms, will affect the types of applications in which WS-Agreement may be deployed.

4 Certificate Authorities

The International Grid Trust Federation (IGTF) continues to focus on developing specialist certificate authorities (CAs) across the world, to enable Grid users to be authenticated in a uniform manner, and to enable resource sharing across multiple administrative domains. Their aim in this session was to highlight work that had already taken place within various member organisations – such as the European GridPMA, work in the Asia-Pacific region, etc.

The discussion focused on identifying the structure of a CA – for instance, should it be hierarchical, or should it be developed as a “bridge”? Further discussion focused on identifying what the architecture for this should be. A related item was to identify the “policy-OIDs” that could be associated with a particular CA, and what these should be. It was also felt that there was a need to verify the quality of “identity tokens” that had been generated by middleware.

One discussion point focused on the need to build support for Policy-OIDs into existing middleware. This would involve a request for namespace constraints, and requesting and deciding on Policy-OIDs. Open questions were to which middleware providers this request should be directed, and what level of complexity should be provided. A requirement was also identified for externally defined namespace constraints – so that relying parties can uniquely assign namespaces for subject identifiers to specific issuing authorities.
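
The idea of externally defined namespace constraints can be illustrated as follows. This is a sketch under stated assumptions and does not reproduce any IGTF file format or middleware interface: the issuer names, subject names and wildcard patterns are invented, and the functions `permitted` and `issuers_for` are hypothetical. It shows a relying party checking that a subject identifier falls within the namespace assigned to its issuing authority, and that the assignment is unique.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Hypothetical issuing authorities and the subject namespaces assigned to them.
NAMESPACES: Dict[str, List[str]] = {
    "/C=XX/O=ExampleGrid/CN=Example CA 1": ["/C=XX/O=ExampleGrid/OU=People/*"],
    "/C=YY/O=SampleGrid/CN=Sample CA": ["/C=YY/O=SampleGrid/*"],
}


def permitted(issuer: str, subject: str,
              namespaces: Dict[str, List[str]] = NAMESPACES) -> bool:
    """True if `subject` lies inside a namespace assigned to `issuer`."""
    return any(fnmatchcase(subject, pattern)
               for pattern in namespaces.get(issuer, []))


def issuers_for(subject: str,
                namespaces: Dict[str, List[str]] = NAMESPACES) -> List[str]:
    """All issuers allowed to name `subject`; a unique assignment yields at most one."""
    return sorted(issuer for issuer in namespaces
                  if permitted(issuer, subject, namespaces))


alice = "/C=XX/O=ExampleGrid/OU=People/CN=Alice"
print(permitted("/C=XX/O=ExampleGrid/CN=Example CA 1", alice))
print(permitted("/C=YY/O=SampleGrid/CN=Sample CA", alice))
print(issuers_for(alice))
```

Keeping the table outside the middleware, as in this sketch, is one way of reading “externally defined”: the relying party can update the constraints without changing the code that enforces them.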

5 GridWorld

GridWorld had over 3,500 registered participants, making it possibly one of the largest Grid computing events I have participated in. The participants were primarily from Japan, and the presentations reflected this demographic. Much of the discussion in GridWorld focused on Web Services and requirements for application deployment over distributed infrastructure.

6 Future Actions

Based on this participation, I shall be involved in the following future activities:

• Provide comments on the Information Model for OGSA-WG.
• Contribute to the dynamic WS-Agreement use case document, based on CATNETS and other SLA-based research at Cardiff.
• Contribute to the GRAAP workshop hosted alongside GGF-18. I also contributed to drafting the Call for Papers for this workshop.
• Use of JSDL for match making – and discussion of extensions to JSDL.
• Discussions at the GRAAP and JSDL sessions formed the basis for material in the “Grid Workflow” tutorial at the IEEE CCGrid 2006 conference, and for discussions at the UK-Singapore collaboration session at the GridAsia 2006 conference.

References

[1] Liviu Joita, Omer Rana, Oscar Ardaiz, Pablo Chacin, Isaac Chao, Felix Freitag, Leandro Navarro, “Application Deployment using Catallactic Grid Middleware”, Third International Workshop on Middleware for Grid Computing, ACM/IFIP/USENIX International Middleware Conference, November 28–December 2, Grenoble, France, 2005.
[2] Rizos Sakellariou and Viktor Yarmolenko, “On the Flexibility of WS-Agreement for Job Submission”, Third International Workshop on Middleware for Grid Computing, ACM/IFIP/USENIX International Middleware Conference, November 28–December 2, Grenoble, France, 2005.
[3] E. J. Stokes, “Information Modeling in OGSA Position Paper”, Open Grid Services Architecture WG, May 11, 2006.