OGF 22 February 25-29 2008, Cambridge, MA


Attendees: A Stephen McGough, Vesselin Novov

Main groups of focus: OGSA-RUS, SAGA, JSDL, WFM-RG, AIDS, HPCP, OGSA Info, OGSA EMS, Cloud Computing, GSA, GRAAP, AuthZ

With effort within the OGSA-RUS group reinvigorated, new documents are now coming out of the group. Implementations of this work are in progress and interactions with new standards are emerging. As we wrote the OMII-RUS implementation, this is a prime area for us to work in, and we are feeding our experiences into this effort.

The SAGA group is defining an API for accessing the Grid. This is of interest to us as one of the first areas for which they are developing their API is job execution. This is not seen as a competitor to JSDL and BES but rather as a layer which can be built on top of these specifications. Coders could use the SAGA API to define jobs for execution, and the implementation of SAGA would then use JSDL and BES to execute the job. This is significant to us as GridSAM has an API for developing JSDL documents and submitting them, which could be extended to include the SAGA API.
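The layering described above can be illustrated with a minimal sketch. It assumes a hypothetical `JobDescription` class and `to_jsdl`/`submit` helpers; these are not the SAGA API or the GridSAM API, and the BES call is represented only by a function supplied by the caller. Only the JSDL 1.0 element names and namespaces are taken from the published schema.

```python
import xml.etree.ElementTree as ET

# JSDL 1.0 namespaces (core schema and the POSIX application extension).
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


class JobDescription:
    """Hypothetical API-level job description (illustrative, not real SAGA)."""

    def __init__(self, executable, arguments=()):
        self.executable = executable
        self.arguments = list(arguments)


def to_jsdl(job):
    """Translate the API-level description into a JSDL document string."""
    root = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    desc = ET.SubElement(root, f"{{{JSDL_NS}}}JobDescription")
    app = ET.SubElement(desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = job.executable
    for arg in job.arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg
    return ET.tostring(root, encoding="unicode")


def submit(job, bes_submit):
    """Build the JSDL document and pass it to a caller-supplied BES submitter."""
    return bes_submit(to_jsdl(job))
```

The point of the sketch is that the user only touches the job description object, while the JSDL document and the submission call stay behind the API.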

For the JSDL group we organised two main sessions along with a BoF session for the Activity Instance Document Schema (see below). The JSDL 1.0 Errata document has now completed public comment and is awaiting evaluation by GFSG for final publication. This is significant as it is the culmination of over twelve groups' evaluation of the schema through implementation, highlighting major uptake within the community. During the general session we discussed the work done previously on the parameter sweep extension. This was proposed around one year ago, though efforts on the HPC Basic Profile (HPCBP) and the Errata document had caused this work to be suspended. Through interest in uptake, mainly from GridSAM in the UK community, this work has been revived and presented again. The document is effectively complete and received positive feedback from those attending the session. The intention is now to tidy up the document and look to submitting it to GFSG.

In the second session we discussed the Application Instance Document Schema further. For this we were able to present more information on how we are using instance information within GridSAM. It was suggested that the scope of the work was too large, though it was then clarified that this would cover only where to put the data and not what the data was (not even profiling the work of other standards). The consensus was to continue this work within the JSDL group by scoping out exactly what was needed through use case gathering, and then to evaluate how to take it further.

As part of our role within the OGSA design team for Workflow, Stephen McGough attended the WFM-RG session. The WFM group is focusing on interoperability in workflow and is conducting a survey of existing workflow systems and use cases. It was noted that there is already a large amount of effort outside the OGF on workflow standardisation, and the intention here is to survey this before moving forward. This is of interest both from the OGSA workflow point of view and from our efforts with ICENI II and GRIDCC.

The Application Instance Document Schema BoF allowed us to present our instance extensions to the JSDL specification within GridSAM. The intention of this group would be to extend (probably outside of) JSDL to encompass all the other information about a job which is not part of the job's submission description. This would include information about scheduling, resource usage and the security credentials used. The information here would be dynamic and may not all be located in the same document, but may just refer to other documents. It was not clear at the end of the BoF how the work should continue. However, as we had another session (JSDL, see above), these issues were resolved there.

After the second successful HPC Profile interopfest at SuperComputing'07, the HPCBP group held a session on specification adoption. This session outlined the history of HPCBP, through the work of JSDL, BES and finally HPCBP. This clearly shows the significance of the work we have carried out through our efforts in JSDL and BES and the uptake within the community, as the presentations highlighted both industrial implementations of these specifications and those from research groups such as our own.

HPCP-WG continued improving the specifications of the extensions to the basic profile that were demonstrated. Since the integrated HPCP support in GridSAM is one of the early implementations of the profile, we have participated regularly in the WG's weekly conference calls as well as all OGF working sessions. The three extensions that underwent revisions were the Advanced Filter Profile (AFP), the Application Template (AT) and the File Staging Profile (FSP). At the OGF22 group session the discussion focused on the latest state of these documents and a review of the work completed in the months after SC07.

Advanced Filter Profile (AFP) - the purpose of the profile is to define functionality for scoping the data returned by calls to GetFactoryAttributesDocument(). The current definition of the returned document allows for basic filtering by simple inclusion/exclusion of given activities and resources. The proposed Advanced Filter (for activities) expands the basic filtering by allowing for scoping based on UserName, Owner, State, etc. An HPCP implementation MAY decline to support the AFP, but it MUST then return UnsupportedFeatureFault. The proposed Advanced Filter (for resources) expands the basic filtering by returning available resources in per-resource-type records rather than per-resource-instance records. A particular issue was raised about the use of String-based jobIDs in the data returned by the AFP calls. The jobID is returned inside EPRs, and the EPRs are opaque to the user (the jobID value format as well as its placement within the EPR are service specific).

Application Template (AT) - the purpose of the template is to simplify job submission with JSDL documents, i.e. the user would not need to specify every detail (executable, parameters, etc.) when submitting a job. However, there needs to be a way of discovering the ATs available at a given HPCP service endpoint. The currently proposed way is to use an ApplicationTemplates element in the document returned by GetFactoryAttributesDocument, containing multiple JSDLApplication elements providing the ApplicationName, ApplicationVersion and Description values. A point for further discussion is the fact that, as proposed, the ATs' names are not portable among HPCBP endpoints. To use an AT, a client would provide a jsdl:ApplicationName element (and optionally jsdl:ApplicationVersion); it may specify jsdl:JobIdentification and jsdl:DataStaging elements, while jsdl:Resources elements are overridden by the AT. The precedence of HPCApplication values when the HPCP endpoint applies an AT to a submitted JSDL document is predefined: some of the values will be replaced by values defined in the AT, and others will be added. Another potential problem that the group would need to address is the 'silent' replacement of JSDL values supplied by the user which clash with values defined in the ATs. A possible solution would be a set of Faults used to notify the user of exactly what is and is not accepted.

File Staging Profile (FSP) - this profile was one of the first extensions to the basic HPC profile and, given the fair amount of work that the WG has put into it, has reached a more mature stage. The profile was moved to public comment period from 4 until 11 Feb'08 but this period was extended.

The profile as it stands now is based on the JSDL v1.0 DataStaging element with the addition of a 'Credential' element, the specification of a narrow set of supported protocols, a discovery mechanism through the BESExtension element, and file staging failure exceptions. Along with the review of the three HPC profile extensions, much of the session discussion focused on the need for support of Kerberos authentication within the Credential element of the File Staging profile. This particular requirement was coming from users of vendors implementing the HPCP. There seemed to be a large user base using Kerberos as an authN mechanism, single-sign-on services built on top of Kerberos, and/or a large number of batch systems relying on Kerberos credentials. In addition to authN of services, the delegation of access rights was seen as an issue to be addressed. The Kerberos authN mechanism is simple and well understood; however, it is usually non-trivial to implement and program for.

One of the latest developments within the WG would likely affect the current plans for the next GridSAM release. There was a proposal from the GenesisII team to integrate the task of application deployment into the functionality of a BES endpoint. It had been the team's experience that the use of the jsdl:DataStaging element for staging executables and configuration files is inefficient and doesn't scale. The team's evaluation of CDDLM and some other similar deployment mechanisms found them to be too complex. The evaluation process found that most of the use cases involve transferring and unpacking zipped binary executables and/or configuration files. The proposed solution would introduce a new deployment descriptor element in the JSDL document and let the BES endpoint handle this task. GridSAM has supported the staging of input/output files as part of processing a computational job since its inception. Should this new proposal be accepted and officially become part of the JSDL standard, GridSAM is in a position to become one of the first BES endpoints to provide such application deployment functionality.
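As a rough illustration of the common use case described above (transferring and unpacking a zipped bundle of executables and configuration files), the following minimal sketch shows what the endpoint-side step might look like. The function name `deploy_archive` and its two parameters are hypothetical; the report does not define the deployment descriptor, so they simply stand in for whatever such a descriptor would carry. The sketch needs Python 3.9 or later for `Path.is_relative_to`.

```python
import zipfile
from pathlib import Path


def deploy_archive(archive_path, target_dir):
    """Unpack a zipped application bundle into target_dir.

    Hypothetical helper: returns the paths of the files that were deployed.
    """
    target = Path(target_dir).resolve()
    target.mkdir(parents=True, exist_ok=True)
    deployed = []
    with zipfile.ZipFile(archive_path) as bundle:
        for member in bundle.infolist():
            destination = (target / member.filename).resolve()
            # Refuse entries that would land outside the deployment directory.
            if not destination.is_relative_to(target):
                raise ValueError(f"unsafe path in archive: {member.filename}")
            bundle.extract(member, target)
            if not member.is_dir():
                deployed.append(destination)
    return deployed
```

Handling this once at the endpoint, rather than staging each file individually through jsdl:DataStaging, is the simplification the proposal is aiming for.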

The OGSA Info model is now close to submission and contains significant input from the JSDL group, both in terms of the new paradigm for matching resources and jobs and in the development of XQuery interactions on the documents.

The OGSA EMS specification was further discussed. CDDLM is now being removed from the specification where appropriate, and file staging and resource selection are being extended. This fits in nicely with the work we are doing with GridSAM and GridBS, and we are participating in this effort with our experiences from this work.

An open BoF session was held on Cloud Computing and how this fits into the OGF model. There were a number of presentations about different Cloud implementations and how these are partly Grids and partly something at a higher level. The general consensus from the session was that Cloud computing sits at a higher level above the Grid: more the interface than the “how it works”. The feeling was that Cloud computing was relevant for OGF, though at present this was not an area that OGF could standardise. The work we have been doing on JSDL, BES and with GridSAM, however, seems well in line with the Cloud model, as we provide an interface for submission which could be used with the Cloud.

The Grid Scheduling Architecture group is working towards interoperability in and between scheduling services on the Grid. They seek to take JSDL documents, along with Scheduling documents, to select the appropriate BES instance to deploy a job to. If the scheduler cannot find a suitable BES instance, it can communicate with other Schedulers to find a resource. Through our work on GridSAM and GridBS we have been able to contribute our experience of these scenarios back to the group.
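The selection-then-delegation behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `BesInstance` and `Scheduler` classes are hypothetical, the job requirement is reduced to a single CPU count rather than a full JSDL document, and peer schedulers are queried by a direct method call rather than any GSA-defined protocol.

```python
from dataclasses import dataclass, field


@dataclass
class BesInstance:
    """Hypothetical stand-in for a BES endpoint and its free capacity."""
    name: str
    free_cpus: int


@dataclass
class Scheduler:
    """Hypothetical scheduler that knows some BES instances and some peers."""
    name: str
    instances: list = field(default_factory=list)
    peers: list = field(default_factory=list)

    def select(self, cpus_needed, _visited=None):
        """Return a suitable BES instance, asking peer schedulers if needed."""
        visited = _visited if _visited is not None else set()
        if self.name in visited:
            return None  # already asked; avoids loops between peers
        visited.add(self.name)
        # First try the BES instances this scheduler manages directly.
        for inst in self.instances:
            if inst.free_cpus >= cpus_needed:
                return inst
        # Otherwise delegate the request to the other schedulers.
        for peer in self.peers:
            found = peer.select(cpus_needed, visited)
            if found is not None:
                return found
        return None
```

The visited set matters once schedulers reference each other, since two peers would otherwise forward an unsatisfiable request back and forth indefinitely.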

The GRAAP group now has two implementations of services for submitting jobs through WS-Agreement which are able to interoperate. There was, however, much concern during their session that what they had implemented was far more complex than merely implementing an interoperation for WS-Agreement. Their implementations both relied on JSDL among other specifications. Although this is good from our point of view, as they are effectively providing another two implementations and showing interoperation of JSDL for us, there is concern that this will not help others wishing to implement WS-Agreement for other use cases.

Due to our ongoing development of the OMII Authorization project we have been closely following the AuthZ WG's and other Grid security related activities in the last few OGFs. Our design is based in large part on the proposed OGSA AuthZ architecture. The latest changes and the current state of this architecture were reviewed, as well as the recommendation to remove the OGSA acronym from the official specification title. The architecture and its document are deemed to be in a completed state and ready to be moved to the public comment phase.

Another document to review was the proposed XACML profile. The profile is to be stripped of some currently mandatory attributes passed between the Policy Enforcement Point (PEP) and the Policy Decision Point (PDP). The attributes are to be moved to the appendix and made optional. So far only one team has completed a reference implementation and published experience and test results. Support for XACML has been considered as part of our OMII Authorization development effort and it is quite likely that such support will be introduced in future releases.

The SAML profile was one more document the group discussed. Particular attention was paid to definitions for attributes to express VO-related data such as cross-organizational groups and roles. However, an issue was raised that this particular profile/document type fell outside the charter of the AuthZ WG. Since there has been an administrative decision to close this WG, the document's future remained uncertain. One proposed solution was for interested parties to submit a new charter document along with the proposed profile. If, however, this approach were to be taken, there would have to be careful consideration of the amount of work needed to bring the profile to a final state.

On the issue of closing the WG, the timely submission of the AuthZ architecture, XACML and WS-Trust documents for the public comment period should allow enough time for any comments and suggestions to be addressed and any relevant changes incorporated into the final official versions. The tentative deadline for this should be the OGF in Barcelona, where the last WG meeting was scheduled. Any outstanding concerns raised afterwards were to be resolved by following the relevant OGF mechanism in practice.
