
WS-JDML: A Web Service Interface for Job Submission and Monitoring
Stephen McGough
William Lee
London e-Science Centre
Department of Computing, Imperial College London
What Services do we need to make the Grid Work?
• One of the key services required is job
submission
– The ability to transparently submit a job to a
resource (potentially through a DRM) where it
will run
• Many DRM systems exist (Condor, Globus, SGE etc…)
– Each has its own way to define a job (language)
– Each has its own submission mechanism (command line, API, Service)
The Problem
• Submitting jobs requires
– Knowledge of the job definition procedure
– The ability to interface with the appropriate DRM
• The Solution
– One common job description language that can be used with all resources (e.g. RSL)
– A generic submission system for jobs
• Using community-based standards that are in common use
Generic Job Submission
[Figure: diagram labelled JDML, Web Services, WS-JDML]
Web Service
• We are using a plain “Vanilla” Web Service
– Don’t rely on any proposed WS standards
– Don’t need anything more than core standards for
this simple service
• Developed in Java
• Our work has been deployed into the J2EE
enterprise platform
– This enables
• Scalability
• Fault tolerance
Job Description Markup Language (JDML)
• Originally developed from Condor ClassAds
• Developed for the European DataGrid project
• Used within the Imperial College ICENI
project
• This work is now feeding into the Global Grid
Forum Job Submission Description Language
standardisation work
• JDML will morph to become JSDL
JDML (2)
• JDML documents are written in sections
– What job to run
– The environment to run the job in
– Where to get files from
– Where to send files to at the end
• JDML is strongly typed
• Consists of name/value pairs
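The section layout and typed name/value pairs can be sketched as a small in-memory model. This is an illustration only: the slides do not show JDML syntax, so the class `JdmlSketch`, the section constants, and term names such as `executable` and `cpuCount` are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of a JDML document; not the real JDML schema.
class JdmlSketch {
    // One constant per section listed on the slide.
    enum Section { JOB, ENVIRONMENT, INPUT_FILES, OUTPUT_FILES }

    // The value of a name/value pair together with its declared type.
    static final class Term {
        final Class<?> type;
        final Object value;

        Term(Class<?> type, Object value) {
            // Strong typing: reject a value that does not match its declared type.
            if (!type.isInstance(value)) {
                throw new IllegalArgumentException("expected " + type.getSimpleName());
            }
            this.type = type;
            this.value = value;
        }
    }

    private final Map<Section, Map<String, Term>> sections = new LinkedHashMap<>();

    // Adds a typed name/value pair to a section and returns this document.
    <T> JdmlSketch put(Section section, String name, Class<T> type, T value) {
        sections.computeIfAbsent(section, s -> new LinkedHashMap<>())
                .put(name, new Term(type, value));
        return this;
    }

    // Returns the value of a term, or null if the term is absent.
    <T> T get(Section section, String name, Class<T> type) {
        Term term = sections
                .getOrDefault(section, Collections.<String, Term>emptyMap())
                .get(name);
        return term == null ? null : type.cast(term.value);
    }

    // Example document with one illustrative term per section.
    static JdmlSketch example() {
        return new JdmlSketch()
                .put(Section.JOB, "executable", String.class, "/bin/hostname")
                .put(Section.ENVIRONMENT, "cpuCount", Integer.class, 1)
                .put(Section.INPUT_FILES, "input", String.class, "http://host.example/in.dat")
                .put(Section.OUTPUT_FILES, "output", String.class, "http://host.example/out.dat");
    }
}
```

Keeping the declared type next to each value lets a consumer detect a wrongly typed term before any translation to a DRM takes place.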
JDML (3)
• Can have DRM specific sections
– It must be safe to ignore this section and still have the job work correctly
– Seen as a set of hints to the DRM
• File transfer is defined for multiple protocols
– GridFTP, HTTP, copy etc…
– Each file may have several of these definitions
• DRM can select the appropriate ones to use
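How a DRM might choose among several transfer definitions for one file can be sketched as follows. The names `TransferSelection`, `Protocol`, and `Source` are hypothetical, and the rule of taking the first listed definition that the DRM supports is an assumption; the slides only say that the DRM selects the appropriate ones.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; names are illustrative and not taken from JDML.
class TransferSelection {
    // The protocols named on the slide.
    enum Protocol { GRIDFTP, HTTP, COPY }

    // One transfer definition for a file: a protocol and a location.
    static final class Source {
        final Protocol protocol;
        final String location;

        Source(Protocol protocol, String location) {
            this.protocol = protocol;
            this.location = location;
        }
    }

    // Assumed policy: return the first listed definition whose protocol
    // the DRM supports, or an empty Optional if none is usable.
    static Optional<Source> select(List<Source> alternatives, Set<Protocol> supported) {
        return alternatives.stream()
                .filter(source -> supported.contains(source.protocol))
                .findFirst();
    }
}
```

A DRM that supports only HTTP and copy would skip a GridFTP definition and use the HTTP one for the same file.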
WS-JDML Architecture
Job Submission Port Type
• Takes a JDML document describing the job to
run
• Validates the JDML so that an immediate
response can be given
• Validates user credentials, passed as part of
the SOAP header, using WS-Security
• The job is then placed in a queue before being translated into a DRM-specific version and deployed locally
Job Submission Port Type (2)
• Various results
– Unrecognised Job Term
• The JDML contains some term that the Service doesn’t
understand
– Invalid Job Term
• The JDML has a term which has the wrong type or an
invalid value
– Successful Submission
• URI to identify the job instance is returned
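The three submission results can be sketched as a validation step over the job terms. Everything named here is hypothetical: the class `SubmissionSketch`, the vocabulary in `KNOWN_TERMS`, and the `urn:uuid:` form of the job URI are assumptions, and credential checking and queueing are left out.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of the validation behind the submission port.
class SubmissionSketch {
    // The three results listed on the slide.
    enum Outcome { UNRECOGNISED_JOB_TERM, INVALID_JOB_TERM, SUCCESSFUL_SUBMISSION }

    static final class Result {
        final Outcome outcome;
        final String detail; // offending term name, or the job URI on success

        Result(Outcome outcome, String detail) {
            this.outcome = outcome;
            this.detail = detail;
        }
    }

    // Assumed vocabulary: term name mapped to the Java type of its value.
    static final Map<String, Class<?>> KNOWN_TERMS = new HashMap<>();
    static {
        KNOWN_TERMS.put("executable", String.class);
        KNOWN_TERMS.put("cpuCount", Integer.class);
    }

    // Validates every term so that the caller gets an immediate response.
    static Result submit(Map<String, Object> terms) {
        for (Map.Entry<String, Object> entry : terms.entrySet()) {
            Class<?> expected = KNOWN_TERMS.get(entry.getKey());
            if (expected == null) {
                return new Result(Outcome.UNRECOGNISED_JOB_TERM, entry.getKey());
            }
            if (!expected.isInstance(entry.getValue())) {
                return new Result(Outcome.INVALID_JOB_TERM, entry.getKey());
            }
        }
        // Queueing and DRM translation would happen here; omitted in this sketch.
        URI jobUri = URI.create("urn:uuid:" + UUID.randomUUID()); // assumed URI scheme
        return new Result(Outcome.SUCCESSFUL_SUBMISSION, jobUri.toString());
    }
}
```

Returning the offending term name with a failure gives the caller enough information to correct the JDML and resubmit.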
Job Monitoring Port Type
• This port provides a means to observe the
current status of a job and manipulate the
output transfer mechanism
• Requires the job URI returned by job submission
• Current job status is returned
– pending, scheduled, running, suspended, done, exit
– Not all DRMs support all states
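A client polling the monitoring port with the job URI might look like the sketch below. The interface `MonitoringPort` and the method `waitForCompletion` are hypothetical, and treating `done` and `exit` as the terminal states is an assumption the slides do not state.

```java
// Hypothetical client-side sketch of polling the monitoring port.
class MonitoringSketch {
    // The job states listed on the slide.
    enum JobStatus {
        PENDING, SCHEDULED, RUNNING, SUSPENDED, DONE, EXIT;

        // Assumption: a job in DONE or EXIT will not change state again.
        boolean isTerminal() {
            return this == DONE || this == EXIT;
        }
    }

    // Stand-in for the remote port; a real client would call the Web Service.
    interface MonitoringPort {
        JobStatus getStatus(String jobUri);
    }

    // Polls until the job reaches a terminal state or maxPolls calls were made,
    // and returns the last status seen.
    static JobStatus waitForCompletion(MonitoringPort port, String jobUri,
                                       long pollMillis, int maxPolls) {
        JobStatus status = port.getStatus(jobUri);
        for (int polls = 1; polls < maxPolls && !status.isTerminal(); polls++) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return status;
            }
            status = port.getStatus(jobUri);
        }
        return status;
    }
}
```

Because not all DRMs support all states, a client written this way should rely only on the terminal check and not on seeing every intermediate state.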
Job Monitoring Port Type (2)
File Transfers
• Port provides the ability to
– Retrieve portions of the files specified in the JDML
– Override the transfer methods given in the JDML
– Indicate that files should be transferred back as attachments to the SOAP document
• Allows easy monitoring of the job progress
Deployment
• DRM Specific Translators have been obtained
from existing code within the ICENI project
– These include Shell, SGE, Globus and Condor
• The Web Service architecture has been deployed on the Java J2EE 1.4 platform
– This provides a number of support features for the
services.
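The role of a DRM-specific translator can be sketched as one interface with one implementation per DRM. This is not the ICENI code: the interface `Translator` and its two-argument signature are hypothetical. The Condor output uses the usual submit description keywords (`universe`, `executable`, `arguments`, `queue`).

```java
// Hypothetical sketch of DRM-specific translators; not the ICENI implementation.
class TranslatorSketch {
    // Turns a generic job description into the form a specific DRM expects.
    interface Translator {
        String translate(String executable, String arguments);
    }

    // Shell: the job becomes a plain command line.
    static final Translator SHELL = (executable, arguments) ->
            arguments.isEmpty() ? executable : executable + " " + arguments;

    // Condor: the job becomes a submit description file.
    static final Translator CONDOR = (executable, arguments) ->
            "universe = vanilla\n"
            + "executable = " + executable + "\n"
            + (arguments.isEmpty() ? "" : "arguments = " + arguments + "\n")
            + "queue\n";
}
```

With this shape, supporting another DRM means adding one more `Translator` without touching the submission or monitoring ports.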
Demo
• Hopefully
• http://rhea.lesc.doc.ic.ac.uk:9999/jdmljobservice
• Need to run over SSH
Further Work
• Job State Transition
– The ability to represent the status of a job running within a
resource
• Notification
– Currently, monitoring a job requires polling the monitoring port
• It would be better to send notifications to a sink service, for example through WS-Notification
• Job Term Semantics
– Definition of job terms using natural language
– The lack of a formal model makes JDML transformation error-prone
– Develop an Ontology for Job submission terms
What do you use to build your service?
• Widely Implemented Standard Specification (1pt)
– <Demonstrable Multiple Implementations, e.g. SOAP, WSDL>
• Implemented draft specification (2pt)
– <Specification in standards body and supported by most/many companies. One/few implementations exist (e.g., WS-Security, BPEL)>
• Implemented draft specification (3pt)
– <Specification in standards body but alternatives exist. Industry is divided. One/few implementations exist (e.g., Transactions, coordination, notification, etc.)>
• Implemented proposal (4pt)
– <An implementation of an idea, a proposal but not submitted to standards body yet (e.g., WS-Addressing, WS-Trust, etc.)>
• Non-implemented proposal (5pt)
– <An idea that exists as a white paper, but no code and no specification details>
• Concept (6pt)
– <An idea that exists only as PowerPoint slides!!>
• TOTAL: SOAP, WSDL, WS-Security = 3
Service Dependencies
• What else does your service depend on (i.e.
external dependencies)?
– RDBMS / J2EE EJBs
– Logging (Java Logging)
– Message Queue (JMS)
• What does your implementation depend on?
– Java
– J2EE 1.4 compliant
AAA & Security
• What authentication mechanism do you use?
– WS-Security
• What authorisation mechanism do you use?
– Flexible composition of authorisation plugins.
• What accounting mechanism do you use?
– Java logging
• Does service interaction need to be encrypted?
• If these are not used now, will they be in the future?
Exploiting the Service Architecture
• What features from your ‘plumbing’ do you
use in your service?
– Event notification
– Meta-data
Service Activity
• Multiple interaction or single user?
– Multiple
• Throughput (1 per day or 100 per second?)
• Typical data volume moved in
• Typical data volume moved out
Service Failure
• Required Reliability
– Failure semantics?
• Positive ack (might need WS-ReliableMessaging)
• Required Persistence
– Job entered into the queue is always persisted
• Required Availability
– One of many or unique requirement
Required Service Management
• Remote access to:
– Usage statistics
– Job Progress
– Job Diagnostic and repair interfaces
Acknowledgements
• Director: Professor John Darlington
• Research Staff:
– Anthony Mayer, Nathalie Furmento
– Stephen McGough, James Stanton
– Yong Xie, William Lee
– Marko Krznaric, Murtaza Gulamali
– Asif Saleem, Laurie Young, Gary Kong
• Contact:
– http://www.lesc.ic.ac.uk/
– e-mail: lesc@ic.ac.uk