Predictable Workflow Deployment Service
Stephen McGough
Ali Afzal, Anthony Mayer, Steven Newhouse, Laurie Young
London e-Science Centre
Department of Computing, Imperial College London
ICENI
The Iceni, under Queen Boudicca,
united the tribes of South-East
England in a revolt against the
occupying Roman forces in AD60.
• IC e-Science Networked Infrastructure
• Developed by LeSC Grid Middleware Group
• Collect and provide relevant Grid meta-data
• Use to define and develop higher-level services
• Interaction with other frameworks: OGSA, Jxta etc.
2
The Architecture: Showing the Trinity
[Architecture diagram. Components shown: Scheduler, Performance Store, Reservation Service, Launcher, Application Service, Reservation Engine.]
3
Scheduling Service
Scheduling Framework
• Scheduling Algorithm – algorithm to select where to deploy components
• Application Mapper – generates the possible mappings of components to resources
• Listens out for services
– Launcher Services
– Reservation Services
– Performance Services
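The Application Mapper's job of generating the possible mappings of components to resources can be illustrated with a small sketch. This is Python for brevity (ICENI itself is written in Java); the function name, component names and resource names are all hypothetical, and exhaustive enumeration is only an assumption about how mappings might be produced.

```python
from itertools import product


def possible_mappings(components, resources):
    """Yield every assignment of each component to exactly one resource.

    Hypothetical helper: enumerates the full Cartesian product, which is
    only practical for small workflows.
    """
    for choice in product(resources, repeat=len(components)):
        yield dict(zip(components, choice))


# Two components and two resources give 2 ** 2 candidate mappings.
mappings = list(possible_mappings(["source", "solver"], ["node-a", "node-b"]))
for mapping in mappings:
    print(mapping)
```

A scheduling algorithm would then rank these candidates, for example by estimated execution time.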
4
Scheduling Ports
• launchJob – Takes an EP (workflow) or
JDML (job description)
– Works out where to deploy on the grid
• Uses the Performance, Reservation and Launching services
to help determine this
– Deploys work to appropriate Resources (as JDML)
– Returns EP indicating what has been done
• generateQuote – same as launchJob, but does not deploy
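The difference between the two ports can be sketched as follows: both work out a placement, but only launchJob acts on it. This is an illustrative Python sketch, not the ICENI API; the class, its fields, the estimate table and the greedy "fastest resource per component" rule are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    # estimates[(component, resource)] -> estimated seconds (made-up data)
    estimates: dict
    deployed: list = field(default_factory=list)

    def generate_quote(self, components, resources):
        """Pick the fastest resource for each component; deploy nothing."""
        plan = {}
        for comp in components:
            plan[comp] = min(resources, key=lambda r: self.estimates[(comp, r)])
        return plan

    def launch_job(self, components, resources):
        """Same placement decision as generate_quote, then record the deployment."""
        plan = self.generate_quote(components, resources)
        self.deployed.extend(plan.items())
        return plan


scheduler = ToyScheduler(estimates={
    ("source", "node-a"): 5.0, ("source", "node-b"): 9.0,
    ("solver", "node-a"): 40.0, ("solver", "node-b"): 25.0,
})
quote = scheduler.generate_quote(["source", "solver"], ["node-a", "node-b"])
print(quote, scheduler.deployed)
```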
5
The Performance Repository Framework:
Performance Framework
• Performance Store – persistent performance storage
• Data Collector – collects data on currently running applications (event times)
• Performance Processing – converts raw event times into performance data
6
Performance Ports
• register – Inform the Performance Service
of a new application to monitor
– Provides a unique id that is used for further calls
to the PS
• addEP – Provide a workflow for an
application the PS is monitoring
– Requires the Execution Plan (a workflow)
– Requires the unique id provided above
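The register and addEP ports can be pictured as a two-step handshake: obtain an id, then attach a workflow to it. The sketch below is illustrative Python only; the class name, the `plan_for` helper and the use of UUIDs as ids are assumptions, not details from ICENI.

```python
import uuid


class ToyPerformanceService:
    """Hypothetical in-memory stand-in for the Performance Service (PS)."""

    def __init__(self):
        self._plans = {}  # application id -> execution plan (or None)

    def register(self):
        """Start monitoring a new application and return its unique id."""
        app_id = uuid.uuid4().hex
        self._plans[app_id] = None
        return app_id

    def add_ep(self, app_id, execution_plan):
        """Attach an execution plan (workflow) to a registered application."""
        if app_id not in self._plans:
            raise KeyError(f"unknown application id: {app_id}")
        self._plans[app_id] = execution_plan

    def plan_for(self, app_id):
        return self._plans[app_id]


ps = ToyPerformanceService()
app_id = ps.register()
ps.add_ep(app_id, {"source": "node-a", "solver": "node-b"})
print(ps.plan_for(app_id))
```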
7
Performance Ports (2)
• getActivityTime – Get an estimated
execution time for part of a workflow
– Compulsory data
• Component type – identifies the component we are
interested in
• Resource – the resource it will be run on
• Activity – which part of the component
– Optional data
• Share count – the number of other components that will be
running on the resource
• Problem space characteristics – a set of parameters
specified by the component designer (e.g. number of
unknowns for a set of linear equations)
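A query of this shape, with three compulsory arguments and two optional ones, might look like the sketch below. It is illustrative Python, not the ICENI interface: the store layout, the linear cost model in the number of unknowns, and the assumption that sharing a resource scales time by (1 + share count) are all invented for the example.

```python
def get_activity_time(store, component_type, resource, activity,
                      *, share_count=0, problem=None):
    """Estimate the execution time (seconds) for one activity of a component.

    store maps (component_type, resource, activity) to a hypothetical model:
    {"base_seconds": float, "seconds_per_unknown": float}.
    """
    model = store[(component_type, resource, activity)]
    seconds = model["base_seconds"]
    if problem is not None:
        # Assumed cost model: time grows linearly with the problem size.
        seconds += model["seconds_per_unknown"] * problem.get("unknowns", 0)
    # Assumed fair sharing: each co-located component slows us down equally.
    return seconds * (1 + share_count)


store = {
    ("LinearEquationSolver", "node-a", "solve"): {
        "base_seconds": 10.0,
        "seconds_per_unknown": 0.5,
    },
}
estimate = get_activity_time(store, "LinearEquationSolver", "node-a", "solve",
                             share_count=1, problem={"unknowns": 100})
print(estimate)
```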
8
Performance Ports (3)
• getProblemCharacteristics – find
out the set of parameters and their types that
can be used when querying the performance
service for a given component
– Requires component type, resource, activity
9
Performance Events
• When ICENI components start, or component
ports are accessed, events are fired
– Used to gather performance information about
currently running applications
• Events contain data about
– Time, component where the event happened,
resource, type of event (start or port), application
to which the event refers.
• Events are serialised objects – can be XML
documents
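An event carrying these fields, serialised as an XML document, could be sketched as below. This is illustrative Python; the class name, field names and XML element names are assumptions, and the real ICENI event schema is not given in these slides.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class PerformanceEvent:
    time: str         # when the event happened
    component: str    # component where the event happened
    resource: str     # resource the component runs on
    event_type: str   # "start" or "port"
    application: str  # application the event refers to

    def to_xml(self):
        """Serialise the event as a flat XML document (hypothetical schema)."""
        root = ET.Element("event")
        for name, value in asdict(self).items():
            ET.SubElement(root, name).text = value
        return ET.tostring(root, encoding="unicode")


event = PerformanceEvent("12:00", "Linear Equation Source", "node-a",
                         "start", "linear-equations")
print(event.to_xml())
```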
10
Collection of Performance Results
[Diagram: the Linear Equation workflow (Linear Equation Source, Linear Equation Solver, Display Vector) fires events such as "Start Linear Equation Source" to the Data Collector. Collected results pass through Performance Processing into the Performance Store.]
Results
Time    Event
12:00   Linear Equation Source Start
12:04   Send out Equations
12:03   Linear Equation Solver Start
12:05   Receive Equations
12:12   ………..
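The step from raw event times to performance data can be sketched as taking the time between consecutive events of the same component. This is illustrative Python only. The pairing of times with events (12:00 source start, 12:04 equations sent, 12:03 solver start, 12:05 equations received) is a reading of the slide's table, and the function name and tuple layout are hypothetical.

```python
from datetime import datetime


def activity_durations(events):
    """Return minutes elapsed between consecutive events of each component.

    events: iterable of (HH:MM, component, event_name) tuples in any order.
    """
    by_component = {}
    for time_text, component, name in events:
        stamp = datetime.strptime(time_text, "%H:%M")
        by_component.setdefault(component, []).append((stamp, name))

    durations = {}
    for component, items in by_component.items():
        items.sort(key=lambda item: item[0])
        for (t0, n0), (t1, n1) in zip(items, items[1:]):
            durations[(component, n0, n1)] = (t1 - t0).total_seconds() / 60
    return durations


raw_events = [
    ("12:00", "Linear Equation Source", "Start"),
    ("12:04", "Linear Equation Source", "Send out Equations"),
    ("12:03", "Linear Equation Solver", "Start"),
    ("12:05", "Linear Equation Solver", "Receive Equations"),
]
durations = activity_durations(raw_events)
for key, minutes in durations.items():
    print(key, minutes)
```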
11
Launching Service
Launching Framework
• Launcher – converts a JDML document into a platform-specific job
• Launcher Factory – generates a Launcher for each job
submitted to the Launching Service
• Advertiser – generates a document for each resource
available from this Launcher
• Reservation – provides a mechanism for reservations
to be made
12
Launching Service
• launchJob – Takes an XML description of
the job to deploy written in JDML and enacts
this job on the appropriate resource
– JDML is translated to the local DRM specifics
• getResources – returns the set of ids of
the resources available from this launcher.
– If a set of user credentials is provided then the
list only contains those resources that the user may
use.
13
Launching Service (2)
• getResourceDescription – Get the resource
description for a named resource as an XML
document.
– If credentials are provided, only return the document if the
user can use the resource.
• getResourceAttribute – Query a specific
attribute from a resource. Given the name of a
resource and the name of one of its attributes, return
the value of this attribute.
• getLocations – Get a list of the names of the
resources
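The resource query ports, including the credential filtering, can be sketched with a small in-memory launcher. This is illustrative Python, not the ICENI API: the class, the attribute dictionaries and the use of a plain user name in place of real credentials are assumptions.

```python
class ToyLauncher:
    """Hypothetical launcher holding resource descriptions in memory."""

    def __init__(self, resources, access):
        self._resources = resources  # resource id -> attribute dict
        self._access = access        # resource id -> set of permitted users

    def get_resources(self, user=None):
        """Return all resource ids, or only those the given user may use."""
        if user is None:
            return sorted(self._resources)
        return sorted(rid for rid in self._resources
                      if user in self._access.get(rid, set()))

    def get_resource_attribute(self, resource_id, attribute):
        """Return the value of one named attribute of one named resource."""
        return self._resources[resource_id][attribute]


launcher = ToyLauncher(
    resources={"node-a": {"cpus": 8}, "node-b": {"cpus": 64}},
    access={"node-a": {"alice", "bob"}, "node-b": {"alice"}},
)
print(launcher.get_resources(), launcher.get_resources(user="bob"))
```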
14
Launcher With Reservations
• createReservation – Given an agreement
document, requests a reservation for a resource
– Returns an agreement document and an agreement identity
• renegotiateAgreement – Takes an agreement
document returned previously and attempts to
modify it.
– If successful, a new document is returned
– If unsuccessful, an alternative proposal is returned
• cancelReservation – takes a reservation
identity and cancels the associated Reservation
15
Launcher With Reservations (2)
• createHold – Given an agreement document and
timeout value make a hold on a resource
– Arguments may be negotiated
– Returns an Agreement Document with the Hold Identity
– Hold is not permanent (time limited)
• may need to cancel if holds cannot be obtained for all other
components in the application
• confirmHold - Takes a hold identity and makes
the hold permanent
• cancelHold – Takes a hold identity and cancels
that hold on the resource.
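The hold life cycle (create with a timeout, then confirm or cancel) can be sketched as a small state machine. This is illustrative Python under stated assumptions: the class and id format are hypothetical, time is passed in explicitly as seconds to keep the example deterministic, and agreement documents and negotiation are left out.

```python
class ToyResource:
    """Hypothetical resource that tracks time-limited holds."""

    def __init__(self):
        self._holds = {}         # hold id -> expiry time in seconds
        self._confirmed = set()  # hold ids that became permanent
        self._next_id = 0

    def create_hold(self, now, timeout):
        """Place a hold that lapses `timeout` seconds after `now`."""
        self._next_id += 1
        hold_id = f"hold-{self._next_id}"
        self._holds[hold_id] = now + timeout
        return hold_id

    def confirm_hold(self, hold_id, now):
        """Make the hold permanent; fails if it is unknown or has expired."""
        expiry = self._holds.pop(hold_id, None)
        if expiry is None or now > expiry:
            return False
        self._confirmed.add(hold_id)
        return True

    def cancel_hold(self, hold_id):
        """Drop a pending hold; returns False if there was nothing to cancel."""
        return self._holds.pop(hold_id, None) is not None


resource = ToyResource()
hold = resource.create_hold(now=0, timeout=60)
print(resource.confirm_hold(hold, now=30))
```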
16
Reservation Service
• makeReservations – Takes a set of EPs
(workflows) and tries to see if any of them can be
fully reserved for the given user credentials
– Returns an EP that can be fully reserved (if one exists)
– Does this by making holds with the Launching Services
and confirming them
• cancelReservation – Takes the Resource
Identity and Reservation Identity and cancels that
reservation
– These are found in the EP returned from creating a
reservation
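The makeReservations behaviour described above, holding every resource an EP needs and confirming only when all holds succeed, can be sketched as follows. This is illustrative Python, not the ICENI implementation: the launcher class, its one-exclusive-slot-per-resource rule, and the representation of an EP as a component-to-resource dictionary are assumptions, and user credentials are omitted.

```python
class ToyLauncher:
    """Hypothetical launcher: each resource has one exclusive slot."""

    def __init__(self, free_resources):
        self.free = set(free_resources)
        self.held = set()
        self.reserved = set()

    def create_hold(self, resource):
        if resource not in self.free:
            return False
        self.free.remove(resource)
        self.held.add(resource)
        return True

    def cancel_hold(self, resource):
        self.held.discard(resource)
        self.free.add(resource)

    def confirm_hold(self, resource):
        self.held.remove(resource)
        self.reserved.add(resource)


def make_reservations(execution_plans, launcher):
    """Return the first plan whose resources can all be held, else None."""
    for plan in execution_plans:
        held = []
        for resource in plan.values():
            if not launcher.create_hold(resource):
                break
            held.append(resource)
        if len(held) == len(plan):
            for resource in held:
                launcher.confirm_hold(resource)
            return plan
        for resource in held:  # partial success: release what we held
            launcher.cancel_hold(resource)
    return None


launcher = ToyLauncher(free_resources={"node-a", "node-c"})
plans = [
    {"source": "node-a", "solver": "node-b"},  # node-b is not available
    {"source": "node-a", "solver": "node-c"},
]
chosen = make_reservations(plans, launcher)
print(chosen)
```

Holding before confirming means a plan that cannot be fully reserved leaves no resources tied up.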
17
Reservation Engine
• Exposes the underlying reservation features of the DRM
• makeReservation – Takes reservation including
time interval and user credentials
– Either confirms the reservation is accepted or offers an
alternative
• cancelReservation – Takes a reservation identity
and cancels it
• makeHold – Takes a reservation request and duration
– Returns the time interval that the hold will be held for
• cancelHold – Cancel a Hold request given its id
• confirmHold – Make a Hold into a reservation –
requires id
18
Example Execution
[Sequence diagram. Participants: Actor, Scheduling Service, Performance Service, Reservations Service, Launcher Service, Reservations Engine. Messages, in the order they appear on the slide:]
• Advertise (four times)
• Submit workflow
• Get resource information
• Get performance information
• Performance data
• Resource information
• Evaluate Performance Models
• Schedule workflow
• Create Reservations
• Create Hold
• Hold Created
• Confirm Hold
• Execution Plan
• Reservation Confirmed
• Create Hold
• Hold Created
• Confirm Hold
• Reservation Created
• Deploy Jobs onto Resources
• Application Started
19
Service: ICENI
• End-to-end Grid middleware, providing
Launching, Scheduling, Reservation and inter-application communication.
– URL: www.lesc.doc.ic.ac.uk/iceni
– Licence: ICENI, based on Sun open source licence
– Support: Web site / mailing list
• SOA Model: Jini
20
What do you use to build your service?
(i.e. How ‘standard’ is your service?)
• Widely Implemented Standard Specification (1pt)
– JINI
• Implemented draft specification (2pt)
• Implemented draft specification (3pt)
• Implemented proposal (4pt)
– ICENI Architecture
• Non-implemented proposal (5pt)
• Concept (6pt)
• TOTAL: JINI 1pt + implemented proposal (ICENI Architecture) 4pt = 5pt.
21
Service Dependencies
• What else does your service depend on (i.e.
external dependencies)?
– Logging : Java Logging
• What does your implementation depend on?
– Languages : Java
– JINI based.
22
AAA & Security
• What authentication mechanism do you use?
– Based on X.509 certificates.
• What authorisation mechanism do you use?
– From ICENI infrastructure.
• What accounting mechanism do you use?
– None at present.
• Does service interaction need to be encrypted?
• If these are not used now, will they be in the future?
23
Exploiting the Service Architecture
• What features from your ‘plumbing’ do you
use in your service?
– Event notification
– Meta-data
– Registry discovery/advertisement
24
Service Activity
• Multiple interaction or single user?
– Multiple interaction
• Throughput (1 per day or 100 per second?)
– ~10 per minute.
• Typical data volume moved in
• Typical data volume moved out
– Depends on job.
25
Service Failure
• Required Reliability
– Failure semantics?
• Positive ack
• Required Persistence
– No current persistence.
• Required Availability
– One of many.
26
Required Service Management
• Remote access to:
– Performance
– Progress (limited at present).
27
The Future
• How will ICENI develop?
• Want to re-engineer services as web-services
• Already have this for launcher (WS-JDML)
• Bring ICENI back into mainstream services
– More reliable and useful to others
– Fragment ICENI into separate interoperating
services
– Explore different service discovery mechanisms
28
Acknowledgements
• Director: Professor John Darlington
• Research Staff:
– Anthony Mayer, Nathalie Furmento
– Stephen McGough, William Lee, Jeremy Cohen
– Marko Krznaric, Murtaza Gulamali
– Asif Saleem, Laurie Young, Jeffery Hau
• Others:
– Steven Newhouse, Yong Xie, Gary Kong, James Stanton
• Contact:
– http://www.lesc.ic.ac.uk/iceni
– e-mail: lesc@ic.ac.uk
29