Workflow as a Service: An Approach to Workflow Farming

Reginald Cushing, Adam Belloum, Vladimir Korkhov, Dmitry Vasyunin, Marian Bubak, Carole Leguy
Institute for Informatics, University of Amsterdam
3rd International Workshop on Emerging Computational Methods for the Life Sciences, 18 June 2012
Outline
● Scientific Workflows
● Farming Concepts
● Workflow as a Service (WfaaS)
● System Overview
  – Task Harnessing
  – Messaging
● Application Use Case
● Results
● Conclusions
Scientific Workflows
Composing experiments from reusable modules
● Vertices represent computation
● Edges represent data dependencies and data communication
● Modules/tasks communicate through channels represented by ports
● Workflow engines distribute the workload onto resources such as grids and clouds
● Modules run in parallel, thus achieving better throughput (a representation is sketched below)
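A minimal sketch, not the authors' engine, of how such a workflow could be represented as a graph with tasks as vertices and port-to-port channels as edges; all class and field names here are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """A vertex in the workflow graph: a reusable computational module."""
    name: str
    in_ports: list = field(default_factory=list)   # names of input channels
    out_ports: list = field(default_factory=list)  # names of output channels

@dataclass
class Channel:
    """An edge: data flows from one task's output port to another task's input port."""
    src_task: str
    src_port: str
    dst_task: str
    dst_port: str

@dataclass
class Workflow:
    tasks: dict = field(default_factory=dict)     # task name -> Task
    channels: list = field(default_factory=list)  # data dependencies

    def add_task(self, task):
        self.tasks[task.name] = task

    def connect(self, src_task, src_port, dst_task, dst_port):
        self.channels.append(Channel(src_task, src_port, dst_task, dst_port))

# Example: a two-task pipeline, simulate -> analyse
wf = Workflow()
wf.add_task(Task("simulate", in_ports=["params"], out_ports=["result"]))
wf.add_task(Task("analyse", in_ports=["result"], out_ports=["report"]))
wf.connect("simulate", "result", "analyse", "result")
```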
Farming Concepts
● Many scientific applications require a parameter space study, a.k.a. a parameter sweep
● In workflows, parameter sweeps can be achieved by running multiple identical workflows with different parameter inputs
● Cons: every instance of a workflow has to be submitted to distributed resources, where queue waiting times play a significant role in throughput
Farming Concepts
(Diagram: identical tasks consuming parameter messages from shared queues)
● Parameters are organized on message queues
● Each task processes data sequentially
● Adding more tasks increases the message consumption rate (illustrated below)
● Challenge: how many tasks to create? Too many, and tasks get stuck on queues; too few, and optimal performance is not achieved
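A small, self-contained illustration of the farming idea, using Python threads and an in-memory queue purely as a stand-in for the distributed message queues of the real system: more consumer tasks drain the same parameter queue faster, until overheads dominate.

```python
import queue
import threading
import time

def run_farm(n_tasks, parameters):
    """Consume one shared parameter queue with n_tasks identical workers."""
    q = queue.Queue()
    for p in parameters:
        q.put(p)

    def task_worker():
        while True:
            try:
                params = q.get_nowait()
            except queue.Empty:
                return                      # queue drained, worker exits
            time.sleep(0.01)                # stand-in for sequential processing
            q.task_done()

    workers = [threading.Thread(target=task_worker) for _ in range(n_tasks)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

# Doubling the number of tasks roughly halves the time to drain the queue,
# until submission overhead and queue waiting times start to dominate.
start = time.time(); run_farm(2, range(100)); print("2 tasks:", time.time() - start)
start = time.time(); run_farm(4, range(100)); print("4 tasks:", time.time() - start)
```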
Workflow as a Service
● Workflow execution is persistent, i.e. it runs, processes data, and does NOT terminate but waits for more data
● An active workflow instance can process multiple parameter sets
● Makes better use of computing resources
● A parameter space can be partitioned amongst a pool of active workflow instances (a farm of workflows)
● A workflow acts as a service by accepting requests to process data with given parameters (see the sketch below)
  – Request 1: data A, parameters {p1,p2,...}
  – Request 2: data A, parameters {k1,k2,...}
● Multiple WfaaS instances processing requests form a farm of workflows
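A hedged sketch of the service behaviour described above: a workflow instance that stays alive and keeps accepting requests (a data reference plus a parameter set) instead of terminating after one run. The request format, queue object, and function names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Request:
    """One unit of work for an active workflow instance."""
    data_ref: str          # reference to the input data (e.g. "data A")
    parameters: dict       # one point of the parameter space

def workflow_service(request_queue, run_workflow):
    """Persistent workflow instance: process requests until told to stop."""
    while True:
        request = request_queue.get()      # blocks until a request arrives
        if request is None:                # sentinel: shut the instance down
            break
        run_workflow(request.data_ref, request.parameters)
        # The instance does NOT terminate here; it waits for the next request.
```

Because the instance survives between requests, the scheduling cost of submitting it to the distributed resources is paid once and amortised over many parameter sets.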
System Overview
● Loosely coupled modules revolving around message queues
Enactment Engine
● Dataflow engine (top-level scheduler) based on the Freefluo§ engine
● Models workflows as dataflow graphs
● Vertices are tasks while edges are data dependencies
● Tasks have ports to simulate data channels
● The dataflow model dictates that only tasks which have input are scheduled for execution (see the sketch below)
§ http://freefluo.sourceforge.net
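A minimal sketch of that scheduling rule, reusing the illustrative Workflow/Task classes sketched earlier; it is an assumption-laden simplification, not Freefluo itself: on every scheduling pass, only tasks whose input ports all have data queued are eligible to run.

```python
def runnable_tasks(workflow, port_buffers):
    """Return the tasks that the dataflow rule allows to be scheduled.

    port_buffers maps (task_name, port_name) -> list of pending data items.
    A task is runnable only when every one of its input ports has data;
    a task with no input ports (a source) is always runnable.
    """
    ready = []
    for task in workflow.tasks.values():
        if all(port_buffers.get((task.name, p)) for p in task.in_ports):
            ready.append(task)
    return ready
```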
Message Broker
● The message broker plays a pivotal role in the system
● Message queues act as a data buffer
● Communicating tasks are time-decoupled
● Through queue sharing we can achieve scaling
● Tasks communicate through messaging, where messages contain references to actual data (sketched below)
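As an illustration of "messages contain references to actual data", here is a hedged sketch using RabbitMQ via the pika client; the slides do not name a specific broker, so the library, queue name, data URL scheme, and message layout are all assumptions.

```python
import json
import pika  # RabbitMQ client, used here only as an example broker

# Connect to a broker and declare the queue shared by the consuming tasks.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="simulate.in", durable=True)

# The message carries a *reference* to the data (e.g. a storage URL),
# not the data itself, so the broker only buffers small control messages.
message = {
    "data_ref": "lfc://storage/experiments/dataA",   # hypothetical data location
    "parameters": {"flow_velocity": 1.2, "brachial_radius": 0.43},
}
channel.basic_publish(
    exchange="",
    routing_key="simulate.in",
    body=json.dumps(message),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```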
Submission System
● Pluggable schedulers (bottom-level) for task match-making
● Submitters (drivers) abstract actual resources such as clusters, grids, and clouds
● The scheduler matches a task to a submitter
● The submitter does the actual task/job submission (see the sketch below)
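A sketch of one way such a pluggable scheduler/submitter design could look in Python; the class and method names are assumptions, not the system's actual API.

```python
from abc import ABC, abstractmethod

class Submitter(ABC):
    """Driver that hides a concrete resource (cluster, grid, cloud)."""
    @abstractmethod
    def accepts(self, task) -> bool: ...
    @abstractmethod
    def submit(self, task) -> str:
        """Submit the task and return a job identifier."""

class Scheduler(ABC):
    """Bottom-level scheduler: match-making between tasks and submitters."""
    @abstractmethod
    def match(self, task, submitters) -> Submitter: ...

class FirstFitScheduler(Scheduler):
    """Trivial policy: pick the first submitter willing to take the task."""
    def match(self, task, submitters):
        for s in submitters:
            if s.accepts(task):
                return s
        raise RuntimeError(f"no submitter accepts task {task!r}")

# New resource back-ends are added by writing another Submitter subclass;
# new match-making policies by writing another Scheduler subclass.
```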
Task Harnessing
● The task harness is a late-binding, pilot-job mechanism
● A pilot job (the harness) is submitted, which then pulls the actual job (sketched below)
● The harness separates data transport from scientific logic
● Better control of tasks
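A hedged sketch of the pilot-job idea: the job that lands on the worker node is a generic harness, and the real work is bound late by pulling it from a queue. The `job_queue.poll` and `job.execute` helpers are hypothetical, as is the idle-timeout policy.

```python
import time

def harness_main(job_queue, idle_timeout=600):
    """Generic pilot job: pull real work after landing on a worker node."""
    idle_since = time.time()
    while True:
        job = job_queue.poll()               # hypothetical non-blocking pull
        if job is None:
            if time.time() - idle_since > idle_timeout:
                return                       # nothing to do: release the slot
            time.sleep(5)
            continue
        idle_since = time.time()
        job.execute()                        # scientific logic, bound at runtime
```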
Task Auto-Scaling
● Messages between tasks are monitored
● The size of the queued data and the mean data processing time are used to calculate the task load
● Auto-scaling replicates a particular task to ameliorate the task load (see the sketch below)
● Replicated tasks (clones) partition the data by sharing the same input message queues
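One possible reading of that load metric, written out as a sketch; the exact formula, target, and clone limit are assumptions, since the slides only say that queued data size and mean processing time feed into the task load.

```python
def task_load(queued_items, mean_processing_time, current_clones):
    """Estimated time (seconds) for the current clones to drain the backlog."""
    return queued_items * mean_processing_time / max(current_clones, 1)

def desired_clones(queued_items, mean_processing_time, current_clones,
                   target_drain_time=300.0, max_clones=32):
    """Replicate the task until the backlog can be drained within the target."""
    load = task_load(queued_items, mean_processing_time, current_clones)
    if load <= target_drain_time:
        return current_clones
    needed = int(queued_items * mean_processing_time / target_drain_time) + 1
    return min(needed, max_clones)

# Example: 500 queued messages, 4 s each, 2 clones already running.
print(desired_clones(500, 4.0, 2))   # suggests scaling out to 7 clones
```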
Parameter Mapping
● One-to-one mapping: each parameter set is mapped to one workflow instance
  – Generates many workflow instances which end up stuck on queues awaiting execution
  – High scheduling overhead, high concurrency
● Many-to-one mapping: all parameter sets are mapped to the same workflow instance
  – Only one workflow to schedule, but it takes long to process the whole parameter space
  – Low scheduling overhead, low concurrency
● Many-to-many mapping: the parameter space is partitioned amongst a farm of workflows (sketched below)
  – A number of workflows are scheduled, which accelerates processing
  – Low scheduling overhead, high concurrency
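A minimal sketch of the many-to-many option: split the parameter space into roughly equal partitions, one per workflow instance in the farm. The function name and chunking strategy are illustrative assumptions.

```python
def partition(parameter_space, n_instances):
    """Split a list of parameter sets into n_instances roughly equal chunks."""
    chunks = [[] for _ in range(n_instances)]
    for i, params in enumerate(parameter_space):
        chunks[i % n_instances].append(params)
    return chunks

# 3000 parameter sets spread over a farm of 25 workflow instances:
parameter_space = [{"run": i} for i in range(3000)]
for instance_id, chunk in enumerate(partition(parameter_space, 25)):
    # each chunk would be placed on the input queue of one active instance
    print(f"instance {instance_id}: {len(chunk)} parameter sets")
```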
Task Harnessing
● WfaaS is enabled through task harnessing
● A harness is caretaker code that runs alongside the module on the resource/worker node
● It implements a plugin architecture
● Modules are dynamically loaded at runtime (see the sketch below)
● Data communication to and from the module is taken care of by the harness
● The harness invokes the module with new data-processing requests
● The harness is akin to a container, while the module is akin to a service
● The harness enables asynchronous module execution, as communication is done through messaging
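A hedged sketch of the plugin behaviour described above, using Python's importlib to bind the scientific module at runtime inside the harness; the module name, entry point, and the `fetch_data`/`publish_results` helpers are hypothetical.

```python
import importlib

def load_module(module_name, entry_point="run"):
    """Dynamically load a scientific module and return its entry function."""
    module = importlib.import_module(module_name)    # e.g. "bloodflow_simulation"
    return getattr(module, entry_point)

def harness_loop(request_queue, module_name):
    """Caretaker loop: the harness handles I/O, the module does the science."""
    run = load_module(module_name)
    while True:
        request = request_queue.get()                 # new data-processing request
        if request is None:
            break
        inputs = fetch_data(request["data_ref"])      # hypothetical data transport
        outputs = run(inputs, request["parameters"])  # invoke the loaded module
        publish_results(outputs)                      # hypothetical result transport
```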
Messaging
● In WfaaS, modules communicate through messaging
● Message queues allow multiple instances of a module to share the same input space
● Through message queues, data is partitioned amongst modules
● Messaging circumvents the need to co-allocate resources
● A pull model implies that each module can process data at its own pace
● Once a module has finished processing data, it asks for more (pull), as sketched below
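To complement the publishing sketch earlier, a hedged consumer-side sketch of the pull model with pika: the module fetches one message, processes it, acknowledges it, and only then asks for the next. Again, the broker, queue name, message layout, and the `process` call are assumptions.

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="simulate.in", durable=True)

while True:
    # Pull exactly one message; a None method frame means the queue is empty.
    method, properties, body = channel.basic_get(queue="simulate.in", auto_ack=False)
    if method is None:
        break                                   # or sleep and poll again
    request = json.loads(body)
    process(request["data_ref"], request["parameters"])   # hypothetical module call
    # Acknowledge only after successful processing, then pull the next message.
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection.close()
```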
Application Use Case
● A biomedical study for which 3000 runs were required to perform a global sensitivity analysis
● The patient-specific simulation includes many parameters based on data measured in vivo
● Arterial tree model geometry and representation of model parameters constrained to uncertainties
● Parameters: flow velocity; brachial, radial, and ulnar radii; brachial, radial, and ulnar lengths; etc.
Results
● Left: with WfaaS, 100 simulations take around 3 h 15 min
● Right: without WfaaS, 100 simulations take 5 h 15 min
● In the WfaaS approach, each workflow instance performs multiple simulations, which drastically reduces queue waiting times
● The non-WfaaS approach generates 100 workflow instances, with most of them getting stuck on job queues
● In both cases, workflows were competing for 28 worker nodes
Conclusions
● WfaaS is an ideal approach to large parametric studies
● WfaaS reduces the common scheduling overhead associated with queue waiting times
● WfaaS is achieved through task harnessing, whereby caretaker routines can invoke the task multiple times
● A farm of workflows can progress at its own pace through a parameter-pulling mechanism
Further Information
● WSVLAM workflow management system
  – http://staff.science.uva.nl/~gvlam/wsvlam/
● Computational Sciences at University of Amsterdam
  – http://uva.computationalscience.nl
● COMMIT
  – http://www.commit-nl.nl/new