GSOC_Airavata

Apache Airavata
GSOC 2013
Target Community: Science Gateways
Enabling & Democratizing Scientific Research
Advanced Science Tools
Computational Resources
Scientific Instruments
Algorithms and Models
Knowledge and Expertise
Archived Data and Metadata
What does Apache Airavata do?
• Compose, manage, execute, and monitor distributed, computational workflows.
• Wrap legacy command line scientific applications with Web services.
• Run jobs on computational resources ranging from local resources to computational grids and clouds.
• Manage provenance data.
Apache Airavata
End Users
Core Developer
Message Box
Scientific Application
Gateway Developer
Apache Airavata API
Workflow Interpreter
Application Factory
Computational Resources
Registry
Apache Airavata Components
Component | Description
XBaya | Workflow graphical composition tool.
Registry Service | Insert and access application, host machine, workflow, and provenance data.
Workflow Interpreter Service | Execute the workflow on one or more resources.
Application Factory Service (GFAC) | Manages the execution of an application in a workflow.
Messaging System | WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events.
Airavata API | Single wrapping client to provide higher level programming interfaces.
Hi, I’m Nolram.
I’m a computational physicist.
I run computational experiments every day.
This is how I typically run my experiments:
First, I collect my observed data.
And then I pass the data to my applications & get the result.
This is starting to become a very tiring task.
Scientific Application
Another Scientific
Application
How can I make this much simpler…?
Logically, this is how my life would be made easier…
Is it possible to automate this flow sequence without my guidance?
Scientists from many different fields face this problem every day.
The solution is to use a workflow-powered science gateway to manage the experiment online.
What is a workflow, you ask?
Well, you just saw one in our previous animation…
We introduce Apache Airavata, a system capable of composing, managing, executing, and monitoring small- to large-scale applications and workflows.
Want to see how it works?
A Typical Workflow
…
I will hand over my data & my experiment details (the workflow) to the Airavata server,
and while I wait for results, Airavata will complete the experiment & return me the results.
Airavata will notify me with progress updates of my experiment.
Results
Progress of the experiment
Apache Airavata
The Gateway
Let’s look closely at how Airavata manages workflows.
Airavata has 4 main components…
1. Workflow Interpreter: steers the workflow execution
2. GFac: steers the app executions & data transfers
3. The Message Box: records progress of the workflow execution
4. Registry: defines the available science applications & records all results of experiments
Message Box
GFac
Workflow Interpreter
The Gateway
Registry
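The division of work among the four components can be illustrated with a minimal Python sketch. It is not Airavata code: the class names mirror the slide, but every method, field, and the two toy applications ("scale" and "total") are hypothetical, and the workflow is assumed to be a simple linear sequence of steps rather than a general graph.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

AppFunc = Callable[[dict], dict]


@dataclass
class Registry:
    """Defines the available applications and records results (hypothetical API)."""
    applications: Dict[str, AppFunc] = field(default_factory=dict)
    results: Dict[str, dict] = field(default_factory=dict)


@dataclass
class MessageBox:
    """Records progress messages in arrival order (hypothetical API)."""
    messages: List[str] = field(default_factory=list)

    def publish(self, message: str) -> None:
        self.messages.append(message)


class GFac:
    """Runs one registered application and reports its progress."""

    def __init__(self, registry: Registry, box: MessageBox) -> None:
        self.registry = registry
        self.box = box

    def run(self, name: str, inputs: dict) -> dict:
        self.box.publish(f"{name}: started")
        outputs = self.registry.applications[name](inputs)
        self.box.publish(f"{name}: finished")
        return outputs


class WorkflowInterpreter:
    """Steers a linear workflow: each step sees all earlier outputs."""

    def __init__(self, gfac: GFac, registry: Registry) -> None:
        self.gfac = gfac
        self.registry = registry

    def execute(self, steps: List[str], inputs: dict) -> dict:
        data = dict(inputs)
        for name in steps:
            outputs = self.gfac.run(name, data)
            self.registry.results[name] = outputs  # keep results per step
            data.update(outputs)
        return data


def run_demo():
    """Wire the components together and run a two-step toy workflow."""
    registry = Registry()
    registry.applications["scale"] = lambda d: {"scaled": [2 * x for x in d["samples"]]}
    registry.applications["total"] = lambda d: {"total": sum(d["scaled"])}
    box = MessageBox()
    interpreter = WorkflowInterpreter(GFac(registry, box), registry)
    result = interpreter.execute(["scale", "total"], {"samples": [1, 2, 3]})
    return result, box.messages, registry.results
```

Calling run_demo() returns the final data, the progress messages in the order they were published, and the per-step results stored in the registry, which is roughly what a gateway would show a user as "results" and "experiment progress".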
End Users
A Stable API for
Airavata
Scientific
Application
Gateway Developer
Apache Airavata
Computational Resources
[Figure: web-based interfaces talking to the Airavata Server through the Airavata Service Interface (wraps client API).
Actors: Application Developer, Workflow Developer.
Web components: Application Registration UI, Web Based workflow composer, Web Based Experiment Builder, Web Based Workflow Monitor.
Step labels: A1, A2, A3, W1, W2, W3, W4, E1, E2, E3, M1, M2, M3.
Operation labels: Service Map XML, Service Map to AWSDL, Get AWSDL, Put XWF, Get WI’s, Shred Workflow Inputs, Launch Workflow, Get Workflow Graph, Watch Progress, Monitor Workflow.]
Goal of the project
• Design Web-Based interfaces for Airavata:
– Application Registration
– Workflow Construction
– Workflow Execution
– Workflow Monitoring
• Provide an opportunity for GSoC students to understand distributed systems in action
• Scope for research and software engineering papers
Data Model
• Application Description
– The user describes the inputs and outputs of the application.
– Currently this information is captured in the Service Map Schema.
– This schema is stored in the Airavata Registry as XML. The schema utility also generates an application service WSDL from this schema using the Airavata WSDL Generator.
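As a rough illustration of an application description stored as XML, the sketch below models an application's inputs and outputs and round-trips them through an XML string. The element and attribute names ("application", "input", "output", "name", "type") are invented for this example and are not the actual Service Map Schema; WSDL generation is not shown.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Parameter:
    name: str
    type: str


@dataclass
class ApplicationDescription:
    """Inputs and outputs of one application (hypothetical model)."""
    name: str
    inputs: List[Parameter]
    outputs: List[Parameter]

    def to_xml(self) -> str:
        # Hypothetical layout: one <input>/<output> element per parameter.
        root = ET.Element("application", {"name": self.name})
        for tag, params in (("input", self.inputs), ("output", self.outputs)):
            for p in params:
                ET.SubElement(root, tag, {"name": p.name, "type": p.type})
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "ApplicationDescription":
        root = ET.fromstring(text)

        def read(tag: str) -> List[Parameter]:
            return [Parameter(e.get("name"), e.get("type")) for e in root.findall(tag)]

        return cls(root.get("name"), read("input"), read("output"))


# Usage: describe a toy "echo" application and serialize it.
echo = ApplicationDescription(
    name="echo",
    inputs=[Parameter("message", "string")],
    outputs=[Parameter("reply", "string")],
)
echo_xml = echo.to_xml()
```

A registry could store echo_xml as-is and rebuild the description later with ApplicationDescription.from_xml(echo_xml).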
[Figure: Airavata Server API.
Actors: Application Developer, Workflow Developer.
Web components: Application Registration UI, Web Based workflow composer.
Server components: Registry, Workflow Interpreter, Application Factory (GFac), Messaging Subsystem.
Step labels: A1, A2, W1 (Get AWSDL), W2.
Other labels: Application Desc, Service Map XML, Service Map to AWSDL, Workflow, Launch & Manage Jobs, Execute & Manage Computations, Notify progress of job or workflow execution, Real-Time Monitoring.]
A peek at one of the clusters
Scheduling ‘qsub’ batch jobs on the cluster
[Figure: an SGE cluster. The SGE MASTER node holds the queues, policies, priorities, share/tickets, resources, and users/projects, and performs resource matching, selection, and scheduling for the submitted jobs (JOB X, JOB Y, JOB Z, JOB O, JOB N, JOB U). Worker nodes, joined by the interconnect, offer numbered slots for Queue-A, Queue-B, and Queue-C. Scheduling inputs: user policies (groups, roles, departments, projects), job policies, resources, system characteristics, and system status.]
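The matching, selection, and scheduling stages named on this slide can be sketched in a few lines of Python. This is a toy model and not the real SGE scheduler, which also weighs tickets, shares, and policies: here a job simply names the queue it wants, larger priority values run first, and a job is placed on the first node with a free slot in that queue. All names and numbers are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    name: str
    queue: str      # queue the job asks for
    priority: int   # assumption: larger values are scheduled first


@dataclass
class Node:
    name: str
    free_slots: Dict[str, int]  # free slots per queue on this node


def schedule(jobs: List[Job], nodes: List[Node]) -> Dict[str, Optional[str]]:
    """Return job name -> node name, or None if the job must keep waiting."""
    placement: Dict[str, Optional[str]] = {}
    # Selection: highest priority first; the sort is stable, so ties keep
    # their submission order.
    for job in sorted(jobs, key=lambda j: -j.priority):
        # Resource matching: first node with a free slot in the wanted queue.
        target = next(
            (n for n in nodes if n.free_slots.get(job.queue, 0) > 0), None
        )
        if target is not None:
            target.free_slots[job.queue] -= 1  # scheduling: occupy the slot
            placement[job.name] = target.name
        else:
            placement[job.name] = None
    return placement


# Usage: two worker nodes and four jobs competing for three queues.
workers = [
    Node("worker-1", {"Queue-A": 1, "Queue-B": 1}),
    Node("worker-2", {"Queue-B": 1, "Queue-C": 1}),
]
submitted = [
    Job("JOB X", "Queue-A", 5),
    Job("JOB Y", "Queue-A", 9),
    Job("JOB Z", "Queue-B", 1),
    Job("JOB O", "Queue-C", 3),
]
placement = schedule(submitted, workers)
```

In this run JOB Y takes the only Queue-A slot because of its higher priority, so JOB X stays unplaced until a slot frees up.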
Simplified Gateway Architecture
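[Figure: simplified gateway architecture. Components: Gateway Interface, Gateway Server, Compute Servers. Credentials shown: community account grid certificate; username, password. Step labels: Step 0 (one-time gateway community setup), Step 1, Step 2, 3, ... Other labels: Gateway Authentication, Job Submit or File Transfer request, Output.]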
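One common reading of this slide is that users log in to the gateway with their own username and password, while the gateway submits work to the compute servers under a single community credential. The sketch below illustrates that reading only; the Gateway class, its fields, and the shape of the request are hypothetical, and no real grid certificate or job submission is involved.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

ITERATIONS = 10_000  # kept small so the example runs quickly


def _digest(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


@dataclass
class Gateway:
    # Step 0 (one-time setup): the shared community credential, a placeholder
    # string here rather than a real certificate.
    community_credential: str
    users: Dict[str, Tuple[bytes, bytes]] = field(default_factory=dict)
    submitted: List[dict] = field(default_factory=list)

    def add_user(self, username: str, password: str) -> None:
        salt = secrets.token_bytes(16)
        self.users[username] = (salt, _digest(password, salt))

    def authenticate(self, username: str, password: str) -> bool:
        # Gateway authentication with the user's own username and password.
        record = self.users.get(username)
        if record is None:
            return False
        salt, stored = record
        return hmac.compare_digest(_digest(password, salt), stored)

    def submit_job(self, username: str, password: str, command: str) -> dict:
        # The job runs under the community credential, but the request still
        # records which gateway user asked for it.
        if not self.authenticate(username, password):
            raise PermissionError("gateway authentication failed")
        request = {
            "run_as": self.community_credential,
            "gateway_user": username,
            "command": command,
        }
        self.submitted.append(request)
        return request


# Usage with made-up values.
gateway = Gateway(community_credential="community-cert")
gateway.add_user("nolram", "s3cret")
request = gateway.submit_job("nolram", "s3cret", "run-simulation")
```

Recording the gateway user next to the community credential is a design choice in this sketch: it keeps per-user accounting possible even though the compute servers only ever see one account.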
[Figure: gateways CIPRES, ParamChem, GridChem, DES, BioVLab, NSG, POPLAR, UltraScan, and VLAB, shown alongside separate Apache Airavata 1.0 instances.]
[Figure: Apache Airavata 2.0 with ParamChem, GridChem, VLAB, UltraScan, DES, and BioVLab.]