Apache Airavata
GSoC 2013

Target Community: Science Gateways
Enabling & Democratizing Scientific Research
• Advanced Science Tools
• Computational Resources
• Scientific Instruments
• Algorithms and Models
• Knowledge and Expertise
• Archived Data and Metadata

What does Apache Airavata do?
• Compose, manage, execute, and monitor distributed computational workflows.
• Wrap legacy command-line scientific applications with Web services.
• Run jobs on computational resources ranging from local resources to computational grids and clouds.
• Manage provenance data.

Apache Airavata
[Diagram labels: End Users, Gateway Developer, Core Developer, Scientific Application, Apache Airavata API, Workflow Interpreter, Application Factory, Message Box, Registry, Computational Resources]

Apache Airavata Components
Component: Description
• XBaya: Workflow graphical composition tool.
• Registry Service: Insert and access application, host machine, workflow, and provenance data.
• Workflow Interpreter Service: Execute the workflow on one or more resources.
• Application Factory Service (GFAC): Manages the execution of an application in a workflow.
• Messaging System: WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events.
• Airavata API: Single wrapping client to provide higher-level programming interfaces.

Hi, I’m Nolram. I’m a computational physicist. I run computational experiments every day.
This is how I typically run my experiments:
First I collect my observed data.
And then I pass the data to my applications & get the result.
[Diagram: Scientific Application, Another Scientific Application]
This is starting to become a very tiring task.
How can I make this much simpler…?
Logically, this is how my life would be made easier…
Is it possible to automate this flow sequence without my guidance?
Scientists from many different fields face this problem every day.
What is a workflow, you ask?
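In code terms, the routine described above is a fixed sequence: collect data, run one application, then feed its output to the next. The following is a minimal Python sketch of that idea only. All function names and data values are hypothetical stand-ins, and it does not use or represent the Airavata API.

```python
from typing import Callable, List

# A step takes the previous step's output and returns its own output.
Step = Callable[[List[float]], List[float]]


def collect_observed_data() -> List[float]:
    """Stand-in for gathering observed data (values are made up)."""
    return [1.0, 2.0, 3.0]


def scientific_application(data: List[float]) -> List[float]:
    """Hypothetical first application: doubles every value."""
    return [2.0 * x for x in data]


def another_scientific_application(data: List[float]) -> List[float]:
    """Hypothetical second application: reduces the values to their sum."""
    return [sum(data)]


def run_workflow(data: List[float], steps: List[Step]) -> List[float]:
    """Run the steps in order, passing each output to the next step."""
    for step in steps:
        data = step(data)
    return data


result = run_workflow(
    collect_observed_data(),
    [scientific_application, another_scientific_application],
)
print(result)
```

Once the sequence is written down as data (the list of steps), a system can run it without the scientist passing results along by hand, which is the automation asked for above.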
Well, you just saw one in our previous animation…
The solution is to use a workflow-powered science gateway to manage the experiment online.
We introduce Apache Airavata, a system capable of composing, managing, executing, and monitoring small- to large-scale applications and workflows.
Want to see how it works?

A Typical Workflow
• I will hand over my data & my experiment details (the workflow) to the Airavata server…
• …and while I wait for results, Airavata will complete the experiment & return me the results.
• Airavata will notify me with progress updates of my experiment.
[Diagram labels: The Gateway, Apache Airavata, Progress of the experiment, Results]

Let’s look closely at how Airavata manages workflows.
[Diagram labels: The Gateway, Apache Airavata, Experiment progress, Results]

Airavata has 4 main components…
1. Workflow Interpreter: Steer the workflow execution
2. GFac: Steer the app executions & data transfers
3. The Registry: Defines the available science applications & records all results of experiments
4. Message Box: Records progress of the workflow execution
[Diagram labels: The Gateway, Workflow Interpreter, GFac, Registry, Message Box]

A Stable API for Airavata
[Diagram labels: End Users, Gateway Developer, Scientific Application, Apache Airavata, Computational Resources]

[Diagram: web-based interfaces in front of the Airavata Service Interface (wraps client API) on the Airavata Server.
 Actors: Application Developer, Workflow Developer, Experiment Builder.
 Interfaces: Application Registration UI, Web Based workflow composer, Web Based Experiment Builder, Web Based Workflow Monitor.
 Step labels: A1, A2, A3; W1, W2, W3, W4; E1, E2, E3; M1, M2, M3.
 Operation labels: Service Map XML, Service Map to AWSDL, Get AWSDL, Put XWF, Get Workflow Graph, Shred Workflow Inputs, Get WI’s, Launch Workflow, Watch Progress, Monitor Workflow]

Goal of the project
• Design Web-Based interfaces for Airavata:
– Application Registration
– Workflow Construction
– Workflow Execution
– Workflow Monitoring
• Provide an opportunity for GSoC students to understand distributed systems in action
• Scope for Research and Software Engineering
papers

Data Model
• Application Description
– User describes inputs and outputs of the application.
– Currently this information is captured in the Service Map Schema.
– This schema is stored in the Airavata Registry as XML. The schema utility also generates an application service WSDL from this schema using the Airavata WSDL Generator.

[Diagram: Airavata Server API.
 Actors: Application Developer, Workflow Developer.
 Interfaces: Application Registration UI (A1, A2), Web Based workflow composer (W1, W2).
 Operation labels: Service Map XML, Service Map to AWSDL, Get AWSDL.
 Components: Registry (Application Desc XML, Workflow XML), Workflow Interpreter (Execute & Manage Computations), Application Factory (Gfac) (Launch & Manage Jobs), Messaging Subsystem (Notify progress of job or workflow execution), Real-Time Monitoring]

A peek at one of the clusters
[Diagram labels: Interconnect, Nodes]

Scheduling ‘qsub’ batch jobs on the cluster
[Diagram: SGE MASTER node (Queues, Policies, Priorities, Share/Tickets, Resources, Users/Projects) holding JOB X, JOB Y, JOB Z, JOB O, JOB N, JOB U; worker nodes offering numbered slots in Queue-A, Queue-B, and Queue-C]
[Diagram: Resource Matching, Selection, Scheduling. Inputs: JOB, User, User policies (Groups, Roles, Departments, Projects), Job policies, Resources (System characteristics, System status)]

Simplified Gateway Architecture
• Step 0: One-time Gateway Community Setup
• Step 1: Gateway Authentication
• Step 2,3,…: Job Submit or File Transfer request; Output
[Diagram labels: Community Account, Grid Certificate, username, password, Gateway Interface, Gateway Server, Compute Servers]

[Diagram: science gateways shown with Apache Airavata 1.0 and Apache Airavata 2.0: CIPRES, ParamChem, GridChem, DES, BioVLab, NSG, POPLAR, UltraScan, VLAB]
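To make the Application Description from the Data Model slide more concrete, here is a minimal Python sketch of describing an application's inputs and outputs and serializing the description to XML. This is an illustration under stated assumptions: the element and attribute names (`application`, `input`, `output`, `name`, `type`) and the example application are hypothetical, and they are not the actual Airavata Service Map Schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    """One named, typed input or output of an application."""
    name: str
    type: str


@dataclass
class ApplicationDescription:
    """Hypothetical stand-in for what a user registers about an application."""
    name: str
    inputs: List[Parameter] = field(default_factory=list)
    outputs: List[Parameter] = field(default_factory=list)


def to_xml(desc: ApplicationDescription) -> str:
    """Serialize the description to an XML string (illustrative layout only)."""
    root = ET.Element("application", {"name": desc.name})
    for tag, params in (("input", desc.inputs), ("output", desc.outputs)):
        for p in params:
            ET.SubElement(root, tag, {"name": p.name, "type": p.type})
    return ET.tostring(root, encoding="unicode")


# Hypothetical example application with one input and one output.
desc = ApplicationDescription(
    name="echo",
    inputs=[Parameter("message", "String")],
    outputs=[Parameter("echoed", "String")],
)
xml_text = to_xml(desc)
print(xml_text)
```

A registry would store a document like `xml_text`, and a separate generator could read the same inputs and outputs to produce a service interface such as a WSDL. That generation step is not shown here.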