Multiple Views of Workflow through ICENI
John Darlington, Nathalie Furmento, William Lee, Anthony Mayer, Steve McGough, Steven Newhouse
London e-Science Centre

Different Views of “Workflow”: Aggregation of Elementary Tasks
• Loosely defined:
  – “Programming the Grid”
  – Composition of Services
  – Scripting languages
• … more tightly defined:
  – Ordering of Execution (often DAGs)
  – Ordering of Communication
similar in purpose… but certainly not identical in semantics

Interactive vs Batch Grid Services
• Batch:
  – Execution time occurs after design time.
  – Need to express ordering of actions.
• Interactive:
  – Execution time and design time overlap: execution time passes while one designs.
  – Expressing temporal ordering is tricky.

Expressions of Aggregation
• Spatial Relationship: A interacts with B
• Temporal Relationship: A then B (do A ; do B)
Application = A B

Spatial Expressions
• Defined (observed) as: interactions are NOT temporally ordered
• Implicit assumption: entities may be concurrent and do not occupy the same resources
• Usually dataflow relations
• Multiple models of computation
• Used when considering components and composition
• Usually graphical, certainly declarative

Temporal Expressions
• Defined (observed) as: interactions with temporal ordering
• Previous assumptions are relaxed
  – Entities could occupy the same space (as long as it’s at a different time)
• Used when considering activities, tasks
• Workflow, imperative
• Usually textual, procedural

ICENI: Imperial College e-Science Network Infrastructure
• Integrated Grid Middleware Solution
• Interoperability between architectures, APIs
• Added value layer to other middleware
• Usability: Interactive Grid Workflows
• Deployment: Complete Install from Webstart
• Role and policy driven security
• Foundation for higher-level Services and Autonomous Composition
• ICENI Open Source licence (extended SISSL)
http://www.lesc.ic.ac.uk/iceni/
ICENI Release 1.0 available for download

The ICENI Stack
Higher Level Services
Client Side Tools: Netbeans / Portal
Runtime Component Framework
Execution Mechanism
Security Layer
Service Oriented Architecture
OGSA Gateway
Domain & Identity Management
Service API / Discovery API / Core API
Implementation Fabric: Jini / Jxta / OGSI

Augmented Component Programming Model
• Meaning: how components can be linked together
• Behaviour: how they interact with each other
• Implementation: how they will perform on different resources
[Diagram labels: Linear Solver, Jacobi, LU, Matrix, Vector, Pull Model, Push Model, Parallel LU, Sequential LU]

Netbeans GUI Plug In: Drag & Drop Composition

Focus on Usability: ICENI Portal

Utilising the Spatial View: Some advantages…
1. Raising the level of abstraction
2. Semantic adaptation
3. Interactivity through dynamic composition

ICENI: Utilising the Spatial View (1): Raising the level of abstraction
[Diagram labels: Collaborations, Composition, CDL: Meaning, Behavioural Annotations, Implementation Annotations]
• Implementations may be resource specific
• High level of abstraction means we can select implementation
• More options for scheduler!

ICENI: Utilising the Spatial View (2): Semantic Adaptation
[Diagram labels: Metadata Space, Semantic Matching Service, User, Semantic Description requirement, Adaptation Service, Adaptation Proxy]
1. publish
2. Gather metadata
3. Produce Adaptation Proxy
4. Invoke service
1.
publish: Grid Service, Grid Service, Grid Service

ICENI: Utilising the Spatial View (3): Dynamic Composition
• Drag-and-drop running component
• Register as running component services in the NetBeans user interface
• Add new advertised components
• Execute to create new component instances and connect to application
[Diagram labels: Deployed application, Application, Visualisation Server]

Delivering e-Science: Case Study of Interactive Dataflows
LB3D – (Lattice-Boltzmann 3D)
ICENI provides collaborative visualisation and steering across the Access Grid

Capturing the Information: Workflow from Dataflow
[Diagram: components A and B]
• Each component has its own thread
• A and B are partners in the transaction
• A does work, then communicates with its partner by sending a message
• B does nothing…
• … until it receives control
• … and subsequently does work
• Data is worked on by both partners

Utilisation of Temporal View: Informed Scheduling (1)
[Diagram: A, B and shared data M]
• Sequence: A then B
• Execution Time: T(A) + T(B)
• Data Constraint: M is shared

Utilisation of Temporal View: Informed Scheduling (2)
[Diagram: A and B]
• Both components share a single resource
• Opportunities to interleave components on resources
Model of Execution time + Individual Performance of Activities = Composite Model of Application

ICENI: Temporal Expression
• Annotations component-by-component
• Inferred temporal ordering
• Information provided to scheduler
[Diagram labels: Spatial Composition, Behavioural Annotations, Implementation Annotations, Temporal Expression: Workflow, Scheduling Services]

ICENI Workflow: Elementary Units
• Activity
  – Description
  – Duration
  – Resource Usage
• Receive
  – Port
• Send
  – Port
• Start
• Stop

ICENI Workflow: Splits & Joins
• And Split
• Or Split
  – Conditions
• And Join
• Or Join
Workflow graph built recursively
[Diagram nodes: Start, Activity, Send, Receive, Activity, Stop]

Why graphs?
• Lot of theory
  – liveness properties
  – deadlock
• Looks good
  – Feedback to user is important
• Underlying is XML
  – Experiments with modifications to existing workflow languages
• Could (easily?)
use a scripting language

Netbeans GUI (workflow view)

Information from the temporal view
[Diagram nodes: Start, Activity, And Split, then two branches each with Send, Receive, Activity, Stop]
• Width: Resource Usage
• Length: Execution Time

Loops!
[Diagram labels: Start, Activity: Generate, Or Join, Receive, Activity: Sort, Send, Receive, Activity: Process, Or Split, Stop; Generate, Sorted List, Process List]

… not the end of the world
• Loop detection eased by join nodes
• Performance Model in terms of loop terms: double iteration . term
• <iteration>s are not ordered, so terms are not ordered in general
• But where there is a single dominant <iteration>, we can produce an ordering on its factor

Useful information despite loops
• Performance models used for comparison
  – implementation selection
  – resource selection
• So in our example: T(Generate) + y . (T(Sort) + T(Process)), thus we can order implementations on (T(Sort) + T(Process))
• Surprisingly common!

Delivering e-Science: Case Study of Iterative Workflows
GENIE – (Grid ENabled Integrated Earth system model)
Has already produced major scientific results, showing the fragility of the freshwater transport system.
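
Returning to the loop performance model from "Useful information despite loops": the sketch below illustrates how implementations might be ordered on the loop factor (T(Sort) + T(Process)) when the iteration count y is unknown. It is not ICENI code. The class `LoopModel`, the function `rank_by_loop_factor`, and all timings are hypothetical, and the sketch assumes every candidate shares the same fixed term T(Generate).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopModel:
    """Execution-time model of the form fixed + y * per_iteration."""

    fixed: float          # time outside the loop, e.g. T(Generate)
    per_iteration: float  # time per loop pass, e.g. T(Sort) + T(Process)

    def total(self, iterations: int) -> float:
        """Evaluate the model once the iteration count y is known."""
        return self.fixed + iterations * self.per_iteration


def rank_by_loop_factor(candidates: dict[str, LoopModel]) -> list[str]:
    """Order candidate names by per-iteration cost, cheapest first."""
    return sorted(candidates, key=lambda name: candidates[name].per_iteration)


# Hypothetical timings in seconds; these are not measurements from ICENI.
T_GENERATE = 2.0
candidates = {
    "sequential": LoopModel(T_GENERATE, per_iteration=5.0 + 3.0),
    "parallel": LoopModel(T_GENERATE, per_iteration=1.5 + 3.0),
}
ranking = rank_by_loop_factor(candidates)
```

Because the fixed term is the same for both candidates, the ranking on the loop factor holds for any y > 0, so a scheduler could compare implementations without knowing how many times the loop runs.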
Example: GENIE Spatial Composition
[Diagram components: Setup, Control, Atmos, Integrate, Sea Ice, Display, Land, Ocean]

Example: GENIE Inferred Temporal Information
[Diagram nodes: Setup, Control, Integrate, <S>, Sea Ice, Atmos, Ocean, <F>, Display]

Future Developments…
• More user involvement in the behavioural descriptions (control flow editing)
  – Not too hard
• Looking at data concerns (protocols etc)
  – Harder
• Multiple forms of expression
  – Scripting, GUI tools, portals
  – Standards for reuse, sharing
  – Technically easier, but requires community effort (shockingly hard)

Summary: Separation of Concerns
– Capturing user requirements: spatial view
  • Allows dataflow models of computation
  • Interactive applications
  • High level of abstraction
– Mapping application to grid: temporal view
  • Workflow
  • Performance Modelling
  • Optimised Scheduling
– ICENI
  • infers temporal from spatial
  • by exploiting component meta-data

Development Infrastructure
• Project Website & mailing lists
• Daily build
  – Regression tests
  – On success binaries updated
  – Regenerated JavaDoc
  – Deployment tests
• CVS
  – Code split across multiple repositories & modules
• Documentation, manuals & user guides

Acknowledgements
London e-Science Centre
• Director: Professor John Darlington
• Technical Director: Dr Steven Newhouse
• Research Staff:
  – Anthony Mayer, Nathalie Furmento
  – Stephen McGough, James Stanton
  – Yong Xie, William Lee
  – Marko Krznaric, Murtaza Gulamali
  – Asif Saleem, Laurie Young, Gary Kong
• Support Staff:
  – Keith Sephton (Systems Manager)
  – Oliver Jevons (Operations Manager)
  – Susan Brookes (Administrative Assistant)

ICENI: An integrated Grid Middleware
http://www.lesc.ic.ac.uk/iceni/
ICENI Release 1.0 available!
ICENI Open Source licence (extended SISSL)