Integrated Scientific Workflow Management for the Emulab Network Testbed Eric Eide, Leigh Stoller, Tim Stack, Juliana Freire, and Jay Lepreau University of Utah, School of Computing USENIX 2006 / June 3, 2006 This Talk in One Slide z Current network testbeds z z …manage the “laboratory” …not the experimentation process. z → A big problem for large-scale activities! z Evolve Emulab for experiments based on scientific workflows z z Big mutual benefits: testbed ↔ workflow Work in progress 2 Example: UAV Simulation z A distributed, real-time application images → z Evaluate improvements to real-time middleware z z z z z vs. CPU load vs. network load 4 research groups x 19 experiments x 56 metrics alerts → UAV Receiver ATR ← images 3 Use Emulab write “ns” file Concept Experiment “swap in” Emulate 4 Problems Solved z I get machines! 328 PCs, and more z Time- & space-shared z Loads OS and software z z I get network! z z Config. topology & quality I get to collaborate! Available to researchers and educators worldwide z File storage, email, … z 5 Problems Not Solved z “Now what?” z Getting off the ground Run all my software z Add instrumentation z Collect all my data z Analyze it z z Scaling up 19 configurations z Automation z 6 More Problems Not Solved z “How did I get here?” z Over the short term… “Where are the results I got last week?” z “How did I get those results anyway?” z “What if…?” z z …and the long term Reproducing results z Reusing artifacts z 7 Idea: Scientific Workflow z Managing activities, inputs, and outputs is the job of a scientific workflow system z Our approach: evolve Emulab with integrated support for scientific workflows z z z Build on existing abstractions & mechanisms Resource focus → user & task focus Users work “within” and “across” experiments 8 Contributions z Address demand + opportunity z z z Advance the applicability of testbeds z z Users need to manage large-scale complexity A symbiotic combination: leverage and impact Not just Emulab — e.g., PlanetLab and DETER Advance scientific workflow systems z z Exploit testbed capabilities — e.g., “total control” Address testbed requirements — e.g., flexible use 9 Issue: Encapsulation z Current “experiment” model is not fully encapsulating z z z Topology + static events Need everything else! ns file packages inputs outputs Challenge: specification z z Complete and precise… …w/o huge user burden my software z OSes Approach: be automatic z z z E.g., track files used Snapshot, archive, restore User can refine “extent” NFS monitors Subversion repo. packet monitors datapository (DB) AJAX GUI research filesystems 10 Issue: Definition vs. Execution z Current “experiment” has multiple roles z z z Challenge: representing relationships z z z Definition The thing that you run Multiple runs of one setup Similar configurations Approach: a new model of experimentation z z Separate the roles Evolve the new abstractions 11 New Model z Template z Swapin z Experiment z Activity z Record n=2 n=4 12 Issue: History rev 1.1 z Research and educational plans are dynamic z By design & by discovery z Challenge: safe exploration z Fork z Back up z Approach: keep history & support temporal navigation z Keep template revisions z Track provenance z Locate, repeat, and reuse oops: need new measurements bigger nets what about loss? add params 13 Implementation in Progress Definition Data Analysis Execution & History 14 Conclusion z Large and powerful testbeds z z z Integrated workflow management can leverage the strengths of testbeds z z …enable complex and large-scale activities …lead to complex and large-scale workflow management problems Systems approach — and systems challenges → Better testbeds and workflow systems 15 http://www.emulab.net/ Thanks! Extra Slides After This Point 17