HPSearch and NaradaBrokering: Workflow Scripting and Stream Management Edinburgh December 3 2003 PTLIU Laboratory for Community Grids Geoffrey Fox, Harshawardhan S. Gadgil, Shrideep Pallickara Indiana University, Bloomington IN 47404 http://www.hpsearch.org http://www.naradabrokering.org gcf@indiana.edu Backdrop Workflow systems have several components Development Environment – Graphical User Interface Specification Language such as BPEL4WS Some interface between specification and runtime (compiler?) Run-time managing linkage of services, error handling and notification This project contributes to • Procedural specification of workflow • Stream management part of run-time Workflow is sufficiently complex that we ought to agree on a general architecture so we can each build parts and link together Comments on Standards In this talk, workflow is synonymous with “Programming the Grid/Internet” We never agreed on a programming model for the simpler case of “Programming a CPU” so not very likely we will agree on standards for workflow We did roughly agree on standards within a particular language Fortran v. Java v C++ v C# v Lisp We also had a little more agreement on common runtime than on languages but not complete What to remember about this talk All streams (flow between ports of a web service) are handled by publish-subscribe messaging infrastructure • Allows robust data transfer with adaptive routing e.g. allows use of GridFTP • Supports full concurrency inter and intra streams Data, Streams, Files, Web Services are manipulated by a scripting language analogous to Shell and Perl in UNIX http://www.hpsearch.org has the details Software will be included in open-source release of http://wwww.naradabrokering.org • NB Version 0.93 today; 1.0 February 04; 2.0 for SC04 includes HPSearch Scripting Environment I HPSearch is designed as a scripting interface to the Internet (Grid) using currently the Rhino implementation of Javascript • Could use Python, Perl • Called HPSearch because could access variables either by URI or by search interface (To Google Web Service) but this is not relevant here x = ‘wsdl:http://156.56.104.155:8080/axis/services/Calculator.jws/add(5, 6)‘; Or x = ‘wsdl:WSDL for WS/function(arguments)‘; Returns x=11 if function adds its arguments Could follow by y=x+1 setting y=12 etc. Can access any data in this fashion and support normal capabilities supported in most languages (set x=data and use I/O) • Perhaps prefer all I/O to go through Web Services Scripting Environment II So scripting environment can manipulate its own variables and methods as usual but can also invoke any web service with the wsdl: primitive xpath: primitive evaluates an XPath query against a local variable defined (say by return from a Web service) as an XML instance Can have multiple communicating scripting engines WS1 WS2 Script1 WS3 WS4 WS6 WS5 Script2 This is scripting control; workflow is between web services Script3 NaradaBrokering Computer Minicomputer Audio/Video Conferencing Client Modem Server Web Service B Peers NaradaBrokering Broker Network Queues Firewall Stream Workstation Laptop computer Peers Web Service A PDA Audio/Video Conferencing Client Grid Messaging Substrate Standard client-server style communication. SOAP+HTTP GridFTP RTP …. Consumer Service Substrate mediated communication removes Consumer transport protocol dependence. SOAP+HTTP GridFTP RTP …. Service Messaging Substrate Any Protocol satisfying QoS Protocols have become overloaded e.g. MUST use UDP for A/V latency requirements but MUSTn’t use UDP as firewall will not support ……… NaradaBrokering Based on a network of cooperating broker nodes • Cluster based architecture allows system to scale in size Originally designed to provide uniform software multicast to support real-time collaboration linked to publish-subscribe for asynchronous systems. Now has several core functions • Reliable order-preserving “Optimized” Message transport (based on performance measurement) in heterogeneous multi-link fashion with TCP, UDP, SSL, HTTP, and will add GridFTP • General publish-subscribe including JMS & JXTA and support for RTP-based audio/video conferencing • Distributed XML event selection using XPATH metaphor • QoS, Security profiles for sent and received messages • Interface with reliable storage for persistent events Laudable Features of NaradaBrokering Is open source http://www.naradabrokering.org Has end-point “plug-in” as well as standalone brokers Will have a discovery service to find nearest brokers and manage topics Does tunnel through many firewalls without requiring ports to be opened Supports JXTA, JMS (Java Message Service) and more powerful native mode Transit time < 1 millisecond per broker Initial version of setup and broker network administration module • Currently expect to use HPSearch scripts to specify setup NaradaBrokering Naturally Supports Filtering of events to support different client requirements (e.g,. PDA versus desktop, slow lines, different A/V codecs) Virtualization of addressing, routing, interfaces Federation and Mediation of multiple instances of Grid services as illustrated by • Composition of Gridlets into full Grids (Gridlets are single computers in P2P case) • JXTA with peer-group forming a Gridlet Monitoring of messages for Service management and general autonomic functions Fault tolerant data transport Virtual Private Grid with fine-grain Security model NaradaBrokering Communication Applications interface to NaradaBrokering through UserChannels which NB constructs as a set of links between NB Brokers acting as “waystations” which may need to be dynamically instantiated UserChannels have publish/subscribe semantics with XML topics Links implement a single conventional “data” protocol. • Interface to add new transport protocols within the Framework • Administrative channel negotiates the best available communication protocol for each link Different links can have different underlying transport implementations • Implementations in the current release include support for TCP,UDP, Multicast, SSL, RTP and HTTP. • GridFTP most interesting new protocol • Supports communication through proxies and firewalls such as iPlanet, Netscape, Apache, Microsoft ISA and Checkpoint. Manipulating Streams flow: primitive manages streams between Web services There is service-oriented workflow where streams are typically implicit. Here HPSearch supports UNIX style pipe and tee and we have trivial examples For stream-oriented, the streams are explicit. We have built a sophisticated system GlobalMMCS but it is currently not supported in HPSearch HPSearch will become control engine for NaradaBrokering when streams are “just” message flows on the Grid. Here one would use NB discovery services – find streams – and monitor • In this view a client talking to a Web Service is workflow HPSearch Flow Example // The input file x = "file:///u/hgadgil/datafile.txt"; // Reverses every line in the i/p e.g. abcd becomes dcba y1 = "156.56.104.155:5050"; // Computes the length of each line minus the last (\n or \r) y2 = "156.56.104.155:6060"; NaradaBrokering Queue // And finally the outputs... z1 = "file:///u/hgadgil/reversed.txt"; y1 z1 z2 = "file:///u/hgadgil/length.txt"; x `flow: x &> (y1 | z1), (y2 | z2)`; T Pipe Pipe y2 z2 Another Example `flow: x &> (y1|z1 &> p,(q|storage1)), (y2|z2|storage2)`; y1 z1 p q x y2 z2 storage1 storage2 NaradaBrokering Topic (Queue) Note this approach allows for example all workflow streams to use RMI, GridFTP, RTP – your or rather NaradaBrokering’s choice Stream–oriented Workflow As in audio-video conferencing and multimedia file delivery where it’s the media streams that are the “point” Services generate and transform streams but one thinks of streams going through services rather than services generating streams Multi-cast streams where video from one client sent to all other participants in a collaborative session common One thinks of a stream being published and participants subscribing to it. Pub/Sub Queue Publish Subscribe XGSP Web Service MCU Architecture Use Multiple Media servers to scale to many codecs and many versions of audio/video mixing; should allow all e-Scientists to be connected Session Server XGSP-based Control NB Scales as distributed Admire Web Services NaradaBrokering All Messaging SIP H323 Media Servers Filters High Performance (RTP) and XML/SOAP and .. Access Grid Gateways convert to uniform XGSP Messaging NaradaBrokering Native XGSP Service-oriented Workflow I As in follow of data between different simulation programs where one has a program (which becomes a Web service) view and data flow between programs often not explicitly interesting Elastic Dislocation Inversion Viscoelastic FEM Viscoelastic Layered BEM Elastic Dislocation Pattern Recognizers Fault Model BEM Service-oriented Workflow II Initial input and output files identified with perhaps a visualization as output In many implementations such as ours in earthquake example one writes and reads files for stream interface • Sometimes one wants the intermediate output files AVS and such visualization and image processing systems have such a model using streams Multicast not important per-se; use a publish/subscribe mechanism as it is fault-tolerant and higher performance and not because of multi-cast support Streams and Data Scripting engine can either define topics or find them out from NaradaBrokering discovery service Run-time ensures that all I/O goes through NaradaBrokering • Note one either uses a proxy or builds NaradaBrokering interface into Web service • Proxy should be near Web Service as only NaradaBrokering “guarantees” firewall penetration, fault-tolerance, performance • NaradaBrokering needs improved discovery system NaradaBrokering and Scripts are distributed so no central bottlenecks NaradaBrokering in practice One can “best” insert NaradaBrokering end-point interface into each client or web service NB Broker Network Client NB Endpt NB Streams WS NB Endpt But proxy model easiest for existing applications Client Proxy NB Endpt NB Broker Network NB Streams Proxy WS NB Endpt “Native Communication” – cannot use added value of NB including fault tolerance. Current GridFTP Implementation Entities in HPSearch Each Script is a Web Service Each Web Service, File, Web Page has a URI and can be accessed by a Script • HPSearch at its heart was URI’s bound to Javascript Publish/Subscribe system defines topics which are the URI of streams. Note syntax is often • topic://Session URI/stream1 with classic hierarchical labeling Scripts need discovery system to keep track of URI’s and in particular the session URI (which plays role of context) -currently this is same as NaradaBrokering Discovery System • Pub/Sub Streams typically support conversations with related streams topic://Session URI/stream1/WS-A and topic://Session URI/stream1/WS-B to allow Web services A and B to interact Publish/Subscribe Topics One has “data” which has perhaps an intrinsic URI For files and web pages, we have as well the location URL I think Publish/Subscribe topic is like the URI for streams and it is instantiated as a particular queue (or set of queues) in NB In NB Topics are integers (for performance), URI style or general XML instances Note that session topic can be thought of as “context” for messages sent to topic as it provides intrinsic information as to meaning of stream (cf. OGSI; WS-Addressing WSContext WS-Reliable Messaging and WS-Routing) Topics for streams and sessions virtualize destination, routing and context Role of Pub/Sub Queues One can think of N/B as providing an operating service to transmit streams between end-points with various value-added capabilities Messages are the units of a stream Events are messages with time-stamps (which could be absent); so events are messages and vice versa Streams are ordered collections of messages • NB manipulates streams and collections of streams • Delivery is guaranteed order preserving NB provides a virtual stream desktop which you can use to manipulate streams in same way you manipulate files in conventional O/S Multiple Input and Output Ports We can deal with Web Services with multiple input and output using an array notation but the &> Tee and | Pipe notation get clumsy So can use explicit notation such as • x.port[0].publish = NBTopicA; • y1.port[0].subscribe = NBTopicA; • y2.port[0].subscribe = NBTopicA; This would also be natural way of implementing stream-oriented workflow Errors and notifications would be easy in this syntax • notifyTOPIC = SessionTOPIC + ‘/notify’; • x.notify.publish = notifyTOPIC; • scriptasaWS.port[1].subscribe = notifyTOPIC; HPSearch Administrative Interface to NB One can build administrative policies and procedures by flowing administrative and monitoring data to appropriate scripting engines • • • • • • performanceTOPIC = SessionTOPIC + ‘/performance’; nbws = NBDiscover(“aggregateperformancews”) nbws.performancedata.publish = performanceTOPIC; scriptasaWS.port[2].subscribe = performanceTOPIC; Niftyperformanceanalyser(scriptasaWS.port[2]); ……. This example pipes performance data from NaradaBrokering and spawns some analysis NB provides for each link (broker to broker, broker to end-point) available bandwidth, used bandwidth, latency etc. Other NB Features to be added to HPSearch Full details of available Brokers and Stable storage Pending queue sizes Message statistics – size, number per second, time since since last message – at brokers and end-points Current stream sequence number at different parts of pipeline from source to destination Heartbeat Information Active Topics and list of publishers and subscribers (subject to security restrictions) Fault tolerance statistics including those subscribed end-points which are “down” Transit Delay (Milliseconds) Mean transit delay for message samples in NaradaBrokering: Different communication hops 9 8 7 6 5 4 3 2 1 0 hop-2 hop-3 hop-5 hop-7 100 1000 Message Payload Size (Bytes) Pentium-3, 1GHz, 256 MB RAM 100 Mbps LAN JRE 1.3 Linux Standard Deviation for message samples in NaradaBrokering Different communication hops - Internal Machines Standard Deviation (Milliseconds) 0.8 hop-2 hop-3 hop-5 hop-7 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 1000 1500 2000 2500 3000 3500 Message Payload Size (Bytes) 4000 4500 5000 Average delays per packet for 50 video-clients NaradaBrokering Avg=2.23 ms, JMF Avg=3.08 ms 60 NaradaBrokering-RTP JMF-RTP Delay (Milliseconds) 50 40 30 20 10 0 0 200 400 600 800 1000 1200 1400 1600 1800 2000 Packet Number Average jitter (std. dev) for 50 video clients. NaradaBrokering Avg=0.95 ms, JMF Avg=1.10 ms 8 NaradaBrokering-RTP JMF-RTP 7 Jitter (Milliseconds) 6 5 4 3 2 1 0 0 200 400 600 800 1000 1200 1400 1600 1800 2000 Packet Number