The Active Streams approach to adaptive distributed systems

Fabián E. Bustamante, Greg Eisenhauer, Karsten Schwan, and Patrick Widener
{fabianb,eisen,schwan,pmw}@cc.gatech.edu
Systems Research Group, College of Computing, Georgia Institute of Technology

Motivation

New networking technologies and an increasing number of network-capable end devices are giving rise to novel computing environments such as Grid and pervasive computing.
Building applications for these environments is complicated:
– Heterogeneity of nodes and network connectivity
– Dynamic changes in resource availability and demand
– The “right” application distribution depends on the circumstances

Active Streams approach

– Customizable services and dynamically extensible applications
– Dynamically adaptive applications and services
– Structured I/O: application-level, strongly typed records
– Component-based model and event-based integration
– Dynamic code generation for heterogeneity and high performance
– Continuous monitoring and adaptation

Adaptation through attachment

Application/service evolution, and/or a coarse form of adaptation, is obtained through the attachment and detachment of streamlets that operate on data streams and change their properties.
[Diagram labels: A, B, C, Client, Server]

A simple example of adaptation through streamlet attachment is the application-specific filtering of data-stream contents to handle a sudden drop in available bandwidth.
[Diagram labels: A, B, C, S, Client, Server, Active Stream Node]

Adaptation through parameterization

Finer-grained adaptation involves tuning a streamlet’s behavior through parameters that are updated remotely via a push-type operation.
[Diagram labels: A, B, S, Client, Server, Active Stream Node, parameter block, parameter block update]
Parameterization examples: image resolution, zoom, atmospheric level, longitude/latitude, etc.
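As a rough illustration of parameterization, the sketch below shows a streamlet whose behavior is tuned through a parameter block. Everything named here (`sample_record`, `param_block`, `param_block_update`, `downsample_streamlet`, and the `stride` parameter) is hypothetical and not part of the Active Streams API; the remote push is modeled as a plain local function call, and the stride stands in for a setting such as image resolution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type; the slides do not define one. */
typedef struct {
    double values[8];
    size_t count;
} sample_record;

/* Hypothetical parameter block holding the streamlet's tunable settings. */
typedef struct {
    size_t stride; /* keep every stride-th sample; 1 keeps everything */
} param_block;

/* Models the push-type update: the client overwrites the block's contents. */
void param_block_update(param_block *pb, size_t new_stride)
{
    assert(new_stride > 0); /* a zero stride would never advance the loop */
    pb->stride = new_stride;
}

/* Streamlet body: copies every stride-th value from input to output.
 * Returns 1 to submit the output record, as in the slides' example. */
int downsample_streamlet(const sample_record *input, sample_record *output,
                         const param_block *pb)
{
    size_t i;
    output->count = 0;
    for (i = 0; i < input->count; i = i + pb->stride) {
        output->values[output->count] = input->values[i];
        output->count = output->count + 1;
    }
    return 1;
}
```

The streamlet code itself never changes; only the parameter block does, which is what makes this a finer-grained adjustment than attaching or detaching a streamlet.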
Adaptation through redeployment

Finer-grained adaptation can also be attained by dynamically redeploying streamlets to approach an “optimal” placement according to monitoring information.
[Diagram labels: A, B, S, Client, Server, Active Stream Node, streamlet migration]

Streamlet example

Streamlets are the basic units of composition in Active Streams. This streamlet computes and returns the average of its input array:

    {
        int i;
        int j;
        double sum = 0.0;
        for (i = 0; i < MAXI; i = i + 1) {
            for (j = 0; j < MAXJ; j = j + 1) {
                sum = sum + input.array[i][j];
            }
        }
        output.avg_array = sum / (MAXI * MAXJ);
        return 1; /* submit record */
    }

To deal with heterogeneity, portable streamlets can be written in ECL, a subset of a general procedural language (C), and a native version of a given streamlet's code can be dynamically generated at the destination. We support dynamic code generation for MIPS, Alpha, SPARC, and x86 processors.

Active Streams framework

Participating nodes make themselves available by running as Active Streams Nodes (ASNs), where each ASN provides a well-defined environment for streamlet execution.
Streamlets can be obtained from a number of locations; they can be downloaded from clients or retrieved from a streamlet repository.
Essential for introspection is a customizable active resource monitoring system.
Provides the “glue” that holds the framework together.
The framework relies on ECho, our high-performance event infrastructure, for integration and general communication.
[Diagram labels: Streamlet (three instances), ECho]

Latency effects of specialization

End-to-end latency for 20k messages with different degrees of client specialization at the source. Each subfigure corresponds to a different message size (clockwise): 100 KB, 10 KB, 1 KB, and 100 B. The y axis represents the specialization scenarios: (1) without specialization, (2) with an “identity” streamlet, and (3), (4), and (5) “filtering out” 30, 60, and 90% of the stream content.
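The specialization scenarios can be pictured with a minimal sketch like the one below. The slides do not show the streamlets used in the measurements, so this is an assumption-laden illustration: `msg_record`, `identity_streamlet`, `filter_streamlet`, and `count_submitted` are hypothetical names, records are dropped deterministically by sequence number, and a return value of 0 is assumed to mean "discard the record" (the slides only show that returning 1 submits it).

```c
#include <assert.h>

/* Hypothetical record type carrying a non-negative sequence number. */
typedef struct {
    long seq;
} msg_record;

/* Scenario (2): an "identity" streamlet forwards every record unchanged. */
int identity_streamlet(const msg_record *input, msg_record *output)
{
    *output = *input;
    return 1; /* submit record */
}

/* Scenarios (3)-(5): drop drop_percent of the records.
 * Out of every 100 consecutive sequence numbers, the first
 * drop_percent are discarded and the rest are forwarded. */
int filter_streamlet(const msg_record *input, msg_record *output,
                     int drop_percent)
{
    assert(drop_percent >= 0 && drop_percent <= 100);
    if (input->seq % 100 < drop_percent) {
        return 0; /* assumption: 0 discards the record */
    }
    *output = *input;
    return 1; /* submit record */
}

/* Helper: how many of the records 0..n-1 pass the filter. */
long count_submitted(long n, int drop_percent)
{
    long seq;
    long submitted = 0;
    msg_record in;
    msg_record out;
    for (seq = 0; seq < n; seq = seq + 1) {
        in.seq = seq;
        submitted = submitted + filter_streamlet(&in, &out, drop_percent);
    }
    return submitted;
}
```

With 20k messages, a 30% filter in this sketch forwards 14,000 records and a 90% filter forwards 2,000. Dropping by sequence number keeps the example reproducible; an application-specific filter would instead inspect the record's contents.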
Shares of server/ASN utilization

System/user shares of server/ASN utilization when transmitting 20k 10 KB messages with different degrees of specialization. The y axis represents the specialization scenarios: (1) without specialization, (2) with an “identity” streamlet, and (3), (4), and (5) “filtering out” 30, 60, and 90% of the stream content.
Figure annotations:
– Effect of dropping messages: avoiding the TCP/IP stack, etc.
– Load increase from executing the streamlet.

Server/ASN load with specialization

Percentage of server/ASN utilization when transmitting 20k messages with different degrees of specialization. Each subfigure corresponds to a different message size (clockwise): 100 KB, 10 KB, 1 KB, and 100 B. The y axis represents the specialization scenarios: (1) without specialization, (2) with an “identity” streamlet, and (3), (4), and (5) “filtering out” 30, 60, and 90% of the stream content.

Intermediate nodes and redeployment

Adaptation at the endpoints of a connection is not enough.
Percentage of server CPU utilization contrasted with end-to-end latency when transmitting 20k 1 KB messages to 30 clients, with a varying number of them specializing the stream at the source.
Figure annotations: one marker signals the point beyond which placing streamlets at this node brings no additional benefit; another indicates the point of diminishing returns.

Status and ongoing work

– Design and prototype implementation finished
– First public release in December 2001
– How best to decompose applications and services with Active Streams?
– What are useful redeployment algorithms and heuristics?
– What OS-type services should Active Streams Nodes provide?