Multiple Bypass: Interposition agents for distributed computing
Douglas Thain and Miron Livny
Computer Sciences Department, University of Wisconsin-Madison
{thain|miron}@cs.wisc.edu
http://www.cs.wisc.edu/condor/bypass

Overview
› Good news and bad news.
› Our solution: Bypass
› Three simple (but useful) examples
› Problems:
  • Impedance Matching
  • Composition
› Related and Future Work

www.cs.wisc.edu/condor

Good News!
New distributed systems give you access to untold computing resources around the world.

Bad News
Your programs won’t run on them.

[Figure: remote machine and home machine; labels: “HELP!”, “core dumped”]

Why not?
› Interface mismatch:
  • open() != OpenFile()
  • open() != super_duper_open()
› Resource mismatch:
  • open("datafile") -> “doesn’t exist!”
  • open("output") -> “no space for you!”
  • getpwnam("thain") -> “who is that?”

Just rewrite your programs!
› Not possible:
  • Commercial application
  • Don’t know how!
  • Unwilling to spend time/money to achieve uncertain benefits.
  • N programs * M systems = not scalable

Solution: Interposition Agent
› An agent can solve an interface mismatch by converting the application’s operations into those provided by the available system.
› An agent can solve a resource mismatch by sending the application’s operations to be executed elsewhere: split execution.

Solution to Interface Mismatch
[Diagram: Application -> open() -> Agent -> super_duper_name_lookup(), super_duper_open() -> Super-Duper Library]

Solution to Resource Mismatch
[Diagram: remote machine stack: Application, Agent, Standard Lib, Kernel; home machine stack: Shadow, Standard Lib, Kernel; Agent and Shadow connected via RPC]

Just like home!
[Figure: remote machine and home machine]

Interposition Agents are an Open Research Topic
› Several systems have been built, each with various strengths and weaknesses.
› What is the appropriate mechanism?
› What are the semantics of stacking?
› Interesting problems result when we do “impedance matching”.

Split Execution is an Open Research Topic
› We want to explore many possibilities:
  • Remote machine has some needed resources, but not all.
  • Data may be buffered and cached at both the agent and the shadow.
  • What procedure calls to trap depends on the application and the services needed.
  • Some procedure calls could be routed to third parties such as file servers.
  • …

Split Execution is Hard
› One example of many: Trapping stat()
  • Different data types:
    • struct stat, struct stat64
    • Depending on system, integer elements are 2 to 8 bytes
  • Multiple entry points:
    • stat, _stat, __libc_stat
  • Surprises:
    • #define stat(a,b) _fxstat(VERSION,a,b)

Is this new?
› Several previous systems have built gigantic and ambitious agents to virtualize the entire UNIX interface:
  • Condor
  • MOSIX
  • GLUnix
  • Legion

These systems work, but...
› They never cover all of the features.
  • e.g. memory-mapped files
› They combine unrelated features.
  • e.g. checkpointing and remote file access
› They are difficult to customize to new classes of applications.
  • e.g. ORCA needs network access, remote stdio, but not full remote file access.

Our Vision:
› We want...
  • to create agents in a language independent of their (ugly) implementation.
  • to create simple agents that are small enough to be understood and debugged.
  • to compose simple agents together into larger agents that do no more (and no less) than what is needed.

Overview
› Good news and bad news.
› Our solution: Bypass
› Three simple (but useful) examples
› Problems:
  • Impedance Matching
  • Composition
› Related and Future Work

Our Solution: Bypass
› Bypass takes a specification of a split execution system and produces a matched shadow and agent.
› Building only an agent is a subset of this ability.
› Bypass hides all of the ugly details of trapping, type conversion, and RPCs.

Bypass allows you to...
› ...split any dynamically-linked application.
› ...transparently use heterogeneous systems.
› ...trap calls with minimal overhead.
› ...control execution paths with plain C.
› ...combine small agents in interesting ways.

Bypass Language
› Declare what procedures to trap in C++
› Annotate pointer types with data flow.
  • Direction: in, out, or in out
  • Binary data: give expression yielding the number of bytes to send/receive.
› Give two function bodies:
  • agent_action
  • shadow_action

    ssize_t write( int fd, in opaque "length" const void *data, size_t length )
    agent_action
    {{
        if( fd<3 ) {
            return bypass_shadow_write(fd,data,length);
        } else {
            return write(fd,data,length);
        }
    }}
    shadow_action
    {{
        return write(fd,data,length);
    }};

Agent Action
› Any arbitrary C++ code.
› When the program invokes write(), the agent_action is executed at the foreign machine.
› Within the agent_action:
  • write() - Invoke the original write() at the foreign machine.
  • bypass_shadow_write() - Invoke the shadow_action via RPC.

Shadow Action
› Any arbitrary C++ code.
› If the agent decides to invoke the RPC to the shadow, the shadow_action is executed at the home machine.
› Within the shadow_action:
  • write() - Invoke write() at the home machine.

Using Bypass
› Run "bypass" to read the specification and produce C++ source code:
  • % bypass -agent -shadow simple.bypass
› The shadow is compiled into a plain executable.
› The agent is compiled into a shared library.

Using Bypass
› The dynamic linker is used to force the agent into an executable at run-time:
  • setenv LD_PRELOAD simple_agent.so
› Procedure calls are “trapped” merely by putting the agent first in the link list.
› This method can be used on any dynamically-linked program: tcsh, netscape, emacs…

Shadow Features
› Multiple configurations:
  • One shadow, one agent
  • New process per incoming agent
  • New thread per incoming agent
› Tracing of calls actually executed
› Authentication:
  • Trivial: Hostname
  • Secure: Globus GAA, X509 identities

Bypass can be used by Real Users!
› Bypass works on unmodified executables. (Real Users are not willing/able to rewrite/recompile their programs.)
› Bypass requires no special privileges. (Real Users do not have the root password.)
› Thus, Bypass allows a Real User to make good use of a remote machine without begging the administrator to configure it to his/her needs.

Performance
› Overhead of trapping a system call is very small: 1-9 µs
  • The "trapping mechanism" simply interposes a few extra function calls.
  • Small compared to the expense of a real system call (about 10-70 µs)
› Remote procedure calls are, as expected, much slower: about 1 ms under the best conditions.

Overview
› Good news and bad news.
› Our solution: Bypass
› Three simple (but useful) examples
› Problems:
  • Impedance Matching
  • Composition
› Related and Future Work

Example One: Remote Console
› Trap only read and write, and send operations on standard files back to a single shadow process.
    int read( int fd, out opaque "length" void *data, int length )
    agent_action
    {{
        if( fd<3 ) {
            return bypass_shadow_read(fd,data,length);
        } else {
            return read(fd,data,length);
        }
    }}
    shadow_action
    {{
        return read(fd,data,length);
    }};

Remote Console
[Diagram: three foreign machines, each with a stack of Appl, Agent, Standard Lib, Kernel, send standard I/O reads and writes to a single Shadow (above Standard Lib and Kernel) on the home machine]

Example Two: Attach New Filesystem
› Trap standard I/O calls and replace them with calls to a user-level filesystem library, such as Globus GASS.

    int open( in string const char *path, int flags, int mode )
    agent_action
    {{
        return globus_gass_open( path, flags, mode );
    }};

    int close( int fd )
    agent_action
    {{
        return globus_gass_close( fd );
    }};

[Diagram: Application -> (open, close) -> POSIX to GASS Agent -> (open, read, write, close) -> Standard Library Layer]
› Application attempts a plain POSIX open().
› Globus GASS does a variety of system calls to perform strong authentication, remote file access, caching, etc…

Example Three: Instrumentation

    agent_prologue
    {{
        static int bytes_read=0;
        static int bytes_written=0;
    }};

    int read( int fd, out opaque "length" void *data, int length )
    agent_action
    {{
        int result;
        result = read(fd,data,length);
        if(result>0) bytes_read += result;
        return result;
    }};

    /** Definition for write is very similar **/

Example Three: Instrumentation Cont.

    void exit( int status )
    agent_action
    {{
        printf("NOTICE: %d bytes read, %d bytes written\n", bytes_read, bytes_written);
        exit(status);
    }};

[Diagram: Application -> (read, write, exit) -> Measurement Agent -> (read, write, exit) -> Standard Library Layer]

Overview
› Good news and bad news.
› Our solution: Bypass
› Three simple (but useful) examples
› Problems:
  • Impedance Matching
  • Composition
› Related and Future Work

Problem One: Impedance Matching
› An agent may not be able to transform operations from a layer above to a layer below.
› Example: Globus GASS provides an equivalent for open() and close(), but not for stat().

Possible Solutions:
› Be honest. Make stat() fail: “not supported”
› Be evasive. Find some way to serve the request indirectly.
› Be dishonest. Conjure up a complete lie about the file.

What to do?
› We need not come up with a universally applicable solution: we are building small, interchangeable software.
› Consider why the application uses stat:
  • to see if the file exists.
  • to test permission to access it.
  • to find out the best block size.
  • to get its size before creating a buffer.
  • to report meta-data to the user.

Should I be honest?
› Cause stat() to fail: “not supported”
› Occasionally works!
› If the application only needs a hint such as block size, it might fall back on a default.
› Example: Sometimes a big malloc() calls mmap() to get a new segment. If that fails, fall back on brk().
› Fails in many contexts: “not supported” is often interpreted as “permission denied.”

Should I be evasive?
› Open the file, fstat() it, then close it.
› Almost always preserves the correct semantics.
› May break application’s assumptions.
  • stat() is assumed to be quite cheap.
  • open() through GASS or other storage system may incur huge delays as the entire file is pulled in.
› In this example, GASS caches recently used files. This solution is good if the application only stat()s files it intends to read anyway.

Should I be dishonest?
› Return very permissive information:
  • read/write/execute by anyone
  • block size is 4K
  • owned by you
  • file is 4GB big
› Almost always works!
› (Not sufficient to implement “ls -l”)

Why is dishonesty the best policy?
› The results from stat are (almost) universally used as hints.
  • First check permissions, then open.
  • First check size, then read data.
› In both cases, the situation may change, so the application must check for error conditions anyway.

Problem Two: Composition
› Bypass allows agents to be composed together: simply preload them all together.
› How do procedure calls bind to procedure definitions?
› Previous agent systems have proposed such rules, but do not explore their ramifications.

Rules of Composition
› 1: The process maintains a pointer to an active layer. The topmost layer is the initial active layer.
› 2: A call to a trapped procedure resolves to the highest definition found below the active layer.
› 3: After resolving, but before invoking, the active layer is lowered to that of the callee. Before returning, the active layer is restored to that of the caller.
› 4: Calls to untrapped procedures do not consult or change the active layer.

Practical Interpretation
› A layer is only capable of invoking those below it.
› A layer can only be invoked by those above.
› Why? Strict layering creates order from chaos. Without it, measurement is not possible.

Example: Measure above GASS
› Notice: calls only propagate down.
› Measurement layer only traps those operations actually attempted by the application.
[Diagram: layer stack, top to bottom: Application Layer, Measurement Layer, POSIX to GASS Layer, Standard Library Layer; arrows labeled read, write, open, close, exit]

Example: GASS above Measure
› Again: calls only propagate down.
› Measurement layer catches the resources consumed by both layers together.
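The composition rules can be modeled in a few lines of plain C. The sketch below is a toy model only, not how Bypass binds symbols (Bypass relies on the dynamic linker). The layer table, call_write(), measure_write(), and stdlib_write() are hypothetical names invented for this illustration, and only one trapped procedure ("write") is modeled. Rule 4 is not modeled, since untrapped procedures never touch the active layer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-layer stack for illustration:
   0 = application (top), 1 = measurement agent, 2 = standard library. */
#define NLAYERS 3

typedef long (*write_fn)(long nbytes);

static long measure_write(long nbytes);
static long stdlib_write(long nbytes);

/* Per-layer definition of "write"; NULL means the layer does not trap it. */
static write_fn write_defs[NLAYERS] = { NULL, measure_write, stdlib_write };

/* Rule 1: the topmost layer is the initial active layer. */
static int active_layer = 0;

static long bytes_measured = 0;   /* counted by the measurement layer */
static long bytes_delivered = 0;  /* what the bottom layer actually saw */

/* Rules 2 and 3: resolve to the highest definition below the active layer,
   lower the active layer for the duration of the call, then restore it. */
long call_write(long nbytes)
{
    for (int layer = active_layer + 1; layer < NLAYERS; layer++) {
        if (write_defs[layer] != NULL) {
            int caller = active_layer;
            active_layer = layer;
            long result = write_defs[layer](nbytes);
            active_layer = caller;
            return result;
        }
    }
    return -1;  /* no definition exists below the active layer */
}

/* Measurement layer: forwards the call downward and counts the result.
   Because the active layer is 1 here, the nested call binds to layer 2,
   never back to this function. */
static long measure_write(long nbytes)
{
    long result = call_write(nbytes);
    if (result > 0)
        bytes_measured += result;
    return result;
}

/* Bottom layer: stands in for the real write(). */
static long stdlib_write(long nbytes)
{
    bytes_delivered += nbytes;
    return nbytes;
}
```

Calling call_write() from the application layer passes through the measurement layer exactly once and then reaches the bottom layer, which is the "calls only propagate down" behavior described above.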
[Diagram: layer stack, top to bottom: Application Layer, POSIX to GASS Layer, Measurement Layer, Standard Library Layer; arrows labeled open, close, read, write, exit]

Example: Third Party Function
› printf is a third party function: it is not trapped by a layer.
› It contains a write, so where does it bind?
› It binds to the layer below that of the caller.
[Diagram: layer stack, top to bottom: Application Layer, Agent Layer, Standard Library Layer; labels: write, write, printf]

Others Have Chosen Different Rules
› Mediating Connectors:
  • Layer may invoke either the layer below, or start again at the topmost.
  • Disjoint layers may commute.
› We disagree:
  • If you can re-invoke at top, it is not possible to build a sensible measuring agent.
  • Careful with “disjoint”: GASS and measurement layers appear to be disjoint, but they do not commute.

A Layered Remote Execution System
[Diagram: remote machine stack: Application, Measurement, POSIX to GASS, Remote I/O, Measurement, Standard Lib, Kernel; home machine stack: Shadow, Standard Lib, Kernel; the two sides are connected via RPC]

Overview
› Good news and bad news.
› Our solution: Bypass
› Three simple (but useful) examples
› Problems:
  • Impedance Matching
  • Composition
› Related and Future Work

Related Work
› “Classic” RPC and XDR:
  • Define standard integer sizes, endianness, etc.
  • Start by defining external protocol, then produce programming interface which is not always convenient:
    • struct read_results * read_1( int fd, int length );

Related Work
› Bypass:
  • We are stuck with existing interfaces, so annotate them to produce a protocol:
    • int read( int fd, out opaque "length" void *data, int length );
  • Do “best effort” conversion to/from external data format:
    • off_t is 4 bytes on some platforms, 8 bytes on others.
    • A conversion might fail!
  • Define canonical values for source-level symbols:
    • O_CREAT has different values on Linux and Solaris!
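The two conversion problems named above can be sketched in plain C. This is a minimal illustration under stated assumptions, not the code Bypass generates: the canonical flag values (CANON_O_*) and both function names are hypothetical, since the slides do not give the actual wire values.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical canonical (wire) values for open() flags.
   These numbers are assumptions made for this example. */
enum {
    CANON_O_RDONLY = 0x0,
    CANON_O_WRONLY = 0x1,
    CANON_O_RDWR   = 0x2,
    CANON_O_CREAT  = 0x100,
    CANON_O_TRUNC  = 0x200,
    CANON_O_APPEND = 0x400
};

/* Translate canonical flags into this platform's own O_* values.
   Returns -1 with errno = EINVAL if the input has bits we cannot map. */
int canon_to_local_flags(int canon)
{
    const int known = 0x3 | CANON_O_CREAT | CANON_O_TRUNC | CANON_O_APPEND;
    int local;

    if (canon & ~known) {
        errno = EINVAL;
        return -1;
    }

    switch (canon & 0x3) {
    case CANON_O_RDONLY: local = O_RDONLY; break;
    case CANON_O_WRONLY: local = O_WRONLY; break;
    case CANON_O_RDWR:   local = O_RDWR;   break;
    default:
        errno = EINVAL;
        return -1;
    }

    if (canon & CANON_O_CREAT)  local |= O_CREAT;
    if (canon & CANON_O_TRUNC)  local |= O_TRUNC;
    if (canon & CANON_O_APPEND) local |= O_APPEND;
    return local;
}

/* Best-effort conversion of a 64-bit wire offset into the local off_t.
   Returns 0 on success, or -1 with errno = EOVERFLOW when the value does
   not survive a round trip (possible where off_t is only 4 bytes).
   The result of an out-of-range cast is implementation-defined; common
   compilers wrap, which the round-trip comparison detects. */
int wire_to_local_off(int64_t wire, off_t *out)
{
    assert(out != NULL);

    off_t narrowed = (off_t) wire;
    if ((int64_t) narrowed != wire) {
        errno = EOVERFLOW;
        return -1;
    }
    *out = narrowed;
    return 0;
}
```

The sender would apply the inverse mapping before transmission, so that both sides agree on the wire values regardless of what O_CREAT happens to be locally. Rejecting unknown bits rather than dropping them keeps a failed conversion visible to the caller.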
Related Work
› Hunt and Brubacher, “Detours”
  • Trap library calls on NT using binary rewriting; can be applied to any executable.
  • Make original procedure available through special “trampoline” call.
  • Bypass leaves the original entry point intact, so subroutines need not be re-written to use the trampoline.

Related Work
› Alexandrov, et al., “UFO”
  • Use a kernel-level facility to trap all of a process’ system calls and translate some of them into WWW operations.
  • The kernel mechanism is secure and can be applied to any process.
  • But… it has a high (7x) trapping overhead and cannot be applied to procedures that are not true system calls.

Related Work
› Bypass:
  • Trapping overhead is very small, and trapping can be performed on procedures that are not necessarily system calls.
  • But… can only be applied to dynamically-linked executables, and is not suitable as a security mechanism.

Related/Future Work
› A complete remote execution system needs both methods:
  • The program owner provides a lightweight mechanism for creating a correct split execution environment.
  • The machine owner provides a heavyweight mechanism to defend itself from a (possibly) malicious program.

Complete System
[Diagram: remote machine stack: Application, Agent, Standard Lib, Sandbox, Kernel; home machine stack: Shadow, Standard Lib, Kernel; Agent and Shadow connected via RPC]

Our Contributions
› A language for writing agents
  • Independent of implementation mechanism.
  • Correct mechanism depends on purpose.
› Implicit binding:
  • Agents name procedures, not other agents.
  • Original procedure entry point preserved.
› Composition rules
  • Strict layering makes order from chaos.

Future Work
› Interaction of sandbox and utility agents
  • A utility agent modifies the application’s operations to make them acceptable to the sandbox.
  • Should they negotiate on permitted operations?
› Signal handling
  • How to specify?
    (Many relevant functions)
  • Flow of control is backwards
› Other implementations
  • Binary rewriting.
  • Build specialized linker that understands multiple definitions of symbols.

Further Questions?
› Douglas Thain: thain@cs.wisc.edu
› Miron Livny: miron@cs.wisc.edu
› Bypass Web Page: http://www.cs.wisc.edu/condor/bypass
› Questions now?