Parrot: Transparent User-Level Middleware for Data-Intensive Computing
Douglas Thain
Condor Project, University of Wisconsin
Workshop on Adaptive Grid Middleware
28 September 2003

The Reality of the Grid
[Cartoon: a proof made of illegible steps, then "(and then a miracle happens)", then "P=NP". Captions: "Look at my new proof!" and "I think you have a problem here..."]
[Diagram: a request to "run this batch job" reaches the User’s App through a batch system (Condor, PBS, NQE, LSF, Load Leveler) and the local operating system's process interface (main, exit, abort, kill, sleep). A request to "access data" leaves the User’s App through its I/O interface (open, close, read, write, lseek), which Parrot, on top of the local operating system, carries to a storage server over Chirp, FTP, NeST, RFIO, or DCAP.]

Applications of Parrot
• Interactive Browsing
  – tcsh, tar, gzip, make, acroread, gv, xv...
• Improved Reliability
  – Transparent retry/reassignment/reallocation
  – Files, sockets, even repair broken apps.
• Private Namespaces
  – Make /home/thain appear the same everywhere.
  – Make /usr/data/calibration different everywhere.
• Dynamic/Distributed Program Construction
  – Remote link, remote exec, remote eval...
• Profiling and Debugging
  – Users may not know low-level I/O patterns.

Challenges
• Technical Methods of Interposition
• Semantic Differences
• Error Management
• CPU – I/O Integration
• Performance
• The butterfly effect:
  – Subtle underlying differences can have large effects in performance and usability.

Internal Techniques
[Diagram panels: Polymorphic Extension, Static or Dynamic Re-Linking, Binary Rewriting. Labels: App Code, Standard Library, New Library, New Code, M1, M2.]

External Techniques
[Diagram panels: Debugger Trap, Remote Filesystem, Kernel Callout. Labels: App, Agent, Kernel, NFS, LFS, FFS, USR.]

Techniques Compared
technique       burden    speed      hole detection
polymorphic     rewrite   fast       easy
static link     relink    fast       hard
dynamic link    dynlink   medium     hard
binary rewrite  dynlink   fast       hard
remote fs       root      varies     easy
callout         root      slow       easy
debugger        none      very slow  easy

Hole Detection Matters
• Dynamic Linking
  – Bypass Toolkit, ca. 2000
  – Works with some standard tools.
  – Many still crash in strange ways.
  – Doesn’t apply to static exes; always a surprise.
• Debugger Trap
  – Parrot: Coding began in May of 2003.
  – Works reliably with almost everything in /usr/bin.
  – Caveat #1: Twice as much code
  – Caveat #2: Higher latency

Debugger Trap
• For the rest of this talk, we select the debugger trap for completeness and reliability. Much of the discussion still applies to the other techniques too.
• Some technical details in the paper:
  – Only on Linux.
  – Must manage process ancestry.
  – Must fudge some broken ptrace behavior.
  – Cannot write directly to the process; must take a roundabout path through a temp file.

[Diagram: Parrot internals. A user process issues SYS_open, SYS_read, and SYS_write; the debugger trap hands them to parrot_open, parrot_read, and parrot_write. Layers shown: file descriptors (0 1 2 3 4 5 6 7 8 9 ...), file pointers (pos: 0, pos: 0, pos: 42, pos: 100, pos: 1 MB), file objects (“outfile”, “infile”, “config”, “data”), and device drivers (Local, Chirp, FTP, NeST, RFIO, DCAP). A name resolver sits beside the file objects, with a mount list driver and a chirp lookup driver.]

Adaptation
[Diagram: three panels labeled "On same host", "On nearby host", and "On distant host". In each, App calls open(“/mydata/foo”) through Parrot, which has Local, FTP, and Chirp drivers. Storage shown: /usr/data served by chirpd, and /opt/DAT served by ftpd.]
Mount-list entries shown:
  /mydata -> /usr/data
  /mydata -> /chirp/host1/usr/mydata
  /mydata -> /ftp/host2/opt/DAT

What Protocol?
• File Transfer Protocol:
  – Internet standard, many implementations.
  – High bandwidth sequential access.
• NeST:
  – General purpose storage appliance from UW.
  – Virtual users, namespace, and allocation.
• RFIO:
  – Remote I/O protocol used with CERN CASTOR.
  – UNIX like, most ops require a new TCP connection.
• DCAP:
  – Remote I/O protocol used with Fermi D-Cache.
  – UNIX like, WORM semantics, no directories, caching/
• Chirp:
  – Protocol developed at UW for Parrot.
  – Corresponds very closely to UNIX, including errnos.
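
The mount-list rewriting on the Adaptation slide can be sketched in a few lines of Python. This is only an illustration, not Parrot's code: the function name `resolve`, the three `*_MOUNTS` tables, and the longest-prefix matching rule are assumptions made for the example; only the paths come from the slide.

```python
# Hypothetical sketch of mount-list name resolution; not Parrot's actual code.
import posixpath


def resolve(path, mount_list):
    """Rewrite a logical path using the longest matching mount prefix.

    mount_list maps a logical prefix to a physical prefix whose first
    component names a driver (for example /chirp/<host>/... or /ftp/<host>/...).
    A path that matches no entry is returned unchanged (plain local access).
    """
    path = posixpath.normpath(path)
    best = None  # longest matching prefix so far, without trailing slash
    target = None
    for prefix, physical in mount_list.items():
        p = prefix.rstrip("/")
        # Match whole path components only, so /mydata does not match /mydatax.
        if path == p or path.startswith(p + "/"):
            if best is None or len(p) > len(best):
                best, target = p, physical
    if best is None:
        return path
    return posixpath.normpath(target.rstrip("/") + path[len(best):])


# The three mappings shown on the slide, one per placement of the job.
LOCAL_MOUNTS = {"/mydata": "/usr/data"}
CHIRP_MOUNTS = {"/mydata": "/chirp/host1/usr/mydata"}
FTP_MOUNTS = {"/mydata": "/ftp/host2/opt/DAT"}

if __name__ == "__main__":
    for mounts in (LOCAL_MOUNTS, CHIRP_MOUNTS, FTP_MOUNTS):
        print(resolve("/mydata/foo", mounts))
```

The application always opens /mydata/foo; only the mount list differs between hosts, which is what lets the same unmodified program run against local disk, a Chirp server, or an FTP server.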
Small Details Matter
• Standard tools need to know subtle details; otherwise they break:
  – ls -lR performs getdents(“foo”)
  – on success: descend
  – on ENOTDIR: display and continue
  – on ENOENT: display error and stop.
• FTP does not provide this detail
  – Failed LIST -> error 550
  – Failed GET -> error 550
  – Failed CWD -> error 550
• Simple assignment doesn’t work:
  – Making 550=ENOENT breaks many tools.

Example Solution
[Diagram: decision tree over the replies (200, 550, other) to LIST “foo”, CWD “foo”, and SIZE “foo”. Outcomes: Success, Not a dir., Access denied., No such entry., Transient Error.]

CPU-IO Integration
• Errors that cannot be expressed in the client’s interface must be passed to a higher level (the batch system).
• Simple options:
  – kill -9 application (retry app elsewhere)
  – exit(1) application (don’t retry app)
• Complex options: (Condor only)
  – restart with (Subnet!=“128.101.175”)
  – restart with (CurrentTime>5pm)

Bandwidth by Protocol
[Chart: bandwidth (MB/s, 0 to 9) versus block size (4KB, 16KB, 64KB, 256KB, 1MB, 4MB, 16MB, 64MB) for nest, ftp, rfio, chirp, and dcap. Two block sizes are annotated "(unix default hint)" and "(parrot default hint)".]

Latency by Protocol (ms)
        stat     open/close  read 1B  read 8KB  write 1B  write 8KB
chirp   0.50     0.84        0.61     2.80      0.38      2.23
ftp     0.87     2.82        -        -         -         -
nest    2.51     2.53        2.96     4.48      5.53      7.41
rfio    13.41    23.11       0.50     3.32      39.8      2.85
dcap    152.53   159.09      40.05    3.01      40.14     3.14

Andrew-Like Benchmark
• The original Andrew benchmark is no longer appropriate, so replace its input with the Parrot source: 296 files, 955 KB.
• Copy the source to a remote device, then manipulate in five stages:
  – copy: cp -rp
  – list: ls -lR
  – scan: grep searchstring -r *
  – make: make
  – delete: rm -rf *

Overheads Compared
[Chart: time (s, 0 to 160) per benchmark stage (copy, list, scan, make, delete) for parrot only, + chirp, + lan, + cache.]

Overheads Compared
[Chart: the same data with the time axis limited to 0 to 10 s.]

Protocols Compared
[Chart: time (s, 0 to 350) per benchmark stage (copy, list, scan, make, delete) for chirp, ftp, nest, rfio (failed), dcap (no dirs).]

Protocols Compared
[Chart: the same data with the time axis limited to 0 to 50 s.]

Moral of the story:
• The butterfly effect: Small underlying differences can have big effects on performance and reliability.
• Examples in interposition:
  – Dynamic linking: fast but poor hole detection.
  – Debugger trap: slow but good hole detection.
• Examples in protocols:
  – Chirp: UNIX semantics restrict bandwidth.
  – FTP: Need for multiple ops increases latency.
  – NeST: Powerful virtualization increases latency.
  – RFIO: Connection per op doesn’t scale.

For more info...
• Douglas Thain
  – thain@cs.wisc.edu
• Miron Livny
  – miron@cs.wisc.edu
• Software, manuals, more info:
  – http://www.cs.wisc.edu/condor/parrot
• The Condor Project:
  – http://www.cs.wisc.edu/condor
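
As a supplement to the "Small Details Matter" and "Example Solution" slides, here is one possible reading of the FTP error-disambiguation tree, sketched in Python. The edges of the tree are inferred from the slide's labels rather than stated in it. The callable `send`, the helper `fake_server`, and the choice of EAGAIN for a transient error are all hypothetical. As on the slide, 200 stands for any success reply and 550 for the generic failure reply.

```python
# Hypothetical sketch: turn FTP's ambiguous 550 reply into a UNIX errno
# by probing with extra commands. Not Parrot's actual code.
import errno

SUCCESS = 200  # the slide's generic "success" reply
FAILURE = 550  # FTP's catch-all "action not taken" reply


def classify_listing(send, name):
    """Return 0 if LIST succeeds, otherwise a best-guess errno.

    send(command, name) is an assumed callable that issues one FTP
    command and returns its numeric reply code.
    """
    code = send("LIST", name)
    if code == SUCCESS:
        return 0
    if code != FAILURE:
        return errno.EAGAIN  # assumption: treat anything else as transient

    # LIST failed. If we can still CWD into it, it is a directory
    # that we are not allowed to list.
    code = send("CWD", name)
    if code == SUCCESS:
        return errno.EACCES
    if code != FAILURE:
        return errno.EAGAIN

    # Not a directory we can enter. If SIZE works, it is a plain file.
    code = send("SIZE", name)
    if code == SUCCESS:
        return errno.ENOTDIR
    if code == FAILURE:
        return errno.ENOENT
    return errno.EAGAIN


def fake_server(replies):
    """Build a stand-in for send() from a command -> reply-code table."""
    def send(command, name):
        return replies.get(command, 421)  # 421: service not available
    return send


if __name__ == "__main__":
    plain_file = fake_server({"LIST": 550, "CWD": 550, "SIZE": 200})
    print(errno.errorcode[classify_listing(plain_file, "foo")])
```

The extra round trips are the price of recovering ENOTDIR and ENOENT from a single ambiguous reply, which is consistent with the closing slide's remark that FTP's need for multiple operations increases latency.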