Efficient Robert Wahbe Software-Based Steven Thomas Lucco Computer space. to provide fault isolation among cooperating modules is to place each in its own address for tightly-coupled modules, Graham 94720 1 However, L. Division CA Abstract way Susan of California Berkeley, software Isolation E. Anderson Science University One Fault this so- Introduction Application incorporating ules. programs often achieve extensibility by independently developed software mod- However, faults in extension code can render a lution incurs prohibitive context switch overhead, In this paper, we present a software approach to implementing fault isolation within a single address space. Our approach has two parts. First, we load the code and data for a distrusted module into its own fault doseparate portion of the application’s main, a logically software system unreliable, or even dangerous, since such faults could corrupt permanent data. To increase the reliability of these applications, an operating system can provide services that prevent faults in distrusted modules from corrupting application data. Such fault isolation address space. Second, we modify velopment by helping the object code of a distrusted module to prevent it from writing or jumping to an address outside its fault domain. Both these software operations are portable and programming language independent. Our approach poses a tradeoff relative to hardware fault isolation: substantially faster communication between fault domains, at a cost of slightly increased execution time for distrusted modules. strate that for frequently communicating plementing fault isolation ware can substantially in software improve We demonmodules, im- rather end-to-end ence Projects Equipment the External Anderson Young Investigator Email: Permission H,ented to prov,ded commercial title of that copying the SIGOPS 0 1993 To ‘93/12 ACM AT&T the user-defined objects. cused content and the or ACM of the otherwtse, Foundation. or copyright appear, and Association or to republish, /0012 . ..$l material d,arr(buceu’ notice not[ce for the /93/N. C., -632 IS given a fee USA -S/93 the systems together ules. Also, added extension the Among [HC92], Embedding Xprem code BSD and system developed desktop into network Active systems, that gener- application-specific industry independently Quark More example, ex- operat- servers replaced. MJ93], An the for management and of fo- vendors system. parts have [M RA87, the uses query has party as user-level or system, Linking link that research third operating design; modified [vCGS92]. query an unrelated system for are implemented memory pred- as geometric virMes- Microsoft’s [Cla92] can software mod- publishing sys- [Dys92] is structured to support incorporation of general-purpose third party code. As with PO ST GRES, permtsslon. 0-89791 of filter any with operating parts several tem Computing requires de- database. micro-kernel easily and such isolation, easier operating types interfere it ally, Object edu IS fault making the sages for and be destructors, data could recent system tual does not .berkeley. pa~t of th!s is packet Foundation of the Government grahefn}@cs all Its date and be inferred. tea, the enhance can the Digi- of the paper code on ample ing constructors, Without Similarly, and Center Science GVPIeS W%not mode the advantage, copy and or the policy fee IS by permission specific Research should w!thout publ!catlon Machinery. The lUCCO, that NO0600-93-C-2481, by a National endorsement copy MDA972-92-.J-1028 Systems Program), the position {rwahbe, grant (the Award. reflect and no official defines for or corrupt part by the National SciDefense Advanced Research and Corporation Research in under was also supported necessarily andlor (DARPA) D ABT63-92-C-0026 tal d!rect was supported ( CDA-8722788), Agency contracts code that icates to This work Foundation software sources of system fail- database manager inFor example, the POSTGRES cludes an extensible type system [Sto87]. Using this queries can refer to general-purpose facility, POSTGRES than hard- performance. to identify ure. extension application services also facilitate .50 203 faults in extension modules systems unreliable. One way to provide can render fault isolation any of these ple among cooperat- ing software modules is to place each in its own address space. Using Remote Procedure Call (RPC) [BN84], transformation only slightly ified object provide code. more greater modules in separate address spaces can call into each other through a normal procedure call interface. Hardware page tables prevent the code in one address space from corrupting the contents of another. fault cross hardware lower-cost with comes mance [ALBL91]. in a processor’s integer A cross-address-space RPC requires execution the to three execution orders time call [BALL90, of magnitude overhead greater of a normal enough that mance effect separate fault lation would are frequent quently than cheap isolation ning the makes mark, ules trusted a software approach A to implementing a single address space. Our the ap- in tiguous region of memory a unique identifier process resources modify the vent its it from fault software-enforced other’s data or jumping fault have several independent transformation object unable code segments. In this each to fect quire outside Section strategies we its own that code concentrate modify except are fault sys- benchisolation on a DECsta- it full trusted modules incur data frequent to any outside others use run our of its are overspeed. techniques domain, We disonly time at full distrusted overhead. mod- code), execution can be- If some extension preventing time a tradeoff while domains is possible security, provides of distrust. case with in trusted to code at quantify imfrom a cost of this ef- 5. of the 2 provides some paper is organized examples communication of systems between fault as follows. that re- domains. 3 outlines it from for our prototype, related can render and type this how we modify object code to pregenerating illegal addresses. Section 4 describes how we implement low latency cross-faultdomain RPC. Section 5 presents performance results vent interface. on level be the remainder Section in sepa- not run- benchmark data For of three also and a program in Section to pre- code RPC we fre- system The reduced a factor for approach performance. extensible approach uncommon our database operators. than be software-enforced benchmark. approach execution The programming-language- escape paper, can other’s cross-fault-domain identified module isolated domains has Second, to an address modules or execute an explicit space, the to common But applied POSTGRES software reading higher its access to descriptors. of a distrusted Program each to control a con- up domains, POSTGRES 2000 by more Code plement address to comprising an address is used as file code writing rate We such object application’s within which domain. through of the in addition seem the end-to-end we geometric distrusted head. even portion domain, fault to the (as may logically separate C of DECstation down speed better performance Similarly, A fault the may slowing to this, software proach has two parts. First, we load the code and data domain, a for a distrusted module into its own fault space. the 5000/240. hardware within time module an average communication). Sequoia the its own address space, because of the cost of crossing isolation in RPC the incurs Althan including isolation are execution) define overhead boundaries. in faster distrusted on both fault we use of the to tween We propose overhead substantially fault tion suite to on a on a DEC reduction sandboxing time demonstrate tem 0.8ps 1. lps of magnitude This need A safe RPC roughly increased communicating can offer To domains. a test (cross-domain computation per call. In this situation it is impractical to isolate each logically separate module within fault On the substantially Alpha. (normal case system developers can ignore its perforin choosing which modules to place in domains. In many cases where fault isobe useful, cross-domain procedure calls yet involve only a moderate amount of protection time. benchmarks, the case procedure isolation system. of slightly Software-enforced ALBL91]. The goal of our work is to make fault RPC incur to hardware- offer takes that which relative and roughly mod- techniques but we can fault that of the we eliminate an order counter-intuitive: The execution time overhead of an RPC, even a highly optimized implementation, will often be two Because than execution and overhead. a tradeoff 5000/240 at a cost SPEC92 4% time poses between more any existing perfor- at least: a trap into the operating system kernel, copying each argument from the caller to the callee, saving and restoring registers, switching hardware address spaces (on many machines, flushing the translation lookaside buffer), and a trap back to user level. These operations must be repeated upon RPC return. with 400, investigate implementation DECstation sandboxing, time information boundaries, RPC Unfortunately, there is a high performance cost to providing fault isolation through separate address spaces. Transferring control across protection boundand does not necessarily scale aries is expensive, called execution also isolation. our prototype improvements We approach based the debugging execution Our pha technique, increases data a sim- 204 work. and finally Section 6 discusses some 2 Background the distrusted tifies In this type section, of enforced of the we characterize application fault that can isolation. We extensible POSTGRES in more benefit defer detail from the and software- further 5, which gives performance measurements for this application. The operating systems community has focused considerable attention on supporting kernel extensibility. For example, the UNIX vnode interface is designed to make it easy to add a new file system new file system implementations 3 In this tion section, so that is a it can domains discussion degraded load/store has to be transferred data [FP93]. be to aligned can be accessed from a segment share We divide the message into the kernel for reason- the system. ware that will tions unforeseen same number fault of problems. Because extension system of the the end-users Hence, Quark do not lack corrupt bugs system Quark’s party of these examples using hardware icant portion a small fault system amount time software of code isolation fault duces the time spent while only slightly would result code. is distrusted; is likely isolation crossing increasing to be spent of a RISC could 4 describes cross-fault-domain its data segment. (segment by ifying code its by the normal the module a module share from for data mod- an address This page fault bit data and to be illegal, page. tar- upper legal code is possible system to tar- jump same all Separate identifier distrusted only legal the distrusted to an unmapped operating all seg- to addresses similarly, It segment a only have to prevent segmentl. if it refers Hence, identifier. are necessary the correct write the other time. it can jump and called of two specific transforms identifier); generated segment code, at load module segments stance spent distrusted bits, The into within consists stack. so that segment, same of upper module’s and space addresses domain are determined code address pattern encapsulation in the with for in- is caught mechanism. only exe- 3.1 situation, more it domain the time of the In this because fault being Second, most code. First, in a signif- time switch in trusted hardware characteristics. execution context is spent fault two isolation of the overall in operating cution share heap code addresses the sources on techniques virtual A fault in its pattern unreliable, among data, object within gets a unique addresses gets Quark in third appear distinguish a all one for a distrusted static module’s of efficient where remainder Section virtual so that identifier. Software At Quark its ment func- [Dys92]. computers can soft- perform caused for failure. All than has modules structures. to desktop party designers on the personal make Xpress third system extensibility can because this original this data code Quark can purchase by its time, runs, internal One extend domains Xpress is the our a soft- cooperating The safe and efficient an application’s segment segments, publishing the we implement increasing allows CISCS. we can iso- a performance operating although handle that RPC. performance. example memory. we are the Next, we present that first to pinpoint slightly Finally, share module We module. only assumes architecture, extended users 5 provides technique to segments, user-level while technique. how A Iso- encapsula- sandboxing, Section encapsulation Similarly, Active Messages provide high performance message handling in distributed-memory multiprocessors [vCGS92]. Typically, the message handlers are application-specific, but unless the network controller able is cost domain, a software module of this fault called time. this ware software allows within a distrusted execution fault user level [Thi92], ex- isolation a distrusted its that a technique, analysis must be compiled repre- application fault several escape of faults Another example is user-programmable high perforIf data is arriving on an 1/0 mance 1/0 systems. channel at a high enough rate, performance will be handlers demon- total transforming not location late party for a technique its the incoming and overhead Fault we outline describe erating system vendor. if control software-enforced techniques introduce substantially of the Software-Enforced kernel cache for performance [HKM+88].) Epoch’s tertiary storage file system [Web93] is one example of op- to user level to manipulate overhead, lat ion are added by a third time domain-crossing proportion time, 5 quanoverhead effective. to so directly into the kernel. (The Andrew file system largely implemented at user level, but it maintains kernel code developed if Section domain-crossing into UNIX [Kle86]. Unfortunately, it is too expensive forward every file system operation to user level, typically application. between even a modest ecut ion of the execution that sents Section part trade-off application strates description type system until this efficient sharply An re- Segment unsafe or stores boundaries, 10ur executing interface. 205 Matching anstructzon is any to an address that system supports dynamic instruction can not linking that jumps be statically through to ver- a special ified to be within transfer the instructions, branches, can variables be often most returns, statically used stores address, that A straightforward illegal ery unsafe whether the code trap distrusted software module’s typical requires fragment this fragment by moves code distrusted code the unsafe are segment The store to check. and jump first call and register. segment is a dedicated -reg register. equal if store address is outside uses 1: Assembly of segment. dedicated-reg pseudo to clear only code for segment matching. target-reg&and-mask-reg code register segment dedicated-reg in and-mask-reg identifier register I segment-reg segment-reg to set segment identifier instruction uses store bits. += dedicated-reg Use dedicated because may ~ Use dedicated are used by we instructions not dedicated-reg into bits. dedicated-reg arrange instruction, Hence, Figure instruction necessary store -reg identifier. register. this address module unsafe segment a dedicated in- matching modified distrusted the ch-reg instruction a pseudo-code registers are the to get is not register. >shift-reg) matchzng. target never store outside segment They in the inserted store We Dedicated and directly passing the module. elsewhere to jump domain. matching. r-eg~ster. inserted routine 1 lists if dedicated is a dedicated s crat Trap has the fails, error Figure trap ev- determines address check architectures, segment a dedzcated the fault instructions. for the technique RISC four target If to a system encapsulation On in instruction’s code address ch-reg segment the use of before into (dedicated-reg> shift-reg their compare code checking identifier. will hold verified. checking The unsafe procedure to to preventing is to insert segment serted be statically instruction. the correct implement a register + s crat address address Right-shift reg- target target scratch-reg and through + Move static mode jumps to approach addresses dedicated-reg to addressing use can not control Stores However, commonly Most verified. verified. and target segment. as program-counter-relative use an immediate can be statically isters, correct such by- transform Figure all 2: Assembly pseudo code to sandbox address in target-reg. to use a dedicated register. All in the this software paper encapsulation require matching requires four addresses in the code in the data amount, Using dedicated most MIPS and modern compiler on the registers did execution the not to hold the DECstation impact on Section icated 5 reducing on the by ment average can 3.2 The that Address segment it can capability can matching pinpoint is useful reduce of providing 2 For Sandboxing runtime [Int86], served registers the with it is possible by restricting has offending still about limited control advantage further, at the the source sets, a module flow within cost module a fault One no re- domain. 206 the the illegal instruction bits the correct will has been correctly sandboxing is used are used the with seg- or jump we are module code by jumping sequence; register identifier, Section that seg- will this alsec- 3.6 presents an object code sandboxed. requires to hold the Since sandboxing no effect. the a ded- 2 lists store even dedicated can verify to As distrusted segment have in sets register. in the arith- clears result unsafe do- or jump Figure address of the that register registers an of two operation. dedicated algorithm Address as the using contain value. this fault store instruction correct register, upper a simple of faults. such second ond instruction We second the address. instruction the the addresses; any the unsafe stores we modify produce the ready This development. register to encapsulate the instruction. software overhead no information architectures 80386 technique during not since and to use the a dedicated to the inserted bits to the matching, using first to perform instruction illegal insertion each code to sandboxzng affecting requires The identifier pseudo-code benchmarks. SPECg~ register. ment to a C compiler effect identifier catch from before The segment not insert address this one generating sandboxing instructions instruction. can set with example, set available the we register metic does the target We call them than we simply of the identifier. prevents other Address However, bits Sandboxing merely main including 5000/240, a significant it instruction upper segment address. shift 32 registers, to use a smaller of the an each unsafe sets the correct addresses module. For that one to hold segment have least Before Segment identifier. architectures, impact. presented to hold segment may at register have time one the have performance that, one RISC Alpha, the (gee) segment, of the distrusted retarget five registers: registers time minimal shows dedicated one to hold the execution since registers. segment, and techniques dedicated hold the five the code dedicated segment and registers, mask, data two segment — this register pointer whenever to form changes it is set. addresses to it, this Since are much optimization uses of the more stack plentiful significantly than improves performance. Further, after Guard Zones Segment we can the modified store address tion. If the modified zone, the cause a hardware There Figure 3: A segment offsets zones with covers guard the range in register-plus-offset zones. The size of of possible addressing could the immediate are dress from changes ment the sandboxed 3.4 3.3 Optimization a number of optimizations specialized moved the into a using it DEC Alpha to will both optimizations overhead. could the example, sandboxing a store constant prototype that For remove a small Our or instruc- pointer. cases where only as of a load offset does se- target not ad- during yet imple- optimizations. Process Resources Because multiple fault domains share the same virtual address space, the fault domain implementation must prevent distrusted modules from corrupting resources that are allocated on a per-address-space basis. For example, if a fault domain is allowed to make system The overhead of software encapsulation can be reduced by using conventional compiler optimizations. Our current prototype applies loop invariant code motion and instruction scheduling optimizations [ASU86, ACD74]. In addition to these conventional techniques, we employ by On of further in iteration. these stack tool loops, has optimizations sandboxing transformation each loop identifiers, and two are used to hold code and data addresses. the pointer as long transfer instruction fault. these a number reduce quences modes. and as part pointer address stack offset control or store apply pointer next stack load the constant is used the guard frame the guard pointer before we sandboxing by a small stack processor, ( avoid it is modified calls, it can close or delete files needed by other executing in the address space, potentially application as a whole to crash. to code causing the software encapsulation. We can reduce the overhead of software encapsulation mechanisms by avoiding arithmetic that computes One solution is to modify the operating system to know about fault domains. On a system call or page target addresses. For example, many RISC architectures include a register-plus-offset instruction mode, fault, the kernel can use the program counter to determine the currently executing fault domain, and restrict where the offset is an immediate resources accordingly. ited range. limited On the MIPS to store the range instruction address off dressing three the to Sandboxing instructions: register, to set the segment the in some lim- such +64K, value, uses dedicated structions -64K (reg) mode. inserted into architecture store set constant off offsets Consider set (reg), instruction one to and two whose identifier quire ad- reg+offset sandboxing in- of the dedicated register. Our only prototype the register dress reg+of support guard To optimizes We MIPS f set, this stack sandboxing thereby at the the to the also rather optimization, zones create jacent regj top guard reduce pointer case than the saving the and zones, segment this actual prototype bottom virtual as a dedicated target memory register. pointer module distrusted only through fault domain in a separate modules we addition to to hold access whether by other module’s object transform this performs into the To tion of the application and shares a fault the case of an extensible call If can performed system call, RPC call. make the trusted system the de- a distrusted a direct with a that appropriate application, domain reserve code is safe. we reresources We system domain code call system arbitration a particular fault each domain, RPC. trusted implemented to placing fault cross-fault-domain termines some portable, In calls arbitration we In por- directly code. segment. pages (see Figure by prototype approach. ad- establishes of each overhead the uses of the stack sandboxing an instruction. are unmapped runtime by our distrusted requires sum keep an alternative the register-plus-offset this To are treating ad- 3.5 Data Sharing 3). Hardware the sharing We avoid ing by sandboxing 207 page fault among table isolation mechanisms can support virtual address spaces entries. Fault domains by data manipulat- share an ad- dress space, so they and can use plementation. since software ter load ule the has been we call lazy then other domains fault pointer provides a mechanism many ditional runtime region access; in each the into space, but low order To aliased address bits. each must unique code, the that the hardware As the Under shared addresses works are set to the struction into of the like address the An mat thing. use a dedicated register indicates which segments For unsafe implement to hold a bitmap. instruction requires one more analyzes each dedicated any is valid upon a load to a dedicated If the region allow icated register, and virtual regions unsafe by sharing, The region By technique: we isting can not compiler, piler bitmap can access. unsafe incorporating approach have software identified encapsulation. to emit encapsulated ule; integrity the module the is loaded system directly two will support can object into a fault encapsulate modifying its object well as store and jump variants for is then verified at load instructions. We discuss the bust by time. peras the performance code. patching, possible format tools rely code from dle object binary binary When is rejected. into an ex- advantage of com- optimization. However, First, most mod- one programming lan- technique code 208 so that module being and use the it which have been idioms processor challenge a subset of and code, but one these distinguish emulation For to registers, han- software is to transform the ro- is not translate built, to the modi- practical jumps[SCK+93]. main uses directly in efficient Tools another alleviates is loaded, by Unfortunately, resulting indirect encapsulation, 5. domain compiler-specific data unknown patching, fault the [LB92]. to on binary the can encapsulate currently module versions of these techniques that by encapsulating load instructions in Section fying the Alternatively, distrusted code code encapsulation called problems. system mod- when alternative, these uses a compiler domain. the dedIf an employed. implementing code for a distrusted code the safe. guage (gee supports C, C++, and Pascal). Second, the compiler and verifier must be synchronized with re- approach of this 3 We have implemented form general protection of these One an unsafe is deemed only ex- segment Verification strategies begins the in For than An We region. re- ified spect to the particular and modified segment instruction Implementation or jump shared matching. 3.6 store to take code segment. regions. register has two disadvantages. compilers code encapsulation we are able for instruction, sandboxes be verified, which address, jump of the region of the to be executed, register software infrastructure this target transfer appropriately the be exe- is a store unsafe exit ample, domain checked, region to one its in- is executed, when unsafe that region. same, shared To verifier to insure identifier. the encapsulation the fault The mod- first ends in the for any guaranteed form re- store If the guaranteed is used of in- store with region instructions definition address. unsafe begins is a control are no more A similar target instruction is not software and jumps to a dedicated are to this sequences register. on address of our An region next in the begin stores or jump register instruction the do not a dedicated instruction transla- that the next hardware remain hold: next gion store store To their region jump instructions unsafe the correspond- ]ump in a unsafe or there ac- automati- unsafe is that, into regions. set match- may unsafe any modification the segment. segment software a data data systems inter- are instruction a property program unsafe with subsequent uses in code code we can implement segment matching begins cuted. is given object domain’s exactly bits a new each fault low aliasing, object distrusted bits introducing creation shared the sandboxing the operating address the shared within translation high called An all imple- segment segment. to form the to a dedicated mem- current execution all register ification comparisons memory each uses register. a different We soft- techniques in verification code verifier divides structions same and techniques: shared the our our MIPS jumps, address the have the verifier gion while the in the use a dedicated offset approach. to perform sandboxing challenge instruction The shared must main encapsulation that for both of indirect technique, virtual pointer shared memory, translates tion; situation, ar- no ad- that dependent, a verifier for the shared has exactly region shared ensure addresses the location share same we alias in As with incorrect address. This words, The any segment at the ing. following cesses shared cally space works presence offset, sandboxed ing this built that swizzling with to map is mapped shared to regions tables locations each avoid face domains address In other multiple schemes, segment every read-write Note is language first gcc compiler independent. We mod- uses the of the encapsulation. language 3. prototype a version mentation a technique pointer To support page region segment. region ory into share Lazy fault overhead. space through memory the hardware memory needs for read-write ware al- any mem- distrusted it can swizzling. bitrarily we modify address current modified im- do not can read a particular sandboxed, Our entries, memory is straightforward; domains in table techniques application’s code with page shared sharing fault in the object memory of encapsulation instructions, mapped If a set a standard Read-only our ory hence not the leav- (read-only) Tmsted Mler Domain Cake call Add call F Domain For each Currently, tected of fault domains created for the stubs ~ outside stubs the components ing registers type that for of a cross-fault-domain The dedicated working uses simple formats. usage available we are on extensions extensions information store that use. a binary To solve patching to current control is sufficient this proto- Low Latency object flow main message callee must having the caller and buffer, LRPC of this work stubs used isolation for cooperating ules. the last In solution: tion, efficient fault Figure Do- safely stub domain. ciently how between domain which stub Second, we discuss how among trol for only a jump transfer entry whose point target fault other RPCS exporting and way for table. Each instruction outside address the Third, fault the jump table register. Because By entry address using not the for modules on tion The also may Our all techregisters. presence current call these errors; and notifies uses the fault of an infinite a timer facility to the same domains, that because of while implementa- to catch a call determine correct violation, application for the stubs encapsulation in the terminate [Kar89]. the be robust thread and this registers, the outstanding use avoid establish facility an can spaces an addressing to As to perform dedicated If the regis- no instrucwe all domain. to Only are saved. address software are convention contains stack, signal example, periodically calls that potentially there is taking loop. interrupt if a call too Trusted needs executo be terminated. among a fault table target does isolation. long, and be protected. validate machine caller restoring the and system be a way By a shared techniques. is an immediate a dedicated effi- and domain. must and to pass data registers domain execution in a fault fault via any linker uses the UNIX caller’s kernel space, copy uses a trusted must terminates The the arguments. by architectural for example, it then data. for managing the must callee for system operating we detail procedures jump instruction, are are managed escape domain. domain; arguments of our whose to destination the state to domain its naming control describe outside to insure is independent we tion the register used, errors, executing as- into domains. machine fatal distrusted the being transfer address call between context nique on three a fault routine calls domains is via allows safely and and First, then cross-fault-domain The crossing. a trusted passed protocol register of a cross- concentrates routine registers fault a trusted section mechanism call that sec- Tra- a preserved to saving switch copy spaces a single procedure of optimization must communication components Karger In addition of our In this fast kind mod- half across modify it. to address communicate by callee if the that Our This fault a simple half of fault software one encapsulation. the major RPC of cost future by the optimization, able across target cross-domain in the be preserved are a message, are also responsible are designated domains, domain. pects distrustful other 4 illustrates fault-domain fault the the we presented software we describe across but section, machine [BALL91]. modified that the callee ters tions is to reduce domain. cross-domain domain. de-marshall also uses only On each both Communication purpose callee than unpro- target to into to domains The software Fault the the saving The run managing we the copies marshaled copies state. to three are finally file and register to support Cross and and rather stubs copying implementations perform between encapsulation. 4 caller and procedure. by hand The are trusted, directly RPC arguments RPC. problem, by call exported for domains stubs arguments typically Major the responsible between ditional I 4: a customized each [J RT85]. of both are Because call Jump Table Figure be modified state. T – only are generated generator arguments Return Stub can is a stub The it pair stub using return segment, module. return Add: Stub code a trusted Untrusted rely Performance Results is a conTo evaluate is a legal domains, instructions encoded table 5 domain on the is kept in the performance we implemented of our system the MIPS) use of and YVe consider in the 209 on a 40 MHz a 160iMhz three Alpha of software-enforced and measured DECstation 400 questions. fault a prototype 5000/240 (DEC- (DEC-ALPHA). First, how much over- head does fast so ftware encapsulation is a cross-fault-domain performance isolation impact of using on an end-user of these questions We measured the a wide range marks it its code. the DEC-MIPS, boxing to requires On the additional [Dig, data all and DEC- jump 3 reports the possible use computed by the structions, as the original we and used qpt instructions BL92]. program reduced. pro- measure- calculate executed Column as a percentage the to the due 4 of Table of original to number program The data of grams, 026. for compress In veloped an model instruction contain the on a num- benchmark the DEC-MIPS sandboxing of cases the model overhead instructions ing point the substantially cycles available these for execution based the on due number are (interlocks). of floating-point reduced. interlocking; The integer instead, delay is is computed (s-instructions ortglnal-execut conjecture caused by not in the libraries, sandboxed. the observed mappzng encapsulation changes hence sandbox- struction float- reduce researcher be ply not Though linked [PH90]. icantly improved arranging only this differences ton-t zme-seconds were of in which Samples and 3% to cache the an have conflicts to Sam88]. gain object Hilfinger of of in- of researchers cache 8!Z0 instruc- number a 2070 performance instruction effect, among contained One by sim- files report miss were signif- rates by application’s there were programs. statistically on re- basic a significant incurred less overhead. MIPS the overhead for is 2.5’%0, compared benchmarks. programs percentage operations mean significant average, point the remaining 210 the PH90, order are Soft- [Sam88]. Beyond /cyc!es-per-second the [McF89, variations conj?icts. mapping instruction time reported the changing A number minimizing changing blocks conflicts. execution al- are explicitly lines, cache investigated in- to C library, programs, variation. cache to cache which such as the standard that instruction benchmarks 4 Loads of the benchmark tions as: – interlocks) versions ware does the variations experiments. de- parallelism, pipeline that paging. number interlocks slots it is unlikely memory difficult to quantify, we do not believe that data cache source of variation in our alignment was an important Sandboxing instruction-level number bound, and sandboxed filled with nop instructions by the compiler or assembler. Hence, scheduling effects among integer instructions will be accurately reflected by the count of inThe expected overstructions added (s-mstructtons). head compute overhead. saved over- and we to cache is for measured successive runs showed insignificant reduced the of account pro- overhead variations executed and interlock the of size of the instruction this might for are We source predicts (s-tnstructtons), provide ear analytical ing creases 056. to of a number the additional lowing some If we assume at predicting the overwe can conclude that heads higher than predicted, it does not account the opposite effect. Because all of our benchmarks low. identify The 1 appears For on the DEC-ALPHA, time. surprisingly To Table example, execution of in anomalies. overhead. The DEC-MIPS has a physically indexed, physically tagged, direct mapped data cache. In our experiments sandboxing did not affect the size, contents, or starting virtual address of the data segment. For both original this counts. ber time the effective While due to virtual sandbox- 1 reports execution there is a second order effect creating the observed anomalies in measured overhead. We can discount effective instruction cache size and virtual memory paging as sources for the observed execution time variance. Because sandboxing adds in- sand- 3, sandboxing from average that the model is also accurate head of individual benchmarks, of using as store Column divided dict on on the by are of overhead Section cally significant. This experiment demonstrates that, by combining instruction count overhead and floating point interlock measurements, we can accurately pre- benchmark protection registers C time. pixie ing in time the bench- the overhead overheads DEC-MIPS, of each overhead as well these of sand- Splash 1 reports registers, execution tools As can be seen from Table 1, the model is, on average, effective at predicting sandboxing overhead. The differences between measured and expected overheads are normally distributed with mean 0.770 and standard deviation of 2.6?10. The difference between the means of the measured and expected overheads is not statisti- each sandboxing general detailed All execution ment treated 7 report instructions compiler. gram’s We provide of removing additional fault including of the 6 reports 5 dedicated the overhead module, 2 and As overhead several 1]. column load time 1 of Table Columns instructions. by and a distrusted technique enforced We discuss of C programs, Column ALPHA. our execution SWG9 were is the Overhead benchmarks [Ass91, as if software The model provides an effective way to separate known colsources of overhead from second order effects. umn 5 of Table 1 are the predicted overheads. how what in turn. boxing SPEC 92 Second, Third, application? Encapsulation 5.1 incur? RPC? All floating On point to a mean of our of floating the DEC- intensive of 5.6% benchmarks for are DEC-~lIPS Benchmark DEC-ALPHA Fault Protection Reserved Isolation Overhead Register Overhead Instruction Fault Fault Protection Count Isolation Isolation Overhead Overhead Overhead Overhead (predicted) Overhead 052. alvinn FP 1.4% 33.4% -0.3% 19.4% 0.2% 8.1% 35.5% bps FP 5.6% 15.5% -0.1% 8.9% 4.7% 20.3% cholesky FP 0.0% 22.7% 0.5% 6.5% 5.7% -1.5% 0.0% 9.3% 026.compress INT 3.3% 13.3% 0.0% 10.9% 4.4% -4.3% 0.0% 056, ear FP -1.2% 19.1% 0.2% 12.4% 2.2% 3.7% 18.3% 023.eqntott INT 2.9% 1.0% 2.7% 2.2% 2.3% 17.4% -1.6% 11.8% 10.5% 13.3% 33.6% -9.4% 17.0% 8.9% NA NA 11.4% 008.espresso INT 12.4% 34.4% 27.0% ooi.gcci.35 INT 3.1% 18.7% 022.li INT 5.1% 23.4’% 0.3% 14.9% 5.4% 16.2% locus INT 10.3% 13.3% 8.6% 4.3% 8.7% 8.7% 0.0% 6.7% psgrind qcd FP INT FP 30.4% 10.7% 4,3% mp3d 8.7% 10.7% 10.4% 1.3% 2.0% 12.1’% 0.5% 19.5% 27.0% 8.8% 9.9% 1.2% 36.0% 12.1% 072.sc tracker INT INT 5.6% 11.2% 7.0% -0.8% 10.5% 0.4% 8.0% 3.9% 3.8% 2.1% 8.0% -0.8% NA 10.9% 19.9% water FP Average Table 1: Sandboxing 072.sc are dependent isolation overhead compute intensive. amounts of 1/0 0.7% 7.4% 0.3% 6.7% 1.5% 4.3% 12.3% 21.8’%0 0.4’%0 10.5% 5.070 4.3% 17.6’Zo overheads for DEC-MIPS and DEC-ALPHA platforms. The benchmarks OOi.gccl.35 ona pointer size of32 bits and do not compile on the DEC-ALPHA. The predicted Programs incur that negative due to conservative perform significant less overhead. interlocking Fault Domain native operating trip time. These of We now turn to the cost of cross-fault-domain RPC. Our RPC mechanism spends most of its time saving and restoring registers. As detailed in Section 4, only space registers tude preserved are designated across addition, by the architecture procedure calls if no instructions modify a preserved saved. Table a NULL 2 reports then the times when Column 2 lists integer registers Column the saving registers. In many reduced by statically for times when the preserved cases crossing times partitioning In is 314 versions a 40Mhz of the For comparison, C procedure no value. address we First, the preserved times listed floating call that Second, spaces using takes we sent the two the time no arguments a single pipe byte abstraction that with highly reduced the tion in On than more [H K93]. to reduce the by an order of calls. RPC of 3.0, a local im- magnicross- 5000/200 procedure running on cross-address-space expensive Software orders Mach system, delivers cost orders DECstation operating two the of cross-address- two expensive is 73 times RPC optimized calls. roundlast procedure cost roughly Spring the the three local on a 25Mhz more is able domain point within The call is roughly have procedure in platforms, than SPARCstation2, procedure these systems unit. measured reported and fault than enforced relative cost of magnitude a local fault leaf isola- of cross-faultover these sys- tems. be further the registers measured we measured On calls RPC [Ber93]. RPC saved. between 5.3 domains. mechanisms. 2. floating-point and are expensive to times call 1 lists could RPC of local address-space domain are caller more Operating need to be Column Finally, all fault three registers crossing are saved. 3 include callee RPC. all data Table plementations to be to be saved. it does not times cross-fault-domain crossing need in the register system times of cross-address-space Crossing magnitude that on the MIPS the columns 5.2 NA 4.3% for choleskyis will 0.0% other calling to perform and To provided capture performance, a mains returns between Using and two 211 the effect we added to the PO STGRES measured benchmark by Fault Domains of our system software database POSTGRES [SFGIV193]. in The POSTGRES on enforced application fault management running Sequoia the Sequoia 2000 do- system, 2000 benchmark Cross Platform Fault-Domain Caller Save Save Integer l.llps 1.81ps 2.83ps O.lops 204.72ps 1.35ps 1.80ps 0.06ps 227.88ps 60989 18.6% 121986 38.6% Query 8 9.0% 2.7% 121978 31.2% Query 10 9.6% 5.7% 1427024 31.9% 3: Fault typical the saved loaded to instrumented formance kinds into the ing database four queries and using both code cross-fault-domain integer the code RPCS to registers count (the 2). the we could isolation only centage of the of of Table 3 presents the in POSTGRES based nism from head of the built-in untrusted ware enforced fault 3 are relative results. the to original trusted function manager. sured overhead of software by is our competing of (td), among fault (h), crossing In techniques code encapsulation operat- spaces. a function crossing of native address distrusted domain this time over the per- the per- domains and to the the ratio, crossing RPC hardware-based savzngs variant cor- In addition, we Figure = 5 graphically axis gives mecha- (1– T)tc– ht~ crossing among relative cost ing hardware on separate address over user-defined calling 1 lists reported POSTGRES the mechathe over- without overheads fault address ingly in attractive of fault using the un- plication Column 2 reports the meaenforced fault domains. Us- our when performance than its competitor. ing the number of cross-domain calls listed in Column 3 and the DEG-MIPS-PIPE time reported in Table 2, Column 4 lists the estimated overhead using conventional reported hardware MIPS as little as 212 is in fault in Section l.10~s time and 5.2, our for the enforced alternative. becomes increas- higher crossing only Applications 10YOof their 39’%0 improvement address spaces. need degrees if an ap- fault perform domains, 1O$ZObetter that currently crossing require domain crossing DEC-ALPHA the is 4.3Y0, For example, 3070 of its time mechanism the that code achieve 5). X cross- software isolation Figure reports Assuming better (see spends RPC spaces. as applications isolation Y axis of encapsulated fault The spends fault-domain illustrates is the Software-enforced soft- The enforced overhead trade-offs. an application domains. region isolation these of time of software time shaded fault depicts the percentage the per- manager All in spent of cross- Column domains. time number Untrusted function spent estimate use a separate functions. provided encapsulation using hardware mechanisms time fault execution functions savings overhead our time spaces. Table of and over nism. sand- For centage r, isolation. code. services the software savings RP C mechanism 2 in Table so that we user system (ic), We unprotected fault is trusted, loaded types. benchmark. experiment substantial hardware-based 2000 bench- data software-enforced Column of fault polygon 2000 POSTGRES general, to be a serious in the Sequoia the provided types Sequoia Analysis For a user- languages, been recognized queries preserved fault-domain 5.4 scien- these running POSTGRES [Sto88]. our the earth user-defined dynamically dynamically responding for provides PO STGRES Currently, has long POSTGRES experiment, by programming these the used To support use of user-defined the overhead conventional linking Since of those system. of the eleven boxed isolation climate. queries, and measured Overhead (predicted) 1.7% This dynamic Overhead DEC-MIPS-PIPE 1.8% in make Overhead Number Cross-Domain Calls 5.0% type Four Soft ware-Enforced Fault Isolation 7 problem mark Untrusted Function Manager Query written safety crossing times. 1.4% of non-traditional as C, Cross-fault-domain 6 queries manager. Call Query in studying such Registers o.75ps Table are Pipes DEC-ALPHA Sequoia 2000 Query extensible c Procedure Float DEC-MIPS Table2: tists Save Integer+ Registers Registers contains RPC crossing time for 0.75~s. spend only time. the a As DEC- Hence, 1 on% . .. 8$$ $5$3 .$$s$$ Percentage of Execution Time Spent Crossing Figure 5: ware The shaded enforced formance of time The speed region (r). (l–r)tc The curve = htd. overhead X represents axis the represents In this on the the crossing represents graph, soft- better per- fault domains relative RPC crossing the break Figure per- among h = 0.043 and DEC-MIPS when provides The spent Y axis represents isolation alternative. centage (t.). fault even ware 6: formance centage point: (t.). (encapsulation DEC-ALPHA). of time The speed ing latter time example, of 1.80ps software fault duction level domains. Figure = htd. transforms the fault pro- provides isolation 100%, this tion’s execution overhead mains to small regardless crossing supplied optimized performance stability performance modules stability allows effect to place 6 I applica- of isolation in separate fault Related systems RPC performance BALL90, BALL91]. considered [vvST88, TA88, Traditional RPC 5000 Software RPC the o% rel- to ignore of 10 o 20 solution. in choosing Bla90, Switch of vendor- # Crossings/Millisecond the which Figure domains. ways LJltrix 4.2 Context Fig- Work have I DECstation 7: optimizing SB90, systems ond on the DEC-MIPS, number is taken Table based 213 of fault ing text HK93, Percentage number of context Many DEC-ALPHA). 40% versus 6 point: (encapsulation behavior software developers and even re- behavior. illustrates the crossing break h = 0.043 that isolation typical graph system of fault as hardware-based The RPC the crossing of crossing times relative code. an fault of application are shown. such Figure 7 demonstrates enforced Crossing highly of domains enforced whenever as a function cost. and mechanisms Figure software overhead choice DEC-MIPS fault 50’%0 of software proportion time. due ure 7 plots best a significant that extension that graph, / applica- applications, in distrusted the entire assuming 6 illustrate is the the represents per- among on the no is conservative. figure, is spent is that In this per- represents axis crossing represents soft- X The curve when better the than as we know, For many previous 5 and overhead This 5 assumes assumption execution Figures at ive 1.23ps currently on the represents provides cross- performance far system was encapsulated. this and As The of performance. PO STGRES, total and better space region isolation spent Y axis (r). (1–r)t. address DEC-MIPS provide or experimental Further, tion on the would DEC-ALPHA a hardware shaded fault alternative. overhead for this The enforced switch 2. switch time times of time domain The from spent in crossings hardware crossing per minimum a cross-architectural [ALBL9 is as reported 1]. The in the Ultrix last code milliseccrossstudy 4.2 con- column of on hardware the minimal and two of the our fault hardware first LRPC both caller directly to in the callee the fault space Firefly (which reduce based which are fault fault typed language; space [RDH+ strong of is that all 80]. typing languages is that like programmers trast, C, Deutsch on and native to statically tection An verify interpreter the defines system the operating which speed code with In that Our not if it is not Anonymous low latency [YBA93]. safe), operating RPC exploits RPC Logically failure was a called distrusted yield the overhead of ten the faster escaping best incurs about this overall fault in isola- application RPC software-based 4~0 overhead optimizations software-based their encapsulation software-based kernel our situation, better that of hardware-based over from Despite code, Extensive communica- a software overhead. often tor use sandboxing, executing space, of magnitude modules we By address to date. distrusted time per- can reduce to within a fac- alternative. fault Even isolation performance choice whenever the hardware-based RPC is greater than will be overhead 570. Acknowledgements thank Peter Dave Patterson, their to be safely then probabdtstzc independent the and and on the profiling Paul tool Aoki for Hen- Lazowska, Oliver Mike Sharp, Stonebraker for paper. Jim Larus qpt. We also helping prothank us with POST- References cus- [ACD74] in at near spaces fault isoare Adam, 17(12):685–690, [ALBL91] to domains T.L. K.M. Chandy, and A comparison of list schedules cessing systems. Communications encapsulated address comments us with Olson Ousterhout, Smith John Ed operat- written executed John Alan Burrows, Lampson, GRES. system. 64-bit helpful Mike Butler utility in the code Bershad, Kessler, Sites, vided insulates faults Brian nessy, Richard For filter by the allows and mechanism domain, will an order fault modules. a singIe cross-fault-domain than for independent software within delivers execution We pro- isolation. interpreter possible and cooperating RPC mechanism language isolation prevent 8 it easier violate packet The approach con- in con- directly to make is interpreted from pre- to pro- no hardware a software-based is more fault of using allowed format did network language by the any the Modula-3, and executed designed driver. programming (or rejected lation UNIX system code. than in this to be dynamically module can also provide BSD network tomization provide a system The the that Mike a language ing full that choice Even and To trap independent. system code on boundaries. example, any object address isolation. modules [D Gi’1]. approach tion tion to use loopholes fault this formance, conventional assembly. need fault technique, kernel, the out must will domain probes. call however, programming among To tags. of relying described isolation own a strongly a single as Ada built the operating all restricts are language Grant processor stylized they measurement into the find perfor- on special- in Mesa, shares undercutting calls, can also be used ruling and TLB space requires severely such techniques user-defined loaded it C++, often rely disadvantage languages system, our then languages, strongly-typed type Pilot to have providing of software- as address kernel; domain other is needed. portable, however, integer languages main due or system it does not such code the do not, with to be written The was on advantage isolation. code programming the scaling It LRPC was tags management features and library tags) space domain caller probabilis- any of any memory a cross in the that location of the provides Summary We the switch. an is unlikely ad- anonymous, location This accidental hardware are than otherwise context of more code switch 7 to reduce allow TLB; cost programming provide user, Tags the important isolation architectural can be used have not An Restrictive to not Address mance[ALBL91]. ized tags the cost of register operations switches, or the side. the same domains reveal discover anonymity, text costs. do not to either malicious in the between – it to tected a dispatch context they callee serve locations Calls protection by are hardware on each does tic code jumps software- 25T0 of the misses [BALL90]. stubs is, be able in systems, share be flushed that or the runs the avoiding times. to one these crossing switch that domain, space. and found thread and the crossing identifier space must callee avoids context address estimated techniques same procedure, reducing hardware of the the limit, at random dress traps was this Unlike isolation Address TLB called domain. substantially one the as possible, L RPC placed by kernel approach systems: and as simple based to limited two switches. systems later ultimately of taking uses a number and the kept are cost cent ext RPC prototype in isolation hardware Thomas shad, Anderson, and Edward of Architecture 214 December J.R. Dickson. for parallel pro- of the A CM, 1974. Henry Lazowska. and Operating Levy, Brian Ber- “1 he “interaction“” System Design. In Proceedings ence Languages and April 120, [Ass91] of the ~th on Architectural International Support Operating for [HK93] Confer- Graham Hamilton and Panes Spring nucleus: A microkernel Programming pages 108- Systems, Proceedings 1991. Administrator: National Association. SPEC Computer Newsletter, Graphics [HKM+88] December 3(4), [ASU86] Alfred V. Aho, Compilers, Sethi, Techniques, Publishing and Company, Brian 1986. Bershad, Thomas and Henry Call. ACM February Systems, 8(l), Brian Bershad, Thomas and Henry Brian Bershad, terface Edward User-Level SIGA on of Mul- [Kar89] 1993. Private BaJl and James and Conference In on Principles pages 59-70, guages, David Black. rency R. Larus. tracing. and System. Support for Concur- in the Mach Operating Computer, ing Remote tions and Bruce Procedure on Computer Nelson. Calls. Systems, J.D. [McF89] Clark. Window Prentice-Hall, OLE/DDE. [DG71] [Dig] L. P. Deutsch tool for Congress, 1971. software A flexible R. port ing IFIP Corporation, Ultrix Peter on Desktop [FP93] Kevin Publwhing, for Xpress: Systems. Modular Seybold FaJl and Joseph PasquaJe. [HC92] Winter USENIX 333, January [MRA87] 1992. Exploiting in- David Cheriton. ment. Conference gramming October on Architectural Languages page-cache of the and 5th March optimization Proceedings on and A Wznter be- University of for of the Archztecturai Languages Filter: Rewrit- program 1992. Program Packet J. C. Mogul, The Sup- and April Operat- 1989. The Van Jacobsen. New Architecture Capture. In- for In Proceedings USENIX of January Conference, R. F. packet Rashid, filter: and An M. efficient J. Acmecha- for user-level network code. In Proceedthe Sgmposium on Operating System of pages 39-51, Principles, Karl the physi- Support for and code Conference sign manage- and Plains, International Operating Pettis guided and Application-controlled In Proceedings Ball. 1083, pages 183–191, cetta. ings [PH90] using external 1986. Report McCanne UNIX. USENIX November 1987. pages 327– Conference, Harty caJ memory Summer measure In Architecture in SUN Thomas to Conference Packet nism 1993. Keiran and files An 1993. kernel data paths to improve 1/0 throughput of the and CPU a vailabilit y. In Proceedings 1993 on Lan- Report June 6(10):1–21, to Optimize In Proceed- pages 194–204, Types 1986 Programming User-Level Xtensions for Custom Larus Systems, BSD Page. Dyson. Software the caches. for the 1993 [Dys92] of McFarling. Steven W/.2 225–235, Conference Vnodes: Technical Scott mea- In ACM Principles Programming pages 238-247, ternational systems. 12th on Systems, File System instruction To [MJ93] Equipment Manual Pixie Guide 1992. and C. A. Grant. surement Digit al Programmer’ in- distributed pages for Wisconsin-Madison, Febru- and An the International Support executable havior. ary 1984. [Cla92] of Symposium R. Kleiman. James Transac- 2(1):39–59, Rashid, for Languages, Proceedings ing Implement- ACM Proceedings %-d Conference, [LB92] Birrell F. Manual, 3-61989. Steven In 1990. Andrew the for Multiple May 23(5):35–43, of April [Kle86] Scheduling California. Matchmaker: guages and Operating the Lan- 1992. Parallelism IEEE of of Programming 1988. 1985. Architectural OptimaJly Proceedings Clara, Reference language Programming on February Richard CT- SIGPLAN ings Thomas In a Dis- Transactions Santa Jones, and in Paul A. Karger. Using Registers Cross-Domain CaJl Performance. Commu- nication. profiling ACM 6(1):51-82, specification January Computer Performance Thompson. processing. La- Interpro- 1991. August R. D. Nichols, Sidebotham, Programmer’s B. Mary 1990. Transactions May Michael [JRT85] Com- for Shared-Memory ACM 9(2), La- Remote on Anderson, Levy. cess Communication Edward Lightweight Transactions puter Systems, Anderson, Levy. R. and Confer- 1993. S. Menees, System. Corporation, [Int86] USENIX June Sgstems, 803’86 tiprocessors. [BN84] File Computer Intel zowska, [Bla90] Scale Intel Procedure [BL92] West. tributed D. Unm- Summer M. Kazar, 1986. zowska, [Ber93] and Jeffrey Principles, Addison-Wesley Tools. [BALL91] Ravi the Satyanarayanan, M. an. [BALL90] J. Howard, M. 1991. of pages 147–159, ence, Kougiouris. The for objects. In C. Hansen. In York, Language of De- pages 16–27, 1990. Appeared as K. DaJaJ, Thomas R. June NOTICES Profile Proceedings on Programming Implementation, New SIGPLAN Robert positioning. White 25(6j. Pro- [RDH+ Systems, 80] David Horsley, 1992. 215 D. RedelJ, Hugh C. Yogen Lauer, William C. Lynch, Paul R. McJones, Stephen C, Purcell. temfora Personal A. Dain Samples. struction caches. 88/447, [SB90] and Protection Sys- ceedings Computer. Code June Communications February 1980. reorganization TechnicaJ University tober Murray, An Operating G. 23(2):81–92, of the ACM, [Sam88] Hal Pilot: Report of California, for in- UCB/CSD Berkeley, Oc- 1988. Michael Schroeder formance tions of on and Michael Firefly Burrows. RPC. Computer ACM Per- Tmnsckc- 8(1):1-17, S~stems, Febru- ary 1990. [SCK+ 93] Richard Kirk, son. (SFGM931 L. Sites, Anton Maurice Binary ACM, M. Stonebraker, Proceedings national May [Sto87] The Frew, of the Gardels, 2000 ACM on of 1993. K. Sequoia Conference B. G. Robin- Cornmunzcat~ons February J. J. Meridith. Matthew and Scott translation. 36(2):69–81, the In Chernoff, P. Marks, and Benchmark. SIGMOD Interof Data, Management 1993. Michael Stonebraker. GRES. IEEE Extensibility Database in POSTSeptem- Engineering, ber 1987. [Sto88] Michael Stonebraker. relational braker, editor, Readings pages 480–487. Inc., [SWG91] J. P. Singh, W. Shin-Yuan Tzou System. 88/452, Computer of California, Thinking von [VVST88] Robbert David P. Anderson. of the DASH A Message- Report UCB/CSD University October D. 1988. CM- 5 Net- Corporation, Guide, Culler, Communication of the Computer- systems and 19th Computa- Annual Symp- 1992. Hans van St averen, Review, and A Mechanism Architecture, van Renesse, 1992. S. Goldstein, Messages: S. Tanenbaum. Fastest Distributed (@wzt;?zg for CSL-TR-91- Division, In Proceedings Andrew World’s Gupta. Science Active Integrated on Report Programmer’s Eicken, osium A. Technical Machines Interface tion, and Berkeley, K. Schauser. for applications Evaluation Passing work and 1991. Performance T. Publishers, parallel Technical 469, Stanford, [vCGS92] Kaufmann StoneSystems, Weber, Stanford shared-memory. [Thi92] Database in Morgan of new types in In Michael 1988. Splash: [TA88] Inclusion data base systems. Performance Operating 22(4) and of the System. Octo- ;25–34, ber 1988. [Web93] Neil Webber. Portable ings of the January [YBA93] Curtis Operating Filesystem 1993 System Extensions. Winter USENIX Support for In ProceedConference, 1993. Yarvin, Anderson. Richard Anonymous Bukowski, RPC: and Thomas Low Latency 216 in a 64-Bit of the 1993. Summer Address USENIX Space. In Pro- Conference,