Network Objects
Presenter: Dan Williams

Trends
• Network centric view of world
    • Jini, Web Services
    • Based on Object Oriented models
• Both papers contributed some ideas
• Seems to be future even if we’re not quite there yet!

Communication
• RPC
    • Doesn’t always fit our intuition
• Would like to access named objects
    • Look like local objects to programmer
• May not care who receiver/worker is
• We want a simple model!

Linda
• New model for parallel programming
• Simple, Elegant
• Linda in Context. Carriero and Gelernter.

Motivation: Linda
• Message-passing
• Concurrent OO programming
• Instance of message passing!
• Concurrent logic languages
• Not simple, not flexible!
• Not simple!
• Functional programming languages
• Not always suited to problem!

Basic Idea: Linda
[Diagram: Process 1 and Process 2 around a Tuple Space holding tuples such as (“a string”, 15.01, 17, “another string”) and (0, 1)]
• in / rd
• out / eval

Linda Operations
• in = take a tuple (block until match)
    in(“a string”, ? F, ? I, “another string”)
• rd = same as in but leave tuple
• out = output a tuple
    out(“a string”, 15.01, 17, “another string”)
• eval = output a live tuple
    eval(“M”, i, j, compute(i, j))
    • New process to compute something
    • Becomes data tuple upon completion

Linda vs. Concurrent Obj
• Concurrent objects use monitors
• Simplicity
    • Monitors need process forking, shared state variables, condition queue/signal
    • Synchronous communication not norm
• Flexibility
    • All elements of distributed data structure must reside inside monitor (restrict access)

Linda vs. Concurrent Logic
• Parallel conjunction
• Guarded clauses
• Shared logical variables
• Merge Problem
    • Multiple clients: explicit merge
• Dining philosophers
    • Much simpler in Linda

Linda vs. Functional Prog
• Specification for parallel compiler
• “Interpretive abstraction”
• Parallelism: compiler or programmer?
• Don’t have every process send an individual result
• Fill some distributed data structure (freedom)
• Recursion equations not always helpful in understanding distributed problems
    • Search may lend itself to recursion equations
    • Comparing searches does not

Conclusion
• Linda is wonderful
• Or is it?

Tuple Spaces: Location
[Diagram: nodes A and B, and a node with the tuple space]
• Where is it located: 3rd party?
• Large tuples?

Tuple Spaces: Location
• Can be lots of communication overhead
• Single point of failure
    • Large numbers of clients/buggy clients
    • What if machine with tuple space crashes?
    • What if tuple space fills up?
• Solution
    • Distributed tuple space?
    • Linda doesn’t look so simple/elegant anymore

Tuple Spaces: Implementation
• How is the tuple space organized?
    • Efficiently locate tuples
    • Large number of tuples in space?
• How is the tuple space managed?
    • Send request to tuple space for tuple space manager to search?
    • Will tuple space need to spawn tons of threads?

Other Tuple Space Issues
• Tuple space memory leaks
    • Program outputs a tuple, forgets about it
    • If intended receiver crashed, tuple could remain a long time
    • How can tuple space manager know how to get rid of these old tuples?
    • Leases?
• Security/Debugging
    • Sender does not need to know receiver
    • Malicious/Buggy receiver can take tuples not intended for it
        • Responsibility offloaded to receiver
        • How do you debug a system if tuples are “disappearing” in this manner?
    • Malicious/Buggy sender can introduce bogus tuples
        • How do you debug a system if tuples are “unexpected”?

Conclusions
• Linda looks really simple and elegant
• Thinking about how to actually implement a system using Linda
    • Hidden complexities
• Correct model of computation?
    • Linda not in widespread use

Linda Today
• Jini/JavaSpaces
Figure 1. Processes use spaces and simple operations to coordinate activities. Copyright Sun Microsystems, Inc.
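
The Linda operations listed earlier (in, rd, out, eval) can be sketched in a few lines of Python. This is a rough single-process illustration, not how any real Linda system is built: the names TupleSpace, Formal, in_ and eval_ are hypothetical (in is a Python keyword), a thread stands in for the new process that eval creates, and the timeout parameter is an added assumption so that a blocked caller can give up.

```python
import threading


class Formal:
    """Typed wildcard, standing in for Linda's `? F` formal (hypothetical name)."""

    def __init__(self, type_):
        self.type_ = type_


class TupleSpace:
    """Toy in-memory tuple space; one lock guards a plain list of tuples."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        """Output a data tuple and wake any blocked readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def eval_(self, *fields):
        """Output a live tuple: callable fields are computed in a new thread,
        and the result becomes an ordinary data tuple upon completion."""
        def run():
            self.out(*(f() if callable(f) else f for f in fields))
        worker = threading.Thread(target=run, daemon=True)
        worker.start()
        return worker

    def _match(self, pattern, tup):
        if len(pattern) != len(tup):
            return False
        for p, v in zip(pattern, tup):
            if isinstance(p, Formal):
                if not isinstance(v, p.type_):
                    return False
            elif type(p) is not type(v) or p != v:
                return False
        return True

    def _find(self, pattern):
        for tup in self._tuples:
            if self._match(pattern, tup):
                return tup
        return None

    def _wait(self, pattern, remove, timeout):
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout)
            if not found:
                raise TimeoutError("no matching tuple")
            tup = self._find(pattern)  # lock is held, so the match is still there
            if remove:
                self._tuples.remove(tup)
            return tup

    def rd(self, *pattern, timeout=None):
        """Block until a tuple matches, return it, leave it in the space."""
        return self._wait(pattern, remove=False, timeout=timeout)

    def in_(self, *pattern, timeout=None):
        """Block until a tuple matches, then take it out of the space."""
        return self._wait(pattern, remove=True, timeout=timeout)


ts = TupleSpace()
ts.out("a string", 15.01, 17, "another string")
read = ts.rd("a string", Formal(float), Formal(int), "another string")
taken = ts.in_("a string", Formal(float), Formal(int), "another string")
ts.eval_("M", 1, 2, lambda: 1 + 2)
result = ts.in_("M", 1, 2, Formal(int), timeout=5)
```

Even this toy version runs into the questions raised above: every match is a linear scan, the whole space sits behind one lock on one node, and a tuple that nobody ever takes stays in the list forever.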
Motivation: Network Objects
• Object Oriented model
    • Clients access state through methods
• Intuitively fits with distributed computing
    • Method calls = Communication
• Details addressed
    • How to implement
• Network Objects. Birrell et al.

Which features do we want?
• Focus on features believed to be of value to all distributed applications
    • Powerful marshaling
    • Strong type-checking
    • Garbage collection
    • Streams

Local Obj vs. Remote Obj
[Diagram: a Client calls a Server in two cases, method invocation on local object and method invocation on remote object. Transparent!]

Basic Idea: Network Objects
• Stub
    • Marshals using Pickles
        • General purpose (efficient and compact)
    • Network objects passed by reference
    • Other objects passed by copying
• Surrogate object
    • Methods perform RPC to owner

Object Types
• T: pure object type (only declares methods)
• TSrg: subtype of T with overridden methods to perform RPC
• TImpl: subtype of T corresponding to the owner object (including data fields)

Overview
[Diagram, built up over several slides: Client, Owner, Obj, Obj Ref, Stub, Dispatcher, Surrogate Object, call]
• Stub
    • Unmarshal Ref
    • Select Transport
    • Select Srg Type

Garbage Collection
• Exported Objects cannot be collected if surrogates exist somewhere
• Synchronously make dirty call upon creation
• Make clean calls when local garbage collector picks up surrogate
• If dirty set is empty, exported object can be collected
• What if dirty client disappears?

Transports
• Many protocols for communication
    • TCP, UDP, shared mem, etc.
• Transports generate and manage connections
    • Returns Location
    • Creates new connections to an address space
    • c.rd, c.wr

Marshaling / Unmarshaling
• Marshaling
    • Object marked as exported
• Unmarshaling
    • New Surrogate
        • Locate owner
        • Select transport that both share (RPC)
        • Determine surrogate type
            • Narrowest surrogate rule
            • Get owner’s stub types during Dirty call

Efficiency

Reliability Issues
• Similar to RPC issues
• Failure semantics not well-defined
    • Operations occur “exactly once”
    • How can you make this guarantee?
• How can you add reliability?
    • Inside network object abstraction
    • Lose simplicity and similarity to local objects

Transparency/Scalability
• More exceptions to deal with
• Latency Issues
• Object granularity
    • Should objects be large/small?
• Large number of objects
    • Local calls don’t fail like remote calls
    • Lots of copying throughout network for surrogates
• Large number of clients
    • Will owner get overloaded? Replicate object?
    • What about consistent state among replicas?

Other Issues
• Semantics still not powerful enough
    • Multicast not primitive

Network Objects Today
• Web services
    • Take solution from middleware and deploy on internet
    • Are Web services really distributed objects?
    • More centered on documents than objects
• CORBA
    • Heavyweight to deal with all the issues
    • Address fault-tolerance, scalability

Conclusions
• Some models are very pretty
    • Linda
        • Simple: few primitives
        • Hidden details in Tuple Spaces
    • Network Objects
        • Convenient OO model for communication
        • Semantic (Reliability) issues in RPC carry over
• Need to think about all the intricacies that arise in distributed programs
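
The dirty/clean bookkeeping from the Garbage Collection slide and the narrowest surrogate rule from the Marshaling / Unmarshaling slide can be sketched together. This is a rough illustration under stated assumptions, not the paper's implementation: Owner, Client, choose_surrogate_type and the type names are all hypothetical, direct method calls stand in for RPC, and the owner's stub types are assumed to arrive as a list ordered from narrowest to widest.

```python
class Owner:
    """Owner-side record for one exported object (hypothetical names)."""

    def __init__(self, obj, stub_types):
        self.obj = obj
        self.stub_types = list(stub_types)  # assumed order: narrowest first
        self.dirty_set = set()              # clients currently holding surrogates

    def dirty(self, client_id):
        """Dirty call: register the client and return the owner's stub types."""
        self.dirty_set.add(client_id)
        return list(self.stub_types)

    def clean(self, client_id):
        """Clean call: the client's surrogate has been garbage collected."""
        self.dirty_set.discard(client_id)

    def collectable(self):
        """The exported object can be collected only if the dirty set is empty."""
        return not self.dirty_set


def choose_surrogate_type(owner_types, client_stubs):
    """Narrowest surrogate rule: the first (narrowest) owner type for which
    the client also has a surrogate stub."""
    for type_name in owner_types:
        if type_name in client_stubs:
            return type_name
    raise LookupError("no surrogate type shared by owner and client")


class Client:
    def __init__(self, client_id, known_stubs):
        self.client_id = client_id
        self.known_stubs = set(known_stubs)
        self.surrogates = {}  # id(owner) -> chosen surrogate type name

    def unmarshal(self, owner):
        """Create a surrogate on first sight of a reference; the dirty call is
        made synchronously before the surrogate is used."""
        key = id(owner)
        if key not in self.surrogates:
            owner_types = owner.dirty(self.client_id)
            self.surrogates[key] = choose_surrogate_type(
                owner_types, self.known_stubs)
        return self.surrogates[key]

    def drop(self, owner):
        """Called when the local collector picks up the surrogate."""
        if self.surrogates.pop(id(owner), None) is not None:
            owner.clean(self.client_id)


owner = Owner(obj=object(), stub_types=["BufferedFile", "File", "Root"])
client = Client("client-1", known_stubs={"File", "Root"})
chosen = client.unmarshal(owner)      # client lacks a BufferedFile stub
held = owner.collectable()            # surrogate exists, so not collectable
client.drop(owner)
released = owner.collectable()        # dirty set is empty again
```

The sketch leaves the slide's open question unanswered: if a client crashes without making its clean call, its entry stays in dirty_set and the exported object is never collected.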