Lecture XVIII: Concluding Remarks
CMPT 401 Summer 2007
Dr. Alexandra Fedorova

Outline
• Discuss "A Note on Distributed Computing" by Jim Waldo et al.
• Jim Waldo:
  – Distinguished Engineer at Sun Microsystems
  – Chief architect of Jini
  – Adjunct professor at Harvard

CMPT 401 Summer 2007 © A. Fedorova

A Note on Distributed Computing
• Distributed computing is fundamentally different from local computing
• The two paradigms are so different that it would be very inefficient to try to make them look the same
  – You’d end up with distributed applications that aren’t robust to failures
  – Or with local applications that are more complex than they need to be
• Most programming environments for distributed systems (DS) attempt to mask the difference between local and remote invocation
  – But this is not what’s hard about distributed computing…

Key Argument
• Achieving interface transparency in distributed systems is unreasonable
  – Distributed systems have different failure modes than local systems
  – Handling those failures properly requires a certain interface
  – Therefore, distributed systems must be accessed via different interfaces
  – Those interfaces would be overkill for local systems

Differences Between Local and Distributed Applications
• Latency
• Memory access
• Partial failure and concurrency

Latency
• A remote method call takes longer to execute than a local method call, often by several orders of magnitude
• If you build your application without taking this into account, you are doomed to have performance problems
• Suppose you disregard local/remote differences:
  – You build/test your application using local objects
  – You decide later which objects are local and which are remote
  – You find out that if frequently accessed objects are remote, your performance sucks

Latency (cont.)
• One way to overcome the latency problem:
  – Make tools available that allow the developer to debug performance
  – Understand which components are slowing down the system
  – Make recommendations about which components should be local
• But can we be sure that such tools would be available? (Do you know of a good one?) This is an active research area, which means that it is hard!

Memory Access
• A local pointer does not make sense in a remote address space
• What are the solutions?
  – Create a language where all memory access is managed by a runtime system (e.g., Java), so that everything is a reference
    • But not everyone uses Java
  – Force the programmer to access memory in a way that does not use pointers (C++ permits both styles)
    • But not all programmers are well behaved

Memory Access and Latency: The Verdict
• Conceptually, it is possible to mask the difference between local and distributed computing with respect to memory access and latency
• Latency:
  – Develop your application without consideration for object locations
  – Decide on object locations later
  – Rely on good debugging tools to determine the right location
• Memory access:
  – Enforce memory access through the underlying management system
• But masking this difference is difficult, so it’s not clear whether we can realistically expect it to be masked

Partial Failure
• One component has failed while others keep operating
• You don’t know how much of the computation has actually completed; this is unique to distributed systems
  – Has the server failed or is it just slow?
  – Did it update my bank account before it failed?
• With local computing, a function can also fail, or a system may block or deadlock, but
  – You can always find out what’s happening by asking the operating system or the application
  – In distributed computing, you cannot always find out what happened, because you may be unable to communicate with the entity in question

Concurrency
• Aren’t local multithreaded applications subject to the same issues as distributed applications?
• Not quite:
  – In local programming, a programmer can always force a certain order of operations
  – In distributed computing, this cannot be done
  – In local programming, the underlying system provides synchronization primitives and mechanisms
  – In distributed systems, these are not easily available, and the system providing the synchronization infrastructure may itself fail

So What Do We Do?
• Design the right interfaces
• Interfaces must allow the programmer to handle errors that are unique to distributed systems
• For example, a read() system call:
  – Local interface: int read(int fd, char *buf, int size)
  – Remote interface: int read(int fd, char *buf, int size, long timeout)
  – Error codes are expanded to indicate timeout or network failure

But Wait… Can’t You Unify Interfaces?
• Can’t you use the beefed-up remote interface even when programming local applications?
• Then you don’t need to have different sets of interfaces
• You could, but
  – Local programming would become a nightmare
  – This defeats the purpose of unifying local and distributed paradigms: instead of making distributed programming simpler, you’d be making local programming more complex

So What Does Jim Suggest?
• Design objects with local interfaces
• Add an extension to the interface if the object is to be distributed
• The programmer will be aware of the object’s location
• How is this actually done?
Recall RMI:
  – A remote object must implement an interface that extends Remote
  – Every remote method must declare RemoteException, and the caller must catch (or propagate) it
  – But the same object can be used locally, without specifying that it implements Remote

Summary
• Distributed computing is fundamentally different from local computing because of different failure modes
• By making distributed interfaces look like local interfaces, we diminish our ability to handle those failures properly, which results in brittle applications
• To handle those failures properly, interfaces must be designed in a certain way
• Therefore, remote interfaces must be different from local interfaces (unless you want to make local interfaces unnecessarily complicated)
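
The RMI convention recalled under "So What Does Jim Suggest?" can be illustrated with a small sketch. Only java.rmi.Remote and java.rmi.RemoteException are real library types here; Counter, RemoteCounter, CounterImpl, FlakyStub, and PartialFailureDemo are hypothetical names made up for illustration. FlakyStub is an assumed stand-in for a real RMI stub: nothing is exported or looked up in a registry, and the "network failure" is simulated in-process so the example runs anywhere.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Local interface (hypothetical): no distributed failure modes in the signature.
interface Counter {
    int increment();
}

// Remote extension (hypothetical): extends Remote, and every method
// declares RemoteException so callers must confront partial failure.
interface RemoteCounter extends Remote {
    int increment() throws RemoteException;
}

// One implementation can serve both roles; used through Counter,
// the caller never sees Remote or RemoteException.
class CounterImpl implements Counter, RemoteCounter {
    private int value;

    @Override
    public int increment() {
        return ++value;
    }
}

// Assumed stand-in for an RMI stub: forwards calls to the "server" object
// and can simulate a reply that is lost after the work was done.
class FlakyStub implements RemoteCounter {
    private final CounterImpl server = new CounterImpl();
    private boolean loseNextReply;

    void loseNextReply() {
        loseNextReply = true;
    }

    @Override
    public int increment() throws RemoteException {
        int result = server.increment();      // the server-side update happens...
        if (loseNextReply) {
            loseNextReply = false;
            throw new RemoteException("simulated lost reply"); // ...but the reply never arrives
        }
        return result;
    }
}

class PartialFailureDemo {
    // The remote interface forces the caller to decide what a failure means.
    static String callRemote(RemoteCounter counter) {
        try {
            return "ok:" + counter.increment();
        } catch (RemoteException e) {
            return "unknown"; // the caller cannot tell whether increment() ran
        }
    }

    public static void main(String[] args) {
        Counter local = new CounterImpl();
        System.out.println(local.increment());    // local use, no try/catch needed

        FlakyStub stub = new FlakyStub();
        System.out.println(callRemote(stub));     // ok:1
        stub.loseNextReply();
        System.out.println(callRemote(stub));     // unknown
        System.out.println(callRemote(stub));     // ok:3, so the "failed" call did execute
    }
}
```

The jump from 1 to 3 is the partial-failure question from the bank account example: the caller that saw "unknown" had no way of knowing the update was applied. The local Counter interface stays free of this concern, which is the separation the paper argues for.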