Lecture XVIII: Concluding Remarks
CMPT 401 Summer 2007
Dr. Alexandra Fedorova
Outline
• Discuss A Note on Distributed Computing by Jim Waldo et
al.
• Jim Waldo:
– Distinguished Engineer at Sun Microsystems
– Chief architect of Jini
– Adjunct professor at Harvard
CMPT 401 Summer 2007 © A. Fedorova
A Note on Distributed Computing
• Distributed computing is fundamentally different from
local computing
• The two paradigms are so different that it would be very
inefficient to try to make them look the same
– You’d end up with distributed applications that aren’t robust to
failures
– Or with local applications that are more complex than they need
to be
• Most programming environments for DS attempt to mask
the difference between local and remote invocation
– But this is not what’s hard about distributed computing…
Key Argument
• Achieving interface transparency in distributed systems is
unreasonable
– Distributed systems have different failure modes than local
systems
– Handling those failures properly requires a certain interface
– Therefore, distributed systems must be accessed via
different interfaces
– Those interfaces would be overkill for local systems
Differences Between Local and
Distributed Applications
• Latency
• Memory access
• Partial failure and concurrency
Latency
• A remote method call takes longer to execute than a local
method call
• If you build your application without taking this into
account, you are doomed to have performance problems
• Suppose you disregard local/remote differences:
– You build/test your application using local objects
– You decide later which objects are local and which are remote
– You find out that if frequently accessed objects are remote, your
performance sucks
Latency (cont.)
• One way to overcome the latency problem:
– Make available tools that allow the developer to debug
performance
– Understand what components are slowing down the system
– Make recommendations about the components that should
be local
• But can we be sure that such tools would be available?
(Do you know of a good one?) This is an active research
area – this means that this is hard!
Memory Access
• A local pointer does not make sense in a remote address
space
• What are the solutions?
– Create a language where all memory access is managed by a
runtime system (e.g., Java) – everything is a reference
• But not everyone uses Java
– Force the programmer to access memory in a way that does
not use pointers (in C++ you can do both)
• But not all programmers are well behaved
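The point that a local pointer is meaningless in a remote address space can be illustrated with a minimal sketch. It assumes a hypothetical linked list of integers: instead of sending the `next` pointers, the sender copies the values into a flat array, and the receiver rebuilds the list with its own pointers. The names `marshal_list`, `unmarshal_list`, and `free_list` are invented for this illustration, and byte order and actual network transfer are deliberately ignored.

```c
#include <stddef.h>
#include <stdlib.h>

/* A list node: the `next` pointer is only valid in this address space. */
struct node {
    int value;
    struct node *next;
};

/* Release every node of a list. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* Copy the list's values (not its pointers) into out[0..cap-1].
 * Returns the number of values written, or -1 if out is too small. */
int marshal_list(const struct node *head, int *out, int cap)
{
    int n = 0;
    for (; head != NULL; head = head->next) {
        if (n == cap)
            return -1;
        out[n++] = head->value;
    }
    return n;
}

/* Rebuild the list from n values, allocating fresh nodes so that every
 * pointer is valid in the receiver's address space.
 * Returns NULL for an empty list or if allocation fails. */
struct node *unmarshal_list(const int *in, int n)
{
    struct node *head = NULL;
    for (int i = n - 1; i >= 0; i--) {
        struct node *nd = malloc(sizeof *nd);
        if (nd == NULL) {
            free_list(head);
            return NULL;
        }
        nd->value = in[i];
        nd->next = head;
        head = nd;
    }
    return head;
}
```

The receiver ends up with a copy, not the original object, which is exactly the difference a transparent interface would try to hide.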
Memory Access and Latency: The Verdict
• Conceptually, it is possible to mask the difference between local and
distributed computing w.r.t. memory access and latency
• Latency:
– Develop your application without consideration for object
locations
– Decide on object locations later
– Rely on good debugging tools to determine the right location
• Memory access
– Enforce memory access through the underlying management
system
• But masking this difference is difficult, and so it’s not clear whether
we can realistically expect it to be masked
Partial Failure
• One component has failed while others keep operating
• You don’t know how much of the computation has
actually completed – this is unique to distributed systems
– Has the server failed or is it just slow?
– Did it update my bank account before it failed?
• With local computing, a function can also fail, or a system
may block or deadlock, but
– You can always find out what’s happening by asking the operating
system or the application
– In distributed computing, you cannot always find out what
happened, because you may be unable to communicate with the
entity in question
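The bank-account question above can be made concrete with a small simulation. This is only a sketch under stated assumptions: the "server" is a local struct, the crash is injected through a hypothetical `crash_point` parameter, and all names (`remote_deposit`, `CALL_NO_REPLY`, and so on) are invented for illustration. It shows that a crash before the update and a crash after the update look identical to the caller.

```c
/* Where the simulated server crashes during a deposit (hypothetical). */
enum crash_point { NO_CRASH, CRASH_BEFORE_UPDATE, CRASH_AFTER_UPDATE };

/* What the caller can observe: a reply, or silence. */
enum call_result { CALL_OK, CALL_NO_REPLY };

/* Server-side state that the caller cannot inspect directly. */
struct server {
    long balance;
};

/* Simulated remote deposit. The caller sees only the return value. */
enum call_result remote_deposit(struct server *s, long amount,
                                enum crash_point crash)
{
    if (crash == CRASH_BEFORE_UPDATE)
        return CALL_NO_REPLY;   /* crashed: balance untouched */

    s->balance += amount;

    if (crash == CRASH_AFTER_UPDATE)
        return CALL_NO_REPLY;   /* crashed: balance already changed */

    return CALL_OK;
}
```

In both crash cases the caller gets `CALL_NO_REPLY`, yet the balance differs, so retrying blindly risks a double deposit and not retrying risks a lost one.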
Concurrency
• Aren’t local multithreaded applications subject to the same
issues as distributed applications?
• Not quite:
– In local programming, a programmer can always force a certain
order of operations
– In distributed computing this cannot be done
– In local programming, the underlying system provides
synchronization primitives and mechanisms
– In distributed systems, this is not easily available, and the system
providing the synchronization infrastructure may fail
So What Do We Do?
• Design the right interfaces
• Interfaces must allow the programmer to handle errors that
are unique to distributed systems
• For example: a read() system call:
– Local interface:
int read(int fd, char *buf, int size)
– Remote interface:
int read(int fd, char *buf, int size, long timeout)
Error codes are expanded to indicate timeout or network failure
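One way the remote signature above might be realized is sketched below using POSIX `poll()`. The function name `remote_read`, the error codes `RREAD_TIMEOUT` and `RREAD_FAILURE`, and the choice of milliseconds for the timeout are assumptions made for this illustration; the slide does not specify them.

```c
#include <errno.h>
#include <limits.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical expanded error codes; not part of any standard API. */
#define RREAD_TIMEOUT (-2)  /* nothing arrived within timeout_ms */
#define RREAD_FAILURE (-3)  /* the descriptor reported an error */

/* Like read(), but gives up after timeout_ms milliseconds
 * (a negative timeout waits indefinitely, as with poll()).
 * Returns the number of bytes read (0 at end of stream),
 * -1 with errno set on a local error, or an RREAD_* code. */
int remote_read(int fd, char *buf, int size, long timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int wait_ms = timeout_ms > INT_MAX ? INT_MAX : (int)timeout_ms;
    int ready;

    /* Retry if a signal interrupts the wait. For simplicity this
     * restarts the full timeout instead of tracking elapsed time. */
    do {
        ready = poll(&pfd, 1, wait_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;
    if (ready == 0)
        return RREAD_TIMEOUT;
    if (pfd.revents & (POLLERR | POLLNVAL))
        return RREAD_FAILURE;

    /* POLLIN or POLLHUP: read() returns data, or 0 at end of stream. */
    return (int)read(fd, buf, (size_t)size);
}
```

A caller has to branch on the extra codes, for example `if (n == RREAD_TIMEOUT) { /* retry or give up */ }`, which is precisely the burden a purely local `read()` caller never carries.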
But Wait… Can’t You Unify Interfaces?
• Can’t you use the beefed-up remote interface even when
programming local applications?
• Then you don’t need to have different sets of interfaces
• You could, but
– Local programming would become a nightmare
– This defeats the purpose of unifying local and distributed
paradigms: instead of making distributed programming
simpler you’d be making local programming more complex
So What Does Jim Suggest?
• Design objects with local interfaces
• Add an extension to the interface if the object is to be
distributed
• The programmer will be aware of the object’s location
• How is this actually done? Recall RMI:
– A remote object must implement the Remote interface
– Every remote method must declare RemoteException, and the
caller must handle it
– But the same object can be used locally, without specifying that it
implements Remote
Summary
• Distributed computing is fundamentally different from
local computing because of different failure modes
• By making distributed interfaces look like local interfaces,
we are diminishing our ability to properly handle those
failures – this results in brittle applications
• To handle those failures properly, interfaces must be
designed in a certain way
• Therefore, remote interfaces must be different from local
interfaces (unless you want to make local interfaces
unnecessarily complicated)