Distributed Systems Basics (Chapters 4 and 5)

Basic Concepts in Distributed Systems
- Motivation for Distributed Systems
  - Resource sharing.
  - Performance enhancement.
  - Availability and reliability.
  - Modular expandability.
- Distributed System Issues
  - Global knowledge
    - Knowledge of the state of the system
    - Lack of a common clock and of a total ordering of events
  - Naming (name service)
    - Mapping from a logical name to a physical address
  - Scalability and system growth
  - Compatibility and interoperability
    - Binary-level compatibility
      - Common processor binary instructions
      - Requires a common machine-level instruction set and architecture
      - Not very common in larger distributed systems
    - Execution-level compatibility
      - The same source code can be compiled and executed on computers throughout the system
      - Generally, the various computers run the same or a very similar OS
    - Protocol-level compatibility
      - Computers within the system support a set of common protocols
      - Computers in the system can run different operating systems, as long as they share protocols such as Sun's NFS, FTP, etc.
  - Process synchronization
    - Complicated by the lack of shared memory and a common clock
  - Resource management
    - Data migration
      - Distributed file system
      - Network transparency
      - Distributed shared memory and virtual address space
    - Computation migration
      - Remote Procedure Calls (RPC)
  - Distributed scheduling
    - Maximizing performance is desired
  - Security
    - Authentication
    - Authorization
  - Structuring
    - Monolithic kernel
      - Seldom used, given the multi-functionality of modern computer systems
    - Collective kernel
      - Most commonly used
      - Modular design
      - A microkernel provides the most basic OS services (e.g., task management, processor management, virtual memory management) and supports communication between OS processes
      - Allows a customized OS to fit the application
    - Object-oriented
    - Client-server
- OS Communication Basics
  - Message passing
    - SEND and RECEIVE primitives
    - Buffered
      - Messages are copied three times: from the user buffer to the kernel buffer, from the kernel buffer of the sending computer to the kernel buffer of the receiving computer, and from there to a user buffer
    - Unbuffered
      - Data is copied directly from one user buffer to another user buffer
    - Nonblocking
      - SEND: returns control to the user process as soon as the message is copied from the user buffer to the kernel buffer
      - RECEIVE: provides a buffer into which the message is copied; the process may poll to check whether the message is in the buffer, or the kernel can signal the process that the message has been received
      - Primary advantage: programs have maximum flexibility to perform computation and communication in any order they want
      - Significant disadvantage: program design, implementation, and verification are difficult
      - Used in: producer-consumer relationships
    - Blocking (contrasted with nonblocking in the sketch after this list)
      - SEND: does not return control to the user program until the message has been sent (unreliable blocking) or until an acknowledgement has been received (reliable blocking)
      - RECEIVE: does not return control until the message is copied into the user buffer
      - Predictable behavior and handshaking make the system easier to program
      - Lack of concurrency between computation and communication
    - Synchronous vs. asynchronous
      - Synchronous: SEND is blocked until the corresponding RECEIVE is executed at the receiving computer
      - This pairing is known as a rendezvous
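
A minimal sketch of the nonblocking/blocking contrast, using Python's standard queue module to stand in for the kernel message buffer (the names channel and sender are illustrative, not from the text):

```python
import queue
import threading
import time

channel = queue.Queue()          # stands in for the kernel message buffer

def sender(msg):
    time.sleep(0.1)              # simulate work before the message arrives
    channel.put(msg)             # copy the message into the "kernel buffer"

threading.Thread(target=sender, args=("m1",)).start()

# Nonblocking RECEIVE: poll, overlapping computation with communication.
while True:
    try:
        msg = channel.get_nowait()   # returns control immediately
        break
    except queue.Empty:
        pass                         # do useful computation here instead
print("nonblocking receive got:", msg)

threading.Thread(target=sender, args=("m2",)).start()

# Blocking RECEIVE: the caller is suspended until a message is available.
msg = channel.get()
print("blocking receive got:", msg)
```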
  - RPCs
    - On invoking an RPC, the calling process (the client) is suspended, and the parameters, if any, are passed to the remote machine, where the procedure executes
    - On completion of the procedure's execution, the results are passed back from the server to the client, and the client resumes as if it had called a local procedure
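
The suspend-and-resume behavior of an RPC can be illustrated with Python's standard xmlrpc modules; the procedure add and port 8000 are invented for the sketch:

```python
# server.py: registers a procedure that remote clients may invoke
from xmlrpc.server import SimpleXMLRPCServer

def add(x, y):
    return x + y                 # executes on the server's machine

server = SimpleXMLRPCServer(("localhost", 8000))
server.register_function(add, "add")
server.serve_forever()
```

```python
# client.py: the call looks local, but the parameters are marshalled,
# sent to the server, and the client is suspended until the result returns
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))           # prints 5 once the RPC completes
```

From the client's point of view, proxy.add(2, 3) is indistinguishable from a local call except for the marshalling and the network round trip.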

Theoretical Foundations for Distributed Systems
- Limitations of a Distributed System
  - Absence of a global clock.
    - The absence of perfectly synchronized clocks and of global time makes it difficult to determine the order in which events on different machines occurred without algorithms to help us.
    - Many algorithms make it possible, under certain conditions, to ascertain the order in which two events occurred.
  - Absence of shared memory.
- Lamport's Logical Clocks
  - Some definitions (→ is the happened-before relation):
    - a → b if a and b are events in the same process and a occurred before b.
    - a → b if a is the event of sending a message m in one process and b is the event of receipt of the same message m by another process.
    - If a → b and b → c, then a → c (the relation is transitive).
  - Causally related events: past events influence future events.
    - Event a causally affects event b if a → b.
  - Concurrent events: two distinct events a and b are said to be concurrent (a || b) if a ↛ b and b ↛ a.
- Conditions satisfied by the system of clocks, and the implementation rules that guarantee them (both rules are sketched in code after this list):
  - For any two events a and b, if a → b, then C(a) < C(b).
  - For any two events a and b in a process Pi, if a occurs before b, then Ci(a) < Ci(b).
  - If a is the event of sending a message m in process Pi and b is the event of receiving the same message m at process Pj, then Ci(a) < Cj(b).
  - Rule 1: clock Ci is incremented between any two successive events in process Pi:
    - Ci := Ci + d (d > 0)
    - If a and b are two successive events in Pi and a → b, then Ci(b) := Ci(a) + d.
  - Rule 2: if event a is the sending of message m by process Pi, message m is assigned the timestamp tm = Ci(a). On receiving the same message m, process Pj sets Cj to a value greater than or equal to its present value and greater than tm:
    - Cj := max(Cj, tm + d) (d > 0)
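
A minimal sketch of the two rules in Python, with d = 1 (the class and method names are mine, not from the text):

```python
class LamportClock:
    """Scalar logical clock implementing the two rules above, with d = 1."""

    def __init__(self):
        self.c = 0

    def tick(self):
        # Rule 1: Ci := Ci + d between any two successive local events.
        self.c += 1
        return self.c

    def send(self):
        # A send is an event; the returned value is the timestamp tm.
        return self.tick()

    def receive(self, tm):
        # Rule 2: Cj := max(Cj, tm + d), so the new value is no smaller
        # than the present value and strictly greater than tm.
        self.c = max(self.c, tm + 1)
        return self.c
```

For example, if Pi sends at Ci = 4, so tm = 4, and Pj's clock reads 1, then on receipt Pj's clock becomes max(1, 5) = 5.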
  - If a → b, then C(a) < C(b). The converse is not necessarily true if the events occurred in different processes.
  - Consequently, using Lamport's clock timestamps we cannot determine the causal relationship between two events occurring in different processes just by looking at the timestamps of the events.
  - One solution to this limitation is the use of vector clocks, where, on receipt of a message, a process learns about the more recent clock values of the rest of the processes in the system.
- Vector Clocks
  - Each process Pi is equipped with a clock Ci, which is an integer vector of length n, where n is the number of processes.
  - Implementation rules for vector clocks:
    - Rule 1: clock Ci is incremented between any two successive events in process Pi:
      - Ci[i] := Ci[i] + d (d > 0)
    - Rule 2: if event a is the sending of message m by process Pi, then message m is assigned the vector timestamp tm = Ci(a); on receiving the same message m, process Pj updates Cj as follows:
      - ∀k: Cj[k] := max(Cj[k], tm[k])
  - The basic idea behind vector clocks is that, on the receipt of a message, a process learns about the more recent clock values of the rest of the processes in the system.
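
A sketch of the two vector clock rules with d = 1, for a fixed number of processes n (the class and names are mine):

```python
class VectorClock:
    """Vector clock for process `pid` among `n` processes, with d = 1."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):
        # Rule 1: Ci[i] := Ci[i] + d between successive local events.
        self.v[self.pid] += 1

    def send(self):
        # A send is an event; the message carries a copy of the vector as tm.
        self.tick()
        return list(self.v)

    def receive(self, tm):
        # Rule 2: for all k, Cj[k] := max(Cj[k], tm[k]). (Some formulations
        # also increment the receiver's own entry to count the receive event.)
        self.v = [max(a, b) for a, b in zip(self.v, tm)]
```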
- Causally Related Events
  - Events a and b are causally related if ta < tb or tb < ta, where < is the component-wise comparison of vector timestamps. Otherwise, the events are concurrent.
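
Here ta < tb means every component of ta is at most the corresponding component of tb and the two vectors differ. A small sketch of the test (helper names are mine):

```python
def vector_less(ta, tb):
    # ta < tb iff ta <= tb component-wise and ta != tb.
    return all(a <= b for a, b in zip(ta, tb)) and ta != tb

def concurrent(ta, tb):
    # Neither timestamp dominates the other, so neither event
    # could have causally affected the other.
    return not vector_less(ta, tb) and not vector_less(tb, ta)

assert concurrent([2, 0, 0], [0, 1, 0])     # independent events
assert vector_less([1, 0, 0], [1, 1, 0])    # causally related events
```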
- Causal Ordering of Messages
  - If Send(M1) → Send(M2), then every recipient of both messages M1 and M2 must receive M1 before M2.
  - Each and every update must be checked to ensure that it does not violate the consistency constraint.
  - Protocols
    - The basic idea is to deliver a message to a process only if the message immediately preceding it has already been delivered to the process.
    - Otherwise, the message is not delivered immediately but is buffered until the message immediately preceding it has been delivered.
  - Birman-Schiper-Stephenson Protocol
    - Before broadcasting a message m, a process Pi increments the vector clock entry VT_Pi[i] and timestamps m; VT_Pi[i] − 1 therefore indicates how many messages from Pi precede m.
    - A process Pj ≠ Pi, upon receiving message m timestamped VT_m from Pi, delays its delivery until both of the following conditions are satisfied (the test is sketched in code after this list):
      - VT_Pj[i] = VT_m[i] − 1 (ensures that Pj has received all the messages from Pi that precede m)
      - VT_Pj[k] ≥ VT_m[k] for all k ∈ {1, 2, …, n} − {i} (ensures that Pj has received all those messages received by Pi before it sent m)
      Here n is the total number of processes, and FIFO queuing is used for messages.
    - When a message is delivered at a process Pj, VT_Pj is updated according to vector clock rule 2.
  - Schiper-Eggli-Sandoz Protocol
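
A sketch of the Birman-Schiper-Stephenson delivery test above, assuming 0-based process indices and plain Python lists for the vector clocks (function names are mine):

```python
def can_deliver(vt_pj, vt_m, i):
    """May Pj (local clock vt_pj) deliver a broadcast from Pi timestamped vt_m?"""
    # Condition 1: Pj has already delivered the vt_m[i] - 1 earlier
    # messages broadcast by Pi.
    if vt_pj[i] != vt_m[i] - 1:
        return False
    # Condition 2: Pj has received everything Pi had received before
    # sending m (all components except i).
    return all(vt_pj[k] >= vt_m[k] for k in range(len(vt_m)) if k != i)

def on_deliver(vt_pj, vt_m):
    # On delivery, update Pj's clock by vector clock rule 2.
    return [max(a, b) for a, b in zip(vt_pj, vt_m)]
```

A message that fails the test stays buffered and is retested as later deliveries advance vt_pj.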
- Local State
  - LSi denotes the local state of site Si.
  - send(mij) denotes the send event of a message mij by site Si to site Sj.
  - rec(mij) denotes the receive event of the message mij by site Sj.
  - time(x) denotes the time at which state x was recorded, and time(send(m)) denotes the time at which event send(m) occurred.
  - send(mij) ∈ LSi iff time(send(mij)) < time(LSi)
  - rec(mij) ∈ LSj iff time(rec(mij)) < time(LSj)
  - transit(LSi, LSj) = { mij | send(mij) ∈ LSi ∧ rec(mij) ∉ LSj } (messages recorded as sent but not yet as received)
  - inconsistent(LSi, LSj) = { mij | send(mij) ∉ LSi ∧ rec(mij) ∈ LSj } (messages recorded as received but not as sent)
- Global State
  - A global state, GS, of a system is a collection of the local states of its sites: GS = { LS1, LS2, …, LSn }, where n is the number of sites in the system.
  - A global state GS = { LS1, LS2, …, LSn } is consistent iff ∀i, j : 1 ≤ i, j ≤ n :: inconsistent(LSi, LSj) = ∅
  - A global state GS is transitless iff ∀i, j : 1 ≤ i, j ≤ n :: transit(LSi, LSj) = ∅
  - A strongly consistent global state is one that is both consistent and transitless (a small check is sketched after this list).
  - Chandy-Lamport's Global State Recording Algorithm.
  - Termination Detection Algorithms.
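
These definitions translate directly into a set check. In the sketch below, which is my own modeling rather than anything from the text, each recorded local state is a pair of sets of message ids, and a message id is a tuple (src, dst, seq):

```python
def inconsistent(ls, i, j):
    # Messages from Si to Sj that Sj recorded as received
    # but Si never recorded as sent.
    recv_from_i = {m for m in ls[j]["received"] if m[0] == i}
    return recv_from_i - ls[i]["sent"]

def transit(ls, i, j):
    # Messages from Si to Sj recorded as sent but not yet as received.
    sent_to_j = {m for m in ls[i]["sent"] if m[1] == j}
    return sent_to_j - ls[j]["received"]

def is_consistent(ls):
    n = len(ls)
    return all(not inconsistent(ls, i, j) for i in range(n) for j in range(n))

def is_strongly_consistent(ls):
    n = len(ls)
    return is_consistent(ls) and all(
        not transit(ls, i, j) for i in range(n) for j in range(n)
    )

# S0 has sent message (0, 1, 1) to S1, which has not yet received it:
ls = [
    {"sent": {(0, 1, 1)}, "received": set()},
    {"sent": set(), "received": set()},
]
assert is_consistent(ls)               # no message received before being sent
assert not is_strongly_consistent(ls)  # but one message is still in transit
```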