Mutual Exclusion in Distributed Systems

 Mutual Exclusion for Distributed Systems
 Lack of shared memory (shared variables).
 Must rely on message passing to assure mutual exclusion.
 Mutual exclusion algorithms for distributed systems:
 Non-token based.
 Token based.
 Measuring performance.
 Number of messages necessary per CS (Critical Section) invocation.
 Synchronization Delay: the time required after a site leaves the CS and
before the next site enters the CS.
 Response time: the time interval a request waits for its CS execution to
be over after its request messages have been sent out.
 Throughput: CS execution rate.
 Low and high system load performance rating.
 Best and worst case performance.
 Solving the distributed mutual exclusion problem.
 The control site algorithm (a single control site grants permission to
enter the CS) has these drawbacks:
 Single point of failure.
 Uneven work load.
 High synchronization delay.
 Low system throughput.
 High and uneven network traffic.
 Non-Token-Based Algorithms.
 The concept of information structure forms the basis for unifying
different non-token-based mutual exclusion algorithms.
 This information structure defines the data structure needed at a site to
record the status of other sites.
 The information kept in the data structure is used by a site in making
decisions when invoking mutual exclusion.
 The information structure at site Si consists of the following three sets:
- Request set Ri
- Inform set Ii
- Status set Sti
 A site must obtain permission from all the sites in its request set
before entering CS
 Every site must inform all the sites in its inform set of its status change
when it starts waiting to enter the CS or exits the CS
 The status set contains the ids of sites for which Si maintains status
information
 If $S_i \in I_j$, then $S_j \in St_i$
 A site maintains a variable CSSTAT containing the site’s knowledge of
the status of the CS and a queue containing REQUEST messages in
the order of their timestamps for which no GRANT message has been
sent
 Correctness condition: To guarantee mutual exclusion, the information
structure of sites in the generalized algorithm must satisfy the
following conditions:
 If $\forall i: 1 \le i \le N :: S_i \in I_i$, the following conditions are
necessary and sufficient to guarantee mutual exclusion:
 $\forall i: 1 \le i \le N :: I_i \subseteq R_i$
 $\forall i\, \forall j: 1 \le i, j \le N :: (I_i \cap I_j \ne \emptyset) \lor (S_i \in R_j \land S_j \in R_i)$
(for every two sites, either they request permission from each other or they
request permission from a common site)
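
The two conditions above can be checked mechanically for a given set of information structures. The following is a minimal sketch, assuming sites are numbered 0..N-1 and each Ri and Ii is given as a Python set of site ids; the function name and the list-of-sets representation are illustrative choices, not part of the original algorithm description.

```python
def satisfies_mutex_conditions(request_sets, inform_sets):
    """Check the correctness conditions on R_i and I_i for sites 0..N-1.

    request_sets[i] and inform_sets[i] are sets of site ids
    (a hypothetical representation of R_i and I_i).
    """
    n = len(request_sets)
    # Premise of the theorem: every site is in its own inform set.
    if any(i not in inform_sets[i] for i in range(n)):
        raise ValueError("premise violated: some S_i is not in I_i")
    # Condition 1: I_i is a subset of R_i for every site.
    if any(not inform_sets[i] <= request_sets[i] for i in range(n)):
        return False
    # Condition 2: for every pair of sites, the inform sets share a site,
    # or each site is in the other's request set.
    for i in range(n):
        for j in range(i + 1, n):
            overlap = bool(inform_sets[i] & inform_sets[j])
            mutual = i in request_sets[j] and j in request_sets[i]
            if not (overlap or mutual):
                return False
    return True
```

For example, request sets that contain every site combined with inform sets that contain only the site itself pass the check through the "request permission from each other" branch, while pairwise-intersecting sets used as both Ri and Ii pass through the "common site" branch.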
 Lamport’s algorithm.
 Uses Lamport’s clocks.
 Every site Si keeps a request_queue containing mutual exclusion requests
ordered by their timestamps.
 Every site has a request set: $\forall i: 1 \le i \le N :: R_i = \{S_1, S_2, \ldots, S_N\}$.
 The number of messages per CS invocation is 3(N-1).
 Synchronization delay is T.
 Requesting the CS by site Si:
 Si sends a REQUEST(tsi, i) message to all sites in Ri and places the request
on request_queuei.
 When Sj receives the REQUEST(tsi, i) message, it places the request on
request_queuej and sends a timestamped REPLY message back to Si.
 Entering the CS by Si:
 Si enters the CS when both of the following conditions hold:
 [L1] Si has received a message with a timestamp larger than (tsi, i) from all
other sites (assuring that every site has received and replied to its request).
 [L2] Si’s request is at the top of request_queuei.
 Releasing the CS by Si:
 Removes its request from the top of its request queue and sends a
timestamped RELEASE message to all the sites in its request set.
 When Sj receives the RELEASE message, it removes Si’s request from its
request queue.
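
The request, enter, and release steps above can be sketched as a small single-process simulation. This is only an illustration under stated assumptions: message delivery is modeled as an immediate method call (so channels are reliable and FIFO), and the class name `LamportSite`, its methods, and the class-wide `messages` counter are hypothetical names introduced here.

```python
import heapq

class LamportSite:
    """One site; sending a message is modeled as a direct method call."""
    messages = 0  # counter for all messages sent in the simulation

    def __init__(self, sid, n):
        self.sid, self.clock = sid, 0
        self.queue = []            # request queue ordered by (timestamp, site id)
        self.last_seen = [0] * n   # latest timestamp received from each site
        self.my_request = None

    def _tick(self, received=0):
        # Lamport clock rule: advance past any timestamp just received.
        self.clock = max(self.clock, received) + 1
        return self.clock

    def request(self, sites):
        self.my_request = (self._tick(), self.sid)
        heapq.heappush(self.queue, self.my_request)
        for s in sites:
            if s is not self:
                LamportSite.messages += 2   # one REQUEST and one REPLY
                reply_ts = s.on_request(self.my_request)
                self.last_seen[s.sid] = reply_ts
                self._tick(reply_ts)

    def on_request(self, req):
        self._tick(req[0])
        self.last_seen[req[1]] = req[0]
        heapq.heappush(self.queue, req)
        return self._tick()                 # timestamp carried by the REPLY

    def can_enter(self):
        if self.my_request is None:
            return False
        # [L1] a later-stamped message has arrived from every other site
        l1 = all((ts, j) > self.my_request
                 for j, ts in enumerate(self.last_seen) if j != self.sid)
        # [L2] own request is at the head of the local queue
        return l1 and self.queue[0] == self.my_request

    def release(self, sites):
        heapq.heappop(self.queue)  # own request is at the head while in the CS
        self.my_request = None
        ts = self._tick()
        for s in sites:
            if s is not self:
                LamportSite.messages += 1   # RELEASE
                s.on_release(self.sid, ts)

    def on_release(self, sid, ts):
        self._tick(ts)
        self.last_seen[sid] = ts
        self.queue = [r for r in self.queue if r[1] != sid]
        heapq.heapify(self.queue)
```

With three sites, if site 0 requests and then site 1 requests, `can_enter()` is true for site 0 and false for site 1 until site 0 calls `release`. Each completed CS execution in this sketch accounts for 3(N-1) messages.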
 The proof of the theorem that Lamport’s algorithm achieves mutual
exclusion can be done by contradiction.
 Performance:
 Requires 3(N-1) messages, where N is the number of sites in the request
set.
 Has a synchronization delay of T, where T is the average message delay.
 Ricart-Agrawala Algorithm.
 An improvement on Lamport’s algorithm that requires fewer messages.
 Requesting the CS by Si:
 Sends timestamped REQUEST messages to all sites in its request set.
 Receiving site Sj sends a REPLY message to Si if it is neither requesting
nor executing the CS, or if it is requesting and its own request has a higher
timestamp than Si’s request. Otherwise, Sj defers the REPLY.
 Executing the CS by Si:
 Enters the CS after it has received REPLY messages from all the sites in
its request set.
 Releasing the CS by Si:
 Sends REPLY messages to all the sites whose requests it deferred.
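
The reply-or-defer rule is the heart of the algorithm and can be sketched in a few lines. The following is an illustrative sketch only: messages are modeled as direct method calls, timestamps are (clock, site id) pairs so that ties are broken by site id, and the class `RASite` and its attribute names are assumptions made for this example.

```python
class RASite:
    def __init__(self, sid):
        self.sid = sid
        self.clock = 0
        self.requesting = False
        self.in_cs = False
        self.request_ts = None
        self.deferred = []   # sites whose REPLY is being withheld
        self.replies = 0
        self.pending = 0

    def request(self, others):
        # Timestamp the request and send REQUEST to every other site.
        self.clock += 1
        self.requesting = True
        self.request_ts = (self.clock, self.sid)
        self.replies, self.pending = 0, len(others)
        for s in others:
            s.on_request(self.request_ts, self)

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts[0]) + 1
        # Defer if executing the CS, or if our own pending request is older.
        if self.in_cs or (self.requesting and self.request_ts < ts):
            self.deferred.append(sender)
        else:
            sender.on_reply()

    def on_reply(self):
        self.replies += 1
        if self.replies == self.pending:
            self.in_cs = True   # all REPLY messages received

    def release(self):
        self.in_cs = False
        self.requesting = False
        waiting, self.deferred = self.deferred, []
        for s in waiting:
            s.on_reply()        # the deferred REPLY doubles as the release
```

For instance, if site a is in the CS when site b requests, b's request is deferred at a, and b enters only after `a.release()` delivers the withheld REPLY.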
 The proof of the theorem that Ricart-Agrawala algorithm achieves
mutual exclusion can also be done by contradiction.
 Performance:
 Requires 2(N-1) messages, where N is the number of sites in the request
set.
 Has a synchronization delay of T, where T is the average message delay.
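
To make the saving over Lamport’s algorithm concrete, the two message counts can be compared for a few system sizes. This is a small arithmetic sketch; the function names are hypothetical.

```python
def lamport_messages(n: int) -> int:
    """REQUEST + REPLY + RELEASE to each of the other n - 1 sites."""
    return 3 * (n - 1)

def ricart_agrawala_messages(n: int) -> int:
    """REQUEST + REPLY to each of the other n - 1 sites."""
    return 2 * (n - 1)

for n in (3, 10, 100):
    print(n, lamport_messages(n), ricart_agrawala_messages(n))
```

The saving of N-1 messages comes from dropping the separate RELEASE message: a deferred REPLY carries the same information.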
 Maekawa’s (square root) Algorithm.
 Radically different approach to distributed mutual exclusion.
 A site does not request permission from every other site, only a subset of
the sites.
 A site can have only one outstanding REPLY message at any time and
therefore it grants permission to an incoming request if it has not granted
permission to some other site.
 The construction of request sets:
- $\forall i\, \forall j: i \ne j,\ 1 \le i, j \le N :: R_i \cap R_j \ne \emptyset$
- $\forall i: 1 \le i \le N :: S_i \in R_i$
- $\forall i: 1 \le i \le N :: |R_i| = K$
- Any site $S_j$ is contained in K of the $R_i$’s, where $K = |R_i| = \sqrt{N}$.
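
One simple way to obtain request sets that intersect pairwise is to lay the sites out on a square grid and let each request set be the site's row plus its column. This is not Maekawa’s own construction, which achieves sets of size about √N; the grid variant sketched below yields sets of size 2√N - 1 and is shown only because it is easy to verify. The function name is hypothetical, and the sketch assumes N is a perfect square.

```python
import math

def grid_request_sets(n):
    """Return request sets for sites 0..n-1 arranged in a k x k grid."""
    k = math.isqrt(n)
    if k * k != n:
        raise ValueError("this sketch assumes N is a perfect square")
    sets = []
    for i in range(n):
        r, c = divmod(i, k)
        row = {r * k + x for x in range(k)}   # all sites in row r
        col = {x * k + c for x in range(k)}   # all sites in column c
        sets.append(row | col)
    return sets
```

Any two sets intersect because the row of one site always crosses the column of the other. Every site belongs to its own set, all sets have the same size, and each site appears in the same number of sets, mirroring the four conditions above with K = 2√N - 1.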
 Requesting the CS by Si:
 Sends REQUEST(i) messages to all sites in its request set Ri.
 Receiving site Sj sends a REPLY(j) message to Si if it hasn’t sent a
REPLY message to a site from the time it received the last RELEASE
message. Otherwise, it queues up the REQUEST for later consideration.
 Executing the CS by Si:
 Enters the CS after it has received REPLY messages from all the sites in
its request set.
 Releasing the CS by Si:
 Sends RELEASE(i) message to all the sites in Ri.
 When a site receives a RELEASE(i) message, it sends a REPLY message
to the next site waiting in the queue.
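
The behaviour of a receiving site, which grants at most one permission at a time, can be sketched as a small arbiter object. This is a minimal sketch that assumes a FIFO waiting queue; the class name `Arbiter` and the convention of returning the id of the site that should receive a REPLY (or `None`) are illustrative choices.

```python
from collections import deque

class Arbiter:
    """Permission-granting side of a site in a request set."""

    def __init__(self):
        self.granted_to = None    # site currently holding this site's REPLY
        self.waiting = deque()    # queued REQUESTs, oldest first

    def on_request(self, i):
        # Grant if no REPLY is outstanding; otherwise queue the REQUEST.
        if self.granted_to is None:
            self.granted_to = i
            return i              # send REPLY to site i
        self.waiting.append(i)
        return None

    def on_release(self, i):
        # Pass the permission to the next waiting site, if any.
        if self.granted_to != i:
            raise ValueError("RELEASE from a site that holds no permission")
        self.granted_to = self.waiting.popleft() if self.waiting else None
        return self.granted_to    # send REPLY to this site (or nobody)
```

A requesting site enters the CS only once every arbiter in its request set has returned its id, which is why two sites with overlapping request sets can never be in the CS together.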
 The proof of the theorem that Maekawa’s algorithm achieves mutual
exclusion is again by contradiction.
 Performance:
 Requires 3N messages per CS execution, where N is the number of sites
in the request set.
 Has a synchronization delay of 2T, where T is the average message delay.
 Prone to dead-lock.
 Dead-lock can be expected because a site can be exclusively locked by
other sites and requests are not prioritized by their timestamps.
 Token-Based Algorithms.
 Unlike non-token-based algorithms, token-based algorithms are free
from starvation and deadlock.
 Suzuki-Kasami algorithm.
 Only the site holding the token may enter the CS; the token is passed among
the sites.
 A site that does not hold the token and wants to enter the CS broadcasts a
REQUEST message for the token to all the other sites.
 Outdated REQUEST(j,n) messages are distinguished from current ones by
the array RNi[1..N], where RNi[j] is the largest sequence number received so
far in a REQUEST message from Sj. When site Si receives a REQUEST(j,n)
message, it sets RNi[j] := max(RNi[j], n).
 The token consists of a queue of requesting sites, Q, and an array of integers
LN[1..N], where LN[j] is the sequence number of the request that site Sj
executed most recently. After executing its CS, Si sets LN[i] := RNi[i] to
indicate that its request has been executed.
 Requesting the CS by Si when it does not have the token:
 Increments RNi[i] and sends REQUEST(i,sn) messages to all other sites, where
sn = RNi[i].
 Receiving site Sj sets RNj[i] = max(RNj[i], sn). If Sj holds the token and is
not using it, it sends the token to Si if RNj[i] = LN[i] + 1.
 Executing the CS by Si:
 When it possesses the token.
 Releasing the CS by Si:
 Sets LN[i] = RNi[i]
 For every Sj whose ID is not in the token queue, it appends its ID to the
token queue if RNi[j] = LN[j] + 1
 If token queue is non empty after the above update, it deletes the top site
ID from the queue and sends the token to the site indicated by the ID.
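
The bookkeeping with RN, LN, and the token queue can be sketched as follows. This is an illustrative single-process sketch in which messages are direct method calls and sites are numbered from 0; the class names `Token` and `SKSite` and their method names are assumptions made for this example.

```python
from collections import deque

class Token:
    def __init__(self, n):
        self.queue = deque()   # Q: ids of sites waiting for the token
        self.ln = [0] * n      # LN[j]: last executed request number of Sj

class SKSite:
    def __init__(self, sid, n):
        self.sid = sid
        self.rn = [0] * n      # RN[j]: highest request number seen from Sj
        self.token = None
        self.in_cs = False

    def request(self, sites):
        if self.token is not None:   # already holds the idle token
            self.in_cs = True
            return
        self.rn[self.sid] += 1
        for s in sites:
            if s is not self:
                s.on_request(self.sid, self.rn[self.sid], sites)

    def on_request(self, i, sn, sites):
        self.rn[i] = max(self.rn[i], sn)
        # Hand over an idle token only for a current (not outdated) request.
        if (self.token is not None and not self.in_cs
                and self.rn[i] == self.token.ln[i] + 1):
            token, self.token = self.token, None
            sites[i].on_token(token)

    def on_token(self, token):
        self.token = token
        self.in_cs = True

    def release(self, sites):
        t = self.token
        self.in_cs = False
        t.ln[self.sid] = self.rn[self.sid]
        # Queue every site with an outstanding request not yet in Q.
        for j in range(len(self.rn)):
            if j not in t.queue and self.rn[j] == t.ln[j] + 1:
                t.queue.append(j)
        if t.queue:
            nxt = t.queue.popleft()
            self.token = None
            sites[nxt].on_token(t)
```

If site 0 starts with an idle token and site 1 requests, the token moves to site 1 at once. A request from site 2 that arrives while site 1 is in the CS is recorded in RN and served when site 1 releases.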
 Performance:
 Very simple yet very efficient.
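 Requires only 0 or N messages per CS execution, where N is the number
of sites: 0 if the requesting site already holds the idle token, otherwise N-1
REQUEST messages plus 1 token message.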
 Has a synchronization delay of 0 or T, where T is the average message
delay.
 Raymond’s tree-based algorithm.
 Sites are logically arranged as a directed tree.
 Edges of the tree are assigned directions toward the site (root of the tree) that
has the token.
 Every site has a local variable holder that points to an immediate neighbor
node on a directed path to the root node.
 Every site keeps a FIFO queue, called request_q which stores the requests of
those neighboring sites that have sent a request to this site, but have not yet
been sent the token.
 Requesting the CS:
 When a site wants to enter CS, it sends a REQUEST message to the node
along the directed path to the root, provided it does not hold the token and
its request_q is empty. It then adds its request to its request_q.
 When a site on the path receives this message, it places the REQUEST in
its request_q and sends a REQUEST message along the directed path to
the root provided it has not sent out a REQUEST message on its outgoing
edge for a previously received REQUEST on its request_q.
 When the root site receives a REQUEST message, it sends the token to the
site from which it received the REQUEST message and sets its holder
variable to point at that site.
 When a site receives the token, it deletes the top entry from its request_q,
sends the token to the site indicated in this entry, and sets its holder
variable to point at that site. If the request_q is nonempty at this point,
then the site sends a REQUEST message to the site which is pointed at by
the holder variable.
 Executing the critical section:
 A site enters the CS when it receives the token and its own entry is at the
top of its request_q. In this case, the site deletes the top entry from its
request_q and enters the CS.
 Releasing the critical section:
 If its request_q is nonempty, then it deletes the top entry from its
request_q, sends the token to that site, and sets its holder variable to point
at that site.
 If the request_q is nonempty at this point, then the site sends a REQUEST
message to the site which is pointed at by the holder variable.
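
The request, token-passing, and release rules above can be sketched with one small class per site. This is a minimal sketch, assuming messages are delivered as direct method calls; the class name `RaymondNode`, the helper methods, and the `asked` flag (which records that a REQUEST has already been sent toward the holder) are names introduced for this example.

```python
from collections import deque

class RaymondNode:
    def __init__(self, name):
        self.name = name
        self.holder = self        # holder is self when this site has the token
        self.request_q = deque()  # FIFO of requesters (neighbors or self)
        self.asked = False        # REQUEST already sent toward the holder
        self.in_cs = False

    def _assign(self):
        # With an idle token and a waiting entry, serve the head of request_q.
        if self.holder is self and not self.in_cs and self.request_q:
            head = self.request_q.popleft()
            self.asked = False
            if head is self:
                self.in_cs = True
            else:
                self.holder = head
                head.on_token()   # pass the token one hop

    def _ask(self):
        # Forward at most one REQUEST along the path toward the token.
        if self.holder is not self and self.request_q and not self.asked:
            self.asked = True
            self.holder.on_request(self)

    def want_cs(self):
        self.request_q.append(self)
        self._assign()
        self._ask()

    def on_request(self, neighbor):
        self.request_q.append(neighbor)
        self._assign()
        self._ask()

    def on_token(self):
        self.holder = self
        self._assign()
        self._ask()

    def release(self):
        self.in_cs = False
        self._assign()
        self._ask()
```

As the token travels, each `holder` pointer on the path is reversed, so the tree always stays directed toward the site that currently has the token.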
 Performance:
 Average message complexity is O(log N) because average distance
between any two nodes in a tree with N nodes is O(log N).
 Synchronization delay is (T log N)/2 because the average distance between
two sites to successively execute the CS is (log N)/2.
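 Greedy strategy: a site that receives the token may execute its own CS first,
even if its own request is not at the top of its request_q.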