HOMEWORK 3 (Due Date: March 27, 2003) With Possible Solutions

(Some of the questions may have more than one possible solution. I have asked the TA to give due credit to other reasonable solutions.)

1. Raymond’s tree-based algorithm for distributed mutual exclusion does not try to order critical section requests based on time. So, the algorithm may be unfair. A counter-argument to this statement of unfairness is that the hierarchical structure of the tree built by the algorithm takes care of request ordering in an implicit way. Explain whether Raymond’s distributed mutual exclusion algorithm is fair. Note: if you say the algorithm is fair, you should explain the concept that ensures ALL the requests will be fairly treated. If you say the algorithm is unfair, you should give an example where a request or a set of requests will be treated unfairly.

Answer: It is possible that Raymond’s tree treats some requests in an unfair manner. This happens when the tree is unbalanced. E.g., assume that the tree has 10 levels on the left-hand side and only 1 level on the right-hand side. A request from a site on the right-hand side has fewer hops to travel, so it can reach the token holder earlier than a request made by a site (say, at the 10th level) on the left-hand side, even though the two requests might have been made at the same time.

2. Consider the requesting phase of Singhal’s heuristic algorithm for distributed mutual exclusion. In step 2(a), the status of the requesting site is set to R by the receivers of the request. Over a period of time, as all sites make requests for access to the critical section, all sites will have the status of all others as R. Now, by step 1(b), a critical section access request will be sent to all sites (since all sites are in the R state). The above reasoning implies that Singhal’s algorithm will be broadcasting all critical section requests after a period of time. Explain whether this is true and, if so, under what conditions.

Answer: No, the algorithm does not broadcast requests at any stage. When a token holder, say site j, forwards the token, it sets the TSV[j] element carried in the token to “N”. Hence, the receiver of the token will set its status entry SV[j] to N. Each site will therefore have at least one site with status marked N at any point in time. This implies that no broadcasting takes place.

3. Consider the diffusion-based algorithm for detecting deadlocks under the OR-request model. Modify the algorithm so that the initiator detecting a deadlock gets the list of processes that are involved in the deadlock.

Answer: When a process initiates a reply message, it inserts its id into the message. Other processes forwarding the reply message append their ids to this list. Hence, the initiator of the query will learn the list of processes involved in the deadlock.

4. A student came to my office with the following space-time diagram for Lamport’s distributed mutual exclusion algorithm. The student argued that Site S2 can enter the Critical Section (CS) at Tx (instead of Ty), even though:
i. S2 has not received a REPLY message from S1 at Tx.
ii. S2’s request is not at the top of S1’s request_queue at Tx.
Give me precise points to counter or accept the student’s argument. (If the answer is “Yes”, you should say why there would not be a problem in entering the CS without S1’s REPLY message and without S2’s request being at the top of S1’s queue. If the answer is “No”, you should explain which aspect of Lamport’s algorithm will prevent S2’s entry into the CS at Tx.) What would be your arguments if the system follows Ricart-Agrawala’s algorithm instead of Lamport’s?

[Space-time diagram not reproduced. Labels: sites S1, S2, S3; timestamps (5, 1) and (3, 2); time points Tx and Ty.]

Answer: Yes, S2 can enter the CS at Tx, since:
a) S2 has received a message (a REQUEST message in this case) with a higher timestamp from S1 before Tx.
b) S2’s request need not be at the top of S1’s queue for S2 to enter the CS.
The algorithm requires only that S2’s request be at the top of S2’s own queue, not at the top of other sites’ queues. With the Ricart-Agrawala algorithm, S2 can enter the CS only at Ty, since that algorithm requires a REPLY message to be received from all other sites.

5. Singhal’s heuristic algorithm can potentially run into starvation problems at the start of the distributed system, depending on how critical section requests are made initially. For instance, consider the following example with 4 sites S1, S2, S3, and S4. According to the initialization rules, S1 is designated as the token holder. Initially, let S4 make the first request after the start of the distributed system. This request is sent to S1 (the token holder) and S1 passes the token to S4. Now, S3 has a request. However, due to the way initialization is done, S3 has the status of S4 as N (None of the R, H, E states) and the status of S1 and S2 as R. So, S3 sends the request to S1 and S2, and not to S4. In the above case, since S4 does not get S3’s request, S3 will not get the token. This type of starvation can end only when S1 has a request and gets the token back from S4. One possible reason for this type of initial starvation is that Singhal’s algorithm does not use broadcasts, and hence the fact that S4 got the token was not known to S2 and S3. Explain whether the above arguments are correct. If this type of initial starvation does indeed exist, suggest the steps that can be added to Singhal’s algorithm to overcome the problem.

Answer: The key point to remember is that any algorithm has to be shown to be free of two problems: starvation and deadlock. (If a problem such as the possibility of deadlock does exist, the algorithm, e.g., Maekawa’s, should provide mechanisms to overcome it.) Here, the above arguments are NOT correct. Reason: Initially, S4 will have the status of S3 as R. So S4 will send its request to S3 as well. On receiving S4’s request, S3 will set the status of S4 to R.
So, when S3 makes its own request, it will send the request to S4, and S4 will send the token back to S3. No starvation exists.

6. Consider the following Wait-For Graph (WFG) in a distributed system following the AND resource request model. Let us assume that the edge-chasing algorithm is initiated by P5 and P2. Both P5 and P2 detect cycles, and hence detect 2 deadlocks in the system. Now, P5 identifies that rolling back or terminating P3 (i.e., making P3 release the resources involved in the deadlock) will help in resolving the deadlock (on the left-hand side of the figure). Incidentally, rolling back or terminating P3 also helps in resolving the other deadlock (on the right-hand side of the figure). However, P2, being unaware of this fact, wants to roll back or terminate P1 to resolve the deadlock (on the right-hand side of the figure). Note: you can assume all processes have the same priority. As you can see, 2 rollbacks or terminations are not needed in the above case. Can you suggest some ways to avoid the extra rollback or termination initiated by P2? Try to make your suggestion in such a way that it can work in a general scenario as well, i.e., do not make your suggestion very specific to the given example WFG. Also, your suggestion need NOT have any relation to the edge-chasing algorithm.

[Wait-for graph figure not reproduced. Nodes: P5, P1, P2, P4, P3.]

Answer: The main issue is to identify the processes that are common to the overlapping deadlocks. This can be done by the initiators of the deadlock detection algorithm, by exchanging their WFGs and determining the overlapping processes. Next question: how do the initiators identify each other? One possibility is that if a process (e.g., P1 or P3 in the above figure) receives a deadlock detection request/probe from more than one initiator, the process can inform the initiators that multiple detections are being carried out. E.g., P1 or P3 can inform P5 and P2 about the 2 simultaneous detections being carried out. So P2 and P5 get to know each other as “initiators”.
After the 2 deadlocks are detected, P2 and P5 can exchange the detected WFGs, identify the overlapping processes, and reach an agreement on the process to be terminated.
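
The agreement step in the answer to question 6 can be sketched in Python as follows. This is only one possible illustration, not part of the edge-chasing algorithm: the function name choose_victims is hypothetical, and because the figure is not reproduced here, the two cycle memberships used in the example are assumed for illustration. The sketch uses a simple greedy rule: repeatedly pick the process that appears in the most unresolved cycles, breaking ties by name so that every initiator running the same rule on the same exchanged cycles reaches the same decision.

```python
from collections import Counter


def choose_victims(cycles):
    """Greedily pick processes to roll back so that every cycle is broken.

    cycles: iterable of detected deadlock cycles, each a collection of
    process names. Returns the chosen victims in the order they were picked.
    """
    # Ignore empty cycles: there is nothing to break in them.
    remaining = [set(c) for c in cycles if c]
    victims = []
    while remaining:
        # Count in how many unresolved cycles each process appears.
        counts = Counter(p for cycle in remaining for p in cycle)
        # Prefer the process in the most cycles; break ties by name so the
        # choice is deterministic across initiators.
        victim = min(counts, key=lambda p: (-counts[p], p))
        victims.append(victim)
        # Every cycle containing the victim is resolved by its rollback.
        remaining = [c for c in remaining if victim not in c]
    return victims


# Assumed cycle memberships (the actual edges of the figure are not known).
cycle_seen_by_p5 = ["P5", "P4", "P3"]
cycle_seen_by_p2 = ["P2", "P1", "P3"]
print(choose_victims([cycle_seen_by_p5, cycle_seen_by_p2]))
```

With these assumed cycles, P3 is the only process common to both, so a single rollback resolves both deadlocks. The greedy rule does not guarantee the smallest possible victim set in general, but it handles any number of exchanged cycles and needs no priorities, which matches the assumption that all processes have the same priority.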