Monitors - the GMU ECE Department

Deadlock Detection in Distributed Systems
 Deadlock Model for Distributed Systems
 Systems have only reusable resources.
 Processes are allowed only exclusive access to resources.
 There is only one copy of each resource.
 Simplified Deadlock model
 Since we have only reusable resources in this model and it is a single unit
resource model, a cycle is the necessary and sufficient (iff) condition for
deadlock.
 Wait-For-Graph (WFG)
 Graph edges are requests for resources and not data or control flow
 Any WFG reduction will have the same result as any other.
 A knot in the resource allocation graph indicates deadlock.
 A knot in a graph is a nonempty set of nodes such that for every node in the
set, all nodes in the set and only nodes in the set are reachable from
that node.
 There are no terminating sink nodes in the graph or sub-graph reachable from
the given node
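The cycle and knot conditions above can be checked mechanically. The following is a minimal sketch, assuming the WFG is held in memory as a dictionary that maps each process to the list of processes it waits for; the names `has_cycle`, `reachable`, and `is_knot` are illustrative and do not come from the slides.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # process -> processes it is waiting for


def has_cycle(wfg: Graph) -> bool:
    """Depth-first search with three colors; reaching a gray node means a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in wfg}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for nxt in wfg.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                return True          # back edge: nxt is on the current path
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in wfg)


def reachable(wfg: Graph, start: str) -> Set[str]:
    """Nodes reachable from start by following at least one edge."""
    seen: Set[str] = set()
    stack = list(wfg.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(wfg.get(node, []))
    return seen


def is_knot(wfg: Graph, nodes: Set[str]) -> bool:
    """A knot: from every member, exactly the members are reachable."""
    return bool(nodes) and all(reachable(wfg, n) == nodes for n in nodes)


# P1 -> P2 -> P3 -> P1 form a cycle; P4 waits on P1 but is not part of the knot.
wfg = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"], "P4": ["P1"]}
```

With this graph, `has_cycle(wfg)` is true, `{"P1", "P2", "P3"}` is a knot, and the set that also includes `P4` is not, because `P4` is not reachable from the other three.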
 Deadlock Handling in Distributed Systems.
 More complicated in distributed systems.
 Lack of accurate knowledge of the current state of the system by any one
site.
 Intersite communication involves unpredictable delays.
 Deadlock Prevention.
 Process acquires all needed resources simultaneously before execution.
 Preemption of process that holds the needed resource.
 Decreases system concurrency.
 Deadlock in resource acquiring phase.
- This problem can be solved by forcing processes to acquire the needed resources one by one.
This will further reduce efficiency and concurrency.
 Deadlock avoidance.
 Resources are granted to processes in various sites if the resulting global
system state is safe.
 This is not practical in distributed systems.
 It is impractical for every site to maintain information on the global state
of the system due to extensive need for communication between sites
and the delays and inaccuracies involved in such extensive
communication.
 The larger the system, the higher the cost of computing the global
state of the system.
 Deadlock detection.
 Looking for the cycle which is the necessary and sufficient condition for
deadlock in the distributed model.
 After detecting the cycle, it must be broken to terminate deadlock.
 This is the most common and practical way of dealing with the problem of
deadlock in distributed systems.
 Control Organizations for Distributed Deadlock Detection.
 Centralized control.
 A designated site known as the control site is responsible for constructing and
maintaining the global WFG and searching it for cycles.
 Single point of failure.
 High message traffic to and from control site.
 Message traffic is independent of the rate of deadlock formation!
 All sites request and release resources including local resources, by sending
request resource and release resource messages to the control site.
 Control site updates its WFG after receiving messages from other sites and
checks for deadlock.
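The request/release protocol above can be sketched as a small in-memory model of the control site. This is only an illustration under simplifying assumptions: message passing is replaced by direct method calls, each resource has a single unit, and a blocked process is recorded as waiting only for the current holder. The class `ControlSite` and its method names are hypothetical.

```python
from collections import defaultdict, deque


class ControlSite:
    """Toy control site: keeps the global WFG up to date and checks for cycles."""

    def __init__(self):
        self.holder = {}                   # resource -> process holding it
        self.waiters = defaultdict(deque)  # resource -> queue of blocked processes

    def request(self, process, resource):
        """Handle a request message; return True if it creates a deadlock."""
        if resource not in self.holder:
            self.holder[resource] = process   # resource is free: grant it
            return False
        self.waiters[resource].append(process)
        return self.deadlocked(process)

    def release(self, process, resource):
        """Handle a release message; hand the resource to the next waiter."""
        if self.holder.get(resource) != process:
            return
        if self.waiters[resource]:
            self.holder[resource] = self.waiters[resource].popleft()
        else:
            del self.holder[resource]

    def wait_for_graph(self):
        """Build the WFG: an edge p -> q means p waits for a resource q holds."""
        wfg = defaultdict(set)
        for resource, queue in self.waiters.items():
            for p in queue:
                wfg[p].add(self.holder[resource])
        return wfg

    def deadlocked(self, start):
        """True if `start` can reach itself by following WFG edges."""
        wfg = self.wait_for_graph()
        stack, seen = list(wfg[start]), set()
        while stack:
            p = stack.pop()
            if p == start:
                return True
            if p not in seen:
                seen.add(p)
                stack.extend(wfg.get(p, ()))
        return False


site = ControlSite()
results = [
    site.request("P1", "R1"),  # granted
    site.request("P2", "R2"),  # granted
    site.request("P1", "R2"),  # P1 waits for P2
    site.request("P2", "R1"),  # P2 waits for P1: cycle
]
```

Every request and release passes through the one `ControlSite` object, which is why message traffic does not depend on how often deadlocks actually form.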
 The Ho-Ramamoorthy Algorithm.
 The two-phase algorithm.
 Every site maintains a status table containing the status of all the processes
initiated at that site.
 resources locked.
 resources being waited for.
 Periodically, a designated site requests the status table from all sites,
constructs a WFG from the information received and searches it for cycles.
 If a cycle is detected, the designated site again requests status tables from all
the sites and again constructs a WFG using ONLY those transactions which
are common to both reports to see if the same cycle is detected again. If it is,
the control site will declare the system to be deadlocked.
 By getting two reports, the designated site reduces the probability of getting
an inconsistent state of the system and reporting false deadlock.
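The two rounds can be sketched as follows. This is a hedged illustration, not the published algorithm's exact data layout: it assumes each round's combined status tables arrive as a set of `(kind, process, resource)` entries with `kind` being `"holds"` or `"waits"`, and that "common to both reports" means the intersection of those sets. The function names are hypothetical.

```python
def build_wfg(entries):
    """Turn status entries into a WFG: process -> set of processes it waits for."""
    holder = {r: p for kind, p, r in entries if kind == "holds"}
    wfg = {}
    for kind, p, r in entries:
        if kind == "waits" and r in holder:
            wfg.setdefault(p, set()).add(holder[r])
    return wfg


def has_cycle(wfg):
    """Graph reduction: repeatedly drop processes that wait for nobody."""
    graph = {p: set(qs) for p, qs in wfg.items()}
    changed = True
    while changed:
        changed = False
        for p in list(graph):
            graph[p] = {q for q in graph[p] if q in graph}
            if not graph[p]:
                del graph[p]
                changed = True
    return bool(graph)  # anything left over is on, or leads into, a cycle


def two_phase_detect(collect_reports):
    """collect_reports() returns the union of all sites' status entries."""
    first = collect_reports()
    if not has_cycle(build_wfg(first)):
        return False
    second = collect_reports()
    # Second phase: keep only the entries present in both reports.
    return has_cycle(build_wfg(first & second))


deadlock = {
    ("holds", "P1", "R1"), ("holds", "P2", "R2"),
    ("waits", "P1", "R2"), ("waits", "P2", "R1"),
}
after_release = {("holds", "P1", "R1"), ("holds", "P1", "R2")}

stable = iter([deadlock, deadlock])
transient = iter([deadlock, after_release])
```

Calling `two_phase_detect(lambda: next(stable))` reports a deadlock, while the `transient` sequence does not, because the cycle seen in the first round is no longer supported by the second report.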
 The one-phase algorithm.
 Each site maintains two status tables.
 A resource status table.
 A process status table.
 Periodically, a designated site requests both the tables from every site,
constructs a WFG using only those transactions for which the entry in the
resource table matches the corresponding entry in the process table, and
searches the WFG for cycles.
 No false deadlocks detected.
 Comparison of the two-phase and one-phase algorithm.
 One-phase is faster.
 Requires fewer messages.
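The consistency check of the one-phase algorithm can be sketched as below. The table layouts are assumptions made for illustration: the resource table maps a resource to its holder and waiters, and the process table maps a process to the resources it holds and waits for. An edge enters the WFG only when both tables agree on it.

```python
def one_phase_wfg(resource_table, process_table):
    """Keep an edge p -> q only if both tables confirm the wait and the hold."""
    wfg = {}
    for r, entry in resource_table.items():
        holder = entry["holder"]
        held = process_table.get(holder, {}).get("holds", set())
        if holder is None or r not in held:
            continue  # the process table does not confirm the hold
        for p in entry["waiters"]:
            if r in process_table.get(p, {}).get("waits", set()):
                wfg.setdefault(p, set()).add(holder)
    return wfg


def has_cycle(wfg):
    """Depth-first search; meeting a node already on the path means a cycle."""
    visiting, done = set(), set()

    def visit(p):
        if p in done:
            return False
        if p in visiting:
            return True
        visiting.add(p)
        found = any(visit(q) for q in wfg.get(p, ()))
        visiting.discard(p)
        done.add(p)
        return found

    return any(visit(p) for p in list(wfg))


resource_table = {
    "R1": {"holder": "P1", "waiters": ["P2"]},
    "R2": {"holder": "P2", "waiters": ["P1"]},
}
consistent = {
    "P1": {"holds": {"R1"}, "waits": {"R2"}},
    "P2": {"holds": {"R2"}, "waits": {"R1"}},
}
# P2 has already released R2, but the resource table has not caught up yet.
stale = {
    "P1": {"holds": {"R1"}, "waits": {"R2"}},
    "P2": {"holds": set(), "waits": {"R1"}},
}
```

With `consistent`, the cycle between `P1` and `P2` is reported. With `stale`, the unconfirmed hold on `R2` is dropped and no cycle is found, which is how a single round can avoid the false deadlock.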
 Distributed control.
 All sites collectively cooperate to detect a cycle in the state graph that is likely
to be distributed over several sites of the system.
 Deadlock detection is initiated whenever a process is forced to wait.
 Chandy-Misra-Haas’s algorithm, an edge-chasing algorithm.
 Uses a special message known as the probe (i,j,k). (i,j,k) denotes that the
deadlock detection is initiated by the process Pi and it is being sent by the
home site of process Pj to the home site of process Pk.
 A probe message travels along the edges of the global transaction wait-for
(TWF) graph. A deadlock is detected if a probe message returns to its
initiating process.
 A boolean array, dependent_i, is maintained for each process Pi. If Pi knows
that Pj is dependent on it, dependent_i(j) is set to true. Otherwise it is false.
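Probe propagation can be simulated on a single machine. The sketch below assumes, for simplicity, that every process lives on its own site, so each wait-for edge is an intersite edge and each probe crosses it as a message. A queue stands in for the network, and `dependent[k]` plays the role of the array dependent_k. The function name `edge_chasing` is hypothetical.

```python
from collections import deque


def edge_chasing(waits_for, initiator):
    """Return True if a probe started by `initiator` comes back to it.

    waits_for maps each process to the set of processes it waits for;
    an empty set means the process is active (not blocked).
    """
    dependent = {p: set() for p in waits_for}
    # probe (i, j, k): started by Pi, sent by Pj's site to Pk's site
    probes = deque(
        (initiator, initiator, k) for k in waits_for.get(initiator, ())
    )
    while probes:
        i, j, k = probes.popleft()
        if not waits_for.get(k):
            continue                  # Pk is active: the probe is discarded
        if i in dependent.setdefault(k, set()):
            continue                  # Pk already knows Pi depends on it
        dependent[k].add(i)
        if k == i:
            return True               # the probe returned to its initiator
        probes.extend((i, k, n) for n in waits_for[k])
    return False


cycle = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
chain = {"P1": {"P2"}, "P2": {"P3"}, "P3": set()}
outside = {"P4": {"P1"}, "P1": {"P2"}, "P2": {"P1"}}
```

In `cycle`, the probe started by `P1` returns to it. In `chain`, the probe dies at the active process `P3`. In `outside`, `P4` waits on a cycle it is not part of, so its own probe never returns to it.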
 Chandy-Misra-Haas’s algorithm, a diffusion computation based algorithm.
 A process determines if it is deadlocked by initiating a diffusion computation.
 Two types of messages used for this are the query(i,j,k) and the reply(i,j,k)
messages.
 A blocked process initiates deadlock detection by sending query messages
to all the processes from which it is waiting to receive a message (its
dependent set).
 Active processes will discard query and reply messages.
 Upon receiving the first query message initiated by Pi, the blocked process Pk
propagates the query to all the processes in its dependent set and sets a
local variable num_k(i) to the number of query messages sent.
 If the query message is not the first one received by Pk initiated by Pi, Pk
replies to it if it has been continuously blocked since the first query.
Otherwise, it discards the query.
 The process Pk finally sends a reply to the process that sent it the first
query once it has received a reply for every query message that it sent. For
every reply, Pk decrements num_k(i), and when num_k(i)=0, it sends that reply.
 An initiator detects a deadlock when it receives reply messages to all the
query messages it had sent out.
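The query/reply exchange can be simulated with a message queue. This is a sketch under stated assumptions: the wait-for state is a fixed snapshot (so every engaged process stays blocked throughout), a process waits for a message from any member of its dependent set, and `engager` records which process sent each process its first query. The names `diffusion_detect`, `num`, and `engager` are illustrative.

```python
from collections import deque


def diffusion_detect(dependent_set, initiator):
    """Return True if the initiator receives replies to all of its queries.

    dependent_set maps a process to the set of processes it waits on;
    an empty set means the process is active.
    """
    i = initiator
    if not dependent_set.get(i):
        return False                       # only a blocked process initiates
    num = {i: len(dependent_set[i])}       # outstanding replies per process
    engager = {i: None}                    # sender of each process's first query
    msgs = deque(("query", i, k) for k in dependent_set[i])
    while msgs:
        kind, j, k = msgs.popleft()        # a message from Pj to Pk
        if not dependent_set.get(k):
            continue                       # active processes discard messages
        if kind == "query":
            if k not in num:               # first query: propagate it
                engager[k] = j
                num[k] = len(dependent_set[k])
                msgs.extend(("query", k, m) for m in dependent_set[k])
            else:                          # later query: reply at once
                msgs.append(("reply", k, j))
        else:
            num[k] -= 1
            if num[k] == 0:
                if k == i:
                    return True            # initiator got all its replies
                msgs.append(("reply", k, engager[k]))
    return False


cycle = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
chain = {"P1": {"P2"}, "P2": {"P3"}, "P3": set()}
escape = {"P1": {"P2", "P3"}, "P2": {"P1"}, "P3": set()}
```

In `escape`, `P1` also waits on the active process `P3`, which discards the query. `P1` therefore never collects all of its replies and is correctly not declared deadlocked.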
 Hierarchical control.
 Sites are arranged in a hierarchical fashion, and a site is responsible for
detecting deadlocks involving only its child sites.
 The Menasce-Muntz algorithm
 The Ho-Ramamoorthy algorithm.
 Sites are grouped in disjoint clusters.
 Periodically, a site is chosen as a central site, which dynamically
chooses a control site for each cluster.
 A control site collects status tables from all the sites in its cluster and
applies the one-phase deadlock detection algorithm to detect all
deadlocks involving only transactions within the cluster.
 The central site uses the information from the control sites to detect any
deadlocks between the clusters.
 Performance
 Number of messages exchanged may not be the true indicator of
communication overhead.
 Deadlock persistence time vs. message traffic.
 Storage overhead to store deadlock information.
 Processing overhead to search for cycles and resolve deadlocks.
 False deadlocks.

 Deadlock resolution.
 A process that detects a deadlock does not know all the processes
involved in the deadlock.
 Two or more processes may detect the same deadlock. This can result in
processes being aborted unnecessarily.
 Solution: assign unique priorities to processes.
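The slides do not say how the priorities are used. One possible scheme, shown here purely as an assumption, is to let a process swallow any probe started by a lower-priority process. Only the probe of the highest-priority process on a cycle can then travel all the way around, so a given cycle has a single detector. The function name `detectors` and the priority values are hypothetical.

```python
def detectors(waits_for, priority):
    """Return the processes whose own probe makes it back to them.

    waits_for maps a process to the set of processes it waits for;
    priority maps a process to a unique number (larger = higher priority).
    """
    found = []
    for i in waits_for:
        frontier, seen = list(waits_for[i]), set()
        while frontier:
            k = frontier.pop()
            if k == i:
                found.append(i)        # probe returned: Pi detects the deadlock
                break
            if k in seen or priority[k] > priority[i]:
                continue               # a higher-priority process drops the probe
            seen.add(k)
            frontier.extend(waits_for.get(k, ()))
    return found


waits_for = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
priority = {"P1": 3, "P2": 1, "P3": 2}
single = detectors(waits_for, priority)
```

Here only `P1`, the highest-priority process on the cycle, ends up in `single`, so the deadlock is resolved once rather than by several processes independently.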