Chapter 5 : Process Management • Deadlock – – – – 7 Cases of Deadlock Conditions for Deadlock Modeling Deadlocks Strategies for Handling Deadlocks – Avoidance – Detection – Recovery Process Synchronization Deadlock Management Starvation Management • Starvation Understanding Operating Systems 1 A Lack of Process Synchronization Causes Deadlock or Starvation • Deadlock (“deadly embrace”) -- a system-wide tangle of resource requests that begins when 2+ jobs are put on hold. – Each job is waiting for a vital resource to become available. – Needed resources are held by other jobs also waiting to run but can’t because they’re waiting for other unavailable resources. – The jobs come to a standstill. – The deadlock is complete if remainder of system comes to a standstill as well. • Resolved via external intervention. Understanding Operating Systems 2 Deadlock • Deadlock is more serious than indefinite postponement or starvation because it affects more than one job. • Because resources are being tied up, the entire system (not just a few programs) is affected. • Requires outside intervention (e.g., operators or users terminate a job). Understanding Operating Systems 3 Seven Cases of Deadlocks 1. 2. 3. 4. 5. 6. 7. Deadlocks on file requests Deadlocks in databases Deadlocks in dedicated device allocation Deadlocks in multiple device allocation Deadlocks in spooling Deadlocks in disk sharing Deadlocks in a network Understanding Operating Systems 4 Case 1 : Deadlocks on File Requests • If jobs can request and hold files for duration of their execution, deadlock can occur. • Any other programs that require F1 or F2 are put on hold as long as this situation continues. • Deadlock remains until a programs is withdrawn or forcibly removed and its file is released. Understanding Operating Systems 5 Case 2 : Deadlocks in Databases 1. P1 accesses R1 and locks it. • Deadlock can occur if 2 processes access & lock records in database. 2. P2 accesses R2 and locks it. • 3 different levels of locking : 3. P1 requests R2, which is locked by P2. 4. P2 requests R1, which is locked by P1. – entire database for duration of request – a subsection of the database – individual record until process is completed. • If don’t use locks, can lead to a race condition. Understanding Operating Systems 6 Case 3: Deadlocks in Dedicated Device Allocation 1. P1 requests tape drive 1 and gets it. 2. P2 requests tape drive 2 and gets it. 3. P1 requests tape drive 2 but is blocked. • Deadlock can occur when there is a limited number of dedicated devices. • E.g., printers, plotters or tape drives. 4. P2 requests tape drive 1 but is blocked. Understanding Operating Systems 7 Case 4 : Deadlocks in Multiple Device Allocation • Deadlocks can happen when several processes request, and hold on to, dedicated devices while other processes act in a similar manner. Understanding Operating Systems 8 Case 5 : Deadlocks in Spooling • Most systems have transformed dedicated devices such as a printer into a sharable device by installing a high-speed device, a disk, between it and the CPU. • Disk accepts output from several users and acts as a temporary storage area for all output until printer is ready to accept it (spooling). • If printer needs all of a job's output before it will begin printing, but spooling system fills available disk space with only partially completed output, then a deadlock can occur. Understanding Operating Systems 9 Case 6 : Deadlocks in Disk Sharing P1 Read records at cylinder 20 P2 Write to file at cylinder 310 I/O Channel Disk control unit • Disks are designed to be shared, so it’s not uncommon for 2 processes access different areas of same disk. • Without controls to regulate use of disk drive, competing processes could send conflicting commands and deadlock the system. Understanding Operating Systems 10 Case 7: Deadlocks in a Network • A network that’s congested (or filled large % of its I/O buffer space) can become deadlocked if it doesn’t have protocols to control flow of messages through network. Understanding Operating Systems 11 Four Conditions for Deadlock • Deadlock preceded by simultaneous occurrence of four conditions that operating system could have recognized: • • • • Mutual exclusion Resource holding No preemption Circular wait Understanding Operating Systems 12 • Mutual exclusion -- the act of allowing only one process to have access to a dedicated resource. • Resource holding -- the act of holding a resource and not releasing it; waiting for the other job to retreat. • No preemption -- the lack of temporary reallocation of resources; once a job gets a resource it can hold on to it for as long as it needs. • Circular wait -- each process involved in impasse is waiting for another to voluntarily release the resource so that at least one will be able to continue. Understanding Operating Systems 13 Modeling Deadlocks Using Directed Graphs (Holt, 1972) • Processes represented by circles. • Resources represented by squares. • Solid line from a resource to a process means that process is holding that resource. • Solid line from a process to a resource means that process is waiting for that resource. • Direction of arrow indicates flow. • If there’s a cycle in the graph then there’s a deadlock involving the processes and the resources in the cycle. Understanding Operating Systems 14 Directed Graph Examples R1 P1 Figure 5.7 (a) R1 P2 P1 R2 R1 Figure 5.7 (b) Understanding Operating Systems P1 Figure 5.7 (c) 15 Figure 5.8 R1 R1 R2 R2 R3 R3 P1 P1 P2 P2 P3 P3 Figure 5.9 R1 R2 R3 R1 R2 R3 P1 P2 P3 P1 P2 P3 Understanding Operating Systems 16 P1 P3 Figure 5.11 (a) P2 P1 P3 Figure 5.11 (b) P2 Understanding Operating Systems 17 Strategies for Handling Deadlocks • Prevent one of the four conditions from occurring. • Avoid the deadlock if it becomes probable. • Detect the deadlock when it occurs and recover from it gracefully. Understanding Operating Systems 18 Prevention of Deadlock • To prevent a deadlock OS must eliminate 1 out of 4 necessary conditions. – Same condition can’t be eliminated from every resource. • Mutual exclusion is necessary in any computer system because some resources (memory, CPU, dedicated devices) must be exclusively allocated to 1 user at a time. – Might be able to use spooling for some devices. – May trade 1 type of deadlock (Case 3) for another (Case 5). Understanding Operating Systems 19 Prevention of Resource Holding or No Preemption • Resource holding can be avoided by forcing each job to request, at creation time, every resource it will need to run to completion. – Significantly decreases degree of multiprogramming. – Peripheral devices would be idle because allocated to a job even though they wouldn't be used all the time. • No preemption could be bypassed by allowing OS to deallocate resources from jobs. – OK if state of job can be easily saved and restored. – Bad if preempt dedicated I/O device or files during modification. Understanding Operating Systems 20 Prevention of Circular Wait • Circular wait can be bypassed if OS prevents formation of a circle. • Havender’s solution (1968) is based on a numbering system for resources such as: printer = 1, disk = 2, tape= 3. – Forces each job to request its resources in ascending order. – Any “number one” devices required by job requested first; any “number two” devices requested next … • Require that jobs anticipate order in which they will request resources. – A best order is difficult to determine. Understanding Operating Systems 21 Avoidance • Even if OS can’t remove 1 conditions for deadlock, it can avoid one if system knows ahead of time sequence of requests associated with each of the active processes. • Dijkstra’s Bankers Algorithm (1965) used to regulate resources allocation to avoid deadlock. • Safe state -- if there exists a safe sequence of all processes where they can all get the resources needed. • Unsafe state -- doesn’t necessarily lead to deadlock, but it does indicate that system is an excellent candidate for one. Understanding Operating Systems 22 Banker’s Algorithm • Based on a bank with a fixed amount of capital that operates on the following principles: No customer will be granted a loan exceeding bank’s total capital. All customers will be given a maximum credit limit when opening an account. No customer will be allowed to borrow over the limit. The sum of all loans won’t exceed the bank’s total capital. • OS (bank) must be sure never to satisfy a request that moves it from a safe state to an unsafe one. • Job with smallest number of remaining resources < = number of available resources Understanding Operating Systems 23 A Bank’s Safe and Unsafe States Safe Customer Loan amount C1 0 C2 2,000 C3 4,000 Total loaned: $6,000 Total capital fund: $10,000 Maximum credit 4,000 5,000 8,000 Remaining credit 4,000 3,000 4,000 Customer Loan amount C1 2,000 C2 3,000 C3 4,000 Total loaned: $9,000 Total capital fund: $10,000 Maximum credit 4,000 5,000 8,000 Remaining credit 2,000 2,000 4,000 Unsafe Understanding Operating Systems 24 Problems with Banker’s Algorithm 1. As they enter system, jobs must state in advance the maximum number of resources needed. 2. Number of total resources for each class must remain constant. 3. Number of jobs must remain fixed. 4. Overhead cost incurred by running the avoidance algorithm can be quite high. 5. Resources aren’t well utilized because the algorithm assumes the worst case. 6. Scheduling suffers as a result of the poor utilization and jobs are kept waiting for resource allocation. Understanding Operating Systems 25 Detection • Use directed graphs to show circular wait which indicates a deadlock. • Algorithm used to detect circularity can be executed whenever it is appropriate. Understanding Operating Systems 26 Reducing Directed Resource Graphs 1. Find a process that is currently using a resource and not waiting for one. Remove this process from graph and return resource to “available list.” 2. Find a process that’s waiting only for resource classes that aren’t fully allocated. – Process isn’t contributing to deadlock since eventually gets resource it’s waiting for, finish its work, and return resource to “available list.” 3. Go back to Step 1 and continue the loop until all lines connecting resources to processes have been removed. Understanding Operating Systems 27 R1 P2 P3 R1 P2 P3 P1 R2 R3 P1 R2 R3 Figure 5.12 (a) Figure 5.12 (b) R1 P2 P3 R1 P2 P3 P1 R2 R3 P1 R2 R3 Figure 5.12 (c) Understanding Operating Systems Figure 5.12 (d) 28 Recovery • Once a deadlock has been detected it must be untangled and system returned to normal as quickly as possible. • There are several recovery algorithms, all requiring at least one victim, an expendable job, which, when removed from deadlock, frees system. 1. Terminate every job that’s active in system and restart them from beginning. 2. Terminate only the jobs involved in deadlock and ask their users to resubmit them. Understanding Operating Systems 29 Recovery Algorithms - 2 3. Terminate jobs involved in deadlock one at a time, checking to see if deadlock is eliminated after each removal, until it has been resolved. 4. Have job keep record (snapshot) of its progress so it can be interrupted and then continued without starting again from the beginning of its execution. 5. Select a non-deadlocked job, preempt resources it’s holding, and allocate them to a deadlocked process so it can resume execution, thus breaking the deadlock 6. Stop new jobs from entering system, which allows nondeadlocked jobs to run to completion so they’ll release their resources (no victim). 30 Select Victim with Least-Negative Effect • Priority of job under consideration—high-priority jobs are usually untouched. • CPU time used by job—jobs close to completion are usually left alone. • Number of other jobs that would be affected if this job were selected as the victim. • Programs working with databases deserve special treatment. Understanding Operating Systems 31 Starvation • Starvation -- result of conservative allocation of resources where a single job is prevented from execution because it’s kept waiting for resources that never become available. • “The dining philosophers” Dijkstra (1968). • Avoid starvation via algorithm designed to detect starving jobs which tracks how long each job has been waiting for resources (aging). Understanding Operating Systems 32 Terminology • • • • • • • • • avoidance circular wait deadlock deadly embrace detection directed graphs locking mutual exclusion no preemption Understanding Operating Systems • • • • • • • • • prevention process synchronization race resource holding safe state spooling starvation unsafe state victim 33