Concurrency Control
Course: 03-60-415
Dr. Joan Morrissey
School of Computer Science, University of Windsor, Windsor, Canada

Two-phase locking (2PL)
Two-phase locking is one protocol for controlling concurrent access. It is used in most commercial DBMSs, although other techniques exist. It is called two-phase because a transaction acquires all of its locks first and releases them only after all of its operations are done.
Locks are used to control concurrent access to the DB by several transactions. There is a lock for every item in the DB (DB, table, row and column).
Binary locks are very simple but too restrictive. The lock has only two states: locked or unlocked. If an item is locked then no other transaction can access that item. But reads cannot interfere with one another, so why hold up a transaction that is only waiting to read an item?

Rules for binary locks
To use the locking scheme in binary mode, the system must enforce the following rules:
1. A transaction T must issue a lock-item(X) before any read-item(X) or write-item(X) is done.
2. A transaction T must issue an unlock(X) after all write-item(X) and read-item(X) operations are completed in T.
3. T will not issue a lock-item(X) if it already holds a lock on X.
4. T will not issue an unlock-item(X) unless it already holds a lock on X.
The rules are enforced by the lock manager module of the DBMS. Between the lock and the unlock on X, the transaction is said to hold the lock on X. Only one transaction can hold the lock on X at a particular time, so no two transactions can access an item at the same time. There is no concurrency on that item, but correctness is preserved.

Shared/Exclusive or multi-mode locks
An item can be read-locked, write-locked or unlocked. This is an improvement over binary locks. A read-locked item is called share-locked because other transactions can also read the item; many transactions can hold a read-lock on the same item. Example: booking concert tickets. However, this can lead to the unrepeatable read problem. A write-locked item is called exclusive-locked because no other transaction can access the item.
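The difference between read-locks and write-locks described above can be illustrated with a toy lock table. This is only a sketch: the class name `LockTable`, its methods and the convention that a `False` return means "the transaction must wait" are assumptions made for illustration, not the design of any particular DBMS lock manager.

```python
from collections import defaultdict


class LockTable:
    """Toy shared/exclusive lock table (hypothetical, for illustration only)."""

    def __init__(self):
        self.readers = defaultdict(set)  # item -> transactions holding a read-lock
        self.writer = {}                 # item -> transaction holding the write-lock

    def read_lock(self, txn, item):
        """Grant a read-lock unless the item is write-locked. False means 'must wait'."""
        if item in self.writer:
            return False
        self.readers[item].add(txn)
        return True

    def write_lock(self, txn, item):
        """Grant a write-lock only if the item is completely unlocked."""
        if item in self.writer or self.readers[item]:
            return False
        self.writer[item] = txn
        return True

    def unlock(self, txn, item):
        """Release whatever lock txn holds on item; it must hold one."""
        if self.writer.get(item) == txn:
            del self.writer[item]
        elif txn in self.readers[item]:
            self.readers[item].remove(txn)
        else:
            raise ValueError(f"{txn} holds no lock on {item}")


table = LockTable()
print(table.read_lock("T1", "X"))   # True: first reader
print(table.read_lock("T2", "X"))   # True: readers share the item
print(table.write_lock("T3", "X"))  # False: a writer must wait for the readers
```

A binary lock behaves like this table when every request is treated as a write-lock, which is why it blocks readers that could safely have run together.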
Only one transaction can hold a write-lock on an item. It is possible to upgrade a read-lock to a write-lock, but this has to be done carefully. It is always possible to downgrade a write-lock to a read-lock.

Rules for Shared/Exclusive locks
1. A transaction T must issue a read-lock(X) before any read-item(X) is done.
2. A transaction T must issue a write-lock(X) before any write-item(X) is done.
3. A transaction T must issue an unlock(X) after all write-item(X) and read-item(X) operations are completed in T.
4. T will not issue a read-lock(X) if it already holds a read or write lock on X.
5. T will not issue a write-lock(X) if it already holds a read or write lock on X. (Without lock conversion, T must release its read lock before applying a write lock.)
6. T will not issue an unlock(X) unless it already holds a read or write lock on X.
Rules 4 and 5 are relaxed when lock conversion is allowed.

Conversion of locks
Sometimes a transaction wants to upgrade from a read-lock to a write-lock, or downgrade from a write-lock to a read-lock. Rules: if T is the only transaction holding a read-lock on an item then it can issue a write-lock; otherwise it must wait until all other read-locks are released. Allowing a read-lock to be upgraded to a write-lock increases concurrency. It is always possible to downgrade to a read-lock, because write-locks are exclusive. Records of locks are kept in a lock table, which is part of the lock manager.

Two Phase Locking
A transaction follows 2PL if all of its locking operations are done before its first unlock. There is a growing (or expanding) phase in which locks are acquired but none can be released. In the second (shrinking) phase, existing locks are released and no new locks can be acquired. Upgrading must be done in the expanding phase, as we are acquiring a new lock; downgrading is done in the shrinking phase, because it does not acquire a new lock.

[Figure: Two-phase locking done wrongly]
[Figure: Two-phase locking done correctly]

It can be proven that if every transaction in a schedule follows the two-phase locking protocol (2PL) then the schedule is serializable, giving us a correct schedule.
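The definition of 2PL given above (no lock is acquired after the first unlock) can be checked mechanically for a single transaction. The sketch below is a minimal illustration; the function name `follows_2pl`, the tuple encoding of operations and the two sample transactions are assumptions made here, not taken from the slides' figures.

```python
def follows_2pl(operations):
    """Return True if one transaction's operations obey two-phase locking.

    `operations` is a list of (action, item) pairs. Acquiring actions are
    "read_lock", "write_lock" and "upgrade"; releasing actions are "unlock"
    and "downgrade". Anything else (e.g. "read", "write") is ignored.
    """
    shrinking = False  # becomes True at the first release
    for action, _item in operations:
        if action in ("unlock", "downgrade"):
            shrinking = True
        elif action in ("read_lock", "write_lock", "upgrade") and shrinking:
            return False  # a lock was acquired in the shrinking phase
    return True


# Unlocks Y before locking X: the phases are interleaved.
t_bad = [("read_lock", "Y"), ("read", "Y"), ("unlock", "Y"),
         ("write_lock", "X"), ("read", "X"), ("write", "X"), ("unlock", "X")]

# Same work, but every lock is taken before the first unlock.
t_good = [("read_lock", "Y"), ("read", "Y"), ("write_lock", "X"),
          ("unlock", "Y"), ("read", "X"), ("write", "X"), ("unlock", "X")]

print(follows_2pl(t_bad))   # False
print(follows_2pl(t_good))  # True
```

Note that the check is per transaction: serializability is guaranteed only when every transaction in the schedule passes it.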
Two-phase locking can reduce concurrency, as transactions have to wait for an item to be unlocked. It can also cause deadlock. There are different types of 2PL: conservative, strict (above) and rigorous.

Different types of 2PL
Conservative 2PL requires a transaction to lock all the items it needs before the transaction begins, that is, before any reads or writes are done.
Strict 2PL is the most widely used. In strict 2PL a transaction does not release any of its write locks until after it commits or aborts. It can, however, release read locks before the commit or abort.
In rigorous 2PL, a transaction does not release any of its locks until after it commits or aborts. This can reduce concurrency.

Deadlock: it's like a vicious circle
A deadlock is a situation in which two or more competing transactions are each waiting for another to finish, and thus none ever does. The transactions are waiting to lock an item, but first need the other to unlock it. Below, T1' is waiting for X to be unlocked; T2' is waiting for T1' to unlock Y.
There are deadlock prevention protocols. One is used in conservative 2PL, where a transaction must lock all the required items in advance or none of them. It must wait if any one of the items is not available. However, we get reduced concurrency while the transaction waits. It's like "When two trains approach each other at a crossing, both shall come to a full stop and neither shall start up again until the other has gone." (old Kansas law, source unknown)

Timestamping & deadlock prevention
A timestamp, TS(T), is a unique identifier assigned to each transaction, based on the order in which transactions are started. Hence, if T1 starts before T2 then TS(T1) < TS(T2), so T1 is older than T2.
Two methods of deadlock prevention: Wait-Die and Wound-Wait.
Wait-Die: non-pre-emptive.
• An older transaction may wait for a younger transaction.
• A younger transaction may never wait for an older transaction; it is aborted and rolled back instead.
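The timestamp comparison behind the two schemes named above can be sketched as a pair of small decision functions. This is an illustration only: the function names and the returned strings are assumptions, a smaller timestamp means an older transaction as defined above, and the Wound-Wait branch follows the standard definition of that scheme (an older requester forces the younger holder to roll back; a younger requester waits).

```python
def wait_die(ts_requester, ts_holder):
    """Non-pre-emptive: only the requester can be affected."""
    if ts_requester < ts_holder:
        return "requester waits"        # older may wait for younger
    return "requester is rolled back"   # younger never waits for older: it dies


def wound_wait(ts_requester, ts_holder):
    """Pre-emptive: an older requester can force the holder out."""
    if ts_requester < ts_holder:
        return "holder is rolled back"  # older wounds the younger holder
    return "requester waits"            # younger may wait for older


# T1 started first (timestamp 1), T2 second (timestamp 2).
print(wait_die(1, 2))    # requester waits
print(wait_die(2, 1))    # requester is rolled back
print(wound_wait(1, 2))  # holder is rolled back
print(wound_wait(2, 1))  # requester waits
```

In both schemes the transaction that is rolled back is always the younger one, so no cycle of waiting transactions can form.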
The younger transaction may be aborted several times before it acquires the lock it needs.
Wound-Wait: pre-emptive.
• An older transaction wounds the younger transaction (forces it to roll back) instead of waiting for it to finish.
• A younger transaction may wait for an older one to complete.
Wound-Wait may cause fewer rollbacks than Wait-Die.
Both techniques prevent deadlock; however, some transactions may be aborted needlessly. Restarting an aborted transaction with its original timestamp reduces the time it waits before it can be redone.

Deadlock detection, prevention and starvation
In deadlock detection the system checks whether a deadlock actually exists. This is very practical if you know that transactions are not likely to interfere with each other; otherwise it is not. Detection is done by maintaining a wait-for graph, which indicates when a transaction is waiting for another to finish. There is a node for each transaction that is executing. If Ti is waiting to lock X, which is currently locked by Tj, then there is a directed edge (Ti → Tj) in the graph. We have deadlock if there is a cycle in the graph. Then some transactions must be aborted; choosing them is called victim selection. It usually selects transactions that have just started.
One problem with this approach is deciding when the system should check for deadlock. Every time an edge is added? That is too much overhead. Checking when a transaction has been waiting a long time to lock an item may be a better idea.
Timeouts. A very simple method. If a transaction waits longer than a system-defined time, then deadlock is assumed and the transaction is aborted. However, there may not have been a deadlock.
Starvation. Occurs when one or more transactions are blocked from gaining access to an item and, as a result, cannot make progress. It may happen if the process of giving access is unfair in some way. A first-come-first-served queue can prevent this from happening. Starvation can also happen if a transaction is repeatedly chosen as the victim.
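Returning to deadlock detection: the cycle test on the wait-for graph described above can be sketched with a depth-first search. The representation (a dictionary mapping each waiting transaction to the set of transactions it waits for) and the function name `find_cycle` are assumptions made for this example; a real lock manager would derive the edges from its lock table.

```python
def find_cycle(wait_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    `wait_for` maps a transaction to the set of transactions it is waiting
    for, i.e. an edge Ti -> Tj for every Tj in wait_for[Ti].
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:                  # back edge: nxt is on the path
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(wait_for):
        if colour.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


deadlocked = {"T1": {"T2"}, "T2": {"T1"}}   # T1 waits for T2 and vice versa
chain = {"T1": {"T2"}, "T2": {"T3"}}        # waiting, but no cycle

print(find_cycle(deadlocked))  # ['T1', 'T2']
print(find_cycle(chain))       # None
```

The returned list is the set of candidates for victim selection: aborting any one transaction on the cycle breaks the deadlock.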
When a transaction is repeatedly chosen as the victim, the wait-die and wound-wait schemes prevent starvation because of timestamping: the transaction restarts with its original timestamp, so the possibility of it being a victim again is slim.

Locking Granularity
All concurrency control techniques assume that the DB contains a number of named data items. Thus we can choose one of the following items to lock: a tuple, an attribute, a table, a disk block, an entire file, or the whole database.
Granularity clearly affects the amount of concurrency and how easy it is to do recovery. Coarse granularity means we lock large data items such as a disk block or the DB. Coarse granularity decreases the amount of concurrency. In the worst case, where the whole DB is locked, no transaction can proceed until the lock is released. Fine granularity means we have more concurrency. However, more locks have to be maintained, so overheads are higher.

Multiple granularity level locking
Since the best granularity depends on the type of transaction, it makes sense to allow different levels of granularity for different transactions. Below is a simple granularity hierarchy with a DB containing two files, where each file contains several disk pages and each page contains several records. It will be used to illustrate a multiple granularity level locking 2PL protocol. Additional types of locks will be needed; more on these later.

Multiple granularity level locking continued
Suppose T1 wants to update all records in f1; it requests and is granted an exclusive lock on f1. Now suppose that T2 wants to read a single record in f1; T2 would request a shared lock on that record. The lock manager would deem the request to be incompatible and deny it. (An exclusive and a shared lock are not compatible.) What happens if T2's request comes before T1's? The shared lock would be granted, but it would then be very difficult for the lock manager to detect the lock conflict when T1's request arrives: it would have to check all nodes from f1 down.
Hence the need for intention locks.

Intention locks
The idea is that a transaction must indicate what types of locks it will need along the path to the desired item (node). Intention locks allow a higher-level node to be locked without checking its descendant nodes. Three types:
Intention-shared (IS). Indicates that one or more shared locks will be requested at a lower level in the tree. There can be several IS locks on a node. Only shared locks can be granted at lower nodes.
Intention-exclusive (IX). Indicates that one or more X or S locks will be requested at a lower level in the tree. Remember that X can always be downgraded to S!
Shared-intention-exclusive (SIX). Indicates that the current node is locked in shared mode but that one or more X locks will be requested on descendant node(s). Only one SIX lock can be placed on a node at one time, which prevents updates to the node by other transactions. It means you don't have to check descendant nodes for incompatibility (two X locks on a node in the subtree).
There can be any number of IS locks on a node, as it is read only. Multiple IX locks may also be granted on a node, because an IX lock intends to update only some rows of a table in a DB.

Lock compatibility matrix
An X lock is never compatible with any other type of lock.
[Table: lock held on node i versus lock being requested for node i]
The hierarchy of locks can become very complicated. For example, start with SIX, then IS, etc. However, if you start with an X lock then no other lock type can be applied at a lower level.

Multiple granularity locking (MGL) protocol
The protocol consists of using the lock compatibility chart and the following rules:
1. Lock compatibility (as in the previous slide) must be adhered to.
2. The root of the tree must be locked first, in any mode.
3. A node N can be locked by T in S or IS mode only if the parent node is currently locked by T in IS or IX mode. (IX is a superset of IS, so T can lock in S or IS.)
4. A node N can be locked by T in X, IX or SIX mode only if the parent node is currently locked by T in IX or SIX mode. (If the parent is in IX or SIX mode, an X lock can always be placed lower down. For example, SIX on file, IX on table, X on the tuples to be updated.)
5. A transaction can only lock a node if it has not unlocked any node(s), to enforce 2PL.
6. A transaction T can only unlock a node, N, if none of the children of N are currently locked by T, again to enforce 2PL.

Example
Here, T1 wants to update records 111 & 211. T2 wants to update all records on page 12. T3 wants to read record 11j and all of file 2. Figure 22.9 shows how it is done.

Other concurrency control issues
When a new piece of data is inserted, it cannot be accessed until it is created and the insertion is completed. A lock is held in exclusive (write) mode and released when the data is written.
When a piece of data is deleted, an exclusive (write) lock must be acquired before the data can be deleted.
Phantom reads occur when new records added to the database are detectable by transactions that started prior to the insert. For example, transaction T is summing up the salaries of all employees. Transaction T' adds a new employee after T began. T really should not include the new record in its calculation: the inserted data is a phantom record and can cause lock conflicts which may not be recognized by the concurrency control protocol. A technique called predicate locking would lock access to newly inserted data by older transactions. However, it is hard to implement efficiently and is rarely used in commercial systems.

Other issues continued
Another problem occurs when interactive transactions read input and write output to a device (laptop, PC, phone etc.) before the transaction is committed. For example, hotel bookings: you read that there is only one room left, but you delay and then book, only to find the room is gone! The solution is to delay output until all transactions on the data are committed.
Latches are locks that are held for a very short time. They do not follow 2PL. Example: writing a buffer to disk. A latch is placed on the buffer until it is written to disk, and then it is released.
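As a closing illustration of the multiple granularity locking protocol discussed earlier, the sketch below encodes a lock compatibility matrix and the parent-mode rules (rules 3 and 4 of the MGL protocol). The compatibility entries are the matrix commonly given in database textbooks, used here as an assumption because the slide's own table is not reproduced in this text. The names `COMPATIBLE`, `compatible`, `REQUIRED_PARENT_MODES` and `parent_allows` are hypothetical.

```python
# For each mode already HELD on a node, the modes another transaction may be granted.
# Assumed standard textbook matrix; the relation is symmetric.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),   # X is never compatible with any other lock
}

# MGL rules 3 and 4: the mode the same transaction must hold on the parent node.
REQUIRED_PARENT_MODES = {
    "S":   {"IS", "IX"},
    "IS":  {"IS", "IX"},
    "X":   {"IX", "SIX"},
    "IX":  {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
}


def compatible(held, requested):
    """Can `requested` be granted while another transaction holds `held`?"""
    return requested in COMPATIBLE[held]


def parent_allows(parent_mode, requested):
    """Does the transaction's lock on the parent permit `requested` on the child?"""
    return parent_mode in REQUIRED_PARENT_MODES[requested]


print(compatible("IX", "IX"))    # True: both intend to update different records
print(compatible("S", "IX"))     # False: a reader of the whole node blocks updaters
print(parent_allows("IX", "X"))  # True
print(parent_allows("IS", "X"))  # False: IS on the parent only permits S or IS below
```

With the conflict recorded at the higher-level node, the lock manager can reject an incompatible request there without walking the subtree.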